Timeouts in Scheduler tasks, when upgrading from 2.9

Modified on Wed, 25 Sep, 2019 at 1:19 PM

Summary


This advisory concerns long-running tasks scheduled with the Omniscope Scheduler app, and applies to anyone upgrading Omniscope from version 2.9 (or earlier) to 2018.x or later. After upgrading, scheduled tasks may fail with a log message similar to the one shown further below; if so, you will need to adjust the timeout settings in your tasks.


In detail


In each Task configuration, there is a "Time out (seconds)" setting, which defaults to 300 (5 minutes). 


In Omniscope 2.9, this setting applied only when the Scheduler-wide "Fork scheduled execution" option was explicitly ticked, i.e. when the Scheduler executed each task in a separate process, which is not the default. There was no way to limit rogue tasks that ran in the same process as the Scheduler. So, with default settings, the timeout was ignored and tasks ran to completion however long they took.


In Omniscope 2018+, the timeout setting always takes effect. Whether running tasks in a separate process or in the same process, the task will be killed or "interrupted" if this timeout is reached. So, with defaults, tasks will be killed if they run for more than 5 minutes.
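The difference between the two versions can be summarised as a small truth function. This is only a sketch of the rule described above, not Omniscope code; the function and parameter names are hypothetical.

```python
def timeout_enforced(omniscope_2018_or_later: bool,
                     fork_scheduled_execution: bool) -> bool:
    """Return True if the task's "Time out (seconds)" setting is enforced.

    Hypothetical helper that encodes the rule from this article:
    - 2.9 and earlier: enforced only when "Fork scheduled execution" is ticked.
    - 2018 and later: always enforced, forked or not.
    """
    return omniscope_2018_or_later or fork_scheduled_execution


# 2.9 with default settings (not forked): the timeout is ignored.
print(timeout_enforced(omniscope_2018_or_later=False, fork_scheduled_execution=False))
# 2018+ with default settings: the timeout applies.
print(timeout_enforced(omniscope_2018_or_later=True, fork_scheduled_execution=False))
```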


The symptom is an error in the scheduler log similar to the following:

Error publishing block "Database output"

com.visokio.util.vfw: Error publishing block "Database output"
 at com.visokio.ent.actions.files.PublishDataManagerOutputBlocks.va(PublishDataManagerOutputBlocks.java:29)
 at com.visokio.ent.actions.files.PublishDataManagerOutputBlocks.va(PublishDataManagerOutputBlocks.java:40)
 at com.visokio.ent.actions.FileAction.va(FileAction.java:82)
 at com.visokio.ent.actions.ChainAction.va(ChainAction.java:93)
 at com.visokio.ent.actions.ChainAction.vb(ChainAction.java:97)
 at com.visokio.ent.actions.ChainAction.va(ChainAction.java:46)
 at com.visokio.ent.actions.ChainAction.va(ChainAction.java:93)
 at com.visokio.ent.actions.ChainAction.vb(ChainAction.java:97)
 at com.visokio.ent.actions.ChainAction.va(ChainAction.java:46)
 at com.visokio.ent.actions.va.va(va.java:172)
 at com.visokio.ent.scheduler.vh.vy(vh.java:1)
 at com.visokio.ent.scheduler.vh.vj(vh.java:4)
 at com.visokio.util.v_b.run(v_b.java:5)
 at com.visokio.util.v_h.vc(v_h.java:1)
 at com.visokio.util.vaoj.va(vaoj.java:138)
 at com.visokio.util.vaoj.va(vaoj.java:254)
 at com.visokio.util.vts.ve(vts.java:102)
 at com.visokio.util.vum.vu(vum.java:10)
 at com.visokio.util.vu6.vh(vu6.java:22)
 at com.visokio.util.vu3.ve(vu3.java:76)
 at com.visokio.util.vu3.va(vu3.java:64)
 at com.visokio.util.vgq.run(vgq.java:2)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: com.visokio.util.vtn: Cancelled via interrupt
 at com.visokio.util.vas8.va(vas8.java:6)
 at com.visokio.util.v_c.va(v_c.java:15)
 at com.visokio.util.vik.va(vik.java:95)
 at com.visokio.util.vml.va(vml.java:24)
 at com.visokio.ent.actions.files.PublishDataManagerOutputBlocks.va(PublishDataManagerOutputBlocks.java:21)
 ... 24 more
Caused by: java.lang.InterruptedException
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:502)
 at com.visokio.util.v_c.va(v_c.java:50)
 ... 27 more
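If you have many tasks to check, a short script can flag log excerpts that match this pattern. The following is a minimal sketch: the function name is hypothetical, and it only looks for the two "Caused by" markers visible in the trace above. An interrupt can have other causes, so treat a match as a hint that a timeout was reached, not as proof.

```python
# Marker strings copied from the example stack trace in this article.
TIMEOUT_MARKERS = (
    "Cancelled via interrupt",
    "java.lang.InterruptedException",
)


def looks_like_timeout(log_text: str) -> bool:
    """Return True if the log excerpt contains every interrupt marker."""
    return all(marker in log_text for marker in TIMEOUT_MARKERS)


# Example with a shortened excerpt of the trace shown above.
excerpt = (
    'com.visokio.util.vfw: Error publishing block "Database output"\n'
    "Caused by: com.visokio.util.vtn: Cancelled via interrupt\n"
    "Caused by: java.lang.InterruptedException\n"
)
print(looks_like_timeout(excerpt))
```

To use it, read your scheduler log into a string and pass it to the function; the log location depends on your installation and is not assumed here.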


The solution is to configure an appropriate time limit in each of your tasks. Allow enough time for the task to complete normally, including natural variation in processing time and network bandwidth (and extra headroom for tasks whose run time varies widely), but not so much that the task keeps running when something is clearly going wrong. Some examples:

  • If your task typically takes 10 seconds, leave the default ("300" seconds)
  • If your task typically takes 4 minutes, perhaps increase to a 10 minute timeout (enter "600" seconds)
  • If your task takes 2 hours, perhaps use a 4 hour timeout (enter "14400" seconds)

Common durations expressed in seconds:

  • For a 1 minute timeout: use "60"
  • For a 10 minute timeout: use "600"
  • For a 1 hour timeout: use "3600"
  • For a 1 day timeout: use "86400"
    (etc.)
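The arithmetic behind these values can be sketched in a few lines of Python. The helper names and the default safety factor of 2 are assumptions made for illustration, loosely based on the examples above. They are not an Omniscope feature, and you should pick a factor that suits how variable each task is.

```python
import math
from datetime import timedelta

# The default "Time out (seconds)" value described in this article.
DEFAULT_TIMEOUT_SECONDS = 300


def to_seconds(**duration) -> int:
    """Convert a duration such as hours=4 or minutes=10 into whole seconds."""
    return int(timedelta(**duration).total_seconds())


def suggest_timeout(typical_seconds: int, safety_factor: float = 2.0) -> int:
    """Hypothetical rule of thumb: typical run time times a safety factor,
    but never below the default timeout."""
    return max(DEFAULT_TIMEOUT_SECONDS, math.ceil(typical_seconds * safety_factor))


print(to_seconds(minutes=10))                  # value to enter for a 10 minute timeout
print(to_seconds(days=1))                      # value to enter for a 1 day timeout
print(suggest_timeout(to_seconds(hours=2)))    # a 2 hour task with the default factor
print(suggest_timeout(to_seconds(minutes=4), safety_factor=2.5))
```

Enter the resulting number in the task's "Time out (seconds)" field.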


Do not simply set a huge (e.g. 6 week) timeout on every task. If a task runs into problems and (rarely) runs near-indefinitely, it could block your other tasks from executing: parallel task execution is configurable in the Scheduler, but it is not unlimited and is often set to just 1 task at a time.


