Monitor Unattended Batch Program

We have a batch program that runs 24/7. It tests several pages and emails us if it finds an error on any of them. If we receive no emails, we assume the program is still running.

Having said that, we need a service (or some other way) to know if the program has stopped running. The program is installed on a test machine that is on 24/7. We're thinking about some kind of push monitoring service, e.g. a third-party system that our program pings and that alerts us if it does not receive the expected ping. Do you know of such a service? Or can you recommend other options? Thanks!


The best way to monitor the script is to have it log its status and/or a checkpoint to a file periodically. At each phase or major iteration, the script would either log to a file or submit a message to syslog. Alternatively, if your batch script passes a specific point in the code often enough, you could insert a health-check timer there. When the specified timeout has elapsed, it writes a message to a log file.

The pseudocode might look like this:

check_timeout
  is current time > timeout
    Yes ->  write a log message and set timeout to a point x seconds/minutes/hours into the future.
    No -> Do nothing and return from function
endcheck_timeout


Main
  set timeout to 0
  loop
    check_timeout
    do processing
  endloop
endmain
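
As a rough illustration, the pseudocode above could be sketched in Python as follows. The class name `Heartbeat`, the logger name, and the injectable `clock` parameter are assumptions made for this example, not part of the original suggestion.

```python
import logging
import time

logger = logging.getLogger("heartbeat")  # hypothetical logger name


class Heartbeat:
    """Writes a log message at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        # Plays the role of "set timeout to 0": the first check always fires.
        self.deadline = float("-inf")

    def check_timeout(self):
        """Log and push the deadline forward if it has passed; else do nothing."""
        now = self.clock()
        if now > self.deadline:
            logger.info("batch program is still running")
            self.deadline = now + self.interval
            return True
        return False


def main(work_items, heartbeat):
    """Run each unit of work, checking the heartbeat before each one."""
    for work in work_items:
        heartbeat.check_timeout()
        work()  # "do processing"
```

To get the messages into a file, you would configure logging first, for example with `logging.basicConfig(filename="heartbeat.log", level=logging.INFO)`, and then call `main(...)` with something like `Heartbeat(interval=300)`.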

Alternatively, you could change your check_timeout routine to forward a message to a monitoring system such as Zabbix, using zabbix_sender to update an item with the current time. Then you would write a trigger that fires if the time since the last update exceeds 1.5 times the average check-in interval (the right multiplier depends on your average load, since the interval may vary).
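
A minimal sketch of that idea in Python, assuming `zabbix_sender` is installed and on the PATH. The server name, host name, and item key you pass in are placeholders you would choose yourself, and `is_stale` only illustrates the trigger condition in plain Python; it is not Zabbix trigger syntax.

```python
import subprocess
import time


def build_sender_command(server, host, key, value):
    """Build a zabbix_sender command line (-z server, -s host, -k key, -o value)."""
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", str(value)]


def send_heartbeat(server, host, key, now=None):
    """Send the current Unix time as the item value; returns the exit code."""
    value = int(time.time() if now is None else now)
    command = build_sender_command(server, host, key, value)
    return subprocess.run(command, check=False).returncode


def is_stale(last_update, now, average_interval, factor=1.5):
    """True if the time since the last check-in exceeds factor * average_interval."""
    return (now - last_update) > factor * average_interval
```

For example, with an average check-in interval of 60 seconds, `is_stale` reports a problem once more than 90 seconds have passed since the last update.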


There are two solutions:

  1. Email you when your batch script is down
  2. Restart your batch script if it's down

For (1), download pslist and bmail. Use them with the following batch script:

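@echo off
:start
rem Number of seconds to wait between checks.
set SECONDS=10
rem List running processes and keep only the lines that match the script name.
pslist | findstr /i YOUR_BATCH_SCRIPT > isrunning.txt
rem A file size of 0 means nothing matched, so send the alert email.
for %%A in (isrunning.txt) do if %%~zA==0 bmail -s SMTPSERVER -t TOEMAIL -f FROMEMAIL -h -a "Batch script is down!"
rem Wait for the timeout to expire, then check again.
choice /C a /T %SECONDS% /D a
goto :start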

NOTE: You will need to edit YOUR_BATCH_SCRIPT and the parameters for bmail (smtpserver etc.) to suit your environment.

For (2), you can use a utility like Application Monitor to restart your batch program if it crashes.
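
If you would rather not install a separate utility, the restart idea can be sketched in a few lines of Python. This is only an illustration of the approach, not how Application Monitor works; the function name `run_with_restart` and its parameters are made up for this example.

```python
import subprocess
import sys
import time


def run_with_restart(command, max_restarts=3, delay=0.0):
    """Run `command`; if it exits with a nonzero code, start it again.

    Stops after a clean exit or after `max_restarts` restarts, and
    returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        print(f"exit code {exit_code}, restarting ({restarts}/{max_restarts})",
              file=sys.stderr)
        time.sleep(delay)
```

You would call it with the command line of your batch program, for example `run_with_restart(["cmd", "/c", "YOUR_BATCH_SCRIPT"], max_restarts=10, delay=5)`. Capping the number of restarts avoids a tight loop if the program fails immediately every time.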


Guys, thank you all for your responses; I'm grateful for all your help. I came back to let you (and anyone else with the same need) know that I found a service that fits my requirements. I'm now using the free Pushmon service. It's about to launch, but I've already tried it via an invite code. I've been using it for several weeks along with our new scheduled testing programs, and so far it hasn't failed me.
