
If a job doesn't start or complete by a specific time

Posted: 19 Apr 2011 12:46
by donnawonna
Our developers want to set a maximum time limit for a job. If it has not completed by, say, 8:00am, it should stop executing and be forced OK. Can Control-M be configured to do all this without manual intervention from an operator? If so, how?

Posted: 20 Apr 2011 10:01
by sandu
In the PostProc tab:

When: Late Time
Param: 0800
To: killjob
Urgency: Urgent
Message: %%ORDERID

where killjob should be defined as a kill script in the ctm_menu Shout Destination table.

e.g.

11 P S killjob <absolute_path>/killjob.sh


The killjob.sh script looks like:

#!/usr/bin/sh

ctmkilljob -ORDERID $2
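
A slightly more defensive variant of the script above might look like the sketch below. It assumes, as the script above does, that the Shout message (here %%ORDERID) arrives as the second argument. The log path /tmp/killjob.log, the function name kill_job and the KILL_CMD override are illustrative choices added for this sketch, not part of Control-M.

```shell
#!/bin/sh
# Sketch of a Shout-destination kill script.
# Assumptions (hypothetical, not taken from Control-M documentation):
#   - the order ID arrives as $2, as in the script above
#   - LOG_FILE and KILL_CMD are knobs added here for logging and testing

LOG_FILE="${LOG_FILE:-/tmp/killjob.log}"
KILL_CMD="${KILL_CMD:-ctmkilljob}"

kill_job() {
    order_id="$1"
    if [ -z "$order_id" ]; then
        # Record the problem instead of calling ctmkilljob with no ID.
        echo "$(date) no order ID received" >> "$LOG_FILE"
        return 1
    fi
    echo "$(date) killing ORDERID=$order_id" >> "$LOG_FILE"
    # Same call as in the script above; its output goes to the log.
    "$KILL_CMD" -ORDERID "$order_id" >> "$LOG_FILE" 2>&1
}

# Act only when the script is called with arguments.
if [ "$#" -gt 0 ]; then
    kill_job "$2"
fi
```

Because every call leaves a line in the log file, an empty log after the Late Time has passed would suggest that the script was never invoked, rather than that ctmkilljob failed.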

Posted: 23 Apr 2011 8:01
by donnawonna
Thanks for your response. It's greatly appreciated!

Posted: 25 Apr 2011 4:38
by futre25
Hi,

Excuse me.

I ran a test with the settings indicated, but I cannot kill the process or get an alert sent to me.

Why? What am I doing wrong?

My setup is below:

1.- I created a new entry and a new script:

Shout Destination Table 'SYSTEM '
------------------------------------

#   Destination Type Adr Logical Name      Physical Name
--- ---------------- --- ----------------- ----------------------------
1   O                S   CONSOLE
2   E                S   ECS
3   L                S   IOALOG
4   P                S   KILLJOB           /tmp/killjob.sh

q) Quit e#) Edit entry # n) New entry d#) Delete entry #


And I created a new script in /tmp:

xxxx_xxxx-xxxxxx [56] ls -ltr killjob.sh
-rwxrwxrwx 1 xxxxxx controlm 72 Apr 25 16:15 killjob.sh
xxxxxx-xxxxx [57] more killjob.sh
echo "Lanzado KILL JOB: " > /tmp/prueba_kill.txt
ctmkilljob -ORDERID $2

In my PostProc tab:

When: Exectime
Param: >1
To: KILLJOB
Urgency: Urgent
Message: %%ORDERID

My job takes 3 minutes to execute.

Thanks to all.

Posted: 25 Apr 2011 6:56
by sandu
What does it say in the log of the job?

Posted: 26 Apr 2011 9:31
by futre25
Hi.

The job's command is a sleep 250. The problem is that the ctmkilljob command is never executed, because the script is never called.

The sleep runs, but no alert is shown and the job is not killed. It ends successfully.

Thanks for your answer.

Posted: 26 Apr 2011 12:17
by Walty
Hi futre25,

In your script:

echo "Lanzado KILL JOB: $2" > /tmp/prueba_kill.txt &
ctmkilljob -ORDERID $2 &
ctmshout -DEST ECS -SEVERITY U -MESSAGE "JOB with ORDERID=$2 was Killed " &


The ampersand (&) must be present at the end of each command.
The ctmshout utility sends the alert to the GAS.

In PostProc tab:

When: Exectime
Param: >1
To: KILLJOB
Urgency: Regular
Message: %%ORDERID

Posted: 26 Apr 2011 1:28
by futre25
Hi.

Thanks for your answer.

The problem is not in the script itself; the script is never reached.

I think the problem is in how I created the new Shout Destination table entry, KILLJOB.

The script is never executed.

Thanks for your help and dedication.

Posted: 26 Apr 2011 4:47
by sandu
Try and see if EXECTIME > 001 works instead of >1.

Posted: 26 Apr 2011 5:10
by futre25
Thanks for your response.

The statement is fine: when I tested the same Exectime setting with the ECS destination, the shout ran correctly and sent an alert.

What am I doing wrong?

Thanks sandu.

Posted: 26 Apr 2011 6:58
by sandu
If you did the whole setup correctly it should work. I would go again over the shout definition, script permissions and job definitions to see if they are correct.
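
The script-side part of that review can be sketched as a small check, best run as the Control-M/Server account. The helper name check_shout_script is hypothetical and is not a Control-M utility. It only looks at the file itself: that it exists, is executable, and starts with a #! interpreter line. The last point is reported as a warning only, since scripts without such a line are also used in this thread.

```shell
#!/bin/sh
# Sketch: basic checks on a Shout-destination script.
# check_shout_script is a hypothetical helper, not part of Control-M.

check_shout_script() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    # An interpreter line starts with "#!" (for example "#!/bin/sh");
    # "#/usr/bin/sh" without the "!" is just a comment.
    first_line=$(head -n 1 "$script")
    case "$first_line" in
        '#!'*) ;;
        *) echo "warning: no #! interpreter line in $script" ;;
    esac
    echo "ok: $script"
}

# Act only when a path was given on the command line.
if [ "$#" -gt 0 ]; then
    check_shout_script "$1"
fi
```

The path used in the check should be the one shown in the Physical Name column of the Shout Destination table. This does not check the table or the job definition themselves.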

Posted: 26 Apr 2011 9:28
by Admin007
I also tried this and it does not appear to be working.

I created a script in /tmp: killjob.sh

#!/usr/bin/sh

ctmkilljob -ORDERID $2

I set up ctmsys as follows:

5 P S killjob /tmp/killjob.sh

I set up the job definition on PostProc tab as follows:

When: Late Time
Param: 1520 (just using any test time)
To: killjob
Urgency: Urgent
Message: %%ORDERID

To test this I created a very basic job definition performing an ls -ltr

I added a PRECMD of sleep 360 to force the job to sleep for 6 minutes before it executes.

The job slept and executed fine but never called the script to kill it even though the 1520 time came and went.

Thinking about it, could it be that the job is not actually executing when 1520 arrives, since it is sleeping? I would still think that, because it was submitted to the system by Control-M and then sleeps, the Late Time would still apply.

Posted: 26 Apr 2011 9:40
by sandu
This will not work. PRECMD is not part of the job. It does not get included in the runtime of the job. Look at statistics for the job to see the elapsed time.
The ls command might be too quick for the kill to have a chance before it completes. Sometimes there is a delay depending on how many jobs you run on your datacentre. Try, like before, with sleep 240 in the command line for the job. Then go on the actual agent box and monitor the process when it runs.
Does the script reside on the Control-M/Server box?

Try to run the ctmkilljob -ORDERID $2 command from the CTM/Server box, replacing $2 with the order ID of the job after it has started.

Posted: 26 Apr 2011 10:01
by Admin007
I set it up with the command line to sleep 240, removed the PRECMD and attempted it numerous times without success.

Yes, the killjob.sh script resides on the Control-M/Server box.

I was able to kill the job while watching it process on the box by issuing:

ctmkilljob -ORDERID 00gqo

I then reran the job after adjusting the Late Time parm and then ran my script from the home directory and it killed the job.

I know the script works. It appears the job never calls the script though via the PostProc tab.
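
One difference between a manual run and a run triggered by a Shout may be the environment (PATH and other variables). That is a guess and is not confirmed anywhere in this thread. The sketch below calls the script by hand with the order ID in the second argument, as the scripts in this thread expect, but with an almost empty environment. The helper name run_like_shout and the "KILLJOB" placeholder for the first argument are hypothetical; what Control-M really passes as the first argument is not shown here.

```shell
#!/bin/sh
# Sketch: run a Shout-destination script by hand with a minimal environment.
# run_like_shout and the "KILLJOB" first argument are hypothetical.

run_like_shout() {
    script="$1"
    order_id="$2"
    # env -i drops all inherited variables; only a basic PATH is passed on.
    env -i PATH=/usr/bin:/bin "$script" KILLJOB "$order_id"
}

# Usage: sh run_like_shout.sh /tmp/killjob.sh 00gqo
if [ "$#" -ge 2 ]; then
    run_like_shout "$1" "$2"
fi
```

If the script kills the job from an interactive shell but fails here with "ctmkilljob: not found" or similar, the script depends on the login environment and may need an absolute path to ctmkilljob.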

Posted: 27 Apr 2011 8:24
by Walty
Very strange.
I use Shout Destination tables for multiple actions without any particular problems (v6.3.01.700).
Is the Shout Destination table you are using active? (ctmshtb)
I tried 2 executions (Exectime & Late Time) and both work without problems.

Shout Destination table (SYSTEM):
11 P S KILLJOB /tmp/killjob.sh

Script:
ls -lrt killjob.sh ; more killjob.sh
-rwxrwxrwx 1 labctm01 controlm 157 Apr 26 12:24 killjob.sh

echo "Lanzado KILL JOB: $2" > /tmp/prueba_kill.txt &
ctmkilljob -ORDERID $2 &
ctmshout -DEST ECS -SEVERITY U -MESSAGE "JOB with ORDERID=$2 was Killed " &

Job1 definition:
Task Type: Command
Command: sleep 240
PostProc: Exectime
Param: >1
To : KILLJOB
Urgency: Regular
Message: %%ORDERID

Log execution job1:
Date Time Code Job Name Job Id ----- Message -----

27/04/11 07:52:55 CS5065 ORDERED JOB:16783; DAILY FORCED, ODATE 20110426
27/04/11 07:52:56 SL5105 SUBMITTED TO labctm01
27/04/11 07:53:01 TR5101 STARTED AT 20110427075256 ON labctm01
27/04/11 07:53:01 TR5120 JOB STATE CHANGED TO Executing
27/04/11 07:53:56 TR5201 SHOUT TO KILLJOB PERFORMED
27/04/11 07:53:57 UT5409 JOB KILLED BY USER labctm01
27/04/11 07:54:01 TR5100 ENDED AT 20110427075401. OSCOMPSTAT 143. RUNCNT 1

27/04/11 07:54:01 TR5134 ENDED NOTOK
27/04/11 07:54:01 TR5120 JOB STATE CHANGED TO Analyzed
27/04/11 07:54:01 SL5120 JOB STATE CHANGED TO Post processed

Job2 definition:
Task Type: Command
Command: sleep 480
PostProc: Late Time
Param: 0800
To : KILLJOB
Urgency: Regular
Message: %%ORDERID

Log execution job2:
Date Time Code Job Name Job Id ----- Message -----

27/04/11 07:55:07 CS5065 ORDERED JOB:16786; DAILY FORCED, ODATE 20110426
27/04/11 07:55:07 SL5105 SUBMITTED TO labctm01
27/04/11 07:55:11 TR5101 STARTED AT 20110427075507 ON labctm01
27/04/11 07:55:11 TR5120 JOB STATE CHANGED TO Executing
27/04/11 08:00:06 TR5201 SHOUT TO KILLJOB PERFORMED
27/04/11 08:00:08 UT5409 JOB KILLED BY USER labctm01
27/04/11 08:00:11 TR5100 ENDED AT 20110427080011. OSCOMPSTAT 143. RUNCNT 1

27/04/11 08:00:11 TR5134 ENDED NOTOK
27/04/11 08:00:11 TR5120 JOB STATE CHANGED TO Analyzed
27/04/11 08:00:12 SL5120 JOB STATE CHANGED TO Post processed
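
The two logs above show the messages to look for: TR5201 when the Shout is performed and UT5409 when the job is killed. A minimal sketch for checking a saved job log for both is given below. It assumes the log was saved to a text file in the same layout as above; the helper name shout_then_kill is hypothetical.

```shell
#!/bin/sh
# Sketch: check a saved job log for the Shout and kill messages shown above.
# shout_then_kill is a hypothetical helper, not a Control-M utility.

shout_then_kill() {
    log_file="$1"
    dest="$2"
    # TR5201 is logged when the Shout to the destination is performed.
    if ! grep -q "TR5201.*SHOUT TO $dest PERFORMED" "$log_file"; then
        echo "shout to $dest not performed"
        return 1
    fi
    # UT5409 is logged when the job is killed.
    if ! grep -q "UT5409.*JOB KILLED" "$log_file"; then
        echo "shout performed, but job not killed"
        return 2
    fi
    echo "shout performed and job killed"
}

# Usage: sh shout_then_kill.sh job1.log KILLJOB
if [ "$#" -ge 2 ]; then
    shout_then_kill "$1" "$2"
fi
```

A result of "shout to KILLJOB not performed" points at the PostProc definition or the Shout Destination table, while "shout performed, but job not killed" points at the script.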

To be continued .... maybe other users will have new suggestions