Using recovery options in job definitions

The recovery options indicate the actions to be taken if a job fails.

The following table summarizes the possible combinations of recovery options and the actions that result. The table is based on the following scenario in a job stream called sked1:
  • Job stream sked1 has two jobs, job1 and job2.
  • If selected for job1, the recovery job is jobr.
  • job2 depends on job1 and does not start until job1 completes.
Table 1. Recovery option table
Each entry is one combination of recovery prompt and recovery job, followed by the action taken for each recovery option: Stop, Continue, and Rerun.

Recovery prompt: No; Recovery job: No
  • Stop: Intervention is required.
  • Continue: Run job2 regardless of job1 completion status.
  • Rerun:
    • Rerun job1.
      • If job1 ends in error, issue scheduler prompt.
      • If reply is yes, repeat the above steps.
    • If job1 is successful, run job2.

Recovery prompt: Yes; Recovery job: No
  • Stop: Issue recovery prompt. Intervention is required.
  • Continue:
    • Issue recovery prompt.
    • If reply is yes, run job2 regardless of job1 completion status.
  • Rerun:
    • Issue recovery prompt.
      • If reply is yes, rerun job1.
      • If job1 ends in error, repeat the above steps.
    • If job1 is successful, run job2.

Recovery prompt: No; Recovery job: Yes
  • Stop:
    • Run jobr.
      • If jobr ends in error, intervention is required.
      • If jobr is successful, run job2.
  • Continue:
    • Run jobr.
    • Run job2 regardless of job1 completion status.
  • Rerun:
    • Run jobr.
      • If jobr ends in error, intervention is required.
      • If jobr is successful, rerun job1.
    • If job1 ends in error, issue scheduler prompt.
      • If reply is yes, repeat the above steps.
    • If job1 is successful, run job2.

Recovery prompt: Yes; Recovery job: Yes
  • Stop:
    • Issue recovery prompt.
      • If reply is yes, run jobr.
        • If jobr ends in error, intervention is required.
        • If jobr is successful, run job2.
  • Continue:
    • Issue recovery prompt.
      • If reply is yes, run jobr.
    • Run job2 regardless of job1 completion status.
  • Rerun:
    • Issue recovery prompt.
      • If reply is yes, run jobr.
        • If jobr ends in error, intervention is required.
        • If jobr is successful, rerun job1.
    • If job1 ends in error, repeat the above steps.
    • If job1 is successful, run job2.
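The combinations in Table 1 can also be read as a small decision function. The following Python sketch is illustrative only: `recovery_actions` and its parameters are hypothetical names, not part of HCL Workload Automation. It returns, as plain text, the actions that Table 1 lists for the sked1 scenario after job1 ends in error.

```python
from typing import List


def recovery_actions(option: str, prompt: bool, recovery_job: bool) -> List[str]:
    """Return the Table 1 actions for one recovery combination.

    Hypothetical helper for illustration; 'option' is "stop", "continue", or "rerun".
    """
    if option not in ("stop", "continue", "rerun"):
        raise ValueError(f"unknown recovery option: {option!r}")

    steps: List[str] = []
    if prompt:
        steps.append("issue recovery prompt")
    if recovery_job:
        # With a recovery prompt, jobr runs only after a "yes" reply.
        steps.append("if reply is yes, run jobr" if prompt else "run jobr")

    if option == "stop":
        if recovery_job:
            steps.append("if jobr ends in error, intervention is required")
            steps.append("if jobr is successful, run job2")
        else:
            steps.append("intervention is required")
    elif option == "continue":
        if prompt and not recovery_job:
            steps.append("if reply is yes, run job2 regardless of job1 completion status")
        else:
            steps.append("run job2 regardless of job1 completion status")
    else:  # rerun
        if recovery_job:
            steps.append("if jobr ends in error, intervention is required")
            steps.append("if jobr is successful, rerun job1")
        elif prompt:
            steps.append("if reply is yes, rerun job1")
        else:
            steps.append("rerun job1")
        if prompt:
            steps.append("if job1 ends in error, repeat the above steps")
        else:
            # No recovery prompt was supplied, so the scheduler issues its own.
            steps.append("if job1 ends in error, issue scheduler prompt; "
                         "if reply is yes, repeat the above steps")
        steps.append("if job1 is successful, run job2")
    return steps
```

For example, `recovery_actions("continue", prompt=False, recovery_job=True)` yields the two steps of the Continue column for that row: run jobr, then run job2 regardless of job1 completion status.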
Note:
  • "Intervention is required" means that job2 is not released from its dependency on job1 and must therefore be released by the operator. You can also manually rerun job1 or cancel it.
  • The Continue recovery option overrides the abend state, which might cause the schedule containing the job that ended in error to be marked as successful. This prevents the schedule from being carried forward to the next day.
  • If you select the Rerun option without supplying a recovery prompt, HCL Workload Automation creates a prompt when the job is unsuccessful, asking whether you want to proceed.
  • To reference a recovery job in conman, you must use the name of the original job (job1 in the scenario above, not jobr). Recovery jobs are run only once per abend.
Not all jobs are eligible to have recovery jobs run on a different workstation. Follow these guidelines:
  • If either workstation is an extended agent, it must be hosted by a domain manager or a fault-tolerant agent that runs in Full Status mode.
  • The recovery job workstation must be in the same domain as the parent job workstation.
  • If the recovery job workstation is a fault-tolerant agent, it must run in Full Status mode.
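The three guidelines above can be expressed as a simple eligibility check. The following Python sketch is a hypothetical model, not a product API: the `Workstation` class, its fields, and the `kind` values are assumptions made for illustration. It also assumes one reading of the first guideline, namely that the Full Status requirement applies to a fault-tolerant agent host and not to a domain manager host.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workstation:
    """Hypothetical workstation model used only for this sketch."""
    name: str
    domain: str
    kind: str                               # "domain-manager", "fta", or "extended-agent"
    full_status: bool = False               # meaningful for fault-tolerant agents
    host: Optional["Workstation"] = None    # set only for extended agents


def _host_ok(ws: Workstation) -> bool:
    # Guideline 1: an extended agent must be hosted by a domain manager
    # or by a fault-tolerant agent running in Full Status mode.
    if ws.kind != "extended-agent":
        return True
    host = ws.host
    if host is None:
        return False
    return host.kind == "domain-manager" or (host.kind == "fta" and host.full_status)


def can_run_recovery_on(parent: Workstation, recovery: Workstation) -> bool:
    """Return True if 'recovery' may run the recovery job for a job on 'parent'."""
    if not (_host_ok(parent) and _host_ok(recovery)):
        return False
    # Guideline 2: both workstations must be in the same domain.
    if parent.domain != recovery.domain:
        return False
    # Guideline 3: a fault-tolerant recovery workstation needs Full Status mode.
    if recovery.kind == "fta" and not recovery.full_status:
        return False
    return True
```

A check like this could be run before pointing a job definition at a recovery job on another workstation; the real validation is performed by the product itself.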