HCL Workload Automation, Version 9.4

Hadoop Map Reduce jobs

A Hadoop Map Reduce job defines, schedules, monitors, and manages the execution of Hadoop Map Reduce .jar files. You can bundle your Map Reduce code in a .jar file and run it using this job.
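
For example, the .jar file might contain a driver class like the canonical Apache Hadoop word-count example sketched below. This is a sketch using the standard Apache Hadoop API; the class and path names are illustrative and not part of the product. The main method is what the job's Main Class attribute points to, and args is populated from the job's Arguments attribute.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Emits (word, 1) for every token in the input split.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts collected for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Entry point referenced by the job's Main Class attribute.
    // args[0] and args[1] arrive through the job's Arguments attribute.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}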

Prerequisites

Hadoop Map Reduce job definition

A description of the job properties and valid values is available in the context-sensitive help in the Dynamic Workload Console by clicking the question mark (?) icon in the top-right corner of the properties pane.

For more information about creating jobs using the various supported product interfaces, see Defining a job.

The following table lists the required and optional attributes for Hadoop Map Reduce jobs:
Table 1. Required and optional attributes for the definition of a Hadoop Map Reduce job

Hadoop Installation Directory
    The directory where you installed Hadoop. For example, if Hadoop is installed in the path /opt/hadoop/hadoop_2.6.0/bin/hadoop, you must specify /opt/hadoop/hadoop_2.6.0 for this attribute.
    Required, unless the hadoopDir property is set in the HadoopMapReduceJobExecutor.properties file described below.

Jar File
    The path and name of the .jar file containing the Hadoop Map Reduce code.
    Required.

Main Class
    The Java class containing the main method to run when the job is loaded.
    Optional.

Arguments
    The arguments that are passed to the main method.
    Optional.
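
When the job runs, these attributes conceptually map to an invocation of the Hadoop RunJar launcher (the hadoop jar command) from the installation directory. For example, with illustrative paths and names:

/opt/hadoop/hadoop_2.6.0/bin/hadoop jar /opt/jobs/wordcount.jar WordCount /user/hwa/input /user/hwa/output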

Scheduling the job in HCL Workload Automation

You schedule HCL Workload Automation Hadoop Map Reduce jobs by defining them in job streams. Add the job to a job stream with all the necessary scheduling arguments and submit the job stream.
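
For example, a job stream containing a Hadoop Map Reduce job might look like the following composer sketch, where the workstation, run cycle, job stream, and job names are illustrative:

SCHEDULE AGENT1#JS_HADOOP
ON RUNCYCLE DAILY_RC "FREQ=DAILY;"
:
AGENT1#HADOOP_MR
END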

You can submit jobs by using the Dynamic Workload Console, Application Lab, or the conman command line. See Scheduling and submitting jobs and job streams for information about how to schedule and submit jobs and job streams by using the various interfaces.
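
For example, assuming the job stream sketched above, you can submit it from the conman command line with the sbs (submit sched) command:

conman "sbs AGENT1#JS_HADOOP"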

Monitoring the job

If the HCL Workload Automation agent stops when you submit the HCL Workload Automation Hadoop Map Reduce job, or while the job is running, then when the agent becomes available again the job status changes to ABEND and you must resubmit the job. The status of the Hadoop Map Reduce job itself changes to UNDEFINED. You can view this information in the Extra Information section of the Hadoop Map Reduce job in the Dynamic Workload Console.

For information about how to monitor jobs using the different product interfaces available, see Monitoring HCL Workload Automation jobs.

HadoopMapReduceJobExecutor.properties file

The properties file is automatically generated either when you perform a "Test Connection" from the Dynamic Workload Console in the job definition panels, or when you submit the job to run for the first time. After the file has been created, you can customize it. This is especially useful when you need to schedule several jobs of the same type: you can specify common values, such as credentials, in the properties file once, instead of providing them for each job. Values that you define at job definition time override the values in the properties file.

The TWS_INST_DIR\TWS\JavaExt\cfg\HadoopMapReduceJobExecutor.properties file contains the following properties:
# Hadoop install directory
hadoopDir=/usr/local/hadoop
# Hadoop RunJar Main Class
className=
# Hadoop RunJar Arguments
arguments=

The hadoopDir property must be specified either in this file or when creating the Hadoop Map Reduce job definition in the Dynamic Workload Console. For more information, see the Dynamic Workload Console online help.
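
For example, a customized properties file might set agent-wide defaults like the following (the values are illustrative):

# Hadoop install directory
hadoopDir=/opt/hadoop/hadoop_2.6.0
# Hadoop RunJar Main Class
className=WordCount
# Hadoop RunJar Arguments
arguments=/user/hwa/input /user/hwa/output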

Job properties

While the job is running, you can track its status and analyze its properties. In particular, if the job contains variables, the Extra Information section shows the value passed to each variable from the remote system. Some job streams use the variable passing feature: for example, the value of a variable set by job 1 in job stream A is required by job 2 in the same job stream in order to run.
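
For example, the Arguments attribute of job 2 might reference a property exported by job 1 by using the variable passing notation; the job and property names below are illustrative sketches, not product defaults:

arguments=${job:JOB1.OUTPUT_DIR} /user/hwa/output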

For information about how to display the job properties from the various supported interfaces, see Analyzing the job log. For example, from the conman command line, you can see the job properties by running:

conman sj <job_name>;props
where <job_name> is the Hadoop Map Reduce job name.

The properties are listed in the Extra Information section of the command output.

For information about passing variables between jobs, see Passing job properties from one job to another in the same job stream instance.

Job log content

For information about how to display the job log from the various supported interfaces, see Analyzing the job log.

For example, you can see the job log content by running conman sj <job_name>;stdlist, where <job_name> is the Hadoop Map Reduce job name.

See also

From the Dynamic Workload Console you can perform the same task as described in Creating job definitions.

For more information about how to create and edit scheduling objects, see Designing your Workload.