HCL Workload Automation, Version 9.4

The event rule management process

Event-driven workload automation is an ongoing process that can be summarized in the following steps:
  1. An event rule definition is created or modified with the Dynamic Workload Console or with the composer command line and saved in the objects database (a command-line sketch of this cycle follows the list). Rule definitions can be saved as draft or non-draft.
  2. All new and modified non-draft rules saved in the database are periodically (by default, every five minutes) found, built, and deployed by an internal process named rule builder. At this point they become active. Meanwhile, an event processing server, normally located on the master domain manager, receives all events from the agents and processes them.
  3. The updated monitoring configurations are downloaded to the HCL Workload Automation agents and activated. Each HCL Workload Automation agent runs a component named monman that manages two services, named monitoring engine and ssmagent, which detect the events occurring on the agent and perform a preliminary filtering action on them.
  4. Each monman detects and sends its events to the event processing server.
  5. The event processing server receives the events and checks if they match any deployed event rule.
  6. If an event rule is matched, the event processing server calls an action helper to carry out the actions.
  7. The action helper creates an event rule instance and logs the outcome of the action in the database.
  8. The administrator or the operator reviews the status of event rule instances and actions in the database and logs.
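
As a minimal command-line sketch of this cycle, assume a rule definition saved in a file named MYRULES.xml and an agent named AGENT1 (both names are placeholders; check the erule filter keyword against your composer reference):

  composer add MYRULES.xml
  composer list erule=@
  conman "showcpus AGENT1 ;getmon"

The composer add command saves the rule definitions in the database, composer list shows the event rules now stored there, and the conman command verifies, after the next deployment cycle, which rules the monitor on AGENT1 is running.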

The event-driven workload automation feature is automatically installed with the product. If you do not want to use it in your HCL Workload Automation network, you can disable it at any time by changing the value of the enEventDrivenWorkloadAutomation global option.
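
For example, assuming the standard optman syntax, you can display and disable the option as follows (the change is applied by the next JnextPlan):

  optman show enEventDrivenWorkloadAutomation
  optman chg enEventDrivenWorkloadAutomation=no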

Event-driven workload automation is based on a number of services, subsystems, and internal mechanisms. The following are significant because they can be managed:
monman
Is installed on every HCL Workload Automation agent where it checks for all local events. All detected events are forwarded to the event processing server. The following conman commands are available to manage monman:
Table 1. conman commands for managing monitoring engines
deployconf
    Updates the monitoring configuration file for the event monitoring engine on an agent. This command is optional, because the configuration is normally deployed automatically.
showcpus ;getmon
    Returns the list of event rules defined for the monitor running on an agent. The command can also be issued remotely to retrieve the monitoring configuration of another agent in the network.
startmon
    Starts monman on an agent. Can be issued from a different agent.
stopmon
    Stops monman on an agent. Can be issued from a different agent.
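
For example, assuming an agent named AGENT1, you can stop and restart its monitoring engine from any workstation where conman runs, and then check which rules it monitors:

  conman "stopmon AGENT1"
  conman "startmon AGENT1"
  conman "showcpus AGENT1 ;getmon"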

monman starts automatically each time a new Symphony is activated. This behavior is controlled by the autostart monman local option, which is set to yes by default (you can set it to no if you do not want to monitor events on a particular agent).
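
For example, a minimal localopts excerpt that disables event monitoring on a particular agent would look like this (localopts is read at agent startup):

  # localopts excerpt; the default value is yes
  autostart monman = no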

Following each rule deployment cycle, updated monitoring configurations are automatically distributed to the agents hosting rules that have changed since the last deployment. Note that there might be transitory situations while deployment is in progress. For example, if a rule is pending deactivation, the agents might still send its events during the short interval before the new configuration files are deployed, but the event processor already discards them.
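
If you do not want to wait for the next deployment cycle, you can trigger deployment manually. For example (AGENT1 is a placeholder):

  planman deploy
  conman "deployconf AGENT1"

planman deploy builds and deploys all non-draft rules immediately, while deployconf refreshes the monitoring configuration file on the named agent.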

If an agent is unable to send events to the event processing server for a specified period of time, the monitoring status of the agent is automatically turned off. The period of time can be customized (in seconds) with the edwa connection timeout parameter in the localopts file. By default, it is set to 300 seconds (5 minutes).
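
For example, to allow agents up to ten minutes before their monitoring status is turned off, the localopts entry would be:

  edwa connection timeout = 600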

The following events can be configured in the BMEvents.conf file to post the monitoring status of an agent (a configuration sketch follows the list of fields):
  • TWS_Stop_Monitoring (261): sent when the monitoring status of an agent is set to off (because of a stopmon command or because the agent is unable to send events to the event processing server).
  • TWS_Start_Monitoring (262): sent when the monitoring status of an agent is set to on (because of a startmon command or because the agent has resumed sending events to the event processing server).
These events have the following positional fields:
  1. Event number
  2. Affected workstation
  3. Reserved, currently always set to 1
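
As a sketch, you enable these events by adding their numbers to the EVENT list in BMEvents.conf (the ellipsis stands for whatever event numbers are already configured there):

  EVENT= ... 261 262

With this configuration, a TWS_Start_Monitoring event posted for workstation AGENT1 would carry the fields 262, AGENT1, and 1, in that order.
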
Event processing server
Can be installed on the master domain manager, the backup master, or on any fault-tolerant agent installed as a backup master. It runs in the application server. It can be active on only one node in the network. It builds the rules, creates configuration files for the agents, and notifies the agents to download the new configurations. It receives and correlates the events sent by the monitoring engines and runs the actions. The following conman commands are available to manage the event processing server:
Table 2. conman commands for managing the event processing server
starteventprocessor
    Starts the event processing server.
stopeventprocessor
    Stops the event processing server.
switcheventprocessor
    Switches the event processing server from the master domain manager to the backup master or to a fault-tolerant agent installed as a backup master, or vice versa.
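
For example, the server can be stopped and restarted with conman from the workstation that hosts it (or remotely, by specifying the workstation name; check the exact syntax in the conman reference):

  conman "stopeventprocessor"
  conman "starteventprocessor"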

The event processing server starts automatically with the master domain manager. Only one event processor may run in the network at any time. If you want to run the event processor installed on a workstation other than the master domain manager (that is, on the backup master or on any fault-tolerant agent installed as a backup master), you must first make it the active event processing server by using the switcheventprocessor command.
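
For example, assuming a backup master workstation named BKMDM and a master named MDM (both placeholders), you could move the event processor and later move it back:

  conman "switcheventprocessor BKMDM"
  conman "switcheventprocessor MDM"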

Note: If you set the ignore keyword on the workstation definition of the agent (installed as backup master) that currently hosts the active event processor, the next JnextPlan occurrence acknowledges that this agent is out of the plan and therefore cannot restart the event processor hosted there. For this reason, the scheduler issues a warning message and starts the event processor hosted by the master domain manager.