JUG Worker Source Code Documentation

The JUG worker is a generic "agent" that asks the master for work, obtains necessary software/data, executes jobs, and sends back the results.

The files downloaded (aka staged-in) for a particular job fall under two basic types: software and rundata. Software is installed once and shared (potentially) by multiple jobs, while rundata is copied into the work space for each job individually (though it may still be cached to avoid having to download it every time).
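The difference between the two file types can be sketched in Python as follows. This is an illustrative sketch only, not the worker's actual code: the function names, the directory layout, and the fetch callable (which stands in for whatever download mechanism the worker really uses) are all assumptions.

```python
import shutil
from pathlib import Path
from typing import Callable

# Hypothetical download hook: fetch(name, destination_path).
Fetch = Callable[[str, Path], None]


def ensure_software(name: str, install_root: Path, fetch: Fetch) -> Path:
    """Install a software file once; later jobs share the same copy."""
    target = install_root / name
    if not target.exists():
        install_root.mkdir(parents=True, exist_ok=True)
        fetch(name, target)
    return target


def stage_rundata(name: str, cache_dir: Path, work_dir: Path, fetch: Fetch) -> Path:
    """Copy a rundata file into one job's work space.

    The file is downloaded only on a cache miss, but every job
    still receives its own private copy.
    """
    cached = cache_dir / name
    if not cached.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        fetch(name, cached)
    work_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(cached, work_dir / name))
```

Under these assumptions, two jobs that need the same rundata file trigger one download but end up with two separate copies, while a software file is both downloaded and stored only once.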

Files may be downloaded by the worker individually, but software is usually installed in the form of a package, from which one or more files are extracted. For a description of the simple packaging format, see Packaging Software.

Execution Cycle

The JUG worker considers all jobs to consist of three main stages: stage-in, run, and stage-out. To make optimal use of system resources, jobs are "juggled" so that the stage-in of the next job happens while the current job is running (and similarly for stage-out).

The specific steps taken by the worker to run a job are listed below:

Stage-in
  - Download and install all necessary software and rundata.
  - Execute stage_in_command if there is one.
Run
  - Execute run_command.
Polling
  - If defined, call polling_command until jug_polling.vars indicates that the job is finished. This is used when the job runs under queue management software.
Stage-out
  - Execute stage_out_command if there is one.
  - Publish the existence of any files found in the job's output directory. Wait for these to be stored by a storage worker before "committing" the job, which marks it as done.
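The command-execution part of these steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the worker's implementation: the Job class, run_shell, and execute_job are hypothetical names, and because the format of jug_polling.vars is not described here, the check of that file is represented by a caller-supplied is_finished function. Downloading software and rundata, publishing output files, and committing the job are omitted.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    """Hypothetical container for the per-job commands named above."""
    run_command: str
    stage_in_command: Optional[str] = None
    polling_command: Optional[str] = None
    stage_out_command: Optional[str] = None


def run_shell(command: str) -> None:
    """Run one command through the shell and fail loudly on a nonzero exit."""
    subprocess.run(command, shell=True, check=True)


def execute_job(
    job: Job,
    is_finished: Callable[[], bool],
    run: Callable[[str], None] = run_shell,
    poll_interval: float = 1.0,
) -> None:
    """Run the optional and required commands of one job in order."""
    # Stage-in: optional hook after software and rundata are in place.
    if job.stage_in_command:
        run(job.stage_in_command)

    # Run: the only command every job must define.
    run(job.run_command)

    # Polling: repeat until the caller's check reports completion.
    # (Assumption: is_finished stands in for reading jug_polling.vars.)
    if job.polling_command:
        while True:
            run(job.polling_command)
            if is_finished():
                break
            time.sleep(poll_interval)

    # Stage-out: optional hook before output files are published.
    if job.stage_out_command:
        run(job.stage_out_command)
```

Passing the run function as a parameter keeps the ordering logic separate from the shell, so the sequence can be exercised without executing real commands.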

In the default configuration for job execution workers, only one job runs at a time, only one is staged in at a time, and only one is staged out at a time. Any of these stages may be configured to have multiple threads. That might be useful, for example, if the jobs being run by the worker are actually being submitted into some other processing queue. It may also be advantageous to give workers greater autonomy in the face of network and service outages by letting them pre-stage-in N jobs or queue up the stage-out of N jobs instead of just 1.
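The overlap between stages, and the effect of giving a stage more threads or a deeper backlog, can be illustrated with a small Python pipeline built on the standard queue and threading modules. This is a sketch of the general technique only, not the worker's code: juggle, its threads and backlog parameters, and the three stage functions are assumed names. The bounded queues only approximate the "N jobs ahead" limit, since a stage thread may also hold one finished item while it waits for room. Error handling is omitted.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List, Tuple

_DONE = object()  # end-of-work marker passed down the pipeline


def _stage_loop(work: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Take items from inbox, process them, and pass results to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            inbox.put(_DONE)  # leave the marker for sibling threads
            return
        outbox.put(work(item))


def juggle(
    jobs: Iterable[Any],
    stage_in: Callable[[Any], Any],
    run: Callable[[Any], Any],
    stage_out: Callable[[Any], Any],
    threads: Tuple[int, int, int] = (1, 1, 1),
    backlog: int = 1,
) -> List[Any]:
    """Push jobs through three concurrent stages and collect the results."""
    todo: queue.Queue = queue.Queue()
    staged: queue.Queue = queue.Queue(maxsize=backlog)  # staged in, waiting to run
    ran: queue.Queue = queue.Queue(maxsize=backlog)     # finished, waiting for stage-out
    done: queue.Queue = queue.Queue()

    for job in jobs:
        todo.put(job)
    todo.put(_DONE)

    stages = [(stage_in, todo, staged), (run, staged, ran), (stage_out, ran, done)]
    pools = []
    for (work, inbox, outbox), count in zip(stages, threads):
        pool = [
            threading.Thread(target=_stage_loop, args=(work, inbox, outbox))
            for _ in range(count)
        ]
        for thread in pool:
            thread.start()
        pools.append((pool, outbox))

    # Shut the stages down in order: once a stage has drained,
    # tell the next stage that no more work is coming.
    for pool, outbox in pools:
        for thread in pool:
            thread.join()
        outbox.put(_DONE)

    results = []
    while True:
        item = done.get()
        if item is _DONE:
            break
        results.append(item)
    return results
```

With the default of one thread per stage, jobs leave the pipeline in the order they entered. With more threads in a stage, jobs may finish out of order.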

Configuration

The worker is configured using jug_worker_setup and instantiated with jug_make_worker or by the action of a queue manager.