splitbatch(1) General Commands Manual splitbatch(1)
NAME
splitbatch - Produce multiple command files for running batchruntomo in
parallel
SYNOPSIS
splitbatch -num # [-max #] batchruntomo_command_file
DESCRIPTION
Splitbatch is a Python script that will take a command file for running
the Batchruntomo program on multiple tilt series and produce multi-
ple command files (jobs), with one data set per file, to be run in par-
allel by Processchunks.
The input command file should contain CPUMachineList and GPUMachineList
entries to Batchruntomo with the full collection of resources avail-
able for all of the jobs. Splitbatch will make sure that the CPUMa-
chineList specifies at least two machines, but does not otherwise deal
with the CPUs. Instead, Processchunks must be given this full pro-
cessor list as well as the maximum number of jobs to run in parallel in
its -M option. It will divide up the CPUs so that each job is directed
to use a specific set of CPUs, and so that the first machine assigned
for each job, the one the job is run on, has as many cores as possible
to use for programs parallelized with OpenMP. In this way, competition
between jobs for CPU cores should be minimized.
Since GPUs are more scarce and used only occasionally, the GPU list is
not subdivided and assigned either in the command files or by
Processchunks(1). Instead, when a job needs to run a step with GPUs, it
passes the full list of available GPUs to the Gpuallocator program
along with the maximum number of GPUs that it would like to use. The
program keeps track of GPUs allocated to the set of jobs and responds
with whatever GPUs are available, from 1 to the given maximum. If none
are available, Batchruntomo will keep trying indefinitely, issuing
periodic warnings when the wait is long. It is not clear whether the
best strategy is to set the maximum to the full number available or
not. If it is set to the full number, then every job will get all the
GPUs when they are free and other jobs will have to wait when they are
already reserved. Otherwise, a job is much less likely to have to wait
but will get fewer GPUs, perhaps only one, while the rest of the GPUs
may become idle when their jobs finish.
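The trade-off described above can be illustrated with a minimal Python
sketch. This is not the Gpuallocator implementation: the function name
allocate_gpus, the GPU names, and the use of a set to track reservations
are all assumptions made for illustration.

```python
def allocate_gpus(all_gpus, in_use, max_wanted):
    """Return up to max_wanted GPUs from all_gpus that are not in in_use.

    Hypothetical illustration only. An empty result means the caller has
    to wait and ask again, as Batchruntomo does when no GPU is free.
    """
    if max_wanted < 1:
        raise ValueError("max_wanted must be at least 1")
    free = [gpu for gpu in all_gpus if gpu not in in_use]
    granted = free[:max_wanted]
    in_use.update(granted)  # record the reservation for later requests
    return granted


gpus = ["hostA:1", "hostA:2", "hostB:1", "hostB:2"]  # made-up GPU names
reserved = set()
first = allocate_gpus(gpus, reserved, 4)   # first job takes every GPU
second = allocate_gpus(gpus, reserved, 4)  # second job gets none and waits
```

With a maximum of 4 the first request receives all four GPUs and the
second receives an empty list. With a maximum of 2, two jobs could each
hold two GPUs at the same time.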
With this arrangement for allocating GPUs dynamically, the jobs must be
run with Processchunks at the command line.
All of the jobs can be controlled through a single file that Batchrun-
tomo(1) checks for quit, pause, or finish signals. The name of this
file is the rootname of the batch input file with extension ".cmds".
To make all jobs quit as soon as possible, use the command
echo Q > rootname.cmds
or use F instead of Q to make them all quit after finishing their cur-
rent data sets.
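The same signal can be written from a script. The following is a minimal
Python sketch; the helper name send_signal is hypothetical, and it
assumes the rootname is the batch input file name without its final
extension. Only the Q and F signals given above are accepted.

```python
import os


def send_signal(batch_input_file, signal):
    """Write a one-letter signal to the shared control file.

    Q = quit as soon as possible, F = quit after the current data set.
    Hypothetical helper, equivalent to: echo Q > rootname.cmds
    """
    if signal not in ("Q", "F"):
        raise ValueError("expected 'Q' or 'F'")
    # Assumption: rootname is the input file name minus its extension.
    rootname = os.path.splitext(batch_input_file)[0]
    control_file = rootname + ".cmds"
    with open(control_file, "w") as handle:
        handle.write(signal + "\n")
    return control_file
```

The function returns the path it wrote, so a caller can report which
control file the jobs are expected to check.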
Using a Cluster Queue
The management of resources is completely different when running
Batchruntomo in parallel on a cluster queue. Four different
arrangements for use of cluster resources are supported:
1) A single CPU per run. The top-level Processchunks puts each
Batchruntomo job on the queue up to the number to run in parallel.
The batch command file has information about the queue command and
maximum number of jobs to submit at once, subject to override by
environment variables set by Processchunks when it runs it.
Batchruntomo runs single command files directly and
runs operations that can be parallelized in chunks on the queue with
Processchunks. Everything is limited to a single thread, so opera-
tions parallelized with multi-threading (most of the single command
files) will run more slowly than usual. This mode provides efficient
use of CPU resources on the cluster but may incur less efficient use of
the file system and memory caching if many data sets are run at once.
2) Multiple CPU cores and one or more GPUs per run. This requires
that the Queuechunk command being used (generally obtained by Etomo
from the cpu.adoc) has a resource request for a specific number of
cores and possibly GPUs. Batchruntomo runs single command files
directly and allows them to use all the cores available for multi-
threaded operations. Operations on a CPU that can be parallelized in
chunks are run directly with Processchunks, up to the given number of
cores. Operations on a GPU are also run directly in chunks if more
than one GPU is available, or as a single command file if there is only
one GPU. This mode is comparable to running Batchruntomo on a sin-
gle computer, where sometimes the cores and more often the GPU are
fully utilized.
3-1 and 3-2) A separate queue for GPU operations, with one GPU per
task. The main queue would not have a GPU allocated. Steps using a GPU
are run on this secondary queue. Because Batchruntomo has to submit
each GPU operation to run on this queue, and no single IMOD process
uses more than one GPU, only a single GPU can be taken advantage of by
each command file or chunk being run. When more than one GPU is
available overall on this queue, operations will be split into chunks and
each chunk submitted to the queue; otherwise a single command file will
be submitted. In mode 3-1, only a single core is available and non-GPU
operations are run as for mode 1. In mode 3-2, multiple cores are
available to each job and non-GPU operations are run as for mode 2.
This mode allows more efficient use of GPUs on the cluster, provided
that the cluster is configured to allocate just one GPU from a multi-
GPU node when the queue submission requests that. If not, mode 2 is
best.
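The four modes can be summarized by the resources given to each job. The
Python sketch below is only an illustration of the list above; the
function cluster_mode and its parameters are hypothetical and are not
part of Splitbatch or Batchruntomo.

```python
def cluster_mode(cores_per_job, gpus_per_job, has_gpu_queue):
    """Return the mode number, as listed above, for a resource layout.

    cores_per_job and gpus_per_job describe the main queue;
    has_gpu_queue is True when a separate one-GPU queue is used.
    """
    if cores_per_job < 1 or gpus_per_job < 0:
        raise ValueError("need at least one core and no negative GPU count")
    if has_gpu_queue:
        # With a secondary GPU queue, the main queue has no GPU allocated.
        if gpus_per_job:
            raise ValueError("main queue should not request GPUs here")
        return "3-1" if cores_per_job == 1 else "3-2"
    if cores_per_job == 1 and gpus_per_job == 0:
        return "1"
    if cores_per_job > 1:
        return "2"
    # A single core plus GPUs on one queue is not described above.
    raise ValueError("layout does not match any described mode")
```

For example, eight cores and four GPUs on a single queue correspond to
mode 2, while six cores on the main queue plus a separate GPU queue
correspond to mode 3-2.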
Each of these modes involves a particular set of options given to
Processchunks(1) and Batchruntomo. The options in the batch command
file will govern if the command file is run outside of Process-
chunks(1), but if it is run by Processchunks, the options provided
to the latter override the ones in the command file.
Mode 1) The queue is specified to Processchunks by
-q: the maximum number of queue entries
The machine list, a queuechunk command
and to Batchruntomo by
-QueueCommand: the queuechunk command
-MaxJobsOnQueue: the maximum number of queue entries
Mode 2) The queue is specified to Processchunks by
-q: the maximum number of queue entries
The machine list, a queuechunk command
-JC: the number of cores
-JG: the number of GPUs
and to Batchruntomo by:
-CoresPerClusterJob: the number of cores
-GPUsPerClusterJob: the number of GPUs
Modes 3-1 and 3-2) The secondary queue is specified to Process-
chunks(1) by
-SQ: the queuechunk command
-SN: the maximum number of queue entries to submit at once
and to Batchruntomo by
-GPUQueueCommand: the queuechunk command
-MaxGPUJobsOnQueue: the maximum number of queue entries
The primary queue is specified to both programs as for modes 1 and 2,
respectively, except that for mode 2, only cores should be specified,
not GPUs.
To enforce consistent usage, Splitbatch will object if the Batchrun-
tomo(1) command file has either of the CPUMachineList or GPUMachineList
options for non-cluster processing included along with the main queue
options, -QueueCommand or -CoresPerClusterJob.
OPTIONS
When the program is invoked with no arguments or with -h, it gives a
usage statement that shows the default values for these options as well
as the currently allowed abbreviations to the short option names.
-comfile OR -CommandFile File
The input Batchruntomo command file name, with or without its
extension. This entry is required. The command file can be
entered either specifically with this option or as a non-option
argument.
-maxgpu OR -MaxGPUsForOneJob value
Maximum number of GPUs to request for a step that needs a GPU.
The default is 4.
-help OR -usage
Print a usage statement and exit.
CLUSTER EXAMPLES
This section lists some Processchunks commands for running in the
different modes, and also shows how the entries should appear in
cpu.adoc.
Mode 1 - a simple queue named "cluster":
processchunks -M 4 -q 32 -Q cluster 'queuechunk -t pbs' \
batchJul25-193628_BB1
cpu.adoc:
[Queue = cluster]
command = queuechunk -t pbs
number = 300
Mode 2 - a queue where multiple cores and GPUs can be requested:
processchunks -M 4 -q 6 -JC 8 -JG 4 \
'queuechunk -t slurm -l -c8,-n1,--partition=sgpu' \
batchJul25-193628_BB1
cpu.adoc:
[Queue = BatchGPU]
command = queuechunk -t slurm -l -c8,-n1,--partition=sgpu
coresPerClusterJob = 8
gpusPerClusterJob = 4
number = 8
Such a configuration is best for accessing GPUs when the cluster
software is not configured to give only one GPU instead of all the
GPUs on the node. Here the "number" would be set to the number of
such GPU nodes available on the cluster. The coresPerClusterJob
should be set to the same number as whatever entry in the queue
command specifies the number of cores to allocate ("-c8" here). Note
that the attributes coresPerClusterJob and gpusPerClusterJob were
originally named coresPerNode and gpusPerNode prior to 4.12.40. Both
were misleading because such a job may not get an entire cluster
node, and using coresPerNode in this context conflicted with the
previous usage of it. coresPerNode is still used when specifying a
slurm queue running in exclusive resource allocation mode, but such a
queue is not suitable for multi-core processing within a batch job
and will be treated by Etomo as a simple queue having one core per
job.
Mode 3-1 - a primary queue with one core and a secondary queue with one
GPU:
processchunks -M 4 -q 32 -SN 4 \
-SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
'queuechunk -t slurm -l -c1,-n1,--partition=amilan' \
batchJul25-193628_BB1
cpu.adoc:
[Queue = BatchPrimary]
command = queuechunk -t slurm -l -c1,-n1,--partition=amilan
number = 300
[Queue = alpine-1GPU]
command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
coresPerClusterJob = 1
gpusPerClusterJob = 1
number = 24
Here the number might be the total number of GPUs available (e.g., 8
machines with 3 each), provided that the cluster software can assign
each one separately. With "-SN 4", each batch job will submit up to 4
chunks to the queue with GPUs. If the queue software does allow
getting more than one core per job, it might be preferable to use the
next setup instead:
Mode 3-2 - a primary queue with six cores and a secondary queue with
one GPU:
processchunks -M 4 -q 32 -SN 4 \
-SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
'queuechunk -t slurm -l -c6,-n1,--partition=amilan' \
batchJul25-193628_BB1
cpu.adoc:
[Queue = BatchCores]
command = queuechunk -t slurm -l -c6,-n1,--partition=amilan
coresPerClusterJob = 6
number = 50
[Queue = alpine-1GPU]
command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
coresPerClusterJob = 1
gpusPerClusterJob = 1
number = 24
If you define any multicore queues, note that these are useful only for
parallel Batchruntomo processing. For all other parallel processing
in Etomo, you need to define an additional simple queue with only one
core allocated per job, since that is all that will be used in those
situations.
FILES
The command files are given the same root name as the input file and
are numbered from "-001". There is a finishing file to remove command
files, but the log files are left.
AUTHOR
David Mastronarde <mast at colorado dot edu>
SEE ALSO
processchunks(1), batchruntomo(1)