splitbatch(1)               General Commands Manual              splitbatch(1)



NAME
       splitbatch - Produce multiple command files for running batchruntomo in
       parallel

SYNOPSIS
       splitbatch  -num #  [-max #] batchruntomo_command_file

DESCRIPTION
       Splitbatch is a Python script that will take a command file for running
       the Batchruntomo program on multiple tilt series and produce multi-
       ple command files (jobs), with one data set per file, to be run in par-
       allel by Processchunks.

       The input command file should contain CPUMachineList and GPUMachineList
       entries to Batchruntomo with the full collection of resources avail-
       able for all of the jobs.  Splitbatch will make sure that the CPUMa-
       chineList specifies at least two machines, but does not otherwise deal
       with the CPUs.  Instead, Processchunks must be given this full pro-
       cessor list as well as the maximum number of jobs to run in parallel in
       its -M option.  It will divide up the CPUs so that each job is directed
       to use a specific set of CPUs, and so that the first machine assigned
       for each job, the one the job is run on, has as many cores as possible
       to use for programs parallelized with OpenMP.  In this way, competition
       between jobs for CPU cores should be minimized.

       Since GPUs are more scarce and used only occasionally, the GPU list is
       not subdivided and assigned either in the command files or by Pro-
       cesschunks(1).  Instead, when a job needs to run a step with GPUs, it
       passes the full list of available GPUs to the Gpuallocator program
       along with the maximum number of GPUs that it would like to use.  The
       program keeps track of GPUs allocated to the set of jobs and responds
       with whatever GPUs are available, from 1 to the given maximum.  If none
       are available, Batchruntomo will keep trying indefinitely, issuing
       periodic warnings when the wait is long. It is not clear whether the
       best strategy is to set the maximum to the full number available or
       not.  If it is set to the full number, then every job will get all the
       GPUs when they are free and other jobs will have to wait when they are
       already reserved.  Otherwise, a job is much less likely to have to wait
       but will get fewer GPUs, perhaps only one, while the rest of the GPUs
       may become idle when their jobs finish.
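
       The trade-off can be seen with a small sketch of the response
       described above.  The function gpus_granted is hypothetical and only
       mimics the documented behavior (grant from 1 up to the requested
       maximum, or nothing when no GPU is free); it is not the Gpuallocator
       implementation.

```shell
# Illustrative only: mimics the described response, not Gpuallocator's code.
# Usage: gpus_granted FREE MAX  -> prints the number of GPUs a job would get
gpus_granted() {
    free=$1
    max=$2
    if [ "$free" -le 0 ]; then
        echo 0          # nothing free: the job has to wait and retry
    elif [ "$free" -lt "$max" ]; then
        echo "$free"    # fewer free than requested: take what is there
    else
        echo "$max"     # enough free: take the requested maximum
    fi
}
```

       With 4 GPUs and a maximum of 4, the first job to ask gets all 4 and
       the next one gets 0 and waits.  With a maximum of 2, two jobs can
       each get 2 at the same time.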

       With this arrangement for allocating CPUs dynamically, the jobs must be
       run with Processchunks at the command line.

       All of the jobs can be controlled through a single file that Batchrun-
       tomo(1) checks for quit, pause, or finish signals.  The name of this
       file is the rootname of the batch input file with extension ".cmds".
       To make all jobs quit as soon as possible, use the command
           echo Q > rootname.cmds
       or use F instead of Q to make them all quit after finishing their cur-
       rent data sets.
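
       As a sketch of the naming rule above, the following shell function
       derives the control file name from the batch input file name and
       writes a signal letter to it.  The name send_batch_signal is
       hypothetical, and the function assumes that the rootname is the file
       name with its final extension removed.  Only Q and F are handled
       because those are the only letters given here.

```shell
# Hypothetical helper, not part of IMOD.
# Usage: send_batch_signal batchfile Q|F
send_batch_signal() {
    batchfile=$1
    signal=$2
    # Assumption: the rootname is the file name minus its final extension.
    # Only strip when the last path component actually contains a dot.
    case ${batchfile##*/} in
        *.*) root=${batchfile%.*} ;;
        *)   root=$batchfile ;;
    esac
    case $signal in
        Q|F) echo "$signal" > "$root.cmds" ;;
        *)   echo "send_batch_signal: expected Q or F" >&2; return 1 ;;
    esac
}
```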

   Using a Cluster Queue
       The management of resources is completely different when running
       Batchruntomo in parallel on a cluster queue.  Four different
       arrangements for use of cluster resources are supported:
          1) A single CPU per run.  The top-level Processchunks puts each
       Batchruntomo job on the queue up to the number to run in parallel.
       The batch command file has information about the queue command and
       maximum number of jobs to submit at once, subject to override by
       environment variables set by Processchunks when it runs the command
       file.  Batchruntomo runs single command files directly and
       runs operations that can be parallelized in chunks on the queue with
       Processchunks.  Everything is limited to a single thread, so opera-
       tions parallelized with multi-threading (most of the single command
       files) will run more slowly than usual.  This mode provides efficient
       use of CPU resources on the cluster but may incur less efficient use of
       the file system and memory caching if many data sets are run at once.
          2) Multiple CPU cores and one or more GPUs per run.  This requires
       that the Queuechunk command being used (generally obtained by Etomo
       from the cpu.adoc) has a resource request for a specific number of
       cores and possibly GPUs.  Batchruntomo runs single command files
       directly and allows them to use all the cores available for multi-
       threaded operations.  Operations on a CPU that can be parallelized in
       chunks are run directly with Processchunks, up to the given number of
       cores.  Operations on a GPU are also run directly in chunks if more
       than one GPU is available, or as a single command file if there is only
       one GPU.  This mode is comparable to running Batchruntomo on a sin-
       gle computer, where sometimes the cores and more often the GPU are
       fully utilized.
          3-1 and 3-2) A separate queue for GPU operations, with one GPU per
       task.  The main queue would not have a GPU allocated. Steps using a GPU
       are run on this secondary queue.  Because Batchruntomo has to submit
       each GPU operation to run on this queue, and no single IMOD process
       uses more than one GPU, only a single GPU can be taken advantage of by
       each command file or chunk being run.  When more than one GPU is avail-
       able overall on this queue, operations will be split into chunks and
       each chunk submitted to the queue; otherwise a single command file will
       be submitted.  In mode 3-1, only a single core is available and non-GPU
       operations are run as for mode 1.  In mode 3-2, multiple cores are
       available to each job and non-GPU operations are run as for mode 2.
       This mode allows more efficient use of GPUs on the cluster, provided
       that the cluster is configured to allocate just one GPU from a multi-
       GPU node when the queue submission requests that.  If not, mode 2 is
       best.

       Each of these modes involves a particular set of options given to
       Processchunks(1) and Batchruntomo.  The options in the batch command
       file will govern if the command file is run outside of Process-
       chunks(1), but if it is run by Processchunks, the options provided
       to the latter override the ones in the command file.
          Mode 1) The queue is specified to Processchunks by
            -q: the maximum number of queue entries
            The machine list, a queuechunk command
        and to Batchruntomo by
            -QueueCommand: the queuechunk command
            -MaxJobsOnQueue: the maximum number of queue entries

          Mode 2) The queue is specified to Processchunks by
            -q: the maximum number of queue entries
            The machine list, a queuechunk command
            -JC: the number of cores
            -JG: the number of GPUs
        and to Batchruntomo by
            -CoresPerClusterJob: the number of cores
            -GPUsPerClusterJob: the number of GPUs

          Modes 3-1 and 3-2) The secondary queue is specified to Process-
       chunks(1) by
            -SQ: the queuechunk command
            -SN: the maximum number of queue entries to submit at once
        and to Batchruntomo by
            -GPUQueueCommand: the queuechunk command
            -MaxGPUJobsOnQueue: the maximum number of queue entries
        The primary queue is specified to both programs as for modes 1 and 2,
        respectively, except that for mode 2, only cores should be specified,
        not GPUs.

       To enforce consistent usage, Splitbatch will object if the Batchrun-
       tomo(1) command file has either of the CPUMachineList or GPUMachineList
       options for non-cluster processing included along with the main queue
       options, -QueueCommand or -CoresPerClusterJob.
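
       A comparable check can be run by hand before calling Splitbatch.
       The sketch below is not Splitbatch's own code.  The function name
       check_batch_options is hypothetical, and it assumes that each option
       name starts a line of the command file, possibly after whitespace.

```shell
# Hypothetical pre-check, not Splitbatch's own code.
# Assumption: each option starts a line of the command file, optionally
# preceded by whitespace.
# Usage: check_batch_options file  -> returns 1 on a conflict
check_batch_options() {
    file=$1
    if grep -Eq '^[[:space:]]*(CPU|GPU)MachineList' "$file" &&
       grep -Eq '^[[:space:]]*(QueueCommand|CoresPerClusterJob)' "$file"; then
        echo "$file: machine lists mixed with cluster queue options" >&2
        return 1
    fi
    return 0
}
```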


OPTIONS
       When the program is invoked with no arguments or with -h, it gives a
       usage statement that shows the default values for these options as well
       as the currently allowed abbreviations to the short option names.

       -comfile OR -CommandFile File
              The input Batchruntomo command file name, with or without its
              extension.  This entry is required.  The command file can be
              entered either specifically with this option or as a non-option
              argument.

       -maxgpu OR -MaxGPUsForOneJob value
              Maximum number of GPUs to request for a step that needs a GPU.
              The default is 4.

       -help OR -usage
              Print a usage statement and exit.

CLUSTER EXAMPLES
       This section lists some Processchunks commands for running in the
       different modes, and also shows how the entries should appear in
       cpu.adoc.

       Mode 1 - a simple queue named "cluster":
           processchunks -M 4 -q 32 -Q cluster 'queuechunk -t pbs'  \
             batchJul25-193628_BB1

         cpu.adoc:
           [Queue = cluster]
           command = queuechunk -t pbs
           number = 300

       Mode 2 - a queue where multiple cores and GPUs can be requested:
           processchunks -M 4 -q 6 -JC 8 -JG 4 \
             'queuechunk -t slurm -l -c8,-n1,--partition=sgpu' \
             batchJul25-193628_BB1

         cpu.adoc:
           [Queue = BatchGPU]
           command = queuechunk -t slurm -l -c8,-n1,--partition=sgpu
           coresPerClusterJob = 8
           gpusPerClusterJob = 4
           number = 8

         Such a configuration is best for accessing GPUs when the cluster
         software is not configured to give only one GPU instead of all the
         GPUs on the node.  Here the "number" would be set to the number of
         such GPU nodes available on the cluster. The coresPerClusterJob
         should be set to the same number as whatever entry in the queue
         command specifies the number of cores to allocate ("-c8" here).  Note
         that the attributes coresPerClusterJob and gpusPerClusterJob were
         originally named coresPerNode and gpusPerNode prior to 4.12.40.  Both
         were misleading because such a job may not get an entire cluster
         node, and using coresPerNode in this context conflicted with the
         previous usage of it.  coresPerNode is still used when specifying a
         slurm queue running in exclusive resource allocation mode, but such a
         queue is not suitable for multi-core processing within a batch job
         and will be treated by Etomo as a simple queue having one core per
         job.
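
         The agreement between coresPerClusterJob and the queue command can
         be checked with a short shell function.  This is only a sketch:
         the name cores_match is hypothetical, and it assumes the cores are
         requested as "-c" followed by a number, as in the slurm command
         above.  Other queue types will need a different pattern.

```shell
# Hypothetical check, assuming cores are requested as "-c<number>"
# as in the slurm example above; other queue types will differ.
# Usage: cores_match "queue command" coresPerClusterJob
cores_match() {
    # Pull the digits that follow the last "-c" in the queue command.
    requested=$(printf '%s\n' "$1" | sed -n 's/.*-c\([0-9][0-9]*\).*/\1/p')
    # Fail if no core count was found or if it differs from the attribute.
    [ -n "$requested" ] && [ "$requested" -eq "$2" ]
}
```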

       Mode 3-1 - a primary queue with one core and a secondary queue with one
       GPU:
           processchunks -M 4 -q 32 -SN 4 \
             -SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
             'queuechunk -t slurm -l -c1,-n1,--partition=amilan' \
             batchJul25-193628_BB1

         cpu.adoc:
           [Queue = BatchPrimary]
           command = queuechunk -t slurm -l -c1,-n1,--partition=amilan
           number = 300

           [Queue = alpine-1GPU]
           command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
           coresPerClusterJob = 1
           gpusPerClusterJob = 1
           number = 24

         Here the number might be the total number of GPUs available (e.g., 8
         machines with 3 each), provided that the cluster software can assign
         each one separately.  With "-SN 4", each batch job will submit up to
         4 chunks to the queue with GPUs.  If the queue software does allow
         getting more than one core per job, it might be preferable to use the
         next setup instead:

       Mode 3-2 - a primary queue with six cores and a secondary queue with
       one GPU:
           processchunks -M 4 -q 32 -SN 4 \
             -SQ 'queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100' \
             'queuechunk -t slurm -l -c6,-n1,--partition=amilan' \
             batchJul25-193628_BB1

         cpu.adoc:
           [Queue = BatchCores]
           command = queuechunk -t slurm -l -c6,-n1,--partition=amilan
           coresPerClusterJob = 6
           number = 50

           [Queue = alpine-1GPU]
           command = queuechunk -t slurm -l -c1,-n1,-G1,--partition=aa100
           coresPerClusterJob = 1
           gpusPerClusterJob = 1
           number = 24

       If you define any multicore queues, note that these are useful only for
       parallel Batchruntomo processing.  For all other parallel processing
       in Etomo, you need to define an additional simple queue with only one
       core allocated per job, since that is all that will be used in those
       situations.

FILES
       The command files are given the same root name as the input file and
       are numbered from "-001".  There is a finishing file to remove command
       files, but the log files are left.

AUTHOR
       David Mastronarde  <mast at colorado dot edu>

SEE ALSO
       processchunks, batchruntomo



                                                                 splitbatch(1)