Using SLURM in the DCMB Virtual Classroom - Job Management

From Sibko-wiki : A miscellaneous wiki for all IT things DCMB

Job Management

If you've already gotten started with simple jobs and the srun and sbatch commands, you may now want to move on to the basics of multiple jobs and job management. To do so, you need to learn several further commands for interacting with the SLURM (Simple Linux Utility for Resource Management) workload manager.

Meet the Queue

As a user, your core concern with the operation of SLURM will be the queue. As the acronym suggests, SLURM is about resource management: you request resources from SLURM, now or in the future, and SLURM assigns them if they are available or holds (queues) your job until they are. As there is no guarantee of immediacy, the typical use of a system like SLURM is to prepare a script or program to run in batch, without interactivity. The combination of resources and the script or program being run is collectively referred to as a job, and using a SLURM system is typically a matter of submitting and managing jobs and monitoring the queue.

Example 1: List the queue

jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               109     batch y-crunch  cmbtest  R       3:32      1 classroom-server
               110     batch stress-n jdpoisso  R       3:29      1 classroom-server
               111     batch stress-n jdpoisso  R       3:26      1 classroom-server
               112     batch stress-n jdpoisso  R       3:26      1 classroom-server
               113     batch stress-n jdpoisso  R       3:26      1 classroom-server
jdpoisso@dcmb-classroom-5:~$

The simplest way to view the contents of the SLURM queue is with the squeue command. With no arguments, squeue outputs the entire contents of the queue: a full list of running and pending jobs from every user, organized into columns for easy review. In the prior example you see multiple jobs running on the system and no jobs pending.
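
As a rough illustration of reading those columns, the following sketch tallies jobs by the value in the ST column. The function name count_states and the variable sample_queue are made up for this example; the sample rows are copied from Example 1 so the sketch runs without a cluster. It assumes the default squeue column layout (ST is the 5th column) and job names that contain no spaces.

```shell
# count_states: read squeue-style text on stdin and print one
# "STATE COUNT" line per state found in the ST column.
# Assumption: default squeue layout, so ST is the 5th field.
count_states() {
    awk 'NR > 1 { count[$5]++ }
         END    { for (s in count) print s, count[s] }' | sort
}

# Sample rows copied from Example 1 (hypothetical variable name).
sample_queue='JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  109     batch y-crunch  cmbtest  R       3:32      1 classroom-server
  110     batch stress-n jdpoisso  R       3:29      1 classroom-server
  111     batch stress-n jdpoisso  R       3:26      1 classroom-server
  112     batch stress-n jdpoisso  R       3:26      1 classroom-server
  113     batch stress-n jdpoisso  R       3:26      1 classroom-server'

printf '%s\n' "$sample_queue" | count_states   # prints: R 5
```

On a classroom machine the same function could presumably be fed live data with squeue | count_states.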

Example 2: List the queue with waiting jobs

jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               116     batch   crunch  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               117     batch    crash  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               118     batch     bang  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               119     batch    smash  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               120     batch     ouch  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               130     batch y-crunch jdpoisso PD       0:00      1 (Resources)
               131     batch y-crunch jdpoisso PD       0:00      1 (Priority)
               132     batch y-crunch jdpoisso PD       0:00      1 (Priority)
               133     batch y-crunch jdpoisso PD       0:00      1 (Priority)
               115     batch    crich  cmbtest  R       4:16      1 classroom-server
               121     batch stress-n jdpoisso  R       3:37      1 classroom-server
               122     batch stress-n jdpoisso  R       3:32      1 classroom-server
               125     batch y-crunch jdpoisso  R       2:41      1 classroom-server
               126     batch y-crunch jdpoisso  R       2:38      1 classroom-server
               127     batch y-crunch jdpoisso  R       0:05      1 classroom-server
               128     batch y-crunch jdpoisso  R       0:02      1 classroom-server
               129     batch y-crunch jdpoisso  R       0:02      1 classroom-server
jdpoisso@dcmb-classroom-5:~$

In this example, however, you see much more activity, and several jobs are waiting rather than running. In the column headed 'ST' some jobs are marked 'PD' (pending) and others 'R' (running). For pending jobs, the 'NODELIST(REASON)' column gives the reason the job is waiting instead of the node it runs on. The 'USER' column shows that two users, cmbtest and jdpoisso, are currently active and using the system resources.

Looking more closely at the squeue output, you'll see that only one job from cmbtest is running and the rest report a 'REASON' of (QOSMaxCpuPerUserLimit). This means that granting another resource request would put this user over their assigned limit of CPU cores, i.e. the MaxCpuPerUser limit. So all the other jobs submitted by cmbtest, 116 through 120 (crunch, crash, bang, smash, and ouch), are pending because granting any of their requests would exceed the user's assigned limits.

Examining jdpoisso we see a different picture. He doesn't seem to have the same constraints as cmbtest, but some of his jobs are pending with different reasons. Job 130 is marked (Resources), which means the system simply does not have the resources to run another job: so many of the CPU cores and so much of the RAM are assigned to other jobs that not enough remain to honor the request, so the job pends until resources become available. The subsequent jobs, 131, 132, and 133, are all marked (Priority), meaning they are being held because a higher-priority job (130, the first blocked job) is ahead of them and must be started first.
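
To get a quick overview of why jobs are waiting, the pending rows can be grouped by the value in the NODELIST(REASON) column. The sketch below is one possible way to do that; pending_reasons and sample_pending are hypothetical names, and the sample rows are an abridged copy of Example 2. It again assumes the default squeue layout (ST in column 5, reason in column 8).

```shell
# pending_reasons: read squeue-style text on stdin and print
# "COUNT REASON" for pending (PD) jobs, most common reason first.
# Assumption: default squeue layout (ST = field 5, reason = field 8).
pending_reasons() {
    awk 'NR > 1 && $5 == "PD" { count[$8]++ }
         END { for (r in count) print count[r], r }' | sort -rn
}

# Rows abridged from Example 2 (hypothetical variable name).
sample_pending='JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  116     batch   crunch  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  117     batch    crash  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  118     batch     bang  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  119     batch    smash  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  120     batch     ouch  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  130     batch y-crunch jdpoisso PD       0:00      1 (Resources)
  131     batch y-crunch jdpoisso PD       0:00      1 (Priority)
  132     batch y-crunch jdpoisso PD       0:00      1 (Priority)
  133     batch y-crunch jdpoisso PD       0:00      1 (Priority)
  115     batch    crich  cmbtest  R       4:16      1 classroom-server
  121     batch stress-n jdpoisso  R       3:37      1 classroom-server'

printf '%s\n' "$sample_pending" | pending_reasons
```

For the sample rows this reports five jobs held by the per-user CPU limit, three held for priority, and one held for resources, matching the reading of Example 2 given above.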

Example 3: Time passes

jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               118     batch     bang  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               119     batch    smash  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               120     batch     ouch  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               132     batch y-crunch jdpoisso PD       0:00      1 (Resources)
               133     batch y-crunch jdpoisso PD       0:00      1 (Priority)
               117     batch    crash  cmbtest  R       5:14      1 classroom-server
               126     batch y-crunch jdpoisso  R      29:54      1 classroom-server
               127     batch y-crunch jdpoisso  R      27:21      1 classroom-server
               128     batch y-crunch jdpoisso  R      27:18      1 classroom-server
               129     batch y-crunch jdpoisso  R      27:18      1 classroom-server
               130     batch y-crunch jdpoisso  R      25:21      1 classroom-server
               131     batch y-crunch jdpoisso  R      14:34      1 classroom-server
jdpoisso@dcmb-classroom-5:~$

Nearly half an hour later, we see that some jobs have disappeared (completed or aborted), while others have started as resources became available and their requests could be honored. As time carries on, the pending jobs will gradually be released and started automatically on the system. New jobs that are submitted will pend until sufficient resources are available to run them.

Advanced squeue

Though the scope of the DCMB virtual classroom may keep queue activity at a manageable level without special sorting or options, you may want or need to restrict the output to a more specific subset of jobs. To do this, pass additional arguments to the squeue command. The most immediately useful is filtering the output by user with the --users= argument.

Example 4: squeue --users before and after

jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               136     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               137     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               138     batch    mouse  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               139     batch      ran  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               140     batch       up  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               141     batch      the  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               142     batch    clock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               143     batch  hickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               135     batch  hickory  cmbtest  R       1:52      1 classroom-server
               146     batch y-crunch jdpoisso  R       0:37      1 classroom-server
               147     batch crunchbe jdpoisso  R       0:08      1 classroom-server
jdpoisso@dcmb-classroom-5:~$ squeue --users=jdpoisso
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               146     batch y-crunch jdpoisso  R       1:42      1 classroom-server
               147     batch crunchbe jdpoisso  R       1:13      1 classroom-server
jdpoisso@dcmb-classroom-5:~$

Another common use of squeue is to display only the jobs that are pending in the queue. You can do this with the --state= argument.

Example 5: squeue with the --state= argument

jdpoisso@dcmb-classroom-5:~$ squeue --state=PD
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               137     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               138     batch    mouse  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               139     batch      ran  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               140     batch       up  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               141     batch      the  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               142     batch    clock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               143     batch  hickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
jdpoisso@dcmb-classroom-5:~$

(Note: the --state= argument can take either the short state code or the long state name, i.e. 'PD' or 'Pending'.)

Arguments may also be combined to narrow the results further. When both the --state= and --users= arguments are given, squeue lists only the jobs that match both conditions.

Example 6: Multiple arguments can be given to squeue

jdpoisso@dcmb-classroom-5:~$ squeue --users=cmbtest --state=Pending
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               136     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               137     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               138     batch    mouse  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit) 
               139     batch      ran  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               140     batch       up  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               141     batch      the  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               142     batch    clock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               143     batch  hickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
jdpoisso@dcmb-classroom-5:~$
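
To make the "match both conditions" behaviour concrete, the sketch below applies the same two tests to a saved copy of squeue output. It is only an illustration of the AND logic, not how squeue itself is implemented; filter_queue and sample_mixed are hypothetical names, the rows are abridged from Example 4, and unlike squeue the sketch only understands the short state codes shown in the ST column.

```shell
# filter_queue USER STATE: read squeue-style text on stdin and keep the
# header plus the rows where BOTH the user and the state code match.
# Assumption: default squeue layout (USER = field 4, ST = field 5).
filter_queue() {
    awk -v user="$1" -v state="$2" \
        'NR == 1 || ($4 == user && $5 == state)'
}

# Rows abridged from Example 4 (hypothetical variable name).
sample_mixed='JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  136     batch  dickory  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  137     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  135     batch  hickory  cmbtest  R       1:52      1 classroom-server
  146     batch y-crunch jdpoisso  R       0:37      1 classroom-server
  147     batch crunchbe jdpoisso  R       0:08      1 classroom-server'

# Keeps the header and jobs 136 and 137 only: job 135 matches the user
# but not the state, and jobs 146 and 147 match neither condition.
printf '%s\n' "$sample_mixed" | filter_queue cmbtest PD
```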


Argument     Short Form   Effect
--account    -A           Filters squeue results by account (default is 'class' in virtual classroom)
--name       -n           Filters squeue results by name
--priority   -P           Sorts jobs by priority

Tell me about my job

The squeue command gives you a fairly high-level overview of the queue and the jobs running and waiting on the system. However, you may need a detailed synopsis of a particular job. For detailed job information use scontrol show job followed by the job ID. This dumps a complete accounting of all relevant SLURM information on the job: resource requests, script, output files, timestamps, and time limits. In the DCMB virtual classroom, the job information is even kept and reported by scontrol show job for up to 1 day after the job completes or aborts.

Example 7: Way too much information

jdpoisso@dcmb-classroom-5:~$ scontrol show job 118
JobId=118 JobName=bang
   UserId=cmbtest(114177927) GroupId=users(100) MCS_label=N/A
   Priority=4294901643 Nice=0 Account=class QOS=normal
   JobState=RUNNING Reason=None Dependency=(null) 
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:03:51 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2018-11-06T15:39:57 EligibleTime=2018-11-06T15:39:57
   StartTime=2018-11-06T16:18:31 EndTime=2018-11-06T18:18:31 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2018-11-06T16:18:31
   Partition=batch AllocNode:Sid=dcmb-classroom-cmbtest:22153
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=classroom-server
   BatchHost=classroom-server
   NumNodes=1 NumCPUs=4 NumTasks=1 CPUs/Task=4 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,mem=30G,node=1,billing=4
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=4 MinMemoryNode=30G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/cmbtest/y-cruncher.sh
   WorkDir=/home/cmbtest
   StdErr=/home/cmbtest/slurm-118.out
   StdIn=/dev/null
   StdOut=/home/cmbtest/slurm-118.out
   Power=
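
Since the output is a long run of Key=Value pairs, it is often handy to pull out a single field. The sketch below shows one way this might be done with standard tools; job_field and sample_job are hypothetical names, and the sample lines are copied from Example 7. It assumes values contain no spaces, which holds for the fields shown here but may not hold for every job (a job name with spaces, for instance).

```shell
# job_field KEY: read "scontrol show job" text on stdin and print the
# value of the first Key=Value pair whose key equals KEY.
# Assumption: values contain no spaces.
job_field() {
    tr -s ' ' '\n' |
        awk -F= -v key="$1" \
            '$1 == key && !found { print substr($0, length(key) + 2); found = 1 }'
}

# Lines copied from Example 7 (hypothetical variable name).
sample_job='JobId=118 JobName=bang
   JobState=RUNNING Reason=None Dependency=(null)
   RunTime=00:03:51 TimeLimit=02:00:00 TimeMin=N/A
   TRES=cpu=4,mem=30G,node=1,billing=4
   StdOut=/home/cmbtest/slurm-118.out'

printf '%s\n' "$sample_job" | job_field JobState    # prints: RUNNING
printf '%s\n' "$sample_job" | job_field TimeLimit   # prints: 02:00:00
```

On a classroom machine this could presumably be used as scontrol show job 118 | job_field JobState.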

Canceling jobs in the queue

The last major command needed in general use is the one that cancels jobs. Perhaps the job was not set up correctly, or perhaps the results of an earlier job show that it will not generate the data you hoped for. In any case, the command to cancel jobs is scancel, which takes the job ID as its argument.

Example 8: Using the scancel command in context

jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest  R       1:14      1 classroom-server
               148     batch    pipi1 jdpoisso  R       0:07      1 classroom-server
               149     batch    pipi2 jdpoisso  R       0:03      1 classroom-server
jdpoisso@dcmb-classroom-5:~$ scancel 149
jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               149     batch    pipi2 jdpoisso CG       0:11      1 classroom-server
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest  R       1:24      1 classroom-server
               148     batch    pipi1 jdpoisso  R       0:17      1 classroom-server
jdpoisso@dcmb-classroom-5:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
               144     batch  dickory  cmbtest  R       1:36      1 classroom-server
               148     batch    pipi1 jdpoisso  R       0:29      1 classroom-server
jdpoisso@dcmb-classroom-5:~$

In the example you can see scancel 149 used to request cancellation of the job with ID 149. The state then changes to 'CG' (completing) while the job is signaled and cleaned up. When checked again a few moments later, the job is gone: the cleanup has completed and the queue has been updated.
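
When several jobs need cancelling, it can help to preview the scancel commands before running any of them. The following sketch is a dry run only: it prints one scancel line per pending job of a given user and cancels nothing. The names cancel_commands and sample_cancel are hypothetical, the rows are copied from the first listing in Example 8, and the default squeue layout is assumed.

```shell
# cancel_commands USER: read squeue-style text on stdin and print a
# "scancel JOBID" line for each pending (PD) job owned by USER.
# Dry run: nothing is cancelled, the commands are only printed.
# Assumption: default squeue layout (USER = field 4, ST = field 5).
cancel_commands() {
    awk -v user="$1" \
        'NR > 1 && $4 == user && $5 == "PD" { print "scancel", $1 }'
}

# Rows copied from the first listing in Example 8 (hypothetical name).
sample_cancel='JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  145     batch     dock  cmbtest PD       0:00      1 (QOSMaxCpuPerUserLimit)
  144     batch  dickory  cmbtest  R       1:14      1 classroom-server
  148     batch    pipi1 jdpoisso  R       0:07      1 classroom-server
  149     batch    pipi2 jdpoisso  R       0:03      1 classroom-server'

printf '%s\n' "$sample_cancel" | cancel_commands cmbtest   # prints: scancel 145
```

Printing the commands first gives you a chance to review the list. Note that scancel also accepts filters of its own, such as --user and --state; check man scancel on your system for the exact options available.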
