Using SLURM in the DCMB Virtual Classroom - Job Management
Job Management
If you've already gotten started with simple jobs and the srun and sbatch commands, you may now want to move on to the basics of multiple jobs and job management. To do so, you need to learn several further commands for interacting with the SLURM (Simple Linux Utility for Resource Management) workload manager.
Meet the Queue
As a user, your core concern with the operation of SLURM will be the queue. As the expanded acronym suggests, SLURM is about resource management. You request resources from SLURM, now or in the future, and SLURM assigns them if they are available or holds (queues) your job until they are. As there is no guarantee of immediacy, the typical use of a system like SLURM is to prepare a script or program to run in batch, without interactivity. The combination of resources and the script or program being run is collectively referred to as a job, and using a SLURM system is typically a matter of submitting and managing jobs and monitoring the queue.
Example 1: List the queue
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
109 batch y-crunch cmbtest R 3:32 1 classroom-server
110 batch stress-n jdpoisso R 3:29 1 classroom-server
111 batch stress-n jdpoisso R 3:26 1 classroom-server
112 batch stress-n jdpoisso R 3:26 1 classroom-server
113 batch stress-n jdpoisso R 3:26 1 classroom-server
jdpoisso@dcmb-classroom-5:~$
The simplest way to view the contents of the SLURM queue is with the squeue command. With no arguments, squeue outputs the entire contents of the queue: a full list of running and pending jobs from every user, organized into columns for easy review. In the prior example you see multiple jobs running on the system and no jobs pending.
Example 2: List the queue with waiting jobs
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
116 batch crunch cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
117 batch crash cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
118 batch bang cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
119 batch smash cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
120 batch ouch cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
130 batch y-crunch jdpoisso PD 0:00 1 (Resources)
131 batch y-crunch jdpoisso PD 0:00 1 (Priority)
132 batch y-crunch jdpoisso PD 0:00 1 (Priority)
133 batch y-crunch jdpoisso PD 0:00 1 (Priority)
115 batch crich cmbtest R 4:16 1 classroom-server
121 batch stress-n jdpoisso R 3:37 1 classroom-server
122 batch stress-n jdpoisso R 3:32 1 classroom-server
125 batch y-crunch jdpoisso R 2:41 1 classroom-server
126 batch y-crunch jdpoisso R 2:38 1 classroom-server
127 batch y-crunch jdpoisso R 0:05 1 classroom-server
128 batch y-crunch jdpoisso R 0:02 1 classroom-server
129 batch y-crunch jdpoisso R 0:02 1 classroom-server
jdpoisso@dcmb-classroom-5:~$
In this example, however, you see much more activity, and several jobs are being held for lack of resources. In the column headed 'ST' some jobs are marked 'PD' (pending) and some are marked 'R' (running). For pending jobs, the column 'NODELIST(REASON)' gives the reason for that state. The column marked 'USER' shows that two users, cmbtest and jdpoisso, are currently active and using the system resources.
Looking more closely at this squeue output, you'll see that only one job from cmbtest is running (115) and the rest report a 'REASON' of (QOSMaxCpuPerUserLimit). This means that granting an additional resource request would put this user over the limit on CPU cores per user set by the QOS (quality of service), i.e. the MaxCpuPerUser limit. So all the other jobs submitted by cmbtest, 116 through 120 (crunch, crash, bang, smash, and ouch), are pending because granting any of their resource requests would exceed the user's assigned limit.
Examining jdpoisso we see a different picture. This user doesn't seem to have the same level of constraints as cmbtest, but some of the jobs are pending with a different 'REASON' than that given to cmbtest. Looking first at job 130, the reason is marked (Resources), which means the system simply does not have the resources with which to run another job. So many of the CPU cores and so much of the RAM are assigned to other jobs that there is not enough left to honor the request, so the job is pending until resources become available. The subsequent jobs 131, 132, and 133 are all marked (Priority), meaning that they are being held because at least one higher-priority job, here the blocked job 130, is ahead of them in the queue and they are waiting for it to be started first.
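The reading of the 'USER' and 'ST' columns described above can also be done by a small script. What follows is only a minimal sketch and not part of the classroom setup: queue_summary is a hypothetical helper name, and it assumes the default squeue column layout shown in the examples (USER in the 4th column, ST in the 5th) and job names without spaces.

```shell
# Hypothetical helper: count jobs per user and state from squeue output
# read on standard input.
# Assumes the default squeue columns (USER = 4th field, ST = 5th field).
queue_summary() {
    # NR > 1 skips the header row; each remaining row is one job.
    awk 'NR > 1 { count[$4 " " $5]++ }
         END { for (key in count) print key, count[key] }' | sort
}

# Usage on a live system:
#   squeue | queue_summary
```

Applied to the queue in Example 2, this would report cmbtest with 5 pending and 1 running job, and jdpoisso with 4 pending and 7 running.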
Example 3: Time passes
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
118 batch bang cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
119 batch smash cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
120 batch ouch cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
132 batch y-crunch jdpoisso PD 0:00 1 (Resources)
133 batch y-crunch jdpoisso PD 0:00 1 (Priority)
117 batch crash cmbtest R 5:14 1 classroom-server
126 batch y-crunch jdpoisso R 29:54 1 classroom-server
127 batch y-crunch jdpoisso R 27:21 1 classroom-server
128 batch y-crunch jdpoisso R 27:18 1 classroom-server
129 batch y-crunch jdpoisso R 27:18 1 classroom-server
130 batch y-crunch jdpoisso R 25:21 1 classroom-server
131 batch y-crunch jdpoisso R 14:34 1 classroom-server
jdpoisso@dcmb-classroom-5:~$
Now we see the queue nearly half an hour later. Some jobs have disappeared, having completed or aborted, while others have started as resources became available and their requests could be honored. As time carries on, the pending jobs will gradually be released and automatically started; newly submitted jobs will pend until sufficient resources are available to run them.
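Rather than re-running squeue by hand while time passes, a script can poll the queue. The following is a rough sketch under stated assumptions: wait_for_job is a hypothetical helper name, the 30 second default interval is arbitrary, and the list of states treated as "still in the queue" is my own choice rather than anything prescribed by the classroom.

```shell
# Hypothetical helper: block until a job no longer appears in the queue.
# Usage: wait_for_job JOBID [POLL_SECONDS]
wait_for_job() {
    local jobid=$1 interval=${2:-30}
    # --noheader suppresses the column titles, so empty output means the
    # job is not listed in any of the named states (it finished, failed,
    # or was cancelled). Errors for unknown job IDs are discarded.
    while [ -n "$(squeue --noheader --jobs="$jobid" \
                  --state=PENDING,RUNNING,SUSPENDED,COMPLETING 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "job $jobid has left the queue"
}
```

For example, wait_for_job 131 60 would check once a minute and return when job 131 is no longer pending or running. Keep the interval generous so that polling does not load the scheduler.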
Advanced squeue
Though the scope of the DCMB virtual classroom may keep queue activity at a manageable level without special sorting or options, you may want or need to restrict the output to a more specific subset of jobs. To do this, you can pass arguments to the squeue command. The most immediately useful is the --users= argument, which limits the output to the jobs of the named user.
Example 4: squeue --users before and after
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
136 batch dickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
137 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
138 batch mouse cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
139 batch ran cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
140 batch up cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
141 batch the cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
142 batch clock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
143 batch hickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
135 batch hickory cmbtest R 1:52 1 classroom-server
146 batch y-crunch jdpoisso R 0:37 1 classroom-server
147 batch crunchbe jdpoisso R 0:08 1 classroom-server
jdpoisso@dcmb-classroom-5:~$ squeue --users=jdpoisso
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
146 batch y-crunch jdpoisso R 1:42 1 classroom-server
147 batch crunchbe jdpoisso R 1:13 1 classroom-server
jdpoisso@dcmb-classroom-5:~$
Another common use of squeue is to display only the jobs that are pending in the queue. You can do this with the --state= argument.
Example 5: squeue with the --state= argument
jdpoisso@dcmb-classroom-5:~$ squeue --state=PD
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
137 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
138 batch mouse cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
139 batch ran cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
140 batch up cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
141 batch the cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
142 batch clock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
143 batch hickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
jdpoisso@dcmb-classroom-5:~$
(Note: the --state= argument accepts either the short state code or the long state name, i.e. 'PD' or 'Pending'.)
Arguments may also be combined to narrow the results further. Both the --state= and --users= arguments may be given to squeue to list only the jobs that match both conditions.
Example 6: Multiple arguments can be given to squeue
jdpoisso@dcmb-classroom-5:~$ squeue --users=cmbtest --state=Pending
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
136 batch dickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
137 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
138 batch mouse cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
139 batch ran cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
140 batch up cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
141 batch the cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
142 batch clock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
143 batch hickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
jdpoisso@dcmb-classroom-5:~$
| Argument | Short Form | Effect |
| --account | -A | Filters squeue results by account (default is 'class' in virtual classroom) |
| --name | -n | Filters squeue results by name |
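| --priority | -P | Lists a pending job once per partition when it was submitted to several partitions; to sort the output, use the --sort (-S) argument instead |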
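The filters above can also be combined inside a script. As a hedged illustration, the sketch below tallies one user's pending jobs by 'REASON'. pending_reasons is a hypothetical helper name; the --noheader and --format options and the %r (reason) field are standard squeue features, while --users= and --state= are spelled as in the examples above.

```shell
# Hypothetical helper: tally the pending jobs of one user by REASON.
# Usage: pending_reasons USERNAME
pending_reasons() {
    # %r prints only the reason a job is waiting;
    # --noheader drops the title row so only job rows are counted.
    squeue --noheader --users="$1" --state=PENDING --format=%r |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a reason followed by the number of pending jobs that report it, which is quicker to read than the full listing when a user has many jobs queued.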
Tell me about my job
The squeue command gives you a fairly high-level overview of the queue and the jobs running and waiting on the system. However, you may need a detailed synopsis of a particular job. For detailed job information use scontrol show job followed by the job ID. This dumps a complete accounting of all relevant SLURM information on the job: resource requests, script, output files, timestamps, and time limits. In the DCMB virtual classroom the job information is even kept, and reported by scontrol show job, for up to 1 day after the job completes or aborts.
Example 7: Way too much information
jdpoisso@dcmb-classroom-5:~$ scontrol show job 118
JobId=118 JobName=bang
   UserId=cmbtest(114177927) GroupId=users(100) MCS_label=N/A
   Priority=4294901643 Nice=0 Account=class QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:03:51 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2018-11-06T15:39:57 EligibleTime=2018-11-06T15:39:57
   StartTime=2018-11-06T16:18:31 EndTime=2018-11-06T18:18:31 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2018-11-06T16:18:31
   Partition=batch AllocNode:Sid=dcmb-classroom-cmbtest:22153
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=classroom-server
   BatchHost=classroom-server
   NumNodes=1 NumCPUs=4 NumTasks=1 CPUs/Task=4 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,mem=30G,node=1,billing=4
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=4 MinMemoryNode=30G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/cmbtest/y-cruncher.sh
   WorkDir=/home/cmbtest
   StdErr=/home/cmbtest/slurm-118.out
   StdIn=/dev/null
   StdOut=/home/cmbtest/slurm-118.out
   Power=
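Since the scontrol output is a series of Key=Value pairs, a single field can be picked out with standard text tools. This is only a sketch: job_field is a hypothetical helper name, and it assumes values that contain no spaces (a Command with arguments, for instance, would be cut at the first space).

```shell
# Hypothetical helper: print the value of one field from
# `scontrol show job` output supplied on standard input.
# Usage: scontrol show job 118 | job_field JobState
job_field() {
    # Put each whitespace-separated Key=Value pair on its own line,
    # then print whatever follows "Key=" for the requested key.
    # "|" is used as the sed delimiter so keys such as CPUs/Task work.
    tr -s ' \t' '\n' | sed -n "s|^$1=||p"
}
```

For the job in Example 7, job_field TimeLimit would print 02:00:00 and job_field JobState would print RUNNING.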
Canceling jobs in the queue
The last major command needed in general use is the one that cancels jobs. Perhaps the job was not set up correctly, or perhaps the results of an earlier job show that it will not generate the data you hoped for. In any case, the command to cancel a job is scancel, which takes the job ID as its argument.
Example 8: Using the scancel command in context
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest R 1:14 1 classroom-server
148 batch pipi1 jdpoisso R 0:07 1 classroom-server
149 batch pipi2 jdpoisso R 0:03 1 classroom-server
jdpoisso@dcmb-classroom-5:~$ scancel 149
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
149 batch pipi2 jdpoisso CG 0:11 1 classroom-server
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest R 1:24 1 classroom-server
148 batch pipi1 jdpoisso R 0:17 1 classroom-server
jdpoisso@dcmb-classroom-5:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
145 batch dock cmbtest PD 0:00 1 (QOSMaxCpuPerUserLimit)
144 batch dickory cmbtest R 1:36 1 classroom-server
148 batch pipi1 jdpoisso R 0:29 1 classroom-server
jdpoisso@dcmb-classroom-5:~$
In the example you can see scancel 149 used to request cancellation of the job with ID 149. The job's state then changes to 'CG' (completing) while the job is signaled and cleaned up. When the queue is checked again a few moments later, the job is gone: the cleanup has completed and the queue has been updated.
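When several jobs need to be cancelled at once, squeue and scancel can be combined. The sketch below is an illustration only: cancel_by_name is a hypothetical helper name, and it relies on the --name filter from the table above, the %i (job ID) format field, and the fact that scancel accepts more than one job ID.

```shell
# Hypothetical helper: cancel all of a user's jobs with a given name.
# Usage: cancel_by_name USERNAME JOBNAME
cancel_by_name() {
    local ids
    # %i prints only the job ID; --name filters on the NAME column.
    ids=$(squeue --noheader --users="$1" --name="$2" --format=%i)
    if [ -z "$ids" ]; then
        echo "no matching jobs"
        return 0
    fi
    # $ids is deliberately left unquoted so that each job ID becomes
    # its own argument to scancel.
    scancel $ids
    echo "cancelled:" $ids
}
```

Ordinary users can only cancel their own jobs, so in practice USERNAME will be your own login. Running squeue with the same filters first is a sensible way to preview what would be cancelled.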