QoS (Quality of Service)

We have implemented Quality of Service (QoS) for MonARCH.

The QoS can be added to each job that is submitted to Slurm. The quality of service associated with a job will affect the job in three ways:

  1. Job Scheduling Priority

  2. Job Preemption

  3. Job Limits

Important

All users currently have a default QOS assigned to them, so there is no need to specify one in normal usage. Users may be asked to add a QOS field for a specific reason, e.g. because they belong to a partner share.

To view the current list of all QOSs and their current values, run the following command. (The format field limits the output to the most useful parameters.)

sacctmgr show qos format="Name,MaxWall,MaxTRESPerUser%30,MaxJob,MaxSubmit,Priority,Preempt"
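To look at a single QOS, or to check which QOSs your own account may use, the same command can be narrowed down. The following is a minimal sketch rather than an official recipe: the column widths (%20, %40) are arbitrary choices, and the guard around sacctmgr is only there so the snippet does nothing harmful when run off the cluster.

```shell
# QOS to inspect; "normal" is the default QOS described on this page.
qos_name="normal"
# Same fields as the command above.
qos_format="Name,MaxWall,MaxTRESPerUser%30,MaxJob,MaxSubmit,Priority,Preempt"
# Fall back to id(1) if USER is not set in this shell.
me="${USER:-$(id -un)}"

if command -v sacctmgr >/dev/null 2>&1; then
    # Show the limits of one QOS only.
    sacctmgr show qos name="$qos_name" format="$qos_format"
    # Show the default QOS and the QOSs allowed for your associations.
    sacctmgr show assoc user="$me" format="User,Account%20,DefaultQOS,QOS%40"
else
    echo "sacctmgr not found: run this on a cluster login node" >&2
fi
```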

For example, the normal QOS has these properties.

‘normal’ QOS (default QOS for jobs)

  Max Wall Time          7 Days
  Max GPU per User       4
  Max CPU per User       65
  Priority (Fairshare)   50
  Preempt                No

How to run jobs with QoS

Explanation

To request a QOS for a job, add a line like this to your Slurm submission script.

#SBATCH --qos=normal
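The same option can also be given on the sbatch command line, where it takes precedence over any #SBATCH --qos line inside the script. A short sketch, assuming a job script called myjob.sh (a hypothetical file name used only for illustration):

```shell
qos="normal"            # QOS to request
job_script="myjob.sh"   # hypothetical job script name

# Submit only if Slurm is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    # Command-line options override #SBATCH directives in the script.
    sbatch --qos="$qos" "$job_script"
fi
```

This is convenient for trying a different QOS once without editing the script.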

An example Slurm job script

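#!/bin/bash
#SBATCH --job-name=MyJob
# Request the 'partner' QOS (use this only if you have been asked to)
#SBATCH --qos=partner
#SBATCH --ntasks=2

# Load the MPI environment, then launch the program on the allocated tasks.
# Replace <program> with the path to your MPI executable.
module load openmpi/1.10.7-mlx
mpirun <program>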
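After submitting, you may want to confirm which QOS the job actually received. The sketch below is one possible way to do that, not a prescribed procedure: myjob.sh is a hypothetical file name, parse_job_id is a helper introduced here for illustration, and the squeue column widths are arbitrary.

```shell
# sbatch --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
parse_job_id() {
    printf '%s\n' "${1%%;*}"
}

job_script="myjob.sh"   # hypothetical job script name

if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    job_id="$(parse_job_id "$(sbatch --parsable "$job_script")")"
    # %i = job ID, %q = QOS of the job, %T = job state
    squeue -j "$job_id" -o "%.18i %.12q %.10T"
else
    echo "sbatch or $job_script not found: nothing submitted" >&2
fi
```

If the QOS column does not show the value you requested, check that your account is permitted to use that QOS.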