
Running Jobs

Slurm

Slurm is a workload manager and job scheduler typically used in HPC systems to coordinate how computing resources are shared among users. It queues submitted tasks, allocates the necessary compute nodes to handle them and manages the execution and monitoring of those jobs.

[Figure: Slurm workflow diagram]

Submitting a job

Jobs should be submitted using the sbatch command and the proper job directives.

sbatch [options] my_job.sh

Examples of parameters you can use with sbatch:

-J, --job-name={name}            name of the job
-q, --qos={name}                 quality of service (queue) to run the job under
-p, --partition={name}           partition to submit the job to

-t, --time={time}                wall-clock time limit
-n, --ntasks={number}            number of tasks (processes) to launch
-c, --cpus-per-task={number}     number of CPU cores per task
-N, --nodes={number}             number of nodes to allocate
Note

Job directives can be given on the sbatch command line (e.g. sbatch -A <project_name> -n 1 my_job.sh) or defined inside the batch script itself.
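
For example, a single-task job can be submitted entirely from the command line. This is a sketch: the project name is a placeholder, the time limit is illustrative, and dev-arm is one of the Deucalion partitions listed further down this page.

sbatch -A <project_name> -p dev-arm -t 01:00:00 -n 1 my_job.sh
# sbatch replies with the ID assigned to the job, e.g.:
# Submitted batch job 123456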

Job Directives

Job directives are options that define the job, such as user account, resources, run time, etc. They are specified in the first lines of the batch script.

Here are two examples of a batch script (my_job.sh): one for Deucalion, which selects a partition, and one for MN5, which selects a QoS (queue).

my_job.sh (Deucalion)
#!/bin/bash
#SBATCH --job-name=exampleJob
#SBATCH --partition=examplePartition
#SBATCH --account=exampleAccount
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G

python my_python_script.py
Line by line breakdown

Each line in this batch script corresponds to a specific instruction:

  1. #!/bin/bash
    • Tells the operating system to use the Bash interpreter to execute the rest of the file.
  2. #SBATCH --job-name=exampleJob
    • Sets the name of the job, as displayed by tools such as squeue.
  3. #SBATCH --partition=examplePartition
    • Instructs the Slurm scheduler to submit this job to the partition (queue) named examplePartition.
  4. #SBATCH --account=exampleAccount
    • Specifies the billing account or project group (exampleAccount) that is charged for the compute resources used by this job.
  5. #SBATCH --time=02:00:00
    • Sets the maximum runtime of the job to 2 hours (HH:MM:SS). If the job runs longer, Slurm terminates it.
  6. #SBATCH --nodes=1
    • Requests 1 compute node (a single physical machine within the cluster) to run the job.
  7. #SBATCH --ntasks=1
    • Specifies the number of tasks (process instances) to launch for the job.
  8. #SBATCH --cpus-per-task=1
    • Specifies the number of CPU cores allocated to each task.
  9. #SBATCH --mem=2G
    • Requests 2 GB of memory per node.
  10. python my_python_script.py
    • Runs my_python_script.py with Python once the requested resources have been allocated.
my_job.sh (MN5)
#!/bin/bash
#SBATCH --job-name=exampleJob
#SBATCH --qos=exampleQueue
#SBATCH --account=exampleAccount
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G


python my_python_script.py
Line by line breakdown

Each line in this batch script corresponds to a specific instruction:

  1. #!/bin/bash
    • Tells the operating system to use the Bash interpreter to execute the rest of the file.
  2. #SBATCH --job-name=exampleJob
    • Sets the name of the job, as displayed by tools such as squeue.
  3. #SBATCH --qos=exampleQueue
    • Instructs the Slurm scheduler to run this job under the quality of service (queue) named exampleQueue.
  4. #SBATCH --account=exampleAccount
    • Specifies the billing account or project group (exampleAccount) that is charged for the compute resources used by this job.
  5. #SBATCH --time=02:00:00
    • Sets the maximum runtime of the job to 2 hours (HH:MM:SS). If the job runs longer, Slurm terminates it.
  6. #SBATCH --nodes=1
    • Requests 1 compute node (a single physical machine within the cluster) to run the job.
  7. #SBATCH --ntasks=1
    • Specifies the number of tasks (process instances) to launch for the job.
  8. #SBATCH --cpus-per-task=1
    • Specifies the number of CPU cores allocated to each task.
  9. #SBATCH --mem=2G
    • Requests 2 GB of memory per node.
  10. python my_python_script.py
    • Runs my_python_script.py with Python once the requested resources have been allocated.
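
As a variation on the scripts above, here is a minimal sketch of a multithreaded (e.g. OpenMP-style) job that uses more than one CPU core per task. The four-core request and the script are illustrative assumptions, and you would still add the --partition (Deucalion) or --qos (MN5) line appropriate to your system.

#!/bin/bash
#SBATCH --job-name=exampleThreadedJob
#SBATCH --account=exampleAccount
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # illustrative: four cores for a single multithreaded task
#SBATCH --mem=2G

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# match the program's thread count to the allocated cores.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

python my_python_script.py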

Account

The --account flag specifies the project or group to which the job's resource consumption is attributed.

On Deucalion, you can check your accounts with the billing command, and a table similar to this will appear:

Account  | Used (h) | Limit (h) | Used (%)
accounta |       29 |        50 |    58.97
accountg |       68 |       500 |    13.74
accountx |     2872 |     10000 |    28.73

On MN5, you can check your available accounts by typing bsc_project list in your shell:

You currently have access to the following accounts:
    eporaif-XXX
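
Once you know the account name, charge a job to it with the --account (-A) directive. As a sketch, using accounta from the example table above:

# On the command line
sbatch -A accounta my_job.sh

# Or inside the batch script
#SBATCH --account=accounta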

Partitions / Queues

Selecting the correct partition ensures the job is routed to the specific hardware it requires, such as GPUs or high-memory nodes.

List of available partitions (Deucalion) and Queues (MN5)

To check the partitions available on Deucalion, run sinfo.

Partition      | Architecture | Max Nodes | Time Limit | GPU
dev-arm        | aarch64      |         2 | 4 hours    | -
normal-arm     | aarch64      |       128 | 48 hours   | -
large-arm      | aarch64      |       512 | 72 hours   | -
dev-x86        | x86_64       |         2 | 4 hours    | -
normal-x86     | x86_64       |        64 | 48 hours   | -
large-x86      | x86_64       |       128 | 72 hours   | -
dev-a100-40    | x86_64       |         1 | 4 hours    | A100 40GB
normal-a100-40 | x86_64       |         4 | 48 hours   | A100 40GB
dev-a100-80    | x86_64       |         1 | 4 hours    | A100 80GB
normal-a100-80 | x86_64       |         4 | 48 hours   | A100 80GB
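
For example, to target one of the A100 partitions above from a batch script, the relevant directives would look roughly like this (a sketch: the partition choice is illustrative, and the --gres=gpu:1 request follows the same form as the interactive GPU example at the end of this page):

#SBATCH --partition=normal-a100-40   # one of the GPU partitions listed above
#SBATCH --gres=gpu:1                 # request one GPU on the allocated node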

To check the queues available on MN5, run bsc_queues.

GPP

Queue       | Max. number of nodes (cores) | Wallclock | Slurm QoS name
BSC         | 125 (14,000)                 | 48h       | gp_bsc
Data        | 4 (448)                      | 72h       | gp_data
Debug       | 32 (3,584)                   | 2h        | gp_debug
EuroHPC     | 800 (89,600)                 | 72h       | gp_ehpc
HBM         | 50 (5,600)                   | 72h       | gp_hbm
Interactive | 1 (32)                       | 2h        | gp_interactive
RES Class A | 200 (22,400)                 | 72h       | gp_resa
RES Class B | 200 (22,400)                 | 48h       | gp_resb
RES Class C | 50 (5,600)                   | 24h       | gp_resc
Training    | 32 (3,584)                   | 48h       | gp_training

ACC (GPU)

Queue       | Max. number of nodes (cores) | Wallclock | Slurm QoS name
BSC         | 25 (2,000)                   | 48h       | acc_bsc
Debug       | 8 (640)                      | 2h        | acc_debug
EuroHPC     | 100 (8,000)                  | 72h       | acc_ehpc
Interactive | 1 (40)                       | 2h        | acc_interactive
RES Class A | 100 (8,000)                  | 72h       | acc_resa
RES Class B | 100 (8,000)                  | 48h       | acc_resb
RES Class C | 10 (800)                     | 24h       | acc_resc
Training    | 4 (320)                      | 48h       | acc_training
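
On MN5, the value in the Slurm QoS name column is what you pass to --qos (or -q). For example, a short test job could be sent to the general-purpose debug queue like this (a sketch: the project name is a placeholder and the time and task counts are illustrative):

sbatch -A <project_name> -q gp_debug -t 00:30:00 -n 1 my_job.sh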

Manage Jobs

Check Job Status

You can check the status of your submitted job by executing:

squeue --me

Job Status Codes

Status     | Code
Completed  | CD
Completing | CG
Failed     | F
Pending    | PD
Running    | R
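
In squeue output, the job state appears in the ST column using the codes above. The output looks roughly like the following sketch; the job ID, partition, elapsed time, and node name are illustrative, and column widths may differ on your system:

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            123456   dev-arm exampleJ   myuser  R      12:34      1 cn001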

Cancel a Job

Use the command scancel to cancel a submitted job.

scancel <job_id>
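
scancel also accepts filters beyond a single job ID. The options below are standard Slurm flags; the job name is the illustrative one from the example scripts above:

# Cancel all of your own jobs
scancel -u $USER

# Cancel jobs by name (the name set with --job-name)
scancel --name=exampleJob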

Interactive Jobs

To allocate an interactive job, use the salloc or srun commands.

On Deucalion, use salloc with a partition:

salloc -A <project_name> -p <partition>

Alternatively, use srun:

srun -A <project_name> --time=XX:XX:XX --nodes=1 -p <partition> --pty bash

On MN5, use salloc with a queue (QoS):

salloc -A <project_name> -q <queue>

Allocate an interactive GPU job:

salloc -A <project_name> -n 1 -c 40 -t 01:00:00 -q <gpu_queue_name> --gres=gpu:1
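
A typical interactive session on Deucalion might then look like the following sketch; the job ID, node name, and partition are illustrative, and the exact salloc behaviour can vary with the cluster's configuration:

$ salloc -A <project_name> -p dev-arm -t 01:00:00 -n 1
salloc: Granted job allocation 123456
$ srun hostname            # commands launched with srun run on the allocated node
cn001
$ exit                     # release the allocation when you are done
salloc: Relinquishing job allocation 123456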