
CRC How Do I Ensure My Job Has Enough Memory To Run Using SLURM?


Introduction

Because nodes on the cluster are shared among all users, a single process owned by one user can end up consuming all of the memory on a node. If this happens, new jobs assigned to that node will likely not have enough memory to run. If your job is memory intensive, it is best to request a node with enough available memory to run it.

Determining Memory Requirements

To request a node with an appropriate amount of memory available, you first need to know how much memory your job will actually use. If this is not known ahead of time, it is recommended that you submit a test job and request exclusive access to a node. With exclusive access, only your job runs on the node, so it will not interfere with or be disrupted by other jobs, giving you the best opportunity to measure how much memory your job consumes.

To ensure accurate data collection, use srun when executing the program as follows:

#!/bin/bash
#SBATCH --job-name=YourJobNameHere
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
# Request the whole node so other jobs cannot skew the measurement
#SBATCH --exclusive

echo "My job ran on:"
echo $SLURM_NODELIST

# Run from shared scratch if it exists and is writable, and launch the
# program with srun so SLURM records its resource usage
if [[ -d $SHARED_SCRATCH/$USER && -w $SHARED_SCRATCH/$USER ]]
then
    cd $SHARED_SCRATCH/$USER
    srun /path/to/myprogram
fi

When you submit this script, a job ID will be returned; use that ID to display the job's usage data.
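For example, submission looks similar to the following (the script name and job number below are placeholders):

# sbatch myjob.slurm
Submitted batch job 123456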

Show job usage data

You can use sacct on your test job to report, among other things, its memory utilization. The default output does not contain memory utilization, so you will need to use the --format parameter to sacct to include memory information.

# sacct -j jobID
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
jobID              TEST    commons    commons         16  COMPLETED      0:0
jobID.batch       batch               commons          1  COMPLETED      0:0
jobID.0          memhog               commons         16  COMPLETED      0:0
jobID.1           sleep               commons         16  COMPLETED      0:0

# sacct -j jobID --format=JobID,JobName,Partition,Account,AllocCPUS,State,ExitCode,MaxRSS,MaxVMSize
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode     MaxRSS  MaxVMSize
------------ ---------- ---------- ---------- ---------- ---------- -------- ---------- ----------
jobID              TEST    commons    commons         16  COMPLETED      0:0
jobID.batch       batch               commons          1  COMPLETED      0:0      5944K    190800K
jobID.0          memhog               commons         16  COMPLETED      0:0          0          0
jobID.1           sleep               commons         16  COMPLETED      0:0       524K     58932K

In this example, MaxRSS is the maximum resident set size, that is, the largest amount of physical memory used by the tasks in each job step, and MaxVMSize is the corresponding maximum virtual memory size. These numbers can be inaccurate because they are polled from the operating system at regular intervals rather than measured continuously; peaks that occur between polls will be missed. To get the best possible accuracy, request exclusive access to the node when your test job runs so that the job has the maximum amount of memory available to it.
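If you only want the memory fields, you can trim the format list; the --units option (available in recent versions of sacct) converts the values to a common unit such as megabytes:

# sacct -j jobID --format=JobID,JobName,MaxRSS,MaxVMSize --units=M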

Use sacct on a Running Job

You can also run sacct while the job is still running to check on the state of the job and its steps:

# sacct -j jobID
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
jobID              TEST    commons    commons         16    RUNNING      0:0
jobID.batch       batch               commons          1    RUNNING      0:0
jobID.0          memhog               commons         16  COMPLETED      0:0
jobID.1           sleep               commons         16    RUNNING      0:0
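
For live memory numbers from a step that is still running, the sstat command (where available on your cluster) reports usage directly from the running step; you may need to name the step explicitly, for example jobID.0 for the step launched with srun:

# sstat -j jobID.0 --format=JobID,MaxRSS,MaxVMSize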

Requesting a Specific Amount of Memory

Once you have determined how much memory your job requires, you should request a node with enough available memory for all future jobs of that type. SLURM provides an option for finding a node with a specific amount of memory available: use the --mem option in your SLURM script, similar to the following:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --mem=2048MB

This combination of options will give you four nodes, only one task per node, and will assign the job to nodes with at least 2GB of physical memory available. The --mem option specifies the amount of physical memory needed on each node allocated to the job. In this example the unit is megabytes, so 2GB is 2048MB; the job as a whole needs 8GB, or 2GB per node with one process per node. If your job runs more than one process per node, the --mem request must cover all of the processes on that node, that is, the number of tasks per node multiplied by the memory each task needs (alternatively, --mem-per-cpu requests memory per allocated CPU). For example, if each process needs 2GB and you will be running two tasks per node, request --mem=4096MB; the node must have 4GB of free memory for the job to run.
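As a sketch of the two-tasks-per-node case described above (the node and task counts are illustrative):

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=2
# 2 tasks per node x 2GB per task = 4GB per node
#SBATCH --mem=4096MB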

--mem is enforced

The --mem limit is enforced: if your job exceeds the requested amount (2GB in this example), it will be killed.

Accuracy is important

It is important to be accurate with your request. If you request an amount that is higher than you need, the scheduler might have a hard time finding nodes with enough memory available to run your job. Thus, you might unnecessarily delay execution of your job.

Use less memory than is available on the node

If your job uses memory close to the maximum physical memory installed on the node, performance might suffer and there is a chance that the job will crash. Keep in mind that the operating system resides in physical memory alongside your job. A good rule of thumb is to allow 1GB of memory for the operating system.
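For example, on a node with 32GB of installed memory (an illustrative size), a request that leaves roughly 1GB of headroom might look like:

# 32GB node minus roughly 1GB for the operating system
#SBATCH --mem=31744MB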

Exclusive access jobs do not need --mem

If you are requesting exclusive access to a node to run your job, you do not need the --mem and --ntasks-per-node options as your job will have access to all available memory.
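A minimal sketch of an exclusive request (the job name and task count are placeholders):

#SBATCH --job-name=YourJobNameHere
#SBATCH --ntasks=16
# Exclusive access gives the job all of the node's memory, so no --mem is needed
#SBATCH --exclusive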

Do not request more memory than is available on a node

If you pick a --mem value that is greater than the amount of memory installed on any single node, your job will never run because it needs more memory than any node can provide.


