Introduction to Job Submission: 02: Memory and GPUs

Introduction

The Slurm page introduces the basics of creating a batch script that is submitted on the command line with the sbatch command to request a job on the cluster. This page is an extension that goes into a little more detail, focusing on the use of the following Slurm options:

  1. mem-per-cpu
  2. gpus

A complete list of options can be found on the Slurm: sbatch manual page or by typing man sbatch from the command line when logged onto teton.
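For example, once a batch script has been written, it is submitted from the command line as shown below; the script name is a placeholder, and the job ID returned will differ:

  $ sbatch my_job_script.sh
  Submitted batch job 4123456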

Aims: The aims of this page are to extend the user's knowledge:

  • of job allocation, by introducing memory considerations alongside the core options.
  • of the options available and how they function when creating batch scripts for submitting jobs.

Note:

  • This is an introduction, and not a complete overview of all the possible options available when using Slurm.
  • There are always alternatives to the options listed on this page, and to how they can be used.
  • By using the options and terminology on this page, you make it easier for ARCC to support you.
  • There are no hard and fast rules for which configuration you should use. You need to:
    • Understand how (if at all) your application works across a cluster and uses parallelism. For example, is it built upon the concepts of MPI and OpenMP, and can it use multiple nodes and cores?
    • Recognize that not all applications can use parallelism; in some cases you simply run 100s of separate tasks.

Please share with ARCC your experiences of various configurations for whatever applications you use, so we can pass them on to the wider UW (and beyond) research community.

Prerequisites: You should already be familiar with the basics of batch scripts and sbatch covered on the Slurm page.

Memory Allocation

Previously we talked about nodes having a maximum number of cores that can be allocated; they also have a maximum amount of memory that can be requested and allocated. Looking at the RAM (GB) column on the Teton Overview page, you can see that the RAM available across partitions varies from 64GB up to 1024GB.
NOTE: Just because a node has 1024GB, please do not try to grab it all for your job. Remember:

  • You need to properly understand how your application uses parallelisation. Does it actually require 512GB of memory, or does it instead require 32 nodes with 32 cores on each? Notice from the overview page that the nodes with 512GB of RAM only support a maximum of 8 cores.
  • Some partitions are for specific investors, and preemption (the act of stopping one or more "low-priority" jobs to let a "high-priority" job run) might kick in and stop your job.
  • The cluster is for everyone, so please be mindful of your fellow researchers and only request what you really require. If we notice individuals abusing resources, and/or a PI cannot access their investment, we will stop the individual's jobs.


Using the mem-per-cpu option, you can request that each CPU has a given amount of memory available to it.

Remember that you need to check the overall total amount of memory you're trying to allocate on a node: calculate the total number of cores you're requesting on a node (ntasks-per-node * cpus-per-task) and then multiply that by mem-per-cpu.
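For illustration, a minimal batch script making this kind of request might look like the following sketch (the job name and the application command are placeholders):

  #!/bin/bash
  #SBATCH --job-name=mem-example        # hypothetical job name
  #SBATCH --nodes=1
  #SBATCH --ntasks-per-node=1           # the default, shown here for clarity
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=8G              # total request: 1 * 8 * 8G = 64G on the node
  #SBATCH --partition=moran

  srun ./my_application                 # placeholder for the command you actually run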

In the following examples, I am using the default ntasks-per-node value of 1:

Options:
  #SBATCH --nodes=1
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=8G
  #SBATCH --partition=moran
Total memory: 8 * 8G = 64G
Comments: Some moran nodes have 128G available. Job submitted.

Options:
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=12G
  #SBATCH --partition=moran
Total memory: 8 * 12G = 96G
Comments: Job submitted.

Options:
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=16G
  #SBATCH --partition=moran
Total memory: 8 * 16G = 128G
Comments:
  sbatch: error: Batch job submission failed: Requested node configuration is not available

What happened here? We requested 128G, and don't these nodes have 128G? They do, but your total memory allocation has to be less than what the node allows.

Options:
  #SBATCH --cpus-per-task=16
  #SBATCH --mem-per-cpu=8G
  #SBATCH --partition=teton
Total memory: 16 * 8G = 128G
Comments: Same problem as before: teton nodes have a maximum of 128G.
  sbatch: error: Batch job submission failed: Requested node configuration is not available

Options:
  #SBATCH --cpus-per-task=32
  #SBATCH --mem-per-cpu=3G
  #SBATCH --partition=teton
Total memory: 32 * 3G = 96G
Comments: Job submitted.

Options:
  #SBATCH --cpus-per-task=32
  #SBATCH --mem-per-cpu=3.5G
  #SBATCH --partition=teton
Total memory: 32 * 3.5G = 112G
Comments:
  sbatch: error: invalid memory constraint 3.5G

What happened here? Can't I request three and a half gigs? You can request that amount of memory, but the values have to be integers; you can't specify a decimal number.
What you can do is convert from G into M. But remember, 1G does not equal 1000M, it actually equals 1024M. So 3.5G equals 3.5 * 1024 = 3584M.

Options:
  #SBATCH --cpus-per-task=32
  #SBATCH --mem-per-cpu=3584M
  #SBATCH --partition=teton
Total memory: 32 * 3584M = 112G
Comments: Job submitted.

Options:
  #SBATCH --cpus-per-task=32
  #SBATCH --mem-per-cpu=4000M
  #SBATCH --partition=teton
Total memory: 32 * 4000M = 125G, less than 128G (though not by much)
Comments: Job submitted.

Options:
  #SBATCH --cpus-per-task=32
  #SBATCH --mem-per-cpu=4096M
  #SBATCH --partition=teton
Total memory: 32 * 4096M = exactly 128G
Comments:
  sbatch: error: Batch job submission failed: Requested node configuration is not available

Options:
  #SBATCH --ntasks-per-node=4
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=4000M
Total memory: 4 * 8 * 4000M = 125G, less than 128G
Comments: Job submitted.

Options:
  #SBATCH --ntasks-per-node=4
  #SBATCH --cpus-per-task=8
  #SBATCH --mem-per-cpu=4096M
Total memory: 4 * 8 * 4096M = exactly 128G
Comments:
  sbatch: error: Batch job submission failed: Requested node configuration is not available

Some Considerations

Shouldn't I always just request the best nodes? Consider the following: teton nodes have 32 cores and a maximum of 128G, so if you wanted to use a node exclusively, with all of its cores, the maximum you could request for each core is 4G. In comparison, the moran nodes (with 128G) have 16 cores, and can thus allocate a higher maximum of 8G per core. Maybe you could request two moran nodes (a total of 32 cores) with each core having 8G, rather than a single teton node with each core having only 4G. This is a slightly contrived example, but hopefully it gets you thinking that the popular 'better' nodes are not always the best option. Your job might actually be allocated resources quickly rather than sitting in the queue.
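Below is a rough sketch of that two-node alternative; the job name and command are placeholders, and it only makes sense if your application can actually run across nodes (e.g. via MPI). Note that, as the examples above show, requesting exactly a node's full 128G fails, so the sketch backs off slightly from 8G per core:

  #!/bin/bash
  #SBATCH --job-name=two-moran-nodes    # hypothetical job name
  #SBATCH --nodes=2
  #SBATCH --ntasks-per-node=16          # all 16 cores on each moran node
  #SBATCH --mem-per-cpu=7500M           # just under 8G per core: 16 * 7500M stays below the 128G limit
  #SBATCH --partition=moran

  srun ./my_mpi_application             # placeholder; requires an application that can run across nodes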

Out-of-Memory Errors: Although you can allocate 'appropriate' resources, there is nothing stopping the application itself (behind the scenes, so to speak) from trying to allocate and use more. In some cases the application will try to use more memory than is available on the node, causing an out-of-memory error. Check the job's .out/.err files for a message of the form:

slurmstepd: error: Detected 2 oom-kill event(s) in step 3280189.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: m121: task 32: Out Of Memory
srun: Terminating job step 3280189.0

For commercial applications there's nothing we can directly do, and even for open-source software, trying to track down a memory leak can be very time consuming.

Can we predict if this is going to happen? At this moment in time, no. But we can suggest that you:

  • Keep track of the resource options you submit, the inputs actually used by the application (the more complicated the simulation, the more memory it tends to use), and whether the job ran successfully. What you'll develop is the experience of knowing that 'for this particular simulation I require this set of resources' (see the sacct sketch after this list).
  • Work with your colleagues who also use the same application to get a feel for what works for them.
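One possible way to keep such records, assuming job accounting is enabled on the cluster, is Slurm's sacct command; the job ID below is just the one from the error message above:

  # Show requested vs. peak memory for a finished job (MaxRSS is reported on the job steps):
  sacct -j 3280189 --format=JobID,Partition,ReqMem,MaxRSS,Elapsed,State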

Requesting GPUs
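As a minimal sketch of the gpus option listed in the introduction (the job name, partition name and command are assumptions; check with ARCC which partitions provide GPUs):

  #!/bin/bash
  #SBATCH --job-name=gpu-example        # hypothetical job name
  #SBATCH --nodes=1
  #SBATCH --ntasks-per-node=1
  #SBATCH --cpus-per-task=1
  #SBATCH --mem-per-cpu=4G
  #SBATCH --gpus=1                      # request a single GPU for the job
  #SBATCH --partition=teton-gpu         # assumed name of a GPU partition

  srun ./my_gpu_application             # placeholder for your GPU-enabled command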

Summary

Finally: We welcome feedback. If anything isn't clear, something is missing, or you think there is a mistake, please don't hesitate to contact us.
