Slurm Installation for Pax Cluster

Slurm is currently being tested as the scheduler for the pax8 and pax9 blade chassis, which contain the machines pax80 to pax9f. At the moment, interactive logins to these machines are possible; this will probably change in a production setup. Currently, you can run all Slurm commands on the pax8 machines.

Kerberos Integration

Note: You need to acquire an addressless Kerberos ticket for Slurm to work. This is not the default on supported DESY machines; you will have to call kinit -A. On self-maintained machines such as notebooks, simply set noaddresses = true in the file /etc/krb5.conf. To check whether your ticket is addressless, call klist -v.
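A minimal command sequence for the steps above; the krb5.conf snippet shows only the relevant setting:

    # acquire an addressless ticket (not the default on supported DESY machines)
    kinit -A
    # check that the ticket is addressless
    klist -v

    # /etc/krb5.conf on self-maintained machines, relevant setting only
    [libdefaults]
        noaddresses = true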

Slurm was configured to always schedule complete nodes to each job.

Slurm Commands

The most important commands are listed below; usage examples follow the list.

sinfo: information about the cluster
squeue: show the current job list
srun: parallel command execution
sbatch: submit a batch job
scancel: abort a job
sacct: show accounting information
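Typical invocations of these commands; the script name and job ID are placeholders:

    sinfo                  # cluster and partition state
    squeue                 # current job list
    squeue -u $USER        # restrict the list to your own jobs
    sbatch slurm-mpi.job   # submit a batch script
    scancel 1234           # abort job 1234 (placeholder job ID)
    sacct -j 1234          # accounting information for that job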

Parallel Execution

Slurm has integrated execution support for parallel programs; there is no need to use mpirun or mpiexec. To start a program in a job script or an interactive session, use a command like srun -n 4 -N 2 hostname. This will execute the command hostname 4 times on 2 different machines.
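The same call, spelled out:

    # 4 tasks in total (-n 4), spread over 2 nodes (-N 2); each task runs hostname
    srun -n 4 -N 2 hostname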

MPI Support

srun can execute MPI programs, but LD_LIBRARY_PATH must be set first; this is done by loading the right environment module, e.g. module add openmpi-x86_64. For Open MPI, srun needs the command line option --resv-ports.
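A minimal sequence for running an Open MPI program with srun, following the description above; the binary name ./mpi_program and the task count are placeholders:

    # load the Open MPI module so LD_LIBRARY_PATH is set
    module add openmpi-x86_64
    # --resv-ports is required for Open MPI with srun
    srun --resv-ports -n 8 ./mpi_program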

Job scripts

Parameters to Slurm can be set on the sbatch command line or on lines starting with #SBATCH in the script. The most important parameters are listed below; a sample script follows the list.

-J: job name
--get-user-env: copy environment variables
-n: number of cores
-N: number of nodes
-t: run time of the job, default is 30 minutes
-A: account, defaults to the same as the UNIX group
-p: partition of the cluster
--switches: maximum number of switches connecting the allocated nodes
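A minimal sketch of a batch script using these parameters; the job name, partition name and command are placeholders, not site defaults:

    #!/bin/bash
    #SBATCH -J testjob          # job name (placeholder)
    #SBATCH --get-user-env      # copy environment variables
    #SBATCH -n 16               # number of cores
    #SBATCH -N 2                # number of nodes
    #SBATCH -t 30               # run time: 30 minutes (see the time format below)
    #SBATCH -p pax              # partition (placeholder name)

    srun hostname               # parallel execution inside the allocation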

Time format

The run time of a job is given as minutes, as hours and minutes (HH:MM), or as days and hours (DD-HH). The maximum run time is set to 48 hours.
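A few example values in the formats described above:

    #SBATCH -t 30     # 30 minutes
    #SBATCH -t 1-12   # 1 day and 12 hours (DD-HH)
    #SBATCH -t 2-0    # 2 days, i.e. the 48 hour maximum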

Topology aware scheduling

The pax machines are connected by a fat tree of InfiniBand switches: one switch in each blade center connects the blade servers it contains, and an additional hierarchy level connects these switches. The Slurm scheduler is aware of this topology, and you can request the maximum number of switches used by a job with the --switches option. For example, --switches=1 ensures that all nodes are allocated in the same blade center.
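The option can be given in the job script as well as on the command line; the script name is a placeholder:

    # in the job script
    #SBATCH --switches=1
    # or on the command line
    sbatch --switches=1 slurm-mpi.job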

Examples

An example job script is available as the attachment slurm-mpi.job.
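The attached script is not reproduced here; the following is only a sketch of what an MPI job script for this setup could look like, combining the options described above (job name, sizes and binary are placeholders):

    #!/bin/bash
    #SBATCH -J mpi-test             # job name (placeholder)
    #SBATCH -n 32                   # number of cores (placeholder)
    #SBATCH -N 2                    # number of nodes (placeholder)
    #SBATCH -t 2-0                  # run time: 2 days (the maximum)
    #SBATCH --switches=1            # keep all nodes in one blade center

    module add openmpi-x86_64       # sets LD_LIBRARY_PATH for Open MPI
    srun --resv-ports ./mpi_program # placeholder binary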
