Usage of the Linux Clusters at DESY Zeuthen

There are no dedicated parallel clusters available at the moment, but you can run parallel MPI jobs in the SGE farm. The documentation in Batch_System_Usage applies there.

Building Applications

Since SL5, all batch worker nodes have the openmpi implementation of the MPI standard installed. Recently the machines were upgraded to the default SL5.3 packages of openmpi. For 64 bit applications, use the installation in /usr/lib64/openmpi/1.2.7-gcc/bin; for 32 bit, use the binaries from /usr/lib/openmpi/1.2.7-gcc/bin.
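One way to make sure the right wrappers are picked up is to put the desired bin directory first on the PATH. A minimal sketch for the 64 bit case (the directories are the ones listed above; everything else is only an example):

# put the 64 bit Open MPI wrappers first on the PATH
export PATH=/usr/lib64/openmpi/1.2.7-gcc/bin:$PATH

# confirm which mpicc and mpirun will actually be used
which mpicc mpirun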

Building applications

64 bit MPI applications can be compiled on any 64 bit SL5 machine, e.g. sl5-64.ifh.de.
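For illustration, a 64 bit build could look like the following sketch (yourapp.c is a placeholder source file and the -O2 flag is only a suggestion):

# log in to a 64 bit SL5 machine
ssh sl5-64.ifh.de

# compile and link with the 64 bit Open MPI compiler wrapper
/usr/lib64/openmpi/1.2.7-gcc/bin/mpicc -O2 -o yourapp yourapp.c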

Batch System Access

A job script for a parallel job needs to specify the parallel environment and the number of required CPUs. The parameter looks like this for up to 8 slots on a single node:

#$ -pe multicore-mpi 8

For more MPI processes, use -pe mpi.
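For example, a request for 16 slots in the mpi parallel environment could read (the slot count is only an illustration):

#$ -pe mpi 16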

Be sure to call the right mpirun version for your architecture. If your application was compiled for 64 bit, use

/usr/lib64/openmpi/1.2.7-gcc/bin/mpirun -np $NSLOTS yourapp
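Putting the pieces together, a complete job script might look like this sketch (the file name, slot count and binary name are placeholders; only the -pe line and the mpirun call are taken from above):

#!/bin/bash
# mpi_job.sh -- hypothetical parallel job script
#$ -cwd
#$ -pe multicore-mpi 8

# SGE sets $NSLOTS to the number of granted slots
/usr/lib64/openmpi/1.2.7-gcc/bin/mpirun -np $NSLOTS yourapp

Such a script would then be submitted with qsub mpi_job.sh.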

AFS Access

The application binary must be available on all nodes, which is why it should be placed in an AFS directory.
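As an illustration, the mpirun call in the job script could then point at the binary through its AFS path (the path below is purely hypothetical):

# start the MPI processes from a binary that every node can reach via AFS
/usr/lib64/openmpi/1.2.7-gcc/bin/mpirun -np $NSLOTS /afs/ifh.de/user/y/yourlogin/mpi/yourapp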

BLAS library

Both ATLAS and GotoBLAS are available; example link lines are sketched after the list below.

  • ATLAS is in /opt/products/atlas
  • libgoto is in /usr/lib or /usr/lib64 respectively.
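A sketch of how a program might be linked against the two libraries; the exact library names and the lib subdirectory are assumptions, so check the local installation:

# link against ATLAS, assuming the usual library names under /opt/products/atlas
gcc -o yourapp yourapp.o -L/opt/products/atlas/lib -lcblas -latlas

# link against GotoBLAS; libgoto sits in the default linker path (/usr/lib or /usr/lib64)
gcc -o yourapp yourapp.o -lgoto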

Further documentation

HPC-Clusters at DESY Zeuthen, technical seminar, 11/22/06: http://www-zeuthen.desy.de/technisches_seminar/texte/Technisches_Seminar_Waschk.pdf
