
Running Parallel Code on the CSP Clusters

Overview

We have two parallel environments available on the CSP cluster. The first, called 'smp', is for multi-threaded/multi-cpu code that runs on a single machine. As you can see under the 'Running Code' tab on the resources page, cluster nodes have between 8 and 24 cpus, so if your job uses 24 or fewer threads/cores this environment will work best, since all inter-process communication is done locally (low latency) instead of over ethernet (high latency). Jobs using the smp environment can and should be run in your scratch directory, just as with serial jobs.

The second parallel environment, called 'mpi', is for running multi-core jobs that span across two or more machines using OpenMPI 1.4.5. These jobs must be run from within your home directory since the scratch directories on each node are not accessible to the other nodes. You also need to have password-less public key authentication set up for your account so that nodes can communicate with each other as your user without requiring a password.
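
If you have not set up password-less authentication yet, the following is a minimal sketch of one common way to do it (this assumes your home directory is shared by all of the nodes, which is the case here since mpi jobs run from your home directory, and uses an RSA key; adjust the key type if your own site's instructions differ):

### generate a key pair; press Enter at the prompts to leave the passphrase empty ###
ssh-keygen -t rsa

### authorize your new public key for logins to your own account ###
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys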

Usage

You will need to create a submit script as explained in the section on running serial jobs, but with an additional flag that tells the queue system which parallel environment to use and how many cpus your job requires. Also, if you are using the mpi parallel environment, you do not need the commands in your script that copy your code and results to and from your scratch space.

The flag to specify using a parallel environment is -pe and uses the following syntax:

-pe <pe name> <number or range of cpus>

You can specify either a specific number or a range of cpus. If you specify a range, the queue system will try to give you the maximum number requested, and will only give you fewer if that many cpus are not available. For example, to use the smp parallel environment with a range of 4-8 cpus, use the following:

-pe smp 4-8
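
To request a specific number of cpus instead of a range, for example 8 cpus in the smp environment, use:

-pe smp 8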

The actual number of cpus that the queue system allocated for the job is stored in the $NSLOTS variable, which you then use in your mpirun command:

mpirun -np $NSLOTS executable

Also note that you do not specify a host file in your mpirun command. The queue system will create this host file dynamically based on your requested resources and the current cluster usage. 
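
If you want to see which nodes the queue system chose for a job, Grid Engine-style schedulers normally point the $PE_HOSTFILE environment variable at the generated host file; assuming our queue system follows that convention, you can print it from inside your submit script:

### show the dynamically generated host file (assumes the scheduler sets $PE_HOSTFILE) ###
cat $PE_HOSTFILE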

Example Scripts

Sample submit script using the mpi parallel environment:


#!/bin/bash -l
### Tell queue system to use the directory of the submit script ###
### as the current working directory ###
#$ -cwd

### You can use the -N flag to give your job a name. Otherwise, it will be named after the submit script ###
#$ -N mpitest

### Tell queue system to submit this job to a specific queue ###
### (currently opteron or xeon) ###
#$ -q xeon

### Tell the queue system to use the mpi parallel environment with the number or range of cpus ###
#$ -pe mpi 48

### compile code ###
mpicc -o mpitest mpitest.c

### execute code ###
mpirun -np $NSLOTS ./mpitest
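
Assuming the script above is saved as mpitest.sh (a hypothetical file name), you would submit it from your home directory with qsub, just as described in the section on running serial jobs:

qsub mpitest.sh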

Sample submit script using the smp parallel environment:


#!/bin/bash -l
### Tell queue system to use the directory of the submit script ###
### as the current working directory ###
#$ -cwd

### You can use the -N flag to give your job a name. Otherwise, it will be named after the submit script ###
#$ -N mpitest

### Tell queue system to submit this job to a specific queue ###
### (currently opteron or xeon) ###
#$ -q xeon

### Tell the queue system to use the smp parallel environment with the number or range of cpus ###
#$ -pe smp 8-12

### make results directory in your home directory ###
mkdir $JOB_ID
resultdir=`pwd`/$JOB_ID

### make directory on local machine in scratch ###
mkdir /scratch/jeff/$JOB_ID

### Tell queue system to write its output and error files to the scratch directory ###
### PLEASE NOTE: since these output files are created before this script is run, ###
### they must be in /scratch/username because /scratch/username/$JOB_ID doesn't exist yet ###
#$ -o /scratch/jeff/simple.sh.o$JOB_ID
#$ -e /scratch/jeff/simple.sh.e$JOB_ID

### copy source file to job directory ###
cp mpitest.c /scratch/jeff/$JOB_ID

### change current working directory to job directory ###
cd /scratch/jeff/$JOB_ID

### compile code ###
mpicc -o mpitest mpitest.c

### execute code ###
mpirun -np $NSLOTS ./mpitest

### copy files back to home directory (located on hal) ###
cp * $resultdir

### clean up scratch directory since space is limited on local machines ###
rm -rf /scratch/jeff/$JOB_ID
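
The smp example above still runs an MPI program, just confined to a single node. If your multi-threaded code uses OpenMP rather than MPI (an assumption about your code, not a requirement of the cluster), the usual pattern is to match the thread count to the number of slots the queue system granted before launching the program:

### omptest is a hypothetical OpenMP executable; match its threads to the allocated cpus ###
export OMP_NUM_THREADS=$NSLOTS
./omptest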

 
