Saturday, March 10, 2012

Using MPICH on the Janus Cluster

Overview
  • Each node has 12 cores (2 CPUs, Intel Xeon, 2.8 GHz)
  • 2 GB of RAM per core, 24 GB of RAM per node
  • QDR InfiniBand interconnect
File System

/home/<username> (2 GB quota)
/projects/<username> (256 GB quota, not for job output)
/scratch/stmp00/<username> (not backed up, intended for job I/O)

Steps to run an MPI Program
1) Load the MPI dotkit
use .openmpi-1.4.3_ib
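
To see which dotkits are available (standard dotkit behavior, not specific to Janus):
use -l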

2) Compile
mpicc -std=c99 -o <outfile> <inputfile>

The -std=c99 flag is needed if you want to use C99 (you probably do). With C99, you can declare the loop variable inside the loop:
for (int i = 0; i < n; i++) { ... }
Without C99, you have to declare it separately:
int i;
for (i = 0; i < n; i++) { ... }
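
As a sanity check, here is a minimal MPI program the command above would compile. The filename hello.c and the program itself are just an illustration, not from the Janus docs:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                /* start up MPI */

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down MPI */
    return 0;
}

Compile it with:
mpicc -std=c99 -o hello hello.c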

3a) Load the scheduler dotkits
use Torque
use Moab

3b) Write a batch script to run the MPI program. The script must follow the PBS (Portable Batch System) format; a sketch follows below.
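
Here's a sketch of what run.sh might look like. The job name, node/core counts, walltime, and program name are placeholders, so adjust them for your job:

#!/bin/bash
# Job name, resources, and walltime below are example values.
#PBS -N hello
#PBS -l nodes=2:ppn=12
#PBS -l walltime=00:10:00
#PBS -j oe

# Run from the directory the job was submitted from.
cd $PBS_O_WORKDIR

# -np should match nodes * ppn requested above.
mpirun -np 24 ./hello

Per the file system notes above, point any heavy job I/O at /scratch/stmp00/<username>, not /projects.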

3c) Submit:
qsub -q janus-debug run.sh
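
When the job finishes, Torque writes stdout to <jobname>.o<jobid> in the directory you submitted from (with -j oe as in the sketch above, stderr lands in the same file).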

-q selects the run queue; the available queues are documented at:
rc.colorado.edu/crcdocs/queues

4) Track status
qstat -f -u $USER    (Torque)
showq -u $USER       (Moab)
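
To refresh the status automatically, the generic Unix watch command works fine (nothing Janus-specific here):
watch -n 30 qstat -u $USER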


5) Kill a job
qdel [jobid]         (Torque)
mjobctl -c [jobid]   (Moab)