Basics

The Grid HTC cluster is designed as a multipurpose computing resource for scientific research; however, it is optimised for high-throughput computing (hence HTC). In practice this means we have a large data storage capacity (>5PB), a large number of job slots (>4000), and a large network bandwidth connecting the two. If your work requires large data storage or many single-processor slots then we provide a good solution. If your work requires inter-job communication (e.g. MPI) or has high memory requirements (>4GB per HT core) then our system is not well suited to you; there may be other solutions available (such as the HPC cluster).

Our batch system is SLURM (Simple Linux Utility for Resource Management). Details about SLURM can be found here. A summary of useful SLURM commands can be found here. SLURM commands differ from the Gridengine commands; the differences are summarised here.

SLURM Setup

SLURM Queues

The default queues (partitions in SLURM language) to submit jobs to are:

Partition Name   Default Memory Per CPU (MB)   Maximum Memory Per CPU (MB)   Max Wall Clock Time
debug            1024                          2048                          1 hour
prod             1024                          4096                          4 hours
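The partitions and their limits can be inspected directly on a login node. `sinfo` and `scontrol` are standard SLURM commands; the exact output depends on our configuration:

```shell
sinfo -s                       # one-line summary of each partition
scontrol show partition prod   # full limits for the prod partition
```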

Submitting a job

To submit a job to the "prod" (production) partition (aka queue), using one task per node (one job slot):

sbatch -p prod test_slurm

[$] cat test_slurm
#!/bin/sh

hostname
uptime
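In scripts it is often useful to capture the job ID at submission time. `--parsable` is a standard `sbatch` flag that prints just the job ID; a minimal sketch:

```shell
jobid=$(sbatch --parsable -p prod test_slurm)
echo "submitted job $jobid"
squeue -j "$jobid"      # check its state in the queue
```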

Several additional options are available when submitting a job. Clearly stating your job limits will make the job more likely to succeed and may improve turnaround time.

#Time limit for the job
--time=01:00:00

# memory in MB; default limit is 1024MB per core
--mem-per-cpu=1024

# Number of cores per job per node.
--ntasks-per-node=1

# Number of compute nodes for the job.
--nodes=1

# Name of the job. Default is the name of the batch script.
--job-name="hello_test" 

# Name of the file for stdout. Default is slurm-<JobID>.out
--output=test.out
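Putting the options above together, a single submission might look like this (the values are only examples):

```shell
sbatch -p prod --time=01:00:00 --mem-per-cpu=1024 \
       --nodes=1 --ntasks-per-node=1 \
       --job-name="hello_test" --output=test.out test_slurm
```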

Alternatively, the job options can be put in the job script itself, e.g.:

#SBATCH --partition=prod --qos=general-compute
#SBATCH --time=00:15:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name="hello_test"
#SBATCH --output=test-srun.out

hostname
uptime
# some slurm variables
echo "SLURM_JOBID="$SLURM_JOBID
echo "SLURM_JOB_NODELIST"=$SLURM_JOB_NODELIST
echo "SLURM_NNODES"=$SLURM_NNODES
echo "SLURMTMPDIR="$SLURMTMPDIR

echo "working directory = "$SLURM_SUBMIT_DIR
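With the directives embedded as above, the script can be submitted without repeating the options on the command line (assuming it is saved under a name such as test_slurm.sh):

```shell
sbatch test_slurm.sh   # partition, time, memory etc. are read from the #SBATCH lines
cat test-srun.out      # inspect stdout once the job has finished
```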

Note: to run an MPI job see https://www.open-mpi.org/faq/?category=slurm

Get information from SLURM

scancel - cancel a queued or running job.

scancel <job_id_list>

scontrol - used to view Slurm configuration and state.

scontrol show job <job_id_list>

squeue - view information about jobs located in the Slurm scheduling queue.

squeue -j <job_id_list>

squeue -u <user_list>

sacct - displays accounting data for all jobs and job steps in the Slurm job accounting log or Slurm database

sacct -j <job_id_list>
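`sacct` can also report selected fields via its standard `--format` option; the field names below are standard sacct fields:

```shell
sacct -j <job_id_list> --format=JobID,JobName,Partition,Elapsed,MaxRSS,State
```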

More complex examples

Submit a job that needs a GPU (gpu:1 = 1 GPU, up to a maximum of 4):

sbatch -p centos7_gpu --gres=gpu:1 test_gpu.sh

[$] cat test_gpu.sh
#!/bin/sh

echo $CUDA_VISIBLE_DEVICES
nvidia-smi
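The GPU request can also be embedded in the script itself. A sketch, assuming the same centos7_gpu partition as above:

```shell
#!/bin/sh
#SBATCH -p centos7_gpu
#SBATCH --gres=gpu:2           # request two of the (at most four) GPUs
echo $CUDA_VISIBLE_DEVICES     # SLURM restricts the job to the allocated GPUs
nvidia-smi
```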

Example Script

This is a more complex example showing how various options can be used when submitting a job.

#SBATCH --job-name="hello_test"

# Create Working Directory
WDIR=/data/scratch/tmp/$USER/$SLURM_JOBID
mkdir -p $WDIR
if [ ! -d "$WDIR" ]
then
  echo "$WDIR not created"
  exit 1
fi
cd $WDIR

# Copy Data and Config Files.
cp $HOME/Data/FrogProject/FrogFile .

# Put your Science related commands here
/share/apps/runsforever FrogFile

# Copy Results Back to Home Directory
RDIR=$HOME/FrogProject/Results/$SLURM_JOBID
mkdir -p $RDIR
cp NobelPrizeWinningResults $RDIR

# Cleanup
rm -rf $WDIR

A more complicated example. This will create multiple directories named LJ.$t, where $t is a number, each containing a control file with a different temperature $t. The commented-out section then submits a job from each directory in turn, which should use that directory's control file.

 MINTEMP=0
 MAXTEMP=3000
 TEMPSTEP=30
 SAMPLE="LJ"
 for ((t=$MINTEMP;t<=$MAXTEMP;t=t+$TEMPSTEP)); do
         echo "Creating Directory for temperature: " $t
         DIR=${SAMPLE}.${t}
         mkdir $DIR
         CONTROLFILE=${DIR}/CONTROL
         # If more complicated, could copy a default control file or something
         echo "Control stuff" >$CONTROLFILE
         echo "more control stuff" >>$CONTROLFILE
         echo "temperature " $t >>$CONTROLFILE
 done
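As a quick sanity check on the loop bounds above, the number of directories created is (MAXTEMP - MINTEMP) / TEMPSTEP + 1, which bash can compute directly:

```shell
MINTEMP=0
MAXTEMP=3000
TEMPSTEP=30
# inclusive range 0, 30, ..., 3000 gives this many LJ.* directories
echo $(( (MAXTEMP - MINTEMP) / TEMPSTEP + 1 ))   # prints 101
```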
 
 
 
 #for ((t=$MINTEMP;t<=$MAXTEMP;t=t+$TEMPSTEP)); do
 #       cd ${SAMPLE}.${t}
 #       sbatch -p prod myjob.sh
 #       cd ..
 #done

Cluster basics (last edited 2015-09-22 08:24:51 by apw043)

Slurm basics (last edited 2019-10-07 09:43:35 by apw043)