     University of Alberta
     WestGrid Site-Specific Information
     Portable Batch System (PBS)


Introduction:

This document describes how to run jobs on the WestGrid complex of SGI Origin computers using PBS, the Portable Batch System. Please read the information for new users before proceeding. You should also become acquainted with the Usage Policies and Priority Calculation documents.

Please feel free to contact research.support@ualberta.ca if you have any questions or comments.


Job Scripts:

To prepare a job for submission, first create a shell script that can be used to run your program. Include all necessary environment variable settings.

For example, to run a program that has been parallelized with OpenMP directives, your script might look like this:

#! /bin/sh

export OMP_NUM_THREADS=64

cd /scratch/esumbar
./a.out

Next, introduce PBS directives to specify your job's resource needs and describe its attributes. PBS directives are shell-script comments that have the form "#PBS flag value". Be sure to group all of the PBS directives together at the beginning of the script, and do not intermingle them with executable statements.

For example:

#! /bin/sh
#PBS -S /bin/sh
#PBS -q arcturus
#PBS -l host=arcturus
#PBS -l ncpus=64
#PBS -l walltime=12:00:00
#PBS -m bea
#PBS -M esumbar@ualberta.ca
#PBS -N myjob

export OMP_NUM_THREADS=64

cd /scratch/esumbar
./a.out >output 2>&1

If you were to submit the above script for execution, PBS would interpret the directives as follows:

LINE 2:
-S /bin/sh

This is the shell that PBS will use to execute your script file. If omitted, your login shell on the execution host is used. This has an impact on which set of startup files is processed and, consequently, which set of environment variables your script inherits. This shell and the one specified on the #! line don't have to be the same. The only rule is that the syntax of the script must be consistent with the shell specified on the #! line.

LINE 3:
-q arcturus

This is the requested queue; for this example we have chosen "arcturus". Each queue has resource limits that reflect the capabilities of the machine that hosts the queue or that are derived from administrative policies. There are currently seven supported queues, each with the same name as its host. They all have a 24-hour duration.


queue name   duration (hours)   min ncpus   max available ncpus   max available memory (GBytes)
nexus        24                 1           6                     8
arcturus     24                 64          256                   256
borealis     24                 8           64                    16
australis    24                 16          64                    32
corona       n/a                n/a         n/a                   n/a
helios       24                 1           32                    16
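
As a convenience, the cpu limits in the table above can be checked before a job is submitted. The following is only a sketch: the helper names queue_limits and check_ncpus are hypothetical (they are not PBS commands), and the values are copied by hand from the table, so they must be updated if the limits change.

```shell
#!/bin/sh
# Sketch only: queue_limits and check_ncpus are hypothetical helpers,
# not part of PBS. The numbers are copied from the queue table above.

# Print "min_ncpus max_ncpus" for a queue that lists numeric limits.
queue_limits() {
    case "$1" in
        nexus)     echo "1 6" ;;
        arcturus)  echo "64 256" ;;
        borealis)  echo "8 64" ;;
        australis) echo "16 64" ;;
        helios)    echo "1 32" ;;
        *)         return 1 ;;   # corona (n/a) and unknown queues
    esac
}

# check_ncpus QUEUE NCPUS
# Prints "ok" if NCPUS lies within the listed limits for QUEUE.
# Returns 1 if it is out of range, 2 if no limits are listed.
check_ncpus() {
    queue=$1
    ncpus=$2
    limits=$(queue_limits "$queue") || {
        echo "no limits listed for queue '$queue'" >&2
        return 2
    }
    min=${limits% *}
    max=${limits#* }
    if [ "$ncpus" -lt "$min" ] || [ "$ncpus" -gt "$max" ]; then
        echo "ncpus=$ncpus is outside $min..$max for $queue" >&2
        return 1
    fi
    echo "ok"
}
```

For instance, "check_ncpus arcturus 64" corresponds to the request made in the example script.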

LINE 5:
-l ncpus=64

Number of cpus required by your job. In this example, 64 processors of arcturus are being requested. Just remember to request one cpu for each parallel thread of execution.

LINE 6:
-l walltime=12:00:00

This specifies the minimum amount of elapsed time required by your job. Typically, this is set to the time needed to reach the first checkpoint.

LINE 7:
-m bea

Instructs PBS to email you when your job begins (b), ends (e), or is aborted (a). The email address itself is specified on the next line, in the "-M" directive.

LINE 8:
-M esumbar@ualberta.ca

Specifies the email address that will receive PBS notifications. In this example PBS would try to send mail to one of our analysts; please be sure to substitute your own email address.

LINE 9:
-N myjob

In this example we have chosen the unimaginative name "myjob". Choose a short name (fewer than 16 characters, no spaces) that you can use to identify your job. If omitted, the name of the script (truncated to the first 15 characters) is used.

For the sake of simplicity, we recommend that you save the script in the same directory as your program files, preferably in a personal directory under /scratch. Give the script an appropriate name; in our example we called it "myjob.sh".

When the job runs, the script is executed on your behalf using the specified shell on the specified host. At this point, it's just an ordinary shell script and all comments, including PBS directives, are ignored. The program (a.out in the example) runs as a child process of the script. When the program terminates, execution returns to the script, which itself terminates, finally terminating the job.
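
The naming rules given for the -N directive (fewer than 16 characters, no spaces, and a default taken from the first 15 characters of the script name) can be illustrated with a short sketch. The function names default_job_name and valid_job_name are hypothetical and exist only for this illustration; PBS applies its own rules when the job is submitted.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and merely mirror the
# naming rules described for the -N directive.

# default_job_name SCRIPT
# Prints the script's file name truncated to its first 15 characters.
default_job_name() {
    base=$(basename "$1")
    printf '%.15s\n' "$base"
}

# valid_job_name NAME
# Succeeds only if NAME is non-empty, contains no spaces, and is
# shorter than 16 characters.
valid_job_name() {
    case "$1" in
        ''|*" "*) return 1 ;;
    esac
    [ "${#1}" -lt 16 ]
}
```

With these definitions, "default_job_name /scratch/esumbar/myjob.sh" prints "myjob.sh", since that name is already shorter than 16 characters.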

Submitting and Monitoring Jobs:

To submit a job, simply "cd" to the job directory and submit the script to PBS using the "qsub" command. You can subsequently monitor the status of the job with the "qstat" command.

For example:

First, here is a very simple script (displayed in its entirety with the "more" command) that submits a job to the aurora queue, requesting 4 processors and a wall time of 12 hours, and asking PBS to send an email when the job is done (please substitute your own email address):

Terminal Session

Next, here is a screenshot of submitting the above script to PBS using the "qsub" command, and then using the "qstat" command to monitor the status of the job:

Terminal Session

(lines in output edited to save space).

Terminal Session

A "Q" in the S (status) column means that the job is waiting in the specified queue, while an "R" means that the job is currently running. The TSK column displays either the requested number of cpus or the memory-equivalent number of cpus, whichever is greater. The job that was submitted in this example was given the identification number 87642. It shows up waiting in the aurora queue as expected. See the man pages regarding qsub and qstat for further details.

Normally, while a job is executing, PBS redirects standard output and standard error output from the script to two private files. When the job ends, this output is returned to you as, in this example, mytestjob.o87642 (standard output) and mytestjob.e87642 (standard error). The standard output from your program inherits this redirection through the process hierarchy. Consequently, if your program calls printf() (C) or executes the print* or write(*,fmt) statements (Fortran) as a way of monitoring execution progress, you will not be able to see this output until the termination of the job. To overcome this problem, you should explicitly redirect standard output from your program to your own file(s) as illustrated in the example script. Be sure to follow each relevant output statement in the code with a call to the fflush() function (C) or the flush subroutine (SGI Fortran) to force an immediate update of the file(s).
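
The names of the two files that PBS returns follow directly from the job name and the job identification number. As a rough sketch, the hypothetical helper below (pbs_output_files is not a PBS command) derives them. It assumes that the identifier may carry a trailing ".server" part, which is dropped; that suffix is an assumption and does not appear in the example above.

```shell
#!/bin/sh
# Sketch only: pbs_output_files is a hypothetical helper.
# pbs_output_files NAME JOBID
# Prints the standard-output and standard-error file names, one per
# line, following the NAME.oNUMBER and NAME.eNUMBER pattern.
# Assumption: anything after the first "." in JOBID is not part of
# the number and is removed.
pbs_output_files() {
    name=$1
    number=${2%%.*}
    echo "$name.o$number"
    echo "$name.e$number"
}
```

For the job in the example, "pbs_output_files mytestjob 87642" prints mytestjob.o87642 and mytestjob.e87642.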


Stopping Jobs:

The job owner can remove a waiting job from the queue or terminate a running job using the qdel command and supplying the PBS job identification number. When used to terminate a running job, the qdel command delivers a TERM signal to the process group. To send a KILL signal, or any other supported signal, to a running job, use the qsig command. Signalling a waiting job has no effect. See the man pages for additional documentation.
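
Because qdel delivers a TERM signal, a job script can catch that signal and finish in an orderly way. The following is a minimal sketch, not a required pattern: the function name run_job is hypothetical, and the exit status 143 simply follows the common shell convention of 128 plus the signal number. The script forwards TERM to the program in case the program has not already received it.

```shell
#!/bin/sh
# Sketch only: run_job is a hypothetical helper for a job script.
# run_job CMD [ARGS...]
# Runs CMD as a background child of the script. If the script receives
# TERM, the signal is forwarded to the child, the script waits for the
# child to exit, prints a message, and exits with status 143.
run_job() {
    "$@" &
    child=$!
    trap 'kill -TERM "$child" 2>/dev/null; wait "$child"; echo "job terminated by TERM"; exit 143' TERM
    wait "$child"
}

# Possible use inside a job script (commented out in this sketch):
# run_job ./a.out >output 2>&1
```

Running the program in the background and calling wait allows the shell to act on the signal right away instead of only after the program has ended.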


Checkpointing and Resubmitting Jobs:

Checkpointing is the process of saving the value of essential data into one or more files so that the program can be restarted. When a new job is submitted and the program starts running again, the saved data is read in from the checkpoint file(s) and execution resumes at the point where the program was terminated.

For example, here is a basic checkpointing algorithm implemented in Fortran for a typical simulation program:

      program demo
      implicit none
      integer NX, NY, NZ, MAXSTEP, CPT
      parameter (NX=128, NY=128, NZ=128)
      parameter (MAXSTEP=500000, CPT=1000)
      real*8 data(NX,NY,NZ)
      integer step, laststep
      integer get_laststep

      laststep = get_laststep()
      if (laststep .eq. 0) then
         call initialize_data(data,NX,NY,NZ)
      else
         call read_checkpoint(laststep,CPT,data,NX,NY,NZ)
      end if

      do step = laststep + 1, MAXSTEP
         call do_work(data,NX,NY,NZ)
         if (mod(step,CPT) .eq. 0) then
            call write_checkpoint(step,CPT,data,NX,NY,NZ)
         end if
      end do

      call final_output(data,NX,NY,NZ)
      call clean_up()
      stop
      end

In this program, a checkpoint is performed whenever step is evenly divisible by CPT. The checkpoint-related routines get_laststep, read_checkpoint, and write_checkpoint are implemented so that they reference alternating checkpoint files as a measure of fault tolerance.
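
One way to alternate between two checkpoint files is to choose the file from the parity of the checkpoint count, so that a job terminated in the middle of writing one file still leaves the previous checkpoint intact in the other. The sketch below shows only that selection step. The function name checkpoint_file and the file names checkpoint.0 and checkpoint.1 are assumptions made for illustration; the scheme actually used in demo.f and demo.c may differ.

```shell
#!/bin/sh
# Sketch only: checkpoint_file and the file names are hypothetical.
# checkpoint_file STEP CPT
# Prints the name of the checkpoint file to use at STEP, switching
# between two files on successive checkpoints (every CPT steps).
checkpoint_file() {
    step=$1
    cpt=$2
    echo "checkpoint.$(( (step / cpt) % 2 ))"
}
```

With CPT=1000 as in the program above, steps 1000 and 3000 would map to checkpoint.1, and step 2000 to checkpoint.0.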

The full text of the demonstration program is in demo.f . An equivalent C version is in demo.c .


Contacting Us:

If you'd like more information about using the machines, the production schedule, or machine availability, or would like help with porting your code, please contact us at research.support@ualberta.ca.


Revised: August 02, 2006.

  

  
