Usage Policies for the MACI SGI Origin Machines.
NOTE:
As of 20 January, 2004
January 23, 2004:
This page is in the process of becoming an archive/history page.
This page describes the usage rules, scheduling policies and priority calculations for our current SGI Origin computers: Aurora, Borealis and Australis. (For the scheduling policy of the University of Calgary's MACI cluster of Compaq Alphas, see www.maci-cluster.ucalgary.ca.) Please contact Research.Support@ualberta.ca with questions or comments related to the SGI Origin machines.
There are currently three SGI Origins in operation at the University of Alberta. Aurora, an Origin 2000, is the only Origin on which we permit limited interactive use; the limitations are described below. The other Origins are Borealis (an Origin 2400) and Australis (an Origin 3800). These two machines are reserved for batch processing only.
The three SGI Origins share disk space on an SGI TP9400 disk array. The space is mounted as /scratch, and users can create their own subdirectory in it. As with all scratch space, we do not guarantee that files in this space will be backed up. Until now, CNS has been using /scratch in an ongoing test of the capabilities of the TSM backup facility. As a result, most files in this space have been backed up on a regular basis. Users must be aware, however, that no assurance is given that any file located in /scratch will be backed up. All critical files should therefore be copied or moved to a more secure area to ensure an appropriate archive and/or backup.
The Portable Batch System (PBS) manages all non-interactive jobs running on Aurora, Borealis and Australis. PBS keeps track of submitted jobs and runs them in order of priority. All users are therefore expected to submit their jobs to PBS rather than running them directly, except for the limited interactive use permitted on Aurora as described below.
Interactive jobs:
Short interactive jobs are intended to allow users to compile and test their programs. Keeping interactive jobs short reserves most of the machine for large batch jobs. There is no interactive use on Borealis or Australis. Note that interactive usage is monitored, and users who repeatedly exceed 60 minutes of walltime may have their processes killed.
The Origin machines are expected to be used to run jobs which cannot be run elsewhere. MACI has a large Compaq Alpha cluster in Calgary for its members. These machines are connected with Gigabit Ethernet and Myrinet networks. While the SGI Origins can be used for parallel work, MACI is encouraging all of its members to use the Compaq machines for any serial (i.e. single-processor) problems they have to run. In addition, CNS maintains machines specifically for scalar numerical work (the numerical server), as well as a PC cluster which provides a parallel environment. Please contact Research.Support@ualberta.ca if you wish to move your jobs to these other machines.
The maximum walltime (physical elapsed time) for jobs on Aurora and Australis is 24 hours, while the maximum walltime for jobs on Borealis is 12 hours. More precisely, at certain times of the day, all running jobs will be stopped. These times are currently 11:45 on Aurora and Australis, and 11:45 and 23:45 on Borealis. These restarts ensure that jobs in the queue have a chance to start within a reasonable period and keep large parallel jobs from being shut out. Please use checkpointing (saving the current state of your program before a restart) to avoid losing the results of your calculation during a restart and wasting CPU cycles.
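As an illustration of the checkpointing advice above, here is a minimal Python sketch. It is not a PBS feature or a site-supplied tool. It assumes the stop times quoted above are on a 24-hour clock and that the program state fits in a JSON file. The function names and the checkpoint path are hypothetical, chosen only for this example.

```python
import json
import os
import tempfile
from datetime import datetime, time, timedelta

# Daily stop times as quoted in the policy text (assumed 24-hour clock).
STOP_TIMES = {
    "aurora": [time(11, 45)],
    "australis": [time(11, 45)],
    "borealis": [time(11, 45), time(23, 45)],
}


def seconds_until_next_stop(machine, now):
    """Return the seconds from `now` until the machine's next scheduled stop."""
    candidates = []
    for stop in STOP_TIMES[machine]:
        moment = datetime.combine(now.date(), stop)
        if moment <= now:
            moment += timedelta(days=1)  # today's stop has passed; use tomorrow's
        candidates.append(moment)
    return (min(candidates) - now).total_seconds()


def save_checkpoint(path, state):
    """Write `state` to a temporary file, then rename it over `path`.

    The rename is atomic on POSIX file systems, so a job killed in the
    middle of a write leaves the previous checkpoint intact.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        return json.load(handle)


def run_with_checkpoints(path, total_steps, checkpoint_every):
    """Toy calculation that resumes from `path` and saves its state periodically."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        state["total"] += state["step"]  # stand-in for one unit of real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

A job resubmitted after a restart would call run_with_checkpoints with the same path and continue from the last saved step rather than from zero. A long-running program could also call seconds_until_next_stop to decide when to write a final checkpoint before the scheduled stop.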
The research support team will be happy to assist users in porting code and ensuring that their programs are making efficient use of the parallel architecture of the SGI Origins. Programs to be run on Australis may be evaluated by Research Support to ensure efficient use of the machine.
Updated: November, 2001.
Nothing new at this time.
Updated: November 7, 2001.
Previous Scheduling Policies archive file for Aurora/Borealis (contains the Policy reports back to April of '98, when Aurora first arrived).
Updated: November 21, 2000.