Saturday, November 21st 2009 02:05:01 PM PST
Triton Resource supports four queues for job submission. Users must have an account, through either TAPP or a specific project, so that the accounting system can charge for job time.
The default charge is per processing core per hour; if a node is allocated, all the cores on that node will be charged regardless of whether or not the job actually uses them. The base SU is anchored to the TCC nodes, which run at 2.4 GHz. Each node has eight such cores.
If PDAF or PDAFM nodes are specified, the charge carries a premium of two or four times the base rate, depending on the node's memory capacity. These nodes have 32 cores each and run at 2.5 GHz.
The table below provides details about available queues and their associated charges.
Memory requests apply to all nodes combined; core (ppn) requests are per node. For example:
#PBS -l nodes=2:ppn=16
#PBS -l mem=1024GB
If submitted to the large queue, the above request would not be deferred, since it is possible to match the request with existing resources. However, it would result in a charge factor of 256 (32 cores x 2 nodes x 4) because it must be scheduled on the PDAFM nodes (the only way to satisfy 1024GB on two nodes, which blocks all 32 cores on each node).
If the above request asked for two nodes and 2048GB, it would be reduced to 1024GB, since the system cannot provide more than 512GB per node.
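The reasoning in this example can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not the scheduler's actual logic: the function name `large_queue_placement` is hypothetical, and the sketch assumes the requested memory is split evenly across the requested nodes.

```python
# Per-node memory (GB) and charge premium, taken from the queue table.
PDAF_MEM_GB, PDAF_PREMIUM = 256, 2
PDAFM_MEM_GB, PDAFM_PREMIUM = 512, 4
CORES_PER_NODE = 32


def large_queue_placement(nodes, mem_gb):
    """Return (smallest_node_type, granted_mem_gb, premium) for a request.

    Hypothetical helper: assumes memory is divided evenly across nodes.
    """
    # The system cannot provide more than 512 GB per node, so larger
    # requests are reduced to that ceiling.
    granted = min(mem_gb, nodes * PDAFM_MEM_GB)
    per_node = granted / nodes
    if per_node <= PDAF_MEM_GB:
        return "PDAF", granted, PDAF_PREMIUM
    return "PDAFM", granted, PDAFM_PREMIUM


node_type, granted, premium = large_queue_placement(nodes=2, mem_gb=1024)
# Filling both nodes blocks all 32 cores on each of them.
charge_factor = CORES_PER_NODE * 2 * premium
```

With two nodes and 1024GB the sketch selects PDAFM nodes and a charge factor of 256, matching the text; a 2048GB request on two nodes is reduced to 1024GB.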
A special case of this example involves requests of more than 20 PDAF nodes. In this case, a combination of PDAF (256GB) and PDAFM (512GB) nodes is required. The scheduler is configured to require that the memory demand is satisfied by the smaller (256GB) nodes for all requested nodes, even though some of the allocated nodes would have 512GB.
Requests for resources exceeding the available maximums will be deferred and retried by the scheduler. After a limited number of retries, they will be put on hold and require administrator intervention.
A request for the interactive (express) queue made between 8 p.m. and 8 a.m. on a weeknight will be deferred until the next 8 a.m. to 8 p.m. weekday window and then scheduled.
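The deferral rule can be illustrated with a short Python sketch. The function name `next_interactive_start` is hypothetical, timestamps are assumed to be naive datetimes already in Pacific Time, and the weekend handling (deferring to Monday morning) is an assumption based on the queue's weekday-only hours rather than something stated explicitly here.

```python
from datetime import datetime, time, timedelta

WINDOW_OPEN = time(8, 0)    # 8 a.m.
WINDOW_CLOSE = time(20, 0)  # 8 p.m.


def next_interactive_start(submitted):
    """Earliest time an interactive request could be scheduled.

    `submitted` is a naive datetime assumed to be in Pacific Time.
    """
    # Inside a weekday window: no deferral needed.
    if submitted.weekday() < 5 and WINDOW_OPEN <= submitted.time() < WINDOW_CLOSE:
        return submitted
    # Before 8 a.m. the next window may open today; otherwise look at tomorrow.
    day = submitted.date()
    if submitted.time() >= WINDOW_OPEN:
        day += timedelta(days=1)
    # Skip Saturday (5) and Sunday (6).
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return datetime.combine(day, WINDOW_OPEN)
```

For instance, a request submitted at 9 p.m. on a Tuesday would be deferred to 8 a.m. Wednesday under these assumptions.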
Requests for more than the maximum number of nodes will not be rejected, as the scheduler makes no assumptions regarding future node availability. Requests that do not specify a memory size will be given the default amount of memory per node (see table below).
| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
|---|---|---|---|---|---|---|---|---|
| batch | TCC | 246 | 8 | none | 24x7 | 24GB | 24GB | 5904GB |
| small (shared) | TCC | 10 | 8 | none | 24x7 | 24GB | 24GB | 240GB |
| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
|---|---|---|---|---|---|---|---|---|
| large | PDAF | 19 | 32 | 2x | 8 a.m. to 8 p.m. PT Monday through Friday | 256GB | 128GB | 4864GB |
| large | PDAFM | 7 | 32 | 4x | 8 a.m. to 8 p.m. PT Monday through Friday | 512GB | 128GB | 3584GB |
| express (interactive) | PDAF | 1 | 32 | 2x | 8 a.m. to 8 p.m. PT Monday through Friday | 256GB | 128GB | 256GB |
| express (interactive) | PDAFM | 1 | 32 | 4x | 8 a.m. to 8 p.m. PT Monday through Friday | 512GB | 128GB | 512GB |
| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
|---|---|---|---|---|---|---|---|---|
| large | PDAF | 20 | 32 | 2x | 8 p.m. to 8 a.m. PT Monday through Friday and all weekend | 256GB | 128GB | 5120GB |
| large | PDAFM | 8 | 32 | 4x | 8 p.m. to 8 a.m. PT Monday through Friday and all weekend | 512GB | 128GB | 4096GB |
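In every row of the tables above, the Max Queue Memory column appears to equal the node count multiplied by the Max Node Memory. The short Python check below illustrates that relationship; the row labels such as "large (day)" and the helper name `max_queue_memory` are introduced here only for illustration.

```python
# (queue, cluster, nodes, max node memory GB, max queue memory GB),
# copied from the three queue tables.
ROWS = [
    ("batch", "TCC", 246, 24, 5904),
    ("small", "TCC", 10, 24, 240),
    ("large (day)", "PDAF", 19, 256, 4864),
    ("large (day)", "PDAFM", 7, 512, 3584),
    ("express", "PDAF", 1, 256, 256),
    ("express", "PDAFM", 1, 512, 512),
    ("large (night)", "PDAF", 20, 256, 5120),
    ("large (night)", "PDAFM", 8, 512, 4096),
]


def max_queue_memory(nodes, node_mem_gb):
    """Total memory if every node in the queue is used at its maximum."""
    return nodes * node_mem_gb


# Collect any rows where the table value differs from nodes x node memory.
mismatches = [row for row in ROWS if max_queue_memory(row[2], row[3]) != row[4]]
```

An empty `mismatches` list means the published totals are consistent with the per-node figures.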
SUs are charged per allocated processing core per hour, regardless of the number of cores actually used. The value is rounded to the nearest hour after multiplying the core count by the node premium (described below). See the Running Jobs page for more information on how to submit jobs to each queue.
2x Premium on PDAF: PDAF (256 GB) nodes are charged at twice the rate of TCC node cores; one hour of use incurs a charge of 64 SUs (32 cores x 2).
4x Premium on PDAFM: PDAFM (512 GB) nodes are charged at four times the rate of TCC node cores; one hour of use incurs a charge of 128 SUs (32 cores x 4).
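The per-node hourly rates follow directly from core count and premium. A minimal Python sketch, using a hypothetical lookup table `NODE_TYPES` and helper `sus_per_node_hour` built from the figures on this page:

```python
# Cores per node and charge premium for each node type, as described above.
NODE_TYPES = {
    "TCC": {"cores": 8, "premium": 1},     # base rate, no premium
    "PDAF": {"cores": 32, "premium": 2},   # 256 GB nodes
    "PDAFM": {"cores": 32, "premium": 4},  # 512 GB nodes
}


def sus_per_node_hour(node_type):
    """SUs charged for one full node for one hour."""
    spec = NODE_TYPES[node_type]
    return spec["cores"] * spec["premium"]
```

Under these figures a full TCC node costs 8 SUs per hour, a PDAF node 64, and a PDAFM node 128.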
The small queue is charged per core rather than per node, unlike the other queues, but jobs must share a node when other jobs can use it. There are 10 nodes in this queue.
The large queue policy for these nodes is based on the amount of memory requested, and charges are proportional to the relative number of CPUs on the node. For example, if a job requests 128 GB and one node, the charge will be for 16 cores (half of the cores on a 256-GB node). If a job requests 512 GB and two nodes, it will be charged for 64 cores (two entire 256-GB nodes). To get 32 cores on two 512-GB nodes (16 cores on each node), the job would need to request 16 processors per node. The user would still be charged for two full nodes.
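The memory-proportional core charge can be sketched as follows. This is an illustration only: `charged_cores` is a hypothetical helper, it assumes memory is split evenly across nodes, that the job lands on 256-GB nodes, and that fractional core counts round up. None of those details are confirmed here.

```python
import math

CORES_PER_NODE = 32
PDAF_NODE_MEM_GB = 256


def charged_cores(nodes, mem_gb, ppn=1):
    """Cores charged for a large-queue job placed on 256-GB nodes.

    Assumption: the charge per node is the larger of the requested ppn
    and the share of cores proportional to the memory used on that node.
    """
    per_node_mem = mem_gb / nodes
    mem_cores = math.ceil(CORES_PER_NODE * per_node_mem / PDAF_NODE_MEM_GB)
    per_node = min(CORES_PER_NODE, max(ppn, mem_cores))
    return per_node * nodes
```

This reproduces the two cases in the paragraph above: 128 GB on one node gives 16 cores, and 512 GB on two nodes gives 64 cores.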
The Triton node allocation policy favors PDAF nodes over more expensive PDAFM nodes, but this will not ensure a job will run on a PDAF node if a PDAFM node fits the scheduler's plan. To require a specific large queue node type, use the memory feature:
#PBS -l nodes=1:mem256gb
or
#PBS -l nodes=1:mem512gb
Note: The first example is not the same as specifying:
#PBS -l mem=256GB
which will only suggest and not require a PDAF node.
The express queue is only available between 8 a.m. and 8 p.m. Pacific Time, Monday through Friday. This queue is intended for interactive use; users may have only one job running at a time and no more than two jobs waiting.
Before a job can be scheduled, the system verifies available credits in the user account. It does not actually charge the account at this time, but SUs (CPU-hour credits) equal to the estimated charges must be available. The system uses values from the job script to estimate these charges according to the following formulas:
The formula for batch queue requests is:
#CPUs x #nodes x wall time
The formula for large queue requests is:
ChargeFactor x #CPUs x #nodes x wall time
The ChargeFactor for a 256-GB node is 2; for a 512-GB node it is 4.
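As a rough illustration of these formulas, the Python sketch below converts a PBS-style wall time into hours and applies the ChargeFactor. The function names `walltime_hours` and `estimated_sus` are hypothetical, and the sketch does not model any rounding the system may apply.

```python
def walltime_hours(walltime):
    """Convert an HH:MM:SS wall time string to hours."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours + minutes / 60 + seconds / 3600


def estimated_sus(cpus, nodes, walltime, charge_factor=1):
    """ChargeFactor x #CPUs x #nodes x wall time.

    For the batch queue the ChargeFactor is effectively 1 and #CPUs is
    the full eight cores of each node; for the large queue the factor
    is 2 (256-GB node) or 4 (512-GB node).
    """
    return charge_factor * cpus * nodes * walltime_hours(walltime)


# Two full batch nodes for 90 minutes.
batch_estimate = estimated_sus(cpus=8, nodes=2, walltime="1:30:00")
# One full 512-GB node for 30 minutes.
large_estimate = estimated_sus(cpus=32, nodes=1, walltime="0:30:00", charge_factor=4)
```

Under these assumptions the batch request needs 24 SUs available and the large request needs 64.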
Here are some examples:
Queue batch: for a single-node job requesting four CPUs with two hours maximum wall clock time, submitted via this script:
#PBS -q batch
#PBS -l nodes=1:ppn=4
#PBS -l walltime=2:00:00
For this request to get scheduled, the account must have available
8 x 1 x 2 = 16 SUs
Batch queue nodes are charged for all eight CPUs regardless of how many are actually requested or used by the job. To be charged only for CPUs actually used, submit to the small queue.
Queue large: for a single-node job requesting four CPUs with two hours maximum wall clock time, submitted via this script:
#PBS -q large
#PBS -l nodes=1:ppn=4
#PBS -l walltime=2:00:00
For this request to get scheduled, the account must have available
2 x 4 x 1 x 2 = 16 SUs
Adding a 256-GB memory request to this job increases the CPUs required to 32, the full complement of a 256-GB node. The ChargeFactor for a 256-GB node is 2.
#PBS -l mem=256GB
Note: This memory request will result in actual availability of 252 GB due to system overhead.
For this request to get scheduled, the account must have available
2 x 32 x 1 x 2 = 128 SUs
These jobs could be scheduled on either a 256-GB node or a 512-GB node, depending on availability and system load. If the memory-requesting job runs on a 512-GB node, the ChargeFactor would be 4, but CPUs required would be 16, so the SUs required to schedule remains 128.
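The reason the requirement is the same on either node type is that the higher ChargeFactor is offset by the smaller share of cores the memory occupies. A small Python sketch of that arithmetic, assuming CPUs are charged in proportion to the fraction of node memory requested (the helper name `required_sus` is hypothetical, and the integer division assumes memory sizes that divide evenly):

```python
CORES_PER_NODE = 32
# node type -> (node memory in GB, ChargeFactor)
NODE_TYPES = {"PDAF": (256, 2), "PDAFM": (512, 4)}


def required_sus(mem_gb, wall_hours, node_type, nodes=1):
    """SUs that must be available for a memory-driven large-queue request.

    Assumption: CPUs charged = 32 x (requested memory / node memory).
    """
    node_mem, factor = NODE_TYPES[node_type]
    cpus = CORES_PER_NODE * mem_gb // node_mem
    return factor * cpus * nodes * wall_hours


on_pdaf = required_sus(256, 2, "PDAF")    # 2 x 32 x 1 x 2
on_pdafm = required_sus(256, 2, "PDAFM")  # 4 x 16 x 1 x 2
```

Both placements come to 128 SUs for the two-hour, 256-GB request.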
Queue large: changing the memory request to 512 GB increases the CPUs required, and guarantees the job will be scheduled on a 512-GB node with 504 GB memory available (and using a ChargeFactor of 4):
#PBS -l mem=512GB
For this request to get scheduled, the account must have available
4 x 32 x 1 x 2 = 256 SUs
When any job finishes, the account gets debited by actual CPU time used (rounded to the nearest half hour).
If a job runs more than five minutes beyond its requested wall time, it will be canceled by the system. Jobs whose charges exceed the SUs available in the account will not be canceled, but will leave a negative balance that can be credited later.
