UCSD Triton Resource @ SDSC


Charge Policies for Triton Compute Jobs

Overview of Triton Accounting

Triton Resource supports four queues for job submission. Users must have an account, either through the TAPP or through a specific project, so that the accounting system can charge the job time.

The default charge is per processing core per hour; if a node is allocated, all the cores on that node are charged whether or not the job actually uses them. The base SU is calibrated to the cores of the TCC nodes, which run at 2.4 GHz. Each TCC node has eight such cores.

If PDAFM nodes are specified, the charge has a premium of two times the base rate. These nodes have 32 cores and run at 2.5 GHz.
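As a rough illustration of the whole-node charging rule, the following Python sketch computes the SUs charged for one hour on one fully allocated node. The names `NODE_TYPES` and `node_su_per_hour` are hypothetical and not part of any Triton tool. The TCC and PDAFM figures come from the text above; the PDAF entry (32 cores, no premium) is taken from the queue tables on this page.

```python
# Illustrative only: cores per node and charge premium for each node type.
NODE_TYPES = {
    "TCC": {"cores": 8, "premium": 1},
    "PDAF": {"cores": 32, "premium": 1},
    "PDAFM": {"cores": 32, "premium": 2},
}


def node_su_per_hour(node_type: str) -> int:
    """SUs charged for one hour on one fully allocated node."""
    spec = NODE_TYPES[node_type]
    return spec["cores"] * spec["premium"]


for name in NODE_TYPES:
    print(name, node_su_per_hour(name))
```

Under these assumptions a TCC node costs 8 SUs per hour, a PDAF node 32, and a PDAFM node 64.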

The following general policies are in effect for scheduling jobs on Triton:

The table below provides details about available queues and charges.

Job Charging on Triton

How are jobs charged to my account?

By default, user accounts are set to charge jobs against a personal account that matches their username. If preferred, the default can be set to charge against another account, such as a project or shared account based on a TAPP or campus allocation.

If a job is submitted and the account is depleted below the estimated SUs needed to run, it will be deferred until the account is replenished. The qstat -f command will report a message similar to:

cannot debit job account - no funds

After the account balance is adjusted, the job will be able to run without being resubmitted. It will go into the idle state when the scheduler rechecks balances, and then get scheduled normally.

How do I check my account balance?

You can check your account balance and status by running gbalance -u <username>. This will show your personal account as well as any other accounts you can charge to.

For more information on Triton accounts, please see the FAQ Accounts section.

Running Jobs and Triton Accounts

To specify the account to be charged, use the -A option. It is recommended to use this option with all job submission scripts and qsub commands, to clearly indicate which account the user wants to be charged for the job.

#PBS -A <account name>

Memory requests are for all nodes combined. Node and core requests are per-node.

If submitted to the large queue, the above request would not be deferred, since it is possible to match the request with existing resources. However, it would result in a charge factor of 128 (32 cores x 2 nodes x 2) because it must be scheduled on the PDAFM nodes (the only way to satisfy 1008GB on two nodes, which blocks all 32 cores on each node).

If the above request asked for two nodes and 2016GB, it would be reduced to 1008GB, since the system cannot provide more than 504GB per node.
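The arithmetic in the two preceding paragraphs can be sketched in Python as follows. This is an illustration under the figures quoted above (504GB usable per node, 32 cores per node, 2x premium), not the scheduler's actual logic; the function and constant names are hypothetical.

```python
USABLE_GB_PER_NODE = 504  # per-node ceiling quoted in the text above


def clamp_memory_request(nodes: int, mem_gb: int) -> int:
    """Reduce a total memory request to what the requested nodes can provide."""
    return min(mem_gb, nodes * USABLE_GB_PER_NODE)


def pdafm_charge_factor(nodes: int, cores_per_node: int = 32, premium: int = 2) -> int:
    """Hourly charge factor when every core on each PDAFM node is blocked."""
    return cores_per_node * nodes * premium


print(clamp_memory_request(2, 2016))  # 1008
print(pdafm_charge_factor(2))         # 128
```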

A special case of this example involves requests of more than 20 PDAF nodes. In this case, a combination of PDAF (256GB) and PDAFM (512GB) nodes is required. The scheduler is configured to require that the memory demand is satisfied by the smaller (256GB) nodes for all requested nodes, even though some of the allocated nodes would have 512GB.

Requests for resources exceeding the available maximums will be deferred and retried by the scheduler. After a limited number of retries, they will be put on hold and require administrator intervention.

A request for the express (interactive) queue made between 8 p.m. and 8 a.m. on weeknights will be deferred until the next 8 a.m.-8 p.m. weekday window and then scheduled.
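A minimal sketch of this deferral rule, written in Python for illustration. It assumes naive `datetime` values already expressed in Pacific Time, treats 8 p.m. as the exclusive end of the window, and also defers weekend requests to Monday morning; those choices and all names are assumptions, not documented scheduler behavior.

```python
from datetime import datetime, time, timedelta

WINDOW_OPEN = time(8, 0)    # 8 a.m.
WINDOW_CLOSE = time(20, 0)  # 8 p.m.


def in_express_window(when: datetime) -> bool:
    """True if `when` falls inside the weekday 8 a.m. to 8 p.m. window."""
    return when.weekday() < 5 and WINDOW_OPEN <= when.time() < WINDOW_CLOSE


def next_express_start(when: datetime) -> datetime:
    """Earliest time at or after `when` at which an express job could start."""
    if in_express_window(when):
        return when
    candidate = when.replace(hour=8, minute=0, second=0, microsecond=0)
    # After closing time, or on a weekend, the same-day opening is not usable.
    if when.time() >= WINDOW_CLOSE or when.weekday() >= 5:
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # skip Saturday and Sunday
        candidate += timedelta(days=1)
    return candidate


# A request made late on Friday, January 4, 2013 waits until Monday morning.
print(next_express_start(datetime(2013, 1, 4, 23, 25)))
```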

Requests for more than the maximum number of nodes will not be rejected, as the scheduler makes no assumptions regarding future node availability. Requests that do not specify a memory size will be given the default amount of memory per node (see table below).

Job Queues Available Any Time

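| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| batch | TCC | 246 | 8 | none | 24x7 | 24GB | 24GB | 5904GB |
| small (shared) | TCC | variable | 8 | none | 24x7 | 24GB | 24GB | 960GB |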

Job Queues Only Available on Weekdays

| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| large | PDAF | 19 | 32 | 1x | 8 a.m. to 8 p.m. PT Monday through Friday | 256GB | 128GB | 4864GB |
| large | PDAFM | 7 | 32 | 2x | 8 a.m. to 8 p.m. PT Monday through Friday | 512GB | 128GB | 3584GB |
| express (interactive) | PDAF | 1 | 32 | 1x | 8 a.m. to 8 p.m. PT Monday through Friday | 256GB | 128GB | 256GB |
| express (interactive) | PDAFM | 1 | 32 | 2x | 8 a.m. to 8 p.m. PT Monday through Friday | 512GB | 128GB | 512GB |

Job Queues Available on Nights and Weekends

| Queue | Cluster | Nodes in Queue (node Max) | Cores per Node (ppn Max) | Charge Premium | Hours Available | Max Node Memory | Default Node Memory | Max Queue Memory |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| large | PDAF | 20 | 32 | 1x | 8 p.m. to 8 a.m. PT Monday through Friday and all weekend | 256GB | 128GB | 5120GB |
| large | PDAFM | 8 | 32 | 2x | 8 p.m. to 8 a.m. PT Monday through Friday and all weekend | 512GB | 128GB | 4096GB |
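The Max Queue Memory figures in the queue tables appear to be the node count multiplied by the Max Node Memory. The Python sketch below checks that reading against the listed values. The row data is copied from the tables; the small queue is left out because its node count is variable. The relationship itself is an inference from the numbers, not a documented rule, and the names are hypothetical.

```python
# (queue, cluster, availability, nodes, max node memory GB, listed max queue memory GB)
QUEUE_ROWS = [
    ("batch", "TCC", "any time", 246, 24, 5904),
    ("large", "PDAF", "weekdays", 19, 256, 4864),
    ("large", "PDAFM", "weekdays", 7, 512, 3584),
    ("express", "PDAF", "weekdays", 1, 256, 256),
    ("express", "PDAFM", "weekdays", 1, 512, 512),
    ("large", "PDAF", "nights and weekends", 20, 256, 5120),
    ("large", "PDAFM", "nights and weekends", 8, 512, 4096),
]


def max_queue_memory_gb(nodes: int, node_memory_gb: int) -> int:
    """Total memory available to a queue if every node is fully used."""
    return nodes * node_memory_gb


for queue, cluster, period, nodes, node_gb, listed_gb in QUEUE_ROWS:
    computed = max_queue_memory_gb(nodes, node_gb)
    print(f"{queue:8} {cluster:6} {period:20} {computed} GB (listed: {listed_gb} GB)")
```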

SUs are charged per hour for every processing core allocated to a job, regardless of the number of cores actually used. The value is rounded to the nearest hour after multiplying the core count by the node premium (described below). See the Running Jobs page for more information on how to submit jobs to each queue.

No Premium on PDAF: PDAF (256 GB) nodes are charged at the same rate as TCC node cores; one hour of use incurs a charge of 32 SUs (32 cores x 1).

2x Premium on PDAFM: PDAFM (512 GB) nodes are charged at two times the rate of TCC node cores; one hour incurs a charge of 64 SUs (32 cores x 2).

A job that specifically requests PDAF (see example below) may actually get scheduled on a PDAFM node, but it will be charged at the lower PDAF rate.

Jobs in the small queue are charged per core, not per node as in all the other queues, but a job must share its node with any other jobs that can use it. The number of nodes in this queue varies with demand.

The large queue policy for these nodes is based on the amount of memory requested, and charges are proportional to the relative number of CPUs on the node. For example, if a job requests 126 GB and one node, the charge will be for 16 cores (half of the cores on a 256-GB node). If a job requests 504 GB and two nodes, it will be charged for 64 cores (two entire 256-GB nodes). To get 32 cores on two 512-GB nodes (16 cores on each node), the job would need to request 16 processors per node. The user would still be charged for two full nodes.

The Triton node allocation policy favors PDAF nodes over more expensive PDAFM nodes. If a job requests 252 GB, it will be charged at the PDAF rate only if the full complement of 32 processors is also requested. To request a specific large queue node type, use the memory attribute:

#PBS -q large
#PBS -l nodes=1:ppn=32

and either

#PBS -l mem=252gb
or

#PBS -l mem=504gb

The express queue is only available between 8 a.m. and 8 p.m. Pacific Time, Monday through Friday. This queue is intended for interactive use; each user may have only one job running at a time and no more than two jobs waiting.

Job Charging Examples

Account Charge Factors

This table shows the main factors determining how account charges are generated for each of the main queues and node types of the Triton Resource.

Memory-request Driven Charge Factors
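| Queue | Memory (determining factor) | Allocated Cores (on single node of this type) | Charge Factor | SUs Charged (per CPU-hour) |
| --- | --- | --- | --- | --- |
| *large | 128gb | 16 (PDAF) | 1x | 16 |
| large | 128gb | 8 (PDAFM) | 2x | 16 |
| large | 256gb | 32 (PDAF) | 1x | 32 |
| large | 256gb | 16 (PDAFM) | 2x | 32 |
| large | 384gb | 24 (PDAFM) | 2x | 48 |
| large | 512gb | 32 (PDAFM) | 2x | 64 |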

*Note: Large queue requests can be scheduled on either a PDAF or PDAFM node. To explicitly request time on a 256-GB (PDAF) node, use the memory attribute:

#PBS -q large
#PBS -l nodes=1:ppn=32
#PBS -l mem=252gb

To explicitly request time on a 512-GB (PDAFM) node, request > 252 GB/node or use the memory attribute:

#PBS -q large
#PBS -l nodes=1:ppn=32
#PBS -l mem=504gb
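The rows of the Memory-request Driven Charge Factors table are consistent with a simple proportional rule: the allocated cores are the node's 32 cores scaled by the fraction of the node's memory requested, and the SUs charged are that core count times the node's charge factor. The Python sketch below reproduces the table under that assumption. The rule is inferred from the table rather than stated on this page, and all names are hypothetical.

```python
NODE_MEMORY_GB = {"PDAF": 256, "PDAFM": 512}  # nominal node memory
CHARGE_FACTOR = {"PDAF": 1, "PDAFM": 2}
CORES_PER_NODE = 32


def allocated_cores(mem_gb: int, node_type: str) -> int:
    """Cores allocated on one node, proportional to the memory requested."""
    return CORES_PER_NODE * mem_gb // NODE_MEMORY_GB[node_type]


def su_per_hour(mem_gb: int, node_type: str) -> int:
    """SUs charged per hour for a single-node large queue job."""
    return CHARGE_FACTOR[node_type] * allocated_cores(mem_gb, node_type)


for mem_gb, node_type in [(128, "PDAF"), (128, "PDAFM"), (256, "PDAF"),
                          (256, "PDAFM"), (384, "PDAFM"), (512, "PDAFM")]:
    print(mem_gb, node_type, allocated_cores(mem_gb, node_type),
          su_per_hour(mem_gb, node_type))
```

One consequence visible in the output: for 128gb and 256gb requests, the hourly charge is the same whichever node type the job lands on.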

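Processor-request Driven Charge Factors

| Queue | Node Type | PPN (Requested) | Node Count | Allocated Cores (determining factor) | SUs Charged (per CPU-hour) |
| --- | --- | --- | --- | --- | --- |
| batch | TCC | 1 | 1 | 8 | 8 |
| batch | TCC | Not specified | 1 | 8 | 8 |
| batch | TCC | Not specified | 2 | 16 | 16 |

Shared Node Charge Factors

| Queue | Node Type | PPN (Requested) | Allocated Cores (determining factor) | SUs Charged (per CPU-hour) |
| --- | --- | --- | --- | --- |
| *small | TCC | 1 | 1 | 1 |
| small | TCC | 4 | 4 | 4 |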

*Note: Small queue requests must share the node with other jobs requesting the small queue. There will be contention for memory, network, and disk space available to the node when sharing with other jobs.
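The difference between the batch and small queue rows can be sketched in a few lines of Python. This is only an illustration of the two tables above, with hypothetical function names: batch jobs are charged for all eight cores on every allocated node, while small queue jobs are charged only for the cores they request.

```python
TCC_CORES_PER_NODE = 8


def batch_su_per_hour(nodes: int) -> int:
    """Batch queue: every core on each allocated node is charged."""
    return TCC_CORES_PER_NODE * nodes


def small_su_per_hour(ppn: int) -> int:
    """Small (shared) queue: only the requested cores are charged."""
    if not 1 <= ppn <= TCC_CORES_PER_NODE:
        raise ValueError("ppn must be between 1 and 8 on a TCC node")
    return ppn


print(batch_su_per_hour(1), batch_su_per_hour(2))  # 8 16
print(small_su_per_hour(1), small_su_per_hour(4))  # 1 4
```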

Account Charge Pre-verification

Before a job can be scheduled, the system verifies available credits in the user account. It does not actually charge the account at this time, but SUs (CPU-hour credits) equal to the estimated charges must be available. The system uses values from the job script to estimate these charges according to the following formulas:

The formula for batch queue requests is:

#CPUs x #nodes x wall time

The formula for large queue requests is:

ChargeFactor x #CPUs x #nodes x wall time

The ChargeFactor for a 256-GB node is 1; for a 512-GB node it is 2.
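The two formulas translate directly into code. The Python sketch below is illustrative only: the function names are hypothetical, and the `can_schedule` check simply encodes the statement that the estimated SUs must be available in the account before a job is scheduled.

```python
def estimate_batch_sus(nodes: int, wall_hours: float, cpus_per_node: int = 8) -> float:
    """#CPUs x #nodes x wall time (batch nodes are charged for all eight CPUs)."""
    return cpus_per_node * nodes * wall_hours


def estimate_large_sus(charge_factor: int, cpus_per_node: int,
                       nodes: int, wall_hours: float) -> float:
    """ChargeFactor x #CPUs x #nodes x wall time."""
    return charge_factor * cpus_per_node * nodes * wall_hours


def can_schedule(balance: float, estimate: float) -> bool:
    """A job is deferred unless the account holds at least the estimate."""
    return balance >= estimate


print(estimate_batch_sus(nodes=1, wall_hours=2))        # 16
print(estimate_large_sus(1, 16, nodes=1, wall_hours=2))  # 32
print(estimate_large_sus(2, 32, nodes=1, wall_hours=2))  # 128
```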

Here are some examples:

  1. Queue batch: for a single-node job requesting four CPUs with two hours maximum wall clock time, submitted via this script:

    For this request to get scheduled, the account must have available

    8 x 1 x 2 = 16 SUs

    Batch queue nodes are charged for all eight CPUs regardless of how many are actually requested or used by the job. To be charged only for CPUs actually used, submit to the small queue.

  2. Queue large: for a single-node job requesting four CPUs with two hours maximum wall clock time, submitted via this script:

    Note: PDAF/M requests will be allocated either 16 or 32 cores per node. No smaller CPU allocations are supported.

    For this request to get scheduled, the account must have available

    1 x 16 x 1 x 2 = 32 SUs

    Adding 252GB memory to this request increases the CPUs required. The #CPUs becomes 32 due to requesting entire memory of the node.

    #PBS -l mem=252gb

    For this request to get scheduled, the account must have available

    1 x 32 x 1 x 2 = 64 SUs

    These jobs could be scheduled on either a 256-GB node or a 512-GB node, depending on availability and system load. If the memory-requesting job runs on a 512-GB node, the ChargeFactor would be 2, but the CPUs required would be only 16 (because only half of the PDAFM node's memory is required), so the SUs required to schedule remain 64.

    2 x 16 x 1 x 2 = 64 SUs

  3. Queue large: changing the memory request to 504 GB increases the CPUs required, and guarantees the job will be scheduled on a 512-GB node with 504 GB memory available (and using a ChargeFactor of 2):

    #PBS -l mem=504gb

    For this request to get scheduled, the account must have available

    2 x 32 x 1 x 2 = 128 SUs

When any job finishes, the account gets debited by actual CPU time used (rounded to the nearest half hour).

If a job runs more than five minutes beyond its requested wall time, it will be canceled by the system. Jobs whose charges exceed the available SUs in the account will not be canceled, but will leave a negative balance that can be credited later.

