# Service Units
One service unit (SU) is approximately equal to one core-hour of computing. Charges are calculated from factors that the Slurm workload manager calls Trackable RESources (TRES). The TRES values relevant to calculating CRC SUs are:
- Number of cores requested
- RAM requested
- Number of GPU cards requested (GPU cluster only)
Each of these has a TRES billing weight assigned to it in the cluster configuration files. These weights, together with the amount of each resource allocated to your job, determine the job's total cost in SUs. The weights for each cluster and partition are:
| Cluster | Partition | CPU/GPU Weight | Memory Weight (per GB) |
|---|---|---|---|
| SMP | smp | 0.8 | 0.102 |
| | high-mem | 1.0 | 0.0477 |
| MPI | opa-high-mem | 1 | 0.149 |
| | mpi | 1 | 0.93 |
| GPU | gtx1080 | 1 | 0 |
| | v100 | 5 | 0 |
| | power9 | 5 | 0 |
| | a100 | 8 | 0 |
| | a100_multi | 8 | 0 |
| | a100_nvlink | 8 | 0 |
| | l40s | 8 | 0 |
| HTC | htc | 1 | 0.128 |
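As a rough illustration, the weights above can be combined into an SU estimate. The sketch below is our own reconstruction, not CRC's documented formula: it assumes Slurm's MAX_TRES billing (the charge is driven by the single largest weighted resource) and that memory weights apply per GB.

```python
# Rough SU estimator -- an assumption-laden sketch, not CRC's official formula.
# Assumes MAX_TRES billing: the job is charged for its largest weighted TRES.

def estimate_sus(hours, cores=0, mem_gb=0.0, gpus=0,
                 cpu_weight=0.0, mem_weight=0.0, gpu_weight=0.0):
    """Estimated SUs = max(weighted cores, memory, GPUs) * wall-clock hours."""
    billing = max(cores * cpu_weight, mem_gb * mem_weight, gpus * gpu_weight)
    return billing * hours

# 16 cores + 64 GB for 10 hours on smp (weights 0.8 per core, 0.102 per GB):
print(estimate_sus(10, cores=16, mem_gb=64, cpu_weight=0.8, mem_weight=0.102))  # 128.0
# One V100 card for 10 hours (GPU weight 5, memory weight 0):
print(estimate_sus(10, gpus=1, gpu_weight=5))  # 50.0
```

In this sketch a memory-heavy job is billed by its RAM footprint rather than its core count, which is why the memory weights matter even for CPU-only jobs.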
To see a more detailed view of this information (including investment hardware configurations), you can use the Slurm `scontrol` command, providing the `-M` flag to specify a cluster:
```
[nlc60@login0b ~] : scontrol -M htc show partition
PartitionName=htc
AllowGroups=ALL AllowAccounts=ALL AllowQos=short,normal,long,htc-htc-s,htc-htc-n,htc-htc-l,htc-htc-s-invest,htc-htc-n-invest,htc-htc-l-invest
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=1 MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=htc-1024-n[0-3],htc-n[24-49]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=CANCEL
State=UP TotalCPUs=1792 TotalNodes=30 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
TRES=cpu=1792,mem=19587200M,node=30,billing=2448
TRESBillingWeights=CPU=1.0,Mem=0.128G

PartitionName=scavenger
AllowGroups=ALL AllowAccounts=ALL AllowQos=short,normal,long,htc-scavenger-s,htc-scavenger-n,htc-scavenger-l
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=1 MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=htc-n[24-31]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=CANCEL
State=UP TotalCPUs=384 TotalNodes=8 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
TRES=cpu=384,mem=6180000M,node=8
TRESBillingWeights=CPU=0.0,Mem=0.0G
```
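The `TRES` and `TRESBillingWeights` lines fit together: applying the weights to the htc partition's total resources reproduces its `billing=2448` value, if we assume (our assumption, not stated CRC policy) that billing takes the maximum weighted TRES value and that the memory weight applies per GB:

```python
# Hedged sketch: reproducing billing=2448 for the htc partition from its
# scontrol output, assuming max-of-weighted-TRES billing and per-GB weights.

cpu_weight, mem_weight_per_gb = 1.0, 0.128   # from TRESBillingWeights=CPU=1.0,Mem=0.128G
cpus, mem_mb = 1792, 19587200                # from TRES=cpu=1792,mem=19587200M

cpu_billing = cpus * cpu_weight                     # 1792.0
mem_billing = (mem_mb / 1024) * mem_weight_per_gb   # ~2448.4
billing = max(cpu_billing, mem_billing)
print(round(billing))  # 2448, matching billing=2448 above
```

The memory term (~2448.4) exceeds the CPU term (1792) here, which is consistent with the printed billing figure being the larger of the two rather than their sum.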
For a concise view of the TRES used by a job, you can use the `sacct` command:
```
[nlc60@login0b ~] : sacct -X -M smp -j 6169876 --format=User,JobID,Jobname,AllocTRES%30,Elapsed
     User        JobID    JobName                      AllocTRES    Elapsed
--------- ------------ ---------- ------------------------------ ----------
    nlc60      6169876 hello_wor+         cpu=1,mem=4018M,node=1   00:00:01
```
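Combining this `sacct` output with the weights table gives the job's charge. A sketch, again under our assumptions of max-of-weighted-TRES billing and the smp partition weights (0.8 per core, 0.102 per GB):

```python
# Hedged sketch: turning the sacct output above into an SU charge.
# Assumes smp weights (CPU 0.8, memory 0.102 per GB) and max-TRES billing.

cpu_billing = 1 * 0.8                  # AllocTRES cpu=1
mem_billing = (4018 / 1024) * 0.102    # AllocTRES mem=4018M, ~3.92 GB -> ~0.40
billing = max(cpu_billing, mem_billing)  # 0.8, driven by the CPU term
elapsed_hours = 1 / 3600               # Elapsed 00:00:01
print(billing * elapsed_hours)         # ~0.00022 SUs for this one-second job
```

Even though only one core was requested, the 4018M default memory allocation is weighted too; for this job the CPU term still dominates.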