The phoenix cluster policy is enforced mainly through resource-pool limits. Each contribution is counted according to the CPUs, memory, and GPUs added to the cluster. Different CPU types are not distinguished.
Starting Mar. 18, 2025, GPUs are grouped by type, and the limit applies per GPU-group contribution.
Each CS lab (PI) has a Slurm account on the phoenix cluster. Each such account is limited by the lab's contribution to the cluster.
For example, if a lab contributed 2 nodes, each with 8 CPUs, 500G RAM, and 2 GPUs from group g0, then that lab won't be able to use more than 16 CPUs, 1T RAM, and 4 GPUs from group g0 at any given time.
Due to scheduling constraints, especially when the cluster is loaded, jobs might still be delayed even before the lab's limit has been reached.
To check an account's usage and limits, use the slimits utility:
slimits
Starting Mar. 18, 2025, the GPUs on the phoenix cluster are grouped by GPU type and are treated as a resource like CPU, memory, etc. Each lab is limited to the GPU groups it contributed to.
To request a GPU, one needs to either specify the exact GPU type, e.g. `--gres gpu:a10`, or the GPU group to take the GPU from, e.g. `--gres gg:g0`. Either way, the limit is counted against the relevant GPU group. Requesting a generic GPU (e.g. `--gres gpu:3`) or a generic GPU group (e.g. `--gres gg:3`) will not work.
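As a sketch, a batch script could request GPUs in either of the two supported forms. The CPU and memory values below are illustrative placeholders, not taken from this document:

```shell
#!/bin/bash
# Request one a10 GPU explicitly by type (counted against group g0):
#SBATCH --gres=gpu:a10:1
# ...or, alternatively, request any one GPU from group g0
# (uncomment this line and comment out the line above):
##SBATCH --gres=gg:g0:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G

nvidia-smi   # show which GPU was allocated
```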
All users also have access to the killable account. This account has no limits, but its jobs can be killed by "normal" jobs.
Courses, projects (e.g. engineering), and labs that didn't contribute to the cluster can only run on the killable account.
Jobs in the killable account have the lowest priority.
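A minimal sketch of a preemptible submission, assuming the Slurm account is literally named `killable` (the document names the account but not the exact spelling used on the command line):

```shell
#!/bin/bash
# Hypothetical sketch: submit a preemptible job via the killable account.
#SBATCH --account=killable
#SBATCH --gres=gpu:rtx2080:1
# Allow Slurm to requeue the job if it is killed by a "normal" job:
#SBATCH --requeue

srun python train.py
```

Since killable jobs can be terminated at any time, periodic checkpointing combined with `--requeue` is the usual way to make progress despite preemption.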
Priority is a number assigned to each job that determines the order in which the scheduler will try to schedule jobs.
Priorities are not related to the limits: high-priority jobs might not run because their lab has reached its resource limits, in which case lower-priority jobs will start before the higher-priority ones.
Currently there are no priority differences between accounts; all accounts are equal priority-wise. A lab that bought more resources will have higher limits, not higher priority.
The main priority factor is fair-share: the more a user (or a lab) has used in the past, the lower their priority. Currently only CPU time is taken into account for usage history, but this might change in the future, and different weights could be given to different resources (CPU, memory, GPU). The historic usage used for the fair-share calculation decays with a half-life of 7 days.
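The half-life decay described above can be sketched as follows. The function name and the exact weighting formula are illustrative (Slurm's full fair-share computation involves more factors); this shows only the 7-day half-life applied to past usage:

```python
def decayed_usage(cpu_seconds: float, days_ago: float,
                  half_life_days: float = 7.0) -> float:
    """Weight past CPU-time usage by its age: it halves every 7 days."""
    return cpu_seconds * 0.5 ** (days_ago / half_life_days)

# 100k CPU-seconds used 7 days ago counts as 50k today,
# and 14-day-old usage counts as only a quarter.
print(decayed_usage(100_000, 7))   # 50000.0
print(decayed_usage(100_000, 14))  # 25000.0
```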
The cinder cluster comprises old phoenix cluster nodes. In September of each year, nodes older than 7 years are moved to the cinder cluster.
The cinder cluster's policy is the same as the phoenix cluster's.
- Set the GPU limit per GPU group/type, instead of generally on GPUs.
- Another layer of killable jobs: labs could pay for <resource>-time (cpu-time, gpu-time, etc.). These jobs would have priority between "killable" jobs and "normal" jobs: they could kill killable jobs and could be killed by "normal" jobs.
- Slurm supports cluster federation, in which jobs are sent to all clusters the user/account can run on. This could help users who have access to other clusters, such as hm or blaise, or generally increase utilization if killable jobs could also migrate.
Nodes | # | CPU (sockets:cores:threads) | RAM | GPU |
---|---|---|---|---|
ampere-01 | 1 | 32 (8:4:1) | 377G | 8 a10 (22G,g0) |
arion-[01-02] | 2 | 128 (2:64:1) | 503G | 8 a10 (22G,g0) |
binky-[01-05] | 5 | 48 (8:6:1) | 503G | 8 a5000 (24G,g0) |
creek-[01-04] | 4 | 72 (2:18:2) | 251G | 8 rtx2080 (10G,g0) |
cyril-01 | 1 | 128 (2:64:1) | 503G | 8 a6000 (48G,g4) |
drape-[01-03] | 3 | 128 (2:64:1) | 503G | 8 a5000 (24G,g0) |
dumfries-003 | 1 | 32 (2:8:2) | 125G | 2 rtx2080 (10G,g0) |
dumfries-006 | 1 | 32 (2:8:2) | 125G | 3 rtx2080 (10G,g0) |
dumfries-007 | 1 | 32 (2:8:2) | 125G | 2 rtx2080 (10G,g0) |
dumfries-[001-002] | 2 | 32 (2:8:2) | 125G | 4 rtx2080 (10G,g0) |
dumfries-[004-005] | 2 | 32 (2:8:2) | 125G | 4 rtx2080 (10G,g0) |
dumfries-[008-010] | 3 | 32 (2:8:2) | 125G | 4 rtx2080 (10G,g0) |
epona-[01-02] | 2 | 128 (2:64:1) | 503G | 8 a40 (45G,g4) |
firefoot-[01-07] | 7 | 64 (2:32:1) | 1T | 8 l40s (45G,g4) |
firth-[01-02] | 2 | 40 (2:10:2) | 376G | 8 rtx6000 (24G,g0) |
gringolet-[01-06] | 6 | 128 (2:64:1) | 1T | |
hasufel-[01-02] | 2 | 64 (2:32:1) | 1T | 8 l4 (23G,g0) |
wadi-[01-05] | 5 | 128 (2:32:2) | 503G | |
Nodes | # | CPU (sockets:cores:threads) | RAM | GPU |
---|---|---|---|---|
cb-[06,08,10,14,19] | 5 | 16 (2:4:2) | 62G | |
cortex-01 | 1 | 16 (2:8:1) | 251G | 8 m60 (8G) |
cortex-02 | 1 | 16 (2:8:1) | 251G | 6 m60 (8G) |
cortex-[03-05] | 3 | 16 (2:8:1) | 251G | 8 m60 (8G) |
cortex-[06-08] | 3 | 24 (2:12:1) | 251G | 8 m60 (8G) |
gsm-01 | 1 | 32 (2:8:2) | 251G | |
gsm-[03-04] | 2 | 32 (2:8:2) | 251G | 3 black (6G) |
lucy-[02-03] | 2 | 48 (2:12:2) | 377G | 2 gtx980 (4G) |
ohm-[54-64] | 11 | 48 (2:12:2) | 251G | |
oxygen-[01-08] | 8 | 48 (2:12:2) | 251G | |
sm-15 | 1 | 16 (2:4:2) | 23G | |
sm-16 | 1 | 16 (2:4:2) | 46G | |
sm-[01-04,08] | 5 | 16 (2:4:2) | 46G | |
sm-[17-18,20] | 3 | 24 (2:6:2) | 62G | |
Resource | Phoenix | Cinder |
---|---|---|
Nodes | 50 | 47 |
CPUs | 3968 | 1520 |
Memory | 27.15T | 8.87T |
GPUs | 267 | 72 |
GPU Group: g0 | 187 | N/A |
GPU Group: g4 | 80 | N/A |
GPU: gtx980 (4G) | 0 | 4 |
GPU: black (6G) | 0 | 6 |
GPU: m60 (8G) | 0 | 62 |
GPU: rtx2080 (10G) | 67 | 0 |
GPU: a10 (22G) | 24 | 0 |
GPU: l4 (23G) | 16 | 0 |
GPU: a5000 (24G) | 64 | 0 |
GPU: rtx6000 (24G) | 16 | 0 |
GPU: a40 (45G) | 16 | 0 |
GPU: l40s (45G) | 56 | 0 |
GPU: a6000 (48G) | 8 | 0 |