apps.crc_idle
Command line application for listing idle Slurm resources.
The application relies on Slurm's sinfo command to identify idle resources and summarize how many are available on each cluster partition. Summaries are provided for both GPU and CPU partitions.
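As a rough illustration of the approach described above, the sketch below builds a node-level query and parses its output. It assumes the command in question is Slurm's sinfo. The helper names `build_sinfo_command` and `parse_idle_cores` and the comma-separated output format are hypothetical choices made for this example, not the application's actual implementation.

```python
def build_sinfo_command(cluster: str, partition: str) -> list[str]:
    """Build a node-oriented sinfo query (hypothetical helper)."""
    # -M selects the cluster, -p the partition, -N prints one line per node.
    # %N = node name, %C = CPUs as allocated/idle/other/total, %e = free memory.
    return ["sinfo", "-M", cluster, "-p", partition,
            "-N", "--noheader", "-o", "%N,%C,%e"]


def parse_idle_cores(output: str) -> dict[str, int]:
    """Map each node name to its number of idle cores (hypothetical helper)."""
    idle = {}
    for line in output.splitlines():
        fields = line.strip().split(",")
        if len(fields) != 3:
            # Skip blank lines and any banner line such as "CLUSTER: <name>".
            continue
        node, cpu_states, _free_mem = fields
        _allocated, idle_cores, _other, _total = cpu_states.split("/")
        idle[node] = int(idle_cores)
    return idle


# Sample text standing in for real command output.
sample = "CLUSTER: smp\nnode01,12/4/0/16,64000\nnode02,0/16/0/16,128000\n"
print(parse_idle_cores(sample))
```

Parsing is kept separate from command construction so the parser can be exercised on sample text without access to a Slurm cluster.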
Module Contents
- class CrcIdle
Bases: apps.utils.cli.BaseParser
Display idle Slurm resources.
- cluster_types
- get_cluster_list(args: argparse.Namespace) -> tuple[str]
Return the clusters specified by command line arguments.
If no clusters were specified, return a tuple of all cluster names.
- Parameters:
args – Parsed command line arguments
- Returns:
A tuple of cluster names
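The fallback behavior described above can be sketched as follows. This is a minimal standalone version that assumes each cluster is exposed as a boolean attribute on the parsed arguments. The names in `CLUSTER_TYPES` are hypothetical placeholders for whatever `cluster_types` actually contains.

```python
import argparse

# Hypothetical cluster names; the real values come from CrcIdle.cluster_types.
CLUSTER_TYPES = ("smp", "gpu", "mpi", "htc")


def get_cluster_list(args: argparse.Namespace) -> tuple[str, ...]:
    """Return the requested clusters, or every cluster if none were requested."""
    selected = tuple(name for name in CLUSTER_TYPES if getattr(args, name, False))
    # An empty selection is falsy, so fall back to the full tuple.
    return selected or CLUSTER_TYPES


print(get_cluster_list(argparse.Namespace(smp=True, gpu=False)))
print(get_cluster_list(argparse.Namespace()))
```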
- static _count_idle_cpu_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]
Return the idle CPU resources on a given cluster partition.
- Parameters:
cluster – The name of the cluster to query.
partition – The partition in the parent cluster.
- Returns:
A dictionary mapping each idle core count to a dictionary holding the number of nodes with that many idle cores and the minimum and maximum free memory on those nodes.
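To make the shape of the returned dictionary concrete, here is a small sketch that builds one from already-parsed node records. The function name `summarize_idle_cpus` and the inner keys `count`, `min_free_mem`, and `max_free_mem` are assumptions made for illustration; the documentation does not state the actual key names.

```python
def summarize_idle_cpus(nodes: list[tuple[int, int]]) -> dict[int, dict[str, int]]:
    """Group (idle_cores, free_mem) node records by idle core count."""
    summary: dict[int, dict[str, int]] = {}
    for idle_cores, free_mem in nodes:
        # Start a new entry the first time this idle core count is seen.
        entry = summary.setdefault(
            idle_cores,
            {"count": 0, "min_free_mem": free_mem, "max_free_mem": free_mem},
        )
        entry["count"] += 1
        entry["min_free_mem"] = min(entry["min_free_mem"], free_mem)
        entry["max_free_mem"] = max(entry["max_free_mem"], free_mem)
    return summary


# Two nodes with 4 idle cores each, and one node with 8 idle cores.
print(summarize_idle_cpus([(4, 1000), (4, 3000), (8, 500)]))
```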
- static _count_idle_gpu_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]
Return idle GPU resources on a given cluster partition.
If the host node is in a drain state, the GPUs are reported as unavailable.
- Parameters:
cluster – The name of the cluster to query.
partition – The partition in the parent cluster.
- Returns:
A dictionary mapping the number of idle resources to the number of nodes with that many idle resources.
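The drain-state rule can be illustrated with a short sketch. It follows the return description above (idle GPU count mapped to node count) and assumes that "unavailable" means a drained node contributes zero idle GPUs. The function name `count_idle_gpus`, the input record layout, and the substring check on the state are all hypothetical.

```python
from collections import Counter


def count_idle_gpus(nodes: list[tuple[int, int, str]]) -> dict[int, int]:
    """Tally nodes by idle GPU count from (total, allocated, state) records."""
    counts: Counter[int] = Counter()
    for total, allocated, state in nodes:
        # Assumption: a draining or drained host accepts no new jobs,
        # so its GPUs are counted as zero idle regardless of allocation.
        idle = 0 if "drain" in state.lower() else total - allocated
        counts[idle] += 1
    return dict(counts)


print(count_idle_gpus([(4, 1, "mixed"), (4, 0, "idle"), (4, 0, "drained")]))
```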
- count_idle_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]
Determine the number of idle resources on a given cluster partition.
The returned dictionary maps the number of idle resources (e.g., cores) to the number of nodes in the partition having that many resources idle.
- Parameters:
cluster – The name of the cluster to query.
partition – The partition in the parent cluster.
- Returns:
A dictionary mapping idle resources to number of nodes.
- print_partition_summary(cluster: str, partition: str, idle_resources: dict) -> None
Print a summary of idle resources in a single partition.
- Parameters:
cluster – The cluster to print a summary for.
partition – The partition in the parent cluster.
idle_resources – Dictionary mapping idle resources to number of nodes.
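One way such a summary could be rendered is sketched below. The output layout, the unit labels, and the helper name `format_partition_summary` are invented for this example and may not match what the application actually prints. The sketch takes `idle_resources` as a simple mapping of idle count to node count, as described in the parameter list.

```python
def format_partition_summary(
    cluster: str, partition: str, idle_resources: dict[int, int]
) -> list[str]:
    """Return the summary as a list of lines (hypothetical layout)."""
    # Assumption: GPU clusters report GPUs and all other clusters report cores.
    unit = "GPUs" if cluster == "gpu" else "cores"
    lines = [f"Cluster: {cluster}, Partition: {partition}"]
    if not idle_resources:
        lines.append("  No idle resources")
    # Sort by idle count so the output is stable from run to run.
    for idle, nodes in sorted(idle_resources.items()):
        lines.append(f"  {nodes} nodes w/ {idle} idle {unit}")
    return lines


print("\n".join(format_partition_summary("smp", "high-mem", {8: 1, 4: 2})))
```

Returning lines instead of printing directly keeps the formatting testable; a caller can print the joined result.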
- app_logic(args: argparse.Namespace) -> None
Logic to evaluate when executing the application.
- Parameters:
args – Parsed command line arguments.