apps.crc_idle

Command line application for listing idle Slurm resources.

The application relies on the Slurm sinfo command to identify idle resources and summarizes how many are available on each cluster partition. Resource summaries are provided for both GPU and CPU partitions.
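As a rough illustration of the approach described above, the sketch below builds a node-level query and parses its output. It assumes the command in question is Slurm's sinfo, and the chosen output format (`%N,%C`), the helper names, and the cluster and partition names are all hypothetical rather than taken from the application itself.

```python
def build_sinfo_command(cluster: str, partition: str) -> list[str]:
    """Build an sinfo call listing CPU states per node (format is an assumption)."""
    # -h: no header, -N: one line per node, -M: cluster, -p: partition
    # %N: node name, %C: CPU counts as allocated/idle/other/total
    return ["sinfo", "-h", "-N", "-M", cluster, "-p", partition, "-o", "%N,%C"]


def parse_idle_cpus(sinfo_output: str) -> dict[str, int]:
    """Map each node name to its number of idle CPUs."""
    idle_by_node = {}
    for line in sinfo_output.splitlines():
        line = line.strip()
        # Defensive: skip blank lines and any "CLUSTER: name" banner line
        if not line or line.startswith("CLUSTER:"):
            continue
        node, cpu_states = line.split(",")
        _allocated, idle, _other, _total = cpu_states.split("/")
        idle_by_node[node] = int(idle)
    return idle_by_node


# Hand-written sample output, not captured from a real cluster
sample = "node01,4/12/0/16\nnode02,16/0/0/16\nnode03,0/16/0/16\n"
idle = parse_idle_cpus(sample)
```

In practice the command list would be passed to something like `subprocess.run(..., capture_output=True, text=True)` and the resulting standard output handed to the parser.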

Module Contents

class CrcIdle

Bases: apps.utils.cli.BaseParser

Display idle Slurm resources.

cluster_types
get_cluster_list(args: argparse.Namespace) -> tuple[str]

Return the clusters specified by command line arguments.

If no clusters were specified, return a tuple of all cluster names.

Parameters:

args – Parsed command line arguments.

Returns:

A tuple of cluster names.
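The fallback behavior described above can be sketched as follows. This is a minimal sketch, assuming each cluster is exposed as a boolean attribute on the parsed arguments. The cluster names and the `ALL_CLUSTERS` constant are hypothetical stand-ins for whatever the application actually queries.

```python
import argparse

# Hypothetical cluster names; the real application would look these up
ALL_CLUSTERS = ("smp", "gpu", "mpi")


def get_cluster_list(args: argparse.Namespace) -> tuple[str, ...]:
    """Return the clusters flagged in ``args``, or all clusters if none are."""
    # Assumption: each cluster has a boolean flag named after it
    selected = tuple(name for name in ALL_CLUSTERS if getattr(args, name, False))
    # An empty tuple is falsy, so fall back to every known cluster
    return selected or ALL_CLUSTERS


some = get_cluster_list(argparse.Namespace(smp=False, gpu=True, mpi=False))
none = get_cluster_list(argparse.Namespace(smp=False, gpu=False, mpi=False))
```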

static _count_idle_cpu_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]

Return the idle CPU resources on a given cluster partition.

Parameters:
  • cluster – The cluster to query for idle resources.

  • partition – The partition in the parent cluster.

Returns:

A dictionary mapping the number of idle resources to a dictionary with the number of nodes with that many idle resources, minimum free memory, and maximum free memory on these nodes.
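To make the shape of this return value concrete, the sketch below aggregates per-node records into the nested dictionary described above. The helper name, the input format of `(idle_cores, free_memory)` pairs, and the inner key names (`count`, `min_free_mem`, `max_free_mem`) are assumptions made for illustration.

```python
def summarize_idle_cpus(nodes: list[tuple[int, int]]) -> dict[int, dict[str, int]]:
    """Group nodes by idle core count and track their free memory range.

    ``nodes`` holds hypothetical (idle_cores, free_memory) pairs, one per node.
    """
    summary: dict[int, dict[str, int]] = {}
    for idle, free_mem in nodes:
        # First node seen with this idle count seeds both memory bounds
        entry = summary.setdefault(
            idle, {"count": 0, "min_free_mem": free_mem, "max_free_mem": free_mem}
        )
        entry["count"] += 1
        entry["min_free_mem"] = min(entry["min_free_mem"], free_mem)
        entry["max_free_mem"] = max(entry["max_free_mem"], free_mem)
    return summary


# Made-up node records for illustration
result = summarize_idle_cpus([(12, 64000), (12, 32000), (16, 128000)])
```

Here two nodes share 12 idle cores, so they collapse into one entry whose memory bounds span both nodes.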

static _count_idle_gpu_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]

Return idle GPU resources on a given cluster partition.

If the host node is in a drain state, the GPUs are reported as unavailable.

Parameters:
  • cluster – The cluster to query for idle resources.

  • partition – The partition in the parent cluster.

Returns:

A dictionary mapping the number of idle resources to the number of nodes with that many idle resources.
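The drain-state rule can be illustrated with a small sketch. It follows the prose description of the return value (idle count mapped to node count), and it assumes that "unavailable" means a drained node contributes zero idle GPUs and that node states use Slurm's long names such as "draining" or "drained". The helper name and the input record layout are hypothetical.

```python
from collections import Counter


def count_idle_gpus(nodes: list[tuple[str, int, int]]) -> dict[int, int]:
    """Map idle GPU counts to the number of nodes with that many idle GPUs.

    ``nodes`` holds hypothetical (state, total_gpus, allocated_gpus) records.
    """
    counts: Counter[int] = Counter()
    for state, total, allocated in nodes:
        # Assumption: GPUs on draining or drained nodes count as zero idle
        idle = 0 if "drain" in state.lower() else total - allocated
        counts[idle] += 1
    return dict(counts)


# Made-up node records for illustration
gpu_summary = count_idle_gpus([
    ("idle", 4, 0),
    ("mixed", 4, 3),
    ("draining", 4, 1),
    ("drained", 4, 0),
])
```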

count_idle_resources(cluster: str, partition: str) -> dict[int, dict[str, int]]

Determine the number of idle resources on a given cluster partition.

The returned dictionary maps the number of idle resources (e.g., cores) to the number of nodes in the partition having that many resources idle.

Parameters:
  • cluster – The cluster to query for idle resources.

  • partition – The partition in the parent cluster.

Returns:

A dictionary mapping idle resources to number of nodes.

print_partition_summary(cluster: str, partition: str, idle_resources: dict) -> None

Print a summary of idle resources in a single partition.

Parameters:
  • cluster – The cluster to print a summary for.

  • partition – The partition in the parent cluster.

  • idle_resources – Dictionary mapping idle resources to number of nodes.
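One possible way to render such a summary is sketched below. The output layout is invented for illustration and is not the application's actual format. The sketch assumes `idle_resources` maps an idle resource count directly to a node count, and it splits formatting from printing so the text can be inspected without capturing standard output.

```python
def format_partition_summary(
    cluster: str, partition: str, idle_resources: dict[int, int]
) -> list[str]:
    """Build the summary lines for one partition (layout is hypothetical)."""
    header = f"Cluster: {cluster}, Partition: {partition}"
    lines = [header, "=" * len(header)]
    if not idle_resources:
        lines.append("No idle resources")
        return lines
    # List the nodes with the most idle resources first
    for idle, node_count in sorted(idle_resources.items(), reverse=True):
        lines.append(f"{node_count} nodes with {idle} idle resources")
    return lines


def print_partition_summary(
    cluster: str, partition: str, idle_resources: dict[int, int]
) -> None:
    """Print the formatted summary followed by a blank line."""
    print("\n".join(format_partition_summary(cluster, partition, idle_resources)))
    print()


summary_lines = format_partition_summary("smp", "high-mem", {12: 2, 16: 1})
```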

app_logic(args: argparse.Namespace) -> None

Logic to evaluate when executing the application.

Parameters:

args – Parsed command line arguments.