Compute Pools

A Compute Pool is a group of shared Compute resources used to create Kubernetes clusters where your AI/ML applications run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.

Types of Compute Pools

There are two types of Compute Pools, allowing you to deploy your applications on the infrastructure that best suits your needs.

| Mode | Description | Palette Required | Use Case |
| --- | --- | --- | --- |
| Dedicated | A single cluster used to host a single App Deployment | Yes | - Isolated workloads<br />- Specific hardware needs<br />- Stringent compliance or security requirements |
| Shared | One or multiple clusters that share resources and are used to host multiple App Deployments | Yes | - Resource efficiency for workloads with similar needs<br />- Maximize hardware utilization to reduce costs<br />- Development or staging environments |

Compute Pool Provisioning

You can create shared or dedicated Compute Pools before deploying AI/ML applications. Provisioning Kubernetes clusters in advance reduces the number of steps involved when deploying applications and models by allowing data scientists to select an existing Compute Pool for their workload rather than create a new one.

The following table illustrates the types of Compute Pools you can create in specific workflows.

| Workflow | Dedicated | Shared |
| --- | --- | --- |
| App Deployment | | |
| Compute Pool | | |

When you create a dedicated or shared Compute Pool, PaletteAI provisions the underlying Kubernetes infrastructure:

  1. PaletteAI validates the configuration against available Compute resources.
  2. PaletteAI requests cluster provisioning from Palette using the Profile Bundle's Cluster Profile.
  3. Palette provisions the Kubernetes cluster.
  4. PaletteAI retrieves the cluster's kubeconfig from Palette.
  5. PaletteAI registers the cluster as an OCM spoke with the hub.
  6. PaletteAI installs the required controllers (Flux, OCM work agent) on the spoke.

Once complete, the Compute Pool is ready to receive applications using Application or Fullstack Profile Bundles.
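To confirm that provisioning and registration have finished, you can inspect the pool's overall status. The following command is a minimal sketch that assumes the Compute Pool was created in your project's namespace and uses the status.status field shown later in this page; substitute your own pool and namespace names.

kubectl get computepool <pool-name> --namespace <project-namespace> \
  --output jsonpath='{.status.status}'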

To learn how applications are deployed to Compute Pools, refer to our App Deployments guide. For an in-depth look at the complete deployment flow, from Compute Pools to workload provisioning, refer to our Hub and Spoke Model guide.

Hardware Capacity and Allocation

PaletteAI tracks hardware resources in the Compute Pool CRD's status field. This information helps PaletteAI determine whether a Compute Pool can accept additional workloads and is used for GPU quota enforcement at the Project level.

  • status.hardwareCapacity - Total resources available across all control plane and worker nodes in the Compute Pool's clusters. This includes CPU count, architecture, memory, and GPU family and count.
  • status.hardwareAllocation - Resources currently allocated to App Deployments running on the Compute Pool. Allocation is summed across all applications deployed to the pool.
  • status.status - Overall health of the Compute Pool.

To view the hardware information associated with your Compute Pool, issue the following command.

kubectl get computepool <pool-name> --namespace <project-namespace> --output yaml
Example Compute Pool status
status:
  status: Running
  hardwareCapacity:
    - architecture: AMD64
      totalCPU: 32
      totalMemory: '128Gi'
      gpu:
        - family: 'NVIDIA-A100'
          gpuCount: 8
  hardwareAllocation:
    - architecture: AMD64
      gpu:
        - family: 'NVIDIA-A100'
          gpuCount: 4
  aiWorkloadRefs:
    - name: training-job-1
      namespace: project-a
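As a quick way to compare GPU capacity against current allocation, you can pull both counts with a JSONPath query. This is a sketch that assumes the same status field layout shown in the example above.

kubectl get computepool <pool-name> --namespace <project-namespace> \
  --output jsonpath='GPU capacity: {.status.hardwareCapacity[*].gpu[*].gpuCount}{"\n"}GPU allocated: {.status.hardwareAllocation[*].gpu[*].gpuCount}{"\n"}'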

Resource Groups

Resource groups let you restrict which machines a Compute Pool can use. This is useful when you need workloads to run on specific hardware, such as machines in a particular network zone or with high-performance storage.

To use resource groups, tag your machines in Palette with labels that begin with palette.ai.rg/. For example, palette.ai.rg/network-pool: '1' or palette.ai.rg/storage-tier: 'high-performance'. Then specify the same labels in the controlPlaneResourceGroups or workerResourceGroups fields in your ComputePool resource.

Example ComputePool manifest
spec:
  clusterVariant:
    controlPlaneResourceGroups:
      network-pool: '1'
    workerResourceGroups:
      storage-tier: 'high-performance'

The palette.ai.rg/<key>: '<value>' pair assigned to the machine must match the key-value pair defined in the Compute Pool for the machine to be added to the pool.
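For illustration, the following sketch shows a matching pair based on the convention described above: a tag assigned to the machine in Palette and the corresponding Compute Pool field. Only the <key> portion of the tag, without the palette.ai.rg/ prefix, appears in the ComputePool resource.

# Tag assigned to the machine in Palette
palette.ai.rg/storage-tier: 'high-performance'

# Matching entry in the ComputePool spec (key without the palette.ai.rg/ prefix)
spec:
  clusterVariant:
    workerResourceGroups:
      storage-tier: 'high-performance'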

Resources

Refer to the following articles to learn more about the role Compute Pools play in PaletteAI: