Compute

A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute resource can reference a Settings resource in the current namespace. If the reference is omitted, it falls back to the project's configured Settings.

When you create an App Deployment, PaletteAI checks the Compute resource to determine whether sufficient machines are available to fulfill the requested resources (GPUs, CPUs, memory) before provisioning a cluster.

tip

To prepare your physical or virtual machines as Palette-compatible edge nodes, use either the EdgeForge Workflow (Appliance Mode) or Agent Mode, then register the nodes with Palette.

Compute Status

The Compute resource reports discovered machines in two categories:

Control plane candidates - Machines eligible to run Kubernetes control plane components (typically CPU-only machines).
Worker candidates - Machines eligible to run AI/ML applications (typically GPU-equipped machines).

The Compute controller reconciles automatically every 30 seconds and updates its status with the available compute resources. Use the following command to check the available machines in a Project.

kubectl get compute <compute-name> --namespace <namespace> --output yaml

Machines are grouped by their hardware profiles. The example below reports the following available machines:

2 control plane candidates with 4 CPUs each
3 worker candidates with 8 NVIDIA H100 GPUs each
5 worker candidates with 8 NVIDIA A100 GPUs each, assigned to specific resource groups

Example Compute status
status:
  availableControlPlaneCompute:
    - architecture: AMD64
      available: true
      cpuCount: 4
      instances: 2
      machines:
        - edge-5db0384219cfa0fa4ef97d53bf291b2e
        - edge-228638428bf0078309b65730b24101ee
  availableWorkerCompute:
    - architecture: AMD64
      available: true
      family: NVIDIA H100
      gpuCount: 8
      instances: 3
      machines:
        - edge-e078384256765be6e92fc1118aa9f283
        - edge-71bb3842c85b1731c37665c3d2ed0d10
        - edge-d9673842ab48f990d65c11f504f13183
    - architecture: AMD64
      available: true
      family: NVIDIA A100
      gpuCount: 8
      instances: 5
      resourceGroups:
        network-pool: '3'
        storage-tier: 'high-performance'
      machines:
        - edge-9baf38425dacad857c70ccdbabb48028
        - edge-bde03842fbb6a9cf2498b150c9799c17
        - edge-0d493842666418fea457d60f9c5c243b
        - edge-145938425bcde72556d6bcecafd938f1
        - edge-c4593842ff28a470533ac7b2253397b2
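
To get a quick capacity figure from this status, you can total the GPUs per family. The following Python sketch is illustrative only: it assumes you have already loaded the `status` mapping into a dictionary (for example, from the same command with `--output json`), and it relies only on the field names shown in the example above. The function name `summarize_worker_gpus` is hypothetical and not part of PaletteAI.

```python
def summarize_worker_gpus(status: dict) -> dict:
    """Total GPUs per GPU family across available worker profiles.

    Assumes gpuCount is the per-machine GPU count and instances is the
    number of machines in the profile, as in the example status.
    """
    totals: dict = {}
    for profile in status.get("availableWorkerCompute", []):
        if not profile.get("available", False):
            continue  # skip profiles not reported as available
        family = profile.get("family", "unknown")
        gpus = profile.get("instances", 0) * profile.get("gpuCount", 0)
        totals[family] = totals.get(family, 0) + gpus
    return totals


# Trimmed copy of the example status (machine lists omitted).
example_status = {
    "availableWorkerCompute": [
        {"available": True, "family": "NVIDIA H100", "gpuCount": 8, "instances": 3},
        {"available": True, "family": "NVIDIA A100", "gpuCount": 8, "instances": 5},
    ]
}

print(summarize_worker_gpus(example_status))
# {'NVIDIA H100': 24, 'NVIDIA A100': 40}
```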

Machine Discovery

info

To tag your edge nodes, add labels below the stylus.site.tags parameter in your edge node's user-data file or tag edge nodes once they are registered with Palette using Edge Host Grid View.

For PaletteAI to discover a machine, it must be registered in Palette with the required palette.ai: true tag.

If the Palette agent cannot auto-detect a machine's hardware specifications, you can provide them manually with the following tags. PaletteAI falls back to the tag values when auto-detected values are unavailable.

Tag	Description	Example
`gpus: <count>`	Number of GPUs	`gpus: 8`
`cpus: <count>`	Number of CPUs	`cpus: 6`
`gpu-memory: <size>`	GPU memory (M, MB, MiB, G, GB, GiB)	`gpu-memory: 80G`
`gpu-family: <family>`	GPU model family	`gpu-family: nvidia-a100`

Role Eligibility Tags

By default, PaletteAI uses simple rules to determine machine eligibility:

Machines with GPUs - Worker candidates only
Machines without GPUs - Control plane candidates only

You can override these defaults by adding the following tags.

Tag	Effect
`palette.ai/control-plane: true`	Allows a GPU machine to serve as a control plane node
`palette.ai/worker: true`	Allows a non-GPU machine to serve as a worker node

warning

Do not apply both tags to the same machine. If you do, PaletteAI treats it as a worker only.
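
The eligibility rules above can be sketched as a small Python function. This is a reading of the documented behavior, not PaletteAI's implementation. It assumes each machine ends up with exactly one role, that tag values are stored as the string `true`, and that an override tag replaces the default role rather than adding to it. The function name `candidate_role` is hypothetical.

```python
def candidate_role(gpu_count: int, tags: dict) -> str:
    """Return 'worker' or 'control-plane' for a discovered machine.

    Assumption: roles are mutually exclusive and an override tag
    replaces the GPU-based default.
    """
    wants_worker = tags.get("palette.ai/worker") == "true"
    wants_control_plane = tags.get("palette.ai/control-plane") == "true"

    if wants_worker:
        # Also covers the case where both tags are set: worker only.
        return "worker"
    if wants_control_plane:
        return "control-plane"
    # Default rule: GPU machines are workers, all others are control plane.
    return "worker" if gpu_count > 0 else "control-plane"


print(candidate_role(8, {}))                                    # worker
print(candidate_role(0, {}))                                    # control-plane
print(candidate_role(8, {"palette.ai/control-plane": "true"}))  # control-plane
print(candidate_role(0, {"palette.ai/worker": "true"}))         # worker
```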

Resource Groups

Resource groups let you organize machines for targeted workload placement. Resource groups appear in the Compute status and can be used by Compute Pools to select specific subsets of machines. Refer to our Compute Pool guide for more information.

GPU Optimization for Minimum Worker Requirements

When a Compute Config specifies minWorkerNodes, PaletteAI may need to provision more nodes than the GPU request requires. To avoid wasting GPU resources on filler nodes, PaletteAI uses the following selection order:

Allocate GPU nodes to satisfy the GPU requirement.
Fill remaining slots with machines tagged palette.ai/worker: true (non-GPU workers).
If no non-GPU workers are available, select GPU machines with the lowest GPU count.

For example, suppose you request 8 GPUs with minWorkerNodes: 3. One 8-GPU machine satisfies the GPU requirement. For the remaining two nodes, PaletteAI prefers machines tagged palette.ai/worker: true to avoid allocating additional GPUs unnecessarily.

Node	GPUs	Tags	Role
`gpu-node-1`	8	N/A	GPU workload
`cpu-node-1`	0	`palette.ai/worker: true`	General worker
`cpu-node-2`	0	`palette.ai/worker: true`	General worker
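
The selection order can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not PaletteAI's scheduler: the `Machine` class and `select_workers` function are hypothetical, and the first step uses a largest-first greedy pick because the documentation does not say how GPU nodes are chosen to meet the GPU requirement. The sketch also falls back to the smallest GPU machines once tagged non-GPU workers run out.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    gpus: int
    tags: dict = field(default_factory=dict)


def select_workers(machines, gpus_requested, min_worker_nodes):
    """Pick worker node names following the three-step order above."""
    gpu_machines = sorted(
        (m for m in machines if m.gpus > 0), key=lambda m: m.gpus, reverse=True
    )

    # Step 1: allocate GPU nodes until the GPU request is met
    # (largest-first is an assumption of this sketch).
    selected, allocated = [], 0
    for m in gpu_machines:
        if allocated >= gpus_requested:
            break
        selected.append(m)
        allocated += m.gpus
    if allocated < gpus_requested:
        raise ValueError("not enough GPUs available")

    # Step 2: prefer non-GPU machines tagged as workers for remaining slots.
    cpu_workers = [
        m for m in machines
        if m.gpus == 0 and m.tags.get("palette.ai/worker") == "true"
    ]
    # Step 3: otherwise use unselected GPU machines, lowest GPU count first.
    spare_gpu = sorted(
        (m for m in gpu_machines if m not in selected), key=lambda m: m.gpus
    )

    for m in cpu_workers + spare_gpu:
        if len(selected) >= min_worker_nodes:
            break
        selected.append(m)
    return [m.name for m in selected]


inventory = [
    Machine("gpu-node-1", 8),
    Machine("gpu-node-2", 4),
    Machine("cpu-node-1", 0, {"palette.ai/worker": "true"}),
    Machine("cpu-node-2", 0, {"palette.ai/worker": "true"}),
]
print(select_workers(inventory, gpus_requested=8, min_worker_nodes=3))
# ['gpu-node-1', 'cpu-node-1', 'cpu-node-2']
```

With this inventory, `gpu-node-2` stays free for other workloads because the two tagged CPU machines fill the remaining slots.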

Resources

Refer to the following articles to learn more about the role Compute plays in PaletteAI:

Settings - Provide Palette credentials for machine discovery
Compute Config - Define cluster deployment defaults
Compute Pool - Group discovered machines into logical cluster pools for App Deployments