Compute
A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute resource can reference a Settings resource in the current namespace. If the reference is omitted, it falls back to the project's configured Settings.
When you create an App Deployment, PaletteAI checks the Compute resource to determine whether sufficient machines are available to fulfill the requested resources (GPUs, CPUs, memory) before provisioning a cluster.
To make your physical or virtual machines available as Palette-compatible edge nodes, prepare them using either the EdgeForge Workflow (Appliance Mode) or Agent Mode, and then register the nodes with Palette.
Compute Status
The Compute resource reports discovered machines in two categories:
- Control plane candidates - Machines eligible to run Kubernetes control plane components (typically CPU-only machines).
- Worker candidates - Machines eligible to run AI/ML applications (typically GPU-equipped machines).
The Compute controller reconciles automatically every 30 seconds and updates its status with the available compute resources. Use the following command to check the available machines in a Project.
kubectl get compute <compute-name> --namespace <namespace> --output yaml
Machines are grouped by hardware profile. The example status below reports the following available machines:
- 2 control plane candidates with 4 CPUs each
- 3 worker candidates with 8 NVIDIA H100 GPUs each
- 5 worker candidates with 8 NVIDIA A100 GPUs each, assigned to specific resource groups
status:
  availableControlPlaneCompute:
    - architecture: AMD64
      available: true
      cpuCount: 4
      instances: 2
      machines:
        - edge-5db0384219cfa0fa4ef97d53bf291b2e
        - edge-228638428bf0078309b65730b24101ee
  availableWorkerCompute:
    - architecture: AMD64
      available: true
      family: NVIDIA H100
      gpuCount: 8
      instances: 3
      machines:
        - edge-e078384256765be6e92fc1118aa9f283
        - edge-71bb3842c85b1731c37665c3d2ed0d10
        - edge-d9673842ab48f990d65c11f504f13183
    - architecture: AMD64
      available: true
      family: NVIDIA A100
      gpuCount: 8
      instances: 5
      resourceGroups:
        network-pool: '3'
        storage-tier: 'high-performance'
      machines:
        - edge-9baf38425dacad857c70ccdbabb48028
        - edge-bde03842fbb6a9cf2498b150c9799c17
        - edge-0d493842666418fea457d60f9c5c243b
        - edge-145938425bcde72556d6bcecafd938f1
        - edge-c4593842ff28a470533ac7b2253397b2
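To make the grouping concrete, the following Python sketch totals the capacity reported in a status like the one above. It is an illustration only, not part of PaletteAI: the function names are hypothetical, the dictionary is a trimmed copy of the example status (for instance, as it might look after loading the command output as JSON), and skipping groups whose `available` field is false is an assumption about that field's meaning.

```python
from collections import defaultdict

# Trimmed copy of the example status; machine name lists are omitted.
status = {
    "availableControlPlaneCompute": [
        {"architecture": "AMD64", "available": True, "cpuCount": 4, "instances": 2},
    ],
    "availableWorkerCompute": [
        {"architecture": "AMD64", "available": True, "family": "NVIDIA H100",
         "gpuCount": 8, "instances": 3},
        {"architecture": "AMD64", "available": True, "family": "NVIDIA A100",
         "gpuCount": 8, "instances": 5},
    ],
}


def total_gpus_by_family(status):
    """Sum GPUs per family: per-machine gpuCount times the number of instances."""
    totals = defaultdict(int)
    for group in status.get("availableWorkerCompute", []):
        # Assumption: groups marked unavailable should not count toward capacity.
        if not group.get("available", False):
            continue
        family = group.get("family", "unknown")
        totals[family] += group.get("gpuCount", 0) * group.get("instances", 0)
    return dict(totals)


def control_plane_candidate_count(status):
    """Count machines listed as available control plane candidates."""
    return sum(
        group.get("instances", 0)
        for group in status.get("availableControlPlaneCompute", [])
        if group.get("available", False)
    )


print(total_gpus_by_family(status))           # {'NVIDIA H100': 24, 'NVIDIA A100': 40}
print(control_plane_candidate_count(status))  # 2
```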
Machine Discovery
For PaletteAI to discover a machine, the machine must be registered in Palette and carry the required palette.ai: true tag.
To tag an edge node, either add the tag under the stylus.site.tags parameter in the node's user-data file, or add it in Palette after the node is registered by using the Edge Host Grid View.
If the Palette agent cannot auto-detect a machine's hardware specifications, you can provide them manually with the following tags. PaletteAI falls back to these tag values when auto-detection fails.
| Tag | Description | Example |
|---|---|---|
| gpus: <count> | Number of GPUs | gpus: 8 |
| cpus: <count> | Number of CPUs | cpus: 6 |
| gpu-memory: <size> | GPU memory (M, MB, MiB, G, GB, GiB) | gpu-memory: 80G |
| gpu-family: <family> | GPU model family | gpu-family: nvidia-a100 |
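As a rough aid for checking tag values before applying them, the Python sketch below validates the fallback tags listed in the table. It is not PaletteAI's parser: `parse_hardware_tags` is a hypothetical helper, tag values are assumed to arrive as strings, and the size is assumed to be a whole number followed directly by one of the listed suffixes. How PaletteAI converts between the suffixes is not stated here, so the sketch keeps the number and the unit separate instead of converting.

```python
import re

# Whole number followed by one of the unit suffixes listed in the table.
_GPU_MEMORY = re.compile(r"^(\d+)(M|MB|MiB|G|GB|GiB)$")


def parse_hardware_tags(tags):
    """Extract manually provided hardware specs from a tag mapping."""
    specs = {}
    for key in ("gpus", "cpus"):
        if key in tags:
            specs[key] = int(tags[key])  # raises ValueError on non-numeric input
    if "gpu-memory" in tags:
        match = _GPU_MEMORY.match(tags["gpu-memory"].strip())
        if match is None:
            raise ValueError(f"unsupported gpu-memory value: {tags['gpu-memory']!r}")
        # Keep (amount, unit) as written; no unit conversion is attempted.
        specs["gpu-memory"] = (int(match.group(1)), match.group(2))
    if "gpu-family" in tags:
        specs["gpu-family"] = tags["gpu-family"]
    return specs


example_tags = {
    "palette.ai": "true",
    "gpus": "8",
    "cpus": "6",
    "gpu-memory": "80G",
    "gpu-family": "nvidia-a100",
}
print(parse_hardware_tags(example_tags))
```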
Role Eligibility Tags
By default, PaletteAI uses simple rules to determine machine eligibility:
- Machines with GPUs - Worker candidates only
- Machines without GPUs - Control plane candidates only
You can override these defaults by adding the following tags.
| Tag | Effect |
|---|---|
| palette.ai/control-plane: true | Allows a GPU machine to serve as a control plane node |
| palette.ai/worker: true | Allows a non-GPU machine to serve as a worker node |
Do not apply both tags to the same machine. If you do, PaletteAI treats it as a worker only.
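The rules in this section can be read as a small decision function. The Python sketch below is one interpretation, not PaletteAI's implementation: `candidate_roles` is a hypothetical name, tag values are assumed to be the string "true", and an override tag is assumed to add a role to the machine's default role instead of replacing it, which the text above does not settle.

```python
CONTROL_PLANE_TAG = "palette.ai/control-plane"
WORKER_TAG = "palette.ai/worker"


def candidate_roles(gpu_count, tags):
    """Return the set of roles a machine is eligible for under the rules above."""
    allow_control_plane = tags.get(CONTROL_PLANE_TAG) == "true"
    allow_worker = tags.get(WORKER_TAG) == "true"

    # Both tags on one machine: treated as a worker only.
    if allow_control_plane and allow_worker:
        return {"worker"}

    # Defaults: GPU machines are workers, non-GPU machines are control plane.
    roles = {"worker"} if gpu_count > 0 else {"control-plane"}

    # Assumption: an override tag adds a role on top of the default.
    if allow_control_plane:
        roles.add("control-plane")
    if allow_worker:
        roles.add("worker")
    return roles


print(candidate_roles(8, {}))                    # {'worker'}
print(candidate_roles(0, {}))                    # {'control-plane'}
print(candidate_roles(0, {WORKER_TAG: "true"}))  # both roles, under the stated assumption
```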
Resource Groups
Resource groups let you organize machines for targeted workload placement. They appear in the Compute status under resourceGroups, and Compute Pools can use them to select specific subsets of machines. Refer to the Compute Pool guide for more information.
GPU Optimization for Minimum Worker Requirements
When a Compute Config specifies minWorkerNodes, PaletteAI may need to provision more nodes than the GPU request requires. To avoid wasting GPU resources on filler nodes, PaletteAI uses the following selection order:
- Allocate GPU nodes to satisfy the GPU requirement.
- Fill remaining slots with machines tagged palette.ai/worker: true (non-GPU workers).
- If no non-GPU workers are available, select GPU machines with the lowest GPU count.
For example, suppose you request 8 GPUs with minWorkerNodes: 3. One 8-GPU machine satisfies the GPU requirement. For the remaining two nodes, PaletteAI prefers machines tagged palette.ai/worker: true to avoid allocating additional GPUs unnecessarily. The resulting selection looks like this:
| Node | GPUs | Tags | Role |
|---|---|---|---|
| gpu-node-1 | 8 | N/A | GPU workload |
| cpu-node-1 | 0 | palette.ai/worker: true | General worker |
| cpu-node-2 | 0 | palette.ai/worker: true | General worker |
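The selection order can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not PaletteAI's scheduler: the `Machine` type and `select_workers` function are hypothetical, GPU machines are taken largest-first to meet the GPU request (the text does not say how that step picks machines), and low-GPU machines are also used when tagged non-GPU workers exist but are too few to fill every slot.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    gpus: int
    tags: dict = field(default_factory=dict)


def select_workers(machines, gpus_requested, min_worker_nodes):
    """Pick worker machines following the three-step selection order above."""
    selected = []
    remaining_gpus = gpus_requested

    # Step 1: satisfy the GPU requirement (assumption: largest machines first).
    gpu_machines = sorted((m for m in machines if m.gpus > 0),
                          key=lambda m: m.gpus, reverse=True)
    for machine in gpu_machines:
        if remaining_gpus <= 0:
            break
        selected.append(machine)
        remaining_gpus -= machine.gpus
    if remaining_gpus > 0:
        raise ValueError("not enough GPUs available for the request")

    # Step 2: prefer non-GPU machines tagged palette.ai/worker: true as fillers.
    fillers = [m for m in machines
               if m.gpus == 0 and m.tags.get("palette.ai/worker") == "true"]

    # Step 3: then unused GPU machines, lowest GPU count first.
    chosen = {m.name for m in selected}
    fillers += sorted((m for m in gpu_machines if m.name not in chosen),
                      key=lambda m: m.gpus)

    for machine in fillers:
        if len(selected) >= min_worker_nodes:
            break
        selected.append(machine)
    if len(selected) < min_worker_nodes:
        raise ValueError("not enough machines to satisfy minWorkerNodes")
    return selected


inventory = [
    Machine("gpu-node-1", 8),
    Machine("gpu-node-2", 8),
    Machine("cpu-node-1", 0, {"palette.ai/worker": "true"}),
    Machine("cpu-node-2", 0, {"palette.ai/worker": "true"}),
]
print([m.name for m in select_workers(inventory, gpus_requested=8, min_worker_nodes=3)])
# ['gpu-node-1', 'cpu-node-1', 'cpu-node-2']
```

With this inventory the sketch reproduces the table above: the second 8-GPU machine is left free because two tagged non-GPU workers are available to fill the remaining slots.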
Resources
Refer to the following articles to learn more about the role Compute plays in PaletteAI:
- Settings - Provide Palette credentials for machine discovery
- Compute Config - Define cluster deployment defaults
- Compute Pool - Group discovered machines into logical cluster pools for App Deployments