Hub and Spoke Model
PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.
The hub-spoke model solves several challenges for AI/ML applications:
- Centralized management - Platform teams configure Profile Bundles, Tenants, and Projects in one place. Data scientists deploy App Deployments through a single UI or API. All configuration lives on the hub.
- Distributed execution - AI/ML applications run where the hardware is located. Each workload is executed on a spoke cluster. Workloads can have a dedicated cluster or share the cluster with other workloads.
- Resource isolation - Different teams or workloads run on separate spoke clusters, preventing resource contention and providing security boundaries.
- Flexible scaling - Add spoke clusters as your needs grow without changing your management workflow.
Hub Cluster
The hub cluster runs PaletteAI's control plane. This is where you interact with PaletteAI, whether through the UI, kubectl, or GitOps workflows.
The hub cluster is responsible for running the following components:
- PaletteAI controllers
- PaletteAI UI
- Open Cluster Management (OCM) control plane for multi-cluster orchestration
- All PaletteAI CRDs (Tenants, Projects, Settings, Profile Bundles, AI Workloads)
The hub cluster can be deployed on any Kubernetes cluster, including in the same cluster as self-hosted Palette. Each PaletteAI installation consists of one hub cluster only.
Spoke Clusters
Spoke clusters are where AI/ML applications run. In PaletteAI, spoke clusters are also known as Compute Pools. Each Compute Pool you create becomes a spoke when it is registered with the hub cluster. Typically, you do not interact with spoke clusters; you manage your applications through PaletteAI (hub cluster), and the spokes fetch the updated configurations from the hub cluster.
Spoke clusters are responsible for running the following components:
- AI/ML applications and models
- Flux controllers for managing the lifecycles of AI/ML applications
- OCM work agents for spoke-hub communication (klusterlets)
While there is only one hub cluster per PaletteAI installation, there is no limit to the number of spoke clusters.
Deployment Patterns
PaletteAI supports two deployment patterns: hub-as-spoke and dedicated spokes. By default, PaletteAI is installed using the hub-as-spoke model.
The following table summarizes the key differences between the two patterns.
| Consideration | Hub-as-Spoke | Dedicated Spokes |
|---|---|---|
| Setup | Easier (single cluster) | Complex (requires spoke RBAC setup and kubeconfig secrets) |
| Control plane isolation | Workloads share resources with PaletteAI controllers | Full isolation (control plane unaffected by workload behavior) |
| Workload isolation | Control plane failures can affect workloads | Workloads isolated from hub issues |
| Scaling | Limited by single cluster capacity | Add spoke clusters independently |
| Security boundaries | Single security domain | Workloads can run in separate accounts, VPCs, or networks |
| Resources | Workloads compete with PaletteAI for CPU, memory, GPUs | Dedicated resources per spoke |
| Cost | Lower (single cluster) | Higher (additional clusters for spokes) |
For production deployments, we recommend using dedicated spoke clusters. You can start with hub-as-spoke for initial setup and add dedicated spokes later without reinstalling PaletteAI.
For platform-specific instructions on configuring dedicated spokes, refer to the EKS Environment Setup or GKE Spoke Setup guide for your platform.
Hub-as-Spoke
In the hub-as-spoke pattern, a single cluster acts as both the control plane (hub) and a workload cluster (spoke). This is the default installation method and is configured with inCluster: true and forceInternalEndpointLookup: true in the fleetConfig section of your Helm values.
fleetConfig.spokes[i].name must be set to hub-as-spoke in order to install PaletteAI using the hub-as-spoke pattern.
```yaml
fleetConfig:
  spokes:
    - name: hub-as-spoke
      kubeconfig:
        inCluster: true
      klusterlet:
        forceInternalEndpointLookup: true
```
Dedicated Spokes
In the dedicated spokes pattern, the hub cluster runs only the PaletteAI control plane, while separate spoke clusters run your AI/ML applications and models. Each spoke cluster requires a kubeconfig secret on the hub and appropriate RBAC permissions for the FleetConfig controller to bootstrap the Open Cluster Management (OCM) agent. Refer to the Multi-Cluster Orchestration with OCM section for details.
```yaml
fleetConfig:
  spokes:
    - name: spoke-cluster-1
      kubeconfig:
        inCluster: false
        secretReference:
          name: spoke-1-kubeconfig
          kubeconfigKey: kubeconfig
    - name: spoke-cluster-2
      kubeconfig:
        inCluster: false
        secretReference:
          name: spoke-2-kubeconfig
          kubeconfigKey: kubeconfig
```
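Each secretReference points at a Kubernetes Secret on the hub that holds the spoke cluster's kubeconfig. The following is a minimal sketch of such a secret; the namespace shown is a placeholder, not a value prescribed by PaletteAI, so check your installation's values for the namespace the FleetConfig controller actually reads from.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: spoke-1-kubeconfig
  # Placeholder namespace - use the namespace your FleetConfig
  # controller is configured to read kubeconfig secrets from.
  namespace: fleetconfig-system
type: Opaque
stringData:
  # The key must match kubeconfigKey in your Helm values.
  kubeconfig: |
    # paste the spoke cluster's kubeconfig contents here
```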
How Workloads Flow from Hub to Spoke
The following steps illustrate how an AI/ML deployment initiated through PaletteAI (hub cluster) is instantiated on a spoke cluster. For learning purposes, the example deploys an AI application on an existing Compute Pool (spoke cluster) in a shared environment.
- AIWorkload created - You select a Profile Bundle and Compute Pool on the hub.
- WorkloadDeployment generated - PaletteAI combines the Workload Profile (within the Profile Bundle) with an Environment (placement policy).
- Placement resolved - The Environment determines which spokes receive the workload.
- Workload distributed - OCM sends the Workload to target spokes via ManifestWork resources.
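Under the hood, an OCM ManifestWork wraps the resources destined for a spoke. The following is a simplified, hypothetical example; the workload name and embedded manifest are illustrative, and PaletteAI generates these objects for you.

```yaml
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: example-aiworkload
  # ManifestWorks are created in the hub namespace
  # that represents the target spoke cluster.
  namespace: spoke-cluster-1
spec:
  workload:
    manifests:
      # The resources below are applied on the spoke by the OCM agent.
      - apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: example-inference
          namespace: default
        spec:
          replicas: 1
          selector:
            matchLabels:
              app: example-inference
          template:
            metadata:
              labels:
                app: example-inference
            spec:
              containers:
                - name: model-server
                  image: registry.example.com/models/example:latest
```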
Application Lifecycle with Flux
Once workloads reach spoke clusters, Flux handles their lifecycle.
- Resources rendered - The workload controller renders Workload Profiles into Kubernetes manifests and uploads them to an OCI registry.
- App deployed - Flux pulls the manifests from the OCI registry and applies them to the cluster.
- State monitored - Flux continuously monitors the deployed resources and corrects any drift from the desired state.
- Updates performed - When a Profile Bundle or Workload Profile changes, PaletteAI re-renders the manifests. Flux detects the change and updates the deployed resources.
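The steps above can be sketched with standard Flux resources. This is a hedged illustration using Flux's OCIRepository and Kustomization APIs; the object names and registry URL that PaletteAI actually generates will differ.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: example-app
  namespace: flux-system
spec:
  interval: 1m
  # Registry where the rendered manifests were uploaded (illustrative URL).
  url: oci://registry.example.com/rendered/example-app
  ref:
    tag: latest
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: example-app
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: OCIRepository
    name: example-app
  path: ./
  # Prune removes resources deleted from the source, correcting drift.
  prune: true
```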
This GitOps approach ensures workloads stay in sync with their definitions and provides automatic recovery from configuration drift. To learn more about how Flux operates in PaletteAI, refer to our OCI Registries guide.
The status then flows from spoke to hub, allowing you to monitor your workloads from the PaletteAI UI without connecting directly to spoke clusters.
For a high-level look at provisioning infrastructure and deploying workloads, refer to the following guides:
- Compute Pool Provisioning - How clusters are created
- App Deployment - How applications are deployed
Multi-Cluster Orchestration with OCM
PaletteAI uses Open Cluster Management (OCM) for multi-cluster orchestration. OCM provides the machinery for distributing workloads from hub to spokes.
Key OCM Concepts
| Concept | What It Does | PaletteAI Usage |
|---|---|---|
| ManagedCluster | Represents a registered spoke | Created automatically when Compute Pool is provisioned |
| ManifestWork | Contains resources to deploy on spoke | Created by PaletteAI to distribute Workloads |
| Placement | Selects which clusters receive workloads | Configured via Environments |
| Klusterlet | Agent running on spoke | Installed automatically on Compute Pools |
Note that you do not interact with OCM resources directly. PaletteAI manages them based on your App Deployment and Environment configurations.
Environments and Placement
Environments control how workloads are distributed to spoke clusters. Each environment contains a topology policy that determines which spoke clusters receive the workload and how rollouts are performed.
PaletteAI's default topology (topology-ocm) creates OCM Placement and ManifestWorkReplicaSet resources that handle cluster selection and workload distribution.
For most use cases, you specify the Compute Pool during the App Deployment workflow, and PaletteAI handles the Environment configuration automatically. Advanced users can create custom Environments for complex placement scenarios, or when bringing their own clusters outside of Palette.
ManagedClusterSets and ManagedClusterSetBindings
Open Cluster Management organizes clusters into logical groups called ManagedClusterSets. ManagedClusterSets are bound to namespaces using ManagedClusterSetBindings, enabling fine-grained control over which clusters can be selected by Placements in those namespaces.
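As an illustration, a ManagedClusterSetBinding that makes a cluster set selectable from a given namespace might look like the following. The set name (gpu-clusters) and namespace (team-a) are hypothetical.

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta2
kind: ManagedClusterSetBinding
metadata:
  # By convention, the binding is named after the cluster set it binds.
  name: gpu-clusters
  # Placements in this namespace may now select clusters from the set.
  namespace: team-a
spec:
  clusterSet: gpu-clusters
```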
Cluster Selection
The cluster selection process works as follows:
- A namespaced Placement resource is created. It can only select clusters from ManagedClusterSets bound to that namespace.
- If no ManagedClusterSetBinding exists in the Placement's namespace, no clusters can be selected.
- If multiple ManagedClusterSetBindings exist in the Placement's namespace, clusters may be selected from the union of all clusters across all bound cluster sets.
- The Placement resource can target a specific cluster set, or it can select clusters across all available cluster sets using predicates (label selectors, taints and tolerations, and prioritizers).
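A hypothetical Placement that applies these rules might look like the following, selecting one GPU-labeled cluster from a bound cluster set. All names and labels here are illustrative.

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: gpu-placement
  # This namespace must have a ManagedClusterSetBinding for gpu-clusters.
  namespace: team-a
spec:
  clusterSets:
    - gpu-clusters        # restrict selection to this bound set
  numberOfClusters: 1
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchLabels:
            accelerator: nvidia-gpu
```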
ManagedClusterSets in PaletteAI
PaletteAI automatically creates three managed cluster sets and namespaces. The managed cluster sets are automatically bound to these namespaces to enable placements to function out-of-the-box.
| Cluster Set | Bound Namespace | Additional Details |
|---|---|---|
default | managed-cluster-set-default | Includes all ManagedClusters that have not been explicitly assigned a ManagedClusterSet via the cluster.open-cluster-management.io/clusterset label. Note that this label does not provide exclusivity: a managed cluster "assigned" to cluster set A can still be included in cluster set B via a label selector. |
global | managed-cluster-set-global | Always includes all ManagedClusters known to the hub. |
spokes | managed-cluster-set-spokes | Includes all ManagedClusters whose fleetconfig.open-cluster-management.io/managedClusterType label is neither hub nor hub-as-spoke. Useful for deployment of foundational workloads to all spoke clusters. |
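To assign a spoke to a specific cluster set, you label its ManagedCluster resource with the cluster.open-cluster-management.io/clusterset label described above. A sketch follows; the cluster and set names are illustrative.

```yaml
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: spoke-cluster-1
  labels:
    # Moves this cluster out of the "default" set and into "gpu-clusters".
    cluster.open-cluster-management.io/clusterset: gpu-clusters
spec:
  hubAcceptsClient: true
```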
Communication and Security
Hub-spoke communication uses OCM with mutual TLS.
- Hub-to-spoke - Workload definitions sent via ManifestWork resources
- Spoke-to-hub - Status updates sent via OCM agent
Spoke clusters pull workload definitions from the hub cluster. The hub never pushes directly into spoke clusters. This pull-based model works well with firewalls and NAT, as spokes only need outbound connectivity to the hub.
Workload data (model weights, training data, inference requests) never passes through the hub cluster. The hub only manages control plane operations; your data stays on the spoke clusters where workloads run.
Refer to our Security page for more information on how security is handled in PaletteAI.