App Deployments
An App Deployment represents an AI/ML application deployed using a Profile Bundle; the Profile Bundle must contain a Workload Profile with the type Application. An App Deployment is the primary method that data scientists and ML engineers use to deploy their workloads onto Compute Pools.
Appliance Installation
The appliance installation deploys self-hosted Palette and PaletteAI on the same cluster, using bare metal or edge devices. The installation is broken into two parts.
Architecture
PaletteAI abstracts away the complexity of deploying AI and ML application stacks on Kubernetes. Built on proven orchestration technologies, PaletteAI enables data science teams to deploy and manage their own AI and ML application stacks while platform engineering teams maintain control over infrastructure, security, and more.
Compute
A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute can reference a Settings resource in the current namespace; if the reference is omitted, it falls back to the project's configured Settings.
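The Settings fallback behavior can be pictured with a minimal sketch. This is an illustrative manifest only: the API group, kind, and field names (`settingsRef`, `machineTag`) are assumptions for the example, not the actual PaletteAI schema.

```yaml
# Illustrative sketch only; field names are assumptions, not the real schema.
apiVersion: example.paletteai.io/v1alpha1
kind: Compute
metadata:
  name: gpu-machines
  namespace: team-a
spec:
  # Optional reference to a Settings resource in the same namespace.
  # When omitted, the project's configured Settings is used instead.
  settingsRef:
    name: palette-settings
  # Only machines carrying this tag in Palette are discovered.
  machineTag: paletteai
```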
Compute Config
A Compute Config defines default settings for how PaletteAI deploys Kubernetes clusters. It captures infrastructure details, such as networking, SSH keys, and node configurations, so that data scientists and ML engineers can deploy workloads without specifying these settings each time.
Compute Pools
A Compute Pool is a group of shared Compute resources used to create Kubernetes clusters where your AI/ML applications run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.
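As a rough mental model, a Compute Pool groups Compute resources into the machine set that backs one spoke cluster. The manifest below is a hypothetical sketch; the kind and fields are assumptions, not the real PaletteAI schema.

```yaml
# Illustrative sketch only; kind and fields are assumptions.
apiVersion: example.paletteai.io/v1alpha1
kind: ComputePool
metadata:
  name: inference-pool
spec:
  # Compute resources whose machines form this pool's spoke cluster.
  computeRefs:
    - name: gpu-machines
```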
ComputePool Configuration Reference
This page provides technical reference information for configuring ComputePools.
Concepts
This section covers the core concepts for working with PaletteAI. Whether you are a platform engineer setting up infrastructure or a data scientist deploying workloads, these concepts explain how PaletteAI organizes resources and manages AI/ML deployments.
Glossary
This glossary defines key terms and concepts used throughout PaletteAI. Terms inherited from other software, such as Palette and Open Cluster Management (OCM), are marked with their source.
Helm Chart Configuration
This page contains the complete Helm chart configuration reference for the latest version of PaletteAI.
Hub and Spoke Model
PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.
OCI Registries
PaletteAI uses OCI (Open Container Initiative) registries to store and distribute workload artifacts between hub and spoke clusters. When you deploy an application using the App Deployment workflow, PaletteAI renders your Workload Profile into Kubernetes manifests, packages them as OCI artifacts, and stores them in a registry. Flux controllers, which exist on each spoke cluster, then pull these artifacts and apply them.
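On the spoke side, the pull step uses standard Flux resources: an `OCIRepository` that tracks the artifact and a `Kustomization` that applies it. The sketch below uses the real Flux APIs, but the registry URL, names, and namespace are placeholders assumed for illustration, not PaletteAI defaults.

```yaml
# Hypothetical example of a spoke cluster pulling rendered manifests
# from an OCI registry with Flux. Names and the URL are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: my-app-artifacts
  namespace: flux-system
spec:
  interval: 5m
  url: oci://registry.example.com/paletteai/my-app
  ref:
    tag: latest
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: OCIRepository
    name: my-app-artifacts
  path: ./
  prune: true
```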
Outputs
Components and traits are rendered into Kubernetes resources when a Workload is deployed. There are three types of outputs that can be referenced via Workload Profile macros.
Prepare Infrastructure
This guide covers preparing the infrastructure, installing PaletteAI on the nodes, and linking the edge nodes to the leader node.
Roles and Permissions
Tenant Role Permissions
Tenants and Projects
PaletteAI uses Tenants and Projects to organize teams, control access, and manage GPU resources. This two-level hierarchy lets platform engineering teams set organization-wide policies while giving individual data science teams autonomy over their own workspaces.
Troubleshooting Compute Pools
This page provides troubleshooting guidance for common Compute Pool issues.