
17 docs tagged with "paletteai"


App Deployments

An App Deployment represents an AI/ML application deployed using a Profile Bundle; the Profile Bundle must contain a Workload Profile with the type Application. An App Deployment is the primary method that data scientists and ML engineers use to deploy their workloads onto Compute Pools.
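As a sketch of how these pieces relate, an App Deployment ties a workload to a Profile Bundle and a target Compute Pool. The manifest below is illustrative only: the API group, kind, and field names are assumptions for this sketch, not the actual PaletteAI schema.

```yaml
# Illustrative sketch only -- the API group, kind, and field names
# are assumed, not taken from the actual PaletteAI CRD schema.
apiVersion: ai.example.com/v1alpha1
kind: AppDeployment
metadata:
  name: sentiment-api
  namespace: ds-team
spec:
  # The referenced Profile Bundle must contain a Workload Profile
  # with the type Application.
  profileBundleRef:
    name: sentiment-bundle
  # The Compute Pool whose clusters will run the workload.
  computePoolRef:
    name: gpu-pool-a
```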

Appliance Installation

The appliance installation deploys self-hosted Palette and PaletteAI on the same cluster, using bare metal or edge devices. The installation is broken into two parts:

Architecture

PaletteAI abstracts away the complexity of deploying AI and ML application stacks on Kubernetes. Built on proven orchestration technologies, PaletteAI enables data science teams to deploy and manage their own AI and ML application stacks while platform engineering teams maintain control over infrastructure, security, and more.

Compute

A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute can reference a Settings resource in the current namespace or fall back to the project's configured Settings if omitted.
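The Settings fallback described above might look like the following. This is an illustrative sketch only; the API group and field names are assumptions, not the actual PaletteAI schema.

```yaml
# Illustrative sketch only -- API group and field names are assumed.
apiVersion: ai.example.com/v1alpha1
kind: Compute
metadata:
  name: dc-west-inventory
spec:
  # Optional: reference a Settings resource in this namespace.
  # If omitted, the project's configured Settings are used instead.
  settingsRef:
    name: palette-settings
  # Only machines tagged for PaletteAI are discovered and reported.
  machineTag: paletteai
```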

Compute Config

A Compute Config defines default settings for how PaletteAI deploys Kubernetes clusters. It captures infrastructure details, such as networking, SSH keys, and node configurations, so that data scientists and ML engineers can deploy workloads without specifying these settings each time.
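A Compute Config might capture those infrastructure defaults roughly as follows. The manifest is an illustrative sketch; the API group, field names, and values are assumptions, not the actual PaletteAI schema.

```yaml
# Illustrative sketch only -- API group, field names, and values are assumed.
apiVersion: ai.example.com/v1alpha1
kind: ComputeConfig
metadata:
  name: default-cluster-config
spec:
  # Networking defaults applied to every cluster deployed with this config.
  network:
    subnet: 10.10.0.0/16
    gateway: 10.10.0.1
  # SSH keys injected into cluster nodes (placeholder key shown).
  sshKeys:
    - ssh-ed25519 AAAA... ops@example.com
  # Default node layout for new clusters.
  nodes:
    controlPlane:
      count: 3
    worker:
      count: 4
```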

Compute Pools

A Compute Pool is a group of shared Compute resources used to create Kubernetes clusters where your AI/ML applications run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.

Concepts

This section covers the core concepts for working with PaletteAI. Whether you are a platform engineer setting up infrastructure or a data scientist deploying workloads, these concepts explain how PaletteAI organizes resources and manages AI/ML deployments.

Glossary

This glossary defines key terms and concepts used throughout PaletteAI. Concepts inherited from other software, such as Palette and Open Cluster Management (OCM), are indicated, along with their source.

Helm Chart Configuration

This page contains the complete Helm chart configuration reference for the latest version of PaletteAI.

Hub and Spoke Model

PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.

OCI Registries

PaletteAI uses OCI (Open Container Initiative) registries to store and distribute workload artifacts between hub and spoke clusters. When you deploy an application using the App Deployment workflow, PaletteAI renders your Workload Profile into Kubernetes manifests, packages them as OCI artifacts, and stores them in a registry. Flux controllers running on each spoke cluster then pull these artifacts and apply them.
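On the spoke side, the pull step uses standard Flux source objects. The manifest below is a generic Flux `OCIRepository` with placeholder names and a placeholder registry URL; PaletteAI creates and manages the equivalent objects itself, so the exact names, URL, and interval here are assumptions for illustration.

```yaml
# Generic Flux source object; names, URL, and interval are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: workload-artifacts        # placeholder name
  namespace: flux-system
spec:
  interval: 5m                    # how often Flux checks for new artifacts
  url: oci://registry.example.com/paletteai/workloads  # placeholder registry
  ref:
    tag: latest
```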

Outputs

Components and traits are rendered into Kubernetes resources when a Workload is deployed. There are three types of outputs that can be referenced via Workload Profile macros:

Prepare Infrastructure

This guide covers preparing the infrastructure, installing PaletteAI on the nodes, and linking the edge nodes to the leader node.

Tenants and Projects

PaletteAI uses Tenants and Projects to organize teams, control access, and manage GPU resources. This two-level hierarchy lets platform engineering teams set organization-wide policies while giving individual data science teams autonomy over their own workspaces.