48 docs tagged with "paletteai"

Air-Gapped Deployment Guide

This guide walks through an end-to-end deployment of PaletteAI VerteX in an air-gapped environment

App Deployments

An App Deployment represents an AI/ML application deployed using a Profile Bundle; the Profile Bundle must contain a Workload Profile with the type Application. An App Deployment is the primary method that data scientists and ML engineers use to deploy their workloads onto Compute Pools.

Appliance Installation

The appliance install will deploy self-hosted Palette and PaletteAI on the same cluster using bare metal or edge devices. The installation is broken into two parts:

Architecture

PaletteAI abstracts away the complexity of deploying AI and ML application stacks on Kubernetes. Built on proven orchestration technologies, PaletteAI enables data science teams to deploy and manage their own AI and ML application stacks while platform engineering teams maintain control over infrastructure, security, and more.

Compute

A Compute resource is a live inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. The controller refreshes the inventory automatically.

Compute Configs

A Compute Config is a reusable blueprint of cluster settings. Platform administrators capture infrastructure details such as networking, Secure Shell (SSH) keys, and node configuration in a Compute Config once. Data scientists and ML engineers then deploy against those defaults without repeating boilerplate.

Compute Pools

A Compute Pool is a group of shared Compute resources that PaletteAI turns into one or more Kubernetes clusters where your AI/ML applications and models run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.

Compute Reference

This page provides technical reference information for the Compute resource: the discovery tags PaletteAI reads from Palette Edge hosts, the spec fields you can set, and the status fields the controller reports. For concepts, refer to Compute. For the raw resource spec, refer to the Compute CRD documentation.

Compute Resources

Compute resources are how PaletteAI turns your physical machines into Kubernetes clusters that run AI/ML applications and models.

ComputeConfig Configuration Reference

This page provides technical reference information for configuring ComputeConfig resources. For concepts, refer to Compute Configs. For step-by-step instructions, refer to Create and Manage Compute Configs. For the raw resource spec, refer to the ComputeConfig CRD documentation.

ComputePool Configuration Reference

This page provides technical reference information for configuring ComputePools.

Configure Integrations

A Settings resource holds the integrations PaletteAI uses to provision infrastructure and govern artificial intelligence and machine learning (AI/ML) workloads. This page shows how to configure each integration type and how to govern model availability through Project Model Settings.

Configure Settings

Settings define integrations and configuration values used by Projects and Compute Pools. For integration types, prerequisites, and examples, refer to Settings and Integrations.

Featured Profile Bundles

Featured Profile Bundles are curated, ready-to-use Profile Bundles from PaletteAI Studio. Use these reference pages for stack-specific guidance beyond the general Create and Manage Profile Bundles and Import Profile Bundles tasks.

Glossary

This glossary defines key terms and concepts used throughout PaletteAI. Entries indicate concepts inherited from other software, such as Palette and Open Cluster Management (OCM), and name their source.

Helm Chart Configuration

This page contains the complete Helm chart configuration reference for the latest version of PaletteAI.

Hub and Spoke Model

PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.

Known Issues

Review all known issues in PaletteAI and learn more about their status.

Multi-Instance GPU

Multi-Instance GPU (MIG) is an NVIDIA feature that partitions a supported GPU into as many as seven isolated GPU instances. Each instance has dedicated compute, memory, and memory bandwidth, so several workloads can share one physical GPU without competing for the same resources. MIG helps you increase GPU utilization when individual workloads do not need a full GPU.

Multi-Tenancy

PaletteAI is multi-tenant. It organizes teams, controls access, and manages GPU resources through a hierarchy of scopes, so platform engineering teams can set organization-wide policy while data science teams keep autonomy over their own workspaces.

NVIDIA Run:ai

NVIDIA Run:ai is available as a ready-to-use Profile Bundle, so platform teams can deploy GPU scheduling on PaletteAI Compute Pools without assembling operators and charts by hand.

OCI Registries

PaletteAI uses OCI (Open Container Initiative) registries to store and distribute workload artifacts between hub and spoke clusters. When you create an App Deployment or Model Deployment, the spoke cluster's Workload controller renders your Workload Profile into Kubernetes manifests, packages them as OCI artifacts, and stores them in the spoke's OCI registry. Flux controllers on the spoke then pull these artifacts and apply them.

PaletteAI 1.1.0 Release Notes

Summary

PaletteAI 1.1.1 Release Notes

Summary

PaletteAI 1.1.2 Release Notes

Summary

PaletteAI 1.1.3 Release Notes

Summary

PaletteAI 1.1.4 Release Notes

Summary

PaletteAI 1.1.5 Release Notes

Summary

PaletteAI 1.1.6 Release Notes

Summary

PaletteAI 1.1.7 Release Notes

Summary

PaletteAI 1.1.8 Release Notes

Summary

PaletteAI 1.2.0 Release Notes

Summary

PaletteAI Components

PaletteAI is installed through the Mural umbrella Helm chart, which deploys a set of cooperating services. Some of these services are built by Spectro Cloud, and others are established open-source projects that PaletteAI configures and manages for you. This page explains what each component is, what it does, and where it runs, so you can recognize the services in your cluster and understand how they fit together.

Prepare Infrastructure

PaletteAI appliance preparation follows one of two paths:

Project Scope

A Project is a workspace where teams deploy and manage AI/ML applications. It is the narrowest scope and where most day-to-day work happens. Every Project belongs to a parent Tenant. Projects provide the following benefits:

Register an Edge Node

This guide explains how to register a bare-metal or virtual edge node with the Palette VerteX

Release Notes

July 29, 2026 — PaletteAI 1.2.0

Role Permissions Reference

This page lists the full Kubernetes Role-Based Access Control (RBAC) permissions that PaletteAI grants to each Tenant and Project role. For an overview of each role and how OpenID Connect (OIDC) groups bind to roles, refer to the Roles and Permissions concept page.

Roles and Permissions

PaletteAI manages permissions using standard Kubernetes Role-Based Access Control (RBAC), with one consistent extension: every role in PaletteAI is bound to OpenID Connect (OIDC) groups rather than to individual users. When you create a Tenant or Project, PaletteAI generates the underlying roles and role bindings automatically and connects them to the OIDC groups you specify in the Tenant's tenantRoleMapping or the Project's roleMapping. Group membership in your identity provider grants or revokes access; there are no per-user resources to maintain inside the cluster.

Scaling Policies

A Scaling Policy is a reusable set of autoscaling rules for Compute Pools. It defines the CPU and GPU utilization thresholds that trigger scaling, how long utilization must hold before PaletteAI acts, the minimum and maximum resources a pool may scale between, and how long to wait between scaling actions. One policy can be attached to many Compute Pools, so teams can standardize scaling behavior instead of tuning each pool individually.
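
As a rough illustration of how the rules in a Scaling Policy interact, the sketch below models one scaling decision in Python. It is not PaletteAI's controller logic: every field and function name is hypothetical, a single utilization figure stands in for the separate CPU and GPU thresholds, and scaling by one node per action is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    # Hypothetical field names; these are not the ScalingPolicy CRD schema.
    scale_up_threshold: float    # utilization (0.0-1.0) at or above which the pool grows
    scale_down_threshold: float  # utilization at or below which the pool shrinks
    hold_seconds: float          # how long utilization must hold before acting
    cooldown_seconds: float      # minimum wait between scaling actions
    min_nodes: int
    max_nodes: int


def desired_nodes(policy: ScalingPolicy, current_nodes: int, utilization: float,
                  held_for: float, since_last_action: float) -> int:
    """Return the node count the pool should have after this evaluation."""
    # Still cooling down from the previous action: do nothing.
    if since_last_action < policy.cooldown_seconds:
        return current_nodes
    # Utilization has not held at its current level long enough: do nothing.
    if held_for < policy.hold_seconds:
        return current_nodes
    # Scale by one node (an assumption), clamped to the policy's bounds.
    if utilization >= policy.scale_up_threshold:
        return min(current_nodes + 1, policy.max_nodes)
    if utilization <= policy.scale_down_threshold:
        return max(current_nodes - 1, policy.min_nodes)
    return current_nodes


# One policy object can be evaluated for many pools.
shared = ScalingPolicy(0.8, 0.3, hold_seconds=300, cooldown_seconds=600,
                       min_nodes=1, max_nodes=4)
print(desired_nodes(shared, 2, 0.9, held_for=400, since_last_action=700))
```

Because the policy is an immutable value separate from any pool's state, the same object can be reused across pools, which mirrors the point about standardizing scaling behavior.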

ScalingPolicy Configuration Reference

This page provides technical reference information for configuring ScalingPolicy resources. For step-by-step instructions, refer to Create and Manage Scaling Policies. For the raw resource spec, refer to the ScalingPolicy CRD documentation.

Self-Hosted Quick Start

Use this guide for an end-to-end self-hosted setup that starts with the appliance installation and ends with importing and deploying your first profile bundle.

Settings and Integrations

A Settings resource holds the external integrations PaletteAI uses to provision infrastructure and govern artificial intelligence and machine learning (AI/ML) workloads. The System, Tenants, and Projects each reference a Settings resource, and the controller uses the integrations defined there to communicate with Palette and the registries that supply your models.

Sharing Resources

PaletteAI shares resources down the scope hierarchy: a resource defined once at System or Tenant scope can be used by many Projects without being copied into each one. This enables platform teams to curate a common catalog of building blocks — for example, an approved Profile Bundle — while individual teams keep working in their own isolated Project namespaces.
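
The downward sharing described above can be pictured as a walk up the scope chain. The Python sketch below is only a conceptual model: the `Scope` type, the `visible_resources` function, and the example names are hypothetical, and the mechanism PaletteAI actually uses to resolve shared resources is not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Scope:
    kind: str                         # "system", "tenant", or "project"
    name: str
    parent: Optional["Scope"] = None  # None for the System scope


def visible_resources(scope: Scope, defined: Dict[Scope, List[str]]) -> List[str]:
    """List resources usable from `scope`: its own plus those of every ancestor."""
    chain = []
    node: Optional[Scope] = scope
    while node is not None:
        chain.append(node)
        node = node.parent
    names: List[str] = []
    # Walk from the top of the hierarchy down so broader scopes come first.
    for s in reversed(chain):
        names.extend(defined.get(s, []))
    return names


system = Scope("system", "pai-system")
tenant = Scope("tenant", "example-tenant", system)
team_a = Scope("project", "team-a", tenant)
team_b = Scope("project", "team-b", tenant)

# Hypothetical catalog: each resource is defined once, at a single scope.
defined = {
    system: ["approved-profile-bundle"],
    tenant: ["tenant-profile-bundle"],
    team_a: ["team-a-experiment"],
}
print(visible_resources(team_a, defined))
print(visible_resources(team_b, defined))
```

In this model both Projects see the System and Tenant resources without holding copies, while the resource defined in one Project stays invisible to its sibling.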

System Scope

System is the top-level scope in PaletteAI, backed by the pai-system namespace and managed by platform operators. It sits above Tenants and Projects in the scope hierarchy.

Tenant Scope

A Tenant represents an organization or major division within your company and groups one or more Projects under a single administrative boundary. Platform engineering teams typically manage Tenants. Tenants provide the following benefits:

Troubleshooting Compute Pools

This page provides troubleshooting guidance for common Compute Pool issues.

UI Action Permissions Reference

PaletteAI enforces Role-Based Access Control (RBAC) across the UI. Each create, edit, or delete action is available only to users whose role includes the required permission. If an action listed below is not visible in the UI, your role does not include the permission. Contact your administrator to request access. For the full list of permissions granted to each PaletteAI role, refer to the Role Permissions reference.
