22 docs tagged with "paletteai"

App Deployments

An App Deployment represents an AI/ML application deployed using a Profile Bundle; the Profile Bundle must contain a Workload Profile with the type Application. An App Deployment is the primary method that data scientists and ML engineers use to deploy their workloads onto Compute Pools.

Appliance Installation

The appliance installation deploys self-hosted Palette and PaletteAI on the same cluster, using bare metal or edge devices. The installation is broken into two parts:

Architecture

PaletteAI abstracts away the complexity of deploying AI and ML application stacks on Kubernetes. Built on proven orchestration technologies, PaletteAI enables data science teams to deploy and manage their own AI and ML application stacks while platform engineering teams maintain control over infrastructure, security, and more.

Compute

A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute resource can reference a Settings resource in the current namespace; if the reference is omitted, it falls back to the Project's configured Settings.
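
The discovery step described above can be pictured as a simple filter over an inventory. The following is a minimal sketch, not PaletteAI's implementation: the `Machine` type, the `eligible_machines` helper, and the `paletteai` tag value are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical stand-in for a machine reported by Palette."""
    name: str
    tags: set = field(default_factory=set)
    healthy: bool = False


def eligible_machines(machines, required_tag="paletteai"):
    """Keep only machines that carry the tag and are healthy.

    The tag value is an assumption; the real tag used by PaletteAI may differ.
    """
    return [m for m in machines if required_tag in m.tags and m.healthy]


inventory = [
    Machine("edge-01", {"paletteai"}, healthy=True),
    Machine("edge-02", {"paletteai"}, healthy=False),  # tagged but unhealthy
    Machine("edge-03", {"other"}, healthy=True),       # healthy but not tagged
]

eligible_names = [m.name for m in eligible_machines(inventory)]
print(eligible_names)
```

Only machines that satisfy both conditions are reported, which mirrors the "tagged, healthy, and eligible" wording in the summary.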

Compute Config

A Compute Config defines default settings for how PaletteAI deploys Kubernetes clusters. It captures infrastructure details, such as networking, SSH keys, and node configurations, so that data scientists and ML engineers can deploy workloads without specifying these settings each time.

Compute Pools

A Compute Pool is a group of shared Compute resources used to create Kubernetes clusters where your AI/ML applications run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.

Concepts

This section covers the core concepts for working with PaletteAI. Whether you are a platform engineer setting up infrastructure or a data scientist deploying workloads, these concepts explain how PaletteAI organizes resources and manages AI/ML deployments.

Configure Integrations

A Settings resource holds the integrations PaletteAI uses to provision infrastructure and govern artificial intelligence and machine learning (AI/ML) workloads. This page shows how to configure each integration type and how to govern model availability through Project Model Settings.

Configure Settings

Settings define integrations and configuration values used by Projects and Compute Pools. For integration types, prerequisites, and examples, refer to Settings and Integrations.

Glossary

This glossary defines key terms and concepts used throughout PaletteAI. Entries for concepts inherited from other software, such as Palette and Open Cluster Management (OCM), name their source.

Helm Chart Configuration

This page contains the complete Helm chart configuration reference for the latest version of PaletteAI.

Hub and Spoke Model

PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.

OCI Registries

PaletteAI uses OCI (Open Container Initiative) registries to store and distribute workload artifacts between hub and spoke clusters. When you deploy an application using the App Deployment workflow, PaletteAI renders your Workload Profile into Kubernetes manifests, packages them as OCI artifacts, and stores them in a registry. Flux controllers, which exist on each spoke cluster, then pull these artifacts and apply them.
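
The render, package, store, and pull sequence above can be sketched end to end with an in-memory stand-in for the registry. This is an illustrative model only: the function names, the profile layout, and the dictionary registry are assumptions, and no real OCI or Flux API is called. The one property borrowed from OCI is that artifacts are addressed by a SHA-256 digest of their content.

```python
import hashlib
import json


def render(profile):
    """Hub side: turn a (hypothetical) Workload Profile into manifests."""
    return [
        {"kind": "Deployment", "metadata": {"name": component}}
        for component in profile["components"]
    ]


def push(registry, manifests):
    """Hub side: package manifests as a blob and store it under its digest."""
    blob = json.dumps(manifests, sort_keys=True).encode("utf-8")
    digest = "sha256:" + hashlib.sha256(blob).hexdigest()
    registry[digest] = blob
    return digest


def pull_and_apply(registry, digest):
    """Spoke side: fetch the blob, verify its digest, and return the manifests."""
    blob = registry[digest]
    actual = "sha256:" + hashlib.sha256(blob).hexdigest()
    if actual != digest:
        raise ValueError("artifact content does not match its digest")
    return json.loads(blob)


registry = {}  # stand-in for an OCI registry shared by hub and spoke
manifests = render({"components": ["model-server", "gateway"]})
digest = push(registry, manifests)
applied = pull_and_apply(registry, digest)
print(digest[:19], [m["metadata"]["name"] for m in applied])
```

In this sketch the hub and the spoke share nothing except the registry and the digest, which reflects the pull-based flow described above.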

Role Permissions Reference

This page lists the full Kubernetes Role-Based Access Control (RBAC) permissions that PaletteAI grants to each Tenant and Project role. For an overview of each role and how OpenID Connect (OIDC) groups bind to roles, refer to the Roles and Permissions concept page.

Roles and Permissions

PaletteAI manages permissions using standard Kubernetes Role-Based Access Control (RBAC), with one consistent extension: every role in PaletteAI is bound to OpenID Connect (OIDC) groups rather than to individual users. When you create a Tenant or Project, PaletteAI generates the underlying roles and role bindings automatically and connects them to the OIDC groups you specify in the Tenant's tenantRoleMapping or the Project's roleMapping. Group membership in your identity provider grants or revokes access; there are no per-user resources to maintain inside the cluster.
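
As a rough illustration of group-based binding, the sketch below resolves a user's roles purely from group membership. The shape of the mapping (role name to a list of OIDC group names), the role names, and the `roles_for_user` helper are assumptions made for this example; the actual schema of `tenantRoleMapping` and `roleMapping` is defined in the Roles and Permissions documentation.

```python
def roles_for_user(role_mapping, user_groups):
    """Return the roles whose bound OIDC groups overlap the user's groups."""
    groups = set(user_groups)
    return sorted(
        role for role, bound_groups in role_mapping.items()
        if groups & set(bound_groups)
    )


# Hypothetical mapping: role name -> OIDC groups bound to that role.
role_mapping = {
    "admin": ["ml-platform-admins"],
    "editor": ["data-science"],
    "viewer": ["data-science", "auditors"],
}

member_roles = roles_for_user(role_mapping, ["data-science"])
removed_roles = roles_for_user(role_mapping, [])
print(member_roles, removed_roles)
```

Because the lookup depends only on the groups a user belongs to, removing the user from a group in the identity provider removes the corresponding roles without any change to the mapping itself.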

Self-Hosted Quick Start

Use this guide for an end-to-end self-hosted setup that starts with the appliance installation and ends with importing and deploying your first profile bundle.

Settings and Integrations

A Settings resource holds the external integrations PaletteAI uses to provision infrastructure and govern artificial intelligence and machine learning (AI/ML) workloads. Tenants and Projects each reference a Settings resource, and the controller uses the integrations defined there to communicate with Palette and the registries that supply your models.

Tenants and Projects

PaletteAI uses Tenants and Projects to organize teams, control access, and manage GPU resources. This two-level hierarchy lets platform engineering teams set organization-wide policies while giving individual data science teams autonomy over their own workspaces.