Version: v1.1.x

Concepts

This section covers the core concepts for working with PaletteAI. Whether you are a platform engineer setting up infrastructure or a data scientist deploying workloads, these concepts explain how PaletteAI organizes resources and manages AI/ML deployments.

Organization and Access

PaletteAI uses a hierarchical structure to organize teams and control access to resources.

Tenants and Projects - Organizational hierarchy for multi-tenancy, GPU quotas, and team access control.
Settings - External integrations (Palette API credentials) and namespace-scoped configuration.
Roles and Permissions - RBAC roles automatically created for Tenants and Projects.

Compute Resources

These resources define where your AI/ML applications run.

Compute - Discovers available machines for cluster provisioning.
Compute Config - Default settings for cluster deployment (networking, SSH, node configurations).
Compute Pools - Kubernetes clusters where applications run (dedicated or shared modes).

Workload Resources

These resources define what gets deployed and how workloads are configured.

Workloads - The Workload, WorkloadProfile, and WorkloadDeployment resources.
Definitions - Components, Traits, and Policies that compose workloads.
Variables - User-defined configuration values for templating.
Environments - Placement policies that control which clusters receive workloads.

Deployments

These resources package and deploy applications to your infrastructure.

Profile Bundles - Reusable packages combining Palette Cluster Profiles and PaletteAI Workload Profiles.
App Deployments - Deploy AI/ML applications to Compute Pools using Profile Bundles.
Model Deployments - Deploy AI/ML models for inference using Model as a Service or custom configurations.

Deployment Flow

A typical deployment involves these concepts working together:

Platform setup - A platform engineer creates a Tenant with Settings for Palette integration.
Project creation - Teams get Projects with GPU quotas and role-based access. Refer to Create and Manage Projects for how to create projects.
Infrastructure provisioning - Compute discovers available machines, and Compute Pools provision Kubernetes clusters.
Application packaging - Profile Bundles package infrastructure and application configurations.
Deployment - Data scientists create App Deployments or Model Deployments using Profile Bundles on Compute Pools.
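
To make the relationships between these concepts concrete, the flow above can be sketched as a small in-memory model. This is an illustrative sketch only, not the PaletteAI API: every class, field, and function name here (Tenant, Project, ComputePool, ProfileBundle, deploy, gpus_required) is hypothetical, and checking the GPU quota at deployment time is an assumption about how a quota might be applied rather than a description of how PaletteAI enforces it.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical model of a Project with a GPU quota.
    name: str
    gpu_quota: int
    gpus_in_use: int = 0


@dataclass
class Tenant:
    # Hypothetical model of a Tenant that owns Projects.
    name: str
    projects: dict = field(default_factory=dict)

    def create_project(self, name, gpu_quota):
        project = Project(name, gpu_quota)
        self.projects[name] = project
        return project


@dataclass
class ComputePool:
    # Stands in for a Kubernetes cluster; mode is "dedicated" or "shared".
    name: str
    mode: str


@dataclass
class ProfileBundle:
    # gpus_required is an illustrative field, not a documented attribute.
    name: str
    gpus_required: int


def deploy(project, pool, bundle):
    """Record a deployment, or raise if the project's GPU quota would be exceeded."""
    if project.gpus_in_use + bundle.gpus_required > project.gpu_quota:
        raise ValueError(f"GPU quota exceeded for project {project.name}")
    project.gpus_in_use += bundle.gpus_required
    return {"project": project.name, "pool": pool.name, "bundle": bundle.name}


# Walk through the five steps with made-up names.
tenant = Tenant("acme")                                        # platform setup
project = tenant.create_project("vision-team", gpu_quota=4)    # project creation
pool = ComputePool("gpu-pool-1", mode="dedicated")             # infrastructure provisioning
bundle = ProfileBundle("llm-inference", gpus_required=2)       # application packaging
deployment = deploy(project, pool, bundle)                     # deployment
print(deployment)
```

The sketch keeps the quota on the Project and the cluster on the Compute Pool, mirroring the split between organizational resources and compute resources described earlier on this page.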

For a deeper look at the system architecture, refer to the Architecture guide.
