Glossary

This glossary defines key terms and concepts used throughout PaletteAI. Concepts inherited from other software, such as Palette and Open Cluster Management (OCM), are indicated, along with their source.

AI Workload

A resource that represents an application or model deployed using a Profile Bundle via the App Deployment or Model Deployment workflow.

App Deployment

A workflow used to deploy AI/ML applications focused on AI use cases, ranging from open-source tools to proprietary enterprise software.

Cluster Profile

(Palette concept) A reusable software stack built in Palette that is composed of infrastructure or application layers, which are used to deploy clusters. Cluster Profiles are a component of Profile Bundles and are primarily used for infrastructure-related use cases, such as creating a new Compute Pool, but can also be used to deploy additional functionality for applications and models, such as monitoring or logging. There are three types: Add-on, Infrastructure, and Full. Profile Bundles can contain multiple Add-on Cluster Profiles but can only contain one Full or Infrastructure Cluster Profile.

Add-on Cluster Profile

(Palette concept) A type of Palette Cluster Profile that contains only add-on layers (such as monitoring, logging, ingress, and service mesh applications) and is designed for reuse across multiple clusters and projects.

Full Cluster Profile

(Palette concept) A type of Palette Cluster Profile that combines infrastructure and add-on layers into a single, end-to-end stack.

Infrastructure Cluster Profile

(Palette concept) A Palette Cluster Profile that contains the core infrastructure layers of a Kubernetes cluster: operating system (OS), Kubernetes distribution, Container Network Interface (CNI), and Container Storage Interface (CSI).

Compute

A resource that provides a real-time inventory of the machines available for AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for deploying Compute Pools, applications, or models.

Compute Config

A resource that defines default settings for how PaletteAI deploys Compute Pools. It captures infrastructure details, such as networking, SSH keys, and node configurations, so that data scientists and ML engineers can deploy workloads without specifying these settings each time.

Compute Pool

A resource that represents Kubernetes clusters where AI/ML applications and models run. Compute Pools operate in two modes: dedicated (reserved for a single workload) and shared (used for multiple workloads).

Dedicated Compute Pool

A Compute Pool mode where a Kubernetes cluster is provisioned exclusively for a single application or model, with GPU, CPU, and memory reserved for that workload. Ideal for production scenarios, isolated workloads, specific hardware needs, or stringent compliance or security requirements.

Shared Compute Pool

A Compute Pool mode where one or more Kubernetes clusters host multiple applications or models, sharing GPU, CPU, and memory resources across workloads. Best suited for development, experimentation, or budget-conscious use cases.

Definitions

Reusable CUE templates that generate Kubernetes resources. There are three types: Components, Traits, and Policies. Definitions are versioned for consistency and control.

Component Definition

A CUE template that defines the core building blocks of an application, such as a web server, database, or ML model. Component Definitions specify what gets deployed and how the corresponding Kubernetes resources are generated. They can be customized with Traits and are versioned for consistency.
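As an illustration, the shape of such a template can be sketched in CUE. The field names below (`parameter`, `output`, `context`) follow common CUE-template conventions and are a hypothetical sketch, not the exact PaletteAI schema:

```cue
// Hypothetical Component Definition template (illustrative field
// names; the exact PaletteAI schema may differ).
// "parameter" declares the inputs a user supplies; "output" is the
// Kubernetes resource the template generates from them.
parameter: {
	image: string
	port:  *8080 | int // optional, defaults to 8080
}

output: {
	apiVersion: "apps/v1"
	kind:       "Deployment"
	spec: {
		selector: matchLabels: app: context.name
		template: {
			metadata: labels: app: context.name
			spec: containers: [{
				name:  context.name
				image: parameter.image
				ports: [{containerPort: parameter.port}]
			}]
		}
	}
}
```

Because templates like this are versioned, a Workload Profile can pin a specific Definition version so that regenerated resources stay consistent across deployments.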

Policy Definition

A CUE template that defines system-level rules applied across applications and models. Policy Definitions control placement (which clusters receive workloads), environment-specific overrides, and topology configurations. Commonly used for managing multi-cluster deployments.

Trait Definition

A CUE template that adds operational capabilities to a Component without changing the Component Definition itself. Trait Definitions provide features like ingress, autoscaling, or monitoring. Traits are applied to Components within Workload Profiles and versioned independently.
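To illustrate, a Trait that scales a Component could be sketched in CUE as a patch. The `parameter` and `patch` fields are conventional CUE-template names, used here as a hypothetical sketch rather than the exact PaletteAI schema:

```cue
// Hypothetical scaling Trait (illustrative). Rather than producing a
// new resource, it patches the Component's generated workload,
// leaving the Component Definition itself unchanged.
parameter: {
	replicas: *2 | int // optional, defaults to 2
}

patch: spec: replicas: parameter.replicas
```

Because the Trait only patches the generated output, the same versioned Component Definition can be reused with or without it.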

Hub

(OCM concept) The cluster that runs the multi-cluster control plane of OCM. All spoke clusters communicate with the hub and retrieve workloads from the hub or through an OCI registry.

Model Deployment

A workflow used to deploy AI/ML models to Compute Pools. Model Deployments support two approaches: Custom Model Deployment and Model as a Service.

Custom Model Deployment

A Model Deployment workflow for bringing your own AI/ML model and deploying it with full control over the infrastructure and configuration. Designed for production workloads or advanced use cases requiring hardware-optimized performance. Unlike Model as a Service, Custom Model Deployments give you control over the Cluster Profile, Workload Profile, and Compute resource allocation.

Model as a Service

A Model Deployment workflow for quickly exploring and testing pre-built AI/ML models from external sources such as Hugging Face or NVIDIA NIMs. It prioritizes ease of use over hardware optimization, making it ideal for demos, experimentation, and research.

Open Cluster Management (OCM)

A Kubernetes project that helps teams deploy and manage workloads across multiple clusters from a central control plane.

PaletteAI Studio

A catalog of ready-to-use Profile Bundles that support a wide range of use cases. All Profile Bundles are designed to work out of the box but can be modified as needed.

Profile Bundle

Reusable bundles that package Cluster Profiles and Workload Profiles for consistent, repeatable deployments of infrastructure, applications, or models across Compute Pools. There are three types: Application, Infrastructure, and Fullstack.

Application Profile Bundle

A Profile Bundle type that contains only Workload Profiles. Used to deploy applications or models to existing Compute Pools.

Fullstack Profile Bundle

A Profile Bundle type that contains both Cluster Profiles and Workload Profiles. Used for simultaneous provisioning of Compute Pools and the deployment of applications or models. Combines Infrastructure and Application Profile Bundle capabilities, allowing teams to deploy both the cluster and workloads in a single operation.

Infrastructure Profile Bundle

A Profile Bundle type used to provision Compute Pools. Contains at least one Cluster Profile of type Infrastructure or Full, and may optionally include Infrastructure Workload Profiles for deploying infrastructure-related dependencies. Typically used by platform engineers to create shared or dedicated Compute Pools for data scientists and ML engineers.

Project

(Palette concept) A team workspace within a Tenant where most day-to-day work happens, giving teams their own space, controlled access, and dedicated resources to build and run AI/ML applications and models. Project team members are assigned roles based on their access requirements.

Project Role

A role that defines who can view, manage, or administer a Project, ensuring the right people have appropriate access to a team’s workspace.

Resource Groups

A filter that ensures only specific machines are selected from the Compute resource when provisioning Compute Pools.

Settings

A resource that configures external integrations for PaletteAI. Supported integrations include Palette, NVIDIA NGC, and Hugging Face. A Palette integration is required to deploy Compute Pools.

Spoke

(OCM concept) A member cluster managed by the hub cluster. Spoke clusters continuously pull their desired state from the hub and are also known as ManagedClusters.

Tenant

(Palette concept) The top-level organizational container that groups multiple Projects, enabling platform teams to centrally manage access, policies, infrastructure, and resources across teams.

Variables

A method for injecting custom configuration values across Workload Profiles using macro syntax. Variables support multiple data types, can be required or optional, and can be overridden at the Project or deployment level.

Workload Profile

Application configurations packaged within Profile Bundles that define what gets deployed to Compute Pools. There are three types: Application, Model, and Infrastructure. Unlike Cluster Profiles, Workload Profiles are created and managed in PaletteAI, making them accessible to both platform engineers and data scientists.

Application Workload Profile

A Workload Profile type used to deploy end-user-facing applications and their required dependencies (excluding infrastructure components). Application Workload Profiles are used in the App Deployment workflow to deploy AI/ML applications to existing Compute Pools or as part of a Fullstack Profile Bundle for simultaneous application and infrastructure deployment.

Infrastructure Workload Profile

A Workload Profile type used to deploy infrastructure-related dependencies that are not part of the Cluster Profile. Useful when deploying infrastructure components you do not want Palette to manage, or when dependencies reside in private Helm or OCI registries managed by PaletteAI but not in Palette.

Model Workload Profile

A Workload Profile type used to deploy AI/ML models, including their required dependencies such as inference engines. Model Workload Profiles are used in the Model Deployment workflow to deploy models to Compute Pools, supporting both custom models and pre-built models from external sources.