22 docs tagged with "paletteai"

App Deployments

An App Deployment represents an AI/ML application deployed using a Profile Bundle; the Profile Bundle must contain a Workload Profile with the type Application. An App Deployment is the primary method that data scientists and ML engineers use to deploy their workloads onto Compute Pools.
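
As a rough sketch, an App Deployment ties a Profile Bundle to a target Compute Pool. The resource kind, apiVersion, and field names below are hypothetical and do not reflect the actual PaletteAI schema:

```yaml
# Hypothetical sketch of an App Deployment; the kind, apiVersion,
# and field names are illustrative, not the actual PaletteAI schema.
apiVersion: ai.example.com/v1alpha1
kind: AppDeployment
metadata:
  name: sentiment-api
  namespace: team-nlp
spec:
  profileBundleRef:
    name: sentiment-bundle   # must contain a Workload Profile of type Application
  computePoolRef:
    name: gpu-pool-a         # the Compute Pool the workload is deployed onto
```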

Appliance Installation

The appliance installation deploys self-hosted Palette and PaletteAI on the same cluster using bare-metal or edge devices. The installation is broken into three parts.

Architecture

PaletteAI abstracts away the complexity of deploying AI and ML application stacks on Kubernetes. Built on proven orchestration technologies, PaletteAI enables data science teams to deploy and manage their own AI and ML application stacks while platform engineering teams maintain control over infrastructure, security, and more.

Compute

A Compute resource provides a real-time inventory of the machines available for deploying AI/ML applications and models. It connects to Palette using the Palette integration configured in Settings, discovers machines that have been tagged for PaletteAI, and reports which ones are healthy and eligible for cluster deployment. A Compute can reference a Settings resource in the current namespace; if the reference is omitted, it falls back to the project's configured Settings.
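
A minimal sketch of how that Settings fallback might look, assuming a hypothetical Compute schema (the kind, apiVersion, and field names are illustrative):

```yaml
# Hypothetical sketch of a Compute resource; the schema is illustrative.
apiVersion: ai.example.com/v1alpha1
kind: Compute
metadata:
  name: edge-machines
  namespace: team-nlp
spec:
  # Optional: reference a Settings resource in this namespace.
  # If omitted, the project's configured Settings is used instead.
  settingsRef:
    name: team-nlp-settings
  machineSelector:
    tags:
      - paletteai            # discover Palette machines tagged for PaletteAI
```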

Compute Config

A Compute Config defines default settings for how PaletteAI deploys Kubernetes clusters. It captures infrastructure details, such as networking, SSH keys, and node configurations, so that data scientists and ML engineers can deploy workloads without specifying these settings each time.
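
A hypothetical sketch of the kinds of defaults a Compute Config captures; the schema and values below are illustrative, not the actual PaletteAI API:

```yaml
# Hypothetical sketch of a Compute Config; field names and values
# are illustrative only.
apiVersion: ai.example.com/v1alpha1
kind: ComputeConfig
metadata:
  name: datacenter-defaults
spec:
  network:
    subnet: 10.10.0.0/24     # example values only
    gateway: 10.10.0.1
  sshKeys:
    - ssh-ed25519 AAAA...    # key used for node access
  nodes:
    controlPlane:
      count: 1
    workers:
      count: 3
```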

Compute Pools

A Compute Pool is a group of shared Compute resources used to create Kubernetes clusters where your AI/ML applications run. In the hub-spoke architecture, each Compute Pool becomes a spoke cluster on which applications and models are deployed.
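
A hypothetical sketch of a Compute Pool grouping the Compute resource from the earlier example; the schema is illustrative:

```yaml
# Hypothetical sketch of a Compute Pool that groups Compute resources
# into a spoke cluster; the schema is illustrative.
apiVersion: ai.example.com/v1alpha1
kind: ComputePool
metadata:
  name: gpu-pool-a
spec:
  computeRefs:
    - name: edge-machines      # Compute resources contributing machines
  computeConfigRef:
    name: datacenter-defaults  # defaults applied when the cluster is created
```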

Concepts

This section covers the core concepts you need to understand when working with PaletteAI. Whether you are a platform engineer setting up infrastructure or a data scientist deploying workloads, these concepts explain how PaletteAI organizes resources and manages AI/ML deployments.

Create and Manage Projects

A Project is a team workspace within a Tenant. Projects are where most day-to-day work happens; they give teams their own space, controlled access, and dedicated resources to build and run AI/ML applications and models.

Deploy PaletteAI

This is the final guide in the PaletteAI appliance installation process. In this guide, you will trigger the actual cluster creation process and install PaletteAI on the cluster.

Glossary

This glossary defines key terms and concepts used throughout PaletteAI. Concepts inherited from other software, such as Palette and Open Cluster Management (OCM), are indicated, along with their source.

Hub and Spoke Model

PaletteAI uses a hub-spoke architecture to separate the control plane from the data plane. The hub cluster is where you manage and configure applications. Spoke clusters are where your AI/ML applications actually run. This separation allows a single control plane to orchestrate workloads across many environments.

Integrate with Palette

Integrations are external service connections configured in a Settings resource; they provide the credentials and endpoints PaletteAI needs to interact with external platforms. Integrations can be added while creating a Tenant or Project, or at any time within the Project scope. A Palette integration is required to deploy applications and models.
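
A hypothetical sketch of a Palette integration inside a Settings resource; the endpoint is a placeholder and the field names do not reflect the actual schema:

```yaml
# Hypothetical sketch of a Palette integration; field names are
# illustrative and the endpoint is a placeholder.
apiVersion: ai.example.com/v1alpha1
kind: Settings
metadata:
  name: team-nlp-settings
  namespace: team-nlp
spec:
  integrations:
    palette:
      endpoint: https://api.spectrocloud.com
      credentialsSecretRef:
        name: palette-api-key   # Secret holding the Palette API key
```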

Kubernetes Installation

This page guides you through installing PaletteAI on any Kubernetes cluster, whether it is managed by a cloud provider (EKS, GKE, AKS) or self-managed in an on-premises or edge environment. The deployment method covered uses the hub-as-spoke pattern, which allows the hub cluster to also act as a spoke cluster, and deploys Zot as the Open Container Initiative (OCI) registry. Because the hub also acts as a spoke, AI/ML applications can be deployed directly on the hub cluster. To learn more about hub and spoke clusters, refer to our Hub-Spoke Model guide.

OCI Registries

PaletteAI uses OCI (Open Container Initiative) registries to store and distribute workload artifacts between hub and spoke clusters. When you deploy an application using the App Deployment workflow, PaletteAI renders your Workload Profile into Kubernetes manifests, packages them as OCI artifacts, and stores them in a registry. Flux controllers running on each spoke cluster then pull these artifacts and apply them.
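
To make the flow concrete, here is generic Flux configuration of the kind a spoke cluster could use to pull and apply an OCI artifact. The registry URL and artifact name are placeholders; PaletteAI generates its own equivalents, so you would not normally write these by hand:

```yaml
# Generic Flux resources illustrating the pull-and-apply flow;
# the registry URL and artifact name are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: sentiment-api
  namespace: flux-system
spec:
  interval: 5m
  url: oci://zot.example.com/workloads/sentiment-api
  ref:
    tag: latest
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: sentiment-api
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: OCIRepository
    name: sentiment-api
  path: ./
  prune: true               # remove resources deleted from the artifact
```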

Prepare Helm Chart Values

In this section, you will prepare the Helm chart values needed to deploy PaletteAI on the cluster. Sections of the values file that are not covered in this guide can be left as is.
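
As a hypothetical illustration of the shape such a values file might take (the keys below are invented and do not reflect the actual chart schema):

```yaml
# Hypothetical excerpt of a PaletteAI values.yaml; keys are
# illustrative and do not reflect the actual chart schema.
global:
  domain: paletteai.example.com   # placeholder ingress domain
registry:
  endpoint: zot.example.com       # OCI registry used for workload artifacts
```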

Prepare Infrastructure

This guide covers creating the user-data ISO, accessing Local UI, and linking the edge nodes to the leader node. Follow the subsections in order to ensure the infrastructure is prepared successfully.

Profile Bundles

Profile Bundles package infrastructure and application configurations into reusable units for consistent, repeatable deployments across Compute Pools. Instead of manually configuring each deployment's software components, you define the configuration once in a Profile Bundle and reuse it wherever needed.
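
A hypothetical sketch of a Profile Bundle combining an infrastructure profile with the Application profile required by an App Deployment; the schema is illustrative:

```yaml
# Hypothetical sketch of a Profile Bundle; the kind, apiVersion,
# and field names are illustrative.
apiVersion: ai.example.com/v1alpha1
kind: ProfileBundle
metadata:
  name: sentiment-bundle
spec:
  profiles:
    - name: gpu-infra
      type: Infrastructure    # cluster-level software components
    - name: sentiment-api
      type: Application       # workload deployed via an App Deployment
```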

Settings

Settings is a resource that configures external integrations for PaletteAI. Settings can be created in the Tenant or Project scope. Settings created in the Tenant scope serve as the default fallback for all projects in the Tenant; you can also create a Settings resource in the Project namespace to override the Tenant default.

Tenants and Projects

PaletteAI uses Tenants and Projects to organize teams, control access, and manage GPU resources. This two-level hierarchy lets platform engineering teams set organization-wide policies while giving individual data science teams autonomy over their own workspaces.

What is PaletteAI?

Deploying AI/ML applications on Kubernetes is complex. Data science teams need GPU-accelerated clusters, specialized storage, and properly configured networking; however, they would rather focus on experiments and models than infrastructure. Meanwhile, platform teams want to provide self-service access while maintaining control over resources, costs, and security.