Version: v0.1.x

Audit Logging

PaletteAI creates an audit log of platform activity by using Kubernetes admission webhooks to monitor requests. Each event is then sent to the Prometheus Alertmanager instance included in the PaletteAI deployment. This page explains what is captured, how to query audit logs, and how to forward them to a long-term storage destination.

Prerequisites

Overview

Every CREATE, UPDATE, and DELETE operation on a PaletteAI resource triggers an audit event. CONNECT operations are excluded.

Each event is a Prometheus alert sent to Alertmanager with the following labels and annotations.

| Field | Type | Description |
| --- | --- | --- |
| alertname | label | Always audit. Used by Alertmanager to route events to the audit receiver. |
| audit_id | label | Unique random ID per event. Ensures Alertmanager treats each audit entry as a separate event instead of merging similar ones together. |
| operation | label | CREATE, UPDATE, or DELETE. |
| gvk | label | GroupVersionKind of the resource, e.g. spectrocloud.com/v1alpha1/Tenant. |
| name | label | Name of the resource. |
| namespace | label | Namespace of the resource. |
| project | label | Project associated with the resource. Maps 1:1 with namespace. |
| tenant | label | Tenant associated with the resource. |
| actor | label | Username of the user who performed the operation. |
| actorUID | label | UID of the user who performed the operation. |
| admission_status | label | allowed or denied. |
| summary | annotation | Human-readable description, e.g. 'alice' performed 'CREATE' on 'Workload' 'my-project/my-workload'. |
| reason | annotation | Optional. Populated when admission_status is denied. Contains the validation error. |
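
To make the field list concrete, the following Python sketch assembles one audit event as a dictionary of labels and annotations. It is an illustration only, not the webhook's actual implementation: the helper name `build_audit_alert`, the sample values, the `example.com/v1/Workload` GVK, the use of `uuid4` for `audit_id`, and the way `summary` is composed from the namespace and name are all assumptions.

```python
import uuid


def build_audit_alert(operation, gvk, name, namespace, project, tenant,
                      actor, actor_uid, admission_status="allowed", reason=None):
    """Assemble a dict shaped like one audit event (hypothetical helper)."""
    kind = gvk.rsplit("/", 1)[-1]  # last GVK segment, e.g. "Workload"
    alert = {
        "labels": {
            "alertname": "audit",          # always "audit", used for routing
            "audit_id": uuid.uuid4().hex,  # assumption: any unique random ID works
            "operation": operation,
            "gvk": gvk,
            "name": name,
            "namespace": namespace,
            "project": project,
            "tenant": tenant,
            "actor": actor,
            "actorUID": actor_uid,
            "admission_status": admission_status,
        },
        "annotations": {
            # Assumption: the summary is built from namespace/name.
            "summary": f"'{actor}' performed '{operation}' on '{kind}' "
                       f"'{namespace}/{name}'",
        },
    }
    # The reason annotation is only populated for denied requests.
    if admission_status == "denied" and reason:
        alert["annotations"]["reason"] = reason
    return alert


example = build_audit_alert(
    operation="CREATE", gvk="example.com/v1/Workload", name="my-workload",
    namespace="my-project", project="my-project", tenant="my-tenant",
    actor="alice", actor_uid="1234",
)
print(example["annotations"]["summary"])
```

Because `audit_id` differs on every call, two otherwise identical events produce different label sets, which is what keeps Alertmanager from merging them.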

Default Storage Limitations

By default, Alertmanager stores alerts on a 50Mi PersistentVolumeClaim (PVC). Alerts are active for one hour (resolve_timeout: 1h) and are not resent for one year (repeat_interval: 1y), meaning each unique audit event is effectively stored once and then expires. This is not suitable for long-term audit retention. Refer to Forward to Long-Term Storage to configure a durable destination.

Configure Audit Logging

Audit logging is enabled by default as part of installation. The following parameters are available under global.auditLogging in your values.yaml if you need to adjust the defaults.

| Parameter | Description | Default |
| --- | --- | --- |
| global.auditLogging.enabled | Enable or disable audit logging. | true |
| global.auditLogging.alertmanagerURL | URL of the Alertmanager service to send audit events to. | https://alertmanager.mural-system.svc.cluster.local:9093 |
| global.auditLogging.timeout | Timeout for sending audit events to Alertmanager. | 2s |
| global.auditLogging.basicAuth.username | Username for Alertmanager basic auth. Set to empty string to disable. | user |
| global.auditLogging.basicAuth.password | Password for Alertmanager basic auth. Set to empty string to disable. | pass |
| global.auditLogging.tls.insecureSkipVerify | Skip Transport Layer Security (TLS) certificate verification when connecting to Alertmanager. Not recommended for production. | false |
| global.auditLogging.tls.caCertSecretName | Name of the Secret containing the CA certificate for Alertmanager TLS. When set, cert-manager issues an Alertmanager TLS certificate using this secret name. Clients (including the audit webhook) also use this secret to trust the Alertmanager CA when establishing TLS connections. | alertmanager-tls-cert |
| global.auditLogging.tls.minVersion | Minimum TLS version. Accepted values: TLS12, TLS13. | TLS12 |
| global.auditLogging.tls.maxVersion | Maximum TLS version. Accepted values: TLS12, TLS13. | TLS13 |
| global.auditLogging.httpProxy | HTTP proxy URL for Alertmanager requests. Sets the HTTP_PROXY environment variable. | "" |
| global.auditLogging.httpsProxy | HTTPS proxy URL for Alertmanager requests. Sets the HTTPS_PROXY environment variable. | "" |
| global.auditLogging.noProxy | Comma-separated list of hosts to exclude from proxying. Sets the NO_PROXY environment variable. | "" |

info

By default, audit logging only captures events from the hub cluster, as Alertmanager is deployed with a ClusterIP service and no ingress, making it unreachable from dedicated spoke clusters. To capture audit events from spokes as well, expose the hub's Alertmanager externally via a LoadBalancer service or an ingress, then set global.auditLogging.alertmanagerURL to the external endpoint.

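When you update the basic auth credentials (global.auditLogging.basicAuth.username and global.auditLogging.basicAuth.password), ensure the Alertmanager probe headers remain in sync.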

warning

global.auditLogging.basicAuth.username and global.auditLogging.basicAuth.password must match the credentials set in alertmanager.livenessProbe and alertmanager.readinessProbe. When you change these credentials, regenerate the Base64-encoded string and update the probe headers accordingly. Refer to the Alertmanager section of your installation guide for details (Vanilla Kubernetes, GKE, EKS).
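
As a minimal sketch of regenerating the Base64-encoded string, the Python snippet below computes a standard HTTP Basic `Authorization` value from a username and password. It assumes the probe headers carry a standard Basic auth value; the helper name `basic_auth_header` is hypothetical, and the exact header layout expected by your probes should be confirmed against the installation guide.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return an HTTP Basic Authorization header value for the credentials."""
    # Basic auth encodes "username:password" as Base64.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# With the default credentials from the parameter table (user / pass):
print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```

Run the helper with your new credentials and copy the resulting value into both probe definitions so the liveness and readiness checks keep authenticating.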

The following is an example of a production-ready global.auditLogging configuration:

global:
  auditLogging:
    enabled: true
    alertmanagerURL: 'https://alertmanager.mural-system.svc.cluster.local:9093'
    timeout: '2s'
    basicAuth:
      username: YOUR_USERNAME
      password: YOUR_PASSWORD
    tls:
      insecureSkipVerify: false
      caCertSecretName: alertmanager-tls-cert
      minVersion: 'TLS12'
      maxVersion: 'TLS13'

Configure Alertmanager Routing

By default, audit events are matched by alertname: audit and sent to a dedicated audit receiver. This receiver has no integrations configured by default, so you must add a webhook_configs entry or another integration to forward events to a long-term destination. The repeat_interval: 1y setting helps ensure each unique event is forwarded only once, while group_wait: 1s and group_interval: 10s keep forwarding delay to a minimum.

Configure alertmanager.config in your values.yaml file.

alertmanager:
  config:
    global:
      resolve_timeout: 1h
    receivers:
      - name: default-receiver
      - name: audit
        # Add webhook_configs or other integrations here to forward audit logs
    route:
      receiver: default-receiver
      group_wait: 10s
      group_interval: 5m
      repeat_interval: 3h
      routes:
        - match:
            alertname: audit
          receiver: audit
          group_wait: 1s
          group_interval: 10s
          repeat_interval: 1y

Access Audit Logs

Use kubectl port-forward to access the Alertmanager UI and API directly from your local machine.

kubectl port-forward svc/alertmanager 9093:9093 --namespace <release-namespace>

The Alertmanager UI is available at https://localhost:9093. Because Alertmanager uses a cluster-internal TLS certificate, your browser shows a certificate warning. You can safely proceed or add a security exception.

Log in using the credentials configured in global.auditLogging.basicAuth.

Query Audit Logs via the API

Use the Alertmanager HTTP API to query audit events programmatically. The --insecure flag skips TLS verification for the cluster-internal certificate.

List all active audit events:

curl --insecure --user username:password \
  "https://localhost:9093/api/v2/alerts?filter=alertname%3D%22audit%22"

Filter by actor:

curl --insecure --user username:password \
  "https://localhost:9093/api/v2/alerts?filter=alertname%3D%22audit%22&filter=actor%3D%22alice%40example.com%22"

Filter by operation:

curl --insecure --user username:password \
  "https://localhost:9093/api/v2/alerts?filter=alertname%3D%22audit%22&filter=operation%3D%22DELETE%22"

Filter by tenant:

curl --insecure --user username:password \
  "https://localhost:9093/api/v2/alerts?filter=alertname%3D%22audit%22&filter=tenant%3D%22my-tenant%22"

info

The Alertmanager API returns only active alerts, that is, events still within the resolve_timeout window (one hour by default). For historical queries, configure a long-term storage destination as described in Forward to Long-Term Storage.
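
The query strings in the curl commands above are URL-encoded label matchers passed as repeated `filter` parameters. The following Python sketch builds such a URL from plain label values so the encoding does not have to be done by hand. The helper name `audit_query_url` is hypothetical, and the base URL assumes the port-forward shown earlier.

```python
from urllib.parse import urlencode


def audit_query_url(base_url="https://localhost:9093", **label_filters):
    """Build an /api/v2/alerts URL that matches audit events by label."""
    # Every query starts with the alertname matcher used for audit events.
    matchers = ['alertname="audit"']
    # Each keyword argument becomes one equality matcher, e.g. actor="alice".
    matchers += [f'{label}="{value}"' for label, value in label_filters.items()]
    # urlencode percent-encodes '=', '"', and '@' inside each matcher.
    query = urlencode([("filter", matcher) for matcher in matchers])
    return f"{base_url}/api/v2/alerts?{query}"


print(audit_query_url(actor="alice@example.com"))
print(audit_query_url(operation="DELETE", tenant="my-tenant"))
```

This sketch does not escape double quotes inside label values, so values containing `"` would need additional handling.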

Forward to Long-Term Storage

Because audit events expire after one hour in Alertmanager's local storage, production deployments should forward events to a durable destination. To do so, add a webhook_configs entry to the audit receiver in alertmanager.config. Alertmanager POSTs a JSON payload to the configured URL for each group of matching alerts.

The Alertmanager webhook payload has the following structure.

{
  "version": "4",
  "groupKey": "...",
  "status": "firing",
  "receiver": "audit",
  "groupLabels": {},
  "commonLabels": {
    "alertname": "audit",
    "actor": "alice@example.com",
    "operation": "CREATE",
    "tenant": "my-tenant"
  },
  "commonAnnotations": {
    "summary": "'alice@example.com' performed 'CREATE' on 'Workload' 'my-project/my-workload'"
  },
  "alerts": [...]
}

Depending on your destination, you may need to adjust the routing configuration, add additional receivers, or introduce a processing layer to transform the payload.
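
As one possible shape for such a processing layer, the Python sketch below accepts the webhook POST, flattens each entry in `alerts` into one record, and appends the records to a JSON Lines file. It is a minimal illustration rather than a production receiver: it has no TLS or authentication, the names `extract_audit_records`, `AuditWebhookHandler`, and `serve`, the port 8080, and the file `audit-events.jsonl` are all hypothetical, and it assumes each entry in `alerts` carries `labels`, `annotations`, and `startsAt` as in the standard Alertmanager webhook format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_audit_records(payload):
    """Flatten one webhook payload into one record per alert."""
    records = []
    for alert in payload.get("alerts", []):
        annotations = alert.get("annotations", {})
        record = dict(alert.get("labels", {}))  # actor, operation, tenant, ...
        record["summary"] = annotations.get("summary")
        record["reason"] = annotations.get("reason")  # None unless denied
        record["startsAt"] = alert.get("startsAt")
        records.append(record)
    return records


class AuditWebhookHandler(BaseHTTPRequestHandler):
    log_path = "audit-events.jsonl"  # hypothetical destination file

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_response(400)  # reject bodies that are not valid JSON
            self.end_headers()
            return
        # Append one JSON object per line so the file is easy to ship or parse.
        with open(self.log_path, "a", encoding="utf-8") as log_file:
            for record in extract_audit_records(payload):
                log_file.write(json.dumps(record, sort_keys=True) + "\n")
        self.send_response(200)
        self.end_headers()


def serve(host="0.0.0.0", port=8080):
    """Run the receiver until interrupted (call this explicitly)."""
    HTTPServer((host, port), AuditWebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. The audit receiver's webhook URL would then point at this service, and the JSON Lines file could be swapped for a database or log pipeline client.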

Next Steps