NIMService

Type: Component

Parameters

Parameter	Type	Required	Default	Description
authSecret	string	Yes		The name of an existing pull secret containing the NGC_API_KEY
image	object	Yes		Image defines image attributes.
affinity	object	No		Affinity is a group of affinity scheduling rules.
annotations	map	No		Annotations for the workload
args	array	No
command	array	No
draResources	array	No		DRAResources is the list of DRA resource claims to be used for the NIMService deployment or leader worker set.
env	array	No
expose	object	No		Expose defines attributes to expose the service.
groupID	integer	No
inferencePlatform	string	No	`"standalone"`	InferencePlatform specifies the inference platform to use for this NIMService. Valid values are "standalone" (default) and "kserve".
labels	map	No		Labels for the workload
livenessProbe	object	No		Probe defines attributes for startup/liveness/readiness probes.
metrics	object	No		Metrics defines attributes to set up metrics collection.
multiNode	object	No		NimServiceMultiNodeConfig defines the configuration for multi-node NIMService.
nodeSelector	map	No
podAffinity	object	No		Deprecated: Use Affinity instead.
proxy	object	No		ProxySpec defines the proxy configuration for NIMService.
readinessProbe	object	No		Probe defines attributes for startup/liveness/readiness probes.
replicas	integer	No
resources	object	No		Resources is the resource requirements for the NIMService deployment or leader worker set. Note: Only traditional resources like cpu/memory and custom device plugin resources are supported here. Any DRA claim references are ignored. Use DRAResources instead for those.
router	object	No
runtimeClassName	string	No
scale	object	No		Autoscaling defines attributes to automatically scale the service based on metrics.
schedulerName	string	No
startupProbe	object	No		Probe defines attributes for startup/liveness/readiness probes.
storage	object	No		Storage is the target storage for caching the NIM model if NIMCache is not provided.
tolerations	array	No
userID	integer	No
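
As a rough illustration of the table above, the fragment below sets the two required parameters and a few optional ones. It is a sketch under stated assumptions, not a tested configuration. The secret names, image coordinates, and GPU count are placeholders. The `repository`, `tag`, and `pullSecrets` keys under `image` are assumed from the upstream NIMService schema, because the `image` sub-fields are not listed in this section. How these values are passed to the component depends on your workload definition and is not shown here.

```yaml
# Hypothetical parameter values for the nimservice component.
authSecret: ngc-api-secret        # placeholder: existing secret containing NGC_API_KEY
image:                            # sub-fields below are assumed, not listed in the table
  repository: nvcr.io/nim/meta/llama3-8b-instruct   # placeholder image
  tag: "1.0.3"                                      # placeholder tag
  pullSecrets:
    - ngc-secret                                    # placeholder pull secret
inferencePlatform: standalone     # documented default; "kserve" is the other valid value
replicas: 1
resources:
  limits:
    nvidia.com/gpu: 1             # device plugin resource; DRA claims belong in draResources
```

In the template, every optional field is wrapped in an `if parameter.<name> != _|_` guard, so parameters left unset here are omitted from the rendered NIMService `spec`. Only `authSecret` and `image` are always emitted.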

Template

The following tabs display the definition's CUE template and the rendered YAML. The rendered YAML is the output of the CUE template when the definition is applied to a cluster.

nimservice: {
  type: "component"
  description: "NIMService is the Schema for the nimservices API."
  labels: {
    "componentdefinition.spectrocloud.com/type": "application"
    "wl.spectrocloud.com/provider": "apps.nvidia.com"
    "definition.spectrocloud.com/category": "NVIDIA-NIM"
  }
}
template: {
  output: {
    apiVersion: "apps.nvidia.com/v1alpha1"
    kind: "NIMService"
    metadata: {
      labels: {
        if parameter.labels != _|_ { parameter.labels }
        "wl.spectrocloud.com/name": context.workloadName
        "wl.spectrocloud.com/component": context.name
      }
      if parameter.annotations != _|_ { annotations: parameter.annotations }
    }
    spec: {
      if parameter.affinity != _|_ { affinity: parameter.affinity }
      if parameter.annotations != _|_ { annotations: parameter.annotations }
      if parameter.args != _|_ { args: parameter.args }
      authSecret: parameter.authSecret
      if parameter.command != _|_ { command: parameter.command }
      if parameter.draResources != _|_ { draResources: parameter.draResources }
      if parameter.env != _|_ { env: parameter.env }
      if parameter.expose != _|_ { expose: parameter.expose }
      if parameter.groupID != _|_ { groupID: parameter.groupID }
      image: parameter.image
      if parameter.inferencePlatform != _|_ { inferencePlatform: parameter.inferencePlatform }
      if parameter.labels != _|_ { labels: parameter.labels }
      if parameter.livenessProbe != _|_ { livenessProbe: parameter.livenessProbe }
      if parameter.metrics != _|_ { metrics: parameter.metrics }
      if parameter.multiNode != _|_ { multiNode: parameter.multiNode }
      if parameter.nodeSelector != _|_ { nodeSelector: parameter.nodeSelector }
      if parameter.podAffinity != _|_ { podAffinity: parameter.podAffinity }
      if parameter.proxy != _|_ { proxy: parameter.proxy }
      if parameter.readinessProbe != _|_ { readinessProbe: parameter.readinessProbe }
      if parameter.replicas != _|_ { replicas: parameter.replicas }
      if parameter.resources != _|_ { resources: parameter.resources }
      if parameter.router != _|_ { router: parameter.router }
      if parameter.runtimeClassName != _|_ { runtimeClassName: parameter.runtimeClassName }
      if parameter.scale != _|_ { scale: parameter.scale }
      if parameter.schedulerName != _|_ { schedulerName: parameter.schedulerName }
      if parameter.startupProbe != _|_ { startupProbe: parameter.startupProbe }
      if parameter.storage != _|_ { storage: parameter.storage }
      if parameter.tolerations != _|_ { tolerations: parameter.tolerations }
      if parameter.userID != _|_ { userID: parameter.userID }
    }
  }
  parameter: {
    // +usage=Annotations for the workload
    annotations?: [string]: string
    // +usage=Labels for the workload
    labels?: [string]: string
    // +usage=Affinity is a group of affinity scheduling rules.
    affinity?: {
      // +usage=Describes node affinity scheduling rules for the pod.
      nodeAffinity?: {
        // +usage=The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding "weight" to the sum if the node matches the corresponding matchExpressions; the node(s) with the highest sum are the most preferred.
        preferredDuringSchedulingIgnoredDuringExecution?: [...{
          // +usage=A node selector term, associated with the corresponding weight.
          preference: {
            // +usage=A list of node selector requirements by node's labels.
            matchExpressions?: [...{
              // +usage=The label key that the selector applies to.
              key: string
              // +usage=Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt.
              operator: string
              // +usage=An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty.
If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=A list of node selector requirements by node's fields. matchFields?: [...{ // +usage=The label key that the selector applies to. key: string // +usage=Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. operator: string // +usage=An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch. values?: [...string] }] } // +usage=Weight associated with matching the corresponding nodeSelectorTerm, in the range 1-100. weight: int }] // +usage=If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to an update), the system may or may not try to eventually evict the pod from its node. requiredDuringSchedulingIgnoredDuringExecution?: { // +usage=Required. A list of node selector terms. The terms are ORed. nodeSelectorTerms: [...{ // +usage=A list of node selector requirements by node's labels. matchExpressions?: [...{ // +usage=The label key that the selector applies to. key: string // +usage=Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. operator: string // +usage=An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. 
If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=A list of node selector requirements by node's fields. matchFields?: [...{ // +usage=The label key that the selector applies to. key: string // +usage=Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist. Gt, and Lt. operator: string // +usage=An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch. values?: [...string] }] }] } } // +usage=Describes pod affinity scheduling rules (e.g. co-locate this pod in the same node, zone, etc. as some other pod(s)). podAffinity?: { // +usage=The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding "weight" to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred. preferredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=Required. A pod affinity term, associated with the corresponding weight. podAffinityTerm: { // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. 
labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. 
Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". 
namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. topologyKey: string } // +usage=weight associated with matching the corresponding podAffinityTerm, in the range 1-100. weight: int }] // +usage=If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied. requiredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. 
A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. 
matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. topologyKey: string }] } // +usage=Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod in the same node, zone, etc. as some other pod(s)). podAntiAffinity?: { // +usage=The scheduler will prefer to schedule pods to nodes that satisfy the anti-affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. 
for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling anti-affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding "weight" to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred. preferredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=Required. A pod affinity term, associated with the corresponding weight. podAffinityTerm: { // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. 
Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. 
A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. topologyKey: string } // +usage=weight associated with matching the corresponding podAffinityTerm, in the range 1-100. weight: int }] // +usage=If the anti-affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the anti-affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied. requiredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. 
key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. 
mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. 
topologyKey: string }] } } annotations?: [string]: string args?: [...string] // +usage=The name of an existing pull secret containing the NGC_API_KEY authSecret: string command?: [...string] // +usage=DRAResources is the list of DRA resource claims to be used for the NIMService deployment or leader worker set. draResources?: [...{ // +usage=ClaimCreationSpec is the spec to auto-generate a DRA resource claim template. Only one of ClaimCreationSpec, ResourceClaimName or ResourceClaimTemplateName must be specified. claimCreationSpec?: { devices: [...{ // +usage=AttributeSelectors defines the criteria which must be satisfied by the device attributes of a device. attributeSelectors?: [...{ // +usage=Key is the name of the device attribute. This is either a qualified name or a simple name. If it is a simple name, then it is assumed to be prefixed with the DRA driver name. Eg: "gpu.nvidia.com/productName" is equivalent to "productName" if the driver name is "gpu.nvidia.com". Otherwise they're treated as 2 different attributes. key: string // +usage=Op is the operator to use for comparing the device attribute value. Supported operators are: * Equal: The device attribute value must be equal to the value specified in the selector. * NotEqual: The device attribute value must not be equal to the value specified in the selector. * GreaterThan: The device attribute value must be greater than the value specified in the selector. * GreaterThanOrEqual: The device attribute value must be greater than or equal to the value specified in the selector. * LessThan: The device attribute value must be less than the value specified in the selector. * LessThanOrEqual: The device attribute value must be less than or equal to the value specified in the selector. op: *"Equal" | "NotEqual" | "GreaterThan" | "GreaterThanOrEqual" | "LessThan" | "LessThanOrEqual" // +usage=Value is the value to compare against the device attribute. value?: { // +usage=BoolValue is a true/false value. 
boolValue?: bool // +usage=IntValue is a number. intValue?: int // +usage=StringValue is a string value. stringValue?: string // +usage=VersionValue is a semantic version according to semver.org spec 2.0.0. versionValue?: string } }] // +usage=CapacitySelectors defines the criteria which must be satisfied by the device capacity of a device. capacitySelectors?: [...{ // +usage=Key is the name of the resource. This is either a qualified name or a simple name. If it is a simple name, then it is assumed to be prefixed with the DRA driver name. Eg: "gpu.nvidia.com/memory" is equivalent to "memory" if the driver name is "gpu.nvidia.com". Otherwise they're treated as 2 different attributes. key: string // +usage=Op is the operator to use for comparing against the device capacity. Supported operators are: * Equal: The resource quantity value must be equal to the value specified in the selector. * NotEqual: The resource quantity value must not be equal to the value specified in the selector. * GreaterThan: The resource quantity value must be greater than the value specified in the selector. * GreaterThanOrEqual: The resource quantity value must be greater than or equal to the value specified in the selector. * LessThan: The resource quantity value must be less than the value specified in the selector. * LessThanOrEqual: The resource quantity value must be less than or equal to the value specified in the selector. op: *"Equal" | "NotEqual" | "GreaterThan" | "GreaterThanOrEqual" | "LessThan" | "LessThanOrEqual" // +usage=Value is the resource quantity to compare against. Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$. value: _ }] // +usage=CELExpressions is a list of CEL expressions that must be satisfied by the DRA device. celExpressions?: [...string] // +usage=Count is the number of devices to request. 
count: int // +usage=DeviceClassName references a specific DeviceClass to inherit configuration and selectors from. deviceClassName: *"gpu.nvidia.com" | string // +usage=DriverName is the name of the DRA driver providing the capacity information. Must be a DNS subdomain. Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*. driverName?: *"gpu.nvidia.com" | string // +usage=Name is the name of the device request to use in the generated claim spec. Must be a valid DNS_LABEL. Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*. name: string }] // +usage=GenerateName is an optional name prefix to use for generating the resource claim template. generateName?: string } // +usage=Requests is the list of requests in the referenced DRA resource claim to be made available to the model container of the NIMService pods. If empty, everything from the claim is made available, otherwise only the result of this subset of requests. requests?: [...string] // +usage=ResourceClaimName is the name of a DRA resource claim object in the same namespace as the NIMService. Exactly one of ResourceClaimName and ResourceClaimTemplateName must be set. Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*. resourceClaimName?: string // +usage=ResourceClaimTemplateName is the name of a DRA resource claim template object in the same namespace as the pods for this NIMService. The template will be used to create a new DRA resource claim, which will be bound to the pods created for this NIMService. Exactly one of ResourceClaimName and ResourceClaimTemplateName must be set. Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*. resourceClaimTemplateName?: string }] env?: [...{ // +usage=Name of the environment variable. Must be a C_IDENTIFIER. name: string // +usage=Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. 
If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. "$$(VAR_NAME)" will produce the string literal "$(VAR_NAME)". Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to "". value?: string // +usage=Source for the environment variable's value. Cannot be used if value is not empty. valueFrom?: { // +usage=Selects a key of a ConfigMap. configMapKeyRef?: { // +usage=The key to select. key: string // +usage=Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names name?: *"" | string // +usage=Specify whether the ConfigMap or its key must be defined optional?: bool } // +usage=Selects a field of the pod: supports metadata.name, metadata.namespace, `metadata.labels['<KEY>']`, `metadata.annotations['<KEY>']`, spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs. fieldRef?: { // +usage=Version of the schema the FieldPath is written in terms of, defaults to "v1". apiVersion?: string // +usage=Path of the field to select in the specified API version. fieldPath: string } // +usage=Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported. resourceFieldRef?: { // +usage=Container name: required for volumes, optional for env vars containerName?: string // +usage=Specifies the output format of the exposed resources, defaults to "1" Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$. 
divisor?: _ // +usage=Required: resource to select resource: string } // +usage=Selects a key of a secret in the pod's namespace secretKeyRef?: { // +usage=The key of the secret to select from. Must be a valid secret key. key: string // +usage=Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names name?: *"" | string // +usage=Specify whether the Secret or its key must be defined optional?: bool } } }] // +usage=Expose defines attributes to expose the service. expose?: { // +usage=Deprecated: Use .spec.router instead. ingress?: { annotations?: [string]: string // +usage=ingress, or virtualService - not both enabled?: bool // +usage=IngressSpec describes the Ingress the user wishes to exist. spec?: { // +usage=defaultBackend is the backend that should handle requests that don't match any rule. If Rules are not specified, DefaultBackend must be specified. If DefaultBackend is not set, the handling of requests that do not match any of the rules will be up to the Ingress controller. defaultBackend?: { // +usage=resource is an ObjectRef to another Kubernetes resource in the namespace of the Ingress object. If resource is specified, a service.Name and service.Port must not be specified. This is a mutually exclusive setting with "Service". resource?: { // +usage=APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required. apiGroup?: string // +usage=Kind is the type of resource being referenced kind: string // +usage=Name is the name of resource being referenced name: string } // +usage=service references a service as a backend. This is a mutually exclusive setting with "Resource". 
service?: { // +usage=name is the referenced service. The service must exist in the same namespace as the Ingress object. name: string // +usage=port of the referenced service. A port name or port number is required for a IngressServiceBackend. port?: { // +usage=name is the name of the port on the Service. This is a mutually exclusive setting with "Number". name?: string // +usage=number is the numerical port number (e.g. 80) on the Service. This is a mutually exclusive setting with "Name". number?: int } } } // +usage=ingressClassName is the name of an IngressClass cluster resource. Ingress controller implementations use this field to know whether they should be serving this Ingress resource, by a transitive connection (controller -> IngressClass -> Ingress resource). Although the `kubernetes.io/ingress.class` annotation (simple constant name) was never formally defined, it was widely supported by Ingress controllers to create a direct binding between Ingress controller and Ingress resources. Newly created Ingress resources should prefer using the field. However, even though the annotation is officially deprecated, for backwards compatibility reasons, ingress controllers should still honor that annotation if present. ingressClassName?: string // +usage=rules is a list of host rules used to configure the Ingress. If unspecified, or no rule matches, all traffic is sent to the default backend. rules?: [...{ // +usage=host is the fully qualified domain name of a network host, as defined by RFC 3986. Note the following deviations from the "host" part of the URI as defined in RFC 3986: 1. IPs are not allowed. Currently an IngressRuleValue can only apply to the IP in the Spec of the parent Ingress. 2. The `:` delimiter is not respected because ports are not allowed. Currently the port of an Ingress is implicitly :80 for http and :443 for https. Both these may change in the future. Incoming requests are matched against the host before the IngressRuleValue. 
If the host is unspecified, the Ingress routes all traffic based on the specified IngressRuleValue. host can be "precise" which is a domain name without the terminating dot of a network host (e.g. "foo.bar.com") or "wildcard", which is a domain name prefixed with a single wildcard label (e.g. "*.foo.com"). The wildcard character '*' must appear by itself as the first DNS label and matches only a single label. You cannot have a wildcard label by itself (e.g. Host == "*"). Requests will be matched against the Host field in the following way: 1. If host is precise, the request matches this rule if the http host header is equal to Host. 2. If host is a wildcard, then the request matches this rule if the http host header is equal to the suffix (removing the first label) of the wildcard rule. host?: string // +usage=HTTPIngressRuleValue is a list of http selectors pointing to backends. In the example: http://<host>/<path>?<searchpart> -> backend where parts of the url correspond to RFC 3986, this resource will be used to match against everything after the last '/' and before the first '?' or '#'. http?: { // +usage=paths is a collection of paths that map requests to backends. paths: [...{ // +usage=backend defines the referenced service endpoint to which the traffic will be forwarded. backend: { // +usage=resource is an ObjectRef to another Kubernetes resource in the namespace of the Ingress object. If resource is specified, a service.Name and service.Port must not be specified. This is a mutually exclusive setting with "Service". resource?: { // +usage=APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required. apiGroup?: string // +usage=Kind is the type of resource being referenced kind: string // +usage=Name is the name of resource being referenced name: string } // +usage=service references a service as a backend. 
This is a mutually exclusive setting with "Resource". service?: { // +usage=name is the referenced service. The service must exist in the same namespace as the Ingress object. name: string // +usage=port of the referenced service. A port name or port number is required for an IngressServiceBackend. port?: { // +usage=name is the name of the port on the Service. This is a mutually exclusive setting with "Number". name?: string // +usage=number is the numerical port number (e.g. 80) on the Service. This is a mutually exclusive setting with "Name". number?: int } } } // +usage=path is matched against the path of an incoming request. Currently it can contain characters disallowed from the conventional "path" part of a URL as defined by RFC 3986. Paths must begin with a '/' and must be present when using PathType with value "Exact" or "Prefix". path?: string // +usage=pathType determines the interpretation of the path matching. PathType can be one of the following values: * Exact: Matches the URL path exactly. * Prefix: Matches based on a URL path prefix split by '/'. Matching is done on a path element by element basis. A path element refers to the list of labels in the path split by the '/' separator. A request is a match for path p if every p is an element-wise prefix of p of the request path. Note that if the last element of the path is a substring of the last element in request path, it is not a match (e.g. /foo/bar matches /foo/bar/baz, but does not match /foo/barbaz). * ImplementationSpecific: Interpretation of the Path matching is up to the IngressClass. Implementations can treat this as a separate PathType or treat it identically to Prefix or Exact path types. Implementations are required to support all path types. pathType: string }] } }] // +usage=tls represents the TLS configuration. Currently the Ingress only supports a single TLS port, 443. 
If multiple members of this list specify different hosts, they will be multiplexed on the same port according to the hostname specified through the SNI TLS extension, if the ingress controller fulfilling the ingress supports SNI. tls?: [...{ // +usage=hosts is a list of hosts included in the TLS certificate. The values in this list must match the name/s used in the tlsSecret. Defaults to the wildcard host setting for the loadbalancer controller fulfilling this Ingress, if left unspecified. hosts?: [...string] // +usage=secretName is the name of the secret used to terminate TLS traffic on port 443. Field is left optional to allow TLS routing based on SNI hostname alone. If the SNI host in a listener conflicts with the "Host" header field used by an IngressRule, the SNI host is used for termination and value of the "Host" header is used for routing. secretName?: string }] } } // +usage=Service defines attributes to create a service. service?: { annotations?: [string]: string // +usage=GRPCPort is the GRPC serving port. Note: This port is only applicable for NIMs that run a Triton GRPC Inference Server. grpcPort?: int // +usage=MetricsPort is the port for metrics. Note: This port is only applicable for NIMs that run a separate metrics endpoint on Triton Inference Server. metricsPort?: int // +usage=Override the default service name name?: string // +usage=Port is the main api serving port (default: 8000) port?: int // +usage=Service Type string describes ingress methods for a service type?: string } } groupID?: int // +usage=Image defines image attributes. image: { pullPolicy?: string pullSecrets?: [...string] repository: string tag: string } // +usage=InferencePlatform specifies the inference platform to use for this NIMService. Valid values are "standalone" (default) and "kserve". inferencePlatform?: *"standalone" | "kserve" labels?: [string]: string // +usage=Probe defines attributes for startup/liveness/readiness probes. 
livenessProbe?: { enabled?: bool // +usage=Probe describes a health check to be performed against a container to determine whether it is alive or ready to receive traffic. probe?: { // +usage=Exec specifies a command to execute in the container. exec?: { // +usage=Command is the command line to execute inside the container, the working directory for the command is root ('/') in the container's filesystem. The command is simply exec'd, it is not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy. command?: [...string] } // +usage=Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1. failureThreshold?: int // +usage=GRPC specifies a GRPC HealthCheckRequest. grpc?: { // +usage=Port number of the gRPC service. Number must be in the range 1 to 65535. port: int // +usage=Service is the name of the service to place in the gRPC HealthCheckRequest (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). If this is not specified, the default behavior is defined by gRPC. service?: *"" | string } // +usage=HTTPGet specifies an HTTP GET request to perform. httpGet?: { // +usage=Host name to connect to, defaults to the pod IP. You probably want to set "Host" in httpHeaders instead. host?: string // +usage=Custom headers to set in the request. HTTP allows repeated headers. httpHeaders?: [...{ // +usage=The header field name. This will be canonicalized upon output, so case-variant names will be understood as the same header. name: string // +usage=The header field value value: string }] // +usage=Path to access on the HTTP server. path?: string // +usage=Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. port: _ // +usage=Scheme to use for connecting to the host. 
Defaults to HTTP. scheme?: string } // +usage=Number of seconds after the container has started before liveness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes initialDelaySeconds?: int // +usage=How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1. periodSeconds?: int // +usage=Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1. successThreshold?: int // +usage=TCPSocket specifies a connection to a TCP port. tcpSocket?: { // +usage=Optional: Host name to connect to, defaults to the pod IP. host?: string // +usage=Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. port: _ } // +usage=Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset. terminationGracePeriodSeconds?: int // +usage=Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes timeoutSeconds?: int } } // +usage=Metrics defines attributes to setup metrics collection. 
metrics?: { enabled?: bool // +usage=for use with the Prometheus Operator and the primary service object serviceMonitor?: { additionalLabels?: [string]: string annotations?: [string]: string // +usage=Duration is a valid time duration that can be parsed by Prometheus model.ParseDuration() function. Supported units: y, w, d, h, m, s, ms Examples: `30s`, `1m`, `1h20m15s`, `15d` Pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$. interval?: string // +usage=Duration is a valid time duration that can be parsed by Prometheus model.ParseDuration() function. Supported units: y, w, d, h, m, s, ms Examples: `30s`, `1m`, `1h20m15s`, `15d` Pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$. scrapeTimeout?: string } } // +usage=NimServiceMultiNodeConfig defines the configuration for multi-node NIMService. multiNode?: { // +usage=BackendType specifies the backend type for the multi-node NIMService. Currently only LWS is supported. backendType?: *"lws" | string // +usage=MPI config for NIMService using LeaderWorkerSet mpi?: { // +usage=MPIStartTimeout specifies the timeout in seconds for starting the cluster. mpiStartTimeout: int } // +usage=Parallelism specifies the parallelism strategy for the multi-node NIMService. parallelism: { // +usage=Pipeline specifies pipeline parallelism size for the multi-node NIMService. pipeline?: int // +usage=Tensor specifies tensor parallelism size for the multi-node NIMService. tensor?: int } } nodeSelector?: [string]: string // +usage=Deprecated: Use Affinity instead. podAffinity?: { // +usage=The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. 
for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding "weight" to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred. preferredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=Required. A pod affinity term, associated with the corresponding weight. podAffinityTerm: { // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. 
The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. 
A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. topologyKey: string } // +usage=weight associated with matching the corresponding podAffinityTerm, in the range 1-100. weight: int }] // +usage=If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied. requiredDuringSchedulingIgnoredDuringExecution?: [...{ // +usage=A label query over a set of resources, in this case pods. If it's null, this PodAffinityTerm matches with no Pods. labelSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. 
key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=MatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key in (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both matchLabelKeys and labelSelector. Also, matchLabelKeys cannot be set when labelSelector isn't set. matchLabelKeys?: [...string] // +usage=MismatchLabelKeys is a set of pod label keys to select which pods will be taken into consideration. The keys are used to lookup values from the incoming pod labels, those key-value labels are merged with `labelSelector` as `key notin (value)` to select the group of existing pods which pods will be taken into consideration for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming pod labels will be ignored. The default value is empty. The same key is forbidden to exist in both mismatchLabelKeys and labelSelector. Also, mismatchLabelKeys cannot be set when labelSelector isn't set. 
mismatchLabelKeys?: [...string] // +usage=A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means "this pod's namespace". An empty selector ({}) matches all namespaces. namespaceSelector?: { // +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed. matchExpressions?: [...{ // +usage=key is the label key that the selector applies to. key: string // +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist. operator: string // +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch. values?: [...string] }] // +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed. matchLabels?: [string]: string } // +usage=namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means "this pod's namespace". namespaces?: [...string] // +usage=This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed. 
topologyKey: string }] } // +usage=ProxySpec defines the proxy configuration for NIMService. proxy?: { certConfigMap?: string httpProxy?: string httpsProxy?: string noProxy?: string } // +usage=Probe defines attributes for startup/liveness/readiness probes. readinessProbe?: { enabled?: bool // +usage=Probe describes a health check to be performed against a container to determine whether it is alive or ready to receive traffic. probe?: { // +usage=Exec specifies a command to execute in the container. exec?: { // +usage=Command is the command line to execute inside the container, the working directory for the command is root ('/') in the container's filesystem. The command is simply exec'd, it is not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy. command?: [...string] } // +usage=Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1. failureThreshold?: int // +usage=GRPC specifies a GRPC HealthCheckRequest. grpc?: { // +usage=Port number of the gRPC service. Number must be in the range 1 to 65535. port: int // +usage=Service is the name of the service to place in the gRPC HealthCheckRequest (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). If this is not specified, the default behavior is defined by gRPC. service?: *"" | string } // +usage=HTTPGet specifies an HTTP GET request to perform. httpGet?: { // +usage=Host name to connect to, defaults to the pod IP. You probably want to set "Host" in httpHeaders instead. host?: string // +usage=Custom headers to set in the request. HTTP allows repeated headers. httpHeaders?: [...{ // +usage=The header field name. This will be canonicalized upon output, so case-variant names will be understood as the same header. 
name: string // +usage=The header field value value: string }] // +usage=Path to access on the HTTP server. path?: string // +usage=Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. port: _ // +usage=Scheme to use for connecting to the host. Defaults to HTTP. scheme?: string } // +usage=Number of seconds after the container has started before liveness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes initialDelaySeconds?: int // +usage=How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1. periodSeconds?: int // +usage=Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1. successThreshold?: int // +usage=TCPSocket specifies a connection to a TCP port. tcpSocket?: { // +usage=Optional: Host name to connect to, defaults to the pod IP. host?: string // +usage=Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME. port: _ } // +usage=Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset. 
				terminationGracePeriodSeconds?: int
				// +usage=Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
				timeoutSeconds?: int
			}
		}
		replicas?: int
		// +usage=Resources is the resource requirements for the NIMService deployment or leader worker set. Note: Only traditional resources like cpu/memory and custom device plugin resources are supported here. Any DRA claim references are ignored. Use DRAResources instead for those.
		resources?: {
			// +usage=Claims lists the names of resources, defined in spec.resourceClaims, that are used by this container. This is an alpha field and requires enabling the DynamicResourceAllocation feature gate. This field is immutable. It can only be set for containers.
			claims?: [...{
				// +usage=Name must match the name of one entry in pod.spec.resourceClaims of the Pod where this field is used. It makes that resource available inside a container.
				name: string
				// +usage=Request is the name chosen for a request in the referenced claim. If empty, everything from the claim is made available, otherwise only the result of this request.
				request?: string
			}]
			// +usage=Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
			limits?: {...}
			// +usage=Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
			requests?: {...}
		}
		router?: {
			// +usage=Annotations for the router, e.g. for ingress class or gateway
			annotations?: [string]: string
			// +usage=Gateway is the gateway to use for the created HTTPRoute.
			gateway?: {
				// +usage=HTTPRoutesEnabled is a flag to enable HTTPRoutes for the created gateway.
				httpRoutesEnabled?: bool
				// +usage=Name of the gateway
				name: string
				// +usage=Namespace of the gateway
				namespace: string
			}
			// +usage=HostDomainName is the domain name of the hostname matched by the router. The hostname is constructed as "<nimServiceName>.<namespace>.<hostDomainName>", where the <nimServiceName> a subdomain of the matched hostname. eg. example.com for "<nimServiceName>.<namespace>.example.com" Pattern: ^(([a-z0-9][a-z0-9\-]*[a-z0-9])|[a-z0-9]+\.)*([a-z]+|xn\-\-[a-z0-9]+)\.?$.
			hostDomainName?: string
			// +usage=Ingress is the ingress controller to use for the created ingress.
			ingress?: {
				// +usage=IngressClass is the ingress class to use for the created ingress.
				ingressClass: string
				// +usage=TLSSecretName is the name of the secret containing the TLS certificate and key.
				tlsSecretName?: string
			}
		}
		runtimeClassName?: string
		// +usage=Autoscaling defines attributes to automatically scale the service based on metrics.
		scale?: {
			annotations?: [string]: string
			enabled?: bool
			// +usage=HorizontalPodAutoscalerSpec defines the parameters required to setup HPA.
			hpa?: {
				// +usage=HorizontalPodAutoscalerBehavior configures the scaling behavior of the target in both Up and Down directions (scaleUp and scaleDown fields respectively).
				behavior?: {
					// +usage=scaleDown is scaling policy for scaling Down. If not set, the default value is to allow to scale down to minReplicas pods, with a 300 second stabilization window (i.e., the highest recommendation for the last 300sec is used).
					scaleDown?: {
						// +usage=policies is a list of potential scaling polices which can be used during scaling. If not set, use the default values: - For scale up: allow doubling the number of pods, or an absolute change of 4 pods in a 15s window. - For scale down: allow all pods to be removed in a 15s window.
						policies?: [...{
							// +usage=periodSeconds specifies the window of time for which the policy should hold true. PeriodSeconds must be greater than zero and less than or equal to 1800 (30 min).
							periodSeconds: int
							// +usage=type is used to specify the scaling policy.
							type: string
							// +usage=value contains the amount of change which is permitted by the policy. It must be greater than zero
							value: int
						}]
						// +usage=selectPolicy is used to specify which policy should be used. If not set, the default value Max is used.
						selectPolicy?: string
						// +usage=stabilizationWindowSeconds is the number of seconds for which past recommendations should be considered while scaling up or scaling down. StabilizationWindowSeconds must be greater than or equal to zero and less than or equal to 3600 (one hour). If not set, use the default values: - For scale up: 0 (i.e. no stabilization is done). - For scale down: 300 (i.e. the stabilization window is 300 seconds long).
						stabilizationWindowSeconds?: int
						// +usage=tolerance is the tolerance on the ratio between the current and desired metric value under which no updates are made to the desired number of replicas (e.g. 0.01 for 1%). Must be greater than or equal to zero. If not set, the default cluster-wide tolerance is applied (by default 10%). For example, if autoscaling is configured with a memory consumption target of 100Mi, and scale-down and scale-up tolerances of 5% and 1% respectively, scaling will be triggered when the actual consumption falls below 95Mi or exceeds 101Mi. This is an alpha field and requires enabling the HPAConfigurableTolerance feature gate. Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
						tolerance?: _
					}
					// +usage=scaleUp is scaling policy for scaling Up. If not set, the default value is the higher of: * increase no more than 4 pods per 60 seconds * double the number of pods per 60 seconds No stabilization is used.
					scaleUp?: {
						// +usage=policies is a list of potential scaling polices which can be used during scaling. If not set, use the default values: - For scale up: allow doubling the number of pods, or an absolute change of 4 pods in a 15s window. - For scale down: allow all pods to be removed in a 15s window.
						policies?: [...{
							// +usage=periodSeconds specifies the window of time for which the policy should hold true. PeriodSeconds must be greater than zero and less than or equal to 1800 (30 min).
							periodSeconds: int
							// +usage=type is used to specify the scaling policy.
							type: string
							// +usage=value contains the amount of change which is permitted by the policy. It must be greater than zero
							value: int
						}]
						// +usage=selectPolicy is used to specify which policy should be used. If not set, the default value Max is used.
						selectPolicy?: string
						// +usage=stabilizationWindowSeconds is the number of seconds for which past recommendations should be considered while scaling up or scaling down. StabilizationWindowSeconds must be greater than or equal to zero and less than or equal to 3600 (one hour). If not set, use the default values: - For scale up: 0 (i.e. no stabilization is done). - For scale down: 300 (i.e. the stabilization window is 300 seconds long).
						stabilizationWindowSeconds?: int
						// +usage=tolerance is the tolerance on the ratio between the current and desired metric value under which no updates are made to the desired number of replicas (e.g. 0.01 for 1%). Must be greater than or equal to zero. If not set, the default cluster-wide tolerance is applied (by default 10%). For example, if autoscaling is configured with a memory consumption target of 100Mi, and scale-down and scale-up tolerances of 5% and 1% respectively, scaling will be triggered when the actual consumption falls below 95Mi or exceeds 101Mi. This is an alpha field and requires enabling the HPAConfigurableTolerance feature gate. Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
						tolerance?: _
					}
				}
				maxReplicas: int
				metrics?: [...{
					// +usage=containerResource refers to a resource metric (such as those specified in requests and limits) known to Kubernetes describing a single container in each pod of the current scale target (e.g. CPU or memory). Such metrics are built in to Kubernetes, and have special scaling options on top of those available to normal per-pod metrics using the "pods" source.
					containerResource?: {
						// +usage=container is the name of the container in the pods of the scaling target
						container: string
						// +usage=name is the name of the resource in question.
						name: string
						// +usage=target specifies the target value for the given metric
						target: {
							// +usage=averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type
							averageUtilization?: int
							// +usage=averageValue is the target value of the average of the metric across all relevant pods (as a quantity) Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							averageValue?: _
							// +usage=type represents whether the metric type is Utilization, Value, or AverageValue
							type: string
							// +usage=value is the target value of the metric (as a quantity). Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							value?: _
						}
					}
					// +usage=external refers to a global metric that is not associated with any Kubernetes object. It allows autoscaling based on information coming from components running outside of cluster (for example length of queue in cloud messaging service, or QPS from loadbalancer running outside of cluster).
					external?: {
						// +usage=metric identifies the target metric by name and selector
						metric: {
							// +usage=name is the name of the given metric
							name: string
							// +usage=selector is the string-encoded form of a standard kubernetes label selector for the given metric When set, it is passed as an additional parameter to the metrics server for more specific metrics scoping. When unset, just the metricName will be used to gather metrics.
							selector?: {
								// +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed.
								matchExpressions?: [...{
									// +usage=key is the label key that the selector applies to.
									key: string
									// +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
									operator: string
									// +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
									values?: [...string]
								}]
								// +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
								matchLabels?: [string]: string
							}
						}
						// +usage=target specifies the target value for the given metric
						target: {
							// +usage=averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type
							averageUtilization?: int
							// +usage=averageValue is the target value of the average of the metric across all relevant pods (as a quantity) Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							averageValue?: _
							// +usage=type represents whether the metric type is Utilization, Value, or AverageValue
							type: string
							// +usage=value is the target value of the metric (as a quantity). Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							value?: _
						}
					}
					// +usage=object refers to a metric describing a single kubernetes object (for example, hits-per-second on an Ingress object).
					object?: {
						// +usage=describedObject specifies the descriptions of a object,such as kind,name apiVersion
						describedObject: {
							// +usage=apiVersion is the API version of the referent
							apiVersion?: string
							// +usage=kind is the kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
							kind: string
							// +usage=name is the name of the referent; More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
							name: string
						}
						// +usage=metric identifies the target metric by name and selector
						metric: {
							// +usage=name is the name of the given metric
							name: string
							// +usage=selector is the string-encoded form of a standard kubernetes label selector for the given metric When set, it is passed as an additional parameter to the metrics server for more specific metrics scoping. When unset, just the metricName will be used to gather metrics.
							selector?: {
								// +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed.
								matchExpressions?: [...{
									// +usage=key is the label key that the selector applies to.
									key: string
									// +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
									operator: string
									// +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
									values?: [...string]
								}]
								// +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
								matchLabels?: [string]: string
							}
						}
						// +usage=target specifies the target value for the given metric
						target: {
							// +usage=averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type
							averageUtilization?: int
							// +usage=averageValue is the target value of the average of the metric across all relevant pods (as a quantity) Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							averageValue?: _
							// +usage=type represents whether the metric type is Utilization, Value, or AverageValue
							type: string
							// +usage=value is the target value of the metric (as a quantity). Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							value?: _
						}
					}
					// +usage=pods refers to a metric describing each pod in the current scale target (for example, transactions-processed-per-second). The values will be averaged together before being compared to the target value.
					pods?: {
						// +usage=metric identifies the target metric by name and selector
						metric: {
							// +usage=name is the name of the given metric
							name: string
							// +usage=selector is the string-encoded form of a standard kubernetes label selector for the given metric When set, it is passed as an additional parameter to the metrics server for more specific metrics scoping. When unset, just the metricName will be used to gather metrics.
							selector?: {
								// +usage=matchExpressions is a list of label selector requirements. The requirements are ANDed.
								matchExpressions?: [...{
									// +usage=key is the label key that the selector applies to.
									key: string
									// +usage=operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
									operator: string
									// +usage=values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
									values?: [...string]
								}]
								// +usage=matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
								matchLabels?: [string]: string
							}
						}
						// +usage=target specifies the target value for the given metric
						target: {
							// +usage=averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type
							averageUtilization?: int
							// +usage=averageValue is the target value of the average of the metric across all relevant pods (as a quantity) Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							averageValue?: _
							// +usage=type represents whether the metric type is Utilization, Value, or AverageValue
							type: string
							// +usage=value is the target value of the metric (as a quantity). Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							value?: _
						}
					}
					// +usage=resource refers to a resource metric (such as those specified in requests and limits) known to Kubernetes describing each pod in the current scale target (e.g. CPU or memory). Such metrics are built in to Kubernetes, and have special scaling options on top of those available to normal per-pod metrics using the "pods" source.
					resource?: {
						// +usage=name is the name of the resource in question.
						name: string
						// +usage=target specifies the target value for the given metric
						target: {
							// +usage=averageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Currently only valid for Resource metric source type
							averageUtilization?: int
							// +usage=averageValue is the target value of the average of the metric across all relevant pods (as a quantity) Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							averageValue?: _
							// +usage=type represents whether the metric type is Utilization, Value, or AverageValue
							type: string
							// +usage=value is the target value of the metric (as a quantity). Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
							value?: _
						}
					}
					// +usage=type is the type of metric source. It should be one of "ContainerResource", "External", "Object", "Pods" or "Resource", each mapping to a matching field in the object.
					type: string
				}]
				minReplicas?: int
			}
		}
		schedulerName?: string
		// +usage=Probe defines attributes for startup/liveness/readiness probes.
		startupProbe?: {
			enabled?: bool
			// +usage=Probe describes a health check to be performed against a container to determine whether it is alive or ready to receive traffic.
			probe?: {
				// +usage=Exec specifies a command to execute in the container.
				exec?: {
					// +usage=Command is the command line to execute inside the container, the working directory for the command is root ('/') in the container's filesystem. The command is simply exec'd, it is not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
					command?: [...string]
				}
				// +usage=Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
				failureThreshold?: int
				// +usage=GRPC specifies a GRPC HealthCheckRequest.
				grpc?: {
					// +usage=Port number of the gRPC service. Number must be in the range 1 to 65535.
					port: int
					// +usage=Service is the name of the service to place in the gRPC HealthCheckRequest (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). If this is not specified, the default behavior is defined by gRPC.
					service?: *"" | string
				}
				// +usage=HTTPGet specifies an HTTP GET request to perform.
				httpGet?: {
					// +usage=Host name to connect to, defaults to the pod IP. You probably want to set "Host" in httpHeaders instead.
					host?: string
					// +usage=Custom headers to set in the request. HTTP allows repeated headers.
					httpHeaders?: [...{
						// +usage=The header field name. This will be canonicalized upon output, so case-variant names will be understood as the same header.
						name: string
						// +usage=The header field value
						value: string
					}]
					// +usage=Path to access on the HTTP server.
					path?: string
					// +usage=Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
					port: _
					// +usage=Scheme to use for connecting to the host. Defaults to HTTP.
					scheme?: string
				}
				// +usage=Number of seconds after the container has started before liveness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
				initialDelaySeconds?: int
				// +usage=How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1.
				periodSeconds?: int
				// +usage=Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
				successThreshold?: int
				// +usage=TCPSocket specifies a connection to a TCP port.
				tcpSocket?: {
					// +usage=Optional: Host name to connect to, defaults to the pod IP.
					host?: string
					// +usage=Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
					port: _
				}
				// +usage=Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
				terminationGracePeriodSeconds?: int
				// +usage=Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
				timeoutSeconds?: int
			}
		}
		// +usage=Storage is the target storage for caching NIM model if NIMCache is not provided
		storage?: {
			// +usage=HostPath is the host path volume for caching NIM
			hostPath?: string
			// +usage=NIMCacheVolSpec defines the spec to use NIMCache volume.
			nimCache?: {
				name?: string
				profile?: string
			}
			// +usage=PersistentVolumeClaim is the pvc volume used for caching NIM
			pvc?: {
				// +usage=Annotations for the PVC
				annotations?: [string]: string
				// +usage=Create specifies whether to create a new PersistentVolumeClaim (PVC). If set to false, an existing PVC must be referenced via the `Name` field.
				create?: bool
				// +usage=Name of the PVC to use. Required if `Create` is false (i.e., using an existing PVC).
				name?: string
				// +usage=Size of the NIM cache in Gi, used during PVC creation
				size?: string
				// +usage=StorageClass to be used for PVC creation. Leave it as empty if the PVC is already created or a default storage class is set in the cluster.
				storageClass?: string
				// +usage=SubPath is the path inside the PVC that should be mounted
				subPath?: string
				// +usage=VolumeAccessMode is the volume access mode of the PVC
				volumeAccessMode?: string
			}
			// +usage=ReadOnly mode indicates if the volume should be mounted as read-only
			readOnly?: bool
			// +usage=SharedMemorySizeLimit sets the max size of the shared memory volume (emptyDir) used by NIMs for fast model runtime I/O. Pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$.
			sharedMemorySizeLimit?: _
		}
		tolerations?: [...{
			// +usage=Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
			effect?: string
			// +usage=Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
			key?: string
			// +usage=Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
			operator?: string
			// +usage=TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
			tolerationSeconds?: int
			// +usage=Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
			value?: string
		}]
		userID?: int
	}
}
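
As a rough illustration of how a few of the optional parameters in this template fit together, the fragment below fills in `readinessProbe`, `scale` and `storage` with concrete values. It is a minimal sketch only: the field names come from the template, but every value (the probe path and port, the replica counts, the CPU target, the sizes) is a hypothetical placeholder rather than a recommended or default setting. The required `authSecret` and `image` parameters are left out.

```cue
// Illustrative values only; all numbers, paths and sizes are assumptions.
readinessProbe: {
	enabled: true
	probe: {
		httpGet: {
			path: "/v1/health/ready" // assumed health endpoint
			port: 8000               // assumed container port
		}
		initialDelaySeconds: 15
		periodSeconds:       10
		failureThreshold:    3
	}
}
scale: {
	enabled: true
	hpa: {
		minReplicas: 1
		maxReplicas: 4 // the only required field under hpa
		metrics: [{
			// type selects which sibling field is read, here "resource"
			type: "Resource"
			resource: {
				name: "cpu"
				target: {
					type:               "Utilization"
					averageUtilization: 80
				}
			}
		}]
	}
}
storage: {
	pvc: {
		create:           true // create a new PVC instead of naming an existing one
		size:             "50Gi"
		volumeAccessMode: "ReadWriteOnce"
	}
	sharedMemorySizeLimit: "16Gi"
}
```

With `create: false`, the `pvc.name` field would have to point at an existing claim, as the `Create` and `Name` descriptions in the template state.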
