Running Kubernetes Pods Requesting a Specific GPU Type

Kubernetes and OpenShift allow you to schedule Pods that have access to GPU accelerator resources. It is also possible to request a specific kind of accelerator.

This can be done on any resource type that lets you define container specifications, such as Jobs, Pods, and Deployments (a Deployment variant is sketched after the Job example below).

You therefore specify the number of GPUs requested in the container's resource limits section and the specific GPU type in the Pod's node selector.

The following is an example of a Job requesting one NVIDIA V100 GPU.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-gpu-access
spec:
  template:
    spec:
      containers:
        - name: job-with-gpu-access
          image: <image>:<tag>
          resources:
            limits:
              # GPUs are extended resources and must be set in the limits section
              nvidia.com/gpu: "1"
      # Schedule only on nodes labeled with this GPU product
      nodeSelector:
        nvidia.com/gpu.product: Tesla-V100-PCIE-32GB
      # Job Pods need an explicit restart policy of Never or OnFailure
      restartPolicy: Never
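
The same pattern applies to the other workload types mentioned above. Below is a minimal sketch of a Deployment requesting the same GPU; the image reference is a placeholder, and the nvidia.com/gpu.product label assumes your nodes are labeled by NVIDIA GPU Feature Discovery or the GPU Operator.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-with-gpu-access
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deployment-with-gpu-access
  template:
    metadata:
      labels:
        app: deployment-with-gpu-access
    spec:
      containers:
        - name: deployment-with-gpu-access
          image: <image>:<tag>
          resources:
            limits:
              # One GPU per replica, set in limits as with the Job above
              nvidia.com/gpu: "1"
      # Restrict scheduling to nodes exposing this GPU product label
      nodeSelector:
        nvidia.com/gpu.product: Tesla-V100-PCIE-32GB

You can list the GPU product values available in your cluster with kubectl get nodes -L nvidia.com/gpu.product.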

More information on more complex scheduling, such as requesting only GPUs with a specific amount of VRAM available (useful when you know the size of the model you are trying to run inference on), can be found in the Kubernetes documentation on GPU scheduling.
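
As a rough sketch of that kind of constraint: clusters running NVIDIA GPU Feature Discovery typically expose a nvidia.com/gpu.memory node label (in MiB), and node affinity with the Gt operator can require a minimum amount of GPU memory. The exact label name and values depend on how your cluster is labeled, so check your node labels before relying on this.

apiVersion: v1
kind: Pod
metadata:
  name: pod-needing-large-gpu
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              # nvidia.com/gpu.memory is set by GPU Feature Discovery, in MiB;
              # Gt matches nodes advertising more than roughly 24 GiB of GPU memory
              - key: nvidia.com/gpu.memory
                operator: Gt
                values:
                  - "24000"
  containers:
    - name: inference
      image: <image>:<tag>
      resources:
        limits:
          nvidia.com/gpu: "1"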
