Top Related Projects
- nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
- runc: CLI tool for spawning and running containers according to the OCI specification
- containerd: An open and reliable container runtime
- moby: The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
- kubernetes: Production-Grade Container Scheduling and Management
- docker-ce: :warning: This repository is deprecated and will be archived (Docker CE itself is NOT deprecated); see https://github.com/docker/docker-ce/blob/master/README.md :warning:
Quick Overview
NVIDIA/libnvidia-container is a low-level library and a suite of utilities for configuring Linux containers with NVIDIA GPUs. It provides a stable, well-defined interface for integrating NVIDIA GPU support into container runtimes and orchestration platforms, ensuring that containerized applications can efficiently utilize NVIDIA GPUs.
Pros
- Enables seamless integration of NVIDIA GPUs into containerized environments
- Provides a consistent and reliable interface for container runtimes
- Supports a wide range of NVIDIA GPU architectures and driver versions
- Enhances security by implementing proper isolation and access control for GPU resources
Cons
- Limited to NVIDIA GPUs, not applicable for other GPU vendors
- Requires additional setup and configuration compared to standard container deployments
- May introduce complexity for users unfamiliar with GPU-accelerated containers
- Potential performance overhead in some scenarios due to the abstraction layer
Code Examples
# Example 1: List the NVIDIA driver components (devices, binaries, libraries) available to containers
nvidia-container-cli list
# Example 2: Run a container with GPU support
docker run --runtime=nvidia --gpus all nvidia/cuda:11.0-base nvidia-smi
# Example 3: Expose only specific GPUs (devices 0 and 1) to a container
docker run --runtime=nvidia --gpus '"device=0,1"' nvidia/cuda:11.0-base nvidia-smi
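Two further nvidia-container-cli subcommands are useful when checking a setup; shown here as a sketch, with output depending on the local driver:
# Example 4: Show the driver version, CUDA version and detected GPUs
nvidia-container-cli info
# Example 5: List only the driver binaries and libraries that would be mounted
nvidia-container-cli list --binaries --libraries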
Getting Started
To get started with libnvidia-container:
- Install the NVIDIA Container Toolkit (Debian/Ubuntu example shown):
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
- Configure Docker to use the NVIDIA runtime:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
- Verify the installation:
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
This should display information about the available NVIDIA GPUs in the container.
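GPU visibility can also be controlled through the NVIDIA_VISIBLE_DEVICES environment variable, which the NVIDIA runtime hook (backed by libnvidia-container) interprets; for example, to expose only the first GPU:
# Equivalent to selecting device 0 with --gpus when the NVIDIA runtime is active
docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:11.0-base nvidia-smi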
Competitor Comparisons
nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
Pros of nvidia-docker
- Higher-level abstraction, easier to use for Docker integration
- Provides a complete solution for running GPU-accelerated Docker containers
- Includes runtime and CLI tools for seamless NVIDIA GPU support in Docker
Cons of nvidia-docker
- Depends on libnvidia-container, adding an extra layer of complexity
- May have slightly higher overhead due to the additional abstraction layer
- Less flexibility for custom low-level implementations
Code Comparison
libnvidia-container (low-level C API from nvc.h):
#include <nvc.h>

struct nvc_context *ctx = nvc_context_new();
struct nvc_config *cfg = nvc_config_new();
nvc_init(ctx, cfg, NULL);                      /* initialize the library context */
struct nvc_driver_info *drv = nvc_driver_info_new(ctx, NULL);
nvidia-docker (high-level Docker CLI):
docker run --gpus all nvidia/cuda:11.0-base nvidia-smi
Summary
libnvidia-container is a low-level library providing the core functionality for NVIDIA GPU support in containers, while nvidia-docker is a higher-level tool built on top of libnvidia-container, offering a more user-friendly interface for Docker users. libnvidia-container provides more flexibility and control, while nvidia-docker simplifies the process of running GPU-accelerated containers in Docker environments.
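In day-to-day use the difference mostly shows up in how containers are launched; a brief sketch, where the first command assumes the legacy nvidia-docker2 wrapper is installed:
# Legacy wrapper: injects the NVIDIA runtime automatically
nvidia-docker run --rm nvidia/cuda:11.0-base nvidia-smi
# Native Docker flag (19.03+), backed by the same libnvidia-container stack
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi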
runc: CLI tool for spawning and running containers according to the OCI specification
Pros of runc
- Widely adopted industry standard for container runtime
- Supports a broader range of container use cases beyond GPU-specific scenarios
- More extensive community support and contributions
Cons of runc
- Lacks native GPU support and NVIDIA-specific optimizations
- May require additional configuration for GPU passthrough in containers
- Not optimized for NVIDIA hardware acceleration out of the box
Code Comparison
runc:
func (r *Runc) Create(context context.Context, id, bundle string, opts *CreateOpts) error {
    args := []string{"create", "--bundle", bundle}
    if opts != nil {
        args = append(args, opts.AdditionalArgs...)
    }
    cmd := r.command(context, append(args, id)...)
    return runOrError(cmd)
}
libnvidia-container:
nvmlDevice_t dev;
nvmlReturn_t ret = nvmlDeviceGetHandleByIndex(i, &dev);
if (ret != NVML_SUCCESS) {
    return -1;
}
ret = nvmlDeviceGetUUID(dev, uuid, sizeof(uuid));
The code snippets demonstrate the different focus areas of the two projects. runc deals with general container creation, while libnvidia-container specifically handles NVIDIA GPU device management.
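To make that split concrete, here is a rough shell sketch of how GPU support is usually layered on top of runc: runc only consumes an OCI bundle, and a GPU-aware wrapper such as nvidia-container-runtime rewrites config.json (adding a prestart hook that calls into libnvidia-container) before delegating to runc. Paths and the container ID below are illustrative.
mkdir -p /tmp/bundle/rootfs && cd /tmp/bundle
# Populate rootfs/ with a root filesystem first (e.g. from a docker export)
runc spec                       # generates a plain config.json with no GPU awareness
# A GPU-aware wrapper would inject its prestart hook into config.json here,
# then hand the bundle to runc unchanged:
runc run gpu-demo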
containerd: An open and reliable container runtime
Pros of containerd
- Broader scope and functionality as a general-purpose container runtime
- Widely adopted in the container ecosystem, used by Docker and Kubernetes
- Active development with frequent updates and community support
Cons of containerd
- Lacks built-in GPU support for NVIDIA hardware
- More complex setup and configuration for GPU-accelerated workloads
- Larger codebase and potentially higher resource overhead
Code comparison
containerd (Go):
func (c *Client) NewContainer(ctx context.Context, id string, opts ...NewContainerOpts) (Container, error) {
    ctx, done, err := c.withLease(ctx)
    if err != nil {
        return nil, err
    }
    defer done(ctx)
libnvidia-container (C):
/* Using the nvc.h API: describe an existing rootfs, then mount the driver into it */
struct nvc_container_config *cfg = nvc_container_config_new(pid, rootfs);
struct nvc_container *cnt = nvc_container_new(ctx, cfg, NULL);
nvc_driver_mount(ctx, cnt, drv);
nvc_ldcache_update(ctx, cnt);
Summary
containerd is a more versatile container runtime with broader adoption, while libnvidia-container focuses specifically on NVIDIA GPU support for containers. containerd requires additional setup for GPU workloads, whereas libnvidia-container provides native GPU integration but with a narrower scope.
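As an illustration of that extra setup, a GPU container can be started with containerd's ctr tool, which relies on the NVIDIA Container Toolkit (and thus libnvidia-container) being installed on the host; a sketch:
# Pull a CUDA image and run it with GPU 0 attached
sudo ctr image pull docker.io/nvidia/cuda:11.0-base
sudo ctr run --rm --gpus 0 docker.io/nvidia/cuda:11.0-base gpu-test nvidia-smi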
moby: The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
Pros of moby
- Broader scope and functionality as a complete container platform
- Larger community and ecosystem for support and contributions
- More extensive documentation and resources for users
Cons of moby
- Higher complexity and steeper learning curve
- Larger codebase and resource footprint
- Not specifically optimized for NVIDIA GPU support
Code Comparison
moby (Docker Engine):
func (daemon *Daemon) containerStart(container *container.Container, checkpoint string, checkpointDir string, resetRestartManager bool) (err error) {
    container.Lock()
    defer container.Unlock()
    if container.Running {
        return nil
    }
    // ... (additional code)
}
libnvidia-container:
static int
nvc_driver_load(struct nvc_context *ctx, const char *opts)
{
    int ret = -1;
    char *args[MAX_KERNEL_MODULES + 1] = {NULL};
    // ... (additional code)
}
Summary
moby (Docker Engine) is a comprehensive container platform with a large ecosystem, while libnvidia-container focuses specifically on NVIDIA GPU support for containers. moby offers broader functionality but comes with increased complexity, while libnvidia-container provides specialized GPU integration for containerized applications.
kubernetes: Production-Grade Container Scheduling and Management
Pros of kubernetes
- Widely adopted, industry-standard container orchestration platform
- Extensive ecosystem with numerous tools and integrations
- Supports multi-cloud and hybrid cloud deployments
Cons of kubernetes
- Steeper learning curve and more complex setup compared to libnvidia-container
- Requires more resources to run and manage
- May be overkill for simple container deployments or single-node setups
Code comparison
kubernetes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
libnvidia-container:
int main(int argc, char *argv[])
{
    struct nvidia_container_config config = {0};
    return nvidia_container_cli(&config, argc, argv);
}
kubernetes is a comprehensive container orchestration platform, while libnvidia-container focuses specifically on NVIDIA GPU support for containers. kubernetes offers more extensive features for managing containerized applications at scale, but libnvidia-container provides a simpler, more focused solution for GPU-enabled containers. The choice between the two depends on the specific requirements of your project and the scale of your container deployment needs.
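For reference, Kubernetes workloads normally request GPUs through the NVIDIA device plugin's nvidia.com/gpu resource rather than calling libnvidia-container directly; a minimal sketch, assuming the device plugin DaemonSet is already deployed:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:11.0-base
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF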
docker-ce: :warning: This repository is deprecated and will be archived (Docker CE itself is NOT deprecated); see https://github.com/docker/docker-ce/blob/master/README.md :warning:
Pros of docker-ce
- Broader scope and functionality, covering the entire Docker ecosystem
- Larger community and more extensive documentation
- Supports a wide range of container use cases beyond GPU-specific scenarios
Cons of docker-ce
- More complex and resource-intensive due to its comprehensive nature
- May require additional configuration for GPU support
- Less specialized for NVIDIA GPU integration
Code Comparison
libnvidia-container:
int nvidia_container_cli_load_mig(struct error *err, const char *root)
{
    int ret = -1;
    char *mig_config_path = NULL;

    mig_config_path = str_printf("%s%s", root, MIG_CONFIG_FILE);
    if (mig_config_path == NULL)
        return (-1);
docker-ce:
func (daemon *Daemon) containerStart(container *container.Container, checkpoint string, checkpointDir string, resetRestartManager bool) (err error) {
    start := time.Now()
    container.Lock()
    defer container.Unlock()
    if container.Paused {
        return fmt.Errorf("cannot start a paused container, try unpause instead")
Summary
libnvidia-container focuses specifically on NVIDIA GPU support for containers, offering a lightweight and specialized solution. docker-ce, on the other hand, provides a comprehensive container platform with broader functionality but may require additional setup for GPU integration. The choice between them depends on the specific use case and requirements of the project.
README
libnvidia-container
This repository provides a library and a simple CLI utility to automatically configure GNU/Linux containers leveraging NVIDIA hardware.
The implementation relies on kernel primitives and is designed to be agnostic of the container runtime.
Installing the library
From packages
Configure the package repository for your Linux distribution.
Install the packages:
- libnvidia-container1
- libnvidia-container-tools
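For example, on a Debian-based distribution with the repository configured, the packages can be installed with apt:
sudo apt-get update
sudo apt-get install -y libnvidia-container1 libnvidia-container-tools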
From sources
With Docker:
# Generate docker images for a supported <os><version>
make {ubuntu18.04, ubuntu16.04, debian10, debian9, centos7, amazonlinux2, opensuse-leap15.1}
# Or generate docker images for all supported distributions in the dist/ directory
make docker
The resulting images have the name nvidia/libnvidia-container/<os>:<version>
Without Docker:
make install
# Alternatively in order to customize the installation paths
DESTDIR=/path/to/root make install prefix=/usr
Using the library
Container runtime example
Refer to the nvidia-container-runtime project.
Command line example
# Setup a new set of namespaces
cd $(mktemp -d) && mkdir rootfs
sudo unshare --mount --pid --fork
# Setup a rootfs based on Ubuntu 16.04 inside the new namespaces
curl http://cdimage.ubuntu.com/ubuntu-base/releases/16.04/release/ubuntu-base-16.04.6-base-amd64.tar.gz | tar -C rootfs -xz
useradd -R $(realpath rootfs) -U -u 1000 -s /bin/bash nvidia
mount --bind rootfs rootfs
mount --make-private rootfs
cd rootfs
# Mount standard filesystems
mount -t proc none proc
mount -t sysfs none sys
mount -t tmpfs none tmp
mount -t tmpfs none run
# Isolate the first GPU device along with basic utilities
nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --no-cgroups --utility --device 0 $(pwd)
# Change into the new rootfs
pivot_root . mnt
umount -l mnt
exec chroot --userspec 1000:1000 . env -i bash
# Run nvidia-smi from within the container
nvidia-smi -L
Copyright and License
This project is released under the BSD 3-clause license.
Additionally, this project can be dynamically linked with libelf from the elfutils package (https://sourceware.org/elfutils), in which case additional terms apply.
Refer to NOTICE for more information.
Issues and Contributing
Check out the Contributing document!
- Please let us know by filing a new issue
- You can contribute by opening a pull request