
NGC Pre-Flight Check

Description
The Pre-Flight Check container verifies that the container runtime is set up correctly for GPUs and InfiniBand.
Publisher
NVIDIA
Latest Tag
20.11
Modified
April 1, 2024
Compressed Size
27.24 MB
Multinode Support
No
Multi-Arch Support
No

Pre-flight Check

The NGC Pre-flight Check container is a lightweight tool that verifies that the container runtime is set up correctly for GPUs and InfiniBand. Run it before running an HPC or deep learning container from the NGC catalog on your system. Its output messages can be used as a guide to troubleshoot any issues it reports.

Docker

$ docker run --rm -it --gpus all -v /dev/infiniband --cap-add IPC_LOCK nvcr.io/hpc/preflightcheck:20.11
INFO: The NVIDIA Driver was detected.
INFO: NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.51.06 Sun Jul 19 20:02:54 UTC 2020
INFO: Found CUDA driver library: /usr/lib64/libcuda.so.1
INFO: Latest CUDA supported version: 11000
INFO: Number of GPUs detected: 8
INFO: Detected Mellanox OFED version 4.6-1.0.1
INFO: Detected nv_peer_mem version 1.0-7
INFO: Number of InfiniBand devices detected: 4
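
The output lines are prefixed with INFO: or WARNING:, so a job script could count the warnings before launching a workload. The following is a minimal sketch, not part of the container: the function name count_warnings is hypothetical, and the sample text is an abbreviated copy of the output above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with the container):
# reads pre-flight check output on stdin and prints the number of
# lines that start with "WARNING:".
count_warnings() {
  # grep -c prints the match count; it exits non-zero when the count
  # is 0, so "|| true" keeps the function from failing under "set -e".
  grep -c '^WARNING:' || true
}

# Abbreviated sample of the output shown above, for illustration only.
sample_ok='INFO: The NVIDIA Driver was detected.
INFO: Number of GPUs detected: 8
INFO: Number of InfiniBand devices detected: 4'

printf '%s\n' "$sample_ok" | count_warnings   # prints 0
```

In a real script, the container output would be piped into the function in place of the sample text. That usage assumes the container can run without the -it options and that its messages go to standard output; if they go to standard error, add 2>&1 before the pipe.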

Intentional breakage examples

The InfiniBand devices are not mounted in the container (the -v /dev/infiniband option is omitted):

$ docker run --rm -it --gpus all nvcr.io/hpc/preflightcheck:20.11
INFO: The NVIDIA Driver was detected.
INFO: NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.51.06 Sun Jul 19 20:02:54 UTC 2020
INFO: Found CUDA driver library: /usr/lib64/libcuda.so.1
INFO: Latest CUDA supported version: 11000
INFO: Number of GPUs detected: 8
WARNING: No InfiniBand devices detected

GPU support is disabled (-e NVIDIA_VISIBLE_DEVICES="" is set and --gpus all is omitted):

$ docker run --rm -it -e NVIDIA_VISIBLE_DEVICES="" nvcr.io/hpc/preflightcheck:20.11
WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
INFO: Use 'docker run --gpus all' to start this container; see
INFO: https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(Native-GPU-Support)
WARNING: No InfiniBand devices detected

Singularity

$ singularity run --nv docker://nvcr.io/hpc/preflightcheck:20.11
INFO: The NVIDIA Driver was detected.
INFO: NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.51.06 Sun Jul 19 20:02:54 UTC 2020
INFO: Found CUDA driver library: /.singularity.d/libs/libcuda.so.1
INFO: Latest CUDA supported version: 11000
INFO: Number of GPUs detected: 8
INFO: Detected Mellanox OFED version 4.6-1.0.1
INFO: Detected nv_peer_mem version 1.0-7
INFO: Number of InfiniBand devices detected: 4

Intentional breakage examples

The --nv Singularity option is omitted:

$ singularity run docker://nvcr.io/hpc/preflightcheck:20.11
INFO: The NVIDIA Driver was detected.
INFO: NVRM version: NVIDIA UNIX x86_64 Kernel Module 450.51.06 Sun Jul 19 20:02:54 UTC 2020
WARNING: Unable to find CUDA driver library
WARNING: Unable to detect the latst CUDA version supported by the driver
WARNING: Unable to get list of GPUs
INFO: Detected Mellanox OFED version 4.6-1.0.1
INFO: Detected nv_peer_mem version 1.0-7
INFO: Number of InfiniBand devices detected: 4
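
In the output above the driver is detected, yet the GPU list is unavailable, so a check that looks only at the first line would miss the problem. A script could instead extract the reported GPU count and compare it with the number it expects. The following is a minimal sketch: the function name detected_gpus is hypothetical, and the pattern assumes the message wording shown in these examples.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with the container):
# reads pre-flight check output on stdin and prints the number from the
# "Number of GPUs detected" line. Prints nothing if that line is absent.
detected_gpus() {
  sed -n 's/^INFO: Number of GPUs detected: \([0-9][0-9]*\).*$/\1/p'
}

# Abbreviated sample of the failing output shown above.
sample_no_nv='INFO: The NVIDIA Driver was detected.
WARNING: Unable to get list of GPUs'

gpus=$(printf '%s\n' "$sample_no_nv" | detected_gpus)
if [ -z "$gpus" ]; then
  echo "GPU count not reported"
else
  echo "GPUs reported: $gpus"
fi
```

An empty result is treated here as a failure rather than as zero GPUs, because the check prints a warning instead of a count when it cannot list the devices.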

The --contain Singularity option is used to isolate the container, and --nv is omitted:

$ singularity run --contain docker://nvcr.io/hpc/preflightcheck:20.11
WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
INFO: Use 'singularity run --nv' to start this container; see
INFO: https://sylabs.io/guides/3.5/user-guide/gpu.html
WARNING: No InfiniBand devices detected