
TAO Computer Vision Inference Pipeline

Description: Scripts and utilities for getting started with TAO Computer Vision Inference Pipeline
Publisher: NVIDIA
Latest Version: v0.3-ga
Modified: April 4, 2023
Compressed Size: 1.35 MB

TAO Toolkit Computer Vision Inference Pipeline

The TAO Toolkit Computer Vision Inference Pipeline is a C++-based SDK that provides APIs to build applications that consume inferences from purpose-built pre-trained AI models. The underlying framework provides a foundation for building multimodal applications. For example, the Gaze Estimation sample application combines Face Detection and Facial Landmarks (Fiducial Keypoints) Estimation.

The TAO Toolkit Computer Vision Inference Pipeline is made up of three key components:

  • NVIDIA Triton Inference Server: Hosts and serves AI models.
  • NVIDIA TAO Converter: Converts pre-trained TAO models into highly optimized TensorRT models.
  • Inference Client: Samples written in C++ that demonstrate usage of the APIs to request Computer Vision inferences.

The purpose-built AI models supported by this Inference Pipeline are as follows:

  • Emotion Classification
  • Face Detection
  • Facial Landmarks Estimation
  • Gaze Estimation
  • Gesture Classification
  • Heart Rate Estimation

Developers can retrain these supported TAO models and easily consume their inferences using the C++ API.

The Client provides example applications that consume video handles and visualize the inferences for each of the above AI models. There is also an example application for Body Pose Estimation, which is leveraged for Gesture Classification.

Running TAO Toolkit Computer Vision Inference Pipeline

Quick Start scripts for the TAO Toolkit Computer Vision Inference Pipeline are hosted on the NVIDIA GPU Cloud (NGC) and can be pulled using the NGC CLI tool.

These scripts will automatically pull containers and models for x86 or aarch64 (Jetson).

Configure the NGC API Key

Along with downloading the NGC CLI tool, users must set up an NGC API Key. During setup, users will be prompted to enter the key and the org nvidia.
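
Assuming the NGC CLI is installed, the key can be configured with the standard ngc config set command (a minimal sketch; the exact prompts may vary by CLI version):

ngc config set
# When prompted, paste your API key and enter nvidia as the org.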

Download the TAO Toolkit Computer Vision Quick Start Scripts

Using the NGC CLI tool, download the Quick Start via:

ngc registry resource download-version "nvidia/tao/tao_cv_inference_pipeline_quick_start:v0.3-ga"

This will download the Quick Start Scripts, documentation for the TAO CV API and samples, and third-party license information.
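
The NGC CLI extracts the files into a versioned folder; assuming the default naming convention, entering it would look like:

cd tao_cv_inference_pipeline_quick_start_vv0.3-ga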

Using the Quick Start Scripts

Inside the downloaded folder, enter the scripts directory and ensure the scripts are executable:

cd scripts
chmod +x *.sh

Configuring

General configuration for the containers deployed by the Quick Start scripts is defined in the file config.sh. By default, the configuration file is set to launch all available containers on the supported GPU, which is selected automatically based on the system architecture.

If you would like to use a video handle, ensure your video device handle (for example, /dev/video0) has been entered in config.sh to make it discoverable to the relevant Client container.

NOTE: Make note of the resolutions and FPS supported by your video handle (e.g., using the command v4l2-ctl --list-formats-ext).
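
For example, assuming /dev/video0 is the handle:

v4l2-ctl --device=/dev/video0 --list-formats-ext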

Models are automatically downloaded to the host machine at the location (absolute path) specified by the variable models_location inside the config.sh. This location becomes important in the context of retraining and replacing the TensorRT models.

By default, deployable TAO models come encrypted with their own keys. The keys listed in the config are specific to the models that exist on NGC. They do not need to be modified unless a user wishes to work with retrained and re-encrypted TAO models.

Also inside config.sh is a field to specify a volume mount for the sample applications. This is useful when modifying the sample applications, since the new source is saved to the host machine rather than inside the container (where exiting the container can result in loss of modifications).

All of the configuration options are documented within the configuration file itself.
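
For illustration only, such a configuration might contain entries like the following (models_location is named in this document; the other variable names are assumptions, so consult the actual config.sh):

# Hypothetical excerpt -- check config.sh for the real variable names.
models_location="/path/to/models"   # absolute path where TAO models are downloaded
video_device="/dev/video0"          # assumed name: video handle exposed to the Client container
volume_mount="/path/on/host"        # assumed name: host mount for sample application source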

Initialization

Run

bash tao_cv_init.sh

This script will pull all the relevant containers and models to the machine. It also downloads specific third-party dependencies into the client container and saves the resulting image.

Successful completion of this download will result in:

[INFO] Finished pulling containers and models

The script will then compile the TAO models into TensorRT models to deploy for the NVIDIA Triton Server. This step can take up to 10 minutes as it compiles all the purpose-built TAO models. Upon successful completion of this, users will see the following:

[INFO] SUCCESS: Proceed to 'tao_cv_start_server.sh'
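
As an optional sanity check, the downloaded models can be listed at the models_location set in config.sh (the path below is a placeholder; folder names follow the tao_* pattern described under Integration with TAO):

ls /path/to/models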

Launching The Server and Client Containers

Run:

bash tao_cv_start_server.sh

This will launch the NVIDIA Triton Server to allow inference requests. To verify the server has started correctly, users can check if the output shows:

I0428 03:14:46.464529 1 grpc_server.cc:1973] Started GRPCService at 0.0.0.0:8001
I0428 03:14:46.464569 1 http_server.cc:1443] Starting HTTPService at 0.0.0.0:8000
I0428 03:14:46.507043 1 http_server.cc:1458] Starting Metrics Service at 0.0.0.0:8002
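
As an additional check from the host, the Metrics Service can be queried (assuming the default port shown above):

curl -s localhost:8002/metrics | head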

To stop the server, press Ctrl+C in the relevant terminal.

Next, in another terminal, proceed to run:

bash tao_cv_start_client.sh

This will open an interactive container session with sample applications and all the necessary libraries.

For more information regarding the CV sample applications, please refer to the official documentation for the TAO CV Inference Pipeline.

Each sample application follows the format:

./path/to/binary path/to/config/file

Assuming the config files are set up correctly (video handle or file, FPS, resolution, etc.), users can run the following sample applications:

./samples/tao_cv/demo_bodypose/bodypose samples/tao_cv/demo_bodypose/demo.conf
./samples/tao_cv/demo_emotion/emotion samples/tao_cv/demo_emotion/demo.conf
./samples/tao_cv/demo_facedetect/facedetect samples/tao_cv/demo_facedetect/demo.conf
./samples/tao_cv/demo_faciallandmarks/faciallandmarks samples/tao_cv/demo_faciallandmarks/demo.conf
./samples/tao_cv/demo_gaze/gaze samples/tao_cv/demo_gaze/demo.conf
./samples/tao_cv/demo_gesture/gesture samples/tao_cv/demo_gesture/demo.conf
./samples/tao_cv/demo_heartrate/heartrate samples/tao_cv/demo_heartrate/demo.conf

Stopping

To stop active containers, run:

bash tao_cv_stop.sh

Cleaning

To clean your machine of containers and/or models that were downloaded at init, run and follow the prompts:

bash tao_cv_clean.sh

Integration with TAO

A utility script, tao_cv_compile.sh, is provided to ease the deployment of TAO models into the Inference Pipeline. The models are downloaded to the host system in the models_location specified in config.sh. Simply replace the model in the respective tao_*/ folder with the newly trained ETLT model, preserving the file name, and run one of the following:

bash tao_cv_compile.sh -m bodypose_int8 -k <encoding_key>
bash tao_cv_compile.sh -m emotion -k <encoding_key>
bash tao_cv_compile.sh -m facedetect_int8 -k <encoding_key>
bash tao_cv_compile.sh -m faciallandmarks_int8 -k <encoding_key>
bash tao_cv_compile.sh -m gaze -k <encoding_key>
bash tao_cv_compile.sh -m gesture_int8 -k <encoding_key>
bash tao_cv_compile.sh -m heartrate -k <encoding_key>

where <encoding_key> is the encoding key used for retraining. The NVIDIA Triton Server points to the models_location, so during the next tao_cv_start_server.sh call, the newly deployed TensorRT model will serve inferences.
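
For example, to deploy a retrained Emotion model (the paths and file name below are illustrative; keep whatever file name already exists in the tao_emotion/ folder):

# Illustrative sketch: overwrite the existing ETLT model, preserving its name.
cp my_emotion_retrained.etlt /path/to/models/tao_emotion/model.etlt
bash tao_cv_compile.sh -m emotion -k <encoding_key>
bash tao_cv_start_server.sh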

Documentation

The Quick Start comes with documentation covering the TAO Toolkit Computer Vision API and sample applications.

Source code for the applications is provided, along with information regarding the headers used for inferences. The chapter on nvidia::jrt::api::TAOCVAPI is the focus for developers who want to build their own AI applications.

Samples are documented in File References for samples/tao_cv/.

License

The license for the TAO Toolkit Computer Vision Inference Pipeline containers is included within the containers at workspace/TAO-CV-Inference-Pipeline-EULA.pdf. Licenses for the pre-trained models are available with the model files. By pulling and using the TAO Toolkit Computer Vision Inference Pipeline and downloading models, you accept the terms and conditions of these licenses.

Ethical AI

NVIDIA's platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model's developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.