BERT on Google Cloud AI Platform

Description: Fine-tune a pre-trained BERT model with the SQuAD dataset, optimize it for inference with TensorRT, and deploy it with Triton Inference Server on Google Cloud AI Platform using custom containers.
Publisher: NVIDIA
Latest Version: v1.0
Modified: April 4, 2023
Compressed Size: 1.14 MB

Description

This repo contains code and scripts that perform the following tasks on Google Cloud AI Platform, using assets from NGC:

  • Fine-tuning a BERT model on the SQuAD dataset
  • Optimizing the fine-tuned BERT model with TensorRT (sketched below)
  • Deploying the BERT TensorRT engine with Triton Inference Server
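
As a rough illustration of the TensorRT optimization step, the sketch below builds an FP16 engine from an exported BERT ONNX model. The file names are assumptions for illustration only, and the sketch assumes the exported model has static input shapes; dynamic shapes would additionally require an optimization profile. The repo's own scripts drive this step in practice.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_engine(onnx_path="bert_squad.onnx", engine_path="bert_squad.plan"):
    # Hypothetical file names; the repo's scripts define the actual paths.
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the exported ONNX graph into a TensorRT network definition.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    # Enable FP16 so the engine uses mixed precision where beneficial.
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)

    # Serialize the optimized engine; Triton's TensorRT backend can load the plan file.
    serialized_engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized_engine)

if __name__ == "__main__":
    build_engine()
```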

Each folder, named after its task, contains an individual set of scripts and code along with a README that describes the general steps to run them. The scripts are NOT READY TO USE as-is, because they may contain commands with user-specific information such as a Google Cloud Storage bucket name or a Google Cloud project name.

Readers may refer to the demo recording that accompanies this repo for further details.

Usage Instructions

The Jupyter notebook, ngc_triton_bert_deployment/bert_on_caip.ipynb, is the best place to start, as it contains the commands to run for each task. However, it is strongly recommended to watch the demo recording, which walks through the notebook step by step, to understand the details.

For each task, refer to the individual README included in its directory to learn more about the task and its details.
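
Once the Triton Inference Server container is serving the TensorRT engine, it can be queried with the Triton client library. The sketch below is a minimal illustration: the server URL, model name, and tensor names and shapes are assumptions, and the actual values come from the repo's Triton model configuration and the deployed AI Platform endpoint.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to the Triton HTTP endpoint (assumed to be exposed on port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

seq_len = 384  # a common max sequence length for BERT SQuAD fine-tuning
batch = {
    "input_ids": np.zeros((1, seq_len), dtype=np.int32),    # token ids of question + context
    "segment_ids": np.zeros((1, seq_len), dtype=np.int32),  # sentence A/B markers
    "input_mask": np.ones((1, seq_len), dtype=np.int32),    # attention mask
}

inputs = []
for name, data in batch.items():
    tensor = httpclient.InferInput(name, list(data.shape), "INT32")
    tensor.set_data_from_numpy(data)
    inputs.append(tensor)

# Output tensor name is an assumption; check the model's config for the real name.
outputs = [httpclient.InferRequestedOutput("logits")]

response = client.infer(model_name="bert_trt", inputs=inputs, outputs=outputs)
logits = response.as_numpy("logits")  # span start/end scores for SQuAD
print(logits.shape)
```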

License

Refer to the NVIDIA End User License Agreements included in the LICENSE file.