To train your model using mixed precision or TF32 with Tensor Cores, or using FP32, perform the following steps using the default parameters of the GNMT v2 model on the WMT16 English-German dataset. For the specifics concerning training and inference, see the Advanced section.
1. Clone the repository.
git clone https://github.com/NVIDIA/DeepLearningExamples
cd DeepLearningExamples/TensorFlow/Translation/GNMT
2. Build the GNMT v2 TensorFlow container.
bash scripts/docker/build.sh
3. Start an interactive session in the NGC container to run training/inference.
bash scripts/docker/interactive.sh
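Before moving on, it can help to confirm that the container actually sees the GPUs. A minimal check, assuming the standard NVIDIA container runtime (which provides nvidia-smi):

```shell
# Check GPU visibility inside the container; nvidia-smi ships with the
# NVIDIA container runtime. Falls back to a message if it is absent.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
    GPU_CHECK=present
else
    echo "nvidia-smi not found - are you inside the NGC container?"
    GPU_CHECK=absent
fi
```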
4. Download and preprocess the dataset.
Data will be downloaded to the data directory (on the host). The data directory is mounted to the /workspace/gnmt/data location in the Docker container.
bash scripts/wmt16_en_de.sh
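After the script finishes, a quick sanity check on the host can confirm that the data directory was populated. This is only a sketch: the exact file names depend on the preprocessing script, so it simply verifies the directory is non-empty.

```shell
# Count entries in the host-side data directory (prints 0 if the
# directory is missing or empty); exact contents depend on
# scripts/wmt16_en_de.sh.
DATA_FILES=$(ls -A data 2>/dev/null | wc -l || true)
echo "files in data/: $DATA_FILES"
```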
5. Start training.
All results and logs are saved to the results directory (on the host) or to the /workspace/gnmt/results directory (in the container). The training script saves a checkpoint after every training epoch and after every 2000 training steps within each epoch. You can modify the results directory using the --output_dir argument.
To launch mixed precision training on 1 GPU, run:
python nmt.py --output_dir=results --batch_size=128 --learning_rate=5e-4 --amp
To launch mixed precision training on 8 GPUs, run:
python nmt.py --output_dir=results --batch_size=1024 --num_gpus=8 --learning_rate=2e-3 --amp
To launch FP32 (TF32 on NVIDIA Ampere GPUs) training on 1 GPU, run:
python nmt.py --output_dir=results --batch_size=128 --learning_rate=5e-4
To launch FP32 (TF32 on NVIDIA Ampere GPUs) training on 8 GPUs, run:
python nmt.py --output_dir=results --batch_size=1024 --num_gpus=8 --learning_rate=2e-3
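Checkpoints land in the --output_dir given above. A small sketch for inspecting them, assuming TensorFlow's standard checkpoint naming (an index file plus numbered *.ckpt-* shards):

```shell
# Count checkpoint shards saved under results/; prints 0 before the
# first checkpoint has been written.
CKPT_COUNT=$(ls results 2>/dev/null | grep -c 'ckpt' || true)
echo "checkpoint files in results/: $CKPT_COUNT"
```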
6. Start evaluation.
The training process automatically runs evaluation and outputs the BLEU score after each training epoch. Additionally, after training is done, you can manually run inference on the test dataset with the checkpoint saved during training.
To launch mixed precision inference on 1 GPU, run:
python nmt.py --output_dir=results --infer_batch_size=128 --mode=infer --amp
To launch FP32 (TF32 on NVIDIA Ampere GPUs) inference on 1 GPU, run:
python nmt.py --output_dir=results --infer_batch_size=128 --mode=infer
7. Start translation.
After training is done, you can translate custom sentences using the checkpoint saved during training.
echo "The quick brown fox jumps over the lazy dog" >file.txt
python nmt.py --output_dir=results --mode=translate --translate_file=file.txt
cat file.txt.trans
Der schnelle braune Fuchs springt über den faulen Hund
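The same mode handles multi-line input: each line of the input file yields one translated line in the .trans output, as in the single-sentence example above. A sketch (the guard only makes it safe to paste outside the repository root; sentences.txt is a hypothetical file name):

```shell
# Prepare a two-sentence input file; nmt.py writes sentences.txt.trans
# with one translated line per input line.
printf 'Hello world\nGood morning\n' > sentences.txt
if [ -f nmt.py ]; then
    # Run from the repository root inside the container.
    python nmt.py --output_dir=results --mode=translate --translate_file=sentences.txt
    cat sentences.txt.trans
fi
```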