NGC | Catalog

Tacotron2 LJSpeech

Description
Model checkpoints for the Tacotron 2 model trained with NeMo.
Publisher
NVIDIA
Latest Version
2
Modified
April 4, 2023
Size
107.44 MB

Overview

This is a checkpoint for the Tacotron 2 model trained in NeMo on LJSpeech for 1200 epochs. It was trained with Apex/Amp optimization level O0 on 8 × 16 GB V100 GPUs, with a batch size of 48 per GPU for a total batch size of 384.

It contains the checkpoints for the Tacotron 2 Neural Modules and the YAML config file:

  • TextEmbedding.pt
  • Tacotron2Encoder.pt
  • Tacotron2Decoder.pt
  • Tacotron2Postnet.pt
  • tacotron2.yaml

Documentation

Refer to the documentation at https://github.com/NVIDIA/NeMo

Usage example: put the checkpoint files into a checkpoint directory and run tts_infer.py (from NeMo's TTS examples):

python tts_infer.py --model_config=$checkpoint_dir/tacotron2.yaml --eval_dataset=test.json --load_dir=$checkpoint_dir
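The --eval_dataset argument points at a NeMo-style manifest: a JSON-lines file in which each line is one JSON object describing an utterance. A minimal sketch of writing such a manifest with Python's standard library — the field names follow NeMo's manifest convention, and the audio paths, durations, and transcripts below are placeholders, not files shipped with this checkpoint:

```python
import json

# Placeholder utterances; substitute your own audio paths, durations
# (in seconds), and transcripts.
entries = [
    {
        "audio_filepath": "wavs/sample_0001.wav",
        "duration": 9.65,
        "text": "Printing, in the only sense with which we are at present concerned.",
    },
    {
        "audio_filepath": "wavs/sample_0002.wav",
        "duration": 1.90,
        "text": "in being comparatively modern.",
    },
]

# One JSON object per line, as NeMo manifest loaders expect.
with open("test.json", "w") as f:
    for entry in entries:
        f.write(json.dumps(entry) + "\n")
```

With test.json written this way, the tts_infer.py command above can consume it via --eval_dataset=test.json.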