QuartzNet 15x5 LibriSpeech

Description: This is a checkpoint for QuartzNet 15x5 trained only on LibriSpeech (with speed perturbation) using NeMo.
Publisher: NVIDIA
Latest Version: 1
Modified: April 4, 2023
Size: 72.59 MB

Overview

This is a checkpoint for QuartzNet 15x5 trained only on LibriSpeech (with speed perturbation) using NeMo. It is the model reported in Section 4.1 (LibriSpeech) of the QuartzNet paper. It was trained for 400 epochs with Apex/Amp optimization level O0.

With greedy decoding, the model achieves a word error rate (WER) of 3.83% on LibriSpeech dev-clean, 11.08% on dev-other, 3.90% on test-clean, and 11.28% on test-other.
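
For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level WER can be computed from reference and hypothesis transcripts. It is not NeMo's implementation: the function names edit_distance and word_error_rate are hypothetical, and the sketch assumes whitespace tokenization with no text normalization.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Total word errors divided by total reference words, over all utterances."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()  # assumption: plain whitespace tokenization
        hyp_words = hyp.split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figures quoted above.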

The files included in this model are:

  • JasperEncoder-STEP-406556.pt - pretrained encoder module
  • JasperDecoderForCTC-STEP-406556.pt - pretrained decoder module
  • quartznet15x5.yaml - the config file, the same as in the NeMo repository
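
Before running an evaluation, it can help to confirm that the download is complete. The sketch below checks a directory for the three files listed above. The helper name missing_checkpoint_files is hypothetical, and the sketch assumes all three files are kept in one directory, which the evaluation script itself may not require for the config file.

```python
from pathlib import Path

# File names as listed on this page.
EXPECTED_FILES = (
    "JasperEncoder-STEP-406556.pt",
    "JasperDecoderForCTC-STEP-406556.pt",
    "quartznet15x5.yaml",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are not present in checkpoint_dir."""
    directory = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty list means every listed file was found; otherwise the list names what still needs to be downloaded.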

Documentation

The source code and developer guide are available at https://github.com/NVIDIA/NeMo.

Usage example: Download the checkpoint files and place them in a checkpoint directory. Then, run jasper_eval.py (from NeMo's ASR examples).

python jasper_eval.py --model_config=$nemo_root/nemo/examples/asr/configs/quartznet15x5.yaml --eval_datasets=test.json --load_dir=$checkpoint_dir
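
The command above expects an evaluation manifest (test.json). As a hedged sketch, the helper below writes one JSON object per line with the keys audio_filepath, duration, and text, which is the manifest layout NeMo's ASR documentation describes. The function name write_manifest and the sample path are hypothetical, and the exact fields expected by this version of jasper_eval.py should be confirmed against the NeMo repository.

```python
import json


def write_manifest(entries, manifest_path):
    """Write (audio_filepath, duration_seconds, transcript) tuples as JSON lines."""
    with open(manifest_path, "w", encoding="utf-8") as manifest:
        for audio_filepath, duration, text in entries:
            record = {
                "audio_filepath": audio_filepath,  # path to the audio file
                "duration": duration,              # length in seconds
                "text": text,                      # reference transcript
            }
            manifest.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    # Hypothetical example entry; replace with real audio paths and transcripts.
    write_manifest([("/data/sample_0001.wav", 3.2, "hello world")], "test.json")
```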