Megatron GPT2 345M

Description: 345M parameter generative Megatron model
Publisher: -
Latest Version: v0.0
Modified: April 4, 2023
Size: 676.92 MB

Megatron-LM GPT2 345M

Megatron is a large, powerful transformer developed by NVIDIA. For this particular Megatron model, we trained a generative, left-to-right transformer in the style of GPT-2. The model contains 345 million parameters, arranged as 24 transformer layers with a hidden size of 1024 and 16 attention heads per layer.
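
As a rough sanity check, that parameter count follows from the listed hyperparameters. The sketch below assumes the standard GPT-2 vocabulary (50,257 tokens) and a 1,024-token context window, neither of which this card states; the exact total also shifts with vocabulary padding and bias terms.

# Approximate parameter count of a GPT-2-style decoder-only transformer.
n_layers = 24
n_heads = 16        # heads split the hidden size; they add no extra parameters
d_model = 1024
vocab_size = 50257  # assumption: standard GPT-2 BPE vocabulary
n_positions = 1024  # assumption: standard GPT-2 context length

# Token embeddings plus learned position embeddings.
embeddings = vocab_size * d_model + n_positions * d_model

# Per layer: QKV + output projections (4 * d^2) plus the 4x-wide
# MLP (8 * d^2), ignoring biases and layernorm parameters.
per_layer = 12 * d_model ** 2

total = embeddings + n_layers * per_layer
print(f"~{total / 1e6:.0f}M parameters")  # ~355M, i.e. the nominal "345M" size class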

This model was trained on text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories.

Find more information at our repo: https://github.com/NVIDIA/Megatron-LM
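
The checkpoint is distributed in Megatron-LM's native format, so using it directly requires the code from the repo above. Purely as an illustration, here is a minimal sketch of sampling from the model with Hugging Face Transformers, assuming the checkpoint has first been converted to Hugging Face GPT-2 format and saved locally (the directory name is hypothetical):

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hypothetical local path to a checkpoint already converted to HF format.
model_dir = "./megatron-gpt2-345m-hf"
model = GPT2LMHeadModel.from_pretrained(model_dir)
tokenizer = GPT2Tokenizer.from_pretrained(model_dir)

# Left-to-right generation: the model continues the prompt token by token.
inputs = tokenizer("Megatron is", return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=40, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))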