Announcing Megatron for Training Trillion Parameter Models & NVIDIA Jarvis Availability

NVIDIA announced several major breakthroughs in conversational AI that will bring in a new wave of conversational AI applications.

Conversational AI is opening new ways for enterprises to interact with customers in every industry using applications like real-time transcription, translation, chatbots and virtual assistants. Building domain-specific interactive applications requires state-of-the-art models, optimizations for real time performance, and tools to adapt those models with your data. This week at GTC, NVIDIA announced several major breakthroughs in conversational AI that will bring in a new wave of conversational AI applications.


NVIDIA Megatron is a PyTorch-based framework for training giant language models based on the transformer architecture. Larger language models are helping produce superhuman-like responses and are being used in applications such as email phrase completion, document summarization and live sports commentary. The Megatron framework has also been harnessed by the University of Florida to develop GatorTron, the world’s largest clinical language model.

Highlights include:

  • Linearly scale training up to 1 trillion parameters on DGX SuperPOD with advanced optimizations and parallelization algorithms. 
  • Built on cuBLAS, NCCL, NVLINK and InfiniBand to train a language model on multi-GPU, multi-node systems
  • Improvement in throughput by more than 100x when moving from 1 billion parameter model on 32 A100 GPUs to 1T parameter on 3072 A100 GPUs
  • Achieve sustained 50% utilization of Tensor Cores.

Read the technical blog post for more details.
Megatron is available on GitHub.


NVIDIA also announced new achievements for Jarvis, a fully accelerated conversational AI framework, including highly accurate automatic speech recognition, real-time translation for multiple languages and text-to-speech capabilities to create expressive conversational AI agents.

Highlights include:

  • Out-of-the-box speech recognition model trained on multiple large corpus with greater than 90% accuracy
  • Transfer Learning Toolkit in TAO to finetune models on any domain
  • Real-time translation for 5 languages that run under 100ms latency per sentence
  • Expressive text-to-speech that delivers 30x higher throughput compared with Tacotron2

These new capabilities will be available in Q2 2021 as part of the ongoing beta program.

Jarvis beta currently includes state-of-the-art models pre-trained for thousands of hours on NVIDIA DGX; Transfer Learning Toolkit for adapting those models to your domain with zero coding; Optimized end-to-end speech, vision, and language pipelines that run in real-time.

To get started with Jarvis, read this introductory blog on building and deploying custom conversational AI models using Jarvis and NVIDIA Transfer Learning Toolkit. Read the technical blog post >

Next, try these sample applications for ideas on what you can build with Jarvis out-of-the-box:

  1. Jarvis Rasa assistant: End-to-end voice enabled AI assistant demonstrating integration of Jarvis Speech and Rasa
  2. Jarvis Contact App: Peer-to-peer video chat with streaming transcription and named entity recognition
  3. Question Answering: Build a QA system with a few lines of Python code using read-to-use Jarvis NLP service 

Join us at NVIDIA GTC for free on April 13th for our session “Building and Deploying a Custom Conversational AI App with NVIDIA Transfer Learning Toolkit and Jarvis” to learn more.

Leave a Reply

Your email address will not be published.