NVIDIA Riva for Developers
NVIDIA® Riva is a set of GPU-accelerated multilingual speech and translation microservices for building fully customizable, real-time conversational AI pipelines. Riva includes automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT) and is deployable in all clouds, in data centers, at the edge, or in embedded devices. With Riva, organizations can add speech and translation capabilities with large language models (LLMs) and retrieval-augmented generation (RAG) to transform chatbots into powerful multilingual assistants and avatars.
How NVIDIA Riva Works
Speech and translation AI microservices convert spoken words into text (speech recognition), written language into spoken words (speech synthesis), and spoken or written words from one language to another (translation). The pretrained models are trained on vast datasets and can be fine-tuned on custom datasets to accelerate the development of domain-specific models. Fully containerized, these microservices are optimized for both real-time performance and high-throughput offline processing, on premises or in the cloud, and can quickly scale to hundreds or thousands of parallel streams.
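As a rough illustration of how the three conversions compose, the sketch below chains recognition, translation, and synthesis into speech-to-text and speech-to-speech translation. It is a minimal sketch under stated assumptions: `SpeechPipeline`, the stage signatures, and the `fake_*` functions are hypothetical names made up for this example and are not part of the Riva client API. In a real deployment each stage would call the corresponding microservice.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures for illustration only; NOT Riva client APIs.
Recognizer = Callable[[bytes], str]          # speech recognition: audio -> text
Translator = Callable[[str, str, str], str]  # translation: text, source, target -> text
Synthesizer = Callable[[str, str], bytes]    # speech synthesis: text, language -> audio


@dataclass
class SpeechPipeline:
    """Composes three independent stages into translation pipelines."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def speech_to_text_translation(self, audio: bytes, source: str, target: str) -> str:
        # Recognition followed by translation.
        transcript = self.recognize(audio)
        return self.translate(transcript, source, target)

    def speech_to_speech_translation(self, audio: bytes, source: str, target: str) -> bytes:
        # Recognition, translation, then synthesis in the target language.
        translated = self.speech_to_text_translation(audio, source, target)
        return self.synthesize(translated, target)


# Toy stand-ins so the sketch runs without any server or model.
def fake_recognize(audio: bytes) -> str:
    # Pretends the audio bytes are already the transcript.
    return audio.decode("utf-8")


def fake_translate(text: str, source: str, target: str) -> str:
    # One-entry glossary; unknown text is returned unchanged.
    glossary = {("en", "es"): {"hello": "hola"}}
    return glossary.get((source, target), {}).get(text, text)


def fake_synthesize(text: str, language: str) -> bytes:
    # Tags the text with its language instead of producing real audio.
    return f"[{language}] {text}".encode("utf-8")


pipeline = SpeechPipeline(fake_recognize, fake_translate, fake_synthesize)
```

Keeping each stage behind a plain callable means any one of them can be swapped (for example, a different translation backend) without touching the other two.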
Quick-Start Guide
Get step-by-step instructions for deploying pretrained models and interacting with them.
Real-World Use Cases
See how to use Riva for multilingual transcription, translation, and voice.
Ways to Get Started With NVIDIA Riva
Use the right tools and technologies to build and deploy fully customizable, multilingual speech and translation AI applications.
Try
Experience Riva through a UI-based portal for exploring and prototyping with NVIDIA-managed endpoints, available for free through NVIDIA's API catalog.
Deploy
Get a free license to try NVIDIA AI Enterprise in production for 90 days using your existing infrastructure.
Development Starter Kits
Start developing your speech and translation AI application with Riva by accessing tutorials, notebooks, forums, release notes, and comprehensive documentation.
Automatic Speech Recognition
Achieve high transcription accuracy for Arabic, English, French, German, Hindi, Italian, Japanese, Korean, Mandarin, Portuguese, Russian, and Spanish with state-of-the-art models pretrained on thousands of hours of audio on NVIDIA supercomputers.
Text-to-Speech
Customize TTS pipelines in English, German, Italian, Mandarin, and Spanish to get the voice and intonation you want.
Neural Machine Translation
Integrate highly accurate text-to-text, speech-to-text, or speech-to-speech translation for up to 32 languages into your conversational application pipelines.
Ethical AI
NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Always consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.