Building Cloud-Native, AI-Powered Avatars with NVIDIA Omniverse ACE

Explore the AI technology powering Violet, the interactive avatar showcased this week in the NVIDIA GTC 2022 keynote. Learn new details about NVIDIA Omniverse Avatar Cloud Engine (ACE), a collection of cloud-native AI microservices for faster, easier deployment of interactive avatars, and NVIDIA Tokkio, a domain-specific AI reference application that leverages Omniverse ACE for creating fully autonomous interactive customer service avatars.

AI-powered avatars in the cloud

Digital assistants and avatars can take many forms, from common text-driven chatbots to fully animated digital humans and physical robots that can see and hear people. These avatars will populate virtual worlds, where they can help people create and build things, act as brand ambassadors and customer service agents, help you find something on a website, take your order at a drive-through, or recommend a retirement or insurance plan.

A real-time interactive 3D avatar can deliver a natural, engaging experience that makes people feel more comfortable. AI-based virtual assistants can also use non-verbal cues, such as your facial expressions and eye contact, to better understand your requests and intent.

Figure 1. NVIDIA Omniverse ACE drives AI-powered avatars like Violet, from the NVIDIA Tokkio reference application showcased at GTC

But building these avatar applications at scale requires a broad range of expertise, including computer graphics, AI, and DevOps. Most current methods for animating avatars rely on traditional motion-capture solutions, which are challenging to use in real-time applications.

Cutting-edge NVIDIA AI technologies, such as Omniverse Audio2Face, NVIDIA Riva, and NVIDIA Metropolis, change the game by enabling avatar motion to be driven by audio and video. Connecting character animation directly to an avatar’s conversational intelligence enables faster, easier engineering and deployment of interactive avatars at scale.

After an avatar is created, it must also be integrated into an application and deployed. This requires powerful GPUs to drive both the rendering of sophisticated 3D characters and the AI that brings them to life. Monolithic solutions are optimized for specific endpoints, while cloud-native solutions scale more readily across all endpoints, including mobile, web, and limited-compute devices such as augmented reality headsets.

NVIDIA Omniverse Avatar Cloud Engine (ACE) helps address these challenges by delivering all the necessary AI building blocks to bring intelligent avatars to life, at scale.

Omniverse ACE and AI microservices

Omniverse ACE is a collection of cloud-native AI models and microservices for building, customizing, and deploying intelligent and engaging avatars easily. These AI microservices power the backend of interactive avatars, making it possible for these virtual robots to see, perceive, intelligently converse, and provide recommendations to users.

Omniverse ACE uses Universal Scene Description (USD) and the NVIDIA Unified Compute Framework (UCF), a fully accelerated framework that enables you to combine optimized and accelerated microservices into real-time AI applications.

Every microservice has a bounded domain context (animation AI, conversational AI, vision AI, data analytics, or graphics rendering) and can be independently managed and deployed from UCF Studio.

The AI microservices include the following:

Animation AI: Omniverse Audio2Face simplifies the animation of a 3D character to match any voice-over track, helping users animate characters for games, films, or real-time digital assistants.
Conversational AI: Includes the NVIDIA Riva SDK for speech AI and the NVIDIA NeMo Megatron framework for natural language processing. These tools enable you to quickly build and deploy cutting-edge applications that deliver high accuracy, expressive voices, and real-time responses.
Vision AI: NVIDIA Metropolis enables computer vision workflows, from model development to deployment, for individual developers, higher education and research, and enterprises.
Recommendation AI: NVIDIA Merlin is an open-source framework for building high-performing recommender systems at scale. It includes libraries, methods, and tools that streamline recommender builds.

NVIDIA UCF includes validated deployment-ready microservices to accelerate application development. The abstraction of each domain from the application alleviates the need for low-level domain and platform knowledge. New and custom microservices can be created using NVIDIA SDKs.
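To make the idea of a bounded domain context concrete, here is a minimal sketch in plain Python. It is not the UCF API: the `Domain` enum copies the five domains named above, while the `Microservice` class, the `CATALOG` list, and the service names are hypothetical labels chosen for illustration. The catalog covers only a subset of the technologies listed.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    """The bounded domain contexts named in the post."""
    ANIMATION_AI = "animation AI"
    CONVERSATIONAL_AI = "conversational AI"
    VISION_AI = "vision AI"
    DATA_ANALYTICS = "data analytics"
    GRAPHICS_RENDERING = "graphics rendering"


@dataclass(frozen=True)
class Microservice:
    name: str        # hypothetical label, not a real UCF identifier
    domain: Domain   # each service belongs to exactly one domain


# Illustrative catalog; the names are assumptions, not product identifiers.
CATALOG = [
    Microservice("audio2face", Domain.ANIMATION_AI),
    Microservice("riva-asr", Domain.CONVERSATIONAL_AI),
    Microservice("riva-tts", Domain.CONVERSATIONAL_AI),
    Microservice("metropolis-vision", Domain.VISION_AI),
]


def services_in(domain: Domain) -> list[str]:
    """Return the names of catalog services that belong to one domain."""
    return [s.name for s in CATALOG if s.domain is domain]
```

Because each service carries exactly one domain, a call such as `services_in(Domain.CONVERSATIONAL_AI)` can list or replace the speech services without touching animation or vision, which is the kind of independence the bounded contexts are meant to provide.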

No-code design tools for cloud deployment

Application developers will be able to bring all these UCF-based microservices together using NVIDIA UCF Studio, a no-code application builder for creating, managing, and deploying applications to a private or public cloud of your choice.

Designs are visualized as a combination of microservice processing pipelines. Using drag-and-drop operations, you can quickly create and combine these pipelines to build powerful applications that incorporate different AI modalities, graphics, and other processing functions.

Figure 2. Example of an avatar AI workflow pipeline built with Omniverse ACE

The UCF Studio development environment includes built-in design rules and verification to ensure that applications built there are correct by construction. When complete, applications can be packaged into NVIDIA GPU-enabled containers and easily deployed to the cloud using Helm charts.
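The post does not describe which design rules UCF Studio enforces. As one plausible illustration of design-time verification, the sketch below checks that each stage in a pipeline consumes what the previous stage produces. The `Stage` class, the `verify` function, and the data-type labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    consumes: str   # hypothetical data-type label, e.g. "audio"
    produces: str


def verify(pipeline: list[Stage]) -> list[str]:
    """Return a list of wiring errors; an empty list means the design passes."""
    errors = []
    # Compare every stage with the one directly downstream of it.
    for upstream, downstream in zip(pipeline, pipeline[1:]):
        if upstream.produces != downstream.consumes:
            errors.append(
                f"{upstream.name} produces {upstream.produces!r} "
                f"but {downstream.name} consumes {downstream.consumes!r}"
            )
    return errors


good = [
    Stage("asr", "audio", "text"),
    Stage("dialog", "text", "text"),
    Stage("tts", "text", "audio"),
    Stage("animation", "audio", "face_animation"),
]
bad = [good[0], good[3]]  # text wired straight into an audio-driven stage
```

Here `verify(good)` returns an empty list, while `verify(bad)` reports the mismatch between the speech recognizer's text output and the audio input that the animation stage expects. Running such a check before packaging means a miswired design is rejected early instead of failing after deployment.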

Building Violet, the NVIDIA Tokkio avatar

NVIDIA Tokkio, showcased in the GTC keynote, represents the latest evolution of avatar development using Omniverse ACE. In the demo, Jensen Huang introduces Violet, a cloud-based, interactive customer service avatar that is fully autonomous.

Violet was developed using the NVIDIA Tokkio application workflow, which enables interactive avatars to see, perceive, converse intelligently, and provide recommendations to enhance customer service, both online and in places like restaurants and stores.

While the user interface and specific AI microservice components will continue to be refined within UCF Studio, the core process of how to create an avatar AI workflow pipeline and deploy it will remain the same. You will be able to quickly select, drag-and-drop, and switch between microservices to easily customize your avatars.

Video 1. The NVIDIA GTC demo showcased Violet, an AI-powered avatar that responds to natural speech and makes intelligent recommendations

You start with a fully rigged avatar and some basic animation that was rendered in Omniverse. With UCF Studio, you can select the necessary components to make the Violet character interactive. This example includes Riva automatic speech recognition (ASR) and text-to-speech (TTS) features to make her listen and speak, and Omniverse Audio2Face to provide the necessary animation.
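The flow of one conversational turn through these components can be sketched as follows. This is an assumption-laden illustration, not the Riva or Audio2Face API: `run_turn` and the four `fake_*` stubs are hypothetical stand-ins that let the sketch run without any NVIDIA service.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    asr: Callable[[bytes], str],
    respond: Callable[[str], str],
    tts: Callable[[str], bytes],
    animate: Callable[[bytes], list[float]],
) -> tuple[bytes, list[float]]:
    """Process one user utterance and return (reply audio, animation values)."""
    transcript = asr(audio_in)        # listen: speech to text
    reply_text = respond(transcript)  # decide what to say
    reply_audio = tts(reply_text)     # speak: text to speech
    frames = animate(reply_audio)     # animation is driven by the reply audio
    return reply_audio, frames


# Stand-in stubs (hypothetical); real services would replace each of these.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_animate(audio: bytes) -> list[float]:
    # One placeholder value per byte of audio; real output would differ.
    return [b / 255 for b in audio]


audio, frames = run_turn(b"one burger", fake_asr, fake_respond, fake_tts, fake_animate)
```

Passing each component in as a callable mirrors the pipeline idea: any single stage can be swapped without changing the order of the others.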

Then, connect Violet to a food ordering dataset to enable her to handle customer orders and queries. When you’re done, UCF Studio generates a Helm chart that can be deployed onto a Kubernetes cluster through a series of CLI commands. Now, the Violet avatar is running in the cloud, and users can interact with her through a web-based application or a physical food service kiosk.
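The post does not list the CLI commands involved. As a rough sketch, a generic Helm chart is typically installed with `helm install`, and the snippet below only builds that command as an argument list without running it. The release name, chart path, and namespace are placeholders, not names from the Tokkio workflow.

```python
import shlex


def helm_install_command(release: str, chart: str, namespace: str) -> list[str]:
    """Build (but do not run) a `helm install` invocation as an argv list."""
    return [
        "helm", "install", release, chart,
        "--namespace", namespace,
        "--create-namespace",  # create the namespace if it is missing
    ]


# All three values are placeholders chosen for illustration.
cmd = helm_install_command("violet-demo", "./violet-chart", "avatars")
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to `subprocess.run(cmd, check=True)` on a machine where Helm is installed and configured for the target cluster.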

Next, update her language model so that she can answer questions that don’t relate to food orders. The NVIDIA Tokkio application framework includes a customizable pretrained natural language processing (NLP) model built using NVIDIA NeMo Megatron. Her language model can be updated, in this case to a predeployed Megatron large language model (LLM) microservice, by going back into UCF Studio and updating the inference settings. After Violet is redeployed, she can respond to broader, open-domain questions.
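Conceptually, this update changes one setting and leaves the rest of the application alone. The sketch below models that with immutable Python dataclasses. `InferenceSettings`, `AvatarApp`, the model labels, and the endpoints are all hypothetical, and nothing here reflects how UCF Studio stores its configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InferenceSettings:
    model: str       # hypothetical model label
    endpoint: str    # hypothetical service address


@dataclass(frozen=True)
class AvatarApp:
    name: str
    nlp: InferenceSettings
    revision: int = 1


def switch_language_model(app: AvatarApp, new_nlp: InferenceSettings) -> AvatarApp:
    """Return a new app revision that differs only in its NLP settings."""
    return replace(app, nlp=new_nlp, revision=app.revision + 1)


food_orders = InferenceSettings("food-ordering-nlp", "nlp.internal:8000")
open_domain = InferenceSettings("megatron-llm", "llm.internal:8000")

violet_v1 = AvatarApp("violet", food_orders)
violet_v2 = switch_language_model(violet_v1, open_domain)
```

Because the dataclasses are frozen, `violet_v1` is left unchanged and `violet_v2` is a separate revision, which loosely mirrors redeploying the avatar with new settings while keeping the previous configuration available.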

Omniverse ACE microservices will also support avatars rendered in third-party engines. You can switch out the avatar that this NVIDIA Tokkio pipeline is driving. Back in UCF Studio, redirect the output of the Omniverse Audio2Face microservice to drive UltraViolet, an avatar created using Epic’s MetaHuman in Unreal Engine 5.

Learn more about Omniverse ACE

The more companies rely on AI-assisted virtual agents, the more they will want to ensure that users are relaxed, trusting, and comfortable interacting with these virtual agents and AI-assisted cloud applications.

With Omniverse ACE and domain-specific AI reference applications like NVIDIA Tokkio, you can more easily meet the demand for intelligent and responsive avatars like Violet. Take 3D models built and rendered with popular platforms like Unreal Engine and then connect these characters to AI microservices from Omniverse ACE to bring them to life.

Interested in Omniverse ACE and getting early access when it becomes available?

Sign up for the latest news about Omniverse ACE and developing cloud-native interactive avatars.
Apply for the NVIDIA Tokkio Early Access program.
Get started today with the NVIDIA Riva SDK and Omniverse Audio2Face application.

To learn more about Omniverse ACE and how to build and deploy interactive avatars on the cloud, add the Building the Future of Work with AI-powered Digital Humans GTC session to your calendar.
