One of the biggest names in racing is going even bigger. Performance automaker Lotus launched its first SUV, the Eletre, earlier this week. The fully electric vehicle sacrifices little in terms of speed and excels when it comes to technology. It features an immersive digital cockpit and a lengthy battery range of up to 370 miles.
Artists deploying the critically acclaimed GeForce RTX 4090 GPUs are primed to receive significant performance boosts in key creative apps. Plus, a special spook-tober edition of In the NVIDIA Studio features two talented 3D artists and their Halloween-themed creations this week.
The future is autonomous, and AI is already transforming the transportation industry. But what exactly is an autonomous vehicle and how does it work?
Autonomous vehicles are born in the data center. They require a combination of sensors, high-performance hardware, software, and high-definition mapping to operate without a human at the wheel. While the concept of this technology has existed for decades, production self-driving systems have just recently become possible due to breakthroughs in AI and compute.
Specifically, massive leaps in high-performance computing have opened new possibilities in developing, training, testing, validating, and operating autonomous vehicles. The Introduction to Autonomous Vehicles GTC session walks through these breakthroughs, how current self-driving technology works, and what’s on the horizon for intelligent transportation.
From the cloud
The deep neural networks that run in the vehicle are trained on massive amounts of driving data. They must learn how to identify and react to objects in the real world—an incredibly time-consuming and costly process.
A test fleet of 50 vehicles generates about 1.6 petabytes of data each day, which must be ingested, encoded, and stored before any further processing can be done.
Then, the data must be combed through to find scenarios useful for training, such as new situations or situations underrepresented in the current dataset. These useful frames typically amount to just 10% of the total collected data.
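As a rough illustration of that curation step, here is a minimal sketch of the kind of filter that keeps only useful frames; the scenario tags, novelty score, and threshold are hypothetical, not NVIDIA’s actual curation criteria.

```python
# Hypothetical sketch of a data-curation pass: keep only frames that look
# "useful" for training (rare or underrepresented scenarios). The scoring
# heuristics below are illustrative, not NVIDIA's actual curation logic.
from dataclasses import dataclass

@dataclass
class Frame:
    frame_id: str
    scenario_tags: set[str]   # e.g. {"night", "rain", "construction_zone"}
    novelty_score: float      # 0..1, e.g. distance to nearest training sample

# Back-of-envelope from the post: 1.6 PB per day across 50 vehicles
# is roughly 32 TB per vehicle per day before any curation.

UNDERREPRESENTED = {"construction_zone", "emergency_vehicle", "heavy_rain"}

def is_useful(frame: Frame, novelty_threshold: float = 0.8) -> bool:
    """Keep frames that cover rare scenarios or look novel to the dataset."""
    return bool(frame.scenario_tags & UNDERREPRESENTED) or frame.novelty_score >= novelty_threshold

def curate(frames: list[Frame]) -> list[Frame]:
    # In practice, only about 10% of collected frames survive this kind of filter.
    return [f for f in frames if is_useful(f)]
```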
You must then label every object in the scene, including traffic lights and signs, vehicles, pedestrians, and animals, so the DNNs can learn to identify them, and then check the labels for accuracy.
NVIDIA DGX data center solutions have made this onerous process into a streamlined operation by providing a veritable data factory for training and testing. With high-performance compute, you can automate the curation and labeling process, as well as run many DNN tests in parallel.
When a new model or set of models is ready to be deployed, you can then validate the networks by replaying the model against thousands of hours of driving scenarios in the data center. Simulation also provides the capability to test these models in the countless edge cases an autonomous vehicle could encounter in the real world.
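The replay idea can be sketched in a few lines: run a candidate model over logged scenarios and score it against recorded ground truth. The data structures and accuracy metric below are illustrative assumptions, not the actual DRIVE validation pipeline.

```python
# Minimal sketch of replay validation: run a candidate perception model over
# recorded driving scenarios and compare its outputs to labeled ground truth.
# The Scenario/LoggedFrame types and the scoring metric are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class LoggedFrame:
    sensor_data: bytes
    ground_truth_objects: list[str]

@dataclass
class Scenario:
    name: str
    frames: list[LoggedFrame]

def replay(model: Callable[[bytes], list[str]], scenarios: list[Scenario]) -> dict[str, float]:
    """Return a per-scenario detection score for a candidate model."""
    results = {}
    for scenario in scenarios:
        correct = total = 0
        for frame in scenario.frames:
            predicted = set(model(frame.sensor_data))
            expected = set(frame.ground_truth_objects)
            correct += len(predicted & expected)
            total += len(expected)
        results[scenario.name] = correct / total if total else 1.0
    return results
```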
NVIDIA DRIVE Sim is built on NVIDIA Omniverse to deliver a powerful, cloud-based simulation platform capable of generating a wide range of real-world scenarios for AV development and validation. It creates highly accurate, digital twins of real-world environments using precision map data.
It can run just the AV software, known as software-in-the-loop testing, or run that software on the same compute hardware it would use in the vehicle, known as hardware-in-the-loop testing.
You can truly tailor situations to your specific needs using the NVIDIA DRIVE Replicator tool, which can generate entirely new data. These scenarios include physically based sensor data, along with the corresponding ground truth, to complement real-world driving data and reduce the time and cost of development.
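To illustrate the idea of tailoring scenarios, here is a hypothetical parameter sweep over scenario variations. It is not the DRIVE Replicator API, only a sketch of how varied configurations could be enumerated before a simulator renders sensor data and ground truth for each one.

```python
# Illustrative sketch of parameterized scenario variation for synthetic data.
# This is not the DRIVE Replicator API; it only shows the idea of sweeping
# scenario parameters to cover edge cases alongside real-world data.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class ScenarioConfig:
    weather: str        # "clear", "rain", "fog"
    time_of_day: str    # "day", "dusk", "night"
    pedestrian_count: int

def scenario_sweep() -> list[ScenarioConfig]:
    weathers = ["clear", "rain", "fog"]
    times = ["day", "dusk", "night"]
    pedestrians = [0, 2, 10]
    return [ScenarioConfig(w, t, p) for w, t, p in product(weathers, times, pedestrians)]

# Each config would drive the simulator to render physically based sensor data
# plus the matching ground-truth labels for training and validation.
```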
To the car
Validated deep neural networks run in the vehicle on centralized, high-performance AI compute.
Redundant and diverse sensors, including camera, radar, lidar, and ultrasonics, collect data from the surrounding environment as the car drives. The DNNs use this data to detect objects and infer information to make driving decisions.
Processing this data while running multiple DNNs concurrently requires an incredibly high-performance AI platform.
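Conceptually, the in-vehicle workload looks like several networks consuming each sensor frame at once. The sketch below uses Python threads and placeholder DNNs purely for illustration; a production stack schedules real networks on dedicated accelerators with real-time guarantees.

```python
# Conceptual sketch: several perception networks consuming the same sensor
# frame concurrently. The placeholder functions stand in for real DNNs.
from concurrent.futures import ThreadPoolExecutor

def detect_objects(frame):        return {"vehicles": [], "pedestrians": []}  # placeholder DNN
def detect_lanes(frame):          return {"lanes": []}                        # placeholder DNN
def detect_traffic_lights(frame): return {"lights": []}                       # placeholder DNN

DNNS = [detect_objects, detect_lanes, detect_traffic_lights]

def process_frame(frame) -> dict:
    """Run all perception DNNs on one frame and merge their outputs."""
    merged: dict = {}
    with ThreadPoolExecutor(max_workers=len(DNNS)) as pool:
        for result in pool.map(lambda dnn: dnn(frame), DNNS):
            merged.update(result)
    return merged
```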
NVIDIA DRIVE Orin is a highly advanced, software-defined compute platform for autonomous vehicles. It delivers 254 trillion operations per second, enough to handle these functions while meeting systematic safety standards for operation on public roads.
In addition to DNNs for perception, AVs rely on maps with centimeter-level detail for accurate localization, which is the vehicle’s ability to locate itself in the world.
Proper localization requires constantly updated maps that reflect current road conditions, such as a work zone or a lane closure, so vehicles can accurately measure distances in the environment. These maps must efficiently scale across AV fleets, with fast processing and minimal data storage. Finally, they must be able to function worldwide, so AVs can operate at scale.
NVIDIA DRIVE Map is a multimodal mapping platform designed to enable the highest levels of autonomy while improving safety. It combines survey maps built by dedicated mapping vehicles with AI-based crowdsourced mapping from customer vehicles. DRIVE Map includes four localization layers—camera, lidar, radar, and GNSS—providing the redundancy and versatility required by the most advanced AI drivers.
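A toy example of why redundant layers help: if each layer produces its own pose estimate with a confidence, the estimates can be fused. The confidence-weighted average below is a deliberately simplified stand-in for the state estimators real localizers use.

```python
# Simplified sketch of fusing pose estimates from redundant localization
# layers (camera, lidar, radar, GNSS). Production localizers use proper
# state estimators; this only illustrates the value of redundancy.
from dataclasses import dataclass

@dataclass
class PoseEstimate:
    source: str        # "camera", "lidar", "radar", "gnss"
    x: float           # meters, map frame
    y: float
    confidence: float  # 0..1

def fuse(estimates: list[PoseEstimate]) -> tuple[float, float]:
    """Confidence-weighted average of the available layer estimates."""
    total = sum(e.confidence for e in estimates)
    if total == 0:
        raise ValueError("no usable localization estimate")
    x = sum(e.x * e.confidence for e in estimates) / total
    y = sum(e.y * e.confidence for e in estimates) / total
    return x, y
```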
Continuous improvement
The AV development process isn’t linear. As humans, we never stop learning, and AI operates in the same way.
Autonomous vehicles will continue to get smarter over time as the software is trained for new tasks, enhanced, tested, and validated, then updated to the vehicle over the air.
This pipeline is continuous, with data from the vehicle constantly being collected to continuously train and improve the networks, which are then fed back into the vehicle. AI is used at all stages of the real-time computing pipeline, from perception, mapping, and localization to planning and control.
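A high-level sketch of that flywheel, with every stage reduced to a stub, might look like the following; the function names are placeholders for entire pipeline stages.

```python
# High-level sketch of the data flywheel described above. Each stub stands in
# for an entire pipeline stage (curation, labeling, training, validation,
# over-the-air deployment); they exist only to make the loop concrete.
def curate_data(fleet_logs):         return [x for x in fleet_logs if x.get("useful")]
def label_data(curated):             return [{**x, "labels": []} for x in curated]
def train(models, labeled):          return {**models, "version": models["version"] + 1}
def validate(candidates):            return True  # replay + simulation gates
def deploy_over_the_air(candidates): print(f"deploying v{candidates['version']}")

def development_cycle(fleet_logs, current_models):
    labeled = label_data(curate_data(fleet_logs))
    candidates = train(current_models, labeled)
    if validate(candidates):
        deploy_over_the_air(candidates)
        return candidates
    return current_models

models = development_cycle([{"useful": True}], {"version": 1})
```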
This continuous cycle is what turns vehicles from their traditional fixed-function operation to software-defined devices. Most vehicles are as advanced as they will ever be at the point of sale. With this new software-defined architecture, automakers can continually update vehicles throughout their lives with new features and functionality.
One of China’s popular battery-electric startups now has the brains to boot. NETA Auto, a Zhejiang-based electric automaker, this week announced it will build its future electric vehicles on the NVIDIA DRIVE Orin platform. These EVs will be software defined, with automated driving and intelligent features that will be continuously upgraded via over-the-air updates.
When customers walk into a Microsoft Experience Center in New York City, Sydney or London, they’re instantly met with stunning graphics displayed on multiple screens and high-definition video walls inside a multi-story building. Built to showcase the latest technologies, Microsoft Experience Centers surround customers with vibrant, immersive graphics as they explore new products.
This spook-tacular Halloween edition of GFN Thursday features a special treat: 40% off a six-month GeForce NOW Priority Membership — get it for just $29.99 for a limited time. Several sweet new games are also joining the GeForce NOW library. Creatures of the night can now stream vampire survival game V Rising from the cloud.
Edge AI is the deployment of AI applications in devices throughout the physical world. It’s called “edge AI” because the AI computation is done near the user at the edge of the network, close to where the data is located, rather than centrally in a cloud computing facility or private data center.
Posted by Kedem Snir, Software Engineer, and Gal Elidan, Senior Staff Research Scientist, Google Research
Whether it’s a professional honing their skills or a child learning to read, coaches and educators play a key role in assessing the learner’s answer to a question in a given context and guiding them towards a goal. These interactions have unique characteristics that set them apart from other forms of dialogue, yet are not available when learners practice alone at home. In the field of natural language processing, this type of capability has not received much attention and is technologically challenging. We set out to explore how we can use machine learning to assess answers in a way that facilitates learning.
In this blog, we introduce an important natural language understanding (NLU) capability called Natural Language Assessment (NLA), and discuss how it can be helpful in the context of education. While typical NLU tasks focus on the user’s intent, NLA allows for the assessment of an answer from multiple perspectives. In situations where a user wants to know how good their answer is, NLA can offer an analysis of how close the answer is to what is expected. In situations where there may not be a “correct” answer, NLA can offer subtle insights that include topicality, relevance, verbosity, and beyond. We formulate the scope of NLA, present a practical model for carrying out topicality NLA, and showcase how NLA has been used to help job seekers practice answering interview questions with Google’s new interview prep tool, Interview Warmup.
Overview of Natural Language Assessment (NLA)
The goal of NLA is to evaluate the user’s answer against a set of expectations. Consider the following components for an NLA system interacting with students (a minimal data-structure sketch follows the list):
A question presented to the student
Expectations that define what we expect to find in the answer (e.g., a concrete textual answer, a set of topics we expect the answer to cover, conciseness)
An answer provided by the student
An assessment output (e.g., correctness, missing information, too specific or general, stylistic feedback, pronunciation, etc.)
[Optional] A context (e.g., a chapter in a book or an article)
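A minimal data-structure sketch of these components, assuming illustrative field names, could look like this:

```python
# Minimal data-structure sketch of the NLA components listed above.
# Field names and the shape of "expectations" are assumptions for illustration.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NLAInput:
    question: str
    expectations: dict             # e.g. {"answer_text": "...", "topics": [...], "concise": True}
    answer: str
    context: Optional[str] = None  # e.g. a chapter in a book or an article

@dataclass
class NLAAssessment:
    correctness: Optional[bool] = None
    missing_information: list[str] = field(default_factory=list)
    too_general: bool = False
    uncertain: bool = False
    feedback: str = ""
```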
With NLA, both the expectations about the answer and the assessment of the answer can be very broad. This enables teacher-student interactions that are more expressive and subtle. Here are two examples:
A question with a concrete correct answer: Even in situations where there is a clear correct answer, it can be helpful to assess the answer more subtly than simply correct or incorrect. Consider the following:
Context: Harry Potter and the Philosopher’s Stone
Question: “What is Hogwarts?”
Expectation: “Hogwarts is a school of Witchcraft and Wizardry” [expectation is given as text]
Answer: “I am not exactly sure, but I think it is a school.”
The answer may be missing salient details but labeling it as incorrect wouldn’t be entirely true or useful to a user. NLA can offer a more subtle understanding by, for example, identifying that the student’s answer is too general, and also that the student is uncertain.
Illustration of the NLA process from input question, answer and expectation to assessment output
This kind of subtle assessment, along with noting the uncertainty the student expressed, can be important in helping students build skills in conversational settings.
Topicality expectations: There are many situations in which a concrete answer is not expected. For example, if a student is asked an opinion question, there is no concrete textual expectation. Instead, there’s an expectation of relevance and opinionation, and perhaps some level of succinctness and fluency. Consider the following interview practice setup:
Question: “Tell me a little about yourself?”
Expectations: { “Education”, “Experience”, “Interests” } (a set of topics)
Answer: “Let’s see. I grew up in the Salinas valley in California and went to Stanford where I majored in economics but then got excited about technology so next I ….”
In this case, a useful assessment output would map the user’s answer to a subset of the topics covered, possibly along with a markup of which parts of the text relate to which topic. This can be challenging from an NLP perspective as answers can be long, topics can be mixed, and each topic on its own can be multi-faceted.
A Topicality NLA Model
In principle, topicality NLA is a standard multi-class task for which one can readily train a classifier using standard techniques. However, training data for such scenarios is scarce, and it would be costly and time-consuming to collect for each question and topic. Our solution is to break each topic into granular components that can be identified using large language models (LLMs) with straightforward, generic tuning.
We map each topic to a list of underlying questions and define that if the sentence contains an answer to one of those underlying questions, then it covers that topic. For the topic “Experience” we might choose underlying questions such as:
Where did you work?
What did you study?
…
While for the topic “Interests” we might choose underlying questions such as:
What are you interested in?
What do you enjoy doing?
…
These underlying questions are designed through an iterative manual process. Importantly, since these questions are sufficiently granular, current language models (see details below) can capture their semantics. This allows us to offer a zero-shot setting for the NLA topicality task: once trained (more on the model below), it is easy to add new questions and new topics, or adapt existing topics by modifying their underlying content expectations, without the need to collect topic-specific data. Below are the model’s predictions for the sentence “I’ve worked in retail for 3 years” for the two topics described above:
A diagram of how the model uses underlying questions to predict the topic most likely to be covered by the user’s answer.
Since an underlying question for the topic “Experience” was matched, the sentence would be classified as “Experience”.
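Here is a sketch of that matching logic. The compatibility scorer below is a crude stem-overlap placeholder standing in for the trained model described later in the post; the questions and threshold are illustrative.

```python
# Sketch of the zero-shot topicality idea: a topic counts as covered if the
# sentence answers one of its underlying questions. compatibility() is a crude
# stem-overlap placeholder for the trained compatibility model.
UNDERLYING_QUESTIONS = {
    "Experience": ["Where did you work?", "What did you study?"],
    "Interests":  ["What are you interested in?", "What do you enjoy doing?"],
}
STOPWORDS = {"what", "did", "do", "you", "are", "where", "in", "i"}

def _stems(text: str) -> set[str]:
    words = (w.strip("?.,'").lower() for w in text.split())
    return {w[:4] for w in words if w and w not in STOPWORDS}

def compatibility(question: str, sentence: str) -> float:
    """Placeholder scorer; the real system uses a tuned language model."""
    q, s = _stems(question), _stems(sentence)
    return len(q & s) / max(len(q), 1)

def topics_for(sentence: str, threshold: float = 0.5) -> list[str]:
    return [
        topic for topic, questions in UNDERLYING_QUESTIONS.items()
        if any(compatibility(q, sentence) >= threshold for q in questions)
    ]

print(topics_for("I've worked in retail for 3 years"))  # ['Experience']
```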
Application: Helping Job Seekers Prepare for Interviews
Interview Warmup is a new tool developed in collaboration with job seekers to help them prepare for interviews in fast-growing fields of employment such as IT Support and UX Design. It allows job seekers to practice answering questions selected by industry experts and to become more confident and comfortable with interviewing. As we worked with job seekers to understand their challenges in preparing for interviews and how an interview practice tool could be most useful, it inspired our research and the application of topicality NLA.
We build the topicality NLA model (once for all questions and topics) as follows: we train an encoder-only T5 model (EncT5 architecture) with 350 million parameters on question-answer data to predict the compatibility of an <underlying question, answer> pair. We rely on data from SQuAD 2.0, which was processed to produce <question, answer, label> triplets.
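The post does not spell out the exact processing, but one plausible way to derive such triplets from the public SQuAD 2.0 format is to pair each answerable question with its gold answer as a positive and with a randomly drawn answer as a negative:

```python
# One plausible way to turn SQuAD 2.0-style records into <question, answer, label>
# triplets for a compatibility classifier. The negative-sampling choice here is
# an assumption, not the exact processing used in the post.
import json, random

def build_triplets(squad_path: str, seed: int = 0) -> list[tuple[str, str, int]]:
    random.seed(seed)
    with open(squad_path) as f:
        squad = json.load(f)
    positives, triplets = [], []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                if not qa["is_impossible"] and qa["answers"]:
                    positives.append((qa["question"], qa["answers"][0]["text"]))
    for question, answer in positives:
        triplets.append((question, answer, 1))        # compatible pair
        _, wrong_answer = random.choice(positives)    # mismatched pair (may rarely collide)
        triplets.append((question, wrong_answer, 0))
    return triplets
```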
In the Interview Warmup tool, users can switch between talking points to see which ones were detected in their answer.
The tool does not grade or judge answers. Instead it enables users to practice and identify ways to improve on their own. After a user replies to an interview question, their answer is parsed sentence-by-sentence with the Topicality NLA model. They can then switch between different talking points to see which ones were detected in their answer. We know that there are many potential pitfalls in signaling to a user that their response is “good”, especially as we only detect a limited set of topics. Instead, we keep the control in the user’s hands and only use ML to help users make their own discoveries about how to improve.
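Sketched in code, that per-sentence flow might look like the following, where the regex sentence splitter and the classify_sentence stub are placeholders for the trained topicality NLA model:

```python
# How per-sentence topicality output might be aggregated into detected talking
# points. classify_sentence() is a stub for the trained topicality NLA model.
import re

def classify_sentence(sentence: str) -> list[str]:
    # Stand-in for the model's per-sentence topic prediction.
    return ["Experience"] if "worked" in sentence.lower() else []

def detected_talking_points(answer: str) -> dict[str, list[str]]:
    points: dict[str, list[str]] = {}
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        for topic in classify_sentence(sentence):
            points.setdefault(topic, []).append(sentence)  # keep supporting sentences
    return points

print(detected_talking_points("I studied design. I worked in retail for 3 years."))
```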
So far, the tool has had great results helping job seekers around the world, including in the US, and we have recently expanded it to Africa. We plan to continue working with job seekers to iterate and make the tool even more helpful to the millions of people searching for new jobs.
A short film showing how Interview Warmup and its NLA capabilities were developed in collaboration with job seekers.
Conclusion
Natural Language Assessment (NLA) is a technologically challenging and interesting research area. It paves the way for new conversational applications that promote learning by enabling the nuanced assessment and analysis of answers from multiple perspectives. Working together with communities, from job seekers and businesses to classroom teachers and students, we can identify situations where NLA has the potential to help people learn, engage, and develop skills across an array of subjects, and we can build applications in a responsible way that empower users to assess their own abilities and discover ways to improve.
Acknowledgements
This work is made possible through a collaboration spanning several teams across Google. We’d like to acknowledge contributions from Google Research Israel, Google Creative Lab, and Grow with Google teams among others.