Developing a Question Answering Application Quickly Using NVIDIA Riva

There is a high chance that you have asked your smart speaker a question like, “How tall is Mount Everest?” If you did, it probably said, “Mount Everest is 29,032 feet above sea level.” Have you ever wondered how it found an answer for you? Question answering (QA) is loosely defined as a system consisting … Continued

There is a high chance that you have asked your smart speaker a question like, “How tall is Mount Everest?” If you did, it probably said, “Mount Everest is 29,032 feet above sea level.” Have you ever wondered how it found an answer for you?

Question answering (QA) is loosely defined as a system consisting of information retrieval (IR) and natural language processing (NLP), which is concerned with answering questions posed by humans in a natural language. If you are not familiar with information retrieval, it is a technique to obtain relevant information to a query, from a pool of resources, webpages, or documents in the database, for example. The easiest way to understand the concept is the search engine that you use daily. 

You then need an NLP system to find an answer within the IR system that is relevant to the query. Although I just listed what you need for building a QA system, it is not a trivial task to build IR and NLP from scratch. Here’s how NVIDIA Riva makes it easy to develop a QA system.

Riva overview

NVIDIA Riva is an accelerated SDK for building multimodal conversational AI services that use an end-to-end deep learning pipeline. The Riva framework includes optimized services for speech, vision, and natural language understanding (NLU) tasks. In addition to providing several pretrained models for the entire pipeline of your conversational AI service, Riva is also architected for deployment at scale. In this post, I look closely into the QA function of Riva and how you can create your own QA application with it.

Riva QA function

To understand how the Riva QA function works, start with Bidirectional Encoder Representations from Transformers (BERT). It’s a transformer-based, NLP, pretraining method developed by Google in 2018, and it completely changed the field of NLP. BERT understands the contextual representation of a given word in a text. It is pretrained on a large corpus of data, including Wikipedia. 

With the pretrained BERT, a strong NLP engine, you can further fine-tune it to perform QA with many question-answer pairs like those in the Stanford Question Answering Dataset (SQuAD). The model can now find an answer for a question in natural language from a given context: sentences or paragraphs. Figure 1 shows an example of QA, where it highlights the word “gravity” as an answer to the query, “What causes precipitation to fall?”. In this example, the paragraph is the context and the successfully fine-tuned QA model returns the word “gravity” as an answer.

A paragraph describing the meteorological explanation of precipitation. It has three pairs of questions and answers at the bottom.
Figure 1. Question-answer pairs for a sample passage in the SQuAD dataset.
Source: SQuAD: 100,000+ Questions for Machine Comprehension of Text.

Create a QA system with Riva

Teams of engineers and researchers at NVIDIA deliver a quality QA function that you can use right out-of-the-box with Riva. The Riva NLP service provides a set of high-level API actions that include QA, NaturalQuery. The Wikipedia API action allows you to fetch articles posted on Wikipedia, an online encyclopedia, with a query in natural language. That’s the information retrieval system that I discussed earlier. Combining the Wikipedia API action and Riva QA function, you can create a simple QA system with a few lines of Python code. 

Start by installing the Wikipedia API for Python. Next, import the Riva NLP service API and gRPC, the underlying communication framework for Riva.

!pip install wikipedia
import wikipedia as wiki
import grpc
import riva_api.riva_nlp_pb2 as rnlp
import riva_api.riva_nlp_pb2_grpc as rnlp_srv

Now, create an input query. Use the Wikipedia API action to fetch the relevant articles and define the number of them to fetch, defined as max_articles_combine. Ask a question, “What is speech recognition?” You then print out the titles of the articles returned from the search. Finally, you add the summaries of each article into a variable: combined_summary.

input_query = "What is speech recognition?"
wiki_articles =
max_articles_combine = 3
combined_summary = ""
if len(wiki_articles) == 0:
    print("ERROR: Could not find any matching results in Wikipedia.")
    for article in wiki_articles[:min(len(wiki_articles), max_articles_combine)]:
        print(f"Getting summary for: {article}")
        combined_summary += "n" + wiki.summary(article)
The figure shows the printed output of the Python code run, three articles related to speech recognition.
Figure 2. Titles of articles fetched by Wikipedia API action.

Next, open a gRPC channel that points to the location where the Riva server is running. Because you are running the Riva server locally, it is ‘localhost:50051‘. Then, instantiate NaturalQueryRequest, and send a request to the Riva server, passing both the query and the context. Finally, print the response, returned from the Riva server.

channel = grpc.insecure_channel('localhost:50051')
riva_nlp = rnlp_srv.RivaLanguageUnderstandingStub(channel)req = rnlp.NaturalQueryRequest()
req.query = input_query
req.context = combined_summary
resp = riva_nlp.NaturalQuery(req)

print(f"Query: {input_query}")
print(f"Answer: {resp.results[0].answer}")
The output of the Python code run, the query and answer, generated by the Riva QA function.
Figure 3. Example query and answer. 


With Riva QA and the Wikipedia API action, you just created a simple QA application. If there’s an article in Wikipedia that is relevant to your query, you can theoretically find answers. Imagine that you have a database full of articles relevant to your domain, company, industry, or anything of interest. You can create a QA service that can find answers to the questions specific to your field of interest. Obviously, you would need an IR system that would fetch relevant articles from your database, like the Wikipedia API action used in this post. When you have the IR system in your pipeline, Riva can help you find an answer for you. We look forward to the cool applications that you’ll create with Riva. .

Leave a Reply

Your email address will not be published. Required fields are marked *