How to Build an Instant Machine Learning Web Application with Streamlit and FastAPI

Imagine that you’re working on a machine learning (ML) project and you’ve found your champion model. What happens next? For many, the project ends there, with their models sitting isolated in a Jupyter notebook. Others will take the initiative to convert their notebooks to scripts for somewhat production-grade code. 

Both of these outcomes restrict a project’s accessibility, requiring users to have knowledge of source-code hosting sites like GitHub and Bitbucket. A better solution is to convert your project into a prototype with a frontend that can be deployed on internal servers. 

While a prototype may not be production standard, it’s an effective technique companies use to provide stakeholders with insight into a proposed solution. This then allows the company to collect feedback and develop better iterations in the future.  

To develop a prototype, you will need:

  1. A frontend for user interaction
  2. A backend that can process requests

Both requirements can take a significant amount of time to build, however. In this tutorial, you will learn how to rapidly build your own machine learning web application using Streamlit for the frontend and FastAPI for the microservice, simplifying the process. To learn more about microservices, see Building a Machine Learning Microservice with FastAPI.

You can try the application featured in this tutorial using the code in the kurtispykes/car-evaluation-project GitHub repository.

Overview of Streamlit and FastAPI

Streamlit, an open-source app framework, aims to simplify the process of building web applications for machine learning and data science. It has gained significant traction in the applied ML community in recent years. Founded in 2018 by ex-Google engineers, Streamlit was born out of the challenges practitioners face when deploying machine learning models and dashboards. 

Using the Streamlit framework, data scientists and machine learning practitioners can build their own predictive analytics web applications in a few hours. There is no need to depend on frontend engineers or knowledge of HTML, CSS, or JavaScript, since it’s all done in Python.

FastAPI has also risen rapidly to prominence among Python developers. It’s a modern web framework, also initially released in 2018, designed to compensate in almost all areas where Flask falls flat. One of the great things about switching to FastAPI is that the learning curve is not steep, especially if you already know Flask. With FastAPI you can expect thorough documentation, short development times, simple testing, and easy deployment, all of which make it well suited to developing RESTful APIs in Python. 

By combining the power of the two frameworks, it’s possible to develop an exciting machine learning application you could share with your friends, colleagues, and stakeholders in less than a day. 

Build a full-stack machine learning application

The following steps guide you through building a simple classification model using FastAPI and Streamlit. This model evaluates whether a car is acceptable based on the following six input features: 

  • buying: The cost to buy the car
  • maint: The cost of maintenance 
  • doors: The number of doors 
  • persons: The carrying capacity (number of people) 
  • lug_boot:  The size of the luggage boot
  • safety: The estimated safety 

You can download the full Car Evaluation dataset from the UCI Machine Learning Repository.
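If you want to inspect the data before training, you can load the raw file directly with pandas. The following is a minimal sketch: the download URL and the assumption that the file has no header row reflect the standard UCI layout and are not part of the original project, so adjust them to match your local copy.

import pandas as pd

# The six input features described above, plus the target label
COLUMNS = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]

# Assumed location of the raw car.data file in the UCI repository
DATA_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data"

# The raw file is comma-separated with no header row
df = pd.read_csv(DATA_URL, names=COLUMNS)

print(df.shape)                     # expected: (1728, 7)
print(df["class"].value_counts())   # unacc, acc, good, vgood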

After you have done all of the data analysis, trained your champion model, and packaged the machine learning model, the next step is to create two dedicated services: 1) the FastAPI backend and 2) the Streamlit frontend. These two services can then be deployed in two Docker containers and orchestrated using Docker Compose.

Each service requires its own Dockerfile to assemble the Docker images. A Docker Compose YAML file is also required to define and share both container applications. The following steps work through the development of each service. 

The user interface

In the car_evaluation_streamlit package, create a simple user interface in the app.py file using Streamlit. The code below includes: 

  1. A title for the UI 
  2. A short description of the project
  3. Six interactive elements the user will use to input information about a car
  4. Class values returned by the API 
  5. A submit button that, when clicked, sends all data collected from the user to the machine learning API service as a POST request and then displays the response from the model

import requests

import streamlit as st

# Define the title
st.title("Car evaluation web application")
st.write(
    "The model evaluates a car's acceptability based on the inputs below. "
    "Pass the appropriate details about your car using the questions below "
    "to discover if your car is acceptable."
)

# Input 1
buying = st.radio(
    "What are your thoughts on the car's buying price?",
    ("vhigh", "high", "med", "low")
)

# Input 2
maint = st.radio(
    "What are your thoughts on the price of maintenance for the car?",
    ("vhigh", "high", "med", "low")
)

# Input 3
doors = st.select_slider(
    "How many doors does the car have?",
    options=["2", "3", "4", "5more"]
)

# Input 4
persons = st.select_slider(
    "How many passengers can the car carry?",
    options=["2", "4", "more"]
)

# Input 5
lug_boot = st.select_slider(
    "What is the size of the luggage boot?",
    options=["small", "med", "big"]
)

# Input 6
safety = st.select_slider(
    "What estimated level of safety does the car provide?",
    options=["low", "med", "high"]
)

# Class values to be returned by the model
class_values = {
    0: "unacceptable",
    1: "acceptable",
    2: "good",
    3: "very good"
}

# When 'Submit' is selected
if st.button("Submit"):

    # Inputs to ML model
    inputs = {
        "inputs": [
            {
                "buying": buying,
                "maint": maint,
                "doors": doors,
                "persons": persons,
                "lug_boot": lug_boot,
                "safety": safety
            }
        ]
    }

    # Post the inputs to the ML API
    response = requests.post(
        "http://host.docker.internal:8001/api/v1/predict/", json=inputs
    )
    json_response = response.json()

    prediction = class_values[json_response.get("predictions")[0]]

    st.subheader(f"This car is **{prediction}!**")

The only framework required for this service is Streamlit. In the requirements.txt file, note the version of Streamlit to install when creating the Docker image.

streamlit>=1.12.0

Now, add the Dockerfile to create the Docker image for this service:

FROM python:3.9.4

WORKDIR /opt/car_evaluation_streamlit

ADD ./car_evaluation_streamlit /opt/car_evaluation_streamlit
RUN pip install --upgrade pip
RUN pip install -r /opt/car_evaluation_streamlit/requirements.txt

EXPOSE 8501

CMD ["streamlit", "run", "app.py"]

Each instruction in the Dockerfile creates a layer, and Docker caches these layers so that subsequent builds are faster.

The REST API

A representational state transfer (REST) API is an API that follows a software architectural style enabling two applications to communicate with one another. In technical terms, a REST API transfers the state of a requested resource to the client. In this scenario, the requested resource is a prediction from the machine learning model.

The API built with FastAPI can be found in the car_evaluation_api package. Locate the app/main.py file, which is used to run the application. For more information about how the API was developed, see Building a Machine Learning Microservice with FastAPI.

from typing import Any

from fastapi import APIRouter, FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import HTMLResponse
from loguru import logger

from app.api import api_router
from app.config import settings, setup_app_logging

# setup logging as early as possible
setup_app_logging(config=settings)


app = FastAPI(
    title=settings.PROJECT_NAME, openapi_url=f"{settings.API_V1_STR}/openapi.json"
)

root_router = APIRouter()


@root_router.get("/")
def index(request: Request) -> Any:
    """Basic HTML response."""
    body = (
        "<html>"
        "<body style='padding: 10px;'>"
        "<h1>Welcome to the API</h1>"
        "<div>"
        "Check the docs: <a href='/docs'>here</a>"
        "</div>"
        "</body>"
        "</html>"
    )

    return HTMLResponse(content=body)


app.include_router(api_router, prefix=settings.API_V1_STR)
app.include_router(root_router)

# Set all CORS enabled origins
if settings.BACKEND_CORS_ORIGINS:
    app.add_middleware(
        CORSMiddleware,
        allow_origins=[str(origin) for origin in settings.BACKEND_CORS_ORIGINS],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )


if __name__ == "__main__":
    # Use this for debugging purposes only
    logger.warning("Running in development mode. Do not run like this in production.")
    import uvicorn

    uvicorn.run(app, host="localhost", port=8001, log_level="debug")

The code above defines the server, which includes three endpoints:

  • "/": An endpoint used to define a body that returns an HTML response
  • "/health": An endpoint to return the health response schema of the model 
  • "/predict": An endpoint used to serve predictions from the trained model

You will only see the "/" endpoint in the code above. This is because the "/health" and "/predict" endpoints are imported from the API module and added to the application router; a simplified sketch of that router follows. 
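For reference, a stripped-down version of that router might look like the sketch below. This is illustrative only: the schema and field names are assumptions chosen to match the payload the Streamlit app sends, and the prediction call is stubbed out, whereas the real car_evaluation_api package calls the packaged car-evaluation-model.

from typing import Any, List, Optional

from fastapi import APIRouter
from pydantic import BaseModel

api_router = APIRouter()


# Minimal request/response schemas for illustration only
class CarDetails(BaseModel):
    buying: str
    maint: str
    doors: str
    persons: str
    lug_boot: str
    safety: str


class MultipleCarDetails(BaseModel):
    inputs: List[CarDetails]


class PredictionResults(BaseModel):
    errors: Optional[Any] = None
    predictions: Optional[List[int]] = None


@api_router.get("/health", status_code=200)
def health() -> dict:
    """Confirm that the service is running."""
    return {"name": "car_evaluation_api", "api_version": "1.0.0"}


@api_router.post("/predict", response_model=PredictionResults, status_code=200)
def predict(input_data: MultipleCarDetails) -> Any:
    """Score the submitted car details."""
    # The real service passes input_data to the packaged model here;
    # a constant class index is returned to keep the sketch self-contained.
    return PredictionResults(predictions=[0 for _ in input_data.inputs])

FastAPI validates incoming requests against these Pydantic schemas automatically, which is why the Streamlit app can post a plain JSON dictionary.
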
Next, save the dependencies for the API service in the requirements.txt file:

--extra-index-url="https://repo.fury.io/kurtispykes/"
car-evaluation-model==1.0.0

uvicorn>=0.18.2
fastapi>=0.79.0
python-multipart>=0.0.5
pydantic>=1.9.1
typing_extensions>=3.10.0
loguru>=0.6.0

Note: An extra index was added to pip to install the packaged model from Gemfury.

Next, add the Dockerfile to the car_evaluation_api package.

FROM python:3.9.4

# Create the user that will run the app
RUN adduser --disabled-password --gecos '' ml-api-user

WORKDIR /opt/car_evaluation_api

ARG PIP_EXTRA_INDEX_URL

# Install requirements, including from Gemfury
ADD ./car_evaluation_api /opt/car_evaluation_api
RUN pip install --upgrade pip
RUN pip install -r /opt/car_evaluation_api/requirements.txt

RUN chmod +x /opt/car_evaluation_api/run.sh
RUN chown -R ml-api-user:ml-api-user ./

USER ml-api-user

EXPOSE 8001

CMD ["bash", "./run.sh"]

You have now created both services, along with the instructions to build a container image for each. 

The next step is to wire the containers together so you can start using your machine learning application. Before proceeding, make sure you have Docker and Docker Compose installed. Reference the Docker Compose installation guide if necessary. 

Wire the Docker containers

To wire the containers together, locate the docker-compose.yml file in the packages/ directory. 

The contents of the Docker Compose file are provided below:

version: '3'

services:
  car_evaluation_streamlit:
    build:
        dockerfile: car_evaluation_streamlit/Dockerfile
    ports:
      - 8501:8501
    depends_on:
      - car_evaluation_api

  car_evaluation_api:
    build:
        dockerfile: car_evaluation_api/Dockerfile
    ports:
      - 8001:8001

This file specifies the Docker Compose version to use, the two services to be wired together, the ports to expose, and the paths to their respective Dockerfiles. Note that the car_evaluation_streamlit service informs Docker Compose that it depends on the car_evaluation_api service. 

To test the application, navigate to the project root from your command prompt (the location of the docker-compose.yml file). Then run the following command to build the images and spin up both containers:

docker-compose up -d --build

It may take a minute or two to build the images. Once the Docker images are built, you can navigate to http://localhost:8501 to use the application.
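You can also exercise the API directly, without going through the Streamlit UI. The short script below posts a single example car to the prediction endpoint from the host machine; the payload shape mirrors the one built by the Streamlit app, and the port matches the mapping in docker-compose.yml.

import requests

# One example car, using the same categorical values offered in the UI
payload = {
    "inputs": [
        {
            "buying": "low",
            "maint": "low",
            "doors": "4",
            "persons": "4",
            "lug_boot": "big",
            "safety": "high",
        }
    ]
}

# The API container publishes port 8001 on localhost (see docker-compose.yml)
response = requests.post(
    "http://localhost:8001/api/v1/predict/", json=payload, timeout=10
)
response.raise_for_status()

# The JSON response includes a "predictions" list, as used by the Streamlit app
print(response.json())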

Figure 1. Machine learning web application demonstrating a prediction after inputs were sent from the Streamlit user interface

Figure 1 shows the six model inputs outlined at the beginning of this post:

  1. The car buying price (low, medium, high, very high) 
  2. The car’s maintenance costs (low, medium, high, very high)
  3. The number of doors the car has (2, 3, 4, 5+)  
  4. The number of passengers the car can carry (2, 4, more)
  5. The size of the luggage boot (small, medium, big)
  6. The expected safety of the car (low, medium, high)

Summary 

Congratulations—you have just created your own full-stack machine learning web application. The next steps may involve deploying the application on the web using services such as Heroku Cloud, Google App Engine, or Amazon EC2. 

Streamlit enables developers to rapidly build aesthetically pleasing user interfaces for data science and machine learning. A working knowledge of Python is all that is required to get started. FastAPI is a modern web framework designed to compensate in most areas where Flask falls flat. You can use a Streamlit frontend and a FastAPI backend together to build a full-stack web application and deploy it with Docker and Docker Compose.  

What Is Green Computing?

Everyone wants green computing. Mobile users demand maximum performance and battery life. Businesses and governments increasingly require systems that are powerful yet environmentally friendly. And cloud services must respond to global demands without making the grid stutter. For these reasons and more, green computing has evolved rapidly over the past three decades, and it’s here…

The post What Is Green Computing? appeared first on NVIDIA Blog.

GeForce RTX 4090 GPU Arrives, Enabling New World-Building Possibilities for 3D Artists This Week ‘In the NVIDIA Studio’

This week ‘In the NVIDIA Studio’ creators can now pick up the GeForce RTX 4090 GPU, available from top add-in card providers including ASUS, Colorful, Gainward, Galaxy, GIGABYTE, INNO3D, MSI, Palit, PNY and ZOTAC, as well as from system integrators and builders worldwide.

The post GeForce RTX 4090 GPU Arrives, Enabling New World-Building Possibilities for 3D Artists This Week ‘In the NVIDIA Studio’ appeared first on NVIDIA Blog.

Upcoming Event: Deep Learning Accelerator on the Jetson AGX Orin

On October 18, learn how to unlock one-third of AI compute on the NVIDIA Jetson AGX Orin by leveraging Deep Learning Accelerator for your embedded AI applications.

Just Released: HPC SDK v22.9

This version 22.9 update to the NVIDIA HPC SDK includes fixes and minor enhancements.

Free DLI Mini Self-Paced Course: Assemble a Simple Robot in NVIDIA Isaac Sim

This self-paced, free tutorial provides a basic understanding of the NVIDIA Isaac Sim interface and the documentation needed to begin robot simulation projects.

Beyond Words: Large Language Models Expand AI’s Horizon

Back in 2018, BERT got people talking about how machine learning models were learning to read and speak. Today, large language models, or LLMs, are growing up fast, showing dexterity in all sorts of applications. They’re, for one, speeding drug discovery, thanks to research from the Rostlab at Technical University of Munich, as well as…

The post Beyond Words: Large Language Models Expand AI’s Horizon appeared first on NVIDIA Blog.

AudioLM: a Language Modeling Approach to Audio Generation

Generating realistic audio requires modeling information represented at different scales. For example, just as music builds complex musical phrases from individual notes, speech combines temporally local structures, such as phonemes or syllables, into words and sentences. Creating well-structured and coherent audio sequences at all these scales is a challenge that has been addressed by coupling audio with transcriptions that can guide the generative process, be it text transcripts for speech synthesis or MIDI representations for piano. However, this approach breaks when trying to model untranscribed aspects of audio, such as speaker characteristics necessary to help people with speech impairments recover their voice, or stylistic components of a piano performance.

In “AudioLM: a Language Modeling Approach to Audio Generation”, we propose a new framework for audio generation that learns to generate realistic speech and piano music by listening to audio only. Audio generated by AudioLM demonstrates long-term consistency (e.g., syntax in speech, melody in music) and high fidelity, outperforming previous systems and pushing the frontiers of audio generation with applications in speech synthesis or computer-assisted music. Following our AI Principles, we’ve also developed a model to identify synthetic audio generated by AudioLM.

From Text to Audio Language Models
In recent years, language models trained on very large text corpora have demonstrated their exceptional generative abilities, from open-ended dialogue to machine translation or even common-sense reasoning. They have further shown their capacity to model signals other than text, such as natural images. The key intuition behind AudioLM is to leverage such advances in language modeling to generate audio without being trained on annotated data.

However, some challenges need to be addressed when moving from text language models to audio language models. First, one must cope with the fact that the data rate for audio is significantly higher, thus leading to much longer sequences — while a written sentence can be represented by a few dozen characters, its audio waveform typically contains hundreds of thousands of values. Second, there is a one-to-many relationship between text and audio. This means that the same sentence can be rendered by different speakers with different speaking styles, emotional content and recording conditions.

To overcome both challenges, AudioLM leverages two kinds of audio tokens. First, semantic tokens are extracted from w2v-BERT, a self-supervised audio model. These tokens capture both local dependencies (e.g., phonetics in speech, local melody in piano music) and global long-term structure (e.g., language syntax and semantic content in speech, harmony and rhythm in piano music), while heavily downsampling the audio signal to allow for modeling long sequences.

However, audio reconstructed from these tokens demonstrates poor fidelity. To overcome this limitation, in addition to semantic tokens, we rely on acoustic tokens produced by a SoundStream neural codec, which capture the details of the audio waveform (such as speaker characteristics or recording conditions) and allow for high-quality synthesis. Training a system to generate both semantic and acoustic tokens leads simultaneously to high audio quality and long-term consistency.

Training an Audio-Only Language Model
AudioLM is a pure audio model that is trained without any text or symbolic representation of music. AudioLM models an audio sequence hierarchically, from semantic tokens up to fine acoustic tokens, by chaining several Transformer models, one for each stage. Each stage is trained for the next token prediction based on past tokens, as one would train a text language model. The first stage performs this task on semantic tokens to model the high-level structure of the audio sequence.

In the second stage, we concatenate the entire semantic token sequence, along with the past coarse acoustic tokens, and feed both as conditioning to the coarse acoustic model, which then predicts the future tokens. This step models acoustic properties such as speaker characteristics in speech or timbre in music.

In the third stage, we process the coarse acoustic tokens with the fine acoustic model, which adds even more detail to the final audio. Finally, we feed acoustic tokens to the SoundStream decoder to reconstruct a waveform.
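To make this three-stage hierarchy concrete, the pseudocode below sketches how the models could be chained at inference time. It is an illustrative sketch only: AudioLM's implementation is not public, and every name here (the tokenizer, codec, and model objects and their generate, encode, and decode methods) is a placeholder standing in for whatever interfaces the real system uses.

# Illustrative sketch of AudioLM's three-stage generation (placeholder interfaces)
def continue_audio(prompt_waveform, semantic_tokenizer, acoustic_codec,
                   semantic_model, coarse_acoustic_model, fine_acoustic_model):
    # Tokenize the prompt into the two token streams
    semantic_prompt = semantic_tokenizer(prompt_waveform)                 # w2v-BERT-style semantic tokens
    coarse_prompt, fine_prompt = acoustic_codec.encode(prompt_waveform)   # SoundStream-style acoustic tokens

    # Stage 1: extend the semantic tokens to capture long-term structure
    semantic_tokens = semantic_model.generate(prefix=semantic_prompt)

    # Stage 2: predict coarse acoustic tokens, conditioned on the full semantic
    # sequence and the prompt's coarse tokens (speaker identity, timbre)
    coarse_tokens = coarse_acoustic_model.generate(
        prefix=coarse_prompt, conditioning=semantic_tokens
    )

    # Stage 3: add fine acoustic detail on top of the coarse tokens
    fine_tokens = fine_acoustic_model.generate(
        prefix=fine_prompt, conditioning=coarse_tokens
    )

    # Decode the acoustic tokens back to a waveform with the neural codec
    return acoustic_codec.decode(coarse_tokens, fine_tokens)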

After training, one can condition AudioLM on a few seconds of audio, which enables it to generate a consistent continuation. To showcase the general applicability of the AudioLM framework, we consider two tasks from different audio domains:

  • Speech continuation, where the model is expected to retain the speaker characteristics, prosody and recording conditions of the prompt while producing new content that is syntactically correct and semantically consistent.
  • Piano continuation, where the model is expected to generate piano music that is coherent with the prompt in terms of melody, harmony and rhythm.

In the video below, you can listen to examples where the model is asked to continue either speech or music and generate new content that was not seen during training. As you listen, note that everything you hear after the gray vertical line was generated by AudioLM and that the model has never seen any text or musical transcription, but rather just learned from raw audio. We release more samples on this webpage.

To validate our results, we asked human raters to listen to short audio clips and decide whether each was an original recording of human speech or a synthetic continuation generated by AudioLM. Based on the ratings collected, we observed a 51.2% success rate, which is not statistically significantly different from the 50% success rate achieved when assigning labels at random. This means that speech generated by AudioLM is hard for the average listener to distinguish from real speech.

Our work on AudioLM is for research purposes and we have no plans to release it more broadly at this time. In alignment with our AI Principles, we sought to understand and mitigate the possibility that people could misinterpret the short speech samples synthesized by AudioLM as real speech. For this purpose, we trained a classifier that can detect synthetic speech generated by AudioLM with very high accuracy (98.6%). This shows that despite being (almost) indistinguishable to some listeners, continuations generated by AudioLM are very easy to detect with a simple audio classifier. This is a crucial first step to help protect against the potential misuse of AudioLM, with future efforts potentially exploring technologies such as audio “watermarking”.

Conclusion
We introduce AudioLM, a language modeling approach to audio generation that provides both long-term coherence and high audio quality. Experiments on speech generation show not only that AudioLM can generate syntactically and semantically coherent speech without any text, but also that continuations produced by the model are almost indistinguishable from real speech by humans. Moreover, AudioLM goes well beyond speech and can model arbitrary audio signals such as piano music. This encourages future extensions to other types of audio (e.g., multilingual speech, polyphonic music, and audio events) as well as integrating AudioLM into an encoder-decoder framework for conditioned tasks such as text-to-speech or speech-to-speech translation.

Acknowledgments
The work described here was authored by Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matt Sharifi, Olivier Teboul, David Grangier, Marco Tagliasacchi and Neil Zeghidour. We are grateful for all discussions and feedback on this work that we received from our colleagues at Google.

Fall Into October With 25 New Games Streaming on GeForce NOW

Cooler weather, the changing colors of the leaves, the needless addition of pumpkin spice to just about everything, and discount Halloween candy are just some things to look forward to in the fall. GeForce NOW members can add one more thing to the list — 25 games joining the cloud gaming library in October, including…

The post Fall Into October With 25 New Games Streaming on GeForce NOW appeared first on NVIDIA Blog.

Explainer: What Is Extended Reality?

Extended reality, or XR, is a collective term that refers to immersive technologies, including virtual reality, augmented reality, and mixed reality.