Building NVIDIA GPU-Accelerated Pipelines on Azure Synapse Analytics with RAPIDS

Azure recently announced support for NVIDIA’s T4 Tensor Core Graphics Processing Units (GPUs) which are optimized for deploying machine learning inferencing or analytical workloads in a cost-effective manner. With Apache Spark™ deployments tuned for NVIDIA GPUs, plus pre-installed libraries, Azure Synapse Analytics offers a simple way to leverage GPUs to power a variety of data … Continued

Azure recently announced support for NVIDIA’s T4 Tensor Core Graphics Processing Units (GPUs) which are optimized for deploying machine learning inferencing or analytical workloads in a cost-effective manner. With Apache Spark™ deployments tuned for NVIDIA GPUs, plus pre-installed libraries, Azure Synapse Analytics offers a simple way to leverage GPUs to power a variety of data processing and machine learning tasks. With built-in support for RAPIDS acceleration, the Azure Synapse version of GPU-accelerated Spark offers at least 2x performance gain on standard analytical benchmarks compared to running on CPUs, all without any code changes.

Currently, this GPU acceleration feature in Azure Synapse is available for private preview by request.

Benefits of NVIDIA GPU acceleration

NVIDIA GPUs offer extraordinarily high compute performance, bringing parallel processing to multi-core servers to accelerate demanding workloads. A CPU consists of a few cores optimized for sequential serial processing, whereas. On the other hand, a GPU has a massively parallel architecture consisting of thousands of smaller and more efficient cores designed to handle multiple tasks simultaneously. Considering that data scientists spend up to 80% of their time on data pre-processing, GPUs are a critical tool to accelerate data processing pipelines compared to relying on pipelines containing CPUs alone.

One of the most efficient and familiar ways to build these pipelines is using Apache Spark™. The benefits of NVIDIA GPU acceleration in Apache Spark™ include:

  • Faster to complete data processing, queries, and model training, which grants accelerated iteration and time to insight.
  • The same GPU-accelerated infrastructure is helpful to eliminate the need for complex decision-making and tuning for both Spark and ML/DL frameworks.
  • Fewer compute nodes are required; reducing infrastructure cost and potentially helping avoid scale-related problems.

NVIDIA and Azure synapse collaboration

NVIDIA and Azure Synapse have teamed up to bring GPU acceleration to data scientists and data engineers. This integration will give customers the freedom to use NVIDIA GPUs for Apache Spark™ applications with no-code changes and with an experience identical to a CPU cluster. In addition, this collaboration will continue to add support for the latest NVIDIA GPUs and networking products and provide continuous enhancements for big data customers who are looking to improve productivity and save costs with a single pipeline for data engineering, data preparation, and machine learning.

To learn more about this project, check out our presentation at NVIDIA’s GTC 2021 Conference.

Apache Spark™ 3.0 GPU Acceleration in Azure Synapse

While Apache Spark™ provides GPU support out-of-box, configuring and managing all the required hardware and installing all the low-level libraries can take significant effort. When you try GPU-enabled Apache Spark™ pools in Azure Synapse, you will immediately notice a surprisingly simple user experience:

Behind the scenes heavy lifting: To efficiently use GPUs, libraries are used to perform communication with the graphics card on the host machine. Installing and configuring these libraries takes time and effort. Azure Synapse takes care of pre-installing these libraries and setting up all the complex networking between compute nodes through integration with GPU Apache Spark™ pools. Within just a few minutes, you can stop worrying about setup and focus on solving business problems.

Optimized Spark configuration: By collaborating between NVIDIA and Azure Synapse, we have come up with optimal configurations for your GPU-enabled Apache Spark™ pools. Thus, your workloads run most optimally saving you both time and operational costs.

Packed with Data Prep and ML Libraries: The GPU-enabled Apache Spark™ pools in Azure Synapse come built-in with two popular libraries with support for more on the way:

  • RAPIDS for Data Prep: RAPIDS is a suite of open-source software libraries and APIs for executing end-to-end data science and analytics pipelines entirely on GPUs for a substantial speed-up, particularly on large data sets. Built on top of NVIDIA CUDA and UCX, the RAPIDS Accelerator for Apache Spark™ enables GPU-accelerated SQL, DataFrame operations, and Spark shuffles. Since there are no code changes required to leverage these accelerations, you can also accelerate your data pipelines that rely on Linux Foundation’s Delta Lake or Microsoft’s Hyperspace indexing (both of which are available on Synapse out-of-box).
  • Hummingbird for accelerating scoring and inference over your traditional ML models. Hummingbird is a library for converting traditional ML operators to tensors, with the goal of accelerating inference (scoring/prediction) for traditional machine learning models.
Figure 1: Spark Data Prep and ML in Azure Synapse.

When running NVIDIA Decision Support (NDS) test queries, derived from industry-known benchmarks, over 1 TB of Parquet data our early results indicate that GPUs can deliver nearly 2x acceleration in overall query performance, without any code changes.

 Figure 2: Overall performance results.
Figure 3: Current Azure Synapse offerings.

Classification model outputs floats


I’m not sure what’s going on with my classification model, as it predicts objects as floats instead of classes. Is it something to do with the loss function or activation functions at the end of the network?

The barebones code can be found here:

submitted by /u/dahkneela
[visit reddit] [comments]


Does label_map.pbtx have to be contiguous?

Say I would like to quickly remove an object from my dataset. Can I drop all samples with the specific id (say id=7), and then remove the definition of the label with id=7 from the label_map file (making it skip from 6 straight to 8)?

Or is it expected that the ids are contiguous, so that I have to shift all labels (8 and up) down by one after removing 7?

submitted by /u/Meriipu
[visit reddit] [comments]


Find Your Groove: Add NVIDIA AI Essentials Series to Your Summer Playlist

If AI, data science, graphics or robotics is your jam, stream the NVIDIA AI Essentials Learning Series this summer. These intro-level courses provide foundational knowledge to students and early-career developers looking to broaden their areas of expertise. The free series includes over a dozen sessions — each less than an hour long — on topics Read article >

The post Find Your Groove: Add NVIDIA AI Essentials Series to Your Summer Playlist appeared first on The Official NVIDIA Blog.


Batches in TF-Slim

Hi all, I’m trying to use TF-Slim (yes it has to be tf-slim) and I’m having some trouble figuring out how to break my data up into batches. I want to avoid loading my dataset into RAM (although I could), but the documentation doesn’t specify how to handle batches. If anyone that has used tf-slim before could shed some light, it would be much appreciated.

submitted by /u/prinse4515
[visit reddit] [comments]


Unsupervised learning technique

I want to create a model that is made up by a bunch of objects. Each one has a name and 6 attributes associated with it. I want to make an unsupervised model that groups objects with similar attributes together. When a piece of data is added, I would like to have the group it best fits into outputted and be able to get the names of other objects in this group. Is this possible with tensorflow?

submitted by /u/UnreadyDog
[visit reddit] [comments]


Learn How to Build Applications of AI for Anomaly Detection

The NVIDIA Deep Learning Institute (DLI) is offering instructor-led, hands-on training on how to implement multiple AI-based approaches to solve a specific use case of identifying network intrusions for telecommunications.

Whether you need to monitor cybersecurity threats, fraudulent financial transactions, product defects, or equipment health, artificial intelligence can help you catch data abnormalities before they impact your business. AI models can be trained and deployed to automatically analyze datasets, define “normal behavior,” and identify breaches in patterns quickly and effectively. These models can then be used to predict future anomalies. With massive amounts of data available across industries and subtle distinctions between normal and abnormal patterns, it’s critical that organizations use AI to quickly detect anomalies that pose a threat.

The NVIDIA Deep Learning Institute (DLI) is offering instructor-led, hands-on training on how to implement multiple AI-based approaches to solve a specific use case of identifying network intrusions for telecommunications. You’ll learn three different anomaly detection techniques using GPU-accelerated XGBoost, deep learning-based autoencoders, and generative adversarial networks (GANs) and then implement and compare supervised and unsupervised learning techniques. At the end of the workshop, you’ll be able to use AI to detect anomalies in your work across telecommunications, cybersecurity, finance, manufacturing, and other key industries.

By participating in this workshop, you’ll:

  • Prepare data and build, train, and evaluate models using XGBoost, autoencoders, and GANs
  • Detect anomalies in datasets with both labeled and unlabeled data
  • Classify anomalies into multiple categories regardless of whether the original data was labeled

This training will be offered:

Tue, Sep 21, 2021, 9:00 a.m. – 5:00 p.m. CEST/EMEA, UTC+2

Tue, Sep 21, 2021, 9:00 a.m. – 5:00 p.m. PDT, UTC-7

Space is limited, register now.


Upcoming Webinar: Building a Computer Vision Service Using NVIDIA NGC and Google Cloud

Join the NGC team for a webinar and live Q&A on Aug. 25, at 10 a.m. PT

The NGC team is hosting a webinar and live Q&A. Topics include how to use containers from the NGC catalog deployed from Google Cloud Marketplace to GKE, a managed Kubernetes service on Google Cloud, that easily builds, deploys, and runs AI solutions.

Building a Computer Vision Service Using NVIDIA NGC and Google Cloud
August 25 at 10 a.m. PT

Organizations are using computer vision to improve the product experience, increase production, and drive operational efficiencies. But, building a solution requires large amounts of labeled data, the software and hardware infrastructure to train AI models, and the tools to run real-time inference that will scale with demand.

With one click, NGC containers for AI can be deployed from Google Cloud Marketplace to GKE. This managed Kubernetes service on Google Cloud, makes it easy for enterprises to build, deploy, and run their AI solutions.

By joining this webinar, you will learn:

  • How the NGC catalog can work with GCP Marketplace to accelerate your AI workflows.
  • About ways the Transfer Learning Toolkit can be used as a template and a custom training data set.
  • How to easily deploy an NVIDIA Triton inferencing container from the GCP Marketplace that will scale inference using GKE.

Register now >>> 


Google at ACL 2021

This week, the 59th annual meeting of the Association for Computational Linguistics (ACL), a premier conference covering a broad spectrum of research areas that are concerned with computational approaches to natural language, is taking place online.

As a leader in natural language processing and understanding, and a Diamond Level sponsor of ACL 2021, Google will showcase the latest research in the field with over 35 publications, and the organization of and participation in a variety of workshops and tutorials.

If you’re registered for ACL 2021, we hope that you’ll visit the Google virtual booth in Gather Town to learn more about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about Google’s participation on the ACL 2021 Expo page, and see a full list of Google publications below (Google affiliations in bold).

Organizing Committee
Senior Area Chairs include: Dan Roth, Emily Pitler, Jimmy Lin, Ming-Wei Chang, Sebastian Ruder, Slav Petrov
Area Chairs include: Ankur P. Parikh, Artem Sokolov, Bhuwan Dhingra, Cicero Nogueira dos Santos, Colin Cherry, Dani Yogatama, David Mimno, Hideto Kazawa, Ian Tenney, Jasmijn Bastings, Jun Suzuki, Katja Filippova, Kyle Gorma, Lu Wang, Manaal Faruqui, Natalie Schluter, Peter Liu, Radu Soricut, Sebastian Gehrmann, Shashi Narayan, Tal Linzen, Vinodkumar Prabhakaran, Waleed Ammar

Parameter-Efficient Multi-task Fine-Tuning for Transformers via Shared Hypernetwork
Rabeeh Karimi Mahabadi*, Sebastian Ruder, Mostafa Dehghani, James Henderson

TicketTalk: Toward Human-Level Performance with End-to-End, Transaction-Based Dialog Systems
Bill Byrne, Karthik Krishnamoorthi, Saravanan Ganesh, Mihir Sanjay Kale

Increasing Faithfulness in Knowledge-Grounded Dialogue with Controllable Feature
Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das

Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?
Peter Shaw, Ming-Wei Chang, Panupong Pasupat, Kristina Toutanova

Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study
Yash Khemchandani, Sarvesh Mehtani, Vaidehi Patil, Abhijeet Awasthi, Partha Talukdar, Sunita Sarawagi

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Model
Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen*, Yonatan Belinkov

Modeling Fine-Grained Entity Types with Box Embeddings
Yasumasa Onoe, Michael Boratko, Andrew McCallum, Greg Durrett

TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling
Parker Riley*, Noah Constant, Mandy Guo, Girish Kumar*, David Uthus, Zarana Parekh

Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim*, Ellie Pavlick, Burcu Karagol Ayan, Deepak Ramachandran

H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences
Zhenhai Zhu, Radu Soricut

Are Pretrained Convolutions Better than Pretrained Transformers?
Yi Tay, Mostafa Dehghani, Jai Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler

Benchmarking Scalable Methods for Streaming Cross Document Entity Coreference
Robert L Logan IV, Andrew McCallum, Sameer Singh, Dan Bikel

PhotoChat: A Human-Human Dialogue Dataset With Photo Sharing Behavior For Joint Image-Text Modeling
Xiaoxue Zang, Lijuan Liu, Maria Wang, Yang Song*, Hao Zhang, Jindong Chen

Focus Attention: Promoting Faithfulness and Diversity in Summarization
Rahul Aralikatte*, Shashi Narayan, Joshua Maynez, Sascha Rothe, Ryan McDonald*

A Cognitive Regularizer for Language Modeling
Jason Wei, Clara Meister, Ryan Cotterell

Language Model Augmented Relevance Score
Ruibo Liu, Jason Wei, Soroush Vosoughi

Cross-Replication Reliability – An Empirical Approach to Interpreting Inter-rater Reliability
Ka Wong, Praveen Paritosh, Lora Aroyo

TIMEDIAL: Temporal Commonsense Reasoning in Dialog
Lianhui Qin*, Aditya Gupta, Shyam Upadhyay, Luheng He, Yejin Choi, Manaal Faruqui

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
Yikang Shen*, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville

MOLEMAN: Mention-Only Linking of Entities with a Mention Annotation Network
Nicholas FitzGerald, Jan A. Botha, Daniel Gillick, Daniel M. Bikel, Tom Kwiatkowski, Andrew McCallum

Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation
Yinfei Yanga, Ning Jinb, Kuo Linb, Mandy Guoa, Daniel Cera

ROPE: Reading Order Equivariant Positional Encoding for Graph-Based Document Information Extraction
Chen-Yu Lee, Chun-Liang Li, Chu Wang∗, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister

Measuring and Improving BERT’s Mathematical Abilities by Predicting the Order of Reasoning
Piotr Piekos, Henryk Michalewski, Mateusz Malinowsk

Improving Compositional Generalization in Classification Tasks via Structure Annotations
Juyong Kim, Pradeep Ravikumar, Joshua Ainslie, Santiago Ontañón

A Simple Recipe for Multilingual Grammatical Error Correction
Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn

nmT5 – Is Parallel Data Still Relevant for Pre-training Massively Multilingual Language Models?
Mihir Kale, Aditya Siddhant, Noah Constant, Melvin Johnson, Rami Al-Rfou, Linting Xue

QA-Driven Zero-Shot Slot Filling with Weak Supervision Pretraining
Xinya Du*, Luheng He, Qi Li, Dian Yu*, Panupong Pasupat, Yuan Zhang

AgreeSum: Agreement-Oriented Multi-Document Summarization
Richard Yuanzhe Pang*, Adam D. Lelkes, Vinh Q. Tran, Cong Yu

Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering
Aditya Gupta, Jiacheng Xu*, Shyam Upadhyay, Diyi Yang, Manaal Faruqui

Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen*, Jialu Liu, Tianqi Liu, Cong Yu, Jiawei Han

A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy

RealFormer: Transformer Likes Residual Attention
Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie

Scaling Within Document Coreference to Long Texts
Raghuveer Thirukovalluru, Nicholas Monath, Kumar Shridhar, Manzil Zaheer, Mrinmaya Sachan, Andrew McCallum

MergeDistill: Merging Language Models using Pre-trained Distillation
Simran Khanuja, Melvin Johnson, Partha Talukdar

DoT: An Efficient Double Transformer for NLP tasks with Tables
Syrine Krichene, Thomas Müller*, Julian Martin Eisenschlos

How Reliable are Model Diagnostics?
Vamsi Aribandi, Yi Tay, Donald Metzler

Interactive Learning for Natural Language Processing
Organizers include: Filip Radlinski
Invited Panelist: Julia Kreutzer

6th Workshop on Representation Learning for NLP (RepL4NLP-2021)
Organizers include: Chris Dyer, Laura Rimell

Third Workshop on Gender Bias for Natural Language Processing
Organizers include: Kellie Webster

Benchmarking: Past, Present and Future
Invited Speaker: Eunsol Choi

SemEval-2021, 15th International Workshop on Semantic Evaluation
Organizers include: Natalie Schluter

Workshop on Online Abuse and Harms
Organizers include: Vinodkumar Prabhakaran

GEM: Natural Language Generation, Evaluation, and Metrics
Organizers include: Sebastian Gehrmann

Workshop on Natural Language Processing for Programming
Invited Speaker: Charles Sutton

WPT 2021: The 17th International Conference on Parsing Technologies
Organizers include: Weiwei Sun

Recognizing Multimodal Entailment
Instructors include: Cesar Ilharco, Vaiva Imbrasaite, Ricardo Marino, Jannis Bulian, Chen Sun, Afsaneh Shirazi, Lucas Smaira, Cordelia Schmid

*  Work conducted while at Google. 


Better Than 8K Resolution: NVIDIA Inception Displays Global AI Startup Ecosystem

There are more AI startups in healthcare than any other single industry. The number of AI startups in media and entertainment is about the same as that in retail. More than one in 10 of all AI startups is based in California. How do we know this? NVIDIA Inception, our acceleration platform for AI startups, Read article >

The post Better Than 8K Resolution: NVIDIA Inception Displays Global AI Startup Ecosystem appeared first on The Official NVIDIA Blog.