
RAPIDS Accelerator for Apache Spark Release v21.10

This post details the latest functionality of RAPIDS Accelerator for Apache Spark.

RAPIDS Accelerator for Apache Spark v21.10 is now available! As an open source project, we value our community, its voice, and its requests. This release incorporates community requests for operations that are ideally suited for GPU acceleration.

Important callouts for this release:

  • Speed up – performance improvements and cost savings.
  • New Functionality – new I/O and nested-datatype support, plus new Qualification and Profiling tool features.
  • Community Updates – updates to the spark-examples repository.

Speed up

RAPIDS Accelerator for Apache Spark is growing at a great pace in both functionality and performance. Standard industry benchmarks are a good way to measure performance over time, but another useful barometer is the performance of common operators used in the data preprocessing stage or in data analytics.

We used four such queries, shown in the chart below (an illustrative PySpark sketch of these operators follows the list):

  • Count Distinct: a function used to estimate the number of unique page views or unique customers visiting an e-commerce site.
  • Window: a critical operator for preprocessing timestamped event data in the marketing and financial industries.
  • Intersect: an operator that returns the rows common to two DataFrames, removing duplicates.
  • Cross-join: an operator commonly used to obtain all combinations of items.
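The following is a minimal, illustrative PySpark sketch of these four operator classes. The dataset paths, column names, and tables are hypothetical placeholders, not the actual benchmark queries (those are available in the spark-rapids-examples repository).

```python
# Illustrative PySpark sketch of the four benchmarked operator classes.
# Paths, columns, and tables are placeholders, not the real benchmark queries.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("operator-sketch").getOrCreate()

views = spark.read.parquet("/data/page_views")    # hypothetical event table
orders = spark.read.parquet("/data/orders")       # hypothetical fact table

# 1. Count distinct: estimate unique visitors per page.
unique_visitors = views.groupBy("page_id").agg(
    F.countDistinct("user_id").alias("unique_visitors"))

# 2. Window: running total per customer, ordered by event time.
w = Window.partitionBy("customer_id").orderBy("order_ts")
running_spend = orders.withColumn("running_spend", F.sum("amount").over(w))

# 3. Intersect: rows common to two DataFrames (result is deduplicated).
repeat_users = views.select("user_id").intersect(orders.select("customer_id"))

# 4. Cross-join: all combinations of two small dimension tables.
sizes = spark.createDataFrame([("S",), ("M",), ("L",)], ["size"])
colors = spark.createDataFrame([("red",), ("blue",)], ["color"])
combos = sizes.crossJoin(colors)
```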

These queries were run on a Google Cloud Platform (GCP) machine with two T4 GPUs and 104 GB of RAM. The dataset was 3 TB in size and contained multiple different data types. More information about the setup and the queries can be found in the spark-rapids-examples repository on GitHub. These four queries show not only performance and cost benefits, but also that the speed-up (ranging from 1.5x to 27x) varies with compute intensity. The queries differ in compute and network utilization, similar to practical data preprocessing use cases.

Figure 1: Microbenchmark query runtimes on a Google Cloud Platform Dataproc cluster, GPU vs. CPU, for four Apache Spark operators: cross-join, intersect, windowing (with and without data skew), and count distinct.

The preceding chart is a sneak peek into the speed-up one can expect when using the RAPIDS Accelerator; a detailed performance analysis will be provided in the next release blog.

New functionality

Plug-in

Most Apache Spark users are aware that Spark 3.2 was released this October. The v21.10 release adds support for Spark 3.2 and CUDA 11.4. In this release, we focused on expanding support for I/O, nested data processing, and machine learning functionality. RAPIDS Accelerator for Apache Spark v21.10 also ships a new plug-in jar to support machine learning in Spark.
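Enabling the SQL plug-in on an existing cluster is, roughly, a matter of putting the plug-in jar on the classpath and setting a couple of Spark configurations. The sketch below shows the idea from PySpark; the jar path is a placeholder, and the exact artifacts and recommended settings are listed in the project documentation.

```python
# Sketch: enabling the RAPIDS Accelerator SQL plug-in from PySpark.
# The jar path is a placeholder; see the project docs for the exact artifact names.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("rapids-enabled-session")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")  # RAPIDS SQL plug-in class
    .config("spark.rapids.sql.enabled", "true")             # turn GPU SQL execution on
    .config("spark.jars", "/path/to/rapids-4-spark.jar")    # placeholder jar path
    .getOrCreate()
)
```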

Currently, this jar supports training for the Principal Component Analysis (PCA) algorithm. The ETL jar extended input type support for Parquet and ORC. It now also lets users run HashAggregate, Sort, shuffled hash join (SHJ), and broadcast hash join (BHJ) on nested data. In addition to support for nested datatypes, a performance test was also run.
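For illustration, the snippet below sketches a PCA training workflow using the standard Spark MLlib API. Whether the accelerated implementation is invoked through this exact interface or through a dedicated class in the ML plug-in jar is not covered here, so treat this as an assumed sketch of the workflow rather than the plug-in's API; the column names and component count are arbitrary.

```python
# Sketch: PCA training with the standard Spark MLlib API (assumed workflow).
# Column names, data, and k are arbitrary placeholders.
from pyspark.sql import SparkSession
from pyspark.ml.feature import PCA, VectorAssembler

spark = SparkSession.builder.getOrCreate()
raw_df = spark.createDataFrame(
    [(1.0, 2.0, 3.0, 4.0), (2.0, 1.0, 0.5, 3.0), (0.0, 4.0, 2.0, 1.0)],
    ["f1", "f2", "f3", "f4"],
)

assembler = VectorAssembler(inputCols=["f1", "f2", "f3", "f4"], outputCol="features")
vectors = assembler.transform(raw_df)

pca = PCA(k=2, inputCol="features", outputCol="pca_features")
model = pca.fit(vectors)        # fit() is the training step targeted by the ML plug-in
reduced = model.transform(vectors)
```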

The figure below shows the speed-up observed for two queries using nested datatype input. Other interesting features added in v21.10 include pos_explode, create_map, and more. Refer to RAPIDS Accelerator for Apache Spark’s documentation for a detailed list of new features.
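As a small, hypothetical example, the PySpark spellings of two of these functions, posexplode and create_map, can be exercised on nested columns like this (the data and column names are made up for illustration):

```python
# Sketch: nested-datatype functions mentioned above, in their PySpark spellings.
# The data and column names are made up for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("a", ["x", "y", "z"]), ("b", ["p", "q"])],
    ["id", "tags"],
)

# posexplode: one output row per array element, together with its position.
exploded = df.select("id", F.posexplode("tags").alias("pos", "tag"))

# create_map: build a map column from key/value expressions.
mapped = df.select(
    "id", F.create_map(F.lit("first_tag"), F.col("tags")[0]).alias("meta"))
```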

Figure 2: Microbenchmark query runtimes for nested datatypes (count distinct and windowing) on a Google Cloud Platform Dataproc cluster: GPU vs. CPU.

Profiling & qualification tool

In addition to the plug-in, multiple new features were added to RAPIDS Accelerator for Apache Spark’s Qualification and Profiling tool. The Qualification tool can now report the nested datatypes and write data formats present in an application. It also supports conjunction and disjunction filters, as well as filtering based on regular expressions and usernames.

The Qualification tool is not the only one with new tricks: the Profiling tool now provides a structured output format and can scale to process a large number of event logs.

Community updates

We are excited to announce that we are in public preview on Azure, and we welcome Azure users to try RAPIDS Accelerator for Apache Spark on Azure Synapse.

We invite you to view our talks presented at NVIDIA’s flagship event, GTC, held Nov. 8-11, to learn how AI is transforming the world. The RAPIDS Accelerator team presented two talks: Accelerating Apache Spark, which gives an overview of new functionality and other upcoming features, and Discover Common Apache Spark Operations Turbocharged with RAPIDS and NVIDIA GPUs, which covers many microbenchmarks on Apache Spark.

Coming soon

The upcoming versions will introduce support for the 128-bit decimal datatype, inference support for the Principal Component Analysis algorithm, and additional nested datatype support for multi-level structs and maps.

In addition, look out for MIG support for NVIDIA Ampere Architecture based GPUs (A100/A30), which can help improve throughput when running multiple Spark jobs on an A100. As always, we want to thank all of you for using RAPIDS Accelerator for Apache Spark, and we look forward to hearing from you. Reach out to us on GitHub and let us know how we can continue to improve your experience using RAPIDS Accelerator for Apache Spark.


Prepare for Genshin Impact, Coming to GeForce NOW in Limited Beta

GeForce NOW is charging into the new year at full force. This GFN Thursday comes with the news that Genshin Impact, the popular open-world action role-playing game, will be coming to the cloud this year, arriving in a limited beta. Plus, this year’s CES announcements were packed with news for GeForce NOW. Battlefield 4: Premium …

The post Prepare for Genshin Impact, Coming to GeForce NOW in Limited Beta appeared first on The Official NVIDIA Blog.


Get Started on NVIDIA Triton with an Introductory Course from NVIDIA DLI

Practice machine learning operations and learn how to deploy your own machine learning models on an NVIDIA Triton GPU server.

Deploying a Model for Inference at Production Scale

A lot of love goes into building a machine-learning model. Challenges range from identifying the variables to predict, to experimenting to find the best model architecture, to sampling the correct training data. But what good is the model if you can’t access it?

Enter the NVIDIA Triton Inference Server. NVIDIA Triton helps data scientists and system administrators turn the same machines used to train models into a web server for model prediction. While a GPU is not required, an NVIDIA Triton Inference Server can take advantage of multiple installed GPUs to quickly process large batches of requests.

To get hands-on practice with a live server, the NVIDIA Deep Learning Institute (DLI) is offering a 4-hour, self-paced course titled Deploying a Model for Inference at Production Scale.

MLOps Overview

NVIDIA Triton was created with Machine Learning Operations, or MLOps, in mind. MLOps is a relatively new field that evolved from Developer Operations, or DevOps, to focus on scaling and maintaining machine-learning models in a production environment. NVIDIA Triton is equipped with features such as model versioning for easy rollbacks. It is also compatible with Prometheus to track and manage server metrics such as latency and request count.
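As a quick illustration, Triton exposes those metrics in Prometheus text format over HTTP, by default on port 8002. The sketch below simply fetches and filters that endpoint, assuming a server running locally; the metric names shown are the request-count and latency counters from recent Triton releases.

```python
# Sketch: reading Triton's Prometheus-format metrics endpoint.
# Assumes a local Triton server with the default metrics port (8002).
import urllib.request

with urllib.request.urlopen("http://localhost:8002/metrics") as resp:
    metrics_text = resp.read().decode("utf-8")

# Keep only the request-count and request-latency series.
for line in metrics_text.splitlines():
    if line.startswith(("nv_inference_request_success",
                        "nv_inference_request_duration_us")):
        print(line)
```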

Course Information

This course covers an introduction to MLOps coupled with hands-on practice with a live NVIDIA Triton Inference Server. 

Learning objectives include:

  • Deploying neural networks from a variety of frameworks onto a live NVIDIA Triton Server.
  • Measuring GPU usage and other metrics with Prometheus.
  • Sending asynchronous requests to maximize throughput.

Upon completion, developers will be able to deploy their own models on an NVIDIA Triton Server.
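To give a flavor of the hands-on portion, the sketch below sends asynchronous requests to a running server with the tritonclient Python package. The model name, input and output names, and tensor shape are placeholders that depend on the model you deploy; the course itself walks through the exact models it uses.

```python
# Sketch: asynchronous inference requests with the Triton HTTP client.
# Model name, input/output names, and tensor shape are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(8, 224, 224, 3).astype(np.float32)   # placeholder batch
infer_input = httpclient.InferInput("input_1", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)
requested = httpclient.InferRequestedOutput("predictions")

# Fire several requests without blocking, then collect the results.
futures = [
    client.async_infer("my_model", inputs=[infer_input], outputs=[requested])
    for _ in range(4)
]
results = [f.get_result().as_numpy("predictions") for f in futures]
```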

Start Learning >>

For additional hands-on training visit the NVIDIA Deep Learning Institute.


Teamwork Makes AVs Work: NVIDIA and Deloitte Deliver Turnkey Solutions for AV Developers

Autonomous vehicles are born in the data center, which is why NVIDIA and Deloitte are delivering a strong foundation for developers to deploy robust self-driving technology. At CES this week, the companies detailed their collaboration, which is aimed at easing the biggest pain points in AV development. Deloitte, a leading global consulting firm, is pairing …

The post Teamwork Makes AVs Work: NVIDIA and Deloitte Deliver Turnkey Solutions for AV Developers appeared first on The Official NVIDIA Blog.


John Snow Labs Spark-NLP 3.4.0: New OpenAI GPT-2, new ALBERT, XLNet, RoBERTa, XLM-RoBERTa, and Longformer for Sequence Classification, support for Spark 3.2, new distributed Word2Vec, extend support to more Databricks & EMR runtimes, new state-of-the-art transformer models, bug fixes, and lots more!

submitted by /u/dark-night-rises

Evaluating in TensorFlow Object Detection API – AttributeError

Hello everyone,

I have successfully trained my model using the TensorFlow Object Detection API and wanted to evaluate it on it. I used the following site as a guide: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html

Link for the code “model_main_tf2.py”: https://github.com/tensorflow/models/blob/master/research/object_detection/model_main_tf2.py

After running the script “model_main_tf2.py”, I received the following error message:

INFO:tensorflow:Waiting for new checkpoint at models/my_ssd_resnet50_v1_fpn
I1220 17:06:56.024288 140351537808192 checkpoint_utils.py:140] Waiting for new checkpoint at models/my_ssd_resnet50_v1_fpn
INFO:tensorflow:Found new checkpoint at models/my_ssd_resnet50_v1_fpn/ckpt-2
I1220 17:06:56.024974 140351537808192 checkpoint_utils.py:149] Found new checkpoint at models/my_ssd_resnet50_v1_fpn/ckpt-2
2021-12-20 17:06:56.098253: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/keras/backend.py:401: UserWarning: tf.keras.backend.set_learning_phase is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `call` method of your layer or model.
  warnings.warn('`tf.keras.backend.set_learning_phase` is deprecated and ')
2021-12-20 17:07:08.993353: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8204
Traceback (most recent call last):
  File "/home/ameisemuhammed/TensorFlow/workspace/training_demo/model_main_tf2.py", line 114, in <module>
    tf.compat.v1.app.run()
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/ameisemuhammed/TensorFlow/workspace/training_demo/model_main_tf2.py", line 81, in main
    model_lib_v2.eval_continuously(
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/object_detection/model_lib_v2.py", line 1141, in eval_continuously
    optimizer.shadow_copy(detection_model)
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/keras/optimizer_v2/optimizer_v2.py", line 830, in __getattribute__
    raise e
  File "/home/ameisemuhammed/anaconda3/envs/tensorflow/lib/python3.9/site-packages/keras/optimizer_v2/optimizer_v2.py", line 820, in __getattribute__
    return super(OptimizerV2, self).__getattribute__(name)
AttributeError: 'SGD' object has no attribute 'shadow_copy'

My versions:

TensorFlow = 2.6.0
TensorFlow GPU = 2.6.0
Ubuntu = 20.04
Python = 3.9.7
GPU = NVIDIA Corporation TU104 [GeForce RTX 2080]
CUDA = 11.5
cuDNN = 8.2.4

Could the problem be from the TensorFlow version or CUDA/cuDNN version?
If more information is needed, please let me know!

submitted by /u/GT_King0895


‘AI Dungeon’ Creator Nick Walton Uses AI to Generate Infinite Gaming Storylines

What started as Nick Walton’s college hackathon project grew into AI Dungeon, a popular text adventure game with over 1.5 million users. Walton is the co-founder and CEO of Latitude, a Utah-based startup that uses AI to create unique gaming storylines. He spoke with NVIDIA AI Podcast host Noah Kravitz about how natural language processing …

The post ‘AI Dungeon’ Creator Nick Walton Uses AI to Generate Infinite Gaming Storylines appeared first on The Official NVIDIA Blog.


Advent of Code 2021 in pure TensorFlow – day 10

submitted by /u/pgaleone

Neural Network Pinpoints Artist by Examining a Painting’s Brushstrokes

Researchers developed a new AI algorithm that can identify a painter based on brushstrokes, with precision down to a single bristle.

Spotting painting forgeries just became a bit easier with a newly developed AI tool that picks up style differences with precision down to a single paintbrush bristle. The research, from a team at Case Western Reserve University (CWRU), trained convolutional neural networks to learn and identify a painter based on the 3D topography of a painting. This work could help historians and art experts distinguish between artists in collaborative pieces, and find fraudulent copies.

There are several methods for authenticating antique paintings. Experts often evaluate the style and condition of materials and use scientific methods such as microscopic analysis, infrared spectroscopy, and reflectography.

However, these exhaustive methods are time-consuming and can result in errors. They also cannot identify multiple painters of one piece of art. According to the study, painters such as El Greco and Rembrandt often employed workshops of artists to paint parts of a canvas in the same style as their own, making individual contributions unclear.

While analyzing artwork with machine learning is a relatively new field, recent studies have focused on combining AI methods with high-resolution images of paintings to learn a painter’s style and identify an artist. The researchers hypothesized that 3D analysis could hold even more data than an image, where features such as brushwork patterns along with paint deposition and drying methods could serve as an artist’s unique fingerprint. 

“3D topography is a new way for AI to ‘see’ a painting,” senior author Kenneth Singer, the Ambrose Swasey Professor of Physics at CWRU, said in a press release.

Extracting topographical data from a surface with an optical profiler, the researchers scanned 12 paintings of the same scene, painted with identical materials, but by four different artists. Sampling small square patches of the art, approximately 5 to 15 mm, the optical profiler detects and logs minute changes on a surface, which can be attributed to how someone holds and uses a paintbrush. 

They then trained an ensemble of convolutional neural networks to find patterns in the small patches, sampling between 160 and 1,440 patches for each artist. Using NVIDIA GPUs with cuDNN-accelerated deep learning frameworks, the algorithm matches the samples back to a single painter.
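To make the idea concrete, here is a rough, hypothetical sketch of a patch-level CNN and a simple ensemble vote. The patch resolution, single-channel height-map input, network architecture, and ensemble size are all assumptions for illustration and are not the published model.

```python
# Sketch: an ensemble of small CNNs classifying topography patches by artist.
# Patch size (64x64 height maps), architecture, and ensemble size are assumptions.
import numpy as np
import tensorflow as tf

NUM_ARTISTS = 4

def make_patch_cnn() -> tf.keras.Model:
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 1)),       # single-channel height map
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(NUM_ARTISTS, activation="softmax"),
    ])

def train_ensemble(patches, labels, n_models=5, epochs=10):
    """Train several independently initialized CNNs on the same patch set."""
    models = []
    for _ in range(n_models):
        model = make_patch_cnn()
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(patches, labels, epochs=epochs, verbose=0)
        models.append(model)
    return models

def attribute(models, patches):
    """Average per-patch artist probabilities across the ensemble."""
    probs = np.mean([m.predict(patches, verbose=0) for m in models], axis=0)
    return probs.argmax(axis=1)
```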

The team tested the algorithm against 180 patches of an artist’s painting, matching the samples back to a painter with about 95% accuracy.

According to coauthor Michael Hinczewski, the Warren E. Rupp Associate Professor of Physics at CWRU, the ability to work with such small training sets is promising for later art historical applications with limited training datasets.

Figure 1: Overview of the data acquisition workflow and an ensemble of convolutional neural networks used to assign artist attribution probabilities to each patch. Credit: Ji, F., McMaster, M.S., Schwab, S. et al./Herit Sci

“Most of the other studies using AI for art attribution are focused on photos of entire paintings,” said Hinczewski. “We broke the painting down into virtual patches ranging from one-half millimeter to a few centimeters square. So we no longer even have information about the subject matter—but we can accurately predict who painted it from an individual patch. That’s amazing.”

Based on their findings, the researchers view surface topography as an additional tool for attribution and forgery detection using an unbiased and quantitative analysis. In a collaboration with Factum Arte, an art conservation company based in Madrid, the team is working on artist workshop attribution and conservation studies on several works of the Spanish Renaissance painter El Greco.

The data and code associated with the research are available through GitHub. The work is a joint effort between researchers from the CWRU Department of Art History and Art, Cleveland Institute of Art, and the Cleveland Museum of Art.


Read the published research in Heritage Science. >>
Read the press release. >>


NVIDIA Metropolis Partners Showcase Vision AI Traffic Optimization at CES 2022

Explore NVIDIA Metropolis partners showcasing new technologies to improve city mobility at CES 2022.

The Consumer Electronics Show (CES), an annual trade show organized by the Consumer Technology Association, brings together thought leaders, products, and technologies working to transform traffic and roadways, an important cross-section of daily life.

With limited roadways and growing populations, cities increasingly look to automation and simulation to manage traffic and constrained infrastructure. NVIDIA partners worldwide are deploying the NVIDIA Metropolis video analytics platform, leveraging real-time sensors and AI to design more efficient roadways and optimize traffic safety and operations.

The following NVIDIA Metropolis partners are showcasing how they help manage traffic with AI at CES. 

Asilla: Asilla develops behavior recognition AI solutions that use posture estimation technology to enhance public safety. Asilla is helping cities and a wide range of industries improve safety and security by detecting abnormal behavior in real time and enabling prompt response to events. Check out Asilla at booth #51127 in Sands Hall.

Bitsensing: Bitsensing uses GPU and radar technology to connect cities, roads, buildings, and individuals, building a complete autonomous, connected environment. With cutting-edge imaging radar technology, Bitsensing accelerates the creation of smart cities, bringing a new level of reliability and convenience. Visit Bitsensing at booth #61059 in Eureka Park.

Ekin: Ekin develops the next generation of smart city solutions to optimize safety and security for cities. A forward-thinking provider of quantitative data based on cutting-edge artificial intelligence technology, Ekin focuses on traffic management, smart parking, smart city living, and public safety. Visit Ekin at booth #9136 in LVCC North Hall.

Nota: Nota produces an AI software optimization platform that automates and optimizes customers’ AI applications. Nota creates lightweight AI models that are low latency, energy-efficient, and accurate, optimizing processor usage and enabling broader use of lower-end edge devices. Check out Nota at booth #9646 in LVCC North Hall.

NoTraffic: NoTraffic’s real-time, plug-and-play autonomous traffic management platform uses AI and cloud computing to reinvent how cities run their transport networks. The NoTraffic platform is an end-to-end hardware and software solution installed at intersections, transforming roadways to optimize traffic flows and reduce accidents. Check out NoTraffic at booth #9130 in LVCC North Hall.

Ouster: Cities are using Ouster digital lidar solutions to capture the environment in minute detail and detect vehicles, vulnerable road users, and incidents in real time to improve safety and traffic efficiency. Ouster lidar’s 3D spatial awareness and 24/7 performance combine the high-resolution imagery of cameras with the all-weather reliability of radar. Check out Ouster and live demos at booth #3843 in LVCC West Hall.

Velodyne Lidar: Velodyne’s lidar-based Intelligent Infrastructure Solution (IIS) is a complete end-to-end Smart City solution. IIS creates a real-time 3D map of roads and intersections, providing precise traffic and pedestrian safety analytics, road user classification, and smart signal actuation. The solution is deployed in the US, Canada and across EMEA and APAC. Check out Velodyne Lidar at booth #6005 in LVCC West Hall.

Register for CES, happening Jan. 5-8 in Las Vegas.
