submitted by /u/MLtinkerer
Hi All.
I am using Keras and TensorFlow 2.0. I have code that tries to
set the number of inter-op and intra-op threads. I added the
session setup for compatibility, but it still doesn’t work:
from keras import backend as K
….
….
import tensorflow as tf
session_conf = tf.compat.v1.ConfigProto(
    inter_op_parallelism_threads=int(os.environ['NUM_INTER_THREADS']),
    intra_op_parallelism_threads=int(os.environ['NUM_INTRA_THREADS']))
sess = tf.compat.v1.Session(
    graph=tf.compat.v1.get_default_graph(),
    config=session_conf)
K.set_session(sess)
Then it blows up with:
RuntimeError: `set_session` is not available when using
TensorFlow 2.0.
Any advice?
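One possible direction, sketched under assumptions: TensorFlow 2.x exposes the thread settings through tf.config.threading, so no Session or set_session call is needed. The helper names below and the fallback value of 0 (which TensorFlow treats as "let the runtime choose") are illustrative assumptions, not part of the original code.

```python
import os


def read_thread_counts(env=os.environ):
    """Read both thread counts from the environment.

    Falling back to 0 when a variable is unset is an assumption of this
    sketch; the original code would raise KeyError instead.
    """
    inter = int(env.get("NUM_INTER_THREADS", "0"))
    intra = int(env.get("NUM_INTRA_THREADS", "0"))
    return inter, intra


def configure_tf_threads(inter, intra):
    """Apply the counts through the TF 2.x configuration API."""
    # Imported lazily so read_thread_counts stays usable without TensorFlow.
    import tensorflow as tf

    # These calls should run at program start, before TensorFlow executes any op.
    tf.config.threading.set_inter_op_parallelism_threads(inter)
    tf.config.threading.set_intra_op_parallelism_threads(intra)


# Typical use at the top of a training script:
#     configure_tf_threads(*read_thread_counts())
```

Calling configure_tf_threads before building any model replaces the ConfigProto and Session steps entirely.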
submitted by /u/dunn_ditty
Once the founder of a wearable computing startup, Arye Barnehama understands the toils of manufacturing consumer devices. He moved to Shenzhen in 2014 to personally oversee production lines for his brain-wave-monitoring headband, Melon. It was an experience that left an impression: manufacturing needed automation. His next act is Elementary Robotics, which develops robotics for …
The post All AIs on Quality: Startup’s NVIDIA Jetson-Enabled Inspections Boost Manufacturing appeared first on The Official NVIDIA Blog.
Pinterest now has more than 440 million reasons to offer the best visual search experience. That’s because its monthly active users are tracking this high for its popular image sharing and social media service. Visual search enables Pinterest users to search for images using text, screenshots or camera photos. It’s the core AI behind how …
The post Pinterest Trains Visual Search Faster with Optimized Architecture on NVIDIA GPUs appeared first on The Official NVIDIA Blog.
NVIDIA SimNet v20.12 Released
NVIDIA recently announced the release of SimNet v20.12 with support for new physics such as Fluid Mechanics, Linear Elasticity and Conductive as well as Convective Heat Transfer. Systems governed by Ordinary Differential Equations (ODEs) as well as Partial Differential Equations (PDEs) can now be solved. With this release, use cases such as heat sinks, data center cooling, aerodynamics and deformation of solids in linear elastic regime can be solved.
First announced in September, NVIDIA SimNet is a Physics-Informed Neural Networks (PINNs) toolkit for students and researchers who want to get started with AI-driven physics simulations, or who want a powerful framework for applying their domain knowledge to complex nonlinear physics problems with real-world applications.
SimNet v20.12 highlights
Multi-parameter training of Complex Geometries and Physics:
As a result of enhancements in network architectures as well as performance improvements, SimNet v20.12 converges to a lower loss faster. This enables training on several parameters in a single run. For a 10-parameter Limerock, training and inference for 59,049 configurations (3 values for each of the 10 design parameters) took 1,000 V100 GPU hours. The same number of traditional solver runs would take over 18.4 million core hours (at 26 hours per configuration on a 12-core workstation).
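The figures quoted above can be checked with a few lines of arithmetic. Note that the 18.4 million figure matches core hours (26 hours times 12 cores per configuration) rather than wall-clock hours; that reading is an inference from the numbers, not a statement from the release notes.

```python
values_per_parameter = 3
num_parameters = 10
configurations = values_per_parameter ** num_parameters

hours_per_configuration = 26   # solver time per configuration, from the text
cores_per_workstation = 12     # workstation size, from the text

solver_wall_hours = configurations * hours_per_configuration
solver_core_hours = solver_wall_hours * cores_per_workstation

print(configurations)      # 59049
print(solver_wall_hours)   # 1535274
print(solver_core_hours)   # 18423288
```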
Linear Elasticity in Solids:
Linear elastic solid deformation is now included in the release in both the Navier-Cauchy and equilibrium forms. The solution agrees well with finite element results.
The stresses from SimNet’s linear elasticity formulation were used in a digital twin model, developed by the University of Central Florida, that uses an RNN to model fatigue crack growth in an aircraft panel.
Improved STL geometry library:
The PySDF library for STL geometries has been enhanced to deliver about 10x better performance, with better accuracy for complex geometries.
Integral form of Partial Differential Equations:
Some physics problems have no classical (strong) PDE form, only a variational (weak) form. Such problems must be handled differently from PDEs in their classical form, especially interface problems, concave domains, and singular problems. In SimNet, PDEs can be solved not only in their strong form but also in their weak form.
For example, a point source represented by a Dirac delta function cannot be solved by differential-equation-based PINNs, but an integral form can capture the singular behavior at the center.
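As a textbook illustration (not SimNet's own formulation), consider a Poisson problem on a domain Ω with a point source at the origin, assumed to lie inside Ω, and zero boundary values. The strong form contains the Dirac delta, which has no pointwise value. Multiplying by a smooth test function v that vanishes on the boundary and integrating by parts turns the source term into a simple point evaluation:

```latex
-\Delta u = \delta(\mathbf{x}) \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega
\quad\Longrightarrow\quad
\int_\Omega \nabla u \cdot \nabla v \, d\mathbf{x}
  = \int_\Omega \delta(\mathbf{x})\, v(\mathbf{x}) \, d\mathbf{x}
  = v(\mathbf{0})
```

The right-hand side is finite for every admissible v, which is why a loss built on the integral form can still be evaluated when the source is singular.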
Strong Scaling Performance:
In the multi-GPU cases, the learning rate is gradually increased from its baseline value. This lets the model train without diverging early on, and the larger global batch size combined with the higher learning rate lets it converge faster. For the NVSwitch heat sink case, the loss function evolution as the number of GPUs grows from 1 to 16 shows scaling that rises from 2x with 2 GPUs to 8x with 16 GPUs.
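To put the two reported data points in perspective, the sketch below converts them to parallel efficiency (speedup divided by GPU count). It assumes that "2x" and "8x" mean speedup relative to the single-GPU run, which the text does not state explicitly.

```python
def parallel_efficiency(speedup, num_gpus):
    """Fraction of ideal linear scaling achieved: speedup / num_gpus."""
    return speedup / num_gpus


# GPU count -> reported scaling factor (only the two points given in the text).
reported = {2: 2.0, 16: 8.0}

for gpus, speedup in reported.items():
    print(gpus, parallel_efficiency(speedup, gpus))
```

Under that assumption, the 2-GPU run scales ideally and the 16-GPU run reaches half of ideal linear scaling.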
SimNet in other news / events:
- At PHM Society 2020, NVIDIA collaborated with the University of Central Florida (UCF) on a digital twin of aircraft cabin panels built with SimNet, and the work was published at the PHM Society event. To learn more, watch the video (#2) here: https://www.phmsociety2020.com/corporate-sponsors/nvidia
- At SC20, we showcased how SimNet can help medical researchers simulate and predict the underlying blood flow physics in an aneurysm. Read more here: https://news.developer.nvidia.com/sc20-demo-flow-physics-quantification-in-an-aneurysm-using-nvidia-simnet/
Read the paper, NVIDIA SimNet: an AI-accelerated multi-physics simulation framework here.
Give SimNet v20.12 a try by requesting access today.
Refreshing a Live Service Game
How NetEase Thunder Fire Games keeps their Massively Multiplayer Online Game (MMO) “Justice” looking new years after release.
Delivering an endless stream of content to players in a live service game is an enormous undertaking. Managing that responsibility while staying graphically competitive is a herculean feat.
The fidelity bar is constantly being raised. Most games released in 2018 didn’t support real-time ray-tracing. Now, it’s a feature that players expect in cutting-edge games, and it’s been integrated into a wide range of titles. Justice – NetEase’s popular Chinese MMO – runs on an engine that debuted in 2012, but the game is beautiful by 2020 standards. This is thanks to talented artists, smart engine design, and the integration of real-time ray tracing and DLSS.
We talked with Haiyong Qian, NetEase Game Engine Development Researcher and Manager of NetEase Thunder Fire Games Technical Center, to see what he’s learned as the Justice team added NVIDIA ray-tracing solutions to their development pipeline.
NVIDIA: What is the development team size for Justice?
Qian: The whole development team has more than 300 members, and the game engine tech team has 20.
NVIDIA: Why did you decide to add ray traced effects into the game?
Qian: Applying ray tracing to real-time rendering, especially in games, has always been a dream of our developers, but performance limitations made it impossible before. In 2018, NVIDIA launched the first RTX GPU, which paved the way for this dream to come true, and we did not hesitate to try it in Justice.
NVIDIA: A lot of developers starting out with real-time ray tracing struggle with performance because they try to make everything reflective. Do you have any advice on materials to use when building an environment that will be ray traced?
Qian: There are still many optimization methods. For example, materials with high roughness in the scene do not need to participate in ray tracing. In addition, if the game engine is based on a deferred rendering architecture, rays can be emitted in screen space based on the GBuffer information to reduce the number of ray bounces.
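The roughness idea in the answer above can be illustrated with a small sketch. Everything here is hypothetical: the GBuffer is modeled as a list of (unit normal, roughness) pairs, the 0.5 cutoff is an arbitrary example value, and none of it reflects the actual Justice engine code.

```python
def reflect(direction, normal):
    """Mirror a view direction about a unit normal: r = d - 2 (d . n) n."""
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))


def reflection_rays(gbuffer, view_dir, max_roughness=0.5):
    """Return {pixel_index: reflected_direction} for pixels smooth enough to trace.

    gbuffer holds one (unit_normal, roughness) tuple per pixel.
    max_roughness is a hypothetical cutoff; rougher pixels spawn no ray.
    """
    rays = {}
    for index, (normal, roughness) in enumerate(gbuffer):
        if roughness <= max_roughness:
            rays[index] = reflect(view_dir, normal)
    return rays


gbuffer = [
    ((0.0, 1.0, 0.0), 0.1),  # polished floor: gets a reflection ray
    ((0.0, 1.0, 0.0), 0.9),  # rough stone: skipped
]
rays = reflection_rays(gbuffer, view_dir=(0.0, -1.0, 0.0))
print(rays)
```

Only the smooth pixel produces a ray, so the ray budget is spent where reflections are actually visible.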
NVIDIA: How long did it take to add RTXGI to your game? What does RTXGI do to improve the look of the game?
Qian: Before integrating RTXGI, we had already completed the DX12 upgrade of our game engine and the RT and DLSS integration. With that work done, adding RTXGI to the game was an easy task that took about 2 weeks. RTXGI solves some problems of traditional GI, namely light leaking and excessively long baking times, and it supports dynamic light sources, which greatly improves the expressiveness of the scene.
NVIDIA: What were your team’s biggest personal learnings about real-time ray tracing from working on Justice?
Qian: First of all, a breakthrough in technology requires sufficient accumulated experience. Secondly, combined team effort is very important. Without the close cooperation among our team members and the dedicated collaboration with the NVIDIA China team, this would not have been possible.
NVIDIA: How were you able to balance computationally expensive real-time ray tracing features with performance?
Qian: Justice is an open-world MMO. RT features are currently available in several suitable scenes, which achieves a good balance between image quality and performance. And of course, with the help of the killer app, DLSS, we will gradually enable RT in more and more scenes.
NVIDIA: Did you experience any bottlenecks or challenges when incorporating real-time ray tracing into the demo? If so, how did you overcome them?
Qian: Ours was the first in-house game engine in China to integrate RTX features, so there were tons of difficulties and challenges. For more than 2 months, our entire team slept only 3 or 4 hours a day. There were endless technical issues to solve. I would like to take this chance to thank the NVIDIA China team for their generous help in overcoming the difficulties one by one, which is how we reached today’s accomplishment.
NVIDIA: What has been made easier for your team with the integration of real-time ray tracing and DLSS into your pipeline?
Qian: It’s the advanced architecture of our engine. We could adopt RT and DLSS with only minor modifications to the render pipeline. The DX12 API upgrade, by contrast, was the largest workload in the whole RTX development process.
NVIDIA: How are real-time ray tracing and DLSS changing game development?
Qian: From the perspective of artists, it brings out a brand-new content creation pipeline and a better, richer visual quality. And from a game design perspective, RT can bring whole new gameplay elements.
NVIDIA: How has your audience responded to the new look of your game after real-time ray tracing and DLSS were added?
Qian: Players are very excited. They fully affirm the performance of RT and DLSS. You can see this in the Weibo and Baidu Tieba gamer communities. There is also a lot of feedback from overseas players on YouTube. Of course, after our RTX content was shown on stage in Jensen’s keynotes at GTC China 2018 and CES 2019, we were confident that Chinese game content would be accepted worldwide.
NVIDIA: If you had built the game from the ground up with real-time ray tracing and DLSS in mind, what would you have done differently?
Qian: We probably would have considered building DX12 API support into our engine from the start.
NVIDIA: Are you planning to release any other games with real-time ray tracing and DLSS?
Qian: Yes, there are several games under development by NetEase Thunder Fire studio which will feature RTX technologies. Please stay tuned.
NVIDIA: What real-time ray tracing effect in Justice are you most excited for your players to see?
Qian: All RT features can bring out more realistic representation of the game world. The most exciting one among them must be ray tracing reflections.
NVIDIA: What advice would you give to other developers who are building live service games, and want to keep their games looking competitive graphically?
Qian: Our strategy is to provide players with the best experience, regardless of whether the game is still in development or has been released to the public. Therefore, as long as a technology can enhance the player’s gaming experience, we will go all out to implement it in the game. And of course, first of all, you should have a good engine with relatively good extensibility, because none of us knows what advanced technology will look like in the future.
NVIDIA: Can you talk about any future plans to incorporate NVIDIA technology into Justice (such as NVIDIA Real-Time Denoiser, RTX Direct Illumination, etc.)?
Qian: In terms of technology, we have always been radical, and we will constantly push various new tech features, including those you mentioned. As long as these technologies can improve our players’ experience, we’ll work on them.
NVIDIA: What made you decide on integrating the latest ray tracing technology which no games have tried before, e.g. releasing the first real-time RTX demo in China and the first RTXGI powered game in the world?
Qian: To summarize, there are two main points: 1) It fulfills the dream of applying real-time ray tracing to games; 2) We think these technologies can bring our players a better game experience.
NVIDIA: Adding ray tracing effects into an in-house game engine could be more challenging than using existing commercial engines. If so, what are the challenges and what are the strengths of Justice’s engine?
Qian: I think the difficulties are the same, but with commercial engines they have already been solved by others. With an in-house engine, we must overcome them ourselves. As I said before, the biggest challenge was upgrading the engine to DX12, because when we designed this engine 8 years ago, DX12 had not yet been released, and its features were unforeseen at the time. Another big challenge was balancing RT effects and performance. Fortunately, our team has very rich experience in independent research, and our engine architecture can be extended freely both horizontally and vertically. The NVIDIA China content team also gave us very strong support. Eventually, these tasks were successfully accomplished.
Deep regression
Have people been using deep learning to do regression? I noticed
that fitting polynomials using least squares leads to much better
accuracy! Is there any rule of thumb to get arbitrary accuracy with
deep regression?
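For reference, the least-squares polynomial baseline mentioned above can be sketched in plain Python by solving the normal equations. The function name and the sample data are illustrative; when the data really is a polynomial of the chosen degree, the fit recovers the coefficients up to floating-point error, which is one reason it can look far more accurate than a trained network on such data.

```python
def polyfit_least_squares(xs, ys, degree):
    """Fit y ~ c0 + c1*x + ... + c_degree*x**degree via the normal equations."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A is the Vandermonde matrix of xs.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting on the augmented system.
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aug[row][n] - tail) / aug[row][row]
    return coeffs


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 - 2.0 * x + 0.5 * x * x for x in xs]  # noise-free quadratic
coeffs = polyfit_least_squares(xs, ys, degree=2)
print(coeffs)
```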
submitted by /u/matibilkis
Hello All,
I am new to TensorFlow and I have a problem where I need to
count boats on a lake using Keras. I have seen this done in two
separate papers now, one counting whales and another counting
ships in the ocean. However, both use Python.
While I am not opposed to learning another language, I was curious
whether there are any tutorials on using Keras to count
objects in R. Does anyone know of anything like this that I
could read over? At the moment I am stuck with either muddling my way
through building a CNN from scratch without any guidance or
learning a new language, neither of which I am
particularly excited about tackling.
Any help would be greatly appreciated.
submitted by /u/mthompson2100
CUDA 11.2 improves user experience and application performance through a combination of driver/toolkit compatibility enhancements, a new memory suballocator feature, and compiler upgrades.
CUDA Toolkit is a complete, fully-featured software development platform for building GPU-accelerated applications, providing all the components needed to develop apps targeting every NVIDIA GPU platform.
CUDA 11 announced support for the new NVIDIA A100 based on the NVIDIA Ampere architecture, and CUDA 11.1 delivered support for NVIDIA GeForce RTX 30 Series and Quadro RTX Series GPU platforms.
Today, CUDA 11.2 introduces improved user experience and application performance through a combination of driver/toolkit compatibility enhancements, a new memory suballocator feature, and compiler enhancements including an LLVM upgrade.
This new 11.2 release also delivers programming model updates to CUDA Graphs and Cooperative Groups, as well as expanded support for the latest generation of operating systems and compilers.
We describe some of these new features in more detail in a new blog post, Enhancing Memory Allocation with New NVIDIA CUDA 11.2 Features, and we will publish additional posts on compiler enhancements shortly. Follow all CUDA Developer Blogs here.
Download CUDA 11.2 Toolkit today.
Watch [GTC Fall Session] CUDA New Features and Beyond: Ampere Programming for Developers
The 2020.3 release of NVIDIA Nsight Compute included in CUDA Toolkit 11.2 introduces several new features that simplify the process of CUDA kernel profiling and optimization.
Profile Series
The new Profile Series feature allows developers to configure ranges for multiple kernel parameters. Nsight Compute will automatically iterate through the ranges and profile each combination to help you find the best configuration. These parameters include the number of registers per thread, shared memory sizes, and the shared memory configuration. This automates a process that previously would need manual support, and can provide optimized performance configurations with minimal changes to source code.
The Profile Series configuration is available in the UI’s Interactive Profiling activity.
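The sweep that Profile Series automates amounts to enumerating every combination of the configured values. The sketch below shows only that combinatorial idea; the parameter names and values are hypothetical and are not Nsight Compute options, and no profiling is performed.

```python
from itertools import product

# Hypothetical ranges for illustration only.
parameter_ranges = {
    "registers_per_thread": [32, 64, 128],
    "shared_memory_bytes": [0, 16384, 49152],
}


def enumerate_configurations(ranges):
    """Yield one dict per combination of the given parameter values."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        yield dict(zip(names, values))


configurations = list(enumerate_configurations(parameter_ranges))
print(len(configurations))  # 9
```

With two parameters of three values each, the sweep covers nine configurations, each of which would otherwise need a manual rebuild and profiling run.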
Import Source
This highly requested feature enables users to archive source files within their Nsight Compute results. It allows any user with access to the results to resolve performance data to lines in the source code, even if they don’t have access to the original source files. Sharing results with teammates and archiving them for future analysis are just a couple of uses for this new feature. Users can import source files with the --import-source command-line option or via the UI when configuring the profile.
Source Files can also be imported later via the Profile Menu.
Additionally, there are several other new capabilities available in this release. These include Memory Allocation Tracking, support for derived metrics, and additional configurations and advice for the recently released Application Replay feature.
For complete details, check out the Nsight Compute Release Notes.
Download Nsight Compute 2020.3 and check out featured spotlight video demonstrations on Roofline Analysis and Application Replay!