Categories
Misc

Linux seems stuck on cuDNN, how to fix it?

Hi,

I'm using the same repo to run the TensorFlow GPU code on both Windows and Linux.

On Windows, it gets past this stage:

- weight name: discriminator/gan/conv6/bias:0, shape: [256], size: 256
Trigger callback: Total counts of trainable weights: 33579064. Total size of trainable weights: 0G 32M 24K 56B (Assuming 32-bit data type.)
2022-05-05 11:29:14.865680: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2022-05-05 11:29:15.131382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-05-05 11:29:15.901900: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows Relying on driver to perform ptx compilation. This message will be only logged once.

However, on Linux it stays at

Total size of trainable weights: 0G 32M 24K 56B (Assuming 32-bit data type.)

forever.

I assume this means Linux wasn’t able to open cuDNN?

cuDNN was installed through conda:

$ conda list
cudatoolkit    10.0.130     hf841e97_10    conda-forge
cudnn          7.6.5.32     ha8d7eb6_1     conda-forge

These versions seem fine for tensorflow-gpu 1.15 per the TensorFlow tested CUDA configurations.

After installing CUDA with the system package manager, there is another cudatoolkit 10 and the cudnn that come with tensorflow-gpu. I did not restart the machine. Does this matter?

2022-05-05 12:46:14.496348: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2022-05-05 12:46:14.520365: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2601325000 Hz
2022-05-05 12:46:14.521958: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5565bcb72f40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-05-05 12:46:14.521982: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2022-05-05 12:46:14.523737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-05-05 12:46:14.526821: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_SYSTEM_DRIVER_MISMATCH: system has unsupported display driver / cuda driver combination
2022-05-05 12:46:14.526889: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 510.60.2
2022-05-05 12:46:14.526904: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 495.44.0
2022-05-05 12:46:14.526916: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:313] kernel version 495.44.0 does not match DSO version 510.60.2 -- cannot find working devices in this configuration

Is this because the CUDA 11 installed by the system is not compatible with the cudatoolkit 10 from conda?

Any idea how to fix this one?
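
In case it helps, this is the minimal snippet I run inside the conda env to check whether TF sees the GPU (tensorflow-gpu 1.15; these are standard TF 1.x calls):

import tensorflow as tf

print(tf.version.VERSION)                        # expect 1.15.x
print(tf.test.is_built_with_cuda())              # True for the GPU build
print(tf.test.is_gpu_available(cuda_only=True))  # False while the driver mismatch persists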

Thanks a lot.

submitted by /u/boydbuilding

Categories
Misc

Does TensorFlow 1.15 work with CUDA 11 and cuDNN 8?

Hi,

I found this:

https://github.com/NVIDIA/tensorflow

but I have no idea how to make it work. When I try to install it, I get:

no matches found: nvidia-tensorflow[horovod] 
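
(From what I can tell, that error is zsh treating the square brackets as a glob pattern rather than pip failing, so the package spec probably needs quoting, e.g. pip install --user "nvidia-tensorflow[horovod]". The quoted form is my assumption based on the repo's install instructions.)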

Right now:

$ nvcc --version
Cuda compilation tools, release 11.6,

and

$ conda list
cudatoolkit    10.0.130     hf841e97_10    conda-forge
cudnn          7.6.5.32     ha8d7eb6_1     conda-forge

and TensorFlow doesn’t work right now. The repo was developed under TensorFlow 1.15.

What is the best solution right now?

Try to make TensorFlow 1.15 work with CUDA 11 and cuDNN 8?

Or downgrade the system CUDA 11?

Or something else?

Thanks a ton.

submitted by /u/boydbuilding

Categories
Misc

Driver’s Ed: How Waabi Uses AI, Simulation to Teach Autonomous Vehicles to Drive

Teaching the AI brains of autonomous vehicles to understand the world as humans do requires billions of miles of driving experience. The road to achieving this astronomical level of driving leads to the virtual world. On the latest episode of the AI Podcast, Waabi CEO and founder Raquel Urtasun joins NVIDIA’s Katie Burke Washabaugh to…


Categories
Misc

Upcoming Event: Session on Vision AI at the Edge, from Zero to Deployment Using Low-Code Development at Embedded Vision Summit

This Embedded Vision Summit session will showcase a low-code approach to developing fully optimized and accelerated vision AI applications using the DeepStream SDK and Graph Composer.

Categories
Misc

DLI Course: Disaster Risk Monitoring Using Satellite Imagery

Learn to build and deploy a deep learning model for automated flood detection using satellite imagery in this new self-paced course from the NVIDIA Deep Learning Institute.

Categories
Offsites

Learning Locomotion Skills Safely in the Real World

The promise of deep reinforcement learning (RL) in solving complex, high-dimensional problems autonomously has attracted much interest in areas such as robotics, game playing, and self-driving cars. However, effectively training an RL policy requires exploring a large set of robot states and actions, including many that are not safe for the robot. This is a considerable risk, for example, when training a legged robot. Because such robots are inherently unstable, there is a high likelihood of the robot falling during learning, which could cause damage.

The risk of damage can be mitigated to some extent by learning the control policy in computer simulation and then deploying it in the real world. However, this approach usually requires addressing the difficult sim-to-real gap, i.e., the policy trained in simulation cannot be readily deployed in the real world for various reasons, such as sensor noise in deployment or the simulator not being realistic enough during training. Another approach is to directly learn or fine-tune a control policy in the real world. But again, the main challenge is to ensure safety during learning.

In “Safe Reinforcement Learning for Legged Locomotion”, we introduce a safe RL framework for learning legged locomotion while satisfying safety constraints during training. Our goal is to learn locomotion skills autonomously in the real world without the robot falling during the entire learning process. Our learning framework adopts a two-policy safe RL framework: a “safe recovery policy” that recovers robots from near-unsafe states, and a “learner policy” that is optimized to perform the desired control task. The safe learning framework switches between the safe recovery policy and the learner policy to enable robots to safely acquire novel and agile motor skills.

The Proposed Framework
Our goal is to ensure that during the entire learning process, the robot never falls, regardless of the learner policy being used. Similar to how a child learns to ride a bike, our approach teaches an agent a policy while using “training wheels”, i.e., a safe recovery policy. We first define a set of states, which we call a “safety trigger set”, where the robot is close to violating safety constraints but can still be saved by a safe recovery policy. For example, the safety trigger set can be defined as a set of states in which the height of the robot is below a certain threshold and the roll, pitch, and yaw angles are too large, which is an indication of falls. When the learner policy results in the robot being within the safety trigger set (i.e., where it is likely to fall), we switch to the safe recovery policy, which drives the robot back to a safe state. We determine when to switch back to the learner policy by leveraging an approximate dynamics model of the robot to predict the future robot trajectory. For example, given the positions of the robot’s legs and the current roll, pitch, and yaw angles measured by its sensors, is it likely to fall in the future? If the predicted future states are all safe, we hand control back to the learner policy; otherwise, we keep using the safe recovery policy.

The state diagram of the proposed approach. (1) If the learner policy violates the safety constraint, we switch to the safe recovery policy. (2) If the learner policy cannot ensure safety in the near future after switching to the safe recovery policy, we keep using the safe recovery policy. This allows the robot to explore more while ensuring safety.
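
A minimal sketch of this switching rule, in Python pseudocode. The function names, thresholds, and dynamics rollout below are illustrative assumptions, not the paper's implementation:

def in_safety_trigger_set(state, min_height=0.2, max_tilt=0.6):
    # Illustrative trigger set: the robot is too low or tilted too far,
    # an indication of an imminent fall (thresholds are made up).
    return (state.height < min_height
            or max(abs(state.roll), abs(state.pitch)) > max_tilt)

def select_policy(state, learner_policy, recovery_policy, rollout, horizon=10):
    """Decide which policy controls the robot at this step."""
    if in_safety_trigger_set(state):
        # (1) Near a safety violation: the safe recovery policy takes over.
        return recovery_policy
    # (2) Predict future states under the learner policy using the
    # approximate dynamics model; keep the recovery policy if any are unsafe.
    future_states = rollout(state, learner_policy, horizon)
    if any(in_safety_trigger_set(s) for s in future_states):
        return recovery_policy
    return learner_policy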

This approach ensures safety in complex systems without resorting to opaque neural networks that may be sensitive to distribution shifts in application. In addition, the learner policy is able to explore states that are near safety violations, which is useful for learning a robust policy.

Because we use “approximated” dynamics to predict the future trajectory, we also examine how much safer a robot would be if we used a much more accurate model of its dynamics. We provide a theoretical analysis of this problem and show that our approach achieves minimal loss in safety performance compared to a controller with full knowledge of the system dynamics.

Legged Locomotion Tasks
To demonstrate the effectiveness of the algorithm, we consider learning three different legged locomotion skills:

  1. Efficient Gait: The robot learns how to walk with low energy consumption and is rewarded for consuming less energy.
  2. Catwalk: The robot learns a catwalk gait pattern, in which the left and right pairs of feet are close to each other. This is challenging because by narrowing the support polygon, the robot becomes less stable.
  3. Two-leg Balance: The robot learns a two-leg balance policy, in which the front-right and rear-left feet are in stance, and the other two are lifted. The robot can easily fall without delicate balance control because the contact polygon degenerates into a line segment.
Locomotion tasks considered in the paper. Top: efficient gait. Middle: catwalk. Bottom: two-leg balance.

Implementation Details
We use a hierarchical policy framework that combines RL and a traditional control approach for the learner and safe recovery policies. This framework consists of a high-level RL policy, which produces gait parameters (e.g., stepping frequency) and feet placements, paired with a low-level process controller called model predictive control (MPC) that takes in these parameters and computes the desired torque for each motor in the robot. Because we do not directly command the motors’ angles, this approach provides more stable operation, streamlines policy training due to a smaller action space, and results in a more robust policy. The input of the RL policy network includes the previous gait parameters, the height of the robot, the base orientation, the linear and angular velocities, and feedback indicating whether the robot is approaching the safety trigger set. We use the same setup for each task.
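
As a rough illustration of one step of this hierarchy (the names and signatures below are assumptions, not the authors' code):

def control_step(observation, rl_policy, mpc):
    # High level: the RL policy maps the observation (previous gait
    # parameters, height, orientation, velocities, safety feedback)
    # to gait parameters and desired foot placements.
    gait_params, foot_placements = rl_policy(observation)
    # Low level: MPC converts these targets into per-motor torques,
    # so the policy never commands motor angles directly.
    return mpc.solve(gait_params, foot_placements, observation)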

We train a safe recovery policy with a reward for reaching stability as soon as possible. Furthermore, we design the safety trigger set with inspiration from capturability theory. In particular, the initial safety trigger set is defined to ensure that the robot’s feet can not fall outside of the positions from which the robot can safely recover using the safe recovery policy. We then fine-tune this set on the real robot with a random policy to prevent the robot from falling.

Real-World Experiment Results
We report the real-world experimental results showing the reward learning curves and the percentage of safe recovery policy activations on the efficient gait, catwalk, and two-leg balance tasks. To ensure that the robot can learn to be safe, we add a penalty when triggering the safe recovery policy. Here, all the policies are trained from scratch, except for the two-leg balance task, which was pre-trained in simulation because it requires more training steps.
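
The penalty can be as simple as subtracting a constant whenever the recovery policy fires; the form below is an assumption, since the exact shaping is not spelled out here:

def shaped_reward(task_reward, recovery_triggered, penalty=1.0):
    # Discourage entering the safety trigger set: subtract a fixed,
    # illustrative penalty whenever the safe recovery policy was used.
    return task_reward - (penalty if recovery_triggered else 0.0)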

Overall, we see that on these tasks, the reward increases, and the percentage of uses of the safe recovery policy decreases over policy updates. For instance, the percentage of uses of the safe recovery policy decreases from 20% to near 0% in the efficient gait task. For the two-leg balance task, the percentage drops from near 82.5% to 67.5%, suggesting that the two-leg balance is substantially harder than the previous two tasks. Still, the policy does improve the reward. This observation implies that the learner can gradually learn the task while avoiding the need to trigger the safe recovery policy. In addition, this suggests that it is possible to design a safe trigger set and a safe recovery policy that does not impede the exploration of the policy as the performance increases.

The reward learning curve (blue) and the percentage of safe recovery policy activations (red) using our safe RL algorithm in the real world.

In addition, the following video shows the learning process for the two-leg balance task, including the interplay between the learner policy and the safe recovery policy, and the reset to the initial position when an episode ends. We can see that the robot tries to catch itself when falling by putting down the lifted legs (front left and rear right) outward, creating a support polygon. After the learning episode ends, the robot walks back to the reset position automatically. This allows us to train the policy autonomously and safely without human supervision.

Early training stage.
Late training stage.
Without a safe recovery policy.

Finally, we show clips of the learned policies. First, in the catwalk task, the distance between the two sides of the legs is 0.09 m, which is 40.9% smaller than the nominal distance. Second, in the two-leg balance task, the robot can maintain balance by jumping up to four times on two legs, compared to one jump for the policy pre-trained in simulation.

Final learned two-leg balance.

Conclusion
We presented a safe RL framework and demonstrated how it can be used to train a robotic policy with no falls and without the need for a manual reset during the entire learning process for the efficient gait and catwalk tasks. This approach even enables training of a two-leg balance task with only four falls. The safe recovery policy is triggered only when needed, allowing the robot to more fully explore the environment. Our results suggest that learning legged locomotion skills autonomously and safely is possible in the real world, which could unlock new opportunities including offline dataset collection for robot learning.

No model is without limitation. We currently ignore the model uncertainty from the environment and non-linear dynamics in our theoretical analysis. Including these would further improve the generality of our approach. In addition, some hyper-parameters of the switching criteria are currently being heuristically tuned. It would be more efficient to automatically determine when to switch based on the learning progress. Furthermore, it would be interesting to extend this safe RL framework to other robot applications, such as robot manipulation. Finally, designing an appropriate reward when incorporating the safe recovery policy can impact learning performance. We use a penalty-based approach that obtained reasonable results in these experiments, but we plan to investigate this in future work to make further performance improvements.

Acknowledgements
We would like to thank our paper co-authors: Tingnan Zhang, Linda Luu, Sehoon Ha, Jie Tan, and Wenhao Yu. We would also like to thank the team members of Robotics at Google for discussions and feedback.

Categories
Misc

Bolster Network, Storage, and Security Infrastructure Services with NVIDIA DOCA 1.3

NVIDIA DOCA libraries simplify the development process of BlueField DPU applications. The latest release of the NVIDIA DOCA software framework focuses on enhancements to DOCA infrastructure services.

The NVIDIA DOCA software framework provides a comprehensive, open development platform to accelerate the creation of DPU applications. DOCA continues to gain momentum and push the boundaries of the data center to offload, accelerate, and isolate network, storage, security, and management infrastructure. The release of the NVIDIA DOCA 1.3 software framework focuses on new features and enhancements of the software.

Key capabilities of DOCA 1.3

  • DOCA FLOW Lib with Optimized Flow Insertion
  • DOCA Communications Channel Library
  • DOCA Regex Library
  • DOCA App Shield SDK
  • OVN IPsec Encryption Full Offload
  • DOCA Services additions and enhancements include:
    • DOCA Telemetry
    • DOCA Host Based Networking
    • DOCA Flow Inspector

DOCA FLOW with Optimized Flow Insertion 

DOCA FLOW is an API that serves as an abstraction layer for network acceleration. It is the most fundamental API for building generic SDN execution pipelines in hardware.

The main goal of DOCA FLOW is to provide a simple, complete framework for fast packet processing in data plane applications. The API provides a set of libraries for specific environments through the creation of an abstraction layer. DOCA FLOW makes it easy to develop HW-accelerated applications that match on up to two layers of tunneled packets.
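
As a loose illustration of the match-action pipeline concept, here is a small Python sketch with entirely invented names (this is not the DOCA FLOW API, which is a C library):

class Pipe:
    """Hypothetical match-action pipe, purely for illustration."""
    def __init__(self, match, action, next_pipe=None):
        self.match, self.action, self.next_pipe = match, action, next_pipe

    def process(self, pkt):
        # Apply the action if every match field agrees with the packet.
        if all(pkt.get(k) == v for k, v in self.match.items()):
            return self.action(pkt)
        # Otherwise fall through to the next pipe, or pass unchanged.
        return self.next_pipe.process(pkt) if self.next_pipe else pkt

forward = Pipe(match={}, action=lambda pkt: pkt)        # default: pass through
drop_telnet = Pipe(match={"dst_port": 23},              # drop port 23 traffic
                   action=lambda pkt: None, next_pipe=forward)

print(drop_telnet.process({"dst_port": 80}))  # {'dst_port': 80} (forwarded)
print(drop_telnet.process({"dst_port": 23}))  # None (dropped)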

With the addition of Optimized Flow Insertion (OFI), DOCA FLOW now offers a new way to manage the packet steering table of the DPU, with several additional benefits: an increased flow insertion rate, delivering more than a 10X performance improvement and scaling to over 1M rules/sec; an improved security posture that eliminates the ability to hijack the underlying driver; and greater flexibility.

DOCA Communications Channel

This release also introduces the DOCA Communications Channel for secure, flexible, and efficient application offload. The DOCA Communications Channel provides isolated communication between the host software and the DOCA services running on the DPU. This gives, for example, a Windows VM the ability to securely communicate with a service on the DPU Arm processors, without using the regular network stack and risking exposure to malicious activity. Examples of DOCA services benefiting from this communication method include streaming services, Telemetry, App Shield monitoring, Remote API Orchestration, and Flow Inspection.

Regex Library

Regular expression, also known as regex, is a standard pattern-matching tool used in many scripting languages. With it, you can create filters that match patterns of text rather than just single words or phrases. The DOCA Regex Library was designed for high-throughput, low-latency deep packet inspection (DPI) applications that require packet payload inspection and anomaly detection, which can be achieved using regex pattern matching and string matching. This important security and telemetry function is now available in DOCA 1.3.
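
For intuition, here is a plain-software analogue of payload pattern matching; the signatures are made up for illustration, and DOCA Regex performs this class of matching in hardware:

import re

# Example signatures: a naive SQL-injection probe and a Windows PE
# header magic, matched against raw packet payloads.
signatures = [
    re.compile(rb"(?i)select\s.+\sfrom"),
    re.compile(rb"\x4d\x5a\x90\x00"),
]

def inspect(payload: bytes) -> bool:
    """Return True if any signature matches the payload."""
    return any(sig.search(payload) for sig in signatures)

print(inspect(b"GET /?q=SELECT name FROM users HTTP/1.1"))  # True
print(inspect(b"regular traffic"))                          # False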

DOCA App Shield

DOCA App Shield was introduced in DOCA 1.2 for early access developers and has been enhanced in the DOCA 1.3 release.  

App Shield provides host monitoring, enabling cybersecurity vendors to create accelerated intrusion detection system solutions that identify an attack on any physical or virtual machine. It can feed application status data to security information and event management (SIEM) or extended detection and response (XDR) tools. It can also enhance forensic investigations and incident response.

Security teams can protect their application processes, continuously validate integrity, and detect malicious activity with App Shield. If an attacker kills the machine’s security agent processes, App Shield can isolate the compromised host and prevent the malware from accessing confidential data or spreading to other resources. App Shield is an important advancement in the fight against cybercrime and an effective tool for zero-trust security.

DOCA 1.3 now offers the App Shield Lib with a reference application and a supporting user’s guide for early access members.

OVN IPsec encryption full offload

DOCA 1.3 includes support for existing OVN deployments to accelerate IPsec datapath packet processing. OVN tunnels packets between physical devices and provides a single global configuration option to enable IPsec encryption for all OVN tunneled traffic in the network. With DOCA 1.3, drivers and runtime components have been updated to offload IPsec packet encryption, decryption, and HMAC authentication, all with zero host CPU utilization, based on the BlueField DPU.

Host Based Networking

Host Based Networking (HBN) on the BlueField DPU helps manage and monitor traffic between VMs or containers on the same node. It also analyzes and encrypts (or decrypts then analyzes) traffic to and from the node—tasks that no ToR switch can perform.

HBN with BlueField DPUs revolutionizes how customers build and think about data center networks by simplifying ToR switch requirements as more intelligence is placed on the DPU. BlueField also provides an isolated environment for network policy configuration and enforcement, without software or dependencies on the host.

Additional DOCA 1.3 SDK updates

  • LAG with Multi-host support
  • VirtIO enhancements

Community 

DOCA supports an open ecosystem for developers by providing industry-standard open APIs and frameworks and continuous improvements of DOCA Libs and services. To learn more about the community, or contribute to the innovation on the NVIDIA NGC catalog, join us on our forum.

Watch on-demand GTC sessions to learn more about DOCA.

Categories
Misc

MobileNetV2 for multiple detection

I have a model trained with MobileNetV2 for a specific project. The machine where the model is integrated can only accept one object at a time. The problem is I cannot control the user input, since the machine will interact with lots of users, so there may be a case where 2 or more objects are presented to the machine. I want to know if it’s possible for the model to determine that 2 or more objects with different classifications are indeed present in the machine, or if not, what my workaround could be.

My dataset doesn’t have annotations, but I’m not quite sure annotations would help since I’m only taking still images.

submitted by /u/clareeenceee

Categories
Misc

Trying to determine if TensorFlow can be used in this case.

Here is my use case. I have an image set of a full 360° walk-around of a vehicle. I want to be able to classify which images are of the front, rear, passenger side, and driver side of the vehicle. I can train a model with hundreds of these image sets, all of different vehicles, each with around 100 frames per set.

I’m wondering if TensorFlow would be effective in classifying images into those four categories. I understand it would probably be good at classifying types of vehicles. I’m just uncertain whether it would be good at this classification, which is essentially a rotation of the various vehicles.

submitted by /u/LordPSIon

Categories
Misc

GFN Thursday Caught in 4K: 27 Games Arriving on GeForce NOW in May, Alongside 4K Streaming to PC and Mac Apps

Enjoy the finer things in life. May is looking pixel perfect for GeForce NOW gamers. RTX 3080 members can now take their games to the next level, streaming at 4K resolution on the GeForce NOW PC and Mac native apps — joining 4K support in the living room with SHIELD TV. There’s also a list…
