Categories
Misc

From Earth Sciences to Factory Production: GPU Hackathon Optimizes Modeling Results

The recent Taiwan Computing Cloud GPU Hackathon helped 12 teams advance their HPC and AI projects, using innovative technologies to address pressing global challenges.

Group image of participants of the digital TWCC GPU Hackathon.

While the world is continuously changing, one constant is the ongoing drive of developers to tackle challenges using innovative technologies. The recent Taiwan Computing Cloud (TWCC) GPU Hackathon exemplified such a drive, serving as a catalyst for developers and engineers to advance their HPC and AI projects using GPUs. 

The event was a collaboration between the National Center for High-Performance Computing, Taiwan Web Service Corporation, NVIDIA, and OpenACC. Its 12 teams, supported by 15 NVIDIA mentors, worked to accelerate projects ranging from an AI-driven manufacturing scheduling model to a rapid flood prediction model.

Tapping AI to optimize production efficiency 

One of the key areas of smart manufacturing is optimizing and automating production line processes. Team AI Scheduler, with members from the Computational Intelligence Technology Center (CITC) of the Industrial Technology Research Institute (ITRI), came to the hackathon to work on their machine learning-based manufacturing scheduling model.

Traditional scheduling models mostly employ heuristic rules, which can respond to dynamic events instantly. However, their short-term approach often fails to find the optimal solution and proves inflexible when variables change, which limits their ongoing viability.

The team’s approach uses a Monte Carlo Tree Search (MCTS) method, combining classic tree search with reinforcement learning principles from machine learning. This method addresses the limitations of existing heuristics and improves the efficiency of the overall scheduling model.

Working with their mentor, Team AI Scheduler learned to use NVIDIA Nsight Systems to identify bottlenecks and use GPUs to parallelize their code. At the conclusion of the event, the team was able to accelerate the simulation step of their MCTS algorithm. This reduced the scheduling time from 6 hours to 30 minutes and achieved a speedup of 11.3x in overall scheduling efficiency.  
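To illustrate why the simulation step of MCTS is a natural target for parallel hardware, here is a minimal sketch of batched random rollouts. It is not the team's code: the toy problem (assigning jobs to the least-loaded machine), the function name `batched_rollouts`, and all parameters are assumptions made for illustration. The point is that every rollout is independent, so one array operation can advance all of them at once.

```python
import numpy as np

def batched_rollouts(durations, n_machines, n_rollouts, rng):
    """Run many random rollouts at once and return each rollout's makespan.

    Toy stand-in for the MCTS simulation step (hypothetical problem):
    each rollout tries one random job order on identical machines.
    """
    durations = np.asarray(durations, dtype=float)
    n_jobs = durations.size
    # One random job order per rollout, shape (n_rollouts, n_jobs).
    orders = np.argsort(rng.random((n_rollouts, n_jobs)), axis=1)
    loads = np.zeros((n_rollouts, n_machines))
    rows = np.arange(n_rollouts)
    for step in range(n_jobs):
        job = orders[:, step]            # next job in every rollout
        machine = loads.argmin(axis=1)   # least-loaded machine in every rollout
        loads[rows, machine] += durations[job]
    # Makespan = finishing time of the busiest machine.
    return orders, loads.max(axis=1)

rng = np.random.default_rng(0)
orders, makespans = batched_rollouts([3, 2, 2, 1], n_machines=2,
                                     n_rollouts=1000, rng=rng)
best_order = orders[makespans.argmin()]
print("best makespan:", makespans.min(), "job order:", best_order)
```

The only sequential loop runs over jobs; the rollouts themselves are processed as whole arrays, which is the kind of workload a GPU array library can spread across many threads.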

“Having proved the feasibility of using GPUs to accelerate our model at this hackathon, the next step is to adopt it into our commercial models for industry use,” said Dr. Tsan-Cheng Su and Hao-Che Huang of CITC, ITRI. 

Using GPUs to see the big picture in Earth sciences 

Located between the Eurasian Plate and the Philippine Sea Plate, Taiwan is one of the most tectonically active places in the world and an important base for global seismological research. The time scale of tectonic activity studied in geological research is often measured in thousands, or tens of thousands, of years. Analyzing activity over such spans efficiently requires massive amounts of data and adequate compute power.

Figure 1. Team IES-Geodynamics, led by Dr. Tan (center).

The IES-Geodynamics team, led by Dr. Tan from the Institute of Earth Sciences (IES), Academia Sinica, came to the GPU Hackathon to accelerate their numerical geodynamic model. Named DynEarthSol, it simulates mantle convection, subduction, mountain building, and tectonics. Previously, the team coped with large volumes of data by chunking the data into pieces and reducing the number of calculations and steps so that the work fit within the limited computing power of the CPU. This made it very difficult to see the full picture of the research.

Over the course of the hackathon, the team adopted a new data input method that uses the GPU to process the data across multiple steps. Using OpenACC, Team IES-Geodynamics ported 80% of their model to GPUs and achieved a 13.6x speedup.

“This is my second time attending a GPU Hackathon and I will definitely attend the next one,” said Professor Eh Tan, Research Fellow from IES, Academia Sinica. “We have learned the appropriate way to adopt GPUs and the user-friendly profiling tool gives us a great idea for how to optimize our model.” 

The team will continue to work towards porting the remaining 20% of their model. They look forward to running more high-resolution models using GPUs to gain a deeper understanding of formation activities in Taiwan. 

Rapid flood assessment for emergency planning and response 

Flooding is among the most devastating natural disasters. Causing massive casualties and economic losses, floods affect an average of 21 million people worldwide each year with numbers expected to rise due to climate change and other factors. Preventing and mitigating these hazards is a critical endeavor. 

THINKLAB, a team from National Yang Ming Chiao Tung University (NYCU), is developing a model that can provide fast and accurate results for emergency purposes while remaining simple to operate. The proposed hybrid inundation model (HIM) solves the zero-inertia equation through the Cellular Automata approach and works with subgrid-scale interpolation strategies to generate higher-resolution results.

Figure 2. Example of flood extents produced by the HIM.

Developed using Python and the NumPy library, the HIM ran without parallel or GPU computation at the onset of the hackathon. During the event, Team THINKLAB used CuPy to parallelize their code to run on GPUs, then focused on applying user-defined CUDA kernels to the parameters. The result was a 672x speedup, bringing the computation time from 2 weeks to approximately 30 minutes.
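The move from NumPy to CuPy can be sketched with a toy cellular-automaton update. This is not the HIM or its zero-inertia solver: the rule below (water flows toward lower neighboring water surfaces), the function name `ca_step`, the `rate` parameter, and the wrap-around boundaries are all simplifying assumptions. The function takes the array module as its `xp` argument, so the same code runs with `numpy` and, assuming CuPy and a GPU are available, with `cupy`.

```python
import numpy as np

def ca_step(xp, depth, ground, rate=0.2):
    """One toy cellular-automaton step: water moves to lower neighbors.

    xp is the array module (numpy here; cupy should also work).
    Boundaries wrap around, an assumption made only to keep this short.
    """
    head = ground + depth                        # water surface elevation
    shifts = [(1, 0), (-1, 0), (1, 1), (-1, 1)]  # (shift, axis) per neighbor
    # Outflow toward each neighbor, driven by the positive head difference.
    out = [rate * xp.maximum(head - xp.roll(head, s, axis=a), 0.0)
           for s, a in shifts]
    total = sum(out)
    # Scale outflows down so a cell never gives away more water than it holds.
    scale = xp.where(total > depth, depth / xp.maximum(total, 1e-12), 1.0)
    new_depth = depth - total * scale
    for (s, a), flow in zip(shifts, out):
        # Deliver each outflow to the neighbor it was computed against.
        new_depth = new_depth + xp.roll(flow * scale, -s, axis=a)
    return new_depth

ground = np.zeros((8, 8))
depth = np.zeros((8, 8))
depth[4, 4] = 1.0                  # a single column of water
for _ in range(10):
    depth = ca_step(np, depth, ground)
print("total water:", depth.sum())
```

To try it on a GPU, one would create `ground` and `depth` as CuPy arrays and pass `cupy` as `xp`; because every cell is updated with whole-array operations, no Python loop over grid cells is needed.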

“We learned so many techniques during this event and highly recommend these events to others,” said Obaja Wijaya, team member of THINKLAB. “NVIDIA is the expert in this field and by working with their mentors we have learned how to optimize models/codes using GPU programming.” 

Additional hackathons and boot camps are scheduled throughout 2022. For more information on GPU Hackathons and future events, visit https://www.gpuhackathons.org

Is it possible to embed a TensorFlow Lite model into an 8-bit microcontroller?

Hey guys, need your help, please 🙂

submitted by /u/markwatsn

How many times does TensorFlow Lite usually compress a model built with TensorFlow?

What do u think about it?

submitted by /u/markwatsn

Colab script for object detection with tensorflow and keras – ValueError: Unexpected result of `train_function` (Empty logs)

Hello to everyone,

I am trying to adapt the script from this Keras example to my custom dataset, but I run into the following issue:

ValueError: Unexpected result of train_function (Empty logs). Please use Model.compile(…, run_eagerly=True), or tf.config.run_functions_eagerly(True) for more information of where went wrong, or file a issue/bug to tf.keras.

My dataset is as follows (I flattened it to get around an error when converting the dict to a TensorFlow dataset):

<TensorSliceDataset element_spec={'image/filename': TensorSpec(shape=(), dtype=tf.string, name=None), 'image/id': TensorSpec(shape=(), dtype=tf.int32, name=None), 'is_crowd': TensorSpec(shape=(), dtype=tf.bool, name=None), 'area': TensorSpec(shape=(), dtype=tf.float32, name=None), 'bbox': TensorSpec(shape=(1, 4), dtype=tf.float32, name=None), 'id': TensorSpec(shape=(), dtype=tf.int32, name=None), 'image': TensorSpec(shape=(480, 640, 3), dtype=tf.float32, name=None), 'label': TensorSpec(shape=(), dtype=tf.int32, name=None)}> 

while the example dataset is

<PrefetchDataset element_spec={'image': TensorSpec(shape=(None, None, 3), dtype=tf.uint8, name=None), 'image/filename': TensorSpec(shape=(), dtype=tf.string, name=None), 'image/id': TensorSpec(shape=(), dtype=tf.int64, name=None), 'objects': {'area': TensorSpec(shape=(None,), dtype=tf.int64, name=None), 'bbox': TensorSpec(shape=(None, 4), dtype=tf.float32, name=None), 'id': TensorSpec(shape=(None,), dtype=tf.int64, name=None), 'is_crowd': TensorSpec(shape=(None,), dtype=tf.bool, name=None), 'label': TensorSpec(shape=(None,), dtype=tf.int64, name=None)}}> 

My script is publicly available here. If anyone can help with what I am doing wrong (i.e. input images, tensors, model building), I would be so grateful!!

submitted by /u/agristats

Any advice on how to deploy a deep-learning model on mobile devices?

We currently have an app built on Xamarin and C#. My aim is to provide an analytics platform (which I’ve built in TF), but what would be the best way to deploy it? I’ve done some reading of the docs, but I’d love to hear about your experiences and thoughts.

submitted by /u/PrijNaidu

Can you add YOLO to the top of a pretrained model?

I have an InceptionResNetV2 model that is trained to identify insects. I was wondering if I could change the base identification part of YOLO to use my model. My understanding is that YOLO trains identification based on Darknet, VGG, or other small networks and then moves to a partitioning method for the object detection. Based on my limited knowledge, I’m guessing it should theoretically be possible to replace these small base models, but I am not sure if it is that simple or if my neural network architecture could work. I couldn’t find much information about this online.

submitted by /u/188_888

Machine Learning in Scraping with Rails

submitted by /u/Kagermanov

Categories
Offsites

4D-Net: Learning Multi-Modal Alignment for 3D and Image Inputs in Time

While not immediately obvious, all of us experience the world in four dimensions (4D). For example, when walking or driving down the street we observe a stream of visual inputs, snapshots of the 3D world, which, when taken together in time, create a 4D visual input. Today’s autonomous vehicles and robots are able to capture much of this information through various onboard sensing mechanisms, such as LiDAR and cameras.

LiDAR is a ubiquitous sensor that uses light pulses to reliably measure the 3D coordinates of objects in a scene. However, it is also sparse and has a limited range: the farther one is from a sensor, the fewer points will be returned. This means that far-away objects might only get a handful of points, or none at all, and might not be seen by LiDAR alone. At the same time, images from the onboard camera, which is a dense input, are incredibly useful for semantic understanding, such as detecting and segmenting objects. With high resolution, cameras can be very effective at detecting objects far away, but are less accurate in measuring the distance.

Autonomous vehicles collect data from both LiDAR and onboard camera sensors. Each sensor measurement is recorded at regular time intervals, providing an accurate representation of the 4D world. However, very few research algorithms use both of these in combination, especially when taken “in time”, i.e., as a temporally ordered sequence of data, mostly due to two major challenges. When using both sensing modalities simultaneously, 1) it is difficult to maintain computational efficiency, and 2) pairing the information from one sensor to another adds further complexity since there is not always a direct correspondence between LiDAR points and onboard camera RGB image inputs.

In “4D-Net for Learned Multi-Modal Alignment”, published at ICCV 2021, we present a neural network that can process 4D data, which we call 4D-Net. This is the first attempt to effectively combine both types of sensors, 3D LiDAR point clouds and onboard camera RGB images, when both are in time. We also introduce a dynamic connection learning method, which incorporates 4D information from a scene by performing connection learning across both feature representations. Finally, we demonstrate that 4D-Net is better able to use motion cues and dense image information to detect distant objects while maintaining computational efficiency.

4D-Net
In our scenario, we use 4D inputs (3D point clouds and onboard camera image data in time) to solve a very popular visual understanding task, the 3D box detection of objects. We study the question of how one can combine the two sensing modalities, which come from different domains and have features that do not necessarily match — i.e., sparse LiDAR inputs span the 3D space and dense camera images only produce 2D projections of a scene. The exact correspondence between their respective features is unknown, so we seek to learn the connections between these two sensor inputs and their feature representations. We consider neural network representations where each of the feature layers can be combined with other potential layers from other sensor inputs, as shown below.

4D-Net effectively combines 3D LiDAR point clouds in time with RGB images, also streamed in time as video, learning the connections between different sensors and their feature representations.

Dynamic Connection Learning Across Sensing Modalities
We use a light-weight neural architecture search to learn the connections between both types of sensor inputs and their feature representations, to obtain the most accurate 3D box detection. In the autonomous driving domain it is especially important to reliably detect objects at highly variable distances, with modern LiDAR sensors reaching several hundreds of meters in range. This implies that more distant objects will appear smaller in the images and the most valuable features for detecting them will be in earlier layers of the network, which better capture fine-scale features, as opposed to close-by objects represented by later layers. Based on this observation, we modify the connections to be dynamic and select among features from all layers using self-attention mechanisms. We apply a learnable linear layer, which is able to apply attention-weighting to all other layer weights and learn the best combination for the task at hand.
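One way to picture this attention-weighted combination is the NumPy sketch below. It is an illustration only, not the 4D-Net implementation: the function name `fuse_layers`, the shapes, and the assumption that features from each layer have already been pooled and projected to a common size are ours. A linear layer maps the concatenated features to one score per layer, a softmax turns the scores into attention weights, and the fused feature is the weighted sum.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_layers(features, weight, bias):
    """Combine per-layer features with input-dependent attention weights.

    features: list of n_layers vectors of size dim (assumed pre-pooled).
    weight:   (n_layers * dim, n_layers) linear layer, learned in practice.
    bias:     (n_layers,) bias of that linear layer.
    """
    stacked = np.stack(features)                   # (n_layers, dim)
    logits = stacked.reshape(-1) @ weight + bias   # one score per layer
    alpha = softmax(logits)                        # attention weights
    fused = alpha @ stacked                        # weighted sum, (dim,)
    return fused, alpha

rng = np.random.default_rng(0)
n_layers, dim = 4, 8
features = [rng.normal(size=dim) for _ in range(n_layers)]
# Random values stand in for parameters that training would learn.
weight = rng.normal(scale=0.1, size=(n_layers * dim, n_layers))
bias = np.zeros(n_layers)
fused, alpha = fuse_layers(features, weight, bias)
print("attention over layers:", alpha.round(3), "fused shape:", fused.shape)
```

Because the weights depend on the input features, the mix of early and late layers can change from scene to scene, which is the sense in which the connections are dynamic.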

Connection learning approach schematic, where connections between features from the 3D point cloud inputs are combined with the features from the RGB camera video inputs. Each connection learns the weighting for the corresponding inputs.

Results
We evaluate our results against state-of-the-art approaches on the Waymo Open Dataset benchmark, for which previous models have only leveraged 3D point clouds in time or a combination of a single point cloud and camera image data. 4D-Net uses both sensor inputs efficiently, processing 32 point clouds in time and 16 RGB frames within 164 milliseconds, and performs well compared to other methods. In comparison, the next best approach is both less efficient and less accurate: its neural net computation takes 300 milliseconds, and it uses fewer sensor inputs than 4D-Net.

Results on a 3D scene. Top: 3D boxes, corresponding to detected vehicles, are shown in different colors; dotted line boxes are for objects that were missed. Bottom: The boxes are shown in the corresponding camera images for visualization purposes.

Detecting Far-Away Objects
Another benefit of 4D-Net is that it takes advantage of both the high resolution provided by RGB, which can accurately detect objects on the image plane, and the accurate depth that the point cloud data provides. As a result, objects at a greater distance that were previously missed by point cloud-only approaches can be detected by 4D-Net. This is possible because the camera data, which can capture distant objects, is fused into the model, and this information is efficiently propagated to the 3D part of the network to produce accurate detections.

Is Data in Time Valuable?
To understand the value of the 4D-Net, we perform a series of ablation studies. We find that substantial improvements in detection accuracy are obtained if at least one of the sensor inputs is streamed in time. Considering both sensor inputs in time provides the largest improvements in performance.

4D-Net performance for 3D object detection measured in average precision (AP) when using point clouds (PC), Point Clouds in Time (PC + T), RGB image inputs (RGB) and RGB images in Time (RGB + T). Combining both sensor inputs in time is best (rightmost columns in blue) compared to the left-most columns (green) which use a PC without RGB inputs. All joint methods use our 4D-Net multi-modal learning.

Multi-stream 4D-Net
Since the 4D-Net dynamic connection learning mechanism is general, we are not limited to only combining a point cloud stream with an RGB video stream. In fact, we find that it is very cost-effective to provide a high-resolution single-image stream and a low-resolution video stream in conjunction with 3D point cloud stream inputs. Below, we demonstrate examples of a four-stream architecture, which performs better than the two-stream one with point clouds in time and images in time.

Dynamic connection learning selects specific feature inputs to connect together. With multiple input streams, 4D-Net has to learn connections between multiple target feature representations, which is straightforward as the algorithm does not change and simply selects specific features from the union of inputs. This is an incredibly light-weight process that uses a differentiable architecture search, which can discover new wiring within the model architecture itself and thus effectively find new 4D-Net models.

Example multi-stream 4D-Net, which consists of a stream of 3D point clouds in time (PC+T) and multiple image streams: a high-resolution single-image stream, a medium-resolution single-image stream, and a video stream of even lower-resolution images.

Summary
While deep learning has made tremendous advances in real-life applications, the research community is just beginning to explore learning from multiple sensing modalities. We present 4D-Net which learns how to combine 3D point clouds in time and RGB camera images in time, for the popular application of 3D object detection in autonomous driving. We demonstrate that 4D-Net is an effective approach for detecting objects, especially at distant ranges. We hope this work will provide researchers with a valuable resource for future 4D data research.

Acknowledgements
This work was done by AJ Piergiovanni, Vincent Casser, Michael Ryoo and Anelia Angelova. We thank our collaborators, Vincent Vanhoucke, Dragomir Anguelov and our colleagues at Waymo and Robotics at Google for their support and discussions. We also thank Tom Small for the graphics animation.

Painting classification ESP32

Hi there! Hope you’re doing good.

First of all, I’m an Embedded Systems engineer trying to use TensorFlow Lite Micro to classify paintings, or pictures in general. I started out with Edge Impulse because, well, it’s an easy way to get started. I managed to get the public car detection project to work on an ESP32-cam in not too much time.

But then, classifying paintings… I don’t know why, but it appears there is one dominant class which gets classified more easily than the others. That’s the case with three portraits and one unknown class. The confusion matrix, the test dataset, and a test with my mobile phone (with which I made the dataset) all look and work well. When I quantize the model and deploy it to an ESP32, everything appears to work, but one class specifically overshadows another one.

In short: training a MobileNetV1 96×96 RGB network results in a great confusion matrix identifying three portraits. When deployed to an ESP32, one specific class appears not to work. The other three (unknown and two portraits) seem to work correctly.

What could be wrong here? Oh and btw, if anyone knows good resources for an embedded systems engineer to get to know ML better, that’s more than welcome.

submitted by /u/JVKran

Keras – How to save the VAE from the official example?

There is an official example of a Variational AutoEncoder running on MNIST:

https://github.com/keras-team/keras-io/blob/master/examples/generative/vae.py

I downloaded that code to test it on my machine, and I want to save it. So I simply added at the end of the file:

vae.save('model_keras_example') 

But that does not work it seems:

WARNING:tensorflow:Skipping full serialization of Keras layer <__main__.VAE object at 0x2abb26a278b0>, because it is not built.
Traceback (most recent call last):
  File "/home/drozd/GAN/keras_example_vae.py", line 199, in <module>
    vae.save('model_keras_example')
  File "/opt/ebsofts/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/keras/engine/training.py", line 2145, in save
    save.save_model(self, filepath, overwrite, include_optimizer, save_format,
  File "/opt/ebsofts/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/keras/saving/save.py", line 149, in save_model
    saved_model_save.save(model, filepath, overwrite, include_optimizer,
  File "/opt/ebsofts/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/keras/saving/saved_model/save.py", line 75, in save
    saving_utils.raise_model_input_error(model)
  File "/opt/ebsofts/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/keras/saving/saving_utils.py", line 84, in raise_model_input_error
    raise ValueError(
ValueError: Model <__main__.VAE object at 0x2abb26a278b0> cannot be saved because the input shapes have not been set. Usually, input shapes are automatically determined from calling `.fit()` or `.predict()`. To manually set the shapes, call `model.build(input_shape)`.

I guess I’m not familiar enough with custom models defined as a class. What seems to be the problem here?

I found this: https://stackoverflow.com/questions/69311861/tf2-6-valueerror-model-cannot-be-saved-because-the-input-shapes-have-not-been
which suggests adding a call to compute_output_shape. When I do, it tells me that my custom model needs a call() method, but I have no idea how to implement that with a VAE.

Any help would be much appreciated!

Edit: Seems like I can save the encoder and decoder separately:

vae.decoder.save('model_keras_example_decoder')
vae.encoder.save('model_keras_example_encoder')

Then I suppose I can build it back afterwards by reusing the same class…

submitted by /u/Milleuros