I am trying to use TensorFlow Lite to run inference on a video, frame by frame. This is my code so far:
    #include <iostream>
    #include "src/VideoProcessing.h"
    #include <cstdio>
    #include <opencv2/opencv.hpp>
    #include <opencv2/videoio.hpp>
    #include "tensorflow/lite/interpreter.h"
    #include "tensorflow/lite/kernels/register.h"
    #include "tensorflow/lite/model_builder.h"
    #include "tensorflow/lite/interpreter_builder.h"

    int main() {
        int fps = VideoProcessing::getFPS("trainer.mp4");
        unsigned long size = VideoProcessing::getSize("trainer.mp4");

        cv::VideoCapture cap("trainer.mp4");

        // Check if input video exists
        if (!cap.isOpened()) {
            std::cout << "Error opening video stream or file" << std::endl;
            return -1;
        }

        // Create a window to show input video
        cv::namedWindow("input video", cv::WINDOW_NORMAL);

        // Keep playing video until video is completed
        while (true) {
            cv::Mat frame;

            // Capture frame by frame
            cap >> frame;

            // If frame is empty then break the loop
            if (frame.empty()) { break; }

            // Show the current frame
            imshow("input video", frame);
        }

        // Close window after input video is completed
        cap.release();

        // Destroy all the opened windows
        cv::destroyAllWindows();

        std::cout << "Video file FPS: " << fps << std::endl;
        std::cout << "Video file size: " << size << std::endl;

        // Load the model
        std::unique_ptr<tflite::FlatBufferModel> model =
            tflite::FlatBufferModel::BuildFromFile("pose_landmark_full.tflite");

        // Build the interpreter
        tflite::ops::builtin::BuiltinOpResolver resolver;
        std::unique_ptr<tflite::Interpreter> interpreter;
        tflite::InterpreterBuilder(*model, resolver)(&interpreter);
        if (interpreter == nullptr) {
            fprintf(stderr, "Failed to initiate the interpreter\n");
            exit(-1);
        }

        return 0;
    }
I use this command to run my project:
g++ -std=c++17 main.cpp src/VideoProcessing.cpp `pkg-config --libs --cflags opencv4` -o result
My TensorFlow Lite headers are in `/usr/local/include/tensorflow/lite/`. This is my output:
    In file included from /usr/local/include/tensorflow/lite/model.h:21,
                     from /usr/local/include/tensorflow/lite/kernels/register.h:18,
                     from main.cpp:7:
    /usr/local/include/tensorflow/lite/interpreter_builder.h:26:10: fatal error: flatbuffers/flatbuffers.h: No such file or directory
       26 | #include "flatbuffers/flatbuffers.h"  // from @flatbuffers
          |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
Why would a trained model that performs with around 95% accuracy become less accurate after I duplicate the original training data, vertically flip the copies, add them to the original data, and retrain?
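To make the setup in the question concrete, here is a minimal sketch of the augmentation as I understand it, using plain nested lists instead of real images. The tiny arrow image and the helper names are hypothetical. It only illustrates one possible factor (a flipped copy keeps its original label even when orientation matters) and is not a diagnosis of the actual model.

```python
def flip_vertical(image):
    """Return a copy of a row-major image with the row order reversed."""
    return [row[:] for row in reversed(image)]


def augment_with_flips(images, labels):
    """Append a vertically flipped copy of every image, keeping its label."""
    flipped = [flip_vertical(img) for img in images]
    return images + flipped, labels + labels


# Hypothetical 3x3 "image": a shape whose tip points up.
arrow_up = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]

images, labels = augment_with_flips([arrow_up], ["up"])

# The flipped copy now points down but is still labelled "up".
for image, label in zip(images, labels):
    print(label, image)
```

If the classes depend on orientation, or if upside-down inputs never occur at test time, the flipped half of the data may teach the model something that does not match the evaluation set.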
Hello. I am trying to learn TensorFlow and deep learning this month because we have a thesis proposal due next semester, and I am interested in proposing a thesis that uses machine learning. Specifically, my envisioned proposal is about the early detection of pests in plants, so that farmers have a way to predict the condition of their crops. I think image recognition using machine learning can help with this, since I have read that image recognition can do a lot of things. However, my knowledge of machine learning and AI is very limited, especially details such as the math involved and how to create my own model from scratch for my topic. I have heard from colleagues, who like me proposed a machine learning thesis and started with zero knowledge of the field, that there are many open-source models available online that could help me implement my proposal. They themselves used open-source models (implementing the R-CNN and YOLO algorithms) for their proposal, which I think is about detecting microplastics in bodies of water using image recognition. They built and populated their own dataset, then used it to train the model they got from the internet, with some tweaking.
So my concern is this: given my prospective proposal, should I learn TensorFlow in depth and create a model from scratch (which is good because I will learn the fundamentals of ML, but will take a lot of effort), or should I study the open-source models available online and tweak one to fit my intended function?
I am currently taking Udacity's Intro to TensorFlow for Deep Learning, the overview course recommended on the TensorFlow website, and I am somewhat unsatisfied with it. The course bombards me with code that I don’t really understand, since I am new to TensorFlow (though I do know Python). It gives a crash course on deep learning concepts such as CNNs, but it does not explain the code in depth. I feel overwhelmed because all I do is follow the tutorial: I press the play button in Google Colab, wait for it to run, and check that it works as shown in the video. I know I am expecting a lot from an overview of ML and TF, but I am really not satisfied with what I am getting. I want to know more about TensorFlow, especially the library itself and the code involved, and I think learning the code alongside the concepts will help me grow in this field.
So, are there any resources you would recommend for this? Also, what are better ways to learn ML and TF? I love getting advice from people who know this field better than I do. I am looking forward to a good conversation with you all and to learning a lot about this field.
I’m learning TensorFlow, so sorry if my question is too basic.
I’ve just bought an RTX 3070, upgrading from my old GTX 970, both for gaming and for use with TensorFlow.
If I use both GPUs, my motherboard will “split” the speed of the PCIe 3.0 slots from 16x to 8x. So my question is: is it worth keeping the GTX 970 alongside the RTX 3070 and sending part of the tensor calculation to the 970 using a distribution Strategy? Can the 970 give me more speed for training NNs, or will the gains be negligible? Do you think I should keep just the RTX 3070 on the motherboard?
My PSU is powerful enough for both (1000 watts), so the question is only about the usefulness of the GTX 970 as a second GPU for NN training in TensorFlow, since it has some CUDA cores that (I suppose) could be used alongside the 3070.
What do you guys think? Have any of you tried something like that?
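One way to reason about this is a back-of-the-envelope model of a synchronous training step, where every device must finish its share of the batch before the next step starts. The sketch below is only arithmetic: the throughput numbers are hypothetical placeholders, not measurements of a 3070 or a 970, and it ignores PCIe transfer and gradient synchronization costs.

```python
def sync_step_time(batch_size, throughputs, shares):
    """Seconds for one synchronous step: the slowest device sets the pace."""
    return max(batch_size * share / rate
               for share, rate in zip(shares, throughputs))


# Hypothetical throughputs in samples per second (assumed, not measured).
fast_gpu, slow_gpu = 1000.0, 250.0
batch = 1000

single = sync_step_time(batch, [fast_gpu], [1.0])
equal_split = sync_step_time(batch, [fast_gpu, slow_gpu], [0.5, 0.5])
proportional = sync_step_time(batch, [fast_gpu, slow_gpu], [0.8, 0.2])

print(f"fast GPU alone:      {single:.2f} s")
print(f"equal split:         {equal_split:.2f} s")
print(f"proportional split:  {proportional:.2f} s")
```

Under these assumed numbers, an even split makes the step slower than the fast GPU alone, because the fast card waits for the slow one. A split proportional to throughput would help only modestly, and a strategy that divides each batch evenly across replicas would not give you that split by default.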
Is there such a thing as a default “null” class, where a bunch of random real-world images are used to represent “no choice”? If we have two label classes, class-1 and class-2, our model will always choose one or the other, but we need a third class to say that an input is neither class-1 nor class-2. So should there be a default class, cross-referenced so that any data in it that correlates with either of our classes is removed?
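The point that a two-class model always picks one of its classes follows from how softmax works: the probabilities sum to 1, so some class always comes out on top. Below is a minimal sketch of two common workarounds, a confidence threshold and an explicit third output. The function names, the threshold value, and the logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_reject(logits, labels, threshold=0.8):
    """Return the top label, or "neither" if the model is not confident."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "neither"
    return labels[best]


two_classes = ["class-1", "class-2"]

# Nearly tied scores: plain argmax would still answer "class-1".
print(classify_with_reject([0.1, 0.0], two_classes))

# A clear winner passes the threshold.
print(classify_with_reject([4.0, 0.0], two_classes))

# Alternative: a model trained with an explicit third "neither" output.
three_classes = ["class-1", "class-2", "neither"]
print(classify_with_reject([0.0, 0.5, 3.0], three_classes, threshold=0.0))
```

The threshold approach needs no extra training data, while the explicit third class needs examples of "neither" that are representative of what the model will actually see.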
What if we transform the image data? We input the raw image, then split it into R, G, and B channels and input the separate color channels, then transform it again to greyscale and input that, so we do multiple rounds of transforms for each single input data point. Then, to get a classification or prediction, we take the predictions from each transform and compare the results. Wouldn’t this allow us to achieve more accurate results?
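The idea described here resembles averaging predictions over several views of the same input. A minimal sketch follows, assuming an image is a list of (r, g, b) pixels in [0, 1]. `toy_model` is a hypothetical stand-in for a real classifier, and nothing here shows that the combined prediction is actually more accurate.

```python
def split_views(image):
    """Build the views described above from a list of (r, g, b) pixels."""
    return {
        "raw": [value for pixel in image for value in pixel],
        "red": [r for r, _, _ in image],
        "green": [g for _, g, _ in image],
        "blue": [b for _, _, b in image],
        # Common luma weights for converting RGB to greyscale.
        "grey": [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in image],
    }


def toy_model(values):
    """Hypothetical stand-in classifier: returns [P(bright), P(dark)]."""
    brightness = sum(values) / len(values)
    return [brightness, 1.0 - brightness]


def ensemble_predict(image, model=toy_model):
    """Average the per-view probability vectors into one prediction."""
    predictions = [model(view) for view in split_views(image).values()]
    count = len(predictions)
    return [sum(p[i] for p in predictions) / count
            for i in range(len(predictions[0]))]


red_image = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
averaged = ensemble_predict(red_image)
print(averaged)
```

Note that the channel and greyscale views are all derived from the raw image, so they add no new information. Whether combining them helps depends on whether the model was trained on those views.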
Clara Holoscan SDK 0.2 offers real-time AI inference capabilities and fast I/O for high-performance streaming applications in medical devices.
Advances in edge computing, video cameras, real-time processing, and AI have helped transform medical devices over the years. NVIDIA developed the NVIDIA Clara Holoscan platform to support the development of software-defined AI medical devices. The platform consists of NVIDIA Clara Developer Kits, the NVIDIA Clara Holoscan SDK, and NVIDIA Clara Holoscan MGX for production-ready deployment.
The latest release of the NVIDIA Clara Holoscan SDK 0.2 offers real-time AI inference capabilities and fast I/O for high-performance streaming applications in medical devices. This includes endoscopy, ultrasound, surgical robots, microscopy, and genomics sequencing instruments.
The release also includes:
A core backend built on the NVIDIA Graph eXecution Framework (GXF) instead of GStreamer.
A sample endoscopy AI application.
A customizable AI pipeline to add your own model.
Support for both the Clara AGX Developer Kit (with the Jetson AGX Xavier and NVIDIA RTX 6000) and the Clara Holoscan Development Kit (with the Jetson AGX Orin and NVIDIA RTX A6000).
Support for the NVIDIA JetPack 5.0 SDK, which includes Ubuntu 20.04.
Graph eXecution Framework processes streaming data
The most significant change in the Clara Holoscan SDK 0.2 is the shift of the core backend from GStreamer to the NVIDIA GXF. GXF is a framework supporting component-based programming for streaming data processing pipelines. It is built for very efficient data ingestion, data transfer, and AI/ML workloads.
With GXF, developers can create reusable components and combine them in graphs to quickly build applications for different products. GXF supports the processing of video and audio streams as well as user-defined streaming data types used in medical devices such as raw ultrasound, radiology imaging scanners, and microscopes.
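The idea of reusable components combined into graphs can be illustrated with a toy sketch. This is not the GXF API: the `Component` class, `build_graph` helper, and stage names are hypothetical, and the example only shows how the same building blocks might be rewired into two different applications.

```python
class Component:
    """A named processing step that forwards its output downstream."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.downstream = []

    def connect(self, other):
        """Add an edge to another component; returns it so calls can chain."""
        self.downstream.append(other)
        return other

    def push(self, message, sink):
        """Process one message and pass the result along every outgoing edge."""
        result = self.fn(message)
        if not self.downstream:
            sink.append((self.name, result))
        for component in self.downstream:
            component.push(result, sink)


def build_graph(final_name, final_fn):
    """Build source -> normalize -> final stage from reusable step functions."""
    source = Component("source", lambda frame: list(frame))
    normalize = Component("normalize", lambda frame: [v / 255 for v in frame])
    source.connect(normalize).connect(Component(final_name, final_fn))
    return source


outputs = []
# Two "products" that share the same source and normalize steps.
build_graph("mean", lambda f: sum(f) / len(f)).push([0, 255], outputs)
build_graph("peak", max).push([0, 255], outputs)
print(outputs)
```

Only the final stage differs between the two graphs, which is the kind of reuse the component model is meant to enable.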
A recent test using the NVIDIA Latency Display Analysis Tool on a 1080p video stream showed that GXF offers a significant speedup compared to previous solutions. In the test, GXF reduced the overhead in an AI Inferencing application by nearly 3x compared to a similar GStreamer-based pipeline in the Clara Holoscan SDK 0.1.
Additionally, GXF supports user-customizable components for building generic data processing pipelines. GXF handles the critical parts of building a high-performance application through two important components.
First is a scheduler that determines when components execute. The scheduler supports single or multithreaded execution, with conditional execution, asynchronous scheduling, and other custom tools.
Second, GXF has a memory allocator that provides a system with an upfront allocation of a large contiguous memory pool and reuses regions as needed. To ensure zero-copy data exchange between components, memory can be pinned to the device.
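The allocator idea, one large upfront allocation whose regions are handed out and reused, can be sketched in a few lines. This is a toy illustration and not GXF code: `BlockPool` and its methods are hypothetical, and it uses host memory via `bytearray` rather than device memory.

```python
class BlockPool:
    """Fixed-size blocks carved out of one contiguous upfront allocation."""

    def __init__(self, block_size, num_blocks):
        self._buffer = bytearray(block_size * num_blocks)  # allocated once
        self._view = memoryview(self._buffer)
        self._block_size = block_size
        self._free = list(range(num_blocks))

    def acquire(self):
        """Hand out a free block as a zero-copy view into the pool."""
        if not self._free:
            raise MemoryError("pool exhausted")
        index = self._free.pop()
        start = index * self._block_size
        return index, self._view[start:start + self._block_size]

    def release(self, index):
        """Return a block to the pool so a later acquire can reuse it."""
        self._free.append(index)


pool = BlockPool(block_size=4, num_blocks=2)
first_index, first_block = pool.acquire()
first_block[:] = b"abcd"      # writes straight into the shared buffer
pool.release(first_index)

second_index, second_block = pool.acquire()
print(second_index == first_index, bytes(second_block))
```

Because each block is a `memoryview` slice, passing it between stages copies no data, which is the property the zero-copy exchange relies on.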
Endoscopy AI sample application on Clara Holoscan
Digital endoscopy has evolved as a key technology for medical screenings and minimally invasive surgeries. The use of real-time AI platforms to process and analyze the video signal produced by the endoscopic camera has been growing.
This technology is helping with anomaly detection and measurements, image enhancements, alerts, and analytics. The Clara Holoscan SDK 0.2 includes a sample AI-enabled endoscopy application showcasing the end-to-end functionality of GXF and support for AJA capture devices with an HDMI input.
The endoscopy AI sample application has a deep learning model to perform object detection and tool tracking in real time on an endoscopy video stream.
The application uses several NVIDIA features to minimize the overall latency, including:
GPUDirect RDMA video data transfer to eliminate the overhead of copying to or from system memory.
NVIDIA Performance Primitives (NPP) library for CUDA-accelerated 2D image transformations before AI inference.
TensorRT runtime for optimized, faster AI inference.
CUDA and OpenGL interoperability, which provides efficient resource sharing on the GPU for visualization.
To learn more about the endoscopy AI sample application, its hardware and software reference architecture on Clara Holoscan, as well as the path to production, download the Clara Holoscan Endoscopy Whitepaper.
Bring your own model AI application
Developers can bring their own AI model into the Clara Holoscan reference pipeline to create their own streaming workflow quickly. Swapping one model for another is accomplished by updating one configuration file and exporting data to the GXF native data format. Models saved in the portable ONNX format, as well as the NVIDIA performance-optimized TRT format, can be run on GXF’s built-in inference engines.
Support for the Clara Developer Kit
The Clara Holoscan SDK 0.2 is supported on the Clara AGX and the new Clara Holoscan Developer Kit. The next generation Clara Holoscan Development Kit is built with a high-performance NVIDIA Orin module, a powerful RTX A6000 GPU, and the connectivity performance of the ConnectX SmartNIC.
This kit is the ideal solution for developing the next generation of software-defined medical devices. Orin is geared for autonomous machines with high-speed interface support for multiple sensors and 8X the performance of the last generation for multiple concurrent AI inference pipelines.
Updated JetPack 5.0HP1 with Ubuntu 20.04
The NVIDIA JetPack SDK contains the base OS for the Clara Holoscan SDK. For version 0.2, the JetPack SDK is being upgraded from version 4.5 to version 5.0HP1. This upgrades the OS to L4T rel-34, to be on par with Ubuntu 20.04 with LTS Kernel 5.10.
Get started with the Clara Holoscan SDK
The Clara Holoscan SDK 0.2 and source code are now accessible on GitHub with an Apache 2.0 license.
This release of Isaac Sim adds more tools for AI-based robotics including Isaac Gym support for RL, Isaac Cortex for cobot programming, and much more.
Today, NVIDIA announced the availability of the 2022.1 release of NVIDIA Isaac Sim. As a robotics simulation and synthetic data generation (SDG) tool, this NVIDIA Omniverse application accelerates the development, testing, and training of AI in robotics.
With Isaac Sim, developers can generate production quality datasets to train AI-perception models. Developers will also be able to simulate robotic navigation and manipulation, as well as build a test environment to validate robotics applications continually.
The latest version advances the age of AI robots with new tools like NVIDIA Isaac Cortex, a decision framework for training collaborative robots (cobots), and Isaac Gym, a GPU-accelerated reinforcement learning (RL) framework. NVIDIA Isaac Replicator, a set of synthetic data generation tools, APIs, and workflows, has also been updated with new capabilities to generate industrial environments for SDG procedurally.
NVIDIA Isaac Sim 2022.1 release highlights
Isaac Cortex: Program cobot tasks as easily as programming game AI. Leverage this decision framework for cobots to develop task-aware and adaptive skills. Cortex maintains a belief representation of the world, analogous to the robot’s brain. It takes real or simulated data as input and generates the resulting actuations.
Isaac Gym: Train robots in minutes instead of weeks. Train complex robotic skills using RL. Isaac Gym is a GPU-accelerated tool that keeps the entire RL training workflow on the GPU, which is critical to reducing training time.
Omnigraph: Simplify application development and debugging with visual programming. Build robotic applications by visually connecting compute nodes together in this Omniverse visual programming and scripting environment. Robotic applications tend to be very modular and lend themselves well to visual programming.
Isaac Sim/Gazebo Connector: Move between both simulators depending on the task. ROS developers using Gazebo can import simulation assets into Isaac Sim for tasks like generating synthetic datasets or high-fidelity rendering. Additionally, multiple Gazebo simulations can stay synchronized live by connecting to Omniverse’s Nucleus server.
Additional Features:
Windows Support (limited)
New Robots
Quadrupeds: A1, GO1, Anymal
AMR: Obelix
New modular warehouse and conveyor assets
New ROS pipelines implemented in Omnigraph
Training AI with synthetic data
Isaac Replicator is the synthetic data generation tool in Isaac Sim. Synthetic data is very useful in robotics to bootstrap training, address long-tail dataset challenges, and provide unavailable real-world data like speed and direction from synthetic video. Autonomous machines require synthetic data in training to ensure model robustness.
In the latest release, a new SDG feature called SceneBlox was added to generate scenes procedurally. SceneBlox can be used to create industrial environments like warehouses automatically. New examples were also added that demonstrate how to generate synthetic data and train a pose estimation model using Replicator.
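To give a feel for what procedural scene generation with free ground-truth labels means, here is a toy sketch. It does not use SceneBlox or any Replicator API: the `generate_warehouse` function, the grid layout, and the parameters are hypothetical, and a real scene would be 3D assets rather than a grid of strings.

```python
import random


def generate_warehouse(rows, cols, shelf_prob, seed):
    """Randomly place shelves on a grid and return the layout with its labels."""
    rng = random.Random(seed)  # seeded so a scene can be regenerated exactly
    grid = [
        ["shelf" if rng.random() < shelf_prob else "floor" for _ in range(cols)]
        for _ in range(rows)
    ]
    # Ground-truth annotations come from the generator itself, not a human.
    labels = [(r, c) for r in range(rows) for c in range(cols)
              if grid[r][c] == "shelf"]
    return grid, labels


grid, labels = generate_warehouse(rows=4, cols=5, shelf_prob=0.3, seed=7)
for row in grid:
    print(" ".join(cell[0] for cell in row))  # "s" for shelf, "f" for floor
print(len(labels), "shelves labelled")
```

Changing the seed yields a new layout with exact labels, which is the property that makes synthetic data useful for bootstrapping training.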