Learn how developers used pretrained machine learning models and the Jetson Nano 2GB to create Mariola, a robot that can mimic human actions, from arm and head movements to making faces.
They say “imitation is the sincerest form of flattery.” Well, in the case of a robotics project by Polish-based developer Tomasz Tomanek, imitation—or mimicry—is the goal of his robot named Mariola.
In this latest Jetson Project of the Month, Tomanek has developed a funky little robot using pretrained machine learning models to make human-robot interactions come to life. The main controller for this robot is the Jetson Nano 2GB.
The use of PoseNet models makes it possible for Mariola to recognize a person’s posture and movements, and then use that information to mimic or replicate those human actions. As Tomanek notes, “the use of the Jetson Nano makes it quite simple and straightforward to achieve this goal.”
An overview of Mariola is available in this YouTube video from the developer:
Video 1. Mariola with Jetson Nano
As you can see, Mariola is able to drive on wheels, move its arms, turn its head, and make faces. Arduino controllers embedded in each section of the robot’s body enable those actions: dedicated controllers drive the servo motors that move the arms and head, and four mecanum wheels let the robot move omnidirectionally.
Mariola’s facial expressions come from a separate NeoPixel LED subsystem, with a set of two LEDs for each eye and a set of eight for the mouth. Daisy-chained together, they are driven by a separate Arduino Nano board that manages color changes and the appearance of blinking eyes.
According to Tomanek, one key idea of the Mariola build was to make each subsystem a separate unit and let them communicate over an internal bus. A UART/Bluetooth receiver Arduino Nano takes each command from the user, decodes which subcontroller it is meant for, and sends it over the CAN bus.
Each subcontroller reads its commands from the CAN bus and performs the corresponding action for the wheels, the servos (arm and head moves), or the face (NeoPixels).
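The subcontroller firmware itself runs on Arduinos and is not reproduced here. Purely as a conceptual sketch of the routing idea, written in Python with the python-can library, a decoded command might be framed and addressed to one subsystem as follows; the arbitration IDs and payload layout are invented for illustration.

# Conceptual sketch only: routing a decoded user command to one subsystem
# over a CAN bus. Mariola does this in Arduino firmware; the IDs and the
# payload layout below are invented for illustration.
import can

SUBSYSTEM_IDS = {"wheels": 0x10, "servos": 0x20, "face": 0x30}  # assumed IDs

def send_command(bus, subsystem, payload):
    """Frame a command for one subcontroller and put it on the bus."""
    msg = can.Message(arbitration_id=SUBSYSTEM_IDS[subsystem],
                      data=payload, is_extended_id=False)
    bus.send(msg)

with can.interface.Bus(channel="can0", bustype="socketcan") as bus:
    send_command(bus, "face", [0x01, 0xFF, 0x00, 0x00])  # e.g., "set eye color"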
Tomanek notes in the NVIDIA Developer Forum that the Jetson Nano at the back of the robot is the brain: it runs a customized Python script with the resnet18-body pose-estimation model, which returns the planar coordinates of a person’s joints when it detects them. Those coordinates are recalculated through an inverse kinematics (IK) model to get the servo positions, and the results are sent to the master Arduino over UART. The Arduinos handle the rest of the movement.
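Tomanek’s script is not reproduced in the post, but a minimal sketch of that pipeline might look like the following, using the jetson-inference poseNet API and pyserial. The serial port, the message format, and the coords_to_servo_angles helper are illustrative assumptions rather than the project’s actual code.

# Hedged sketch of a pose-to-servo pipeline on the Jetson Nano.
# Assumes jetson-inference/jetson-utils and pyserial are installed; the serial
# port, message format, and IK helper below are illustrative placeholders.
from jetson_inference import poseNet
from jetson_utils import videoSource
import serial

net = poseNet("resnet18-body")                 # pretrained body-pose model
camera = videoSource("csi://0")                # CSI camera on the Nano
uart = serial.Serial("/dev/ttyTHS1", 115200)   # assumed UART link to the master Arduino

def coords_to_servo_angles(pose):
    """Hypothetical inverse-kinematics step: map detected 2D joint
    coordinates to servo angles for the arms and head."""
    joints = {kp.ID: (kp.x, kp.y) for kp in pose.Keypoints}
    # ... real IK math over the joint coordinates would go here ...
    return [90, 90, 90, 90]                    # placeholder neutral angles

while True:
    img = camera.Capture()
    poses = net.Process(img)
    if len(poses) == 1:                        # act only when exactly one person is seen
        angles = coords_to_servo_angles(poses[0])
        uart.write((",".join(map(str, angles)) + "\n").encode())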
Currently, Mariola will detect and then mimic the movement of one person at a time. If no one is visible to the robot, or if more than one person is detected, no action occurs.
Why did Tomanek choose a Jetson Nano for this project? As he notes, “the potential power of the pretrained model available for the Jetson, along with the affordability [of the Jetson Nano], brought me to use the 2GB version to learn and see how it works.”
“This is a work in progress and learning project for me,” Tomanek notes. While there is no stated goal to Mariola, he sees it as an opportunity to experiment and learn what can be achieved by using this technology. “The best outcome so far is that with those behaviors driven by the machine learning model, there is a certain kind of autonomy for this small robot.”
When people interact with Mariola for the first time, Tomanek says “it always generates smiles. This is a very interesting aspect of human-robot interactions.” It’s easy to see why that would happen. Just watch Mariola in action. We dare you not to smile:
Video 2. Robotic project demo with Arduino and Jetson Nano
The Mariola project continues in active development, and is modified and updated on a regular basis. As Tomanek concludes in his overview video, “We’ll see what the future will bring.”
More details about the project are available in this GitHub repository.
Learn how you can pre- and post-process your NVIDIA Modulus simulations using the Modulus Omniverse extension.
NVIDIA Modulus is a physics-machine learning (physics-ML) platform that blends the power of physics with data to build high-fidelity, parameterized AI surrogate models that serve as digital twins and simulate with near real-time latency.
This cutting-edge framework is expanding its interactive simulation capabilities by integrating with the NVIDIA Omniverse (OV) platform for real-time virtual-world simulation and full-design fidelity visualization.
Previously, you would need to set up the visualization pipeline, a key component of simulation and analysis workflows, on your own. Now, you can use the built-in pipeline in Omniverse for common output scenarios such as visualizing streamlines and iso-surfaces for the outputs of the Modulus-trained AI model. Another key feature is being able to visualize and analyze the high-fidelity simulation output in near real time as you vary design parameters.
The three key advantages of adding the Modulus-OV extension are:
The built-in visualization pipeline supports a small number of commonly used modalities, such as streamlines, scalar field slices, and flow visualization.
Near real-time simulation output lets you change design parameters and see the results on screen almost immediately.
The rich Omniverse ecosystem lets you integrate with other extensions, such as CAD and visualization tools, for end-to-end design and simulation workflows.
The Modulus extension is available with Omniverse Create. After you install Omniverse Create on a supported OS using Omniverse Launcher, open the extension window and search for ‘Modulus’. This brings up the core extension, which you can then install and enable.
Figure 1. Enabling the Modulus extension in Omniverse Create
For this preview release, the Modulus extension is supported only on Linux, and the GPU memory requirements for running both Omniverse Create and Modulus can be quite high. For the existing scenarios, we have observed a minimum GPU requirement of an NVIDIA RTX 3090 or higher.
Visualizing an interactive simulation
Simulation scenarios are prepackaged examples that help users get familiar with the capabilities of the extension.
For now, one preconfigured scenario is available to experiment with: modulus_scenario_fpga
Load this scenario extension by searching for its name in the extension manager (in the following, we use modulus_scenario_fpga). Install and enable the extension. The first time you do this, the process can take a few minutes while the pretrained model is downloaded and installed on your machine.
This scenario is based on the parameterized 3D heat sink example in Modulus; with the OV extension enabled, you can visualize the airflow through the field-programmable gate array (FPGA) heat sink geometry.
In this scenario, the Modulus-trained parameterized neural network model simulates airflow paths. The inference output is the velocity magnitude, which is the airspeed at a given point defined on a volumetric surface. By placing an isosurface at a fairly low speed, you can see where the airflow slows down, at the boundary and as it hits the cooling fins, as shown in Figure 2.
You can also analyze the airflow using streamlines, which are computed by advecting particles through the flow. You can also adjust the streamline texture for a better understanding of the airflow.
Figure 2. Visualizing and interactively modifying the simulation scenario
A set of common visualization modes is available with this release of the extension. Each mode will populate the stage that is currently open in Omniverse Create with visualization geometry, which will be updated as you change parameters.
Isosurface: Create an isosurface of the velocity magnitude.
Streamlines: Create a set of streamlines.
Slices: Add three axis-aligned slices of the velocity magnitude.
You can also vary the visualization parameters using knobs in the extension user interface. The model is not reevaluated when visualization parameters are modified. To learn which parameters can be adjusted, refer to the OV integration documentation.
Figure 3. Changing visualization parameters and seeing the results interactively in the extension user interface
Another game-changing aspect of Modulus and physics-ML is the ability to train a model on a parameterized space, which can then be used to infer across a design space defined by a set of design parameters. These are exposed within the scenario as parameter knobs that can be changed to infer and visualize the new simulation output in near real time. When you change these design parameters, the model is reevaluated for the new geometry and the output is visualized.
Figure 4. Changing design parameters, such as the height and length of the heat sink fins
Learn more
To learn more about the extension and this example, please refer to the Discord Live session where we talk more about Modulus, its capabilities, and the Modulus OV extension.
NVIDIA is partnering with IronYun to leverage the capabilities of edge AI to help make the world a smarter, safer, more efficient place.
Nearly every organization is enticed by the ability to use cameras to understand their businesses better. Approximately 1 billion video cameras—the ultimate Internet of Things (IoT) sensors—are being used to help people around the world live better and safer.
But, there is a clear barrier to success. Putting the valuable data collected by these cameras to use requires significant human effort. The time-consuming process of manually reviewing massive amounts of footage is arduous and costly. Moreover, after the review is complete, much of the footage is either thrown out or stowed away in the cloud and never used again.
This leads to vast amounts of valuable data never being used to its full potential.
Luckily, thanks to advancements in AI and edge computing, organizations can now layer AI analytics directly onto their existing camera infrastructures to expand the value of video footage captured. By adding intelligence to the equation, organizations can transform physical environments into safer, smarter spaces with AI-powered video analytics systems.
Edge AI in action
This change is already helping companies in many industries improve customer experiences, enhance safety, drive accountability, and deliver operational efficiency. The possibilities for expanding these smart spaces and reaping even greater benefits are vast.
In the retail space, an AI-powered smart store can elevate the consumer shopping experience by using heat maps to improve customer traffic flow, accurately forecast product demand, and optimize supply chain logistics. Ultimately, these smart stores could completely transform retail as we know it and use cameras to create “just walk out” shopping experiences, with no cash registers required.
At electrical substations, intelligent video analytics is streamlining asset inspection and ensuring site safety and security. AI-powered analysis of real-time video streaming provides continuous monitoring of substation perimeters. This can be used to prevent unauthorized access, ensure technicians and engineers follow health and safety protocols, and detect dangerous environmental conditions like smoke and fire.
Creating a smart space
At the forefront of this smart space revolution is the AI vision company and NVIDIA Metropolis partner IronYun. The IronYun AI platform, Vaidio, is helping retailers, banks, NFL stadiums, factories, and more fuel their existing cameras with the power of AI.
NVIDIA and IronYun are working to leverage the capabilities of edge AI and help make the world a smarter, safer, more efficient place.
A smart space is more than simply a physical location equipped with cameras. To be truly smart, these spaces must turn the data they collect into critical insights that create superior experiences.
According to IronYun, most organizations today use cameras to improve safety in their operations. The IronYun Vaidio platform extends beyond basic security applications and supports dozens of advanced AI-powered video analytics capabilities specific to each customer. From video search to heat map creation and PPE detection, IronYun is helping organizations across all industries take their business to the next level with AI through a single platform.
How does this look in the real world? An NFL stadium that hosts 65,000 fans at every game uses Vaidio in interesting ways. The customer first approached IronYun in hopes of improving safety and security operations at the stadium. Once they saw Vaidio analytics in action, they realized they could leverage the same advanced platform to monitor and alert security of smoke, fire, falls, and fights, as well as detect crowd patterns.
IronYun CEO, Paul Sun says, “The tedious task of combing through hours of video footage can take days or weeks to complete. Using Vaidio’s AI video analytics, that same forensic video search can be done in seconds.”
Powering smart spaces across the world
Edge AI is the technology that makes smart spaces possible, enabling organizations to mobilize the data being produced at the edge.
The edge is simply a location, named for the way AI computation is done near or at the edge of a network rather than centrally in a cloud computing facility or private data center. Without the low latency and speed provided by the edge, many security and data gathering applications would not be effective or possible.
Sun says, “When you are talking about use cases to ensure safety like weapons detection or smoke and fire detection, instantaneous processing at the edge can accelerate alert and response times, especially relative to camera-based alternatives.”
Building the future
With the powerful capabilities of NVIDIA Metropolis, NVIDIA Fleet Command, and NVIDIA-Certified Systems, IronYun applies AI analytics to help make the world safer and smarter.
The NVIDIA Metropolis platform offers IronYun the development tools and services to reduce the time and cost of developing their vision AI deployments. This is a key factor in their ability to bring multiple new and accurate AI-powered video analytics to the Vaidio platform every year.
Figure 1. With NVIDIA Fleet Command, IT admins can remotely manage edge systems across distributed edge sites
Fleet Command eliminates the need for IT teams to be on call 24/7 when a system experiences a bug or issue. Instead, they can troubleshoot and manage emergencies from the comfort of their office.
The Fleet Command dashboard sits in the cloud and provides administrators a control plane to deploy applications, alerts and analytics. It also provides provisioning and monitoring capabilities, user management control, and other features needed for day-to-day management of the lifecycle of an AI application.
The dashboard also has a private registry where organizations can securely store their own custom application or a partner application, such as IronYun’s Vaidio platform for deployment at any location.
“With NVIDIA Fleet Command, we are able to scale our vision applications from one or two cameras in a POC, to thousands of cameras in a production deployment. By simplifying the management of edge environments, and improving video analytics accuracy at scale, our customer environments indeed become safer and smarter,” says Sun.
Putting art, mathematics and computers together in the mid-1980s created a new genre of digital media: fractal art. In the NVIDIA Studio this week, computer graphics (CG) artist, educator and curator Xueguo Yang shares his insights behind fractal art — which uses algorithms to artistically represent calculations derived from geometric objects as digital images and animations.
Accelerated WEKA integrates the WEKA workbench with GPU-enabled Python and Java libraries to speed up the training and prediction times of machine learning models.
In recent years, there has been a surge in building and adopting machine learning (ML) tools. The use of GPUs to accelerate increasingly compute-intensive models has been a prominent trend.
To increase user access, the Accelerated WEKA project provides an accessible entry point for using GPUs in well-known WEKA algorithms by integrating open-source RAPIDS libraries.
In this post, you will be introduced to Accelerated WEKA and learn how to leverage GPU-accelerated algorithms through a graphical user interface (GUI) using WEKA software. This open-source Java tool is suitable for beginners looking for a variety of ML algorithms from different environments or packages.
What is Accelerated WEKA?
Accelerated WEKA unifies the WEKA software, a well-known and open-source Java software, with new technologies that leverage the GPU to shorten the execution time of ML algorithms. It has two benefits aimed at users without expertise in system configuration and coding: an easy installation and a GUI that guides the configuration and execution of the ML tasks.
Accelerated WEKA is a collection of packages available for WEKA, and it can be extended to support new tools and algorithms.
What is RAPIDS?
RAPIDS is a collection of open-source Python libraries for users to develop and deploy data science workloads on NVIDIA GPUs. Popular libraries include cuDF for GPU-accelerated DataFrame processing and cuML for GPU-accelerated machine learning algorithms. RAPIDS APIs conform as much as possible to the CPU counterparts, such as pandas and scikit-learn.
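As a quick, hedged illustration of that conformity (assuming RAPIDS cuML and an NVIDIA GPU are available), the following sketch trains a random forest on synthetic data using cuML’s scikit-learn-style API; the dataset sizes and hyperparameters are arbitrary.

# Minimal cuML sketch: a GPU-accelerated random forest with a
# scikit-learn-like API. Dataset sizes and hyperparameters are arbitrary.
from cuml.datasets import make_classification
from cuml.model_selection import train_test_split
from cuml.ensemble import RandomForestClassifier

# Generate a synthetic binary classification dataset directly on the GPU
X, y = make_classification(n_samples=100_000, n_features=20,
                           n_informative=10, random_state=0)
y = y.astype("int32")  # the random forest classifier expects integer labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=16, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))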
Accelerated WEKA architecture
The building blocks of Accelerated WEKA are packages like WekaDeeplearning4j and wekaRAPIDS (inspired by wekaPython). WekaDeeplearning4j (WDL4J) already supports GPU processing but has very specific needs in terms of libraries and environment configuration. WDL4J provides WEKA wrappers for the Deeplearning4j library.
For Python users, wekaPython initially provided Python integration by creating a server and communicating with it through sockets. With this, the user can execute scikit-learn ML algorithms (or even XGBoost) inside the WEKA workbench. wekaRAPIDS provides integration with the RAPIDS cuML library using the same technique as wekaPython.
Together, both packages provide enhanced functionality and performance inside the user-friendly WEKA workbench. Accelerated WEKA goes a step further in the direction of performance by improving the communication between the JVM and Python interpreter. It does so by using alternatives like Apache Arrow and GPU memory sharing for efficient data transfer between the two languages.
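The actual transfer code inside Accelerated WEKA is not shown here; purely to illustrate why Arrow-style interchange is attractive, the following hedged pyarrow sketch serializes a small table into the Arrow IPC stream format, which another process (for example, a JVM using the Arrow Java library) could read without converting values row by row. The sample data is made up.

# Illustration of Arrow IPC interchange, not the wekaRAPIDS transport itself.
import pyarrow as pa

table = pa.table({
    "feature_1": [0.1, 0.5, 0.9],
    "feature_2": [1.0, 2.0, 3.0],
    "label":     [0, 1, 1],
})

# Write the table to an in-memory Arrow IPC stream
sink = pa.BufferOutputStream()
with pa.ipc.new_stream(sink, table.schema) as writer:
    writer.write_table(table)
buf = sink.getvalue()

# A consumer (for example, the JVM side) reads the same buffer back
reader = pa.ipc.open_stream(buf)
roundtrip = reader.read_all()
assert roundtrip.equals(table)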
Accelerated WEKA also provides integration with the RAPIDS cuML library, which implements machine learning algorithms that are accelerated on NVIDIA GPUs. Some cuML algorithms can even support multi-GPU solutions.
Supported algorithms
The algorithms currently supported by Accelerated WEKA are:
LinearRegression
LogisticRegression
Ridge
Lasso
ElasticNet
MBSGDClassifier
MBSGDRegressor
MultinomialNB
BernoulliNB
GaussianNB
RandomForestClassifier
RandomForestRegressor
SVC
SVR
LinearSVC
KNeighborsRegressor
KNeighborsClassifier
The algorithms supported by Accelerated WEKA in multi-GPU mode are:
KNeighborsRegressor
KNeighborsClassifier
LinearRegression
Ridge
Lasso
ElasticNet
MultinomialNB
CD
Using Accelerated WEKA GUI
During the Accelerated WEKA design stage, one main goal was for it to be easy to use. The following steps outline how to set it up on a system along with a brief example.
Please refer to the documentation for more information and a comprehensive getting-started guide. The only prerequisite for Accelerated WEKA is having Conda installed on your system.
Accelerated WEKA is installed through Conda, a system providing package and environment management. This means a single command can install all dependencies for the project. For example, on a Linux machine, issue the following command in a terminal to install Accelerated WEKA and all of its dependencies.
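The exact command, channels, and versions are listed in the Accelerated WEKA documentation; as an illustration, it takes roughly this form (the channel list and package name here are assumptions, so check the documentation for the authoritative command):

conda create -n accelweka -c rapidsai -c nvidia -c conda-forge -c waikato weka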
After Conda has created the environment, activate it with the following command:
conda activate accelweka
This terminal instance just loaded all dependencies for Accelerated WEKA. Launch WEKA GUI Chooser with the command:
weka
Figure 1 shows the WEKA GUI Chooser window. From there, click the Explorer button to access the functionalities of Accelerated WEKA.
Figure 1. WEKA GUI Chooser window. This is the first window that appears when you start WEKA
In the WEKA Explorer window (Figure 2), click the Open file button to select a dataset file. WEKA works with ARFF files but can also read CSVs. Converting from CSV can be pretty straightforward or can require some configuration by the user, depending on the types of the attributes.
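WEKA’s built-in converters handle that conversion for you. Purely to illustrate the difference between the two formats, here is a small hedged Python sketch that writes a pandas DataFrame out as ARFF, treating non-numeric columns as nominal attributes; the file names are made up and the type handling is deliberately simplified (no quoting, missing values, or dates).

# Illustrative CSV-to-ARFF conversion. WEKA's own converters are the normal
# route; this only shows the structure of an ARFF file. Type handling is
# simplified and the file names are made up.
import pandas as pd

def dataframe_to_arff(df: pd.DataFrame, relation: str, path: str) -> None:
    lines = [f"@relation {relation}", ""]
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            lines.append(f"@attribute {col} numeric")
        else:
            # Treat non-numeric columns as nominal with their observed values
            values = ",".join(sorted(df[col].astype(str).unique()))
            lines.append(f"@attribute {col} {{{values}}}")
    lines += ["", "@data"]
    for _, row in df.iterrows():
        lines.append(",".join(str(v) for v in row))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

df = pd.read_csv("iris.csv")               # hypothetical input file
dataframe_to_arff(df, "iris", "iris.arff")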
Figure 2. In the WEKA Explorer window users can import datasets, check statistics about the attributes, and apply filters to the dataset as preprocessing
The WEKA Explorer window with a dataset loaded is shown in Figure 3. Assuming one does not want to preprocess the data, clicking the Classify tab will present the classification options to the user.
Figure 3. WEKA Explorer window with a dataset loaded. After loading the dataset (either from an ARFF file or a CSV file) the attribute names appear on the left. Information regarding the selected attribute appears in the upper right. A chart containing the distribution of the class according to the selected attribute is viewable in the lower right
The Classify tab is presented in Figure 4. Clicking the Choose button shows the implemented classifiers. Some might be disabled because of the dataset characteristics. To use Accelerated WEKA, the user must select rapids.CuMLClassifier. After that, clicking the bold CuMLClassifier text opens the options window for the classifier.
Figure 4. In the WEKA Classify tab, the user can configure the classification algorithm and the test options that are going to be used in the experiment using the previously selected dataset
Figure 5 shows the options window for CuMLClassifier. With the RAPIDS learner field, the user can choose the desired classifier among the ones supported by the package. The Learner parameters field is for modifying the cuML parameters, details of which can be found in the cuML documentation.
The other options let the user fine-tune the attribute conversion, configure which Python environment is used, and set the number of decimal places the algorithm should operate with. For the sake of this tutorial, select Random Forest Classifier and keep everything else at the default configuration. Clicking OK closes the window and returns to the previous tab.
Figure 5. With the WEKA Classifier configuration window, the user can configure the parameters of the selected classifier. In this case, it is showing the newly integrated CuMLClassifier options with the RandomForestClassifier learner selected
After configuring the Classifier according to the previous step, the parameters will be shown in the text field beside the Choose button. After clicking Start, WEKA will start executing the chosen classifier with the dataset.
Figure 6 shows the classifier in action. The Classifier output is showing debug and general information regarding the experiment, such as parameters, classifiers, dataset, and test options. The status shows the current state of the execution and the Weka bird on the bottom animates and flips from one side to the other while the experiment is running.
Figure 6. WEKA Classify tab with the chosen classification algorithm in progress
After the algorithm finishes the task, it will output the summary of the execution with information regarding predictive performance and the time taken. In Figure 7, the output shows the results for 10-fold cross-validation using the RandomForestClassifier from cuML through CuMLClassifier.
Figure 7. WEKA Classify tab after the experiment has been completed
Benchmarking Accelerated WEKA
We evaluated the performance of Accelerated WEKA by comparing the execution time of the algorithms on the CPU with the execution time using Accelerated WEKA. The hardware used in the experiments was an i7-6700K CPU, a GTX 1080Ti GPU, and a DGX Station with four A100 GPUs. Unless stated otherwise, the benchmarks use a single GPU.
We used datasets with different characteristics for the benchmarks. Some of them were synthetic for better control of the attributes and instances, like the RDG and RBF generators. The RDG generator builds instances based on decision lists. The default configuration has 10 attributes, 2 classes, a minimum rule size of 1, and a maximum rule size of 10. We changed the minimum and maximum limits to 5 and 20, respectively. With this generator, we created datasets with 1, 2, 5, and 10 million instances, as well as 5 million instances with 20 attributes.
The RBF generator creates a random set of centers for each class and then generates instances by taking random offsets from the centers for the attribute values. The number of attributes is indicated with the suffix a__ (for example, a5k means 5 thousand attributes), and the number of instances is indicated by the suffix n__ (for example, n10k means 10 thousand instances).
Lastly, we used the HIGGS dataset, which contains kinematic properties measured by particle detectors in an accelerator. The first 5 million instances of the HIGGS dataset were used to create HIGGS_5m.
The results for the wekaRAPIDS integration are shown in Tables 1 through 4, which directly compare the baseline CPU execution with the Accelerated WEKA execution. The results for WDL4J are shown in Table 5.
XGBoost (CV)
dataset         Baseline i7-6700K (seconds)   AWEKA SGM GTX 1080Ti (seconds)   Speedup
RDG1_1m         266.59                        65.77                            4.05
RDG1_2m         554.34                        122.75                           4.52
RDG1_5m         1423.34                       294.40                           4.83
RDG1_10m        2795.28                       596.74                           4.68
RDG1_5m_20a     2664.39                       403.39                           6.60
RBFa5k          17.16                         15.75                            1.09
RBFa5kn1k       110.14                        25.43                            4.33
RBFa5kn5k       397.83                        49.38                            8.06
Table 1. Execution time of experiments with XGBoost using cross-validation comparing the baseline CPU execution time with the Accelerated WEKA execution time while sharing GPU memory on a GTX 1080Ti GPU
XGBoost (no-CV)
dataset         Baseline i7-6700K (seconds)   AWEKA CSV GTX 1080Ti (seconds)   Speedup   AWEKA CSV A100 (seconds)   Speedup
RDG1_1m         46.40                         21.19                            2.19      22.69                      2.04
RDG1_2m         92.87                         34.76                            2.67      35.42                      2.62
RDG1_5m         229.38                        73.49                            3.12      65.16                      3.52
RDG1_10m        461.83                        143.08                           3.23      106.00                     4.36
RDG1_5m_20a     268.98                        73.31                            3.67      –                          –
RBFa5k          5.76                          7.73                             0.75      8.68                       0.66
RBFa5kn1k       23.59                         13.38                            1.76      19.84                      1.19
RBFa5kn5k       78.68                         34.61                            2.27      29.84                      2.64
HIGGS_5m        214.77                        169.48                           1.27      76.82                      2.80
Table 2. Execution time of experiments with XGBoost without using cross-validation. A comparison of the baseline CPU execution time with the Accelerated WEKA execution time while sending a CSV file through sockets on a GTX 1080Ti GPU. Loading times of the dataset were taken out
RandomForest (CV)
dataset         Baseline i7-6700K (seconds)   AWEKA SGM GTX 1080Ti (seconds)   Speedup
RDG1_1m         494.27                        97.55                            5.07
RDG1_2m         1139.86                       200.93                           5.67
RDG1_5m         3216.40                       511.08                           6.29
RDG1_10m        6990.00                       1049.13                          6.66
RDG1_5m_20a     5375.00                       825.89                           6.51
RBFa5k          13.09                         29.61                            0.44
RBFa5kn1k       42.33                         49.57                            0.85
RBFa5kn5k       189.46                        137.16                           1.38
Table 3. Execution time of experiments with Random Forest using cross-validation comparing the baseline CPU execution time with the Accelerated WEKA execution time while sharing GPU memory on a GTX 1080Ti GPU
KNN (no-CV)
dataset         Baseline AMD EPYC 7742, 4 cores (seconds)   wekaRAPIDS NVIDIA A100 (seconds)   Speedup   wekaRAPIDS 4X NVIDIA A100 (seconds)   Speedup
covertype       3755.80                                     67.05                              56.01     42.42                                 88.54
RBFa5kn5k       6.58                                        59.94                              0.11      56.21                                 0.12
RBFa5kn10k      11.54                                       62.98                              0.18      59.82                                 0.19
RBFa500n10k     2.40                                        44.43                              0.05      39.80                                 0.06
RBFa500n100k    182.97                                      65.36                              2.80      45.97                                 3.98
RBFa50n10k      2.31                                        42.24                              0.05      37.33                                 0.06
RBFa50n100k     177.34                                      43.37                              4.09      37.94                                 4.67
RBFa50n1m       21021.74                                    77.33                              271.84    46.00                                 456.99
Table 4. Execution time of experiments with KNN without using cross-validation comparing the baseline CPU execution time with the Accelerated WEKA execution on an NVIDIA A100 GPU
3,230,621-parameter neural network
Epochs   Baseline i7-6700K (seconds)   WDL4J GTX 1080Ti (seconds)   Speedup
50       1235.50                       72.56                        17.03
100      2775.15                       139.86                       19.84
250      7224.00                       343.14                       21.64
500      15375.00                      673.48                       22.83
Table 5. Execution time of experiments with a 3,230,621 parameter neural network comparing the baseline CPU execution time with the Accelerated WEKA execution on a GTX 1080Ti GPU. The experiments used a small subset of the MNIST dataset while increasing the number of epochs
This benchmarking shows that Accelerated WEKA provides the most benefit for compute-intensive tasks with larger datasets. Small datasets like RBFa5k and RBFa5kn1k (with 100 and 1,000 instances, respectively) show poor speedups, because the datasets are too small for the overhead of moving data to GPU memory to pay off.
This behavior is noticeable in the A100 experiments (Table 4), where the architecture is more complex. The benefits start to kick in for datasets with 100,000 instances or more. For instance, the RBF datasets with 100,000 instances show roughly 3x and 4x speedups, which is still modest but shows improvement. Bigger datasets like the covertype dataset (~700,000 instances) or the RBFa50n1m dataset (1 million instances) show speedups of 56x and 271x, respectively. Note that for deep learning tasks, the speedup can exceed 20x even on the GTX 1080Ti.
Key takeaways
Accelerated WEKA helps you supercharge WEKA with the efficient, GPU-accelerated algorithm implementations of RAPIDS while keeping an easy-to-use GUI. The installation process is simplified through a Conda environment, making it straightforward to use Accelerated WEKA from the beginning.
If you use Accelerated WEKA, please use the hashtag #AcceleratedWEKA on social media. Also, please refer to the documentation for the correct publication to cite Accelerated WEKA in academic work and find out more details about the project.
Contributing to Accelerated WEKA
WEKA is freely available under the GPL open-source license, and so is Accelerated WEKA. Accelerated WEKA is provided through Conda to automate the installation of the required tools for the environment, and the additions to the source code are published to the main WEKA packages. Contributions and bug fixes can be submitted as patch files and posted to the WEKA mailing list.
Acknowledgments
We would like to thank Ettikan Karuppiah, Nick Becker, Brian Kloboucher, and Dr. Johan Barthelemy from NVIDIA for the technical support they provided during the execution of this project. Their insights were essential in helping us reach the goal of efficient integration with the RAPIDS library. In addition, we would like to thank Johan Barthelemy for running benchmarks on additional graphics cards.
My question is: I have a variable of type ops.Tensor, and I need to convert it to a NumPy array. I tried different solutions online, but in those cases the variable needs to be an ops.EagerTensor to be converted to a NumPy object (.numpy(), tf.make_ndarray, and so on). So how can I convert my Tensor object to an EagerTensor object, or directly to NumPy?
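A hedged sketch of the usual approaches in TensorFlow 2.x follows; the variable names are made up, and the graph-mode option is shown as comments because eager execution can only be disabled at program startup.

import tensorflow as tf

# In TF 2.x eager mode, an EagerTensor converts directly:
t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
arr = t.numpy()

# A symbolic ops.Tensor (for example, one created inside tf.function) has no
# concrete values until the graph runs, so it cannot be converted in place.
# Return it from the tf.function and convert the eager result instead:
@tf.function
def double(x):
    return x * 2              # symbolic inside the traced function

arr2 = double(t).numpy()      # the returned value is eager, so .numpy() works

# For TF1-style graph tensors, evaluate them in a session instead (run this in
# a separate script, before any eager ops are created):
#   tf.compat.v1.disable_eager_execution()
#   g = tf.constant([1, 2, 3])
#   with tf.compat.v1.Session() as sess:
#       arr3 = sess.run(g)    # returns a NumPy array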