I’d like to share my experiment showing how easy it is to create your own tiny machine learning model and run inference on a microcontroller to estimate the concentration of various gases. I will illustrate the whole process with my own example: predicting the concentration of benzene (C6H6(GT)) from the other recorded measurements.
Things I used in this project: Arduino Mega 2560 and Neuton TinyML software.
To my mind, simple solutions like this can help tackle air pollution, which is now a cause of serious concern. In fact, the World Health Organization estimates that over seven million people die prematurely each year from diseases caused by air pollution. Can you imagine that?
As a result, more and more organizations responsible for monitoring emissions need effective tools to track air quality in a timely way, and TinyML solutions seem to be the best technology for that. They use little energy, are cheap to produce, and don’t require a permanent Internet connection. I believe these factors will drive the mass adoption of TinyML as a great opportunity to create AI-based devices and solve various challenges.
That is why, in my experiment, I use a very basic 8-bit MCU: to show that even such a device can run ML models today.
My dataset contained 5875 rows of hourly averaged responses from an array of metal oxide chemical sensors deployed in the field, at road level, in a polluted area in Italy. Hourly averaged concentrations of CO, Non-Methane Hydrocarbons (NMHC), Benzene, Total Nitrogen Oxides (NOx), and Nitrogen Dioxide (NO2) were also provided.
It is a regression problem.
Target metric – MAE (Mean Absolute Error). Target – C6H6(GT).
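For reference, MAE is the average of the absolute differences between the predicted and the true values. Here is a minimal Python sketch of the calculation; the function name and the sample numbers are made up for illustration and are not taken from my dataset or results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up benzene readings, for illustration only
true_values = [10.0, 12.0, 8.0, 15.0]
predictions = [11.0, 11.0, 8.0, 13.0]
mae = mean_absolute_error(true_values, predictions)
```

The lower the MAE, the closer the predictions are, on average, to the reference analyzer values, expressed in the same units as the target column.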
Attribute Information:
RH – Relative Humidity;
AH – Absolute Humidity;
T – Temperature in °C;
PT08.S3(NOx) – Tungsten oxide. Hourly averaged sensor response (nominally NOx targeted);
PT08.S4(NO2) – Tungsten oxide. Hourly averaged sensor response (nominally NO2 targeted);
PT08.S5(O3) – Indium oxide. Hourly averaged sensor response (nominally O3 targeted);
PT08.S1(CO) – Tin oxide. Hourly averaged sensor response (nominally CO targeted);
CO(GT) – True hourly averaged concentration of CO in mg/m^3 (reference analyzer);
PT08.S2(NMHC) – Titania. Hourly averaged sensor response (nominally NMHC targeted).
You can see more details and download the dataset here: https://archive.ics.uci.edu/ml/datasets/air+quality
Procedure:
Step 1: Model Training
The model was created and trained with a free tool, Neuton TinyML, because I needed a super compact model that would fit into a tiny microcontroller with 8-bit precision. I had previously tried to build such a model with TensorFlow, but the result was too large to run on 8 bits.
To train the model, I converted the dataset into a CSV file, uploaded it to the platform, and selected the target column, that is, the one the model should learn to predict.
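A conversion of this kind can be sketched in Python with the standard library alone. This is a minimal sketch rather than the exact script from the experiment. It assumes the raw file uses semicolons as separators, decimal commas, and -200 as the marker for missing readings; the helper names load_rows and write_training_csv are hypothetical, and the sample rows are made up.

```python
import csv
import io

TARGET = "C6H6(GT)"
MISSING = -200.0  # assumption: missing readings are marked with -200 in the raw file


def load_rows(src, columns, delimiter=";"):
    """Return rows as lists of floats in `columns` order, dropping incomplete rows."""
    rows = []
    for record in csv.DictReader(src, delimiter=delimiter):
        try:
            # Assumption: the raw file uses decimal commas, e.g. "11,9"
            values = [float(record[name].replace(",", ".")) for name in columns]
        except (KeyError, ValueError, AttributeError):
            continue  # missing column, non-numeric cell, or truncated row
        if MISSING in values:
            continue  # at least one sensor had no reading for this hour
        rows.append(values)
    return rows


def write_training_csv(rows, columns, dst):
    """Write a plain comma-separated file with a header row."""
    writer = csv.writer(dst)
    writer.writerow(columns)
    writer.writerows(rows)


# Tiny made-up sample standing in for the raw file; not real measurements
raw = io.StringIO(
    "Date;CO(GT);T;C6H6(GT)\n"
    "d1;2,6;13,6;11,9\n"
    "d2;-200;13,3;9,4\n"
    "d3;2,2;11,9;9,0\n"
)
columns = ["CO(GT)", "T", TARGET]
rows = load_rows(raw, columns)
out = io.StringIO()
write_training_csv(rows, columns, out)
```

With a real file, raw would be an opened file object and columns would list all the attributes above plus the target. Keeping the target in the same file matters because the platform asks you to pick the target column after upload.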
The trained model had the following characteristics:
Additionally, I built models with TF and TF Lite and measured the same metrics on the same dataset. The comparison speaks louder than words. Also, as I said above, the TF models still could not run on 8 bits, and it was exactly such a primitive device that I was interested in using.
Step 2: Embedding into a Microcontroller
Upon completion of training, I downloaded an archive containing all the necessary files: meta-information about the model in two formats (binary and HEX), the calculator, the Neuton library, and the implementation file.
Since I couldn’t run the experiment in field conditions with real gases, I developed a simple protocol to stream data from a computer.
Step 3: Running Inference on the Microcontroller
I connected the microcontroller that performed the predictions to a computer via a serial port, and it received the input data in binary format.
The microcontroller was programmed to turn on the red LED if the benzene concentration exceeded the limit, and the green LED if the concentration was within the permitted limits. Check out the videos below to see how it worked.
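To give an idea of what streaming samples in binary can look like on the computer side, here is a minimal Python sketch. The frame layout (a start byte, a value count, little-endian 32-bit floats, and a one-byte checksum) is a hypothetical illustration, not the exact protocol from the experiment. The constant START_BYTE and the functions encode_frame and decode_frame are likewise names invented for this example.

```python
import struct

START_BYTE = 0xAA  # hypothetical frame marker


def encode_frame(features):
    """Pack feature values as little-endian float32 with a header and checksum."""
    if len(features) > 255:
        raise ValueError("too many values for a one-byte count")
    payload = struct.pack("<%df" % len(features), *features)
    checksum = sum(payload) & 0xFF  # simple additive checksum (assumption)
    return bytes([START_BYTE, len(features)]) + payload + bytes([checksum])


def decode_frame(frame):
    """Reverse of encode_frame; raises ValueError on a malformed frame."""
    if len(frame) < 3 or frame[0] != START_BYTE:
        raise ValueError("bad frame header")
    count = frame[1]
    payload, checksum = frame[2:-1], frame[-1]
    if len(payload) != 4 * count or (sum(payload) & 0xFF) != checksum:
        raise ValueError("corrupt frame")
    return list(struct.unpack("<%df" % count, payload))


# Made-up feature values, chosen so they survive the float32 round trip exactly
sample = [2.5, 1360.0, 13.5]
frame = encode_frame(sample)
decoded = decode_frame(frame)
```

Little-endian float32 is a natural choice here because the AVR chip on the Arduino Mega stores 4-byte floats in that byte order, so the firmware can copy the payload straight into a float array. Sending the frame would additionally need a serial library on the computer, which is left out of this sketch.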
In this case, the concentration of benzene is within reasonable bounds (<15 mg/m3).
In this case, the concentration of benzene exceeds the limits (>15 mg/m3).
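The decision rule behind the two LEDs is a single comparison against the limit. The firmware itself runs on the microcontroller, so the Python sketch below only mirrors that rule on the computer side, for example to check streamed predictions. The function name led_for is hypothetical, and treating a value of exactly 15 as within limits is my assumption.

```python
BENZENE_LIMIT = 15.0  # threshold from the text above, in the target column's units


def led_for(prediction, limit=BENZENE_LIMIT):
    """Return which LED should light for a predicted benzene concentration."""
    # Assumption: a value exactly at the limit counts as within limits
    return "red" if prediction > limit else "green"


# Made-up predictions, for illustration only
states = [led_for(p) for p in (9.4, 15.0, 21.7)]
```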
My example shows how anyone can use the TinyML approach to create compact but smart devices, even with 8-bit precision. I’m convinced that the low production costs and high efficiency of TinyML open up enormous opportunities for its worldwide adoption.
Because there is no need to involve technical specialists, in this particular case even non-data scientists can rapidly build super compact models and deploy smart AI-driven devices throughout an area to monitor air quality in real time. To my mind, it’s really inspiring that such small solutions can help us improve the environmental situation on a global scale!