(Source: putilov_denis- stock.adobe.com)
Welcome to the second blog in our Edge Impulse Fundamentals series. In the first blog, we surveyed the various tools and mechanisms Edge Impulse offers. In this blog, we will take a practical look at the overall Edge Impulse workflow: from data collection and training to firmware deployment on targeted edge devices. To help with this, let’s imagine a real-world example: a device that listens for a “secret” series of knocks and unlocks a door when the correct sequence is detected. We will leverage the microphone aboard the Arduino Nano 33 BLE Sense development board.
First, let’s download the necessary software to make all this work: the Edge Impulse CLI tools (which include the serial daemon used to connect the board to the Studio) and the Edge Impulse firmware image for the Nano 33 BLE Sense.
At this point, we simply need to flash the firmware and launch the serial daemon. Conveniently included in the firmware repository are Windows, macOS, and Linux scripts that automate this process for select development boards, including the Nano 33 BLE Sense. If you haven’t done so already, create an account and log in to the Edge Impulse Studio (https://studio.edgeimpulse.com). Then click on the Devices tab to confirm that your development board successfully made contact with the Edge Impulse service (Figure 1). With all that done, it’s time to dive into the real focus of this article: how to train a new model from scratch.
Figure 1: Edge Impulse provides native support for a wide range of development boards to connect directly to its training and testing environment. (Source: Green Shoe Garage)
For this project, we will use the Nano 33 BLE Sense’s built-in microphone to listen for a distinct pattern of knocks. In order to train the model, we will need to collect two datasets—one that captures the ambient sound with no knocks and one that captures the series of secret knocks. This process is called ingestion. Click on the Data Acquisition tab and look for the Record New Data section.
Some key attributes of note are the Sample Length and Frequency. The sample length determines how long a recording will be, and the frequency determines the number of samples taken per second. Edge Impulse recommends capturing ten minutes of audio (captured in one-minute chunks): five minutes of just ambient noise and five minutes of the knock sequence. Keep in mind that capturing one minute's worth of audio may take several minutes to complete, since the limited memory on your particular development board constrains how much can be buffered at once.
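To get a feel for why one-minute chunks take a while to capture and upload, it helps to run the numbers. The sketch below is a back-of-the-envelope sizing calculation; the 16 kHz sample rate and 16-bit sample depth are assumptions typical of embedded audio projects, not values taken from this article.

```python
# Rough sizing for a raw audio capture, assuming 16-bit PCM samples
# at 16 kHz -- both values are assumptions, not Edge Impulse settings.

SAMPLE_RATE_HZ = 16_000   # samples per second (assumed)
BYTES_PER_SAMPLE = 2      # 16-bit PCM (assumed)

def capture_size_bytes(seconds: float) -> int:
    """Bytes needed to hold `seconds` of raw audio at the rates above."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

one_minute = capture_size_bytes(60)
print(one_minute)  # 1920000 bytes, roughly 1.9 MB per one-minute chunk
```

At roughly 1.9 MB per minute under these assumptions, a single capture dwarfs the 256 KB of RAM on a board like the Nano 33 BLE Sense, which is why the firmware must stream the recording up in pieces rather than buffer it whole.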
After we have the necessary raw data, it must be processed and turned into a neural network (Figure 2). In Edge Impulse parlance, this is called designing an ‘impulse.’ This is a multi-part process, where we first define how to chop the raw data into windows by specifying two values: the window size controls how much time, in milliseconds, each window lasts, while the window increase controls how far the start of each subsequent window is offset from the previous one.
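The interplay of the window size and window increase determines how many training windows a single recording yields; a smaller increase produces more (overlapping) windows from the same data. A minimal sketch of that arithmetic, using hypothetical values:

```python
def num_windows(total_ms: int, window_size_ms: int, window_increase_ms: int) -> int:
    """How many (possibly overlapping) windows fit in a recording.

    Each window lasts `window_size_ms`; each subsequent window starts
    `window_increase_ms` after the previous one.
    """
    if total_ms < window_size_ms:
        return 0
    return (total_ms - window_size_ms) // window_increase_ms + 1

# A 10-second clip with 1000 ms windows sliding forward 500 ms each time:
print(num_windows(10_000, 1000, 500))   # 19 windows
# Non-overlapping windows (increase == size) yield exactly 10:
print(num_windows(10_000, 1000, 1000))  # 10 windows
```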
Figure 2: Edge Impulse tools provide a simple manner to review and tag training and test data right from the browser. (Source: Green Shoe Garage)
With the raw data broken into appropriately sized windows, it’s time to transform it into something useful to the neural network training algorithms. This process begins by sending the raw data through the aptly named processing block. For audio data, Edge Impulse applies a signal processing technique called Mel Frequency Cepstral Coefficients (MFCC). Other processing blocks are available, including blocks for images, flattening (slow-moving data like humidity readings), and spectral analysis (fast-moving data like an accelerometer), along with the ability to create a custom processing block. Several variables can be tweaked within the MFCC processing block, such as the number of coefficients, the frame length and stride, and the number and frequency range of the mel filters.
By tweaking these parameters, you can alter the output of the MFCC block, which is visualized as a spectrogram. The goal of tweaking these parameters is to ensure that the features that distinguish the knock and no-knock datasets are accurately and efficiently extracted from the raw data. Better results at this stage will make it easier for the neural network to infer correctly once the device is fielded in real-world conditions.
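Under the hood, MFCC front ends space their filters on the mel scale, which is why the spectrogram devotes more resolution to low frequencies, where most of a knock’s energy lives. The sketch below shows the standard Hz-to-mel mapping; the specific frequency range and filter count are illustrative choices, not values from this project.

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Standard mel-scale mapping used by typical MFCC front ends."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse mapping, used to place filterbank centers back in Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_centers(low_hz: float, high_hz: float, n_filters: int) -> list:
    """Center frequencies of a mel filterbank: evenly spaced in mel,
    hence logarithmically spaced in Hz (denser at the low end)."""
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# Ten filters between 300 Hz and 8 kHz: the gaps widen as frequency rises.
print([round(f) for f in filter_centers(300.0, 8000.0, 10)])
```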
Remember that different data types require different methods to prepare the data for use in machine learning (ML) algorithms. Knowing which is correct for your data is a big part of the education and experience you will gain working on ML projects.
The massaging of the raw data is followed by a so-called learning block, which takes the output of the processing block and uses it to train a neural network model. From the Edge Impulse Studio, select NN (Keras) Classifier from the left-hand navigation pane; this block is suited for categorizing movement or recognizing audio. There is also transfer learning for classifying images and K-means anomaly detection for finding outliers in new data. A few parameters can be tweaked before we train the neural network model, such as the number of training cycles (epochs), the learning rate, and the minimum confidence rating.
With parameters adjusted, click on Start Training; at the end, you will have a trained neural network.
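Conceptually, the classifier trained here is a small feed-forward network: MFCC features go in, and a probability per class (“knock” vs. “noise”) comes out. The toy forward pass below, with made-up weights and layer sizes, is only meant to illustrate that data flow; it is not Edge Impulse’s actual architecture.

```python
import math

def dense_relu(x, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features):
    # Made-up weights for a 3-feature -> 4-unit -> 2-class toy network.
    hidden = dense_relu(features,
                        weights=[[0.2, -0.1, 0.5], [0.4, 0.3, -0.2],
                                 [-0.3, 0.8, 0.1], [0.6, -0.5, 0.2]],
                        biases=[0.1, 0.0, -0.1, 0.2])
    logits = [sum(w * h for w, h in zip(row, hidden))
              for row in [[0.7, -0.4, 0.5, 0.3], [-0.6, 0.5, -0.2, 0.1]]]
    return softmax(logits)  # [P(knock), P(noise)]

print(classify([0.9, 0.2, 0.4]))
```

Training is the process of adjusting those weights (over many epochs, at a given learning rate) until the output probabilities match the labels in the training data.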
Just as earning a degree isn’t the end of professional learning, the effectiveness of a neural network can be improved by feeding it real-world data (meaning new data that was not used to train the neural network initially). This is a two-step process in Edge Impulse. The first is a quick test called Live classification, which exposes the neural network to new data to see how well it performs (Figure 3); every time live classification is run, the data is added to an ever-expanding test dataset. One concern here is overfitting, where the neural network responds excellently to the training data but not to new, real-world data because the model has, in effect, “memorized” the training set. The second, more rigorous step is known as Model testing, which evaluates the model against that accumulated test dataset.
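A crude but useful way to spot overfitting is to compare accuracy on the training data with accuracy on held-out test data and flag a large gap. The helper below is a rule of thumb of our own; the 10-point threshold is an arbitrary illustration, not an Edge Impulse metric.

```python
def looks_overfit(train_acc: float, test_acc: float, gap: float = 0.10) -> bool:
    """Flag a model whose training accuracy far exceeds its held-out
    accuracy -- the classic symptom of memorizing the training set."""
    return (train_acc - test_acc) > gap

print(looks_overfit(0.99, 0.72))  # True: 27-point gap, likely memorized
print(looks_overfit(0.95, 0.91))  # False: small gap, generalizing well
```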
Figure 3: The ability to visualize and tune the raw data to improve the efficiency and accuracy of the model is a significant advantage of Edge Impulse over earlier manual processes associated with ML development. (Source: Green Shoe Garage)
Once the impulse has been trained and verified, it’s time to deploy the model back to your device. From an end user’s perspective, the magic of AI occurs during inferencing: a fully trained model is deployed to either a cloud environment or an edge device so it can begin making predictions based on real-world interactions. Reaching that payoff requires an easy-to-use software library that can be integrated into one’s project. Edge Impulse packages up the complete impulse (including the MFCC algorithm, neural network weights, and classification code) into a single C++ library. This lets the model run on low-powered embedded systems that may even lack an internet connection.
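On the device, the firmware’s job after inference is simple: read the per-class scores the deployed library produces and act only when the model is confident. The Python sketch below mirrors that decision logic conceptually; the label names and the 0.8 threshold are hypothetical, and real firmware would consume the generated C++ library instead.

```python
def decide(scores: dict, threshold: float = 0.8) -> str:
    """Return the top-scoring label if it clears the confidence
    threshold; otherwise report uncertainty (and keep the door locked)."""
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else "uncertain"

print(decide({"knock": 0.93, "noise": 0.07}))  # knock -> unlock the door
print(decide({"knock": 0.55, "noise": 0.45}))  # uncertain -> stay locked
```

Requiring a high confidence before unlocking trades a few missed knocks for far fewer false unlocks, which is usually the right trade for a door lock.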
Join us for the third part of the Edge Impulse Fundamentals series. In part three, we will explore in detail one of the most crucial steps of the Edge Impulse workflow: impulse design.
Michael Parks, P.E. is the co-founder of Green Shoe Garage, a custom electronics design studio and embedded security research firm located in Western Maryland. He produces the Gears of Resistance Podcast to help raise public awareness of technical and scientific matters. Michael is also a licensed Professional Engineer in the state of Maryland and holds a Master’s degree in systems engineering from Johns Hopkins University.