In this application, you'll build a binary audio classifier. "Binary" here means there are two classes of audio the model will be able to classify: the sound you're interested in and everything else. In this README, we'll use the example of a "running faucet" detector. However, you can easily apply the principles described here to any other binary audio classification task (we even built a cat meow detector!).
You'll use Edge Impulse, a free edge AI platform, to collect data, train the model, and download the model as an Arduino library for use in your Swan firmware. The firmware continually collects 1-second clips of audio via the microphone and feeds them into your audio classifier model, publishing a note to Notehub if it detects that the faucet is running.
You will use the microphone and Swan to stream audio samples to your development machine over the USB connection. We recommend using a laptop here, as it will be easier to position everything near the faucet for collecting audio.
First, you need to flash the data acquisition firmware onto the Swan.
Open VS Code, click the PlatformIO icon on the left side of VS Code, click "Pick a folder", and select the directory 38-audio-classifier/data_acquisition.
Click the PlatformIO icon again, and under the Project Tasks menu, click "Build" to build the firmware image.
Prepare the Swan to receive the firmware image via DFU by following these instructions from the Swan Quickstart.
Under the Project Tasks menu, click "Upload" to upload the firmware image to the Swan. Note that this step can fail if 1) you already have a serial connection open to the Swan or 2) the micro USB cable is for charging only.
Under the Project Tasks menu, click "Monitor" to view the audio data streaming over serial. You should see a rapid stream of integers pouring down the console.
Close the Monitor task in the pane to the right of the serial console.
Now, you're ready to stream audio data to Edge Impulse.
Run edge-impulse-data-forwarder in a terminal window.
Follow the command line prompts to login and select your project. When prompted to name the sensor axis, enter audio. You can name the device whatever you like.
The CLI will print out a URL like this: https://studio.edgeimpulse.com/studio/223641/acquisition/training. Open the URL in your browser.
You'll see this form:
From here, follow the "Build a dataset" section of this Edge Impulse tutorial. Make sure to change the label and sample length before recording each scenario. For example, for the "2 minutes of background noise without much additional activity", make sure to set sample length to 120000 milliseconds and the label to "noise". Once you've set the label and sample length, click "Start sampling" to collect the audio data. Repeat this for each scenario described in the Edge Impulse tutorial to create your custom dataset.
Once you've collected your training data, complete sections 4-9 of the Edge Impulse tutorial to train the model. Note that if you created a custom dataset, you may see a warning about having an empty testing set (i.e. all your data is in the training set). You can ignore this for now. Section 8 of the tutorial gives you an opportunity to collect data for your test set.
From your Edge Impulse project page, click "Deployment", select "Arduino library", scroll down to the bottom of the page, and click Build. Once the build is complete, your browser will prompt you to download a .zip file with an Arduino library containing your model. Save this .zip file into 38-audio-classifier/classifier/lib and unzip it there.
There are a few environment variables that allow you to dynamically configure the behavior of the firmware. From your Notehub project page, click "Environment" in the left hand pane and configure the following environment variables:
label: This is the label of interest. For instance, if your dataset has the labels faucet and noise, you should set this to faucet. There is no default label of interest in the firmware, so it must be set for detection notes to be published.
detection_threshold: This is the probability level that must be exceeded for the firmware to publish a detection note. For example, if label is set to faucet and the detection_threshold is set to 0.7, the firmware will only publish a detection note if the model predicts faucet with > 0.7 probability. By default, this is set to 0.6.
publish_rate: Detection notes will only be published a maximum of once every publish_rate seconds. For example, if you set this variable to 45, a max of one note will be published every 45 seconds, even if there are multiple detections in that time frame. By default, this is set to 60 in the firmware.
Use VS Code to open the directory 38-audio-classifier/classifier.
In the file explorer, open main.cpp and uncomment this line: // #define PRODUCT_UID "com.my-company.my-name:my-project". Replace com.my-company.my-name:my-project with the ProductUID of the Notehub project you created in Notehub Setup.
You will also need to add the header file for your classifier in main.cpp:
// Include your specific Edge Impulse header file here:
// #include &lt;running_faucet_detector_inferencing.h&gt;
Uncomment the #include and change running_faucet_detector_inferencing.h to the name of your specific header file. This header file is located in the src subdirectory of the directory you unzipped in Downloading the Model.
Click the PlatformIO icon, and under the Project Tasks menu, click "Build" to build the firmware image.
The timing information in the serial log is purely diagnostic and indicates how long the digital signal processing (DSP) and classification operations took, in milliseconds. The log then shows the probabilities output by the model for each class.
Move your laptop and the hardware so that the microphone is in range of the sound of a faucet. Start the Monitor task in VS Code so you can see the serial log. Turn on the faucet, and you should see this in the log if detection was successful: