Banner image credit @nahakiole on Unsplash.
Blues University is an ongoing series of articles geared towards beginner- and intermediate-level developers who are looking to expand their knowledge of embedded development and the IoT.
Embedded systems are increasingly widespread in the Internet of Things (IoT), powering everything from smart homes to industrial sensors. With connected devices on the rise, these systems are more critical than ever.
You can harness the true power of embedded systems when connecting them to cloud platforms. The cloud allows for data storage, computational offloading, and real-time analytics. It supports advanced processing, limitless storage, and intricate data analysis that isn't feasible on unconnected IoT devices.
This connectivity amplifies the value of the data collected and transforms it into actionable insights that can drive innovation and optimize various processes.
A high-level overview of how Blues orchestrates data from embedded devices through to your cloud platform of choice:
Embedded Systems and Their Characteristics
Embedded devices tailored for specific tasks generate data that varies in volume, structure, and frequency. The volume ranges from the minimal readings of a home temperature sensor to extensive outputs from industrial monitoring systems. Data may be structured, like numerical readings, or unstructured, like images. Frequency varies too: some devices send data in real time, while others send it in batches.
Data from embedded systems falls into two categories: static and dynamic. Static data, like configuration settings, is generally constant and updated infrequently. Dynamic data, such as sensor readings, changes often and may require real-time transmission to the cloud.
Understanding these data characteristics is vital for efficient and secure data transfer to the cloud — a topic we'll explore in subsequent sections.
Data Serialization and Compression
Embedded systems often operate in resource-constrained environments, so they must prioritize data transmission efficiency — especially when using cloud platforms, where constant, high-volume transfers quickly become costly. Serialization and compression are critical for ensuring this efficiency.
Serialization describes converting complex data structures, including arrays and objects, into a format that's easy to store, transmit, and reconstruct. Common serialization formats include JavaScript Object Notation (JSON), Protocol Buffers (Protobuf), and MessagePack. JSON, for example, is especially popular due to its widespread adoption and simplicity: It's lightweight and easy for humans to read and write — and for machines to parse and generate.
To illustrate, consider the following example, written in Arduino C++ using the ArduinoJson library:
#include <ArduinoJson.h>
StaticJsonDocument<200> doc;
doc["temperature"] = 23.4;
doc["humidity"] = 68;
char buffer[256];
serializeJson(doc, buffer);
The above example creates a simple JSON object with temperature and humidity readings and then serializes it into a string buffer.
On the other hand, Protobuf and MessagePack are binary serialization formats. They are more space-efficient than JSON but require dedicated libraries and tools for encoding and decoding. Protobuf, for example, requires a schema definition, which helps achieve more compact serialization.
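To make the size difference concrete, here is a minimal sketch in plain C that packs the same two readings into a fixed 4-byte layout. It is a hand-rolled format for illustration only, not Protobuf or MessagePack. The helper names pack_reading and unpack_reading and the tenths-of-a-degree scaling are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical fixed layout (not Protobuf or MessagePack):
 * bytes 0-1: temperature in tenths of a degree, signed, big-endian
 * bytes 2-3: relative humidity in percent, unsigned, big-endian */
#define READING_SIZE 4

/* Pack a reading into out[0..3]; returns the number of bytes written. */
size_t pack_reading(int16_t temp_tenths, uint16_t humidity, uint8_t out[READING_SIZE]) {
    uint16_t t = (uint16_t)temp_tenths;
    out[0] = (uint8_t)(t >> 8);
    out[1] = (uint8_t)(t & 0xFF);
    out[2] = (uint8_t)(humidity >> 8);
    out[3] = (uint8_t)(humidity & 0xFF);
    return READING_SIZE;
}

/* Reverse of pack_reading: rebuild both values from the 4-byte layout. */
void unpack_reading(const uint8_t in[READING_SIZE], int16_t *temp_tenths, uint16_t *humidity) {
    *temp_tenths = (int16_t)(uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    *humidity = (uint16_t)(((uint16_t)in[2] << 8) | in[3]);
}
```

With this layout, 23.4 degrees and 68% humidity occupy 4 bytes, compared with 34 characters for the minified JSON string {"temperature":23.4,"humidity":68}. The trade-off is that both ends must agree on the layout in advance, which is the role a schema plays in Protobuf.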
Data compression is pivotal for embedded systems due to their often limited bandwidth and memory. This process reduces data size, making storage and transmission more efficient, significantly optimizing these systems' performance.
For embedded systems, tools like the miniLZO library offer fast data compression and decompression optimized for low memory usage and processing power. However, it's important to recognize the underlying techniques that enable these tools to function efficiently.
One high-level technique is run-length encoding (RLE). RLE compresses data by representing sequences of repeated values as a single value followed by a count. For instance, the sequence AAAABBB would be A4B3. This technique is especially effective for compressing data with long spans of repeated values.
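The idea can be sketched in a few lines of plain C. This is a minimal text-based illustration, assuming NUL-terminated input. The function name rle_encode is hypothetical and not part of any library mentioned here.

```c
#include <stddef.h>
#include <stdio.h>

/* Run-length encode a NUL-terminated string into out as value+count pairs
 * (for example "AAAABBB" becomes "A4B3").
 * Returns the encoded length, or 0 if the input is empty or out is too small. */
size_t rle_encode(const char *in, char *out, size_t out_size) {
    size_t written = 0;
    size_t i = 0;
    if (out_size > 0) {
        out[0] = '\0';
    }
    while (in[i] != '\0') {
        char value = in[i];
        size_t run = 1;
        /* Count how many times the current value repeats. */
        while (in[i + run] == value) {
            run++;
        }
        int n = snprintf(out + written, out_size - written, "%c%lu",
                         value, (unsigned long)run);
        if (n < 0 || (size_t)n >= out_size - written) {
            return 0; /* output buffer too small */
        }
        written += (size_t)n;
        i += run;
    }
    return written;
}
```

Note that input without runs grows rather than shrinks: "ABC" encodes to "A1B1C1". This text form is also ambiguous when the data itself contains digits, so a binary variant would typically store each value and count as separate bytes.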
Another noteworthy technique is Huffman coding, a variable-length prefix coding algorithm frequently used for lossless data compression. The principle behind Huffman coding is to assign shorter codes to more frequent characters, reducing the length of the encoded data.
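Consider the code below, which uses the miniLZO library. miniLZO implements LZO1X, a dictionary-based algorithm in the Lempel-Ziv family rather than RLE or Huffman coding, but it rests on the same idea: replacing redundant data with a shorter representation. The library makes this kind of compression practical on platforms like Arduino.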
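#include <minilzo.h>
// Size these buffers to fit your device's RAM
#define IN_LEN (128*1024ul)
#define OUT_LEN (IN_LEN + IN_LEN / 16 + 64 + 3)
static unsigned char __LZO_MMODEL in[IN_LEN];
static unsigned char __LZO_MMODEL out[OUT_LEN];
// Work memory required by the compressor, aligned as in miniLZO's test program
#define HEAP_ALLOC(var, size) \
    lzo_align_t __LZO_MMODEL var[((size) + (sizeof(lzo_align_t) - 1)) / sizeof(lzo_align_t)]
static HEAP_ALLOC(wrkmem, LZO1X_1_MEM_COMPRESS);
lzo_uint out_len;
// Initialize the library once, then compress the data
if (lzo_init() == LZO_E_OK) {
    lzo1x_1_compress(in, IN_LEN, out, &out_len, wrkmem);
}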
Balancing efficiency and the computational overhead on the embedded device is vital when choosing serialization and compression techniques. Though more efficient methods might save on data transfer costs and speed, they require more on-device processing power and memory.
Whether working with simple JSON structures or compressing data for minimal bandwidth usage, understanding these techniques is crucial for any engineer or developer in the IoT landscape.
Secure Transmission from Embedded Devices
In the connected world of IoT, security is paramount. Embedded devices often generate sensitive or critical data, meaning safe transmission is a priority. With the increasing sophistication of data breaches and cyber-attacks, securing data at the source — the embedded devices — is essential.
Blues Notecard provides "off-the-internet" connectivity, an approach designed to keep data secure as it syncs from your embedded devices to the cloud.
Ensuring Data Integrity
One primary concern is data integrity: ensuring that the data received at the other end is the same as the data the device sent. Hashing and checksums are the usual tools for verifying this.
A hash function takes an input and produces a fixed-size value, called a digest, that is effectively unique to that input. Checksums, on the other hand, are simpler calculations used to detect transmission errors.
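As an example of the simpler checksum side, here is a minimal sketch of a CRC-16 in plain C, using the CRC-16/CCITT-FALSE parameters. The function names crc16_ccitt and payload_is_intact are hypothetical, and the choice of this particular CRC is an assumption. Many protocols use other variants.

```c
#include <stdint.h>
#include <stddef.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        /* Fold the next byte into the high byte of the CRC register. */
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Receiver-side check: recompute the CRC and compare it with the transmitted one. */
int payload_is_intact(const uint8_t *data, size_t len, uint16_t received_crc) {
    return crc16_ccitt(data, len) == received_crc;
}
```

The sender appends the CRC to each payload and the receiver recomputes it. A CRC catches accidental corruption but not deliberate tampering, since an attacker can recompute it. That is where cryptographic hashes come in.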
Using a library like CryptoSuite for Arduino, computing a SHA-256 hash might look something like this (class and method names vary between SHA-256 libraries):
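#include <sha256.h>
#include <string.h>
const char* data = "SecureData";
uint8_t hash[SHA256::HASH_SIZE];
SHA256 ctx = SHA256();
ctx.init();
// Feed the message bytes into the hash, then write the digest to hash
ctx.update((const uint8_t*)data, strlen(data));
ctx.final(hash);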
This example generates a SHA-256 hash for the data, producing a unique digest. If a single character in the data changes, the digest changes drastically.
Preserving Confidentiality with Data Encryption
While ensuring data integrity is vital, ensuring it remains confidential in transit is equally crucial. For this, encryption is employed. Encryption scrambles data so that only someone with the correct decryption key can understand it.
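Advanced Encryption Standard (AES) is a common symmetric encryption algorithm. Using a library like AESLib for Arduino, you can encrypt data roughly as follows (the exact encrypt() arguments differ between AESLib versions, so check the signature for the version you install):
#include <AESLib.h>
// AES-128 needs a key of exactly 16 bytes; unused bytes here are zero-padded
uint8_t key[16] = "SuperSecretKey";
uint8_t data[] = "SensitiveInfo";
char encrypted[32];
AESLib aesLib;
aesLib.encrypt((byte*)data, key, encrypted, AES128);
Here, the data is encrypted using AES with a 128-bit key. The encrypted data is transmitted, and only devices or platforms with the right key can decrypt and understand the data. In a real deployment, provision the key securely rather than hard-coding it in the sketch.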
Secure Data Transmission Protocols
Beyond hashing and encryption, choosing the proper protocol for data transmission can further bolster security. Common IoT communication protocols like MQTT and Constrained Application Protocol (CoAP) offer secure variants, MQTTS and CoAPS, respectively. MQTTS runs MQTT over Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), while CoAPS typically runs over Datagram TLS (DTLS) because CoAP is UDP-based.
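For instance, establishing a secure MQTTS connection using the MQTT library for Arduino might look as follows. This sketch assumes a board whose core provides a TLS-capable WiFiClientSecure client, such as an ESP32:
#include <WiFiClientSecure.h>
#include <MQTT.h>
WiFiClientSecure net;
MQTTClient client;
const char* broker = "broker.example.com";
const char* topic = "secure/topic";
// Load the broker's CA certificate into net before connecting (API varies by board)
// Port 8883 is the conventional port for MQTT over TLS
client.begin(broker, 8883, net);
client.connect("DeviceID", "Username", "Password");
client.publish(topic, "Hello, Secure World!");
In this example, the device connects to an MQTT broker over TLS and publishes a message to a topic. Because the MQTT client is given a TLS-capable network client, the connection and data transfer are encrypted.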
Cloud Systems Suited for Embedded Devices
As IoT proliferates, finding an appropriate cloud platform for managing and analyzing embedded device data becomes increasingly significant. Different cloud providers have curated platforms tailored for the IoT landscape, offering a range of features designed to cater to the unique needs of embedded systems.
AWS, Azure, Google Cloud Platform, Datacake, Ubidots, Losant, Blynk, and ThingWorx are just some of the cloud platforms that provide avenues for processing IoT data.
TIP: Learn how Blues Notehub provides easy data routing to virtually any cloud platform.
When evaluating which platform to choose, the following considerations are vital.
Scalability
Scalability is often a top priority. Because IoT deployments can range from a handful of devices to millions, it's critical to choose a platform that can scale seamlessly as needs grow. Most top-tier platforms, such as those mentioned above, offer extensive scalability, ensuring they can handle the demands of expansive IoT networks.
Cost
Cost is another significant factor. Different platforms have varying pricing models based on the number of messages, amount of data storage, or computational requirements. Understanding these pricing structures is essential to ensure the chosen platform remains cost-effective as the IoT deployment grows.
Support and Community Engagement
Support and community engagement can be critical, especially as you first embark on your IoT journey. Platforms backed by significant cloud providers typically have extensive documentation, tutorials, and active community forums. These resources can be invaluable for troubleshooting, learning best practices, or understanding how others have approached similar challenges.
Features
Finally, the platform's features should align with the project's requirements. Some projects might prioritize real-time analytics, while others might need robust device management abilities or ML integrations. Confirming that the platform's capabilities are in line with the project's goals is fundamental.
Device Identity and Authentication
In the digital sprawl of the IoT, security is of significant concern. As countless embedded devices interconnect, sending and receiving vast amounts of data, ensuring that each device is precisely what it claims to be becomes paramount.
Enter the dual concepts of device identity and authentication.
Unique Device Identification
Each device in an IoT network must have a unique identity, akin to an individual's social security number or a computer's IP address. This approach ensures the receiving entity can ascertain the device's legitimacy when it communicates by sending data or requesting actions.
Without such identification, the web of interconnected devices would quickly become unmanageable, with no definitive way to determine the origin or validity of a data packet.
For embedded devices, you can implement Unique Device Identification (UDI) using methods like the MAC address of network modules, built-in chip IDs in specific microcontrollers, or by adding external EEPROMs or flash memory with unique serial numbers. Additionally, manufacturers can program a unique serial number into each device's EEPROM. The method chosen should consider permanence, security, cost, and complexity.
In an Arduino context, you can assign a MAC address to a device with an Ethernet interface and print it as follows:
#include <Ethernet.h>
byte mac[] = {0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED};
Ethernet.begin(mac);
Serial.print("MAC Address: ");
for (int i = 0; i < 6; i++) {
  Serial.print(mac[i], HEX);  // print each byte of the address in hexadecimal
  if (i < 5) Serial.print(':');
}
Serial.println();
In the example above, a MAC address is assigned and then printed to the serial console, representing the device's unique identifier in the network.
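The same identifier is often needed as a printable string, for example to use as an MQTT client ID. A minimal sketch in plain C, where mac_to_string is a hypothetical helper name:

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Format a 6-byte MAC address as "DE:AD:BE:EF:FE:ED".
 * out must hold at least 18 bytes; returns 0 on success, -1 otherwise. */
int mac_to_string(const uint8_t mac[6], char *out, size_t out_size) {
    int n = snprintf(out, out_size, "%02X:%02X:%02X:%02X:%02X:%02X",
                     mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    return (n == 17 && (size_t)n < out_size) ? 0 : -1;
}
```

Zero-padding each byte to two hex digits keeps the string a fixed length, which makes it easier to compare and store on the cloud side.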
Securing Device Authenticity
Having an identity is just the starting point. In the IoT ecosystem, it's not enough for a device to merely state its identity — it must prove it. This authentication step ensures that rogue devices or malicious entities cannot masquerade as legitimate devices.
Certificates and tokens are commonly used mechanisms for this authentication. Certificates, usually in the form of X.509 certificates, provide a cryptographic seal of authenticity. When a device presents its certificate, the recipient can verify its legitimacy against a trusted certificate authority (CA).
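Here's a simplified example of setting up an SSL/TLS connection on an Arduino with the WiFi101 library. In this sketch the device verifies the server's X.509 certificate; proving the device's own identity additionally requires a client certificate (mutual TLS):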
#include <WiFi101.h>
#include <WiFiSSLClient.h>
char ssid[] = "networkSSID"; // Network SSID
char pass[] = "password"; // Network password
WiFiSSLClient client;
const char* server = "secure.server.com";
void setup() {
  WiFi.begin(ssid, pass);
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
  }
  client.connect(server, 443);
  // The X.509 certificate is checked during the connection
}
void loop() {
  // Communication using the secured client
}
On the other hand, tokens are like digital “passes” issued to devices after an initial authentication step. JSON Web Tokens (JWTs) are a popular choice. Once a device possesses a token, it can present it for subsequent communications, proving authenticity.
In an IoT context, once an embedded device has its token (typically after a login or registration step), it can include it in its communications headers. Full JWT implementations on platforms like Arduino can be complex due to their computational needs. But conceptually, once a JWT is acquired, it can be attached to HTTP headers for subsequent secure communications.
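Attaching the token is the simple part. As a minimal sketch in plain C, assuming the token has already been obtained and the standard Bearer scheme is used, a helper (hypothetically named build_auth_header) could format the header line like this:

```c
#include <stdio.h>
#include <stddef.h>

/* Write "Authorization: Bearer <token>\r\n" into out.
 * Returns the header length, or -1 if out is too small. */
int build_auth_header(const char *token, char *out, size_t out_size) {
    int n = snprintf(out, out_size, "Authorization: Bearer %s\r\n", token);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

The resulting line is sent along with the other HTTP request headers. Because anyone holding the token can reuse it, it should only be sent over an encrypted connection.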
Streamlining Data Ingestion in the Cloud
The ubiquity of IoT devices translates to a colossal influx of data into cloud platforms. Managing this data, ensuring its smooth ingestion, and providing optimal storage solutions becomes an art. Streamlining these processes can mean the difference between a responsive, efficient system and one bogged down by data bottlenecks.
Direct Ingestion Versus Using Intermediaries
One primary decision in this arena is between direct ingestion and using intermediaries like gateways or brokers. Direct ingestion sends data directly from the embedded device to the cloud platform. This straightforward approach reduces the potential failure points that intermediaries may introduce.
An Arduino device, for instance, equipped with a WiFi or Ethernet shield, can directly send data to cloud endpoints via HTTP POST requests:
#include <Ethernet.h>
#include <SPI.h>
byte mac[] = {0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED};
EthernetClient client;
// Assumes Ethernet.begin(mac) has already been called in setup()
void sendDataToCloud(String data) {
  // Port 80 is plain HTTP; use a TLS-capable client for sensitive data
  if (client.connect("cloudplatform.com", 80)) {
    client.println("POST /data-endpoint HTTP/1.1");
    client.println("Host: cloudplatform.com");
    client.println("Content-Length: " + String(data.length()));
    client.println("Connection: close");
    client.println();
    client.print(data);
  }
}
Using Gateways or Brokers
However, direct ingestion might not always be viable or efficient, especially for devices located in remote areas or those producing vast amounts of data. That's where gateways or brokers come in. These intermediaries collect data from multiple devices, process or aggregate it, and then relay it to the cloud.
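As a rough illustration of the aggregation step, the following plain C sketch reduces a batch of readings to a compact summary before relaying it. The batch_summary_t type and summarize_batch function are hypothetical names, not part of any gateway product.

```c
#include <stddef.h>

typedef struct {
    float min;
    float max;
    float mean;
    size_t count;
} batch_summary_t;

/* Summarize a batch of readings; returns 0 if the batch is empty, 1 otherwise. */
int summarize_batch(const float *readings, size_t n, batch_summary_t *out) {
    if (n == 0) {
        return 0;
    }
    float min = readings[0];
    float max = readings[0];
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (readings[i] < min) min = readings[i];
        if (readings[i] > max) max = readings[i];
        sum += readings[i];
    }
    out->min = min;
    out->max = max;
    out->mean = (float)(sum / (double)n);
    out->count = n;
    return 1;
}
```

Forwarding one summary per batch instead of every raw reading reduces upstream traffic, at the cost of losing per-sample detail. Whether that trade is acceptable depends on the application.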
Using MQTT, a lightweight messaging protocol often used in IoT scenarios, devices can send data to brokers, which then forward this data to the cloud.
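#include <Ethernet.h>
#include <PubSubClient.h>
byte mac[] = {0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED};
EthernetClient ethClient;
PubSubClient client(ethClient);
void setup() {
  Ethernet.begin(mac);
  client.setServer("mqttbroker.com", 1883);
  client.connect("ArduinoClient");
}
void loop() {
  // Reconnect if the broker connection has dropped
  if (!client.connected()) {
    client.connect("ArduinoClient");
  }
  client.loop();  // let the library service keep-alives and incoming packets
  String data = "Some Data";
  client.publish("some/topic", data.c_str());
  delay(5000);  // publish periodically instead of flooding the broker
}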
Handling Data Bursts and Backlogs
Handling data bursts and backlogs is another challenge. IoT devices can sometimes produce data in sporadic bursts, leading to potential backlogs on the receiving end. Leveraging services like Amazon Kinesis or Azure Stream Analytics, you can optimize data ingestion to handle such bursts, preventing data loss and providing real-time analytics capabilities.
For the devices themselves, incorporating a buffer or cache mechanism can aid in mitigating data bursts. If the data can't be sent immediately — for example, due to network issues or rate limits — it can be stored temporarily and sent when possible.
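One common way to implement such a buffer is a fixed-size ring buffer. The sketch below is a minimal plain C version. The names reading_queue_t, queue_push, and queue_pop, the capacity of 8, and the policy of overwriting the oldest reading when full are all assumptions chosen for illustration.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 8  /* assumed size; tune to the available RAM */

typedef struct {
    float items[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest reading */
    size_t count;  /* number of readings currently stored */
} reading_queue_t;

void queue_init(reading_queue_t *q) {
    q->head = 0;
    q->count = 0;
}

/* Store a reading; when full, overwrite the oldest one so the newest data survives. */
void queue_push(reading_queue_t *q, float value) {
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail] = value;
    if (q->count < QUEUE_CAPACITY) {
        q->count++;
    } else {
        q->head = (q->head + 1) % QUEUE_CAPACITY;
    }
}

/* Remove the oldest reading into *value; returns 1 on success, 0 if empty. */
int queue_pop(reading_queue_t *q, float *value) {
    if (q->count == 0) {
        return 0;
    }
    *value = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}
```

The device pushes each new reading and, whenever the network is available, pops and sends readings oldest first. If older data matters more than newer data, the push could instead reject new readings when the queue is full.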
Storing Time-Series Data
Finally, considering the time-dependent nature of data from embedded devices, efficiently storing time-series data is paramount. Time-series databases like InfluxDB or TimescaleDB are optimized for such tasks, providing rapid write and query performance. These databases can store vast amounts of timestamped data, allowing for sophisticated analytics operations.
While detailed cloud database operations are typically beyond the purview of embedded C code on devices like Arduino, the principle remains: ensuring that the backend cloud infrastructure is optimized for the data's nature. When embedded devices send timestamped data, the cloud platform should be ready to ingest, process, and store it efficiently.
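On the device side, the main job is to attach a timestamp and a stable identifier to each reading. As one possible shape, the sketch below formats a reading in InfluxDB's line protocol (measurement, tags, fields, timestamp). The measurement name temperature, the device tag, and the format_reading helper are assumptions for illustration. The timestamp is in seconds, so the write request would need to declare second precision.

```c
#include <stdio.h>
#include <stddef.h>

/* Format one reading as: temperature,device=<id> value=<temp> <unix time in seconds>
 * Returns the line length, or -1 if out is too small. */
int format_reading(const char *device_id, double temp_c, long long unix_seconds,
                   char *out, size_t out_size) {
    int n = snprintf(out, out_size, "temperature,device=%s value=%.2f %lld",
                     device_id, temp_c, unix_seconds);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

This sketch does not escape special characters, so it assumes device IDs without spaces, commas, or equals signs.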
Real-Time Analytics and Actionable Insights
The true power of IoT doesn't merely lie in connecting devices to the cloud but in drawing actionable insights from the data they generate. As devices transmit data to the cloud, there's a compelling need for real-time analysis to make informed, timely decisions. The sophistication of today's cloud platforms offers a suite of tools that facilitate this, turning raw data into actionable insights.
One of the revolutions in cloud computing is the concept of “serverless.” Instead of provisioning servers and managing infrastructure, developers can focus solely on their code, and the cloud provider takes care of the rest. AWS Lambda is a prime example, allowing developers to run backend code in response to events like database changes, HTTP requests, or device data ingestion in IoT.
Leveraging Serverless Functions
Imagine an embedded Arduino device sending temperature data to the cloud. If the temperature surpasses a certain threshold, an alert should be sent.
Using AWS Lambda in conjunction with Amazon's IoT services, you can seamlessly accomplish this:
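#include <Ethernet.h>
byte mac[] = {0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED};
EthernetClient client;
void sendDataToAWS(float temperature) {
  // Build the JSON body first so Content-Length reflects its actual size
  String body = "{\"temperature\": " + String(temperature) + "}";
  if (client.connect("YOUR_AWS_ENDPOINT", 80)) {
    // Each header must end with a line break, so use println
    client.println("POST /temperature-endpoint HTTP/1.1");
    client.println("Host: YOUR_AWS_ENDPOINT");
    client.println("Content-Type: application/json");
    client.println("Content-Length: " + String(body.length()));
    client.println();
    client.print(body);
  }
}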
Once the data hits AWS, a Lambda function can be triggered to process it:
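# AWS Lambda function (Python)
import json

THRESHOLD = 30.0  # example threshold in °C; choose a value for your application

def send_alert(message):
    # Placeholder: replace with your notification mechanism of choice
    print(message)

def lambda_handler(event, context):
    temperature = event['temperature']
    if temperature > THRESHOLD:
        # Send an alert or take necessary action
        send_alert("Temperature exceeds threshold!")
    return {
        'statusCode': 200,
        'body': json.dumps('Temperature processed successfully!')
    }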
The beauty of serverless is the ability to scale automatically. Whether it's one device sending data or several sending data simultaneously, AWS Lambda scales to process it in real time.
Integrating with Other Cloud Services
Beyond simple event-driven functions, cloud platforms like AWS provide many integrated services for more advanced analytics, ML, and monitoring.
For instance, you can funnel the ingested data into Amazon Kinesis for real-time data streaming and analytics. If, for example, patterns or anomalies are detected using tools like Amazon SageMaker's ML capabilities, you can trigger actions accordingly, like adjusting other IoT devices, sending notifications, or integrating with third-party services.
Monitoring is another critical facet. Amazon offers services like CloudWatch, which can monitor the data and the health of IoT applications, providing dashboards, alarms, and insights into their operations. Such integrations mean that if an embedded device starts sending erratic data or there's an unusual spike in data traffic, you can configure alarms that alert you to take immediate action.
In essence, the synergy between embedded devices and cloud platforms offers a world where data isn't just collected but acted upon in real time. With serverless architectures and integrated cloud services, real-time analysis transforms into actionable insights, leading to smarter decisions, enhanced efficiency, and a more connected world.
Whether an Arduino sending temperature data or a complex sensor network monitoring an industrial plant, the fusion of devices, data, and the cloud redefines what's possible in the IoT landscape.
Cloud Upload Example
Now, let's put what you've learned above into action. In the following example, using the MQTT protocol, you'll use an emulated environment to simulate an embedded device capturing temperature data and sending it to an AWS endpoint.
Before jumping into these hands-on steps, ensure you have:
- An AWS account with an IoT device set up
- The MQTT endpoint for the AWS IoT device
- A Root CA certificate, device certificate, and private key for secure MQTT communication
For this program, we'll use the Paho MQTT library, a popular C client for MQTT. Ensure you have it installed or available in your environment.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "MQTTClient.h"
// TLS connections use the ssl:// scheme; AWS IoT accepts MQTT over TLS on port 8883
#define ADDRESS "ssl://YOUR_AWS_IOT_MQTT_ENDPOINT:8883"
#define CLIENTID "EmbeddedClient"
#define CERT_FILE "path_to_device_cert.pem"
#define KEY_FILE "path_to_device_private_key.pem"
#define CA_FILE "path_to_root_ca_cert.pem"
#define TOPIC "temperature/data"
#define PAYLOAD "{ \"temperature\": 22.5 }"
#define QOS 1
#define TIMEOUT 10000L
int main(int argc, char* argv[]) {
    MQTTClient client;
    MQTTClient_create(&client, ADDRESS, CLIENTID, MQTTCLIENT_PERSISTENCE_NONE, NULL);
    MQTTClient_connectOptions conn_opts = MQTTClient_connectOptions_initializer;
    MQTTClient_SSLOptions ssl_opts = MQTTClient_SSLOptions_initializer;
    conn_opts.keepAliveInterval = 20;
    conn_opts.cleansession = 1;
    // TLS settings live in a separate SSL options struct
    ssl_opts.keyStore = CERT_FILE;    // device certificate
    ssl_opts.trustStore = CA_FILE;    // root CA certificate
    ssl_opts.privateKey = KEY_FILE;   // device private key
    ssl_opts.privateKeyPassword = NULL;
    conn_opts.ssl = &ssl_opts;
    if (MQTTClient_connect(client, &conn_opts) != MQTTCLIENT_SUCCESS) {
        printf("Failed to connect.\n");
        exit(EXIT_FAILURE);
    }
    MQTTClient_message pubmsg = MQTTClient_message_initializer;
    MQTTClient_deliveryToken token;
    pubmsg.payload = (void*)PAYLOAD;
    pubmsg.payloadlen = (int)strlen(PAYLOAD);
    pubmsg.qos = QOS;
    pubmsg.retained = 0;
    MQTTClient_publishMessage(client, TOPIC, &pubmsg, &token);
    printf("Sending data: %s\n", PAYLOAD);
    // Block until the broker acknowledges the message or the timeout expires
    MQTTClient_waitForCompletion(client, token, TIMEOUT);
    MQTTClient_disconnect(client, 10000);
    MQTTClient_destroy(&client);
    return EXIT_SUCCESS;
}
Compile the program, linking against the SSL-enabled Paho client library (paho-mqtt3cs), and run it. Ensure you've replaced placeholders with actual paths and endpoints. If everything is set correctly, this program will publish a static temperature value (22.5°C) to your AWS IoT environment.
To see the incoming data, you can monitor the MQTT topic (temperature/data) on the AWS side.
NOTE: Ensure your AWS Identity and Access Management (IAM) policies allow the device to connect and publish to the MQTT topic and that you've granted other required permissions. Additionally, ensure your security policies on AWS IoT allow the connection from your device, adjusting firewalls or any network configurations as necessary.
This example provides a basic demonstration. In a real-world scenario, you'd capture dynamic sensor data, handle errors more comprehensively, and include mechanisms for re-trying failed uploads.
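For the retry part, a common pattern is exponential backoff: wait longer after each failed attempt, up to a cap. The following plain C sketch computes the wait time. The function name backoff_delay_ms and the example base and cap values are assumptions.

```c
#include <stdint.h>

/* Delay before retry number `attempt` (0-based): base_ms doubled once per
 * previous attempt, capped at max_ms. */
uint32_t backoff_delay_ms(uint32_t attempt, uint32_t base_ms, uint32_t max_ms) {
    uint32_t delay = base_ms;
    for (uint32_t i = 0; i < attempt; i++) {
        if (delay > max_ms / 2) {
            return max_ms;  /* doubling again would exceed the cap */
        }
        delay *= 2;
    }
    return delay < max_ms ? delay : max_ms;
}
```

With a 1-second base and a 60-second cap, the delays run 1, 2, 4, 8, 16, and 32 seconds, then stay at 60. In the program above, you could sleep for this long between repeated MQTTClient_connect or publish attempts. Adding a small random offset to each delay helps keep a fleet of devices from retrying in lockstep.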
Simplifying Cloud Integrations with Blues
The IoT sphere requires efficient data transmission and connectivity solutions. The Blues ecosystem is active in this area, providing hardware and tools for streamlined data transfer from IoT devices to endpoints.
Blues focuses on smooth data transmission, addressing the challenges of connecting devices to the cloud. Notably, Notecard is a secure data transfer system-on-module with built-in cellular capability, ensuring reliable data syncing even in challenging environments.
Supplementing this, Notecarrier is a carrier board designed for Notecard, aiding developers in product integration.
Blues also offers Notehub, a cloud service facilitating the connection between Notecards and cloud destinations. It supports data routing and fleet management, giving an overview of all connected IoT devices.
More information on how to use Notecard and Notehub is available on Blues' official developer portal.
Data Flow with Blues
Notecard and Notehub work together to provide bidirectional wireless communication: outbound (from your microcontroller or single-board computer to the cloud) and inbound (from the cloud to your microcontroller or single-board computer).
Next Steps
In the ongoing development of the IoT, it's evident that integrating embedded systems and cloud platforms plays a crucial role in driving IoT advancements.
Embedded systems, fundamental to IoT, are effective in data capture, processing, and generation. Their capabilities are significantly enhanced when integrated with cloud platforms. Such integration not only boosts the performance of individual devices but also creates a robust network that can share, analyze, and process data in real time. This interconnectedness turns ordinary devices into smart systems, enabling them to communicate and make data-driven decisions.
This merging of embedded systems and cloud technology isn't just a step in tech evolution. It represents a shift towards a future where physical systems integrate with digital ones, resulting in smarter urban areas, more efficient industries, and improved life quality. Potential innovations might include health monitors that predict possible health issues or intelligent grids adjusting energy consumption based on real-time needs.
Blues is at the forefront of this transformation, representing the ideal of smooth device-cloud integration. They provide tools and products to make IoT more user-friendly, effective, and revolutionary. For those entering the IoT domain, Blues offers a comprehensive experience full of discovery, invention, and vast potential.
Blues University
This article is part of a broader series where you can dig deeper into each aspect of embedded development. To embark on this journey, be sure to follow Blues University, where you can explore and contribute to shaping the future of IoT.
If you're new to embedded development and the IoT, the best place to get started is with the Blues Starter Kit for Cell + WiFi and then join our community on Discourse. 💙