Sending files to AWS S3 Bucket using Zephyr RTOS HTTP API

Written by Jakub Zimnol | 16-Nov-2024 23:43:39

Introduction

An embedded device is typically designed to collect data over time — whether it’s logs, sensor readings, or even images captured by a small camera attached to the device. But how can a user retrieve this data after days, weeks, or even months of continuous operation? The most straightforward method is often through a USB connection, but what if the device is located hundreds of kilometers away, as is frequently the case with IoT devices?

One solution is for the device to send data chunks to the cloud periodically. However, this approach can quickly become chaotic to manage, especially if the data arrive out of order (thank you, UDP...). To streamline this, a more efficient method is to send a complete file over the air to a cloud-based storage server. But how can a small embedded device handle this reliably, especially when the entire file cannot fit into RAM to be sent as a single buffer payload?

It’s often said that UDP/CoAP is the best choice for low-power applications when sending small amounts of data. But what happens when the file is larger than 1 MB? In such cases, transmission speed becomes crucial — the less time the device spends sending the data, the less energy it will consume. This is where HTTP shines like a diamond, practically screaming "Use me!". Widely used by cloud servers, HTTP provides a reliable and efficient way to transfer large files from embedded devices to almost any server, making it an excellent choice for these use cases.

In this short tutorial, we’ll cover how to send a file from an embedded device running on Zephyr RTOS directly to an AWS S3 bucket. We’ll be using Zephyr RTOS with its generic HTTP API to handle the HTTP PUT operation, and to simplify the process, we will use a no-authentication method to upload the file to the S3 bucket.

Prerequisites

To follow along with the tutorial, you’ll need the following:

A Zephyr RTOS SDK installed on your machine
A Zephyr RTOS-supported development board with any network connectivity (e.g. Wi-Fi or cellular)
An already created Zephyr RTOS application with any network connectivity
- This tutorial will focus solely on the HTTP client operations needed to send data to AWS S3
Ten minutes of your time

Setting up the AWS S3 bucket

Amazon S3 (Simple Storage Service) is a cloud-based storage solution that makes it easy to store and manage large amounts of data. Files, also called "objects", are stored within "buckets", which act like containers for organizing and securing your data. S3 is highly scalable, meaning it can handle anything from a single file (like a favorite photo of your dog) to thousands of files, such as all your favorite photos of your dog or, more realistically, the log files generated by a fleet of IoT devices.

To create an S3 bucket, follow the first two steps from the official Getting started with Amazon S3 guide. After you create the bucket, note down its name and region in the <bucket_name>.s3.<region>.amazonaws.com format (e.g. my-example-bucket.s3.eu-north-1.amazonaws.com); this address will be required later in the tutorial.
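If you prefer to assemble the address at run time instead of typing it by hand, the format above maps directly onto a single snprintf() call. The helper below is a minimal sketch; the name s3_bucket_host_build() is hypothetical and not part of the tutorial code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<bucket_name>.s3.<region>.amazonaws.com"
 * into out. Returns 0 on success, -1 if out is too small. */
int s3_bucket_host_build(char *out, size_t out_len,
                         const char *bucket_name, const char *region)
{
    int n = snprintf(out, out_len, "%s.s3.%s.amazonaws.com",
                     bucket_name, region);

    /* snprintf() reports the length it wanted to write; a value that does
     * not fit in the buffer means the result was truncated. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

Calling it with "my-example-bucket" and "eu-north-1" produces the example address shown above.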

After creating the bucket, you need to configure the bucket policy to allow performing the PUT operation on the bucket. To do so, go to Amazon S3 -> Buckets -> <bucket_name> -> Permissions -> Bucket policy and paste the following policy (replace <bucket_name> with your bucket name):
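A minimal sketch of a policy of this kind is shown below, assuming that anonymous uploads to any key in the bucket are acceptable for a demo. The Sid value is an arbitrary label chosen for this sketch, and the bucket's Block Public Access settings typically have to permit public policies before S3 accepts it.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAnonymousPut",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<bucket_name>/*"
    }
  ]
}
```

With a policy like this, anyone who knows the bucket address can upload objects, so it fits a short experiment rather than a deployed fleet.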

After that, your bucket is ready to accept files via the HTTP PUT operation and you can proceed to the next step.

Zephyr RTOS application code step-by-step explanation

In this tutorial, we’ll focus on sending a file to an AWS S3 bucket. For demonstration purposes, we’ll use a simple snippet of Lorem Ipsum text, but in a real-world scenario, this could be replaced with actual file content read from non-volatile memory, such as an SD card.

So let's dive into the code!

Socket operations

The socket_connect() function establishes a socket connection to a specified server and port. It uses the POSIX API — specifically, the getaddrinfo() function — to obtain the server's address, then creates a socket based on this address information. Once the socket is created, it attempts to connect to the server. This function is utilized within send_file_via_http() to handle the connection setup.

The socket_cleanup() function handles resource cleanup, freeing the socket and address information once the connection is no longer needed.
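The two functions described above can be sketched as follows. This is an illustration under assumptions rather than the original listing: the parameter layout is a guess, and the includes are the plain POSIX headers so that the snippet also builds on a desktop machine. On Zephyr, the same calls are provided by the socket and POSIX headers, depending on the Zephyr version and configuration.

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Closes the socket and frees the address list; safe to call repeatedly. */
void socket_cleanup(int *sock, struct addrinfo **addr)
{
    if (*sock >= 0) {
        close(*sock);
        *sock = -1;
    }
    if (*addr != NULL) {
        freeaddrinfo(*addr);
        *addr = NULL;
    }
}

/* Resolves host:port, creates a TCP socket and connects it.
 * Returns 0 on success. On failure returns -1 and leaves
 * *sock == -1 and *addr == NULL. */
int socket_connect(const char *host, const char *port,
                   int *sock, struct addrinfo **addr)
{
    struct addrinfo hints = {
        .ai_family = AF_INET,       /* IPv4 */
        .ai_socktype = SOCK_STREAM, /* TCP */
    };

    *sock = -1;
    *addr = NULL;

    if (getaddrinfo(host, port, &hints, addr) != 0) {
        *addr = NULL;
        return -1;
    }

    *sock = socket((*addr)->ai_family, (*addr)->ai_socktype,
                   (*addr)->ai_protocol);
    if (*sock < 0) {
        socket_cleanup(sock, addr);
        return -1;
    }

    if (connect(*sock, (*addr)->ai_addr, (*addr)->ai_addrlen) < 0) {
        socket_cleanup(sock, addr);
        return -1;
    }
    return 0;
}
```

Resetting the socket to -1 and the address list to NULL in socket_cleanup() means a second cleanup call after a failed connect is harmless.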

File content operations

The file_content_get_size() function returns the size of the file content to be sent. In this example, it returns the length of the LOREM_IPSUM text, but it can be modified to return the actual file size read from e.g. the file system.

The file_content_get_chunk() function returns a specific chunk of the file content based on the given offset. It’s used within the http_payload_cb() function to read and send the file content chunk by chunk over the HTTP connection, simulating file content retrieval from a source such as the file system.

An argument specifying the file path can be added to both functions, which is especially useful when reading file content from an external file system (such as an SD card). This allows the functions to retrieve file content directly from the specified path.
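A minimal sketch of both functions, assuming the content lives in a constant string, could look like this. The shortened LOREM_IPSUM text and the exact signatures are assumptions made for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder text standing in for the file to upload. */
#define LOREM_IPSUM \
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, " \
    "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."

static const char lorem_ipsum[] = LOREM_IPSUM;

/* Size of the content in bytes, without the terminating '\0'. */
size_t file_content_get_size(void)
{
    return sizeof(lorem_ipsum) - 1;
}

/* Copies up to buf_len bytes starting at offset into buf.
 * Returns the number of bytes copied; 0 means the end was reached. */
size_t file_content_get_chunk(size_t offset, char *buf, size_t buf_len)
{
    size_t total = file_content_get_size();

    if (offset >= total) {
        return 0;
    }

    size_t remaining = total - offset;
    size_t n = remaining < buf_len ? remaining : buf_len;

    memcpy(buf, &lorem_ipsum[offset], n);
    return n;
}
```

Because the caller only ever sees one chunk at a time, swapping the string for reads from a file system changes these two functions and nothing else.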

HTTP request operations

The send_file_via_http() function is the main function responsible for sending a file over HTTP to the AWS S3 bucket. It first connects to the S3 bucket using the socket_connect() function, then prepares the HTTP request headers, including the Content-Length header with the file size. Next, it configures the HTTP request with the necessary information, such as the HTTP method, URL, host, payload callback, and response callback. The request is then sent using the http_client_req() function, and once completed, the socket connection is cleaned up with the socket_cleanup() function. Note that this function should be called only after the network connection (e.g. Wi-Fi or cellular) is established.
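The header preparation step can be sketched in isolation. Zephyr's HTTP client takes extra headers as a NULL-terminated array of strings, each ending with CRLF, so the Content-Length line has to be formatted into a buffer first. The helper name content_length_header_build() is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats "Content-Length: <file_size>\r\n" into buf and fills a
 * NULL-terminated header list that points at it.
 * Returns 0 on success, -1 if buf is too small. */
int content_length_header_build(char *buf, size_t buf_len, size_t file_size,
                                const char *headers[2])
{
    int n = snprintf(buf, buf_len, "Content-Length: %zu\r\n", file_size);

    if (n < 0 || (size_t)n >= buf_len) {
        return -1;
    }

    headers[0] = buf;  /* the only extra header in this sketch */
    headers[1] = NULL; /* terminator expected by the HTTP client */
    return 0;
}
```

The buffer must stay valid until the request has been sent, since the header list only stores a pointer to it.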

The http_response_cb() function is a callback that executes upon receiving an HTTP response. It logs the response status and any data received. The expected server response is OK, which corresponds to a status code of 200.

The http_payload_cb() function is a callback triggered when the HTTP client needs to send the entire payload data. It reads the file content in chunks using the file_content_get_chunk() function and transmits it with the send() function. It’s crucial that http_payload_cb() sends (and returns) exactly the same amount of data specified by the Content-Length header in the send_file_via_http() function.
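The core of such a callback is a loop that reads one chunk, pushes it fully through send(), and keeps a running total. The sketch below shows that loop with plain POSIX calls so it can be tried on a desktop. The names payload_send_chunked(), chunk_reader_t and demo_read_chunk(), the 16-byte chunk size, and the demo content are all assumptions; in the Zephyr callback the same loop would run with the socket passed in by the HTTP client.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define CHUNK_SIZE 16

/* Hypothetical reader type with the same contract as
 * file_content_get_chunk(): returns bytes copied, 0 at the end. */
typedef size_t (*chunk_reader_t)(size_t offset, char *buf, size_t buf_len);

/* Stand-in content source used only for this sketch. */
static const char demo_content[] =
    "chunk-one-------chunk-two-------chunk-three";

size_t demo_read_chunk(size_t offset, char *buf, size_t buf_len)
{
    size_t total = sizeof(demo_content) - 1;

    if (offset >= total) {
        return 0;
    }
    size_t n = (total - offset) < buf_len ? (total - offset) : buf_len;

    memcpy(buf, &demo_content[offset], n);
    return n;
}

/* send() may accept fewer bytes than requested, so retry until done. */
static int send_all(int sock, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t sent = send(sock, data, len, 0);

        if (sent < 0) {
            return -1;
        }
        data += sent;
        len -= (size_t)sent;
    }
    return 0;
}

/* Streams the whole content through a small stack buffer.
 * Returns the total number of bytes sent, or -1 on a socket error. */
long payload_send_chunked(int sock, chunk_reader_t read_chunk)
{
    char buf[CHUNK_SIZE];
    size_t offset = 0;
    size_t n;

    while ((n = read_chunk(offset, buf, sizeof(buf))) > 0) {
        if (send_all(sock, buf, n) < 0) {
            return -1;
        }
        offset += n;
    }
    return (long)offset;
}
```

The returned total is the value to compare against the Content-Length header: if the two differ, the server either waits for more data or treats the extra bytes as the start of another request.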

The HTTP_PORT macro defines the port used for the HTTP connection; it is set to 80, the default HTTP port. The S3_BUCKET_ADDRESS macro specifies the address of your S3 bucket and should be replaced with your actual bucket address. The HTTP_REQUEST_URL macro defines the URL to which the file will be sent, in other words the path under which the file will be stored in the bucket. Note that the specified path does not have to exist in the bucket beforehand; it will be created automatically upon file upload.
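To see how the three macros end up in the request, it helps to look at the text that travels over the socket before the payload. The sketch below formats that request head by hand purely for illustration; in the tutorial code the Zephyr HTTP client generates it. The macro values are placeholders, the port is written as a string on the assumption that it is passed to getaddrinfo(), and put_request_head_format() is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

#define HTTP_PORT "80"
/* Example address from earlier in the tutorial; replace with your own. */
#define S3_BUCKET_ADDRESS "my-example-bucket.s3.eu-north-1.amazonaws.com"
/* Hypothetical object path; it does not need to exist beforehand. */
#define HTTP_REQUEST_URL "/uploads/lorem_ipsum.txt"

/* Writes the request line and headers of the PUT into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int put_request_head_format(char *buf, size_t buf_len, size_t content_length)
{
    int n = snprintf(buf, buf_len,
                     "PUT %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n",
                     HTTP_REQUEST_URL, S3_BUCKET_ADDRESS, content_length);

    if (n < 0 || (size_t)n >= buf_len) {
        return -1;
    }
    return n;
}
```

The bucket address appears only in the Host header, while the URL macro becomes the path in the request line; S3 uses the two together to decide where the object is stored.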

General project structure

To make the code easier to understand and to integrate into your application later, it can be organized into several files:

sockets.h and sockets.c - These files contain the basic socket operations required for making HTTP requests.
file_content.h and file_content.c - These files contain the content of the file (and operations related to it) that will be sent to the AWS S3 bucket. As noted earlier, this can be modified to retrieve actual file content from non-volatile memory.
file_sender.h and file_sender.c - These files contain the core logic for sending files.
prj.conf - This file includes the Zephyr configuration settings required to compile and run the project.

The files can be organized as follows:

`sockets.h`

`sockets.c`

`file_content.h`

`file_content.c`

`file_sender.h`

`file_sender.c`

`prj.conf`
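As a rough guide, a configuration for this kind of project usually enables the options below. This is a sketch, not the original file: the exact set depends on your Zephyr version, and the options for your board's Wi-Fi or cellular driver are left out.

```
# Networking core, IPv4 and TCP
CONFIG_NETWORKING=y
CONFIG_NET_IPV4=y
CONFIG_NET_TCP=y

# BSD sockets API
CONFIG_NET_SOCKETS=y

# DNS resolver used by getaddrinfo()
CONFIG_DNS_RESOLVER=y

# Zephyr HTTP client library
CONFIG_HTTP_CLIENT=y

# POSIX names such as socket(), connect() and send()
CONFIG_POSIX_API=y
```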

Building and running the project

To send files to the AWS S3 bucket, simply copy the .c and .h files listed above into your Zephyr RTOS project and include them in the build (for example, by adding them to the SOURCES list in the CMakeLists.txt file). After that, you can call the send_file_via_http() function in your application to send the file to the bucket. Be sure to:

Replace the S3_BUCKET_ADDRESS macro value in the file_sender.c file with your bucket address.
Call the send_file_via_http() function only after the network connection is established.
Set up your AWS S3 bucket and configure the bucket policy to allow the PUT operation for file uploads.

After flashing the board, open the serial terminal to view the logs. Example logs should look like this:

If you see the Response to: IPv4 PUT, status: OK log message, it means that the file was successfully sent to the AWS S3 bucket. You can verify this by checking the content of the bucket in the AWS S3 console.

Congratulations! You've successfully sent a file from your Zephyr RTOS device to an AWS S3 bucket using the HTTP PUT operation.

Summary

In this tutorial, we’ve shown how to send a file from an embedded device running Zephyr RTOS directly to an AWS S3 bucket. This method can be applied to send log files, sensor data, pictures of your dog, or any other data from your device to the cloud for further analysis and processing.
