Airtame Cloud is a web-based system that enables organizations to remotely monitor and administer large deployments of Airtame wireless streaming devices.
In this post, we explore how Airtame devices communicate with Airtame Cloud via WebSockets and how we scaled our backend systems to handle increasing numbers of users and devices.
All communication between Airtame devices and Airtame Cloud occurs over TLS-enabled WebSocket connections, which the devices initiate. WebSockets have several qualities which make them well-suited for our needs:
In order to establish a WebSocket connection, a device must present its authentication token for verification as follows:
After establishing the WebSocket connection, the device and backend communicate by exchanging JSON-RPC messages, of which there are two types:
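To make the message format concrete, here is a minimal sketch of building and telling apart a request and a response. It assumes JSON-RPC 2.0 framing; the method name `update` and the `{"status": 200}` result shape are illustrative guesses, not the actual Airtame protocol.

```python
import json
from typing import Any


def make_request(method: str, params: dict, msg_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": msg_id}
    )


def make_response(result: Any, msg_id: int) -> str:
    """Serialize a JSON-RPC 2.0 success response; the id echoes the request's id."""
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": msg_id})


def classify(raw: str) -> str:
    """Return "request" or "response" for a serialized message."""
    msg = json.loads(raw)
    if "method" in msg:
        return "request"
    if "result" in msg or "error" in msg:
        return "response"
    raise ValueError("not a JSON-RPC message")


# Hypothetical method name and result payload, for illustration only.
request = make_request("update", {}, 1234)
response = make_response({"status": 200}, 1234)
```

Because the response carries the same `id` as the request, either side can match a reply to the call that triggered it.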
Consider an example where a user initiates a remote firmware update on a device with id 42:
When device 42 receives the JSON-RPC update request, it parses and executes the command, initiating a firmware update on the device, then returns a JSON-RPC response containing a successful 200 status code to the responsible backend instance via WebSocket:
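The device-side step described above (parse, execute, respond) could be sketched roughly as follows. This is not the device firmware's actual code: the method table, the `update` method name, and the result payload are all assumptions, and JSON-RPC 2.0 framing is assumed as well.

```python
import json
from typing import Callable, Dict


def start_firmware_update(params: dict) -> dict:
    """Hypothetical command: a real device would kick off the update here."""
    return {"status": 200}


# Hypothetical method table; the actual method names are not given in this post.
HANDLERS: Dict[str, Callable[[dict], dict]] = {"update": start_firmware_update}


def handle_message(raw: str) -> str:
    """Parse a JSON-RPC request, run the matching command, return the response."""
    request = json.loads(raw)
    handler = HANDLERS.get(request.get("method"))
    if handler is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    else:
        body = {"result": handler(request.get("params", {}))}
    # Echo the request id so the backend can correlate the response.
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), **body})
```

The string returned by `handle_message` is what the device would send back over its WebSocket connection.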
This backend instance then publishes this response to the deviceAction:1234 channel, which the server that handled the initial POST request is subscribed to. Finally, the status code and any additional information are used to construct an HTTP response, and the user is informed whether the update was successfully executed or not.
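The round trip can be simulated end to end with a small in-memory stand-in for Redis Pub/Sub. This is a simplified, synchronous sketch under several assumptions: the `device:<id>` request channel name, the `Backend` class, the `fake_device` function, and the 504 fallback are all hypothetical. Only the `deviceAction:<message id>` response channel comes from the flow described above. A real backend would wait asynchronously with a timeout rather than rely on in-process callbacks.

```python
import json
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryPubSub:
    """Stand-in for Redis Pub/Sub: delivers each message to current subscribers."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subs[channel].append(callback)

    def unsubscribe(self, channel: str) -> None:
        self._subs.pop(channel, None)

    def publish(self, channel: str, message: str) -> int:
        """Deliver the message and return the number of subscribers reached."""
        callbacks = list(self._subs.get(channel, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


def fake_device(raw: str) -> str:
    """Hypothetical device: acknowledges any request with status 200."""
    request = json.loads(raw)
    return json.dumps({"jsonrpc": "2.0", "result": {"status": 200}, "id": request["id"]})


class Backend:
    """One backend instance; all instances share the same Pub/Sub bus."""

    def __init__(self, bus: InMemoryPubSub) -> None:
        self.bus = bus

    def attach_device(self, device_id: int, device: Callable[[str], str]) -> None:
        """Called by the instance that holds the device's WebSocket."""

        def on_request(raw: str) -> None:
            reply = device(raw)  # stands in for a WebSocket round trip
            msg_id = json.loads(reply)["id"]
            self.bus.publish(f"deviceAction:{msg_id}", reply)

        self.bus.subscribe(f"device:{device_id}", on_request)

    def handle_post(self, device_id: int, method: str, msg_id: int) -> dict:
        """Called by whichever instance received the user's POST request."""
        replies: List[str] = []
        channel = f"deviceAction:{msg_id}"
        self.bus.subscribe(channel, replies.append)
        request = json.dumps(
            {"jsonrpc": "2.0", "method": method, "params": {}, "id": msg_id}
        )
        delivered = self.bus.publish(f"device:{device_id}", request)
        self.bus.unsubscribe(channel)
        if not delivered or not replies:
            return {"http_status": 504, "body": "device not reachable"}
        result = json.loads(replies[0])["result"]
        return {"http_status": result["status"], "body": result}


bus = InMemoryPubSub()
server_a, server_b = Backend(bus), Backend(bus)
server_b.attach_device(42, fake_device)  # server B holds device 42's socket
http_response = server_a.handle_post(42, "update", 1234)  # server A got the POST
```

Neither server needs to know which instance holds the socket: the request finds its way through the device channel, and the reply comes back through the channel named after the message id.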
When we first released Airtame Cloud, a single backend server handled all of our traffic, including managing device WebSocket connections. As our user base grew, the server struggled to keep up, so we decided to scale horizontally by deploying additional identical instances of our backend.
Since we were already using nginx as a reverse proxy in front of our single server, it was straightforward to set up round-robin routing between multiple servers. However, this introduced the need to re-route requests for specific devices to the servers maintaining WebSocket connections with those devices. We were pleased to discover that Redis Pub/Sub fit our use case quite naturally, and we’ve found that by using the JSON-RPC message id to identify the Pub/Sub response channel, reasoning about the flow of messages through the system is straightforward.
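To see why round-robin routing makes re-routing necessary, consider this small sketch. The instance names and the idea that device 42's socket lives on `backend-2` are made up for illustration; in practice nginx does the distribution.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def route_requests(servers: List[str], socket_owner: Dict[int, str],
                   device_id: int, n: int) -> Counter:
    """Round-robin n requests for one device; count direct hits vs re-routes."""
    next_server = cycle(servers)
    outcome: Counter = Counter()
    for _ in range(n):
        receiver = next(next_server)
        if receiver == socket_owner[device_id]:
            outcome["direct"] += 1  # receiver already holds the WebSocket
        else:
            outcome["re-routed"] += 1  # must be forwarded to the socket owner
    return outcome


# Hypothetical deployment: three instances, device 42 connected to backend-2.
servers = ["backend-1", "backend-2", "backend-3"]
outcome = route_requests(servers, {42: "backend-2"}, device_id=42, n=6)
```

With three instances, only one request in three lands on the server that holds the device's connection, and that share shrinks as more instances are added.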
Now we can seamlessly scale our backend up or down as needed without changing any code.
It’s worth noting that gRPC, which had its first GA release in August 2016, could serve as an interesting alternative to our JSON-RPC-over-WebSockets approach. The use of HTTP/2 for transport and Protocol Buffers for serialization appears very promising for efficient bidirectional communication.
One potential drawback of gRPC in our case is that, since HTTP/2 is stateless, we’d need to implement our own session management. Additionally, we really value the simplicity and readability of JSON-RPC. Nonetheless, gRPC remains attractive, especially with regard to performance.
If anyone has implemented a similar system using gRPC, we’d love to hear about your experience!
We’re always on the lookout for talented engineers who enjoy tackling challenging problems and are passionate about writing clean, maintainable code. If this sounds like you, check out our open positions and get in touch!
Scalable remote management of embedded Linux devices via WebSockets was originally published in Airtame Engineering on Medium, where people are continuing the conversation by highlighting and responding to this story.