How to Scale IoT with Legacy Systems
Leverege
Any IoT application involves handling large data sets. As your project scales, this data grows with it, so it’s important to plan for scale at the outset of your project. Failing to do so can leave you with a legacy system at capacity, which leads to latency and system load limitations that degrade the end user’s experience. To address this, you’ll need to scale the capacity of your legacy system while managing technical debt. That means choosing the best approach and working out how to integrate it with an older system.
"While scaling legacy systems presents its own challenges and complexities, it is still possible to do so while minimizing risk and downtime."
-Veronica Head
At its core, scaling your IoT infrastructure will involve scaling the database system and then ensuring that the rest of the system can efficiently function on this database structure. In addition to scaling the database, there are smaller improvements that can be made to the system’s function to improve overall performance. However, we'll primarily focus on database scaling. There are many considerations and implications for the design you choose.
From a physical point of view, there are two ways to scale a database: horizontally or vertically. In vertical scaling, you increase the capacity of a single machine. Horizontal scaling, on the other hand, involves increasing the number of machines and spreading your database across them. A useful metaphor is to think of how to increase the number of books you can store. You can either get a taller bookshelf, or you can buy several bookshelves of the same size.
In vertical scaling, you essentially allocate a bigger machine or server to handle your requests, for example one with more RAM or a larger disk. When this means moving to new hardware rather than upgrading the existing machine in place, you will need to migrate the data. One of the easiest ways to do this is to create a replica of your existing machine on the newer and bigger machine. Then, you can write data to both the new and existing machines. Once the two machines have run in sync for some time, you can set the replica to take over as the primary machine and then take the older machine offline.
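The replica-then-cutover migration described above can be sketched in a few lines. This is a minimal illustration, not a production procedure: `InMemoryStore` and `MigratingDatabase` are hypothetical names, and plain dictionaries stand in for the real database servers.

```python
class InMemoryStore:
    """Hypothetical stand-in for a database server (a dict, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class MigratingDatabase:
    """Routes traffic while data moves from the old machine to a bigger one."""

    def __init__(self, old, new):
        self.old = old
        self.new = new
        self.primary = old               # reads come from the old machine at first
        self.write_targets = [old, new]  # writes go to both machines (dual-write)

    def copy_existing_data(self):
        # Replicate existing rows onto the new machine. setdefault avoids
        # overwriting newer values that dual-writes have already stored there.
        for key, value in self.old.rows.items():
            self.new.rows.setdefault(key, value)

    def write(self, key, value):
        for store in self.write_targets:
            store.write(key, value)

    def read(self, key):
        return self.primary.read(key)

    def in_sync(self):
        return self.old.rows == self.new.rows

    def cut_over(self):
        # Promote the replica only once it holds the same data as the old machine.
        if not self.in_sync():
            raise RuntimeError("replica is not in sync; refusing to cut over")
        self.primary = self.new
        self.write_targets = [self.new]  # the old machine can now go offline


old_server = InMemoryStore("old-server")
new_server = InMemoryStore("new-server")
old_server.write("sensor-1", 21.5)       # data that existed before the migration

db = MigratingDatabase(old_server, new_server)
db.write("sensor-2", 19.0)               # written to both machines
db.copy_existing_data()
db.cut_over()
print(db.primary.name, db.read("sensor-1"))
```

In a real system the copy and the sync check would be handled by the database's own replication tooling; the sketch only shows the order of the steps.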
This approach can buy you some time to scale your legacy system and is relatively easy to implement. However, your read/write throughput is still limited because you still have only one machine at your disposal. To use a metaphor, this is a bit like asking one person to gather twenty books from one shelf instead of asking twenty people to gather one book each from their own shelf. You’ll still be limited in the queries you can serve. Although you made your shelf larger and it can store more books, it is still one shelf.
With horizontal scaling, you take your existing system and split it into many smaller systems, called shards. This approach is typically called sharding. Your data is split across many machines, which increases your storage capacity as well as your read/write capabilities. It can also make the system more resilient; if one shard is down or reaches its capacity, the other shards keep serving requests, and with replication or rebalancing in place they can take over its load. However, this approach is harder to implement and comes with some complexity.
To implement sharding, your system will need to be able to read and write across all shards rather than just one machine. This requires a significant change to how your system reads and writes data. On top of that, significant design changes are harder to make in a legacy system.
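To make the read/write change concrete, here is a minimal sketch of one common way to route data to shards: hashing a key and taking it modulo the number of shards. The article does not prescribe a routing scheme, so this is an assumption for illustration; `ShardedStore`, the device IDs, and the dictionaries standing in for shard machines are all hypothetical.

```python
import hashlib


class ShardedStore:
    """Hypothetical sketch: each dict stands in for one shard (machine)."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, device_id):
        # Use a stable hash so the same device always maps to the same shard.
        # (Python's built-in hash() is randomized per process for strings.)
        digest = hashlib.sha256(device_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def write(self, device_id, reading):
        shard = self.shards[self.shard_for(device_id)]
        shard.setdefault(device_id, []).append(reading)

    def read(self, device_id):
        # A single-key read touches exactly one shard.
        return self.shards[self.shard_for(device_id)].get(device_id, [])

    def read_all(self):
        # A query that spans devices has to visit every shard and merge results.
        merged = {}
        for shard in self.shards:
            merged.update(shard)
        return merged


store = ShardedStore(shard_count=4)
store.write("thermostat-17", 21.5)
store.write("thermostat-17", 21.7)
store.write("valve-3", 0.8)
print(store.read("thermostat-17"))
print(sorted(store.read_all()))
```

One trade-off worth noting: with plain modulo hashing, changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often considered when the shard count is expected to grow.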
Once you’ve decided on one or more strategies to improve your system’s performance, you’ll want to create a plan to test and implement these changes. Since you’re scaling a legacy system, it’s important to anticipate all possible risks and mitigate the effects on your end users. If you’re working with a live application, you’ll also want to consider any scheduled downtime or performance impacts in the testing process.
To prepare for testing, you’ll want to ensure that you have systems and processes in place that will help you mitigate risk. Generally speaking, you’ll want to anticipate any service disruptions and communicate these to your customer base proactively. Since you’re likely making significant changes to your system, the possibility of a disruption or outage is high. Giving advance warning and scheduling the changes for off hours, when you have support on hand, will help limit the impact of any disruption.
In addition to this, you’ll want to create a comprehensive testing plan. This should include regression testing of any applications or interfaces in the system, which ensures that you maintain functionality across the whole system when you make your upgrades. If all goes according to plan, customers won’t notice any difference once changes are made. You should follow standard testing procedures, generally deploying to development, then staging, before moving to production.
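As a rough illustration of that testing plan, the sketch below compares the answers of the legacy and upgraded systems for the same inputs, and promotes a change through environments in order, stopping at the first one that reports a regression. The function names, the environment list, and the sample queries are assumptions made for the example, not part of any particular test framework.

```python
ENVIRONMENTS = ["development", "staging", "production"]


def find_regressions(legacy_query, upgraded_query, test_inputs):
    """Return (input, expected, actual) for every input where the answers differ."""
    mismatches = []
    for item in test_inputs:
        expected = legacy_query(item)
        actual = upgraded_query(item)
        if expected != actual:
            mismatches.append((item, expected, actual))
    return mismatches


def promote(check_environment, environments=ENVIRONMENTS):
    """Deploy environment by environment; halt at the first reported regression.

    check_environment(env) is a caller-supplied function that returns a list
    of mismatches found in that environment (empty when all checks pass).
    """
    deployed = []
    for env in environments:
        mismatches = check_environment(env)
        if mismatches:
            return deployed, env, mismatches
        deployed.append(env)
    return deployed, None, []


# Hypothetical queries: both should return the same summary for a device.
def legacy_summary(device_id):
    return {"device": device_id, "id_length": len(device_id)}


def upgraded_summary(device_id):
    return {"device": device_id, "id_length": len(device_id)}


devices = ["pump-1", "pump-2", "valve-3"]
print(find_regressions(legacy_summary, upgraded_summary, devices))
print(promote(lambda env: find_regressions(legacy_summary, upgraded_summary, devices)))
```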
With these items in mind, you can plan for the risks involved with making large system upgrades. You can also ensure that you maintain functionality after the upgrade.
Since legacy systems often carry larger amounts of tech debt, you’ll want to consider all possible edge cases or unexpected bugs. Doing extensive validation testing prior to deployment is the best way to capture these issues. You’ll also need to anticipate unforeseen issues like added dependencies or required upgrades. It is wise to account for this when scoping the project and to add buffer time to the schedule.
Because you’re working with a live system that has existed for some time, it is a good idea to back up data before making any changes. You might also want the ability to easily roll back changes in the event anything breaks.
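The backup-then-rollback idea can be sketched as follows. This is a simplified illustration in which a dictionary stands in for the live data; `apply_with_rollback` and its `change` and `validate` callbacks are hypothetical names, and a real system would rely on database snapshots or backups instead of an in-memory copy.

```python
import copy


def apply_with_rollback(data, change, validate):
    """Back up `data`, apply `change`, and restore the backup if anything breaks.

    Returns True if the change was kept, False if it was rolled back.
    """
    backup = copy.deepcopy(data)  # take the backup before touching anything
    try:
        change(data)
        if not validate(data):
            raise ValueError("validation failed after change")
    except Exception:
        # Roll back: restore the exact pre-change contents.
        data.clear()
        data.update(backup)
        return False
    return True


readings = {"sensor-1": [21.5, 21.7]}


def add_sensor(data):
    data["sensor-2"] = [19.0]


def no_missing_values(data):
    return all(v is not None for values in data.values() for v in values)


print(apply_with_rollback(readings, add_sensor, no_missing_values))
print(sorted(readings))
```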
While scaling legacy systems presents its own challenges and complexities, it is still possible to do so while minimizing risk and downtime. To scale with little interruption, you’ll want to plan both the system restructuring and the testing. By taking these steps, you’ll be able to anticipate disruptions and carry out the upgrade more effectively.