Strangling Your Data

A while ago, I wrote a post looking at different ways to modernize an application while retaining preexisting functionality. One of these techniques is the strangler pattern, which gets its name from the strangler fig. The idea is:

  1. Create a shim for the current system.
  2. Record and/or replicate the data within the shim.
  3. As the new replacement services come online, shunt the data to them (see the shim sketch after this list).
  4. When there’s confidence the new service is behaving as expected, or enough of the original system has been re-implemented, strangle the legacy system.
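
To make steps 1–3 a little more concrete, here is a minimal sketch of what such a shim could look like: a small HTTP proxy that forwards every request to the legacy system, mirrors state-changing calls to a hypothetical new service, and returns the legacy response to the caller. The hostnames, port, and the write-detection rule are all assumptions for illustration, not a reference implementation.

```python
# Minimal shim sketch: forward requests to the legacy system, mirror
# write operations to the new service, and always answer with the
# legacy response. Endpoints and routing rules are illustrative.
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

LEGACY = "http://legacy.internal:8080"       # assumed legacy endpoint
REPLACEMENT = "http://new.internal:8080"     # assumed new service endpoint

class ShimHandler(BaseHTTPRequestHandler):
    def _proxy(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else None

        # 1. The legacy system stays the source of truth for now.
        legacy_req = urllib.request.Request(
            LEGACY + self.path, data=body, method=self.command)
        with urllib.request.urlopen(legacy_req) as resp:
            payload, status = resp.read(), resp.status

        # 2. Mirror state-changing calls so the new service builds up
        #    its own copy of the data.
        if self.command in ("POST", "PUT", "DELETE"):
            try:
                mirror = urllib.request.Request(
                    REPLACEMENT + self.path, data=body, method=self.command)
                urllib.request.urlopen(mirror, timeout=2)
            except OSError:
                pass  # legacy already answered; log and reconcile later

        # 3. Return the legacy answer unchanged.
        self.send_response(status)
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = do_PUT = do_DELETE = _proxy

if __name__ == "__main__":
    HTTPServer(("", 8000), ShimHandler).serve_forever()
```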

In the original post, I highlighted two techniques that can be used to replicate data: one from a web proxy standpoint, and a second that observes events flowing through an ESB or message queue such as MQ or Apache Kafka. Today, I'd like to share another technique that may become popular in the near future: capturing changes as they occur within the database itself.

Database replication has been around for a while, fulfilling the need for availability and disaster recovery. However, there are two technologies that leverage this capability to create an event stream that can be used to mirror your data in a different database and/or be acted upon by newer services.

The first is an open-source Red Hat project called Debezium. This project is a configurable service with a collection of database connectors covering traditional relational databases like PostgreSQL, MySQL, SQL Server, and, more recently, Oracle and Db2, as well as NoSQL databases like Cassandra and MongoDB, and it streams the change events via Apache Kafka. Most of the connectors rely on access to the database's transaction log to capture every change, converting each one into an event payload that is sent to Kafka. The advantage is that data is captured in near real time; however, the data is raw, native only to the database, and without context.
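As a rough illustration of how this gets wired up, the sketch below registers a Debezium PostgreSQL connector with a Kafka Connect worker over its REST API. The hostnames, credentials, and table list are assumptions, and the exact configuration keys can vary by connector and Debezium version, so treat this as a shape rather than a recipe.

```python
# Sketch: register a Debezium PostgreSQL connector with a Kafka Connect
# worker via its REST API. Hostnames, credentials, and table names are
# illustrative assumptions; check the Debezium docs for the exact keys
# your connector version expects.
import json
import urllib.request

CONNECT_URL = "http://kafka-connect.internal:8083/connectors"  # assumed worker

connector = {
    "name": "legacy-erp-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "legacy-db.internal",
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "change-me",
        "database.dbname": "erp",
        "topic.prefix": "legacy",                        # Kafka topic namespace
        "table.include.list": "public.orders,public.inventory",
    },
}

req = urllib.request.Request(
    CONNECT_URL,
    data=json.dumps(connector).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```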

The second approach, and one I had a hand in helping TriggerMesh with, is an event source that captures changes at the transaction level: selected database tables are instrumented with triggers that grab the changes and send them out as CloudEvents to be processed natively. The advantage is that you capture the context, and the resulting data can be loaded into another database, or routed to a service and enriched with additional data before being stored in its permanent home. The disadvantage is that it depends on how the database trigger is created and how the vendor reacts to failure scenarios.
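To show what that context buys you, here is a sketch of the kind of CloudEvent such a source could emit for a single row change, posted to a hypothetical downstream service. The event type, source URI, and sink address are assumptions and not TriggerMesh specifics; the point is that the payload carries context (which table, which operation, when) alongside the row data itself.

```python
# Sketch: emit a row change as a CloudEvent (structured JSON mode, per the
# CloudEvents 1.0 spec) to a downstream sink. Event type, source, and sink
# URL are illustrative assumptions.
import json
import uuid
import urllib.request
from datetime import datetime, timezone

SINK = "http://order-service.internal/events"  # assumed downstream consumer

def emit_row_change(table, operation, row):
    event = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/legacy-db/erp",                     # where the change came from
        "type": f"com.example.db.{table}.{operation}",  # e.g. com.example.db.orders.update
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": {"table": table, "operation": operation, "row": row},
    }
    req = urllib.request.Request(
        SINK,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/cloudevents+json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    emit_row_change("orders", "update", {"id": 42, "status": "shipped"})
```

Because the event names the table and operation, a consuming service can act on it directly instead of having to reverse-engineer meaning from a raw transaction-log record.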

When migrating or modernizing legacy applications, beyond the underlying logic of the service itself, the two biggest things to keep in mind are the interface points: the data and the communication. Hopefully this adds another tool to your toolbox.

Want to know more about how I can help migrate your legacy system as part of your Digital Transformation? Then let's chat!