Microservice Consolidation

As a developer and architect, it can be very easy to get caught up in the latest fads, especially when looking at what’s on HackerNews or coming out of Silicon Valley. Microservices and serverless functions are all the rage in software development patterns. Who wouldn’t want a small, simple component that is compartmentalized and allows for isolated changes? Components that do one thing well, and can scale appropriately based on load? If only things were this simple.

Early in my professional career, object-oriented programming was all the rage, with languages such as C++ and Java promising software nirvana where your components could be represented as small, isolated classes and then simply “linked together” like Lego bricks. Unfortunately, reality didn’t quite work out that way: a rat’s nest of class inheritance (diagrammed, if you were lucky) made it next to impossible to determine whether one change in an internal function would result in unexpected behavior further down in your application. Today, microservices and serverless functions are the new Lego bricks.

The reason I bring this up is that I am seeing the same thing occur with microservices: a monolithic, or legacy, application is decomposed into its respective microservices, and the result is a tangled mess of services that depend on each other over HTTP, increasing the chance of failure due to network latency. The most extreme case is the introduction of serverless functions, where your service is broken down into individual functions, each exposed as a single endpoint. With today’s cloud tooling, running these services is becoming much easier, and most of the major cloud providers make it easy to get up and running quickly. However, as ThoughtWorks wrote in 2019, the release cycle may be faster, but that doesn’t mean traditional architectural principles can be avoided.

In fact, some have taken decomposition into microservices to an extreme, some starting around 2018 and others more recently. Even one of my clients, having split their core product into microservices several years ago, realized they had gone overboard, decided it was a bad idea, and reintegrated it.

In the Ben Nadel post, he attributes part of this to Conway’s law: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” The other part he attributes to technical or business needs.

While Conway’s law applied long before microservices came on the scene, it is natural for microservices to enable each team not only to be responsible for its own components, but to test and release when it is ready instead of depending on a larger, monolithic system’s release schedule, allowing the team to deliver its bits more quickly and reliably. However, what happens when the developers don’t want to manage this additional responsibility? At best, the developers run with their newfound responsibility, or work with a DevOps team, to reach the point where code can be released quickly and deployments happen with the end users or business barely noticing. At worst, deployments get very messy, with the customary late-night or weekend deployments that take much longer than necessary, and the inevitable follow-up incidents.

That addresses the human and organizational dynamics, but how does this work on the technical side? With both microservices and serverless functions, you get better resource utilization by being able to scale services to absorb unexpected load spikes; unique to serverless is the ability to spin your services down to zero so your organization pays for only what it uses.

Regardless of the reason, they come with the cost of additional overhead: coordination between services, and bottlenecks at shared resources that cannot be scaled. In particular, the microservices need a way to find each other and to pass environment and configuration details, and to do so securely. This requires, at a minimum, a key/value store like etcd or Consul, or you can leverage a service mesh such as Istio or Linkerd to handle the configuration and, ideally, the routing of the traffic coming into the service.
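
To make the discovery piece concrete, here is a minimal sketch using the HashiCorp Consul Go client: a service registers itself with the local Consul agent and then looks up a healthy instance of a dependency instead of hard-coding its address. The service names (“orders”, “billing”), the port, and the health-check path are placeholders for illustration, not anything prescribed above.

```go
// Minimal sketch: registering a service with Consul and discovering a peer.
// Assumes a local Consul agent; the service names, port, and health-check
// path are illustrative placeholders.
package main

import (
	"fmt"
	"log"

	consul "github.com/hashicorp/consul/api"
)

func main() {
	client, err := consul.NewClient(consul.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Register this instance so other services can find it.
	err = client.Agent().ServiceRegister(&consul.AgentServiceRegistration{
		ID:   "orders-1",
		Name: "orders",
		Port: 8080,
		Check: &consul.AgentServiceCheck{
			HTTP:     "http://localhost:8080/healthz",
			Interval: "10s",
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Look up healthy instances of a dependency instead of hard-coding URLs.
	entries, _, err := client.Health().Service("billing", "", true, nil)
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range entries {
		fmt.Printf("billing available at %s:%d\n", e.Service.Address, e.Service.Port)
	}
}
```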

Then comes the question of observability and auditing. Unlike a traditional Java Enterprise Edition application or older monolithic app, where you had a single source of logs and could even run a debugger through it, microservices are expected to have multiple copies running concurrently and talking to other services. This requires a centralized logging framework like ELK or Splunk, or something to capture metrics data, like Grafana. Even with your chosen logging solution, when a service acts up, you still need a way to figure out which calls are correlated with each other; forcing all traffic into a single instance is not the answer. In most cases, you will need to modify your services to account for this, such as by generating and propagating a correlation ID and ensuring your application logs are machine readable.
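
As a sketch of that kind of service-side change, the following Go middleware, using only the standard library, accepts an incoming X-Correlation-ID header (or generates one when it is missing), echoes it back to the caller, and writes machine-readable JSON logs via log/slog. The header name and the /orders handler are illustrative assumptions rather than a prescribed standard.

```go
// Sketch of correlation-ID handling: accept an incoming X-Correlation-ID
// header (or generate one), echo it back, and log in machine-readable JSON.
// The header name and the /orders route are illustrative placeholders.
package main

import (
	"crypto/rand"
	"encoding/hex"
	"log/slog"
	"net/http"
	"os"
)

func newCorrelationID() string {
	b := make([]byte, 8)
	_, _ = rand.Read(b)
	return hex.EncodeToString(b)
}

func withCorrelationID(logger *slog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Correlation-ID")
		if id == "" {
			id = newCorrelationID()
		}
		// Echo the ID so the caller (and the next hop) can keep propagating it.
		w.Header().Set("X-Correlation-ID", id)
		logger.Info("request",
			"correlation_id", id,
			"method", r.Method,
			"path", r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	mux := http.NewServeMux()
	mux.HandleFunc("/orders", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	if err := http.ListenAndServe(":8080", withCorrelationID(logger, mux)); err != nil {
		logger.Error("server stopped", "error", err)
	}
}
```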

Perhaps the biggest factor to keep in mind with microservices, and one that comes up every time code is decoupled, is the use of common code between services, especially when your organization has a dedicated team managing the common code library. Whenever these libraries are released, you must update all of your services to make use of them, which can slow down your development and deployment velocity.
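
For example, in a Go shop each service might pin the shared library in its own module file, so every release of the common library means a version bump, a test run, and a redeploy in every consuming service. The module paths and versions below are hypothetical:

```
// go.mod for one of many services; each service carries its own pin.
// "example.com/acme/common" and the versions are hypothetical.
module example.com/acme/orders

go 1.22

require (
	example.com/acme/common v1.4.0 // bumping to v1.5.0 means re-testing and redeploying this service
)
```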

With serverless, most of the same overhead exists, and you have the added bonus of being tied to your chosen cloud vendor to provide the underlying infrastructure. Regardless of whether you take the serverless or the microservice approach, your documentation must also be kept up to date, covering not just the services but preferably all of the underlying dependencies, API gateway rules, and so on. While document generators like Swagger can take an OpenAPI spec file and generate the resulting documentation, they do not cover the underlying dependencies unless you explicitly add them.

If this seems like a lot of overhead, it is. Note that I didn’t even mention how the services would securely talk to each other, or how the containers used could be validated for compliance. When the teams and/or the systems are small, this can be overwhelming due to the amount of time spent building out and managing the infrastructure, or building and debugging common code and services with their interdependencies. As with any tool, whether to use them depends on the situation. Just beware that some tools can also cause injury when misused.