It seems that conventional wisdom has coalesced around the idea that large applications should be built as monoliths first, and then broken down into microservices later. This makes a lot of intuitive sense: start with the easier approach first, and break out services later after your natural bounded contexts are apparent.

The only problem is that no one really has any good recommendations for exactly how one is supposed to break up a monolith. Sure - identify your bounded contexts and extract services according to the boundaries. That sounds simple enough, but turns out to be really, really hard in practice.

The black hole

One problem with monoliths is that they exert a tremendous amount of gravitational pull on your ongoing engineering efforts. Any attempt to build new features or make changes to existing logic inevitably turns out to be much, much easier to do inside the monolith than outside. The biggest devilish detail is state. Suppose I want to make an enhancement to feature A, but all state for feature A is currently stored in the monolith. Now suppose I build my new enhancement as a separate service with its own state. Now I have a synchronization problem: I have to keep my state in sync between the new service and the monolith. Moreover, my logic for this feature overall is now split between two different applications, likely leading to bugs and maintenance headaches.

I can get around this problem by extracting the existing logic into a new service, and then making my enhancements to the extracted service. This sounds great, except that oftentimes the work required to do this extraction will be many times greater than the enhancement work that I need to do. So my message to stakeholders becomes, “I know you want to make these enhancements which will take 3 weeks, but first it’s gonna take us 3 months to extract this new microservice.” Multiply this logic across 4 or 5 different feature areas and you’ve got a very tough sell over the logic of “just add it to the monolith”. I know some organizations get around this problem by doing both: they have Alpha Team work on the service extraction and Beta Team work on the feature enhancement inside the monolith. This is great if you can spare that kind of duplicated effort.

It’s called a “tar ball” for a reason

By far the biggest challenge to extracting services out of your monolith is the fact that your bounded contexts are very likely a myth. You can draw those boxes on the whiteboard, but most of the time the logic in your monolith is a big ball of twine, with dependencies criss-crossing your code base and database tables. So when you do go to extract your billing logic into a nice, clean independent service, you find that all sorts of other features are woven around that logic. This is the nature of monoliths - clean internal modularity is the rare exception. Remember how we decided we would “start with the monolith” architecture? Well, usually that means you are developing fast, without a clear understanding of your service boundaries. And that means it’s highly likely that you’ve got cross-feature dependencies all over the place.

So now your service extraction actually looks like two projects. First, re-organize the code internal to the monolith to create a properly observed service boundary, and then work to extract that code into a separate service. And by the way, each of those projects is likely gonna take more than a month. So now that feature you planned to implement in 3 weeks isn’t actually gonna get built until next quarter! And in the meantime the engineers are gonna be busy refactoring code and literally building no new features. This is when you get really popular with your PM and Sales teams. Or, you could just keep hacking on the monolith…
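To make those two projects concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `BillingService` interface, the `Invoice` type, the `transport` callable); it is one way the two steps could look, not a prescription. Step one puts the existing logic behind an interface while it still lives in the monolith. Step two swaps in an implementation that calls the extracted service, without touching the callers.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Invoice:
    customer_id: str
    amount_cents: int


class BillingService(Protocol):
    """The boundary: the rest of the monolith may only call these methods."""

    def create_invoice(self, customer_id: str, amount_cents: int) -> Invoice: ...


class InProcessBilling:
    """Step 1: existing logic moved behind the boundary, still inside the monolith."""

    def create_invoice(self, customer_id: str, amount_cents: int) -> Invoice:
        if amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")
        return Invoice(customer_id, amount_cents)


class RemoteBilling:
    """Step 2: same interface, backed by a call to the extracted service."""

    def __init__(self, transport):
        # `transport` is a hypothetical callable: (operation, payload) -> dict.
        # In a real system it would wrap an HTTP or RPC client.
        self._transport = transport

    def create_invoice(self, customer_id: str, amount_cents: int) -> Invoice:
        reply = self._transport(
            "create_invoice",
            {"customer_id": customer_id, "amount_cents": amount_cents},
        )
        return Invoice(reply["customer_id"], reply["amount_cents"])


def checkout(billing: BillingService, customer_id: str, cart_total_cents: int) -> Invoice:
    # Callers depend on the interface, not on which implementation sits behind it.
    return billing.create_invoice(customer_id, cart_total_cents)
```

The hard part is not the interface itself. It is getting every caller in the monolith to go through it, which is exactly the untangling work described above.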

Oh, and you need perfect tests

So you’ve decided to extract a service from your monolith. Congratulations! To get some of those great microservice benefits you’ve decided to rewrite your billing logic as a separate service. You go about writing the service, implementing a nice API that mirrors the internal service API you created inside the monolith when you refactored all your pricing logic. But wait. How do you know that the new service implements all the pricing logic in exactly the same way as the old logic? Oh, cause you have unit tests. Good. Except all those tests are structured for the old logic, and they are written in a different language from the new service. Fine, you can just fall back on your exhaustive and complete integration test suite. Right? Right!?!? Oh, your integration tests aren’t that great? And they can’t really test all the logic cause your test data doesn’t have the years of accumulated “real world” data that sits in your production data store? Hmm… that is too bad. And you can’t exactly run the new billing system in parallel with the old code because both systems mutate state all day long. Wow, I guess this is gonna be pretty tricky.
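One partial mitigation, for whatever slice of the logic you can isolate as a pure function of its inputs, is to replay recorded inputs through both implementations and diff the results. The sketch below assumes exactly that; `old_price`, `new_price`, and the `LEGACY10` coupon are made-up stand-ins, and in practice the "new" side would be a call to the new service rather than a local function.

```python
from typing import Callable, Iterable

PriceFn = Callable[[dict], int]


def old_price(order: dict) -> int:
    """Hypothetical stand-in for the legacy pricing logic."""
    total = order["unit_cents"] * order["quantity"]
    if order.get("coupon") == "LEGACY10":
        total -= total // 10  # a rule nobody remembered to write down
    return total


def new_price(order: dict) -> int:
    """Hypothetical stand-in for the rewrite, which missed the coupon rule."""
    return order["unit_cents"] * order["quantity"]


def find_mismatches(
    orders: Iterable[dict], old: PriceFn, new: PriceFn
) -> list[tuple[dict, int, int]]:
    """Run every recorded order through both implementations and collect differences."""
    mismatches = []
    for order in orders:
        expected, actual = old(order), new(order)
        if expected != actual:
            mismatches.append((order, expected, actual))
    return mismatches


recorded_orders = [
    {"unit_cents": 100, "quantity": 2},
    {"unit_cents": 100, "quantity": 2, "coupon": "LEGACY10"},
]

for order, expected, actual in find_mismatches(recorded_orders, old_price, new_price):
    print(f"mismatch for {order}: old={expected} new={actual}")
```

This does nothing for the parts of the system that mutate state, which is where the real pain is. It only catches the forgotten rules hiding in the read-only calculations.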

Let’s get real

The fact is, this problem is really, really hard. But people do it. It can be done. Here are some things to think about while you are happily pursuing your monolith-first strategy:

  • Start thinking about microservices earlier than you think you need to. A better strategy than “monolith first” might be “microservices as soon as possible.” As soon as you can start identifying some natural service boundaries, you should probably start investing in breaking a service out of your monolith. The longer you wait the harder it’s gonna be.
  • Break out a service that requires persistent state. One of the ways to “cheat” your microservice architecture is to start building new features as stateless services. This gives you logically separate services and multiple code bases, but avoids the really hard data synchronization problems that arise when you are running multiple stateful services. So don’t do that. Make sure that you start building new services that keep their own internal persistent state. This will force you to deal with your data synchronization and eventual consistency problems. You need to exercise that muscle and learn how to do it well if you want your eventual architecture to succeed.
  • Start refactoring your monolith now. If your monolith is a ball of spaghetti, start up a serious project to refactor it into separate modules with service boundaries. You will need some application architecture help to enforce the boundaries (Rails and most other frameworks are not gonna do it). Start investing in that now. The sooner you start thinking about “independent services” within your code base the faster you will get to a place where service extraction isn’t a herculean task.
  • Recognize the anti-patterns in your tooling around the monolith. As the eng org grows around a monolith you will increasingly find the need to create tooling to manage the massive parallel engineering going on in that one code base. Shared dev and testing environments and deployment pipelines become the bottlenecks in monolith development. Fights break out over coding standards and the “best” tools to use. Recognize this pain for what it is: a sign that you’re leeching engineering capacity into managing the monolith. The sooner you recognize this drain the sooner you can justify spending some of that time re-architecting to empower microservice development instead.
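On the point about needing help to enforce boundaries inside the monolith: one cheap form of that help is a check in CI that fails when a module imports something it shouldn't. Here is a rough sketch, written in Python for illustration (the same idea applies in other stacks). The package names and the `ALLOWED` rule table are invented for the example; a real check would walk the files on disk instead of taking source as a string.

```python
import ast

# Hypothetical rule table: each internal package lists the internal packages it may import.
ALLOWED = {
    "billing": {"billing", "shared"},
    "catalog": {"catalog", "shared"},
}


def imported_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports stay inside the package.
            found.add(node.module.split(".")[0])
    return found


def boundary_violations(package: str, source: str) -> set[str]:
    """Internal packages imported by `package` that its rule does not allow."""
    internal = set(ALLOWED).union(*ALLOWED.values())
    # Third-party and standard library imports are ignored; only internal ones are policed.
    return (imported_packages(source) & internal) - ALLOWED[package]


catalog_source = "import os\nfrom billing.models import Invoice\nimport shared.util\n"
print(boundary_violations("catalog", catalog_source))
```

A check like this won't untangle anything by itself, but it stops new cross-boundary dependencies from sneaking in while you work through the old ones.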

And finally, good luck. You’re gonna need it.