Martin Fowler’s popular article on microservices contains this advice: Going directly to a microservices architecture is risky, so consider building a monolithic system first. Split to microservices when, and if, you need it.
In our experience, Fowler’s model works well if you have a fairly solid roadmap you’re confident in. It’s especially useful for multiple teams that are being asked to move very fast: a monolith coordinates that work nicely and keeps things clean during the early stages. But when you’re searching for product/market fit and you need the feature set to be volatile (new capabilities being added, other things being decommissioned), it might make sense to follow a different path.
Sometimes, you’re better off giving yourself the flexibility of creating a network of loosely connected microservices to start – and then consolidating the code once you have a better understanding of the most important use cases.
Our First Iteration(s)
With Conjur, we followed a path that might be described as the opposite of Fowler’s model while we were searching for product/market fit. We didn’t know what the product would be, so we built a bunch of different microservices that were loosely connected, and sometimes we just abandoned ones that didn’t work out. We didn’t worry too much about performance or about a unified architecture.
What were we optimizing for?
- Rapid prototyping: New components could be written more or less without regard to the needs and decisions of other components. For example, we wrote our LDAP service in Node.js, because the best existing LDAP server library we found was implemented in Node.js. The rest of our stack was almost entirely written in Ruby.
- Flexibility: We wanted the ability to add and drop services without having to refactor other services. Having a strongly-coupled set of features within a monolith would have made that more difficult.
- Testability: The system did have to be stable. Each microservice tested itself in a way that made the most sense for it. We didn’t have to build a “unified” testing suite for the whole product. We tested components individually, assembled them, and then smoke tested the unit.
What weren’t we optimizing for?
- Hiring and ramping up new team members: Training new team members on a complex system of independently maintained microservices isn’t straightforward, but we had a small, stable team who knew the ins and outs of the architecture.
- Performance: In the early days, we didn’t have requirements for performance yet and didn’t want to spend time guessing where improvements might be needed.
- Transactionality: Because each of our services had its own connection to the application’s Postgres database, more complex operations updated the database in separate transactions. Having those operations happen in a single transaction turned out to be important, but we didn’t know that yet.
Even though our architecture was made up of a number of separate microservices, the end product still needed to be simple for users and operators to work with. To that end, we installed our microservices together into one “monolithic” Docker image. Our image was based on phusion/baseimage, which uses the runit init system. We used nginx as a reverse proxy, forwarding client requests to the microservices. The microservices themselves all communicated with a central Postgres instance, which also ran in the container. Users and operators had a single container to manage, while our internal developer teams got a loosely coupled “microservice” development experience.
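To make that packaging concrete, here is a rough sketch of the kind of nginx reverse-proxy configuration such a container might use. The service names, ports, and paths are hypothetical illustrations, not Conjur’s actual layout:

```nginx
# Hypothetical sketch: nginx as the container's single entry point.
# Each microservice runs under runit in the same container,
# listening on its own local port.
events {}

http {
  server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/server.crt;
    ssl_certificate_key /etc/nginx/server.key;

    # Route each URL prefix to the microservice that owns it.
    location /authn/   { proxy_pass http://127.0.0.1:5000/; }
    location /authz/   { proxy_pass http://127.0.0.1:5100/; }
    location /secrets/ { proxy_pass http://127.0.0.1:5200/; }
  }
}
```

From the outside this looks like one service on one port; internally, each `location` block hides an independently developed component.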
As the product evolved, our packaging process made it clear that we could benefit from merging some of the most critical microservices into a single service.
Moving to Monolith
For the latest version of Conjur (Version 5) we looked at the services that were most useful and important. We had a much better understanding of how they related to each other, so we combined all of those critical pieces together into a single service and put it in the repository cyberark/conjur.
By moving to a single service, we’ve gained transactionality: each action now results in a single database transaction, in contrast to the multiple transactions made previously when each microservice communicated with the database directly. We also no longer have to worry about performance or reliability problems across microservice boundaries, and we’ve made it much easier for people to learn the code base and contribute, because there is less overhead to understanding it.
In short, we started with a loosely coupled microservice system and then consolidated into a more rigid monolith once we understood the requirements a lot better. With a solid base of customer adoption, we were able to look at which services were most critical, which were dead ends that we could discard, and which of our original assumptions were invalid.
What did we learn?
- Some microservices were less useful: In some cases, a microservice was only used by a single customer. These were marked for end-of-life or turned into “add-ons” that can be optionally installed.
- The client side wasn’t a great place to do complex processing: Too much processing on the client side meant that only the Ruby client was fully functional. By moving this processing into the server, we could much more easily make full-featured clients for all the client languages.
- Database transactions across microservices are difficult to manage: Ideally, when designing a microservice-based architecture, each individual microservice is independent and you don’t have to worry about anything as complex as distributed transactions. In our original design, each microservice had its own connection to the database and updated it in its own transactions. We found that this meant some actions produced multiple database transactions, which was not desirable. In the end, we decided to combine all of the code that touches the database into one unified service. No other code (e.g. extensions and add-ons) is allowed to touch the database directly; it must use API calls just like a regular client. This resolves our transactionality problem, and also improves the security of our product by giving us tight control over how the database is updated.
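The transactionality problem is easy to demonstrate with any transactional database. This sketch (Python with SQLite standing in for Postgres; the tables and operations are invented for illustration) shows why one transaction matters: when a multi-step action fails partway through, separate per-service commits leave partial state behind, while a single transaction rolls back cleanly.

```python
import sqlite3

def setup():
    """A toy RBAC-ish schema, standing in for the real database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE roles (name TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE grants (role TEXT, member TEXT)")
    db.commit()
    return db

def add_role_per_service(db, name, member):
    """Old style: each 'service' commits independently (two transactions)."""
    db.execute("INSERT INTO roles VALUES (?)", (name,))
    db.commit()                        # transaction 1: role created
    if member is None:
        raise ValueError("no member")  # second step fails...
    db.execute("INSERT INTO grants VALUES (?, ?)", (name, member))
    db.commit()                        # transaction 2: never reached

def add_role_unified(db, name, member):
    """Unified service: one transaction, all-or-nothing."""
    with db:  # sqlite3 connection as context manager = single transaction
        db.execute("INSERT INTO roles VALUES (?)", (name,))
        if member is None:
            raise ValueError("no member")
        db.execute("INSERT INTO grants VALUES (?, ?)", (name, member))

db = setup()
try:
    add_role_per_service(db, "admin", None)
except ValueError:
    pass
# Partial state: the role exists but the grant does not.
orphaned = db.execute("SELECT COUNT(*) FROM roles").fetchone()[0]

db2 = setup()
try:
    add_role_unified(db2, "admin", None)
except ValueError:
    pass
# Clean rollback: nothing was written.
clean = db2.execute("SELECT COUNT(*) FROM roles").fetchone()[0]
print(orphaned, clean)  # 1 0
```

Spread across service boundaries, the unified version would require distributed transactions; inside one codebase it’s a single `with` block.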
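The client-side lesson above can also be sketched in code. This is a hypothetical thin client, not Conjur’s actual API: a fake transport stands in for a real HTTP call so the example is self-contained. The point is that when all nontrivial processing lives behind one server endpoint, a full-featured client in any language is just a small wrapper.

```python
import json

class ThinClient:
    """Hypothetical thin client: no policy processing happens here.
    The server does the heavy lifting, so an equivalent client is
    trivial to write in Ruby, Go, or any other language."""

    def __init__(self, transport):
        # transport: callable(path, body) -> response string,
        # e.g. an HTTP POST in a real client.
        self.transport = transport

    def load_policy(self, policy_text):
        # Ship the raw policy to the server and parse the reply.
        return json.loads(self.transport("/policies", policy_text))

# A fake server standing in for the real one, for illustration only.
def fake_server(path, body):
    assert path == "/policies"
    # Imagine complex RBAC processing happening server-side here.
    return json.dumps({"created_roles": body.count("role:")})

client = ThinClient(fake_server)
result = client.load_policy("role:admin role:auditor")
print(result)  # {'created_roles': 2}
```

Contrast this with a thick client, where the policy parsing and RBAC logic would have to be reimplemented (and kept in sync) for every client language.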
The End Result
Starting out with a suite of microservices gave us a lot of flexibility when we were a startup trying to move quickly to adapt to changing market conditions. Once we had a better understanding of the core features of our product (and other technical considerations about stringing them together, such as the database transactionality), we were able to redesign the architecture to optimize it. In Conjur Version 5, there is a central service that can still be linked to other microservices (internal or external), but which is itself in charge of communicating with the database.
Due to the way we re-architected Conjur v5, we are able to realize a number of significant benefits:
- Vastly Improved Performance: By moving code that formerly sat on opposite sides of a service boundary into a single codebase, we can now run many statements in the same transaction. Loading a complex policy (a set of RBAC objects and rules) became 100x faster.
- Simplified Packaging: We distribute our product packaged as a Docker image, but we do get occasional requests from customers to package our product differently. This is much easier with a smaller number of components to potentially refactor.
- Open-Sourceability: We didn’t feel that open-sourcing a complex microservice framework with an “evolved” rather than “designed” architecture could ever be a success. By consolidating the key ideas and code into one repository, we were able to release github.com/cyberark/conjur as open source software and know that we were placing something into the world that was well-designed and understandable.
If you’re about to embark on a project and have similar goals as we did, maybe your team should follow our advice instead of Fowler’s. With a great group of people who aren’t sure what they need to build, you can start with a distributed microservices architecture and – once you understand the problem space better and know what’s needed to optimize your product – you can evolve it into a clean, high-performance monolith.
Alan Potter is a Senior Software Engineer at CyberArk, where he focuses on the core components of the Conjur system. He’s interested in designing secure, highly-available systems that can scale to whatever degree a customer needs.