A flurry of architectural discussions begins once a project starts, and it quickly becomes a race to grasp the business deliverable and understand the problem domain. A related pressure is getting the product into customers' hands so that you can gather feedback on how it is being used and what could be improved.

If the architectural review process takes too long, you lose time delivering. If you expedite the process too much, you may be riddled with technical debt. It's a very fine balancing act, to say the least.

During the initial architectural discussions, the question of whether to build a monolith or a number of micro services comes up. It can result in a very heated discussion. The answer sets the path for the project going forward: it determines how intertwined the pieces are and how much of each sprint must be spent on quality control.

Build for now, not for tomorrow, because there are several key uncertainties in play: who will actually use your application, and how load will ebb and flow against your code. As traffic is distributed across your application, you will see what is utilized the most and needs the most attention. Will something need to be broken out? If so, will it need to be extracted from a monolithic design, and how much time would that take? This decision matters, because re-implementing a solution from scratch is a very costly proposition; more than likely, you would be taking time away from work that benefits customers or end users.

As time goes on, this becomes a harder and harder sell to management. Convincing management to stop delivering new features in order to address technical debt that customers aren't yet complaining about is very difficult. If, however, the technical debt is continually ignored, the pendulum swings until it takes longer and longer to develop features. A solid technical foundation remediates this by letting developers address technical debt without a large time allocation.

I will be posting a more in-depth article on spikes to plan ahead. In summary: it's better to leave room to grow than to lock yourself in place.

For several years now I have been taking an approach that starts as a monolith but can be easily pivoted to micro services. That is to say:

  • At the origin of the project, a single API service is provisioned.
  • As utilization grows, we see where the load is concentrated; those services are then spun out.

This is done via a mono repository, which places all code into one overarching repository. The Achilles heel of this approach is that it demands solid CI/CD, merge requests, documentation, tooling, and testing. In short, it will fail unless you maintain quality and operational excellence.

Noted below is a sample layout of how I structure my projects. It relies on Kotlin multi-module projects, which allow a number of different projects to be configured under one repository root.

project
├── clients
│   ├── android
│   ├── iOS
│   └── web
└── services
    ├── authentication
    │   ├── api
    │   ├── persistence
    │   ├── messagebus
    │   └── records
    └── inventory
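
The layout above maps naturally onto a Gradle multi-module build. As a minimal sketch, assuming Gradle (module names mirror the folders; the iOS client would typically live alongside the Gradle build rather than inside it), the root settings file could look like this:

settings.gradle.kts

// Wires the folder layout above into one multi-module build.
rootProject.name = "project"

include(
    ":clients:android",
    ":clients:web",
    ":services:authentication:api",
    ":services:authentication:persistence",
    ":services:authentication:messagebus",
    ":services:authentication:records",
    ":services:inventory"
)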

Clients

The clients folder contains all interfaces or graphical clients. Each type of client is a separate project, as noted above. The clients depend on records from each respective service. The client library used to access a resource is based on an OpenAPI, AsyncAPI, or GraphQL contract and is automatically generated via the CI/CD process.

The records consumed should also be based on the contract definition. None of these elements should be hand written; they should be built through build plugins, based on a contract. This provides compile-time checking of what, if anything, would be broken by an update.
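
As one hedged example of that generation step, assuming a Gradle build and the OpenAPI Generator plugin (the spec path, package name, and plugin version here are placeholders, not part of the original setup), a client module's build file could drive the client library from the contract:

clients/android/build.gradle.kts

plugins {
    id("org.openapi.generator") version "7.6.0"
}

openApiGenerate {
    generatorName.set("kotlin")                                   // generate a Kotlin client
    inputSpec.set("$rootDir/services/inventory/contracts/inventory.yaml")
    outputDir.set("$buildDir/generated/inventory-client")
    packageName.set("design.animus.demo.clients.inventory")
}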

Services

Each service that drives the deliverable lives under the services folder. These can be small pieces of the project. Each one contains several common modules:

  • API is the back-end service that provides the API. It is contract driven, using OpenAPI, AsyncAPI, or GraphQL. The contract ensures adherence, via type safety, to the specified API layout.
  • Records are the data classes or objects that the API responds with. These are consumed by the client and by adjacent services. They are broken into two portions: one for inter-service communication (RSocket, gRPC, EventBus, etc.), the other for what the client receives (see the sketch after this list).
  • Persistence is the layer that talks to the database or non-persistent data store. It casts the persistence response into the records objects.
  • Message Bus allows for inter-service communication; it can be Kafka, RSocket, etc.
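
A minimal sketch of that records split (the names are hypothetical, and in practice both types would be generated from the contract rather than hand written):

services/inventory/records/src/main/../InventoryRecords.kt

// Inter-service record: what the message bus / RSocket / gRPC layer passes around.
data class InventoryItemRecord(
    val id: String,
    val sku: String,
    val quantityOnHand: Int,
    val warehouseId: String
)

// Client-facing record: the trimmed view the API returns to web and mobile clients.
data class InventoryItemResponse(
    val id: String,
    val sku: String,
    val inStock: Boolean
)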

Partial Monolith

graph TD;
  subgraph clients
    c1[iOS] --> g[API Gateway]
    c2[Android] --> g[API Gateway]
    c3[Web] --> g[API Gateway]
  end
  subgraph Persistence Abstraction / Message Bus
    g[Monolith] --> sm3[Shopping Cart]
    g[Monolith] --> sm2[Inventory]
    g[Monolith] --> sm1[Authentication]
  end
  subgraph Persistence
    subgraph Schema
      sm1[Authentication] --> sp1[Authentication]
    end
    subgraph Schema
      sm2[Inventory] --> sp2[Inventory]
    end
    subgraph Schema
      sm3[Shopping Cart] --> sp3[Shopping Cart]
    end
  end

Micro Services

graph TD;
  subgraph clients
    c1[iOS] --> g[API Gateway]
    c2[Android] --> g[API Gateway]
    c3[Web] --> g[API Gateway]
  end
  subgraph API Gateway
    g[API Gateway] --> sa1[Authentication]
    g[API Gateway] --> sa2[Inventory]
    g[API Gateway] --> sa3[Shopping Cart]
  end
  subgraph Message Bus
    sa1[Authentication] --> sm3[Shopping Cart]
    sa1[Authentication] --> sm2[Inventory]
    sa3[ShoppingCart] --> sm2[Inventory]
    sa3[ShoppingCart] --> sm1[Authentication]
  end
  subgraph Persistence
    sa1[Authentication] --> sp1[Authentication Persistence]
    sm1[Authentication] --> sp1[Authentication Persistence]
    sa2[Inventory] --> sp2[Inventory Persistence]
    sm2[Inventory] --> sp2[Inventory Persistence]
    sa3[ShoppingCart] --> sp3[ShoppingCart Persistence]
    sm3[ShoppingCart] --> sp3[ShoppingCart Persistence]
  end

Here we have two architectural diagrams.

First is the monolith. The monolith is deployed as a single service for the clients to communicate with. It can be REST, GraphQL, or WebSockets; it is just an aggregator.

The monolith communicates via a message bus abstraction layer, which is covered in more detail later. The message bus retrieves data from the persistence layer, which is broken into a separate schema for each portion.

By sharing one database to start, we ease initial provisioning. By separating into schemas, we can more easily migrate the data to a new database in the future.
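
A small sketch of what that looks like in configuration (the connection details and names are placeholders): every persistence module points at the same database to begin with, but is pinned to its own schema, so moving a service onto its own database later is a configuration change rather than a rewrite.

data class PersistenceConfig(val jdbcUrl: String, val schema: String)

// One shared database to start, one schema per service.
val authenticationDb = PersistenceConfig("jdbc:postgresql://db:5432/commerce", "authentication")
val inventoryDb      = PersistenceConfig("jdbc:postgresql://db:5432/commerce", "inventory")
val shoppingCartDb   = PersistenceConfig("jdbc:postgresql://db:5432/commerce", "shopping_cart")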

The micro services architecture takes a more typical approach. Clients are fronted by an API gateway, which communicates with a number of different services. Each service's message bus then communicates with its own persistence layer.

The micro service approach will be deployed at a minimum as follows:

  • One instance for the client facing API and Message Bus / Internal Communication.
  • The persistence layer.

Where Is the Dividing Line?

How do you decide when to stop bundling everything into one repository?

I look at this in terms of the end business deliverable. Our business deliverable is to provide an eCommerce platform for end users. This may be broken up into services like:

  • Authentication
  • Shopping Cart
  • Orders
  • Inventory
  • Customer Chat

These are loosely tied together, but work together to provide an internal system.

You could have an external facing customer application, and then an internal application for responding to customer inquiries. Those could be split into two separate repositories.


Service Boundaries

Service boundaries are one of the things that can make migrating easier or harder as time goes on. The question is how often a service reaches across its boundaries.

  • Each service is broken into a separate schema.
  • A service should only query its own persistence layer, never across schemas.
  • The API layer should only be used for client queries.
  • The message bus layer is only utilized for internal communication.
  • All communication should be non-blocking.

Recommended

graph TD;
  A[Client] --> B[Inventory API]
  A[Client] --> C[Shopping Cart API]
  C[Shopping Cart API] --> D[Inventory Message Bus]
  C[Shopping Cart API] --> F[Shopping Cart Persistence]
  D[Inventory Message Bus] --> E[Inventory Persistence]
  B[Inventory API] --> E[Inventory Persistence]

Here the Shopping Cart API talks to the Inventory message bus to retrieve information about inventory items, rather than talking directly to the inventory persistence.
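
In code, the boundary looks something like the following sketch (the helper names and types are hypothetical): the shopping cart queries its own persistence, but anything it needs from inventory goes through the inventory message bus module.

import design.animus.demo.services.inventory.messagebus.getInventoryById
import design.animus.demo.services.shoppingcart.persistence.queryCart

suspend fun getCartWithAvailability(cartId: String): CartView {
    val cart = queryCart(cartId)                                   // own persistence: allowed
    val items = cart.itemIds.map { getInventoryById(it).await() }  // inventory via its message bus
    return CartView(cart, items)
}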

Not Recommended

In this example the Shopping Cart API directly talks to the inventory persistence layer. This will make refactoring harder down the line.

graph TD;
  A[Client] --> B[Inventory API]
  A[Client] --> C[Shopping Cart API]
  C[Shopping Cart API] --> E[Inventory Persistence]
  C[Shopping Cart API] --> F[Shopping Cart Persistence]
  B[Inventory API] --> E[Inventory Persistence]

The Benefits of a Mono Repository

If we were to break these into a number of different repositories, we would need to release each component, increment the dependency version, and then re-release. A large number of merge requests could result, and you might need to ensure they are merged in a specific order.

An additional benefit is compile-time checks. Say I update the record layer and adjust the data class for a shopping cart; in a mono repository, when I go to compile, I see how every service utilizes it and am notified immediately if my change breaks a consumer of that record. With a multiple-repository structure, we wouldn't be notified of the error until the consumer goes to update, or worse, we would release the new record without checking on the client consumer.
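
A contrived sketch of that compile-time safety (the field names are made up): if the shopping cart record renames a field, every consumer in the repository that still uses the old name fails to compile in the same merge request rather than at release time.

// services/shoppingcart/records — field renamed from `lineItems` to `items`
data class ShoppingCartRecord(
    val id: String,
    val items: List<String>
)

// a client consumer — the break surfaces immediately at compile time
fun renderCart(cart: ShoppingCartRecord) {
    // cart.lineItems.forEach(::println)   // compile error: unresolved reference
    cart.items.forEach(::println)
}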

To ease the service boundary layer, we want to automatically generate and document a lot of the transactions that happen across the service boundary.

Contract First

There are a number of platforms out there for defining APIs and how they operate. Many of them are polyglot, and at their core, you should not be beholden to a specific language. This makes it easy to expand or consume your services.

Some of the more common contract formats are AsyncAPI, OpenAPI, and gRPC, among many others.

By focusing on a contract first you allow for:

  • Asynchronous development: web, mobile, and back-end can all work on the same feature based on adherence to a contract.
  • Automatic generation of data classes or record types.
  • Automatic generation of SDK and client libraries.
  • Validation of your domain by taking a type-safe approach. If the contract changes, you will be notified at all levels that either the consumer or provider does not adhere to the contract.

When creating a ticket, you can attach a contract file. The first step in the release of the feature is to merge the contract in. This then creates mock servers for testing, clients, and API server information.

This flow can be extended to integration testing. You can create a container artifact based on the mock server generated from the contract; then, during the CI/CD process, that mock container can be spun up for testing.
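
As a hedged illustration of that step, assuming the contract's mock server has been baked into a container image (the image name below is a placeholder) and Testcontainers is available on the test classpath, an integration test can start the mock and point the generated client at it:

import org.testcontainers.containers.GenericContainer

// Contract-generated mock server packaged as a container image.
val inventoryMock = GenericContainer<Nothing>("registry.example.com/inventory-contract-mock:latest")
    .apply { withExposedPorts(8080) }

fun runContractSuite() {
    inventoryMock.start()
    val baseUrl = "http://${inventoryMock.host}:${inventoryMock.getMappedPort(8080)}"
    // point the generated inventory client at baseUrl and run the integration suite here
    inventoryMock.stop()
}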

There is no hand-written mocking here; the contract is documentation that drives testing and delivery.

Monolithic

So far this looks like a micro service architecture. How do we make it a monolith with room to grow? A new module is introduced strictly for the monolith aspect. I call it director, but it's essentially the main module.

project
├── clients
│   ├── android
│   ├── iOS
│   └── web
└── services
    ├── authentication
    │   ├── api
    │   ├── persistence
    │   ├── messagebus
    │   └── records
    ├── inventory
    └── director

Assumptions

  • Name space: design.animus.demo at the root project level.
  • This will be framework agnostic and written in pseudo code.
  • The name space follows the folder hierarchy below the root level.

The director is our monolith module. This acts as a proxy to all of our different service layers.

services/director/src/main/../director.kt

import design.animus.demo.services.authentication.messagebus.authUser
import design.animus.demo.services.inventory.messagebus.getInventory
import design.animus.demo.services.inventory.messagebus.getInventoryById

// Pseudo code: the director simply routes client requests to each service's message bus module.
class ECommerceRestRouter {

        get("/authenticate") {
            authUser()
        }
        get("/inventory") {
            getInventory()
        }
        get("/inventory/:id") { id ->
            getInventoryById(id)
        }
}

This acts like a big nginx proxy; we've just separated our modules into separate, containable units. If we adhere to the noted service boundaries, it is relatively easy to split them out.

What about the message bus? Am I telling you to spin up Kafka or something else right off the bat? No. The message bus module starts as a stub; when we break out the service, we replace the stub with a real message bus abstraction.

services/inventory/messagebus/main/../getInventory

// initial: talk to the persistence layer directly
import design.animus.demo.services.inventory.persistence.queryInventory

suspend fun getInventory(): Deferred<Inventory> {
    return queryInventory()
}

// on break out: the same function goes over the message bus instead
import design.animus.demo.services.inventory.persistence.queryInventory

// Consumer side: listens for requests and answers them from persistence.
suspend fun consumeGetInventory() {
    val consumer = MessageBus.listen("design.animus.demo.services.query.getInventory")
    consumer.onReceive { message ->
        val inventory = queryInventory()
        message.respond(inventory)
    }
}

// Caller side: the signature is unchanged, only the implementation differs.
suspend fun getInventory(): Deferred<Inventory> {
    return MessageBus.send("design.animus.demo.services.query.getInventory")
}

This is by no means a complete picture, but it gives an idea of the changes that can happen; it is a very simplistic sketch. In the initial version we talk directly to the persistence layer, retrieving the information and returning it.

The break-out version sends a message to the message bus layer, requesting the information, and returns the deferred response. The message can travel by any number of means. The key point common to both options is that they are deferred and non-blocking: the result will arrive some time in the future, not right away. By deferring the operations rather than synchronously waiting on them or directly querying the database, we make the break-out easier. By allowing for a future response, we are not designing around an immediate one, because talking across a message bus or internal API can involve delays, errors, timeouts, and so on.
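
For completeness, a tiny sketch of the call site (pseudo code in the style of the director example; respond is a stand-in for however the routing framework returns a body). Because the signature stays Deferred<Inventory>, the caller does not change when the stub is swapped for the real message bus.

get("/inventory") {
    val inventory = getInventory().await()   // suspends without blocking a thread
    respond(inventory)
}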

Does this result in additional code complexity?

Partially. There is an initial cost in setting up all of the build files; however, once you get one done, I usually just copy a service skeleton. The bigger cost to me is that I have to crank up the memory allowed to IntelliJ: it is much more resource intensive to keep the entirety of the project indexed. In exchange, it allows for easy traversal of the project. It is very easy to search the project, jump between definitions, and track down errors. I have personally saved a lot of time with this approach with a back-end service and a React/NodeJS consumer: running the process in the IDE via Docker, I am notified of a mishandled type on either the front-end or back-end.

The Cost

There is a cost here, but what is it? You've laid half the groundwork for a micro service architecture, but not gone all the way in, and you don't know if it's necessary. On the flip side, why start with a micro service architecture? Do you truly know which service will receive the biggest load or require the best performance? An analogy: you bought a house just a bit bigger than you needed because you may add an addition down the line, or an outdoor kitchen. To prepare for that eventual change you weeded the area and removed any debris, laying the groundwork without building the structure. The cost is in quality and operations. This takes diligence to maintain, but the diligence will pay off. It provides a consistent framework for testing features, validating releases, and gaining visibility into your processes:

  • Proper documentation explaining the business need and architectural decision.
  • Taking time to write out the initial API contracts.
  • Building the CI/CD pipelines to build the project.
  • Proper ephemeral testing to ensure service domain boundaries aren't crossed.
  • Building out an integration test layer that can spin up services for testing.
  • Proper setup of merge requests.
  • Automatic generation of clients and libraries based off of contracts.

This initial cost allows for rapid delivery down the line. A lot of this is skeletal: a service is a service, and a client is a client, so it's very easy to provide a skeleton that can be shared or act as a base between services. What you cannot buy is a guarantee about load: we cannot say up front that we will see more load in the shopping cart. Will end users query inventory all the time, or keep re-browsing their shopping carts?

What we can guarantee is that you have a team of developers constantly iterating forward. You can enable them by providing tooling and documentation, which allows them to operate at a quicker pace.

  • There will be bugs and production failures. By introducing visibility early, you can analyze these issues.
  • There will be testing requirements and quality assurance. By automating test cases, you reduce the developers' testing load.
  • By providing contracts and documentation, onboarding is eased, allowing new developers to get up to speed quicker.
  • With proper release tracking, you can determine what made it into a release.

Merge Requests

Merge requests are the backbone of this procedure within an overarching mono repo. They require diligent checking before a change is merged to the parent tree; because, as noted, each module depends on other modules, you will be made immediately aware if your change breaks the client.

There are two phases. First is validation, where you are not yet sure what the end deliverable will be; you are therefore iterating quickly to get it into the customer's hands for feedback. At this point you shouldn't focus heavily on quality; instead, add a sprint for clean-up after validation.

Once you validate this:

  • Documentation - Architecture and data modeling diagrams should live with the code. This documentation must grow alongside the code base.
  • Performance Metrics - By embedding metrics you can quickly see during the merge request whether performance has drastically regressed (a small sketch follows this list).
  • Testing Reports - While you should have tests, you don't need to strive for 100% coverage. The compiler will do a lot of the work for you.
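
As one hedged example of embedding such a metric, assuming Micrometer is on the classpath (the metric name is a placeholder), a timer around the inventory query gives the merge-request pipeline a number to compare between builds:

import io.micrometer.core.instrument.Metrics
import io.micrometer.core.instrument.Timer

suspend fun timedGetInventory(): Inventory {
    val sample = Timer.start(Metrics.globalRegistry)
    val inventory = getInventory().await()
    sample.stop(Metrics.timer("inventory.query.duration"))   // records the elapsed time
    return inventory
}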

The CI/CD pipeline is critical to ensuring that testing on merge requests is ephemeral and completed properly. The process could even spin up an adjacent Kubernetes environment for testing.

Summary

This was just a brief overview of structuring projects: how you can start with a monolith while leaving room to change to a micro service architecture. By focusing your first several sprints on scaffolding and quality assurance, you can iterate in a quicker, more reliable pattern down the line. The deliverable is separated into a number of encapsulated modules: the project starts as a monolith, but breaks out into separate services as needed.