What do application boundaries look like in cloud native landscapes?

Kim Clark
15 min read · Aug 3, 2021


Cloud native techniques and infrastructure enable us to build solutions from increasingly fine-grained components. If each piece of functionality is already self-contained, does the broader application boundary still exist? Although the boundary is less obvious, do we still need some grouping of fine-grained components in order for landscapes to remain understandable, maintainable, and indeed securable? This leads to the more interesting question of what effect such boundaries have on communication styles such as APIs and events when interactions cross them.

…no application landscape is simply a mass of microservices or functions; there is always more structure to it than that.

This post is about software application landscapes and what they look like at differing scales. Modern implementation techniques and platforms enable us to deploy increasingly fine-grained componentry through microservices, functions-as-a-service and so on. But no application landscape is simply a mass of microservices or functions; there is always more structure to it than that.

Drawing a loose comparison, we could note that the universe is not just a mass of atoms. Atoms are grouped into molecules, which are in turn grouped further: as structures in materials such as crystals, as cells and organs in living organisms, and so on. We can follow this all the way up to cosmological groupings of solar systems and galaxies. Grouping is necessary and natural.

So, back to applications. Microservices are always grouped, and it’s that structure we’re going to explore, asking whether the grouping results in an application boundary that is the same as, or different from, that of more traditional applications.

We’ll then consider what role that application boundary plays. Is communication between two microservices different if they are in the same group, compared to if they are in different groups? It might look the same (e.g. a RESTful API call), but is it really the same at such different scales?

Let’s begin by considering what we used to mean by an application boundary, then progress to see what we might mean by that today.

What happened to the traditional application boundary?

Microservices and cloud native principles push us toward building our solutions from more fine-grained components. Where previously a collection of related business functions would be deployed to a single runtime and classed as an “application”, we now break up that functionality into independent components, deployed for example in containers. As each microservice is now an independent component on the network, the natural physical boundary of the application is no longer present.

Note, I’m using the word “function” as a broad term here, to mean a business function that might be deployed independently as a microservice component. I do not mean an individual low-level programmatic function.

In the past, grouping the functions of an application onto a server helped save resources. Indeed, an application was often deployed in its entirety to a single application server (or, more typically, a high-availability pair of servers).

The boundary of the application was then very clear as it was bounded by the application server runtime, and indeed also by the operating system. Communication within the application was via internal calls, and you had to perform specific configuration in order to make things available beyond the application, for example as APIs.

Indeed, servers were often used by multiple applications as this simplified operations and optimised resources.

A single topology could be built with high specifications for performance, availability, and security, and then many applications could be deployed to it. Any applications placed on the shared application server topology would inherit those characteristics. There was a clear relationship as to which applications were running on which operating system instances, and the application boundary was also preserved through the packaging of the applications.

However, everything running on the same server topology brought many disadvantages too. The initial topology had to be built with all future workloads in mind, resulting in a much longer design and construction phase. Deployments or changes to any part of an application could have side effects on that application and on other applications sharing the same server. The tight governance required on such a shared environment also slowed the surrounding processes. As an example, all applications deployed to it would be tied to a specific version of the application server, so they couldn’t benefit from new features, or perhaps even security fixes, without a lengthy upgrade project.

With modern lightweight runtimes, and consistent ways to administer large numbers of these through standards-based platforms (e.g. Kubernetes), it is now perfectly reasonable to separate functionality out into independent components (e.g. containers) often called microservices.

Deploying each microservice independently brings benefits in that deployments cannot have side effects on one another. Each microservice can have its own version of the application server runtime, or even choose a completely different type of runtime. It also enables more independent and automated CI/CD pipelines, improving the agility and consistency of deployment. Indeed, it enables independent deployment choices in terms of availability, scaling, upgrades, security and more.

…the physical boundary of those servers no longer has any direct relationship to the application groupings… If we want application boundaries to exist, we are going to have to find a way to re-introduce them.

There are still real servers running underneath the container platform of course, but they are to a large extent hidden from the application developer. The point here is that the physical boundary of those servers no longer has any direct relationship to the application groupings.

Since the code of an application is no longer deployed together as one unit onto a single runtime, there is no longer a natural application boundary. If we want application boundaries to exist, we are going to have to find a way to re-introduce them.

If each function is now essentially self-contained, do we still need the notion of the application that previously grouped a set of functions together? Or could each microservice component just live independently on the platform in a sea of other microservice components?

Why might we want to retain the notion of an application boundary?

Just because there is no forced physical grouping of the functions on a server, should we immediately assume that there is no longer any value in that grouping?

We most probably would have deployed those functions together as parts of an application because they were somehow related. This can take a number of forms:

  • Functional. The components may be related to the same functional domain within the business. They likely referred to related capabilities of the business (they might all relate to “payments” or “quotations”, for example), so it was easier from an analysis and design perspective to evolve them as one overall “bounded context”, to use the language of Domain-Driven Design. When we came to make changes later, it would be easier to see how they related to one another and what ripple effects those changes might have. It would also limit the range of tests we would need to run to be confident that we could introduce the changes safely.
  • Data. A related but subtly separate reason functions were often deployed together is a shared data model. “Ah, but when we break the application up into microservices, we’ll ensure the data models are all independent” I hear you say. We’ll certainly try, but experience tells us some components will still have an undesirably close correlation between their data structures. Yes, we can lessen that with clear interfaces, anti-corruption layers and the like, but in any solution there will be some components that are more intrinsically bound to one another than others. It will surely be a lot easier if those components are known to be part of the same logical application.
  • Ownership. Who ultimately owns the components? Who cares most that the components meet their operational goals, who drives functional changes, who owns the funding and decides the priorities for the evolution of the components? It is going to be a lot easier to manage funding, change control, operational service levels and more if the grouping of components aligns in some way with business ownership.

From these alone we can see there is value in working with a set of components as a group. Formalising an application boundary to group components enables us to make changes at a fast pace inside the boundary, compared to the slower pace exposed to those using the application from the outside. It enables an additional layer of decoupling, with benefits such as:

  1. Understanding the effects of change — Application boundaries allow us to be clear about which interactions are accessible from outside an application, compared to the much greater number of interactions that occur within it. It is then much easier to evaluate the possible ripple effects of a given change.
  2. Testing efficiency — The application boundary helps define a layer within our “test pyramid”, making it easier to calibrate the testing effort. It becomes clearer, for example, what depth, type, and range of tests to apply, where to place stubs/mocks, what the dependencies are between tests, and so on.
  3. Operational management — For effective introduction of site reliability engineering (SRE), we need to be able to understand the smaller (e.g. microservice) components in the context of a more holistic view. The application boundaries help us to consider how the microservices are used together to achieve broader business functions, ensuring we choose meaningful service level indicators (SLIs).

So, application boundaries are still valuable in order to keep IT landscapes manageable. We need to compartmentalise the landscape in order to operate and maintain it safely and efficiently, and to enable its evolution over time.

What effect do application boundaries have on inter-communication?

My personal focus area is integration. If application boundaries are to be more than just a logical grouping on paper, I’m most interested in how these boundaries affect the way microservices communicate with one another within an application, compared to across application boundaries.

One of the most common mechanisms for communication between components across networks today is the API. These are typically RESTful APIs, using HTTP and JSON data formats, although you could include GraphQL or web services in this.

Let’s use APIs to explore what role application boundaries play, and then we will expand our viewpoint to look at other communication protocols.

Is there a difference between the use of APIs for communication within, as opposed to across, applications? I first wrote about this back in 2018 when exploring the difference between APIs used among microservices in the same application versus those going from one application to another.

These interaction paths might from a distance appear technically identical, both perhaps passing JSON messages over HTTP in a RESTful style. However, differences emerge when you consider other aspects such as how the APIs are discovered, who can subscribe to use them, the security models they use, and the types of controls you may want to put on them.

The thought was that interaction within an application boundary would likely be within the same platform, and needs such as runtime discovery, routing, throttling, load balancing, and security could be handled by the core platform capabilities. A well-known example would be those provided by the container orchestration platform Kubernetes, perhaps in conjunction with a service mesh.
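To make that tangible, here is a minimal sketch, in Python, of what an intra-application call can look like on Kubernetes. The “quotations” service name and API path are hypothetical; the point is simply that discovery and load balancing are inherited from the platform rather than coded into the application.

```python
# A minimal sketch of intra-application communication on Kubernetes.
# The "quotations" service name and API path are hypothetical.
# In-cluster DNS resolves the Service name, and the platform load
# balances across the healthy pods behind it, so the caller carries
# no discovery, routing, or balancing logic of its own.
import requests

def get_quotation(quote_id: str) -> dict:
    # "quotations" is assumed to be a Kubernetes Service in the same
    # namespace; no host lists or registries appear in application code.
    response = requests.get(f"http://quotations/api/v1/quotes/{quote_id}", timeout=5)
    response.raise_for_status()
    return response.json()
```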

On the other hand, communication across applications would require different and additional capabilities, such as more sophisticated mechanisms for API exploration (API developer portals), subscriber onboarding mechanisms, subscriber-specific traffic management, external security models and more. These are the capabilities that API management software has evolved to provide.

In a further article I expanded on how communication within the application may well need to be further supplemented by use of a service mesh, which abstracts a number of interaction patterns away from the component developers. Some of the capabilities of a service mesh appear similar to API management (traffic management, security), but their implementation (such as in Istio, using sidecars on both sides and a control plane) is more suited to intra-application communication. Furthermore, there are a number of patterns that are more application-development-centric, such as A/B testing, logging/tracing, mTLS, fault injection and more.

interactions within an application should use inherent platform routing capabilities and potentially a service mesh, whereas communication between separate applications would likely benefit from API management capabilities

The conclusion was that interactions within an application should use inherent platform routing capabilities and potentially a service mesh, whereas communication between separate applications would likely benefit from API management capabilities.

Since those articles were written, the lines between the technologies have blurred somewhat. Service meshes are starting to exhibit some basic API management features, and API management capabilities are increasingly collaborating more closely with the service mesh to do some of their bidding. However, I believe the fundamental concept still stands: the needs of intra- and inter-application communication are different, and there is still value in deciding where the application boundary lies.

What about other types of synchronous API?

Up to this point we’ve assumed all communication between components is synchronous, so for example using RESTful APIs, or perhaps more recently GraphQL, or indeed historically, web services.

Synchronous interactions assume each request traverses all the way to the backend system, and any componentry in between. This interaction pattern means they are likely to share many of the same additional concerns when they cross application boundaries. For example, we are going to want consumer-specific access controls and throttling in order to protect backend systems from both a performance and security point of view.
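As a rough sketch of what “consumer-specific throttling” means at the boundary, here is a token-bucket rate limiter keyed by API key, in Python. The consumer names and rates are invented for illustration; a real API management layer would drive this from externally configured subscription plans rather than hard-coded values.

```python
# A minimal sketch of per-consumer throttling at an application boundary.
# The API keys and rate limits are hypothetical; real API management
# products offer far richer, externally configured policies than this.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = burst         # maximum burst size
        self.tokens = float(burst)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on elapsed time, up to capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per registered consumer (subscriber-specific traffic management).
buckets = {
    "partner-a": TokenBucket(rate_per_sec=10, burst=20),
    "partner-b": TokenBucket(rate_per_sec=1, burst=5),
}

def admit(api_key: str) -> bool:
    bucket = buckets.get(api_key)
    return bucket is not None and bucket.allow()  # unknown consumers rejected
```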

a GraphQL interface exposed on an application boundary might need some more direct traffic control on query depth and cost

We will also see some additional requirements unique to particular communication protocols. For example, callers of a GraphQL API can request information that might span several entities on the backend system, perhaps resulting in a deep and/or recursive query. Whilst application-internal invocations of such an API might govern these risks through usage guidance, a GraphQL interface exposed on an application boundary might need some more direct traffic control on query depth and cost.
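As an illustration, here is a minimal sketch of such a depth check using the graphql-core Python library. The limit of 5, and the idea of running the check in a boundary gateway, are assumptions; real gateways typically add query cost analysis as well.

```python
# A minimal sketch of a query-depth guard for a GraphQL boundary.
# Uses the graphql-core library; the max depth of 5 is an arbitrary
# illustration of the kind of limit a boundary gateway might enforce.
from graphql import parse
from graphql.language.ast import OperationDefinitionNode

def selection_depth(node) -> int:
    # Recursively measure how deeply field selections are nested.
    selection_set = getattr(node, "selection_set", None)
    if selection_set is None:
        return 0
    return 1 + max(selection_depth(sel) for sel in selection_set.selections)

def check_depth(query: str, max_depth: int = 5) -> None:
    document = parse(query)
    for definition in document.definitions:
        if isinstance(definition, OperationDefinitionNode):
            depth = selection_depth(definition)
            if depth > max_depth:
                raise ValueError(f"query depth {depth} exceeds limit {max_depth}")

# A deep query an external caller might attempt; this one is rejected
# (depth 6 > 5) before it ever reaches the backend.
check_depth("{ customer { orders { items { product { supplier { name } } } } } }")
```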

So, we can assume stronger “management” of APIs will be necessary on the boundary in general, and the implication that this may need to be provided by API management software specifically designed for the purpose still stands.

What about asynchronous communication; messaging and events?

There are also, of course, asynchronous alternatives that enable decoupled communication, such as messaging and event streaming, which have seen a resurgence in recent years. If we are going to consider how synchronous communication differs between inter- and intra-application use, we should go through the same thinking process for asynchronous communication. This is worthy of a separate blog in its own right in due course, but let’s put some rudimentary thinking into the problem.

Many applications have used asynchronous communication internally over the years. A classic example is that of a payments engine, which needs to take huge numbers of payment instructions through a complex journey with as much parallelisation as possible. Note that the various functions within the payments engine all work with the same payment data model. Any change to that data model will likely result in a full engine refactor. The point here is that although the various elements of logic within the payments engine are decoupled from a runtime point of view through asynchronous messaging, they are very much still coupled from a functional point of view and should be considered part of the same application (or at least domain).

Let’s recall that breaking up an application into smaller microservice components with RESTful APIs between them doesn’t change the fact that there is still some notion of it being a single application. In the same way, introducing asynchronous communication between parts of an application doesn’t mean you’ve introduced an application boundary either.
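Here is a toy Python sketch of that distinction: the stages below are decoupled at runtime by queues, yet every stage manipulates the same (invented) payment structure, so a change to that shared model ripples through all of them.

```python
# A minimal sketch of runtime decoupling *within* one application:
# payment stages hand work off asynchronously via queues, yet all
# share the same hypothetical payment shape, so they remain
# functionally coupled and belong inside one boundary.
import queue
import threading

intake_q: queue.Queue = queue.Queue()
settlement_q: queue.Queue = queue.Queue()

def validator():
    while True:
        payment = intake_q.get()          # all stages share one payment shape
        payment["status"] = "validated"
        settlement_q.put(payment)
        intake_q.task_done()

def settler():
    while True:
        payment = settlement_q.get()
        payment["status"] = "settled"     # a change to the shared data model
        settlement_q.task_done()          # would ripple through every stage

threading.Thread(target=validator, daemon=True).start()
threading.Thread(target=settler, daemon=True).start()

for i in range(3):                        # many instructions, processed in parallel
    intake_q.put({"id": i, "amount": 10.0, "status": "received"})

intake_q.join()
settlement_q.join()
```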

So, what would be different about asynchronous communication between applications, across an application boundary? Once again, it is about the fact that the consumers of the messages or events are more distant from the owners of the application. They will want to be able to discover what messages they could potentially listen for, and be able to self-subscribe to use them. They may also want to be insulated from changes to the underlying schema used by the provider of the event. In turn, the providers of those events/messages will want to ensure that they can only be accessed by parties that have the right permissions to view the data. They may want to provide different versions containing more or less of the data.
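As a sketch of what boundary-crossing consumption could look like, here is a hypothetical external subscriber using the kafka-python library. The endpoint, topic name, and credentials are all invented, standing in for values a subscriber would obtain through self-service onboarding.

```python
# A minimal sketch of an external consumer of a boundary-exposed event
# stream, using the kafka-python library. The gateway address, topic
# name, and SASL credentials are hypothetical: in practice they would
# be issued to the subscriber through a self-service portal.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "payments.completed.v1",                # versioned topic exposed on the boundary
    bootstrap_servers="events.example.com:443",
    security_protocol="SASL_SSL",           # the provider controls who can connect
    sasl_mechanism="PLAIN",
    sasl_plain_username="subscriber-1234",  # credentials from subscriber onboarding
    sasl_plain_password="example-secret",
    group_id="partner-a-payments",
)

for record in consumer:
    # The subscriber sees only events it is entitled to; schema changes
    # can be insulated behind the versioned topic name.
    print(record.topic, record.value)
```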

many of the features we have come to expect from API management…are just beginning to [be] made available for asynchronous communication

If the above feels strangely familiar, that’s because these are many of the features we have come to expect from API management today. However, we are only just beginning to see these made available for asynchronous communication. This is happening in parallel with the maturation of standards such as AsyncAPI, which will be as necessary here as the OpenAPI Specification was for synchronous exposure. I work for IBM, and we’ve been developing in this space for a while under the name event endpoint management. It will certainly be interesting to see how it matures.

Is “application” the right term for these modern boundaries?

Perhaps these group boundaries will not be the same as those we had before, and if so, perhaps the term “application” is no longer the right name for this boundary.

There are some key differences between the boundary we’ve been referring to in this post compared to what we meant by “application” in the past.

  • The boundary is now a purely logical construct. It is unrelated to the underlying application servers, operating systems or indeed physical resources.
  • The boundary is no longer implicit. It is only there if we choose to design it in and implement a distinct style of communication across it.

This has some interesting potential implications for future IT landscapes:

  • The boundary could be defined declaratively. The boundary might be present only because a particular set of policies say it is, and the underlying capabilities then enforce it.
  • The boundaries could be dynamic. What if boundaries need to change over time? In the past that would have meant unpicking code from an application, exposing network-based interfaces, and then re-deploying it onto another server. Since microservice componentry is already running independently, moving it within a different boundary is much easier (although admittedly still not entirely trivial).
  • The boundary could be “discoverable”. If the boundary is simply declared in policy files, these can be programmatically “read” in order to discover the IT landscape at the application level (see the sketch after this list). This has value in terms of providing instant architectural insight for forward planning, operational control, diagnostics and more.
  • The boundary could be hybrid. We can define boundaries that bring together components living across a number of different platforms. In reality this has always been the case. Consider even a traditional application that has its own database. The true application boundary is not really just that of the application server code; it also includes the database. The hybrid boundary is enforced simply by the fact that only the application knows the credentials to the database. So, in a way, the idea of applications having a cross-platform composition is nothing new, but this may make it easier to clearly define those hybrid boundaries.
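To illustrate that discoverability, here is a minimal Python sketch that reads hypothetical boundary policy files and reports the landscape at the application level. The file layout and field names are invented rather than taken from any real policy standard.

```python
# A minimal sketch of discovering application boundaries from declared
# policy. The YAML layout and field names are hypothetical; a real
# implementation might derive this from Kubernetes labels, service
# mesh policy, or API management configuration instead.
import glob
import yaml  # PyYAML

def discover_landscape(policy_dir: str) -> dict:
    landscape = {}
    for path in glob.glob(f"{policy_dir}/*.yaml"):
        with open(path) as f:
            policy = yaml.safe_load(f)
        # Each hypothetical policy file declares one application boundary:
        #   application: payments
        #   components: [payment-intake, fraud-check, settlement]
        #   exposed_apis: ["POST /payments"]
        landscape[policy["application"]] = {
            "components": policy.get("components", []),
            "exposed_apis": policy.get("exposed_apis", []),
        }
    return landscape

# Programmatic architectural insight: which components belong to which
# application, and what is exposed on each boundary.
for app, details in discover_landscape("./boundary-policies").items():
    print(app, details["components"], details["exposed_apis"])
```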

There are many similarities between the above and advances in software-defined networking (SDN). SDN enables us to declare that certain subnets exist, define which components are within those subnets, and control which can talk across them. Indeed, the implementation of application boundaries will often include SDN capabilities. However, as we’ve discussed already, there is much more to the boundary than network access.
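As an example of SDN capabilities contributing to a boundary, here is a sketch using the official Kubernetes Python client to declare a NetworkPolicy that admits direct traffic only from pods labelled as part of the same application. The namespace and the “app-boundary: payments” label are hypothetical, and a real boundary would also need to allow the managed gateway through for external consumers.

```python
# A minimal sketch of enforcing part of an application boundary with a
# Kubernetes NetworkPolicy, via the official kubernetes Python client.
# The "payments" namespace and "app-boundary" label are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

policy = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="payments-boundary", namespace="payments"),
    spec=client.V1NetworkPolicySpec(
        # Applies to every pod declared part of the "payments" application.
        pod_selector=client.V1LabelSelector(match_labels={"app-boundary": "payments"}),
        ingress=[
            client.V1NetworkPolicyIngressRule(
                # Only peers inside the same boundary may connect directly;
                # external consumers would come through the managed gateway.
                _from=[client.V1NetworkPolicyPeer(
                    pod_selector=client.V1LabelSelector(
                        match_labels={"app-boundary": "payments"}
                    )
                )]
            )
        ],
    ),
)

client.NetworkingV1Api().create_namespaced_network_policy("payments", policy)
```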

A reality check

Before we get too excited about having a declarative description of our entire IT landscape, let’s take a step back. These more forward-looking ideas are relatively easy to imagine if we are fortunate enough to implement our entire landscape on a single consistent platform. However, apart from a few young start-up companies, that’s not the situation for most enterprises. Implementing more declarative boundaries will require close co-ordination between elements such as networking, gateways, container platforms, service mesh technology and more.

modern platforms and architecture enable grouping that is more business-aligned, more adaptable, more composable, and potentially dynamic

However, what we have hopefully affirmed in this post is the need to continue to find meaningful groupings of components, recognising that not doing so will lead to unmanageably complex landscapes. The good news is that modern platforms and architecture enable grouping that is more business-aligned, more adaptable, more composable, and potentially dynamic. The key questions are what happens on these boundaries: what protocols are exposed on them, what boundary capabilities are required, and which technologies are best suited to each use case?

Acknowledgements

Sincere thanks to the following for sanity checks and other contributions to this post: Carsten Bornert, Holly Cummins.

Kim Clark

Integration focused architect. Writes and presents regularly on integration architecture and design topics.