Stream-aligned teams providing an API isn't a XaaS

Combining Team Topologies and Domain-Driven Design Context Maps

Dec 01, 2024

I had a great conversation at Global Domain-Driven Day 2024 at VirtualDDD about a topic proposed by Kenny Baas-Schwegler during the Open Space.

The topic was along the lines of:

Can a Stream-Aligned Team provide a XaaS for other teams?

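Using Team Topologies nomenclature, it refers to a stream-aligned team offering its capability to other teams through the X-as-a-Service (XaaS) interaction mode.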

Even though the book discourages it, some time ago I wrote about when it made sense based on my experience. Here is the post on how a stream-aligned team enabled the evolution toward a platform team.

Interim Platform Team

Aleix Morgadas

July 22, 2023

The good thing about the open space is that everyone was encouraged to speak, and great arguments were made both in favor and against. Yet, as we all agreed, it all depends.

Introducing the context

We have two stream-aligned teams:

Team 1 with Capability A serving user Alice.
Team 2 with Capability B serving user Bob.

Then, as part of business as usual, Team 1 detects a new need for Alice that involves Capability B.

Let’s explore several scenarios with their consequences.

1. Reusing the capability
2. Stream-Aligned Team providing a XaaS
3. Moving the capability to a platform team

1. Reusing the capability

Team 2 already exposes Capability B through an API. Team 1 checked the API, and it suits Alice’s needs.

So they integrate Capability A with Capability B and follow all the technical best practices, such as use-case integration tests and monitoring to be sure the integration works. They even add an Anti-Corruption Layer (ACL) just in case.

All of a sudden, Team 2 is upstream of Team 1 without Team 2 knowing it. Any change to Capability B will affect Team 1, even though they added an ACL.
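To make the limits of the ACL concrete, here is a minimal Python sketch. The payload fields (`customerId`, `price`), the `Quote` model, and the function names are hypothetical and chosen only for illustration; the post does not describe Capability B’s actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Team 1's own model (hypothetical); the only shape Capability A depends on."""
    customer_id: str
    amount_cents: int


class UpstreamContractError(Exception):
    """Raised when Capability B's payload no longer matches what the ACL expects."""


def translate_quote(payload: dict) -> Quote:
    """Anti-Corruption Layer: map Capability B's payload into Team 1's model."""
    try:
        return Quote(
            customer_id=str(payload["customerId"]),
            amount_cents=round(float(payload["price"]) * 100),
        )
    except (KeyError, TypeError, ValueError) as exc:
        raise UpstreamContractError(f"unexpected Capability B payload: {exc!r}") from exc


# Today's (assumed) payload translates cleanly into Team 1's model.
current = translate_quote({"customerId": "alice-1", "price": "12.50"})

# If Team 2 renames a field to serve Bob, the ACL confines the damage
# to one place, but Team 1 still has to react to the change.
try:
    translate_quote({"customerId": "alice-1", "priceWithTax": "12.50"})
    broke = False
except UpstreamContractError:
    broke = True
```

The translation keeps Capability B’s naming out of Capability A, yet an upstream rename still surfaces as an error that Team 1 must handle. The ACL limits where the change lands; it does not remove the dependency.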

So now, Capability B serves Alice’s and Bob’s needs. It has two potential points of change, and multiple problems can occur down the road.

A common scenario could be that Alice is an enterprise customer requiring specific features. Those become prioritized, and now Team 1 has influence over Team 2’s roadmap so that it includes the features they require from Capability B.

Team 2 finds itself in a situation where it:

Serves Bob’s and Alice’s needs, directly and indirectly.
Cannot evolve Capability B to serve Bob’s needs without being sure it doesn’t break the integration with Capability A.
Needs to adapt to changes required by Team 1.

Otherwise, Team 1 needs to constantly adapt to Team 2’s changes, slowing down its own development to serve Alice’s needs.

A better approach

Instead of integrating directly, a better approach is for both teams to collaborate and determine which of these options fits best:

Integrating both capabilities and defining the collaboration expectations for when they need to evolve, such as SLAs and the like.
Team 1 creating its own capability, adopting a Separate Ways approach. A small duplication can save us hours of coordination.
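One possible way to make the expectations of the first option explicit is a small consumer-side contract check that both teams can run before changing Capability B. This is a sketch under assumptions, not something discussed in the open space: the field names and the `contract_violations` helper are hypothetical.

```python
from __future__ import annotations

# Hypothetical expectations Team 1 agreed on with Team 2 for Capability B.
EXPECTED_FIELDS = {"customerId": str, "price": str}


def contract_violations(payload: dict, expected: dict) -> list[str]:
    """Return a readable list of ways the payload breaks the agreed shape."""
    problems = []
    for field, expected_type in expected.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(f"{field} should be {expected_type.__name__}, got {actual}")
    return problems


# Extra fields are tolerated, so Team 2 can keep evolving for Bob.
ok = contract_violations(
    {"customerId": "alice-1", "price": "12.50", "extra": 1}, EXPECTED_FIELDS
)

# Changes that touch the agreed fields are flagged before they reach Team 1.
broken = contract_violations({"customerId": 42}, EXPECTED_FIELDS)
```

Checking only the fields Team 1 depends on leaves Team 2 free to add to Capability B, while changes to the agreed part become a conversation instead of a surprise.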

2. Stream-Aligned Team providing a XaaS

If we find that Team 2 is in constant collaboration with Team 1, making Capability B suitable for more users than Bob, why not provide Capability B as a XaaS?

I think we need to be very careful in these scenarios because Team 2 can lose focus on its mission and purpose. Team 2 will need to serve at least two kinds of users, Bob and internal teams like Team 1, each with different needs, timings, and SLAs.

Stream-aligned teams are meant for fast flow with reasonable cognitive load. Aiming to offer their capabilities as a service will incur a higher cognitive load, and you need to be sure they have the right skills to make it happen.

Another risk is making Capability B too generic and too big instead of adopting a thin platform approach. It then becomes a bottleneck for innovation for customers like Bob and slows down other teams as well.

I would only recommend this approach when the use cases are limited and the team’s skills and maturity are high enough that it will not heavily increase the team’s cognitive load.

Serving internal customers gives you something that external customers don’t: a sense of control.

You can have a faster feedback loop internally and feel like you are making progress. Be careful, because it is a false sense of control, and you are forgetting the most critical point: external customers.

You might find more joy in creating an internal platform than in focusing on what’s crucial for the business. So, if you adopt this approach, proceed with caution.

3. Moving the capability to a platform team

The previous approach, a Stream-aligned team having its own external customers plus internal customers like Team 1, moved the conversation toward Platform Teams.

Even though it might look like a good approach from a Team Topologies perspective, it raised several concerns among the audience.

It seems like we turned a system problem into a team problem.
Moving the shared Capability B into a Platform Team and exposing it as a service will not fix the issue by itself, because the problems noted previously still apply in this scenario.

I think here we missed some key expectations of how Platform Teams need to behave from a Team Topologies perspective:

Adopting the platform is optional for teams. This allows separate ways when the platform doesn’t have a good UX or doesn’t serve a team’s needs.
Thinnest Viable Platform over big platforms that try to solve everything.
Platform teams need to reduce the cognitive load of their consumers, not increase it, whether through XaaS complexity or coordination complexity.

Therefore, I think some concerns about this approach come from a misunderstanding of the Platform Team’s mission and expected behavior. Given the current industry trend, it makes sense to be concerned about "platform everything" as a means to solve our problems.

We may still find some capability duplication here and there, but a platform team can provide a good experience for certain use cases that have higher complexity, increase team cognitive load, and prevent fast flow of value.

Recap

Do not integrate with another team’s capabilities without a conversation.
Integrating reduces some initial apparent cost of development, but it can create an increased cost down the road.
Providing a good capability experience to be consumed isn’t free, and it takes time.
XaaS can be a collaboration approach between Stream-Aligned Teams in certain situations, as long as you make sure that:
- It doesn’t exceed teams’ cognitive load.
- It doesn’t reduce fast flow of value for the external customers in favor of internal customers.
Investing in a Platform Team doesn’t mean the platform is mandatory, nor that it will solve all the problems. You still need to collaborate with teams to understand their needs.
A platform doesn’t need to cover all teams’ needs, only a thin part of the main use cases, to make a good thin platform instead of a sh*tty big platform that does everything but nothing well enough.

Thank you a lot for reading this post 😄.

I love to read your feedback and opinions to help me improve. You can DM me on LinkedIn or Bluesky, or just leave a comment.

Update

A nice thread happened on Bluesky about different approaches, and there was a great question about what this would look like in a Data Mesh.

I tried to show what it could look like in terms of Flow of Change, but Data Mesh isn’t my area of experience. I haven’t been exposed to enough scenarios to make an informed call.

How do you see Data Mesh, and how can you describe flow of value using Team Topologies?

Vasco Duarte

Dec 1, 2024

I love the balanced discussion of the different options, and how you started with the story of how the problem develops over time.

We often read solutions that are focused on showing "the perfect solution" (tm), but in the end ignore the fact that code, architecture, team structure, all evolve over time, and what is a great solution today, won't be tomorrow!

But I miss something fundamental. It's not core to your article, but by not talking about how the organization affects architecture choices I think you are missing a key piece that is key for managers wanting to keep their architecture appropriate over time (not just find "the perfect solution" (tm)).

You don't discuss Conway's Law and how the choices we make in organization affect the architecture that the code ends up manifesting.

In other words, your article is missing the people+organization aspect of the very same problem. And I think it will inform your proposed solution, but also validate some other solutions depending on the context of the reader.
