
The Netherlands 🇳🇱 Fast Flow Conf edition had an Open Space Format.
What is Open Space?
When you bring together a group of folks, empower them to co-create their own sessions on what they want to talk about, provide just enough structure to ensure progress, and hit the Start button – that’s Open Space, a style of self-organized “un-conference.”
Two people proposed a similar topic on platform engineering:
Why do PMs run away from internal platform products?
Who will take care of platform engineers?
The second question refers to the cognitive overload platform engineers usually face.
A great outcome of these spaces is that we connect with people who share a similar mindset and face similar challenges. We feel less alone, and we gain new perspectives.
So, I want to address both questions in this post from my perspective based on my experience.
Why do PMs run away from internal platform products?
Internal platform products are hard. The users are our internal teams: developers. Developers are the type of user for whom I have found it hardest to:
Understand their needs.
Create a good enough experience so they adopt the platform product, instead of “we prefer to do it custom to our specific context”.
As a non-technical PM, it is hard to understand how to deliver those experiences if you haven’t experienced the pain yourself. Plus, talking to developer teams gives you some insights, but those are hard to convert into something actionable without the help of your technical peers.
Platform PMs face a hard question.
How can you demonstrate the impact of the platform team?
We are investing NN.000 EUR/month in this team; can we get a 10x return on the investment? How do you measure the impact? Shouldn’t we assign those engineers back to the product teams, where the impact is more direct? Do you have the support you need from the data team to create the dashboards that track the impact, or are you very low on their priority list?
Those questions are hard to answer. You don’t have it easy. On one side, you are learning the domain and making sense of it while being seen as a cost rather than a value generator. On the other, you lack the support you need from the peer teams you should be building the platform for.
It’s hard.
Try this instead — Enabling work
What worked for me, for platform team PMs, is to train senior or tech engineers on a set of product management skills. That way, we cover one part of the problem: knowing the engineering domain.
Yet, knowing the engineering domain isn’t enough. You are missing a lot of other essential skills: how to prove value, drive platform adoption, run user interviews, prioritize, …
Without those skills within the team, you have a hard time moving the platform forward, and having the x10 impact that a platform team should aim for.
Upskill your platform team on the essential product skills with a product enabling team.
In our case, we focused on helping them to upskill on:
Product discovery.
Go-to-market approach. How can we help other teams start using the platform without forcing a bad platform experience on them?
Gather product feedback.
Prioritizing and saying no.
Build the team from product team engineers that show product thinking
Another great way to increase the odds of succeeding is to have people that are already used to product thinking.
They will be people who have been exposed to product reasoning, and they can do something similar in the platform team. You still have to upskill people in certain areas, because they won’t have PM support, but they already have the mindset to keep learning in this direction.
Who will take care of platform engineers?
From a Team Topologies perspective, we see Platform Teams as a way to improve flow by removing blockers and reducing cognitive load.
But the question was, who takes care of the platform engineers’ cognitive load?
We see from time to time platform engineers having a hard time delivering platform improvements.
They have a lot on their plate. They are unable to do anything beyond keeping the lights on. Plus, they're getting high-priority tickets from everyone, and they are unable to say no because product teams are busy doing important and valuable work.
We have been there, and we were, more or less, able to get out of that mess.
Antipatterns or bad practices
Platforms being the else clause of the backlog
if (productTeam.hasBacklogAvailability()) {
    productTeam.assignInitiative(initiative);
} else {
    platformTeam.dealWithTheInitiativeNoOneElseCanDo(initiative);
}
This was our norm. We don’t know where this belongs, or the teams don’t have capacity, but it is “important”, so it goes to the platform team with a request to address it ASAP, interrupting their current work in progress.
Too much work in progress
We need to do all of these things, so let’s do them all, because everything needs to be delivered together. Plus, we have these requests from product teams that are blocking the business.
Platform teams having a hard time prioritizing is a common problem. It is hard to say no to stuff that’s blocking the business. But we will see an alternative later.
“We know better than product teams”
This is way more common than I thought: platform teams building a platform product without talking to internal teams. In a vacuum.
Guess what: that platform doesn’t solve the developers’ needs, and it is either dropped or, worse, enforced, making the whole DevEx a mess.
In this antipattern, we have a very late feedback loop, increasing the odds of:
Not solving the real need.
Not gathering feedback until too late.
Not showing the value fast enough to build trust and adoption momentum.
Understaffed platform engineering with unrealistic expectations
Can it be a team of 2 for the whole 500-person organization? Make it 3 so that one person at a time can go on vacation.
Having a well-staffed platform team is hard. Not staffing the team to handle a realistic workload will make it hard for the platform initiative to succeed.
Platform is a blocker for flow, let’s add 2 more people to this 20-person team to go faster
On the other side, I saw overstaffed platform teams. Usually, the platform team helped on key deliveries, leadership saw the results, and now the team keeps getting bigger and bigger.
This will just slow everything down and create damaging behaviors, like:
Senior engineers only going to the platform team, which turns it into a kind of backend team while the product/frontend teams do simpler stuff.
Owning a lot of domains, from infrastructure to core product-facing features.
More internal and external communication, which amplifies the telephone game.
Making the whole process slower, increasing cognitive load for all the platform people, and product teams.
Stopping the platform team because it doesn’t work.
Not knowing how to measure your value, or not even trying
This is the hardest thing I have found in platform teams. How can we measure their value? It is hard, but not even trying to understand how you are impacting the product teams is a problem.
Because when the questions come (What is your impact? What ROI does the team generate?), it is already too late to start measuring.
Platform teams are about impact over time; you need the historical data and its trends to prove your value.
SysAdmin Team 👉 DevOps Team 👉 Platform Team. Same behavior, different name
Old behavior, new naming.
I saw this multiple times: teams adopt the latest trendy name without adopting the practices and underlying values.
It is true that it’s not 1 or 0; everything has an adoption curve. Yet, try to challenge how far along you are in adopting the new practices and values that apply to your context.
Try this instead
Define the platform team purpose and the domain
The platform team’s purpose is about helping internal teams, not about building an abstraction on top of K8s, nor about doing what others do not want to do.
Having a clear purpose and a domain will make it easier to say no. You will have a leverage point to avoid being the else clause.
Not only externally with the other teams, but internally too.
When someone proposes doing X with Kafka, you can challenge that and ask: How does it solve our users’ needs? Can we do it smaller? Is it up to us, or can we just provide a Terraform configuration so that others can use, for example, AWS Kinesis without depending on us?
Limit the platform team size, make the platform open for extension
Small platform teams create thin platforms. You can read more about this idea of thin platforms at https://github.com/TeamTopologies/Thinnest-Viable-Platform-examples.
Having between 3 and 6 platform team members helped me a lot by:
Avoiding being the else clause. The team is unable to absorb all the work, so we need to be mindful of the workload.
Limiting work in progress. Teams find 1 to 3 WIP items to be a reasonable pace; adding more WIP just increases communication, and throughput does not grow linearly with it. 1 to 3 WIP items for a platform team is great.
Keeping the domain small enough. If the domain needs to be bigger, we can consider adding another platform team, and then create a single, unified experience for the consumers.
We had a compute platform team and a mobile platform team, as well as a security platform team and so on: domain-specific platform teams.
Their cognitive load becomes more obvious and works as a sensor to understand blockers, and problems down the road.
Allow platform enhancements by other teams.
When teams require stuff from the platform that the platform team cannot absorb, but it makes sense to be done there because it will help other teams too (an identified product team need), allow them to contribute to the platform.
InnerSource is the use of open source software development best practices and the establishment of an open source-like culture within organizations for the development of its non-open-source and/or proprietary software.
We can adopt a lot of good practices from the open source community. By making it easier for product teams to contribute back to the platform following the platform team guidelines, we ensure that we aren’t a blocker for change.
An alternative is to allow people to bypass the platform for their specific use case when the platform doesn’t resolve their issue, always with the vision of standardizing the component once more teams have the same need. We don’t want 200 components doing the same thing in different ways.
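The ideas in this section (a clear domain, a WIP limit, and contributions from other teams) can be combined into a simple intake rule that replaces the else clause from the antipattern. What follows is only a minimal sketch under my own assumptions: the class, the enum values, the order of the checks, and the limit of 3 are all hypothetical illustrations, not a real library or the exact process we used.

```java
// Hypothetical sketch: every name here is illustrative, not from a real library.
final class PlatformIntake {
    enum Decision { ACCEPT, QUEUE, INNER_SOURCE, DECLINE }

    // Assumption: the upper end of the 1 to 3 WIP range mentioned above.
    static final int WIP_LIMIT = 3;

    /**
     * Decide what to do with an incoming request.
     * inDomain: does it fit the platform team's stated purpose and domain?
     * helpsOtherTeams: is it an identified need shared by other product teams?
     * workInProgress: how many items the platform team is working on right now.
     */
    static Decision triage(boolean inDomain, boolean helpsOtherTeams, int workInProgress) {
        if (!inDomain) {
            return Decision.DECLINE;      // outside our purpose: we are not the else clause
        }
        if (workInProgress < WIP_LIMIT) {
            return Decision.ACCEPT;       // in domain and we have capacity
        }
        if (helpsOtherTeams) {
            return Decision.INNER_SOURCE; // the requesting team contributes, following our guidelines
        }
        return Decision.QUEUE;            // in domain, but it waits until WIP frees up
    }

    public static void main(String[] args) {
        System.out.println(triage(false, true, 0));  // DECLINE
        System.out.println(triage(true, false, 1));  // ACCEPT
        System.out.println(triage(true, true, 3));   // INNER_SOURCE
        System.out.println(triage(true, false, 3));  // QUEUE
    }
}
```

The point is not the code itself but that each answer is backed by something you defined up front (purpose, domain, WIP limit), which makes saying no a lot less personal.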
Rotate developers between stream-aligned teams/product teams and platform teams
Based on Dynamic Reteaming by Heidi Helfand, we can leverage people moving from team to team.
We can learn about the product teams’ needs by moving a platform engineer to a product team, where they do daily work delivering new features and experience the product team members’ pains for themselves.
Then, they can bring that back to the platform team. This is a nice way of exploring user needs, and it also provides insights on how to leverage the platform, helping on the adoption side.
In the same way, we can move product team engineers to the platform team for some time. They can bring all their knowledge, the pains that prevent fast flow, and their product skills to the platform team.
If the platform name adds friction, do not use it
Platform is an overloaded word. Sometimes, we want to make sure leadership understands the value of a platform team. Yet, for them it might mean a business platform.
Instead of focusing on the naming, aim to focus on the team purpose: improve this or that. For example, product resiliency, product throughput, or whatever pain for the business is preventing the flow of value.
By focusing on what the team’s purpose is, how you aim to impact the business, and how we can see whether the team is performing, you can win way more support to drive the needed platform initiative.
Talk about money.
Each company is different, but here I want to give you some guidelines on how you can start understanding your cost and the value you (need to) provide.
Take the number of team members and multiply it by the average monthly salary. Then multiply that by 12 to get the annual cost of the platform team.
A common platform team could be 5 team members with an average salary of 4.000EUR/month. That is 5 × 4.000 × 12 = 240.000EUR per year.
Now, let’s add the opportunity cost of that team not working on a product initiative.
Usually, a product initiative aims to return at least 5x its cost.
So, the potential opportunity cost is around 5 × 240.000EUR = 1.200.000EUR.
You are aiming for high returns because most product initiatives fail, so the ones that succeed need to pay for all the failed initiatives and still provide revenue.
You don’t need to justify the 240.000EUR, you are justifying the 1.200.000EUR.
That’s why you should find leverage points that, in combination with developer productivity improvements, help product teams go faster.
You should aim to provide at least 5 times your cost, either by allowing new initiatives to go faster or by reducing cost.
So, you need to indirectly generate 1.200.000EUR annually.
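If you want to plug in your own numbers, the arithmetic above fits in a few lines. This is a minimal sketch: the class and method names are hypothetical, and it only counts salaries, exactly as the example above does.

```java
// Hypothetical helper for the cost arithmetic described above.
final class PlatformTeamCost {
    /** Annual salary cost of the team in EUR: members x monthly salary x 12. */
    static long annualCost(int teamMembers, long averageMonthlySalaryEur) {
        return teamMembers * averageMonthlySalaryEur * 12;
    }

    /** Value the team should generate per year for a given return multiple. */
    static long annualValueTarget(long annualCostEur, int returnMultiple) {
        return annualCostEur * returnMultiple;
    }

    public static void main(String[] args) {
        long cost = annualCost(5, 4_000);          // 240000
        long target = annualValueTarget(cost, 5);  // 1200000
        System.out.println("Annual cost: " + cost + " EUR");
        System.out.println("Annual value target: " + target + " EUR");
    }
}
```

With 5 people at 4.000EUR/month and a 5x multiple, this prints an annual cost of 240000 EUR and a value target of 1200000 EUR, matching the figures above.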
How do you measure this? With proxy metrics.
Let’s see a set of proxy metrics. Some are better than others, but start with the ones that are available to you and that you can influence for business impact.
Jira Tickets per week.
DORA metrics.
Team Cognitive Load. You can use Teamperature for this.
Opened customer support tickets.
Downtime.
AWS bill.
You name it.
You need to choose the ones that are most relevant to the business, while improving developer productivity.
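Whatever proxy metric you pick, what you show is its change over time, which is why you need the history. As a rough sketch, assuming you record one sample per month (the class name and the sample values below are made up for illustration), the relative change between the first and the last sample could be computed like this:

```java
// Hypothetical sketch: relative change of a proxy metric over its history.
final class ProxyMetricTrend {
    /**
     * Relative change between the first and last sample, in percent.
     * Negative means the metric went down, which is an improvement for
     * metrics such as downtime or the AWS bill.
     */
    static double percentChange(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples to see a trend");
        }
        double first = samples[0];
        double last = samples[samples.length - 1];
        if (first == 0.0) {
            throw new IllegalArgumentException("first sample must be non-zero");
        }
        return (last - first) / first * 100.0;
    }

    public static void main(String[] args) {
        // Made-up monthly downtime in minutes, oldest first.
        double[] downtimeMinutes = {120, 90, 75, 60};
        System.out.println(percentChange(downtimeMinutes) + "% change in downtime");
    }
}
```

With the made-up samples above, downtime went from 120 to 60 minutes, so the change is -50%. Translating that into money is specific to your business, so I leave it out of the sketch.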
Create a platform for what’s not available in the market. Stop productizing compute
I'm surprised how many times we create an AWS Fargate-like abstraction on top of Kubernetes, while the major developer pains are somewhere else.
It’s so hard to build a better DevEx than the public cloud because the DevEx goes beyond the API. It is the tooling, the documentation, the available training, and YouTube videos.
Do not underestimate the effort needed to deliver a capability.
Understand the user needs, and leverage off-the-shelf solutions.
The best platform is the one that you don’t need to maintain.