Building a product vision, a team, and replacing Ktor with Spring Boot incrementally
Fintech Engineering Strategy. Post II
The Fintech post series aims to share my personal experience as an engineering manager and later as head of engineering: the challenges, the decisions, and the good and bad outcomes they had. The content has been adapted to keep the decisions without disclosing internal information.
Fintech Engineering Strategy Post Series
Post II. Building a product vision, a team, and replacing Ktor with Spring Boot incrementally. [This post]
Post VI. Reducing the chaos before addressing the complex socio-technical system
After the first deadline, we had time to understand the product vision, build the team, and start taking steps to reduce the technical debt in the areas that could impact the coming deliveries.
The first thing that happened was that the people who had joined us for the delivery went back to their teams, so we opened two positions. We filled them through internal rotations from another team within the tribe, as permanent moves, so that the new members would already have the context and be familiar with the source code.
The product that we owned as team A, let’s call it product A, is a legacy product, at least one year old, that was in production for a small set of users before becoming generally available to the whole user base.
The tribe had three products, let’s call them A, B, and C. The order in which things happened is relevant.
The current tribe started as one team.
The team’s goal was to build three products, expecting to keep complexity low by delegating the high-complexity parts to external partners.
At the beginning, product A was the entry point for all the other products.
The products turned out to be more complex than expected.
The initial team left and three teams formed to drive those products forward.
Context
Team
Architecture
Macro
As you can guess, a change in any product will impact many shared services, creating the need for coordination between teams and adding processes to prevent a change in one product from breaking the others.
Micro
The service has a “ports and adapters architecture” using Ktor and (IIRC) Exposed for the persistence layer.
There is a lot of duplicated code in the form of DTOs between the presentation layer and the persistence layer, while the domain logic is almost anemic.
We need to follow the HTTP calls between services to understand the user journey as a whole.
There is a good enough test suite that covers the main use cases for each service.
Team dynamics and characteristics
The leadership and the team members that joined the team shared their pain points and I was able to see some areas of attention such as:
Lack of order and unpredictable results.
Stress.
Managing individuals.
Heroics.
Onboarding takes three months before people feel confident about changing code.
Using the Wardley Mapping Doctrine, I could identify some behaviors as Phase I.
Using the Kanban Maturity Model, I could also identify the team as ML0 Oblivious.
The business challenge
We need to push the product forward to make it generally available. We need predictability and the ability to introduce change more easily.
Engineering Strategy
Diagnosis
We detect low-maturity engineering practices, along with several architectural decisions that are slowing us down today given the team's knowledge. The team members are split into frontend and backend, which creates the need for coordination and for splitting user stories into subtasks that require integration.
The business needs have evolved, and product and engineering haven’t kept up because of the long-standing lack of people working on the product.
There are team behaviors based on the previous context that might not apply to today’s team needs.
Direction
We will start behaving as a team and move away from individual heroics. We will work on creating processes and practices that lead to predictable outputs.
The practices that we adopt are aligned with XP and with creating end-to-end ownership of each feature to be delivered, by up-skilling team members to become full-stack developers.
Coherent Actions
Isolate the team
We will keep the team isolated from the rest of the tribe deliveries so that we can focus on adopting the right practices and mature the team before we’re exposed to organizational goals.
Adopt Spring Boot over Ktor and Exposed
We need faster onboarding and lower maintenance for our services. By adopting Spring Boot, we expect to gain off-the-shelf capabilities such as resilience mechanisms and integrations with commodity solutions like observability, queue systems, and testing frameworks.
Change the job descriptions and hiring process
Move the job descriptions from backend/frontend specializations into a product engineering job description, and adapt the hiring process for the new job expectations.
Start adopting the full-stack mindset and upskill team members
We need to train people inside the team in the technologies that they aren’t familiar with and help them own a user story end to end instead of waiting on the integration between components. We expect a lower lead time for changes and a lower cycle time.
The training needs to be followed by slack time to practice with the current code base. Pairing is encouraged between people with different skills to boost knowledge transfer.
Remove established processes until things break
We are a new team, and we can decide how we work within the company principles and guidelines. When in doubt, challenge the status quo. It is better to find the system’s problems by making things fail fast than to move slower and detect system flaws later.
Those problems can appear at different levels, such as the application, continuous integration, monitoring systems, and how we plan work. Long story short, anywhere in the socio-technical system that produces the value.
Adopt DORA metrics
We will measure the impact of our decisions with DORA metrics. Those will be available as weekly metrics so that we can assess how fast we are improving. The goal is the speed of improvement, not the current state of the DORA metrics.
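To make the idea of weekly metrics and "speed of improvement" concrete, here is a minimal Kotlin sketch. It is not the tooling we used: the `Deployment` record, its fields, and the function names are assumptions made for illustration, and it covers only three of the DORA metrics (deployment frequency, lead time for changes, and change failure rate).

```kotlin
import java.time.DayOfWeek
import java.time.Duration
import java.time.Instant
import java.time.LocalDate
import java.time.ZoneOffset
import java.time.temporal.TemporalAdjusters

// Hypothetical record of one production deployment.
data class Deployment(
    val committedAt: Instant, // when the change was committed
    val deployedAt: Instant,  // when the change reached production
    val failed: Boolean,      // whether the change caused a production failure
)

// Weekly snapshot of three DORA metrics.
data class WeeklyDora(
    val weekStart: LocalDate,       // Monday of the week (UTC)
    val deployments: Int,           // deployment frequency for that week
    val medianLeadTime: Duration,   // median commit-to-production time
    val changeFailureRate: Double,  // failed deployments / all deployments
)

// Monday (UTC) of the week in which the instant falls.
fun weekStartOf(instant: Instant): LocalDate =
    instant.atZone(ZoneOffset.UTC).toLocalDate()
        .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))

// Median of a non-empty list of durations.
fun median(durations: List<Duration>): Duration {
    val sorted = durations.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid]
    else sorted[mid - 1].plus(sorted[mid]).dividedBy(2L)
}

// Groups deployments by week and computes the weekly snapshot, oldest week first.
fun weeklyDora(deployments: List<Deployment>): List<WeeklyDora> =
    deployments
        .groupBy { weekStartOf(it.deployedAt) }
        .toSortedMap()
        .map { (week, items) ->
            WeeklyDora(
                weekStart = week,
                deployments = items.size,
                medianLeadTime = median(items.map { Duration.between(it.committedAt, it.deployedAt) }),
                changeFailureRate = items.count { it.failed }.toDouble() / items.size,
            )
        }

// Week-over-week change in median lead time; negative values mean we are getting faster.
fun leadTimeTrend(weeks: List<WeeklyDora>): List<Duration> =
    weeks.zipWithNext { previous, current -> current.medianLeadTime.minus(previous.medianLeadTime) }

fun main() {
    val start = Instant.parse("2024-01-01T08:00:00Z")
    val sample = listOf(
        Deployment(start, start.plus(Duration.ofHours(48)), failed = false),
        Deployment(start, start.plus(Duration.ofHours(24)), failed = true),
        Deployment(start.plus(Duration.ofDays(7)), start.plus(Duration.ofDays(7)).plus(Duration.ofHours(12)), failed = false),
    )
    val weeks = weeklyDora(sample)
    weeks.forEach(::println)
    println(leadTimeTrend(weeks))
}
```

The trend function is the part that matches the goal stated above: what we look at is how the numbers move from one week to the next, not the absolute values.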
Outputs and Outcomes
Desired
We improved multiple parts of the doctrine as best practices. Yet, we still had a lot of work ahead to master Phase I of the Wardley Mapping Doctrine. The good part is that we created awareness and the speed of improvement was impressive.
We also started behaving as ML1 Team-Focused according to the Kanban Maturity Model. We still had delays, defects, and re-work. We started behaving as a team, but we still had individual heroics in some cases.
The behaviors were relayed to me in order to create new behaviors and dynamics. We needed to keep practicing them so that they became part of the team culture.
Strategically, we wanted this team to have a maturity level and practices different from those of the other teams within the tribe, so that we could start maturing one team and then propagate this engineering culture to the other teams incrementally.
Incrementally migrated from Ktor to Spring Boot in Production
We discussed multiple ways to migrate from Ktor to Spring Boot.
All at once. Blocking new features for at least one month.
Start a new microservice and run both side by side for some time. We would implement the new features in the new one, and then slowly migrate previous features to the new microservice.
Adopt Spring Boot in the existing Ktor service, and migrate the features to Spring Boot only when necessary.
We did some spikes and finally chose the third option. Ktor and Spring Boot live in the same repo and the same service. This way, we could reuse the whole codebase, with no need to create new infrastructure or to route features across services depending on what was already implemented.
We exposed Spring Boot as the entry point and used feature flags to decide, per use case, whether Spring Boot handled it or whether we called Ktor, running on the same machine on localhost, to perform the old behaviour.
As a result, it didn’t matter if a use case was never migrated: the technical debt and the cost of running both services in the same Docker container were minimal.
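The flag-based routing described above can be sketched in plain Kotlin. This is an illustration under stated assumptions, not our production code: the `Request`, `Response`, `FeatureFlags`, and `MigrationRouter` types and the use-case names are hypothetical. The legacy handler stands in for the HTTP call to the Ktor server on localhost, and no Spring Boot or Ktor APIs are used so that the example stays self-contained.

```kotlin
// Hypothetical request/response types; a real service would use the framework's own.
data class Request(val useCase: String, val body: String)
data class Response(val status: Int, val body: String, val handledBy: String)

// Handles one use case.
fun interface UseCaseHandler {
    fun handle(request: Request): Response
}

// Feature flags, assumed here to be a simple in-memory lookup of enabled use cases.
class FeatureFlags(private val migrated: Set<String>) {
    fun isMigrated(useCase: String): Boolean = useCase in migrated
}

// Stands in for the Spring Boot entry point: it serves migrated use cases itself
// and delegates everything else to the legacy handler.
class MigrationRouter(
    private val flags: FeatureFlags,
    private val newHandlers: Map<String, UseCaseHandler>,
    private val legacy: UseCaseHandler,
) {
    fun route(request: Request): Response {
        val handler = newHandlers[request.useCase]
        // Use the new implementation only if it exists AND its flag is on.
        return if (handler != null && flags.isMigrated(request.useCase)) {
            handler.handle(request)
        } else {
            legacy.handle(request)
        }
    }
}

fun main() {
    // In the real setup this would be an HTTP call to Ktor on localhost.
    val legacyKtor = UseCaseHandler { req -> Response(200, "legacy:${req.body}", "ktor") }

    val router = MigrationRouter(
        flags = FeatureFlags(setOf("transfer")),
        newHandlers = mapOf(
            "transfer" to UseCaseHandler { req -> Response(200, "new:${req.body}", "spring") },
            // Implemented but still behind a disabled flag.
            "login" to UseCaseHandler { req -> Response(200, "new:${req.body}", "spring") },
        ),
        legacy = legacyKtor,
    )

    println(router.route(Request("transfer", "a")).handledBy)
    println(router.route(Request("login", "b")).handledBy)
    println(router.route(Request("statement", "c")).handledBy)
}
```

Falling back to the legacy handler by default is what makes an unmigrated use case a non-issue: a missing implementation or a disabled flag simply means the old behaviour keeps serving the request, and turning a flag off is an immediate rollback.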
Undesired
People leaving
The person with the most mobile experience left the company, along with a backend developer. Both were internal rotations, so a lot of context was lost.
Shortly after their departure, a new product engineer joined the team.
Sharing how to evolve incrementally didn’t stop other teams from performing a full rewrite from Ktor to Spring Boot
We captured our learnings and documented the process so that other teams could benefit from this incremental approach. Unluckily for us, some teams chose to do a full migration to Spring Boot and froze product delivery until the migration was done.
Learnings
Isolating a team and giving time for learning was key to adopting a full-stack mindset.
When people are hired into one role or specialty and you then adopt a new approach, they might either get on board or start looking for a workplace that suits their needs.
Sharing an incremental approach in a sync meeting and documenting seemed enough to help others follow the example, but it didn’t happen. Each team had its own approach to accomplishing a coherent action.
I have mixed feelings about what happened. On one hand, we accomplished the tribe goal of adopting Spring Boot. Good! On the other hand, those approaches needed a feature freeze until the migration was completed. The risk of an up-front full migration was way higher than that of an incremental approach, and it could cause a much longer delay in delivery. What happens if the migration takes longer? What happens if we don’t deliver value for more than one month? The business was already concerned with the delivery speed. Missing the expected migration deadline would take a toll on the trust in the engineering team to accomplish deliverables, creating resistance to change in the future.
IMHO, an unnecessary risk given an alternative such as an incremental approach.
The new joiner’s onboarding time and complexity were already lower thanks to the adoption of Spring Boot.
The dynamics in place were already team-oriented instead of individual hero-oriented. For the new joiner this was simply the norm, so there was no resistance to the new ways of working, unlike with the existing team members who decided to leave.
The hiring process took into account the behaviors that we were looking for, which contributed to a faster adoption of the new engineering culture.
Changing the hiring process and job descriptions was way more energy-consuming than expected. I had to coordinate the new job expectations with higher leadership and it required a lot of back and forth since those expectations were outside my impact zone.
Having a highly collaborative People department was key for the success of implementing a new hiring process and job expectations within a time frame of one month. Kudos to the best People professional I worked with.