We talk to Matt Coulter of Liberty Mutual about the pioneering journey we went on bringing Liberty Mutual to serverless.
We’re joined today by Matt Coulter from Liberty Mutual, who’s very kindly agreed to take time out of his busy schedule to come here and talk to us. Matt, do you want to give a quick intro and tell us a bit about yourself? BTW, we are all good friends and have worked together for a long time!!
So for anyone who doesn’t know me, my name is Matt Coulter, I’m an Enterprise Architect in Liberty Mutual, working out of Liberty IT in Belfast. And for the past couple of years, I’ve been on this serverless first, well architected journey with a lot of our teams.
So I’ve been doing that primarily through a technology called AWS CDK. And that’s where people might have seen me doing things like open source work with CDK patterns, or running a conference called CDK Day.
Developer Community and CDK Patterns
Full disclosure! We worked together with Matt Coulter in Liberty Mutual for a long time. So we’re very familiar with a lot of this work. But it’s good to have a chat about it. So I’m interested in your opinion on this. It’s one of the things that we always talk about. How do we encourage a really strong developer community within a company? How do you find enabling engineers in a big enterprise or a big company?
So it’s an interesting question, and it has multiple sub components to how you answer it. Because when you talk about enabling engineers in a big company like Liberty Mutual, you’re talking about large numbers of engineers. So we could be talking about thousands across the globe, as opposed to maybe a couple of small teams.
Direction alignment, enabling constraints and walking the walk
The most important thing is that you need to have alignment on direction. So you need to make sure that everybody there knows where they’re going. That’s the very first and most important thing, and it’s what most people skip. If you don’t tell people the direction, everyone will just head their own way. And it’ll all actively work against itself, because it becomes a competition.
Then the second thing is you need enabling constraints, as Mark loves to call them. And that’s where you set up your guardrails so that it’s safe for people to experiment and go the direction you’ve set. They know there’s a maximum level of damage that they can cause.
And then the third thing, is you actually need to walk the walk that you’re setting up. I think it’s important to be able to show working software. So that you can say, I’m asking you to be Serverless First. This is how I think you should do it. Here are working production examples showing the way that you’re going to do it, as opposed to here’s just a high level theory.
Like you said, there are things like enabling constraints and setting the direction. Is that harvesting patterns? If you go across six squads in an enterprise or organisation, you’ll notice that they use common patterns. It’s how cloud providers identify their best managed services for things like queues, gateways or topics. Is there a process for getting everybody to come together and move in a certain direction?
There’s always a challenge between allowing divergence, and then what the right point is to tell everybody, you need to converge on the right solution. Suppose the direction is new and different to what the company has done before, but everybody’s going in the same general direction: say, for serverless, everybody is trying to use Lambda, DynamoDB and Step Functions.
AWS Well Architected Framework
If they’re all doing it slightly differently for a while, I think that’s okay. But then that’s where you need a process to go to the teams and say, I see similarities in Teams 1, 7 and 32. We need to start bringing them together to find the right way to do this. Specifically for cloud workloads, the Well Architected Framework has been the one common language to identify what is personal preference and what is industry best practice for how this should be done. Jumping into teams and doing Well Architected reviews is a big piece of how we can unearth the similarities and differences.
When we worked with Matt Coulter as architects, we could see right across the organisation of Liberty Mutual. That was really handy, because you could spot those patterns. 10 years ago, when we had different platforms, it was a nightmare. But today there’s a convergence on AWS, because everyone’s in the same space. So there’s no excuse. Even with two different languages, the same architectural patterns are there. So I think that has made it easier.
Well architected is all about consistency as well. Once you identify the patterns, well architected is the consistent benchmark of guidance and advice on how this thing should be set up and architected effectively. Once you get it in place, it should facilitate that convergence.
Software Accelerator and CDK
We have a strong process or culture of sharing reusable patterns that work. Mark did a talk at CDK Day about the software accelerator and how it’s used with AWS CDK. We allow anybody to contribute a pattern that’s working software to our accelerator. And then everyone else can just use that.
Now, whenever I talk about the divergence and convergence, every team could theoretically contribute a slight variation on the same pattern. So that’s where if you use the well architected framework, you can assess them all. And then you can say, of all these patterns, we don’t need to delete all of the duplicate ones. But this one here is the well architected one, this is the one we’re going to put at the top of the page. And this is the one we think everyone should contribute to and converge on. So that’s how we use it with those two things combined.
Guardrails are key for contribution
The guardrails are key to enabling that contribution as well. There were good, enabling constraints, or enterprise guardrails. So developers contributing had to adhere to the standards for the enterprise. That made it easy for them to contribute and get rapid feedback that their pattern or solution was adhering to standards.
It’s one thing to say, you have an Open Source community, but then make it impossible for anybody to get over the bar of contribution. It’s another to make that contribution easy. I’m happy with those internal blogs on contributing showing how easy and how quickly you can do this.
One of the key elements is to get developers through step one, two, and three, and then they can go the rest of the way themselves. With big waves of technology change there can be fear, uncertainty and doubt. But getting them started down a particular path opens it up for them to take the next step.
Matt, do you think Wardley mapping helped us to figure out what constraints we needed at Liberty Mutual? Or did it help engineers see the way forward? What’s your opinion on that, Matt? How much do you think that helped on this journey?
I would actually say the Wardley mapping that we did was a game changer at Liberty Mutual. I don’t want to undersell it. At a point in time, Mark and I were dropped into a new market with a couple of hundred developers all working on different apps. We knew the direction we wanted to go was serverless first.
By using Wardley maps, we could understand our current positioning and form a strategy for what we wanted to do. So we were able to see exactly all of the technical debt, the code liability between all of the teams and we could see how to have the most impact with the patterns and the enablement that we wanted to go for. I personally think that was key to the process.
It’s key to be able to talk people through that evolution axis. For example: see this big complicated thing you are having lots of trouble with? That needs to go into this new thing that you can replace it with. And those tens of thousands of lines of code can become practically nothing. That’s an easy sell.
Identifying the candidate list for pre built patterns
Mike, you mentioned earlier about spotting the patterns that you should coalesce around. The maps helped identify the commonalities and problems that engineers were having. So that becomes your candidate list for things we should consider for creating pre built patterns. We need to address the needs of developers on the ground.
I remember the process you guys went through. We did a similar one within our market. It’s okay if one squad is causing an inefficiency in the value stream, e.g. managing an ops server or database cluster. But when you come across that with 30 squads, then you begin to quantify the actual effect on the broader organisation. It’s a duplication of effort and lost developer hours in terms of productivity.
At the time, they were largely tech stack Wardley maps, where we mapped the various tech stacks leveraged across the squads. A developer enablement team with that information can penetrate the market by identifying which patterns are going to add most value. I am sure you remember which pattern you released first and if it took off.
Wardley Maps applied at Liberty Mutual
There was a point in time where all the Wardley maps were layered on top of each other for every team, making an uber map. Every map is imperfect and wrong. We were just as biased for each map, so when you layered them on top of each other, they were comparable. You could draw the line to say we want you to be close to your customer. And you could see how much of the market was doing undifferentiated heavy lifting!
Then you could physically draw a line and say, we want you to transition that from there to here. Drawing that line and making it visual is different from phoning somebody and saying, I want you to move.
A couple of times the person drawing the map would say ‘this thing is really important’. And then when they drew the map, it was way down at invisible commodity, and they would ask, ‘why is it sitting there? We don’t need to be doing that.’ They made the realisation themselves: ‘ah wow, stop doing that!’.
Wardley mapping is like a pair of binoculars, because you can see for miles. It gives you a huge vision of what’s happening.
As Matt mentioned, when we dropped into that new market, it (Wardley mapping) was a rapid accelerator to understand what we had got ourselves into. Very quickly you get up to speed. As an architect, I can understand the landscape. Now I can understand where I can make an impact. I can understand the next right moves. For any aspiring leader or architect, mapping gives situational awareness to do that next right thing.
Matt Coulter and CDK Patterns
So there was a Value Flywheel effect there. But as you mentioned at the top of the chat Matt, CDK was massive. I personally think that was an unbelievable accelerator. After all, CDK (the Cloud Development Kit) and those patterns enabled you to create a piece of infrastructure in seconds. Do you think you would have made the same progress without CDK?
It is a fascinating question. For anybody who’s watching this and doesn’t know CDK, the difference between it and traditional infrastructure as code is that CDK is focused on building your infrastructure through programming languages like TypeScript, Python, or .NET. The whole point is to bring the cloud to the developer, as opposed to making the developer learn to manage things like CloudFormation templates, which could be thousands of lines of YAML. And on one hand, the CloudFormation templates are beautifully succinct.
AWS CDK construct
On the other hand, if you’re a developer who spends every day working with business logic, whenever you transition to the YAML file, personally, my head goes and I lose track of it. Whenever I first saw AWS CDK, I give credit to Mark, because he kept dropping hints about it for weeks, so he got me to go ‘what is this CDK thing?’. I took a really common use case for us at the time, which was a private API Gateway with a custom authorizer Lambda. That was about 1500 lines of CloudFormation template. And on top of that, it also had a load of quirks. (Since then, they have fixed a lot of the quirks.)
But you had to manually modify pieces of the CloudFormation every deployment, otherwise, your gateway didn’t deploy. I took that and built an AWS CDK construct. This was pretty much a week after AWS CDK went GA. I was able to reduce that whole thing down to a maximum of 14 lines. That was really five lines of code. It enables you to write local unit tests.
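To make the idea of a construct concrete, here is a minimal sketch in plain TypeScript. It is a toy model and not the AWS CDK API: the `privateApiWithAuthorizer` function, the `PrivateApiProps` fields and the simplified property names are all hypothetical, chosen only to show how a few lines of typed input can expand into several wired-together resource declarations.

```typescript
// Toy model only. Every type and function here is hypothetical and is NOT the AWS CDK API.
// It mimics the idea of a construct: a small typed input that expands into
// several CloudFormation-style resource declarations (property names are simplified,
// so the output is not a deployable template).

type Resource = { Type: string; Properties: Record<string, unknown> };
type Template = { Resources: Record<string, Resource> };

// Hypothetical inputs for a private API fronted by a custom authorizer.
interface PrivateApiProps {
  name: string;
  vpcEndpointId: string;
  authorizerHandler: string;
}

// Expands the props into three related resources and wires the references
// between them, so the caller never writes that boilerplate by hand.
function privateApiWithAuthorizer(props: PrivateApiProps): Template {
  const apiId = `${props.name}Api`;
  const fnId = `${props.name}AuthorizerFunction`;
  const authId = `${props.name}Authorizer`;
  return {
    Resources: {
      [fnId]: {
        Type: "AWS::Lambda::Function",
        Properties: { Handler: props.authorizerHandler },
      },
      [apiId]: {
        Type: "AWS::ApiGateway::RestApi",
        Properties: {
          Name: props.name,
          EndpointConfiguration: {
            Types: ["PRIVATE"],
            VpcEndpointIds: [props.vpcEndpointId],
          },
        },
      },
      [authId]: {
        Type: "AWS::ApiGateway::Authorizer",
        Properties: {
          Type: "REQUEST",
          RestApiId: { Ref: apiId },
          AuthorizerFunction: { Ref: fnId },
        },
      },
    },
  };
}

// Usage: a few lines of input produce the whole (simplified) template.
const template = privateApiWithAuthorizer({
  name: "Orders",
  vpcEndpointId: "vpce-example",
  authorizerHandler: "index.handler",
});

console.log(Object.keys(template.Resources));
```

Because the result is plain data produced by ordinary code, a local unit test can assert on it without deploying anything, which is the kind of testing the real CDK makes possible for infrastructure.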
Removing cognitive burden for Developers
And it brought in our DevOps practices, because from a developer perspective, it is just TypeScript. And we know how to handle TypeScript. So I think that shift gave us a kickstart in the developer mindset as opposed to sitting around to tell our CloudFormation horror stories of the things that we’d broken and couldn’t backtrack off.
It just made it easy. You could IM someone a couple of lines of CDK and it would work. That would never happen in CloudFormation.
The timing was perfect because a lot of developers were starting to read about the cloud, but it was terrifying to them. Like the old stories about running up big AWS bills on their private accounts. When CDK Patterns came along, it demystified the cloud and lowered the cognitive burden and the barrier to entry.
And one thing you didn’t really mention is that the pattern you created was incorporated into our CI/CD pipeline, it followed all the enterprise standards and all the checks and balances that an enterprise needs. So it removes that friction for developers to experiment, and just try it out without the fear of blowing up the cloud or wondering, ‘am I gonna take down all internal systems?’
Having that developer community in place also helps with CDK, and harnessing the economies of scale that you can get in a big organisation. When you’ve got that funnel with the accelerator and platform, developers can leverage other teams’ constructs, and also contribute certain things back.
The CDK patterns that you set up, with an open source funnel, channel and contributions from the broader community, helped centre people. So they didn’t go off and start from scratch constantly, or go off in 50 different directions. You don’t want an organisation with 50 different flavours of the same construct.
What you said there is key. I originally described why I got interested in AWS CDK. But since that time, I have seen dozens of amazing technologies fail, because they didn’t build the community around them. They’re sitting there shining in the corner, and nobody’s noticing them.
The importance of storytelling
The vast majority of the work that I’ve been doing over the past couple of years is storytelling, and actually getting people engaged. So instead of a lone person trying to effect change across hundreds of people, the only way I knew how to do it was to take the best of Cloud and bring it inside Liberty.
That’s why I started looking at how we should be building serverless architectures. What are the industry experts saying? They were people such as AWS Heroes like Jeremy Daly. It was key to go out and look at the best patterns, build them open source and say this is the pattern, let’s discuss it, and then bring a slightly modified version of that pattern into Liberty, tweaked for compliance, audit and our general needs.
It changed the conversation from trying to custom build everything for us to let’s engage with the wider community. And let’s try and build things together. I do think if I was trying to do it again, that part has been a massive success. And I wouldn’t change anything about that story. I do think trying to get the developers to look outside is the big difference.
Going external for existence proof
That was a super multiplier because by going external, the influence internally is a pattern of existence proof. That got you loads of press with AWS and Werner Vogels. At the Summit last year, he gave you a couple of name checks and that amplifies it again. It’s amazing to see that validation and that builds up a lot of authority in the space.
For the first time, developers could find the answers by doing a Google search and finding cdkpatterns.com or the pattern generator or technology that they were using internally. Previously, in big enterprises, the software that teams worked on could be so bespoke that there wasn’t much external help.
Now there are videos, courses, YouTube, patterns, and websites supporting the general direction. That is the game changer. It’s self study with ‘A Cloud Guru’ courses and Pluralsight supporting the direction of travel.
CDK Construct Example
There was a cracking example, and I don’t like going into specifics, but I remember one scenario where we’d been working a long time on getting a particular capability enabled. And it was taking a wee bit of time. But, in the background, we had a squad who was effectively standing up a higher order CDK construct. Don’t quiz me on the level! Would that have been an L3 construct? It was an L3 construct sitting ready to go.
So as soon as that got approval, it was pushed out into the broader portfolio and was ready to be consumed by the whole organisation. That was a very powerful enabler. In the past the approach would have been: ‘I need to educate these 40 engineers on the specifics and nuance of the architecture’, and that would have been a far slower rollout. Instead, it was literally a construct ready to consume, which was phenomenal.
Open source patterns
I personally got value from open source patterns. As an architect, when I want to introduce a new capability into the org, like X-Ray for doing your distributed tracing across your architectures, I can build the CDK pattern and put it out there to find the flaws, which are then documented on the CDK pattern.
So before I bring it into Liberty for teams to use, I’m having the conversations with AWS (as the enabling architect for this account), on what I need AWS to do before we can use this. That takes the frustration away and prevents every team hitting the same issue, getting annoyed, leading to a collective buildup of frustration from all the developers. Whereas I was taking it for them. And they knew they could come to me and ask that question. And I think that works really well.
It’s flow efficiency by removing impediments continuously. You’re going ahead of the game by removing all the impediments, so when the teams adopt X-Ray, in this example, they’re not going to get all these impediments.
The Value Flywheel Effect
It’s the same pattern that Amazon uses. It’s ILC: the innovate, leverage and commoditize cycle. We talk about the Value Flywheel Effect. And the fact that you build clarity of purpose, you map to remove the impediments, drive forward the technology and finally you go for long term value. So it’s that same cycle.
Amazon does that in their own space. But you were doing that as well with an exciting pattern, by leveraging and commoditizing it and letting the developers move up. So Matt, in your opinion, if you were a company watching this how can you get going? What does it take for someone to get started?
If you’re only getting going, I think you can skip a few steps. A lot of things have evolved over the past couple of years. If you study where the industry has been going, a lot of big companies will tell you what they’ve been doing.
Empowerment, enablement and lifting your developers up
The story is about empowerment and enablement and lifting your developers up. But it is also about making sure you have those enabling constraints to keep them together. And that’s why you hear about Spotify’s Golden Path Model specifically paving the road for an implementation. It’s okay if you want to deviate from it. But if you deviate, you’re on your own. Whereas if you stay on the golden path, they’ll keep it up to date and maintained.
So if I was starting today, I would go in and do Wardley mapping, like we mentioned earlier and say what is our landscape, what is our golden path going to be? And make it so easy to follow the golden path that the developers won’t even think about going elsewhere.
Create the right environment
So it’s not a case of saying you have to do this. It’s a case where, if someone suggests not doing it, everyone will say, ‘why would we not do it? This is the right thing to do.’
That’s the environment you’re trying to create. And that’s based on the AWS Well Architected Framework and Serverless First ideologies, because it enables you to specifically get down to the golden path.
If you go for a containerized route, you can’t get as detailed with what your golden path is going to be. Because it’ll be like: ‘we use Spring Boot’... ‘but we use microservices’. Whereas if you go with serverless, you can say these are the core technologies we want to leverage in these instances and this is our orchestrator. It really does show the commonalities between teams and makes the golden path easier.
You can be finely grained in your enabling constraints.
Time to value is key
I think fast time to value is key. When developers are starting to embrace the golden path and patterns, the speed of delivering value is unheard of. And that builds momentum that keeps the flywheel turning. That means they don’t want to get off that golden path because they’re seeing value being delivered really rapidly.
You also create that avenue for off roading as well. It’s okay for teams to go and experiment on new routes or pathways and they accept the fact that they’re just going to go slow. But if it works out, there’s a wee bit of investment from the community when they bring that back in and extend the path offering.
You can do that when you’ve got the community and those facilities that funnel and harvest the efforts from the squads. It’s another flywheel thing – once you do it, it becomes self-sustaining and looks after itself. Culture leads to that as well.
So that’s the craic. Thanks very much Matt. We appreciate your time. I hear you have a book coming out – your CDK book.
So funnily enough it’s a relevant topic. You can go to thecdkbook.com. A few of us got together and went through the details of all you’ll ever want to know about AWS CDK. It’s out now.