There’s a fantastic article posted on Medium by the BBC Digital Products Head of Architecture, Matthew Clark, showing the serverless profile at BBC Online. I enjoyed reading it because I can identify with the approach the BBC team took and many of the challenges they encountered. It is a great example of a team coming together to deliver a quality technology change, quickly and effectively.
It is worth remembering that BBC Online is global and runs across all devices, so when a big event happens, the site is flooded with requests. It needs to be a resilient cloud application. However, because BBC Online is a pioneer and the site has been live for over a decade, it was deployed on “legacy cloud”. The team therefore had to embark on a transformation journey, moving to AWS Serverless and giving the site a serverless profile.
BBC Online’s transformation checklist
1) Don’t solve what’s been solved elsewhere
This is classic Serverless. Managing virtual machines is a pain and expensive – let AWS do it. Maximise the work not done.
2) Remove duplication (but don’t over-simplify)
Be brave when you migrate. The BBC restructured teams and designed for reuse. They didn’t go purist either, and recognised some differences. You need to be pragmatic when building large systems and moving to a serverless profile.
3) Break the tech silos through culture & communication
I must admit, when I started the article and read “hundreds of people took several years”, I did think “oh dear”. But when this section described the complexity involved and the challenge of team silos, I figured that a timeline of several years was reasonable. Tech is easy; inter-tribe communication is like splitting the atom. This has to be the most challenging problem in software.
4) Organise around the problems
This was brilliant. I think this diagram could be packaged and sold as a serverless profile “transformation playbook”.
5) Plan for the future, but build for today
Timeless – one of my favourite sayings from Software Engineering years ago is YAGNI – You Ain’t Gonna Need It. Engineers are always tempted to build things “just in case we need it” – you don’t. This brave principle will save you money.
6) Build first, optimise later
This was the kicker. There are lots of advantages to a Serverless profile, but performance is often cited as an issue. I think the team were correct to worry about performance, but I think they are being very humble here. I reckon they had a really clean and tight design. If that’s the case, tech like Lambda will scale to unbelievable levels.
“Performance using serverless has been excellent. By not optimising too early, we saved a huge amount of effort.”
7) If the problem is complexity, start over
A compelling illustration of Gall’s Law here. Companies are often unwilling to start over, as it may make them look stupid. I assume that the environment is right at the BBC and the engineers felt supported (and thus able to fail fast).
“A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.”, John Gall.
8) Move fast, release early and often, stay reliable
This is just classic DevOps and well covered in Accelerate by Nicole Forsgren. The DORA four key metrics advise that you can move fast AND have high stability. Well understood but not widely adopted.
Lambda scales better than EC2.
The architecture section is worth a read, but I’ll not repeat it here.
Review of the serverless profile at BBC Online
Engineers often think of performance as:
- On-prem is fastest as it’s in your data centre and close to everything.
- EC2 (hosted compute on AWS) is compute in the cloud, so fast, but has a slight network delay.
- Serverless – slow startup as it has a network delay AND it has to spin up a container.
The BBC story perfectly illustrates Serverless First, which is the opposite of the above:
- Well-Architected Serverless is your first option and will work for most cases.
- You can fall back to EC2 for some particular use cases (the list is growing shorter as AWS improves Lambda and other services).
- On-prem is still an option for some legacy workloads.
And from the article:
“Increasingly, the rendering happens on AWS Lambda. About 2,000 lambdas run every second to create the BBC website; a number that we expect to grow. As discussed earlier, serverless removes the cost of operating and maintenance. And it’s far quicker at scaling. When there’s a breaking news event, our traffic levels can rocket in an instant; Lambda can handle this in a way that EC2 auto-scaling cannot.”
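To make “rendering on Lambda” a bit more concrete, here is a minimal sketch of what a page-rendering Lambda function could look like in Python. This is not the BBC’s code. The article store, the paths and the `render_page` helper are all made up for illustration, and I’m assuming an API Gateway proxy-style event with a `path` field. The point is simply that each invocation is a small, stateless function, which is what lets the platform run thousands of copies side by side.

```python
import html

# Hypothetical in-memory content store. A real service would fetch this
# from an upstream API or database; it is hard-coded here for illustration.
ARTICLES = {
    "/news/example": {
        "title": "Example headline",
        "paragraphs": ["First paragraph.", "Second paragraph."],
    },
}


def render_page(title, paragraphs):
    """Build a small HTML document, escaping all text content."""
    body = "\n".join(
        "    <p>{}</p>".format(html.escape(p)) for p in paragraphs
    )
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <head><title>{}</title></head>\n"
        "  <body>\n"
        "    <h1>{}</h1>\n"
        "{}\n"
        "  </body>\n"
        "</html>"
    ).format(safe_title, safe_title, body)


def handler(event, context):
    """Lambda entry point (assumes an API Gateway proxy-style event)."""
    path = event.get("path", "/")
    article = ARTICLES.get(path)
    if article is None:
        # Unknown path: return a plain-text 404 response.
        return {
            "statusCode": 404,
            "headers": {"Content-Type": "text/plain; charset=utf-8"},
            "body": "Not found",
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html; charset=utf-8"},
        "body": render_page(article["title"], article["paragraphs"]),
    }
```

You can call `handler({"path": "/news/example"}, None)` locally to see the response. Because the function keeps no state between calls, nothing in it needs to change when traffic jumps; scaling is the platform’s job.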
We have seen stories like this over the past few years, and I predict we will see many more. I reckon many companies are currently creating large, serverless solutions. Exciting times!
Here’s the BBC Online article.
I’ve always been a fan of this team. I spent a lot of time looking at and reusing the BBC GEL (Global Experience Library) many years ago – it was ahead of its time, but that’s another story.