
How to Become a Cloud Engineer: The Ultimate Guide

We talk you through the 8 tenets, or principles, in our ultimate guide to becoming a Cloud Software Engineer.

Top 8 Tenets or Principles of Cloud Software Engineers

Skills you need to become a cloud engineer

We joke that it took 10 years to research our 8 tenets. But they took only 10 minutes to write.

These 8 tenets or principles apply to a ‘high performance serverless first team’. But they could also apply to a high performance modern cloud team.

1. Chase a business outcome or a KPI

Teams should know what business KPI they’re working towards.

You should be able to tap a cloud software engineer on the shoulder and have them tell you what they’re working on, and what business impact that work is going to have.

This acts like a guardrail. It allows you to move with speed while still making good decisions, because you really understand your priorities and how you prioritise. The only way to do that is by tapping into the success of the product.

Use North Stars to track your business success

North Stars track the way from profitability or business success to the work you’re doing. And how you’re having an impact. They help you to make good decisions and move fast.

Don’t share these principles or tenets with teams without giving them guidance on how to achieve or align to them. We run North Star workshops to help teams get a good grasp of their KPIs and the North Star for their business.

There’s a really simple thought behind that. You ask a team what their KPI is. If the team says ‘I don’t know’, then you run a North Star workshop. If nobody can think of a KPI after the workshop, then the next step is to ask whether the team should be doing this work at all.

This does not mean they are a bad team. They are being asked to do the wrong stuff. 

2. Be secure by design

Our number two principle is ‘Be secure by design’. This principle has worked to secure our development for a long time. And then AWS came out with ‘Secured by Design’. So we borrowed it.  Don’t do security afterwards. Bake it in from the start. It’s everyone’s job. Period/full stop.

Security is a difficult thing to retrofit. Use threat models and get it done early. Try to solve for what you can and what you know.

Bake it into all your engineering practices. And bake it into your pipelines. Shift it all left and help to enable teams to be more secure.

Don’t say ‘it’s too hard, so we’re not doing it’. Start today and bake it in!

If you are aligned with the business, and business success is number one, then being secure is number two. Security carries a risk profile if you don’t do it right, and an insecure solution can be an existential risk for the business. Number one and two are in the right order here.

3. Keep a high throughput of work

Our third principle or tenet is ‘Keep a high throughput of work’. It is borrowed from the DORA metrics in the ‘Accelerate’ book by Nicole Forsgren, Jez Humble and Gene Kim. This principle looks at high throughput, which is measured by deployment frequency and lead time.

The ‘Accelerate’ book gives us the language and external validation for what we need to say to teams. We can point to the book and the DORA metrics as an actionable way to quantify velocity, development time, deployment frequency and lead time.
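To make the two throughput measures concrete, here is a minimal sketch of how a team might calculate them from its own deployment history. The `Deployment` record shape, the function names and the sample data are all assumptions made for illustration. They are not taken from ‘Accelerate’ or from any DORA tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when the change reached production


def deployment_frequency(deployments: list[Deployment], window_days: int) -> float:
    """Average number of production deployments per day over the window."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return len(deployments) / window_days


def median_lead_time(deployments: list[Deployment]) -> timedelta:
    """Median time from commit to production across the deployments."""
    if not deployments:
        raise ValueError("need at least one deployment")
    seconds = [(d.deployed_at - d.committed_at).total_seconds() for d in deployments]
    return timedelta(seconds=median(seconds))


# Made-up sample data: three deployments in a 30 day window.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=1)),
    Deployment(start + timedelta(days=7), start + timedelta(days=7, hours=2)),
    Deployment(start + timedelta(days=14), start + timedelta(days=14, hours=4)),
]
frequency = deployment_frequency(sample, window_days=30)
lead_time = median_lead_time(sample)
print(f"{frequency:.2f} deployments per day, median lead time {lead_time}")
```

The median is used for lead time so that one unusually slow change does not hide how quickly most changes reach production. A team could just as reasonably track a mean or a percentile.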

For serverless teams, it is key to make changes fast and frequent. And always learn and drive observability. 

It drives the right behaviour for removing impediments, through fast flow and really questioning our dependencies. Why can’t we be in the elite category? Or why do we have this dependency on this group, or another group? And why can’t we deploy on demand? Why can’t we deploy multiple times a day? It really helps teams to think in the right way.

Speed is stability

I remember talking to a monthly release team who were angry and didn’t want to do extra work. They felt their throughput was one release per month, or 12 a year, and they did not want to measure it. But, number one, they had already measured it! So it wasn’t going to be much work for them. And number two, what would happen if they got a zero day security vulnerability? They would have to break everything because they didn’t know how to release a fix! They also didn’t know if the business wanted anything else for another month.

As Charity Majors says, “speed is stability”. The more frequently you deploy to production, the more you improve your stability. You smooth out the pathways and the error conditions. And you bake it into your pipelines, which means that you automate a lot of the stuff that could go wrong.

4. Reliably run high stability systems

A lot of discussions with test teams, QA and software engineers drive the need for investment in world class quality and testing capabilities/practices. If you’re not stable, where’s the gap? What scenarios and behaviours have you not covered? Are there chaos engineering items you have missed? What gaps do you have in your test suites? You need to make sure that stability is there to drive the right behaviour.

And to drive the right evolution. You can’t achieve that if you’ve got things in the middle, like handoffs, or dumping things over walls. It’s about promoting ownership. To get elite scores, you need to know what you’re doing and embrace that approach. The metrics help to modernise, shape and move teams towards that way of working.


5. Rent or reuse with build as a final option

Even with serverless and SaaS, with our background you’re used to going straight to the workspace. And with the FORESEE diagram, we find out that what we are doing is coding. It’s a mindset thing. And it’s a very healthy principle to embrace.

It’s back to knowing your business purpose. And then knowing your business KPIs. If you can achieve business outcomes without writing code, you are at your most optimal. If you can leverage a SaaS offering that does what you need, that’s probably the next best thing. And only as a final option do you build, following a serverless first mindset and approach, using all the serviceful offerings and managed services.

I’m not a big fan of hero developers or a hero mentality. Over the years, with this principle, we’ve learned to use ‘off the shelf’. Less code!

It’s about being a democratised engineer rather than a superhero who builds something that no one understands.

6. Continuously optimise the total cost

‘How much does your cloud cost?’ is the best question to ask any team. Good teams will tell you how much their cloud costs are. But loads of teams have no idea. This is a great measure of a good team.

I would add that they should also be able to tell you how much they cost as a team, and how much it costs them to do what you’re asking them to do. It gives you good advice and guidance. And it gets straight to ROI, and good projected ROI as well.

How can we mature teams?

A good team will tell you the run cost. And a great team will tell you the total cost. But the very best teams will get into a worst case development conversation about how much features cost, and how much revenue they’re bringing in. In other words, how impactful they are to the business.
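As a rough illustration of that progression from run cost, to total cost, to business impact, here is a small sketch. The `FeatureCosts` type, its fields and every figure are hypothetical. Treating total cost as run cost plus team cost is a simplifying assumption, not a costing method prescribed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureCosts:
    """Monthly figures attributed to one feature (all values hypothetical)."""
    name: str
    run_cost: float   # cloud bill attributed to the feature
    team_cost: float  # people cost attributed to the feature
    revenue: float    # revenue attributed to the feature

    @property
    def total_cost(self) -> float:
        """Run cost plus team cost (a simplifying assumption)."""
        return self.run_cost + self.team_cost

    @property
    def roi(self) -> float:
        """Return on investment: net gain divided by total cost."""
        if self.total_cost == 0:
            raise ValueError("total cost is zero, so ROI is undefined")
        return (self.revenue - self.total_cost) / self.total_cost


# Made-up example figures for a single feature.
checkout = FeatureCosts("checkout", run_cost=2_000.0, team_cost=18_000.0, revenue=50_000.0)
print(f"{checkout.name}: total cost {checkout.total_cost:.0f}, ROI {checkout.roi:.0%}")
```

A team that can fill in only `run_cost` is at the ‘good’ level described above. Being able to fill in all three fields per feature is what makes the ROI conversation possible.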

I always add a fantastic question: ‘how can we mature the teams?’ How can we evolve the team so that they can answer new questions readily? For example, total cost is going to include carbon footprint and sustainability costs. When your team is optimising total cost, they are not only optimising for financial cost, they need to optimise for carbon footprint too. They have to drive the conversation on finding the most ecologically friendly region for their workload.

7. Build event driven via strong APIs

This sounds very easy. But from talking to Sam Dengler, nobody is doing this properly.  We’ve been talking about this for 20 years. Proper integration is still a mystery to most people.  

It is about making sure you’ve got the right things in the right places. But also at the right size. And having things that are composable. It’s about breaking things up into their smallest constituent parts. And changing things as frequently as possible.

I find that this one takes a lot of evolution through different levels of complexity. And it takes time. You should always be thinking about it. Teams who are new to serverless and that way of working will reinvent what they know.

The principles balance upon each other

The principles on improving stability move you into the elite categories and drive you towards a loosely coupled, event driven architecture. That gives you more autonomy and freedom, and the ability to deploy when you’re ready, because you are event driven and loosely coupled with strong APIs. Architecturally, that autonomy is baked in.
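To show what loose coupling through events can look like, here is a minimal in-memory sketch. The `OrderPlaced` event, the `EventBus` class and the two consumers are hypothetical names invented for this example. A real serverless system would more likely use a managed event bus than a class like this.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OrderPlaced:
    """Event contract, standing in for the 'strong API' (fields are hypothetical)."""
    order_id: str
    total_pence: int


class EventBus:
    """Tiny in-memory publish/subscribe bus, for illustration only."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        """Register a handler for one event type."""
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        """Deliver the event to every handler registered for its type."""
        for handler in self._handlers.get(type(event), []):
            handler(event)


bus = EventBus()
invoices: list[str] = []
emails: list[str] = []

# Two independent consumers. Neither knows about the other or about the publisher.
bus.subscribe(OrderPlaced, lambda e: invoices.append(f"invoice for {e.order_id}: {e.total_pence}p"))
bus.subscribe(OrderPlaced, lambda e: emails.append(f"confirmation for {e.order_id}"))

# The publisher only depends on the event contract.
bus.publish(OrderPlaced(order_id="A-1", total_pence=1999))
```

The publisher and the consumers share only the `OrderPlaced` contract. A third consumer could be added, or an existing one redeployed, without touching the publisher, which is the autonomy the principle is after.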

With the right team alignment, you can go fast and be in those elite groups. A lot of these principles balance on each other. If you’re trying to influence with one principle you have to have some of the elements of the other principles in place.

People like to think in layers. But when they try to do ‘event driven’ they try to go through layers. And that’s not event driven.

A lot of facilitation practices have come to the fore in the last couple of years. There’s lots of good stuff from DDD and ‘event storming’ from Alberto Brandolini, as well as EventBridge Storming. So there are good, hands on, facilitated techniques that demystify this and make it more approachable for squads to benefit from an event driven architecture.

8. Build solutions that fit in their heads

This principle is borrowed from Dan North. In other words, don’t build crazy systems that are too complicated. This has a nice nod towards Team Topologies and setting proper boundaries. We’ve seen teams become victims to crazy architectures. Where there’s too much to fit in your head and the cognitive load breaks people.

This one will evolve over time. When we are getting teams going my mantra is ‘just enough design’. Some teams want to design everything up front and go into huge amounts of detail. But it is better to keep your world small. Focus on what you’re doing today. When you’re moving with rapid development and continuous architecture, you should always be refactoring and changing.

Limit cognitive burden

You won’t design the end state upfront.  You have got to be prepared to change and move direction. In many serverless projects we’ve nuked what we’ve done after two or three weeks and started again. It’s just the way of work. But the point of the principle goes back to domain driven design and limiting cognitive burden. And making sure your groups and classifications are well defined. 

The Team Topologies guys nailed this one by optimising for cognitive burden. And that’s where all the other principles really come in.  We can design systems that are small. And are loosely coupled, event driven and deployed frequently. That helps reduce cognitive burden. It’s not easy to get there. It’s hard work, and you have to evolve. And you have to edit and incrementally go after it. But you can start to really optimise for solutions that do fit in the heads of your teams.

Other tenets or principles to consider for cloud software engineers

We talk a lot about well architected.  So whenever I talk about these principles, I’m always referring to modern cloud or serverless first.  Because well architected leverages these principles. If you do adopt these principles, you’re going to be heading towards well architected.

The other thing is operating with situational awareness. There’s got to be a principle on Wardley maps or other techniques to really understand the direction. Business KPIs lend themselves to that understanding. But there might be something more on operating with good situational awareness. If teams are following these principles, it would be good if they also had situational awareness.

What about Developer Experience?

I look at these principles at a team level. Situational awareness feeds into these principles.  And well architected underpins those principles also. Collaboration and working as a broader group is important. Looking at the 8th principle, if you and all the other teams work to the well architected standards, you will have portability and mobility of teams.

Well architected is the ‘how’. These principles are the direction. Well architected looks at security, cost, operational excellence, reliability, performance and sustainability. So they are threaded through the principles. And the ‘how’ is well architected.

Cloud teams with a high throughput of work solve the challenge of developer experience. You cannot have a high throughput of work and deployment frequency if your developer experience is terrible.

Don’t accept poor developer experience.

Create an environment for success

Creating an environment for success is a critical enabler. 

What I am trying to get to is contribution. In order to have good developer experience, they need to be able to contribute.

Give back through an inner source programme. It is the idea of learning. You need to be learning as a team. There’s a curiosity principle there. The team should always question stuff.

Continuous learning applies to the ‘optimise total cost’ principle. Have you looked at other options and evolved your stack? Are you continuously learning about new features and capabilities that are available?

Be curious with a growth mindset!
