AWS has over 200 fully featured services for various technologies, industries, and use cases. Without careful management, businesses can face steep bills that hit their bottom line. Balancing high-quality performance, security, and cost control is critical. AWS cost optimization transforms cost management from a chore into a strategic move. With best practices in place, you can focus on improving efficiency and return on investment (ROI) instead of fretting over costs.
This article aims to help you understand, control, and significantly reduce cloud costs.
We continue our overview of the Well-Architected Framework with our take on the AWS Cost Optimization pillar. Serverless has changed how organizations fund IT: from significant enterprise investments written off over several years to incremental pay-as-you-go payments made weekly or monthly. In other words, OpEx vs CapEx.
The AWS Cost Optimization pillar is one of my favorites because hardly anyone understands or knows what it is. It has got much better now with the modern cloud. Like the other pillars, it has ten questions, and they fall into five subsections:
- Practice Cloud Financial Management
- Expenditure Awareness
- Cost Effective Resources
- Matching Supply and Demand
- Optimizing Over Time
Practice Cloud Financial Management
The first one is Practice Cloud Financial Management. How do you implement cloud financial management? Most people say: ‘We don’t know what that is!’ Try asking a development team: ‘How much did that cost?’ Most of the time, you get back a blank stare!
It’s a maturity step for teams to be able to respond and know how much their stack costs. If you get blank stares, you must dig deeper into their operational excellence, observability, and general engineering practices. Engineers are aware but need direction. Cloud financial management doesn’t need to mean ‘big, scary, or loads of spreadsheets.’ It can be as simple as knowing your monthly or weekly bill.
There has been a significant architectural shift as we have gained more visibility into cost over the last few years. When working on enterprise mainframes, you were dealing with capacity. You did get into licensing, but not price. Moving from mainframes to the cloud, your architecture expands to fit whatever scale you are working at. You have to factor cloud financial management into your architectural decisions.
Developers understanding costs
Back in the enterprise days, the cost question was: ‘Is it five figures, six figures, or seven figures?’ You sometimes saw many hundreds of thousands.
In the serverless and microservice world, your costs can fluctuate without you noticing. In the past, capacity would have been bought five years in advance, using tax-efficient methods, written off, and paid down over multiple years. What you designed and implemented would have had a negligible impact on cost. Nowadays, the success or failure of your organization (depending on the scaling) can come down to how well you manage your cloud costs.
If you are interviewing a developer, and they tell you about their fantastic system, just ask: ‘How much does that cost per day?’. That’ll unpack a lot of stuff. Even though it’s one question, there’s a considerable amount behind it. You can go deep into savings plans, tagging, and being savvy and skilled with your cloud financial management. There’s an emerging cloud economics role.
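As a minimal sketch of what answering that question can look like, the snippet below turns a monthly bill into a cost per day. The service names and amounts are made up for illustration, and `cost_per_day` is a hypothetical helper, not part of any AWS tooling.

```python
from datetime import date


def cost_per_day(total_cost: float, start: date, end: date) -> float:
    """Average daily cost over the inclusive period from start to end."""
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end must not be before start")
    return total_cost / days


# Hypothetical monthly bill for one stack, in dollars, keyed by service.
monthly_bill = {"lambda": 42.10, "dynamodb": 18.65, "api-gateway": 9.25}

total = sum(monthly_bill.values())
daily = cost_per_day(total, date(2024, 6, 1), date(2024, 6, 30))
print(f"Total: ${total:.2f}, per day: ${daily:.2f}")
```

Even a calculation this small gives a team a number to track from week to week.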
The partnership between CFO and CTO
What’s interesting is that cloud providers are cleverly targeting CFOs. So, a partnership forms between the CFO and the CTO. The CFO team has FinOps people. It is an excellent excuse for a savvy architect to talk to the Finance department about how billing, costs, and budgets link with your architecture. That’s a great way to drive improvement. Rather than refactoring just because it’s cool, you can refactor to save half a million dollars because Finance has told you where the cost is.
There are two types of financial plans in IT: CapEx and OpEx. CapEx, capital expenditure, is spending that you plan and pay for upfront. OpEx, operational expenditure, is not planned upfront, i.e., it’s pay-as-you-go. Your dynamic workloads typically fall into OpEx, and many organizations need help transitioning from one to the other. The difficulty is that you do not know in advance what your bill for the month will be. With data centers, you pay three years in advance and can offset tax.
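To make the difference concrete, here is a small sketch comparing cumulative spend under the two models. All the figures are invented for illustration: one upfront payment for CapEx, and a run of pay-as-you-go bills for OpEx with one unusually expensive month.

```python
def cumulative_spend(monthly_payments: list[float]) -> list[float]:
    """Running total of what has been paid by the end of each month."""
    totals, running = [], 0.0
    for payment in monthly_payments:
        running += payment
        totals.append(running)
    return totals


# CapEx: hypothetical 36,000 paid upfront, nothing afterwards.
capex_payments = [36_000.0] + [0.0] * 5
# OpEx: hypothetical pay-as-you-go bills; month 3 includes a spike.
opex_payments = [400.0, 450.0, 3_000.0, 500.0, 480.0, 520.0]

rows = zip(cumulative_spend(capex_payments), cumulative_spend(opex_payments))
for month, (capex, opex) in enumerate(rows, start=1):
    print(f"Month {month}: CapEx {capex:>9.2f}  OpEx {opex:>9.2f}")
```

The CapEx line is flat and known on day one. The OpEx line only becomes known month by month, which is exactly why it needs watching.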
CapEx and OpEx expectations
Setting expectations around OpEx vs CapEx in the AWS Cost Optimization pillar is critically important, especially with business partners who may not be aware of the difference and will wonder why their bill went from £50 to £3,000 this week. The explanation might be a load test for a new feature. So, you need to be very upfront and very good about setting expectations. With OpEx, AWS costs will fluctuate continuously, so give the reasons why. It relates to ‘clarity of purpose’: understanding what you’re doing and being able to articulate it in a business way that links business and IT together.
I’ve seen a few teams do this exceptionally well. Cloud providers show you how your costs are calculated. You can replicate those algorithms in your dashboards to predict costs.
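As a rough sketch of replicating a pricing calculation, the function below estimates a function-as-a-service bill from request count, average duration, and memory size. That is the general shape of how AWS Lambda is billed, as requests plus GB-seconds. The rates are placeholders passed in as parameters, free tiers are ignored, and you should copy the real rates from the current pricing page.

```python
def estimate_lambda_cost(
    invocations: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_million_requests: float,
    price_per_gb_second: float,
) -> float:
    """Estimate cost as a request charge plus a compute (GB-second) charge."""
    request_cost = invocations / 1_000_000 * price_per_million_requests
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


# Placeholder rates for illustration only, not quoted prices.
ASSUMED_PRICE_PER_MILLION_REQUESTS = 0.20
ASSUMED_PRICE_PER_GB_SECOND = 0.00001

monthly = estimate_lambda_cost(
    invocations=5_000_000,
    avg_duration_ms=120,
    memory_mb=512,
    price_per_million_requests=ASSUMED_PRICE_PER_MILLION_REQUESTS,
    price_per_gb_second=ASSUMED_PRICE_PER_GB_SECOND,
)
print(f"Estimated monthly compute bill: ${monthly:.2f}")
```

Feeding the same function with forecast traffic instead of actual traffic gives you a prediction you can put on a dashboard.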
How often has a developer come to your desk in the morning with the blood drained from their face, saying, ‘I think I have just spent $20,000!’? It is an excellent way to get some focus on Well-Architected!
Expenditure Awareness
That brings us to the Expenditure Awareness section of the question set. To grow that awareness, education is critical. You need to make teams aware that these tools are available. The revamp of the console, with cost on the first screen, means that cost awareness is growing, especially given the shift to pay-as-you-go we discussed. Your cost is an operational expenditure that is critical to the profitability of your business. It is no longer an item that was written off ten years ago.
Expenditure awareness is simple, but it is very new for some developers. So, how do you govern usage? How do you monitor usage and costs? And how do you decommission resources and switch things off? That’s not what developers of yesteryear had to worry about. Back then, it was a ‘sysadmin’ thing. People can get alerts or emails if they leave things running over the weekend, which drives the correct behavior because you are spending real money.
Tagging resources is a discipline that you need to get into. If something is not tagged correctly, you are not monitoring it, but you still have to pay for it.
If you’re in a leadership position, you should be able to see a breakdown. If you have five applications in your portfolio, you should know how the cost breaks down across those five. That’s easy to do.
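A weekend alert of that kind can be sketched in a few lines. The inventory below is invented, and the field names (`environment`, `state`, `hourly_cost`) are assumptions about how you might record your resources, not an AWS data format.

```python
from datetime import datetime


def weekend_alerts(resources: list[dict], now: datetime) -> list[str]:
    """Return alert messages for non-production resources running on a weekend."""
    if now.weekday() < 5:  # Monday is 0, Friday is 4
        return []
    return [
        f"{r['name']} (owner: {r['owner']}) is still running at ${r['hourly_cost']:.2f}/hour"
        for r in resources
        if r["state"] == "running" and r["environment"] != "production"
    ]


# Hypothetical inventory; in practice this would come from your own tooling.
inventory = [
    {"name": "load-test-cluster", "owner": "team-a", "environment": "test",
     "state": "running", "hourly_cost": 4.50},
    {"name": "orders-api", "owner": "team-b", "environment": "production",
     "state": "running", "hourly_cost": 1.20},
    {"name": "demo-db", "owner": "team-a", "environment": "dev",
     "state": "stopped", "hourly_cost": 0.80},
]

for message in weekend_alerts(inventory, datetime(2024, 6, 8, 9, 0)):  # a Saturday
    print(message)
```

Sending each message to the owner by email or chat is what closes the loop and changes behavior.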
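One way to enforce that discipline is a simple compliance check. The sketch below assumes a hypothetical tagging policy of three required tags and reports which resources are missing any of them. The resource IDs and tag names are illustrative.

```python
# Hypothetical tagging policy: every resource must carry these tag keys.
REQUIRED_TAGS = {"application", "owner", "environment"}


def missing_tags(tags: dict[str, str]) -> set[str]:
    """Required tag keys that are absent or have an empty value."""
    present = {key for key, value in tags.items() if value.strip()}
    return REQUIRED_TAGS - present


def untagged_report(resources: list[dict]) -> dict[str, list[str]]:
    """Map each non-compliant resource ID to its sorted list of missing tags."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource.get("tags", {}))
        if missing:
            report[resource["id"]] = sorted(missing)
    return report


resources = [
    {"id": "fn-checkout", "tags": {"application": "shop", "owner": "team-a",
                                   "environment": "prod"}},
    {"id": "bucket-tmp", "tags": {"owner": "team-b"}},
    {"id": "queue-old"},
]
print(untagged_report(resources))
```

Anything that shows up in the report is spend you are paying for but cannot attribute.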
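A per-application breakdown is a small aggregation once resources are tagged. This sketch assumes billing line items that carry a cost and a set of tags. The line items are invented, and the `application` tag key is an assumption about your tagging scheme.

```python
from collections import defaultdict


def cost_by_application(line_items: list[dict]) -> dict[str, float]:
    """Sum cost per 'application' tag; items without the tag land in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        application = item.get("tags", {}).get("application", "untagged")
        totals[application] += item["cost"]
    return dict(totals)


# Hypothetical billing line items for one month.
line_items = [
    {"service": "lambda", "cost": 40.0, "tags": {"application": "shop"}},
    {"service": "dynamodb", "cost": 25.0, "tags": {"application": "shop"}},
    {"service": "s3", "cost": 10.0, "tags": {"application": "reports"}},
    {"service": "ec2", "cost": 90.0, "tags": {}},
]

for application, cost in sorted(cost_by_application(line_items).items()):
    print(f"{application:<10} ${cost:.2f}")
```

The size of the ‘untagged’ bucket is a quick measure of how far the tagging discipline still has to go.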
The things to consider around governing usage are guardrails, how you set up AWS Organizations, and how you set up your service control policies. If you’re going to be a serverless-first shop, you can turn off the non-serverless capabilities that are very expensive if they’re left running. There are ways to establish good guardrails that give you the best cost optimization.
It’s serverless first, not serverless only, but if you devise a well-articulated excuse or reason why the serverless capability doesn’t work, we will alter the policy!
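As a sketch of such a guardrail, the snippet below builds a service control policy document that denies a list of actions. The JSON shape follows the standard policy grammar. The statement ID is made up, and which actions to deny (here launching EC2 instances and creating RDS database instances) is an assumption about what a serverless-first shop might choose, not a recommendation.

```python
import json


def deny_actions_scp(sid: str, actions: list[str]) -> str:
    """Build a service control policy document that denies the given actions."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,  # statement IDs must be alphanumeric
                "Effect": "Deny",
                "Action": actions,
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical choice of actions to block for a serverless-first account.
document = deny_actions_scp(
    "DenyNonServerlessCompute",
    ["ec2:RunInstances", "rds:CreateDBInstance"],
)
print(document)
```

Keeping the policy in code means the exception process stays simple: a team makes its case, and the change to the deny list is a reviewed commit.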
Cost Effective Resources
The next one is Cost-Effective Resources. There are many ways to skin a cat when building something. Developers always want to pick the fastest and wildest thing. But is it more cost-effective? Sometimes, you don’t need the quickest thing, and a moderate speed will do the job. It points to sustainability as well. It’s not about how fast you can go. It’s about how fast you need to be to provide adequate service.
The total cost of ownership comes to the fore here. It’s not just what it costs right now. It’s the long-term operational burden and cost. You can choose a super low-cost technology, but the cognitive burden is massive if the technology is new to your team. Then, what’s the cost for that team of learning the tech stack or language that you chose for cost-efficiency reasons? You have got to take a bigger view. It’s about more than what you’re doing right now.
There’s a tonne of new stuff at the minute. How do you plan for data transfer charges? That’s one to look at if you have a large data footprint. Operational business processes such as backups and disaster recovery (DR) move data around, so you need to be careful about what you get charged for those. Ingress and egress can be a massive cost, especially if you’re going multi-region. You need to be aware of the cost of moving data between regions.
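Planning for those charges can start with a back-of-the-envelope estimate like the one below. The region pair, the volumes, and the per-GB rate are placeholders, and the sketch assumes same-region transfers are free, which is not always true (traffic between availability zones can be charged, for example).

```python
def transfer_cost(
    transfers: list[tuple[str, str, float]],
    rate_per_gb: dict[tuple[str, str], float],
) -> float:
    """Sum cross-region transfer charges for (source, destination, gigabytes) rows."""
    total = 0.0
    for source, destination, gigabytes in transfers:
        if source == destination:
            continue  # assumption for this sketch: same-region traffic is free
        # A missing rate raises KeyError rather than silently costing nothing.
        total += gigabytes * rate_per_gb[(source, destination)]
    return total


# Placeholder rate, not a quoted price.
assumed_rates = {("eu-west-1", "us-east-1"): 0.02}

# Hypothetical monthly volumes: DR replication plus local backups.
monthly_transfers = [
    ("eu-west-1", "us-east-1", 500.0),   # replication to the DR region
    ("eu-west-1", "eu-west-1", 1000.0),  # backups kept in the same region
]
print(f"Estimated transfer bill: ${transfer_cost(monthly_transfers, assumed_rates):.2f}")
```

Running the numbers before you go multi-region shows whether replication is a rounding error or a major line on the bill.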
Matching Supply and Demand and Optimizing Over Time
The last two are Matching Supply and Demand and Optimizing Over Time. There’s a piece around keeping up with the latest and greatest in AWS and tweaking your design so that you continue to run efficiently. There’s also a piece about selecting new services as you build new stuff and choosing wisely for a decent cost impact.
This is where the serverless advantage kicks in. Matching Supply and Demand is taken care of for you because the service scales with the load. If you’re on traditional architectures or EC2, you have to pre-provision capacity and have it ready. That’s a complex calculation to get right. You’ll have a lot of wastage, with excess capacity just waiting for demand to come in. So it’s a lot easier when you’re serverless. Being able to extrapolate your costs based on various dimensions is essential.
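To put a number on that wastage, here is a small sketch that measures how much pre-provisioned capacity sits idle against an hourly demand curve. The capacity units and the demand figures are invented for illustration.

```python
def idle_fraction(provisioned_units: int, hourly_demand: list[int]) -> float:
    """Share of provisioned capacity-hours that went unused."""
    if provisioned_units <= 0 or not hourly_demand:
        raise ValueError("need positive capacity and at least one hour of demand")
    capacity_hours = provisioned_units * len(hourly_demand)
    # Demand above the provisioned level cannot be served, so cap it.
    used_hours = sum(min(demand, provisioned_units) for demand in hourly_demand)
    return (capacity_hours - used_hours) / capacity_hours


# Hypothetical demand over four hours, provisioned for the peak of 100 units.
demand = [20, 50, 100, 30]
print(f"Idle capacity: {idle_fraction(100, demand):.0%}")
```

Provisioning for the peak means paying for the gap between the peak and everything else. With a service that scales with load, that gap is not yours to pay for.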
So that’s the craic. That’s the AWS Cost Optimization pillar and OpEx vs CapEx in AWS. We’ll hit the Sustainability Pillar next. So please give us a like or a follow on YouTube or the podcast. Our blog is TheServerlessEdge.com, and you can follow us at @ServerlessEdge on Twitter. Thank you very much.