This work is licensed under CC BY 4.0 - Read how use or adaptation requires attribution

Calculating Cost of Service Driven by Customer Cloud Unit Economics

I was a FinOps engineer working directly with the CTO, and our mission was to calculate the Cost-to-Serve for each customer purchasing product Tier A. Tier A had a profitability problem: its gross margin was falling short of our expectations. Our objective was to understand the cost of serving each customer and to identify the behaviors that led to outlier costs.
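The arithmetic behind this objective can be sketched as follows. This is a minimal illustration, not the calculation we actually ran: the customer names, the revenue and cost figures, and the "three times the median" outlier rule are all assumptions chosen for the example.

```python
from statistics import median

# Hypothetical monthly figures per customer: (revenue, cost_to_serve) in USD.
customers = {
    "cust-a": (100.0, 30.0),
    "cust-b": (100.0, 35.0),
    "cust-c": (100.0, 40.0),
    "cust-d": (100.0, 180.0),
}


def gross_margin(revenue, cost):
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


def cost_outliers(data, factor=3.0):
    """Return customers whose cost exceeds `factor` times the median cost.

    The threshold rule is an assumption made for this sketch.
    """
    typical = median(cost for _, cost in data.values())
    return sorted(
        name for name, (_, cost) in data.items() if cost > factor * typical
    )


margins = {
    name: gross_margin(revenue, cost)
    for name, (revenue, cost) in customers.items()
}
outliers = cost_outliers(customers)
```

A median-based threshold is used here because a single very expensive customer would pull a mean-based threshold upward and could hide itself.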

Personas involved

Our primary focus was on the Finance and Engineering teams, who collaborate to make decisions concerning customers violating our platform’s Terms of Service. To execute this initiative, we engaged professionals from finance, data analytics, product, and infrastructure engineering.

Data considerations

Our main hurdle was data availability, especially for our legacy product stack and for infrastructure components whose costs are allocated across customers. The metric was new to us, so building it involved numerous manual, context-specific calculations.

Many resource-consumption metrics were missing, either because the metric itself was not collected or because the resource did not report per-customer usage. Some metrics, such as network throughput per customer and storage usage, were calculated manually from raw usage data. For metrics that couldn’t be calculated directly, we had to rely on estimates derived from correlated raw usage data.

Creating an accurate component model of the product and understanding how resource consumption correlated with customer usage demanded a significant investment of time.
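One way such a component model might be turned into a per-customer figure is to split each component's cost in proportion to a usage driver. The sketch below assumes this proportional rule; the component names, costs, and usage numbers are hypothetical and are not taken from our actual model.

```python
# Hypothetical component model: each shared component has a monthly cost (USD)
# and a per-customer usage driver used to split that cost.
component_costs = {"network": 1000.0, "storage": 500.0}

usage = {
    "network": {"cust-a": 10.0, "cust-b": 30.0, "cust-c": 60.0},     # e.g. throughput
    "storage": {"cust-a": 200.0, "cust-b": 200.0, "cust-c": 100.0},  # e.g. GB stored
}


def cost_to_serve(component_costs, usage):
    """Allocate each component's cost to customers by their share of usage."""
    totals = {}
    for component, cost in component_costs.items():
        drivers = usage[component]
        total_usage = sum(drivers.values())
        if total_usage == 0:
            # No usage recorded: this component's cost stays unallocated.
            continue
        for customer, amount in drivers.items():
            share = amount / total_usage
            totals[customer] = totals.get(customer, 0.0) + cost * share
    return totals


costs = cost_to_serve(component_costs, usage)
```

Where a usage driver is only an estimate derived from correlated data, the resulting allocation inherits that uncertainty, so it is worth recording which drivers are measured and which are estimated.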

Business impact

Implementing this metric led to an increase in profit margins. Customers notified of Terms of Service violations either changed their behavior or were removed from the platform. More significantly, the exercise sparked executive interest in data-driven decision-making, which had not been a formal part of our recurring business processes. The emphasis on unit economics also prompted more engineering and cost-efficiency questions, contributing to long-term margin improvement and increased company value.

Maturity levels

Before introducing these metrics, our maturity level was low across the board, and lower still for the on-premises legacy stack, where all FinOps principles were at a pre-“Crawl” stage. After implementing the metrics, we reached the “Walk” level across the board, except for the on-premises legacy stack, where we reached the “Crawl” stage.