Cloud Data: Observability Is the Forgotten Data

source link: https://dzone.com/articles/cloud-data-observability-is-the-forgotten-data

In this continuation of the cloud data series, we discuss the forgotten data that is often overlooked when planning cloud-native architectural solutions.

This article is a continuation of a series of posts exploring how the pitfalls around the collection, maintenance, and storage of your cloud data can mean the difference between failure and success in your cloud strategy. The concepts in this series stem from brainstorming with my good friend Roel Hodzelmans and are further inspired by audience reactions to a talk given previously in Dublin, Ireland.

The initial post provided an introduction to cloud and data, and what that means in a cloud-native architecture beyond just storage. In this second article, we discuss the forgotten data that is often overlooked when planning for cloud-native architectural solutions.

Observability Is the Forgotten Data

When you look at observability, you might be thinking about data generated from logs, traces, metrics, and even events across your landscape. What you probably do not realize is that many of your applications and platforms have standard installation settings that generate large amounts of observability data by default. If you are not accounting for all that data being generated when you are heading into the cloud, you are going to have a hard time meeting your budget constraints for deploying and running your production solutions.

Martin Mao stated earlier this year that the growth of observability data is out of control. He notes that organizations don't mind paying for that data if it leads to better outcomes, such as happier customers, higher availability, faster remediation, or more revenue.

"Paying more for logging/metrics/tracing doesn't equate to a positive user experience. Consider how much data can be generated and shipped. $$$. You still need good people to turn data into action. It's remarkable how common this situation is, where an organization is paying more for their observability data (typically metrics, logs, traces, and sometimes events), then they do for their production infrastructure." - Martin Mao

Let's take a look at a simple experiment presented in an article on the hidden cost of data observability, where a simple "Hello, World!" application was deployed on a four-node Kubernetes cluster on GKE (see the article for details of the setup). Scripts were used to simulate load on the application and 30 days of observability data were collected in the following categories:
  • Tracing - One trace per second over 30 days totaled 2.5M traces for a total data size of 161GB.
  • End user metrics - Each back-end call generated a user interaction, so over 30 days, that's 2.5M EUM traces for a total data size of 1GB.
  • Logs - Mileage may vary depending on the configuration of your logging, but here, it was a 30-day total data size of 3.4GB.
  • Metrics - Collected using Prometheus configured for a 10-second sample rate across the cluster, for a 30-day total data size of 285GB.

Granted, this might not be a perfect match for your own systems, but it is simple and gives an easy-to-follow result: just over 450GB of data for a single, simple application.
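To make the arithmetic easy to reproduce, here is a minimal back-of-envelope estimator in Python. The per-item sizes are assumptions reverse-engineered from the totals reported above (total bytes divided by item count), so treat this as a template to plug your own measurements into, not as a model of any particular stack.

```python
# Rough 30-day observability volume estimator based on the experiment above.
# The average per-item sizes are assumptions derived from the reported totals;
# substitute measured values from your own stack.

DAYS = 30
SECONDS = DAYS * 24 * 60 * 60  # 2,592,000 -- the "2.5M" items in the text

sources = {
    # name: (item count over 30 days, assumed average bytes per item)
    "traces":     (SECONDS, 62_000),  # one trace per second, ~62KB each
    "eum_traces": (SECONDS, 400),     # one per back-end call, ~0.4KB each
    "logs":       (1, 3.4e9),         # reported as a 3.4GB lump sum
    "metrics":    (1, 285e9),         # Prometheus at a 10s sample rate, 285GB
}

total_bytes = 0.0
for name, (count, avg_bytes) in sources.items():
    size = count * avg_bytes
    total_bytes += size
    print(f"{name:<11} {size / 1e9:7.1f} GB")

print(f"{'total':<11} {total_bytes / 1e9:7.1f} GB")  # just over 450GB
```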

If you take into account that the average retention period for audits and compliance is 13 months, you have to ask yourself how much data you are collecting, transporting, and storing across your cloud architecture(s). In modern cloud-native architectures you can be deploying multiple times a day, where a container is sometimes only around for a few minutes or hours. The observability data generated there may not need the default 13 months of retention. Setting a retention period for each data type can help rein in your generated data volume.
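As a rough illustration of why this matters, here is a short sketch comparing the steady-state storage held under a blanket 13-month policy against hypothetical per-type retention windows. The monthly rates reuse the experiment's numbers; the shorter windows are made-up examples, not recommendations.

```python
# Hypothetical comparison of steady-state storage under different retention
# policies, using the monthly volumes from the experiment above. The
# per-type windows are illustrative assumptions, not recommendations.

MONTHLY_GB = {"traces": 161, "eum": 1, "logs": 3.4, "metrics": 285}

POLICIES = {
    # retention window in months, per data type
    "13_months_all": {k: 13 for k in MONTHLY_GB},
    "per_type":      {"traces": 0.25, "eum": 1, "logs": 13, "metrics": 0.5},
}

for policy, windows in POLICIES.items():
    held = sum(MONTHLY_GB[k] * windows[k] for k in MONTHLY_GB)
    print(f"{policy:<14} ~{held:,.0f} GB held at steady state")

# 13_months_all  ~5,855 GB
# per_type         ~228 GB -- keeping only compliance logs for 13 months
```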

Also, consider the various environments that are set up and torn down weekly or bi-weekly, such as test or lab environments. These certainly don't need extensive observability data retention, if any at all.

As Martin noted, paying for more data is one thing, but people are the core of any successful use case:

"Paying more for logging/metrics/tracing doesn't equate to a positive user experience. Consider how much data can be generated and shipped. $$$. You still need good people to turn data into action."

Who Owns These Decisions?

Even once you realize how much unexpected cloud data is coming out of your architecture, there remains the issue of who owns these decisions in your organization. The observability data explosion can cause a lot of issues and costs, but the question to answer is:

Do you dare to flip the switch on a new data collection in your architecture?

The following article in this series will take a look at what the industry will be doing in the near future to ensure there is a financial owner for this data within the organization.

