
Observability 2.0: Why Engineering Teams Are Moving Beyond the Three Pillars Model


Modern observability is broken. Engineering teams spend most of their time fighting cardinality explosions and vendor bills that multiply overnight, while struggling to understand their increasingly complex software systems. Charity Majors, co-founder of Honeycomb and co-author of "Observability Engineering," explains why the industry is shifting from the traditional three-pillars approach to unified storage solutions that treat observability as a development tool, not just an operational afterthought.

Key Takeaways

  • The three-pillars model (metrics, logs, traces) creates expensive data silos that force engineers to correlate information across 15 different tools for every request
  • Observability 2.0 uses unified storage and structured events, enabling real-time slicing and dicing across high-cardinality dimensions without cost explosions
  • Cardinality governance consumes most observability engineering time because traditional metrics tools break when unique identifiers exceed 100 values
  • OpenTelemetry has emerged as the solution to vendor lock-in, becoming one of the most active CNCF projects by commits and enabling portable instrumentation
  • Modern observability should underpin development workflows, not just operational debugging, with SLOs serving as the API for engineering teams
  • AI observability requires tracing entire workflows from software inputs through models to human feedback, not just monitoring model inputs and outputs
  • The best time to implement observability is while writing code, similar to writing tests—retrofitting loses the original intent and context

Timeline Overview

  • Early Career & Parse (2012-2013) — Charity experiences professionally humiliating outages at Parse due to unpredictable mobile app traffic spikes and inadequate tooling
  • Facebook & Scuba Discovery (2013-2014) — First exposure to high-cardinality observability through Facebook's internal Scuba tool, reducing debugging time from hours to seconds
  • Honeycomb Foundation (2016) — Co-founding Honeycomb with Christine Yen based on the life-changing Scuba experience, building their own database despite conventional wisdom
  • Industry Evolution (2017-2020) — Three-pillars model emerges and dominates, while Honeycomb develops serverless database architecture and unified storage approach
  • Modern Challenges (2020-Present) — Rising costs force industry consolidation, OpenTelemetry gains traction, AI creates new observability requirements for non-deterministic software

The Fatal Flaws of Three-Pillars Observability

  • The three-pillars model emerged in 2017 when Peter Bourgon coined the phrase describing observability as metrics, logs, and traces, but vendors adopted it primarily because they had separate products to sell for each pillar
  • Every request entering traditional systems gets stored across 15 different tools—metrics storage, dashboards, structured logs, unstructured logs, tracing tools, profiling tools, and analytics platforms—with no automatic correlation between them
  • Engineers become the human correlation layer, manually connecting data by recognizing patterns or copy-pasting IDs between systems, creating bottlenecks and increasing time to resolution
  • The cost multiplier effect makes this approach unsustainable as companies scale, with some organizations unknowingly storing identical request data in multiple expensive tools simultaneously
  • Datadog and similar platforms offer pre-defined bridges between tools, but these require advance planning about which information will be important and where connections need to exist
  • The fundamental problem is having many sources of truth instead of unified storage, leading to dead ends where clicking on data points cannot reveal deeper context or related information

Understanding Cardinality: The Hidden Cost Explosion

  • Cardinality refers to the number of unique items in a set—user IDs representing 100 million users create maximum cardinality, while static values like "species equals human" create minimum cardinality
  • Traditional metrics tools are built exclusively for low-cardinality data and break catastrophically when unique combinations exceed roughly 100 values, causing exponential cost increases
  • The infamous "custom metrics" billing model charges for every unique combination of metric name and tag values, not just for metrics you explicitly define—adding a single IP-address tag can increase bills 100x overnight
  • World-class observability teams spend the majority of their time governing cardinality rather than solving actual engineering problems, representing a massive opportunity cost and productivity drain
  • High-cardinality data provides the most debugging value because unique identifiers make it easier to isolate specific problems and understand system behavior patterns
  • The cruel irony is that the most expensive data to store in traditional systems is also the most valuable for understanding and debugging complex distributed applications
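The multiplication described above is easy to see with arithmetic: in a metrics system, every unique combination of tag values becomes its own time series, so series count (and often cost) is the product of the tags' cardinalities. A minimal sketch with illustrative numbers, not any vendor's actual pricing model:

```python
# In metrics systems, each unique combination of tag values becomes its
# own time series, so series count is the product of tag cardinalities.
from math import prod

def series_count(tag_cardinalities: dict[str, int]) -> int:
    """Number of distinct time series one metric can generate."""
    return prod(tag_cardinalities.values())

# A "safe" low-cardinality metric: a handful of series.
low = series_count({"region": 4, "status_code": 5})

# Add one high-cardinality tag (e.g. user_id) and it explodes:
# the bill multiplies by exactly the new tag's cardinality.
high = series_count({"region": 4, "status_code": 5, "user_id": 100_000})

print(low, high, high // low)  # → 20 2000000 100000
```

This is why adding a single innocuous-looking tag can blow up a bill overnight: the growth is multiplicative, not additive.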

Observability 2.0: Unified Storage and Structured Events

  • Observability 2.0 centers on unified storage where data exists once but can be visualized and accessed through multiple entry points, eliminating dead ends and correlation gaps
  • The architecture uses wide structured events organized around units of work, stored in columnar databases that enable real-time slicing and dicing without predefined schemas
  • Instead of storing request data 15 times across different tools, teams emit fewer but wider logs with rich context attached to each event, dramatically reducing storage multiplication
  • Honeycomb's "Bubble Up" feature exemplifies this approach—drawing a bubble around any graph anomaly automatically computes dimensional differences between the bubble contents and baseline data
  • ClickHouse, Snowflake, and other columnar stores enable this shift by handling high-cardinality data efficiently without requiring advance index or schema definition
  • The transition represents moving from an operational tool focused on errors and outages to a development tool that underpins entire software feedback loops and accelerates time to value
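The "fewer but wider logs" idea above can be sketched as one structured event per unit of work: start the event with request context, enrich it as the handler runs, and emit it exactly once at the end. All field names below are illustrative, not any vendor's actual schema:

```python
import json
import time
import uuid

def handle_request(user_id: str, endpoint: str) -> dict:
    """Build one wide structured event per unit of work."""
    event = {
        "trace_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "endpoint": endpoint,
        "user_id": user_id,            # high-cardinality fields are welcome here
    }
    start = time.monotonic()
    # ... do the actual work, attaching any context that might matter later ...
    event["feature_flag.new_checkout"] = True
    event["cart_items"] = 3
    event["status"] = 200
    event["duration_ms"] = (time.monotonic() - start) * 1000
    return event                       # emitted once, stored once

# One JSON line per request, ready for a columnar store.
print(json.dumps(handle_request("user-8675309", "/checkout")))
```

Because every dimension rides along on the same event, a question like "which feature flag correlates with slow checkouts?" becomes a single query instead of a copy-paste hunt across tools.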

The Open Telemetry Revolution and Vendor Portability

  • OpenTelemetry has become one of the most active CNCF projects by commits and contributors, second only to Kubernetes in developer engagement and industry adoption
  • The core promise allows teams to instrument code once with OTel and redirect the telemetry fire hose to any compatible vendor, forcing competition based on value rather than lock-in
  • Semantic conventions and consistent naming within OTel pipelines enable vendors to build sophisticated automated features because they understand the data structure and meaning
  • The project inherits from its merged predecessors, OpenCensus and OpenTracing, with Ben Sigelman and Lightstep providing crucial architectural leadership for broad industry adoption
  • While critics argue OTel feels "big and bloated," most teams can extract value without understanding every component, and the tooling increasingly "just works" without deep configuration
  • This shift represents the first time in observability history where vendor migration is practically feasible, fundamentally changing purchasing dynamics and reducing long-term risk
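The "instrument once, redirect the fire hose" promise is concretely visible in an OpenTelemetry Collector configuration: the application emits OTLP, and switching vendors means editing the exporter block, not the code. A minimal sketch; the endpoint and API-key header are placeholders, while the `otlp` receiver and `otlphttp` exporter are standard Collector components:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlphttp:
    endpoint: https://api.example-vendor.com   # swap vendors here, not in app code
    headers:
      x-api-key: ${env:VENDOR_API_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
```

Pointing the same pipeline at a second vendor during an evaluation is a config change, which is exactly what makes migration practically feasible for the first time.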

AI, Unknown Software, and the Future of Debugging

  • AI intersects observability in three critical areas: building and training models, developing with LLMs, and managing the universal problem of software of unknown origin
  • The proliferation of AI-generated code creates a worldwide version of Parse's original challenge—making unknown software uploaded by developers around the globe work reliably in production
  • AI observability cannot exist in isolation but must integrate with comprehensive software observability, tracing from all service inputs through models to human feedback loops
  • Many AI observability startups focus narrowly on model inputs and outputs while ignoring the broader trace-shaped, high-cardinality problem that spans entire application stacks
  • Production remains where code meets reality regardless of its origin—human-written or AI-generated code must be observed running in real environments to validate quality and behavior
  • The solution involves treating AI as intensified existing challenges rather than fundamentally new problems, requiring better traditional observability practices to handle non-deterministic software components
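The trace-shaped requirement above can be sketched with plain dictionaries: the model call is one span inside the wider workflow, sharing a trace ID with the service inputs before it and the human feedback after it. All span and field names here are illustrative:

```python
import time
import uuid

def span(trace_id: str, name: str, **attrs) -> dict:
    """One span in a trace: shared trace_id plus arbitrary attributes."""
    return {"trace_id": trace_id, "span": name, "ts": time.time(), **attrs}

trace_id = str(uuid.uuid4())
events = [
    span(trace_id, "http.request", endpoint="/summarize", user_id="u-42"),
    span(trace_id, "retrieval", docs_found=7),
    span(trace_id, "llm.call", model="some-model", prompt_tokens=512,
         completion_tokens=128, temperature=0.2),
    span(trace_id, "human.feedback", thumbs_up=False),  # closes the loop
]

# Because every span shares trace_id, one query answers questions like
# "which inputs and model parameters led to negative feedback?"
negative = [e for e in events if e.get("thumbs_up") is False]
```

Monitoring only the `llm.call` span in isolation would answer none of those questions, which is the narrowness the section above warns against.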

Engineering Culture and Organizational Transformation

  • Observability should underpin development feedback loops similar to how tests accelerate development—the best time to instrument is while writing code, capturing original intent and context
  • SLOs function as "the API for engineering teams," providing clear service level agreements that prevent micromanagement while enabling meaningful negotiations about reliability versus feature work
  • Modern observability enables engineering teams to explain their work in business language, helping CTOs and VPs of Engineering earn first-class seats at executive tables alongside other functions
  • Progressive deployment practices combined with observability 2.0 create greater value than either approach alone—feature flags, canaries, and rich telemetry enable confident rapid iteration
  • The shift from DevOps collaboration models toward platform engineering reflects observability ownership moving to teams whose internal customers are other engineers, not external operations
  • Testing in production becomes safer and more valuable when engineers can slice, dice, and immediately understand the impact of changes through high-cardinality dimensional analysis
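The "SLOs as the API for engineering teams" idea becomes concrete as an error budget: the SLO target defines how much unreliability is permitted, and the remaining budget is what teams negotiate over. A minimal sketch with illustrative numbers:

```python
def error_budget(slo_target: float, total_requests: int, failed: int) -> dict:
    """How much unreliability the SLO still permits in this window."""
    allowed_failures = total_requests * (1 - slo_target)
    return {
        "allowed_failures": allowed_failures,
        "budget_remaining": allowed_failures - failed,
        "budget_spent_pct": 100 * failed / allowed_failures,
    }

# A 99.9% SLO over 1M requests allows roughly 1,000 failures;
# 250 failures spends about a quarter of the budget.
b = error_budget(0.999, 1_000_000, 250)
```

When the budget is mostly intact, the team ships features aggressively; when it is nearly spent, reliability work wins the negotiation—no micromanagement required.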

Common Questions

Q: What makes observability costs spiral out of control?
A: The three-pillars model stores every request in 15+ different tools, while cardinality explosions in metrics systems can increase bills 100x overnight without code changes.

Q: How does observability 2.0 handle high-cardinality data differently?
A: Unified storage in columnar databases allows unlimited unique identifiers without exponential cost increases, enabling real-time exploration across any dimension.

Q: When should startups begin investing in observability?
A: As soon as code becomes real and intended for users—similar to writing tests, observability should be built while writing code to capture intent.

Q: Why is vendor lock-in less concerning in modern observability?
A: OpenTelemetry enables portable instrumentation, allowing teams to switch vendors based on value rather than being trapped by proprietary data formats.

Q: How does AI change observability requirements?
A: AI creates more software of unknown origin requiring trace-shaped observability from inputs through models to feedback, not just model monitoring.

The future of observability lies in treating it as essential development infrastructure rather than operational overhead. Teams that embrace unified storage, structured events, and development-centric workflows will build better software faster while avoiding the cost explosions and correlation nightmares plaguing traditional approaches.
