Observability 101

Observability isn't just about gathering metrics.

Nov 14, 2024

In this article, I will explain the essentials of observability, why it matters, and how to get started.

I started my career almost a decade ago with an internship in infrastructure. As soon as it ended, I moved into an event management and monitoring position, where I was exposed to IBM Tivoli Monitoring and Netcool Omnibus.

This role was my first real jump into the world of observability. I quickly saw how important it is to anticipate and respond to system issues before they impact users, and how severe the consequences are when observability isn’t up to par.

But enough about me, let’s jump into the goods:

What Is Observability?

Observability is the ability to infer a system’s internal state by examining its outputs.

Traditionally, monitoring involved setting up alerts for specific metrics, such as CPU usage or memory, but observability goes deeper.

Observability aims to provide a more comprehensive view of your system’s health, performance, and reliability by focusing on three main pillars:

Metrics: Numeric measurements collected over time, such as request rates, error rates, or system resource usage (CPU, memory, etc.)
Logs: Timestamped text records of events inside your application or system that can be parsed for detailed insight into what happened
Traces: Records that follow a request’s entire journey across services, showing how individual components interact

These pillars work together to provide a holistic system view, enabling DevOps teams to detect and resolve issues more effectively.
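To make the three pillars concrete, here is a minimal sketch in plain Python, assuming a single in-memory process. Everything in it (the `metrics`, `logs`, and `spans` stores, the `Span` helper, and `handle_request`) is a hypothetical name invented for illustration, not the API of any real tool. In practice you would likely hand these jobs to dedicated tooling, but the shape of the data is similar: a counter for metrics, a structured record for logs, and timed spans that share a trace ID for traces.

```python
import json
import time
import uuid
from collections import Counter

# Hypothetical in-memory stores, one per pillar (illustration only).
metrics = Counter()  # metric name -> running count
logs = []            # structured log records
spans = []           # finished trace spans


def log(trace_id, message, **fields):
    """Logs: record one event as a structured, parseable JSON line."""
    record = {"ts": time.time(), "trace_id": trace_id, "message": message, **fields}
    logs.append(record)
    print(json.dumps(record))


class Span:
    """Traces: time one step of a request and tag it with the shared trace ID."""

    def __init__(self, trace_id, name):
        self.trace_id = trace_id
        self.name = name

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        spans.append({
            "trace_id": self.trace_id,
            "name": self.name,
            "duration_ms": (time.perf_counter() - self.start) * 1000.0,
            "error": exc_type is not None,
        })
        return False  # never swallow the exception


def handle_request(path, fail=False):
    """Serve one simulated request and emit all three signals for it."""
    trace_id = uuid.uuid4().hex
    metrics["requests_total"] += 1  # Metrics: count every request
    try:
        with Span(trace_id, "handle_request"):
            with Span(trace_id, "query_database"):
                if fail:
                    raise RuntimeError("database timeout")
            log(trace_id, "request served", path=path)
    except RuntimeError as err:
        metrics["errors_total"] += 1  # Metrics: count failures separately
        log(trace_id, "request failed", path=path, error=str(err))
    return trace_id


if __name__ == "__main__":
    handle_request("/checkout")
    handle_request("/checkout", fail=True)
    print(dict(metrics))
```

The metrics tell you that something is failing, the log line tells you what failed, and the spans that share a `trace_id` tell you where in the request the failure happened. That is the sense in which the pillars complement each other.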
