Lacking Observability?

When we hear the word “Observability”, what naturally comes to mind is telemetry data: metrics, logs, and traces that provide deep visibility into distributed systems and allow teams to get to the root cause of a multitude of issues and improve system performance.

From a security point of view, understanding your security posture isn’t just a matter of applying security configurations and best practices and hoping that everything is set. That shouldn’t be your strategy.

In recent years, the advent of microservices has opened new vistas of opportunity in software development. By changing the rules of application scalability, from a brute-force to a fine-grained approach, microservices have led organizations to build a more resilient, reliable, and fault-tolerant IT backbone. Composed of individual containers integrated seamlessly, microservices applications brought a bonanza of benefits, including faster turnaround times, improved extensibility, and better uptime.

All of this has made Kubernetes a de facto cloud operating system, and every day more critical applications are containerized and moved to cloud-native environments. In the 2023 Kubernetes adoption report by Red Hat, 1 in 5 respondents said security incidents led to employee termination, and more than 1 in 3 experienced revenue or customer loss.

Container and Kubernetes security issues often delay application rollouts, and at the same time Kubernetes has quickly become a rich target for passive and targeted attackers. Containers and Kubernetes introduce a new layer of complexity in the software stack, leading to additional security challenges.

What are the security challenges?

Businesses required more efficient approaches to counter threats that were consistently eluding anti-malware software and firewalls. Furthermore, while many organizations had previously focused their security strategies on fortifying the corporate network's perimeter, the evolving landscape of hybrid cloud environments revealed that a distinct perimeter no longer existed. This necessitated the development of alternative security measures that were robust yet agile enough to operate across diverse locations, rather than solely at the network's edge.

As I mentioned, observability is not just about monitoring system metrics but also about understanding the security side of the system. Why shouldn’t you use the endless stream of telemetry data to identify security risks and vulnerabilities, just as you use it to monitor and stabilize operations? Even the best-planned observability strategy is incomplete without security as its fourth pillar.

That’s where your observability strategy needs “Security Observability” to have better visibility into potential threats & proactive approaches to addressing risks.

By leveraging the internal visibility observability provides and overlaying it with security data, businesses can extend their eyes and ears into every corner of their IT environment, creating what’s known as security observability.

Security observability is the ability to gain visibility into an organization’s security posture, including its ability to detect and respond to security threats and vulnerabilities. It involves collecting, analyzing, and visualizing security data to identify potential hazards and take proactive measures to mitigate them.

Without security observability, you can’t quantify a metric to represent the objective security properties of a system. Security investigations depend on retroactive data, and the only way to have data is to proactively collect it. Security observability is the only record you have.

Access to a single truth source with security observability can make it easier to identify, analyze, and categorize suspicious patterns or anomalies. But what exactly is this single source of truth? You guessed it: security data (think metadata from firewalls, threat detection, or traffic analyzers layered on top of telemetry data).

How security observability in Kubernetes can benefit us:

  • Increased Visibility: Observability tools provide details about pods running with privileged Linux capabilities in your environment.
  • Better Troubleshooting: Have any workloads in your environment made an outbound connection to “known-bad.domain.com”?
  • Enhanced Security: Show me all local privilege escalation techniques detected in the last 30 days.
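The three questions above can be treated as queries over collected security events. The following is a minimal Python sketch, assuming a hypothetical, simplified event record (the `SecurityEvent` fields, pod names, and the choice of `CAP_SYS_ADMIN` as a "risky" capability are illustrative assumptions, not the schema of any real tool):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SecurityEvent:
    """Hypothetical, simplified security event; not a real tool's schema."""
    time: datetime
    pod: str
    kind: str                              # "exec", "connect", "privilege_escalation"
    capabilities: frozenset = frozenset()  # Linux capabilities held by the process
    destination: str = ""                  # remote domain for "connect" events


def privileged_pods(events, risky=frozenset({"CAP_SYS_ADMIN"})):
    """Pods seen running with any capability from the risky set."""
    return sorted({e.pod for e in events if e.capabilities & risky})


def connections_to(events, domain):
    """Pods that made an outbound connection to the given domain."""
    return sorted({e.pod for e in events
                   if e.kind == "connect" and e.destination == domain})


def recent_escalations(events, now, days=30):
    """Privilege escalation events observed within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [e for e in events
            if e.kind == "privilege_escalation" and e.time >= cutoff]


# Made-up sample data, purely for illustration.
NOW = datetime(2024, 1, 31, tzinfo=timezone.utc)
EVENTS = [
    SecurityEvent(NOW - timedelta(days=2), "payments-7d9f", "exec",
                  frozenset({"CAP_SYS_ADMIN"})),
    SecurityEvent(NOW - timedelta(days=1), "web-5c4b", "connect",
                  destination="known-bad.domain.com"),
    SecurityEvent(NOW - timedelta(days=45), "batch-1a2b", "privilege_escalation"),
    SecurityEvent(NOW - timedelta(days=3), "web-5c4b", "privilege_escalation"),
]

print(privileged_pods(EVENTS))
print(connections_to(EVENTS, "known-bad.domain.com"))
print([e.pod for e in recent_escalations(EVENTS, NOW)])
```

The point of the sketch is that each question only becomes answerable once the events were collected ahead of time; in practice the same queries would typically run in a SIEM or log store rather than over an in-memory list.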

Each of the leading cloud providers has its own observability stack for cloud-native environments. As systems become more complex, observability in cloud-native environments becomes more challenging.

Now we understand the importance of security observability. But the question is: how do we build it?

According to the 2021 Verizon Data Breach Investigations Report, cloud assets were involved in 24% of all breaches analyzed in the report, up from 19% in 2020.

So, who will come to the rescue? eBPF.

What is eBPF?

eBPF is an emerging technology that enables event-driven custom code to run natively in an operating system kernel. This has spawned a new era of network, observability, and security platforms. eBPF extends kernel functionality to observe and enforce runtime security policy without requiring changes to applications or the kernel. eBPF originated with BPF, a kernel technology that was originally developed to aid packet filtering for tools such as the inimitable tcpdump packet-capture utility.

The result of this BPF expansion—extended BPF or eBPF—allows programs broad access to kernel functions and system memory, but in a protected way. eBPF lets you gather detailed information about low-level networking, security, and other system-level activities within the kernel. Better yet, it works without requiring direct modifications to kernel code.

Unlike programs that run in user space, eBPF programs are inherently more efficient and potentially more powerful because they can see and respond to nearly all operations performed by the operating system. This means that, for the purposes of application tracing, eBPF programs also provide the advantage of not requiring any code instrumentation. And because eBPF supports event-driven functions, it allows tracing to be performed efficiently because CPU cycles are used only when needed.

In summary, eBPF programs allow safe and efficient access into kernel operations by:

  • Providing built-in hooks for programs based on system calls, kernel functions, network events, and other triggers.
  • Providing a mechanism for compiling and verifying code prior to running, which helps ensure the security and stability of the system.
  • Offering a more straightforward way to enhance kernel functionality than is possible through loadable kernel modules (LKMs), thereby allowing even small teams to efficiently develop safe programs that run in kernel space.

Kernel space protects memory and hardware by restricting access to only the operating system and some specialized processes.

Why eBPF?

eBPF collects and filters security observability data directly in the kernel, from memory or disk, and exports it to userspace as security observability events, where the data can be sent to a SIEM for advanced analysis. Because the kernel is shared across all containers, these events provide a historical record of the entire environment, from containers to the node processes that make up a Kubernetes cluster.
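To make the "filter in the kernel, export to userspace" flow concrete, here is a toy model in plain Python. It is not eBPF code and does not touch the kernel; the class name, event fields, and filter rule are all assumptions made for illustration. It only shows why filtering at the source reduces what userspace has to process:

```python
from collections import deque


class ToyKernelFilter:
    """Plain-Python stand-in for an in-kernel filter (illustrative only)."""

    def __init__(self, predicate, capacity=1024):
        self.predicate = predicate
        self.ring = deque(maxlen=capacity)  # bounded buffer shared with "userspace"
        self.seen = 0                       # raw events inspected at the hook

    def on_event(self, event):
        """Called for every raw event; only matching events are exported."""
        self.seen += 1
        if self.predicate(event):
            self.ring.append(event)

    def drain(self):
        """Userspace side: take everything exported so far."""
        exported = list(self.ring)
        self.ring.clear()
        return exported


def is_sensitive_open(event):
    """Hypothetical rule: only opens of files under /etc/ are interesting."""
    return event["syscall"] == "openat" and event["path"].startswith("/etc/")


raw = [
    {"syscall": "openat", "path": "/etc/shadow", "pid": 4242},
    {"syscall": "read", "path": "", "pid": 4242},
    {"syscall": "openat", "path": "/tmp/cache", "pid": 77},
    {"syscall": "openat", "path": "/etc/passwd", "pid": 4242},
]

kernel_filter = ToyKernelFilter(is_sensitive_open)
for ev in raw:
    kernel_filter.on_event(ev)

exported = kernel_filter.drain()
print(f"{kernel_filter.seen} raw events, {len(exported)} exported")
```

In a real deployment the predicate would be an eBPF program attached to a kernel hook and the buffer would be a kernel-managed map; the sketch only mirrors the shape of the data flow.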

Security observability data includes Kubernetes identity-aware information, such as labels, namespaces, pod names, container images, and more. eBPF programs can be used to translate and map processes, system calls, and network functions into a Kubernetes workload and identity.
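As a rough illustration of that mapping, the sketch below attaches workload identity to a kernel-level event by looking up its container ID. The `PodIdentity` type, the `CONTAINER_INDEX` lookup table, and all sample values are hypothetical; a real agent would keep such an index in sync with the Kubernetes API or the container runtime:

```python
from dataclasses import dataclass, field


@dataclass
class PodIdentity:
    namespace: str
    pod: str
    image: str
    labels: dict = field(default_factory=dict)


# Hypothetical index from container ID to workload identity.
CONTAINER_INDEX = {
    "3f2a9c": PodIdentity("payments", "api-6b8d",
                          "registry.example/api:1.4", {"app": "api"}),
}


def enrich(event, index):
    """Return a copy of the kernel-level event with workload identity attached."""
    enriched = dict(event)
    identity = index.get(event.get("container_id", ""))
    if identity is None:
        # Not running in a known container: treat it as a node process.
        enriched["workload"] = "node process"
    else:
        enriched.update(
            workload=f"{identity.namespace}/{identity.pod}",
            image=identity.image,
            labels=dict(identity.labels),
        )
    return enriched


raw_exec = {"pid": 4242, "binary": "/bin/sh", "container_id": "3f2a9c"}
raw_host = {"pid": 1, "binary": "/usr/lib/systemd/systemd", "container_id": ""}

print(enrich(raw_exec, CONTAINER_INDEX)["workload"])
print(enrich(raw_host, CONTAINER_INDEX)["workload"])
```

Once events carry namespace, pod, image, and labels, questions can be asked per workload instead of per PID.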

eBPF programs are able to both observe Kubernetes workloads and enforce user-defined security policies. With access to all data that the kernel is aware of, you can monitor arbitrary kernel events such as system calls, network sockets, file descriptors, Unix domain socket connections, etc. Security policies that observe and enforce desired behaviors are defined at runtime, using a combination of kernel events. They can be fine-grained and applied to specific workloads by using policy selectors. If the appropriate policy selectors match, the pod can be terminated or paused for later investigation.
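The selector-then-action logic can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `Policy` fields, the label-equality selector, and the action names are made up for illustration and do not reproduce the policy format of any particular tool:

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    match_labels: dict   # policy selector: every label must match the pod
    event_kind: str      # kernel event the policy watches, e.g. "file_open"
    path_prefix: str     # condition on the event itself
    action: str          # "log", "pause", or "terminate"


def selects(policy, pod_labels):
    """True when the pod carries every label named by the policy selector."""
    return all(pod_labels.get(k) == v for k, v in policy.match_labels.items())


def decide(policies, event):
    """Return (action, policy name) for the first matching policy."""
    for policy in policies:
        if (selects(policy, event["labels"])
                and event["kind"] == policy.event_kind
                and event["path"].startswith(policy.path_prefix)):
            return policy.action, policy.name
    return "allow", None


POLICIES = [
    Policy("protect-secrets", {"app": "api"}, "file_open", "/etc/shadow", "terminate"),
    Policy("watch-tmp", {"tier": "batch"}, "file_open", "/tmp/", "pause"),
]

hit = {"kind": "file_open", "path": "/etc/shadow",
       "labels": {"app": "api", "tier": "web"}}
miss = {"kind": "file_open", "path": "/etc/shadow",
        "labels": {"app": "frontend"}}

print(decide(POLICIES, hit))
print(decide(POLICIES, miss))
```

The same event produces different outcomes depending on which workload triggered it, which is what makes selector-based policies fine-grained.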

Areas where eBPF can be leveraged

  • Kernel Observability
  • Routing Network Traffic
  • Tracking TCP Connections
  • Pod and Container Statistics

The Four Golden Signals of Security Observability

As described in the book Security Observability with eBPF by Jed Salazar:

Just like the SRE principle of service level objectives (SLOs), security observability allows us to assess our current security and track improvements over time.

SRE defines four golden signals for monitoring distributed systems. Similarly, we define the four golden signals of container security observability as:

  • process execution,
  • network sockets (TCP, UDP, and Unix),
  • file access, and
  • layer 7 network identity

Collectively, these data points provide crucial information on what occurred during the lifecycle of containers to detect a breach, identify compromised systems, understand the impact of the breach, and remediate affected systems.
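One way to picture how these signals combine is a per-container timeline. The sketch below is illustrative only: the `Signal` enum simply names the four signals listed above, and the tuple layout, container name, and sample events are assumptions:

```python
from collections import defaultdict
from enum import Enum


class Signal(Enum):
    PROCESS_EXECUTION = "process execution"
    NETWORK_SOCKET = "network socket"
    FILE_ACCESS = "file access"
    L7_IDENTITY = "layer 7 network identity"


def timelines(events):
    """Group (timestamp, container, signal, detail) tuples into ordered timelines."""
    by_container = defaultdict(list)
    for ts, container, signal, detail in sorted(events, key=lambda e: e[0]):
        by_container[container].append((ts, signal, detail))
    return dict(by_container)


def missing_signals(timeline):
    """Signals with no recorded data for this container (a visibility gap)."""
    seen = {signal for _, signal, _ in timeline}
    return [s for s in Signal if s not in seen]


# Made-up events, deliberately out of order.
EVENTS = [
    (3, "web-5c4b", Signal.FILE_ACCESS, "open /etc/passwd"),
    (1, "web-5c4b", Signal.PROCESS_EXECUTION, "exec /bin/sh"),
    (2, "web-5c4b", Signal.NETWORK_SOCKET, "connect 203.0.113.7:443"),
]

for ts, signal, detail in timelines(EVENTS)["web-5c4b"]:
    print(ts, signal.value, detail)
print([s.value for s in missing_signals(timelines(EVENTS)["web-5c4b"])])
```

Reading the timeline in order reconstructs what happened inside the container, and `missing_signals` highlights where an investigation would have no data to work with.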

As shown in Figure 2, eBPF provides full insights into the four golden signals of security observability.

Figure 2. eBPF collection points for a process, correlated by the

To know more about these signals, I would highly recommend reading the book Security Observability with eBPF.

Books that will help you understand eBPF:

Top Open Source tools in use for Kubernetes security:

  • Kubescape
  • Kube-hunter
  • Kubeshark
  • Open Policy Agent (OPA)
  • Checkov
  • Falco
  • KubeLinter
  • Cilium
  • Hubble
  • Inspector Gadget

Labs & Hands-on:

Enjoy the benefits of eBPF:

Security teams can use eBPF directly to write applications that are allowed privileged levels of access to kernel operations. An example of this type of product is Cilium Tetragon.

Cilium Tetragon is an open-source security observability and runtime enforcement tool from the makers of Cilium. It captures different process and network event types through a user-supplied configuration to enable security observability on arbitrary hook points in the kernel. These different event types correspond to each of the golden signals.

For example, to detect process execution, Cilium Tetragon detects when a process starts and stops. To detect network sockets, it detects whenever a process opens, closes, accepts, or listens on a network socket. File access is observed by monitoring file descriptors and a combination of system calls, such as open, read, and write. To gain layer 7 network identity, it takes advantage of the fields observed during a connection via network sockets.
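As a hedged sketch of how such events might be consumed downstream, the Python below classifies JSON event lines into golden signals. The field names (`process_exec`, `process_exit`, `process_kprobe`, `function_name`, `process.pod`) and the kernel function names follow my understanding of Tetragon's JSON export and example policies; treat them as assumptions and check them against the version you run. The sample events are made up:

```python
import json

# Assumed kprobe function names, grouped by the signal they indicate.
NETWORK_FUNCS = {"tcp_connect", "tcp_close", "tcp_sendmsg"}
FILE_FUNCS = {"security_file_permission", "fd_install"}


def classify(line):
    """Map one JSON event line to a golden signal, or None if unrecognized."""
    event = json.loads(line)
    if "process_exec" in event or "process_exit" in event:
        body = event.get("process_exec") or event.get("process_exit")
        signal = "process execution"
    elif "process_kprobe" in event:
        body = event["process_kprobe"]
        func = body.get("function_name", "")
        if func in NETWORK_FUNCS:
            signal = "network socket"
        elif func in FILE_FUNCS:
            signal = "file access"
        else:
            signal = "other"
    else:
        return None
    process = body.get("process", {})
    pod = process.get("pod", {})
    workload = f'{pod.get("namespace", "")}/{pod.get("name", "")}' if pod else "host"
    return {"signal": signal, "binary": process.get("binary", ""), "pod": workload}


# Made-up sample events in the assumed shape.
PROC = {"binary": "/usr/bin/curl", "pod": {"namespace": "default", "name": "xwing"}}
SAMPLE = [
    json.dumps({"process_exec": {"process": PROC}}),
    json.dumps({"process_kprobe": {"process": PROC, "function_name": "tcp_connect"}}),
    json.dumps({"process_kprobe": {"process": PROC,
                                   "function_name": "security_file_permission"}}),
]

for line in SAMPLE:
    print(classify(line))
```

In practice the lines would come from the tool's event log or export stream rather than an in-memory list, and layer 7 identity would need additional fields that this sketch does not model.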

I hope you’ve enjoyed this short journey into the world of security observability and eBPF. It is the technology we’ve always wanted when in the trenches of threat detection and security engineering due to the fully customizable, in-kernel detection and prevention capabilities. It’s an incredibly exciting time for eBPF as it’s gone from an emerging Linux kernel technology to one of the hottest new tools in distributed computing and infrastructure technology.


I appreciate you reading The Security Chef.
