"What Is eBPF?" Illustrated

2022-05-13 16:18:14





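"What Is eBPF?" Illustrated is a set of diagram-style notes covering the concepts, principles, and use cases of eBPF (Extended Berkeley Packet Filter). eBPF is a lightweight virtual machine that runs inside the kernel; it can trace and filter kernel events, which makes it possible to monitor and optimize system performance. The notes use diagrams and examples to explain how eBPF works and how it is used.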

Reading notes

eBPF

Linux

K8S

Illustrated Good Books Contest

Outline/Content

Original source: https://isovalent.com/data/liz-rice-what-is-ebpf.pdf

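Over the past few years, the extended Berkeley Packet Filter (eBPF) has gone from relative obscurity to one of the hottest technology areas in modern infrastructure computing. In this report, Liz Rice dives into eBPF and explains how the framework enables networking, security, and observability tooling for modern computing environments. SREs, operations engineers, and engineering team leads will learn what eBPF is and why it is so powerful.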

Introduction

more than “Extended Berkeley Packet Filter”

a framework that allows users to load and run custom programs within the kernel

Linux-based, but an implementation for Windows is under development

can change kernel behavior

verified before being loaded

attached to an event

powerful but complex

provides ability to create a new generation of tooling covering observability, security, networking, and more

to see eBPF in action, a great place to start is the BCC project

Changing the Kernel Is Hard

kernel is complex

must be accepted by the community (and more specifically by Linus Torvalds) and be for the greater good of all

it takes literally years to get new functionality from the idea stage into a production environment Linux kernel

Kernel modules are a possible alternative but unsafe: they may crash the kernel or include vulnerabilities that an attacker could exploit

what does eBPF do to make it easier

safety:

The verifier ensures that an eBPF program will always terminate safely and within a bounded number of instructions.

eBPF programs can only access memory they are supposed to access.

it would still be possible to write a malicious eBPF program, so load only trusted eBPF programs from verifiable sources, and only grant permissions to manage eBPF tools to people that you would trust with root access

efficiency

eBPF programs can be loaded into and removed from the kernel dynamically

how to write

The kernel accepts eBPF programs in bytecode form.

High-level languages that compile to eBPF bytecode are C (compiled with clang/LLVM) and, more recently, Rust.

The user space part of an eBPF tool needs to load the program into the kernel and attach it to the right event.

In theory the user space part can be written in any language; in practice, C, Go, Rust, and Python have libraries that support eBPF.

libbpf has become a popular option for making eBPF programs portable across different versions of the kernel.

A user space application uses the bpf() system call to load eBPF programs from an ELF file into the kernel

attach points:

Entry to/Exit from Functions: kprobes (attached to a kernel function entry point) and kretprobes (function exit), or the more efficient alternatives fentry/fexit.

Tracepoints: listed under /sys/kernel/debug/tracing/events

Perf Events: run `perf list` for details

Linux Security Module Interface: AppArmor or SELinux make use of LSM.

Network Interfaces (eXpress Data Path, XDP): triggered when a packet is received. eBPF programs can inspect or modify it, pass it on, drop it, or redirect it.

Sockets: run when applications open or perform other operations on a network socket, as well as when messages are sent or received.

Other Networking Hooks: traffic control (tc) hooks within the kernel’s network stack, where eBPF programs can run after initial packet processing.
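To make the tracepoint attach point more concrete, here is a minimal Python sketch that walks the tracepoint directory mentioned above and returns names in `category:event` form. The function name `list_tracepoints` and the constant `DEFAULT_ROOT` are hypothetical, introduced only for illustration; the sketch assumes that each event is a subdirectory of its category directory, and reading the real directory usually requires root.

```python
from pathlib import Path

# Path taken from the notes above; usually readable only by root.
DEFAULT_ROOT = "/sys/kernel/debug/tracing/events"


def list_tracepoints(root=DEFAULT_ROOT):
    """Return tracepoint names as 'category:event', sorted alphabetically.

    Assumes the layout <root>/<category>/<event>/, where plain files such
    as 'enable' or 'filter' are control files and not events.
    """
    names = []
    base = Path(root)
    for category in sorted(p for p in base.iterdir() if p.is_dir()):
        for event in sorted(p for p in category.iterdir() if p.is_dir()):
            names.append(f"{category.name}:{event.name}")
    return names
```

Calling `list_tracepoints()` as root on a machine with debugfs mounted should give a list comparable to the tracepoint entries printed by `perf list`.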

eBPF Maps

data structures that are defined alongside eBPF programs

There are a variety of types, but all in key–value form

eBPF programs can read and write to them, as can user space code, so maps are used to exchange data between an eBPF program and user space code.

header files can be included that define those data structures in both the user space and kernel code, but if these aren’t written in the same language, the author(s) will need to carefully create structure definitions that are byte-for-byte compatible.
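The byte-for-byte compatibility point can be sketched in Python with the standard `struct` module. The `struct event` layout in the comment is a hypothetical example, not a structure from the report, and the sketch assumes a little-endian machine on which the C compiler aligns `__u64` to 8 bytes. The names `EVENT`, `NAIVE`, and `decode_event` are likewise made up for illustration.

```python
import struct

# Hypothetical kernel-side definition (C), shown for reference only:
#   struct event { __u32 pid; __u64 ts_ns; char comm[16]; };
# The C compiler inserts 4 padding bytes after pid so that ts_ns is
# 8-byte aligned; user space code must reproduce that padding exactly.

EVENT = struct.Struct("<I4xQ16s")  # matches the C layout, padding included
NAIVE = struct.Struct("<IQ16s")    # forgets the padding: 4 bytes too short


def decode_event(raw: bytes) -> dict:
    """Decode one event value as it would be read from a map."""
    pid, ts_ns, comm = EVENT.unpack(raw)
    return {
        "pid": pid,
        "ts_ns": ts_ns,
        # comm is a NUL-padded fixed-size buffer; keep the part before the first NUL
        "comm": comm.split(b"\0", 1)[0].decode(),
    }


raw = EVENT.pack(1234, 5_000_000, b"bash")
print(EVENT.size, NAIVE.size)
print(decode_event(raw))
```

The two sizes differ by the four padding bytes, so a reader using the naive format cannot unpack a value written by the kernel side at all; this kind of mismatch is what shared header files avoid when both sides are written in C.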

Opensnoop Example

a utility that shows you what files any process opens.

see the BCC project
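As a rough idea of what a BCC-based tool looks like, the sketch below uses the BCC Python bindings to attach a small eBPF program to the `openat` system call. It is a simplified stand-in and not the real opensnoop, which also reports details such as the process and file name. The names `BPF_PROGRAM`, `trace_openat`, and `run` are hypothetical. Running it is assumed to require root, the BCC Python package, and kernel headers on the machine.

```python
# eBPF program in restricted C; BCC compiles it when BPF(text=...) is called.
BPF_PROGRAM = r"""
int trace_openat(void *ctx) {
    bpf_trace_printk("openat called\n");
    return 0;
}
"""


def run():
    """Load the program, attach it to the openat syscall, and print trace output."""
    # Imported lazily so this file can be loaded on machines without BCC.
    from bcc import BPF

    b = BPF(text=BPF_PROGRAM)  # compile and load into the kernel
    b.attach_kprobe(
        event=b.get_syscall_fnname("openat"),  # resolve the kernel symbol for the syscall
        fn_name="trace_openat",
    )
    b.trace_print()  # stream lines from the kernel trace pipe until interrupted
```

Calling `run()` as root prints one line each time any process calls `openat`. The user space half (Python) and the kernel half (C) live in the same file, which is typical of BCC tools.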

Complexity

Portability Across Kernels

BCC Approach: compiling at runtime

compilation toolchain and kernel header files needed on destination machine

compilation must finish before the tool can start running

CO-RE (compile once, run everywhere)

BTF (BPF Type Format)

Modern Linux kernels support BTF.

can generate a header file containing all the data structure information about a kernel that a BPF program might need

libbpf, the BPF library

it leans on BTF information to adjust the eBPF code to compensate for any differences between the data structures present when it was compiled, and what’s on the destination machine.

Compiler support

The clang compiler was enhanced so that when it compiles eBPF programs, it includes what are known as BTF relocations, which are what libbpf uses to know what to adjust as it loads BPF programs and maps into the kernel.

Optionally, a BPF skeleton

A skeleton can be autogenerated from a compiled BPF object file using bpftool gen skeleton, containing handy functions that user space code can call to manage the lifecycle of BPF programs: loading them into the kernel, attaching them to events, and so on. These functions are higher-level abstractions that can be more convenient for the developer than using libbpf directly.

for a more detailed explanation of CO-RE, read Andrii Nakryiko’s excellent description
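The idea behind CO-RE relocations can be illustrated with a toy model in Python. This is not how libbpf is implemented; it only shows why accessing a field by name, with the offset resolved against the target's type information, survives a layout change where a hard-coded offset would not. The two structures `TaskOld` and `TaskNew` and the helpers `field_location` and `read_field` are hypothetical.

```python
import ctypes
import sys


class TaskOld(ctypes.Structure):
    """Hypothetical layout on the kernel the program was compiled against."""
    _fields_ = [("pid", ctypes.c_uint32), ("state", ctypes.c_uint32)]


class TaskNew(ctypes.Structure):
    """Hypothetical layout on the destination kernel: a field was added in front."""
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("pid", ctypes.c_uint32),
        ("state", ctypes.c_uint32),
    ]


def field_location(layout, field):
    """Look up (offset, size) of a field by name, standing in for BTF type info."""
    descriptor = getattr(layout, field)
    return descriptor.offset, descriptor.size


def read_field(raw, layout, field):
    """Read an unsigned integer field using the offset resolved for this layout."""
    offset, size = field_location(layout, field)
    return int.from_bytes(raw[offset:offset + size], sys.byteorder)


print(field_location(TaskOld, "pid"))
print(field_location(TaskNew, "pid"))
raw = bytes(TaskNew(0, 42, 1))
print(read_field(raw, TaskNew, "pid"))
```

The offset of `pid` moves when the layout changes, yet the read by name still returns the right value. In real CO-RE the equivalent adjustment is applied by libbpf at load time, using the relocation records emitted by clang and the BTF information of the running kernel.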

Linux Kernel Knowledge

kernel documentation can be sparse, so source code reading might be needed

time-of-check to time-of-use (TOCTTOU) issues (see the linked presentation)

BPF CO-RE only handles changes in data structure layout, not changes to functions or tracepoints

Coordinating Multiple eBPF Programs

things get more complicated when you need to coordinate interactions between different types of events

As an example, Cilium sees network packets at a variety of points through the kernel’s networking stack, and manipulates traffic based on information from the Kubernetes CNI (container network interface) about Kubernetes pods. Building this system requires Cilium developers to have an in-depth understanding of how the kernel handles network traffic, and how the user space concepts of “pods” and “containers” map to kernel concepts like cgroups and namespaces.

Building eBPF tooling yourself is a good learning exercise, and experience in this area could be highly valuable, since it’s bound to remain a sought-after specialist skill for years to come.

Realistically, though, most organizations are unlikely to build much bespoke eBPF tooling in-house and will instead leverage projects and products from the specialist eBPF community.

eBPF in Cloud Native

all the containers running on that machine share the same kernel

sidecar vs eBPF

a sidecar needs to be injected into every pod, which may waste resources by duplicating common data

eBPF, Process Isolation, and Security

the kernel operates on processes, and uses cgroups and namespaces to isolate processes from each other

The security controls that exist on a Linux system assume that the kernel itself can be trusted

with the verifier, eBPF code is subject to much stricter controls than the surrounding kernel code

do not run sensitive applications on a shared machine alongside untrusted applications

For highly sensitive data, you might not even want to run within a virtual machine on the same bare metal as untrusted users

nonroot users don’t have permission to load eBPF programs

you do have to be careful about the provenance of the code you run (signature checking of eBPF programs)