"What Is eBPF?" illustrated notes
2022-05-13 16:18:14
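"What Is eBPF?" is an illustrated summary that covers the concepts, principles, and use cases of eBPF (Extended Berkeley Packet Filter). eBPF is a lightweight virtual machine that runs inside the kernel; it can trace and filter kernel events, which makes it possible to monitor and optimize system performance. The diagrams and examples help readers understand how eBPF works and how to use it.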
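Outline / Content
Original: https://isovalent.com/data/liz-rice-what-is-ebpf.pdf
Over the past few years, the extended Berkeley Packet Filter (eBPF) has gone from relative obscurity to one of the hottest technology areas in modern infrastructure computing. In this report, Liz Rice dives into eBPF and explains how the framework enables networking, security, and observability tooling for modern computing environments. SREs, operations engineers, and engineering team leads will learn what eBPF is and why it is so powerful.
Introduction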
more than “Extended Berkeley Packet Filter”
a framework that allows users to load and run custom programs within the kernel
Linux-based, but an implementation for Windows is under development
can change kernel behavior
programs are verified before they are loaded
attached to an event
powerful but complex
provides ability to create a new generation of tooling covering observability, security, networking, and more
to see eBPF in action, a great place to start is the BCC project
Changing the Kernel Is Hard
kernel is complex
must be accepted by the community (and more specifically by Linus Torvalds) and be for the greater good of all
it takes literally years to get new functionality from the idea stage into a production environment Linux kernel
Kernel modules are a possible alternative but are unsafe: they may crash the kernel or include vulnerabilities that an attacker could exploit
what does eBPF do to make it easier
safety:
The verifier ensures that an eBPF program will always terminate safely and within a bounded number of instructions.
eBPF programs can only access memory they are supposed to access.
it would still be possible to write a malicious eBPF program, so load only trusted eBPF programs from verifiable sources, and only grant permissions to manage eBPF tools to people that you would trust with root access
efficiency
eBPF programs can be loaded into and removed from the kernel dynamically
how to write
The kernel accepts eBPF programs in bytecode form.
High-level languages that compile to eBPF bytecode are C (compiled with clang/llvm) and, more recently, Rust.
The user space part of an eBPF tool needs to load the program into the kernel and attach it to the right event.
In theory the user space part can be written in any language; at present C, Go, Rust, and Python have libraries that support eBPF.
libbpf has become a popular option for making eBPF programs portable across different versions of the kernel.
A user space application uses the bpf() system call to load eBPF programs from an ELF file into the kernel
attach points:
Entry to/Exit from Functions: kprobes (attached to a kernel function entry point) and kretprobes (function exit), or the more efficient alternatives fentry/fexit.
Tracepoints: listed under /sys/kernel/debug/tracing/events
Perf Events: run `perf list` for details
Linux Security Module Interface: AppArmor or SELinux make use of LSM.
Network Interfaces (eXpress Data Path): triggered when a packet is received. BPF programs can inspect or modify it, pass it on, drop it, or redirect it.
Sockets: run when applications open or perform other operations on a network socket, as well as when messages are sent or received.
Other Networking Hooks: traffic control (tc) hooks within the kernel's network stack, where eBPF programs can run after initial packet processing.
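The tracepoint entry above points at a directory tree in which each category is a directory and each event is a subdirectory of its category. As a minimal sketch, assuming that layout, the following Python lists the available tracepoints as category:event names. The function name list_tracepoints and the fake tree built in demo() are illustrative assumptions; reading the real path normally requires root and a mounted debugfs or tracefs.

```python
import os
import tempfile


def list_tracepoints(events_root="/sys/kernel/debug/tracing/events"):
    """Return sorted 'category:event' names found under events_root."""
    found = []
    for category in sorted(os.listdir(events_root)):
        category_dir = os.path.join(events_root, category)
        if not os.path.isdir(category_dir):
            continue  # skip plain files such as 'enable'
        for event in sorted(os.listdir(category_dir)):
            # Events are directories; control files like 'enable' are not.
            if os.path.isdir(os.path.join(category_dir, event)):
                found.append(f"{category}:{event}")
    return found


def demo():
    # Build a tiny fake events tree so the sketch runs without root access.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "syscalls", "sys_enter_openat"))
        os.makedirs(os.path.join(root, "sched", "sched_switch"))
        open(os.path.join(root, "enable"), "w").close()
        open(os.path.join(root, "sched", "enable"), "w").close()
        return list_tracepoints(root)


if __name__ == "__main__":
    print(demo())
```

On a real machine, calling list_tracepoints() with no argument reads the path given in the notes, provided the process has permission to do so.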
eBPF Maps
data structures that are defined alongside eBPF programs
There are a variety of types, but all in key–value form
eBPF programs can read and write to them, as can user space code, so maps are the way to exchange data between an eBPF program and user space code.
header files can be included that define those data structures in both the user space and kernel code, but if these aren't written in the same language, the author(s) will need to carefully create structure definitions that are byte-for-byte compatible.
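To make the byte-for-byte requirement concrete, here is a minimal sketch of a Python user space definition that mirrors a C structure. The struct event and its fields are hypothetical, not taken from the report. The sketch assumes a typical 64-bit platform, where a 64-bit integer is aligned to 8 bytes, so the C compiler places 4 padding bytes after the 32-bit pid.

```python
import ctypes
import struct


class Event(ctypes.Structure):
    # Hypothetical mirror of a C definition shared with the eBPF program:
    #   struct event { __u32 pid; __u64 timestamp_ns; char comm[16]; };
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("timestamp_ns", ctypes.c_uint64),  # aligned to 8, so padding precedes it
        ("comm", ctypes.c_char * 16),
    ]


def decode_event(raw: bytes) -> Event:
    """Interpret raw bytes (for example, a value read from a map) as an Event."""
    return Event.from_buffer_copy(raw)


def demo():
    # Simulate the kernel-side bytes: pid, 4 padding bytes, timestamp, comm.
    raw = struct.pack("=I4xQ16s", 1234, 99, b"bash")
    ev = decode_event(raw)
    return (ctypes.sizeof(Event), Event.timestamp_ns.offset,
            ev.pid, ev.timestamp_ns, ev.comm)


if __name__ == "__main__":
    print(demo())
```

If the Python side declared the fields in a different order or with different widths, the offsets would no longer match and the decoded values would be wrong. That mismatch is the hazard the note above warns about.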
Complexity
Portability Across Kernels
BCC approach: compiling at runtime
compilation toolchain and kernel header files are needed on the destination machine
the program must be compiled before it can start running
CO-RE (compile once, run everywhere)
BTF (BPF Type Format)
Modern Linux kernels support BTF.
can generate a header file containing all the data structure information about a kernel that a BPF program might need
libbpf, the BPF library
it leans on BTF information to adjust the eBPF code to compensate for any differences between the data structures present when it was compiled, and what’s on the destination machine.
Compiler support
The clang compiler was enhanced so that when it compiles eBPF programs, it includes what are known as BTF relocations, which are what libbpf uses to know what to adjust as it loads BPF programs and maps into the kernel.
Optionally, a BPF skeleton
A skeleton can be autogenerated from a compiled BPF object file using bpftool gen skeleton, containing handy functions that user space code can call to manage the lifecycle of BPF programs: loading them into the kernel, attaching them to events, and so on. These functions are higher-level abstractions that can be more convenient for the developer than using libbpf directly.
For a more detailed explanation of CO-RE, read Andrii Nakryiko's excellent description.
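The idea behind the relocation step can be illustrated with a toy model. This is a sketch only, not libbpf's implementation. Plain dictionaries stand in for the BTF type information of the build-time kernel and the destination kernel, and the layouts, field names, and the relocate function are all hypothetical. The compiler records which field each access refers to, and the loader patches the offset to match the destination layout.

```python
# Toy stand-ins for BTF: field name -> byte offset inside one kernel struct.
BUILD_LAYOUT = {"pid": 0, "comm": 8}                # layout seen at compile time
TARGET_LAYOUT = {"pid": 0, "flags": 4, "comm": 16}  # hypothetical destination kernel


def relocate(accesses, build_layout, target_layout):
    """Rewrite recorded field accesses so offsets match the target layout.

    accesses is a list of (field_name, offset_at_build_time) pairs, playing
    the role of the relocation records emitted by the compiler.
    """
    patched = []
    for field, build_offset in accesses:
        if build_layout.get(field) != build_offset:
            raise ValueError(f"relocation record for {field!r} is inconsistent")
        if field not in target_layout:
            raise LookupError(f"field {field!r} is missing on the target kernel")
        patched.append((field, target_layout[field]))
    return patched


def demo():
    accesses = [("pid", 0), ("comm", 8)]
    return relocate(accesses, BUILD_LAYOUT, TARGET_LAYOUT)


if __name__ == "__main__":
    print(demo())
```

In this toy run the access to comm moves from offset 8 to offset 16 because the destination layout has an extra field, while pid stays at offset 0.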
Linux Kernel Knowledge
kernel documentation can be sparse, so source code reading might be needed
time-of-check to time-of-use (TOCTTOU). presentation
BPF CO-RE only handles changes in data structure layout, not changes to functions or tracepoints
Coordinating Multiple eBPF Programs
things get more complicated when coordinating interactions between different types of events
As an example, Cilium sees network packets at a variety of points through the kernel’s networking stack, and manipulates traffic based on information from the Kubernetes CNI (container network interface) about Kubernetes pods. Building this system requires Cilium developers to have an in-depth understanding of how the kernel handles network traffic, and how the user space concepts of “pods” and “containers” map to kernel concepts like cgroups and namespaces.
Building eBPF tooling is worthwhile as a learning exercise, and experience in this area could be highly valuable, since it is bound to remain a sought-after specialist skill for years to come.
Realistically, however, most organizations are unlikely to build much bespoke eBPF tooling in-house and will instead leverage projects and products from the specialist eBPF community.
eBPF in Cloud Native
all the containers running on that machine share the same kernel
sidecar vs eBPF
a sidecar needs to be injected into every pod, which may waste resources on duplicated common data
eBPF and Process Isolation and security
the kernel operates on processes, and uses cgroups and namespaces to isolate processes from each other
The security controls that exist on a Linux system assume that the kernel itself can be trusted
with the verifier, eBPF code is subject to much stricter controls than the surrounding kernel code
do not run sensitive applications on a shared machine alongside untrusted applications
For highly sensitive data, you might not even want to run within a virtual machine on the same bare metal as untrusted users
nonroot users don’t have permission to load eBPF programs
you do have to be careful about the provenance of the code you run (signature checking of eBPF programs)
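The note above that the kernel isolates processes using namespaces can be observed from user space. On Linux, each entry under /proc/<pid>/ns is a symbolic link whose target names the namespace type and an identifying inode number, so two processes share a namespace when their link targets match. The following is a minimal sketch; the helper names and the fake /proc tree built in demo() are illustrative assumptions.

```python
import os
import tempfile


def namespaces_of(pid="self", proc_root="/proc"):
    """Map namespace kind (e.g. 'net') to its link target, e.g. 'net:[4026531992]'."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def share_namespace(a, b, kind):
    """True if both processes have the given namespace kind and it is the same one."""
    return a.get(kind) is not None and a.get(kind) == b.get(kind)


def demo():
    # Fake /proc tree: two processes in different network namespaces
    # (as containers usually are) but in the same PID namespace.
    with tempfile.TemporaryDirectory() as root:
        for pid, net in (("100", "net:[4026531992]"), ("200", "net:[4026532500]")):
            ns_dir = os.path.join(root, pid, "ns")
            os.makedirs(ns_dir)
            os.symlink(net, os.path.join(ns_dir, "net"))
            os.symlink("pid:[4026531836]", os.path.join(ns_dir, "pid"))
        a = namespaces_of("100", root)
        b = namespaces_of("200", root)
        return share_namespace(a, b, "net"), share_namespace(a, b, "pid")


if __name__ == "__main__":
    print(demo())
```

Calling namespaces_of() with no arguments inspects the current process on a real Linux system. Reading another process's entries may require additional privileges.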
eBPF based tools and projects
Service Mesh
eBPF enables an efficient sidecarless model for service mesh, with one proxy per node rather than one per application pod
Observability
A Pixie flamegraph of everything running on a small Kubernetes cluster
Cilium’s Hubble UI shows network flows in a Kubernetes cluster
Security
app.networkpolicy.io The network policy editor shows a visual representation of the effects of a policy