# Examples

This directory contains a number of small example programs which demonstrate aspects of the library.

## xdp_stats

This program demonstrates the loading of a simple XDP eBPF program. The program is in ELF format and was built with the C Clang+LLVM toolchain. It is the Basic03 example from [xdp-tutorial](https://github.com/xdp-project/xdp-tutorial/tree/master/basic03-map-counter), but the userspace side of the example is replaced by this Go program.

It demonstrates opening and decoding an ELF file into a BPFProgram and BPFMap, loading the program and map into the kernel, attaching the program to the loopback interface, and reading the stats reported by the XDP program from the map.

## xdp_stats_instructions

This program is functionally identical to the `xdp_stats` example, but instead of loading the program from an ELF file, it uses the `ebpf` package to craft the same program from individual instructions.

It demonstrates how userspace applications can generate programs dynamically without needing a full toolchain to build them.

## xdp_stats_assembly

This program is almost identical to the `xdp_stats_instructions` example, except it replaces the manual instruction crafting with eBPF assembly code, which the `ebpf` package parses and turns into eBPF instructions.

## kprobe_execve_stats

This program attaches to the `execve` syscall, which is called any time a program is executed on Linux. The program simply counts the occurrences; more advanced programs can inspect the passed arguments.

## uprobe_bash_stats

This program attaches to `/bin/bash` and is called any time the function at offset `0x030360` (`main`) is called. The offset is hardcoded in this example and may not work out of the box, because the bash binary on your system is unlikely to be identical to the one the offset was taken from. The proper way to use such a program is to read the symbols from the ELF file at runtime and derive the offset from them. This does require the target binary to expose debug symbols if you want to attach to anything other than the entry point.

The program simply counts the occurrences; more advanced programs can inspect the passed arguments.

## per_cpu_map

This eBPF program attaches to the loopback interface of the host and counts the number of packets received per CPU, storing this information in a `BPF_MAP_TYPE_PERCPU_ARRAY` map. This example demonstrates how to read from and write to a per-CPU map type with the `gobpfld.BPFGenericMap`.

## map_batch

This example demonstrates batch operations on maps. Batch operations offer speed improvements over non-batch operations, since fewer syscalls/context switches are required for the same amount of work.

## map_iterator

This example demonstrates the usage of map iterators to loop over maps. Iterators provide an easier API than using syscalls directly to loop over maps.

## map_pinning

This program demonstrates how to pin maps to and unpin maps from the BPF file system using gobpfld.

## bpf_to_bpf

This eBPF program demonstrates [BPF to BPF calls](https://docs.cilium.io/en/stable/bpf/#bpf-to-bpf-calls). It shows that gobpfld can relocate code from the `.text` section of ELF files and recalculate the addresses of call instructions.

The program records traffic usage per IP protocol, UDP destination port and TCP destination port. Since the program supports both IPv4 and IPv6, it is a good demo of BPF to BPF calls: without this feature the `inc_*` functions would have to be inlined multiple times.

The `handle_ipv4` and `handle_ipv6` functions in turn call the `inc_ip_proto`, `inc_udp`, and `inc_tcp` functions, thus showing that multiple levels of calls are possible. The `inc_*` functions access maps, which verifies that map FD relocations in the `.text` ELF section are also handled.

During loading, the verbose verifier log is dumped, which confirms the usage of BPF to BPF calls in the first few lines:
```
BPF Verifier log:
func#0 @0
func#1 @27
func#2 @59
func#3 @91
func#4 @118
func#5 @147
```
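
The address recalculation mentioned above comes down to simple arithmetic: the immediate of a BPF to BPF call holds the distance, in 8-byte instruction slots, from the instruction after the call to the first instruction of the callee. The sketch below assumes the loader appends `.text` directly after the entry program. The function names and the call site are made up for this illustration; this is not gobpfld's actual relocation code.

```go
package main

import "fmt"

// callImm returns the immediate for a BPF-to-BPF call: the distance, in
// instruction slots, from the instruction after the call to the callee.
func callImm(callPC, targetPC int) int32 {
	return int32(targetPC - (callPC + 1))
}

// relocate assumes the loader appends the .text section directly after the
// entry program, so a function at textOff ends up at progLen+textOff.
func relocate(progLen, callPC, textOff int) int32 {
	return callImm(callPC, progLen+textOff)
}

func main() {
	// Hypothetical numbers: a 27-instruction entry program that calls, from
	// instruction 10, a function located 32 instructions into .text.
	fmt.Println(relocate(27, 10, 32)) // 27 + 32 - (10 + 1) = 48
}
```

With these numbers the callee lands at instruction 59, which is how an entry such as `func#2 @59` in a verifier log can be read.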

## tailcall

This eBPF program demonstrates [Tail calls](https://docs.cilium.io/en/stable/bpf/#tail-calls). It is functionally identical to the BPF to BPF example, except the functions for the different protocols are implemented as separate programs linked together using tail calls.

When tail calling another program, the called program fully takes over and never returns to the original program. This feature has a number of useful applications, but does require some setup. For example, tail calls allow multiple, separately compiled programs to cooperate. A security team can write a tool for network auditing and a networking team another program for forwarding. By placing a small program in front of both, which decides which of them should be called, these two programs from different maintainers can run on the same network device. And since they are not the same program, they can be upgraded separately and managed by separate userspace applications (assuming the userspace programs can coordinate the loading/unloading sequence).

Another use case is A/B testing one or more variations of an eBPF program. Yet another is running multiple generated XDP programs on the same network device. The possibilities are endless, really.

However, one must also keep the following limitations in mind:
 * Tail calls have a limited depth (the kernel caps a chain at `MAX_TAIL_CALL_CNT`, which is 32 or 33 depending on the kernel version)
 * Tail calls have no arguments; any data must be passed via the program-specific context or via per-CPU maps (because all tail calls are executed on the same CPU without interruptions between them, per-CPU maps can safely be used as scratch buffers)
 * Programs can only tail call other programs of the same type, and both must be JIT-compiled or both interpreted; JIT-compiled and interpreted eBPF programs cannot be mixed.

## map_in_map

This example demonstrates the usage of the map-in-map types. The "array of maps" and "hash of maps" map types allow you to create an "outer" map whose values are pointers/file descriptors to other maps.

There are a few use cases for this feature, like:
 * Switching maps if you need to atomically update multiple settings at once
 * Switching a stats map so values in the map don't change while you are iterating over it (more accurate stats at a specific point in time)

In this case we demonstrate the API, but the example doesn't really require it.

## icmp_pcap

This example creates a raw socket and uses an eBPF program to keep only ICMP traffic. It demonstrates how to write a socket filter program as well as how to attach an eBPF program to a socket using a file descriptor.

## udp_socket_filter

This example creates a UDP socket which listens on `*:3000` using the stdlib `net` package. The eBPF program is then attached via the `net.ListenConfig.Control` callback. Even though the socket listens on all IP addresses, the eBPF program filters out all traffic except packets with destination address 127.0.0.1.

## test_xdp_program

This example demonstrates how to test an XDP program without actually attaching the program to a link and sending actual traffic. Once an eBPF program is loaded into the kernel, we can ask the kernel to call the program a given number of times with data that we specify. The kernel returns the program's return value, the updated packet, and the duration of execution in nanoseconds.

This is useful in a number of cases, for example:
  * Programmatically testing XDP programs, like a unit or integration test.
  * Testing a program on production traffic (by capturing/mirroring frames with a raw socket and passing them to the XDP program)
  * Emulating hard-to-create edge cases (corrupt packets/failed checksums)
  * Benchmarking XDP programs

## xsk_echo_reply

This example shows how to implement an ICMP echo reply (ping response) using XSK/[AF_XDP](https://www.kernel.org/doc/html/latest/networking/af_xdp.html). XSK (XDP socket) allows us to perform kernel bypass using XDP. We do this by creating a socket, much like a normal network socket. Instead of binding it to a port and/or IP, we bind it to a network interface and NIC queue. An XDP program is attached to the same network interface; this program can now send frames over the socket directly to the userspace application, thus bypassing the kernel network stack.

We can also transmit over this socket, which again bypasses the kernel stack. The technique is quite advanced and requires a lot of work in userspace (userspace network stack/packet decoding). However, it is also very powerful: applications range from virtualization to very fast packet capture. A major advantage of XSK is that we can directly read from and write to the same memory buffer the network driver uses to transmit and receive data. This offers great performance because no data has to be copied between userspace and the kernel.

The example implements manual packet decoding, so that it doesn't add extra dependencies to the whole library. In your own code, a packet decoding/encoding library like [gopacket](https://github.com/google/gopacket) comes highly recommended.

## xsk_multi_sock

This example is a variation on the `xsk_echo_reply` example. The main difference is that this example works on multi-queue NICs. On systems with multi-queue NICs, incoming traffic is distributed among all RX queues based on flow (which fields are used depends on the protocol stack). This behaviour can be changed using the [ethtool](https://linux.die.net/man/8/ethtool) utility.

So by default, in order to use XSK on a whole network device, you need to bind an XSK to every RX/TX queue. Since an XSK can only be bound to one queue at a time, you will have to manage a number of them. At first it might seem possible to redirect all frames to one socket, since you can pick which socket to use in the XDP program. Unfortunately this doesn't work: XDP is only allowed to redirect to sockets bound to the same queue the frame arrived on. The XSK map is only meant for situations where more than one socket is bound per queue (not yet supported by GoBPFLD).

To make interacting with multiple sockets easier, GoBPFLD provides the `XSKMultiSocket`, which can be created using the `NewXSKMultiSocket` function. This multi socket has the same functions as the `XSKSocket`, except it balances reads and writes between all sockets contained in it. Like the `XSKSocket`, the `XSKMultiSocket` is not safe for concurrent use: only one goroutine can read from or write to it at a time, so using the multi socket limits all reading and writing to a single goroutine. Thus, if latency or throughput is important, it is recommended not to use the `XSKMultiSocket` and instead start a separate goroutine for each `XSKSocket`. Do keep in mind that when not using the `XSKMultiSocket`, you are responsible for balancing outgoing (TX) packets across the sockets.

The example contains both approaches, which can be selected using a flag.
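
The goroutine-per-socket approach can be sketched as follows. The `frameSocket` interface is a hypothetical stand-in for a single-queue socket and does not match the method set of gobpfld's `XSKSocket`; `chanSocket` is an in-memory fake that only exists to make the sketch runnable.

```go
package main

import (
	"fmt"
	"sync"
)

// frameSocket is a hypothetical stand-in for a single-queue XSK socket; it is
// not gobpfld's actual interface.
type frameSocket interface {
	ReadFrame() ([]byte, bool) // ok is false once the socket is closed
	WriteFrame([]byte) error
}

// serve starts one goroutine per socket so that no socket is shared between
// goroutines, and sends each reply on the queue its request arrived on.
func serve(socks []frameSocket, handle func([]byte) []byte) {
	var wg sync.WaitGroup
	for _, s := range socks {
		wg.Add(1)
		go func(s frameSocket) {
			defer wg.Done()
			for {
				frame, ok := s.ReadFrame()
				if !ok {
					return
				}
				if reply := handle(frame); reply != nil {
					_ = s.WriteFrame(reply)
				}
			}
		}(s)
	}
	wg.Wait()
}

// chanSocket is an in-memory fake used to exercise serve.
type chanSocket struct {
	rx chan []byte
	tx chan []byte
}

func (c *chanSocket) ReadFrame() ([]byte, bool) { f, ok := <-c.rx; return f, ok }
func (c *chanSocket) WriteFrame(f []byte) error { c.tx <- f; return nil }

func main() {
	fakes := make([]*chanSocket, 2)
	socks := make([]frameSocket, 2)
	for i := range fakes {
		fakes[i] = &chanSocket{rx: make(chan []byte, 1), tx: make(chan []byte, 1)}
		fakes[i].rx <- []byte{byte(i)} // one pending frame per queue
		close(fakes[i].rx)
		socks[i] = fakes[i]
	}
	serve(socks, func(f []byte) []byte { return f }) // echo every frame
	for i, f := range fakes {
		fmt.Println("queue", i, "sent", <-f.tx)
	}
}
```

Replying on the socket a frame arrived on sidesteps TX balancing for request/response traffic. Traffic that originates in userspace still needs an explicit policy for choosing a socket.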

## TODO

* xsk encapsulation example
* xsk write lease example
* LPM trie example