# Project Ideas for Google Summer of Code 2021

This is a collection of project ideas for
[Google Summer of Code 2021][gsoc-2021-site]. These projects are intended to be
relatively self-contained and should be good starting projects for new
contributors to gVisor. We expect individual contributors to be able to make
reasonable progress on these projects over the course of several weeks.
Familiarity with Golang and knowledge of systems programming in Linux will be
helpful.

If you're interested in contributing to gVisor through Google Summer of Code
2021, but would like to propose your own idea for a project, please see our
[roadmap](../roadmap.md) for areas of development, and get in touch through our
[mailing list][gvisor-mailing-list] or [chat][gvisor-chat]!

## Implement the `setns` syscall

Estimated complexity: *easy*

This project involves implementing the [`setns`][man-setns] syscall. gVisor
currently supports manipulation of namespaces through the `clone` and `unshare`
syscalls. These two syscalls essentially implement the requisite logic for
`setns`, but there is currently no way to obtain a file descriptor referring to
a namespace in gVisor. As described in the `setns` man page, the two typical
ways of obtaining such a file descriptor in Linux are by opening a file in
`/proc/[pid]/ns`, or through the `pidfd_open` syscall.

For gVisor, we recommend implementing the `/proc/[pid]/ns` mechanism first,
which would involve implementing a trivial namespace file type in procfs.

## Implement `fanotify`

Estimated complexity: *medium*

Implement [`fanotify`][man-fanotify], a filesystem event notification
mechanism, in gVisor.
gVisor currently supports `inotify`, a similar mechanism with slightly
different capabilities, which should serve as a good reference.

The `fanotify` interface adds two new syscalls:

- `fanotify_init` creates a new notification group, which is a collection of
  filesystem objects watched by the kernel. The group is represented by a file
  descriptor returned by this syscall. Events on the watched objects can be
  retrieved by reading from this file descriptor.

- `fanotify_mark` adds a filesystem object to a watch group, or modifies the
  parameters of an existing watch.

Unlike `inotify`, `fanotify` can set watches on filesystems and mount points,
which will require some additional data tracking on the corresponding filesystem
objects within the sentry.

A well-designed implementation should reuse the notifications from `inotify` for
files and directories (this is also how Linux implements these mechanisms), and
should implement the necessary tracking and notifications for filesystems and
mount points.

## Implement `io_uring`

Estimated complexity: *hard*

`io_uring` is the latest asynchronous I/O API in Linux. This project will
involve implementing the system interfaces required to support `io_uring` in
gVisor. A successful implementation should have performance and scalability
characteristics, relative to the synchronous I/O syscalls, similar to those in
Linux.

The core of the `io_uring` interface is deceptively simple, involving only three
new syscalls:

- `io_uring_setup(2)` creates a new `io_uring` instance represented by a file
  descriptor, including a set of request submission and completion queues
  backed by shared memory ring buffers.

- `io_uring_register(2)` optionally binds kernel resources such as files and
  memory buffers to handles, which can then be passed to `io_uring`
  operations.
  Pre-registering resources in this way moves the cost of looking up and
  validating these resources to registration time, rather than paying the cost
  during each operation.

- `io_uring_enter(2)` is the syscall used to submit queued operations and wait
  for completions. This is the most complex part of the mechanism, requiring
  the kernel to process queued requests from the submission queue, dispatch
  the appropriate I/O operation based on the request arguments, and block
  until the requested number of operations have completed before returning.

An `io_uring` request is effectively an opcode specifying the I/O operation to
perform, and corresponding arguments. The opcodes and arguments closely relate
to the corresponding synchronous I/O syscalls. In addition, there are some
`io_uring`-specific arguments that specify things like how to process requests,
how to interpret the arguments, and how to communicate the status of the ring
buffers.

For a detailed description of the `io_uring` interface, see the
[design doc][io-uring-doc] by the `io_uring` authors.

Due to the complexity of the full `io_uring` mechanism and the numerous
supported operations, it should be implemented in two stages:

In the first stage, a simplified version of the `io_uring_setup` and
`io_uring_enter` syscalls should be implemented, which will only support a
minimal set of arguments and just one or two simple opcodes. This simplified
implementation can be used to figure out how to integrate `io_uring` with
gVisor's virtual filesystem and memory management subsystems, as well as to
benchmark the implementation to ensure it has the desired performance
characteristics. The goal in this stage should be to implement the smallest
subset of features required to perform a basic operation through `io_uring`.
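
To make the request flow described above more concrete, the following is a
minimal sketch of a first-stage-style model: requests carrying an opcode are
queued, a single "enter" step dispatches each one, and results are reported
through a completion queue. It is an illustration only, not gVisor code and not
the Linux ABI. All names (`Ring`, `SQE`, `CQE`, `Submit`, `Enter`, `Reap`, the
two opcodes) are hypothetical, and plain Go slices stand in for the shared
memory ring buffers.

```go
package main

import "fmt"

// Opcode selects the operation to perform. These values are hypothetical and
// do not correspond to the Linux IORING_OP_* constants.
type Opcode uint8

const (
	OpNop   Opcode = iota // completes immediately without doing any I/O
	OpWrite               // appends Buf to the ring's in-memory sink
)

// einval is the Linux EINVAL errno value; failures are reported as -errno.
const einval = 22

// SQE is a toy submission queue entry: an opcode plus its arguments.
type SQE struct {
	Op       Opcode
	Buf      []byte
	UserData uint64 // opaque tag copied into the matching completion
}

// CQE is a toy completion queue entry.
type CQE struct {
	UserData uint64
	Res      int32 // bytes written, 0 for a no-op, or -errno on failure
}

// Ring holds the two queues. The sink stands in for a file being written.
type Ring struct {
	sq   []SQE
	cq   []CQE
	sink []byte
}

// Submit queues a request without executing it.
func (r *Ring) Submit(e SQE) { r.sq = append(r.sq, e) }

// Enter consumes every queued request, dispatches on its opcode, and posts
// one completion per request. It returns the number of requests consumed.
// Everything runs synchronously here, so there is nothing to block on.
func (r *Ring) Enter() int {
	n := len(r.sq)
	for _, e := range r.sq {
		res := int32(-einval) // default for unsupported opcodes
		switch e.Op {
		case OpNop:
			res = 0
		case OpWrite:
			r.sink = append(r.sink, e.Buf...)
			res = int32(len(e.Buf))
		}
		r.cq = append(r.cq, CQE{UserData: e.UserData, Res: res})
	}
	r.sq = r.sq[:0]
	return n
}

// Reap returns all pending completions and empties the completion queue.
func (r *Ring) Reap() []CQE {
	out := r.cq
	r.cq = nil
	return out
}

func main() {
	var r Ring
	r.Submit(SQE{Op: OpNop, UserData: 1})
	r.Submit(SQE{Op: OpWrite, Buf: []byte("hello"), UserData: 2})
	r.Submit(SQE{Op: Opcode(99), UserData: 3}) // unsupported opcode
	fmt.Println("consumed:", r.Enter())
	for _, c := range r.Reap() {
		fmt.Printf("user_data=%d res=%d\n", c.UserData, c.Res)
	}
}
```

A real implementation would differ in the parts this sketch leaves out: the
queues would live in memory shared with the application, and the enter step
would need to block until the requested number of completions is available.
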

In the second stage, support can be added for all the I/O operations supported
by Linux, as well as advanced `io_uring` features such as fixed files and
buffers (via `io_uring_register`), polled I/O and kernel-side request polling.

A single contributor can expect to make reasonable progress on the first stage
within the scope of Google Summer of Code. The second stage, while not
necessarily difficult, is likely to be very time consuming. However, it also
lends itself well to parallel development by multiple contributors.

## Implement message queues

Estimated complexity: *hard*

Linux provides two alternative message queue mechanisms:
[System V message queues][man-sysvmq] and [POSIX message queues][man-posixmq].
gVisor currently doesn't implement either.

Both mechanisms add multiple syscalls for managing and using the message queues;
see the relevant man pages above for their full description.

The cores of both mechanisms are very similar, so it may be possible to back
both mechanisms with a common implementation in gVisor. Linux, however, has two
distinct implementations.

An individual contributor can reasonably implement a minimal version of one of
these two mechanisms within the scope of Google Summer of Code. The System V
queue may be slightly easier to implement, as gVisor already implements System V
semaphores and shared memory regions, so the code for managing IPC objects and
the registry already exists.
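
As a rough illustration of what a minimal System V queue has to model, the
sketch below implements the message selection rule of `msgrcv(2)` over an
in-memory list: a type of `0` takes the first message, a positive type takes
the first message of exactly that type, and a negative type takes the first
message with the lowest type not exceeding its absolute value. The `Queue` and
`Msg` types and their methods are hypothetical names chosen for this example,
not gVisor code. Unlike the real syscall, `Receive` reports "no message"
instead of blocking, and size limits, permissions and IPC keys are ignored.

```go
package main

import "fmt"

// Msg is a toy message. Type plays the role of mtype and must be positive.
type Msg struct {
	Type int64
	Text string
}

// Queue is an in-memory, single-threaded stand-in for one message queue.
type Queue struct {
	msgs []Msg
}

// Send appends a message and reports whether it was accepted. Messages with
// a non-positive type are rejected, as msgsnd(2) requires mtype > 0.
func (q *Queue) Send(m Msg) bool {
	if m.Type <= 0 {
		return false
	}
	q.msgs = append(q.msgs, m)
	return true
}

// Receive removes and returns a message chosen by msgtyp:
//
//	msgtyp == 0: the first message in the queue
//	msgtyp  > 0: the first message whose type equals msgtyp
//	msgtyp  < 0: the first message with the lowest type <= -msgtyp
//
// The second return value is false if no message matches.
func (q *Queue) Receive(msgtyp int64) (Msg, bool) {
	idx := -1
	for i, m := range q.msgs {
		if msgtyp == 0 || m.Type == msgtyp {
			idx = i // first message overall, or first of the requested type
			break
		}
		if msgtyp < 0 && m.Type <= -msgtyp &&
			(idx < 0 || m.Type < q.msgs[idx].Type) {
			idx = i // lowest type so far; earlier messages win ties
		}
	}
	if idx < 0 {
		return Msg{}, false
	}
	m := q.msgs[idx]
	q.msgs = append(q.msgs[:idx], q.msgs[idx+1:]...)
	return m, true
}

func main() {
	var q Queue
	q.Send(Msg{Type: 3, Text: "c"})
	q.Send(Msg{Type: 1, Text: "a"})
	q.Send(Msg{Type: 2, Text: "b"})

	for _, t := range []int64{-2, 0, 2, 0} {
		m, ok := q.Receive(t)
		fmt.Printf("Receive(%d) -> %+v, ok=%v\n", t, m, ok)
	}
}
```
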

[gsoc-2021-site]: https://summerofcode.withgoogle.com
[gvisor-chat]: https://gitter.im/gvisor/community
[gvisor-mailing-list]: https://groups.google.com/g/gvisor-dev
[io-uring-doc]: https://kernel.dk/io_uring.pdf
[man-fanotify]: https://man7.org/linux/man-pages/man7/fanotify.7.html
[man-sysvmq]: https://man7.org/linux/man-pages/man7/sysvipc.7.html
[man-posixmq]: https://man7.org/linux/man-pages//man7/mq_overview.7.html
[man-setns]: https://man7.org/linux/man-pages/man2/setns.2.html