# Project Ideas for Google Summer of Code 2021

This is a collection of project ideas for
[Google Summer of Code 2021][gsoc-2021-site]. These projects are intended to be
relatively self-contained and should be good starting projects for new
contributors to gVisor. We expect individual contributors to be able to make
reasonable progress on these projects over the course of several weeks.
Familiarity with Go and knowledge of systems programming in Linux will be
helpful.

If you're interested in contributing to gVisor through Google Summer of Code
2021, but would like to propose your own idea for a project, please see our
[roadmap](../roadmap.md) for areas of development, and get in touch through our
[mailing list][gvisor-mailing-list] or [chat][gvisor-chat]!

## Implement the `setns` syscall

Estimated complexity: *easy*

This project involves implementing the [`setns`][man-setns] syscall. gVisor
currently supports manipulation of namespaces through the `clone` and `unshare`
syscalls. These two syscalls essentially implement the requisite logic for
`setns`, but there is currently no way to obtain a file descriptor referring to
a namespace in gVisor. As described in the `setns` man page, the two typical
ways of obtaining such a file descriptor in Linux are by opening a file in
`/proc/[pid]/ns`, or through the `pidfd_open` syscall.

For gVisor, we recommend implementing the `/proc/[pid]/ns` mechanism first,
which would involve implementing a trivial namespace file type in procfs.

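As a rough illustration of what such a namespace file has to expose: on Linux, each entry in `/proc/[pid]/ns` is a symbolic link whose target has the form `type:[inode]`, and two processes share a namespace exactly when the targets match. The sketch below formats and parses such targets. The helper names `nsLinkTarget` and `parseNSLink` are hypothetical and are not part of gVisor.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nsLinkTarget returns the symlink target Linux reports for a namespace
// file, for example "net:[4026531992]". Hypothetical helper, not gVisor code.
func nsLinkTarget(nsType string, inode uint64) string {
	return fmt.Sprintf("%s:[%d]", nsType, inode)
}

// parseNSLink splits a target such as "net:[4026531992]" back into the
// namespace type and the inode number that identifies the namespace.
func parseNSLink(target string) (string, uint64, error) {
	parts := strings.SplitN(target, ":[", 2)
	if len(parts) != 2 || parts[0] == "" || !strings.HasSuffix(parts[1], "]") {
		return "", 0, fmt.Errorf("malformed namespace link %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(parts[1], "]"), 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("bad inode in %q: %w", target, err)
	}
	return parts[0], inode, nil
}

func main() {
	target := nsLinkTarget("net", 4026531992)
	fmt.Println(target)

	nsType, inode, err := parseNSLink(target)
	fmt.Println(nsType, inode, err)
}
```

A real procfs implementation would also need to keep the namespace alive while the file is open and hand it to `setns`; none of that is modeled here.
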
## Implement `fanotify`

Estimated complexity: *medium*

Implement [`fanotify`][man-fanotify], a filesystem event notification
mechanism, in gVisor. gVisor currently supports `inotify`, a similar mechanism
with slightly different capabilities, which should serve as a good reference.

The `fanotify` interface adds two new syscalls:

-   `fanotify_init` creates a new notification group, which is a collection of
    filesystem objects watched by the kernel. The group is represented by a file
    descriptor returned by this syscall. Events on the watched objects can be
    retrieved by reading from this file descriptor.

-   `fanotify_mark` adds a filesystem object to a notification group, or
    modifies the parameters of an existing watch.

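To make the read side concrete, the following sketch decodes the fixed-size `struct fanotify_event_metadata` records that Linux returns when the group's file descriptor is read. The field layout, the 24-byte size, and the mask bits follow `<linux/fanotify.h>`; the Go names, the `encodeEvent` helper used to build sample data, and the little-endian assumption are choices made for this example only.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	// Event mask bits from <linux/fanotify.h>.
	fanModify     = 0x02
	fanCloseWrite = 0x08
	fanOpen       = 0x20

	metadataLen     = 24 // size of struct fanotify_event_metadata
	metadataVersion = 3  // FANOTIFY_METADATA_VERSION
)

// eventMetadata mirrors struct fanotify_event_metadata.
type eventMetadata struct {
	EventLen    uint32
	Vers        uint8
	Reserved    uint8
	MetadataLen uint16
	Mask        uint64
	FD          int32
	PID         int32
}

// encodeEvent builds one record; it exists only to produce sample input.
func encodeEvent(mask uint64, fd, pid int32) []byte {
	buf := make([]byte, metadataLen)
	binary.LittleEndian.PutUint32(buf[0:4], metadataLen)
	buf[4] = metadataVersion
	binary.LittleEndian.PutUint16(buf[6:8], metadataLen)
	binary.LittleEndian.PutUint64(buf[8:16], mask)
	binary.LittleEndian.PutUint32(buf[16:20], uint32(fd))
	binary.LittleEndian.PutUint32(buf[20:24], uint32(pid))
	return buf
}

// parseEvents decodes consecutive records from a buffer, advancing by each
// record's event_len. It assumes a little-endian host.
func parseEvents(buf []byte) ([]eventMetadata, error) {
	var events []eventMetadata
	for len(buf) > 0 {
		if len(buf) < metadataLen {
			return nil, fmt.Errorf("short event: %d bytes left", len(buf))
		}
		ev := eventMetadata{
			EventLen:    binary.LittleEndian.Uint32(buf[0:4]),
			Vers:        buf[4],
			Reserved:    buf[5],
			MetadataLen: binary.LittleEndian.Uint16(buf[6:8]),
			Mask:        binary.LittleEndian.Uint64(buf[8:16]),
			FD:          int32(binary.LittleEndian.Uint32(buf[16:20])),
			PID:         int32(binary.LittleEndian.Uint32(buf[20:24])),
		}
		if ev.EventLen < metadataLen || int(ev.EventLen) > len(buf) {
			return nil, fmt.Errorf("invalid event_len %d", ev.EventLen)
		}
		events = append(events, ev)
		buf = buf[ev.EventLen:]
	}
	return events, nil
}

func main() {
	buf := append(encodeEvent(fanOpen, 5, 100),
		encodeEvent(fanModify|fanCloseWrite, 6, 101)...)
	events, err := parseEvents(buf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ev := range events {
		fmt.Printf("mask=%#x fd=%d pid=%d\n", ev.Mask, ev.FD, ev.PID)
	}
}
```

The sentry would produce these records rather than consume them, but the same layout applies in both directions.
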
Unlike `inotify`, `fanotify` can set watches on filesystems and mount points,
which will require some additional data tracking on the corresponding filesystem
objects within the sentry.

A well-designed implementation should reuse the notifications from `inotify` for
files and directories (this is also how Linux implements these mechanisms), and
should implement the necessary tracking and notifications for filesystems and
mount points.

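One way to picture the extra tracking is a notification group that holds marks at three levels and combines them when deciding whether to deliver an event. The sketch below is a toy model under that assumption: the types `group`, `fileRef`, and `objectID` are hypothetical, are not gVisor or Linux data structures, and leave out ignore masks and permission events.

```go
package main

import "fmt"

// Event mask bits from <linux/fanotify.h>.
const (
	fanModify = 0x02
	fanOpen   = 0x20
)

// objectID is a hypothetical identifier for an inode, mount, or filesystem.
type objectID uint64

// group models a notification group holding one event mask per marked object.
type group struct {
	inodeMarks map[objectID]uint64
	mountMarks map[objectID]uint64
	fsMarks    map[objectID]uint64
}

func newGroup() *group {
	return &group{
		inodeMarks: make(map[objectID]uint64),
		mountMarks: make(map[objectID]uint64),
		fsMarks:    make(map[objectID]uint64),
	}
}

// fileRef records where an event happened: the inode itself, the mount it
// was accessed through, and the filesystem that contains it.
type fileRef struct {
	inode, mount, fs objectID
}

// matches reports whether the group asked for this event at any level.
// Unmarked objects contribute a zero mask.
func (g *group) matches(f fileRef, event uint64) bool {
	mask := g.inodeMarks[f.inode] | g.mountMarks[f.mount] | g.fsMarks[f.fs]
	return mask&event != 0
}

func main() {
	g := newGroup()
	g.mountMarks[1] = fanOpen // watch opens anywhere under mount 1

	f := fileRef{inode: 10, mount: 1, fs: 7}
	fmt.Println(g.matches(f, fanOpen))
	fmt.Println(g.matches(f, fanModify))
}
```
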
## Implement `io_uring`

Estimated complexity: *hard*

`io_uring` is the latest asynchronous I/O API in Linux. This project will
involve implementing the system interfaces required to support `io_uring` in
gVisor. A successful implementation should have performance and scalability
characteristics, relative to synchronous I/O syscalls, similar to those of
`io_uring` in Linux.

The core of the `io_uring` interface is deceptively simple, involving only three
new syscalls:

-   `io_uring_setup(2)` creates a new `io_uring` instance represented by a file
    descriptor, including a set of request submission and completion queues
    backed by shared memory ring buffers.

-   `io_uring_register(2)` optionally binds kernel resources such as files and
    memory buffers to handles, which can then be passed to `io_uring`
    operations. Pre-registering resources in this way moves the cost of looking
    up and validating these resources to registration time rather than paying
    the cost during the operation.

-   `io_uring_enter(2)` is the syscall used to submit queued operations and wait
    for completions. This is the most complex part of the mechanism, requiring
    the kernel to process queued requests from the submission queue, dispatch
    the appropriate I/O operation based on the request arguments, and block
    until the requested number of operations have completed before returning.

An `io_uring` request is effectively an opcode specifying the I/O operation to
perform, and corresponding arguments. Each opcode and its arguments closely
mirror the corresponding synchronous I/O syscall. In addition, there are some
`io_uring`-specific arguments that specify things like how to process requests,
how to interpret the arguments, and how to communicate the status of the ring
buffers.

For a detailed description of the `io_uring` interface, see the
[design doc][io-uring-doc] by the `io_uring` authors.

Due to the complexity of the full `io_uring` mechanism and the numerous
supported operations, it should be implemented in two stages.

In the first stage, a simplified version of the `io_uring_setup` and
`io_uring_enter` syscalls should be implemented, which will only support a
minimal set of arguments and just one or two simple opcodes. This simplified
implementation can be used to figure out how to integrate `io_uring` with
gVisor's virtual filesystem and memory management subsystems, as well as to
benchmark the implementation to ensure it has the desired performance
characteristics. The goal in this stage should be to implement the smallest
subset of features required to perform a basic operation through `io_uring`.

In the second stage, support can be added for all the I/O operations supported
by Linux, as well as advanced `io_uring` features such as fixed files and
buffers (via `io_uring_register`), polled I/O, and kernel-side request polling.

A single contributor can expect to make reasonable progress on the first stage
within the scope of Google Summer of Code. The second stage, while not
necessarily difficult, is likely to be very time-consuming. However, it also
lends itself well to parallel development by multiple contributors.

## Implement message queues

Estimated complexity: *hard*

Linux provides two alternative message queue mechanisms:
[System V message queues][man-sysvmq] and [POSIX message queues][man-posixmq].
gVisor currently doesn't implement either.

Both mechanisms add multiple syscalls for managing and using the message queues;
see the relevant man pages above for their full description.

The cores of the two mechanisms are very similar, so it may be possible to back
both with a common implementation in gVisor. Linux, however, has two distinct
implementations.

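To suggest what a shared core might look like, the sketch below keeps messages in arrival order with a single integer key and layers the two receive policies on top: selection by message type as in `msgrcv(2)`, and highest priority first as in `mq_receive(3)`. The `queue` and `message` types are hypothetical and are not gVisor code; blocking, size limits, permissions, and flags such as `MSG_EXCEPT` are left out.

```go
package main

import "fmt"

// message carries a key that is the mtype for System V queues and the
// priority for POSIX queues.
type message struct {
	key  int64
	body []byte
}

// queue keeps messages in arrival order.
type queue struct {
	msgs []message
}

func (q *queue) send(key int64, body []byte) {
	q.msgs = append(q.msgs, message{key: key, body: body})
}

// removeAt removes and returns the message at index i.
func (q *queue) removeAt(i int) message {
	m := q.msgs[i]
	q.msgs = append(q.msgs[:i], q.msgs[i+1:]...)
	return m
}

// receiveSysV follows the msgrcv(2) selection rules:
//   msgtyp == 0: the first message in the queue
//   msgtyp > 0:  the first message of exactly that type
//   msgtyp < 0:  the first message of the lowest type <= |msgtyp|
func (q *queue) receiveSysV(msgtyp int64) (message, bool) {
	best := -1
	for i, m := range q.msgs {
		switch {
		case msgtyp == 0:
			return q.removeAt(i), true
		case msgtyp > 0:
			if m.key == msgtyp {
				return q.removeAt(i), true
			}
		default:
			if m.key <= -msgtyp && (best < 0 || m.key < q.msgs[best].key) {
				best = i
			}
		}
	}
	if best < 0 {
		return message{}, false
	}
	return q.removeAt(best), true
}

// receivePOSIX follows mq_receive(3): the oldest message among those with
// the highest priority.
func (q *queue) receivePOSIX() (message, bool) {
	best := -1
	for i, m := range q.msgs {
		if best < 0 || m.key > q.msgs[best].key {
			best = i
		}
	}
	if best < 0 {
		return message{}, false
	}
	return q.removeAt(best), true
}

func main() {
	var q queue
	q.send(3, []byte("a"))
	q.send(1, []byte("b"))
	m, ok := q.receiveSysV(-2)
	fmt.Println(string(m.body), ok)
}
```

Where the real calls would block or fail with an error on an empty queue, this sketch simply returns `false`.
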
An individual contributor can reasonably implement a minimal version of one of
these two mechanisms within the scope of Google Summer of Code. The System V
queue may be slightly easier to implement, as gVisor already implements System V
semaphores and shared memory regions, so the code for managing IPC objects and
the registry already exists.

[gsoc-2021-site]: https://summerofcode.withgoogle.com
[gvisor-chat]: https://gitter.im/gvisor/community
[gvisor-mailing-list]: https://groups.google.com/g/gvisor-dev
[io-uring-doc]: https://kernel.dk/io_uring.pdf
[man-fanotify]: https://man7.org/linux/man-pages/man7/fanotify.7.html
[man-sysvmq]: https://man7.org/linux/man-pages/man7/sysvipc.7.html
[man-posixmq]: https://man7.org/linux/man-pages//man7/mq_overview.7.html
[man-setns]: https://man7.org/linux/man-pages/man2/setns.2.html