# The gVisor Virtual Filesystem

## Implementation Notes

### Reference Counting

Filesystem, Dentry, Mount, MountNamespace, and FileDescription are all
reference-counted. Mount and MountNamespace are exclusively VFS-managed; when
their reference count reaches zero, VFS releases their resources. Filesystem and
FileDescription management is shared between VFS and filesystem implementations;
when their reference count reaches zero, VFS notifies the implementation by
calling `FilesystemImpl.Release()` or `FileDescriptionImpl.Release()`
respectively and then releases VFS-owned resources. Dentries are exclusively
managed by filesystem implementations; reference count changes are abstracted
through DentryImpl, which should release resources when reference count reaches
zero.

Filesystem references are held by:

- Mount: Each referenced Mount holds a reference on the mounted Filesystem.

Dentry references are held by:

- FileDescription: Each referenced FileDescription holds a reference on the
  Dentry through which it was opened, via `FileDescription.vd.dentry`.

- Mount: Each referenced Mount holds a reference on its mount point and on the
  mounted filesystem root. The mount point is mutable (`mount(MS_MOVE)`).

Mount references are held by:

- FileDescription: Each referenced FileDescription holds a reference on the
  Mount on which it was opened, via `FileDescription.vd.mount`.

- Mount: Each referenced Mount holds a reference on its parent, which is the
  mount containing its mount point.

- VirtualFilesystem: A reference is held on each Mount that has been connected
  to a mount point, but not yet umounted.

MountNamespace and FileDescription references are held by users of VFS. The
expectation is that each `kernel.Task` holds a reference on its corresponding
MountNamespace, and each file descriptor holds a reference on its represented
FileDescription.

Notes:

- Dentries do not hold a reference on their owning Filesystem. Instead, all
  uses of a Dentry occur in the context of a Mount, which holds a reference on
  the relevant Filesystem (see e.g. the VirtualDentry type). As a corollary,
  when releasing references on both a Dentry and its corresponding Mount, the
  Dentry's reference must be released first (because releasing the Mount's
  reference may release the last reference on the Filesystem, whose state may
  be required to release the Dentry reference).

### The Inheritance Pattern

Filesystem, Dentry, and FileDescription are all concepts featuring both state
that must be shared between VFS and filesystem implementations, and operations
that are implementation-defined. To facilitate this, each of these three
concepts follows the same pattern, shown below for Dentry:

```go
// Dentry represents a node in a filesystem tree.
type Dentry struct {
	// VFS-required dentry state.
	parent *Dentry
	// ...

	// impl is the DentryImpl associated with this Dentry. impl is immutable.
	// This should be the last field in Dentry.
	impl DentryImpl
}

// Init must be called before first use of d.
func (d *Dentry) Init(impl DentryImpl) {
	d.impl = impl
}

// Impl returns the DentryImpl associated with d.
func (d *Dentry) Impl() DentryImpl {
	return d.impl
}

// DentryImpl contains implementation-specific details of a Dentry.
// Implementations of DentryImpl should contain their associated Dentry by
// value as their first field.
type DentryImpl interface {
	// VFS-required implementation-defined dentry operations.
	IncRef()
	// ...
}
```

This construction, which is essentially a type-safe analogue to Linux's
`container_of` pattern, has the following properties:

- VFS works almost exclusively with pointers to Dentry rather than DentryImpl
  interface objects, such as in the type of `Dentry.parent`. This avoids
  interface method calls (which are somewhat expensive to perform, and defeat
  inlining and escape analysis), reduces the size of VFS types (since an
  interface object is two pointers in size), and allows pointers to be loaded
  and stored atomically using `sync/atomic`. Implementation-defined behavior
  is accessed via `Dentry.impl` when required.

- Filesystem implementations can access the implementation-defined state
  associated with objects of VFS types by type-asserting or type-switching
  (e.g. `Dentry.Impl().(*myDentry)`). Type assertions to a concrete type
  require only an equality comparison of the interface object's type pointer
  to a static constant, and are consequently very fast.

- Filesystem implementations can access the VFS state associated with objects
  of implementation-defined types directly.

- VFS and implementation-defined state for a given type occupy the same
  object, minimizing memory allocations and maximizing memory locality. `impl`
  is the last field in `Dentry`, and `Dentry` is the first field in
  `DentryImpl` implementations, for similar reasons: this tends to cause
  fetching of the `Dentry.impl` interface object to also fetch `DentryImpl`
  fields, either because they are in the same cache line or via next-line
  prefetching.