# Flexible Filesystem

[Go reference](https://pkg.go.dev/github.com/creachadair/ffs)
[Presubmit workflow](https://github.com/creachadair/ffs/actions/workflows/go-presubmit.yml)

A work-in-progress experimental storage-agnostic filesystem representation.

This project began as a way of sharing state for a transportable agent system
I started building as a hobby project between undergrad and grad school. I
lost interest in that, but found the idea behind the storage was still worth
having. The original was built in a combination of Python and C and used a
custom binary format; this re-implementation in Go uses Protocol Buffers and
eliminates the need for FFI.

## Summary

A file in FFS is represented as a Merkle tree encoded in a [content-addressable
blob store](./blob). Unlike files in POSIX-style filesystems, all files in FFS
have the same structure, consisting of binary content, children, and
metadata. In other words, every "file" is also potentially a "directory", and
vice versa.

Files are encoded in storage using wire-format [protocol
buffer](https://developers.google.com/protocol-buffers) messages as defined in
[`wiretype.proto`](./file/wiretype/wiretype.proto). The key messages are:

- A [`Node`](./file/wiretype/wiretype.proto#L59) is the top-level encoding of a
  file. The storage key for a file is the content address (**storage key**) of
  its wire-encoded node message. An empty `Node` message is a valid encoding of
  an empty file with no children and no metadata.

- An [`Index`](./file/wiretype/wiretype.proto#L117) records the binary content
  of a file, if any. An index records the total size of the file along with the
  sizes, offsets, and storage keys of its data blocks.

- A [`Child`](./file/wiretype/wiretype.proto#L162) records the name and storage
  key of a child of a file. Children are ordered lexicographically by name.
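
The relationship between a node's encoding and its storage key can be sketched
as follows. This is a minimal illustration and not the FFS implementation: it
substitutes JSON for the Protocol Buffers wire format, assumes SHA-256 as the
content hash, and the names `Store`, `Node`, `Child`, `PutNode`, and `must` are
hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// Child pairs a name with the storage key of another node.
type Child struct {
	Name string `json:"name"`
	Key  string `json:"key"`
}

// Node is a simplified stand-in for the wire-format Node message.
type Node struct {
	Data     []byte  `json:"data,omitempty"`
	Children []Child `json:"children,omitempty"`
}

// Store is a toy in-memory content-addressable blob store.
type Store map[string][]byte

// Put stores blob under the hex-encoded SHA-256 of its content
// (an assumed hash) and returns that key.
func (s Store) Put(blob []byte) string {
	sum := sha256.Sum256(blob)
	key := hex.EncodeToString(sum[:])
	s[key] = blob
	return key
}

// PutNode orders a copy of the children by name, encodes the node, and
// stores the encoding. The returned key is the node's content address.
func (s Store) PutNode(n Node) (string, error) {
	kids := append([]Child(nil), n.Children...)
	sort.Slice(kids, func(i, j int) bool { return kids[i].Name < kids[j].Name })
	n.Children = kids
	blob, err := json.Marshal(n)
	if err != nil {
		return "", err
	}
	return s.Put(blob), nil
}

// must panics on error so the example in main stays short.
func must(key string, err error) string {
	if err != nil {
		panic(err)
	}
	return key
}

func main() {
	s := Store{}
	leaf := must(s.PutNode(Node{Data: []byte("hello")}))
	// Two children may share one storage key as long as their names differ.
	root := must(s.PutNode(Node{Children: []Child{
		{Name: "b", Key: leaf},
		{Name: "a", Key: leaf},
	}}))
	fmt.Println("leaf:", leaf)
	fmt.Println("root:", root)
}
```

Because the root's encoding embeds the keys of its children, changing the
content of a leaf yields a new leaf key and therefore a new root key, while
identical nodes always map to the same key.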

### Binary Content

Binary file content is stored in discrete blocks. The block size is not fixed,
but varies over a (configurable) predefined range of sizes. Block boundaries
are chosen by splitting the file data with a [rolling hash](./block), similar
to the technique used in rsync or LBFS, and contents are stored as raw blobs.

The blocks belonging to a particular file are recorded in one or more
_extents_, where each extent represents an ordered, contiguous sequence of
blocks. Ranges of file content that consist of all zero-valued bytes are not
stored, allowing sparse files to be represented compactly.

### Children

The children of a file are themselves files. Within the node, each child is
recorded as a pair comprising a non-empty string _name_ and the storage key of
another file. Each name must be unique among the children of a given file, but
it is fine for multiple children to share the same storage key.

### Metadata

Files have no required metadata, but for convenience the node representation
includes optional [`Stat`](./file/wiretype/wiretype.proto#L71) and
[`XAttr`](./file/wiretype/wiretype.proto#L154) messages that encode typical
filesystem metadata like POSIX permissions, file type, modification timestamp,
and ownership. These fields are persisted in the encoding of a node, and thus
affect its storage key, but are not otherwise interpreted.

## Related Tools

The [ffstools](https://github.com/creachadair/ffstools) repository defines
command-line tools for manipulating FFS data structures.

In addition, the [`ffuse`](https://github.com/creachadair/ffuse) repository
defines a FUSE filesystem that exposes the FFS data format.

To install the CLI:
```sh
go install github.com/creachadair/ffstools/ffs@latest
```

Note, however, that for any "interesting" use you will probably want to build
the tool with additional storage engine support, beyond just the file and memory
stores it has out of the box. See the docs on that repo for more details of what
build tags are available.
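
As an illustration of the block splitting described under Binary Content, the
following is a minimal sketch of content-defined chunking with a rolling hash.
It is not the algorithm implemented in the `block` package: the window size,
multiplier, size bounds, and the names `Split` and `pseudoRandom` are all
assumptions chosen for the example.

```go
package main

import "fmt"

// All parameters below are illustrative assumptions, not FFS defaults.
const (
	window  = 16       // bytes covered by the rolling hash
	minSize = 64       // smallest block, except possibly the last
	maxSize = 1024     // largest block
	mask    = 1<<7 - 1 // a boundary is declared when hash&mask == 0
)

const mult uint32 = 16777619 // arbitrary odd multiplier

// Split cuts data into blocks whose boundaries depend on the content of the
// last `window` bytes, subject to the minimum and maximum block sizes.
func Split(data []byte) [][]byte {
	// pow = mult^(window-1), the weight of the oldest byte in the window.
	pow := uint32(1)
	for i := 0; i < window-1; i++ {
		pow *= mult
	}
	var chunks [][]byte
	var h uint32
	start := 0
	for i, b := range data {
		n := i - start + 1 // length of the current block including b
		if n > window {
			h -= uint32(data[i-window]) * pow // drop the byte leaving the window
		}
		h = h*mult + uint32(b) // add the byte entering the window
		if (n >= minSize && h&mask == 0) || n >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial block
	}
	return chunks
}

// pseudoRandom returns n deterministic pseudo-random bytes for the demo.
func pseudoRandom(n int) []byte {
	out := make([]byte, n)
	x := uint32(1)
	for i := range out {
		x = x*1664525 + 1013904223
		out[i] = byte(x >> 24)
	}
	return out
}

func main() {
	data := pseudoRandom(1 << 14)
	before := Split(data)
	// Prepend one byte, then count how many blocks are unchanged.
	after := Split(append([]byte{0xFF}, data...))

	seen := make(map[string]bool)
	for _, c := range before {
		seen[string(c)] = true
	}
	shared := 0
	for _, c := range after {
		if seen[string(c)] {
			shared++
		}
	}
	fmt.Printf("%d blocks before, %d after, %d shared\n", len(before), len(after), shared)
}
```

Since boundaries are chosen from the data itself rather than from fixed
offsets, an insertion near the start of a file tends to disturb only nearby
blocks; later blocks usually keep their content, and in a content-addressable
store they therefore keep their storage keys.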