# Flexible Filesystem

[![GoDoc](https://img.shields.io/static/v1?label=godoc&message=reference&color=yellowgreen)](https://pkg.go.dev/github.com/creachadair/ffs)
[![CI](https://github.com/creachadair/ffs/actions/workflows/go-presubmit.yml/badge.svg?event=push&branch=main)](https://github.com/creachadair/ffs/actions/workflows/go-presubmit.yml)

A work-in-progress, experimental, storage-agnostic filesystem representation.

This project began as a way of sharing state for a transportable agent system
I started building as a hobby project between undergrad and grad school. I
lost interest in that, but found the idea behind the storage was still worth
having. The original was built in a combination of Python and C and used a
custom binary format; this re-implementation in Go uses Protocol Buffers and
eliminates the need for FFI.

## Summary

A file in FFS is represented as a Merkle tree encoded in a [content-addressable
blob store](./blob). Unlike files in POSIX-style filesystems, all files in FFS
have the same structure, consisting of binary content, children, and
metadata. In other words, every "file" is also potentially a "directory", and
vice versa.

Files are encoded in storage using wire-format [protocol
buffer](https://developers.google.com/protocol-buffers) messages as defined in
[`wiretype.proto`](./file/wiretype/wiretype.proto). The key messages are:

- A [`Node`](./file/wiretype/wiretype.proto#L59) is the top-level encoding of a
  file. The **storage key** of a file is the content address of its
  wire-encoded node message. An empty `Node` message is a valid encoding of
  an empty file with no children and no metadata.

- An [`Index`](./file/wiretype/wiretype.proto#L117) records the binary content
  of a file, if any. An index records the total size of the file along with the
  sizes, offsets, and storage keys of its data blocks.

- A [`Child`](./file/wiretype/wiretype.proto#L162) records the name and storage
  key of a child of a file. Children are ordered lexicographically by name.

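To make the relationship between nodes and storage keys concrete, here is a
minimal sketch. It is not the FFS implementation: the `Node` and `Child`
structs are simplified stand-ins for the wire messages, JSON stands in for the
protocol buffer encoding, SHA-256 is an assumed hash, and `Store` is a
hypothetical in-memory map rather than the API of the `blob` package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Child pairs a name with the storage key of another node.
type Child struct {
	Name string `json:"name"`
	Key  string `json:"key"`
}

// Node is a simplified, hypothetical stand-in for the wire-format Node.
type Node struct {
	Index    []string `json:"index,omitempty"`    // storage keys of data blocks
	Children []Child  `json:"children,omitempty"` // ordered by name
}

// Store is a toy in-memory content-addressable blob store.
type Store map[string][]byte

// Put stores data under the hex SHA-256 digest of its bytes (an assumed
// hash choice) and returns that key.
func (s Store) Put(data []byte) string {
	sum := sha256.Sum256(data)
	key := hex.EncodeToString(sum[:])
	s[key] = data
	return key
}

// PutNode encodes n and stores the encoding, returning its storage key.
func (s Store) PutNode(n Node) string {
	data, err := json.Marshal(n)
	if err != nil {
		panic(err) // not expected for this struct
	}
	return s.Put(data)
}

func main() {
	s := Store{}
	empty := s.PutNode(Node{}) // an empty node is a valid file
	leaf := s.PutNode(Node{Index: []string{s.Put([]byte("hello"))}})
	root1 := s.PutNode(Node{Children: []Child{{Name: "a", Key: empty}}})
	root2 := s.PutNode(Node{Children: []Child{{Name: "a", Key: leaf}}})

	fmt.Println(empty == s.PutNode(Node{})) // same encoding, same key: true
	fmt.Println(root1 != root2)             // a changed child changes the parent: true
}
```

Because a parent records the storage keys of its children, any change below a
node produces a new key for every ancestor, which is the Merkle tree property
described above.
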
### Binary Content

Binary file content is stored in discrete blocks. The block size is not fixed,
but varies over a (configurable) predefined range of sizes. Block boundaries
are chosen by splitting the file data with a [rolling hash](./block), similar
to the technique used in rsync or LBFS, and contents are stored as raw blobs.

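The following sketch illustrates content-defined splitting in general, not the
algorithm or API of the `block` package. The gear-style hash, the `Split`
function, and its `minSize`, `maxSize`, and `bits` parameters are assumptions
chosen for brevity. The point it shows is that boundaries depend on nearby
content rather than on absolute offsets, so an insertion near the start of a
file tends to leave most later blocks (and their storage keys) unchanged.

```go
package main

import "fmt"

// gear maps each byte value to a pseudo-random 64-bit number.
var gear [256]uint64

func init() {
	// Fill the table from a fixed-seed splitmix64 sequence so that split
	// points are reproducible across runs.
	x := uint64(0x9e3779b97f4a7c15)
	for i := range gear {
		x += 0x9e3779b97f4a7c15
		z := x
		z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
		z = (z ^ (z >> 27)) * 0x94d049bb133111eb
		gear[i] = z ^ (z >> 31)
	}
}

// Split cuts data into blocks of at most maxSize bytes. Every block except
// possibly the last has at least minSize bytes. A boundary is placed where
// the top `bits` bits of the rolling hash are all zero.
func Split(data []byte, minSize, maxSize int, bits uint) [][]byte {
	var blocks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		// Shifting discards old input, so h depends on the last 64 bytes.
		h = h<<1 + gear[b]
		n := i - start + 1
		if (n >= minSize && h>>(64-bits) == 0) || n >= maxSize {
			blocks = append(blocks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		blocks = append(blocks, data[start:])
	}
	return blocks
}

func main() {
	// Deterministic pseudo-random input from a simple LCG.
	data := make([]byte, 1<<16)
	x := uint64(1)
	for i := range data {
		x = x*6364136223846793005 + 1442695040888963407
		data[i] = byte(x >> 56)
	}
	blocks := Split(data, 256, 4096, 10)
	fmt.Println("blocks:", len(blocks))

	// Insert one byte at the front and count how many blocks are unchanged.
	seen := map[string]bool{}
	for _, b := range blocks {
		seen[string(b)] = true
	}
	edited := append([]byte{'!'}, data...)
	shared := 0
	for _, b := range Split(edited, 256, 4096, 10) {
		if seen[string(b)] {
			shared++
		}
	}
	fmt.Println("unchanged after edit:", shared)
}
```
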
The blocks belonging to a particular file are recorded in one or more
_extents_, where each extent represents an ordered, contiguous sequence of
blocks. Ranges of file content that consist of all zero-valued bytes are not
stored, allowing sparse files to be represented compactly.

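As a rough illustration of how extents can skip zero ranges, the sketch below
groups the non-zero blocks of a buffer into extents and rebuilds the content
from them. It makes simplifying assumptions that are not taken from FFS:
blocks have a fixed size, holes are detected only at block granularity, and
the hypothetical `Extent` type holds block data directly where a real index
would hold storage keys.

```go
package main

import "fmt"

// Extent is a contiguous run of stored blocks starting at offset Base.
type Extent struct {
	Base   int
	Blocks [][]byte
}

// isZero reports whether every byte of b is zero.
func isZero(b []byte) bool {
	for _, v := range b {
		if v != 0 {
			return false
		}
	}
	return true
}

// Extents splits data into blocks of blockSize bytes (which must be
// positive) and groups the non-zero blocks into extents. All-zero blocks
// are omitted, and each one ends the extent before it.
func Extents(data []byte, blockSize int) []Extent {
	var out []Extent
	open := false // whether the last extent in out can still grow
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		blk := data[off:end]
		if isZero(blk) {
			open = false
			continue
		}
		if !open {
			out = append(out, Extent{Base: off})
			open = true
		}
		last := &out[len(out)-1]
		last.Blocks = append(last.Blocks, blk)
	}
	return out
}

// Expand rebuilds content of the given total size. Ranges not covered by
// any extent read back as zeros.
func Expand(exts []Extent, size int) []byte {
	buf := make([]byte, size)
	for _, e := range exts {
		off := e.Base
		for _, blk := range e.Blocks {
			off += copy(buf[off:], blk)
		}
	}
	return buf
}

func main() {
	data := make([]byte, 40)
	copy(data[0:], "abcdefgh") // fills blocks 0 and 1
	copy(data[32:], "tail")    // fills block 8

	exts := Extents(data, 4)
	for _, e := range exts {
		fmt.Println(e.Base, len(e.Blocks)) // prints "0 2", then "32 1"
	}
	fmt.Println(string(Expand(exts, len(data))) == string(data)) // true
}
```

Recording the total size separately from the extents is what lets trailing
holes be reconstructed, which matches the role of the file size in the index.
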
### Children

The children of a file are themselves files. Within the node, each child is
recorded as a pair comprising a non-empty string _name_ and the storage key of
another file. Each name must be unique among the children of a given file, but
it is fine for multiple children to share the same storage key.

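The rules above (non-empty names, unique names, name order) can be sketched
with a sorted slice and a binary search. The `Child` type and the `Set`
function here are hypothetical and only illustrate the invariants; they are
not the API of the `file` package.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Child pairs a name with the storage key of another file.
type Child struct {
	Name string
	Key  string
}

// Set adds a child with the given name and key, or replaces the key of an
// existing child with that name. It keeps cs ordered by name and rejects
// empty names.
func Set(cs []Child, name, key string) ([]Child, error) {
	if name == "" {
		return cs, errors.New("empty child name")
	}
	// Find the first position whose name is not less than the new name.
	i := sort.Search(len(cs), func(j int) bool { return cs[j].Name >= name })
	if i < len(cs) && cs[i].Name == name {
		cs[i].Key = key // names are unique, so replace in place
		return cs, nil
	}
	cs = append(cs, Child{})
	copy(cs[i+1:], cs[i:]) // shift the tail right to open a slot at i
	cs[i] = Child{Name: name, Key: key}
	return cs, nil
}

func main() {
	var cs []Child
	cs, _ = Set(cs, "b", "k1")
	cs, _ = Set(cs, "a", "k1") // two names may share one storage key
	cs, _ = Set(cs, "b", "k2") // replaces the existing "b"
	fmt.Println(cs)            // prints [{a k1} {b k2}]

	_, err := Set(cs, "", "k3")
	fmt.Println(err) // prints empty child name
}
```

Keeping the list sorted gives each set of children exactly one encoding, so
two files with the same children end up with the same storage key regardless
of insertion order.
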
### Metadata

Files have no required metadata, but for convenience the node representation
includes optional [`Stat`](./file/wiretype/wiretype.proto#L71) and
[`XAttr`](./file/wiretype/wiretype.proto#L154) messages that encode typical
filesystem metadata like POSIX permissions, file type, modification timestamp,
and ownership. These fields are persisted in the encoding of a node, and thus
affect its storage key, but are not otherwise interpreted.

## Related Tools

The [ffstools](https://github.com/creachadair/ffstools) repository defines
command-line tools for manipulating FFS data structures.

In addition, the [`ffuse`](https://github.com/creachadair/ffuse) repository
defines a FUSE filesystem that exposes the FFS data format.

To install the CLI:

```sh
go install github.com/creachadair/ffstools/ffs@latest
```

Note, however, that for any "interesting" use you will probably want to build
the tool with additional storage engine support, beyond just the file and memory
stores it has out of the box. See the docs on that repo for more details of what
build tags are available.