# Periscope [![Build Status](https://github.com/anishathalye/periscope/workflows/CI/badge.svg)](https://github.com/anishathalye/periscope/actions?query=workflow%3ACI)
<!--
Other useful stuff:

https://goreportcard.com/report/github.com/anishathalye/periscope
-->

Periscope gives you "duplicate vision" to help you organize and de-duplicate your files without losing data.

<p align="center">
<img src="https://raw.githubusercontent.com/anishathalye/assets/master/periscope/demo.gif" width="636" alt="Periscope demo">
</p>

Periscope (`psc`) works differently from most other duplicate file finders. It
is designed to be used _interactively_: Periscope will help you explore the
filesystem, understand which files are duplicated, determine where duplicate
copies live, and safely delete duplicates without losing data.

Following a `psc scan`, Periscope lets you navigate and explore your filesystem
with the workflow you're already used to &mdash; using your shell and commands
like `cd`, `ls`, `tree`, and so on &mdash; while providing additional
_duplicate-aware commands_ that mirror core filesystem utilities. For example,
`psc ls` gives a directory listing that highlights duplicates, and `psc rm`
deletes files only if a duplicate exists elsewhere. This makes it easy to
understand how data is organized (and duplicated), reorganize files, and delete
duplicates without worrying about accidentally losing data.

<p align="center">
<a href="#workflow">Workflow</a> &middot; <a href="#commands">Commands</a> &middot; <a href="#installation">Installation</a> &middot; <a href="#contributing">Contributing</a>
</p>

## Workflow

**Find duplicates**

Start with `psc scan` to scan folders for duplicates. Once you run this, you
shouldn't need to run it again while looking at and deleting duplicates, unless
you move files around. If you delete files manually (rather than with `psc rm`),
you can make Periscope detect deletions with `psc refresh`, which runs much
faster than a full scan. `psc scan` is incremental, so if you want to scan a new
directory or re-analyze one that was already scanned, you can always run the
command again.
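
As a rough sketch of how these two commands fit together, the shell function
below refreshes the database and then re-scans selected directories. The name
`psc_sync` and the directory arguments are illustrative assumptions, not part of
Periscope; the function uses only `psc refresh` and `psc scan` as described
above.

```bash
# Hypothetical helper, not part of Periscope: bring the duplicate database up
# to date after files were changed outside of `psc rm`.
psc_sync() {
    # Forget files that were deleted manually (e.g. with plain `rm`).
    psc refresh || return 1
    # Incrementally scan each directory given as an argument, e.g. ones that
    # received new or moved files.
    for dir in "$@"; do
        psc scan "$dir" || return 1
    done
}
```

After an initial `psc scan`, calling `psc_sync ~/Pictures/incoming` (a
placeholder path) would pick up manual deletions and re-analyze just that
directory.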

**Understand duplicates**

You can get a high-level understanding of how many duplicates you have and
where they are located:

- `psc summary` gives statistics on duplicate files
- `psc report` shows a full list of duplicates, sorted by file size

After identifying areas to explore with `psc report`, you can navigate to those
directories in your shell with `cd`, and then you can use Periscope commands to
understand duplicates:

- `psc ls` gives a duplicate-aware directory listing (optionally recursively,
  with the `-R` flag)
- `psc info` shows information on a specific file (and its duplicates)
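
The two stages above could be wrapped in a pair of small shell functions, as in
this sketch. The function names are hypothetical, and the pager defaults to
`less` only when `PAGER` is unset.

```bash
# Hypothetical helpers, not part of Periscope.

# Big picture: statistics first, then the full report through a pager.
psc_overview() {
    psc summary || return 1
    psc report | ${PAGER:-less}
}

# Close-up: show where the copies of each given file live.
psc_copies_of() {
    for file in "$@"; do
        psc info "$file" || return 1
    done
}
```

A session might start with `psc_overview`, continue with `cd` into a directory
that the report flagged, and then call `psc_copies_of` on the files of interest
there.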

**Delete duplicates**

You can use the `psc rm` command to delete duplicates. You can think of it like
a safe version of `rm`: it will not let you delete files unless there are
duplicate copies elsewhere. `psc rm -r` recursively deletes duplicates but
leaves unique files untouched. `psc rm --contained <path>` deletes a file only
if a copy of it is contained in the given folder.
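
For instance, clearing out a staging directory whose contents have already been
filed away might look like the sketch below. The function name and both
directory arguments are hypothetical, and the sketch assumes that `-r` and
`--contained` can be combined in a single invocation.

```bash
# Hypothetical helper, not part of Periscope: empty a staging directory of
# files that already have a copy somewhere under the organized directory.
psc_clear_staging() {
    staging="$1"
    organized="$2"
    if [ -z "$staging" ] || [ -z "$organized" ]; then
        echo "usage: psc_clear_staging <staging-dir> <organized-dir>" >&2
        return 2
    fi
    # Files in $staging with no copy under $organized are left in place.
    psc rm -r --contained "$organized" "$staging"
}
```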

**Remove duplicate database**

When you're done with a Periscope session, you can delete the duplicate
database with `psc finish`.

## Commands

Run `psc help` to see the full list of commands and `psc help [command]` to see
help on a specific command.

**`psc scan` scans for duplicates**

Scans paths for duplicates and populates the database with information about
duplicates. Scans the current directory if given no argument. Scanning is
incremental; if you want to start from scratch, run `psc finish` first.

**`psc refresh` removes deleted files from the database**

Removes deleted files from the duplicate database. `psc rm` does this
automatically, so this command only needs to be used if you use some other
program (e.g. coreutils `rm`) and want to remove missing files from the
database. This command does not re-analyze files, so if you've made substantial
changes to the filesystem, like moving files around or adding new files, it's
best to do a `psc scan` of the relevant directories.

**`psc finish` deletes the duplicate database**

Deletes the duplicate database. Once you're done using Periscope, it's good to
use this command to delete the duplicate database, so it doesn't waste space on
disk.

**`psc summary` reports statistics**

Prints statistics about the duplicate database, such as number of duplicate
files and the amount of space duplicates consume.

**`psc report` reports scan results**

Lists all duplicates in the duplicate database, sorted by file size. Because
this list is usually large, it's helpful to pipe the output to a pager, e.g.
`psc report | less`.

**`psc export` exports scan results**

Exports information about duplicates in a machine-readable format (default
JSON). **This is the only output from Periscope that other programs should
consume.** Future versions of Periscope may add to the information that's
included in the dump, but the layout of existing data will not change.
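
One way to hand the export to other tools is to save it to a file first, as in
this sketch. The function name and default file name are hypothetical, and
nothing here assumes a particular layout of the exported data.

```bash
# Hypothetical helper, not part of Periscope: save the export to a file so
# that other tools read a snapshot instead of parsing `psc ls` or
# `psc report` output.
psc_export_to() {
    out="${1:-periscope-export.json}"
    tmp="$out.tmp"
    # Write to a temporary file first so that a failed export never leaves a
    # truncated file at the final path.
    if psc export > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```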

**`psc ls` lists a directory**

Lists files and folders in the given directory (or the current directory, if
none is given). This command shows the number of duplicates that each file has:
1 means that there is a single duplicate elsewhere in the filesystem; if a file
has no duplicates, the number is omitted. Directories are tagged with a 'd',
and special files are tagged with a character describing their type, e.g. 'p'
for named pipes. `-a` shows hidden files. `-d` lists only duplicates, while
`-u` lists only unique files. `-v` lists all duplicates of every file, and `-r`
shows the path to the duplicate as a relative path instead of an absolute path.
`-R` lists files recursively; this flag combines well with the `-d` flag, to
list only duplicate files recursively contained in a given directory.
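
A couple of flag combinations, wrapped as shell functions for illustration. The
function names are hypothetical, and the sketch assumes the flags described
above can be freely combined.

```bash
# Hypothetical helpers, not part of Periscope; each defaults to the current
# directory.

# Every duplicated file under a directory, recursively, with all of its
# copies listed as relative paths.
psc_dupes_under() {
    psc ls -R -d -v -r "${1:-.}"
}

# Files directly in a directory, hidden ones included, that have no
# duplicate recorded in the database.
psc_unique_in() {
    psc ls -a -u "${1:-.}"
}
```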

**`psc tree` lists all duplicates in a given directory**

Lists all files recursively contained in the given directory (or the current
directory, if none is given) that have a duplicate file elsewhere. This command
hides hidden files and folders by default; the `-a` flag includes hidden files.

This command shows a "flattened" representation; in most cases, a `psc ls -Rd`
is more useful.

**`psc info` inspects a file**

Shows information about a single file's duplicates. As with `psc ls`, the
`-r` flag shows the path to the duplicate as a path relative to the given file.

**`psc rm` deletes duplicates**

Deletes duplicates but not unique files; no way of invoking this command will
delete unique files. This command makes use of the database, but it
double-checks files and their copies before it deletes anything, so a stale
duplicate database will not result in data loss. The `-n` flag will perform a
dry run, listing files that would be deleted but not actually deleting
anything. `-r` deletes duplicates recursively. The `--contained <path>`
argument gives more fine-grained control over deletion: files are only deleted
if they have a duplicate _in the given location_. This is useful, for example,
for deleting files from a "to organize" directory only if they are also
contained in the "organized" directory, as in the demo video above. By default,
`psc rm` does not delete any files when it's given a set where there are no
duplicates outside the set: for example, if files "/a/x1" and "/a/x2" are
duplicates, recursively removing "/a" will leave both files untouched. Passing
the `--arbitrary` flag will result in such duplicates being handled by
arbitrarily choosing one file to save and deleting the rest.
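
The dry-run flag lends itself to a preview-then-confirm pattern, sketched below.
The function name is hypothetical, and the sketch assumes that `-n` can be
combined with `-r` and with any extra flags passed through, such as
`--arbitrary`.

```bash
# Hypothetical helper, not part of Periscope: show what `psc rm -r` would
# delete, then delete only after confirmation. Extra flags (for example
# --arbitrary) can be passed before the path.
psc_rm_confirm() {
    # Dry run: lists files that would be deleted without deleting anything.
    psc rm -n -r "$@" || return 1
    printf 'Delete the files listed above? [y/N] '
    read -r answer || answer=""
    case "$answer" in
        y|Y) psc rm -r "$@" ;;
        *)   echo "Nothing deleted." ;;
    esac
}
```

With the "/a/x1" and "/a/x2" example above, `psc_rm_confirm /a` would have
nothing to delete, while `psc_rm_confirm --arbitrary /a` would preview and then
remove all but one of the two copies.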

## Installation

**Install with [Homebrew](https://brew.sh/) (on macOS):**

```bash
brew install periscope
```

**Download a binary release:**
[Periscope releases](https://github.com/anishathalye/periscope/releases).

Periscope has binary releases for macOS and Linux. It has not been tested on
Windows.

**Install from source with `go install`:**

```bash
go install -v github.com/anishathalye/periscope/cmd/psc@latest
```

Periscope depends on go-sqlite3, which uses cgo, so you need a C compiler
present in your path. You might also need to set `CGO_ENABLED=1` if cgo is
otherwise disabled in your environment.
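
A minimal sketch of a source install that checks for a compiler and enables cgo
explicitly. The function name is hypothetical, and falling back to `cc` when
`CC` is unset is only a guess made for this check; your Go toolchain may look
for a different compiler by default.

```bash
# Sketch, not an official install script: check for a C compiler, then build
# with cgo explicitly enabled.
install_psc_from_source() {
    # CC is the environment variable Go consults for the C compiler; `cc` is
    # a fallback guess used only by this check.
    compiler="${CC:-cc}"
    if ! command -v "$compiler" >/dev/null 2>&1; then
        echo "C compiler '$compiler' not found in PATH" >&2
        return 1
    fi
    CGO_ENABLED=1 go install -v github.com/anishathalye/periscope/cmd/psc@latest
}
```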

<!--

Testing releases:

```
docker run -e --rm --privileged -v $PWD:/go/src/github.com/anishathalye/periscope -v /var/run/docker.sock:/var/run/docker.sock -w /go/src/github.com/anishathalye/periscope mailchain/goreleaser-xcgo --rm-dist --skip-publish
```

Supply `--snapshot` if version is not tagged

-->

## Contributing

Bug reports, feature requests, feedback on the tool or documentation, and pull
requests are all appreciated. If you are planning on making substantial changes
that you hope to have merged, it is highly recommended that you first open an
issue to discuss your proposed change.

## License

Copyright (c) Anish Athalye (me@anishathalye.com). Released under GPLv3.
See [LICENSE.txt](LICENSE.txt) for details.