# Periscope [![Build Status](https://github.com/anishathalye/periscope/workflows/CI/badge.svg)](https://github.com/anishathalye/periscope/actions?query=workflow%3ACI)
<!--
Other useful stuff:

https://goreportcard.com/report/github.com/anishathalye/periscope
-->

Periscope gives you "duplicate vision" to help you organize and de-duplicate your files without losing data.

<p align="center">
<img src="https://raw.githubusercontent.com/anishathalye/assets/master/periscope/demo.gif" width="636" alt="Periscope demo">
</p>

Periscope (`psc`) works differently from most other duplicate file finders. It
is designed to be used _interactively_: Periscope will help you explore the
filesystem, understand which files are duplicated, determine where duplicate
copies live, and safely delete duplicates without losing data.

Following a `psc scan`, Periscope lets you navigate and explore your filesystem
with the workflow you're already used to &mdash; using your shell and commands
like `cd`, `ls`, `tree`, and so on &mdash; while providing additional
_duplicate-aware commands_ that mirror core filesystem utilities. For example,
`psc ls` gives a directory listing that highlights duplicates, and `psc rm`
deletes files only if a duplicate exists elsewhere. This makes it easy to
understand how data is organized (and duplicated), reorganize files, and delete
duplicates without worrying about accidentally losing data.

<p align="center">
<a href="#workflow">Workflow</a> &middot; <a href="#commands">Commands</a> &middot; <a href="#installation">Installation</a> &middot; <a href="#contributing">Contributing</a>
</p>

## Workflow

**Find duplicates**

Start with `psc scan` to scan folders for duplicates. Once you run this, you
shouldn't need to run it again while looking at and deleting duplicates, unless
you move files around.
If you delete files manually (rather than with `psc rm`),
you can make Periscope detect deletions with `psc refresh`, which runs much
faster than a full scan. `psc scan` is incremental, so if you want to scan a new
directory or re-analyze one that was already scanned, you can always run the
command again.

**Understand duplicates**

You can get a high-level understanding of how many duplicates you have and
where they are located:

- `psc summary` gives statistics on duplicate files
- `psc report` shows a full list of duplicates, sorted by file size

After identifying areas to explore with `psc report`, you can navigate to those
directories in your shell with `cd`, and then you can use Periscope commands to
understand duplicates:

- `psc ls` gives a duplicate-aware directory listing (optionally recursively,
  with the `-R` flag)
- `psc info` shows information on a specific file (and its duplicates)

**Delete duplicates**

You can use the `psc rm` command to delete duplicates. You can think of it like
a safe version of `rm`: it will not let you delete files unless there are
duplicate copies elsewhere. A `psc rm -r` will recursively delete duplicates
but not unique files. A `psc rm --contained <path>` will delete duplicates only
if a copy is contained in the given folder.

**Remove duplicate database**

When you're done with a Periscope session, you can delete the duplicate
database with `psc finish`.

## Commands

Run `psc help` to see the full list of commands and `psc help [command]` to see
help on a specific command.

**`psc scan` scans for duplicates**

Scans paths for duplicates and populates the database with information about
duplicates. Scans the current directory if given no argument. Scanning is
incremental; if you want to start from scratch, run `psc finish` first.
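The workflow and scanning behavior described above can be sketched as a small shell helper. This is only an illustration, not part of Periscope: the function name `preview_dedupe` and the example paths are hypothetical, and the sketch assumes that the `-n` (dry run), `-r`, and `--contained` flags of `psc rm` can be combined in one invocation. It scans two directories (relying on scans being incremental), prints the summary, and finishes with a dry run, so nothing is deleted.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Periscope): preview which files under
# an "inbox" directory could be removed because a copy already exists
# under an "organized" directory.
preview_dedupe() {
  local organized="$1" inbox="$2"

  # Scanning is incremental, so each call adds to the same database.
  psc scan "$organized" || return
  psc scan "$inbox" || return

  # High-level statistics on the duplicates found so far.
  psc summary || return

  # Dry run (-n): list what would be deleted, recursively (-r), limited to
  # files that have a duplicate inside "$organized". Assumes these flags
  # can be combined in one invocation.
  psc rm -n -r --contained "$organized" "$inbox"
}

# Example (hypothetical paths):
#   preview_dedupe ~/Photos/organized ~/Photos/to-organize
```

If the dry-run listing looks right, running the same `psc rm` command without `-n` should perform the deletion, and `psc finish` removes the database afterwards.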

**`psc refresh` removes deleted files from the database**

Removes deleted files from the duplicate database. `psc rm` does this
automatically, so this command only needs to be used if you use some other
program (e.g. coreutils `rm`) and want to remove missing files from the
database. This command does not re-analyze files, so if you've made substantial
changes to the filesystem, like moving files around or adding new files, it's
best to do a `psc scan` of the relevant directories.

**`psc finish` deletes the duplicate database**

Deletes the duplicate database. Once you're done using Periscope, it's good to
use this command to delete the duplicate database, so it doesn't waste space on
disk.

**`psc summary` reports statistics**

Prints statistics about the duplicate database, such as number of duplicate
files and the amount of space duplicates consume.

**`psc report` reports scan results**

Lists all duplicates in the duplicate database, sorted by file size. Because
this list is usually large, it's helpful to pipe the output to a pager, e.g.
`psc report | less`.

**`psc export` exports scan results**

Exports information about duplicates in a machine-readable format (default
JSON). **This is the only output from Periscope that other programs should
consume.** Future versions of Periscope may add to the information that's
included in the dump, but the layout of existing data will not change.

**`psc ls` lists a directory**

Lists files and folders in the given directory (or the current directory, if
none is given). This command shows the number of duplicates that each file has:
1 means that there is a single duplicate elsewhere in the filesystem; if a file
has no duplicates, the number is omitted. Directories are tagged with a 'd',
and special files are tagged with a character describing their type, e.g. 'p'
for named pipes.
`-a` shows hidden files. `-d` lists only duplicates, while
`-u` lists only unique files. `-v` lists all duplicates of every file, and `-r`
shows the path to the duplicate as a relative path instead of an absolute path.
`-R` lists files recursively; this flag combines well with the `-d` flag, to
list only duplicate files recursively contained in a given directory.

**`psc tree` lists all duplicates in a given directory**

Lists all files recursively contained in the given directory (or the current
directory, if none is given) that have a duplicate file elsewhere. This command
hides hidden files and folders by default; the `-a` flag includes hidden files.

This command shows a "flattened" representation; in most cases, a `psc ls -Rd`
is more useful.

**`psc info` inspects a file**

Shows information about a single file's duplicates. Like with `psc ls`, the
`-r` flag shows the path to the duplicate as a path relative to the given file.

**`psc rm` deletes duplicates**

Deletes duplicates but not unique files; no way of invoking this command will
delete unique files. This command makes use of the database, but it
double-checks files and their copies before it deletes anything, so a stale
duplicate database will not result in data loss. The `-n` flag will perform a
dry run, listing files that would be deleted but not actually deleting
anything. `-r` deletes duplicates recursively. The `--contained <path>`
argument gives more fine-grained control over deletion: files are only deleted
if they have a duplicate _in the given location_. This is useful, for example,
for deleting files from a "to organize" directory only if they are also
contained in the "organized" directory, as in the demo video above.
By default,
`psc rm` does not delete any files when it's given a set where there are no
duplicates outside the set: for example, if files "/a/x1" and "/a/x2" are
duplicates, recursively removing "/a" will leave both files untouched. Passing
the `--arbitrary` flag will result in such duplicates being handled by
arbitrarily choosing one file to save and deleting the rest.

## Installation

**Install with [Homebrew](https://brew.sh/) (on macOS):**

```bash
brew install periscope
```

**Download a binary release:**
[Periscope releases](https://github.com/anishathalye/periscope/releases).

Periscope has binary releases for macOS and Linux. It has not been tested on
Windows.

**Install from source with `go install`:**

```bash
go install -v github.com/anishathalye/periscope/cmd/psc@latest
```

Periscope depends on go-sqlite3, which uses cgo, so you need a C compiler
present in your path. You might also need to set `CGO_ENABLED=1` if you have it
disabled otherwise.

<!--

Testing releases:

```
docker run -e --rm --privileged -v $PWD:/go/src/github.com/anishathalye/periscope -v /var/run/docker.sock:/var/run/docker.sock -w /go/src/github.com/anishathalye/periscope mailchain/goreleaser-xcgo --rm-dist --skip-publish
```

Supply `--snapshot` if version is not tagged

-->

## Contributing

Bug reports, feature requests, feedback on the tool or documentation, and pull
requests are all appreciated. If you are planning on making substantial changes
that you hope to have merged, it is highly recommended that you first open an
issue to discuss your proposed change.

## License

Copyright (c) Anish Athalye (me@anishathalye.com). Released under GPLv3.
See [LICENSE.txt](LICENSE.txt) for details.