[![GoDoc](https://img.shields.io/badge/godoc-reference-blue.svg)](https://pkg.go.dev/github.com/charlievieth/fastwalk)
[![Test fastwalk on macOS](https://github.com/charlievieth/fastwalk/actions/workflows/macos.yml/badge.svg)](https://github.com/charlievieth/fastwalk/actions/workflows/macos.yml)
[![Test fastwalk on Linux](https://github.com/charlievieth/fastwalk/actions/workflows/linux.yml/badge.svg)](https://github.com/charlievieth/fastwalk/actions/workflows/linux.yml)
[![Test fastwalk on Windows](https://github.com/charlievieth/fastwalk/actions/workflows/windows.yml/badge.svg)](https://github.com/charlievieth/fastwalk/actions/workflows/windows.yml)

# fastwalk

Fast parallel directory traversal for Golang.

Package fastwalk provides a fast parallel version of [`filepath.WalkDir`](https://pkg.go.dev/io/fs#WalkDirFunc)
that is \~2x faster on macOS, \~4x faster on Linux, \~6x faster on Windows,
allocates 50% less memory, and requires 25% fewer memory allocations.
Additionally, it is \~4-5x faster than [godirwalk](https://github.com/karrick/godirwalk)
across OSes.

Inspired by and based off of [golang.org/x/tools/internal/fastwalk](https://pkg.go.dev/golang.org/x/tools@v0.1.9/internal/fastwalk).

## Features

* Fast: multiple goroutines stat the filesystem and call the
  [`filepath.WalkDirFunc`](https://pkg.go.dev/io/fs#WalkDirFunc) callback concurrently
* Safe symbolic link traversal ([`Config.Follow`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Config))
* Same behavior and callback signature as [`filepath.WalkDir`](https://pkg.go.dev/path/filepath@go1.17.7#WalkDir)
* Wrapper functions are provided to ignore duplicate files and directories:
  [`IgnoreDuplicateFiles()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#IgnoreDuplicateFiles)
  and
  [`IgnoreDuplicateDirs()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#IgnoreDuplicateDirs)
* Extensively tested on macOS, Linux, and Windows

## Usage

Usage is the same as [`filepath.WalkDir`](https://pkg.go.dev/io/fs#WalkDirFunc),
but the [`walkFn`](https://pkg.go.dev/path/filepath@go1.17.7#WalkFunc)
argument to [`fastwalk.Walk`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk)
must be safe for concurrent use.

Examples can be found in the [examples](./examples) directory.

<!-- TODO: this example is large move it to an examples folder -->

The example below is a very simple version of the POSIX
[find](https://pubs.opengroup.org/onlinepubs/007904975/utilities/find.html) utility:
```go
// fwfind is an example program that is similar to POSIX find,
// but faster and worse (it's an example).
package main

import (
	"flag"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"

	"github.com/charlievieth/fastwalk"
)

const usageMsg = `Usage: %[1]s [-L] [-name] [PATH...]:

%[1]s is a poor replacement for the POSIX find utility

`

func main() {
	flag.Usage = func() {
		fmt.Fprintf(os.Stdout, usageMsg, filepath.Base(os.Args[0]))
		flag.PrintDefaults()
	}
	pattern := flag.String("name", "", "Pattern to match file names against.")
	followLinks := flag.Bool("L", false, "Follow symbolic links")
	flag.Parse()

	// If no paths are provided default to the current directory: "."
	args := flag.Args()
	if len(args) == 0 {
		args = append(args, ".")
	}

	// Follow links if the "-L" flag is provided
	conf := fastwalk.Config{
		Follow: *followLinks,
	}

	walkFn := func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			fmt.Fprintf(os.Stderr, "%s: %v\n", path, err)
			return nil // returning the error stops iteration
		}
		if *pattern != "" {
			if ok, err := filepath.Match(*pattern, d.Name()); !ok {
				// invalid pattern (err != nil) or name does not match
				return err
			}
		}
		_, err = fmt.Println(path)
		return err
	}
	for _, root := range args {
		if err := fastwalk.Walk(&conf, root, walkFn); err != nil {
			fmt.Fprintf(os.Stderr, "%s: %v\n", root, err)
			os.Exit(1)
		}
	}
}
```

## Benchmarks

Benchmarks were created using `go1.17.6` and can be generated with the `bench_comp` make target:
```sh
$ make bench_comp
```

### Darwin

**Hardware:**
```
goos: darwin
goarch: arm64
cpu: Apple M1 Max
```

#### [`filepath.WalkDir`](https://pkg.go.dev/path/filepath@go1.17.7#WalkDir) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):
```
            filepath       fastwalk       delta
time/op     27.9ms ± 1%    13.0ms ± 1%    -53.33%
alloc/op    4.33MB ± 0%    2.14MB ± 0%    -50.55%
allocs/op    50.9k ± 0%     37.7k ± 0%    -26.01%
```

#### [`godirwalk.Walk()`](https://pkg.go.dev/github.com/karrick/godirwalk@v1.16.1#Walk) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):
```
            godirwalk      fastwalk       delta
time/op     58.5ms ± 3%    18.0ms ± 2%    -69.30%
alloc/op    25.3MB ± 0%     2.1MB ± 0%    -91.55%
allocs/op    57.6k ± 0%     37.7k ± 0%    -34.59%
```

### Linux

**Hardware:**
```
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
drive: Samsung SSD 970 PRO 1TB
```

#### [`filepath.WalkDir`](https://pkg.go.dev/path/filepath@go1.17.7#WalkDir) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):

```
            filepath       fastwalk       delta
time/op     10.1ms ± 2%     2.8ms ± 2%    -72.83%
alloc/op    2.44MB ± 0%    1.70MB ± 0%    -30.46%
allocs/op    47.2k ± 0%     36.9k ± 0%    -21.80%
```

#### [`godirwalk.Walk()`](https://pkg.go.dev/github.com/karrick/godirwalk@v1.16.1#Walk) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):

```
            godirwalk      fastwalk       delta
time/op     13.7ms ±16%     2.8ms ± 2%    -79.88%
alloc/op    7.48MB ± 0%    1.70MB ± 0%    -77.34%
allocs/op    53.8k ± 0%     36.9k ± 0%    -31.38%
```

### Windows

**Hardware:**
```
goos: windows
goarch: amd64
pkg: github.com/charlievieth/fastwalk
cpu: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
```

#### [`filepath.WalkDir`](https://pkg.go.dev/path/filepath@go1.17.7#WalkDir) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):

```
            filepath       fastwalk       delta
time/op     88.0ms ± 1%    14.6ms ± 1%    -83.47%
alloc/op    5.68MB ± 0%    6.76MB ± 0%    +19.01%
allocs/op    69.6k ± 0%     90.4k ± 0%    +29.87%
```

#### [`godirwalk.Walk()`](https://pkg.go.dev/github.com/karrick/godirwalk@v1.16.1#Walk) vs. [`fastwalk.Walk()`](https://pkg.go.dev/github.com/charlievieth/fastwalk#Walk):

```
            godirwalk      fastwalk       delta
time/op     87.4ms ± 1%    14.6ms ± 1%    -83.34%
alloc/op    6.14MB ± 0%    6.76MB ± 0%    +10.24%
allocs/op     100k ± 0%       90k ± 0%     -9.59%
```

## Darwin: getdirentries64

The `nogetdirentries` build tag can be used to prevent `fastwalk` from using
and linking to the non-public `__getdirentries64` syscall. This is required
if an app using `fastwalk` is to be distributed via Apple's App Store (see
https://github.com/golang/go/issues/30933 for more details). When
`__getdirentries64` is disabled, `fastwalk` will use `readdir_r` instead,
which is what the Go standard library uses for
[`os.ReadDir`](https://pkg.go.dev/os#ReadDir) and is \~10% slower than
`__getdirentries64`
([benchmarks](https://github.com/charlievieth/fastwalk/blob/2e6a1b8a1ce88e578279e6e631b2129f7144ec87/fastwalk_darwin_test.go#L19-L57)).

Example of how to build and test that your program is not linked to `__getdirentries64`:
```sh
# NOTE: the following only applies to darwin (aka macOS)

# Build binary that imports fastwalk without linking to __getdirentries64.
$ go build -tags nogetdirentries -o YOUR_BINARY
# Test that __getdirentries64 is not linked (this should print no output).
$ ! otool -dyld_info YOUR_BINARY | grep -F getdirentries64
```

There is also a script [scripts/links2getdirentries.bash](scripts/links2getdirentries.bash)
that can be used to check if a program binary links to getdirentries.