<!--{
	"Title": "Data Race Detector",
	"Template": true
}-->

<h2 id="Introduction">Introduction</h2>

<p>
Data races are among the most common and hardest-to-debug bugs in concurrent systems.
A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
See <a href="/ref/mem/">The Go Memory Model</a> for details.
</p>

<p>
Here is an example of a data race that can lead to crashes and memory corruption:
</p>

<pre>
package main

import "fmt"

func main() {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // First conflicting access.
		c &lt;- true
	}()
	m["2"] = "b" // Second conflicting access.
	&lt;-c
	for k, v := range m {
		fmt.Println(k, v)
	}
}
</pre>

<h2 id="Usage">Usage</h2>

<p>
To help diagnose such bugs, Go includes a built-in data race detector.
To use it, add the <code>-race</code> flag to the go command:
</p>

<pre>
$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package
</pre>

<h2 id="Report_Format">Report Format</h2>

<p>
When the race detector finds a data race in the program, it prints a report.
The report contains stack traces for conflicting accesses, as well as stacks where the involved goroutines were created.
Here is an example:
</p>

<pre>
WARNING: DATA RACE
Read by goroutine 185:
  net.(*pollServer).AddFD()
      src/pkg/net/fd_unix.go:89 +0x398
  net.(*pollServer).WaitWrite()
      src/pkg/net/fd_unix.go:247 +0x45
  net.(*netFD).Write()
      src/pkg/net/fd_unix.go:540 +0x4d4
  net.(*conn).Write()
      src/pkg/net/net.go:129 +0x101
  net.func·060()
      src/pkg/net/timeout_test.go:603 +0xaf

Previous write by goroutine 184:
  net.setWriteDeadline()
      src/pkg/net/sockopt_posix.go:135 +0xdf
  net.setDeadline()
      src/pkg/net/sockopt_posix.go:144 +0x9c
  net.(*conn).SetDeadline()
      src/pkg/net/net.go:161 +0xe3
  net.func·061()
      src/pkg/net/timeout_test.go:616 +0x3ed

Goroutine 185 (running) created at:
  net.func·061()
      src/pkg/net/timeout_test.go:609 +0x288

Goroutine 184 (running) created at:
  net.TestProlongTimeout()
      src/pkg/net/timeout_test.go:618 +0x298
  testing.tRunner()
      src/pkg/testing/testing.go:301 +0xe8
</pre>

<h2 id="Options">Options</h2>

<p>
The <code>GORACE</code> environment variable sets race detector options.
The format is:
</p>

<pre>
GORACE="option1=val1 option2=val2"
</pre>

<p>
The options are:
</p>

<ul>
<li>
<code>log_path</code> (default <code>stderr</code>): The race detector writes
its report to a file named <code>log_path.<em>pid</em></code>.
The special names <code>stdout</code>
and <code>stderr</code> cause reports to be written to standard output and
standard error, respectively.
</li>

<li>
<code>exitcode</code> (default <code>66</code>): The exit status to use when
exiting after a detected race.
</li>

<li>
<code>strip_path_prefix</code> (default <code>""</code>): Strip this prefix
from all reported file paths, to make reports more concise.
</li>

<li>
<code>history_size</code> (default <code>1</code>): The per-goroutine memory
access history is <code>32K * 2**history_size elements</code>.
Increasing this value can avoid a "failed to restore the stack" error in reports, at the
cost of increased memory usage.
</li>
</ul>

<p>
Example:
</p>

<pre>
$ GORACE="log_path=/tmp/race/report strip_path_prefix=/my/go/sources/" go test -race
</pre>

<h2 id="Excluding_Tests">Excluding Tests</h2>

<p>
When you build with the <code>-race</code> flag, the <code>go</code> command defines an additional
<a href="/pkg/go/build/#hdr-Build_Constraints">build tag</a>, <code>race</code>.
You can use the tag to exclude some code and tests when running the race detector.
Some examples:
</p>

<pre>
// +build !race

package foo

import "testing"

// The test contains a data race. See issue 123.
func TestFoo(t *testing.T) {
	// ...
}

// The test fails under the race detector due to timeouts.
func TestBar(t *testing.T) {
	// ...
}

// The test takes too long under the race detector.
func TestBaz(t *testing.T) {
	// ...
}
</pre>

<h2 id="How_To_Use">How To Use</h2>

<p>
To start, run your tests using the race detector (<code>go test -race</code>).
The race detector only finds races that happen at runtime, so it can't find
races in code paths that are not executed.
If your tests have incomplete coverage,
you may find more races by running a binary built with <code>-race</code> under a realistic
workload.
</p>

<h2 id="Typical_Data_Races">Typical Data Races</h2>

<p>
Here are some typical data races. All of them can be detected with the race detector.
</p>

<h3 id="Race_on_loop_counter">Race on loop counter</h3>

<pre>
func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i &lt; 5; i++ {
		go func() {
			fmt.Println(i) // Not the 'i' you are looking for.
			wg.Done()
		}()
	}
	wg.Wait()
}
</pre>

<p>
The variable <code>i</code> in the function literal is the same variable used by the loop, so
the read in the goroutine races with the loop increment.
(This program typically prints 55555, not 01234.)
The program can be fixed by making a copy of the variable:
</p>

<pre>
func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i &lt; 5; i++ {
		go func(j int) {
			fmt.Println(j) // Good. Read local copy of the loop counter.
			wg.Done()
		}(i)
	}
	wg.Wait()
}
</pre>

<h3 id="Accidentally_shared_variable">Accidentally shared variable</h3>

<pre>
// ParallelWrite writes data to file1 and file2, returns the errors.
func ParallelWrite(data []byte) chan error {
	res := make(chan error, 2)
	f1, err := os.Create("file1")
	if err != nil {
		res &lt;- err
	} else {
		go func() {
			// This err is shared with the main goroutine,
			// so the write races with the write below.
			_, err = f1.Write(data)
			res &lt;- err
			f1.Close()
		}()
	}
	f2, err := os.Create("file2") // The second conflicting write to err.
	if err != nil {
		res &lt;- err
	} else {
		go func() {
			_, err = f2.Write(data)
			res &lt;- err
			f2.Close()
		}()
	}
	return res
}
</pre>

<p>
The fix is to introduce new variables in the goroutines (note the use of <code>:=</code>):
</p>

<pre>
			...
			_, err := f1.Write(data)
			...
			_, err := f2.Write(data)
			...
</pre>

<h3 id="Unprotected_global_variable">Unprotected global variable</h3>

<p>
If the following code is called from several goroutines, it leads to races on the <code>service</code> map.
Concurrent reads and writes of the same map are not safe:
</p>

<pre>
// service is initialized with make so that writes do not panic on a nil map.
var service = make(map[string]net.Addr)

func RegisterService(name string, addr net.Addr) {
	service[name] = addr
}

func LookupService(name string) net.Addr {
	return service[name]
}
</pre>

<p>
To make the code safe, protect the accesses with a mutex:
</p>

<pre>
var (
	service   = make(map[string]net.Addr)
	serviceMu sync.Mutex // serviceMu guards service.
)

func RegisterService(name string, addr net.Addr) {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	service[name] = addr
}

func LookupService(name string) net.Addr {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	return service[name]
}
</pre>

<h3 id="Primitive_unprotected_variable">Primitive unprotected variable</h3>

<p>
Data races can happen on variables of primitive types as well (<code>bool</code>, <code>int</code>, <code>int64</code>, etc.),
as in this example:
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	w.last = time.Now().UnixNano() // First conflicting access.
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			// Second conflicting access.
			if w.last &lt; time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}
</pre>

<p>
Even such "innocent" data races can lead to hard-to-debug problems caused by
non-atomicity of the memory accesses,
interference with compiler optimizations,
or reordering of memory accesses by the processor.
</p>

<p>
A typical fix for this race is to use a channel or a mutex.
To preserve the lock-free behavior, one can also use the
<a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> package.
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	atomic.StoreInt64(&amp;w.last, time.Now().UnixNano())
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			if atomic.LoadInt64(&amp;w.last) &lt; time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}
</pre>

<h2 id="Supported_Systems">Supported Systems</h2>

<p>
The race detector runs on <code>darwin/amd64</code>, <code>linux/amd64</code>, and <code>windows/amd64</code>.
</p>

<h2 id="Runtime_Overheads">Runtime Overhead</h2>

<p>
The cost of race detection varies by program, but for a typical program, memory
usage may increase by 5-10x and execution time by 2-20x.
</p>