<!--{
	"Title": "Data Race Detector",
	"Template": true
}-->

<h2 id="Introduction">Introduction</h2>

<p>
Data races are among the most common and hardest-to-debug kinds of bugs in concurrent systems.
A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
See <a href="/ref/mem/">The Go Memory Model</a> for details.
</p>

<p>
Here is an example of a data race that can lead to crashes and memory corruption:
</p>

<pre>
func main() {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // First conflicting access.
		c &lt;- true
	}()
	m["2"] = "b" // Second conflicting access.
	&lt;-c
	for k, v := range m {
		fmt.Println(k, v)
	}
}
</pre>
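
<p>
One possible way to remove the race in the example above is to delay the second write
until the goroutine has signaled on the channel that its own write is complete.
The following is a minimal sketch of that idea, not the only fix.
The helper name <code>buildMap</code> is hypothetical and exists only to make the example easy to call,
and the keys are sorted purely to make the output order stable.
</p>

```go
package main

import (
	"fmt"
	"sort"
)

// buildMap is a hypothetical helper: it fills the map from two goroutines
// without a data race by ordering the two writes with a channel.
func buildMap() map[string]string {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // Write in the new goroutine.
		c <- true    // Signal that the write is complete.
	}()
	<-c          // Wait until the goroutine's write has finished.
	m["2"] = "b" // This write is now ordered after the first one.
	return m
}

func main() {
	m := buildMap()
	// Map iteration order is unspecified, so sort the keys before printing.
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, m[k])
	}
}
```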

<h2 id="Usage">Usage</h2>

<p>
To help diagnose such bugs, Go includes a built-in data race detector.
To use it, add the <code>-race</code> flag to the go command:
</p>

<pre>
$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package
</pre>

<h2 id="Report_Format">Report Format</h2>

<p>
When the race detector finds a data race in the program, it prints a report.
The report contains stack traces for conflicting accesses, as well as stacks where the involved goroutines were created.
Here is an example:
</p>

<pre>
WARNING: DATA RACE
Read by goroutine 185:
  net.(*pollServer).AddFD()
      src/net/fd_unix.go:89 +0x398
  net.(*pollServer).WaitWrite()
      src/net/fd_unix.go:247 +0x45
  net.(*netFD).Write()
      src/net/fd_unix.go:540 +0x4d4
  net.(*conn).Write()
      src/net/net.go:129 +0x101
  net.func·060()
      src/net/timeout_test.go:603 +0xaf

Previous write by goroutine 184:
  net.setWriteDeadline()
      src/net/sockopt_posix.go:135 +0xdf
  net.setDeadline()
      src/net/sockopt_posix.go:144 +0x9c
  net.(*conn).SetDeadline()
      src/net/net.go:161 +0xe3
  net.func·061()
      src/net/timeout_test.go:616 +0x3ed

Goroutine 185 (running) created at:
  net.func·061()
      src/net/timeout_test.go:609 +0x288

Goroutine 184 (running) created at:
  net.TestProlongTimeout()
      src/net/timeout_test.go:618 +0x298
  testing.tRunner()
      src/testing/testing.go:301 +0xe8
</pre>

<h2 id="Options">Options</h2>

<p>
The <code>GORACE</code> environment variable sets race detector options.
The format is:
</p>

<pre>
GORACE="option1=val1 option2=val2"
</pre>

<p>
The options are:
</p>

<ul>
<li>
<code>log_path</code> (default <code>stderr</code>): The race detector writes
its report to a file named <code>log_path.<em>pid</em></code>.
The special names <code>stdout</code>
and <code>stderr</code> cause reports to be written to standard output and
standard error, respectively.
</li>

<li>
<code>exitcode</code> (default <code>66</code>): The exit status to use when
exiting after a detected race.
</li>

<li>
<code>strip_path_prefix</code> (default <code>""</code>): Strip this prefix
from all reported file paths, to make reports more concise.
</li>

<li>
<code>history_size</code> (default <code>1</code>): The per-goroutine memory
access history is <code>32K * 2**history_size elements</code>.
Increasing this value can avoid a "failed to restore the stack" error in reports, at the
cost of increased memory usage.
</li>

<li>
<code>halt_on_error</code> (default <code>0</code>): Controls whether the program
exits after reporting the first data race.
</li>
</ul>

<p>
Example:
</p>

<pre>
$ GORACE="log_path=/tmp/race/report strip_path_prefix=/my/go/sources/" go test -race
</pre>

<h2 id="Excluding_Tests">Excluding Tests</h2>

<p>
When you build with the <code>-race</code> flag, the <code>go</code> command defines an additional
<a href="/pkg/go/build/#hdr-Build_Constraints">build tag</a>, <code>race</code>.
You can use the tag to exclude some code and tests when running the race detector.
Some examples:
</p>

<pre>
// +build !race

package foo

// The test contains a data race. See issue 123.
func TestFoo(t *testing.T) {
	// ...
}

// The test fails under the race detector due to timeouts.
func TestBar(t *testing.T) {
	// ...
}

// The test takes too long under the race detector.
func TestBaz(t *testing.T) {
	// ...
}
</pre>

<h2 id="How_To_Use">How To Use</h2>

<p>
To start, run your tests using the race detector (<code>go test -race</code>).
The race detector only finds races that happen at runtime, so it can't find
races in code paths that are not executed.
If your tests have incomplete coverage,
you may find more races by running a binary built with <code>-race</code> under a realistic
workload.
</p>
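
<p>
As a rough illustration of exercising concurrent code paths, the sketch below calls a method
from several goroutines at once. Running such a driver, or an equivalent test, with <code>-race</code>
gives the detector concurrent executions to observe. The <code>Counter</code> type and the
<code>hammer</code> helper are hypothetical names invented for this example; they are not part of
any Go package.
</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type under test whose methods are meant to be
// safe for concurrent use.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds one to the counter while holding the mutex.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value returns the current count while holding the mutex.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer is a hypothetical helper that calls fn from several goroutines at
// once, so that the concurrent paths through fn are actually executed.
func hammer(goroutines, calls int, fn func()) {
	var wg sync.WaitGroup
	wg.Add(goroutines)
	for g := 0; g < goroutines; g++ {
		go func() {
			defer wg.Done()
			for i := 0; i < calls; i++ {
				fn()
			}
		}()
	}
	wg.Wait()
}

func main() {
	var c Counter
	hammer(8, 1000, c.Inc)
	fmt.Println(c.Value()) // 8000
}
```

<p>
If the mutex in <code>Inc</code> were removed, the unsynchronized increments would be a data race
that this kind of driver makes reachable at runtime.
</p>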

<h2 id="Typical_Data_Races">Typical Data Races</h2>

<p>
Here are some typical data races. All of them can be detected with the race detector.
</p>

<h3 id="Race_on_loop_counter">Race on loop counter</h3>

<pre>
func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i &lt; 5; i++ {
		go func() {
			fmt.Println(i) // Not the 'i' you are looking for.
			wg.Done()
		}()
	}
	wg.Wait()
}
</pre>

<p>
The variable <code>i</code> in the function literal is the same variable used by the loop, so
the read in the goroutine races with the loop increment.
(This program typically prints 55555, not 01234.)
The program can be fixed by making a copy of the variable:
</p>

<pre>
func main() {
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i &lt; 5; i++ {
		go func(j int) {
			fmt.Println(j) // Good. Read local copy of the loop counter.
			wg.Done()
		}(i)
	}
	wg.Wait()
}
</pre>
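
<p>
Another common way to make the copy is to declare a new variable inside the loop body that
shadows the loop counter. The sketch below shows that variation. The <code>collect</code> helper
is a hypothetical name, and it gathers the values under a mutex and sorts them only so that
the result is deterministic.
</p>

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect is a hypothetical helper that starts n goroutines and records the
// loop value each one observes.
func collect(n int) []int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex // guards got
		got []int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		i := i // New variable for this iteration; shadows the loop counter.
		go func() {
			defer wg.Done()
			mu.Lock()
			got = append(got, i) // Reads the per-iteration copy.
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Ints(got) // Goroutines finish in any order, so sort before returning.
	return got
}

func main() {
	fmt.Println(collect(5)) // [0 1 2 3 4]
}
```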

<h3 id="Accidentally_shared_variable">Accidentally shared variable</h3>

<pre>
// ParallelWrite writes data to file1 and file2 and returns the errors.
func ParallelWrite(data []byte) chan error {
	res := make(chan error, 2)
	f1, err := os.Create("file1")
	if err != nil {
		res &lt;- err
	} else {
		go func() {
			// This err is shared with the main goroutine,
			// so the write races with the write below.
			_, err = f1.Write(data)
			res &lt;- err
			f1.Close()
		}()
	}
	f2, err := os.Create("file2") // The second conflicting write to err.
	if err != nil {
		res &lt;- err
	} else {
		go func() {
			_, err = f2.Write(data)
			res &lt;- err
			f2.Close()
		}()
	}
	return res
}
</pre>

<p>
The fix is to introduce new variables in the goroutines (note the use of <code>:=</code>):
</p>

<pre>
			...
			_, err := f1.Write(data)
			...
			_, err := f2.Write(data)
			...
</pre>
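
<p>
For reference, here is a sketch of the whole function with that fix applied.
The <code>dir</code> parameter is an addition made for this example so that it can write into a
temporary directory; the original function takes only <code>data</code>.
</p>

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ParallelWrite writes data to file1 and file2 inside dir and returns a
// channel that delivers one error value (possibly nil) per file.
// The dir parameter is an assumption of this sketch, not part of the original.
func ParallelWrite(dir string, data []byte) chan error {
	res := make(chan error, 2)
	f1, err := os.Create(filepath.Join(dir, "file1"))
	if err != nil {
		res <- err
	} else {
		go func() {
			// := declares an err that is local to this goroutine.
			_, err := f1.Write(data)
			res <- err
			f1.Close()
		}()
	}
	f2, err := os.Create(filepath.Join(dir, "file2"))
	if err != nil {
		res <- err
	} else {
		go func() {
			// Again a goroutine-local err, so nothing is shared.
			_, err := f2.Write(data)
			res <- err
			f2.Close()
		}()
	}
	return res
}

func main() {
	dir, err := os.MkdirTemp("", "parallelwrite")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)

	res := ParallelWrite(dir, []byte("hello"))
	for i := 0; i < 2; i++ {
		if err := <-res; err != nil {
			fmt.Println("write failed:", err)
		}
	}
	fmt.Println("done")
}
```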

<h3 id="Unprotected_global_variable">Unprotected global variable</h3>

<p>
If the following code is called from several goroutines, it leads to races on the <code>service</code> map.
Concurrent reads and writes of the same map are not safe:
</p>

<pre>
var service map[string]net.Addr

func RegisterService(name string, addr net.Addr) {
	service[name] = addr
}

func LookupService(name string) net.Addr {
	return service[name]
}
</pre>

<p>
To make the code safe, protect the accesses with a mutex:
</p>

<pre>
var (
	service   map[string]net.Addr
	serviceMu sync.Mutex
)

func RegisterService(name string, addr net.Addr) {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	service[name] = addr
}

func LookupService(name string) net.Addr {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	return service[name]
}
</pre>
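
<p>
If lookups greatly outnumber registrations, one variation is to use <code>sync.RWMutex</code>,
which lets readers proceed concurrently while writers still get exclusive access.
The following is a minimal sketch under that assumption. The map is initialized with
<code>make</code> here so that the program runs on its own, and <code>exampleAddr</code> is a
hypothetical helper used only to produce an address for the demonstration.
</p>

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

var (
	service   = make(map[string]net.Addr)
	serviceMu sync.RWMutex // guards service
)

// RegisterService takes the write lock, excluding all other accesses.
func RegisterService(name string, addr net.Addr) {
	serviceMu.Lock()
	defer serviceMu.Unlock()
	service[name] = addr
}

// LookupService takes the read lock, so lookups may run concurrently.
func LookupService(name string) net.Addr {
	serviceMu.RLock()
	defer serviceMu.RUnlock()
	return service[name]
}

// exampleAddr is a hypothetical helper that builds a loopback TCP address.
func exampleAddr(port int) net.Addr {
	return &net.TCPAddr{IP: net.IPv4(127, 0, 0, 1), Port: port}
}

func main() {
	addr := exampleAddr(8080)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			RegisterService(fmt.Sprintf("svc%d", i), addr)
			LookupService("svc0") // May or may not be registered yet.
		}(i)
	}
	wg.Wait()
	fmt.Println(LookupService("svc0")) // 127.0.0.1:8080
}
```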

<h3 id="Primitive_unprotected_variable">Primitive unprotected variable</h3>

<p>
Data races can happen on variables of primitive types as well (<code>bool</code>, <code>int</code>, <code>int64</code>, etc.),
as in this example:
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	w.last = time.Now().UnixNano() // First conflicting access.
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			// Second conflicting access.
			if w.last &lt; time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}
</pre>

<p>
Even such "innocent" data races can lead to hard-to-debug problems caused by
non-atomicity of the memory accesses,
interference with compiler optimizations,
or reordering issues when accessing processor memory.
</p>

<p>
A typical fix for this race is to use a channel or a mutex.
To preserve the lock-free behavior, one can also use the
<a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> package.
</p>

<pre>
type Watchdog struct{ last int64 }

func (w *Watchdog) KeepAlive() {
	atomic.StoreInt64(&amp;w.last, time.Now().UnixNano())
}

func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			if atomic.LoadInt64(&amp;w.last) &lt; time.Now().Add(-10*time.Second).UnixNano() {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}
</pre>
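
<p>
The mutex-based fix mentioned above could look roughly like the following sketch.
The <code>Expired</code> method is a hypothetical addition that factors out the timeout check so
it can be exercised without exiting the process; the rest mirrors the example above.
</p>

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"time"
)

type Watchdog struct {
	mu   sync.Mutex
	last int64 // guarded by mu
}

// KeepAlive records the current time while holding the mutex.
func (w *Watchdog) KeepAlive() {
	w.mu.Lock()
	w.last = time.Now().UnixNano()
	w.mu.Unlock()
}

// Expired is a hypothetical helper: it reports whether the last keepalive
// is older than timeout, reading w.last under the same mutex.
func (w *Watchdog) Expired(timeout time.Duration) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.last < time.Now().Add(-timeout).UnixNano()
}

// Start checks once per second and exits the process on expiry.
func (w *Watchdog) Start() {
	go func() {
		for {
			time.Sleep(time.Second)
			if w.Expired(10 * time.Second) {
				fmt.Println("No keepalives for 10 seconds. Dying.")
				os.Exit(1)
			}
		}
	}()
}

func main() {
	w := &Watchdog{}
	w.KeepAlive()
	fmt.Println(w.Expired(10 * time.Second)) // false
}
```

<p>
Compared with the <code>sync/atomic</code> version, this gives up the lock-free property but
extends naturally if more fields need to be updated together.
</p>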

<h2 id="Supported_Systems">Supported Systems</h2>

<p>
The race detector runs on <code>darwin/amd64</code>, <code>freebsd/amd64</code>,
<code>linux/amd64</code>, and <code>windows/amd64</code>.
</p>

<h2 id="Runtime_Overheads">Runtime Overhead</h2>

<p>
The cost of race detection varies by program, but for a typical program, memory
usage may increase by 5-10x and execution time by 2-20x.
</p>