github.com/onflow/flow-go@v0.35.7-crescendo-preview.23-atree-inlining/utils/binstat/README.md (about)

     1  # `binstat` - run-time & space efficient stats in bins
     2  
     3  ## Why use `binstat`?
     4  
     5  * `pprof` reports the seconds of CPU used and not the seconds of wall-clock used.
     6  * `trace` can be too heavyweight for larger Golang programs.
     7  * prometheus can be heavyweight, and it's difficult to programmatically manage stats once exported.
     8  * logging can be heavyweight; 1M function calls == 1M log lines.
     9  
    10  ## `binstat` goals
    11  
    12  * `binstat` is disabled at compile time by default resulting in zero production code overhead.
    13  * `binstat` may be enabled at compile time on a stat by stat basis without enabling all stats.
    14  * `binstat` aims to be efficient at collecting small stats very fast, e.g. measuring lock time.
    15  * `binstat` aims to allow stats to be compared programmatically; is this worse/better than last run?
    16  
    17  ## How `binstat` works?
    18  
    19  * Sprinkle `binstat.Enter*(<what>)` and `binstat.Leave*()` into Golang source code, e.g.:
    20  
    21  ```
    22  import (
    23  ...
    24  	"github.com/onflow/flow-go/utils/binstat"
    25  ...
    26  )
    27  ...
    28  	for loop := 0; loop < 2; loop++ {
    29  		bs := binstat.EnterTime(fmt.Sprintf("loop-%d", loop))
    30  		myFunc() // code to generate stats for
    31  		binstat.Leave(bs)
    32  	}
    33  ```
    34  
    35  * Run the program, and max. once per second a stats file will be written opportunistically, e.g.:
    36  
    37  ```
    38  $ cat myproc.pid-085882.txt| egrep loop
    39  /GOMAXPROCS=1,GC=100/what[loop-0]/time[1.000000-1.999999]=1 1.912292 // e.g. mypkg.myFunc:161
    40  /GOMAXPROCS=8,GC=100/what[loop-1]/time[0.600000-0.699999]=1 0.696432 // e.g. mypkg.myFunc:161
    41  $ #   user defined string ^^^^^^
    42  $ #          bin time range in seconds ^^^^^^^^^^^^^^^^^
    43  $ #                                 times bin incremented ^
    44  $ #                        total seconds for all increments ^^^^^^^^
    45  $ #                                         name & line of instrumented file ^^^^^^^^^^^^^^^^
    46  ```
    47  
    48  * If `myFunc()` is identical, why different results? E.g. launches 3 go-routines executing slower if `GOMAXPROCS=1`.
    49  
    50  ### How 'collapsable' `<what>` works?
    51  
    52  * If `<what>` has the format `~<default len><what>`, then `~4RootL1L2` only uses `~4Root` in the bin name, unless env var `BINSTAT_LEN_WHAT="~Root=6"` in which case the bin name uses `~4RootL1`.
    53  * In this way, the bin quantity for a particular `<what>` defaults to fewer bins, but may be enlarged using the env var.
    54  
    55  ## Disable AKA 'hide' `binstat` instrumentation before pushing code
    56  
    57  ```
    58  $ utils/binstat/binstat                     
    59  utils/binstat/binstat: Please specifiy a command line option: (list|hide|unhide)-instrumentation [regex='binstat.Bin(Net|MakeID)']
    60  $ utils/binstat/binstat hide-instrumentation
    61  - hide instrumentation found 17 files to examine <-- source files containing binstat instrumentation
    62  - hide instrumentation wrote 0 files             <-- source files rewritten with commented binstat
    63  ```
    64  
    65  * Auto commented `binstat` code looks like this:
    66  
    67  ```
    68  import (
    69  ...
    70  	_ "github.com/onflow/flow-go/utils/binstat"
    71  ...
    72  )
    73  ...
    74  	for loop := 0; loop < 2; loop++ {
    75  		//bs := binstat.EnterTime(fmt.Sprintf("loop-%d", loop))
    76  		myFunc() // code to generate stats for
    77  		//binstat.Leave(bs)
    78  	}
    79  ```
    80  
    81  ## What makes `binstat` efficient?
    82  
    83  * The number of bins is relatively small, regardless of the number of function calls.
    84  * For timing, `binstat` uses `runtime.nano()` which is used by & about twice as fast as `time.Now()`.
    85  * A lock per stat collection is eliminated using a `sync.RWMutex{}` reader/writer mutual exclusion lock:
    86    * Usual case: If the bin already exists, any 'reader' can *concurrently* update it via atomic operations.
    87    * Rare event: Else a single 'writer' blocks all 'readers' to add the new bin.
    88  * `binstat` instrumentation can be disabled in two different ways:
    89    * At compile time via source code commenting of instrumentation.
    90    * At run-time if `BINSTAT_ENABLE` does not exist.
    91  
    92  ## Example comparison with `pprof`
    93  
    94  * This example -- on a GCP Linux box -- launches 6 identical go-routines, 3 times, with `gomaxprocs=1` & then `=8`.
    95  
    96  ```
    97  $ pushd binstat ; go test -v -vv ./... 2>&1 | perl -lane 's~\\n~\n~g; s~"time".*?,~~g; print;' ; popd
    98  ...
    99  === RUN   TestWithPprof
   100  ...
   101  {"level":"debug","message":"test: loop=0 try=1; running 6 identical functions with gomaxprocs=1"}
   102  ...
   103  {"level":"debug","message":"test: loop=0 try=2; running 6 identical functions with gomaxprocs=1"}
   104  ...
   105  {"level":"debug","message":"test: loop=0 try=3; running 6 identical functions with gomaxprocs=1"}
   106  ...
   107  {"level":"debug","message":"test: loop=1 try=1; running 6 identical functions with gomaxprocs=8"}
   108  ...
   109  {"level":"debug","message":"test: loop=1 try=2; running 6 identical functions with gomaxprocs=8"}
   110  ...
   111  {"level":"debug","message":"test: loop=1 try=3; running 6 identical functions with gomaxprocs=8"}
   112  ...
   113  ```
   114  
   115  * The test collects running times from `binstat` & `pprof` for a side by side comparison.
   116    * Note: With `gomaxprocs=1` there is a large delta between CPU & wall-clock, e.g. 0.07 vs 0.40 seconds.
   117    * Note: With `gomaxprocs=8` `pprof` CPU time incorrectly varies between 0.02 to 0.12 seconds.
   118  
   119  ```
   120  {"level":"debug","message":"test: binstat------- pprof---------"}
   121  {"level":"debug","message":"test: try1 try2 try3 try1 try2 try3"}
   122  {"level":"debug","message":"test: 0.38 0.40 0.38 0.06 0.06 0.05 // f1() seconds; loop=0 gomaxprocs=1"}
   123  {"level":"debug","message":"test: 0.37 0.38 0.36 0.06 0.07 0.06 // f2() seconds; loop=0 gomaxprocs=1"}
   124  {"level":"debug","message":"test: 0.29 0.37 0.34 0.06 0.06 0.06 // f3() seconds; loop=0 gomaxprocs=1"}
   125  {"level":"debug","message":"test: 0.33 0.35 0.32 0.06 0.07 0.06 // f4() seconds; loop=0 gomaxprocs=1"}
   126  {"level":"debug","message":"test: 0.28 0.34 0.29 0.05 0.06 0.06 // f5() seconds; loop=0 gomaxprocs=1"}
   127  {"level":"debug","message":"test: 0.40 0.40 0.36 0.06 0.05 0.06 // f6() seconds; loop=0 gomaxprocs=1"}
   128  {"level":"debug","message":"test: binstat------- pprof---------"}
   129  {"level":"debug","message":"test: try1 try2 try3 try1 try2 try3"}
   130  {"level":"debug","message":"test: 0.07 0.08 0.07 0.04 0.05 0.07 // f1() seconds; loop=1 gomaxprocs=8"}
   131  {"level":"debug","message":"test: 0.07 0.08 0.07 0.05 0.07 0.05 // f2() seconds; loop=1 gomaxprocs=8"}
   132  {"level":"debug","message":"test: 0.07 0.08 0.07 0.06 0.11 0.02 // f3() seconds; loop=1 gomaxprocs=8"}
   133  {"level":"debug","message":"test: 0.07 0.08 0.07 0.06 0.06 0.06 // f4() seconds; loop=1 gomaxprocs=8"}
   134  {"level":"debug","message":"test: 0.07 0.08 0.07 0.09 0.05 0.04 // f5() seconds; loop=1 gomaxprocs=8"}
   135  {"level":"debug","message":"test: 0.07 0.08 0.07 0.03 0.04 0.12 // f6() seconds; loop=1 gomaxprocs=8"}
   136  ```
   137  
   138  * Finally, the test shows the `binstat` sorted file containing the stats.
   139  * This part is when `GOMAXPROCS=1`.
   140    * Note: The `binstat.tck` bins show how many 1/10ths of a second had how many go-routines running.
   141    * Note: The `binstat.dmp` bins show how long opportunistically saving the `binstat` file took.
   142  
   143  ```
   144  {"level":"debug","message":"test: output of command: ls -al ./binstat.test.pid-*.binstat.txt ; cat ./binstat.test.pid-*.binstat.txt | sort --version-sort
   145  -rw-r--r--  1 simonhardy-francis  staff  3610 23 Jun 17:10 ./binstat.test.pid-047616.binstat.txt
   146  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f1]/time[0.200000-0.299999]=1 0.292582 // e.g. utils/binstat_test.run.func1:83
   147  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f1]/time[0.300000-0.399999]=2 0.758238 // e.g. utils/binstat_test.run.func1:83
   148  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f2]/time[0.200000-0.299999]=2 0.587235 // e.g. utils/binstat_test.run.func1:83
   149  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f2]/time[0.300000-0.399999]=1 0.356942 // e.g. utils/binstat_test.run.func1:83
   150  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f3]/time[0.300000-0.399999]=3 1.017089 // e.g. utils/binstat_test.run.func1:83
   151  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f4]/time[0.200000-0.299999]=2 0.562171 // e.g. utils/binstat_test.run.func1:83
   152  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f4]/time[0.300000-0.399999]=1 0.315660 // e.g. utils/binstat_test.run.func1:83
   153  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f5]/time[0.200000-0.299999]=2 0.567996 // e.g. utils/binstat_test.run.func1:83
   154  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f5]/time[0.300000-0.399999]=1 0.301534 // e.g. utils/binstat_test.run.func1:83
   155  /GOMAXPROCS=1,CPUS=8/what[~1f-via-f6]/time[0.300000-0.399999]=3 1.157075 // e.g. utils/binstat_test.run.func1:83
   156  /GOMAXPROCS=1,CPUS=8/what[internal-NumG]/size[4]=6 0.000185 // e.g. utils/binstat.tick:424
   157  /GOMAXPROCS=1,CPUS=8/what[internal-NumG]/size[5]=2 0.000042 // e.g. utils/binstat.tick:424
   158  /GOMAXPROCS=1,CPUS=8/what[internal-NumG]/size[9]=2 0.000012 // e.g. utils/binstat.tick:424
   159  /GOMAXPROCS=1,CPUS=8/what[internal-NumG]/size[10-19]=9 0.000137 // e.g. utils/binstat.tick:424
   160  /GOMAXPROCS=1,CPUS=8/what[internal-dump]/time[0.000600-0.000699]=1 0.000650 // e.g. utils/binstat.dump:436
   161  /GOMAXPROCS=1,CPUS=8/what[loop-0]/time[1.000000-1.999999]=1 1.812913 // e.g. utils/binstat_test.TestWithPprof:176
   162  ```
   163  
   164  * This part is when `GOMAXPROCS=8`.
   165    * Note: The `f1` - `f6` functions executed in parallel without pre-emption & therefore quicker.
   166  
   167  ```
   168  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f1]/time[0.070000-0.079999]=2 0.145029 // e.g. utils/binstat_test.run.func1:83
   169  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f1]/time[0.080000-0.089999]=1 0.082286 // e.g. utils/binstat_test.run.func1:83
   170  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f2]/time[0.070000-0.079999]=2 0.144383 // e.g. utils/binstat_test.run.func1:83
   171  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f2]/time[0.100000-0.199999]=1 0.100980 // e.g. utils/binstat_test.run.func1:83
   172  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f3]/time[0.070000-0.079999]=2 0.146867 // e.g. utils/binstat_test.run.func1:83
   173  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f3]/time[0.100000-0.199999]=1 0.106264 // e.g. utils/binstat_test.run.func1:83
   174  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f4]/time[0.070000-0.079999]=2 0.144903 // e.g. utils/binstat_test.run.func1:83
   175  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f4]/time[0.090000-0.099999]=1 0.098164 // e.g. utils/binstat_test.run.func1:83
   176  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f5]/time[0.070000-0.079999]=2 0.144949 // e.g. utils/binstat_test.run.func1:83
   177  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f5]/time[0.090000-0.099999]=1 0.098013 // e.g. utils/binstat_test.run.func1:83
   178  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f6]/time[0.070000-0.079999]=2 0.141520 // e.g. utils/binstat_test.run.func1:83
   179  /GOMAXPROCS=8,CPUS=8/what[~1f-via-f6]/time[0.100000-0.199999]=1 0.103813 // e.g. utils/binstat_test.run.func1:83
   180  /GOMAXPROCS=8,CPUS=8/what[~2egEnterTimeVal]/size[100-199]/time[0.000010-0.000019]=1 0.000017 // e.g. utils/binstat.TestBinstatInternal:20
   181  /GOMAXPROCS=8,CPUS=8/what[~2egEnterTimeVal]/size[myPoint]/time[0.000005-0.000005]=1 0.000006 // e.g. utils/binstat.TestBinstatInternal:20
   182  /GOMAXPROCS=8,CPUS=8/what[~2egEnter]/size[100-199]=1 0.000070 // e.g. utils/binstat.TestBinstatInternal:17
   183  /GOMAXPROCS=8,CPUS=8/what[~2egenterTimeValInternal]/size[100-199]/time[0.000004-0.000004]=1 0.000005 // e.g. utils/binstat.TestBinstatInternal:24
   184  /GOMAXPROCS=8,CPUS=8/what[~2egenterTimeValInternal]/size[myPoint]/time[0.000002-0.000002]=1 0.000002 // e.g. utils/binstat.TestBinstatInternal:24
   185  /GOMAXPROCS=8,CPUS=8/what[internal-NumG]/size[4]=5 0.000160 // e.g. utils/binstat.tick:424
   186  /GOMAXPROCS=8,CPUS=8/what[internal-NumG]/size[5]=1 0.000029 // e.g. utils/binstat.tick:424
   187  /GOMAXPROCS=8,CPUS=8/what[internal-NumG]/size[10-19]=2 0.000046 // e.g. utils/binstat.tick:424
   188  /GOMAXPROCS=8,CPUS=8/what[internal-dump]/time[0.000900-0.000999]=1 0.000943 // e.g. utils/binstat.dump:436
   189  /GOMAXPROCS=8,CPUS=8/what[loop-1]/time[0.800000-0.899999]=1 0.819436 // e.g. utils/binstat_test.TestWithPprof:176
   190  ```
   191  
   192  * This part shows internal timings, as well as the Unix time, elapsed seconds since process start, & the number of GCs and the total seconds they took..
   193  
   194  ```
   195  /internal/GCStats=9 0.001566
   196  /internal/binstat.debug=36 0.001022
   197  /internal/binstat.enter=70 0.001932
   198  /internal/binstat.point=71 0.000882
   199  /internal/binstat.x_2_y=111 0.000383
   200  /internal/second=1624493407 2.665259
   201  "}
   202  ```
   203  
   204  ## Environment variables
   205  
   206  * If `BINSTAT_ENABLE` exists, `binstat` is enabled. Default: Disabled.
   207  * If `BINSTAT_VERBOSE` exists, `binstat` outputs debug info & e.g. function location. Default: Disabled.
   208  * If `BINSTAT_DMP_NAME` exists, use it. Default: `<process name>.pid-<process pid>.binstat.txt`.
   209  * If `BINSTAT_DMP_PATH` exists, output dump to `<BINSTAT_DMP_PATH>/<BINSTAT_DMP_NAME>`. Default: `.`.
   210  * If `BINSTAT_CUT_PATH` exists, cut function name path with this. Default: `=github.com/onflow/flow-go/`.
   211  * If `BINSTAT_LEN_WHAT` exists, truncate `<what>` to `<len>` from `~<what>=<len>`, e.g. `~f=99;~2eg=0`.
   212  
   213  ## Todo
   214  
   215  * Monitor this [tsc github issue](https://github.com/templexxx/tsc/issues/8) in case we can accelerate timing further.
   216  * How to best compress & access the per second `binstat` file from outside the container?