github.com/westcoastroms/westcoastroms-build@v0.0.0-20190928114312-2350e5a73030/build/soong/docs/perf.md

github.com/westcoastroms/westcoastroms-build@v0.0.0-20190928114312-2350e5a73030/build/soong/docs/perf.md (about)

     1  # Build Performance
     2  
     3  ## Debugging Build Performance
     4  
     5  ### Tracing
     6  
     7  soong_ui has tracing built in, so that every build execution's trace can be
     8  viewed.  Just open `$OUT_DIR/build.trace.gz` in Chrome's <chrome://tracing>, or
     9  with [catapult's trace viewer][catapult trace_viewer]. The last few traces are
    10  stored in `build.trace.#.gz` (larger numbers are older). The associated logs
    11  are stored in `soong.#.log`.
    12  
    13  ![trace example](./trace_example.png)
    14  
    15  ### Soong
    16  
    17  Soong can be traced and profiled using the standard Go tools. It understands
    18  the `-cpuprofile`, `-trace`, and `-memprofile` command line arguments, but we
    19  don't currently have an easy way to enable them in the context of a full build.
    20  
    21  ### Kati
    22  
    23  In general, the slow path of reading Android.mk files isn't particularly
    24  performance sensitive, since it doesn't need to happen on every build. It is
    25  important for the fast-path (detecting whether it needs to regenerate the ninja
    26  file) to be fast however. And it shouldn't hit the slow path too often -- so
    27  don't rely on output of a `$(shell)` command that includes the current timestamp,
    28  or read a file that's going to change on every build.
    29  
    30  #### Regen check is slow
    31  
    32  In most cases, we've found that the fast-path is slow because all of the
    33  `$(shell)` commands need to be re-executed to determine if their output changed.
    34  The `$OUT_DIR/soong.log` contains statistics from the regen check:
    35  
    36  ```
    37  .../kati.go:127: *kati*: regen check time: 1.699207
    38  .../kati.go:127: *kati*: glob time (regen): 0.377193 / 33609
    39  .../kati.go:127: *kati*: shell time (regen): 1.313529 / 184
    40  .../kati.go:127: *kati*:  0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
    41  .../kati.go:127: *kati*:  0.105 cd packages/apps/Dialer ; find -L . -type d -name "res"
    42  .../kati.go:127: *kati*:  0.035 find device vendor -maxdepth 4 -name '*_aux_variant_config.mk' -o -name '*_aux_os_config.mk' | sort
    43  .../kati.go:127: *kati*:  0.029 cd frameworks/base ; find -L  core/java graphics/java location/java media/java media/mca/effect/java media/mca/filterfw/java media/mca/filterpacks/java drm/java opengl/java sax/java telecomm/java telephony/java wifi/java lowpan/java keystore/java rs/java ../opt/telephony/src/java/android/telephony ../opt/telephony/src/java/android/telephony/gsm ../opt/net/voip/src/java/android/net/rtp ../opt/net/voip/src/java/android/net/sip   -name "*.html" -and -not -name ".*"
    44  .../kati.go:127: *kati*:  0.025 test -d device && find -L device -maxdepth 4 -path '*/marlin/BoardConfig.mk'
    45  .../kati.go:127: *kati*:  0.023 find packages/apps/Settings/tests/robotests -type f -name '*Test.java' | sed -e 's!.*\(com/google.*Test\)\.java!\1!' -e 's!.*\(com/android.*Test\)\.java!\1!' | sed 's!/!\.!g' | cat
    46  .../kati.go:127: *kati*:  0.022 test -d vendor && find -L vendor -maxdepth 4 -path '*/marlin/BoardConfig.mk'
    47  .../kati.go:127: *kati*:  0.017 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L  ../src -name "*.java" -and -not -name ".*"
    48  .../kati.go:127: *kati*:  0.016 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L  ../../common/src -name "*.java" -and -not -name ".*"
    49  .../kati.go:127: *kati*:  0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
    50  .../kati.go:127: *kati*: stat time (regen): 0.250384 / 4405
    51  ```
    52  
    53  In this case, the total time spent checking was 1.69 seconds, even though the
    54  other "(regen)" numbers add up to more than that (some parts are parallelized
    55  where possible). The biggest contributor is the `$(shell)` times -- 184
    56  executions took a total of 1.31 seconds. The top 10 longest shell functions are
    57  printed.
    58  
    59  All the longest commands in this case are all variants of a call to `find`, but
    60  this is where using pure make functions instead of calling out to the shell can
    61  make a performance impact -- many calls to check if `26 > 20` can add up. We've
    62  added some basic math functions in `math.mk` to help some common use cases that
    63  used to be rather expensive when they were used too often.
    64  
    65  There are some optimizations in place for find commands -- if Kati can
    66  understand the find command, the built-in find emulator can turn some of them
    67  into glob or stat checks (falling back to calling `find` if one of those imply
    68  that the output may change). Many of the common macros produce find commands
    69  that Kati can understand, but if you're writing your own, you may want to
    70  experiment with other options if they're showing up in this list. For example,
    71  if this was significantly more expensive (either in runtime, or was called
    72  often):
    73  
    74  ```
    75  .../kati.go:127: *kati*:  0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
    76  ```
    77  
    78  It may be more efficient to move the grep into make, so that the `find` portion
    79  can be rewritten and cached:
    80  
    81  ```
    82  $(filter-out $(file <$(LOCAL_PATH)/java_tests_blacklist),$(call all-java-files-under,luni/src/test/java))
    83  ```
    84  
    85  Others can be simplified by just switching to an equivalent find command that
    86  Kati understands:
    87  
    88  ```
    89  .../kati.go:127: *kati*:  0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
    90  ```
    91  
    92  By adding the implicit `-a` and moving the `| sort` to Make, this can now be
    93  cached by Kati:
    94  
    95  ```
    96  $(sort $(shell find device vendor -type -f -a -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop))
    97  ```
    98  
    99  Kati is learning about the implicit `-a` in [this change](https://github.com/google/kati/pull/132)
   100  
   101  #### Kati regens too often
   102  
   103  Kati prints out what triggered the slow path to be taken -- this can be a
   104  changed file, a changed environment variable, or different output from a
   105  `$(shell)` command:
   106  
   107  ```
   108  out/soong/Android-aosp_arm.mk was modified, regenerating...
   109  ```
   110  
   111  The state is stored in `$OUT_DIR/.kati_stamp*` files, and can be (partially)
   112  read with the `ckati_stamp_dump` tool in prebuilts/build-tools. More debugging
   113  is available when ckati is run with `--regen_debug`, but that can be a lot of
   114  data to understand.
   115  
   116  ### Ninja
   117  
   118  #### Understanding why something rebuilt
   119  
   120  Add `NINJA_ARGS="-d explain"` to your environment before a build, this will cause
   121  ninja to print out explanations on why actions were taken. Start reading from the
   122  beginning, as this much data can be hard to read:
   123  
   124  ```
   125  $ cd art
   126  $ mma
   127  $ touch runtime/jit/profile_compilation_info.h
   128  $ NINJA_ARGS="-d explain" mma
   129  ...
   130  ninja explain: output out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o older than most recent input art/runtime/jit/profile_compilation_info.h (
   131  1516683538 vs 1516685188)
   132  ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o is dirty
   133  ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/cpp-define-generator-data is dirty
   134  ninja explain: out/soong/host/linux-x86/bin/cpp-define-generator-data is dirty
   135  ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-asm-support/gen/asm_support_gen.h is dirty
   136  ninja explain: out/soong/.intermediates/art/cmdline/art_cmdline_tests/android_arm_armv7-a_core_cmdline_parser_test/obj/art/cmdline/cmdline_parser_test.o is dirty
   137  ...
   138  ```
   139  
   140  In this case, art/cmdline/cmdline_parser_test.o was rebuilt because it uses
   141  asm_support_gen.h, which was generated by cpp-define-generator-data, which uses
   142  profile_compilation_info.h.
   143  
   144  You'll likely need to cross-reference this data against the build graph in the
   145  various .ninja files. The files are (mostly) human-readable, but a (slow) web
   146  interface can be used by running `NINJA_ARGS="-t browse <target>" m`.
   147  
   148  #### Builds take a long time
   149  
   150  If the long part in the trace view of a build is a relatively solid block, then
   151  the performance is probably more related to how much time the actual build
   152  commands are taking than having extra dependencies, or slowdowns in
   153  soong/kati/ninja themselves.
   154  
   155  Beyond looking at visible outliers in the trace view, we don't have any tooling
   156  to help in this area yet. It's possible to aggregate some of the raw data
   157  together, but since our builds are heavily parallelized, it's particularly easy
   158  for build commands to impact unrelated build commands. This is an area we'd
   159  like to improve -- we expect keeping track of user/system time per-action would
   160  provide more reliable data, but tracking some full-system data (memory/swap
   161  use, disk bandwidth, etc) may also be necessary.
   162  
   163  ## Known Issues
   164  
   165  ### Common
   166  
   167  #### mm
   168  
   169  Soong always loads the entire module graph, so as modules convert from Make to
   170  Soong, `mm` is becoming closer to `mma`. This produces more correct builds, but
   171  does slow down builds, as we need to verify/produce/load a larger build graph.
   172  
   173  We're exploring a few options to speed up build startup, one being [an
   174  experimental set of ninja patches][ninja parse optimization],
   175  though that's not the current path we're working towards.
   176  
   177  ### Android 8.1 (Oreo MR1)
   178  
   179  In some cases, a tree would get into a state where Soong would be run twice on
   180  every incremental build, even if there was nothing to do. This was fixed in
   181  master with [these changes][blueprint_microfactory], but they were too
   182  significant to backport at the time. And while they fix this particular issue,
   183  they appear to cause ninja to spend more time during every build loading the
   184  `.ninja_log` / `.ninja_deps` files, especially as they become larger.
   185  
   186  A workaround to get out of this state is to remove the build.ninja entry from
   187  `$OUT_DIR/.ninja_log`:
   188  
   189  ```
   190  sed -i "/\/build.ninja/d" $(get_build_var OUT_DIR)/.ninja_log
   191  ```
   192  
   193  [catapult trace_viewer]: https://github.com/catapult-project/catapult/blob/master/tracing/README.md
   194  [ninja parse optimization]: https://android-review.googlesource.com/c/platform/external/ninja/+/461005
   195  [blueprint_microfactory]: https://android-review.googlesource.com/q/topic:%22blueprint_microfactory%22+status:merged