---
layout: page
title: Examples
permalink: /examples/
order: 10
---

* toc
{:toc}

This document assumes that the latest release archive from
[https://github.com/kythe/kythe/releases](https://github.com/kythe/kythe/releases)
has been unpacked into /opt/kythe/.  See /opt/kythe/README.md for more
information.

## Extracting Compilations

{% highlight bash %}
# Generate Kythe protobuf sources
bazel build //kythe/proto:all

# Environment variables common to Kythe extractors
export KYTHE_ROOT_DIRECTORY="$PWD"                        # Root of source code corpus
export KYTHE_OUTPUT_DIRECTORY="/tmp/kythe.compilations/"  # Output directory
export KYTHE_VNAMES="$PWD/kythe/data/vnames.json"         # Optional: VNames configuration

mkdir -p "$KYTHE_OUTPUT_DIRECTORY"

# Extract a Java compilation
# java -Xbootclasspath/p:third_party/javac/javac*.jar \
#   com.google.devtools.kythe.extractors.java.standalone.Javac8Wrapper \
#   <javac_arguments>
java -Xbootclasspath/p:third_party/javac/javac*.jar \
  -jar /opt/kythe/extractors/javac_extractor.jar \
  com.google.devtools.kythe.extractors.java.standalone.Javac8Wrapper \
  kythe/java/com/google/devtools/kythe/platform/kzip/*.java

# Extract a C++ compilation
# /opt/kythe/extractors/cxx_extractor <arguments>
/opt/kythe/extractors/cxx_extractor -x c++ kythe/cxx/common/scope_guard.h
{% endhighlight %}

## Extracting Compilations using Bazel

Kythe uses Bazel to build itself and has implemented Bazel
[action_listener](https://docs.bazel.build/versions/master/be/extra-actions.html#action_listener)s
that use Kythe's Java and C++ extractors.  This effectively allows Bazel to
extract each compilation as it is run during the build.

### Extracting the Kythe repository

Add the flag
`--experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_java`
to make Bazel extract Java compilations and
`--experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_cxx` to do the
same for C++.

{% highlight bash %}
# Extract all Java/C++ compilations in Kythe
bazel build -k \
  --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_java \
  --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_cxx \
  --experimental_extra_action_top_level_only \
  //kythe/cxx/... //kythe/java/...

# Find the extracted .kzip files
find -L bazel-out -name '*.kzip'
{% endhighlight %}

### Extracting other Bazel-based repositories

You can use the Kythe release to extract compilations from other Bazel-based
repositories.

{% highlight bash %}
# Download and unpack the latest Kythe release.
# KYTHE_VERSION must already name a release tag (for example v0.0.50).
wget -O /tmp/kythe.tar.gz \
    https://github.com/kythe/kythe/releases/download/$KYTHE_VERSION/kythe-$KYTHE_VERSION.tar.gz
tar --no-same-owner -xvzf /tmp/kythe.tar.gz --directory /opt
# BASH_ENV names a file that bash sources when starting a non-interactive
# shell, so KYTHE_DIR becomes available to later scripted steps. In an
# interactive shell, export KYTHE_DIR directly instead.
echo 'KYTHE_DIR=/opt/kythe-$KYTHE_VERSION' >> "$BASH_ENV"

# Build the repository with extraction enabled
bazel --bazelrc=$KYTHE_DIR/extractors.bazelrc \
    build --override_repository kythe_release=$KYTHE_DIR \
    //...
{% endhighlight %}

## runextractor tool

`runextractor` is a generic extraction tool that works with any build system capable of emitting a [compile_commands.json](https://clang.llvm.org/docs/JSONCompilationDatabase.html) file. `runextractor` invokes an extractor for each compilation action listed in compile_commands.json and generates a kzip in the output directory for each.

Build systems capable of emitting a `compile_commands.json` include CMake, Ninja, [gn](https://gn.googlesource.com/gn/+/master/docs/reference.md), waf, and others.

### runextractor configuration

`runextractor` is configured via a set of environment variables:

*   `KYTHE_ROOT_DIRECTORY`: The absolute path for file input to be extracted.
    This is generally the root of the repository. All files extracted will be
    stored relative to this path.
*   `KYTHE_OUTPUT_DIRECTORY`: The absolute path for storing output.
*   `KYTHE_CORPUS`: The corpus label for extracted files.
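As a rough sketch, the variables above could be validated before `runextractor` is invoked. The `check_kythe_env` function below is a hypothetical helper, not part of the Kythe release. It only checks that each variable is set and that the two directory variables hold absolute paths.

```shell
# Hypothetical pre-flight check; not part of the Kythe release.
# Reports each problem on stderr and exits non-zero if any was found.
check_kythe_env() (
  status=0
  # All three variables must be non-empty.
  for name in KYTHE_ROOT_DIRECTORY KYTHE_OUTPUT_DIRECTORY KYTHE_CORPUS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    fi
  done
  # The two directory variables are documented as absolute paths.
  for name in KYTHE_ROOT_DIRECTORY KYTHE_OUTPUT_DIRECTORY; do
    eval "value=\${$name:-}"
    case "$value" in
      ''|/*) ;;  # unset (already reported) or absolute
      *) echo "$name must be an absolute path: $value" >&2; status=1 ;;
    esac
  done
  exit "$status"
)
```

One possible use is `check_kythe_env && /opt/kythe/tools/runextractor compdb -extractor /opt/kythe/extractors/cxx_extractor`, so that extraction only starts once the environment looks complete.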

### Extracting from a compile_commands.json file

This example uses Ninja, but the first step can be adapted for others.

1. Begin by building your project with compile_commands.json enabled. For Ninja, the command is `ninja -t compdb > compile_commands.json`.
2. Set the environment variables described in the section above.
3. Invoke runextractor: `/opt/kythe/tools/runextractor compdb -extractor /opt/kythe/extractors/cxx_extractor`
4. If successful, the output directory should contain one kzip for each compilation action. An optional last step is to merge these into one kzip with `/opt/kythe/tools/kzip merge --output $KYTHE_OUTPUT_DIRECTORY/merged.kzip $KYTHE_OUTPUT_DIRECTORY/*.kzip`.
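The check in step 4 can be scripted. The following is only a sketch: `count_kzips` and `count_actions` are hypothetical helper names, and `count_actions` assumes that `jq` is installed and that compile_commands.json is a JSON array with one object per compilation action.

```shell
# Hypothetical helpers for checking step 4; not part of the Kythe release.

# Count the .kzip files directly inside the directory given as $1.
count_kzips() {
  find "$1" -maxdepth 1 -type f -name '*.kzip' | wc -l | tr -d ' '
}

# Count the entries in the compilation database given as $1 (requires jq).
count_actions() {
  jq length "$1"
}
```

Comparing `count_kzips "$KYTHE_OUTPUT_DIRECTORY"` with `count_actions compile_commands.json` before the optional merge step gives a quick indication of whether every action produced a kzip. Run it before merging, because `merged.kzip` would otherwise be counted as well.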

### Extracting CMake-based repositories

The `runextractor` tool has a convenience subcommand for CMake-based repositories that first invokes CMake to generate a compile_commands.json, then processes the listed compilation actions. However, the same result can be achieved by invoking CMake manually and then using the generic `runextractor compdb` command.

**These instructions assume your environment is already set up to successfully
run cmake for your repository**.

```shell
$ export KYTHE_ROOT_DIRECTORY="/absolute/path/to/repo/root"
$ export KYTHE_CORPUS="github.com/myproject/myrepo"

$ export KYTHE_OUTPUT_DIRECTORY="/tmp/kythe-output"
$ mkdir -p "$KYTHE_OUTPUT_DIRECTORY"

# $CMAKE_ROOT_DIRECTORY is passed into the -sourcedir flag. This value should be
# the directory that contains the top-level CMakeLists.txt file. In many
# repositories this path is the same as $KYTHE_ROOT_DIRECTORY.
$ export CMAKE_ROOT_DIRECTORY="/absolute/path/to/cmake/root"

$ /opt/kythe/tools/runextractor cmake \
    -extractor=/opt/kythe/extractors/cxx_extractor \
    -sourcedir="$CMAKE_ROOT_DIRECTORY"
```

## Extracting Gradle-based repositories

1. Install compiler wrapper

    Extraction works by intercepting all calls to `javac` and saving the compiler arguments and inputs to a "compilation unit", which is stored in a .kzip file. We have a javac-wrapper.sh script that forwards javac calls to the Java extractor and then calls javac. Add this to the end of your project's build.gradle:

    ```groovy
    allprojects {
      gradle.projectsEvaluated {
        tasks.withType(JavaCompile) {
          options.fork = true
          options.forkOptions.executable = '/opt/kythe/extractors/javac-wrapper.sh'
        }
      }
    }
    ```

2. VName configuration

    Next, you will need to create a vnames.json mapping file, which tells the
    extractor how to assign vnames to files based on their paths. A basic vnames
    config for a Gradle project looks like:

    ```json
    [
      {
        "pattern": "(build/[^/]+)/(.*)",
        "vname": {
          "corpus": "MY_CORPUS",
          "path": "@2@",
          "root": "@1@"
        }
      },
      {
        "pattern": ".*/.gradle/caches/(.*)",
        "vname": {
          "corpus": "MY_CORPUS",
          "path": "@1@",
          "root": ".gradle/caches"
        }
      },
      {
        "pattern": "(.*)",
        "vname": {
          "corpus": "MY_CORPUS",
          "path": "@1@"
        }
      }
    ]
    ```

    (Note: change `MY_CORPUS` to the actual corpus for your project.)

    You can test your vname config using the `vnames` command line tool. For example:

    ```shell
    bazel build //kythe/go/util/tools/vnames

    echo "some/test/path.java" | ./bazel-bin/kythe/go/util/tools/vnames/vnames apply-rules --rules vnames.json
    > {
    >   "corpus": "MY_CORPUS",
    >   "path": "some/test/path.java"
    > }
    ```

3. Extraction

    ```shell
    # note: you may want to use a different javac depending on your install
    export REAL_JAVAC="/usr/bin/javac"
    export JAVA_HOME="$(readlink -f "$REAL_JAVAC" | sed 's:/bin/javac::')"
    export JAVAC_EXTRACTOR_JAR="/opt/kythe/extractors/javac_extractor.jar"

    export KYTHE_VNAMES="$PWD/vnames.json"

    export KYTHE_ROOT_DIRECTORY="$PWD" # paths in the compilation unit will be made relative to this
    export KYTHE_OUTPUT_DIRECTORY="/tmp/extracted_gradle_project"
    mkdir -p "$KYTHE_OUTPUT_DIRECTORY"

    ./gradlew clean build -x test -Dno_werror=true

    # merge all kzips into one
    /opt/kythe/tools/kzip merge --output "$KYTHE_OUTPUT_DIRECTORY/merged.kzip" "$KYTHE_OUTPUT_DIRECTORY"/*.kzip
    ```

4. Examine results

    If extraction was successful, the final kzip should be at `$KYTHE_OUTPUT_DIRECTORY/merged.kzip`. The `kzip` tool can be used to inspect the result.

    ```shell
    $ kzip info --input merged.kzip | jq . # view summary information
    $ kzip view merged.kzip | jq .         # view all compilation units in the kzip
    ```
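To make the vnames.json rules from step 2 more concrete, the sketch below mimics the three example rules in plain shell. It is an illustration only. It assumes that rules are tried in order, that a pattern must match the whole path, and that the first matching rule wins. It also uses shell glob patterns instead of the regular expressions that the real tooling evaluates. `apply_example_rules` is a hypothetical name, and the output format is made up for readability.

```shell
# Illustration only: emulates the three example rules with POSIX shell
# patterns. Assumes rules are tried in order and the first match wins.
apply_example_rules() (
  p=$1
  case "$p" in
    build/[!/]*/*)
      # Rule 1: (build/[^/]+)/(.*)  ->  root=@1@, path=@2@
      rest=${p#build/}
      echo "corpus=MY_CORPUS root=build/${rest%%/*} path=${rest#*/}"
      ;;
    */.gradle/caches/*)
      # Rule 2: .*/.gradle/caches/(.*)  ->  root=.gradle/caches, path=@1@
      # (the glob treats '.' literally, unlike the regular expression)
      echo "corpus=MY_CORPUS root=.gradle/caches path=${p##*/.gradle/caches/}"
      ;;
    *)
      # Rule 3: (.*)  ->  path=@1@, no root
      echo "corpus=MY_CORPUS path=$p"
      ;;
  esac
)
```

For example, `apply_example_rules build/classes/java/Foo.class` falls under the first rule, while `apply_example_rules some/test/path.java` only matches the catch-all rule, which agrees with the `vnames apply-rules` output shown in step 2.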

## Extracting projects built with `make`

Projects built with make can be extracted by substituting the C/C++ compiler with a wrapper script that invokes both Kythe's cxx_extractor binary and the actual C/C++ compiler.

Given a simple example project:

```c++
// main.cc

#include <iostream>

int main(int argc, char** argv) {
    std::cout << "Hello" << std::endl;
}
```

```make
# makefile

all: bin

bin: main.cc
	$(CXX) main.cc -o bin
```

```shell
#!/bin/bash -e
# cxx_wrapper.sh

"$KYTHE_RELEASE_DIR/extractors/cxx_extractor" "$@" &
/usr/bin/c++ "$@"
```

Extraction is done by setting the `CXX` make variable as well as some environment variables that configure `cxx_extractor`.

```shell
# directory where kythe release has been installed
export KYTHE_RELEASE_DIR=/opt/kythe-v0.0.50

# parameters for cxx_extractor
export KYTHE_CORPUS=mycorpus
export KYTHE_ROOT_DIRECTORY="$PWD"
export KYTHE_OUTPUT_DIRECTORY=/tmp/extract

# the wrapper must be executable and reachable by make
chmod +x cxx_wrapper.sh
export CXX="$PWD/cxx_wrapper.sh"

mkdir -p "$KYTHE_OUTPUT_DIRECTORY"

make
```

If all goes well, this will populate `$KYTHE_OUTPUT_DIRECTORY` with kzip files, one for each compiler invocation. These files can be inspected with the `kzip` tool distributed as part of the Kythe release. For example: `kzip view $KYTHE_OUTPUT_DIRECTORY/some.file.kzip | jq`.
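The `cxx_wrapper.sh` script above starts the extractor in the background and does not wait for it. A possible variant, sketched below, waits for the extractor before returning and keeps the compiler's exit status. This is an assumption-laden sketch, not a script shipped with Kythe. `KYTHE_EXTRACTOR` and `REAL_CXX` are hypothetical override variables introduced for this example, and the `/opt/kythe` fallback is only the install location assumed at the top of this document.

```shell
# Sketch of a wrapper that waits for extraction to finish.
# KYTHE_EXTRACTOR and REAL_CXX are hypothetical overrides added for this
# example; the defaults mirror cxx_wrapper.sh.
cxx_wrapper() (
  extractor=${KYTHE_EXTRACTOR:-${KYTHE_RELEASE_DIR:-/opt/kythe}/extractors/cxx_extractor}
  compiler=${REAL_CXX:-/usr/bin/c++}

  # Start the extractor in the background, as the original wrapper does.
  "$extractor" "$@" &
  extractor_pid=$!

  # Run the real compiler and remember its exit status.
  compile_status=0
  "$compiler" "$@" || compile_status=$?

  # Wait for the extractor so its kzip is complete before make continues.
  # A failed extraction is reported but does not fail the build.
  wait "$extractor_pid" || echo "warning: extraction failed for: $*" >&2

  exit "$compile_status"
)
```

To use it as a wrapper script, the function could be saved in a file with a shebang line, followed by a final line `cxx_wrapper "$@"`.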

## Indexing Compilations

All Kythe indexers analyze compilations that
[extractors](#extracting-compilations) emit as
[.kzip files]({{site.baseurl}}/docs/kythe-kzip.html).  The indexers will then
emit a [delimited
stream]({{site.data.development.source_browser}}/kythe/go/platform/delimited/delimited.go)
of [entry protobufs]({{site.baseurl}}/docs/kythe-storage.html#_entry) that can
then be stored in a [GraphStore]({{site.baseurl}}/docs/kythe-storage.html).

{% highlight bash %}
# Indexing a C++ compilation
# /opt/kythe/indexers/cxx_indexer --ignore_unimplemented <kzip-file> > entries
/opt/kythe/indexers/cxx_indexer --ignore_unimplemented \
  .kythe_compilations/c++/kythe_cxx_indexer_cxx_libIndexerASTHooks.cc.c++.kzip > entries

# Indexing a Java compilation
# java -Xbootclasspath/p:third_party/javac/javac*.jar \
#   com.google.devtools.kythe.analyzers.java.JavaIndexer \
#   <kzip-file> > entries
java -Xbootclasspath/p:third_party/javac/javac*.jar \
  com.google.devtools.kythe.analyzers.java.JavaIndexer \
  "$PWD/.kythe_compilations/java/kythe_java_com_google_devtools_kythe_analyzers_java_analyzer.java.kzip" > entries

# View indexer's output entry stream as JSON
/opt/kythe/tools/entrystream --write_format=json < entries

# Write entry stream into a GraphStore
/opt/kythe/tools/write_entries --graphstore leveldb:/tmp/gs < entries
{% endhighlight %}
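When a directory holds many kzip files, the single-file commands above can be wrapped in a loop. The sketch below assumes that every kzip in the directory is a C++ compilation and that the entry streams of separate indexer runs can simply be concatenated. `index_all_cxx` and the `CXX_INDEXER` override are hypothetical names introduced for this example. The default path and the `--ignore_unimplemented` flag are the ones used above.

```shell
# Sketch: index every .kzip in a directory into one entry stream on stdout.
# CXX_INDEXER is a hypothetical override used for illustration only.
index_all_cxx() (
  indexer=${CXX_INDEXER:-/opt/kythe/indexers/cxx_indexer}
  for kzip in "$1"/*.kzip; do
    # Skip the unexpanded pattern when the directory has no kzip files.
    [ -e "$kzip" ] || continue
    "$indexer" --ignore_unimplemented "$kzip"
  done
)
```

The output could then be piped into the GraphStore writer shown above, for example `index_all_cxx "$KYTHE_OUTPUT_DIRECTORY" | /opt/kythe/tools/write_entries --graphstore leveldb:/tmp/gs`. If the directory also contains a merged kzip, its compilations would be indexed twice.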

## Indexing the Kythe Repository

{% highlight bash %}
mkdir -p .kythe_{graphstore,compilations}
# .kythe_serving is the output directory for the resulting Kythe serving tables
# .kythe_graphstore is the output directory for the resulting Kythe GraphStore
# .kythe_compilations will contain the intermediary .kzip file for each
#   indexed compilation

# Produce the .kzip files for each compilation in the Kythe repo
./kythe/extractors/bazel/extract.sh "$PWD" .kythe_compilations

# Index the compilations, producing a GraphStore containing a Kythe index
bazel build //kythe/release:docker
docker run --rm \
  -v "${PWD}:/repo" \
  -v "${PWD}/.kythe_compilations:/compilations" \
  -v "${PWD}/.kythe_graphstore:/graphstore" \
  google/kythe --index

# Generate corresponding serving tables
/opt/kythe/tools/write_tables --graphstore .kythe_graphstore --out .kythe_serving
{% endhighlight %}

## Using Cayley to explore a GraphStore

Install Cayley if necessary:
[https://github.com/google/cayley/releases](https://github.com/google/cayley/releases)

{% highlight bash %}
# Convert GraphStore to N-Quads format
bazel run //kythe/go/storage/tools/triples --graphstore /path/to/graphstore | \
  gzip >kythe.nq.gz

cayley repl --dbpath kythe.nq.gz # or cayley http --dbpath kythe.nq.gz
{% endhighlight %}

    // Get all file nodes
    cayley> g.V().Has("/kythe/node/kind", "file").All()

    // Get definition anchors for all record nodes
    cayley> g.V().Has("/kythe/node/kind", "record").Tag("record").In("/kythe/edge/defines").All()

    // Get the file(s) defining a particular node
    cayley> g.V("node_ticket").In("/kythe/edge/defines").Out("/kythe/edge/childof").Has("/kythe/node/kind", "file").All()

## Serving data over HTTP

The `http_server` tool can be run over the serving table created with the
`write_tables` binary (see above).

{% highlight bash %}
# --listen localhost:8080 allows access from only this machine; change to
# --listen :8080 to allow access from any machine
/opt/kythe/tools/http_server \
  --listen localhost:8080 \
  --serving_table .kythe_serving
{% endhighlight %}