// Copyright 2016 The Kythe Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Callgraphs
==========
:Revision: 1.0
:toc2:
:toclevels: 3
:priority: 999

The Kythe graph contains information about callable objects like
link:./#function[functions] and link:./#refcall[ref/call]
edges that target them. This information can be used to satisfy queries about
the locations at which different functions are called:

[kythe,C++,"foo calls bar.",0]
--------------------------------------------------------------------------------
//- @bar defines/binding FnBar
void bar() { }
//- @"bar()" ref/call FnBar
//- @"bar()" childof FnFoo
//- @foo defines/binding FnFoo
void foo() { bar(); }
--------------------------------------------------------------------------------

In this example, function `foo` makes a single call to function `bar`. The
callsite is the expression `bar()`. This is spanned by an anchor that has a
`ref/call` edge pointing to the function node for `bar`. The anchor also has a
`childof` edge that points to function `foo`. This edge indicates that the
effects of the anchor should be blamed on `foo`. Each anchor is the source of
at most one such blame `childof` edge, whose target is a semantic node.

The assertions in the example trace a possible set of queries made against
a Kythe graph:

. Identify the semantic object for the definition in question (`FnBar`).
. Find the locations at which that object is called. These will be
  `ref/call` edges with `anchor` sources and semantic targets.
. Find the semantic objects associated with each callsite anchor. These will
  be attached by `childof` edges.
. Finally, we can show the binding location for the caller by looking up its
  `defines/binding` edge.
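
The four steps above can be sketched in a few lines of Python. This is a minimal illustration and not Kythe's query API: the graph is assumed to be a plain list of `(source, edge kind, target)` triples, the anchor labels such as `@bar()` are hypothetical stand-ins for real VNames, and `sources`, `targets`, and `callers_of` are names invented for this sketch. It also assumes that every `childof` edge leaving an anchor is a blame edge.

```python
# Edges of the "foo calls bar" example as (source, edge kind, target) triples.
# The anchor names are hypothetical labels, not real Kythe VNames.
EDGES = [
    ("@bar", "defines/binding", "FnBar"),
    ("@foo", "defines/binding", "FnFoo"),
    ("@bar()", "ref/call", "FnBar"),
    ("@bar()", "childof", "FnFoo"),
]


def sources(edges, kind, target):
    """Return every source node with an edge of `kind` pointing at `target`."""
    return [s for (s, k, t) in edges if k == kind and t == target]


def targets(edges, source, kind):
    """Return every target node reached from `source` by an edge of `kind`."""
    return [t for (s, k, t) in edges if s == source and k == kind]


def callers_of(edges, definition_anchor):
    """Follow the four steps: definition, callsites, callers, bindings."""
    results = []
    # Step 1: the semantic object bound at the definition anchor.
    for callee in targets(edges, definition_anchor, "defines/binding"):
        # Step 2: callsite anchors with a ref/call edge to the callee.
        for callsite in sources(edges, "ref/call", callee):
            # Step 3: the semantic object blamed for the callsite.
            for caller in targets(edges, callsite, "childof"):
                # Step 4: the binding location(s) of the caller.
                for binding in sources(edges, "defines/binding", caller):
                    results.append((callsite, caller, binding))
    return results
```

For the example graph, `callers_of(EDGES, "@bar")` yields a single result: the callsite `@bar()`, blamed on `FnFoo`, whose binding anchor is `@foo`.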

These queries will not capture the full set of traditional callsites. We will
fix this by first finding calls to forward declarations; then we will consider
overrides.

== Forward declarations

A call made using a forward declaration may not always `ref/call` the definition
that completes that declaration (see link:./#completedby[completedby]). This
allows the Kythe graph to model a set of programs (where different programs link
against different implementations for the same declaration) rather than a single
program (where linking different implementations in this way would result in
undefined behavior). In order to build a complete call graph, one must therefore
also look along `completedby` edges to find alternate possible nodes that may be
used to make calls to a given declaration or definition.

[kythe,C++,"foo calls a forward-declared bar.",0]
--------------------------------------------------------------------------------
#include "bardecl.h"
//- @"c.bar()" ref/call BarDecl
void foo(C& c) { c.bar(); }

#example bardecl.h
//- @bar defines/binding BarDecl
struct C { void bar(); };

#example barimpl.cc
#include "bardecl.h"
//- @bar defines/binding BarImpl
void C::bar() { }

// Imagine that we only had BarDecl (by looking up the ref/call from c.bar()
// in foo()). We can find the implementation BarImpl of C::bar by taking the
// following trip:
//- BarDecl completedby BarImpl
--------------------------------------------------------------------------------

Note that this overestimates the possible set of nodes for a given callee: for
example, in a given program a forward declaration might never be linked (in the
joining-together-object-files sense) with a definition. In order to filter these
extra associations out, one would need to consider both which program the user
has in focus and which modules went into building that program. This is out
of scope for this document.
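
As a rough sketch of the lookup described above, the following Python collects the nodes a callsite might reach: the declaration it calls directly, plus any definition that completes it. The triple-based graph, the label `@c.bar()`, and the function name `possible_callees` are assumptions made for illustration. Like the text, the sketch overestimates, since it does not consider which program is being linked.

```python
# Edges from the forward-declaration example as (source, kind, target) triples.
# "@c.bar()" is a hypothetical label for the callsite anchor.
EDGES = [
    ("@c.bar()", "ref/call", "BarDecl"),
    ("BarDecl", "completedby", "BarImpl"),
]


def possible_callees(edges, callsite):
    """Return the called nodes plus every definition that completes them."""
    callees = [t for (s, k, t) in edges if s == callsite and k == "ref/call"]
    # Iterate over a copy so that appending to `callees` is safe.
    for decl in list(callees):
        for (s, k, t) in edges:
            if s == decl and k == "completedby" and t not in callees:
                callees.append(t)
    return callees
```

Here `possible_callees(EDGES, "@c.bar()")` returns `BarDecl` followed by `BarImpl`.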

`completedby` relationships should only be established when an indexer observes
that a completion has actually occurred. This allows users of the graph to
avoid connecting implementations to signatures that those implementations
never observe:

[kythe,C++,"Unrelated defs and decls can be distinguished.",0]
--------------------------------------------------------------------------------
#example foo1.h
// A header that declares a global foo().
//- @foo defines/binding Foo1Decl
void foo();
#example foo2.h
// An unrelated header that also declares a foo().
//- @foo defines/binding Foo2Decl
void foo();
#example foo1.cc
#include "foo1.h"
// This definition sees the first declaration (but never the second).
//- @foo defines/binding Foo1Defn
//- Foo1Decl completedby Foo1Defn
void foo() { }
#example foo2.cc
#include "foo2.h"
// This definition sees only the second declaration.
//- @foo defines/binding Foo2Defn
//- Foo2Decl completedby Foo2Defn
void foo() { }
--------------------------------------------------------------------------------

Finally, be careful to follow the `completedby` relationship in both directions,
since the Kythe graph records specifically *which* declaration is used in a
function call.

[kythe,C++,"Make sure to reach every callsite.",0]
--------------------------------------------------------------------------------
#include "foo.h"
//- @"foo()"=FooDeclCall ref/call FooDecl
void baz() { foo(); }
//- @foo=FooImplAnchor defines/binding FooImpl
void foo() { }
//- @"foo()"=FooImplCall ref/call FooImpl
void bar() { foo(); }
#example foo.h
//- @foo=FooDeclAnchor defines/binding FooDecl
void foo();

// Finding all callsites for foo() follows the same query pattern. Depending
// on your starting point, different VNames are unknown: if you begin with
// FooImplAnchor, you'll need to search backward along the completedby edge
// (from its target, FooImpl, to its source) for FooDecl.
//- FooDecl completedby FooImpl

// If you start with FooDeclAnchor, you first need to find FooDecl,
// then you can search forward for all of the nodes that complete it.
//- FooImplAnchor defines/binding FooImpl
//- FooDeclAnchor defines/binding FooDecl

// Then look for all the ref/call edges terminating at those nodes:
//- FooDeclCall ref/call FooDecl
//- FooImplCall ref/call FooImpl
--------------------------------------------------------------------------------
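
The example above can be mirrored in a short Python sketch that starts from either binding anchor and reaches both callsites. The graph is again assumed to be a list of `(source, kind, target)` triples whose node names reuse the verifier variables from the example. The helpers `related_nodes` and `callsites_from_anchor` are hypothetical names, and the sketch only follows a single `completedby` hop in each direction.

```python
# Edges from the "Make sure to reach every callsite" example.
EDGES = [
    ("FooDeclCall", "ref/call", "FooDecl"),
    ("FooImplCall", "ref/call", "FooImpl"),
    ("FooDeclAnchor", "defines/binding", "FooDecl"),
    ("FooImplAnchor", "defines/binding", "FooImpl"),
    ("FooDecl", "completedby", "FooImpl"),
]


def related_nodes(edges, node):
    """Return `node` plus its neighbors across completedby, either direction."""
    related = {node}
    for (s, k, t) in edges:
        if k != "completedby":
            continue
        if s == node:
            related.add(t)  # node is a declaration; t completes it
        if t == node:
            related.add(s)  # node is a definition; it completes s
    return related


def callsites_from_anchor(edges, binding_anchor):
    """Find every callsite of the function bound at `binding_anchor`."""
    callsites = set()
    for (s, k, t) in edges:
        if s == binding_anchor and k == "defines/binding":
            nodes = related_nodes(edges, t)
            callsites |= {a for (a, kind, callee) in edges
                          if kind == "ref/call" and callee in nodes}
    return callsites
```

Starting from `FooImplAnchor` or from `FooDeclAnchor` gives the same set of callsites, `FooDeclCall` and `FooImplCall`.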

== Overrides

For callsites that involve dynamic binding, Kythe indexers should emit
`ref/call` edges to the most specific node that is known statically at the
callsite.

[kythe,C++,"Use the most specific static overrides.",0]
--------------------------------------------------------------------------------
//- @f defines/binding DefSF
struct S { virtual void f() { } };
//- @f defines/binding DefTF
//- DefTF overrides DefSF
struct T : public S { void f() override { } };
//- @"s->f()" ref/call DefSF
void CallSF(S* s) { s->f(); }
//- @"t->f()" ref/call DefTF
void CallTF(T* t) { t->f(); }
--------------------------------------------------------------------------------

Depending on the semantics of the query you want to make, you may need to walk
up or down the override chain. For example, the following finds all of the
calls to `S::f` in the example above, including calls made to functions that
override the implementation in `S`:

[kythe,C++,"Follow the override chain downward.",0]
--------------------------------------------------------------------------------
//- @f defines/binding DefSF
struct S { virtual void f() { } };
struct T : public S { void f() override { } };
void CallSF(S* s) { s->f(); }
void CallTF(T* t) { t->f(); }

// Starting with DefSF, find all callers (and callers of overrides):
// Build a set of overrides.
//- DefTF overrides DefSF
// Find all callsites.
//- DefSFCall ref/call DefSF
//- DefTFCall ref/call DefTF
--------------------------------------------------------------------------------
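
Walking the override chain downward amounts to a small transitive closure. The Python below is a sketch under the same assumptions as the verifier example: edges are `(source, kind, target)` triples, node names reuse the example's variables, and `overriders` and `calls_including_overrides` are names made up for illustration.

```python
# Edges from the "Follow the override chain downward" example.
EDGES = [
    ("DefTF", "overrides", "DefSF"),
    ("DefSFCall", "ref/call", "DefSF"),
    ("DefTFCall", "ref/call", "DefTF"),
]


def overriders(edges, fn):
    """Return `fn` and every function overriding it, directly or indirectly."""
    found = {fn}
    worklist = [fn]
    while worklist:
        base = worklist.pop()
        for (s, k, t) in edges:
            # s overrides base, so s sits one step further down the chain.
            if k == "overrides" and t == base and s not in found:
                found.add(s)
                worklist.append(s)
    return found


def calls_including_overrides(edges, fn):
    """Return callsite anchors that ref/call `fn` or any of its overriders."""
    nodes = overriders(edges, fn)
    return {s for (s, k, t) in edges if k == "ref/call" and t in nodes}
```

Starting from `DefSF` this finds both `DefSFCall` and `DefTFCall`; starting from `DefTF` it finds only `DefTFCall`, because the walk goes down the chain and never up.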

== Final query

We can now build a single query that will unify defs and decls and search up and
down the override chain. This query starts with some semantic object `Fn` and
yields the various anchors that call `Fn` in the broadest sense.

. Add `Fn` to set `O`.
. Repeat until fixed point:
.. For every `o` in `O`, for all nodes `o'` such that `o' overrides o` or
   `o overrides o'`, add `o'` to `O`.
.. For every `o` in `O`, for all nodes `o'` such that (for some `a`)
   `a defines/binding o'` and (`o completedby o'`),
   add `o'` to `O`.
.. For every `o` in `O`, for all nodes `o'` such that (for some `a`)
   `a defines/binding o` and (`o' completedby o`),
   add `o'` to `O`.
. Return the set of all `a` such that there is some `o` in `O` such that
  `a ref/call o`.
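
The steps above translate fairly directly into a fixed-point loop. The following Python is an illustrative sketch and not the serving implementation: it assumes the graph fits in memory as a list of `(source, kind, target)` triples, and the names `broad_callers` and `EXAMPLE_EDGES` are hypothetical. The sample data combines edges from the earlier examples.

```python
# Edges taken from the forward-declaration and override examples.
EXAMPLE_EDGES = [
    ("FooDeclAnchor", "defines/binding", "FooDecl"),
    ("FooImplAnchor", "defines/binding", "FooImpl"),
    ("FooDecl", "completedby", "FooImpl"),
    ("FooDeclCall", "ref/call", "FooDecl"),
    ("FooImplCall", "ref/call", "FooImpl"),
    ("DefTF", "overrides", "DefSF"),
    ("DefSFCall", "ref/call", "DefSF"),
    ("DefTFCall", "ref/call", "DefTF"),
]


def broad_callers(edges, fn):
    """Return every anchor that calls `fn` in the broadest sense."""
    # Nodes that are the target of some defines/binding edge.
    bound = {t for (s, k, t) in edges if k == "defines/binding"}
    o_set = {fn}
    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for (s, k, t) in edges:
            additions = set()
            if k == "overrides":
                # Walk the override chain both up and down.
                if s in o_set:
                    additions.add(t)
                if t in o_set:
                    additions.add(s)
            elif k == "completedby":
                # o completedby o', where o' has a binding anchor.
                if s in o_set and t in bound:
                    additions.add(t)
                # o' completedby o, where o has a binding anchor.
                if t in o_set and t in bound:
                    additions.add(s)
            if not additions <= o_set:
                o_set |= additions
                changed = True
    return {s for (s, k, t) in edges if k == "ref/call" and t in o_set}
```

With the sample data, `broad_callers(EXAMPLE_EDGES, "FooImpl")` returns both `FooDeclCall` and `FooImplCall`, and starting from `DefTF` returns both `DefSFCall` and `DefTFCall` because the override chain is followed upward as well as downward.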

Since this query is expensive in practice, we (intend to) precompute it as part
of building serving tables.