// Copyright 2016 The Kythe Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Callgraphs
==========
:Revision: 1.0
:toc2:
:toclevels: 3
:priority: 999

The Kythe graph contains information about callable objects like
link:./#function[functions] and link:./#refcall[ref/call]
edges that target them. This information can be used to satisfy queries about
the locations at which different functions are called:

[kythe,C++,"foo calls bar.",0]
--------------------------------------------------------------------------------
//- @bar defines/binding FnBar
void bar() { }
//- @"bar()" ref/call FnBar
//- @"bar()" childof FnFoo
//- @foo defines/binding FnFoo
void foo() { bar(); }
--------------------------------------------------------------------------------

In this example, function `foo` makes a single call to function `bar`. The
callsite is the expression `bar()`. This is spanned by an anchor that has a
`ref/call` edge pointing to the function node for `bar`. The anchor also has a
`childof` edge that points to function `foo`. This edge indicates that the
effects of the anchor should be blamed on `foo`. There will be up to one such
blame `childof` edge with an anchor source and a semantic target.
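
As a rough illustration of how these edges relate, the sketch below stores the
edges from the example as plain string triples and walks from a callee to its
callsite anchors, then to the nodes blamed for those anchors. The `Edge` struct
and the `DirectCallers` function are hypothetical names invented for this
sketch and are not part of any Kythe API; real Kythe nodes are identified by
VNames, for which simple strings stand in here.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory edge (source, kind, target). Strings stand in for
// the VNames that identify real Kythe nodes.
struct Edge {
  std::string source;
  std::string kind;
  std::string target;
};

// Returns the semantic nodes blamed for direct calls to `callee`.
std::set<std::string> DirectCallers(const std::vector<Edge>& graph,
                                    const std::string& callee) {
  // Callsite anchors are the sources of ref/call edges targeting the callee.
  std::set<std::string> callsites;
  for (const Edge& e : graph) {
    if (e.kind == "ref/call" && e.target == callee) callsites.insert(e.source);
  }
  // Blamed callers are the targets of childof edges leaving those anchors.
  std::set<std::string> callers;
  for (const Edge& e : graph) {
    if (e.kind == "childof" && callsites.count(e.source)) {
      callers.insert(e.target);
    }
  }
  return callers;
}
```

Given the four edges asserted in the example, `DirectCallers(graph, "FnBar")`
yields a set containing only `FnFoo`. A linear scan keeps the sketch short; a
real serving system would index edges by kind and endpoint.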

The assertions in the example trace a possible set of queries made against
a Kythe graph:

. Identify the semantic object for the definition in question (`FnBar`).
. Find the locations at which that object is called. These will be
  `ref/call` edges with `anchor` sources and semantic targets.
. Find the semantic objects associated with each callsite anchor. These will
  be attached by `childof` edges.
. Finally, we can show the binding location for the caller by looking up its
  `defines/binding` edge.

These queries will not capture the full set of traditional callsites. We will
fix this by first finding calls to forward declarations; then we will consider
overrides.

== Forward declarations

A call made through a forward declaration does not necessarily `ref/call` the
definition that completes that declaration (see
link:./#completedby[completedby]). This allows the
Kythe graph to model a set of programs (where different programs link against
different implementations for the same declaration) rather than a single program
(where linking different implementations in this way would result in undefined
behavior). In order to build a complete call graph, one must therefore also look
along `completedby` edges to find alternate possible nodes that may be used to
make calls to a given declaration or definition.

[kythe,C++,"foo calls a forward-declared bar.",0]
--------------------------------------------------------------------------------
#include "bardecl.h"
//- @"c.bar()" ref/call BarDecl
void foo(C& c) { c.bar(); }

#example bardecl.h
//- @bar defines/binding BarDecl
struct C { void bar(); };

#example barimpl.cc
#include "bardecl.h"
//- @bar defines/binding BarImpl
void C::bar() { }

// Imagine that we only had BarDecl (by looking up the ref/call from c.bar()
// in foo()).
// We can find the implementation BarImpl of C::bar by taking the
// following trip:
//- BarDecl completedby BarImpl
--------------------------------------------------------------------------------

Note that this overestimates the possible set of nodes for a given callee: for
example, in a given program a forward declaration might never be linked (in the
joining-together-object-files sense) with a definition. In order to filter these
extra associations out, one would need to consider both which program the user
has in focus and which modules went into building that program. This is out
of scope for this document.

`completedby` relationships should only be established when an indexer observes
that a completion has actually occurred. This allows users of the graph to
avoid connecting implementations to signatures that those implementations
never observe:

[kythe,C++,"Unrelated defs and decls can be distinguished.",0]
--------------------------------------------------------------------------------
#example foo1.h
// A header that declares a global foo().
//- @foo defines/binding Foo1Decl
void foo();
#example foo2.h
// An unrelated header that also declares a foo().
//- @foo defines/binding Foo2Decl
void foo();
#example foo1.cc
#include "foo1.h"
// This definition sees the first declaration (but never the second).
//- @foo defines/binding Foo1Defn
//- Foo1Decl completedby Foo1Defn
void foo() { }
#example foo2.cc
#include "foo2.h"
// This definition sees only the second declaration.
//- @foo defines/binding Foo2Defn
//- Foo2Decl completedby Foo2Defn
void foo() { }
--------------------------------------------------------------------------------

Finally, be careful to follow the `completedby` relationship in both directions,
since the Kythe graph records specifically *which* declaration is used in a
function call.

[kythe,C++,"Make sure to reach every callsite.",0]
--------------------------------------------------------------------------------
#include "foo.h"
//- @"foo()"=FooDeclCall ref/call FooDecl
void baz() { foo(); }
//- @foo=FooImplAnchor defines/binding FooImpl
void foo() { }
//- @"foo()"=FooImplCall ref/call FooImpl
void bar() { foo(); }
#example foo.h
//- @foo=FooDeclAnchor defines/binding FooDecl
void foo();

// Finding all callsites for foo() follows the same query pattern. Depending
// on your starting point, different VNames are unknown: if you begin with
// FooImplAnchor, you'll need to search forward along the completedby edge for
// FooDecl.
//- FooDecl completedby FooImpl

// If you start with FooDeclAnchor, you first need to find FooDecl,
// then you can search backward for all of the nodes that complete it.
//- FooImplAnchor defines/binding FooImpl
//- FooDeclAnchor defines/binding FooDecl

// Then look for all the ref/call edges terminating at those nodes:
//- FooDeclCall ref/call FooDecl
//- FooImplCall ref/call FooImpl
--------------------------------------------------------------------------------

== Overrides

For callsites that involve dynamic binding, Kythe indexers should emit
`ref/call` edges to the most specific possible node.

[kythe,C++,"Use the most specific static overrides.",0]
--------------------------------------------------------------------------------
//- @f defines/binding DefSF
struct S { virtual void f() { } };
//- @f defines/binding DefTF
//- DefTF overrides DefSF
struct T : public S { void f() override { } };
//- @"s->f()" ref/call DefSF
void CallSF(S* s) { s->f(); }
//- @"t->f()" ref/call DefTF
void CallTF(T* t) { t->f(); }
--------------------------------------------------------------------------------

Depending on the semantics of the query you want to make, you may need to walk
up or down the override chain. For example, to find all of the calls to `S::f`
in the example above, including those calls that are made to functions that
override the implementation in `S`:

[kythe,C++,"Follow the override chain downward.",0]
--------------------------------------------------------------------------------
//- @f defines/binding DefSF
struct S { virtual void f() { } };
struct T : public S { void f() override { } };
void CallSF(S* s) { s->f(); }
void CallTF(T* t) { t->f(); }

// Starting with DefSF, find all callers (and callers of overrides):
// Build a set of overrides.
//- DefTF overrides DefSF
// Find all callsites.
//- DefSFCall ref/call DefSF
//- DefTFCall ref/call DefTF
--------------------------------------------------------------------------------

== Final query

We can now build a single query that will unify defs and decls and search up and
down the override chain. This query starts with some semantic object `Fn` and
yields the various anchors that call `Fn` in the broadest sense.

. Add `Fn` to set `O`.
. Repeat until fixed point:
.. For every `o` in `O`, for all nodes `o'` such that `o' overrides o` or
   `o overrides o'`, add `o'` to `O`.
.. For every `o` in `O`, for all nodes `o'` such that (for some `a`)
   `a defines/binding o'` and (`o completedby o'`),
   add `o'` to `O`.
.. For every `o` in `O`, for all nodes `o'` such that (for some `a`)
   `a defines/binding o` and (`o' completedby o`),
   add `o'` to `O`.
. Return the set of all `a` such that there is some `o` in `O` such that
  `a ref/call o`.

Since this query is expensive in practice, we (intend to) precompute it as part
of building serving tables.
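
The fixed-point steps of the final query can be sketched over a small
in-memory edge list. This is a minimal illustration rather than how Kythe
serving tables are built: the `Edge` struct and the `BroadCallers` function are
hypothetical names invented here, and plain strings stand in for VNames.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory edge (source, kind, target). Strings stand in for
// the VNames that identify real Kythe nodes.
struct Edge {
  std::string source;
  std::string kind;
  std::string target;
};

// Returns every anchor that calls `fn` in the broadest sense: calls to `fn`
// itself, to nodes related to it by `overrides` in either direction, and to
// declarations or definitions linked to it by `completedby`.
std::set<std::string> BroadCallers(const std::vector<Edge>& graph,
                                   const std::string& fn) {
  // Nodes that are the target of some `a defines/binding node` edge.
  std::set<std::string> bound;
  for (const Edge& e : graph) {
    if (e.kind == "defines/binding") bound.insert(e.target);
  }

  // Step 1: seed the set O with Fn.
  std::set<std::string> o = {fn};

  // Step 2: grow O until a full pass over the edges adds nothing new.
  bool changed = true;
  while (changed) {
    changed = false;
    for (const Edge& e : graph) {
      if (e.kind == "overrides") {
        // Follow the override chain both upward and downward.
        if (o.count(e.source) && o.insert(e.target).second) changed = true;
        if (o.count(e.target) && o.insert(e.source).second) changed = true;
      } else if (e.kind == "completedby" && bound.count(e.target)) {
        // The completing node (the edge target) must have a binding anchor.
        // Declaration already in O: add the node that completes it.
        if (o.count(e.source) && o.insert(e.target).second) changed = true;
        // Completing node already in O: add the declaration it completes.
        if (o.count(e.target) && o.insert(e.source).second) changed = true;
      }
    }
  }

  // Step 3: collect every anchor with a ref/call edge into O.
  std::set<std::string> anchors;
  for (const Edge& e : graph) {
    if (e.kind == "ref/call" && o.count(e.target)) anchors.insert(e.source);
  }
  return anchors;
}
```

Run on the edges from the "Make sure to reach every callsite" example,
`BroadCallers(graph, "FooImpl")` and `BroadCallers(graph, "FooDecl")` both
return the anchors `FooDeclCall` and `FooImplCall`. Each pass rescans the whole
edge list, which is simple but slow, in keeping with the note above that the
query is expensive and better precomputed.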