// Copyright 2016 The Kythe Authors. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

= Indexing Generated Code

:Revision: 1.0
:toc2:
:toclevels: 3
:priority: 999

Source code generators like link:https://www.gnu.org/software/flex/[Flex],
link:https://www.gnu.org/software/bison/[GNU Bison], and
link:http://www.swig.org/[SWIG] take a high-level description of a software
component and generate the code necessary to realize that component in a
lower-level or general-purpose programming language. Users browsing projects
that use these components usually want cross-references to take them
from use sites of a generated interface to the high-level code that brought
that interface into being. They do not normally want to see the generated
implementation, as this is often difficult (or uninteresting) to read. This
document describes how to encode information about generated code to permit
cross-language links.

To make the discussion easier to follow, let's pretend we are working with
two languages: SourceLang and TargetLang. SourceLang has `.source` files and
TargetLang has `.target` files. We also have a tool (the generator) that can
generate a `foo.target` file from a `foo.source` file.
We have the following components:

* Source Indexer - Kythe indexer that takes `.source` files and outputs index
  data.
* Target Indexer - Kythe indexer that takes `.target` files and outputs index
  data.
* Generator - tool that produces `.target` files from `.source` files.
* Post processor - Kythe tool that takes all index data produced by all
  indexers, processes it and outputs the final Kythe graph that contains data
  for both SourceLang and TargetLang.

Now we want to teach Kythe how to create cross-references between the generated
`foo.target` file and the original `foo.source` file. The main idea is simple:
Generator has to output extra data that maps elements in `foo.target`
to the original elements in `foo.source`. Then, when Target Indexer indexes
`foo.target`, it uses that mapping to output *generates* or *imputes* edges.
These edges connect nodes from `foo.target` with nodes in `foo.source`.

Kythe doesn't require implementors to use one concrete approach for passing
mapping metadata and outputting *generates* and *imputes* edges. Below we
describe two different approaches, each with its own pros and cons. In
both cases it is assumed that implementors can change Generator and Target
Indexer. If possible, the *generates* approach is preferred as it requires less
post-processing work.

TIP: You can find an example implementation at
link:https://github.com/kythe/kythe/tree/master/kythe/examples/proto[GitHub].
The current sample web UI does not interpret the parts of the schema we will
use; this is a work in progress.

== Java To JavaScript with *imputes* edges

This approach is generic and works for any combination of SourceLang and
TargetLang. In this example we generate JavaScript files from a Java file, so
SourceLang is Java and TargetLang is JavaScript.
Given `Color.java`:

[source,java]
-------------------------------------------------------------------------------
public enum Color {
  RED;
}
-------------------------------------------------------------------------------

Generator produces `color.js`:
[source,javascript]
-------------------------------------------------------------------------------
const Color = {
  RED: 0,
};
-------------------------------------------------------------------------------

=== Changes to Generator

To support cross-references between `color.js` and `Color.java` we need to update
Generator to output the following mapping data for the `Color` and `RED` elements.

[source,json]
-------------------------------------------------------------------------------
{
  "type": "kythe0",
  "meta": [{
    "type": "anchor_anchor",
    "source_begin": 13,
    "source_end": 18,
    "target_begin": 6,
    "target_end": 11,
    "edge": "/kythe/edge/imputes",
    "source_vname": {
      "corpus": "corpus",
      "path": "path/to/Color.java"
    }
  }, {
    "type": "anchor_anchor",
    "source_begin": 22,
    "source_end": 25,
    "target_begin": 18,
    "target_end": 21,
    "edge": "/kythe/edge/imputes",
    "source_vname": {
      "corpus": "corpus",
      "path": "path/to/Color.java"
    }
  }]
}
-------------------------------------------------------------------------------

This mapping has two `meta` entries: the first for `Color`, the second for
`RED`. Note:

* Entries do not contain the names of elements. Each entry contains only the
  positions of elements in the source (`Color.java`) and target (`color.js`)
  files.
* Each position is a byte offset within the file, not a line/column pair.
  This is required because Kythe anchors are defined using byte offsets and
  not line/column.
  In this example, the JavaScript indexer processes this
  mapping and needs to output an *anchor* for `Color.java`, but the indexer
  doesn't have access to the `Color.java` file (it has access only to JS
  files). Because of that, the JS indexer can't translate line/column to a byte
  offset.
* Entries contain positions rather than the VNames of elements in `Color.java`
  or `color.js`. VNames of nodes are internal details of each
  indexer and subject to change. Generator is usually a standalone tool that
  doesn't know the rules for producing VNames for a specific language, so it
  cannot output the VNames of nodes. If in your case
  VNames are stable and well-specified, you can use the simpler approach
  using *generates* described in the Protocol Buffers section below.

To pass this mapping to the JavaScript Indexer, Generator appends it
as a comment on the last line of `color.js`:

[source,javascript]
-------------------------------------------------------------------------------
const Color = {
  RED: 0,
};

// Kythe Indexing Metadata:
// {"type":"kythe0","meta":[{"type":"anchor_anchor","source_begin":13,...
-------------------------------------------------------------------------------

Inlining metadata inside `color.js` avoids passing extra
files to Indexer. All Indexer needs to know is that some JavaScript files can
contain metadata on the last line, and how to parse it.

One downside is that it adds noise to `color.js`, but generated
files are usually invisible to developers, so it's not a big concern.

=== Changes to JavaScript Indexer

On the JavaScript Indexer side we need to parse the metadata and output *imputes*
edges. To parse the metadata, the indexer can check the last two lines of every
`.js` file for the `// Kythe Indexing Metadata:` marker and, if it is present,
parse the last line as JSON.
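
The last-two-lines check can be sketched as follows. This is a minimal sketch, not the actual Kythe JavaScript indexer: the helper name `parseTrailingMetadata` and its tolerance for trailing blank lines are assumptions made for illustration. The sample file carries only the `RED` entry from the mapping above.

```javascript
// Hypothetical helper (not part of Kythe): extracts inline metadata from the
// end of a generated JavaScript file, or returns null if there is none.
const MARKER = '// Kythe Indexing Metadata:';

function parseTrailingMetadata(fileText) {
  const lines = fileText.split('\n');
  // Assumption: ignore trailing blank lines so a final newline is harmless.
  while (lines.length > 0 && lines[lines.length - 1].trim() === '') {
    lines.pop();
  }
  if (lines.length < 2) return null;
  const markerLine = lines[lines.length - 2].trim();
  const jsonLine = lines[lines.length - 1].trim();
  if (markerLine !== MARKER || !jsonLine.startsWith('//')) return null;
  try {
    // Strip the leading "//" and parse the rest of the line as JSON.
    const metadata = JSON.parse(jsonLine.slice(2));
    return Array.isArray(metadata.meta) ? metadata : null;
  } catch (e) {
    return null; // Malformed JSON: treat the file as having no metadata.
  }
}

const sampleColorJs = [
  'const Color = {',
  '  RED: 0,',
  '};',
  '',
  '// Kythe Indexing Metadata:',
  '// {"type":"kythe0","meta":[{"type":"anchor_anchor",' +
    '"source_begin":22,"source_end":25,"target_begin":18,"target_end":21,' +
    '"edge":"/kythe/edge/imputes",' +
    '"source_vname":{"corpus":"corpus","path":"path/to/Color.java"}}]}',
  '',
].join('\n');

const parsedColorJs = parseTrailingMetadata(sampleColorJs);
const redEntry = parsedColorJs.meta[0];
// The file is pure ASCII, so string indices coincide with byte offsets here;
// a real indexer must work on bytes, not UTF-16 code units.
console.log(sampleColorJs.slice(redEntry.target_begin, redEntry.target_end));
```

Slicing the file with `target_begin` and `target_end` recovers the token the entry refers to, which is a quick way to sanity-check that Generator wrote byte offsets rather than line/column pairs.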

For each `meta` entry the indexer should do the following:

1. Output an *anchor* using `source_begin` and `source_end`. `source_vname`
   should be used as the file containing the anchor.
2. Find a JavaScript node that has a *defines/binding* anchor with the same
   `target_begin/end` position.
3. Output one *imputes* edge from the *anchor* created at step 1 to the node
   found at step 2.

Note that this only applies to meta entries with type `anchor_anchor`. For other
types the structure might be different. See link:https://github.com/kythe/kythe/issues/3711[issue #3711].

Here is what the JavaScript indexer outputs for the `Color` and `RED` elements
using the rules above:

[kythe,dot,"JavaScript Indexer graph",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7";
    coloranchorjava [label="anchor\nColor.java:0:12-17"];
    redanchorjava [label="anchor\nColor.java:1:2-5"];
    coloranchorjs [label="anchor\ncolor.js:0:6-11"];
    redanchorjs [label="anchor\ncolor.js:1:2-5"];
    colornode [label="Color node\nin JS"];
    rednode [label="RED node\nin JS"];

    coloranchorjs -> colornode [label = "defines/binding"];
    redanchorjs -> rednode [label = "defines/binding"];
    coloranchorjava -> colornode [label = "imputes"];
    redanchorjava -> rednode [label = "imputes"];
}
--------------------------------------------------------------------------------

The output of the Java Indexer looks like this:

[kythe,dot,"Java Indexer graph",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7";
    coloranchorjava [label="anchor\nColor.java:0:12-17"];
    redanchorjava [label="anchor\nColor.java:1:2-5"];
    colornodejava [label="Color node\nin Java"];
    rednodejava [label="RED node\nin Java"];

    coloranchorjava -> colornodejava [label = "defines/binding"];
    redanchorjava -> rednodejava
        [label = "defines/binding"];
}
--------------------------------------------------------------------------------

=== Post-processor

Once the Java and JavaScript Indexers have finished, their output is merged and
the post-processor finds all anchors that have both *defines/binding* and
*imputes* edges and creates a *generates* edge for each:

[kythe,dot,"Processed final graph",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7";
    coloranchorjava [label="anchor\nColor.java:0:12-17"];
    redanchorjava [label="anchor\nColor.java:1:2-5"];
    coloranchorjs [label="anchor\ncolor.js:0:6-11"];
    redanchorjs [label="anchor\ncolor.js:1:2-5"];
    colornode [label="Color node\nin JS"];
    rednode [label="RED node\nin JS"];
    colornodejava [label="Color node\nin Java"];
    rednodejava [label="RED node\nin Java"];

    coloranchorjs -> colornode [label = "defines/binding"];
    redanchorjs -> rednode [label = "defines/binding"];
    coloranchorjava -> colornode [label = "imputes"];
    redanchorjava -> rednode [label = "imputes"];
    coloranchorjava -> colornodejava [label = "defines/binding"];
    redanchorjava -> rednodejava [label = "defines/binding"];
    colornodejava -> colornode [label = "generates"];
    rednodejava -> rednode [label = "generates"];
}
--------------------------------------------------------------------------------

This is the end state. Tools using the Kythe graph can now see that the `Color`
enum in JS is generated by the `Color` enum in Java and act accordingly (for
example, an IDE, upon a click on `Color` in the JS file, can go to the
definition of the `Color` enum in the Java file).

== Protocol Buffers with *generates* edges

This approach is easier to implement than the *imputes* approach described
above, but it requires tighter integration with Indexer and Generator.
When Generator outputs code it also adds a mapping as in the *imputes* approach,
but instead of mapping location to location it outputs the VNames of nodes from
`foo.source`. This requires Generator to know exactly which VNames will be
produced by the Source Indexer. This approach is feasible when VNames either
have a simple, stable form or Generator can reuse code from Source Indexer to
generate VNames.

In this example we generate C++ files from Protocol Buffer definitions, so
SourceLang is Protocol Buffers and TargetLang is C++.

The Kythe project uses
link:https://developers.google.com/protocol-buffers/[protocol buffers] for
data interchange. The `protoc` compiler reads a domain-specific language
that describes messages and synthesizes code that serializes, deserializes,
and manipulates these messages. It can generate code in a number of different
target languages by swapping out backend components. These accept an encoding
of the message descriptions in the original source file and emit source text.

[kythe,dot,"protoc architecture",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7";
    protosrc [label=".proto", shape=note];
    frontend [label="protoc", shape=rectangle];
    descriptor [label="descriptor", shape=note];
    backend [label="C++ language backend", shape=rectangle];
    ccsrc [label=".pb.h", shape=note];
    protosrc -> frontend;
    frontend -> descriptor;
    descriptor -> backend;
    backend -> ccsrc;
}
--------------------------------------------------------------------------------

=== Indexing `.proto` definitions

`.proto` files are written in a domain-specific programming language for
describing various properties about messages and other data. It is interesting
to index these on their own, as messages in one `.proto` file may be used in
another `.proto` file.
Here is a very simple example of the language:

[source,c]
--------------------------------------------------------------------------------
syntax = "proto3";
package kythe.examples.proto.example;

// A single proto message.
message Foo {
}
--------------------------------------------------------------------------------

This file describes the empty message `kythe.examples.proto.example.Foo`
using features from version 3 of the language. When run through `protoc`
with the appropriate options set, it will generate the interface `example.pb.h`
and the implementation `example.pb.cc`. These may be used to interact with
`Foo` messages in $$C++$$.

As it turns out, `protoc` can be coerced into saving the descriptor that it
passes to its backends. Ordinarily, this descriptor would merely be an
abstract version of the `.proto` input file that discards syntax and records
only the details necessary to generate source code. If asked, `protoc` will
also keep track of source locations (`--include_source_info`) and data about
the `.proto` files that are (transitively) imported (`--include_imports`).
This information is sufficient to build a Kythe graph for a given `.proto`
definition file. It will become important later that every object that the
descriptor describes has an address, like "4.0", that corresponds (roughly)
to its position in the descriptor's AST. These addresses are used as keys into
the table that keeps track of source locations in the original `.proto` file.
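
To make the "4.0" address concrete, here is a minimal sketch of how such an address can be resolved against a descriptor. It assumes the descriptor has been decoded into a plain JavaScript object whose keys match the field names in `descriptor.proto`, and that the address is the descriptor path joined with dots. The helper `resolveAddress` is hypothetical, and the field-number tables list only the few fields the sketch needs.

```javascript
// Field numbers as declared in descriptor.proto (subset only).
const FILE_FIELDS = { 4: 'message_type', 5: 'enum_type' };
const MESSAGE_FIELDS = { 2: 'field', 3: 'nested_type', 4: 'enum_type' };

// Hypothetical helper: walks (field number, index) pairs through a decoded
// FileDescriptorProto. Trailing unpaired components are ignored in this sketch.
function resolveAddress(fileDescriptor, address) {
  const path = address.split('.').map(Number);
  let node = fileDescriptor;
  let fields = FILE_FIELDS;
  for (let i = 0; i + 1 < path.length; i += 2) {
    const name = fields ? fields[path[i]] : undefined;
    if (!name || !Array.isArray(node[name])) return undefined;
    node = node[name][path[i + 1]];
    if (node === undefined) return undefined;
    // Only messages have further children that this sketch knows about.
    const isMessage = name === 'message_type' || name === 'nested_type';
    fields = isMessage ? MESSAGE_FIELDS : null;
  }
  return node;
}

// Decoded form of the example file above (assumed shape, for illustration).
const exampleDescriptor = {
  name: 'kythe/examples/proto/example.proto',
  package: 'kythe.examples.proto.example',
  message_type: [{ name: 'Foo' }],
};

console.log(resolveAddress(exampleDescriptor, '4.0').name);
```

Here 4 is the field number of `message_type` in `FileDescriptorProto` and 0 is the index of `Foo` within that repeated field, which is why the first message in a file gets the address "4.0".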

[kythe,dot,"protoc architecture with indexer",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7";
    protosrc [label=".proto", shape=note];
    frontend [label="protoc", shape=rectangle];
    descriptor [label="descriptor", shape=note];
    descriptorfile [label="FileDescriptorSet", shape=note, color=blue];
    indexer [label="Kythe proto_indexer", shape=rectangle, color=blue];
    backend [label="C++ language backend", shape=rectangle];
    ccsrc [label=".pb.h", shape=note];
    entries [label="Kythe entries", shape=note, color=blue];
    protosrc -> frontend;
    frontend -> descriptor;
    frontend -> descriptorfile [color=blue];
    protosrc -> indexer [color=blue];
    descriptorfile -> indexer [color=blue];
    descriptor -> backend;
    backend -> ccsrc;
    indexer -> entries [color=blue];
}
--------------------------------------------------------------------------------

This extra information is stored as a file that contains a
`proto2.FileDescriptorSet` message, which in turn is a list of the
`proto2.FileDescriptorProto` messages used in the course of processing `.proto`
input. Note that this message does not contain `.proto` source text, so the
`proto_indexer` must have access to the original source files.

We can add a verifier assertion to check that `Foo` declares a Kythe node:

[source,c]
--------------------------------------------------------------------------------
syntax = "proto3";
package kythe.examples.proto.example;

// A single proto message.
//- @Foo defines/binding MessageFoo?
message Foo {
}
--------------------------------------------------------------------------------

and see that it was unified with the appropriate VName:

.Output
----
MessageFoo: EVar(...
  = App(vname,
    (4.0, kythe, "", kythe/examples/proto/example.proto, protobuf)))
----

== Using generated source code

Imagine that we have a simple $$C++$$ user of our generated source code for
`Foo`. Its code, with a verifier assertion, looks like this:

[source,c]
--------------------------------------------------------------------------------
#include "kythe/examples/proto/example.pb.h"

//- @Foo ref CxxFooDecl?
void UseProto(kythe::examples::proto::example::Foo* foo) {
}
--------------------------------------------------------------------------------

The Kythe pipeline for indexing our combined program looks like this:

[kythe,dot,"first indexing pipeline",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7!";
    usersrc [label="proto_user.cc", shape=note, color=blue];
    ccextractor [label="C++ extractor", shape=rectangle, color=blue];
    kzip [label="proto_user.kzip", shape=note, color=blue];
    protosrc [label=".proto", shape=note];
    frontend [label="protoc", shape=rectangle];
    descriptor [label="descriptor", shape=note];
    descriptorfile [label="FileDescriptorSet", shape=note];
    indexer [label="Kythe proto_indexer", shape=rectangle];
    ccindexer [label="Kythe C++ indexer", shape=rectangle, color=blue];
    backend [label="C++ language backend", shape=rectangle];
    ccsrc [label=".pb.h", shape=note];
    entries [label="Kythe entries", shape=note];
    protosrc -> frontend;
    frontend -> descriptor;
    frontend -> descriptorfile;
    protosrc -> indexer;
    descriptorfile -> indexer;
    descriptor -> backend;
    backend -> ccsrc;
    indexer -> entries;
    usersrc -> ccextractor [color=blue];
    ccsrc -> ccextractor [color=blue];
    usersrc -> ccsrc [color=blue];
    ccextractor -> kzip [color=blue];
    kzip -> ccindexer [color=blue];
    ccindexer -> entries [color=blue];
}
--------------------------------------------------------------------------------

When we use the verifier to inspect the resulting `CxxFooDecl`, we see that
it has not been unified with the VName for `Foo`:

.Output
----
CxxFooDecl: EVar(... =
    App(vname, (srl0y/pwih+G6wsjFLMTVKQPC7lLH3/9MVK2d2aJHeE=,
                kythe, bazel-out/genfiles, kythe/examples/proto/example.pb.h,
                c++)))
----

This is because the `kythe::examples::proto::example::Foo` type is a $$C++$$
type defined in `example.pb.h`. That it was defined in some original `.proto`
file has no meaning to the $$C++$$ compiler. Furthermore, the Kythe $$C++$$
indexer has no understanding of the `protoc` language and the VNames that the
Kythe proto_indexer produces.

Our goal is to add edges in the graph between `CxxFooDecl` and `MessageFoo`
so that clients can take into account their relationship when displaying
cross-references or answering other queries. We do not want to unify them in the
same node, as they are legitimately different objects. Users may wish to
navigate to the generated $$C++$$ code for `CxxFooDecl` or to view uses of
`MessageFoo` in other languages. To support these different uses, we will emit
a link:/docs/schema#generates[generates] edge such that `MessageFoo`
*generates* `CxxFooDecl`. Clients can choose to follow the edge or to disregard
it.

Note that the $$C++$$ indexer and the `protoc` backend both see the same
content in the `.pb.h` file; therefore, both programs see the same offsets
for various tokens.
If the `protoc` backend were to link those offsets back
to the objects in the `FileDescriptorProto` using well-known names--and if the
Kythe proto_indexer guaranteed a particular mechanism for generating VNames
from those well-known names--we could close the loop in the $$C++$$ indexer by
emitting *generates* edges to the proto_indexer's nodes whenever the $$C++$$
indexer trips over the `protoc` backend's marked offsets.

In other words, if the `.pb.h` contained code like:
[source,c]
--------------------------------------------------------------------------------
...
class Foo {
...
--------------------------------------------------------------------------------

and the `protoc` backend that generated it reported that the text range
`Foo` was associated with an object in its original `FileDescriptorProto` at
some location encoded as "4.0"--and the proto_indexer guaranteed it would
always emit objects with signatures based on their descriptor locations--the
$$C++$$ indexer would only need to watch for *defines/binding* edges starting at
that text range. Should such an edge be emitted, the $$C++$$ indexer would also
emit a *generates* edge from the `proto` node to the $$C++$$ node.

=== Annotations in `protoc` backends

We have already seen how to command the `protoc` frontend to emit location
information for `.proto` source files. The frontend does not, however, know
anything about the source code that its various backends emit. We must pass
additional flags to these backends to get them to produce location information
as `proto2.GeneratedCodeInfo` messages. These messages connect byte offsets
in generated source code with paths in the `proto2.FileDescriptorProto` AST.
These paths are the same ones used by the `proto2.SourceCodeInfo` message that
the Kythe proto_indexer consumes; they are the paths we will use to link up
`protobuf` language nodes with the nodes for generated source code.

Each `protoc` backend must be individually instrumented to produce
`proto2.GeneratedCodeInfo` messages. To turn annotation on for the $$C++$$
backend, you can pass `--cpp_out=annotate_headers=1:normal/output/path` to
`protoc`. In practice, you will also need to provide an `annotation_pragma_name`
and an `annotation_guard_name`, so the full `cpp_out` value may look like
`annotate_headers=1,annotation_pragma_name=kythe_metadata,annotation_guard_name=KYTHE_IS_RUNNING:normal/output/path`.

When `annotate_headers=1` is asserted to the $$C++$$ backend, it will generate
`.meta` files alongside any files with annotations. For example, in the same
directory as `example.pb.h`, you will find an `example.pb.h.meta` file. This
file contains a serialized `proto2.GeneratedCodeInfo` message. This message
contains a series of spans in `example.pb.h`, the filenames of the `.proto`
files that caused those spans to be generated, and the AST paths in the
`FileDescriptorProto` for those `.proto` files. `example.pb.h` explicitly
depends on `example.pb.h.meta` using a pragma and a preprocessor symbol:

[source,c]
--------------------------------------------------------------------------------
// Generated by the protocol buffer compiler. DO NOT EDIT!
// source: kythe/examples/proto/example.proto

...

#ifdef KYTHE_IS_RUNNING
#pragma kythe_metadata "kythe/examples/proto/example.pb.h.meta"
#endif  // KYTHE_IS_RUNNING

...
--------------------------------------------------------------------------------

The Kythe $$C++$$ extractor and indexer both understand what to do with this
pragma (and both define `KYTHE_IS_RUNNING`).
The extractor will add the `.meta`
file to the `kzip` it produces; the indexer will load the `.meta` file,
translate it from `protoc` annotations to generic Kythe metadata, and use it
to append `generates` edges for `defines/binding` edges emitted from
`example.pb.h`.

[kythe,dot,"first indexing pipeline",0]
--------------------------------------------------------------------------------
digraph G {
    size="7,7!";
    usersrc [label="proto_user.cc", shape=note];
    ccextractor [label="C++ extractor", shape=rectangle];
    kzip [label="proto_user.kzip", shape=note];
    protosrc [label=".proto", shape=note];
    frontend [label="protoc", shape=rectangle];
    descriptor [label="descriptor", shape=note];
    descriptorfile [label="FileDescriptorSet", shape=note];
    indexer [label="Kythe proto_indexer", shape=rectangle];
    ccindexer [label="Kythe C++ indexer", shape=rectangle];
    backend [label="C++ language backend", shape=rectangle];
    ccsrc [label=".pb.h", shape=note];
    ccmeta [label=".pb.h.meta", shape=note, color=blue];
    entries [label="Kythe entries", shape=note];
    protosrc -> frontend;
    frontend -> descriptor;
    frontend -> descriptorfile;
    protosrc -> indexer;
    descriptorfile -> indexer;
    descriptor -> backend;
    backend -> ccsrc;
    backend -> ccmeta [color=blue];
    ccsrc -> ccmeta [color=blue];
    indexer -> entries;
    usersrc -> ccsrc;
    usersrc -> ccextractor;
    ccsrc -> ccextractor;
    ccmeta -> ccextractor [color=blue];
    ccextractor -> kzip;
    kzip -> ccindexer;
    ccindexer -> entries;
}
--------------------------------------------------------------------------------

Now we can write verifier assertions that show we have established a link
between the proto source and use sites of its generated code:

[source,c]
--------------------------------------------------------------------------------
#include "kythe/examples/proto/example.pb.h"

//- @Foo ref CxxFooDecl
//- MessageFoo? generates CxxFooDecl
//- vname(_, "kythe", "", "kythe/examples/proto/example.proto", "protobuf")
//-     defines/binding MessageFoo
void UseProto(kythe::examples::proto::example::Foo* foo) {
}
--------------------------------------------------------------------------------

.Output
----
MessageFoo: EVar(... = App(vname,
  (4.0, kythe, "", kythe/examples/proto/example.proto, protobuf)))
----

Of course, Kythe clients need to understand that *generates* edges should be
followed. Solving this problem is out of this document's scope.

==== Providing annotations for other languages

To generate metadata for a different language backend, you must determine or
implement the following:

* The `protoc` backend for the language must be able to produce
  `proto2.GeneratedCodeInfo` buffers.
* There must be some way to signal to your indexer and extractor that a
  `.meta` file is associated with a different source file.
* That `.meta` file must be made available to the extractor during extraction.
  For hermetic build systems, this means that the target driving `protoc` must
  list the `.meta` file as an output. Any target that uses that `protoc`
  target must require the `.meta` file as an input.
* The indexer must read the `.meta` file and use it to emit `generates`
  edges that connect up to the nodes produced by the Kythe proto_indexer.

The method for annotating source code is designed such that it can
be implemented purely at the output stage; for example, if you have an
abstraction for emitting *defines/binding* edges from anchors, you can
check at every edge (starting from a file with loaded metadata) whether you
should emit an additional `generates` edge.
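
The output-stage check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the Kythe $$C++$$ indexer's implementation: the `MetadataFile` and `EdgeEmitter` classes, the in-memory rule shape, the anchor identifiers, and the byte offsets 100-103 are all hypothetical. Only the edge kinds and the proto VName come from this document.

```javascript
// Hypothetical in-memory form of a loaded .meta file: one rule per span.
class MetadataFile {
  constructor(rules) {
    this.rulesBySpan = new Map();
    for (const rule of rules) {
      this.rulesBySpan.set(`${rule.begin}:${rule.end}`, rule);
    }
  }

  // Returns the rule covering exactly [begin, end), or undefined.
  lookup(begin, end) {
    return this.rulesBySpan.get(`${begin}:${end}`);
  }
}

// Hypothetical edge-emitting abstraction of an indexer.
class EdgeEmitter {
  constructor(metadata) {
    this.metadata = metadata; // null when the file has no loaded metadata
    this.edges = [];
  }

  emitDefinesBinding(anchor, node) {
    this.edges.push({
      source: anchor.id,
      kind: '/kythe/edge/defines/binding',
      target: node,
    });
    // The only metadata-aware step: check the anchor span against the rules.
    const rule = this.metadata && this.metadata.lookup(anchor.begin, anchor.end);
    if (rule) {
      this.edges.push({
        source: rule.sourceVName,
        kind: '/kythe/edge/generates',
        target: node,
      });
    }
  }
}

// VName of MessageFoo as shown in the verifier output above.
const messageFoo = {
  signature: '4.0',
  corpus: 'kythe',
  root: '',
  path: 'kythe/examples/proto/example.proto',
  language: 'protobuf',
};

// Assumed span of `Foo` in example.pb.h; the real offsets come from the .meta file.
const pbMetadata = new MetadataFile([{ begin: 100, end: 103, sourceVName: messageFoo }]);
const edgeEmitter = new EdgeEmitter(pbMetadata);
edgeEmitter.emitDefinesBinding({ id: 'anchor@100-103', begin: 100, end: 103 }, 'CxxFooDecl');
edgeEmitter.emitDefinesBinding({ id: 'anchor@200-205', begin: 200, end: 205 }, 'CxxOtherDecl');
console.log(edgeEmitter.edges.map((edge) => edge.kind));
```

Keeping the check inside the one function that emits *defines/binding* edges means the rest of the indexer does not need to know that metadata exists; files without a loaded `.meta` file simply pass `null`.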