// cuelang.org/go@v0.13.0/internal/core/adt/cycle.go

// Copyright 2022 CUE Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package adt

// TODO:
// - compiler support for detecting cross-pattern references.
// - handle propagation of cyclic references to root across disjunctions.

// # Cycle detection algorithm V3
//
// The cycle detection algorithm detects the following kinds of cycles:
//
//   - Structural cycles: cycles where a field, directly or indirectly, ends up
//     referring to an ancestor node. For instance:
//
//     a: b: a
//
//     a: b: c
//     c: a
//
//     T: a?: T
//     T: a: {}
//
//   - Reference cycles: cycles where a field, directly or indirectly, ends up
//     referring to itself:
//
//     a: a
//
//     a: b
//     b: a
//
//   - Inline cycles: cycles within an expression, for instance:
//
//     x: {y: x}.out
//
// Note that it is possible for the unification of two non-cyclic structs to be
// cyclic:
//
//     y: {
//         f: h: g
//         g: _
//     }
//     x: {
//         f: _
//         g: f
//     }
//
// Even though the above contains no cycles, the result of `x & y` is cyclic:
//
//     f: h: g
//     g: f
//
// Cycle detection is inherently a dynamic process.
//
// ## ALGORITHM OVERVIEW
//
//  1. Traversal with Path Tracking:
//     • Perform a depth-first traversal of the CUE value graph.
//     • Maintain a path (call stack) of ancestor nodes during traversal.
//     For this purpose, we separately track the parent relation as well
//     as marking nodes that are currently being processed.
//  2. Per-Conjunct Cycle Tracking:
//     • For each conjunct in a node’s value (i.e., c1 & c2 & ... & cn),
//     track cycles independently.
//     • A node is considered non-cyclic if any of its conjuncts is
//     non-cyclic.
//  3. Handling References:
//     • When encountering a reference, check if it points to any node in the
//     current path.
//     • If yes, mark the conjunct as cyclic.
//     • If no, add the referenced node to the path and continue traversal.
//  4. Handling Optional Constructs:
//     • Conjuncts originating from optional fields, pattern constraints, and
//     disjunctions are marked as optional.
//     • Cycle tracking for optional conjuncts is identical to that for
//     conjuncts not marked as optional up to the point a cycle is detected
//     (i.e. all conjuncts are cyclic).
//     • When a cycle is detected, the lists of referenced nodes are cleared
//     for each conjunct, which is thereby afforded one additional level
//     of cycles. This allows for any optional paths to terminate.
//
//
// ## CALL STACK
//
// There are two key types of structural cycles: referencing an ancestor and
// repeated mixing in of cyclic types. We track these separately.
//
// We also keep track of the non-cyclicity of conjuncts a bit differently for
// these cases.
//
// ### Ancestor References
//
// Ancestor references are relatively easy to detect by simply checking if a
// resolved reference is a direct parent, or is a node that is currently under
// evaluation.
//
// An ancestor cycle is considered to be a structural cycle if there are no
// new sibling conjuncts associated with new structure.
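//
// As an illustrative sketch (reusing examples that appear elsewhere in this
// comment, and assuming no other conjuncts are involved), the reference in
//
//     a: b: a
//
// resolves to `a`, which is a direct parent of `a.b`. No sibling conjunct
// adds new structure, so this is a structural cycle. By contrast, in
//
//     a: b?: a
//
// the same ancestor reference sits in an optional field, so evaluation can
// stop there without reporting an error, and `a` evaluates to `{}`.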
//
// ### Reoccurring references
//
// For reoccurring references, we need to maintain a per-conjunct list of
// references. When a reference was previously resolved in a conjunct, we may
// have a cycle and will mark the conjunct as such.
//
// A cycle from a reoccurring reference is a structural cycle if there are
// no incoming arcs from any non-cyclic conjunct. The need for this subtle
// distinction can be clarified by an example:
//
//     crossRefNoCycle: t4: {
//         T: X={
//             y: X.x
//         }
//         // Here C.x.y must consider any incoming arc: here T originates from
//         // a non-cyclic conjunct, but once evaluated it becomes cyclic and
//         // will be the only conjunct. This is not a cycle, though. We must
//         // take into account that T was introduced from a non-cyclic
//         // conjunct.
//         C: T & { x: T }
//     }
//
//
// ## OPTIONAL PATHS
//
// Cyclic references for conjuncts that originate from an "optional" path, such
// as optional fields and pattern constraints, may not necessarily be cyclic, as
// on a next iteration such conjuncts _may_ still terminate.
//
// To allow for this kind of eventuality, optional conjuncts are processed in
// two phases:
//
//   - they behave as normal conjuncts up to the point a cycle is detected
//   - afterwards, their reference history is cleared and they are allowed to
//     proceed until the next cycle is detected.
//
// Note that this means we may allow processing to proceed deeper than strictly
// necessary in some cases.
//
// Note that we only allow this for references: for cycles with ancestor nodes
// we immediately terminate for optional fields. This simplifies the algorithm.
// But it is also correct: in such cases either the whole node is in an optional
// path, in which case reporting an error is benign (as they are allowed), or
// the node corresponds to a non-optional field, in which case a cycle can be
// expected to reproduce another non-optional cycle, which will be an error.
//
// ### Examples
//
// These are not cyclic:
//
//  1. The structure is cyclic, but the optional field needs to be "fed" to
//     continue the cycle:
//
//     a: b?: a        // a: {}
//
//     b: [string]: b  // b: {}
//
//     c: 1 | {d: c}   // c: 1
//
//  2. The structure is cyclic. Conjunct `x: a` keeps detecting cycles, but
//     is fed with new structure up until x.b.c.b.c.b. After this, this
//     (optional) conjunct is allowed to proceed until the next cycle, which
//     will not be reached, as the `b?` is not unified with a concrete value.
//     So the result of `x` is `{b: c: b: c: b: c: {}}`.
//
//     a: b?: c: a
//     x: a
//     x: b: c: b: c: b: {}
//
// These are cyclic:
//
//  3. Here the optional conjunct triggers a new cycle of itself, but also
//     of a conjunct that turns `b` into a regular field. It is thus a
//     self-feeding cycle.
//
//     a: b?: a
//     a: b: _
//
//     c: [string]: c
//     c: b: _
//
//  4. Here two optional conjuncts end up feeding each other, resulting in a
//     cycle.
//
//     a: c: a | int
//     a: a | int
//
//     y1: c?: c: y1
//     x1: y1
//     x1: c: y1
//
//     y2: [string]: b: y2
//     x2: y2
//     x2: b: y2
//
//
// ## INLINE CYCLES
//
// The semantics for treating inline cycles can be derived by rewriting CUE of
// the form
//
//     x: {...}.out
//
// as
//
//     x:  _x.out
//     _x: {...}
//
// A key difference is that such structs are not "rooted" (they have no path
// from the root of the configuration tree), and thus, to be correct, any error
// should be caught and evaluated before doing a lookup in such structs. For
// the purpose of this algorithm, this especially pertains to structural cycles.
//
// Note that the scope in which the "helper" field is defined may determine
// whether or not there is a structural cycle. Consider, for instance,
//
//     X: {in: a, out: in}
//     a: b: (X & {in: a}).out
//
// Two possible rewrites are:
//
//     X: {in: a, out: in}
//     a: b: _a.out
//     _a: X & {in: a}
//
// and
//
//     X: {in: a, out: in}
//     a: {
//         b:  _b.out
//         _b: X & {in: a}
//     }
//
// The former prevents a structural cycle, the latter results in a structural
// cycle.
//
// The current implementation takes the former approach, which more closely
// mimics the V2 implementation. Note that other approaches are possible.
//
// ### Examples
//
// Expanding these out with the above rules should give the same results.
//
// Cyclic:
//
//  1. This is an example of mutual recursion, triggered by n >= 2.
//
//     fibRec: {
//         nn: int,
//         out: (fib & {n: nn}).out
//     }
//     fib: {
//         n: int
//         if n >= 2 { out: (fibRec & {nn: n - 2}).out }
//         if n < 2  { out: n }
//     }
//     fib2: fib & {n: 2}
//
//     is equivalent to
//
//     fibRec: {
//         nn: int,
//         out: _out.out
//         _out: fib & {n: nn}
//     }
//     fib: {
//         n: int
//         if n >= 2 {
//             out: _out.out
//             _out: fibRec & {nn: n - 2}
//         }
//         if n < 2 { out: n }
//     }
//     fib2: fib & {n: 2}
//
// Non-cyclic:
//
//  2. This is not dissimilar to the previous example, but since additions are
//     done on separate lines, each field is only visited once and no cycle is
//     triggered.
//
//     f: { in: number, out: in }
//     k00: 0
//     k10: (f & {in: k00}).out
//     k20: (f & {in: k10}).out
//     k10: (f & {in: k20}).out
//
//     which is equivalent to
//
//     f: { in: number, out: in }
//     k0: 0
//     k1: _k1.out
//     k2: _k2.out
//     k1: _k3.out
//     _k1: f
//     _k2: f
//     _k3: f
//     _k1: in: k0
//     _k2: in: k1
//     _k3: in: k2
//
//     and thus is non-cyclic.
//
// ## EDGE CASES
//
// This section lists several edge cases, including interactions with the
// detection of self-reference cycles.
//
// Self-reference cycles, like `a: a`, evaluate to top. The evaluator detects
// these cases and drops such conjuncts, effectively treating them as top.
//
// ### Self-referencing patterns
//
// Self-references in patterns are typically handled automatically. But there
// are some edge cases where they are not:
//
//     _self: x: [...and(x)]
//     _self
//     x: [1]
//
// Patterns are recorded in Vertex values that are themselves evaluated to
// allow them to be compared, such as in subsumption or filtering disjunctions.
// In the above case, `x` may be evaluated to be inserted in the pattern
// Vertex, but because the pattern is not itself `x`, node identity cannot be
// used to detect a self-reference.
//
// The current solution is to mark a node as a pattern constraint and treat
// structural cycles to such nodes as "reference cycles". As pattern constraints
// are optional, it is safe to ignore such errors.
//
// ### Lookups in inline cycles
//
// A lookup, especially in inline cycles, should be considered evidence of
// non-cyclicity. Consider the following example:
//
//     { p: { x: p, y: 1 } }.p.x.y
//
// Without considering a lookup as evidence of non-cyclicity, this would
// result in a structural cycle.
//
// ## CORRECTNESS
//
// ### The algorithm will terminate
//
// First consider the algorithm without optional conjuncts. If a parent node is
// referenced, it will obviously be caught. The more interesting case is if a
// reference to a node is made which is later reintroduced.
//
// When a conjunct splits into multiple conjuncts, its entire cycle history is
// copied. This means that any cyclic conjunct will be marked as cyclic in
// perpetuity. Non-cyclic conjuncts will either remain non-cyclic or be turned
// into a cycle. A conjunct can only remain non-cyclic for a maximum of the
// number of nodes in a graph. For any structure to repeat, it must have a
// repeated reference. This means that eventually all conjuncts will either
// terminate or become cyclic.
//
// Optional conjuncts do not materially alter this property. The only difference
// is that when a node-level cycle is detected, we continue processing of some
// conjuncts until the next cycle is reached.
//
//
// ## TODO
//
// - treatment of let fields
// - tighter termination for some mutual cycles in optional conjuncts.

// DEPRECATED: V2 cycle detection.
//
// TODO(evalv3): remove these comments once we have fully moved to V3.
//

// Cycle detection:
//
// - Current algorithm does not allow for early non-cyclic conjunct detection.
// - Record possibly cyclic references.
// - Mark as cyclic if no evidence is found.
// - Note that this also activates the same reference in other (parent) conjuncts.

// CYCLE DETECTION ALGORITHM
//
// BACKGROUND
//
// The cycle detection is inspired by the cycle detection used by Tomabechi's
// [Tomabechi COLING 1992] and Van Lohuizen's [Van Lohuizen ACL 2000] graph
// unification algorithms.
//
// Unlike with traditional graph unification, however, CUE uses references,
// which, unlike node equivalence, are unidirectional. This means that the
// technique to track equivalence through dereference, as common in graph
// unification algorithms like Tomabechi's, does not work unaltered.
//
// The unidirectional nature of references implies that each reference equates a
// facsimile of the value it points to. This renders the original approach of
// node-pointer equivalence useless.
//
//
// PRINCIPLE OF ALGORITHM
//
// The solution for CUE is based on two observations:
//
//   - the CUE algorithm tracks all conjuncts that define a node separately,
//   - accumulating used references on a per-conjunct basis causes duplicate
//     references to uniquely identify cycles.
//
// A structural cycle, as defined by the spec, can then be detected if all
// conjuncts are marked as a cycle.
//
// References are accumulated as follows:
//
//  1. If a conjunct is a reference the reference is associated with that
//     conjunct as well as the conjunct corresponding to the value it refers to.
//  2.
//     If a conjunct is a struct (including lists), its references are associated
//     with all embedded values and fields.
//
// To narrow down the specifics of the reference-based cycle detection, let us
// explore structural cycles in a bit more detail.
//
//
// STRUCTURAL CYCLES
//
// See the language specification for a higher-level and more complete overview.
//
// We have to define when a cycle is detected. CUE implementations MUST report
// an error upon a structural cycle, and SHOULD report cycles at the shortest
// possible paths at which they occur, but MAY report these at deeper paths. For
// instance, the following CUE has a structural cycle
//
//     f: g: f
//
// The shortest path at which the cycle can be reported is f.g, but as all
// failed configurations are logically equal, it is fine for implementations to
// report them at f.g.g, for instance.
//
// It is not, however, correct to assume that a reference to a parent is always
// a cycle. Consider this case:
//
//     a: [string]: b: a
//
// Even though reference `a` refers to a parent node, the cycle needs to be fed
// by a concrete field in struct `a` to persist, meaning that, as written here,
// it cannot result in a cycle as defined in the spec. Note, however, that a
// specialization of this configuration _can_ result in a cycle. Consider
//
//     a: [string]: b: a
//     a: c: _
//
// Here reference `a` is guaranteed to result in a structural cycle, as field
// `c` will match the pattern constraint unconditionally.
//
// In other words, it is not possible to exclude tracking references across
// pattern constraints from cycle checking.
//
// It is tempting to try to find a complete set of these edge cases with the aim
// to statically determine cases in which this occurs.
// But as [Carpenter 1992]
// demonstrates, it is possible for cycles to be created as a result of unifying
// two graphs that are themselves acyclic. The following example is a
// translation of Carpenter's example to CUE:
//
//     y: {
//         f: h: g
//         g: _
//     }
//     x: {
//         f: _
//         g: f
//     }
//
// Even though the above contains no cycles, the result of `x & y` is cyclic:
//
//     f: h: g
//     g: f
//
// This means that, in practice, cycle detection has at least partially a
// dynamic component to it.
//
//
// ABSTRACT ALGORITHM
//
// The algorithm is described declaratively by defining what it means for a
// field to have a structural cycle. In the below, a _reference_ is uniquely
// identified by the pointer identity of a Go Resolver instance.
//
// Cycles are tracked on a per-conjunct basis and are not aggregated per Vertex:
// administrative information is only passed on from parent to child conjunct.
//
// A conjunct is a _parent_ of another conjunct if it is a conjunct of one of
// the non-optional fields of the conjunct. For instance, conjunct `x` with
// value `{b: y & z}`, is a parent of conjunct `y` as well as `z`. Within field
// `b`, the conjuncts `y` and `z` would be tracked individually, though.
//
// A conjunct is _associated with a reference_ if its value was obtained by
// evaluating a reference. Note that a conjunct may be associated with many
// references if its evaluation requires evaluating a chain of references. For
// instance, consider
//
//     a: {x: d}
//     b: a
//     c: b & e
//     d: y: 1
//
// the first conjunct of field `c` (reference `b`) has the value `{x: y: 1}` and
// is associated with references `b` and `a`.
//
// The _tracked references_ of a conjunct are all references that are associated
// with it or any of its ancestors (parents of parents).
// For instance, the
// tracked references of conjunct `b.x` of field `c.x` are `a`, `b`, and `d`.
//
// A conjunct is a violating cycle if it is a reference that:
//   - occurs in the tracked references of the conjunct, or
//   - directly refers to a parent node of the conjunct.
//
// A conjunct is cyclic if it is a violating cycle or if any of its ancestors
// are a violating cycle.
//
// A field has a structural cycle if it is composed of at least one conjunct
// that is a violating cycle and no conjunct that is not cyclic.
//
// Note that a field can be composed of only cyclic conjuncts while still not
// being a structural cycle: as long as there are no conjuncts that are a
// violating cycle, it is not a structural cycle. This is important for the
// following case:
//
//     a: [string]: b: a
//     x: a
//     x: c: b: c: {}
//
// Here, reference `a` is never a cycle as the recursive reference crosses a
// pattern constraint that only instantiates if it is unified with something
// else.
//
//
// DISCUSSION
//
// The goal of the conjunct cycle marking algorithm is twofold:
//   - mark conjuncts that are proven to propagate indefinitely
//   - mark them as early as possible (shortest CUE path).
//
// TODO: Prove all cyclic conjuncts will eventually be marked as cyclic.
//
// TODO:
//   - reference marks whether it crosses a pattern, improving the case
//     a: [string]: b: c: b
//     This requires a compile-time detection mechanism.
//
//
// REFERENCES
// [Tomabechi COLING 1992]: https://aclanthology.org/C92-2068
//     Hideto Tomabechi. 1992. Quasi-Destructive Graph Unification with
//     Structure-Sharing. In COLING 1992 Volume 2: The 14th International
//     Conference on Computational Linguistics.
//
// [Van Lohuizen ACL 2000]: https://aclanthology.org/P00-1045/
//     Marcel P. van Lohuizen. 2000.
"Memory-Efficient and Thread-Safe 564 // Quasi-Destructive Graph Unification". In Proceedings of the 38th Annual 565 // Meeting of the Association for Computational Linguistics, pages 352–359, 566 // Hong Kong. Association for Computational Linguistics. 567 // 568 // [Carpenter 1992]: 569 // Bob Carpenter, "The logic of typed feature structures." 570 // Cambridge University Press, ISBN:0-521-41932-8 571 572 // TODO: mark references as crossing optional boundaries, rather than 573 // approximating it during evaluation. 574 575 type CycleInfo struct { 576 // CycleType is used by the V3 cycle detection algorithm to track whether 577 // a cycle is detected and of which type. 578 CycleType CyclicType 579 580 // IsCyclic indicates whether this conjunct, or any of its ancestors, 581 // had a violating cycle. 582 // TODO: make this a method and use CycleType == IsCyclic after V2 is removed. 583 IsCyclic bool 584 585 // Inline is used to detect expressions referencing themselves, for instance: 586 // {x: out, out: x}.out 587 Inline bool 588 589 // TODO(perf): pack this in with CloseInfo. Make an uint32 pointing into 590 // a buffer maintained in OpContext, using a mark-release mechanism. 591 Refs *RefNode 592 } 593 594 // A RefNode is a linked list of associated references. 595 type RefNode struct { 596 Ref Resolver 597 Arc *Vertex // Ref points to this Vertex 598 599 // Node is the Vertex of which Ref is evaluated as a conjunct. 600 // If there is a cyclic reference (not structural cycle), then 601 // the reference will have the same node. This allows detecting reference 602 // cycles for nodes referring to nodes with an evaluation cycle 603 // (mode tracked to Evaluating status). Examples: 604 // 605 // a: x 606 // Y: x 607 // x: {Y} 608 // 609 // and 610 // 611 // Y: x.b 612 // a: x 613 // x: b: {Y} | null 614 // 615 // In both cases there are not structural cycles and thus need to be 616 // distinguished from regular structural cycles. 
	Node *Vertex

	Next  *RefNode
	Depth int32
}

// cyclicConjunct is used in nodeContext to postpone the computation of
// cyclic conjuncts until a non-cyclic conjunct permits it to be processed.
type cyclicConjunct struct {
	c   Conjunct
	arc *Vertex // cached Vertex
}

// CyclicType indicates the type of cycle detected. The CyclicType is associated
// with a conjunct and may only increase in value for child conjuncts.
type CyclicType uint8

const (
	NoCycle CyclicType = iota

	// IsOptional is like newStructure, but derived from a reference. If this
	// is set, a cycle will move to MaybeCyclic instead of IsCyclic.
	IsOptional

	// MaybeCyclic is set if a cycle is detected within an optional field.
	MaybeCyclic

	// IsCyclic marks that this conjunct has a structural cycle.
	IsCyclic
)

func (n *nodeContext) detectCycleV3(arc *Vertex, env *Environment, x Resolver, ci CloseInfo) (_ CloseInfo, skip bool) {
	n.assertInitialized()

	// If we are pointing to a direct ancestor, and we are in an optional arc,
	// we can immediately terminate, as a cycle error within an optional field
	// is okay. If we are pointing to a direct ancestor in a non-optional arc,
	// we also can terminate, as this is a structural cycle.
	// TODO: use depth or check direct ancestry.
	if n.hasAncestorV3(arc) {
		return n.markCyclicV3(arc, env, x, ci)
	}

	// As long as a node-wide cycle has not yet been detected, we allow cycles
	// in optional fields to proceed unchecked.
	if n.hasNonCyclic && ci.CycleType == MaybeCyclic {
		return ci, false
	}

	for r := ci.Refs; r != nil; r = r.Next {
		if equalDeref(r.Arc, arc) {
			if equalDeref(r.Node, n.node) {
				// reference cycle
				return ci, true
			}

			// If there are still any non-cyclic conjuncts, and if this conjunct
			// is optional, we allow this to continue one more cycle.
			if ci.CycleType == IsOptional && n.hasNonCyclic {
				ci.CycleType = MaybeCyclic
				// There may still be a cycle if the optional field is a pattern
				// that unifies with itself, as in:
				//
				//     [string]: c
				//     a: b
				//     b: _
				//     c: a: int
				//
				// This is equivalent to a reference cycle.
				if r.Depth == n.node.state.depth {
					return ci, true
				}
				ci.Refs = nil
				return ci, false
			}

			return n.markCyclicPathV3(arc, env, x, ci)
		}
		if equalDeref(r.Node, n.node) && r.Ref == x && arc.nonRooted {
			return n.markCyclicPathV3(arc, env, x, ci)
		}
	}

	ci.Refs = &RefNode{
		Arc:   deref(arc),
		Ref:   x,
		Node:  deref(n.node),
		Next:  ci.Refs,
		Depth: n.depth,
	}

	return ci, false
}

// markNonCyclic records when a non-cyclic conjunct is processed.
func (n *nodeContext) markNonCyclic(id CloseInfo) {
	switch id.CycleType {
	case NoCycle, IsOptional:
		n.hasNonCyclic = true
	}
}

// markCyclicV3 marks a conjunct as being cyclic. Also, it postpones processing
// the conjunct in the absence of evidence of a non-cyclic conjunct.
func (n *nodeContext) markCyclicV3(arc *Vertex, env *Environment, x Resolver, ci CloseInfo) (CloseInfo, bool) {
	ci.CycleType = IsCyclic
	ci.IsCyclic = true

	n.hasAnyCyclicConjunct = true
	n.hasAncestorCycle = true

	if !n.hasNonCycle && env != nil {
		// TODO: investigate if we can get rid of cyclicConjuncts in the new
		// evaluator.
		v := Conjunct{env, x, ci}
		n.cyclicConjuncts = append(n.cyclicConjuncts, cyclicConjunct{v, arc})
		return ci, true
	}
	return ci, false
}

func (n *nodeContext) markCyclicPathV3(arc *Vertex, env *Environment, x Resolver, ci CloseInfo) (CloseInfo, bool) {
	ci.CycleType = IsCyclic
	ci.IsCyclic = true

	n.hasAnyCyclicConjunct = true

	if !n.hasNonCyclic && !n.hasNonCycle && env != nil {
		// TODO: investigate if we can get rid of cyclicConjuncts in the new
		// evaluator.
		v := Conjunct{env, x, ci}
		n.cyclicConjuncts = append(n.cyclicConjuncts, cyclicConjunct{v, arc})
		return ci, true
	}
	return ci, false
}

// combineCycleInfo merges the cycle information collected in the context into
// the given CloseInfo. Note that it only merges the cycle information in its
// entirety, if present, to avoid getting unrelated data.
func (c *OpContext) combineCycleInfo(ci CloseInfo) CloseInfo {
	cc := c.ci.CycleInfo
	if cc.IsCyclic {
		ci.CycleInfo = cc
	}
	return ci
}

// hasDepthCycle uses depth counters to keep track of cycles:
//   - it allows detecting reference cycles as well (state evaluating is
//     no longer used in v3)
//   - it can capture cycles across inline structs, which do not have
//     Parent set.
//
// TODO: ensure that evalDepth is cleared when a node is finalized.
func (c *OpContext) hasDepthCycle(v *Vertex) bool {
	if s := v.state; s != nil && v.status != finalized {
		return s.evalDepth > 0 && s.evalDepth < c.evalDepth
	}
	return false
}

// hasAncestorV3 checks whether a node is currently being processed. The code
// still assumes that this includes any node that is currently being processed.
func (n *nodeContext) hasAncestorV3(arc *Vertex) bool {
	if n.ctx.hasDepthCycle(arc) {
		return true
	}

	// TODO: insert test conditions for Bloom filter that guarantee that all
	// parent nodes have been marked as "hot", in which case we can avoid this
	// traversal.
	// if n.meets(allAncestorsProcessed)  {
	// 	return false
	// }

	for p := n.node.Parent; p != nil; p = p.Parent {
		// TODO(perf): deref arc only once.
		if equalDeref(p, arc) {
			return true
		}
	}
	return false
}

func (n *nodeContext) hasOnlyCyclicConjuncts() bool {
	return (n.hasAncestorCycle && !n.hasNonCycle) ||
		(n.hasAnyCyclicConjunct && !n.hasNonCyclic)
}

// setOptionalV3 marks a conjunct as being optional. The nodeContext is
// currently unused, but allows for checks to be added and to add logging during
// debugging.
func (c *CloseInfo) setOptionalV3(n *nodeContext) {
	_ = n // See comment.
	if c.CycleType == NoCycle {
		c.CycleType = IsOptional
	}
}

// markCycle checks whether the reference x is cyclic. There are two cases:
//  1. it was previously used in this conjunct, and
//  2. it directly references a parent node.
//
// Other inputs:
//
//	arc      the reference to which x points
//	env, ci  the components of the Conjunct from which x originates
//
// A cyclic node is added to a queue for later processing if no evidence of a
// non-cyclic node has so far been found. updateCyclicStatus processes delayed
// nodes down the line once such evidence is found.
//
// If a cycle is the result of "inline" processing (an expression referencing
// itself), an error is reported immediately.
//
// It returns the CloseInfo with tracked cyclic conjuncts updated, and
// whether or not its processing should be skipped, which is the case either if
// the conjunct seems to be fully cyclic so far or if there is a valid reference
// cycle.
func (n *nodeContext) markCycle(arc *Vertex, env *Environment, x Resolver, ci CloseInfo) (_ CloseInfo, skip bool) {
	unreachableForDev(n.ctx)

	n.assertInitialized()

	// TODO(perf): this optimization can work if we also check for any
	// references pointing to arc within arc. This can be done with compiler
	// support. With this optimization, almost all references could avoid cycle
	// checking altogether!
	// if arc.status == Finalized && arc.cyclicReferences == nil {
	// 	return v, false
	// }

	// Check whether the reference already occurred in the list, signaling
	// a potential cycle.
	found := false
	depth := int32(0)
	for r := ci.Refs; r != nil; r = r.Next {
		if r.Ref != x {
			// TODO(share): this is a bit of a hack. We really should implement
			// (*Vertex).cyclicReferences for the new evaluator. However,
			// implementing cyclicReferences is somewhat tricky, as it requires
			// referenced nodes to be evaluated, which is a guarantee we may not
			// want to give. Moreover, it seems we can find a simpler solution
			// based on structure sharing. So punt on this solution for now.
			if r.Arc != arc || !n.ctx.isDevVersion() {
				continue
			}
			found = true
		}

		// A reference that is within a graph that is being evaluated
		// may repeat with a different arc and will point to a
		// non-finalized arc. A repeating reference that points outside the
		// graph will always be the same address. Hence, if this is a
		// finalized arc with a different address, it resembles a reference that
		// is included through a different path and is not a cycle.
		if !equalDeref(r.Arc, arc) && arc.status == finalized {
			continue
		}

		// For dynamically created structs we mark this as an error. Otherwise
		// there is only an error if we have visited the arc before.
		if ci.Inline && (arc.IsDynamic || equalDeref(r.Arc, arc)) {
			n.reportCycleError()
			return ci, true
		}

		// We have a reference cycle, as distinguished from a structural
		// cycle. Reference cycles represent equality, and thus are equal
		// to top. We can stop processing here.
		// var nn1, nn2 *Vertex
		// if u := r.Node.state.underlay; u != nil {
		// 	nn1 = u.node
		// }
		// if u := n.node.state.underlay; u != nil {
		// 	nn2 = u.node
		// }
		if equalDeref(r.Node, n.node) {
			return ci, true
		}

		depth = r.Depth
		found = true

		// Mark all conjuncts of this Vertex that refer to the same node as
		// cyclic. This is an extra safety measure to ensure that two conjuncts
		// cannot work in tandem to circumvent a cycle. It also tightens
		// structural cycle detection in some cases. Late detection of cycles
		// can result in a lot of redundant work.
		//
		// TODO: this loop is not on a critical path, but at some point we
		// should evaluate whether it is worth keeping.
		for i, c := range n.node.Conjuncts {
			if c.CloseInfo.IsCyclic {
				continue
			}
			for rr := c.CloseInfo.Refs; rr != nil; rr = rr.Next {
				// TODO: Is it necessary to find another way to find
				// "parent" conjuncts? This mechanism seems not entirely
				// accurate. Maybe a pointer up to find the root and then
				// "spread" downwards?
				if r.Ref == x && equalDeref(r.Arc, rr.Arc) {
					n.node.Conjuncts[i].CloseInfo.IsCyclic = true
					break
				}
			}
		}

		break
	}

	if arc.state != nil {
		if d := arc.state.evalDepth; d > 0 && d >= n.ctx.optionalMark {
			arc.IsCyclic = true
		}
	}

	// The code in this switch statement registers structural cycles caught
	// through EvaluatingArcs to the root of the cycle. This way, any node
	// referencing this value can track these nodes early. This is mostly an
	// optimization to shorten the path for which structural cycles are
	// detected, which may be critical for performance.
outer:
	switch arc.status {
	case evaluatingArcs: // also Evaluating?
		if arc.state.evalDepth < n.ctx.optionalMark {
			break
		}

		// The reference may already be there if we had non-cyclic structure
		// invalidating the cycle.
		for r := arc.cyclicReferences; r != nil; r = r.Next {
			if r.Ref == x {
				break outer
			}
		}

		arc.cyclicReferences = &RefNode{
			Arc:  deref(arc),
			Ref:  x,
			Next: arc.cyclicReferences,
		}

	case finalized:
		// Insert cyclic references from found arc, if any.
		for r := arc.cyclicReferences; r != nil; r = r.Next {
			if r.Ref == x {
				// We have detected a cycle, with the only exception if arc is
				// a disjunction, as evaluation always stops at unresolved
				// disjunctions.
				if _, ok := arc.BaseValue.(*Disjunction); !ok {
					found = true
				}
			}
			ci.Refs = &RefNode{
				Arc:  deref(r.Arc),
				Node: deref(n.node),

				Ref:   x,
				Next:  ci.Refs,
				Depth: n.depth,
			}
		}
	}

	// NOTE: we need to add a tracked reference even if arc is not cyclic: it
	// may still cause a cycle that does not refer to a parent node. For
	// instance:
	//
	//      y: [string]: b: y
	//      x: y
	//      x: c: x
	//
	// ->
	//          - in conjuncts
	//             - out conjuncts: these count for cycle detection.
	//      x: {
	//          [string]: <1: y> b: y
	//          c: x
	//      }
	//      x.c: {
	//          <1: y> b: y
	//          <2: x> y
	//             [string]: <3: x, y> b: y
	//          <2: x> c: x
	//      }
	//      x.c.b: {
	//          <1: y> y
	//             [string]: <4: y; Cyclic> b: y
	//          <3: x, y> b: y
	//      }
	//      x.c.b.b: {
	//          <3: x, y> y
	//             [string]: <5: x, y, Cyclic> b: y
	//          <4: y, Cyclic> y
	//             [string]: <5: x, y, Cyclic> b: y
	//      }
	//      x.c.c: { // structural cycle
	//          <3: x, y> b: y
	//          <2: x> x
	//             <6: x, Cyclic>: y
	//                [string]: <8: x, y; Cyclic> b: y
	//             <7: x, Cyclic>: c: x
	//      }
	//      x.c.c.b: { // structural cycle
	//          <3: x, y> y
	//             [string]: <3: x, y; Cyclic> b: y
	//          <8: x, y; Cyclic> y
	//      }
	// ->
	//      x: [string]: b: y
	//      x: c: b: y
	//      x: c: [string]: b: y
	//      x: c: b: b: y
	//      x: c: b: [string]: b: y
	//      x: c: b: b: b: y
	//      ....       // structural cycle 1
	//      x: c: c: x // structural cycle 2
	//
	// Note that in this example there is a structural cycle at x.c.c, but we
	// would need to guarantee that cycle is detected before the algorithm
	// descends into x.c.b.
	if !found || depth != n.depth {
		// Adding this in case there is a definite cycle is unnecessary, but
		// gives somewhat better error messages.
		// We also need to add the reference again if the depth differs, as
		// the depth is used for tracking "new structure".
		// var nn *Vertex
		// if u := n.node.state.underlay; u != nil {
		// 	nn = u.node
		// }
		ci.Refs = &RefNode{
			Arc:   deref(arc),
			Ref:   x,
			Node:  deref(n.node),
			Next:  ci.Refs,
			Depth: n.depth,
		}
	}

	if !found && arc.status != evaluatingArcs {
		// No cycle.
		return ci, false
	}

	// TODO: consider if we should bail if a cycle is detected using this
	// mechanism.
	// Ultimately, especially when the old evaluator is removed
	// and the status field purged, this should be used instead of the above.
	// if !found && arc.state.evalDepth < n.ctx.optionalMark {
	// 	// No cycle.
	// 	return ci, false
	// }

	alreadyCycle := ci.IsCyclic
	ci.IsCyclic = true

	// TODO: depth might legitimately be 0 if it is a root vertex.
	// In the worst case, this may lead to a spurious cycle.
	// Fix this by ensuring the root vertex starts with a depth of 1, for
	// instance.
	if depth > 0 {
		// Look for evidence of "new structure" to invalidate the cycle.
		// This is done by checking for non-cyclic conjuncts between the
		// current vertex up to the ancestor to which the reference points.
		// Note that the cyclic conjunct may not be marked as such, so we
		// look for at least one other non-cyclic conjunct if this is the case.
		upCount := n.depth - depth
		for p := n.node.Parent; p != nil; p = p.Parent {
			if upCount--; upCount <= 0 {
				break
			}
			a := p.Conjuncts
			count := 0
			for _, c := range a {
				count += getNonCyclicCount(c)
			}
			if !alreadyCycle {
				count--
			}
			if count > 0 {
				return ci, false
			}
		}
	}

	n.hasAnyCyclicConjunct = true
	if !n.hasNonCycle && env != nil {
		// TODO: investigate if we can get rid of cyclicConjuncts in the new
		// evaluator.
		v := Conjunct{env, x, ci}
		n.cyclicConjuncts = append(n.cyclicConjuncts, cyclicConjunct{v, arc})
		return ci, true
	}

	return ci, false
}

func getNonCyclicCount(c Conjunct) int {
	switch a, ok := c.x.(*ConjunctGroup); {
	case ok:
		count := 0
		for _, c := range *a {
			count += getNonCyclicCount(c)
		}
		return count

	case !c.CloseInfo.IsCyclic:
		return 1

	default:
		return 0
	}
}

// updateCyclicStatusV3 looks for proof of non-cyclic conjuncts to override
// a structural cycle.
func (n *nodeContext) updateCyclicStatusV3(c CloseInfo) {
	if n.ctx.inDisjunct == 0 {
		n.hasFieldValue = true
	}
	if !c.IsCyclic {
		n.hasNonCycle = true
		for _, c := range n.cyclicConjuncts {
			ci := c.c.CloseInfo
			if c.arc != nil {
				n.scheduleVertexConjuncts(c.c, c.arc, ci)
			} else {
				n.scheduleConjunct(c.c, ci)
			}
		}
		n.cyclicConjuncts = n.cyclicConjuncts[:0]
	}
}

// updateCyclicStatus looks for proof of non-cyclic conjuncts to override
// a structural cycle.
func (n *nodeContext) updateCyclicStatus(c CloseInfo) {
	unreachableForDev(n.ctx)

	if !c.IsCyclic {
		n.hasNonCycle = true
		for _, c := range n.cyclicConjuncts {
			n.addVertexConjuncts(c.c, c.arc, false)
		}
		n.cyclicConjuncts = n.cyclicConjuncts[:0]
	}
}

func assertStructuralCycleV3(n *nodeContext) bool {
	n.cyclicConjuncts = n.cyclicConjuncts[:0]

	if n.hasOnlyCyclicConjuncts() {
		n.reportCycleError()
		return true
	}
	return false
}

func assertStructuralCycle(n *nodeContext) bool {
	if n.hasAnyCyclicConjunct && !n.hasNonCycle {
		n.reportCycleError()
		return true
	}
	return false
}

func (n *nodeContext) reportCycleError() {
	b := &Bottom{
		Code:  StructuralCycleError,
		Err:   n.ctx.Newf("structural cycle"),
		Value: n.node.Value(),
		Node:  n.node,
		// TODO: probably, this should have the referenced arc.
	}
	n.setBaseValue(CombineErrors(nil, n.node.Value(), b))
	n.node.Arcs = nil
}

// makeAnonymousConjunct creates a conjunct that tracks self-references when
// evaluating an expression.
//
// Example:
// TODO:
func makeAnonymousConjunct(env *Environment, x Expr, refs *RefNode) Conjunct {
	return Conjunct{
		env, x, CloseInfo{CycleInfo: CycleInfo{
			Inline: true,
			Refs:   refs,
		}},
	}
}

// incDepth increments the evaluation depth. This should typically be called
// before descending into a child node.
func (n *nodeContext) incDepth() {
	n.ctx.evalDepth++
}

// decDepth decrements the evaluation depth. It should be paired with a call to
// incDepth and be called after the processing of child nodes is done.
func (n *nodeContext) decDepth() {
	n.ctx.evalDepth--
}

// markOptional marks that we are about to process an "optional element" that
// allows errors.
// In these cases, structural cycles are not "terminal".
//
// Examples of such constructs are:
//
// Optional fields:
//
//	a: b?: a
//
// Pattern constraints:
//
//	a: [string]: a
//
// Disjunctions:
//
//	a: b: null | a
//
// A call to markOptional should be paired with a call to unmarkOptional.
func (n *nodeContext) markOptional() (saved int) {
	saved = n.ctx.evalDepth
	n.ctx.optionalMark = n.ctx.evalDepth
	return saved
}

// See markOptional.
func (n *nodeContext) unmarkOptional(saved int) {
	n.ctx.optionalMark = saved
}

// markDepth assigns the current evaluation depth to the receiving node.
// Any previously assigned depth is saved and returned and should be restored
// using unmarkDepth after processing n.
//
// When a node is encountered with a depth set to a non-zero value this
// indicates a cycle. The cycle is an evaluation cycle when the node's depth
// is equal to the current depth and a structural cycle otherwise.
func (n *nodeContext) markDepth() (saved int) {
	saved = n.evalDepth
	n.evalDepth = n.ctx.evalDepth
	return saved
}

// See markDepth.
func (n *nodeContext) unmarkDepth(saved int) {
	n.evalDepth = saved
}
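
The depth rule stated in the markDepth comment can be illustrated with a small standalone sketch. It is not part of the adt package: `evalCtx`, `node`, and `classify` are hypothetical stand-ins that model only the two depth fields, and the sketch assumes that a zero depth means the node is not currently being processed.

```go
package main

import "fmt"

// evalCtx and node are hypothetical, stripped-down stand-ins for the
// evaluator's context and nodeContext; only the depth fields are modeled.
type evalCtx struct{ evalDepth int }

type node struct {
	ctx       *evalCtx
	evalDepth int
}

// markDepth and unmarkDepth mirror the save/restore pair described above.
func (n *node) markDepth() (saved int) {
	saved = n.evalDepth
	n.evalDepth = n.ctx.evalDepth
	return saved
}

func (n *node) unmarkDepth(saved int) { n.evalDepth = saved }

// classify is a hypothetical helper applying the rule from the markDepth
// comment: a zero depth means no cycle, a depth equal to the current depth
// means an evaluation cycle, and any other depth means a structural cycle.
func classify(n *node) string {
	switch {
	case n.evalDepth == 0:
		return "no cycle"
	case n.evalDepth == n.ctx.evalDepth:
		return "evaluation cycle"
	default:
		return "structural cycle"
	}
}

func main() {
	c := &evalCtx{evalDepth: 1}
	n := &node{ctx: c}

	fmt.Println(classify(n)) // prints "no cycle"

	saved := n.markDepth()
	fmt.Println(classify(n)) // prints "evaluation cycle"

	c.evalDepth++            // what incDepth does before descending
	fmt.Println(classify(n)) // prints "structural cycle"
	c.evalDepth--            // what decDepth does afterwards

	n.unmarkDepth(saved)
	fmt.Println(classify(n)) // prints "no cycle"
}
```

Returning the saved value from the mark call, rather than storing it on the node, lets nested mark/unmark pairs restore the previous state in stack order without extra bookkeeping.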