github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/pkg/sql/opt/ordering/doc.go

// Copyright 2018 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
// included in the file licenses/BSL.txt.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.

/*
Package ordering contains operator-specific logic related to orderings - whether
ops can provide Required orderings, what orderings they need to require from
their children, etc.

The package provides generic APIs that can be called on any RelExpr, as well as
operator-specific APIs in some cases.

Required orderings

A Required ordering is part of the physical properties with respect to which an
expression was optimized. It effectively describes a set of orderings, any of
which is acceptable. See OrderingChoice for more information on how this set is
represented.

An operator can provide a Required ordering if it can guarantee its results
respect at least one ordering in the OrderingChoice set, perhaps by requiring
specific orderings of its inputs and/or configuring its execution in a specific
way. This package implements the logic that decides whether each operator can
provide a Required ordering, as well as what Required orderings on its input(s)
are necessary.

Provided orderings

In a single-node serial execution model, the Required ordering would be
sufficient to configure execution. But in a distributed setting, even if an
operator logically has a natural ordering, when multiple instances of that
operator are running on multiple nodes, we must do some extra work (row
comparisons) to maintain their natural orderings when their results merge into a
single node. We must know exactly which ordering must be maintained on the
streams (i.e. along which columns we should perform the comparisons).

Consider a Scan operator that is scanning an index on a,b. In the query:
  SELECT a, b FROM abc ORDER BY a, b
the Scan has Required ordering "+a,+b". Now consider another case where (as part
of some more complicated query) we have the same Scan operator but with Required
ordering "+b opt(a)"¹, which means that any of "+b", "+b,±a", "±a,+b" are
acceptable. Execution would still need to be configured with "+a,+b" because
that is the ordering for the rows that are produced², but this information is
not available from the Required ordering "+b opt(a)".

¹This could for example happen under a Select with filter "a=1".
²For example, imagine that node A produces rows (1,4), (2,1) and node B produces
rows (1,2), (2,3). If these results need to be merged on a single node and we
configure execution to "maintain" an ordering of +b, it will cause an incorrect
ordering or a runtime error.
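
The scenario in footnote ² can be reproduced with a small standalone sketch. It
is not the execution engine's merge logic; the row type, the comparators and
mergeStreams are hypothetical names introduced only for illustration. An ordered
merge emits whichever head row sorts first, so its output is ordered only if
every input stream already respects the comparison being used.

```go
package main

import "fmt"

// row is a hypothetical (a, b) pair produced by a Scan of the index on a,b.
type row struct{ a, b int }

// less reports whether x sorts strictly before y under some ordering.
type less func(x, y row) bool

// byB compares on +b only.
func byB(x, y row) bool { return x.b < y.b }

// byAB compares on +a,+b, the ordering the index scan actually produces.
func byAB(x, y row) bool { return x.a < y.a || (x.a == y.a && x.b < y.b) }

// mergeStreams performs a two-way ordered merge: at each step it emits
// whichever head row sorts first under cmp. The result respects cmp only if
// both inputs already do.
func mergeStreams(left, right []row, cmp less) []row {
	out := make([]row, 0, len(left)+len(right))
	for len(left) > 0 && len(right) > 0 {
		if cmp(right[0], left[0]) {
			out = append(out, right[0])
			right = right[1:]
		} else {
			out = append(out, left[0])
			left = left[1:]
		}
	}
	out = append(out, left...)
	return append(out, right...)
}

// isSorted reports whether rows respects cmp.
func isSorted(rows []row, cmp less) bool {
	for i := 1; i < len(rows); i++ {
		if cmp(rows[i], rows[i-1]) {
			return false
		}
	}
	return true
}

func main() {
	nodeA := []row{{1, 4}, {2, 1}} // ordered by +a,+b, but not by +b
	nodeB := []row{{1, 2}, {2, 3}}

	onB := mergeStreams(nodeA, nodeB, byB)
	fmt.Println(onB, isSorted(onB, byB)) // [{1 2} {2 3} {1 4} {2 1}] false

	onAB := mergeStreams(nodeA, nodeB, byAB)
	fmt.Println(onAB, isSorted(onAB, byAB)) // [{1 2} {1 4} {2 1} {2 3}] true
}
```

Merging on +b silently yields rows that are not ordered by b, because node A's
stream never respected +b in the first place. Merging on +a,+b works because
that is the ordering each stream really has.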

To address this issue, this package implements logic to calculate Provided
orderings for each expression in the lowest-cost tree. Provided orderings are
calculated bottom-up, in conjunction with the Required ordering at the level of
each operator.

The Provided ordering is a specific opt.Ordering which describes the ordering
produced by the operator, and which intersects the Required OrderingChoice (when
the operator's FDs are taken into account). A best-effort attempt is made to
keep the Provided ordering as simple as possible, to minimize the comparisons
that are necessary to maintain it.
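
As a rough illustration of what it means for a Provided ordering to fall within
a Required set, the sketch below models "+b opt(a)" and checks concrete
orderings against it. The col and choice types and the satisfiedBy method are
hypothetical simplifications written for this example; they are not the real
OrderingChoice or opt.Ordering types. The sketch assumes that column a is
optional because it is constant (as under the filter "a=1"), which is a greatly
reduced stand-in for taking FDs into account.

```go
package main

import "fmt"

// col is one column of a concrete ordering, e.g. "+a" (desc=false) or "-b".
type col struct {
	name string
	desc bool
}

// choice is a simplified, hypothetical stand-in for a Required ordering: a
// required sequence of columns plus a set of optional columns that may appear
// anywhere in the ordering, in either direction.
type choice struct {
	required []col
	optional map[string]bool
}

// satisfiedBy reports whether a concrete ordering is one of the orderings the
// choice describes: once optional columns are skipped, the remaining columns
// must begin with the required sequence.
func (c choice) satisfiedBy(ordering []col) bool {
	next := 0
	for _, o := range ordering {
		if c.optional[o.name] {
			continue
		}
		if next == len(c.required) {
			break // extra trailing columns only make the ordering stricter
		}
		if o != c.required[next] {
			return false
		}
		next++
	}
	return next == len(c.required)
}

func main() {
	// "+b opt(a)"
	want := choice{
		required: []col{{name: "b"}},
		optional: map[string]bool{"a": true},
	}

	provided := []col{{name: "a"}, {name: "b"}} // "+a,+b" from the index scan
	fmt.Println(want.satisfiedBy(provided))     // true

	fmt.Println(want.satisfiedBy([]col{{name: "b"}}))             // true
	fmt.Println(want.satisfiedBy([]col{{name: "b", desc: true}})) // false
	fmt.Println(want.satisfiedBy([]col{{name: "a"}}))             // false
}
```

In this model "+a,+b" is accepted by "+b opt(a)", which matches the Scan example
above: the Required ordering admits it, but only the Provided ordering records
that it is the one actually produced.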
*/
package ordering