github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/pkg/sql/opt/norm/rules/reject_nulls.opt (about)

     1  # =============================================================================
     2  # reject_nulls.opt contains normalization rules that simplify expressions based
     3  # on "null-rejecting filters". A null-rejecting filter on column x discards rows
     4  # in which the value of x is null. Many common SQL comparison expressions reject
     5  # null values, such as:
     6  #
     7  #   x = 1          -- Rejects nulls on x
     8  #   y < 10         -- Rejects nulls on y
     9  #   x = y          -- Rejects nulls on x and y
    10  #   x IS NOT NULL  -- Rejects nulls on x
    11  #
    12  # Null rejection analysis is used to simplify outer joins into inner joins. This
    13  # in turn unlocks additional rewrite rules that only work with inner joins. Some
    14  # of these rules are needed to fully decorrelate correlated subqueries, such as
    15  # this pattern:
    16  #
    17  #   SELECT * FROM a WHERE (SELECT MIN(z) FROM b WHERE a.x=b.x) = a.y
    18  #
    19  # Null rejection rules work in concert with predicate pushdown rules to first
    20  # recognize a null rejection filter at a higher level, and then to push it down
    21  # to a lower level where it can simplify an outer join. As an example:
    22  #
    23  #   SELECT min(b.y) FROM a LEFT JOIN b ON True GROUP BY a.x HAVING min(b.y)=1
    24  #
    25  # The min(b.y)=1 expression filters rows where min(b.y) is null, and because of
    26  # the properties of the SQL min function, it therefore excludes GroupBy input
    27  # rows where b.y is null. Since b.y is a null-extended LEFT JOIN column,
    28  # pushing down a "b.y IS NOT NULL" predicate will turn the LEFT JOIN into an
    29  # INNER JOIN.
    30  #
    31  # Filter synthesis and pushdown must be done with care in order to avoid
    32  # repeated pushdown of redundant filters. The null rejection rules rely on the
    33  # Logical.Rule.RejectNullCols property to determine when it's useful to create a
    34  # new IS NOT NULL filter. See the Logical.Rule.RejectNullCols comment for more
    35  # details on how this works.
    36  # =============================================================================
    37  
    38  # RejectNullsLeftJoin reduces a LeftJoin operator to an InnerJoin operator (or a
    39  # FullJoin to a RightJoin) when there is a null-rejecting filter on any column
    40  # from the right side. The effect of the null-rejecting filter is that output
    41  # rows with all NULL values on the right side created by the left (or full) join
    42  # are eliminated, making the join equivalent to an inner (or right) join. For
    43  # example:
    44  #
    45  #   SELECT * FROM a LEFT OUTER JOIN b ON a.x = b.x WHERE b.y < 5
    46  #
    47  # can be reduced to:
    48  #
    49  #   SELECT * FROM a INNER JOIN b ON a.x = b.x WHERE b.y < 5
    50  #
    51  # since b.y < 5 is a null-rejecting filter on the right side.
    52  #
    53  # This rule is marked as high priority so that it runs before Select filter
    54  # pushdown rules. Those rules may remove a filter before it's had a chance to
    55  # rewrite the input join.
    56  #
    57  # Citations: [1]
    58  [RejectNullsLeftJoin, Normalize, HighPriority]
    59  (Select
    60      $input:(LeftJoin | LeftJoinApply | FullJoin
    61          $left:*
    62          $right:*
    63          $on:*
    64          $private:*
    65      )
    66      $filters:* &
    67          (HasNullRejectingFilter $filters (OutputCols $right))
    68  )
    69  =>
    70  (Select
    71      (ConstructNonLeftJoin
    72          (OpName $input)
    73          $left
    74          $right
    75          $on
    76          $private
    77      )
    78      $filters
    79  )
    80  
    81  # RejectNullsRightJoin is symmetric with RejectNullsLeftJoin. It reduces a
    82  # FullJoin operator to a LeftJoin when there is a null-rejecting filter on any
    83  # column from the left side.
    84  #
    85  # This rule is marked as high priority for the same reason as
    86  # RejectNullsLeftJoin.
    87  [RejectNullsRightJoin, Normalize, HighPriority]
    88  (Select
    89      $input:(FullJoin $left:* $right:* $on:* $private:*)
    90      $filters:* &
    91          (HasNullRejectingFilter $filters (OutputCols $left))
    92  )
    93  =>
    94  (Select (LeftJoin $left $right $on $private) $filters)
    95  
    96  # RejectNullsGroupBy pushes a "col IS NOT NULL" null-rejecting filter below the
    97  # GroupBy operator if it allows null rejection for that column (i.e. if it's in
    98  # the NullRejectCols set). See ruleProps.buildGroupByProps for more details on
    99  # the criteria for setting NullRejectCols. See the file header comment for more
   100  # information on null rejection.
   101  #
   102  # This rule is important for decorrelation in cases similar to this:
   103  #
   104  #   SELECT * FROM a WHERE (SELECT MIN(z) FROM b WHERE a.x=b.x) = a.y
   105  #
   106  # The top-level "= a.y" filter rejects NULL values in the b.z column, which ends
   107  # up in the right side of a LeftJoin operator, thus enabling it to be mapped to
   108  # an InnerJoin operator.
   109  #
   110  # This rule is not useful for DistinctOn: it can only fire if there are no
   111  # FirstAgg aggregates, but in that case the filter would have gotten pushed
   112  # through DistinctOn.
   113  #
   114  # This rule is marked as low priority so that it runs after Select filter
   115  # pushdown rules. If a filter can be pushed down in its entirety, that's
   116  # preferable to synthesizing a new "col IS NOT NULL" filter.
   117  [RejectNullsGroupBy, Normalize, LowPriority]
   118  (Select
   119      $input:(GroupBy | ScalarGroupBy
   120          $innerInput:*
   121          $aggregations:*
   122          $groupingPrivate:*
   123      )
   124      $filters:* &
   125          (HasNullRejectingFilter
   126              $filters
   127              $rejectCols:(RejectNullCols $input)
   128          )
   129  )
   130  =>
   131  (Select
   132      ((OpName $input)
   133          (Select
   134              $innerInput
   135              [
   136                  (FiltersItem
   137                      (IsNot
   138                          (NullRejectAggVar
   139                              $aggregations
   140                              $rejectCols
   141                          )
   142                          (Null (AnyType))
   143                      )
   144                  )
   145              ]
   146          )
   147          $aggregations
   148          $groupingPrivate
   149      )
   150      $filters
   151  )