modernc.org/cc@v1.0.1/v2/testdata/_sqlite/ext/fts3/README.syntax (about)

     1  
     2  1. OVERVIEW
     3  
     4    This README file describes the syntax of the arguments that may be passed to
     5    the FTS3 MATCH operator used for full-text queries. For example, if table 
     6    "t1" is an Fts3 virtual table, the following SQL query:
     7  
     8      SELECT * FROM t1 WHERE <col> MATCH <full-text query>
     9  
    10    may be used to retrieve all rows that match a specified for full-text query. 
    11    The text "<col>" should be replaced by either the name of the fts3 table 
    12    (in this case "t1"), or by the name of one of the columns of the fts3 
    13    table. <full-text-query> should be replaced by an SQL expression that 
    14    computes to a string containing an Fts3 query.
    15  
    16    If the left-hand-side of the MATCH operator is set to the name of the
    17    fts3 table, then by default the query may be matched against any column
    18    of the table. If it is set to a column name, then by default the query
    19    may only match the specified column. In both cases this may be overriden
    20    as part of the query text (see sections 2 and 3 below).
    21  
    22    As of SQLite version 3.6.8, Fts3 supports two slightly different query 
    23    formats; the standard syntax, which is used by default, and the enhanced
    24    query syntax which can be selected by compiling with the pre-processor
    25    symbol SQLITE_ENABLE_FTS3_PARENTHESIS defined.
    26  
    27      -DSQLITE_ENABLE_FTS3_PARENTHESIS
    28  
    29  2. STANDARD QUERY SYNTAX
    30  
    31    When using the standard Fts3 query syntax, a query usually consists of a 
    32    list of terms (words) separated by white-space characters. To match a
    33    query, a row (or column) of an Fts3 table must contain each of the specified
    34    terms. For example, the following query:
    35  
    36      <col> MATCH 'hello world'
    37  
    38    matches rows (or columns, if <col> is the name of a column name) that 
    39    contain at least one instance of the token "hello", and at least one 
    40    instance of the token "world". Tokens may be grouped into phrases using
    41    quotation marks. In this case, a matching row or column must contain each
    42    of the tokens in the phrase in the order specified, with no intervening
    43    tokens. For example, the query:
    44  
    45      <col> MATCH '"hello world" joe"
    46  
    47    matches the first of the following two documents, but not the second or
    48    third:
    49  
    50      "'Hello world', said Joe."
    51      "One should always greet the world with a cheery hello, thought Joe."
    52      "How many hello world programs could their be?"
    53  
    54    As well as grouping tokens together by phrase, the binary NEAR operator 
    55    may be used to search for rows that contain two or more specified tokens 
    56    or phrases within a specified proximity of each other. The NEAR operator
    57    must always be specified in upper case. The word "near" in lower or mixed
    58    case is treated as an ordinary token. For example, the following query:
    59  
    60      <col> MATCH 'engineering NEAR consultancy'
    61  
    62    matches rows that contain both the "engineering" and "consultancy" tokens
    63    in the same column with not more than 10 other words between them. It does
    64    not matter which of the two terms occurs first in the document, only that
    65    they be seperated by only 10 tokens or less. The user may also specify
    66    a different required proximity by adding "/N" immediately after the NEAR
    67    operator, where N is an integer. For example:
    68  
    69      <col> MATCH 'engineering NEAR/5 consultancy'
    70  
    71    searches for a row containing an instance of each specified token seperated
    72    by not more than 5 other tokens. More than one NEAR operator can be used
    73    in as sequence. For example this query:
    74  
    75      <col> MATCH 'reliable NEAR/2 engineering NEAR/5 consultancy'
    76  
    77    searches for a row that contains an instance of the token "reliable" 
    78    seperated by not more than two tokens from an instance of "engineering",
    79    which is in turn separated by not more than 5 other tokens from an
    80    instance of the term "consultancy". Phrases enclosed in quotes may
    81    also be used as arguments to the NEAR operator.
    82  
    83    Similar to the NEAR operator, one or more tokens or phrases may be 
    84    separated by OR operators. In this case, only one of the specified tokens
    85    or phrases must appear in the document. For example, the query:
    86  
    87      <col> MATCH 'hello OR world'
    88  
    89    matches rows that contain either the term "hello", or the term "world",
    90    or both. Note that unlike in many programming languages, the OR operator
    91    has a higher precedence than the AND operators implied between white-space
    92    separated tokens. The following query matches documents that contain the
    93    term 'sqlite' and at least one of the terms 'fantastic' or 'impressive',
    94    not those that contain both 'sqlite' and 'fantastic' or 'impressive':
    95  
    96      <col> MATCH 'sqlite fantastic OR impressive'
    97  
    98    Any token that is part of an Fts3 query expression, whether or not it is
    99    part of a phrase enclosed in quotes, may have a '*' character appended to
   100    it. In this case, the token matches all terms that begin with the characters
   101    of the token, not just those that exactly match it. For example, the 
   102    following query:
   103  
   104      <col> MATCH 'sql*'
   105  
   106    matches all rows that contain the term "SQLite", as well as those that
   107    contain "SQL".
   108  
   109    A token that is not part of a quoted phrase may be preceded by a '-'
   110    character, which indicates that matching rows must not contain the 
   111    specified term. For example, the following:
   112  
   113      <col> MATCH '"database engine" -sqlite'
   114  
   115    matches rows that contain the phrase "database engine" but do not contain
   116    the term "sqlite". If the '-' character occurs inside a quoted phrase,
   117    it is ignored. It is possible to use both the '-' prefix and the '*' postfix
   118    on a single term. At this time, all Fts3 queries must contain at least
   119    one term or phrase that is not preceded by the '-' prefix.
   120  
   121    Regardless of whether or not a table name or column name is used on the 
   122    left hand side of the MATCH operator, a specific column of the fts3 table
   123    may be associated with each token in a query by preceding a token with
   124    a column name followed by a ':' character. For example, regardless of what
   125    is specified for <col>, the following query requires that column "col1"
   126    of the table contains the term "hello", and that column "col2" of the
   127    table contains the term "world". If the table does not contain columns
   128    named "col1" and "col2", then an error is returned and the query is
   129    not run.
   130  
   131      <col> MATCH 'col1:hello col2:world'
   132  
   133    It is not possible to associate a specific table column with a quoted 
   134    phrase or a term preceded by a '-' operator. A '*' character may be
   135    appended to a term associated with a specific column for prefix matching.
   136  
   137  3. ENHANCED QUERY SYNTAX
   138  
   139    The enhanced query syntax is quite similar to the standard query syntax,
   140    with the following four differences:
   141  
   142    1) Parenthesis are supported. When using the enhanced query syntax,
   143       parenthesis may be used to overcome the built-in precedence of the
   144       supplied binary operators. For example, the following query:
   145  
   146         <col> MATCH '(hello world) OR (simple example)'
   147  
   148       matches documents that contain both "hello" and "world", and documents
   149       that contain both "simple" and "example". It is not possible to forumlate
   150       such a query using the standard syntax.
   151  
   152    2) Instead of separating tokens and phrases by whitespace, an AND operator
   153       may be explicitly specified. This does not change query processing at
   154       all, but may be used to improve readability. For example, the following
   155       query is handled identically to the one above:
   156  
   157         <col> MATCH '(hello AND world) OR (simple AND example)'
   158  
   159       As with the OR and NEAR operators, the AND operator must be specified
   160       in upper case. The word "and" specified in lower or mixed case is 
   161       handled as a regular token.
   162  
   163    3) The '-' token prefix is not supported. Instead, a new binary operator,
   164       NOT, is included. The NOT operator requires that the query specified
   165       as its left-hand operator matches, but that the query specified as the
   166       right-hand operator does not. For example, to query for all rows that
   167       contain the term "example" but not the term "simple", the following
   168       query could be used:
   169  
   170         <col> MATCH 'example NOT simple'
   171  
   172       As for all other operators, the NOT operator must be specified in
   173       upper case. Otherwise it will be treated as a regular token.
   174  
   175    4) Unlike in the standard syntax, where the OR operator has a higher
   176       precedence than the implicit AND operator, when using the enhanced
   177       syntax implicit and explict AND operators have a higher precedence
   178       than OR operators. Using the enhanced syntax, the following two
   179       queries are equivalent:
   180  
   181         <col> MATCH 'sqlite fantastic OR impressive'
   182         <col> MATCH '(sqlite AND fantastic) OR impressive'
   183  
   184       however, when using the standard syntax, the query:
   185  
   186         <col> MATCH 'sqlite fantastic OR impressive'
   187  
   188       is equivalent to the enhanced syntax query:
   189  
   190         <col> MATCH 'sqlite AND (fantastic OR impressive)'
   191  
   192       The precedence of all enhanced syntax operators, in order from highest
   193       to lowest, is:
   194  
   195         NEAR       (highest precedence, tightest grouping)
   196         NOT
   197         AND
   198         OR         (lowest precedence, loosest grouping)
   199  
   200    Using the advanced syntax, it is possible to specify expressions enclosed
   201    in parenthesis as operands to the NOT, AND and OR operators. However both
   202    the left and right hand side operands of NEAR operators must be either
   203    tokens or phrases. Attempting the following query will return an error:
   204  
   205      <col> MATCH 'sqlite NEAR (fantastic OR impressive)'
   206  
   207    Queries of this form must be re-written as:
   208  
   209      <col> MATCH 'sqlite NEAR fantastic OR sqlite NEAR impressive'