     1  
     2                               CQL BINARY PROTOCOL v3
     3  
     4  
     5  Table of Contents
     6  
     7    1. Overview
     8    2. Frame header
     9      2.1. version
    10      2.2. flags
    11      2.3. stream
    12      2.4. opcode
    13      2.5. length
    14    3. Notations
    15    4. Messages
    16      4.1. Requests
    17        4.1.1. STARTUP
    18        4.1.2. AUTH_RESPONSE
    19        4.1.3. OPTIONS
    20        4.1.4. QUERY
    21        4.1.5. PREPARE
    22        4.1.6. EXECUTE
    23        4.1.7. BATCH
    24        4.1.8. REGISTER
    25      4.2. Responses
    26        4.2.1. ERROR
    27        4.2.2. READY
    28        4.2.3. AUTHENTICATE
    29        4.2.4. SUPPORTED
    30        4.2.5. RESULT
    31          4.2.5.1. Void
    32          4.2.5.2. Rows
    33          4.2.5.3. Set_keyspace
    34          4.2.5.4. Prepared
    35          4.2.5.5. Schema_change
    36        4.2.6. EVENT
    37        4.2.7. AUTH_CHALLENGE
    38        4.2.8. AUTH_SUCCESS
    39    5. Compression
    40    6. Data Type Serialization Formats
    41    7. User Defined Type Serialization
    42    8. Result paging
    43    9. Error codes
    44    10. Changes from v2
    45  
    46  
    47  1. Overview
    48  
    49    The CQL binary protocol is a frame based protocol. Frames are defined as:
    50  
    51        0         8        16        24        32         40
    52        +---------+---------+---------+---------+---------+
    53        | version |  flags  |      stream       | opcode  |
    54        +---------+---------+---------+---------+---------+
    55        |                length                 |
    56        +---------+---------+---------+---------+
    57        |                                       |
    58        .            ...  body ...              .
    59        .                                       .
    60        .                                       .
    61        +----------------------------------------
    62  
    63    The protocol is big-endian (network byte order).
    64  
    65    Each frame contains a fixed size header (9 bytes) followed by a variable size
    66    body. The header is described in Section 2. The content of the body depends
    67    on the header opcode value (the body can in particular be empty for some
    68    opcode values). The list of allowed opcodes is defined in Section 2.4 and the
    69    details of each corresponding message are described in Section 4.
    70  
    71    The protocol distinguishes 2 types of frames: requests and responses. Requests
    72    are the frames sent by clients to the server; responses are the ones sent
    73    by the server. Note however that the protocol supports server pushes (events),
    74    so a response does not necessarily come right after a client request.
    75  
    76    Note to client implementors: client libraries should always assume that the
    77    body of a given frame may contain more data than what is described in this
    78    document. It will however always be safe to ignore the remainder of the frame
    79    body in such cases. The reason is that this sometimes allows the protocol to
    80    be extended with optional features without needing to change the protocol
    81    version.
    82  
    83  
    84  
    85  2. Frame header
    86  
    87  2.1. version
    88  
    89    The version is a single byte that indicates both the direction of the message
    90    (request or response) and the version of the protocol in use. The most
    91    significant bit of version defines the direction of the message: 0 indicates a
    92    request, 1 indicates a response. This can be useful for protocol analyzers to
    93    distinguish the nature of the packet from the direction in which it is moving.
    94    The rest of that byte is the protocol version (3 for the protocol defined in
    95    this document). In other words, for this version of the protocol, version will
    96    be one of:
    97      0x03    Request frame for this protocol version
    98      0x83    Response frame for this protocol version
    99  
   100    Please note that while every message ships with the version, only one version
   101    of messages is accepted on a given connection. In other words, the first message
   102    exchanged (STARTUP) sets the version for the connection for the lifetime of this
   103    connection.
   104  
   105    This document describes version 3 of the protocol. For the changes made since
   106    version 2, see Section 10.
   107  
   108  
   109  2.2. flags
   110  
   111    Flags applying to this frame. The flags have the following meaning (described
   112    by the mask that selects them):
   113      0x01: Compression flag. If set, the frame body is compressed. The actual
   114            compression to use should have been set up beforehand through the
   115            Startup message (which thus cannot be compressed; Section 4.1.1).
   116      0x02: Tracing flag. For a request frame, this indicates the client requires
   117            tracing of the request. Note that not all requests support tracing.
   118            Currently, only QUERY, PREPARE and EXECUTE queries support tracing.
   119            Other requests will simply ignore the tracing flag if set. If a
   120            request supports tracing and the tracing flag was set, the response to
   121            this request will have the tracing flag set and contain tracing
   122            information.
   123            If a response frame has the tracing flag set, its body contains
   124            a tracing ID. The tracing ID is a [uuid] and is the first thing in
   125            the frame body. The rest of the body will then be the usual body
   126            corresponding to the response opcode.
   127  
   128    The rest of the flags are currently unused and ignored.
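
          As an illustration of the tracing flag, the following Python sketch (not
          part of the protocol definition) splits an uncompressed response body into
          its tracing ID and the remaining opcode-specific body. The names
          FLAG_COMPRESSION, FLAG_TRACING and split_tracing_id are hypothetical and
          chosen only for this example.

```python
import uuid

# Flag masks from Section 2.2 (constant names are illustrative).
FLAG_COMPRESSION = 0x01
FLAG_TRACING = 0x02


def split_tracing_id(flags, body):
    """Return (tracing_id or None, rest_of_body) for an uncompressed response body.

    When the tracing flag is set, the first 16 bytes of the body are a [uuid].
    """
    if flags & FLAG_TRACING:
        return uuid.UUID(bytes=body[:16]), body[16:]
    return None, body
```

          A compressed body would have to be decompressed first (Section 5) before
          this split is applied.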
   129  
   130  2.3. stream
   131  
   132    A frame has a stream id (a [short] value, interpreted here as signed). When
   133    sending request messages, this stream id must be set by the client to a
   134    non-negative value (negative stream ids are reserved for streams initiated by
   135    the server; currently all EVENT messages (Section 4.2.6) have a stream id of
   136    -1). If a client sends a request message with the stream id X, it is
   137    guaranteed that the stream id of the response to that message will be X.
   138  
   139    This makes it possible to deal with the asynchronous nature of the protocol. If
   140    a client sends multiple messages simultaneously (without waiting for responses),
   141    there is no guarantee on the order of the responses. For instance, if the client
   142    writes REQ_1, REQ_2, REQ_3 on the wire (in that order), the server might
   143    respond to REQ_3 (or REQ_2) first. Assigning different stream ids to these 3
   144    requests allows the client to determine which request a received answer
   145    responds to. As there can only be 32768 different simultaneous streams, it is up
   146    to the client to reuse stream ids.
   147  
   148    Note that clients are free to use the protocol synchronously (i.e. wait for
   149    the response to REQ_N before sending REQ_N+1). In that case, the stream id
   150    can be safely set to 0. Clients should also feel free to use only a subset of
   151    the 32768 maximum possible stream ids if that is simpler for their
   152    implementation.
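
          One possible way for a client to manage stream ids is a small pool from
          which an id is taken for each request and returned once the matching
          response arrives. The Python sketch below is only an illustration; the
          class StreamIdPool and its methods are hypothetical and not prescribed by
          the protocol.

```python
class StreamIdPool:
    """Hands out non-negative stream ids and recycles them after use."""

    MAX_STREAMS = 32768  # ids 0..32767, per Section 2.3

    def __init__(self, size=MAX_STREAMS):
        if not 0 < size <= self.MAX_STREAMS:
            raise ValueError("size must be between 1 and 32768")
        # Reversed so that pop() hands out the lowest ids first.
        self._free = list(range(size - 1, -1, -1))
        self._in_flight = set()

    def acquire(self):
        """Reserve a stream id for a new request."""
        if not self._free:
            raise RuntimeError("no free stream id; wait for a response")
        stream_id = self._free.pop()
        self._in_flight.add(stream_id)
        return stream_id

    def release(self, stream_id):
        """Return a stream id once the response for it has been received."""
        self._in_flight.remove(stream_id)  # KeyError if the id was not in flight
        self._free.append(stream_id)
```

          A client that only needs a few concurrent requests can pass a smaller
          size, which matches the remark above about using a subset of the ids.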
   153  
   154  2.4. opcode
   155  
   156    An integer byte that distinguishes the actual message:
   157      0x00    ERROR
   158      0x01    STARTUP
   159      0x02    READY
   160      0x03    AUTHENTICATE
   161      0x05    OPTIONS
   162      0x06    SUPPORTED
   163      0x07    QUERY
   164      0x08    RESULT
   165      0x09    PREPARE
   166      0x0A    EXECUTE
   167      0x0B    REGISTER
   168      0x0C    EVENT
   169      0x0D    BATCH
   170      0x0E    AUTH_CHALLENGE
   171      0x0F    AUTH_RESPONSE
   172      0x10    AUTH_SUCCESS
   173  
   174    Messages are described in Section 4.
   175  
   176    (Note that there is no 0x04 message in this version of the protocol)
   177  
   178  
   179  2.5. length
   180  
   181    A 4 byte integer representing the length of the body of the frame (note:
   182    currently a frame is limited to 256MB in length).
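
          To make the header layout of Section 2 concrete, here is a minimal Python
          sketch that packs and unpacks the 9-byte header. The helper names
          encode_frame and decode_header are hypothetical, and reading the length as
          a signed 32-bit integer is an assumption (this document only states that
          it is a 4 byte integer).

```python
import struct

# version, flags, stream (signed 16-bit), opcode, body length; big-endian.
HEADER = struct.Struct(">BBhBi")


def encode_frame(opcode, body=b"", stream=0, flags=0, response=False):
    """Build a v3 frame: the 9-byte header followed by the body."""
    version = 0x83 if response else 0x03
    return HEADER.pack(version, flags, stream, opcode, len(body)) + body


def decode_header(data):
    """Split the first 9 bytes of a frame into named fields."""
    version, flags, stream, opcode, length = HEADER.unpack_from(data, 0)
    return {
        "is_response": bool(version & 0x80),  # most significant bit: direction
        "protocol_version": version & 0x7F,   # remaining 7 bits: version
        "flags": flags,
        "stream": stream,
        "opcode": opcode,
        "length": length,
    }


frame = encode_frame(0x05, stream=1)  # OPTIONS request with an empty body
header = decode_header(frame)
```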
   183  
   184  
   185  3. Notations
   186  
   187    To describe the layout of the frame body for the messages in Section 4, we
   188    define the following:
   189  
   190      [int]          A 4-byte signed integer
   191      [long]         An 8-byte signed integer
   192      [short]        A 2-byte unsigned integer
   193      [string]       A [short] n, followed by n bytes representing a UTF-8
   194                     string.
   195      [long string]  An [int] n, followed by n bytes representing a UTF-8 string.
   196      [uuid]         A 16-byte uuid.
   197      [string list]  A [short] n, followed by n [string].
   198      [bytes]        An [int] n, followed by n bytes if n >= 0. If n < 0,
   199                     no byte should follow and the value represented is `null`.
   200      [short bytes]  A [short] n, followed by n bytes if n >= 0.
   201  
   202      [option]       A pair of <id><value> where <id> is a [short] representing
   203                     the option id and <value> depends on that option (and can be
   204                     of size 0). The supported id (and the corresponding <value>)
   205                     will be described when this is used.
   206      [option list]  A [short] n, followed by n [option].
   207      [inet]         An address (IP and port) of a node. It consists of one
   208                     [byte] n, that represents the address size, followed by n
   209                     [byte] representing the IP address (in practice n can only be
   210                     either 4 (IPv4) or 16 (IPv6)), followed by one [int]
   211                     representing the port.
   212      [consistency]  A consistency level specification. This is a [short]
   213                     representing a consistency level with the following
   214                     correspondence:
   215                       0x0000    ANY
   216                       0x0001    ONE
   217                       0x0002    TWO
   218                       0x0003    THREE
   219                       0x0004    QUORUM
   220                       0x0005    ALL
   221                       0x0006    LOCAL_QUORUM
   222                       0x0007    EACH_QUORUM
   223                       0x0008    SERIAL
   224                       0x0009    LOCAL_SERIAL
   225                       0x000A    LOCAL_ONE
   226  
   227      [string map]      A [short] n, followed by n pairs <k><v> where <k> and <v>
   228                        are [string].
   229      [string multimap] A [short] n, followed by n pairs <k><v> where <k> is a
   230                        [string] and <v> is a [string list].
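
          The notations above map directly onto a handful of encoders. The following
          Python sketch covers a few of them; all function names (enc_short,
          enc_string and so on) are hypothetical and only serve this example.

```python
import struct


def enc_short(n):
    """[short]: 2-byte unsigned integer, big-endian."""
    return struct.pack(">H", n)


def enc_int(n):
    """[int]: 4-byte signed integer, big-endian."""
    return struct.pack(">i", n)


def enc_string(s):
    """[string]: a [short] length followed by the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return enc_short(len(raw)) + raw


def enc_long_string(s):
    """[long string]: an [int] length followed by the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return enc_int(len(raw)) + raw


def enc_bytes(value):
    """[bytes]: an [int] length followed by the bytes; None encodes `null`."""
    if value is None:
        return enc_int(-1)
    return enc_int(len(value)) + value


def enc_string_list(items):
    """[string list]: a [short] count followed by that many [string]."""
    return enc_short(len(items)) + b"".join(enc_string(i) for i in items)


def enc_string_map(mapping):
    """[string map]: a [short] count followed by <k><v> [string] pairs."""
    pairs = b"".join(enc_string(k) + enc_string(v) for k, v in mapping.items())
    return enc_short(len(mapping)) + pairs
```

          Note that the lengths count encoded bytes, not characters, which matters
          for non-ASCII strings.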
   231  
   232  
   233  4. Messages
   234  
   235  4.1. Requests
   236  
   237    Note that outside of their normal responses (described below), all requests
   238    can get an ERROR message (Section 4.2.1) as response.
   239  
   240  4.1.1. STARTUP
   241  
   242    Initialize the connection. The server will respond by either a READY message
   243    (in which case the connection is ready for queries) or an AUTHENTICATE message
   244    (in which case credentials will need to be provided using AUTH_RESPONSE).
   245  
   246    This must be the first message of the connection, except for OPTIONS that can
   247    be sent before to find out the options supported by the server. Once the
   248    connection has been initialized, a client should not send any more STARTUP
   249    messages.
   250  
   251    The body is a [string map] of options. Possible options are:
   252      - "CQL_VERSION": the version of CQL to use. This option is mandatory and
   253        currently, the only version supported is "3.0.0". Note that this is
   254        different from the protocol version.
   255      - "COMPRESSION": the compression algorithm to use for frames (see Section 5).
   256        This is optional; if not specified, no compression will be used.
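
          For illustration, a complete STARTUP frame could be assembled as in the
          Python sketch below. The function startup_frame and its parameters are
          hypothetical; the compression value, when given, is passed through as is
          and must be one of the algorithm names accepted by the server (Section 5).

```python
import struct


def _string(s):
    """Encode a [string]: [short] length followed by UTF-8 bytes."""
    raw = s.encode("utf-8")
    return struct.pack(">H", len(raw)) + raw


def startup_frame(stream=0, compression=None):
    """Build a full v3 STARTUP request frame (header + [string map] body)."""
    options = {"CQL_VERSION": "3.0.0"}
    if compression is not None:
        options["COMPRESSION"] = compression
    body = struct.pack(">H", len(options))
    for key, value in options.items():
        body += _string(key) + _string(value)
    # STARTUP itself cannot be compressed, so the flags byte stays 0.
    header = struct.pack(">BBhBi", 0x03, 0x00, stream, 0x01, len(body))
    return header + body
```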
   257  
   258  
   259  4.1.2. AUTH_RESPONSE
   260  
   261    Answers a server authentication challenge.
   262  
   263    Authentication in the protocol is SASL based. The server sends authentication
   264    challenges (a bytes token) to which the client answers with this message. Those
   265    exchanges continue until the server accepts the authentication by sending an
   266    AUTH_SUCCESS message after a client AUTH_RESPONSE. It is however the client that
   267    initiates the exchange by sending an initial AUTH_RESPONSE in response to a
   268    server AUTHENTICATE request.
   269  
   270    The body of this message is a single [bytes] token. The details of what this
   271    token contains (and when it can be null/empty, if ever) depend on the actual
   272    authenticator used.
   273  
   274    The response to an AUTH_RESPONSE is either a follow-up AUTH_CHALLENGE message,
   275    an AUTH_SUCCESS message or an ERROR message.
   276  
   277  
   278  4.1.3. OPTIONS
   279  
   280    Asks the server to return what STARTUP options are supported. The body of an
   281    OPTIONS message should be empty and the server will respond with a SUPPORTED
   282    message.
   283  
   284  
   285  4.1.4. QUERY
   286  
   287    Performs a CQL query. The body of the message must be:
   288      <query><query_parameters>
   289    where <query> is a [long string] representing the query and
   290    <query_parameters> must be
   291      <consistency><flags>[<n>[name_1]<value_1>...[name_n]<value_n>][<result_page_size>][<paging_state>][<serial_consistency>][<timestamp>]
   292    where:
   293      - <consistency> is the [consistency] level for the operation.
   294      - <flags> is a [byte] whose bits define the options for this query and
   295        in particular influence what the remainder of the message contains.
   296        A flag is set if the bit corresponding to its `mask` is set. Supported
   297        flags are, given their mask:
   298          0x01: Values. In that case, a [short] <n> followed by <n> [bytes]
   299                values are provided. Those values are used for bound variables in
   300                the query. Optionally, if the 0x40 flag is present, each value
   301                will be preceded by a [string] name, representing the name of
   302                the marker the value must be bound to. This is optional, and
   303                if not present, values will be bound by position.
   304          0x02: Skip_metadata. If present, the Result Set returned as a response
   305                to that query (if any) will have the NO_METADATA flag (see
   306                Section 4.2.5.2).
   307          0x04: Page_size. In that case, <result_page_size> is an [int]
   308                controlling the desired page size of the result (in CQL3 rows).
   309                See the section on paging (Section 8) for more details.
   310          0x08: With_paging_state. If present, <paging_state> should be present.
   311                <paging_state> is a [bytes] value that should have been returned
   312                in a result set (Section 4.2.5.2). If provided, the query will be
   313                executed but starting from a given paging state. This also allows
   314                paging to continue on a different node from the one on which it
   315                was started (see Section 8 for more details).
   316          0x10: With serial consistency. If present, <serial_consistency> should be
   317                present. <serial_consistency> is the [consistency] level for the
   318                serial phase of conditional updates. That consistency can only be
   319                either SERIAL or LOCAL_SERIAL and if not present, it defaults to
   320                SERIAL. This option will be ignored for anything other than a
   321                conditional update/insert.
   322          0x20: With default timestamp. If present, <timestamp> should be present.
   323                <timestamp> is a [long] representing the default timestamp for the query
   324                in microseconds (negative values are discouraged but supported for
   325                backward compatibility reasons except for the smallest negative
   326                value (-2^63) that is forbidden). If provided, this will
   327                replace the server side assigned timestamp as default timestamp.
   328                Note that a timestamp in the query itself will still override
   329                this timestamp. This is entirely optional.
   330          0x40: With names for values. This only makes sense if the 0x01 flag is set and
   331                is ignored otherwise. If present, the values from the 0x01 flag will
   332                be preceded by a name (see above). Note that this is only useful for
   333                QUERY requests where named bind markers are used; for EXECUTE statements,
   334                since the names for the expected values were returned during preparation,
   335                a client can always provide values in the right order without any names
   336                and using this flag, while supported, is almost surely inefficient.
   337  
   338    Note that the consistency is ignored by some queries (USE, CREATE, ALTER,
   339    TRUNCATE, ...).
   340  
   341    The server will respond to a QUERY message with a RESULT message, the content
   342    of which depends on the query.
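
          The way <flags> drives the optional parts of <query_parameters> can be
          sketched as follows in Python. The function query_body, its parameters and
          the constant names are hypothetical; only positional values, the page size
          and the paging state are handled here, in the order given above.

```python
import struct

CONSISTENCY_ONE = 0x0001
FLAG_VALUES = 0x01
FLAG_PAGE_SIZE = 0x04
FLAG_PAGING_STATE = 0x08


def _bytes(value):
    """Encode a [bytes] value; None is encoded as `null` (length -1)."""
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


def query_body(cql, consistency=CONSISTENCY_ONE, values=(),
               page_size=None, paging_state=None):
    """Build the body of a QUERY message: <query><query_parameters>."""
    raw = cql.encode("utf-8")
    flags = 0
    tail = b""
    if values:
        flags |= FLAG_VALUES
        tail += struct.pack(">H", len(values))
        tail += b"".join(_bytes(v) for v in values)
    if page_size is not None:
        flags |= FLAG_PAGE_SIZE
        tail += struct.pack(">i", page_size)
    if paging_state is not None:
        flags |= FLAG_PAGING_STATE
        tail += _bytes(paging_state)
    head = struct.pack(">i", len(raw)) + raw        # [long string] query
    head += struct.pack(">HB", consistency, flags)  # <consistency><flags>
    return head + tail
```

          The values are expected to be already serialized according to the type of
          each bind marker (Section 6).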
   343  
   344  
   345  4.1.5. PREPARE
   346  
   347    Prepare a query for later execution (through EXECUTE). The body consists of
   348    the CQL query to prepare as a [long string].
   349  
   350    The server will respond with a RESULT message with a `prepared` kind (0x0004,
   351    see Section 4.2.5).
   352  
   353  
   354  4.1.6. EXECUTE
   355  
   356    Executes a prepared query. The body of the message must be:
   357      <id><query_parameters>
   358    where <id> is the prepared query ID. It's the [short bytes] returned as a
   359    response to a PREPARE message. As for <query_parameters>, it has the exact
   360    same definition as in QUERY (see Section 4.1.4).
   361  
   362    The response from the server will be a RESULT message.
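
          As a small illustration, the body of an EXECUTE message differs from QUERY
          only in its first element. The Python sketch below uses the hypothetical
          function execute_body and supports positional values only.

```python
import struct


def execute_body(prepared_id, values=(), consistency=0x0001):
    """Build the body of an EXECUTE message: <id><query_parameters>."""
    flags = 0x01 if values else 0x00                  # 0x01: Values
    out = struct.pack(">H", len(prepared_id)) + prepared_id  # [short bytes] id
    out += struct.pack(">HB", consistency, flags)
    if values:
        out += struct.pack(">H", len(values))
        for value in values:
            if value is None:
                out += struct.pack(">i", -1)          # null [bytes]
            else:
                out += struct.pack(">i", len(value)) + value
    return out
```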
   363  
   364  
   365  4.1.7. BATCH
   366  
   367    Allows executing a list of queries (prepared or not) as a batch (note that
   368    only DML statements are accepted in a batch). The body of the message must
   369    be:
   370      <type><n><query_1>...<query_n><consistency><flags>[<serial_consistency>][<timestamp>]
   371    where:
   372      - <type> is a [byte] indicating the type of batch to use:
   373          - If <type> == 0, the batch will be "logged". This is equivalent to a
   374            normal CQL3 batch statement.
   375          - If <type> == 1, the batch will be "unlogged".
   376          - If <type> == 2, the batch will be a "counter" batch (and non-counter
   377            statements will be rejected).
   378      - <flags> is a [byte] whose bits define the options for this query and
   379        in particular influence what the remainder of the message contains. It is
   380        similar to the <flags> from QUERY and EXECUTE methods, except that the 4
   381        rightmost bits must always be 0 as their corresponding options do not make
   382        sense for Batch. A flag is set if the bit corresponding to its `mask` is set.
   383        Supported flags are, given their mask:
   384          0x10: With serial consistency. If present, <serial_consistency> should be
   385                present. <serial_consistency> is the [consistency] level for the
   386                serial phase of conditional updates. That consistency can only be
   387                either SERIAL or LOCAL_SERIAL and if not present, it defaults to
   388                SERIAL. This option will be ignored for anything other than a
   389                conditional update/insert.
   390          0x20: With default timestamp. If present, <timestamp> should be present.
   391                <timestamp> is a [long] representing the default timestamp for the query
   392                in microseconds. If provided, this will replace the server side assigned
   393                timestamp as default timestamp. Note that a timestamp in the query itself
   394                will still override this timestamp. This is entirely optional.
   395          0x40: With names for values. If set, then all values for all <query_i> must be
   396                preceded by a [string] <name_i> that has the same meaning as in QUERY
   397                requests [IMPORTANT NOTE: this feature does not work and should not be
   398                used. It is specified in a way that makes it impossible for the server
   399                to implement. This will be fixed in a future version of the native
   400                protocol. See https://issues.apache.org/jira/browse/CASSANDRA-10246 for
   401                more details].
   402      - <n> is a [short] indicating the number of following queries.
   403      - <query_1>...<query_n> are the queries to execute. A <query_i> must be of the
   404        form:
   405          <kind><string_or_id><n>[<name_1>]<value_1>...[<name_n>]<value_n>
   406        where:
   407         - <kind> is a [byte] indicating whether the following query is a prepared
   408           one or not. <kind> value must be either 0 or 1.
   409         - <string_or_id> depends on the value of <kind>. If <kind> == 0, it should be
   410           a [long string] query string (as in QUERY, the query string might contain
   411           bind markers). Otherwise (that is, if <kind> == 1), it should be a
   412           [short bytes] representing a prepared query ID.
   413         - <n> is a [short] indicating the number (possibly 0) of following values.
   414         - <name_i> is the optional name of the following <value_i>. It must be present
   415           if and only if the 0x40 flag is provided for the batch.
   416         - <value_i> is the [bytes] to use for bound variable i (or bound variable <name_i>
   417           if the 0x40 flag is used).
   418      - <consistency> is the [consistency] level for the operation.
   419      - <serial_consistency> is only present if the 0x10 flag is set. In that case,
   420        <serial_consistency> is the [consistency] level for the serial phase of
   421        conditional updates. That consistency can only be either SERIAL or
   422        LOCAL_SERIAL and if not present will default to SERIAL. This option will
   423        be ignored for anything other than a conditional update/insert.
   424  
   425    The server will respond with a RESULT message.
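
          The BATCH layout can be sketched in Python as below. The function
          batch_body is hypothetical; it leaves <flags> at 0 (no serial consistency,
          no default timestamp) and never uses the 0x40 flag, in line with the note
          above. A statement given as a str is sent as a query string (<kind> 0) and
          one given as bytes is treated as a prepared query ID (<kind> 1).

```python
import struct


def batch_body(queries, batch_type=0, consistency=0x0001):
    """Build the body of a BATCH message.

    queries is a list of (statement, values) pairs, where values is a list of
    already serialized [bytes] values (or None for null).
    """
    out = struct.pack(">BH", batch_type, len(queries))   # <type><n>
    for statement, values in queries:
        if isinstance(statement, str):
            raw = statement.encode("utf-8")
            out += struct.pack(">Bi", 0, len(raw)) + raw  # kind 0 + [long string]
        else:
            out += struct.pack(">BH", 1, len(statement)) + statement  # kind 1 + id
        out += struct.pack(">H", len(values))
        for value in values:
            if value is None:
                out += struct.pack(">i", -1)
            else:
                out += struct.pack(">i", len(value)) + value
    return out + struct.pack(">HB", consistency, 0x00)    # <consistency><flags>
```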
   426  
   427  
   428  4.1.8. REGISTER
   429  
   430    Register this connection to receive some types of events. The body of the
   431    message is a [string list] representing the event types to register for. See
   432    Section 4.2.6 for the list of valid event types.
   433  
   434    The response to a REGISTER message will be a READY message.
   435  
   436    Please note that if a client driver maintains multiple connections to a
   437    Cassandra node and/or connections to multiple nodes, it is advised to
   438    dedicate a handful of connections to receive events, but to *not* register
   439    for events on all connections, as this would only result in receiving
   440    the same event messages multiple times, wasting bandwidth.
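
          A REGISTER body is just a [string list], as the following Python sketch
          shows. The function register_body is hypothetical, and the event type used
          in the example is only a sample; the valid names are those listed in
          Section 4.2.6.

```python
import struct


def register_body(event_types):
    """Build the body of a REGISTER message: a [string list] of event types."""
    out = struct.pack(">H", len(event_types))
    for name in event_types:
        raw = name.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw
    return out


body = register_body(["STATUS_CHANGE"])
```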
   441  
   442  
   443  4.2. Responses
   444  
   445    This section describes the content of the frame body for the different
   446    responses. Please note that to make room for future evolution, clients should
   447    tolerate extra information (that they should simply discard) at the end of the
   448    frame body, beyond what is described in this document.
   449  
   450  4.2.1. ERROR
   451  
   452    Indicates an error processing a request. The body of the message will be an
   453    error code ([int]) followed by a [string] error message. Then, depending on
   454    the exception, more content may follow. The error codes are defined in
   455    Section 9, along with their additional content if any.
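
          Reading the common part of an ERROR body can be sketched in Python as
          follows. The function parse_error is hypothetical; it returns any trailing
          bytes untouched, since their layout depends on the error code (Section 9).

```python
import struct


def parse_error(body):
    """Return (code, message, extra) from the body of an ERROR message."""
    (code,) = struct.unpack_from(">i", body, 0)      # [int] error code
    (length,) = struct.unpack_from(">H", body, 4)    # [string] length prefix
    message = body[6:6 + length].decode("utf-8")
    return code, message, body[6 + length:]
```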
   456  
   457  
   458  4.2.2. READY
   459  
   460    Indicates that the server is ready to process queries. This message will be
   461    sent by the server after a STARTUP message if no authentication is required
   462    (a successful authentication exchange ends with AUTH_SUCCESS instead).
   463  
   464    The body of a READY message is empty.
   465  
   466  
   467  4.2.3. AUTHENTICATE
   468  
   469    Indicates that the server requires authentication, and which authentication
   470    mechanism to use.
   471  
   472    The authentication is SASL based and thus consists of a number of server
   473    challenges (AUTH_CHALLENGE, Section 4.2.7) followed by client responses
   474    (AUTH_RESPONSE, Section 4.1.2). The initial exchange is however bootstrapped
   475    by an initial client response. The details of that exchange (including how
   476    many challenge-response pairs are required) are specific to the authenticator
   477    in use. The exchange ends when the server sends an AUTH_SUCCESS message or
   478    an ERROR message.
   479  
   480    This message will be sent following a STARTUP message if authentication is
   481    required and must be answered by an AUTH_RESPONSE message from the client.
   482  
   483    The body consists of a single [string] indicating the full class name of the
   484    IAuthenticator in use.
   485  
   486  
   487  4.2.4. SUPPORTED
   488  
   489    Indicates which startup options are supported by the server. This message
   490    comes as a response to an OPTIONS message.
   491  
   492    The body of a SUPPORTED message is a [string multimap]. This multimap gives,
   493    for each of the supported STARTUP options, the list of supported values.
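
          Decoding the [string multimap] of a SUPPORTED body can be sketched in
          Python as below. The function parse_string_multimap is hypothetical, and
          the option values appearing in any sample input are illustrative only.

```python
import struct


def parse_string_multimap(body):
    """Decode a [string multimap] into a dict mapping str to a list of str."""
    pos = 0

    def read_short():
        nonlocal pos
        (n,) = struct.unpack_from(">H", body, pos)
        pos += 2
        return n

    def read_string():
        nonlocal pos
        n = read_short()
        text = body[pos:pos + n].decode("utf-8")
        pos += n
        return text

    result = {}
    for _ in range(read_short()):
        key = read_string()
        count = read_short()
        result[key] = [read_string() for _ in range(count)]
    return result
```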
   494  
   495  
   496  4.2.5. RESULT
   497  
   498    The result to a query (QUERY, PREPARE, EXECUTE or BATCH messages).
   499  
   500    The first element of the body of a RESULT message is an [int] representing the
   501    `kind` of result. The rest of the body depends on the kind. The kind can be
   502    one of:
   503      0x0001    Void: for results carrying no information.
   504      0x0002    Rows: for results to select queries, returning a set of rows.
   505      0x0003    Set_keyspace: the result to a `use` query.
   506      0x0004    Prepared: result to a PREPARE message.
   507      0x0005    Schema_change: the result to a schema altering query.
   508  
   509    The body for each kind (after the [int] kind) is defined below.
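
          A client typically reads the kind first and then dispatches to a parser
          for the rest of the body. A minimal Python sketch, with the hypothetical
          names RESULT_KINDS and split_result:

```python
import struct

# Result kinds from Section 4.2.5.
RESULT_KINDS = {
    0x0001: "Void",
    0x0002: "Rows",
    0x0003: "Set_keyspace",
    0x0004: "Prepared",
    0x0005: "Schema_change",
}


def split_result(body):
    """Return (kind name, rest of body) for the body of a RESULT message."""
    (kind,) = struct.unpack_from(">i", body, 0)
    return RESULT_KINDS.get(kind, "unknown"), body[4:]
```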
   510  
   511  
   512  4.2.5.1. Void
   513  
   514    The rest of the body for a Void result is empty. It indicates that a query was
   515    successful without providing more information.
   516  
   517  
   518  4.2.5.2. Rows
   519  
   520    Indicates a set of rows. The rest of the body of a Rows result is:
   521      <metadata><rows_count><rows_content>
   522    where:
   523      - <metadata> is composed of:
   524          <flags><columns_count>[<paging_state>][<global_table_spec>?<col_spec_1>...<col_spec_n>]
   525        where:
   526          - <flags> is an [int]. The bits of <flags> provide information on the
   527            formatting of the remaining information. A flag is set if the bit
   528            corresponding to its `mask` is set. Supported flags are, given their
   529            mask:
   530              0x0001    Global_tables_spec: if set, only one table spec (keyspace
   531                        and table name) is provided as <global_table_spec>. If not
   532                        set, <global_table_spec> is not present.
   533              0x0002    Has_more_pages: indicates that this is not the last
   534                        page of results and more should be retrieved. If set, the
   535                        <paging_state> will be present. The <paging_state> is a
   536                        [bytes] value that should be used in QUERY/EXECUTE to
   537                        continue paging and retrieve the remainder of the result for
   538                        this query (see Section 8 for more details).
   539              0x0004    No_metadata: if set, the <metadata> is only composed of
   540                        these <flags>, the <columns_count> and optionally the
   541                        <paging_state> (depending on the Has_more_pages flag) but
   542                        no other information (so no <global_table_spec> nor <col_spec_i>).
   543                        This will only ever be the case if this was requested
   544                        during the query (see QUERY and RESULT messages).
   545          - <columns_count> is an [int] representing the number of columns selected
   546            by the query this result is for. It defines the number of <col_spec_i>
   547            elements and the number of elements for each row in <rows_content>.
   548          - <global_table_spec> is present if the Global_tables_spec is set in
   549            <flags>. If present, it is composed of two [string] representing the
   550            (unique) keyspace name and table name the returned columns belong to.
   551          - <col_spec_i> specifies the columns returned in the query. There are
   552            <columns_count> such column specifications, each composed of:
   553              (<ksname><tablename>)?<name><type>
   554            The initial <ksname> and <tablename> are two [string] that are only
   555            present if the Global_tables_spec flag is not set. The <name> is a
   556            [string] and <type> is an [option]; they give the description and the
   557            type of the corresponding result. What the description is depends a bit
   558            on the context: in results to selects, it will be either the user chosen
   559            alias or the selection used (often a column name, but it can be a
   560            function call too). In results to a PREPARE, it will be either the name
   561            of the corresponding bind variable or the column name for the variable if
   562            it is "anonymous". The option for <type> is either a native
   563            type (see below), in which case the option has no value, or a
   564            'custom' type, in which case the value is a [string] representing
   565            the fully qualified class name of the type represented. Valid option
   566            ids are:
   567              0x0000    Custom: the value is a [string], see above.
   568              0x0001    Ascii
   569              0x0002    Bigint
   570              0x0003    Blob
   571              0x0004    Boolean
   572              0x0005    Counter
   573              0x0006    Decimal
   574              0x0007    Double
   575              0x0008    Float
   576              0x0009    Int
   577              0x000B    Timestamp
   578              0x000C    Uuid
   579              0x000D    Varchar
   580              0x000E    Varint
   581              0x000F    Timeuuid
   582              0x0010    Inet
   583              0x0020    List: the value is an [option], representing the type
   584                              of the elements of the list.
   585              0x0021    Map: the value is two [option], representing the types of the
   586                             keys and values of the map
   587              0x0022    Set: the value is an [option], representing the type
   588                              of the elements of the set
   589              0x0030    UDT: the value is <ks><udt_name><n><name_1><type_1>...<name_n><type_n>
   590                             where:
   591                                - <ks> is a [string] representing the keyspace name this
   592                                  UDT is part of.
   593                                - <udt_name> is a [string] representing the UDT name.
   594                                - <n> is a [short] representing the number of fields of
   595                                  the UDT, and thus the number of <name_i><type_i> pairs
   596                                  following.
   597                                - <name_i> is a [string] representing the name of the
   598                                  i_th field of the UDT.
   599                                - <type_i> is an [option] representing the type of the
   600                                  i_th field of the UDT.
   601              0x0031    Tuple: the value is <n><type_1>...<type_n> where <n> is a [short]
   602                               representing the number of values in the type, and <type_i>
   603                               are [option] representing the type of the i_th component
   604                               of the tuple
   605  
   606      - <rows_count> is an [int] representing the number of rows present in this
   607        result. Those rows are serialized in the <rows_content> part.
   608      - <rows_content> is composed of <row_1>...<row_m> where m is <rows_count>.
   609        Each <row_i> is composed of <value_1>...<value_n> where n is
   610        <columns_count> and where <value_j> is a [bytes] representing the value
   611        returned for the jth column of the ith row. In other words, <rows_content>
   612        is composed of (<rows_count> * <columns_count>) [bytes].
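
          The <type> [option] described above is recursive (collections, UDTs and
          tuples embed further options). A minimal sketch of a decoder follows. It
          assumes the Section 3 notations: a [short] is a 2-byte unsigned big-endian
          integer and a [string] is a [short] length followed by UTF-8 bytes. The
          names read_short, read_string and read_option are hypothetical helpers,
          not part of the protocol.

```python
import struct

# Option ids listed above that carry no value (native types).
NATIVE = {
    0x0001: "ascii", 0x0002: "bigint", 0x0003: "blob", 0x0004: "boolean",
    0x0005: "counter", 0x0006: "decimal", 0x0007: "double", 0x0008: "float",
    0x0009: "int", 0x000B: "timestamp", 0x000C: "uuid", 0x000D: "varchar",
    0x000E: "varint", 0x000F: "timeuuid", 0x0010: "inet",
}

def read_short(buf, pos):
    # [short]: 2-byte unsigned big-endian integer (assumed from Section 3).
    return struct.unpack_from(">H", buf, pos)[0], pos + 2

def read_string(buf, pos):
    # [string]: a [short] length n followed by n UTF-8 bytes (assumed from Section 3).
    n, pos = read_short(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n

def read_option(buf, pos):
    """Decode one <type> [option]; returns (description, new_position)."""
    opt_id, pos = read_short(buf, pos)
    if opt_id in NATIVE:
        return NATIVE[opt_id], pos
    if opt_id == 0x0000:  # Custom: the value is the class name
        name, pos = read_string(buf, pos)
        return ("custom", name), pos
    if opt_id in (0x0020, 0x0022):  # List / Set: one element type
        elem, pos = read_option(buf, pos)
        return ("list" if opt_id == 0x0020 else "set", elem), pos
    if opt_id == 0x0021:  # Map: key type then value type
        key, pos = read_option(buf, pos)
        val, pos = read_option(buf, pos)
        return ("map", key, val), pos
    if opt_id == 0x0030:  # UDT: <ks><udt_name><n><name_1><type_1>...
        ks, pos = read_string(buf, pos)
        udt_name, pos = read_string(buf, pos)
        n, pos = read_short(buf, pos)
        fields = []
        for _ in range(n):
            fname, pos = read_string(buf, pos)
            ftype, pos = read_option(buf, pos)
            fields.append((fname, ftype))
        return ("udt", ks, udt_name, fields), pos
    if opt_id == 0x0031:  # Tuple: <n><type_1>...<type_n>
        n, pos = read_short(buf, pos)
        items = []
        for _ in range(n):
            item, pos = read_option(buf, pos)
            items.append(item)
        return ("tuple", items), pos
    raise ValueError("unknown option id 0x%04X" % opt_id)
```

          For instance, the bytes 0x0021 0x000D 0x0020 0x0009 would decode with this
          sketch as a map from varchar to list of int.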
   613  
   614  
   615  4.2.5.3. Set_keyspace
   616  
   617    The result to a `use` query. The body (after the kind [int]) is a single
   618    [string] indicating the name of the keyspace that has been set.
   619  
   620  
   621  4.2.5.4. Prepared
   622  
   623    The result to a PREPARE message. The rest of the body of a Prepared result is:
   624      <id><metadata><result_metadata>
   625    where:
   626      - <id> is [short bytes] representing the prepared query ID.
   627      - <metadata> is defined exactly as for a Rows RESULT (see Section 4.2.5.2; you
   628        can however assume that the Has_more_pages flag is always off) and
   629        is the specification for the variables bound in this prepared statement.
   630      - <result_metadata> is defined exactly as <metadata> but corresponds to the
   631        metadata for the result set that executing this query will yield. Note that
   632        <result_metadata> may be empty (have the No_metadata flag and 0 columns, see
   633        Section 4.2.5.2) and will be for any query that is not a Select. There is
   634        in fact never a guarantee that this will be non-empty, so clients should
   635        protect themselves accordingly. The presence of this information is an
   636        optimization that allows the prepared statement to be executed later
   637        without requesting the metadata (Skip_metadata flag in EXECUTE).
   638        Clients can safely discard this metadata if they do not want to take
   639        advantage of that optimization.
   640  
   641    Note that the prepared query ID returned is global to the node on which the
   642    query has been prepared. It can be used on any connection to that node until
   643    the node is restarted (after which the query must be reprepared).
   644  
   645  4.2.5.5. Schema_change
   646  
   647    The result to a schema altering query (creation/update/drop of a
   648    keyspace/table/index). The body (after the kind [int]) is the same
   649    as the body for a "SCHEMA_CHANGE" event, so 3 or 4 strings depending on <target>:
   650      <change_type><target><options>
   651    Please refer to Section 4.2.6 below for the meaning of those fields.
   652  
   653    Note that queries to create and drop an index are considered to be changes
   654    updating the table the index is on.
   655  
   656  
   657  4.2.6. EVENT
   658  
   659    An event pushed by the server. A client will only receive events for the
   660    types it has REGISTERed for. The body of an EVENT message starts with a
   661    [string] representing the event type. The rest of the message depends on the
   662    event type. The valid event types are:
   663      - "TOPOLOGY_CHANGE": events related to change in the cluster topology.
   664        Currently, events are sent when new nodes are added to the cluster, and
   665        when nodes are removed. The body of the message (after the event type)
   666        consists of a [string] and an [inet], corresponding respectively to the
   667        type of change ("NEW_NODE", "REMOVED_NODE", or "MOVED_NODE") followed
   668        by the address of the new/removed/moved node.
   669      - "STATUS_CHANGE": events related to change of node status. Currently,
   670        up/down events are sent. The body of the message (after the event type)
   671        consists of a [string] and an [inet], corresponding respectively to the
   672        type of status change ("UP" or "DOWN") followed by the address of the
   673        concerned node.
   674      - "SCHEMA_CHANGE": events related to schema change. After the event type,
   675        the rest of the message will be <change_type><target><options> where:
   676          - <change_type> is a [string] representing the type of change involved.
   677            It will be one of "CREATED", "UPDATED" or "DROPPED".
   678          - <target> is a [string] that can be one of "KEYSPACE", "TABLE" or "TYPE"
   679            and describes what has been modified ("TYPE" stands for modifications
   680            related to user types).
   681          - <options> depends on the preceding <target>. If <target> is
   682            "KEYSPACE", then <options> will be a single [string] representing the
   683            keyspace changed. Otherwise, if <target> is "TABLE" or "TYPE", then
   684            <options> will be 2 [string]: the first one will be the keyspace
   685            containing the affected object, and the second one will be the name
   686            of said affected object (so either the table name or the user type
   687            name).
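
          The three event bodies described above can be decoded along the following
          lines. This is a sketch only: it assumes the Section 3 notations ([string]
          as a [short] length plus UTF-8 bytes; [inet] as one size byte n, n address
          bytes, then an [int] port), and parse_event and its dictionary keys are
          hypothetical names chosen for illustration.

```python
import ipaddress
import struct

def read_string(buf, pos):
    # [string]: [short] length n followed by n UTF-8 bytes (assumed from Section 3).
    (n,) = struct.unpack_from(">H", buf, pos)
    return buf[pos + 2:pos + 2 + n].decode("utf-8"), pos + 2 + n

def read_inet(buf, pos):
    # [inet]: size byte n (4 or 16), n address bytes, [int] port (assumed from Section 3).
    n = buf[pos]
    addr = ipaddress.ip_address(bytes(buf[pos + 1:pos + 1 + n]))
    (port,) = struct.unpack_from(">i", buf, pos + 1 + n)
    return (str(addr), port), pos + 1 + n + 4

def parse_event(body):
    """Decode the body of an EVENT message into a dictionary."""
    event_type, pos = read_string(body, 0)
    if event_type in ("TOPOLOGY_CHANGE", "STATUS_CHANGE"):
        change, pos = read_string(body, pos)
        node, pos = read_inet(body, pos)
        return {"type": event_type, "change": change, "node": node}
    if event_type == "SCHEMA_CHANGE":
        change, pos = read_string(body, pos)
        target, pos = read_string(body, pos)
        keyspace, pos = read_string(body, pos)
        name = None
        if target in ("TABLE", "TYPE"):
            # <options> carries a second [string]: the table or user type name.
            name, pos = read_string(body, pos)
        return {"type": event_type, "change": change, "target": target,
                "keyspace": keyspace, "name": name}
    raise ValueError("unknown event type: " + event_type)
```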
   688  
   689    All EVENT messages have a streamId of -1 (Section 2.3).
   690  
   691    Please note that "NEW_NODE" and "UP" events are sent based on internal Gossip
   692    communication and as such may be sent a short time before the binary
   693    protocol server on the newly up node is fully started. Clients are thus
   694    advised to wait a short time before trying to connect to the node (1 second
   695    should be enough), otherwise they may experience a connection refusal at
   696    first.
   697  
   698    It is possible for the same event to be sent multiple times. Therefore,
   699    a client library should ignore the same event if it has already been notified
   700    of a change.
   701  
   702  4.2.7. AUTH_CHALLENGE
   703  
   704    A server authentication challenge (see AUTH_RESPONSE (Section 4.1.2) for more
   705    details).
   706  
   707    The body of this message is a single [bytes] token. The details of what this
   708    token contains (and when it can be null/empty, if ever) depend on the actual
   709    authenticator used.
   710  
   711    Clients are expected to answer the server challenge with an AUTH_RESPONSE
   712    message.
   713  
   714  4.2.8. AUTH_SUCCESS
   715  
   716    Indicates the success of the authentication phase. See Section 4.2.3 for more
   717    details.
   718  
   719    The body of this message is a single [bytes] token holding final information
   720    from the server that the client may require to finish the authentication
   721    process. What that token contains and whether it can be null depends on the
   722    actual authenticator used.
   723  
   724  
   725  5. Compression
   726  
   727    Frame compression is supported by the protocol, but then only the frame body
   728    is compressed (the frame header should never be compressed).
   729  
   730    Before being used, client and server must agree on a compression algorithm to
   731    use, which is done in the STARTUP message. As a consequence, a STARTUP message
   732    must never be compressed.  However, once the STARTUP frame has been received
   733    by the server, frames can be compressed (including the response to the STARTUP
   734    request). Frames do not have to be compressed however, even if compression has
   735    been agreed upon (a server may only compress frames above a certain size at its
   736    discretion). A frame body should be compressed if and only if the compressed
   737    flag (see Section 2.2) is set.
   738  
   739    As of this version of the protocol, the following compressions are available:
   740      - lz4 (https://code.google.com/p/lz4/). In that case, note that the first 4
   741        bytes of the body will be the uncompressed length (followed by the
   742        compressed bytes).
   743      - snappy (https://code.google.com/p/snappy/). This compression might not be
   744        available as it depends on a native lib (server-side) that might not be
   745        available on some installations.
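
          As an illustration of the lz4 body layout described above, the sketch below
          splits a frame into header fields and body, then separates the 4-byte
          uncompressed length from the compressed bytes. It assumes the 9-byte v3
          header of Section 2, that the compression flag is bit 0x01 of <flags>
          (Section 2.2), and that the length prefix is big-endian like the rest of
          the protocol. The function names are hypothetical, and the actual
          decompression (which needs a third-party lz4 library) is left out.

```python
import struct

COMPRESSED_FLAG = 0x01  # assumed: compression bit of the header <flags> byte (Section 2.2)

def split_frame(frame):
    """Split a v3 frame into (flags, opcode, body); the header itself is never compressed."""
    version, flags, stream, opcode, length = struct.unpack_from(">BBhBi", frame, 0)
    return flags, opcode, frame[9:9 + length]

def split_lz4_body(flags, body):
    """Return (uncompressed_length, compressed_bytes), or None if the body is not compressed."""
    if not flags & COMPRESSED_FLAG:
        return None
    (uncompressed_length,) = struct.unpack_from(">i", body, 0)
    return uncompressed_length, body[4:]
```

          A client would pass the second element to its lz4 block decompressor,
          using the first element as the expected output size.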
   746  
   747  
   748  6. Data Type Serialization Formats
   749  
   750    This section describes the serialization formats for all CQL data types
   751    supported by Cassandra through the native protocol.  These serialization
   752    formats should be used by client drivers to encode values for EXECUTE
   753    messages.  Cassandra will use these formats when returning values in
   754    RESULT messages.
   755  
   756    All values are represented as [bytes] in EXECUTE and RESULT messages.
   757    The [bytes] format includes an int prefix denoting the length of the value.
   758    For that reason, the serialization formats described here will not include
   759    a length component.
   760  
   761    For legacy compatibility reasons, note that most non-string types support
   762    "empty" values (i.e. a value with zero length).  An empty value is distinct
   763    from NULL, which is encoded with a negative length.
   764  
   765    As with the rest of the native protocol, all encodings are big-endian.
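
          The distinction between NULL, "empty" and regular values can be sketched
          as follows. The helper names write_bytes and read_bytes are hypothetical;
          the layout is the [bytes] notation (an [int] length followed by that many
          bytes).

```python
import struct

def write_bytes(value):
    """Encode a serialized value as [bytes]; None stands for NULL."""
    if value is None:
        return struct.pack(">i", -1)  # NULL: negative length, no payload
    # b"" yields length 0, which is an "empty" value and not NULL.
    return struct.pack(">i", len(value)) + value

def read_bytes(buf, pos=0):
    """Decode one [bytes] at pos; returns (value_or_None, new_position)."""
    (n,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if n < 0:
        return None, pos
    return bytes(buf[pos:pos + n]), pos + n
```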
   766  
   767  6.1 ascii
   768  
   769    A sequence of bytes in the ASCII range [0, 127].  Bytes with values outside of
   770    this range will result in a validation error.
   771  
   772  6.2 bigint
   773  
   774    An eight-byte two's complement integer.
   775  
   776  6.3 blob
   777  
   778    Any sequence of bytes.
   779  
   780  6.4 boolean
   781  
   782    A single byte.  A value of 0 denotes "false"; any other value denotes "true".
   783    (However, it is recommended that a value of 1 be used to represent "true".)
   784  
   785  6.5 decimal
   786  
   787    The decimal format represents an arbitrary-precision number.  It contains an
   788    [int] "scale" component followed by a varint encoding (see section 6.17)
   789    of the unscaled value.  The encoded value represents "<unscaled>E<-scale>".
   790    In other words, "<unscaled> * 10 ^ (-1 * <scale>)".
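
          A possible way to decode this format with Python's decimal module is
          sketched below. decode_decimal is a hypothetical helper; it builds the
          Decimal from its sign, digits and exponent so that no rounding to the
          context precision takes place.

```python
from decimal import Decimal
import struct

def decode_decimal(data):
    """Decode <scale><unscaled varint> into an exact Decimal."""
    (scale,) = struct.unpack_from(">i", data, 0)
    unscaled = int.from_bytes(data[4:], "big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # The value is <unscaled> * 10 ^ (-1 * <scale>), so the exponent is -scale.
    return Decimal((sign, digits, -scale))
```

          With this sketch, a scale of 2 followed by the varint 0x3039 (12345)
          decodes to 123.45.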
   791  
   792  6.6 double
   793  
   794    An eight-byte floating point number in the IEEE 754 binary64 format.
   795  
   796  6.7 float
   797  
   798    A four-byte floating point number in the IEEE 754 binary32 format.
   799  
   800  6.8 inet
   801  
   802    A 4 byte or 16 byte sequence denoting an IPv4 or IPv6 address, respectively.
   803  
   804  6.9 int
   805  
   806    A four-byte two's complement integer.
   807  
   808  6.10 list
   809  
   810    An [int] n indicating the number of elements in the list, followed by n
   811    elements.  Each element is [bytes] representing the serialized value.
   812  
   813  6.11 map
   814  
   815    An [int] n indicating the number of key/value pairs in the map, followed by
   816    n entries.  Each entry is composed of two [bytes] representing the key
   817    and value.
   818  
   819  6.12 set
   820  
   821    An [int] n indicating the number of elements in the set, followed by n
   822    elements.  Each element is [bytes] representing the serialized value.
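
          The list, map and set layouts (sections 6.10 to 6.12) share the same
          shape, which the sketch below illustrates. The function names are
          hypothetical, and elements are assumed to be already serialized,
          non-null byte strings.

```python
import struct

def pack_bytes(payload):
    # [bytes]: an [int] length followed by the payload.
    return struct.pack(">i", len(payload)) + payload

def encode_list(elements):
    """Encode a list or a set from already serialized elements."""
    return struct.pack(">i", len(elements)) + b"".join(pack_bytes(e) for e in elements)

def encode_map(entries):
    """Encode a map from already serialized (key, value) pairs."""
    out = [struct.pack(">i", len(entries))]
    for key, value in entries:
        out.append(pack_bytes(key))
        out.append(pack_bytes(value))
    return b"".join(out)

def decode_list(data):
    """Return the serialized elements of a list or set."""
    (n,) = struct.unpack_from(">i", data, 0)
    pos, items = 4, []
    for _ in range(n):
        (size,) = struct.unpack_from(">i", data, pos)
        items.append(data[pos + 4:pos + 4 + size])
        pos += 4 + size
    return items
```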
   823  
   824  6.13 text
   825  
   826    A sequence of bytes conforming to the UTF-8 specifications.
   827  
   828  6.14 timestamp
   829  
   830    An eight-byte two's complement integer representing a millisecond-precision
   831    offset from the unix epoch (00:00:00, January 1st, 1970).  Negative values
   832    represent a negative offset from the epoch.
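
          A short sketch of the conversion to and from Python datetime values
          follows. It assumes the epoch is taken in UTC; the function names are
          hypothetical.

```python
from datetime import datetime, timedelta, timezone
import struct

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumption: epoch expressed in UTC

def encode_timestamp(dt):
    """Encode a timezone-aware datetime as an eight-byte millisecond offset."""
    millis = (dt - EPOCH) // timedelta(milliseconds=1)
    return struct.pack(">q", millis)

def decode_timestamp(data):
    """Decode an eight-byte millisecond offset into a UTC datetime."""
    (millis,) = struct.unpack(">q", data)
    return EPOCH + timedelta(milliseconds=millis)
```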
   833  
   834  6.15 uuid
   835  
   836    A 16 byte sequence representing any valid UUID as defined by RFC 4122.
   837  
   838  6.16 varchar
   839  
   840    An alias of the "text" type.
   841  
   842  6.17 varint
   843  
   844    A variable-length two's complement encoding of a signed integer.
   845  
   846    The following examples may help implementors of this spec:
   847  
   848    Value | Encoding
   849    ------|---------
   850        0 |     0x00
   851        1 |     0x01
   852      127 |     0x7F
   853      128 |   0x0080
   854      129 |   0x0081
   855       -1 |     0xFF
   856     -128 |     0x80
   857     -129 |   0xFF7F
   858  
   859    Note that positive numbers must use a most-significant byte with a value
   860    less than 0x80, because a most-significant bit of 1 indicates a negative
   861    value.  Implementors should pad positive values that have a MSB >= 0x80
   862    with a leading 0x00 byte.
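
          The padding rule above falls out naturally when the byte length is derived
          from the magnitude plus one sign bit. A minimal sketch, with hypothetical
          function names, that reproduces the table of examples:

```python
def encode_varint(n):
    """Encode a signed integer using the fewest two's complement bytes."""
    # A non-negative n needs n.bit_length() bits plus a sign bit; a negative n
    # needs (~n).bit_length() bits plus a sign bit.
    magnitude = n if n >= 0 else ~n
    length = magnitude.bit_length() // 8 + 1
    return n.to_bytes(length, "big", signed=True)

def decode_varint(data):
    """Decode a big-endian two's complement integer of any length."""
    return int.from_bytes(data, "big", signed=True)

# The examples from the table above, as hex strings.
EXAMPLES = {0: "00", 1: "01", 127: "7f", 128: "0080", 129: "0081",
            -1: "ff", -128: "80", -129: "ff7f"}
for value, encoding in EXAMPLES.items():
    assert encode_varint(value).hex() == encoding
    assert decode_varint(bytes.fromhex(encoding)) == value
```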
   863  
   864  6.18 timeuuid
   865  
   866    A 16 byte sequence representing a version 1 UUID as defined by RFC 4122.
   867  
   868  6.19 tuple
   869  
   870    A sequence of [bytes] values representing the items in a tuple.  The encoding
   871    of each element depends on the data type for that position in the tuple.
   872    Null values may be represented by using length -1 for the [bytes]
   873    representation of an element.
   874  
   875    Within a tuple, all data types should use the v3 protocol serialization format.
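
          A tuple value can be assembled as sketched below. encode_tuple is a
          hypothetical helper; it takes already serialized elements and uses None
          to stand for a null element.

```python
import struct

def encode_tuple(items):
    """Concatenate the [bytes] form of each element; None becomes length -1."""
    out = []
    for item in items:
        if item is None:
            out.append(struct.pack(">i", -1))  # null element, no payload
        else:
            out.append(struct.pack(">i", len(item)) + item)
    return b"".join(out)
```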
   876  
   877  
   878  7. User Defined Types
   879  
   880    This section describes the serialization format for User defined types (UDT),
   881    as described in section 4.2.5.2.
   882  
   883    A UDT value is composed of successive [bytes] values, one for each field of the UDT
   884    value (in the order defined by the type). A UDT value will generally have one value
   885    for each field of the type it represents, but it is allowed to have fewer values than
   886    the type has fields.
   887  
   888    Within a user-defined type value, all data types should use the v3 protocol
   889    serialization format.
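
          The sketch below shows one way a client might decode a UDT value that
          carries fewer values than the type has fields. Treating the missing
          trailing fields as null is an assumption of this example, not something
          this section states, and decode_udt is a hypothetical name.

```python
import struct

def decode_udt(data, field_names):
    """Map each field name to its serialized value, in type definition order."""
    values, pos = {}, 0
    for name in field_names:
        if pos >= len(data):
            values[name] = None  # assumption: absent trailing fields read as null
            continue
        (n,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if n < 0:
            values[name] = None  # explicit null ([bytes] with negative length)
        else:
            values[name] = data[pos:pos + n]
            pos += n
    return values
```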
   890  
   891  
   892  8. Result paging
   893  
   894    The protocol allows for paging the result of queries. For that, the QUERY and
   895    EXECUTE messages have a <result_page_size> value that indicates the desired
   896    page size in CQL3 rows.
   897  
   898    If a positive value is provided for <result_page_size>, the result set of the
   899    RESULT message returned for the query will contain at most the
   900    <result_page_size> first rows of the query result. If that first page of result
   901    contains the full result set for the query, the RESULT message (of kind `Rows`)
   902    will have the Has_more_pages flag *not* set. However, if some results are not
   903    part of the first response, the Has_more_pages flag will be set and the result
   904    will contain a <paging_state> value. In that case, the <paging_state> value
   905    should be used in a QUERY or EXECUTE message (that has the *same* query as
   906    the original one or the behavior is undefined) to retrieve the next page of
   907    results.
   908  
   909    Only CQL3 queries that return a result set (RESULT message with a Rows `kind`)
   910    support paging. For other types of queries, the <result_page_size> value is
   911    ignored.
   912  
   913    Note to client implementors:
   914    - While <result_page_size> can be as low as 1, it will likely be detrimental
   915      to performance to pick a value too low. A value below 100 is probably too
   916      low for most use cases.
   917    - Clients should not rely on the actual size of the result set returned to
   918      decide if there are more results to fetch or not. Instead, they should always
   919      check the Has_more_pages flag (unless they did not enable paging for the query
   920      obviously). Clients should also not assert that no page will have more than
   921      <result_page_size> rows. While the current implementation always respects
   922      the exact value of <result_page_size>, we reserve the right to return
   923      slightly smaller or bigger pages in the future for performance reasons.
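
          The paging loop described in this section can be sketched as follows.
          send_query is a hypothetical callable supplied by the client: it is
          assumed to send a QUERY (or EXECUTE) with the given page size and paging
          state, and to return the decoded rows, the Has_more_pages flag and the
          <paging_state> of the Rows result. Note that the loop stops on the flag
          and never on the number of rows received.

```python
def fetch_all(send_query, query, page_size=1000):
    """Yield every row of `query`, following <paging_state> while Has_more_pages is set."""
    paging_state = None  # the first request carries no paging state
    while True:
        rows, has_more_pages, paging_state = send_query(query, page_size, paging_state)
        yield from rows
        if not has_more_pages:
            return
```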
   924  
   925  
   926  9. Error codes
   927  
   928    The supported error codes are described below:
   929      0x0000    Server error: something unexpected happened. This indicates a
   930                server-side bug.
   931      0x000A    Protocol error: some client message triggered a protocol
   932                violation (for instance a QUERY message is sent before a STARTUP
   933                one has been sent)
   934      0x0100    Bad credentials: CREDENTIALS request failed because Cassandra
   935                did not accept the provided credentials.
   936  
   937      0x1000    Unavailable exception. The rest of the ERROR message body will be
   938                  <cl><required><alive>
   939                where:
   940                  <cl> is the [consistency] level of the query having triggered
   941                       the exception.
   942                  <required> is an [int] representing the number of nodes that
   943                             should be alive to respect <cl>
   944                  <alive> is an [int] representing the number of replicas that
   945                          were known to be alive when the request has been
   946                          processed (since an unavailable exception has been
   947                          triggered, there will be <alive> < <required>)
   948      0x1001    Overloaded: the request cannot be processed because the
   949                coordinator node is overloaded
   950      0x1002    Is_bootstrapping: the request was a read request but the
   951                coordinator node is bootstrapping
   952      0x1003    Truncate_error: error during a truncation.
   953      0x1100    Write_timeout: Timeout exception during a write request. The rest
   954                of the ERROR message body will be
   955                  <cl><received><blockfor><writeType>
   956                where:
   957                  <cl> is the [consistency] level of the query having triggered
   958                       the exception.
   959                  <received> is an [int] representing the number of nodes having
   960                             acknowledged the request.
   961                  <blockfor> is an [int] representing the number of replicas whose
   962                             acknowledgement is required to achieve <cl>.
   963                  <writeType> is a [string] that describes the type of the write
   964                              that timed out. The value of that string can be one
   965                              of:
   966                               - "SIMPLE": the write was a non-batched
   967                                 non-counter write.
   968                               - "BATCH": the write was a (logged) batch write.
   969                                 If this type is received, it means the batch log
   970                                 has been successfully written (otherwise a
   971                                 "BATCH_LOG" type would have been sent instead).
   972                               - "UNLOGGED_BATCH": the write was an unlogged
   973                                 batch. No batch log write has been attempted.
   974                               - "COUNTER": the write was a counter write
   975                                 (batched or not).
   976                               - "BATCH_LOG": the timeout occurred during the
   977                                 write to the batch log when a (logged) batch
   978                                 write was requested.
   979      0x1200    Read_timeout: Timeout exception during a read request. The rest
   980                of the ERROR message body will be
   981                  <cl><received><blockfor><data_present>
   982                where:
   983                  <cl> is the [consistency] level of the query having triggered
   984                       the exception.
   985                  <received> is an [int] representing the number of nodes having
   986                             answered the request.
   987                  <blockfor> is an [int] representing the number of replicas whose
   988                             response is required to achieve <cl>. Please note that
   989                             it is possible to have <received> >= <blockfor> if
   990                             <data_present> is false. And also in the (unlikely)
   991                             case where <cl> is achieved but the coordinator node
   992                             times out while waiting for read-repair
   993                             acknowledgement.
   994                  <data_present> is a single byte. If its value is 0, it means
   995                                 the replica that was asked for data has not
   996                                 responded. Otherwise, the value is != 0.
   997  
   998      0x2000    Syntax_error: The submitted query has a syntax error.
   999      0x2100    Unauthorized: The logged-in user doesn't have the right to perform
  1000                the query.
  1001      0x2200    Invalid: The query is syntactically correct but invalid.
  1002      0x2300    Config_error: The query is invalid because of some configuration issue
  1003      0x2400    Already_exists: The query attempted to create a keyspace or a
  1004                table that was already existing. The rest of the ERROR message
  1005                body will be <ks><table> where:
  1006                  <ks> is a [string] representing either the keyspace that
  1007                       already exists, or the keyspace in which the table that
  1008                       already exists is.
  1009                  <table> is a [string] representing the name of the table that
  1010                          already exists. If the query was attempting to create a
  1011                          keyspace, <table> will be present but will be the empty
  1012                          string.
  1013      0x2500    Unprepared: Can be thrown while a prepared statement tries to be
  1014                executed if the provided prepared statement ID is not known by
  1015                this host. The rest of the ERROR message body will be [short
  1016                bytes] representing the unknown ID.
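
          The error codes above that carry extra fields can be decoded along these
          lines. This is a sketch under assumptions taken from other sections: the
          ERROR body starts with an [int] code and a [string] message
          (Section 4.2.1), and a [consistency] is a [short] (Section 3).
          parse_error and its dictionary keys are hypothetical names.

```python
import struct

def read_string(buf, pos):
    # [string]: [short] length n followed by n UTF-8 bytes (assumed from Section 3).
    (n,) = struct.unpack_from(">H", buf, pos)
    return buf[pos + 2:pos + 2 + n].decode("utf-8"), pos + 2 + n

def parse_error(body):
    """Decode an ERROR body: code, message, then the code-specific fields."""
    (code,) = struct.unpack_from(">i", body, 0)
    message, pos = read_string(body, 4)
    err = {"code": code, "message": message}
    if code == 0x1000:  # Unavailable: <cl><required><alive>
        err["cl"], err["required"], err["alive"] = struct.unpack_from(">Hii", body, pos)
    elif code == 0x1100:  # Write_timeout: <cl><received><blockfor><writeType>
        err["cl"], err["received"], err["blockfor"] = struct.unpack_from(">Hii", body, pos)
        err["write_type"], pos = read_string(body, pos + 10)
    elif code == 0x1200:  # Read_timeout: <cl><received><blockfor><data_present>
        cl, received, blockfor, present = struct.unpack_from(">HiiB", body, pos)
        err.update(cl=cl, received=received, blockfor=blockfor, data_present=present != 0)
    elif code == 0x2400:  # Already_exists: <ks><table>
        err["keyspace"], pos = read_string(body, pos)
        err["table"], pos = read_string(body, pos)
    elif code == 0x2500:  # Unprepared: [short bytes] holding the unknown ID
        (n,) = struct.unpack_from(">H", body, pos)
        err["id"] = body[pos + 2:pos + 2 + n]
    return err
```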
  1017  
  1018  10. Changes from v2
  1019    * stream id is now 2 bytes long (a [short] value), so the header is now 1 byte longer (9 bytes total).
  1020    * BATCH messages now have <flags> (like QUERY and EXECUTE) and a corresponding optional
  1021      <serial_consistency> parameter (see Section 4.1.7).
  1022    * User Defined Types and tuple types have been added to ResultSet metadata (see 4.2.5.2) and a
  1023      new section on the serialization format of UDT and tuple values has been added to the documentation
  1024      (Section 7).
  1025    * The serialization format for collections has changed (both the collection size and
  1026      the length of each argument are now 4 bytes long). See Section 6.
  1027    * QUERY, EXECUTE and BATCH messages can now optionally provide the default timestamp for the query.
  1028      As this feature is optionally enabled by clients, implementing it is at the discretion of the
  1029      client.
  1030    * QUERY and EXECUTE messages can now optionally provide the names for the values of the
  1031      query. As this feature is optionally enabled by clients, implementing it is at the discretion of the
  1032      client (Note that while the BATCH message has a flag for this, it actually doesn't work for BATCH,
  1033      see Section 4.1.7 for details).
  1034    * The format of "Schema_change" results (Section 4.2.5.5) and "SCHEMA_CHANGE" events (Section 4.2.6)
  1035      has been modified, and now includes changes related to user types.
  1036