                          DSE BINARY PROTOCOL v1


Table of Contents

  1. Overview
  2. Frame header
    2.1. version
    2.2. flags
    2.3. stream
    2.4. opcode
    2.5. length
  3. Notations
  4. Messages
    4.1. Requests
      4.1.1. STARTUP
      4.1.2. AUTH_RESPONSE
      4.1.3. OPTIONS
      4.1.4. QUERY
      4.1.5. PREPARE
      4.1.6. EXECUTE
      4.1.7. BATCH
      4.1.8. REGISTER
      4.1.9. CANCEL
    4.2. Responses
      4.2.1. ERROR
      4.2.2. READY
      4.2.3. AUTHENTICATE
      4.2.4. SUPPORTED
      4.2.5. RESULT
        4.2.5.1. Void
        4.2.5.2. Rows
        4.2.5.3. Set_keyspace
        4.2.5.4. Prepared
        4.2.5.5. Schema_change
      4.2.6. EVENT
      4.2.7. AUTH_CHALLENGE
      4.2.8. AUTH_SUCCESS
  5. Compression
  6. Data Type Serialization Formats
  7. User Defined Type Serialization
  8. Result paging
  9. Error codes
  10. Changes from CQL binary protocol v5


1. Overview

  The CQL binary protocol is a frame-based protocol. Frames are defined as:

      0         8        16        24        32         40
      +---------+---------+---------+---------+---------+
      | version |  flags  |      stream       | opcode  |
      +---------+---------+---------+---------+---------+
      |                length                 |
      +---------+---------+---------+---------+
      |                                       |
      .            ...  body ...              .
      .                                       .
      .                                       .
      +----------------------------------------

  The protocol is big-endian (network byte order).

  Each frame contains a fixed size header (9 bytes) followed by a variable size
  body. The header is described in Section 2. The content of the body depends
  on the header opcode value (the body can in particular be empty for some
  opcode values). The list of allowed opcodes is defined in Section 2.4 and the
  details of each corresponding message are described in Section 4.

  The protocol distinguishes two types of frames: requests and responses. Requests
  are those frames sent by the client to the server.
  Responses are those frames sent by the server to the client. Note, however,
  that the protocol supports server pushes (events) so a response does not
  necessarily come right after a client request.

  Note to client implementors: client libraries should always assume that the
  body of a given frame may contain more data than what is described in this
  document. It will however always be safe to ignore the remainder of the frame
  body in such cases. The reason is that this may enable extending the protocol
  with optional features without needing to change the protocol version.



2. Frame header

2.1. version

  The version is a single byte that indicates both the direction of the message
  (request or response) and the version of the protocol in use. The most
  significant bit of version is used to define the direction of the message:
  0 indicates a request, 1 indicates a response. This can be useful for protocol
  analyzers to distinguish the nature of the packet from the direction in which
  it is moving.

  The next most significant bit must be set to one to indicate that this
  is a DSE private version and not a public open source version.

  The rest of that byte is the DSE protocol version (1 for the protocol
  defined in this document). In other words, for this version of the protocol,
  version will be one of:
    0x41 (0100 0001)   Request frame for this DSE protocol version
    0xC1 (1100 0001)   Response frame for this DSE protocol version

  Please note that while every message ships with the version, only one version
  of messages is accepted on a given connection. In other words, the first message
  exchanged (STARTUP) sets the version for the connection for the lifetime of this
  connection. The single exception to this behavior is when a startup message
  is sent with a version that is higher than the current server version. In this
  case, the server will respond with its current version.
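
  As an illustration of the bit layout described above, the following is a
  minimal Python sketch that splits a version byte into its direction bit, its
  DSE marker bit and its version number. The function name parse_version_byte
  and the dictionary keys are hypothetical names chosen for this example; they
  are not part of the protocol or of any driver API.

```python
def parse_version_byte(version: int) -> dict:
    """Split the header's version byte into its three documented parts."""
    if not 0 <= version <= 0xFF:
        raise ValueError("version must fit in a single byte")
    return {
        # Most significant bit: 0 for a request, 1 for a response.
        "is_response": bool(version & 0x80),
        # Next most significant bit: set for a DSE private version.
        "is_dse": bool(version & 0x40),
        # Remaining six bits: the protocol version number.
        "version": version & 0x3F,
    }


request = parse_version_byte(0x41)
response = parse_version_byte(0xC1)
```

  Under these assumptions, both 0x41 and 0xC1 decode to DSE version 1 and
  differ only in the direction bit.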

  This document describes version 1 of the DSE protocol. For the changes made since
  version 5 of the public protocol, see Section 10.


2.2. flags

  Flags applying to this frame. The flags have the following meaning (described
  by the mask that allows selecting them):
    0x01: Compression flag. If set, the frame body is compressed. The actual
          compression to use should have been set up beforehand through the
          STARTUP message (which thus cannot be compressed; Section 4.1.1).
    0x02: Tracing flag. For a request frame, this indicates the client requires
          tracing of the request. Note that only QUERY, PREPARE and EXECUTE queries
          support tracing. Other requests will simply ignore the tracing flag if
          set. If a request supports tracing and the tracing flag is set, the response
          to this request will have the tracing flag set and contain tracing
          information.
          If a response frame has the tracing flag set, its body contains
          a tracing ID. The tracing ID is a [uuid] and is the first thing in
          the frame body. The rest of the body will then be the usual body
          corresponding to the response opcode.
    0x04: Custom payload flag. For a request or response frame, this indicates
          that a generic key-value custom payload for a custom QueryHandler
          implementation is present in the frame. Such a custom payload is simply
          ignored by the default QueryHandler implementation.
          Currently, only QUERY, PREPARE, EXECUTE and BATCH requests support
          payload.
          The type of the custom payload is [bytes map] (see below).
    0x08: Warning flag. The response contains warnings which were generated by the
          server to go along with this response.
          If a response frame has the warning flag set, its body will contain the
          text of the warnings. The warnings are a [string list] and will be the
          first value in the frame body if the tracing flag is not set, or directly
          after the tracing ID if it is.
    0x10: Use beta flag. Indicates that the client opts in to use a protocol version
          that is currently in beta. The server will respond with ERROR if the protocol
          version is marked as beta on the server and the client does not provide this flag.

  The remaining flags are currently unused and ignored.

2.3. stream

  A frame has a stream id (a [short] value). When sending request messages, this
  stream id must be set by the client to a non-negative value (negative stream ids
  are reserved for streams initiated by the server; currently all EVENT messages
  (Section 4.2.6) have a stream id of -1). If a client sends a request message
  with the stream id X, it is guaranteed that the stream id of the response to
  that message will be X.

  This helps to enable the asynchronous nature of the protocol. If a client
  sends multiple messages simultaneously (without waiting for responses), there
  is no guarantee on the order of the responses. For instance, if the client
  writes REQ_1, REQ_2, REQ_3 on the wire (in that order), the server might
  respond to REQ_3 (or REQ_2) first. Assigning different stream ids to these 3
  requests allows the client to determine which request a received answer
  responds to. As there can only be 32768 different simultaneous streams, it is up
  to the client to reuse stream ids.

  Note that clients are free to use the protocol synchronously (i.e. wait for
  the response to REQ_N before sending REQ_N+1). In that case, the stream id
  can be safely set to 0. Clients should also feel free to use only a subset of
  the 32768 maximum possible stream ids if that is simpler for their implementation.

2.4. opcode

  An integer byte that distinguishes the actual message:
    0x00    ERROR
    0x01    STARTUP
    0x02    READY
    0x03    AUTHENTICATE
    0x05    OPTIONS
    0x06    SUPPORTED
    0x07    QUERY
    0x08    RESULT
    0x09    PREPARE
    0x0A    EXECUTE
    0x0B    REGISTER
    0x0C    EVENT
    0x0D    BATCH
    0x0E    AUTH_CHALLENGE
    0x0F    AUTH_RESPONSE
    0x10    AUTH_SUCCESS
    0xFF    CANCEL

  Messages are described in Section 4.

  (Note that there is no 0x04 message in this version of the protocol)


2.5. length

  A 4 byte integer representing the length of the body of the frame (note:
  currently a frame is limited to 256MB in length).


3. Notations

  To describe the layout of the frame body for the messages in Section 4, we
  define the following:

    [int]             A 4-byte integer
    [long]            An 8-byte integer
    [byte]            A 1-byte unsigned integer
    [short]           A 2-byte unsigned integer
    [string]          A [short] n, followed by n bytes representing a UTF-8
                      string.
    [long string]     An [int] n, followed by n bytes representing a UTF-8 string.
    [uuid]            A 16-byte uuid.
    [string list]     A [short] n, followed by n [string].
    [bytes]           An [int] n, followed by n bytes if n >= 0. If n < 0,
                      no byte should follow and the value represented is `null`.
    [value]           An [int] n, followed by n bytes if n >= 0.
                      If n == -1 no byte should follow and the value represented is `null`.
                      If n == -2 no byte should follow and the value represented is
                      `not set` not resulting in any change to the existing value.
                      n < -2 is an invalid value and results in an error.
    [short bytes]     A [short] n, followed by n bytes if n >= 0.

    [unsigned vint]   An unsigned variable length integer. A vint is encoded with the
                      most significant byte (MSB) first.
                      The most significant byte will contain the information about how
                      many extra bytes need to be read as well as the most significant
                      bits of the integer.
                      The number of extra bytes to read is encoded as 1 bits on the left side.
                      For example, if we need to read 2 more bytes the first byte will start with 110
                      (e.g. 256 000 will be encoded on 3 bytes as [110]00011 11101000 00000000)
                      If the encoded integer is 8 bytes long the vint will be encoded on 9 bytes
                      and the first byte will be: 11111111

    [vint]            A signed variable length integer. This is encoded using zig-zag encoding
                      and then sent like an [unsigned vint]. Zig-zag encoding converts numbers
                      as follows:
                      0 = 0, -1 = 1, 1 = 2, -2 = 3, 2 = 4, -3 = 5, 3 = 6 and so forth.
                      The purpose is to send small negative values as small unsigned values,
                      so that we save bytes on the wire.
                      To encode a value n use "(n >> 31) ^ (n << 1)" for 32 bit values, and
                      "(n >> 63) ^ (n << 1)" for 64 bit values where "^" is the xor operation,
                      "<<" is the left shift operation and ">>" is the arithmetic right shift
                      operation (highest-order bit is replicated).
                      Decode with "(n >> 1) ^ -(n & 1)".

    [option]          A pair of <id><value> where <id> is a [short] representing
                      the option id and <value> depends on that option (and can be
                      of size 0). The supported id (and the corresponding <value>)
                      will be described when this is used.
    [option list]     A [short] n, followed by n [option].
    [inet]            An address (ip and port) to a node. It consists of one
                      [byte] n, that represents the address size, followed by n
                      [byte] representing the IP address (in practice n can only be
                      either 4 (IPv4) or 16 (IPv6)), followed by one [int]
                      representing the port.
    [inetaddr]        An IP address (without a port) to a node. It consists of one
                      [byte] n, that represents the address size, followed by n
                      [byte] representing the IP address.
    [consistency]     A consistency level specification.
                      This is a [short] representing a consistency level with the
                      following correspondence:
                        0x0000    ANY
                        0x0001    ONE
                        0x0002    TWO
                        0x0003    THREE
                        0x0004    QUORUM
                        0x0005    ALL
                        0x0006    LOCAL_QUORUM
                        0x0007    EACH_QUORUM
                        0x0008    SERIAL
                        0x0009    LOCAL_SERIAL
                        0x000A    LOCAL_ONE

    [string map]      A [short] n, followed by n pair <k><v> where <k> and <v>
                      are [string].
    [string multimap] A [short] n, followed by n pair <k><v> where <k> is a
                      [string] and <v> is a [string list].
    [bytes map]       A [short] n, followed by n pair <k><v> where <k> is a
                      [string] and <v> is a [bytes].


4. Messages

4.1. Requests

  Note that outside of their normal responses (described below), all requests
  can get an ERROR message (Section 4.2.1) as response.

4.1.1. STARTUP

  Initialize the connection. The server will respond with either a READY message
  (in which case the connection is ready for queries) or an AUTHENTICATE message
  (in which case credentials will need to be provided using AUTH_RESPONSE).

  This must be the first message of the connection, except for OPTIONS, which can
  be sent beforehand to find out the options supported by the server. Once the
  connection has been initialized, a client should not send any more STARTUP
  messages.

  The body is a [string map] of options. Possible options are:
    - "CQL_VERSION": the version of CQL to use. This option is mandatory and
      currently the only version supported is "3.0.0". Note that this is
      different from the protocol version.
    - "COMPRESSION": the compression algorithm to use for frames (See Section 5).
      This is optional; if not specified no compression will be used.


4.1.2. AUTH_RESPONSE

  Answers a server authentication challenge.

  Authentication in the protocol is SASL based. The server sends authentication
  challenges (a bytes token) to which the client answers with this message.
  Those exchanges continue until the server accepts the authentication by sending an
  AUTH_SUCCESS message after a client AUTH_RESPONSE. Note that the exchange
  begins with the client sending an initial AUTH_RESPONSE in response to a
  server AUTHENTICATE request.

  The body of this message is a single [bytes] token. The details of what this
  token contains (and when it can be null/empty, if ever) depend on the actual
  authenticator used.

  The response to an AUTH_RESPONSE is either a follow-up AUTH_CHALLENGE message,
  an AUTH_SUCCESS message or an ERROR message.


4.1.3. OPTIONS

  Asks the server to return which STARTUP options are supported. The body of an
  OPTIONS message should be empty and the server will respond with a SUPPORTED
  message.


4.1.4. QUERY

  Performs a CQL query. The body of the message must be:
    <query><query_parameters>
  where <query> is a [long string] representing the query and
  <query_parameters> must be
    <consistency><flags>[<n>[name_1]<value_1>...[name_n]<value_n>][<result_page_size>][<paging_state>]
    [<serial_consistency>][<timestamp>][<continuous_paging_options>]
  where:
    - <consistency> is the [consistency] level for the operation.
    - <flags> is an [int] whose bits define the options for this query and
      in particular influence what the remainder of the message contains.
      A flag is set if the bit corresponding to its `mask` is set. Supported
      flags are, given their mask:
        0x00000001: Values. If set, a [short] <n> followed by <n> [value]
                    values are provided. Those values are used for bound variables in
                    the query. Optionally, if the 0x40 flag is present, each value
                    will be preceded by a [string] name, representing the name of
                    the marker the value must be bound to.
        0x00000002: Skip_metadata. If set, the Result Set returned as a response
                    to the query (if any) will have the NO_METADATA flag (see
                    Section 4.2.5.2).
        0x00000004: Page_size. If set, <result_page_size> is a positive [int]
                    controlling the desired page size of the result in CQL3 rows or
                    in bytes, if Page_size_bytes is set.
                    See the section on paging (Section 8) for more details.
        0x00000008: With_paging_state. If set, <paging_state> should be present.
                    <paging_state> is a [bytes] value that should have been returned
                    in a result set (Section 4.2.5.2). The query will be
                    executed but starting from a given paging state. An error will be
                    returned if the paging_state is present but no page_size has been
                    specified. The paging state can also be used to
                    continue paging on a different node than the one where it
                    started (See Section 8 for more details).
        0x00000010: With serial consistency. If set, <serial_consistency> should be
                    present. <serial_consistency> is the [consistency] level for the
                    serial phase of conditional updates. That consistency can only be
                    either SERIAL or LOCAL_SERIAL and if not present, it defaults to
                    SERIAL. This option will be ignored for anything else other than a
                    conditional update/insert.
        0x00000020: With default timestamp. If set, <timestamp> should be present.
                    <timestamp> is a [long] representing the default timestamp for the query
                    in microseconds (negative values are forbidden). This will
                    replace the server side assigned timestamp as default timestamp.
                    Note that a timestamp in the query itself will still override
                    this timestamp. This is entirely optional.
        0x00000040: With names for values. This only makes sense if the 0x01 flag is set and
                    is ignored otherwise. If present, the values from the 0x01 flag will
                    be preceded by a name (see above).
                    Note that this is only useful for
                    QUERY requests where named bind markers are used; for EXECUTE statements,
                    since the names for the expected values were returned during preparation,
                    a client can always provide values in the right order without any names
                    and using this flag, while supported, is almost surely inefficient.
        0x40000000: Page_size_bytes. If set, <result_page_size> is expressed in bytes. The server
                    will try to return a number of CQL rows whose total size is as close as possible
                    to the requested page size, without splitting any CQL row however. This
                    functionality is currently only supported with continuous paging; setting this
                    flag without setting the continuous paging flag will result in an error.
        0x80000000: With continuous paging. If set, <continuous_paging_options> should be present.
                    This structure contains the following:
                      - <max_num_pages>, an [int] indicating the maximum number of pages that the
                        server will send to the client in total; set this to zero to indicate no limit.
                      - <pages_per_second>, an [int] indicating the maximum number of pages per
                        second; set this to zero to indicate no limit.
                    When continuous paging is enabled, the query results will be pushed to the client
                    asynchronously and according to the paging options in the request message, without
                    the client having to request each single page. Each response message will have the
                    same stream id as the initial request.
                    Continuous paging can be interrupted by the client at any time via a CANCEL
                    request, see Section 4.1.9.

  Note that the consistency is ignored by some queries (USE, CREATE, ALTER,
  TRUNCATE, ...).

  The server will respond to a QUERY message with a RESULT message, the content
  of which depends on the query.


4.1.5. PREPARE

  Prepare a query for later execution (through EXECUTE). The body consists of
  the CQL query to prepare as a [long string].
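
  To make the framing concrete, here is a minimal Python sketch of a complete
  PREPARE request: the 9-byte header from Sections 1 and 2 (version 0x41, no
  flags, a client-chosen stream id, opcode 0x09, body length) followed by the
  query as a [long string]. The helper names encode_long_string and
  build_prepare_frame are hypothetical, and the sketch assumes no compression,
  tracing or custom payload.

```python
import struct

DSE_V1_REQUEST = 0x41  # version byte for a request frame (Section 2.1)
OPCODE_PREPARE = 0x09  # PREPARE opcode (Section 2.4)


def encode_long_string(text: str) -> bytes:
    """Encode a [long string]: an [int] length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">i", len(data)) + data


def build_prepare_frame(query: str, stream_id: int = 0) -> bytes:
    """Build a PREPARE frame; all multi-byte fields are big-endian."""
    if not 0 <= stream_id <= 32767:
        raise ValueError("request stream ids must be non-negative [short] values")
    body = encode_long_string(query)
    # Header: version (1 byte), flags (1), stream (2), opcode (1), length (4).
    header = struct.pack(
        ">BBhBi", DSE_V1_REQUEST, 0x00, stream_id, OPCODE_PREPARE, len(body)
    )
    return header + body


frame = build_prepare_frame("SELECT * FROM t", stream_id=1)
```

  The length field counts only the body, so for the 15-character query above
  it holds 19 (4 bytes of [int] length plus 15 bytes of text).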

  The server will respond with a RESULT message with a `prepared` kind (0x0004,
  see Section 4.2.5).


4.1.6. EXECUTE

  Executes a prepared query. The body of the message must be:
    <id><query_parameters>
  where <id> is the prepared query ID. It's the [short bytes] returned as a
  response to a PREPARE message. As for <query_parameters>, it has the exact
  same definition as in QUERY (see Section 4.1.4).

  The response from the server will be a RESULT message.


4.1.7. BATCH

  Allows executing a list of queries (prepared or not) as a batch (note that
  only DML statements are accepted in a batch). The body of the message must
  be:
    <type><n><query_1>...<query_n><consistency><flags>[<serial_consistency>][<timestamp>]
  where:
    - <type> is a [byte] indicating the type of batch to use:
        - If <type> == 0, the batch will be "logged". This is equivalent to a
          normal CQL3 batch statement.
        - If <type> == 1, the batch will be "unlogged".
        - If <type> == 2, the batch will be a "counter" batch (and non-counter
          statements will be rejected).
    - <flags> is an [int] whose bits define the options for this query and
      in particular influence what the remainder of the message contains. It is similar
      to the <flags> from QUERY and EXECUTE methods, except that the 4 rightmost
      bits must always be 0 as their corresponding options do not make sense for
      Batch. A flag is set if the bit corresponding to its `mask` is set. Supported
      flags are, given their mask:
        0x10: With serial consistency. If set, <serial_consistency> should be
              present. <serial_consistency> is the [consistency] level for the
              serial phase of conditional updates. That consistency can only be
              either SERIAL or LOCAL_SERIAL and if not present, it defaults to
              SERIAL. This option will be ignored for anything else other than a
              conditional update/insert.
        0x20: With default timestamp.
              If set, <timestamp> should be present.
              <timestamp> is a [long] representing the default timestamp for the query
              in microseconds. This will replace the server side assigned
              timestamp as default timestamp. Note that a timestamp in the query itself
              will still override this timestamp. This is entirely optional.
        0x40: With names for values. If set, then all values for all <query_i> must be
              preceded by a [string] <name_i> that have the same meaning as in QUERY
              requests [IMPORTANT NOTE: this feature does not work and should not be
              used. It is specified in a way that makes it impossible for the server
              to implement. This will be fixed in a future version of the native
              protocol. See https://issues.apache.org/jira/browse/CASSANDRA-10246 for
              more details].
    - <n> is a [short] indicating the number of following queries.
    - <query_1>...<query_n> are the queries to execute. A <query_i> must be of the
      form:
        <kind><string_or_id><n>[<name_1>]<value_1>...[<name_n>]<value_n>
      where:
       - <kind> is a [byte] indicating whether the following query is a prepared
         one or not. <kind> value must be either 0 or 1.
       - <string_or_id> depends on the value of <kind>. If <kind> == 0, it should be
         a [long string] query string (as in QUERY, the query string might contain
         bind markers). Otherwise (that is, if <kind> == 1), it should be a
         [short bytes] representing a prepared query ID.
       - <n> is a [short] indicating the number (possibly 0) of following values.
       - <name_i> is the optional name of the following <value_i>. It must be present
         if and only if the 0x40 flag is provided for the batch.
       - <value_i> is the [value] to use for bound variable i (of bound variable <name_i>
         if the 0x40 flag is used).
    - <consistency> is the [consistency] level for the operation.
    - <serial_consistency> is only present if the 0x10 flag is set.
      In that case,
      <serial_consistency> is the [consistency] level for the serial phase of
      conditional updates. That consistency can only be either SERIAL or
      LOCAL_SERIAL and if not present defaults to SERIAL. This option will
      be ignored for anything else other than a conditional update/insert.

  The server will respond with a RESULT message.


4.1.8. REGISTER

  Register this connection to receive some types of events. The body of the
  message is a [string list] representing the event types to register for. See
  Section 4.2.6 for the list of valid event types.

  The response to a REGISTER message will be a READY message.

  Please note that if a client driver maintains multiple connections to a
  Cassandra node and/or connections to multiple nodes, it is advised to
  dedicate a handful of connections to receive events, but to *not* register
  for events on all connections, as this would only result in receiving
  the same event messages multiple times, wasting bandwidth.

4.1.9. CANCEL

  Request to cancel an asynchronous operation. The body of the message is:
    - an [int] identifying the operation type:
        - 0x00000001 for "continuous paging", see Section 4.1.4
    - an [int] equal to the stream id of the initial request message.

  The server will reply with a RESULT of type ROWS (Section 4.2.5.2),
  containing a single row with a single boolean value, which is set to:
    - true if the operation was cancelled,
    - false if the operation was not found.
  If an operation is found but cannot be cancelled, an error is returned instead.


4.2. Responses

  This section describes the content of the frame body for the different
  responses. Please note that to make room for future evolution, clients should
  tolerate extra information at the end of the frame body beyond what is
  described in this document, and should simply discard it.

4.2.1. ERROR

  Indicates an error processing a request. The body of the message will be an
  error code ([int]) followed by a [string] error message. Then, depending on
  the exception, more content may follow. The error codes are defined in
  Section 9, along with their additional content if any.


4.2.2. READY

  Indicates that the server is ready to process queries. This message will be
  sent by the server after a STARTUP message if no authentication is
  required (if authentication is required, the server indicates readiness by
  sending an AUTH_SUCCESS message).

  The body of a READY message is empty.


4.2.3. AUTHENTICATE

  Indicates that the server requires authentication, and which authentication
  mechanism to use.

  The authentication is SASL based and thus consists of a number of server
  challenges (AUTH_CHALLENGE, Section 4.2.7) followed by client responses
  (AUTH_RESPONSE, Section 4.1.2). The initial exchange is however bootstrapped
  by an initial client response. The details of that exchange (including how
  many challenge-response pairs are required) are specific to the authenticator
  in use. The exchange ends when the server sends an AUTH_SUCCESS message or
  an ERROR message.

  This message will be sent following a STARTUP message if authentication is
  required and must be answered by an AUTH_RESPONSE message from the client.

  The body consists of a single [string] indicating the full class name of the
  IAuthenticator in use.


4.2.4. SUPPORTED

  Indicates which startup options are supported by the server. This message
  comes as a response to an OPTIONS message.

  The body of a SUPPORTED message is a [string multimap]. This multimap gives,
  for each of the supported STARTUP options, the list of supported values.
  It also includes:
      - "PROTOCOL_VERSIONS": the list of native protocol versions that are
      supported, encoded as the version number followed by a slash and the
      version description. For example: 3/v3, 4/v4, 5/v5-beta. If a version is
      in beta, it will have the word "beta" in its description.


4.2.5. RESULT

  The result to a query (QUERY, PREPARE, EXECUTE or BATCH messages).

  The first element of the body of a RESULT message is an [int] representing the
  `kind` of result. The rest of the body depends on the kind. The kind can be
  one of:
    0x0001    Void: for results carrying no information.
    0x0002    Rows: for results to select queries, returning a set of rows.
    0x0003    Set_keyspace: the result to a `use` query.
    0x0004    Prepared: result to a PREPARE message.
    0x0005    Schema_change: the result to a schema altering query.

  The body for each kind (after the [int] kind) is defined below.


4.2.5.1. Void

  The rest of the body for a Void result is empty. It indicates that a query was
  successful without providing more information.


4.2.5.2. Rows

  Indicates a set of rows. The rest of the body of a Rows result is:
    <metadata><rows_count><rows_content>
  where:
    - <metadata> is composed of:
        <flags><columns_count>[<paging_state>][<continuous_page_no>][<global_table_spec>?<col_spec_1>...<col_spec_n>]
      where:
        - <flags> is an [int]. The bits of <flags> provide information on the
          formatting of the remaining information. A flag is set if the bit
          corresponding to its `mask` is set. Supported flags are, given their
          mask:
            0x00000001    Global_tables_spec: if set, only one table spec (keyspace
                          and table name) is provided as <global_table_spec>. If not
                          set, <global_table_spec> is not present.
            0x00000002    Has_more_pages: indicates whether this is not the last
                          page of results and more should be retrieved.
                          If set, the
                          <paging_state> will be present. The <paging_state> is a
                          [bytes] value that should be used in QUERY/EXECUTE to
                          continue paging and retrieve the remainder of the result for
                          this query (See Section 8 for more details).
            0x00000004    No_metadata: if set, the <metadata> is only composed of
                          these <flags>, the <columns_count> and optionally the
                          <paging_state> and <continuous_page_no> (depending on the
                          corresponding flags) but no other information (so no
                          <global_table_spec> nor <col_spec_i>).
                          This will only ever be the case if this was requested
                          during the query (see QUERY and RESULT messages).
            0x40000000    Continuous paging: if set, this page is part of a continuous
                          paging session, as requested by the client. <continuous_page_no>
                          will be present; this is an [int] that identifies the sequential
                          number of this page in the session, and the Last_continuous_page
                          flag below will be set for the final page.
            0x80000000    Last_continuous_page: indicates that this is the last continuous page
                          that will be sent in the continuous paging session. This flag can only
                          be set when the continuous paging flag is also set. Note that this may
                          not necessarily be the last page of the query results; for this it is
                          necessary to look at Has_more_pages (this could happen if the client only
                          requested the first N pages in the continuous paging session, or if
                          it sent a CANCEL message).
        - <columns_count> is an [int] representing the number of columns selected
          by the query that produced this result. It defines the number of <col_spec_i>
          elements and the number of elements for each row in <rows_content>.
        - <global_table_spec> is present if the Global_tables_spec is set in
          <flags>. It is composed of two [string] representing the
          (unique) keyspace name and table name the columns belong to.
        - <col_spec_i> specifies the columns returned in the query.
          There are
          <columns_count> such column specifications that are composed of:
            (<ksname><tablename>)?<name><type>
          The initial <ksname> and <tablename> are two [string] and are only present
          if the Global_tables_spec flag is not set. The <name> is a
          [string] and <type> is an [option]; they correspond to the description
          and type of the corresponding result. What the description is depends on
          the context: in results to selects, it is either the user chosen alias or
          the selection used (often a column name, but it can be a function call too);
          in results to a PREPARE, it is either the name of the corresponding bind
          variable or the column name for the variable if it is "anonymous".
          The option for <type> is either a native
          type (see below), in which case the option has no value, or a
          'custom' type, in which case the value is a [string] representing
          the fully qualified class name of the type represented. Valid option
          ids are:
            0x0000    Custom: the value is a [string], see above.
            0x0001    Ascii
            0x0002    Bigint
            0x0003    Blob
            0x0004    Boolean
            0x0005    Counter
            0x0006    Decimal
            0x0007    Double
            0x0008    Float
            0x0009    Int
            0x000B    Timestamp
            0x000C    Uuid
            0x000D    Varchar
            0x000E    Varint
            0x000F    Timeuuid
            0x0010    Inet
            0x0011    Date
            0x0012    Time
            0x0013    Smallint
            0x0014    Tinyint
            0x0015    Duration
            0x0020    List: the value is an [option], representing the type
                            of the elements of the list.
            0x0021    Map: the value is two [option], representing the types of the
                           keys and values of the map
            0x0022    Set: the value is an [option], representing the type
                            of the elements of the set
            0x0030    UDT: the value is <ks><udt_name><n><name_1><type_1>...<name_n><type_n>
                           where:
                              - <ks> is a [string] representing the keyspace name this
                                UDT is part of.
                              - <udt_name> is a [string] representing the UDT name.
                        - <n> is a [short] representing the number of fields of
                          the UDT, and thus the number of <name_i><type_i> pairs
                          following.
                        - <name_i> is a [string] representing the name of the
                          i_th field of the UDT.
                        - <type_i> is an [option] representing the type of the
                          i_th field of the UDT.
            0x0031    Tuple: the value is <n><type_1>...<type_n> where <n> is a [short]
                      representing the number of values in the type, and <type_i>
                      are [option] representing the type of the i_th component
                      of the tuple.

    - <rows_count> is an [int] representing the number of rows present in this
      result. Those rows are serialized in the <rows_content> part.
    - <rows_content> is composed of <row_1>...<row_m> where m is <rows_count>.
      Each <row_i> is composed of <value_1>...<value_n> where n is
      <columns_count> and where <value_j> is a [bytes] representing the value
      returned for the jth column of the ith row. In other words, <rows_content>
      is composed of (<rows_count> * <columns_count>) [bytes].


4.2.5.3. Set_keyspace

  The result to a `use` query. The body (after the kind [int]) is a single
  [string] indicating the name of the keyspace that has been set.


4.2.5.4. Prepared

  The result to a PREPARE message. The body of a Prepared result is:
    <id><metadata><result_metadata>
  where:
    - <id> is [short bytes] representing the prepared query ID.
    - <metadata> is composed of:
        <flags><columns_count><pk_count>[<pk_index_1>...<pk_index_n>][<global_table_spec>?<col_spec_1>...<col_spec_n>]
      where:
        - <flags> is an [int]. The bits of <flags> provide information on the
          formatting of the remaining information. A flag is set if the bit
          corresponding to its `mask` is set. Supported masks and their flags
          are:
            0x0001    Global_tables_spec: if set, only one table spec (keyspace
                      and table name) is provided as <global_table_spec>. If not
                      set, <global_table_spec> is not present.
        - <columns_count> is an [int] representing the number of bind markers
          in the prepared statement. It defines the number of <col_spec_i>
          elements.
        - <pk_count> is an [int] representing the number of <pk_index_i>
          elements to follow. If this value is zero, at least one of the
          partition key columns in the table that the statement acts on
          did not have a corresponding bind marker (or the bind marker
          was wrapped in a function call).
        - <pk_index_i> is a [short] that represents the index of the bind marker
          that corresponds to the partition key column in position i.
          For example, a <pk_index> sequence of [2, 0, 1] indicates that the
          table has three partition key columns; the full partition key
          can be constructed by creating a composite of the values for
          the bind markers at index 2, at index 0, and at index 1.
          This allows implementations with token-aware routing to correctly
          construct the partition key without needing to inspect table
          metadata.
        - <global_table_spec> is present if the Global_tables_spec is set in
          <flags>. If present, it is composed of two [string]s. The first
          [string] is the name of the keyspace that the statement acts on.
          The second [string] is the name of the table that the columns
          represented by the bind markers belong to.
        - <col_spec_i> specifies the bind markers in the prepared statement.
          There are <columns_count> such column specifications, each with the
          following format:
            (<ksname><tablename>)?<name><type>
          The initial <ksname> and <tablename> are two [string] that are only
          present if the Global_tables_spec flag is not set. The <name> field
          is a [string] that holds the name of the bind marker (if named),
          or the name of the column, field, or expression that the bind marker
          corresponds to (if the bind marker is "anonymous"). The <type>
          field is an [option] that represents the expected type of values for
          the bind marker.
          See the Rows documentation (section 4.2.5.2) for full details on the
          <type> field.

    - <result_metadata> is defined exactly the same as <metadata> in the Rows
      documentation (section 4.2.5.2). This describes the metadata for the
      result set that will be returned when this prepared statement is executed.
      Note that <result_metadata> may be empty (have the No_metadata flag and
      0 columns, see section 4.2.5.2) and will be for any query that is not a
      Select. In fact, there is never a guarantee that this will be non-empty, so
      implementations should protect themselves accordingly. This result metadata
      is an optimization that allows implementations to later execute the
      prepared statement without requesting the metadata (see the Skip_metadata
      flag in EXECUTE). Clients can safely discard this metadata if they do not
      want to take advantage of that optimization.

  Note that the prepared query ID returned is global to the node on which the query
  has been prepared. It can be used on any connection to that node
  until the node is restarted (after which the query must be reprepared).

4.2.5.5. Schema_change

  The result to a schema altering query (creation/update/drop of a
  keyspace/table/index). The body (after the kind [int]) is the same
  as the body for a "SCHEMA_CHANGE" event, so 3 strings:
    <change_type><target><options>
  Please refer to section 4.2.6 below for the meaning of those fields.

  Note that a query to create or drop an index is considered to be a change
  to the table the index is on.


4.2.6. EVENT

  An event pushed by the server. A client will only receive events for the
  types it has REGISTERed to. The body of an EVENT message will start with a
  [string] representing the event type. The rest of the message depends on the
  event type. The valid event types are:
    - "TOPOLOGY_CHANGE": events related to change in the cluster topology.
      Currently, events are sent when new nodes are added to the cluster, and
      when nodes are removed. The body of the message (after the event type)
      consists of a [string] and an [inet], corresponding respectively to the
      type of change ("NEW_NODE" or "REMOVED_NODE") followed by the address of
      the new/removed node.
    - "STATUS_CHANGE": events related to change of node status. Currently,
      up/down events are sent. The body of the message (after the event type)
      consists of a [string] and an [inet], corresponding respectively to the
      type of status change ("UP" or "DOWN") followed by the address of the
      concerned node.
    - "SCHEMA_CHANGE": events related to schema change. After the event type,
      the rest of the message will be <change_type><target><options> where:
        - <change_type> is a [string] representing the type of change involved.
          It will be one of "CREATED", "UPDATED" or "DROPPED".
        - <target> is a [string] that can be one of "KEYSPACE", "TABLE", "TYPE",
          "FUNCTION" or "AGGREGATE" and describes what has been modified
          ("TYPE" stands for modifications related to user types, "FUNCTION"
          for modifications related to user defined functions, "AGGREGATE"
          for modifications related to user defined aggregates).
        - <options> depends on the preceding <target>:
          - If <target> is "KEYSPACE", then <options> will be a single [string]
            representing the keyspace changed.
          - If <target> is "TABLE" or "TYPE", then
            <options> will be 2 [string]: the first one will be the keyspace
            containing the affected object, and the second one will be the name
            of said affected object (either the table, user type, function, or
            aggregate name).
          - If <target> is "FUNCTION" or "AGGREGATE", multiple arguments follow:
            - [string] keyspace containing the user defined function / aggregate
            - [string] the function/aggregate name
            - [string list] one string for each argument type (as CQL type)

  All EVENT messages have a streamId of -1 (Section 2.3).

  Please note that "NEW_NODE" and "UP" events are sent based on internal Gossip
  communication and as such may be sent shortly before the binary
  protocol server on the newly up node is fully started. Clients are thus
  advised to wait a short time before trying to connect to the node (1 second
  should be enough), otherwise they may experience a connection refusal at
  first.

4.2.7. AUTH_CHALLENGE

  A server authentication challenge (see AUTH_RESPONSE (Section 4.1.2) for more
  details).

  The body of this message is a single [bytes] token. The details of what this
  token contains (and when it can be null/empty, if ever) depend on the actual
  authenticator used.

  Clients are expected to answer the server challenge with an AUTH_RESPONSE
  message.

4.2.8. AUTH_SUCCESS

  Indicates the success of the authentication phase. See Section 4.2.3 for more
  details.

  The body of this message is a single [bytes] token holding final information
  from the server that the client may require to finish the authentication
  process. What that token contains and whether it can be null depends on the
  actual authenticator used.

5. Compression

  Frame compression is supported by the protocol, but only the frame body
  is compressed (the frame header should never be compressed).

  Before being used, client and server must agree on a compression algorithm to
  use, which is done in the STARTUP message. As a consequence, a STARTUP message
  must never be compressed.
  However, once the STARTUP frame has been received
  by the server, messages can be compressed (including the response to the STARTUP
  request). Frames do not have to be compressed, however, even if compression has
  been agreed upon (a server may only compress frames above a certain size at its
  discretion). A frame body should be compressed if and only if the compressed
  flag (see Section 2.2) is set.

  As of version 2 of the protocol, the following compressions are available:
    - lz4 (https://code.google.com/p/lz4/). In that case, note that the first four
      bytes of the body will be the uncompressed length (followed by the compressed
      bytes).
    - snappy (https://code.google.com/p/snappy/). This compression might not be
      available as it depends on a native lib (server-side) that might not be
      available on some installations.


6. Data Type Serialization Formats

  This section describes the serialization formats for all CQL data types
  supported by Cassandra through the native protocol. These serialization
  formats should be used by client drivers to encode values for EXECUTE
  messages. Cassandra will use these formats when returning values in
  RESULT messages.

  All values are represented as [bytes] in EXECUTE and RESULT messages.
  The [bytes] format includes an int prefix denoting the length of the value.
  For that reason, the serialization formats described here will not include
  a length component.

  For legacy compatibility reasons, note that most non-string types support
  "empty" values (i.e. a value with zero length). An empty value is distinct
  from NULL, which is encoded with a negative length.

  As with the rest of the native protocol, all encodings are big-endian.

6.1. ascii

  A sequence of bytes in the ASCII range [0, 127]. Bytes with values outside of
  this range will result in a validation error.
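
  As an illustration of the [bytes] framing described at the start of this
  section and of the ascii range check in 6.1, the following is a minimal
  sketch. The helper names write_bytes and encode_ascii are hypothetical and
  not part of the protocol or of any driver; the sketch only assumes what the
  text states (a big-endian int length prefix, a negative length for NULL, a
  zero length for "empty").

```python
import struct
from typing import Optional


def encode_ascii(text: str) -> bytes:
    """Serialize an ascii value; reject bytes outside [0, 127]."""
    raw = text.encode("utf-8")
    if any(b > 127 for b in raw):
        raise ValueError("ascii values must only contain bytes in [0, 127]")
    return raw


def write_bytes(value: Optional[bytes]) -> bytes:
    """Frame a serialized value as [bytes]: int length prefix, then payload.

    None stands for NULL and is written with a negative length (-1 here);
    b"" is the distinct "empty" value with length 0.
    """
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value
```

  For example, write_bytes(encode_ascii("abc")) yields the four length bytes
  00 00 00 03 followed by the three payload bytes.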

6.2. bigint

  An eight-byte two's complement integer.

6.3. blob

  Any sequence of bytes.

6.4. boolean

  A single byte. A value of 0 denotes "false"; any other value denotes "true".
  (However, it is recommended that a value of 1 be used to represent "true".)

6.5. date

  An unsigned integer representing days with epoch centered at 2^31
  (unix epoch January 1st, 1970).
  A few examples:
    0:    -5877641-06-23
    2^31: 1970-01-01
    2^32: 5881580-07-11

6.6. decimal

  The decimal format represents an arbitrary-precision number. It contains an
  [int] "scale" component followed by a varint encoding (see section 6.24)
  of the unscaled value. The encoded value represents "<unscaled>E<-scale>".
  In other words, "<unscaled> * 10 ^ (-1 * <scale>)".

6.7. double

  An 8 byte floating point number in the IEEE 754 binary64 format.

6.8. duration

  A duration is composed of 3 signed variable length integers ([vint]s).
  The first [vint] represents a number of months, the second [vint] represents
  a number of days, and the last [vint] represents a number of nanoseconds.
  The number of months and days must be valid 32 bits integers whereas the
  number of nanoseconds must be a valid 64 bits integer.
  A duration can either be positive or negative. If a duration is positive
  all the integers must be positive or zero. If a duration is
  negative all the numbers must be negative or zero.

6.9. float

  A 4 byte floating point number in the IEEE 754 binary32 format.

6.10. inet

  A 4 byte or 16 byte sequence denoting an IPv4 or IPv6 address, respectively.

6.11. int

  A 4 byte two's complement integer.

6.12. list

  A [int] n indicating the number of elements in the list, followed by n
  elements. Each element is [bytes] representing the serialized value.
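
  The list layout in 6.12 can be sketched as follows for a list of int values
  (6.11). The function names are hypothetical and chosen for illustration
  only; the sketch assumes non-null elements and follows the description
  above: an [int] element count, then one [bytes] value per element.

```python
import struct
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def write_bytes(value: bytes) -> bytes:
    """Frame a serialized value as [bytes]: int length prefix, then payload."""
    return struct.pack(">i", len(value)) + value


def encode_int(value: int) -> bytes:
    """Serialize a CQL int as a 4 byte big-endian two's complement integer."""
    return struct.pack(">i", value)


def encode_list(elements: Sequence[T], encode_element: Callable[[T], bytes]) -> bytes:
    """Serialize a list: [int] n followed by n [bytes] elements."""
    parts = [struct.pack(">i", len(elements))]
    parts.extend(write_bytes(encode_element(e)) for e in elements)
    return b"".join(parts)


def decode_int_list(data: bytes) -> List[int]:
    """Inverse of encode_list(..., encode_int) for non-null elements."""
    (count,) = struct.unpack_from(">i", data, 0)
    offset = 4
    values = []
    for _ in range(count):
        (length,) = struct.unpack_from(">i", data, offset)
        offset += 4
        (value,) = struct.unpack_from(">i", data, offset)
        offset += length
        values.append(value)
    return values
```

  The same shape applies to set (one [bytes] per element) and to map (two
  [bytes] per entry), with the element encoder swapped accordingly.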

6.13. map

  A [int] n indicating the number of key/value pairs in the map, followed by
  n entries. Each entry is composed of two [bytes] representing the key
  and value.

6.14. set

  A [int] n indicating the number of elements in the set, followed by n
  elements. Each element is [bytes] representing the serialized value.

6.15. smallint

  A 2 byte two's complement integer.

6.16. text

  A sequence of bytes conforming to the UTF-8 specifications.

6.17. time

  An 8 byte two's complement long representing nanoseconds since midnight.
  Valid values are in the range 0 to 86399999999999.

6.18. timestamp

  An 8 byte two's complement integer representing a millisecond-precision
  offset from the unix epoch (00:00:00, January 1st, 1970). Negative values
  represent a negative offset from the epoch.

6.19. timeuuid

  A 16 byte sequence representing a version 1 UUID as defined by RFC 4122.

6.20. tinyint

  A 1 byte two's complement integer.

6.21. tuple

  A sequence of [bytes] values representing the items in a tuple. The encoding
  of each element depends on the data type for that position in the tuple.
  Null values may be represented by using length -1 for the [bytes]
  representation of an element.

6.22. uuid

  A 16 byte sequence representing any valid UUID as defined by RFC 4122.

6.23. varchar

  An alias of the "text" type.

6.24. varint

  A variable-length two's complement encoding of a signed integer.
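
  A minimal sketch of this encoding, assuming the shortest possible byte
  sequence is wanted, is given here. The names encode_varint and
  decode_varint are hypothetical; the code relies only on Python's built-in
  int.to_bytes and int.from_bytes with signed=True.

```python
def encode_varint(value: int) -> bytes:
    """Encode a signed integer as minimal big-endian two's complement bytes."""
    # Magnitude bits plus one sign bit; ~value maps a negative number to the
    # non-negative number with the same count of significant bits.
    magnitude = value if value >= 0 else ~value
    bits = magnitude.bit_length() + 1
    length = (bits + 7) // 8
    return value.to_bytes(length, byteorder="big", signed=True)


def decode_varint(data: bytes) -> int:
    """Decode big-endian two's complement bytes into a signed integer."""
    return int.from_bytes(data, byteorder="big", signed=True)
```

  Because one bit is always reserved for the sign, a positive value such as
  128 automatically receives the leading 0x00 padding byte.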

  The following examples may help implementors of this spec:

  Value | Encoding
  ------|---------
      0 |     0x00
      1 |     0x01
    127 |     0x7F
    128 |   0x0080
    129 |   0x0081
     -1 |     0xFF
   -128 |     0x80
   -129 |   0xFF7F

  Note that positive numbers must use a most-significant byte with a value
  less than 0x80, because a most-significant bit of 1 indicates a negative
  value. Implementors should pad positive values that have a MSB >= 0x80
  with a leading 0x00 byte.


7. User Defined Types

  This section describes the serialization format for User defined types (UDT),
  as described in section 4.2.5.2.

  A UDT value is composed of successive [bytes] values, one for each field of the UDT
  value (in the order defined by the type). A UDT value will generally have one value
  for each field of the type it represents, but it is allowed to have fewer values than
  the type has fields.


8. Result paging

  The protocol allows for paging the result of queries. For that, the QUERY and
  EXECUTE messages have a <result_page_size> value that indicates the desired
  page size in CQL3 rows.

  If a positive value is provided for <result_page_size>, the result set of the
  RESULT message returned for the query will contain at most the
  <result_page_size> first rows of the query result. If that first page of results
  contains the full result set for the query, the RESULT message (of kind `Rows`)
  will have the Has_more_pages flag *not* set. However, if some results are not
  part of the first response, the Has_more_pages flag will be set and the result
  will contain a <paging_state> value. In that case, the <paging_state> value
  should be used in a QUERY or EXECUTE message (that has the *same* query as
  the original one or the behavior is undefined) to retrieve the next page of
  results.
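
  The paging exchange described above can be sketched as a client-side loop.
  Everything named here is hypothetical: send_query stands in for whatever a
  driver uses to issue a QUERY or EXECUTE and parse the Rows result, and
  make_fake_server is an in-memory stand-in used only to make the sketch
  runnable. Real <paging_state> values are opaque to the client.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, ...]
# (rows of this page, Has_more_pages flag, <paging_state> or None)
Page = Tuple[List[Row], bool, Optional[bytes]]
SendQuery = Callable[[str, int, Optional[bytes]], Page]


def fetch_all_rows(send_query: SendQuery, query: str, page_size: int) -> List[Row]:
    """Re-issue the same query with the returned paging state until done."""
    rows: List[Row] = []
    paging_state: Optional[bytes] = None
    while True:
        page_rows, has_more_pages, paging_state = send_query(
            query, page_size, paging_state
        )
        rows.extend(page_rows)
        # Rely on the flag, not on the number of rows in the page.
        if not has_more_pages:
            return rows


def make_fake_server(table: List[Row]) -> SendQuery:
    """In-memory stand-in for a node; its paging state is just an offset."""

    def send_query(query: str, page_size: int, paging_state: Optional[bytes]) -> Page:
        offset = int.from_bytes(paging_state, "big") if paging_state is not None else 0
        page = table[offset:offset + page_size]
        next_offset = offset + len(page)
        has_more = next_offset < len(table)
        state = next_offset.to_bytes(4, "big") if has_more else None
        return page, has_more, state

    return send_query
```

  The loop terminates on the Has_more_pages flag alone, which keeps it
  correct even if a server returns pages smaller or larger than requested.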

  Only CQL3 queries that return a result set (RESULT message with a Rows `kind`)
  support paging. For other types of queries, the <result_page_size> value is
  ignored.

  Note to client implementors:
  - While <result_page_size> can be as low as 1, it will likely be detrimental
    to performance to pick a value too low. A value below 100 is probably too
    low for most use cases.
  - Clients should not rely on the actual size of the result set returned to
    decide if there are more results to fetch or not. Instead, they should always
    check the Has_more_pages flag (unless they did not enable paging for the query
    obviously). Clients should also not assert that no result will have more than
    <result_page_size> results. While the current implementation always respects
    the exact value of <result_page_size>, we reserve the right to return
    slightly smaller or bigger pages in the future for performance reasons.
  - The <paging_state> is specific to a protocol version and drivers should not
    send a <paging_state> returned by a node using the protocol v3 to query a node
    using the protocol v4 for instance.


9. Error codes

  Let us recall that an ERROR message is composed of <code><message>[...]
  (see 4.2.1 for details). The supported error codes, as well as any additional
  information the message may contain after the <message> are described below:
    0x0000    Server error: something unexpected happened. This indicates a
              server-side bug.
    0x000A    Protocol error: some client message triggered a protocol
              violation (for instance a QUERY message is sent before a STARTUP
              one has been sent).
    0x0100    Authentication error: authentication was required and failed. The
              possible reason for failing depends on the authenticator in use,
              which may or may not include more detail in the accompanying
              error message.
    0x1000    Unavailable exception.
              The rest of the ERROR message body will be
                <cl><required><alive>
              where:
                <cl> is the [consistency] level of the query that triggered
                     the exception.
                <required> is an [int] representing the number of nodes that
                           should be alive to respect <cl>.
                <alive> is an [int] representing the number of replicas that
                        were known to be alive when the request had been
                        processed (since an unavailable exception has been
                        triggered, there will be <alive> < <required>).
    0x1001    Overloaded: the request cannot be processed because the
              coordinator node is overloaded.
    0x1002    Is_bootstrapping: the request was a read request but the
              coordinator node is bootstrapping.
    0x1003    Truncate_error: error during a truncation.
    0x1100    Write_timeout: Timeout exception during a write request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><writeType>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           acknowledged the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <writeType> is a [string] that describes the type of the write
                            that timed out. The value of that string can be one
                            of:
                             - "SIMPLE": the write was a non-batched
                               non-counter write.
                             - "BATCH": the write was a (logged) batch write.
                               If this type is received, it means the batch log
                               has been successfully written (otherwise a
                               "BATCH_LOG" type would have been sent instead).
                             - "UNLOGGED_BATCH": the write was an unlogged
                               batch. No batch log write has been attempted.
                             - "COUNTER": the write was a counter write
                               (batched or not).
                             - "BATCH_LOG": the timeout occurred during the
                               write to the batch log when a (logged) batch
                               write was requested.
    0x1200    Read_timeout: Timeout exception during a read request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><data_present>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           response is required to achieve <cl>. Please note that
                           it is possible to have <received> >= <blockfor> if
                           <data_present> is false. This can also happen in the
                           (unlikely) case where <cl> is achieved but the
                           coordinator node times out while waiting for
                           read-repair acknowledgement.
                <data_present> is a single byte. If its value is 0, it means
                               the replica that was asked for data has not
                               responded. Otherwise, the value is != 0.
    0x1300    Read_failure: A non-timeout exception during a read request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><reasonmap><data_present>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <reasonmap> is a map of endpoint to failure reason codes. This maps
                            the endpoints of the replica nodes that failed when
                            executing the request to a code representing the reason
                            for the failure.
                            The map is encoded starting with an [int] n
                            followed by n pairs of <endpoint><failurecode> where
                            <endpoint> is an [inetaddr] and <failurecode> is a [short]
                            that has the following meaning:
                              0x0000 Unknown reason
                              0x0001 Too many tombstones read (as controlled by the
                                     yaml tombstone_failure_threshold option)
                              0x0002 The query uses an index but that index is not available
                                     (built) on the queried <endpoint>.
                              0x0003 The query writes on some CDC enabled tables, but the CDC
                                     space is full (CDC data isn't consumed fast enough). Note
                                     that this can only happen in Write_failure in practice, but
                                     the reasons are shared between both exceptions.
                              0x0004 Some failures (one or more) were reported to the replica
                                     "leading" a counter write. The actual error didn't
                                     necessarily occur on the node that sent this failure; it is
                                     simply the node reporting it due to how counter writes
                                     work. The initial reason for the failure should have been
                                     logged on the actual replica on which the problem occurred,
                                     which may or may not be the same node.
                            Any other value for <failurecode> must be considered as an
                            Unknown reason (but drivers should not fail) as new <failurecode>
                            may be added without a bump of the protocol version.
                <data_present> is a single byte. If its value is 0, it means
                               the replica that was asked for data had not
                               responded. Otherwise, the value is != 0.
    0x1400    Function_failure: A (user defined) function failed during execution.
              The rest of the ERROR message body will be
                <keyspace><function><arg_types>
              where:
                <keyspace> is the keyspace [string] of the failed function
                <function> is the name [string] of the failed function
                <arg_types> [string list] one string for each argument type (as CQL type) of the failed function
    0x1500    Write_failure: A non-timeout exception during a write request.
              The rest of the ERROR message body will be
                <cl><received><blockfor><reasonmap><writeType>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <reasonmap> is a map of endpoint to failure reason codes. This maps
                            the endpoints of the replica nodes that failed when
                            executing the request to a code representing the reason
                            for the failure. The map is encoded starting with an [int] n
                            followed by n pairs of <endpoint><failurecode> where
                            <endpoint> is an [inetaddr] and <failurecode> is a [short]
                            whose meaning is the same as in Read_failure (see above,
                            though note that some reasons only apply to writes and others
                            only to reads).
                <writeType> is a [string] that describes the type of the write
                            that failed. The value of that string can be one
                            of:
                             - "SIMPLE": the write was a non-batched
                               non-counter write.
                             - "BATCH": the write was a (logged) batch write.
                               If this type is received, it means the batch log
                               has been successfully written (otherwise a
                               "BATCH_LOG" type would have been sent instead).
                             - "UNLOGGED_BATCH": the write was an unlogged
                               batch. No batch log write has been attempted.
                             - "COUNTER": the write was a counter write
                               (batched or not).
                             - "BATCH_LOG": the failure occurred during the
                               write to the batch log when a (logged) batch
                               write was requested.

    0x2000    Syntax_error: The submitted query has a syntax error.
    0x2100    Unauthorized: The logged user doesn't have the right to perform
              the query.
    0x2200    Invalid: The query is syntactically correct but invalid.
    0x2300    Config_error: The query is invalid because of some configuration issue.
    0x2400    Already_exists: The query attempted to create a keyspace or a
              table that was already existing. The rest of the ERROR message
              body will be <ks><table> where:
                <ks> is a [string] representing either the keyspace that
                     already exists, or the keyspace in which the table that
                     already exists is.
                <table> is a [string] representing the name of the table that
                        already exists. If the query was attempting to create a
                        keyspace, <table> will be present but will be the empty
                        string.
    0x2500    Unprepared: Can be thrown while a prepared statement tries to be
              executed if the provided prepared statement ID is not known by
              this host. The rest of the ERROR message body will be [short
              bytes] representing the unknown ID.

    0x8000    Client_write_failure: an error occurred when sending asynchronous results to
              the client, for example if the client is unable to keep up with the rate during
              a continuous paging session.

10. Changes from CQL binary protocol version 5

  * Second most significant bit in the frame version byte is set to one to indicate
    a dse protocol message (section 2.1)

  * Continuous paging:
    * Added options to QUERY message (section 4.1.4)
    * Added response parameters to ROWS response (section 4.2.5.2)

  * Added CANCEL message (section 4.1.9)

  * Does _not_ have keyspace field in QUERY, PREPARE, and BATCH messages (Sections 4.1.4, 4.1.5, and 4.1.7 of native-protocol-v5).
  * Does _not_ have [int] flags field in PREPARE message (Section 4.1.5 of native-protocol-v5).
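
  To illustrate the first change listed above, the following sketch splits a
  frame version byte into its parts as laid out in section 2.1. The function
  name and the returned tuple are hypothetical. Treating the low six bits as
  the version number is an assumption that holds for the two values this
  document defines (0x41 and 0xC1).

```python
from typing import Tuple


def parse_version_byte(version: int) -> Tuple[str, bool, int]:
    """Split a frame version byte into (direction, is_dse, protocol_version)."""
    if not 0 <= version <= 0xFF:
        raise ValueError("version must fit in a single byte")
    # Most significant bit: 0 for a request, 1 for a response.
    direction = "response" if version & 0x80 else "request"
    # Second most significant bit: set for a dse protocol message.
    is_dse = bool(version & 0x40)
    # Assumption: the remaining six bits carry the protocol version number.
    protocol_version = version & 0x3F
    return direction, is_dse, protocol_version
```

  With this split, 0x41 reads as a dse v1 request and 0xC1 as a dse v1
  response.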