                             DSE BINARY PROTOCOL v2


Table of Contents

  1. Overview
  2. Frame header
    2.1. version
    2.2. flags
    2.3. stream
    2.4. opcode
    2.5. length
  3. Notations
  4. Messages
    4.1. Requests
      4.1.1. STARTUP
      4.1.2. AUTH_RESPONSE
      4.1.3. OPTIONS
      4.1.4. QUERY
      4.1.5. PREPARE
      4.1.6. EXECUTE
      4.1.7. BATCH
      4.1.8. REGISTER
      4.1.9. REVISE_REQUEST
    4.2. Responses
      4.2.1. ERROR
      4.2.2. READY
      4.2.3. AUTHENTICATE
      4.2.4. SUPPORTED
      4.2.5. RESULT
        4.2.5.1. Void
        4.2.5.2. Rows
        4.2.5.3. Set_keyspace
        4.2.5.4. Prepared
        4.2.5.5. Schema_change
      4.2.6. EVENT
      4.2.7. AUTH_CHALLENGE
      4.2.8. AUTH_SUCCESS
  5. Compression
  6. Data Type Serialization Formats
  7. User Defined Type Serialization
  8. Result paging
  9. Error codes
  10. Changes
    10.1. Changes from CQL binary protocol v5
    10.2. Changes from DSE binary protocol v1


1. Overview

  The CQL binary protocol is a frame based protocol. Frames are defined as:

      0         8        16        24        32         40
      +---------+---------+---------+---------+---------+
      | version |  flags  |      stream       | opcode  |
      +---------+---------+---------+---------+---------+
      |                length                 |
      +---------+---------+---------+---------+
      |                                       |
      .            ...  body ...              .
      .                                       .
      .                                       .
      +----------------------------------------

  The protocol is big-endian (network byte order).

  Each frame contains a fixed size header (9 bytes) followed by a variable size
  body. The header is described in Section 2. The content of the body depends
  on the header opcode value (the body can in particular be empty for some
  opcode values). The list of allowed opcodes is defined in Section 2.4 and the
  details of each corresponding message are described in Section 4.

  The protocol distinguishes two types of frames: requests and responses.
  Requests are those frames sent by the client to the server. Responses are
  those frames sent by the server to the client. Note, however, that the
  protocol supports server pushes (events) so a response does not necessarily
  come right after a client request.

  Note to client implementors: client libraries should always assume that the
  body of a given frame may contain more data than what is described in this
  document. It will however always be safe to ignore the remainder of the frame
  body in such cases. The reason is that this may enable extending the protocol
  with optional features without needing to change the protocol version.



2. Frame header

2.1. version

  The version is a single byte that indicates both the direction of the message
  (request or response) and the version of the protocol in use. The most
  significant bit of version is used to define the direction of the message:
  0 indicates a request, 1 indicates a response. This can be useful for protocol
  analyzers to distinguish the nature of the packet from the direction in which
  it is moving.

  The next most significant bit must be set to one to indicate that this
  is a dse private version and not a public open source version.

  The rest of that byte is the dse protocol version (2 for the protocol
  defined in this document). In other words, for this version of the protocol,
  version will be one of:
    0x42 (0100 0010)   Request frame for this dse protocol version
    0xC2 (1100 0010)   Response frame for this dse protocol version

  Please note that while every message ships with the version, only one version
  of messages is accepted on a given connection. In other words, the first message
  exchanged (STARTUP) sets the version for the connection for the lifetime of this
  connection. The single exception to this behavior is when a startup message
  is sent with a version that is higher than the current server version.
  In this case, the server will respond with its current version.

  This document describes version 2 of the dse protocol. For the changes made since
  version 5 of the public protocol, see Section 10.


2.2. flags

  Flags applying to this frame. The flags have the following meaning (described
  by the mask that allows selecting them):
    0x01: Compression flag. If set, the frame body is compressed. The actual
          compression to use should have been set up beforehand through the
          STARTUP message (which thus cannot be compressed; Section 4.1.1).
    0x02: Tracing flag. For a request frame, this indicates the client requires
          tracing of the request. Note that only QUERY, PREPARE and EXECUTE queries
          support tracing. Other requests will simply ignore the tracing flag if
          set. If a request supports tracing and the tracing flag is set, the response
          to this request will have the tracing flag set and contain tracing
          information.
          If a response frame has the tracing flag set, its body contains
          a tracing ID. The tracing ID is a [uuid] and is the first thing in
          the frame body. The rest of the body will then be the usual body
          corresponding to the response opcode.
    0x04: Custom payload flag. For a request or response frame, this indicates
          that a generic key-value custom payload for a custom QueryHandler
          implementation is present in the frame. Such a custom payload is simply
          ignored by the default QueryHandler implementation.
          Currently, only QUERY, PREPARE, EXECUTE and BATCH requests support
          payload.
          The type of the custom payload is [bytes map] (see below).
    0x08: Warning flag. The response contains warnings which were generated by the
          server to go along with this response.
          If a response frame has the warning flag set, its body will contain the
          text of the warnings.
          The warnings are a [string list] and will be the
          first value in the frame body if the tracing flag is not set, or directly
          after the tracing ID if it is.
    0x10: Use beta flag. Indicates that the client opts in to use a protocol version
          that is currently in beta. The server will respond with ERROR if the
          protocol version is marked as beta on the server and the client does not
          provide this flag.

  The rest of the flags are currently unused and ignored.

2.3. stream

  A frame has a stream id (a [short] value). When sending request messages, this
  stream id must be set by the client to a non-negative value (negative stream ids
  are reserved for streams initiated by the server; currently all EVENT messages
  (section 4.2.6) have a streamId of -1). If a client sends a request message
  with the stream id X, it is guaranteed that the stream id of the response to
  that message will be X.

  This helps to enable the asynchronous nature of the protocol. If a client
  sends multiple messages simultaneously (without waiting for responses), there
  is no guarantee on the order of the responses. For instance, if the client
  writes REQ_1, REQ_2, REQ_3 on the wire (in that order), the server might
  respond to REQ_3 (or REQ_2) first. Assigning different stream ids to these 3
  requests allows the client to distinguish to which request a received answer
  responds. As there can only be 32768 different simultaneous streams, it is up
  to the client to reuse stream ids.

  Note that clients are free to use the protocol synchronously (i.e. wait for
  the response to REQ_N before sending REQ_N+1). In that case, the stream id
  can be safely set to 0. Clients should also feel free to use only a subset of
  the 32768 maximum possible stream ids if it is simpler for their implementation.

2.4. opcode

  An integer byte that distinguishes the actual message:
    0x00    ERROR
    0x01    STARTUP
    0x02    READY
    0x03    AUTHENTICATE
    0x05    OPTIONS
    0x06    SUPPORTED
    0x07    QUERY
    0x08    RESULT
    0x09    PREPARE
    0x0A    EXECUTE
    0x0B    REGISTER
    0x0C    EVENT
    0x0D    BATCH
    0x0E    AUTH_CHALLENGE
    0x0F    AUTH_RESPONSE
    0x10    AUTH_SUCCESS
    0xFF    REVISE_REQUEST

  Messages are described in Section 4.

  (Note that there is no 0x04 message in this version of the protocol)


2.5. length

  A 4 byte integer representing the length of the body of the frame (note:
  currently a frame is limited to 256MB in length).


3. Notations

  To describe the layout of the frame body for the messages in Section 4, we
  define the following:

    [int]             A 4-byte integer
    [long]            An 8-byte integer
    [byte]            A 1-byte unsigned integer
    [short]           A 2-byte unsigned integer
    [string]          A [short] n, followed by n bytes representing a UTF-8
                      string.
    [long string]     An [int] n, followed by n bytes representing a UTF-8 string.
    [uuid]            A 16-byte uuid.
    [string list]     A [short] n, followed by n [string].
    [bytes]           An [int] n, followed by n bytes if n >= 0. If n < 0,
                      no byte should follow and the value represented is `null`.
    [value]           An [int] n, followed by n bytes if n >= 0.
                      If n == -1 no byte should follow and the value represented is `null`.
                      If n == -2 no byte should follow and the value represented is
                      `not set` not resulting in any change to the existing value.
                      n < -2 is an invalid value and results in an error.
    [short bytes]     A [short] n, followed by n bytes if n >= 0.

    [unsigned vint]   An unsigned variable length integer. A vint is encoded with the most significant byte (MSB) first.
                      The most significant byte will contain the information about how many extra bytes need to be read
                      as well as the most significant bits of the integer.
                      The number of extra bytes to read is encoded as 1 bits on the left side.
                      For example, if we need to read 2 more bytes the first byte will start with 110
                      (e.g. 256 000 will be encoded on 3 bytes as [110]00011 11101000 00000000)
                      If the encoded integer is 8 bytes long the vint will be encoded on 9 bytes and the first
                      byte will be: 11111111

    [vint]            A signed variable length integer. This is encoded using zig-zag encoding and then sent
                      like an [unsigned vint]. Zig-zag encoding converts numbers as follows:
                      0 = 0, -1 = 1, 1 = 2, -2 = 3, 2 = 4, -3 = 5, 3 = 6 and so forth.
                      The purpose is to send small negative values as small unsigned values, so that we save bytes on the wire.
                      To encode a value n use "(n >> 31) ^ (n << 1)" for 32 bit values, and "(n >> 63) ^ (n << 1)"
                      for 64 bit values where "^" is the xor operation, "<<" is the left shift operation and ">>" is
                      the arithmetic right shift operation (highest-order bit is replicated).
                      Decode with "(n >> 1) ^ -(n & 1)".

    [option]          A pair of <id><value> where <id> is a [short] representing
                      the option id and <value> depends on that option (and can be
                      of size 0). The supported id (and the corresponding <value>)
                      will be described when this is used.
    [option list]     A [short] n, followed by n [option].
    [inet]            An address (ip and port) to a node. It consists of one
                      [byte] n, that represents the address size, followed by n
                      [byte] representing the IP address (in practice n can only be
                      either 4 (IPv4) or 16 (IPv6)), followed by one [int]
                      representing the port.
    [inetaddr]        An IP address (without a port) to a node. It consists of one
                      [byte] n, that represents the address size, followed by n
                      [byte] representing the IP address.
    [consistency]     A consistency level specification.
                      This is a [short]
                      representing a consistency level with the following
                      correspondence:
                        0x0000    ANY
                        0x0001    ONE
                        0x0002    TWO
                        0x0003    THREE
                        0x0004    QUORUM
                        0x0005    ALL
                        0x0006    LOCAL_QUORUM
                        0x0007    EACH_QUORUM
                        0x0008    SERIAL
                        0x0009    LOCAL_SERIAL
                        0x000A    LOCAL_ONE

    [string map]      A [short] n, followed by n pairs <k><v> where <k> and <v>
                      are [string].
    [string multimap] A [short] n, followed by n pairs <k><v> where <k> is a
                      [string] and <v> is a [string list].
    [bytes map]       A [short] n, followed by n pairs <k><v> where <k> is a
                      [string] and <v> is a [bytes].


4. Messages

4.1. Requests

  Note that outside of their normal responses (described below), all requests
  can get an ERROR message (Section 4.2.1) as response.

4.1.1. STARTUP

  Initialize the connection. The server will respond by either a READY message
  (in which case the connection is ready for queries) or an AUTHENTICATE message
  (in which case credentials will need to be provided using AUTH_RESPONSE).

  This must be the first message of the connection, except for OPTIONS that can
  be sent before to find out the options supported by the server. Once the
  connection has been initialized, a client should not send any more STARTUP
  messages.

  The body is a [string map] of options. Possible options are:
    - "CQL_VERSION": the version of CQL to use. This option is mandatory and
      currently the only version supported is "3.0.0". Note that this is
      different from the protocol version.
    - "COMPRESSION": the compression algorithm to use for frames (See section 5).
      This is optional; if not specified no compression will be used.
    - "CLIENT_ID": string representation of the client instance. The recommended
      value is an ID unique per runtime instance (e.g. DataStax Java Driver's
      Cluster instance), generated by the driver.
    - "APPLICATION_NAME": optional, name of the application, should include the
      vendor name. For example "DataStax Studio".
    - "APPLICATION_VERSION": optional, version of the application.
    - "DRIVER_NAME": product name of the driver implementation. For example:
      "DataStax Java Driver".
    - "DRIVER_VERSION": version of the driver implementation, typically a
      semantic version string.

  Strictly speaking, the parameters "CLIENT_ID", "APPLICATION_NAME",
  "APPLICATION_VERSION", "APPLICATION_INSTANCE", "DRIVER_NAME" and "DRIVER_VERSION"
  are not restricted to this protocol version but can also be passed using older
  protocol versions. It depends on the DSE version whether and how this information
  is used or is accessible.


4.1.2. AUTH_RESPONSE

  Answers a server authentication challenge.

  Authentication in the protocol is SASL based. The server sends authentication
  challenges (a bytes token) to which the client answers with this message. Those
  exchanges continue until the server accepts the authentication by sending an
  AUTH_SUCCESS message after a client AUTH_RESPONSE. Note that the exchange
  begins with the client sending an initial AUTH_RESPONSE in response to a
  server AUTHENTICATE request.

  The body of this message is a single [bytes] token. The details of what this
  token contains (and when it can be null/empty, if ever) depend on the actual
  authenticator used.

  The response to an AUTH_RESPONSE is either a follow-up AUTH_CHALLENGE message,
  an AUTH_SUCCESS message or an ERROR message.


4.1.3. OPTIONS

  Asks the server to return which STARTUP options are supported. The body of an
  OPTIONS message should be empty and the server will respond with a SUPPORTED
  message.


4.1.4. QUERY

  Performs a CQL query.
  The body of the message must be:
    <query><query_parameters>
  where <query> is a [long string] representing the query and
  <query_parameters> must be
    <consistency><flags>[<n>[name_1]<value_1>...[name_n]<value_n>][<result_page_size>][<paging_state>]
    [<serial_consistency>][<timestamp>][<keyspace>][<continuous_paging_options>]
  where:
    - <consistency> is the [consistency] level for the operation.
    - <flags> is an [int] whose bits define the options for this query and
      in particular influence what the remainder of the message contains.
      A flag is set if the bit corresponding to its `mask` is set. Supported
      flags are, given their mask:
        0x00000001: Values. If set, a [short] <n> followed by <n> [value]
                    values are provided. Those values are used for bound variables in
                    the query. Optionally, if the 0x40 flag is present, each value
                    will be preceded by a [string] name, representing the name of
                    the marker the value must be bound to.
        0x00000002: Skip_metadata. If set, the Result Set returned as a response
                    to the query (if any) will have the NO_METADATA flag (see
                    Section 4.2.5.2).
        0x00000004: Page_size. If set, <result_page_size> is a positive [int]
                    controlling the desired page size of the result in CQL3 rows or
                    in bytes, if Page_size_bytes is set.
                    See the section on paging (Section 8) for more details.
        0x00000008: With_paging_state. If set, <paging_state> should be present.
                    <paging_state> is a [bytes] value that should have been returned
                    in a result set (Section 4.2.5.2). The query will be
                    executed but starting from a given paging state. An error will be
                    returned if the paging_state is present but no page_size has been
                    specified. The paging state can also be used to
                    continue paging on a different node than the one where it
                    started (See Section 8 for more details).
        0x00000010: With serial consistency. If set, <serial_consistency> should be
                    present.
                    <serial_consistency> is the [consistency] level for the
                    serial phase of conditional updates. That consistency can only be
                    either SERIAL or LOCAL_SERIAL and if not present, it defaults to
                    SERIAL. This option will be ignored for anything else other than a
                    conditional update/insert.
        0x00000020: With default timestamp. If set, <timestamp> should be present.
                    <timestamp> is a [long] representing the default timestamp for the query
                    in microseconds (negative values are forbidden). This will
                    replace the server side assigned timestamp as default timestamp.
                    Note that a timestamp in the query itself will still override
                    this timestamp. This is entirely optional.
        0x00000040: With names for values. This only makes sense if the 0x01 flag is set and
                    is ignored otherwise. If present, the values from the 0x01 flag will
                    be preceded by a name (see above). Note that this is only useful for
                    QUERY requests where named bind markers are used; for EXECUTE statements,
                    since the names for the expected values were returned during preparation,
                    a client can always provide values in the right order without any names
                    and using this flag, while supported, is almost surely inefficient.
        0x00000080: With keyspace. If set, <keyspace> must be present. <keyspace> is a
                    [string] indicating the keyspace that the query should be executed in.
                    It supersedes the keyspace that the connection is bound to, if any.
        0x40000000: Page_size_bytes. If set, <result_page_size> is expressed in bytes. The server
                    will try to return a number of CQL rows whose total size is as close as possible
                    to the requested page size, without splitting any CQL row however. This functionality
                    is currently only supported with continuous paging; setting this flag without
                    setting the continuous paging flag will result in an error.
        0x80000000: With continuous paging. If set, <continuous_paging_options> should be present.
                    This structure contains the following:
                      - <max_num_pages>, an [int] indicating the maximum number of pages that the server will send
                        to the client in total, set this to zero to indicate no limit.
                      - <pages_per_second>, an [int] indicating the maximum number of pages per second, set this
                        to zero to indicate no limit.
                      - <next_pages>, an [int] indicating the number of pages that the client is ready to receive
                        right now. The server will not send more pages than this number, until a REVISE_REQUEST is
                        received with a revision of type backpressure, see section 4.1.9. Set this parameter to zero
                        to indicate no limit. In this case the server will send as many pages as it is able to send,
                        up to <max_num_pages> and at the rate specified by <pages_per_second>.
                    When continuous paging is enabled, the query results will be pushed to the client asynchronously and
                    according to the paging options in the request message, without the client having to request each
                    single page. Each response message will have the same stream id as the initial request.
                    Continuous paging can be interrupted by the client at any time via a REVISE_REQUEST with a revision of
                    type cancel, see section 4.1.9.

  Note that the consistency is ignored by some queries (USE, CREATE, ALTER,
  TRUNCATE, ...).

  The server will respond to a QUERY message with a RESULT message, the content
  of which depends on the query.


4.1.5. PREPARE

  Prepare a query for later execution (through EXECUTE). The body of the message must be:
    <query><flags>[<keyspace>]
  where:
    - <query> is a [long string] representing the CQL query.
    - <flags> is an [int] whose bits define the options for this statement and in particular
      influence what the remainder of the message contains.
      A flag is set if the bit corresponding to its `mask` is set. Supported
      flags are, given their mask:
        0x01: With keyspace.
              If set, <keyspace> must be present. <keyspace> is a
              [string] indicating the keyspace that the query should be executed in.
              It supersedes the keyspace that the connection is bound to, if any.

  The server will respond with a RESULT message with a `prepared` kind (0x0004,
  see Section 4.2.5).


4.1.6. EXECUTE

  Executes a prepared query. The body of the message must be:
    <id><result_metadata_id><query_parameters>
  where
    - <id> is the prepared query ID. It's the [short bytes] returned as a
      response to a PREPARE message. As for <query_parameters>, it has the exact
      same definition as in QUERY (see Section 4.1.4).
    - <result_metadata_id> is the ID of the resultset metadata that was sent
      along with the response to the PREPARE message. If a RESULT/Rows message reports
      changed resultset metadata with the Metadata_changed flag, the reported new
      resultset metadata must be used in subsequent executions.


4.1.7. BATCH

  Allows executing a list of queries (prepared or not) as a batch (note that
  only DML statements are accepted in a batch). The body of the message must
  be:
    <type><n><query_1>...<query_n><consistency><flags>[<serial_consistency>][<timestamp>][<keyspace>]
  where:
    - <type> is a [byte] indicating the type of batch to use:
        - If <type> == 0, the batch will be "logged". This is equivalent to a
          normal CQL3 batch statement.
        - If <type> == 1, the batch will be "unlogged".
        - If <type> == 2, the batch will be a "counter" batch (and non-counter
          statements will be rejected).
    - <flags> is an [int] whose bits define the options for this query and
      in particular influence what the remainder of the message contains. It is similar
      to the <flags> from QUERY and EXECUTE methods, except that the 4 rightmost
      bits must always be 0 as their corresponding options do not make sense for
      Batch. A flag is set if the bit corresponding to its `mask` is set.
      Supported flags are, given their mask:
        0x10: With serial consistency. If set, <serial_consistency> should be
              present. <serial_consistency> is the [consistency] level for the
              serial phase of conditional updates. That consistency can only be
              either SERIAL or LOCAL_SERIAL and if not present, it defaults to
              SERIAL. This option will be ignored for anything else other than a
              conditional update/insert.
        0x20: With default timestamp. If set, <timestamp> should be present.
              <timestamp> is a [long] representing the default timestamp for the query
              in microseconds. This will replace the server side assigned
              timestamp as default timestamp. Note that a timestamp in the query itself
              will still override this timestamp. This is entirely optional.
        0x40: With names for values. If set, then all values for all <query_i> must be
              preceded by a [string] <name_i> that has the same meaning as in QUERY
              requests [IMPORTANT NOTE: this feature does not work and should not be
              used. It is specified in a way that makes it impossible for the server
              to implement. This will be fixed in a future version of the native
              protocol. See https://issues.apache.org/jira/browse/CASSANDRA-10246 for
              more details].
        0x80: With keyspace. If set, <keyspace> must be present. <keyspace> is a
              [string] indicating the keyspace that the query should be executed in.
              It supersedes the keyspace that the connection is bound to, if any.
    - <n> is a [short] indicating the number of following queries.
    - <query_1>...<query_n> are the queries to execute. A <query_i> must be of the
      form:
        <kind><string_or_id><n>[<name_1>]<value_1>...[<name_n>]<value_n>
      where:
       - <kind> is a [byte] indicating whether the following query is a prepared
         one or not. <kind> value must be either 0 or 1.
       - <string_or_id> depends on the value of <kind>.
         If <kind> == 0, it should be
         a [long string] query string (as in QUERY, the query string might contain
         bind markers). Otherwise (that is, if <kind> == 1), it should be a
         [short bytes] representing a prepared query ID.
       - <n> is a [short] indicating the number (possibly 0) of following values.
       - <name_i> is the optional name of the following <value_i>. It must be present
         if and only if the 0x40 flag is provided for the batch.
       - <value_i> is the [value] to use for bound variable i (of bound variable <name_i>
         if the 0x40 flag is used).
    - <consistency> is the [consistency] level for the operation.
    - <serial_consistency> is only present if the 0x10 flag is set. In that case,
      <serial_consistency> is the [consistency] level for the serial phase of
      conditional updates. That consistency can only be either SERIAL or
      LOCAL_SERIAL and if not present will default to SERIAL. This option will
      be ignored for anything else other than a conditional update/insert.

  The server will respond with a RESULT message.


4.1.8. REGISTER

  Register this connection to receive some types of events. The body of the
  message is a [string list] representing the event types to register for. See
  section 4.2.6 for the list of valid event types.

  The response to a REGISTER message will be a READY message.

  Please note that if a client driver maintains multiple connections to a
  Cassandra node and/or connections to multiple nodes, it is advised to
  dedicate a handful of connections to receive events, but to *not* register
  for events on all connections, as this would only result in receiving
  the same event messages multiple times, wasting bandwidth.

4.1.9. REVISE_REQUEST

  Revise a previous request, typically a long running request such as continuous paging.
  The body of the message is:
    - an [int] identifying the revision type:
      - 0x00000001 to cancel a continuous paging session, see section 4.1.4
      - 0x00000002 to request more pages for a continuous paging session with next_pages > 0,
        see section 4.1.4
    - an [int] equal to the stream id of the initial request message.
    - Optional parameters specific to the revision type:
      - for revision type 2, <next_pages>, an [int] as described in section 4.1.4

  The server will reply with a RESULT of type ROWS (section 4.2.5.2),
  containing a single row with a single boolean value, which is set to:
    - true if the update succeeded,
    - false if the initial request was not found or is no longer running.
  If the initial request is found but the update cannot be carried out, then an error is returned instead.


4.2. Responses

  This section describes the content of the frame body for the different
  responses. Please note that to make room for future evolution, clients should
  tolerate extra information at the end of the frame body beyond what is
  described in this document, and should simply discard it.

4.2.1. ERROR

  Indicates an error processing a request. The body of the message will be an
  error code ([int]) followed by a [string] error message. Then, depending on
  the exception, more content may follow. The error codes are defined in
  Section 9, along with their additional content if any.


4.2.2. READY

  Indicates that the server is ready to process queries. This message will be
  sent by the server after a STARTUP message if no authentication is
  required (if authentication is required, the server indicates readiness by
  sending an AUTH_SUCCESS message).

  The body of a READY message is empty.


4.2.3. AUTHENTICATE

  Indicates that the server requires authentication, and which authentication
  mechanism to use.

  The authentication is SASL based and thus consists of a number of server
  challenges (AUTH_CHALLENGE, Section 4.2.7) followed by client responses
  (AUTH_RESPONSE, Section 4.1.2). The initial exchange is however bootstrapped
  by an initial client response. The details of that exchange (including how
  many challenge-response pairs are required) are specific to the authenticator
  in use. The exchange ends when the server sends an AUTH_SUCCESS message or
  an ERROR message.

  This message will be sent following a STARTUP message if authentication is
  required and must be answered by an AUTH_RESPONSE message from the client.

  The body consists of a single [string] indicating the full class name of the
  IAuthenticator in use.


4.2.4. SUPPORTED

  Indicates which startup options are supported by the server. This message
  comes as a response to an OPTIONS message.

  The body of a SUPPORTED message is a [string multimap]. This multimap gives
  for each of the supported STARTUP options, the list of supported values. It
  also includes:
    - "PROTOCOL_VERSIONS": the list of native protocol versions that are
      supported, encoded as the version number followed by a slash and the
      version description. For example: 3/v3, 4/v4, 5/v5-beta. If a version is
      in beta, it will have the word "beta" in its description.


4.2.5. RESULT

  The result to a query (QUERY, PREPARE, EXECUTE or BATCH messages).

  The first element of the body of a RESULT message is an [int] representing the
  `kind` of result. The rest of the body depends on the kind. The kind can be
  one of:
    0x0001    Void: for results carrying no information.
    0x0002    Rows: for results to select queries, returning a set of rows.
    0x0003    Set_keyspace: the result to a `use` query.
    0x0004    Prepared: result to a PREPARE message.
    0x0005    Schema_change: the result to a schema altering query.
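
  As an illustration of the kind table above, the following Python sketch reads
  the kind [int] from a RESULT body and returns its name together with the
  kind-specific remainder. The names RESULT_KINDS and read_result_kind are
  hypothetical and not part of any driver. The sketch assumes the body is
  uncompressed and that any tracing ID, warnings or custom payload announced by
  the frame flags (Section 2.2) have already been consumed.

```python
import struct
from typing import Tuple

# Kind codes copied from the table in Section 4.2.5.
RESULT_KINDS = {
    0x0001: "Void",
    0x0002: "Rows",
    0x0003: "Set_keyspace",
    0x0004: "Prepared",
    0x0005: "Schema_change",
}


def read_result_kind(body: bytes) -> Tuple[str, bytes]:
    """Return (kind name, rest of body) for a RESULT frame body.

    Assumption: `body` starts directly at the kind field, i.e. it is
    uncompressed and carries no tracing ID, warnings or custom payload.
    """
    if len(body) < 4:
        raise ValueError("RESULT body is shorter than the 4-byte kind")
    # The kind is an [int]: 4 bytes, big-endian (network byte order).
    (kind,) = struct.unpack_from(">i", body, 0)
    if kind not in RESULT_KINDS:
        raise ValueError("unknown RESULT kind: %d" % kind)
    return RESULT_KINDS[kind], body[4:]
```

  For a Void result the remainder is empty; for the other kinds it is the
  kind-specific content defined in the subsections that follow.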

  The body for each kind (after the [int] kind) is defined below.


4.2.5.1. Void

  The rest of the body for a Void result is empty. It indicates that a query was
  successful without providing more information.


4.2.5.2. Rows

  Indicates a set of rows. The rest of the body of a Rows result is:
    <metadata><rows_count><rows_content>
  where:
    - <metadata> is composed of:
        <flags><columns_count>[<paging_state>][<new_metadata_id>][<continuous_page_no>][<global_table_spec>?<col_spec_1>...<col_spec_n>]
      where:
        - <flags> is an [int]. The bits of <flags> provide information on the
          formatting of the remaining information. A flag is set if the bit
          corresponding to its `mask` is set. Supported flags are, given their
          mask:
            0x00000001    Global_tables_spec: if set, only one table spec (keyspace
                          and table name) is provided as <global_table_spec>. If not
                          set, <global_table_spec> is not present.
            0x00000002    Has_more_pages: indicates whether this is not the last
                          page of results and more should be retrieved. If set, the
                          <paging_state> will be present. The <paging_state> is a
                          [bytes] value that should be used in QUERY/EXECUTE to
                          continue paging and retrieve the remainder of the result for
                          this query (See Section 8 for more details).
            0x00000004    No_metadata: if set, the <metadata> is only composed of
                          these <flags>, the <columns_count> and optionally the
                          <paging_state> and <continuous_page_no> (depending on the
                          corresponding flags) but no other information (so no
                          <global_table_spec> nor <col_spec_i>).
                          This will only ever be the case if this was requested
                          during the query (see QUERY and RESULT messages).
            0x00000008    Metadata_changed: if set, the No_metadata flag has to be unset
                          and <new_metadata_id> has to be supplied. This flag is to be
                          used to avoid a roundtrip in case of metadata changes for queries
                          that requested metadata to be skipped.
              0x40000000    continuous paging: if set, this page is part of a continuous
                            paging session, as requested by the client. <continuous_page_no>
                            will be present; this is an [int] that identifies the sequential
                            number of this page in the session, and the Last_continuous_page
                            flag below will be set for the final page.
              0x80000000    Last_continuous_page: indicates that this is the last continuous page
                            that will be sent in the continuous paging session. This flag can only
                            be set when the continuous paging flag is also set. Note that this is not
                            necessarily the last page of the query results; to determine that, it is
                            necessary to look at Has_more_pages. The two can differ if the client only
                            requested the first N pages in the continuous paging session, or if
                            it cancelled the session (REVISE_REQUEST, Section 4.1.9).
          - <columns_count> is an [int] representing the number of columns selected
            by the query that produced this result. It defines the number of <col_spec_i>
            elements in the metadata and the number of elements for each row in
            <rows_content>.
          - <new_metadata_id> is [short bytes] representing the new, changed resultset
            metadata. The new metadata ID must also be used in subsequent executions of
            the corresponding prepared statement, if any.
          - <global_table_spec> is present if the Global_tables_spec is set in
            <flags>. It is composed of two [string] representing the
            (unique) keyspace name and table name the columns belong to.
          - <col_spec_i> specifies the columns returned in the query. There are
            <columns_count> such column specifications that are composed of:
              (<ksname><tablename>)?<name><type>
            The initial <ksname> and <tablename> are two [string] and are only present
            if the Global_tables_spec flag is not set.
            The <name> is a [string] and <type> is an [option]; they correspond
            respectively to the description and the type of the corresponding
            result. What the description is depends on the context: in results to
            selects, it will be either the user chosen alias or the selection used
            (often a column name, but it can be a function call too). In results to
            a PREPARE, it will be either the name of the corresponding bind variable
            or the column name for the variable if it is "anonymous".
            The option for <type> is either a native type (see below), in which
            case the option has no value, or a 'custom' type, in which case the
            value is a [string] representing the fully qualified class name of
            the type represented. Valid option ids are:
              0x0000    Custom: the value is a [string], see above.
              0x0001    Ascii
              0x0002    Bigint
              0x0003    Blob
              0x0004    Boolean
              0x0005    Counter
              0x0006    Decimal
              0x0007    Double
              0x0008    Float
              0x0009    Int
              0x000B    Timestamp
              0x000C    Uuid
              0x000D    Varchar
              0x000E    Varint
              0x000F    Timeuuid
              0x0010    Inet
              0x0011    Date
              0x0012    Time
              0x0013    Smallint
              0x0014    Tinyint
              0x0015    Duration
              0x0020    List: the value is an [option], representing the type
                              of the elements of the list.
              0x0021    Map: the value is two [option], representing the types of the
                             keys and values of the map.
              0x0022    Set: the value is an [option], representing the type
                              of the elements of the set.
              0x0030    UDT: the value is <ks><udt_name><n><name_1><type_1>...<name_n><type_n>
                             where:
                                - <ks> is a [string] representing the keyspace name this
                                  UDT is part of.
                                - <udt_name> is a [string] representing the UDT name.
                                - <n> is a [short] representing the number of fields of
                                  the UDT, and thus the number of <name_i><type_i> pairs
                                  following.
                                - <name_i> is a [string] representing the name of the
                                  i_th field of the UDT.
                                - <type_i> is an [option] representing the type of the
                                  i_th field of the UDT.
              0x0031    Tuple: the value is <n><type_1>...<type_n> where <n> is a [short]
                               representing the number of values in the type, and <type_i>
                               are [option] representing the type of the i_th component
                               of the tuple.

      - <rows_count> is an [int] representing the number of rows present in this
        result. Those rows are serialized in the <rows_content> part.
      - <rows_content> is composed of <row_1>...<row_m> where m is <rows_count>.
        Each <row_i> is composed of <value_1>...<value_n> where n is
        <columns_count> and where <value_j> is a [bytes] representing the value
        returned for the jth column of the ith row. In other words, <rows_content>
        is composed of (<rows_count> * <columns_count>) [bytes].


4.2.5.3. Set_keyspace

    The result to a `use` query. The body (after the kind [int]) is a single
    [string] indicating the name of the keyspace that has been set.


4.2.5.4. Prepared

    The result to a PREPARE message. The body of a Prepared result is:
      <id><result_metadata_id><metadata><result_metadata>
    where:
      - <id> is [short bytes] representing the prepared query ID.
      - <result_metadata_id> is [short bytes] representing the resultset metadata ID.
      - <metadata> is composed of:
          <flags><columns_count><pk_count>[<pk_index_1>...<pk_index_n>][<global_table_spec>?<col_spec_1>...<col_spec_n>]
        where:
          - <flags> is an [int]. The bits of <flags> provide information on the
            formatting of the remaining information. A flag is set if the bit
            corresponding to its `mask` is set. Supported masks and their flags
            are:
              0x0001    Global_tables_spec: if set, only one table spec (keyspace
                        and table name) is provided as <global_table_spec>. If not
                        set, <global_table_spec> is not present.
          - <columns_count> is an [int] representing the number of bind markers
            in the prepared statement.
            It defines the number of <col_spec_i> elements.
          - <pk_count> is an [int] representing the number of <pk_index_i>
            elements to follow. If this value is zero, at least one of the
            partition key columns in the table that the statement acts on
            did not have a corresponding bind marker (or the bind marker
            was wrapped in a function call).
          - <pk_index_i> is a [short] that represents the index of the bind marker
            that corresponds to the partition key column in position i.
            For example, a <pk_index> sequence of [2, 0, 1] indicates that the
            table has three partition key columns; the full partition key
            can be constructed by creating a composite of the values for
            the bind markers at index 2, at index 0, and at index 1.
            This allows implementations with token-aware routing to correctly
            construct the partition key without needing to inspect table
            metadata.
          - <global_table_spec> is present if the Global_tables_spec is set in
            <flags>. If present, it is composed of two [string]s. The first
            [string] is the name of the keyspace that the statement acts on.
            The second [string] is the name of the table that the columns
            represented by the bind markers belong to.
          - <col_spec_i> specifies the bind markers in the prepared statement.
            There are <columns_count> such column specifications, each with the
            following format:
              (<ksname><tablename>)?<name><type>
            The initial <ksname> and <tablename> are two [string] that are only
            present if the Global_tables_spec flag is not set. The <name> field
            is a [string] that holds the name of the bind marker (if named),
            or the name of the column, field, or expression that the bind marker
            corresponds to (if the bind marker is "anonymous"). The <type>
            field is an [option] that represents the expected type of values for
            the bind marker. See the Rows documentation (section 4.2.5.2) for
            full details on the <type> field.

      - <result_metadata> is defined exactly the same as <metadata> in the Rows
        documentation (section 4.2.5.2). This describes the metadata for the
        result set that will be returned when this prepared statement is executed.
        Note that <result_metadata> may be empty (have the No_metadata flag and
        0 columns, see section 4.2.5.2) and will be for any query that is not a
        Select. In fact, there is never a guarantee that this will be non-empty, so
        implementations should protect themselves accordingly. This result metadata
        is an optimization that allows implementations to later execute the
        prepared statement without requesting the metadata (see the Skip_metadata
        flag in EXECUTE). Clients can safely discard this metadata if they do not
        want to take advantage of that optimization.

    Note that the prepared query ID returned is global to the node on which the query
    has been prepared. It can be used on any connection to that node
    until the node is restarted (after which the query must be reprepared).

4.2.5.5. Schema_change

    The result to a schema altering query (creation/update/drop of a
    keyspace/table/index). The body (after the kind [int]) is the same
    as the body for a "SCHEMA_CHANGE" event, so 3 strings:
      <change_type><target><options>
    Please refer to section 4.2.6 below for the meaning of those fields.

    Note that a query to create or drop an index is considered to be a change
    to the table the index is on.


4.2.6. EVENT

  An event pushed by the server. A client will only receive events for the
  types it has REGISTERed to. The body of an EVENT message will start with a
  [string] representing the event type. The rest of the message depends on the
  event type. The valid event types are:
    - "TOPOLOGY_CHANGE": events related to change in the cluster topology.
      Currently, events are sent when new nodes are added to the cluster, and
      when nodes are removed. The body of the message (after the event type)
      consists of a [string] and an [inet], corresponding respectively to the
      type of change ("NEW_NODE" or "REMOVED_NODE") followed by the address of
      the new/removed node.
    - "STATUS_CHANGE": events related to change of node status. Currently,
      up/down events are sent. The body of the message (after the event type)
      consists of a [string] and an [inet], corresponding respectively to the
      type of status change ("UP" or "DOWN") followed by the address of the
      concerned node.
    - "SCHEMA_CHANGE": events related to schema change. After the event type,
      the rest of the message will be <change_type><target><options> where:
        - <change_type> is a [string] representing the type of change involved.
          It will be one of "CREATED", "UPDATED" or "DROPPED".
        - <target> is a [string] that can be one of "KEYSPACE", "TABLE", "TYPE",
          "FUNCTION" or "AGGREGATE" and describes what has been modified
          ("TYPE" stands for modifications related to user types, "FUNCTION"
          for modifications related to user defined functions, "AGGREGATE"
          for modifications related to user defined aggregates).
        - <options> depends on the preceding <target>:
          - If <target> is "KEYSPACE", then <options> will be a single [string]
            representing the keyspace changed.
          - If <target> is "TABLE" or "TYPE", then
            <options> will be 2 [string]: the first one will be the keyspace
            containing the affected object, and the second one will be the name
            of said affected object (either the table or the user type name).
          - If <target> is "FUNCTION" or "AGGREGATE", multiple arguments follow:
            - [string] keyspace containing the user defined function / aggregate
            - [string] the function/aggregate name
            - [string list] one string for each argument type (as CQL type)

  All EVENT messages have a streamId of -1 (Section 2.3).

  Please note that "NEW_NODE" and "UP" events are sent based on internal Gossip
  communication and as such may be sent a short delay before the binary
  protocol server on the newly up node is fully started. Clients are thus
  advised to wait a short time before trying to connect to the node (1 second
  should be enough), otherwise they may experience a connection refusal at
  first.

4.2.7. AUTH_CHALLENGE

  A server authentication challenge (see AUTH_RESPONSE (Section 4.1.2) for more
  details).

  The body of this message is a single [bytes] token. The details of what this
  token contains (and when it can be null/empty, if ever) depend on the actual
  authenticator used.

  Clients are expected to answer the server challenge with an AUTH_RESPONSE
  message.

4.2.8. AUTH_SUCCESS

  Indicates the success of the authentication phase. See Section 4.2.3 for more
  details.

  The body of this message is a single [bytes] token holding final information
  from the server that the client may require to finish the authentication
  process. What that token contains and whether it can be null depends on the
  actual authenticator used.

5. Compression

  Frame compression is supported by the protocol, but then only the frame body
  is compressed (the frame header should never be compressed).

  Before being used, client and server must agree on a compression algorithm to
  use, which is done in the STARTUP message. As a consequence, a STARTUP message
  must never be compressed.
  However, once the STARTUP frame has been received
  by the server, messages can be compressed (including the response to the STARTUP
  request). Frames do not have to be compressed, however, even if compression has
  been agreed upon (a server may only compress frames above a certain size at its
  discretion). A frame body should be compressed if and only if the compressed
  flag (see Section 2.2) is set.

  As of version 2 of the protocol, the following compressions are available:
    - lz4 (https://code.google.com/p/lz4/). Note that with lz4, the first four
      bytes of the body will be the uncompressed length (followed by the
      compressed bytes).
    - snappy (https://code.google.com/p/snappy/). This compression might not be
      available as it depends on a native lib (server-side) that might not be
      available on some installations.


6. Data Type Serialization Formats

  This section describes the serialization formats for all CQL data types
  supported by Cassandra through the native protocol. These serialization
  formats should be used by client drivers to encode values for EXECUTE
  messages. Cassandra will use these formats when returning values in
  RESULT messages.

  All values are represented as [bytes] in EXECUTE and RESULT messages.
  The [bytes] format includes an int prefix denoting the length of the value.
  For that reason, the serialization formats described here will not include
  a length component.

  For legacy compatibility reasons, note that most non-string types support
  "empty" values (i.e. a value with zero length). An empty value is distinct
  from NULL, which is encoded with a negative length.

  As with the rest of the native protocol, all encodings are big-endian.

6.1. ascii

  A sequence of bytes in the ASCII range [0, 127]. Bytes with values outside of
  this range will result in a validation error.
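
  To make the [bytes] framing and the ascii rule above concrete, here is a
  minimal sketch, assuming a value has already been located inside a message
  body. The function names read_bytes and decode_ascii are hypothetical and
  chosen for this example only. The sketch distinguishes NULL (negative
  length) from an empty value (zero length) and rejects bytes above 127.

```python
import struct


def read_bytes(buf: bytes, offset: int = 0):
    """Read one [bytes] value starting at offset.

    Returns (value, next_offset). value is None for NULL (negative
    length) and b"" for an "empty" value (zero length).
    """
    (length,) = struct.unpack_from(">i", buf, offset)  # big-endian int prefix
    offset += 4
    if length < 0:
        return None, offset
    end = offset + length
    if end > len(buf):
        raise ValueError("truncated [bytes] value")
    return buf[offset:end], end


def decode_ascii(value: bytes) -> str:
    """Decode an ascii value, rejecting bytes outside [0, 127]."""
    if any(b > 127 for b in value):
        raise ValueError("byte outside the ASCII range [0, 127]")
    return value.decode("ascii")
```

  Returning None for NULL keeps it distinct from the empty byte string, which
  mirrors the distinction the text draws between NULL and "empty" values.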

6.2 bigint

  An eight-byte two's complement integer.

6.3 blob

  Any sequence of bytes.

6.4 boolean

  A single byte. A value of 0 denotes "false"; any other value denotes "true".
  (However, it is recommended that a value of 1 be used to represent "true".)

6.5 date

  An unsigned integer representing a number of days, with the epoch (unix
  epoch, January 1st, 1970) centered at 2^31.
  A few examples:
    0:    -5877641-06-23
    2^31: 1970-01-01
    2^32: 5881580-07-11

6.6 decimal

  The decimal format represents an arbitrary-precision number. It contains an
  [int] "scale" component followed by a varint encoding (see section 6.24)
  of the unscaled value. The encoded value represents "<unscaled>E<-scale>".
  In other words, "<unscaled> * 10 ^ (-1 * <scale>)".

6.7 double

  An 8 byte floating point number in the IEEE 754 binary64 format.

6.8 duration

  A duration is composed of 3 signed variable length integers ([vint]s).
  The first [vint] represents a number of months, the second [vint] represents
  a number of days, and the last [vint] represents a number of nanoseconds.
  The number of months and days must be valid 32-bit integers whereas the
  number of nanoseconds must be a valid 64-bit integer.
  A duration can either be positive or negative. If a duration is positive
  all the integers must be positive or zero. If a duration is
  negative all the numbers must be negative or zero.

6.9 float

  A 4 byte floating point number in the IEEE 754 binary32 format.

6.10 inet

  A 4 byte or 16 byte sequence denoting an IPv4 or IPv6 address, respectively.

6.11 int

  A 4 byte two's complement integer.

6.12 list

  A [int] n indicating the number of elements in the list, followed by n
  elements. Each element is [bytes] representing the serialized value.
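
  The list layout just described can be sketched as follows. This is an
  illustrative example only: the helper names encode_int, encode_bytes and
  encode_list are assumptions of this sketch, and elements are expected to be
  already serialized according to their own type.

```python
import struct


def encode_int(value: int) -> bytes:
    """Serialize a CQL int as a 4-byte big-endian two's complement integer."""
    return struct.pack(">i", value)


def encode_bytes(value: bytes) -> bytes:
    """Wrap an already-serialized value as [bytes]: int length, then payload."""
    return struct.pack(">i", len(value)) + value


def encode_list(serialized_elements) -> bytes:
    """Serialize a list: an [int] element count, then each element as [bytes]."""
    body = b"".join(encode_bytes(e) for e in serialized_elements)
    return struct.pack(">i", len(serialized_elements)) + body
```

  For instance, encode_list([encode_int(1), encode_int(2)]) yields a count of
  2 followed by two 4-byte elements, each preceded by its own length prefix.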

6.13 map

  A [int] n indicating the number of key/value pairs in the map, followed by
  n entries. Each entry is composed of two [bytes] representing the key
  and value.

6.14 set

  A [int] n indicating the number of elements in the set, followed by n
  elements. Each element is [bytes] representing the serialized value.

6.15 smallint

  A 2 byte two's complement integer.

6.16 text

  A sequence of bytes conforming to the UTF-8 specifications.

6.17 time

  An 8 byte two's complement long representing nanoseconds since midnight.
  Valid values are in the range 0 to 86399999999999.

6.18 timestamp

  An 8 byte two's complement integer representing a millisecond-precision
  offset from the unix epoch (00:00:00, January 1st, 1970). Negative values
  represent a negative offset from the epoch.

6.19 timeuuid

  A 16 byte sequence representing a version 1 UUID as defined by RFC 4122.

6.20 tinyint

  A 1 byte two's complement integer.

6.21 tuple

  A sequence of [bytes] values representing the items in a tuple. The encoding
  of each element depends on the data type for that position in the tuple.
  Null values may be represented by using length -1 for the [bytes]
  representation of an element.

6.22 uuid

  A 16 byte sequence representing any valid UUID as defined by RFC 4122.

6.23 varchar

  An alias of the "text" type.

6.24 varint

  A variable-length two's complement encoding of a signed integer.

  The following examples may help implementors of this spec:

  Value | Encoding
  ------|---------
      0 |     0x00
      1 |     0x01
    127 |     0x7F
    128 |   0x0080
    129 |   0x0081
     -1 |     0xFF
   -128 |     0x80
   -129 |   0xFF7F

  Note that positive numbers must use a most-significant byte with a value
  less than 0x80, because a most-significant bit of 1 indicates a negative
  value. Implementors should pad positive values that have a MSB >= 0x80
  with a leading 0x00 byte.


7. User Defined Types

  This section describes the serialization format for User defined types (UDT),
  as described in section 4.2.5.2.

  A UDT value is composed of successive [bytes] values, one for each field of the UDT
  value (in the order defined by the type). A UDT value will generally have one value
  for each field of the type it represents, but it is allowed to have fewer values than
  the type has fields.


8. Result paging

  The protocol allows for paging the result of queries. For that, the QUERY and
  EXECUTE messages have a <result_page_size> value that indicates the desired
  page size in CQL3 rows.

  If a positive value is provided for <result_page_size>, the result set of the
  RESULT message returned for the query will contain at most the
  <result_page_size> first rows of the query result. If that first page of results
  contains the full result set for the query, the RESULT message (of kind `Rows`)
  will have the Has_more_pages flag *not* set. However, if some results are not
  part of the first response, the Has_more_pages flag will be set and the result
  will contain a <paging_state> value. In that case, the <paging_state> value
  should be used in a QUERY or EXECUTE message (that has the *same* query as
  the original one or the behavior is undefined) to retrieve the next page of
  results.
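
  The paging exchange described above can be sketched as a simple client-side
  loop. This is only an illustration under stated assumptions: execute_page is
  a hypothetical callable standing in for "send a QUERY or EXECUTE with this
  page size and paging state, then decode the Rows result", and
  make_fake_server is an in-memory stand-in used to exercise the loop; neither
  is part of the protocol or of a real driver API.

```python
def fetch_all_rows(execute_page, page_size):
    """Collect every row of a paged query.

    execute_page(page_size, paging_state) is assumed to return a tuple
    (rows, has_more_pages, paging_state) decoded from a Rows result.
    """
    rows = []
    paging_state = None  # no paging state on the first request
    while True:
        page_rows, has_more_pages, paging_state = execute_page(page_size, paging_state)
        rows.extend(page_rows)
        # Stop on the Has_more_pages flag, not on the size of the page.
        if not has_more_pages:
            return rows


def make_fake_server(all_rows):
    """Build an in-memory execute_page for demonstration purposes.

    The paging state here is just the next row index encoded as 4 bytes;
    a real server's paging state is opaque to the client.
    """
    def execute_page(page_size, paging_state):
        start = 0 if paging_state is None else int.from_bytes(paging_state, "big")
        end = start + page_size
        has_more = end < len(all_rows)
        next_state = end.to_bytes(4, "big") if has_more else None
        return all_rows[start:end], has_more, next_state
    return execute_page
```

  For example, fetch_all_rows(make_fake_server(list(range(7))), 3) issues three
  requests and returns the seven rows in order.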

  Only CQL3 queries that return a result set (RESULT message with a Rows `kind`)
  support paging. For other types of queries, the <result_page_size> value is
  ignored.

  Note to client implementors:
  - While <result_page_size> can be as low as 1, it will likely be detrimental
    to performance to pick a value too low. A value below 100 is probably too
    low for most use cases.
  - Clients should not rely on the actual size of the result set returned to
    decide if there are more results to fetch or not. Instead, they should always
    check the Has_more_pages flag (unless they did not enable paging for the query
    obviously). Clients should also not assert that no result will have more than
    <result_page_size> results. While the current implementation always respects
    the exact value of <result_page_size>, we reserve the right to return
    slightly smaller or bigger pages in the future for performance reasons.
  - The <paging_state> is specific to a protocol version and drivers should not
    send a <paging_state> returned by a node using the protocol v3 to query a node
    using the protocol v4 for instance.


9. Error codes

  Let us recall that an ERROR message is composed of <code><message>[...]
  (see 4.2.1 for details). The supported error codes, as well as any additional
  information the message may contain after the <message> are described below:
    0x0000    Server error: something unexpected happened. This indicates a
              server-side bug.
    0x000A    Protocol error: some client message triggered a protocol
              violation (for instance a QUERY message is sent before a STARTUP
              one has been sent)
    0x0100    Authentication error: authentication was required and failed. The
              possible reason for failing depends on the authenticator in use,
              which may or may not include more detail in the accompanying
              error message.
    0x1000    Unavailable exception.
              The rest of the ERROR message body will be
                <cl><required><alive>
              where:
                <cl> is the [consistency] level of the query that triggered
                     the exception.
                <required> is an [int] representing the number of nodes that
                           should be alive to respect <cl>
                <alive> is an [int] representing the number of replicas that
                        were known to be alive when the request had been
                        processed (since an unavailable exception has been
                        triggered, there will be <alive> < <required>)
    0x1001    Overloaded: the request cannot be processed because the
              coordinator node is overloaded
    0x1002    Is_bootstrapping: the request was a read request but the
              coordinator node is bootstrapping
    0x1003    Truncate_error: error during a truncation.
    0x1100    Write_timeout: Timeout exception during a write request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><writeType>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           acknowledged the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <writeType> is a [string] that describes the type of the write
                            that timed out. The value of that string can be one
                            of:
                             - "SIMPLE": the write was a non-batched
                               non-counter write.
                             - "BATCH": the write was a (logged) batch write.
                               If this type is received, it means the batch log
                               has been successfully written (otherwise a
                               "BATCH_LOG" type would have been sent instead).
                             - "UNLOGGED_BATCH": the write was an unlogged
                               batch. No batch log write has been attempted.
                             - "COUNTER": the write was a counter write
                               (batched or not).
                             - "BATCH_LOG": the timeout occurred during the
                               write to the batch log when a (logged) batch
                               write was requested.
    0x1200    Read_timeout: Timeout exception during a read request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><data_present>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           response is required to achieve <cl>. Please note that
                           it is possible to have <received> >= <blockfor> if
                           <data_present> is false. This is also possible in the
                           (unlikely) case where <cl> is achieved but the
                           coordinator node times out while waiting for
                           read-repair acknowledgement.
                <data_present> is a single byte. If its value is 0, it means
                               the replica that was asked for data has not
                               responded. Otherwise, the value is != 0.
    0x1300    Read_failure: A non-timeout exception during a read request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><reasonmap><data_present>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <reasonmap> is a map of endpoint to failure reason codes. This maps
                            the endpoints of the replica nodes that failed when
                            executing the request to a code representing the reason
                            for the failure.
                            The map is encoded starting with an [int] n
                            followed by n pairs of <endpoint><failurecode> where
                            <endpoint> is an [inetaddr] and <failurecode> is a [short]
                            that has the following meaning:
                              0x0000 Unknown reason
                              0x0001 Too many tombstones read (as controlled by the
                                     yaml tombstone_failure_threshold option)
                              0x0002 The query uses an index but that index is not available
                                     (built) on the queried <endpoint>.
                              0x0003 The query writes on some CDC enabled tables, but the CDC
                                     space is full (CDC data isn't consumed fast enough). Note
                                     that this can only happen in Write_failure in practice, but
                                     the reasons are shared between both exceptions.
                              0x0004 Some failures (one or more) were reported to the replica
                                     "leading" a counter write. The actual error didn't
                                     occur on the node that sent this failure; it is
                                     simply the node reporting it due to how counter writes
                                     work (the initial reason for the failure should have been
                                     logged on the actual replica on which the problem occurred).
                              0x0005 The table used by the query was not found on at least one
                                     of the replicas. This strongly suggests a query was done on
                                     either a newly created or newly dropped table without having
                                     waited for schema agreement first.
                              0x0006 The keyspace used by the query was not found on at least
                                     one replica. Same likely cause as for tables above.
                            Any other value for <failurecode> must be considered as an
                            Unknown reason (but drivers should not fail) as new <failurecode>
                            may be added without a bump of the protocol version.
                <data_present> is a single byte. If its value is 0, it means
                               the replica that was asked for data had not
                               responded. Otherwise, the value is != 0.
    0x1400    Function_failure: A (user defined) function failed during execution.
              The rest of the ERROR message body will be
                <keyspace><function><arg_types>
              where:
                <keyspace> is the keyspace [string] of the failed function
                <function> is the name [string] of the failed function
                <arg_types> [string list] one string for each argument type (as CQL type) of the failed function
    0x1500    Write_failure: A non-timeout exception during a write request. The rest
              of the ERROR message body will be
                <cl><received><blockfor><reasonmap><writeType>
              where:
                <cl> is the [consistency] level of the query having triggered
                     the exception.
                <received> is an [int] representing the number of nodes having
                           answered the request.
                <blockfor> is an [int] representing the number of replicas whose
                           acknowledgement is required to achieve <cl>.
                <reasonmap> is a map of endpoint to failure reason codes. This maps
                            the endpoints of the replica nodes that failed when
                            executing the request to a code representing the reason
                            for the failure. The map is encoded starting with an [int] n
                            followed by n pairs of <endpoint><failurecode> where
                            <endpoint> is an [inetaddr] and <failurecode> is a [short]
                            whose meaning is the same as in Read_failure (see above,
                            though note that some reasons only apply to writes and others only
                            to reads).
                <writeType> is a [string] that describes the type of the write
                            that failed. The value of that string can be one
                            of:
                             - "SIMPLE": the write was a non-batched
                               non-counter write.
                             - "BATCH": the write was a (logged) batch write.
                               If this type is received, it means the batch log
                               has been successfully written (otherwise a
                               "BATCH_LOG" type would have been sent instead).
                             - "UNLOGGED_BATCH": the write was an unlogged
                               batch. No batch log write has been attempted.
                             - "COUNTER": the write was a counter write
                               (batched or not).
                             - "BATCH_LOG": the failure occurred during the
                               write to the batch log when a (logged) batch
                               write was requested.

    0x2000    Syntax_error: The submitted query has a syntax error.
    0x2100    Unauthorized: The logged user doesn't have the right to perform
              the query.
    0x2200    Invalid: The query is syntactically correct but invalid.
    0x2300    Config_error: The query is invalid because of some configuration issue
    0x2400    Already_exists: The query attempted to create a keyspace or a
              table that was already existing. The rest of the ERROR message
              body will be <ks><table> where:
                <ks> is a [string] representing either the keyspace that
                     already exists, or the keyspace in which the table that
                     already exists is.
                <table> is a [string] representing the name of the table that
                        already exists. If the query was attempting to create a
                        keyspace, <table> will be present but will be the empty
                        string.
    0x2500    Unprepared: Can be thrown when an attempt is made to execute a
              prepared statement whose ID is not known by this host. The rest
              of the ERROR message body will be [short bytes] representing the
              unknown ID.

    0x8000    Client_write_failure: an error occurred when sending asynchronous results to
              the client, for example if the client is unable to keep up with the rate during
              a continuous paging session.

10. Changes

10.1. Changes from CQL binary protocol version 5

  * Second most significant bit in the frame version byte is set to one to indicate
    a dse protocol message (section 2.1)

  * Continuous paging:
      * Added options to QUERY message (section 4.1.4)
      * Added response parameters to ROWS response (section 4.2.5.2)

  * Added CANCEL message (section 4.1.9)

10.2. Changes from DSE binary protocol v1

  * Added keyspace field in QUERY, PREPARE, and BATCH messages (Sections 4.1.4, 4.1.5, and 4.1.7).
  * Added [int] flags field in PREPARE message (Section 4.1.5).
  * Added STARTUP message parameters to pass client instance, driver information and
    application information.
  * Added [next_pages] to continuous paging QUERY messages (section 4.1.4)
  * Renamed CANCEL to REVISE_REQUEST and added revision type 2 for updating [next_pages] (section 4.1.9)
  * Added two new reasons for read failures (section 9, 'reasonmap' of the 'Read_failure' case).
  * Added <result_metadata_id> to Prepared response (section 4.2.5.4) and EXECUTE request (section 4.1.6),
    and Metadata_changed flag and <new_metadata_id> to Rows response (section 4.2.5.2).