# TiCDC Design Documents

- Author(s): [yumchina](https://github.com/yumchina)
- Tracking Issue: https://github.com/pingcap/tiflow/issues/9413

## Table of Contents

- [Introduction](#introduction)
- [Motivation or Background](#motivation-or-background)
- [Detailed Design](#detailed-design)
  - [Protocol-support](#protocol-support)
    - [Row Order and Transactions](#row-order-and-transactions)
  - [Pulsar Client](#pulsar-client)
    - [Information](#information)
    - [Different from Kafka](#different-from-kafka)
    - [Pulsar Client Config](#pulsar-client-config)
  - [Pulsar Producer](#pulsar-producer)
    - [Producer Message](#producer-message)
  - [Pulsar Authentication](#pulsar-authentication)
  - [Pulsar Route Rule](#pulsar-route-rule)
  - [Pulsar Topic Rule](#pulsar-topic-rule)
  - [Produce DDL Event](#produce-ddl-event)
    - [SyncSendMessage Method](#syncsendmessage-method)
    - [SyncBroadcastMessage Method](#syncbroadcastmessage-method)
    - [Close Method](#close-method)
  - [Produce DML Event](#produce-dml-event)
    - [AsyncSendMessage Method](#asyncsendmessage-method)
    - [Close Method](#close-method-1)
  - [Pulsar Metrics](#pulsar-metrics)
  - [User Interface](#user-interface)
- [Test Design](#test-design)
  - [Functional Tests](#functional-tests)
  - [Scenario Tests](#scenario-tests)
  - [Compatibility Tests](#compatibility-tests)
  - [Benchmark Tests](#benchmark-tests)
- [Impacts & Risks](#impacts--risks)
- [Investigation & Alternatives](#investigation--alternatives)
- [Unresolved Questions](#unresolved-questions)

## Introduction

This document describes the design of the Pulsar sink for TiCDC.
The Pulsar sink distributes the DML change records and DDL events generated by TiCDC.

## Motivation or Background

Adding Pulsar support to TiCDC expands the set of downstream MQ distribution channels.
Users want to output TiDB events to Pulsar for reasons such as sharing their existing Pulsar machines with other workloads
and the ease of scaling Pulsar horizontally.

## Detailed Design

#### Protocol-support

To stay consistent with the other MQ sinks,
we first support a subset of the protocols already supported by the Kafka sink:

**CanalJSON**

**Canal**

**Maxwell**

CanalJSON protocol sample (for more information, refer to https://docs.pingcap.com/tidb/dev/ticdc-canal-json):

```
{
    "id": 0,
    "database": "test",
    "table": "",
    "pkNames": null,
    "isDdl": true,
    "type": "QUERY",
    "es": 1639633094670,
    "ts": 1639633095489,
    "sql": "drop database if exists test",
    "sqlType": null,
    "mysqlType": null,
    "data": null,
    "old": null,
    "_tidb": {     // TiDB extension field
        "commitTs": 163963309467037594
    }
}
```
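
As an illustration of how a consumer might read such a message, the following is a minimal sketch that decodes the sample above with the Go standard library. The struct and function names are hypothetical and are not TiCDC types; only the fields used here are declared, and the inline comment from the sample is left out because it is not valid JSON.

```go
package main

import "encoding/json"

// canalJSONMessage mirrors a subset of the fields in the sample message.
// Hypothetical type for illustration only; it is not part of TiCDC.
type canalJSONMessage struct {
	ID       int64    `json:"id"`
	Database string   `json:"database"`
	Table    string   `json:"table"`
	PKNames  []string `json:"pkNames"`
	IsDDL    bool     `json:"isDdl"`
	Type     string   `json:"type"`
	ES       int64    `json:"es"`
	TS       int64    `json:"ts"`
	SQL      string   `json:"sql"`
	// TiDB holds the TiDB extension field; it is nil when "_tidb" is absent.
	TiDB *struct {
		CommitTs uint64 `json:"commitTs"`
	} `json:"_tidb"`
}

// decodeCanalJSON parses one Canal-JSON payload. Fields that are not
// declared on the struct (sqlType, mysqlType, data, old) are ignored.
func decodeCanalJSON(payload []byte) (*canalJSONMessage, error) {
	var msg canalJSONMessage
	if err := json.Unmarshal(payload, &msg); err != nil {
		return nil, err
	}
	return &msg, nil
}
```

A consumer could branch on `IsDDL` to separate DDL events from row changes before looking at the remaining fields.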

#### Row Order and Transactions

- Ensure that events are sent to Pulsar in order of increasing commit-ts.
- Ensure that no incomplete transactions within a single table appear in Pulsar.
- Ensure that every event is sent to Pulsar at least once.

#### Pulsar Client

##### Information

- Client library: https://github.com/apache/pulsar-client-go, version v0.10.0
- Requirement: Golang 1.18+

##### Different from Kafka

Unlike a Kafka producer, a Pulsar producer must be bound to a topic when it is created.

##### Pulsar Client Config

```go
type ClientOptions struct {
  // Configure the service URL for the Pulsar service.
  // This parameter is required
  URL string
  // Timeout for the establishment of a TCP connection (default: 5 seconds)
  ConnectionTimeout time.Duration

  // Set the operation timeout (default: 30 seconds)
  // Producer-create, subscribe and unsubscribe operations will be retried until this interval, after which the
  // operation will be marked as failed
  OperationTimeout time.Duration

  // Configure the ping send and check interval, default to 30 seconds.
  KeepAliveInterval time.Duration

  // Configure the authentication provider. (default: no authentication)
  // Example: `Authentication: NewAuthenticationToken("token")`
  Authentication

  // Add custom labels to all the metrics reported by this client instance
  CustomMetricsLabels map[string]string

  // Specify metric registerer used to register metrics.
  // Default prometheus.DefaultRegisterer
  MetricsRegisterer prometheus.Registerer
}
```

**Main Note:**

- URL: for example, `pulsar://127.0.0.1:6650`.
- Authentication: only token, token-from-file, and account-with-password authentication are supported.
- MetricsRegisterer: we initialize the Pulsar `MetricsRegisterer` with the `prometheus.NewRegistry()` registry from `cdc/server/metrics.go` in the tiflow project.

#### Pulsar Producer

```go
type ProducerOptions struct {
  // Topic specifies the topic this producer will be publishing on.
  // This argument is required when constructing the producer.
  Topic string

  // Properties specifies a set of application defined properties for the producer.
  // These properties will be visible in the topic stats
  Properties map[string]string

  // ... other fields omitted
}
```

**We must cache the producers created by the client, one per topic.
Each changefeed's Pulsar client keeps a producer map of type `map[string]pulsar.Producer`: the key is the topic name and the value is the Pulsar producer for that topic.**
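
The per-topic producer map described above could be sketched as follows. This is a minimal illustration, not the tiflow implementation: `topicProducer` is a hypothetical stand-in for `pulsar.Producer` that declares only `Close()`, the `create` hook is assumed to wrap the real client's `CreateProducer` call, and `closerFunc` is a small adapter added only so the sketch can be exercised without a Pulsar server.

```go
package main

import "sync"

// topicProducer is a hypothetical stand-in for pulsar.Producer.
// Only the method this sketch needs is declared.
type topicProducer interface {
	Close()
}

// closerFunc adapts a plain function to topicProducer (illustration only).
type closerFunc func()

func (f closerFunc) Close() { f() }

// producerCache keeps one producer per topic name, because a Pulsar
// producer is bound to a single topic.
type producerCache struct {
	mu        sync.Mutex
	producers map[string]topicProducer
	// create builds a producer bound to the given topic (assumed hook).
	create func(topic string) (topicProducer, error)
}

func newProducerCache(create func(topic string) (topicProducer, error)) *producerCache {
	return &producerCache{producers: make(map[string]topicProducer), create: create}
}

// getOrCreate returns the cached producer for topic, creating it on first use.
func (c *producerCache) getOrCreate(topic string) (topicProducer, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.producers[topic]; ok {
		return p, nil
	}
	p, err := c.create(topic)
	if err != nil {
		return nil, err
	}
	c.producers[topic] = p
	return p, nil
}

// closeAll closes every cached producer and empties the map.
func (c *producerCache) closeAll() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for topic, p := range c.producers {
		p.Close()
		delete(c.producers, topic)
	}
}
```

Holding the lock while creating a producer keeps the sketch simple at the cost of serializing first-time lookups; a real implementation might choose a different trade-off.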

##### Producer Message

```go
type ProducerMessage struct {
  // Payload for the message
  Payload []byte
  // Value and payload are mutually exclusive; `Value interface{}` is for schema messages.
  Value interface{}
  // Key sets the key of the message for routing policy
  Key string
  // OrderingKey sets the ordering key of the message
  OrderingKey string

  // ... other fields are not used
}
```

- Payload: carries the actual binary data.
- Value: mutually exclusive with Payload; used for schema messages.
- Key: the optional key associated with the message (particularly useful for things like topic compaction).
- OrderingKey: sets the ordering key of the message. It serves the same purpose as Key, so we do not use it.

#### Pulsar Authentication

- Token: use `authentication-token` from the sink-uri to authenticate to the Pulsar server.
- Account with password: use `basic-user-name` and `basic-password` from the sink-uri to authenticate to the Pulsar server.
- Token from file: use `token-from-file` from the sink-uri to supply the token used to authenticate to the Pulsar server.
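
One way to pick the authentication method from these sink-uri parameters is sketched below. The parameter names come from the list above; the function name, the returned labels, and the precedence (token, then token-from-file, then basic) are assumptions made for illustration, since this document does not specify what happens when several parameters are set at once.

```go
package main

import (
	"errors"
	"net/url"
)

// Labels returned by authMethodFromSinkURI (hypothetical names).
const (
	authNone      = "none"
	authToken     = "token"
	authTokenFile = "token-from-file"
	authBasic     = "basic"
)

// authMethodFromSinkURI inspects the query parameters of a sink-uri and
// reports which authentication method they select.
// The precedence order used here is an assumption of this sketch.
func authMethodFromSinkURI(sinkURI string) (string, error) {
	u, err := url.Parse(sinkURI)
	if err != nil {
		return "", err
	}
	q := u.Query()
	user, pass := q.Get("basic-user-name"), q.Get("basic-password")
	switch {
	case q.Get("authentication-token") != "":
		return authToken, nil
	case q.Get("token-from-file") != "":
		return authTokenFile, nil
	case user != "" && pass != "":
		return authBasic, nil
	case user != "" || pass != "":
		// Only one half of the credential pair was supplied.
		return "", errors.New("basic-user-name and basic-password must be set together")
	default:
		return authNone, nil
	}
}
```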

#### Pulsar Route Rule

- Events can be routed to different partitions through the `dispatchers` setting in the changefeed config;
  refer to `Pulsar Topic Rule`.
- The message key can be set to any string. No key is set by default, and events are sent to partitions by a hash algorithm.
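
To make the idea of hash-based routing concrete, the sketch below derives a dispatch key for the `table`, `ts`, and `index-value` partition modes listed under `Pulsar Topic Rule` and maps it onto a partition. Everything here is an assumption for illustration: the function names, the key layout, and the use of FNV-1a are not taken from TiCDC or from the Pulsar client, whose actual hash function may differ, and the `default` mode is not modeled.

```go
package main

import (
	"hash/fnv"
	"strconv"
	"strings"
)

// dispatchKey builds the string that is hashed for a partition mode.
// "table" and any unrecognized mode fall back to the schema and table name.
func dispatchKey(mode, schema, table string, commitTs uint64, indexValues []string) string {
	switch mode {
	case "ts":
		// Hash on the commitTs of the row change.
		return strconv.FormatUint(commitTs, 10)
	case "index-value":
		// Hash on the primary key or unique index values of the row.
		return strings.Join(indexValues, ",")
	default:
		return schema + "." + table
	}
}

// partitionOf maps a key onto [0, partitionNum) using FNV-1a (assumed hash).
// It returns 0 when partitionNum is 0 to avoid a division by zero.
func partitionOf(key string, partitionNum uint32) uint32 {
	if partitionNum == 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % partitionNum
}
```

Because the key for the `table` mode depends only on the schema and table name, all events of one table map to the same partition, which is what keeps their relative order.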

#### Pulsar Topic Rule

```toml
dispatchers = [
   {matcher = ['test1.*', 'test2.*'], topic = "Topic expression 1", partition = "table" },
   {matcher = ['test6.*'], topic = "Topic expression 2", partition = "ts" }
]
```

A topic expression is legal if it meets the following conditions:

1. `{schema}` and `{table}` identify the database name and table name that need to be matched, and are required fields.
   Pulsar supports "(persistent|non-persistent)://tenant/namespace/topic" as a topic name.
2. The tenant, namespace, and topic must be separated by two slashes in total, such as "tenant/namespace/topic".
3. If no dispatcher matches, the event goes to the default topic, which is the topic in the sink-uri.
4. `partition` takes one of the following values (refer to https://docs.pingcap.com/tidb/dev/ticdc-sink-to-kafka#customize-the-rules-for-topic-and-partition-dispatchers-of-kafka-sink):
   - `default`: When multiple unique indexes (including the primary key) exist or the Old Value feature is enabled, events are dispatched in the table mode. When only one unique index (or the primary key) exists, events are dispatched in the index-value mode.
   - `ts`: Use the commitTs of the row change to hash and dispatch events.
   - `index-value`: Use the value of the primary key or the unique index of the table to hash and dispatch events.
   - `table`: Use the schema name of the table and the table name to hash and dispatch events.
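
The topic selection described above (match a table against the dispatchers, expand `{schema}` and `{table}`, otherwise fall back to the sink-uri topic) might look like the sketch below. It is a simplification under stated assumptions: `dispatchRule` and `topicFor` are hypothetical names, and matchers are evaluated with Go's `path.Match` rather than the table-filter syntax that TiCDC actually uses.

```go
package main

import (
	"path"
	"strings"
)

// dispatchRule is a hypothetical, reduced form of one dispatchers entry.
type dispatchRule struct {
	Matchers []string // patterns such as "test1.*"
	Topic    string   // topic expression, may contain {schema} and {table}
}

// topicFor returns the topic for schema.table. The first rule with a
// matching pattern wins; if none matches, defaultTopic (the topic from the
// sink-uri) is returned.
func topicFor(rules []dispatchRule, defaultTopic, schema, table string) string {
	name := schema + "." + table
	expand := strings.NewReplacer("{schema}", schema, "{table}", table)
	for _, rule := range rules {
		for _, pattern := range rule.Matchers {
			// path.Match is a stand-in for the real matcher; malformed
			// patterns are skipped in this sketch.
			if ok, err := path.Match(pattern, name); err == nil && ok {
				return expand.Replace(rule.Topic)
			}
		}
	}
	return defaultTopic
}
```

With a rule whose topic expression is `persistent://public/default/{schema}_{table}`, a row from `test1.orders` would be sent to `persistent://public/default/test1_orders`.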

#### Produce DDL Event

We implement the `DDLProducer` interface.

##### SyncSendMessage Method

1. Finds the producer by topic name.
2. Sends the event to Pulsar.
3. Reports some metrics.

`partitionNum` is not used, because the number of partitions can only be set on the Pulsar server.

##### SyncBroadcastMessage Method

It does nothing.

##### Close Method

Closes every producer.

#### Produce DML Event

We implement the `DMLProducer` interface.

##### AsyncSendMessage Method

1. Finds the producer by topic name.
2. Sets a callback function on the Pulsar producer client.
3. Sends the event to Pulsar.
4. Reports some metrics.

`partitionNum` is not used, because the number of partitions can only be set on the Pulsar server.

##### Close Method

Closes every producer.

#### Pulsar Metrics

The Pulsar client can report its metrics to a `prometheus.Registry`.
The Pulsar client metrics are:

```
pulsar_client_bytes_published
pulsar_client_bytes_received
pulsar_client_connections_closed
pulsar_client_connections_establishment_errors
pulsar_client_connections_handshake_errors
pulsar_client_connections_opened
pulsar_client_lookup_count
pulsar_client_messages_published
pulsar_client_messages_received
pulsar_client_partitioned_topic_metadata_count
pulsar_client_producer_errors
pulsar_client_producer_latency_seconds_bucket
pulsar_client_producer_latency_seconds_count
pulsar_client_producer_latency_seconds_sum
pulsar_client_producer_pending_bytes
pulsar_client_producer_pending_messages
pulsar_client_producer_rpc_latency_seconds_bucket
pulsar_client_producer_rpc_latency_seconds_count
pulsar_client_producer_rpc_latency_seconds_sum
pulsar_client_producers_closed
pulsar_client_producers_opened
pulsar_client_producers_partitions_active
pulsar_client_producers_reconnect_failure
pulsar_client_producers_reconnect_max_retry
pulsar_client_readers_closed
pulsar_client_readers_opened
pulsar_client_rpc_count
```

#### User Interface

**Sink-URI**

When creating a changefeed, the user can specify the sink-uri like this:

`cdc cli changefeed create --sink-uri="${scheme}://${address}/${topic-name}?protocol=${protocol}&pulsar-version=${pulsar-version}&authentication-token=${authentication-token}"`

Example:

```
cdc cli changefeed create --server=http://127.0.0.1:8300 \
  --sink-uri="pulsar://127.0.0.1:6650/persistent://public/default/test?protocol=canal-json&pulsar-version=v2.10.0&authentication-token=eyJhbGciOiJSUzIxxxxxxxxxxxxxxxxx"
```
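
The sink-uri above packs the service address, the topic, and the options into one string. A minimal sketch of splitting it apart with Go's `net/url` follows; `sinkURIConfig` and `parseSinkURI` are hypothetical names, and only the parameters shown in this section are extracted.

```go
package main

import (
	"errors"
	"net/url"
	"strings"
)

// sinkURIConfig holds the parts of a sink-uri used in this sketch
// (hypothetical type, not the tiflow configuration struct).
type sinkURIConfig struct {
	ServiceURL    string // for example pulsar://127.0.0.1:6650
	Topic         string // for example persistent://public/default/test
	Protocol      string
	PulsarVersion string
}

// parseSinkURI splits a sink-uri into the service URL, the default topic
// (everything after the first slash that follows the address), and the
// query parameters.
func parseSinkURI(raw string) (*sinkURIConfig, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme == "" || u.Host == "" {
		return nil, errors.New("sink-uri must contain a scheme and an address")
	}
	topic := strings.TrimPrefix(u.Path, "/")
	if topic == "" {
		return nil, errors.New("sink-uri must contain a topic name")
	}
	q := u.Query()
	return &sinkURIConfig{
		ServiceURL:    u.Scheme + "://" + u.Host,
		Topic:         topic,
		Protocol:      q.Get("protocol"),
		PulsarVersion: q.Get("pulsar-version"),
	}, nil
}
```

For the example command above, this yields the service URL `pulsar://127.0.0.1:6650` and the topic `persistent://public/default/test`.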

## Test Design

The Pulsar sink is a new feature. For tests, we focus on functional tests, scenario tests, and benchmark tests.

### Functional Tests

- Regular unit tests and integration tests cover the correctness of data replication using the canal, maxwell, and canal-json protocols.

### Scenario Tests

Run stability and chaos tests under different workloads.

- The upstream and downstream data are consistent.
- Throughput and latency are stable for most scenarios.

### Compatibility Tests

#### Compatibility with other features/components

The Pulsar sink should be compatible with other features.

#### Upgrade Downgrade Compatibility

The Pulsar sink is a new feature, so there should be no upgrade
or downgrade compatibility issues.

### Benchmark Tests

Perform benchmark tests under common scenarios, big data scenarios, multi-table scenarios, and wide table scenarios with different parameters.

## Impacts & Risks

N/A

## Investigation & Alternatives

N/A

## Unresolved Questions

N/A