# TiCDC supports multi-topic dispatch

- Author(s): [hi-rustin](https://github.com/hi-rustin)
- Tracking Issue: https://github.com/pingcap/tiflow/issues/4423

## Table of Contents

- [Introduction](#introduction)
- [Motivation or Background](#motivation-or-background)
- [Detailed Design](#detailed-design)
- [Test Design](#test-design)
  - [Functional Tests](#functional-tests)
  - [Scenario Tests](#scenario-tests)
  - [Compatibility Tests](#compatibility-tests)
  - [Benchmark Tests](#benchmark-tests)
- [Impacts & Risks](#impacts--risks)
- [Investigation & Alternatives](#investigation--alternatives)
- [Unresolved Questions](#unresolved-questions)

## Introduction

This document provides a complete design for implementing multi-topic support in TiCDC MQ Sink.

## Motivation or Background

TiCDC MQ Sink currently supports sending messages to only a single topic. However, MQ Sink is often used to feed data
into systems like [Flink], which consume each topic as a separate data source, so the sink needs to support multiple
topics.

## Detailed Design

This solution introduces a new configuration item in the configuration file that specifies which topic the sink sends
each table's data to.

The original topic configuration in the sinkURI is kept and serves two purposes:

1. When there is no new configuration, or no rule in it matches a table, the data is sent to that default topic.
2. Schema-level DDLs are sent to this topic by default.

### Topic dispatch configuration format

This configuration will be added to the TiCDC changefeed configuration file.

```toml
[sink]
dispatchers = [
    { matcher = ['test1.*', 'test2.*'], dispatcher = "ts", topic = "Topic dispatch expression" },
    { matcher = ['test3.*', 'test4.*'], dispatcher = "rowid", topic = "Topic dispatch expression" },
    { matcher = ['test1.*', 'test5.*'], dispatcher = "ts", topic = "Topic dispatch expression" },
]
```

A new `topic` field is added to each `dispatchers` entry. It specifies the topic dispatch rule for the tables that the
entry matches.

### Topic dispatch expression details

A topic expression looks like `flink_{schema}{table}`. This expression consists of two keywords and the `flink_`
prefix.

The two keywords (case-insensitive):

| Keyword  | Description            | Required |
| -------- | ---------------------- | -------- |
| {schema} | the name of the schema | no       |
| {table}  | the name of the table  | no       |

> When neither keyword is used, the expression is equivalent to sending the data to a fixed topic.

`flink_` is the user-definable part: the user can put any custom string in the expression.
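
To make the substitution concrete, here is a minimal sketch of how a topic expression could be expanded for one table. It is illustrative only and is not TiCDC's implementation: the helper name `expand_topic` is hypothetical, and reading "case-insensitive" to mean that `{SCHEMA}` and `{schema}` are both accepted is an assumption.

```python
import re

# Hypothetical helper, not part of TiCDC: it only illustrates the keyword
# substitution described above. The keywords are matched case-insensitively,
# which is an assumed reading of "case-insensitive" in this design.
_KEYWORD = re.compile(r"\{(schema|table)\}", re.IGNORECASE)


def expand_topic(expression: str, schema: str, table: str) -> str:
    """Replace the {schema} and {table} keywords in a topic expression."""
    values = {"schema": schema, "table": table}
    # A function replacement avoids backslash processing of the substituted names.
    return _KEYWORD.sub(lambda m: values[m.group(1).lower()], expression)


print(expand_topic("flink_{schema}{table}", "test1", "table1"))  # flink_test1table1
print(expand_topic("{SCHEMA}_{Table}", "test1", "table1"))       # test1_table1
print(expand_topic("test-cdc", "test1", "table1"))               # test-cdc
```

An expression without keywords, such as `test-cdc`, is returned unchanged, which corresponds to the fixed-topic case noted above.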

Some examples:

```toml
[sink]
dispatchers = [
    { matcher = ['test1.table1', 'test2.table1'], topic = "{schema}_{table}" },
    { matcher = ['test3.*', 'test4.*'], topic = "flink{schema}" },
    { matcher = ['test1.*', 'test5.*'], topic = "test-cdc" },
]
```

- `matcher = ['test1.table1', 'test2.table1'], topic = "{schema}_{table}"`
  - Sends the data from `test1.table1` and `test2.table1` to the `test1_table1` and `test2_table1` topics, respectively.
- `matcher = ['test3.*', 'test4.*'], topic = "flink{schema}"`
  - Sends the data from all tables in `test3` and `test4` to the `flinktest3` and `flinktest4` topics, respectively.
- `matcher = ['test1.*', 'test5.*'], topic = "test-cdc"`
  - Sends the data of all the tables in `test1` (except `test1.table1`) and `test5` to the `test-cdc` topic.
  - `test1.table1` is sent to the `test1_table1` topic because, for a table that matches multiple matcher rules, the
    topic expression of the first matching rule takes precedence.

### DDL dispatch rules

- Schema-level DDLs are sent to the default topic.
- Table-level DDLs are sent to the matching topic. If no topic matches, they are sent to the default topic.

## Test Design

This functionality will be mainly covered by unit and integration tests.

### Functional Tests

#### Unit test

Coverage should be more than 75% in newly added code.

#### Integration test

All existing integration tests must pass when a changefeed has no topic dispatch configuration. In addition, we will
integrate [Flink] into our integration tests to verify multi-topic functionality.

### Scenario Tests

We will test the scenario of using the `canal-json` format to connect data to [Flink].

### Compatibility Tests

#### Compatibility with other features/components

This change does not break TiCDC's existing single-topic behavior.
When you pass only the default topic in the sinkURI and there is no topic expression configuration, it works as before.

#### Upgrade compatibility

When the new configuration is absent, TiCDC works in single-topic mode, so after the upgrade you only need to add the
configuration and create a new changefeed.

#### Downgrade compatibility

The new configuration is not recognized by older versions of TiCDC, so you need to remove the changefeed before
downgrading.

### Benchmark Tests

N/A

## Impacts & Risks

N/A

## Investigation & Alternatives

N/A

## Unresolved Questions

N/A

[flink]: https://flink.apache.org/