github.com/pingcap/tidb/parser@v0.0.0-20231013125129-93a834a6bf8d/docs/quickstart.md (about) 1 # Quickstart 2 3 This parser is highly compatible with MySQL syntax. You can use it as a library, parse a text SQL into an AST tree, and traverse the AST nodes. 4 5 In this example, you will build a project, which can extract all the column names from a text SQL. 6 7 ## Prerequisites 8 9 - [Golang](https://golang.org/dl/) version 1.13 or above. You can follow the instructions in the official [installation page](https://golang.org/doc/install) (check it by `go version`) 10 11 ## Create a Project 12 13 ```bash 14 mkdir colx && cd colx 15 go mod init colx && touch main.go 16 ``` 17 18 ## Import Dependencies 19 20 First, you need to use `go get` to fetch the dependencies through git hash. The git hashes are available in [release page](https://github.com/pingcap/tidb/releases). Take `v5.3.0` as an example: 21 22 ```bash 23 go get -v github.com/pingcap/tidb/parser@4a1b2e9 24 ``` 25 26 > **NOTE** 27 > 28 > The parser was merged into TiDB repo since v5.3.0. So you can only choose version v5.3.0 or higher in this TiDB repo. 29 > 30 > You may want to use advanced API on expressions (a kind of AST node), such as numbers, string literals, booleans, nulls, etc. It is strongly recommended using the `types` package in TiDB repo with the following command: 31 > 32 > ```bash 33 > go get -v github.com/pingcap/tidb/types/parser_driver@4a1b2e9 34 > ``` 35 > and import it in your golang source code: 36 > ```go 37 > import _ "github.com/pingcap/tidb/types/parser_driver" 38 > ``` 39 40 Your directory should contain the following three files: 41 ``` 42 . 43 ├── go.mod 44 ├── go.sum 45 └── main.go 46 ``` 47 48 Now, open `main.go` with your favorite editor, and start coding! 49 50 ## Parse SQL text 51 52 To convert a SQL text to an AST tree, you need to: 53 1. Use the [`parser.New()`](https://pkg.go.dev/github.com/pingcap/tidb/parser?tab=doc#New) function to instantiate a parser, and 54 2. Invoke the method [`Parse(sql, charset, collation)`](https://pkg.go.dev/github.com/pingcap/tidb/parser?tab=doc#Parser.Parse) on the parser. 55 56 ```go 57 package main 58 59 import ( 60 "fmt" 61 62 "github.com/pingcap/tidb/parser" 63 "github.com/pingcap/tidb/parser/ast" 64 _ "github.com/pingcap/tidb/parser/test_driver" 65 ) 66 67 func parse(sql string) (*ast.StmtNode, error) { 68 p := parser.New() 69 70 stmtNodes, _, err := p.Parse(sql, "", "") 71 if err != nil { 72 return nil, err 73 } 74 75 return &stmtNodes[0], nil 76 } 77 78 func main() { 79 astNode, err := parse("SELECT a, b FROM t") 80 if err != nil { 81 fmt.Printf("parse error: %v\n", err.Error()) 82 return 83 } 84 fmt.Printf("%v\n", *astNode) 85 } 86 87 ``` 88 89 Test the parser by running the following command: 90 91 ```bash 92 go run main.go 93 ``` 94 95 If the parser runs properly, you should get a result like this: 96 97 ``` 98 &{{{{SELECT a, b FROM t}}} {[]} 0xc0000a1980 false 0xc00000e7a0 <nil> 0xc0000a19b0 <nil> <nil> [] <nil> <nil> none [] false false 0 <nil>} 99 ``` 100 101 > **NOTE** 102 > 103 > Here are a few things you might want to know: 104 > - To use a parser, a `parser_driver` is required. It decides how to parse the basic data types in SQL. 105 > 106 > You can use [`github.com/pingcap/tidb/parser/test_driver`](https://pkg.go.dev/github.com/pingcap/tidb/parser/test_driver) as the `parser_driver` for test. Again, if you need advanced features, please use the `parser_driver` in TiDB (run `go get -v github.com/pingcap/tidb/types/parser_driver@4a1b2e9` and import it). 107 > - The instantiated parser object is not goroutine safe. It is better to keep it in a single goroutine. 108 > - The instantiated parser object is not lightweight. It is better to reuse it if possible. 109 > - Warning: the 'parser.result' object is being reused without being properly reset or copied. This can cause unexpected behavior or errors if the object is used for multiple parsing operations or concurrently in multiple goroutines. To avoid these issues, make a copy of the 'parser.result' object before calling 'parser.Parse()' again or before using it in another goroutine, or create a new 'parser' object altogether for each new parsing operation. 110 > - The 2nd and 3rd arguments of [`parser.Parse()`](https://pkg.go.dev/github.com/pingcap/tidb/parser?tab=doc#Parser.Parse) are charset and collation respectively. If you pass an empty string into it, a default value is chosen. 111 112 113 ## Traverse AST Nodes 114 115 Now you get the AST tree root of a SQL statement. It is time to extract the column names by traverse. 116 117 Parser implements the interface [`ast.Node`](https://pkg.go.dev/github.com/pingcap/tidb/parser/ast?tab=doc#Node) for each kind of AST node, such as SelectStmt, TableName, ColumnName. [`ast.Node`](https://pkg.go.dev/github.com/pingcap/tidb/parser/ast?tab=doc#Node) provides a method `Accept(v Visitor) (node Node, ok bool)` to allow any struct that has implemented [`ast.Visitor`](https://pkg.go.dev/github.com/pingcap/tidb/parser/ast?tab=doc#Visitor) to traverse itself. 118 119 [`ast.Visitor`](https://pkg.go.dev/github.com/pingcap/tidb/parser/ast?tab=doc#Visitor) is defined as follows: 120 ```go 121 type Visitor interface { 122 Enter(n Node) (node Node, skipChildren bool) 123 Leave(n Node) (node Node, ok bool) 124 } 125 ``` 126 127 Now you can define your own visitor, `colX`(columnExtractor): 128 129 ```go 130 type colX struct{ 131 colNames []string 132 } 133 134 func (v *colX) Enter(in ast.Node) (ast.Node, bool) { 135 if name, ok := in.(*ast.ColumnName); ok { 136 v.colNames = append(v.colNames, name.Name.O) 137 } 138 return in, false 139 } 140 141 func (v *colX) Leave(in ast.Node) (ast.Node, bool) { 142 return in, true 143 } 144 ``` 145 146 Finally, wrap `colX` in a simple function: 147 148 ```go 149 func extract(rootNode *ast.StmtNode) []string { 150 v := &colX{} 151 (*rootNode).Accept(v) 152 return v.colNames 153 } 154 ``` 155 156 And slightly modify the main function: 157 158 ```go 159 func main() { 160 if len(os.Args) != 2 { 161 fmt.Println("usage: colx 'SQL statement'") 162 return 163 } 164 sql := os.Args[1] 165 astNode, err := parse(sql) 166 if err != nil { 167 fmt.Printf("parse error: %v\n", err.Error()) 168 return 169 } 170 fmt.Printf("%v\n", extract(astNode)) 171 } 172 ``` 173 174 Test your program: 175 176 ```bash 177 go build && ./colx 'select a, b from t' 178 ``` 179 180 ``` 181 [a b] 182 ``` 183 184 You can also try a different SQL statement as an input. For example: 185 186 ```console 187 $ ./colx 'SELECT a, b FROM t GROUP BY (a, b) HAVING a > c ORDER BY b' 188 [a b a b a c b] 189 190 If necessary, you can deduplicate by yourself. 191 192 $ ./colx 'SELECT a, b FROM t/invalid_str' 193 parse error: line 1 column 19 near "/invalid_str" 194 ``` 195 196 Enjoy!