github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/codelabs/00-sql-function.md (about) 1 # Codelab: Adding a SQL Function 2 3 ## Background 4 5 This codelab will walk you through adding a new SQL function and its associated 6 tests. 7 8 ## Getting Started 9 10 Before we get started, you need to download the CockroachDB source code and 11 ensure you have all of the prerequisites needed for development. See 12 [CONTRIBUTING.md] doc for details. 13 14 ## Adding a SQL Function 15 16 Currently, CockroachDB only supports [built-in SQL functions][built-ins]. We’re 17 going to walk through the process of adding a new built-in function and an 18 associated test. Along the way you’ll see a bit of the SQL code layout, parts of 19 the type system and part of the logic test infrastructure. 20 21 ### Built-ins 22 23 The SQL code lies within the `pkg/sql` directory. The built-in 24 functions reside in `pkg/sql/sem/builtins/builtins.go`. A function is 25 described by a `Overload` structure, in `pkg/sql/sem/tree/overload.go`: 26 27 ```go 28 type Overload struct { 29 Types TypeList 30 ReturnType ReturnTyper 31 ... 32 Fn func(*EvalContext, Datums) (Datum, error) 33 } 34 ``` 35 36 `Overload` contains a number of fields, reflecting the 37 diversity of built-in functions. Three important fields for us to pay 38 attention to our the argument types (`Types`), the return type 39 (`ReturnType`) and the implementation function pointer (`Fn`). 40 41 Multiple function overloads are then grouped into a single "built-in 42 definition" (`builtinDefinition` in `builtins/builtins.go`), and 43 during CockroachDB initialization transformed into a 44 `FunctionDefinition` (in `builtins/all_builtins.go`). 45 46 For example, `abs` has an overload for each numeric type (`float`, 47 `decimal`, and `int`). The type system takes care of selecting the 48 correct version of a function given the name and the argument 49 types. 50 51 The SQL execution engine finds the `builtinDefinition` structure 52 given the name of a function using the `builtins` map: 53 54 ```go 55 var builtins = map[string]builtinDefinition{...} 56 ``` 57 58 Notice that this is a map from `string` to `builtinDefinition`, which 59 contains a slice of `Overload`s via the member field 60 `Overloads`. The `Overloads` slice is used to distinguish the 61 "overloads" for a given function. 62 63 ### What’s Your Name 64 65 We’re going to add a new SQL function: `whois()`. This function will take a 66 variable number of usernames and return the corresponding real names. For 67 example, `whois('pmattis')` will return `'Peter Mattis'`. For simplicity, the 68 mapping of usernames to real names will be hardcoded. Let’s get started. 69 70 The `builtins` map is divided up into sections by function category, but this 71 organization is purely for readability. We can add our function anywhere, so 72 let’s add it right at the top of the definition for simplicity: 73 74 ```go 75 var builtins = map[string]builtinDefinition{ 76 "whois": makeBuiltin(defProps(), 77 tree.Overload{ 78 Types: tree.VariadicType{VarType: types.String}, 79 ReturnType: tree.FixedReturnType(types.String), 80 Fn: func(ctx *tree.EvalContext, args tree.Datums) (tree.Datum, error) { 81 return tree.DNull, fmt.Errorf("nothing to see here") 82 }, 83 }, 84 ), 85 ... 86 ``` 87 88 This is the skeleton of our built-in. The `Types` field indicates our function 89 takes a variable number of string arguments. The `ReturnType` field indicates 90 our function returns a string. The implementation of our function is currently 91 unfinished, so we’ll return an error for now. 92 93 Go ahead and add the above code to `pkg/sql/sem/builtins/builtins.go`. If you’ve 94 followed the instructions in [CONTRIBUTING.md], you should be able to build 95 CockroachDB from source: 96 97 ```text 98 ~/go/src/github.com/cockroachdb/cockroach$ make build 99 ... 100 github.com/cockroachdb/cockroach 101 ``` 102 103 Now, let’s run a single-node Cockroach instance: 104 105 ```text 106 $ rm -fr cockroach-data/ && ./cockroach start --insecure 107 ... 108 status: initialized new cluster 109 ... 110 ``` 111 112 In another terminal window, use the `cockroach sql` shell to execute our 113 built-in: 114 115 ```text 116 $ ./cockroach sql --insecure -e "select whois()" 117 Error: pq: whois(): nothing to see here 118 Failed running "sql" 119 ``` 120 121 Yay! We successfully added our built-in function and it failed to execute. Note 122 that the error message above is due to our implementation. If we try to execute 123 a non-existent function we’d get a different error: 124 125 ```go 126 $ ./cockroach sql --insecure -e 'select nonexistent()' 127 Error: pq: unknown function: nonexistent() 128 Failed running "sql" 129 ``` 130 131 Our built-in is going to map usernames to real names. For that we’ll need a map: 132 133 ```go 134 users := map[string]string{ 135 "bdarnell": "Ben Darnell", 136 "pmattis": "Peter Mattis", 137 "skimball": "Spencer Kimball", 138 } 139 ``` 140 141 We’ll need to loop over the arguments to the function and look up the 142 corresponding real names: 143 144 ```go 145 var buf bytes.Buffer 146 for i, arg := range args { 147 // Because we specified the type of this function as 148 // Variadic{Typ: types.String}, the type system will ensure that all 149 // arguments are strings, so we can perform a simple type assertion on 150 // each argument to access the string within. 151 username := string(*arg.(*tree.DString)) 152 name, ok := users[strings.ToLower(username)] 153 if !ok { 154 return tree.DNull, fmt.Errorf("unknown username: %s", arg) 155 } 156 if i > 0 { 157 buf.WriteString(", ") 158 } 159 buf.WriteString(name) 160 } 161 ``` 162 163 Lastly, we need to return the result: 164 165 ```go 166 return tree.NewDString(buf.String()), nil 167 ``` 168 169 Much of the above looks like standard Go, but what is a ``DString``? The SQL 170 execution engine has its own typing system. Each type in the system adheres to 171 the ``Datum`` interface which defines the methods that a type needs to 172 implement. ``DString`` is the implementation of ``Datum`` for the SQL ``string`` 173 type. 174 175 ```go 176 type DString string 177 ``` 178 179 Note that `*DString` implements the `Datum` interface, not `DString`. This is why 180 we type assert the arguments using `arg.(*DString)`. 181 182 Put it all together (rebuild, restart your server) and we should have a working 183 function: 184 185 ```text 186 $ ./cockroach sql --insecure -e "select whois('pmattis')" 187 +------------------+ 188 | whois('pmattis') | 189 +------------------+ 190 | Peter Mattis | 191 +------------------+ 192 (1 row) 193 194 $ ./cockroach sql --insecure -e "select whois('pmattis', 'bdarnell')" 195 +------------------------------+ 196 | whois('pmattis', 'bdarnell') | 197 +------------------------------+ 198 | Peter Mattis, Ben Darnell | 199 +------------------------------+ 200 (1 row) 201 202 $ ./cockroach sql --insecure -e "select whois('non-existent')" 203 Error: pq: whois(): unknown username: 'non-existent' 204 Failed running "sql" 205 ``` 206 207 So far so good. One oddity of our function is that it returns the empty string 208 if there are no arguments. Let’s make it return all of the users in that case. 209 Before the loop over the arguments, we check to see if no arguments were 210 specified and expand that to a list of all of the usernames: 211 212 ```go 213 if len(args) == 0 { 214 args = make(tree.Datums, 0, len(users)) 215 for user := range users { 216 args = append(args, tree.NewDString(user)) 217 } 218 } 219 var buf bytes.Buffer 220 for i, arg := range args { 221 ... 222 } 223 ``` 224 225 Rebuild, restart and test: 226 227 ```text 228 $ ./cockroach sql --insecure -e "select whois()" 229 +--------------------------------------------+ 230 | whois() | 231 +--------------------------------------------+ 232 | Ben Darnell, Peter Mattis, Spencer Kimball | 233 +--------------------------------------------+ 234 (1 row) 235 ``` 236 237 Nice! 238 239 ## Testing Our New Function 240 241 Now, it’s time to codify the manual testing we just performed into a proper 242 test. Even though this function is very simple, writing a test or two will 243 safeguard against future regressions. And who knows: we might still spot a bug! 244 245 To test CockroachDB’s SQL functionality, we use a logic test framework that 246 provides a convenient syntax for asserting the expected results of queries. 247 248 Take a peek at the top of one of these logic test files, 249 `pkg/sql/logictest/testdata/logic_test/builtin_function`. Here’s an existing test for the 250 length function from that file: 251 252 ```text 253 query II 254 SELECT LENGTH('Hello, 世界'), LENGTH(b'Hello, 世界') 255 ---- 256 9 13 257 ``` 258 259 The format is relatively straightforward. `query II` means "there’s a query on 260 the next line that will return two *I*nteger columns; please check that that 261 they match what I expect". The logic test framework takes each line after the 262 `----` separator as an expected row, up to the first non-blank line, and takes 263 each whitespace-separated value on a line as the expected value for the 264 corresponding column. In the above example, we expect one row of output with 265 columns 9 and 13. 266 267 Let’s add a new test for our function. Create a new file, 268 `pkg/sql/logictest/testdata/logic_test/codelab`, with the following contents: 269 270 ```text 271 query T 272 select whois('pmattis') 273 ---- 274 Peter Mattis 275 ``` 276 277 `query T` means the query is expected to return one column of text output. 278 279 Now, run your new logic test! 280 281 ```text 282 $ make testlogic FILES=codelab 283 ``` 284 285 If all the tests in your file pass, the last line of output will read `PASS`. 286 Now, let’s add a failing test. Fill in `USERNAME`, `FIRST`, and `LAST` with your 287 real username, first name, and last name. 288 289 ```text 290 query T 291 select whois('USERNAME') 292 ---- 293 FIRST LAST 294 ``` 295 296 Re-run the tests and make sure they fail. This gives us confidence that our test 297 will actually catch bugs, should they arise. Go back and add your name to the 298 end of the users map, and verify that the tests once again succeed. 299 300 Let’s add one more test for the default case. Again, remember to replace `FIRST` 301 and `LAST` with your own name. 302 303 ``` 304 query T 305 select whois() 306 ---- 307 Ben Darnell, Peter Mattis, Spencer Kimball, FIRST LAST 308 ``` 309 310 Run the tests once more. If they still succeed, we’re done! 311 312 Well, not quite. If you haven’t seen a failure yet, run the tests a few more 313 times. Eventually, you should see an error like this: 314 315 ``` 316 --- FAIL: TestLogic (0.06s) 317 --- FAIL: TestLogic/default (0.05s) 318 --- FAIL: TestLogic/default/codelab (0.05s) 319 logic_test.go:1707: 320 testdata/logic_test/codelab:1: 321 expected: 322 Ben Darnell, Peter Mattis, Spencer Kimball 323 but found (query options: "") : 324 Peter Mattis, Spencer Kimball, Ben Darnell 325 ``` 326 327 Looks like we’ve found a bug! We’re expecting names in alphabetical order (Ben, 328 Peter, then Spencer), but the names were output in a different order! 329 330 What might cause this? Take another look at how your built-in constructs the 331 output string and see if you can spot the bug. 332 333 If you get stuck, check out this [blog post about maps in Go][blog-maps]. 334 335 Once you’ve found and fixed the bug, verify that the tests reliably pass. Then 336 check your solution against ours. 337 338 <details> 339 <summary>View our solution</summary> 340 <p> 341 342 <!-- 343 This collapsible-section hack is very sensitive to whitespace. 344 Be careful! See: https://stackoverflow.com/a/39920717/1122351 345 --> 346 347 ```diff 348 "whois": makeBuiltin(defProps(), 349 tree.Overload{ 350 Types: tree.VariadicType{VarType: types.String}, 351 ReturnType: tree.FixedReturnType(types.String), 352 Fn: func(ctx *tree.EvalContext, args tree.Datums) (tree.Datum, error) { 353 users := map[string]string{ 354 "bdarnell": "Ben Darnell", 355 "pmattis": "Peter Mattis", 356 "skimball": "Spencer Kimball", 357 } 358 if len(args) == 0 { 359 args = make(tree.Datums, 0, len(users)) 360 for user := range users { 361 args = append(args, tree.NewDString(user)) 362 } 363 + sort.Slice(args, func(i, j int) bool { 364 + return *args[i].(*tree.DString) < *args[j].(*tree.DString) 365 + }) 366 } 367 var buf bytes.Buffer 368 for i, arg := range args { 369 name, ok := users[strings.ToLower(string(*arg.(*tree.DString)))] 370 if !ok { 371 return tree.DNull, fmt.Errorf("unknown username: %s", arg) 372 } 373 if i > 0 { 374 buf.WriteString(", ") 375 } 376 buf.WriteString(name) 377 } 378 return tree.NewDString(buf.String()), nil 379 }, 380 }, 381 ), 382 ... 383 ``` 384 </p> 385 </details> 386 387 <span></span> <!-- Force space after collapsible section. --> 388 389 That’s it! You’ve successfully added a bug-free built-in SQL function to 390 CockroachDB. 391 392 [CONTRIBUTING.md]: https://github.com/cockroachdb/cockroach/blob/master/CONTRIBUTING.md 393 [built-ins]: https://www.cockroachlabs.com/docs/stable/functions-and-operators.html#built-in-functions 394 [blog-maps]: https://blog.golang.org/go-maps-in-action#TOC_7.