github.com/mithrandie/csvq@v1.18.1/docs/_posts/2006-01-02-string-functions.md (about) 1 --- 2 layout: default 3 title: String Functions - Reference Manual - csvq 4 category: reference 5 --- 6 7 # String Functions 8 9 | name | description | 10 | :- | :- | 11 | [TRIM](#trim) | Return a string with all the specified leading and trailing characters removed | 12 | [LTRIM](#ltrim) | Return a string with all the specified leading characters removed | 13 | [RTRIM](#rtrim) | Return a string with all the specified trailing characters removed | 14 | [UPPER](#upper) | Return a string with all characters mapped to their upper case | 15 | [LOWER](#lower) | Return a string with all characters mapped to their lower case | 16 | [BASE64_ENCODE](#base64_encode) | Return a base64 encoding of a string | 17 | [BASE64_DECODE](#base64_decode) | Return a string represented by a base64 encoding | 18 | [HEX_ENCODE](#hex_encode) | Return a hexadecimal encoding of a string | 19 | [HEX_DECODE](#hex_decode) | Return a string represented by a hexadecimal encoding | 20 | [LEN](#len) | Return the number of characters of a string | 21 | [BYTE_LEN](#byte_len) | Return the byte length of a string | 22 | [WIDTH](#width) | Return the string width of a string | 23 | [LPAD](#lpad) | Return a string left-side padded | 24 | [RPAD](#rpad) | Return a string right-side padded | 25 | [SUBSTRING](#substring) | Return the substring of a string | 26 | [SUBSTR](#substr) | Return the substring of a string using zero-based indexing | 27 | [INSTR](#instr) | Return the index of the first occurrence of a substring | 28 | [LIST_ELEM](#list_elem) | Return a element of a list | 29 | [REPLACE](#replace) | Return a string replaced the substrings with another string | 30 | [REGEXP_MATCH](#regexp_match) | Verify a string matches with a regular expression | 31 | [REGEXP_FIND](#regexp_find) | Return a string that matches a regular expression | 32 | [REGEXP_FIND_SUBMATCHES](#regexp_find_submatches) | Return a string representing an array that matches a regular expression | 33 | [REGEXP_FIND_ALL](#regexp_all) | Return a string representing a nested array that matches a regular expression | 34 | [REGEXP_REPLACE](#regexp_replace) | Return a string replaced substrings that match a regular expression with another strings | 35 | [TITLE_CASE](#title_case) | Returns a string converted to Title Case | 36 | [FORMAT](#format) | Return a formatted string | 37 | [JSON_VALUE](#json_value) | Return a value from json | 38 | [JSON_OBJECT](#json_object) | Return a string formatted in json object | 39 40 ## Definitions 41 42 ### TRIM 43 {: #trim} 44 45 ``` 46 TRIM(str) 47 ``` 48 49 _str_ 50 : [string]({{ '/reference/value.html#string' | relative_url }}) 51 52 _return_ 53 : [string]({{ '/reference/value.html#string' | relative_url }}) 54 55 Returns the string value that is removed all leading and trailing white spaces from _str_. 56 57 ``` 58 TRIM(str, charset) 59 ``` 60 61 _str_ 62 : [string]({{ '/reference/value.html#string' | relative_url }}) 63 64 _charset_ 65 : [string]({{ '/reference/value.html#string' | relative_url }}) 66 67 _return_ 68 : [string]({{ '/reference/value.html#string' | relative_url }}) 69 70 Returns the string value that is removed all leading and trailing characters contained in _charset_ from _str_. 71 72 ### LTRIM 73 {: #ltrim} 74 75 ``` 76 LTRIM(str) 77 ``` 78 79 _str_ 80 : [string]({{ '/reference/value.html#string' | relative_url }}) 81 82 _return_ 83 : [string]({{ '/reference/value.html#string' | relative_url }}) 84 85 Returns the string value that is removed all leading white spaces from _str_. 86 87 ``` 88 LTRIM(str, charset) 89 ``` 90 91 _str_ 92 : [string]({{ '/reference/value.html#string' | relative_url }}) 93 94 _charset_ 95 : [string]({{ '/reference/value.html#string' | relative_url }}) 96 97 _return_ 98 : [string]({{ '/reference/value.html#string' | relative_url }}) 99 100 Returns the string value that is removed all leading characters contained in _charset_ from _str_. 101 102 103 ### RTRIM 104 {: #rtrim} 105 106 {: #trim} 107 108 ``` 109 RTRIM(str) 110 ``` 111 112 _str_ 113 : [string]({{ '/reference/value.html#string' | relative_url }}) 114 115 _return_ 116 : [string]({{ '/reference/value.html#string' | relative_url }}) 117 118 Returns the string value that is removed all trailing white spaces from _str_. 119 120 ``` 121 RTRIM(str, charset) 122 ``` 123 124 _str_ 125 : [string]({{ '/reference/value.html#string' | relative_url }}) 126 127 _charset_ 128 : [string]({{ '/reference/value.html#string' | relative_url }}) 129 130 _return_ 131 : [string]({{ '/reference/value.html#string' | relative_url }}) 132 133 Returns the string value that is removed all trailing characters contained in _charset_ from _str_. 134 135 ### UPPER 136 {: #upper} 137 138 ``` 139 UPPER(str) 140 ``` 141 142 _str_ 143 : [string]({{ '/reference/value.html#string' | relative_url }}) 144 145 _return_ 146 : [string]({{ '/reference/value.html#string' | relative_url }}) 147 148 Returns the string value replaced _str_ with characters mapped to their upper case. 149 150 ### LOWER 151 {: #lower} 152 153 ``` 154 LOWER(str) 155 ``` 156 157 _str_ 158 : [string]({{ '/reference/value.html#string' | relative_url }}) 159 160 _return_ 161 : [string]({{ '/reference/value.html#string' | relative_url }}) 162 163 Returns the string value replaced _str_ with characters mapped to their upper case. 164 165 ### BASE64_ENCODE 166 {: #base64_encode} 167 168 ``` 169 BASE64_ENCODE(str) 170 ``` 171 172 _str_ 173 : [string]({{ '/reference/value.html#string' | relative_url }}) 174 175 _return_ 176 : [string]({{ '/reference/value.html#string' | relative_url }}) 177 178 Returns the Base64 encoding of _str_. 179 180 ### BASE64_DECODE 181 {: #base64_decode} 182 183 ``` 184 BASE64_DECODE(str) 185 ``` 186 187 _str_ 188 : [string]({{ '/reference/value.html#string' | relative_url }}) 189 190 _return_ 191 : [string]({{ '/reference/value.html#string' | relative_url }}) 192 193 Returns the string value represented by _str_ that is encoded with Base64. 194 195 ### HEX_ENCODE 196 {: #hex_encode} 197 198 ``` 199 HEX_ENCODE(str) 200 ``` 201 202 _str_ 203 : [string]({{ '/reference/value.html#string' | relative_url }}) 204 205 _return_ 206 : [string]({{ '/reference/value.html#string' | relative_url }}) 207 208 Returns the hexadecimal encoding of _str_. 209 210 ### HEX_DECODE 211 {: #hex_decode} 212 213 ``` 214 HEX_DECODE(str) 215 ``` 216 217 _str_ 218 : [string]({{ '/reference/value.html#string' | relative_url }}) 219 220 _return_ 221 : [string]({{ '/reference/value.html#string' | relative_url }}) 222 223 Returns the string value represented by _str_ that is encoded with hexadecimal. 224 225 ### LEN 226 {: #len} 227 228 ``` 229 LEN(str) 230 ``` 231 232 _str_ 233 : [string]({{ '/reference/value.html#string' | relative_url }}) 234 235 _return_ 236 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 237 238 Returns the number of characters of _str_. 239 240 ### BYTE_LEN 241 {: #byte_len} 242 243 ``` 244 BYTE_LEN(str [, encoding]) 245 ``` 246 247 _str_ 248 : [string]({{ '/reference/value.html#string' | relative_url }}) 249 250 _encoding_ 251 : [string]({{ '/reference/value.html#string' | relative_url }}) 252 253 "UTF8", "UTF16" or "SJIS". The default is "UTF8". 254 255 _return_ 256 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 257 258 Returns the byte length of _str_. 259 260 ### WIDTH 261 {: #width} 262 263 ``` 264 WIDTH(str) 265 ``` 266 267 _str_ 268 : [string]({{ '/reference/value.html#string' | relative_url }}) 269 270 _return_ 271 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 272 273 Returns the string width of _str_. Half-width characters are counted as 1, and full-width characters are counted as 2. 274 275 ### LPAD 276 {: #lpad} 277 278 ``` 279 LPAD(str, len, padstr [, pad_type, encoding]) 280 ``` 281 282 _str_ 283 : [string]({{ '/reference/value.html#string' | relative_url }}) 284 285 _len_ 286 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 287 288 _padstr_ 289 : [string]({{ '/reference/value.html#string' | relative_url }}) 290 291 _pad_type_ 292 : [string]({{ '/reference/value.html#string' | relative_url }}) 293 294 "LEN", "BYTE" or "WIDTH". The default is "LEN". 295 296 _encoding_ 297 : [string]({{ '/reference/value.html#string' | relative_url }}) 298 299 "UTF8", "UTF16" or "SJIS". The default is "UTF8". 300 301 _return_ 302 : [string]({{ '/reference/value.html#string' | relative_url }}) 303 304 Returns the string value of _str_ padded with leading _padstr_ to a length specified by _len_. 305 306 ### RPAD 307 {: #rpad} 308 309 ``` 310 RPAD(str, len, padstr [, pad_type, encoding]) 311 ``` 312 313 _str_ 314 : [string]({{ '/reference/value.html#string' | relative_url }}) 315 316 _len_ 317 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 318 319 _padstr_ 320 : [string]({{ '/reference/value.html#string' | relative_url }}) 321 322 _pad_type_ 323 : [string]({{ '/reference/value.html#string' | relative_url }}) 324 325 "LEN", "BYTE" or "WIDTH". The default is "LEN". 326 327 _encoding_ 328 : [string]({{ '/reference/value.html#string' | relative_url }}) 329 330 "UTF8", "UTF16" or "SJIS". The default is "UTF8". 331 332 _return_ 333 : [string]({{ '/reference/value.html#string' | relative_url }}) 334 335 Returns the string value of _str_ padded with trailing _padstr_ to a length specified by _len_. 336 337 338 ### SUBSTRING 339 {: #substring} 340 341 ``` 342 SUBSTRING(str FROM position [FOR len]) 343 SUBSTRING(str, position [, len]) 344 ``` 345 346 _str_ 347 : [string]({{ '/reference/value.html#string' | relative_url }}) 348 349 _position_ 350 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 351 352 _len_ 353 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 354 355 _return_ 356 : [string]({{ '/reference/value.html#string' | relative_url }}) 357 358 Returns the _len_ characters in _str_ starting from the _position_-th character using one-based positional indexing. 359 360 If _position_ is 0, then it is treated as 1.<br /> 361 if _len_ is not specified or _len_ is longer than the length from _position_ to the end, then returns the substring from _position_ to the end.<br /> 362 If _position_ is negative, then starting position is _position_ from the end of the _str_. 363 364 365 ### SUBSTR 366 {: #substr} 367 368 ``` 369 SUBSTR(str, position [, len]) 370 ``` 371 372 _str_ 373 : [string]({{ '/reference/value.html#string' | relative_url }}) 374 375 _position_ 376 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 377 378 _len_ 379 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 380 381 _return_ 382 : [string]({{ '/reference/value.html#string' | relative_url }}) 383 384 This function behaves the same as [SUBSTRING](#substring), but uses zero-based positional indexing. 385 386 387 ### INSTR 388 {: #instr} 389 390 ``` 391 INSTR(str, substr) 392 ``` 393 394 _str_ 395 : [string]({{ '/reference/value.html#string' | relative_url }}) 396 397 _substr_ 398 : [string]({{ '/reference/value.html#string' | relative_url }}) 399 400 _return_ 401 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 402 403 Returns the index of the first occurrence of _substr_ in _str_, 404 or null if _substr_ is not present in _str_. 405 406 ### LIST_ELEM 407 {: #list_elem} 408 409 ``` 410 LIST_ELEM(str, sep, index) 411 ``` 412 413 _str_ 414 : [string]({{ '/reference/value.html#string' | relative_url }}) 415 416 _sep_ 417 : [string]({{ '/reference/value.html#string' | relative_url }}) 418 419 _index_ 420 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 421 422 _return_ 423 : [string]({{ '/reference/value.html#string' | relative_url }}) 424 425 Returns the string at _index_ in the list generated by splitting with _sep_ from _str_. 426 427 ### REPLACE 428 {: #replace} 429 430 ``` 431 REPLACE(str, old, new) 432 ``` 433 434 _str_ 435 : [string]({{ '/reference/value.html#string' | relative_url }}) 436 437 _old_ 438 : [string]({{ '/reference/value.html#string' | relative_url }}) 439 440 _new_ 441 : [string]({{ '/reference/value.html#string' | relative_url }}) 442 443 _return_ 444 : [string]({{ '/reference/value.html#string' | relative_url }}) 445 446 Returns the string that is replaced all occurrences of _old_ with _new_ in _str_. 447 448 ### REGEXP_MATCH 449 {: #regexp_match} 450 451 ``` 452 REGEXP_MATCH(str, regexp [, flags]) 453 ``` 454 455 _str_ 456 : [string]({{ '/reference/value.html#string' | relative_url }}) 457 458 _regexp_ 459 : [string]({{ '/reference/value.html#string' | relative_url }}) 460 461 _flags_ 462 : [string]({{ '/reference/value.html#string' | relative_url }}) 463 464 A string including the [flags of regular expressions](#flags-of-regular-expressions) 465 466 _return_ 467 : [ternary]({{ '/reference/value.html#ternary' | relative_url }}) 468 469 Verifies the string _str_ matches with the regular expression _regexp_. 470 471 ### REGEXP_FIND 472 {: #regexp_find} 473 474 ``` 475 REGEXP_FIND(str, regexp [, flags]) 476 ``` 477 478 _str_ 479 : [string]({{ '/reference/value.html#string' | relative_url }}) 480 481 _regexp_ 482 : [string]({{ '/reference/value.html#string' | relative_url }}) 483 484 _flags_ 485 : [string]({{ '/reference/value.html#string' | relative_url }}) 486 487 A string including the [flags of regular expressions](#flags-of-regular-expressions) 488 489 _return_ 490 : [string]({{ '/reference/value.html#string' | relative_url }}) 491 492 Returns the string that matches the regular expression _regexp_ in _str_. 493 494 #### Examples 495 496 ```shell 497 # Return the matched string. 498 $ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'cdef')" 499 +----------------------------------------+ 500 | REGEXP_FIND('ABCDEFG abcdefg', 'cdef') | 501 +----------------------------------------+ 502 | cdef | 503 +----------------------------------------+ 504 505 # Return the submatched string if there is a submatch expression. 506 $ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'c(de)f')" 507 +------------------------------------------+ 508 | REGEXP_FIND('ABCDEFG abcdefg', 'c(de)f') | 509 +------------------------------------------+ 510 | de | 511 +------------------------------------------+ 512 513 # Return the first matched string if there are multiple matched strings. 514 $ csvq "SELECT REGEXP_FIND('ABCDEFG abcdefg', 'cdef', 'i')" 515 +---------------------------------------------+ 516 | REGEXP_FIND('ABCDEFG abcdefg', 'cdef', 'i') | 517 +---------------------------------------------+ 518 | CDEF | 519 +---------------------------------------------+ 520 ``` 521 522 ### REGEXP_FIND_SUBMATCHES 523 {: #regexp_find_submatches} 524 525 ``` 526 REGEXP_FIND_SUBMATCHES(str, regexp [, flags]) 527 ``` 528 529 _str_ 530 : [string]({{ '/reference/value.html#string' | relative_url }}) 531 532 _regexp_ 533 : [string]({{ '/reference/value.html#string' | relative_url }}) 534 535 _flags_ 536 : [string]({{ '/reference/value.html#string' | relative_url }}) 537 538 A string including the [flags of regular expressions](#flags-of-regular-expressions) 539 540 _return_ 541 : [string]({{ '/reference/value.html#string' | relative_url }}) 542 543 A string representing a JSON array. 544 545 Returns the string representing an array that matches the regular expression _regexp_ in _str_. 546 547 #### Examples 548 549 ```shell 550 # Return all the first matched strings including submatches. 551 $ csvq "SELECT REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i')" 552 +----------------------------------------------------------+ 553 | REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i') | 554 +----------------------------------------------------------+ 555 | ["CDEF","DE"] | 556 +----------------------------------------------------------+ 557 558 # Return only the submatched string. 559 $ csvq "SELECT JSON_VALUE('[1]', REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i'))" 560 +-----------------------------------------------------------------------------+ 561 | JSON_VALUE('[1]', REGEXP_FIND_SUBMATCHES('ABCDEFG abcdefg', 'c(de)f', 'i')) | 562 +-----------------------------------------------------------------------------+ 563 | DE | 564 +-----------------------------------------------------------------------------+ 565 ``` 566 567 ### REGEXP_FIND_ALL 568 {: #regexp_all} 569 570 ``` 571 REGEXP_FIND_ALL(str, regexp [, flags]) 572 ``` 573 574 _str_ 575 : [string]({{ '/reference/value.html#string' | relative_url }}) 576 577 _regexp_ 578 : [string]({{ '/reference/value.html#string' | relative_url }}) 579 580 _flags_ 581 : [string]({{ '/reference/value.html#string' | relative_url }}) 582 583 A string including the [flags of regular expressions](#flags-of-regular-expressions) 584 585 _return_ 586 : [string]({{ '/reference/value.html#string' | relative_url }}) 587 588 A string representing a nested JSON array. 589 590 Returns the string representing a nested array that matches the regular expression _regexp_ in _str_. 591 592 #### Examples 593 594 ```shell 595 # Return all the matched strings. 596 $ csvq "SELECT REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i')" 597 +---------------------------------------------------+ 598 | REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i') | 599 +---------------------------------------------------+ 600 | [["CDEF","DE"],["cdef","de"]] | 601 +---------------------------------------------------+ 602 603 # Return only submatched strings as an array. 604 $ csvq "SELECT JSON_VALUE('[][1]', REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i'))" 605 +------------------------------------------------------------------------+ 606 | JSON_VALUE('[][1]', REGEXP_FIND_ALL('ABCDEFG abcdefg', 'c(de)f', 'i')) | 607 +------------------------------------------------------------------------+ 608 | ["DE","de"] | 609 +------------------------------------------------------------------------+ 610 ``` 611 612 ### REGEXP_REPLACE 613 {: #regexp_replace} 614 615 ``` 616 REGEXP_REPLACE(str, regexp, replacement_value [, flags]) 617 ``` 618 619 _str_ 620 : [string]({{ '/reference/value.html#string' | relative_url }}) 621 622 _regexp_ 623 : [string]({{ '/reference/value.html#string' | relative_url }}) 624 625 _replacement_value_ 626 : [string]({{ '/reference/value.html#string' | relative_url }}) 627 628 _flags_ 629 : [string]({{ '/reference/value.html#string' | relative_url }}) 630 631 A string including the [flags of regular expressions](#flags-of-regular-expressions) 632 633 _return_ 634 : [string]({{ '/reference/value.html#string' | relative_url }}) 635 636 637 Returns the string replaced substrings that match the regular expression _regexp_ with _replacement_value_ in _str_. 638 639 ### TITLE_CASE 640 {: #title_case} 641 642 ``` 643 TITLE_CASE(str) 644 ``` 645 646 _str_ 647 : [string]({{ '/reference/value.html#string' | relative_url }}) 648 649 _return_ 650 : [string]({{ '/reference/value.html#string' | relative_url }}) 651 652 Returns a string with the first letter of each word in _str_ capitalized. 653 654 ### FORMAT 655 {: #format} 656 657 ``` 658 FORMAT(format [, replace_value ... ]) 659 ``` 660 661 _format_ 662 : [string]({{ '/reference/value.html#string' | relative_url }}) 663 664 _replace_value_ 665 : [value]({{ '/reference/value.html' | relative_url }}) 666 667 _return_ 668 : [string]({{ '/reference/value.html#string' | relative_url }}) 669 670 Returns a formatted string replaced placeholders with _replace_ in _format_. 671 672 #### Format Placeholder 673 674 ``` 675 %[flag][width][.precision]specifier 676 ``` 677 678 flag 679 : | flag | description | 680 | :- | :- | 681 | + | Print a plus sign for numeric values | 682 | ' ' (U+0020 Space) | Print a space instead of a plus sign | 683 | - | Pad on the right | 684 | 0 | Pad with zeros | 685 686 width 687 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 688 689 Width of the replaced string. 690 691 precision 692 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 693 694 Number of digits after the decimal point for a float value, 695 or max length for a string value. 696 697 specifier 698 : | specifier | description | 699 | :- | :- | 700 | b | Base 2 integer | 701 | o | Base 8 integer | 702 | d | Base 10 integer | 703 | x | Base 16 integer with lower cases | 704 | X | Base 16 integer with upper cases | 705 | e | Exponential notation with lower cases | 706 | E | Exponential notation with upper cases | 707 | f | Floating point decimal number | 708 | s | String representation of the value | 709 | q | Quoted string representation of the value | 710 | i | Quoted identifier representation of the value | 711 | T | Type of the value | 712 | % | '%' | 713 714 > Quoted string and identifier representations are escaped for [special characters]({{ '/reference/command.html#special_characters' | relative_url }}). 715 716 ### JSON_VALUE 717 {: #json_value} 718 719 ``` 720 JSON_VALUE(json_query, json_data) 721 ``` 722 723 _json_query_ 724 : [string]({{ '/reference/value.html#string' | relative_url }}) 725 726 [JSON Query]({{ '/reference/json.html#query' |relative_url }}) to uniquely specify a value. 727 728 _json_data_ 729 : [string]({{ '/reference/value.html#string' | relative_url }}) 730 731 _return_ 732 : [value]({{ '/reference/value.html' | relative_url }}) 733 734 Returns a value in _json_data. 735 736 A JSON values are converted to following types. 737 738 | JSON value | csvq value | 739 |:-|:-| 740 | object | string | 741 | array | string | 742 | number | integer or float | 743 | string | string | 744 | true | boolean | 745 | false | boolean | 746 | null | null | 747 748 749 ### JSON_OBJECT 750 {: #json_object} 751 752 ``` 753 JSON_OBJECT([field [, field ...]]) 754 755 field 756 : value 757 : value AS alias 758 ``` 759 760 _value_ 761 : [value]({{ '/reference/value.html' | relative_url }}) 762 763 _alias_ 764 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 765 766 _return_ 767 : [string]({{ '/reference/value.html#string' | relative_url }}) 768 769 770 Returns a string formatted in JSON. 771 772 If no arguments are passed, then the object include all fields in the view. 773 774 ## Flags of regular expressions 775 {: #flags-of-regular-expressions} 776 777 | flag | description | 778 |:-|:-| 779 | i | case-insensitive | 780 | m | multi-line mode | 781 | s | let . match \n | 782 | U | swap meaning of x* and x*?, x+ and x+?, etc. |