github.com/mithrandie/csvq@v1.18.1/docs/_posts/2006-01-02-select-query.md (about) 1 --- 2 layout: default 3 title: Select Query - Reference Manual - csvq 4 category: reference 5 --- 6 7 # Select Query 8 9 Select query is used to retrieve data from csv files. 10 11 ``` 12 select_query 13 : [with_clause] 14 select_entity 15 [order_by_clause] 16 [limit_clause] 17 [FOR UPDATE] 18 19 select_entity 20 : select_clause 21 [from_clause] 22 [where_clause] 23 [group_by_clause] 24 [having_clause] 25 | select_set_entity set_operator [ALL] select_set_entity 26 27 select_set_entity 28 : select_entity 29 | (select_query) 30 ``` 31 32 _with_clause_ 33 : [With Clause](#with_clause) 34 35 _select_clause_ 36 : [Select Clause](#select_clause) 37 38 _from_clause_ 39 : [From Clause](#from_clause) 40 41 _where_clause_ 42 : [Where Clause](#where_clause) 43 44 _group_by_clause_ 45 : [Group By Clause](#group_by_clause) 46 47 _having_clause_ 48 : [Having Clause](#having_clause) 49 50 _order_by_clause_ 51 : [Order By Clause](#order_by_clause) 52 53 _limit_clause_ 54 : [Limit Clause](#limit_clause) 55 56 _set_operator_ 57 : [Set Operators]({{ '/reference/set-operators.html' | relative_url }}) 58 59 ## With Clause 60 {: #with_clause} 61 62 ```sql 63 WITH common_table_expression [, common_table_expression ...] 64 ``` 65 _common_table_expression_ 66 : [Common Table Expression]({{ '/reference/common-table-expression.html' | relative_url }}) 67 68 ## Select Clause 69 {: #select_clause} 70 71 ```sql 72 SELECT [DISTINCT] field [, field ...] 73 ``` 74 75 ### Distinct 76 77 You can use DISTINCT keyword to retrieve only unique records. 78 79 ### field syntax 80 81 ```sql 82 field 83 : value 84 | value AS alias 85 | * 86 | table_name.* 87 ``` 88 89 _value_ 90 : [value]({{ '/reference/value.html' | relative_url }}) 91 92 _alias_ 93 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 94 95 _table_name_ 96 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 97 98 _*_ 99 : Asterisk(U+002A `*`) denotes all columns. 100 101 When used alone, the asterisk selects all columns in all tables; when used with a table name, it selects all columns in that table. 102 103 ## From Clause 104 {: #from_clause} 105 106 ```sql 107 FROM table [, {table|LATERAL laterable_table} ...] 108 ``` 109 110 If multiple tables have been enumerated, tables are joined using cross join. 111 112 ### table syntax 113 114 ```sql 115 table 116 : table_entity 117 | table_entity alias 118 | table_entity AS alias 119 | join 120 | DUAL 121 | laterable_table 122 | (table) 123 124 table_entity 125 : table_identifier 126 | format_specified_function 127 | inline_format_specified_function 128 129 table_identifier 130 : table_name 131 | url 132 | table_identification_function 133 | STDIN 134 135 laterable_table 136 : subquery 137 | subquery alias 138 | subquery AS alias 139 140 subquery 141 : (select_query) 142 143 join 144 : table CROSS JOIN table 145 | table [INNER] JOIN table join_condition 146 | table {LEFT|RIGHT|FULL} [OUTER] JOIN table join_condition 147 | table NATURAL [INNER] JOIN table 148 | table NATURAL {LEFT|RIGHT|FULL} [OUTER] JOIN table 149 | table CROSS JOIN LATERAL laterable_table 150 | table [INNER] JOIN LATERAL laterable_table join_condition 151 | table LEFT [OUTER] JOIN LATERAL laterable_table join_condition 152 | table NATURAL [INNER] JOIN LATERAL laterable_table 153 | table NATURAL LEFT [OUTER] JOIN LATERAL laterable_table 154 155 join_condition 156 : ON condition 157 | USING (column_name [, column_name, ...]) 158 159 table_identification_function 160 : FILE::(file_path) 161 : INLINE::(file_path) 162 : URL::(url_string) 163 : DATA::(data_string) 164 165 format_specified_function 166 : CSV(delimiter, table_identifier [, encoding [, no_header [, without_null]]]) 167 | FIXED(delimiter_positions, table_identifier [, encoding [, no_header [, without_null]]]) 168 | JSON(json_query, table_identifier) 169 | JSONL(json_query, table_identifier) 170 | LTSV(table_identifier [, encoding [, without_null]]) 171 172 inline_format_specified_function -- Deprecated. Table identification functions can be used instead. 173 : CSV_INLINE(delimiter, inline_table_identifier [, encoding [, no_header [, without_null]]]) 174 | CSV_INLINE(delimiter, csv_data) 175 | JSON_INLINE(json_query, inline_table_identifier [, encoding [, no_header [, without_null]]]) 176 | JSON_INLINE(json_query, json_data) 177 178 inline_table_identifier 179 : table_name 180 | url_identifier 181 182 ``` 183 184 _table_name_ 185 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 186 187 A _table_name_ represents a file path, a [temporary table]({{ '/reference/temporary-table.html' | relative_url }}), or a [inline table]({{ '/reference/common-table-expression.html' | relative_url }}). 188 You can use absolute path or relative path from the directory specified by the ["--repository" option]({{ '/reference/command.html#options' | relative_url }}) as a file path. 189 190 When the file name extension is ".csv", ".tsv", ".json", ".jsonl" or ".txt", the format to be loaded is automatically determined by the file extension, and you can omit it. 191 192 ```sql 193 FROM `user.csv` -- Relative path 194 FROM `/path/to/user.csv` -- Absolute path 195 FROM user -- Relative path without file extension 196 ``` 197 198 The specifications of the command options are used as file attributes such as encoding to be loaded. 199 If you want to specify the different attributes for each file, you can use _table_object_ expressions for each file to load. 200 201 Once a file is loaded, then the data is cached, and it can be loaded with only file name after that within the transaction. 202 203 _url_ 204 : A string of characters representing URL starting with a schema name and a colon. 205 206 "http", "https" and "file" schemes are available. 207 208 ```sql 209 https://example.com/files/data.csv -- Remote resource downloaded using HTTP GET method 210 file:///C:/Users/yourname/files/data.csv -- Local file specified by absolute path 211 file:./data.csv -- Local file specified by relative path 212 ``` 213 214 An inline table is created from remote resources. 215 The downloaded data is cached until the transaction ends. 216 217 The file format is automatically determined when the http response specifies the following content types. 218 219 | MIME type | Format | 220 |:-----------------|:-------| 221 | text/csv | CSV | 222 | application/json | JSON | 223 224 _table_identification_function_ 225 : Function notation with a name followed by two colons. 226 227 - FILE::(file_path) 228 229 file_path: [string]({{ '/reference/value.html#string' | relative_url }}) 230 231 This is the same as specifying a file using _table_name_. 232 233 - INLINE::(file_path) 234 235 file_path: [string]({{ '/reference/value.html#string' | relative_url }}) 236 237 Files read by this function are not cached and cannot be updated. 238 239 - URL::(url_string) 240 241 url_string: [string]({{ '/reference/value.html#string' | relative_url }}) 242 243 When specifying a resource using _url_, the path must be encoded, but this function does not require encoding. 244 245 - DATA::(data_string) 246 247 file_path: [string]({{ '/reference/value.html#string' | relative_url }}) 248 249 This function creates an inline table from a string. 250 251 Example of use in a query: 252 253 ```sql 254 SELECT id, 255 tag_name, 256 (SELECT COUNT(*) FROM JSON('', DATA::(assets))) AS number_of_assets, 257 published_at 258 FROM https://api.github.com/repos/mithrandie/csvq/releases 259 WHERE prerelease = false 260 ORDER BY published_at DESC 261 LIMIT 10 262 ``` 263 264 _alias_ 265 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 266 267 If _alias_ is not specified, _table_name_ stripped its directory path and extension is used as alias. 268 269 ```sql 270 -- Following expressions are equivalent 271 FROM `/path/to/user.csv` 272 FROM `/path/to/user.csv` AS user 273 ``` 274 275 _select_query_ 276 : [Select Query]({{ '/reference/select-query.html' | relative_url }}) 277 278 _condition_ 279 : [value]({{ '/reference/value.html' | relative_url }}) 280 281 _column_name_ 282 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 283 284 _delimiter_ 285 : [string]({{ '/reference/value.html#string' | relative_url }}) 286 287 _json_query_ 288 : [JSON Query]({{ '/reference/json.html#query' | relative_url }}) 289 290 Empty string is equivalent to "{}". 291 292 _delimiter_positions_ 293 : [string]({{ '/reference/value.html#string' | relative_url }}) 294 295 "SPACES" or JSON Array of integers 296 297 _encoding_ 298 : [string]({{ '/reference/value.html#string' | relative_url }}) or [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 299 300 "AUTO", "UTF8", "UTF8M", "UTF16", "UTF16BE", "UTF16LE", "UTF16BEM", "UTF16LEM" or "SJIS". 301 302 _no_header_ 303 : [boolean]({{ '/reference/value.html#boolean' | relative_url }}) 304 305 _without_null_ 306 : [boolean]({{ '/reference/value.html#boolean' | relative_url }}) 307 308 _url_identifier_ 309 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }}) 310 311 A URL of the http or https scheme to refer to a resource. 312 313 _csv_data_ 314 : [string]({{ '/reference/value.html#string' | relative_url }}) 315 316 _json_data_ 317 : [string]({{ '/reference/value.html#string' | relative_url }}) 318 319 #### Special Tables 320 {: #special_tables} 321 322 DUAL 323 : The dual table has one column and one record, and the only field is empty. 324 This table is used to retrieve pseudo columns. 325 326 STDIN 327 : The stdin table loads data from pipe or redirection as a csv data. 328 The stdin table is one of [temporary tables]({{ '/reference/temporary-table.html' | relative_url }}) that is declared automatically. 329 This table cannot to be used in the interactive shell. 330 331 332 ## Where Clause 333 {: #where_clause} 334 335 The Where clause is used to filter records. 336 337 ```sql 338 WHERE condition 339 ``` 340 341 _condition_ 342 : [value]({{ '/reference/value.html' | relative_url }}) 343 344 ## Group By Clause 345 {: #group_by_clause} 346 347 The Group By clause is used to group records. 348 349 ```sql 350 GROUP BY field [, field ...] 351 ``` 352 353 _field_ 354 : [value]({{ '/reference/value.html' | relative_url }}) 355 356 ## Having Clause 357 {: #having_clause} 358 359 The Having clause is used to filter grouped records. 360 361 ```sql 362 HAVING condition 363 ``` 364 365 _condition_ 366 : [value]({{ '/reference/value.html' | relative_url }}) 367 368 ## Order By Clause 369 {: #order_by_clause} 370 371 The Order By clause is used to sort records. 372 373 ```sql 374 ORDER BY order_item [, order_item ...] 375 ``` 376 377 ### order item 378 379 ```sql 380 order_item 381 : field [order_direction] [null_position] 382 383 order_direction 384 : {ASC|DESC} 385 386 null_position 387 : NULLS {FIRST|LAST} 388 ``` 389 390 _field_ 391 : [value]({{ '/reference/value.html' | relative_url }}) 392 393 If DISTINCT keyword is specified in the select clause, you can use only enumerated fields in the select clause as _field_. 394 395 _order_direction_ 396 : _ASC_ sorts records in ascending order. _DESC_ sorts in descending order. _ASC_ is the default. 397 398 _null_position_ 399 : _FIRST_ puts null values first. _LAST_ puts null values last. 400 If _order_direction_ is specified as _ASC_ then _FIRST_ is the default, otherwise _LAST_ is the default. 401 402 403 ## Limit Clause 404 {: #limit_clause} 405 406 The Limit clause is used to specify the maximum number of records to return and exclude the first set of records. 407 408 ```sql 409 limit_clause 410 : LIMIT number_of_records [{ROW|ROWS}] [{ONLY|WITH TIES}] [offset_clause] 411 | LIMIT percentage PERCENT [{ONLY|WITH TIES}] [offset_clause] 412 | [offset_clause] FETCH {FIRST|NEXT} number_of_records {ROW|ROWS} [{ONLY|WITH TIES}] 413 | [offset_clause] FETCH {FIRST|NEXT} percentage PERCENT [{ONLY|WITH TIES}] 414 | offset_clause 415 416 offset_clause 417 : OFFSET number_of_records [{ROW|ROWS}] 418 ``` 419 420 _number_of_records_ 421 : [integer]({{ '/reference/value.html#integer' | relative_url }}) 422 423 _percent_ 424 : [float]({{ '/reference/value.html#integer' | relative_url }}) 425 426 _ROW_ and _ROWS_ after _number_of_records_, _FIRST_ and _NEXT_ after _FETCH_, and _ONLY_ keyword does not affect the result. 427 428 If _PERCENT_ keyword is specified, maximum number of records is _percentage_ percent of the result set that includes the excluded records by _offset_clause_. 429 430 If _WITH TIES_ keywords are specified, all records that have the same sort keys specified by _Order By Clause_ as the last record of the limited records are included in the records to return. 431 If there is no _Order By Clause_ in the query, _WITH TIES_ keywords are ignored. 432