github.com/xiaq/elvish@v0.12.0/website/src/learn/effective-elvish.md

<!-- toc -->

Elvish is not an entirely new language. Its programming techniques have two
primary sources: traditional Unix shells and functional programming languages,
both dating back many decades. However, the way Elvish combines those two
paradigms is unique in many ways, which enables new ways to write code.

This document is an advanced tutorial focusing on how to write idiomatic
Elvish code: code that is concise and clear, and takes full advantage of
Elvish's features.

An appropriate adjective for idiomatic Elvish code, like *Pythonic* for Python
or *Rubyesque* for Ruby, is **Elven**. In [Roguelike
games](https://en.wikipedia.org/wiki/Roguelike), Elven items are known to be
high-quality, artful and resilient. So is Elven code.


# Style

## Naming

Use `dash-delimited-words` for names of variables and functions. Underscores
are allowed in variable and function names, but their use should be limited to
environment variables (e.g. `$E:LC_ALL`) and external commands (e.g. `pkg_add`).

When building a module, use a leading dash to communicate that a variable or
function is subject to change in the future and cannot be relied upon, either
because it is an experimental feature or because it is an implementation
detail.

Elvish's core libraries follow the naming convention above.
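
As a rough illustration of these conventions, a module might look like the
sketch below. The module name `greeting` and every name inside it
(`default-name`, `-format`, `say-hello`) are hypothetical, and the syntax is
that of Elvish 0.12:

```elvish
# greeting.elv: a hypothetical module; all names are made up for illustration.

# Public variable: dash-delimited words.
default-name = world

# Leading dash: an implementation detail that may change without notice.
fn -format [name]{ put 'Hello, '$name'!' }

# Public function, built on the private helper.
fn say-hello [name]{ -format $name }
```

Code outside the module would be expected to rely only on `say-hello` and
`default-name`, and to treat `-format` as off limits.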

## Indentation

Indent by two spaces.

## Code Blocks

In Elvish, code blocks in control structures are delimited by curly braces.
This is perhaps the most visible difference between Elvish and most other
shells like bash, zsh or fish. The following bash code:

```bash
if true; then
  echo true
fi
```

is written like this in Elvish:

```elvish
if $true {
  echo true
}
```

If you have used lambdas in Elvish, you will notice that code blocks are
syntactically just parameter-list-less lambdas.

In Elvish, you cannot put opening braces of code blocks on the next line. This
won't work:

```elvish
if $true
{ # wrong!
  echo true
}
```

Instead, you must write:

```elvish
if $true {
  echo true
}
```

This is because in Elvish, control structures like `if` follow the same syntax
as normal commands, so a newline terminates them. For the code block to be
part of the `if` command, its opening brace must appear on the same line.


# Using the Pipeline

Elvish is equipped with a powerful tool for passing data: the pipeline. Like
in traditional shells, it is an intuitive notation for data processing: data
flows from left to right, undergoing one transformation after another. Unlike
in traditional shells, it is not restricted to unstructured bytes: all Elvish
values, including lists, maps and even closures, can flow in the pipeline.
This section documents how to make the most of pipelines.

## Returning Values with Structured Output

Unlike functions in most other programming languages, Elvish commands do not
have return values. Instead, they can write to *structured output*, which is
similar to the traditional byte-based stdout, but preserves the internal
structure of arbitrary Elvish values. The most fundamental command that does
this is `put`:

```elvish-transcript
~> put foo
▶ foo
~> x = (put foo)
~> put $x
▶ foo
```

This is hardly impressive - you can output and recover simple strings using
good old byte-based output as well. But let's try this:

```elvish-transcript
~> put "a\nb" [foo bar]
▶ "a\nb"
▶ [foo bar]
~> s li = (put "a\nb" [foo bar])
~> put $s
▶ "a\nb"
~> put $li[0]
▶ foo
```

Here, two things are worth mentioning: the first value we `put` contains a
newline, and the second value is a list. When we capture the output, we get
those exact values back. Passing structured data is difficult with byte-based
output, but trivial with value output.
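
Maps travel through value output just as well. In the sketch below, the
function `file-info` and its fields are hypothetical, chosen only to show a
map being "returned" and then indexed after capture:

```elvish
# Hypothetical helper that "returns" a map by writing it to value output.
fn file-info [name]{ put [&name=$name &kind=source] }

info = (file-info hello.go)
# The captured value is still a map, so it can be indexed: this outputs hello.go
put $info[name]
```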

Besides `put`, many other builtin commands also write to structured output,
like `splits`:

```elvish-transcript
~> splits , foo,bar
▶ foo
▶ bar
~> words = [(splits , foo,bar)]
~> put $words
▶ [foo bar]
```

User-defined functions behave in the same way: they "return" values by writing
to structured stdout. Without realizing that "return values" are just outputs
in Elvish, it is easy to think of `put` as **the** command to "return" values
and write code like this:

```elvish-transcript
~> fn split-by-comma [s]{ put (splits , $s) }
~> split-by-comma foo,bar
▶ foo
▶ bar
```

The `split-by-comma` function works, but it can be written more concisely
as:

```elvish-transcript
~> fn split-by-comma [s]{ splits , $s }
~> split-by-comma foo,bar
▶ foo
▶ bar
```

In fact, the pattern `put (some-cmd)` is almost always redundant and
equivalent to just `some-cmd`.

Similarly, it is seldom necessary to write `echo (some-cmd)`: it is almost
always equivalent to just `some-cmd`. As an exercise, try simplifying the
following function:

```elvish
fn git-describe { echo (git describe --tags --always) }
```

## Mixing Bytes and Values

Each pipe in Elvish comprises two components: one traditional byte pipe that
carries unstructured bytes, and one value pipe that carries Elvish values. You
can write to both, and output capture will capture both:

```elvish-transcript
~> fn f { echo bytes; put value }
~> f
bytes
▶ value
~> outs = [(f)]
~> put $outs
▶ [bytes value]
```

This also illustrates that the output capture operator `(...)` works with both
byte and value outputs, and it can recover the output sent to `echo`. When
byte output contains multiple lines, each line becomes one value:

```elvish-transcript
~> x = [(echo "lorem\nipsum")]
~> put $x
▶ [lorem ipsum]
```

Most Elvish builtin functions also work with both byte and value inputs. Like
output capture, they split their byte input at newlines. For example:

```elvish-transcript
~> use str
~> put lorem ipsum | each $str:to-upper~
▶ LOREM
▶ IPSUM
~> echo "lorem\nipsum" | each $str:to-upper~
▶ LOREM
▶ IPSUM
```

This line-oriented processing of byte input is consistent with traditional
Unix tools like `grep`, `sed` and `awk`. In fact, it is easy to write your own
`grep` in Elvish:

```elvish-transcript
~> use re
~> fn mygrep [p]{ each [line]{ if (re:match $p $line) { echo $line } } }
~> cat in.txt
abc
123
lorem
456
~> cat in.txt | mygrep '[0-9]'
123
456
```

(Note that it is more concise to write `mygrep ... < in.txt`, but due to [a
bug](https://github.com/elves/elvish/issues/600) this does not work.)
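
The same pattern extends to other line filters. As a sketch, an inverted match
in the spirit of `grep -v` only needs the builtin `not` around the test. The
name `mygrep-v` is made up here, and the transcript assumes the same `in.txt`
as above:

```elvish-transcript
~> use re
~> fn mygrep-v [p]{ each [line]{ if (not (re:match $p $line)) { echo $line } } }
~> cat in.txt | mygrep-v '[0-9]'
abc
lorem
```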

However, this line-oriented behavior is not always desirable: not all Unix
commands output newline-separated data. When you want to get the output as is,
as a single string, you can use the `slurp` command:

```elvish-transcript
~> echo "a\nb\nc" | slurp
▶ "a\nb\nc\n"
```

One immediate use of `slurp` is to read a whole file into a string:

```elvish-transcript
~> cat hello.go
package main

import "fmt"

func main() {
	fmt.Println("vim-go")
}
~> hello-go = (slurp < hello.go)
~> put $hello-go
▶ "package main\n\nimport \"fmt\"\n\nfunc main() {\n\tfmt.Println(\"vim-go\")\n}\n"
```

It is also useful, for example, when working with NUL-separated output:

```elvish-transcript
~> touch "a\nb.go"
~> mkdir d
~> touch d/f.go
~> find . -name '*.go' -print0 | splits "\000" (slurp)
▶ "./a\nb.go"
▶ ./d/f.go
▶ ''
```

In the above command, `slurp` turns the input into one string, which is then
used as an argument to `splits`. The `splits` command then splits the whole
input by NUL bytes.

Note that in Elvish, strings can contain NUL bytes; in fact, they can contain
any byte, which makes Elvish suitable for working with binary data. (Also, note
that the `find` command terminates its output with a NUL byte, hence we see a
trailing empty string in the output.)
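
If the trailing empty string gets in the way, one possible approach (a sketch,
not the only way) is to filter it out with `each`. The snippet assumes the
same files as in the transcript above:

```elvish
# Split find's NUL-terminated output, then drop the empty string that follows
# the final NUL byte.
find . -name '*.go' -print0 | splits "\000" (slurp) | each [f]{
  if (not (eq $f '')) { put $f }
}
```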

One side note: in the first example of this subsection, we saw that `bytes`
appeared before `value`. This is not guaranteed: byte output and value output
are separate, so it is possible to get `value` before `bytes` in more complex
cases. Writes to one component, however, always have their order preserved, so
in `put x; put y`, `x` will always appear before `y`.

## Prefer Pipes Over Parentheses

If you have experience with Lisp, you will discover that you can write Elvish
code very similar to Lisp. For instance, to split a string containing
comma-separated values, reduplicate each value (using commas as separators),
and rejoin them with semicolons, you can write:

```elvish-transcript
~> csv = a,b,foo,bar
~> joins ';' [(each [x]{ put $x,$x } [(splits , $csv)])]
▶ 'a,a;b,b;foo,foo;bar,bar'
```

This code works, but it is hard to read. In particular, since `splits`
outputs multiple values but `each` wants a list argument, you have to wrap the
output of `splits` in a list with `[(splits ...)]`. Then you have to do this
again in order to pass the output of `each` to `joins`. You might wonder why
commands like `splits` and `each` do not simply output a list to make this
easier.

The answer to that particular question is in the next subsection, but for the
program at hand, there is a much better way to write it:

```elvish-transcript
~> csv = a,b,foo,bar
~> splits , $csv | each [x]{ put $x,$x } | joins ';'
▶ 'a,a;b,b;foo,foo;bar,bar'
```

Besides having fewer pairs of parentheses (and brackets), this program is also
more readable, because the data flows from left to right, and there is no
nesting. You can see that `$csv` is first split by commas, then each value
gets reduplicated, and then finally everything is joined by semicolons. It
matches exactly how you would describe the algorithm in spoken English -- or
for that matter, any spoken language!

Both versions work, because commands like `each` and `joins` that work with
multiple inputs can take their inputs in two ways: they can take the inputs as
one list argument, like in the first version; or from the pipeline, like in
the second version. Whenever possible, you should prefer the
input-from-pipeline form: it makes for programs that have little nesting and
read naturally.

One exception to the recommendation is when the input is a small set of things
known beforehand. For example:

```elvish-transcript
~> each $str:to-upper~ [lorem ipsum]
▶ LOREM
▶ IPSUM
```

Here, using the input-from-argument form is completely fine: if you want to
use the input-from-pipeline form, you have to supply the input using `put`,
which is also OK but a bit more wordy:

```elvish-transcript
~> put lorem ipsum | each $str:to-upper~
▶ LOREM
▶ IPSUM
```

However, not all commands support taking input from the pipeline. For example,
if we want to first join some values with spaces and then split at commas,
this won't work:

```elvish-transcript
~> joins ' ' [a,b c,d] | splits ,
Exception: want 2 arguments, got 1
[tty], line 1: joins ' ' [a,b c,d] | splits ,
```

This is because the `splits` command only ever works with one input (one
string to split), and was not implemented to support taking input from the
pipeline; hence it always takes 2 arguments and we got an exception.

It is easy to remedy this situation, however. The `all` command passes its
input to its output, and by capturing its output, we can turn the input into
an argument:

```elvish-transcript
~> joins ' ' [a,b c,d] | splits , (all)
▶ a
▶ 'b c'
▶ d
```
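
If this comes up often, the trick can be wrapped in a small function that
reads from the pipeline. The sketch below uses a hypothetical name,
`split-each`, and swaps `(all)` for `each` so that it also copes with more
than one input string:

```elvish-transcript
~> fn split-each [sep]{ each [s]{ splits $sep $s } }
~> joins ' ' [a,b c,d] | split-each ,
▶ a
▶ 'b c'
▶ d
~> put a,b c,d | split-each ,
▶ a
▶ b
▶ c
▶ d
```

With `(all)`, the two input strings in the last command would have become two
separate arguments, and `splits` would have failed with an arity exception.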

## Streaming Multiple Outputs

In the previous subsection, we remarked that commands like `splits` and `each`
write multiple output values instead of one list. Why?

This has to do with another advantage of passing data through the pipeline: in
a pipeline, all commands are executed in parallel. A command in a pipeline
does not need to wait for the command before it to finish running before it
can start processing data. Try this in your terminal:

```elvish-transcript
~> each $str:to-upper~ | each [x]{ put $x$x }
(Start typing)
abc
▶ ABCABC
xyz
▶ XYZXYZ
(Press ^D)
```

You will notice that as soon as you press Enter after typing `abc`, the output
`ABCABC` is shown. As soon as one input is available, it goes through the
entire pipeline, each command doing its work. This gives you immediate
feedback, and makes good use of multi-core CPUs on modern computers. Pipelines
are like assembly lines in the manufacturing industry.

If, instead of passing multiple values, we pass a list through the pipeline,
each command has to wait for the previous command to do all its processing and
pack the results into a list before it can start doing anything. Although the
commands themselves still run in parallel, each of them has to wait for the
previous one to finish before it can do any real work.

This is why commands like `each` and `splits` produce multiple values instead
of one list. When writing your functions, try to make them produce multiple
values as well: they will cooperate better with builtin commands, and they can
benefit from the efficiency of parallel computations.
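
To make the recommendation concrete, here is a sketch contrasting the two
styles. Both function names are hypothetical: `doubled` streams each result as
soon as it is ready, while `doubled-list` holds everything back until its
input is exhausted:

```elvish-transcript
~> fn doubled { each [x]{ put $x$x } }
~> fn doubled-list { put [(each [x]{ put $x$x })] }
~> put a b | doubled
▶ aa
▶ bb
~> put a b | doubled-list
▶ [aa bb]
```

Only the first version composes directly with commands like `each` and
`joins`; a consumer of `doubled-list` has to unpack the list first.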


# Working with Multiple Values

In Elvish, many constructs can evaluate to multiple values. This can be
surprising if you are not familiar with it.

To start with, output captures evaluate to all the captured values, instead of
a list:

```elvish-transcript
~> splits , a,b,c
▶ a
▶ b
▶ c
~> li = (splits , a,b,c)
Exception: arity mismatch
[tty], line 1: li = (splits , a,b,c)
```

The assignment fails with "arity mismatch" because the right-hand side
evaluates to 3 values, but you are attempting to assign them to just one
variable. If you want to capture the results into a list, you have to
explicitly do so, either by constructing a list or using rest variables:

```elvish-transcript
~> li = [(splits , a,b,c)]
~> put $li
▶ [a b c]
~> @li = (splits , a,b,c) # equivalent and slightly shorter
```

## Assigning Multiple Variables


# To Be Continued...

As of this writing, Elvish is neither stable nor complete. The builtin
libraries still have missing pieces, the package manager is in its early days,
and things like a type system and macros have been proposed and considered,
but not yet worked on. Deciding best practices for using feature *x* can be a
bit tricky when that feature *x* doesn't yet exist!

The current version of the document is what the lead developer of Elvish
(@xiaq) has collected as best practices for writing Elvish code in early 2018,
between the release of Elvish 0.11 and 0.12. They apply to aspects of the
Elvish language that are relatively complete and stable; but as Elvish
evolves, the document will co-evolve. You are invited to revisit this document
once in a while!