github.com/elves/elvish@v0.15.0/website/learn/effective-elvish.md

<!-- toc -->

Elvish is not an entirely new language. Its programming techniques have two
primary sources: traditional Unix shells and functional programming languages,
both dating back many decades. However, the way Elvish combines those two
paradigms is unique in many ways, which enables new ways to write code.

This document is an advanced tutorial focusing on how to write idiomatic Elvish
code: code that is concise and clear, and that takes full advantage of Elvish's
features.

An appropriate adjective for idiomatic Elvish code, like _Pythonic_ for Python
or _Rubyesque_ for Ruby, is **Elven**. In
[Roguelike games](https://en.wikipedia.org/wiki/Roguelike), Elven items are
known to be high-quality, artful and resilient. So is Elven code.

# Style

## Naming

Use `dash-delimited-words` for names of variables and functions. Underscores are
allowed in variable and function names, but their use should be limited to
environment variables (e.g. `$E:LC_ALL`) and external commands (e.g. `pkg_add`).

When building a module, use a leading dash to communicate that a variable or
function is subject to change in the future and cannot be relied upon, either
because it is an experimental feature or because it is an implementation detail.

Elvish's core libraries follow the naming convention above.
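As an illustration of these conventions, here is a minimal sketch of a module file. The module and every name in it (`greeting.elv`, `default-greeting`, `-format-greeting`, `greet-user`) are hypothetical and exist only for this example:

```elvish
# greeting.elv: a hypothetical module, used only to illustrate naming.

# Public variable: dash-delimited words.
default-greeting = Hello

# Leading dash: an implementation detail that may change without notice.
fn -format-greeting [greeting name]{
  put $greeting', '$name'!'
}

# Public function: dash-delimited words.
fn greet-user [name]{
  -format-greeting $default-greeting $name
}

# Underscores appear only where the outside world dictates them,
# as in the name of the LC_ALL environment variable.
E:LC_ALL = C
```

Code that uses such a module would call `greet-user` and treat `-format-greeting` as off limits.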

## Indentation

Indent by two spaces.

## Code Blocks

In Elvish, code blocks in control structures are delimited by curly braces. This
is perhaps the most visible difference between Elvish and most other shells like
bash, zsh or fish. The following bash code:

```bash
if true; then
  echo true
fi
```

is written like this in Elvish:

```elvish
if $true {
  echo true
}
```

If you have used lambdas in Elvish, you will notice that code blocks are
syntactically just parameter-list-less lambdas.
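One way to see this is to store a block in a variable and call it like any other lambda. A minimal sketch (the variable name `greet` is made up for the example):

```elvish
# A brace-delimited block with no parameter list is an ordinary lambda.
greet = { echo hello }

# Calling the variable runs the block; this prints "hello".
$greet
```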

In Elvish, you cannot put opening braces of code blocks on the next line. This
won't work:

```elvish
if $true
{ # wrong!
  echo true
}
```

Instead, you must write:

```elvish
if $true {
  echo true
}
```

This is because in Elvish, control structures like `if` follow the same syntax
as normal commands, hence newlines terminate them. To make the code block part
of the `if` command, its opening brace must appear on the same line.
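The same reasoning extends to `elif` and `else` clauses: they are further arguments to the `if` command, so each one has to start on the line that closes the preceding block. A minimal sketch (the function `describe` and its argument are made up for the example):

```elvish
# Each "} elif ... {" and "} else {" stays on one line, so the whole
# construct is still a single if command.
fn describe [answer]{
  if (eq $answer yes) {
    echo accepted
  } elif (eq $answer no) {
    echo rejected
  } else {
    echo undecided
  }
}

describe maybe   # prints "undecided"
```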

# Using the Pipeline

Elvish is equipped with a powerful tool for passing data: the pipeline. Like in
traditional shells, it is an intuitive notation for data processing: data flows
from left to right, undergoing one transformation after another. Unlike in
traditional shells, it is not restricted to unstructured bytes: all Elvish
values, including lists, maps and even closures, can flow in the pipeline. This
section documents how to make the most of pipelines.

## Returning Values with Structured Output

Unlike functions in most other programming languages, Elvish commands do not
have return values. Instead, they can write to _structured output_, which is
similar to the traditional byte-based stdout, but preserves all internal
structures of arbitrary Elvish values. The most fundamental command that does
this is `put`:

```elvish-transcript
~> put foo
▶ foo
~> x = (put foo)
~> put $x
▶ foo
```

This is hardly impressive - you can output and recover simple strings using good
old byte-based output as well. But let's try this:

```elvish-transcript
~> put "a\nb" [foo bar]
▶ "a\nb"
▶ [foo bar]
~> s li = (put "a\nb" [foo bar])
~> put $s
▶ "a\nb"
~> put $li[0]
▶ foo
```

Here, two things are worth mentioning: the first value we `put` contains a
newline, and the second value is a list. When we capture the output, we get
those exact values back. Passing structured data is difficult with byte-based
output, but trivial with value output.

Besides `put`, many other builtin commands and commands in builtin modules also
write to structured output, like `str:split`:

```elvish-transcript
~> use str
~> str:split , foo,bar
▶ foo
▶ bar
~> words = [(str:split , foo,bar)]
~> put $words
▶ [foo bar]
```

User-defined functions behave in the same way: they "return" values by writing
to structured output. Without realizing that "return values" are just outputs in
Elvish, it is easy to think of `put` as **the** command to "return" values and
write code like this:

```elvish-transcript
~> fn split-by-comma [s]{ use str; put (str:split , $s) }
~> split-by-comma foo,bar
▶ foo
▶ bar
```

The `split-by-comma` function works, but it can be written more concisely as:

```elvish-transcript
~> fn split-by-comma [s]{ use str; str:split , $s }
~> split-by-comma foo,bar
▶ foo
▶ bar
```

In fact, the pattern `put (some-cmd)` is almost always redundant and equivalent
to just `some-cmd`.

Similarly, it is seldom necessary to write `echo (some-cmd)`: it is almost
always equivalent to just `some-cmd`. As an exercise, try simplifying the
following function:

```elvish
fn git-describe { echo (git describe --tags --always) }
```

## Mixing Bytes and Values

Each pipe in Elvish comprises two components: one traditional byte pipe that
carries unstructured bytes, and one value pipe that carries Elvish values. You
can write to both, and output capture will capture both:

```elvish-transcript
~> fn f { echo bytes; put value }
~> f
bytes
▶ value
~> outs = [(f)]
~> put $outs
▶ [bytes value]
```

This also illustrates that the output capture operator `(...)` works with both
byte and value outputs, and it can recover the output sent to `echo`. When byte
output contains multiple lines, each line becomes one value:

```elvish-transcript
~> x = [(echo "lorem\nipsum")]
~> put $x
▶ [lorem ipsum]
```

Most Elvish builtin functions also work with both byte and value inputs.
Similarly to output capture, they split their byte input by newlines. For
example:

```elvish-transcript
~> use str
~> put lorem ipsum | each $str:to-upper~
▶ LOREM
▶ IPSUM
~> echo "lorem\nipsum" | each $str:to-upper~
▶ LOREM
▶ IPSUM
```

This line-oriented processing of byte input is consistent with traditional Unix
tools like `grep`, `sed` and `awk`. In fact, it is easy to write your own `grep`
in Elvish:

```elvish-transcript
~> use re
~> fn mygrep [p]{ each [line]{ if (re:match $p $line) { echo $line } } }
~> cat in.txt
abc
123
lorem
456
~> cat in.txt | mygrep '[0-9]'
123
456
```

(Note that it is more concise to write `mygrep ... < in.txt`, but due to
[a bug](https://github.com/elves/elvish/issues/600) this does not work.)

However, this line-oriented behavior is not always desirable: not all Unix
commands output newline-separated data. When you want to get the output as is,
as a single string, you can use the `slurp` command:

```elvish-transcript
~> echo "a\nb\nc" | slurp
▶ "a\nb\nc\n"
```

One immediate use of `slurp` is to read a whole file into a string:

```elvish-transcript
~> cat hello.go
package main

import "fmt"

func main() {
	fmt.Println("vim-go")
}
~> hello-go = (slurp < hello.go)
~> put $hello-go
▶ "package main\n\nimport \"fmt\"\n\nfunc main() {\n\tfmt.Println(\"vim-go\")\n}\n"
```

It is also useful, for example, when working with NUL-separated output:

```elvish-transcript
~> touch "a\nb.go"
~> mkdir d
~> touch d/f.go
~> use str
~> find . -name '*.go' -print0 | str:split "\000" (slurp)
▶ "./a\nb.go"
▶ ./d/f.go
▶ ''
```

In the above command, `slurp` turns the input into one string, which is then
used as an argument to `str:split`. The `str:split` command then splits the
whole input by NUL bytes.

Note that in Elvish, strings can contain NUL bytes. In fact, they can contain
any byte, which makes Elvish suitable for working with binary data. (Also, note
that the `find` command terminates its output with a NUL byte, hence we see a
trailing empty string in the output.)
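If the trailing empty string is unwanted, one option is to strip the final NUL byte before splitting. The following is a minimal sketch rather than an established idiom: `split-nul` is a hypothetical helper name, and the sketch assumes that the `str` module provides `str:trim-suffix`.

```elvish
use str

# Hypothetical helper: reads all byte input, drops one trailing NUL if
# present, and writes each NUL-separated field as a separate value.
fn split-nul {
  str:split "\000" (str:trim-suffix (slurp) "\000")
}

# print (unlike echo) adds no trailing newline, so this imitates the
# NUL-terminated output of find -print0.
print "a\nb.go\000d/f.go\000" | split-nul
```

With such a helper, the earlier pipeline could be written as `find . -name '*.go' -print0 | split-nul`.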

One side note: In the first example, we saw that `bytes` appeared before
`value`. This is not guaranteed: byte output and value output are separate, so
it is possible to get `value` before `bytes` in more complex cases. Writes to
one component, however, always have their order preserved, so in
`put x; put y`, `x` will always appear before `y`.

## Prefer Pipes Over Parentheses

If you have experience with Lisp, you will discover that you can write Elvish
code very similar to Lisp. For instance, to split a string containing
comma-separated values, duplicate each value (using commas as separators), and
rejoin them with semicolons, you can write:

```elvish-transcript
~> csv = a,b,foo,bar
~> use str
~> str:join ';' [(each [x]{ put $x,$x } [(str:split , $csv)])]
▶ 'a,a;b,b;foo,foo;bar,bar'
```

This code works, but it is a bit unreadable. In particular, since `str:split`
outputs multiple values but `each` wants a list argument, you have to wrap the
output of `str:split` in a list with `[(str:split ...)]`. Then you have to do
this again in order to pass the output of `each` to `str:join`. You might wonder
why commands like `str:split` and `each` do not simply output a list to make
this easier.

The answer to that particular question is in the next subsection, but for the
program at hand, there is a much better way to write it:

```elvish-transcript
~> csv = a,b,foo,bar
~> use str
~> str:split , $csv | each [x]{ put $x,$x } | str:join ';'
▶ 'a,a;b,b;foo,foo;bar,bar'
```

Besides having fewer pairs of parentheses (and brackets), this program is also
more readable, because the data flows from left to right, and there is no
nesting. You can see that `$csv` is first split by commas, then each value gets
duplicated, and then finally everything is joined by semicolons. It matches
exactly how you would describe the algorithm in spoken English -- or for that
matter, any spoken language!

Both versions work, because commands like `each` and `str:join` that work with
multiple inputs can take their inputs in two ways: they can take the inputs as
one list argument, like in the first version; or from the pipeline, like in the
second version. Whenever possible, you should prefer the input-from-pipeline
form: it makes for programs that have little nesting and read naturally.

One exception to the recommendation is when the input is a small set of things
known beforehand. For example:

```elvish-transcript
~> each $str:to-upper~ [lorem ipsum]
▶ LOREM
▶ IPSUM
```

Here, using the input-from-argument form is completely fine: if you want to use
the input-from-pipeline form, you have to supply the input using `put`, which is
also OK but a bit more wordy:

```elvish-transcript
~> put lorem ipsum | each $str:to-upper~
▶ LOREM
▶ IPSUM
```

However, not all commands support taking input from the pipeline. For example,
if we want to first join some values with space and then split at commas, this
won't work:

```elvish-transcript
~> use str
~> str:join ' ' [a,b c,d] | str:split ,
Exception: arity mismatch: arguments here must be 2 values, but is 1 value
[tty], line 1: str:join ' ' [a,b c,d] | str:split ,
```

This is because the `str:split` command only ever works with one input (one
string to split), and was not implemented to support taking input from the
pipeline; hence it always takes 2 arguments and we got an exception.

However, it is easy to remedy this situation. The `all` command passes its input
to its output, and by capturing its output, we can turn the input into an
argument:

```elvish-transcript
~> use str
~> str:join ' ' [a,b c,d] | str:split , (all)
▶ a
▶ 'b c'
▶ d
```

## Streaming Multiple Outputs

In the previous subsection, we remarked that commands like `str:split` and
`each` write multiple output values instead of one list. Why?

This has to do with another advantage of passing data through the pipeline: in a
pipeline, all commands are executed in parallel. A command in a pipeline does
not need to wait for its previous command to finish running before it can start
processing data. Try this in your terminal:

```elvish-transcript
~> each $str:to-upper~ | each [x]{ put $x$x }
(Start typing)
abc
▶ ABCABC
xyz
▶ XYZXYZ
(Press ^D)
```

You will notice that as soon as you press Enter after typing `abc`, the output
`ABCABC` is shown. As soon as one input is available, it goes through the entire
pipeline, each command doing its work. This gives you immediate feedback, and
makes good use of multi-core CPUs on modern computers. Pipelines are like
assembly lines in the manufacturing industry.

If, instead of passing multiple values, we pass a list through the pipeline,
each command has to wait for its previous command to do all the processing and
pack the results in a list before it can start doing anything. Although the
commands themselves still run in parallel, each one sits idle until its previous
command has finished, so no real work overlaps.

This is why commands like `each` and `str:split` produce multiple values instead
of one list. When writing your functions, try to make them produce multiple
values as well: they will cooperate better with builtin commands, and they can
benefit from the efficiency of parallel computations.
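As a minimal sketch of this advice, the function below writes each result as a separate value as soon as it is found, instead of collecting the results into a list first. The name `nonempty-fields` is hypothetical and chosen only for this example:

```elvish
use str

# Hypothetical helper: writes every non-empty comma-separated field of
# $line as its own value, one at a time.
fn nonempty-fields [line]{
  str:split , $line | each [f]{
    if (not-eq $f '') { put $f }
  }
}

# Downstream commands can consume the values directly from the pipeline.
nonempty-fields a,,b | each $str:to-upper~
```

Because the function does not wrap its results in `[...]`, a caller who does want a list can still build one with `[(nonempty-fields ...)]`, while pipeline users pay no extra cost.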

# Working with Multiple Values

In Elvish, many constructs can evaluate to multiple values. This can be
surprising if you are not familiar with it.

To start with, output captures evaluate to all the captured values, instead of a
list:

```elvish-transcript
~> use str
~> str:split , a,b,c
▶ a
▶ b
▶ c
~> li = (str:split , a,b,c)
Exception: arity mismatch: assignment right-hand-side must be 1 value, but is 3 values
[tty], line 1: li = (str:split , a,b,c)
```

The assignment fails with "arity mismatch" because the right hand side evaluates
to 3 values, but you are attempting to assign them to just one variable. If you
want to capture the results into a list, you have to explicitly do so, either by
constructing a list or using rest variables:

```elvish-transcript
~> use str
~> li = [(str:split , a,b,c)]
~> put $li
▶ [a b c]
~> @li = (str:split , a,b,c) # equivalent and slightly shorter
```
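A rest variable can also be combined with ordinary variables when only some of the values need names of their own. The sketch below assumes that a rest variable may appear alongside ordinary variables in the same assignment; the names `first` and `rest` are made up for the example:

```elvish
use str

# $first takes the first value; @rest gathers the remaining ones in a list.
first @rest = (str:split , a,b,c)

put $first
put $rest
```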

## Assigning Multiple Variables

# To Be Continued...

As of writing, Elvish is neither stable nor complete. The builtin libraries
still have missing pieces, the package manager is in its early days, and things
like a type system and macros have been proposed and considered, but not yet
worked on. Deciding best practices for using feature _x_ can be a bit tricky
when that feature _x_ doesn't yet exist!

The current version of the document is what the lead developer of Elvish (@xiaq)
has collected as best practices for writing Elvish code in early 2018, between
the release of Elvish 0.11 and 0.12. They apply to aspects of the Elvish
language that are relatively complete and stable; but as Elvish evolves, the
document will co-evolve. You are invited to revisit this document once in a
while!