# The Split Personalities of Shell Usage

> Shell usage is split between the need to write something quickly and frequently versus the need to write something more complex but only once. In this article I explore those opposing use cases and how different $SHELLs have chosen to address them.

## A Very Brief History

![Thompson (sitting) and Ritchie working together at a PDP-11](https://nojs.murex.rocks/images/blog/split_personalities/thompson.jpg?v=undef)

In the very early days of UNIX you had the Thompson shell, which supported
pipes, some basic control structures and wildcards. The Thompson shell was
modelled after the Multics shell, which in turn was inspired by `RUNCOM`. In
fact the 'rc' extension often seen in shell profiles is directly taken from
`RUNCOM`.

It wasn't until a little later that variables became a feature of shells. That
came with the PWB shell, which was designed to be upwardly compatible with the
Thompson shell, supporting Thompson syntax while bringing advancements intended
to make shell scripting much more practical.

`RUNCOM`, the inspiration behind modern shells, was a program that literally
just ran commands from a file. Even so, it is this author's opinion that early
UNIX shells were originally designed to be interactive terminals for launching
applications first and foremost, with scripting as a feature that took a few
years to mature. Furthermore, the ALGOL-inspired scripting commands were
originally external executables and only later rewritten as shell builtins for
performance reasons. For example, running `if` in the shell would originally
`fork()` and run the executable `/bin/if`, but that quickly became a call to a
builtin function that was part of the shell itself.

I believe these are the reasons why $SHELLs based on that lineage, be it the
Bourne shell, Bash or Zsh, all share a scripting syntax which very much feels
like it is extended from REPL usage.

## Opposing Requirements

![Opposing Requirements](https://nojs.murex.rocks/images/blog/split_personalities/conflict.png?v=undef)

The problem with shell usage is that it falls into two contradictory categories
equally:

1. You need an interactive terminal that is optimized for the operator's
   productivity. Since it is a REPL environment, any instructions you do pass
   are going to be write-many read-once. In other words, the syntax needs to be
   quick to type because it's going to be typed often. However it doesn't have
   to be particularly readable because you're not going to save and read back
   whatever instructions you've keyed into the REPL.

2. You need the ability to write short scripts. The language here needs to be
   familiar because it is aimed at non-developers (otherwise they might just as
   well use C, FORTRAN, ALGOL or others) and succinct (again, otherwise a
   developer might as well use a compiled language). However it also should be
   readable because scripts are saved, recalled, reused and often extended over
   time. So they fall into the write-once read-many category.

In an interactive program manager it makes sense to forgo quotation marks
around strings, commas to separate parameters and semicolons to terminate the
line. Even the C shell, `csh` then later `tcsh`, doesn't follow C's syntax that
strictly -- instead understanding that brevity is required for interactive use.

When I first started writing my own shell, Murex, I originally started out
with syntax that was inspired by C. A pipeline would look something like
the following:

```
cat ("./example.csv") | grep ("-n", "foobar")
```

While this came with some readability improvements, it was a _massive_ pain to
write over and over. So I added some syntax completion to the terminal,
inspired by IDEs and how they attempt to minimize the repetition of entering
syntax tokens. However this didn't remove the pain entirely; it just masked it
a little. So I removed the redundant parentheses. But the enforced quotation
marks were still annoying, so I decided to make the quotation marks optional.
Then the commas were removed... and before I knew it, I'd basically just
reinvented the same syntax for writing commands as everyone had already been
using for a multitude of decades prior. What started out as the example above
ended up looking more like the example below:

```
cat ./example.csv | grep -n foobar
```

(please excuse the useless use of `cat` in these examples -- it's purely there
for illustrative reasons)

## The Traditional

![The Traditional](https://nojs.murex.rocks/images/blog/split_personalities/old.jpg?v=undef)

As I've already hinted in the section before, Bourne, Bash and Zsh all fall
nicely into the first camp: the write-many read-once camp. And that makes sense
to me when I consider the evolution of those shells. Their heritage does stem
from interactive terminals first and scripting second.

The problem with traditional shells is that their grammar is lousy for anyone
who needs a write-once read-many language. Worse still, while a significant
amount of their grammar has now been included as builtins, for practical use
operators often find themselves inlining other languages anyway, such as awk,
sed, Perl and others. So it is understandable that a great many choose to do
away with traditional shells for scripting and instead use other, more
powerful and readable languages like Python.

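As a rough illustration of that inlining, the sketch below totals the second column of a small CSV. The function name `sum_second_column` and the sample data are made up for this example. The point is only that the shell plumbs the pipeline together while the actual logic lives in an embedded awk program:

```shell
# Hypothetical helper (not taken from any real project): total the second
# column of CSV data read from stdin, skipping the header row.
# The shell contributes only the function wrapper and the pipe; the parsing
# and arithmetic are written in awk, a separate language inlined as a string.
sum_second_column() {
    awk -F',' 'NR > 1 { total += $2 } END { print total + 0 }'
}

# Illustrative data: a header row followed by two records.
printf 'name,count\nfoo,3\nbar,4\n' | sum_second_column   # prints 7
```
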
Unfortunately the same problems transfer the other way too, in that I have
already demonstrated why Python (and other programming languages) don't always
make good shells. While I will concede that there is a loyal fanbase who will
swear by their Python REPL for terminal usage, and if they're happy with that
then I salute them, their usage is as niche as those who enjoy using Bash for
complex scripts. Perhaps the only language I've used which translates well both
for terse REPLs and lengthier scripts is LISP.

## The Modern

![The Modern](https://nojs.murex.rocks/images/blog/split_personalities/new.jpg?v=undef)

So how are modern shells addressing these split concerns?

### PowerShell

Microsoft had the benefit of being able to start from a clean slate. They didn't
need to inherit 50+ years of UNIX legacy when they wrote PowerShell. So their
approach was naturally to base their shell on .NET. Passing .NET objects around
has a number of advantages over the POSIX convention of passing files, or byte
streams, to applications. This allows developers to write richer command line
applications in their preferred .NET language rather than being tied to the
shell's syntax. However one could argue the same is true with POSIX shells and
how you can write a program in any language you like. But in PowerShell those
other .NET programs feel more tightly integrated into PowerShell than a forked
process does in Bash. Again, I put this down to PowerShell passing .NET objects
along the pipeline.

Where PowerShell falls down for me is in two key areas:

1. Many of the flags passed are verbose. Calling .NET objects can be verbose.
   Take this example of base64 encoding a string:
   ```
   [Convert]::ToBase64String([System.Text.Encoding]::Unicode.GetBytes("TextToEncode"))
   ```

2. PowerShell doesn't play nicely with POSIX. Okay, I'm arguably contradicting
   myself now because earlier I raised this as a benefit. And in many ways it
   is. However if you wish to run PowerShell on Linux, which you can do, you
   may find that you'll want to work with CLI tools that do "think" in terms of
   byte streams. Many of these tools have equivalent aliases written in .NET so
   you can appear to use them without escaping the rich programming environment
   provided by PowerShell. However you may run into a great many scenarios, as
   I often did, where your expectations don't match the practicalities of
   PowerShell.

(I will talk more about the second point in another article where I'll discuss
pipelines, data types and the need for modern shells to understand rich data
rather than treating everything as a flat stream of bytes)

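For comparison with the base64 example in point 1, here is a rough sketch of a similar task in a POSIX-style shell. It assumes the `base64` utility is installed (it ships with GNU coreutils and most BSD-derived systems but is not mandated by POSIX), and `b64_utf8` is a helper name invented for this example. The output is not identical to the PowerShell one-liner, which converts the string to UTF-16LE before encoding it, whereas this sketch encodes the bytes exactly as typed:

```shell
# Hypothetical helper: Base64-encode the first argument as raw bytes.
# printf '%s' is used instead of echo so no trailing newline is encoded.
b64_utf8() {
    printf '%s' "$1" | base64
}

b64_utf8 'TextToEncode'   # prints VGV4dFRvRW5jb2Rl
```
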
There is no question that PowerShell is a more powerful REPL than Bash, but it
definitely slides more towards the "write-once read-many" end of the spectrum.

### Oil

[Oil](https://www.oilshell.org/) describes itself as the following:

> Oil is a new Unix shell. It's our upgrade path from bash to a better language
> and runtime. It's also for Python and JavaScript users who avoid shell!

The way Oil achieves this is a lot like how PWB improved upon the Thompson shell
in the 1970s. Oil aims to be upwardly compatible with Bash. Any command line or
shell script you can run in Bash should, eventually, be supported in Oil as
well. Oil can then extend on that and support a syntax and grammar that is more
readable and sane for writing longer-lived scripts, thus bridging the conflict
between "write-many" and "read-many" languages.

This makes Oil one of the most interesting alternative shells I have come
across.

### Murex

![Murex](https://nojs.murex.rocks/images/blog/split_personalities/murex.png?v=undef)

The approach Murex takes sits somewhere in between the previous two shells.
It attempts to retain familiarity with POSIX syntax but isn't afraid to break
compatibility where it makes sense. The emphasis is on creating grammar that
is both succinct and readable. This mission was driven by originally
attempting to create something more familiar to JavaScript developers, then
falling back to some old Bashisms when I realized that, for all of its warts,
Bash and its kin aren't actually bad for quick REPL usage. One place where
Murex does diverge is in preferring C-style braces over ALGOL-style named
scopes:

**POSIX:**

```
if [ 0 -eq 1 ]; then
    echo '0 == 1'
else
    echo '0 != 1'
fi
```

**Murex:**

```
if { 0 == 1 } then {
    echo '0 == 1'
} else {
    echo '0 != 1'
}
```

But since the curly braces are tokens, keywords like `then` and `else` become
superfluous words that only exist for readability. So then we can make them
optional. And you end up with a syntax that allows for a certain amount of
golfing in the REPL should the operator want to save a few keystrokes:

```
if { 0 == 1 } { echo '0 == 1' } { echo '0 != 1' }
```

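For comparison, a POSIX shell can also squeeze the same conditional onto a single line, though the `then`, `else` and `fi` keywords and their separating semicolons remain mandatory. The sketch below wraps it in a throwaway function, `compare_numbers` (a name invented here), purely so it can be tried with different values:

```shell
# Hypothetical helper: a one-line POSIX if/else comparing two integers.
compare_numbers() {
    if [ "$1" -eq "$2" ]; then echo "$1 == $2"; else echo "$1 != $2"; fi
}

compare_numbers 0 1   # prints 0 != 1
```
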
## Conclusion

The write-many read-once tendencies of the interactive terminal and the
write-once read-many demands of scripting might be difficult to reconcile,
but I do think it is achievable, and I'm not convinced the current heavyweights
do a good job of addressing those conflicting concerns. Alternative shells like
[Oil](https://www.oilshell.org/), [Elvish](https://elv.sh/) and
[Murex](https://github.com/lmorg/murex), on the other hand, seem to be putting
a lot more thought into this problem, and it is really exciting seeing the
different ideas that are being produced.

<hr>

Published: 02.10.2021 at 22:42

## See Also

* [Interactive Shell](../user-guide/interactive-shell.md):
  What's different about Murex's interactive shell?
* [Reading Lists From The Command Line](../blog/reading_lists.md):
  How hard can it be to read a list of data from the command line? If your list is line delimited then it should be easy. However what if your list is a JSON array? This post will explore how to work with lists in different command line environments.
* [Rosetta Stone](../user-guide/rosetta-stone.md):
  A tabulated list of Bashisms and their equivalent Murex syntax
* [`if`](../commands/if.md):
  Conditional statement to execute different blocks of code depending on the result of the condition

<hr/>

This document was generated from [gen/blog/split_personalities_doc.yaml](https://github.com/lmorg/murex/blob/master/gen/blog/split_personalities_doc.yaml).