Go in Go
Gopherfest
26 May 2015

Rob Pike
Google
r@golang.org
http://golang.org/

* Go in Go

As of the 1.5 release of Go, the entire system is now written in Go.
(And a little assembler.)

C is gone.

Side note: `gccgo` is still going strong.
This talk is about the original compiler, `gc`.

* Why was it in C?

Bootstrapping.

(Also Go was not intended primarily as a compiler implementation language.)

* Why move the compiler to Go?

Not for validation; we have more pragmatic motives:

- Go is easier to write (correctly) than C.
- Go is easier to debug than C (even absent a debugger).
- Go is the only language you'd need to know; encourages contributions.
- Go has better modularity, tooling, testing, profiling, ...
- Go makes parallel execution trivial.

Already seeing benefits, and it's early yet.

Design document: [[http://golang.org/s/go13compiler]]

* Why move the runtime to Go?

We had our own C compiler just to compile the runtime.
We needed a compiler with the same ABI as Go, such as segmented stacks.

Switching it to Go means we can get rid of the C compiler.
That's more important than converting the compiler to Go.

(All the reasons for moving the compiler apply to the runtime as well.)

Now only one language in the runtime; easier integration, stack management, etc.

As always, simplicity is the overriding consideration.

* History

Why do we have our own tool chain at all?
Our own ABI?
Our own file formats?

History, familiarity, and ease of moving forward. And speed.

Many of Go's big changes would be much harder with GCC or LLVM.

.link https://news.ycombinator.com/item?id=8817990

* Big changes

All made easier by owning the tools and/or moving to Go:

- linker rearchitecture
- new garbage collector
- stack maps
- contiguous stacks
- write barriers

The last three are all but impossible in C:

- C is not type safe; don't always know what's a pointer
- aliasing of stack slots caused by optimization

(`Gccgo` will have segmented stacks and imprecise (stack) collection for a while yet.)

* Goroutine stacks

- Until 1.2: Stacks were segmented.
- 1.3: Stacks were contiguous unless executing C code (runtime).
- 1.4: Stacks made contiguous by restricting C to system stack.
- 1.5: Stacks made contiguous by eliminating C.

These were each huge steps, made quickly (led by `khr@`).

* Converting the runtime

Mostly done by hand with machine assistance.

Challenge to implement the runtime in a safe language.
Some use of `unsafe` to deal with pointers as raw bits in the GC, for instance.
But less than you might think.

The translator (next sections) helped for some of the translation.

* Converting the compiler

Why translate it, not write it from scratch? Correctness, testing.

Steps:

- Write a custom translator from C to Go.
- Run the translator, iterate until success.
- Measure success by bit-identical output.
- Clean up the code by hand and by machine.
- Turn it from C-in-Go to idiomatic Go (still happening).

* Translator

First output was C line-by-line translated to (bad!) Go.
Tool to do this written by `rsc@` (talked about at GopherCon 2014).
Custom written for this job, not a general C-to-Go translator.

Steps:

- Parse C code using new simple C parser (`yacc`)
- Remove or rewrite C-isms such as `*p++` as an expression
- Walk the C parse tree, print the C code in Go syntax
- Compile the output
- Run, compare generated code
- Repeat

The `Yacc` grammar was translated by sam-powered hands.

* Translator configuration

Aided by hand-written rewrite rules, such as:

- this field is a bool
- this function returns a bool

Also diff-like rewrites for things such as using the standard library:

	diff {
	-	g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow)
	-	idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32)
	-	if g.Rpo == nil || idom == nil {
	-		Fatal("out of memory")
	-	}
	+	g.Rpo = make([]*Flow, g.Num)
	+	idom = make([]int32, g.Num)
	}

* Another example

This one due to semantic difference between the languages.

	diff {
	-	if nreg == 64 {
	-		mask = ^0 // can't rely on C to shift by 64
	-	} else {
	-		mask = (1 << uint(nreg)) - 1
	-	}
	+	mask = (1 << uint(nreg)) - 1
	}

* Grind

Once in Go, new tool `grind` deployed (by `rsc@`):

- parses Go, type checks
- records a list of edits to perform: "insert this text at this position"
- at end, applies edits to source (hard to edit AST).

Changes guided by profiling and other analysis:

- removes dead code
- removes gotos
- removes unused labels, needless indirections, etc.
- moves `var` declarations nearer to first use

.link http://rsc.io/grind

* Performance problems

Output from translator was poor Go, and ran about 10X slower.
Most of that slowdown has been recovered.

Problems with C to Go:

- C patterns can be poor Go; e.g.: complex `for` loops
- C stack variables never escape; Go compiler isn't as sure
- interfaces such as `fmt.Stringer` vs. C's `varargs`
- no `unions` in Go, so use `structs` instead: bloat
- variable declarations in wrong place

C compiler didn't free much memory, but Go has a GC.
Adds CPU and memory overhead.

* Performance fixes

Profile! (Never done before!)

- move `vars` closer to first use
- split `vars` into multiple
- replace code in the compiler with code in the library: e.g. `math/big`
- use interface or other tricks to combine `struct` fields
- better escape analysis (`drchase@`).
- hand tuning code and data layout

Use tools like `grind`, `gofmt` `-r` and `eg` for much of this.

Removing interface argument from a debugging print library got 15% overall!

More remains to be done.

* Technical benefits

Other benefits of the conversion:

Garbage collection means no more worry about introducing a dangling pointer.

Chance to clean up the back ends.

Unified `386` and `amd64` architectures throughout the tool chain.

New architectures are easier to add.

Unified the tools: now one compiler, one assembler, one linker.

* Compiler

`GOOS=YYY` `GOARCH=XXX` `go` `tool` `compile`

One compiler; no more `6g`, `8g` etc.

About 50K lines of portable code.
Even the registerizer is portable now; architectures well characterized.
Non-portable: Peepholing, details like registers bound to instructions.
Typically around 10% of the portable LOC.

* Assembler

`GOOS=YYY` `GOARCH=XXX` `go` `tool` `asm`

New assembler, all in Go, written from scratch by `r@`.
Clean, idiomatic Go code.

Less than 4000 lines, <10% machine-dependent.

Almost completely compatible with previous `yacc` and C assemblers.

How is this possible?

- shared syntax originating in the Plan 9 assemblers
- unified back-end logic (old `liblink`, now `internal/obj`)

* Linker

`GOOS=YYY` `GOARCH=XXX` `go` `tool` `link`

Mostly hand- and machine- translated from C code.

New library, `internal/obj`, part of original linker, captures details about machines, writes object files.

27000 lines summed across 4 architectures, mostly tables (plus some ugliness).

- `arm`: 4000
- `arm64`: 6000
- `ppc64`: 5000
- `x86`: 7500 (`386` and `amd64`)

Example benefit: one print routine to print any instruction for any architecture.

* Bootstrap

With no C compiler, bootstrapping requires a Go compiler.

Therefore need to build or download a working Go installation to build 1.5 from source.

We use Go 1.4+ as the base to build the 1.5+ tool chain. (Newer is OK too.)

Details: [[http://golang.org/s/go15bootstrap]]

* Future

Much work still to do, but 1.5 is mostly set.

Future work:

Better escape analysis.
New compiler back end using SSA (much easier in Go than C).
Will allow much more optimization.

Generate machine descriptions from PDFs (or maybe XML).
Will have a purely machine-generated instruction definition:
"Read in PDF, write out an assembler configuration".
Already deployed for the disassemblers.

* Conclusions

Getting rid of C was a huge advance for the project.
Code is cleaner, testable, profilable, easier to work on.

New unified tool chain reduces code size, increases maintainability.

Flexible tool chain, portability still paramount.
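
* Aside: the shift rewrite, sketched

The `nreg == 64` rewrite under "Another example" is safe because Go defines a shift whose count is at least the operand width (an unsigned left shift then yields 0), whereas C leaves it undefined.
A minimal sketch, assuming `mask` is a `uint64`; the helper name `lowMask` is hypothetical and does not come from the compiler source:

```go
package main

import "fmt"

// lowMask returns a value with the low nreg bits set.
// Hypothetical helper for illustration; assumes a 64-bit unsigned mask
// and 0 <= nreg <= 64.
func lowMask(nreg int) uint64 {
	// For nreg == 64 the shift yields 0 in Go, and 0 - 1 wraps around
	// to all ones, so no special case is needed.
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	fmt.Printf("%#x\n", lowMask(3))  // prints 0x7
	fmt.Printf("%#x\n", lowMask(64)) // prints 0xffffffffffffffff
}
```

The same expression in C would need the explicit `nreg == 64` branch that the rewrite rule deletes.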