Go, from C to Go

GopherCon
25 Apr 2014

Russ Cox
Google

http://golang.org/


* Video

A video of this talk was recorded at GopherCon in Denver.

.link https://www.youtube.com/watch?v=QIE5nV5fDwA Watch the talk on YouTube


* Go Compiler

* Go Compiler

80,000+ lines of C.

* Problem

Programming in Go is fun.

Programming in C is not.

* Problem

Writing a Go compiler requires Go expertise.

Writing any program in C requires C expertise.

Writing a Go compiler in C requires Go and C expertise.

* Solution

Write the Go compiler in Go.

* Past

Why not write the Go compiler in Go on day one?

- Go did not exist.

- Go was unstable.

- Go is not targeting compiler writers.

* Present

Why do it today?

- Go does exist.

- Go is stable.

- Go is a great general purpose language.

* How?

Crazy idea: mechanical conversion.

“One big gofix.”

* C

* C

- First creative burst in 1972 at Bell Labs

- Ritchie, [[http://cm.bell-labs.com/who/dmr/chist.html][The Development of the C Language]], HOPL 1993

- “C is quirky, flawed, and an enormous success...”

* C Data Model

- Original target: PDP-11 with 24 kB of memory.

- Programmer is in charge of memory.

- “Off-stack, dynamically-allocated storage is provided only by a library routine and the burden of managing it is placed on the programmer: C is hostile to automatic garbage collection.”

- Types are there to help but not enforced.

* C Control Flow

- `do...while`, `for`, `switch`, `while`

- the much maligned `goto`

* C Program Model

- Per-file compilation.

- Headers vs code.

- `#define`, `#include`

* Conversion

* Challenges for Converting C to Go

- minor: unions, #define, comments

- goto

- type mapping

* Goal

Automated conversion of our C code to Go.

Target: _our_ C code, not _all_ C code.

- Want generated code to be maintainable.
- Want automatic translation for 99%+ of the code.
- No need to solve general problem for tiny number of cases.
- Special cases in converter are okay.
- Annotations in source code are okay.

* Warmups

* Unions

go/src/cmd/gc/go.h

	struct Val
	{
		short ctype;
		union
		{
			short reg; // OREGISTER
			short bval; // bool value CTBOOL
			Mpint* xval; // int CTINT, rune CTRUNE
			Mpflt* fval; // float CTFLT
			Mpcplx* cval; // float CTCPLX
			Strlit* sval; // string CTSTR
		} u;
	};

* Unions

go/include/link.h

	struct Addr
	{
		short type;
		union
		{
			char sval[8];
			float64 dval;
			Prog* branch; // for 5g, 6g, 8g
		} u;

		...
	};

* Unions

`#define` `struct` `union` `/*` `Great` `space` `saver` `*/`

* Unions

`#define` `union` `struct` `/*` `legal` `in` `C!` `*/`

And anyway, there are only two.

* #define

Can't just expand during parsing.

* #define

Not many.

	/*
	 * defined macros
	 * you need super-gopher-guru privilege
	 * to add this list.
	 */
	#define nelem(x) (sizeof(x)/sizeof((x)[0]))
	#define nil ((void*)0)
	...

Extend parser to recognize special cases.

* #define

Annotate some.

	#define BOM 0xFEFF
	/*c2go enum { BOM = 0xFEFF }; */

Rewrite others.

	enum {
		BOM = 0xFEFF,
	};

* Comments

Can't just discard during parsing.


	/*
	 * If the new process paused because it was
	 * swapped out, set the stack level to the last call
	 * to savu(u_ssav). This means that the return
	 * which is executed immediately after the call to aretu
	 * actually returns from the last routine which did
	 * the savu.
	 *
	 * You are not expected to understand this.
	 */
	if(rp->p_flag&SSWAP) {
		rp->p_flag =& ~SSWAP;
		aretu(u.u_ssav);
	}

* Comments

Record precise source locations.

	case OMAPLIT:
		n->esc = EscNone; // until proven otherwise
		e->noesc = list(e->noesc, n);
		n->escloopdepth = e->loopdepth;

		// Keys and values make it to memory, lose track.
		for(ll=n->list; ll; ll=ll->next) {
			escassign(e, &e->theSink, ll->n->left);
			escassign(e, &e->theSink, ll->n->right);
		}
		break;

Whole-line comments attach to syntax immediately following (or EOF).

Suffix comments attach to syntax immediately before.

Syntax carries comments if it moves.

* Goto

* C Goto

“27. Horrors! goto’s and labels

C has a goto statement and labels, so you can branch about the way you used to. But most of the time goto’s aren’t needed... The code can almost always be more clearly expressed by for/while, if/else, and compound statements.

* C Goto

One use of goto’s with some legitimacy is in a program which contains a long loop, where a while(1) would be too extended. Then you might write

	mainloop:
		...
		goto mainloop;

Another use is to implement a break out of more than one level of for or while. goto’s can only branch to labels within the same function.”

— Kernighan, [[http://cm.bell-labs.com/who/dmr/ctut.pdf][Programming in C – A Tutorial]]

* Go Goto Restrictions

- Cannot jump over a variable declaration in target scope.

	.	if x {
			goto Done
		}

		y := f()
		print(y)

	Done:
		close(c)
		return

* Go Goto Restrictions

- Cannot jump over a variable declaration in target scope.

	.	var y int

		if x {
			goto Done
		}

		y = f()
		print(y)

	Done:
		close(c)
		return

* Go Goto Restrictions

- Cannot jump into a new scope (into a { } block).

	if bad {
	Bad:
		printError()
		return err
	}

	...

	if other bad thing {
		goto Bad
	}

* Go Goto Restrictions

- Cannot jump into a new scope (into a { } block or switch case).

	switch x {
	case 1:
		F()
		goto Common;
	case 2:
		G()
		goto Common
	case 3:
	Common:
		H()
	}

* Goto in Go compiler

	1032 goto statements
	241 labels

* Goto in Go compiler

	35 indented labels

	18 switch case
	6 multilevel break/continue
	5 ‘else’ statement
	4 cleanup/error labels
	1 loop
	1 difficult to explain

* Refactor switch case goto

	switch(r->op) {
	case OINDEXMAP:
		n->op = OAS2MAPR;
		goto common;
	case ORECV:
		n->op = OAS2RECV;
		goto common;
	case ODOTTYPE:
		n->op = OAS2DOTTYPE;
		r->op = ODOTTYPE2;
	common:
		...
	}

* Refactor switch case goto

	switch r.op {
	case OINDEXMAP, ORECV, ODOTTYPE:
		switch r.op {
		case OINDEXMAP:
			n.op = OAS2MAPR
		case ORECV:
			n.op = OAS2RECV
		case ODOTTYPE:
			n.op = OAS2DOTTYPE
			r.op = ODOTTYPE2
		}
		...
	}

* General solution

Baker, [[http://dl.acm.org/citation.cfm?id=321999][An Algorithm for Structuring Flowgraphs]], JACM 1977

But we don't need it.

Handle trivial rewrites in converter.
Rewrite problematic gotos by hand.

* Type Mapping

* Type Mapping

General question: what type to use in the Go translation?

- C allows implicit conversion between int, long, char and so on. Go must use one consistently.

- C uses pointers for what Go calls pointers _and_ slices.

* Type Mapping

Build graph of “assigned” value flow and extract clusters.

	x = y;

	int f(void) {
		return x;
	}
	w = f();

	void g(int z);
	g(x);
	g(y);

Apply to entire compiler (all files). Exclude some functions.

* Type Mapping

	int
	islvalue(Node *n)
	{
		switch(n->op) {
		case OINDEX:
			if(isfixedarray(n->left->type))
				return islvalue(n->left);
			if(n->left->type != T && n->left->type->etype == TSTRING)
				return 0;
			// fall through
		case OIND:
		case ODOTPTR:
		case OCLOSUREVAR:
			return 1;
		case ODOT:
			return islvalue(n->left);
		case ONAME:
			if(n->class == PFUNC)
				return 0;
			return 1;
		}
		return 0;
	}

* Type Mapping

	int
	islvalue(Node *n)
	{
		...
		return islvalue(n->left);
		...
		return 0;
		...
		return 1;
		...
		return islvalue(n->left);
		...
		return 0;
		...
		return 1;
		...
		return 0;
	}

* Type Mapping

	cluster
		types: int
		values:
			return from islvalue
			0
			1
			islvalue(n)
			islvalue(n->left)
			islvalue(n->right)
		contexts:
			bool condition
			/* if(islvalue(n)), if(!islvalue(n)), ... */

Translation: bool.

* Type Mapping

	cluster
		types: int
		values:
			return from checksliceconst
			0
			-1
		contexts:
			checksliceconst(lo, hi) < 0
			checksliceconst(lo, mid) < 0
			checksliceconst(mid, hi) < 0

Translation: bool or error.

* Type Mapping

	cluster
		types: Val*
		values:
			var Val *v
			va_arg(fp->args, Val*)
		contexts:
			v->ctype
			v->u

Translation: pointer.

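
* Type Mapping

The “build graph of assigned value flow and extract clusters” step can be sketched with a union-find structure. This is only an illustration under stated assumptions, not the converter's actual implementation, which the talk does not show. The names `clusters`, `flow`, `find`, `groups`, and node labels such as `g.z` and `return from f` are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// clusters groups named values that are connected by assignment-like
// flows. It is a plain union-find; the names here are hypothetical.
type clusters struct {
	parent map[string]string
}

func newClusters() *clusters {
	return &clusters{parent: map[string]string{}}
}

// find returns the representative of v's cluster, registering v on first use.
func (c *clusters) find(v string) string {
	p, ok := c.parent[v]
	if !ok {
		c.parent[v] = v
		return v
	}
	if p == v {
		return v
	}
	root := c.find(p)
	c.parent[v] = root // path compression
	return root
}

// flow records that a value moves from src to dst (as in dst = src),
// so both must receive the same type in the translation.
func (c *clusters) flow(dst, src string) {
	rd, rs := c.find(dst), c.find(src)
	if rd != rs {
		c.parent[rd] = rs
	}
}

// groups returns every cluster with its members sorted by name.
func (c *clusters) groups() [][]string {
	names := make([]string, 0, len(c.parent))
	for v := range c.parent {
		names = append(names, v)
	}
	sort.Strings(names)

	byRoot := map[string][]string{}
	var roots []string
	for _, v := range names {
		r := c.find(v)
		if _, seen := byRoot[r]; !seen {
			roots = append(roots, r)
		}
		byRoot[r] = append(byRoot[r], v)
	}
	out := make([][]string, 0, len(roots))
	for _, r := range roots {
		out = append(out, byRoot[r])
	}
	return out
}

func main() {
	c := newClusters()
	c.flow("x", "y")             // x = y;
	c.flow("return from f", "x") // return x; inside f
	c.flow("w", "return from f") // w = f();
	c.flow("g.z", "x")           // g(x);
	c.flow("g.z", "y")           // g(y);
	c.flow("a1", "&a->a[0]")     // an unrelated pointer flow

	for _, g := range c.groups() {
		fmt.Println(strings.Join(g, ", "))
	}
}
```

In this sketch `x`, `y`, `w`, the return value of `f`, and the parameter `z` of `g` all end up in one cluster, while `a1` and `&a->a[0]` form a second one. Choosing a Go type per cluster (bool, pointer, slice, and so on) would then depend on the contexts in which the cluster's values are used, which this sketch does not model.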

* Type Mapping

	cluster
		types: long*
		values:
			var long* a1
			&a->a[0]
		contexts:
			*a1
			a1++

Translation: slice.

* Type Mapping

Cluster statistics

- 1,703 clusters in Go compiler
- median cluster size 4 values
- max cluster size 16,592 values

Clustering does not rely on C type information at all.

* Conversion status

- Still prototyping, but looks good.
- Aiming at Go 1.4, but no promises.

By the way, please try the Go 1.3 beta!

* Go from C to Go!

- Practical
- Applicable to other code bases?
- Applicable to other languages?
- Applicable to program understanding tools?