= Design Notes

== Problems

Translating C to Go is harder than it looks.

Jan says: It's impossible in the general case to turn a C char* into a
Go []byte.  It can often be done for concrete pieces of C code,
depending in part on the author's C coding style. The first problem
this runs into is that Go does not guarantee that the backing array
keeps a stable address, because Go stacks are movable. C expects the
opposite: a pointer never magically modifies itself, so some code will
fail.

INSERT CODE EXAMPLES ILLUSTRATING THE PROBLEM HERE
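
One possible illustration of the address-stability problem, offered as a
sketch and not as the authors' own example: the program below records the
address of a stack-allocated buffer as a plain integer, forces the goroutine
stack to grow, and records the address again. The names grow and demo are
made up for this sketch. A real Go pointer is adjusted by the runtime when
the stack is copied, but an address held as a uintptr (or held by translated
C-style code that treats pointers as numbers) is not. On the standard gc
toolchain the two integers are expected to differ, though nothing in the
language guarantees either outcome.

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow recurses with a sizeable frame so that the goroutine stack has to be
// enlarged, which the gc runtime does by copying it to a new location.
//
//go:noinline
func grow(depth int) byte {
	var pad [1024]byte
	pad[depth%len(pad)] = byte(depth)
	if depth == 0 {
		return pad[0]
	}
	return grow(depth-1) + pad[depth%len(pad)]
}

// demo returns the address of a stack-allocated buffer before and after
// stack growth, plus the byte read through a real Go pointer afterwards.
func demo() (before, after uintptr, viaPointer byte) {
	var buf [8]byte // stands in for a C "char buf[8]"
	buf[0] = 'C'
	p := &buf[0] // a Go pointer: the runtime keeps it valid if the stack moves

	before = uintptr(unsafe.Pointer(p)) // a plain integer: never adjusted
	grow(4096)                          // deep recursion forces stack growth
	after = uintptr(unsafe.Pointer(p))

	return before, after, *p
}

func main() {
	before, after, b := demo()
	fmt.Printf("before=%#x after=%#x moved=%v byte=%q\n",
		before, after, before != after, b)
}
```

The Go pointer p still reaches the buffer after the stack has grown, which is
exactly what C code cannot rely on once an address has been stored somewhere
the Go runtime does not know about.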

== How the parser works

There are no comment nodes in the C AST. Instead every cc.Token has a
Sep field: https://godoc.org/modernc.org/cc/v3#Token

When configured to do so, it captures all white space preceding the
token, including any comments, combined into a single value. So we have
all white space/comment information for every token in the AST. A final
white space/comment, preceding EOF, is available as field
TrailingSeperator in the AST: https://godoc.org/modernc.org/cc/v3#AST.

To get the lexically first white space/comment for any node, use
tokenSeparator():
https://gitlab.com/cznic/ccgo/-/blob/6551e2544a758fdc265c8fac71fb2587fb3e1042/v3/go.go#L1476

comment() does the same, with a default value:
https://gitlab.com/cznic/ccgo/-/blob/6551e2544a758fdc265c8fac71fb2587fb3e1042/v3/go.go#L1467
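
The separator scheme can be pictured with a toy model. The sketch below does
not use the cc/v3 API. The types token and ast and the helpers source,
firstSeparator and separatorOr are hypothetical stand-ins that only mirror
the idea described above: each token carries the white space and comments
that precede it, and the AST keeps whatever precedes EOF. How the real
comment() picks its default is not modelled here; separatorOr simply takes
the default as an argument.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for a lexer token: Sep holds all white
// space and comments that precede Src in the input.
type token struct {
	Sep string
	Src string
}

// ast is a hypothetical stand-in for a parsed file, flattened to its
// tokens, plus the separator that precedes EOF.
type ast struct {
	Tokens   []token
	Trailing string
}

// source reassembles the original text. Nothing is lost, because every
// separator is attached to the token that follows it.
func (a ast) source() string {
	var b strings.Builder
	for _, t := range a.Tokens {
		b.WriteString(t.Sep)
		b.WriteString(t.Src)
	}
	b.WriteString(a.Trailing)
	return b.String()
}

// firstSeparator returns the separator of the lexically first token of a
// node, modelled here as a slice of tokens.
func firstSeparator(node []token) string {
	if len(node) == 0 {
		return ""
	}
	return node[0].Sep
}

// separatorOr is like firstSeparator but substitutes def when the node has
// no separator at all.
func separatorOr(node []token, def string) string {
	if s := firstSeparator(node); s != "" {
		return s
	}
	return def
}

func main() {
	a := ast{
		Tokens: []token{
			{Sep: "/* answer */\n", Src: "int"},
			{Sep: " ", Src: "x"},
			{Sep: "", Src: ";"},
		},
		Trailing: "\n// done\n",
	}
	fmt.Printf("%q\n", a.source())
	fmt.Printf("%q\n", firstSeparator(a.Tokens))
	fmt.Printf("%q\n", separatorOr(a.Tokens[2:], "\n"))
}
```

Attaching separators to the following token means comments survive without
dedicated AST nodes, at the cost of needing one extra field for the text
after the last token.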

== Looking forward

Eric says: In my visualization of how the translator would work, the
output of a ccgo translation of a module at any given time is a file
of pseudo-Go code in which some sections may be enclosed by a Unicode
bracketing character (presently the guillemets U+00AB and U+00BB)
meaning "this is not Go yet", which intentionally makes the Go
compiler barf. This expresses a color on the AST nodes.

So, for example, if I'm translating hello.c with a ruleset that does not
include printf -> fmt.Printf, this:

---------------------------------------------------------
#include <stdio.h>

/* an example comment */

int main(int argc, char *argv[])
{
    printf("Hello, World!\n");
}
---------------------------------------------------------

becomes this without any explicit rules at all:

---------------------------------------------------------
«#include <stdio.h>»

/* an example comment */

func main() {
	«printf(»"Hello, World!\n"«)»
}
---------------------------------------------------------

Then, when the rule printf -> fmt.Printf is added, it becomes

---------------------------------------------------------
import (
	"fmt"
)

/* an example comment */

func main() {
	fmt.Printf("Hello, World!\n")
}
---------------------------------------------------------

because with that rule the AST node corresponding to the printf
call can be translated and colored "Go".  This implies an import
of fmt.  We observe that there are no longer any C-colored spans
and drop the #includes.
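
The coloring idea can be sketched in a few lines of Go. This is only a toy
under stated assumptions, not ccgo's design: the call and rule types and the
translate and render functions are hypothetical, a node is reduced to a single
function call, and its argument list is assumed to be valid Go already. A node
that no rule covers stays C-colored and is rendered inside guillemets.

```go
package main

import "fmt"

// call is a hypothetical AST node for a function call.
type call struct {
	Fn   string // callee name
	Args string // argument list, assumed to be valid Go already
	IsGo bool   // the node's color: false means "still C"
}

// rule is a hypothetical rewrite rule for a callee: the Go name to use and
// the import that name requires.
type rule struct {
	To     string
	Import string
}

// translate recolors the node as Go if a rule exists for its callee and
// returns the import implied by that rule ("" if none applied).
func translate(c call, rules map[string]rule) (call, string) {
	r, ok := rules[c.Fn]
	if c.IsGo || !ok {
		return c, ""
	}
	return call{Fn: r.To, Args: c.Args, IsGo: true}, r.Import
}

// render emits Go text, bracketing the C-colored parts with guillemets so
// that the Go compiler rejects output that is not fully translated.
func render(c call) string {
	if c.IsGo {
		return c.Fn + "(" + c.Args + ")"
	}
	return "«" + c.Fn + "(»" + c.Args + "«)»"
}

func main() {
	hello := call{Fn: "printf", Args: `"Hello, World!\n"`}

	// No rules: the call stays C-colored.
	fmt.Println(render(hello))

	// With the printf rule the node turns Go-colored and implies an import.
	rules := map[string]rule{"printf": {To: "fmt.Printf", Import: "fmt"}}
	out, imp := translate(hello, rules)
	fmt.Println(render(out), "needs import:", imp)
}
```

Keeping the color on the node, not in the output text, is what would let a
later pass decide that no C-colored spans remain and that the #includes can
be dropped.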

// end