github.com/varialus/godfly@v0.0.0-20130904042352-1934f9f095ab/doc/articles/gos_declaration_syntax.html

<!--{
"Title": "Go's Declaration Syntax"
}-->

<p>
Newcomers to Go wonder why the declaration syntax is different from the
tradition established in the C family. In this post we'll compare the
two approaches and explain why Go's declarations look as they do.
</p>

<p>
<b>C syntax</b>
</p>

<p>
First, let's talk about C syntax. C took an unusual and clever approach
to declaration syntax. Instead of describing the types with special
syntax, one writes an expression involving the item being declared, and
states what type that expression will have. Thus
</p>

<pre>
int x;
</pre>

<p>
declares x to be an int: the expression 'x' will have type int. In
general, to figure out how to write the type of a new variable, write an
expression involving that variable that evaluates to a basic type, then
put the basic type on the left and the expression on the right.
</p>

<p>
Thus, the declarations
</p>

<pre>
int *p;
int a[3];
</pre>

<p>
state that p is a pointer to int because '*p' has type int, and that a
is an array of ints because a[3] (ignoring the particular index value,
which is punned to be the size of the array) has type int.
</p>

<p>
What about functions? Originally, C's function declarations wrote the
types of the arguments outside the parens, like this:
</p>

<pre>
int main(argc, argv)
    int argc;
    char *argv[];
{ /* ... */ }
</pre>

<p>
Again, we see that main is a function because the expression main(argc,
argv) returns an int. In modern notation we'd write
</p>

<pre>
int main(int argc, char *argv[]) { /* ... */ }
</pre>

<p>
but the basic structure is the same.
</p>

<p>
This is a clever syntactic idea that works well for simple types but can
get confusing fast. The famous example is declaring a function pointer.
Follow the rules and you get this:
</p>

<pre>
int (*fp)(int a, int b);
</pre>

<p>
Here, fp is a pointer to a function because if you write the expression
(*fp)(a, b) you'll call a function that returns int. What if one of fp's
arguments is itself a function?
</p>

<pre>
int (*fp)(int (*ff)(int x, int y), int b)
</pre>

<p>
That's starting to get hard to read.
</p>

<p>
Of course, we can leave out the names of the parameters when we declare a
function, so main can be declared
</p>

<pre>
int main(int, char *[])
</pre>

<p>
Recall that argv is declared like this,
</p>

<pre>
char *argv[]
</pre>

<p>
so you drop the name from the <em>middle</em> of its declaration to construct
its type. It's not obvious, though, that you declare something of type
char *[] by putting its name in the middle.
</p>

<p>
And look what happens to fp's declaration if you don't name the
parameters:
</p>

<pre>
int (*fp)(int (*)(int, int), int)
</pre>

<p>
Not only is it not obvious where to put the name inside
</p>

<pre>
int (*)(int, int)
</pre>

<p>
it's not exactly clear that it's a function pointer declaration at all.
And what if the return type is a function pointer?
</p>

<pre>
int (*(*fp)(int (*)(int, int), int))(int, int)
</pre>

<p>
It's hard even to see that this declaration is about fp.
</p>

<p>
You can construct more elaborate examples but these should illustrate
some of the difficulties that C's declaration syntax can introduce.
</p>

<p>
There's one more point that needs to be made, though. Because type and
declaration syntax are the same, it can be difficult to parse
expressions with types in the middle. This is why, for instance, C casts
always parenthesize the type, as in
</p>

<pre>
(int)M_PI
</pre>

<p>
<b>Go syntax</b>
</p>

<p>
Languages outside the C family usually use a distinct type syntax in
declarations. Although it's a separate point, the name usually comes
first, often followed by a colon. Thus our examples above become
something like (in a fictional but illustrative language)
</p>

<pre>
x: int
p: pointer to int
a: array[3] of int
</pre>

<p>
These declarations are clear, if verbose: you just read them left to
right. Go takes its cue from here, but in the interests of brevity it
drops the colon and removes some of the keywords:
</p>

<pre>
x int
p *int
a [3]int
</pre>

<p>
There is no direct correspondence between the look of [3]int and how to
use a in an expression. (We'll come back to pointers in the next
section.) You gain clarity at the cost of a separate syntax.
</p>

<p>
Now consider functions. Let's transcribe the declaration for main, even
though the main function in Go takes no arguments:
</p>

<pre>
func main(argc int, argv []string) int
</pre>

<p>
Superficially that's not much different from C, but it reads well from
left to right:
</p>

<p>
<em>function main takes an int and a slice of strings and returns an int.</em>
</p>

<p>
Drop the parameter names and it's just as clear: they're always first
so there's no confusion.
</p>

<pre>
func main(int, []string) int
</pre>

<p>
One merit of this left-to-right style is how well it works as the types
become more complex. Here's a declaration of a function variable
(analogous to a function pointer in C):
</p>

<pre>
f func(func(int, int) int, int) int
</pre>

<p>
Or if f returns a function:
</p>

<pre>
f func(func(int, int) int, int) func(int, int) int
</pre>

<p>
It still reads clearly, from left to right, and it's always obvious
which name is being declared: the name comes first.
</p>

<p>
The distinction between type and expression syntax makes it easy to
write and invoke closures in Go:
</p>

<pre>
sum := func(a, b int) int { return a+b }(3, 4)
</pre>

<p>
<b>Pointers</b>
</p>

<p>
Pointers are the exception that proves the rule. Notice that in arrays
and slices, for instance, Go's type syntax puts the brackets on the left
of the type but the expression syntax puts them on the right of the
expression:
</p>

<pre>
var a []int
x = a[1]
</pre>

<p>
For familiarity, Go's pointers use the * notation from C, but we could
not bring ourselves to make a similar reversal for pointer types. Thus
pointers work like this
</p>

<pre>
var p *int
x = *p
</pre>

<p>
We couldn't say
</p>

<pre>
var p *int
x = p*
</pre>

<p>
because that postfix * would be confused with multiplication. We could have
used the Pascal ^, for example:
</p>

<pre>
var p ^int
x = p^
</pre>

<p>
and perhaps we should have (and chosen another operator for xor),
because the prefix asterisk on both types and expressions complicates
things in a number of ways. For instance, although one can write
</p>

<pre>
[]byte("hi")
</pre>

<p>
as a conversion, one must parenthesize the type if it starts with a *:
</p>

<pre>
(*int)(nil)
</pre>

<p>
Had we been willing to give up * as pointer syntax, those parentheses
would be unnecessary.
</p>

<p>
So Go's pointer syntax is tied to the familiar C form, but those ties
mean that we cannot break completely from using parentheses to
disambiguate types and expressions in the grammar.
</p>

<p>
Overall, though, we believe Go's type syntax is easier to understand
than C's, especially when things get complicated.
</p>

<p>
<b>Notes</b>
</p>

<p>
Go's declarations read left to right. It's been pointed out that C's
read in a spiral! See <a href="http://c-faq.com/decl/spiral.anderson.html">
The "Clockwise/Spiral Rule"</a> by David Anderson.
</p>