nocgo
=====

Tested on go1.11 and go1.12.

[![GoDoc](https://godoc.org/github.com/notti/nocgo?status.svg)](https://godoc.org/github.com/notti/nocgo)

This repository/package contains a *proof of concept* for calling into C code *without* using cgo.

> **WARNING!** This is meant as a proof of concept and is subject to change.
> Furthermore, this is highly experimental code. DO NOT USE IN PRODUCTION.
> This could cause lots of issues, from random crashes (there are tests, but there is definitely stuff that's not tested) to teaching your gopher to talk [C gibberish](https://cdecl.org/).

> **WARNING** nocgo works both with and without cgo. If you want to make sure cgo is not used, don't forget to pass `CGO_ENABLED=0` as an environment variable to `go build`.

Todo
----

- Callbacks into Go
- Structures

When that's done, write up a proposal for inclusion in Go.

Usage
-----

Libraries can be loaded and unloaded similarly to `dlopen` and `dlclose`, but acquiring symbols (i.e., functions, global variables) is a bit different, since a function specification (i.e., arguments, types, return type) is also needed. Furthermore, C types must be translated to Go types and vice versa.

This works by providing the function specification as a pointer to a function variable. A call to `lib.Func` examines the arguments and the return value, if any (at most one return value is allowed!), and sets the function variable to a wrapper that calls into the desired C function.

### Type Mappings

Go types are mapped to C types according to the following table:

Go type                                       | C type
--------------------------------------------- | ------
`int8`, `byte`                                | `char`
`uint8`, `bool`                               | `unsigned char`
`int16`                                       | `short`
`uint16`                                      | `unsigned short`
`int32`                                       | `int`
`uint32`                                      | `unsigned int`
`int64`                                       | `long`
`uint64`                                      | `unsigned long`
`float32`                                     | `float`
`float64`                                     | `double`
`[]`, `uintptr`, `reflect.UnsafePointer`, `*` | `*`

The last line means that slices and pointers are mapped to pointers in C. Pointers to structs are possible.

Passing `struct`, `complex`, and callback functions is not (yet) supported.

> **WARNING** `struct`s that are referenced **must** follow C alignment rules! There is **no** type checking, since this is not possible: libraries do not know the types of their symbols.

Go `int` was deliberately left out to avoid confusion, since it has different sizes on different architectures.

### Example

An example using `pcap_open_live` from libpcap (C definition: `pcap_t *pcap_open_live(const char *device, int snaplen, int promisc, int to_ms, char *errbuf)`) could look like this:

```golang
// Load the library
lib, err := nocgo.Open("libpcap.so")
if err != nil {
	log.Fatalln("Couldn't load libpcap: ", err)
}

// func specification
var pcapOpenLive func(device []byte, snaplen int32, promisc int32, toMS int32, errbuf []byte) uintptr
// Get a handle for the function
if err := lib.Func("pcap_open_live", &pcapOpenLive); err != nil {
	log.Fatalln("Couldn't get pcap_open_live: ", err)
}

// Do the function call
errbuf := make([]byte, 512)
pcapHandle := pcapOpenLive(nocgo.MakeCString("lo"), 1500, 1, 100, errbuf)

// Check return value
if pcapHandle == 0 {
	log.Fatalf("Couldn't open %s: %s\n", "lo", nocgo.MakeGoStringFromSlice(errbuf))
}

// pcapHandle can now be used as argument to the other libpcap functions
```
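
The example passes strings through `[]byte` arguments with the help of `nocgo.MakeCString` and `nocgo.MakeGoStringFromSlice`. Their implementation is not shown here. The sketch below assumes they perform the usual NUL-terminated conversion and uses the hypothetical names `makeCString` and `goStringFromSlice` to make clear that it is not the library code.

```go
package main

import (
	"bytes"
	"fmt"
)

// makeCString copies s into a new slice with a trailing NUL byte, which is
// what a C function expecting `const char *` needs. Hypothetical helper.
func makeCString(s string) []byte {
	b := make([]byte, len(s)+1) // the extra byte is already zero
	copy(b, s)
	return b
}

// goStringFromSlice returns the bytes up to (not including) the first NUL.
// If there is no NUL, the whole slice is used. Hypothetical helper.
func goStringFromSlice(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	dev := makeCString("lo")
	fmt.Println(len(dev), dev[len(dev)-1])

	// Simulate a C function writing an error message into errbuf.
	errbuf := make([]byte, 512)
	copy(errbuf, "no such device")
	fmt.Println(goStringFromSlice(errbuf))
}
```

The fixed-size `errbuf` mirrors the example above: the C side fills it in, and only the part before the first NUL byte is turned into a Go string.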

A full example is contained in [examplelibpcap](examplelibpcap) and another one in [example](example).

Supported Systems
-----------------

* Linux with glibc
* FreeBSD<br>
  *Errata:* FreeBSD requires the exported symbols `_environ` and `_progname`. Exporting them is only possible inside cgo or the standard library, so building on FreeBSD requires `-gcflags=github.com/notti/nocgo/fakecgo=-std`. (This doesn't seem to work for `go test`, so the examples work, but the tests do not.)

With some small modifications, probably all systems providing `dlopen` can be supported. Have a look at [dlopen_OS.go](dlopen_linux.go) and [symbols_OS.go](fakecgo/symbols_linux.go) in fakecgo.

Supported Architectures
-----------------------

* 386
* amd64

Implementing further architectures requires:

* Building trampolines for [fakecgo](fakecgo) (see below)
* Implementing the cdecl callspec in [call_arch.go](call_amd64.go)/[.s](call_amd64.s)

How does this work
------------------

### nocgo

nocgo imports `dlopen`, `dlclose`, `dlerror`, and `dlsym` via `go:cgo_import_dynamic` in [dlopen_OS.go](dlopen_linux.go). `lib.Func` builds a specification of where to put which argument in [call_arch.go](call_amd64.go). Go calls a function variable by dereferencing it, placing the address it points to in a register, and calling the first address stored there. nocgo uses this mechanism by putting a struct there that contains the address of a wrapper, followed by the pointer that `dlsym` provided and a calling specification. The wrapper uses `cgocall` from the runtime to call an assembly function, passing it the spec and a pointer to the arguments. This assembly function is implemented in call_arch.s; it uses the specification to place the arguments into the right places, calls the pointer provided by `dlsym`, and then puts the return value into the right place if needed.

This is basically what `libffi` does. So far, cdecl for 386 (arguments passed on the stack in right-to-left order, return values in AX/CX or ST0) and amd64 (arguments passed in registers DI, SI, DX, CX, R8, R9/X0-X7 and on the stack in right-to-left order, number of floats in AX, stack alignment fixed up) are implemented.
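
To make the amd64 rule concrete, here is a small self-contained sketch that assigns a list of arguments to the registers named above. It is an illustration under simplifying assumptions (only integer/pointer and floating point argument classes), and every type and function name in it is invented for this README. It is not the code in call_amd64.go.

```go
package main

import "fmt"

// argClass distinguishes integer/pointer arguments from floating point ones.
type argClass int

const (
	classInt argClass = iota
	classFloat
)

var (
	intRegs   = []string{"DI", "SI", "DX", "CX", "R8", "R9"}
	floatRegs = []string{"X0", "X1", "X2", "X3", "X4", "X5", "X6", "X7"}
)

// placement is a hypothetical calling specification.
type placement struct {
	Locations []string // per argument: a register name or "stack"
	PushOrder []int    // indices of stack arguments, right to left
	FloatRegs int      // float registers used (assumed to be the value put in AX)
}

// place walks the arguments left to right, hands out registers per class
// and spills everything else to the stack.
func place(args []argClass) placement {
	p := placement{Locations: make([]string, len(args))}
	nInt, nFloat := 0, 0
	var onStack []int
	for i, c := range args {
		switch {
		case c == classInt && nInt < len(intRegs):
			p.Locations[i] = intRegs[nInt]
			nInt++
		case c == classFloat && nFloat < len(floatRegs):
			p.Locations[i] = floatRegs[nFloat]
			nFloat++
		default:
			p.Locations[i] = "stack"
			onStack = append(onStack, i)
		}
	}
	// Stack arguments are pushed right to left.
	for i := len(onStack) - 1; i >= 0; i-- {
		p.PushOrder = append(p.PushOrder, onStack[i])
	}
	p.FloatRegs = nFloat
	return p
}

func main() {
	// pcap_open_live takes five integer/pointer arguments.
	spec := place([]argClass{classInt, classInt, classInt, classInt, classInt})
	fmt.Println(spec.Locations, spec.PushOrder, spec.FloatRegs)
}
```

For `pcap_open_live` all five arguments fit into integer registers, so nothing is pushed to the stack. A seventh integer argument would be the first one to spill.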

So far so simple. `cgocall` could actually be used to call a C function directly, but it can only pass one argument!

But there is a second issue. For simple C functions we could leave it at that (well, we would need to use `asmcgocall`, because `cgocall` checks whether cgo is actually there...). But there is this thing called Thread Local Storage (TLS), which is not too happy about Go not setting it up correctly. This is already needed if you do `printf("%f", 1)` with glibc!

So we need to provide some functionality that cgo normally provides, which is implemented in fakecgo:

### fakecgo

Go sets up its own TLS during startup in runtime/asm_arch.s in `runtime·rt0_go`. We can easily prevent that by setting the global variable `_cgo_init` to something non-zero (easily achieved with `go:linkname` and setting a value). But this would crash Go, since in that case Go actually calls the address stored in this variable (well, OK, we can provide an empty function).

Additionally, this would provide correct TLS only on the main thread. This works until one does a lot more than just call one function, so we also need to fix up some other stuff.

So, next step: set `runtime.iscgo` to true (again, linkname to the rescue). But this will panic, since now the runtime expects the global variables `_cgo_thread_start`, `_cgo_notify_runtime_init_done`, `_cgo_setenv`, and `_cgo_unsetenv` to point to something. OK, so let's just implement those.

* `_cgo_notify_runtime_init_done` is easy. We don't need this one: empty function.
* `_cgo_setenv` is also simple: just one function call to `setenv`.
* `_cgo_unsetenv` is the same.
* `_cgo_init` queries the needed stack size to update g->stack so that runtime stack checks do the right thing (it also provides a setg function; we'll come to that later...).
* `_cgo_thread_start` is a bit more involved... It starts up a new thread with `pthread_create` and does a bit of setup.

So this should be doable, right?

Well, easier said than done: those are implemented in C code in runtime/cgo/*c, presenting us with a chicken-and-egg problem.

So I started out by reimplementing those in Go assembly (remember: we want to be cgo free), which is available in the tag asm. Since this is really cumbersome and needs a lot of code duplication, I experimented a bit to see if we can do better.

Aaaand we can:

[fakecgo/trampoline_arch.s](fakecgo/trampoline_amd64.s) contains the above-mentioned entry points and "converts" the C calling conventions to Go calling conventions (e.g., it moves register-passed arguments to the stack). Then it calls the Go functions in [fakecgo/cgo.go](fakecgo/cgo.go).

OK, but we still need all those pthread and C library functions. Well, we can import the symbols (like with `dlopen`). So all we need is a way to call those:

The trampoline file also contains an `asmlibccall6` function that can call C functions with a maximum of 6 integer arguments and one return value. [fakecgo/libccall.go](fakecgo/libccall.go) maps this onto more convenient Go functions with 1-6 arguments, and [fakecgo/libcdefs.go](fakecgo/libcdefs.go) further maps those onto nice functions that look like the C functions (e.g. `func pthread_create(thread *pthread_t, attr *pthread_attr, start, arg unsafe.Pointer) int32`). Well, this was not exactly my idea: the runtime already does that for solaris and darwin (runtime/os_solaris.go, runtime/syscall_solaris.go, runtime/sys_solaris_amd64.s), although my implementation here is kept a bit simpler since it will only ever be called from Go code pretending to be C.
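
The layering described here (one six-argument primitive, thin fixed-arity helpers, then typed wrappers that read like the C prototypes) can be simulated in plain Go. The sketch below is only a model of that structure: a Go function value takes the place of the imported C symbol and of the assembly call, and all names (`cFunc`, `call6`, `call1`, `call2`, `fakeAbs`, `fakeAdd`) are hypothetical rather than taken from fakecgo.

```go
package main

import "fmt"

// cFunc stands in for the address of an imported C symbol. In fakecgo this
// would be a raw address called from assembly; a Go func keeps the sketch
// runnable. Hypothetical type.
type cFunc func(a1, a2, a3, a4, a5, a6 uintptr) uintptr

// call6 plays the role of the six-argument primitive.
func call6(fn cFunc, a1, a2, a3, a4, a5, a6 uintptr) uintptr {
	return fn(a1, a2, a3, a4, a5, a6)
}

// Fixed-arity helpers: unused argument slots are passed as zero.
func call1(fn cFunc, a1 uintptr) uintptr     { return call6(fn, a1, 0, 0, 0, 0, 0) }
func call2(fn cFunc, a1, a2 uintptr) uintptr { return call6(fn, a1, a2, 0, 0, 0, 0) }

// fakeAbs pretends to be C's `int abs(int)`.
var fakeAbs cFunc = func(a1, _, _, _, _, _ uintptr) uintptr {
	v := int32(a1)
	if v < 0 {
		v = -v
	}
	return uintptr(v)
}

// fakeAdd pretends to be a made-up C function `int add(int, int)`.
var fakeAdd cFunc = func(a1, a2, _, _, _, _ uintptr) uintptr {
	return uintptr(int32(a1) + int32(a2))
}

// Typed wrappers that look like the C functions they front.
func abs(x int32) int32    { return int32(call1(fakeAbs, uintptr(x))) }
func add(a, b int32) int32 { return int32(call2(fakeAdd, uintptr(a), uintptr(b))) }

func main() {
	fmt.Println(abs(-7), add(-2, 5))
}
```

Everything travels as `uintptr` through the primitive, so the typed wrappers are the only place where argument and return types are spelled out.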

So now we can implement all the above-mentioned cgo functions in pure (but sometimes a bit ugly) Go in [fakecgo/cgo.go](fakecgo/cgo.go). Ugly, because those functions are called with lots of functionality missing! Write barriers are **not** allowed, and neither are stack splits.

The upside is that the only arch-dependent parts are the trampolines (in assembly) and the only OS-dependent parts are the symbol imports.

Except for FreeBSD (which needs two exported symbols, as mentioned above), all of this works outside the runtime and no special treatment is needed. Just import fakecgo and all the cgo setup just works (except if you use cgo at the same time; then the linker will complain).

Benchmarks
----------

This will be a bit slower than cgo. Most of this is caused by argument rearranging:

### 386

```
name           old time/op    new time/op    delta
Empty-4          84.5ns ± 0%    86.4ns ± 2%    +2.22%  (p=0.000 n=8+8)
Float2-4         87.9ns ± 1%   222.5ns ± 6%  +153.20%  (p=0.000 n=8+10)
StackSpill3-4     116ns ± 1%     130ns ± 1%   +12.04%  (p=0.000 n=8+8)
```

Float is so slow since that type is at the end of the comparison chain.

### amd64

```
name           old time/op    new time/op    delta
Empty-4          76.8ns ±10%    80.1ns ± 9%   +4.24%  (p=0.041 n=10+10)
Float2-4         78.4ns ± 5%    81.4ns ± 9%   +3.80%  (p=0.033 n=9+10)
StackSpill3-4    96.2ns ± 5%   120.7ns ± 7%  +25.46%  (p=0.000 n=10+9)
```