# go-sandbox

[![GoDoc](https://godoc.org/github.com/criyle/go-sandbox?status.svg)](https://godoc.org/github.com/criyle/go-sandbox) [![Go Report Card](https://goreportcard.com/badge/github.com/criyle/go-sandbox)](https://goreportcard.com/report/github.com/criyle/go-sandbox) [![Release](https://img.shields.io/github/v/tag/criyle/go-sandbox)](https://github.com/criyle/go-sandbox/releases/latest)

The original goal was to replicate [uoj-judger/run_program](https://github.com/vfleaking/uoj) in Go using [libseccomp](https://github.com/pkg/seccomp/libseccomp-golang). As the project evolved, it also adopted newer technologies, including Linux namespaces and cgroups.

The ideas of a rootfs and interval CPU usage checking come from [syzoj/judge-v3](https://github.com/syzoj/judge-v3), and the pooled pre-forked container comes from [vijos/jd4](https://github.com/vijos/jd4).

If you are looking for a sandbox implementation exposed through a REST / gRPC API, please check [go-judge](https://github.com/criyle/go-judge).

Notice: go-sandbox only works on Linux, since it relies on the Linux implementations of ptrace, unshare, and cgroup.

## Build & Install

- Install the latest Go compiler from [golang/download](https://golang.org/dl/)
- Install the libseccomp library: (for Ubuntu) `apt install libseccomp-dev`
- Build & install: `go install github.com/criyle/go-sandbox/...`

## Technologies

### libseccomp + ptrace (improved UOJ sandbox)

1. Restricted computing resources by POSIX rlimit: Time & Memory (Stack) & Output
2. Restricted syscall access (by libseccomp & ptrace)
3. Restricted file access (read & write & access & exec), evaluated by UOJ FileSet

Improvements:

1. More precise resource limits (s -> ms, MB -> KB)
2. More architectures (arm32, arm64)
3. Allow multiple traced programs in different threads
4. Allow pipes as input / output files

Default file access syscall checks:

- check file read / write: `open`, `openat`
- check file read: `readlink`, `readlinkat`
- check file write: `unlink`, `unlinkat`, `chmod`, `rename`
- check file access: `stat`, `lstat`, `access`, `faccessat`
- check file exec: `execve`, `execveat`

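The syscall table above can be read as a lookup from syscall name to the kind of check it triggers. The following is a minimal sketch of that idea, not the package's actual implementation: `checkKind`, `syscallChecks`, `policy`, and `allowed` are hypothetical names, paths are matched exactly (a real file set may match differently), and for `open` / `openat` the sketch assumes the caller has already decoded whether the open flags request writing.

```go
package main

import "fmt"

// checkKind is a hypothetical label for the kind of file check a syscall triggers.
type checkKind int

const (
	checkNone      checkKind = iota // syscall is not in the table
	checkReadWrite                  // open, openat: read or write, depending on flags
	checkRead
	checkWrite
	checkAccess
	checkExec
)

// syscallChecks mirrors the default table listed above.
var syscallChecks = map[string]checkKind{
	"open": checkReadWrite, "openat": checkReadWrite,
	"readlink": checkRead, "readlinkat": checkRead,
	"unlink": checkWrite, "unlinkat": checkWrite, "chmod": checkWrite, "rename": checkWrite,
	"stat": checkAccess, "lstat": checkAccess, "access": checkAccess, "faccessat": checkAccess,
	"execve": checkExec, "execveat": checkExec,
}

// policy is a toy allow-list with exact path matching (an assumption of this sketch).
type policy struct {
	readable, writable, statable, executable map[string]bool
}

// allowed reports whether the named syscall may touch path.
// wantWrite only matters for open / openat.
func (p policy) allowed(name, path string, wantWrite bool) bool {
	switch syscallChecks[name] {
	case checkReadWrite:
		if wantWrite {
			return p.writable[path]
		}
		return p.readable[path]
	case checkRead:
		return p.readable[path]
	case checkWrite:
		return p.writable[path]
	case checkAccess:
		return p.statable[path]
	case checkExec:
		return p.executable[path]
	}
	// Syscalls outside the table are not file-checked by this sketch.
	return true
}

func main() {
	p := policy{
		readable:   map[string]bool{"/etc/hosts": true},
		statable:   map[string]bool{"/etc/hosts": true},
		executable: map[string]bool{"/usr/bin/true": true},
	}
	fmt.Println(p.allowed("open", "/etc/hosts", false))      // read-only open of a readable file
	fmt.Println(p.allowed("open", "/etc/hosts", true))       // write open, not in writable set
	fmt.Println(p.allowed("unlink", "/etc/hosts", false))    // write check
	fmt.Println(p.allowed("execve", "/usr/bin/true", false)) // exec check
}
```
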
### Linux namespace + cgroup

1. Unshare & bind mount rootfs based on hostfs (eliminates the need for ptrace)
2. Use Linux Control Groups to limit & account for CPU & memory usage (eliminates the need for wait4.rusage)
3. Container tech with execveat memfd, sethostname, setdomainname

## Design

### Result Status

- Normal (no error)
- Program Error
  - Resource Limit Exceeded
    - Time
    - Memory
    - Output
  - Unauthorized Access
    - Disallowed Syscall
  - Runtime Error
    - Signalled
      - `SIGXCPU` / `SIGKILL` are treated as TimeLimitExceeded (raised by rlimit or by a caller kill)
      - `SIGXFSZ` is treated as OutputLimitExceeded (raised by rlimit)
      - `SIGSYS` is treated as Disallowed Syscall (raised by seccomp)
      - Potential runtime errors include `SIGSEGV` (segmentation fault)
    - Nonzero Exit Status
- Program Runner Error

### Result Structure

``` go
type Result struct {
    Status            // result status
    ExitStatus int    // exit status (signal number if signalled)
    Error      string // potential detailed error message (for program runner error)

    Time   time.Duration // used user CPU time  (underlying type int64 in ns)
    Memory Size          // used user memory    (underlying type uint64 in bytes)
    // metrics for the program runner
    SetUpTime   time.Duration
    RunningTime time.Duration
}
```

### Runner Interface

A `Runner` is a configured runner that executes the program. The `context.Context` argument is used for cancellation (for example, to enforce a time limit) and must not be nil.

``` go
type Runner interface {
    Run(context.Context) <-chan runner.Result
}
```

### Pre-forked Container Protocol

1. Pre-fork a container to run programs inside
2. Use a Unix socket to pass fds into / out of the container

Container / Host Communication Protocol (single thread):

- ping (alive check):
  - reply: pong
- conf (set configuration):
  - reply: pong
- open (open files in given mode inside container):
  - send: []OpenCmd
  - reply: "success", file fds / "error"
- delete (unlink file / rmdir dir inside container):
  - send: path
  - reply: "finished" / "error"
- reset (clean up container for later use (clear workdir / tmp)):
  - send:
  - reply: "success"
- execve: (execute file inside container):
  - send: argv, env, rLimits, fds
  - reply:
    - success: "success", pid
    - failed: "failed"
  - send (success): "init_finished" (as cmd)
    - reply: "finished" / send: "kill" (as cmd)
    - send: "kill" (as cmd) / reply: "finished"
  - reply:

Any socket-related error causes the container to exit (along with all processes inside it).

### Pre-forked Container Environment

The restricted container environment is accessed through the RPC interface defined by the protocol above.

Provides:

- File access
  - Open: create / access files
  - Delete: remove file
- Management
  - Ping: alive check
  - Reset: remove temporary files
  - Destroy: destroy the container environment
- Run program
  - Execve: execute program with given parameters

``` go
type Environment interface {
    Ping() error
    Open([]OpenCmd) ([]*os.File, error)
    Delete(p string) error
    Reset() error
    Execve(context.Context, ExecveParam) <-chan runner.Result
    Destroy() error
}
```

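One way the management methods could be combined is a small pool of pre-forked environments that are reset before reuse. The sketch below is an assumption about usage, not code from this repository: the `Environment` interface here is a trimmed subset, `fakeEnv` is an in-memory stand-in, and `pool`, `acquire`, and `release` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Environment is a trimmed, hypothetical subset of the interface above.
type Environment interface {
	Ping() error
	Reset() error
	Destroy() error
}

// fakeEnv is an in-memory stand-in for a pre-forked container.
type fakeEnv struct {
	resets    int
	destroyed bool
}

func (e *fakeEnv) Ping() error {
	if e.destroyed {
		return errors.New("environment destroyed")
	}
	return nil
}
func (e *fakeEnv) Reset() error   { e.resets++; return nil }
func (e *fakeEnv) Destroy() error { e.destroyed = true; return nil }

// pool keeps idle environments in a buffered channel.
type pool struct{ idle chan Environment }

func newPool(envs ...Environment) *pool {
	p := &pool{idle: make(chan Environment, len(envs))}
	for _, e := range envs {
		p.idle <- e
	}
	return p
}

// acquire blocks until an idle environment is available.
func (p *pool) acquire() Environment { return <-p.idle }

// release resets the environment for later use; if the reset fails,
// the environment is destroyed instead of being reused.
func (p *pool) release(e Environment) {
	if err := e.Reset(); err != nil {
		e.Destroy()
		return
	}
	p.idle <- e
}

func main() {
	env := &fakeEnv{}
	p := newPool(env)

	e := p.acquire()
	fmt.Println("ping error:", e.Ping())
	p.release(e)

	fmt.Println("resets:", env.resets, "idle:", len(p.idle))
}
```
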
## Packages (/pkg)

- seccomp: provides seccomp type definitions
  - libseccomp: provides utility functions that wrap libseccomp
- forkexec: fork-exec that provides mount, unshare, ptrace, seccomp, capset before exec
- memfd: reads a regular file and creates a sealed memfd for its contents
- unixsocket: send / recv oob msg from a unix socket
- cgroup: creates cgroup directories and collects resource usage / limits
- mount: provides utility functions that wrap the mount syscall
- rlimit: provides utility functions for the rlimit syscall
- pipe: provides a wrapper to collect all content written through a pipe

## Packages

- cmd/runprog/config: defines arch & language specific trace conditions for the ptrace runner, taken from UOJ
- container: creates pre-forked container to run programs inside
- runner: interface to run program
  - ptrace: wrapper to call forkexec and ptracer
    - filehandler: an example implementation of UOJ file set
  - unshare: wrapper to call forkexec and unshared namespaces
- ptracer: ptrace tracer that provides a syscall trap filter context

## Executable

- runprog: safely run program by unshare / ptrace / pre-forked containers

## Configurations

- config/config.go: all configurations for running specs (similar to UOJ)

## Kernel Versions

- 5.19: `memory.peak` in cgroup v2
- 4.15: cgroup v2
- 4.14: SECCOMP_RET_KILL_PROCESS
- 4.6: CLONE_NEWCGROUP
- 3.19: execveat()
- 3.17: seccomp, memfd_create
- 3.10: CentOS 7
- 3.8: CLONE_NEWUSER without CAP_SYS_ADMIN, CAP_SETUID, CAP_SETGID
- 3.5: prctl(PR_SET_NO_NEW_PRIVS)
- 2.6.36: prlimit64

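The list above can be turned into a simple feature check against a kernel release string. The sketch below is hypothetical and not part of the repository: `parseRelease`, `features`, and `supported` are made-up names, only the major and minor numbers are compared (so `2.6.36` is treated as `2.6`), and the release string would normally come from `uname -r`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type version struct{ major, minor int }

// atLeast reports whether v is the same as or newer than o.
func (v version) atLeast(o version) bool {
	return v.major > o.major || (v.major == o.major && v.minor >= o.minor)
}

// parseRelease reads major.minor from a release string such as "5.15.0-91-generic".
func parseRelease(release string) (version, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return version{}, fmt.Errorf("unexpected release %q", release)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return version{}, err
	}
	// Drop any suffix after the digits, e.g. "1-rc3" becomes "1".
	minorStr := parts[1]
	if i := strings.IndexFunc(minorStr, func(r rune) bool { return r < '0' || r > '9' }); i >= 0 {
		minorStr = minorStr[:i]
	}
	minor, err := strconv.Atoi(minorStr)
	if err != nil {
		return version{}, err
	}
	return version{major, minor}, nil
}

// features mirrors the list above (the CentOS 7 entry is a distribution note, not a feature).
var features = []struct {
	name string
	min  version
}{
	{"memory.peak in cgroup v2", version{5, 19}},
	{"cgroup v2", version{4, 15}},
	{"SECCOMP_RET_KILL_PROCESS", version{4, 14}},
	{"CLONE_NEWCGROUP", version{4, 6}},
	{"execveat()", version{3, 19}},
	{"seccomp, memfd_create", version{3, 17}},
	{"CLONE_NEWUSER without CAP_SYS_ADMIN, CAP_SETUID, CAP_SETGID", version{3, 8}},
	{"prctl(PR_SET_NO_NEW_PRIVS)", version{3, 5}},
	{"prlimit64", version{2, 6}}, // 2.6.36, compared by major.minor only
}

// supported lists the features available on the given kernel release.
func supported(release string) ([]string, error) {
	v, err := parseRelease(release)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range features {
		if v.atLeast(f.min) {
			names = append(names, f.name)
		}
	}
	return names, nil
}

func main() {
	names, err := supported("3.10.0-1160.el7.x86_64") // a CentOS 7 style release string
	fmt.Println(len(names), err)
	for _, n := range names {
		fmt.Println("-", n)
	}
}
```
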
## Benchmarks

### ForkExec

```bash
$ go test -bench . -benchtime 10s
goos: linux
goarch: amd64
pkg: github.com/criyle/go-sandbox/pkg/forkexec
BenchmarkSimpleFork-4              	   12409	    996096 ns/op
BenchmarkUnsharePid-4              	   10000	   1065168 ns/op
BenchmarkUnshareUser-4             	   10000	   1061770 ns/op
BenchmarkUnshareUts-4              	   10000	   1056558 ns/op
BenchmarkUnshareCgroup-4           	   10000	   1049446 ns/op
BenchmarkUnshareIpc-4              	     709	  16114052 ns/op
BenchmarkUnshareMount-4            	     745	  16207754 ns/op
BenchmarkUnshareNet-4              	    3643	   3492924 ns/op
BenchmarkFastUnshareMountPivot-4   	     612	  20967318 ns/op
BenchmarkUnshareAll-4              	     837	  14047995 ns/op
BenchmarkUnshareMountPivot-4       	     488	  24198331 ns/op
PASS
ok  	github.com/criyle/go-sandbox/pkg/forkexec	147.186s
```

### Container

```bash
$ go test -bench . -benchtime 10s
goos: linux
goarch: amd64
pkg: github.com/criyle/go-sandbox/container
BenchmarkContainer-4   	    5907	   2062070 ns/op
PASS
ok  	github.com/criyle/go-sandbox/container	21.763s
```

### Cgroup

```bash
$ go test -bench . -benchtime 10s
goos: linux
goarch: amd64
pkg: github.com/criyle/go-sandbox/pkg/cgroup
BenchmarkCgroup-4   	   50283	    245094 ns/op
PASS
ok  	github.com/criyle/go-sandbox/pkg/cgroup	14.744s
```

### Socket

Blocking:

```bash
$ go test -bench . -benchtime 10s
goos: linux
goarch: amd64
pkg: github.com/criyle/go-sandbox/pkg/unixsocket
cpu: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
BenchmarkBaseline-8             12170148              1048 ns/op
BenchmarkGoroutine-8             2658846              4910 ns/op
BenchmarkChannel-8               8454133              1431 ns/op
BenchmarkChannelBuffed-8         8767264              1357 ns/op
BenchmarkChannelBuffed4-8        9670935              1230 ns/op
BenchmarkEmptyGoroutine-8       34927512               342.8 ns/op
PASS
ok      github.com/criyle/go-sandbox/pkg/unixsocket     83.669s
```

Non-blocking:

```bash
$ go test -bench . -benchtime 10s
goos: linux
goarch: amd64
pkg: github.com/criyle/go-sandbox/pkg/unixsocket
cpu: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
BenchmarkBaseline-8             11609772              1001 ns/op
BenchmarkGoroutine-8             2470767              4788 ns/op
BenchmarkChannel-8               8488646              1427 ns/op
BenchmarkChannelBuffed-8         8876050              1345 ns/op
BenchmarkChannelBuffed4-8        9813187              1212 ns/op
BenchmarkEmptyGoroutine-8       34852828               342.2 ns/op
PASS
ok      github.com/criyle/go-sandbox/pkg/unixsocket     81.679s
```