github.com/graybobo/golang.org-package-offline-cache@v0.0.0-20200626051047-6608995c132f/x/talks/2014/static-analysis.slide (about) 1 Static analysis tools 2 for Go code comprehension and refactoring 3 4 golang nyc meetup 5 13 Nov 2014 6 7 Alan Donovan 8 Google, New York 9 adonovan@google.com 10 11 * Video 12 13 This talk was presented at the GothamGo Kickoff Meetup in New York City, November 2014. 14 15 .link http://vimeo.com/114736889 Watch the talk on Vimeo 16 17 * Introduction 18 19 Programmers say "writing code", but most of that time is spent _reading_ 20 21 Actually: reading code, and making logical deductions 22 23 Machines are good at reading and logic 24 25 *Machines* *should* *be* *helping* *us* *read* *code.* 26 27 And write it! Refactoring can be tedious and error-prone. 28 29 30 * Outline 31 32 - Code comprehension tools: `oracle`, `godoc` 33 - Analysis technology 34 - Refactoring tools: `gorename`, `eg` 35 36 37 * Demo: the Go oracle 38 39 Supports interactive, editor-integrated queries: 40 - where is this thing defined? 41 - what are the methods of this type? 42 - what are the free variables of this selection? 43 - what might this expression point to? 44 - what dynamic types might this interface or `reflect.Value` contain? 45 46 47 * Demo: godoc analysis features 48 49 .link http://golang.org/lib/godoc/analysis/help.html godoc -analysis=type,pointer 50 51 Package view 52 - method set and _implements_ relation for every type 53 - internal call graph of every package 54 55 Source code view 56 - kind, type, definition of every identifier 57 - method set and _implements_ relation for every type 58 - peers of every channel send/receive 59 - callers of every function 60 - callees of every call site (even dynamic) 61 62 63 * How it works 64 65 66 67 * Libraries and tools 68 69 .image static-analysis/tools.svg 70 71 Mostly in `golang.org/x/tools` repo 72 73 74 * go/types: the Go type checker 75 76 Computes mappings from: 77 - identifiers to definitions 78 - constant expressions to values 79 - expressions to types 80 81 Many subtle corners: 82 - method set computation 83 - recursive interfaces 84 - shifts 85 86 Making it correct, fast, and clean was a substantial project 87 88 .link http://godoc.org/golang.org/x/tools/go/types golang.org/x/tools/go/types 89 90 Author: Robert Griesemer 91 92 93 94 * go/ssa: intermediate representation (IR) 95 96 Typed abstract syntax trees (ASTs) are too varied and subtle to analyze directly 97 98 Programs are lowered into Static Single-Assignment form (SSA): 99 - simplifies dataflow analyses since _reaching_ _definitions_ are implicit 100 - invented 1991, now mainstream (gcc, llvm) 101 102 All Go programs can be expressed using only ~30 basic instructions 103 104 Simple, explicit, high-level, high source fidelity 105 106 .link http://godoc.org/golang.org/x/tools/go/ssa golang.org/x/tools/go/ssa 107 108 The llgo project is using go/ssa as a front-end for LLVM 109 110 111 112 * Demo: ssadump 113 114 # Simple fibonacci function 115 # % ssadump -build=F fib.go basic 116 # % ssadump -build=FD fib.go debug info 117 # % ssadump -test -run unicode toy interpreter 118 119 120 121 * go/pointer: pointer analysis 122 123 Pointers complicate reasoning about program behaviour 124 - functions may be called dynamically 125 - a variable may be updated and accessed via different names ("aliases") 126 - runtime types are values too: `interface`, `reflect.Type` 127 128 We use *pointer* *analysis* to answer the question: 129 which variables might this pointer point to? 130 131 .link http://godoc.org/golang.org/x/tools/go/pointer golang.org/x/tools/go/pointer 132 133 # comment on go's appropriateness for this analysis: 134 # (closed program---no dlopen, classloading, no generics, typesafe) 135 136 137 * Pointer analysis 138 139 Let `pts(p)` be the _points-to_ _set_ of pointer p 140 Its elements are abstract variables: globals, locals, unnamed (`new(T)`) 141 142 Overview: 143 144 - analyze each SSA statement in the whole program 145 146 - generate appropriate constraints on the points-to set of each variable 147 148 Statement Constraint 149 y = &x pts(y) ⊇ {x} 150 y = x pts(y) ⊇ pts(x) "inclusion-based" 151 *y = x ∀o ∈ pts(y). pts(o) ⊇ pts(x) 152 y = *x ∀o ∈ pts(x). pts(y) ⊇ pts(o) 153 y = &x->f \ 154 y = x.(T) } not shown 155 reflection / 156 157 - solve the constraint system 158 159 160 * Pointer analysis: constraint generation 161 162 All Go operations can be expressed in these constraints 163 Function, map, slice and channel ops all simplify to struct ops 164 165 Treatment of `unsafe.Pointer` conversion is unsound 166 But we don't care! This isn't a compiler 167 168 The constraint system is: 169 - *field-sensitive*: struct fields x.f and x.g have distinct solutions 170 - *flow-insensitive*: the order of statements is ignored 171 - *context-insensitive*: each function is analyzed only once 172 - *whole-program*: Go source is required for all dependencies 173 - *inclusion-based*: i.e. Andersen's analysis 174 175 Optimization: don't make constraints for "uninteresting" types 176 such as types not related to a one-shot Oracle query 177 178 * Pointer analysis: pre-solver optimizations 179 180 Constraint system for 124KLoC program (oracle) has: 181 182 254,000 variables 183 184,000 equations 184 185 Solving phase is in O(|v|³), so pre-solver optimisation is crucial 186 187 We can bring this down to: 188 189 88,000 variables 190 65,000 equations 191 192 * Pointer Equivalence by Hash-Value Numbering (Hardekopf & Lin '07) 193 194 p = &x a = &x if ... { 195 q = p b = a p, q = &x, &y 196 r = q c = b } else { 197 s = r if ... { a = c } p, q = &y, &x 198 } 199 200 .image static-analysis/hvn.svg 201 202 Hash-Value Numbering 203 204 * Pointer analysis: solving 205 206 It's transitive closure of a graph, essentially 207 208 Lots of optimizations, for example: 209 210 _sparse_bit_vectors_, a very compact representation for points-to sets 211 212 .link http://godoc.org/golang.org/x/tools/container/ints golang.org/x/tools/container/ints 213 214 Solver log is >1GB. Debugging is fun. 215 216 217 #* Sparse bit vector 218 #`golang.org/x/tools/container/ints` 219 # 220 #.image sparsebitvector.svg 221 # 222 #Very compact representation of sparse `set<int>` 223 #Doubly-linked list of (offset int, data [256]bit) blocks. 224 225 226 * Call graph 227 228 The *call* *graph* can be read directly from the solution 229 230 The `golang.org/x/tools/cmd/callgraph` tool prints raw call graphs 231 232 Demo (time permitting) 233 234 235 * Refactoring tools 236 237 * gorename: precise, type-safe identifier renaming 238 239 A refactoring tool, usable from... 240 241 - the command line 242 243 % gorename -from bytes.Buffer.Len -to Size 244 Renamed 105 occurrences in 42 files in 33 packages. 245 246 - many editors ([[https://code.google.com/p/go/source/browse/refactor/rename/rename.el?repo=tools&r=511801758bb9a0b83f9bf931342889fbedbc9b48][Emacs]], [[http://quick.as/dgof2p7][Vim]], [[https://github.com/smartystreets/sublime-gorename][Sublime]], [[https://github.com/davidrjenni/agorn][Acme]]) 247 248 All renamings are reversible 249 250 Sound: either a conflict is reported, or the refactoring preserves behaviour* 251 252 *except reflection 253 254 .link http://godoc.org/golang.org/x/tools/cmd/gorename golang.org/x/tools/cmd/gorename 255 256 * Demo: gorename 257 258 259 * Example-based refactoring with eg 260 261 A Go port of the Java _Refaster_ tool 262 263 A template specifies the pattern and replacement as Go expressions: 264 265 package P 266 267 import ( "errors"; "fmt" ) 268 269 func before(s string) error { return fmt.Errorf("%s", s) } 270 func after(s string) error { return errors.New(s) } 271 272 Parameters are _wildcards_ 273 274 % eg -t template.go <package> ... 275 276 .link http://godoc.org/golang.org/x/tools/cmd/eg golang.org/x/tools/cmd/eg 277 278 * Demo: eg 279 280 * Conclusion 281 282 From the outset, Go has been renowned for its tools: `go`, `gofmt` 283 284 We have many building blocks for sophisticated source analysis tools 285 286 What should we build next?