# Go Tools -> Go Pointer Analysis

## This branch handles callback functions that appear in app functions but are called after several levels of lib calls

We want to skip the analysis of lib calls because it is too expensive. If we are going to create synthetic ssa for lib functions,
we start here.
1. Write a native.xml file with all the IRs we want.
2. Preload all IRs in native.xml.
3. On reaching ```CreatePackage()``` in ssa/create.go, check whether the package is in the preloaded set; if yes, use the synthetic version.

Key files:
- ssa/create.go
- ssa/builder.go

Standard libraries: https://pkg.go.dev/std

#### *Update*


====================================================================================

Cloned from https://github.com/golang/tools, starting from commit 146a0deefdd11b942db7520f68c117335329271a (around v0.5.0-pre1).

The default go pointer analysis algorithm (v0.5.0-pre1) is at ```go_tools/go/pointer_default```.

If you hit a panic, please submit an issue with the crash stack copied in. Thanks.

## How to Use?
Go to ```go_tools/main```, and run ```go build```. Then, run ```./main``` with the following flags and
the directory of the go project that you want to analyze.
It will go through all of your main files and analyze them one by one.

#### Flags
- *path*: default value = "", the filepath of the project to analyze.
- *doLog*: default value = false, log every step of the analysis (very verbose).
- *doCompare*: default value = false, compare performance and results with the default pta.

For example,

```./main -doLog -doCompare ../grpc-go/benchmark/server```

This will run the origin-sensitive pointer analysis on all main files under directory ```../grpc-go/benchmark/server```,
and also generate a full log and a comparison with the default algorithm in terms of performance and results.
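
To make the flag semantics above concrete, here is a minimal sketch of how such a command line could be parsed with the standard ```flag``` package. It is not taken from ```main/main.go```: the ```options``` struct and the ```parseOptions``` helper are hypothetical, and the fallback from ```-path``` to the first positional argument is an assumption based on the example invocation.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the three flags listed in the README.
// The struct and its field names are hypothetical, not from main.go.
type options struct {
	path      string
	doLog     bool
	doCompare bool
}

// parseOptions parses args (without the program name).
// Assumption: when -path is not given, the first positional argument
// is used as the project directory, as in the README example.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("main", flag.ContinueOnError)
	fs.StringVar(&o.path, "path", "", "filepath of the project to analyze")
	fs.BoolVar(&o.doLog, "doLog", false, "log every step (very verbose)")
	fs.BoolVar(&o.doCompare, "doCompare", false, "compare with the default pta")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.path == "" && fs.NArg() > 0 {
		o.path = fs.Arg(0)
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("path=%q doLog=%v doCompare=%v\n", o.path, o.doLog, o.doCompare)
}
```

Both boolean flags default to false, so running the sketch with only a directory corresponds to a plain analysis run without logging or comparison.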

*Note* that ```-doLog``` is very verbose and significantly slows down the analysis.

## User APIs (for detector)
Go to https://github.com/april1989/origin-go-tools/main/main.go to see how to use the callgraph and queries.

## Origin-sensitive

#### What is Origin?
We treat a goroutine instruction as an origin entry point, and all variables/function calls inside this goroutine share the same context as the goroutine they belong to.

#### Main Changes from Default
Instead of pre-computing all cgnodes and their constraints before actually propagating changes among points-to constraints,
we start from the reachable cgnodes ```init``` and ```main``` and gradually compute reachable cgnodes and their constraints.

## kCFA

#### Main Changes from Default
- Create k-callsite-sensitive contexts for static/invoke calls
- Generate constraints/cgnodes online for invoke calls and their targets when necessary
- Currently, skip the creation of reflection and dynamic calls because there are too many of them


========================================================================
## Doc of Default Algorithm

The most recent doc is https://pkg.go.dev/golang.org/x/tools/go/pointer#pkg-overview, quoted:

"SOUNDNESS

The analysis is fully sound when invoked on pure Go programs that do not use reflection or unsafe.Pointer conversions. In other words, if there is any possible execution of the program in which pointer P may point to object O, the analysis will report that fact."

However, over-soundness is unnecessary.

========================================================================
## Major differences between the results of mine and default

#### 1. Queries and Points-to Set (pts)
All examples below are based on race_checker/tests/cg.go

#### Why my query has two pointers for one ssa.Value:
There are two pointers involved in one constraint, both of which are stored under the same ssa.Value in my queries.

#### Why default query is empty but mine is not:
Default tracks fewer types than mine, so the constraints and invoke calls for the untracked types are missing in the default result.
Hence, it has empty pts while mine has non-empty pts. For example,
```
SSA: &t57[41:int]
My Query: (#obj: 1 )
n597&[0:shared contour; ] : [slicelit[*]]
n2266&[0:shared contour; ] : []
Default Query: (#obj: 0 )
n36563 : []

In default log:
; t181 = &t57[41:int]
localobj[t181] = n37034
type not tracked: *strconv.leftCheat

In my log:
; t181 = &t57[41:int]
localobj[t181] = n2266
addr n597 <- {&n2266}
```

#### Why my query is empty but default is not:
Because the default algorithm pre-computes all constraints for all functions,
it generates a lot of unreachable functions/cgnodes (they have no callers), as well as their constraints.
This also affects the reachable part of the cg and pts, since it may be polluted by them.
For example,
```
SSA: (*internal/reflectlite.rtype).Size //-> (*internal/reflectlite.rtype).Size is not a reachable function
My Query: (#obj: 0 )
n8971&(Global/Local) : []
Default Query: (#obj: 1 )
n6354 : [(*internal/reflectlite.rtype).Size]
```

#### Why my query is non-empty but there is no corresponding pointer in default:
Default does not create queries for those types (untracked types).
For example,
```
654.
SSA: io.ErrClosedPipe
My Query: (#obj: 1 )
n4448&(Global/Local) : [makeinterface:*errors.errorString]
Default Query: nil)

In default log:
; *ErrClosedPipe = t10
copy n19413 <- n39471

In my log:
; *ErrClosedPipe = t10
create n4448 error for global
globalobj[io.ErrClosedPipe] = n4448
copy n4448 <- n4431
```


#### Why default query is empty but there is no corresponding pointer in mine:
The cause is not yet known.
For example,
```
SSA: &r.peekRune [#4]
My Query: nil
Default Query: (#obj: 0 )
n44378 : []
and
SSA: ssa:wrapnilchk(v, "internal/reflectl...":string, "IsNil":string)
My Query: nil)
Default Query: (#obj: 0 )
n44378 : []
and
val[t115] = n44377 (*ssa.FieldAddr)
create n44378 *[16]byte for query
copy n44378 <- n44377
```

#### Why my query has fewer objs in pts than the default:
All objs missing from my pts are objs and constraints introduced by unreachable functions.
This is the pollution we mentioned before.
For example,
```
SSA: *t49
My Query: (#obj: 27 )
n11815&[0:shared contour; ] : [makeinterface:int makeinterface:[]int makeinterface:int makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string]
Default Query: (#obj: 79 )
n18076 : [makeinterface:int makeinterface:[]int makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:int makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*errors.errorString makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string]
```


#### 2. CG

#### *Why are the cgs from default and my pta different?*

The default algorithm creates cgnodes for functions that are not reachable from the main entry.
For example, when analyzing the main entry ```google.golang.org/grpc/benchmark/server```,
the default algorithm pre-generates constraints and cgnodes for the function:
```go
(*google.golang.org/grpc/credentials.tlsCreds).ServerHandshake
```
which is not reachable from the main entry (it has no caller in cg).

This is reflected in the default analysis data:
```
Call Graph: (function based)
#Nodes: 14740
#Edges: 45550
#Unreach Nodes: 7698
#Reach Nodes: 7042
#Unreach Functions: 7698
#Reach Functions: 7042

Done -- PTA/CG Build; Using 5m13.058385739s .
```
Default generates 14740 functions and their constraints; however, at most 7042 of them are reachable from the main entry.

While my analysis data is:
```
Call Graph: (cgnode based: function + context)
#Nodes: 6306
#Edges: 23291
#Unreach Nodes: 39
#Reach Nodes: 6267
#Unreach Functions: 39
#Reach Functions: 5870

#Unreach Nodes from Pre-Gen Nodes: 39
#Unreach Functions from Pre-Gen Nodes: 39
#(Pre-Gen are created for reflections)

Done -- PTA/CG Build; Using 10.279403083s.
```
My analysis traverses 6267 reachable cgnodes (5870 distinct functions), even after extending the set of tracked types.

Pre-generation not only introduces differences in the cg, but also unreachable constraints and objs, which can be
propagated to the cgnodes and constraints that are reachable from the main entry. This causes false
call edges and callees in the default cg.

Most CG DIFFs from comparing my result with the default result are due to this reason.


#### Why are unreachable functions/cgnodes generated?
This is because the default algorithm creates nodes and constraints for all methods of all types
that are dynamically accessible via reflection or interfaces (whether or not they will ever be reached).


#### Why do the cgnodes from default lack some callees that mine has?
Because the default algorithm tracks fewer types than mine (no constraints are generated for the untracked types and hence
nothing propagates), some invoke calls have no base instance, although one would exist if those types were tracked.
Consequently, neither the callee functions/cgnodes nor their constraints are generated.
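
The difference between pre-generating every cgnode and discovering cgnodes from ```init``` and ```main``` can be illustrated with a small worklist sketch. This is a toy model, not the actual pointer analysis: the ```callGraph``` type, the ```reachable``` helper, and the function names in the sample graph are all hypothetical, and the real analysis also generates constraints as nodes become reachable.

```go
package main

import "fmt"

// callGraph maps a function name to its direct callees.
// It stands in for "every function the program contains".
type callGraph map[string][]string

// reachable returns the functions reachable from roots, visiting
// callees on demand with a worklist instead of pre-creating all nodes.
func reachable(g callGraph, roots ...string) map[string]bool {
	seen := make(map[string]bool)
	work := append([]string(nil), roots...)
	for len(work) > 0 {
		fn := work[len(work)-1]
		work = work[:len(work)-1]
		if seen[fn] {
			continue
		}
		seen[fn] = true
		work = append(work, g[fn]...)
	}
	return seen
}

func main() {
	g := callGraph{
		"init":         {},
		"main":         {"serve"},
		"serve":        {"handle"},
		"handle":       {},
		"tlsHandshake": {"handle"}, // has no caller, so it is never visited
	}
	r := reachable(g, "init", "main")
	// Pre-generation would create len(g) nodes; on-demand creates len(r).
	fmt.Println(len(g), len(r), r["tlsHandshake"]) // prints: 5 4 false
}
```

In this model the caller-less function contributes neither a node nor an edge to ```handle```, which is the kind of pollution the on-demand approach avoids.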