// Copyright 2019 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package ppc64 implements a PPC64 assembler that assembles Go asm into
the corresponding PPC64 instructions as defined by the Power ISA 3.0B.

This document provides information on how to write code in Go assembler
for PPC64, focusing on the differences between Go and PPC64 assembly language.
It assumes some knowledge of PPC64 assembler. The original implementation of
PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but
later updates to the Go assembly language added opcodes, such as the VMX and VSX
instructions, whose mnemonics are mostly similar if not identical to the PPC64
mnemonics. Not all detail is included here; refer to the Power ISA document for
more detail.

Starting with Go 1.15 the Go objdump supports the -gnu option, which provides a
side by side view of the Go assembler and the PPC64 assembler output. This is
extremely helpful in determining what final PPC64 assembly is generated from the
corresponding Go assembly.

In the examples below, the Go assembly is on the left, PPC64 assembly on the right.

1. Operand ordering

In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
the first operand (left) is the target. The order of the remaining operands is
not consistent: in general opcodes with 3 operands that perform math or logical
operations have their operands in reverse order. Opcodes for vector instructions
and those with more than 3 operands usually have operands in the same order except
for the target operand, which is first in PPC64 asm and last in Go asm.

Example:

	ADD R3, R4, R5		<=>	add r5, r4, r3

2. Constant operands

In Go asm, an operand that starts with '$' indicates a constant value. If the
instruction using the constant has an immediate version of the opcode, then an
immediate value is used with the opcode if possible.

Example:

	ADD $1, R3, R4		<=>	addi r4, r3, 1

3. Opcodes setting condition codes

In PPC64 asm, some instructions other than compares have variations that can set
the condition code where meaningful. This is indicated by adding '.' to the end
of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
the opcode. The possible settings of the condition code depend on the instruction.
CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
vector instructions.

Example:

	ANDCC R3, R4, R5		<=>	and. r5, r3, r4 (set CR0)

4. Loads and stores from memory

In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
is a memory reference, then it is a store; when the target is a register and the
source is a memory reference, then it is a load.

MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.

Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
Adding 'U' to a load or store indicates an update of the base register with the offset.
Adding 'BR' to an opcode indicates a byte-reversed load or store, that is, the byte
order opposite to the expected endian order. If 'BR' is used then zero extend is assumed.

Memory references n(Ra) indicate the address in Ra + n. When used with an update form
of an opcode, the value in Ra is incremented by n.

Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
loads or stores. Both forms are accepted. When used with an update form, the base
register is updated by the value in the index register.

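The addressing rules above can be modeled in a few lines of Go. This is only an
illustrative sketch, not code from the assembler: the regs type and the helpers
effAddr, effAddrUpdate and effAddrIndexedUpdate are hypothetical names introduced
here, and the register values are made up.

```go
package main

import "fmt"

// regs is a hypothetical stand-in for the 32 general purpose registers.
type regs [32]uint64

// effAddr returns the address named by n(Ra): the value in Ra plus n.
func effAddr(r *regs, ra int, n int64) uint64 {
	return r[ra] + uint64(n)
}

// effAddrUpdate models an update form such as MOVWZU n(Ra), Rt.
// The access uses Ra + n, and Ra is left holding that address.
func effAddrUpdate(r *regs, ra int, n int64) uint64 {
	ea := effAddr(r, ra, n)
	r[ra] = ea
	return ea
}

// effAddrIndexedUpdate models an indexed update form such as
// MOVBU Rs, (Ra)(Rb). The access uses Ra + Rb, and Ra is updated.
func effAddrIndexedUpdate(r *regs, ra, rb int) uint64 {
	ea := r[ra] + r[rb]
	r[ra] = ea
	return ea
}

func main() {
	var r regs
	r[3], r[5] = 0x1000, 0x20 // made-up starting values for R3 and R5

	ea := effAddr(&r, 3, 4)
	fmt.Printf("4(R3)            ea=%#x R3=%#x\n", ea, r[3])

	ea = effAddrUpdate(&r, 3, 4)
	fmt.Printf("4(R3), update    ea=%#x R3=%#x\n", ea, r[3])

	ea = effAddrIndexedUpdate(&r, 3, 5)
	fmt.Printf("(R3)(R5), update ea=%#x R3=%#x\n", ea, r[3])
}
```

In this model the plain form leaves R3 unchanged, while both update forms leave
the computed address in R3.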

Examples:

	MOVD (R3), R4		<=>	ld r4,0(r3)
	MOVW (R3), R4		<=>	lwa r4,0(r3)
	MOVWZU 4(R3), R4		<=>	lwzu r4,4(r3)
	MOVWZ (R3+R5), R4		<=>	lwzx r4,r3,r5
	MOVHZ  (R3), R4		<=>	lhz r4,0(r3)
	MOVHU 2(R3), R4		<=>	lhau r4,2(r3)
	MOVBZ (R3), R4		<=>	lbz r4,0(r3)

	MOVD R4,(R3)		<=>	std r4,0(r3)
	MOVW R4,(R3)		<=>	stw r4,0(r3)
	MOVW R4,(R3+R5)		<=>	stwx r4,r3,r5
	MOVWU R4,4(R3)		<=>	stwu r4,4(r3)
	MOVH R4,2(R3)		<=>	sth r4,2(r3)
	MOVBU R4,(R3)(R5)		<=>	stbux r4,r3,r5

5. Compares

When an instruction does a compare or other operation that might
result in a condition code, then the resulting condition is set
in a field of the condition register. The condition register consists
of 8 4-bit fields named CR0 - CR7. When a compare instruction
identifies a CR then the resulting condition is set in that field
to be read by a later branch or isel instruction. Within these fields,
bits are set to indicate less than, greater than, or equal conditions.

Once an instruction sets a condition, then a subsequent branch, isel or
other instruction can read the condition field and operate based on the
bit settings.

Examples:

	CMP R3, R4			<=>	cmp r3, r4	(CR0 assumed)
	CMP R3, R4, CR1		<=>	cmp cr1, r3, r4

Note that the condition register is the target operand of compare opcodes, so
the remaining operands are in the same order for Go asm and PPC64 asm.
When CR0 is used then it is implicit and does not need to be specified.

6. Branches

Many branches are represented as a form of the BC instruction. There are
other extended opcodes to make it easier to see what type of branch is being
used.

The following is a brief description of the BC instruction and its commonly
used operands.

	BC op1, op2, op3

	  op1: type of branch
	      16 -> bctr (branch on ctr)
	      12 -> bcr  (branch if cr bit is set)
	      8  -> bcr+bctr (branch on ctr and cr values)
	      4  -> bcr != 0 (branch if specified cr bit is not set)

	      There are more combinations but these are the most common.

	  op2: condition register field and condition bit

	      This contains an immediate value indicating which condition field
	      to read and what bits to test. Each field is 4 bits long with CR0
	      at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
	      with these condition values:

	      0 -> LT
	      1 -> GT
	      2 -> EQ
	      3 -> OVG

	      Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.

	  op3: branch target

Examples:

	BC 12, 0, target		<=>	blt cr0, target
	BC 12, 2, target		<=>	beq cr0, target
	BC 12, 5, target		<=>	bgt cr1, target
	BC 12, 30, target		<=>	beq cr7, target
	BC 4, 6, target		<=>	bne cr1, target
	BC 4, 1, target		<=>	ble cr0, target

The following extended opcodes are available for ease of use and readability:

	BNE CR2, target		<=>	bne cr2, target
	BEQ CR4, target		<=>	beq cr4, target
	BLT target			<=>	blt target (cr0 default)
	BGE CR7, target		<=>	bge cr7, target

Refer to the ISA for more information on additional values for the BC instruction,
how to handle OVG information, and much more.

7. Align directive

Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
that the next instruction should be aligned to the specified value. Currently
8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
to align the code. That means in the case where the code is aligned to 4 but
PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
adding 3 NOPs.

The purpose of this directive is to improve performance for cases like loops
where better alignment (8 or 16 instead of 4) might be helpful. This directive
exists in PPC64 assembler and is frequently used by PPC64 assembler writers.

	PCALIGN $16
	PCALIGN $8

Functions in Go are aligned to 16 bytes, as is the case in all other compilers
for PPC64.

8. Shift instructions

The simple scalar shifts on PPC64 expect a shift count that fits in 5 bits for
32-bit values or 6 bits for 64-bit values. If the shift count is a constant value
greater than the max then the assembler sets it to the max for that size (31 for
32-bit values, 63 for 64-bit values). If the shift count is in a register, then
only the low 5 or 6 bits of the register will be used as the shift count. The
Go compiler will add appropriate code to compare the shift value to achieve the
correct result, and the assembler does not add extra checking.

Examples:

	SRAD $8,R3,R4		=>	sradi r4,r3,8
	SRD $8,R3,R4		=>	rldicl r4,r3,56,8
	SLD $8,R3,R4		=>	rldicr r4,r3,8,55
	SRAW $16,R4,R5		=>	srawi r5,r4,16
	SRW $40,R4,R5		=>	rlwinm r5,r4,0,0,31
	SLW $12,R4,R5		=>	rlwinm r5,r4,12,0,19

Some non-simple shifts have operands in the Go assembly which don't map directly
onto operands in the PPC64 assembly. When an operand in a shift instruction in the
Go assembly is a bit mask, that mask is represented as a start and end bit in the
PPC64 assembly instead of a mask. See the ISA for more detail on these types of shifts.
Here are a few examples:

	RLWMI $7,R3,$65535,R6	=>	rlwimi r6,r3,7,16,31
	RLDMI $0,R4,$7,R6		=>	rldimi r6,r4,0,61

More recently, Go opcodes were added which map directly onto the PPC64 opcodes. It is
recommended to use the newer opcodes to avoid confusion.

	RLDICL $0,R4,$15,R6		=>	rldicl r6,r4,0,15
	RLDICR $0,R4,$15,R6		=>	rldicr r6,r4,0,15

# Register naming

1. Special register usage in Go asm

The following registers should not be modified by user Go assembler code.

	R0: Go code expects this register to contain the value 0.
	R1: Stack pointer
	R2: TOC pointer when compiled with -shared or -dynlink (a.k.a. position independent code)
	R13: TLS pointer
	R30: g (goroutine)

Register names:

	Rn is used for general purpose registers. (0-31)
	Fn is used for floating point registers. (0-31)
	Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31)
	VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63)
	CTR represents the count register.
	LR represents the link register.
	CR represents the condition register.
	CRn represents a condition register field. (0-7)
	CRnLT represents CR bit 0 of CR field n. (0-7)
	CRnGT represents CR bit 1 of CR field n. (0-7)
	CRnEQ represents CR bit 2 of CR field n. (0-7)
	CRnSO represents CR bit 3 of CR field n. (0-7)
*/
package ppc64