github.com/gagliardetto/golang-go@v0.0.0-20201020153340-53909ea70814/cmd/internal/obj/ppc64/doc.go

// Copyright 2019 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package ppc64 implements a PPC64 assembler that assembles Go asm into
the corresponding PPC64 binary instructions as defined by the Power
ISA. Since POWER8 is the minimum instruction set used by GOARCHes ppc64le
and ppc64, refer to ISA 2.07B or later for details.

This document provides information on how to write code in Go assembler
for PPC64, focusing on the differences between Go and PPC64 assembly language.
It assumes some knowledge of PPC64 assembler. The original implementation of
PPC64 in Go defined many opcodes that differ from the PPC64 opcodes, but
later additions to the Go assembly language, such as the VMX and VSX
instructions, use mnemonics that are mostly similar, if not identical, to the
PPC64 mnemonics. Not every detail is included here; refer to the Power ISA
document for more.

In the examples below, the Go assembly is on the left, PPC64 assembly on the right.

1. Operand ordering

  In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
  the first operand (left) is the target. In general, the remaining operands are
  in the same order except in a few special cases, especially those with 4 operands.

  Example:
    ADD R3, R4, R5          <=>     add r5, r3, r4

2. Constant operands

  In Go asm, an operand that starts with '$' indicates a constant value. If the
  instruction using the constant has an immediate version of the opcode, then an
  immediate value is used with the opcode if possible.

  Example:
    ADD $1, R3, R4          <=>     addi r4, r3, 1

3. Opcodes setting condition codes

  In PPC64 asm, some instructions other than compares have variations that can set
  the condition code where meaningful. This is indicated by adding '.' to the end
  of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
  the opcode. The possible settings of the condition code depend on the instruction.
  CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
  vector instructions.

  Example:
    ANDCC R3, R4, R5        <=>     and. r5, r3, r4 (set CR0)

4. Loads and stores from memory

  In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
  is a memory reference, it is a store; when the target is a register and the
  source is a memory reference, it is a load.

  MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.

  Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
  Adding 'U' to a load or store indicates an update of the base register with the offset.
  Adding 'BR' to an opcode indicates a byte-reversed load or store, that is, the order
  opposite to the expected endian order. If 'BR' is used then zero extend is assumed.

  Memory references n(Ra) indicate the address in Ra + n. When used with an update form
  of an opcode, the value in Ra is incremented by n.

  Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
  loads or stores. Both forms are accepted. When used with an update form, the base
  register is updated by the value in the index register.

  Examples:
    MOVD (R3), R4           <=>     ld r4,0(r3)
    MOVW (R3), R4           <=>     lwa r4,0(r3)
    MOVWZU 4(R3), R4        <=>     lwzu r4,4(r3)
    MOVWZ (R3+R5), R4       <=>     lwzx r4,r3,r5
    MOVHZ (R3), R4          <=>     lhz r4,0(r3)
    MOVHU 2(R3), R4         <=>     lhau r4,2(r3)
    MOVBZ (R3), R4          <=>     lbz r4,0(r3)

    MOVD R4,(R3)            <=>     std r4,0(r3)
    MOVW R4,(R3)            <=>     stw r4,0(r3)
    MOVW R4,(R3+R5)         <=>     stwx r4,r3,r5
    MOVWU R4,4(R3)          <=>     stwu r4,4(r3)
    MOVH R4,2(R3)           <=>     sth r4,2(r3)
    MOVBU R4,(R3)(R5)       <=>     stbux r4,r3,r5

5. Compares

  When an instruction does a compare or other operation that might
  result in a condition code, the resulting condition is set
  in a field of the condition register. The condition register consists
  of eight 4-bit fields named CR0 - CR7. When a compare instruction
  identifies a CR, the resulting condition is set in that field
  to be read by a later branch or isel instruction. Within these fields,
  bits are set to indicate less than, greater than, or equal conditions.

  Once an instruction sets a condition, a subsequent branch, isel or
  other instruction can read the condition field and operate based on the
  bit settings.

  Examples:
    CMP R3, R4              <=>     cmp r3, r4 (CR0 assumed)
    CMP R3, R4, CR1         <=>     cmp cr1, r3, r4

  Note that the condition register is the target operand of compare opcodes, so
  the remaining operands are in the same order for Go asm and PPC64 asm.
  When CR0 is used, it is implicit and does not need to be specified.

6. Branches

  Many branches are represented as a form of the BC instruction. There are
  other extended opcodes to make it easier to see what type of branch is being
  used.

  The following is a brief description of the BC instruction and its commonly
  used operands.

  BC op1, op2, op3

    op1: type of branch
        16 -> bctr (branch on ctr)
        12 -> bcr (branch if cr bit is set)
        8  -> bcr+bctr (branch on ctr and cr values)
        4  -> bcr != 0 (branch if specified cr bit is not set)

        There are more combinations but these are the most common.

    op2: condition register field and condition bit

        This contains an immediate value indicating which condition field
        to read and what bits to test. Each field is 4 bits long with CR0
        at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
        with these condition values:

        0 -> LT
        1 -> GT
        2 -> EQ
        3 -> OVG

        Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.

    op3: branch target

  Examples:

    BC 12, 0, target        <=>     blt cr0, target
    BC 12, 2, target        <=>     beq cr0, target
    BC 12, 5, target        <=>     bgt cr1, target
    BC 12, 30, target       <=>     beq cr7, target
    BC 4, 6, target         <=>     bne cr1, target
    BC 4, 5, target         <=>     ble cr1, target

  The following extended opcodes are available for ease of use and readability:

    BNE CR2, target         <=>     bne cr2, target
    BEQ CR4, target         <=>     beq cr4, target
    BLT target              <=>     blt target (cr0 default)
    BGE CR7, target         <=>     bge cr7, target

  Refer to the ISA for more information on additional values for the BC instruction,
  how to handle OVG information, and much more.

7. Align directive

  Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
  that the next instruction should be aligned to the specified value. Currently
  8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
  to align the code. That means in the case where the code is aligned to 4 but
  PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
  adding 3 NOPs.
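
  The padding rule above can be sketched in Go. This is only an illustration of
  the rule as this document states it, not the assembler's actual code: the
  helper name padBytes and its signature are hypothetical, and pc is assumed to
  be a multiple of the 4-byte instruction size.

```go
package main

import "fmt"

// padBytes is a hypothetical helper (not part of the Go assembler). It returns
// the number of bytes of NOP padding that the rule described in this document
// would place before an instruction at address pc for PCALIGN $align.
// Assumptions: align is 8 or 16, and pc is a multiple of 4.
func padBytes(pc, align int64) int64 {
	const insnSize = 4 // every PPC64 instruction is 4 bytes
	const maxNOPs = 2  // at most 2 NOPs are added

	// Bytes needed to reach the requested alignment.
	need := (align - pc%align) % align
	if need > maxNOPs*insnSize {
		// Reaching the requested alignment would take 3 NOPs,
		// so settle for 8-byte alignment instead.
		need = (8 - pc%8) % 8
	}
	return need
}

func main() {
	for _, pc := range []int64{0, 4, 8, 12} {
		n := padBytes(pc, 16)
		fmt.Printf("pc=%d: PCALIGN $16 adds %d NOP(s), next pc=%d\n", pc, n/4, pc+n)
	}
}
```

  Under these assumptions, an instruction at an address that is 4 past a
  16-byte boundary gets a single NOP and ends up 8-byte aligned, which is the
  special case called out above.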

  The purpose of this directive is to improve performance for cases like loops
  where better alignment (8 or 16 instead of 4) might be helpful. This directive
  exists in PPC64 assembler and is frequently used by PPC64 assembler writers.

    PCALIGN $16
    PCALIGN $8

  Functions in Go are aligned to 16 bytes, as is the case in all other compilers
  for PPC64.

Register naming

1. Special register usage in Go asm

  The following registers should not be modified by user Go assembler code.

    R0: Go code expects this register to contain the value 0.
    R1: Stack pointer
    R2: TOC pointer when compiled with -shared or -dynlink (a.k.a. position independent code)
    R13: TLS pointer
    R30: g (goroutine)

  Register names:

    Rn is used for general purpose registers. (0-31)
    Fn is used for floating point registers. (0-31)
    Vn is used for vector registers. (0-31)
    VSn is used for vector-scalar registers. F0-F31 overlap with doubleword 0 of
        VS0-VS31; V0-V31 overlap with VS32-VS63. (0-63)
    CTR represents the count register.
    LR represents the link register.

*/
package ppc64