Copyright 2002, 2005 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of either:

  * the GNU Lesser General Public License as published by the Free
    Software Foundation; either version 3 of the License, or (at your
    option) any later version.

or

  * the GNU General Public License as published by the Free Software
    Foundation; either version 2 of the License, or (at your option) any
    later version.

or both in parallel, as here.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received copies of the GNU General Public License and the
GNU Lesser General Public License along with the GNU MP Library. If not,
see https://www.gnu.org/licenses/.





                    POWERPC 32-BIT MPN SUBROUTINES


This directory contains mpn functions for various 32-bit PowerPC chips.


CODE ORGANIZATION

        directory         used for
        ================================================
        powerpc           generic, 604, 604e, 744x, 745x
        powerpc/750       740, 750, 7400, 7410


The top-level powerpc directory is currently mostly aimed at 604/604e but
should be reasonable on all powerpcs.



STATUS

The code is quite well optimized for the 604e; other chips have had less
attention.

Altivec SIMD available in 74xx might hold some promise, but unfortunately
GMP only guarantees 32-bit data alignment, so there's lots of fiddling
around with partial operations at the start and end of limb vectors.
A 128-bit limb would be a novel idea, but is unlikely to be practical, since
it would have to work with ordinary +, -, * etc. in the C code.

Also, Altivec isn't very well suited for the GMP multiplication needs.
Using floating-point based multiplication has much better performance
potential for all current powerpcs, both the ones with slow integer multiply
units (603, 740, 750, 7400, 7410) and those with fast (604, 604e, 744x,
745x). This is because all powerpcs do some level of pipelining in the FPU:

        603 and 750 can sustain one fmadd every 2nd cycle.
        604 and 604e can sustain one fmadd per cycle.
        7400 and 7410 can sustain 3 fmadd in 4 cycles.
        744x and 745x can sustain 4 fmadd in 5 cycles.



REGISTER NAMES

The normal powerpc convention is to give registers as plain numbers, like
"mtctr 6", but on Apple MacOS X (powerpc*-*-rhapsody* and
powerpc*-*-darwin*) the assembler demands an "r" like "mtctr r6". Note,
however, that when register 0 in an instruction means a literal zero the
"r" is omitted, for instance "lwzx r6,0,r7".

The GMP code uses the "r" forms; powerpc-defs.m4 transforms them to plain
numbers according to what GMP_ASM_POWERPC_R_REGISTERS finds is needed.
(Note that this style isn't fully general, as the identifier r4 and the
register r4 will not be distinguishable on some systems. However, this is
not a problem for the limited GMP assembly usage.)
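
As a rough illustration of the rewrite described above, the Python sketch
below strips the "r" from register-like tokens. It is only a stand-in: the
real transformation is done with m4 in powerpc-defs.m4, and the function
name strip_r_prefix and the regular expression are assumptions made for
this example. The last call shows the caveat noted above, where a label
that happens to be named r4 is rewritten as if it were the register.

```python
import re

# Hypothetical stand-in for the m4 rewrite: a register-like token is "r"
# followed by a number in 0..31, standing alone as a word.
_REG = re.compile(r"\br([12]?[0-9]|3[01])\b")


def strip_r_prefix(line: str) -> str:
    """Rewrite r0..r31 to plain numbers by purely textual substitution."""
    return _REG.sub(r"\1", line)


print(strip_r_prefix("mtctr r6"))      # mtctr 6
print(strip_r_prefix("lwzx r6,0,r7"))  # lwzx 6,0,7

# A label called r4 cannot be told apart from the register r4, so it is
# rewritten too.
print(strip_r_prefix("b r4"))          # b 4
```

Because the substitution is textual, avoiding identifiers of the form r0 to
r31 in the assembly sources is what keeps the scheme safe.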



GLOBAL REFERENCES

Linux non-PIC
        lis     9, __gmp_binvert_limb_table@ha
        rlwinm  11, 5, 31, 25, 31
        la      9, __gmp_binvert_limb_table@l(9)
        lbzx    11, 9, 11

Linux PIC (FIXME)
.LCL0:
        .long   .LCTOC1-.LCF0
        bcl     20, 31, .LCF0
.LCF0:
        mflr    30
        lwz     7, .LCL0-.LCF0(30)
        add     30, 7, 30
        lwz     11, .LC0-.LCTOC1(30)
        rlwinm  3, 5, 31, 25, 31
        lbzx    7, 11, 3

AIX (always PIC)
LC..0:
        .tc __gmp_binvert_limb_table[TC],__gmp_binvert_limb_table[RW]
        lwz     9, LC..0(2)
        rlwinm  0, 5, 31, 25, 31
        lbzx    0, 9, 0

Darwin (non-PIC)
        lis     r2, ha16(___gmp_binvert_limb_table)
        rlwinm  r9, r5, 31, 25, 31
        la      r2, lo16(___gmp_binvert_limb_table)(r2)
        lbzx    r0, r2, r9

Darwin (PIC)
        mflr    r0
        bcl     20, 31, L0001$pb
L0001$pb:
        mflr    r7
        mtlr    r0
        addis   r2, r7, ha16(L___gmp_binvert_limb_table$non_lazy_ptr-L0001$pb)
        rlwinm  r9, r5, 31, 25, 31
        lwz     r2, lo16(L___gmp_binvert_limb_table$non_lazy_ptr-L0001$pb)(r2)
        lbzx    r0, r2, r9
------
        .non_lazy_symbol_pointer
L___gmp_binvert_limb_table$non_lazy_ptr:
        .indirect_symbol ___gmp_binvert_limb_table
        .long   0
        .subsections_via_symbols


For GNU/Linux and Darwin, we might want to duplicate __gmp_binvert_limb_table
into the text section in this file. We should thus be able to reach it like
this:

        bl      L0
L0:     mflr    r2
        rlwinm  r9, r5, 31, 25, 31
        addi    r9, r9, lo16(local_binvert_table-L0)
        lbzx    r0, r2, r9



REFERENCES

PowerPC Microprocessor Family: The Programming Environments for 32-bit
Microprocessors, IBM document G522-0290-01, 2000.

PowerPC 604e RISC Microprocessor User's Manual with Supplement for PowerPC
604 Microprocessor, IBM document G552-0330-00, Freescale document
MPC604EUM/AD, 3/1998.

MPC7410/MPC7400 RISC Microprocessor User's Manual, Freescale document
MPC7400UM/D, rev 1, 11/2002.

MPC7450 RISC Microprocessor Family Reference Manual, Freescale document
MPC7450UM, rev 5, 1/2005.

The above are available online from

        http://www.ibm.com/chips/techlib/techlib.nsf/productfamilies/PowerPC
        http://www.freescale.com/PowerPC



----------------
Local variables:
mode: text
fill-column: 76
End: