\input texinfo    @c -*-texinfo-*-
@c %**start of header
@setfilename gmp.info
@documentencoding ISO-8859-1
@include version.texi
@settitle GNU MP @value{VERSION}
@synindex tp fn
@iftex
@afourpaper
@end iftex
@comment %**end of header

@copying
This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.

Copyright 1991, 1993-2016 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
software''. A copy of the license is included in
@ref{GNU Free Documentation License}.
@end copying
@c Note the @ref above must be on one line, a line break in an @ref within
@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
@c with texinfo 4.7), with messages about missing @endcsname.


@c Texinfo version 4.2 or up will be needed to process this file.
@c
@c The version number and edition number are taken from version.texi provided
@c by automake (note that it's regenerated only if you configure with
@c --enable-maintainer-mode).
@c
@c Notes discussing the present version number of GMP in relation to previous
@c ones (for instance in the "Compatibility" section) must be updated
@c manually though.
@c
@c @cindex entries have been made for function categories and programming
@c topics. The "mpn" section is not included in this, because a beginner
@c looking for "GCD" or something is only going to be confused by pointers to
@c low level routines.
@c
@c @cindex entries are present for processors and systems when there's
@c particular notes concerning them, but not just for everything GMP
@c supports.
@c
@c Index entries for files use @code rather than @file, @samp or @option,
@c since the latter come out with quotes in TeX, which are nice in the text
@c but don't look so good in index columns.
@c
@c Tex:
@c
@c A suitable texinfo.tex is supplied, a newer one should work equally well.
@c
@c HTML:
@c
@c Nothing special is done for links to external manuals, they just come out
@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have
@c local copies of such manuals then this is a good thing, if not then you
@c may want to search-and-replace to some online source.
@c

@dircategory GNU libraries
@direntry
* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
@end direntry

@c html <meta name="description" content="...">
@documentdescription
How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
@end documentdescription

@c smallbook
@finalout
@setchapternewpage on

@ifnottex
@node Top, Copying, (dir), (dir)
@top GNU MP
@end ifnottex

@iftex
@titlepage
@title GNU MP
@subtitle The GNU Multiple Precision Arithmetic Library
@subtitle Edition @value{EDITION}
@subtitle @value{UPDATED}

@author by Torbj@"orn Granlund and the GMP development team
@c @email{tg@@gmplib.org}

@c Include the Distribution inside the titlepage so
@c that headings are turned off.

@tex
\global\parindent=0pt
\global\parskip=8pt
\global\baselineskip=13pt
@end tex

@page
@vskip 0pt plus 1filll
@end iftex

@insertcopying
@ifnottex
@sp 1
@end ifnottex

@iftex
@end titlepage
@headings double
@end iftex

@c Don't bother with contents for html, the menus seem adequate.
@ifnothtml
@contents
@end ifnothtml

@menu
* Copying::                     GMP Copying Conditions (LGPL).
* Introduction to GMP::         Brief introduction to GNU MP.
* Installing GMP::              How to configure and compile the GMP library.
* GMP Basics::                  What every GMP user should know.
* Reporting Bugs::              How to usefully report bugs.
* Integer Functions::           Functions for arithmetic on signed integers.
* Rational Number Functions::   Functions for arithmetic on rational numbers.
* Floating-point Functions::    Functions for arithmetic on floats.
* Low-level Functions::         Fast functions for natural numbers.
* Random Number Functions::     Functions for generating random numbers.
* Formatted Output::            @code{printf} style output.
* Formatted Input::             @code{scanf} style input.
* C++ Class Interface::         Class wrappers around GMP types.
* Custom Allocation::           How to customize the internal allocation.
* Language Bindings::           Using GMP from other languages.
* Algorithms::                  What happens behind the scenes.
* Internals::                   How values are represented behind the scenes.

* Contributors::                Who brings you this library?
* References::                  Some useful papers and books to read.
* GNU Free Documentation License::
* Concept Index::
* Function Index::
@end menu


@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give
@c different forms for math in tex and info. Commas in N or T don't work,
@c but @C{} can be used instead. \, works in info but not in tex.
@iftex
@macro m {T,N}
@tex$\T\$@end tex
@end macro
@end iftex
@ifnottex
@macro m {T,N}
@math{\N\}
@end macro
@end ifnottex

@macro C {}
,
@end macro

@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
@c subscripts like @ms{x,0}.
@iftex
@macro ms {V,N}
@tex$\V\_{\N\}$@end tex
@end macro
@end iftex
@ifnottex
@macro ms {V,N}
\V\\N\
@end macro
@end ifnottex

@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
@c when the quotes that @code{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
@c though (gives two backslashes in tex).
@ifinfo
@macro nicode {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nicode {S}
@code{\S\}
@end macro
@end ifnotinfo

@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
@c when the quotes that @samp{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted.
@ifinfo
@macro nisamp {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nisamp {S}
@samp{\S\}
@end macro
@end ifnotinfo

@c Usage: @GMPtimes{}
@c Give either \times or the word "times".
@tex
\gdef\GMPtimes{\times}
@end tex
@ifnottex
@macro GMPtimes
times
@end macro
@end ifnottex

@c Usage: @GMPmultiply{}
@c Give * in info, or nothing in tex.
@tex
\gdef\GMPmultiply{}
@end tex
@ifnottex
@macro GMPmultiply
*
@end macro
@end ifnottex

@c Usage: @GMPabs{x}
@c Give either |x| in tex, or abs(x) in info or html.
@tex
\gdef\GMPabs#1{|#1|}
@end tex
@ifnottex
@macro GMPabs {X}
@abs{}(\X\)
@end macro
@end ifnottex

@c Usage: @GMPfloor{x}
@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
@tex
\gdef\GMPfloor#1{\lfloor #1\rfloor}
@end tex
@ifnottex
@macro GMPfloor {X}
floor(\X\)
@end macro
@end ifnottex

@c Usage: @GMPceil{x}
@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
@tex
\gdef\GMPceil#1{\lceil #1 \rceil}
@end tex
@ifnottex
@macro GMPceil {X}
ceil(\X\)
@end macro
@end ifnottex

@c Math operators already available in tex, made available in info too.
@c For example @bmod{} can be used in both tex and info.
@ifnottex
@macro bmod
mod
@end macro
@macro gcd
gcd
@end macro
@macro ge
>=
@end macro
@macro le
<=
@end macro
@macro log
log
@end macro
@macro min
min
@end macro
@macro leftarrow
<-
@end macro
@macro rightarrow
->
@end macro
@end ifnottex

@c New math operators.
@c @abs{} can be used in both tex and info, or just \abs in tex.
@tex
\gdef\abs{\mathop{\rm abs}}
@end tex
@ifnottex
@macro abs
abs
@end macro
@end ifnottex

@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
@c inside or outside $ $.
@tex
\gdef\cross{\ifmmode\times\else$\times$\fi}
@end tex
@ifnottex
@macro cross
x
@end macro
@end ifnottex

@c @times{} made available as a "*" in info and html (already works in tex).
@ifnottex
@macro times
*
@end macro
@end ifnottex

@c Usage: @W{text}
@c Like @w{} but working in math mode too.
@tex
\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
@end tex
@ifnottex
@macro W {S}
@w{\S\}
@end macro
@end ifnottex

@c Usage: \GMPdisplay{text}
@c Put the given text in an @display style indent, but without turning off
@c paragraph reflow etc.
@tex
\gdef\GMPdisplay#1{%
\noindent
\advance\leftskip by \lispnarrowing
#1\par}
@end tex

@c Usage: \GMPhat
@c A new \hat that will work in math mode, unlike the texinfo redefined
@c version.
@tex
\gdef\GMPhat{\mathaccent"705E}
@end tex

@c Usage: \GMPraise{text}
@c For use in a $ $ math expression as an alternative to "^". This is good
@c for @code{} in an exponent, since there seems to be no superscript font
@c for that.
@tex
\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
@end tex

@c Usage: @texlinebreak{}
@c A line break as per @*, but only in tex.
@iftex
@macro texlinebreak
@*
@end macro
@end iftex
@ifnottex
@macro texlinebreak
@end macro
@end ifnottex

@c Usage: @maybepagebreak
@c Allow tex to insert a page break, if it feels the urge.
@c Normally blocks of @deftypefun/funx are kept together, which can lead to
@c some poor page break positioning if it's a big block, like the sets of
@c division functions etc.
@tex
\gdef\maybepagebreak{\penalty0}
@end tex
@ifnottex
@macro maybepagebreak
@end macro
@end ifnottex

@c Usage: @GMPreftop{info,title}
@c Usage: @GMPpxreftop{info,title}
@c
@c Like @ref{} and @pxref{}, but designed for a reference to the top of a
@c document, not a particular section. The TeX output for plain @ref insists
@c on printing a particular section, GMPreftop gives just the title.
@c
@c The texinfo manual recommends putting a likely section name in references
@c like this, eg. "Introduction", but it seems better to just give the title.
@c
@iftex
@macro GMPreftop{info,title}
@i{\title\}
@end macro
@macro GMPpxreftop{info,title}
see @i{\title\}
@end macro
@end iftex
@c
@ifnottex
@macro GMPreftop{info,title}
@ref{Top,\title\,\title\,\info\,\title\}
@end macro
@macro GMPpxreftop{info,title}
@pxref{Top,\title\,\title\,\info\,\title\}
@end macro
@end ifnottex


@node Copying, Introduction to GMP, Top, Top
@comment node-name, next, previous, up
@unnumbered GNU MP Copying Conditions
@cindex Copying conditions
@cindex Conditions for copying GNU MP
@cindex License conditions

This library is @dfn{free}; this means that everyone is free to use it and
free to redistribute it on a free basis. The library is not in the public
domain; it is copyrighted and there are restrictions on its distribution, but
these restrictions are designed to permit everything that a good cooperating
citizen would want to do. What is not allowed is to try to prevent others
from further sharing any version of this library that they might get from
you.@refill

Specifically, we want to make sure that you have the right to give away copies
of the library, that you receive source code or else can get it if you want
it, that you can change this library or use pieces of it in new free programs,
and that you know you can do these things.@refill

To make sure that everyone has such rights, we have to forbid you to deprive
anyone else of these rights. For example, if you distribute copies of the GNU
MP library, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And you
must tell them their rights.@refill

Also, for our own protection, we must make certain that everyone finds out
that there is no warranty for the GNU MP library.
If it is modified by
someone else and passed on, we want their recipients to know that what they
have is not what we distributed, so that any problems introduced by others
will not reflect on our reputation.@refill

More precisely, the GNU MP library is dual licensed, under the conditions of
the GNU Lesser General Public License version 3 (see
@file{COPYING.LESSERv3}), or the GNU General Public License version 2 (see
@file{COPYINGv2}). This is the recipient's choice, and the recipient also has
the additional option of applying later versions of these licenses. (The
reason for this dual licensing is to make it possible to use the library with
programs which are licensed under GPL version 2, but which for historical or
other reasons do not allow use under later versions of the GPL).

Programs which are not part of the library itself, such as demonstration
programs and the GMP testsuite, are licensed under the terms of the GNU
General Public License version 3 (see @file{COPYINGv3}), or any later
version.


@node Introduction to GMP, Installing GMP, Copying, Top
@comment node-name, next, previous, up
@chapter Introduction to GNU MP
@cindex Introduction

GNU MP is a portable library written in C for arbitrary precision arithmetic
on integers, rational numbers, and floating-point numbers. It aims to provide
the fastest possible arithmetic for all applications that need higher
precision than is directly supported by the basic C types.

Many applications use just a few hundred bits of precision; but some
applications may need thousands or even millions of bits. GMP is designed to
give good performance for both, by choosing algorithms based on the sizes of
the operands, and by carefully keeping the overhead at a minimum.

The speed of GMP is achieved by using fullwords as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).

There is assembly code for these CPUs:
@cindex CPU types
ARM Cortex-A9, Cortex-A15, and generic ARM,
DEC Alpha 21064, 21164, and 21264,
AMD K8 and K10 (sold under many brands, e.g. Athlon64, Phenom, Opteron),
Bulldozer, and Bobcat,
Intel Pentium, Pentium Pro/II/III, Pentium 4, Core2, Nehalem, Sandy Bridge, Haswell, generic x86,
Intel IA-64,
Motorola/IBM PowerPC 32 and 64 such as POWER970, POWER5, POWER6, and POWER7,
MIPS 32-bit and 64-bit,
SPARC 32-bit and 64-bit with special support for all UltraSPARC models.
There is also assembly code for many obsolete CPUs.


@cindex Home page
@cindex Web page
@noindent
For up-to-date information on GMP, please see the GMP web pages at

@display
@uref{https://gmplib.org/}
@end display

@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version
@noindent
The latest version of the library is available at

@display
@uref{https://ftp.gnu.org/gnu/gmp/}
@end display

Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{https://www.gnu.org/order/ftp.html} for a full list.

@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the GMP
library, and one for bug reports. For more information, see

@display
@uref{https://gmplib.org/mailman/listinfo/}.
@end display

The proper place for bug reports is @email{gmp-bugs@@gmplib.org}.
See @ref{Reporting Bugs} for information about reporting bugs.

@sp 1
@section How to use this Manual
@cindex About this manual

Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}. If you have a system with multiple
ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
on applications.

The rest of the manual can be used for later reference, although it is
probably a good idea to glance through it.


@node Installing GMP, GMP Basics, Introduction to GMP, Top
@comment node-name, next, previous, up
@chapter Installing GMP
@cindex Installing GMP
@cindex Configuring GMP
@cindex Building GMP

GMP has an autoconf/automake/libtool based configuration system. On a
Unix-like system a basic build can be done with

@example
./configure
make
@end example

@noindent
Some self-tests can be run with

@example
make check
@end example

@noindent
And you can install (under @file{/usr/local} by default) with

@example
make install
@end example

If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
See @ref{Reporting Bugs}, for information on what to include in useful bug
reports.

@menu
* Build Options::
* ABI and ISA::
* Notes for Package Builds::
* Notes for Particular Systems::
* Known Build Problems::
* Performance optimization::
@end menu


@node Build Options, ABI and ISA, Installing GMP, Installing GMP
@section Build Options
@cindex Build options

All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary. The file @file{INSTALL.autoconf} has some generic
installation information too.

@table @asis
@item Tools
@cindex Non-Unix systems
@samp{configure} requires various Unix-like tools. See @ref{Notes for
Particular Systems}, for some options on non-Unix systems.

It might be possible to build without the help of @samp{configure}; certainly
all the code is there, but unfortunately you'll be on your own.

@item Build Directory
@cindex Build directory
To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
example

@example
cd /my/build/dir
/my/sources/gmp-@value{VERSION}/configure
@end example

Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
instead.

@item @option{--prefix} and @option{--exec-prefix}
@cindex Prefix
@cindex Exec prefix
@cindex Install prefix
@cindex @code{--prefix}
@cindex @code{--exec-prefix}
The @option{--prefix} option can be used in the normal way to direct GMP to
install under a particular tree. The default is @samp{/usr/local}.

@option{--exec-prefix} can be used to direct architecture-dependent files like
@file{libgmp.a} to a different location. This can be used to share
architecture-independent parts like the documentation, but separate the
dependent parts. Note however that @file{gmp.h} is
architecture-dependent since it encodes certain aspects of @file{libgmp}, so
it will be necessary to ensure both @file{$prefix/include} and
@file{$exec_prefix/include} are available to the compiler.
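
As a rough sketch of the point about @file{gmp.h}, the following shell fragment
composes a configure command with separate prefixes, together with the include
flags an application build would then pass. The locations @file{/opt/gmp} and
@file{/opt/gmp/x86_64} and the file @file{myprog.c} are hypothetical, and the
commands are only printed, not run.

```shell
# Hypothetical install locations; substitute your own.
prefix=/opt/gmp
exec_prefix=/opt/gmp/x86_64

# Command that would install shared files under $prefix and
# architecture-dependent files under $exec_prefix.
configure_cmd="./configure --prefix=$prefix --exec-prefix=$exec_prefix"

# Make both include directories visible to the compiler when building
# an application (myprog.c is a placeholder name).
include_flags="-I$prefix/include -I$exec_prefix/include"

echo "$configure_cmd"
echo "cc $include_flags -c myprog.c"
```

Printing the commands first makes it easy to check the paths before running
them in a real source tree.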

@item @option{--disable-shared}, @option{--disable-static}
@cindex @code{--disable-shared}
@cindex @code{--disable-static}
By default both shared and static libraries are built (where possible), but
one or the other can be disabled. Shared libraries result in smaller executables
and permit code sharing between separate running processes, but on some CPUs
are slightly slower, having a small cost on each function call.

@item Native Compilation, @option{--build=CPU-VENDOR-OS}
@cindex Native compilation
@cindex Build system
@cindex @code{--build}
For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type, on others it will be necessary to give it explicitly. For
example,

@example
./configure --build=ultrasparc-sun-solaris2.7
@end example

In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.

@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
@cindex Cross compiling
@cindex Host system
@cindex @code{--host}
When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
@end example

Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools.
The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way, it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.

Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be special
options on a native compiler. In any case @samp{./configure} avoids depending
on being able to run code on the build system, which is important when
creating binaries for a newer CPU since they very possibly won't run on the
build system.

In all cases the compiler must be able to produce an executable (of whatever
format) from a standard C @code{main}. Although only object files will go to
make up @file{libgmp}, @samp{./configure} uses linking tests for various
purposes, such as determining what functions are available on the host system.

Currently a warning is given unless an explicit @samp{--build} is used when
cross-compiling, because it may not be possible to correctly guess the build
system type if the @env{PATH} has only a cross-compiling @command{cc}.

Note that the @samp{--target} option is not appropriate for GMP@. It's for use
when building compiler tools, with @samp{--host} being where they will run,
and @samp{--target} what they'll produce code for. Ordinary programs or
libraries like GMP are only interested in the @samp{--host} part, being where
they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)

@item CPU types
@cindex CPU types
In general, if you want a library that runs as fast as possible, you should
configure GMP for the exact CPU type your system uses. However, this may mean
the binaries won't run on older members of the family, and might run slower on
other members, older or newer.
The best idea is always to build GMP for the
exact machine type you intend to run it on.

The following CPUs have specific support. See @file{configure.ac} for details
of what code and compiler options they select.

@itemize @bullet

@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub

@item
Alpha:
@nisamp{alpha},
@nisamp{alphaev5},
@nisamp{alphaev56},
@nisamp{alphapca56},
@nisamp{alphapca57},
@nisamp{alphaev6},
@nisamp{alphaev67},
@nisamp{alphaev68},
@nisamp{alphaev7}

@item
Cray:
@nisamp{c90},
@nisamp{j90},
@nisamp{t90},
@nisamp{sv1}

@item
HPPA:
@nisamp{hppa1.0},
@nisamp{hppa1.1},
@nisamp{hppa2.0},
@nisamp{hppa2.0n},
@nisamp{hppa2.0w},
@nisamp{hppa64}

@item
IA-64:
@nisamp{ia64},
@nisamp{itanium},
@nisamp{itanium2}

@item
MIPS:
@nisamp{mips},
@nisamp{mips3},
@nisamp{mips64}

@item
Motorola:
@nisamp{m68k},
@nisamp{m68000},
@nisamp{m68010},
@nisamp{m68020},
@nisamp{m68030},
@nisamp{m68040},
@nisamp{m68060},
@nisamp{m68302},
@nisamp{m68360},
@nisamp{m88k},
@nisamp{m88110}

@item
POWER:
@nisamp{power},
@nisamp{power1},
@nisamp{power2},
@nisamp{power2sc}

@item
PowerPC:
@nisamp{powerpc},
@nisamp{powerpc64},
@nisamp{powerpc401},
@nisamp{powerpc403},
@nisamp{powerpc405},
@nisamp{powerpc505},
@nisamp{powerpc601},
@nisamp{powerpc602},
@nisamp{powerpc603},
@nisamp{powerpc603e},
@nisamp{powerpc604},
@nisamp{powerpc604e},
@nisamp{powerpc620},
@nisamp{powerpc630},
@nisamp{powerpc740},
@nisamp{powerpc7400},
@nisamp{powerpc7450},
@nisamp{powerpc750},
@nisamp{powerpc801},
@nisamp{powerpc821},
@nisamp{powerpc823},
@nisamp{powerpc860},
@nisamp{powerpc970}

@item
SPARC:
@nisamp{sparc},
@nisamp{sparcv8},
@nisamp{microsparc},
@nisamp{supersparc},
@nisamp{sparcv9},
@nisamp{ultrasparc},
@nisamp{ultrasparc2},
@nisamp{ultrasparc2i},
@nisamp{ultrasparc3},
@nisamp{sparc64}

@item
x86 family:
@nisamp{i386},
@nisamp{i486},
@nisamp{i586},
@nisamp{pentium},
@nisamp{pentiummmx},
@nisamp{pentiumpro},
@nisamp{pentium2},
@nisamp{pentium3},
@nisamp{pentium4},
@nisamp{k6},
@nisamp{k62},
@nisamp{k63},
@nisamp{athlon},
@nisamp{amd64},
@nisamp{viac3},
@nisamp{viac32}

@item
Other:
@nisamp{arm},
@nisamp{sh},
@nisamp{sh2},
@nisamp{vax}
@end itemize

CPUs not listed will use generic C code.

@item Generic C Build
@cindex Generic C
If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with the configure option
@option{--disable-assembly}.

Note that this will run quite slowly, but it should be portable and should at
least make it possible to get something running if all else fails.

@item Fat binary, @option{--enable-fat}
@cindex Fat binary
@cindex @code{--enable-fat}
Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
optimized low level subroutines are chosen at runtime according to the CPU
detected. This means more code, but gives good performance on all x86 chips.
(This option might become available for more architectures in the future.)

@item @option{ABI}
@cindex ABI
On some systems GMP supports multiple ABIs (application binary interfaces),
meaning data type sizes and calling conventions. By default GMP chooses the
best ABI available, but a particular ABI can be selected.
For example

@example
./configure --host=mips64-sgi-irix6 ABI=n32
@end example

See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
applications need to do.

@item @option{CC}, @option{CFLAGS}
@cindex C compiler
@cindex @code{CC}
@cindex @code{CFLAGS}
By default the C compiler used is chosen from among some likely candidates,
with @command{gcc} normally preferred if it's present. The usual
@samp{CC=whatever} can be passed to @samp{./configure} to choose something
different.

For various systems, default compiler flags are set based on the CPU and
compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
@samp{./configure} to use something different or to set good flags for systems
GMP doesn't otherwise know.

The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
and can be found in each generated @file{Makefile}. This is the easiest way
to check the defaults when considering changing or adding something.

Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
supporting multiple ABIs it's important to give an explicit
@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
won't be able to select the correct assembly code.

If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
be used to force the use of GCC, with default flags (and default ABI).

@item @option{CPPFLAGS}
@cindex @code{CPPFLAGS}
Any flags like @samp{-D} defines or @samp{-I} includes required by the
preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
preprocessing uses just @samp{CPPFLAGS}.
This distinction is because most
preprocessors won't accept all the flags the compiler does. Preprocessing is
done separately in some configure tests.

@item @option{CC_FOR_BUILD}
@cindex @code{CC_FOR_BUILD}
Some build-time programs are compiled and run to generate host-specific data
tables. @samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need
to be in any particular ABI or mode, it merely needs to generate executables
that can run. The default is to try the selected @samp{CC} and some likely
candidates such as @samp{cc} and @samp{gcc}, looking for something that works.

No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
@samp{cc foo.c} should be enough. If some particular options are required
they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.

@item C++ Support, @option{--enable-cxx}
@cindex C++ support
@cindex @code{--enable-cxx}
C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
can be used to enable C++ support only if a compiler can be found. The C++
support consists of a library @file{libgmpxx.la} and header file
@file{gmpxx.h} (@pxref{Headers and Libraries}).

A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
bloated by a dependency on the C++ standard library, and to avoid any chance
that the C++ compiler could be required when linking plain C programs.

@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
only be expected to work with @file{libgmp.la} from the same GMP version.
Future changes to the relevant internals will be accompanied by renaming, so a
mismatch will cause unresolved symbols rather than perhaps mysterious
misbehaviour.

In general @file{libgmpxx.la} will be usable only with the C++ compiler that
built it, since name mangling and runtime support are usually incompatible
between different compilers.

@item @option{CXX}, @option{CXXFLAGS}
@cindex C++ compiler
@cindex @code{CXX}
@cindex @code{CXXFLAGS}
When C++ support is enabled, the C++ compiler and its flags can be set with
variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
@samp{CXX} is the first compiler that works from a list of likely candidates,
with @command{g++} normally preferred when available. The default for
@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
usually suit @samp{g++}.

It's important that the C and C++ compilers match, meaning their startup and
runtime support routines are compatible and that they generate code in the
same ABI (if there's a choice of ABIs on the system). @samp{./configure}
isn't currently able to check these things very well itself, so for that
reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
compiler mismatch. Perhaps this will change in the future.

Incidentally, it's normally not good enough to set @samp{CXX} to the same as
@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
C++ code, only @command{g++} will invoke the linker the right way when
building an executable or shared library from C++ object files.
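
As an illustration of a matching pair of compilers, a build using the GNU C
and C++ compilers together might be configured along the following lines.
The compiler names here are only an assumption about what is installed, and
on a system with a choice of ABIs an explicit @samp{ABI=whatever} would
normally be wanted as well.

@example
./configure --enable-cxx CC=gcc CXX=g++
@end example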

@item Temporary Memory, @option{--enable-alloca=<choice>}
@cindex Temporary memory
@cindex Stack overflow
@cindex @code{alloca}
@cindex @code{--enable-alloca}
GMP allocates temporary workspace using one of the following three methods,
which can be selected with for instance
@samp{--enable-alloca=malloc-reentrant}.

@itemize @bullet
@item
@samp{alloca} - C library or compiler builtin.
@item
@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
@item
@samp{malloc-notreentrant} - the heap, with global variables.
@end itemize

For convenience, the following choices are also available.
@samp{--disable-alloca} is the same as @samp{no}.

@itemize @bullet
@item
@samp{yes} - a synonym for @samp{alloca}.
@item
@samp{no} - a synonym for @samp{malloc-reentrant}.
@item
@samp{reentrant} - @code{alloca} if available, otherwise
@samp{malloc-reentrant}. This is the default.
@item
@samp{notreentrant} - @code{alloca} if available, otherwise
@samp{malloc-notreentrant}.
@end itemize

@code{alloca} is reentrant and fast, and is recommended. It actually allocates
just small blocks on the stack; larger ones use malloc-reentrant.

@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
not required.

The two malloc methods in fact use the memory allocation functions selected by
@code{mp_set_memory_functions}, these being @code{malloc} and friends by
default. @xref{Custom Allocation}.

An additional choice @samp{--enable-alloca=debug} is available, to help when
debugging memory related problems (@pxref{Debugging}).
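
Going by the synonyms listed above, each of the following invocations should
select the same method, namely temporary workspace from the heap in a
re-entrant fashion. This is only a sketch of the equivalent spellings, and
whether avoiding @code{alloca} is worthwhile depends on the application.

@example
./configure --enable-alloca=malloc-reentrant
./configure --enable-alloca=no
./configure --disable-alloca
@end example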

@item FFT Multiplication, @option{--disable-fft}
@cindex FFT multiplication
@cindex @code{--disable-fft}
By default multiplications are done using Karatsuba, 3-way Toom, higher degree
Toom, and Fermat FFT@. The FFT is only used on large to very large operands
and can be disabled to save code size if desired.

@item Assertion Checking, @option{--enable-assert}
@cindex Assertion checking
@cindex @code{--enable-assert}
This option enables some consistency checking within the library. This can be
of use while debugging, @pxref{Debugging}.

@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
@cindex Execution profiling
@cindex @code{--enable-profiling}
Enable profiling support, in one of various styles, @pxref{Profiling}.

@item @option{MPN_PATH}
@cindex @code{MPN_PATH}
Various assembly versions of each mpn subroutine are provided. For a given
CPU, a search is made through a path to choose a version of each. For example
@samp{sparcv8} has

@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example

which means look first for v8 code, then plain sparc32 (which is v7), and
finally fall back on generic C@. Knowledgeable users with special requirements
can specify a different path. Normally this is completely unnecessary.

@item Documentation
@cindex Documentation formats
@cindex Texinfo
The source for the document you're now reading is @file{doc/gmp.texi}, in
Texinfo format, see @GMPreftop{texinfo, Texinfo}.

@cindex Postscript
@cindex DVI
@cindex PDF
Info format @samp{doc/gmp.info} is included in the distribution. The usual
automake targets are available to make PostScript, DVI, PDF and HTML (these
will require various @TeX{} and Texinfo tools).
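
Assuming the standard automake target names and an already configured build
tree, the other formats might be produced with something like the following.
The choice of running @command{make} within the @file{doc} subdirectory is an
assumption made for this sketch.

@example
cd doc
make pdf
make html
@end example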

@cindex DocBook
@cindex XML
DocBook and XML can be generated by the Texinfo @command{makeinfo} program
too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
Texinfo}.

Some supplementary notes can also be found in the @file{doc} subdirectory.

@end table


@need 2000
@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@section ABI and ISA
@cindex ABI
@cindex Application Binary Interface
@cindex ISA
@cindex Instruction Set Architecture

ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are. ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.

Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family. GMP supports some
CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it. For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
@code{long long}.

By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed. But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or particular
application requirements. For example,

@example
./configure ABI=32
@end example

In all cases it's vital that all object code used in a given program is
compiled for the same ABI.

Usually a limb is implemented as a @code{long}. When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}.
This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around. @file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.

Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI@. This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that. Note that builds for different ABIs need to be done separately,
with a fresh @command{./configure} and @command{make} each.

@sp 1
@table @asis
@need 1000
@item AMD64 (@samp{x86_64})
@cindex AMD64
On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
following ABI choices are available.

@table @asis
@item @samp{ABI=64}
The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
architecture. This is the default. Applications will usually not need
special compiler flags, but for reference the option is

@example
gcc -m64
@end example

@item @samp{ABI=32}
The 32-bit ABI is the usual i386 conventions. This will be slower, and is not
recommended except for inter-operating with other code not yet 64-bit capable.
Applications must be compiled with

@example
gcc -m32
@end example

(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)

@item @samp{ABI=x32}
The x32 ABI uses 64-bit limbs but 32-bit pointers. Like the 64-bit ABI, it
makes full use of the chip's arithmetic capabilities. This ABI is not
supported by all operating systems.

@example
gcc -mx32
@end example

@end table

@sp 1
@need 1000
@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
@cindex HPPA
@cindex HP-UX
@table @asis
@item @samp{ABI=2.0w}
The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
up. Applications must be compiled with

@example
gcc [built for 2.0w]
cc +DD64
@end example

@item @samp{ABI=2.0n}
The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
conventions, but with 64-bit instructions permitted within functions. GMP
uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64
GNU/Linux and on HP-UX 10 or higher. Applications must be compiled with

@example
gcc [built for 2.0n]
cc +DA2.0 +e
@end example

Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
instructions for @code{long long} operations and so may be slower than for
2.0w. (The GMP assembly code is the same though.)

@item @samp{ABI=1.0}
HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
No special compiler options are needed for applications.
@end table

All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
considered.

Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
unlike HP @command{cc}. Instead it must be built for one or the other ABI@.
GMP will detect how it was built, and skip to the corresponding @samp{ABI}.

@sp 1
@need 1500
@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
@cindex IA-64
@cindex HP-UX
HP-UX supports two ABIs for IA-64. GMP performance is the same in both.

@table @asis
@item @samp{ABI=32}
In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
uses a 64 bit @code{long long} for a limb. Applications can be compiled
without any special flags since this ABI is the default in both HP C and GCC,
but for reference the flags are

@example
gcc -milp32
cc +DD32
@end example

@item @samp{ABI=64}
In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
@code{long} for a limb. Applications must be compiled with

@example
gcc -mlp64
cc +DD64
@end example
@end table

On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
choice.

@sp 1
@need 1000
@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
@cindex MIPS
@cindex IRIX
IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
and 64. n32 or 64 are recommended, and GMP performance will be the same in
each. The default is n32.

@table @asis
@item @samp{ABI=o32}
The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. GMP
will be slower than in n32 or 64, this option only exists to support old
compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special
flags on an old compiler, or on a newer compiler with

@example
gcc -mabi=32
cc -32
@end example

@item @samp{ABI=n32}
The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
@code{long long}. Applications must be compiled with

@example
gcc -mabi=n32
cc -n32
@end example

@item @samp{ABI=64}
The 64-bit ABI is 64-bit pointers and integers.
Applications must be compiled
with

@example
gcc -mabi=64
cc -64
@end example
@end table

Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.

@sp 1
@need 1000
@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
@cindex PowerPC
@table @asis
@item @samp{ABI=mode64}
@cindex AIX
The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
@samp{*-*-aix*} systems. Applications must be compiled with

@example
gcc -maix64
xlc -q64
@end example

On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must
be compiled with

@example
gcc -m64
@end example

@item @samp{ABI=mode32}
The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
still in 32-bit mode and using 32-bit calling conventions. This is the default
for systems where the true 64-bit ABI is unavailable. No special compiler
options are typically needed for applications. This ABI is not available under
AIX.

@item @samp{ABI=32}
This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler
options are needed for applications.
@end table

GMP's speed is greatest for the @samp{mode64} ABI, the @samp{mode32} ABI is 2nd
best. In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full
use of a 64-bit chip.
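
As a sketch of combining these settings on a 64-bit PowerPC GNU/Linux system,
an explicit ABI might be given together with the matching compiler flag when
@samp{CC} and @samp{CFLAGS} are set by hand. The @samp{-O2} here is only an
assumed placeholder for whatever optimization flags are wanted.

@example
./configure ABI=mode64 CC=gcc CFLAGS="-O2 -m64"
@end example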

@sp 1
@need 1000
@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
@cindex Sparc V9
@cindex Solaris
@cindex Sun
@table @asis
@item @samp{ABI=64}
The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
64-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required. On
GNU/Linux, depending on the default @command{gcc} mode, applications must be
compiled with

@example
gcc -m64
@end example

On Solaris applications must be compiled with

@example
gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
cc -xarch=v9
@end example

On the BSD sparc64 systems no special options are required, since 64-bits is
the only ABI available.

@item @samp{ABI=32}
For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can. In
the Sun documentation this combination is known as ``v8plus''. On GNU/Linux,
depending on the default @command{gcc} mode, applications may need to be
compiled with

@example
gcc -m32
@end example

On Solaris, no special compiler options are required for applications, though
using something like the following is recommended. (@command{gcc} 2.8 and
earlier only support @samp{-mv8} though.)

@example
gcc -mv8plus
cc -xarch=v8plus
@end example
@end table

GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
The speed is partly because there are extra registers available and partly
because 64-bits is considered the more important case and has therefore had
better code written for it.

Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
options, they're called @samp{arch} but effectively control both ABI and ISA@.

On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
doesn't save all registers.

On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
reject @samp{ABI=64} because the resulting executables won't run.
@samp{ABI=64} can still be built if desired by making it look like a
cross-compile, for example

@example
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
@end example
@end table


@need 2000
@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
@section Notes for Package Builds
@cindex Build notes for binary packaging
@cindex Packaged builds

GMP should present no great difficulties for packaging in a binary
distribution.

@cindex Libtool versioning
@cindex Shared library versioning
Libtool is used to build the library and @samp{-version-info} is set
appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
Library interface versions, Library interface versions, libtool, GNU
Libtool}).

The GMP 4 series will be upwardly binary compatible in each release and will
be upwardly binary compatible with all of the GMP 3 series. Additional
function interfaces may be added in each release, so on systems where libtool
versioning is not fully checked by the loader an auxiliary mechanism may be
needed to express that a dynamic linked application depends on a new enough
GMP.

An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
from the same GMP version, since this is not done by the libtool versioning,
nor otherwise. A mismatch will result in unresolved symbols from the linker,
or perhaps the loader.

When building a package for a CPU family, care should be taken to use
@samp{--host} (or @samp{--build}) to choose the least common denominator among
the CPUs which might use the package. For example this might mean plain
@samp{sparc} (meaning V7) for SPARCs.

For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
runtime selection of optimized low level routines. This is a good choice for
packaging to run on a range of x86 chips.

Users who care about speed will want GMP built for their exact CPU type, to
make best use of the available optimizations. Providing a way to suitably
rebuild a package may be useful. This could be as simple as making it
possible for a user to omit @samp{--build} (and @samp{--host}) so
@samp{./config.guess} will detect the CPU@. But a way to manually specify a
@samp{--build} will be wanted for systems where @samp{./config.guess} is
inexact.

On systems with multiple ABIs, a packaged build will need to decide which
among the choices is to be provided, see @ref{ABI and ISA}. A given run of
@samp{./configure} etc will only build one ABI@. If a second ABI is also
required then a second run of @samp{./configure} etc must be made, starting
from a clean directory tree (@samp{make distclean}).

As noted under ``ABI and ISA'', currently no attempt is made to follow system
conventions for install locations that vary with ABI, such as
@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
@samp{ABI=32}. A package build can override @samp{libdir} and other standard
variables as necessary.

Note that @file{gmp.h} is a generated file, and will be architecture and ABI
dependent. When attempting to install two ABIs simultaneously it will be
important that an application compile gets the correct @file{gmp.h} for its
desired ABI@.
If compiler include paths don't vary with ABI options then it
might be necessary to create a @file{/usr/include/gmp.h} which tests
preprocessor symbols and chooses the correct actual @file{gmp.h}.


@need 2000
@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
@section Notes for Particular Systems
@cindex Build notes for particular systems
@cindex Particular systems
@cindex Systems
@table @asis

@c This section is more or less meant for notes about performance or about
@c build problems that have been worked around but might leave a user
@c scratching their head. Fun with different ABIs on a system belongs in the
@c above section.

@item AIX 3 and 4
@cindex AIX
On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
some versions of the native @command{ar} fail on the convenience libraries
used. A shared build can be attempted with

@example
./configure --enable-shared --disable-static
@end example

Note that the @samp{--disable-static} is necessary because in a shared build
libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
the benefit of old versions of @command{ld} which only recognise @file{.a},
but unfortunately this is done even if a fully functional @command{ld} is
available.

@item ARM
@cindex ARM
On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
bug in unsigned division, giving wrong results for some operands. GMP
@samp{./configure} will demand GCC 2.95.4 or later.

@item Compaq C++
@cindex Compaq C++
Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
an old pre-standard one (see @samp{man iostream_intro}).
GMP can only use the
standard one, which unfortunately is not the default but must be selected by
defining @code{__USE_STD_IOSTREAM}. Configure with for instance

@example
./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
@end example

@item Floating Point Mode
@cindex Floating point mode
@cindex Hardware floating point mode
@cindex Precision of hardware floating point
@cindex x87
On some systems, the hardware floating point has a control mode which can set
all operations to be done in a particular precision, for instance single,
double or extended on x86 systems (x87 floating point). The GMP functions
involving a @code{double} cannot be expected to operate to their full
precision when the hardware is in single precision mode. Of course this
affects all code, including application code, not just GMP.

@item FreeBSD 7.x, 8.x, 9.0, 9.1, 9.2
@cindex FreeBSD
@command{m4} in these releases of FreeBSD has an eval function which ignores
its 2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
processing. @samp{./configure} will detect the problem and either abort or
choose another m4 in the @env{PATH}. The bug is fixed in FreeBSD 9.3 and 10.0,
so either upgrade or use GNU m4. Note that the FreeBSD package system installs
GNU m4 under the name @samp{gm4}, which GMP cannot guess.

@item FreeBSD 7.x, 8.x, 9.x
@cindex FreeBSD
GMP releases starting with 6.0 do not support @samp{ABI=32} on FreeBSD/amd64
prior to release 10.0 of the system. The cause is a broken @code{limits.h},
which GMP no longer works around.

@item MS-DOS and MS Windows
@cindex MS-DOS
@cindex MS Windows
@cindex Windows
@cindex Cygwin
@cindex DJGPP
@cindex MINGW
On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
system Cygwin, DJGPP and MINGW can be used.
All three are excellent ports of
GCC and the various GNU tools.

@display
@uref{http://www.cygwin.com/}
@uref{http://www.delorie.com/djgpp/}
@uref{http://www.mingw.org/}
@end display

@cindex Interix
@cindex Services for Unix
Microsoft also publishes an Interix ``Services for Unix'' which can be used to
build GMP on Windows (with a normal @samp{./configure}), but it's not free
software.

@item MS Windows DLLs
@cindex DLLs
@cindex MS Windows
@cindex Windows
On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
default GMP builds only a static library, but a DLL can be built instead using

@example
./configure --disable-static --enable-shared
@end example

Static and DLL libraries can't both be built, since certain export directives
in @file{gmp.h} must be different.

A MINGW DLL build of GMP can be used with Microsoft C@. Libtool doesn't
install a @file{.lib} format import library, but it can be created with MS
@command{lib} as follows, and copied to the install directory. Similarly for
@file{libmp} and @file{libgmpxx}.

@example
cd .libs
lib /def:libgmp-3.dll.def /out:libgmp-3.lib
@end example

MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
the same. If one of the other C runtime library choices provided by MS C is
desired then the suggestion is to use the GMP string functions and confine I/O
to the application.

@item Motorola 68k CPU Types
@cindex 68000
@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
series chips.
@samp{m68302} can be used for ``Dragonball'' series chips,
though this is merely a synonym for @samp{m68000}.

@item NetBSD 5.x
@cindex NetBSD
@command{m4} in these releases of NetBSD has an eval function which ignores its
2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
processing. @samp{./configure} will detect the problem and either abort or
choose another m4 in the @env{PATH}. The bug is fixed in NetBSD 6, so either
upgrade or use GNU m4. Note that the NetBSD package system installs GNU m4
under the name @samp{gm4}, which GMP cannot guess.

@item OpenBSD 2.6
@cindex OpenBSD
@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
unsuitable for @file{.asm} file processing. @samp{./configure} will detect
the problem and either abort or choose another m4 in the @env{PATH}. The bug
is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.

@item Power CPU Types
@cindex Power/PowerPC
In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
not available on the other, so it's important to choose the right one for the
CPU that will be used. Currently GMP has no assembly code support for using
just the common instruction subset. To get executables that run on both, the
current suggestion is to use the generic C code (@option{--disable-assembly}),
possibly with appropriate compiler options (like @samp{-mcpu=common} for
@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
workstations) is accepted by @file{config.sub}, but is currently equivalent to
@option{--disable-assembly}.

@item Sparc CPU Types
@cindex Sparc
@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
significant performance increase over the V7 code selected by plain
@samp{sparc}.

@item Sparc App Regs
@cindex Sparc
The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
Options, gcc, Using the GNU Compiler Collection (GCC)}).

This makes that code unsuitable for use with the special V9
@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
for applications wanting to use those registers for special purposes. In these
cases the only suggestion currently is to build GMP with
@option{--disable-assembly} to avoid the assembly code.

@item SunOS 4
@cindex SunOS
@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
files, and instead @samp{./configure} will automatically use
@command{/usr/5bin/m4}, which we believe is always available (if not then use
GNU m4).

@item x86 CPU Types
@cindex x86
@cindex 80x86
@cindex i386
@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
P-III)@. @samp{i386} is a better choice when making binaries that must run on
both.

@item x86 MMX and SSE2 Code
@cindex MMX
@cindex SSE2
If the CPU selected has MMX code but the assembler doesn't support it, a
warning is given and non-MMX code is used instead. This will be an inferior
build, since the MMX code that's present is there because it's faster than the
corresponding plain integer code. The same applies to SSE2.

Old versions of @samp{gas} don't support MMX instructions, in particular
version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1
doesn't.

Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
to register @code{movq} instructions, and so can't be used for MMX code.
Install a recent @command{gas} if MMX code is wanted on these systems.
@end table


@need 2000
@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
@section Known Build Problems
@cindex Build problems known

@c This section is more or less meant for known build problems that are not
@c otherwise worked around and require some sort of manual intervention.

You might find more up-to-date information at @uref{https://gmplib.org/}.

@table @asis
@item Compiler link options
The version of libtool currently in use rather aggressively strips compiler
options when linking a shared library. This will hopefully be relaxed in the
future, but for now if this is a problem the suggestion is to create a little
script to hide them, and for instance configure with

@example
./configure CC=gcc-with-my-options
@end example

@item DJGPP (@samp{*-*-msdosdjgpp*})
@cindex DJGPP
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script, it exits silently, having died writing a preamble to
@file{config.log}. Use @command{bash} 2.04 or higher.

@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64Mb available.
Running @samp{make libgmp.la} directly helped, perhaps recursing into the
various subdirectories uses up memory.

@item GNU binutils @command{strip} prior to 2.12
@cindex Stripped libraries
@cindex Binutils @command{strip}
@cindex GNU @command{strip}
@command{strip} from GNU binutils 2.11 and earlier should not be used on the
static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
but the last of multiple archive members with the same name, like the three
versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
used successfully.

The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
this and any version of @command{strip} can be used on them.

@item @command{make} syntax error
@cindex SCO
@cindex IRIX
On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
is unable to handle the long dependencies list for @file{libgmp.la}. The
symptom is a ``syntax error'' on the following line of the top-level
@file{Makefile}.

@example
libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
@end example

Either use GNU Make, or as a workaround remove
@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
build work, but if any recompiling is done @file{libgmp.la} might not be
rebuilt).

@item MacOS X (@samp{*-*-darwin*})
@cindex MacOS X
@cindex Darwin
Libtool currently only knows how to create shared libraries on MacOS X using
the native @command{cc} (which is a modified GCC), not a plain GCC@. A
static-only build should work though (@samp{--disable-shared}).

@item NeXT prior to 3.3
@cindex NeXT
The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}. This compiler cannot be used to build GMP, you
need to get a real GCC, and install that. (NeXT may have fixed this in
release 3.3 of their system.)

@item POWER and PowerPC
@cindex Power/PowerPC
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
PowerPC@.  If you want to use GCC for these machines, get GCC 2.7.2.1 (or
later).

@item Sequent Symmetry
@cindex Sequent Symmetry
Use the GNU assembler instead of the system assembler, since the latter has
serious bugs.

@item Solaris 2.6
@cindex Solaris
The system @command{sed} prints an error ``Output line too long'' when libtool
builds @file{libgmp.la}.  This doesn't seem to cause any obvious ill effects,
but GNU @command{sed} is recommended, to avoid any doubt.

@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
@cindex Solaris
A shared library build of GMP seems to fail in this combination: it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}.  The exact cause is unknown;
@samp{--disable-shared} is recommended.
@end table


@need 2000
@node Performance optimization, , Known Build Problems, Installing GMP
@section Performance optimization
@cindex Optimizing performance

@c At some point, this should perhaps move to a separate chapter on optimizing
@c performance.

For optimal performance, build GMP for the exact CPU type of the target
computer, see @ref{Build Options}.

Unlike what is the case for most other programs, the compiler typically
doesn't matter much, since GMP uses assembly language for the most critical
operations.

In particular for long-running GMP applications, and applications demanding
extremely large numbers, building and running the @code{tuneup} program in the
@file{tune} subdirectory can be important.
For example, 1790 1791 @example 1792 cd tune 1793 make tuneup 1794 ./tuneup 1795 @end example 1796 1797 will generate better contents for the @file{gmp-mparam.h} parameter file. 1798 1799 To use the results, put the output in the file indicated in the 1800 @samp{Parameters for ...} header. Then recompile from scratch. 1801 1802 The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which 1803 instructs the program how long to check FFT multiply parameters. If you're 1804 going to use GMP for extremely large numbers, you may want to run @code{tuneup} 1805 with a large NNN value. 1806 1807 1808 @node GMP Basics, Reporting Bugs, Installing GMP, Top 1809 @comment node-name, next, previous, up 1810 @chapter GMP Basics 1811 @cindex Basics 1812 1813 @strong{Using functions, macros, data types, etc.@: not documented in this 1814 manual is strongly discouraged. If you do so your application is guaranteed 1815 to be incompatible with future versions of GMP.} 1816 1817 @menu 1818 * Headers and Libraries:: 1819 * Nomenclature and Types:: 1820 * Function Classes:: 1821 * Variable Conventions:: 1822 * Parameter Conventions:: 1823 * Memory Management:: 1824 * Reentrancy:: 1825 * Useful Macros and Constants:: 1826 * Compatibility with older versions:: 1827 * Demonstration Programs:: 1828 * Efficiency:: 1829 * Debugging:: 1830 * Profiling:: 1831 * Autoconf:: 1832 * Emacs:: 1833 @end menu 1834 1835 @node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics 1836 @section Headers and Libraries 1837 @cindex Headers 1838 1839 @cindex @file{gmp.h} 1840 @cindex Include files 1841 @cindex @code{#include} 1842 All declarations needed to use GMP are collected in the include file 1843 @file{gmp.h}. It is designed to work with both C and C++ compilers. 
1844 1845 @example 1846 #include <gmp.h> 1847 @end example 1848 1849 @cindex @code{stdio.h} 1850 Note however that prototypes for GMP functions with @code{FILE *} parameters 1851 are only provided if @code{<stdio.h>} is included too. 1852 1853 @example 1854 #include <stdio.h> 1855 #include <gmp.h> 1856 @end example 1857 1858 @cindex @code{stdarg.h} 1859 Likewise @code{<stdarg.h>} is required for prototypes with @code{va_list} 1860 parameters, such as @code{gmp_vprintf}. And @code{<obstack.h>} for prototypes 1861 with @code{struct obstack} parameters, such as @code{gmp_obstack_printf}, when 1862 available. 1863 1864 @cindex Libraries 1865 @cindex Linking 1866 @cindex @code{libgmp} 1867 All programs using GMP must link against the @file{libgmp} library. On a 1868 typical Unix-like system this can be done with @samp{-lgmp}, for example 1869 1870 @example 1871 gcc myprogram.c -lgmp 1872 @end example 1873 1874 @cindex @code{libgmpxx} 1875 GMP C++ functions are in a separate @file{libgmpxx} library. This is built 1876 and installed if C++ support has been enabled (@pxref{Build Options}). For 1877 example, 1878 1879 @example 1880 g++ mycxxprog.cc -lgmpxx -lgmp 1881 @end example 1882 1883 @cindex Libtool 1884 GMP is built using Libtool and an application can use that to link if desired, 1885 @GMPpxreftop{libtool, GNU Libtool}. 1886 1887 If GMP has been installed to a non-standard location then it may be necessary 1888 to use @samp{-I} and @samp{-L} compiler options to point to the right 1889 directories, and some sort of run-time path for a shared library. 1890 1891 1892 @node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics 1893 @section Nomenclature and Types 1894 @cindex Nomenclature 1895 @cindex Types 1896 1897 @cindex Integer 1898 @tindex @code{mpz_t} 1899 In this manual, @dfn{integer} usually means a multiple precision integer, as 1900 defined by the GMP library. The C data type for such integers is @code{mpz_t}. 
Here are some examples of how to declare such integers:

@example
mpz_t sum;

struct foo @{ mpz_t x, y; @};

mpz_t vec[20];
@end example

@cindex Rational number
@tindex @code{mpq_t}
@dfn{Rational number} means a multiple precision fraction.  The C data type
for these fractions is @code{mpq_t}.  For example:

@example
mpq_t quotient;
@end example

@cindex Floating-point number
@tindex @code{mpf_t}
@dfn{Floating point number}, or @dfn{Float} for short, is an arbitrary precision
mantissa with a limited precision exponent.  The C data type for such objects
is @code{mpf_t}.  For example:

@example
mpf_t fp;
@end example

@tindex @code{mp_exp_t}
The floating point functions accept and return exponents in the C type
@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
it's an @code{int} for efficiency.

@cindex Limb
@tindex @code{mp_limb_t}
A @dfn{limb} means the part of a multi-precision number that fits in a single
machine word.  (We chose this word because a limb of the human body is
analogous to a digit, only larger, and containing several digits.)  Normally a
limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.

@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
systems it's an @code{int} for efficiency, and on some systems it will be
@code{long long} in the future.

@tindex @code{mp_bitcnt_t}
Counts of bits of a multi-precision number are represented in the C type
@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
some systems it will be an @code{unsigned long long} in the future.

@cindex Random state
@tindex @code{gmp_randstate_t}
@dfn{Random state} means an algorithm selection and current state data.  The C
data type for such objects is @code{gmp_randstate_t}.  For example:

@example
gmp_randstate_t rstate;
@end example

Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
@code{size_t} is used for byte or character counts.


@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
@section Function Classes
@cindex Function classes

There are five classes of functions in the GMP library:

@enumerate
@item
Functions for signed integer arithmetic, with names beginning with
@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
functions in this class.  (@pxref{Integer Functions})

@item
Functions for rational number arithmetic, with names beginning with
@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 35
functions in this class, but the integer functions can be used for arithmetic
on the numerator and denominator separately.  (@pxref{Rational Number
Functions})

@item
Functions for floating-point arithmetic, with names beginning with
@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 70
functions in this class.  (@pxref{Floating-point Functions})

@item
Fast low-level functions that operate on natural numbers.  These are used by
the functions in the preceding groups, and you can also call them directly
from very time-critical user programs.  These functions' names begin with
@code{mpn_}.  The associated type is array of @code{mp_limb_t}.  There are
about 60 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})

@item
Miscellaneous functions.
Functions for setting up custom allocation and 1999 functions for generating random numbers. (@pxref{Custom Allocation}, and 2000 @pxref{Random Number Functions}) 2001 @end enumerate 2002 2003 2004 @node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics 2005 @section Variable Conventions 2006 @cindex Variable conventions 2007 @cindex Conventions for variables 2008 2009 GMP functions generally have output arguments before input arguments. This 2010 notation is by analogy with the assignment operator. The BSD MP compatibility 2011 functions are exceptions, having the output arguments last. 2012 2013 GMP lets you use the same variable for both input and output in one call. For 2014 example, the main function for integer multiplication, @code{mpz_mul}, can be 2015 used to square @code{x} and put the result back in @code{x} with 2016 2017 @example 2018 mpz_mul (x, x, x); 2019 @end example 2020 2021 Before you can assign to a GMP variable, you need to initialize it by calling 2022 one of the special initialization functions. When you're done with a 2023 variable, you need to clear it out, using one of the functions for that 2024 purpose. Which function to use depends on the type of variable. See the 2025 chapters on integer functions, rational number functions, and floating-point 2026 functions for details. 2027 2028 A variable should only be initialized once, or at least cleared between each 2029 initialization. After a variable has been initialized, it may be assigned to 2030 any number of times. 2031 2032 For efficiency reasons, avoid excessive initializing and clearing. In 2033 general, initialize near the start of a function and clear near the end. 
For 2034 example, 2035 2036 @example 2037 void 2038 foo (void) 2039 @{ 2040 mpz_t n; 2041 int i; 2042 mpz_init (n); 2043 for (i = 1; i < 100; i++) 2044 @{ 2045 mpz_mul (n, @dots{}); 2046 mpz_fdiv_q (n, @dots{}); 2047 @dots{} 2048 @} 2049 mpz_clear (n); 2050 @} 2051 @end example 2052 2053 2054 @node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics 2055 @section Parameter Conventions 2056 @cindex Parameter conventions 2057 @cindex Conventions for parameters 2058 2059 When a GMP variable is used as a function parameter, it's effectively a 2060 call-by-reference, meaning if the function stores a value there it will change 2061 the original in the caller. Parameters which are input-only can be designated 2062 @code{const} to provoke a compiler error or warning on attempting to modify 2063 them. 2064 2065 When a function is going to return a GMP result, it should designate a 2066 parameter that it sets, like the library functions do. More than one value 2067 can be returned by having more than one output parameter, again like the 2068 library functions. A @code{return} of an @code{mpz_t} etc doesn't return the 2069 object, only a pointer, and this is almost certainly not what's wanted. 2070 2071 Here's an example accepting an @code{mpz_t} parameter, doing a calculation, 2072 and storing the result to the indicated parameter. 2073 2074 @example 2075 void 2076 foo (mpz_t result, const mpz_t param, unsigned long n) 2077 @{ 2078 unsigned long i; 2079 mpz_mul_ui (result, param, n); 2080 for (i = 1; i < n; i++) 2081 mpz_add_ui (result, result, i*7); 2082 @} 2083 2084 int 2085 main (void) 2086 @{ 2087 mpz_t r, n; 2088 mpz_init (r); 2089 mpz_init_set_str (n, "123456", 0); 2090 foo (r, n, 20L); 2091 gmp_printf ("%Zd\n", r); 2092 return 0; 2093 @} 2094 @end example 2095 2096 @code{foo} works even if the mainline passes the same variable for 2097 @code{param} and @code{result}, just like the library functions. 
But 2098 sometimes it's tricky to make that work, and an application might not want to 2099 bother supporting that sort of thing. 2100 2101 For interest, the GMP types @code{mpz_t} etc are implemented as one-element 2102 arrays of certain structures. This is why declaring a variable creates an 2103 object with the fields GMP needs, but then using it as a parameter passes a 2104 pointer to the object. Note that the actual fields in each @code{mpz_t} etc 2105 are for internal use only and should not be accessed directly by code that 2106 expects to be compatible with future GMP releases. 2107 2108 2109 @need 1000 2110 @node Memory Management, Reentrancy, Parameter Conventions, GMP Basics 2111 @section Memory Management 2112 @cindex Memory management 2113 2114 The GMP types like @code{mpz_t} are small, containing only a couple of sizes, 2115 and pointers to allocated data. Once a variable is initialized, GMP takes 2116 care of all space allocation. Additional space is allocated whenever a 2117 variable doesn't have enough. 2118 2119 @code{mpz_t} and @code{mpq_t} variables never reduce their allocated space. 2120 Normally this is the best policy, since it avoids frequent reallocation. 2121 Applications that need to return memory to the heap at some particular point 2122 can use @code{mpz_realloc2}, or clear variables no longer needed. 2123 2124 @code{mpf_t} variables, in the current implementation, use a fixed amount of 2125 space, determined by the chosen precision and allocated at initialization, so 2126 their size doesn't change. 2127 2128 All memory is allocated using @code{malloc} and friends by default, but this 2129 can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is 2130 also used (via @code{alloca}), but this can be changed at build-time if 2131 desired, see @ref{Build Options}. 
2132 2133 2134 @node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics 2135 @section Reentrancy 2136 @cindex Reentrancy 2137 @cindex Thread safety 2138 @cindex Multi-threading 2139 2140 @noindent 2141 GMP is reentrant and thread-safe, with some exceptions: 2142 2143 @itemize @bullet 2144 @item 2145 If configured with @option{--enable-alloca=malloc-notreentrant} (or with 2146 @option{--enable-alloca=notreentrant} when @code{alloca} is not available), 2147 then naturally GMP is not reentrant. 2148 2149 @item 2150 @code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the 2151 selected precision. @code{mpf_init2} can be used instead, and in the C++ 2152 interface an explicit precision to the @code{mpf_class} constructor. 2153 2154 @item 2155 @code{mpz_random} and the other old random number functions use a global 2156 random state and are hence not reentrant. The newer random number functions 2157 that accept a @code{gmp_randstate_t} parameter can be used instead. 2158 2159 @item 2160 @code{gmp_randinit} (obsolete) returns an error indication through a global 2161 variable, which is not thread safe. Applications are advised to use 2162 @code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead. 2163 2164 @item 2165 @code{mp_set_memory_functions} uses global variables to store the selected 2166 memory allocation functions. 2167 2168 @item 2169 If the memory allocation functions set by a call to 2170 @code{mp_set_memory_functions} (or @code{malloc} and friends by default) are 2171 not reentrant, then GMP will not be reentrant either. 2172 2173 @item 2174 If the standard I/O functions such as @code{fwrite} are not reentrant then the 2175 GMP I/O functions using them will not be reentrant either. 2176 2177 @item 2178 It's safe for two threads to read from the same GMP variable simultaneously, 2179 but it's not safe for one to read while another might be writing, nor for 2180 two threads to write simultaneously. 
It's not safe for two threads to 2181 generate a random number from the same @code{gmp_randstate_t} simultaneously, 2182 since this involves an update of that variable. 2183 @end itemize 2184 2185 2186 @need 2000 2187 @node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics 2188 @section Useful Macros and Constants 2189 @cindex Useful macros and constants 2190 @cindex Constants 2191 2192 @deftypevr {Global Constant} {const int} mp_bits_per_limb 2193 @findex mp_bits_per_limb 2194 @cindex Bits per limb 2195 @cindex Limb size 2196 The number of bits per limb. 2197 @end deftypevr 2198 2199 @defmac __GNU_MP_VERSION 2200 @defmacx __GNU_MP_VERSION_MINOR 2201 @defmacx __GNU_MP_VERSION_PATCHLEVEL 2202 @cindex Version number 2203 @cindex GMP version number 2204 The major and minor GMP version, and patch level, respectively, as integers. 2205 For GMP i.j, these numbers will be i, j, and 0, respectively. 2206 For GMP i.j.k, these numbers will be i, j, and k, respectively. 2207 @end defmac 2208 2209 @deftypevr {Global Constant} {const char * const} gmp_version 2210 @findex gmp_version 2211 The GMP version number, as a null-terminated string, in the form ``i.j.k''. 2212 This release is @nicode{"@value{VERSION}"}. Note that the format ``i.j'' was 2213 used, before version 4.3.0, when k was zero. 2214 @end deftypevr 2215 2216 @defmac __GMP_CC 2217 @defmacx __GMP_CFLAGS 2218 The compiler and compiler flags, respectively, used when compiling GMP, as 2219 strings. 
2220 @end defmac 2221 2222 2223 @node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics 2224 @section Compatibility with older versions 2225 @cindex Compatibility with older versions 2226 @cindex Past GMP versions 2227 @cindex Upward compatibility 2228 2229 This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x 2230 versions, and upwardly compatible at the source level with all 2.x versions, 2231 with the following exceptions. 2232 2233 @itemize @bullet 2234 @item 2235 @code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency 2236 with other @code{mpn} functions. 2237 2238 @item 2239 @code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and 2240 3.0.1, but in 3.1 reverted to the 2.x style. 2241 2242 @item 2243 @code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed. 2244 @end itemize 2245 2246 There are a number of compatibility issues between GMP 1 and GMP 2 that of 2247 course also apply when porting applications from GMP 1 to GMP 5. Please 2248 see the GMP 2 manual for details. 2249 2250 @c @item Integer division functions round the result differently. The obsolete 2251 @c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv}, 2252 @c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the 2253 @c quotient towards 2254 @c @ifinfo 2255 @c @minus{}infinity). 2256 @c @end ifinfo 2257 @c @iftex 2258 @c @tex 2259 @c $-\infty$). 2260 @c @end tex 2261 @c @end iftex 2262 @c There are a lot of functions for integer division, giving the user better 2263 @c control over the rounding. 2264 2265 @c @item The function @code{mpz_mod} now compute the true @strong{mod} function. 2266 2267 @c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use 2268 @c @strong{mod} for reduction. 2269 2270 @c @item The assignment functions for rational numbers do no longer canonicalize 2271 @c their results. 
In the case a non-canonical result could arise from an 2272 @c assignment, the user need to insert an explicit call to 2273 @c @code{mpq_canonicalize}. This change was made for efficiency. 2274 2275 @c @item Output generated by @code{mpz_out_raw} in this release cannot be read 2276 @c by @code{mpz_inp_raw} in previous releases. This change was made for making 2277 @c the file format truly portable between machines with different word sizes. 2278 2279 @c @item Several @code{mpn} functions have changed. But they were intentionally 2280 @c undocumented in previous releases. 2281 2282 @c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui} 2283 @c are now implemented as macros, and thereby sometimes evaluate their 2284 @c arguments multiple times. 2285 2286 @c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1 2287 @c for 0^0. (In version 1, they yielded 0.) 2288 2289 @c In version 1 of the library, @code{mpq_set_den} handled negative 2290 @c denominators by copying the sign to the numerator. That is no longer done. 2291 2292 @c Pure assignment functions do not canonicalize the assigned variable. It is 2293 @c the responsibility of the user to canonicalize the assigned variable before 2294 @c any arithmetic operations are performed on that variable. 2295 @c Note that this is an incompatible change from version 1 of the library. 2296 2297 @c @end enumerate 2298 2299 2300 @need 1000 2301 @node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics 2302 @section Demonstration programs 2303 @cindex Demonstration programs 2304 @cindex Example programs 2305 @cindex Sample programs 2306 The @file{demos} subdirectory has some sample programs using GMP@. These 2307 aren't built or installed, but there's a @file{Makefile} with rules for them. 
2308 For instance, 2309 2310 @example 2311 make pexpr 2312 ./pexpr 68^975+10 2313 @end example 2314 2315 @noindent 2316 The following programs are provided 2317 2318 @itemize @bullet 2319 @item 2320 @cindex Expression parsing demo 2321 @cindex Parsing expressions demo 2322 @samp{pexpr} is an expression evaluator, the program used on the GMP web page. 2323 @item 2324 @cindex Expression parsing demo 2325 @cindex Parsing expressions demo 2326 The @samp{calc} subdirectory has a similar but simpler evaluator using 2327 @command{lex} and @command{yacc}. 2328 @item 2329 @cindex Expression parsing demo 2330 @cindex Parsing expressions demo 2331 The @samp{expr} subdirectory is yet another expression evaluator, a library 2332 designed for ease of use within a C program. See @file{demos/expr/README} for 2333 more information. 2334 @item 2335 @cindex Factorization demo 2336 @samp{factorize} is a Pollard-Rho factorization program. 2337 @item 2338 @samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p} 2339 function. 2340 @item 2341 @samp{primes} counts or lists primes in an interval, using a sieve. 2342 @item 2343 @samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic 2344 class numbers. 2345 @item 2346 @cindex @code{perl} 2347 @cindex GMP Perl module 2348 @cindex Perl module 2349 The @samp{perl} subdirectory is a comprehensive perl interface to GMP@. See 2350 @file{demos/perl/INSTALL} for more information. Documentation is in POD 2351 format in @file{demos/perl/GMP.pm}. 2352 @end itemize 2353 2354 As an aside, consideration has been given at various times to some sort of 2355 expression evaluation within the main GMP library. Going beyond something 2356 minimal quickly leads to matters like user-defined functions, looping, fixnums 2357 for control variables, etc, which are considered outside the scope of GMP 2358 (much closer to language interpreters or compilers, @xref{Language Bindings}.) 
2359 Something simple for program input convenience may yet be a possibility, a 2360 combination of the @file{expr} demo and the @file{pexpr} tree back-end 2361 perhaps. But for now the above evaluators are offered as illustrations. 2362 2363 2364 @need 1000 2365 @node Efficiency, Debugging, Demonstration Programs, GMP Basics 2366 @section Efficiency 2367 @cindex Efficiency 2368 2369 @table @asis 2370 @item Small Operands 2371 @cindex Small operands 2372 On small operands, the time for function call overheads and memory allocation 2373 can be significant in comparison to actual calculation. This is unavoidable 2374 in a general purpose variable precision library, although GMP attempts to be 2375 as efficient as it can on both large and small operands. 2376 2377 @item Static Linking 2378 @cindex Static linking 2379 On some CPUs, in particular the x86s, the static @file{libgmp.a} should be 2380 used for maximum speed, since the PIC code in the shared @file{libgmp.so} will 2381 have a small overhead on each function call and global data address. For many 2382 programs this will be insignificant, but for long calculations there's a gain 2383 to be had. 2384 2385 @item Initializing and Clearing 2386 @cindex Initializing and clearing 2387 Avoid excessive initializing and clearing of variables, since this can be 2388 quite time consuming, especially in comparison to otherwise fast operations 2389 like addition. 2390 2391 A language interpreter might want to keep a free list or stack of 2392 initialized variables ready for use. It should be possible to integrate 2393 something like that with a garbage collector too. 2394 2395 @item Reallocations 2396 @cindex Reallocations 2397 An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing 2398 values will have its memory repeatedly @code{realloc}ed, which could be quite 2399 slow or could fragment memory, depending on the C library. 
If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).

It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.

@item @code{2exp} Functions
@cindex @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.

@item @code{ui} and @code{si} Functions
@cindex @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable.  But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function; just use the regular
@code{mpz} function.

@item In-Place Operations
@cindex In-place operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
inlined too.

@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed.  The same
applies to the full precision @code{mpz_add} etc if @code{y} is small.
If 2435 @code{y} is big then cache locality may be helped, but that's all. 2436 2437 @code{mpz_mul} is currently the opposite, a separate destination is slightly 2438 better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one 2439 limb, make a temporary copy of @code{x} before forming the result. Normally 2440 that copying will only be a tiny fraction of the time for the multiply, so 2441 this is not a particularly important consideration. 2442 2443 @code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make 2444 no attempt to recognise a copy of something to itself, so a call like 2445 @code{mpz_set(x,x)} will be wasteful. Naturally that would never be written 2446 deliberately, but if it might arise from two pointers to the same object then 2447 a test to avoid it might be desirable. 2448 2449 @example 2450 if (x != y) 2451 mpz_set (x, y); 2452 @end example 2453 2454 Note that it's never worth introducing extra @code{mpz_set} calls just to get 2455 in-place operations. If a result should go to a particular variable then just 2456 direct it there and let GMP take care of data movement. 2457 2458 @item Divisibility Testing (Small Integers) 2459 @cindex Divisibility testing 2460 @code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions 2461 for testing whether an @code{mpz_t} is divisible by an individual small 2462 integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but 2463 which gives no useful information about the actual remainder, only whether 2464 it's zero (or a particular value). 2465 2466 However when testing divisibility by several small integers, it's best to take 2467 a remainder modulo their product, to save multi-precision operations. For 2468 instance to test whether a number is divisible by any of 23, 29 or 31 take a 2469 remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that. 
2470 2471 The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well 2472 as a remainder are generally a little slower than the remainder-only functions 2473 like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's 2474 probably best to just take a remainder and then go back and calculate the 2475 quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the 2476 remainder is zero). 2477 2478 @item Rational Arithmetic 2479 @cindex Rational arithmetic 2480 The @code{mpq} functions operate on @code{mpq_t} values with no common factors 2481 in the numerator and denominator. Common factors are checked-for and cast out 2482 as necessary. In general, cancelling factors every time is the best approach 2483 since it minimizes the sizes for subsequent operations. 2484 2485 However, applications that know something about the factorization of the 2486 values they're working with might be able to avoid some of the GCDs used for 2487 canonicalization, or swap them for divisions. For example when multiplying by 2488 a prime it's enough to check for factors of it in the denominator instead of 2489 doing a full GCD@. Or when forming a big product it might be known that very 2490 little cancellation will be possible, and so canonicalization can be left to 2491 the end. 2492 2493 The @code{mpq_numref} and @code{mpq_denref} macros give access to the 2494 numerator and denominator to do things outside the scope of the supplied 2495 @code{mpq} functions. @xref{Applying Integer Functions}. 2496 2497 The canonical form for rationals allows mixed-type @code{mpq_t} and integer 2498 additions or subtractions to be done directly with multiples of the 2499 denominator. This will be somewhat faster than @code{mpq_add}. 
For example, 2500 2501 @example 2502 /* mpq increment */ 2503 mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q)); 2504 2505 /* mpq += unsigned long */ 2506 mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL); 2507 2508 /* mpq -= mpz */ 2509 mpz_submul (mpq_numref(q), mpq_denref(q), z); 2510 @end example 2511 2512 @item Number Sequences 2513 @cindex Number sequences 2514 Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui} 2515 are designed for calculating isolated values. If a range of values is wanted 2516 it's probably best to call to get a starting point and iterate from there. 2517 2518 @item Text Input/Output 2519 @cindex Text input/output 2520 Hexadecimal or octal are suggested for input or output in text form. 2521 Power-of-2 bases like these can be converted much more efficiently than other 2522 bases, like decimal. For big numbers there's usually nothing of particular 2523 interest to be seen in the digits, so the base doesn't matter much. 2524 2525 Maybe we can hope octal will one day become the normal base for everyday use, 2526 as proposed by King Charles XII of Sweden and later reformers. 2527 @c Reference: Knuth volume 2 section 4.1, page 184 of second edition. :-) 2528 @end table 2529 2530 2531 @node Debugging, Profiling, Efficiency, GMP Basics 2532 @section Debugging 2533 @cindex Debugging 2534 2535 @table @asis 2536 @item Stack Overflow 2537 @cindex Stack overflow 2538 @cindex Segmentation violation 2539 @cindex Bus error 2540 Depending on the system, a segmentation violation or bus error might be the 2541 only indication of stack overflow. See @samp{--enable-alloca} choices in 2542 @ref{Build Options}, for how to address this. 

In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
overflow is recognised by the system before too much damage is done, or
@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}), adding them just to an application will have no
effect. Note also they're a slowdown, adding overhead to each function call
and each stack allocation.

@item Heap Problems
@cindex Heap problems
@cindex Malloc problems
The most likely cause of application problems with GMP is heap corruption.
Failing to @code{init} GMP variables will have unpredictable effects, and
corruption arising elsewhere in a program may well affect GMP@. Initializing
GMP variables more than once or failing to clear them will cause memory leaks.

@cindex Malloc debugger
In all such cases a @code{malloc} debugger is recommended. On a GNU or BSD
system the standard C library @code{malloc} has some diagnostic facilities,
see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
Reference Manual}, or @samp{man 3 malloc}.
Other possibilities, in no particular order, include

@display
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
@uref{http://dmalloc.com/}
@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
@uref{http://packages.debian.org/stable/devel/fda}
@uref{http://www.gnupdate.org/components/leakbug/}
@uref{http://people.redhat.com/~otaylor/memprof/}
@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
@end display

The GMP default allocation routines in @file{memory.c} also have a simple
sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
This is mainly designed for detecting buffer overruns during GMP development,
but might find other uses.

@item Stack Backtraces
@cindex Stack backtrace
On some systems the compiler options GMP uses by default can interfere with
debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
is used and this generally inhibits stack backtracing. Recompiling without
such options may help while debugging, though the usual caveats about it
potentially moving a memory problem or hiding a compiler bug will apply.

@item GDB, the GNU Debugger
@cindex GDB
@cindex GNU Debugger
A sample @file{.gdbinit} is included in the distribution, showing how to call
some undocumented dump functions to print GMP variables from within GDB@. Note
that these functions shouldn't be used in final application code since they're
undocumented and may be subject to incompatible changes in future versions of
GMP.

@item Source File Paths
GMP has multiple source files with the same name, in different directories.
For example @file{mpz}, @file{mpq} and @file{mpf} each have an
@file{init.c}. If the debugger can't already determine the right one it may
help to build with absolute paths on each C file.
One way to do that is to
use a separate object directory with an absolute path to the source directory.

@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example

This works via @code{VPATH}, and might require GNU @command{make}.
Alternately it might be possible to change the @code{.c.lo} rules
appropriately.

@item Assertion Checking
@cindex Assertion checking
The build option @option{--enable-assert} is available to add some consistency
checks to the library (see @ref{Build Options}). These are likely to be of
limited value to most applications. Assertion failures are just as likely to
indicate memory corruption as a library or compiler bug.

Applications using the low-level @code{mpn} functions, however, will benefit
from @option{--enable-assert} since it adds checks on the parameters of most
such functions, many of which have subtle restrictions on their usage. Note
however that only the generic C code has checks, not the assembly code, so
@option{--disable-assembly} should be used for maximum checking.

@item Temporary Memory Checking
The build option @option{--enable-alloca=debug} arranges that each block of
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
the allocation function set with @code{mp_set_memory_functions}).

This can help a malloc debugger detect accesses outside the intended bounds,
or detect memory not released. In a normal build, on the other hand,
temporary memory is allocated in blocks which GMP divides up for its own use,
or may be allocated with a compiler builtin @code{alloca} which will go
nowhere near any malloc debugger hooks.
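
Where no external malloc debugger is at hand, a rough check for unreleased
blocks can be sketched with @code{mp_set_memory_functions}. The following is
only an illustration and not part of GMP: the names @code{counting_alloc},
@code{counting_realloc}, @code{counting_free}, @code{live_blocks} and
@code{count_live_blocks_demo} are made up for this example. In an
@option{--enable-alloca=debug} build the temporary blocks should pass through
the same hooks and so would be counted too.

@example
#include <stdlib.h>
#include <gmp.h>

/* Hypothetical counter: blocks handed out and not yet freed. */
static long live_blocks = 0;

static void *
counting_alloc (size_t size)
@{
  void *p = malloc (size);
  if (p == NULL)
    abort ();
  live_blocks++;
  return p;
@}

static void *
counting_realloc (void *ptr, size_t old_size, size_t new_size)
@{
  void *p = realloc (ptr, new_size);
  (void) old_size;   /* not needed by this simple wrapper */
  if (p == NULL)
    abort ();
  return p;          /* same block logically, so the count is unchanged */
@}

static void
counting_free (void *ptr, size_t size)
@{
  (void) size;       /* not needed by this simple wrapper */
  free (ptr);
  live_blocks--;
@}

/* Install the hooks before any other GMP call, do a little work,
   and return how many blocks are still outstanding afterwards. */
long
count_live_blocks_demo (void)
@{
  mpz_t x;
  mp_set_memory_functions (counting_alloc, counting_realloc, counting_free);
  mpz_init_set_ui (x, 12345UL);
  mpz_mul (x, x, x);
  mpz_clear (x);
  return live_blocks;
@}
@end example

A non-zero return would suggest that some block obtained through the hooks was
never released. A real malloc debugger gives far more detail, so this is
only a quick sanity check.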

@item Maximum Debuggability
To summarize the above, a GMP build for maximum debuggability would be

@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --disable-assembly CFLAGS=-g
@end example

For C++, add @samp{--enable-cxx CXXFLAGS=-g}.

@item Checker
@cindex Checker
@cindex GCC Checker
The GCC checker (@uref{https://savannah.nongnu.org/projects/checker/}) can be
used with GMP@. It contains a stub library which means GMP applications
compiled with checker can use a normal GMP build.

A build of GMP with checking within GMP itself can be made. This will run
very very slowly. On GNU/Linux for example,

@cindex @command{checkergcc}
@example
./configure --disable-assembly CC=checkergcc
@end example

@option{--disable-assembly} must be used, since the GMP assembly code doesn't
support the checking scheme. The GMP C++ features cannot be used, since
current versions of checker (0.9.9.1) don't yet support the standard C++
library.

@item Valgrind
@cindex Valgrind
Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
PowerPC, and S/390. It translates and emulates machine instructions to do
strong checks for uninitialized data (at the level of individual bits), memory
accesses through bad pointers, and memory leaks.

Valgrind does not always support every possible instruction, in particular
ones recently added to an ISA. Valgrind might therefore be incompatible with
a recent GMP or even a less recent GMP which is compiled using a recent GCC.

GMP's assembly code sometimes promotes a read of the limbs to some larger size,
for efficiency. GMP will do this even at the start and end of a multilimb
operand, using naturally aligned operations on the larger type.
This may lead
to benign reads outside of allocated areas, triggering complaints from
Valgrind. Valgrind's option @samp{--partial-loads-ok=yes} should help.

@item Other Problems
Any suspected bug in GMP itself should be isolated to make sure it's not an
application problem, see @ref{Reporting Bugs}.
@end table


@node Profiling, Autoconf, Debugging, GMP Basics
@section Profiling
@cindex Profiling
@cindex Execution profiling
@cindex @code{--enable-profiling}

Running a program under a profiler is a good way to find where it's spending
most time and where improvements can be best sought. The profiling choices
for a GMP build are as follows.

@table @asis
@item @samp{--disable-profiling}
The default is to add nothing special for profiling.

It should be possible to just compile the mainline of a program with @code{-p}
and use @command{prof} to get a profile consisting of timer-based sampling of
the program counter. Most of the GMP assembly code has the necessary symbol
information.

This approach has the advantage of minimizing interference with normal program
operation, but on most systems the resolution of the sampling is quite low (10
milliseconds for instance), requiring long runs to get accurate information.

@item @samp{--enable-profiling=prof}
@cindex @code{prof}
Build with support for the system @command{prof}, which means @samp{-p} added
to the @samp{CFLAGS}.

This provides call counting in addition to program counter sampling, which
allows the most frequently called routines to be identified, and an average
time spent in each routine to be determined.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-p} and therefore
won't appear in the call counts.

On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
this case @samp{--enable-profiling=gprof} described below should be used
instead.

@item @samp{--enable-profiling=gprof}
@cindex @code{gprof}
Build with support for @command{gprof}, which means @samp{-pg} added to the
@samp{CFLAGS}.

This provides call graph construction in addition to call counting and program
counter sampling, which makes it possible to count calls coming from different
locations. For example the number of calls to @code{mpn_mul} from
@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter
sampling is still flat though, so only a total time in @code{mpn_mul} would be
accumulated, not a separate amount for each call site.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-pg} and therefore
not be included in the call counts.

On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
incompatible, so the latter is omitted from the default flags in that case,
which might result in poorer code generation.

Incidentally, it should be possible to use the @command{gprof} program with a
plain @samp{--enable-profiling=prof} build. But in that case only the
@samp{gprof -p} flat profile and call counts can be expected to be valid, not
the @samp{gprof -q} call graph.

@item @samp{--enable-profiling=instrument}
@cindex @code{-finstrument-functions}
@cindex @code{instrument-functions}
Build with the GCC option @samp{-finstrument-functions} added to the
@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
Using the GNU Compiler Collection (GCC)}).

This inserts special instrumenting calls at the start and end of each
function, allowing exact timing and full call graph construction.

This instrumenting is not normally a standard system feature and will require
support from an external library, such as

@cindex FunctionCheck
@cindex fnccheck
@display
@uref{http://sourceforge.net/projects/fnccheck/}
@end display

This should be included in @samp{LIBS} during the GMP configure so that test
programs will link. For example,

@example
./configure --enable-profiling=instrument LIBS=-lfc
@end example

On a GNU system the C library provides dummy instrumenting functions, so
programs compiled with this option will link. In this case it's only
necessary to ensure the correct library is added when linking an application.

The x86 assembly code supports this option, but on other processors the
assembly routines will be as if compiled without
@samp{-finstrument-functions} meaning time spent in them will effectively be
attributed to their caller.
@end table


@node Autoconf, Emacs, Profiling, GMP Basics
@section Autoconf
@cindex Autoconf

Autoconf based applications can easily check whether GMP is installed. The
only thing to be noted is that GMP library symbols from version 3 onwards have
prefixes like @code{__gmpz}. The following therefore would be a simple test,

@cindex @code{AC_CHECK_LIB}
@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example

This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
but an application that must have GMP would want to generate an error if not
found.
For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_MSG_ERROR([GNU MP not found, see https://gmplib.org/])])
@end example

If functions added in some particular version of GMP are required, then one of
those can be used when checking. For example @code{mpz_mul_si} was added in
GMP 3.1,

@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
  [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see https://gmplib.org/])])
@end example

An alternative would be to test the version number in @file{gmp.h} using say
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
if some particular sub-minor release is known to be necessary.

In general it's recommended that applications should simply demand a new
enough GMP rather than trying to provide supplements for features not
available in past versions.

Occasionally an application will need or want to know the size of a type at
configuration or preprocessing time, not just with @code{sizeof} in the code.
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
up is best for this, since prior versions needed certain @samp{-D} defines on
systems using a @code{long long} limb. The following would suit Autoconf 2.50
or up,

@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example


@node Emacs, , Autoconf, GMP Basics
@section Emacs
@cindex Emacs
@cindex @code{info-lookup-symbol}

@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
emacs, The Emacs Editor}).

The GMP manual can be included in such lookups by putting the following in
your @file{.emacs},

@c This isn't pretty, but there doesn't seem to be a better way (in emacs
@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
@c but that function isn't documented, whereas info-lookup-alist is.
@c
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example


@node Reporting Bugs, Integer Functions, GMP Basics, Top
@comment node-name, next, previous, up
@chapter Reporting Bugs
@cindex Reporting bugs
@cindex Bug reporting

If you think you have found a bug in the GMP library, please investigate it
and report it. We have made this library available to you, and it is not too
much to ask you to report the bugs you find.

Before you report a bug, check it's not already addressed in @ref{Known Build
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
to check @uref{https://gmplib.org/} for patches for this release.

Please include the following in any report,

@itemize @bullet
@item
The GMP version number, and if pre-packaged or patched then say so.

@item
A test program that makes it possible for us to reproduce the bug. Include
instructions on how to run the program.

@item
A description of what is wrong. If the results are incorrect, in what way.
If you get a crash, say so.

@item
If you get a crash, include a stack backtrace from the debugger if it's
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).

@item
Please do not send core dumps, executables or @command{strace}s.

@item
The @samp{configure} options you used when building GMP, if any.

@item
The output from @samp{configure}, as printed to stdout, with any options used.

@item
The name of the compiler and its version. For @command{gcc}, get the version
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.

@item
The output from running @samp{uname -a}.

@item
The output from running @samp{./config.guess}, and from running
@samp{./configfsf.guess} (might be the same).

@item
If the bug is related to @samp{configure}, then the compressed contents of
@file{config.log}.

@item
If the bug is related to an @file{asm} file not assembling, then the contents
of @file{config.m4} and the offending line or lines from the temporary
@file{mpn/tmp-<file>.s}.
@end itemize

Please make an effort to produce a self-contained report, with something
definite that can be tested or debugged. Vague queries or piecemeal messages
are difficult to act on and don't help the development effort.

It is not uncommon that an observed problem is actually due to a bug in the
compiler; the GMP code tends to explore interesting corners in compilers.

If your bug report is good, we will do our best to help you get a corrected
version of the library; if the bug report is poor, we won't do anything about
it (except maybe ask you to send a better report).

Send your report to: @email{gmp-bugs@@gmplib.org}.

If you think something in this manual is unclear, or downright incorrect, or if
the language needs to be improved, please send a note to the same address.


@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
@comment node-name, next, previous, up
@chapter Integer Functions
@cindex Integer functions

This chapter describes the GMP functions for performing integer arithmetic.
These functions start with the prefix @code{mpz_}.

GMP integers are stored in objects of type @code{mpz_t}.

@menu
* Initializing Integers::
* Assigning Integers::
* Simultaneous Integer Init & Assign::
* Converting Integers::
* Integer Arithmetic::
* Integer Division::
* Integer Exponentiation::
* Integer Roots::
* Number Theoretic Functions::
* Integer Comparisons::
* Integer Logic and Bit Fiddling::
* I/O of Integers::
* Integer Random Numbers::
* Integer Import and Export::
* Miscellaneous Integer Functions::
* Integer Special Functions::
@end menu

@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Integer initialization functions
@cindex Initialization functions

The functions for integer arithmetic assume that all integer objects are
initialized. You do that by calling the function @code{mpz_init}. For
example,

@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example

As you can see, you can store new values any number of times, once an
object is initialized.

@deftypefun void mpz_init (mpz_t @var{x})
Initialize @var{x}, and set its value to 0.
@end deftypefun

@deftypefun void mpz_inits (mpz_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
values to 0.
@end deftypefun

@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
necessary; reallocation is handled automatically by GMP when needed.

While @var{n} defines the initial space, @var{x} will grow automatically in the
normal way, if necessary, for subsequent values stored. @code{mpz_init2} makes
it possible to avoid such reallocations if a maximum size is known in advance.

In preparation for an operation, GMP often allocates one limb more than
ultimately needed. To make sure GMP will not perform reallocation for
@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
@end deftypefun

@deftypefun void mpz_clear (mpz_t @var{x})
Free the space occupied by @var{x}. Call this function for all @code{mpz_t}
variables when you are done with them.
@end deftypefun

@deftypefun void mpz_clears (mpz_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
@end deftypefun

@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Change the space allocated for @var{x} to @var{n} bits. The value in @var{x}
is preserved if it fits, or is set to 0 if not.

Calling this function is never necessary; reallocation is handled automatically
by GMP when needed. But this function can be used to increase the space for a
variable in order to avoid repeated automatic reallocations, or to decrease it
to give memory back to the heap.
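
As a minimal sketch of the sizing advice given for @code{mpz_init2}, and of
shrinking with @code{mpz_realloc2}, the hypothetical helper
@code{init_pow2_minus_1} below (not a GMP function) reserves room for the
largest intermediate value plus one limb's worth of bits, builds 2 raised to
@code{bits} minus 1, and then trims the allocation. It assumes the
documented global @code{mp_bits_per_limb} for the limb width.

@example
#include <gmp.h>

/* Hypothetical helper: initialize rop and set it to 2^bits - 1.
   rop must not already be initialized; the caller calls mpz_clear. */
void
init_pow2_minus_1 (mpz_t rop, mp_bitcnt_t bits)
@{
  /* Room for the (bits+1)-bit intermediate 2^bits, plus one spare limb. */
  mpz_init2 (rop, bits + 1 + mp_bits_per_limb);

  mpz_set_ui (rop, 1UL);
  mpz_mul_2exp (rop, rop, bits);   /* rop = 2^bits */
  mpz_sub_ui (rop, rop, 1UL);      /* rop = 2^bits - 1 */

  /* Optionally give back whatever the final value doesn't need. */
  mpz_realloc2 (rop, mpz_sizeinbase (rop, 2));
@}
@end example

Whether the up-front sizing really saves a reallocation depends on the
operations involved, so treat the sizes here as a starting point.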
@end deftypefun


@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions

These functions assign new values to already initialized integers
(@pxref{Initializing Integers}).

@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
Set the value of @var{rop} from @var{op}.

@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
make it an integer.
@end deftypefun

@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
@var{base}. White space is allowed in the string, and is simply ignored.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
@code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@c
@c It turns out that it is not entirely true that this function ignores
@c white-space.
@c It does ignore it between digits, but not after a minus sign
@c or within or after ``0x''. Some thought was given to disallowing all
@c whitespace, but that would be an incompatible change, whitespace has been
@c documented as ignored ever since GMP 1.
@c
@end deftypefun

@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions
@cindex Integer initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpz_init_set@dots{}}

Here is an example of using one:

@example
@{
  mpz_t pie;
  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
  @dots{}
  mpz_sub (pie, @dots{});
  @dots{}
  mpz_clear (pie);
@}
@end example

@noindent
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
integer functions. Don't use an initialize-and-set function on a variable
already initialized!
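
When the string comes from outside the program, the return value of
@code{mpz_init_set_str} is worth checking. The sketch below wraps that check
in a hypothetical helper, @code{parse_decimal} (not a GMP function). It
relies on the documented behaviour that the variable is initialized even when
parsing fails, so the failure path clears it before returning.

@example
#include <gmp.h>

/* Hypothetical helper: initialize x from a base-10 string.
   Returns 0 on success, leaving x initialized for the caller to clear.
   Returns -1 on a malformed string, with x already cleared. */
int
parse_decimal (mpz_t x, const char *str)
@{
  if (mpz_init_set_str (x, str, 10) != 0)
    @{
      mpz_clear (x);   /* x was initialized despite the error */
      return -1;
    @}
  return 0;
@}
@end example

Clearing on failure is a design choice for this sketch; it means the caller
only owns @code{x} when the helper returns 0.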

@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
Initialize @var{rop} with limb space and set the initial numeric value from
@var{op}.
@end deftypefun

@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
documentation above for details).

If the string is a correct base @var{base} number, the function returns 0;
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
@end deftypefun


@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Integer conversion functions
@cindex Conversion functions

This section describes functions for converting GMP integers to standard C
types. Functions for converting @emph{to} GMP integers are described in
@ref{Assigning Integers} and @ref{I/O of Integers}.

@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
Return the value of @var{op} as an @code{unsigned long}.

If @var{op} is too big to fit an @code{unsigned long} then just the least
significant bits that do fit are returned. The sign of @var{op} is ignored,
only the absolute value is used.
@end deftypefun

@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
Otherwise return the least significant part of @var{op}, with the same sign
as @var{op}.

If @var{op} is too big to fit in a @code{signed long int}, the returned
result is probably not very useful. To find out if the value will fit, use
the function @code{mpz_fits_slong_p}.
@end deftypefun

@deftypefun double mpz_get_d (const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big, the result is system
dependent. An infinity is returned where available. A hardware overflow trap
may or may not occur.
@end deftypefun

@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and returning the exponent separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
+ 2}. The two extra bytes are for a possible minus sign, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@need 2000
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Integer arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
@cindex Bit shift left
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
bits.
@end deftypefun

@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun


@need 2000
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
@section Division Functions
@cindex Integer division functions
@cindex Division functions

Division is undefined if the divisor is zero. Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
zero. This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.
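
A program that would rather not take the arithmetic exception can test the
divisor before dividing. The following is a minimal sketch, assuming a
hypothetical wrapper @code{checked_tdiv_q} (not a GMP function) built on
@code{mpz_sgn} and @code{mpz_tdiv_q}; the same pattern would apply to the
other division functions.

@example
#include <gmp.h>

/* Hypothetical wrapper: set q to n/d rounded towards zero.
   Returns 0 on success, or -1 without touching q if d is zero. */
int
checked_tdiv_q (mpz_t q, const mpz_t n, const mpz_t d)
@{
  if (mpz_sgn (d) == 0)
    return -1;           /* avoid the intentional division by zero */
  mpz_tdiv_q (q, n, d);
  return 0;
@}
@end example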

@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
@c between each, and seem to let tex do a better job of page breaks than an
@c @sp 1 in the middle of one big set.

@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@cindex Bit shift right

@sp 1
Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
The rounding is in three styles, each suiting different applications.

@itemize @bullet
@item
@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.

@item
@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
@var{r} will have the same sign as @var{d}. The @code{f} stands for
``floor''.

@item
@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
as @var{n}. The @code{t} stands for ``truncate''.
@end itemize

In all cases @var{q} and @var{r} will satisfy
@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.

The @code{q} functions calculate only the quotient, the @code{r} functions
only the remainder, and the @code{qr} functions calculate both. Note that for
@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
results will be unpredictable.

For the @code{ui} variants the return value is the remainder, and in fact
returning the remainder is all the @code{div_ui} functions do. For
@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
return value is the absolute value of the remainder.

For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These
functions are implemented as right shifts and bit masks, but of course they
round the same as the other functions.

For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp}
is effectively an arithmetic right shift treating @var{n} as twos complement
the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
effectively treats @var{n} as sign and magnitude.
@end deftypefun

@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
ignored; the result is always non-negative.
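
To illustrate the non-negative result, the sketch below contrasts
@code{mpz_mod} with the truncating remainder from @code{mpz_tdiv_r} on small
operands. Both helper names are hypothetical, and the caller is assumed to
pass a non-zero @code{d}.

@example
#include <gmp.h>

/* Hypothetical helper: n mod d as computed by mpz_mod.  d must be
   non-zero.  */
long
example_mod (long n, long d)
@{
  mpz_t zn, zd, r;
  long result;

  mpz_init_set_si (zn, n);
  mpz_init_set_si (zd, d);
  mpz_init (r);
  mpz_mod (r, zn, zd);          /* always 0 <= r < |d| */
  result = mpz_get_si (r);
  mpz_clear (zn);
  mpz_clear (zd);
  mpz_clear (r);
  return result;
@}

/* Hypothetical helper: remainder of n/d as computed by mpz_tdiv_r,
   which has the same sign as n.  d must be non-zero.  */
long
example_tdiv_r (long n, long d)
@{
  mpz_t zn, zd, r;
  long result;

  mpz_init_set_si (zn, n);
  mpz_init_set_si (zd, d);
  mpz_init (r);
  mpz_tdiv_r (r, zn, zd);
  result = mpz_get_si (r);
  mpz_clear (zn);
  mpz_clear (zd);
  mpz_clear (r);
  return result;
@}
@end example

With @code{n} = @minus{}7, @code{example_mod} gives 2 for both @code{d} = 3
and @code{d} = @minus{}3, whereas @code{example_tdiv_r} gives @minus{}1.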

@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
the return value is wanted.
@end deftypefun

@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d})
@cindex Exact division functions
Set @var{q} to @var{n}/@var{d}. These functions produce correct results only
when it is known in advance that @var{d} divides @var{n}.

These routines are much faster than the other division functions, and are the
best choice when exact division is known to occur, for example reducing a
rational to lowest terms.
@end deftypefun

@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d})
@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b})
@cindex Divisibility functions
Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.

@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying
@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division
functions, @math{@var{d}=0} is accepted and following the rule it can be seen
that only 0 is considered divisible by 0.
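
A divisibility test combines naturally with @code{mpz_divexact}. The
following is a sketch only; @code{example_exact_quotient} is a hypothetical
name. Note the separate zero check: 0 is reported as divisible by 0, but
@code{mpz_divexact} must still not be given a zero divisor.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: if d is non-zero and divides n,
   sets q to n/d and returns 1; otherwise leaves q alone and returns 0.  */
int
example_exact_quotient (mpz_t q, const mpz_t n, const mpz_t d)
@{
  if (mpz_sgn (d) == 0 || ! mpz_divisible_p (n, d))
    return 0;
  mpz_divexact (q, n, d);       /* safe: d is known to divide n */
  return 1;
@}
@end example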
@end deftypefun

@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d})
@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b})
@cindex Divisibility functions
@cindex Congruence functions
Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.

@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q}
satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike
the other division functions, @math{@var{d}=0} is accepted and following the
rule it can be seen that @var{n} and @var{c} are considered congruent mod 0
only when exactly equal.
@end deftypefun


@need 2000
@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
@section Exponentiation Functions
@cindex Integer exponentiation functions
@cindex Exponentiation functions
@cindex Powering functions

@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod})
Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
modulo @var{mod}}.

Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
If an inverse doesn't exist then a divide by zero is raised.
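
As a small worked case, @math{4^13 @bmod 497} is 445. The sketch below
reproduces this with @code{mpz_powm_ui}; the helper name
@code{example_powm} is hypothetical, and returning 0 for a zero modulus is
merely a choice made for this illustration.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns base^exp mod m for
   word-sized operands, or 0 if m is zero.  */
unsigned long
example_powm (unsigned long base, unsigned long exp, unsigned long m)
@{
  mpz_t b, zm, r;
  unsigned long result;

  if (m == 0)
    return 0;                   /* mpz_powm_ui would divide by zero */

  mpz_init_set_ui (b, base);
  mpz_init_set_ui (zm, m);
  mpz_init (r);
  mpz_powm_ui (r, b, exp, zm);  /* 0 <= r < m, so r fits the return type */
  result = mpz_get_ui (r);

  mpz_clear (b);
  mpz_clear (zm);
  mpz_clear (r);
  return result;
@}
@end example

Here @code{example_powm (4, 13, 497)} gives 445.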
@end deftypefun

@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
Set @var{rop} to @m{base^{exp} \bmod @var{mod}, (@var{base} raised to @var{exp})
modulo @var{mod}}.

It is required that @math{@var{exp} > 0} and that @var{mod} is odd.

This function is designed to take the same time and have the same cache access
patterns for any two same-size arguments, assuming that function arguments are
placed at the same position and that the machine state is identical upon
function entry. This function is intended for cryptographic purposes, where
resilience to side-channel attacks is desired.
@end deftypefun

@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp})
@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case
@math{0^0} yields 1.
@end deftypefun


@need 2000
@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
@section Root Extraction Functions
@cindex Integer root functions
@cindex Root extraction functions

@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n})
Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
part of the @var{n}th root of @var{op}. Return non-zero if the computation
was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
@end deftypefun

@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n})
Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated
integer part of the @var{n}th root of @var{u}.
Set @var{rem} to the
remainder, @m{(@var{u} - @var{root}^n),
@var{u}@minus{}@var{root}**@var{n}}.
@end deftypefun

@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
integer part of the square root of @var{op}.
@end deftypefun

@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op})
Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the
remainder @m{(@var{op} - @var{rop1}^2),
@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
perfect square.

If @var{rop1} and @var{rop2} are the same variable, the results are
undefined.
@end deftypefun

@deftypefun int mpz_perfect_power_p (const mpz_t @var{op})
@cindex Perfect power functions
@cindex Root testing functions
Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.

Under this definition both 0 and 1 are considered to be perfect powers.
Negative values of @var{op} are accepted, but of course can only be odd
perfect powers.
@end deftypefun

@deftypefun int mpz_perfect_square_p (const mpz_t @var{op})
@cindex Perfect square functions
@cindex Root testing functions
Return non-zero if @var{op} is a perfect square, i.e., if the square root of
@var{op} is an integer. Under this definition both 0 and 1 are considered to
be perfect squares.
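
The relation between the truncated square root, its remainder and the
perfect square test can be sketched as follows. The helper
@code{example_isqrt} is a hypothetical name; it uses @code{mpz_sqrtrem} on a
word-sized operand so that the results can be returned as plain C values.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns floor(sqrt(op)) and
   stores op - root*root in *rem.  */
unsigned long
example_isqrt (unsigned long op, unsigned long *rem)
@{
  mpz_t n, root, r;
  unsigned long result;

  mpz_init_set_ui (n, op);
  mpz_init (root);
  mpz_init (r);
  mpz_sqrtrem (root, r, n);     /* root and r must be distinct variables */
  result = mpz_get_ui (root);
  *rem = mpz_get_ui (r);        /* zero exactly when op is a perfect square */

  mpz_clear (n);
  mpz_clear (root);
  mpz_clear (r);
  return result;
@}
@end example

For 17 this gives root 4 and remainder 1; for 16 it gives root 4 and
remainder 0.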
@end deftypefun


@need 2000
@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
@section Number Theoretic Functions
@cindex Number theoretic functions

@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps})
@cindex Prime testing functions
@cindex Probable prime testing functions
Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime,
return 1 if @var{n} is probably prime (without being certain), or return 0 if
@var{n} is definitely non-prime.

This function performs some trial divisions, then @var{reps} Miller-Rabin
probabilistic primality tests. A higher @var{reps} value will reduce the
chances of a non-prime being identified as ``probably prime''. A composite
number will be identified as a prime with a probability of less than
@m{4^{-reps},4^(-@var{reps})}. Reasonable values of @var{reps} are between 15
and 50.
@end deftypefun

@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op})
@cindex Next prime function
Set @var{rop} to the next prime greater than @var{op}.

This function uses a probabilistic algorithm to identify primes. For
practical purposes it's adequate; the chance of a composite passing will be
extremely small.
@end deftypefun

@c mpz_prime_p not implemented as of gmp 3.0.

@c @deftypefun int mpz_prime_p (const mpz_t @var{n})
@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
@c This function is far slower than @code{mpz_probab_prime_p}, but then it
@c never returns non-zero for composite numbers.

@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
@c The likelihood of a programming error or hardware malfunction is orders
@c of magnitude greater than the likelihood for a composite to pass as a
@c prime, if the @var{reps} argument is in the suggested range.)
@c @end deftypefun

@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@cindex Greatest common divisor functions
@cindex GCD functions
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}. The
result is always positive even if one or both input operands are negative,
except when both inputs are zero, in which case this function defines
@math{gcd(0,0) = 0}.
@end deftypefun

@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Compute the greatest common divisor of @var{op1} and @var{op2}. If
@var{rop} is not @code{NULL}, store the result there.

If the result is small enough to fit in an @code{unsigned long int}, it is
returned. If the result does not fit, 0 is returned, and the result is equal
to the argument @var{op1}. Note that the result will always fit if @var{op2}
is non-zero.
@end deftypefun

@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b})
@cindex Extended GCD
@cindex GCD extended
Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
addition set @var{s} and @var{t} to coefficients satisfying
@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
The value in @var{g} is always positive, even if one or both of @var{a} and
@var{b} are negative (or zero if both inputs are zero).
The values in @var{s}
and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} <
@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}}
/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely. There
are a few exceptional cases:

If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0},
@math{@var{t} = sgn(@var{b})}.

Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or
@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if
@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}.

In all cases, @math{@var{s} = 0} if and only if @math{@var{g} =
@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b}
= 0}.

If @var{t} is @code{NULL} then that value is not computed.
@end deftypefun

@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2})
@cindex Least common multiple functions
@cindex LCM functions
Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
@var{rop} is always positive, irrespective of the signs of @var{op1} and
@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
@end deftypefun

@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@cindex Modular inverse functions
@cindex Inverse modulo functions
Compute the inverse of @var{op1} modulo @var{op2} and put the result in
@var{rop}. If the inverse exists, the return value is non-zero and @var{rop}
will satisfy @math{0 @le{} @var{rop} < @GMPabs{@var{op2}}} (with @math{@var{rop}
= 0} possible only when @math{@GMPabs{@var{op2}} = 1}, i.e., in the
somewhat degenerate zero ring).
If an inverse doesn't
exist the return value is zero and @var{rop} is undefined. The behaviour of
this function is undefined when @var{op2} is zero.
@end deftypefun

@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b})
@cindex Jacobi symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})}. This is defined only for @var{b} odd.
@end deftypefun

@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p})
@cindex Legendre symbol functions
Calculate the Legendre symbol @m{\left(a \over p\right),
(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
prime, and for such @var{p} it's identical to the Jacobi symbol.
@end deftypefun

@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b})
@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b})
@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b})
@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b})
@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b})
@cindex Kronecker symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} even.

When @var{b} is odd the Jacobi symbol and Kronecker symbol are
identical, so @code{mpz_kronecker_ui} etc can be used for mixed
precision Jacobi symbols too.

For more information see Henri Cohen section 1.4.2 (@pxref{References}),
or any number theory textbook. See also the example program
@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
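
A few small values may help to see the Kronecker extension in action. The
helper @code{example_kronecker} below is a hypothetical wrapper around
@code{mpz_kronecker_ui}, written only for this illustration.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: Kronecker symbol (a/b) for a
   word-sized a.  */
int
example_kronecker (long a, unsigned long b)
@{
  mpz_t za;
  int result;

  mpz_init_set_si (za, a);
  result = mpz_kronecker_ui (za, b);
  mpz_clear (za);
  return result;
@}
@end example

With odd @code{b} this is the Jacobi symbol, so (2/7) is 1 and (3/7) is
@minus{}1. With @code{b} = 2 the extension applies: (4/2) is 0 because 4 is
even, and (3/2) equals (2/3), which is @minus{}1.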
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f})
@cindex Remove factor functions
@cindex Factor removal functions
Remove all occurrences of the factor @var{f} from @var{op} and store the
result in @var{rop}. The return value is how many such occurrences were
removed.
@end deftypefun

@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m})
@cindex Factorial functions
Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!,
@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the
@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}.
@end deftypefun

@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n})
@cindex Primorial functions
Set @var{rop} to the primorial of @var{n}, i.e. the product of all positive
prime numbers @math{@le{}@var{n}}.
@end deftypefun

@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k})
@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
@cindex Binomial coefficient functions
Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
@var{k}} and store the result in @var{rop}. Negative values of @var{n} are
supported by @code{mpz_bin_ui}, using the identity
@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
part G.
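
The identity for negative @var{n} can be checked on small values. The
sketch below wraps @code{mpz_bin_ui} in a hypothetical helper
@code{example_bin}, assuming the result fits in a @code{long}.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: binomial coefficient of n over
   k, where n may be negative.  Assumes the result fits in a long.  */
long
example_bin (long n, unsigned long k)
@{
  mpz_t zn, r;
  long result;

  mpz_init_set_si (zn, n);
  mpz_init (r);
  mpz_bin_ui (r, zn, k);
  result = mpz_get_si (r);
  mpz_clear (zn);
  mpz_clear (r);
  return result;
@}
@end example

For @code{n} = @minus{}3 and @code{k} = 3 the identity gives
@minus{}bin(5,3), that is @minus{}10, and for @code{k} = 2 it gives
bin(4,2), that is 6.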
@end deftypefun

@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
@cindex Fibonacci sequence functions
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.

These functions are designed for calculating isolated Fibonacci numbers. When
a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
similar.
@end deftypefun

@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
@cindex Lucas number functions
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
to @m{L_{n-1},L[n-1]}.

These functions are designed for calculating isolated Lucas numbers. When a
sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
similar.

The Fibonacci numbers and Lucas numbers are related sequences, so it's never
necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The
formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
Algorithm}; the reverse is straightforward too.
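
The iteration suggested above for a sequence of Fibonacci numbers can be
sketched as follows; the same pattern applies to Lucas numbers with
@code{mpz_lucnum2_ui}. The helper name @code{example_fib_after} is
hypothetical, and the result is assumed to fit in an @code{unsigned long}.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns F[n+steps], obtained by
   starting from the pair F[n], F[n-1] and iterating the recurrence.  */
unsigned long
example_fib_after (unsigned long n, unsigned long steps)
@{
  mpz_t fn, fnsub1;
  unsigned long i, result;

  mpz_init (fn);
  mpz_init (fnsub1);
  mpz_fib2_ui (fn, fnsub1, n);  /* fn = F[n], fnsub1 = F[n-1] */

  for (i = 0; i < steps; i++)
    @{
      mpz_add (fnsub1, fnsub1, fn);   /* fnsub1 = F[n+1] */
      mpz_swap (fn, fnsub1);          /* fn = F[n+1], fnsub1 = F[n] */
    @}

  result = mpz_get_ui (fn);
  mpz_clear (fn);
  mpz_clear (fnsub1);
  return result;
@}
@end example

Each step costs one addition and a pointer swap, rather than a fresh call to
@code{mpz_fib_ui}.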
@end deftypefun


@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Integer comparison functions
@cindex Comparison functions

@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2})
@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2})
@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
@math{@var{op1} < @var{op2}}.

@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their
arguments more than once. @code{mpz_cmp_d} can be called with an infinity,
but results are undefined for a NaN.
@end deftypefn

@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2})
@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
Compare the absolute values of @var{op1} and @var{op2}. Return a positive
value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.

@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined
for a NaN.
@end deftypefn

@deftypefn Macro int mpz_sgn (const mpz_t @var{op})
@cindex Sign tests
@cindex Integer sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its argument
multiple times.
@end deftypefn


@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
@comment node-name, next, previous, up
@section Logical and Bit Manipulation Functions
@cindex Logical functions
@cindex Bit manipulation functions
@cindex Integer logical functions
@cindex Integer bit manipulation functions

These functions behave as if twos complement arithmetic were used (although
sign-magnitude is the actual implementation). The least significant bit is
number 0.

@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise-and @var{op2}.
@end deftypefun

@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
@end deftypefun

@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
@end deftypefun

@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to the one's complement of @var{op}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op})
If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
number of 1 bits in the binary representation. If @math{@var{op}<0}, the
number of 1s is infinite, and the return value is the largest possible
@code{mp_bitcnt_t}.
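
The two cases can be told apart by comparing the return value against the
all-ones @code{mp_bitcnt_t}. A minimal sketch, in which
@code{example_popcount} is a hypothetical helper and writing the largest
value as @code{~(mp_bitcnt_t) 0} relies on @code{mp_bitcnt_t} being an
unsigned type:

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: population count of a
   word-sized signed value, as reported by mpz_popcount.  */
mp_bitcnt_t
example_popcount (long op)
@{
  mpz_t z;
  mp_bitcnt_t count;

  mpz_init_set_si (z, op);
  count = mpz_popcount (z);     /* largest mp_bitcnt_t when op < 0 */
  mpz_clear (z);
  return count;
@}
@end example

For instance 255 has 8 one bits, while any negative argument yields
@code{~(mp_bitcnt_t) 0}.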
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2})
If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
hamming distance between the two operands, which is the number of bit positions
where @var{op1} and @var{op2} have different bit values. If one operand is
@math{@ge{}0} and the other @math{<0} then the number of bits different is
infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@cindex Bit scanning functions
@cindex Scan bit functions
Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
bits, until the first 0 or 1 bit (respectively) is found. Return the index of
the found bit.

If the bit at @var{starting_bit} is already what's sought, then
@var{starting_bit} is returned.

If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
returned. This will happen in @code{mpz_scan0} past the end of a negative
number, or @code{mpz_scan1} past the end of a nonnegative number.
@end deftypefun

@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
Set bit @var{bit_index} in @var{rop}.
@end deftypefun

@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
Clear bit @var{bit_index} in @var{rop}.
@end deftypefun

@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
Complement bit @var{bit_index} in @var{rop}.
@end deftypefun

@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index})
Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
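
The single-bit functions can be combined as in the following sketch, which
builds a small value bit by bit and then scans it. Both helper names are
hypothetical and exist only for this illustration.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: builds the value 34 using the
   single-bit functions and returns it.  */
unsigned long
example_bit_ops (void)
@{
  mpz_t z;
  unsigned long result;

  mpz_init (z);                 /* z = 0 */
  mpz_setbit (z, 0);            /* z = 1 */
  mpz_setbit (z, 5);            /* z = 33 */
  mpz_combit (z, 1);            /* z = 35 */
  mpz_clrbit (z, 0);            /* z = 34 */
  result = mpz_tstbit (z, 5) ? mpz_get_ui (z) : 0;
  mpz_clear (z);
  return result;
@}

/* Hypothetical helper, not part of GMP: index of the lowest 1 bit of a
   positive value, found with mpz_scan1 starting at bit 0.  */
mp_bitcnt_t
example_lowest_set_bit (unsigned long op)
@{
  mpz_t z;
  mp_bitcnt_t index;

  mpz_init_set_ui (z, op);
  index = mpz_scan1 (z, 0);
  mpz_clear (z);
  return index;
@}
@end example

For 40, which is 101000 in binary, @code{example_lowest_set_bit} returns 3.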
@end deftypefun

@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Integer input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

The functions in this section read @code{mpz} numbers from a stdio stream or
write them to one. Passing a @code{NULL} pointer for a @var{stream} argument
makes the input functions read from @code{stdin} and the output functions
write to @code{stdout}.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to
@minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
Input a possibly white-space preceded string in base @var{base} from stdio
stream @var{stream}, and put the read integer in @var{rop}.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
@code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op})
Output @var{op} on stdio stream @var{stream}, in raw binary format. The
integer is written in a portable format, with 4 bytes of size information, and
that many bytes of limbs. Both the size and the limbs are written in
decreasing significance order (i.e., in big-endian).

The output can be read with @code{mpz_inp_raw}.

Return the number of bytes written, or if an error occurred, return 0.

The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
of changes necessary for compatibility between 32-bit and 64-bit machines.
@end deftypefun

@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
Input from stdio stream @var{stream} in the format written by
@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
bytes read, or if an error occurred, return 0.

This routine can read the output from @code{mpz_out_raw} also from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
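
A round trip through the raw format can be sketched as follows. The helper
@code{example_raw_roundtrip} is a hypothetical name, and the use of the
standard C @code{tmpfile} function as scratch storage is an assumption made
only to keep the example self-contained.

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper, not part of GMP: writes value with mpz_out_raw,
   reads it back with mpz_inp_raw, and returns 1 if the two agree, 0 if
   they differ, or -1 on an I/O error.  */
int
example_raw_roundtrip (long value)
@{
  FILE *fp;
  mpz_t out, in;
  int ok = -1;

  fp = tmpfile ();
  if (fp == NULL)
    return -1;

  mpz_init_set_si (out, value);
  mpz_init (in);

  if (mpz_out_raw (fp, out) != 0)       /* 0 means a write error */
    @{
      rewind (fp);
      if (mpz_inp_raw (in, fp) != 0)    /* 0 means a read error */
        ok = (mpz_cmp (in, out) == 0);
    @}

  mpz_clear (out);
  mpz_clear (in);
  fclose (fp);
  return ok;
@}
@end example

Note that @file{stdio.h} is included before @file{gmp.h} so that the
prototypes taking a @code{FILE *} are declared.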
@end deftypefun


@need 2000
@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
@comment node-name, next, previous, up
@section Random Number Functions
@cindex Integer random number functions
@cindex Random number functions

The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified. Please see @ref{Random Number Functions}
for more information on how to use and not to use random
number functions.

@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
Generate a uniform random integer in the range 0 to @math{@var{n}-1},
inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a random integer with long strings of zeros and ones in the
binary representation. Useful for testing functions and algorithms,
since random numbers of this kind have proven to be more likely to
trigger corner-case bugs. The random number will be in the range
@m{2^{n-1}, 2^@var{n@minus{}1}} to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs. The generated
random number doesn't satisfy any particular requirements of randomness.
Negative random numbers are generated when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_urandomb} or
@code{mpz_urandomm} instead.
@end deftypefun

@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs, with long strings
of zeros and ones in the binary representation. Useful for testing functions
and algorithms, since random numbers of this kind have proven to be more
likely to trigger corner-case bugs. Negative random numbers are generated
when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_rrandomb} instead.
@end deftypefun


@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
@section Integer Import and Export

@code{mpz_t} variables can be converted to and from arbitrary words of binary
data with the following functions.

@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
@cindex Integer import
@cindex Import
Set @var{rop} from an array of word data at @var{op}.

The parameters specify the format of the data. @var{count} many words are
read, each @var{size} bytes. @var{order} can be 1 for most significant word
first or -1 for least significant first.
Within each word @var{endian} can be
1 for most significant byte first, -1 for least significant first, or 0 for
the native endianness of the host CPU@. The most significant @var{nails} bits
of each word are skipped; this can be 0 to use the full words.

There is no sign taken from the data; @var{rop} will simply be a non-negative
integer. An application can handle any sign itself, and apply it for instance
with @code{mpz_neg}.

There are no data alignment restrictions on @var{op}; any address is allowed.

Here's an example converting an array of @code{unsigned long} data, most
significant element first, and host byte order within each value.

@example
unsigned long a[20];
/* Initialize @var{z} and @var{a} */
mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
@end example

This example assumes the full @code{sizeof} bytes are used for data in the
given type, which is usually true, and certainly true for @code{unsigned long}
everywhere we know of. However on Cray vector systems it may be noted that
@code{short} and @code{int} are always stored in 8 bytes (and with
@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails}
feature can account for this, by passing for instance
@code{8*sizeof(int)-INT_BIT}.
@end deftypefun

@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op})
@cindex Integer export
@cindex Export
Fill @var{rop} with word data from @var{op}.

The parameters specify the format of the data produced. Each word will be
@var{size} bytes and @var{order} can be 1 for most significant word first or
-1 for least significant first.
Within each word @var{endian} can be 1 for
most significant byte first, -1 for least significant first, or 0 for the
native endianness of the host CPU@. The most significant @var{nails} bits of
each word are unused and set to zero, this can be 0 to produce full words.

The number of words produced is written to @code{*@var{countp}}, or
@var{countp} can be @code{NULL} to discard the count. @var{rop} must have
enough space for the data, or if @var{rop} is @code{NULL} then a result array
of the necessary size is allocated using the current GMP allocation function
(@pxref{Custom Allocation}). In either case the return value is the
destination used, either @var{rop} or the allocated block.

If @var{op} is non-zero then the most significant word produced will be
non-zero. If @var{op} is zero then the count returned will be zero and
nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no
block is allocated, just @code{NULL} is returned.

The sign of @var{op} is ignored, just the absolute value is exported. An
application can use @code{mpz_sgn} to get the sign and handle it as desired.
(@pxref{Integer Comparisons})

There are no data alignment restrictions on @var{rop}, any address is allowed.

When an application is allocating space itself the required size can be
determined with a calculation like the following. Since @code{mpz_sizeinbase}
always returns at least 1, @code{count} here will be at least one, which
avoids any portability problems with @code{malloc(0)}, though if @code{z} is
zero no space at all is actually needed (or written).

@example
numb = 8*size - nail;
count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
p = malloc (count * size);
@end example
@end deftypefun


@need 2000
@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous integer functions
@cindex Integer miscellaneous functions

@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
short int}, or @code{signed short int}, respectively. Otherwise, return zero.
@end deftypefun

@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
Determine whether @var{op} is odd or even, respectively. Return non-zero if
yes, zero if no. These macros evaluate their argument more than once.
@end deftypefn

@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
@cindex Size in digits
@cindex Digits in an integer
Return the size of @var{op} measured in number of digits in the given
@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is
ignored, just the absolute value is used. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact. If
@var{op} is zero the return value is always 1.

This function can be used to determine the space required when converting
@var{op} to a string. The right amount of allocation is normally two more
than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
and one for the null-terminator.

@cindex Most significant bit
It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1. (Unlike the bitwise
functions which start from 0, @xref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
@end deftypefun


@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions
@section Special Functions
@cindex Special integer functions
@cindex Integer special functions

The functions in this section are for various special purposes. Most
applications will not need them.

@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
@strong{This is an obsolete function. Do not use it.}
@end deftypefun

@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
@var{integer} is preserved if it fits, or is set to 0 if not. The return
value is not useful to applications and should be ignored.

@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
@code{_mpz_realloc} takes its size in limbs.
@end deftypefun

@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
just the absolute value is used. The least significant limb is number 0.

@code{mpz_size} can be used to find how many limbs make up @var{op}.
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
@code{mpz_size(@var{op})-1}.
@end deftypefun

@deftypefun size_t mpz_size (const mpz_t @var{op})
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
the returned value will be zero.
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
@end deftypefun

@deftypefun {const mp_limb_t *} mpz_limbs_read (const mpz_t @var{x})
Return a pointer to the limb array representing the absolute value of @var{x}.
The size of the array is @code{mpz_size(@var{x})}. Intended for read access
only.
@end deftypefun

@deftypefun {mp_limb_t *} mpz_limbs_write (mpz_t @var{x}, mp_size_t @var{n})
@deftypefunx {mp_limb_t *} mpz_limbs_modify (mpz_t @var{x}, mp_size_t @var{n})
Return a pointer to the limb array, intended for write access. The array is
reallocated as needed, to make room for @var{n} limbs. Requires @math{@var{n}
> 0}. The @code{mpz_limbs_modify} function returns an array that holds the old
absolute value of @var{x}, while @code{mpz_limbs_write} may destroy the old
value and return an array with unspecified contents.
@end deftypefun

@deftypefun void mpz_limbs_finish (mpz_t @var{x}, mp_size_t @var{s})
Updates the internal size field of @var{x}. Used after writing to the limb
array pointer returned by @code{mpz_limbs_write} or @code{mpz_limbs_modify} is
completed. The array should contain @math{@GMPabs{@var{s}}} valid limbs,
representing the new absolute value for @var{x}, and the sign of @var{x} is
taken from the sign of @var{s}. This function never reallocates @var{x}, so
the limb pointer remains valid.
@end deftypefun

@c FIXME: Some more useful and less silly example?
@example
/* Append the limbs of x in reverse order above the existing ones. */
void foo (mpz_t x)
@{
  mp_size_t n, i;
  mp_limb_t *xp;

  n = mpz_size (x);
  if (n == 0)
    return;   /* mpz_limbs_modify requires a positive limb count */
  xp = mpz_limbs_modify (x, 2*n);
  for (i = 0; i < n; i++)
    xp[n+i] = xp[n-1-i];
  mpz_limbs_finish (x, mpz_sgn (x) < 0 ? - 2*n : 2*n);
@}
@end example

@deftypefun mpz_srcptr mpz_roinit_n (mpz_t @var{x}, const mp_limb_t *@var{xp}, mp_size_t @var{xs})
Special initialization of @var{x}, using the given limb array and size.
@var{x} should be treated as read-only: it can be passed safely as input to
any mpz function, but not as an output. The array @var{xp} must point to at
least a readable limb, its size is
@math{@GMPabs{@var{xs}}}, and the sign of @var{x} is the sign of @var{xs}. For
convenience, the function returns @var{x}, but cast to a const pointer type.
@end deftypefun

@example
void foo (mpz_t x)
@{
  static const mp_limb_t y[3] = @{ 0x1, 0x2, 0x3 @};
  mpz_t tmp;
  mpz_add (x, x, mpz_roinit_n (tmp, y, 3));
@}
@end example

@deftypefn Macro mpz_t MPZ_ROINIT_N (mp_limb_t *@var{xp}, mp_size_t @var{xs})
This macro expands to an initializer which can be assigned to an mpz_t
variable. The limb array @var{xp} must point to at least a readable limb,
moreover, unlike the @code{mpz_roinit_n} function, the array must be
normalized: if @var{xs} is non-zero, then
@code{@var{xp}[@math{@GMPabs{@var{xs}}-1}]} must be non-zero. Intended
primarily for constant values. Using it for non-constant values requires a C
compiler supporting C99.
@end deftypefn

@example
void foo (mpz_t x)
@{
  static const mp_limb_t ya[3] = @{ 0x1, 0x2, 0x3 @};
  static const mpz_t y = MPZ_ROINIT_N ((mp_limb_t *) ya, 3);

  mpz_add (x, x, y);
@}
@end example


@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
@comment node-name, next, previous, up
@chapter Rational Number Functions
@cindex Rational number functions

This chapter describes the GMP functions for performing arithmetic on rational
numbers. These functions start with the prefix @code{mpq_}.

Rational numbers are stored in objects of type @code{mpq_t}.

All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result. The canonical form means that the denominator and
the numerator have no common factors, and that the denominator is positive.
Zero has the unique representation 0/1.

Pure assignment functions do not canonicalize the assigned variable. It is
the responsibility of the user to canonicalize the assigned variable before
any arithmetic operations are performed on that variable.

@deftypefun void mpq_canonicalize (mpq_t @var{op})
Remove any factors that are common to the numerator and denominator of
@var{op}, and make the denominator positive.
@end deftypefun

@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu

@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Initialization and Assignment Functions
@cindex Rational assignment functions
@cindex Assignment functions
@cindex Rational initialization functions
@cindex Initialization functions

@deftypefun void mpq_init (mpq_t @var{x})
Initialize @var{x} and set it to 0/1. Each variable should normally only be
initialized once, or at least cleared out (using the function @code{mpq_clear})
between each initialization.
@end deftypefun

@deftypefun void mpq_inits (mpq_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
values to 0/1.
@end deftypefun

@deftypefun void mpq_clear (mpq_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpq_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpq_clears (mpq_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
@end deftypefun

@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
Assign @var{rop} from @var{op}.
@end deftypefun

@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
Set the value of @var{rop} to @var{op1}/@var{op2}.
Note that if @var{op1} and
@var{op2} have common factors, @var{rop} has to be passed to
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
@end deftypefun

@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.

The string can be an integer like ``41'' or a fraction like ``41/152''. The
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
not then @code{mpq_canonicalize} must be called.

The numerator and optional denominator are parsed the same as in
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
the string, and is simply ignored. The @var{base} can vary from 2 to 62, or
if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
@code{0b} or @code{0B} for binary,
@code{0} for octal, or decimal otherwise. Note that this is done separately
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
whereas @code{0xEF/0x100} is 239/256.

The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
@end deftypefun

@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@need 2000
@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Rational conversion functions
@cindex Conversion functions

@deftypefun double mpq_get_d (const mpq_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent. If it is too big, an infinity is
returned when available. If it is too small, @math{0.0} is normally returned.
Hardware overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the value of @var{op}. There is no rounding; this conversion
is exact.
@end deftypefun

@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base may vary
from 2 to 36. The string will be of the form @samp{num/den}, or if the
denominator is 1 then just @samp{num}.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(@var{result})+1} bytes, where @var{result} is the returned
string, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being

@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example

The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Rational arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
Set @var{sum} to @var{addend1} + @var{addend2}.
@end deftypefun

@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
@end deftypefun

@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
@end deftypefun

@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
@cindex Division functions
Set @var{quotient} to @var{dividend}/@var{divisor}.
@end deftypefun

@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
Set @var{negated_operand} to @minus{}@var{operand}.
@end deftypefun

@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
Set @var{inverted_number} to 1/@var{number}. If the new denominator is
zero, this routine will divide by zero.
@end deftypefun

@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Rational comparison functions
@cindex Comparison functions

@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
@deftypefunx int mpq_cmp_z (const mpq_t @var{op1}, const mpz_t @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
@end deftypefun

@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
@var{num2}/@var{den2}}.

@var{num2} and @var{den2} are allowed to have common factors.

These functions are implemented as macros and evaluate their arguments
multiple times.
@end deftypefn

@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
@cindex Sign tests
@cindex Rational sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro.
It evaluates its
argument multiple times.
@end deftypefn

@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
function is much faster.
@end deftypefun

@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Applying Integer Functions to Rationals
@cindex Rational numerator and denominator
@cindex Numerator and denominator

The set of @code{mpq} functions is quite small. In particular, there are few
functions for either input or output. The following functions give direct
access to the numerator and denominator of an @code{mpq_t}.

Note that if an assignment to the numerator and/or denominator could take an
@code{mpq_t} out of the canonical form described at the start of this chapter
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
called before any other @code{mpq} functions are applied to that @code{mpq_t}.

@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
Return a reference to the numerator and denominator of @var{op}, respectively.
The @code{mpz} functions can be used on the result of these macros.
@end deftypefn

@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
Get or set the numerator or denominator of a rational.
These functions are
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
recommended instead of these functions.
@end deftypefun


@need 2000
@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Rational input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

The functions in this section read @code{mpq} numbers from a stdio stream or
write them to one. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base may vary from 2 to 36. Output is in the form
@samp{num/den} or if the denominator is 1 then just @samp{num}.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string of digits from @var{stream} and convert them to a rational in
@var{rop}. Any initial white-space characters are read and discarded. Return
the number of characters read (including white space), or 0 if a rational
could not be read.

The input can be a fraction like @samp{17/63} or just an integer like
@samp{123}. Reading stops at the first character not in this form, and white
space is not permitted within the string. If the input might not be in
canonical form, then @code{mpq_canonicalize} must be called
(@pxref{Rational Number Functions}).

The @var{base} can be between 2 and 36, or can be 0 in which case the leading
characters of the string determine the base, @samp{0x} or @samp{0X} for
hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
are examined separately for the numerator and denominator of a fraction, so
for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
@math{16/17}.
@end deftypefun


@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
@comment node-name, next, previous, up
@chapter Floating-point Functions
@cindex Floating-point functions
@cindex Float functions
@cindex User-defined precision
@cindex Precision of floats

GMP floating point numbers are stored in objects of type @code{mpf_t} and
functions operating on them have an @code{mpf_} prefix.

The mantissa of each float has a user-selectable precision, in practice only
limited by available memory. Each variable has its own precision, and that can
be increased or decreased at any time. This selectable precision is a minimum
value; GMP rounds it up to a whole limb.

The accuracy of a calculation is determined by the previously set precision of the
destination variable and the numeric values of the input variables. Input
variables' set precisions do not affect calculations (except indirectly as
their values might have been affected when they were assigned).

The exponent of each float has fixed precision, one machine word on most
systems.
In the current implementation the exponent is a count of limbs, so
for example on a 32-bit system this means a range of roughly
@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
this will be much greater. Note however that @code{mpf_get_str} can only
return an exponent which fits an @code{mp_exp_t} and currently
@code{mpf_set_str} doesn't accept exponents bigger than a @code{long}.

Each variable keeps track of the mantissa data actually in use. This means
that if a float is exactly represented in only a few bits then only those bits
will be used in a calculation, even if the variable's selected precision is
high. This is a performance optimization; it does not affect the numeric
results.

Internally, GMP sometimes calculates with higher precision than that of the
destination variable in order to limit errors. Final results are always
truncated to the destination variable's precision.

The mantissa is stored in binary. One consequence of this is that decimal
fractions like @math{0.1} cannot be represented exactly. The same is true of
plain IEEE @code{double} floats. This makes both highly unsuitable for
calculations involving money or other values that should be exact decimal
fractions. (Suitably scaled integers, or perhaps rationals, are better
choices.)

The @code{mpf} functions and variables have no special notion of infinity or
not-a-number, and applications must take care not to overflow the exponent or
results will be unpredictable.

Note that the @code{mpf} functions are @emph{not} intended as a smooth
extension to IEEE P754 arithmetic. In particular results obtained on one
computer often differ from the results on a computer with a different word
size.

New projects should consider using the GMP extension library MPFR
(@url{http://mpfr.org}) instead.
MPFR provides well-defined precision and
accurate rounding, and thereby naturally extends IEEE P754.

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu

@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Float initialization functions
@cindex Initialization functions

@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
Set the default precision to be @strong{at least} @var{prec} bits. All
subsequent calls to @code{mpf_init} will use this precision, but previously
initialized variables are unaffected.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
Return the default precision actually used.
@end deftypefun

An @code{mpf_t} object must be initialized before storing the first value in
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.

@deftypefun void mpf_init (mpf_t @var{x})
Initialize @var{x} to 0. Normally, a variable should be initialized once only
or at least be cleared, using @code{mpf_clear}, between initializations. The
precision of @var{x} is undefined unless a default precision has already been
established by a call to @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
Initialize @var{x} to 0 and set its precision to be @strong{at least}
@var{prec} bits. Normally, a variable should be initialized once only or at
least be cleared, using @code{mpf_clear}, between initializations.
@end deftypefun

@deftypefun void mpf_inits (mpf_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
values to 0. The precision of the initialized variables is undefined unless a
default precision has already been established by a call to
@code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_clear (mpf_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpf_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpf_clears (mpf_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
@end deftypefun

@need 2000
Here is an example of how to initialize floating-point variables:
@example
@{
  mpf_t x, y;
  mpf_init (x);           /* use default precision */
  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
  @dots{}
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
@}
@end example

The following three functions are useful for changing the precision during a
calculation. A typical use would be for adjusting the precision gradually in
iterative algorithms like Newton-Raphson, making the computation precision
closely match the actual accurate part of the numbers.

@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
Return the current precision of @var{op}, in bits.
@end deftypefun

@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
value in @var{rop} will be truncated to the new precision.

This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
@end deftypefun

@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
without changing the memory allocated.

@var{prec} must be no more than the allocated precision for @var{rop}, that
being the precision when @var{rop} was initialized, or in the most recent
@code{mpf_set_prec}.

The value in @var{rop} is unchanged, and in particular if it had a higher
precision than @var{prec} it will retain that higher precision. New values
written to @var{rop} will use the new @var{prec}.

Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
allocated precision. Failing to do so will have unpredictable results.

@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
@var{prec} value set.

@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
different precisions during a calculation, perhaps to gradually increase
precision in an iteration, or just to use various different precisions for
different purposes during a calculation.
@end deftypefun


@need 2000
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions

These functions assign new values to already initialized floats
(@pxref{Initializing Floats}).

@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
Set the value of @var{rop} from @var{op}.
@end deftypefun

@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from the string in @var{str}. The string is of the
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
in the specified base. The exponent is either in the specified base or, if
@var{base} is negative, in decimal. The decimal point expected is taken from
the current locale, on systems providing @code{localeconv}.

The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

White space is allowed in the string, and is simply ignored. [This is not
really true; white-space is ignored in the beginning of the string and within
the mantissa, but not in other places, such as after a minus sign or in the
exponent.
We are considering changing the definition of this function, making
it fail when there is any white-space in the input, since that makes a lot of
sense. Please tell us your opinion about this change. Do you really want it
to accept @nicode{"3 14"} as meaning 314 as it does now?]

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@end deftypefun

@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
precisions of the two variables are swapped.
@end deftypefun


@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions
@cindex Float initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpf_init_set@dots{}}

Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
float functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
Initialize @var{rop} and set its value from @var{op}.

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value from the string in @var{str}. See
@code{mpf_set_str} above for details on the assignment operation.

Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
call @code{mpf_clear} for it.)

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun


@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Float conversion functions
@cindex Conversion functions

@deftypefun double mpf_get_d (const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent in @var{op} is too big or too small to fit a @code{double}
then the result is system dependent. If it is too big, an infinity is returned
when available; if it is too small, @math{0.0} is normally returned. Hardware
overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and with an exponent returned separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} \times 2^{exp},
@var{d} * 2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero,
the return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
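
The normalization described above can be tried out on plain @code{double}
values. The following is only a sketch: @code{split_d_2exp} is a hypothetical
helper, not a GMP function, and it relies on the standard C @code{frexp}, which
follows the same 0.5 <= |d| < 1 convention. It does not model the truncation
of a high-precision @code{mpf_t} value.

```c
#include <math.h>

/* Hypothetical helper (not part of GMP): split a double into a fraction
   d with 0.5 <= |d| < 1 and an exponent, mirroring the convention that
   mpf_get_d_2exp is documented to use.  For zero, d is 0.0 and the
   exponent is 0. */
double split_d_2exp (long *exp, double op)
{
  int e;
  double d = frexp (op, &e);   /* standard C: op == d * 2^e */
  *exp = e;
  return d;
}
```

For instance, 3.0 splits into 0.75 with exponent 2, since 0.75 * 2^2 = 3.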

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun long mpf_get_si (const mpf_t @var{op})
@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
fraction part. If @var{op} is too big for the return type, the result is
undefined.

See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
(@pxref{Miscellaneous Float Functions}).
@end deftypefun

@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits}
digits will be generated. Trailing zeros are not returned. No more digits
than can be accurately represented by @var{op} are ever generated. If
@var{n_digits} is 0 then that accurate maximum number of digits is generated.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of
@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
possible minus sign, and a null-terminator.
When @var{n_digits} is 0 to get
all significant digits, an application won't be able to know the space
required, and @var{str} should be @code{NULL} in that case.

The generated string is a fraction, with an implicit radix point immediately
to the left of the first digit. The applicable exponent is written through
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
string @nicode{"31416"} and exponent 1.

When @var{op} is zero, an empty string is produced and the exponent returned
is 0.

A pointer to the result string is returned, being either the allocated block
or the given @var{str}.
@end deftypefun


@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Float arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

Division is undefined if the divisor is zero, and passing a zero divisor to the
divide functions will make these functions intentionally divide by zero. This
lets the user handle arithmetic exceptions in these functions in the same
manner as other arithmetic exceptions.

@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Division functions
Set @var{rop} to @var{op1}/@var{op2}.
@end deftypefun

@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
@cindex Root extraction functions
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
@end deftypefun

@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Exponentiation functions
@cindex Powering functions
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
@end deftypefun

@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Float comparison functions
@cindex Comparison functions

@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx int mpf_cmp_z (const mpf_t @var{op1}, const mpz_t @var{op2})
@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
a NaN.
@end deftypefun

@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t @var{op3})
@strong{This function is mathematically ill-defined and should not be used.}

Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
equal, zero otherwise. Note that numbers such as 256 (binary 100000000) and
255 (binary 11111111) will never be equal by this function's measure, and
furthermore that 0 will only be equal to itself.
@end deftypefun

@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
Compute the relative difference between @var{op1} and @var{op2} and store the
result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
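
To make the formula concrete, here is a small sketch in plain @code{double}
arithmetic. @code{reldiff_model} is a hypothetical name used for illustration
only; it is not GMP code, it ignores the case where @var{op1} is zero, and it
works at @code{double} precision rather than @code{mpf_t} precision.

```c
/* Hypothetical double-precision model of the documented formula
   |op1 - op2| / op1.  Not GMP code; the caller must ensure op1 != 0. */
double reldiff_model (double op1, double op2)
{
  double diff = op1 - op2;
  if (diff < 0)
    diff = -diff;          /* absolute value of the difference */
  return diff / op1;       /* the divisor is op1 itself, not |op1| */
}
```

With @var{op1} = 4 and @var{op2} = 3 this gives 0.25. Taking the formula as
written, the divisor is @var{op1} rather than its absolute value, so the model
returns a negative value when @var{op1} is negative.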
@end deftypefun

@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
@cindex Sign tests
@cindex Float sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its argument
multiple times.
@end deftypefn

@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Float input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpf} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Print @var{op} to @var{stream}, as a string of digits. Return the number of
bytes written, or if an error occurred, return 0.

The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is
then printed, separated by an @samp{e}, or if the base is greater than 10 then
by an @samp{@@}. The exponent is always in decimal. The decimal point follows
the current locale, on systems providing @code{localeconv}.
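
The output layout can be pictured with a short sketch that assembles such a
string from a digit string and an exponent of the kind @code{mpf_get_str}
returns. @code{format_mpf_style} is a hypothetical helper, not a GMP function.
It assumes a positive @var{base} and a @samp{.} decimal point, and it does not
consult the locale.

```c
#include <stdio.h>

/* Hypothetical helper (not part of GMP): build "0.<digits>e<exp>", or
   "0.<digits>@<exp>" for bases above 10, from a digit string with an
   optional leading '-' and a decimal exponent.  Only positive bases are
   considered.  Returns the snprintf result. */
int format_mpf_style (char *buf, size_t size, const char *digits,
                      long exp, int base)
{
  const char *sign = "";
  if (digits[0] == '-')
    {
      sign = "-";          /* the sign goes in front of the "0." prefix */
      digits++;
    }
  return snprintf (buf, size, "%s0.%s%c%ld", sign, digits,
                   base > 10 ? '@' : 'e', exp);
}
```

Under these assumptions the digits @nicode{"31416"} with exponent 1 in base 10
come out as @samp{0.31416e1}, and the digits @nicode{"ff"} with exponent 2 in
base 16 come out as @samp{0.ff@@2}.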

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Up to @var{n_digits} digits will be printed from the mantissa, except that no
more digits than are accurately representable by @var{op} will be printed.
@var{n_digits} can be 0 to select that accurate maximum.
@end deftypefun

@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string in base @var{base} from @var{stream}, and put the read float in
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
exponent. The mantissa is always in the specified base. The exponent is
either in the specified base or, if @var{base} is negative, in decimal. The
decimal point expected is taken from the current locale, on systems providing
@code{localeconv}.

The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
@c Output @var{float} on stdio stream @var{stream}, in raw binary
@c format. The float is written in a portable format, with 4 bytes of
@c size information, and that many bytes of limbs. Both the size and the
@c limbs are written in decreasing significance order.
@c @end deftypefun

@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
@c Input from stdio stream @var{stream} in the format written by
@c @code{mpf_out_raw}, and put the result in @var{float}.
@c @end deftypefun


@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous float functions
@cindex Float miscellaneous functions

@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
@cindex Rounding functions
@cindex Float rounding functions
Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
to the integer towards zero.
@end deftypefun

@deftypefun int mpf_integer_p (const mpf_t @var{op})
Return non-zero if @var{op} is an integer.
@end deftypefun

@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
Return non-zero if @var{op} would fit in the respective C data type, when
truncated to an integer.
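
The phrase ``when truncated to an integer'' means the fraction part is
discarded before the range check. The following sketch illustrates that rule
for @code{int} using plain @code{double} values. @code{fits_sint_model} is a
hypothetical name and not GMP code, and the sketch assumes every @code{int}
value is exactly representable as a @code{double}.

```c
#include <limits.h>

/* Hypothetical double-based model of the "fits when truncated" rule for
   int.  Not GMP code.  Truncation toward zero lands inside
   [INT_MIN, INT_MAX] exactly when op lies strictly between INT_MIN - 1
   and INT_MAX + 1. */
int fits_sint_model (double op)
{
  return op > (double) INT_MIN - 1.0 && op < (double) INT_MAX + 1.0;
}
```

Under this rule a value half a unit above @code{INT_MAX} still fits, because
it truncates to @code{INT_MAX}, whereas @code{INT_MAX} plus one does not.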
@end deftypefun

@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
@cindex Random number functions
@cindex Float random number functions
Generate a uniformly distributed random float in @var{rop}, such that @math{0
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
less if the precision of @var{rop} is smaller.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
Generate a random float of at most @var{max_size} limbs, with long strings of
zeros and ones in the binary representation. The exponent of the number is in
the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs. Negative
random numbers are generated when @var{max_size} is negative.
@end deftypefun

@c @deftypefun size_t mpf_size (const mpf_t @var{op})
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
@c explanation of the concept @dfn{limb}.)
@c
@c @strong{This function is obsolete. It will disappear from future GMP
@c releases.}
@c @end deftypefun


@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
@comment node-name, next, previous, up
@chapter Low-level Functions
@cindex Low-level functions

This chapter describes low-level GMP functions, used to implement the
high-level GMP functions, but also intended for time-critical user code.

These functions start with the prefix @code{mpn_}.

@c 1. Some of these functions clobber input operands.
@c

The @code{mpn} functions are designed to be as fast as possible, @strong{not}
to provide a coherent calling interface. The different functions have somewhat
similar interfaces, but there are variations that make them hard to use. These
functions do as little as possible apart from the real multiple precision
computation, so that no time is spent on things that not all callers need.

A source operand is specified by a pointer to the least significant limb and a
limb count. A destination operand is specified by just a pointer. It is the
responsibility of the caller to ensure that the destination has enough space
for storing the result.

With this way of specifying operands, it is possible to perform computations on
subranges of an argument, and store the result into a subrange of a
destination.

A common requirement for all functions is that each source area needs at least
one limb. No size argument may be zero. Unless otherwise stated, in-place
operations are allowed where source and destination are the same, but not where
they only partly overlap.

The @code{mpn} functions are the base for the implementation of the
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.

This example adds the number beginning at @var{s1p} and the number beginning at
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.

@example
cy = mpn_add_n (destp, s1p, s2p, n)
@end example

It should be noted that the @code{mpn} functions make no attempt to identify
high or low zero limbs on their operands, or other special forms. On random
data such cases will be unlikely and it'd be wasteful for every function to
check every time.
An application knowing something about its data can take
steps to trim or perhaps split its calculations.
@c
@c For reference, within gmp mpz_t operands never have high zero limbs, and
@c we rate low zero limbs as unlikely too (or something an application should
@c handle). This is a prime motivation for not stripping zero limbs in say
@c mpn_mul_n etc.
@c
@c Other applications doing variable-length calculations will quite likely do
@c something similar to mpz. And even if not then it's highly likely zero
@c limb stripping can be done at just a few judicious points, which will be
@c more efficient than having lots of mpn functions checking every time.

@sp 1
@noindent
In the notation used below, a source operand is identified by the pointer to
the least significant limb, and the limb count in braces. For example,
@{@var{s1p}, @var{s1n}@}.

@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
least significant limbs of the result to @var{rp}. Return carry, either 0 or
1.

This is the lowest-level function for addition. It is the preferred function
for addition, since it is written in assembly for most CPUs. For addition of
a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
with a count of 1 for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
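
The limb-array conventions used by @code{mpn_add_n} and @code{mpn_add_1} can
be pictured with a portable sketch. This is not GMP's implementation, which is
typically assembly. @code{add_n_model} and @code{add_1_model} are hypothetical
names, and the 32-bit limb type is an assumption made so that a 64-bit
intermediate can hold the carry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit-limb model of the mpn_add_n contract (not GMP code):
   add two n-limb numbers stored least significant limb first, write n limbs
   at rp and return the carry out of the top limb (0 or 1).  rp may be the
   same pointer as s1p or s2p. */
uint32_t add_n_model (uint32_t *rp, const uint32_t *s1p,
                      const uint32_t *s2p, size_t n)
{
  uint32_t carry = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + s2p[i] + carry;
      rp[i] = (uint32_t) sum;           /* low 32 bits of the limb sum */
      carry = (uint32_t) (sum >> 32);   /* 0 or 1 */
    }
  return carry;
}

/* Same idea for the mpn_add_1 contract: add one limb to an n-limb number. */
uint32_t add_1_model (uint32_t *rp, const uint32_t *s1p, size_t n,
                      uint32_t s2limb)
{
  uint32_t carry = s2limb;              /* the single limb enters at limb 0 */
  for (size_t i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + carry;
      rp[i] = (uint32_t) sum;
      carry = (uint32_t) (sum >> 32);
    }
  return carry;
}
```

In this model the destination receives exactly @var{n} limbs, so a carry out
of the top limb is visible only through the return value; the caller decides
whether to store it in an extra limb.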
@end deftypefun

@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
@var{n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This is the lowest-level function for subtraction. It is the preferred
function for subtraction, since it is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
@end deftypefun

@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to
@var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
@{@var{rp}, @var{n}@}.
This is equivalent to calling @code{mpn_sub_n} with an
@var{n}-limb zero minuend and passing @{@var{sp}, @var{n}@} as subtrahend.
Return borrow, either 0 or 1.
@end deftypefun

@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
2*@var{n}-limb result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero. No overlap is permitted between the
destination and either source.

If the two input operands are the same, use @code{mpn_sqr}.
@end deftypefun

@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
(@var{s1n}+@var{s2n})-limb result to @var{rp}. Return the most significant
limb of the result.

The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
product's most significant limb is zero. No overlap is permitted between the
destination and either source.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the result's
most significant limb is zero. No overlap is permitted between the
destination and the source.
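
The size rule for products, namely that the destination always receives every
limb of the result even when the top limb is zero, can be seen in a schoolbook
model. This is a sketch under assumptions, not the algorithm GMP uses:
@code{limb_t} and @code{ref_mul} are hypothetical names, and a 32-bit limb is
chosen so that each partial product fits in a @code{uint64_t}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* 32-bit stand-in limb so products fit in uint64_t */

/* Model of mpn_mul: {rp, s1n+s2n} = {s1p, s1n} * {s2p, s2n}.
   Returns the most significant limb of the product, which may be zero.
   The destination must not overlap either source. */
static limb_t ref_mul(limb_t *rp, const limb_t *s1p, size_t s1n,
                      const limb_t *s2p, size_t s2n)
{
    assert(s1n >= s2n && s2n >= 1);
    for (size_t i = 0; i < s1n + s2n; i++)
        rp[i] = 0;
    for (size_t j = 0; j < s2n; j++) {
        uint64_t carry = 0;
        for (size_t i = 0; i < s1n; i++) {
            /* limb * limb + limb + limb never exceeds 2^64 - 1 */
            uint64_t t = (uint64_t)s1p[i] * s2p[j] + rp[i + j] + carry;
            rp[i + j] = (limb_t)t;   /* low half stays in place */
            carry = t >> 32;         /* high half moves up one limb */
        }
        rp[j + s1n] = (limb_t)carry; /* this limb has not been written before */
    }
    return rp[s1n + s2n - 1];
}
```

Multiplying the one-limb values 3 and 5 in this model still writes two limbs,
15 and 0, which is the case the space requirement above is guarding against.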
@end deftypefun

@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
significant limbs of the product to @var{rp}. Return the most significant
limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.

Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal
speed.
@end deftypefun

@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
to @var{rp}. Return the most significant limb of the product, plus carry-out
from the addition. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
result to @var{rp}. Return the most significant limb of the product, plus
borrow-out from the subtraction.
@{@var{s1p}, @var{n}@} and @{@var{rp},
@var{n}@} are allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication and division as well as other operations in GMP@. It is written
in assembly for most CPUs.
@end deftypefun

@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
@var{dn}@}. The quotient is rounded towards 0.

No overlap is permitted between arguments, except that @var{np} might equal
@var{rp}. The dividend size @var{nn} must be greater than or equal to divisor
size @var{dn}. The most significant limb of the divisor must be non-zero. The
@var{qxn} operand must be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]

Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
quotient at @var{r1p}, with the exception of the most significant limb, which
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
@var{s3n} limbs long (i.e., as many limbs as the divisor).

In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
stored after the integral limbs. For most usages, @var{qxn} will be zero.

It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
required that the most significant bit of the divisor is set.

If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
from that special case, no overlap between arguments is permitted.

Return the most significant limb of the quotient, either 0 or 1.

The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
limbs large.
@end deftypefun

@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
@var{r1p}. Return the remainder.

The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
usages, @var{qxn} will be zero.

@code{mpn_divmod_1} exists for upward source compatibility and is simply a
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.

The areas at @var{r1p} and @var{s2p} have to be identical or completely
separate, not partially overlapping.
@end deftypefn

@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]
@end deftypefun

@deftypefun void mpn_divexact_1 (mp_limb_t * @var{rp}, const mp_limb_t * @var{sp}, mp_size_t @var{n}, mp_limb_t @var{d})
Divide @{@var{sp}, @var{n}@} by @var{d}, expecting it to divide exactly, and
writing the result to @{@var{rp}, @var{n}@}.
If @var{d} doesn't divide
exactly, the value written to @{@var{rp}, @var{n}@} is undefined. The areas at
@var{rp} and @var{sp} have to be identical or completely separate, not
partially overlapping.
@end deftypefun

@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
zero and the result is the quotient. If not, the return value is non-zero and
the result won't be anything useful.

@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
@code{mpn_divexact_by3c} with a 0 carry parameter.

These routines use a multiply-by-inverse and will be faster than
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.

The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The
return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
@code{mp_bits_per_limb} is even, which is always so currently).
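
The identity above can be checked with a small multiply-by-inverse model. It
is a sketch under assumptions rather than GMP's code: the names @code{limb_t},
@code{INVERSE_3} and @code{ref_divexact_by3c} are hypothetical, and a 32-bit
limb is assumed so that @math{b = 2^32}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;        /* b = 2^32 in this model */
#define INVERSE_3 0xAAAAAAABu   /* 3 * INVERSE_3 == 1 (mod 2^32) */

/* Model of mpn_divexact_by3c: on return, c*b^n + a - carry == 3*q, where
   a = {sp, n}, q = {rp, n} and c is the return value (0, 1 or 2). */
static limb_t ref_divexact_by3c(limb_t *rp, const limb_t *sp, size_t n, limb_t carry)
{
    assert(carry <= 2);
    limb_t c = carry;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        limb_t borrow = s < c;      /* borrow out of s - c */
        limb_t l = s - c;           /* wraps modulo 2^32 */
        limb_t q = l * INVERSE_3;   /* now 3*q == l (mod 2^32) */
        rp[i] = q;
        /* 3*q lies in [0, 3b); its high part plus the borrow is what the
           next limb up has to give back. */
        c = borrow + (limb_t)(((uint64_t)3 * q) >> 32);
    }
    return c;
}
```

For the one-limb input 10 this model returns @math{c = 2}, and
@math{3 - c = 1} is indeed 10 modulo 3, as the text above states.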
@end deftypefn

@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
@var{s1n} can be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
least significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @ge{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
most significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @le{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
negative value if @math{@var{s1} < @var{s2}}.
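
The return conventions of the shift functions and of @code{mpn_cmp} can be
sketched in portable C. The names @code{limb_t}, @code{LIMB_BITS},
@code{ref_lshift}, @code{ref_rshift} and @code{ref_cmp} are hypothetical, a
64-bit limb without nails is assumed, and nothing here is taken from GMP's
assembly code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t (assumed 64 bits, no nails) */
#define LIMB_BITS 64

/* Model of mpn_lshift. Walking from the high limb down is what makes an
   overlapping destination safe when rp >= sp. */
static limb_t ref_lshift(limb_t *rp, const limb_t *sp, size_t n, unsigned count)
{
    assert(n >= 1 && count >= 1 && count < LIMB_BITS);
    limb_t out = sp[n - 1] >> (LIMB_BITS - count);  /* bits leaving at the top */
    for (size_t i = n - 1; i > 0; i--)
        rp[i] = (sp[i] << count) | (sp[i - 1] >> (LIMB_BITS - count));
    rp[0] = sp[0] << count;
    return out;
}

/* Model of mpn_rshift. Walking from the low limb up is what makes an
   overlapping destination safe when rp <= sp. */
static limb_t ref_rshift(limb_t *rp, const limb_t *sp, size_t n, unsigned count)
{
    assert(n >= 1 && count >= 1 && count < LIMB_BITS);
    limb_t out = sp[0] << (LIMB_BITS - count);      /* bits leaving at the bottom */
    for (size_t i = 0; i + 1 < n; i++)
        rp[i] = (sp[i] >> count) | (sp[i + 1] << (LIMB_BITS - count));
    rp[n - 1] = sp[n - 1] >> count;
    return out;
}

/* Model of mpn_cmp: compare from the most significant limb downwards. */
static int ref_cmp(const limb_t *s1p, const limb_t *s2p, size_t n)
{
    while (n-- > 0) {
        if (s1p[n] != s2p[n])
            return s1p[n] > s2p[n] ? 1 : -1;
    }
    return 0;
}
```

Both shift models read every source limb before the location it occupies is
overwritten, provided the stated overlap condition holds.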
@end deftypefun

@deftypefun int mpn_zero_p (const mp_limb_t *@var{sp}, mp_size_t @var{n})
Test @{@var{sp}, @var{n}@} and return 1 if the operand is zero, 0 otherwise.
@end deftypefun

@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}. The result can be up to @var{yn} limbs;
the return value is the actual number produced. Both source operands are
destroyed.

It is required that @math{@var{xn} @ge @var{yn} > 0}, and the most significant
limb of @{@var{yp}, @var{yn}@} must be non-zero. No overlap is permitted
between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}.
@end deftypefun

@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
Both operands must be non-zero.
@end deftypefun

@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn})
Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be
defined by @{@var{vp}, @var{vn}@}.

Compute the greatest common divisor @math{G} of @math{U} and @math{V}. Compute
a cofactor @math{S} such that @math{G = US + VT}. The second cofactor @var{T}
is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
@var{U}*@var{S}) / @var{V}} (the division will be exact). It is required that
@math{@var{un} @ge @var{vn} > 0}, and the most significant
limb of @{@var{vp}, @var{vn}@} must be non-zero.

@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}.
@math{S =
0} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).

Store @math{G} at @var{gp} and let the return value define its limb count.
Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count. @math{S}
can be negative; when this happens *@var{sn} will be negative. The area at
@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should
have room for @math{@var{vn}+1} limbs.

Both source operands are destroyed.

Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
Earlier as well as later GMP releases define @math{S} as described here.
GMP releases before GMP 4.3.0 required additional space for both input and output
areas. More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and
@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an
extra limb past the end of each), and the areas pointed to by @var{gp} and
@var{sp} should each have room for @math{@var{un}+1} limbs.
@end deftypefun

@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Compute the square root of @{@var{sp}, @var{n}@} and put the result at
@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
indicates how many are produced.

The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
@var{n}@} must be either identical or completely separate.

If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
case the return value is zero or non-zero according to whether the remainder
would have been zero or non-zero.

A return value of zero indicates a perfect square. See also
@code{mpn_perfect_square_p}.
@end deftypefun

@deftypefun size_t mpn_sizeinbase (const mp_limb_t *@var{xp}, mp_size_t @var{n}, int @var{base})
Return the size of @{@var{xp},@var{n}@} measured in number of digits in the
given @var{base}. @var{base} can vary from 2 to 62. Requires @math{@var{n} > 0}
and @math{@var{xp}[@var{n}-1] > 0}. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact.
@end deftypefun

@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
base @var{base}, and return the number of characters produced. There may be
leading zeros in the string. The string is not in ASCII; to convert it to
printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
the base and range. @var{base} can vary from 2 to 256.

The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
@var{base} is a power of 2, in which case it's unchanged.

The area at @var{str} has to have space for the largest possible number
represented by a @var{s1n} long limb array, plus one extra character.
@end deftypefun

@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
@var{rp}.

@math{@var{str}[0]} is the most significant input byte and
@math{@var{str}[@var{strsize}-1]} is the least significant input byte. Each
byte should be a value in the range 0 to @math{@var{base}-1}, not an ASCII
character. @var{base} can vary from 2 to 256.

The converted value is @{@var{rp},@var{rn}@} where @var{rn} is the return
value. If the most significant input byte @math{@var{str}[0]} is non-zero,
then @math{@var{rp}[@var{rn}-1]} will be non-zero, else
@math{@var{rp}[@var{rn}-1]} and some number of subsequent limbs may be zero.

The area at @var{rp} has to have space for the largest possible number with
@var{strsize} digits in the chosen base, plus one extra limb.

The input must have at least one byte, and no overlap is permitted between
@{@var{str},@var{strsize}@} and the result at @var{rp}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next clear bit.

It is required that there be a clear bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next set bit.

It is required that there be a set bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
Generate a random number of length @var{r1n} and store it at @var{r1p}. The
most significant limb is always non-zero.
@code{mpn_random} generates
uniformly distributed limb data; @code{mpn_random2} generates long strings of
zeros and ones in the binary representation.

@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
routines.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Count the number of set bits in @{@var{s1p}, @var{n}@}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, which is the number of bit positions where the two operands have
different bit values.
@end deftypefun

@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
The most significant limb of the input @{@var{s1p}, @var{n}@} must be
non-zero.
@end deftypefun

@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
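
The Hamming distance described for @code{mpn_hamdist} equals the population
count of the exclusive or of the two operands. A minimal sketch of that
relationship follows; @code{limb_t}, @code{ref_popcount}, @code{ref_xor_n} and
@code{ref_hamdist} are hypothetical names, and a 64-bit limb without nails is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t (assumed 64 bits, no nails) */

/* Count the set bits of one limb by clearing the lowest set bit repeatedly. */
static unsigned long limb_popcount(limb_t x)
{
    unsigned long count = 0;
    for (; x != 0; x &= x - 1)
        count++;
    return count;
}

/* Model of mpn_popcount: number of set bits in {sp, n}. */
static unsigned long ref_popcount(const limb_t *sp, size_t n)
{
    unsigned long count = 0;
    for (size_t i = 0; i < n; i++)
        count += limb_popcount(sp[i]);
    return count;
}

/* Model of mpn_xor_n: {rp, n} = {s1p, n} XOR {s2p, n}. */
static void ref_xor_n(limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = s1p[i] ^ s2p[i];
}

/* Model of mpn_hamdist: bit positions where the two operands differ. */
static unsigned long ref_hamdist(const limb_t *s1p, const limb_t *s2p, size_t n)
{
    unsigned long count = 0;
    for (size_t i = 0; i < n; i++)
        count += limb_popcount(s1p[i] ^ s2p[i]);
    return count;
}
```

Computing the distance directly, as @code{ref_hamdist} does, avoids the
temporary limb array that an explicit exclusive or followed by a population
count would need.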
@end deftypefun

@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
to @{@var{rp}, @var{n}@}.
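
The complemented variants differ only in where the complement is applied, which
a limb-wise model makes explicit. The sketch below uses hypothetical names
(@code{limb_t}, @code{ref_com}, @code{ref_andn_n}, @code{ref_iorn_n},
@code{ref_nand_n}) and assumes a 64-bit limb without nails; in a nails build
the complement would presumably have to leave the nail bits zero, which this
model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t (assumed 64 bits, no nails) */

/* Model of mpn_com: rp = ~sp. */
static void ref_com(limb_t *rp, const limb_t *sp, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = ~sp[i];
}

/* Model of mpn_andn_n: rp = s1 & ~s2 (only the second operand is complemented). */
static void ref_andn_n(limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = s1p[i] & ~s2p[i];
}

/* Model of mpn_iorn_n: rp = s1 | ~s2 (only the second operand is complemented). */
static void ref_iorn_n(limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = s1p[i] | ~s2p[i];
}

/* Model of mpn_nand_n: rp = ~(s1 & s2) (the result is complemented). */
static void ref_nand_n(limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = ~(s1p[i] & s2p[i]);
}
```

By De Morgan's law, @code{ref_nand_n} applied to two operands gives the same
limbs as @code{ref_iorn_n} applied to the complement of the first operand and
the second operand unchanged.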
@end deftypefun

@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly, that
is, starting with the limb at the lowest address.
@end deftypefun

@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly, that
is, starting with the limb at the highest address.
@end deftypefun

@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
Zero @{@var{rp}, @var{n}@}.
@end deftypefun

@sp 1
@section Low-level functions for cryptography
@cindex Low-level functions for cryptography
@cindex Cryptography functions, low-level

The functions prefixed with @code{mpn_sec_} and @code{mpn_cnd_} are designed to
perform the exact same low-level operations and have the same cache access
patterns for any two same-size arguments, assuming that function arguments are
placed at the same position and that the machine state is identical upon
function entry. These functions are intended for cryptographic purposes, where
resilience to side-channel attacks is desired.

These functions are less efficient than their ``leaky'' counterparts; their
performance for operands of the sizes typically used for cryptographic
applications is between 15% and 100% worse. For larger operands, these
functions might be inadequate, since they rely on asymptotically elementary
algorithms.

These functions do not make any explicit allocations. Functions that need
scratch space accept a scratch space operand. This convention allows
callers to keep sensitive data in designated memory areas. Note however that
compilers may choose to spill scalar values used within these functions to
their stack frame and that such scalars may contain sensitive data.

In addition to these specially crafted functions, the following @code{mpn}
functions are naturally side-channel resistant: @code{mpn_add_n},
@code{mpn_sub_n}, @code{mpn_lshift}, @code{mpn_rshift}, @code{mpn_zero},
@code{mpn_copyi}, @code{mpn_copyd}, @code{mpn_com}, and the logical functions
(@code{mpn_and_n}, etc.).

There are some exceptions to the side-channel resilience: (1) Some assembly
implementations of @code{mpn_lshift} identify shift-by-one as a special case.
This is a problem iff the shift count is a function of sensitive data. (2)
Alpha ev6 and Pentium4 using 64-bit limbs have leaky @code{mpn_add_n} and
@code{mpn_sub_n}. (3) Alpha ev6 has a leaky @code{mpn_mul_1} which also makes
@code{mpn_sec_mul} on those systems unsafe.

@deftypefun mp_limb_t mpn_cnd_add_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
@deftypefunx mp_limb_t mpn_cnd_sub_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
These functions do conditional addition and subtraction. If @var{cnd} is
non-zero, they produce the same result as a regular @code{mpn_add_n} or
@code{mpn_sub_n}, and if @var{cnd} is zero, they copy @{@var{s1p},@var{n}@} to
the result area and return zero. The functions are designed to have timing and
memory access patterns depending only on size and location of the data areas,
but independent of the condition @var{cnd}. As with @code{mpn_add_n} and
@code{mpn_sub_n}, on most machines, the timing will also be independent of the
actual limb values.
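
One common way to obtain behaviour that does not depend on @var{cnd} is to
turn the condition into an all-ones or all-zeros mask and apply it to the
second operand, so that the same loads, additions and stores happen in both
cases. The sketch below illustrates that masking idea only; it is not claimed
to be GMP's implementation, the names @code{limb_t} and @code{ref_cnd_add_n}
are hypothetical, and a C compiler gives no guarantee that the generated code
is free of data-dependent branches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* stand-in for mp_limb_t (assumed 64 bits, no nails) */

/* Model of mpn_cnd_add_n: if cnd != 0, {rp, n} = {s1p, n} + {s2p, n} and the
   carry is returned; otherwise {rp, n} = {s1p, n} and 0 is returned. */
static limb_t ref_cnd_add_n(limb_t cnd, limb_t *rp, const limb_t *s1p,
                            const limb_t *s2p, size_t n)
{
    assert(n >= 1);
    limb_t mask = -(limb_t)(cnd != 0);  /* all ones if cnd != 0, else zero */
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = s1p[i];
        limb_t b = s2p[i] & mask;       /* becomes 0 when the condition is false */
        limb_t sum = a + b;
        limb_t c1 = sum < a;            /* carry out of a + b */
        limb_t res = sum + carry;
        limb_t c2 = res < sum;          /* carry out of adding the incoming carry */
        rp[i] = res;
        carry = c1 | c2;
    }
    return carry;
}
```

With a zero condition every masked limb is zero, so the loop degenerates into
a copy of @{@var{s1p},@var{n}@} and the carry stays zero, matching the
description above.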
@end deftypefun

@deftypefun mp_limb_t mpn_sec_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
@deftypefunx mp_limb_t mpn_sec_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
Set @var{R} to @var{A} + @var{b} or @var{A} - @var{b}, respectively, where
@var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@}, and @var{b} is
a single limb. Returns carry.

These functions take @math{O(N)} time, unlike the leaky functions
@code{mpn_add_1} and @code{mpn_sub_1}, which are @math{O(1)} on average. They
require scratch space
of @code{mpn_sec_add_1_itch(@var{n})} and @code{mpn_sec_sub_1_itch(@var{n})}
limbs, respectively, to be passed in the @var{tp} parameter. The scratch space
requirements are guaranteed to be at most @var{n} limbs, and increase
monotonically in the operand size.
@end deftypefun

@deftypefun void mpn_cnd_swap (mp_limb_t @var{cnd}, volatile mp_limb_t *@var{ap}, volatile mp_limb_t *@var{bp}, mp_size_t @var{n})
If @var{cnd} is non-zero, swaps the contents of the areas @{@var{ap},@var{n}@}
and @{@var{bp},@var{n}@}. Otherwise, the areas are left unmodified.
Implemented using logical operations on the limbs, with the same memory
accesses independent of the value of @var{cnd}.
@end deftypefun

@deftypefun void mpn_sec_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_mul_itch (mp_size_t @var{an}, mp_size_t @var{bn})
Set @var{R} to @math{A @times{} B}, where @var{A} = @{@var{ap},@var{an}@},
@var{B} = @{@var{bp},@var{bn}@}, and @var{R} =
@{@var{rp},@math{@var{an}+@var{bn}}@}.

It is required that @math{@var{an} @ge @var{bn} > 0}.

No overlap between @var{R} and the input operands is allowed. For
@math{@var{A} = @var{B}}, use @code{mpn_sec_sqr} for optimal performance.

This function requires scratch space of @code{mpn_sec_mul_itch(@var{an},
@var{bn})} limbs to be passed in the @var{tp} parameter. The scratch space
requirements are guaranteed to increase monotonically in the operand sizes.
@end deftypefun


@deftypefun void mpn_sec_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_sqr_itch (mp_size_t @var{an})
Set @var{R} to @math{A^2}, where @var{A} = @{@var{ap},@var{an}@}, and @var{R} =
@{@var{rp},@math{2@var{an}}@}.

It is required that @math{@var{an} > 0}.

No overlap between @var{R} and the input operands is allowed.

This function requires scratch space of @code{mpn_sec_sqr_itch(@var{an})} limbs
to be passed in the @var{tp} parameter. The scratch space requirements are
guaranteed to increase monotonically in the operand size.
@end deftypefun


@deftypefun void mpn_sec_powm (mp_limb_t *@var{rp}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, const mp_limb_t *@var{ep}, mp_bitcnt_t @var{enb}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_powm_itch (mp_size_t @var{bn}, mp_bitcnt_t @var{enb}, size_t @var{n})
Set @var{R} to @m{B^E \bmod @var{M}, (@var{B} raised to @var{E}) modulo
@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{B} =
@{@var{bp},@var{bn}@}, @var{M} = @{@var{mp},@var{n}@},
and @var{E} = @{@var{ep},@math{@GMPceil{@var{enb} /
@code{GMP\_NUMB\_BITS}}}@}.

It is required that @math{@var{B} > 0}, that @math{@var{M} > 0} is odd, and
that @m{@var{E} < 2@GMPraise{@var{enb}}, @var{E} < 2^@var{enb}}.

No overlap between @var{R} and the input operands is allowed.

This function requires scratch space of @code{mpn_sec_powm_itch(@var{bn},
@var{enb}, @var{n})} limbs to be passed in the @var{tp} parameter. The scratch
space requirements are guaranteed to increase monotonically in the operand
sizes.
@end deftypefun

@deftypefun void mpn_sec_tabselect (mp_limb_t *@var{rp}, const mp_limb_t *@var{tab}, mp_size_t @var{n}, mp_size_t @var{nents}, mp_size_t @var{which})
Select entry @var{which} from table @var{tab}, which has @var{nents} entries,
each of @var{n} limbs. Store the selected entry at @var{rp}.

This function reads the entire table to avoid side-channel information leaks.
@end deftypefun

@deftypefun mp_limb_t mpn_sec_div_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_div_qr_itch (mp_size_t @var{nn}, mp_size_t @var{dn})

Set @var{Q} to @m{\lfloor @var{N} / @var{D}\rfloor, the truncated quotient
@var{N} / @var{D}} and @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo
@var{D}}, where @var{N} = @{@var{np},@var{nn}@}, @var{D} =
@{@var{dp},@var{dn}@}, @var{Q}'s most significant limb is the function return
value and the remaining limbs are @{@var{qp},@var{nn-dn}@}, and @var{R} =
@{@var{np},@var{dn}@}.

It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.

Note the overlapping between @var{N} and @var{R}. No other operand overlapping
is allowed. The entire space occupied by @var{N} is overwritten.

This function requires scratch space of @code{mpn_sec_div_qr_itch(@var{nn},
@var{dn})} limbs to be passed in the @var{tp} parameter.
@end deftypefun

@deftypefun void mpn_sec_div_r (mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_div_r_itch (mp_size_t @var{nn}, mp_size_t @var{dn})

Set @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo @var{D}}, where @var{N}
= @{@var{np},@var{nn}@}, @var{D} = @{@var{dp},@var{dn}@}, and @var{R} =
@{@var{np},@var{dn}@}.

It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.

Note the overlapping between @var{N} and @var{R}. No other operand overlapping
is allowed. The entire space occupied by @var{N} is overwritten.

This function requires scratch space of @code{mpn_sec_div_r_itch(@var{nn},
@var{dn})} limbs to be passed in the @var{tp} parameter.
@end deftypefun

@deftypefun int mpn_sec_invert (mp_limb_t *@var{rp}, mp_limb_t *@var{ap}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_bitcnt_t @var{nbcnt}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_invert_itch (mp_size_t @var{n})
Set @var{R} to @m{@var{A}^{-1} \bmod @var{M}, the inverse of @var{A} modulo
@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@},
and @var{M} = @{@var{mp},@var{n}@}. @strong{This function's interface is
preliminary.}

If an inverse exists, return 1, otherwise return 0 and leave @var{R}
undefined. In either case, the input @var{A} is destroyed.

It is required that @var{M} is odd, and that @math{@var{nbcnt} @ge
@GMPceil{\log(@var{A}+1)} + @GMPceil{\log(@var{M}+1)}}.
A safe choice is
@m{@var{nbcnt} = 2@var{n} @times{} @code{GMP\_NUMB\_BITS}, @var{nbcnt} = 2
@times{} @var{n} @times{} GMP_NUMB_BITS}, but a smaller value might improve
performance if @var{M} or @var{A} are known to have leading zero bits.

This function requires scratch space of @code{mpn_sec_invert_itch(@var{n})}
limbs to be passed in the @var{tp} parameter.
@end deftypefun


@sp 1
@section Nails
@cindex Nails

@strong{Everything in this section is highly experimental and may disappear or
be subject to incompatible changes in a future version of GMP.}

Nails are an experimental feature whereby a few bits are left unused at the
top of each @code{mp_limb_t}. This can significantly improve carry handling
on some processors.

All the @code{mpn} functions accepting limb data will expect the nail bits to
be zero on entry, and will return data with the nails similarly all zero.
This applies both to limb vectors and to single limb arguments.

Nails can be enabled by configuring with @samp{--enable-nails}. By default
the number of bits will be chosen according to what suits the host processor,
but a particular number can be selected with @samp{--enable-nails=N}.

At the mpn level, a nail build is neither source nor binary compatible with a
non-nail build, strictly speaking. But programs acting on limbs only through
the mpn functions are likely to work equally well with either build, and
judicious use of the definitions below should make any program compatible with
either build, at the source level.

For the higher level routines, meaning @code{mpz} etc, a nail build should be
fully source and binary compatible with a non-nail build.

@defmac GMP_NAIL_BITS
@defmacx GMP_NUMB_BITS
@defmacx GMP_LIMB_BITS
@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
all cases

@example
GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
@end example
@end defmac

@defmac GMP_NAIL_MASK
@defmacx GMP_NUMB_MASK
Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
when nails are not in use.

@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
can help various RISC chips.
@end defmac

@defmac GMP_NUMB_MAX
The maximum value that can be stored in the number part of a limb. This is
the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
comparisons rather than bit-wise operations.
@end defmac

The term ``nails'' comes from finger or toe nails, which are at the ends of a
limb (arm or leg). ``numb'' is short for number, but is also how the
developers felt after trying for a long time to come up with sensible names
for these things.

In the future (the distant future most likely) a non-zero nail might be
permitted, giving non-unique representations for numbers in a limb vector.
This would help vector processors since carries would only ever need to
propagate one or two limbs.


@node Random Number Functions, Formatted Output, Low-level Functions, Top
@chapter Random Number Functions
@cindex Random number functions

Sequences of pseudo-random numbers in GMP are generated using a variable of
type @code{gmp_randstate_t}, which holds an algorithm selection and a current
state.
Such a variable must be initialized by a call to one of the
@code{gmp_randinit} functions, and can be seeded with one of the
@code{gmp_randseed} functions.

The functions actually generating random numbers are described in @ref{Integer
Random Numbers}, and @ref{Miscellaneous Float Functions}.

The older style random number functions don't accept a @code{gmp_randstate_t}
parameter but instead share a global variable of that type. They use a
default algorithm and are currently not seeded (though perhaps that will
change in the future). The new functions accepting a @code{gmp_randstate_t}
are recommended for applications that care about randomness.

@menu
* Random State Initialization::
* Random State Seeding::
* Random State Miscellaneous::
@end menu

@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
@section Random State Initialization
@cindex Random number state
@cindex Initialization functions

@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
Initialize @var{state} with a default algorithm. This will be a compromise
between speed and randomness, and is recommended for applications with no
special requirements. Currently this is @code{gmp_randinit_mt}.
@end deftypefun

@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
@cindex Mersenne twister random numbers
Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is
fast and has good randomness properties.
@end deftypefun

@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
@cindex Linear congruential random numbers
Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.

The low bits of @math{X} in this algorithm are not very random. The least
significant bit will have a period no more than 2, and the second bit no more
than 4, etc. For this reason only the high half of each @math{X} is actually
used.

When a random number of more than @math{@var{m2exp}/2} bits is to be
generated, multiple iterations of the recurrence are used and the results
concatenated.
@end deftypefun

@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
@cindex Linear congruential random numbers
Initialize @var{state} for a linear congruential algorithm as per
@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
from a table, chosen so that @var{size} bits (or more) of each @math{X} will
be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.

If successful the return value is non-zero. If @var{size} is bigger than the
table data provides then the return value is zero. The maximum @var{size}
currently supported is 128.
@end deftypefun

@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
@end deftypefun

@c Although gmp_randinit, gmp_errno and related constants are obsolete, we
@c still put @findex entries for them, since they're still documented and
@c someone might be looking them up when perusing old application code.

@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
@strong{This function is obsolete.}

@findex GMP_RAND_ALG_LC
@findex GMP_RAND_ALG_DEFAULT
Initialize @var{state} with an algorithm selected by @var{alg}. The only
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above. A third parameter of type @code{unsigned long} is required,
this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0
are the same as @code{GMP_RAND_ALG_LC}.

@c For reference, this is the only place gmp_errno has been documented, and
@c due to being non thread safe we won't be adding to its uses.
@findex gmp_errno
@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
@findex GMP_ERROR_INVALID_ARGUMENT
@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
indicate an error: @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
is too big. It may be noted this error reporting is not thread safe (a good
reason to use @code{gmp_randinit_lc_2exp_size} instead).
@end deftypefun

@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
Free all memory occupied by @var{state}.
@end deftypefun


@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
@section Random State Seeding
@cindex Random number seeding
@cindex Seeding random numbers

@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
Set an initial seed value into @var{state}.

The size of a seed determines how many different sequences of random numbers
it's possible to generate.
The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences. The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.

Traditionally the system time has been used to seed, but care needs to be
taken with this. If an application seeds often and the resolution of the
system clock is low, then the same sequence of numbers might be repeated.
Also, the system time is quite easy to guess, so if unpredictability is
required then it should definitely not be the only source for the seed value.
On some systems there's a special device @file{/dev/random} which provides
random data better suited for use as a seed.
@end deftypefun


@node Random State Miscellaneous, , Random State Seeding, Random Number Functions
@section Random State Miscellaneous

@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or
equal to the number of bits in an @code{unsigned long}.
@end deftypefun

@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number in the range 0 to
@math{@var{n}-1}, inclusive.
@end deftypefun


@node Formatted Output, Formatted Input, Random Number Functions, Top
@chapter Formatted Output
@cindex Formatted output
@cindex @code{printf} formatted output

@menu
* Formatted Output Strings::
* Formatted Output Functions::
* C++ Formatted Output::
@end menu

@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
@section Format Strings

@code{gmp_printf} and friends accept format strings similar to the standard C
@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
Library Reference Manual}). A format specification is of the form

@example
% [flags] [width] [.[precision]] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
like integers. @samp{Q} will print a @samp{/} and a denominator, if needed.
@samp{F} behaves like a float. For example,

@example
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

mp_limb_t l;
gmp_printf ("limb %Mu\n", l);

const mp_limb_t *ptr;
mp_size_t size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example

For @samp{N} the limbs are expected least significant first, as per the
@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be
given to print the value as a negative.

All the standard C @code{printf} types behave the same as the C library
@code{printf}, and can be freely intermixed with the GMP extensions.
In the
current implementation the standard parts of the format string are simply
handed to @code{printf} and only the GMP extensions handled directly.

The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
standard C types (not the GMP types), and only if the C library supports it.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{0} @tab pad with zeros (rather than spaces)
@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
@item @nicode{+} @tab always show a sign
@item (space) @tab show a space or a @samp{-} sign
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The optional width and precision can be given as a number within the format
string, or as a @samp{*} to take an extra parameter of type @code{int}, the
same as the standard @code{printf}.

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the output.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions
@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{a} and @samp{A} are always
supported for @code{mpf_t} but depend on the C library for standard C float
types. @samp{m} and @samp{p} depend on the C library.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{a} @nicode{A} @tab hex floats, C99 style
@item @nicode{c} @tab character
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @tab scientific format float
@item @nicode{f} @tab fixed point float
@item @nicode{i} @tab same as @nicode{d}
@item @nicode{g} @nicode{G} @tab fixed or scientific float
@item @nicode{m} @tab @code{strerror} string, GLIBC style
@item @nicode{n} @tab store characters written so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string
@item @nicode{u} @tab unsigned integer
@item @nicode{x} @nicode{X} @tab hex integer
@end multitable
@end quotation

@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
meaningful for @samp{Z}, @samp{Q} and @samp{N}.

@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed
conversion can be used and will interpret the value as a twos complement
negative.

@samp{n} can be used with any type, even the GMP types.

Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}, this includes for
instance extensions registered with GLIBC @code{register_printf_function}.
Also currently there's no support for POSIX @samp{$} style numbered arguments
(perhaps this will be added in the future).

The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
with that.

@code{mpf_t} conversions only ever generate as many digits as can be
accurately represented by the operand, the same as @code{mpf_get_str} does.
Zeros will be used if necessary to pad to the requested precision. This
happens even for an @samp{f} conversion of an @code{mpf_t} which is an
integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
precision will only produce about 40 digits, then pad with zeros to the
decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
be used to specifically request just the significant digits. Without any dot
and thus no precision field, a precision value of 6 will be used. Note that
these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be
different.

The decimal point character (or string) is taken from the current locale
settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
and Internationalization, libc, The GNU C Library Reference Manual}). The C
library will normally do the same for standard float output.

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
@section Functions
@cindex Output functions

Each of the following functions is similar to the corresponding C library
function. The basic @code{printf} forms take a variable argument list. The
@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable.
GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
@math{-1} to indicate a write error. Output is not ``atomic'', so partial
output may be produced if a write error occurs. All the functions can return
@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
this shouldn't normally occur.

@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
Print to the standard output @code{stdout}. Return the number of characters
written, or @math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Print to the stream @var{fp}. Return the number of characters written, or
@math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. Return the number of characters
written, excluding the terminating null.

No overlap is permitted between the space at @var{buf} and the string
@var{fmt}.

These functions are not recommended, since there's no protection against
exceeding the space available at @var{buf}.
@end deftypefun

@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
will be written.
To get the full output, @var{size} must be enough for the
string and null-terminator.

The return value is the total number of characters which ought to have been
produced, excluding the terminating null. If @math{@var{retval} @ge{}
@var{size}} then the actual output has been truncated to the first
@math{@var{size}-1} characters, and a null appended.

No overlap is permitted between the region @{@var{buf},@var{size}@} and the
@var{fmt} string.

Notice the return value is in ISO C99 @code{snprintf} style. This is so even
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
@end deftypefun

@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. The address of the block is stored to
*@var{pp}. The return value is the number of characters produced, excluding
the null-terminator.

Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@math{-1} if there's no more memory available, it lets the current allocation
function handle that.
@end deftypefun

@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
@cindex @code{obstack} output
Append to the current object in @var{ob}. The return value is the number of
characters written. A null-terminator is not written.

@var{fmt} cannot be within the current object in @var{ob}, since that object
might move as it grows.

These functions are available only when the C library provides the obstack
feature, which probably means only on GNU systems, see @ref{Obstacks,,
Obstacks, libc, The GNU C Library Reference Manual}.
@end deftypefun


@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
@section C++ Formatted Output
@cindex C++ @code{ostream} output
@cindex @code{ostream} output

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
Prototypes are available from @code{<gmp.h>}.

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpz_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

In hex or octal, @var{op} is printed as a signed number, the same as for
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
etc, which instead give twos complement.
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpq_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
just a plain integer like @samp{123}.

In hex or octal, @var{op} is printed as a signed value, the same as for
decimal. If @code{ios::showbase} is set then a base indicator is shown on
both the numerator and denominator (if the denominator is required).
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpf_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

The decimal point follows the standard library float @code{operator<<}, which
on recent systems means the @code{std::locale} imbued on @var{stream}.

Hex and octal are supported, unlike the standard @code{operator<<} on
@code{double}. The mantissa will be in hex or octal, the exponent will be in
decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
@code{mpf_out_str}.

@code{ios::showbase} is supported, and will put a base on the mantissa, for
example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
This last form is slightly strange, but at least differentiates itself from
decimal.
@end deftypefun

These operators mean that GMP types can be printed in the usual C++ way, for
example,

@example
mpz_t z;
int n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example

But note that @code{ostream} output (and @code{istream} input, @pxref{C++
Formatted Input}) is the only overloading available for the GMP types and that
for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.


@node Formatted Input, C++ Class Interface, Formatted Output, Top
@chapter Formatted Input
@cindex Formatted input
@cindex @code{scanf} formatted input

@menu
* Formatted Input Strings::
* Formatted Input Functions::
* C++ Formatted Input::
@end menu


@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
@section Formatted Input Strings

@code{gmp_scanf} and friends accept format strings similar to the standard C
@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
Library Reference Manual}). A format specification is of the form

@example
% [flags] [width] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
like a float.

GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
they're already ``call-by-reference''. For example,

@example
/* to read say "a(5) = 1234" */
int n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example

All the standard C @code{scanf} types behave the same as in the C library
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{scanf} and only the GMP extensions handled directly.

The flags accepted are as follows.
@samp{a} and @samp{'} will depend on
support from the C library, and @samp{'} cannot be used with GMP types.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{*} @tab read but don't store
@item @nicode{a} @tab allocate a buffer (string conversions)
@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the input.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
support from the C library, the rest are standard.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{c} @tab character or characters
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
@tab float
@item @nicode{i} @tab integer with base indicator
@item @nicode{n} @tab characters read so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string of non-whitespace characters
@item @nicode{u} @tab decimal integer
@item @nicode{x} @nicode{X} @tab hex integer
@item @nicode{[} @tab string of characters in a set
@end multitable
@end quotation

@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
read either fixed point or scientific format, and either upper or lower case
@samp{e} for the exponent in scientific format.

C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
Strings}) is always accepted for @code{mpf_t}, but for the standard float
types it will depend on the C library.

@samp{x} and @samp{X} are identical, both accept both upper and lower case
hexadecimal.

@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
values. For the standard C types these are described as ``unsigned''
conversions, but that merely affects certain overflow handling, negatives are
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
no overflows, so @samp{d} and @samp{u} are identical.

@samp{Q} type reads the numerator and (optional) denominator as given. If the
value might not be in canonical form then @code{mpq_canonicalize} must be
called before using it in any calculations (@pxref{Rational Number
Functions}).

@samp{Qi} will read a base specification separately for the numerator and
denominator. For example @samp{0x10/11} would be 16/11, whereas
@samp{0x10/0x11} would be 16/17.

@samp{n} can be used with any of the types above, even the GMP types.
@samp{*} to suppress assignment is allowed, though in that case it would do
nothing at all.

Other conversions or types that might be accepted by the C library
@code{scanf} cannot be used through @code{gmp_scanf}.

Whitespace is read and discarded before a field, except for @samp{c} and
@samp{[} conversions.

For float conversions, the decimal point character (or string) expected is
taken from the current locale settings on systems which provide
@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
The GNU C Library Reference Manual}). The C library will normally do the same
for standard float input.

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
@section Formatted Input Functions
@cindex Input functions

Each of the following functions is similar to the corresponding C library
function. The plain @code{scanf} forms take a variable argument list. The
@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable. GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

No overlap is permitted between the @var{fmt} string and any of the results
produced.

@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
Read from the standard input @code{stdin}.
@end deftypefun

@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Read from the stream @var{fp}.
@end deftypefun

@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
Read from a null-terminated string @var{s}.
@end deftypefun

The return value from each of these functions is the same as the standard C99
@code{scanf}, namely the number of fields successfully parsed and stored.
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
towards the return value.

If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0. A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion. Leading whitespace read and discarded for a field doesn't count as
characters for that field.

For the GMP types, input parsing follows C99 rules, namely one character of
lookahead is used and characters are read while they continue to meet the
format requirements. If this doesn't provide a complete number then the
function terminates, with that field not stored nor counted towards the return
value.
For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
up to the @samp{X} and that character pushed back since it's not a digit. The
string @samp{1.23e-} would then be considered invalid since an @samp{e} must
be followed by at least one digit.

For the standard C types, in the current implementation GMP calls the C
library @code{scanf} functions, which might have looser rules about what
constitutes a valid input.

Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
character of lookahead when parsing. Although clearly it could look at its
entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
way C99 @code{sscanf} is the same as @code{fscanf}.


@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
@section C++ Formatted Input
@cindex C++ @code{istream} input
@cindex @code{istream} input

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built only if C++ support is enabled (@pxref{Build
Options}). Prototypes are available from @code{<gmp.h>}.

@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
An integer like @samp{123} will be read, or a fraction like @samp{5/9}. No
whitespace is allowed around the @samp{/}. If the fraction is not in
canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}) before operating on it.

As per integer input, a @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set.
This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.

Hex or octal floats are not supported, but might be in the future, or perhaps
it's best to accept only what the standard float @code{operator>>} does.
@end deftypefun

Note that digit grouping specified by the @code{istream} locale is currently
not accepted. Perhaps this will change in the future.

@sp 1
These operators mean that GMP types can be read in the usual C++ way, for
example,

@example
mpz_t  z;
...
cin >> z;
@end example

But note that @code{istream} input (and @code{ostream} output, @pxref{C++
Formatted Output}) is the only overloading available for the GMP types and
that for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.



@node C++ Class Interface, Custom Allocation, Formatted Input, Top
@chapter C++ Class Interface
@cindex C++ interface

This chapter describes the C++ class based interface to GMP.

All GMP C language types and functions can be used in C++ programs, since
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
overloaded functions and operators which may be more convenient.

Due to the implementation of this interface, a reasonably recent C++ compiler
is required, one supporting namespaces, partial specialization of templates
and member templates.

@strong{Everything described in this chapter is to be considered preliminary
and might be subject to incompatible changes if some unforeseen difficulty
reveals itself.}

@menu
* C++ Interface General::
* C++ Interface Integers::
* C++ Interface Rationals::
* C++ Interface Floats::
* C++ Interface Random Numbers::
* C++ Interface Limitations::
@end menu


@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
@section C++ Interface General

@noindent
All the C++ classes and functions are available with

@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example

Programs should be linked with the @file{libgmpxx} and @file{libgmp}
libraries. For example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@noindent
The classes defined are

@deftp Class mpz_class
@deftpx Class mpq_class
@deftpx Class mpf_class
@end deftp

The standard operators and various standard functions are overloaded to allow
arithmetic with these classes. For example,

@example
int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example

An important feature of the implementation is that an expression like
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
without using a temporary for the @code{b+c} part. Expressions which by their
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
though.

The classes can be freely intermixed in expressions, as can the classes and
the standard types @code{long}, @code{unsigned long} and @code{double}.
Smaller types like @code{int} or @code{float} can also be intermixed, since
C++ will promote them.

Note that @code{bool} is not accepted directly, but must be explicitly cast to
an @code{int} first. This is because C++ will automatically convert any
pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
sorts of invalid class and pointer combinations compile but almost certainly
not do anything sensible.

Conversions back from the classes to standard C++ types aren't done
automatically, instead member functions like @code{get_si} are provided (see
the following sections for details).

Also there are no automatic conversions from the classes to the corresponding
GMP C types, instead a reference to the underlying C object can be obtained
with the following functions,

@deftypefun mpz_t mpz_class::get_mpz_t ()
@deftypefunx mpq_t mpq_class::get_mpq_t ()
@deftypefunx mpf_t mpf_class::get_mpf_t ()
@end deftypefun

These can be used to call a C function which doesn't have a C++ class
interface. For example to set @code{a} to the GCD of @code{b} and @code{c},

@example
mpz_class a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example

In the other direction, a class can be initialized from the corresponding GMP
C type, or assigned to if an explicit constructor is used. In both cases this
makes a copy of the value, it doesn't create any sort of association. For
example,

@example
mpz_t z;
// ... init and calculate z ...
mpz_class x(z);
mpz_class y;
y = mpz_class (z);
@end example

There are no namespace setups in @file{gmpxx.h}, all types and functions are
simply put into the global namespace. This is what @file{gmp.h} has done in
the past, and continues to do for compatibility.
The extras provided by
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
anything.


@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
@section C++ Interface Integers

@deftypefun {} mpz_class::mpz_class (type @var{n})
Construct an @code{mpz_class}. All the standard C++ types may be used, except
@code{long long} and @code{long double}, and all the GMP C++ classes can be
used, although conversions from @code{mpq_class} and @code{mpf_class} are
@code{explicit}. Any necessary conversion follows the corresponding C
function, for example @code{double} follows @code{mpz_set_d}
(@pxref{Assigning Integers}).
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const mpz_t @var{z})
Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
copied into the new @code{mpz_class}, there won't be any permanent association
between it and @var{z}.
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
(@pxref{Assigning Integers}).

If the string is not a valid integer, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpz_class operator"" _mpz (const char *@var{str})
With C++11 compilers, integers can be constructed with the syntax
@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
@end deftypefun

@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
Divisions involving @code{mpz_class} round towards zero, as per the
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
This is the same as the C99 @code{/} and @code{%} operators.

The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
directly if desired. For example,

@example
mpz_class q, a, d;
...
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
@end example
@end deftypefun

@deftypefun mpz_class abs (mpz_class @var{op})
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
@maybepagebreak
@deftypefunx bool mpz_class::fits_sint_p (void)
@deftypefunx bool mpz_class::fits_slong_p (void)
@deftypefunx bool mpz_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpz_class::fits_uint_p (void)
@deftypefunx bool mpz_class::fits_ulong_p (void)
@deftypefunx bool mpz_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx double mpz_class::get_d (void)
@deftypefunx long mpz_class::get_si (void)
@deftypefunx string mpz_class::get_str (int @var{base} = 10)
@deftypefunx {unsigned long} mpz_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpz_class @var{op})
@deftypefunx mpz_class sqrt (mpz_class @var{op})
@maybepagebreak
@deftypefunx mpz_class gcd (mpz_class @var{op1}, mpz_class @var{op2})
@deftypefunx mpz_class lcm (mpz_class @var{op1}, mpz_class @var{op2})
@maybepagebreak
@deftypefunx void mpz_class::swap
(mpz_class& @var{op})
@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@sp 1
Overloaded operators for combinations of @code{mpz_class} and @code{double}
are provided for completeness, but it should be noted that if the given
@code{double} is not an integer then the way any rounding is done is currently
unspecified. The rounding might take place at the start, in the middle, or at
the end of the operation, and it might change in the future.

Conversions between @code{mpz_class} and @code{double}, however, are defined
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
And comparisons are always made exactly, as per @code{mpz_cmp_d}.


@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
@section C++ Interface Rationals

In all the following constructors, if a fraction is given then it should be in
canonical form, or if not then @code{mpq_class::canonicalize} called.

@deftypefun {} mpq_class::mpq_class (type @var{op})
@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
Construct an @code{mpq_class}. The initial value can be a single value of any
type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
integers (@code{mpz_class} or standard C++ integer types) representing a
fraction, except that @code{long long} and @code{long double} are not
supported.
For example,

@example
mpq_class q (99);
mpq_class q (1.75);
mpq_class q (1, 3);
@end example
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const mpq_t @var{q})
Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
copied into the new @code{mpq_class}, there won't be any permanent association
between it and @var{q}.
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
(@pxref{Initializing Rationals}).

If the string is not a valid rational, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpq_class operator"" _mpq (const char *@var{str})
With C++11 compilers, integral rationals can be constructed with the syntax
@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}. Other
rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
@end deftypefun

@deftypefun void mpq_class::canonicalize ()
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
Functions}. All arithmetic operators require their operands in canonical
form, and will return results in canonical form.
@end deftypefun

@deftypefun mpq_class abs (mpq_class @var{op})
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
@maybepagebreak
@deftypefunx double mpq_class::get_d (void)
@deftypefunx string mpq_class::get_str (int @var{base} = 10)
@maybepagebreak
@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpq_class @var{op})
@maybepagebreak
@deftypefunx void mpq_class::swap (mpq_class& @var{op})
@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@deftypefun {mpz_class&} mpq_class::get_num ()
@deftypefunx {mpz_class&} mpq_class::get_den ()
Get a reference to an @code{mpz_class} which is the numerator or denominator
of an @code{mpq_class}. This can be used both for read and write access. If
the object returned is modified, it modifies the original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun mpz_t mpq_class::get_num_mpz_t ()
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
@code{mpq_class}. This can be passed to C functions expecting an
@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop});
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).

If the @var{rop} read might not be in canonical form then
@code{mpq_class::canonicalize} must be called.
@end deftypefun


@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
@section C++ Interface Floats

When an expression requires the use of temporary intermediate @code{mpf_class}
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
as the destination @code{f}. Explicit constructors can be used if this
doesn't suit.

@deftypefun {} mpf_class::mpf_class (type @var{op})
@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class}. Any standard C++ type can be used, except
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
used.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is determined by the type
of @var{op} given. An @code{mpz_class}, @code{mpq_class}, or C++
builtin type will give the default @code{mpf} precision (@pxref{Initializing
Floats}). An @code{mpf_class} or expression will give the precision of that
value. The precision of a binary expression is the higher of the two
operands.

@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (const mpf_t @var{f})
@deftypefunx {} mpf_class::mpf_class (const mpf_t @var{f}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class} from an @code{mpf_t}. The value in @var{f} is
copied into the new @code{mpf_class}, there won't be any permanent association
between it and @var{f}.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is that of @var{f}.
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
(@pxref{Assigning Floats}). If @var{prec} is given, the initial precision is
that value, in bits. If not, the default @code{mpf} precision
(@pxref{Initializing Floats}) is used.

If the string is not a valid float, an @code{std::invalid_argument} exception
is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpf_class operator"" _mpf (const char *@var{str})
With C++11 compilers, floats can be constructed with the syntax
@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}.
@end deftypefun

@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
Convert and store the given @var{op} value to an @code{mpf_class} object. The
same types are accepted as for the constructors above.

Note that @code{operator=} only stores a new value, it doesn't copy or change
the precision of the destination, instead the value is truncated if necessary.
This is the same as @code{mpf_set} etc. Note in particular this means for
@code{mpf_class} a copy constructor is not the same as a default constructor
plus assignment.

@example
mpf_class x (y);   // x created with precision of y

mpf_class x;       // x created with default precision
x = y;             // value truncated to that precision
@end example

Applications using templated code may need to be careful about the assumptions
the code makes in this area, when working with @code{mpf_class} values of
various different or non-default precisions. For instance implementations of
the standard @code{complex} template have been seen in both styles above,
though of course @code{complex} is normally only actually specified for use
with the builtin float types.
@end deftypefun

@deftypefun mpf_class abs (mpf_class @var{op})
@deftypefunx mpf_class ceil (mpf_class @var{op})
@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
@maybepagebreak
@deftypefunx bool mpf_class::fits_sint_p (void)
@deftypefunx bool mpf_class::fits_slong_p (void)
@deftypefunx bool mpf_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpf_class::fits_uint_p (void)
@deftypefunx bool mpf_class::fits_ulong_p (void)
@deftypefunx bool mpf_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx mpf_class floor (mpf_class @var{op})
@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
@maybepagebreak
@deftypefunx double mpf_class::get_d (void)
@deftypefunx long mpf_class::get_si (void)
@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
@deftypefunx {unsigned long} mpf_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpf_class @var{op})
@deftypefunx mpf_class sqrt (mpf_class @var{op})
@maybepagebreak
@deftypefunx void mpf_class::swap (mpf_class& @var{op})
@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
@deftypefunx mpf_class trunc (mpf_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.

The accuracy provided by @code{hypot} is not currently guaranteed.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
Get or set the current precision of an @code{mpf_class}.

The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed. This must be done by application code, there's no automatic
mechanism for it.
@end deftypefun


@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
@section C++ Interface Random Numbers

@deftp Class gmp_randclass
The C++ class interface to the GMP random number functions uses
@code{gmp_randclass} to hold an algorithm selection and current state, as per
@code{gmp_randstate_t}.
@end deftp

@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
function (@pxref{Random State Initialization}). The arguments expected are
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
For example,

@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
gmp_randclass r4 (gmp_randinit_mt);
@end example

@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big,
an @code{std::length_error} exception is thrown in that case.
@end deftypefun

@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
Construct a @code{gmp_randclass} using the same parameters as
@code{gmp_randinit} (@pxref{Random State Initialization}). This function is
obsolete and the above @var{randinit} style should be preferred.
@end deftypefun

@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator. @xref{Random Number Functions}, for how
to choose a good seed.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
Generate a random integer with a specified number of bits.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
@end deftypefun

@deftypefun mpf_class gmp_randclass::get_f ()
@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f}
will be to @var{prec} bits precision, or if @var{prec} is not given then to
the precision of the destination. For example,

@example
gmp_randclass  r;
...
mpf_class  f (0, 512);   // 512 bits precision
f = r.get_f();           // random number, 512 bits
@end example
@end deftypefun



@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
@section C++ Interface Limitations

@table @asis
@item @code{mpq_class} and Templated Reading
A generic piece of template code probably won't know that @code{mpq_class}
requires a @code{canonicalize} call if inputs read with @code{operator>>}
might be non-canonical.
This can lead to incorrect results.

@code{operator>>} behaves as it does for reasons of efficiency. Canonicalizing
can be quite time consuming on large operands, and is best
avoided if it's not necessary.

But this potential difficulty reduces the usefulness of @code{mpq_class}.
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
pressed into service. Or maybe, at the risk of inconsistency, the
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
@code{operator>>} not doing so, for use on those occasions when that's
acceptable. Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.

@item Subclassing
Subclassing the GMP C++ classes works, but is not currently recommended.

Expressions involving subclasses resolve correctly (or seem to), but in normal
C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
in a subclass is not yet provided.

@item Templated Expressions
A subtle difficulty exists when using expressions together with
application-defined template functions. Consider the following, with @code{T}
intended to be some numeric type,

@example
template <class T>
T fun (const T &, const T &);
@end example

@noindent
When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
is resolved as @code{mpz_class}.

@example
mpz_class f(1), g(2);
fun (f, g);    // Good
@end example

@noindent
But when one of the arguments is an expression, it doesn't work.

@example
mpz_class f(1), g(2), h(3);
fun (f, g+h);  // Bad
@end example

This is because @code{g+h} ends up being a certain expression template type
internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
to automatically convert to @code{mpz_class}. The workaround is simply to add
an explicit cast.

@example
mpz_class f(1), g(2), h(3);
fun (f, mpz_class(g+h));  // Good
@end example

Similarly, within @code{fun} it may be necessary to cast an expression to type
@code{T} when calling a templated @code{fun2}.

@example
template <class T>
void fun (T f, T g)
@{
  fun2 (f, f+g);     // Bad
@}

template <class T>
void fun (T f, T g)
@{
  fun2 (f, T(f+g));  // Good
@}
@end example

@item C++11
C++11 provides several new ways in which types can be inferred: @code{auto},
@code{decltype}, etc. While they can be very convenient, they don't mix well
with expression templates. In this example, the addition is performed twice,
as if we had defined @code{sum} as a macro.

@example
mpz_class z = 33;
auto sum = z + z;
mpz_class prod = sum * sum;
@end example

This other example may crash, though some compilers might make it look like
it is working, because the expression @code{z+z} goes out of scope before it
is evaluated.

@example
mpz_class z = 33;
auto sum = z + z + z;
mpz_class prod = sum * 2;
@end example

It is thus strongly recommended to avoid @code{auto} anywhere a GMP C++
expression may appear.
@end table


@node Custom Allocation, Language Bindings, C++ Class Interface, Top
@comment node-name, next, previous, up
@chapter Custom Allocation
@cindex Custom allocation
@cindex Memory allocation
@cindex Allocation of memory

By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
allocation, and if they fail GMP prints a message to the standard error output
and terminates the program.

Alternate functions can be specified, to allocate memory in a different way or
to have a different error action on running out of memory.

@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
Replace the current allocation functions with those given by the arguments.
If an argument is @code{NULL}, the corresponding default function is used.

These functions will be used for all memory allocation done by GMP, apart from
temporary space from @code{alloca} if that function is available and GMP is
configured to use it (@pxref{Build Options}).

@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
active GMP objects allocated using the previous memory functions! Usually
that means calling it before any other GMP function.}
@end deftypefun

The functions supplied should fit the following declarations:

@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
Return a pointer to newly allocated space with at least @var{alloc_size}
bytes.
@end deftypevr

@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
@var{new_size} bytes.

The block may be moved if necessary or if desired, and in that case the
smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
location. The return value is a pointer to the resized block, that being the
new location if moved or just @var{ptr} if not.

@var{ptr} is never @code{NULL}; it's always a previously allocated block.
@var{new_size} may be bigger or smaller than @var{old_size}.
@end deftypevr

@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size})
De-allocate the space pointed to by @var{ptr}.

@var{ptr} is never @code{NULL}; it's always a previously allocated block of
@var{size} bytes.
@end deftypevr

A @dfn{byte} here means the unit used by the @code{sizeof} operator.

The @var{reallocate_function} parameter @var{old_size} and the
@var{free_function} parameter @var{size} are passed for convenience, but of
course they can be ignored if not needed by an implementation. The default
functions using @code{malloc} and friends for instance don't use them.

No error return is allowed from any of these functions; if they return then
they must have performed the specified operation. In particular note that
@var{allocate_function} or @var{reallocate_function} mustn't return
@code{NULL}.

Getting a different fatal error action is a good use for custom allocation
functions, for example giving a graphical dialog rather than the default print
to @code{stderr}. How much is possible when genuinely out of memory is
another question though.

There's currently no defined way for the allocation functions to recover from
an error such as out of memory; they must terminate program execution. A
@code{longjmp} or throwing a C++ exception will have undefined results. This
may change in the future.
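
As an illustration of the rules above, the following is a minimal sketch of a
set of allocation functions that never return @code{NULL} and terminate the
program on failure. The names @code{xalloc}, @code{xrealloc}, @code{xfree}
and @code{fatal_out_of_memory} are hypothetical, chosen only for this example.
Rounding a zero size up to one byte is a precaution taken by the sketch, not a
documented GMP requirement.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fatal error action: report the failed request and stop.
   An application could substitute its own action here, as long as it
   does not return.  */
static void
fatal_out_of_memory (size_t size)
{
  fprintf (stderr, "fatal: cannot allocate %lu bytes\n",
           (unsigned long) size);
  abort ();
}

/* Matches the allocate_function declaration.  Never returns NULL.  */
void *
xalloc (size_t alloc_size)
{
  /* malloc(0) may legitimately return NULL, so ask for at least 1 byte.  */
  void *p = malloc (alloc_size != 0 ? alloc_size : 1);
  if (p == NULL)
    fatal_out_of_memory (alloc_size);
  return p;
}

/* Matches the reallocate_function declaration.  realloc already copies
   the smaller of the old and new sizes, so old_size is not needed.  */
void *
xrealloc (void *ptr, size_t old_size, size_t new_size)
{
  void *p;
  (void) old_size;
  p = realloc (ptr, new_size != 0 ? new_size : 1);
  if (p == NULL)
    fatal_out_of_memory (new_size);
  return p;
}

/* Matches the free_function declaration.  The size is not needed.  */
void
xfree (void *ptr, size_t size)
{
  (void) size;
  free (ptr);
}
```

Such functions would be installed with
@code{mp_set_memory_functions (xalloc, xrealloc, xfree)}, normally before any
other GMP function is called.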
7370 7371 GMP may use allocated blocks to hold pointers to other allocated blocks. This 7372 will limit the assumptions a conservative garbage collection scheme can make. 7373 7374 Since the default GMP allocation uses @code{malloc} and friends, those 7375 functions will be linked in even if the first thing a program does is an 7376 @code{mp_set_memory_functions}. It's necessary to change the GMP sources if 7377 this is a problem. 7378 7379 @sp 1 7380 @deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t)) 7381 Get the current allocation functions, storing function pointers to the 7382 locations given by the arguments. If an argument is @code{NULL}, that 7383 function pointer is not stored. 7384 7385 @need 1000 7386 For example, to get just the current free function, 7387 7388 @example 7389 void (*freefunc) (void *, size_t); 7390 7391 mp_get_memory_functions (NULL, NULL, &freefunc); 7392 @end example 7393 @end deftypefun 7394 7395 @node Language Bindings, Algorithms, Custom Allocation, Top 7396 @chapter Language Bindings 7397 @cindex Language bindings 7398 @cindex Other languages 7399 7400 The following packages and projects offer access to GMP from languages other 7401 than C, though perhaps with varying levels of functionality and efficiency. 7402 7403 @c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces 7404 @c in tex, just to separate the URL from the preceding text a bit. 7405 @iftex 7406 @macro spaceuref {U} 7407 @ @ @uref{\U\} 7408 @end macro 7409 @end iftex 7410 @ifnottex 7411 @macro spaceuref {U} 7412 @uref{\U\} 7413 @end macro 7414 @end ifnottex 7415 7416 @sp 1 7417 @table @asis 7418 @item C++ 7419 @itemize @bullet 7420 @item 7421 GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward 7422 interface, expression templates to eliminate temporaries. 
7423 @item 7424 ALP @spaceuref{https://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and 7425 polynomials using templates. 7426 @item 7427 Arithmos @spaceuref{http://cant.ua.ac.be/old/arithmos/} @* Rationals 7428 with infinities and square roots. 7429 @item 7430 CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic. 7431 @item 7432 Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices. 7433 @item 7434 NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library. 7435 @end itemize 7436 7437 @c @item D 7438 @c @itemize @bullet 7439 @c @item 7440 @c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/} 7441 @c @end itemize 7442 7443 @item Eiffel 7444 @itemize @bullet 7445 @item 7446 Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442} 7447 @end itemize 7448 7449 @c @item Fortran 7450 @c @itemize @bullet 7451 @c @item 7452 @c Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary 7453 @c precision floats. 7454 @c @end itemize 7455 7456 @item Haskell 7457 @itemize @bullet 7458 @item 7459 Glasgow Haskell Compiler @spaceuref{https://www.haskell.org/ghc/} 7460 @end itemize 7461 7462 @item Java 7463 @itemize @bullet 7464 @item 7465 Kaffe @spaceuref{https://github.com/kaffe/kaffe} 7466 @end itemize 7467 7468 @item Lisp 7469 @itemize @bullet 7470 @item 7471 GNU Common Lisp @spaceuref{https://www.gnu.org/software/gcl/gcl.html} 7472 @item 7473 Librep @spaceuref{http://librep.sourceforge.net/} 7474 @item 7475 @c FIXME: When there's a stable release with gmp support, just refer to it 7476 @c rather than bothering to talk about betas. 7477 XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional 7478 big integers, rationals and floats using GMP. 7479 @end itemize 7480 7481 @item M4 7482 @itemize @bullet 7483 @item 7484 @c FIXME: When there's a stable release with gmp support, just refer to it 7485 @c rather than bothering to talk about betas. 
7486 GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides 7487 an arbitrary precision @code{mpeval}. 7488 @end itemize 7489 7490 @item ML 7491 @itemize @bullet 7492 @item 7493 MLton compiler @spaceuref{http://mlton.org/} 7494 @end itemize 7495 7496 @item Objective Caml 7497 @itemize @bullet 7498 @item 7499 MLGMP @spaceuref{http://opam.ocamlpro.com/pkg/mlgmp.20120224.html} 7500 @item 7501 Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using 7502 GMP. 7503 @end itemize 7504 7505 @item Oz 7506 @itemize @bullet 7507 @item 7508 Mozart @spaceuref{http://mozart.github.io/} 7509 @end itemize 7510 7511 @item Pascal 7512 @itemize @bullet 7513 @item 7514 GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit. 7515 @item 7516 Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal, 7517 optionally using GMP. 7518 @end itemize 7519 7520 @item Perl 7521 @itemize @bullet 7522 @item 7523 GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration 7524 Programs}). 7525 @item 7526 Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but 7527 not as many functions as the GMP module above. 7528 @item 7529 Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into 7530 normal Math::BigInt operations. 7531 @end itemize 7532 7533 @need 1000 7534 @item Pike 7535 @itemize @bullet 7536 @item 7537 mpz module in the standard distribution, @uref{http://pike.ida.liu.se/} 7538 @end itemize 7539 7540 @need 500 7541 @item Prolog 7542 @itemize @bullet 7543 @item 7544 SWI Prolog @spaceuref{http://www.swi-prolog.org/} @* 7545 Arbitrary precision floats. 
@end itemize

@item Python
@itemize @bullet
@item
GMPY @uref{https://code.google.com/p/gmpy/}
@end itemize

@item Ruby
@itemize @bullet
@item
@uref{http://rubygems.org/gems/gmp}
@end itemize

@item Scheme
@itemize @bullet
@item
GNU Guile @spaceuref{https://www.gnu.org/software/guile/guile.html}
@item
RScheme @spaceuref{http://www.rscheme.org/}
@item
STklos @spaceuref{http://www.stklos.net/}
@c
@c For reference, MzScheme uses some of gmp, but (as of version 205) it only
@c has copies of some of the generic C code, and we don't consider that a
@c language binding to gmp.
@c
@end itemize

@item Smalltalk
@itemize @bullet
@item
GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
@end itemize

@item Other
@itemize @bullet
@item
Axiom @uref{https://savannah.nongnu.org/projects/axiom} @* Computer algebra
using GCL.
@item
DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
mathematical programming language.
@item
GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
@item
GOO @spaceuref{https://www.eecs.berkeley.edu/~jrb/goo/} @* Dynamic object oriented
language.
@item
Maxima @uref{https://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
computer algebra using GCL.
@c @item
@c Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
@item
Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
@item
Yacas @spaceuref{http://yacas.sourceforge.net} @* Yet another computer algebra system.
7603 @end itemize 7604 7605 @end table 7606 7607 7608 @node Algorithms, Internals, Language Bindings, Top 7609 @chapter Algorithms 7610 @cindex Algorithms 7611 7612 This chapter is an introduction to some of the algorithms used for various GMP 7613 operations. The code is likely to be hard to understand without knowing 7614 something about the algorithms. 7615 7616 Some GMP internals are mentioned, but applications that expect to be 7617 compatible with future GMP releases should take care to use only the 7618 documented functions. 7619 7620 @menu 7621 * Multiplication Algorithms:: 7622 * Division Algorithms:: 7623 * Greatest Common Divisor Algorithms:: 7624 * Powering Algorithms:: 7625 * Root Extraction Algorithms:: 7626 * Radix Conversion Algorithms:: 7627 * Other Algorithms:: 7628 * Assembly Coding:: 7629 @end menu 7630 7631 7632 @node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms 7633 @section Multiplication 7634 @cindex Multiplication algorithms 7635 7636 N@cross{}N limb multiplications and squares are done using one of seven 7637 algorithms, as the size N increases. 7638 7639 @quotation 7640 @multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7641 @item Algorithm @tab Threshold 7642 @item Basecase @tab (none) 7643 @item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD} 7644 @item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD} 7645 @item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD} 7646 @item Toom-6.5 @tab @code{MUL_TOOM6H_THRESHOLD} 7647 @item Toom-8.5 @tab @code{MUL_TOOM8H_THRESHOLD} 7648 @item FFT @tab @code{MUL_FFT_THRESHOLD} 7649 @end multitable 7650 @end quotation 7651 7652 Similarly for squaring, with the @code{SQR} thresholds. 7653 7654 N@cross{}M multiplications of operands with different sizes above 7655 @code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired 7656 algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced 7657 Multiplication}). 
7658 7659 @menu 7660 * Basecase Multiplication:: 7661 * Karatsuba Multiplication:: 7662 * Toom 3-Way Multiplication:: 7663 * Toom 4-Way Multiplication:: 7664 * Higher degree Toom'n'half:: 7665 * FFT Multiplication:: 7666 * Other Multiplication:: 7667 * Unbalanced Multiplication:: 7668 @end menu 7669 7670 7671 @node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms 7672 @subsection Basecase Multiplication 7673 7674 Basecase N@cross{}M multiplication is a straightforward rectangular set of 7675 cross-products, the same as long multiplication done by hand and for that 7676 reason sometimes known as the schoolbook or grammar school method. This is an 7677 @m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M 7678 (@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code. 7679 7680 Assembly implementations of @code{mpn_mul_basecase} are essentially the same 7681 as the generic C code, but have all the usual assembly tricks and 7682 obscurities introduced for speed. 7683 7684 A square can be done in roughly half the time of a multiply, by using the fact 7685 that the cross products above and below the diagonal are the same. A triangle 7686 of products below the diagonal is formed, doubled (left shift by one bit), and 7687 then the products on the diagonal added. This can be seen in 7688 @file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take 7689 essentially the same approach. 
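
The rectangle of cross-products, and the triangle-and-diagonal arrangement
used for squaring, can be sketched in portable C as follows. This is only an
illustration under simplifying assumptions: it uses a hypothetical 32-bit
@code{limb_t} with 64-bit intermediate products, and the names
@code{mul_basecase} and @code{sqr_basecase} are local to the example. It is
not the code in @file{mpn/generic}.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical limb type, not GMP's mp_limb_t */

/* rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1], least significant limb
   first.  One row of cross-products is accumulated per limb of vp.  */
void
mul_basecase (limb_t *rp, const limb_t *up, size_t un,
              const limb_t *vp, size_t vn)
{
  size_t i, j;

  for (i = 0; i < un + vn; i++)
    rp[i] = 0;

  for (j = 0; j < vn; j++)
    {
      uint64_t carry = 0;
      for (i = 0; i < un; i++)
        {
          /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64-1.  */
          uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
          rp[i + j] = (limb_t) t;
          carry = t >> 32;
        }
      rp[j + un] = (limb_t) carry;
    }
}

/* rp[0..2n-1] = up[0..n-1]^2, forming each off-diagonal product once.  */
void
sqr_basecase (limb_t *rp, const limb_t *up, size_t n)
{
  size_t i, j;
  uint64_t carry;
  limb_t in = 0;

  for (i = 0; i < 2 * n; i++)
    rp[i] = 0;

  /* Triangle of products up[i]*up[j] with i < j.  */
  for (j = 1; j < n; j++)
    {
      carry = 0;
      for (i = 0; i < j; i++)
        {
          uint64_t t = (uint64_t) up[i] * up[j] + rp[i + j] + carry;
          rp[i + j] = (limb_t) t;
          carry = t >> 32;
        }
      rp[2 * j] = (limb_t) carry;
    }

  /* Double the triangle: left shift the whole result by one bit.  */
  for (i = 0; i < 2 * n; i++)
    {
      limb_t out = rp[i] >> 31;
      rp[i] = (rp[i] << 1) | in;
      in = out;
    }

  /* Add the diagonal squares up[i]^2 at limb position 2*i.  */
  carry = 0;
  for (i = 0; i < n; i++)
    {
      uint64_t sq = (uint64_t) up[i] * up[i];
      uint64_t t = (uint64_t) rp[2 * i] + (limb_t) sq + carry;
      rp[2 * i] = (limb_t) t;
      t = (uint64_t) rp[2 * i + 1] + (sq >> 32) + (t >> 32);
      rp[2 * i + 1] = (limb_t) t;
      carry = t >> 32;
    }
}
```

For N limbs the squaring routine forms @math{N(N-1)/2} off-diagonal products
plus N diagonal ones, against @math{N^2} in the general multiply, which is
where the saving comes from.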
7690 7691 @tex 7692 \def\GMPline#1#2#3#4#5#6{% 7693 \hbox {% 7694 \vrule height 2.5ex depth 1ex 7695 \hbox to 2em {\hfil{#2}\hfil}% 7696 \vrule \hbox to 2em {\hfil{#3}\hfil}% 7697 \vrule \hbox to 2em {\hfil{#4}\hfil}% 7698 \vrule \hbox to 2em {\hfil{#5}\hfil}% 7699 \vrule \hbox to 2em {\hfil{#6}\hfil}% 7700 \vrule}} 7701 \GMPdisplay{ 7702 \hbox{% 7703 \vbox{% 7704 \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}% 7705 \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}% 7706 \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}% 7707 \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}% 7708 \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}% 7709 \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}% 7710 \vfill}% 7711 \vbox{% 7712 \hbox{% 7713 \hbox to 2em {\hfil u0\hfil}% 7714 \hbox to 2em {\hfil u1\hfil}% 7715 \hbox to 2em {\hfil u2\hfil}% 7716 \hbox to 2em {\hfil u3\hfil}% 7717 \hbox to 2em {\hfil u4\hfil}}% 7718 \vskip 0.7ex 7719 \hrule 7720 \GMPline{u0}{d}{}{}{}{}% 7721 \hrule 7722 \GMPline{u1}{}{d}{}{}{}% 7723 \hrule 7724 \GMPline{u2}{}{}{d}{}{}% 7725 \hrule 7726 \GMPline{u3}{}{}{}{d}{}% 7727 \hrule 7728 \GMPline{u4}{}{}{}{}{d}% 7729 \hrule}}} 7730 @end tex 7731 @ifnottex 7732 @example 7733 @group 7734 u0 u1 u2 u3 u4 7735 +---+---+---+---+---+ 7736 u0 | d | | | | | 7737 +---+---+---+---+---+ 7738 u1 | | d | | | | 7739 +---+---+---+---+---+ 7740 u2 | | | d | | | 7741 +---+---+---+---+---+ 7742 u3 | | | | d | | 7743 +---+---+---+---+---+ 7744 u4 | | | | | d | 7745 +---+---+---+---+---+ 7746 @end group 7747 @end example 7748 @end ifnottex 7749 7750 In practice squaring isn't a full 2@cross{} faster than multiplying, it's 7751 usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates 7752 @code{mpn_sqr_basecase} wants improving on that CPU. 7753 7754 On some CPUs @code{mpn_mul_basecase} can be faster than the generic C 7755 @code{mpn_sqr_basecase} on some small sizes. 
@code{SQR_BASECASE_THRESHOLD} is 7756 the size at which to use @code{mpn_sqr_basecase}, this will be zero if that 7757 routine should be used always. 7758 7759 7760 @node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms 7761 @subsection Karatsuba Multiplication 7762 @cindex Karatsuba multiplication 7763 7764 The Karatsuba multiplication algorithm is described in Knuth section 4.3.3 7765 part A, and various other textbooks. A brief description is given here. 7766 7767 The inputs @math{x} and @math{y} are treated as each split into two parts of 7768 equal length (or the most significant part one limb shorter if N is odd). 7769 7770 @tex 7771 % GMPboxwidth used for all the multiplication pictures 7772 \global\newdimen\GMPboxwidth \global\GMPboxwidth=5em 7773 % GMPboxdepth and GMPboxheight are also used for the float pictures 7774 \global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex 7775 \global\newdimen\GMPboxheight \global\GMPboxheight=2ex 7776 \gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth} 7777 \def\GMPbox#1#2{% 7778 \vbox {% 7779 \hrule 7780 \hbox to 2\GMPboxwidth{% 7781 \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}% 7782 \hrule}} 7783 \GMPdisplay{% 7784 \vbox{% 7785 \hbox to 2\GMPboxwidth {high \hfil low} 7786 \vskip 0.7ex 7787 \GMPbox{x_1}{x_0} 7788 \vskip 0.5ex 7789 \GMPbox{y_1}{y_0} 7790 }} 7791 @end tex 7792 @ifnottex 7793 @example 7794 @group 7795 high low 7796 +----------+----------+ 7797 | x1 | x0 | 7798 +----------+----------+ 7799 7800 +----------+----------+ 7801 | y1 | y0 | 7802 +----------+----------+ 7803 @end group 7804 @end example 7805 @end ifnottex 7806 7807 Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is 7808 @math{k} limbs (@ms{y,0} the same) then 7809 @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 
7810 With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the 7811 following holds, 7812 7813 @display 7814 @m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0, 7815 x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0} 7816 @end display 7817 7818 This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs, 7819 whereas a basecase multiply of N@cross{}N limbs is equivalent to four 7820 multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent 7821 the positions where the three products must be added. 7822 7823 @tex 7824 \def\GMPboxA#1#2{% 7825 \vbox{% 7826 \hrule 7827 \hbox{% 7828 \GMPvrule 7829 \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}% 7830 \vrule 7831 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7832 \vrule} 7833 \hrule}} 7834 \def\GMPboxB#1#2{% 7835 \hbox{% 7836 \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}% 7837 \vbox{% 7838 \hrule 7839 \hbox{% 7840 \GMPvrule 7841 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7842 \vrule}% 7843 \hrule}}} 7844 \GMPdisplay{% 7845 \vbox{% 7846 \hbox to 4\GMPboxwidth {high \hfil low} 7847 \vskip 0.7ex 7848 \GMPboxA{x_1y_1}{x_0y_0} 7849 \vskip 0.5ex 7850 \GMPboxB{$+$}{x_1y_1} 7851 \vskip 0.5ex 7852 \GMPboxB{$+$}{x_0y_0} 7853 \vskip 0.5ex 7854 \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)} 7855 }} 7856 @end tex 7857 @ifnottex 7858 @example 7859 @group 7860 high low 7861 +--------+--------+ +--------+--------+ 7862 | x1*y1 | | x0*y0 | 7863 +--------+--------+ +--------+--------+ 7864 +--------+--------+ 7865 add | x1*y1 | 7866 +--------+--------+ 7867 +--------+--------+ 7868 add | x0*y0 | 7869 +--------+--------+ 7870 +--------+--------+ 7871 sub | (x1-x0)*(y1-y0) | 7872 +--------+--------+ 7873 @end group 7874 @end example 7875 @end ifnottex 7876 7877 The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an 7878 absolute value, and the sign used to choose to add or subtract. 
Notice the 7879 sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1), 7880 high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb 7881 additions, rather than @m{6k,6*k}, but in GMP extra function call overheads 7882 outweigh the saving. 7883 7884 Squaring is similar to multiplying, but with @math{x=y} the formula reduces to 7885 an equivalent with three squares, 7886 7887 @display 7888 @m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2, 7889 x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2} 7890 @end display 7891 7892 The final result is accumulated from those three squares the same way as for 7893 the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now 7894 always positive. 7895 7896 A similar formula for both multiplying and squaring can be constructed with a 7897 middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed 7898 @math{k} limbs, leading to more carry handling and additions than the form 7899 above. 7900 7901 Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm, 7902 the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies 7903 each @math{1/2} the size of the inputs. This is a big improvement over the 7904 basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra 7905 additions Karatsuba performs. @code{MUL_TOOM22_THRESHOLD} can be as little 7906 as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}. 7907 7908 The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c, 7909 M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + 7910 e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 + 7911 {3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. 
The 7912 factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the 7913 basecase code will increase the threshold since they benefit @math{M(N)} more 7914 than @math{K(N)}. And conversely the @m{3\over2, 3/2} for @math{b} means 7915 linear style speedups of @math{b} will increase the threshold since they 7916 benefit @math{K(N)} more than @math{M(N)}. The latter can be seen for 7917 instance when adding an optimized @code{mpn_sqr_diagonal} to 7918 @code{mpn_sqr_basecase}. Of course all speedups reduce total time, and in 7919 that sense the algorithm thresholds are merely of academic interest. 7920 7921 7922 @node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms 7923 @subsection Toom 3-Way Multiplication 7924 @cindex Toom multiplication 7925 7926 The Karatsuba formula is the simplest case of a general approach to splitting 7927 inputs that leads to both Toom and FFT algorithms. A description of 7928 Toom can be found in Knuth section 4.3.3, with an example 3-way 7929 calculation after Theorem A@. The 3-way form used in GMP is described here. 7930 7931 The operands are each considered split into 3 pieces of equal length (or the 7932 most significant part 1 or 2 limbs shorter than the other two). 
7933 7934 @tex 7935 \def\GMPbox#1#2#3{% 7936 \vbox{% 7937 \hrule \vfil 7938 \hbox to 3\GMPboxwidth {% 7939 \GMPvrule 7940 \hfil$#1$\hfil 7941 \vrule 7942 \hfil$#2$\hfil 7943 \vrule 7944 \hfil$#3$\hfil 7945 \vrule}% 7946 \vfil \hrule 7947 }} 7948 \GMPdisplay{% 7949 \vbox{% 7950 \hbox to 3\GMPboxwidth {high \hfil low} 7951 \vskip 0.7ex 7952 \GMPbox{x_2}{x_1}{x_0} 7953 \vskip 0.5ex 7954 \GMPbox{y_2}{y_1}{y_0} 7955 \vskip 0.5ex 7956 }} 7957 @end tex 7958 @ifnottex 7959 @example 7960 @group 7961 high low 7962 +----------+----------+----------+ 7963 | x2 | x1 | x0 | 7964 +----------+----------+----------+ 7965 7966 +----------+----------+----------+ 7967 | y2 | y1 | y0 | 7968 +----------+----------+----------+ 7969 @end group 7970 @end example 7971 @end ifnottex 7972 7973 @noindent 7974 These parts are treated as the coefficients of two polynomials 7975 7976 @display 7977 @group 7978 @m{X(t) = x_2t^2 + x_1t + x_0, 7979 X(t) = x2*t^2 + x1*t + x0} 7980 @m{Y(t) = y_2t^2 + y_1t + y_0, 7981 Y(t) = y2*t^2 + y1*t + y0} 7982 @end group 7983 @end display 7984 7985 Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1}, 7986 @ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then 7987 @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 7988 With this @math{x=X(b)} and @math{y=Y(b)}. 7989 7990 Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients 7991 are 7992 7993 @display 7994 @m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0, 7995 W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0} 7996 @end display 7997 7998 The @m{w_i,w[i]} are going to be determined, and when they are they'll give 7999 the final result using @math{w=W(b)}, since 8000 @m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}. 
The coefficients will be roughly 8001 @math{b^2} each, and the final @math{W(b)} will be an addition like, 8002 8003 @tex 8004 \def\GMPbox#1#2{% 8005 \moveright #1\GMPboxwidth 8006 \vbox{% 8007 \hrule 8008 \hbox{% 8009 \GMPvrule 8010 \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}% 8011 \vrule}% 8012 \hrule 8013 }} 8014 \GMPdisplay{% 8015 \vbox{% 8016 \hbox to 6\GMPboxwidth {high \hfil low}% 8017 \vskip 0.7ex 8018 \GMPbox{0}{w_4} 8019 \vskip 0.5ex 8020 \GMPbox{1}{w_3} 8021 \vskip 0.5ex 8022 \GMPbox{2}{w_2} 8023 \vskip 0.5ex 8024 \GMPbox{3}{w_1} 8025 \vskip 0.5ex 8026 \GMPbox{4}{w_0} 8027 }} 8028 @end tex 8029 @ifnottex 8030 @example 8031 @group 8032 high low 8033 +-------+-------+ 8034 | w4 | 8035 +-------+-------+ 8036 +--------+-------+ 8037 | w3 | 8038 +--------+-------+ 8039 +--------+-------+ 8040 | w2 | 8041 +--------+-------+ 8042 +--------+-------+ 8043 | w1 | 8044 +--------+-------+ 8045 +-------+-------+ 8046 | w0 | 8047 +-------+-------+ 8048 @end group 8049 @end example 8050 @end ifnottex 8051 8052 The @m{w_i,w[i]} coefficients could be formed by a simple set of cross 8053 products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2}, 8054 @m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all 8055 nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely 8056 to a basecase multiply. Instead the following approach is used. 8057 8058 @math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving 8059 values of @math{W(t)} at those points. 
In GMP the following points are used, 8060 8061 @quotation 8062 @multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 8063 @item Point @tab Value 8064 @item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 8065 @item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)} 8066 @item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)} 8067 @item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)} 8068 @item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately 8069 @end multitable 8070 @end quotation 8071 8072 At @math{t=-1} the values can be negative and that's handled using the 8073 absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the 8074 value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in 8075 the limit as t approaches infinity}, but it's much easier to think of as 8076 simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like 8077 @m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately). 8078 8079 Each of the points substituted into 8080 @m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination 8081 of the @m{w_i,w[i]} coefficients, and the value of those combinations has just 8082 been calculated. 
8083 8084 @tex 8085 \GMPdisplay{% 8086 $\matrix{% 8087 W(0) & = & & & & & & & & & w_0 \cr 8088 W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr 8089 W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr 8090 W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr 8091 W(\infty) & = & w_4 \cr 8092 }$} 8093 @end tex 8094 @ifnottex 8095 @example 8096 @group 8097 W(0) = w0 8098 W(1) = w4 + w3 + w2 + w1 + w0 8099 W(-1) = w4 - w3 + w2 - w1 + w0 8100 W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0 8101 W(inf) = w4 8102 @end group 8103 @end example 8104 @end ifnottex 8105 8106 This is a set of five equations in five unknowns, and some elementary linear 8107 algebra quickly isolates each @m{w_i,w[i]}. This involves adding or 8108 subtracting one @math{W(t)} value from another, and a couple of divisions by 8109 powers of 2 and one division by 3, the latter using the special 8110 @code{mpn_divexact_by3} (@pxref{Exact Division}). 8111 8112 The conversion of @math{W(t)} values to the coefficients is interpolation. A 8113 polynomial of degree 4 like @math{W(t)} is uniquely determined by values known 8114 at 5 different points. The points are arbitrary and can be chosen to make the 8115 linear equations come out with a convenient set of steps for quickly isolating 8116 the @m{w_i,w[i]}. 8117 8118 Squaring follows the same procedure as multiplication, but there's only one 8119 @math{X(t)} and it's evaluated at the 5 points, and those values squared to 8120 give values of @math{W(t)}. The interpolation is then identical, and in fact 8121 the same @code{toom_interpolate_5pts} subroutine is used for both squaring and 8122 multiplying. 8123 8124 Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being 8125 @m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the 8126 original size each. 
This is an improvement over Karatsuba at 8127 @math{O(N^@W{1.585})}, though Toom does more work in the evaluation and 8128 interpolation and so it only realizes its advantage above a certain size. 8129 8130 Near the crossover between Toom-3 and Karatsuba there's generally a range of 8131 sizes where the difference between the two is small. 8132 @code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and 8133 successive runs of the tune program can give different values due to small 8134 variations in measuring. A graph of time versus size for the two shows the 8135 effect, see @file{tune/README}. 8136 8137 At the fairly small sizes where the Toom-3 thresholds occur it's worth 8138 remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be 8139 expected to make accurate predictions, due of course to the big influence of 8140 all sorts of overheads, and the fact that only a few recursions of each are 8141 being performed. Even at large sizes there's a good chance machine dependent 8142 effects like cache architecture will mean actual performance deviates from 8143 what might be predicted. 8144 8145 The formula given for the Karatsuba algorithm (@pxref{Karatsuba 8146 Multiplication}) has an equivalent for Toom-3 involving only five multiplies, 8147 but this would be complicated and unenlightening. 8148 8149 An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using 8150 a vector to represent the @math{x} and @math{y} splits and a matrix 8151 multiplication for the evaluation and interpolation stages. The matrix 8152 inverses are not meant to be actually used, and they have elements with values 8153 much greater than in fact arise in the interpolation steps. The diagram shown 8154 for the 3-way is attractive, but again doesn't have to be implemented that way 8155 and for example with a bit of rearrangement just one division by 6 can be 8156 done. 
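
The evaluate, multiply and interpolate stages described in this section can be
sketched in Python, whose built-in integers stand in for limb arrays. This is
an illustration only, not GMP's code. The name @code{toom3_mul} is
hypothetical, the split is by bits rather than limbs, the pointwise products
use Python's own multiply instead of recursing, and the interpolation order is
just one possible sequence using exact divisions by 2 and a single exact
division by 3; it is not claimed to match @code{toom_interpolate_5pts}.

```python
def toom3_mul(x, y):
    """Multiply non-negative integers x and y with one level of Toom-3."""
    # Split each operand into three pieces of k bits (the top piece may be short).
    n = max(x.bit_length(), y.bit_length())
    k = -(-n // 3) or 1                      # ceil(n/3), at least 1
    mask = (1 << k) - 1
    x0, x1, x2 = x & mask, (x >> k) & mask, x >> (2 * k)
    y0, y1, y2 = y & mask, (y >> k) & mask, y >> (2 * k)

    # Evaluate and multiply at t = 0, 1, -1, 2 and infinity.
    # The t = -1 values may be negative; Python handles the sign for us.
    W0 = x0 * y0
    W1 = (x2 + x1 + x0) * (y2 + y1 + y0)
    Wm1 = (x2 - x1 + x0) * (y2 - y1 + y0)
    W2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)
    Winf = x2 * y2

    # Interpolate: every division below is exact.
    w0, w4 = W0, Winf
    t1 = (W1 + Wm1) // 2                     # w4 + w2 + w0
    t2 = (W1 - Wm1) // 2                     # w3 + w1
    w2 = t1 - w0 - w4
    t3 = (W2 - w0 - 4 * w2 - 16 * w4) // 2   # 4*w3 + w1
    w3 = (t3 - t2) // 3
    w1 = t2 - w3

    # Recombine: the product is W(2^k).
    return w0 + (w1 << k) + (w2 << (2 * k)) + (w3 << (3 * k)) + (w4 << (4 * k))
```

Only five multiplications of roughly one-third size appear, which is where the
@math{O(N^@W{1.465})} exponent comes from once the products recurse.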
8157 8158 8159 @node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms 8160 @subsection Toom 4-Way Multiplication 8161 @cindex Toom multiplication 8162 8163 Karatsuba and Toom-3 split the operands into 2 and 3 coefficients, 8164 respectively. Toom-4 analogously splits the operands into 4 coefficients. 8165 Using the notation from the section on Toom-3 multiplication, we form two 8166 polynomials: 8167 8168 @display 8169 @group 8170 @m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0, 8171 X(t) = x3*t^3 + x2*t^2 + x1*t + x0} 8172 @m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0, 8173 Y(t) = y3*t^3 + y2*t^2 + y1*t + y0} 8174 @end group 8175 @end display 8176 8177 @math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving 8178 values of @math{W(t)} at those points. In GMP the following points are used, 8179 8180 @quotation 8181 @multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 8182 @item Point @tab Value 8183 @item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 8184 @item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)} 8185 @item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)} 8186 @item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)} 8187 @item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)} 8188 @item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)} 8189 @item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately 8190 @end multitable 8191 @end quotation 8192 8193 The number of additions and subtractions for Toom-4 is much larger than for Toom-3. 8194 But several subexpressions occur multiple times, for example @m{x_2+x_0,x2+x0}, occurs 8195 for both @math{t=1} and @math{t=-1}. 
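
The sharing of subexpressions between points of opposite sign can be sketched
as follows. This is a hypothetical helper (@code{toom4_evaluate} is not a GMP
function) that evaluates one operand at the seven points in the table; the
values at @math{t=1/2} and @math{t=-1/2} are scaled by 8 exactly as in the
table.

```python
def toom4_evaluate(x0, x1, x2, x3):
    """Evaluate X(t) = x3*t^3 + x2*t^2 + x1*t + x0 at the seven Toom-4 points."""
    # Even and odd parts are computed once and reused for both signs.
    even_1, odd_1 = x2 + x0, x3 + x1                 # shared by t = 1 and t = -1
    even_h, odd_h = 2 * x2 + 8 * x0, x3 + 4 * x1     # shared by t = 1/2 and t = -1/2
    return {
        "0": x0,
        "1/2": even_h + odd_h,       # x3 + 2*x2 + 4*x1 + 8*x0
        "-1/2": even_h - odd_h,      # -x3 + 2*x2 - 4*x1 + 8*x0
        "1": even_1 + odd_1,         # x3 + x2 + x1 + x0
        "-1": even_1 - odd_1,        # -x3 + x2 - x1 + x0
        "2": 8 * x3 + 4 * x2 + 2 * x1 + x0,
        "inf": x3,
    }
```

Each pair of opposite points then costs one extra addition and one
subtraction beyond the shared even and odd parts.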

Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
original size each.


@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
@subsection Higher degree Toom'n'half
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces. In general a split of two equally long operands into
@math{r} pieces leads to evaluations and pointwise multiplications done at
@m{2r-1,2*r-1} points. To fully exploit symmetries it would be better to have
a multiple of 4 points, which is why Toom'n'half is used for higher degrees.

Toom'n'half means that the existence of one more piece is considered for a
single operand. It can be virtual, i.e. zero, or real, when the two operands
are not exactly balanced. By choosing an even @math{r},
Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.

The four-plets of points include 0, @m{\infty,inf}, +1, -1 and
@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i}. Each of them gives shortcuts for the
evaluation phase and for some steps in the interpolation phase. Further tricks
are used to reduce the memory footprint of the whole multiplication algorithm
to a memory buffer equal in size to the result of the product.

Current GMP uses both Toom-6'n'half and Toom-8'n'half.


@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
@subsection FFT Multiplication
@cindex FFT multiplication
@cindex Fast Fourier Transform

At large to very large sizes a Fermat style FFT multiplication is used,
following Sch@"onhage and Strassen (@pxref{References}).
Descriptions of FFTs 8234 in various forms can be found in many textbooks, for instance Knuth section 8235 4.3.3 part C or Lipson chapter IX@. A brief description of the form used in 8236 GMP is given here. 8237 8238 The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given 8239 @math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge 8240 \mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding 8241 @math{x} and @math{y} with high zero limbs. The modular product is the native 8242 form for the algorithm, so padding to get a full product is unavoidable. 8243 8244 The algorithm follows a split, evaluate, pointwise multiply, interpolate and 8245 combine similar to that described above for Karatsuba and Toom-3. A @math{k} 8246 parameter controls the split, with an FFT-@math{k} splitting into @math{2^k} 8247 pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of 8248 @m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so 8249 the split falls on limb boundaries, avoiding bit shifts in the split and 8250 combine stages. 8251 8252 The evaluations, pointwise multiplications, and interpolation, are all done 8253 modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a 8254 multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of 8255 interpolation will be the following negacyclic convolution of the input 8256 pieces, and the choice of @math{N'} ensures these sums aren't truncated. 8257 @tex 8258 $$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$ 8259 @end tex 8260 @ifnottex 8261 8262 @example 8263 --- 8264 \ b 8265 w[n] = / (-1) * x[i] * y[j] 8266 --- 8267 i+j==b*2^k+n 8268 b=0,1 8269 @end example 8270 8271 @end ifnottex 8272 The points used for the evaluation are @math{g^i} for @math{i=0} to 8273 @math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. 
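
The negacyclic convolution above can be checked directly with a small sketch.
The function below is hypothetical and deliberately naive: it forms the sums
@math{w[n]} with a quadratic double loop rather than with any transform, and
it assumes @math{x} and @math{y} are below @math{2^N} and that @math{N} is a
multiple of @math{2^k}. It only illustrates why the signed wrap-around gives a
product mod @math{2^N+1}.

```python
def negacyclic_mul(x, y, N, k):
    """Return x*y mod 2^N+1 from the negacyclic convolution of 2^k pieces."""
    pieces = 1 << k
    if N % pieces != 0:
        raise ValueError("N must be a multiple of 2^k")
    M = N // pieces                          # bits per piece
    mask = (1 << M) - 1
    xs = [(x >> (i * M)) & mask for i in range(pieces)]
    ys = [(y >> (j * M)) & mask for j in range(pieces)]

    w = [0] * pieces
    for i in range(pieces):
        for j in range(pieces):
            if i + j < pieces:
                w[i + j] += xs[i] * ys[j]            # b = 0 term
            else:
                # b = 1 term: 2^N is -1 mod 2^N+1, so the wrap flips the sign.
                w[i + j - pieces] -= xs[i] * ys[j]

    # Add the sums at their offsets and reduce.
    total = sum(wn << (n * M) for n, wn in enumerate(w))
    return total % ((1 << N) + 1)
```

With @math{N} at least the combined bit length of the operands the modular
result is the full product, which is the padding case described earlier.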
@math{g} is a 8274 @m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary 8275 cancellations at the interpolation stage, and it's also a power of 2 so the 8276 fast Fourier transforms used for the evaluation and interpolation do only 8277 shifts, adds and negations. 8278 8279 The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either 8280 recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or 8281 basecase), whichever is optimal at the size @math{N'}. The interpolation is 8282 an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j, 8283 x[i]*y[j]} are added at appropriate offsets to give the final result. 8284 8285 Squaring is the same, but @math{x} is the only input so it's one transform at 8286 the evaluate stage and the pointwise multiplies are squares. The 8287 interpolation is the same. 8288 8289 For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}), 8290 O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed 8291 modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original. 8292 Each successive @math{k} is an asymptotic improvement, but overheads mean each 8293 is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE} 8294 and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each 8295 new @math{k} effectively swaps some multiplying for some shifts, adds and 8296 overheads. 8297 8298 A mod @math{2^N+1} product can be formed with a normal 8299 @math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT 8300 and Toom-3 etc can be compared directly. A @math{k=4} FFT at 8301 @math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at 8302 @math{O(N^@W{1.465})}. In practice this is what's found, with 8303 @code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between 8304 300 and 1000 limbs, depending on the CPU@. 
So far it's been found that only 8305 very large FFTs recurse into pointwise multiplies above these sizes. 8306 8307 When an FFT is to give a full product, the change of @math{N} to @math{2N} 8308 doesn't alter the theoretical complexity for a given @math{k}, but for the 8309 purposes of considering where an FFT might be first used it can be assumed 8310 that the FFT is recursing into a normal multiply and that on that basis it's 8311 doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of 8312 the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean 8313 @math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3. 8314 In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been 8315 found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs. 8316 8317 The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is 8318 rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that 8319 when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a 8320 multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of 8321 @math{N} just under such a multiple will be rounded to the next. The 8322 complexity calculations above assume that a favourable size is used, meaning 8323 one which isn't padded through rounding, and it's also assumed that the extra 8324 @math{+k+3} bits are negligible at typical FFT sizes. 8325 8326 The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a 8327 step-effect into measured speeds. For example @math{k=8} will round @math{N} 8328 up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb 8329 groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for 8330 @math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. 
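
The group sizes quoted here follow from simple arithmetic, sketched below. The
helper name is hypothetical, and the halving is an inference from the figures
in the text: a full product of two operands of @math{s} limbs needs @math{N}
of at least @math{2s} limbs, so a step in @math{N} corresponds to half that
many operand limbs. It assumes @math{2^k} is at least the limb size in bits.

```python
def fft_step_limbs(k, limb_bits=32):
    """Operand-size step, in limbs, over which an FFT-k full product is flat."""
    n_step_bits = 1 << (2 * k - 1)       # effective N is a multiple of 2^(2k-1) bits
    # N covers both operands of a full product, hence the factor of 2.
    return n_step_bits // (2 * limb_bits)
```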
In practice it's been found each @math{k} is used at quite small multiples of its
size constraint and so the step effect is quite noticeable in a time versus
size graph.

The threshold determinations currently measure at the mid-points of size
steps, but this is sub-optimal since at the start of a new step it can happen
that it's better to go back to the previous @math{k} for a while. Something
more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
needed.


@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
@subsection Other Multiplication
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces, as per Knuth section 4.3.3 algorithm C@. This is not
currently used. The notes here are merely for interest.

In general a split into @math{r+1} pieces is made, and evaluations and
pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7
pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way
algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only
the pointwise multiplications count towards big-@math{O} complexity, but the
time spent in the evaluate and interpolate stages grows with @math{r} and has
a significant practical impact, with the asymptotic advantage of each @math{r}
realized only at bigger and bigger sizes. The overheads grow as
@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
r), O(N*log(r))}.
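
The exponents quoted for Karatsuba, Toom-3 and Toom-4 are all instances of
this formula, as the following small sketch shows. The function name is
hypothetical, and @code{pieces} here is the @math{r+1} of the text.

```python
import math

def toom_exponent(pieces):
    """Exponent of N for a Toom split into `pieces` parts (pieces = r+1)."""
    # 2*pieces - 1 pointwise multiplies, each on operands 1/pieces the size.
    return math.log(2 * pieces - 1) / math.log(pieces)
```

For example @code{toom_exponent(2)} is about 1.585, @code{toom_exponent(3)}
about 1.465 and @code{toom_exponent(4)} about 1.404.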

Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
multiplies in the evaluate stage (or rather trades them for additions), and
has a further saving of nearly half the interpolate steps. The idea is to
separate odd and even final coefficients and then perform algorithm C steps C7
and C8 on them separately. The divisors at step C7 become @math{j^2} and the
multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.

Splitting odd and even parts through positive and negative points can be
thought of as using @math{-1} as a square root of unity. If a 4th root of
unity were available then a further split and speedup would be possible, but no
such root exists for plain integers. Going to complex integers with
@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
form it takes three real multiplies to do a complex multiply. The existence
of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.

Floating point FFTs use complex numbers approximating Nth roots of unity.
Some processors have special support for such FFTs. But these are not used in
GMP since it's very difficult to guarantee an exact result (to some number of
bits). An occasional difference of 1 in the last bit might not matter to a
typical signal processing algorithm, but is of course of vital importance to
GMP.


@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms
@subsection Unbalanced Multiplication
@cindex Unbalanced multiplication

Multiplication of operands with different sizes, both below
@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
(@pxref{Basecase Multiplication}).

For really large operands, we invoke FFT directly.

For operands between these sizes, we use Toom-inspired algorithms suggested by
Alberto Zanoni and Marco Bodrato. The idea is to split the operands into
polynomials of different degree. GMP currently splits the smaller operand
into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
3.

@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
@c screws up layout here and there in the rest of the manual.
@c @tex
@c \goodbreak
@c @end tex
@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
@section Division Algorithms
@cindex Division algorithms

@menu
* Single Limb Division::
* Basecase Division::
* Divide and Conquer Division::
* Block-Wise Barrett Division::
* Exact Division::
* Exact Remainder::
* Small Quotient Division::
@end menu


@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
@subsection Single Limb Division

N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
high to low, either with a hardware divide instruction or a multiplication by
inverse, whichever is best on a given CPU.

The multiply by inverse follows ``Improved division by invariant integers'' by
M@"oller and Granlund (@pxref{References}) and is implemented as
@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to have a
fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
multiply by the high limb (plus one bit) of the dividend to get a quotient
@math{q}. With @math{d} normalized (high bit set), @math{q} is no more than 1
too small.
Subtracting @m{qd,q*d} from the dividend gives a remainder, and 8438 reveals whether @math{q} or @math{q-1} is correct. 8439 8440 The result is a division done with two multiplications and four or five 8441 arithmetic operations. On CPUs with low latency multipliers this can be much 8442 faster than a hardware divide, though the cost of calculating the inverse at 8443 the start may mean it's only better on inputs bigger than say 4 or 5 limbs. 8444 8445 When a divisor must be normalized, either for the generic C 8446 @code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is 8447 actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and 8448 @math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set. 8449 The bit shifts for the dividend are usually accomplished ``on the fly'' 8450 meaning by extracting the appropriate bits at each step. Done this way the 8451 quotient limbs come out aligned ready to store. When only the remainder is 8452 wanted, an alternative is to take the dividend limbs unshifted and calculate 8453 @m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k 8454 \bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or 8455 few registers. 8456 8457 The multiply by inverse can be done two limbs at a time. The calculation is 8458 basically the same, but the inverse is two limbs and the divisor treated as if 8459 padded with a low zero limb. This means more work, since the inverse will 8460 need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are 8461 independent and can therefore be done partly or wholly in parallel. Likewise 8462 for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two 8463 limbs with roughly the same two multiplies worth of latency that one limb at a 8464 time gives. 
This extends to 3 or 4 limbs at a time, though the extra work to 8465 apply the inverse will almost certainly soon reach the limits of multiplier 8466 throughput. 8467 8468 A similar approach in reverse can be taken to process just half a limb at a 8469 time if the divisor is only a half limb. In this case the 1@cross{}1 multiply 8470 for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each 8471 limb, which can be a saving on CPUs with a fast half limb multiply, or in fact 8472 if the only multiply is a half limb, and especially if it's not pipelined. 8473 8474 8475 @node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms 8476 @subsection Basecase Division 8477 8478 Basecase N@cross{}M division is like long division done by hand, but in base 8479 @m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth 8480 section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}. 8481 8482 Briefly stated, while the dividend remains larger than the divisor, a high 8483 quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at 8484 the top end of the dividend. With a normalized divisor (most significant bit 8485 set), each quotient limb can be formed with a 2@cross{}1 division and a 8486 1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is 8487 by the high limb of the divisor and is done either with a hardware divide or a 8488 multiply by inverse (the same as in @ref{Single Limb Division}) whichever is 8489 faster. Such a quotient is sometimes one too big, requiring an addback of the 8490 divisor, but that happens rarely. 8491 8492 With Q=N@minus{}M being the number of quotient limbs, this is an 8493 @m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase 8494 Q@cross{}M multiplication, differing in fact only in the extra multiply and 8495 divide for each of the Q quotient limbs. 
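
The high-to-low 2@cross{}1 step that both the single limb and basecase
divisions are built on can be sketched as below for the N@cross{}1 case. This
is a simplified illustration with a hypothetical function name: Python's
@code{divmod} stands in for the hardware divide or the multiply by inverse,
and no normalization of the divisor is needed at this level.

```python
def divrem_1(limbs, d, limb_bits=32):
    """Divide a little-endian list of limbs by the single limb d.

    Returns (quotient_limbs, remainder).
    """
    base = 1 << limb_bits
    if not 0 < d < base:
        raise ValueError("d must be a non-zero single limb")
    quotient = [0] * len(limbs)
    r = 0
    for i in reversed(range(len(limbs))):
        # One 2x1 step: the two-limb value (r, limbs[i]) divided by d.
        # r < d on entry, so the quotient always fits in a single limb.
        quotient[i], r = divmod(r * base + limbs[i], d)
    return quotient, r
```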
8496 8497 8498 @node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms 8499 @subsection Divide and Conquer Division 8500 8501 For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing. 8502 Or to be precise by a recursive divide and conquer algorithm based on work by 8503 Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}). 8504 8505 The algorithm consists essentially of recognising that a 2N@cross{}N division 8506 can be done with the basecase division algorithm (@pxref{Basecase Division}), 8507 but using N/2 limbs as a base, not just a single limb. This way the 8508 multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of 8509 Karatsuba and higher multiplication algorithms (@pxref{Multiplication 8510 Algorithms}). The two ``digits'' of the quotient are formed by recursive 8511 N@cross{}(N/2) divisions. 8512 8513 If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication 8514 then the work is about the same as a basecase division, but with more function 8515 call overheads and with some subtractions separated from the multiplies. 8516 These overheads mean that it's only when N/2 is above 8517 @code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use. 8518 8519 @code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere 8520 above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the 8521 CPU@. An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a 8522 little by offering a ready-made advantage over repeated @code{mpn_submul_1} 8523 calls. 8524 8525 Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where 8526 @math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The 8527 actual time is a sum over multiplications of the recursed sizes, as can be 8528 seen near the end of section 2.2 of Burnikel and Ziegler. 
For example, within 8529 the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher 8530 algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log 8531 N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division 8532 is about 2 to 4 times slower than an N@cross{}N multiplication. 8533 8534 8535 @node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms 8536 @subsection Block-Wise Barrett Division 8537 8538 For the largest divisions, a block-wise Barrett division algorithm is used. 8539 Here, the divisor is inverted to a precision determined by the relative size of 8540 the dividend and divisor. Blocks of quotient limbs are then generated by 8541 multiplying blocks from the dividend by the inverse. 8542 8543 Our block-wise algorithm computes a smaller inverse than in the plain Barrett 8544 algorithm. For a @math{2n/n} division, the inverse will be just @m{\lceil n/2 8545 \rceil, ceil(n/2)} limbs. 8546 8547 8548 @node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms 8549 @subsection Exact Division 8550 8551 8552 A so-called exact division is when the dividend is known to be an exact 8553 multiple of the divisor. Jebelean's exact division algorithm uses this 8554 knowledge to make some significant optimizations (@pxref{References}). 8555 8556 The idea can be illustrated in decimal for example with 368154 divided by 8557 543. Because the low digit of the dividend is 4, the low digit of the 8558 quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10, 8559 4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of 8560 the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7 8561 @equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be 8562 subtracted from the dividend leaving 363810. Notice the low digit has become 8563 zero. 
8564 8565 The procedure is repeated at the second digit, with the next quotient digit 7 8566 (@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting 8567 @m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at 8568 the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7 8569 mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0. 8570 So the quotient is 678. 8571 8572 Notice however that the multiplies and subtractions don't need to extend past 8573 the low three digits of the dividend, since that's enough to determine the 8574 three quotient digits. For the last quotient digit no subtraction is needed 8575 at all. On a 2N@cross{}N division like this one, only about half the work of 8576 a normal basecase division is necessary. 8577 8578 For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the 8579 saving over a normal basecase division is in two parts. Firstly, each of the 8580 Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and 8581 multiply. Secondly, the crossproducts are reduced when @math{Q>M} to 8582 @m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2, 8583 Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many 8584 divisions are saved, or if Q is small then the crossproducts reduce to a small 8585 number. 8586 8587 The modular inverse used is calculated efficiently by @code{binvert_limb} in 8588 @file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a 8589 64-bit limb. @file{tune/modlinv.c} has some alternate implementations that 8590 might suit processors better at bit twiddling than multiplying. 8591 8592 The sub-quadratic exact division described by Jebelean in ``Exact Division 8593 with Karatsuba Complexity'' is not currently implemented. 
It uses a 8594 rearrangement similar to the divide and conquer for normal division 8595 (@pxref{Divide and Conquer Division}), but operating from low to high. A 8596 further possibility not currently implemented is ``Bidirectional Exact Integer 8597 Division'' by Krandick and Jebelean which forms quotient limbs from both the 8598 high and low ends of the dividend, and can halve once more the number of 8599 crossproducts needed in a 2N@cross{}N division. 8600 8601 A special case exact division by 3 exists in @code{mpn_divexact_by3}, 8602 supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms 8603 quotient digits with a multiply by the modular inverse of 3 (which is 8604 @code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next 8605 limb. The multiplications don't need to be on the dependent chain, as long as 8606 the effect of the borrows is applied, which can help chips with pipelined 8607 multipliers. 8608 8609 8610 @node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms 8611 @subsection Exact Remainder 8612 @cindex Exact remainder 8613 8614 If the exact division algorithm is done with a full subtraction at each stage 8615 and the dividend isn't a multiple of the divisor, then low zero limbs are 8616 produced but with a remainder in the high limbs. For dividend @math{a}, 8617 divisor @math{d}, quotient @math{q}, and @m{b = 2 8618 \GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder 8619 @math{r} is of the form 8620 @tex 8621 $$ a = qd + r b^n $$ 8622 @end tex 8623 @ifnottex 8624 8625 @example 8626 a = q*d + r*b^n 8627 @end example 8628 8629 @end ifnottex 8630 @math{n} represents the number of zero limbs produced by the subtractions, 8631 that being the number of limbs produced for @math{q}. @math{r} will be in the 8632 range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by 8633 a factor of @math{b^n}. 
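
The low-to-high procedure of the decimal example, and the shifted remainder it
leaves when the dividend is not a multiple, can be sketched as follows. This
is an illustration with a hypothetical function name, working on whole Python
integers one digit at a time with full subtractions. It assumes the divisor is
coprime to the base and Python 3.8 or later for the modular inverse. The sign
of @math{r} is simply whatever the subtractions leave, so it may be negative
for some inputs.

```python
def divexact_low_to_high(a, d, n, base=10):
    """Form n quotient digits from the low end.

    Returns (q, r) with a == q*d + r*base**n.  r == 0 when d divides a
    and the quotient fits in n digits.
    """
    inv = pow(d % base, -1, base)        # inverse of the low digit of d
    q = 0
    for i in range(n):
        low = (a // base**i) % base      # current low digit of the dividend
        digit = (low * inv) % base       # next quotient digit
        q += digit * base**i
        a -= digit * d * base**i         # makes digit i of a zero
    return q, a // base**n               # low n digits of a are now zero
```

With @code{divexact_low_to_high(368154, 543, 3)} the digits 8, 7 and 6 come
out in that order, giving the quotient 678 and a zero remainder.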
8634 8635 Carrying out full subtractions at each stage means the same number of cross 8636 products must be done as a normal division, but there's still some single limb 8637 divisions saved. When @math{d} is a single limb some simplifications arise, 8638 providing good speedups on a number of processors. 8639 8640 The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the 8641 internal @code{mpn_redc_X} functions differ subtly in how they return @math{r}, 8642 leading to some negations in the above formula, but all are essentially the 8643 same. 8644 8645 @cindex Divisibility algorithm 8646 @cindex Congruence algorithm 8647 Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this 8648 leads to divisibility or congruence tests which are potentially more efficient 8649 than a normal division. 8650 8651 The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is 8652 odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and 8653 @code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}). 8654 8655 Montgomery's REDC method for modular multiplications uses operands of the form 8656 of @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n}) 8657 (yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact 8658 remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n} 8659 (@pxref{Modular Powering Algorithm}). 8660 8661 Notice that @math{r} generally gives no useful information about the ordinary 8662 remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If 8663 however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the 8664 ordinary remainder. This occurs whenever @math{d} is a factor of 8665 @math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or 8666 64 bit limb other such factors include 5, 17 and 257, but no particular use 8667 has been found for this. 
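
A sketch of a divide by 3 in this style is given below. It is not the GMP
routine: the function name is only borrowed for illustration, the borrow is
computed arithmetically rather than with two comparisons, and the sign
convention for the returned carry is this sketch's own. It assumes an even
limb size so that the inverse has the @code{0xAA..AAB} form.

```python
def divexact_by3(limbs, limb_bits=32):
    """Divide a little-endian limb list by 3, working from the low limb up.

    Returns (quotient_limbs, c) with  A == 3*Q - c * b**n,  where
    b = 2**limb_bits and n = len(limbs).  c is 0 when 3 divides A.
    """
    base = 1 << limb_bits
    inv3 = (2 * base + 1) // 3           # 0xAA..AAB: 3*inv3 == 2*base + 1
    quotient, c = [], 0
    for limb in limbs:
        s = limb - c                     # apply the borrow from the limb below
        q = (s * inv3) % base            # quotient limb, 3*q == s mod base
        quotient.append(q)
        c = (3 * q - s) // base          # borrow for the next limb: 0, 1 or 2
    return quotient, c
```

Since @math{b^n @equiv{} 1 @bmod 3}, the returned carry is the negative of the
ordinary remainder mod 3, matching the observation above.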
8668 8669 8670 @node Small Quotient Division, , Exact Remainder, Division Algorithms 8671 @subsection Small Quotient Division 8672 8673 An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is 8674 small can be optimized somewhat. 8675 8676 An ordinary basecase division normalizes the divisor by shifting it to make 8677 the high bit set, shifting the dividend accordingly, and shifting the 8678 remainder back down at the end of the calculation. This is wasteful if only a 8679 few quotient limbs are to be formed. Instead a division of just the top 8680 @m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be 8681 used to form a trial quotient. This requires only those limbs normalized, not 8682 the whole of the divisor and dividend. 8683 8684 A multiply and subtract then applies the trial quotient to the M@minus{}Q 8685 unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q 8686 limbs remaining from the trial quotient division). The starting trial 8687 quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1 8688 too big are detected by first comparing the most significant limbs that will 8689 arise from the subtraction. An addback is done if the quotient still turns 8690 out to be 1 too big. 8691 8692 This whole procedure is essentially the same as one step of the basecase 8693 algorithm done in a Q limb base, though with the trial quotient test done only 8694 with the high limbs, not an entire Q limb ``digit'' product. The correctness 8695 of this weaker test can be established by following the argument of Knuth 8696 section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r 8697 + u_2, v2*q>b*r+u2} condition appropriately relaxed. 


@need 1000
@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
@section Greatest Common Divisor
@cindex Greatest common divisor algorithms
@cindex GCD algorithms

@menu
* Binary GCD::
* Lehmer's Algorithm::
* Subquadratic GCD::
* Extended GCD::
* Jacobi Symbol::
@end menu


@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
@subsection Binary GCD

At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described
in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply
consists of successively reducing odd operands @math{a} and @math{b} using

@quotation
@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
strip factors of 2 from @math{a}
@end quotation

The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
be faster than the Euclidean algorithm everywhere. One reason the binary
method does well is that the implied quotient at each step is usually small,
so often only one or two subtractions are needed to get the same effect as a
division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
section 4.5.3 Theorem E.

When the implied quotient is large, meaning @math{b} is much smaller than
@math{a}, then a division is worthwhile. This is the basis for the initial
@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.
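
The reduction step above can be sketched for single 64-bit values as follows.
This is only an illustration, not the GMP code: the names @code{low_zeros} and
@code{binary_gcd} are invented here, and the zero bit count is done with a
plain loop rather than any processor specific instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of low zero bits of a nonzero x. */
static int low_zeros(uint64_t x)
{
    assert(x != 0);
    int c = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        c++;
    }
    return c;
}

/* Binary GCD: a,b = |a-b|,min(a,b), then strip factors of 2 from a. */
static uint64_t binary_gcd(uint64_t a, uint64_t b)
{
    if (a == 0) return b;
    if (b == 0) return a;

    int shift = low_zeros(a | b);    /* power of 2 common to both */
    a >>= low_zeros(a);              /* make both operands odd */
    b >>= low_zeros(b);

    while (a != b) {
        uint64_t lo = a < b ? a : b;
        uint64_t diff = a < b ? b - a : a - b;   /* nonzero and even */
        a = diff >> low_zeros(diff);
        b = lo;
    }
    return a << shift;
}
```

No initial @math{a @bmod b} reduction is done in this sketch, so very unequal
operands take many subtraction steps.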

@sp 1
The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
code as described above. For two N-bit operands, the algorithm takes about
0.68 iterations per bit. For optimum performance some attention needs to be
paid to the way the factors of 2 are stripped from @math{a}.

Firstly it may be noted that in twos complement the number of low zero bits on
@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.

A loop stripping low zero bits tends not to branch predict well, since the
condition is data dependent. But on average there are only a few low zeros, so
an option is to strip one or two bits arithmetically then loop for more (as
done for AMD K6). Or use a lookup table to get a count for several bits then
loop for more (as done for AMD K7). An alternative approach is to keep just
one of @math{a} or @math{b} odd and iterate

@quotation
@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
@math{a = a/2} if even @*
@math{b = b/2} if even
@end quotation

This requires about 1.25 iterations per bit, but stripping of a single bit at
each step avoids any branching. Repeating the bit strip reduces to about 0.9
iterations per bit, which may be a worthwhile tradeoff.

Generally with the above approaches a speed of perhaps 6 cycles per bit can be
achieved, which is still not terribly fast, with for instance a 64-bit GCD
taking nearly 400 cycles. It's this sort of time which means it's not usually
advantageous to combine a set of divisibility tests into a GCD.

Currently, the binary algorithm is used for GCD only when @math{N < 3}.

@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
@comment  node-name,  next,  previous,  up
@subsection Lehmer's Algorithm

Lehmer's improvement of the Euclidean algorithm is based on the observation
that the initial part of the quotient sequence depends only on the most
significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
splits off the most significant two limbs, as suggested, e.g., in ``A
Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
quotients of two double-limb inputs are collected as a 2 by 2 matrix with
single-limb elements. This is done by the function @code{mpn_hgcd2}. The
resulting matrix is applied to the inputs using @code{mpn_mul_1} and
@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
limb. In the rare case of a large quotient, no progress can be made by
examining just the most significant two limbs, and the quotient is computed
using plain division.

The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
algorithm and the binary algorithm. The quadratic part of the work is
the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
linear work is also significant. There are roughly @math{N} calls to the
@code{mpn_hgcd2} function. This function uses a couple of important
optimizations:

@itemize
@item
It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
section). This means that when called with the most significant two limbs of
two large numbers, the returned matrix does not always correspond exactly to
the initial quotient sequence for the two large numbers; the final quotient
may sometimes be one off.

@item
It takes advantage of the fact that the quotients are usually small.
The division
operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further; it
uses many branches that are unfriendly to prediction.)

@item
It switches from double-limb calculations to single-limb calculations half-way
through, when the input numbers have been reduced in size from two limbs to
one and a half.

@end itemize

@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
@subsection Subquadratic GCD

For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.

Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
limbs, while their difference @math{@abs{}(c-d)} must fit in @math{S} limbs. The
matrix elements will also be of size roughly @math{N/2}.

The HGCD base case uses Lehmer's algorithm, but with the above stop condition
that returns reduced numbers and the corresponding transformation matrix
half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
computed recursively, using the divide and conquer algorithm in ``On
Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
(@pxref{References}). The recursive algorithm consists of these main
steps.

@itemize

@item
Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
resulting matrix @math{T_1} to the full numbers, reducing them to a size just
above @math{3N/2}.

@item
Perform a small number of division or subtraction steps to reduce the numbers
to size below @math{3N/2}. This is essential mainly for the unlikely case of
large quotients.

@item
Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
them to a size just above @math{N/2}.

@item
Compute @math{T = T_1 T_2}.

@item
Perform a small number of division and subtraction steps to satisfy the
requirements, and return.
@end itemize

GCD is then implemented as a loop around HGCD, similarly to Lehmer's
algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
sub-quadratic GCD chops off the most significant third of the limbs (the
proportion is a tuning parameter, and @math{1/3} seems to be more efficient
than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
matrix. Once the input numbers are reduced to size below
@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.

The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.

@comment  node-name,  next,  previous,  up

@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
@subsection Extended GCD

The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
handle this case. The binary algorithm is used only for single-limb GCDEXT.
Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}.
Above
this threshold, GCDEXT is implemented as a loop around HGCD, but with more
book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.

One difference from plain GCD is that while the inputs @math{a} and @math{b} are
reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
size. This makes the tuning of the chopping-point more difficult. The current
code chops off the most significant half of the inputs for the call to HGCD in
the first iteration, and the most significant two thirds for the remaining
calls. This strategy could surely be improved. Also the stop condition for the
loop, where Lehmer's algorithm is invoked once the inputs are reduced below
@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
current size of the cofactors.

@node Jacobi Symbol,  , Extended GCD, Greatest Common Divisor Algorithms
@subsection Jacobi Symbol
@cindex Jacobi symbol algorithm

[This section is obsolete. The current Jacobi code actually uses a very
efficient algorithm.]

@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
simple binary algorithm similar to that described for the GCDs (@pxref{Binary
GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
improvement or a binary based multi-step algorithm is likely to be better.

When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
and friends, an initial reduction is done with either @code{mpn_mod_1} or
@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
The binary algorithm is well suited to a single limb, and the whole
calculation in this case is quite efficient.
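
A binary style Jacobi symbol calculation on single 64-bit values might look
like the following sketch. It is an illustration only and not the GMP code:
the name @code{jacobi_u64} is invented, the initial reduction is a plain
@code{%}, and the sign of the result is kept as a single bit updated by
exclusive-or rather than through conditional jumps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: Jacobi symbol (a/b) for odd b, returning 1, -1 or 0. */
static int jacobi_u64(uint64_t a, uint64_t b)
{
    assert(b & 1);
    unsigned bit = 0;            /* result is (-1)^bit */

    a %= b;                      /* initial reduction */
    while (a != 0) {
        while ((a & 1) == 0) {
            a >>= 1;
            /* (2/b) = -1 when b = 3 or 5 mod 8, which is bit 1 of b^(b>>1) */
            bit ^= (unsigned)(((b ^ (b >> 1)) >> 1) & 1);
        }
        if (a < b) {
            /* reciprocity: sign flips when a = b = 3 mod 4, bit 1 of a&b */
            bit ^= (unsigned)(((a & b) >> 1) & 1);
            uint64_t t = a;
            a = b;
            b = t;
        }
        a -= b;                  /* both odd, so a becomes even or zero */
    }
    /* b now holds gcd of the original operands */
    if (b != 1)
        return 0;
    return bit ? -1 : 1;
}
```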

In all the routines sign changes for the result are accumulated using some bit
twiddling, avoiding table lookups or conditional jumps.


@need 1000
@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
@section Powering Algorithms
@cindex Powering algorithms

@menu
* Normal Powering Algorithm::
* Modular Powering Algorithm::
@end menu


@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
@subsection Normal Powering

Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
successively squaring and then multiplying by the base when a 1 bit is seen in
the exponent, as per Knuth section 4.6.3. The ``left to right''
variant described there is used rather than algorithm A, since it's just as
easy and can be done with somewhat less temporary memory.


@node Modular Powering Algorithm,  , Normal Powering Algorithm, Powering Algorithms
@subsection Modular Powering

Modular powering is implemented using a @math{2^k}-ary sliding window
algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
(@pxref{References}). @math{k} is chosen according to the size of the
exponent. Larger exponents use larger values of @math{k}, the choice being
made to minimize the average number of multiplications that must supplement
the squaring.

The modular multiplies and squarings use either a simple division or the REDC
method by Montgomery (@pxref{References}). REDC is a little faster,
essentially saving N single limb divisions in a fashion similar to an exact
remainder (@pxref{Exact Remainder}).
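
The sliding window idea can be sketched on 64-bit values as below. This is
not the GMP implementation: @code{mulmod} and @code{powmod_window} are names
invented for the example, the window size @math{k} is passed in by the caller
instead of being chosen from the exponent size, reductions use a plain
division rather than REDC, and @code{unsigned __int128} (a GCC and Clang
extension) is assumed for the double-width product.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a*b mod m using a 128-bit product. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (uint64_t)((unsigned __int128)a * b % m);
}

/* Hypothetical sketch: base^e mod m with a sliding window of k bits. */
static uint64_t powmod_window(uint64_t base, uint64_t e, uint64_t m, int k)
{
    assert(m != 0 && k >= 1 && k <= 6);

    uint64_t odd[32];                    /* odd[j] = base^(2j+1) mod m */
    odd[0] = base % m;
    uint64_t b2 = mulmod(odd[0], odd[0], m);
    for (int j = 1; j < (1 << (k - 1)); j++)
        odd[j] = mulmod(odd[j - 1], b2, m);

    uint64_t r = 1 % m;
    int i = 63;                          /* scan exponent bits, high to low */
    while (i >= 0) {
        if (((e >> i) & 1) == 0) {
            r = mulmod(r, r, m);         /* a 0 bit costs one squaring */
            i--;
        } else {
            /* longest window of at most k bits that ends in a 1 bit */
            int l = i - k + 1;
            if (l < 0)
                l = 0;
            while (((e >> l) & 1) == 0)
                l++;
            int width = i - l + 1;
            uint64_t w = (e >> l) & (((uint64_t)1 << width) - 1);
            for (int j = 0; j < width; j++)
                r = mulmod(r, r, m);
            r = mulmod(r, odd[w >> 1], m);   /* w is odd, one multiply */
            i = l - 1;
        }
    }
    return r;
}
```

With @math{k=1} every window is a single bit and the sketch degenerates to
plain left to right binary powering, which gives a convenient cross-check for
larger window sizes.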


@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
@section Root Extraction Algorithms
@cindex Root extraction algorithms

@menu
* Square Root Algorithm::
* Nth Root Algorithm::
* Perfect Square Algorithm::
* Perfect Power Algorithm::
@end menu


@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
@subsection Square Root
@cindex Square root algorithm
@cindex Karatsuba square root algorithm

Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
Zimmermann (@pxref{References}).

An input @math{n} is split into four parts of @math{k} bits each, so with
@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or
second highest bit is set. In GMP, @math{k} is kept on a limb boundary and
the input is left shifted (by an even number of bits) to normalize.

The square root of the high two parts is taken, by recursive application of
the algorithm (bottoming out in a one-limb Newton's method),
@tex
$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
@end tex
@ifnottex

@example
s1,r1 = sqrtrem (a3*b + a2)
@end example

@end ifnottex
This is an approximation to the desired root and is extended by a division to
give @math{s},@math{r},
@tex
$$\eqalign{
q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
s &= s'b + q \cr
r &= ub + a_0 - q^2
}$$
@end tex
@ifnottex

@example
q,u = divrem (r1*b + a1, 2*s1)
s   = s1*b + q
r   = u*b + a0 - q^2
@end example

@end ifnottex
The normalization requirement on @ms{a,3} means at this point @math{s} is
either correct or 1 too big.
@math{r} is negative in the latter case, so
@tex
$$\eqalign{
\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
r &\leftarrow r + 2s - 1 \cr
s &\leftarrow s - 1
}$$
@end tex
@ifnottex

@example
if r < 0 then
  r = r + 2*s - 1
  s = s - 1
@end example

@end ifnottex
The algorithm is expressed in a divide and conquer form, but as noted in the
paper it can also be viewed as a discrete variant of Newton's method, or as a
variation on the schoolboy method (no longer taught) for square roots two
digits at a time.

If the remainder @math{r} is not required then usually only a few high limbs
of @math{r} and @math{u} need to be calculated to determine whether an
adjustment to @math{s} is required. This optimization is not currently
implemented.

In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
of @math{n} limbs. In the FFT multiplication range this grows to a bound of
@m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to 1.8 is
found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.

The algorithm does all its calculations in integers and the resulting
@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
The extended precision given by @code{mpf_sqrt_ui} is obtained by
padding with zero limbs.


@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
@subsection Nth Root
@cindex Root extraction algorithm
@cindex Nth root algorithm

Integer Nth roots are taken using Newton's method with the following
iteration, where @math{A} is the input and @math{n} is the root to be taken.
@tex
$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
@end tex
@ifnottex

@example
         1         A
a[i+1] = - * ( --------- + (n-1)*a[i] )
         n     a[i]^(n-1)
@end example

@end ifnottex
The initial approximation @m{a_1,a[1]} is generated bitwise by successively
powering a trial root with or without new 1 bits, aiming to be just above the
true root. The iteration converges quadratically when started from a good
approximation. When @math{n} is large more initial bits are needed to get
good convergence. The current implementation is not particularly well
optimized.


@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
@subsection Perfect Square
@cindex Perfect square algorithm

A significant fraction of non-squares can be quickly identified by checking
whether the input is a quadratic residue modulo small integers.

@code{mpz_perfect_square_p} first tests the input mod 256, which means just
examining the low byte. Only 44 different values occur for squares mod 256,
so 82.8% of inputs can be immediately identified as non-squares.

On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested
too, for a total 99.62%.

These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
using additions (see @code{mpn_mod_34lsub1}).

When nails are in use moduli are instead selected by the @file{gen-psqr.c}
program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or
@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
this is not currently implemented.
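
The mod 256 filter described earlier in this section can be sketched for
64-bit inputs as follows. This is an illustration under stated assumptions,
not the GMP code: the names are invented, the table is built at run time
rather than pre-calculated, only the mod 256 test is applied, and the
confirming square root is a simple bit by bit search.

```c
#include <assert.h>
#include <stdint.h>

/* sq256[v] is 1 when v is a square mod 256.  Entry 0 is always a square,
   so it doubles as a "table initialized" flag.  */
static unsigned char sq256[256];

/* Hypothetical helper: fill the table, return how many residues are squares. */
static int count_square_residues_256(void)
{
    int count = 0;
    for (unsigned i = 0; i < 256; i++)
        sq256[(i * i) & 0xFF] = 1;
    for (unsigned i = 0; i < 256; i++)
        count += sq256[i];
    return count;
}

/* Hypothetical helper: floor of the square root, one result bit at a time. */
static uint64_t isqrt_u64(uint64_t n)
{
    uint64_t r = 0;
    for (uint64_t bit = (uint64_t)1 << 31; bit != 0; bit >>= 1) {
        uint64_t t = r | bit;
        if (t * t <= n)              /* t < 2^32, so t*t cannot overflow */
            r = t;
    }
    return r;
}

static int is_perfect_square(uint64_t n)
{
    if (!sq256[0])
        count_square_residues_256();
    if (!sq256[n & 0xFF])
        return 0;                    /* low byte rules out a square */
    uint64_t r = isqrt_u64(n);       /* survivors still need a real root */
    return r * r == n;
}
```

The input 17 shows why the final root is needed: 17 is a square mod 256, so
it passes the filter, but it is not a square.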

In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By
using a ``modexact'' style calculation, and suitably permuted tables, just one
multiply each is required, see the code for details. Moduli are also combined
to save operations, so long as the lookup tables don't become too big.
@file{gen-psqr.c} does all the pre-calculations.

A square root must still be taken for any value that passes these tests, to
verify it's really a square and not one of the small fraction of non-squares
that get through (i.e.@: a pseudo-square to all the tested bases).

Clearly more residue tests could be done, @code{mpz_perfect_square_p} only
uses a compact and efficient set. Big inputs would probably benefit from more
residue testing, small inputs might be better off with less. The assumed
distribution of squares versus non-squares in the input would affect such
considerations.


@node Perfect Power Algorithm,  , Perfect Square Algorithm, Root Extraction Algorithms
@subsection Perfect Power
@cindex Perfect power algorithm

Detecting perfect powers is required by some factorization algorithms.
Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
extractions, though naturally only prime roots need to be considered.
(@xref{Nth Root Algorithm}.)

If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
roots which are divisors of @math{e} need to be considered, much reducing the
work necessary. To this end divisibility by a set of small primes is checked.


@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
@section Radix Conversion
@cindex Radix conversion algorithms

Radix conversions are less important than other algorithms.
A program
dominated by conversions should probably use a different data representation.

@menu
* Binary to Radix::
* Radix to Binary::
@end menu


@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
@subsection Binary to Radix

Conversions from binary to a power-of-2 radix use a simple and fast
@math{O(N)} bit extraction algorithm.

Conversions from binary to other radices use one of two algorithms. Sizes
below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
@math{n} is the biggest power that fits in a limb. But instead of simply
using the remainder @math{r} from such divisions, an extra divide step is done
to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
can then be extracted using multiplications by @math{b} rather than divisions.
Special case code is provided for decimal, allowing multiplications by 10 to
optimize to shifts and adds.

Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
reached. @math{t} is then divided by that largest power, giving a quotient
which is the digits above that power, and a remainder which is those below.
These two parts are in turn divided by the second highest power, and so on
recursively. When a piece has been divided down to less than
@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
used.
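
Both ideas can be seen in miniature in the sketch below, which converts a
single 64-bit value to decimal. It is a toy under stated assumptions and not
the GMP code: all names are invented, 9 decimal digits play the role of
@math{n}, only the two powers @math{10^9} and @math{10^18} are needed at this
size, and @code{unsigned __int128} (a GCC and Clang extension) is assumed.
The base case forms a 64-bit fraction representing @math{r/10^9} and then
brings out digits by multiplying by 10.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* RADIX_POW[i] = 10^(9*2^i), the precalculated radix powers of this toy. */
static const uint64_t RADIX_POW[2] = {
    UINT64_C(1000000000), UINT64_C(1000000000000000000)
};

/* Base case: write exactly 9 digits of r < 10^9 using multiplications. */
static char *chunk_digits(uint64_t r, char *out)
{
    assert(r < RADIX_POW[0]);
    /* frac/2^64 is r/10^9 rounded up slightly, so digits never come out low */
    uint64_t frac = (uint64_t)(((unsigned __int128)r << 64) / RADIX_POW[0]) + 1;
    for (int i = 0; i < 9; i++) {
        unsigned __int128 p = (unsigned __int128)frac * 10;
        *out++ = (char)('0' + (int)(p >> 64));   /* integer part is a digit */
        frac = (uint64_t)p;                      /* keep the fraction */
    }
    return out;
}

/* Write exactly 9*2^level digits of t, where t < 10^(9*2^level). */
static char *split_digits(uint64_t t, int level, char *out)
{
    if (level == 0)
        return chunk_digits(t, out);
    uint64_t p = RADIX_POW[level - 1];
    out = split_digits(t / p, level - 1, out);   /* digits above the power */
    return split_digits(t % p, level - 1, out);  /* digits below the power */
}

/* Returns a pointer to a static buffer holding the decimal string. */
static const char *to_decimal(uint64_t t)
{
    static char buf[37];
    char tmp[36];
    split_digits(t, 2, tmp);                     /* 36 zero padded digits */
    int skip = 0;
    while (skip < 35 && tmp[skip] == '0')
        skip++;                                  /* drop leading zeros */
    memcpy(buf, tmp + skip, (size_t)(36 - skip));
    buf[36 - skip] = '\0';
    return buf;
}
```

The fraction is exact enough here because its error is at most
@math{2^-64}, far below the @math{10^-9} spacing of 9-digit values.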

The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (@pxref{Divide and Conquer
Division}), and big divisions tend to have less overhead than lots of
separate single limb divisions anyway. But in any case the cost of
calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.

@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
the same basic thing, the point where it becomes worth doing a big division to
cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
assumes that's already available, which is the case when recursing.

Since the base case produces digits from least to most significant but they
want to be stored from most to least, it's necessary to calculate in advance
how many digits there will be, or at least be sure not to underestimate that.
For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
from @code{mp_bases}, rounding up. The result is either correct or one too
big.

Examining some of the high bits of the input could increase the chance of
getting the exact number of digits, but an exact result every time would not
be practical, since in general the difference between numbers 100@dots{} and
99@dots{} is only in the last few bits and the work to identify 99@dots{}
might well be almost as much as a full conversion.

The @math{r/b^n} scheme described above for using multiplications to bring out
digits might be useful for more than a single limb. Some brief experiments
with it on the base case when recursing didn't give a noticeable improvement,
but perhaps that was only due to the implementation.
Something similar would
work for the sub-quadratic divisions too, though there would be the cost of
calculating a bigger radix power.

Another possible improvement for the sub-quadratic part would be to arrange
for radix powers that balanced the sizes of quotient and remainder produced,
i.e.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to
@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.


@node Radix to Binary,  , Binary to Radix, Radix Conversion Algorithms
@subsection Radix to Binary

@strong{This section needs to be rewritten, it currently describes the
algorithms used before GMP 4.3.}

Conversions from a power-of-2 radix into binary use a simple and fast
@math{O(N)} bitwise concatenation algorithm.

Conversions from other radices use one of two algorithms. Sizes below
@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups
of @math{n} digits are converted to limbs, where @math{n} is the biggest
power of the base @math{b} which will fit in a limb, then those groups are
accumulated into the result by multiplying by @math{b^n} and adding. This
saves multi-precision operations, as per Knuth section 4.4 part E
(@pxref{References}). Some special case code is provided for decimal, giving
the compiler a chance to optimize multiplications by 10.

Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
First groups of @math{n} digits are converted into limbs. Then adjacent
limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
and @math{y} are the limbs. Adjacent limb pairs are combined into quads
similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block
remains, that being the result.
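
The pairwise combining can be shown in miniature with 64-bit arithmetic. The
sketch below is a toy and not the GMP code: the name @code{from_decimal} is
invented, groups of 4 decimal digits stand in for limbs, and the input is
limited to 16 digits so that every block fits in a @code{uint64_t}.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: convert a decimal string of at most 16 digits. */
static uint64_t from_decimal(const char *s)
{
    enum { N = 4, GROUPS = 4 };          /* 4 groups of 4 digits each */
    uint64_t blk[GROUPS] = {0, 0, 0, 0};
    size_t len = strlen(s);
    assert(len <= (size_t)N * GROUPS);

    /* Step 1: convert groups of N digits, least significant group first. */
    for (size_t i = 0; i < len; i++) {
        assert(s[i] >= '0' && s[i] <= '9');
        size_t place = len - 1 - i;      /* digit position from the right */
        size_t g = place / N;
        blk[g] = blk[g] * 10 + (uint64_t)(s[i] - '0');
    }

    /* Step 2: combine adjacent blocks as x*power + y, squaring the power
       each round: pairs with 10^4, then quads with 10^8.  */
    uint64_t power = 10000;              /* 10^N */
    for (int count = GROUPS; count > 1; count /= 2) {
        for (int j = 0; j < count / 2; j++)
            blk[j] = blk[2 * j + 1] * power + blk[2 * j];
        power *= power;
    }
    return blk[0];
}
```

At this size nothing is gained, of course. The point of the arrangement is
that with many limbs each round multiplies big blocks by a big power.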

The advantage of this method is that the multiplications for each @math{x} are
big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
some processors much bigger still.

@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
for decimal), though it might be better based on a limb count, so as to be
independent of the base. But that sort of count isn't used by the base case
and so would need some sort of initial calculation or estimate.

The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).


@need 1000
@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
@section Other Algorithms

@menu
* Prime Testing Algorithm::
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
* Random Number Algorithms::
@end menu


@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
@subsection Prime Testing
@cindex Prime testing algorithms

The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
Functions}) first does some trial division by small factors and then uses the
Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
section 4.5.4 algorithm P (@pxref{References}).
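
One round of the strong pseudoprime test behind Miller-Rabin can be sketched
for 64-bit inputs as follows. This is an illustration and not the GMP code:
the names are invented, the base is passed in by the caller rather than chosen
at random, no trial division is done, and @code{unsigned __int128} (a GCC and
Clang extension) is assumed. The test writes @math{n-1} as an odd @math{q}
times @math{2^k} and looks for 1 or @math{-1} at @math{x^q}, or @math{-1}
among the next @math{k-1} successive squares.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a*b mod m using a 128-bit product. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (uint64_t)((unsigned __int128)a * b % m);
}

/* Hypothetical helper: x^e mod m by left to right binary powering. */
static uint64_t powmod(uint64_t x, uint64_t e, uint64_t m)
{
    uint64_t r = 1 % m;
    for (int i = 63; i >= 0; i--) {
        r = mulmod(r, r, m);
        if ((e >> i) & 1)
            r = mulmod(r, x, m);
    }
    return r;
}

/* Returns 1 if odd n > 2 passes one round to base x (so is a probable
   prime), or 0 if n is definitely composite.  */
static int strong_probable_prime(uint64_t n, uint64_t x)
{
    assert(n > 2 && (n & 1));
    uint64_t q = n - 1;
    int k = 0;
    while ((q & 1) == 0) {               /* n - 1 = q * 2^k with q odd */
        q >>= 1;
        k++;
    }
    uint64_t y = powmod(x, q, n);
    if (y == 1 || y == n - 1)
        return 1;
    for (int j = 1; j < k; j++) {
        y = mulmod(y, y, n);             /* y = x^(q*2^j) mod n */
        if (y == n - 1)
            return 1;
        if (y == 1)
            return 0;                    /* non-trivial square root of 1 */
    }
    return 0;
}
```

For example 2047 = 23*89 passes with base 2 but is caught with base 3, which
is why several bases are normally tried.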

For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}. If so then @math{n}
is probably prime, if not then @math{n} is definitely composite.

Any prime @math{n} will pass the test, but some composites do too. Such
composites are known as strong pseudoprimes to base @math{x}. No @math{n} is
a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
22), hence with @math{x} chosen at random there's no more than a @math{1/4}
chance a ``probable prime'' will in fact be composite.

In fact strong pseudoprimes are quite rare, making the test much more
powerful than this analysis would suggest, but @math{1/4} is all that's proven
for an arbitrary @math{n}.


@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
@subsection Factorial
@cindex Factorial algorithm

Factorials are calculated by a combination of two algorithms. An idea is
shared among them: to compute the odd part of the factorial; a final step
takes account of the power of @math{2} term, by shifting.

For small @math{n}, the odd factor of @math{n!} is computed with the simple
observation that it is equal to the product of all positive odd numbers
not exceeding @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
recursively. The procedure can be best illustrated with an example,

@quotation
@math{23!
= (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
@end quotation

Current code collects all the factors in a single list, with a loop and no
recursion, and computes the product, with no special care for repeated chunks.

When @math{n} is larger, the computation passes through prime sieving. A helper
function is used, as suggested by Peter Luschny:
@tex
$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
p^{\mathop{\rm L}(p,n)} $$
@end tex
@ifnottex

@example
                            n
                          -----
               n!          | |   L(p,n)
msf(n) = -------------- =  | |  p
          [n/2]!^2.2^k     p=3
@end example
@end ifnottex

where @math{p} ranges over odd prime numbers. The exponent @math{k} is chosen to
obtain an odd integer number: @math{k} is the number of 1 bits in the binary
representation of @m{\lfloor n/2\rfloor, [n/2]}. The function L@math{(p,n)}
can be defined as zero when @math{p} is composite, and, for any prime
@math{p}, it is computed with:
@tex
$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2
\leq\log_p(n)$$
@end tex
@ifnottex

@example
          ---
           \     n
L(p,n) =   /  [---] mod 2 <= log (n) .
          ---  p^i              p
          i>0
@end example
@end ifnottex

With this helper function, we are able to compute the odd part of @math{n!}
using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm
msf}(n)\cdot2^k , n!=[n/2]!^2*msf(n)*2^k}. The recursion stops using the
small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}.

Both the above algorithms use binary splitting to compute the product of many
small factors. At first as many products as possible are accumulated in a
single register, generating a list of factors that fit in a machine word. This
list is then split into halves, and the product is computed recursively.
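
The small-@math{n} scheme and the splitting of the factor list can be
sketched as follows. This is a toy limited to results that fit in 64 bits
(@math{n} up to 20), and not the GMP code: the names @code{product} and
@code{factorial_u64} are invented, and each odd factor is stored separately
instead of first being packed into words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: product of f[0..n-1] by splitting the list in halves. */
static uint64_t product(const uint64_t *f, size_t n)
{
    if (n == 0)
        return 1;
    if (n == 1)
        return f[0];
    return product(f, n / 2) * product(f + n / 2, n - n / 2);
}

/* n! as (odd part) << (power of 2), for n <= 20. */
static uint64_t factorial_u64(unsigned n)
{
    assert(n <= 20);
    uint64_t factors[64];
    size_t count = 0;
    unsigned twos = 0;

    /* Collect the odd numbers up to n, then up to [n/2], [n/4], ...
       all in one list, with a loop and no recursion.  */
    for (unsigned m = n; m > 1; m >>= 1) {
        for (unsigned j = 3; j <= m; j += 2)
            factors[count++] = j;
        twos += m >> 1;          /* [n/2] + [n/4] + ... factors of 2 */
    }
    return product(factors, count) << twos;
}
```

For @math{n=20} the list holds 15 odd factors and the final shift is 18 bits.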
9368 9369 Such splitting is more efficient than repeated N@cross{}1 multiplies since it 9370 forms big multiplies, allowing Karatsuba and higher algorithms to be used. 9371 And even below the Karatsuba threshold a big block of work can be more 9372 efficient for the basecase algorithm. 9373 9374 9375 @node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms 9376 @subsection Binomial Coefficients 9377 @cindex Binomial coefficient algorithm 9378 9379 Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated 9380 by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) = 9381 \left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then 9382 evaluating the following product simply from @math{i=2} to @math{i=k}. 9383 @tex 9384 $$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$ 9385 @end tex 9386 @ifnottex 9387 9388 @example 9389 k (n-k+i) 9390 C(n,k) = (n-k+1) * prod ------- 9391 i=2 i 9392 @end example 9393 9394 @end ifnottex 9395 It's easy to show that each denominator @math{i} will divide the product so 9396 far, so the exact division algorithm is used (@pxref{Exact Division}). 9397 9398 The numerators @math{n-k+i} and denominators @math{i} are first accumulated 9399 into as many fit a limb, to save multi-precision operations, though for 9400 @code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an 9401 @code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all. 9402 9403 9404 @node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms 9405 @subsection Fibonacci Numbers 9406 @cindex Fibonacci number algorithm 9407 9408 The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed 9409 for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]} 9410 values efficiently. 
9411 9412 For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is 9413 used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb 9414 up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}. 9415 9416 Beyond the table, values are generated with a binary powering algorithm, 9417 calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to 9418 low across the bits of @math{n}. The formulas used are 9419 @tex 9420 $$\eqalign{ 9421 F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr 9422 F_{2k-1} &= F_k^2 + F_{k-1}^2 \cr 9423 F_{2k} &= F_{2k+1} - F_{2k-1} 9424 }$$ 9425 @end tex 9426 @ifnottex 9427 9428 @example 9429 F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k 9430 F[2k-1] = F[k]^2 + F[k-1]^2 9431 9432 F[2k] = F[2k+1] - F[2k-1] 9433 @end example 9434 9435 @end ifnottex 9436 At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit 9437 of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if 9438 it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process 9439 repeated until all bits of @math{n} are incorporated. Notice these formulas 9440 require just two squares per bit of @math{n}. 9441 9442 It'd be possible to handle the first few @math{n} above the single limb table 9443 with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} = 9444 F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually 9445 turns out to be faster for only about 10 or 20 values of @math{n}, and 9446 including a block of code for just those doesn't seem worthwhile. If they 9447 really mattered it'd be better to extend the data table. 9448 9449 Using a table avoids lots of calculations on small numbers, and makes small 9450 @math{n} go fast. A bigger table would make more small @math{n} go fast, it's 9451 just a question of balancing size against desired speed. 
For GMP the code is 9452 kept compact, with the emphasis primarily on a good powering algorithm. 9453 9454 @code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but 9455 @code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last 9456 step of the algorithm can become one multiply instead of two squares. One of 9457 the following two formulas is used, according as @math{n} is odd or even. 9458 @tex 9459 $$\eqalign{ 9460 F_{2k} &= F_k (F_k + 2F_{k-1}) \cr 9461 F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k 9462 }$$ 9463 @end tex 9464 @ifnottex 9465 9466 @example 9467 F[2k] = F[k]*(F[k]+2F[k-1]) 9468 9469 F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k 9470 @end example 9471 9472 @end ifnottex 9473 @m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a 9474 multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above 9475 can be applied just to the low limb of the calculation, without a carry or 9476 borrow into further limbs, which saves some code size. See comments with 9477 @code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done. 9478 9479 9480 @node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms 9481 @subsection Lucas Numbers 9482 @cindex Lucas number algorithm 9483 9484 @code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci 9485 numbers with the following simple formulas. 9486 @tex 9487 $$\eqalign{ 9488 L_k &= F_k + 2F_{k-1} \cr 9489 L_{k-1} &= 2F_k - F_{k-1} 9490 }$$ 9491 @end tex 9492 @ifnottex 9493 9494 @example 9495 L[k] = F[k] + 2*F[k-1] 9496 L[k-1] = 2*F[k] - F[k-1] 9497 @end example 9498 9499 @end ifnottex 9500 @code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be 9501 saved. Trailing zero bits on @math{n} can be handled with a single square 9502 each. 
9503 @tex 9504 $$ L_{2k} = L_k^2 - 2(-1)^k $$ 9505 @end tex 9506 @ifnottex 9507 9508 @example 9509 L[2k] = L[k]^2 - 2*(-1)^k 9510 @end example 9511 9512 @end ifnottex 9513 And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci 9514 numbers, similar to what @code{mpz_fib_ui} does. 9515 @tex 9516 $$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$ 9517 @end tex 9518 @ifnottex 9519 9520 @example 9521 L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k 9522 @end example 9523 9524 @end ifnottex 9525 9526 9527 @node Random Number Algorithms, , Lucas Numbers Algorithm, Other Algorithms 9528 @subsection Random Numbers 9529 @cindex Random number algorithms 9530 9531 For the @code{urandomb} functions, random numbers are generated simply by 9532 concatenating bits produced by the generator. As long as the generator has 9533 good randomness properties this will produce well-distributed @math{N} bit 9534 numbers. 9535 9536 For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N} 9537 are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil, 9538 ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally 9539 require only one or two attempts, but the attempts are limited in case the 9540 generator is somehow degenerate and produces only 1 bits or similar. 9541 9542 @cindex Mersenne twister algorithm 9543 The Mersenne Twister generator is by Matsumoto and Nishimura 9544 (@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1}, 9545 which is a Mersenne prime, hence the name of the generator. The state is 624 9546 words of 32-bits each, which is iterated with one XOR and shift for each 9547 32-bit word generated, making the algorithm very fast. Randomness properties 9548 are also very good and this is the default algorithm used by GMP. 
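The rejection loop described for the @code{urandomm} functions can be sketched in C on single 64-bit words. This is only an illustration under stated assumptions: @code{sketch_next_bits} is a hypothetical stand-in generator (a simple xorshift step, not the Mersenne Twister), @code{SKETCH_MAX_ATTEMPTS} is an arbitrary limit chosen for the example, and reducing the last candidate by @math{N} when the limit is hit is just one plausible fallback.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_ATTEMPTS 80   /* arbitrary limit for this sketch */

/* Hypothetical stand-in generator state (must be non-zero).  */
static uint64_t sketch_state = UINT64_C (0x9E3779B97F4A7C15);

/* Return nbits pseudo-random bits, 1 <= nbits <= 64, using one
   xorshift step.  This is not the generator GMP uses.  */
static uint64_t
sketch_next_bits (unsigned nbits)
{
  sketch_state ^= sketch_state << 13;
  sketch_state ^= sketch_state >> 7;
  sketch_state ^= sketch_state << 17;
  if (nbits == 64)
    return sketch_state;
  return sketch_state & ((UINT64_C (1) << nbits) - 1);
}

/* Return R with 0 <= R < n, by drawing ceil(log2(n)) bits until a
   candidate below n turns up.  */
uint64_t
sketch_urandomm (uint64_t n)
{
  assert (n > 0);

  /* ceil(log2(n)) is the number of bits needed to hold n-1.  */
  unsigned nbits = 0;
  for (uint64_t t = n - 1; t != 0; t >>= 1)
    nbits++;
  if (nbits == 0)
    return 0;                    /* n == 1, only 0 is possible */

  uint64_t r;
  int attempts = SKETCH_MAX_ATTEMPTS;
  do
    r = sketch_next_bits (nbits);
  while (r >= n && --attempts > 0);

  /* Degenerate generator: r < 2^nbits < 2*n, so one subtraction
     brings the candidate into range (at the cost of some bias).  */
  if (r >= n)
    r -= n;
  return r;
}
```

Since @math{N} is more than half of the @math{2^b} candidate values, where @math{b} is the bit count used, each attempt succeeds with probability above @math{1/2}, which is why one or two attempts normally suffice.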
9549 9550 @cindex Linear congruential algorithm 9551 Linear congruential generators are described in many text books, for instance 9552 Knuth volume 2 (@pxref{References}). With a modulus @math{M} and parameters 9553 @math{A} and @math{C}, an integer state @math{S} is iterated by the formula 9554 @math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new 9555 state is a linear function of the previous, mod @math{M}, hence the name of 9556 the generator. 9557 9558 In GMP only moduli of the form @math{2^N} are supported, and the current 9559 implementation is not as well optimized as it could be. Overheads are 9560 significant when @math{N} is small, and when @math{N} is large clearly the 9561 multiply at each step will become slow. This is not a big concern, since the 9562 Mersenne Twister generator is better in every respect and is therefore 9563 recommended for all normal applications. 9564 9565 For both generators the current state can be deduced by observing enough 9566 output and applying some linear algebra (over GF(2) in the case of the 9567 Mersenne Twister). This generally means raw output is unsuitable for 9568 cryptographic applications without further hashing or the like. 9569 9570 9571 @node Assembly Coding, , Other Algorithms, Algorithms 9572 @section Assembly Coding 9573 @cindex Assembly coding 9574 9575 The assembly subroutines in GMP are the most significant source of speed at 9576 small to moderate sizes. At larger sizes algorithm selection becomes more 9577 important, but of course speedups in low level routines will still speed up 9578 everything proportionally. 9579 9580 Carry handling and widening multiplies that are important for GMP can't be 9581 easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in 9582 @file{longlong.h}, but hand coding low level routines invariably offers a 9583 speedup over generic C by a factor of anything from 2 to 10. 
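To give an idea of why a widening multiply is awkward in plain C, here is a sketch of a 64@cross{}64 to 128 bit multiply built from four 32@cross{}32 products. The name @code{sketch_umul_ppmm} is hypothetical and the code is not taken from @file{longlong.h}; it only shows the amount of shifting, masking and carry testing that a single machine instruction replaces on many CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set *hi,*lo to the 128-bit product u*v, using only 64-bit
   arithmetic.  (Hypothetical helper for illustration.)  */
void
sketch_umul_ppmm (uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
  assert (hi != NULL && lo != NULL);

  uint64_t ul = u & 0xFFFFFFFF, uh = u >> 32;
  uint64_t vl = v & 0xFFFFFFFF, vh = v >> 32;

  uint64_t x0 = ul * vl;        /* weight 2^0  */
  uint64_t x1 = ul * vh;        /* weight 2^32 */
  uint64_t x2 = uh * vl;        /* weight 2^32 */
  uint64_t x3 = uh * vh;        /* weight 2^64 */

  x1 += x0 >> 32;               /* this addition cannot overflow */
  x1 += x2;                     /* but this one can */
  if (x1 < x2)                  /* carry detected by comparison */
    x3 += UINT64_C (1) << 32;

  *hi = x3 + (x1 >> 32);
  *lo = (x1 << 32) | (x0 & 0xFFFFFFFF);
}
```

Note the carry out of the middle sum has to be recovered with a compare, the same idiom that comes up again for carry propagation between limbs.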
9584 9585 @menu 9586 * Assembly Code Organisation:: 9587 * Assembly Basics:: 9588 * Assembly Carry Propagation:: 9589 * Assembly Cache Handling:: 9590 * Assembly Functional Units:: 9591 * Assembly Floating Point:: 9592 * Assembly SIMD Instructions:: 9593 * Assembly Software Pipelining:: 9594 * Assembly Loop Unrolling:: 9595 * Assembly Writing Guide:: 9596 @end menu 9597 9598 9599 @node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding 9600 @subsection Code Organisation 9601 @cindex Assembly code organisation 9602 @cindex Code organisation 9603 9604 The various @file{mpn} subdirectories contain machine-dependent code, written 9605 in C or assembly. The @file{mpn/generic} subdirectory contains default code, 9606 used when there's no machine-specific version of a particular file. 9607 9608 Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and 9609 64-bit variants in a family cannot share code and have separate directories. 9610 Within a family further subdirectories may exist for CPU variants. 9611 9612 In each directory a @file{nails} subdirectory may exist, holding code with 9613 nails support for that CPU variant. A @code{NAILS_SUPPORT} directive in each 9614 file indicates the nails values the code handles. Nails code only exists 9615 where it's faster, or promises to be faster, than plain code. There's no 9616 effort put into nails if they're not going to enhance a given CPU. 9617 9618 9619 @node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding 9620 @subsection Assembly Basics 9621 9622 @code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines 9623 for overall GMP performance. All multiplications and divisions come down to 9624 repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n}, 9625 @code{mpn_lshift} and @code{mpn_rshift} are next most important. 
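For reference, the operation performed by @code{mpn_addmul_1} can be written in a few lines of C when a double-width type is available. The sketch below assumes 32-bit limbs held in @code{uint32_t} with products formed in @code{uint64_t}; the name @code{sketch_addmul_1} is made up, and the real routines are of course written quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add {up,n} times v to {rp,n}, and return the carry-out limb.
   Limbs are 32 bits, least significant first.  */
uint32_t
sketch_addmul_1 (uint32_t *rp, const uint32_t *up, size_t n, uint32_t v)
{
  assert (n == 0 || (rp != NULL && up != NULL));

  uint32_t carry = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so t cannot overflow.  */
      uint64_t t = (uint64_t) up[i] * v + rp[i] + carry;
      rp[i] = (uint32_t) t;            /* low half is the result limb */
      carry = (uint32_t) (t >> 32);    /* high half goes to next limb */
    }
  return carry;
}
```

The only dependency from one limb to the next is @code{carry}, which is what makes the scheduling of this loop the main concern in an assembly version.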
9626 9627 On some CPUs assembly versions of the internal functions 9628 @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups, 9629 mainly through avoiding function call overheads. They can also potentially 9630 make better use of a wide superscalar processor, as can bigger primitives like 9631 @code{mpn_addmul_2} or @code{mpn_addmul_4}. 9632 9633 The restrictions on overlaps between sources and destinations 9634 (@pxref{Low-level Functions}) are designed to facilitate a variety of 9635 implementations. For example, knowing @code{mpn_add_n} won't have partly 9636 overlapping sources and destination means reading can be done far ahead of 9637 writing on superscalar processors, and loops can be vectorized on a vector 9638 processor, depending on the carry handling. 9639 9640 9641 @node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding 9642 @subsection Carry Propagation 9643 @cindex Assembly carry propagation 9644 9645 The problem that presents most challenges in GMP is propagating carries from 9646 one limb to the next. In functions like @code{mpn_addmul_1} and 9647 @code{mpn_add_n}, carries are the only dependencies between limb operations. 9648 9649 On processors with carry flags, a straightforward CISC style @code{adc} is 9650 generally best. AMD K6 @code{mpn_addmul_1} however is an example of an 9651 unusual set of circumstances where a branch works out better. 9652 9653 On RISC processors generally an add and compare for overflow is used. This 9654 sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry 9655 propagation schemes require 4 instructions, meaning at least 4 cycles per 9656 limb, but other schemes may use just 1 or 2. On wide superscalar processors 9657 performance may be completely determined by the number of dependent 9658 instructions between carry-in and carry-out for each limb. 
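The add and compare idiom looks roughly as follows in C. This is a sketch in the spirit of the generic code, assuming 32-bit limbs and the hypothetical name @code{sketch_add_n}; it is not a copy of @file{mpn/generic/aors_n.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set {rp,n} to {up,n} + {vp,n} and return the carry out (0 or 1).
   An unsigned sum has wrapped exactly when it is smaller than one
   of its operands, which is how each carry is detected.  */
uint32_t
sketch_add_n (uint32_t *rp, const uint32_t *up, const uint32_t *vp,
              size_t n)
{
  assert (n == 0 || (rp != NULL && up != NULL && vp != NULL));

  uint32_t cy = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint32_t u = up[i];
      uint32_t s = u + vp[i];
      uint32_t c1 = s < u;      /* carry from adding the two limbs */
      uint32_t r = s + cy;
      uint32_t c2 = r < s;      /* carry from adding the carry-in */
      rp[i] = r;
      cy = c1 | c2;             /* at most one of them can be set */
    }
  return cy;
}
```

Between carry-in and carry-out there is an add, a compare and an OR, and it's this dependent chain that sets the speed limit on a wide superscalar processor.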
9659 9660 On vector processors good use can be made of the fact that a carry bit only 9661 very rarely propagates more than one limb. When adding a single bit to a 9662 limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on 9663 random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 9664 2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds 9665 all limbs in parallel, adds one set of carry bits in parallel and then only 9666 rarely needs to fall through to a loop propagating further carries. 9667 9668 On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code 9669 for the RISC style idioms that are necessary to handle carry bits in 9670 C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms 9671 would be better. And so unfortunately almost any loop involving carry bits 9672 needs to be coded in assembly for best results. 9673 9674 9675 @node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding 9676 @subsection Cache Handling 9677 @cindex Assembly cache handling 9678 9679 GMP aims to perform well both on operands that fit entirely in L1 cache and 9680 those which don't. 9681 9682 Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on 9683 large operands, so L2 and main memory performance is important for them. 9684 @code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and 9685 square basecases, so L1 performance matters most for them, unless assembly 9686 versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in 9687 which case the remaining uses are mostly for larger operands. 9688 9689 For L2 or main memory operands, memory access times will almost certainly be 9690 more than the calculation time. The aim therefore is to maximize memory 9691 throughput, by starting a load of the next cache line while processing the 9692 contents of the previous one. 
Clearly this is only possible if the chip has a 9693 lock-up free cache or some sort of prefetch instruction. Most current chips 9694 have both these features. 9695 9696 Prefetching sources combines well with loop unrolling, since a prefetch can be 9697 initiated once per unrolled loop (or more than once if the loop covers more 9698 than one cache line). 9699 9700 On CPUs without write-allocate caches, prefetching destinations will ensure 9701 individual stores don't go further down the cache hierarchy, limiting 9702 bandwidth. Of course for calculations which are slow anyway, like 9703 @code{mpn_divrem_1}, write-throughs might be fine. 9704 9705 The distance ahead to prefetch will be determined by memory latency versus 9706 throughput. The aim of course is to have data arriving continuously, at peak 9707 throughput. Some CPUs have limits on the number of fetches or prefetches in 9708 progress. 9709 9710 If a special prefetch instruction doesn't exist then a plain load can be used, 9711 but in that case care must be taken not to attempt to read past the end of an 9712 operand, since that might produce a segmentation violation. 9713 9714 Some CPUs or systems have hardware that detects sequential memory accesses and 9715 initiates suitable cache movements automatically, making life easy. 9716 9717 9718 @node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding 9719 @subsection Functional Units 9720 9721 When choosing an approach for an assembly loop, consideration is given to 9722 what operations can execute simultaneously and what throughput can thereby be 9723 achieved. In some cases an algorithm can be tweaked to accommodate available 9724 resources. 9725 9726 Loop control will generally require a counter and pointer updates, costing as 9727 much as 5 instructions, plus any delays a branch introduces. 
CPU addressing 9728 modes might reduce pointer updates, perhaps by allowing just one updating 9729 pointer and others expressed as offsets from it, or on CISC chips with all 9730 addressing done with the loop counter as a scaled index. 9731 9732 The final loop control cost can be amortised by processing several limbs in 9733 each iteration (@pxref{Assembly Loop Unrolling}). This at least ensures loop 9734 control isn't a big fraction of the work done. 9735 9736 Memory throughput is always a limit. If perhaps only one load or one store 9737 can be done per cycle then 3 cycles/limb will be the top speed for ``binary'' 9738 operations like @code{mpn_add_n} (two loads and one store per limb), and any code achieving that is optimal. 9739 9740 Integer resources can be freed up by having the loop counter in a float 9741 register, or by pressing the float units into use for some multiplying, 9742 perhaps doing every second limb on the float side (@pxref{Assembly Floating 9743 Point}). 9744 9745 Float resources can be freed up by doing carry propagation on the integer 9746 side, or even by doing integer to float conversions in integers using bit 9747 twiddling. 9748 9749 9750 @node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding 9751 @subsection Floating Point 9752 @cindex Assembly floating point 9753 9754 Floating point arithmetic is used in GMP for multiplications on CPUs with poor 9755 integer multipliers. It's mostly useful for @code{mpn_mul_1}, 9756 @code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and 9757 @code{mpn_mul_basecase} on both 32-bit and 64-bit machines. 9758 9759 With IEEE 53-bit double precision floats, integer multiplications producing up 9760 to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication 9761 into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient.
With 9762 some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be 9763 used, if one of the lower two 21-bit pieces also uses the sign bit. 9764 9765 For the @code{mpn_mul_1} family of functions on a 64-bit machine, the 9766 invariant single limb is split at the start, into 3 or 4 pieces. Inside the 9767 loop, the bignum operand is split into 32-bit pieces. Fast conversion of 9768 these unsigned 32-bit pieces to floating point is highly machine-dependent. 9769 In some cases, reading the data into the integer unit, zero-extending to 9770 64 bits, then transferring back to the floating point unit via memory is the 9771 only option. 9772 9773 Converting partial products back to 64-bit limbs is usually best done as a 9774 signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed 9775 and unsigned are the same, but most processors lack unsigned conversions. 9776 9777 @sp 2 9778 9779 Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or 9780 @code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split 9781 into four 16-bit parts. The multi-limb operand U is split in the loop into 9782 two 32-bit parts.
9783 9784 @tex 9785 \global\newdimen\GMPbits \global\GMPbits=0.18em 9786 \def\GMPbox#1#2#3{% 9787 \hbox{% 9788 \hbox to 128\GMPbits{\hfil 9789 \vbox{% 9790 \hrule 9791 \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% 9792 \hrule}% 9793 \hskip #1\GMPbits}% 9794 \raise \GMPboxdepth \hbox{\hskip 2em #3}}} 9795 % 9796 \GMPdisplay{% 9797 \vbox{% 9798 \hbox{% 9799 \hbox to 128\GMPbits {\hfil 9800 \vbox{% 9801 \hrule 9802 \hbox to 64\GMPbits{% 9803 \GMPvrule \hfil$v48$\hfil 9804 \vrule \hfil$v32$\hfil 9805 \vrule \hfil$v16$\hfil 9806 \vrule \hfil$v00$\hfil 9807 \vrule} 9808 \hrule}}% 9809 \raise \GMPboxdepth \hbox{\hskip 2em V Operand}} 9810 \vskip 0.5ex 9811 \hbox{% 9812 \hbox to 128\GMPbits {\hfil 9813 \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}% 9814 \vbox{% 9815 \hrule 9816 \hbox to 64\GMPbits {% 9817 \GMPvrule \hfil$u32$\hfil 9818 \vrule \hfil$u00$\hfil 9819 \vrule}% 9820 \hrule}}% 9821 \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}% 9822 \vskip 0.5ex 9823 \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}% 9824 \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}% 9825 \vskip 0.5ex 9826 \GMPbox{16}{u00 \times v16}{$p16$} 9827 \vskip 0.5ex 9828 \GMPbox{32}{u00 \times v32}{$p32$} 9829 \vskip 0.5ex 9830 \GMPbox{48}{u00 \times v48}{$p48$} 9831 \vskip 0.5ex 9832 \GMPbox{32}{u32 \times v00}{$r32$} 9833 \vskip 0.5ex 9834 \GMPbox{48}{u32 \times v16}{$r48$} 9835 \vskip 0.5ex 9836 \GMPbox{64}{u32 \times v32}{$r64$} 9837 \vskip 0.5ex 9838 \GMPbox{80}{u32 \times v48}{$r80$} 9839 }} 9840 @end tex 9841 @ifnottex 9842 @example 9843 @group 9844 +---+---+---+---+ 9845 |v48|v32|v16|v00| V operand 9846 +---+---+---+---+ 9847 9848 +-------+---+---+ 9849 x | u32 | u00 | U operand (one limb) 9850 +---------------+ 9851 9852 --------------------------------- 9853 9854 +-----------+ 9855 | u00 x v00 | p00 48-bit products 9856 +-----------+ 9857 +-----------+ 9858 | u00 x v16 | p16 9859 +-----------+ 9860 +-----------+ 9861 | u00 x v32 | p32 9862 +-----------+ 
9863 +-----------+ 9864 | u00 x v48 | p48 9865 +-----------+ 9866 +-----------+ 9867 | u32 x v00 | r32 9868 +-----------+ 9869 +-----------+ 9870 | u32 x v16 | r48 9871 +-----------+ 9872 +-----------+ 9873 | u32 x v32 | r64 9874 +-----------+ 9875 +-----------+ 9876 | u32 x v48 | r80 9877 +-----------+ 9878 @end group 9879 @end example 9880 @end ifnottex 9881 9882 @math{p32} and @math{r32} can be summed using floating-point addition, and 9883 likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed 9884 with @math{r64} and @math{r80} from the previous iteration. 9885 9886 For each loop then, four 49-bit quantities are transferred to the integer unit, 9887 aligned as follows, 9888 9889 @tex 9890 % GMPbox here should be 49 bits wide, but use 51 to better show p16+r80' 9891 % crossing into the upper 64 bits. 9892 \def\GMPbox#1#2#3{% 9893 \hbox{% 9894 \hbox to 128\GMPbits {% 9895 \hfil 9896 \vbox{% 9897 \hrule 9898 \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% 9899 \hrule}% 9900 \hskip #1\GMPbits}% 9901 \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}% 9902 }} 9903 \newbox\b \setbox\b\hbox{64 bits}% 9904 \newdimen\bw \bw=\wd\b \advance\bw by 2em 9905 \newdimen\x \x=128\GMPbits 9906 \advance\x by -2\bw 9907 \divide\x by4 9908 \GMPdisplay{% 9909 \vbox{% 9910 \hbox to 128\GMPbits {% 9911 \GMPvrule 9912 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9913 \hfil 64 bits\hfil 9914 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9915 \vrule 9916 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9917 \hfil 64 bits\hfil 9918 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9919 \vrule}% 9920 \vskip 0.7ex 9921 \GMPbox{0}{p00+r64'}{i00} 9922 \vskip 0.5ex 9923 \GMPbox{16}{p16+r80'}{i16} 9924 \vskip 0.5ex 9925 \GMPbox{32}{p32+r32}{i32} 9926 \vskip 0.5ex 9927 \GMPbox{48}{p48+r48}{i48} 9928 }} 9929 @end tex 9930 @ifnottex 9931 @example 9932 @group 9933 |-----64bits----|-----64bits----| 9934 +------------+ 9935 | p00 + r64' | i00 9936 +------------+ 9937 +------------+ 9938 | 
p16 + r80' | i16 9939 +------------+ 9940 +------------+ 9941 | p32 + r32 | i32 9942 +------------+ 9943 +------------+ 9944 | p48 + r48 | i48 9945 +------------+ 9946 @end group 9947 @end example 9948 @end ifnottex 9949 9950 The challenge then is to sum these efficiently and add in a carry limb, 9951 generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48} 9952 extends 33 bits into the high half). 9953 9954 9955 @node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding 9956 @subsection SIMD Instructions 9957 @cindex Assembly SIMD 9958 9959 The single-instruction multiple-data support in current microprocessors is 9960 aimed at signal processing algorithms where each data point can be treated 9961 more or less independently. There's generally not much support for 9962 propagating the sort of carries that arise in GMP. 9963 9964 SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much 9965 work as one 32@cross{}32 from GMP's point of view, and need some shifts and 9966 adds besides. But of course if say the SIMD form is fully pipelined and uses 9967 less instruction decoding then it may still be worthwhile. 9968 9969 On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and 9970 @code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the 9971 P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1}, 9972 @code{mpn_addmul_1}, and @code{mpn_submul_1}. 9973 9974 9975 @node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding 9976 @subsection Software Pipelining 9977 @cindex Assembly software pipelining 9978 9979 Software pipelining consists of scheduling instructions around the branch 9980 point in a loop. For example a loop might issue a load not for use in the 9981 present iteration but the next, thereby allowing extra cycles for the data to 9982 arrive from memory. 
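The idea of loading for the next iteration can be shown in C, though only as an illustration since a C compiler is free to reschedule the loads itself. In this sketch, with the hypothetical name @code{sketch_pipelined_scale}, the loop body works on a value loaded one iteration earlier, with a feed-in load before the loop and a wind-down step after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set dst[i] to the low 32 bits of src[i]*m, for i = 0 .. n-1.
   Each iteration issues the load for the following iteration before
   doing the multiply on the value loaded last time.  */
void
sketch_pipelined_scale (uint32_t *dst, const uint32_t *src, size_t n,
                        uint32_t m)
{
  assert (n == 0 || (dst != NULL && src != NULL));
  if (n == 0)
    return;

  uint32_t cur = src[0];               /* feed-in: first load */
  for (size_t i = 0; i + 1 < n; i++)
    {
      uint32_t next = src[i + 1];      /* load for the next iteration */
      dst[i] = cur * m;                /* work on the previous load */
      cur = next;
    }
  dst[n - 1] = cur * m;                /* wind-down: no further load */
}
```

The wind-down step is there so the loop never reads past the end of the source operand.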
9983 9984 Naturally this is wanted only when doing things like loads or multiplies that 9985 take several cycles to complete, and only where a CPU has multiple functional 9986 units so that other work can be done in the meantime. 9987 9988 A pipeline with several stages will have a data value in progress at each 9989 stage and each loop iteration moves them along one stage. This is like 9990 juggling. 9991 9992 If the latency of some instruction is greater than the loop time then it will 9993 be necessary to unroll, so one register has a result ready to use while 9994 another (or multiple others) are still in progress. (@pxref{Assembly Loop 9995 Unrolling}). 9996 9997 9998 @node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding 9999 @subsection Loop Unrolling 10000 @cindex Assembly loop unrolling 10001 10002 Loop unrolling consists of replicating code so that several limbs are 10003 processed in each loop. At a minimum this reduces loop overheads by a 10004 corresponding factor, but it can also allow better register usage, for example 10005 alternately using one register combination and then another. Judicious use of 10006 @command{m4} macros can help avoid lots of duplication in the source code. 10007 10008 Any amount of unrolling can be handled with a loop counter that's decremented 10009 by @math{N} each time, stopping when the remaining count is less than the 10010 further @math{N} the loop will process. Or by subtracting @math{N} at the 10011 start, the termination condition becomes when the counter @math{C} is less 10012 than 0 (and the count of remaining limbs is @math{C+N}). 10013 10014 Alternately for a power of 2 unroll the loop count and remainder can be 10015 established with a shift and mask. This is convenient if also making a 10016 computed jump into the middle of a large loop. 
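The counter scheme with @math{N} subtracted at the start can be sketched in C as below, for @math{N=4}. The function @code{sketch_sum_unrolled} and the macro @code{SKETCH_UNROLL} are names invented for the example, and a plain summation stands in for real limb processing. The excess limbs are handled here with a simple loop at the end, which is just one of several possibilities.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_UNROLL 4   /* the unrolling factor N */

/* Return the sum (mod 2^32) of the n words at p, processing 4 words
   per iteration of the main loop.  */
uint32_t
sketch_sum_unrolled (const uint32_t *p, size_t n)
{
  uint32_t sum = 0;

  /* Subtract N at the start, so the main loop runs while c >= 0.  */
  ptrdiff_t c = (ptrdiff_t) n - SKETCH_UNROLL;
  while (c >= 0)
    {
      sum += p[0];
      sum += p[1];
      sum += p[2];
      sum += p[3];
      p += SKETCH_UNROLL;
      c -= SKETCH_UNROLL;
    }

  /* The count of remaining words is c+N, somewhere in 0 .. N-1.  */
  c += SKETCH_UNROLL;
  assert (c >= 0 && c < SKETCH_UNROLL);
  while (c-- > 0)
    sum += *p++;

  return sum;
}
```

With the counter biased this way the loop test is a plain sign check, which on many CPUs comes for free from the subtraction.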
10017 10018 The limbs not a multiple of the unrolling can be handled in various ways, for 10019 example 10020 10021 @itemize @bullet 10022 @item 10023 A simple loop at the end (or the start) to process the excess. Care will be 10024 wanted that it isn't too much slower than the unrolled part. 10025 10026 @item 10027 A set of binary tests, for example after an 8-limb unrolling, test for 4 more 10028 limbs to process, then a further 2 more or not, and finally 1 more or not. 10029 This will probably take more code space than a simple loop. 10030 10031 @item 10032 A @code{switch} statement, providing separate code for each possible excess, 10033 for example an 8-limb unrolling would have separate code for 0 remaining, 1 10034 remaining, etc, up to 7 remaining. This might take a lot of code, but may be 10035 the best way to optimize all cases in combination with a deep pipelined loop. 10036 10037 @item 10038 A computed jump into the middle of the loop, thus making the first iteration 10039 handle the excess. This should make times smoothly increase with size, which 10040 is attractive, but setups for the jump and adjustments for pointers can be 10041 tricky and could become quite difficult in combination with deep pipelining. 10042 @end itemize 10043 10044 10045 @node Assembly Writing Guide, , Assembly Loop Unrolling, Assembly Coding 10046 @subsection Writing Guide 10047 @cindex Assembly writing guide 10048 10049 This is a guide to writing software pipelined loops for processing limb 10050 vectors in assembly. 10051 10052 First determine the algorithm and which instructions are needed. Code it 10053 without unrolling or scheduling, to make sure it works. On a 3-operand CPU 10054 try to write each new value to a new register, this will greatly simplify later 10055 steps. 10056 10057 Then note for each instruction the functional unit and/or issue port 10058 requirements. If an instruction can use either of two units, like U0 or U1 10059 then make a category ``U0/U1''. 
Count the total using each unit (or combined 10060 unit), and count all instructions. 10061 10062 Figure out from those counts the best possible loop time. The goal will be to 10063 find a perfect schedule where instruction latencies are completely hidden. 10064 The total instruction count might be the limiting factor, or perhaps a 10065 particular functional unit. It might be possible to tweak the instructions to 10066 help the limiting factor. 10067 10068 Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the 10069 final loop branch at the end of the last. Now fill the buckets with dummy 10070 instructions using the functional units desired. Run this to make sure the 10071 intended speed is reached. 10072 10073 Now replace the dummy instructions with the real instructions from the slow 10074 but correct loop you started with. The first will typically be a load 10075 instruction. Then the instruction using that value is placed in a bucket an 10076 appropriate distance down. Run the loop again, to check it still runs at 10077 target speed. 10078 10079 Keep placing instructions, frequently measuring the loop. After a few you 10080 will need to wrap around from the last bucket back to the top of the loop. If 10081 you used the new-register for new-value strategy above then there will be no 10082 register conflicts. If not then take care not to clobber something already in 10083 use. Changing registers at this time is very error prone. 10084 10085 The loop will overlap two or more of the original loop iterations, and the 10086 computation of one vector element result will be started in one iteration of 10087 the new loop, and completed one or several iterations later. 10088 10089 The final step is to create feed-in and wind-down code for the loop. 
A good 10090 way to do this is to make a copy (or copies) of the loop at the start and 10091 delete those instructions which don't have valid antecedents, and at the end 10092 replicate and delete those whose results are unwanted (including any further 10093 loads). 10094 10095 The loop will have a minimum number of limbs loaded and processed, so the 10096 feed-in code must test if the request size is smaller and skip either to a 10097 suitable part of the wind-down or to special code for small sizes. 10098 10099 10100 @node Internals, Contributors, Algorithms, Top 10101 @chapter Internals 10102 @cindex Internals 10103 10104 @strong{This chapter is provided only for informational purposes and the 10105 various internals described here may change in future GMP releases. 10106 Applications expecting to be compatible with future releases should use only 10107 the documented interfaces described in previous chapters.} 10108 10109 @menu 10110 * Integer Internals:: 10111 * Rational Internals:: 10112 * Float Internals:: 10113 * Raw Output Internals:: 10114 * C++ Interface Internals:: 10115 @end menu 10116 10117 @node Integer Internals, Rational Internals, Internals, Internals 10118 @section Integer Internals 10119 @cindex Integer internals 10120 10121 @code{mpz_t} variables represent integers using sign and magnitude, in space 10122 dynamically allocated and reallocated. The fields are as follows. 10123 10124 @table @asis 10125 @item @code{_mp_size} 10126 The number of limbs, or the negative of that when representing a negative 10127 integer. Zero is represented by @code{_mp_size} set to zero, in which case 10128 the @code{_mp_d} data is unused. 10129 10130 @item @code{_mp_d} 10131 A pointer to an array of limbs which is the magnitude. These are stored 10132 ``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the 10133 least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most 10134 significant. 
Whenever @code{_mp_size} is non-zero, the most significant limb
is non-zero.

Currently there's always at least one limb allocated, so for instance
@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
@code{_mp_d[0]} unconditionally (though its value is then only wanted if
@code{_mp_size} is non-zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.
@end table

The various bitwise logical functions like @code{mpz_and} behave as if
negative values were twos complement.  But sign and magnitude is always used
internally, and necessary adjustments are made during the calculations.
Sometimes this isn't pretty, but sign and magnitude are best for other
routines.

Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions.  Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).

@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
is usually a @code{long}.  This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
providing plenty of range.
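
The field conventions above can be illustrated with a small stand-alone model.
This is only a sketch under stated assumptions, not GMP's actual code: the
names @code{toy_mpz}, @code{toy_limb}, @code{toy_init}, @code{toy_realloc},
@code{toy_set_si64} and @code{toy_clear} are hypothetical, and a fixed 32-bit
limb is assumed purely for illustration.  It shows sign and magnitude storage,
the least significant limb first ordering, the always allocated first limb,
and a grow-on-demand check in the spirit of @code{MPZ_REALLOC}.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 32-bit limb; real GMP uses mp_limb_t. */
typedef uint32_t toy_limb;

/* Toy model of the three mpz_t fields; not GMP's actual struct. */
typedef struct {
    int alloc;    /* limbs allocated at d */
    int size;     /* limbs in use, negative for a negative value, 0 for zero */
    toy_limb *d;  /* magnitude, d[0] is the least significant limb */
} toy_mpz;

void toy_init(toy_mpz *z)
{
    z->d = malloc(sizeof(toy_limb));  /* always at least one limb */
    if (z->d == NULL)
        abort();
    z->d[0] = 0;
    z->alloc = 1;
    z->size = 0;
}

/* Grow the limb array only when needed, in the spirit of MPZ_REALLOC. */
void toy_realloc(toy_mpz *z, int need)
{
    if (need > z->alloc) {
        toy_limb *p = realloc(z->d, (size_t) need * sizeof(toy_limb));
        if (p == NULL)
            abort();
        z->d = p;
        z->alloc = need;
    }
}

/* Store v as sign and magnitude, least significant limb first. */
void toy_set_si64(toy_mpz *z, int64_t v)
{
    /* Negate in unsigned arithmetic so INT64_MIN is handled too. */
    uint64_t mag = v < 0 ? (uint64_t) 0 - (uint64_t) v : (uint64_t) v;
    int n = 0;

    toy_realloc(z, 2);  /* a 64-bit magnitude needs at most two limbs */
    while (mag != 0) {
        z->d[n++] = (toy_limb) (mag & 0xFFFFFFFFu);
        mag >>= 32;
    }
    /* The last limb stored is non-zero, so the high limb is non-zero. */
    z->size = v < 0 ? -n : n;
    assert(z->alloc >= abs(z->size));
}

void toy_clear(toy_mpz *z)
{
    free(z->d);
    z->d = NULL;
    z->alloc = 0;
    z->size = 0;
}
```

In this model, storing @math{-5} leaves @code{size} at @math{-1} with
@code{d[0]} equal to 5, and storing zero leaves @code{size} at 0 with the limb
data unused, matching the conventions described for @code{_mp_size} and
@code{_mp_d}.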
10165 10166 10167 @node Rational Internals, Float Internals, Integer Internals, Internals 10168 @section Rational Internals 10169 @cindex Rational internals 10170 10171 @code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and 10172 denominator (@pxref{Integer Internals}). 10173 10174 The canonical form adopted is denominator positive (and non-zero), no common 10175 factors between numerator and denominator, and zero uniquely represented as 10176 0/1. 10177 10178 It's believed that casting out common factors at each stage of a calculation 10179 is best in general. A GCD is an @math{O(N^2)} operation so it's better to do 10180 a few small ones immediately than to delay and have to do a big one later. 10181 Knowing the numerator and denominator have no common factors can be used for 10182 example in @code{mpq_mul} to make only two cross GCDs necessary, not four. 10183 10184 This general approach to common factors is badly sub-optimal in the presence 10185 of simple factorizations or little prospect for cancellation, but GMP has no 10186 way to know when this will occur. As per @ref{Efficiency}, that's left to 10187 applications. The @code{mpq_t} framework might still suit, with 10188 @code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and 10189 denominator, or of course @code{mpz_t} variables can be used directly. 10190 10191 10192 @node Float Internals, Raw Output Internals, Rational Internals, Internals 10193 @section Float Internals 10194 @cindex Float internals 10195 10196 Efficient calculation is the primary aim of GMP floats and the use of whole 10197 limbs and simple rounding facilitates this. 10198 10199 @code{mpf_t} floats have a variable precision mantissa and a single machine 10200 word signed exponent. The mantissa is represented using sign and magnitude. 10201 10202 @c FIXME: The arrow heads don't join to the lines exactly. 
@tex
\global\newdimen\GMPboxwidth \GMPboxwidth=5em
\global\newdimen\GMPboxheight \GMPboxheight=3ex
\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
\GMPdisplay{%
\vbox{%
  \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
  \vskip 0.7ex
  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
  \hbox {
    \hbox to 3\GMPboxwidth {%
      \setbox 0 = \hbox{@code{\_mp\_exp}}%
      \dimen0=3\GMPboxwidth
      \advance\dimen0 by -\wd0
      \divide\dimen0 by 2
      \advance\dimen0 by -1em
      \setbox1 = \hbox{$\rightarrow$}%
      \dimen1=\dimen0
      \advance\dimen1 by -\wd1
      \GMPcentreline{\dimen0}%
      \hfil
      \box0%
      \hfil
      \GMPcentreline{\dimen1{}}%
      \box1}
    \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
  \vskip 0.5ex
  \vbox {%
    \hrule
    \hbox{%
      \vrule height 2ex depth 1ex
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule}
    \hrule
  }
  \hbox {%
    \hbox to 0.8 pt {}
    \hbox to 3\GMPboxwidth {%
      \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
  \hbox to 5\GMPboxwidth{%
    \setbox 0 = \hbox{@code{\_mp\_size}}%
    \dimen0 = 5\GMPboxwidth
    \advance\dimen0 by -\wd0
    \divide\dimen0 by 2
    \advance\dimen0 by -1em
    \dimen1 = \dimen0
    \setbox1 = \hbox{$\leftarrow$}%
    \setbox2 = \hbox{$\rightarrow$}%
    \advance\dimen0 by -\wd1
    \advance\dimen1 by -\wd2
    \hbox to 0.3 em {}%
    \box1
    \GMPcentreline{\dimen0}%
    \hfil
    \box0
    \hfil
    \GMPcentreline{\dimen1}%
    \box2}
}}
@end tex
@ifnottex
@example
   most                   least
significant            significant
   limb                   limb

                            _mp_d
 |---- _mp_exp --->           |
  _____ _____ _____ _____ _____
 |_____|_____|_____|_____|_____|
                   . <------------ radix point

  <-------- _mp_size --------->
@sp 1
@end example
@end ifnottex

@noindent
The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs currently in use, or the negative of that when
representing a negative value.  Zero is represented by @code{_mp_size} and
@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
unused.  (In the future @code{_mp_exp} might be undefined when representing
zero.)

@item @code{_mp_prec}
The precision of the mantissa, in limbs.  In any calculation the aim is to
produce @code{_mp_prec} limbs of result (the most significant being non-zero).

@item @code{_mp_d}
A pointer to the array of limbs which is the absolute value of the mantissa.
These are stored ``little endian'' as per the @code{mpn} functions, so
@code{_mp_d[0]} is the least significant limb and
@code{_mp_d[ABS(_mp_size)-1]} the most significant.

The most significant limb is always non-zero, but there are no other
restrictions on its value, in particular the highest 1 bit can be anywhere
within the limb.

@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
for convenience (see below).  There are no reallocations during a calculation,
only in a change of precision with @code{mpf_set_prec}.

@item @code{_mp_exp}
The exponent, in limbs, determining the location of the implied radix point.
Zero means the radix point is just above the most significant limb.  Positive
values mean a radix point offset towards the lower limbs and hence a value
@math{@ge{} 1}, as for example in the diagram above.  Negative exponents mean
a radix point further above the highest limb.

Naturally the exponent can be any value; it doesn't have to fall within the
limbs as the diagram shows, and can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table

The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
usually @code{long}.  This is done to make some fields just 32 bits on some
64-bit systems, thereby saving a few bytes of data space but still providing
plenty of precision and a very large range.


@sp 1
@noindent
The following points should be noted.

@table @asis
@item Low Zeros
The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
zeros can always be ignored.  Routines likely to produce low zeros check and
avoid them to save time in subsequent calculations, but for most routines
they're quite unlikely and aren't checked.

@item Mantissa Size Range
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
the value can be represented in fewer.  This means low precision values or
small integers stored in a high precision @code{mpf_t} can still be operated
on efficiently.

@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
@code{_mp_prec}.

@item Rounding
All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
with the high limb non-zero ensures that the minimum precision requested by
the application is obtained.
10364 10365 The use of simple ``trunc'' rounding towards zero is efficient, since there's 10366 no need to examine extra limbs and increment or decrement. 10367 10368 @item Bit Shifts 10369 Since the exponent is in limbs, there are no bit shifts in basic operations 10370 like @code{mpf_add} and @code{mpf_mul}. When differing exponents are 10371 encountered all that's needed is to adjust pointers to line up the relevant 10372 limbs. 10373 10374 Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts, 10375 but the choice is between an exponent in limbs which requires shifts there, or 10376 one in bits which requires them almost everywhere else. 10377 10378 @item Use of @code{_mp_prec+1} Limbs 10379 The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just 10380 @code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its 10381 operation. @code{mpf_add} for instance will do an @code{mpn_add} of 10382 @code{_mp_prec} limbs. If there's no carry then that's the result, but if 10383 there is a carry then it's stored in the extra limb of space and 10384 @code{_mp_size} becomes @code{_mp_prec+1}. 10385 10386 Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not 10387 needed for the intended precision, only the @code{_mp_prec} high limbs. But 10388 zeroing it out or moving the rest down is unnecessary. Subsequent routines 10389 reading the value will simply take the high limbs they need, and this will be 10390 @code{_mp_prec} if their target has that same precision. This is no more than 10391 a pointer adjustment, and must be checked anyway since the destination 10392 precision can be different from the sources. 10393 10394 Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs 10395 if available. This ensures that a variable which has @code{_mp_size} equal to 10396 @code{_mp_prec+1} will get its full exact value copied. 
Strictly speaking 10397 this is unnecessary since only @code{_mp_prec} limbs are needed for the 10398 application's requested precision, but it's considered that an @code{mpf_set} 10399 from one variable into another of the same precision ought to produce an exact 10400 copy. 10401 10402 @item Application Precisions 10403 @code{__GMPF_BITS_TO_PREC} converts an application requested precision to an 10404 @code{_mp_prec}. The value in bits is rounded up to a whole limb then an 10405 extra limb is added since the most significant limb of @code{_mp_d} is only 10406 non-zero and therefore might contain only one bit. 10407 10408 @code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra 10409 limb from @code{_mp_prec} before converting to bits. The net effect of 10410 reading back with @code{mpf_get_prec} is simply the precision rounded up to a 10411 multiple of @code{mp_bits_per_limb}. 10412 10413 Note that the extra limb added here for the high only being non-zero is in 10414 addition to the extra limb allocated to @code{_mp_d}. For example with a 10415 32-bit limb, an application request for 250 bits will be rounded up to 8 10416 limbs, then an extra added for the high being only non-zero, giving an 10417 @code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading 10418 back with @code{mpf_get_prec} will take @code{_mp_prec} subtract 1 limb and 10419 multiply by 32, giving 256 bits. 10420 10421 Strictly speaking, the fact the high limb has at least one bit means that a 10422 float with, say, 3 limbs of 32-bits each will be holding at least 65 bits, but 10423 for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice 10424 multiple of the limb size. 10425 @end table 10426 10427 10428 @node Raw Output Internals, C++ Interface Internals, Float Internals, Internals 10429 @section Raw Output Internals 10430 @cindex Raw output internals 10431 10432 @noindent 10433 @code{mpz_out_raw} uses the following format. 

@tex
\global\newdimen\GMPboxwidth \GMPboxwidth=5em
\global\newdimen\GMPboxheight \GMPboxheight=3ex
\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
\GMPdisplay{%
\vbox{%
  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
  \vbox {%
    \hrule
    \hbox{%
      \vrule height 2.5ex depth 1.5ex
      \hbox to \GMPboxwidth {\hfil size\hfil}%
      \vrule
      \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
      \vrule}
    \hrule}
}}
@end tex
@ifnottex
@example
+------+------------------------+
| size |       data bytes       |
+------+------------------------+
@end example
@end ifnottex

The size is 4 bytes written most significant byte first, being the number of
subsequent data bytes, or the twos complement negative of that when a negative
integer is represented.  The data bytes are the absolute value of the integer,
written most significant byte first.

The most significant data byte is always non-zero, so the output is the same
on all systems, irrespective of limb size.

In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
of the limb size.  @code{mpz_inp_raw} will still accept this, for
compatibility.

The use of ``big endian'' for both the size and data fields is deliberate, it
makes the data easy to read in a hex dump of a file.  Unfortunately it also
means that the limb data must be reversed when reading or writing, so neither
a big endian nor little endian system can just read and write @code{_mp_d}.


@node C++ Interface Internals,  , Raw Output Internals, Internals
@section C++ Interface Internals
@cindex C++ interface internals

A system of expression templates is used to ensure something like @code{a=b+c}
turns into a simple call to @code{mpz_add} etc.
For @code{mpf_class}
the scheme also ensures the precision of the final
destination is used for any temporaries within a statement like
@code{f=w*x+y*z}.  These are important features which a naive implementation
cannot provide.

A simplified description of the scheme follows.  The true scheme is
complicated by the fact that expressions have different return types.  For
detailed information, refer to the source code.

To perform an operation, say, addition, we first define a ``function object''
evaluating it,

@example
struct __gmp_binary_plus
@{
  static void eval(mpf_t f, const mpf_t g, const mpf_t h)
  @{
    mpf_add(f, g, h);
  @}
@};
@end example

@noindent
And an ``additive expression'' object,

@example
__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
operator+(const mpf_class &f, const mpf_class &g)
@{
  return __gmp_expr
    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
@}
@end example

The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
encapsulate any possible kind of expression into a single template type.  In
fact even @code{mpf_class} etc are @code{typedef} specializations of
@code{__gmp_expr}.

Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.

@example
template <class T>
mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
@{
  expr.eval(this->get_mpf_t(), this->precision());
  return *this;
@}

template <class Op>
void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
@}
@end example

where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).

This way, the expression is actually evaluated only at the time of assignment,
when the required precision (that of @code{f}) is known.  Furthermore the
target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
with @code{f} as the output argument.

Compound expressions are handled by defining operators taking subexpressions
as their arguments, like this:

@example
template <class T, class U>
__gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
@{
  return __gmp_expr
    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
    (expr1, expr2);
@}
@end example

And the corresponding specializations of @code{__gmp_expr::eval}:

@example
template <class T, class U, class Op>
void __gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  // declare two temporaries
  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
@}
@end example

The expression is thus recursively evaluated to any level
of complexity and
all subexpressions are evaluated to the precision of @code{f}.


@node Contributors, References, Internals, Top
@comment  node-name,  next,  previous,  up
@appendix Contributors
@cindex Contributors

Torbj@"orn Granlund wrote the original GMP library and is still the main
developer.  Code not explicitly attributed to others was contributed by
Torbj@"orn.  Several other individuals and organizations have contributed
to GMP.  Here is a list in chronological order of first contribution:

Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman helped with the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann wrote the REDC-based mpz_powm code, the Sch@"onhage-Strassen
FFT multiply code, and the Karatsuba square root code.  He also improved the
Toom3 code for GMP 4.2.  Paul sparked the development of GMP 2, with his
comparisons between bignum packages.  The ECMNET project Paul is organizing
was a driving force behind many of the optimizations in GMP 3.  Paul also
wrote the new GMP 4.3 nth root code (with Torbj@"orn).

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed now-defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.
10621 10622 Joachim Hollman was involved in the design of the @code{mpf} interface, and in 10623 the @code{mpz} design revisions for version 2. 10624 10625 Bennet Yee contributed the initial versions of @code{mpz_jacobi} and 10626 @code{mpz_legendre}. 10627 10628 Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and 10629 @file{mpn/m68k/rshift.S} (now in @file{.asm} form). 10630 10631 Robert Harley of Inria, France and David Seal of ARM, England, suggested clever 10632 improvements for population count. Robert also wrote highly optimized 10633 Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed 10634 the ARM assembly code. 10635 10636 Torsten Ekedahl of the Mathematical department of Stockholm University provided 10637 significant inspiration during several phases of the GMP development. His 10638 mathematical expertise helped improve several algorithms. 10639 10640 Linus Nordberg wrote the new configure system based on autoconf and 10641 implemented the new random functions. 10642 10643 Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm 10644 macros, parameter tuning, speed measuring, the configure system, function 10645 inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas 10646 number functions, printf and scanf functions, perl interface, demo expression 10647 parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and 10648 various miscellaneous improvements elsewhere. 10649 10650 Kent Boortz made the Mac OS 9 port. 10651 10652 Steve Root helped write the optimized alpha 21264 assembly code. 10653 10654 Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++ 10655 @code{istream} input routines. 10656 10657 Jason Moxham rewrote @code{mpz_fac_ui}. 10658 10659 Pedro Gimeno implemented the Mersenne Twister and made other random number 10660 improvements. 
10661 10662 Niels M@"oller wrote the sub-quadratic GCD, extended GCD and jacobi code, the 10663 quadratic Hensel division code, and (with Torbj@"orn) the new divide and 10664 conquer division code for GMP 4.3. Niels also helped implement the new Toom 10665 multiply code for GMP 4.3 and implemented helper functions to simplify Toom 10666 evaluations for GMP 5.0. He wrote the original version of mpn_mulmod_bnm1, and 10667 he is the main author of the mini-gmp package used for gmp bootstrapping. 10668 10669 Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy, 10670 and found the optimal strategies for evaluation and interpolation in Toom 10671 multiplication. 10672 10673 Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and 10674 implemented most of the new Toom multiply and squaring code for 5.0. 10675 He is the main author of the current mpn_mulmod_bnm1, mpn_mullo_n, and 10676 mpn_sqrlo. Marco also wrote the functions mpn_invert and mpn_invertappr, 10677 and improved the speed of integer root extraction. He is the author of 10678 the current combinatorial functions: binomial, factorial, multifactorial, 10679 primorial. 10680 10681 David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing 10682 division relevant to Toom multiplication. He also worked on fast assembly 10683 sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}. He wrote 10684 the internal middle product functions @code{mpn_mulmid_basecase}, 10685 @code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines. 10686 10687 Martin Boij wrote @code{mpn_perfect_power_p}. 

Marc Glisse improved @file{gmpxx.h}: use fewer temporaries (faster),
specializations of @code{numeric_limits} and @code{common_type}, C++11
features (move constructors, explicit bool conversion, UDL), make the
conversion from @code{mpq_class} to @code{mpz_class} explicit, optimize
operations where one argument is a small compile-time constant, replace
some heap allocations by stack allocations.  He also fixed the eofbit
handling of C++ streams, and removed one division from @file{mpq/aors.c}.

David S Miller wrote assembly code for SPARC T3 and T4.

Mark Sofroniou cleaned up the types of mul_fft.c, letting it work for huge
operands.

Ulrich Weigand ported GMP to the powerpc64le ABI.

(This list is chronological, not ordered by significance.  If you have
contributed to GMP but are not listed above, please tell
@email{gmp-devel@@gmplib.org} about the omission!)

The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).

The development of GMP 2, 3, and 4.0 was supported in part by the IDA Center
for Computing Sciences.

The development of GMP 4.3, 5.0, and 5.1 was supported in part by the Swedish
Foundation for Strategic Research.

Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
environment.

@node References, GNU Free Documentation License, Contributors, Top
@comment  node-name,  next,  previous,  up
@appendix References
@cindex References

@c  FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
@c  but being long words they upset paragraph formatting (the preceding line
@c  can get badly stretched).
Would like an conditional @* style line break 10729 @c if the uref is too long to fit on the last line of the paragraph, but it's 10730 @c not clear how to do that. For now explicit @texlinebreak{}s are used on 10731 @c paragraphs that come out bad. 10732 10733 @section Books 10734 10735 @itemize @bullet 10736 @item 10737 Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in 10738 Analytic Number Theory and Computational Complexity'', Wiley, 1998. 10739 10740 @item 10741 Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational 10742 Perspective'', 2nd edition, Springer-Verlag, 2005. 10743 @texlinebreak{} @uref{http://www.math.dartmouth.edu/~carlp/} 10744 10745 @item 10746 Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate 10747 Texts in Mathematics number 138, Springer-Verlag, 1993. 10748 @texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/} 10749 10750 @item 10751 Donald E. Knuth, ``The Art of Computer Programming'', volume 2, 10752 ``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998. 10753 @texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html} 10754 10755 @item 10756 John D. Lipson, ``Elements of Algebra and Algebraic Computing'', 10757 The Benjamin Cummings Publishing Company Inc, 1981. 10758 10759 @item 10760 Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of 10761 Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/} 10762 10763 @item 10764 Richard M. 
Stallman and the GCC Developer Community, ``Using the GNU Compiler 10765 Collection'', Free Software Foundation, 2008, available online 10766 @uref{https://gcc.gnu.org/onlinedocs/}, and in the GCC package 10767 @uref{https://ftp.gnu.org/gnu/gcc/} 10768 @end itemize 10769 10770 @section Papers 10771 10772 @itemize @bullet 10773 @item 10774 Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square 10775 Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also 10776 available online as INRIA Research Report 4475, June 2002, 10777 @uref{http://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf} 10778 10779 @item 10780 Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'', 10781 Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, 10782 @texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022} 10783 10784 @item 10785 Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers 10786 using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June 10787 1994. Also available @uref{https://gmplib.org/~tege/divcnst-pldi94.pdf}. 10788 10789 @item 10790 Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant 10791 integers'', IEEE Transactions on Computers, 11 June 2010. 10792 @uref{https://gmplib.org/~tege/division-paper.pdf} 10793 10794 @item 10795 Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and 10796 small'', to appear. 10797 10798 @item 10799 Tudor Jebelean, 10800 ``An algorithm for exact division'', 10801 Journal of Symbolic Computation, 10802 volume 15, 1993, pp.@: 169-180. 
10803 Research report version available @texlinebreak{} 10804 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz} 10805 10806 @item 10807 Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended 10808 Abstract'', RISC-Linz technical report 96-31, @texlinebreak{} 10809 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz} 10810 10811 @item 10812 Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'', 10813 ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{} 10814 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz} 10815 10816 @item 10817 Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93, 10818 pp.@: 111-116. Technical report version available @texlinebreak{} 10819 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz} 10820 10821 @item 10822 Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD 10823 of Long Integers'', Journal of Symbolic Computation, volume 19, 1995, 10824 pp.@: 145-157. Technical report version also available @texlinebreak{} 10825 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz} 10826 10827 @item 10828 Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'', 10829 Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455. Early 10830 technical report version also available 10831 @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz} 10832 10833 @item 10834 Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally 10835 equidistributed uniform pseudorandom number generator'', ACM Transactions on 10836 Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30. 10837 Available online @texlinebreak{} 10838 @uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf) 10839 10840 @item 10841 R. Moenck and A. 
Borodin, ``Fast Modular Transforms via Division'', 10842 Proceedings of the 13th Annual IEEE Symposium on Switching and Automata 10843 Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'', 10844 Journal of Computer and System Sciences, volume 8, number 3, June 1974, 10845 pp.@: 366-386. 10846 10847 @item 10848 Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD 10849 computation'', in Mathematics of Computation, volume 77, January 2008, pp.@: 10850 589-607. 10851 10852 @item 10853 Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in 10854 Mathematics of Computation, volume 44, number 170, April 1985. 10855 10856 @item 10857 Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser 10858 Zahlen'', Computing 7, 1971, pp.@: 281-292. 10859 10860 @item 10861 Kenneth Weber, ``The accelerated integer GCD algorithm'', 10862 ACM Transactions on Mathematical Software, 10863 volume 21, number 1, March 1995, pp.@: 111-122. 10864 10865 @item 10866 Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805, 10867 November 1999, @uref{http://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf} 10868 10869 @item 10870 Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root 10871 Implementations'', @texlinebreak{} 10872 @uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz} 10873 10874 @item 10875 Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE 10876 Symposium on Computer Arithmetic, 1993, pp.@: 260 to 271. Reprinted as ``More 10877 on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers, 10878 volume 43, number 8, August 1994, pp.@: 899-908. 
10879 @end itemize 10880 10881 10882 @node GNU Free Documentation License, Concept Index, References, Top 10883 @appendix GNU Free Documentation License 10884 @cindex GNU Free Documentation License 10885 @cindex Free Documentation License 10886 @cindex Documentation license 10887 @include fdl-1.3.texi 10888 10889 10890 @node Concept Index, Function Index, GNU Free Documentation License, Top 10891 @comment node-name, next, previous, up 10892 @unnumbered Concept Index 10893 @printindex cp 10894 10895 @node Function Index, , Concept Index, Top 10896 @comment node-name, next, previous, up 10897 @unnumbered Function and Type Index 10898 @printindex fn 10899 10900 @bye 10901 10902 @c Local variables: 10903 @c fill-column: 78 10904 @c compile-command: "make gmp.info" 10905 @c End: