modernc.org/knuth@v0.0.4/web/testdata/ctan.org/tex-archive/systems/knuth/dist/tex/tex.web

     1  % This program is copyright (C) 1982 by D. E. Knuth; all rights are reserved.
     2  % Unlimited copying and redistribution of this file are permitted as long
     3  % as this file is not modified. Modifications are permitted, but only if
     4  % the resulting file is not named tex.web. (The WEB system provides
     5  % for alterations via an auxiliary file; the master file should stay intact.)
     6  % See Appendix H of the WEB manual for hints on how to install this program.
     7  % And see Appendix A of the TRIP manual for details about how to validate it.
     8  
     9  % TeX is a trademark of the American Mathematical Society.
    10  % METAFONT is a trademark of Addison-Wesley Publishing Company.
    11  
    12  % Version 0 was released in September 1982 after it passed a variety of tests.
    13  % Version 1 was released in November 1983 after thorough testing.
    14  % Version 1.1 fixed ``disappearing font identifiers'' et alia (July 1984).
    15  % Version 1.2 allowed `0' in response to an error, et alia (October 1984).
    16  % Version 1.3 made memory allocation more flexible and local (November 1984).
    17  % Version 1.4 fixed accents right after line breaks, et alia (April 1985).
    18  % Version 1.5 fixed \the\toks after other expansion in \edefs (August 1985).
    19  % Version 2.0 (almost identical to 1.5) corresponds to "Volume B" (April 1986).
    20  % Version 2.1 corrected anomalies in discretionary breaks (January 1987).
    21  % Version 2.2 corrected "(Please type...)" with null \endlinechar (April 1987).
    22  % Version 2.3 avoided incomplete page in premature termination (August 1987).
    23  % Version 2.4 fixed \noaligned rules in indented displays (August 1987).
    24  % Version 2.5 saved cur_order when expanding tokens (September 1987).
    25  % Version 2.6 added 10sp slop when shipping leaders (November 1987).
    26  % Version 2.7 improved rounding of negative-width characters (November 1987).
    27  % Version 2.8 fixed weird bug if no \patterns are used (December 1987).
    28  % Version 2.9 made \csname\endcsname's "relax" local (December 1987).
    29  % Version 2.91 fixed \outer\def\a0{}\a\a bug (April 1988).
    30  % Version 2.92 fixed \patterns, also file names with complex macros (May 1988).
    31  % Version 2.93 fixed negative halving in allocator when mem_min<0 (June 1988).
    32  % Version 2.94 kept open_log_file from calling fatal_error (November 1988).
    33  % Version 2.95 solved that problem a better way (December 1988).
    34  % Version 2.96 corrected bug in "Infinite shrinkage" recovery (January 1989).
    35  % Version 2.97 corrected blunder in creating 2.95 (February 1989).
    36  % Version 2.98 omitted save_for_after at outer level (March 1989).
    37  % Version 2.99 caught $$\begingroup\halign..$$ (June 1989).
    38  % Version 2.991 caught .5\ifdim.6... (June 1989).
    39  % Version 2.992 introduced major changes for 8-bit extensions (September 1989).
    40  % Version 2.993 fixed a save_stack synchronization bug et alia (December 1989).
    41  % Version 3.0 fixed unusual displays; was more \output robust (March 1990).
    42  % Version 3.1 fixed nullfont, disabled \write{\the\prevgraf} (September 1990).
    43  % Version 3.14 fixed unprintable font names and corrected typos (March 1991).
    44  % Version 3.141 more of same; reconstituted ligatures better (March 1992).
    45  % Version 3.1415 preserved nonexplicit kerns, tidied up (February 1993).
    46  % Version 3.14159 allowed fontmemsize to change; bulletproofing (March 1995).
    47  % Version 3.141592 fixed \xleaders, glueset, weird alignments (December 2002).
    48  % Version 3.1415926 was a general cleanup with minor fixes (February 2008).
    49  % Version 3.14159265 was similar (January 2014).
    50  % Version 3.141592653 was similar but more extensive (January 2021).
    51  
    52  % A reward of $327.68 will be paid to the first finder of any remaining bug.
    53  
    54  % Although considerable effort has been expended to make the TeX program
    55  % correct and reliable, no warranty is implied; the author disclaims any
    56  % obligation or liability for damages, including but not limited to
    57  % special, indirect, or consequential damages arising out of or in
    58  % connection with the use or performance of this software. This work has
    59  % been a ``labor of love'' and the author hopes that users enjoy it.
    60  
    61  % Here is TeX material that gets inserted after \input webmac
    62  \def\hang{\hangindent 3em\noindent\ignorespaces}
    63  \def\hangg#1 {\hang\hbox{#1 }}
    64  \def\textindent#1{\hangindent2.5em\noindent\hbox to2.5em{\hss#1 }\ignorespaces}
    65  \font\ninerm=cmr9
    66  \let\mc=\ninerm % medium caps for names like SAIL
    67  \def\PASCAL{Pascal}
    68  \def\ph{\hbox{Pascal-H}}
    69  \def\pct!{{\char`\%}} % percent sign in ordinary text
    70  \font\logo=logo10 % font used for the METAFONT logo
    71  \def\MF{{\logo META}\-{\logo FONT}}
    72  \def\<#1>{$\langle#1\rangle$}
    73  \def\section{\mathhexbox278}
    74  
    75  \def\(#1){} % this is used to make section names sort themselves better
    76  \def\9#1{} % this is used for sort keys in the index via @@:sort key}{entry@@>
    77  
    78  \outer\def\N#1. \[#2]#3.{\MN#1.\vfil\eject % begin starred section
    79    \def\rhead{PART #2:\uppercase{#3}} % define running headline
    80    \message{*\modno} % progress report
    81    \edef\next{\write\cont{\Z{\?#2]#3}{\modno}{\the\pageno}}}\next
    82    \ifon\startsection{\bf\ignorespaces#3.\quad}\ignorespaces}
    83  \let\?=\relax % we want to be able to \write a \?
    84  
    85  \def\title{\TeX82}
    86  \def\topofcontents{\hsize 5.5in
    87    \vglue 0pt plus 1fil minus 1.5in
    88    \def\?##1]{\hbox to 1in{\hfil##1.\ }}
    89    }
    90  \def\botofcontents{\vskip 0pt plus 1fil minus 1.5in}
    91  \pageno=3
    92  \def\glob{13} % this should be the section number of "<Global...>"
    93  \def\gglob{20, 26} % this should be the next two sections of "<Global...>"
    94  
    95  @* \[1] Introduction.
    96  This is \TeX, a document compiler intended to produce typesetting of high
    97  quality.
    98  The \PASCAL\ program that follows is the definition of \TeX82, a standard
    99  @:PASCAL}{\PASCAL@>
   100  @!@:TeX82}{\TeX82@>
   101  version of \TeX\ that is designed to be highly portable so that identical output
   102  will be obtainable on a great variety of computers.
   103  
   104  The main purpose of the following program is to explain the algorithms of \TeX\
   105  as clearly as possible. As a result, the program will not necessarily be very
   106  efficient when a particular \PASCAL\ compiler has translated it into a
   107  particular machine language. However, the program has been written so that it
   108  can be tuned to run efficiently in a wide variety of operating environments
   109  by making comparatively few changes. Such flexibility is possible because
   110  the documentation that follows is written in the \.{WEB} language, which is
   111  at a higher level than \PASCAL; the preprocessing step that converts \.{WEB}
   112  to \PASCAL\ is able to introduce most of the necessary refinements.
   113  Semi-automatic translation to other languages is also feasible, because the
   114  program below does not make extensive use of features that are peculiar to
   115  \PASCAL.
   116  
   117  A large piece of software like \TeX\ has inherent complexity that cannot
   118  be reduced below a certain level of difficulty, although each individual
   119  part is fairly simple by itself. The \.{WEB} language is intended to make
   120  the algorithms as readable as possible, by reflecting the way the
   121  individual program pieces fit together and by providing the
   122  cross-references that connect different parts. Detailed comments about
   123  what is going on, and about why things were done in certain ways, have
   124  been liberally sprinkled throughout the program.  These comments explain
   125  features of the implementation, but they rarely attempt to explain the
   126  \TeX\ language itself, since the reader is supposed to be familiar with
   127  {\sl The \TeX book}.
   128  @.WEB@>
   129  @:TeXbook}{\sl The \TeX book@>
   130  
   131  @ The present implementation has a long ancestry, beginning in the summer
   132  of~1977, when Michael~F. Plass and Frank~M. Liang designed and coded
   133  a prototype
   134  @^Plass, Michael Frederick@>
   135  @^Liang, Franklin Mark@>
   136  @^Knuth, Donald Ervin@>
   137  based on some specifications that the author had made in May of that year.
   138  This original proto\TeX\ included macro definitions and elementary
   139  manipulations on boxes and glue, but it did not have line-breaking,
   140  page-breaking, mathematical formulas, alignment routines, error recovery,
   141  or the present semantic nest; furthermore,
   142  it used character lists instead of token lists, so that a control sequence
   143  like \.{\\halign} was represented by a list of seven characters. A
   144  complete version of \TeX\ was designed and coded by the author in late
   145  1977 and early 1978; that program, like its prototype, was written in the
   146  {\mc SAIL} language, for which an excellent debugging system was
   147  available. Preliminary plans to convert the {\mc SAIL} code into a form
   148  somewhat like the present ``web'' were developed by Luis Trabb~Pardo and
   149  @^Trabb Pardo, Luis Isidoro@>
   150  the author at the beginning of 1979, and a complete implementation was
   151  created by Ignacio~A. Zabala in 1979 and 1980. The \TeX82 program, which
   152  @^Zabala Salelles, Ignacio Andr\'es@>
   153  was written by the author during the latter part of 1981 and the early
   154  part of 1982, also incorporates ideas from the 1979 implementation of
   155  @^Guibas, Leonidas Ioannis@>
   156  @^Sedgewick, Robert@>
   157  @^Wyatt, Douglas Kirk@>
   158  \TeX\ in {\mc MESA} that was written by Leonidas Guibas, Robert Sedgewick,
   159  and Douglas Wyatt at the Xerox Palo Alto Research Center.  Several hundred
   160  refinements were introduced into \TeX82 based on the experiences gained with
   161  the original implementations, so that essentially every part of the system
   162  has been substantially improved. After the appearance of ``Version 0'' in
   163  September 1982, this program benefited greatly from the comments of
   164  many other people, notably David~R. Fuchs and Howard~W. Trickey.
   165  A final revision in September 1989 extended the input character set to
   166  eight-bit codes and introduced the ability to hyphenate words from
   167  different languages, based on some ideas of Michael~J. Ferguson.
   168  @^Fuchs, David Raymond@>
   169  @^Trickey, Howard Wellington@>
   170  @^Ferguson, Michael John@>
   171  
   172  No doubt there still is plenty of room for improvement, but the author
   173  is firmly committed to keeping \TeX82 ``frozen'' from now on; stability
   174  and reliability are to be its main virtues.
   175  
   176  On the other hand, the \.{WEB} description can be extended without changing
   177  the core of \TeX82 itself, and the program has been designed so that such
   178  extensions are not extremely difficult to make.
   179  The |banner| string defined here should be changed whenever \TeX\
   180  undergoes any modifications, so that it will be clear which version of
   181  \TeX\ might be the guilty party when a problem arises.
   182  @^extensions to \TeX@>
   183  @^system dependencies@>
   184  
   185  If this program is changed, the resulting system should not be called
   186  `\TeX'; the official name `\TeX' by itself is reserved
   187  for software systems that are fully compatible with each other.
   188  A special test suite called the ``\.{TRIP} test'' is available for
   189  helping to determine whether a particular implementation deserves to be
   190  known as `\TeX' [cf.~Stanford Computer Science report CS1027,
   191  November 1984].
   192  
   193  @d banner=='This is TeX, Version 3.141592653' {printed when \TeX\ starts}
   194  
   195  @ Different \PASCAL s have slightly different conventions, and the present
   196  @!@:PASCAL H}{\ph@>
   197  program expresses \TeX\ in terms of the \PASCAL\ that was
   198  available to the author in 1982. Constructions that apply to
   199  this particular compiler, which we shall call \ph, should help the
   200  reader see how to make an appropriate interface for other systems
   201  if necessary. (\ph\ is Charles Hedrick's modification of a compiler
   202  @^Hedrick, Charles Locke@>
   203  for the DECsystem-10 that was originally developed at the University of
   204  Hamburg; cf.\ {\sl Software---Practice and Experience \bf6} (1976),
   205  29--42. The \TeX\ program below is intended to be adaptable, without
   206  extensive changes, to most other versions of \PASCAL, so it does not fully
   207  use the admirable features of \ph. Indeed, a conscious effort has been
   208  made here to avoid using several idiosyncratic features of standard
   209  \PASCAL\ itself, so that most of the code can be translated mechanically
   210  into other high-level languages. For example, the `\&{with}' and `\\{new}'
   211  features are not used, nor are pointer types, set types, or enumerated
   212  scalar types; there are no `\&{var}' parameters, except in the case of files;
   213  there are no tag fields on variant records; there are no assignments
   214  |real:=integer|; no procedures are declared local to other procedures.)
   215  
   216  The portions of this program that involve system-dependent code, where
   217  changes might be necessary because of differences between \PASCAL\ compilers
   218  and/or differences between
   219  operating systems, can be identified by looking at the sections whose
   220  numbers are listed under `system dependencies' in the index. Furthermore,
   221  the index entries for `dirty \PASCAL' list all places where the restrictions
   222  of \PASCAL\ have not been followed perfectly, for one reason or another.
   223  @!@^system dependencies@>
   224  @!@^dirty \PASCAL@>
   225  
   226  Incidentally, \PASCAL's standard |round| function can be problematical,
   227  because it disagrees with the IEEE floating-point standard.
   228  Many implementors have
   229  therefore chosen to substitute their own home-grown rounding procedure.
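
The remark about |round| can be illustrated with a small sketch. It assumes the Jensen and Wirth definition of |round| (add one half to the magnitude, then truncate, so that ties move away from zero) and uses Python's built-in round, which resolves ties to the nearest even integer, as a stand-in for the default IEEE rounding mode. The name pascal_round is hypothetical and appears nowhere in the program.

```python
import math


def pascal_round(x: float) -> int:
    """Round the way classic Pascal is assumed to: ties go away from zero."""
    if x >= 0:
        return math.trunc(x + 0.5)
    return math.trunc(x - 0.5)


# The two conventions differ only on exact ties such as 2.5 or -2.5.
for value in (2.5, -2.5, 3.5, 2.4):
    print(value, pascal_round(value), round(value))
```

A home-grown rounding procedure of this kind makes the tie-breaking rule explicit, so that results do not depend on which convention a particular compiler or floating-point unit happens to follow.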
   230  
   231  @ The program begins with a normal \PASCAL\ program heading, whose
   232  components will be filled in later, using the conventions of \.{WEB}.
   233  @.WEB@>
   234  For example, the portion of the program called `\X\glob:Global
   235  variables\X' below will be replaced by a sequence of variable declarations
   236  that starts in $\section\glob$ of this documentation. In this way, we are able
   237  to define each individual global variable when we are prepared to
   238  understand what it means; we do not have to define all of the globals at
   239  once.  Cross references in $\section\glob$, where it says ``See also
   240  sections \gglob, \dots,'' also make it possible to look at the set of
   241  all global variables, if desired.  Similar remarks apply to the other
   242  portions of the program heading.
   243  
   244  Actually the heading shown here is not quite normal: The |program| line
   245  does not mention any |output| file, because \ph\ would ask the \TeX\ user
   246  to specify a file name if |output| were specified here.
   247  @:PASCAL H}{\ph@>
   248  @^system dependencies@>
   249  
   250  @d mtype==t@&y@&p@&e {this is a \.{WEB} coding trick:}
   251  @f mtype==type {`\&{mtype}' will be equivalent to `\&{type}'}
   252  @f type==true {but `|type|' will not be treated as a reserved word}
   253  
   254  @p @t\4@>@<Compiler directives@>@/
   255  program TEX; {all file names are defined dynamically}
   256  label @<Labels in the outer block@>@/
   257  const @<Constants in the outer block@>@/
   258  mtype @<Types in the outer block@>@/
   259  var @<Global variables@>@/
   260  @#
   261  procedure initialize; {this procedure gets things started properly}
   262    var @<Local variables for initialization@>@/
   263    begin @<Initialize whatever \TeX\ might access@>@;
   264    end;@#
   265  @t\4@>@<Basic printing procedures@>@/
   266  @t\4@>@<Error handling procedures@>@/
   267  
   268  @ The overall \TeX\ program begins with the heading just shown, after which
   269  comes a bunch of procedure declarations and function declarations.
   270  Finally we will get to the main program, which begins with the
   271  comment `|start_here|'. If you want to skip down to the
   272  main program now, you can look up `|start_here|' in the index.
   273  But the author suggests that the best way to understand this program
   274  is to follow pretty much the order of \TeX's components as they appear in the
   275  \.{WEB} description you are now reading, since the present ordering is
   276  intended to combine the advantages of the ``bottom up'' and ``top down''
   277  approaches to the problem of understanding a somewhat complicated system.
   278  
   279  @ Three labels must be declared in the main program, so we give them
   280  symbolic names.
   281  
   282  @d start_of_TEX=1 {go here when \TeX's variables are initialized}
   283  @d end_of_TEX=9998 {go here to close files and terminate gracefully}
   284  @d final_end=9999 {this label marks the ending of the program}
   285  
   286  @<Labels in the out...@>=
   287  start_of_TEX@t\hskip-2pt@>, end_of_TEX@t\hskip-2pt@>,@,final_end;
   288    {key control points}
   289  
   290  @ Some of the code below is intended to be used only when diagnosing the
   291  strange behavior that sometimes occurs when \TeX\ is being installed or
   292  when system wizards are fooling around with \TeX\ without quite knowing
   293  what they are doing. Such code will not normally be compiled; it is
   294  delimited by the codewords `$|debug|\ldots|gubed|$', with apologies
   295  to people who wish to preserve the purity of English.
   296  
   297  Similarly, there is some conditional code delimited by
   298  `$|stat|\ldots|tats|$' that is intended for use when statistics are to be
   299  kept about \TeX's memory usage.  The |stat| $\ldots$ |tats| code also
   300  implements diagnostic information for \.{\\tracingparagraphs},
   301  \.{\\tracingpages}, and \.{\\tracingrestores}.
   302  @^debugging@>
   303  
   304  @d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging}
   305  @d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging}
   306  @f debug==begin
   307  @f gubed==end
   308  @#
   309  @d stat==@{ {change this to `$\\{stat}\equiv\null$' when gathering
   310    usage statistics}
   311  @d tats==@t@>@} {change this to `$\\{tats}\equiv\null$' when gathering
   312    usage statistics}
   313  @f stat==begin
   314  @f tats==end
   315  
   316  @ This program has two important variations: (1) There is a long and slow
   317  version called \.{INITEX}, which does the extra calculations needed to
   318  @.INITEX@>
   319  initialize \TeX's internal tables; and (2)~there is a shorter and faster
   320  production version, which cuts the initialization to a bare minimum.
   321  Parts of the program that are needed in (1) but not in (2) are delimited by
   322  the codewords `$|init|\ldots|tini|$'.
   323  
   324  @d init== {change this to `$\\{init}\equiv\.{@@\{}$' in the production version}
   325  @d tini== {change this to `$\\{tini}\equiv\.{@@\}}$' in the production version}
   326  @f init==begin
   327  @f tini==end
   328  
   329  @<Initialize whatever...@>=
   330  @<Set initial values of key variables@>@/
   331  @!init @<Initialize table entries (done by \.{INITEX} only)@>@;@+tini
   332  
   333  @ If the first character of a \PASCAL\ comment is a dollar sign,
   334  \ph\ treats the comment as a list of ``compiler directives'' that will
   335  affect the translation of this program into machine language.  The
   336  directives shown below specify full checking and inclusion of the \PASCAL\
   337  debugger when \TeX\ is being debugged, but they cause range checking and other
   338  redundant code to be eliminated when the production system is being generated.
   339  Arithmetic overflow will be detected in all cases.
   340  @:PASCAL H}{\ph@>
   341  @^system dependencies@>
   342  @^overflow in arithmetic@>
   343  
   344  @<Compiler directives@>=
   345  @{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead}
   346  @!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging}
   347  
   348  @ This \TeX\ implementation conforms to the rules of the {\sl Pascal User
   349  @:PASCAL}{\PASCAL@>
   350  @^system dependencies@>
   351  Manual} published by Jensen and Wirth in 1975, except where system-dependent
   352  @^Wirth, Niklaus@>
   353  @^Jensen, Kathleen@>
   354  code is necessary to make a useful system program, and except in another
   355  respect where such conformity would unnecessarily obscure the meaning
   356  and clutter up the code: We assume that |case| statements may include a
   357  default case that applies if no matching label is found. Thus, we shall use
   358  constructions like
   359  $$\vbox{\halign{\ignorespaces#\hfil\cr
   360  |case x of|\cr
   361  1: $\langle\,$code for $x=1\,\rangle$;\cr
   362  3: $\langle\,$code for $x=3\,\rangle$;\cr
   363  |othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
   364  |endcases|\cr}}$$
   365  since most \PASCAL\ compilers have plugged this hole in the language by
   366  incorporating some sort of default mechanism. For example, the \ph\
   367  compiler allows `|others|:' as a default label, and other \PASCAL s allow
   368  syntaxes like `\&{else}' or `\&{otherwise}' or `\\{otherwise}:', etc. The
   369  definitions of |othercases| and |endcases| should be changed to agree with
   370  local conventions.  Note that no semicolon appears before |endcases| in
   371  this program, so the definition of |endcases| should include a semicolon
   372  if the compiler wants one. (Of course, if no default mechanism is
   373  available, the |case| statements of \TeX\ will have to be laboriously
   374  extended by listing all remaining cases. People who are stuck with such
   375  \PASCAL s have, in fact, done this, successfully but not happily!)
   376  @:PASCAL H}{\ph@>
   377  
   378  @d othercases == others: {default for cases not listed explicitly}
   379  @d endcases == @+end {follows the default case in an extended |case| statement}
   380  @f othercases == else
   381  @f endcases == end
   382  
   383  @ The following parameters can be changed at compile time to extend or
   384  reduce \TeX's capacity. They may have different values in \.{INITEX} and
   385  in production versions of \TeX.
   386  @.INITEX@>
   387  @^system dependencies@>
   388  
   389  @<Constants...@>=
   390  @!mem_max=30000; {greatest index in \TeX's internal |mem| array;
   391    must be strictly less than |max_halfword|;
   392    must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
   393  @!mem_min=0; {smallest index in \TeX's internal |mem| array;
   394    must be |min_halfword| or more;
   395    must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
   396  @!buf_size=500; {maximum number of characters simultaneously present in
   397    current lines of open files and in control sequences between
   398    \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
   399  @!error_line=72; {width of context lines on terminal error messages}
   400  @!half_error_line=42; {width of first lines of contexts in terminal
   401    error messages; should be between 30 and |error_line-15|}
   402  @!max_print_line=79; {width of longest text lines output; should be at least 60}
   403  @!stack_size=200; {maximum number of simultaneous input sources}
   404  @!max_in_open=6; {maximum number of input files and error insertions that
   405    can be going on simultaneously}
   406  @!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
   407    and must be at most |font_base+256|}
   408  @!font_mem_size=20000; {number of words of |font_info| for all fonts}
   409  @!param_size=60; {maximum number of simultaneous macro parameters}
   410  @!nest_size=40; {maximum number of semantic levels simultaneously active}
   411  @!max_strings=3000; {maximum number of strings; must not exceed |max_halfword|}
   412  @!string_vacancies=8000; {the minimum number of characters that should be
   413    available for the user's control sequences and font names,
   414    after \TeX's own error messages are stored}
   415  @!pool_size=32000; {maximum number of characters in strings, including all
   416    error messages and help texts, and the names of all fonts and
   417    control sequences; must exceed |string_vacancies| by the total
   418    length of \TeX's own strings, which is currently about 23000}
   419  @!save_size=600; {space for saving values outside of current group; must be
   420    at most |max_halfword|}
   421  @!trie_size=8000; {space for hyphenation patterns; should be larger for
   422    \.{INITEX} than it is in production versions of \TeX}
   423  @!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
   424  @!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
   425  @!file_name_size=40; {file names shouldn't be longer than this}
   426  @!pool_name='TeXformats:TEX.POOL                     ';
   427    {string of length |file_name_size|; tells where the string pool appears}
   428  @.TeXformats@>
   429  
   430  @ Like the preceding parameters, the following quantities can be changed
   431  at compile time to extend or reduce \TeX's capacity. But if they are changed,
   432  it is necessary to rerun the initialization program \.{INITEX}
   433  @.INITEX@>
   434  to generate new tables for the production \TeX\ program.
   435  One can't simply make helter-skelter changes to the following constants,
   436  since certain rather complex initialization
   437  numbers are computed from them. They are defined here using
   438  \.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
   439  emphasize this distinction.
   440  
   441  @d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
   442    must not be less than |mem_min|}
   443  @d mem_top==30000 {largest index in the |mem| array dumped by \.{INITEX};
   444    must be substantially larger than |mem_bot|
   445    and not greater than |mem_max|}
   446  @d font_base=0 {smallest internal font number; must not be less
   447    than |min_quarterword|}
   448  @d hash_size=2100 {maximum number of control sequences; it should be at most
   449    about |(mem_max-mem_min)/10|}
   450  @d hash_prime=1777 {a prime number equal to about 85\pct! of |hash_size|}
   451  @d hyph_size=307 {another prime; the number of \.{\\hyphenation} exceptions}
   452  @^system dependencies@>
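
The numeric claims in the comments above (|hash_prime| is a prime near 85 percent of |hash_size|, |hyph_size| is another prime, and |hash_size| is at most about |(mem_max-mem_min)/10|) can be checked with a short sketch. The constants are copied from this listing; the helper is_prime is a hypothetical trial-division routine written only for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for numbers of this size."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


MEM_MAX, MEM_MIN = 30000, 0
HASH_SIZE, HASH_PRIME, HYPH_SIZE = 2100, 1777, 307

ratio = HASH_PRIME / HASH_SIZE            # fraction of the hash table
limit = (MEM_MAX - MEM_MIN) // 10         # suggested ceiling for hash_size
print(is_prime(HASH_PRIME), is_prime(HYPH_SIZE), round(ratio, 3), limit)
```

Anyone who changes |hash_size| would presumably want to rerun a check of this kind to pick a matching |hash_prime| before regenerating the tables with INITEX.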
   453  
   454  @ In case somebody has inadvertently made bad settings of the ``constants,''
   455  \TeX\ checks them using a global variable called |bad|.
   456  
   457  This is the first of many sections of \TeX\ where global variables are
   458  defined.
   459  
   460  @<Glob...@>=
   461  @!bad:integer; {is some ``constant'' wrong?}
   462  
   463  @ Later on we will say `\ignorespaces|if mem_max>=max_halfword then bad:=14|',
   464  or something similar. (We can't do that until |max_halfword| has been defined.)
   465  
   466  @<Check the ``constant'' values for consistency@>=
   467  bad:=0;
   468  if (half_error_line<30)or(half_error_line>error_line-15) then bad:=1;
   469  if max_print_line<60 then bad:=2;
   470  if dvi_buf_size mod 8<>0 then bad:=3;
   471  if mem_bot+1100>mem_top then bad:=4;
   472  if hash_prime>hash_size then bad:=5;
   473  if max_in_open>=128 then bad:=6;
   474  if mem_top<256+11 then bad:=7; {we will want |null_list>255|}
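
For readers who want to try out candidate settings before compiling, the seven tests above can be transcribed into a small sketch. The dictionary DEFAULTS and the function check_constants are hypothetical names; the values are those given in this listing. As in the Pascal code, each failing test overwrites |bad|, so the last failure is the one reported.

```python
DEFAULTS = {
    "half_error_line": 42, "error_line": 72, "max_print_line": 79,
    "dvi_buf_size": 800, "mem_bot": 0, "mem_top": 30000,
    "hash_prime": 1777, "hash_size": 2100, "max_in_open": 6,
}


def check_constants(c: dict) -> int:
    """Return 0 if the settings are consistent, else the code of the last failed test."""
    bad = 0
    if c["half_error_line"] < 30 or c["half_error_line"] > c["error_line"] - 15:
        bad = 1
    if c["max_print_line"] < 60:
        bad = 2
    if c["dvi_buf_size"] % 8 != 0:
        bad = 3
    if c["mem_bot"] + 1100 > c["mem_top"]:
        bad = 4
    if c["hash_prime"] > c["hash_size"]:
        bad = 5
    if c["max_in_open"] >= 128:
        bad = 6
    if c["mem_top"] < 256 + 11:  # null_list must exceed 255
        bad = 7
    return bad


print(check_constants(DEFAULTS))
```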
   475  
   476  @ Labels are given symbolic names by the following definitions, so that
   477  occasional |goto| statements will be meaningful. We insert the label
   478  `|exit|' just before the `\ignorespaces|end|\unskip' of a procedure in
   479  which we have used the `|return|' statement defined below; the label
   480  `|restart|' is occasionally used at the very beginning of a procedure; and
   481  the label `|reswitch|' is occasionally used just prior to a |case|
   482  statement in which some cases change the conditions and we wish to branch
   483  to the newly applicable case.  Loops that are set up with the |loop|
   484  construction defined below are commonly exited by going to `|done|' or to
   485  `|found|' or to `|not_found|', and they are sometimes repeated by going to
   486  `|continue|'.  If two or more parts of a subroutine start differently but
   487  end up the same, the shared code may be gathered together at
   488  `|common_ending|'.
   489  
   490  Incidentally, this program never declares a label that isn't actually used,
   491  because some fussy \PASCAL\ compilers will complain about redundant labels.
   492  
   493  @d exit=10 {go here to leave a procedure}
   494  @d restart=20 {go here to start a procedure again}
   495  @d reswitch=21 {go here to start a case statement again}
   496  @d continue=22 {go here to resume a loop}
   497  @d done=30 {go here to exit a loop}
   498  @d done1=31 {like |done|, when there is more than one loop}
   499  @d done2=32 {for exiting the second loop in a long block}
   500  @d done3=33 {for exiting the third loop in a very long block}
   501  @d done4=34 {for exiting the fourth loop in an extremely long block}
   502  @d done5=35 {for exiting the fifth loop in an immense block}
   503  @d done6=36 {for exiting the sixth loop in a block}
   504  @d found=40 {go here when you've found it}
   505  @d found1=41 {like |found|, when there's more than one per routine}
   506  @d found2=42 {like |found|, when there's more than two per routine}
   507  @d not_found=45 {go here when you've found nothing}
   508  @d common_ending=50 {go here when you want to merge with another branch}
   509  
   510  @ Here are some macros for common programming idioms.
   511  
   512  @d incr(#) == #:=#+1 {increase a variable by unity}
   513  @d decr(#) == #:=#-1 {decrease a variable by unity}
   514  @d negate(#) == #:=-# {change the sign of a variable}
   515  @d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
   516  @f loop == xclause
   517    {\.{WEB}'s |xclause| acts like `\ignorespaces|while true do|\unskip'}
   518  @d do_nothing == {empty statement}
   519  @d return == goto exit {terminate a procedure call}
   520  @f return == nil
   521  @d empty=0 {symbolic name for a null constant}
   522  
   523  @* \[2] The character set.
   524  In order to make \TeX\ readily portable to a wide variety of
   525  computers, all of its input text is converted to an internal eight-bit
   526  code that includes standard ASCII, the ``American Standard Code for
   527  Information Interchange.''  This conversion is done immediately when each
   528  character is read in. Conversely, characters are converted from ASCII to
   529  the user's external representation just before they are output to a
   530  text file.
   531  
   532  Such an internal code is relevant to users of \TeX\ primarily because it
   533  governs the positions of characters in the fonts. For example, the
   534  character `\.A' has ASCII code $65=@'101$, and when \TeX\ typesets
   535  this letter it specifies character number 65 in the current font.
   536  If that font actually has `\.A' in a different position, \TeX\ doesn't
   537  know what the real position is; the program that does the actual printing from
   538  \TeX's device-independent files is responsible for converting from ASCII to
   539  a particular font encoding.
   540  @^ASCII code@>
   541  
   542  \TeX's internal code also defines the value of constants
   543  that begin with a reverse apostrophe; and it provides an index to the
   544  \.{\\catcode}, \.{\\mathcode}, \.{\\uccode}, \.{\\lccode}, and \.{\\delcode}
   545  tables.
   546  
   547  @ Characters of text that have been converted to \TeX's internal form
   548  are said to be of type |ASCII_code|, which is a subrange of the integers.
   549  
   550  @<Types...@>=
   551  @!ASCII_code=0..255; {eight-bit numbers}
   552  
   553  @ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
   554  character sets were common, so it did not make provision for lowercase
   555  letters. Nowadays, of course, we need to deal with both capital and small
   556  letters in a convenient way, especially in a program for typesetting;
   557  so the present specification of \TeX\ has been written under the assumption
   558  that the \PASCAL\ compiler and run-time system permit the use of text files
   559  with more than 64 distinguishable characters. More precisely, we assume that
   560  the character set contains at least the letters and symbols associated
   561  with ASCII codes @'40 through @'176; all of these characters are now
   562  available on most computer terminals.
   563  
   564  Since we are dealing with more characters than were present in the first
   565  \PASCAL\ compilers, we have to decide what to call the associated data
   566  type. Some \PASCAL s use the original name |char| for the
   567  characters in text files, even though there now are more than 64 such
   568  characters, while other \PASCAL s consider |char| to be a 64-element
   569  subrange of a larger data type that has some other name.
   570  
   571  In order to accommodate this difference, we shall use the name |text_char|
   572  to stand for the data type of the characters that are converted to and
   573  from |ASCII_code| when they are input and output. We shall also assume
   574  that |text_char| consists of the elements |chr(first_text_char)| through
   575  |chr(last_text_char)|, inclusive. The following definitions should be
   576  adjusted if necessary.
   577  @^system dependencies@>
   578  
   579  @d text_char == char {the data type of characters in text files}
   580  @d first_text_char=0 {ordinal number of the smallest element of |text_char|}
   581  @d last_text_char=255 {ordinal number of the largest element of |text_char|}
   582  
   583  @<Local variables for init...@>=
   584  @!i:integer;
   585  
   586  @ The \TeX\ processor converts between ASCII code and
   587  the user's external character set by means of arrays |xord| and |xchr|
   588  that are analogous to \PASCAL's |ord| and |chr| functions.
   589  
   590  @<Glob...@>=
   591  @!xord: array [text_char] of ASCII_code;
   592    {specifies conversion of input characters}
   593  @!xchr: array [ASCII_code] of text_char;
   594    {specifies conversion of output characters}
   595  
   596  @ Since we are assuming that our \PASCAL\ system is able to read and
   597  write the visible characters of standard ASCII (although not
   598  necessarily using the ASCII codes to represent them), the following
   599  assignment statements initialize the standard part of the |xchr| array
   600  properly, without needing any system-dependent changes. On the other
   601  hand, it is possible to implement \TeX\ with less complete character
   602  sets, and in such cases it will be necessary to change something here.
   603  @^system dependencies@>
   604  
   605  @<Set init...@>=
   606  xchr[@'40]:=' ';
   607  xchr[@'41]:='!';
   608  xchr[@'42]:='"';
   609  xchr[@'43]:='#';
   610  xchr[@'44]:='$';
   611  xchr[@'45]:='%';
   612  xchr[@'46]:='&';
   613  xchr[@'47]:='''';@/
   614  xchr[@'50]:='(';
   615  xchr[@'51]:=')';
   616  xchr[@'52]:='*';
   617  xchr[@'53]:='+';
   618  xchr[@'54]:=',';
   619  xchr[@'55]:='-';
   620  xchr[@'56]:='.';
   621  xchr[@'57]:='/';@/
   622  xchr[@'60]:='0';
   623  xchr[@'61]:='1';
   624  xchr[@'62]:='2';
   625  xchr[@'63]:='3';
   626  xchr[@'64]:='4';
   627  xchr[@'65]:='5';
   628  xchr[@'66]:='6';
   629  xchr[@'67]:='7';@/
   630  xchr[@'70]:='8';
   631  xchr[@'71]:='9';
   632  xchr[@'72]:=':';
   633  xchr[@'73]:=';';
   634  xchr[@'74]:='<';
   635  xchr[@'75]:='=';
   636  xchr[@'76]:='>';
   637  xchr[@'77]:='?';@/
   638  xchr[@'100]:='@@';
   639  xchr[@'101]:='A';
   640  xchr[@'102]:='B';
   641  xchr[@'103]:='C';
   642  xchr[@'104]:='D';
   643  xchr[@'105]:='E';
   644  xchr[@'106]:='F';
   645  xchr[@'107]:='G';@/
   646  xchr[@'110]:='H';
   647  xchr[@'111]:='I';
   648  xchr[@'112]:='J';
   649  xchr[@'113]:='K';
   650  xchr[@'114]:='L';
   651  xchr[@'115]:='M';
   652  xchr[@'116]:='N';
   653  xchr[@'117]:='O';@/
   654  xchr[@'120]:='P';
   655  xchr[@'121]:='Q';
   656  xchr[@'122]:='R';
   657  xchr[@'123]:='S';
   658  xchr[@'124]:='T';
   659  xchr[@'125]:='U';
   660  xchr[@'126]:='V';
   661  xchr[@'127]:='W';@/
   662  xchr[@'130]:='X';
   663  xchr[@'131]:='Y';
   664  xchr[@'132]:='Z';
   665  xchr[@'133]:='[';
   666  xchr[@'134]:='\';
   667  xchr[@'135]:=']';
   668  xchr[@'136]:='^';
   669  xchr[@'137]:='_';@/
   670  xchr[@'140]:='`';
   671  xchr[@'141]:='a';
   672  xchr[@'142]:='b';
   673  xchr[@'143]:='c';
   674  xchr[@'144]:='d';
   675  xchr[@'145]:='e';
   676  xchr[@'146]:='f';
   677  xchr[@'147]:='g';@/
   678  xchr[@'150]:='h';
   679  xchr[@'151]:='i';
   680  xchr[@'152]:='j';
   681  xchr[@'153]:='k';
   682  xchr[@'154]:='l';
   683  xchr[@'155]:='m';
   684  xchr[@'156]:='n';
   685  xchr[@'157]:='o';@/
   686  xchr[@'160]:='p';
   687  xchr[@'161]:='q';
   688  xchr[@'162]:='r';
   689  xchr[@'163]:='s';
   690  xchr[@'164]:='t';
   691  xchr[@'165]:='u';
   692  xchr[@'166]:='v';
   693  xchr[@'167]:='w';@/
   694  xchr[@'170]:='x';
   695  xchr[@'171]:='y';
   696  xchr[@'172]:='z';
   697  xchr[@'173]:='{';
   698  xchr[@'174]:='|';
   699  xchr[@'175]:='}';
   700  xchr[@'176]:='~';@/
   701  
   702  @ Some of the ASCII codes without visible characters have been given symbolic
   703  names in this program because they are used with a special meaning.
   704  
   705  @d null_code=@'0 {ASCII code that might disappear}
   706  @d carriage_return=@'15 {ASCII code used at end of line}
   707  @d invalid_code=@'177 {ASCII code that many systems prohibit in text files}
   708  
   709  @ The ASCII code is ``standard'' only to a certain extent, since many
   710  computer installations have found it advantageous to have ready access
   711  to more than 94 printing characters. Appendix~C of {\sl The \TeX book\/}
   712  gives a complete specification of the intended correspondence between
   713  characters and \TeX's internal representation.
   714  @:TeXbook}{\sl The \TeX book@>
   715  
   716  If \TeX\ is being used
   717  on a garden-variety \PASCAL\ for which only standard ASCII
   718  codes will appear in the input and output files, it doesn't really matter
   719  what codes are specified in |xchr[0..@'37]|, but the safest policy is to
   720  blank everything out by using the code shown below.
   721  
   722  However, other settings of |xchr| will make \TeX\ more friendly on
   723  computers that have an extended character set, so that users can type things
   724  like `\.^^Z' instead of `\.{\\ne}'. People with extended character sets can
   725  assign codes arbitrarily, giving an |xchr| equivalent to whatever
   726  characters the users of \TeX\ are allowed to have in their input files.
   727  It is best to make the codes correspond to the intended interpretations as
   728  shown in Appendix~C whenever possible; but this is not necessary. For
   729  example, in countries with an alphabet of more than 26 letters, it is
   730  usually best to map the additional letters into codes less than~@'40.
   731  To get the most ``permissive'' character set, change |' '| on the
   732  right of these assignment statements to |chr(i)|.
   733  @^character set dependencies@>
   734  @^system dependencies@>
   735  
   736  @<Set init...@>=
   737  for i:=0 to @'37 do xchr[i]:=' ';
   738  for i:=@'177 to @'377 do xchr[i]:=' ';
   739  
   740  @ The following system-independent code makes the |xord| array contain a
   741  suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]|
   742  where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be
   743  |j| or more; hence, standard ASCII code numbers will be used instead of
   744  codes below @'40 in case there is a coincidence.
   745  
   746  @<Set init...@>=
   747  for i:=first_text_char to last_text_char do xord[chr(i)]:=invalid_code;
   748  for i:=@'200 to @'377 do xord[xchr[i]]:=i;
   749  for i:=0 to @'176 do xord[xchr[i]]:=i;
   750  
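The order of the three loops above is what resolves coincidences in |xchr|. The following Python sketch (not part of TeX; `build_tables` and the ASCII-compatible external character set are assumptions made for illustration) rebuilds both tables in the same order and shows that a blank maps back to code @'40 rather than to one of the blanked-out codes.

```python
INVALID_CODE = 0o177  # plays the role of |invalid_code|


def build_tables():
    """Return (xchr, xord), filled in the same order as the WEB code."""
    # xchr: internal code -> external character. Everything starts blank,
    # which covers codes 0..0o37 and 0o177..0o377.
    xchr = [" "] * 256
    for code in range(0o40, 0o177):  # the visible ASCII codes 0o40..0o176
        xchr[code] = chr(code)  # assumption: external set is ASCII-compatible

    # xord: external character -> internal code. Start with everything invalid.
    xord = {chr(i): INVALID_CODE for i in range(256)}
    for i in range(0o200, 0o400):  # high codes are entered first ...
        xord[xchr[i]] = i
    for i in range(0, 0o177):  # ... so codes 0..0o176 overwrite them in order
        xord[xchr[i]] = i
    return xchr, xord


xchr, xord = build_tables()
```

Because the last loop runs upward, the final writer of `xord[" "]` is code @'40, the largest code below @'177 whose |xchr| entry is a blank.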
   751  @* \[3] Input and output.
   752  The bane of portability is the fact that different operating systems treat
   753  input and output quite differently, perhaps because computer scientists
   754  have not given sufficient attention to this problem. People have felt somehow
   755  that input and output are not part of ``real'' programming. Well, it is true
   756  that some kinds of programming are more fun than others. With existing
   757  input/output conventions being so diverse and so messy, the only sources of
   758  joy in such parts of the code are the rare occasions when one can find a
   759  way to make the program a little less bad than it might have been. We have
   760  two choices, either to attack I/O now and get it over with, or to postpone
   761  I/O until near the end. Neither prospect is very attractive, so let's
   762  get it over with.
   763  
   764  The basic operations we need to do are (1)~inputting and outputting of
   765  text, to or from a file or the user's terminal; (2)~inputting and
   766  outputting of eight-bit bytes, to or from a file; (3)~instructing the
   767  operating system to initiate (``open'') or to terminate (``close'') input or
   768  output from a specified file; (4)~testing whether the end of an input
   769  file has been reached.
   770  
   771  \TeX\ needs to deal with two kinds of files.
   772  We shall use the term |alpha_file| for a file that contains textual data,
   773  and the term |byte_file| for a file that contains eight-bit binary information.
   774  These two types turn out to be the same on many computers, but
   775  sometimes there is a significant distinction, so we shall be careful to
   776  distinguish between them. Standard protocols for transferring
   777  such files from computer to computer, via high-speed networks, are
   778  now becoming available to more and more communities of users.
   779  
   780  The program actually makes use also of a third kind of file, called a
   781  |word_file|, when dumping and reloading base information for its own
   782  initialization.  We shall define a word file later; but it will be possible
   783  for us to specify simple operations on word files before they are defined.
   784  
   785  @<Types...@>=
   786  @!eight_bits=0..255; {unsigned one-byte quantity}
   787  @!alpha_file=packed file of text_char; {files that contain textual data}
   788  @!byte_file=packed file of eight_bits; {files that contain binary data}
   789  
   790  @ Most of what we need to do with respect to input and output can be handled
   791  by the I/O facilities that are standard in \PASCAL, i.e., the routines
   792  called |get|, |put|, |eof|, and so on. But
   793  standard \PASCAL\ does not allow file variables to be associated with file
   794  names that are determined at run time, so it cannot be used to implement
   795  \TeX; some sort of extension to \PASCAL's ordinary |reset| and |rewrite|
   796  is crucial for our purposes. We shall assume that |name_of_file| is a variable
   797  of an appropriate type such that the \PASCAL\ run-time system being used to
   798  implement \TeX\ can open a file whose external name is specified by
   799  |name_of_file|.
   800  @^system dependencies@>
   801  
   802  @<Glob...@>=
   803  @!name_of_file:packed array[1..file_name_size] of char;@;@/
   804    {on some systems this may be a \&{record} variable}
   805  @!name_length:0..file_name_size;@/{this many characters are actually
   806    relevant in |name_of_file| (the rest are blank)}
   807  
   808  @ The \ph\ compiler with which the present version of \TeX\ was prepared has
   809  extended the rules of \PASCAL\ in a very convenient way. To open file~|f|,
   810  we can write
   811  $$\vbox{\halign{#\hfil\qquad&#\hfil\cr
   812  |reset(f,@t\\{name}@>,'/O')|&for input;\cr
   813  |rewrite(f,@t\\{name}@>,'/O')|&for output.\cr}}$$
   814  The `\\{name}' parameter, which is of type `{\bf packed array
   815  $[\langle\\{any}\rangle]$ of \\{char}}', stands for the name of
   816  the external file that is being opened for input or output.
   817  Blank spaces that might appear in \\{name} are ignored.
   818  
   819  The `\.{/O}' parameter tells the operating system not to issue its own
   820  error messages if something goes wrong. If a file of the specified name
   821  cannot be found, or if such a file cannot be opened for some other reason
   822  (e.g., someone may already be trying to write the same file), we will have
   823  |@!erstat(f)<>0| after an unsuccessful |reset| or |rewrite|.  This allows
   824  \TeX\ to undertake appropriate corrective action.
   825  @:PASCAL H}{\ph@>
   826  @^system dependencies@>
   827  
   828  \TeX's file-opening procedures return |false| if no file identified by
   829  |name_of_file| could be opened.
   830  
   831  @d reset_OK(#)==erstat(#)=0
   832  @d rewrite_OK(#)==erstat(#)=0
   833  
   834  @p function a_open_in(var f:alpha_file):boolean;
   835    {open a text file for input}
   836  begin reset(f,name_of_file,'/O'); a_open_in:=reset_OK(f);
   837  end;
   838  @#
   839  function a_open_out(var f:alpha_file):boolean;
   840    {open a text file for output}
   841  begin rewrite(f,name_of_file,'/O'); a_open_out:=rewrite_OK(f);
   842  end;
   843  @#
   844  function b_open_in(var f:byte_file):boolean;
   845    {open a binary file for input}
   846  begin reset(f,name_of_file,'/O'); b_open_in:=reset_OK(f);
   847  end;
   848  @#
   849  function b_open_out(var f:byte_file):boolean;
   850    {open a binary file for output}
   851  begin rewrite(f,name_of_file,'/O'); b_open_out:=rewrite_OK(f);
   852  end;
   853  @#
   854  function w_open_in(var f:word_file):boolean;
   855    {open a word file for input}
   856  begin reset(f,name_of_file,'/O'); w_open_in:=reset_OK(f);
   857  end;
   858  @#
   859  function w_open_out(var f:word_file):boolean;
   860    {open a word file for output}
   861  begin rewrite(f,name_of_file,'/O'); w_open_out:=rewrite_OK(f);
   862  end;
   863  
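On systems without \ph's `/O' switch and |erstat|, the same contract (try to open, report failure quietly, let the caller decide what to do) can be expressed in other ways. Here is a minimal Python sketch of that contract, assuming that returning `None` stands in for the |false| result and that the Latin-1 encoding is an acceptable stand-in for an eight-bit |text_char|; neither choice comes from the original program.

```python
def a_open_in(name_of_file):
    """Open a text file for input; return the file object, or None on failure."""
    # Blanks are dropped, mirroring the rule that blanks in the name are ignored
    # (the fixed-width |name_of_file| array is padded with blanks).
    external_name = name_of_file.replace(" ", "")
    try:
        return open(external_name, "r", encoding="latin-1")
    except OSError:  # missing file, locked file, bad directory, and so on
        return None


def a_open_out(name_of_file):
    """Open a text file for output; return the file object, or None on failure."""
    external_name = name_of_file.replace(" ", "")
    try:
        return open(external_name, "w", encoding="latin-1")
    except OSError:
        return None
```

No message is printed on failure; as with |reset_OK| and |rewrite_OK|, the caller inspects the result and takes corrective action.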
   864  @ Files can be closed with the \ph\ routine `|close(f)|', which
   865  @:PASCAL H}{\ph@>
   866  @^system dependencies@>
   867  should be used when all input or output with respect to |f| has been completed.
   868  This makes |f| available to be opened again, if desired; and if |f| was used for
   869  output, the |close| operation makes the corresponding external file appear
   870  on the user's area, ready to be read.
   871  
   872  These procedures should not generate error messages if a file is
   873  being closed before it has been successfully opened.
   874  
   875  @p procedure a_close(var f:alpha_file); {close a text file}
   876  begin close(f);
   877  end;
   878  @#
   879  procedure b_close(var f:byte_file); {close a binary file}
   880  begin close(f);
   881  end;
   882  @#
   883  procedure w_close(var f:word_file); {close a word file}
   884  begin close(f);
   885  end;
   886  
   887  @ Binary input and output are done with \PASCAL's ordinary |get| and |put|
   888  procedures, so we don't have to make any other special arrangements for
   889  binary~I/O. Text output is also easy to do with standard \PASCAL\ routines.
   890  The treatment of text input is more difficult, however, because
   891  of the necessary translation to |ASCII_code| values.
   892  \TeX's conventions should be efficient, and they should
   893  blend nicely with the user's operating environment.
   894  
   895  @ Input from text files is read one line at a time, using a routine called
   896  |input_ln|. This function is defined in terms of global variables called
   897  |buffer|, |first|, and |last| that will be described in detail later; for
   898  now, it suffices for us to know that |buffer| is an array of |ASCII_code|
   899  values, and that |first| and |last| are indices into this array
   900  representing the beginning and ending of a line of text.
   901  
   902  @<Glob...@>=
   903  @!buffer:array[0..buf_size] of ASCII_code; {lines of characters being read}
   904  @!first:0..buf_size; {the first unused position in |buffer|}
   905  @!last:0..buf_size; {end of the line just input to |buffer|}
   906  @!max_buf_stack:0..buf_size; {largest index used in |buffer|}
   907  
   908  @ The |input_ln| function brings the next line of input from the specified
   909  file into available positions of the buffer array and returns the value
   910  |true|, unless the file has already been entirely read, in which case it
   911  returns |false| and sets |last:=first|.  In general, the |ASCII_code|
   912  numbers that represent the next line of the file are input into
   913  |buffer[first]|, |buffer[first+1]|, \dots, |buffer[last-1]|; and the
   914  global variable |last| is set equal to |first| plus the length of the
   915  line. Trailing blanks are removed from the line; thus, either |last=first|
   916  (in which case the line was entirely blank) or |buffer[last-1]<>" "|.
   917  
   918  An overflow error is given, however, if the normal actions of |input_ln|
   919  would make |last>=buf_size|; this is done so that other parts of \TeX\
   920  can safely look at the contents of |buffer[last+1]| without overstepping
   921  the bounds of the |buffer| array. Upon entry to |input_ln|, the condition
   922  |first<buf_size| will always hold, so that there is always room for an
   923  ``empty'' line.
   924  
   925  The variable |max_buf_stack|, which is used to keep track of how large
   926  the |buf_size| parameter must be to accommodate the present job, is
   927  also kept up to date by |input_ln|.
   928  
   929  If the |bypass_eoln| parameter is |true|, |input_ln| will do a |get|
   930  before looking at the first character of the line; this skips over
   931  an |eoln| that was in |f^|. The procedure does not do a |get| when it
   932  reaches the end of the line; therefore it can be used to acquire input
   933  from the user's terminal as well as from ordinary text files.
   934  
   935  Standard \PASCAL\ says that a file should have |eoln| immediately
   936  before |eof|, but \TeX\ needs only a weaker restriction: If |eof|
   937  occurs in the middle of a line, the system function |eoln| should return
   938  a |true| result (even though |f^| will be undefined).
   939  
   940  Since the inner loop of |input_ln| is part of \TeX's ``inner loop''---each
   941  character of input comes in at this place---it is wise to reduce system
   942  overhead by making use of special routines that read in an entire array
   943  of characters at once, if such routines are available. The following
   944  code uses standard \PASCAL\ to illustrate what needs to be done, but
   945  finer tuning is often possible at well-developed \PASCAL\ sites.
   946  @^inner loop@>
   947  
   948  @p function input_ln(var f:alpha_file;@!bypass_eoln:boolean):boolean;
   949    {inputs the next line or returns |false|}
   950  var last_nonblank:0..buf_size; {|last| with trailing blanks removed}
   951  begin if bypass_eoln then if not eof(f) then get(f);
   952    {input the first character of the line into |f^|}
   953  last:=first; {cf.\ Matthew 19\thinspace:\thinspace30}
   954  if eof(f) then input_ln:=false
   955  else  begin last_nonblank:=first;
   956    while not eoln(f) do
   957      begin if last>=max_buf_stack then
   958        begin max_buf_stack:=last+1;
   959        if max_buf_stack=buf_size then
   960          @<Report overflow of the input buffer, and abort@>;
   961        end;
   962      buffer[last]:=xord[f^]; get(f); incr(last);
   963      if buffer[last-1]<>" " then last_nonblank:=last;
   964      end;
   965    last:=last_nonblank; input_ln:=true;
   966    end;
   967  end;
   968  
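The bookkeeping done by |input_ln| can be tried out in isolation. The Python sketch below is an illustration under several assumptions: `LineBuffer` and `BufferOverflow` are hypothetical names, `readline` replaces the |get|/|eoln| protocol (so |bypass_eoln| is not needed), and `ord` stands in for the |xord| table.

```python
import io


class BufferOverflow(Exception):
    """Raised where TeX would report a buffer size overflow and abort."""


class LineBuffer:
    def __init__(self, buf_size):
        self.buf_size = buf_size
        self.buffer = [0] * (buf_size + 1)  # indices 0..buf_size, as in the WEB
        self.first = 0  # first unused position in the buffer
        self.last = 0  # end of the line just input
        self.max_buf_stack = 0  # largest index used so far

    def input_ln(self, stream):
        """Read the next line into buffer[first..last-1]; False at end of file."""
        self.last = self.first
        line = stream.readline()
        if line == "":  # readline returns "" only at end of file
            return False
        last_nonblank = self.first
        for ch in line.rstrip("\n"):
            if self.last >= self.max_buf_stack:
                self.max_buf_stack = self.last + 1
                if self.max_buf_stack == self.buf_size:
                    raise BufferOverflow("buffer size")
            self.buffer[self.last] = ord(ch)  # assumption: ord replaces xord
            self.last += 1
            if ch != " ":
                last_nonblank = self.last
        self.last = last_nonblank  # trailing blanks are dropped
        return True


lines = LineBuffer(20)
source = io.StringIO("hello   \n\nx")
got_line = lines.input_ln(source)  # reads "hello" and discards three blanks
```

Note that |max_buf_stack| counts the trailing blanks too, since they occupy the buffer before |last| is moved back to |last_nonblank|.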
   969  @ The user's terminal acts essentially like other files of text, except
   970  that it is used both for input and for output. When the terminal is
   971  considered an input file, the file variable is called |term_in|, and when it
   972  is considered an output file the file variable is |term_out|.
   973  @^system dependencies@>
   974  
   975  @<Glob...@>=
   976  @!term_in:alpha_file; {the terminal as an input file}
   977  @!term_out:alpha_file; {the terminal as an output file}
   978  
   979  @ Here is how to open the terminal files
   980  in \ph. The `\.{/I}' switch suppresses the first |get|.
   981  @:PASCAL H}{\ph@>
   982  @^system dependencies@>
   983  
   984  @d t_open_in==reset(term_in,'TTY:','/O/I') {open the terminal for text input}
   985  @d t_open_out==rewrite(term_out,'TTY:','/O') {open the terminal for text output}
   986  
   987  @ Sometimes it is necessary to synchronize the input/output mixture that
   988  happens on the user's terminal, and three system-dependent
   989  procedures are used for this
   990  purpose. The first of these, |update_terminal|, is called when we want
   991  to make sure that everything we have output to the terminal so far has
   992  actually left the computer's internal buffers and been sent.
   993  The second, |clear_terminal|, is called when we wish to cancel any
   994  input that the user may have typed ahead (since we are about to
   995  issue an unexpected error message). The third, |wake_up_terminal|,
   996  is supposed to revive the terminal if the user has disabled it by
   997  some instruction to the operating system.  The following macros show how
   998  these operations can be specified in \ph:
   999  @:PASCAL H}{\ph@>
  1000  @^system dependencies@>
  1001  
  1002  @d update_terminal == break(term_out) {empty the terminal output buffer}
  1003  @d clear_terminal == break_in(term_in,true) {clear the terminal input buffer}
  1004  @d wake_up_terminal == do_nothing {cancel the user's cancellation of output}
  1005  
  1006  @ We need a special routine to read the first line of \TeX\ input from
  1007  the user's terminal. This line is different because it is read before we
  1008  have opened the transcript file; there is sort of a ``chicken and
  1009  egg'' problem here. If the user types `\.{\\input paper}' on the first
  1010  line, or if some macro invoked by that line does such an \.{\\input},
  1011  the transcript file will be named `\.{paper.log}'; but if no \.{\\input}
  1012  commands are performed during the first line of terminal input, the transcript
  1013  file will acquire its default name `\.{texput.log}'. (The transcript file
  1014  will not contain error messages generated by the first line before the
  1015  first \.{\\input} command.)
  1016  @.texput@>
  1017  
  1018  The first line is even more special if we are lucky enough to have an operating
  1019  system that treats \TeX\ differently from a run-of-the-mill \PASCAL\ object
  1020  program. It's nice to let the user start running a \TeX\ job by typing
  1021  a command line like `\.{tex paper}'; in such a case, \TeX\ will operate
  1022  as if the first line of input were `\.{paper}', i.e., the first line will
  1023  consist of the remainder of the command line, after the part that invoked
  1024  \TeX.
  1025  
  1026  The first line is special also because it may be read before \TeX\ has
  1027  input a format file. In such cases, normal error messages cannot yet
  1028  be given. The following code uses concepts that will be explained later.
  1029  (If the \PASCAL\ compiler does not support non-local |@!goto|\unskip, the
  1030  @^system dependencies@>
  1031  statement `|goto final_end|' should be replaced by something that
  1032  quietly terminates the program.)
  1033  
  1034  @<Report overflow of the input buffer, and abort@>=
  1035  if format_ident=0 then
  1036    begin write_ln(term_out,'Buffer size exceeded!'); goto final_end;
  1037  @.Buffer size exceeded@>
  1038    end
  1039  else begin cur_input.loc_field:=first; cur_input.limit_field:=last-1;
  1040    overflow("buffer size",buf_size);
  1041  @:TeX capacity exceeded buffer size}{\quad buffer size@>
  1042    end
  1043  
  1044  @ Different systems have different ways to get started. But regardless of
  1045  what conventions are adopted, the routine that initializes the terminal
  1046  should satisfy the following specifications:
  1047  
  1048  \yskip\textindent{1)}It should open file |term_in| for input from the
  1049    terminal. (The file |term_out| will already be open for output to the
  1050    terminal.)
  1051  
  1052  \textindent{2)}If the user has given a command line, this line should be
  1053    considered the first line of terminal input. Otherwise the
  1054    user should be prompted with `\.{**}', and the first line of input
  1055    should be whatever is typed in response.
  1056  
  1057  \textindent{3)}The first line of input, which might or might not be a
  1058    command line, should appear in locations |first| to |last-1| of the
  1059    |buffer| array.
  1060  
  1061  \textindent{4)}The global variable |loc| should be set so that the
  1062    character to be read next by \TeX\ is in |buffer[loc]|. This
  1063    character should not be blank, and we should have |loc<last|.
  1064  
  1065  \yskip\noindent(It may be necessary to prompt the user several times
  1066  before a non-blank line comes in. The prompt is `\.{**}' instead of the
  1067  later `\.*' because the meaning is slightly different: `\.{\\input}' need
  1068  not be typed immediately after~`\.{**}'.)
  1069  
  1070  @d loc==cur_input.loc_field {location of first unread character in |buffer|}
  1071  
  1072  @ The following program does the required initialization
  1073  without retrieving a possible command line.
  1074  It should be clear how to modify this routine to deal with command lines,
  1075  if the system permits them.
  1076  @^system dependencies@>
  1077  
  1078  @p function init_terminal:boolean; {gets the terminal input started}
  1079  label exit;
  1080  begin t_open_in;
  1081  loop@+begin wake_up_terminal; write(term_out,'**'); update_terminal;
  1082  @.**@>
  1083    if not input_ln(term_in,true) then {this shouldn't happen}
  1084      begin write_ln(term_out);
  1085      write(term_out,'! End of file on the terminal... why?');
  1086  @.End of file on the terminal@>
  1087      init_terminal:=false; return;
  1088      end;
  1089    loc:=first;
  1090    while (loc<last)and(buffer[loc]=" ") do incr(loc);
  1091    if loc<last then
  1092      begin init_terminal:=true;
  1093      return; {return unless the line was all blank}
  1094      end;
  1095    write_ln(term_out,'Please type the name of your input file.');
  1096    end;
  1097  exit:end;
  1098  
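The control flow of |init_terminal| (prompt, read, skip leading blanks, repeat on a blank line) is sketched below in Python. This is an illustration only: the function takes its input and output streams as parameters, returns the line together with |loc| instead of setting globals, and returns `None` where the original returns |false|; all of these are assumptions of the sketch.

```python
import io


def init_terminal(term_in, term_out):
    """Prompt with '**' until a non-blank line arrives; return (line, loc)."""
    while True:
        term_out.write("**")
        term_out.flush()  # plays the role of update_terminal
        line = term_in.readline()
        if line == "":  # end of file on the terminal: this shouldn't happen
            term_out.write("\n! End of file on the terminal... why?")
            return None
        text = line.rstrip("\n").rstrip(" ")  # input_ln drops trailing blanks
        loc = 0
        while loc < len(text) and text[loc] == " ":
            loc += 1  # skip leading blanks
        if loc < len(text):
            return text, loc  # text[loc] is the first non-blank character
        term_out.write("Please type the name of your input file.\n")


out = io.StringIO()
result = init_terminal(io.StringIO("   \n  \\input paper  \n"), out)
```

Here the first line is all blank, so the user is prompted a second time before the `\input paper` line is accepted.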
  1099  @* \[4] String handling.
  1100  Control sequence names and diagnostic messages are variable-length strings
  1101  of eight-bit characters. Since \PASCAL\ does not have a well-developed string
  1102  mechanism, \TeX\ does all of its string processing by homegrown methods.
  1103  
  1104  Elaborate facilities for dynamic strings are not needed, so all of the
  1105  necessary operations can be handled with a simple data structure.
  1106  The array |str_pool| contains all of the (eight-bit) ASCII codes in all
  1107  of the strings, and the array |str_start| contains indices of the starting
  1108  points of each string. Strings are referred to by integer numbers, so that
  1109  string number |s| comprises the characters |str_pool[j]| for
  1110  |str_start[s]<=j<str_start[s+1]|. Additional integer variables
  1111  |pool_ptr| and |str_ptr| indicate the number of entries used so far
  1112  in |str_pool| and |str_start|, respectively; locations
  1113  |str_pool[pool_ptr]| and |str_start[str_ptr]| are
  1114  ready for the next string to be allocated.
  1115  
  1116  String numbers 0 to 255 are reserved for strings that correspond to single
  1117  ASCII characters. This is in accordance with the conventions of \.{WEB},
  1118  @.WEB@>
  1119  which converts single-character strings into the ASCII code number of the
  1120  single character involved, while it converts other strings into integers
  1121  and builds a string pool file. Thus, when the string constant \.{"."} appears
  1122  in the program below, \.{WEB} converts it into the integer 46, which is the
  1123  ASCII code for a period, while \.{WEB} will convert a string like \.{"hello"}
  1124  into some integer greater than~255. String number 46 will presumably be the
  1125  single character `\..'; but some ASCII codes have no standard visible
  1126  representation, and \TeX\ sometimes needs to be able to print an arbitrary
  1127  ASCII character, so the first 256 strings are used to specify exactly what
  1128  should be printed for each of the 256 possibilities.
  1129  
  1130  Elements of the |str_pool| array must be ASCII codes that can actually
  1131  be printed; i.e., they must have an |xchr| equivalent in the local
  1132  character set. (This restriction applies only to preloaded strings,
  1133  not to those generated dynamically by the user.)
  1134  
  1135  Some \PASCAL\ compilers won't pack integers into a single byte unless the
  1136  integers lie in the range |-128..127|. To accommodate such systems
  1137  we access the string pool only via macros that can easily be redefined.
  1138  @^system dependencies@>
  1139  
  1140  @d si(#) == # {convert from |ASCII_code| to |packed_ASCII_code|}
  1141  @d so(#) == # {convert from |packed_ASCII_code| to |ASCII_code|}
  1142  
  1143  @<Types...@>=
  1144  @!pool_pointer = 0..pool_size; {for variables that point into |str_pool|}
  1145  @!str_number = 0..max_strings; {for variables that point into |str_start|}
  1146  @!packed_ASCII_code = 0..255; {elements of |str_pool| array}
  1147  
  1148  @ @<Glob...@>=
  1149  @!str_pool:packed array[pool_pointer] of packed_ASCII_code; {the characters}
  1150  @!str_start : array[str_number] of pool_pointer; {the starting pointers}
  1151  @!pool_ptr : pool_pointer; {first unused position in |str_pool|}
  1152  @!str_ptr : str_number; {number of the current string being created}
  1153  @!init_pool_ptr : pool_pointer; {the starting value of |pool_ptr|}
  1154  @!init_str_ptr : str_number; {the starting value of |str_ptr|}
  1155  
  1156  @ Several of the elementary string operations are performed using \.{WEB}
  1157  macros instead of \PASCAL\ procedures, because many of the
  1158  operations are done quite frequently and we want to avoid the
  1159  overhead of procedure calls. For example, here is
  1160  a simple macro that computes the length of a string.
  1161  @.WEB@>
  1162  
  1163  @d length(#)==(str_start[#+1]-str_start[#]) {the number of characters
  1164    in string number \#}
  1165  
  1166  @ The length of the current string is called |cur_length|:
  1167  
  1168  @d cur_length == (pool_ptr - str_start[str_ptr])
  1169  
  1170  @ Strings are created by appending character codes to |str_pool|.
  1171  The |append_char| macro, defined here, does not check to see if the
  1172  value of |pool_ptr| has gotten too high; this test is supposed to be
  1173  made before |append_char| is used. There is also a |flush_char|
  1174  macro, which erases the last character appended.
  1175  
  1176  To test if there is room to append |l| more characters to |str_pool|,
  1177  we shall write |str_room(l)|, which aborts \TeX\ and gives an
  1178  apologetic error message if there isn't enough room.
  1179  
  1180  @d append_char(#) == {put |ASCII_code| \# at the end of |str_pool|}
  1181  begin str_pool[pool_ptr]:=si(#); incr(pool_ptr);
  1182  end
  1183  @d flush_char == decr(pool_ptr) {forget the last character in the pool}
  1184  @d str_room(#) == {make sure that the pool hasn't overflowed}
  1185    begin if pool_ptr+# > pool_size then
  1186    overflow("pool size",pool_size-init_pool_ptr);
  1187  @:TeX capacity exceeded pool size}{\quad pool size@>
  1188    end
  1189  
  1190  @ Once a sequence of characters has been appended to |str_pool|, it
  1191  officially becomes a string when the function |make_string| is called.
  1192  This function returns the identification number of the new string as its
  1193  value.
  1194  
  1195  @p function make_string : str_number; {current string enters the pool}
  1196  begin if str_ptr=max_strings then
  1197    overflow("number of strings",max_strings-init_str_ptr);
  1198  @:TeX capacity exceeded number of strings}{\quad number of strings@>
  1199  incr(str_ptr); str_start[str_ptr]:=pool_ptr;
  1200  make_string:=str_ptr-1;
  1201  end;
  1202  
  1203  @ To destroy the most recently made string, we say |flush_string|.
  1204  
  1205  @d flush_string==begin decr(str_ptr); pool_ptr:=str_start[str_ptr];
  1206    end
  1207  
  1208  @ The following subroutine compares string |s| with another string of the
  1209  same length that appears in |buffer| starting at position |k|;
  1210  the result is |true| if and only if the strings are equal.
  1211  Empirical tests indicate that |str_eq_buf| is used in such a way that
  1212  it tends to return |true| about 80 percent of the time.
  1213  
  1214  @p function str_eq_buf(@!s:str_number;@!k:integer):boolean;
  1215    {test equality of strings}
  1216  label not_found; {loop exit}
  1217  var j: pool_pointer; {running index}
  1218  @!result: boolean; {result of comparison}
  1219  begin j:=str_start[s];
  1220  while j<str_start[s+1] do
  1221    begin if so(str_pool[j])<>buffer[k] then
  1222      begin result:=false; goto not_found;
  1223      end;
  1224    incr(j); incr(k);
  1225    end;
  1226  result:=true;
  1227  not_found: str_eq_buf:=result;
  1228  end;
  1229  
  1230  @ Here is a similar routine, but it compares two strings in the string pool,
  1231  and it does not assume that they have the same length.
  1232  
  1233  @p function str_eq_str(@!s,@!t:str_number):boolean;
  1234    {test equality of strings}
  1235  label not_found; {loop exit}
  1236  var j,@!k: pool_pointer; {running indices}
  1237  @!result: boolean; {result of comparison}
  1238  begin result:=false;
  1239  if length(s)<>length(t) then goto not_found;
  1240  j:=str_start[s]; k:=str_start[t];
  1241  while j<str_start[s+1] do
  1242    begin if str_pool[j]<>str_pool[k] then goto not_found;
  1243    incr(j); incr(k);
  1244    end;
  1245  result:=true;
  1246  not_found: str_eq_str:=result;
  1247  end;
  1248  
  1249  @ The initial values of |str_pool|, |str_start|, |pool_ptr|,
  1250  and |str_ptr| are computed by the \.{INITEX} program, based in part
  1251  on the information that \.{WEB} has output while processing \TeX.
  1252  @.INITEX@>
  1253  @^string pool@>
  1254  
  1255  @p @!init function get_strings_started:boolean; {initializes the string pool,
  1256    but returns |false| if something goes wrong}
  1257  label done,exit;
  1258  var k,@!l:0..255; {small indices or counters}
  1259  @!m,@!n:text_char; {characters input from |pool_file|}
  1260  @!g:str_number; {garbage}
  1261  @!a:integer; {accumulator for check sum}
  1262  @!c:boolean; {check sum has been checked}
  1263  begin pool_ptr:=0; str_ptr:=0; str_start[0]:=0;
  1264  @<Make the first 256 strings@>;
  1265  @<Read the other strings from the \.{TEX.POOL} file and return |true|,
  1266    or give an error message and return |false|@>;
  1267  exit:end;
  1268  tini
  1269  
  1270  @ @d app_lc_hex(#)==l:=#;
  1271    if l<10 then append_char(l+"0")@+else append_char(l-10+"a")
  1272  
  1273  @<Make the first 256...@>=
  1274  for k:=0 to 255 do
  1275    begin if (@<Character |k| cannot be printed@>) then
  1276      begin append_char("^"); append_char("^");
  1277      if k<@'100 then append_char(k+@'100)
  1278      else if k<@'200 then append_char(k-@'100)
  1279      else begin app_lc_hex(k div 16); app_lc_hex(k mod 16);
  1280        end;
  1281      end
  1282    else append_char(k);
  1283    g:=make_string;
  1284    end
  1285  
  1286  @ The first 128 strings will contain 95 standard ASCII characters, and the
  1287  other 33 characters will be printed in three-symbol form like `\.{\^\^A}'
  1288  unless a system-dependent change is made here. Installations that have
  1289  an extended character set, where for example |xchr[@'32]=@t\.{\'^^Z\'}@>|,
  1290  would like string @'32 to be the single character @'32 instead of the
  1291  three characters @'136, @'136, @'132 (\.{\^\^Z}). On the other hand,
  1292  even people with an extended character set will want to represent string
  1293  @'15 by \.{\^\^M}, since @'15 is |carriage_return|; the idea is to
  1294  produce visible strings instead of tabs or line-feeds or carriage-returns
  1295  or bell-rings or characters that are treated anomalously in text files.
  1296  
  1297  Unprintable characters of codes 128--255 are, similarly, rendered
  1298  \.{\^\^80}--\.{\^\^ff}.
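
        For example, under the default test given below, string |@'15| consists
        of the three characters \.{\^\^M}, because |@'15+@'100=@'115="M"|;
        string |@'177| is \.{\^\^?}, because |@'177-@'100=@'77="?"|; and
        string 255 is the four characters \.{\^\^ff}, because |255 div 16=15|
        and |255 mod 16=15| each yield the hexadecimal digit \.f via |app_lc_hex|.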
  1299  
  1300  The boolean expression defined here should be |true| unless \TeX\
  1301  internal code number~|k| corresponds to a non-troublesome visible
  1302  symbol in the local character set.  An appropriate formula for the
  1303  extended character set recommended in {\sl The \TeX book\/} would, for
  1304  example, be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177..@'377]|'.
  1305  If character |k| cannot be printed, and |k<@'200|, then character |k+@'100| or
  1306  |k-@'100| must be printable; moreover, ASCII codes |[@'41..@'46,
  1307  @'60..@'71, @'136, @'141..@'146, @'160..@'171]| must be printable.
  1308  Thus, at least 80 printable characters are needed.
  1309  @:TeXbook}{\sl The \TeX book@>
  1310  @^character set dependencies@>
  1311  @^system dependencies@>
  1312  
  1313  @<Character |k| cannot be printed@>=
  1314    (k<" ")or(k>"~")
  1315  
  1316  @ When the \.{WEB} system program called \.{TANGLE} processes the \.{TEX.WEB}
  1317  description that you are now reading, it outputs the \PASCAL\ program
  1318  \.{TEX.PAS} and also a string pool file called \.{TEX.POOL}. The \.{INITEX}
  1319  @.WEB@>@.INITEX@>
  1320  program reads the latter file, where each string appears as a two-digit decimal
  1321  length followed by the string itself, and the information is recorded in
  1322  \TeX's string memory.
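
        For instance, suppose (as a hypothetical example) that a pool line
        reads \.{11buffer size}. Its first two characters satisfy
        |xord[m]=xord[n]="1"|, and the length is recovered from the identity
        $$\vbox{\halign{#\hfil\cr
        |(xord[m]-"0")*10+(xord[n]-"0")=xord[m]*10+xord[n]-"0"*11|\cr}}$$
        which here gives $49\cdot10+49-48\cdot11=11$, the number of characters
        in `\.{buffer size}'. The last line of the file consists of `\.*'
        followed by the nine digits of the check sum.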
  1323  
  1324  @<Glob...@>=
  1325  @!init @!pool_file:alpha_file; {the string-pool file output by \.{TANGLE}}
  1326  tini
  1327  
  1328  @ @d bad_pool(#)==begin wake_up_terminal; write_ln(term_out,#);
  1329    a_close(pool_file); get_strings_started:=false; return;
  1330    end
  1331  @<Read the other strings...@>=
  1332  name_of_file:=pool_name; {we needn't set |name_length|}
  1333  if a_open_in(pool_file) then
  1334    begin c:=false;
  1335    repeat @<Read one string, but return |false| if the
  1336      string memory space is getting too tight for comfort@>;
  1337    until c;
  1338    a_close(pool_file); get_strings_started:=true;
  1339    end
  1340  else  bad_pool('! I can''t read TEX.POOL.')
  1341  @.I can't read TEX.POOL@>
  1342  
  1343  @ @<Read one string...@>=
  1344  begin if eof(pool_file) then bad_pool('! TEX.POOL has no check sum.');
  1345  @.TEX.POOL has no check sum@>
  1346  read(pool_file,m,n); {read two digits of string length}
  1347  if m='*' then @<Check the pool check sum@>
  1348  else  begin if (xord[m]<"0")or(xord[m]>"9")or@|
  1349        (xord[n]<"0")or(xord[n]>"9") then
  1350      bad_pool('! TEX.POOL line doesn''t begin with two digits.');
  1351  @.TEX.POOL line doesn't...@>
  1352    l:=xord[m]*10+xord[n]-"0"*11; {compute the length}
  1353    if pool_ptr+l+string_vacancies>pool_size then
  1354      bad_pool('! You have to increase POOLSIZE.');
  1355  @.You have to increase POOLSIZE@>
  1356    for k:=1 to l do
  1357      begin if eoln(pool_file) then m:=' '@+else read(pool_file,m);
  1358      append_char(xord[m]);
  1359      end;
  1360    read_ln(pool_file); g:=make_string;
  1361    end;
  1362  end
  1363  
  1364  @ The \.{WEB} operation \.{@@\$} denotes the value that should be at the
  1365  end of this \.{TEX.POOL} file; any other value means that the wrong pool
  1366  file has been loaded.
  1367  @^check sum@>
  1368  
  1369  @<Check the pool check sum@>=
  1370  begin a:=0; k:=1;
  1371  loop@+  begin if (xord[n]<"0")or(xord[n]>"9") then
  1372    bad_pool('! TEX.POOL check sum doesn''t have nine digits.');
  1373  @.TEX.POOL check sum...@>
  1374    a:=10*a+xord[n]-"0";
  1375    if k=9 then goto done;
  1376    incr(k); read(pool_file,n);
  1377    end;
  1378  done: if a<>@$ then bad_pool('! TEX.POOL doesn''t match; TANGLE me again.');
  1379  @.TEX.POOL doesn't match@>
  1380  c:=true;
  1381  end
  1382  
  1383  @* \[5] On-line and off-line printing.
  1384  Messages that are sent to a user's terminal and to the transcript-log file
  1385  are produced by several `|print|' procedures. These procedures will
  1386  direct their output to a variety of places, based on the setting of
  1387  the global variable |selector|, which has the following possible
  1388  values:
  1389  
  1390  \yskip
  1391  \hang |term_and_log|, the normal setting, prints on the terminal and on the
  1392    transcript file.
  1393  
  1394  \hang |log_only|, prints only on the transcript file.
  1395  
  1396  \hang |term_only|, prints only on the terminal.
  1397  
  1398  \hang |no_print|, doesn't print at all. This is used only in rare cases
  1399    before the transcript file is open.
  1400  
  1401  \hang |pseudo|, puts output into a cyclic buffer that is used
  1402    by the |show_context| routine; when we get to that routine we shall discuss
  1403    the reasoning behind this curious mode.
  1404  
  1405  \hang |new_string|, appends the output to the current string in the
  1406    string pool.
  1407  
  1408  \hang 0 to 15, prints on one of the sixteen files for \.{\\write} output.
  1409  
  1410  \yskip
  1411  \noindent The symbolic names `|term_and_log|', etc., have been assigned
  1412  numeric codes that satisfy the convenient relations |no_print+1=term_only|,
  1413  |no_print+2=log_only|, |term_only+2=log_only+1=term_and_log|.
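
        For example, among the four settings from |no_print| to |term_and_log|,
        the test |odd(selector)| picks out |term_only=17| and |term_and_log=19|,
        the two that reach the terminal, and the test |selector>=log_only|
        picks out |log_only=18| and |term_and_log=19|, the two that reach the
        transcript file; |print_nl| relies on both tests. Similarly,
        |decr(selector)| turns |term_and_log| into |log_only| and |term_only|
        into |no_print|, which is how |term_input| echoes a typed line on the
        transcript file only.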
  1414  
  1415  Three additional global variables, |tally| and |term_offset| and
  1416  |file_offset|, record the number of characters that have been printed
  1417  since they were most recently cleared to zero. We use |tally| to record
  1418  the length of (possibly very long) stretches of printing; |term_offset|
  1419  and |file_offset|, on the other hand, keep track of how many characters
  1420  have appeared so far on the current line that has been output to the
  1421  terminal or to the transcript file, respectively.
  1422  
  1423  @d no_print=16 {|selector| setting that makes data disappear}
  1424  @d term_only=17 {printing is destined for the terminal only}
  1425  @d log_only=18 {printing is destined for the transcript file only}
  1426  @d term_and_log=19 {normal |selector| setting}
  1427  @d pseudo=20 {special |selector| setting for |show_context|}
  1428  @d new_string=21 {printing is deflected to the string pool}
  1429  @d max_selector=21 {highest selector setting}
  1430  
  1431  @<Glob...@>=
  1432  @!log_file : alpha_file; {transcript of \TeX\ session}
  1433  @!selector : 0..max_selector; {where to print a message}
  1434  @!dig : array[0..22] of 0..15; {digits in a number being output}
  1435  @!tally : integer; {the number of characters recently printed}
  1436  @!term_offset : 0..max_print_line;
  1437    {the number of characters on the current terminal line}
  1438  @!file_offset : 0..max_print_line;
  1439    {the number of characters on the current file line}
  1440  @!trick_buf:array[0..error_line] of ASCII_code; {circular buffer for
  1441    pseudoprinting}
  1442  @!trick_count: integer; {threshold for pseudoprinting, explained later}
  1443  @!first_count: integer; {another variable for pseudoprinting}
  1444  
  1445  @ @<Initialize the output routines@>=
  1446  selector:=term_only; tally:=0; term_offset:=0; file_offset:=0;
  1447  
  1448  @ Macro abbreviations for output to the terminal and to the log file are
  1449  defined here for convenience. Some systems need special conventions
  1450  for terminal output, and it is possible to adhere to those conventions
  1451  by changing |wterm|, |wterm_ln|, and |wterm_cr| in this section.
  1452  @^system dependencies@>
  1453  
  1454  @d wterm(#)==write(term_out,#)
  1455  @d wterm_ln(#)==write_ln(term_out,#)
  1456  @d wterm_cr==write_ln(term_out)
  1457  @d wlog(#)==write(log_file,#)
  1458  @d wlog_ln(#)==write_ln(log_file,#)
  1459  @d wlog_cr==write_ln(log_file)
  1460  
  1461  @ To end a line of text output, we call |print_ln|.
  1462  
  1463  @<Basic print...@>=
  1464  procedure print_ln; {prints an end-of-line}
  1465  begin case selector of
  1466  term_and_log: begin wterm_cr; wlog_cr;
  1467    term_offset:=0; file_offset:=0;
  1468    end;
  1469  log_only: begin wlog_cr; file_offset:=0;
  1470    end;
  1471  term_only: begin wterm_cr; term_offset:=0;
  1472    end;
  1473  no_print,pseudo,new_string: do_nothing;
  1474  othercases write_ln(write_file[selector])
  1475  endcases;@/
  1476  end; {|tally| is not affected}
  1477  
  1478  @ The |print_char| procedure sends one character to the desired destination,
  1479  using the |xchr| array to map it into an external character compatible with
  1480  |input_ln|. All printing comes through |print_ln| or |print_char|.
  1481  
  1482  @<Basic printing...@>=
  1483  procedure print_char(@!s:ASCII_code); {prints a single character}
  1484  label exit;
  1485  begin if @<Character |s| is the current new-line character@> then
  1486   if selector<pseudo then
  1487    begin print_ln; return;
  1488    end;
  1489  case selector of
  1490  term_and_log: begin wterm(xchr[s]); wlog(xchr[s]);
  1491    incr(term_offset); incr(file_offset);
  1492    if term_offset=max_print_line then
  1493      begin wterm_cr; term_offset:=0;
  1494      end;
  1495    if file_offset=max_print_line then
  1496      begin wlog_cr; file_offset:=0;
  1497      end;
  1498    end;
  1499  log_only: begin wlog(xchr[s]); incr(file_offset);
  1500    if file_offset=max_print_line then print_ln;
  1501    end;
  1502  term_only: begin wterm(xchr[s]); incr(term_offset);
  1503    if term_offset=max_print_line then print_ln;
  1504    end;
  1505  no_print: do_nothing;
  1506  pseudo: if tally<trick_count then trick_buf[tally mod error_line]:=s;
  1507  new_string: begin if pool_ptr<pool_size then append_char(s);
  1508    end; {we drop characters if the string space is full}
  1509  othercases write(write_file[selector],xchr[s])
  1510  endcases;@/
  1511  incr(tally);
  1512  exit:end;
  1513  
  1514  @ An entire string is output by calling |print|. Note that if we are outputting
  1515  the single standard ASCII character \.c, we could call |print("c")|, since
  1516  |"c"=99| is the number of a single-character string, as explained above. But
  1517  |print_char("c")| is quicker, so \TeX\ goes directly to the |print_char|
  1518  routine when it knows that this is safe. (The present implementation
  1519  assumes that it is always safe to print a visible ASCII character.)
  1520  @^system dependencies@>
  1521  
  1522  @<Basic print...@>=
  1523  procedure print(@!s:integer); {prints string |s|}
  1524  label exit;
  1525  var j:pool_pointer; {current character code position}
  1526  @!nl:integer; {new-line character to restore}
  1527  begin if s>=str_ptr then s:="???" {this can't happen}
  1528  @.???@>
  1529  else if s<256 then
  1530    if s<0 then s:="???" {can't happen}
  1531    else begin if selector>pseudo then
  1532        begin print_char(s); return; {internal strings are not expanded}
  1533        end;
  1534      if (@<Character |s| is the current new-line character@>) then
  1535        if selector<pseudo then
  1536          begin print_ln; return;
  1537          end;
  1538      nl:=new_line_char; new_line_char:=-1;
  1539        {temporarily disable new-line character}
  1540      j:=str_start[s];
  1541      while j<str_start[s+1] do
  1542        begin print_char(so(str_pool[j])); incr(j);
  1543        end;
  1544      new_line_char:=nl; return;
  1545      end;
  1546  j:=str_start[s];
  1547  while j<str_start[s+1] do
  1548    begin print_char(so(str_pool[j])); incr(j);
  1549    end;
  1550  exit:end;
  1551  
  1552  @ Control sequence names, file names, and strings constructed with
  1553  \.{\\string} might contain |ASCII_code| values that can't
  1554  be printed using |print_char|. Therefore we use |slow_print| for them:
  1555  
  1556  @<Basic print...@>=
  1557  procedure slow_print(@!s:integer); {prints string |s|}
  1558  var j:pool_pointer; {current character code position}
  1559  begin if (s>=str_ptr) or (s<256) then print(s)
  1560  else begin j:=str_start[s];
  1561    while j<str_start[s+1] do
  1562      begin print(so(str_pool[j])); incr(j);
  1563      end;
  1564    end;
  1565  end;
  1566  
  1567  @ Here is the very first thing that \TeX\ prints: a headline that identifies
  1568  the version number and format package. The |term_offset| variable is temporarily
  1569  incorrect, but the discrepancy is not serious since we assume that this
  1570  part of the program is system dependent.
  1571  @^system dependencies@>
  1572  
  1573  @<Initialize the output...@>=
  1574  wterm(banner);
  1575  if format_ident=0 then wterm_ln(' (no format preloaded)')
  1576  else  begin slow_print(format_ident); print_ln;
  1577    end;
  1578  update_terminal;
  1579  
  1580  @ The procedure |print_nl| is like |print|, but it makes sure that the
  1581  string appears at the beginning of a new line.
  1582  
  1583  @<Basic print...@>=
  1584  procedure print_nl(@!s:str_number); {prints string |s| at beginning of line}
  1585  begin if ((term_offset>0)and(odd(selector)))or@|
  1586    ((file_offset>0)and(selector>=log_only)) then print_ln;
  1587  print(s);
  1588  end;
  1589  
  1590  @ The procedure |print_esc| prints a string that is preceded by
  1591  the user's escape character (which is usually a backslash).
  1592  
  1593  @<Basic print...@>=
  1594  procedure print_esc(@!s:str_number); {prints escape character, then |s|}
  1595  var c:integer; {the escape character code}
  1596  begin  @<Set variable |c| to the current escape character@>;
  1597  if c>=0 then if c<256 then print(c);
  1598  slow_print(s);
  1599  end;
  1600  
  1601  @ An array of digits in the range |0..15| is printed by |print_the_digs|.
  1602  
  1603  @<Basic print...@>=
  1604  procedure print_the_digs(@!k:eight_bits);
  1605    {prints |dig[k-1]|$\,\ldots\,$|dig[0]|}
  1606  begin while k>0 do
  1607    begin decr(k);
  1608    if dig[k]<10 then print_char("0"+dig[k])
  1609    else print_char("A"-10+dig[k]);
  1610    end;
  1611  end;
  1612  
  1613  @ The following procedure, which prints out the decimal representation of a
  1614  given integer |n|, has been written carefully so that it works properly
  1615  if |n=0| or if |(-n)| would cause overflow. It does not apply |mod| or |div|
  1616  to negative arguments, since such operations are not implemented consistently
  1617  by all \PASCAL\ compilers.
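
        For example, |n=-1000000000| takes the cautious branch: |m=999999999|,
        so |n| becomes |99999999| and |m| becomes |9+1=10|; since |m=10|, the
        procedure sets |dig[0]:=0| and increases |n| to |100000000|, and the
        digits printed after the minus sign are \.{1000000000}. Assuming
        32-bit integers (an assumption about the local \PASCAL, not a
        requirement of this program), $n=-2^{31}$ gives |m=2147483647|, then
        |n=214748364| and |dig[0]=8|, so \.{-2147483648} is printed without
        ever negating~|n|.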
  1618  
  1619  @<Basic print...@>=
  1620  procedure print_int(@!n:integer); {prints an integer in decimal form}
  1621  var k:0..23; {index to current digit; we assume that $\vert n\vert<10^{23}$}
  1622  @!m:integer; {used to negate |n| in possibly dangerous cases}
  1623  begin k:=0;
  1624  if n<0 then
  1625    begin print_char("-");
  1626    if n>-100000000 then negate(n)
  1627    else  begin m:=-1-n; n:=m div 10; m:=(m mod 10)+1; k:=1;
  1628      if m<10 then dig[0]:=m
  1629      else  begin dig[0]:=0; incr(n);
  1630        end;
  1631      end;
  1632    end;
  1633  repeat dig[k]:=n mod 10; n:=n div 10; incr(k);
  1634  until n=0;
  1635  print_the_digs(k);
  1636  end;
  1637  
  1638  @ Here is a trivial procedure to print two digits; it is usually called with
  1639  a parameter in the range |0<=n<=99|.
  1640  
  1641  @p procedure print_two(@!n:integer); {prints two least significant digits}
  1642  begin n:=abs(n) mod 100; print_char("0"+(n div 10));
  1643  print_char("0"+(n mod 10));
  1644  end;
  1645  
  1646  @ Hexadecimal printing of nonnegative integers is accomplished by |print_hex|.
  1647  
  1648  @p procedure print_hex(@!n:integer);
  1649    {prints a nonnegative integer in hexadecimal form}
  1650  var k:0..22; {index to current digit; we assume that $0\L n<16^{22}$}
  1651  begin k:=0; print_char("""");
  1652  repeat dig[k]:=n mod 16; n:=n div 16; incr(k);
  1653  until n=0;
  1654  print_the_digs(k);
  1655  end;
  1656  
  1657  @ Old versions of \TeX\ needed a procedure called |print_ASCII| whose function
  1658  is now subsumed by |print|. We retain the old name here as a possible aid to
  1659  future software arch\ae ologists.
  1660  
  1661  @d print_ASCII == print
  1662  
  1663  @ Roman numerals are produced by the |print_roman_int| routine.  Readers
  1664  who like puzzles might enjoy trying to figure out how this tricky code
  1665  works; therefore no explanation will be given. Notice that 1990 yields
  1666  \.{mcmxc}, not \.{mxm}.
  1667  
  1668  @p procedure print_roman_int(@!n:integer);
  1669  label exit;
  1670  var j,@!k: pool_pointer; {mysterious indices into |str_pool|}
  1671  @!u,@!v: nonnegative_integer; {mysterious numbers}
  1672  begin j:=str_start["m2d5c2l5x2v5i"]; v:=1000;
  1673  loop@+  begin while n>=v do
  1674      begin print_char(so(str_pool[j])); n:=n-v;
  1675      end;
  1676    if n<=0 then return; {nonpositive input produces no output}
  1677    k:=j+2; u:=v div (so(str_pool[k-1])-"0");
  1678    if str_pool[k-1]=si("2") then
  1679      begin k:=k+2; u:=u div (so(str_pool[k-1])-"0");
  1680      end;
  1681    if n+u>=v then
  1682      begin print_char(so(str_pool[k])); n:=n+u;
  1683      end
  1684    else  begin j:=j+2; v:=v div (so(str_pool[j-1])-"0");
  1685      end;
  1686    end;
  1687  exit:end;
  1688  
  1689  @ The |print| subroutine will not print a string that is still being
  1690  created. The following procedure will.
  1691  
  1692  @p procedure print_current_string; {prints a yet-unmade string}
  1693  var j:pool_pointer; {points to current character code}
  1694  begin j:=str_start[str_ptr];
  1695  while j<pool_ptr do
  1696    begin print_char(so(str_pool[j])); incr(j);
  1697    end;
  1698  end;
  1699  
  1700  @ Here is a procedure that asks the user to type a line of input,
  1701  assuming that the |selector| setting is either |term_only| or |term_and_log|.
  1702  The input is placed into locations |first| through |last-1| of the
  1703  |buffer| array, and echoed on the transcript file if appropriate.
  1704  
  1705  This procedure is never called when |interaction<scroll_mode|.
  1706  
  1707  @d prompt_input(#)==begin wake_up_terminal; print(#); term_input;
  1708      end {prints a string and gets a line of input}
  1709  
  1710  @p procedure term_input; {gets a line from the terminal}
  1711  var k:0..buf_size; {index into |buffer|}
  1712  begin update_terminal; {now the user sees the prompt for sure}
  1713  if not input_ln(term_in,true) then fatal_error("End of file on the terminal!");
  1714  @.End of file on the terminal@>
  1715  term_offset:=0; {the user's line ended with \<\rm return>}
  1716  decr(selector); {prepare to echo the input}
  1717  if last<>first then for k:=first to last-1 do print(buffer[k]);
  1718  print_ln; incr(selector); {restore previous status}
  1719  end;
  1720  
  1721  @* \[6] Reporting errors.
  1722  When something anomalous is detected, \TeX\ typically does something like this:
  1723  $$\vbox{\halign{#\hfil\cr
  1724  |print_err("Something anomalous has been detected");|\cr
  1725  |help3("This is the first line of my offer to help.")|\cr
  1726  |("This is the second line. I'm trying to")|\cr
  1727  |("explain the best way for you to proceed.");|\cr
  1728  |error;|\cr}}$$
  1729  A two-line help message would be given using |help2|, etc.; these informal
  1730  helps should use simple vocabulary that complements the words used in the
  1731  official error message that was printed. (Outside the U.S.A., the help
  1732  messages should preferably be translated into the local vernacular. Each
  1733  line of help is at most 60 characters long, in the present implementation,
  1734  so that |max_print_line| will not be exceeded.)
  1735  
  1736  The |print_err| procedure supplies a `\.!' before the official message,
  1737  and makes sure that the terminal is awake if a stop is going to occur.
  1738  The |error| procedure supplies a `\..' after the official message, then it
  1739  shows the location of the error; and if |interaction=error_stop_mode|,
  1740  it also enters into a dialog with the user, during which time the help
  1741  message may be printed.
  1742  @^system dependencies@>
  1743  
  1744  @ The global variable |interaction| has four settings, representing increasing
  1745  amounts of user interaction:
  1746  
  1747  @d batch_mode=0 {omits all stops and omits terminal output}
  1748  @d nonstop_mode=1 {omits all stops}
  1749  @d scroll_mode=2 {omits error stops}
  1750  @d error_stop_mode=3 {stops at every opportunity to interact}
  1751  @d print_err(#)==begin if interaction=error_stop_mode then wake_up_terminal;
  1752    print_nl("! "); print(#);
  1753    end
  1754  
  1755  @<Glob...@>=
  1756  @!interaction:batch_mode..error_stop_mode; {current level of interaction}
  1757  
  1758  @ @<Set init...@>=interaction:=error_stop_mode;
  1759  
  1760  @ \TeX\ is careful not to call |error| when the print |selector| setting
  1761  might be unusual. The only possible values of |selector| at the time of
  1762  error messages are
  1763  
  1764  \yskip\hang|no_print| (when |interaction=batch_mode|
  1765    and |log_file| not yet open);
  1766  
  1767  \hang|term_only| (when |interaction>batch_mode| and |log_file| not yet open);
  1768  
  1769  \hang|log_only| (when |interaction=batch_mode| and |log_file| is open);
  1770  
  1771  \hang|term_and_log| (when |interaction>batch_mode| and |log_file| is open).
  1772  
  1773  @<Initialize the print |selector| based on |interaction|@>=
  1774  if interaction=batch_mode then selector:=no_print@+else selector:=term_only
  1775  
  1776  @ A global variable |deletions_allowed| is set |false| if the |get_next|
  1777  routine is active when |error| is called; this ensures that |get_next|
  1778  and related routines like |get_token| will never be called recursively.
  1779  A similar interlock is provided by |set_box_allowed|.
  1780  @^recursion@>
  1781  
  1782  The global variable |history| records the worst level of error that
  1783  has been detected. It has four possible values: |spotless|, |warning_issued|,
  1784  |error_message_issued|, and |fatal_error_stop|.
  1785  
  1786  Another global variable, |error_count|, is increased by one when an
  1787  |error| occurs without an interactive dialog, and it is reset to zero at
  1788  the end of every paragraph.  If |error_count| reaches 100, \TeX\ decides
  1789  that there is no point in continuing further.
  1790  
  1791  @d spotless=0 {|history| value when nothing has been amiss yet}
  1792  @d warning_issued=1 {|history| value when |begin_diagnostic| has been called}
  1793  @d error_message_issued=2 {|history| value when |error| has been called}
  1794  @d fatal_error_stop=3 {|history| value when termination was premature}
  1795  
  1796  @<Glob...@>=
  1797  @!deletions_allowed:boolean; {is it safe for |error| to call |get_token|?}
  1798  @!set_box_allowed:boolean; {is it safe to do a \.{\\setbox} assignment?}
  1799  @!history:spotless..fatal_error_stop; {has the source input been clean so far?}
  1800  @!error_count:-1..100; {the number of scrolled errors since the
  1801    last paragraph ended}
  1802  
  1803  @ The value of |history| is initially |fatal_error_stop|, but it will
  1804  be changed to |spotless| if \TeX\ survives the initialization process.
  1805  
  1806  @<Set init...@>=
  1807  deletions_allowed:=true; set_box_allowed:=true;
  1808  error_count:=0; {|history| is initialized elsewhere}
  1809  
  1810  @ Since errors can be detected almost anywhere in \TeX, we want to declare the
  1811  error procedures near the beginning of the program. But the error procedures
  1812  in turn use some other procedures, which need to be declared |forward|
  1813  before we get to |error| itself.
  1814  
  1815  It is possible for |error| to be called recursively if some error arises
  1816  when |get_token| is being used to delete a token, and/or if some fatal error
  1817  occurs while \TeX\ is trying to fix a non-fatal one. But such recursion
  1818  @^recursion@>
  1819  is never more than two levels deep.
  1820  
  1821  @<Error handling...@>=
  1822  procedure@?normalize_selector; forward;@t\2@>@/
  1823  procedure@?get_token; forward;@t\2@>@/
  1824  procedure@?term_input; forward;@t\2@>@/
  1825  procedure@?show_context; forward;@t\2@>@/
  1826  procedure@?begin_file_reading; forward;@t\2@>@/
  1827  procedure@?open_log_file; forward;@t\2@>@/
  1828  procedure@?close_files_and_terminate; forward;@t\2@>@/
  1829  procedure@?clear_for_error_prompt; forward;@t\2@>@/
  1830  procedure@?give_err_help; forward;@t\2@>@/
  1831  @t\4\hskip-\fontdimen2\font@>@;@+@!debug@+procedure@?debug_help;
  1832    forward;@;@+gubed
  1833  
  1834  @ Individual lines of help are recorded in the array |help_line|, which
  1835  contains entries in positions |0..(help_ptr-1)|. They should be printed
  1836  in reverse order, i.e., with |help_line[0]| appearing last.
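
        For example, with the macro definitions that follow and with
        placeholder strings chosen only for illustration, the text
        $$\vbox{\halign{#\hfil\cr
        |help3("A")("B")("C")|\cr}}$$
        expands into
        $$\vbox{\halign{#\hfil\cr
        |begin help_ptr:=3; help_line[2]:="A";|\cr
        |help_line[1]:="B"; help_line[0]:="C"; end|\cr}}$$
        so the help line that is written first is stored in the highest position.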
  1837  
  1838  @d hlp1(#)==help_line[0]:=#;@+end
  1839  @d hlp2(#)==help_line[1]:=#; hlp1
  1840  @d hlp3(#)==help_line[2]:=#; hlp2
  1841  @d hlp4(#)==help_line[3]:=#; hlp3
  1842  @d hlp5(#)==help_line[4]:=#; hlp4
  1843  @d hlp6(#)==help_line[5]:=#; hlp5
  1844  @d help0==help_ptr:=0 {sometimes there might be no help}
  1845  @d help1==@+begin help_ptr:=1; hlp1 {use this with one help line}
  1846  @d help2==@+begin help_ptr:=2; hlp2 {use this with two help lines}
  1847  @d help3==@+begin help_ptr:=3; hlp3 {use this with three help lines}
  1848  @d help4==@+begin help_ptr:=4; hlp4 {use this with four help lines}
  1849  @d help5==@+begin help_ptr:=5; hlp5 {use this with five help lines}
  1850  @d help6==@+begin help_ptr:=6; hlp6 {use this with six help lines}
  1851  
  1852  @<Glob...@>=
  1853  @!help_line:array[0..5] of str_number; {helps for the next |error|}
  1854  @!help_ptr:0..6; {the number of help lines present}
  1855  @!use_err_help:boolean; {should the |err_help| list be shown?}
  1856  
  1857  @ @<Set init...@>=
  1858  help_ptr:=0; use_err_help:=false;
  1859  
  1860  @ The |jump_out| procedure just cuts across all active procedure levels and
  1861  goes to |end_of_TEX|. This is the only nontrivial |@!goto| statement in the
  1862  whole program. It is used when there is no recovery from a particular error.
  1863  
  1864  Some \PASCAL\ compilers do not implement non-local |goto| statements.
  1865  @^system dependencies@>
  1866  In such cases the body of |jump_out| should simply be
  1867  `|close_files_and_terminate|;\thinspace' followed by a call on some system
  1868  procedure that quietly terminates the program.
  1869  
  1870  @<Error hand...@>=
  1871  procedure jump_out;
  1872  begin goto end_of_TEX;
  1873  end;
  1874  
  1875  @ Here now is the general |error| routine.
  1876  
  1877  @<Error hand...@>=
  1878  procedure error; {completes the job of error reporting}
  1879  label continue,exit;
  1880  var c:ASCII_code; {what the user types}
  1881  @!s1,@!s2,@!s3,@!s4:integer;
  1882    {used to save global variables when deleting tokens}
  1883  begin if history<error_message_issued then history:=error_message_issued;
  1884  print_char("."); show_context;
  1885  if interaction=error_stop_mode then
  1886    @<Get user's advice and |return|@>;
  1887  incr(error_count);
  1888  if error_count=100 then
  1889    begin print_nl("(That makes 100 errors; please try again.)");
  1890  @.That makes 100 errors...@>
  1891    history:=fatal_error_stop; jump_out;
  1892    end;
  1893  @<Put help message on the transcript file@>;
  1894  exit:end;
  1895  
  1896  @ @<Get user's advice...@>=
  1897  loop@+begin continue: if interaction<>error_stop_mode then return;
  1898    clear_for_error_prompt; prompt_input("? ");
  1899  @.?\relax@>
  1900    if last=first then return;
  1901    c:=buffer[first];
  1902    if c>="a" then c:=c+"A"-"a"; {convert to uppercase}
  1903    @<Interpret code |c| and |return| if done@>;
  1904    end
  1905  
  1906  @ It is desirable to provide an `\.E' option here that gives the user
  1907  an easy way to return from \TeX\ to the system editor, with the offending
  1908  line ready to be edited. But such an extension requires some system
  1909  wizardry, so the present implementation simply types out the name of the
  1910  file that should be
  1911  edited and the relevant line number.
  1912  @^system dependencies@>
  1913  
  1914  There is a secret `\.D' option available when the debugging routines haven't
  1915  been commented~out.
  1916  @^debugging@>
  1917  
  1918  @<Interpret code |c| and |return| if done@>=
  1919  case c of
  1920  "0","1","2","3","4","5","6","7","8","9": if deletions_allowed then
  1921    @<Delete \(c)|c-"0"| tokens and |goto continue|@>;
  1922  @t\4\4@>@;@+@!debug "D": begin debug_help; goto continue;@+end;@+gubed@/
  1923  "E": if base_ptr>0 then if input_stack[base_ptr].name_field>=256 then
  1924    begin print_nl("You want to edit file ");
  1925  @.You want to edit file x@>
  1926    slow_print(input_stack[base_ptr].name_field);
  1927    print(" at line "); print_int(line);
  1928    interaction:=scroll_mode; jump_out;
  1929    end;
  1930  "H": @<Print the help information and |goto continue|@>;
  1931  "I":@<Introduce new material from the terminal and |return|@>;
  1932  "Q","R","S":@<Change the interaction level and |return|@>;
  1933  "X":begin interaction:=scroll_mode; jump_out;
  1934    end;
  1935  othercases do_nothing
  1936  endcases;@/
  1937  @<Print the menu of available options@>
  1938  
  1939  @ @<Print the menu...@>=
  1940  begin print("Type <return> to proceed, S to scroll future error messages,");@/
  1941  @.Type <return> to proceed...@>
  1942  print_nl("R to run without stopping, Q to run quietly,");@/
  1943  print_nl("I to insert something, ");
  1944  if base_ptr>0 then if input_stack[base_ptr].name_field>=256 then
  1945    print("E to edit your file,");
  1946  if deletions_allowed then
  1947    print_nl("1 or ... or 9 to ignore the next 1 to 9 tokens of input,");
  1948  print_nl("H for help, X to quit.");
  1949  end
  1950  
  1951  @ Here the author of \TeX\ apologizes for making use of the numerical
  1952  relation between |"Q"|, |"R"|, |"S"|, and the desired interaction settings
  1953  |batch_mode|, |nonstop_mode|, |scroll_mode|.
  1954  @^Knuth, Donald Ervin@>
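
        To spell the relation out: |"Q"|, |"R"|, and |"S"| are the consecutive
        ASCII codes 81, 82, and~83, so |c-"Q"| is 0, 1, or~2. The assignment
        |interaction:=batch_mode+c-"Q"| is therefore correct only under the
        assumption, taken from the definitions of the interaction modes earlier
        in the program, that |nonstop_mode=batch_mode+1| and
        |scroll_mode=batch_mode+2|.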
  1955  
  1956  @<Change the interaction...@>=
  1957  begin error_count:=0; interaction:=batch_mode+c-"Q";
  1958  print("OK, entering ");
  1959  case c of
  1960  "Q":begin print_esc("batchmode"); decr(selector);
  1961    end;
  1962  "R":print_esc("nonstopmode");
  1963  "S":print_esc("scrollmode");
  1964  end; {there are no other cases}
  1965  print("..."); print_ln; update_terminal; return;
  1966  end
  1967  
  1968  @ When the following code is executed, |buffer[(first+1)..(last-1)]| may
  1969  contain the material inserted by the user; otherwise another prompt will
  1970  be given. In order to understand this part of the program fully, you need
  1971  to be familiar with \TeX's input stacks.
  1972  
  1973  @<Introduce new material...@>=
  1974  begin begin_file_reading; {enter a new syntactic level for terminal input}
  1975  {now |state=mid_line|, so an initial blank space will count as a blank}
  1976  if last>first+1 then
  1977    begin loc:=first+1; buffer[first]:=" ";
  1978    end
  1979  else  begin prompt_input("insert>"); loc:=first;
  1980  @.insert>@>
  1981    end;
  1982  first:=last;
  1983  cur_input.limit_field:=last-1; {no |end_line_char| ends this line}
  1984  return;
  1985  end
  1986  
  1987  @ We allow deletion of up to 99 tokens at a time.
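
        For example, if the user types `\.{12}', then |c="1"| and
        |buffer[first+1]="2"|. With the ASCII codes |"0"=48|, |"1"=49|, and
        |"2"=50|, the two-digit formula below yields
        $49\cdot10+50-48\cdot11=12$. The term |"0"*11| removes the offset |"0"|
        ten times from the tens digit and once from the units digit.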
  1988  
  1989  @<Delete \(c)|c-"0"| tokens...@>=
  1990  begin s1:=cur_tok; s2:=cur_cmd; s3:=cur_chr; s4:=align_state;
  1991  align_state:=1000000; OK_to_interrupt:=false;
  1992  if (last>first+1) and (buffer[first+1]>="0")and(buffer[first+1]<="9") then
  1993    c:=c*10+buffer[first+1]-"0"*11
  1994  else c:=c-"0";
  1995  while c>0 do
  1996    begin get_token; {one-level recursive call of |error| is possible}
  1997    decr(c);
  1998    end;
  1999  cur_tok:=s1; cur_cmd:=s2; cur_chr:=s3; align_state:=s4; OK_to_interrupt:=true;
  2000  help2("I have just deleted some text, as you asked.")@/
  2001  ("You can now delete more, or insert, or whatever.");
  2002  show_context; goto continue;
  2003  end
  2004  
  2005  @ @<Print the help info...@>=
  2006  begin if use_err_help then
  2007    begin give_err_help; use_err_help:=false;
  2008    end
  2009  else  begin if help_ptr=0 then
  2010      help2("Sorry, I don't know how to help in this situation.")@/
  2011      @t\kern1em@>("Maybe you should try asking a human?");
  2012    repeat decr(help_ptr); print(help_line[help_ptr]); print_ln;
  2013    until help_ptr=0;
  2014    end;
  2015  help4("Sorry, I already gave what help I could...")@/
  2016    ("Maybe you should try asking a human?")@/
  2017    ("An error might have occurred before I noticed any problems.")@/
  2018    ("``If all else fails, read the instructions.''");@/
  2019  goto continue;
  2020  end
  2021  
  2022  @ @<Put help message on the transcript file@>=
  2023  if interaction>batch_mode then decr(selector); {avoid terminal output}
  2024  if use_err_help then
  2025    begin print_ln; give_err_help;
  2026    end
  2027  else while help_ptr>0 do
  2028    begin decr(help_ptr); print_nl(help_line[help_ptr]);
  2029    end;
  2030  print_ln;
  2031  if interaction>batch_mode then incr(selector); {re-enable terminal output}
  2032  print_ln
  2033  
  2034  @ A dozen or so error messages end with a parenthesized integer, so we
  2035  save a teeny bit of program space by declaring the following procedure:
  2036  
  2037  @p procedure int_error(@!n:integer);
  2038  begin print(" ("); print_int(n); print_char(")"); error;
  2039  end;
  2040  
  2041  @ In anomalous cases, the print selector might be in an unknown state;
  2042  the following subroutine is called to fix things just enough to keep
  2043  running a bit longer.
  2044  
  2045  @p procedure normalize_selector;
  2046  begin if log_opened then selector:=term_and_log
  2047  else selector:=term_only;
  2048  if job_name=0 then open_log_file;
  2049  if interaction=batch_mode then decr(selector);
  2050  end;
  2051  
  2052  @ The following procedure prints \TeX's last words before dying.
  2053  
  2054  @d succumb==begin if interaction=error_stop_mode then
  2055      interaction:=scroll_mode; {no more interaction}
  2056    if log_opened then error;
  2057    @!debug if interaction>batch_mode then debug_help;@+gubed@;@/
  2058    history:=fatal_error_stop; jump_out; {irrecoverable error}
  2059    end
  2060  
  2061  @<Error hand...@>=
  2062  procedure fatal_error(@!s:str_number); {prints |s|, and that's it}
  2063  begin normalize_selector;@/
  2064  print_err("Emergency stop"); help1(s); succumb;
  2065  @.Emergency stop@>
  2066  end;
  2067  
  2068  @ Here is the most dreaded error message.
  2069  
  2070  @<Error hand...@>=
  2071  procedure overflow(@!s:str_number;@!n:integer); {stop due to finiteness}
  2072  begin normalize_selector;
  2073  print_err("TeX capacity exceeded, sorry [");
  2074  @.TeX capacity exceeded ...@>
  2075  print(s); print_char("="); print_int(n); print_char("]");
  2076  help2("If you really absolutely need more capacity,")@/
  2077    ("you can ask a wizard to enlarge me.");
  2078  succumb;
  2079  end;
  2080  
  2081  @ The program might sometime run completely amok, at which point there is
  2082  no choice but to stop. If no previous error has been detected, that's bad
  2083  news; a message is printed that is really intended for the \TeX\
  2084  maintenance person instead of the user (unless the user has been
  2085  particularly diabolical).  The index entries for `this can't happen' may
  2086  help to pinpoint the problem.
  2087  @^dry rot@>
  2088  
  2089  @<Error hand...@>=
  2090  procedure confusion(@!s:str_number);
  2091    {consistency check violated; |s| tells where}
  2092  begin normalize_selector;
  2093  if history<error_message_issued then
  2094    begin print_err("This can't happen ("); print(s); print_char(")");
  2095  @.This can't happen@>
  2096    help1("I'm broken. Please show this to someone who can fix can fix");
  2097    end
  2098  else  begin print_err("I can't go on meeting you like this");
  2099  @.I can't go on...@>
  2100    help2("One of your faux pas seems to have wounded me deeply...")@/
  2101      ("in fact, I'm barely conscious. Please fix it and try again.");
  2102    end;
  2103  succumb;
  2104  end;
  2105  
  2106  @ Users occasionally want to interrupt \TeX\ while it's running.
  2107  If the \PASCAL\ runtime system allows this, one can implement
  2108  a routine that sets the global variable |interrupt| to some nonzero value
  2109  when such an interrupt is signalled. Otherwise there is probably at least
  2110  a way to make |interrupt| nonzero using the \PASCAL\ debugger.
  2111  @^system dependencies@>
  2112  @^debugging@>
  2113  
  2114  @d check_interrupt==begin if interrupt<>0 then pause_for_instructions;
  2115    end
  2116  
  2117  @<Global...@>=
  2118  @!interrupt:integer; {should \TeX\ pause for instructions?}
  2119  @!OK_to_interrupt:boolean; {should interrupts be observed?}
  2120  
  2121  @ @<Set init...@>=
  2122  interrupt:=0; OK_to_interrupt:=true;
  2123  
  2124  @ When an interrupt has been detected, the program goes into its
  2125  highest interaction level and lets the user have nearly the full flexibility of
  2126  the |error| routine.  \TeX\ checks for interrupts only at times when it is
  2127  safe to do this.
  2128  
  2129  @p procedure pause_for_instructions;
  2130  begin if OK_to_interrupt then
  2131    begin interaction:=error_stop_mode;
  2132    if (selector=log_only)or(selector=no_print) then
  2133      incr(selector);
  2134    print_err("Interruption");
  2135  @.Interruption@>
  2136    help3("You rang?")@/
  2137    ("Try to insert an instruction for me (e.g., `I\showlists'),")@/
  2138    ("unless you just want to quit by typing `X'.");
  2139    deletions_allowed:=false; error; deletions_allowed:=true;
  2140    interrupt:=0;
  2141    end;
  2142  end;
  2143  
  2144  @* \[7] Arithmetic with scaled dimensions.
  2145  The principal computations performed by \TeX\ are done entirely in terms of
  2146  integers less than $2^{31}$ in magnitude; and divisions are done only when both
  2147  dividend and divisor are nonnegative. Thus, the arithmetic specified in this
  2148  program can be carried out in exactly the same way on a wide variety of
  2149  computers, including some small ones. Why? Because the arithmetic
  2150  calculations need to be spelled out precisely in order to guarantee that
  2151  \TeX\ will produce identical output on different machines. If some
  2152  quantities were rounded differently in different implementations, we would
  2153  find that line breaks and even page breaks might occur in different places.
  2154  Hence the arithmetic of \TeX\ has been designed with care, and systems that
  2155  claim to be implementations of \TeX82 should follow precisely the
  2156  @:TeX82}{\TeX82@>
  2157  calculations as they appear in the present program.
  2158  
  2159  (Actually there are three places where \TeX\ uses |div| with a possibly negative
  2160  numerator. These are harmless; see |div| in the index. Also if the user
  2161  sets the \.{\\time} or the \.{\\year} to a negative value, some diagnostic
  2162  information will involve negative-numerator division. The same remarks
  2163  apply for |mod| as well as for |div|.)
  2164  
  2165  @ Here is a routine that calculates half of an integer, using an
  2166  unambiguous convention with respect to signed odd numbers.
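
        For example, |half(5)=3| and |half(-5)=-2|, because $(5+1)/2=3$ and
        $(-5+1)/2=-2$ are both exact divisions. Thus an odd~|x| always gives
        $x/2$ rounded upward, and the result does not depend on how |div|
        treats a negative numerator.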
  2167  
  2168  @p function half(@!x:integer):integer;
  2169  begin if odd(x) then half:=(x+1) div 2
  2170  else half:=x @!div 2;
  2171  end;
  2172  
  2173  @ Fixed-point arithmetic is done on {\sl scaled integers\/} that are multiples
  2174  of $2^{-16}$. In other words, a binary point is assumed to be sixteen bit
  2175  positions from the right end of a binary computer word.
  2176  
  2177  @d unity == @'200000 {$2^{16}$, represents 1.00000}
  2178  @d two == @'400000 {$2^{17}$, represents 2.00000}
  2179  
  2180  @<Types...@>=
  2181  @!scaled = integer; {this type is used for scaled integers}
  2182  @!nonnegative_integer=0..@'17777777777; {$0\L x<2^{31}$}
  2183  @!small_number=0..63; {this type is self-explanatory}
  2184  
  2185  @ The following function is used to create a scaled integer from a given decimal
  2186  fraction $(.d_0d_1\ldots d_{k-1})$, where |0<=k<=17|. The digit $d_i$ is
  2187  given in |dig[i]|, and the calculation produces a correctly rounded result.
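
        For example, the fraction $.1$ has |k=1| and |dig[0]=1|. The loop
        computes $a=\lfloor131072/10\rfloor=13107$, and the result is
        $\lfloor(13107+1)/2\rfloor=6554$, which is $0.1\cdot2^{16}=6553.6$
        rounded to the nearest integer.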
  2188  
  2189  @p function round_decimals(@!k:small_number) : scaled;
  2190    {converts a decimal fraction}
  2191  var a:integer; {the accumulator}
  2192  begin a:=0;
  2193  while k>0 do
  2194    begin decr(k); a:=(a+dig[k]*two) div 10;
  2195    end;
  2196  round_decimals:=(a+1) div 2;
  2197  end;
  2198  
  2199  @ Conversely, here is a procedure analogous to |print_int|. If the output
  2200  of this procedure is subsequently read by \TeX\ and converted by the
  2201  |round_decimals| routine above, it turns out that the original value will
  2202  be reproduced exactly; the ``simplest'' such decimal number is output,
  2203  but there is always at least one digit following the decimal point.
  2204  
  2205  The invariant relation in the \&{repeat} loop is that a sequence of
  2206  decimal digits yet to be printed will yield the original number if and only if
  2207  they form a fraction~$f$ in the range $s-\delta\L10\cdot2^{16}f<s$.
  2208  We can stop if and only if $f=0$ satisfies this condition; the loop will
  2209  terminate before $s$ can possibly become zero.
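
        For example, when |s=6554| the integer part is~0, and the fraction
        part starts from $s=10\cdot6554+5=65545$ with $\delta=10$. The first
        iteration prints the digit $\lfloor65545/65536\rfloor=1$ and leaves
        $s=10\cdot9=90$, $\delta=100$. Since $s\L\delta$ the loop stops, and
        the output is `\.{0.1}'.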
  2210  
  2211  @p procedure print_scaled(@!s:scaled); {prints scaled real, rounded to five
  2212    digits}
  2213  var delta:scaled; {amount of allowable inaccuracy}
  2214  begin if s<0 then
  2215    begin print_char("-"); negate(s); {print the sign, if negative}
  2216    end;
  2217  print_int(s div unity); {print the integer part}
  2218  print_char(".");
  2219  s:=10*(s mod unity)+5; delta:=10;
  2220  repeat if delta>unity then s:=s+@'100000-50000; {round the last digit}
  2221  print_char("0"+(s div unity)); s:=10*(s mod unity); delta:=delta*10;
  2222  until s<=delta;
  2223  end;
  2224  
  2225  @ Physical sizes that a \TeX\ user specifies for portions of documents are
  2226  represented internally as scaled points. Thus, if we define an `sp' (scaled
  2227  @^sp@>
  2228  point) as a unit equal to $2^{-16}$ printer's points, every dimension
  2229  inside of \TeX\ is an integer number of sp. There are exactly
  2230  4,736,286.72 sp per inch.  Users are not allowed to specify dimensions
  2231  larger than $2^{30}-1$ sp, which is a distance of about 18.892 feet (5.7583
  2232  meters); two such quantities can be added without overflow on a 32-bit
  2233  computer.
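
        (Assuming the usual definition of 72.27 printer's points per inch,
        these figures come from $72.27\cdot2^{16}=4736286.72$ and
        $(2^{30}-1)/4736286.72\approx226.7$ inches, which is about 18.892 feet.)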
  2234  
  2235  The present implementation of \TeX\ does not check for overflow when
  2236  @^overflow in arithmetic@>
  2237  dimensions are added or subtracted. This could be done by inserting a
  2238  few dozen tests of the form `\ignorespaces|if x>=@'10000000000 then
  2239  @t\\{report\_overflow}@>|', but the chance of overflow is so remote that
  2240  such tests do not seem worthwhile.
  2241  
  2242  \TeX\ needs to do only a few arithmetic operations on scaled quantities,
  2243  other than addition and subtraction, and the following subroutines do most of
  2244  the work. A single computation might use several subroutine calls, and it is
  2245  desirable to avoid producing multiple error messages in case of arithmetic
  2246  overflow; so the routines set the global variable |arith_error| to |true|
  2247  instead of reporting errors directly to the user. Another global variable,
  2248  |remainder|, holds the remainder after a division.
  2249  
  2250  @<Glob...@>=
  2251  @!arith_error:boolean; {has arithmetic overflow occurred recently?}
  2252  @!remainder:scaled; {amount subtracted to get an exact division}
  2253  
  2254  @ The first arithmetical subroutine we need computes $nx+y$, where |x|
  2255  and~|y| are |scaled| and |n| is an integer. We will also use it to
  2256  multiply integers.
  2257  
  2258  @d nx_plus_y(#)==mult_and_add(#,@'7777777777)
  2259  @d mult_integers(#)==mult_and_add(#,0,@'17777777777)
  2260  
  2261  @p function mult_and_add(@!n:integer;@!x,@!y,@!max_answer:scaled):scaled;
  2262  begin if n<0 then
  2263    begin negate(x); negate(n);
  2264    end;
  2265  if n=0 then mult_and_add:=y
  2266  else if ((x<=(max_answer-y) div n)and(-x<=(max_answer+y) div n)) then
  2267    mult_and_add:=n*x+y
  2268  else  begin arith_error:=true; mult_and_add:=0;
  2269    end;
  2270  end;
  2271  
  2272  @ We also need to divide scaled dimensions by integers.
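
        For example, |x_over_n(-7,2)| yields $-3$ with |remainder=-1|, and
        |x_over_n(-7,-2)| yields 3 with |remainder=-1|. In both cases |x|
        equals |n| times the result plus |remainder|, and every |div| and
        |mod| in the routine below is applied to nonnegative operands.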
  2273  
  2274  @p function x_over_n(@!x:scaled;@!n:integer):scaled;
  2275  var negative:boolean; {should |remainder| be negated?}
  2276  begin negative:=false;
  2277  if n=0 then
  2278    begin arith_error:=true; x_over_n:=0; remainder:=x;
  2279    end
  2280  else  begin if n<0 then
  2281      begin negate(x); negate(n); negative:=true;
  2282      end;
  2283    if x>=0 then
  2284      begin x_over_n:=x div n; remainder:=x mod n;
  2285      end
  2286    else  begin x_over_n:=-((-x) div n); remainder:=-((-x) mod n);
  2287      end;
  2288    end;
  2289  if negative then negate(remainder);
  2290  end;
  2291  
  2292  @ Then comes the multiplication of a scaled number by a fraction |n/d|,
  2293  where |n| and |d| are nonnegative integers |<=@t$2^{16}$@>| and |d| is
  2294  positive. It would be too dangerous to multiply by~|n| and then divide
  2295  by~|d|, in separate operations, since overflow might well occur; and it
  2296  would be too inaccurate to divide by |d| and then multiply by |n|. Hence
  2297  this subroutine simulates 1.5-precision arithmetic.
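
        For example, |xn_over_d(65536,7227,100)| splits $x=2\cdot2^{15}+0$,
        so that |t=0|, |u=14454|, and $v=54\cdot2^{15}=1769472$. The result is
        $2^{15}\cdot144+17694=4736286$ with |remainder=72|, in agreement with
        $65536\cdot7227=100\cdot4736286+72$.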
  2298  
  2299  @p function xn_over_d(@!x:scaled; @!n,@!d:integer):scaled;
  2300  var positive:boolean; {was |x>=0|?}
  2301  @!t,@!u,@!v:nonnegative_integer; {intermediate quantities}
  2302  begin if x>=0 then positive:=true
  2303  else  begin negate(x); positive:=false;
  2304    end;
  2305  t:=(x mod @'100000)*n;
  2306  u:=(x div @'100000)*n+(t div @'100000);
  2307  v:=(u mod d)*@'100000 + (t mod @'100000);
  2308  if u div d>=@'100000 then arith_error:=true
  2309  else u:=@'100000*(u div d) + (v div d);
  2310  if positive then
  2311    begin xn_over_d:=u; remainder:=v mod d;
  2312    end
  2313  else  begin xn_over_d:=-u; remainder:=-(v mod d);
  2314    end;
  2315  end;
  2316  
  2317  @ The next subroutine is used to compute the ``badness'' of glue, when a
  2318  total~|t| is supposed to be made from amounts that sum to~|s|.  According
  2319  to {\sl The \TeX book}, the badness of this situation is $100(t/s)^3$;
  2320  however, badness is simply a heuristic, so we need not squeeze out the
  2321  last drop of accuracy when computing it. All we really want is an
  2322  approximation that has similar properties.
  2323  @:TeXbook}{\sl The \TeX book@>
  2324  
  2325  The actual method used to compute the badness is easier to read from the
  2326  program than to describe in words. It produces an integer value that is a
  2327  reasonably close approximation to $100(t/s)^3$, and all implementations
  2328  of \TeX\ should use precisely this method. Any badness of $2^{13}$ or more is
  2329  treated as infinitely bad, and represented by 10000.
  2330  
  2331  It is not difficult to prove that $$\hbox{|badness(t+1,s)>=badness(t,s)
  2332  >=badness(t,s+1)|}.$$ The badness function defined here is capable of
  2333  computing at most 1095 distinct values, but that is plenty.
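
        For example, when |t=s| and |t<=7230584| the routine below gets
        |r=297|, and $\lfloor(297^3+2^{17})/2^{18}\rfloor=100$. When $s=2t$ it
        gets |r=148| and a badness of~12, where the formula $100(t/s)^3$ would
        give 12.5.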
  2334  
  2335  @d inf_bad = 10000 {infinitely bad value}
  2336  
  2337  @p function badness(@!t,@!s:scaled):halfword; {compute badness, given |t>=0|}
  2338  var r:integer; {approximation to $\alpha t/s$, where $\alpha^3\approx
  2339    100\cdot2^{18}$}
  2340  begin if t=0 then badness:=0
  2341  else if s<=0 then badness:=inf_bad
  2342  else  begin if t<=7230584 then  r:=(t*297) div s {$297^3=99.94\times2^{18}$}
  2343    else if s>=1663497 then r:=t div (s div 297)
  2344    else r:=t;
  2345    if r>1290 then badness:=inf_bad {$1290^3<2^{31}<1291^3$}
  2346    else badness:=(r*r*r+@'400000) div @'1000000;
  2347    end; {that was $r^3/2^{18}$, rounded to the nearest integer}
  2348  end;
  2349  
  2350  @ When \TeX\ ``packages'' a list into a box, it needs to calculate the
  2351  proportionality ratio by which the glue inside the box should stretch
  2352  or shrink. This calculation does not affect \TeX's decision making,
  2353  so the precise details of rounding, etc., in the glue calculation are not
  2354  of critical importance for the consistency of results on different computers.
  2355  
  2356  We shall use the type |glue_ratio| for such proportionality ratios.
  2357  A glue ratio should take the same amount of memory as an
  2358  |integer| (usually 32 bits) if it is to blend smoothly with \TeX's
  2359  other data structures. Thus |glue_ratio| should be equivalent to
  2360  |short_real| in some implementations of \PASCAL. Alternatively,
  2361  it is possible to deal with glue ratios using nothing but fixed-point
  2362  arithmetic; see {\sl TUGboat \bf3},1 (March 1982), 10--27. (But the
  2363  routines cited there must be modified to allow negative glue ratios.)
  2364  @^system dependencies@>
  2365  
  2366  @d set_glue_ratio_zero(#) == #:=0.0 {store the representation of zero ratio}
  2367  @d set_glue_ratio_one(#) == #:=1.0 {store the representation of unit ratio}
  2368  @d float(#) == # {convert from |glue_ratio| to type |real|}
  2369  @d unfloat(#) == # {convert from |real| to type |glue_ratio|}
  2370  @d float_constant(#) == #.0 {convert |integer| constant to |real|}
  2371  
  2372  @<Types...@>=
  2373  @!glue_ratio=real; {one-word representation of a glue expansion factor}
  2374  
  2375  @* \[8] Packed data.
  2376  In order to make efficient use of storage space, \TeX\ bases its major data
  2377  structures on a |memory_word|, which contains either a (signed) integer,
  2378  possibly scaled, or a (signed) |glue_ratio|, or a small number of
  2379  fields that are one half or one quarter of the size used for storing
  2380  integers.
  2381  
  2382  If |x| is a variable of type |memory_word|, it contains up to four
  2383  fields that can be referred to as follows:
  2384  $$\vbox{\halign{\hfil#&#\hfil&#\hfil\cr
  2385  |x|&.|int|&(an |integer|)\cr
  2386  |x|&.|sc|\qquad&(a |scaled| integer)\cr
  2387  |x|&.|gr|&(a |glue_ratio|)\cr
  2388  |x.hh.lh|, |x.hh|&.|rh|&(two halfword fields)\cr
  2389  |x.hh.b0|, |x.hh.b1|, |x.hh|&.|rh|&(two quarterword fields, one halfword
  2390    field)\cr
  2391  |x.qqqq.b0|, |x.qqqq.b1|, |x.qqqq|&.|b2|, |x.qqqq.b3|\hskip-100pt
  2392    &\qquad\qquad\qquad(four quarterword fields)\cr}}$$
  2393  This is somewhat cumbersome to write, and not very readable either, but
  2394  macros will be used to make the notation shorter and more transparent.
  2395  The \PASCAL\ code below gives a formal definition of |memory_word| and
  2396  its subsidiary types, using packed variant records. \TeX\ makes no
  2397  assumptions about the relative positions of the fields within a word.
  2398  
  2399  Since we are assuming 32-bit integers, a halfword must contain at least
  2400  16 bits, and a quarterword must contain at least 8 bits.
  2401  @^system dependencies@>
  2402  But it doesn't hurt to have more bits; for example, with enough 36-bit
  2403  words you might be able to have |mem_max| as large as 262142, which is
  2404  eight times as much memory as anybody had during the first four years of
  2405  \TeX's existence.
  2406  
  2407  N.B.: Valuable memory space will be dreadfully wasted unless \TeX\ is compiled
  2408  by a \PASCAL\ that packs all of the |memory_word| variants into
  2409  the space of a single integer. This means, for example, that |glue_ratio|
  2410  words should be |short_real| instead of |real| on some computers. Some
  2411  \PASCAL\ compilers will pack an integer whose subrange is `|0..255|' into
  2412  an eight-bit field, but others insist on allocating space for an additional
  2413  sign bit; on such systems you can get 256 values into a quarterword only
  2414  if the subrange is `|-128..127|'.
  2415  
  2416  The present implementation tries to accommodate as many variations as possible,
  2417  so it makes few assumptions. If integers having the subrange
  2418  `|min_quarterword..max_quarterword|' can be packed into a quarterword,
  2419  and if integers having the subrange `|min_halfword..max_halfword|'
  2420  can be packed into a halfword, everything should work satisfactorily.
  2421  
  2422  It is usually most efficient to have |min_quarterword=min_halfword=0|,
  2423  so one should try to achieve this unless it causes a severe problem.
  2424  The values defined here are recommended for most 32-bit computers.
  2425  
  2426  @d min_quarterword=0 {smallest allowable value in a |quarterword|}
  2427  @d max_quarterword=255 {largest allowable value in a |quarterword|}
  2428  @d min_halfword==0 {smallest allowable value in a |halfword|}
  2429  @d max_halfword==65535 {largest allowable value in a |halfword|}
  2430  
  2431  @ Here are the inequalities that the quarterword and halfword values
  2432  must satisfy (or rather, the inequalities that they mustn't satisfy):
  2433  
  2434  @<Check the ``constant''...@>=
  2435  init if (mem_min<>mem_bot)or(mem_max<>mem_top) then bad:=10;@+tini@;@/
  2436  if (mem_min>mem_bot)or(mem_max<mem_top) then bad:=10;
  2437  if (min_quarterword>0)or(max_quarterword<127) then bad:=11;
  2438  if (min_halfword>0)or(max_halfword<32767) then bad:=12;
  2439  if (min_quarterword<min_halfword)or@|
  2440    (max_quarterword>max_halfword) then bad:=13;
  2441  if (mem_min<min_halfword)or(mem_max>=max_halfword)or@|
  2442    (mem_bot-mem_min>max_halfword+1) then bad:=14;
  2443  if (font_base<min_quarterword)or(font_max>max_quarterword) then bad:=15;
  2444  if font_max>font_base+256 then bad:=16;
  2445  if (save_size>max_halfword)or(max_strings>max_halfword) then bad:=17;
  2446  if buf_size>max_halfword then bad:=18;
  2447  if max_quarterword-min_quarterword<255 then bad:=19;
  2448  
  2449  @ The operation of adding or subtracting |min_quarterword| occurs quite
  2450  frequently in \TeX, so it is convenient to abbreviate this operation
  2451  by using the macros |qi| and |qo| for input and output to and from
  2452  quarterword format.
  2453  
  2454  The inner loop of \TeX\ will run faster with respect to compilers
  2455  that don't optimize expressions like `|x+0|' and `|x-0|', if these
  2456  macros are simplified in the obvious way when |min_quarterword=0|.
  2457  @^inner loop@>@^system dependencies@>
  2458  
  2459  @d qi(#)==#+min_quarterword
  2460    {to put an |eight_bits| item into a quarterword}
  2461  @d qo(#)==#-min_quarterword
  2462    {to take an |eight_bits| item out of a quarterword}
  2463  @d hi(#)==#+min_halfword
  2464    {to put a sixteen-bit item into a halfword}
  2465  @d ho(#)==#-min_halfword
  2466    {to take a sixteen-bit item from a halfword}
  2467  
  2468  @ The reader should study the following definitions closely:
  2469  @^system dependencies@>
  2470  
  2471  @d sc==int {|scaled| data is equivalent to |integer|}
  2472  
  2473  @<Types...@>=
  2474  @!quarterword = min_quarterword..max_quarterword; {1/4 of a word}
  2475  @!halfword=min_halfword..max_halfword; {1/2 of a word}
  2476  @!two_choices = 1..2; {used when there are two variants in a record}
  2477  @!four_choices = 1..4; {used when there are four variants in a record}
  2478  @!two_halves = packed record@;@/
  2479    @!rh:halfword;
  2480    case two_choices of
  2481    1: (@!lh:halfword);
  2482    2: (@!b0:quarterword; @!b1:quarterword);
  2483    end;
  2484  @!four_quarters = packed record@;@/
  2485    @!b0:quarterword;
  2486    @!b1:quarterword;
  2487    @!b2:quarterword;
  2488    @!b3:quarterword;
  2489    end;
  2490  @!memory_word = record@;@/
  2491    case four_choices of
  2492    1: (@!int:integer);
  2493    2: (@!gr:glue_ratio);
  2494    3: (@!hh:two_halves);
  2495    4: (@!qqqq:four_quarters);
  2496    end;
  2497  @!word_file = file of memory_word;
  2498  
  2499  @ When debugging, we may want to print a |memory_word| without knowing
  2500  what type it is; so we print it in all modes.
  2501  @^dirty \PASCAL@>@^debugging@>
  2502  
  2503  @p @!debug procedure print_word(@!w:memory_word);
  2504    {prints |w| in all ways}
  2505  begin print_int(w.int); print_char(" ");@/
  2506  print_scaled(w.sc); print_char(" ");@/
  2507  print_scaled(round(unity*float(w.gr))); print_ln;@/
  2508  @^real multiplication@>
  2509  print_int(w.hh.lh); print_char("="); print_int(w.hh.b0); print_char(":");
  2510  print_int(w.hh.b1); print_char(";"); print_int(w.hh.rh); print_char(" ");@/
  2511  print_int(w.qqqq.b0); print_char(":"); print_int(w.qqqq.b1); print_char(":");
  2512  print_int(w.qqqq.b2); print_char(":"); print_int(w.qqqq.b3);
  2513  end;
  2514  gubed
  2515  
  2516  @* \[9] Dynamic memory allocation.
  2517  The \TeX\ system does nearly all of its own memory allocation, so that it
  2518  can readily be transported into environments that do not have automatic
  2519  facilities for strings, garbage collection, etc., and so that it can be in
  2520  control of what error messages the user receives. The dynamic storage
  2521  requirements of \TeX\ are handled by providing a large array |mem| in
  2522  which consecutive blocks of words are used as nodes by the \TeX\ routines.
  2523  
  2524  Pointer variables are indices into this array, or into another array
  2525  called |eqtb| that will be explained later. A pointer variable might
  2526  also be a special flag that lies outside the bounds of |mem|, so we
  2527  allow pointers to assume any |halfword| value. The minimum halfword
  2528  value represents a null pointer. \TeX\ does not assume that |mem[null]| exists.
  2529  
  2530  @d pointer==halfword {a flag or a location in |mem| or |eqtb|}
  2531  @d null==min_halfword {the null pointer}
  2532  
  2533  @<Glob...@>=
  2534  @!temp_ptr:pointer; {a pointer variable for occasional emergency use}
  2535  
  2536  @ The |mem| array is divided into two regions that are allocated separately,
  2537  but the dividing line between these two regions is not fixed; they grow
  2538  together until finding their ``natural'' size in a particular job.
  2539  Locations less than or equal to |lo_mem_max| are used for storing
  2540  variable-length records consisting of two or more words each. This region
  2541  is maintained using an algorithm similar to the one described in exercise
  2542  2.5--19 of {\sl The Art of Computer Programming}. However, no size field
  2543  appears in the allocated nodes; the program is responsible for knowing the
  2544  relevant size when a node is freed. Locations greater than or equal to
  2545  |hi_mem_min| are used for storing one-word records; a conventional
  2546  \.{AVAIL} stack is used for allocation in this region.
  2547  
  2548  Locations of |mem| between |mem_bot| and |mem_top| may be dumped as part
  2549  of preloaded format files, by the \.{INITEX} preprocessor.
  2550  @.INITEX@>
  2551  Production versions of \TeX\ may extend the memory at both ends in order to
  2552  provide more space; locations between |mem_min| and |mem_bot| are always
  2553  used for variable-size nodes, and locations between |mem_top| and |mem_max|
  2554  are always used for single-word nodes.
  2555  
  2556  The key pointers that govern |mem| allocation have a prescribed order:
  2557  $$\advance\thickmuskip-2mu
  2558  \hbox{|null<=mem_min<=mem_bot<lo_mem_max<
  2559    hi_mem_min<mem_top<=mem_end<=mem_max|.}$$
  2560  
  2561  Empirical tests show that the present implementation of \TeX\ tends to
  2562  spend about 9\pct! of its running time allocating nodes, and about 6\pct!
  2563  deallocating them after their use.
  2564  
  2565  @<Glob...@>=
  2566  @!mem : array[mem_min..mem_max] of memory_word; {the big dynamic storage area}
  2567  @!lo_mem_max : pointer; {the largest location of variable-size memory in use}
  2568  @!hi_mem_min : pointer; {the smallest location of one-word memory in use}
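
The prescribed order of the key pointers can be written as a single chained comparison. This Python sketch is only an illustration; the function names `check_mem_order` and `region` are hypothetical, and the sample values in the usage lines are made up rather than taken from any particular installation.

```python
def check_mem_order(null, mem_min, mem_bot, lo_mem_max,
                    hi_mem_min, mem_top, mem_end, mem_max):
    """True when the pointers satisfy the ordering stated in the text."""
    return (null <= mem_min <= mem_bot < lo_mem_max
            < hi_mem_min < mem_top <= mem_end <= mem_max)

def region(p, lo_mem_max, hi_mem_min):
    """Classify a location of mem by the region it currently belongs to."""
    if p <= lo_mem_max:
        return "variable-size"
    if p >= hi_mem_min:
        return "one-word"
    return "unused gap"

# Made-up sample values, chosen only to satisfy the ordering.
ok = check_mem_order(0, 0, 0, 1020, 29987, 30000, 30000, 30000)
where = [region(p, 1020, 29987) for p in (500, 15000, 29990)]
```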
  2569  
  2570  @ In order to study the memory requirements of particular applications, it
  2571  is possible to prepare a version of \TeX\ that keeps track of current and
  2572  maximum memory usage. When code between the delimiters |@!stat| $\ldots$
  2573  |tats| is not ``commented out,'' \TeX\ will run a bit slower but it will
  2574  report these statistics when |tracing_stats| is sufficiently large.
  2575  
  2576  @<Glob...@>=
  2577  @!var_used, @!dyn_used : integer; {how much memory is in use}
  2578  
  2579  @ Let's consider the one-word memory region first, since it's the
  2580  simplest. The pointer variable |mem_end| holds the highest-numbered location
  2581  of |mem| that has ever been used. The free locations of |mem| that
  2582  occur between |hi_mem_min| and |mem_end|, inclusive, are of type
  2583  |two_halves|, and we write |info(p)| and |link(p)| for the |lh|
  2584  and |rh| fields of |mem[p]| when it is of this type. The single-word
  2585  free locations form a linked list
  2586  $$|avail|,\;\hbox{|link(avail)|},\;\hbox{|link(link(avail))|},\;\ldots$$
  2587  terminated by |null|.
  2588  
  2589  @d link(#) == mem[#].hh.rh {the |link| field of a memory word}
  2590  @d info(#) == mem[#].hh.lh {the |info| field of a memory word}
  2591  
  2592  @<Glob...@>=
  2593  @!avail : pointer; {head of the list of available one-word nodes}
  2594  @!mem_end : pointer; {the last one-word node used in |mem|}
  2595  
  2596  @ If memory is exhausted, it might mean that the user has forgotten
  2597  a right brace. We will define some procedures later that try to help
  2598  pinpoint the trouble.
  2599  
  2600  @p @<Declare the procedure called |show_token_list|@>@/
  2601  @<Declare the procedure called |runaway|@>
  2602  
  2603  @ The function |get_avail| returns a pointer to a new one-word node whose
  2604  |link| field is null. However, \TeX\ will halt if there is no more room left.
  2605  @^inner loop@>
  2606  
  2607  If the available-space list is empty, i.e., if |avail=null|,
  2608  we try first to increase |mem_end|. If that cannot be done, i.e., if
  2609  |mem_end=mem_max|, we try to decrease |hi_mem_min|. If that cannot be
  2610  done, i.e., if |hi_mem_min=lo_mem_max+1|, we have to quit.
  2611  
  2612  @p function get_avail : pointer; {single-word node allocation}
  2613  var p:pointer; {the new node being got}
  2614  begin p:=avail; {get top location in the |avail| stack}
  2615  if p<>null then avail:=link(avail) {and pop it off}
  2616  else if mem_end<mem_max then {or go into virgin territory}
  2617    begin incr(mem_end); p:=mem_end;
  2618    end
  2619  else   begin decr(hi_mem_min); p:=hi_mem_min;
  2620    if hi_mem_min<=lo_mem_max then
  2621      begin runaway; {if memory is exhausted, display possible runaway text}
  2622      overflow("main memory size",mem_max+1-mem_min);
  2623        {quit; all one-word nodes are busy}
  2624  @:TeX capacity exceeded main memory size}{\quad main memory size@>
  2625      end;
  2626    end;
  2627  link(p):=null; {provide an oft-desired initialization of the new node}
  2628  @!stat incr(dyn_used);@+tats@;{maintain statistics}
  2629  get_avail:=p;
  2630  end;
  2631  
  2632  @ Conversely, a one-word node is recycled by calling |free_avail|.
  2633  This routine is part of \TeX's ``inner loop,'' so we want it to be fast.
  2634  @^inner loop@>
  2635  
  2636  @d free_avail(#)== {single-word node liberation}
  2637    begin link(#):=avail; avail:=#;
  2638    @!stat decr(dyn_used);@+tats@/
  2639    end
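
The three-step fallback of |get_avail| (pop the stack, else raise |mem_end|, else lower |hi_mem_min|) and the push done by |free_avail| can be sketched in Python as follows. This is a simplified model, not the program's code: the class name `OneWordPool` is hypothetical, a dictionary stands in for the |link| fields of |mem|, `NULL = 0` stands in for |min_halfword|, and the call to |runaway| before the overflow is omitted.

```python
NULL = 0  # stand-in for min_halfword

class OneWordPool:
    def __init__(self, lo_mem_max, hi_mem_min, mem_end, mem_max):
        self.link = {}                 # sparse stand-in for mem[p].hh.rh
        self.avail = NULL              # head of the AVAIL stack
        self.lo_mem_max = lo_mem_max
        self.hi_mem_min = hi_mem_min
        self.mem_end = mem_end
        self.mem_max = mem_max
        self.dyn_used = 0              # the statistic kept between stat ... tats

    def get_avail(self):
        p = self.avail
        if p != NULL:
            self.avail = self.link[p]          # pop the stack
        elif self.mem_end < self.mem_max:
            self.mem_end += 1                  # go into virgin territory
            p = self.mem_end
        else:
            self.hi_mem_min -= 1               # take a word from the gap
            p = self.hi_mem_min
            if self.hi_mem_min <= self.lo_mem_max:
                raise MemoryError("main memory size")
        self.link[p] = NULL
        self.dyn_used += 1
        return p

    def free_avail(self, p):
        self.link[p] = self.avail              # push p back on the stack
        self.avail = p
        self.dyn_used -= 1
```

With `OneWordPool(10, 13, 14, 15)` the first call returns 15 (from |mem_end|), the second returns 12 (from |hi_mem_min|), and a freed node is handed out again before any fresh location is touched.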
  2640  
  2641  @ There's also a |fast_get_avail| routine, which saves the procedure-call
  2642  overhead at the expense of extra programming. This routine is used in
  2643  the places that would otherwise account for the most calls of |get_avail|.
  2644  @^inner loop@>
  2645  
  2646  @d fast_get_avail(#)==@t@>@;@/
  2647    begin #:=avail; {avoid |get_avail| if possible, to save time}
  2648    if #=null then #:=get_avail
  2649    else  begin avail:=link(#); link(#):=null;
  2650      @!stat incr(dyn_used);@+tats@/
  2651      end;
  2652    end
  2653  
  2654  @ The procedure |flush_list(p)| frees an entire linked list of
  2655  one-word nodes that starts at position |p|.
  2656  @^inner loop@>
  2657  
  2658  @p procedure flush_list(@!p:pointer); {makes list of single-word nodes
  2659    available}
  2660  var @!q,@!r:pointer; {list traversers}
  2661  begin if p<>null then
  2662    begin r:=p;
  2663    repeat q:=r; r:=link(r); @!stat decr(dyn_used);@+tats@/
  2664    until r=null; {now |q| is the last node on the list}
  2665    link(q):=avail; avail:=p;
  2666    end;
  2667  end;
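
The point of |flush_list| is that the whole list is spliced onto the front of the available list at once; the traversal is needed only to find the last node (and to keep the statistics). A minimal Python sketch, assuming a dictionary in place of the |link| fields and `NULL = 0` in place of |min_halfword|:

```python
NULL = 0  # stand-in for min_halfword

def flush_list(link, avail, p):
    """Free the list starting at p; return the new head of the avail list."""
    if p == NULL:
        return avail
    r = p
    while True:
        q = r
        r = link[r]
        if r == NULL:
            break                # now q is the last node on the list
    link[q] = avail              # hang the old avail list behind the freed one
    return p

link = {5: 6, 6: 7, 7: NULL, 9: NULL}   # list 5 -> 6 -> 7, avail list is just 9
avail = flush_list(link, 9, 5)
```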
  2668  
  2669  @ The available-space list that keeps track of the variable-size portion
  2670  of |mem| is a nonempty, doubly-linked circular list of empty nodes,
  2671  pointed to by the roving pointer |rover|.
  2672  
  2673  Each empty node has size 2 or more; the first word contains the special
  2674  value |max_halfword| in its |link| field and the size in its |info| field;
  2675  the second word contains the two pointers for double linking.
  2676  
  2677  Each nonempty node also has size 2 or more. Its first word is of type
  2678  |two_halves|\kern-1pt, and its |link| field is never equal to |max_halfword|.
  2679  Otherwise there is complete flexibility with respect to the contents
  2680  of its other fields and its other words.
  2681  
  2682  (We require |mem_max<max_halfword| because terrible things can happen
  2683  when |max_halfword| appears in the |link| field of a nonempty node.)
  2684  
  2685  @d empty_flag == max_halfword {the |link| of an empty variable-size node}
  2686  @d is_empty(#) == (link(#)=empty_flag) {tests for empty node}
  2687  @d node_size == info {the size field in empty variable-size nodes}
  2688  @d llink(#) == info(#+1) {left link in doubly-linked list of empty nodes}
  2689  @d rlink(#) == link(#+1) {right link in doubly-linked list of empty nodes}
  2690  
  2691  @<Glob...@>=
  2692  @!rover : pointer; {points to some node in the list of empties}
  2693  
  2694  @ A call to |get_node| with argument |s| returns a pointer to a new node
  2695  of size~|s|, which must be 2~or more. The |link| field of the first word
  2696  of this new node is set to null. An overflow stop occurs if no suitable
  2697  space exists.
  2698  
  2699  If |get_node| is called with $s=2^{30}$, it simply merges adjacent free
  2700  areas and returns the value |max_halfword|.
  2701  
  2702  @p function get_node(@!s:integer):pointer; {variable-size node allocation}
  2703  label found,exit,restart;
  2704  var p:pointer; {the node currently under inspection}
  2705  @!q:pointer; {the node physically after node |p|}
  2706  @!r:integer; {the newly allocated node, or a candidate for this honor}
  2707  @!t:integer; {temporary register}
  2708  begin restart: p:=rover; {start at some free node in the ring}
  2709  repeat @<Try to allocate within node |p| and its physical successors,
  2710    and |goto found| if allocation was possible@>;
  2711  @^inner loop@>
  2712  p:=rlink(p); {move to the next node in the ring}
  2713  until p=rover; {repeat until the whole list has been traversed}
  2714  if s=@'10000000000 then
  2715    begin get_node:=max_halfword; return;
  2716    end;
  2717  if lo_mem_max+2<hi_mem_min then if lo_mem_max+2<=mem_bot+max_halfword then
  2718    @<Grow more variable-size memory and |goto restart|@>;
  2719  overflow("main memory size",mem_max+1-mem_min);
  2720    {sorry, nothing satisfactory is left}
  2721  @:TeX capacity exceeded main memory size}{\quad main memory size@>
  2722  found: link(r):=null; {this node is now nonempty}
  2723  @!stat var_used:=var_used+s; {maintain usage statistics}
  2724  tats@;@/
  2725  get_node:=r;
  2726  exit:end;
  2727  
  2728  @ The lower part of |mem| grows by 1000 words at a time, unless
  2729  we are very close to going under. When it grows, we simply link
  2730  a new node into the available-space list. This method of controlled
  2731  growth helps to keep the |mem| usage consecutive when \TeX\ is
  2732  implemented on ``virtual memory'' systems.
  2733  @^virtual memory@>
  2734  
  2735  @<Grow more variable-size memory and |goto restart|@>=
  2736  begin if hi_mem_min-lo_mem_max>=1998 then t:=lo_mem_max+1000
  2737  else t:=lo_mem_max+1+(hi_mem_min-lo_mem_max) div 2;
  2738    {|lo_mem_max+2<=t<hi_mem_min|}
  2739  p:=llink(rover); q:=lo_mem_max; rlink(p):=q; llink(rover):=q;@/
  2740  if t>mem_bot+max_halfword then t:=mem_bot+max_halfword;
  2741  rlink(q):=rover; llink(q):=p; link(q):=empty_flag; node_size(q):=t-lo_mem_max;@/
  2742  lo_mem_max:=t; link(lo_mem_max):=null; info(lo_mem_max):=null;
  2743  rover:=q; goto restart;
  2744  end
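
The choice of the new |lo_mem_max| is plain integer arithmetic, so it can be checked in isolation. The Python sketch below mirrors the two assignments to |t| and the clamp; the function name `grow_target` is hypothetical, and the caller is assumed to have verified |lo_mem_max+2<hi_mem_min| already, as |get_node| does.

```python
def grow_target(lo_mem_max, hi_mem_min, mem_bot, max_halfword):
    """Return the value t that becomes the new lo_mem_max."""
    if hi_mem_min - lo_mem_max >= 1998:
        t = lo_mem_max + 1000                       # the normal 1000-word step
    else:
        # close to going under: take about half of what is left
        t = lo_mem_max + 1 + (hi_mem_min - lo_mem_max) // 2
    return min(t, mem_bot + max_halfword)           # node sizes must fit a halfword

far = grow_target(5000, 9000, 0, 65535)
near = grow_target(5000, 5100, 0, 65535)
tight = grow_target(5000, 5003, 0, 65535)
```

In the tightest admissible case, where |hi_mem_min=lo_mem_max+3|, the result is |lo_mem_max+2|, so the new empty node still has the minimum size of two words.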
  2745  
  2746  @ Empirical tests show that the routine in this section performs a
  2747  node-merging operation about 0.75 times per allocation, on the average,
  2748  after which it finds that |r>p+1| about 95\pct! of the time.
  2749  
  2750  @<Try to allocate...@>=
  2751  q:=p+node_size(p); {find the physical successor}
  2752  @^inner loop@>
  2753  while is_empty(q) do {merge node |p| with node |q|}
  2754    begin t:=rlink(q);
  2755    if q=rover then rover:=t;
  2756    llink(t):=llink(q); rlink(llink(q)):=t;@/
  2757    q:=q+node_size(q);
  2758    end;
  2759  r:=q-s;
  2760  if r>p+1 then @<Allocate from the top of node |p| and |goto found|@>;
  2761  if r=p then if rlink(p)<>p then
  2762    @<Allocate entire node |p| and |goto found|@>;
  2763  node_size(p):=q-p {reset the size in case it grew}
  2764  
  2765  @ @<Allocate from the top...@>=
  2766  begin node_size(p):=r-p; {store the remaining size}
  2767  @^inner loop@>
  2768  rover:=p; {start searching here next time}
  2769  goto found;
  2770  end
  2771  
  2772  @ Here we delete node |p| from the ring, and let |rover| rove around.
  2773  
  2774  @<Allocate entire...@>=
  2775  begin rover:=rlink(p); t:=llink(p);
  2776  llink(rover):=t; rlink(t):=rover;
  2777  goto found;
  2778  end
  2779  
  2780  @ Conversely, when some variable-size node |p| of size |s| is no longer needed,
  2781  the operation |free_node(p,s)| will make its words available, by inserting
  2782  |p| as a new empty node just before where |rover| now points.
  2783  @^inner loop@>
  2784  
  2785  @p procedure free_node(@!p:pointer; @!s:halfword); {variable-size node
  2786    liberation}
  2787  var q:pointer; {|llink(rover)|}
  2788  begin node_size(p):=s; link(p):=empty_flag;
  2789  q:=llink(rover); llink(p):=q; rlink(p):=rover; {set both links}
  2790  llink(rover):=p; rlink(q):=p; {insert |p| into the ring}
  2791  @!stat var_used:=var_used-s;@+tats@;{maintain statistics}
  2792  end;
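
The interplay of |get_node| and |free_node| may be easier to follow in a small model. The Python sketch below keeps the ring of empty nodes, the merging of physically adjacent empties, allocation from the top of a node, and the rule that the last empty node is never removed from the ring. It is a simplification under stated assumptions: two lists stand in for the |link| and |info| halves of |mem|, `NULL = 0` and `EMPTY = 0xFFFF` stand in for |min_halfword| and |max_halfword|, the class name `VarPool` is hypothetical, and the growth step, the $s=2^{30}$ case, and the statistics are left out.

```python
NULL, EMPTY = 0, 0xFFFF   # stand-ins for min_halfword and max_halfword

class VarPool:
    def __init__(self, start, size):
        n = start + size + 1
        self.link = [NULL] * n           # mem[p].hh.rh
        self.info = [NULL] * n           # mem[p].hh.lh
        self.rover = start
        self.link[start] = EMPTY         # one big empty node ...
        self.info[start] = size          # ... with its size in the info field
        self.info[start + 1] = start     # llink(p) is info(p+1)
        self.link[start + 1] = start     # rlink(p) is link(p+1)
        # word start+size keeps link=NULL: a nonempty sentinel that stops merging

    def get_node(self, s):
        link, info = self.link, self.info
        p = self.rover
        while True:
            q = p + info[p]                      # physical successor of p
            while link[q] == EMPTY:              # merge node p with node q
                t = link[q + 1]
                if q == self.rover:
                    self.rover = t
                info[t + 1] = info[q + 1]        # llink(t) := llink(q)
                link[info[q + 1] + 1] = t        # rlink(llink(q)) := t
                q += info[q]
            r = q - s
            if r > p + 1:                        # allocate from the top of p
                info[p] = r - p
                self.rover = p
                break
            if r == p and link[p + 1] != p:      # allocate all of p
                self.rover = link[p + 1]
                t = info[p + 1]
                info[self.rover + 1] = t
                link[t + 1] = self.rover
                break
            info[p] = q - p                      # reset the size in case it grew
            p = link[p + 1]
            if p == self.rover:
                raise MemoryError("main memory size")
        link[r] = NULL                           # this node is now nonempty
        return r

    def free_node(self, p, s):
        link, info = self.link, self.info
        info[p] = s
        link[p] = EMPTY
        q = info[self.rover + 1]                 # llink(rover)
        info[p + 1] = q
        link[p + 1] = self.rover
        info[self.rover + 1] = p
        link[q + 1] = p
```

Starting from `VarPool(1, 20)`, requests of sizes 4, 6, and 3 return 17, 11, and 8, because nodes are carved from the top of the empty node. After the nodes at 11 and 8 are freed, a request of size 12 succeeds at location 5 only because the freed neighbors are first merged back into the empty node at location 1.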
  2793  
  2794  @ Just before \.{INITEX} writes out the memory, it sorts the doubly linked
  2795  available space list. The list is probably very short at such times, so a
  2796  simple insertion sort is used. The smallest available location will be
  2797  pointed to by |rover|, the next-smallest by |rlink(rover)|, etc.
  2798  
  2799  @p @!init procedure sort_avail; {sorts the available variable-size nodes
  2800    by location}
  2801  var p,@!q,@!r: pointer; {indices into |mem|}
  2802  @!old_rover:pointer; {initial |rover| setting}
  2803  begin p:=get_node(@'10000000000); {merge adjacent free areas}
  2804  p:=rlink(rover); rlink(rover):=max_halfword; old_rover:=rover;
  2805  while p<>old_rover do @<Sort \(p)|p| into the list starting at |rover|
  2806    and advance |p| to |rlink(p)|@>;
  2807  p:=rover;
  2808  while rlink(p)<>max_halfword do
  2809    begin llink(rlink(p)):=p; p:=rlink(p);
  2810    end;
  2811  rlink(p):=rover; llink(rover):=p;
  2812  end;
  2813  tini
  2814  
  2815  @ The following |while| loop is guaranteed to
  2816  terminate, since the list that starts at
  2817  |rover| ends with |max_halfword| during the sorting procedure.
  2818  
  2819  @<Sort \(p)|p|...@>=
  2820  if p<rover then
  2821    begin q:=p; p:=rlink(q); rlink(q):=rover; rover:=q;
  2822    end
  2823  else  begin q:=rover;
  2824    while rlink(q)<p do q:=rlink(q);
  2825    r:=rlink(p); rlink(p):=rlink(q); rlink(q):=p; p:=r;
  2826    end
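
The sorting step can be tried out on a toy ring. This Python sketch follows the same plan: cut the ring open with a terminator larger than any address, insert each remaining node into the sorted singly-linked list, then rebuild the left links and close the ring. Dictionaries stand in for |rlink| and |llink|, `END = 0xFFFF` stands in for |max_halfword|, and the preliminary merging call to |get_node| is omitted; these are assumptions of the sketch.

```python
END = 0xFFFF  # stand-in for max_halfword, used as a temporary terminator

def sort_avail(rlink, llink, rover):
    """Sort the ring by location; return the new rover (smallest location)."""
    p = rlink[rover]
    rlink[rover] = END                 # the sorted list starts as just rover
    old_rover = rover
    while p != old_rover:
        if p < rover:                  # new smallest: put it in front
            q = p
            p = rlink[q]
            rlink[q] = rover
            rover = q
        else:                          # walk to the insertion point
            q = rover
            while rlink[q] < p:        # terminates because END exceeds any p
                q = rlink[q]
            r = rlink[p]
            rlink[p] = rlink[q]
            rlink[q] = p
            p = r
    p = rover
    while rlink[p] != END:             # rebuild the left links
        llink[rlink[p]] = p
        p = rlink[p]
    rlink[p] = rover                   # close the ring again
    llink[rover] = p
    return rover

rlink = {30: 10, 10: 40, 40: 20, 20: 30}   # ring 30 -> 10 -> 40 -> 20 -> 30
llink = {}
rover = sort_avail(rlink, llink, 30)
```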
  2827  
  2828  @* \[10] Data structures for boxes and their friends.
  2829  From the computer's standpoint, \TeX's chief mission is to create
  2830  horizontal and vertical lists. We shall now investigate how the elements
  2831  of these lists are represented internally as nodes in the dynamic memory.
  2832  
  2833  A horizontal or vertical list is linked together by |link| fields in
  2834  the first word of each node. Individual nodes represent boxes, glue,
  2835  penalties, or special things like discretionary hyphens; because of this
  2836  variety, some nodes are longer than others, and we must distinguish different
  2837  kinds of nodes. We do this by putting a `|type|' field in the first word,
  2838  together with the link and an optional `|subtype|'.
  2839  
  2840  @d type(#) == mem[#].hh.b0 {identifies what kind of node this is}
  2841  @d subtype(#) == mem[#].hh.b1 {secondary identification in some cases}
  2842  
  2843  @ A |@!char_node|, which represents a single character, is the most important
  2844  kind of node because it accounts for the vast majority of all boxes.
  2845  Special precautions are therefore taken to ensure that a |char_node| does
  2846  not take up much memory space. Every such node is one word long, and in fact
  2847  it is identifiable by this property, since other kinds of nodes have at least
  2848  two words, and they appear in |mem| locations less than |hi_mem_min|.
  2849  This makes it possible to omit the |type| field in a |char_node|, leaving
  2850  us room for two bytes that identify a |font| and a |character| within
  2851  that font.
  2852  
  2853  Note that the format of a |char_node| allows for up to 256 different
  2854  fonts and up to 256 characters per font; but most implementations will
  2855  probably limit the total number of fonts to fewer than 75 per job,
  2856  and most fonts will stick to characters whose codes are
  2857  less than 128 (since higher codes
  2858  are more difficult to access on most keyboards).
  2859  
  2860  Extensions of \TeX\ intended for oriental languages will need even more
  2861  than $256\times256$ possible characters, when we consider different sizes
  2862  @^oriental characters@>@^Chinese characters@>@^Japanese characters@>
  2863  and styles of type.  It is suggested that Chinese and Japanese fonts be
  2864  handled by representing such characters in two consecutive |char_node|
  2865  entries: The first of these has |font=font_base|, and its |link| points
  2866  to the second;
  2867  the second identifies the font and the character dimensions.
  2868  The saving feature about oriental characters is that most of them have
  2869  the same box dimensions. The |character| field of the first |char_node|
  2870  is a ``\\{charext}'' that distinguishes between graphic symbols whose
  2871  dimensions are identical for typesetting purposes. (See the \MF\ manual.)
  2872  Such an extension of \TeX\ would not be difficult; further details are
  2873  left to the reader.
  2874  
  2875  In order to make sure that the |character| code fits in a quarterword,
  2876  \TeX\ adds the quantity |min_quarterword| to the actual code.
  2877  
  2878  Character nodes appear only in horizontal lists, never in vertical lists.
  2879  
  2880  @d is_char_node(#) == (#>=hi_mem_min)
  2881    {does the argument point to a |char_node|?}
  2882  @d font == type {the font code in a |char_node|}
  2883  @d character == subtype {the character code in a |char_node|}
  2884  
  2885  @ An |hlist_node| stands for a box that was made from a horizontal list.
  2886  Each |hlist_node| is seven words long, and contains the following fields
  2887  (in addition to the mandatory |type| and |link|, which we shall not
  2888  mention explicitly when discussing the other node types): The |height| and
  2889  |width| and |depth| are scaled integers denoting the dimensions of the
  2890  box.  There is also a |shift_amount| field, a scaled integer indicating
  2891  how much this box should be lowered (if it appears in a horizontal list),
  2892  or how much it should be moved to the right (if it appears in a vertical
  2893  list). There is a |list_ptr| field, which points to the beginning of the
  2894  list from which this box was fabricated; if |list_ptr| is |null|, the box
  2895  is empty. Finally, there are three fields that represent the setting of
  2896  the glue:  |glue_set(p)| is a word of type |glue_ratio| that represents
  2897  the proportionality constant for glue setting; |glue_sign(p)| is
  2898  |stretching| or |shrinking| or |normal| depending on whether or not the
  2899  glue should stretch or shrink or remain rigid; and |glue_order(p)|
  2900  specifies the order of infinity to which glue setting applies (|normal|,
  2901  |fil|, |fill|, or |filll|). The |subtype| field is not used.
  2902  
  2903  @d hlist_node=0 {|type| of hlist nodes}
  2904  @d box_node_size=7 {number of words to allocate for a box node}
  2905  @d width_offset=1 {position of |width| field in a box node}
  2906  @d depth_offset=2 {position of |depth| field in a box node}
  2907  @d height_offset=3 {position of |height| field in a box node}
  2908  @d width(#) == mem[#+width_offset].sc {width of the box, in sp}
  2909  @d depth(#) == mem[#+depth_offset].sc {depth of the box, in sp}
  2910  @d height(#) == mem[#+height_offset].sc {height of the box, in sp}
  2911  @d shift_amount(#) == mem[#+4].sc {repositioning distance, in sp}
  2912  @d list_offset=5 {position of |list_ptr| field in a box node}
  2913  @d list_ptr(#) == link(#+list_offset) {beginning of the list inside the box}
  2914  @d glue_order(#) == subtype(#+list_offset) {applicable order of infinity}
  2915  @d glue_sign(#) == type(#+list_offset) {stretching or shrinking}
  2916  @d normal=0 {the most common case when several cases are named}
  2917  @d stretching = 1 {glue setting applies to the stretch components}
  2918  @d shrinking = 2 {glue setting applies to the shrink components}
  2919  @d glue_offset = 6 {position of |glue_set| in a box node}
  2920  @d glue_set(#) == mem[#+glue_offset].gr
  2921    {a word of type |glue_ratio| for glue setting}
  2922  
  2923  @ The |new_null_box| function returns a pointer to an |hlist_node| in
  2924  which all subfields have the values corresponding to `\.{\\hbox\{\}}'.
  2925  (The |subtype| field is set to |min_quarterword|, for historic reasons
  2926  that are no longer relevant.)
  2927  
  2928  @p function new_null_box:pointer; {creates a new box node}
  2929  var p:pointer; {the new node}
  2930  begin p:=get_node(box_node_size); type(p):=hlist_node;
  2931  subtype(p):=min_quarterword;
  2932  width(p):=0; depth(p):=0; height(p):=0; shift_amount(p):=0; list_ptr(p):=null;
  2933  glue_sign(p):=normal; glue_order(p):=normal; set_glue_ratio_zero(glue_set(p));
  2934  new_null_box:=p;
  2935  end;
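
The macro definitions above fix a seven-word layout, which the following Python sketch spells out word by word. It is an illustration only: a dictionary of small dictionaries stands in for |mem|, the location |p| is passed in instead of coming from |get_node|, |null| and |min_quarterword| are both taken to be 0, and the name `SHIFT_OFFSET` is hypothetical (the source writes the constant 4 directly).

```python
HLIST_NODE, BOX_NODE_SIZE = 0, 7
WIDTH_OFFSET, DEPTH_OFFSET, HEIGHT_OFFSET = 1, 2, 3
SHIFT_OFFSET = 4                      # hypothetical name for the literal 4
LIST_OFFSET, GLUE_OFFSET = 5, 6
NORMAL, STRETCHING, SHRINKING = 0, 1, 2

def new_null_box(mem, p):
    """Fill the seven words at mem[p..p+6] with the values of an empty hbox."""
    mem[p] = {"type": HLIST_NODE, "subtype": 0, "link": 0}
    for off in (WIDTH_OFFSET, DEPTH_OFFSET, HEIGHT_OFFSET, SHIFT_OFFSET):
        mem[p + off] = {"sc": 0}      # four scaled dimensions, one per word
    # word 5 packs three fields: glue_sign (b0), glue_order (b1), list_ptr (rh)
    mem[p + LIST_OFFSET] = {"glue_sign": NORMAL, "glue_order": NORMAL,
                            "list_ptr": 0}
    mem[p + GLUE_OFFSET] = {"gr": 0.0}   # the glue_ratio word
    return p

mem = {}
box = new_null_box(mem, 100)
```

Note that |glue_sign| and |glue_order| reuse the |type| and |subtype| positions of word 5, and |list_ptr| reuses its |link| position, so no extra word is needed for them.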
  2936  
  2937  @ A |vlist_node| is like an |hlist_node| in all respects except that it
  2938  contains a vertical list.
  2939  
  2940  @d vlist_node=1 {|type| of vlist nodes}
  2941  
  2942  @ A |rule_node| stands for a solid black rectangle; it has |width|,
  2943  |depth|, and |height| fields just as in an |hlist_node|. However, if
  2944  any of these dimensions is $-2^{30}$, the actual value will be determined
  2945  by running the rule up to the boundary of the innermost enclosing box.
  2946  This is called a ``running dimension.'' The |width| is never running in
  2947  an hlist; the |height| and |depth| are never running in a~vlist.
  2948  
  2949  @d rule_node=2 {|type| of rule nodes}
  2950  @d rule_node_size=4 {number of words to allocate for a rule node}
  2951  @d null_flag==-@'10000000000 {$-2^{30}$, signifies a missing item}
  2952  @d is_running(#) == (#=null_flag) {tests for a running dimension}
  2953  
  2954  @ A new rule node is delivered by the |new_rule| function. It
  2955  makes all the dimensions ``running,'' so you have to change the
  2956  ones that are not allowed to run.
  2957  
  2958  @p function new_rule:pointer;
  2959  var p:pointer; {the new node}
  2960  begin p:=get_node(rule_node_size); type(p):=rule_node;
  2961  subtype(p):=0; {the |subtype| is not used}
  2962  width(p):=null_flag; depth(p):=null_flag; height(p):=null_flag;
  2963  new_rule:=p;
  2964  end;
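
A running dimension is just the sentinel value $-2^{30}$, to be replaced later by the corresponding dimension of the enclosing box. The Python sketch below shows the test and the replacement for a rule in an hlist, where only |height| and |depth| may run. The tuple representation and the function name `resolve_in_hlist` are assumptions of this sketch.

```python
NULL_FLAG = -int("10000000000", 8)    # the octal constant @'10000000000, negated

def is_running(d):
    return d == NULL_FLAG

def resolve_in_hlist(rule, box):
    """rule and box are (width, height, depth) triples in scaled points."""
    w, h, d = rule
    _, box_h, box_d = box
    return (w,
            box_h if is_running(h) else h,
            box_d if is_running(d) else d)

# A rule with explicit width and depth but running height, inside a box.
resolved = resolve_in_hlist((5, NULL_FLAG, 2), (100, 30, 10))
```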
  2965  
  2966  @ Insertions are represented by |ins_node| records, where the |subtype|
  2967  indicates the corresponding box number. For example, `\.{\\insert 250}'
  2968  leads to an |ins_node| whose |subtype| is |250+min_quarterword|.
  2969  The |height| field of an |ins_node| is slightly misnamed; it actually holds
  2970  the natural height plus depth of the vertical list being inserted.
  2971  The |depth| field holds the |split_max_depth| to be used in case this
  2972  insertion is split, and the |split_top_ptr| points to the corresponding
  2973  |split_top_skip|. The |float_cost| field holds the |floating_penalty| that
  2974  will be used if this insertion floats to a subsequent page after a
  2975  split insertion of the same class.  There is one more field, the
  2976  |ins_ptr|, which points to the beginning of the vlist for the insertion.
  2977  
  2978  @d ins_node=3 {|type| of insertion nodes}
  2979  @d ins_node_size=5 {number of words to allocate for an insertion}
  2980  @d float_cost(#)==mem[#+1].int {the |floating_penalty| to be used}
  2981  @d ins_ptr(#)==info(#+4) {the vertical list to be inserted}
  2982  @d split_top_ptr(#)==link(#+4) {the |split_top_skip| to be used}
  2983  
  2984  @ A |mark_node| has a |mark_ptr| field that points to the reference count
  2985  of a token list that contains the user's \.{\\mark} text.
  2986  This field occupies a full word instead of a halfword, because
  2987  there's nothing to put in the other halfword; it is easier in \PASCAL\ to
  2988  use the full word than to risk leaving garbage in the unused half.
  2989  
  2990  @d mark_node=4 {|type| of a mark node}
  2991  @d small_node_size=2 {number of words to allocate for most node types}
  2992  @d mark_ptr(#)==mem[#+1].int {head of the token list for a mark}
  2993  
  2994  @ An |adjust_node|, which occurs only in horizontal lists,
  2995  specifies material that will be moved out into the surrounding
  2996  vertical list; i.e., it is used to implement \TeX's `\.{\\vadjust}'
  2997  operation.  The |adjust_ptr| field points to the vlist containing this
  2998  material.
  2999  
  3000  @d adjust_node=5 {|type| of an adjust node}
  3001  @d adjust_ptr==mark_ptr {vertical list to be moved out of horizontal list}
  3002  
  3003  @ A |ligature_node|, which occurs only in horizontal lists, specifies
  3004  a character that was fabricated from the interaction of two or more
  3005  actual characters.  The second word of the node, which is called the
  3006  |lig_char| word, contains |font| and |character| fields just as in a
  3007  |char_node|. The characters that generated the ligature have not been
  3008  forgotten, since they are needed for diagnostic messages and for
  3009  hyphenation; the |lig_ptr| field points to a linked list of character
  3010  nodes for all original characters that have been deleted. (This list
  3011  might be empty if the characters that generated the ligature were
  3012  retained in other nodes.)
  3013  
  3014  The |subtype| field is 0, plus 2 and/or 1 if the original source of the
  3015  ligature included implicit left and/or right boundaries.
  3016  
  3017  @d ligature_node=6 {|type| of a ligature node}
  3018  @d lig_char(#)==#+1 {the word where the ligature is to be found}
  3019  @d lig_ptr(#)==link(lig_char(#)) {the list of characters}
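
The |subtype| of a ligature node is thus a two-bit code: 2 records an implicit left boundary and 1 an implicit right boundary. A small Python sketch of the encoding and its inverse, with hypothetical function names:

```python
def lig_subtype(left_boundary, right_boundary):
    """Encode the boundary flags as described: 0, plus 2 and/or 1."""
    return (2 if left_boundary else 0) + (1 if right_boundary else 0)

def lig_boundaries(subtype):
    """Decode a subtype back into (left_boundary, right_boundary)."""
    return (subtype > 1, subtype % 2 == 1)

codes = [lig_subtype(l, r) for l in (False, True) for r in (False, True)]
```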
  3020  
  3021  @ The |new_ligature| function creates a ligature node having given
  3022  contents of the |font|, |character|, and |lig_ptr| fields. We also have
  3023  a |new_lig_item| function, which returns a two-word node having a given
  3024  |character| field. Such nodes are used for temporary processing as ligatures
  3025  are being created.
  3026  
  3027  @p function new_ligature(@!f,@!c:quarterword; @!q:pointer):pointer;
  3028  var p:pointer; {the new node}
  3029  begin p:=get_node(small_node_size); type(p):=ligature_node;
  3030  font(lig_char(p)):=f; character(lig_char(p)):=c; lig_ptr(p):=q;
  3031  subtype(p):=0; new_ligature:=p;
  3032  end;
  3033  @#
  3034  function new_lig_item(@!c:quarterword):pointer;
  3035  var p:pointer; {the new node}
  3036  begin p:=get_node(small_node_size); character(p):=c; lig_ptr(p):=null;
  3037  new_lig_item:=p;
  3038  end;
  3039  
  3040  @ A |disc_node|, which occurs only in horizontal lists, specifies a
  3041  ``dis\-cretion\-ary'' line break. If such a break occurs at node |p|, the text
  3042  that starts at |pre_break(p)| will precede the break, the text that starts at
  3043  |post_break(p)| will follow the break, and text that appears in the next
  3044  |replace_count(p)| nodes will be ignored. For example, an ordinary
  3045  discretionary hyphen, indicated by `\.{\\-}', yields a |disc_node| with
  3046  |pre_break| pointing to a |char_node| containing a hyphen, |post_break=null|,
  3047  and |replace_count=0|. All three of the discretionary texts must be
  3048  lists that consist entirely of character, kern, box, rule, and ligature nodes.
  3049  
  3050  If |pre_break(p)=null|, the |ex_hyphen_penalty| will be charged for this
  3051  break.  Otherwise the |hyphen_penalty| will be charged.  The texts will
  3052  actually be substituted into the list by the line-breaking algorithm if it
  3053  decides to make the break, and the discretionary node will disappear at
  3054  that time; thus, the output routine sees only discretionaries that were
  3055  not chosen.
  3056  
  3057  @d disc_node=7 {|type| of a discretionary node}
  3058  @d replace_count==subtype {how many subsequent nodes to replace}
  3059  @d pre_break==llink {text that precedes a discretionary break}
  3060  @d post_break==rlink {text that follows a discretionary break}
  3061  
  3062  @p function new_disc:pointer; {creates an empty |disc_node|}
  3063  var p:pointer; {the new node}
  3064  begin p:=get_node(small_node_size); type(p):=disc_node;
  3065  replace_count(p):=0; pre_break(p):=null; post_break(p):=null;
  3066  new_disc:=p;
  3067  end;
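
The effect of taking a discretionary break can be sketched on ordinary Python lists. The model below is deliberately loose: list items stand in for nodes, a dictionary stands in for the |disc_node|, and the function name `break_at_disc` is hypothetical. The example corresponds to a discretionary of the form `\discretionary{f-}{fi}{ffi}` inside the word ``difficult''.

```python
def break_at_disc(nodes, i):
    """Split nodes at the discretionary nodes[i], as if the break were taken."""
    disc = nodes[i]
    before = nodes[:i] + disc["pre_break"]             # pre-break text ends the line
    skipped = i + 1 + disc["replace_count"]            # these nodes are ignored
    after = disc["post_break"] + nodes[skipped:]       # post-break text starts the next
    return before, after

disc = {"pre_break": ["f", "-"], "post_break": ["fi"], "replace_count": 1}
nodes = ["d", "i", disc, "ffi", "c", "u", "l", "t"]
line1, line2 = break_at_disc(nodes, 2)
```

If the break is not taken, the list is left as it is, and the single node counted by |replace_count| (the `ffi` ligature here) is typeset normally.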
  3068  
  3069  @ A |whatsit_node| is a wild card reserved for extensions to \TeX. The
  3070  |subtype| field in its first word says what `\\{whatsit}' it is, and
  3071  implicitly determines the node size (which must be 2 or more) and the
  3072  format of the remaining words. When a |whatsit_node| is encountered
  3073  in a list, special actions are invoked; knowledgeable people who are
  3074  careful not to mess up the rest of \TeX\ are able to make \TeX\ do new
  3075  things by adding code at the end of the program. For example, there
  3076  might be a `\TeX nicolor' extension to specify different colors of ink,
  3077  @^extensions to \TeX@>
  3078  and the whatsit node might contain the desired parameters.
  3079  
  3080  The present implementation of \TeX\ treats the features associated with
  3081  `\.{\\write}' and `\.{\\special}' as if they were extensions, in order to
  3082  illustrate how such routines might be coded. We shall defer further
  3083  discussion of extensions until the end of this program.
  3084  
  3085  @d whatsit_node=8 {|type| of special extension nodes}
  3086  
  3087  @ A |math_node|, which occurs only in horizontal lists, appears before and
  3088  after mathematical formulas. The |subtype| field is |before| before the
  3089  formula and |after| after it. There is a |width| field, which represents
  3090  the amount of surrounding space inserted by \.{\\mathsurround}.
  3091  
  3092  @d math_node=9 {|type| of a math node}
  3093  @d before=0 {|subtype| for math node that introduces a formula}
  3094  @d after=1 {|subtype| for math node that winds up a formula}
  3095  
  3096  @p function new_math(@!w:scaled;@!s:small_number):pointer;
  3097  var p:pointer; {the new node}
  3098  begin p:=get_node(small_node_size); type(p):=math_node;
  3099  subtype(p):=s; width(p):=w; new_math:=p;
  3100  end;
  3101  
  3102  @ \TeX\ makes use of the fact that |hlist_node|, |vlist_node|,
  3103  |rule_node|, |ins_node|, |mark_node|, |adjust_node|, |ligature_node|,
  3104  |disc_node|, |whatsit_node|, and |math_node| are at the low end of the
  3105  type codes, by permitting a break at glue in a list if and only if the
  3106  |type| of the previous node is less than |math_node|. Furthermore, a
  3107  node is discarded after a break if its type is |math_node| or~more.
  3108  
  3109  @d precedes_break(#)==(type(#)<math_node)
  3110  @d non_discardable(#)==(type(#)<math_node)
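
Because the type codes are small consecutive integers, both tests reduce to one comparison against |math_node|. A Python sketch using the type codes defined so far (the upper-case names are simply transliterations of the constants above):

```python
(HLIST, VLIST, RULE, INS, MARK, ADJUST,
 LIGATURE, DISC, WHATSIT, MATH, GLUE) = range(11)

def precedes_break(node_type):
    """May glue that follows a node of this type be a breakpoint?"""
    return node_type < MATH

def non_discardable(node_type):
    """Does a node of this type survive after a break?"""
    return node_type < MATH

kept = [t for t in range(11) if non_discardable(t)]
```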
  3111  
  3112  @ A |glue_node| represents glue in a list. However, it is really only
  3113  a pointer to a separate glue specification, since \TeX\ makes use of the
  3114  fact that many essentially identical nodes of glue are usually present.
  3115  If |p| points to a |glue_node|, |glue_ptr(p)| points to
  3116  another packet of words that specify the stretch and shrink components, etc.
  3117  
  3118  Glue nodes also serve to represent leaders; the |subtype| is used to
  3119  distinguish between ordinary glue (which is called |normal|) and the three
  3120  kinds of leaders (which are called |a_leaders|, |c_leaders|, and |x_leaders|).
  3121  The |leader_ptr| field points to a rule node or to a box node containing the
  3122  leaders; it is set to |null| in ordinary glue nodes.
  3123  
  3124  Many kinds of glue are computed from \TeX's ``skip'' parameters, and
  3125  it is helpful to know which parameter has led to a particular glue node.
  3126  Therefore the |subtype| is set to indicate the source of glue, whenever
  3127  it originated as a parameter. We will be defining symbolic names for the
  3128  parameter numbers later (e.g., |line_skip_code=0|, |baseline_skip_code=1|,
  3129  etc.); it suffices for now to say that the |subtype| of parametric glue
  3130  will be the same as the parameter number, plus~one.
  3131  
  3132  In math formulas there are two more possibilities for the |subtype| in a
  3133  glue node: |mu_glue| denotes an \.{\\mskip} (where the units are scaled \.{mu}
  3134  instead of scaled \.{pt}); and |cond_math_glue| denotes the `\.{\\nonscript}'
  3135  feature that cancels the glue node immediately following if it appears
  3136  in a subscript.
  3137  
  3138  @d glue_node=10 {|type| of node that points to a glue specification}
  3139  @d cond_math_glue=98 {special |subtype| to suppress glue in the next node}
  3140  @d mu_glue=99 {|subtype| for math glue}
  3141  @d a_leaders=100 {|subtype| for aligned leaders}
  3142  @d c_leaders=101 {|subtype| for centered leaders}
  3143  @d x_leaders=102 {|subtype| for expanded leaders}
  3144  @d glue_ptr==llink {pointer to a glue specification}
  3145  @d leader_ptr==rlink {pointer to box or rule node for leaders}
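
Illustration only (Python, not part of the WEB source): a small decoder for the |subtype| conventions just described. The function name and the returned strings are hypothetical; the numeric codes are the ones defined above, and |normal| is taken to be 0 as elsewhere in the program.

```python
NORMAL = 0
COND_MATH_GLUE, MU_GLUE = 98, 99
A_LEADERS, C_LEADERS, X_LEADERS = 100, 101, 102

def describe_glue_subtype(subtype):
    """Classify the subtype field of a glue node."""
    if subtype == NORMAL:
        return "ordinary glue"
    if subtype < COND_MATH_GLUE:
        # Parametric glue stores the parameter number plus one.
        return "parameter glue, parameter number %d" % (subtype - 1)
    if subtype == COND_MATH_GLUE:
        return "nonscript"
    if subtype == MU_GLUE:
        return "mskip"
    leaders = {A_LEADERS: "aligned leaders",
               C_LEADERS: "centered leaders",
               X_LEADERS: "expanded leaders"}
    return leaders.get(subtype, "unknown subtype")
```

For example, a glue node made from \.{\\lineskip} has subtype 1, since |line_skip_code=0|.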
  3146  
  3147  @ A glue specification has a halfword reference count in its first word,
  3148  @^reference counts@>
  3149  representing |null| plus the number of glue nodes that point to it (less one).
  3150  Note that the reference count appears in the same position as
  3151  the |link| field in list nodes; this is the field that is initialized
  3152  to |null| when a node is allocated, and it is also the field that is flagged
  3153  by |empty_flag| in empty nodes.
  3154  
  3155  Glue specifications also contain three |scaled| fields, for the |width|,
  3156  |stretch|, and |shrink| dimensions. Finally, there are two one-byte
  3157  fields called |stretch_order| and |shrink_order|; these contain the
  3158  orders of infinity (|normal|, |fil|, |fill|, or |filll|)
  3159  corresponding to the stretch and shrink values.
  3160  
  3161  @d glue_spec_size=4 {number of words to allocate for a glue specification}
  3162  @d glue_ref_count(#) == link(#) {reference count of a glue specification}
  3163  @d stretch(#) == mem[#+2].sc {the stretchability of this glob of glue}
  3164  @d shrink(#) == mem[#+3].sc {the shrinkability of this glob of glue}
  3165  @d stretch_order == type {order of infinity for stretching}
  3166  @d shrink_order == subtype {order of infinity for shrinking}
  3167  @d fil=1 {first-order infinity}
  3168  @d fill=2 {second-order infinity}
  3169  @d filll=3 {third-order infinity}
  3170  
  3171  @<Types...@>=
  3172  @!glue_ord=normal..filll; {infinity to the 0, 1, 2, or 3 power}
  3173  
  3174  @ Here is a function that returns a pointer to a copy of a glue spec.
  3175  The reference count in the copy is |null|, because there is assumed
  3176  to be exactly one reference to the new specification.
  3177  
  3178  @p function new_spec(@!p:pointer):pointer; {duplicates a glue specification}
  3179  var q:pointer; {the new spec}
  3180  begin q:=get_node(glue_spec_size);@/
  3181  mem[q]:=mem[p]; glue_ref_count(q):=null;@/
  3182  width(q):=width(p); stretch(q):=stretch(p); shrink(q):=shrink(p);
  3183  new_spec:=q;
  3184  end;
  3185  
  3186  @ And here's a function that creates a glue node for a given parameter
  3187  identified by its code number; for example,
  3188  |new_param_glue(line_skip_code)| returns a pointer to a glue node for the
  3189  current \.{\\lineskip}.
  3190  
  3191  @p function new_param_glue(@!n:small_number):pointer;
  3192  var p:pointer; {the new node}
  3193  @!q:pointer; {the glue specification}
  3194  begin p:=get_node(small_node_size); type(p):=glue_node; subtype(p):=n+1;
  3195  leader_ptr(p):=null;@/
  3196  q:=@<Current |mem| equivalent of glue parameter number |n|@>@t@>;
  3197  glue_ptr(p):=q; incr(glue_ref_count(q));
  3198  new_param_glue:=p;
  3199  end;
  3200  
  3201  @ Glue nodes that are more or less anonymous are created by |new_glue|,
  3202  whose argument points to a glue specification.
  3203  
  3204  @p function new_glue(@!q:pointer):pointer;
  3205  var p:pointer; {the new node}
  3206  begin p:=get_node(small_node_size); type(p):=glue_node; subtype(p):=normal;
  3207  leader_ptr(p):=null; glue_ptr(p):=q; incr(glue_ref_count(q));
  3208  new_glue:=p;
  3209  end;
  3210  
  3211  @ Still another subroutine is needed: This one is sort of a combination
  3212  of |new_param_glue| and |new_glue|. It creates a glue node for one of
  3213  the current glue parameters, but it makes a fresh copy of the glue
  3214  specification, since that specification will probably be subject to change,
  3215  while the parameter will stay put. The global variable |temp_ptr| is
  3216  set to the address of the new spec.
  3217  
  3218  @p function new_skip_param(@!n:small_number):pointer;
  3219  var p:pointer; {the new node}
  3220  begin temp_ptr:=new_spec(@<Current |mem| equivalent of glue parameter...@>);
  3221  p:=new_glue(temp_ptr); glue_ref_count(temp_ptr):=null; subtype(p):=n+1;
  3222  new_skip_param:=p;
  3223  end;
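
Illustration only (Python, not part of the WEB source): a toy model of how |new_spec|, |new_param_glue|, |new_glue|, and |new_skip_param| treat the reference count. The classes, the |params| list standing in for the glue parameters, and the choice |NULL = 0| are assumptions of the sketch; the program keeps everything in |mem| and leaves the new spec in the global |temp_ptr| rather than returning it.

```python
NULL = 0  # assumption: stands for |null|

class GlueSpec:
    def __init__(self, width=0, stretch=0, shrink=0):
        self.width, self.stretch, self.shrink = width, stretch, shrink
        self.ref_count = NULL  # null plus (number of references, less one)

class GlueNode:
    def __init__(self, spec, subtype):
        self.spec, self.subtype = spec, subtype

def new_spec(p):
    """Copy a spec; the copy is assumed to have exactly one reference."""
    return GlueSpec(p.width, p.stretch, p.shrink)

def new_glue(q):
    """Anonymous glue node sharing spec q."""
    q.ref_count += 1
    return GlueNode(q, 0)

def new_param_glue(params, n):
    """Glue node sharing the current spec of parameter n."""
    q = params[n]
    q.ref_count += 1
    return GlueNode(q, n + 1)

def new_skip_param(params, n):
    """Glue node for parameter n with a private copy of the spec."""
    temp = new_spec(params[n])
    p = new_glue(temp)
    temp.ref_count = NULL  # undo the increment: the node is the only reference
    p.subtype = n + 1
    return p, temp
```

The reset in |new_skip_param| matters because |new_glue| assumes its argument already has a reference of its own, which is not the case for a freshly copied spec.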
  3224  
  3225  @ A |kern_node| has a |width| field to specify a (normally negative)
  3226  amount of spacing. This spacing correction appears in horizontal lists
  3227  between letters like A and V when the font designer said that it looks
  3228  better to move them closer together or further apart. A kern node can
  3229  also appear in a vertical list, when its `|width|' denotes additional
  3230  spacing in the vertical direction. The |subtype| is either |normal| (for
  3231  kerns inserted from font information or math mode calculations) or |explicit|
  3232  (for kerns inserted from \.{\\kern} and \.{\\/} commands) or |acc_kern|
  3233  (for kerns inserted from non-math accents) or |mu_glue| (for kerns
  3234  inserted from \.{\\mkern} specifications in math formulas).
  3235  
  3236  @d kern_node=11 {|type| of a kern node}
  3237  @d explicit=1 {|subtype| of kern nodes from \.{\\kern} and \.{\\/}}
  3238  @d acc_kern=2 {|subtype| of kern nodes from accents}
  3239  
  3240  @ The |new_kern| function creates a kern node having a given width.
  3241  
  3242  @p function new_kern(@!w:scaled):pointer;
  3243  var p:pointer; {the new node}
  3244  begin p:=get_node(small_node_size); type(p):=kern_node;
  3245  subtype(p):=normal;
  3246  width(p):=w;
  3247  new_kern:=p;
  3248  end;
  3249  
  3250  @ A |penalty_node| specifies the penalty associated with line or page
  3251  breaking, in its |penalty| field. This field is a fullword integer, but
  3252  the full range of integer values is not used: Any penalty |>=10000| is
  3253  treated as infinity, and no break will be allowed for such high values.
  3254  Similarly, any penalty |<=-10000| is treated as negative infinity, and a
  3255  break will be forced.
  3256  
  3257  @d penalty_node=12 {|type| of a penalty node}
  3258  @d inf_penalty=inf_bad {``infinite'' penalty value}
  3259  @d eject_penalty=-inf_penalty {``negatively infinite'' penalty value}
  3260  @d penalty(#) == mem[#+1].int {the added cost of breaking a list here}
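
Illustration only (Python, not part of the WEB source): the three ranges of penalty values described above. The function name and result strings are hypothetical; the threshold 10000 is the one stated in the text.

```python
INF_PENALTY = 10000            # value of inf_bad per the text above
EJECT_PENALTY = -INF_PENALTY

def classify_penalty(m):
    """Say how a penalty value is treated by the breaking algorithms."""
    if m >= INF_PENALTY:
        return "no break allowed"
    if m <= EJECT_PENALTY:
        return "break forced"
    return "break allowed at cost %d" % m
```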
  3261  
  3262  @ Anyone who has been reading the last few sections of the program will
  3263  be able to guess what comes next.
  3264  
  3265  @p function new_penalty(@!m:integer):pointer;
  3266  var p:pointer; {the new node}
  3267  begin p:=get_node(small_node_size); type(p):=penalty_node;
  3268  subtype(p):=0; {the |subtype| is not used}
  3269  penalty(p):=m; new_penalty:=p;
  3270  end;
  3271  
  3272  @ You might think that we have introduced enough node types by now. Well,
  3273  almost, but there is one more: An |unset_node| has nearly the same format
  3274  as an |hlist_node| or |vlist_node|; it is used for entries in \.{\\halign}
  3275  or \.{\\valign} that are not yet in their final form, since the box
  3276  dimensions are their ``natural'' sizes before any glue adjustment has been
  3277  made. The |glue_set| word is not present; instead, we have a |glue_stretch|
  3278  field, which contains the total stretch of order |glue_order| that is
  3279  present in the hlist or vlist being boxed.
  3280  Similarly, the |shift_amount| field is replaced by a |glue_shrink| field,
  3281  containing the total shrink of order |glue_sign| that is present.
  3282  The |subtype| field is called |span_count|; an unset box typically
  3283  contains the data for |qo(span_count)+1| columns.
  3284  Unset nodes will be changed to box nodes when alignment is completed.
  3285  
  3286  @d unset_node=13 {|type| for an unset node}
  3287  @d glue_stretch(#)==mem[#+glue_offset].sc {total stretch in an unset node}
  3288  @d glue_shrink==shift_amount {total shrink in an unset node}
  3289  @d span_count==subtype {indicates the number of spanned columns}
  3290  
  3291  @ In fact, there are still more types coming. When we get to math formula
  3292  processing we will see that a |style_node| has |type=14|; and a number
  3293  of larger type codes will also be defined, for use in math mode only.
  3294  
  3295  @ Warning: If any changes are made to these data structure layouts, such as
  3296  changing any of the node sizes or even reordering the words of nodes,
  3297  the |copy_node_list| procedure and the memory initialization code
  3298  below may have to be changed. Such potentially dangerous parts of the
  3299  program are listed in the index under `data structure assumptions'.
  3300  @!@^data structure assumptions@>
  3301  However, other references to the nodes are made symbolically in terms of
  3302  the \.{WEB} macro definitions above, so that format changes will leave
  3303  \TeX's other algorithms intact.
  3304  @^system dependencies@>
  3305  
  3306  @* \[11] Memory layout.
  3307  Some areas of |mem| are dedicated to fixed usage, since static allocation is
  3308  more efficient than dynamic allocation when we can get away with it. For
  3309  example, locations |mem_bot| to |mem_bot+3| are always used to store the
  3310  specification for glue that is `\.{0pt plus 0pt minus 0pt}'. The
  3311  following macro definitions accomplish the static allocation by giving
  3312  symbolic names to the fixed positions. Static variable-size nodes appear
  3313  in locations |mem_bot| through |lo_mem_stat_max|, and static single-word nodes
  3314  appear in locations |hi_mem_stat_min| through |mem_top|, inclusive. It is
  3315  harmless to let |lig_trick| and |garbage| share the same location of |mem|.
  3316  
  3317  @d zero_glue==mem_bot {specification for \.{0pt plus 0pt minus 0pt}}
  3318  @d fil_glue==zero_glue+glue_spec_size {\.{0pt plus 1fil minus 0pt}}
  3319  @d fill_glue==fil_glue+glue_spec_size {\.{0pt plus 1fill minus 0pt}}
  3320  @d ss_glue==fill_glue+glue_spec_size {\.{0pt plus 1fil minus 1fil}}
  3321  @d fil_neg_glue==ss_glue+glue_spec_size {\.{0pt plus -1fil minus 0pt}}
  3322  @d lo_mem_stat_max==fil_neg_glue+glue_spec_size-1 {largest statically
  3323    allocated word in the variable-size |mem|}
  3324  @#
  3325  @d page_ins_head==mem_top {list of insertion data for current page}
  3326  @d contrib_head==mem_top-1 {vlist of items not yet on current page}
  3327  @d page_head==mem_top-2 {vlist for current page}
  3328  @d temp_head==mem_top-3 {head of a temporary list of some kind}
  3329  @d hold_head==mem_top-4 {head of a temporary list of another kind}
  3330  @d adjust_head==mem_top-5 {head of adjustment list returned by |hpack|}
  3331  @d active==mem_top-7 {head of active list in |line_break|, needs two words}
  3332  @d align_head==mem_top-8 {head of preamble list for alignments}
  3333  @d end_span==mem_top-9 {tail of spanned-width lists}
  3334  @d omit_template==mem_top-10 {a constant token list}
  3335  @d null_list==mem_top-11 {permanently empty list}
  3336  @d lig_trick==mem_top-12 {a ligature masquerading as a |char_node|}
  3337  @d garbage==mem_top-12 {used for scrap information}
  3338  @d backup_head==mem_top-13 {head of token list built by |scan_keyword|}
  3339  @d hi_mem_stat_min==mem_top-13 {smallest statically allocated word in
  3340    the one-word |mem|}
  3341  @d hi_mem_stat_usage=14 {the number of one-word nodes always present}
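
Illustration only (Python, not part of the WEB source): the macro definitions above, evaluated for given |mem_bot| and |mem_top|. The function name is hypothetical, and the values 0 and 30000 in the usage note are chosen only for the example.

```python
GLUE_SPEC_SIZE = 4

def static_layout(mem_bot, mem_top):
    """Return the fixed addresses, keyed by the WEB macro names."""
    layout = {}
    addr = mem_bot
    for name in ("zero_glue", "fil_glue", "fill_glue",
                 "ss_glue", "fil_neg_glue"):
        layout[name] = addr
        addr += GLUE_SPEC_SIZE
    layout["lo_mem_stat_max"] = addr - 1
    # Offsets below mem_top; offset 6 is the second word of |active|.
    offsets = {"page_ins_head": 0, "contrib_head": 1, "page_head": 2,
               "temp_head": 3, "hold_head": 4, "adjust_head": 5,
               "active": 7, "align_head": 8, "end_span": 9,
               "omit_template": 10, "null_list": 11, "lig_trick": 12,
               "garbage": 12, "backup_head": 13}
    for name, off in offsets.items():
        layout[name] = mem_top - off
    layout["hi_mem_stat_min"] = mem_top - 13
    return layout
```

With |mem_bot=0| and |mem_top=30000|, the five glue specifications occupy words 0 through 19, and the one-word static nodes occupy the 14 words from 29987 through 30000, which agrees with |hi_mem_stat_usage|.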
  3342  
  3343  @ The following code gets |mem| off to a good start, when \TeX\ is
  3344  initializing itself the slow~way.
  3345  
  3346  @<Local variables for init...@>=
  3347  @!k:integer; {index into |mem|, |eqtb|, etc.}
  3348  
  3349  @ @<Initialize table entries...@>=
  3350  for k:=mem_bot+1 to lo_mem_stat_max do mem[k].sc:=0;
  3351    {all glue dimensions are zeroed}
  3352  @^data structure assumptions@>
  3353  k:=mem_bot;@+while k<=lo_mem_stat_max do
  3354      {set first words of glue specifications}
  3355    begin glue_ref_count(k):=null+1;
  3356    stretch_order(k):=normal; shrink_order(k):=normal;
  3357    k:=k+glue_spec_size;
  3358    end;
  3359  stretch(fil_glue):=unity; stretch_order(fil_glue):=fil;@/
  3360  stretch(fill_glue):=unity; stretch_order(fill_glue):=fill;@/
  3361  stretch(ss_glue):=unity; stretch_order(ss_glue):=fil;@/
  3362  shrink(ss_glue):=unity; shrink_order(ss_glue):=fil;@/
  3363  stretch(fil_neg_glue):=-unity; stretch_order(fil_neg_glue):=fil;@/
  3364  rover:=lo_mem_stat_max+1;
  3365  link(rover):=empty_flag; {now initialize the dynamic memory}
  3366  node_size(rover):=1000; {which is a 1000-word available node}
  3367  llink(rover):=rover; rlink(rover):=rover;@/
  3368  lo_mem_max:=rover+1000; link(lo_mem_max):=null; info(lo_mem_max):=null;@/
  3369  for k:=hi_mem_stat_min to mem_top do
  3370    mem[k]:=mem[lo_mem_max]; {clear list heads}
  3371  @<Initialize the special list heads and constant nodes@>;
  3372  avail:=null; mem_end:=mem_top;
  3373  hi_mem_min:=hi_mem_stat_min; {initialize the one-word memory}
  3374  var_used:=lo_mem_stat_max+1-mem_bot; dyn_used:=hi_mem_stat_usage;
  3375    {initialize statistics}
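
Illustration only (Python, not part of the WEB source): the values that the initialization code above leaves behind, written as plain dictionaries. The function names are hypothetical, |NULL = 0| is an assumption of the sketch, and |unity| is taken to be $2^{16}$ as defined earlier in the program.

```python
UNITY = 2 ** 16
NULL = 0  # assumption: stands for |null|
NORMAL, FIL, FILL = 0, 1, 2

def initial_glue_specs():
    """Contents of the five static glue specifications."""
    names = ("zero_glue", "fil_glue", "fill_glue", "ss_glue", "fil_neg_glue")
    specs = {name: dict(ref_count=NULL + 1, width=0,
                        stretch=0, stretch_order=NORMAL,
                        shrink=0, shrink_order=NORMAL)
             for name in names}
    specs["fil_glue"].update(stretch=UNITY, stretch_order=FIL)
    specs["fill_glue"].update(stretch=UNITY, stretch_order=FILL)
    specs["ss_glue"].update(stretch=UNITY, stretch_order=FIL,
                            shrink=UNITY, shrink_order=FIL)
    specs["fil_neg_glue"].update(stretch=-UNITY, stretch_order=FIL)
    return specs

def initial_counters(mem_bot, mem_top):
    """Pointers and statistics after initialization."""
    lo_mem_stat_max = mem_bot + 5 * 4 - 1
    rover = lo_mem_stat_max + 1
    return {"rover": rover, "lo_mem_max": rover + 1000,
            "hi_mem_min": mem_top - 13, "mem_end": mem_top,
            "var_used": lo_mem_stat_max + 1 - mem_bot, "dyn_used": 14}
```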
  3376  
  3377  @ If \TeX\ is extended improperly, the |mem| array might get screwed up.
  3378  For example, some pointers might be wrong, or some ``dead'' nodes might not
  3379  have been freed when the last reference to them disappeared. Procedures
  3380  |check_mem| and |search_mem| are available to help diagnose such
  3381  problems. These procedures make use of two arrays called |free| and
  3382  |was_free| that are present only if \TeX's debugging routines have
  3383  been included. (You may want to decrease the size of |mem| while you
  3384  @^debugging@>
  3385  are debugging.)
  3386  
  3387  @<Glob...@>=
  3388  @!debug @!free: packed array [mem_min..mem_max] of boolean; {free cells}
  3389  @t\hskip10pt@>@!was_free: packed array [mem_min..mem_max] of boolean;
  3390    {previously free cells}
  3391  @t\hskip10pt@>@!was_mem_end,@!was_lo_max,@!was_hi_min: pointer;
  3392    {previous |mem_end|, |lo_mem_max|, and |hi_mem_min|}
  3393  @t\hskip10pt@>@!panicking:boolean; {do we want to check memory constantly?}
  3394  gubed
  3395  
  3396  @ @<Set initial...@>=
  3397  @!debug was_mem_end:=mem_min; {indicate that everything was previously free}
  3398  was_lo_max:=mem_min; was_hi_min:=mem_max;
  3399  panicking:=false;
  3400  gubed
  3401  
  3402  @ Procedure |check_mem| makes sure that the available space lists of
  3403  |mem| are well formed, and it optionally prints out all locations
  3404  that are reserved now but were free the last time this procedure was called.
  3405  
  3406  @p @!debug procedure check_mem(@!print_locs : boolean);
  3407  label done1,done2; {loop exits}
  3408  var p,@!q:pointer; {current locations of interest in |mem|}
  3409  @!clobbered:boolean; {is something amiss?}
  3410  begin for p:=mem_min to lo_mem_max do free[p]:=false; {you can probably
  3411    do this faster}
  3412  for p:=hi_mem_min to mem_end do free[p]:=false; {ditto}
  3413  @<Check single-word |avail| list@>;
  3414  @<Check variable-size |avail| list@>;
  3415  @<Check flags of unavailable nodes@>;
  3416  if print_locs then @<Print newly busy locations@>;
  3417  for p:=mem_min to lo_mem_max do was_free[p]:=free[p];
  3418  for p:=hi_mem_min to mem_end do was_free[p]:=free[p];
  3419    {|was_free:=free| might be faster}
  3420  was_mem_end:=mem_end; was_lo_max:=lo_mem_max; was_hi_min:=hi_mem_min;
  3421  end;
  3422  gubed
  3423  
  3424  @ @<Check single-word...@>=
  3425  p:=avail; q:=null; clobbered:=false;
  3426  while p<>null do
  3427    begin if (p>mem_end)or(p<hi_mem_min) then clobbered:=true
  3428    else if free[p] then clobbered:=true;
  3429    if clobbered then
  3430      begin print_nl("AVAIL list clobbered at ");
  3431  @.AVAIL list clobbered...@>
  3432      print_int(q); goto done1;
  3433      end;
  3434    free[p]:=true; q:=p; p:=link(q);
  3435    end;
  3436  done1:
  3437  
  3438  @ @<Check variable-size...@>=
  3439  p:=rover; q:=null; clobbered:=false;
  3440  repeat if (p>=lo_mem_max)or(p<mem_min) then clobbered:=true
  3441    else if (rlink(p)>=lo_mem_max)or(rlink(p)<mem_min) then clobbered:=true
  3442    else if  not(is_empty(p))or(node_size(p)<2)or@|
  3443     (p+node_size(p)>lo_mem_max)or@| (llink(rlink(p))<>p) then clobbered:=true;
  3444    if clobbered then
  3445    begin print_nl("Double-AVAIL list clobbered at ");
  3446    print_int(q); goto done2;
  3447    end;
  3448  for q:=p to p+node_size(p)-1 do {mark all locations free}
  3449    begin if free[q] then
  3450      begin print_nl("Doubly free location at ");
  3451  @.Doubly free location...@>
  3452      print_int(q); goto done2;
  3453      end;
  3454    free[q]:=true;
  3455    end;
  3456  q:=p; p:=rlink(p);
  3457  until p=rover;
  3458  done2:
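
Illustration only (Python, not part of the WEB source): the same consistency checks applied to a toy model of the variable-size free list. Here the dictionary |nodes| maps the address of each empty node to its |size|, |llink|, and |rlink|; an address missing from |nodes| plays the role of a node that fails |is_empty|. All names and the return convention are assumptions of the sketch.

```python
def check_free_ring(nodes, rover, mem_min, lo_mem_max):
    """Walk the ring from rover; return (free_set, error), error None if sound."""
    free = set()
    p, q = rover, None
    while True:
        node = nodes.get(p)
        clobbered = node is None or p >= lo_mem_max or p < mem_min
        if not clobbered:
            r = node["rlink"]
            clobbered = (r >= lo_mem_max or r < mem_min or r not in nodes
                         or node["size"] < 2
                         or p + node["size"] > lo_mem_max
                         or nodes[r]["llink"] != p)
        if clobbered:
            return free, ("clobbered", q)   # q is the last good node
        for loc in range(p, p + node["size"]):  # mark all locations free
            if loc in free:
                return free, ("doubly free", loc)
            free.add(loc)
        q, p = p, node["rlink"]
        if p == rover:
            return free, None
```

As in the program, the report names the last node that passed the checks, which is usually the best starting point for finding the damaged link.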
  3459  
  3460  @ @<Check flags...@>=
  3461  p:=mem_min;
  3462  while p<=lo_mem_max do {node |p| should not be empty}
  3463    begin if is_empty(p) then
  3464      begin print_nl("Bad flag at "); print_int(p);
  3465  @.Bad flag...@>
  3466      end;
  3467    while (p<=lo_mem_max) and not free[p] do incr(p);
  3468    while (p<=lo_mem_max) and free[p] do incr(p);
  3469    end
  3470  
  3471  @ @<Print newly busy...@>=
  3472  begin print_nl("New busy locs:");
  3473  for p:=mem_min to lo_mem_max do
  3474    if not free[p] and ((p>was_lo_max) or was_free[p]) then
  3475      begin print_char(" "); print_int(p);
  3476      end;
  3477  for p:=hi_mem_min to mem_end do
  3478    if not free[p] and
  3479     ((p<was_hi_min) or (p>was_mem_end) or was_free[p]) then
  3480      begin print_char(" "); print_int(p);
  3481      end;
  3482  end
  3483  
  3484  @ The |search_mem| procedure attempts to answer the question ``Who points
  3485  to node~|p|?'' In doing so, it fetches |link| and |info| fields of |mem|
  3486  that might not be of type |two_halves|. Strictly speaking, this is
  3487  @^dirty \PASCAL@>
  3488  undefined in \PASCAL, and it can lead to ``false drops'' (words that seem to
  3489  point to |p| purely by coincidence). But for debugging purposes, we want
  3490  to rule out the places that do {\sl not\/} point to |p|, so a few false
  3491  drops are tolerable.
  3492  
  3493  @p @!debug procedure search_mem(@!p:pointer); {look for pointers to |p|}
  3494  var q:integer; {current position being searched}
  3495  begin for q:=mem_min to lo_mem_max do
  3496    begin if link(q)=p then
  3497      begin print_nl("LINK("); print_int(q); print_char(")");
  3498      end;
  3499    if info(q)=p then
  3500      begin print_nl("INFO("); print_int(q); print_char(")");
  3501      end;
  3502    end;
  3503  for q:=hi_mem_min to mem_end do
  3504    begin if link(q)=p then
  3505      begin print_nl("LINK("); print_int(q); print_char(")");
  3506      end;
  3507    if info(q)=p then
  3508      begin print_nl("INFO("); print_int(q); print_char(")");
  3509      end;
  3510    end;
  3511  @<Search |eqtb| for equivalents equal to |p|@>;
  3512  @<Search |save_stack| for equivalents that point to |p|@>;
  3513  @<Search |hyph_list| for pointers to |p|@>;
  3514  end;
  3515  gubed
  3516  
  3517  @* \[12] Displaying boxes.
  3518  We can reinforce our knowledge of the data structures just introduced
  3519  by considering two procedures that display a list in symbolic form.
  3520  The first of these, called |short_display|, is used in ``overfull box''
  3521  messages to give the top-level description of a list. The other one,
  3522  called |show_node_list|, prints a detailed description of exactly what
  3523  is in the data structure.
  3524  
  3525  The philosophy of |short_display| is to ignore the fine points about exactly
  3526  what is inside boxes, except that ligatures and discretionary breaks are
  3527  expanded. As a result, |short_display| is a recursive procedure, but the
  3528  recursion is never more than one level deep.
  3529  @^recursion@>
  3530  
  3531  A global variable |font_in_short_display| keeps track of the font code that
  3532  is assumed to be present when |short_display| begins; deviations from this
  3533  font will be printed.
  3534  
  3535  @<Glob...@>=
  3536  @!font_in_short_display:integer; {an internal font number}
  3537  
  3538  @ Boxes, rules, inserts, whatsits, marks, and things in general that are
  3539  sort of ``complicated'' are indicated only by printing `\.{[]}'.
  3540  
  3541  @p procedure short_display(@!p:integer); {prints highlights of list |p|}
  3542  var n:integer; {for replacement counts}
  3543  begin while p>mem_min do
  3544    begin if is_char_node(p) then
  3545      begin if p<=mem_end then
  3546        begin if font(p)<>font_in_short_display then
  3547          begin if (font(p)<font_base)or(font(p)>font_max) then
  3548            print_char("*")
  3549  @.*\relax@>
  3550          else @<Print the font identifier for |font(p)|@>;
  3551          print_char(" "); font_in_short_display:=font(p);
  3552          end;
  3553        print_ASCII(qo(character(p)));
  3554        end;
  3555      end
  3556    else @<Print a short indication of the contents of node |p|@>;
  3557    p:=link(p);
  3558    end;
  3559  end;
  3560  
  3561  @ @<Print a short indication of the contents of node |p|@>=
  3562  case type(p) of
  3563  hlist_node,vlist_node,ins_node,whatsit_node,mark_node,adjust_node,
  3564    unset_node: print("[]");
  3565  rule_node: print_char("|");
  3566  glue_node: if glue_ptr(p)<>zero_glue then print_char(" ");
  3567  math_node: print_char("$");
  3568  ligature_node: short_display(lig_ptr(p));
  3569  disc_node: begin short_display(pre_break(p));
  3570    short_display(post_break(p));@/
  3571    n:=replace_count(p);
  3572    while n>0 do
  3573      begin if link(p)<>null then p:=link(p);
  3574      decr(n);
  3575      end;
  3576    end;
  3577  othercases do_nothing
  3578  endcases
  3579  
  3580  @ The |show_node_list| routine requires some auxiliary subroutines: one to
  3581  print a font-and-character combination, one to print a token list without
  3582  its reference count, and one to print a rule dimension.
  3583  
  3584  @p procedure print_font_and_char(@!p:integer); {prints |char_node| data}
  3585  begin if p>mem_end then print_esc("CLOBBERED.")
  3586  else  begin if (font(p)<font_base)or(font(p)>font_max) then print_char("*")
  3587  @.*\relax@>
  3588    else @<Print the font identifier for |font(p)|@>;
  3589    print_char(" "); print_ASCII(qo(character(p)));
  3590    end;
  3591  end;
  3592  @#
  3593  procedure print_mark(@!p:integer); {prints token list data in braces}
  3594  begin print_char("{");
  3595  if (p<hi_mem_min)or(p>mem_end) then print_esc("CLOBBERED.")
  3596  else show_token_list(link(p),null,max_print_line-10);
  3597  print_char("}");
  3598  end;
  3599  @#
  3600  procedure print_rule_dimen(@!d:scaled); {prints dimension in rule node}
  3601  begin if is_running(d) then print_char("*") else print_scaled(d);
  3602  @.*\relax@>
  3603  end;
  3604  
  3605  @ Then there is a subroutine that prints glue stretch and shrink, possibly
  3606  followed by the name of finite units:
  3607  
  3608  @p procedure print_glue(@!d:scaled;@!order:integer;@!s:str_number);
  3609    {prints a glue component}
  3610  begin print_scaled(d);
  3611  if (order<normal)or(order>filll) then print("foul")
  3612  else if order>normal then
  3613    begin print("fil");
  3614    while order>fil do
  3615      begin print_char("l"); decr(order);
  3616      end;
  3617    end
  3618  else if s<>0 then print(s);
  3619  end;
  3620  
  3621  @ The next subroutine prints a whole glue specification.
  3622  
  3623  @p procedure print_spec(@!p:integer;@!s:str_number);
  3624    {prints a glue specification}
  3625  begin if (p<mem_min)or(p>=lo_mem_max) then print_char("*")
  3626  @.*\relax@>
  3627  else  begin print_scaled(width(p));
  3628    if s<>0 then print(s);
  3629    if stretch(p)<>0 then
  3630      begin print(" plus "); print_glue(stretch(p),stretch_order(p),s);
  3631      end;
  3632    if shrink(p)<>0 then
  3633      begin print(" minus "); print_glue(shrink(p),shrink_order(p),s);
  3634      end;
  3635    end;
  3636  end;
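
Illustration only (Python, not part of the WEB source): |print_glue| and |print_spec| rewritten to return strings. The helper |scaled_str| is a transcription of the digit loop of |print_scaled|, which is defined earlier in the program; treat it as a sketch. The empty string plays the role of the null unit name |s=0|, and the specification is passed as separate arguments instead of a pointer into |mem|.

```python
UNITY = 65536
NORMAL, FIL, FILL, FILLL = 0, 1, 2, 3

def scaled_str(s):
    """Decimal form of a scaled integer, at most five fraction digits."""
    out = ""
    if s < 0:
        out, s = "-", -s
    out += str(s // UNITY) + "."
    s = 10 * (s % UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s += 0o100000 - 50000  # round the last digit
        out += str(s // UNITY)
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            return out

def glue_str(d, order, unit):
    """One glue component followed by fil, fill, filll, or a finite unit."""
    out = scaled_str(d)
    if order < NORMAL or order > FILLL:
        return out + "foul"
    if order > NORMAL:
        return out + "fil" + "l" * (order - FIL)
    return out + unit

def spec_str(width, stretch, stretch_order, shrink, shrink_order, unit=""):
    """A whole glue specification; zero components are omitted."""
    out = scaled_str(width) + unit
    if stretch != 0:
        out += " plus " + glue_str(stretch, stretch_order, unit)
    if shrink != 0:
        out += " minus " + glue_str(shrink, shrink_order, unit)
    return out
```

Note that an infinite order replaces the finite unit entirely, so a stretch of one |fil| is shown as \.{1.0fil} whatever unit name was supplied.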
  3637  
  3638  @ We also need to declare some procedures that appear later in this
  3639  documentation.
  3640  
  3641  @p @<Declare procedures needed for displaying the elements of mlists@>@;
  3642  @<Declare the procedure called |print_skip_param|@>
  3643  
  3644  @ Since boxes can be inside of boxes, |show_node_list| is inherently recursive,
  3645  @^recursion@>
  3646  up to a given maximum number of levels.  The history of nesting is indicated
  3647  by the current string, which will be printed at the beginning of each line;
  3648  the length of this string, namely |cur_length|, is the depth of nesting.
  3649  
  3650  Recursive calls on |show_node_list| therefore use the following pattern:
  3651  
  3652  @d node_list_display(#)==
  3653    begin append_char("."); show_node_list(#); flush_char;
  3654    end {|str_room| need not be checked; see |show_box| below}
  3655  
  3656  @ A global variable called |depth_threshold| is used to record the maximum
  3657  depth of nesting for which |show_node_list| will show information.  If we
  3658  have |depth_threshold=0|, for example, only the top level information will
  3659  be given and no sublists will be traversed. Another global variable, called
  3660  |breadth_max|, tells the maximum number of items to show at each level;
  3661  |breadth_max| had better be positive, or you won't see anything.
  3662  
  3663  @<Glob...@>=
  3664  @!depth_threshold : integer; {maximum nesting depth in box displays}
  3665  @!breadth_max : integer; {maximum number of items shown at the same list level}
  3666  
  3667  @ Now we are ready for |show_node_list| itself. This procedure has been
  3668  written to be ``extra robust'' in the sense that it should not crash or get
  3669  into a loop even if the data structures have been messed up by bugs in
  3670  the rest of the program. You can safely call its parent routine
  3671  |show_box(p)| for arbitrary values of |p| when you are debugging \TeX.
  3672  However, in the presence of bad data, the procedure may
  3673  @^dirty \PASCAL@>@^debugging@>
  3674  fetch a |memory_word| whose variant is different from the way it was stored;
  3675  for example, it might try to read |mem[p].hh| when |mem[p]|
  3676  contains a scaled integer, if |p| is a pointer that has been
  3677  clobbered or chosen at random.
  3678  
  3679  @p procedure show_node_list(@!p:integer); {prints a node list symbolically}
  3680  label exit;
  3681  var n:integer; {the number of items already printed at this level}
  3682  @!g:real; {a glue ratio, as a floating point number}
  3683  begin if cur_length>depth_threshold then
  3684    begin if p>null then print(" []");
  3685      {indicate that there's been some truncation}
  3686    return;
  3687    end;
  3688  n:=0;
  3689  while p>mem_min do
  3690    begin print_ln; print_current_string; {display the nesting history}
  3691    if p>mem_end then {pointer out of range}
  3692      begin print("Bad link, display aborted."); return;
  3693  @.Bad link...@>
  3694      end;
  3695    incr(n); if n>breadth_max then {time to stop}
  3696      begin print("etc."); return;
  3697  @.etc@>
  3698      end;
  3699    @<Display node |p|@>;
  3700    p:=link(p);
  3701    end;
  3702  exit:
  3703  end;
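
Illustration only (Python, not part of the WEB source): how |depth_threshold| and |breadth_max| limit the display. A list is modeled as Python pairs |(label, children)|, where |children| is |None| for a node without a sublist; the prefix of dots stands for the current string, and the lines are collected in a list instead of being printed. All of these modeling choices are assumptions of the sketch.

```python
def show_node_list(nodes, depth_threshold, breadth_max, prefix="", out=None):
    """Return the display lines for a nested list of (label, children) pairs."""
    if out is None:
        out = []
    if len(prefix) > depth_threshold:
        if nodes and out:
            out[-1] += " []"  # indicate that there has been some truncation
        return out
    for n, (label, children) in enumerate(nodes, start=1):
        if n > breadth_max:  # time to stop
            out.append(prefix + "etc.")
            return out
        out.append(prefix + label)
        if children is not None:
            # Recursive call, one level deeper in the nesting history.
            show_node_list(children, depth_threshold, breadth_max,
                           prefix + ".", out)
    return out
```

The length of the prefix doubles as the nesting depth, which is why a single comparison against |depth_threshold| is enough to stop the recursion.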
  3704  
  3705  @ @<Display node |p|@>=
  3706  if is_char_node(p) then print_font_and_char(p)
  3707  else  case type(p) of
  3708    hlist_node,vlist_node,unset_node: @<Display box |p|@>;
  3709    rule_node: @<Display rule |p|@>;
  3710    ins_node: @<Display insertion |p|@>;
  3711    whatsit_node: @<Display the whatsit node |p|@>;
  3712    glue_node: @<Display glue |p|@>;
  3713    kern_node: @<Display kern |p|@>;
  3714    math_node: @<Display math node |p|@>;
  3715    ligature_node: @<Display ligature |p|@>;
  3716    penalty_node: @<Display penalty |p|@>;
  3717    disc_node: @<Display discretionary |p|@>;
  3718    mark_node: @<Display mark |p|@>;
  3719    adjust_node: @<Display adjustment |p|@>;
  3720    @t\4@>@<Cases of |show_node_list| that arise in mlists only@>@;
  3721    othercases print("Unknown node type!")
  3722    endcases
  3723  
  3724  @ @<Display box |p|@>=
  3725  begin if type(p)=hlist_node then print_esc("h")
  3726  else if type(p)=vlist_node then print_esc("v")
  3727  else print_esc("unset");
  3728  print("box("); print_scaled(height(p)); print_char("+");
  3729  print_scaled(depth(p)); print(")x"); print_scaled(width(p));
  3730  if type(p)=unset_node then
  3731    @<Display special fields of the unset node |p|@>
  3732  else  begin @<Display the value of |glue_set(p)|@>;
  3733    if shift_amount(p)<>0 then
  3734      begin print(", shifted "); print_scaled(shift_amount(p));
  3735      end;
  3736    end;
  3737  node_list_display(list_ptr(p)); {recursive call}
  3738  end
  3739  
  3740  @ @<Display special fields of the unset node |p|@>=
  3741  begin if span_count(p)<>min_quarterword then
  3742    begin print(" ("); print_int(qo(span_count(p))+1);
  3743    print(" columns)");
  3744    end;
  3745  if glue_stretch(p)<>0 then
  3746    begin print(", stretch "); print_glue(glue_stretch(p),glue_order(p),0);
  3747    end;
  3748  if glue_shrink(p)<>0 then
  3749    begin print(", shrink "); print_glue(glue_shrink(p),glue_sign(p),0);
  3750    end;
  3751  end
  3752  
  3753  @ The code will have to change in this place if |glue_ratio| is
  3754  a structured type instead of an ordinary |real|. Note that this routine
  3755  should avoid arithmetic errors even if the |glue_set| field holds an
  3756  arbitrary random value. The following code assumes that a properly
  3757  formed nonzero |real| number has absolute value $2^{20}$ or more when
  3758  it is regarded as an integer; this precaution was adequate to prevent
  3759  floating point underflow on the author's computer.
  3760  @^system dependencies@>
  3761  @^dirty \PASCAL@>
  3762  
  3763  @<Display the value of |glue_set(p)|@>=
  3764  g:=float(glue_set(p));
  3765  if (g<>float_constant(0))and(glue_sign(p)<>normal) then
  3766    begin print(", glue set ");
  3767    if glue_sign(p)=shrinking then print("- ");
  3768    if abs(mem[p+glue_offset].int)<@'4000000 then print("?.?")
  3769    else if abs(g)>float_constant(20000) then
  3770      begin if g>float_constant(0) then print_char(">")
  3771      else print("< -");
  3772      print_glue(20000*unity,glue_order(p),0);
  3773      end
  3774    else print_glue(round(unity*g),glue_order(p),0);
  3775  @^real multiplication@>
  3776    end
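
The three-way test above can be tried outside of WEB. The following Python is only an illustrative sketch, not part of \TeX: the names |float_bits| and |classify_glue_set| are hypothetical, and viewing the ratio as a 32-bit IEEE single through |struct| is an assumption that stands in for the machine-dependent |mem[p+glue_offset].int|. Python's |round| also treats exact halves differently from \PASCAL's, which does not matter for the values shown.

```python
import struct

UNITY = 0x10000               # 2^16 scaled units represent 1.0
MALFORMED_LIMIT = 0o4000000   # 2^20, the octal constant @'4000000

def float_bits(g):
    """View g as a 32-bit IEEE single and return its bits as a signed integer."""
    return struct.unpack("<i", struct.pack("<f", g))[0]

def classify_glue_set(g):
    """Return None, "?.?", or a (prefix, scaled_value) pair for display."""
    if g == 0.0:
        return None                       # nothing is printed for a zero ratio
    if abs(float_bits(g)) < MALFORMED_LIMIT:
        return "?.?"                      # bit pattern looks improperly formed
    if abs(g) > 20000.0:
        prefix = ">" if g > 0.0 else "< -"
        return (prefix, 20000 * UNITY)    # clamp the magnitude at 20000
    return ("", round(UNITY * g))         # ordinary case: convert to scaled units
```

Under these assumptions a ratio of 1.5 yields the scaled value 98304, while a denormalized value such as 1e-40 falls below the $2^{20}$ limit and is reported as `?.?'.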
  3777  
  3778  @ @<Display rule |p|@>=
  3779  begin print_esc("rule("); print_rule_dimen(height(p)); print_char("+");
  3780  print_rule_dimen(depth(p)); print(")x"); print_rule_dimen(width(p));
  3781  end
  3782  
  3783  @ @<Display insertion |p|@>=
  3784  begin print_esc("insert"); print_int(qo(subtype(p)));
  3785  print(", natural size "); print_scaled(height(p));
  3786  print("; split("); print_spec(split_top_ptr(p),0);
  3787  print_char(","); print_scaled(depth(p));
  3788  print("); float cost "); print_int(float_cost(p));
  3789  node_list_display(ins_ptr(p)); {recursive call}
  3790  end
  3791  
  3792  @ @<Display glue |p|@>=
  3793  if subtype(p)>=a_leaders then @<Display leaders |p|@>
  3794  else  begin print_esc("glue");
  3795    if subtype(p)<>normal then
  3796      begin print_char("(");
  3797      if subtype(p)<cond_math_glue then
  3798        print_skip_param(subtype(p)-1)
  3799      else if subtype(p)=cond_math_glue then print_esc("nonscript")
  3800      else print_esc("mskip");
  3801      print_char(")");
  3802      end;
  3803    if subtype(p)<>cond_math_glue then
  3804      begin print_char(" ");
  3805      if subtype(p)<cond_math_glue then print_spec(glue_ptr(p),0)
  3806      else print_spec(glue_ptr(p),"mu");
  3807      end;
  3808    end
  3809  
  3810  @ @<Display leaders |p|@>=
  3811  begin print_esc("");
  3812  if subtype(p)=c_leaders then print_char("c")
  3813  else if subtype(p)=x_leaders then print_char("x");
  3814  print("leaders "); print_spec(glue_ptr(p),0);
  3815  node_list_display(leader_ptr(p)); {recursive call}
  3816  end
  3817  
  3818  @ An ``explicit'' kern value is indicated implicitly by an explicit space.
  3819  
  3820  @<Display kern |p|@>=
  3821  if subtype(p)<>mu_glue then
  3822    begin print_esc("kern");
  3823    if subtype(p)<>normal then print_char(" ");
  3824    print_scaled(width(p));
  3825    if subtype(p)=acc_kern then print(" (for accent)");
  3826  @.for accent@>
  3827    end
  3828  else  begin print_esc("mkern"); print_scaled(width(p)); print("mu");
  3829    end
  3830  
  3831  @ @<Display math node |p|@>=
  3832  begin print_esc("math");
  3833  if subtype(p)=before then print("on")
  3834  else print("off");
  3835  if width(p)<>0 then
  3836    begin print(", surrounded "); print_scaled(width(p));
  3837    end;
  3838  end
  3839  
  3840  @ @<Display ligature |p|@>=
  3841  begin print_font_and_char(lig_char(p)); print(" (ligature ");
  3842  if subtype(p)>1 then print_char("|");
  3843  font_in_short_display:=font(lig_char(p)); short_display(lig_ptr(p));
  3844  if odd(subtype(p)) then print_char("|");
  3845  print_char(")");
  3846  end
  3847  
  3848  @ @<Display penalty |p|@>=
  3849  begin print_esc("penalty "); print_int(penalty(p));
  3850  end
  3851  
  3852  @ The |post_break| list of a discretionary node is indicated by a prefixed
  3853  `\.{\char'174}' instead of the `\..' before the |pre_break| list.
  3854  
  3855  @<Display discretionary |p|@>=
  3856  begin print_esc("discretionary");
  3857  if replace_count(p)>0 then
  3858    begin print(" replacing "); print_int(replace_count(p));
  3859    end;
  3860  node_list_display(pre_break(p)); {recursive call}
  3861  append_char("|"); show_node_list(post_break(p)); flush_char; {recursive call}
  3862  end
  3863  
  3864  @ @<Display mark |p|@>=
  3865  begin print_esc("mark"); print_mark(mark_ptr(p));
  3866  end
  3867  
  3868  @ @<Display adjustment |p|@>=
  3869  begin print_esc("vadjust"); node_list_display(adjust_ptr(p)); {recursive call}
  3870  end
  3871  
  3872  @ The recursive machinery is started by calling |show_box|.
  3873  @^recursion@>
  3874  
  3875  @p procedure show_box(@!p:pointer);
  3876  begin @<Assign the values |depth_threshold:=show_box_depth| and
  3877    |breadth_max:=show_box_breadth|@>;
  3878  if breadth_max<=0 then breadth_max:=5;
  3879  if pool_ptr+depth_threshold>=pool_size then
  3880    depth_threshold:=pool_size-pool_ptr-1;
  3881    {now there's enough room for prefix string}
  3882  show_node_list(p); {the show starts at |p|}
  3883  print_ln;
  3884  end;
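
The two adjustments made by |show_box| amount to simple arithmetic, sketched here in Python. This is an illustration only; the function name |prepare_show_box| and its parameter list are hypothetical. The sketch assumes, as we read the display code, that each level of nesting appends one character to the prefix string in the pool, so the depth must be kept strictly below the room that remains.

```python
def prepare_show_box(show_box_depth, show_box_breadth, pool_ptr, pool_size):
    """Return (depth_threshold, breadth_max) after the two adjustments."""
    depth_threshold = show_box_depth
    breadth_max = show_box_breadth
    if breadth_max <= 0:
        breadth_max = 5                                # fall back to a small default
    if pool_ptr + depth_threshold >= pool_size:
        depth_threshold = pool_size - pool_ptr - 1     # leave room for the prefix
    return depth_threshold, breadth_max
```

For example, with five free positions left in a pool of size 100, a requested depth of 10 is reduced to 4.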
  3885  
  3886  @* \[13] Destroying boxes.
  3887  When we are done with a node list, we are obliged to return it to free
  3888  storage, including all of its sublists. The recursive procedure
  3889  |flush_node_list| does this for us.
  3890  
  3891  @ First, however, we shall consider two non-recursive procedures that do
  3892  simpler tasks. The first of these, |delete_token_ref|, is called when
  3893  a pointer to a token list's reference count is being removed. This means
  3894  that the token list should disappear if the reference count was |null|,
  3895  otherwise the count should be decreased by one.
  3896  @^reference counts@>
  3897  
  3898  @d token_ref_count(#) == info(#) {reference count preceding a token list}
  3899  
  3900  @p procedure delete_token_ref(@!p:pointer); {|p| points to the reference count
  3901    of a token list that is losing one reference}
  3902  begin if token_ref_count(p)=null then flush_list(p)
  3903  else decr(token_ref_count(p));
  3904  end;
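
The convention used by |delete_token_ref| can be sketched in Python as follows. This is a toy model, not \TeX's memory layout: the class |TokenList| and the |freed| flag are hypothetical, and |NULL = 0| merely stands in for |null|. The sketch assumes, following the module above, that a count of |null| means only one reference remains, so the stored count is the number of references beyond the first.

```python
NULL = 0  # stand-in for TeX's |null|; an assumption of this sketch

class TokenList:
    def __init__(self, tokens):
        self.ref_count = NULL       # references beyond the first one
        self.tokens = list(tokens)
        self.freed = False

def add_token_ref(t):
    """Record one more pointer to the token list."""
    t.ref_count += 1

def delete_token_ref(t):
    """Drop one pointer; release the list when the last one goes away."""
    if t.ref_count == NULL:
        t.tokens.clear()            # plays the role of flush_list
        t.freed = True
    else:
        t.ref_count -= 1
```

A list that has received one extra reference therefore survives the first deletion and disappears on the second.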
  3905  
  3906  @ Similarly, |delete_glue_ref| is called when a pointer to a glue
  3907  specification is being withdrawn.
  3908  @^reference counts@>
  3909  @d fast_delete_glue_ref(#)==@t@>@;@/
  3910    begin if glue_ref_count(#)=null then free_node(#,glue_spec_size)
  3911    else decr(glue_ref_count(#));
  3912    end
  3913  
  3914  @p procedure delete_glue_ref(@!p:pointer); {|p| points to a glue specification}
  3915  fast_delete_glue_ref(p);
  3916  
  3917  @ Now we are ready to delete any node list, recursively.
  3918  In practice, the nodes deleted are usually charnodes (about 2/3 of the time),
  3919  and they are glue nodes in about half of the remaining cases.
  3920  @^recursion@>
  3921  
  3922  @p procedure flush_node_list(@!p:pointer); {erase list of nodes starting at |p|}
  3923  label done; {go here when node |p| has been freed}
  3924  var q:pointer; {successor to node |p|}
  3925  begin while p<>null do
  3926  @^inner loop@>
  3927    begin q:=link(p);
  3928    if is_char_node(p) then free_avail(p)
  3929    else  begin case type(p) of
  3930      hlist_node,vlist_node,unset_node: begin flush_node_list(list_ptr(p));
  3931        free_node(p,box_node_size); goto done;
  3932        end;
  3933      rule_node: begin free_node(p,rule_node_size); goto done;
  3934        end;
  3935      ins_node: begin flush_node_list(ins_ptr(p));
  3936        delete_glue_ref(split_top_ptr(p));
  3937        free_node(p,ins_node_size); goto done;
  3938        end;
  3939      whatsit_node: @<Wipe out the whatsit node |p| and |goto done|@>;
  3940      glue_node: begin fast_delete_glue_ref(glue_ptr(p));
  3941        if leader_ptr(p)<>null then flush_node_list(leader_ptr(p));
  3942        end;
  3943      kern_node,math_node,penalty_node: do_nothing;
  3944      ligature_node: flush_node_list(lig_ptr(p));
  3945      mark_node: delete_token_ref(mark_ptr(p));
  3946      disc_node: begin flush_node_list(pre_break(p));
  3947        flush_node_list(post_break(p));
  3948        end;
  3949      adjust_node: flush_node_list(adjust_ptr(p));
  3950      @t\4@>@<Cases of |flush_node_list| that arise in mlists only@>@;
  3951      othercases confusion("flushing")
  3952  @:this can't happen flushing}{\quad flushing@>
  3953      endcases;@/
  3954      free_node(p,small_node_size);
  3955      done:end;
  3956    p:=q;
  3957    end;
  3958  end;
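
The shape of |flush_node_list| is iteration along the |link| fields combined with recursion into sublists. A minimal Python sketch of that shape follows; the class |Node|, its |children| field, and the |freed| log are hypothetical devices of the sketch, which records the order of release instead of returning memory to a free list.

```python
class Node:
    def __init__(self, kind, link=None, children=()):
        self.kind = kind                 # e.g. "char", "hlist", "kern"
        self.link = link                 # successor in the same list
        self.children = list(children)   # heads of sublists owned by this node

def flush_node_list(p, freed):
    """Release every node reachable from p, sublists before their owner."""
    while p is not None:
        q = p.link                       # remember the successor before releasing p
        for child in p.children:
            flush_node_list(child, freed)    # recursive call
        freed.append(p.kind)
        p = q
```

Saving the successor in |q| before the node is released mirrors the original; once |p| has been freed its |link| field can no longer be trusted.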
  3959  
  3960  @* \[14] Copying boxes.
  3961  Another recursive operation that acts on boxes is sometimes needed: The
  3962  procedure |copy_node_list| returns a pointer to another node list that has
  3963  the same structure and meaning as the original. Note that since glue
  3964  specifications and token lists have reference counts, we need not make
  3965  copies of them. Reference counts can never get too large to fit in a
  3966  halfword, since each pointer to a node is in a different memory address,
  3967  and the total number of memory addresses fits in a halfword.
  3968  @^recursion@>
  3969  @^reference counts@>
  3970  
  3971  (Well, there actually are also references from outside |mem|; if the
  3972  |save_stack| is made arbitrarily large, it would theoretically be possible
  3973  to break \TeX\ by overflowing a reference count. But who would want to do that?)
  3974  
  3975  @d add_token_ref(#)==incr(token_ref_count(#)) {new reference to a token list}
  3976  @d add_glue_ref(#)==incr(glue_ref_count(#)) {new reference to a glue spec}
  3977  
  3978  @ The copying procedure copies words en masse without bothering
  3979  to look at their individual fields. If the node format changes---for
  3980  example, if the size is altered, or if some link field is moved to another
  3981  relative position---then this code may need to be changed too.
  3982  @^data structure assumptions@>
  3983  
  3984  @p function copy_node_list(@!p:pointer):pointer; {makes a duplicate of the
  3985    node list that starts at |p| and returns a pointer to the new list}
  3986  var h:pointer; {temporary head of copied list}
  3987  @!q:pointer; {previous position in new list}
  3988  @!r:pointer; {current node being fabricated for new list}
  3989  @!words:0..5; {number of words remaining to be copied}
  3990  begin h:=get_avail; q:=h;
  3991  while p<>null do
  3992    begin @<Make a copy of node |p| in node |r|@>;
  3993    link(q):=r; q:=r; p:=link(p);
  3994    end;
  3995  link(q):=null; q:=link(h); free_avail(h);
  3996  copy_node_list:=q;
  3997  end;
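
The temporary head used by |copy_node_list| spares the loop from treating the first node specially. Here is a minimal Python sketch of the same device; the class |Cell| and the helper |to_values| are hypothetical, and a single |value| field stands in for the words that the real procedure copies.

```python
class Cell:
    def __init__(self, value, link=None):
        self.value = value
        self.link = link

def copy_list(p):
    """Return a fresh list with the same values as the list starting at p."""
    h = Cell(None)          # temporary head, like |h:=get_avail|
    q = h                   # previous position in the new list
    while p is not None:
        r = Cell(p.value)   # current node being fabricated
        q.link = r
        q = r
        p = p.link
    q.link = None
    return h.link           # the temporary head is discarded

def to_values(p):
    """Collect the values of a list, for inspection."""
    out = []
    while p is not None:
        out.append(p.value)
        p = p.link
    return out
```

An empty input needs no special case: the loop body never runs and the link of the temporary head is returned as it stands.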
  3998  
  3999  @ @<Make a copy of node |p|...@>=
  4000  words:=1; {this setting occurs in more branches than any other}
  4001  if is_char_node(p) then r:=get_avail
  4002  else @<Case statement to copy different types and set |words| to the number
  4003    of initial words not yet copied@>;
  4004  while words>0 do
  4005    begin decr(words); mem[r+words]:=mem[p+words];
  4006    end
  4007  
  4008  @ @<Case statement to copy...@>=
  4009  case type(p) of
  4010  hlist_node,vlist_node,unset_node: begin r:=get_node(box_node_size);
  4011    mem[r+6]:=mem[p+6]; mem[r+5]:=mem[p+5]; {copy the last two words}
  4012    list_ptr(r):=copy_node_list(list_ptr(p)); {this affects |mem[r+5]|}
  4013    words:=5;
  4014    end;
  4015  rule_node: begin r:=get_node(rule_node_size); words:=rule_node_size;
  4016    end;
  4017  ins_node: begin r:=get_node(ins_node_size); mem[r+4]:=mem[p+4];
  4018    add_glue_ref(split_top_ptr(p));
  4019    ins_ptr(r):=copy_node_list(ins_ptr(p)); {this affects |mem[r+4]|}
  4020    words:=ins_node_size-1;
  4021    end;
  4022  whatsit_node:@<Make a partial copy of the whatsit node |p| and make |r|
  4023    point to it; set |words| to the number of initial words not yet copied@>;
  4024  glue_node: begin r:=get_node(small_node_size); add_glue_ref(glue_ptr(p));
  4025    glue_ptr(r):=glue_ptr(p); leader_ptr(r):=copy_node_list(leader_ptr(p));
  4026    end;
  4027  kern_node,math_node,penalty_node: begin r:=get_node(small_node_size);
  4028    words:=small_node_size;
  4029    end;
  4030  ligature_node: begin r:=get_node(small_node_size);
  4031    mem[lig_char(r)]:=mem[lig_char(p)]; {copy |font| and |character|}
  4032    lig_ptr(r):=copy_node_list(lig_ptr(p));
  4033    end;
  4034  disc_node: begin r:=get_node(small_node_size);
  4035    pre_break(r):=copy_node_list(pre_break(p));
  4036    post_break(r):=copy_node_list(post_break(p));
  4037    end;
  4038  mark_node: begin r:=get_node(small_node_size); add_token_ref(mark_ptr(p));
  4039    words:=small_node_size;
  4040    end;
  4041  adjust_node: begin r:=get_node(small_node_size);
  4042    adjust_ptr(r):=copy_node_list(adjust_ptr(p));
  4043    end; {|words=1=small_node_size-1|}
  4044  othercases confusion("copying")
  4045  @:this can't happen copying}{\quad copying@>
  4046  endcases
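
The case statement treats two kinds of pointers differently: sublists are duplicated recursively, while glue specifications are shared and merely have their reference counts increased. The Python sketch below illustrates only that distinction; the classes |GlueSpec| and |Node| are hypothetical, just three node kinds are modeled, and the word-by-word copying of the original is not reproduced.

```python
class GlueSpec:
    def __init__(self, width):
        self.width = width
        self.ref_count = 0            # references beyond the first (assumed convention)

class Node:
    def __init__(self, kind, link=None, spec=None, list_ptr=None):
        self.kind = kind              # "glue", "hlist", or "kern" in this sketch
        self.link = link
        self.spec = spec              # shared GlueSpec, for glue nodes
        self.list_ptr = list_ptr      # owned sublist, for box nodes

def copy_node_list(p):
    """Duplicate the list at p, sharing glue specs and copying sublists."""
    head = Node("temp")               # temporary head
    q = head
    while p is not None:
        r = Node(p.kind)
        if p.kind == "glue":
            p.spec.ref_count += 1     # share the specification, as add_glue_ref does
            r.spec = p.spec
        elif p.kind == "hlist":
            r.list_ptr = copy_node_list(p.list_ptr)   # recursive call
        q.link = r
        q = r
        p = p.link
    return head.link
```

After such a copy the two lists have no nodes in common, yet every glue node in the copy points to the same specification as its original.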
  4047  
  4048  @* \[15] The command codes.
  4049  Before we can go any further, we need to define symbolic names for the internal
  4050  code numbers that represent the various commands obeyed by \TeX. These codes
  4051  are somewhat arbitrary, but not completely so. For example, the command
  4052  codes for character types are fixed by the language, since a user says,
  4053  e.g., `\.{\\catcode \`\\\${} = 3}' to make \.{\char'44} a math delimiter,
  4054  and the command code |math_shift| is equal to~3. Some other codes have
  4055  been made adjacent so that |case| statements in the program need not consider
  4056  cases that are widely spaced, or so that |case| statements can be replaced
  4057  by |if| statements.
  4058  
  4059  At any rate, here is the list, for future reference. First come the
  4060  ``catcode'' commands, several of which share their numeric codes with
  4061  ordinary commands when the catcode cannot emerge from \TeX's scanning routine.
  4062  
  4063  @d escape=0 {escape delimiter (called \.\\ in {\sl The \TeX book\/})}
  4064  @:TeXbook}{\sl The \TeX book@>
  4065  @d relax=0 {do nothing ( \.{\\relax} )}
  4066  @d left_brace=1 {beginning of a group ( \.\{ )}
  4067  @d right_brace=2 {ending of a group ( \.\} )}
  4068  @d math_shift=3 {mathematics shift character ( \.\$ )}
  4069  @d tab_mark=4 {alignment delimiter ( \.\&, \.{\\span} )}
  4070  @d car_ret=5 {end of line ( |carriage_return|, \.{\\cr}, \.{\\crcr} )}
  4071  @d out_param=5 {output a macro parameter}
  4072  @d mac_param=6 {macro parameter symbol ( \.\# )}
  4073  @d sup_mark=7 {superscript ( \.{\char'136} )}
  4074  @d sub_mark=8 {subscript ( \.{\char'137} )}
  4075  @d ignore=9 {characters to ignore ( \.{\^\^@@} )}
  4076  @d endv=9 {end of \<v_j> list in alignment template}
  4077  @d spacer=10 {characters equivalent to blank space ( \.{\ } )}
  4078  @d letter=11 {characters regarded as letters ( \.{A..Z}, \.{a..z} )}
  4079  @d other_char=12 {none of the special character types}
  4080  @d active_char=13 {characters that invoke macros ( \.{\char`\~} )}
  4081  @d par_end=13 {end of paragraph ( \.{\\par} )}
  4082  @d match=13 {match a macro parameter}
  4083  @d comment=14 {characters that introduce comments ( \.\% )}
  4084  @d end_match=14 {end of parameters to macro}
  4085  @d stop=14 {end of job ( \.{\\end}, \.{\\dump} )}
  4086  @d invalid_char=15 {characters that shouldn't appear ( \.{\^\^?} )}
  4087  @d delim_num=15 {specify delimiter numerically ( \.{\\delimiter} )}
  4088  @d max_char_code=15 {largest catcode for individual characters}
  4089  
  4090  @ Next are the ordinary run-of-the-mill command codes.  Codes that are
  4091  |min_internal| or more represent internal quantities that might be
  4092  expanded by `\.{\\the}'.
  4093  
  4094  @d char_num=16 {character specified numerically ( \.{\\char} )}
  4095  @d math_char_num=17 {explicit math code ( \.{\\mathchar} )}
  4096  @d mark=18 {mark definition ( \.{\\mark} )}
  4097  @d xray=19 {peek inside of \TeX\ ( \.{\\show}, \.{\\showbox}, etc.~)}
  4098  @d make_box=20 {make a box ( \.{\\box}, \.{\\copy}, \.{\\hbox}, etc.~)}
  4099  @d hmove=21 {horizontal motion ( \.{\\moveleft}, \.{\\moveright} )}
  4100  @d vmove=22 {vertical motion ( \.{\\raise}, \.{\\lower} )}
  4101  @d un_hbox=23 {unglue a box ( \.{\\unhbox}, \.{\\unhcopy} )}
  4102  @d un_vbox=24 {unglue a box ( \.{\\unvbox}, \.{\\unvcopy} )}
  4103  @d remove_item=25 {nullify last item ( \.{\\unpenalty},
  4104    \.{\\unkern}, \.{\\unskip} )}
  4105  @d hskip=26 {horizontal glue ( \.{\\hskip}, \.{\\hfil}, etc.~)}
  4106  @d vskip=27 {vertical glue ( \.{\\vskip}, \.{\\vfil}, etc.~)}
  4107  @d mskip=28 {math glue ( \.{\\mskip} )}
  4108  @d kern=29 {fixed space ( \.{\\kern} )}
  4109  @d mkern=30 {math kern ( \.{\\mkern} )}
  4110  @d leader_ship=31 {use a box ( \.{\\shipout}, \.{\\leaders}, etc.~)}
  4111  @d halign=32 {horizontal table alignment ( \.{\\halign} )}
  4112  @d valign=33 {vertical table alignment ( \.{\\valign} )}
  4113  @d no_align=34 {temporary escape from alignment ( \.{\\noalign} )}
  4114  @d vrule=35 {vertical rule ( \.{\\vrule} )}
  4115  @d hrule=36 {horizontal rule ( \.{\\hrule} )}
  4116  @d insert=37 {vlist inserted in box ( \.{\\insert} )}
  4117  @d vadjust=38 {vlist inserted in enclosing paragraph ( \.{\\vadjust} )}
  4118  @d ignore_spaces=39 {gobble |spacer| tokens ( \.{\\ignorespaces} )}
  4119  @d after_assignment=40 {save till assignment is done ( \.{\\afterassignment} )}
  4120  @d after_group=41 {save till group is done ( \.{\\aftergroup} )}
  4121  @d break_penalty=42 {additional badness ( \.{\\penalty} )}
  4122  @d start_par=43 {begin paragraph ( \.{\\indent}, \.{\\noindent} )}
  4123  @d ital_corr=44 {italic correction ( \.{\\/} )}
  4124  @d accent=45 {attach accent in text ( \.{\\accent} )}
  4125  @d math_accent=46 {attach accent in math ( \.{\\mathaccent} )}
  4126  @d discretionary=47 {discretionary texts ( \.{\\-}, \.{\\discretionary} )}
  4127  @d eq_no=48 {equation number ( \.{\\eqno}, \.{\\leqno} )}
  4128  @d left_right=49 {variable delimiter ( \.{\\left}, \.{\\right} )}
  4129  @d math_comp=50 {component of formula ( \.{\\mathbin}, etc.~)}
  4130  @d limit_switch=51 {diddle limit conventions ( \.{\\displaylimits}, etc.~)}
  4131  @d above=52 {generalized fraction ( \.{\\above}, \.{\\atop}, etc.~)}
  4132  @d math_style=53 {style specification ( \.{\\displaystyle}, etc.~)}
  4133  @d math_choice=54 {choice specification ( \.{\\mathchoice} )}
  4134  @d non_script=55 {conditional math glue ( \.{\\nonscript} )}
  4135  @d vcenter=56 {vertically center a vbox ( \.{\\vcenter} )}
  4136  @d case_shift=57 {force specific case ( \.{\\lowercase}, \.{\\uppercase}~)}
  4137  @d message=58 {send to user ( \.{\\message}, \.{\\errmessage} )}
  4138  @d extension=59 {extensions to \TeX\ ( \.{\\write}, \.{\\special}, etc.~)}
  4139  @d in_stream=60 {files for reading ( \.{\\openin}, \.{\\closein} )}
  4140  @d begin_group=61 {begin local grouping ( \.{\\begingroup} )}
  4141  @d end_group=62 {end local grouping ( \.{\\endgroup} )}
  4142  @d omit=63 {omit alignment template ( \.{\\omit} )}
  4143  @d ex_space=64 {explicit space ( \.{\\\ } )}
  4144  @d no_boundary=65 {suppress boundary ligatures ( \.{\\noboundary} )}
  4145  @d radical=66 {square root and similar signs ( \.{\\radical} )}
  4146  @d end_cs_name=67 {end control sequence ( \.{\\endcsname} )}
  4147  @d min_internal=68 {the smallest code that can follow \.{\\the}}
  4148  @d char_given=68 {character code defined by \.{\\chardef}}
  4149  @d math_given=69 {math code defined by \.{\\mathchardef}}
  4150  @d last_item=70 {most recent item ( \.{\\lastpenalty},
  4151    \.{\\lastkern}, \.{\\lastskip} )}
  4152  @d max_non_prefixed_command=70 {largest command code that can't be \.{\\global}}
  4153  
  4154  @ The next codes are special; they all relate to mode-independent
  4155  assignment of values to \TeX's internal registers or tables.
  4156  Codes that are |max_internal| or less represent internal quantities
  4157  that might be expanded by `\.{\\the}'.
  4158  
  4159  @d toks_register=71 {token list register ( \.{\\toks} )}
  4160  @d assign_toks=72 {special token list ( \.{\\output}, \.{\\everypar}, etc.~)}
  4161  @d assign_int=73 {user-defined integer ( \.{\\tolerance}, \.{\\day}, etc.~)}
  4162  @d assign_dimen=74 {user-defined length ( \.{\\hsize}, etc.~)}
  4163  @d assign_glue=75 {user-defined glue ( \.{\\baselineskip}, etc.~)}
  4164  @d assign_mu_glue=76 {user-defined muglue ( \.{\\thinmuskip}, etc.~)}
  4165  @d assign_font_dimen=77 {user-defined font dimension ( \.{\\fontdimen} )}
  4166  @d assign_font_int=78 {user-defined font integer ( \.{\\hyphenchar},
  4167    \.{\\skewchar} )}
  4168  @d set_aux=79 {specify state info ( \.{\\spacefactor}, \.{\\prevdepth} )}
  4169  @d set_prev_graf=80 {specify state info ( \.{\\prevgraf} )}
  4170  @d set_page_dimen=81 {specify state info ( \.{\\pagegoal}, etc.~)}
  4171  @d set_page_int=82 {specify state info ( \.{\\deadcycles},
  4172    \.{\\insertpenalties} )}
  4173  @d set_box_dimen=83 {change dimension of box ( \.{\\wd}, \.{\\ht}, \.{\\dp} )}
  4174  @d set_shape=84 {specify fancy paragraph shape ( \.{\\parshape} )}
  4175  @d def_code=85 {define a character code ( \.{\\catcode}, etc.~)}
  4176  @d def_family=86 {declare math fonts ( \.{\\textfont}, etc.~)}
  4177  @d set_font=87 {set current font ( font identifiers )}
  4178  @d def_font=88 {define a font file ( \.{\\font} )}
  4179  @d register=89 {internal register ( \.{\\count}, \.{\\dimen}, etc.~)}
  4180  @d max_internal=89 {the largest code that can follow \.{\\the}}
  4181  @d advance=90 {advance a register or parameter ( \.{\\advance} )}
  4182  @d multiply=91 {multiply a register or parameter ( \.{\\multiply} )}
  4183  @d divide=92 {divide a register or parameter ( \.{\\divide} )}
  4184  @d prefix=93 {qualify a definition ( \.{\\global}, \.{\\long}, \.{\\outer} )}
  4185  @d let=94 {assign a command code ( \.{\\let}, \.{\\futurelet} )}
  4186  @d shorthand_def=95 {code definition ( \.{\\chardef}, \.{\\countdef}, etc.~)}
  4187  @d read_to_cs=96 {read into a control sequence ( \.{\\read} )}
  4188  @d def=97 {macro definition ( \.{\\def}, \.{\\gdef}, \.{\\xdef}, \.{\\edef} )}
  4189  @d set_box=98 {set a box ( \.{\\setbox} )}
  4190  @d hyph_data=99 {hyphenation data ( \.{\\hyphenation}, \.{\\patterns} )}
  4191  @d set_interaction=100 {define level of interaction ( \.{\\batchmode}, etc.~)}
  4192  @d max_command=100 {the largest command code seen at |big_switch|}
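
Because the codes are adjacent, membership in a class of commands reduces to a range test. The following Python sketch uses the numeric values defined above; the function names |can_follow_the| and |accepts_prefix| are hypothetical, and the second test reflects only our reading of the comment on |max_non_prefixed_command|.

```python
MIN_INTERNAL = 68               # char_given
MAX_NON_PREFIXED_COMMAND = 70   # last_item
MAX_INTERNAL = 89               # register
MAX_COMMAND = 100               # set_interaction

def can_follow_the(cmd):
    """True if cmd denotes an internal quantity in the range described above."""
    return MIN_INTERNAL <= cmd <= MAX_INTERNAL

def accepts_prefix(cmd):
    """True if cmd lies above the largest code that cannot be made global."""
    return MAX_NON_PREFIXED_COMMAND < cmd <= MAX_COMMAND
```

For instance |register| (89) passes the first test but |advance| (90) does not, and |last_item| (70) fails the second test while |toks_register| (71) passes it.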
  4193  
  4194  @ The remaining command codes are extra special, since they cannot get through
  4195  \TeX's scanner to the main control routine. They have been given values higher
  4196  than |max_command| so that their special nature is easily discernible.
  4197  The ``expandable'' commands come first.
  4198  
  4199  @d undefined_cs=max_command+1 {initial state of most |eq_type| fields}
  4200  @d expand_after=max_command+2 {special expansion ( \.{\\expandafter} )}
  4201  @d no_expand=max_command+3 {special nonexpansion ( \.{\\noexpand} )}
  4202  @d input=max_command+4 {input a source file ( \.{\\input}, \.{\\endinput} )}
  4203  @d if_test=max_command+5 {conditional text ( \.{\\if}, \.{\\ifcase}, etc.~)}
  4204  @d fi_or_else=max_command+6 {delimiters for conditionals ( \.{\\else}, etc.~)}
  4205  @d cs_name=max_command+7 {make a control sequence from tokens ( \.{\\csname} )}
  4206  @d convert=max_command+8 {convert to text ( \.{\\number}, \.{\\string}, etc.~)}
  4207  @d the=max_command+9 {expand an internal quantity ( \.{\\the} )}
  4208  @d top_bot_mark=max_command+10 {inserted mark ( \.{\\topmark}, etc.~)}
  4209  @d call=max_command+11 {non-long, non-outer control sequence}
  4210  @d long_call=max_command+12 {long, non-outer control sequence}
  4211  @d outer_call=max_command+13 {non-long, outer control sequence}
  4212  @d long_outer_call=max_command+14 {long, outer control sequence}
  4213  @d end_template=max_command+15 {end of an alignment template}
  4214  @d dont_expand=max_command+16 {the following token was marked by \.{\\noexpand}}
  4215  @d glue_ref=max_command+17 {the equivalent points to a glue specification}
  4216  @d shape_ref=max_command+18 {the equivalent points to a parshape specification}
  4217  @d box_ref=max_command+19 {the equivalent points to a box node, or is |null|}
  4218  @d data=max_command+20 {the equivalent is simply a halfword number}
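
The four macro-call codes defined above are consecutive, so the offset from |call| can be read as two flags: one for ``long'' and one for ``outer.'' The Python sketch below shows that arithmetic; the function names are hypothetical, and the sketch says nothing about how \TeX\ itself computes these codes.

```python
MAX_COMMAND = 100
CALL = MAX_COMMAND + 11            # non-long, non-outer
END_TEMPLATE = MAX_COMMAND + 15    # first code after the four call variants

def macro_call_code(is_long, is_outer):
    """Combine the two attributes into one of the four call codes."""
    return CALL + (1 if is_long else 0) + (2 if is_outer else 0)

def call_flags(cmd):
    """Recover (is_long, is_outer) from a call code."""
    if not CALL <= cmd < END_TEMPLATE:
        raise ValueError("not a macro call code")
    offset = cmd - CALL
    return (bool(offset & 1), bool(offset & 2))
```

With |max_command=100| the four codes are 111, 112, 113, and 114, in the order |call|, |long_call|, |outer_call|, |long_outer_call|.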
  4219  
  4220  @* \[16] The semantic nest.
  4221  \TeX\ is typically in the midst of building many lists at once. For example,
  4222  when a math formula is being processed, \TeX\ is in math mode and
  4223  working on an mlist; this formula has temporarily interrupted \TeX\ from
  4224  being in horizontal mode and building the hlist of a paragraph; and this
  4225  paragraph has temporarily interrupted \TeX\ from being in vertical mode
  4226  and building the vlist for the next page of a document. Similarly, when a
  4227  \.{\\vbox} occurs inside of an \.{\\hbox}, \TeX\ is temporarily
  4228  interrupted from working in restricted horizontal mode, and it enters
  4229  internal vertical mode.  The ``semantic nest'' is a stack that
  4230  keeps track of what lists and modes are currently suspended.
  4231  
  4232  At each level of processing we are in one of six modes:
  4233  
  4234  \yskip\hang|vmode| stands for vertical mode (the page builder);
  4235  
  4236  \hang|hmode| stands for horizontal mode (the paragraph builder);
  4237  
  4238  \hang|mmode| stands for displayed formula mode;
  4239  
  4240  \hang|-vmode| stands for internal vertical mode (e.g., in a \.{\\vbox});
  4241  
  4242  \hang|-hmode| stands for restricted horizontal mode (e.g., in an \.{\\hbox});
  4243  
  4244  \hang|-mmode| stands for math formula mode (not displayed).
  4245  
  4246  \yskip\noindent The mode is temporarily set to zero while processing \.{\\write}
  4247  texts.
  4248  
  4249  Numeric values are assigned to |vmode|, |hmode|, and |mmode| so that
  4250  \TeX's ``big semantic switch'' can select the appropriate thing to
  4251  do by computing the value |abs(mode)+cur_cmd|, where |mode| is the current
  4252  mode and |cur_cmd| is the current command code.
  4253  
  4254  @d vmode=1 {vertical mode}
  4255  @d hmode=vmode+max_command+1 {horizontal mode}
  4256  @d mmode=hmode+max_command+1 {math mode}
  4257  
  4258  @p procedure print_mode(@!m:integer); {prints the mode represented by |m|}
  4259  begin if m>0 then
  4260    case m div (max_command+1) of
  4261    0:print("vertical");
  4262    1:print("horizontal");
  4263    2:print("display math");
  4264    end
  4265  else if m=0 then print("no")
  4266  else  case (-m) div (max_command+1) of
  4267    0:print("internal vertical");
  4268    1:print("restricted horizontal");
  4269    2:print("math");
  4270    end;
  4271  print(" mode");
  4272  end;
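
The spacing of the mode values can be checked with a little arithmetic. The Python sketch below repeats the definitions with |max_command=100|, so that |hmode=102| and |mmode=203|; the function names |switch_index| and |mode_name| are hypothetical, and |mode_name| returns a string where |print_mode| would print one.

```python
MAX_COMMAND = 100
VMODE = 1
HMODE = VMODE + MAX_COMMAND + 1    # 102
MMODE = HMODE + MAX_COMMAND + 1    # 203

def switch_index(mode, cur_cmd):
    """The value on which the big semantic switch selects."""
    return abs(mode) + cur_cmd

def mode_name(m):
    """Name the mode represented by m, following print_mode."""
    if m == 0:
        return "no mode"
    if m > 0:
        names = ("vertical", "horizontal", "display math")
    else:
        names = ("internal vertical", "restricted horizontal", "math")
    return names[abs(m) // (MAX_COMMAND + 1)] + " mode"
```

Since each mode is separated from the next by |max_command+1|, the sums |abs(mode)+cur_cmd| for the three modes and the 101 command codes from 0 to |max_command| are all different.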
  4273  
  4274  @ The state of affairs at any semantic level can be represented by
  4275  five values:
  4276  
  4277  \yskip\hang|mode| is the number representing the semantic mode, as
  4278  just explained.
  4279  
  4280  \yskip\hang|head| is a |pointer| to a list head for the list being built;
  4281  |link(head)| therefore points to the first element of the list, or
  4282  to |null| if the list is empty.
  4283  
  4284  \yskip\hang|tail| is a |pointer| to the final node of the list being
  4285  built; thus, |tail=head| if and only if the list is empty.
  4286  
  4287  \yskip\hang|prev_graf| is the number of lines of the current paragraph that
  4288  have already been put into the present vertical list.
  4289  
  4290  \yskip\hang|aux| is an auxiliary |memory_word| that gives further information
  4291  that is needed to characterize the situation.
  4292  
  4293  \yskip\noindent
  4294  In vertical mode, |aux| is also known as |prev_depth|; it is the scaled
  4295  value representing the depth of the previous box, for use in baseline
  4296  calculations, or it is |<=-1000|pt if the next box on the vertical list is to
  4297  be exempt from baseline calculations.  In horizontal mode, |aux| is also
  4298  known as |space_factor| and |clang|; it holds the current space factor used in
  4299  spacing calculations, and the current language used for hyphenation.
  4300  (The value of |clang| is undefined in restricted horizontal mode.)
  4301  In math mode, |aux| is also known as |incompleat_noad|; if
  4302  not |null|, it points to a record that represents the numerator of a
  4303  generalized fraction for which the denominator is currently being formed
  4304  in the current list.
  4305  
  4306  There is also a sixth quantity, |mode_line|, which correlates
  4307  the semantic nest with the user's input; |mode_line| contains the source
  4308  line number at which the current level of nesting was entered. The negative
  4309  of this line number is the |mode_line| at the level of the
  4310  user's output routine.
  4311  
  4312  In horizontal mode, the |prev_graf| field is used for initial language data.
  4313  
  4314  The semantic nest is an array called |nest| that holds the |mode|, |head|,
  4315  |tail|, |prev_graf|, |aux|, and |mode_line| values for all semantic levels
  4316  below the currently active one. Information about the currently active
  4317  level is kept in the global quantities |mode|, |head|, |tail|, |prev_graf|,
  4318  |aux|, and |mode_line|, which live in a \PASCAL\ record that is ready to
  4319  be pushed onto |nest| if necessary.
  4320  
  4321  @d ignore_depth==-65536000 {|prev_depth| value that is ignored}
  4322  
  4323  @<Types...@>=
  4324  @!list_state_record=record@!mode_field:-mmode..mmode;@+
  4325    @!head_field,@!tail_field: pointer;
  4326    @!pg_field,@!ml_field: integer;@+
  4327    @!aux_field: memory_word;
  4328    end;
  4329  
  4330  @ @d mode==cur_list.mode_field {current mode}
  4331  @d head==cur_list.head_field {header node of current list}
  4332  @d tail==cur_list.tail_field {final node on current list}
  4333  @d prev_graf==cur_list.pg_field {number of paragraph lines accumulated}
  4334  @d aux==cur_list.aux_field {auxiliary data about the current list}
  4335  @d prev_depth==aux.sc {the name of |aux| in vertical mode}
  4336  @d space_factor==aux.hh.lh {part of |aux| in horizontal mode}
  4337  @d clang==aux.hh.rh {the other part of |aux| in horizontal mode}
  4338  @d incompleat_noad==aux.int {the name of |aux| in math mode}
  4339  @d mode_line==cur_list.ml_field {source file line number at beginning of list}
  4340  
  4341  @<Glob...@>=
  4342  @!nest:array[0..nest_size] of list_state_record;
  4343  @!nest_ptr:0..nest_size; {first unused location of |nest|}
  4344  @!max_nest_stack:0..nest_size; {maximum of |nest_ptr| when pushing}
  4345  @!cur_list:list_state_record; {the ``top'' semantic state}
  4346  @!shown_mode:-mmode..mmode; {most recent mode shown by \.{\\tracingcommands}}
  4347  
  4348  @ Here is a common way to make the current list grow:
  4349  
  4350  @d tail_append(#)==begin link(tail):=#; tail:=link(tail);
  4351    end
  4352  
  4353  @ We will see later that the vertical list at the bottom semantic level is split
  4354  into two parts; the ``current page'' runs from |page_head| to |page_tail|,
  4355  and the ``contribution list'' runs from |contrib_head| to |tail| of
  4356  semantic level zero. The idea is that contributions are first formed in
  4357  vertical mode, then ``contributed'' to the current page (during which time
  4358  the page-breaking decisions are made). For now, we don't need to know
  4359  any more details about the page-building process.
  4360  
  4361  @<Set init...@>=
  4362  nest_ptr:=0; max_nest_stack:=0;
  4363  mode:=vmode; head:=contrib_head; tail:=contrib_head;
  4364  prev_depth:=ignore_depth; mode_line:=0;
  4365  prev_graf:=0; shown_mode:=0;
  4366  @<Start a new current page@>;
  4367  
  4368  @ When \TeX's work on one level is interrupted, the state is saved by
  4369  calling |push_nest|. This routine changes |head| and |tail| so that
  4370  a new (empty) list is begun; it does not change |mode| or |aux|.
  4371  
  4372  @p procedure push_nest; {enter a new semantic level, save the old}
  4373  begin if nest_ptr>max_nest_stack then
  4374    begin max_nest_stack:=nest_ptr;
  4375    if nest_ptr=nest_size then overflow("semantic nest size",nest_size);
  4376  @:TeX capacity exceeded semantic nest size}{\quad semantic nest size@>
  4377    end;
  4378  nest[nest_ptr]:=cur_list; {stack the record}
  4379  incr(nest_ptr); head:=get_avail; tail:=head; prev_graf:=0; mode_line:=line;
  4380  end;
  4381  
  4382  @ Conversely, when \TeX\ is finished on the current level, the former
  4383  state is restored by calling |pop_nest|. This routine will never be
  4384  called at the lowest semantic level, nor will it be called unless |head|
  4385  is a node that should be returned to free memory.
  4386  
  4387  @p procedure pop_nest; {leave a semantic level, re-enter the old}
  4388  begin free_avail(head); decr(nest_ptr); cur_list:=nest[nest_ptr];
  4389  end;
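
@ The interplay of |push_nest| and |pop_nest| may be easier to follow in a
small model. The Python sketch below is an assumption-laden illustration:
the names |ListStateRecord| and |SemanticNest| are hypothetical, the new
head node is passed in by the caller instead of coming from |get_avail|,
and the call of |free_avail| is omitted because Python reclaims memory
by itself.

```python
import copy
from dataclasses import dataclass


@dataclass
class ListStateRecord:
    mode: int = 1            # assumed code for vertical mode
    head: object = None
    tail: object = None
    prev_graf: int = 0
    mode_line: int = 0
    aux: int = 0


class SemanticNest:
    def __init__(self, nest_size, contrib_head="contrib_head"):
        self.nest_size = nest_size
        self.nest = [None] * (nest_size + 1)
        self.nest_ptr = 0          # first unused location of nest
        self.max_nest_stack = 0    # maximum of nest_ptr when pushing
        self.cur_list = ListStateRecord(head=contrib_head, tail=contrib_head)

    def push_nest(self, line, new_head):
        """Enter a new semantic level, saving the old one."""
        if self.nest_ptr > self.max_nest_stack:
            self.max_nest_stack = self.nest_ptr
            if self.nest_ptr == self.nest_size:
                raise OverflowError("semantic nest size")
        self.nest[self.nest_ptr] = copy.copy(self.cur_list)  # stack the record
        self.nest_ptr += 1
        # Only head, tail, prev_graf and mode_line change; mode and aux stay.
        self.cur_list.head = new_head
        self.cur_list.tail = new_head
        self.cur_list.prev_graf = 0
        self.cur_list.mode_line = line

    def pop_nest(self):
        """Leave a semantic level and re-enter the old one."""
        self.nest_ptr -= 1
        self.cur_list = self.nest[self.nest_ptr]
```

The sketch keeps the order of the original tests, so |max_nest_stack| is
updated at the start of a push, before |nest_ptr| is incremented.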
  4390  
  4391  @ Here is a procedure that displays what \TeX\ is working on, at all levels.
  4392  
  4393  @p procedure@?print_totals; forward;@t\2@>
  4394  procedure show_activities;
  4395  var p:0..nest_size; {index into |nest|}
  4396  @!m:-mmode..mmode; {mode}
  4397  @!a:memory_word; {auxiliary}
  4398  @!q,@!r:pointer; {for showing the current page}
  4399  @!t:integer; {ditto}
  4400  begin nest[nest_ptr]:=cur_list; {put the top level into the array}
  4401  print_nl(""); print_ln;
  4402  for p:=nest_ptr downto 0 do
  4403    begin m:=nest[p].mode_field; a:=nest[p].aux_field;
  4404    print_nl("### "); print_mode(m);
  4405    print(" entered at line "); print_int(abs(nest[p].ml_field));
  4406    if m=hmode then if nest[p].pg_field <> @'40600000 then
  4407      begin print(" (language"); print_int(nest[p].pg_field mod @'200000);
  4408      print(":hyphenmin"); print_int(nest[p].pg_field div @'20000000);
  4409      print_char(","); print_int((nest[p].pg_field div @'200000) mod @'100);
  4410      print_char(")");
  4411      end;
  4412    if nest[p].ml_field<0 then print(" (\output routine)");
  4413    if p=0 then
  4414      begin @<Show the status of the current page@>;
  4415      if link(contrib_head)<>null then
  4416        print_nl("### recent contributions:");
  4417      end;
  4418    show_box(link(nest[p].head_field));
  4419    @<Show the auxiliary field, |a|@>;
  4420    end;
  4421  end;
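
@ In horizontal mode, |show_activities| unpacks three numbers from
|pg_field| by means of |mod| and |div| with octal constants. The Python
sketch below repeats those three expressions literally; the function name
|decode_hmode_pg_field| is hypothetical, and reading the two ``hyphenmin''
values as a left and a right minimum is an assumption that this part of
the program does not state explicitly.

```python
DEFAULT_PG_FIELD = 0o40600000  # the value for which nothing extra is printed


def decode_hmode_pg_field(pg_field):
    """Return (language, first hyphenmin, second hyphenmin)."""
    language = pg_field % 0o200000               # low 16 bits
    first_min = pg_field // 0o20000000           # bits 22 and up
    second_min = (pg_field // 0o200000) % 0o100  # six bits in between
    return language, first_min, second_min


default_fields = decode_hmode_pg_field(DEFAULT_PG_FIELD)
```

Applied to the special constant |@'40600000|, the sketch yields language 0
with hyphenmin values 2 and 3, which is why that case is not displayed.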
  4422  
  4423  @ @<Show the auxiliary...@>=
  4424  case abs(m) div (max_command+1) of
  4425  0: begin print_nl("prevdepth ");
  4426    if a.sc<=ignore_depth then print("ignored")
  4427    else print_scaled(a.sc);
  4428    if nest[p].pg_field<>0 then
  4429      begin print(", prevgraf ");
  4430      print_int(nest[p].pg_field); print(" line");
  4431      if nest[p].pg_field<>1 then print_char("s");
  4432      end;
  4433    end;
  4434  1: begin print_nl("spacefactor "); print_int(a.hh.lh);
  4435    if m>0 then@+ if a.hh.rh>0 then
  4436      begin print(", current language "); print_int(a.hh.rh);@+
  4437      end;
  4438    end;
  4439  2: if a.int<>null then
  4440    begin print("this will begin denominator of:"); show_box(a.int);@+
  4441    end;
  4442  end {there are no other cases}
  4443  
  4444  @* \[17] The table of equivalents.
  4445  Now that we have studied the data structures for \TeX's semantic routines,
  4446  we ought to consider the data structures used by its syntactic routines. In
  4447  other words, our next concern will be
  4448  the tables that \TeX\ looks at when it is scanning
  4449  what the user has written.
  4450  
  4451  The biggest and most important such table is called |eqtb|. It holds the
  4452  current ``equivalents'' of things; i.e., it explains what things mean
  4453  or what their current values are, for all quantities that are subject to
  4454  the nesting structure provided by \TeX's grouping mechanism. There are six
  4455  parts to |eqtb|:
  4456  
  4457  \yskip\hangg 1) |eqtb[active_base..(hash_base-1)]| holds the current
  4458  equivalents of single-character control sequences.
  4459  
  4460  \yskip\hangg 2) |eqtb[hash_base..(glue_base-1)]| holds the current
  4461  equivalents of multiletter control sequences.
  4462  
  4463  \yskip\hangg 3) |eqtb[glue_base..(local_base-1)]| holds the current
  4464  equivalents of glue parameters like the current baselineskip.
  4465  
  4466  \yskip\hangg 4) |eqtb[local_base..(int_base-1)]| holds the current
  4467  equivalents of local halfword quantities like the current box registers,
  4468  the current ``catcodes,'' the current font, and a pointer to the current
  4469  paragraph shape.
  4470  
  4471  \yskip\hangg 5) |eqtb[int_base..(dimen_base-1)]| holds the current
  4472  equivalents of fullword integer parameters like the current hyphenation
  4473  penalty.
  4474  
  4475  \yskip\hangg 6) |eqtb[dimen_base..eqtb_size]| holds the current equivalents
  4476  of fullword dimension parameters like the current hsize or amount of
  4477  hanging indentation.
  4478  
  4479  \yskip\noindent Note that, for example, the current amount of
  4480  baselineskip glue is determined by the setting of a particular location
  4481  in region~3 of |eqtb|, while the current meaning of the control sequence
  4482  `\.{\\baselineskip}' (which might have been changed by \.{\\def} or
  4483  \.{\\let}) appears in region~2.
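
@ The boundaries of the six regions follow from the definitions given in
the sections below. As a rough cross-check, the Python sketch here adds up
the same offsets. The function names |eqtb_layout| and |region_of| are
hypothetical, and the sample values |hash_size=2100| and |font_base=0|
used in the example call are assumptions, since both constants are
defined elsewhere in the program.

```python
def eqtb_layout(hash_size, font_base=0):
    """Return the base addresses implied by the definitions of eqtb."""
    b = {}
    b["active_base"] = 1
    b["single_base"] = b["active_base"] + 256
    b["null_cs"] = b["single_base"] + 256
    b["hash_base"] = b["null_cs"] + 1
    b["frozen_control_sequence"] = b["hash_base"] + hash_size
    b["frozen_null_font"] = b["frozen_control_sequence"] + 10
    b["font_id_base"] = b["frozen_null_font"] - font_base
    b["undefined_control_sequence"] = b["frozen_null_font"] + 257
    b["glue_base"] = b["undefined_control_sequence"] + 1
    b["skip_base"] = b["glue_base"] + 18          # glue_pars
    b["mu_skip_base"] = b["skip_base"] + 256
    b["local_base"] = b["mu_skip_base"] + 256
    b["toks_base"] = b["local_base"] + 10
    b["box_base"] = b["toks_base"] + 256
    b["cur_font_loc"] = b["box_base"] + 256
    b["math_font_base"] = b["cur_font_loc"] + 1
    b["cat_code_base"] = b["math_font_base"] + 48
    b["lc_code_base"] = b["cat_code_base"] + 256
    b["uc_code_base"] = b["lc_code_base"] + 256
    b["sf_code_base"] = b["uc_code_base"] + 256
    b["math_code_base"] = b["sf_code_base"] + 256
    b["int_base"] = b["math_code_base"] + 256
    b["count_base"] = b["int_base"] + 55          # int_pars
    b["del_code_base"] = b["count_base"] + 256
    b["dimen_base"] = b["del_code_base"] + 256
    return b


def region_of(n, b):
    """Return the region number (1 to 6) that contains eqtb location n.

    The upper end of region 6 (eqtb_size) is not checked here.
    """
    starts = [b["active_base"], b["hash_base"], b["glue_base"],
              b["local_base"], b["int_base"], b["dimen_base"]]
    if n < starts[0]:
        raise ValueError("location precedes region 1")
    return sum(1 for s in starts if n >= s)


layout = eqtb_layout(hash_size=2100)
```

With these assumed constants, the location of the baselineskip parameter
falls in region~3 and every hash table address falls in region~2, in
agreement with the note above.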
  4484  
  4485  @ Each entry in |eqtb| is a |memory_word|. Most of these words are of type
  4486  |two_halves|, and subdivided into three fields:
  4487  
  4488  \yskip\hangg 1) The |eq_level| (a quarterword) is the level of grouping at
  4489  which this equivalent was defined. If the level is |level_zero|, the
  4490  equivalent has never been defined; |level_one| refers to the outer level
  4491  (outside of all groups), and this level is also used for global
  4492  definitions that never go away. Higher levels are for equivalents that
  4493  will disappear at the end of their group.  @^global definitions@>
  4494  
  4495  \yskip\hangg 2) The |eq_type| (another quarterword) specifies what kind of
  4496  entry this is. There are many types, since each \TeX\ primitive like
  4497  \.{\\hbox}, \.{\\def}, etc., has its own special code. The list of
  4498  command codes above includes all possible settings of the |eq_type| field.
  4499  
  4500  \yskip\hangg 3) The |equiv| (a halfword) is the current equivalent value.
  4501  This may be a font number, a pointer into |mem|, or a variety of other
  4502  things.
  4503  
  4504  @d eq_level_field(#)==#.hh.b1
  4505  @d eq_type_field(#)==#.hh.b0
  4506  @d equiv_field(#)==#.hh.rh
  4507  @d eq_level(#)==eq_level_field(eqtb[#]) {level of definition}
  4508  @d eq_type(#)==eq_type_field(eqtb[#]) {command code for equivalent}
  4509  @d equiv(#)==equiv_field(eqtb[#]) {equivalent value}
  4510  @d level_zero=min_quarterword {level for undefined quantities}
  4511  @d level_one=level_zero+1 {outermost level for defined quantities}
  4512  
  4513  @ Many locations in |eqtb| have symbolic names. The purpose of the next
  4514  paragraphs is to define these names, and to set up the initial values of the
  4515  equivalents.
  4516  
  4517  In the first region we have 256 equivalents for ``active characters'' that
  4518  act as control sequences, followed by 256 equivalents for single-character
  4519  control sequences.
  4520  
  4521  Then comes region~2, which corresponds to the hash table that we will
  4522  define later.  The maximum address in this region is used for a dummy
  4523  control sequence that is perpetually undefined. There also are several
  4524  locations for control sequences that are perpetually defined
  4525  (since they are used in error recovery).
  4526  
  4527  @d active_base=1 {beginning of region 1, for active character equivalents}
  4528  @d single_base=active_base+256 {equivalents of one-character control sequences}
  4529  @d null_cs=single_base+256 {equivalent of \.{\\csname\\endcsname}}
  4530  @d hash_base=null_cs+1 {beginning of region 2, for the hash table}
  4531  @d frozen_control_sequence=hash_base+hash_size {for error recovery}
  4532  @d frozen_protection=frozen_control_sequence {inaccessible but definable}
  4533  @d frozen_cr=frozen_control_sequence+1 {permanent `\.{\\cr}'}
  4534  @d frozen_end_group=frozen_control_sequence+2 {permanent `\.{\\endgroup}'}
  4535  @d frozen_right=frozen_control_sequence+3 {permanent `\.{\\right}'}
  4536  @d frozen_fi=frozen_control_sequence+4 {permanent `\.{\\fi}'}
  4537  @d frozen_end_template=frozen_control_sequence+5 {permanent `\.{\\endtemplate}'}
  4538  @d frozen_endv=frozen_control_sequence+6 {second permanent `\.{\\endtemplate}'}
  4539  @d frozen_relax=frozen_control_sequence+7 {permanent `\.{\\relax}'}
  4540  @d end_write=frozen_control_sequence+8 {permanent `\.{\\endwrite}'}
  4541  @d frozen_dont_expand=frozen_control_sequence+9
  4542    {permanent `\.{\\notexpanded:}'}
  4543  @d frozen_null_font=frozen_control_sequence+10
  4544    {permanent `\.{\\nullfont}'}
  4545  @d font_id_base=frozen_null_font-font_base
  4546    {begins table of 257 permanent font identifiers}
  4547  @d undefined_control_sequence=frozen_null_font+257 {dummy location}
  4548  @d glue_base=undefined_control_sequence+1 {beginning of region 3}
  4549  
  4550  @<Initialize table entries...@>=
  4551  eq_type(undefined_control_sequence):=undefined_cs;
  4552  equiv(undefined_control_sequence):=null;
  4553  eq_level(undefined_control_sequence):=level_zero;
  4554  for k:=active_base to undefined_control_sequence-1 do
  4555    eqtb[k]:=eqtb[undefined_control_sequence];
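
@ The three fields of an entry, and the way regions 1 and~2 are filled
with copies of the undefined entry, can be modeled as follows. This Python
sketch is hypothetical throughout: the class |EqtbEntry| and the function
|init_regions_1_and_2| are invented for illustration, and the numeric
values chosen for |undefined_cs|, |null|, and |min_quarterword| are
assumptions, because those constants are defined in other parts of
the program.

```python
from dataclasses import dataclass, replace

LEVEL_ZERO = 0              # assumes min_quarterword = 0
LEVEL_ONE = LEVEL_ZERO + 1
UNDEFINED_CS = 101          # assumed command code, defined elsewhere
NULL = 0                    # assumed value of the null pointer


@dataclass
class EqtbEntry:
    eq_type: int    # corresponds to hh.b0
    eq_level: int   # corresponds to hh.b1
    equiv: int      # corresponds to hh.rh


def init_regions_1_and_2(active_base, undefined_control_sequence):
    """Fill regions 1 and 2 with copies of the undefined entry."""
    eqtb = {}
    eqtb[undefined_control_sequence] = EqtbEntry(UNDEFINED_CS, LEVEL_ZERO, NULL)
    for k in range(active_base, undefined_control_sequence):
        # Pascal record assignment copies the value, hence replace()
        eqtb[k] = replace(eqtb[undefined_control_sequence])
    return eqtb


eqtb = init_regions_1_and_2(active_base=1, undefined_control_sequence=2881)
```

Each location gets its own copy, so a later change to one entry leaves
the dummy entry at |undefined_control_sequence| untouched.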
  4556  
  4557  @ Here is a routine that displays the current meaning of an |eqtb| entry
  4558  in region 1 or~2. (Similar routines for the other regions will appear
  4559  below.)
  4560  
  4561  @<Show equivalent |n|, in region 1 or 2@>=
  4562  begin sprint_cs(n); print_char("="); print_cmd_chr(eq_type(n),equiv(n));
  4563  if eq_type(n)>=call then
  4564    begin print_char(":"); show_token_list(link(equiv(n)),null,32);
  4565    end;
  4566  end
  4567  
  4568  @ Region 3 of |eqtb| contains the 256 \.{\\skip} registers, as well as the
  4569  glue parameters defined here. It is important that the ``muskip''
  4570  parameters have larger numbers than the others.
  4571  
  4572  @d line_skip_code=0 {interline glue if |baseline_skip| is infeasible}
  4573  @d baseline_skip_code=1 {desired glue between baselines}
  4574  @d par_skip_code=2 {extra glue just above a paragraph}
  4575  @d above_display_skip_code=3 {extra glue just above displayed math}
  4576  @d below_display_skip_code=4 {extra glue just below displayed math}
  4577  @d above_display_short_skip_code=5
  4578    {glue above displayed math following short lines}
  4579  @d below_display_short_skip_code=6
  4580    {glue below displayed math following short lines}
  4581  @d left_skip_code=7 {glue at left of justified lines}
  4582  @d right_skip_code=8 {glue at right of justified lines}
  4583  @d top_skip_code=9 {glue at top of main pages}
  4584  @d split_top_skip_code=10 {glue at top of split pages}
  4585  @d tab_skip_code=11 {glue between aligned entries}
  4586  @d space_skip_code=12 {glue between words (if not |zero_glue|)}
  4587  @d xspace_skip_code=13 {glue after sentences (if not |zero_glue|)}
  4588  @d par_fill_skip_code=14 {glue on last line of paragraph}
  4589  @d thin_mu_skip_code=15 {thin space in math formula}
  4590  @d med_mu_skip_code=16 {medium space in math formula}
  4591  @d thick_mu_skip_code=17 {thick space in math formula}
  4592  @d glue_pars=18 {total number of glue parameters}
  4593  @d skip_base=glue_base+glue_pars {table of 256 ``skip'' registers}
  4594  @d mu_skip_base=skip_base+256 {table of 256 ``muskip'' registers}
  4595  @d local_base=mu_skip_base+256 {beginning of region 4}
  4596  @#
  4597  @d skip(#)==equiv(skip_base+#) {|mem| location of glue specification}
  4598  @d mu_skip(#)==equiv(mu_skip_base+#) {|mem| location of math glue spec}
  4599  @d glue_par(#)==equiv(glue_base+#) {|mem| location of glue specification}
  4600  @d line_skip==glue_par(line_skip_code)
  4601  @d baseline_skip==glue_par(baseline_skip_code)
  4602  @d par_skip==glue_par(par_skip_code)
  4603  @d above_display_skip==glue_par(above_display_skip_code)
  4604  @d below_display_skip==glue_par(below_display_skip_code)
  4605  @d above_display_short_skip==glue_par(above_display_short_skip_code)
  4606  @d below_display_short_skip==glue_par(below_display_short_skip_code)
  4607  @d left_skip==glue_par(left_skip_code)
  4608  @d right_skip==glue_par(right_skip_code)
  4609  @d top_skip==glue_par(top_skip_code)
  4610  @d split_top_skip==glue_par(split_top_skip_code)
  4611  @d tab_skip==glue_par(tab_skip_code)
  4612  @d space_skip==glue_par(space_skip_code)
  4613  @d xspace_skip==glue_par(xspace_skip_code)
  4614  @d par_fill_skip==glue_par(par_fill_skip_code)
  4615  @d thin_mu_skip==glue_par(thin_mu_skip_code)
  4616  @d med_mu_skip==glue_par(med_mu_skip_code)
  4617  @d thick_mu_skip==glue_par(thick_mu_skip_code)
  4618  
  4619  @<Current |mem| equivalent of glue parameter number |n|@>=glue_par(n)
  4620  
  4621  @ Sometimes we need to convert \TeX's internal code numbers into symbolic
  4622  form. The |print_skip_param| routine gives the symbolic name of a glue
  4623  parameter.
  4624  
  4625  @<Declare the procedure called |print_skip_param|@>=
  4626  procedure print_skip_param(@!n:integer);
  4627  begin case n of
  4628  line_skip_code: print_esc("lineskip");
  4629  baseline_skip_code: print_esc("baselineskip");
  4630  par_skip_code: print_esc("parskip");
  4631  above_display_skip_code: print_esc("abovedisplayskip");
  4632  below_display_skip_code: print_esc("belowdisplayskip");
  4633  above_display_short_skip_code: print_esc("abovedisplayshortskip");
  4634  below_display_short_skip_code: print_esc("belowdisplayshortskip");
  4635  left_skip_code: print_esc("leftskip");
  4636  right_skip_code: print_esc("rightskip");
  4637  top_skip_code: print_esc("topskip");
  4638  split_top_skip_code: print_esc("splittopskip");
  4639  tab_skip_code: print_esc("tabskip");
  4640  space_skip_code: print_esc("spaceskip");
  4641  xspace_skip_code: print_esc("xspaceskip");
  4642  par_fill_skip_code: print_esc("parfillskip");
  4643  thin_mu_skip_code: print_esc("thinmuskip");
  4644  med_mu_skip_code: print_esc("medmuskip");
  4645  thick_mu_skip_code: print_esc("thickmuskip");
  4646  othercases print("[unknown glue parameter!]")
  4647  endcases;
  4648  end;
  4649  
  4650  @ The symbolic names for glue parameters are put into \TeX's hash table
  4651  by using the routine called |primitive|, defined below. Let us enter them
  4652  now, so that we don't have to list all those parameter names anywhere else.
  4653  
  4654  @<Put each of \TeX's primitives into the hash table@>=
  4655  primitive("lineskip",assign_glue,glue_base+line_skip_code);@/
  4656  @!@:line_skip_}{\.{\\lineskip} primitive@>
  4657  primitive("baselineskip",assign_glue,glue_base+baseline_skip_code);@/
  4658  @!@:baseline_skip_}{\.{\\baselineskip} primitive@>
  4659  primitive("parskip",assign_glue,glue_base+par_skip_code);@/
  4660  @!@:par_skip_}{\.{\\parskip} primitive@>
  4661  primitive("abovedisplayskip",assign_glue,glue_base+above_display_skip_code);@/
  4662  @!@:above_display_skip_}{\.{\\abovedisplayskip} primitive@>
  4663  primitive("belowdisplayskip",assign_glue,glue_base+below_display_skip_code);@/
  4664  @!@:below_display_skip_}{\.{\\belowdisplayskip} primitive@>
  4665  primitive("abovedisplayshortskip",
  4666    assign_glue,glue_base+above_display_short_skip_code);@/
  4667  @!@:above_display_short_skip_}{\.{\\abovedisplayshortskip} primitive@>
  4668  primitive("belowdisplayshortskip",
  4669    assign_glue,glue_base+below_display_short_skip_code);@/
  4670  @!@:below_display_short_skip_}{\.{\\belowdisplayshortskip} primitive@>
  4671  primitive("leftskip",assign_glue,glue_base+left_skip_code);@/
  4672  @!@:left_skip_}{\.{\\leftskip} primitive@>
  4673  primitive("rightskip",assign_glue,glue_base+right_skip_code);@/
  4674  @!@:right_skip_}{\.{\\rightskip} primitive@>
  4675  primitive("topskip",assign_glue,glue_base+top_skip_code);@/
  4676  @!@:top_skip_}{\.{\\topskip} primitive@>
  4677  primitive("splittopskip",assign_glue,glue_base+split_top_skip_code);@/
  4678  @!@:split_top_skip_}{\.{\\splittopskip} primitive@>
  4679  primitive("tabskip",assign_glue,glue_base+tab_skip_code);@/
  4680  @!@:tab_skip_}{\.{\\tabskip} primitive@>
  4681  primitive("spaceskip",assign_glue,glue_base+space_skip_code);@/
  4682  @!@:space_skip_}{\.{\\spaceskip} primitive@>
  4683  primitive("xspaceskip",assign_glue,glue_base+xspace_skip_code);@/
  4684  @!@:xspace_skip_}{\.{\\xspaceskip} primitive@>
  4685  primitive("parfillskip",assign_glue,glue_base+par_fill_skip_code);@/
  4686  @!@:par_fill_skip_}{\.{\\parfillskip} primitive@>
  4687  primitive("thinmuskip",assign_mu_glue,glue_base+thin_mu_skip_code);@/
  4688  @!@:thin_mu_skip_}{\.{\\thinmuskip} primitive@>
  4689  primitive("medmuskip",assign_mu_glue,glue_base+med_mu_skip_code);@/
  4690  @!@:med_mu_skip_}{\.{\\medmuskip} primitive@>
  4691  primitive("thickmuskip",assign_mu_glue,glue_base+thick_mu_skip_code);@/
  4692  @!@:thick_mu_skip_}{\.{\\thickmuskip} primitive@>
  4693  
  4694  @ @<Cases of |print_cmd_chr| for symbolic printing of primitives@>=
  4695  assign_glue,assign_mu_glue: if chr_code<skip_base then
  4696      print_skip_param(chr_code-glue_base)
  4697    else if chr_code<mu_skip_base then
  4698      begin print_esc("skip"); print_int(chr_code-skip_base);
  4699      end
  4700    else  begin print_esc("muskip"); print_int(chr_code-mu_skip_base);
  4701      end;
  4702  
  4703  @ All glue parameters and registers are initially `\.{0pt plus0pt minus0pt}'.
  4704  
  4705  @<Initialize table entries...@>=
  4706  equiv(glue_base):=zero_glue; eq_level(glue_base):=level_one;
  4707  eq_type(glue_base):=glue_ref;
  4708  for k:=glue_base+1 to local_base-1 do eqtb[k]:=eqtb[glue_base];
  4709  glue_ref_count(zero_glue):=glue_ref_count(zero_glue)+local_base-glue_base;
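
@ The last statement above adjusts the reference count of |zero_glue|
because every entry of region~3 now points to that one specification.
A minimal Python sketch of the same bookkeeping follows; the class
|GlueSpec| and the function |init_region_3| are hypothetical, and the
starting value of the counter is an assumption, since |zero_glue| is
set up elsewhere.

```python
class GlueSpec:
    """Hypothetical glue specification with a reference counter."""

    def __init__(self):
        self.ref_count = 0  # plays the role of glue_ref_count; start value assumed


def init_region_3(glue_base, local_base, zero_glue):
    """Point every glue parameter and register at the shared zero_glue."""
    eqtb = {}
    for k in range(glue_base, local_base):
        eqtb[k] = zero_glue
    # one new reference per location of region 3
    zero_glue.ref_count += local_base - glue_base
    return eqtb


zero_glue = GlueSpec()
region3 = init_region_3(glue_base=2882, local_base=3412, zero_glue=zero_glue)
```

The sample addresses correspond to 18 parameters plus two tables of
256 registers each.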
  4710  
  4711  @ @<Show equivalent |n|, in region 3@>=
  4712  if n<skip_base then
  4713    begin print_skip_param(n-glue_base); print_char("=");
  4714    if n<glue_base+thin_mu_skip_code then print_spec(equiv(n),"pt")
  4715    else print_spec(equiv(n),"mu");
  4716    end
  4717  else if n<mu_skip_base then
  4718    begin print_esc("skip"); print_int(n-skip_base); print_char("=");
  4719    print_spec(equiv(n),"pt");
  4720    end
  4721  else  begin print_esc("muskip"); print_int(n-mu_skip_base); print_char("=");
  4722    print_spec(equiv(n),"mu");
  4723    end
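
@ The case analysis for region~3 relies on the ordering of the codes: all
``muskip'' parameters come last, so a single comparison decides whether
the units are \.{pt} or \.{mu}. The Python sketch below mirrors that
analysis under a few assumptions: the function |describe_glue_location|
is hypothetical, the escape character is taken to be a backslash, and
the function returns the name and the unit instead of printing the
glue specification.

```python
GLUE_PARAM_NAMES = [
    "lineskip", "baselineskip", "parskip", "abovedisplayskip",
    "belowdisplayskip", "abovedisplayshortskip", "belowdisplayshortskip",
    "leftskip", "rightskip", "topskip", "splittopskip", "tabskip",
    "spaceskip", "xspaceskip", "parfillskip",
    "thinmuskip", "medmuskip", "thickmuskip",
]
THIN_MU_SKIP_CODE = 15
GLUE_PARS = 18


def describe_glue_location(n, glue_base):
    """Return (symbolic name, unit) for location n of region 3."""
    skip_base = glue_base + GLUE_PARS
    mu_skip_base = skip_base + 256
    local_base = mu_skip_base + 256
    if not glue_base <= n < local_base:
        raise ValueError("location is outside region 3")
    if n < skip_base:
        code = n - glue_base
        unit = "pt" if code < THIN_MU_SKIP_CODE else "mu"
        return "\\" + GLUE_PARAM_NAMES[code], unit
    if n < mu_skip_base:
        return "\\skip%d" % (n - skip_base), "pt"
    return "\\muskip%d" % (n - mu_skip_base), "mu"
```

For example, with |glue_base=2882| the location 2883 is described as the
baselineskip parameter with unit \.{pt}.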
  4724  
  4725  @ Region 4 of |eqtb| contains the local quantities defined here. The
  4726  bulk of this region is taken up by five tables that are indexed by eight-bit
  4727  characters; these tables are important to both the syntactic and semantic
  4728  portions of \TeX. There are also a bunch of special things like font and
  4729  token parameters, as well as the tables of \.{\\toks} and \.{\\box}
  4730  registers.
  4731  
  4732  @d par_shape_loc=local_base {specifies paragraph shape}
  4733  @d output_routine_loc=local_base+1 {points to token list for \.{\\output}}
  4734  @d every_par_loc=local_base+2 {points to token list for \.{\\everypar}}
  4735  @d every_math_loc=local_base+3 {points to token list for \.{\\everymath}}
  4736  @d every_display_loc=local_base+4 {points to token list for \.{\\everydisplay}}
  4737  @d every_hbox_loc=local_base+5 {points to token list for \.{\\everyhbox}}
  4738  @d every_vbox_loc=local_base+6 {points to token list for \.{\\everyvbox}}
  4739  @d every_job_loc=local_base+7 {points to token list for \.{\\everyjob}}
  4740  @d every_cr_loc=local_base+8 {points to token list for \.{\\everycr}}
  4741  @d err_help_loc=local_base+9 {points to token list for \.{\\errhelp}}
  4742  @d toks_base=local_base+10 {table of 256 token list registers}
  4743  @d box_base=toks_base+256 {table of 256 box registers}
  4744  @d cur_font_loc=box_base+256 {internal font number outside math mode}
  4745  @d math_font_base=cur_font_loc+1 {table of 48 math font numbers}
  4746  @d cat_code_base=math_font_base+48
  4747    {table of 256 command codes (the ``catcodes'')}
  4748  @d lc_code_base=cat_code_base+256 {table of 256 lowercase mappings}
  4749  @d uc_code_base=lc_code_base+256 {table of 256 uppercase mappings}
  4750  @d sf_code_base=uc_code_base+256 {table of 256 spacefactor mappings}
  4751  @d math_code_base=sf_code_base+256 {table of 256 math mode mappings}
  4752  @d int_base=math_code_base+256 {beginning of region 5}
  4753  @#
  4754  @d par_shape_ptr==equiv(par_shape_loc)
  4755  @d output_routine==equiv(output_routine_loc)
  4756  @d every_par==equiv(every_par_loc)
  4757  @d every_math==equiv(every_math_loc)
  4758  @d every_display==equiv(every_display_loc)
  4759  @d every_hbox==equiv(every_hbox_loc)
  4760  @d every_vbox==equiv(every_vbox_loc)
  4761  @d every_job==equiv(every_job_loc)
  4762  @d every_cr==equiv(every_cr_loc)
  4763  @d err_help==equiv(err_help_loc)
  4764  @d toks(#)==equiv(toks_base+#)
  4765  @d box(#)==equiv(box_base+#)
  4766  @d cur_font==equiv(cur_font_loc)
  4767  @d fam_fnt(#)==equiv(math_font_base+#)
  4768  @d cat_code(#)==equiv(cat_code_base+#)
  4769  @d lc_code(#)==equiv(lc_code_base+#)
  4770  @d uc_code(#)==equiv(uc_code_base+#)
  4771  @d sf_code(#)==equiv(sf_code_base+#)
  4772  @d math_code(#)==equiv(math_code_base+#)
  4773    {Note: |math_code(c)| is the true math code plus |min_halfword|}
  4774  
  4775  @<Put each...@>=
  4776  primitive("output",assign_toks,output_routine_loc);
  4777  @!@:output_}{\.{\\output} primitive@>
  4778  primitive("everypar",assign_toks,every_par_loc);
  4779  @!@:every_par_}{\.{\\everypar} primitive@>
  4780  primitive("everymath",assign_toks,every_math_loc);
  4781  @!@:every_math_}{\.{\\everymath} primitive@>
  4782  primitive("everydisplay",assign_toks,every_display_loc);
  4783  @!@:every_display_}{\.{\\everydisplay} primitive@>
  4784  primitive("everyhbox",assign_toks,every_hbox_loc);
  4785  @!@:every_hbox_}{\.{\\everyhbox} primitive@>
  4786  primitive("everyvbox",assign_toks,every_vbox_loc);
  4787  @!@:every_vbox_}{\.{\\everyvbox} primitive@>
  4788  primitive("everyjob",assign_toks,every_job_loc);
  4789  @!@:every_job_}{\.{\\everyjob} primitive@>
  4790  primitive("everycr",assign_toks,every_cr_loc);
  4791  @!@:every_cr_}{\.{\\everycr} primitive@>
  4792  primitive("errhelp",assign_toks,err_help_loc);
  4793  @!@:err_help_}{\.{\\errhelp} primitive@>
  4794  
  4795  @ @<Cases of |print_cmd_chr|...@>=
  4796  assign_toks: if chr_code>=toks_base then
  4797    begin print_esc("toks"); print_int(chr_code-toks_base);
  4798    end
  4799  else  case chr_code of
  4800    output_routine_loc: print_esc("output");
  4801    every_par_loc: print_esc("everypar");
  4802    every_math_loc: print_esc("everymath");
  4803    every_display_loc: print_esc("everydisplay");
  4804    every_hbox_loc: print_esc("everyhbox");
  4805    every_vbox_loc: print_esc("everyvbox");
  4806    every_job_loc: print_esc("everyjob");
  4807    every_cr_loc: print_esc("everycr");
  4808    othercases print_esc("errhelp")
  4809    endcases;
  4810  
  4811  @ We initialize most things to null or undefined values. An undefined font
  4812  is represented by the internal code |font_base|.
  4813  
  4814  However, the character code tables are given initial values based on the
  4815  conventional interpretation of ASCII code. These initial values should
  4816  not be changed when \TeX\ is adapted for use with non-English languages;
  4817  all changes to the initialization conventions should be made in format
  4818  packages, not in \TeX\ itself, so that global interchange of formats is
  4819  possible.
  4820  
  4821  @d null_font==font_base
  4822  @d var_code==@'70000 {math code meaning ``use the current family''}
  4823  
  4824  @<Initialize table entries...@>=
  4825  par_shape_ptr:=null; eq_type(par_shape_loc):=shape_ref;
  4826  eq_level(par_shape_loc):=level_one;@/
  4827  for k:=output_routine_loc to toks_base+255 do
  4828    eqtb[k]:=eqtb[undefined_control_sequence];
  4829  box(0):=null; eq_type(box_base):=box_ref; eq_level(box_base):=level_one;
  4830  for k:=box_base+1 to box_base+255 do eqtb[k]:=eqtb[box_base];
  4831  cur_font:=null_font; eq_type(cur_font_loc):=data;
  4832  eq_level(cur_font_loc):=level_one;@/
  4833  for k:=math_font_base to math_font_base+47 do eqtb[k]:=eqtb[cur_font_loc];
  4834  equiv(cat_code_base):=0; eq_type(cat_code_base):=data;
  4835  eq_level(cat_code_base):=level_one;@/
  4836  for k:=cat_code_base+1 to int_base-1 do eqtb[k]:=eqtb[cat_code_base];
  4837  for k:=0 to 255 do
  4838    begin cat_code(k):=other_char; math_code(k):=hi(k); sf_code(k):=1000;
  4839    end;
  4840  cat_code(carriage_return):=car_ret; cat_code(" "):=spacer;
  4841  cat_code("\"):=escape; cat_code("%"):=comment;
  4842  cat_code(invalid_code):=invalid_char; cat_code(null_code):=ignore;
  4843  for k:="0" to "9" do math_code(k):=hi(k+var_code);
  4844  for k:="A" to "Z" do
  4845    begin cat_code(k):=letter; cat_code(k+"a"-"A"):=letter;@/
  4846    math_code(k):=hi(k+var_code+@"100);
  4847    math_code(k+"a"-"A"):=hi(k+"a"-"A"+var_code+@"100);@/
  4848    lc_code(k):=k+"a"-"A"; lc_code(k+"a"-"A"):=k+"a"-"A";@/
  4849    uc_code(k):=k; uc_code(k+"a"-"A"):=k;@/
  4850    sf_code(k):=999;
  4851    end;
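
@ The initial contents of the character code tables can be reproduced
with a short Python sketch. It is an illustration under stated
assumptions: the function |initial_code_tables| is hypothetical, the
numeric category codes and the three special character codes are assumed
values for constants that are defined in other parts of the program, and
|min_halfword| is taken to be zero so that |hi| has no effect.

```python
VAR_CODE = 0o70000  # math code meaning "use the current family"

# Assumed category codes (defined elsewhere in the program)
ESCAPE, CAR_RET, IGNORE, SPACER = 0, 5, 9, 10
LETTER, OTHER_CHAR, COMMENT, INVALID_CHAR = 11, 12, 14, 15
# Assumed ASCII codes of the special characters
CARRIAGE_RETURN, NULL_CODE, INVALID_CODE = 0o15, 0o0, 0o177


def initial_code_tables():
    cat = [OTHER_CHAR] * 256
    math = list(range(256))   # math_code(k):=hi(k), with min_halfword = 0
    sf = [1000] * 256
    lc = [0] * 256            # copied from the zero entry at cat_code_base
    uc = [0] * 256
    cat[CARRIAGE_RETURN] = CAR_RET
    cat[ord(" ")] = SPACER
    cat[ord("\\")] = ESCAPE
    cat[ord("%")] = COMMENT
    cat[INVALID_CODE] = INVALID_CHAR
    cat[NULL_CODE] = IGNORE
    for k in range(ord("0"), ord("9") + 1):
        math[k] = k + VAR_CODE
    shift = ord("a") - ord("A")
    for k in range(ord("A"), ord("Z") + 1):
        cat[k] = cat[k + shift] = LETTER
        math[k] = k + VAR_CODE + 0x100
        math[k + shift] = k + shift + VAR_CODE + 0x100
        lc[k] = lc[k + shift] = k + shift
        uc[k] = uc[k + shift] = k
        sf[k] = 999           # uppercase letters only
    return {"cat_code": cat, "math_code": math, "sf_code": sf,
            "lc_code": lc, "uc_code": uc}


tables = initial_code_tables()
```

Under these assumptions the math code of an uppercase A comes out as
hexadecimal 7141, while characters that are neither letters nor digits
keep their own code as math code.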
  4852  
  4853  @ @<Show equivalent |n|, in region 4@>=
  4854  if n=par_shape_loc then
  4855    begin print_esc("parshape"); print_char("=");
  4856    if par_shape_ptr=null then print_char("0")
  4857    else print_int(info(par_shape_ptr));
  4858    end
  4859  else if n<toks_base then
  4860    begin print_cmd_chr(assign_toks,n); print_char("=");
  4861    if equiv(n)<>null then show_token_list(link(equiv(n)),null,32);
  4862    end
  4863  else if n<box_base then
  4864    begin print_esc("toks"); print_int(n-toks_base); print_char("=");
  4865    if equiv(n)<>null then show_token_list(link(equiv(n)),null,32);
  4866    end
  4867  else if n<cur_font_loc then
  4868    begin print_esc("box"); print_int(n-box_base); print_char("=");
  4869    if equiv(n)=null then print("void")
  4870    else  begin depth_threshold:=0; breadth_max:=1; show_node_list(equiv(n));
  4871      end;
  4872    end
  4873  else if n<cat_code_base then @<Show the font identifier in |eqtb[n]|@>
  4874  else @<Show the halfword code in |eqtb[n]|@>
  4875  
  4876  @ @<Show the font identifier in |eqtb[n]|@>=
  4877  begin if n=cur_font_loc then print("current font")
  4878  else if n<math_font_base+16 then
  4879    begin print_esc("textfont"); print_int(n-math_font_base);
  4880    end
  4881  else if n<math_font_base+32 then
  4882    begin print_esc("scriptfont"); print_int(n-math_font_base-16);
  4883    end
  4884  else  begin print_esc("scriptscriptfont"); print_int(n-math_font_base-32);
  4885    end;
  4886  print_char("=");@/
  4887  print_esc(hash[font_id_base+equiv(n)].rh);
  4888    {that's |font_id_text(equiv(n))|}
  4889  end
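
@ The 48 math font locations are arranged as three consecutive groups of
16, one group per size. A small Python sketch of the naming step is given
below; the function |font_slot_name| is hypothetical, the escape character
is assumed to be a backslash, and the lookup of the font identifier in
|hash| is left out.

```python
def font_slot_name(n, cur_font_loc):
    """Return the symbolic name of font location n in region 4."""
    math_font_base = cur_font_loc + 1
    if not cur_font_loc <= n < math_font_base + 48:
        raise ValueError("location holds no font identifier")
    if n == cur_font_loc:
        return "current font"
    if n < math_font_base + 16:
        return "\\textfont%d" % (n - math_font_base)
    if n < math_font_base + 32:
        return "\\scriptfont%d" % (n - math_font_base - 16)
    return "\\scriptscriptfont%d" % (n - math_font_base - 32)
```

The family number is thus the offset within the group, and the group
itself selects text, script, or scriptscript size.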
  4890  
  4891  @ @<Show the halfword code in |eqtb[n]|@>=
  4892  if n<math_code_base then
  4893    begin if n<lc_code_base then
  4894      begin print_esc("catcode"); print_int(n-cat_code_base);
  4895      end
  4896    else if n<uc_code_base then
  4897      begin print_esc("lccode"); print_int(n-lc_code_base);
  4898      end
  4899    else if n<sf_code_base then
  4900      begin print_esc("uccode"); print_int(n-uc_code_base);
  4901      end
  4902    else  begin print_esc("sfcode"); print_int(n-sf_code_base);
  4903      end;
  4904    print_char("="); print_int(equiv(n));
  4905    end
  4906  else  begin print_esc("mathcode"); print_int(n-math_code_base);
  4907    print_char("="); print_int(ho(equiv(n)));
  4908    end
  4909  
  4910  @ Region 5 of |eqtb| contains the integer parameters and registers defined
  4911  here, as well as the |del_code| table. The latter table differs from the
  4912  |cat_code..math_code| tables that precede it, since delimiter codes are
  4913  fullword integers while the other kinds of codes occupy at most a
  4914  halfword. This is what makes region~5 different from region~4. We will
  4915  store the |eq_level| information in an auxiliary array of quarterwords
  4916  that will be defined later.
  4917  
  4918  @d pretolerance_code=0 {badness tolerance before hyphenation}
  4919  @d tolerance_code=1 {badness tolerance after hyphenation}
  4920  @d line_penalty_code=2 {added to the badness of every line}
  4921  @d hyphen_penalty_code=3 {penalty for break after discretionary hyphen}
  4922  @d ex_hyphen_penalty_code=4 {penalty for break after explicit hyphen}
  4923  @d club_penalty_code=5 {penalty for creating a club line}
  4924  @d widow_penalty_code=6 {penalty for creating a widow line}
  4925  @d display_widow_penalty_code=7 {ditto, just before a display}
  4926  @d broken_penalty_code=8 {penalty for breaking a page at a broken line}
  4927  @d bin_op_penalty_code=9 {penalty for breaking after a binary operation}
  4928  @d rel_penalty_code=10 {penalty for breaking after a relation}
  4929  @d pre_display_penalty_code=11
  4930    {penalty for breaking just before a displayed formula}
  4931  @d post_display_penalty_code=12
  4932    {penalty for breaking just after a displayed formula}
  4933  @d inter_line_penalty_code=13 {additional penalty between lines}
  4934  @d double_hyphen_demerits_code=14 {demerits for double hyphen break}
  4935  @d final_hyphen_demerits_code=15 {demerits for final hyphen break}
  4936  @d adj_demerits_code=16 {demerits for adjacent incompatible lines}
  4937  @d mag_code=17 {magnification ratio}
  4938  @d delimiter_factor_code=18 {ratio for variable-size delimiters}
  4939  @d looseness_code=19 {change in number of lines for a paragraph}
  4940  @d time_code=20 {current time of day}
  4941  @d day_code=21 {current day of the month}
  4942  @d month_code=22 {current month of the year}
  4943  @d year_code=23 {current year of our Lord}
  4944  @d show_box_breadth_code=24 {nodes per level in |show_box|}
  4945  @d show_box_depth_code=25 {maximum level in |show_box|}
  4946  @d hbadness_code=26 {hboxes exceeding this badness will be shown by |hpack|}
  4947  @d vbadness_code=27 {vboxes exceeding this badness will be shown by |vpack|}
  4948  @d pausing_code=28 {pause after each line is read from a file}
  4949  @d tracing_online_code=29 {show diagnostic output on terminal}
  4950  @d tracing_macros_code=30 {show macros as they are being expanded}
  4951  @d tracing_stats_code=31 {show memory usage if \TeX\ knows it}
  4952  @d tracing_paragraphs_code=32 {show line-break calculations}
  4953  @d tracing_pages_code=33 {show page-break calculations}
  4954  @d tracing_output_code=34 {show boxes when they are shipped out}
  4955  @d tracing_lost_chars_code=35 {show characters that aren't in the font}
  4956  @d tracing_commands_code=36 {show command codes at |big_switch|}
  4957  @d tracing_restores_code=37 {show equivalents when they are restored}
  4958  @d uc_hyph_code=38 {hyphenate words beginning with a capital letter}
  4959  @d output_penalty_code=39 {penalty found at current page break}
  4960  @d max_dead_cycles_code=40 {bound on consecutive dead cycles of output}
  4961  @d hang_after_code=41 {hanging indentation changes after this many lines}
  4962  @d floating_penalty_code=42 {penalty for insertions held over after a split}
  4963  @d global_defs_code=43 {override \.{\\global} specifications}
  4964  @d cur_fam_code=44 {current family}
  4965  @d escape_char_code=45 {escape character for token output}
  4966  @d default_hyphen_char_code=46 {value of \.{\\hyphenchar} when a font is loaded}
  4967  @d default_skew_char_code=47 {value of \.{\\skewchar} when a font is loaded}
  4968  @d end_line_char_code=48 {character placed at the right end of the buffer}
  4969  @d new_line_char_code=49 {character that prints as |print_ln|}
  4970  @d language_code=50 {current hyphenation table}
  4971  @d left_hyphen_min_code=51 {minimum left hyphenation fragment size}
  4972  @d right_hyphen_min_code=52 {minimum right hyphenation fragment size}
  4973  @d holding_inserts_code=53 {do not remove insertion nodes from \.{\\box255}}
  4974  @d error_context_lines_code=54 {maximum intermediate line pairs shown}
  4975  @d int_pars=55 {total number of integer parameters}
  4976  @d count_base=int_base+int_pars {256 user \.{\\count} registers}
  4977  @d del_code_base=count_base+256 {256 delimiter code mappings}
  4978  @d dimen_base=del_code_base+256 {beginning of region 6}
  4979  @#
  4980  @d del_code(#)==eqtb[del_code_base+#].int
  4981  @d count(#)==eqtb[count_base+#].int
  4982  @d int_par(#)==eqtb[int_base+#].int {an integer parameter}
  4983  @d pretolerance==int_par(pretolerance_code)
  4984  @d tolerance==int_par(tolerance_code)
  4985  @d line_penalty==int_par(line_penalty_code)
  4986  @d hyphen_penalty==int_par(hyphen_penalty_code)
  4987  @d ex_hyphen_penalty==int_par(ex_hyphen_penalty_code)
  4988  @d club_penalty==int_par(club_penalty_code)
  4989  @d widow_penalty==int_par(widow_penalty_code)
  4990  @d display_widow_penalty==int_par(display_widow_penalty_code)
  4991  @d broken_penalty==int_par(broken_penalty_code)
  4992  @d bin_op_penalty==int_par(bin_op_penalty_code)
  4993  @d rel_penalty==int_par(rel_penalty_code)
  4994  @d pre_display_penalty==int_par(pre_display_penalty_code)
  4995  @d post_display_penalty==int_par(post_display_penalty_code)
  4996  @d inter_line_penalty==int_par(inter_line_penalty_code)
  4997  @d double_hyphen_demerits==int_par(double_hyphen_demerits_code)
  4998  @d final_hyphen_demerits==int_par(final_hyphen_demerits_code)
  4999  @d adj_demerits==int_par(adj_demerits_code)
  5000  @d mag==int_par(mag_code)
  5001  @d delimiter_factor==int_par(delimiter_factor_code)
  5002  @d looseness==int_par(looseness_code)
  5003  @d time==int_par(time_code)
  5004  @d day==int_par(day_code)
  5005  @d month==int_par(month_code)
  5006  @d year==int_par(year_code)
  5007  @d show_box_breadth==int_par(show_box_breadth_code)
  5008  @d show_box_depth==int_par(show_box_depth_code)
  5009  @d hbadness==int_par(hbadness_code)
  5010  @d vbadness==int_par(vbadness_code)
  5011  @d pausing==int_par(pausing_code)
  5012  @d tracing_online==int_par(tracing_online_code)
  5013  @d tracing_macros==int_par(tracing_macros_code)
  5014  @d tracing_stats==int_par(tracing_stats_code)
  5015  @d tracing_paragraphs==int_par(tracing_paragraphs_code)
  5016  @d tracing_pages==int_par(tracing_pages_code)
  5017  @d tracing_output==int_par(tracing_output_code)
  5018  @d tracing_lost_chars==int_par(tracing_lost_chars_code)
  5019  @d tracing_commands==int_par(tracing_commands_code)
  5020  @d tracing_restores==int_par(tracing_restores_code)
  5021  @d uc_hyph==int_par(uc_hyph_code)
  5022  @d output_penalty==int_par(output_penalty_code)
  5023  @d max_dead_cycles==int_par(max_dead_cycles_code)
  5024  @d hang_after==int_par(hang_after_code)
  5025  @d floating_penalty==int_par(floating_penalty_code)
  5026  @d global_defs==int_par(global_defs_code)
  5027  @d cur_fam==int_par(cur_fam_code)
  5028  @d escape_char==int_par(escape_char_code)
  5029  @d default_hyphen_char==int_par(default_hyphen_char_code)
  5030  @d default_skew_char==int_par(default_skew_char_code)
  5031  @d end_line_char==int_par(end_line_char_code)
  5032  @d new_line_char==int_par(new_line_char_code)
  5033  @d language==int_par(language_code)
  5034  @d left_hyphen_min==int_par(left_hyphen_min_code)
  5035  @d right_hyphen_min==int_par(right_hyphen_min_code)
  5036  @d holding_inserts==int_par(holding_inserts_code)
  5037  @d error_context_lines==int_par(error_context_lines_code)
  5038  
  5039  @<Assign the values |depth_threshold:=show_box_depth|...@>=
  5040  depth_threshold:=show_box_depth;
  5041  breadth_max:=show_box_breadth
  5042  
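@ The address arithmetic behind region~5 can be checked with a small sketch.
The following Python fragment is an illustration only and not part of \TeX.
The value chosen for |INT_BASE| is a placeholder (the real |int_base| depends
on regions 1--4, which are defined earlier), and the helper names
|int_par_loc|, |count_loc|, and |del_code_loc| are hypothetical.

```python
# Sketch of the region-5 layout of eqtb; only the index arithmetic is modeled.
INT_BASE = 1000                     # hypothetical placeholder, not TeX's value
INT_PARS = 55                       # total number of integer parameters
COUNT_BASE = INT_BASE + INT_PARS    # 256 user \count registers
DEL_CODE_BASE = COUNT_BASE + 256    # 256 delimiter code mappings
DIMEN_BASE = DEL_CODE_BASE + 256    # beginning of region 6


def int_par_loc(code):
    """Index of the integer parameter with the given code (0..INT_PARS-1)."""
    if not 0 <= code < INT_PARS:
        raise ValueError("not an integer parameter code")
    return INT_BASE + code


def count_loc(i):
    """Index of \\count register i (0..255)."""
    if not 0 <= i < 256:
        raise ValueError("register number out of range")
    return COUNT_BASE + i


def del_code_loc(c):
    """Index of the delimiter code for character c (0..255)."""
    if not 0 <= c < 256:
        raise ValueError("character code out of range")
    return DEL_CODE_BASE + c
```

Whatever the actual |int_base| is, the region spans $55+256+256=567$
consecutive words, and the three subregions abut without gaps.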
  5043  @ We can print the symbolic name of an integer parameter as follows.
  5044  
  5045  @p procedure print_param(@!n:integer);
  5046  begin case n of
  5047  pretolerance_code:print_esc("pretolerance");
  5048  tolerance_code:print_esc("tolerance");
  5049  line_penalty_code:print_esc("linepenalty");
  5050  hyphen_penalty_code:print_esc("hyphenpenalty");
  5051  ex_hyphen_penalty_code:print_esc("exhyphenpenalty");
  5052  club_penalty_code:print_esc("clubpenalty");
  5053  widow_penalty_code:print_esc("widowpenalty");
  5054  display_widow_penalty_code:print_esc("displaywidowpenalty");
  5055  broken_penalty_code:print_esc("brokenpenalty");
  5056  bin_op_penalty_code:print_esc("binoppenalty");
  5057  rel_penalty_code:print_esc("relpenalty");
  5058  pre_display_penalty_code:print_esc("predisplaypenalty");
  5059  post_display_penalty_code:print_esc("postdisplaypenalty");
  5060  inter_line_penalty_code:print_esc("interlinepenalty");
  5061  double_hyphen_demerits_code:print_esc("doublehyphendemerits");
  5062  final_hyphen_demerits_code:print_esc("finalhyphendemerits");
  5063  adj_demerits_code:print_esc("adjdemerits");
  5064  mag_code:print_esc("mag");
  5065  delimiter_factor_code:print_esc("delimiterfactor");
  5066  looseness_code:print_esc("looseness");
  5067  time_code:print_esc("time");
  5068  day_code:print_esc("day");
  5069  month_code:print_esc("month");
  5070  year_code:print_esc("year");
  5071  show_box_breadth_code:print_esc("showboxbreadth");
  5072  show_box_depth_code:print_esc("showboxdepth");
  5073  hbadness_code:print_esc("hbadness");
  5074  vbadness_code:print_esc("vbadness");
  5075  pausing_code:print_esc("pausing");
  5076  tracing_online_code:print_esc("tracingonline");
  5077  tracing_macros_code:print_esc("tracingmacros");
  5078  tracing_stats_code:print_esc("tracingstats");
  5079  tracing_paragraphs_code:print_esc("tracingparagraphs");
  5080  tracing_pages_code:print_esc("tracingpages");
  5081  tracing_output_code:print_esc("tracingoutput");
  5082  tracing_lost_chars_code:print_esc("tracinglostchars");
  5083  tracing_commands_code:print_esc("tracingcommands");
  5084  tracing_restores_code:print_esc("tracingrestores");
  5085  uc_hyph_code:print_esc("uchyph");
  5086  output_penalty_code:print_esc("outputpenalty");
  5087  max_dead_cycles_code:print_esc("maxdeadcycles");
  5088  hang_after_code:print_esc("hangafter");
  5089  floating_penalty_code:print_esc("floatingpenalty");
  5090  global_defs_code:print_esc("globaldefs");
  5091  cur_fam_code:print_esc("fam");
  5092  escape_char_code:print_esc("escapechar");
  5093  default_hyphen_char_code:print_esc("defaulthyphenchar");
  5094  default_skew_char_code:print_esc("defaultskewchar");
  5095  end_line_char_code:print_esc("endlinechar");
  5096  new_line_char_code:print_esc("newlinechar");
  5097  language_code:print_esc("language");
  5098  left_hyphen_min_code:print_esc("lefthyphenmin");
  5099  right_hyphen_min_code:print_esc("righthyphenmin");
  5100  holding_inserts_code:print_esc("holdinginserts");
  5101  error_context_lines_code:print_esc("errorcontextlines");
  5102  othercases print("[unknown integer parameter!]")
  5103  endcases;
  5104  end;
  5105  
  5106  @ The integer parameter names must be entered into the hash table.
  5107  
  5108  @<Put each...@>=
  5109  primitive("pretolerance",assign_int,int_base+pretolerance_code);@/
  5110  @!@:pretolerance_}{\.{\\pretolerance} primitive@>
  5111  primitive("tolerance",assign_int,int_base+tolerance_code);@/
  5112  @!@:tolerance_}{\.{\\tolerance} primitive@>
  5113  primitive("linepenalty",assign_int,int_base+line_penalty_code);@/
  5114  @!@:line_penalty_}{\.{\\linepenalty} primitive@>
  5115  primitive("hyphenpenalty",assign_int,int_base+hyphen_penalty_code);@/
  5116  @!@:hyphen_penalty_}{\.{\\hyphenpenalty} primitive@>
  5117  primitive("exhyphenpenalty",assign_int,int_base+ex_hyphen_penalty_code);@/
  5118  @!@:ex_hyphen_penalty_}{\.{\\exhyphenpenalty} primitive@>
  5119  primitive("clubpenalty",assign_int,int_base+club_penalty_code);@/
  5120  @!@:club_penalty_}{\.{\\clubpenalty} primitive@>
  5121  primitive("widowpenalty",assign_int,int_base+widow_penalty_code);@/
  5122  @!@:widow_penalty_}{\.{\\widowpenalty} primitive@>
  5123  primitive("displaywidowpenalty",
  5124    assign_int,int_base+display_widow_penalty_code);@/
  5125  @!@:display_widow_penalty_}{\.{\\displaywidowpenalty} primitive@>
  5126  primitive("brokenpenalty",assign_int,int_base+broken_penalty_code);@/
  5127  @!@:broken_penalty_}{\.{\\brokenpenalty} primitive@>
  5128  primitive("binoppenalty",assign_int,int_base+bin_op_penalty_code);@/
  5129  @!@:bin_op_penalty_}{\.{\\binoppenalty} primitive@>
  5130  primitive("relpenalty",assign_int,int_base+rel_penalty_code);@/
  5131  @!@:rel_penalty_}{\.{\\relpenalty} primitive@>
  5132  primitive("predisplaypenalty",assign_int,int_base+pre_display_penalty_code);@/
  5133  @!@:pre_display_penalty_}{\.{\\predisplaypenalty} primitive@>
  5134  primitive("postdisplaypenalty",assign_int,int_base+post_display_penalty_code);@/
  5135  @!@:post_display_penalty_}{\.{\\postdisplaypenalty} primitive@>
  5136  primitive("interlinepenalty",assign_int,int_base+inter_line_penalty_code);@/
  5137  @!@:inter_line_penalty_}{\.{\\interlinepenalty} primitive@>
  5138  primitive("doublehyphendemerits",
  5139    assign_int,int_base+double_hyphen_demerits_code);@/
  5140  @!@:double_hyphen_demerits_}{\.{\\doublehyphendemerits} primitive@>
  5141  primitive("finalhyphendemerits",
  5142    assign_int,int_base+final_hyphen_demerits_code);@/
  5143  @!@:final_hyphen_demerits_}{\.{\\finalhyphendemerits} primitive@>
  5144  primitive("adjdemerits",assign_int,int_base+adj_demerits_code);@/
  5145  @!@:adj_demerits_}{\.{\\adjdemerits} primitive@>
  5146  primitive("mag",assign_int,int_base+mag_code);@/
  5147  @!@:mag_}{\.{\\mag} primitive@>
  5148  primitive("delimiterfactor",assign_int,int_base+delimiter_factor_code);@/
  5149  @!@:delimiter_factor_}{\.{\\delimiterfactor} primitive@>
  5150  primitive("looseness",assign_int,int_base+looseness_code);@/
  5151  @!@:looseness_}{\.{\\looseness} primitive@>
  5152  primitive("time",assign_int,int_base+time_code);@/
  5153  @!@:time_}{\.{\\time} primitive@>
  5154  primitive("day",assign_int,int_base+day_code);@/
  5155  @!@:day_}{\.{\\day} primitive@>
  5156  primitive("month",assign_int,int_base+month_code);@/
  5157  @!@:month_}{\.{\\month} primitive@>
  5158  primitive("year",assign_int,int_base+year_code);@/
  5159  @!@:year_}{\.{\\year} primitive@>
  5160  primitive("showboxbreadth",assign_int,int_base+show_box_breadth_code);@/
  5161  @!@:show_box_breadth_}{\.{\\showboxbreadth} primitive@>
  5162  primitive("showboxdepth",assign_int,int_base+show_box_depth_code);@/
  5163  @!@:show_box_depth_}{\.{\\showboxdepth} primitive@>
  5164  primitive("hbadness",assign_int,int_base+hbadness_code);@/
  5165  @!@:hbadness_}{\.{\\hbadness} primitive@>
  5166  primitive("vbadness",assign_int,int_base+vbadness_code);@/
  5167  @!@:vbadness_}{\.{\\vbadness} primitive@>
  5168  primitive("pausing",assign_int,int_base+pausing_code);@/
  5169  @!@:pausing_}{\.{\\pausing} primitive@>
  5170  primitive("tracingonline",assign_int,int_base+tracing_online_code);@/
  5171  @!@:tracing_online_}{\.{\\tracingonline} primitive@>
  5172  primitive("tracingmacros",assign_int,int_base+tracing_macros_code);@/
  5173  @!@:tracing_macros_}{\.{\\tracingmacros} primitive@>
  5174  primitive("tracingstats",assign_int,int_base+tracing_stats_code);@/
  5175  @!@:tracing_stats_}{\.{\\tracingstats} primitive@>
  5176  primitive("tracingparagraphs",assign_int,int_base+tracing_paragraphs_code);@/
  5177  @!@:tracing_paragraphs_}{\.{\\tracingparagraphs} primitive@>
  5178  primitive("tracingpages",assign_int,int_base+tracing_pages_code);@/
  5179  @!@:tracing_pages_}{\.{\\tracingpages} primitive@>
  5180  primitive("tracingoutput",assign_int,int_base+tracing_output_code);@/
  5181  @!@:tracing_output_}{\.{\\tracingoutput} primitive@>
  5182  primitive("tracinglostchars",assign_int,int_base+tracing_lost_chars_code);@/
  5183  @!@:tracing_lost_chars_}{\.{\\tracinglostchars} primitive@>
  5184  primitive("tracingcommands",assign_int,int_base+tracing_commands_code);@/
  5185  @!@:tracing_commands_}{\.{\\tracingcommands} primitive@>
  5186  primitive("tracingrestores",assign_int,int_base+tracing_restores_code);@/
  5187  @!@:tracing_restores_}{\.{\\tracingrestores} primitive@>
  5188  primitive("uchyph",assign_int,int_base+uc_hyph_code);@/
  5189  @!@:uc_hyph_}{\.{\\uchyph} primitive@>
  5190  primitive("outputpenalty",assign_int,int_base+output_penalty_code);@/
  5191  @!@:output_penalty_}{\.{\\outputpenalty} primitive@>
  5192  primitive("maxdeadcycles",assign_int,int_base+max_dead_cycles_code);@/
  5193  @!@:max_dead_cycles_}{\.{\\maxdeadcycles} primitive@>
  5194  primitive("hangafter",assign_int,int_base+hang_after_code);@/
  5195  @!@:hang_after_}{\.{\\hangafter} primitive@>
  5196  primitive("floatingpenalty",assign_int,int_base+floating_penalty_code);@/
  5197  @!@:floating_penalty_}{\.{\\floatingpenalty} primitive@>
  5198  primitive("globaldefs",assign_int,int_base+global_defs_code);@/
  5199  @!@:global_defs_}{\.{\\globaldefs} primitive@>
  5200  primitive("fam",assign_int,int_base+cur_fam_code);@/
  5201  @!@:fam_}{\.{\\fam} primitive@>
  5202  primitive("escapechar",assign_int,int_base+escape_char_code);@/
  5203  @!@:escape_char_}{\.{\\escapechar} primitive@>
  5204  primitive("defaulthyphenchar",assign_int,int_base+default_hyphen_char_code);@/
  5205  @!@:default_hyphen_char_}{\.{\\defaulthyphenchar} primitive@>
  5206  primitive("defaultskewchar",assign_int,int_base+default_skew_char_code);@/
  5207  @!@:default_skew_char_}{\.{\\defaultskewchar} primitive@>
  5208  primitive("endlinechar",assign_int,int_base+end_line_char_code);@/
  5209  @!@:end_line_char_}{\.{\\endlinechar} primitive@>
  5210  primitive("newlinechar",assign_int,int_base+new_line_char_code);@/
  5211  @!@:new_line_char_}{\.{\\newlinechar} primitive@>
  5212  primitive("language",assign_int,int_base+language_code);@/
  5213  @!@:language_}{\.{\\language} primitive@>
  5214  primitive("lefthyphenmin",assign_int,int_base+left_hyphen_min_code);@/
  5215  @!@:left_hyphen_min_}{\.{\\lefthyphenmin} primitive@>
  5216  primitive("righthyphenmin",assign_int,int_base+right_hyphen_min_code);@/
  5217  @!@:right_hyphen_min_}{\.{\\righthyphenmin} primitive@>
  5218  primitive("holdinginserts",assign_int,int_base+holding_inserts_code);@/
  5219  @!@:holding_inserts_}{\.{\\holdinginserts} primitive@>
  5220  primitive("errorcontextlines",assign_int,int_base+error_context_lines_code);@/
  5221  @!@:error_context_lines_}{\.{\\errorcontextlines} primitive@>
  5222  
  5223  @ @<Cases of |print_cmd_chr|...@>=
  5224  assign_int: if chr_code<count_base then print_param(chr_code-int_base)
  5225    else  begin print_esc("count"); print_int(chr_code-count_base);
  5226      end;
  5227  
  5228  @ The integer parameters should really be initialized by a macro package;
  5229  the following initialization does the minimum to keep \TeX\ from
  5230  complete failure.
  5231  @^null delimiter@>
  5232  
  5233  @<Initialize table entries...@>=
  5234  for k:=int_base to del_code_base-1 do eqtb[k].int:=0;
  5235  mag:=1000; tolerance:=10000; hang_after:=1; max_dead_cycles:=25;
  5236  escape_char:="\"; end_line_char:=carriage_return;
  5237  for k:=0 to 255 do del_code(k):=-1;
  5238  del_code("."):=0; {this null delimiter is used in error recovery}
  5239  
  5240  @ The following procedure, which is called just before \TeX\ initializes its
  5241  input and output, establishes the initial values of the date and time.
  5242  @^system dependencies@>
  5243  Since standard \PASCAL\ cannot provide such information, something special
  5244  is needed. The program here simply assumes that suitable values appear in
  5245  the global variables \\{sys\_time}, \\{sys\_day}, \\{sys\_month}, and
  5246  \\{sys\_year} (which are initialized to noon on 4 July 1776,
  5247  in case the implementor is careless).
  5248  
  5249  @p procedure fix_date_and_time;
  5250  begin sys_time:=12*60;
  5251  sys_day:=4; sys_month:=7; sys_year:=1776;  {self-evident truths}
  5252  time:=sys_time; {minutes since midnight}
  5253  day:=sys_day; {day of the month}
  5254  month:=sys_month; {month of the year}
  5255  year:=sys_year; {Anno Domini}
  5256  end;
  5257  
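@ An implementation that has access to a clock would replace the constants in
|fix_date_and_time| by real values. Here is a minimal sketch in Python,
assuming that the host's local time is what is wanted. It is not part of
\TeX: the function returns the four values instead of assigning them to
|eqtb|, and its optional argument |now| is an addition made here so that the
result can be checked.

```python
from datetime import datetime


def fix_date_and_time(now=None):
    """Return (time, day, month, year) in the units used by the parameters.

    time  -- minutes since midnight
    day   -- day of the month
    month -- month of the year
    year  -- Anno Domini
    """
    if now is None:
        now = datetime.now()  # assumption: local time of the host system
    return now.hour * 60 + now.minute, now.day, now.month, now.year
```

Called with noon on 4 July 1776, the sketch reproduces the default values
|12*60|, 4, 7, and 1776 that the procedure above assigns.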
  5258  @ @<Show equivalent |n|, in region 5@>=
  5259  begin if n<count_base then print_param(n-int_base)
  5260  else if  n<del_code_base then
  5261    begin print_esc("count"); print_int(n-count_base);
  5262    end
  5263  else  begin print_esc("delcode"); print_int(n-del_code_base);
  5264    end;
  5265  print_char("="); print_int(eqtb[n].int);
  5266  end
  5267  
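@ The three-way test in the region~5 case can be sketched as follows. This
Python fragment is an illustration, not part of \TeX. The function name
|show_region5| is hypothetical, |int_base| is passed in because its value is
fixed elsewhere, the parameter names are supplied by a dictionary that stands
in for |print_param|, and the escape character is assumed to be a printable
backslash (the real |print_esc| consults |escape_char|).

```python
INT_PARS = 55  # total number of integer parameters


def show_region5(n, value, int_base, param_names, escape="\\"):
    """Return the text that the region-5 case would print for eqtb[n]."""
    count_base = int_base + INT_PARS
    del_code_base = count_base + 256
    dimen_base = del_code_base + 256
    if not int_base <= n < dimen_base:
        raise ValueError("n is not in region 5")
    if n < count_base:                      # an integer parameter
        code = n - int_base
        if code in param_names:
            name = escape + param_names[code]
        else:                               # mirrors the othercases branch
            name = "[unknown integer parameter!]"
    elif n < del_code_base:                 # a \count register
        name = escape + "count" + str(n - count_base)
    else:                                   # a delimiter code
        name = escape + "delcode" + str(n - del_code_base)
    return name + "=" + str(value)
```

For example, with |int_base| taken to be 1000 and a dictionary that maps 41 to
|"hangafter"|, position 1041 holding the value 1 is shown as
\.{\\hangafter=1}.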
  5268  @ @<Set variable |c| to the current escape character@>=c:=escape_char
  5269  
  5270  @ @<Character |s| is the current new-line character@>=s=new_line_char
  5271  
  5272  @ \TeX\ is occasionally supposed to print diagnostic information that
  5273  goes only into the transcript file, unless |tracing_online| is positive.
  5274  Here are two routines that adjust the destination of print commands:
  5275  
  5276  @p procedure begin_diagnostic; {prepare to do some tracing}
  5277  begin old_setting:=selector;
  5278  if (tracing_online<=0)and(selector=term_and_log) then
  5279    begin decr(selector);
  5280    if history=spotless then history:=warning_issued;
  5281    end;
  5282  end;
  5283  @#
  5284  procedure end_diagnostic(@!blank_line:boolean);
  5285    {restore proper conditions after tracing}
  5286  begin print_nl("");
  5287  if blank_line then print_ln;
  5288  selector:=old_setting;
  5289  end;
  5290  
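@ The effect of |begin_diagnostic| and |end_diagnostic| on the output streams
can be sketched as follows. This Python fragment is an illustration, not part
of \TeX. The numeric selector and history codes are assumptions (all that
matters here is that |term_and_log-1=log_only|), the class |Printer| and its
two lists are hypothetical stand-ins for the terminal and the transcript file,
and the line-ending output of |end_diagnostic| is omitted.

```python
# Assumed codes; the sketch relies only on TERM_AND_LOG - 1 == LOG_ONLY.
NO_PRINT, TERM_ONLY, LOG_ONLY, TERM_AND_LOG = 16, 17, 18, 19
SPOTLESS, WARNING_ISSUED = 0, 1


class Printer:
    def __init__(self, selector=TERM_AND_LOG, tracing_online=0):
        self.selector = selector
        self.tracing_online = tracing_online
        self.history = SPOTLESS
        self.old_setting = selector
        self.terminal = []   # what the user would see
        self.log = []        # what goes into the transcript file

    def print(self, s):
        if self.selector in (TERM_ONLY, TERM_AND_LOG):
            self.terminal.append(s)
        if self.selector in (LOG_ONLY, TERM_AND_LOG):
            self.log.append(s)

    def begin_diagnostic(self):
        """Prepare to do some tracing."""
        self.old_setting = self.selector
        if self.tracing_online <= 0 and self.selector == TERM_AND_LOG:
            self.selector -= 1               # now LOG_ONLY
            if self.history == SPOTLESS:
                self.history = WARNING_ISSUED

    def end_diagnostic(self):
        """Restore proper conditions after tracing."""
        self.selector = self.old_setting
```

With |tracing_online<=0|, text printed between the two calls reaches only the
transcript, and the run is no longer considered spotless, so the user is told
to look at the log file.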
  5291  @ Of course we had better declare a few more global variables, if the previous
  5292  routines are going to work.
  5293  
  5294  @<Glob...@>=
  5295  @!old_setting:0..max_selector;
  5296  @!sys_time,@!sys_day,@!sys_month,@!sys_year:integer;
  5297      {date and time supplied by external system}
  5298  
  5299  @ The final region of |eqtb| contains the dimension parameters defined
  5300  here, and the 256 \.{\\dimen} registers.
  5301  
  5302  @d par_indent_code=0 {indentation of paragraphs}
  5303  @d math_surround_code=1 {space around math in text}
  5304  @d line_skip_limit_code=2 {threshold for |line_skip| instead of |baseline_skip|}
  5305  @d hsize_code=3 {line width in horizontal mode}
  5306  @d vsize_code=4 {page height in vertical mode}
  5307  @d max_depth_code=5 {maximum depth of boxes on main pages}
  5308  @d split_max_depth_code=6 {maximum depth of boxes on split pages}
  5309  @d box_max_depth_code=7 {maximum depth of explicit vboxes}
  5310  @d hfuzz_code=8 {tolerance for overfull hbox messages}
  5311  @d vfuzz_code=9 {tolerance for overfull vbox messages}
  5312  @d delimiter_shortfall_code=10 {maximum amount uncovered by variable delimiters}
  5313  @d null_delimiter_space_code=11 {blank space in null delimiters}
  5314  @d script_space_code=12 {extra space after subscript or superscript}
  5315  @d pre_display_size_code=13 {length of text preceding a display}
  5316  @d display_width_code=14 {length of line for displayed equation}
  5317  @d display_indent_code=15 {indentation of line for displayed equation}
  5318  @d overfull_rule_code=16 {width of rule that identifies overfull hboxes}
  5319  @d hang_indent_code=17 {amount of hanging indentation}
  5320  @d h_offset_code=18 {amount of horizontal offset when shipping pages out}
  5321  @d v_offset_code=19 {amount of vertical offset when shipping pages out}
  5322  @d emergency_stretch_code=20 {reduces badnesses on final pass of line-breaking}
  5323  @d dimen_pars=21 {total number of dimension parameters}
  5324  @d scaled_base=dimen_base+dimen_pars
  5325    {table of 256 user-defined \.{\\dimen} registers}
  5326  @d eqtb_size=scaled_base+255 {largest subscript of |eqtb|}
  5327  @#
  5328  @d dimen(#)==eqtb[scaled_base+#].sc
  5329  @d dimen_par(#)==eqtb[dimen_base+#].sc {a scaled quantity}
  5330  @d par_indent==dimen_par(par_indent_code)
  5331  @d math_surround==dimen_par(math_surround_code)
  5332  @d line_skip_limit==dimen_par(line_skip_limit_code)
  5333  @d hsize==dimen_par(hsize_code)
  5334  @d vsize==dimen_par(vsize_code)
  5335  @d max_depth==dimen_par(max_depth_code)
  5336  @d split_max_depth==dimen_par(split_max_depth_code)
  5337  @d box_max_depth==dimen_par(box_max_depth_code)
  5338  @d hfuzz==dimen_par(hfuzz_code)
  5339  @d vfuzz==dimen_par(vfuzz_code)
  5340  @d delimiter_shortfall==dimen_par(delimiter_shortfall_code)
  5341  @d null_delimiter_space==dimen_par(null_delimiter_space_code)
  5342  @d script_space==dimen_par(script_space_code)
  5343  @d pre_display_size==dimen_par(pre_display_size_code)
  5344  @d display_width==dimen_par(display_width_code)
  5345  @d display_indent==dimen_par(display_indent_code)
  5346  @d overfull_rule==dimen_par(overfull_rule_code)
  5347  @d hang_indent==dimen_par(hang_indent_code)
  5348  @d h_offset==dimen_par(h_offset_code)
  5349  @d v_offset==dimen_par(v_offset_code)
  5350  @d emergency_stretch==dimen_par(emergency_stretch_code)
  5351  
  5352  @p procedure print_length_param(@!n:integer);
  5353  begin case n of
  5354  par_indent_code:print_esc("parindent");
  5355  math_surround_code:print_esc("mathsurround");
  5356  line_skip_limit_code:print_esc("lineskiplimit");
  5357  hsize_code:print_esc("hsize");
  5358  vsize_code:print_esc("vsize");
  5359  max_depth_code:print_esc("maxdepth");
  5360  split_max_depth_code:print_esc("splitmaxdepth");
  5361  box_max_depth_code:print_esc("boxmaxdepth");
  5362  hfuzz_code:print_esc("hfuzz");
  5363  vfuzz_code:print_esc("vfuzz");
  5364  delimiter_shortfall_code:print_esc("delimitershortfall");
  5365  null_delimiter_space_code:print_esc("nulldelimiterspace");
  5366  script_space_code:print_esc("scriptspace");
  5367  pre_display_size_code:print_esc("predisplaysize");
  5368  display_width_code:print_esc("displaywidth");
  5369  display_indent_code:print_esc("displayindent");
  5370  overfull_rule_code:print_esc("overfullrule");
  5371  hang_indent_code:print_esc("hangindent");
  5372  h_offset_code:print_esc("hoffset");
  5373  v_offset_code:print_esc("voffset");
  5374  emergency_stretch_code:print_esc("emergencystretch");
  5375  othercases print("[unknown dimen parameter!]")
  5376  endcases;
  5377  end;
  5378  
  5379  @ @<Put each...@>=
  5380  primitive("parindent",assign_dimen,dimen_base+par_indent_code);@/
  5381  @!@:par_indent_}{\.{\\parindent} primitive@>
  5382  primitive("mathsurround",assign_dimen,dimen_base+math_surround_code);@/
  5383  @!@:math_surround_}{\.{\\mathsurround} primitive@>
  5384  primitive("lineskiplimit",assign_dimen,dimen_base+line_skip_limit_code);@/
  5385  @!@:line_skip_limit_}{\.{\\lineskiplimit} primitive@>
  5386  primitive("hsize",assign_dimen,dimen_base+hsize_code);@/
  5387  @!@:hsize_}{\.{\\hsize} primitive@>
  5388  primitive("vsize",assign_dimen,dimen_base+vsize_code);@/
  5389  @!@:vsize_}{\.{\\vsize} primitive@>
  5390  primitive("maxdepth",assign_dimen,dimen_base+max_depth_code);@/
  5391  @!@:max_depth_}{\.{\\maxdepth} primitive@>
  5392  primitive("splitmaxdepth",assign_dimen,dimen_base+split_max_depth_code);@/
  5393  @!@:split_max_depth_}{\.{\\splitmaxdepth} primitive@>
  5394  primitive("boxmaxdepth",assign_dimen,dimen_base+box_max_depth_code);@/
  5395  @!@:box_max_depth_}{\.{\\boxmaxdepth} primitive@>
  5396  primitive("hfuzz",assign_dimen,dimen_base+hfuzz_code);@/
  5397  @!@:hfuzz_}{\.{\\hfuzz} primitive@>
  5398  primitive("vfuzz",assign_dimen,dimen_base+vfuzz_code);@/
  5399  @!@:vfuzz_}{\.{\\vfuzz} primitive@>
  5400  primitive("delimitershortfall",
  5401    assign_dimen,dimen_base+delimiter_shortfall_code);@/
  5402  @!@:delimiter_shortfall_}{\.{\\delimitershortfall} primitive@>
  5403  primitive("nulldelimiterspace",
  5404    assign_dimen,dimen_base+null_delimiter_space_code);@/
  5405  @!@:null_delimiter_space_}{\.{\\nulldelimiterspace} primitive@>
  5406  primitive("scriptspace",assign_dimen,dimen_base+script_space_code);@/
  5407  @!@:script_space_}{\.{\\scriptspace} primitive@>
  5408  primitive("predisplaysize",assign_dimen,dimen_base+pre_display_size_code);@/
  5409  @!@:pre_display_size_}{\.{\\predisplaysize} primitive@>
  5410  primitive("displaywidth",assign_dimen,dimen_base+display_width_code);@/
  5411  @!@:display_width_}{\.{\\displaywidth} primitive@>
  5412  primitive("displayindent",assign_dimen,dimen_base+display_indent_code);@/
  5413  @!@:display_indent_}{\.{\\displayindent} primitive@>
  5414  primitive("overfullrule",assign_dimen,dimen_base+overfull_rule_code);@/
  5415  @!@:overfull_rule_}{\.{\\overfullrule} primitive@>
  5416  primitive("hangindent",assign_dimen,dimen_base+hang_indent_code);@/
  5417  @!@:hang_indent_}{\.{\\hangindent} primitive@>
  5418  primitive("hoffset",assign_dimen,dimen_base+h_offset_code);@/
  5419  @!@:h_offset_}{\.{\\hoffset} primitive@>
  5420  primitive("voffset",assign_dimen,dimen_base+v_offset_code);@/
  5421  @!@:v_offset_}{\.{\\voffset} primitive@>
  5422  primitive("emergencystretch",assign_dimen,dimen_base+emergency_stretch_code);@/
  5423  @!@:emergency_stretch_}{\.{\\emergencystretch} primitive@>
  5424  
  5425  @ @<Cases of |print_cmd_chr|...@>=
  5426  assign_dimen: if chr_code<scaled_base then
  5427      print_length_param(chr_code-dimen_base)
  5428    else  begin print_esc("dimen"); print_int(chr_code-scaled_base);
  5429      end;
  5430  
  5431  @ @<Initialize table entries...@>=
  5432  for k:=dimen_base to eqtb_size do eqtb[k].sc:=0;
  5433  
  5434  @ @<Show equivalent |n|, in region 6@>=
  5435  begin if n<scaled_base then print_length_param(n-dimen_base)
  5436  else  begin print_esc("dimen"); print_int(n-scaled_base);
  5437    end;
  5438  print_char("="); print_scaled(eqtb[n].sc); print("pt");
  5439  end
  5440  
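@ The region~6 case prints a |scaled| value followed by \.{pt}. The following
Python sketch shows one way such a value can be turned into decimal digits.
It is an illustration, not part of this section: it assumes that a |scaled|
quantity counts units of $2^{-16}$ points, and the digit loop is modeled on
the |print_scaled| routine defined elsewhere in the program, so its details
should be treated as an assumption. The names |format_scaled| and
|show_dimen| are hypothetical.

```python
UNITY = 1 << 16  # assumed: 2**16 scaled units make one point


def format_scaled(s):
    """Render a scaled integer as decimal text with few fraction digits."""
    out = []
    if s < 0:
        out.append("-")
        s = -s
    out.append(str(s // UNITY))          # integer part
    out.append(".")
    s = 10 * (s % UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s = s + 0o100000 - 50000     # round the last digit
        out.append(str(s // UNITY))
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:                   # enough digits to identify s
            break
    return "".join(out)


def show_dimen(name, s):
    """Compose the text 'name=<value>pt', as in the region-6 case."""
    return name + "=" + format_scaled(s) + "pt"
```

Under these assumptions a parameter holding $20\cdot2^{16}$ is shown as
\.{20.0pt}, and every value prints at least one digit after the decimal point.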
  5441  @ Here is a procedure that displays the contents of |eqtb[n]|
  5442  symbolically.
  5443  
  5444  @p@t\4@>@<Declare the procedure called |print_cmd_chr|@>@;@/
  5445  @!stat procedure show_eqtb(@!n:pointer);
  5446  begin if n<active_base then print_char("?") {this can't happen}
  5447  else if n<glue_base then @<Show equivalent |n|, in region 1 or 2@>
  5448  else if n<local_base then @<Show equivalent |n|, in region 3@>
  5449  else if n<int_base then @<Show equivalent |n|, in region 4@>
  5450  else if n<dimen_base then @<Show equivalent |n|, in region 5@>
  5451  else if n<=eqtb_size then @<Show equivalent |n|, in region 6@>
  5452  else print_char("?"); {this can't happen either}
  5453  end;
  5454  tats
  5455  
  5456  @ The last two regions of |eqtb| have fullword values instead of the
  5457  three fields |eq_level|, |eq_type|, and |equiv|. An |eq_type| is unnecessary,
  5458  but \TeX\ needs to store the |eq_level| information in another array
  5459  called |xeq_level|.
  5460  
  5461  @<Glob...@>=
  5462  @!eqtb:array[active_base..eqtb_size] of memory_word;
  5463  @!xeq_level:array[int_base..eqtb_size] of quarterword;
  5464  
  5465  @ @<Set init...@>=
  5466  for k:=int_base to eqtb_size do xeq_level[k]:=level_one;
  5467  
  5468  @ When the debugging routine |search_mem| is looking for pointers having a
  5469  given value, it is interested only in regions 1 to~3 of~|eqtb|, and in the
  5470  first part of region~4.
  5471  
  5472  @<Search |eqtb| for equivalents equal to |p|@>=
  5473  for q:=active_base to box_base+255 do
  5474    begin if equiv(q)=p then
  5475      begin print_nl("EQUIV("); print_int(q); print_char(")");
  5476      end;
  5477    end
  5478  
  5479  @* \[18] The hash table.
  5480  Control sequences are stored and retrieved by means of a fairly standard hash
  5481  table algorithm called the method of ``coalescing lists'' (cf.\ Algorithm 6.4C
  5482  in {\sl The Art of Computer Programming\/}). Once a control sequence enters the
  5483  table, it is never removed, because there are complicated situations
  5484  involving \.{\\gdef} where the removal of a control sequence at the end of
  5485  a group would be a mistake preventable only by the introduction of a
  5486  complicated reference-count mechanism.
  5487  
  5488  The actual sequence of letters forming a control sequence identifier is
  5489  stored in the |str_pool| array together with all the other strings. An
  5490  auxiliary array |hash| consists of items with two halfword fields per
  5491  word. The first of these, called |next(p)|, points to the next identifier
  5492  belonging to the same coalesced list as the identifier corresponding to~|p|;
  5493  and the other, called |text(p)|, points to the |str_start| entry for
  5494  |p|'s identifier. If position~|p| of the hash table is empty, we have
  5495  |text(p)=0|; if position |p| is either empty or the end of a coalesced
  5496  hash list, we have |next(p)=0|. An auxiliary pointer variable called
  5497  |hash_used| is maintained in such a way that all locations |p>=hash_used|
  5498  are nonempty. The global variable |cs_count| tells how many multiletter
  5499  control sequences have been defined, if statistics are being kept.
  5500  
  5501  A global boolean variable called |no_new_control_sequence| is set to
  5502  |true| during the time that new hash table entries are forbidden.
  5503  
  5504  @d next(#) == hash[#].lh {link for coalesced lists}
  5505  @d text(#) == hash[#].rh {string number for control sequence name}
  5506  @d hash_is_full == (hash_used=hash_base) {test if all positions are occupied}
  5507  @d font_id_text(#) == text(font_id_base+#) {a frozen font identifier's name}
  5508  
  5509  @<Glob...@>=
  5510  @!hash: array[hash_base..undefined_control_sequence-1] of two_halves;
  5511    {the hash table}
  5512  @!hash_used:pointer; {allocation pointer for |hash|}
  5513  @!no_new_control_sequence:boolean; {are new identifiers legal?}
  5514  @!cs_count:integer; {total number of known identifiers}
  5515  
  5516  @ @<Set init...@>=
  5517  no_new_control_sequence:=true; {new identifiers are usually forbidden}
  5518  next(hash_base):=0; text(hash_base):=0;
  5519  for k:=hash_base+1 to undefined_control_sequence-1 do hash[k]:=hash[hash_base];
  5520  
  5521  @ @<Initialize table entries...@>=
  5522  hash_used:=frozen_control_sequence; {nothing is used}
  5523  cs_count:=0;
  5524  eq_type(frozen_dont_expand):=dont_expand;
  5525  text(frozen_dont_expand):="notexpanded:";
  5526  @.notexpanded:@>
  5527  
  5528  @ Here is the subroutine that searches the hash table for an identifier
  5529  that matches a given string of length |l>1| appearing in |buffer[j..
  5530  (j+l-1)]|. If the identifier is found, the corresponding hash table address
  5531  is returned. Otherwise, if the global variable |no_new_control_sequence|
  5532  is |true|, the dummy address |undefined_control_sequence| is returned.
  5533  Otherwise the identifier is inserted into the hash table and its location
  5534  is returned.
  5535  
  5536  @p function id_lookup(@!j,@!l:integer):pointer; {search the hash table}
  5537  label found; {go here if you found it}
  5538  var h:integer; {hash code}
  5539  @!d:integer; {number of characters in incomplete current string}
  5540  @!p:pointer; {index in |hash| array}
  5541  @!k:pointer; {index in |buffer| array}
  5542  begin @<Compute the hash code |h|@>;
  5543  p:=h+hash_base; {we start searching here; note that |0<=h<hash_prime|}
  5544  loop@+begin if text(p)>0 then if length(text(p))=l then
  5545      if str_eq_buf(text(p),j) then goto found;
  5546    if next(p)=0 then
  5547      begin if no_new_control_sequence then
  5548        p:=undefined_control_sequence
  5549      else @<Insert a new control sequence after |p|, then make
  5550        |p| point to it@>;
  5551      goto found;
  5552      end;
  5553    p:=next(p);
  5554    end;
  5555  found: id_lookup:=p;
  5556  end;
  5557  
  5558  @ @<Insert a new control...@>=
  5559  begin if text(p)>0 then
  5560    begin repeat if hash_is_full then overflow("hash size",hash_size);
  5561  @:TeX capacity exceeded hash size}{\quad hash size@>
  5562    decr(hash_used);
  5563    until text(hash_used)=0; {search for an empty location in |hash|}
  5564    next(p):=hash_used; p:=hash_used;
  5565    end;
  5566  str_room(l); d:=cur_length;
  5567  while pool_ptr>str_start[str_ptr] do
  5568    begin decr(pool_ptr); str_pool[pool_ptr+l]:=str_pool[pool_ptr];
  5569    end; {move current string up to make room for another}
  5570  for k:=j to j+l-1 do append_char(buffer[k]);
  5571  text(p):=make_string; pool_ptr:=pool_ptr+d;
  5572  @!stat incr(cs_count);@+tats@;@/
  5573  end
  5574  
  5575  @ The value of |hash_prime| should be roughly 85\pct! of |hash_size|, and it
  5576  should be a prime number.  The theory of hashing tells us to expect fewer
  5577  than two table probes, on the average, when the search is successful.
  5578  [See J.~S. Vitter, {\sl Journal of the ACM\/ \bf30} (1983), 231--258.]
  5579  @^Vitter, Jeffrey Scott@>
  5580  
  5581  @<Compute the hash code |h|@>=
  5582  h:=buffer[j];
  5583  for k:=j+1 to j+l-1 do
  5584    begin h:=h+h+buffer[k];
  5585    while h>=hash_prime do h:=h-hash_prime;
  5586    end
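
@ The search-and-insert logic of |id_lookup| can be tried out in isolation.
The following Python sketch is only an illustration and is not part of \TeX:
the class name |HashTable|, the tiny table sizes, the value used for
|UNDEFINED|, and the use of Python byte strings in place of |str_pool| entries
are all assumptions made for the example. Only the hash recurrence, the |next|
links, and the downward scan of |hash_used| are modeled on the code above.

```python
# Toy model of id_lookup; all sizes and names are illustrative assumptions.
HASH_SIZE = 20    # assumed tiny table; a real TeX has thousands of entries
HASH_PRIME = 17   # a prime that is roughly 85% of HASH_SIZE
UNDEFINED = -1    # stands in for undefined_control_sequence


def hash_code(name: bytes) -> int:
    """h := first char; then h := 2h + next char, reduced below HASH_PRIME."""
    h = name[0]
    for c in name[1:]:
        h = h + h + c
        while h >= HASH_PRIME:
            h -= HASH_PRIME
    return h


class HashTable:
    def __init__(self):
        self.text = [None] * HASH_SIZE  # name stored at each position
        self.next = [None] * HASH_SIZE  # link to the next position in a chain
        self.hash_used = HASH_SIZE      # every position >= hash_used is in use
        self.no_new = False             # models no_new_control_sequence

    def id_lookup(self, name: bytes) -> int:
        assert len(name) > 1            # one-character names bypass the table
        p = hash_code(name)             # we start searching here
        while True:
            if self.text[p] == name:
                return p                # found
            if self.next[p] is None:    # end of the chain
                if self.no_new:
                    return UNDEFINED
                if self.text[p] is not None:
                    # p is occupied: scan downward for an empty location
                    while True:
                        if self.hash_used == 0:
                            raise OverflowError("hash size")
                        self.hash_used -= 1
                        if self.text[self.hash_used] is None:
                            break
                    self.next[p] = self.hash_used
                    p = self.hash_used
                self.text[p] = name
                return p
            p = self.next[p]
```

In this model the names |b"ba"| and |b"ac"| hash to the same position, so the
second one is placed at the top of the table and linked from the first. The
positions at or above |HASH_PRIME| can be reached only through such links,
which is one way to see why |hash_prime| is chosen smaller than |hash_size|.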
  5587  
  5588  @ Single-character control sequences do not need to be looked up in a hash
  5589  table, since we can use the character code itself as a direct address.
  5590  The procedure |print_cs| prints the name of a control sequence, given
  5591  a pointer to its address in |eqtb|. A space is printed after the name
  5592  unless it is a single nonletter or an active character. This procedure
  5593  might be invoked with invalid data, so it is ``extra robust.'' The
  5594  individual characters must be printed one at a time using |print|, since
  5595  they may be unprintable.
  5596  
  5597  @<Basic printing...@>=
  5598  procedure print_cs(@!p:integer); {prints a purported control sequence}
  5599  begin if p<hash_base then {single character}
  5600    if p>=single_base then
  5601      if p=null_cs then
  5602        begin print_esc("csname"); print_esc("endcsname"); print_char(" ");
  5603        end
  5604      else  begin print_esc(p-single_base);
  5605        if cat_code(p-single_base)=letter then print_char(" ");
  5606        end
  5607    else if p<active_base then print_esc("IMPOSSIBLE.")
  5608  @.IMPOSSIBLE@>
  5609    else print(p-active_base)
  5610  else if p>=undefined_control_sequence then print_esc("IMPOSSIBLE.")
  5611  else if (text(p)<0)or(text(p)>=str_ptr) then print_esc("NONEXISTENT.")
  5612  @.NONEXISTENT@>
  5613  else  begin print_esc(text(p));
  5614    print_char(" ");
  5615    end;
  5616  end;
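
@ The spacing rules of |print_cs| are easy to misread in nested
|if|\thinspace--\thinspace|else| form, so here is a hedged Python sketch of
the same decision tree. It is an illustration only: it returns a string
instead of printing, it assumes that the escape character is a backslash, it
uses a dictionary |names| in place of the |text| fields, and it takes the
predicate |is_letter| as a stand-in for the test |cat_code(c)=letter|. The
four base addresses repeat the |eqtb| layout defined earlier in this program.

```python
# Sketch of the print_cs decision tree; returns the text instead of printing.
ACTIVE_BASE = 1                  # eqtb layout as defined earlier in tex.web
SINGLE_BASE = ACTIVE_BASE + 256
NULL_CS = SINGLE_BASE + 256
HASH_BASE = NULL_CS + 1


def print_cs(p, names, undefined_cs, is_letter=str.isalpha):
    """names maps hash addresses to strings (assumed stand-in for text(p))."""
    if p < HASH_BASE:
        if p >= SINGLE_BASE:
            if p == NULL_CS:
                return "\\csname\\endcsname "
            c = chr(p - SINGLE_BASE)
            # a space follows only if the single character is a letter
            return "\\" + c + (" " if is_letter(c) else "")
        if p < ACTIVE_BASE:
            return "\\IMPOSSIBLE."
        return chr(p - ACTIVE_BASE)   # active character: no escape, no space
    if p >= undefined_cs:
        return "\\IMPOSSIBLE."
    if p not in names:
        return "\\NONEXISTENT."
    return "\\" + names[p] + " "      # multi-character names get a space
```

For instance, the address |SINGLE_BASE+ord("a")| yields a backslash, the
letter, and a space, while |SINGLE_BASE+ord("/")| yields no trailing space.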
  5617  
  5618  @ Here is a similar procedure; it avoids the error checks, and it never
  5619  prints a space after the control sequence.
  5620  
  5621  @<Basic printing procedures@>=
  5622  procedure sprint_cs(@!p:pointer); {prints a control sequence}
  5623  begin if p<hash_base then
  5624    if p<single_base then print(p-active_base)
  5625    else  if p<null_cs then print_esc(p-single_base)
  5626      else  begin print_esc("csname"); print_esc("endcsname");
  5627        end
  5628  else print_esc(text(p));
  5629  end;
  5630  
  5631  @ We need to put \TeX's ``primitive'' control sequences into the hash
  5632  table, together with their command code (which will be the |eq_type|)
  5633  and an operand (which will be the |equiv|). The |primitive| procedure
  5634  does this, in a way that no \TeX\ user can. The global value |cur_val|
  5635  contains the new |eqtb| pointer after |primitive| has acted.
  5636  
  5637  @p @!init procedure primitive(@!s:str_number;@!c:quarterword;@!o:halfword);
  5638  var k:pool_pointer; {index into |str_pool|}
  5639  @!j:small_number; {index into |buffer|}
  5640  @!l:small_number; {length of the string}
  5641  begin if s<256 then cur_val:=s+single_base
  5642  else  begin k:=str_start[s]; l:=str_start[s+1]-k;
  5643      {we will move |s| into the (empty) |buffer|}
  5644    for j:=0 to l-1 do buffer[j]:=so(str_pool[k+j]);
  5645    cur_val:=id_lookup(0,l); {|no_new_control_sequence| is |false|}
  5646    flush_string; text(cur_val):=s; {we don't want to have the string twice}
  5647    end;
  5648  eq_level(cur_val):=level_one; eq_type(cur_val):=c; equiv(cur_val):=o;
  5649  end;
  5650  tini
  5651  
  5652  @ Many of \TeX's primitives need no |equiv|, since they are identifiable
  5653  by their |eq_type| alone. These primitives are loaded into the hash table
  5654  as follows:
  5655  
  5656  @<Put each of \TeX's primitives into the hash table@>=
  5657  primitive(" ",ex_space,0);@/
  5658  @!@:Single-character primitives /}{\quad\.{\\\ }@>
  5659  primitive("/",ital_corr,0);@/
  5660  @!@:Single-character primitives /}{\quad\.{\\/}@>
  5661  primitive("accent",accent,0);@/
  5662  @!@:accent_}{\.{\\accent} primitive@>
  5663  primitive("advance",advance,0);@/
  5664  @!@:advance_}{\.{\\advance} primitive@>
  5665  primitive("afterassignment",after_assignment,0);@/
  5666  @!@:after_assignment_}{\.{\\afterassignment} primitive@>
  5667  primitive("aftergroup",after_group,0);@/
  5668  @!@:after_group_}{\.{\\aftergroup} primitive@>
  5669  primitive("begingroup",begin_group,0);@/
  5670  @!@:begin_group_}{\.{\\begingroup} primitive@>
  5671  primitive("char",char_num,0);@/
  5672  @!@:char_}{\.{\\char} primitive@>
  5673  primitive("csname",cs_name,0);@/
  5674  @!@:cs_name_}{\.{\\csname} primitive@>
  5675  primitive("delimiter",delim_num,0);@/
  5676  @!@:delimiter_}{\.{\\delimiter} primitive@>
  5677  primitive("divide",divide,0);@/
  5678  @!@:divide_}{\.{\\divide} primitive@>
  5679  primitive("endcsname",end_cs_name,0);@/
  5680  @!@:end_cs_name_}{\.{\\endcsname} primitive@>
  5681  primitive("endgroup",end_group,0);
  5682  @!@:end_group_}{\.{\\endgroup} primitive@>
  5683  text(frozen_end_group):="endgroup"; eqtb[frozen_end_group]:=eqtb[cur_val];@/
  5684  primitive("expandafter",expand_after,0);@/
  5685  @!@:expand_after_}{\.{\\expandafter} primitive@>
  5686  primitive("font",def_font,0);@/
  5687  @!@:font_}{\.{\\font} primitive@>
  5688  primitive("fontdimen",assign_font_dimen,0);@/
  5689  @!@:font_dimen_}{\.{\\fontdimen} primitive@>
  5690  primitive("halign",halign,0);@/
  5691  @!@:halign_}{\.{\\halign} primitive@>
  5692  primitive("hrule",hrule,0);@/
  5693  @!@:hrule_}{\.{\\hrule} primitive@>
  5694  primitive("ignorespaces",ignore_spaces,0);@/
  5695  @!@:ignore_spaces_}{\.{\\ignorespaces} primitive@>
  5696  primitive("insert",insert,0);@/
  5697  @!@:insert_}{\.{\\insert} primitive@>
  5698  primitive("mark",mark,0);@/
  5699  @!@:mark_}{\.{\\mark} primitive@>
  5700  primitive("mathaccent",math_accent,0);@/
  5701  @!@:math_accent_}{\.{\\mathaccent} primitive@>
  5702  primitive("mathchar",math_char_num,0);@/
  5703  @!@:math_char_}{\.{\\mathchar} primitive@>
  5704  primitive("mathchoice",math_choice,0);@/
  5705  @!@:math_choice_}{\.{\\mathchoice} primitive@>
  5706  primitive("multiply",multiply,0);@/
  5707  @!@:multiply_}{\.{\\multiply} primitive@>
  5708  primitive("noalign",no_align,0);@/
  5709  @!@:no_align_}{\.{\\noalign} primitive@>
  5710  primitive("noboundary",no_boundary,0);@/
  5711  @!@:no_boundary_}{\.{\\noboundary} primitive@>
  5712  primitive("noexpand",no_expand,0);@/
  5713  @!@:no_expand_}{\.{\\noexpand} primitive@>
  5714  primitive("nonscript",non_script,0);@/
  5715  @!@:non_script_}{\.{\\nonscript} primitive@>
  5716  primitive("omit",omit,0);@/
  5717  @!@:omit_}{\.{\\omit} primitive@>
  5718  primitive("parshape",set_shape,0);@/
  5719  @!@:par_shape_}{\.{\\parshape} primitive@>
  5720  primitive("penalty",break_penalty,0);@/
  5721  @!@:penalty_}{\.{\\penalty} primitive@>
  5722  primitive("prevgraf",set_prev_graf,0);@/
  5723  @!@:prev_graf_}{\.{\\prevgraf} primitive@>
  5724  primitive("radical",radical,0);@/
  5725  @!@:radical_}{\.{\\radical} primitive@>
  5726  primitive("read",read_to_cs,0);@/
  5727  @!@:read_}{\.{\\read} primitive@>
  5728  primitive("relax",relax,256); {cf.\ |scan_file_name|}
  5729  @!@:relax_}{\.{\\relax} primitive@>
  5730  text(frozen_relax):="relax"; eqtb[frozen_relax]:=eqtb[cur_val];@/
  5731  primitive("setbox",set_box,0);@/
  5732  @!@:set_box_}{\.{\\setbox} primitive@>
  5733  primitive("the",the,0);@/
  5734  @!@:the_}{\.{\\the} primitive@>
  5735  primitive("toks",toks_register,0);@/
  5736  @!@:toks_}{\.{\\toks} primitive@>
  5737  primitive("vadjust",vadjust,0);@/
  5738  @!@:vadjust_}{\.{\\vadjust} primitive@>
  5739  primitive("valign",valign,0);@/
  5740  @!@:valign_}{\.{\\valign} primitive@>
  5741  primitive("vcenter",vcenter,0);@/
  5742  @!@:vcenter_}{\.{\\vcenter} primitive@>
  5743  primitive("vrule",vrule,0);@/
  5744  @!@:vrule_}{\.{\\vrule} primitive@>
  5745  
  5746  @ Each primitive has a corresponding inverse, so that it is possible to
  5747  display the cryptic numeric contents of |eqtb| in symbolic form.
  5748  Every call of |primitive| in this program is therefore accompanied by some
  5749  straightforward code that forms part of the |print_cmd_chr| routine
  5750  below.
  5751  
  5752  @<Cases of |print_cmd_chr|...@>=
  5753  accent: print_esc("accent");
  5754  advance: print_esc("advance");
  5755  after_assignment: print_esc("afterassignment");
  5756  after_group: print_esc("aftergroup");
  5757  assign_font_dimen: print_esc("fontdimen");
  5758  begin_group: print_esc("begingroup");
  5759  break_penalty: print_esc("penalty");
  5760  char_num: print_esc("char");
  5761  cs_name: print_esc("csname");
  5762  def_font: print_esc("font");
  5763  delim_num: print_esc("delimiter");
  5764  divide: print_esc("divide");
  5765  end_cs_name: print_esc("endcsname");
  5766  end_group: print_esc("endgroup");
  5767  ex_space: print_esc(" ");
  5768  expand_after: print_esc("expandafter");
  5769  halign: print_esc("halign");
  5770  hrule: print_esc("hrule");
  5771  ignore_spaces: print_esc("ignorespaces");
  5772  insert: print_esc("insert");
  5773  ital_corr: print_esc("/");
  5774  mark: print_esc("mark");
  5775  math_accent: print_esc("mathaccent");
  5776  math_char_num: print_esc("mathchar");
  5777  math_choice: print_esc("mathchoice");
  5778  multiply: print_esc("multiply");
  5779  no_align: print_esc("noalign");
  5780  no_boundary:print_esc("noboundary");
  5781  no_expand: print_esc("noexpand");
  5782  non_script: print_esc("nonscript");
  5783  omit: print_esc("omit");
  5784  radical: print_esc("radical");
  5785  read_to_cs: print_esc("read");
  5786  relax: print_esc("relax");
  5787  set_box: print_esc("setbox");
  5788  set_prev_graf: print_esc("prevgraf");
  5789  set_shape: print_esc("parshape");
  5790  the: print_esc("the");
  5791  toks_register: print_esc("toks");
  5792  vadjust: print_esc("vadjust");
  5793  valign: print_esc("valign");
  5794  vcenter: print_esc("vcenter");
  5795  vrule: print_esc("vrule");
  5796  
  5797  @ We will deal with the other primitives later, at some point in the program
  5798  where their |eq_type| and |equiv| values are more meaningful.  For example,
  5799  the primitives for math mode will be loaded when we consider the routines
  5800  that deal with formulas. It is easy to find where each particular
  5801  primitive was treated by looking in the index at the end; for example, the
  5802  section where |"radical"| entered |eqtb| is listed under `\.{\\radical}
  5803  primitive'. (Primitives consisting of a single nonalphabetic character,
  5804  @!like `\.{\\/}', are listed under `Single-character primitives'.)
  5805  @!@^Single-character primitives@>
  5806  
  5807  Meanwhile, this is a convenient place to catch up on something we were unable
  5808  to do before the hash table was defined:
  5809  
  5810  @<Print the font identifier for |font(p)|@>=
  5811  print_esc(font_id_text(font(p)))
  5812  
  5813  @* \[19] Saving and restoring equivalents.
  5814  The nested structure provided by `$\.{\char'173}\ldots\.{\char'175}$' groups
  5815  in \TeX\ means that |eqtb| entries valid in outer groups should be saved
  5816  and restored later if they are overridden inside the braces. When a new |eqtb|
  5817  value is being assigned, the program therefore checks to see if the previous
  5818  entry belongs to an outer level. In such a case, the old value is placed
  5819  on the |save_stack| just before the new value enters |eqtb|. At the
  5820  end of a grouping level, i.e., when the right brace is sensed, the
  5821  |save_stack| is used to restore the outer values, and the inner ones are
  5822  destroyed.
  5823  
  5824  Entries on the |save_stack| are of type |memory_word|. The top item on
  5825  this stack is |save_stack[p]|, where |p=save_ptr-1|; it contains three
  5826  fields called |save_type|, |save_level|, and |save_index|, and it is
  5827  interpreted in one of four ways:
  5828  
  5829  \yskip\hangg 1) If |save_type(p)=restore_old_value|, then
  5830  |save_index(p)| is a location in |eqtb| whose current value should
  5831  be destroyed at the end of the current group and replaced by |save_stack[p-1]|.
  5832  Furthermore if |save_index(p)>=int_base|, then |save_level(p)|
  5833  should replace the corresponding entry in |xeq_level|.
  5834  
  5835  \yskip\hangg 2) If |save_type(p)=restore_zero|, then |save_index(p)|
  5836  is a location in |eqtb| whose current value should be destroyed at the end
  5837  of the current group, when it should be
  5838  replaced by the value of |eqtb[undefined_control_sequence]|.
  5839  
  5840  \yskip\hangg 3) If |save_type(p)=insert_token|, then |save_index(p)|
  5841  is a token that should be inserted into \TeX's input when the current
  5842  group ends.
  5843  
  5844  \yskip\hangg 4) If |save_type(p)=level_boundary|, then |save_level(p)|
  5845  is a code explaining what kind of group we were previously in, and
  5846  |save_index(p)| points to the level boundary word at the bottom of
  5847  the entries for that group.
  5848  
  5849  @d save_type(#)==save_stack[#].hh.b0 {classifies a |save_stack| entry}
  5850  @d save_level(#)==save_stack[#].hh.b1
  5851    {saved level for regions 5 and 6, or group code}
  5852  @d save_index(#)==save_stack[#].hh.rh
  5853    {|eqtb| location or token or |save_stack| location}
  5854  @d restore_old_value=0 {|save_type| when a value should be restored later}
  5855  @d restore_zero=1 {|save_type| when an undefined entry should be restored}
  5856  @d insert_token=2 {|save_type| when a token is being saved for later use}
  5857  @d level_boundary=3 {|save_type| corresponding to beginning of group}
  5858  
  5859  @ Here are the group codes that are used to discriminate between different
  5860  kinds of groups. They allow \TeX\ to decide what special actions, if any,
  5861  should be performed when a group ends.
  5862  \def\grp{\.{\char'173...\char'175}}
  5863  
  5864  Some groups are not supposed to be ended by right braces. For example,
  5865  the `\.\$' that begins a math formula causes a |math_shift_group| to
  5866  be started, and this should be terminated by a matching `\.\$'. Similarly,
  5867  a group that starts with \.{\\left} should end with \.{\\right}, and
  5868  one that starts with \.{\\begingroup} should end with \.{\\endgroup}.
  5869  
  5870  @d bottom_level=0 {group code for the outside world}
  5871  @d simple_group=1 {group code for local structure only}
  5872  @d hbox_group=2 {code for `\.{\\hbox}\grp'}
  5873  @d adjusted_hbox_group=3 {code for `\.{\\hbox}\grp' in vertical mode}
  5874  @d vbox_group=4 {code for `\.{\\vbox}\grp'}
  5875  @d vtop_group=5 {code for `\.{\\vtop}\grp'}
  5876  @d align_group=6 {code for `\.{\\halign}\grp', `\.{\\valign}\grp'}
  5877  @d no_align_group=7 {code for `\.{\\noalign}\grp'}
  5878  @d output_group=8 {code for output routine}
  5879  @d math_group=9 {code for, e.g., `\.{\char'136}\grp'}
  5880  @d disc_group=10 {code for `\.{\\discretionary}\grp\grp\grp'}
  5881  @d insert_group=11 {code for `\.{\\insert}\grp', `\.{\\vadjust}\grp'}
  5882  @d vcenter_group=12 {code for `\.{\\vcenter}\grp'}
  5883  @d math_choice_group=13 {code for `\.{\\mathchoice}\grp\grp\grp\grp'}
  5884  @d semi_simple_group=14 {code for `\.{\\begingroup...\\endgroup}'}
  5885  @d math_shift_group=15 {code for `\.{\$...\$}'}
  5886  @d math_left_group=16 {code for `\.{\\left...\\right}'}
  5887  @d max_group_code=16
  5888  
  5889  @<Types...@>=
  5890  @!group_code=0..max_group_code; {|save_level| for a level boundary}
  5891  
  5892  @ The global variable |cur_group| keeps track of what sort of group we are
  5893  currently in. Another global variable, |cur_boundary|, points to the
  5894  topmost |level_boundary| word.  And |cur_level| is the current depth of
  5895  nesting. The routines are designed to preserve the condition that no entry
  5896  in the |save_stack| or in |eqtb| ever has a level greater than |cur_level|.
  5897  
  5898  @ @<Glob...@>=
  5899  @!save_stack : array[0..save_size] of memory_word;
  5900  @!save_ptr : 0..save_size; {first unused entry on |save_stack|}
  5901  @!max_save_stack:0..save_size; {maximum usage of save stack}
  5902  @!cur_level: quarterword; {current nesting level for groups}
  5903  @!cur_group: group_code; {current group type}
  5904  @!cur_boundary: 0..save_size; {where the current level begins}
  5905  
  5906  @ At this time it might be a good idea for the reader to review the introduction
  5907  to |eqtb| that was given above just before the long lists of parameter names.
  5908  Recall that the ``outer level'' of the program is |level_one|, since
  5909  undefined control sequences are assumed to be ``defined'' at |level_zero|.
  5910  
  5911  @<Set init...@>=
  5912  save_ptr:=0; cur_level:=level_one; cur_group:=bottom_level; cur_boundary:=0;
  5913  max_save_stack:=0;
  5914  
  5915  @ The following macro is used to test if there is room for up to six more
  5916  entries on |save_stack|. By making a conservative test like this, we can
  5917  get by with testing for overflow in only a few places.
  5918  
  5919  @d check_full_save_stack==if save_ptr>max_save_stack then
  5920    begin max_save_stack:=save_ptr;
  5921    if max_save_stack>save_size-6 then overflow("save size",save_size);
  5922  @:TeX capacity exceeded save size}{\quad save size@>
  5923    end
  5924  
  5925  @ Procedure |new_save_level| is called when a group begins. The
  5926  argument is a group identification code like `|hbox_group|'. After
  5927  calling this routine, it is safe to put five more entries on |save_stack|.
  5928  
  5929  In some cases integer-valued items are placed onto the
  5930  |save_stack| just below a |level_boundary| word, because this is a
  5931  convenient place to keep information that is supposed to ``pop up'' just
  5932  when the group has finished.
  5933  For example, when `\.{\\hbox to 100pt}\grp' is being treated, the 100pt
  5934  dimension is stored on |save_stack| just before |new_save_level| is
  5935  called.
  5936  
  5937  We use the notation |saved(k)| to stand for an integer item that
  5938  appears in location |save_ptr+k| of the save stack.
  5939  
  5940  @d saved(#)==save_stack[save_ptr+#].int
  5941  
  5942  @p procedure new_save_level(@!c:group_code); {begin a new level of grouping}
  5943  begin check_full_save_stack;
  5944  save_type(save_ptr):=level_boundary; save_level(save_ptr):=cur_group;
  5945  save_index(save_ptr):=cur_boundary;
  5946  if cur_level=max_quarterword then overflow("grouping levels",
  5947  @:TeX capacity exceeded grouping levels}{\quad grouping levels@>
  5948    max_quarterword-min_quarterword);
  5949    {quit if |(cur_level+1)| is too big to be stored in |eqtb|}
  5950  cur_boundary:=save_ptr; incr(cur_level); incr(save_ptr); cur_group:=c;
  5951  end;
  5952  
  5953  @ Just before an entry of |eqtb| is changed, the following procedure should
  5954  be called to update the other data structures properly. It is important
  5955  to keep in mind that reference counts in |mem| include references from
  5956  within |save_stack|, so these counts must be handled carefully.
  5957  @^reference counts@>
  5958  
  5959  @p procedure eq_destroy(@!w:memory_word); {gets ready to forget |w|}
  5960  var q:pointer; {|equiv| field of |w|}
  5961  begin case eq_type_field(w) of
  5962  call,long_call,outer_call,long_outer_call: delete_token_ref(equiv_field(w));
  5963  glue_ref: delete_glue_ref(equiv_field(w));
  5964  shape_ref: begin q:=equiv_field(w); {we need to free a \.{\\parshape} block}
  5965    if q<>null then free_node(q,info(q)+info(q)+1);
  5966    end; {such a block is |2n+1| words long, where |n=info(q)|}
  5967  box_ref: flush_node_list(equiv_field(w));
  5968  othercases do_nothing
  5969  endcases;
  5970  end;
  5971  
  5972  @ To save a value of |eqtb[p]| that was established at level |l|, we
  5973  can use the following subroutine.
  5974  
  5975  @p procedure eq_save(@!p:pointer;@!l:quarterword); {saves |eqtb[p]|}
  5976  begin check_full_save_stack;
  5977  if l=level_zero then save_type(save_ptr):=restore_zero
  5978  else  begin save_stack[save_ptr]:=eqtb[p]; incr(save_ptr);
  5979    save_type(save_ptr):=restore_old_value;
  5980    end;
  5981  save_level(save_ptr):=l; save_index(save_ptr):=p; incr(save_ptr);
  5982  end;
  5983  
  5984  @ The procedure |eq_define| defines an |eqtb| entry having specified
  5985  |eq_type| and |equiv| fields, and saves the former value if appropriate.
  5986  This procedure is used only for entries in the first four regions of |eqtb|,
  5987  i.e., only for entries that have |eq_type| and |equiv| fields.
  5988  After calling this routine, it is safe to put four more entries on
  5989  |save_stack|, provided that there was room for four more entries before
  5990  the call, since |eq_save| makes the necessary test.
  5991  
  5992  @p procedure eq_define(@!p:pointer;@!t:quarterword;@!e:halfword);
  5993    {new data for |eqtb|}
  5994  begin if eq_level(p)=cur_level then eq_destroy(eqtb[p])
  5995  else if cur_level>level_one then eq_save(p,eq_level(p));
  5996  eq_level(p):=cur_level; eq_type(p):=t; equiv(p):=e;
  5997  end;
  5998  
  5999  @ The counterpart of |eq_define| for the remaining (fullword) positions in
  6000  |eqtb| is called |eq_word_define|. Since |xeq_level[p]>=level_one| for all
  6001  |p|, a `|restore_zero|' will never be used in this case.
  6002  
  6003  @p procedure eq_word_define(@!p:pointer;@!w:integer);
  6004  begin if xeq_level[p]<>cur_level then
  6005    begin eq_save(p,xeq_level[p]); xeq_level[p]:=cur_level;
  6006    end;
  6007  eqtb[p].int:=w;
  6008  end;
  6009  
  6010  @ The |eq_define| and |eq_word_define| routines take care of local definitions.
  6011  @^global definitions@>
  6012  Global definitions are done in almost the same way, but there is no need
  6013  to save old values, and the new value is associated with |level_one|.
  6014  
  6015  @p procedure geq_define(@!p:pointer;@!t:quarterword;@!e:halfword);
  6016    {global |eq_define|}
  6017  begin eq_destroy(eqtb[p]);
  6018  eq_level(p):=level_one; eq_type(p):=t; equiv(p):=e;
  6019  end;
  6020  @#
  6021  procedure geq_word_define(@!p:pointer;@!w:integer); {global |eq_word_define|}
  6022  begin eqtb[p].int:=w; xeq_level[p]:=level_one;
  6023  end;
  6024  
  6025  @ Subroutine |save_for_after| puts a token on the stack for save-keeping.
  6026  
  6027  @p procedure save_for_after(@!t:halfword);
  6028  begin if cur_level>level_one then
  6029    begin check_full_save_stack;
  6030    save_type(save_ptr):=insert_token; save_level(save_ptr):=level_zero;
  6031    save_index(save_ptr):=t; incr(save_ptr);
  6032    end;
  6033  end;
  6034  
  6035  @ The |unsave| routine goes the other way, taking items off of |save_stack|.
  6036  This routine takes care of restoration when a level ends; everything
  6037  belonging to the topmost group is cleared off of the save stack.
  6038  
  6039  @p@t\4@>@<Declare the procedure called |restore_trace|@>@;@/
  6040  procedure@?back_input; forward; @t\2@>
  6041  procedure unsave; {pops the top level off the save stack}
  6042  label done;
  6043  var p:pointer; {position to be restored}
  6044  @!l:quarterword; {saved level, if in fullword regions of |eqtb|}
  6045  @!t:halfword; {saved value of |cur_tok|}
  6046  begin if cur_level>level_one then
  6047    begin decr(cur_level);
  6048    @<Clear off top level from |save_stack|@>;
  6049    end
  6050  else confusion("curlevel"); {|unsave| is not used when |cur_group=bottom_level|}
  6051  @:this can't happen curlevel}{\quad curlevel@>
  6052  end;
  6053  
  6054  @ @<Clear off...@>=
  6055  loop@+begin decr(save_ptr);
  6056    if save_type(save_ptr)=level_boundary then goto done;
  6057    p:=save_index(save_ptr);
  6058    if save_type(save_ptr)=insert_token then
  6059      @<Insert token |p| into \TeX's input@>
  6060    else  begin if save_type(save_ptr)=restore_old_value then
  6061        begin l:=save_level(save_ptr); decr(save_ptr);
  6062        end
  6063      else save_stack[save_ptr]:=eqtb[undefined_control_sequence];
  6064      @<Store \(s)|save_stack[save_ptr]| in |eqtb[p]|, unless
  6065        |eqtb[p]| holds a global value@>;
  6066      end;
  6067    end;
  6068  done: cur_group:=save_level(save_ptr); cur_boundary:=save_index(save_ptr)
  6069  
  6070  @ A global definition, which sets the level to |level_one|,
  6071  @^global definitions@>
  6072  will not be undone by |unsave|. If at least one global definition of
  6073  |eqtb[p]| has been carried out within the group that just ended, the
  6074  last such definition will therefore survive.
  6075  
  6076  @<Store \(s)|save...@>=
  6077  if p<int_base then
  6078    if eq_level(p)=level_one then
  6079      begin eq_destroy(save_stack[save_ptr]); {destroy the saved value}
  6080      @!stat if tracing_restores>0 then restore_trace(p,"retaining");@+tats@;@/
  6081      end
  6082    else  begin eq_destroy(eqtb[p]); {destroy the current value}
  6083      eqtb[p]:=save_stack[save_ptr]; {restore the saved value}
  6084      @!stat if tracing_restores>0 then restore_trace(p,"restoring");@+tats@;@/
  6085      end
  6086  else if xeq_level[p]<>level_one then
  6087    begin eqtb[p]:=save_stack[save_ptr]; xeq_level[p]:=l;
  6088    @!stat if tracing_restores>0 then restore_trace(p,"restoring");@+tats@;@/
  6089    end
  6090  else  begin
  6091    @!stat if tracing_restores>0 then restore_trace(p,"retaining");@+tats@;@/
  6092    end
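
@ The interplay of |eq_define|, |geq_define|, |save_for_after|, and |unsave|
can be summarized by a small executable model. The Python sketch below is an
illustration under simplifying assumptions, not a transcription: |eqtb| is a
dictionary keyed by name, each save-stack entry is a single tuple (so a
|restore_old_value| occupies one slot here instead of two), reference counts
and |eq_destroy| are ignored, and the list |pending| stands in for the effect
of |back_input|. The class name |Eqtb| and the field names are hypothetical.

```python
# Simplified model of the save stack; names and layout are assumptions.
LEVEL_ZERO, LEVEL_ONE = 0, 1
RESTORE_OLD_VALUE, RESTORE_ZERO, INSERT_TOKEN, LEVEL_BOUNDARY = range(4)


class Eqtb:
    def __init__(self):
        self.table = {}        # name -> (level, value); absent = undefined
        self.save_stack = []   # tuples (save_type, index, saved entry)
        self.cur_level = LEVEL_ONE
        self.pending = []      # tokens re-inserted into the input by unsave

    def entry(self, p):
        return self.table.get(p, (LEVEL_ZERO, None))

    def new_save_level(self):
        self.save_stack.append((LEVEL_BOUNDARY, None, None))
        self.cur_level += 1

    def eq_define(self, p, value):          # local definition
        level, old = self.entry(p)
        if level != self.cur_level and self.cur_level > LEVEL_ONE:
            if level == LEVEL_ZERO:
                self.save_stack.append((RESTORE_ZERO, p, None))
            else:
                self.save_stack.append((RESTORE_OLD_VALUE, p, (level, old)))
        self.table[p] = (self.cur_level, value)

    def geq_define(self, p, value):         # global definition
        self.table[p] = (LEVEL_ONE, value)

    def save_for_after(self, tok):
        if self.cur_level > LEVEL_ONE:
            self.save_stack.append((INSERT_TOKEN, tok, None))

    def unsave(self):
        assert self.cur_level > LEVEL_ONE   # else confusion("curlevel")
        self.cur_level -= 1
        while True:
            kind, p, saved = self.save_stack.pop()
            if kind == LEVEL_BOUNDARY:
                return
            if kind == INSERT_TOKEN:
                self.pending.insert(0, p)   # goes in front, like back_input
            elif self.entry(p)[0] == LEVEL_ONE:
                pass                        # "retaining": a global value wins
            elif kind == RESTORE_ZERO:
                self.table.pop(p, None)     # "restoring" the undefined state
            else:
                self.table[p] = saved       # "restoring" the outer value
```

A local definition made inside a group disappears at |unsave|, whereas a
global definition made in the same group leaves an entry at |LEVEL_ONE| and is
therefore retained; tokens saved by |save_for_after| come back in the order in
which they were saved.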
  6093  
  6094  @ @<Declare the procedure called |restore_trace|@>=
  6095  @!stat procedure restore_trace(@!p:pointer;@!s:str_number);
  6096    {|eqtb[p]| has just been restored or retained}
  6097  begin begin_diagnostic; print_char("{"); print(s); print_char(" ");
  6098  show_eqtb(p); print_char("}");
  6099  end_diagnostic(false);
  6100  end;
  6101  tats
  6102  
  6103  @ When looking for possible pointers to a memory location, it is helpful
  6104  to look for references from |eqtb| that might be waiting on the
  6105  save stack. Of course, we might find spurious pointers too; but this
  6106  routine is merely an aid when debugging, and at such times we are
  6107  grateful for any scraps of information, even if they prove to be irrelevant.
  6108  @^dirty \PASCAL@>
  6109  
  6110  @<Search |save_stack| for equivalents that point to |p|@>=
  6111  if save_ptr>0 then for q:=0 to save_ptr-1 do
  6112    begin if equiv_field(save_stack[q])=p then
  6113      begin print_nl("SAVE("); print_int(q); print_char(")");
  6114      end;
  6115    end
  6116  
  6117  @ Most of the parameters kept in |eqtb| can be changed freely, but there's
  6118  an exception:  The magnification should not be used with two different
  6119  values during any \TeX\ job, since a single magnification is applied to an
  6120  entire run. The global variable |mag_set| is set to the current magnification
  6121  whenever it becomes necessary to ``freeze'' it at a particular value.
  6122  
  6123  @<Glob...@>=
  6124  @!mag_set:integer; {if nonzero, this magnification should be used henceforth}
  6125  
  6126  @ @<Set init...@>=
  6127  mag_set:=0;
  6128  
  6129  @ The |prepare_mag| subroutine is called whenever \TeX\ wants to use |mag|
  6130  for magnification.
  6131  
  6132  @p procedure prepare_mag;
  6133  begin if (mag_set>0)and(mag<>mag_set) then
  6134    begin print_err("Incompatible magnification ("); print_int(mag);
  6135  @.Incompatible magnification@>
  6136    print(");"); print_nl(" the previous value will be retained");
  6137    help2("I can handle only one magnification ratio per job. So I've")@/
  6138    ("reverted to the magnification you used earlier on this run.");@/
  6139    int_error(mag_set);
  6140    geq_word_define(int_base+mag_code,mag_set); {|mag:=mag_set|}
  6141    end;
  6142  if (mag<=0)or(mag>32768) then
  6143    begin print_err("Illegal magnification has been changed to 1000");@/
  6144  @.Illegal magnification...@>
  6145    help1("The magnification ratio must be between 1 and 32768.");
  6146    int_error(mag); geq_word_define(int_base+mag_code,1000);
  6147    end;
  6148  mag_set:=mag;
  6149  end;
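
@ The ``freezing'' behavior of |prepare_mag| may be easier to see in a
stripped-down form. The following Python sketch is an illustration only: the
class name |MagState| is hypothetical, error reporting is reduced to appending
a message to a list, and the assignment to |mag| replaces the call of
|geq_word_define|. The two tests and their order are taken from the code above.

```python
# Sketch of prepare_mag; MagState and its fields are illustrative names.
class MagState:
    def __init__(self, mag=1000):
        self.mag = mag       # current value of the magnification parameter
        self.mag_set = 0     # if nonzero, the frozen magnification
        self.errors = []     # stands in for print_err and int_error

    def prepare_mag(self):
        if self.mag_set > 0 and self.mag != self.mag_set:
            self.errors.append("Incompatible magnification (%d)" % self.mag)
            self.mag = self.mag_set          # revert to the earlier value
        if self.mag <= 0 or self.mag > 32768:
            self.errors.append(
                "Illegal magnification has been changed to 1000")
            self.mag = 1000
        self.mag_set = self.mag              # freeze it from now on
```

Once |prepare_mag| has been called, a later change of |mag| is reported and
undone at the next call, so the whole job uses a single magnification.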
  6150  
  6151  @* \[20] Token lists.
  6152  A \TeX\ token is either a character or a control sequence, and it is
  6153  @^token@>
  6154  represented internally in one of two ways: (1)~A character whose ASCII
  6155  code number is |c| and whose command code is |m| is represented as the
  6156  number $2^8m+c$; the command code is in the range |1<=m<=14|. (2)~A control
  6157  sequence whose |eqtb| address is |p| is represented as the number
  6158  |cs_token_flag+p|. Here |cs_token_flag=@t$2^{12}-1$@>| is larger than
  6159  $2^8m+c$, yet it is small enough that |cs_token_flag+p< max_halfword|;
  6160  thus, a token fits comfortably in a halfword.
  6161  
  6162  A token |t| represents a |left_brace| command if and only if
  6163  |t<left_brace_limit|; it represents a |right_brace| command if and only if
  6164  we have |left_brace_limit<=t<right_brace_limit|; and it represents a |match| or
  6165  |end_match| command if and only if |match_token<=t<=end_match_token|.
  6166  The following definitions take care of these token-oriented constants
  6167  and a few others.
  6168  
  6169  @d cs_token_flag==@'7777 {amount added to the |eqtb| location in a
  6170    token that stands for a control sequence; is a multiple of~256, less~1}
  6171  @d left_brace_token=@'0400 {$2^8\cdot|left_brace|$}
  6172  @d left_brace_limit=@'1000 {$2^8\cdot(|left_brace|+1)$}
  6173  @d right_brace_token=@'1000 {$2^8\cdot|right_brace|$}
  6174  @d right_brace_limit=@'1400 {$2^8\cdot(|right_brace|+1)$}
  6175  @d math_shift_token=@'1400 {$2^8\cdot|math_shift|$}
  6176  @d tab_token=@'2000 {$2^8\cdot|tab_mark|$}
  6177  @d out_param_token=@'2400 {$2^8\cdot|out_param|$}
  6178  @d space_token=@'5040 {$2^8\cdot|spacer|+|" "|$}
  6179  @d letter_token=@'5400 {$2^8\cdot|letter|$}
  6180  @d other_token=@'6000 {$2^8\cdot|other_char|$}
  6181  @d match_token=@'6400 {$2^8\cdot|match|$}
  6182  @d end_match_token=@'7000 {$2^8\cdot|end_match|$}
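
@ The token arithmetic can be checked with a few lines of Python. This sketch
is an illustration only; the function names |char_token|, |cs_token|, and
|describe| are hypothetical helpers, while the constants repeat the octal
values defined above. The test |t>=CS_TOKEN_FLAG| relies on the fact that the
largest character token, $2^8\cdot14+255$, is smaller than |cs_token_flag|.

```python
# Token encoding sketch; helper names are assumptions, constants are tex.web's.
CS_TOKEN_FLAG = 0o7777
LEFT_BRACE_LIMIT = 0o1000
RIGHT_BRACE_LIMIT = 0o1400
MATCH_TOKEN = 0o6400
END_MATCH_TOKEN = 0o7000


def char_token(m: int, c: int) -> int:
    """Character c with command code m is represented as 2^8*m + c."""
    assert 1 <= m <= 14 and 0 <= c <= 255
    return 256 * m + c


def cs_token(p: int) -> int:
    """Control sequence whose eqtb address is p."""
    return CS_TOKEN_FLAG + p


def describe(t: int) -> str:
    if t >= CS_TOKEN_FLAG:
        return "control sequence at eqtb[%d]" % (t - CS_TOKEN_FLAG)
    if t < LEFT_BRACE_LIMIT:
        return "left_brace"
    if t < RIGHT_BRACE_LIMIT:
        return "right_brace"
    if MATCH_TOKEN <= t <= END_MATCH_TOKEN:
        return "match or end_match"
    m, c = divmod(t, 256)        # recover command code and character
    return "command %d, character %d" % (m, c)
```

For example, |char_token(10,32)| equals |space_token|, and a left brace with
character code 123 gives the number 379, which is below |left_brace_limit|.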

@ @<Check the ``constant''...@>=
if cs_token_flag+undefined_control_sequence>max_halfword then bad:=21;

@ A token list is a singly linked list of one-word nodes in |mem|, where
each word contains a token and a link. Macro definitions, output-routine
definitions, marks, \.{\\write} texts, and a few other things
are remembered by \TeX\ in the form
of token lists, usually preceded by a node with a reference count in its
|token_ref_count| field. The token stored in location |p| is called
|info(p)|.

Three special commands appear in the token lists of macro definitions.
When |m=match|, it means that \TeX\ should scan a parameter
for the current macro; when |m=end_match|, it means that parameter
matching should end and \TeX\ should start reading the macro text; and
when |m=out_param|, it means that \TeX\ should insert parameter
number |c| into the text at this point.

The enclosing \.{\char'173} and \.{\char'175} characters of a macro
definition are omitted, but an output routine
will be enclosed in braces.

Here is an example macro definition that illustrates these conventions.
After \TeX\ processes the text
$$\.{\\def\\mac a\#1\#2 \\b \{\#1\\-a \#\#1\#2 \#2\}}$$
the definition of \.{\\mac} is represented as a token list containing
$$\def\,{\hskip2pt}
\vbox{\halign{\hfil#\hfil\cr
(reference count), |letter|\,\.a, |match|\,\#, |match|\,\#, |spacer|\,\.\ ,
\.{\\b}, |end_match|,\cr
|out_param|\,1, \.{\\-}, |letter|\,\.a, |spacer|\,\.\ , |mac_param|\,\#,
|other_char|\,\.1,\cr
|out_param|\,2, |spacer|\,\.\ , |out_param|\,2.\cr}}$$
The procedure |scan_toks| builds such token lists, and |macro_call|
does the parameter matching.
@^reference counts@>

Examples such as
$$\.{\\def\\m\{\\def\\m\{a\}\ b\}}$$
explain why reference counts would be needed even if \TeX\ had no \.{\\let}
operation: When the token list for \.{\\m} is being read, the redefinition of
\.{\\m} changes the |eqtb| entry before the token list has been fully
consumed, so we dare not simply destroy a token list when its
control sequence is being redefined.

If the parameter-matching part of a definition ends with `\.{\#\{}',
the corresponding token list will have `\.\{' just before the `|end_match|'
and also at the very end. The first `\.\{' is used to delimit the parameter; the
second one keeps the first from disappearing.

@ The procedure |show_token_list|, which prints a symbolic form of
the token list that starts at a given node |p|, illustrates these
conventions. The token list being displayed should not begin with a reference
count. However, the procedure is intended to be robust, so that if the
memory links are awry or if |p| is not really a pointer to a token list,
nothing catastrophic will happen.

An additional parameter |q| is also given; this parameter is either null
or it points to a node in the token list where a certain magic computation
takes place that will be explained later. (Basically, |q| is non-null when
we are printing the two-line context information at the time of an error
message; |q| marks the place corresponding to where the second line
should begin.)

For example, if |p| points to the node containing the first \.a in the
token list above, then |show_token_list| will print the string
$$\hbox{`\.{a\#1\#2\ \\b\ ->\#1\\-a\ \#\#1\#2\ \#2}';}$$
and if |q| points to the node containing the second \.a,
the magic computation will be performed just before the second \.a is printed.

The generation will stop, and `\.{\\ETC.}' will be printed, if the length
of printing exceeds a given limit~|l|. Anomalous entries are printed in the
form of control sequences that are not followed by a blank space, e.g.,
`\.{\\BAD.}'; this cannot be confused with actual control sequences because
a real control sequence named \.{BAD} would come out `\.{\\BAD\ }'.

@<Declare the procedure called |show_token_list|@>=
procedure show_token_list(@!p,@!q:integer;@!l:integer);
label exit;
var m,@!c:integer; {pieces of a token}
@!match_chr:ASCII_code; {character used in a `|match|'}
@!n:ASCII_code; {the highest parameter number, as an ASCII digit}
begin match_chr:="#"; n:="0"; tally:=0;
while (p<>null) and (tally<l) do
  begin if p=q then @<Do magic computation@>;
  @<Display token |p|, and |return| if there are problems@>;
  p:=link(p);
  end;
if p<>null then print_esc("ETC.");
@.ETC@>
exit:
end;

@ @<Display token |p|...@>=
if (p<hi_mem_min) or (p>mem_end) then
  begin print_esc("CLOBBERED."); return;
@.CLOBBERED@>
  end;
if info(p)>=cs_token_flag then print_cs(info(p)-cs_token_flag)
else  begin m:=info(p) div @'400; c:=info(p) mod @'400;
  if info(p)<0 then print_esc("BAD.")
@.BAD@>
  else @<Display the token $(|m|,|c|)$@>;
  end

@ The procedure usually ``learns'' the character code used for macro
parameters by seeing one in a |match| command before it runs into any
|out_param| commands.

@<Display the token ...@>=
case m of
left_brace,right_brace,math_shift,tab_mark,sup_mark,sub_mark,spacer,
  letter,other_char: print(c);
mac_param: begin print(c); print(c);
  end;
out_param: begin print(match_chr);
  if c<=9 then print_char(c+"0")
  else  begin print_char("!"); return;
    end;
  end;
match: begin match_chr:=c; print(c); incr(n); print_char(n);
  if n>"9" then return;
  end;
end_match: print("->");
@.->@>
othercases print_esc("BAD.")
@.BAD@>
endcases
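
The display rules above can be tried out with a small Python model. It is a
sketch under simplifying assumptions, not the procedure itself: tokens are
plain |(cmd, chr)| pairs instead of packed halfwords in |mem|, the name
|show_tokens| is hypothetical, the `magic computation' and the
\.{\\CLOBBERED.} check are left out, and a control sequence is followed by a
space exactly when its name is alphabetic.

```python
# Illustrative sketch only; show_tokens and the tuple encoding are assumptions.
PRINTABLE = {"left_brace", "right_brace", "math_shift", "tab_mark",
             "sup_mark", "sub_mark", "spacer", "letter", "other_char"}


def show_tokens(tokens, limit=1000):
    out = []
    tally = 0                      # characters printed so far
    match_chr, n = "#", ord("0")   # learned parameter character, last digit

    def emit(s):
        nonlocal tally
        out.append(s)
        tally += len(s)

    for cmd, c in tokens:
        if tally >= limit:         # more tokens remain but the limit is reached
            emit("\\ETC.")
            break
        if cmd == "cs":
            emit("\\" + c + (" " if c.isalpha() else ""))
        elif cmd in PRINTABLE:
            emit(c)
        elif cmd == "mac_param":
            emit(c + c)
        elif cmd == "out_param":
            emit(match_chr)
            if c <= 9:
                emit(str(c))
            else:
                emit("!")
                break
        elif cmd == "match":
            match_chr = c
            n += 1
            emit(c + chr(n))
            if n > ord("9"):
                break
        elif cmd == "end_match":
            emit("->")
        else:
            emit("\\BAD.")
    return "".join(out)


# The token list of \mac from the example, without its reference count.
mac = [("letter", "a"), ("match", "#"), ("match", "#"), ("spacer", " "),
       ("cs", "b"), ("end_match", ""), ("out_param", 1), ("cs", "-"),
       ("letter", "a"), ("spacer", " "), ("mac_param", "#"),
       ("other_char", "1"), ("out_param", 2), ("spacer", " "),
       ("out_param", 2)]
```

Calling |show_tokens(mac)| in this model yields the string quoted in the
commentary, and a small |limit| cuts the display short with \.{\\ETC.}.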

@ Here's the way we sometimes want to display a token list, given a pointer
to its reference count; the pointer may be null.

@p procedure token_show(@!p:pointer);
begin if p<>null then show_token_list(link(p),null,10000000);
end;

@ The |print_meaning| subroutine displays |cur_cmd| and |cur_chr| in
symbolic form, including the expansion of a macro or mark.

@p procedure print_meaning;
begin print_cmd_chr(cur_cmd,cur_chr);
if cur_cmd>=call then
  begin print_char(":"); print_ln; token_show(cur_chr);
  end
else if cur_cmd=top_bot_mark then
  begin print_char(":"); print_ln;
  token_show(cur_mark[cur_chr]);
  end;
end;

@* \[21] Introduction to the syntactic routines.
Let's pause a moment now and try to look at the Big Picture.
The \TeX\ program consists of three main parts: syntactic routines,
semantic routines, and output routines. The chief purpose of the
syntactic routines is to deliver the user's input to the semantic routines,
one token at a time. The semantic routines act as an interpreter
responding to these tokens, which may be regarded as commands. And the
output routines are periodically called on to convert box-and-glue
lists into a compact set of instructions that will be sent
to a typesetter. We have discussed the basic data structures and utility
routines of \TeX, so we are good and ready to plunge into the real activity by
considering the syntactic routines.

Our current goal is to come to grips with the |get_next| procedure,
which is the keystone of \TeX's input mechanism. Each call of |get_next|
sets the value of three variables |cur_cmd|, |cur_chr|, and |cur_cs|,
representing the next input token.
$$\vbox{\halign{#\hfil\cr
  \hbox{|cur_cmd| denotes a command code from the long list of codes
   given above;}\cr
  \hbox{|cur_chr| denotes a character code or other modifier of the command
   code;}\cr
  \hbox{|cur_cs| is the |eqtb| location of the current control sequence,}\cr
  \hbox{\qquad if the current token was a control sequence,
   otherwise it's zero.}\cr}}$$
Underlying this external behavior of |get_next| is all the machinery
necessary to convert from character files to tokens. At a given time we
may be only partially finished with the reading of several files (for
which \.{\\input} was specified), and partially finished with the expansion
of some user-defined macros and/or some macro parameters, and partially
finished with the generation of some text in a template for \.{\\halign},
and so on. When reading a character file, special characters must be
classified as math delimiters, etc.; comments and extra blank spaces must
be removed, paragraphs must be recognized, and control sequences must be
found in the hash table. Furthermore there are occasions in which the
scanning routines have looked ahead for a word like `\.{plus}' but only
part of that word was found, hence a few characters must be put back
into the input and scanned again.

To handle these situations, which might all be present simultaneously,
\TeX\ uses various stacks that hold information about the incomplete
activities, and there is a finite state control for each level of the
input mechanism. These stacks record the current state of an implicitly
recursive process, but the |get_next| procedure is not recursive.
Therefore it will not be difficult to translate these algorithms into
low-level languages that do not support recursion.

@<Glob...@>=
@!cur_cmd: eight_bits; {current command set by |get_next|}
@!cur_chr: halfword; {operand of current command}
@!cur_cs: pointer; {control sequence found here, zero if none found}
@!cur_tok: halfword; {packed representative of |cur_cmd| and |cur_chr|}

@ The |print_cmd_chr| routine prints a symbolic interpretation of a
command code and its modifier. This is used in certain `\.{You can\'t}'
error messages, and in the implementation of diagnostic routines like
\.{\\show}.

The body of |print_cmd_chr| is a rather tedious listing of print
commands, and most of it is essentially an inverse to the |primitive|
routine that enters a \TeX\ primitive into |eqtb|. Therefore much of
this procedure appears elsewhere in the program,
together with the corresponding |primitive| calls.

@d chr_cmd(#)==begin print(#); print_ASCII(chr_code);
  end

@<Declare the procedure called |print_cmd_chr|@>=
procedure print_cmd_chr(@!cmd:quarterword;@!chr_code:halfword);
begin case cmd of
left_brace: chr_cmd("begin-group character ");
right_brace: chr_cmd("end-group character ");
math_shift: chr_cmd("math shift character ");
mac_param: chr_cmd("macro parameter character ");
sup_mark: chr_cmd("superscript character ");
sub_mark: chr_cmd("subscript character ");
endv: print("end of alignment template");
spacer: chr_cmd("blank space ");
letter: chr_cmd("the letter ");
other_char: chr_cmd("the character ");
@t\4@>@<Cases of |print_cmd_chr| for symbolic printing of primitives@>@/
othercases print("[unknown command code!]")
endcases;
end;

@ Here is a procedure that displays the current command.

@p procedure show_cur_cmd_chr;
begin begin_diagnostic; print_nl("{");
if mode<>shown_mode then
  begin print_mode(mode); print(": "); shown_mode:=mode;
  end;
print_cmd_chr(cur_cmd,cur_chr); print_char("}");
end_diagnostic(false);
end;

@* \[22] Input stacks and states.
This implementation of
\TeX\ uses two different conventions for representing sequential stacks.
@^stack conventions@>@^conventions for representing stacks@>

\yskip\hangg 1) If there is frequent access to the top entry, and if the
stack is essentially never empty, then the top entry is kept in a global
variable (even better would be a machine register), and the other entries
appear in the array $\\{stack}[0\to(\\{ptr}-1)]$. For example, the
semantic stack described above is handled this way, and so is the input
stack that we are about to study.

\yskip\hangg 2) If there is infrequent top access, the entire stack contents
are in the array $\\{stack}[0\to(\\{ptr}-1)]$. For example, the |save_stack|
is treated this way, as we have seen.

\yskip\noindent
The state of \TeX's input mechanism appears in the input stack, whose
entries are records with six fields, called |state|, |index|, |start|, |loc|,
|limit|, and |name|. This stack is maintained with
convention~(1), so it is declared in the following way:

@<Types...@>=
@!in_state_record = record
  @!state_field, @!index_field: quarterword;
  @!start_field,@!loc_field, @!limit_field, @!name_field: halfword;
  end;

@ @<Glob...@>=
@!input_stack : array[0..stack_size] of in_state_record;
@!input_ptr : 0..stack_size; {first unused location of |input_stack|}
@!max_in_stack: 0..stack_size; {largest value of |input_ptr| when pushing}
@!cur_input : in_state_record;
  {the ``top'' input state, according to convention (1)}
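
Convention (1) can be pictured with the following Python sketch. It is an
illustration only: the class |InputStack| and its method names are
hypothetical, the records are stand-in strings, and the default
|stack_size| of 200 is an arbitrary value chosen for the example.

```python
# Illustrative sketch only; InputStack and its methods are hypothetical names.
class InputStack:
    def __init__(self, bottom, stack_size=200):
        self.stack_size = stack_size
        self.stack = [None] * (stack_size + 1)   # like input_stack[0..stack_size]
        self.ptr = 0              # first unused location, like input_ptr
        self.max_in_stack = 0     # largest value of ptr seen when pushing
        self.cur = bottom         # the top entry lives outside the array

    def push(self, new_top):
        """Save the current top in the array and make new_top current."""
        if self.ptr > self.max_in_stack:
            self.max_in_stack = self.ptr
        if self.ptr == self.stack_size:
            raise OverflowError("input stack size")
        self.stack[self.ptr] = self.cur
        self.ptr += 1
        self.cur = new_top

    def pop(self):
        """Discard the current top and restore the entry below it."""
        self.ptr -= 1
        self.cur = self.stack[self.ptr]
```

Access to the top entry never touches the array, which is the point of the
convention; only pushing and popping copy a record in or out.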

@ We've already defined the special variable |loc==cur_input.loc_field|
in our discussion of basic input-output routines. The other components of
|cur_input| are defined in the same way:

@d state==cur_input.state_field {current scanner state}
@d index==cur_input.index_field {reference for buffer information}
@d start==cur_input.start_field {starting position in |buffer|}
@d limit==cur_input.limit_field {end of current line in |buffer|}
@d name==cur_input.name_field {name of the current file}

@ Let's look more closely now at the control variables
(|state|,~|index|,~|start|,~|loc|,~|limit|,~|name|),
assuming that \TeX\ is reading a line of characters that have been input
from some file or from the user's terminal. There is an array called
|buffer| that acts as a stack of all lines of characters that are
currently being read from files, including all lines on subsidiary
levels of the input stack that are not yet completed. \TeX\ will return to
the other lines when it is finished with the present input file.

(Incidentally, on a machine with byte-oriented addressing, it might be
appropriate to combine |buffer| with the |str_pool| array,
letting the buffer entries grow downward from the top of the string pool
and checking that these two tables don't bump into each other.)

The line we are currently working on begins in position |start| of the
buffer; the next character we are about to read is |buffer[loc]|; and
|limit| is the location of the last character present.  If |loc>limit|,
the line has been completely read. Usually |buffer[limit]| is the
|end_line_char|, denoting the end of a line, but this is not
true if the current line is an insertion that was entered on the user's
terminal in response to an error message.

The |name| variable is a string number that designates the name of
the current file, if we are reading a text file. It is zero if we
are reading from the terminal; it is |n+1| if we are reading from
input stream |n|, where |0<=n<=16|. (Input stream 16 stands for
an invalid stream number; in such cases the input is actually from
the terminal, under control of the procedure |read_toks|.)

The |state| variable has one of three values, when we are scanning such
files:
$$\baselineskip 15pt\vbox{\halign{#\hfil\cr
1) |state=mid_line| is the normal state.\cr
2) |state=skip_blanks| is like |mid_line|, but blanks are ignored.\cr
3) |state=new_line| is the state at the beginning of a line.\cr}}$$
These state values are assigned numeric codes so that if we add the state
code to the next character's command code, we get distinct values. For
example, `|mid_line+spacer|' stands for the case that a blank
space character occurs in the middle of a line when it is not being
ignored; after this case is processed, the next value of |state| will
be |skip_blanks|.

@d mid_line=1 {|state| code when scanning a line of characters}
@d skip_blanks=2+max_char_code {|state| code when ignoring blanks}
@d new_line=3+max_char_code+max_char_code {|state| code at start of line}
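
A short Python check makes the claim about distinct sums concrete. It is a
sketch, not part of the program: |max_char_code| is defined elsewhere and is
assumed here to be 15, the value 10 for |spacer| is the one implied by
|space_token|, and the helper name |dispatch_key| is hypothetical.

```python
# Illustrative sketch only; MAX_CHAR_CODE = 15 is an assumption taken from
# elsewhere in the program, and dispatch_key is a hypothetical name.
MAX_CHAR_CODE = 15
MID_LINE = 1
SKIP_BLANKS = 2 + MAX_CHAR_CODE
NEW_LINE = 3 + MAX_CHAR_CODE + MAX_CHAR_CODE
SPACER = 10


def dispatch_key(state, cmd):
    """The single integer that one case statement can branch on."""
    assert 0 <= cmd <= MAX_CHAR_CODE
    return state + cmd


# Every (state, command) pair maps to its own key.
keys = {(s, c): dispatch_key(s, c)
        for s in (MID_LINE, SKIP_BLANKS, NEW_LINE)
        for c in range(MAX_CHAR_CODE + 1)}
all_distinct = len(set(keys.values())) == len(keys)
```

The three states occupy disjoint ranges of length |max_char_code+1|, so a
key such as |mid_line+spacer| identifies both the state and the command.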

@ Additional information about the current line is available via the
|index| variable, which counts how many lines of characters are present
in the buffer below the current level. We have |index=0| when reading
from the terminal and prompting the user for each line; then if the user types,
e.g., `\.{\\input paper}', we will have |index=1| while reading
the file \.{paper.tex}. However, it does not follow that |index| is the
same as the input stack pointer, since many of the levels on the input
stack may come from token lists. For example, the instruction `\.{\\input
paper}' might occur in a token list.

The global variable |in_open| is equal to the |index|
value of the highest non-token-list level. Thus, the number of partially read
lines in the buffer is |in_open+1|, and we have |in_open=index|
when we are not reading a token list.

If we are not currently reading from the terminal, or from an input
stream, we are reading from the file variable |input_file[index]|. We use
the notation |terminal_input| as a convenient abbreviation for |name=0|,
and |cur_file| as an abbreviation for |input_file[index]|.

The global variable |line| contains the line number in the topmost
open file, for use in error messages. If we are not reading from
the terminal, |line_stack[index]| holds the line number for the
enclosing level, so that |line| can be restored when the current
file has been read. Line numbers should never be negative, since the
negative of the current line number is used to identify the user's output
routine in the |mode_line| field of the semantic nest entries.

If more information about the input state is needed, it can be
included in small arrays like those shown here. For example,
the current page or segment number in the input file might be
put into a variable |@!page|, maintained for enclosing levels in
`\ignorespaces|@!page_stack:array[1..max_in_open] of integer|\unskip'
by analogy with |line_stack|.
@^system dependencies@>

@d terminal_input==(name=0) {are we reading from the terminal?}
@d cur_file==input_file[index] {the current |alpha_file| variable}

@<Glob...@>=
@!in_open : 0..max_in_open; {the number of lines in the buffer, less one}
@!open_parens : 0..max_in_open; {the number of open text files}
@!input_file : array[1..max_in_open] of alpha_file;
@!line : integer; {current line number in the current source file}
@!line_stack : array[1..max_in_open] of integer;

@ Users of \TeX\ sometimes forget to balance left and right braces properly,
and one of the ways \TeX\ tries to spot such errors is by considering an
input file as broken into subfiles by control sequences that
are declared to be \.{\\outer}.

A variable called |scanner_status| tells \TeX\ whether or not to complain
when a subfile ends. This variable has six possible values:

\yskip\hang|normal|, means that a subfile can safely end here without incident.

\yskip\hang|skipping|, means that a subfile can safely end here, but not a file,
because we're reading past some conditional text that was not selected.

\yskip\hang|defining|, means that a subfile shouldn't end now because a
macro is being defined.

\yskip\hang|matching|, means that a subfile shouldn't end now because a
macro is being used and we are searching for the end of its arguments.

\yskip\hang|aligning|, means that a subfile shouldn't end now because we are
not finished with the preamble of an \.{\\halign} or \.{\\valign}.

\yskip\hang|absorbing|, means that a subfile shouldn't end now because we are
reading a balanced token list for \.{\\message}, \.{\\write}, etc.

\yskip\noindent
If the |scanner_status| is not |normal|, the variable |warning_index| points
to the |eqtb| location for the relevant control sequence name to print
in an error message.

@d skipping=1 {|scanner_status| when passing conditional text}
@d defining=2 {|scanner_status| when reading a macro definition}
@d matching=3 {|scanner_status| when reading macro arguments}
@d aligning=4 {|scanner_status| when reading an alignment preamble}
@d absorbing=5 {|scanner_status| when reading a balanced text}

@<Glob...@>=
@!scanner_status : normal..absorbing; {can a subfile end now?}
@!warning_index : pointer; {identifier relevant to non-|normal| scanner status}
@!def_ref : pointer; {reference count of token list being defined}

@ Here is a procedure that uses |scanner_status| to print a warning message
when a subfile has ended, and at certain other crucial times:

@<Declare the procedure called |runaway|@>=
procedure runaway;
var p:pointer; {head of runaway list}
begin if scanner_status>skipping then
  begin print_nl("Runaway ");
@.Runaway...@>
  case scanner_status of
  defining: begin print("definition"); p:=def_ref;
    end;
  matching: begin print("argument"); p:=temp_head;
    end;
  aligning: begin print("preamble"); p:=hold_head;
    end;
  absorbing: begin print("text"); p:=def_ref;
    end;
  end; {there are no other cases}
  print_char("?");print_ln; show_token_list(link(p),null,error_line-10);
  end;
end;

@ However, all this discussion about input state really applies only to the
case that we are inputting from a file. There is another important case,
namely when we are currently getting input from a token list. In this case
|state=token_list|, and the conventions about the other state variables
are different:

\yskip\hang|loc| is a pointer to the current node in the token list, i.e.,
the node that will be read next. If |loc=null|, the token list has been
fully read.

\yskip\hang|start| points to the first node of the token list; this node
may or may not contain a reference count, depending on the type of token
list involved.

\yskip\hang|token_type|, which takes the place of |index| in the
discussion above, is a code number that explains what kind of token list
is being scanned.

\yskip\hang|name| points to the |eqtb| address of the control sequence
being expanded, if the current token list is a macro.

\yskip\hang|param_start|, which takes the place of |limit|, tells where
the parameters of the current macro begin in the |param_stack|, if the
current token list is a macro.

\yskip\noindent The |token_type| can take several values, depending on
where the current token list came from:

\yskip\hang|parameter|, if a parameter is being scanned;

\hang|u_template|, if the \<u_j> part of an alignment
template is being scanned;

\hang|v_template|, if the \<v_j> part of an alignment
template is being scanned;

\hang|backed_up|, if the token list being scanned has been inserted as
`to be read again';

\hang|inserted|, if the token list being scanned has been inserted as
the text expansion of a \.{\\count} or similar variable;

\hang|macro|, if a user-defined control sequence is being scanned;

\hang|output_text|, if an \.{\\output} routine is being scanned;

\hang|every_par_text|, if the text of \.{\\everypar} is being scanned;

\hang|every_math_text|, if the text of \.{\\everymath} is being scanned;

\hang|every_display_text|, if the text of \.{\\everydisplay} is being scanned;

\hang|every_hbox_text|, if the text of \.{\\everyhbox} is being scanned;

\hang|every_vbox_text|, if the text of \.{\\everyvbox} is being scanned;

\hang|every_job_text|, if the text of \.{\\everyjob} is being scanned;

\hang|every_cr_text|, if the text of \.{\\everycr} is being scanned;

\hang|mark_text|, if the text of a \.{\\mark} is being scanned;

\hang|write_text|, if the text of a \.{\\write} is being scanned.

\yskip\noindent
The codes for |output_text|, |every_par_text|, etc., are equal to a constant
plus the corresponding codes for token list parameters |output_routine_loc|,
|every_par_loc|, etc.  The token list begins with a reference count if and
only if |token_type>=macro|.
@^reference counts@>

@d token_list=0 {|state| code when scanning a token list}
@d token_type==index {type of current token list}
@d param_start==limit {base of macro parameters in |param_stack|}
@d parameter=0 {|token_type| code for parameter}
@d u_template=1 {|token_type| code for \<u_j> template}
@d v_template=2 {|token_type| code for \<v_j> template}
@d backed_up=3 {|token_type| code for text to be reread}
@d inserted=4 {|token_type| code for inserted texts}
@d macro=5 {|token_type| code for defined control sequences}
@d output_text=6 {|token_type| code for output routines}
@d every_par_text=7 {|token_type| code for \.{\\everypar}}
@d every_math_text=8 {|token_type| code for \.{\\everymath}}
@d every_display_text=9 {|token_type| code for \.{\\everydisplay}}
@d every_hbox_text=10 {|token_type| code for \.{\\everyhbox}}
@d every_vbox_text=11 {|token_type| code for \.{\\everyvbox}}
@d every_job_text=12 {|token_type| code for \.{\\everyjob}}
@d every_cr_text=13 {|token_type| code for \.{\\everycr}}
@d mark_text=14 {|token_type| code for \.{\\topmark}, etc.}
@d write_text=15 {|token_type| code for \.{\\write}}
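
The ordering of these codes carries information, as the rule about
reference counts shows. A Python sketch of that rule follows; it is only an
illustration, with |TokenType| and |has_reference_count| as hypothetical
names that mirror the definitions above.

```python
# Illustrative sketch only; TokenType and has_reference_count are
# hypothetical names mirroring the token_type codes defined above.
from enum import IntEnum


class TokenType(IntEnum):
    PARAMETER = 0
    U_TEMPLATE = 1
    V_TEMPLATE = 2
    BACKED_UP = 3
    INSERTED = 4
    MACRO = 5
    OUTPUT_TEXT = 6
    EVERY_PAR_TEXT = 7
    EVERY_MATH_TEXT = 8
    EVERY_DISPLAY_TEXT = 9
    EVERY_HBOX_TEXT = 10
    EVERY_VBOX_TEXT = 11
    EVERY_JOB_TEXT = 12
    EVERY_CR_TEXT = 13
    MARK_TEXT = 14
    WRITE_TEXT = 15


def has_reference_count(token_type):
    """True if a list of this type starts with a reference-count node."""
    return token_type >= TokenType.MACRO
```

Because the codes are ordered this way, one comparison against |macro|
separates the lists that carry a reference count from those that do not.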

@ The |param_stack| is an auxiliary array used to hold pointers to the token
lists for parameters at the current level and subsidiary levels of input.
This stack is maintained with convention (2), and it grows at a different
rate from the others.

@<Glob...@>=
@!param_stack:array [0..param_size] of pointer;
  {token list pointers for parameters}
@!param_ptr:0..param_size; {first unused entry in |param_stack|}
@!max_param_stack:integer;
  {largest value of |param_ptr|, will be |<=param_size+9|}

@ The input routines must also interact with the processing of
\.{\\halign} and \.{\\valign}, since the appearance of tab marks and
\.{\\cr} in certain places is supposed to trigger the beginning of special
\<v_j> template text in the scanner. This magic is accomplished by an
|align_state| variable that is increased by~1 when a `\.{\char'173}' is
scanned and decreased by~1 when a `\.{\char'175}' is scanned. The |align_state|
is nonzero during the \<u_j> template, after which it is set to zero; the
\<v_j> template begins when a tab mark or \.{\\cr} occurs at a time that
|align_state=0|.
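
The counting idea can be shown with a toy Python scanner. This is a sketch
under strong assumptions and not how the program is organized: it works on
characters rather than tokens, it treats \.{\&} as the only tab mark and
ignores \.{\\cr}, it starts counting just after the \<u_j> template, and
the name |find_entry_end| is hypothetical.

```python
# Illustrative sketch only; find_entry_end is a hypothetical name, and real
# scanning works on tokens with category codes, not on raw characters.
def find_entry_end(text):
    """Index of the first '&' met while align_state is zero, else None."""
    align_state = 0                # set to zero once the u-part is finished
    for i, ch in enumerate(text):
        if ch == "{":
            align_state += 1
        elif ch == "}":
            align_state -= 1
        elif ch == "&" and align_state == 0:
            return i               # here the v-part of the template begins
    return None
```

In the entry \.{a\{b\&c\}d\&e} the first \.{\&} is hidden inside braces, so
only the second one, at index 7, ends the entry.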

@<Glob...@>=
@!align_state:integer; {group level with respect to current alignment}

@ Thus, the ``current input state'' can be very complicated indeed; there
can be many levels and each level can arise in a variety of ways. The
|show_context| procedure, which is used by \TeX's error-reporting routine to
print out the current input state on all levels down to the most recent
line of characters from an input file, illustrates most of these conventions.
The global variable |base_ptr| contains the lowest level that was
displayed by this procedure.

@<Glob...@>=
@!base_ptr:0..stack_size; {shallowest level shown by |show_context|}

@ The status at each level is indicated by printing two lines, where the first
line indicates what was read so far and the second line shows what remains
to be read. The context is cropped, if necessary, so that the first line
contains at most |half_error_line| characters, and the second contains
at most |error_line|. Non-current input levels whose |token_type| is
`|backed_up|' are shown only if they have not been fully read.

@p procedure show_context; {prints where the scanner is}
label done;
var old_setting:0..max_selector; {saved |selector| setting}
@!nn:integer; {number of contexts shown so far, less one}
@!bottom_line:boolean; {have we reached the final context to be shown?}
@<Local variables for formatting calculations@>@/
begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input;
  {store current state}
nn:=-1; bottom_line:=false;
loop@+begin cur_input:=input_stack[base_ptr]; {enter into the context}
  if (state<>token_list) then
    if (name>17) or (base_ptr=0) then bottom_line:=true;
  if (base_ptr=input_ptr)or bottom_line or(nn<error_context_lines) then
    @<Display the current context@>
  else if nn=error_context_lines then
    begin print_nl("..."); incr(nn); {omitted if |error_context_lines<0|}
    end;
  if bottom_line then goto done;
  decr(base_ptr);
  end;
done: cur_input:=input_stack[input_ptr]; {restore original state}
end;
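
The way |error_context_lines| limits the display can be followed in a
Python model of the loop. It is a sketch with simplifications: each level
is reduced to a pair saying whether it is a token list and what its |name|
is, the special treatment of fully read |backed_up| lists is omitted, and
the names |Level| and |contexts_shown| are hypothetical.

```python
# Illustrative sketch only; Level and contexts_shown are hypothetical names,
# and the omission of fully read backed_up lists is not modeled.
from collections import namedtuple

Level = namedtuple("Level", "is_token_list name")


def contexts_shown(levels, error_context_lines):
    """Return the stack indices that get displayed, current level first."""
    assert not levels[0].is_token_list   # level 0 is a line of characters
    shown = []
    input_ptr = base_ptr = len(levels) - 1
    nn = -1                              # contexts shown so far, less one
    while True:
        lvl = levels[base_ptr]
        bottom_line = (not lvl.is_token_list) and \
            (lvl.name > 17 or base_ptr == 0)
        if base_ptr == input_ptr or bottom_line or nn < error_context_lines:
            shown.append(base_ptr)
            nn += 1
        elif nn == error_context_lines:
            shown.append("...")          # printed once, then levels are skipped
            nn += 1
        if bottom_line:
            break
        base_ptr -= 1
    return shown


# Terminal, then a file (name > 17), then four nested macro expansions.
levels = [Level(False, 0), Level(False, 18)] + [Level(True, 0)] * 4
```

In this model the current level and the bottom line are always shown, and
the intermediate levels appear only while |nn<error_context_lines|.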

@ @<Display the current context@>=
begin if (base_ptr=input_ptr) or (state<>token_list) or
   (token_type<>backed_up) or (loc<>null) then
    {we omit backed-up token lists that have already been read}
  begin tally:=0; {get ready to count characters}
  old_setting:=selector;
  if state<>token_list then
    begin @<Print location of current line@>;
    @<Pseudoprint the line@>;
    end
  else  begin @<Print type of token list@>;
    @<Pseudoprint the token list@>;
    end;
  selector:=old_setting; {stop pseudoprinting}
  @<Print two lines using the tricky pseudoprinted information@>;
  incr(nn);
  end;
end

@ This routine should be changed, if necessary, to give the best possible
indication of where the current line resides in the input file.
For example, on some systems it is best to print both a page and line number.
@^system dependencies@>

@<Print location of current line@>=
if name<=17 then
  if terminal_input then
    if base_ptr=0 then print_nl("<*>") else print_nl("<insert> ")
  else  begin print_nl("<read ");
    if name=17 then print_char("*")@+else print_int(name-1);
@.*\relax@>
    print_char(">");
    end
else  begin print_nl("l."); print_int(line);
  end;
print_char(" ")
  6824  
  6825  @ @<Print type of token list@>=
  6826  case token_type of
  6827  parameter: print_nl("<argument> ");
  6828  u_template,v_template: print_nl("<template> ");
  6829  backed_up: if loc=null then print_nl("<recently read> ")
  6830    else print_nl("<to be read again> ");
  6831  inserted: print_nl("<inserted text> ");
  6832  macro: begin print_ln; print_cs(name);
  6833    end;
  6834  output_text: print_nl("<output> ");
  6835  every_par_text: print_nl("<everypar> ");
  6836  every_math_text: print_nl("<everymath> ");
  6837  every_display_text: print_nl("<everydisplay> ");
  6838  every_hbox_text: print_nl("<everyhbox> ");
  6839  every_vbox_text: print_nl("<everyvbox> ");
  6840  every_job_text: print_nl("<everyjob> ");
  6841  every_cr_text: print_nl("<everycr> ");
  6842  mark_text: print_nl("<mark> ");
  6843  write_text: print_nl("<write> ");
  6844  othercases print_nl("?") {this should never happen}
  6845  endcases
  6846  
  6847  @ Here it is necessary to explain a little trick. We don't want to store a long
  6848  string that corresponds to a token list, because that string might take up
  6849  lots of memory; and we are printing during a time when an error message is
  6850  being given, so we dare not do anything that might overflow one of \TeX's
  6851  tables. So `pseudoprinting' is the answer: We enter a mode of printing
  6852  that stores characters into a buffer of length |error_line|, where character
  6853  $k+1$ is placed into \hbox{|trick_buf[k mod error_line]|} if
  6854  |k<trick_count|, otherwise character |k| is dropped. Initially we set
  6855  |tally:=0| and |trick_count:=1000000|; then when we reach the
  6856  point where transition from line 1 to line 2 should occur, we
  6857  set |first_count:=tally| and |trick_count:=@tmax@>(error_line,
  6858  tally+1+error_line-half_error_line)|. At the end of the
  6859  pseudoprinting, the values of |first_count|, |tally|, and
  6860  |trick_count| give us all the information we need to print the two lines,
  6861  and all of the necessary text is in |trick_buf|.
  6862  
  6863  Namely, let |l| be the length of the descriptive information that appears
  6864  on the first line. The length of the context information gathered for that
  6865  line is |k=first_count|, and the length of the context information
  6866  gathered for line~2 is $m=\min(|tally|, |trick_count|)-k$. If |l+k<=h|,
  6867  where |h=half_error_line|, we print |trick_buf[0..k-1]| after the
  6868  descriptive information on line~1, and set |n:=l+k|; here |n| is the
  6869  length of line~1. If $l+k>h$, some cropping is necessary, so we set |n:=h|
  6870  and print `\.{...}' followed by
  6871  $$\hbox{|trick_buf[(l+k-h+3)..k-1]|,}$$
  6872  where subscripts of |trick_buf| are circular modulo |error_line|. The
  6873  second line consists of |n|~spaces followed by |trick_buf[k..(k+m-1)]|,
  6874  unless |n+m>error_line|; in the latter case, further cropping is done.
  6875  This is easier to program than to explain.
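
For readers who would like to experiment with the trick outside of \TeX, here is a minimal sketch in Python of the bookkeeping just described. It is an illustration under stated assumptions, not part of the program: the function name |show_two_lines| and its parameters |desc|, |before|, and |after| are hypothetical, and the default values 72 and 42 merely stand in for |error_line| and |half_error_line|.

```python
def show_two_lines(desc, before, after, error_line=72, half_error_line=42):
    """Build the two context lines from desc (descriptive text already on
    line 1), before (text already read) and after (text not yet read).
    All three parameter names are illustrative assumptions."""
    trick_buf = [""] * error_line
    tally = 0
    trick_count = 1000000

    def pseudo_print(ch):
        # a character is stored only while tally < trick_count
        nonlocal tally
        if tally < trick_count:
            trick_buf[tally % error_line] = ch
        tally += 1

    l = len(desc)                      # length of descriptive information
    for ch in before:
        pseudo_print(ch)
    # transition from line 1 to line 2: the effect of set_trick_count
    first_count = tally
    trick_count = max(error_line, tally + 1 + error_line - half_error_line)
    for ch in after:
        pseudo_print(ch)

    m = min(tally, trick_count) - first_count     # context for line 2
    if l + first_count <= half_error_line:
        p, n, line1 = 0, l + first_count, desc
    else:                                         # crop the front of line 1
        p = l + first_count - half_error_line + 3
        n, line1 = half_error_line, desc + "..."
    line1 += "".join(trick_buf[q % error_line] for q in range(p, first_count))

    cropped = m + n > error_line                  # crop the end of line 2
    p = first_count + (error_line - n - 3 if cropped else m)
    line2 = " " * n + "".join(
        trick_buf[q % error_line] for q in range(first_count, p))
    return line1, line2 + ("..." if cropped else "")
```

Calling |show_two_lines("l.7 ", "\\foo ", "bar baz")| yields a first line that ends where reading stopped and a second line, indented by the same amount, that shows what is still to be read.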
  6876  
  6877  @<Local variables for formatting...@>=
  6878  @!i:0..buf_size; {index into |buffer|}
  6879  @!j:0..buf_size; {end of current line in |buffer|}
  6880  @!l:0..half_error_line; {length of descriptive information on line 1}
  6881  @!m:integer; {context information gathered for line 2}
  6882  @!n:0..error_line; {length of line 1}
  6883  @!p: integer; {starting or ending place in |trick_buf|}
  6884  @!q: integer; {temporary index}
  6885  
  6886  @ The following code sets up the print routines so that they will gather
  6887  the desired information.
  6888  
  6889  @d begin_pseudoprint==
  6890    begin l:=tally; tally:=0; selector:=pseudo;
  6891    trick_count:=1000000;
  6892    end
  6893  @d set_trick_count==
  6894    begin first_count:=tally;
  6895    trick_count:=tally+1+error_line-half_error_line;
  6896    if trick_count<error_line then trick_count:=error_line;
  6897    end
  6898  
  6899  @ And the following code uses the information after it has been gathered.
  6900  
  6901  @<Print two lines using the tricky pseudoprinted information@>=
  6902  if trick_count=1000000 then set_trick_count;
  6903    {|set_trick_count| must be performed}
  6904  if tally<trick_count then m:=tally-first_count
  6905  else m:=trick_count-first_count; {context on line 2}
  6906  if l+first_count<=half_error_line then
  6907    begin p:=0; n:=l+first_count;
  6908    end
  6909  else  begin print("..."); p:=l+first_count-half_error_line+3;
  6910    n:=half_error_line;
  6911    end;
  6912  for q:=p to first_count-1 do print_char(trick_buf[q mod error_line]);
  6913  print_ln;
  6914  for q:=1 to n do print_char(" "); {print |n| spaces to begin line~2}
  6915  if m+n<=error_line then p:=first_count+m else p:=first_count+(error_line-n-3);
  6916  for q:=first_count to p-1 do print_char(trick_buf[q mod error_line]);
  6917  if m+n>error_line then print("...")
  6918  
  6919  @ But the trick is distracting us from our current goal, which is to
  6920  understand the input state. So let's concentrate on the data structures that
  6921  are being pseudoprinted as we finish up the |show_context| procedure.
  6922  
  6923  @<Pseudoprint the line@>=
  6924  begin_pseudoprint;
  6925  if buffer[limit]=end_line_char then j:=limit
  6926  else j:=limit+1; {determine the effective end of the line}
  6927  if j>0 then for i:=start to j-1 do
  6928    begin if i=loc then set_trick_count;
  6929    print(buffer[i]);
  6930    end
  6931  
  6932  @ @<Pseudoprint the token list@>=
  6933  begin_pseudoprint;
  6934  if token_type<macro then show_token_list(start,loc,100000)
  6935  else show_token_list(link(start),loc,100000) {avoid reference count}
  6936  
  6937  @ Here is the missing piece of |show_token_list| that is activated when the
  6938  token beginning line~2 is about to be shown:
  6939  
  6940  @<Do magic computation@>=set_trick_count
  6941  
  6942  @* \[23] Maintaining the input stacks.
  6943  The following subroutines change the input status in commonly needed ways.
  6944  
  6945  First comes |push_input|, which stores the current state and creates a
  6946  new level (having, initially, the same properties as the old).
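
The discipline of |push_input| and of its inverse |pop_input| (defined in the next section) can be sketched in Python as follows. This is only an illustration: the class name |InputStack| and the use of a dictionary for |cur_input| are assumptions made for the example. The point to notice is that the current state lives outside the array, so that pushing copies it in and leaves the new level with the same properties as the old.

```python
import copy


class InputStack:
    """Hypothetical stand-in for input_stack, input_ptr and cur_input."""

    def __init__(self, stack_size, initial_state):
        self.stack = [None] * (stack_size + 1)   # indices 0..stack_size
        self.stack_size = stack_size
        self.input_ptr = 0
        self.max_in_stack = 0
        self.cur_input = initial_state

    def push_input(self):
        """Enter a new input level, saving the old one."""
        if self.input_ptr > self.max_in_stack:
            self.max_in_stack = self.input_ptr
            if self.input_ptr == self.stack_size:
                raise OverflowError("input stack size")
        self.stack[self.input_ptr] = copy.copy(self.cur_input)
        self.input_ptr += 1

    def pop_input(self):
        """Leave an input level and re-enter the old one."""
        self.input_ptr -= 1
        self.cur_input = self.stack[self.input_ptr]
```

The capacity test is made only when a new maximum depth is reached, which keeps the common case cheap.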
  6947  
  6948  @d push_input==@t@> {enter a new input level, save the old}
  6949    begin if input_ptr>max_in_stack then
  6950      begin max_in_stack:=input_ptr;
  6951      if input_ptr=stack_size then overflow("input stack size",stack_size);
  6952  @:TeX capacity exceeded input stack size}{\quad input stack size@>
  6953      end;
  6954    input_stack[input_ptr]:=cur_input; {stack the record}
  6955    incr(input_ptr);
  6956    end
  6957  
  6958  @ And of course what goes up must come down.
  6959  
  6960  @d pop_input==@t@> {leave an input level, re-enter the old}
  6961    begin decr(input_ptr); cur_input:=input_stack[input_ptr];
  6962    end
  6963  
  6964  @ Here is a procedure that starts a new level of token-list input, given
  6965  a token list |p| and its type |t|. If |t=macro|, the calling routine should
  6966  set |name| and |loc|.
  6967  
  6968  @d back_list(#)==begin_token_list(#,backed_up) {backs up a simple token list}
  6969  @d ins_list(#)==begin_token_list(#,inserted) {inserts a simple token list}
  6970  
  6971  @p procedure begin_token_list(@!p:pointer;@!t:quarterword);
  6972  begin push_input; state:=token_list; start:=p; token_type:=t;
  6973  if t>=macro then {the token list starts with a reference count}
  6974    begin add_token_ref(p);
  6975    if t=macro then param_start:=param_ptr
  6976    else  begin loc:=link(p);
  6977      if tracing_macros>1 then
  6978        begin begin_diagnostic; print_nl("");
  6979        case t of
  6980        mark_text:print_esc("mark");
  6981        write_text:print_esc("write");
  6982        othercases print_cmd_chr(assign_toks,t-output_text+output_routine_loc)
  6983        endcases;@/
  6984        print("->"); token_show(p); end_diagnostic(false);
  6985        end;
  6986      end;
  6987    end
  6988  else loc:=p;
  6989  end;
  6990  
  6991  @ When a token list has been fully scanned, the following computations
  6992  should be done as we leave that level of input. The |token_type| tends
  6993  to be equal to either |backed_up| or |inserted| about 2/3 of the time.
  6994  @^inner loop@>
  6995  
  6996  @p procedure end_token_list; {leave a token-list input level}
  6997  begin if token_type>=backed_up then {token list to be deleted}
  6998    begin if token_type<=inserted then flush_list(start)
  6999    else  begin delete_token_ref(start); {update reference count}
  7000      if token_type=macro then {parameters must be flushed}
  7001        while param_ptr>param_start do
  7002          begin decr(param_ptr);
  7003          flush_list(param_stack[param_ptr]);
  7004          end;
  7005      end;
  7006    end
  7007  else if token_type=u_template then
  7008    if align_state>500000 then align_state:=0
  7009    else fatal_error("(interwoven alignment preambles are not allowed)");
  7010  @.interwoven alignment preambles...@>
  7011  pop_input;
  7012  check_interrupt;
  7013  end;
  7014  
  7015  @ Sometimes \TeX\ has read too far and wants to ``unscan'' what it has
  7016  seen. The |back_input| procedure takes care of this by putting the token
  7017  just scanned back into the input stream, ready to be read again. This
  7018  procedure can be used only if |cur_tok| represents the token to be
  7019  replaced. Some applications of \TeX\ use this procedure a lot,
  7020  so it has been slightly optimized for speed.
  7021  @^inner loop@>
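
A minimal Python sketch of the ``unscan'' idea may help here. It is a simplification under stated assumptions: the class |TokenInput| is hypothetical, Python lists replace linked token lists, and the treatment of |align_state| and of |v_template| levels is left out entirely. What remains is the essential point that exhausted levels are removed before the one-token list is pushed, so that repeated backing up does not make the stack grow.

```python
class TokenInput:
    """Hypothetical model: each level is [token list, loc]."""

    def __init__(self, tokens):
        self.stack = [[list(tokens), 0]]
        self.cur_tok = None

    def get_next(self):
        while True:
            tokens, loc = self.stack[-1]
            if loc < len(tokens):            # list not exhausted
                self.stack[-1][1] = loc + 1
                self.cur_tok = tokens[loc]
                return self.cur_tok
            if len(self.stack) == 1:         # nothing left at all
                return None
            self.stack.pop()                 # like end_token_list

    def back_input(self):
        """Put cur_tok back so that it will be read again."""
        # conserve stack space: drop fully scanned levels first
        while len(self.stack) > 1 and \
                self.stack[-1][1] >= len(self.stack[-1][0]):
            self.stack.pop()
        self.stack.append([[self.cur_tok], 0])   # a list of length one
```

After reading two tokens and calling |back_input|, the next call of |get_next| returns the second token again; doing this many times in succession leaves the stack at depth two.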
  7022  
  7023  @p procedure back_input; {undoes one token of input}
  7024  var p:pointer; {a token list of length one}
  7025  begin while (state=token_list)and(loc=null)and(token_type<>v_template) do
  7026    end_token_list; {conserve stack space}
  7027  p:=get_avail; info(p):=cur_tok;
  7028  if cur_tok<right_brace_limit then
  7029    if cur_tok<left_brace_limit then decr(align_state)
  7030    else incr(align_state);
  7031  push_input; state:=token_list; start:=p; token_type:=backed_up;
  7032  loc:=p; {that was |back_list(p)|, without procedure overhead}
  7033  end;
  7034  
  7035  @ @<Insert token |p| into \TeX's input@>=
  7036  begin t:=cur_tok; cur_tok:=p; back_input; cur_tok:=t;
  7037  end
  7038  
  7039  @ The |back_error| routine is used when we want to replace an offending token
  7040  just before issuing an error message. This routine, like |back_input|,
  7041  requires that |cur_tok| has been set. We disable interrupts during the
  7042  call of |back_input| so that the help message won't be lost.
  7043  
  7044  @p procedure back_error; {back up one token and call |error|}
  7045  begin OK_to_interrupt:=false; back_input; OK_to_interrupt:=true; error;
  7046  end;
  7047  @#
  7048  procedure ins_error; {back up one inserted token and call |error|}
  7049  begin OK_to_interrupt:=false; back_input; token_type:=inserted;
  7050  OK_to_interrupt:=true; error;
  7051  end;
  7052  
  7053  @ The |begin_file_reading| procedure starts a new level of input for lines
  7054  of characters to be read from a file, or as an insertion from the
  7055  terminal. It does not take care of opening the file, nor does it set |loc|
  7056  or |limit| or |line|.
  7057  @^system dependencies@>
  7058  
  7059  @p procedure begin_file_reading;
  7060  begin if in_open=max_in_open then overflow("text input levels",max_in_open);
  7061  @:TeX capacity exceeded text input levels}{\quad text input levels@>
  7062  if first=buf_size then overflow("buffer size",buf_size);
  7063  @:TeX capacity exceeded buffer size}{\quad buffer size@>
  7064  incr(in_open); push_input; index:=in_open;
  7065  line_stack[index]:=line; start:=first; state:=mid_line;
  7066  name:=0; {|terminal_input| is now |true|}
  7067  end;
  7068  
  7069  @ Conversely, the variables must be downdated when such a level of input
  7070  is finished:
  7071  
  7072  @p procedure end_file_reading;
  7073  begin first:=start; line:=line_stack[index];
  7074  if name>17 then a_close(cur_file); {forget it}
  7075  pop_input; decr(in_open);
  7076  end;
  7077  
  7078  @ In order to keep the stack from overflowing during a long sequence of
  7079  inserted `\.{\\show}' commands, the following routine removes completed
  7080  error-inserted lines from memory.
  7081  
  7082  @p procedure clear_for_error_prompt;
  7083  begin while (state<>token_list)and terminal_input and@|
  7084    (input_ptr>0)and(loc>limit) do end_file_reading;
  7085  print_ln; clear_terminal;
  7086  end;
  7087  
  7088  @ To get \TeX's whole input mechanism going, we perform the following
  7089  actions.
  7090  
  7091  @<Initialize the input routines@>=
  7092  begin input_ptr:=0; max_in_stack:=0;
  7093  in_open:=0; open_parens:=0; max_buf_stack:=0;
  7094  param_ptr:=0; max_param_stack:=0;
  7095  first:=buf_size; repeat buffer[first]:=0; decr(first); until first=0;
  7096  scanner_status:=normal; warning_index:=null; first:=1;
  7097  state:=new_line; start:=1; index:=0; line:=0; name:=0;
  7098  force_eof:=false;
  7099  align_state:=1000000;@/
  7100  if not init_terminal then goto final_end;
  7101  limit:=last; first:=last+1; {|init_terminal| has set |loc| and |last|}
  7102  end
  7103  
  7104  @* \[24] Getting the next token.
  7105  The heart of \TeX's input mechanism is the |get_next| procedure, which
  7106  we shall develop in the next few sections of the program. Perhaps we
  7107  shouldn't actually call it the ``heart,'' however, because it really acts
  7108  as \TeX's eyes and mouth, reading the source files and gobbling them up.
  7109  And it also helps \TeX\ to regurgitate stored token lists that are to be
  7110  processed again.
  7111  @^eyes and mouth@>
  7112  
  7113  The main duty of |get_next| is to input one token and to set |cur_cmd|
  7114  and |cur_chr| to that token's command code and modifier. Furthermore, if
  7115  the input token is a control sequence, the |eqtb| location of that control
  7116  sequence is stored in |cur_cs|; otherwise |cur_cs| is set to zero.
  7117  
  7118  Underlying this simple description is a certain amount of complexity
  7119  because of all the cases that need to be handled.
  7120  However, the inner loop of |get_next| is reasonably short and fast.
  7121  
  7122  When |get_next| is asked to get the next token of a \.{\\read} line,
  7123  it sets |cur_cmd=cur_chr=cur_cs=0| in the case that no more tokens
  7124  appear on that line. (There might not be any tokens at all, if the
  7125  |end_line_char| has |ignore| as its catcode.)
  7126  
  7127  @ The value of |par_loc| is the |eqtb| address of `\.{\\par}'. This quantity
  7128  is needed because a blank line of input is supposed to be exactly equivalent
  7129  to the appearance of \.{\\par}; we must set |cur_cs:=par_loc|
  7130  when detecting a blank line.
  7131  
  7132  @<Glob...@>=
  7133  @!par_loc:pointer; {location of `\.{\\par}' in |eqtb|}
  7134  @!par_token:halfword; {token representing `\.{\\par}'}
  7135  
  7136  @ @<Put each...@>=
  7137  primitive("par",par_end,256); {cf.\ |scan_file_name|}
  7138  @!@:par_}{\.{\\par} primitive@>
  7139  par_loc:=cur_val; par_token:=cs_token_flag+par_loc;
  7140  
  7141  @ @<Cases of |print_cmd_chr|...@>=
  7142  par_end:print_esc("par");
  7143  
  7144  @ Before getting into |get_next|, let's consider the subroutine that
  7145  is called when an `\.{\\outer}' control sequence has been scanned or
  7146  when the end of a file has been reached. These two cases are distinguished
  7147  by |cur_cs|, which is zero at the end of a file.
  7148  
  7149  @p procedure check_outer_validity;
  7150  var p:pointer; {points to inserted token list}
  7151  @!q:pointer; {auxiliary pointer}
  7152  begin if scanner_status<>normal then
  7153    begin deletions_allowed:=false;
  7154    @<Back up an outer control sequence so that it can be reread@>;
  7155    if scanner_status>skipping then
  7156      @<Tell the user what has run away and try to recover@>
  7157    else  begin print_err("Incomplete "); print_cmd_chr(if_test,cur_if);
  7158  @.Incomplete \\if...@>
  7159      print("; all text was ignored after line "); print_int(skip_line);
  7160      help3("A forbidden control sequence occurred in skipped text.")@/
  7161      ("This kind of error happens when you say `\if...' and forget")@/
  7162      ("the matching `\fi'. I've inserted a `\fi'; this might work.");
  7163      if cur_cs<>0 then cur_cs:=0
  7164      else help_line[2]:=@|
  7165        "The file ended while I was skipping conditional text.";
  7166      cur_tok:=cs_token_flag+frozen_fi; ins_error;
  7167      end;
  7168    deletions_allowed:=true;
  7169    end;
  7170  end;
  7171  
  7172  @ An outer control sequence that occurs in a \.{\\read} will not be reread,
  7173  since the error recovery for \.{\\read} is not very powerful.
  7174  
  7175  @<Back up an outer control sequence so that it can be reread@>=
  7176  if cur_cs<>0 then
  7177    begin if (state=token_list)or(name<1)or(name>17) then
  7178      begin p:=get_avail; info(p):=cs_token_flag+cur_cs;
  7179      back_list(p); {prepare to read the control sequence again}
  7180      end;
  7181    cur_cmd:=spacer; cur_chr:=" "; {replace it by a space}
  7182    end
  7183  
  7184  @ @<Tell the user what has run away...@>=
  7185  begin runaway; {print a definition, argument, or preamble}
  7186  if cur_cs=0 then print_err("File ended")
  7187  @.File ended while scanning...@>
  7188  else  begin cur_cs:=0; print_err("Forbidden control sequence found");
  7189  @.Forbidden control sequence...@>
  7190    end;
  7191  print(" while scanning ");
  7192  @<Print either `\.{definition}' or `\.{use}' or `\.{preamble}' or `\.{text}',
  7193    and insert tokens that should lead to recovery@>;
  7194  print(" of "); sprint_cs(warning_index);
  7195  help4("I suspect you have forgotten a `}', causing me")@/
  7196  ("to read past where you wanted me to stop.")@/
  7197  ("I'll try to recover; but if the error is serious,")@/
  7198  ("you'd better type `E' or `X' now and fix your file.");@/
  7199  error;
  7200  end
  7201  
  7202  @ The recovery procedure can't be fully understood without knowing more
  7203  about the \TeX\ routines that should be aborted, but we can sketch the
  7204  ideas here:  For a runaway definition or a runaway balanced text
  7205  we will insert a right brace; for a
  7206  runaway preamble, we will insert a special \.{\\cr} token and a right
  7207  brace; and for a runaway argument, we will set |long_state| to
  7208  |outer_call| and insert \.{\\par}.
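
The four recovery cases can be summarized as a small table. The Python sketch below is illustrative only: |RECOVERY| and |runaway_report| are hypothetical names, tokens are shown as strings, and the side effects on |long_state| and |align_state| are mentioned in comments rather than carried out.

```python
# scanner_status -> (word printed in the message, tokens inserted for recovery)
RECOVERY = {
    "defining": ("definition", ["}"]),
    "matching": ("use", ["\\par"]),         # also sets long_state to outer_call
    "aligning": ("preamble", ["\\cr", "}"]),  # also resets align_state
    "absorbing": ("text", ["}"]),
}


def runaway_report(scanner_status, cur_cs, warning_name):
    """Return the error message and the token list to be inserted.

    cur_cs == 0 means that the file ended; otherwise a forbidden
    control sequence was found.
    """
    word, inserted = RECOVERY[scanner_status]
    head = "File ended" if cur_cs == 0 else "Forbidden control sequence found"
    return f"{head} while scanning {word} of {warning_name}", inserted
```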
  7209  
  7210  @<Print either `\.{definition}' or ...@>=
  7211  p:=get_avail;
  7212  case scanner_status of
  7213  defining:begin print("definition"); info(p):=right_brace_token+"}";
  7214    end;
  7215  matching:begin print("use"); info(p):=par_token; long_state:=outer_call;
  7216    end;
  7217  aligning:begin print("preamble"); info(p):=right_brace_token+"}"; q:=p;
  7218    p:=get_avail; link(p):=q; info(p):=cs_token_flag+frozen_cr;
  7219    align_state:=-1000000;
  7220    end;
  7221  absorbing:begin print("text"); info(p):=right_brace_token+"}";
  7222    end;
  7223  end; {there are no other cases}
  7224  ins_list(p)
  7225  
  7226  @ We need to mention a procedure here that may be called by |get_next|.
  7227  
  7228  @p procedure@?firm_up_the_line; forward;
  7229  
  7230  @ Now we're ready to take the plunge into |get_next| itself. Parts of
  7231  this routine are executed more often than any other instructions of \TeX.
  7232  @^mastication@>@^inner loop@>
  7233  
  7234  @d switch=25 {a label in |get_next|}
  7235  @d start_cs=26 {another}
  7236  
  7237  @p procedure get_next; {sets |cur_cmd|, |cur_chr|, |cur_cs| to next token}
  7238  label restart, {go here to get the next input token}
  7239    switch, {go here to eat the next character from a file}
  7240    reswitch, {go here to digest it again}
  7241    start_cs, {go here to start looking for a control sequence}
  7242    found, {go here when a control sequence has been found}
  7243    exit; {go here when the next input token has been got}
  7244  var k:0..buf_size; {an index into |buffer|}
  7245  @!t:halfword; {a token}
  7246  @!cat:0..max_char_code; {|cat_code(cur_chr)|, usually}
  7247  @!c,@!cc:ASCII_code; {constituents of a possible expanded code}
  7248  @!d:2..3; {number of excess characters in an expanded code}
  7249  begin restart: cur_cs:=0;
  7250  if state<>token_list then
  7251  @<Input from external file, |goto restart| if no input found@>
  7252  else @<Input from token list, |goto restart| if end of list or
  7253    if a parameter needs to be expanded@>;
  7254  @<If an alignment entry has just ended, take appropriate action@>;
  7255  exit:end;
  7256  
  7257  @ An alignment entry ends when a tab or \.{\\cr} occurs, provided that the
  7258  current level of braces is the same as the level that was present at the
  7259  beginning of that alignment entry; i.e., provided that |align_state| has
  7260  returned to the value it had after the \<u_j> template for that entry.
  7261  @^inner loop@>
  7262  
  7263  @<If an alignment entry has just ended, take appropriate action@>=
  7264  if cur_cmd<=car_ret then if cur_cmd>=tab_mark then if align_state=0 then
  7265    @<Insert the \(v)\<v_j> template and |goto restart|@>
  7266  
  7267  @ @<Input from external file, |goto restart| if no input found@>=
  7268  @^inner loop@>
  7269  begin switch: if loc<=limit then {current line not yet finished}
  7270    begin cur_chr:=buffer[loc]; incr(loc);
  7271    reswitch: cur_cmd:=cat_code(cur_chr);
  7272    @<Change state if necessary, and |goto switch| if the
  7273      current character should be ignored,
  7274      or |goto reswitch| if the current character
  7275      changes to another@>;
  7276    end
  7277  else  begin state:=new_line;@/
  7278    @<Move to next line of file,
  7279      or |goto restart| if there is no next line,
  7280      or |return| if a \.{\\read} line has finished@>;
  7281    check_interrupt;
  7282    goto switch;
  7283    end;
  7284  end
  7285  
  7286  @ The following 48-way switch accomplishes the scanning quickly, assuming
  7287  that a decent \PASCAL\ compiler has translated the code. Note that the numeric
  7288  values for |mid_line|, |skip_blanks|, and |new_line| are spaced
  7289  apart from each other by |max_char_code+1|, so we can add a character's
  7290  command code to the state to get a single number that characterizes both.
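
The encoding can be checked with a few lines of Python. The sketch assumes the numeric values that are given elsewhere in the program, as far as we recall them (|max_char_code=15|, |mid_line=1|, and category codes 9 and 10 for |ignore| and |spacer|); the helper names |key| and |unkey| are hypothetical.

```python
MAX_CHAR_CODE = 15                      # assumed largest category code
MID_LINE = 1                            # assumed state values, spaced
SKIP_BLANKS = 2 + MAX_CHAR_CODE         # max_char_code+1 apart
NEW_LINE = 3 + 2 * MAX_CHAR_CODE
STATES = (MID_LINE, SKIP_BLANKS, NEW_LINE)
IGNORE, SPACER = 9, 10                  # assumed category codes


def key(state, cmd):
    """One number that characterizes both the state and the command code."""
    return state + cmd


def unkey(k):
    """Recover (state, cmd) from the combined number."""
    cmd = (k - 1) % (MAX_CHAR_CODE + 1)
    return k - cmd, cmd


# the cases in which the current character is simply ignored
IGNORED = {key(s, IGNORE) for s in STATES} | {
    key(SKIP_BLANKS, SPACER), key(NEW_LINE, SPACER)}
```

Three states times sixteen category codes give the 48 distinct case labels, and a space in |mid_line| state is not among the ignored ones.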
  7291  
  7292  @d any_state_plus(#) == mid_line+#,skip_blanks+#,new_line+#
  7293  
  7294  @<Change state if necessary...@>=
  7295  case state+cur_cmd of
  7296  @<Cases where character is ignored@>: goto switch;
  7297  any_state_plus(escape): @<Scan a control sequence
  7298    and set |state:=skip_blanks| or |mid_line|@>;
  7299  any_state_plus(active_char): @<Process an active-character control sequence
  7300    and set |state:=mid_line|@>;
  7301  any_state_plus(sup_mark): @<If this |sup_mark| starts an expanded character
  7302    like~\.{\^\^A} or~\.{\^\^df}, then |goto reswitch|,
  7303    otherwise set |state:=mid_line|@>;
  7304  any_state_plus(invalid_char): @<Decry the invalid character and
  7305    |goto restart|@>;
  7306  @t\4@>@<Handle situations involving spaces, braces, changes of state@>@;
  7307  othercases do_nothing
  7308  endcases
  7309  
  7310  @ @<Cases where character is ignored@>=
  7311  any_state_plus(ignore),skip_blanks+spacer,new_line+spacer
  7312  
  7313  @ We go to |restart| instead of to |switch|, because |state| might equal
  7314  |token_list| after the error has been dealt with
  7315  (cf.\ |clear_for_error_prompt|).
  7316  
  7317  @<Decry the invalid...@>=
  7318  begin print_err("Text line contains an invalid character");
  7319  @.Text line contains...@>
  7320  help2("A funny symbol that I can't read has just been input.")@/
  7321  ("Continue, and I'll forget that it ever happened.");@/
  7322  deletions_allowed:=false; error; deletions_allowed:=true;
  7323  goto restart;
  7324  end
  7325  
  7326  @ @d add_delims_to(#)==#+math_shift,#+tab_mark,#+mac_param,
  7327    #+sub_mark,#+letter,#+other_char
  7328  
  7329  @<Handle situations involving spaces, braces, changes of state@>=
  7330  mid_line+spacer:@<Enter |skip_blanks| state, emit a space@>;
  7331  mid_line+car_ret:@<Finish line, emit a space@>;
  7332  skip_blanks+car_ret,any_state_plus(comment):
  7333    @<Finish line, |goto switch|@>;
  7334  new_line+car_ret:@<Finish line, emit a \.{\\par}@>;
  7335  mid_line+left_brace: incr(align_state);
  7336  skip_blanks+left_brace,new_line+left_brace: begin
  7337    state:=mid_line; incr(align_state);
  7338    end;
  7339  mid_line+right_brace: decr(align_state);
  7340  skip_blanks+right_brace,new_line+right_brace: begin
  7341    state:=mid_line; decr(align_state);
  7342    end;
  7343  add_delims_to(skip_blanks),add_delims_to(new_line): state:=mid_line;
  7344  
  7345  @ When a character of type |spacer| gets through, its character code is
  7346  changed to $\.{"\ "}=@'40$. This means that the ASCII codes for tab and space,
  7347  and for the space inserted at the end of a line, will
  7348  be treated alike when macro parameters are being matched. We do this
  7349  since such characters are indistinguishable on most computer terminal displays.
  7350  
  7351  @<Finish line, emit a space@>=
  7352  begin loc:=limit+1; cur_cmd:=spacer; cur_chr:=" ";
  7353  end
  7354  
  7355  @ The following code is performed only when |cur_cmd=spacer|.
  7356  
  7357  @<Enter |skip_blanks| state, emit a space@>=
  7358  begin state:=skip_blanks; cur_chr:=" ";
  7359  end
  7360  
  7361  @ @<Finish line, |goto switch|@>=
  7362  begin loc:=limit+1; goto switch;
  7363  end
  7364  
  7365  @ @<Finish line, emit a \.{\\par}@>=
  7366  begin loc:=limit+1; cur_cs:=par_loc; cur_cmd:=eq_type(cur_cs);
  7367  cur_chr:=equiv(cur_cs);
  7368  if cur_cmd>=outer_call then check_outer_validity;
  7369  end
  7370  
  7371  @ Notice that a code like \.{\^\^8} becomes \.x if not followed by a hex digit.
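
The rule can be tried out with the following Python sketch, which mirrors the logic of this section under some simplifying assumptions: the line is a list of character codes, |limit| is the index of its last entry, and the function name |expand_sup| is hypothetical. The character at |loc-1| is taken to be the |sup_mark| that has just been read.

```python
def is_hex(c):
    """True for the codes of 0..9 and lowercase a..f."""
    return ord("0") <= c <= ord("9") or ord("a") <= c <= ord("f")


def hex_value(c):
    return c - ord("0") if c <= ord("9") else c - ord("a") + 10


def expand_sup(buffer, loc, limit):
    """Return (character code, new loc) if an expanded character starts
    at buffer[loc-1], otherwise None."""
    cur_chr = buffer[loc - 1]
    if loc < limit and buffer[loc] == cur_chr:
        c = buffer[loc + 1]
        if c < 0o200:                       # yes, an expanded character
            loc += 2
            if is_hex(c) and loc <= limit:
                cc = buffer[loc]
                if is_hex(cc):              # two hex digits
                    return 16 * hex_value(c) + hex_value(cc), loc + 1
            # one character: add or subtract 64
            return (c + 0o100 if c < 0o100 else c - 0o100), loc
    return None
```

For instance, the digit 8 has code 56, which is less than 64, so a lone `8' after the two superscript marks is mapped to code 120, the letter x.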
  7372   
  7373  @d is_hex(#)==(((#>="0")and(#<="9"))or((#>="a")and(#<="f")))
  7374  @d hex_to_cur_chr==
  7375    if c<="9" then cur_chr:=c-"0" @+else cur_chr:=c-"a"+10;
  7376    if cc<="9" then cur_chr:=16*cur_chr+cc-"0"
  7377    else cur_chr:=16*cur_chr+cc-"a"+10
  7378  
  7379  @<If this |sup_mark| starts an expanded character...@>=
  7380  begin if cur_chr=buffer[loc] then if loc<limit then
  7381    begin c:=buffer[loc+1]; @+if c<@'200 then {yes we have an expanded char}
  7382      begin loc:=loc+2; 
  7383      if is_hex(c) then if loc<=limit then
  7384        begin cc:=buffer[loc]; @+if is_hex(cc) then
  7385          begin incr(loc); hex_to_cur_chr; goto reswitch;
  7386          end;
  7387        end;
  7388      if c<@'100 then cur_chr:=c+@'100 @+else cur_chr:=c-@'100;
  7389      goto reswitch;
  7390      end;
  7391    end;
  7392  state:=mid_line;
  7393  end
  7394  
  7395  @ @<Process an active-character...@>=
  7396  begin cur_cs:=cur_chr+active_base;
  7397  cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); state:=mid_line;
  7398  if cur_cmd>=outer_call then check_outer_validity;
  7399  end
  7400  
  7401  @ Control sequence names are scanned only when they appear in some line of
  7402  a file; once they have been scanned the first time, their |eqtb| location
  7403  serves as a unique identification, so \TeX\ doesn't need to refer to the
  7404  original name any more except when it prints the equivalent in symbolic form.
  7405  
  7406  The program that scans a control sequence has been written carefully
  7407  in order to avoid the blowups that might otherwise occur if a malicious
  7408  user tried something like `\.{\\catcode\'15=0}'. The algorithm might
  7409  look at |buffer[limit+1]|, but it never looks at |buffer[limit+2]|.
  7410  
  7411  If expanded characters like `\.{\^\^A}' or `\.{\^\^df}'
  7412  appear in or just following
  7413  a control sequence name, they are converted to single characters in the
  7414  buffer and the process is repeated, slowly but surely.
  7415  
  7416  @<Scan a control...@>=
  7417  begin if loc>limit then cur_cs:=null_cs {|state| is irrelevant in this case}
  7418  else  begin start_cs: k:=loc; cur_chr:=buffer[k]; cat:=cat_code(cur_chr);
  7419    incr(k);
  7420    if cat=letter then state:=skip_blanks
  7421    else if cat=spacer then state:=skip_blanks
  7422    else state:=mid_line;
  7423    if (cat=letter)and(k<=limit) then
  7424      @<Scan ahead in the buffer until finding a nonletter;
  7425      if an expanded code is encountered, reduce it
  7426      and |goto start_cs|; otherwise if a multiletter control
  7427      sequence is found, adjust |cur_cs| and |loc|, and
  7428      |goto found|@>
  7429    else @<If an expanded code is present, reduce it and |goto start_cs|@>;
  7430    cur_cs:=single_base+buffer[loc]; incr(loc);
  7431    end;
  7432  found: cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs);
  7433  if cur_cmd>=outer_call then check_outer_validity;
  7434  end
  7435  
  7436  @ Whenever we reach the following piece of code, we will have
  7437  |cur_chr=buffer[k-1]| and |k<=limit+1| and |cat=cat_code(cur_chr)|. If an
  7438  expanded code like \.{\^\^A} or \.{\^\^df} appears in |buffer[(k-1)..(k+1)]|
  7439  or |buffer[(k-1)..(k+2)]|, we
  7440  will store the corresponding code in |buffer[k-1]| and shift the rest of
  7441  the buffer left two or three places.
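
Here is a minimal Python sketch of that in-place reduction. It is an illustration, not the program's code: the name |reduce_in_buffer| is hypothetical, a |bytearray| stands in for |buffer|, the test on the category code is replaced by a comparison with an assumed |sup_mark| character, and the adjustment of |first| is omitted.

```python
def is_hex(c):
    return ord("0") <= c <= ord("9") or ord("a") <= c <= ord("f")


def hex_value(c):
    return c - ord("0") if c <= ord("9") else c - ord("a") + 10


def reduce_in_buffer(buffer, k, limit, sup_mark=ord("^")):
    """If buffer[k-1] and buffer[k] begin an expanded code, store the
    reduced character in buffer[k-1], shift the rest left, and return
    the new limit; otherwise return None and leave buffer alone."""
    if not (k < limit and buffer[k - 1] == sup_mark
            and buffer[k] == sup_mark):
        return None
    c = buffer[k + 1]
    if c >= 0o200:
        return None
    d = 2                                   # number of excess characters
    if is_hex(c) and k + 2 <= limit and is_hex(buffer[k + 2]):
        d = 3
    if d > 2:
        buffer[k - 1] = 16 * hex_value(c) + hex_value(buffer[k + 2])
    elif c < 0o100:
        buffer[k - 1] = c + 0o100
    else:
        buffer[k - 1] = c - 0o100
    limit -= d
    while k <= limit:                       # shift left d places
        buffer[k] = buffer[k + d]
        k += 1
    return limit
```

After such a reduction the scanner starts over at the same place, which is why the process is slow but sure.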
  7442  
  7443  @<If an expanded...@>=
  7444  begin if buffer[k]=cur_chr then @+if cat=sup_mark then @+if k<limit then
  7445    begin c:=buffer[k+1]; @+if c<@'200 then {yes, one is indeed present}
  7446      begin d:=2;
  7447      if is_hex(c) then @+if k+2<=limit then
  7448        begin cc:=buffer[k+2]; @+if is_hex(cc) then incr(d);
  7449        end;
  7450      if d>2 then
  7451        begin hex_to_cur_chr; buffer[k-1]:=cur_chr;
  7452        end
  7453      else if c<@'100 then buffer[k-1]:=c+@'100
  7454      else buffer[k-1]:=c-@'100;
  7455      limit:=limit-d; first:=first-d;
  7456      while k<=limit do
  7457        begin buffer[k]:=buffer[k+d]; incr(k);
  7458        end;
  7459      goto start_cs;
  7460      end;
  7461    end;
  7462  end
  7463  
  7464  @ @<Scan ahead in the buffer...@>=
  7465  begin repeat cur_chr:=buffer[k]; cat:=cat_code(cur_chr); incr(k);
  7466  until (cat<>letter)or(k>limit);
  7467  @<If an expanded...@>;
  7468  if cat<>letter then decr(k);
  7469    {now |k| points to first nonletter}
  7470  if k>loc+1 then {multiletter control sequence has been scanned}
  7471    begin cur_cs:=id_lookup(loc,k-loc); loc:=k; goto found;
  7472    end;
  7473  end
  7474  
  7475  @ Let's consider now what happens when |get_next| is looking at a token list.
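
The way a token is taken apart in the code below can be sketched in Python. The sketch assumes that |cs_token_flag| has the octal value 7777, as we recall it from the part of the program that defines token lists, and that |left_brace| and |right_brace| are command codes 1 and 2; the function names are hypothetical.

```python
CS_TOKEN_FLAG = 0o7777          # assumed value of cs_token_flag
LEFT_BRACE, RIGHT_BRACE = 1, 2  # assumed command codes


def decode_token(t):
    """Split a token into ("cs", eqtb location) or (command, character)."""
    if t >= CS_TOKEN_FLAG:                  # a control sequence token
        return "cs", t - CS_TOKEN_FLAG
    return t // 0o400, t % 0o400            # cur_cmd, cur_chr


def brace_delta(tokens):
    """Net change that a list of tokens would make to align_state."""
    delta = 0
    for t in tokens:
        cmd, _ = decode_token(t)
        if cmd == LEFT_BRACE:
            delta += 1
        elif cmd == RIGHT_BRACE:
            delta -= 1
    return delta
```

A character token thus carries its command code in the high part and its character code in the low eight bits, and nothing further needs to be looked up for it.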
  7476  
  7477  @<Input from token list, |goto restart| if end of list or
  7478    if a parameter needs to be expanded@>=
  7479  if loc<>null then {list not exhausted}
  7480  @^inner loop@>
  7481    begin t:=info(loc); loc:=link(loc); {move to next}
  7482    if t>=cs_token_flag then {a control sequence token}
  7483      begin cur_cs:=t-cs_token_flag;
  7484      cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs);
  7485      if cur_cmd>=outer_call then
  7486        if cur_cmd=dont_expand then
  7487          @<Get the next token, suppressing expansion@>
  7488        else check_outer_validity;
  7489      end
  7490    else  begin cur_cmd:=t div @'400; cur_chr:=t mod @'400;
  7491      case cur_cmd of
  7492      left_brace: incr(align_state);
  7493      right_brace: decr(align_state);
  7494      out_param: @<Insert macro parameter and |goto restart|@>;
  7495      othercases do_nothing
  7496      endcases;
  7497      end;
  7498    end
  7499  else  begin {we are done with this token list}
  7500    end_token_list; goto restart; {resume previous level}
  7501    end
  7502  
  7503  @ The present point in the program is reached only when the |expand|
  7504  routine has inserted a special marker into the input. In this special
  7505  case, |info(loc)| is known to be a control sequence token, and |link(loc)=null|.
  7506  
  7507  @d no_expand_flag=257 {this characterizes a special variant of |relax|}
  7508  
  7509  @<Get the next token, suppressing expansion@>=
  7510  begin cur_cs:=info(loc)-cs_token_flag; loc:=null;@/
  7511  cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs);
  7512  if cur_cmd>max_command then
  7513    begin cur_cmd:=relax; cur_chr:=no_expand_flag;
  7514    end;
  7515  end
  7516  
  7517  @ @<Insert macro parameter...@>=
  7518  begin begin_token_list(param_stack[param_start+cur_chr-1],parameter);
  7519  goto restart;
  7520  end
  7521  
  7522  @ All of the easy branches of |get_next| have now been taken care of.
  7523  There is one more branch.
  7524  
  7525  @d end_line_char_inactive == (end_line_char<0)or(end_line_char>255)
  7526  
  7527  @<Move to next line of file, or |goto restart|...@>=
  7528  if name>17 then @<Read next line of file into |buffer|, or
  7529    |goto restart| if the file has ended@>
  7530  else  begin if not terminal_input then {\.{\\read} line has ended}
  7531      begin cur_cmd:=0; cur_chr:=0; return;
  7532      end;
  7533    if input_ptr>0 then {text was inserted during error recovery}
  7534      begin end_file_reading; goto restart; {resume previous level}
  7535      end;
  7536    if selector<log_only then open_log_file;
  7537    if interaction>nonstop_mode then
  7538      begin if end_line_char_inactive then incr(limit);
  7539      if limit=start then {previous line was empty}
  7540        print_nl("(Please type a command or say `\end')");
  7541  @.Please type...@>
  7542      print_ln; first:=start;
  7543      prompt_input("*"); {input on-line into |buffer|}
  7544  @.*\relax@>
  7545      limit:=last;
  7546      if end_line_char_inactive then decr(limit)
  7547      else  buffer[limit]:=end_line_char;
  7548      first:=limit+1;
  7549      loc:=start;
  7550      end
  7551    else fatal_error("*** (job aborted, no legal \end found)");
  7552  @.job aborted@>
  7553      {nonstop mode, which is intended for overnight batch processing,
  7554      never waits for on-line input}
  7555    end
  7556  
  7557  @ The global variable |force_eof| is normally |false|; it is set |true|
  7558  by an \.{\\endinput} command.
  7559  
  7560  @<Glob...@>=
  7561  @!force_eof:boolean; {should the next \.{\\input} be aborted early?}
  7562  
  7563  @ @<Read next line of file into |buffer|, or
  7564    |goto restart| if the file has ended@>=
  7565  begin incr(line); first:=start;
  7566  if not force_eof then
  7567    begin if input_ln(cur_file,true) then {not end of file}
  7568      firm_up_the_line {this sets |limit|}
  7569    else force_eof:=true;
  7570    end;
  7571  if force_eof then
  7572    begin print_char(")"); decr(open_parens);
  7573    update_terminal; {show user that file has been read}
  7574    force_eof:=false;
  7575    end_file_reading; {resume previous level}
  7576    check_outer_validity; goto restart;
  7577    end;
  7578  if end_line_char_inactive then decr(limit)
  7579  else  buffer[limit]:=end_line_char;
  7580  first:=limit+1; loc:=start; {ready to read}
  7581  end
  7582  
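Aside, not part of the program: the last two statements above, which also appear in the terminal-input branch, decide whether the line gets a terminating character. A minimal Python sketch follows; the function name |finish_line| is hypothetical, and the buffer is assumed to have a free position at index |last|.

```python
def end_line_char_inactive(end_line_char: int) -> bool:
    """Same test as the macro of the same name."""
    return end_line_char < 0 or end_line_char > 255


def finish_line(buffer: list, last: int, end_line_char: int):
    """Return (limit, first) for a line occupying buffer[start..last-1]."""
    limit = last
    if end_line_char_inactive(end_line_char):
        limit -= 1  # the line ends at its last real character
    else:
        buffer[limit] = end_line_char  # append the terminator
    return limit, limit + 1
```

With an active |end_line_char| of 13, a three-character line ends with |limit=3| and |buffer[3]=13|; with |end_line_char=-1| the same line ends with |limit=2| and the buffer is left untouched.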
  7583  @ If the user has set the |pausing| parameter to some positive value,
  7584  and if nonstop mode has not been selected, each line of input is displayed
  7585  on the terminal and the transcript file, followed by `\.{=>}'.
  7586  \TeX\ waits for a response. If the response is simply |carriage_return|, the
  7587  line is accepted as it stands, otherwise the line typed is
  7588  used instead of the line in the file.
  7589  
  7590  @p procedure firm_up_the_line;
  7591  var k:0..buf_size; {an index into |buffer|}
  7592  begin limit:=last;
  7593  if pausing>0 then if interaction>nonstop_mode then
  7594    begin wake_up_terminal; print_ln;
  7595    if start<limit then for k:=start to limit-1 do print(buffer[k]);
  7596    first:=limit; prompt_input("=>"); {wait for user response}
  7597  @.=>@>
  7598    if last>first then
  7599      begin for k:=first to last-1 do {move line down in buffer}
  7600        buffer[k+start-first]:=buffer[k];
  7601      limit:=start+last-first;
  7602      end;
  7603    end;
  7604  end;
  7605  
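Aside, not part of the program: the buffer shuffle in |firm_up_the_line| can be sketched as follows. The Python function |firm_up| is hypothetical; its |response| argument stands in for whatever |prompt_input| would have read, which is assumed to land in |buffer[first..last-1]| with |first| equal to the old |limit|. Printing and the tests on |pausing| and |interaction| are omitted.

```python
def firm_up(buffer: list, start: int, last: int, response: list) -> int:
    """Return the new limit after an optional replacement line is typed."""
    limit = last
    first = limit
    # Stand-in for prompt_input: the typed codes go just above the old line.
    buffer[first:first + len(response)] = response
    last = first + len(response)
    if last > first:  # a nonempty response replaces the file's line
        for k in range(first, last):  # move line down in buffer
            buffer[k + start - first] = buffer[k]
        limit = start + last - first
    return limit
```

An empty response leaves |limit| and the original line alone, which is the case described above as accepting the line as it stands.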
  7606  @ Since |get_next| is used so frequently in \TeX, it is convenient
  7607  to define three related procedures that do a little more:
  7608  
  7609  \yskip\hang|get_token| not only sets |cur_cmd| and |cur_chr|, it
  7610  also sets |cur_tok|, a packed halfword version of the current token.
  7611  
  7612  \yskip\hang|get_x_token|, meaning ``get an expanded token,'' is like
  7613  |get_token|, but if the current token turns out to be a user-defined
  7614  control sequence (i.e., a macro call), or a conditional,
  7615  or something like \.{\\topmark} or \.{\\expandafter} or \.{\\csname},
  7616  it is eliminated from the input by beginning the expansion of the macro
  7617  or the evaluation of the conditional.
  7618  
  7619  \yskip\hang|x_token| is like |get_x_token| except that it assumes that
  7620  |get_next| has already been called.
  7621  
  7622  \yskip\noindent
  7623  In fact, these three procedures account for almost every use of |get_next|.
  7624  
  7625  @ No new control sequences will be defined except during a call of
  7626  |get_token|, or when \.{\\csname} compresses a token list, because
  7627  |no_new_control_sequence| is always |true| at other times.
  7628  
  7629  @p procedure get_token; {sets |cur_cmd|, |cur_chr|, |cur_tok|}
  7630  begin no_new_control_sequence:=false; get_next; no_new_control_sequence:=true;
  7631  @^inner loop@>
  7632  if cur_cs=0 then cur_tok:=(cur_cmd*@'400)+cur_chr
  7633  else cur_tok:=cs_token_flag+cur_cs;
  7634  end;
  7635  
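Aside, not part of the program: the packing performed by |get_token|, and the inverse unpacking done when |get_next| reads from a token list, can be sketched in Python. The value used for |cs_token_flag| is the one \TeX82 defines elsewhere (|@'7777|) and should be read as an assumption here; the function names are hypothetical.

```python
CS_TOKEN_FLAG = 0o7777  # assumed value of cs_token_flag


def pack_token(cur_cmd: int, cur_chr: int, cur_cs: int) -> int:
    """Build cur_tok the way get_token does."""
    if cur_cs == 0:
        return cur_cmd * 0o400 + cur_chr
    return CS_TOKEN_FLAG + cur_cs


def unpack_token(t: int):
    """Invert pack_token: a control sequence, or a (command, character) pair."""
    if t >= CS_TOKEN_FLAG:
        return ("cs", t - CS_TOKEN_FLAG)
    return ("char", t // 0o400, t % 0o400)
```

A letter \.a, whose command code is 11, therefore packs to 11 times 256 plus 97; any control sequence packs to a value of at least |cs_token_flag|, so a single comparison tells the two kinds apart.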
  7636  @* \[25] Expanding the next token.
  7637  Only a dozen or so command codes |>max_command| can possibly be returned by
  7638  |get_next|; in increasing order, they are |undefined_cs|, |expand_after|,
  7639  |no_expand|, |input|, |if_test|, |fi_or_else|, |cs_name|, |convert|, |the|,
  7640  |top_bot_mark|, |call|, |long_call|, |outer_call|, |long_outer_call|, and
  7641  |end_template|.{\emergencystretch=40pt\par}
  7642  
  7643  The |expand| subroutine is used when |cur_cmd>max_command|. It removes a
  7644  ``call'' or a conditional or one of the other special operations just
  7645  listed.  It follows that |expand| might invoke itself recursively. In all
  7646  cases, |expand| destroys the current token, but it sets things up so that
  7647  the next |get_next| will deliver the appropriate next token. The value of
  7648  |cur_tok| need not be known when |expand| is called.
  7649  
  7650  Since several of the basic scanning routines communicate via global variables,
  7651  their values are saved as local variables of |expand| so that
  7652  recursive calls don't invalidate them.
  7653  @^recursion@>
  7654  
  7655  @p@t\4@>@<Declare the procedure called |macro_call|@>@;@/
  7656  @t\4@>@<Declare the procedure called |insert_relax|@>@;@/
  7657  procedure@?pass_text; forward;@t\2@>
  7658  procedure@?start_input; forward;@t\2@>
  7659  procedure@?conditional; forward;@t\2@>
  7660  procedure@?get_x_token; forward;@t\2@>
  7661  procedure@?conv_toks; forward;@t\2@>
  7662  procedure@?ins_the_toks; forward;@t\2@>
  7663  procedure expand;
  7664  var t:halfword; {token that is being ``expanded after''}
  7665  @!p,@!q,@!r:pointer; {for list manipulation}
  7666  @!j:0..buf_size; {index into |buffer|}
  7667  @!cv_backup:integer; {to save the global quantity |cur_val|}
  7668  @!cvl_backup,@!radix_backup,@!co_backup:small_number;
  7669    {to save |cur_val_level|, etc.}
  7670  @!backup_backup:pointer; {to save |link(backup_head)|}
  7671  @!save_scanner_status:small_number; {temporary storage of |scanner_status|}
  7672  begin cv_backup:=cur_val; cvl_backup:=cur_val_level; radix_backup:=radix;
  7673  co_backup:=cur_order; backup_backup:=link(backup_head);
  7674  if cur_cmd<call then @<Expand a nonmacro@>
  7675  else if cur_cmd<end_template then macro_call
  7676  else @<Insert a token containing |frozen_endv|@>;
  7677  cur_val:=cv_backup; cur_val_level:=cvl_backup; radix:=radix_backup;
  7678  cur_order:=co_backup; link(backup_head):=backup_backup;
  7679  end;
  7680  
  7681  @ @<Expand a nonmacro@>=
  7682  begin if tracing_commands>1 then show_cur_cmd_chr;
  7683  case cur_cmd of
  7684  top_bot_mark:@<Insert the \(a)appropriate mark text into the scanner@>;
  7685  expand_after:@<Expand the token after the next token@>;
  7686  no_expand:@<Suppress expansion of the next token@>;
  7687  cs_name:@<Manufacture a control sequence name@>;
  7688  convert:conv_toks; {this procedure is discussed in Part 27 below}
  7689  the:ins_the_toks; {this procedure is discussed in Part 27 below}
  7690  if_test:conditional; {this procedure is discussed in Part 28 below}
  7691  fi_or_else:@<Terminate the current conditional and skip to \.{\\fi}@>;
  7692  input:@<Initiate or terminate input from a file@>;
  7693  othercases @<Complain about an undefined macro@>
  7694  endcases;
  7695  end
  7696  
  7697  @ It takes only a little shuffling to do what \TeX\ calls \.{\\expandafter}.
  7698  
  7699  @<Expand the token after...@>=
  7700  begin get_token; t:=cur_tok; get_token;
  7701  if cur_cmd>max_command then expand@+else back_input;
  7702  cur_tok:=t; back_input;
  7703  end
  7704  
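Aside, not part of the program: the shuffle can be pictured with a toy model in Python. Everything here is an assumption made for illustration: tokens are strings, |macros| maps a name to a parameterless body, and one step of |expand| is modeled by splicing that body into the input rather than by pushing a new level onto the input stack.

```python
def expand_after(tokens: list, macros: dict) -> list:
    """Return the input as it stands once the expand-after step is done.

    tokens holds the input that follows the expand-after command itself.
    """
    t, u, rest = tokens[0], tokens[1], tokens[2:]
    if u in macros:  # cur_cmd > max_command: expand the second token once
        rest = list(macros[u]) + rest
    else:  # otherwise it is simply put back
        rest = [u] + rest
    return [t] + rest  # finally the saved token t is put back in front
```

So if \.{\\b} expands to the two tokens \.{xy}, the input \.{\\a\\b z} becomes \.{\\a xyz}, with \.{\\a} still unexpanded.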
  7705  @ The implementation of \.{\\noexpand} is a bit trickier, because it is
  7706  necessary to insert a special `|dont_expand|' marker into \TeX's reading
  7707  mechanism.  This special marker is processed by |get_next|, but it does
  7708  not slow down the inner loop.
  7709  
  7710  Since \.{\\outer} macros might arise here, we must also
  7711  clear the |scanner_status| temporarily.
  7712  
  7713  @<Suppress expansion...@>=
  7714  begin save_scanner_status:=scanner_status; scanner_status:=normal;
  7715  get_token; scanner_status:=save_scanner_status; t:=cur_tok;
  7716  back_input; {now |start| and |loc| point to the backed-up token |t|}
  7717  if t>=cs_token_flag then
  7718    begin p:=get_avail; info(p):=cs_token_flag+frozen_dont_expand;
  7719    link(p):=loc; start:=p; loc:=p;
  7720    end;
  7721  end
  7722  
  7723  @ @<Complain about an undefined macro@>=
  7724  begin print_err("Undefined control sequence");
  7725  @.Undefined control sequence@>
  7726  help5("The control sequence at the end of the top line")@/
  7727  ("of your error message was never \def'ed. If you have")@/
  7728  ("misspelled it (e.g., `\hobx'), type `I' and the correct")@/
  7729  ("spelling (e.g., `I\hbox'). Otherwise just continue,")@/
  7730  ("and I'll forget about whatever was undefined.");
  7731  error;
  7732  end
  7733  
  7734  @ The |expand| procedure and some other routines that construct token
  7735  lists find it convenient to use the following macros, which are valid only if
  7736  the variables |p| and |q| are reserved for token-list building.
  7737  
  7738  @d store_new_token(#)==begin q:=get_avail; link(p):=q; info(q):=#;
  7739    p:=q; {|link(p)| is |null|}
  7740    end
  7741  @d fast_store_new_token(#)==begin fast_get_avail(q); link(p):=q; info(q):=#;
  7742    p:=q; {|link(p)| is |null|}
  7743    end
  7744  
  7745  @ @<Manufacture a control...@>=
  7746  begin r:=get_avail; p:=r; {head of the list of characters}
  7747  repeat get_x_token;
  7748  if cur_cs=0 then store_new_token(cur_tok);
  7749  until cur_cs<>0;
  7750  if cur_cmd<>end_cs_name then @<Complain about missing \.{\\endcsname}@>;
  7751  @<Look up the characters of list |r| in the hash table, and set |cur_cs|@>;
  7752  flush_list(r);
  7753  if eq_type(cur_cs)=undefined_cs then
  7754    begin eq_define(cur_cs,relax,256); {N.B.: The |save_stack| might change}
  7755    end; {the control sequence will now match `\.{\\relax}'}
  7756  cur_tok:=cur_cs+cs_token_flag; back_input;
  7757  end
  7758  
  7759  @ @<Complain about missing \.{\\endcsname}@>=
  7760  begin print_err("Missing "); print_esc("endcsname"); print(" inserted");
  7761  @.Missing \\endcsname...@>
  7762  help2("The control sequence marked <to be read again> should")@/
  7763    ("not appear between \csname and \endcsname.");
  7764  back_error;
  7765  end
  7766  
  7767  @ @<Look up the characters of list |r| in the hash table...@>=
  7768  j:=first; p:=link(r);
  7769  while p<>null do
  7770    begin if j>=max_buf_stack then
  7771      begin max_buf_stack:=j+1;
  7772      if max_buf_stack=buf_size then
  7773        overflow("buffer size",buf_size);
  7774  @:TeX capacity exceeded buffer size}{\quad buffer size@>
  7775      end;
  7776    buffer[j]:=info(p) mod @'400; incr(j); p:=link(p);
  7777    end;
  7778  if j>first+1 then
  7779    begin no_new_control_sequence:=false; cur_cs:=id_lookup(first,j-first);
  7780    no_new_control_sequence:=true;
  7781    end
  7782  else if j=first then cur_cs:=null_cs {the list is empty}
  7783  else cur_cs:=single_base+buffer[first] {the list has length one}
  7784  
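Aside, not part of the program: the three-way case analysis on the length of the name can be sketched in Python. The function |lookup_csname| is hypothetical and returns symbolic tags instead of |eqtb| addresses; only the reduction `mod 256', which discards the command code of each character token, is taken directly from the code above.

```python
def lookup_csname(tokens: list):
    """Classify the name spelled by a list of packed character tokens."""
    codes = [t % 0o400 for t in tokens]  # keep the character, drop the command
    if len(codes) > 1:
        # A multiletter name goes through the hash table.
        return ("id_lookup", bytes(codes).decode("latin-1"))
    if len(codes) == 0:
        return ("null_cs",)  # the list is empty
    return ("single_base", codes[0])  # the list has length one
```

Because the command code is dropped, two character tokens with the same character but different category codes contribute the same byte to the name.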
  7785  @ An |end_template| command is effectively changed to an |endv| command
  7786  by the following code. (The reason for this is discussed below; the
  7787  |frozen_end_template| at the end of the template has passed the
  7788  |check_outer_validity| test, so its mission of error detection has been
  7789  accomplished.)
  7790  
  7791  @<Insert a token containing |frozen_endv|@>=
  7792  begin cur_tok:=cs_token_flag+frozen_endv; back_input;
  7793  end
  7794  
  7795  @ The processing of \.{\\input} involves the |start_input| subroutine,
  7796  which will be declared later; the processing of \.{\\endinput} is trivial.
  7797  
  7798  @<Put each...@>=
  7799  primitive("input",input,0);@/
  7800  @!@:input_}{\.{\\input} primitive@>
  7801  primitive("endinput",input,1);@/
  7802  @!@:end_input_}{\.{\\endinput} primitive@>
  7803  
  7804  @ @<Cases of |print_cmd_chr|...@>=
  7805  input: if chr_code=0 then print_esc("input")@+else print_esc("endinput");
  7806  
  7807  @ @<Initiate or terminate input...@>=
  7808  if cur_chr>0 then force_eof:=true
  7809  else if name_in_progress then insert_relax
  7810  else start_input
  7811  
  7812  @ Sometimes the expansion looks too far ahead, so we want to insert
  7813  a harmless \.{\\relax} into the user's input.
  7814  
  7815  @<Declare the procedure called |insert_relax|@>=
  7816  procedure insert_relax;
  7817  begin cur_tok:=cs_token_flag+cur_cs; back_input;
  7818  cur_tok:=cs_token_flag+frozen_relax; back_input; token_type:=inserted;
  7819  end;
  7820  
  7821  @ Here is a recursive procedure that is \TeX's usual way to get the
  7822  next token of input. It has been slightly optimized to take account of
  7823  common cases.
  7824  
  7825  @p procedure get_x_token; {sets |cur_cmd|, |cur_chr|, |cur_tok|,
  7826    and expands macros}
  7827  label restart,done;
  7828  begin restart: get_next;
  7829  @^inner loop@>
  7830  if cur_cmd<=max_command then goto done;
  7831  if cur_cmd>=call then
  7832    if cur_cmd<end_template then macro_call
  7833    else  begin cur_cs:=frozen_endv; cur_cmd:=endv;
  7834      goto done; {|cur_chr=null_list|}
  7835      end
  7836  else expand;
  7837  goto restart;
  7838  done: if cur_cs=0 then cur_tok:=(cur_cmd*@'400)+cur_chr
  7839  else cur_tok:=cs_token_flag+cur_cs;
  7840  end;
  7841  
  7842  @ The |get_x_token| procedure is essentially equivalent to two consecutive
  7843  procedure calls: |get_next; x_token|.
  7844  
  7845  @p procedure x_token; {|get_x_token| without the initial |get_next|}
  7846  begin while cur_cmd>max_command do
  7847    begin expand;
  7848    get_next;
  7849    end;
  7850  if cur_cs=0 then cur_tok:=(cur_cmd*@'400)+cur_chr
  7851  else cur_tok:=cs_token_flag+cur_cs;
  7852  end;
  7853  
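Aside, not part of the program: the loop shared by |get_x_token| and |x_token| can be modeled in Python with a toy input stack. The model is an assumption throughout: tokens are strings, every expandable token is a parameterless macro found in |macros|, and the input is a single list whose last element is the next token to be read.

```python
def get_x_token(stack: list, macros: dict) -> str:
    """Pop tokens, expanding macros, until an unexpandable one appears."""
    while True:
        tok = stack.pop()  # plays the role of get_next
        if tok not in macros:  # cur_cmd <= max_command
            return tok
        # Expansion: the body will be read before the rest of the input.
        stack.extend(reversed(macros[tok]))
```

The sketch raises |IndexError| when the input runs dry, whereas the real routine has further input levels and the terminal to fall back on.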
  7854  @ A control sequence that has been \.{\\def}'ed by the user is expanded by
  7855  \TeX's |macro_call| procedure.
  7856  
  7857  Before we get into the details of |macro_call|, however, let's consider the
  7858  treatment of primitives like \.{\\topmark}, since they are essentially
  7859  macros without parameters. The token lists for such marks are kept in a
  7860  global array of five pointers; we refer to the individual entries of this
  7861  array by symbolic names |top_mark|, etc. The value of |top_mark| is either
  7862  |null| or a pointer to the reference count of a token list.
  7863  
  7864  @d top_mark_code=0 {the mark in effect at the previous page break}
  7865  @d first_mark_code=1 {the first mark between |top_mark| and |bot_mark|}
  7866  @d bot_mark_code=2 {the mark in effect at the current page break}
  7867  @d split_first_mark_code=3 {the first mark found by \.{\\vsplit}}
  7868  @d split_bot_mark_code=4 {the last mark found by \.{\\vsplit}}
  7869  @d top_mark==cur_mark[top_mark_code]
  7870  @d first_mark==cur_mark[first_mark_code]
  7871  @d bot_mark==cur_mark[bot_mark_code]
  7872  @d split_first_mark==cur_mark[split_first_mark_code]
  7873  @d split_bot_mark==cur_mark[split_bot_mark_code]
  7874  
  7875  @<Glob...@>=
  7876  @!cur_mark:array[top_mark_code..split_bot_mark_code] of pointer;
  7877    {token lists for marks}
  7878  
  7879  @ @<Set init...@>=
  7880  top_mark:=null; first_mark:=null; bot_mark:=null;
  7881  split_first_mark:=null; split_bot_mark:=null;
  7882  
  7883  @ @<Put each...@>=
  7884  primitive("topmark",top_bot_mark,top_mark_code);
  7885  @!@:top_mark_}{\.{\\topmark} primitive@>
  7886  primitive("firstmark",top_bot_mark,first_mark_code);
  7887  @!@:first_mark_}{\.{\\firstmark} primitive@>
  7888  primitive("botmark",top_bot_mark,bot_mark_code);
  7889  @!@:bot_mark_}{\.{\\botmark} primitive@>
  7890  primitive("splitfirstmark",top_bot_mark,split_first_mark_code);
  7891  @!@:split_first_mark_}{\.{\\splitfirstmark} primitive@>
  7892  primitive("splitbotmark",top_bot_mark,split_bot_mark_code);
  7893  @!@:split_bot_mark_}{\.{\\splitbotmark} primitive@>
  7894  
  7895  @ @<Cases of |print_cmd_chr|...@>=
  7896  top_bot_mark: case chr_code of
  7897    first_mark_code: print_esc("firstmark");
  7898    bot_mark_code: print_esc("botmark");
  7899    split_first_mark_code: print_esc("splitfirstmark");
  7900    split_bot_mark_code: print_esc("splitbotmark");
  7901    othercases print_esc("topmark")
  7902    endcases;
  7903  
  7904  @ The following code is activated when |cur_cmd=top_bot_mark| and
  7905  when |cur_chr| is a code like |top_mark_code|.
  7906  
  7907  @<Insert the \(a)appropriate mark text into the scanner@>=
  7908  begin if cur_mark[cur_chr]<>null then
  7909    begin_token_list(cur_mark[cur_chr],mark_text);
  7910  end
  7911  
  7912  @ Now let's consider |macro_call| itself, which is invoked when \TeX\ is
  7913  scanning a control sequence whose |cur_cmd| is either |call|, |long_call|,
  7914  |outer_call|, or |long_outer_call|.  The control sequence definition
  7915  appears in the token list whose reference count is in location |cur_chr|
  7916  of |mem|.
  7917  
  7918  The global variable |long_state| will be set to |call| or to |long_call|,
  7919  depending on whether or not the control sequence disallows \.{\\par}
  7920  in its parameters. The |get_next| routine will set |long_state| to
  7921  |outer_call| and emit \.{\\par}, if a file ends or if an \.{\\outer}
  7922  control sequence occurs in the midst of an argument.
  7923  
  7924  @<Glob...@>=
  7925  @!long_state:call..long_outer_call; {governs the acceptance of \.{\\par}}
  7926  
  7927  @ The parameters, if any, must be scanned before the macro is expanded.
  7928  Parameters are token lists without reference counts. They are placed on
  7929  an auxiliary stack called |pstack| while they are being scanned, since
  7930  the |param_stack| may be losing entries during the matching process.
  7931  (Note that |param_stack| can't be gaining entries, since |macro_call| is
  7932  the only routine that puts anything onto |param_stack|, and it
  7933  is not recursive.)
  7934  
  7935  @<Glob...@>=
  7936  @!pstack:array[0..8] of pointer; {arguments supplied to a macro}
  7937  
  7938  @ After parameter scanning is complete, the parameters are moved to the
  7939  |param_stack|. Then the macro body is fed to the scanner; in other words,
  7940  |macro_call| places the defined text of the control sequence at the
  7941  top of\/ \TeX's input stack, so that |get_next| will proceed to read it
  7942  next.
  7943  
  7944  The global variable |cur_cs| contains the |eqtb| address of the control sequence
  7945  being expanded, when |macro_call| begins. If this control sequence has not been
  7946  declared \.{\\long}, i.e., if its command code in the |eq_type| field is
  7947  not |long_call| or |long_outer_call|, its parameters are not allowed to contain
  7948  the control sequence \.{\\par}. If an illegal \.{\\par} appears, the macro
  7949  call is aborted, and the \.{\\par} will be rescanned.
  7950  
  7951  @<Declare the procedure called |macro_call|@>=
  7952  procedure macro_call; {invokes a user-defined control sequence}
  7953  label exit, continue, done, done1, found;
  7954  var r:pointer; {current node in the macro's token list}
  7955  @!p:pointer; {current node in parameter token list being built}
  7956  @!q:pointer; {new node being put into the token list}
  7957  @!s:pointer; {backup pointer for parameter matching}
  7958  @!t:pointer; {cycle pointer for backup recovery}
  7959  @!u,@!v:pointer; {auxiliary pointers for backup recovery}
  7960  @!rbrace_ptr:pointer; {one step before the last |right_brace| token}
  7961  @!n:small_number; {the number of parameters scanned}
  7962  @!unbalance:halfword; {unmatched left braces in current parameter}
  7963  @!m:halfword; {the number of tokens or groups (usually)}
  7964  @!ref_count:pointer; {start of the token list}
  7965  @!save_scanner_status:small_number; {|scanner_status| upon entry}
  7966  @!save_warning_index:pointer; {|warning_index| upon entry}
  7967  @!match_chr:ASCII_code; {character used in parameter}
  7968  begin save_scanner_status:=scanner_status; save_warning_index:=warning_index;
  7969  warning_index:=cur_cs; ref_count:=cur_chr; r:=link(ref_count); n:=0;
  7970  if tracing_macros>0 then @<Show the text of the macro being expanded@>;
  7971  if info(r)<>end_match_token then
  7972    @<Scan the parameters and make |link(r)| point to the macro body; but
  7973      |return| if an illegal \.{\\par} is detected@>;
  7974  @<Feed the macro body and its parameters to the scanner@>;
  7975  exit:scanner_status:=save_scanner_status; warning_index:=save_warning_index;
  7976  end;
  7977  
  7978  @ Before we put a new token list on the input stack, it is wise to clean off
  7979  all token lists that have recently been depleted. Then a user macro that ends
  7980  with a call to itself will not require unbounded stack space.
  7981  
  7982  @<Feed the macro body and its parameters to the scanner@>=
  7983  while (state=token_list)and(loc=null)and(token_type<>v_template) do
  7984    end_token_list; {conserve stack space}
  7985  begin_token_list(ref_count,macro); name:=warning_index; loc:=link(r);
  7986  if n>0 then
  7987    begin if param_ptr+n>max_param_stack then
  7988      begin max_param_stack:=param_ptr+n;
  7989      if max_param_stack>param_size then
  7990        overflow("parameter stack size",param_size);
  7991  @:TeX capacity exceeded parameter stack size}{\quad parameter stack size@>
  7992      end;
  7993    for m:=0 to n-1 do param_stack[param_ptr+m]:=pstack[m];
  7994    param_ptr:=param_ptr+n;
  7995    end
  7996  
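Aside, not part of the program: the effect of cleaning off depleted token lists can be seen in a small Python simulation. The function |run| and its |conserve| switch are hypothetical; each input level is a pair of a token list and a position, the only macro is a parameterless one named |name| with the given |body|, and the special treatment of |v_template| is ignored.

```python
def run(body: list, name: str, steps: int, conserve: bool = True):
    """Read up to `steps` tokens; return (output tokens, maximum stack depth)."""
    stack = [[list(body), 0]]  # each level is [tokens, loc]
    out, max_depth = [], 1
    for _ in range(steps):
        # Reading: finished levels are dropped, then one token is taken.
        while stack and stack[-1][1] >= len(stack[-1][0]):
            stack.pop()
        if not stack:
            break
        level = stack[-1]
        tok = level[0][level[1]]
        level[1] += 1
        if tok == name:  # a macro call
            if conserve:  # clean off depleted lists before pushing
                while stack and stack[-1][1] >= len(stack[-1][0]):
                    stack.pop()
            stack.append([list(body), 0])
            max_depth = max(max_depth, len(stack))
        else:
            out.append(tok)
    return out, max_depth
```

With a body consisting of \.x followed by a call of the macro itself, ten reading steps keep the depth at 1 when |conserve| is true, while the depth reaches 6 when the cleanup is switched off.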
  7997  @ At this point, the reader will find it advisable to review the explanation
  7998  of token list format that was presented earlier, since many aspects of that
  7999  format are of importance chiefly in the |macro_call| routine.
  8000  
  8001  The token list might begin with a string of compulsory tokens before the
  8002  first |match| or |end_match|. In that case the macro name is supposed to be
  8003  followed by those tokens; the following program will set |s=null| to
  8004  represent this restriction. Otherwise |s| will be set to the first token of
  8005  a string that will delimit the next parameter.
  8006  
  8007  @<Scan the parameters and make |link(r)| point to the macro body...@>=
  8008  begin scanner_status:=matching; unbalance:=0;
  8009  long_state:=eq_type(cur_cs);
  8010  if long_state>=outer_call then long_state:=long_state-2;
  8011  repeat link(temp_head):=null;
  8012  if (info(r)>match_token+255)or(info(r)<match_token) then s:=null
  8013  else  begin match_chr:=info(r)-match_token; s:=link(r); r:=s;
  8014    p:=temp_head; m:=0;
  8015    end;
  8016  @<Scan a parameter until its delimiter string has been found; or, if |s=null|,
  8017    simply scan the delimiter string@>;@/
  8018  {now |info(r)| is a token whose command code is either |match| or |end_match|}
  8019  until info(r)=end_match_token;
  8020  end
  8021  
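Aside, not part of the program: subtracting 2 works because the four macro command codes are consecutive, in the order |call|, |long_call|, |outer_call|, |long_outer_call|. A Python sketch with hypothetical names follows; the enumeration starts at 0 for convenience, since only the relative order matters here.

```python
from enum import IntEnum


class Cmd(IntEnum):
    """The four macro command codes, numbered relative to call."""
    CALL = 0
    LONG_CALL = 1
    OUTER_CALL = 2
    LONG_OUTER_CALL = 3


def initial_long_state(eq_type: Cmd) -> Cmd:
    """Drop the `outer' attribute, keeping only the `long' one."""
    if eq_type >= Cmd.OUTER_CALL:
        return Cmd(eq_type - 2)
    return Cmd(eq_type)
```

After this step |long_state| is either |call| or |long_call|, so the later test `|long_state<>long_call|' asks only whether \.{\\par} is forbidden in the arguments.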
  8022  @ If |info(r)| is a |match| or |end_match| command, it cannot be equal to
  8023  any token found by |get_token|. Therefore an undelimited parameter---i.e.,
  8024  a |match| that is immediately followed by |match| or |end_match|---will
  8025  always fail the test `|cur_tok=info(r)|' in the following algorithm.
  8026  
  8027  @<Scan a parameter until its delimiter string has been found; or, ...@>=
  8028  continue: get_token; {set |cur_tok| to the next token of input}
  8029  if cur_tok=info(r) then
  8030    @<Advance \(r)|r|; |goto found| if the parameter delimiter has been
  8031      fully matched, otherwise |goto continue|@>;
  8032  @<Contribute the recently matched tokens to the current parameter, and
  8033    |goto continue| if a partial match is still in effect;
  8034    but abort if |s=null|@>;
  8035  if cur_tok=par_token then if long_state<>long_call then
  8036    @<Report a runaway argument and abort@>;
  8037  if cur_tok<right_brace_limit then
  8038    if cur_tok<left_brace_limit then
  8039      @<Contribute an entire group to the current parameter@>
  8040    else @<Report an extra right brace and |goto continue|@>
  8041  else @<Store the current token, but |goto continue| if it is
  8042     a blank space that would become an undelimited parameter@>;
  8043  incr(m);
  8044  if info(r)>end_match_token then goto continue;
  8045  if info(r)<match_token then goto continue;
  8046  found: if s<>null then @<Tidy up the parameter just scanned, and tuck it away@>
  8047  
  8048  @ @<Store the current token, but |goto continue| if it is...@>=
  8049  begin if cur_tok=space_token then
  8050    if info(r)<=end_match_token then
  8051      if info(r)>=match_token then goto continue;
  8052  store_new_token(cur_tok);
  8053  end
  8054  
  8055  @ A slightly subtle point arises here: When the parameter delimiter ends
  8056  with `\.{\#\{}', the token list will have a left brace both before and
  8057  after the |end_match|\kern-.4pt. Only one of these should affect the
  8058  |align_state|, but both will be scanned, so we must make a correction.
  8059  
  8060  @<Advance \(r)|r|; |goto found| if the parameter delimiter has been fully...@>=
  8061  begin r:=link(r);
  8062  if (info(r)>=match_token)and(info(r)<=end_match_token) then
  8063    begin if cur_tok<left_brace_limit then decr(align_state);
  8064    goto found;
  8065    end
  8066  else goto continue;
  8067  end
  8068  
  8069  @ @<Report an extra right brace and |goto continue|@>=
  8070  begin back_input; print_err("Argument of "); sprint_cs(warning_index);
  8071  @.Argument of \\x has...@>
  8072  print(" has an extra }");
  8073  help6("I've run across a `}' that doesn't seem to match anything.")@/
  8074    ("For example, `\def\a#1{...}' and `\a}' would produce")@/
  8075    ("this error. If you simply proceed now, the `\par' that")@/
  8076    ("I've just inserted will cause me to report a runaway")@/
  8077    ("argument that might be the root of the problem. But if")@/
  8078    ("your `}' was spurious, just type `2' and it will go away.");
  8079  incr(align_state); long_state:=call; cur_tok:=par_token; ins_error;
  8080  goto continue;
  8081  end {a white lie; the \.{\\par} won't always trigger a runaway}
  8082  
  8083  @ If |long_state=outer_call|, a runaway argument has already been reported.
  8084  
  8085  @<Report a runaway argument and abort@>=
  8086  begin if long_state=call then
  8087    begin runaway; print_err("Paragraph ended before ");
  8088  @.Paragraph ended before...@>
  8089    sprint_cs(warning_index); print(" was complete");
  8090    help3("I suspect you've forgotten a `}', causing me to apply this")@/
  8091      ("control sequence to too much text. How can we recover?")@/
  8092      ("My plan is to forget the whole thing and hope for the best.");
  8093    back_error;
  8094    end;
  8095  pstack[n]:=link(temp_head); align_state:=align_state-unbalance;
  8096  for m:=0 to n do flush_list(pstack[m]);
  8097  return;
  8098  end
  8099  
  8100  @ When the following code becomes active, we have matched tokens from |s| to
  8101  the predecessor of |r|, and we have found that |cur_tok<>info(r)|. An
  8102  interesting situation now presents itself: If the parameter is to be
  8103  delimited by a string such as `\.{ab}', and if we have scanned `\.{aa}',
  8104  we want to contribute one `\.a' to the current parameter and resume
  8105  looking for a `\.b'. The program must account for such partial matches and
  8106  for others that can be quite complex.  But most of the time we have |s=r|
  8107  and nothing needs to be done.
  8108  
  8109  Incidentally, it is possible for \.{\\par} tokens to sneak in to certain
  8110  parameters of non-\.{\\long} macros. For example, consider a case like
  8111  `\.{\\def\\a\#1\\par!\{...\}}' where the first \.{\\par} is not followed
  8112  by an exclamation point. In such situations it does not seem appropriate
  8113  to prohibit the \.{\\par}, so \TeX\ keeps quiet about this bending of
  8114  the rules.
  8115  
  8116  @<Contribute the recently matched tokens to the current parameter...@>=
  8117  if s<>r then
  8118    if s=null then @<Report an improper use of the macro and abort@>
  8119    else  begin t:=s;
  8120      repeat store_new_token(info(t)); incr(m); u:=link(t); v:=s;
  8121      loop@+  begin if u=r then
  8122          if cur_tok<>info(v) then goto done
  8123          else  begin r:=link(v); goto continue;
  8124            end;
  8125        if info(u)<>info(v) then goto done;
  8126        u:=link(u); v:=link(v);
  8127        end;
  8128      done: t:=link(t);
  8129      until t=r;
  8130      r:=s; {at this point, no tokens are recently matched}
  8131      end
  8132  
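Aside, not part of the program: the intended outcome of delimiter matching can be stated with a much simpler, slower search. The Python function below is a hypothetical stand-in, not the pointer-based method above. It assumes an input without braces and without any \.{\\par} check, and it rests on the further assumption that, for such input, the backup logic above ends up splitting at the first occurrence of the delimiter string.

```python
def split_at_delimiter(tokens: list, delim: list):
    """Return (parameter, rest) at the first occurrence of delim, else None."""
    n = len(delim)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == delim:
            return tokens[:i], tokens[i + n:]
    return None
```

In the example from the text, a delimiter \.{ab} applied to input beginning \.{aab} yields the one-token parameter \.a: the first \.a is contributed and the second begins the successful match.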
  8133  @ @<Report an improper use...@>=
  8134  begin print_err("Use of "); sprint_cs(warning_index);
  8135  @.Use of x doesn't match...@>
  8136  print(" doesn't match its definition");
  8137  help4("If you say, e.g., `\def\a1{...}', then you must always")@/
  8138    ("put `1' after `\a', since control sequence names are")@/
  8139    ("made up of letters only. The macro here has not been")@/
  8140    ("followed by the required stuff, so I'm ignoring it.");
  8141  error; return;
  8142  end
  8143  
  8144  @ @<Contribute an entire group to the current parameter@>=
  8145  begin unbalance:=1;
  8146  @^inner loop@>
  8147  loop@+  begin fast_store_new_token(cur_tok); get_token;
  8148    if cur_tok=par_token then if long_state<>long_call then
  8149      @<Report a runaway argument and abort@>;
  8150    if cur_tok<right_brace_limit then
  8151      if cur_tok<left_brace_limit then incr(unbalance)
  8152      else  begin decr(unbalance);
  8153        if unbalance=0 then goto done1;
  8154        end;
  8155    end;
  8156  done1: rbrace_ptr:=p; store_new_token(cur_tok);
  8157  end
  8158  
  8159  @ If the parameter consists of a single group enclosed in braces, we must
  8160  strip off the enclosing braces. That's why |rbrace_ptr| was introduced.
  8161  
  8162  @<Tidy up the parameter just scanned, and tuck it away@>=
  8163  begin if (m=1)and(info(p)<right_brace_limit) then
  8164    begin link(rbrace_ptr):=null; free_avail(p);
  8165    p:=link(temp_head); pstack[n]:=link(p); free_avail(p);
  8166    end
  8167  else pstack[n]:=link(temp_head);
  8168  incr(n);
  8169  if tracing_macros>0 then
  8170    begin begin_diagnostic; print_nl(match_chr); print_int(n);
  8171    print("<-"); show_token_list(pstack[n-1],null,1000);
  8172    end_diagnostic(false);
  8173    end;
  8174  end
  8175  
  8176  @ @<Show the text of the macro being expanded@>=
  8177  begin begin_diagnostic; print_ln; print_cs(warning_index);
  8178  token_show(ref_count); end_diagnostic(false);
  8179  end
  8180  
  8181  @* \[26] Basic scanning subroutines.
  8182  Let's turn now to some procedures that \TeX\ calls upon frequently to digest
  8183  certain kinds of patterns in the input. Most of these are quite simple;
  8184  some are quite elaborate. Almost all of the routines call |get_x_token|,
  8185  which can cause them to be invoked recursively.
  8186  @^stomach@>
  8187  @^recursion@>
  8188  
  8189  @ The |scan_left_brace| routine is called when a left brace is supposed to be
  8190  the next non-blank token. (The term ``left brace'' means, more precisely,
  8191  a character whose catcode is |left_brace|.) \TeX\ allows \.{\\relax} to
  8192  appear before the |left_brace|.
  8193  
  8194  @p procedure scan_left_brace; {reads a mandatory |left_brace|}
  8195  begin @<Get the next non-blank non-relax non-call token@>;
  8196  if cur_cmd<>left_brace then
  8197    begin print_err("Missing { inserted");
  8198  @.Missing \{ inserted@>
  8199    help4("A left brace was mandatory here, so I've put one in.")@/
  8200      ("You might want to delete and/or insert some corrections")@/
  8201      ("so that I will find a matching right brace soon.")@/
  8202      ("(If you're confused by all this, try typing `I}' now.)");
  8203    back_error; cur_tok:=left_brace_token+"{"; cur_cmd:=left_brace;
  8204    cur_chr:="{"; incr(align_state);
  8205    end;
  8206  end;
  8207  
  8208  @ @<Get the next non-blank non-relax non-call token@>=
  8209  repeat get_x_token;
  8210  until (cur_cmd<>spacer)and(cur_cmd<>relax)
  8211  
  8212  @ The |scan_optional_equals| routine looks for an optional `\.=' sign preceded
  8213  by optional spaces; `\.{\\relax}' is not ignored here.
  8214  
  8215  @p procedure scan_optional_equals;
  8216  begin  @<Get the next non-blank non-call token@>;
  8217  if cur_tok<>other_token+"=" then back_input;
  8218  end;
  8219  
  8220  @ @<Get the next non-blank non-call token@>=
  8221  repeat get_x_token;
  8222  until cur_cmd<>spacer
  8223  
  8224  @ In case you are getting bored, here is a slightly less trivial routine:
  8225  Given a string of lowercase letters, like `\.{pt}' or `\.{plus}' or
  8226  `\.{width}', the |scan_keyword| routine checks to see whether the next
  8227  tokens of input match this string. The match must be exact, except that
  8228  uppercase letters will match their lowercase counterparts; uppercase
  8229  equivalents are determined by subtracting |"a"-"A"|, rather than using the
  8230  |uc_code| table, since \TeX\ uses this routine only for its own limited
  8231  set of keywords.
  8232  
  8233  If a match is found, the characters are effectively removed from the input
  8234  and |true| is returned. Otherwise |false| is returned, and the input
  8235  is left essentially unchanged (except for the fact that some macros
  8236  may have been expanded, etc.).
  8237  @^inner loop@>
  8238  
  8239  @p function scan_keyword(@!s:str_number):boolean; {look for a given string}
  8240  label exit;
  8241  var p:pointer; {tail of the backup list}
  8242  @!q:pointer; {new node being added to the token list via |store_new_token|}
  8243  @!k:pool_pointer; {index into |str_pool|}
  8244  begin p:=backup_head; link(p):=null; k:=str_start[s];
  8245  while k<str_start[s+1] do
  8246    begin get_x_token; {recursion is possible here}
  8247  @^recursion@>
  8248    if (cur_cs=0)and@|
  8249     ((cur_chr=so(str_pool[k]))or(cur_chr=so(str_pool[k])-"a"+"A")) then
  8250      begin store_new_token(cur_tok); incr(k);
  8251      end
  8252    else if (cur_cmd<>spacer)or(p<>backup_head) then
  8253      begin back_input;
  8254      if p<>backup_head then back_list(link(backup_head));
  8255      scan_keyword:=false; return;
  8256      end;
  8257    end;
  8258  flush_list(link(backup_head)); scan_keyword:=true;
  8259  exit:end;
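@ The backup behavior of |scan_keyword| can be sketched in Python as follows.
This is a hedged illustration under simplifying assumptions, not \TeX's own
code: the input is a plain list of one-character tokens, no macro expansion
takes place, and running out of input is treated as a failed match (an
assumption of this sketch). The list |matched| stands in for the backup list.

```python
def scan_keyword(inp, keyword):
    """Try to consume keyword (lowercase letters) from the front of inp.

    inp is a list of one-character tokens and is modified in place.
    """
    matched = []  # tokens consumed so far (the backup list)
    k = 0
    while k < len(keyword):
        if not inp:  # assumption of this sketch: end of input means failure
            inp[:0] = matched
            return False
        tok = inp.pop(0)
        if tok == keyword[k] or tok == keyword[k].upper():
            matched.append(tok)
            k += 1
        elif tok != " " or matched:
            inp.insert(0, tok)  # put the offending token back
            inp[:0] = matched   # then put the partial match back in front of it
            return False
        # otherwise tok is a leading space, which is silently skipped
    return True
```

Note that leading spaces skipped before a failed match are not restored; this
is one sense in which the input is left only ``essentially'' unchanged.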
  8260  
  8261  @ Here is a procedure that sounds an alarm when mu and non-mu units
  8262  are being switched.
  8263  
  8264  @p procedure mu_error;
  8265  begin print_err("Incompatible glue units");
  8266  @.Incompatible glue units@>
  8267  help1("I'm going to assume that 1mu=1pt when they're mixed.");
  8268  error;
  8269  end;
  8270  
  8271  @ The next routine `|scan_something_internal|' is used to fetch internal
  8272  numeric quantities like `\.{\\hsize}', and also to handle the `\.{\\the}'
  8273  when expanding constructions like `\.{\\the\\toks0}' and
  8274  `\.{\\the\\baselineskip}'. Soon we will be considering the |scan_int|
  8275  procedure, which calls |scan_something_internal|; on the other hand,
  8276  |scan_something_internal| also calls |scan_int|, for constructions like
  8277  `\.{\\catcode\`\\\$}' or `\.{\\fontdimen} \.3 \.{\\ff}'. So we
  8278  have to declare |scan_int| as a |forward| procedure. A few other
  8279  procedures are also declared at this point.
  8280  
  8281  @p procedure@?scan_int; forward; {scans an integer value}
  8282  @t\4\4@>@<Declare procedures that scan restricted classes of integers@>@;
  8283  @t\4\4@>@<Declare procedures that scan font-related stuff@>
  8284  
  8285  @ \TeX\ doesn't know exactly what to expect when |scan_something_internal|
  8286  begins.  For example, an integer or dimension or glue value could occur
  8287  immediately after `\.{\\hskip}'; and one can even say \.{\\the} with
  8288  respect to token lists in constructions like
  8289  `\.{\\xdef\\o\{\\the\\output\}}'.  On the other hand, only integers are
  8290  allowed after a construction like `\.{\\count}'. To handle the various
  8291  possibilities, |scan_something_internal| has a |level| parameter, which
  8292  tells the ``highest'' kind of quantity that |scan_something_internal| is
  8293  allowed to produce. Six levels are distinguished, namely |int_val|,
  8294  |dimen_val|, |glue_val|, |mu_val|, |ident_val|, and |tok_val|.
  8295  
  8296  The output of |scan_something_internal| (and of the other routines
  8297  |scan_int|, |scan_dimen|, and |scan_glue| below) is put into the global
  8298  variable |cur_val|, and its level is put into |cur_val_level|. The highest
  8299  values of |cur_val_level| are special: |mu_val| is used only when
  8300  |cur_val| points to something in a ``muskip'' register, or to one of the
  8301  three parameters \.{\\thinmuskip}, \.{\\medmuskip}, \.{\\thickmuskip};
  8302  |ident_val| is used only when |cur_val| points to a font identifier;
  8303  |tok_val| is used only when |cur_val| points to |null| or to the reference
  8304  count of a token list. The last two cases are allowed only when
  8305  |scan_something_internal| is called with |level=tok_val|.
  8306  
  8307  If the output is glue, |cur_val| will point to a glue specification, and
  8308  the reference count of that glue will have been updated to reflect this
  8309  reference; if the output is a nonempty token list, |cur_val| will point to
  8310  its reference count, but in this case the count will not have been updated.
  8311  Otherwise |cur_val| will contain the integer or scaled value in question.
  8312  
  8313  @d int_val=0 {integer values}
  8314  @d dimen_val=1 {dimension values}
  8315  @d glue_val=2 {glue specifications}
  8316  @d mu_val=3 {math glue specifications}
  8317  @d ident_val=4 {font identifier}
  8318  @d tok_val=5 {token lists}
  8319  
  8320  @<Glob...@>=
  8321  @!cur_val:integer; {value returned by numeric scanners}
  8322  @!cur_val_level:int_val..tok_val; {the ``level'' of this value}
  8323  
  8324  @ The hash table is initialized with `\.{\\count}', `\.{\\dimen}', `\.{\\skip}',
  8325  and `\.{\\muskip}' all having |register| as their command code; they are
  8326  distinguished by the |chr_code|, which is either |int_val|, |dimen_val|,
  8327  |glue_val|, or |mu_val|.
  8328  
  8329  @<Put each...@>=
  8330  primitive("count",register,int_val);
  8331  @!@:count_}{\.{\\count} primitive@>
  8332  primitive("dimen",register,dimen_val);
  8333  @!@:dimen_}{\.{\\dimen} primitive@>
  8334  primitive("skip",register,glue_val);
  8335  @!@:skip_}{\.{\\skip} primitive@>
  8336  primitive("muskip",register,mu_val);
  8337  @!@:mu_skip_}{\.{\\muskip} primitive@>
  8338  
  8339  @ @<Cases of |print_cmd_chr|...@>=
  8340  register: if chr_code=int_val then print_esc("count")
  8341    else if chr_code=dimen_val then print_esc("dimen")
  8342    else if chr_code=glue_val then print_esc("skip")
  8343    else print_esc("muskip");
  8344  
  8345  @ OK, we're ready for |scan_something_internal| itself. A second parameter,
  8346  |negative|, is set |true| if the value that is found should be negated.
  8347  It is assumed that |cur_cmd| and |cur_chr| represent the first token of
  8348  the internal quantity to be scanned; an error will be signalled if
  8349  |cur_cmd<min_internal| or |cur_cmd>max_internal|.
  8350  
  8351  @d scanned_result_end(#)==cur_val_level:=#;@+end
  8352  @d scanned_result(#)==@+begin cur_val:=#;scanned_result_end
  8353  
  8354  @p procedure scan_something_internal(@!level:small_number;@!negative:boolean);
  8355    {fetch an internal parameter}
  8356  var m:halfword; {|chr_code| part of the operand token}
  8357  @!p:0..nest_size; {index into |nest|}
  8358  begin m:=cur_chr;
  8359  case cur_cmd of
  8360  def_code: @<Fetch a character code from some table@>;
  8361  toks_register,assign_toks,def_family,set_font,def_font: @<Fetch a token list or
  8362    font identifier, provided that |level=tok_val|@>;
  8363  assign_int: scanned_result(eqtb[m].int)(int_val);
  8364  assign_dimen: scanned_result(eqtb[m].sc)(dimen_val);
  8365  assign_glue: scanned_result(equiv(m))(glue_val);
  8366  assign_mu_glue: scanned_result(equiv(m))(mu_val);
  8367  set_aux: @<Fetch the |space_factor| or the |prev_depth|@>;
  8368  set_prev_graf: @<Fetch the |prev_graf|@>;
  8369  set_page_int:@<Fetch the |dead_cycles| or the |insert_penalties|@>;
  8370  set_page_dimen: @<Fetch something on the |page_so_far|@>;
  8371  set_shape: @<Fetch the |par_shape| size@>;
  8372  set_box_dimen: @<Fetch a box dimension@>;
  8373  char_given,math_given: scanned_result(cur_chr)(int_val);
  8374  assign_font_dimen: @<Fetch a font dimension@>;
  8375  assign_font_int: @<Fetch a font integer@>;
  8376  register: @<Fetch a register@>;
  8377  last_item: @<Fetch an item in the current node, if appropriate@>;
  8378  othercases @<Complain that \.{\\the} can't do this; give zero result@>
  8379  endcases;@/
  8380  while cur_val_level>level do @<Convert \(c)|cur_val| to a lower level@>;
  8381  @<Fix the reference count, if any, and negate |cur_val| if |negative|@>;
  8382  end;
  8383  
  8384  @ @<Fetch a character code from some table@>=
  8385  begin scan_char_num;
  8386  if m=math_code_base then scanned_result(ho(math_code(cur_val)))(int_val)
  8387  else if m<math_code_base then scanned_result(equiv(m+cur_val))(int_val)
  8388  else scanned_result(eqtb[m+cur_val].int)(int_val);
  8389  end
  8390  
  8391  @ @<Fetch a token list...@>=
  8392  if level<>tok_val then
  8393    begin print_err("Missing number, treated as zero");
  8394  @.Missing number...@>
  8395    help3("A number should have been here; I inserted `0'.")@/
  8396      ("(If you can't figure out why I needed to see a number,")@/
  8397      ("look up `weird error' in the index to The TeXbook.)");
  8398  @:TeXbook}{\sl The \TeX book@>
  8399    back_error; scanned_result(0)(dimen_val);
  8400    end
  8401  else if cur_cmd<=assign_toks then
  8402    begin if cur_cmd<assign_toks then {|cur_cmd=toks_register|}
  8403      begin scan_eight_bit_int; m:=toks_base+cur_val;
  8404      end;
  8405    scanned_result(equiv(m))(tok_val);
  8406    end
  8407  else  begin back_input; scan_font_ident;
  8408    scanned_result(font_id_base+cur_val)(ident_val);
  8409    end
  8410  
  8411  @ Users refer to `\.{\\the\\spacefactor}' only in horizontal
  8412  mode, and to `\.{\\the\\prevdepth}' only in vertical mode; so we put the
  8413  associated mode in the modifier part of the |set_aux| command.
  8414  The |set_page_int| command has modifier 0 or 1, for `\.{\\deadcycles}' and
  8415  `\.{\\insertpenalties}', respectively. The |set_box_dimen| command is
  8416  modified by either |width_offset|, |height_offset|, or |depth_offset|.
  8417  And the |last_item| command is modified by either |int_val|, |dimen_val|,
  8418  |glue_val|, |input_line_no_code|, or |badness_code|.
  8419  
  8420  @d input_line_no_code=glue_val+1 {code for \.{\\inputlineno}}
  8421  @d badness_code=glue_val+2 {code for \.{\\badness}}
  8422  
  8423  @<Put each...@>=
  8424  primitive("spacefactor",set_aux,hmode);
  8425  @!@:space_factor_}{\.{\\spacefactor} primitive@>
  8426  primitive("prevdepth",set_aux,vmode);@/
  8427  @!@:prev_depth_}{\.{\\prevdepth} primitive@>
  8428  primitive("deadcycles",set_page_int,0);
  8429  @!@:dead_cycles_}{\.{\\deadcycles} primitive@>
  8430  primitive("insertpenalties",set_page_int,1);
  8431  @!@:insert_penalties_}{\.{\\insertpenalties} primitive@>
  8432  primitive("wd",set_box_dimen,width_offset);
  8433  @!@:wd_}{\.{\\wd} primitive@>
  8434  primitive("ht",set_box_dimen,height_offset);
  8435  @!@:ht_}{\.{\\ht} primitive@>
  8436  primitive("dp",set_box_dimen,depth_offset);
  8437  @!@:dp_}{\.{\\dp} primitive@>
  8438  primitive("lastpenalty",last_item,int_val);
  8439  @!@:last_penalty_}{\.{\\lastpenalty} primitive@>
  8440  primitive("lastkern",last_item,dimen_val);
  8441  @!@:last_kern_}{\.{\\lastkern} primitive@>
  8442  primitive("lastskip",last_item,glue_val);
  8443  @!@:last_skip_}{\.{\\lastskip} primitive@>
  8444  primitive("inputlineno",last_item,input_line_no_code);
  8445  @!@:input_line_no_}{\.{\\inputlineno} primitive@>
  8446  primitive("badness",last_item,badness_code);
  8447  @!@:badness_}{\.{\\badness} primitive@>
  8448  
  8449  @ @<Cases of |print_cmd_chr|...@>=
  8450  set_aux: if chr_code=vmode then print_esc("prevdepth")
  8451  @+else print_esc("spacefactor");
  8452  set_page_int: if chr_code=0 then print_esc("deadcycles")
  8453  @+else print_esc("insertpenalties");
  8454  set_box_dimen: if chr_code=width_offset then print_esc("wd")
  8455  else if chr_code=height_offset then print_esc("ht")
  8456  else print_esc("dp");
  8457  last_item: case chr_code of
  8458    int_val: print_esc("lastpenalty");
  8459    dimen_val: print_esc("lastkern");
  8460    glue_val: print_esc("lastskip");
  8461    input_line_no_code: print_esc("inputlineno");
  8462    othercases print_esc("badness")
  8463    endcases;
  8464  
  8465  @ @<Fetch the |space_factor| or the |prev_depth|@>=
  8466  if abs(mode)<>m then
  8467    begin print_err("Improper "); print_cmd_chr(set_aux,m);
  8468  @.Improper \\spacefactor@>
  8469  @.Improper \\prevdepth@>
  8470    help4("You can refer to \spacefactor only in horizontal mode;")@/
  8471      ("you can refer to \prevdepth only in vertical mode; and")@/
  8472      ("neither of these is meaningful inside \write. So")@/
  8473      ("I'm forgetting what you said and using zero instead.");
  8474    error;
  8475    if level<>tok_val then scanned_result(0)(dimen_val)
  8476    else scanned_result(0)(int_val);
  8477    end
  8478  else if m=vmode then scanned_result(prev_depth)(dimen_val)
  8479  else scanned_result(space_factor)(int_val)
  8480  
  8481  @ @<Fetch the |dead_cycles| or the |insert_penalties|@>=
  8482  begin if m=0 then cur_val:=dead_cycles@+else cur_val:=insert_penalties;
  8483  cur_val_level:=int_val;
  8484  end
  8485  
  8486  @ @<Fetch a box dimension@>=
  8487  begin scan_eight_bit_int;
  8488  if box(cur_val)=null then cur_val:=0 @+else cur_val:=mem[box(cur_val)+m].sc;
  8489  cur_val_level:=dimen_val;
  8490  end
  8491  
  8492  @ Inside an \.{\\output} routine, a user may wish to look at the page totals
  8493  that were present at the moment when output was triggered.
  8494  
  8495  @d max_dimen==@'7777777777 {$2^{30}-1$}
  8496  
  8497  @<Fetch something on the |page_so_far|@>=
  8498  begin if (page_contents=empty) and (not output_active) then
  8499    if m=0 then cur_val:=max_dimen@+else cur_val:=0
  8500  else cur_val:=page_so_far[m];
  8501  cur_val_level:=dimen_val;
  8502  end
  8503  
  8504  @ @<Fetch the |prev_graf|@>=
  8505  if mode=0 then scanned_result(0)(int_val) {|prev_graf=0| within \.{\\write}}
  8506  else begin nest[nest_ptr]:=cur_list; p:=nest_ptr;
  8507    while abs(nest[p].mode_field)<>vmode do decr(p);
  8508    scanned_result(nest[p].pg_field)(int_val);
  8509    end
  8510  
  8511  @ @<Fetch the |par_shape| size@>=
  8512  begin if par_shape_ptr=null then cur_val:=0
  8513  else cur_val:=info(par_shape_ptr);
  8514  cur_val_level:=int_val;
  8515  end
  8516  
  8517  @ Here is where \.{\\lastpenalty}, \.{\\lastkern}, and \.{\\lastskip} are
  8518  implemented. The reference count for \.{\\lastskip} will be updated later.
  8519  
  8520  We also handle \.{\\inputlineno} and \.{\\badness} here, because they are
  8521  legal in similar contexts.
  8522  
  8523  @<Fetch an item in the current node...@>=
  8524  if cur_chr>glue_val then
  8525    begin if cur_chr=input_line_no_code then cur_val:=line
  8526    else cur_val:=last_badness; {|cur_chr=badness_code|}
  8527    cur_val_level:=int_val;
  8528    end
  8529  else begin if cur_chr=glue_val then cur_val:=zero_glue@+else cur_val:=0;
  8530    cur_val_level:=cur_chr;
  8531    if not is_char_node(tail)and(mode<>0) then
  8532      case cur_chr of
  8533      int_val: if type(tail)=penalty_node then cur_val:=penalty(tail);
  8534      dimen_val: if type(tail)=kern_node then cur_val:=width(tail);
  8535      glue_val: if type(tail)=glue_node then
  8536        begin cur_val:=glue_ptr(tail);
  8537        if subtype(tail)=mu_glue then cur_val_level:=mu_val;
  8538        end;
  8539      end {there are no other cases}
  8540    else if (mode=vmode)and(tail=head) then
  8541      case cur_chr of
  8542      int_val: cur_val:=last_penalty;
  8543      dimen_val: cur_val:=last_kern;
  8544      glue_val: if last_glue<>max_halfword then cur_val:=last_glue;
  8545      end; {there are no other cases}
  8546    end
  8547  
  8548  @ @<Fetch a font dimension@>=
  8549  begin find_font_dimen(false); font_info[fmem_ptr].sc:=0;
  8550  scanned_result(font_info[cur_val].sc)(dimen_val);
  8551  end
  8552  
  8553  @ @<Fetch a font integer@>=
  8554  begin scan_font_ident;
  8555  if m=0 then scanned_result(hyphen_char[cur_val])(int_val)
  8556  else scanned_result(skew_char[cur_val])(int_val);
  8557  end
  8558  
  8559  @ @<Fetch a register@>=
  8560  begin scan_eight_bit_int;
  8561  case m of
  8562  int_val:cur_val:=count(cur_val);
  8563  dimen_val:cur_val:=dimen(cur_val);
  8564  glue_val: cur_val:=skip(cur_val);
  8565  mu_val: cur_val:=mu_skip(cur_val);
  8566  end; {there are no other cases}
  8567  cur_val_level:=m;
  8568  end
  8569  
  8570  @ @<Complain that \.{\\the} can't do this; give zero result@>=
  8571  begin print_err("You can't use `"); print_cmd_chr(cur_cmd,cur_chr);
  8572  @.You can't use x after ...@>
  8573  print("' after "); print_esc("the");
  8574  help1("I'm forgetting what you said and using zero instead.");
  8575  error;
  8576  if level<>tok_val then scanned_result(0)(dimen_val)
  8577  else scanned_result(0)(int_val);
  8578  end
  8579  
  8580  @ When a |glue_val| changes to a |dimen_val|, we use the width component
  8581  of the glue; there is no need to decrease the reference count, since it
  8582  has not yet been increased.  When a |dimen_val| changes to an |int_val|,
  8583  we use scaled points so that the value doesn't actually change. And when a
  8584  |mu_val| changes to a |glue_val|, the value doesn't change either.
  8585  
  8586  @<Convert \(c)|cur_val| to a lower level@>=
  8587  begin if cur_val_level=glue_val then cur_val:=width(cur_val)
  8588  else if cur_val_level=mu_val then mu_error;
  8589  decr(cur_val_level);
  8590  end
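@ The level-lowering loop can be modeled in a few lines of Python. This is a
sketch under stated assumptions: a glue specification is represented by a
dictionary with a |"width"| entry, errors are collected in a list instead of
being reported interactively, and the function name |lower_level| is
hypothetical.

```python
INT_VAL, DIMEN_VAL, GLUE_VAL, MU_VAL = range(4)

def lower_level(value, value_level, level, errors):
    """Coerce value downward until its level does not exceed level."""
    while value_level > level:
        if value_level == GLUE_VAL:
            value = value["width"]  # keep only the natural width
        elif value_level == MU_VAL:
            errors.append("Incompatible glue units")  # mu glue used as glue
        value_level -= 1
    return value, value_level
```

Going from |dimen_val| to |int_val| changes only the level, since the scaled
value is reused as an integer number of sp.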
  8591  
  8592  @ If |cur_val| points to a glue specification at this point, the reference
  8593  count for the glue does not yet include the reference by |cur_val|.
  8594  If |negative| is |true|, |cur_val_level| is known to be |<=mu_val|.
  8595  
  8596  @<Fix the reference count, if any, ...@>=
  8597  if negative then
  8598    if cur_val_level>=glue_val then
  8599      begin cur_val:=new_spec(cur_val);
  8600      @<Negate all three glue components of |cur_val|@>;
  8601      end
  8602    else negate(cur_val)
  8603  else if (cur_val_level>=glue_val)and(cur_val_level<=mu_val) then
  8604    add_glue_ref(cur_val)
  8605  
  8606  @ @<Negate all three...@>=
  8607  begin negate(width(cur_val));
  8608  negate(stretch(cur_val));
  8609  negate(shrink(cur_val));
  8610  end
  8611  
  8612  @ Our next goal is to write the |scan_int| procedure, which scans anything that
  8613  \TeX\ treats as an integer. But first we might as well look at some simple
  8614  applications of |scan_int| that have already been made inside of
  8615  |scan_something_internal|.
  8616  
  8617  @ @<Declare procedures that scan restricted classes of integers@>=
  8618  procedure scan_eight_bit_int;
  8619  begin scan_int;
  8620  if (cur_val<0)or(cur_val>255) then
  8621    begin print_err("Bad register code");
  8622  @.Bad register code@>
  8623    help2("A register number must be between 0 and 255.")@/
  8624      ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
  8625    end;
  8626  end;
  8627  
  8628  @ @<Declare procedures that scan restricted classes of integers@>=
  8629  procedure scan_char_num;
  8630  begin scan_int;
  8631  if (cur_val<0)or(cur_val>255) then
  8632    begin print_err("Bad character code");
  8633  @.Bad character code@>
  8634    help2("A character number must be between 0 and 255.")@/
  8635      ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
  8636    end;
  8637  end;
  8638  
  8639  @ While we're at it, we might as well deal with similar routines that
  8640  will be needed later.
  8641  
  8642  @<Declare procedures that scan restricted classes of integers@>=
  8643  procedure scan_four_bit_int;
  8644  begin scan_int;
  8645  if (cur_val<0)or(cur_val>15) then
  8646    begin print_err("Bad number");
  8647  @.Bad number@>
  8648    help2("Since I expected to read a number between 0 and 15,")@/
  8649      ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
  8650    end;
  8651  end;
  8652  
  8653  @ @<Declare procedures that scan restricted classes of integers@>=
  8654  procedure scan_fifteen_bit_int;
  8655  begin scan_int;
  8656  if (cur_val<0)or(cur_val>@'77777) then
  8657    begin print_err("Bad mathchar");
  8658  @.Bad mathchar@>
  8659    help2("A mathchar number must be between 0 and 32767.")@/
  8660      ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
  8661    end;
  8662  end;
  8663  
  8664  @ @<Declare procedures that scan restricted classes of integers@>=
  8665  procedure scan_twenty_seven_bit_int;
  8666  begin scan_int;
  8667  if (cur_val<0)or(cur_val>@'777777777) then
  8668    begin print_err("Bad delimiter code");
  8669  @.Bad delimiter code@>
  8670    help2("A numeric delimiter code must be between 0 and 2^{27}-1.")@/
  8671      ("I changed this one to zero."); int_error(cur_val); cur_val:=0;
  8672    end;
  8673  end;
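@ The five restricted scanners above share one pattern: scan an integer, and
replace it by zero if it lies outside a fixed range. The Python sketch below
shows that pattern; the helper |restrict| and the table |LIMITS| are
hypothetical names introduced only for illustration, and the error is returned
as a string rather than reported through |int_error|.

```python
def restrict(value, upper, what):
    """Return (value, None) if 0 <= value <= upper, else (0, error message)."""
    if value < 0 or value > upper:
        return 0, "Bad " + what
    return value, None

# Upper limits used by the five scanners, written in octal where the
# source does so.
LIMITS = {
    "register code": 255,
    "character code": 255,
    "number": 15,
    "mathchar": 0o77777,
    "delimiter code": 0o777777777,
}
```

For example, |restrict(256, LIMITS["register code"], "register code")| yields
zero together with the message `Bad register code'.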
  8674  
  8675  @ An integer number can be preceded by any number of spaces and `\.+' or
  8676  `\.-' signs. Then comes either a decimal constant (i.e., radix 10), an
  8677  octal constant (i.e., radix 8, preceded by~\.\'), a hexadecimal constant
  8678  (radix 16, preceded by~\."), an alphabetic constant (preceded by~\.\`), or
  8679  an internal variable. After scanning is complete,
  8680  |cur_val| will contain the answer, which must be at most
  8681  $2^{31}-1=2147483647$ in absolute value. The value of |radix| is set to
  8682  10, 8, or 16 in the cases of decimal, octal, or hexadecimal constants,
  8683  otherwise |radix| is set to zero. An optional space follows a constant.
  8684  
  8685  @d octal_token=other_token+"'" {apostrophe, indicates an octal constant}
  8686  @d hex_token=other_token+"""" {double quote, indicates a hex constant}
  8687  @d alpha_token=other_token+"`" {reverse apostrophe, precedes alpha constants}
  8688  @d point_token=other_token+"." {decimal point}
  8689  @d continental_point_token=other_token+"," {decimal point, Eurostyle}
  8690  
  8691  @<Glob...@>=
  8692  @!radix:small_number; {|scan_int| sets this to 8, 10, 16, or zero}
  8693  
  8694  @ We initialize the following global variables just in case |expand|
  8695  comes into action before any of the basic scanning routines has assigned
  8696  them a value.
  8697  
  8698  @<Set init...@>=
  8699  cur_val:=0; cur_val_level:=int_val; radix:=0; cur_order:=normal;
  8700  
  8701  @ The |scan_int| routine is used also to scan the integer part of a
  8702  fraction; for example, the `\.3' in `\.{3.14159}' will be found by
  8703  |scan_int|. The |scan_dimen| routine assumes that |cur_tok=point_token|
  8704  after the integer part of such a fraction has been scanned by |scan_int|,
  8705  and that the decimal point has been backed up to be scanned again.
  8706  
  8707  @p procedure scan_int; {sets |cur_val| to an integer}
  8708  label done;
  8709  var negative:boolean; {should the answer be negated?}
  8710  @!m:integer; {|@t$2^{31}$@> div radix|, the threshold of danger}
  8711  @!d:small_number; {the digit just scanned}
  8712  @!vacuous:boolean; {have no digits appeared?}
  8713  @!OK_so_far:boolean; {has no error message been issued yet?}
  8714  begin radix:=0; OK_so_far:=true;@/
  8715  @<Get the next non-blank non-sign token; set |negative| appropriately@>;
  8716  if cur_tok=alpha_token then @<Scan an alphabetic character code into |cur_val|@>
  8717  else if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then
  8718    scan_something_internal(int_val,false)
  8719  else @<Scan a numeric constant@>;
  8720  if negative then negate(cur_val);
  8721  end;
  8722  
  8723  @ @<Get the next non-blank non-sign token...@>=
  8724  negative:=false;
  8725  repeat @<Get the next non-blank non-call token@>;
  8726  if cur_tok=other_token+"-" then
  8727    begin negative := not negative; cur_tok:=other_token+"+";
  8728    end;
  8729  until cur_tok<>other_token+"+"
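@ The sign-scanning loop treats each `\.-' as a `\.+' after flipping
|negative|, so any mixture of spaces and signs is consumed. A minimal Python
sketch of that behavior, assuming the input is a list of one-character tokens
and using the hypothetical name |scan_signs|:

```python
def scan_signs(inp):
    """Consume leading spaces and signs from inp; return True if negated."""
    negative = False
    while inp and inp[0] in " +-":
        if inp.pop(0) == "-":
            negative = not negative  # two minus signs cancel
    return negative
```

The first token that is neither a space nor a sign is left at the front of
the list, much as |cur_tok| holds it when the loop in the program ends.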
  8730  
  8731  @ A space is ignored after an alphabetic character constant, so that
  8732  such constants behave like numeric ones.
  8733  
  8734  @<Scan an alphabetic character code into |cur_val|@>=
  8735  begin get_token; {suppress macro expansion}
  8736  if cur_tok<cs_token_flag then
  8737    begin cur_val:=cur_chr;
  8738    if cur_cmd<=right_brace then
  8739      if cur_cmd=right_brace then incr(align_state)
  8740      else decr(align_state);
  8741    end
  8742  else if cur_tok<cs_token_flag+single_base then
  8743    cur_val:=cur_tok-cs_token_flag-active_base
  8744  else cur_val:=cur_tok-cs_token_flag-single_base;
  8745  if cur_val>255 then
  8746    begin print_err("Improper alphabetic constant");
  8747  @.Improper alphabetic constant@>
  8748    help2("A one-character control sequence belongs after a ` mark.")@/
  8749      ("So I'm essentially inserting \0 here.");
  8750    cur_val:="0"; back_error;
  8751    end
  8752  else @<Scan an optional space@>;
  8753  end
  8754  
  8755  @ @<Scan an optional space@>=
  8756  begin get_x_token; if cur_cmd<>spacer then back_input;
  8757  end
  8758  
  8759  @ @<Scan a numeric constant@>=
  8760  begin radix:=10; m:=214748364;
  8761  if cur_tok=octal_token then
  8762    begin radix:=8; m:=@'2000000000; get_x_token;
  8763    end
  8764  else if cur_tok=hex_token then
  8765    begin radix:=16; m:=@'1000000000; get_x_token;
  8766    end;
  8767  vacuous:=true; cur_val:=0;@/
  8768  @<Accumulate the constant until |cur_tok| is not a suitable digit@>;
  8769  if vacuous then @<Express astonishment that no number was here@>
  8770  else if cur_cmd<>spacer then back_input;
  8771  end
  8772  
  8773  @ @d infinity==@'17777777777 {the largest positive value that \TeX\ knows}
  8774  @d zero_token=other_token+"0" {zero, the smallest digit}
  8775  @d A_token=letter_token+"A" {the smallest special hex digit}
  8776  @d other_A_token=other_token+"A" {special hex digit of type |other_char|}
  8777  
  8778  @<Accumulate the constant...@>=
  8779  loop@+  begin if (cur_tok<zero_token+radix)and(cur_tok>=zero_token)and
  8780      (cur_tok<=zero_token+9) then d:=cur_tok-zero_token
  8781    else if radix=16 then
  8782      if (cur_tok<=A_token+5)and(cur_tok>=A_token) then d:=cur_tok-A_token+10
  8783      else if (cur_tok<=other_A_token+5)and(cur_tok>=other_A_token) then
  8784        d:=cur_tok-other_A_token+10
  8785      else goto done
  8786    else goto done;
  8787    vacuous:=false;
  8788    if (cur_val>=m)and((cur_val>m)or(d>7)or(radix<>10)) then
  8789      begin if OK_so_far then
  8790        begin print_err("Number too big");
  8791  @.Number too big@>
  8792        help2("I can only go up to 2147483647='17777777777=""7FFFFFFF,")@/
  8793          ("so I'm using that number instead of yours.");
  8794        error; cur_val:=infinity; OK_so_far:=false;
  8795        end;
  8796      end
  8797    else cur_val:=cur_val*radix+d;
  8798    get_x_token;
  8799    end;
  8800  done:
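@ The overflow test in the loop above is compact, so a small model may help.
The Python sketch below reproduces only the arithmetic: |m| is $2^{31}$ div
|radix|, and a digit is rejected when appending it would exceed
$2^{31}-1$. The function name |accumulate| is hypothetical, the digits are
assumed to be given as a string of valid uppercase digits, and the error
message is reduced to a boolean flag.

```python
INFINITY = 0o17777777777  # 2**31 - 1, the largest value TeX accepts

def accumulate(digits, radix):
    """Return (value, ok) for a digit string in radix 8, 10, or 16."""
    m = 2**31 // radix  # the threshold of danger
    value, ok = 0, True
    for ch in digits:
        d = "0123456789ABCDEF".index(ch)  # raises ValueError on a non-digit
        if d >= radix:
            raise ValueError("digit out of range for this radix")
        if value >= m and (value > m or d > 7 or radix != 10):
            if ok:
                value = INFINITY  # clamp once; later digits leave it alone
                ok = False
        else:
            value = value * radix + d
    return value, ok
```

The case |value=m| can be safe only in decimal, where |m=214748364| and a
final digit of at most 7 still gives a result no larger than 2147483647. In
octal and hexadecimal |m| is a power of two, so |value>=m| always overflows.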
  8801  
  8802  @ @<Express astonishment...@>=
  8803  begin print_err("Missing number, treated as zero");
  8804  @.Missing number...@>
  8805  help3("A number should have been here; I inserted `0'.")@/
  8806    ("(If you can't figure out why I needed to see a number,")@/
  8807    ("look up `weird error' in the index to The TeXbook.)");
  8808  @:TeXbook}{\sl The \TeX book@>
  8809  back_error;
  8810  end
  8811  
  8812  @ The |scan_dimen| routine is similar to |scan_int|, but it sets |cur_val| to
  8813  a |scaled| value, i.e., an integral number of sp. One of its main tasks
  8814  is therefore to interpret the abbreviations for various kinds of units and
  8815  to convert measurements to scaled points.
  8816  
  8817  There are three parameters: |mu| is |true| if the finite units must be
  8818  `\.{mu}', while |mu| is |false| if `\.{mu}' units are disallowed;
  8819  |inf| is |true| if the infinite units `\.{fil}', `\.{fill}', `\.{filll}'
  8820  are permitted; and |shortcut| is |true| if |cur_val| already contains
  8821  an integer and only the units need to be considered.
  8822  
  8823  The order of infinity that was found in the case of infinite glue is returned
  8824  in the global variable |cur_order|.
  8825  
  8826  @<Glob...@>=
  8827  @!cur_order:glue_ord; {order of infinity found by |scan_dimen|}
  8828  
  8829  @ Constructions like `\.{-\'77 pt}' are legal dimensions, so |scan_dimen|
  8830  may begin with |scan_int|. This explains why it is convenient to use
  8831  |scan_int| also for the integer part of a decimal fraction.
  8832  
  8833  Several branches of |scan_dimen| work with |cur_val| as an integer and
  8834  with an auxiliary fraction |f|, so that the actual quantity of interest is
  8835  $|cur_val|+|f|/2^{16}$. At the end of the routine, this ``unpacked''
  8836  representation is put into the single word |cur_val|, which suddenly
  8837  switches significance from |integer| to |scaled|.
  8838  
  8839  @d attach_fraction=88 {go here to pack |cur_val| and |f| into |cur_val|}
  8840  @d attach_sign=89 {go here when |cur_val| is correct except perhaps for sign}
  8841  @d scan_normal_dimen==scan_dimen(false,false,false)
  8842  
  8843  @p procedure scan_dimen(@!mu,@!inf,@!shortcut:boolean);
  8844    {sets |cur_val| to a dimension}
  8845  label done, done1, done2, found, not_found, attach_fraction, attach_sign;
  8846  var negative:boolean; {should the answer be negated?}
  8847  @!f:integer; {numerator of a fraction whose denominator is $2^{16}$}
  8848  @<Local variables for dimension calculations@>@;
  8849  begin f:=0; arith_error:=false; cur_order:=normal; negative:=false;
  8850  if not shortcut then
  8851    begin @<Get the next non-blank non-sign...@>;
  8852    if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then
  8853      @<Fetch an internal dimension and |goto attach_sign|,
  8854        or fetch an internal integer@>
  8855    else  begin back_input;
  8856      if cur_tok=continental_point_token then cur_tok:=point_token;
  8857      if cur_tok<>point_token then scan_int
  8858      else  begin radix:=10; cur_val:=0;
  8859        end;
  8860      if cur_tok=continental_point_token then cur_tok:=point_token;
  8861      if (radix=10)and(cur_tok=point_token) then @<Scan decimal fraction@>;
  8862      end;
  8863    end;
  8864  if cur_val<0 then {in this case |f=0|}
  8865    begin negative := not negative; negate(cur_val);
  8866    end;
  8867  @<Scan units and set |cur_val| to $x\cdot(|cur_val|+f/2^{16})$, where there
  8868    are |x| sp per unit; |goto attach_sign| if the units are internal@>;
  8869  @<Scan an optional space@>;
  8870  attach_sign: if arith_error or(abs(cur_val)>=@'10000000000) then
  8871    @<Report that this dimension is out of range@>;
  8872  if negative then negate(cur_val);
  8873  end;
  8874  
  8875  @ @<Fetch an internal dimension and |goto attach_sign|...@>=
  8876  if mu then
  8877    begin scan_something_internal(mu_val,false);
  8878    @<Coerce glue to a dimension@>;
  8879    if cur_val_level=mu_val then goto attach_sign;
  8880    if cur_val_level<>int_val then mu_error;
  8881    end
  8882  else  begin scan_something_internal(dimen_val,false);
  8883    if cur_val_level=dimen_val then goto attach_sign;
  8884    end
  8885  
  8886  @ @<Local variables for dimension calculations@>=
  8887  @!num,@!denom:1..65536; {conversion ratio for the scanned units}
  8888  @!k,@!kk:small_number; {number of digits in a decimal fraction}
  8889  @!p,@!q:pointer; {top of decimal digit stack}
  8890  @!v:scaled; {an internal dimension}
  8891  @!save_cur_val:integer; {temporary storage of |cur_val|}
  8892  
  8893  @ The following code is executed when |scan_something_internal| was
  8894  called asking for |mu_val|, when we really wanted a ``mudimen'' instead
  8895  of ``muglue.''
  8896  
  8897  @<Coerce glue to a dimension@>=
  8898  if cur_val_level>=glue_val then
  8899    begin v:=width(cur_val); delete_glue_ref(cur_val); cur_val:=v;
  8900    end
  8901  
  8902  @ When the following code is executed, we have |cur_tok=point_token|, but this
  8903  token has been backed up using |back_input|; we must first discard it.
  8904  
  8905  It turns out that a decimal point all by itself is equivalent to `\.{0.0}'.
  8906  Let's hope people don't use that fact.
  8907  
  8908  @<Scan decimal fraction@>=
  8909  begin k:=0; p:=null; get_token; {|point_token| is being re-scanned}
  8910  loop@+  begin get_x_token;
  8911    if (cur_tok>zero_token+9)or(cur_tok<zero_token) then goto done1;
  8912    if k<17 then {digits for |k>=17| cannot affect the result}
  8913      begin q:=get_avail; link(q):=p; info(q):=cur_tok-zero_token;
  8914      p:=q; incr(k);
  8915      end;
  8916    end;
  8917  done1: for kk:=k downto 1 do
  8918    begin dig[kk-1]:=info(p); q:=p; p:=link(p); free_avail(q);
  8919    end;
  8920  f:=round_decimals(k);
  8921  if cur_cmd<>spacer then back_input;
  8922  end
  8923  
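@ The conversion from decimal digits to the fraction |f| can be tried out in
isolation. The following sketch is not part of the program; it assumes that
|round_decimals|, which is defined elsewhere, halves a value that was
accumulated from the least significant digit with one extra bit of
precision. The list argument stands in for the |dig| array.

```python
TWO = 2**17  # twice the scaling unit 2^16, giving one guard bit


def round_decimals(digits):
    """Return f such that f/2^16 approximates the fraction .d0 d1 d2 ...

    At most 17 digits are used, as in the scanning loop above.
    """
    a = 0
    for d in reversed(digits[:17]):  # least significant digit first
        a = (a + d * TWO) // 10
    return (a + 1) // 2  # drop the guard bit, rounding to nearest
```

For instance, the digits of `\.{.4}' give 26214 and those of `\.{.5}' give
32768 under these assumptions.
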
  8924  @ Now comes the harder part: At this point in the program, |cur_val| is a
  8925  nonnegative integer and $f/2^{16}$ is a nonnegative fraction less than 1;
  8926  we want to multiply the sum of these two quantities by the appropriate
  8927  factor, based on the specified units, in order to produce a |scaled|
  8928  result, and we want to do the calculation with fixed point arithmetic that
  8929  does not overflow.
  8930  
  8931  @<Scan units and set |cur_val| to $x\cdot(|cur_val|+f/2^{16})$...@>=
  8932  if inf then @<Scan for \(f)\.{fil} units; |goto attach_fraction| if found@>;
  8933  @<Scan for \(u)units that are internal dimensions;
  8934    |goto attach_sign| with |cur_val| set if found@>;
  8935  if mu then @<Scan for \(m)\.{mu} units and |goto attach_fraction|@>;
  8936  if scan_keyword("true") then @<Adjust \(f)for the magnification ratio@>;
  8937  @.true@>
  8938  if scan_keyword("pt") then goto attach_fraction; {the easy case}
  8939  @.pt@>
  8940  @<Scan for \(a)all other units and adjust |cur_val| and |f| accordingly;
  8941    |goto done| in the case of scaled points@>;
  8942  attach_fraction: if cur_val>=@'40000 then arith_error:=true
  8943  else cur_val:=cur_val*unity+f;
  8944  done:
  8945  
  8946  @ A specification like `\.{filllll}' or `\.{fill L L L}' will lead to two
  8947  error messages (one for each additional keyword \.{"l"}).
  8948  
  8949  @<Scan for \(f)\.{fil} units...@>=
  8950  if scan_keyword("fil") then
  8951  @.fil@>
  8952    begin cur_order:=fil;
  8953    while scan_keyword("l") do
  8954      begin if cur_order=filll then
  8955        begin print_err("Illegal unit of measure (");
  8956  @.Illegal unit of measure@>
  8957        print("replaced by filll)");
  8958        help1("I dddon't go any higher than filll."); error;
  8959        end
  8960      else incr(cur_order);
  8961      end;
  8962    goto attach_fraction;
  8963    end
  8964  
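@ The claim about two error messages can be checked with a small model of the
loop above. This sketch is not part of the program. Its |scan_keyword| is a
simplified, hypothetical stand-in that works on a plain string, skips blanks
before a keyword, and ignores case; the real routine works on tokens. The
numeric codes for |fil|, |fill|, and |filll| are assumed to be 1, 2, and 3.

```python
FIL, FILL, FILLL = 1, 2, 3  # assumed glue orders above "normal" (0)


def scan_keyword(text, pos, word):
    """Simplified stand-in: skip blanks, then match word ignoring case."""
    start = pos
    while start < len(text) and text[start] == " ":
        start += 1
    if text[start:start + len(word)].lower() == word:
        return True, start + len(word)
    return False, pos


def scan_fil_order(text):
    """Return (order, errors) for text beginning with a fil-type unit."""
    found, pos = scan_keyword(text, 0, "fil")
    if not found:
        return None, 0
    order, errors = FIL, 0
    while True:
        found, pos = scan_keyword(text, pos, "l")
        if not found:
            break
        if order == FILLL:
            errors += 1  # "Illegal unit of measure (replaced by filll)"
        else:
            order += 1
    return order, errors
```

Both `\.{filllll}' and `\.{fill L L L}' end with order |filll| and two
reported errors in this model.
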
  8965  @ @<Scan for \(u)units that are internal dimensions...@>=
  8966  save_cur_val:=cur_val;
  8967  @<Get the next non-blank non-call...@>;
  8968  if (cur_cmd<min_internal)or(cur_cmd>max_internal) then back_input
  8969  else  begin if mu then
  8970      begin scan_something_internal(mu_val,false); @<Coerce glue...@>;
  8971      if cur_val_level<>mu_val then mu_error;
  8972      end
  8973    else scan_something_internal(dimen_val,false);
  8974    v:=cur_val; goto found;
  8975    end;
  8976  if mu then goto not_found;
  8977  if scan_keyword("em") then v:=(@<The em width for |cur_font|@>)
  8978  @.em@>
  8979  else if scan_keyword("ex") then v:=(@<The x-height for |cur_font|@>)
  8980  @.ex@>
  8981  else goto not_found;
  8982  @<Scan an optional space@>;
  8983  found:cur_val:=nx_plus_y(save_cur_val,v,xn_over_d(v,f,@'200000));
  8984  goto attach_sign;
  8985  not_found:
  8986  
  8987  @ @<Scan for \(m)\.{mu} units and |goto attach_fraction|@>=
  8988  if scan_keyword("mu") then goto attach_fraction
  8989  @.mu@>
  8990  else  begin print_err("Illegal unit of measure ("); print("mu inserted)");
  8991  @.Illegal unit of measure@>
  8992    help4("The unit of measurement in math glue must be mu.")@/
  8993      ("To recover gracefully from this error, it's best to")@/
  8994      ("delete the erroneous units; e.g., type `2' to delete")@/
  8995      ("two letters. (See Chapter 27 of The TeXbook.)");
  8996  @:TeXbook}{\sl The \TeX book@>
  8997    error; goto attach_fraction;
  8998    end
  8999  
  9000  @ @<Adjust \(f)for the magnification ratio@>=
  9001  begin prepare_mag;
  9002  if mag<>1000 then
  9003    begin cur_val:=xn_over_d(cur_val,1000,mag);
  9004    f:=(1000*f+@'200000*remainder) div mag;
  9005    cur_val:=cur_val+(f div @'200000); f:=f mod @'200000;
  9006    end;
  9007  end
  9008  
  9009  @ The necessary conversion factors can all be specified exactly as
  9010  fractions whose numerator and denominator sum to 32768 or less.
  9011  According to the definitions here, $\rm2660\,dd\approx1000.33297\,mm$;
  9012  this agrees well with the value $\rm1000.333\,mm$ cited by Bosshard
  9013  @^Bosshard, Hans Rudolf@>
  9014  in {\sl Technische Grundlagen zur Satzherstellung\/} (Bern, 1980).
  9015  
  9016  @d set_conversion_end(#)== denom:=#; end
  9017  @d set_conversion(#)==@+begin num:=#; set_conversion_end
  9018  
  9019  @<Scan for \(a)all other units and adjust |cur_val| and |f|...@>=
  9020  if scan_keyword("in") then set_conversion(7227)(100)
  9021  @.in@>
  9022  else if scan_keyword("pc") then set_conversion(12)(1)
  9023  @.pc@>
  9024  else if scan_keyword("cm") then set_conversion(7227)(254)
  9025  @.cm@>
  9026  else if scan_keyword("mm") then set_conversion(7227)(2540)
  9027  @.mm@>
  9028  else if scan_keyword("bp") then set_conversion(7227)(7200)
  9029  @.bp@>
  9030  else if scan_keyword("dd") then set_conversion(1238)(1157)
  9031  @.dd@>
  9032  else if scan_keyword("cc") then set_conversion(14856)(1157)
  9033  @.cc@>
  9034  else if scan_keyword("sp") then goto done
  9035  @.sp@>
  9036  else @<Complain about unknown unit and |goto done2|@>;
  9037  cur_val:=xn_over_d(cur_val,num,denom);
  9038  f:=(num*f+@'200000*remainder) div denom;@/
  9039  cur_val:=cur_val+(f div @'200000); f:=f mod @'200000;
  9040  done2:
  9041  
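@ The fixed-point conversion above, followed by the packing step at
|attach_fraction|, can be sketched as follows. This is an illustration under
stated assumptions and not part of the program: the function name |to_sp| is
hypothetical, |divmod| stands in for |xn_over_d| together with the global
|remainder| (valid here because |cur_val| is nonnegative), and an exception
stands in for setting |arith_error|.

```python
UNITY = 0o200000  # 2^16 scaled points per point

# (num, denom) pairs taken from the set_conversion calls above
CONVERSIONS = {
    "in": (7227, 100), "pc": (12, 1), "cm": (7227, 254), "mm": (7227, 2540),
    "bp": (7227, 7200), "dd": (1238, 1157), "cc": (14856, 1157),
}


def to_sp(cur_val, f, unit):
    """Convert cur_val + f/2^16 units to an integer number of sp."""
    if unit == "sp":
        return cur_val  # "goto done": the fraction f is not attached
    if unit != "pt":
        num, denom = CONVERSIONS[unit]
        cur_val, remainder = divmod(cur_val * num, denom)
        f = (num * f + UNITY * remainder) // denom
        cur_val += f // UNITY
        f %= UNITY
    if cur_val >= 0o40000:  # 2^14 pt or more would not fit
        raise OverflowError("Dimension too large")
    return cur_val * UNITY + f
```

Under these assumptions |to_sp(1,0,"in")| yields 4736286 and
|to_sp(1,0,"bp")| yields 65781.
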
  9042  @ @<Complain about unknown unit...@>=
  9043  begin print_err("Illegal unit of measure ("); print("pt inserted)");
  9044  @.Illegal unit of measure@>
  9045  help6("Dimensions can be in units of em, ex, in, pt, pc,")@/
  9046    ("cm, mm, dd, cc, bp, or sp; but yours is a new one!")@/
  9047    ("I'll assume that you meant to say pt, for printer's points.")@/
  9048    ("To recover gracefully from this error, it's best to")@/
  9049    ("delete the erroneous units; e.g., type `2' to delete")@/
  9050    ("two letters. (See Chapter 27 of The TeXbook.)");
  9051  @:TeXbook}{\sl The \TeX book@>
  9052  error; goto done2;
  9053  end
  9054  
  9055  
  9056  @ @<Report that this dimension is out of range@>=
  9057  begin print_err("Dimension too large");
  9058  @.Dimension too large@>
  9059  help2("I can't work with sizes bigger than about 19 feet.")@/
  9060    ("Continue and I'll use the largest value I can.");@/
  9061  error; cur_val:=max_dimen; arith_error:=false;
  9062  end
  9063  
  9064  @ The final member of \TeX's value-scanning trio is |scan_glue|, which
  9065  makes |cur_val| point to a glue specification. The reference count of that
  9066  glue spec will take account of the fact that |cur_val| is pointing to~it.
  9067  
  9068  The |level| parameter should be either |glue_val| or |mu_val|.
  9069  
  9070  Since |scan_dimen| was so much more complex than |scan_int|, we might expect
  9071  |scan_glue| to be even worse. But fortunately, it is very simple, since
  9072  most of the work has already been done.
  9073  
  9074  @p procedure scan_glue(@!level:small_number);
  9075    {sets |cur_val| to a glue spec pointer}
  9076  label exit;
  9077  var negative:boolean; {should the answer be negated?}
  9078  @!q:pointer; {new glue specification}
  9079  @!mu:boolean; {does |level=mu_val|?}
  9080  begin mu:=(level=mu_val); @<Get the next non-blank non-sign...@>;
  9081  if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then
  9082    begin scan_something_internal(level,negative);
  9083    if cur_val_level>=glue_val then
  9084      begin if cur_val_level<>level then mu_error;
  9085      return;
  9086      end;
  9087    if cur_val_level=int_val then scan_dimen(mu,false,true)
  9088    else if level=mu_val then mu_error;
  9089    end
  9090  else  begin back_input; scan_dimen(mu,false,false);
  9091    if negative then negate(cur_val);
  9092    end;
  9093  @<Create a new glue specification whose width is |cur_val|; scan for its
  9094    stretch and shrink components@>;
  9095  exit:end;
  9096  
  9097  @ @<Create a new glue specification whose width is |cur_val|...@>=
  9098  q:=new_spec(zero_glue); width(q):=cur_val;
  9099  if scan_keyword("plus") then
  9100  @.plus@>
  9101    begin scan_dimen(mu,true,false);
  9102    stretch(q):=cur_val; stretch_order(q):=cur_order;
  9103    end;
  9104  if scan_keyword("minus") then
  9105  @.minus@>
  9106    begin scan_dimen(mu,true,false);
  9107    shrink(q):=cur_val; shrink_order(q):=cur_order;
  9108    end;
  9109  cur_val:=q
  9110  
  9111  @ Here's a similar procedure that returns a pointer to a rule node. This
  9112  routine is called just after \TeX\ has seen \.{\\hrule} or \.{\\vrule};
  9113  therefore |cur_cmd| will be either |hrule| or |vrule|. The idea is to store
  9114  the default rule dimensions in the node, then to override them if
  9115  `\.{height}' or `\.{width}' or `\.{depth}' specifications are
  9116  found (in any order).
  9117  
  9118  @d default_rule=26214 {0.4\thinspace pt}
  9119  
  9120  @p function scan_rule_spec:pointer;
  9121  label reswitch;
  9122  var q:pointer; {the rule node being created}
  9123  begin q:=new_rule; {|width|, |depth|, and |height| all equal |null_flag| now}
  9124  if cur_cmd=vrule then width(q):=default_rule
  9125  else  begin height(q):=default_rule; depth(q):=0;
  9126    end;
  9127  reswitch: if scan_keyword("width") then
  9128  @.width@>
  9129    begin scan_normal_dimen; width(q):=cur_val; goto reswitch;
  9130    end;
  9131  if scan_keyword("height") then
  9132  @.height@>
  9133    begin scan_normal_dimen; height(q):=cur_val; goto reswitch;
  9134    end;
  9135  if scan_keyword("depth") then
  9136  @.depth@>
  9137    begin scan_normal_dimen; depth(q):=cur_val; goto reswitch;
  9138    end;
  9139  scan_rule_spec:=q;
  9140  end;
  9141  
  9142  @* \[27] Building token lists.
  9143  The token lists for macros and for other things like \.{\\mark} and \.{\\output}
  9144  and \.{\\write} are produced by a procedure called |scan_toks|.
  9145  
  9146  Before we get into the details of |scan_toks|, let's consider a much
  9147  simpler task, that of converting the current string into a token list.
  9148  The |str_toks| function does this; it classifies spaces as type |spacer|
  9149  and everything else as type |other_char|.
  9150  
  9151  The token list created by |str_toks| begins at |link(temp_head)| and ends
  9152  at the value |p| that is returned. (If |p=temp_head|, the list is empty.)
  9153  
  9154  @p function str_toks(@!b:pool_pointer):pointer;
  9155    {converts |str_pool[b..pool_ptr-1]| to a token list}
  9156  var p:pointer; {tail of the token list}
  9157  @!q:pointer; {new node being added to the token list via |store_new_token|}
  9158  @!t:halfword; {token being appended}
  9159  @!k:pool_pointer; {index into |str_pool|}
  9160  begin str_room(1);
  9161  p:=temp_head; link(p):=null; k:=b;
  9162  while k<pool_ptr do
  9163    begin t:=so(str_pool[k]);
  9164    if t=" " then t:=space_token
  9165    else t:=other_token+t;
  9166    fast_store_new_token(t);
  9167    incr(k);
  9168    end;
  9169  pool_ptr:=b; str_toks:=p;
  9170  end;
  9171  
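@ The effect of |str_toks| on the character codes can be modelled with a list
in place of the linked token list. The sketch below is not part of the
program; it assumes the token encoding defined earlier, in which a character
token is $2^8$ times the command code plus the character code, so that
|space_token=@'5040| and |other_token=@'6000|.

```python
SPACE_TOKEN = 0o5040  # assumed: spacer (10) * 256 + ord(" ")
OTHER_TOKEN = 0o6000  # assumed: other_char (12) * 256


def str_toks(s):
    """Return the token codes that a string would be converted to."""
    return [SPACE_TOKEN if ch == " " else OTHER_TOKEN + ord(ch) for ch in s]


def cmd_of(tok):
    """Recover the command code of a character token."""
    return tok // 256
```

In this model the letters of a string such as `\.{1 pt}' come out with
command code 12, not as letters.
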
  9172  @ The main reason for wanting |str_toks| is the next function,
  9173  |the_toks|, which has similar input/output characteristics.
  9174  
  9175  This function is supposed to scan something like `\.{\\skip\\count12}',
  9176  i.e., whatever can follow `\.{\\the}', and it constructs a token list
  9177  containing something like `\.{-3.0pt minus 0.5fill}'.
  9178  
  9179  @p function the_toks:pointer;
  9180  var old_setting:0..max_selector; {holds |selector| setting}
  9181  @!p,@!q,@!r:pointer; {used for copying a token list}
  9182  @!b:pool_pointer; {base of temporary string}
  9183  begin get_x_token; scan_something_internal(tok_val,false);
  9184  if cur_val_level>=ident_val then @<Copy the token list@>
  9185  else begin old_setting:=selector; selector:=new_string; b:=pool_ptr;
  9186    case cur_val_level of
  9187    int_val:print_int(cur_val);
  9188    dimen_val:begin print_scaled(cur_val); print("pt");
  9189      end;
  9190    glue_val: begin print_spec(cur_val,"pt"); delete_glue_ref(cur_val);
  9191      end;
  9192    mu_val: begin print_spec(cur_val,"mu"); delete_glue_ref(cur_val);
  9193      end;
  9194    end; {there are no other cases}
  9195    selector:=old_setting; the_toks:=str_toks(b);
  9196    end;
  9197  end;
  9198  
  9199  @ @<Copy the token list@>=
  9200  begin p:=temp_head; link(p):=null;
  9201  if cur_val_level=ident_val then store_new_token(cs_token_flag+cur_val)
  9202  else if cur_val<>null then
  9203    begin r:=link(cur_val); {do not copy the reference count}
  9204    while r<>null do
  9205      begin fast_store_new_token(info(r)); r:=link(r);
  9206      end;
  9207    end;
  9208  the_toks:=p;
  9209  end
  9210  
  9211  @ Here's part of the |expand| subroutine that we are now ready to complete:
  9212  
  9213  @p procedure ins_the_toks;
  9214  begin link(garbage):=the_toks; ins_list(link(temp_head));
  9215  end;
  9216  
  9217  @ The primitives \.{\\number}, \.{\\romannumeral}, \.{\\string}, \.{\\meaning},
  9218  \.{\\fontname}, and \.{\\jobname} are defined as follows.
  9219  
  9220  @d number_code=0 {command code for \.{\\number}}
  9221  @d roman_numeral_code=1 {command code for \.{\\romannumeral}}
  9222  @d string_code=2 {command code for \.{\\string}}
  9223  @d meaning_code=3 {command code for \.{\\meaning}}
  9224  @d font_name_code=4 {command code for \.{\\fontname}}
  9225  @d job_name_code=5 {command code for \.{\\jobname}}
  9226  
  9227  @<Put each...@>=
  9228  primitive("number",convert,number_code);@/
  9229  @!@:number_}{\.{\\number} primitive@>
  9230  primitive("romannumeral",convert,roman_numeral_code);@/
  9231  @!@:roman_numeral_}{\.{\\romannumeral} primitive@>
  9232  primitive("string",convert,string_code);@/
  9233  @!@:string_}{\.{\\string} primitive@>
  9234  primitive("meaning",convert,meaning_code);@/
  9235  @!@:meaning_}{\.{\\meaning} primitive@>
  9236  primitive("fontname",convert,font_name_code);@/
  9237  @!@:font_name_}{\.{\\fontname} primitive@>
  9238  primitive("jobname",convert,job_name_code);@/
  9239  @!@:job_name_}{\.{\\jobname} primitive@>
  9240  
  9241  @ @<Cases of |print_cmd_chr|...@>=
  9242  convert: case chr_code of
  9243    number_code: print_esc("number");
  9244    roman_numeral_code: print_esc("romannumeral");
  9245    string_code: print_esc("string");
  9246    meaning_code: print_esc("meaning");
  9247    font_name_code: print_esc("fontname");
  9248    othercases print_esc("jobname")
  9249    endcases;
  9250  
  9251  @ The procedure |conv_toks| uses |str_toks| to insert the token list
  9252  for |convert| functions into the scanner; `\.{\\outer}' control sequences
  9253  are allowed to follow `\.{\\string}' and `\.{\\meaning}'.
  9254  
  9255  @p procedure conv_toks;
  9256  var old_setting:0..max_selector; {holds |selector| setting}
  9257  @!c:number_code..job_name_code; {desired type of conversion}
  9258  @!save_scanner_status:small_number; {|scanner_status| upon entry}
  9259  @!b:pool_pointer; {base of temporary string}
  9260  begin c:=cur_chr; @<Scan the argument for command |c|@>;
  9261  old_setting:=selector; selector:=new_string; b:=pool_ptr;
  9262  @<Print the result of command |c|@>;
  9263  selector:=old_setting; link(garbage):=str_toks(b); ins_list(link(temp_head));
  9264  end;
  9265  
  9266  @ @<Scan the argument for command |c|@>=
  9267  case c of
  9268  number_code,roman_numeral_code: scan_int;
  9269  string_code, meaning_code: begin save_scanner_status:=scanner_status;
  9270    scanner_status:=normal; get_token; scanner_status:=save_scanner_status;
  9271    end;
  9272  font_name_code: scan_font_ident;
  9273  job_name_code: if job_name=0 then open_log_file;
  9274  end {there are no other cases}
  9275  
  9276  @ @<Print the result of command |c|@>=
  9277  case c of
  9278  number_code: print_int(cur_val);
  9279  roman_numeral_code: print_roman_int(cur_val);
  9280  string_code:if cur_cs<>0 then sprint_cs(cur_cs)
  9281    else print_char(cur_chr);
  9282  meaning_code: print_meaning;
  9283  font_name_code: begin print(font_name[cur_val]);
  9284    if font_size[cur_val]<>font_dsize[cur_val] then
  9285      begin print(" at "); print_scaled(font_size[cur_val]);
  9286      print("pt");
  9287      end;
  9288    end;
  9289  job_name_code: print(job_name);
  9290  end {there are no other cases}
  9291  
  9292  @ Now we can't postpone the difficulties any longer; we must bravely tackle
  9293  |scan_toks|. This function returns a pointer to the tail of a new token
  9294  list, and it also makes |def_ref| point to the reference count at the
  9295  head of that list.
  9296  
  9297  There are two boolean parameters, |macro_def| and |xpand|. If |macro_def|
  9298  is true, the goal is to create the token list for a macro definition;
  9299  otherwise the goal is to create the token list for some other \TeX\
  9300  primitive: \.{\\mark}, \.{\\output}, \.{\\everypar}, \.{\\lowercase},
  9301  \.{\\uppercase}, \.{\\message}, \.{\\errmessage}, \.{\\write}, or
  9302  \.{\\special}. In the latter cases a left brace must be scanned next; this
  9303  left brace will not be part of the token list, nor will the matching right
  9304  brace that comes at the end. If |xpand| is false, the token list will
  9305  simply be copied from the input using |get_token|. Otherwise all expandable
  9306  tokens will be expanded until unexpandable tokens are left, except that
  9307  the results of expanding `\.{\\the}' are not expanded further.
  9308  If both |macro_def| and |xpand| are true, the expansion applies
  9309  only to the macro body (i.e., to the material following the first
  9310  |left_brace| character).
  9311  
  9312  The value of |cur_cs| when |scan_toks| begins should be the |eqtb|
  9313  address of the control sequence to display in ``runaway'' error
  9314  messages.
  9315  
  9316  @p function scan_toks(@!macro_def,@!xpand:boolean):pointer;
  9317  label found,continue,done,done1,done2;
  9318  var t:halfword; {token representing the highest parameter number}
  9319  @!s:halfword; {saved token}
  9320  @!p:pointer; {tail of the token list being built}
  9321  @!q:pointer; {new node being added to the token list via |store_new_token|}
  9322  @!unbalance:halfword; {number of unmatched left braces}
  9323  @!hash_brace:halfword; {possible `\.{\#\{}' token}
  9324  begin if macro_def then scanner_status:=defining
  9325  @+else scanner_status:=absorbing;
  9326  warning_index:=cur_cs; def_ref:=get_avail; token_ref_count(def_ref):=null;
  9327  p:=def_ref; hash_brace:=0; t:=zero_token;
  9328  if macro_def then @<Scan and build the parameter part of the macro definition@>
  9329  else scan_left_brace; {remove the compulsory left brace}
  9330  @<Scan and build the body of the token list; |goto found| when finished@>;
  9331  found: scanner_status:=normal;
  9332  if hash_brace<>0 then store_new_token(hash_brace);
  9333  scan_toks:=p;
  9334  end;
  9335  
  9336  @ @<Scan and build the parameter part...@>=
  9337  begin loop begin continue: get_token; {set |cur_cmd|, |cur_chr|, |cur_tok|}
  9338    if cur_tok<right_brace_limit then goto done1;
  9339    if cur_cmd=mac_param then
  9340      @<If the next character is a parameter number, make |cur_tok|
  9341        a |match| token; but if it is a left brace, store
  9342        `|left_brace|, |end_match|', set |hash_brace|, and |goto done|@>;
  9343    store_new_token(cur_tok);
  9344    end;
  9345  done1: store_new_token(end_match_token);
  9346  if cur_cmd=right_brace then
  9347    @<Express shock at the missing left brace; |goto found|@>;
  9348  done: end
  9349  
  9350  @ @<Express shock...@>=
  9351  begin print_err("Missing { inserted"); incr(align_state);
  9352  @.Missing \{ inserted@>
  9353  help2("Where was the left brace? You said something like `\def\a}',")@/
  9354    ("which I'm going to interpret as `\def\a{}'."); error; goto found;
  9355  end
  9356  
  9357  @ @<If the next character is a parameter number...@>=
  9358  begin s:=match_token+cur_chr; get_token;
  9359  if cur_tok<left_brace_limit then
  9360    begin hash_brace:=cur_tok;
  9361    store_new_token(cur_tok); store_new_token(end_match_token);
  9362    goto done;
  9363    end;
  9364  if t=zero_token+9 then
  9365    begin print_err("You already have nine parameters");
  9366  @.You already have nine...@>
  9367    help2("I'm going to ignore the # sign you just used,")@/
  9368      ("as well as the token that followed it."); error; goto continue;
  9369    end
  9370  else  begin incr(t);
  9371    if cur_tok<>t then
  9372      begin print_err("Parameters must be numbered consecutively");
  9373  @.Parameters...consecutively@>
  9374      help2("I've inserted the digit you should have used after the #.")@/
  9375        ("Type `1' to delete what you did use."); back_error;
  9376      end;
  9377    cur_tok:=s;
  9378    end;
  9379  end
  9380  
  9381  @ @<Scan and build the body of the token list; |goto found| when finished@>=
  9382  unbalance:=1;
  9383  loop@+  begin if xpand then @<Expand the next part of the input@>
  9384    else get_token;
  9385    if cur_tok<right_brace_limit then
  9386      if cur_cmd<right_brace then incr(unbalance)
  9387      else  begin decr(unbalance);
  9388        if unbalance=0 then goto found;
  9389        end
  9390    else if cur_cmd=mac_param then
  9391      if macro_def then @<Look for parameter number or \.{\#\#}@>;
  9392    store_new_token(cur_tok);
  9393    end
  9394  
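@ The brace counting in the loop above can be sketched with tokens
represented as pairs of command code and character. The sketch is not part of
the program: it leaves out expansion and the handling of |mac_param|, the
name |scan_body| is hypothetical, and the command codes 1 and 2 for
|left_brace| and |right_brace| are assumptions.

```python
LEFT_BRACE, RIGHT_BRACE, LETTER = 1, 2, 11  # assumed command codes


def scan_body(tokens):
    """Collect tokens up to the right brace matching an already-scanned left brace.

    Returns (body, rest); the closing brace itself is not stored.
    """
    unbalance = 1  # the opening brace has been consumed by the caller
    body = []
    for i, (cmd, char) in enumerate(tokens):
        if cmd == LEFT_BRACE:
            unbalance += 1
        elif cmd == RIGHT_BRACE:
            unbalance -= 1
            if unbalance == 0:
                return body, tokens[i + 1:]
        body.append((cmd, char))
    raise ValueError("input ended before the closing brace")
```

Inner brace pairs are stored in the body; only the final right brace that
brings |unbalance| down to zero is dropped.
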
  9395  @ Here we insert an entire token list created by |the_toks| without
  9396  expanding it further.
  9397  
  9398  @<Expand the next part of the input@>=
  9399  begin loop begin get_next;
  9400    if cur_cmd<=max_command then goto done2;
  9401    if cur_cmd<>the then expand
  9402    else  begin q:=the_toks;
  9403      if link(temp_head)<>null then
  9404        begin link(p):=link(temp_head); p:=q;
  9405        end;
  9406      end;
  9407    end;
  9408  done2: x_token
  9409  end
  9410  
  9411  @ @<Look for parameter number...@>=
  9412  begin s:=cur_tok;
  9413  if xpand then get_x_token else get_token;
  9414  if cur_cmd<>mac_param then
  9415    if (cur_tok<=zero_token)or(cur_tok>t) then
  9416      begin print_err("Illegal parameter number in definition of ");
  9417  @.Illegal parameter number...@>
  9418      sprint_cs(warning_index);
  9419      help3("You meant to type ## instead of #, right?")@/
  9420      ("Or maybe a } was forgotten somewhere earlier, and things")@/
  9421      ("are all screwed up? I'm going to assume that you meant ##.");
  9422      back_error; cur_tok:=s;
  9423      end
  9424    else cur_tok:=out_param_token-"0"+cur_chr;
  9425  end
  9426  
  9427  @ Another way to create a token list is via the \.{\\read} command. The
  9428  sixteen files potentially usable for reading appear in the following
  9429  global variables. The value of |read_open[n]| will be |closed| if
  9430  stream number |n| has not been opened or if it has been fully read;
  9431  |just_open| if an \.{\\openin} but not a \.{\\read} has been done;
  9432  and |normal| if it is open and ready to read the next line.
  9433  
  9434  @d closed=2 {not open, or at end of file}
  9435  @d just_open=1 {newly opened, first line not yet read}
  9436  
  9437  @<Glob...@>=
  9438  @!read_file:array[0..15] of alpha_file; {used for \.{\\read}}
  9439  @!read_open:array[0..16] of normal..closed; {state of |read_file[n]|}
  9440  
  9441  @ @<Set init...@>=
  9442  for k:=0 to 16 do read_open[k]:=closed;
  9443  
  9444  @ The |read_toks| procedure constructs a token list like that for any
  9445  macro definition, and makes |cur_val| point to it. Parameter |r| points
  9446  to the control sequence that will receive this token list.
  9447  
  9448  @p procedure read_toks(@!n:integer;@!r:pointer);
  9449  label done;
  9450  var p:pointer; {tail of the token list}
  9451  @!q:pointer; {new node being added to the token list via |store_new_token|}
  9452  @!s:integer; {saved value of |align_state|}
  9453  @!m:small_number; {stream number}
  9454  begin scanner_status:=defining; warning_index:=r;
  9455  def_ref:=get_avail; token_ref_count(def_ref):=null;
  9456  p:=def_ref; {the reference count}
  9457  store_new_token(end_match_token);
  9458  if (n<0)or(n>15) then m:=16@+else m:=n;
  9459  s:=align_state; align_state:=1000000; {disable tab marks, etc.}
  9460  repeat @<Input and store tokens from the next line of the file@>;
  9461  until align_state=1000000;
  9462  cur_val:=def_ref; scanner_status:=normal; align_state:=s;
  9463  end;
  9464  
  9465  @ @<Input and store tokens from the next line of the file@>=
  9466  begin_file_reading; name:=m+1;
  9467  if read_open[m]=closed then @<Input for \.{\\read} from the terminal@>
  9468  else if read_open[m]=just_open then @<Input the first line of |read_file[m]|@>
  9469  else @<Input the next line of |read_file[m]|@>;
  9470  limit:=last;
  9471  if end_line_char_inactive then decr(limit)
  9472  else  buffer[limit]:=end_line_char;
  9473  first:=limit+1; loc:=start; state:=new_line;@/
  9474  loop@+  begin get_token;
  9475    if cur_tok=0 then goto done;
  9476      {|cur_cmd=cur_chr=0| will occur at the end of the line}
  9477    if align_state<1000000 then {unmatched `\.\}' aborts the line}
  9478      begin repeat get_token; until cur_tok=0;
  9479      align_state:=1000000; goto done;
  9480      end;
  9481    store_new_token(cur_tok);
  9482    end;
  9483  done: end_file_reading
  9484  
  9485  @ Here we input on-line into the |buffer| array, prompting the user explicitly
  9486  if |n>=0|.  The value of |n| is set negative so that additional prompts
  9487  will not be given in the case of multi-line input.
  9488  
  9489  @<Input for \.{\\read} from the terminal@>=
  9490  if interaction>nonstop_mode then
  9491    if n<0 then prompt_input("")
  9492    else  begin wake_up_terminal;
  9493      print_ln; sprint_cs(r); prompt_input("="); n:=-1;
  9494      end
  9495  else fatal_error("*** (cannot \read from terminal in nonstop modes)")
  9496  @.cannot \\read@>
  9497  
  9498  @ The first line of a file must be treated specially, since |input_ln|
  9499  must be told not to start with |get|.
  9500  @^system dependencies@>
  9501  
  9502  @<Input the first line of |read_file[m]|@>=
  9503  if input_ln(read_file[m],false) then read_open[m]:=normal
  9504  else  begin a_close(read_file[m]); read_open[m]:=closed;
  9505    end
  9506  
  9507  @ An empty line is appended at the end of a |read_file|.
  9508  @^empty line at end of file@>
  9509  
  9510  @<Input the next line of |read_file[m]|@>=
  9511  begin if not input_ln(read_file[m],true) then
  9512    begin a_close(read_file[m]); read_open[m]:=closed;
  9513    if align_state<>1000000 then
  9514      begin runaway;
  9515      print_err("File ended within "); print_esc("read");
  9516  @.File ended within \\read@>
  9517      help1("This \read has unbalanced braces.");
  9518      align_state:=1000000; limit:=0; error;
  9519      end;
  9520    end;
  9521  end
  9522  
  9523  @* \[28] Conditional processing.
  9524  We consider now the way \TeX\ handles various kinds of \.{\\if} commands.
  9525  
  9526  @d if_char_code=0 { `\.{\\if}' }
  9527  @d if_cat_code=1 { `\.{\\ifcat}' }
  9528  @d if_int_code=2 { `\.{\\ifnum}' }
  9529  @d if_dim_code=3 { `\.{\\ifdim}' }
  9530  @d if_odd_code=4 { `\.{\\ifodd}' }
  9531  @d if_vmode_code=5 { `\.{\\ifvmode}' }
  9532  @d if_hmode_code=6 { `\.{\\ifhmode}' }
  9533  @d if_mmode_code=7 { `\.{\\ifmmode}' }
  9534  @d if_inner_code=8 { `\.{\\ifinner}' }
  9535  @d if_void_code=9 { `\.{\\ifvoid}' }
  9536  @d if_hbox_code=10 { `\.{\\ifhbox}' }
  9537  @d if_vbox_code=11 { `\.{\\ifvbox}' }
  9538  @d ifx_code=12 { `\.{\\ifx}' }
  9539  @d if_eof_code=13 { `\.{\\ifeof}' }
  9540  @d if_true_code=14 { `\.{\\iftrue}' }
  9541  @d if_false_code=15 { `\.{\\iffalse}' }
  9542  @d if_case_code=16 { `\.{\\ifcase}' }
  9543  
  9544  @<Put each...@>=
  9545  primitive("if",if_test,if_char_code);
  9546  @!@:if_char_}{\.{\\if} primitive@>
  9547  primitive("ifcat",if_test,if_cat_code);
  9548  @!@:if_cat_code_}{\.{\\ifcat} primitive@>
  9549  primitive("ifnum",if_test,if_int_code);
  9550  @!@:if_int_}{\.{\\ifnum} primitive@>
  9551  primitive("ifdim",if_test,if_dim_code);
  9552  @!@:if_dim_}{\.{\\ifdim} primitive@>
  9553  primitive("ifodd",if_test,if_odd_code);
  9554  @!@:if_odd_}{\.{\\ifodd} primitive@>
  9555  primitive("ifvmode",if_test,if_vmode_code);
  9556  @!@:if_vmode_}{\.{\\ifvmode} primitive@>
  9557  primitive("ifhmode",if_test,if_hmode_code);
  9558  @!@:if_hmode_}{\.{\\ifhmode} primitive@>
  9559  primitive("ifmmode",if_test,if_mmode_code);
  9560  @!@:if_mmode_}{\.{\\ifmmode} primitive@>
  9561  primitive("ifinner",if_test,if_inner_code);
  9562  @!@:if_inner_}{\.{\\ifinner} primitive@>
  9563  primitive("ifvoid",if_test,if_void_code);
  9564  @!@:if_void_}{\.{\\ifvoid} primitive@>
  9565  primitive("ifhbox",if_test,if_hbox_code);
  9566  @!@:if_hbox_}{\.{\\ifhbox} primitive@>
  9567  primitive("ifvbox",if_test,if_vbox_code);
  9568  @!@:if_vbox_}{\.{\\ifvbox} primitive@>
  9569  primitive("ifx",if_test,ifx_code);
  9570  @!@:ifx_}{\.{\\ifx} primitive@>
  9571  primitive("ifeof",if_test,if_eof_code);
  9572  @!@:if_eof_}{\.{\\ifeof} primitive@>
  9573  primitive("iftrue",if_test,if_true_code);
  9574  @!@:if_true_}{\.{\\iftrue} primitive@>
  9575  primitive("iffalse",if_test,if_false_code);
  9576  @!@:if_false_}{\.{\\iffalse} primitive@>
  9577  primitive("ifcase",if_test,if_case_code);
  9578  @!@:if_case_}{\.{\\ifcase} primitive@>
  9579  
  9580  @ @<Cases of |print_cmd_chr|...@>=
  9581  if_test: case chr_code of
  9582    if_cat_code:print_esc("ifcat");
  9583    if_int_code:print_esc("ifnum");
  9584    if_dim_code:print_esc("ifdim");
  9585    if_odd_code:print_esc("ifodd");
  9586    if_vmode_code:print_esc("ifvmode");
  9587    if_hmode_code:print_esc("ifhmode");
  9588    if_mmode_code:print_esc("ifmmode");
  9589    if_inner_code:print_esc("ifinner");
  9590    if_void_code:print_esc("ifvoid");
  9591    if_hbox_code:print_esc("ifhbox");
  9592    if_vbox_code:print_esc("ifvbox");
  9593    ifx_code:print_esc("ifx");
  9594    if_eof_code:print_esc("ifeof");
  9595    if_true_code:print_esc("iftrue");
  9596    if_false_code:print_esc("iffalse");
  9597    if_case_code:print_esc("ifcase");
  9598    othercases print_esc("if")
  9599    endcases;
  9600  
  9601  @ Conditions can be inside conditions, and this nesting has a stack
  9602  that is independent of the |save_stack|.
  9603  
  9604  Four global variables represent the top of the condition stack:
  9605  |cond_ptr| points to pushed-down entries, if any; |if_limit| specifies
  9606  the largest code of a |fi_or_else| command that is syntactically legal;
  9607  |cur_if| is the name of the current type of conditional; and |if_line|
  9608  is the line number at which it began.
  9609  
  9610  If no conditions are currently in progress, the condition stack has the
  9611  special state |cond_ptr=null|, |if_limit=normal|, |cur_if=0|, |if_line=0|.
  9612  Otherwise |cond_ptr| points to a two-word node; the |type|, |subtype|, and
  9613  |link| fields of the first word contain |if_limit|, |cur_if|, and
  9614  |cond_ptr| at the next level, and the second word contains the
  9615  corresponding |if_line|.
  9616  
  9617  @d if_node_size=2 {number of words in stack entry for conditionals}
  9618  @d if_line_field(#)==mem[#+1].int
  9619  @d if_code=1 {code for \.{\\if...} being evaluated}
  9620  @d fi_code=2 {code for \.{\\fi}}
  9621  @d else_code=3 {code for \.{\\else}}
  9622  @d or_code=4 {code for \.{\\or}}
  9623  
  9624  @<Glob...@>=
  9625  @!cond_ptr:pointer; {top of the condition stack}
  9626  @!if_limit:normal..or_code; {upper bound on |fi_or_else| codes}
  9627  @!cur_if:small_number; {type of conditional being worked on}
  9628  @!if_line:integer; {line where that conditional began}
  9629  
  9630  @ @<Set init...@>=
  9631  cond_ptr:=null; if_limit:=normal; cur_if:=0; if_line:=0;
  9632  
  9633  @ @<Put each...@>=
  9634  primitive("fi",fi_or_else,fi_code);
  9635  @!@:fi_}{\.{\\fi} primitive@>
  9636  text(frozen_fi):="fi"; eqtb[frozen_fi]:=eqtb[cur_val];
  9637  primitive("or",fi_or_else,or_code);
  9638  @!@:or_}{\.{\\or} primitive@>
  9639  primitive("else",fi_or_else,else_code);
  9640  @!@:else_}{\.{\\else} primitive@>
  9641  
  9642  @ @<Cases of |print_cmd_chr|...@>=
  9643  fi_or_else: if chr_code=fi_code then print_esc("fi")
  9644    else if chr_code=or_code then print_esc("or")
  9645    else print_esc("else");
  9646  
  9647  @ When we skip conditional text, we keep track of the line number
  9648  where skipping began, for use in error messages.
  9649  
  9650  @<Glob...@>=
  9651  @!skip_line:integer; {skipping began here}
  9652  
  9653  @ Here is a procedure that ignores text until coming to an \.{\\or},
  9654  \.{\\else}, or \.{\\fi} at the current level of $\.{\\if}\ldots\.{\\fi}$
  9655  nesting. After it has acted, |cur_chr| will indicate the token that
  9656  was found, but |cur_tok| will not be set (because this makes the
  9657  procedure run faster).
  9658  
  9659  @p procedure pass_text;
  9660  label done;
  9661  var l:integer; {level of $\.{\\if}\ldots\.{\\fi}$ nesting}
  9662  @!save_scanner_status:small_number; {|scanner_status| upon entry}
  9663  begin save_scanner_status:=scanner_status; scanner_status:=skipping; l:=0;
  9664  skip_line:=line;
  9665  loop@+  begin get_next;
  9666    if cur_cmd=fi_or_else then
  9667      begin if l=0 then goto done;
  9668      if cur_chr=fi_code then decr(l);
  9669      end
  9670    else if cur_cmd=if_test then incr(l);
  9671    end;
  9672  done: scanner_status:=save_scanner_status;
  9673  end;
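
@ The skipping rule of |pass_text| can be illustrated outside of \TeX.
The following is a minimal sketch in Python, not part of the program. It
assumes that tokens have already been classified into the made-up kinds
\.{if}, \.{fi}, \.{else}, and \.{or}, with everything else counting as
ordinary text; the index |pos| and the returned pair are inventions of
the sketch.

```python
IF, FI, ELSE, OR = "if", "fi", "else", "or"  # simplified token kinds (assumed)

def pass_text(tokens, pos):
    """Skip from tokens[pos] to the first fi/else/or at nesting level 0.

    Returns (kind, next_pos): the token found and the index just past it.
    """
    level = 0
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok in (FI, ELSE, OR):
            if level == 0:
                return tok, pos
            if tok == FI:
                level -= 1      # a nested conditional has ended
        elif tok == IF:
            level += 1          # a nested conditional has begun
    raise ValueError("input ended while skipping conditional text")

# The first "else" belongs to the nested conditional and is passed over.
tokens = ["x", IF, "y", ELSE, "z", FI, ELSE, "w", FI]
found = pass_text(tokens, 0)
```

Note that an inner \.{\\else} or \.{\\or} changes nothing; only an inner
\.{\\fi} lowers the level.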
  9674  
  9675  @ When we begin to process a new \.{\\if}, we set |if_limit:=if_code|; then
  9676  if\/ \.{\\or} or \.{\\else} or \.{\\fi} occurs before the current \.{\\if}
  9677  condition has been evaluated, \.{\\relax} will be inserted.
  9678  For example, a sequence of commands like `\.{\\ifvoid1\\else...\\fi}'
  9679  would otherwise require something after the `\.1'.
  9680  
  9681  @<Push the condition stack@>=
  9682  begin p:=get_node(if_node_size); link(p):=cond_ptr; type(p):=if_limit;
  9683  subtype(p):=cur_if; if_line_field(p):=if_line;
  9684  cond_ptr:=p; cur_if:=cur_chr; if_limit:=if_code; if_line:=line;
  9685  end
  9686  
  9687  @ @<Pop the condition stack@>=
  9688  begin p:=cond_ptr; if_line:=if_line_field(p);
  9689  cur_if:=subtype(p); if_limit:=type(p); cond_ptr:=link(p);
  9690  free_node(p,if_node_size);
  9691  end
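
@ The push and pop operations amount to a linked stack whose nodes hold
the three saved values. Here is a hedged Python sketch of that
discipline; the class |CondState| and its tuple-shaped nodes are
assumptions made for illustration, standing in for the two-word nodes
in |mem|.

```python
NORMAL, IF_CODE = 0, 1  # the two if_limit values this sketch needs

class CondState:
    """The four globals; cond_ptr links nodes (if_limit, cur_if, if_line, next)."""

    def __init__(self):
        self.cond_ptr = None    # plays the role of null
        self.if_limit = NORMAL
        self.cur_if = 0
        self.if_line = 0

    def push(self, cur_chr, line):
        # Save the enclosing conditional's state, then start a new one.
        self.cond_ptr = (self.if_limit, self.cur_if, self.if_line, self.cond_ptr)
        self.cur_if, self.if_limit, self.if_line = cur_chr, IF_CODE, line

    def pop(self):
        # Restore the enclosing conditional's state and unlink the node.
        self.if_limit, self.cur_if, self.if_line, self.cond_ptr = self.cond_ptr

s = CondState()
s.push(cur_chr=12, line=3)   # an \ifx that began on line 3
s.push(cur_chr=2, line=5)    # a nested \ifnum that began on line 5
inner = (s.cur_if, s.if_limit, s.if_line)
s.pop()
outer = (s.cur_if, s.if_limit, s.if_line)
s.pop()
```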
  9692  
  9693  @ Here's a procedure that changes the |if_limit| code corresponding to
  9694  a given value of |cond_ptr|.
  9695  
  9696  @p procedure change_if_limit(@!l:small_number;@!p:pointer);
  9697  label exit;
  9698  var q:pointer;
  9699  begin if p=cond_ptr then if_limit:=l {that's the easy case}
  9700  else  begin q:=cond_ptr;
  9701    loop@+  begin if q=null then confusion("if");
  9702  @:this can't happen if}{\quad if@>
  9703      if link(q)=p then
  9704        begin type(q):=l; return;
  9705        end;
  9706      q:=link(q);
  9707      end;
  9708    end;
  9709  exit:end;
  9710  
  9711  @ A condition is started when the |expand| procedure encounters
  9712  an |if_test| command; in that case |expand| reduces to |conditional|,
  9713  which is a recursive procedure.
  9714  @^recursion@>
  9715  
  9716  @p procedure conditional;
  9717  label exit,common_ending;
  9718  var b:boolean; {is the condition true?}
  9719  @!r:"<"..">"; {relation to be evaluated}
  9720  @!m,@!n:integer; {to be tested against the second operand}
  9721  @!p,@!q:pointer; {for traversing token lists in \.{\\ifx} tests}
  9722  @!save_scanner_status:small_number; {|scanner_status| upon entry}
  9723  @!save_cond_ptr:pointer; {|cond_ptr| corresponding to this conditional}
  9724  @!this_if:small_number; {type of this conditional}
  9725  begin @<Push the condition stack@>;@+save_cond_ptr:=cond_ptr;this_if:=cur_chr;@/
  9726  @<Either process \.{\\ifcase} or set |b| to the value of a boolean condition@>;
  9727  if tracing_commands>1 then @<Display the value of |b|@>;
  9728  if b then
  9729    begin change_if_limit(else_code,save_cond_ptr);
  9730    return; {wait for \.{\\else} or \.{\\fi}}
  9731    end;
  9732  @<Skip to \.{\\else} or \.{\\fi}, then |goto common_ending|@>;
  9733  common_ending: if cur_chr=fi_code then @<Pop the condition stack@>
  9734  else if_limit:=fi_code; {wait for \.{\\fi}}
  9735  exit:end;
  9736  
  9737  @ In a construction like `\.{\\if\\iftrue abc\\else d\\fi}', the first
  9738  \.{\\else} that we come to after learning that the \.{\\if} is false is
  9739  not the \.{\\else} we're looking for. Hence the following curious
  9740  logic is needed.
  9741  
  9742  @ @<Skip to \.{\\else} or \.{\\fi}...@>=
  9743  loop@+  begin pass_text;
  9744    if cond_ptr=save_cond_ptr then
  9745      begin if cur_chr<>or_code then goto common_ending;
  9746      print_err("Extra "); print_esc("or");
  9747  @.Extra \\or@>
  9748      help1("I'm ignoring this; it doesn't match any \if.");
  9749      error;
  9750      end
  9751    else if cur_chr=fi_code then @<Pop the condition stack@>;
  9752    end
  9753  
  9754  @ @<Either process \.{\\ifcase} or set |b|...@>=
  9755  case this_if of
  9756  if_char_code, if_cat_code: @<Test if two characters match@>;
  9757  if_int_code, if_dim_code: @<Test relation between integers or dimensions@>;
  9758  if_odd_code: @<Test if an integer is odd@>;
  9759  if_vmode_code: b:=(abs(mode)=vmode);
  9760  if_hmode_code: b:=(abs(mode)=hmode);
  9761  if_mmode_code: b:=(abs(mode)=mmode);
  9762  if_inner_code: b:=(mode<0);
  9763  if_void_code, if_hbox_code, if_vbox_code: @<Test box register status@>;
  9764  ifx_code: @<Test if two tokens match@>;
  9765  if_eof_code: begin scan_four_bit_int; b:=(read_open[cur_val]=closed);
  9766    end;
  9767  if_true_code: b:=true;
  9768  if_false_code: b:=false;
  9769  if_case_code: @<Select the appropriate case
  9770    and |return| or |goto common_ending|@>;
  9771  end {there are no other cases}
  9772  
  9773  @ @<Display the value of |b|@>=
  9774  begin begin_diagnostic;
  9775  if b then print("{true}")@+else print("{false}");
  9776  end_diagnostic(false);
  9777  end
  9778  
  9779  @ Here we use the fact that |"<"|, |"="|, and |">"| are consecutive ASCII
  9780  codes.
  9781  @^ASCII code@>
  9782  
  9783  @<Test relation between integers or dimensions@>=
  9784  begin if this_if=if_int_code then scan_int@+else scan_normal_dimen;
  9785  n:=cur_val; @<Get the next non-blank non-call...@>;
  9786  if (cur_tok>=other_token+"<")and(cur_tok<=other_token+">") then
  9787    r:=cur_tok-other_token
  9788  else  begin print_err("Missing = inserted for ");
  9789  @.Missing = inserted@>
  9790    print_cmd_chr(if_test,this_if);
  9791    help1("I was expecting to see `<', `=', or `>'. Didn't.");
  9792    back_error; r:="=";
  9793    end;
  9794  if this_if=if_int_code then scan_int@+else scan_normal_dimen;
  9795  case r of
  9796  "<": b:=(n<cur_val);
  9797  "=": b:=(n=cur_val);
  9798  ">": b:=(n>cur_val);
  9799  end;
  9800  end
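
@ Because the three relation characters have adjacent codes, a single
range check suffices to recognize them. The Python sketch below shows
the idea under a simplifying assumption: an unrecognized relation is
just read as `\.=', whereas \TeX\ also reports an error and backs up
the offending token. The function name |compare| is hypothetical.

```python
def compare(n, rel, m):
    """Evaluate n rel m, where rel should be one of '<', '=', '>'."""
    # '<', '=', '>' are ASCII 60, 61, 62, so one range test validates rel.
    if not (ord("<") <= ord(rel) <= ord(">")):
        rel = "="               # recovery: behave as if '=' had been present
    if rel == "<":
        return n < m
    if rel == "=":
        return n == m
    return n > m
```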
  9801  
  9802  @ @<Test if an integer is odd@>=
  9803  begin scan_int; b:=odd(cur_val);
  9804  end
  9805  
  9806  @ @<Test box register status@>=
  9807  begin scan_eight_bit_int; p:=box(cur_val);
  9808  if this_if=if_void_code then b:=(p=null)
  9809  else if p=null then b:=false
  9810  else if this_if=if_hbox_code then b:=(type(p)=hlist_node)
  9811  else b:=(type(p)=vlist_node);
  9812  end
  9813  
  9814  @ An active character will be treated as category 13 following
  9815  \.{\\if\\noexpand} or following \.{\\ifcat\\noexpand}. We use the fact that
  9816  active characters have the smallest tokens, among all control sequences.
  9817  
  9818  @d get_x_token_or_active_char==@t@>@;
  9819    begin get_x_token;
  9820    if cur_cmd=relax then if cur_chr=no_expand_flag then
  9821      begin cur_cmd:=active_char;
  9822      cur_chr:=cur_tok-cs_token_flag-active_base;
  9823      end;
  9824    end
  9825  
  9826  @<Test if two characters match@>=
  9827  begin get_x_token_or_active_char;
  9828  if (cur_cmd>active_char)or(cur_chr>255) then {not a character}
  9829    begin m:=relax; n:=256;
  9830    end
  9831  else  begin m:=cur_cmd; n:=cur_chr;
  9832    end;
  9833  get_x_token_or_active_char;
  9834  if (cur_cmd>active_char)or(cur_chr>255) then
  9835    begin cur_cmd:=relax; cur_chr:=256;
  9836    end;
  9837  if this_if=if_char_code then b:=(n=cur_chr)@+else b:=(m=cur_cmd);
  9838  end
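
@ The effect of mapping every non-character to the pair (|relax|,\,256)
is that any two non-characters agree with each other under both
\.{\\if} and \.{\\ifcat}, and never agree with a character. A small
Python sketch of this comparison follows; it leaves out the
\.{\\noexpand} case, and the names |as_char| and |chars_match| as well
as the two command codes above 13 are made up for the illustration.

```python
RELAX, ACTIVE_CHAR = 0, 13  # command codes of relax and of active characters

def as_char(cmd, chr_code):
    """Return (cmd, chr_code) for a character, else the pair (RELAX, 256)."""
    if cmd > ACTIVE_CHAR or chr_code > 255:
        return RELAX, 256
    return cmd, chr_code

def chars_match(first, second, by_category):
    """Compare category codes if by_category is true, else character codes."""
    m, n = as_char(*first)
    cmd, c = as_char(*second)
    return m == cmd if by_category else n == c

LETTER = 11
a, b = (LETTER, ord("a")), (LETTER, ord("b"))
cs1, cs2 = (40, 0), (41, 7)  # two non-character commands; these codes are invented
```

Thus |chars_match(a,b,False)| is false while |chars_match(a,b,True)| is
true, and the two control sequences match in both senses.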
  9839  
  9840  @ Note that `\.{\\ifx}' will declare two macros different if one is \\{long}
  9841  or \\{outer} and the other isn't, even though the texts of the macros are
  9842  the same.
  9843  
  9844  We need to reset |scanner_status|, since \.{\\outer} control sequences
  9845  are allowed, but we might be scanning a macro definition or preamble.
  9846  
  9847  @<Test if two tokens match@>=
  9848  begin save_scanner_status:=scanner_status; scanner_status:=normal;
  9849  get_next; n:=cur_cs; p:=cur_cmd; q:=cur_chr;
  9850  get_next; if cur_cmd<>p then b:=false
  9851  else if cur_cmd<call then b:=(cur_chr=q)
  9852  else @<Test if two macro texts match@>;
  9853  scanner_status:=save_scanner_status;
  9854  end
  9855  
  9856  @ Note also that `\.{\\ifx}' decides that macros \.{\\a} and \.{\\b} are
  9857  different in examples like this:
  9858  $$\vbox{\halign{\.{#}\hfil&\qquad\.{#}\hfil\cr
  9859    {}\\def\\a\{\\c\}&
  9860    {}\\def\\c\{\}\cr
  9861    {}\\def\\b\{\\d\}&
  9862    {}\\def\\d\{\}\cr}}$$
  9863  
  9864  @<Test if two macro texts match@>=
  9865  begin p:=link(cur_chr); q:=link(equiv(n)); {omit reference counts}
  9866  if p=q then b:=true
  9867  else begin while (p<>null)and(q<>null) do
  9868      if info(p)<>info(q) then p:=null
  9869      else  begin p:=link(p); q:=link(q);
  9870        end;
  9871    b:=((p=null)and(q=null));
  9872    end;
  9873  end
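
@ The comparison is shallow: token lists are compared element by
element, and a control sequence inside a macro text is never expanded.
The following Python sketch replays the \.{\\a}, \.{\\b} example under
the assumption that each macro text is a plain list of token names; the
dictionary |defs| and the function |same_text| are hypothetical.

```python
def same_text(p, q):
    """Walk two token lists in step; they match only if both end together."""
    i = 0
    while i < len(p) and i < len(q):
        if p[i] != q[i]:
            return False
        i += 1
    return len(p) == len(q)

defs = {
    "a": ["\\c"],   # \def\a{\c}
    "b": ["\\d"],   # \def\b{\d}
    "c": [],        # \def\c{}
    "d": [],        # \def\d{}
}
a_equals_b = same_text(defs["a"], defs["b"])
c_equals_d = same_text(defs["c"], defs["d"])
```

Here \.{\\c} and \.{\\d} are found equal, but \.{\\a} and \.{\\b} are
not, since the tokens \.{\\c} and \.{\\d} themselves differ.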
  9874  
  9875  @ @<Select the appropriate case and |return| or |goto common_ending|@>=
  9876  begin scan_int; n:=cur_val; {|n| is the number of cases to pass}
  9877  if tracing_commands>1 then
  9878    begin begin_diagnostic; print("{case "); print_int(n); print_char("}");
  9879    end_diagnostic(false);
  9880    end;
  9881  while n<>0 do
  9882    begin pass_text;
  9883    if cond_ptr=save_cond_ptr then
  9884      if cur_chr=or_code then decr(n)
  9885      else goto common_ending
  9886    else if cur_chr=fi_code then @<Pop the condition stack@>;
  9887    end;
  9888  change_if_limit(or_code,save_cond_ptr);
  9889  return; {wait for \.{\\or}, \.{\\else}, or \.{\\fi}}
  9890  end
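
@ Case selection simply counts \.{\\or} tokens at the outermost level
until |n| reaches zero; a negative |n| never does, so it falls through
to \.{\\else} or \.{\\fi}. A hedged Python sketch of the counting
follows. It assumes a well-formed token list that ends with the
matching \.{fi}; the function |select_case| and its return convention
are inventions of the sketch.

```python
def select_case(body, n):
    """body: tokens after the case number, through the matching 'fi'.

    Returns the index at which expansion would resume, or None if the
    conditional ended without selecting any text.
    """
    pos, level = 0, 0
    while n != 0:
        tok = body[pos]
        pos += 1
        if tok == "if":
            level += 1
        elif tok in ("or", "else", "fi"):
            if level > 0:
                if tok == "fi":
                    level -= 1
            elif tok == "or":
                n -= 1              # one more case has been passed
            elif tok == "else":
                return pos          # no such case: take the else text
            else:
                return None         # reached fi: nothing is selected
    return pos

body = ["A", "or", "B", "or", "C", "else", "D", "fi"]
```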
  9891  
  9892  @ The processing of conditionals is complete except for the following
  9893  code, which is actually part of |expand|. It comes into play when
  9894  \.{\\or}, \.{\\else}, or \.{\\fi} is scanned.
  9895  
  9896  @<Terminate the current conditional and skip to \.{\\fi}@>=
  9897  if cur_chr>if_limit then
  9898    if if_limit=if_code then insert_relax {condition not yet evaluated}
  9899    else  begin print_err("Extra "); print_cmd_chr(fi_or_else,cur_chr);
  9900  @.Extra \\or@>
  9901  @.Extra \\else@>
  9902  @.Extra \\fi@>
  9903      help1("I'm ignoring this; it doesn't match any \if.");
  9904      error;
  9905      end
  9906  else  begin while cur_chr<>fi_code do pass_text; {skip to \.{\\fi}}
  9907    @<Pop the condition stack@>;
  9908    end
  9909  
  9910  @* \[29] File names.
  9911  It's time now to fret about file names.  Besides the fact that different
  9912  operating systems treat files in different ways, we must cope with the
  9913  fact that completely different naming conventions are used by different
  9914  groups of people. The following programs show what is required for one
  9915  particular operating system; similar routines for other systems are not
  9916  difficult to devise.
  9917  @^fingers@>
  9918  @^system dependencies@>
  9919  
  9920  \TeX\ assumes that a file name has three parts: the name proper; its
  9921  ``extension''; and a ``file area'' where it is found in an external file
  9922  system.  The extension of an input file or a write file is assumed to be
  9923  `\.{.tex}' unless otherwise specified; it is `\.{.log}' on the
  9924  transcript file that records each run of \TeX; it is `\.{.tfm}' on the font
  9925  metric files that describe characters in the fonts \TeX\ uses; it is
  9926  `\.{.dvi}' on the output files that specify typesetting information; and it
  9927  is `\.{.fmt}' on the format files written by \.{INITEX} to initialize \TeX.
  9928  The file area can be arbitrary on input files, but files are usually
  9929  output to the user's current area.  If an input file cannot be
  9930  found on the specified area, \TeX\ will look for it on a special system
  9931  area; this special area is intended for commonly used input files like
  9932  \.{webmac.tex}.
  9933  
  9934  Simple uses of \TeX\ refer only to file names that have no explicit
  9935  extension or area. For example, a person usually says `\.{\\input} \.{paper}'
  9936  or `\.{\\font\\tenrm} \.= \.{helvetica}' instead of `\.{\\input}
  9937  \.{paper.new}' or `\.{\\font\\tenrm} \.= \.{<csd.knuth>test}'. Simple file
  9938  names are best, because they make the \TeX\ source files portable;
  9939  whenever a file name consists entirely of letters and digits, it should be
  9940  treated in the same way by all implementations of \TeX. However, users
  9941  need the ability to refer to other files in their environment, especially
  9942  when responding to error messages concerning unopenable files; therefore
  9943  we want to let them use the syntax that appears in their favorite
  9944  operating system.
  9945  
  9946  The following procedures don't allow spaces to be part of
  9947  file names; but some users seem to like names that are spaced-out.
  9948  System-dependent changes to allow such things should probably
  9949  be made with reluctance, and only when an entire file name that
  9950  includes spaces is ``quoted'' somehow.
  9951  
  9952  @ In order to isolate the system-dependent aspects of file names, the
  9953  @^system dependencies@>
  9954  system-independent parts of \TeX\ are expressed in terms
  9955  of three system-dependent
  9956  procedures called |begin_name|, |more_name|, and |end_name|. In
  9957  essence, if the user-specified characters of the file name are $c_1\ldots c_n$,
  9958  the system-independent driver program does the operations
  9959  $$|begin_name|;\,|more_name|(c_1);\,\ldots\,;\,|more_name|(c_n);
  9960  \,|end_name|.$$
  9961  These three procedures communicate with each other via global variables.
  9962  Afterwards the file name will appear in the string pool as three strings
  9963  called |cur_name|\penalty10000\hskip-.05em,
  9964  |cur_area|, and |cur_ext|; the latter two are null (i.e.,
  9965  |""|), unless they were explicitly specified by the user.
  9966  
  9967  Actually the situation is slightly more complicated, because \TeX\ needs
  9968  to know when the file name ends. The |more_name| routine is a function
  9969  (with side effects) that returns |true| on the calls |more_name|$(c_1)$,
  9970  \dots, |more_name|$(c_{n-1})$. The final call |more_name|$(c_n)$
  9971  returns |false|; or, it returns |true| and the token following $c_n$ is
  9972  something like `\.{\\hbox}' (i.e., not a character). In other words,
  9973  |more_name| is supposed to return |true| unless it is sure that the
  9974  file name has been completely scanned; and |end_name| is supposed to be able
  9975  to finish the assembly of |cur_name|, |cur_area|, and |cur_ext| regardless of
  9976  whether $|more_name|(c_n)$ returned |true| or |false|.
  9977  
  9978  @<Glob...@>=
  9979  @!cur_name:str_number; {name of file just scanned}
  9980  @!cur_area:str_number; {file area just scanned, or \.{""}}
  9981  @!cur_ext:str_number; {file extension just scanned, or \.{""}}
  9982  
  9983  @ The file names we shall deal with for illustrative purposes have the
  9984  following structure:  If the name contains `\.>' or `\.:', the file area
  9985  consists of all characters up to and including the final such character;
  9986  otherwise the file area is null.  If the remaining file name contains
  9987  `\..', the file extension consists of all such characters from the first
  9988  remaining `\..' to the end, otherwise the file extension is null.
  9989  @^system dependencies@>
  9990  
  9991  We can scan such file names easily by using two global variables that keep track
  9992  of the occurrences of area and extension delimiters:
  9993  
  9994  @<Glob...@>=
  9995  @!area_delimiter:pool_pointer; {the most recent `\.>' or `\.:', if any}
  9996  @!ext_delimiter:pool_pointer; {the relevant `\..', if any}
  9997  
  9998  @ Input files that can't be found in the user's area may appear in a standard
  9999  system area called |TEX_area|. Font metric files whose areas are not given
 10000  explicitly are assumed to appear in a standard system area called
 10001  |TEX_font_area|.  These system area names will, of course, vary from place
 10002  to place.
 10003  @^system dependencies@>
 10004  
 10005  @d TEX_area=="TeXinputs:"
 10006  @.TeXinputs@>
 10007  @d TEX_font_area=="TeXfonts:"
 10008  @.TeXfonts@>
 10009  
 10010  @ Here now is the first of the system-dependent routines for file name scanning.
 10011  @^system dependencies@>
 10012  
 10013  @p procedure begin_name;
 10014  begin area_delimiter:=0; ext_delimiter:=0;
 10015  end;
 10016  
 10017  @ And here's the second. The string pool might change as the file name is
 10018  being scanned, since a new \.{\\csname} might be entered; therefore we keep
 10019  |area_delimiter| and |ext_delimiter| relative to the beginning of the current
 10020  string, instead of assigning an absolute address like |pool_ptr| to them.
 10021  @^system dependencies@>
 10022  
 10023  @p function more_name(@!c:ASCII_code):boolean;
 10024  begin if c=" " then more_name:=false
 10025  else  begin str_room(1); append_char(c); {contribute |c| to the current string}
 10026    if (c=">")or(c=":") then
 10027      begin area_delimiter:=cur_length; ext_delimiter:=0;
 10028      end
 10029    else if (c=".")and(ext_delimiter=0) then ext_delimiter:=cur_length;
 10030    more_name:=true;
 10031    end;
 10032  end;
 10033  
 10034  @ The third.
 10035  @^system dependencies@>
 10036  
 10037  @p procedure end_name;
 10038  begin if str_ptr+3>max_strings then
 10039    overflow("number of strings",max_strings-init_str_ptr);
 10040  @:TeX capacity exceeded number of strings}{\quad number of strings@>
 10041  if area_delimiter=0 then cur_area:=""
 10042  else  begin cur_area:=str_ptr;
 10043    str_start[str_ptr+1]:=str_start[str_ptr]+area_delimiter; incr(str_ptr);
 10044    end;
 10045  if ext_delimiter=0 then
 10046    begin cur_ext:=""; cur_name:=make_string;
 10047    end
 10048  else  begin cur_name:=str_ptr;
 10049    str_start[str_ptr+1]:=str_start[str_ptr]+ext_delimiter-area_delimiter-1;
 10050    incr(str_ptr); cur_ext:=make_string;
 10051    end;
 10052  end;
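
@ The three routines can be mimicked in a few lines of Python, which may
help when adapting them to another naming convention. This is only a
sketch of the illustrative convention described above (areas end with
`\.>' or `\.:', extensions begin at the first remaining `\..'); the
class |NameScanner| and the driver |scan| are hypothetical, and a Python
list stands in for the string pool.

```python
class NameScanner:
    def begin_name(self):
        self.chars = []
        self.area_delimiter = 0
        self.ext_delimiter = 0

    def more_name(self, c):
        if c == " ":
            return False                    # a space ends the file name
        self.chars.append(c)
        if c == ">" or c == ":":
            self.area_delimiter = len(self.chars)
            self.ext_delimiter = 0          # a dot inside the area does not count
        elif c == "." and self.ext_delimiter == 0:
            self.ext_delimiter = len(self.chars)
        return True

    def end_name(self):
        s = "".join(self.chars)
        area = s[:self.area_delimiter]
        if self.ext_delimiter == 0:
            return s[self.area_delimiter:], area, ""
        dot = self.ext_delimiter - 1        # position of the dot itself
        return s[self.area_delimiter:dot], area, s[dot:]

def scan(text):
    """Drive the three routines; returns (name, area, ext)."""
    sc = NameScanner()
    sc.begin_name()
    for c in text:
        if not sc.more_name(c):
            break
    return sc.end_name()
```

For instance, |scan("<csd.knuth>test.tex")| yields the name \.{test},
the area \.{<csd.knuth>}, and the extension \.{.tex}.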
 10053  
 10054  @ Conversely, here is a routine that takes three strings and prints a file
 10055  name that might have produced them. (The routine is system dependent, because
 10056  some operating systems put the file area last instead of first.)
 10057  @^system dependencies@>
 10058  
 10059  @<Basic printing...@>=
 10060  procedure print_file_name(@!n,@!a,@!e:integer);
 10061  begin slow_print(a); slow_print(n); slow_print(e);
 10062  end;
 10063  
 10064  @ Another system-dependent routine is needed to convert three internal
 10065  \TeX\ strings
 10066  into the |name_of_file| value that is used to open files. The present code
 10067  allows both lowercase and uppercase letters in the file name.
 10068  @^system dependencies@>
 10069  
 10070  @d append_to_name(#)==begin c:=#; incr(k);
 10071    if k<=file_name_size then name_of_file[k]:=xchr[c];
 10072    end
 10073  
 10074  @p procedure pack_file_name(@!n,@!a,@!e:str_number);
 10075  var k:integer; {number of positions filled in |name_of_file|}
 10076  @!c: ASCII_code; {character being packed}
 10077  @!j:pool_pointer; {index into |str_pool|}
 10078  begin k:=0;
 10079  for j:=str_start[a] to str_start[a+1]-1 do append_to_name(so(str_pool[j]));
 10080  for j:=str_start[n] to str_start[n+1]-1 do append_to_name(so(str_pool[j]));
 10081  for j:=str_start[e] to str_start[e+1]-1 do append_to_name(so(str_pool[j]));
 10082  if k<=file_name_size then name_length:=k@+else name_length:=file_name_size;
 10083  for k:=name_length+1 to file_name_size do name_of_file[k]:=' ';
 10084  end;
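
@ In other words, the area, name, and extension are concatenated in that
order, characters beyond |file_name_size| are silently dropped, and the
remainder of |name_of_file| is filled with blanks. A minimal Python
sketch of that behavior is given here; returning a pair instead of
setting globals is an assumption of the sketch, and the default size of
40 is merely a convenient value for the example.

```python
def pack_file_name(n, a, e, file_name_size=40):
    """Return (name_of_file, name_length) for name n, area a, extension e."""
    packed = (a + n + e)[:file_name_size]   # excess characters are dropped
    name_length = len(packed)
    return packed.ljust(file_name_size), name_length

name_of_file, name_length = pack_file_name("plain", "TeXinputs:", ".tex")
```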
 10085  
 10086  @ A messier routine is also needed, since format file names must be scanned
 10087  before \TeX's string mechanism has been initialized. We shall use the
 10088  global variable |TEX_format_default| to supply the text for default system areas
 10089  and extensions related to format files.
 10090  @^system dependencies@>
 10091  
 10092  @d format_default_length=20 {length of the |TEX_format_default| string}
 10093  @d format_area_length=11 {length of its area part}
 10094  @d format_ext_length=4 {length of its `\.{.fmt}' part}
 10095  @d format_extension=".fmt" {the extension, as a \.{WEB} constant}
 10096  
 10097  @<Glob...@>=
 10098  @!TEX_format_default:packed array[1..format_default_length] of char;
 10099  
 10100  @ @<Set init...@>=
 10101  TEX_format_default:='TeXformats:plain.fmt';
 10102  @.TeXformats@>
 10103  @.plain@>
 10104  @^system dependencies@>
 10105  
 10106  @ @<Check the ``constant'' values for consistency@>=
 10107  if format_default_length>file_name_size then bad:=31;
 10108  
 10109  @ Here is the messy routine that was just mentioned. It sets |name_of_file|
 10110  from the first |n| characters of |TEX_format_default|, followed by
 10111  |buffer[a..b]|, followed by the last |format_ext_length| characters of
 10112  |TEX_format_default|.
 10113  
 10114  We dare not give error messages here, since \TeX\ calls this routine before
 10115  the |error| routine is ready to roll. Instead, we simply drop excess characters,
 10116  since the error will be detected in another way when a strange file name
 10117  isn't found.
 10118  @^system dependencies@>
 10119  
 10120  @p procedure pack_buffered_name(@!n:small_number;@!a,@!b:integer);
 10121  var k:integer; {number of positions filled in |name_of_file|}
 10122  @!c: ASCII_code; {character being packed}
 10123  @!j:integer; {index into |buffer| or |TEX_format_default|}
 10124  begin if n+b-a+1+format_ext_length>file_name_size then
 10125    b:=a+file_name_size-n-1-format_ext_length;
 10126  k:=0;
 10127  for j:=1 to n do append_to_name(xord[TEX_format_default[j]]);
 10128  for j:=a to b do append_to_name(buffer[j]);
 10129  for j:=format_default_length-format_ext_length+1 to format_default_length do
 10130    append_to_name(xord[TEX_format_default[j]]);
 10131  if k<=file_name_size then name_length:=k@+else name_length:=file_name_size;
 10132  for k:=name_length+1 to file_name_size do name_of_file[k]:=' ';
 10133  end;
 10134  
 10135  @ Here is the only place we use |pack_buffered_name|. This part of the program
 10136  becomes active when a ``virgin'' \TeX\ is trying to get going, just after
 10137  the preliminary initialization, or when the user is substituting another
 10138  format file by typing `\.\&' after the initial `\.{**}' prompt.  The buffer
 10139  contains the first line of input in |buffer[loc..(last-1)]|, where
 10140  |loc<last| and |buffer[loc]<>" "|.
 10141  
 10142  @<Declare the function called |open_fmt_file|@>=
 10143  function open_fmt_file:boolean;
 10144  label found,exit;
 10145  var j:0..buf_size; {the first space after the format file name}
 10146  begin j:=loc;
 10147  if buffer[loc]="&" then
 10148    begin incr(loc); j:=loc; buffer[last]:=" ";
 10149    while buffer[j]<>" " do incr(j);
 10150    pack_buffered_name(0,loc,j-1); {try first without the system file area}
 10151    if w_open_in(fmt_file) then goto found;
 10152    pack_buffered_name(format_area_length,loc,j-1);
 10153      {now try the system format file area}
 10154    if w_open_in(fmt_file) then goto found;
 10155    wake_up_terminal;
 10156    wterm_ln('Sorry, I can''t find that format;',' will try PLAIN.');
 10157  @.Sorry, I can't find...@>
 10158    update_terminal;
 10159    end;
 10160    {now pull out all the stops: try for the system \.{plain} file}
 10161  pack_buffered_name(format_default_length-format_ext_length,1,0);
 10162  if not w_open_in(fmt_file) then
 10163    begin wake_up_terminal;
 10164    wterm_ln('I can''t find the PLAIN format file!');
 10165  @.I can't find PLAIN...@>
 10166  @.plain@>
 10167    open_fmt_file:=false; return;
 10168    end;
 10169  found:loc:=j; open_fmt_file:=true;
 10170  exit:end;
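
@ The three calls of |pack_buffered_name| in |open_fmt_file| thus try,
in turn, the user's name alone, the user's name in the system format
area, and finally the default format. The Python sketch below
reproduces the names that result. It is an illustration only: the
buffer slice is passed as a string |text| instead of a pair of indices,
and blank padding is omitted.

```python
TEX_FORMAT_DEFAULT = "TeXformats:plain.fmt"
FORMAT_AREA_LENGTH = 11
FORMAT_EXT_LENGTH = 4

def pack_buffered_name(n, text, file_name_size=40):
    """First n default characters, then text, then the default extension."""
    room = max(file_name_size - n - FORMAT_EXT_LENGTH, 0)
    text = text[:room]                      # drop what cannot fit
    return TEX_FORMAT_DEFAULT[:n] + text + TEX_FORMAT_DEFAULT[-FORMAT_EXT_LENGTH:]

first_try = pack_buffered_name(0, "latex")
second_try = pack_buffered_name(FORMAT_AREA_LENGTH, "latex")
last_resort = pack_buffered_name(len(TEX_FORMAT_DEFAULT) - FORMAT_EXT_LENGTH, "")
```

The last call uses an empty slice of the buffer, just as
|pack_buffered_name(format_default_length-format_ext_length,1,0)| does.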
 10171  
 10172  @ Operating systems often make it possible to determine the exact name (and
 10173  possible version number) of a file that has been opened. The following routine,
 10174  which simply makes a \TeX\ string from the value of |name_of_file|, should
 10175  ideally be changed to deduce the full name of file~|f|, which is the file
 10176  most recently opened, if it is possible to do this in a \PASCAL\ program.
 10177  @^system dependencies@>
 10178  
 10179  This routine might be called after string memory has overflowed, hence
 10180  we dare not use `|str_room|'.
 10181  
 10182  @p function make_name_string:str_number;
 10183  var k:1..file_name_size; {index into |name_of_file|}
 10184  begin if (pool_ptr+name_length>pool_size)or(str_ptr=max_strings)or
 10185   (cur_length>0) then
 10186    make_name_string:="?"
 10187  else  begin for k:=1 to name_length do append_char(xord[name_of_file[k]]);
 10188    make_name_string:=make_string;
 10189    end;
 10190  end;
 10191  function a_make_name_string(var f:alpha_file):str_number;
 10192  begin a_make_name_string:=make_name_string;
 10193  end;
 10194  function b_make_name_string(var f:byte_file):str_number;
 10195  begin b_make_name_string:=make_name_string;
 10196  end;
 10197  function w_make_name_string(var f:word_file):str_number;
 10198  begin w_make_name_string:=make_name_string;
 10199  end;
 10200  
 10201  @ Now let's consider the ``driver''
 10202  routines by which \TeX\ deals with file names
 10203  in a system-independent manner.  First comes a procedure that looks for a
 10204  file name in the input by calling |get_x_token| for the information.
 10205  
 10206  @p procedure scan_file_name;
 10207  label done;
 10208  begin name_in_progress:=true; begin_name;
 10209  @<Get the next non-blank non-call...@>;
 10210  loop@+begin if (cur_cmd>other_char)or(cur_chr>255) then {not a character}
 10211      begin back_input; goto done;
 10212      end;
 10213    if not more_name(cur_chr) then goto done;
 10214    get_x_token;
 10215    end;
 10216  done: end_name; name_in_progress:=false;
 10217  end;
 10218  
 10219  @ The global variable |name_in_progress| is used to prevent recursive
 10220  use of |scan_file_name|, since the |begin_name| and other procedures
 10221  communicate via global variables. Recursion would arise only by
 10222  devious tricks like `\.{\\input\\input f}'; such attempts at sabotage
 10223  must be thwarted. Furthermore, |name_in_progress| prevents \.{\\input}
 10224  @^recursion@>
 10225  from being initiated when a font size specification is being scanned.
 10226  
 10227  Another global variable, |job_name|, contains the file name that was first
 10228  \.{\\input} by the user. This name is extended by `\.{.log}' and `\.{.dvi}'
 10229  and `\.{.fmt}' in the names of \TeX's output files.
 10230  
 10231  @<Glob...@>=
 10232  @!name_in_progress:boolean; {is a file name being scanned?}
 10233  @!job_name:str_number; {principal file name}
 10234  @!log_opened:boolean; {has the transcript file been opened?}
 10235  
 10236  @ Initially |job_name=0|; it becomes nonzero as soon as the true name is known.
 10237  We have |job_name=0| if and only if the `\.{log}' file has not been opened,
 10238  except of course for a short time just after |job_name| has become nonzero.
 10239  
 10240  @<Initialize the output...@>=
 10241  job_name:=0; name_in_progress:=false; log_opened:=false;
 10242  
 10243  @ Here is a routine that manufactures the output file names, assuming that
 10244  |job_name<>0|. It ignores and changes the current settings of |cur_area|
 10245  and |cur_ext|.
 10246  
 10247  @d pack_cur_name==pack_file_name(cur_name,cur_area,cur_ext)
 10248  
 10249  @p procedure pack_job_name(@!s:str_number); {|s = ".log"|, |".dvi"|, or
 10250    |format_extension|}
 10251  begin cur_area:=""; cur_ext:=s;
 10252  cur_name:=job_name; pack_cur_name;
 10253  end;
 10254  
 10255  @ If some trouble arises when \TeX\ tries to open a file, the following
 10256  routine calls upon the user to supply another file name. Parameter~|s|
 10257  is used in the error message to identify the type of file; parameter~|e|
 10258  is the default extension if none is given. Upon exit from the routine,
 10259  variables |cur_name|, |cur_area|, |cur_ext|, and |name_of_file| are
 10260  ready for another attempt at file opening.
 10261  
 10262  @p procedure prompt_file_name(@!s,@!e:str_number);
 10263  label done;
 10264  var k:0..buf_size; {index into |buffer|}
 10265  begin if interaction=scroll_mode then wake_up_terminal;
 10266  if s="input file name" then print_err("I can't find file `")
 10267  @.I can't find file x@>
 10268  else print_err("I can't write on file `");
 10269  @.I can't write on file x@>
 10270  print_file_name(cur_name,cur_area,cur_ext); print("'.");
 10271  if e=".tex" then show_context;
 10272  print_nl("Please type another "); print(s);
 10273  @.Please type...@>
 10274  if interaction<scroll_mode then
 10275    fatal_error("*** (job aborted, file error in nonstop mode)");
 10276  @.job aborted, file error...@>
 10277  clear_terminal; prompt_input(": "); @<Scan file name in the buffer@>;
 10278  if cur_ext="" then cur_ext:=e;
 10279  pack_cur_name;
 10280  end;
 10281  
 10282  @ @<Scan file name in the buffer@>=
 10283  begin begin_name; k:=first;
 10284  while (buffer[k]=" ")and(k<last) do incr(k);
 10285  loop@+  begin if k=last then goto done;
 10286    if not more_name(buffer[k]) then goto done;
 10287    incr(k);
 10288    end;
 10289  done:end_name;
 10290  end
 10291  
 10292  @ Here's an example of how these conventions are used. Whenever it is time to
 10293  ship out a box of stuff, we shall use the macro |ensure_dvi_open|.
 10294  
 10295  @d ensure_dvi_open==if output_file_name=0 then
 10296    begin if job_name=0 then open_log_file;
 10297    pack_job_name(".dvi");
 10298    while not b_open_out(dvi_file) do
 10299      prompt_file_name("file name for output",".dvi");
 10300    output_file_name:=b_make_name_string(dvi_file);
 10301    end
 10302  
 10303  @<Glob...@>=
 10304  @!dvi_file: byte_file; {the device-independent output goes here}
 10305  @!output_file_name: str_number; {full name of the output file}
 10306  @!log_name:str_number; {full name of the log file}
 10307  
 10308  @ @<Initialize the output...@>=output_file_name:=0;
 10309  
 10310  @ The |open_log_file| routine is used to open the transcript file and to help
 10311  it catch up to what has previously been printed on the terminal.
 10312  
 10313  @p procedure open_log_file;
 10314  var old_setting:0..max_selector; {previous |selector| setting}
 10315  @!k:0..buf_size; {index into |months| and |buffer|}
 10316  @!l:0..buf_size; {end of first input line}
 10317  @!months:packed array [1..36] of char; {abbreviations of month names}
 10318  begin old_setting:=selector;
 10319  if job_name=0 then job_name:="texput";
 10320  @.texput@>
 10321  pack_job_name(".log");
 10322  while not a_open_out(log_file) do @<Try to get a different log file name@>;
 10323  log_name:=a_make_name_string(log_file);
 10324  selector:=log_only; log_opened:=true;
 10325  @<Print the banner line, including the date and time@>;
 10326  input_stack[input_ptr]:=cur_input; {make sure bottom level is in memory}
 10327  print_nl("**");
 10328  @.**@>
 10329  l:=input_stack[0].limit_field; {last position of first line}
 10330  if buffer[l]=end_line_char then decr(l);
 10331  for k:=1 to l do print(buffer[k]);
 10332  print_ln; {now the transcript file contains the first line of input}
 10333  selector:=old_setting+2; {|log_only| or |term_and_log|}
 10334  end;
 10335  
 10336  @ Sometimes |open_log_file| is called at awkward moments when \TeX\ is
 10337  unable to print error messages or even to |show_context|.
 10338  The |prompt_file_name| routine can result in a |fatal_error|, but the |error|
 10339  routine will not be invoked because |log_opened| will be false.
 10340  
 10341  The normal idea of |batch_mode| is that nothing at all should be written
 10342  on the terminal. However, in the unusual case that
 10343  no log file could be opened, we make an exception and allow
 10344  an explanatory message to be seen.
 10345  
 10346  Incidentally, the program always refers to the log file as a `\.{transcript
 10347  file}', because some systems cannot use the extension `\.{.log}' for
 10348  this file.
 10349  
 10350  @<Try to get a different log file name@>=
 10351  begin selector:=term_only;
 10352  prompt_file_name("transcript file name",".log");
 10353  end
 10354  
 10355  @ @<Print the banner...@>=
 10356  begin wlog(banner);
 10357  slow_print(format_ident); print("  ");
 10358  print_int(sys_day); print_char(" ");
 10359  months:='JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC';
 10360  for k:=3*sys_month-2 to 3*sys_month do wlog(months[k]);
 10361  print_char(" "); print_int(sys_year); print_char(" ");
 10362  print_two(sys_time div 60); print_char(":"); print_two(sys_time mod 60);
 10363  end
 10364  
 10365  @ Let's turn now to the procedure that is used to initiate file reading
 10366  when an `\.{\\input}' command is being processed.
 10367  Beware: For historic reasons, this code foolishly conserves a tiny bit
 10368  of string pool space; but that can confuse the interactive `\.E' option.
 10369  @^system dependencies@>
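
The order in which |start_input| looks for a file can be sketched outside
\PASCAL. The Python fragment below is only an illustration under stated
assumptions: |locate_input| and its |exists| callback are hypothetical names
that do not occur in \TeX, file names are formed by plain concatenation of
area, name, and extension, and the sketch returns |None| where the real
routine would call |prompt_file_name| and try again.

```python
def locate_input(name, area, ext, tex_area, exists):
    """Return the first file name that exists, or None if no attempt succeeds.

    exists is a hypothetical callback standing in for a_open_in.
    """
    if ext == "":
        ext = ".tex"                          # default extension
    candidates = [area + name + ext]          # first attempt: the name as given
    if area == "":
        candidates.append(tex_area + name + ext)  # second attempt: system area
    for path in candidates:
        if exists(path):
            return path
    return None                               # TeX would prompt the user here
```

The second attempt is made only when the user gave no area, so an explicit
area is never silently replaced by the system one.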
 10370  
 10371  @p procedure start_input; {\TeX\ will \.{\\input} something}
 10372  label done;
 10373  begin scan_file_name; {set |cur_name| to desired file name}
 10374  if cur_ext="" then cur_ext:=".tex";
 10375  pack_cur_name;
 10376  loop@+  begin begin_file_reading; {set up |cur_file| and new level of input}
 10377    if a_open_in(cur_file) then goto done;
 10378    if cur_area="" then
 10379      begin pack_file_name(cur_name,TEX_area,cur_ext);
 10380      if a_open_in(cur_file) then goto done;
 10381      end;
 10382    end_file_reading; {remove the level that didn't work}
 10383    prompt_file_name("input file name",".tex");
 10384    end;
 10385  done: name:=a_make_name_string(cur_file);
 10386  if job_name=0 then
 10387    begin job_name:=cur_name; open_log_file;
 10388    end; {|open_log_file| doesn't |show_context|, so |limit|
 10389      and |loc| needn't be set to meaningful values yet}
 10390  if term_offset+length(name)>max_print_line-2 then print_ln
 10391  else if (term_offset>0)or(file_offset>0) then print_char(" ");
 10392  print_char("("); incr(open_parens); slow_print(name); update_terminal;
 10393  state:=new_line;
 10394  if name=str_ptr-1 then {conserve string pool space (but see note above)}
 10395    begin flush_string; name:=cur_name;
 10396    end;
 10397  @<Read the first line of the new file@>;
 10398  end;
 10399  
 10400  @ Here we have to remember to tell the |input_ln| routine not to
 10401  start with a |get|. If the file is empty, it is considered to
 10402  contain a single blank line.
 10403  @^system dependencies@>
 10404  @^empty line at end of file@>
 10405  
 10406  @<Read the first line...@>=
 10407  begin line:=1;
 10408  if input_ln(cur_file,false) then do_nothing;
 10409  firm_up_the_line;
 10410  if end_line_char_inactive then decr(limit)
 10411  else  buffer[limit]:=end_line_char;
 10412  first:=limit+1; loc:=start;
 10413  end
 10414  
 10415  @* \[30] Font metric data.
 10416  \TeX\ gets its knowledge about fonts from font metric files, also called
 10417  \.{TFM} files; the `\.T' in `\.{TFM}' stands for \TeX,
 10418  but other programs know about them too.
 10419  @:TFM files}{\.{TFM} files@>
 10420  @^font metric files@>
 10421  
 10422  The information in a \.{TFM} file appears in a sequence of 8-bit bytes.
 10423  Since the number of bytes is always a multiple of 4, we could
 10424  also regard the file as a sequence of 32-bit words, but \TeX\ uses the
 10425  byte interpretation. The format of \.{TFM} files was designed by
 10426  Lyle Ramshaw in 1980. The intent is to convey a lot of different kinds
 10427  @^Ramshaw, Lyle Harold@>
 10428  of information in a compact but useful form.
 10429  
 10430  @<Glob...@>=
 10431  @!tfm_file:byte_file;
 10432  
 10433  @ The first 24 bytes (6 words) of a \.{TFM} file contain twelve 16-bit
 10434  integers that give the lengths of the various subsequent portions
 10435  of the file. These twelve integers are, in order:
 10436  $$\vbox{\halign{\hfil#&$\null=\null$#\hfil\cr
 10437  |lf|&length of the entire file, in words;\cr
 10438  |lh|&length of the header data, in words;\cr
 10439  |bc|&smallest character code in the font;\cr
 10440  |ec|&largest character code in the font;\cr
 10441  |nw|&number of words in the width table;\cr
 10442  |nh|&number of words in the height table;\cr
 10443  |nd|&number of words in the depth table;\cr
 10444  |ni|&number of words in the italic correction table;\cr
 10445  |nl|&number of words in the lig/kern table;\cr
 10446  |nk|&number of words in the kern table;\cr
 10447  |ne|&number of words in the extensible character table;\cr
 10448  |np|&number of font parameter words.\cr}}$$
 10449  They are all nonnegative and less than $2^{15}$. We must have |bc-1<=ec<=255|,
 10450  and
 10451  $$\hbox{|lf=6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np|.}$$
 10452  Note that a font may contain as many as 256 characters (if |bc=0| and |ec=255|),
 10453  and as few as 0 characters (if |bc=ec+1|).
 10454  
 10455  Incidentally, when two or more 8-bit bytes are combined to form an integer of
 10456  16 or more bits, the most significant bytes appear first in the file.
 10457  This is called BigEndian order.
 10458  @!@^BigEndian order@>
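
The constraints just stated are easy to check mechanically. Here is a minimal
sketch in Python (not \PASCAL, and not the code \TeX\ itself uses) that decodes
the twelve lengths and verifies them; the name |read_tfm_preamble| and the
choice of |ValueError| for bad data are assumptions of the sketch.

```python
import struct

FIELDS = ("lf", "lh", "bc", "ec", "nw", "nh", "nd", "ni", "nl", "nk", "ne", "np")

def read_tfm_preamble(data):
    """Decode the twelve 16-bit big-endian lengths and check their consistency."""
    if len(data) < 24:
        raise ValueError("a TFM file needs at least 24 bytes")
    p = dict(zip(FIELDS, struct.unpack(">12H", data[:24])))
    if any(v >= 2**15 for v in p.values()):
        raise ValueError("every length must be less than 2^15")
    if not (p["bc"] - 1 <= p["ec"] <= 255):
        raise ValueError("need bc-1 <= ec <= 255")
    total = (6 + p["lh"] + (p["ec"] - p["bc"] + 1) + p["nw"] + p["nh"]
             + p["nd"] + p["ni"] + p["nl"] + p["nk"] + p["ne"] + p["np"])
    if p["lf"] != total:
        raise ValueError("lf does not equal the sum of the parts")
    return p
```

The format string `">12H"` asks for twelve unsigned 16-bit integers with the
most significant byte first, which is the byte order described above.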
 10459  
 10460  @ The rest of the \.{TFM} file may be regarded as a sequence of ten data
 10461  arrays having the informal specification
 10462  $$\def\arr$[#1]#2${\&{array} $[#1]$ \&{of} #2}
 10463  \vbox{\halign{\hfil\\{#}&$\,:\,$\arr#\hfil\cr
 10464  header&|[0..lh-1]@t\\{stuff}@>|\cr
 10465  char\_info&|[bc..ec]char_info_word|\cr
 10466  width&|[0..nw-1]fix_word|\cr
 10467  height&|[0..nh-1]fix_word|\cr
 10468  depth&|[0..nd-1]fix_word|\cr
 10469  italic&|[0..ni-1]fix_word|\cr
 10470  lig\_kern&|[0..nl-1]lig_kern_command|\cr
 10471  kern&|[0..nk-1]fix_word|\cr
 10472  exten&|[0..ne-1]extensible_recipe|\cr
 10473  param&|[1..np]fix_word|\cr}}$$
 10474  The most important data type used here is a |@!fix_word|, which is
 10475  a 32-bit representation of a binary fraction. A |fix_word| is a signed
 10476  quantity, with the two's complement of the entire word used to represent
 10477  negation. Of the 32 bits in a |fix_word|, exactly 12 are to the left of the
 10478  binary point; thus, the largest |fix_word| value is $2048-2^{-20}$, and
 10479  the smallest is $-2048$. We will see below, however, that all but two of
 10480  the |fix_word| values must lie between $-16$ and $+16$.
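
As a worked illustration of the |fix_word| convention, the following Python
sketch turns four bytes into an exact rational value. The function name
|fix_word_value| is hypothetical, and \TeX\ itself works with scaled integers
rather than with fractions; the sketch only mirrors the definition above.

```python
from fractions import Fraction

def fix_word_value(b):
    """Interpret four bytes (most significant first) as a fix_word."""
    if len(b) != 4:
        raise ValueError("a fix_word occupies exactly four bytes")
    n = int.from_bytes(b, "big", signed=True)  # two's complement, 32 bits
    return Fraction(n, 2**20)                  # 20 bits follow the binary point
```

With 12 bits to the left of the binary point, the bytes 0, 16, 0, 0 represent
exactly 1.0, and a value of absolute size less than 16 has a first byte of
0 or 255.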
 10481  
 10482  @ The first data array is a block of header information, which contains
 10483  general facts about the font. The header must contain at least two words,
 10484  |header[0]| and |header[1]|, whose meaning is explained below.
 10485  Additional header information of use to other software routines might
 10486  also be included, but \TeX82 does not need to know about such details.
 10487  For example, 16 more words of header information are in use at the Xerox
 10488  Palo Alto Research Center; the first ten specify the character coding
 10489  scheme used (e.g., `\.{XEROX text}' or `\.{TeX math symbols}'), the next five
 10490  give the font identifier (e.g., `\.{HELVETICA}' or `\.{CMSY}'), and the
 10491  last gives the ``face byte.'' The program that converts \.{DVI} files
 10492  to Xerox printing format gets this information by looking at the \.{TFM}
 10493  file, which it needs to read anyway because of other information that
 10494  is not explicitly repeated in \.{DVI}~format.
 10495  
 10496  \yskip\hang|header[0]| is a 32-bit check sum that \TeX\ will copy into
 10497  the \.{DVI} output file. Later on when the \.{DVI} file is printed,
 10498  possibly on another computer, the actual font that gets used is supposed
 10499  to have a check sum that agrees with the one in the \.{TFM} file used by
 10500  \TeX. In this way, users will be warned about potential incompatibilities.
 10501  (However, if the check sum is zero in either the font file or the \.{TFM}
 10502  file, no check is made.)  The actual relation between this check sum and
 10503  the rest of the \.{TFM} file is not important; the check sum is simply an
 10504  identification number with the property that incompatible fonts almost
 10505  always have distinct check sums.
 10506  @^check sum@>
 10507  
 10508  \yskip\hang|header[1]| is a |fix_word| containing the design size of
 10509  the font, in units of \TeX\ points. This number must be at least 1.0; it is
 10510  fairly arbitrary, but usually the design size is 10.0 for a ``10 point''
 10511  font, i.e., a font that was designed to look best at a 10-point size,
 10512  whatever that really means. When a \TeX\ user asks for a font
 10513  `\.{at} $\delta$ \.{pt}', the effect is to override the design size
 10514  and replace it by $\delta$, and to multiply the $x$ and~$y$ coordinates
 10515  of the points in the font image by a factor of $\delta$ divided by the
 10516  design size.  {\sl All other dimensions in the\/ \.{TFM} file are
 10517  |fix_word|\kern-1pt\ numbers in design-size units}, with the exception of
 10518  |param[1]| (which denotes the slant ratio). Thus, for example, the value
 10519  of |param[6]|, which defines the \.{em} unit, is often the |fix_word| value
 10520  $2^{20}=1.0$, since many fonts have a design size equal to one em.
 10521  The other dimensions must be less than 16 design-size units in absolute
 10522  value; thus, |header[1]| and |param[1]| are the only |fix_word|
 10523  entries in the whole \.{TFM} file whose first byte might be something
 10524  besides 0 or 255.
 10525  
 10526  @ Next comes the |char_info| array, which contains one |@!char_info_word|
 10527  per character. Each word in this part of the file contains six fields
 10528  packed into four bytes as follows.
 10529  
 10530  \yskip\hang first byte: |@!width_index| (8 bits)\par
 10531  \hang second byte: |@!height_index| (4 bits) times 16, plus |@!depth_index|
 10532    (4~bits)\par
 10533  \hang third byte: |@!italic_index| (6 bits) times 4, plus |@!tag|
 10534    (2~bits)\par
 10535  \hang fourth byte: |@!remainder| (8 bits)\par
 10536  \yskip\noindent
 10537  The actual width of a character is \\{width}|[width_index]|, in design-size
 10538  units; this is a device for compressing information, since many characters
 10539  have the same width. Since it is quite common for many characters
 10540  to have the same height, depth, or italic correction, the \.{TFM} format
 10541  imposes a limit of 16 different heights, 16 different depths, and
 10542  64 different italic corrections.
 10543  
 10544  @!@^italic correction@>
 10545  The italic correction of a character has two different uses.
 10546  (a)~In ordinary text, the italic correction is added to the width only if
 10547  the \TeX\ user specifies `\.{\\/}' after the character.
 10548  (b)~In math formulas, the italic correction is always added to the width,
 10549  except with respect to the positioning of subscripts.
 10550  
 10551  Incidentally, the relation $\\{width}[0]=\\{height}[0]=\\{depth}[0]=
 10552  \\{italic}[0]=0$ should always hold, so that an index of zero implies a
 10553  value of zero.  The |width_index| should never be zero unless the
 10554  character does not exist in the font, since a character is valid if and
 10555  only if it lies between |bc| and |ec| and has a nonzero |width_index|.
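
The packing of a |char_info_word| can be made concrete with a small Python
sketch. The names |unpack_char_info| and |char_exists_in_word| are
hypothetical, and the word is assumed to arrive as four byte values in file
order; \TeX\ itself extracts the fields with macros defined later.

```python
def unpack_char_info(word):
    """Split a four-byte char_info_word into its six fields."""
    b0, b1, b2, b3 = word
    height_index, depth_index = divmod(b1, 16)  # 4 bits each
    italic_index, tag = divmod(b2, 4)           # 6 bits and 2 bits
    return {
        "width_index": b0,
        "height_index": height_index,
        "depth_index": depth_index,
        "italic_index": italic_index,
        "tag": tag,
        "remainder": b3,
    }

def char_exists_in_word(word):
    """A character is present only if its width_index is nonzero."""
    return word[0] > 0
```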
 10556  
 10557  @ The |tag| field in a |char_info_word| has four values that explain how to
 10558  interpret the |remainder| field.
 10559  
 10560  \yskip\hangg|tag=0| (|no_tag|) means that |remainder| is unused.\par
 10561  \hangg|tag=1| (|lig_tag|) means that this character has a ligature/kerning
 10562  program starting at position |remainder| in the |lig_kern| array.\par
 10563  \hangg|tag=2| (|list_tag|) means that this character is part of a chain of
 10564  characters of ascending sizes, and not the largest in the chain.  The
 10565  |remainder| field gives the character code of the next larger character.\par
 10566  \hangg|tag=3| (|ext_tag|) means that this character code represents an
 10567  extensible character, i.e., a character that is built up of smaller pieces
 10568  so that it can be made arbitrarily large. The pieces are specified in
 10569  |@!exten[remainder]|.\par
 10570  \yskip\noindent
 10571  Characters with |tag=2| and |tag=3| are treated as characters with |tag=0|
 10572  unless they are used in special circumstances in math formulas. For example,
 10573  the \.{\\sum} operation looks for a |list_tag|, and the \.{\\left}
 10574  operation looks for both |list_tag| and |ext_tag|.
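
A chain of |list_tag| characters can be followed as in the Python sketch
below. It is an illustration only: |charlist| is a hypothetical name, the
font is modeled as a dictionary from character code to a pair
(|tag|, |remainder|), and the guard against cyclic chains is an addition of
the sketch, not something stated in this section.

```python
NO_TAG, LIG_TAG, LIST_TAG, EXT_TAG = 0, 1, 2, 3

def charlist(tags, c):
    """Return c followed by its successively larger variants."""
    chain = [c]
    while tags[chain[-1]][0] == LIST_TAG:
        successor = tags[chain[-1]][1]   # remainder names the next larger character
        if successor in chain:           # guard against a malformed, cyclic chain
            break
        chain.append(successor)
    return chain
```

The last element of the chain is the only one that may carry |ext_tag|,
since a character has a single |tag| field.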
 10575  
 10576  @d no_tag=0 {vanilla character}
 10577  @d lig_tag=1 {character has a ligature/kerning program}
 10578  @d list_tag=2 {character has a successor in a charlist}
 10579  @d ext_tag=3 {character is extensible}
 10580  
 10581  @ The |lig_kern| array contains instructions in a simple programming language
 10582  that explains what to do for special letter pairs. Each word in this array is a
 10583  |@!lig_kern_command| of four bytes.
 10584  
 10585  \yskip\hang first byte: |skip_byte|, indicates that this is the final program
 10586    step if the byte is 128 or more, otherwise the next step is obtained by
 10587    skipping this number of intervening steps.\par
 10588  \hang second byte: |next_char|, ``if |next_char| follows the current character,
 10589    then perform the operation and stop, otherwise continue.''\par
 10590  \hang third byte: |op_byte|, indicates a ligature step if less than~128,
 10591    a kern step otherwise.\par
 10592  \hang fourth byte: |remainder|.\par
 10593  \yskip\noindent
 10594  In a kern step, an
 10595  additional space equal to |kern[256*(op_byte-128)+remainder]| is inserted
 10596  between the current character and |next_char|. This amount is
 10597  often negative, so that the characters are brought closer together
 10598  by kerning; but it might be positive.
 10599  
 10600  There are eight kinds of ligature steps, having |op_byte| codes $4a+2b+c$ where
 10601  $0\le a\le b+c$ and $0\le b,c\le1$. The character whose code is
 10602  |remainder| is inserted between the current character and |next_char|;
 10603  then the current character is deleted if $b=0$, and |next_char| is
 10604  deleted if $c=0$; then we pass over $a$~characters to reach the next
 10605  current character (which may have a ligature/kerning program of its own).
 10606  
 10607  If the very first instruction of the |lig_kern| array has |skip_byte=255|,
 10608  the |next_char| byte is the so-called boundary character of this font;
 10609  the value of |next_char| need not lie between |bc| and~|ec|.
 10610  If the very last instruction of the |lig_kern| array has |skip_byte=255|,
 10611  there is a special ligature/kerning program for a boundary character at the
 10612  left, beginning at location |256*op_byte+remainder|.
 10613  The interpretation is that \TeX\ puts implicit boundary characters
 10614  before and after each consecutive string of characters from the same font.
 10615  These implicit characters do not appear in the output, but they can affect
 10616  ligatures and kerning.
 10617  
 10618  If the very first instruction of a character's |lig_kern| program has
 10619  |skip_byte>128|, the program actually begins in location
 10620  |256*op_byte+remainder|. This feature allows access to large |lig_kern|
 10621  arrays, because the first instruction must otherwise
 10622  appear in a location |<=255|.
 10623  
 10624  Any instruction with |skip_byte>128| in the |lig_kern| array must satisfy
 10625  the condition
 10626  $$\hbox{|256*op_byte+remainder<nl|.}$$
 10627  If such an instruction is encountered during
 10628  normal program execution, it denotes an unconditional halt; no ligature
 10629  or kerning command is performed.
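
The rules for a single |lig_kern_command| can be summarized by a decoder such
as the following Python sketch. The name |decode_step| and the dictionary
keys are hypothetical; the sketch classifies one instruction and does not
run a whole ligature/kerning program or consult the |kern| array.

```python
def decode_step(skip_byte, next_char, op_byte, remainder):
    """Classify one lig/kern instruction given its four bytes."""
    if skip_byte > 128:
        # Indirect address or boundary entry; an unconditional halt if executed.
        return {"kind": "halt", "address": 256 * op_byte + remainder}
    last = skip_byte >= 128              # final step of this program
    if op_byte >= 128:
        return {"kind": "kern", "next_char": next_char, "last": last,
                "kern_index": 256 * (op_byte - 128) + remainder}
    a, b, c = op_byte // 4, (op_byte // 2) % 2, op_byte % 2
    if a > b + c:
        raise ValueError("not one of the eight ligature codes")
    return {"kind": "lig", "next_char": next_char, "last": last,
            "insert": remainder,
            "delete_current": b == 0,    # current character goes away if b=0
            "delete_next": c == 0,       # next_char goes away if c=0
            "pass_over": a}              # characters to skip afterwards
```

Enumerating the |op_byte| values below 128 that this decoder accepts gives
0, 1, 2, 3, 5, 6, 7, and 11, which are the eight ligature codes.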
 10630  
 10631  @d stop_flag==qi(128) {value indicating `\.{STOP}' in a lig/kern program}
 10632  @d kern_flag==qi(128) {op code for a kern step}
 10633  @d skip_byte(#)==#.b0
 10634  @d next_char(#)==#.b1
 10635  @d op_byte(#)==#.b2
 10636  @d rem_byte(#)==#.b3
 10637  
 10638  @ Extensible characters are specified by an |@!extensible_recipe|, which
 10639  consists of four bytes called |@!top|, |@!mid|, |@!bot|, and |@!rep| (in this
 10640  order). These bytes are the character codes of individual pieces used to
 10641  build up a large symbol.  If |top|, |mid|, or |bot| are zero, they are not
 10642  present in the built-up result. For example, an extensible vertical line is
 10643  like an extensible bracket, except that the top and bottom pieces are missing.
 10644  
 10645  Let $T$, $M$, $B$, and $R$ denote the respective pieces, or an empty box
 10646  if the piece isn't present. Then the extensible characters have the form
 10647  $TR^kMR^kB$ from top to bottom, for some |k>=0|, unless $M$ is absent;
 10648  in the latter case we can have $TR^kB$ for both even and odd values of~|k|.
 10649  The width of the extensible character is the width of $R$; and the
 10650  height-plus-depth is the sum of the individual height-plus-depths of the
 10651  components used, since the pieces are butted together in a vertical list.
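
The shape $TR^kMR^kB$ can be spelled out with a short Python sketch. The
names |extensible_pieces| and |total_height_plus_depth| are hypothetical, and
the choice of~|k| is left to the caller; how \TeX\ picks~|k| to reach a
desired size is not covered here.

```python
def extensible_pieces(top, mid, bot, rep, k):
    """List the character codes from top to bottom, with k copies of rep per run."""
    pieces = []
    if top != 0:
        pieces.append(top)
    pieces += [rep] * k
    if mid != 0:                 # a middle piece forces a second run of rep
        pieces.append(mid)
        pieces += [rep] * k
    if bot != 0:
        pieces.append(bot)
    return pieces

def total_height_plus_depth(pieces, size):
    """Sum the height-plus-depth of the pieces; size maps a code to that sum."""
    return sum(size[c] for c in pieces)
```

When the middle piece is absent there is only one run of the repeatable
piece, which is why both even and odd numbers of copies are possible in
that case.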
 10652  
 10653  @d ext_top(#)==#.b0 {|top| piece in a recipe}
 10654  @d ext_mid(#)==#.b1 {|mid| piece in a recipe}
 10655  @d ext_bot(#)==#.b2 {|bot| piece in a recipe}
 10656  @d ext_rep(#)==#.b3 {|rep| piece in a recipe}
 10657  
 10658  @ The final portion of a \.{TFM} file is the |param| array, which is another
 10659  sequence of |fix_word| values.
 10660  
 10661  \yskip\hang|param[1]=slant| is the amount of italic slant, which is used
 10662  to help position accents. For example, |slant=.25| means that when you go
 10663  up one unit, you also go .25 units to the right. The |slant| is a pure
 10664  number; it's the only |fix_word| other than the design size itself that is
 10665  not scaled by the design size.
 10666  
 10667  \hang|param[2]=space| is the normal spacing between words in text.
 10668  Note that character |" "| in the font need not have anything to do with
 10669  blank spaces.
 10670  
 10671  \hang|param[3]=space_stretch| is the amount of glue stretching between words.
 10672  
 10673  \hang|param[4]=space_shrink| is the amount of glue shrinking between words.
 10674  
 10675  \hang|param[5]=x_height| is the size of one ex in the font; it is also
 10676  the height of letters for which accents don't have to be raised or lowered.
 10677  
 10678  \hang|param[6]=quad| is the size of one em in the font.
 10679  
 10680  \hang|param[7]=extra_space| is the amount added to |param[2]| at the
 10681  ends of sentences.
 10682  
 10683  \yskip\noindent
 10684  If fewer than seven parameters are present, \TeX\ sets the missing parameters
 10685  to zero. Fonts used for math symbols are required to have
 10686  additional parameter information, which is explained later.
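
The treatment of the first seven parameters can be illustrated in Python as
follows. This is a sketch under assumptions: |font_params| is a hypothetical
name, the list passed in starts with |param[1]|, values are exact fractions
in design-size units, and the result is expressed in points at the given
size. \TeX\ itself performs the scaling in integer arithmetic.

```python
from fractions import Fraction

PARAM_NAMES = ("slant", "space", "space_stretch", "space_shrink",
               "x_height", "quad", "extra_space")

def font_params(params, at_size):
    """Name the first seven parameters; scale all but slant by the at size."""
    first = list(params[:7])
    padded = first + [Fraction(0)] * (7 - len(first))  # missing ones become zero
    result = {}
    for name, value in zip(PARAM_NAMES, padded):
        # slant is a pure number, so it is the one parameter left unscaled
        result[name] = value if name == "slant" else value * at_size
    return result
```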
 10687  
 10688  @d slant_code=1
 10689  @d space_code=2
 10690  @d space_stretch_code=3
 10691  @d space_shrink_code=4
 10692  @d x_height_code=5
 10693  @d quad_code=6
 10694  @d extra_space_code=7
 10695  
 10696  @ So that is what \.{TFM} files hold. Since \TeX\ has to absorb such information
 10697  about lots of fonts, it stores most of the data in a large array called
 10698  |font_info|. Each item of |font_info| is a |memory_word|; the |fix_word|
 10699  data gets converted into |scaled| entries, while everything else goes into
 10700  words of type |four_quarters|.
 10701  
 10702  When the user defines \.{\\font\\f}, say, \TeX\ assigns an internal number
 10703  to the user's font~\.{\\f}. Adding this number to |font_id_base| gives the
 10704  |eqtb| location of a ``frozen'' control sequence that will always select
 10705  the font.
 10706  
 10707  @<Types...@>=
 10708  @!internal_font_number=font_base..font_max; {|font| in a |char_node|}
 10709  @!font_index=0..font_mem_size; {index into |font_info|}
 10710  
 10711  @ Here now is the (rather formidable) array of font arrays.
 10712  
 10713  @d non_char==qi(256) {a |halfword| code that can't match a real character}
 10714  @d non_address=0 {a spurious |bchar_label|}
 10715  
 10716  @<Glob...@>=
 10717  @!font_info:array[font_index] of memory_word;
 10718    {the big collection of font data}
 10719  @!fmem_ptr:font_index; {first unused word of |font_info|}
 10720  @!font_ptr:internal_font_number; {largest internal font number in use}
 10721  @!font_check:array[internal_font_number] of four_quarters; {check sum}
 10722  @!font_size:array[internal_font_number] of scaled; {``at'' size}
 10723  @!font_dsize:array[internal_font_number] of scaled; {``design'' size}
 10724  @!font_params:array[internal_font_number] of font_index; {how many font
 10725    parameters are present}
 10726  @!font_name:array[internal_font_number] of str_number; {name of the font}
 10727  @!font_area:array[internal_font_number] of str_number; {area of the font}
 10728  @!font_bc:array[internal_font_number] of eight_bits;
 10729    {beginning (smallest) character code}
 10730  @!font_ec:array[internal_font_number] of eight_bits;
 10731    {ending (largest) character code}
 10732  @!font_glue:array[internal_font_number] of pointer;
 10733    {glue specification for interword space, |null| if not allocated}
 10734  @!font_used:array[internal_font_number] of boolean;
 10735    {has a character from this font actually appeared in the output?}
 10736  @!hyphen_char:array[internal_font_number] of integer;
 10737    {current \.{\\hyphenchar} values}
 10738  @!skew_char:array[internal_font_number] of integer;
 10739    {current \.{\\skewchar} values}
 10740  @!bchar_label:array[internal_font_number] of font_index;
 10741    {start of |lig_kern| program for left boundary character,
 10742    |non_address| if there is none}
 10743  @!font_bchar:array[internal_font_number] of min_quarterword..non_char;
 10744    {boundary character, |non_char| if there is none}
 10745  @!font_false_bchar:array[internal_font_number] of min_quarterword..non_char;
 10746    {|font_bchar| if it doesn't exist in the font, otherwise |non_char|}
 10747  
 10748  @ Besides the arrays just enumerated, we have directory arrays that make it
 10749  easy to get at the individual entries in |font_info|. For example, the
 10750  |char_info| data for character |c| in font |f| will be in
 10751  |font_info[char_base[f]+c].qqqq|; and if |w| is the |width_index|
 10752  part of this word (the |b0| field), the width of the character is
 10753  |font_info[width_base[f]+w].sc|. (These formulas assume that
 10754  |min_quarterword| has already been added to |c| and to |w|, since \TeX\
 10755  stores its quarterwords that way.)
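
The two-step lookup just described can be modeled in a few lines of Python.
The sketch assumes that |min_quarterword=0|, represents a |four_quarters|
word as a tuple of four integers, and uses the hypothetical name
|char_width| for a function taking the arrays as arguments; it is a model of
the indexing only.

```python
def char_width(font_info, char_base, width_base, f, c):
    """Look up the width of character c in font f through its width_index."""
    q = font_info[char_base[f] + c]       # the char_info word (b0, b1, b2, b3)
    return font_info[width_base[f] + q[0]]  # b0 is the width_index

# Example layout for one font with bc=65: char_info words at 10 and 11,
# widths at 12..14, so char_base is 10-65 and width_base is 12.
font_info = [None] * 15
font_info[10] = (1, 0, 0, 0)
font_info[11] = (2, 0, 0, 0)
font_info[12:15] = [0, 500, 750]
char_base = {1: 10 - 65}
width_base = {1: 12}
```

Because the character code itself is added to |char_base[f]|, the base can
be negative when the smallest character code exceeds the position of the
font's first word, which is consistent with the base arrays being declared
as |integer|.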
 10756  
 10757  @<Glob...@>=
 10758  @!char_base:array[internal_font_number] of integer;
 10759    {base addresses for |char_info|}
 10760  @!width_base:array[internal_font_number] of integer;
 10761    {base addresses for widths}
 10762  @!height_base:array[internal_font_number] of integer;
 10763    {base addresses for heights}
 10764  @!depth_base:array[internal_font_number] of integer;
 10765    {base addresses for depths}
 10766  @!italic_base:array[internal_font_number] of integer;
 10767    {base addresses for italic corrections}
 10768  @!lig_kern_base:array[internal_font_number] of integer;
 10769    {base addresses for ligature/kerning programs}
 10770  @!kern_base:array[internal_font_number] of integer;
 10771    {base addresses for kerns}
 10772  @!exten_base:array[internal_font_number] of integer;
 10773    {base addresses for extensible recipes}
 10774  @!param_base:array[internal_font_number] of integer;
 10775    {base addresses for font parameters}
 10776  
 10777  @ @<Set init...@>=
 10778  for k:=font_base to font_max do font_used[k]:=false;
 10779  
 10780  @ \TeX\ always knows at least one font, namely the null font. It has no
 10781  characters, and its seven parameters are all equal to zero.
 10782  
 10783  @<Initialize table...@>=
 10784  font_ptr:=null_font; fmem_ptr:=7;
 10785  font_name[null_font]:="nullfont"; font_area[null_font]:="";
 10786  hyphen_char[null_font]:="-"; skew_char[null_font]:=-1;
 10787  bchar_label[null_font]:=non_address;
 10788  font_bchar[null_font]:=non_char; font_false_bchar[null_font]:=non_char;
 10789  font_bc[null_font]:=1; font_ec[null_font]:=0;
 10790  font_size[null_font]:=0; font_dsize[null_font]:=0;
 10791  char_base[null_font]:=0; width_base[null_font]:=0;
 10792  height_base[null_font]:=0; depth_base[null_font]:=0;
 10793  italic_base[null_font]:=0; lig_kern_base[null_font]:=0;
 10794  kern_base[null_font]:=0; exten_base[null_font]:=0;
 10795  font_glue[null_font]:=null; font_params[null_font]:=7;
 10796  param_base[null_font]:=-1;
 10797  for k:=0 to 6 do font_info[k].sc:=0;
 10798  
 10799  @ @<Put each...@>=
 10800  primitive("nullfont",set_font,null_font);
 10801  @!@:null_font_}{\.{\\nullfont} primitive@>
 10802  text(frozen_null_font):="nullfont"; eqtb[frozen_null_font]:=eqtb[cur_val];
 10803  
 10804  @ Of course we want to define macros that suppress the detail of how font
 10805  information is actually packed, so that we don't have to write things like
 10806  $$\hbox{|font_info[width_base[f]+font_info[char_base[f]+c].qqqq.b0].sc|}$$
 10807  too often. The \.{WEB} definitions here make |char_info(f)(c)| the
 10808  |four_quarters| word of font information corresponding to character
 10809  |c| of font |f|. If |q| is such a word, |char_width(f)(q)| will be
 10810  the character's width; hence the long formula above is at least
 10811  abbreviated to
 10812  $$\hbox{|char_width(f)(char_info(f)(c))|.}$$
 10813  Usually, of course, we will fetch |q| first and look at several of its
 10814  fields at the same time.
 10815  
 10816  The italic correction of a character will be denoted by
 10817  |char_italic(f)(q)|, so it is analogous to |char_width|.  But we will get
 10818  at the height and depth in a slightly different way, since we usually want
 10819  to compute both height and depth if we want either one.  The value of
 10820  |height_depth(q)| will be the 8-bit quantity
 10821  $$b=|height_index|\times16+|depth_index|,$$ and if |b| is such a byte we
 10822  will write |char_height(f)(b)| and |char_depth(f)(b)| for the height and
 10823  depth of the character |c| for which |q=char_info(f)(c)|. Got that?
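
        For example (the number here is chosen only for illustration), suppose
        that |qo(q.b1)=83|. Then |height_depth(q)=83=5*16+3|, so |char_height(f)(83)|
        is |font_info[height_base[f]+5].sc| and |char_depth(f)(83)| is
        |font_info[depth_base[f]+3].sc|; one fetch of~|b| serves both lookups.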
 10824  
 10825  The tag field will be called |char_tag(q)|; the remainder byte will be
 10826  called |rem_byte(q)|, using a macro that we have already defined above.
 10827  
 10828  Access to a character's |width|, |height|, |depth|, and |tag| fields is
 10829  part of \TeX's inner loop, so we want these macros to produce code that is
 10830  as fast as possible under the circumstances.
 10831  @^inner loop@>
 10832  
 10833  @d char_info_end(#)==#].qqqq
 10834  @d char_info(#)==font_info[char_base[#]+char_info_end
 10835  @d char_width_end(#)==#.b0].sc
 10836  @d char_width(#)==font_info[width_base[#]+char_width_end
 10837  @d char_exists(#)==(#.b0>min_quarterword)
 10838  @d char_italic_end(#)==(qo(#.b2)) div 4].sc
 10839  @d char_italic(#)==font_info[italic_base[#]+char_italic_end
 10840  @d height_depth(#)==qo(#.b1)
 10841  @d char_height_end(#)==(#) div 16].sc
 10842  @d char_height(#)==font_info[height_base[#]+char_height_end
 10843  @d char_depth_end(#)==(#) mod 16].sc
 10844  @d char_depth(#)==font_info[depth_base[#]+char_depth_end
 10845  @d char_tag(#)==((qo(#.b2)) mod 4)
 10846  
 10847  @ The global variable |null_character| is set up to be a word of
 10848  |char_info| for a character that doesn't exist. Such a word provides a
 10849  convenient way to deal with erroneous situations.
 10850  
 10851  @<Glob...@>=
 10852  @!null_character:four_quarters; {nonexistent character information}
 10853  
 10854  @ @<Set init...@>=
 10855  null_character.b0:=min_quarterword; null_character.b1:=min_quarterword;
 10856  null_character.b2:=min_quarterword; null_character.b3:=min_quarterword;
 10857  
 10858  @ Here are some macros that help process ligatures and kerns.
 10859  We write |char_kern(f)(j)| to find the amount of kerning specified by
 10860  kerning command~|j| in font~|f|. If |j| is the |char_info| for a character
 10861  with a ligature/kern program, the first instruction of that program is either
 10862  |i=font_info[lig_kern_start(f)(j)]| or |font_info[lig_kern_restart(f)(i)]|,
 10863  depending on whether or not |skip_byte(i)<=stop_flag|.
 10864  
 10865  The constant |kern_base_offset| should be simplified, for \PASCAL\ compilers
 10866  that do not do local optimization.
 10867  @^system dependencies@>
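
        To see why the offset is there, consider a sketch under the assumption
        that |min_quarterword=0|, so that |kern_base_offset=32768|. A kern command~|j|
        has |op_byte(j)>=128|, hence |256*op_byte(j)+rem_byte(j)>=32768|; and
        |read_font_info| initially sets |kern_base[f]| to the address of the first
        kern minus |kern_base_offset|. Therefore |char_kern(f)(j)| refers to kern
        number |256*(op_byte(j)-128)+rem_byte(j)|, with no subtraction at run time.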
 10868  
 10869  @d char_kern_end(#)==256*op_byte(#)+rem_byte(#)].sc
 10870  @d char_kern(#)==font_info[kern_base[#]+char_kern_end
 10871  @d kern_base_offset==256*(128+min_quarterword)
 10872  @d lig_kern_start(#)==lig_kern_base[#]+rem_byte {beginning of lig/kern program}
 10873  @d lig_kern_restart_end(#)==256*op_byte(#)+rem_byte(#)+32768-kern_base_offset
 10874  @d lig_kern_restart(#)==lig_kern_base[#]+lig_kern_restart_end
 10875  
 10876  @ Font parameters are referred to as |slant(f)|, |space(f)|, etc.
 10877  
 10878  @d param_end(#)==param_base[#]].sc
 10879  @d param(#)==font_info[#+param_end
 10880  @d slant==param(slant_code) {slant to the right, per unit distance upward}
 10881  @d space==param(space_code) {normal space between words}
 10882  @d space_stretch==param(space_stretch_code) {stretch between words}
 10883  @d space_shrink==param(space_shrink_code) {shrink between words}
 10884  @d x_height==param(x_height_code) {one ex}
 10885  @d quad==param(quad_code) {one em}
 10886  @d extra_space==param(extra_space_code) {additional space at end of sentence}
 10887  
 10888  @<The em width for |cur_font|@>=quad(cur_font)
 10889  
 10890  @ @<The x-height for |cur_font|@>=x_height(cur_font)
 10891  
 10892  @ \TeX\ checks the information of a \.{TFM} file for validity as the
 10893  file is being read in, so that no further checks will be needed when
 10894  typesetting is going on. The somewhat tedious subroutine that does this
 10895  is called |read_font_info|. It has four parameters: the user font
 10896  identifier~|u|, the file name and area strings |nom| and |aire|, and the
 10897  ``at'' size~|s|. If |s|~is negative, it's the negative of a scale factor
 10898  to be applied to the design size; |s=-1000| is the normal case.
 10899  Otherwise |s| will be substituted for the design size; in this
 10900  case, |s| must be positive and less than $2048\rm\,pt$
 10901  (i.e., it must be less than $2^{27}$ when considered as an integer).
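
        For example, `\.{\\font\\x=cmr10 scaled 1200}' leads to |s=-1200|, and
        `\.{\\font\\x=cmr10 at 12pt}' leads to |s=786432|, since $12\rm\,pt$ is
        $12\cdot2^{16}$ scaled points. (The control sequence \.{\\x} and the choice
        of font are hypothetical; the caller that scans these options appears
        elsewhere in the program.)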
 10902  
 10903  The subroutine opens and closes a global file variable called |tfm_file|.
 10904  It returns the value of the internal font number that was just loaded.
 10905  If an error is detected, an error message is issued and no font
 10906  information is stored; |null_font| is returned in this case.
 10907  
 10908  @d bad_tfm=11 {label for |read_font_info|}
 10909  @d abort==goto bad_tfm {do this when the \.{TFM} data is wrong}
 10910  
 10911  @p function read_font_info(@!u:pointer;@!nom,@!aire:str_number;
 10912    @!s:scaled):internal_font_number; {input a \.{TFM} file}
 10913  label done,bad_tfm,not_found;
 10914  var k:font_index; {index into |font_info|}
 10915  @!file_opened:boolean; {was |tfm_file| successfully opened?}
 10916  @!lf,@!lh,@!bc,@!ec,@!nw,@!nh,@!nd,@!ni,@!nl,@!nk,@!ne,@!np:halfword;
 10917    {sizes of subfiles}
 10918  @!f:internal_font_number; {the new font's number}
 10919  @!g:internal_font_number; {the number to return}
 10920  @!a,@!b,@!c,@!d:eight_bits; {byte variables}
 10921  @!qw:four_quarters;@!sw:scaled; {accumulators}
 10922  @!bch_label:integer; {left boundary start location, or infinity}
 10923  @!bchar:0..256; {boundary character, or 256}
 10924  @!z:scaled; {the design size or the ``at'' size}
 10925  @!alpha:integer;@!beta:1..16;
 10926    {auxiliary quantities used in fixed-point multiplication}
 10927  begin g:=null_font;@/
 10928  @<Read and check the font data; |abort| if the \.{TFM} file is
 10929    malformed; if there's no room for this font, say so and |goto
 10930    done|; otherwise |incr(font_ptr)| and |goto done|@>;
 10931  bad_tfm: @<Report that the font won't be loaded@>;
 10932  done: if file_opened then b_close(tfm_file);
 10933  read_font_info:=g;
 10934  end;
 10935  
 10936  @ There are programs called \.{TFtoPL} and \.{PLtoTF} that convert
 10937  between the \.{TFM} format and a symbolic property-list format
 10938  that can be easily edited. These programs contain extensive
 10939  diagnostic information, so \TeX\ does not have to bother giving
 10940  precise details about why it rejects a particular \.{TFM} file.
 10941  @.TFtoPL@> @.PLtoTF@>
 10942  
 10943  @d start_font_error_message==print_err("Font "); sprint_cs(u);
 10944    print_char("="); print_file_name(nom,aire,"");
 10945    if s>=0 then
 10946      begin print(" at "); print_scaled(s); print("pt");
 10947      end
 10948    else if s<>-1000 then
 10949      begin print(" scaled "); print_int(-s);
 10950      end
 10951  
 10952  @<Report that the font won't be loaded@>=
 10953  start_font_error_message;
 10954  @.Font x=xx not loadable...@>
 10955  if file_opened then print(" not loadable: Bad metric (TFM) file")
 10956  else print(" not loadable: Metric (TFM) file not found");
 10957  help5("I wasn't able to read the size data for this font,")@/
 10958  ("so I will ignore the font specification.")@/
 10959  ("[Wizards can fix TFM files using TFtoPL/PLtoTF.]")@/
 10960  ("You might try inserting a different font spec;")@/
 10961  ("e.g., type `I\font<same font id>=<substitute font name>'.");
 10962  error
 10963  
 10964  @ @<Read and check...@>=
 10965  @<Open |tfm_file| for input@>;
 10966  @<Read the {\.{TFM}} size fields@>;
 10967  @<Use size fields to allocate font information@>;
 10968  @<Read the {\.{TFM}} header@>;
 10969  @<Read character data@>;
 10970  @<Read box dimensions@>;
 10971  @<Read ligature/kern program@>;
 10972  @<Read extensible character recipes@>;
 10973  @<Read font parameters@>;
 10974  @<Make final adjustments and |goto done|@>
 10975  
 10976  @ @<Open |tfm_file| for input@>=
 10977  file_opened:=false;
 10978  if aire="" then pack_file_name(nom,TEX_font_area,".tfm")
 10979  else pack_file_name(nom,aire,".tfm");
 10980  if not b_open_in(tfm_file) then abort;
 10981  file_opened:=true
 10982  
 10983  @ Note: A malformed \.{TFM} file might be shorter than it claims to be;
 10984  thus |eof(tfm_file)| might be true when |read_font_info| refers to
 10985  |tfm_file^| or when it says |get(tfm_file)|. If such circumstances
 10986  cause system error messages, you will have to defeat them somehow,
 10987  for example by defining |fget| to be `\ignorespaces|begin get(tfm_file);|
 10988  |if eof(tfm_file) then abort; end|\unskip'.
 10989  @^system dependencies@>
 10990  
 10991  @d fget==get(tfm_file)
 10992  @d fbyte==tfm_file^
 10993  @d read_sixteen(#)==begin #:=fbyte;
 10994    if #>127 then abort;
 10995    fget; #:=#*@'400+fbyte;
 10996    end
 10997  @d store_four_quarters(#)==begin fget; a:=fbyte; qw.b0:=qi(a);
 10998    fget; b:=fbyte; qw.b1:=qi(b);
 10999    fget; c:=fbyte; qw.b2:=qi(c);
 11000    fget; d:=fbyte; qw.b3:=qi(d);
 11001    #:=qw;
 11002    end
 11003  
 11004  @ @<Read the {\.{TFM}} size fields@>=
 11005  begin read_sixteen(lf);
 11006  fget; read_sixteen(lh);
 11007  fget; read_sixteen(bc);
 11008  fget; read_sixteen(ec);
 11009  if (bc>ec+1)or(ec>255) then abort;
 11010  if bc>255 then {|bc=256| and |ec=255|}
 11011    begin bc:=1; ec:=0;
 11012    end;
 11013  fget; read_sixteen(nw);
 11014  fget; read_sixteen(nh);
 11015  fget; read_sixteen(nd);
 11016  fget; read_sixteen(ni);
 11017  fget; read_sixteen(nl);
 11018  fget; read_sixteen(nk);
 11019  fget; read_sixteen(ne);
 11020  fget; read_sixteen(np);
 11021  if lf<>6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np then abort;
 11022  if (nw=0)or(nh=0)or(nd=0)or(ni=0) then abort;
 11023  end
 11024  
 11025  @ The preliminary settings of the index-offset variables |char_base|,
 11026  |width_base|, |lig_kern_base|, |kern_base|, and |exten_base| will be
 11027  corrected later by subtracting |min_quarterword| from them; and we will
 11028  subtract 1 from |param_base| too. It's best to forget about such anomalies
 11029  until later.
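
        Here is a small illustration with made-up sizes: if |fmem_ptr=7|, |bc=32|,
        |ec=126|, and |nw=20|, the assignments below give |char_base[f]=-25|, so that
        character~32 occupies |font_info[7]|; then |width_base[f]=102|, just after
        the 95~words of character data, and |height_base[f]=122|.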
 11030  
 11031  @<Use size fields to allocate font information@>=
 11032  lf:=lf-6-lh; {|lf| words should be loaded into |font_info|}
 11033  if np<7 then lf:=lf+7-np; {at least seven parameters will appear}
 11034  if (font_ptr=font_max)or(fmem_ptr+lf>font_mem_size) then
 11035    @<Apologize for not loading the font, |goto done|@>;
 11036  f:=font_ptr+1;
 11037  char_base[f]:=fmem_ptr-bc;
 11038  width_base[f]:=char_base[f]+ec+1;
 11039  height_base[f]:=width_base[f]+nw;
 11040  depth_base[f]:=height_base[f]+nh;
 11041  italic_base[f]:=depth_base[f]+nd;
 11042  lig_kern_base[f]:=italic_base[f]+ni;
 11043  kern_base[f]:=lig_kern_base[f]+nl-kern_base_offset;
 11044  exten_base[f]:=kern_base[f]+kern_base_offset+nk;
 11045  param_base[f]:=exten_base[f]+ne
 11046  
 11047  @ @<Apologize for not loading...@>=
 11048  begin start_font_error_message;
 11049  print(" not loaded: Not enough room left");
 11050  @.Font x=xx not loaded...@>
 11051  help4("I'm afraid I won't be able to make use of this font,")@/
 11052  ("because my memory for character-size data is too small.")@/
 11053  ("If you're really stuck, ask a wizard to enlarge me.")@/
 11054  ("Or maybe try `I\font<same font id>=<name of loaded font>'.");
 11055  error; goto done;
 11056  end
 11057  
 11058  @ Only the first two words of the header are needed by \TeX82.
 11059  
 11060  @<Read the {\.{TFM}} header@>=
 11061  begin if lh<2 then abort;
 11062  store_four_quarters(font_check[f]);
 11063  fget; read_sixteen(z); {this rejects a negative design size}
 11064  fget; z:=z*@'400+fbyte; fget; z:=(z*@'20)+(fbyte div@'20);
 11065  if z<unity then abort;
 11066  while lh>2 do
 11067    begin fget;fget;fget;fget;decr(lh); {ignore the rest of the header}
 11068    end;
 11069  font_dsize[f]:=z;
 11070  if s<>-1000 then
 11071    if s>=0 then z:=s
 11072    else z:=xn_over_d(z,-s,1000);
 11073  font_size[f]:=z;
 11074  end
 11075  
 11076  @ @<Read character data@>=
 11077  for k:=fmem_ptr to width_base[f]-1 do
 11078    begin store_four_quarters(font_info[k].qqqq);
 11079    if (a>=nw)or(b div @'20>=nh)or(b mod @'20>=nd)or
 11080      (c div 4>=ni) then abort;
 11081    case c mod 4 of
 11082    lig_tag: if d>=nl then abort;
 11083    ext_tag: if d>=ne then abort;
 11084    list_tag: @<Check for charlist cycle@>;
 11085    othercases do_nothing {|no_tag|}
 11086    endcases;
 11087    end
 11088  
 11089  @ We want to make sure that there is no cycle of characters linked together
 11090  by |list_tag| entries, since such a cycle would get \TeX\ into an endless
 11091  loop. If such a cycle exists, the routine here detects it when processing
 11092  the largest character code in the cycle.
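
        For instance, suppose (hypothetically) that characters 65, 66, and~67 all
        have |list_tag|, with successors 66, 67, and~65, respectively. Nothing
        happens at 65 or~66, because the successor exceeds the current character.
        At character~67 the loop starts with |d=65|, moves to |d=66| and then to
        |d=67|; since this equals the current character, the file is rejected.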
 11093  
 11094  @d check_byte_range(#)==begin if (#<bc)or(#>ec) then abort@+end
 11095  @d current_character_being_worked_on==k+bc-fmem_ptr
 11096  
 11097  @<Check for charlist cycle@>=
 11098  begin check_byte_range(d);
 11099  while d<current_character_being_worked_on do
 11100    begin qw:=char_info(f)(d);
 11101    {N.B.: not |qi(d)|, since |char_base[f]| hasn't been adjusted yet}
 11102    if char_tag(qw)<>list_tag then goto not_found;
 11103    d:=qo(rem_byte(qw)); {next character on the list}
 11104    end;
 11105  if d=current_character_being_worked_on then abort; {yes, there's a cycle}
 11106  not_found:end
 11107  
 11108  @ A |fix_word| whose four bytes are $(a,b,c,d)$ from left to right represents
 11109  the number
 11110  $$x=\left\{\vcenter{\halign{$#$,\hfil\qquad&if $#$\hfil\cr
 11111  b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=0;\cr
 11112  -16+b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=255.\cr}}\right.$$
 11113  (No other choices of |a| are allowed, since the magnitude of a number in
 11114  design-size units must be less than 16.)  We want to multiply this
 11115  quantity by the integer~|z|, which is known to be less than $2^{27}$.
 11116  If $|z|<2^{23}$, the individual multiplications $b\cdot z$,
 11117  $c\cdot z$, $d\cdot z$ cannot overflow; otherwise we will divide |z| by 2,
 11118  4, 8, or 16, to obtain a multiplier less than $2^{23}$, and we can
 11119  compensate for this later. If |z| has thereby been replaced by
 11120  $|z|^\prime=|z|/2^e$, let $\beta=2^{4-e}$; we shall compute
 11121  $$\lfloor(b+c\cdot2^{-8}+d\cdot2^{-16})\,z^\prime/\beta\rfloor$$
 11122  if $a=0$, or the same quantity minus $\alpha=2^{4+e}z^\prime$ if $a=255$.
 11123  This calculation must be done exactly, in order to guarantee portability
 11124  of \TeX\ between computers.
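
        Here is a worked instance, under the assumption that |z=655360|
        ($10\rm\,pt$). Since $|z|<2^{23}$ we have $e=0$, $\beta=16$, and
        $\alpha=16\cdot655360=10485760$. The bytes $(0,8,0,0)$ represent $x=1/2$, and
        the computation yields $\lfloor8\cdot655360/16\rfloor=327680$, which is
        $5\rm\,pt$. The bytes $(255,248,0,0)$ represent $x=-1/2$; this time
        $\lfloor248\cdot655360/16\rfloor=10158080$, and subtracting $\alpha$ gives
        $-327680$.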
 11125  
 11126  @d store_scaled(#)==begin fget; a:=fbyte; fget; b:=fbyte;
 11127    fget; c:=fbyte; fget; d:=fbyte;@/
 11128    sw:=(((((d*z)div@'400)+(c*z))div@'400)+(b*z))div beta;
 11129    if a=0 then #:=sw@+else if a=255 then #:=sw-alpha@+else abort;
 11130    end
 11131  
 11132  @<Read box dimensions@>=
 11133  begin @<Replace |z| by $|z|^\prime$ and compute $\alpha,\beta$@>;
 11134  for k:=width_base[f] to lig_kern_base[f]-1 do
 11135    store_scaled(font_info[k].sc);
 11136  if font_info[width_base[f]].sc<>0 then abort; {\\{width}[0] must be zero}
 11137  if font_info[height_base[f]].sc<>0 then abort; {\\{height}[0] must be zero}
 11138  if font_info[depth_base[f]].sc<>0 then abort; {\\{depth}[0] must be zero}
 11139  if font_info[italic_base[f]].sc<>0 then abort; {\\{italic}[0] must be zero}
 11140  end
 11141  
 11142  @ @<Replace |z|...@>=
 11143  begin alpha:=16;
 11144  while z>=@'40000000 do
 11145    begin z:=z div 2; alpha:=alpha+alpha;
 11146    end;
 11147  beta:=256 div alpha; alpha:=alpha*z;
 11148  end
 11149  
 11150  @ @d check_existence(#)==@t@>@;@/
 11151    begin check_byte_range(#);
 11152    qw:=char_info(f)(#); {N.B.: not |qi(#)|}
 11153    if not char_exists(qw) then abort;
 11154    end
 11155  
 11156  @<Read ligature/kern program@>=
 11157  bch_label:=@'77777; bchar:=256;
 11158  if nl>0 then
 11159    begin for k:=lig_kern_base[f] to kern_base[f]+kern_base_offset-1 do
 11160      begin store_four_quarters(font_info[k].qqqq);
 11161      if a>128 then
 11162        begin if 256*c+d>=nl then abort;
 11163        if a=255 then if k=lig_kern_base[f] then bchar:=b;
 11164        end
 11165      else begin if b<>bchar then check_existence(b);
 11166        if c<128 then check_existence(d) {check ligature}
 11167        else if 256*(c-128)+d>=nk then abort; {check kern}
 11168        if a<128 then if k-lig_kern_base[f]+a+1>=nl then abort;
 11169        end;
 11170      end;
 11171    if a=255 then bch_label:=256*c+d;
 11172    end;
 11173  for k:=kern_base[f]+kern_base_offset to exten_base[f]-1 do
 11174    store_scaled(font_info[k].sc);
 11175  
 11176  @ @<Read extensible character recipes@>=
 11177  for k:=exten_base[f] to param_base[f]-1 do
 11178    begin store_four_quarters(font_info[k].qqqq);
 11179    if a<>0 then check_existence(a);
 11180    if b<>0 then check_existence(b);
 11181    if c<>0 then check_existence(c);
 11182    check_existence(d);
 11183    end
 11184  
 11185  @ We check to see that the \.{TFM} file doesn't end prematurely; but
 11186  no error message is given for files having more than |lf| words.
 11187  
 11188  @<Read font parameters@>=
 11189  begin for k:=1 to np do
 11190    if k=1 then {the |slant| parameter is a pure number}
 11191      begin fget; sw:=fbyte; if sw>127 then sw:=sw-256;
 11192      fget; sw:=sw*@'400+fbyte; fget; sw:=sw*@'400+fbyte;
 11193      fget; font_info[param_base[f]].sc:=
 11194        (sw*@'20)+(fbyte div@'20);
 11195      end
 11196    else store_scaled(font_info[param_base[f]+k-1].sc);
 11197  if eof(tfm_file) then abort;
 11198  for k:=np+1 to 7 do font_info[param_base[f]+k-1].sc:=0;
 11199  end
 11200  
 11201  @ Now to wrap it up, we have checked all the necessary things about the \.{TFM}
 11202  file, and all we need to do is put the finishing touches on the data for
 11203  the new font.
 11204  
 11205  @d adjust(#)==#[f]:=qo(#[f])
 11206    {correct for the excess |min_quarterword| that was added}
 11207  
 11208  @<Make final adjustments...@>=
 11209  if np>=7 then font_params[f]:=np@+else font_params[f]:=7;
 11210  hyphen_char[f]:=default_hyphen_char; skew_char[f]:=default_skew_char;
 11211  if bch_label<nl then bchar_label[f]:=bch_label+lig_kern_base[f]
 11212  else bchar_label[f]:=non_address;
 11213  font_bchar[f]:=qi(bchar);
 11214  font_false_bchar[f]:=qi(bchar);
 11215  if bchar<=ec then if bchar>=bc then
 11216    begin qw:=char_info(f)(bchar); {N.B.: not |qi(bchar)|}
 11217    if char_exists(qw) then font_false_bchar[f]:=non_char;
 11218    end;
 11219  font_name[f]:=nom;
 11220  font_area[f]:=aire;
 11221  font_bc[f]:=bc; font_ec[f]:=ec; font_glue[f]:=null;
 11222  adjust(char_base); adjust(width_base); adjust(lig_kern_base);
 11223  adjust(kern_base); adjust(exten_base);
 11224  decr(param_base[f]);
 11225  fmem_ptr:=fmem_ptr+lf; font_ptr:=f; g:=f; goto done
 11226  
 11227  @ Before we forget about the format of these tables, let's deal with two
 11228  of \TeX's basic scanning routines related to font information.
 11229  
 11230  @<Declare procedures that scan font-related stuff@>=
 11231  procedure scan_font_ident;
 11232  var f:internal_font_number;
 11233  @!m:halfword;
 11234  begin @<Get the next non-blank non-call...@>;
 11235  if cur_cmd=def_font then f:=cur_font
 11236  else if cur_cmd=set_font then f:=cur_chr
 11237  else if cur_cmd=def_family then
 11238    begin m:=cur_chr; scan_four_bit_int; f:=equiv(m+cur_val);
 11239    end
 11240  else  begin print_err("Missing font identifier");
 11241  @.Missing font identifier@>
 11242    help2("I was looking for a control sequence whose")@/
 11243    ("current meaning has been defined by \font.");
 11244    back_error; f:=null_font;
 11245    end;
 11246  cur_val:=f;
 11247  end;
 11248  
 11249  @ The following routine is used to implement `\.{\\fontdimen} |n| |f|'.
 11250  The boolean parameter |writing| is set |true| if the calling program
 11251  intends to change the parameter value.
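
        For example, if font~|f| is the most recently loaded font and has just
        seven parameters, a reference to parameter number~8 of~|f| extends the
        font by one zero-valued parameter and sets |cur_val| to its location;
        for any font loaded earlier, the same reference leads to the error
        message below.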
 11252  
 11253  @<Declare procedures that scan font-related stuff@>=
 11254  procedure find_font_dimen(@!writing:boolean);
 11255    {sets |cur_val| to |font_info| location}
 11256  var f:internal_font_number;
 11257  @!n:integer; {the parameter number}
 11258  begin scan_int; n:=cur_val; scan_font_ident; f:=cur_val;
 11259  if n<=0 then cur_val:=fmem_ptr
 11260  else  begin if writing and(n<=space_shrink_code)and@|
 11261      (n>=space_code)and(font_glue[f]<>null) then
 11262      begin delete_glue_ref(font_glue[f]);
 11263      font_glue[f]:=null;
 11264      end;
 11265    if n>font_params[f] then
 11266      if f<font_ptr then cur_val:=fmem_ptr
 11267      else @<Increase the number of parameters in the last font@>
 11268    else cur_val:=n+param_base[f];
 11269    end;
 11270  @<Issue an error message if |cur_val=fmem_ptr|@>;
 11271  end;
 11272  
 11273  @ @<Issue an error message if |cur_val=fmem_ptr|@>=
 11274  if cur_val=fmem_ptr then
 11275    begin print_err("Font "); print_esc(font_id_text(f));
 11276    print(" has only "); print_int(font_params[f]);
 11277    print(" fontdimen parameters");
 11278  @.Font x has only...@>
 11279    help2("To increase the number of font parameters, you must")@/
 11280      ("use \fontdimen immediately after the \font is loaded.");
 11281    error;
 11282    end
 11283  
 11284  @ @<Increase the number of parameters...@>=
 11285  begin repeat if fmem_ptr=font_mem_size then
 11286    overflow("font memory",font_mem_size);
 11287  @:TeX capacity exceeded font memory}{\quad font memory@>
 11288  font_info[fmem_ptr].sc:=0; incr(fmem_ptr); incr(font_params[f]);
 11289  until n=font_params[f];
 11290  cur_val:=fmem_ptr-1; {this equals |param_base[f]+font_params[f]|}
 11291  end
 11292  
 11293  @ When \TeX\ wants to typeset a character that doesn't exist, the
 11294  character node is not created; thus the output routine can assume
 11295  that characters exist when it sees them. The following procedure
 11296  prints a warning message unless the user has suppressed it.
 11297  
 11298  @p procedure char_warning(@!f:internal_font_number;@!c:eight_bits);
 11299  begin if tracing_lost_chars>0 then
 11300    begin begin_diagnostic;
 11301    print_nl("Missing character: There is no ");
 11302  @.Missing character@>
 11303    print_ASCII(c); print(" in font ");
 11304    slow_print(font_name[f]); print_char("!"); end_diagnostic(false);
 11305    end;
 11306  end;
 11307  
 11308  @ Here is a function that returns a pointer to a character node for a
 11309  given character in a given font. If that character doesn't exist,
 11310  |null| is returned instead.
 11311  
 11312  @p function new_character(@!f:internal_font_number;@!c:eight_bits):pointer;
 11313  label exit;
 11314  var p:pointer; {newly allocated node}
 11315  begin if font_bc[f]<=c then if font_ec[f]>=c then
 11316    if char_exists(char_info(f)(qi(c))) then
 11317      begin p:=get_avail; font(p):=f; character(p):=qi(c);
 11318      new_character:=p; return;
 11319      end;
 11320  char_warning(f,c);
 11321  new_character:=null;
 11322  exit:end;
 11323  
 11324  @* \[31] Device-independent file format.
 11325  The most important output produced by a run of \TeX\ is the ``device
 11326  independent'' (\.{DVI}) file that specifies where characters and rules
 11327  are to appear on printed pages. The form of these files was designed by
 11328  David R. Fuchs in 1979. Almost any reasonable typesetting device can be
 11329  @^Fuchs, David Raymond@>
 11330  @:DVI_files}{\.{DVI} files@>
 11331  driven by a program that takes \.{DVI} files as input, and dozens of such
 11332  \.{DVI}-to-whatever programs have been written. Thus, it is possible to
 11333  print the output of \TeX\ on many different kinds of equipment, using \TeX\
 11334  as a device-independent ``front end.''
 11335  
 11336  A \.{DVI} file is a stream of 8-bit bytes, which may be regarded as a
 11337  series of commands in a machine-like language. The first byte of each command
 11338  is the operation code, and this code is followed by zero or more bytes
 11339  that provide parameters to the command. The parameters themselves may consist
 11340  of several consecutive bytes; for example, the `|set_rule|' command has two
 11341  parameters, each of which is four bytes long. Parameters are usually
 11342  regarded as nonnegative integers; but four-byte-long parameters,
 11343  and shorter parameters that denote distances, can be
 11344  either positive or negative. Such parameters are given in two's complement
 11345  notation. For example, a two-byte-long distance parameter has a value between
 11346  $-2^{15}$ and $2^{15}-1$. As in \.{TFM} files, numbers that occupy
 11347  more than one byte position appear in BigEndian order.
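
        For example, the two bytes 255 and 254 stand for $-2$ when they form a
        distance parameter, since $255\cdot2^8+254-2^{16}=-2$; as a parameter that
        is regarded as nonnegative, they would stand for 65534.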
 11348  
 11349  A \.{DVI} file consists of a ``preamble,'' followed by a sequence of one
 11350  or more ``pages,'' followed by a ``postamble.'' The preamble is simply a
 11351  |pre| command, with its parameters that define the dimensions used in the
 11352  file; this must come first.  Each ``page'' consists of a |bop| command,
 11353  followed by any number of other commands that tell where characters are to
 11354  be placed on a physical page, followed by an |eop| command. The pages
 11355  appear in the order that \TeX\ generated them. If we ignore |nop| commands
 11356  and \\{fnt\_def} commands (which are allowed between any two commands in
 11357  the file), each |eop| command is immediately followed by a |bop| command,
 11358  or by a |post| command; in the latter case, there are no more pages in the
 11359  file, and the remaining bytes form the postamble.  Further details about
 11360  the postamble will be explained later.
 11361  
 11362  Some parameters in \.{DVI} commands are ``pointers.'' These are four-byte
 11363  quantities that give the location number of some other byte in the file;
 11364  the first byte is number~0, then comes number~1, and so on. For example,
 11365  one of the parameters of a |bop| command points to the previous |bop|;
 11366  this makes it feasible to read the pages in backwards order, in case the
 11367  results are being directed to a device that stacks its output face up.
 11368  Suppose the preamble of a \.{DVI} file occupies bytes 0 to 99. Now if the
 11369  first page occupies bytes 100 to 999, say, and if the second
 11370  page occupies bytes 1000 to 1999, then the |bop| that starts in byte 1000
 11371  points to 100 and the |bop| that starts in byte 2000 points to 1000. (The
 11372  very first |bop|, i.e., the one starting in byte 100, has a pointer of~$-1$.)
 11373  
 11374  @ The \.{DVI} format is intended to be both compact and easily interpreted
 11375  by a machine. Compactness is achieved by making most of the information
 11376  implicit instead of explicit. When a \.{DVI}-reading program reads the
 11377  commands for a page, it keeps track of several quantities: (a)~The current
 11378  font |f| is an integer; this value is changed only
 11379  by \\{fnt} and \\{fnt\_num} commands. (b)~The current position on the page
 11380  is given by two numbers called the horizontal and vertical coordinates,
 11381  |h| and |v|. Both coordinates are zero at the upper left corner of the page;
 11382  moving to the right corresponds to increasing the horizontal coordinate, and
 11383  moving down corresponds to increasing the vertical coordinate. Thus, the
 11384  coordinates are essentially Cartesian, except that vertical directions are
 11385  flipped; the Cartesian version of |(h,v)| would be |(h,-v)|.  (c)~The
 11386  current spacing amounts are given by four numbers |w|, |x|, |y|, and |z|,
 11387  where |w| and~|x| are used for horizontal spacing and where |y| and~|z|
 11388  are used for vertical spacing. (d)~There is a stack containing
 11389  |(h,v,w,x,y,z)| values; the \.{DVI} commands |push| and |pop| are used to
 11390  change the current level of operation. Note that the current font~|f| is
 11391  not pushed and popped; the stack contains only information about
 11392  positioning.
 11393  
 11394  The values of |h|, |v|, |w|, |x|, |y|, and |z| are signed integers having up
 11395  to 32 bits, including the sign. Since they represent physical distances,
 11396  there is a small unit of measurement such that increasing |h| by~1 means
 11397  moving a certain tiny distance to the right. The actual unit of
 11398  measurement is variable, as explained below; \TeX\ sets things up so that
 11399  its \.{DVI} output is in sp units, i.e., scaled points, in agreement with
 11400  all the |scaled| dimensions in \TeX's data structures.
 11401  
 11402  @ Here is a list of all the commands that may appear in a \.{DVI} file. Each
 11403  command is specified by its symbolic name (e.g., |bop|), its opcode byte
 11404  (e.g., 139), and its parameters (if any). The parameters are followed
 11405  by a bracketed number telling how many bytes they occupy; for example,
 11406  `|p[4]|' means that parameter |p| is four bytes long.
 11407  
 11408  \yskip\hang|set_char_0| 0. Typeset character number~0 from font~|f|
 11409  such that the reference point of the character is at |(h,v)|. Then
 11410  increase |h| by the width of that character. Note that a character may
 11411  have zero or negative width, so one cannot be sure that |h| will advance
 11412  after this command; but |h| usually does increase.
 11413  
 11414  \yskip\hang\\{set\_char\_1} through \\{set\_char\_127} (opcodes 1 to 127).
 11415  Do the operations of |set_char_0|; but use the character whose number
 11416  matches the opcode, instead of character~0.
 11417  
 11418  \yskip\hang|set1| 128 |c[1]|. Same as |set_char_0|, except that character
 11419  number~|c| is typeset. \TeX82 uses this command for characters in the
 11420  range |128<=c<256|.
 11421  
 11422  \yskip\hang|@!set2| 129 |c[2]|. Same as |set1|, except that |c|~is two
 11423  bytes long, so it is in the range |0<=c<65536|. \TeX82 never uses this
 11424  command, but it should come in handy for extensions of \TeX\ that deal
 11425  with oriental languages.
 11426  @^oriental characters@>@^Chinese characters@>@^Japanese characters@>
 11427  
 11428  \yskip\hang|@!set3| 130 |c[3]|. Same as |set1|, except that |c|~is three
 11429  bytes long, so it can be as large as $2^{24}-1$. Not even the Chinese
 11430  language has this many characters, but this command might prove useful
 11431  in some yet unforeseen extension.
 11432  
 11433  \yskip\hang|@!set4| 131 |c[4]|. Same as |set1|, except that |c|~is four
 11434  bytes long. Imagine that.
 11435  
 11436  \yskip\hang|set_rule| 132 |a[4]| |b[4]|. Typeset a solid black rectangle
 11437  of height~|a| and width~|b|, with its bottom left corner at |(h,v)|. Then
 11438  set |h:=h+b|. If either |a<=0| or |b<=0|, nothing should be typeset. Note
 11439  that if |b<0|, the value of |h| will decrease even though nothing else happens.
 11440  See below for details about how to typeset rules so that consistency with
 11441  \MF\ is guaranteed.
 11442  
 11443  \yskip\hang|@!put1| 133 |c[1]|. Typeset character number~|c| from font~|f|
 11444  such that the reference point of the character is at |(h,v)|. (The `put'
 11445  commands are exactly like the `set' commands, except that they simply put out a
 11446  character or a rule without moving the reference point afterwards.)
 11447  
 11448  \yskip\hang|@!put2| 134 |c[2]|. Same as |set2|, except that |h| is not changed.
 11449  
 11450  \yskip\hang|@!put3| 135 |c[3]|. Same as |set3|, except that |h| is not changed.
 11451  
 11452  \yskip\hang|@!put4| 136 |c[4]|. Same as |set4|, except that |h| is not changed.
 11453  
 11454  \yskip\hang|put_rule| 137 |a[4]| |b[4]|. Same as |set_rule|, except that
 11455  |h| is not changed.
 11456  
 11457  \yskip\hang|nop| 138. No operation, do nothing. Any number of |nop|'s
 11458  may occur between \.{DVI} commands, but a |nop| cannot be inserted between
 11459  a command and its parameters or between two parameters.
 11460  
 11461  \yskip\hang|bop| 139 $c_0[4]$ $c_1[4]$ $\ldots$ $c_9[4]$ $p[4]$. Beginning
 11462  of a page: Set |(h,v,w,x,y,z):=(0,0,0,0,0,0)| and set the stack empty. Set
 11463  the current font |f| to an undefined value.  The ten $c_i$ parameters hold
 11464  the values of \.{\\count0} $\ldots$ \.{\\count9} in \TeX\ at the time
 11465  \.{\\shipout} was invoked for this page; they can be used to identify
 11466  pages, if a user wants to print only part of a \.{DVI} file. The parameter
 11467  |p| points to the previous |bop| in the file; the first
 11468  |bop| has $p=-1$.
 11469  
 11470  \yskip\hang|eop| 140.  End of page: Print what you have read since the
 11471  previous |bop|. At this point the stack should be empty. (The \.{DVI}-reading
 11472  programs that drive most output devices will have kept a buffer of the
 11473  material that appears on the page that has just ended. This material is
 11474  largely, but not entirely, in order by |v| coordinate and (for fixed |v|) by
 11475  |h|~coordinate; so it usually needs to be sorted into some order that is
 11476  appropriate for the device in question.)
 11477  
 11478  \yskip\hang|push| 141. Push the current values of |(h,v,w,x,y,z)| onto the
 11479  top of the stack; do not change any of these values. Note that |f| is
 11480  not pushed.
 11481  
 11482  \yskip\hang|pop| 142. Pop the top six values off of the stack and assign
 11483  them respectively to |(h,v,w,x,y,z)|. The number of pops should never
 11484  exceed the number of pushes, since it would be highly embarrassing if the
 11485  stack were empty at the time of a |pop| command.
 11486  
 11487  \yskip\hang|right1| 143 |b[1]|. Set |h:=h+b|, i.e., move right |b| units.
 11488  The parameter is a signed number in two's complement notation, |-128<=b<128|;
 11489  if |b<0|, the reference point moves left.
 11490  
 11491  \yskip\hang|@!right2| 144 |b[2]|. Same as |right1|, except that |b| is a
 11492  two-byte quantity in the range |-32768<=b<32768|.
 11493  
 11494  \yskip\hang|@!right3| 145 |b[3]|. Same as |right1|, except that |b| is a
 11495  three-byte quantity in the range |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
 11496  
 11497  \yskip\hang|@!right4| 146 |b[4]|. Same as |right1|, except that |b| is a
 11498  four-byte quantity in the range |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
 11499  
 11500  \yskip\hang|w0| 147. Set |h:=h+w|; i.e., move right |w| units. With luck,
 11501  this parameterless command will usually suffice, because the same kind of motion
 11502  will occur several times in succession; the following commands explain how
 11503  |w| gets particular values.
 11504  
 11505  \yskip\hang|w1| 148 |b[1]|. Set |w:=b| and |h:=h+b|. The value of |b| is a
 11506  signed quantity in two's complement notation, |-128<=b<128|. This command
 11507  changes the current |w|~spacing and moves right by |b|.
 11508  
 11509  \yskip\hang|@!w2| 149 |b[2]|. Same as |w1|, but |b| is two bytes long,
 11510  |-32768<=b<32768|.
 11511  
 11512  \yskip\hang|@!w3| 150 |b[3]|. Same as |w1|, but |b| is three bytes long,
 11513  |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
 11514  
 11515  \yskip\hang|@!w4| 151 |b[4]|. Same as |w1|, but |b| is four bytes long,
 11516  |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
 11517  
 11518  \yskip\hang|x0| 152. Set |h:=h+x|; i.e., move right |x| units. The `|x|'
 11519  commands are like the `|w|' commands except that they involve |x| instead
 11520  of |w|.
 11521  
 11522  \yskip\hang|x1| 153 |b[1]|. Set |x:=b| and |h:=h+b|. The value of |b| is a
 11523  signed quantity in two's complement notation, |-128<=b<128|. This command
 11524  changes the current |x|~spacing and moves right by |b|.
 11525  
 11526  \yskip\hang|@!x2| 154 |b[2]|. Same as |x1|, but |b| is two bytes long,
 11527  |-32768<=b<32768|.
 11528  
 11529  \yskip\hang|@!x3| 155 |b[3]|. Same as |x1|, but |b| is three bytes long,
 11530  |@t$-2^{23}$@><=b<@t$2^{23}$@>|.
 11531  
 11532  \yskip\hang|@!x4| 156 |b[4]|. Same as |x1|, but |b| is four bytes long,
 11533  |@t$-2^{31}$@><=b<@t$2^{31}$@>|.
 11534  
 11535  \yskip\hang|down1| 157 |a[1]|. Set |v:=v+a|, i.e., move down |a| units.
 11536  The parameter is a signed number in two's complement notation, |-128<=a<128|;
 11537  if |a<0|, the reference point moves up.
 11538  
 11539  \yskip\hang|@!down2| 158 |a[2]|. Same as |down1|, except that |a| is a
 11540  two-byte quantity in the range |-32768<=a<32768|.
 11541  
 11542  \yskip\hang|@!down3| 159 |a[3]|. Same as |down1|, except that |a| is a
 11543  three-byte quantity in the range |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
 11544  
 11545  \yskip\hang|@!down4| 160 |a[4]|. Same as |down1|, except that |a| is a
 11546  four-byte quantity in the range |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
 11547  
 11548  \yskip\hang|y0| 161. Set |v:=v+y|; i.e., move down |y| units. With luck,
 11549  this parameterless command will usually suffice, because the same kind of motion
 11550  will occur several times in succession; the following commands explain how
 11551  |y| gets particular values.
 11552  
 11553  \yskip\hang|y1| 162 |a[1]|. Set |y:=a| and |v:=v+a|. The value of |a| is a
 11554  signed quantity in two's complement notation, |-128<=a<128|. This command
 11555  changes the current |y|~spacing and moves down by |a|.
 11556  
 11557  \yskip\hang|@!y2| 163 |a[2]|. Same as |y1|, but |a| is two bytes long,
 11558  |-32768<=a<32768|.
 11559  
 11560  \yskip\hang|@!y3| 164 |a[3]|. Same as |y1|, but |a| is three bytes long,
 11561  |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
 11562  
 11563  \yskip\hang|@!y4| 165 |a[4]|. Same as |y1|, but |a| is four bytes long,
 11564  |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
 11565  
 11566  \yskip\hang|z0| 166. Set |v:=v+z|; i.e., move down |z| units. The `|z|' commands
 11567  are like the `|y|' commands except that they involve |z| instead of |y|.
 11568  
 11569  \yskip\hang|z1| 167 |a[1]|. Set |z:=a| and |v:=v+a|. The value of |a| is a
 11570  signed quantity in two's complement notation, |-128<=a<128|. This command
 11571  changes the current |z|~spacing and moves down by |a|.
 11572  
 11573  \yskip\hang|@!z2| 168 |a[2]|. Same as |z1|, but |a| is two bytes long,
 11574  |-32768<=a<32768|.
 11575  
 11576  \yskip\hang|@!z3| 169 |a[3]|. Same as |z1|, but |a| is three bytes long,
 11577  |@t$-2^{23}$@><=a<@t$2^{23}$@>|.
 11578  
 11579  \yskip\hang|@!z4| 170 |a[4]|. Same as |z1|, but |a| is four bytes long,
 11580  |@t$-2^{31}$@><=a<@t$2^{31}$@>|.
 11581  
 11582  \yskip\hang|fnt_num_0| 171. Set |f:=0|. Font 0 must previously have been
 11583  defined by a \\{fnt\_def} instruction, as explained below.
 11584  
 11585  \yskip\hang\\{fnt\_num\_1} through \\{fnt\_num\_63} (opcodes 172 to 234). Set
 11586  |f:=1|, \dots, \hbox{|f:=63|}, respectively.
 11587  
 11588  \yskip\hang|fnt1| 235 |k[1]|. Set |f:=k|. \TeX82 uses this command for font
 11589  numbers in the range |64<=k<256|.
 11590  
 11591  \yskip\hang|@!fnt2| 236 |k[2]|. Same as |fnt1|, except that |k|~is two
 11592  bytes long, so it is in the range |0<=k<65536|. \TeX82 never generates this
 11593  command, but large font numbers may prove useful for specifications of
 11594  color or texture, or they may be used for special fonts that have fixed
 11595  numbers in some external coding scheme.
 11596  
 11597  \yskip\hang|@!fnt3| 237 |k[3]|. Same as |fnt1|, except that |k|~is three
 11598  bytes long, so it can be as large as $2^{24}-1$.
 11599  
 11600  \yskip\hang|@!fnt4| 238 |k[4]|. Same as |fnt1|, except that |k|~is four
 11601  bytes long; this is for the really big font numbers (and for the negative ones).
 11602  
 11603  \yskip\hang|xxx1| 239 |k[1]| |x[k]|. This command is undefined in
 11604  general; it functions as a $(k+2)$-byte |nop| unless special \.{DVI}-reading
 11605  programs are being used. \TeX82 generates |xxx1| when a short enough
 11606  \.{\\special} appears, setting |k| to the number of bytes being sent. It
 11607  is recommended that |x| be a string having the form of a keyword followed
 11608  by possible parameters relevant to that keyword.
 11609  
 11610  \yskip\hang|@!xxx2| 240 |k[2]| |x[k]|. Like |xxx1|, but |0<=k<65536|.
 11611  
 11612  \yskip\hang|@!xxx3| 241 |k[3]| |x[k]|. Like |xxx1|, but |0<=k<@t$2^{24}$@>|.
 11613  
 11614  \yskip\hang|xxx4| 242 |k[4]| |x[k]|. Like |xxx1|, but |k| can be ridiculously
 11615  large. \TeX82 uses |xxx4| when sending a string of length 256 or more.
 11616  
 11617  \yskip\hang|fnt_def1| 243 |k[1]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
 11618  Define font |k|, where |0<=k<256|; font definitions will be explained shortly.
 11619  
 11620  \yskip\hang|@!fnt_def2| 244 |k[2]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
 11621  Define font |k|, where |0<=k<65536|.
 11622  
 11623  \yskip\hang|@!fnt_def3| 245 |k[3]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
 11624  Define font |k|, where |0<=k<@t$2^{24}$@>|.
 11625  
 11626  \yskip\hang|@!fnt_def4| 246 |k[4]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.
 11627  Define font |k|, where |@t$-2^{31}$@><=k<@t$2^{31}$@>|.
 11628  
 11629  \yskip\hang|pre| 247 |i[1]| |num[4]| |den[4]| |mag[4]| |k[1]| |x[k]|.
 11630  Beginning of the preamble; this must come at the very beginning of the
 11631  file. Parameters |i|, |num|, |den|, |mag|, |k|, and |x| are explained below.
 11632  
 11633  \yskip\hang|post| 248. Beginning of the postamble; see below.
 11634  
 11635  \yskip\hang|post_post| 249. Ending of the postamble; see below.
 11636  
 11637  \yskip\noindent Commands 250--255 are undefined at the present time.
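
The motion commands in the list above can be exercised with a small sketch in Python. It is only an illustration under stated assumptions, not part of \TeX: the function |run| and the table |FAMILIES| are names invented here; only |nop|, |push|, |pop|, and the \\{right}, |w|, |x|, \\{down}, |y|, and |z| families are handled; and parameters are assumed to arrive most significant byte first, the order in which |dvi_four| emits them.

```python
# Illustrative sketch only: a toy interpreter for the DVI motion commands.
# run() and FAMILIES are hypothetical names, not part of TeX or of any library.

# First opcode of each family that takes a 1- to 4-byte signed parameter.
FAMILIES = {"right": 143, "w": 148, "x": 153, "down": 157, "y": 162, "z": 167}


def run(commands):
    """Execute a byte string of motion commands and return the final (h, v)."""
    h = v = w = x = y = z = 0
    stack = []
    i = 0
    while i < len(commands):
        op = commands[i]
        i += 1
        if op == 138:                      # nop
            continue
        if op == 141:                      # push (the font f is not saved)
            stack.append((h, v, w, x, y, z))
            continue
        if op == 142:                      # pop
            h, v, w, x, y, z = stack.pop()
            continue
        if op == 147:                      # w0
            h += w
            continue
        if op == 152:                      # x0
            h += x
            continue
        if op == 161:                      # y0
            v += y
            continue
        if op == 166:                      # z0
            v += z
            continue
        for name, first in FAMILIES.items():
            if first <= op < first + 4:
                nbytes = op - first + 1    # right1..right4 take 1..4 bytes, etc.
                break
        else:
            raise ValueError(f"opcode {op} is not handled by this sketch")
        b = int.from_bytes(commands[i:i + nbytes], "big", signed=True)
        i += nbytes
        if name == "right":
            h += b
        elif name == "w":
            w = b
            h += b
        elif name == "x":
            x = b
            h += b
        elif name == "down":
            v += b
        elif name == "y":
            y = b
            v += b
        else:                              # the z family
            z = b
            v += b
    return h, v


# w1 10, w0, push, right1 -3, down2 300, pop, y1 5, y0
demo = bytes([148, 10, 147, 141, 143, 0xFD, 158, 0x01, 0x2C, 142, 162, 5, 161])
print(run(demo))
```

In this thirteen-byte example the |pop| restores the position saved by the |push|, so the motion made inside the group is forgotten and the final reference point is $(h,v)=(20,10)$.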
 11638  
 11639  @ @d set_char_0=0 {typeset character 0 and move right}
 11640  @d set1=128 {typeset a character and move right}
 11641  @d set_rule=132 {typeset a rule and move right}
 11642  @d put_rule=137 {typeset a rule}
 11643  @d nop=138 {no operation}
 11644  @d bop=139 {beginning of page}
 11645  @d eop=140 {ending of page}
 11646  @d push=141 {save the current positions}
 11647  @d pop=142 {restore previous positions}
 11648  @d right1=143 {move right}
 11649  @d w0=147 {move right by |w|}
 11650  @d w1=148 {move right and set |w|}
 11651  @d x0=152 {move right by |x|}
 11652  @d x1=153 {move right and set |x|}
 11653  @d down1=157 {move down}
 11654  @d y0=161 {move down by |y|}
 11655  @d y1=162 {move down and set |y|}
 11656  @d z0=166 {move down by |z|}
 11657  @d z1=167 {move down and set |z|}
 11658  @d fnt_num_0=171 {set current font to 0}
 11659  @d fnt1=235 {set current font}
 11660  @d xxx1=239 {extension to \.{DVI} primitives}
 11661  @d xxx4=242 {potentially long extension to \.{DVI} primitives}
 11662  @d fnt_def1=243 {define the meaning of a font number}
 11663  @d pre=247 {preamble}
 11664  @d post=248 {postamble beginning}
 11665  @d post_post=249 {postamble ending}
 11666  
 11667  @ The preamble contains basic information about the file as a whole. As
 11668  stated above, there are six parameters:
 11669  $$\hbox{|@!i[1]| |@!num[4]| |@!den[4]| |@!mag[4]| |@!k[1]| |@!x[k]|.}$$
 11670  The |i| byte identifies \.{DVI} format; currently this byte is always set
 11671  to~2. (The value |i=3| is currently used for an extended format that
 11672  allows a mixture of right-to-left and left-to-right typesetting.
 11673  Some day we will set |i=4|, when \.{DVI} format makes another
 11674  incompatible change---perhaps in the year 2048.)
 11675  
 11676  The next two parameters, |num| and |den|, are positive integers that define
 11677  the units of measurement; they are the numerator and denominator of a
 11678  fraction by which all dimensions in the \.{DVI} file could be multiplied
 11679  in order to get lengths in units of $10^{-7}$ meters. Since $\rm 7227{pt} =
 11680  254{cm}$, and since \TeX\ works with scaled points where there are $2^{16}$
 11681  sp in a point, \TeX\ sets
 11682  $|num|/|den|=(254\cdot10^5)/(7227\cdot2^{16})=25400000/473628672$.
 11683  @^sp@>
 11684  
 11685  The |mag| parameter is what \TeX\ calls \.{\\mag}, i.e., 1000 times the
 11686  desired magnification. The actual fraction by which dimensions are
 11687  multiplied is therefore $|mag|\cdot|num|/1000|den|$. Note that if a \TeX\
 11688  source document does not call for any `\.{true}' dimensions, and if you
 11689  change it only by specifying a different \.{\\mag} setting, the \.{DVI}
 11690  file that \TeX\ creates will be completely unchanged except for the value
 11691  of |mag| in the preamble and postamble. (Fancy \.{DVI}-reading programs allow
 11692  users to override the |mag|~setting when a \.{DVI} file is being printed.)
 11693  
 11694  Finally, |k| and |x| allow the \.{DVI} writer to include a comment, which is not
 11695  interpreted further. The length of comment |x| is |k|, where |0<=k<256|.
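
The arithmetic of |num|, |den|, and |mag| can be checked with a short Python sketch. The names |parse_preamble| and |dvi_units_to_mm| are invented for this illustration, and the comment string in the sample preamble is made up; exact rational arithmetic is used so that the identity $\rm7227\,pt=254\,cm$ comes out exactly.

```python
# Illustrative sketch only; the function names are hypothetical.
import struct
from fractions import Fraction


def parse_preamble(data):
    """Return (i, num, den, mag, comment) from the start of a DVI byte string."""
    if data[0] != 247:
        raise ValueError("a DVI file must begin with pre (247)")
    i, num, den, mag, k = struct.unpack_from(">BiiiB", data, 1)
    return i, num, den, mag, bytes(data[15:15 + k])


def dvi_units_to_mm(length, num, den, mag):
    """Convert a DVI dimension to millimeters, including magnification."""
    tenth_microns = Fraction(mag * num, 1000 * den) * length  # units of 10^-7 m
    return tenth_microns / 10000                              # 1 mm = 10^4 such units


# A preamble with the values TeX writes for num and den; the comment is made up.
demo = (bytes([247, 2])
        + struct.pack(">iii", 25400000, 473628672, 1000)
        + bytes([4]) + b"demo")
i, num, den, mag, comment = parse_preamble(demo)
print(dvi_units_to_mm(7227 * 65536, num, den, mag))  # 7227pt expressed in mm
```

With |mag=1000| the 7227 points come out as exactly 2540 millimeters; doubling |mag| doubles every converted length while the dimensions stored in the file stay the same.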
 11696  
 11697  @d id_byte=2 {identifies the kind of \.{DVI} files described here}
 11698  
 11699  @ Font definitions for a given font number |k| contain further parameters
 11700  $$\hbox{|c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.}$$
 11701  The four-byte value |c| is the check sum that \TeX\ found in the \.{TFM}
 11702  file for this font; |c| should match the check sum of the font found by
 11703  programs that read this \.{DVI} file.
 11704  @^check sum@>
 11705  
 11706  Parameter |s| contains a fixed-point scale factor that is applied to
 11707  the character widths in font |k|; font dimensions in \.{TFM} files and
 11708  other font files are relative to this quantity, which is called the
 11709  ``at size'' elsewhere in this documentation. The value of |s| is
 11710  always positive and less than $2^{27}$. It is given in the same units
 11711  as the other \.{DVI} dimensions, i.e., in sp when \TeX82 has made the
 11712  file.  Parameter |d| is similar to |s|; it is the ``design size,'' and
 11713  (like~|s|) it is given in \.{DVI} units. Thus, font |k| is to be used
 11714  at $|mag|\cdot s/1000d$ times its normal size.
 11715  
 11716  The remaining part of a font definition gives the external name of the font,
 11717  which is an ASCII string of length |a+l|. The number |a| is the length
 11718  of the ``area'' or directory, and |l| is the length of the font name itself;
 11719  the standard local system font area is supposed to be used when |a=0|.
 11720  The |n| field contains the area in its first |a| bytes.
 11721  
 11722  Font definitions must appear before the first use of a particular font number.
 11723  Once font |k| is defined, it must not be defined again; however, we
 11724  shall see below that font definitions appear in the postamble as well as
 11725  in the pages, so in this sense each font number is defined exactly twice,
 11726  if at all. Like |nop| commands, font definitions can
 11727  appear before the first |bop|, or between an |eop| and a |bop|.
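
The layout of a font definition can be made concrete with a Python sketch that decodes a |fnt_def1| command. Everything named here is an assumption made for illustration: |parse_fnt_def1| and |relative_size| are invented, the check sum is read as an unsigned value, and the sample font \.{demo10} with its check sum does not correspond to any real font.

```python
# Illustrative sketch only; names and sample data are hypothetical.
import struct
from fractions import Fraction


def parse_fnt_def1(data, pos=0):
    """Decode the fnt_def1 command at data[pos]; return (fields, next_pos)."""
    if data[pos] != 243:
        raise ValueError("expected fnt_def1 (243)")
    k, c, s, d, a, l = struct.unpack_from(">BIiiBB", data, pos + 1)
    start = pos + 16                       # opcode plus 15 bytes of fixed fields
    area = bytes(data[start:start + a])    # the first a bytes of n
    name = bytes(data[start + a:start + a + l])
    fields = {"k": k, "c": c, "s": s, "d": d, "area": area, "name": name}
    return fields, start + a + l


def relative_size(mag, s, d):
    """The factor mag*s/(1000*d) by which the font is scaled."""
    return Fraction(mag * s, 1000 * d)


# Font 7: a made-up font "demo10" with design size 10pt, used at 12pt.
demo = (bytes([243, 7])
        + struct.pack(">Iii", 0x12345678, 12 * 65536, 10 * 65536)
        + bytes([0, 6]) + b"demo10")
fields, next_pos = parse_fnt_def1(demo)
print(fields["name"], relative_size(1000, fields["s"], fields["d"]))
```

Since |a=0| in the sample, the area is empty and a reader would look in the standard font area; the scale factor is $6/5$ at |mag=1000|.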
 11728  
 11729  @ Sometimes it is desirable to make horizontal or vertical rules line up
 11730  precisely with certain features in characters of a font. It is possible to
 11731  guarantee the correct matching between \.{DVI} output and the characters
 11732  generated by \MF\ by adhering to the following principles: (1)~The \MF\
 11733  characters should be positioned so that a bottom edge or left edge that is
 11734  supposed to line up with the bottom or left edge of a rule appears at the
 11735  reference point, i.e., in row~0 and column~0 of the \MF\ raster. This
 11736  ensures that the position of the rule will not be rounded differently when
 11737  the pixel size is not a perfect multiple of the units of measurement in
 11738  the \.{DVI} file. (2)~A typeset rule of height $a>0$ and width $b>0$
 11739  should be equivalent to a \MF-generated character having black pixels in
 11740  precisely those raster positions whose \MF\ coordinates satisfy
 11741  |0<=x<@t$\alpha$@>b| and |0<=y<@t$\alpha$@>a|, where $\alpha$ is the number
 11742  of pixels per \.{DVI} unit.
 11743  @:METAFONT}{\MF@>
 11744  @^alignment of rules with characters@>
 11745  @^rules aligning with characters@>
 11746  
 11747  @ The last page in a \.{DVI} file is followed by `|post|'; this command
 11748  introduces the postamble, which summarizes important facts that \TeX\ has
 11749  accumulated about the file, making it possible to print subsets of the data
 11750  with reasonable efficiency. The postamble has the form
 11751  $$\vbox{\halign{\hbox{#\hfil}\cr
 11752    |post| |p[4]| |num[4]| |den[4]| |mag[4]| |l[4]| |u[4]| |s[2]| |t[2]|\cr
 11753    $\langle\,$font definitions$\,\rangle$\cr
 11754    |post_post| |q[4]| |i[1]| 223's$[{\G}4]$\cr}}$$
 11755  Here |p| is a pointer to the final |bop| in the file. The next three
 11756  parameters, |num|, |den|, and |mag|, are duplicates of the quantities that
 11757  appeared in the preamble.
 11758  
 11759  Parameters |l| and |u| give respectively the height-plus-depth of the tallest
 11760  page and the width of the widest page, in the same units as other dimensions
 11761  of the file. These numbers might be used by a \.{DVI}-reading program to
 11762  position individual ``pages'' on large sheets of film or paper; however,
 11763  the standard convention for output on normal size paper is to position each
 11764  page so that the upper left-hand corner is exactly one inch from the left
 11765  and the top. Experience has shown that it is unwise to design \.{DVI}-to-printer
 11766  software that attempts cleverly to center the output; a fixed position of
 11767  the upper left corner is easiest for users to understand and to work with.
 11768  Therefore |l| and~|u| are often ignored.
 11769  
 11770  Parameter |s| is the maximum stack depth (i.e., the largest excess of
 11771  |push| commands over |pop| commands) needed to process this file. Then
 11772  comes |t|, the total number of pages (|bop| commands) present.
 11773  
 11774  The postamble continues with font definitions, which are any number of
 11775  \\{fnt\_def} commands as described above, possibly interspersed with |nop|
 11776  commands. Each font number that is used in the \.{DVI} file must be defined
 11777  exactly twice: Once before it is first selected by a \\{fnt} command, and once
 11778  in the postamble.
 11779  
 11780  @ The last part of the postamble, following the |post_post| byte that
 11781  signifies the end of the font definitions, contains |q|, a pointer to the
 11782  |post| command that started the postamble.  An identification byte, |i|,
 11783  comes next; this currently equals~2, as in the preamble.
 11784  
 11785  The |i| byte is followed by four or more bytes that are all equal to
 11786  the decimal number 223 (i.e., @'337 in octal). \TeX\ puts out four to seven of
 11787  these trailing bytes, until the total length of the file is a multiple of
 11788  four bytes, since this works out best on machines that pack four bytes per
 11789  word; but any number of 223's is allowed, as long as there are at least four
 11790  of them. In effect, 223 is a sort of signature that is added at the very end.
 11791  @^Fuchs, David Raymond@>
 11792  
 11793  This curious way to finish off a \.{DVI} file makes it feasible for
 11794  \.{DVI}-reading programs to find the postamble first, on most computers,
 11795  even though \TeX\ wants to write the postamble last. Most operating
 11796  systems permit random access to individual words or bytes of a file, so
 11797  the \.{DVI} reader can start at the end and skip backwards over the 223's
 11798  until finding the identification byte. Then it can back up four bytes, read
 11799  |q|, and move to byte |q| of the file. This byte should, of course,
 11800  contain the value 248 (|post|); now the postamble can be read, so the
 11801  \.{DVI} reader can discover all the information needed for typesetting the
 11802  pages. Note that it is also possible to skip through the \.{DVI} file at
 11803  reasonably high speed to locate a particular page, if that proves
 11804  desirable. This saves a lot of time, since \.{DVI} files used in production
 11805  jobs tend to be large.
 11806  
 11807  Unfortunately, however, standard \PASCAL\ does not include the ability to
 11808  @^system dependencies@>
 11809  access a random position in a file, or even to determine the length of a file.
 11810  Almost all systems nowadays provide the necessary capabilities, so \.{DVI}
 11811  format has been designed to work most efficiently with modern operating systems.
 11812  But if \.{DVI} files have to be processed under the restrictions of standard
 11813  \PASCAL, one can simply read them from front to back, since the necessary
 11814  header information is present in the preamble and in the font definitions.
 11815  (The |l| and |u| and |s| and |t| parameters, which appear only in the
 11816  postamble, are ``frills'' that are handy but not absolutely necessary.)
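
The backward search described above can be sketched in Python as follows. This is an illustration under assumptions, not a complete \.{DVI} reader: |find_postamble| and |tiny_dvi| are names invented here, the whole file is assumed to be in memory as a byte string, and |tiny_dvi| builds a one-page file with no fonts or characters purely for testing.

```python
# Illustrative sketch only; both function names are hypothetical.
import struct


def find_postamble(data):
    """Return (q, i): the address of post and the identification byte."""
    pos = len(data) - 1
    count = 0
    while pos >= 0 and data[pos] == 223:   # skip backwards over the signature
        pos -= 1
        count += 1
    if count < 4 or pos < 5 or data[pos - 5] != 249:
        raise ValueError("file does not end with post_post q i 223...")
    ident = data[pos]
    (q,) = struct.unpack_from(">i", data, pos - 4)   # back up four bytes, read q
    if not 0 <= q < len(data) or data[q] != 248:
        raise ValueError("q does not point to a post command")
    return q, ident


def tiny_dvi():
    """Build a minimal one-page DVI byte string; return (data, address of post)."""
    units = struct.pack(">iii", 25400000, 473628672, 1000)
    pre = bytes([247, 2]) + units + bytes([0])        # empty comment
    page = (bytes([139]) + struct.pack(">10i", *([0] * 10))
            + struct.pack(">i", -1) + bytes([140]))   # bop with p=-1, then eop
    post_at = len(pre) + len(page)
    post = (bytes([248]) + struct.pack(">i", len(pre)) + units
            + struct.pack(">iiHH", 0, 0, 0, 1))       # l, u, s, t
    tail = bytes([249]) + struct.pack(">i", post_at) + bytes([2])
    body = pre + page + post + tail
    padding = 4 + (-(len(body) + 4)) % 4              # four to seven 223's
    return body + bytes([223]) * padding, post_at


dvi, post_at = tiny_dvi()
print(find_postamble(dvi) == (post_at, 2))
```

The search cannot run past the identification byte, because that byte is 2 rather than 223; this is why at least one byte between |q| and the signature is needed.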
 11817  
 11818  @* \[32] Shipping pages out.
 11819  After considering \TeX's eyes and stomach, we come now to the bowels.
 11820  @^bowels@>
 11821  
 11822  The |ship_out| procedure is given a pointer to a box; its mission is
 11823  to describe that box in \.{DVI} form, outputting a ``page'' to |dvi_file|.
 11824  The \.{DVI} coordinates $(h,v)=(0,0)$ should correspond to the upper left
 11825  corner of the box being shipped.
 11826  
 11827  Since boxes can be inside of boxes inside of boxes, the main work of
 11828  |ship_out| is done by two mutually recursive routines, |hlist_out|
 11829  and |vlist_out|, which traverse the hlists and vlists inside of horizontal
 11830  and vertical boxes.
 11831  
 11832  As individual pages are being processed, we need to accumulate
 11833  information about the entire set of pages, since such statistics must be
 11834  reported in the postamble. The global variables |total_pages|, |max_v|,
 11835  |max_h|, |max_push|, and |last_bop| are used to record this information.
 11836  
 11837  The variable |doing_leaders| is |true| while leaders are being output.
 11838  The variable |dead_cycles| contains the number of times an output routine
 11839  has been initiated since the last |ship_out|.
 11840  
 11841  A few additional global variables are also defined here for use in
 11842  |vlist_out| and |hlist_out|. They could have been local variables, but
 11843  that would waste stack space when boxes are deeply nested, since the
 11844  values of these variables are not needed during recursive calls.
 11845  @^recursion@>
 11846  
 11847  @<Glob...@>=
 11848  @!total_pages:integer; {the number of pages that have been shipped out}
 11849  @!max_v:scaled; {maximum height-plus-depth of pages shipped so far}
 11850  @!max_h:scaled; {maximum width of pages shipped so far}
 11851  @!max_push:integer; {deepest nesting of |push| commands encountered so far}
 11852  @!last_bop:integer; {location of previous |bop| in the \.{DVI} output}
 11853  @!dead_cycles:integer; {recent outputs that didn't ship anything out}
 11854  @!doing_leaders:boolean; {are we inside a leader box?}
 11855  @#
 11856  @!c,@!f:quarterword; {character and font in current |char_node|}
 11857  @!rule_ht,@!rule_dp,@!rule_wd:scaled; {size of current rule being output}
 11858  @!g:pointer; {current glue specification}
 11859  @!lq,@!lr:integer; {quantities used in calculations for leaders}
 11860  
 11861  @ @<Set init...@>=
 11862  total_pages:=0; max_v:=0; max_h:=0; max_push:=0; last_bop:=-1;
 11863  doing_leaders:=false; dead_cycles:=0; cur_s:=-1;
 11864  
 11865  @ The \.{DVI} bytes are output to a buffer instead of being written directly
 11866  to the output file. This makes it possible to reduce the overhead of
 11867  subroutine calls, thereby measurably speeding up the computation, since
 11868  output of \.{DVI} bytes is part of \TeX's inner loop. And it has another
 11869  advantage as well, since we can change instructions in the buffer in order to
 11870  make the output more compact. For example, a `|down2|' command can be
 11871  changed to a `|y2|', thereby making a subsequent `|y0|' command possible,
 11872  saving two bytes.
 11873  
 11874  The output buffer is divided into two parts of equal size; the bytes found
 11875  in |dvi_buf[0..half_buf-1]| constitute the first half, and those in
 11876  |dvi_buf[half_buf..dvi_buf_size-1]| constitute the second. The global
 11877  variable |dvi_ptr| points to the position that will receive the next
 11878  output byte. When |dvi_ptr| reaches |dvi_limit|, which is always equal
 11879  to one of the two values |half_buf| or |dvi_buf_size|, the half buffer that
 11880  is about to be invaded next is sent to the output and |dvi_limit| is
 11881  changed to its other value. Thus, there is always at least a half buffer's
 11882  worth of information present, except at the very beginning of the job.
 11883  
 11884  Bytes of the \.{DVI} file are numbered sequentially starting with 0;
 11885  the next byte to be generated will be number |dvi_offset+dvi_ptr|.
 11886  A byte is present in the buffer only if its number is |>=dvi_gone|.
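
The half-buffer discipline may be easier to follow in a small simulation. The Python class below is a sketch written for this explanation, not \TeX's code: the class name |DviBuffer| and its method names are invented, a byte array |out| stands in for |dvi_file|, and the buffer size is assumed only to be even. The fields mirror |dvi_limit|, |dvi_ptr|, |dvi_offset|, and |dvi_gone|.

```python
# Illustrative simulation only; DviBuffer and its methods are hypothetical names.

class DviBuffer:
    def __init__(self, size):
        assert size > 0 and size % 2 == 0  # assumption of this sketch
        self.buf = bytearray(size)
        self.size = size
        self.half = size // 2
        self.limit = size                  # initially the buffer is one piece
        self.ptr = 0
        self.offset = 0
        self.gone = 0
        self.out = bytearray()             # stands in for dvi_file

    def write_dvi(self, a, b):
        self.out += self.buf[a:b + 1]

    def put(self, byte):
        self.buf[self.ptr] = byte
        self.ptr += 1
        if self.ptr == self.limit:
            self.swap()

    def swap(self):
        """Send out the half that is about to be overwritten."""
        if self.limit == self.size:
            self.write_dvi(0, self.half - 1)
            self.limit = self.half
            self.offset += self.size
            self.ptr = 0
        else:
            self.write_dvi(self.half, self.size - 1)
            self.limit = self.size
        self.gone += self.half

    def flush(self):
        """Empty the last bytes when the job is over."""
        if self.limit == self.half:
            self.write_dvi(self.half, self.size - 1)
        if self.ptr > 0:
            self.write_dvi(0, self.ptr - 1)


def roundtrip(n, size=8):
    """Push n bytes through a buffer of the given size; return what was written."""
    b = DviBuffer(size)
    for j in range(n):
        b.put(j % 256)
        assert b.offset + b.ptr == j + 1   # number of the next byte to generate
        assert b.gone <= j + 1
    b.flush()
    return bytes(b.out)


print(roundtrip(13) == bytes(range(13)))
```

Whatever the number of bytes, the output is the input in its original order; the simulation also checks after every byte that |dvi_offset+dvi_ptr| is the number of the next byte to be generated.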
 11887  
 11888  @<Types...@>=
 11889  @!dvi_index=0..dvi_buf_size; {an index into the output buffer}
 11890  
 11891  @ Some systems may find it more efficient to make |dvi_buf| a |packed|
 11892  array, since output of four bytes at once may be facilitated.
 11893  @^system dependencies@>
 11894  
 11895  @<Glob...@>=
 11896  @!dvi_buf:array[dvi_index] of eight_bits; {buffer for \.{DVI} output}
 11897  @!half_buf:dvi_index; {half of |dvi_buf_size|}
 11898  @!dvi_limit:dvi_index; {end of the current half buffer}
 11899  @!dvi_ptr:dvi_index; {the next available buffer address}
 11900  @!dvi_offset:integer; {|dvi_buf_size| times the number of times the
 11901    output buffer has been fully emptied}
 11902  @!dvi_gone:integer; {the number of bytes already output to |dvi_file|}
 11903  
 11904  @ Initially the buffer is all in one piece; we will output half of it only
 11905  after it first fills up.
 11906  
 11907  @<Set init...@>=
 11908  half_buf:=dvi_buf_size div 2; dvi_limit:=dvi_buf_size; dvi_ptr:=0;
 11909  dvi_offset:=0; dvi_gone:=0;
 11910  
 11911  @ The actual output of |dvi_buf[a..b]| to |dvi_file| is performed by calling
 11912  |write_dvi(a,b)|. For best results, this procedure should be optimized to
 11913  run as fast as possible on each particular system, since it is part of
 11914  \TeX's inner loop. It is safe to assume that |a| and |b+1| will both be
 11915  multiples of 4 when |write_dvi(a,b)| is called; therefore it is possible on
 11916  many machines to use efficient methods to pack four bytes per word and to
 11917  output an array of words with one system call.
 11918  @^system dependencies@>
 11919  @^inner loop@>
 11920  @^defecation@>
 11921  
 11922  @p procedure write_dvi(@!a,@!b:dvi_index);
 11923  var k:dvi_index;
 11924  begin for k:=a to b do write(dvi_file,dvi_buf[k]);
 11925  end;
 11926  
 11927  @ To put a byte in the buffer without paying the cost of invoking a procedure
 11928  each time, we use the macro |dvi_out|.
 11929  
 11930  @d dvi_out(#)==@+begin dvi_buf[dvi_ptr]:=#; incr(dvi_ptr);
 11931    if dvi_ptr=dvi_limit then dvi_swap;
 11932    end
 11933  
 11934  @p procedure dvi_swap; {outputs half of the buffer}
 11935  begin if dvi_limit=dvi_buf_size then
 11936    begin write_dvi(0,half_buf-1); dvi_limit:=half_buf;
 11937    dvi_offset:=dvi_offset+dvi_buf_size; dvi_ptr:=0;
 11938    end
 11939  else  begin write_dvi(half_buf,dvi_buf_size-1); dvi_limit:=dvi_buf_size;
 11940    end;
 11941  dvi_gone:=dvi_gone+half_buf;
 11942  end;
 11943  
 11944  @ Here is how we clean out the buffer when \TeX\ is all through; |dvi_ptr|
 11945  will be a multiple of~4.
 11946  
 11947  @<Empty the last bytes out of |dvi_buf|@>=
 11948  if dvi_limit=half_buf then write_dvi(half_buf,dvi_buf_size-1);
 11949  if dvi_ptr>0 then write_dvi(0,dvi_ptr-1)
 11950  
 11951  @ The |dvi_four| procedure outputs four bytes in two's complement notation,
 11952  without risking arithmetic overflow.
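
A transcription of this procedure into Python shows that it produces the usual big-endian two's complement encoding. The transcription is a sketch for checking the arithmetic only; it returns the four bytes instead of calling |dvi_out|. The procedure adds $2^{30}$ twice rather than $2^{31}$ once, presumably because $2^{31}$ itself cannot be represented when integers are 32-bit signed quantities.

```python
# Illustrative transcription only; it returns bytes instead of calling dvi_out.
import struct


def dvi_four(x):
    """Return the four bytes that the Pascal procedure would output for x."""
    out = []
    if x >= 0:
        out.append(x // 0o100000000)           # divide by 2^24
    else:
        x = x + 0o10000000000                  # add 2^30 ...
        x = x + 0o10000000000                  # ... twice, so x is now x + 2^31
        out.append(x // 0o100000000 + 128)     # restore the sign bit
    x = x % 0o100000000
    out.append(x // 0o200000)                  # divide by 2^16
    x = x % 0o200000
    out.append(x // 0o400)                     # divide by 2^8
    out.append(x % 0o400)
    return bytes(out)


samples = [0, 1, -1, 65536, -65536, 473628672, 2**31 - 1, -2**31]
print(all(dvi_four(x) == struct.pack(">i", x) for x in samples))
```

Every intermediate value stays in the range $0\le x<2^{31}$ once the sign has been dealt with, so |div| and |mod| are applied only to nonnegative operands.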
 11953  
 11954  @p procedure dvi_four(@!x:integer);
 11955  begin if x>=0 then dvi_out(x div @'100000000)
 11956  else  begin x:=x+@'10000000000;
 11957    x:=x+@'10000000000;
 11958    dvi_out((x div @'100000000) + 128);
 11959    end;
 11960  x:=x mod @'100000000; dvi_out(x div @'200000);
 11961  x:=x mod @'200000; dvi_out(x div @'400);
 11962  dvi_out(x mod @'400);
 11963  end;
 11964  
 11965  @ A mild optimization of the output is performed by the |dvi_pop|
 11966  routine, which issues a |pop| unless it is possible to cancel a
 11967  `|push| |pop|' pair. The parameter to |dvi_pop| is the byte address
 11968  following the old |push| that matches the new |pop|.
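
The cancellation can be illustrated by a sketch in Python in which the whole output is kept in one list, so that the length of the list plays the role of |dvi_offset+dvi_ptr|. The helper |emit_group| is invented for this example. The real test |dvi_ptr>0| guards against removing a byte from the buffer when |dvi_ptr| has just been reset to zero; in the sketch nothing is ever written out, so a test for a nonempty list stands in for it.

```python
# Illustrative sketch only; emit_group is a hypothetical helper.

PUSH, POP = 141, 142


def dvi_pop(out, l):
    """Append a pop, unless the matching push is the last byte emitted."""
    if l == len(out) and len(out) > 0:
        out.pop()              # nothing followed the push: cancel the pair
    else:
        out.append(POP)


def emit_group(out, body):
    """Emit push, then the bytes of body, then the matching pop."""
    out.append(PUSH)
    l = len(out)               # byte address following the push
    out.extend(body)
    dvi_pop(out, l)


stream = []
emit_group(stream, [])         # an empty group leaves no trace
emit_group(stream, [147])      # a group containing w0 keeps its push and pop
print(stream)
```

Only a |push| that is immediately followed by its |pop| disappears; as soon as any byte has been generated in between, the address |l| no longer equals the current position and the |pop| is emitted normally.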
 11969  
 11970  @p procedure dvi_pop(@!l:integer);
 11971  begin if (l=dvi_offset+dvi_ptr)and(dvi_ptr>0) then decr(dvi_ptr)
 11972  else dvi_out(pop);
 11973  end;
 11974  
 11975  @ Here's a procedure that outputs a font definition. Since \TeX82 uses at
 11976  most 256 different fonts per job, |fnt_def1| is always used as the command code.
 11977  
 11978  @p procedure dvi_font_def(@!f:internal_font_number);
 11979  var k:pool_pointer; {index into |str_pool|}
 11980  begin dvi_out(fnt_def1);
 11981  dvi_out(f-font_base-1);@/
 11982  dvi_out(qo(font_check[f].b0));
 11983  dvi_out(qo(font_check[f].b1));
 11984  dvi_out(qo(font_check[f].b2));
 11985  dvi_out(qo(font_check[f].b3));@/
 11986  dvi_four(font_size[f]);
 11987  dvi_four(font_dsize[f]);@/
 11988  dvi_out(length(font_area[f]));
 11989  dvi_out(length(font_name[f]));
 11990  @<Output the font name whose internal number is |f|@>;
 11991  end;
 11992  
 11993  @ @<Output the font name whose internal number is |f|@>=
 11994  for k:=str_start[font_area[f]] to str_start[font_area[f]+1]-1 do
 11995    dvi_out(so(str_pool[k]));
 11996  for k:=str_start[font_name[f]] to str_start[font_name[f]+1]-1 do
 11997    dvi_out(so(str_pool[k]))
 11998  
 11999  @ Versions of \TeX\ intended for small computers might well choose to omit
 12000  the ideas in the next few parts of this program, since it is not really
 12001  necessary to optimize the \.{DVI} code by making use of the |w0|, |x0|,
 12002  |y0|, and |z0| commands. Furthermore, the algorithm that we are about to
 12003  describe does not pretend to give an optimum reduction in the length
 12004  of the \.{DVI} code; after all, speed is more important than compactness.
 12005  But the method is surprisingly effective, and it takes comparatively little
 12006  time.
 12007  
 12008  We can best understand the basic idea by first considering a simpler problem
 12009  that has the same essential characteristics. Given a sequence of digits,
 12010  say $3\,1\,4\,1\,5\,9\,2\,6\,5\,3\,5\,8\,9$, we want to assign subscripts
 12011  $d$, $y$, or $z$ to each digit so as to maximize the number of ``$y$-hits''
 12012  and ``$z$-hits''; a $y$-hit is an instance of two appearances of the same
 12013  digit with the subscript $y$, where no $y$'s intervene between the two
 12014  appearances, and a $z$-hit is defined similarly. For example, the sequence
 12015  above could be decorated with subscripts as follows:
 12016  $$3_z\,1_y\,4_d\,1_y\,5_y\,9_d\,2_d\,6_d\,5_y\,3_z\,5_y\,8_d\,9_d.$$
 12017  There are three $y$-hits ($1_y\ldots1_y$ and $5_y\ldots5_y\ldots5_y$) and
 12018  one $z$-hit ($3_z\ldots3_z$); there are no $d$-hits, since the two appearances
 12019  of $9_d$ have $d$'s between them, but we don't count $d$-hits so it doesn't
 12020  matter how many there are. These subscripts are analogous to the \.{DVI}
 12021  commands called \\{down}, $y$, and $z$, and the digits are analogous to
 12022  different amounts of vertical motion; a $y$-hit or $z$-hit corresponds to
 12023  the opportunity to use the one-byte commands |y0| or |z0| in a \.{DVI} file.
 12024  
 12025  \TeX's method of assigning subscripts works like this: Append a new digit,
 12026  say $\delta$, to the right of the sequence. Now look back through the
 12027  sequence until one of the following things happens: (a)~You see
 12028  $\delta_y$ or $\delta_z$, and this was the first time you encountered a
 12029  $y$ or $z$ subscript, respectively.  Then assign $y$ or $z$ to the new
 12030  $\delta$; you have scored a hit. (b)~You see $\delta_d$, and no $y$
 12031  subscripts have been encountered so far during this search.  Then change
 12032  the previous $\delta_d$ to $\delta_y$ (this corresponds to changing a
 12033  command in the output buffer), and assign $y$ to the new $\delta$; it's
 12034  another hit.  (c)~You see $\delta_d$, and a $y$ subscript has been seen
 12035  but not a $z$.  Change the previous $\delta_d$ to $\delta_z$ and assign
 12036  $z$ to the new $\delta$. (d)~You encounter both $y$ and $z$ subscripts
 12037  before encountering a suitable $\delta$, or you scan all the way to the
 12038  front of the sequence. Assign $d$ to the new $\delta$; this assignment may
 12039  be changed later.
 12040  
 12041  The subscripts $3_z\,1_y\,4_d\ldots\,$ in the example above were, in fact,
 12042  produced by this procedure, as the reader can verify. (Go ahead and try it.)
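The digit-subscripting procedure described above can be tried out mechanically. The following is only a sketch in Python, not part of \TeX\ itself; the names |assign_subscripts| and |show| are hypothetical, and the sequence is modelled as a plain list instead of \TeX's stack of movement nodes.

```python
def assign_subscripts(digits):
    """Return a list of [digit, subscript] pairs, subscript in {'d', 'y', 'z'}."""
    seq = []
    for delta in digits:
        new_sub = "d"                  # case (d) unless one of the rules below fires
        seen_y = seen_z = False
        for entry in reversed(seq):    # look back through the sequence
            digit, sub = entry
            if digit == delta:
                if sub == "y" and not seen_y:      # case (a): a y-hit
                    new_sub = "y"
                    break
                if sub == "z" and not seen_z:      # case (a): a z-hit
                    new_sub = "z"
                    break
                if sub == "d" and not seen_y:      # case (b): change d to y
                    entry[1] = new_sub = "y"
                    break
                if sub == "d" and not seen_z:      # case (c): change d to z
                    entry[1] = new_sub = "z"
                    break
            if sub == "y":
                seen_y = True
            elif sub == "z":
                seen_z = True
            if seen_y and seen_z:                  # case (d): both subscripts seen
                break
        seq.append([delta, new_sub])
    return seq


def show(seq):
    """Format the decorated sequence, e.g. '3z 1y 4d'."""
    return " ".join(f"{d}{s}" for d, s in seq)


print(show(assign_subscripts("3141592653589")))
```

Running it on the digits of the example should reproduce the subscripts displayed above, starting with 3z, 1y, 4d.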
 12043  
 12044  @ In order to implement such an idea, \TeX\ maintains a stack of pointers
 12045  to the \\{down}, $y$, and $z$ commands that have been generated for the
 12046  current page. And there is a similar stack for \\{right}, |w|, and |x|
 12047  commands. These stacks are called the down stack and right stack, and their
 12048  top elements are maintained in the variables |down_ptr| and |right_ptr|.
 12049  
 12050  Each entry in these stacks contains four fields: The |width| field is
 12051  the amount of motion down or to the right; the |location| field is the
 12052  byte number of the \.{DVI} command in question (including the appropriate
 12053  |dvi_offset|); the |link| field points to the next item below this one
 12054  on the stack; and the |info| field encodes the options for possible change
 12055  in the \.{DVI} command.
 12056  
 12057  @d movement_node_size=3 {number of words per entry in the down and right stacks}
 12058  @d location(#)==mem[#+2].int {\.{DVI} byte number for a movement command}
 12059  
 12060  @<Glob...@>=
 12061  @!down_ptr,@!right_ptr:pointer; {heads of the down and right stacks}
 12062  
 12063  @ @<Set init...@>=
 12064  down_ptr:=null; right_ptr:=null;
 12065  
 12066  @ Here is a subroutine that produces a \.{DVI} command for some specified
 12067  downward or rightward motion. It has two parameters: |w| is the amount
 12068  of motion, and |o| is either |down1| or |right1|. We use the fact that
 12069  the command codes have convenient arithmetic properties: |y1-down1=w1-right1|
 12070  and |z1-down1=x1-right1|.
 12071  
 12072  @p procedure movement(@!w:scaled;@!o:eight_bits);
 12073  label exit,found,not_found,2,1;
 12074  var mstate:small_number; {have we seen a |y| or |z|?}
 12075  @!p,@!q:pointer; {current and top nodes on the stack}
 12076  @!k:integer; {index into |dvi_buf|, modulo |dvi_buf_size|}
 12077  begin q:=get_node(movement_node_size); {new node for the top of the stack}
 12078  width(q):=w; location(q):=dvi_offset+dvi_ptr;
 12079  if o=down1 then
 12080    begin link(q):=down_ptr; down_ptr:=q;
 12081    end
 12082  else  begin link(q):=right_ptr; right_ptr:=q;
 12083    end;
 12084  @<Look at the other stack entries until deciding what sort of \.{DVI} command
 12085    to generate; |goto found| if node |p| is a ``hit''@>;
 12086  @<Generate a |down| or |right| command for |w| and |return|@>;
 12087  found: @<Generate a |y0| or |z0| command in order to reuse a previous
 12088    appearance of~|w|@>;
 12089  exit:end;
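The arithmetic properties of the command codes used by |movement| can be checked with a few lines of Python. This is an illustrative sketch only; the numeric opcodes are those of the \.{DVI} format, and the helper name |reuse_opcode| is hypothetical.

```python
# DVI opcodes for horizontal and vertical motion.
RIGHT1, W0, W1, X0, X1 = 143, 147, 148, 152, 153
DOWN1, Y0, Y1, Z0, Z1 = 157, 161, 162, 166, 167


def reuse_opcode(o, kind):
    """One-byte command that repeats a previous motion.

    o is DOWN1 or RIGHT1; kind is 'y' or 'z' (meaning w or x when o is RIGHT1).
    """
    return o + (Y0 if kind == "y" else Z0) - DOWN1


# The two identities that the procedure relies on.
assert Y1 - DOWN1 == W1 - RIGHT1
assert Z1 - DOWN1 == X1 - RIGHT1
```

Because the offsets agree, one procedure can serve both stacks: adding the same constant to |down1| or |right1| yields the vertical or the horizontal variant of a command.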
 12090  
 12091  @ The |info| fields in the entries of the down stack or the right stack
 12092  have six possible settings: |y_here| or |z_here| mean that the \.{DVI}
 12093  command refers to |y| or |z|, respectively (or to |w| or |x|, in the
 12094  case of horizontal motion); |yz_OK| means that the \.{DVI} command is
 12095  \\{down} (or \\{right}) but can be changed to either |y| or |z| (or
 12096  to either |w| or |x|); |y_OK| means that it is \\{down} and can be changed
 12097  to |y| but not |z|; |z_OK| is similar; and |d_fixed| means it must stay
 12098  \\{down}.
 12099  
 12100  The four settings |yz_OK|, |y_OK|, |z_OK|, |d_fixed| would not need to
 12101  be distinguished from each other if we were simply solving the
 12102  digit-subscripting problem mentioned above. But in \TeX's case there is
 12103  a complication because of the nested structure of |push| and |pop|
 12104  commands. Suppose we add parentheses to the digit-subscripting problem,
 12105  redefining hits so that $\delta_y\ldots \delta_y$ is a hit if all $y$'s between
 12106  the $\delta$'s are enclosed in properly nested parentheses, and if the
 12107  parenthesis level of the right-hand $\delta_y$ is deeper than or equal to
 12108  that of the left-hand one. Thus, `(' and `)' correspond to `|push|'
 12109  and `|pop|'. Now if we want to assign a subscript to the final 1 in the
 12110  sequence
 12111  $$2_y\,7_d\,1_d\,(\,8_z\,2_y\,8_z\,)\,1$$
 12112  we cannot change the previous $1_d$ to $1_y$, since that would invalidate
 12113  the $2_y\ldots2_y$ hit. But we can change it to $1_z$, scoring a hit
 12114  since the intervening $8_z$'s are enclosed in parentheses.
 12115  
 12116  The program below removes movement nodes that are introduced after a |push|,
 12117  before it outputs the corresponding |pop|.
 12118  
 12119  @d y_here=1 {|info| when the movement entry points to a |y| command}
 12120  @d z_here=2 {|info| when the movement entry points to a |z| command}
 12121  @d yz_OK=3 {|info| corresponding to an unconstrained \\{down} command}
 12122  @d y_OK=4 {|info| corresponding to a \\{down} that can't become a |z|}
 12123  @d z_OK=5 {|info| corresponding to a \\{down} that can't become a |y|}
 12124  @d d_fixed=6 {|info| corresponding to a \\{down} that can't change}
 12125  
 12126  @ When the |movement| procedure gets to the label |found|, the value of
 12127  |info(p)| will be either |y_here| or |z_here|. If it is, say, |y_here|,
 12128  the procedure generates a |y0| command (or a |w0| command), and marks
 12129  all |info| fields between |q| and |p| so that |y| is not OK in that range.
 12130  
 12131  @<Generate a |y0| or |z0| command...@>=
 12132  info(q):=info(p);
 12133  if info(q)=y_here then
 12134    begin dvi_out(o+y0-down1); {|y0| or |w0|}
 12135    while link(q)<>p do
 12136      begin q:=link(q);
 12137      case info(q) of
 12138      yz_OK: info(q):=z_OK;
 12139      y_OK: info(q):=d_fixed;
 12140      othercases do_nothing
 12141      endcases;
 12142      end;
 12143    end
 12144  else  begin dvi_out(o+z0-down1); {|z0| or |x0|}
 12145    while link(q)<>p do
 12146      begin q:=link(q);
 12147      case info(q) of
 12148      yz_OK: info(q):=y_OK;
 12149      z_OK: info(q):=d_fixed;
 12150      othercases do_nothing
 12151      endcases;
 12152      end;
 12153    end
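The effect of a hit on the |info| fields lying between |q| and |p| can be modelled as a small table lookup. The Python below is a sketch under the assumption that the intervening nodes are given as a list of |info| values; the function name |restrict| is hypothetical.

```python
Y_HERE, Z_HERE, YZ_OK, Y_OK, Z_OK, D_FIXED = 1, 2, 3, 4, 5, 6


def restrict(infos, hit):
    """Return the info values after a hit of kind `hit` (Y_HERE or Z_HERE).

    After a y-hit no intervening command may become a y; after a z-hit
    no intervening command may become a z.
    """
    if hit == Y_HERE:
        demote = {YZ_OK: Z_OK, Y_OK: D_FIXED}
    else:
        demote = {YZ_OK: Y_OK, Z_OK: D_FIXED}
    return [demote.get(info, info) for info in infos]


# The parenthesized example: the inner 2 scores a y-hit against the first 2_y;
# scanning downward from the top it passes over 8_z, 1_d and 7_d.
print(restrict([Z_HERE, YZ_OK, YZ_OK], Y_HERE))
```

In this model the earlier $1_d$ ends up as |z_OK|, which is why the final 1 of the example may still turn it into a $z$ but not into a $y$.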
 12154  
 12155  @ @<Generate a |down| or |right|...@>=
 12156  info(q):=yz_OK;
 12157  if abs(w)>=@'40000000 then
 12158    begin dvi_out(o+3); {|down4| or |right4|}
 12159    dvi_four(w); return;
 12160    end;
 12161  if abs(w)>=@'100000 then
 12162    begin dvi_out(o+2); {|down3| or |right3|}
 12163    if w<0 then w:=w+@'100000000;
 12164    dvi_out(w div @'200000); w:=w mod @'200000; goto 2;
 12165    end;
 12166  if abs(w)>=@'200 then
 12167    begin dvi_out(o+1); {|down2| or |right2|}
 12168    if w<0 then w:=w+@'200000;
 12169    goto 2;
 12170    end;
 12171  dvi_out(o); {|down1| or |right1|}
 12172  if w<0 then w:=w+@'400;
 12173  goto 1;
 12174  2: dvi_out(w div @'400);
 12175  1: dvi_out(w mod @'400); return
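The choice among the one-, two-, three-, and four-byte forms can be restated compactly. The following Python sketch is not \TeX's code; the name |movement_bytes| is hypothetical, and it assumes that |w| fits in 32 bits. It uses the same thresholds as the program text above and produces the operand in two's complement, most significant byte first.

```python
def movement_bytes(w, o):
    """Encode motion w after opcode o (down1 = 157 or right1 = 143)."""
    if abs(w) >= 0o40000000:      # 2**23: needs four bytes
        n = 4
    elif abs(w) >= 0o100000:      # 2**15: needs three bytes
        n = 3
    elif abs(w) >= 0o200:         # 2**7: needs two bytes
        n = 2
    else:
        n = 1
    operand = w % (1 << (8 * n))  # two's complement of w in n bytes
    return bytes([o + n - 1]) + operand.to_bytes(n, "big")


print(movement_bytes(-1, 157).hex())    # one-byte operand
print(movement_bytes(128, 157).hex())   # two-byte operand
```

Note that the tests are symmetric in |abs(w)|, so a value such as $-128$ is given two bytes although it would fit in one; the sketch keeps that behavior.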
 12176  
 12177  @ As we search through the stack, we are in one of three states,
 12178  |y_seen|, |z_seen|, or |none_seen|, depending on whether we have
 12179  encountered |y_here| or |z_here| nodes. These states are encoded as
 12180  multiples of 6, so that they can be added to the |info| fields for quick
 12181  decision-making.
 12182  @^inner loop@>
 12183  
 12184  @d none_seen=0 {no |y_here| or |z_here| nodes have been encountered yet}
 12185  @d y_seen=6 {we have seen |y_here| but not |z_here|}
 12186  @d z_seen=12 {we have seen |z_here| but not |y_here|}
 12187  
 12188  @<Look at the other stack entries until deciding...@>=
 12189  p:=link(q); mstate:=none_seen;
 12190  while p<>null do
 12191    begin if width(p)=w then @<Consider a node with matching width;
 12192      |goto found| if it's a hit@>
 12193    else  case mstate+info(p) of
 12194      none_seen+y_here: mstate:=y_seen;
 12195      none_seen+z_here: mstate:=z_seen;
 12196      y_seen+z_here,z_seen+y_here: goto not_found;
 12197      othercases do_nothing
 12198      endcases;
 12199    p:=link(p);
 12200    end;
 12201  not_found:
 12202  
 12203  @ We might find a valid hit in a |y| or |z| byte that is already gone
 12204  from the buffer. But we can't change bytes that are gone forever; ``the
 12205  moving finger writes, $\ldots\,\,$.''
 12206  
 12207  @<Consider a node with matching width...@>=
 12208  case mstate+info(p) of
 12209  none_seen+yz_OK,none_seen+y_OK,z_seen+yz_OK,z_seen+y_OK:@t@>@;@/
 12210    if location(p)<dvi_gone then goto not_found
 12211    else @<Change buffered instruction to |y| or |w| and |goto found|@>;
 12212  none_seen+z_OK,y_seen+yz_OK,y_seen+z_OK:@t@>@;@/
 12213    if location(p)<dvi_gone then goto not_found
 12214    else @<Change buffered instruction to |z| or |x| and |goto found|@>;
 12215  none_seen+y_here,none_seen+z_here,y_seen+z_here,z_seen+y_here: goto found;
 12216  othercases do_nothing
 12217  endcases
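Since the three states are multiples of 6 and the six |info| codes run from 1 to 6, every sum |mstate+info(p)| identifies one combination uniquely. The Python sketch below spells out the decision for a node of matching width; the function |classify_match| and its boolean parameter |buffered| (standing for |location(p)>=dvi_gone|) are hypothetical names introduced for illustration.

```python
Y_HERE, Z_HERE, YZ_OK, Y_OK, Z_OK, D_FIXED = 1, 2, 3, 4, 5, 6
NONE_SEEN, Y_SEEN, Z_SEEN = 0, 6, 12

TO_Y = {NONE_SEEN + YZ_OK, NONE_SEEN + Y_OK, Z_SEEN + YZ_OK, Z_SEEN + Y_OK}
TO_Z = {NONE_SEEN + Z_OK, Y_SEEN + YZ_OK, Y_SEEN + Z_OK}
HIT = {NONE_SEEN + Y_HERE, NONE_SEEN + Z_HERE, Y_SEEN + Z_HERE, Z_SEEN + Y_HERE}


def classify_match(mstate, info, buffered=True):
    """Decide what to do with a node whose width equals w."""
    key = mstate + info
    if key in TO_Y or key in TO_Z:
        if not buffered:
            return "not_found"   # the byte has already left the buffer
        return "change_to_y" if key in TO_Y else "change_to_z"
    if key in HIT:
        return "found"
    return "keep_looking"


# All 18 (state, info) pairs map to distinct sums.
assert len({m + i for m in (NONE_SEEN, Y_SEEN, Z_SEEN) for i in range(1, 7)}) == 18
```

For instance, a matching node marked |y_OK| is of no use once a |y| has been seen, so the search simply continues past it.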
 12218  
 12219  @ @<Change buffered instruction to |y| or |w| and |goto found|@>=
 12220  begin k:=location(p)-dvi_offset;
 12221  if k<0 then k:=k+dvi_buf_size;
 12222  dvi_buf[k]:=dvi_buf[k]+y1-down1;
 12223  info(p):=y_here; goto found;
 12224  end
 12225  
 12226  @ @<Change buffered instruction to |z| or |x| and |goto found|@>=
 12227  begin k:=location(p)-dvi_offset;
 12228  if k<0 then k:=k+dvi_buf_size;
 12229  dvi_buf[k]:=dvi_buf[k]+z1-down1;
 12230  info(p):=z_here; goto found;
 12231  end
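Changing a buffered instruction amounts to locating its opcode in the circular buffer and adding a constant. A Python sketch follows, with |dvi_buf| modelled as a |bytearray| whose length plays the role of |dvi_buf_size|; the function name |patch_buffered_opcode| is hypothetical.

```python
DOWN1, Y1, Z1 = 157, 162, 167


def patch_buffered_opcode(dvi_buf, location, dvi_offset, delta):
    """Add delta to the opcode at absolute DVI byte number `location`.

    Returns the buffer index that was patched.
    """
    k = location - dvi_offset
    if k < 0:                 # the byte sits in the part written before wrap-around
        k += len(dvi_buf)
    dvi_buf[k] += delta
    return k


buf = bytearray(8)
buf[6] = DOWN1
# The buffer has wrapped once, so dvi_offset is 8 while the command is byte 6.
patch_buffered_opcode(buf, location=6, dvi_offset=8, delta=Y1 - DOWN1)
```

The same increment turns |down2| into |y2|, and so on, since only the opcode byte is touched and the operand bytes stay in place.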
 12232  
 12233  @ In case you are wondering when all the movement nodes are removed from
 12234  \TeX's memory, the answer is that they are recycled just before
 12235  |hlist_out| and |vlist_out| finish outputting a box. This restores the
 12236  down and right stacks to the state they were in before the box was output,
 12237  except that some |info|'s may have become more restrictive.
 12238  
 12239  @p procedure prune_movements(@!l:integer);
 12240    {delete movement nodes with |location>=l|}
 12241  label done,exit;
 12242  var p:pointer; {node being deleted}
 12243  begin while down_ptr<>null do
 12244    begin if location(down_ptr)<l then goto done;
 12245    p:=down_ptr; down_ptr:=link(p); free_node(p,movement_node_size);
 12246    end;
 12247  done: while right_ptr<>null do
 12248    begin if location(right_ptr)<l then return;
 12249    p:=right_ptr; right_ptr:=link(p); free_node(p,movement_node_size);
 12250    end;
 12251  exit:end;
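Pruning relies on the fact that locations decrease as one goes down either stack, so the nodes to be deleted are all at the top. A minimal Python sketch, with a stack modelled as a list of locations whose last element is the top (the name |prune| is hypothetical):

```python
def prune(stack, l):
    """Delete entries with location >= l from the top of the stack, in place."""
    while stack and stack[-1] >= l:
        stack.pop()
    return stack


down_stack = [10, 42, 57, 90]   # byte locations, oldest first
prune(down_stack, 50)           # a box that began at byte 50 has been output
```

Entries pushed before the box began remain on the stack and can still take part in later hits.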
 12252  
 12253  @ The actual distances by which we want to move might be computed as the
 12254  sum of several separate movements. For example, there might be several
 12255  glue nodes in succession, or we might want to move right by the width of
 12256  some box plus some amount of glue. More importantly, the baselineskip
 12257  distances are computed in terms of glue together with the depth and
 12258  height of adjacent boxes, and we want the \.{DVI} file to lump these
 12259  three quantities together into a single motion.
 12260  
 12261  Therefore, \TeX\ maintains two pairs of global variables: |dvi_h| and |dvi_v|
 12262  are the |h| and |v| coordinates corresponding to the commands actually
 12263  output to the \.{DVI} file, while |cur_h| and |cur_v| are the coordinates
 12264  corresponding to the current state of the output routines. Coordinate
 12265  changes will accumulate in |cur_h| and |cur_v| without being reflected
 12266  in the output, until such a change becomes necessary or desirable; we
 12267  can call the |movement| procedure whenever we want to make |dvi_h=cur_h|
 12268  or |dvi_v=cur_v|.
 12269  
 12270  The current font reflected in the \.{DVI} output is called |dvi_f|;
 12271  there is no need for a `\\{cur\_f}' variable.
 12272  
 12273  The depth of nesting of |hlist_out| and |vlist_out| is called |cur_s|;
 12274  this is essentially the depth of |push| commands in the \.{DVI} output.
 12275  
 12276  @d synch_h==if cur_h<>dvi_h then
 12277      begin movement(cur_h-dvi_h,right1); dvi_h:=cur_h;
 12278      end
 12279  @d synch_v==if cur_v<>dvi_v then
 12280      begin movement(cur_v-dvi_v,down1); dvi_v:=cur_v;
 12281      end
 12282  
 12283  @<Glob...@>=
 12284  @!dvi_h,@!dvi_v:scaled; {a \.{DVI} reader program thinks we are here}
 12285  @!cur_h,@!cur_v:scaled; {\TeX\ thinks we are here}
 12286  @!dvi_f:internal_font_number; {the current font}
 12287  @!cur_s:integer; {current depth of output box nesting, initially $-1$}
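The idea of letting coordinate changes accumulate before they are written can be illustrated in a few lines. This Python sketch is an assumption-laden model, not \TeX's code: the class |Pen| is hypothetical, and the list |moves| stands in for calls of |movement|.

```python
class Pen:
    """Tracks where TeX thinks it is (cur_h) and where the DVI reader is (dvi_h)."""

    def __init__(self):
        self.cur_h = 0
        self.dvi_h = 0
        self.moves = []       # horizontal motions actually emitted

    def synch_h(self):
        if self.cur_h != self.dvi_h:
            self.moves.append(self.cur_h - self.dvi_h)
            self.dvi_h = self.cur_h


pen = Pen()
for width in (3, 4, 5):       # e.g. a kern, some glue, and another kern
    pen.cur_h += width
pen.synch_h()                 # one motion of 12 instead of three separate ones
pen.synch_h()                 # nothing further to emit
```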
 12288  
 12289  @ @<Initialize variables as |ship_out| begins@>=
 12290  dvi_h:=0; dvi_v:=0; cur_h:=h_offset; dvi_f:=null_font;
 12291  ensure_dvi_open;
 12292  if total_pages=0 then
 12293    begin dvi_out(pre); dvi_out(id_byte); {output the preamble}
 12294  @^preamble of \.{DVI} file@>
 12295    dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp}
 12296    prepare_mag; dvi_four(mag); {magnification factor is frozen}
 12297    old_setting:=selector; selector:=new_string;
 12298    print(" TeX output "); print_int(year); print_char(".");
 12299    print_two(month); print_char("."); print_two(day);
 12300    print_char(":"); print_two(time div 60);
 12301    print_two(time mod 60);
 12302    selector:=old_setting; dvi_out(cur_length);
 12303    for s:=str_start[str_ptr] to pool_ptr-1 do dvi_out(so(str_pool[s]));
 12304    pool_ptr:=str_start[str_ptr]; {flush the current string}
 12305    end
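The two constants in the preamble and the comment string can be checked independently. In the Python sketch below the function name |preamble_comment| is hypothetical; the arithmetic uses the facts that there are $2^{16}$ sp per point and 7227 points in 254 centimeters, and that \.{DVI} units are expressed in multiples of $10^{-7}$ meters.

```python
from fractions import Fraction

NUM, DEN = 25400000, 473628672    # the two values passed to dvi_four


def preamble_comment(year, month, day, time):
    """Build the comment string; `time` is in minutes since midnight."""
    return " TeX output %d.%02d.%02d:%02d%02d" % (
        year, month, day, time // 60, time % 60)


# 7227pt = 254cm = 25400000 units of 10**-7 meters.
assert DEN == 7227 * 2**16
assert Fraction(7227 * 2**16 * NUM, DEN) == 25400000

print(preamble_comment(1986, 4, 7, 13 * 60 + 5))
```

The length of this string is what |dvi_out(cur_length)| writes just before the string itself.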
 12306  
 12307  @ When |hlist_out| is called, its duty is to output the box represented
 12308  by the |hlist_node| pointed to by |temp_ptr|. The reference point of that
 12309  box has coordinates |(cur_h,cur_v)|.
 12310  
 12311  Similarly, when |vlist_out| is called, its duty is to output the box represented
 12312  by the |vlist_node| pointed to by |temp_ptr|. The reference point of that
 12313  box has coordinates |(cur_h,cur_v)|.
 12314  @^recursion@>
 12315  
 12316  @p procedure@?vlist_out; forward; {|hlist_out| and |vlist_out| are mutually
 12317    recursive}
 12318  
 12319  @ The recursive procedures |hlist_out| and |vlist_out| each have local variables
 12320  |save_h| and |save_v| to hold the values of |dvi_h| and |dvi_v| just before
 12321  entering a new level of recursion.  In effect, the values of |save_h| and
 12322  |save_v| on \TeX's run-time stack correspond to the values of |h| and |v|
 12323  that a \.{DVI}-reading program will push onto its coordinate stack.
 12324  
 12325  @d move_past=13 {go to this label when advancing past glue or a rule}
 12326  @d fin_rule=14 {go to this label to finish processing a rule}
 12327  @d next_p=15 {go to this label when finished with node |p|}
 12328  
 12329  @p @t\4@>@<Declare procedures needed in |hlist_out|, |vlist_out|@>@t@>@/
 12330  procedure hlist_out; {output an |hlist_node| box}
 12331  label reswitch, move_past, fin_rule, next_p;
 12332  var base_line: scaled; {the baseline coordinate for this box}
 12333  @!left_edge: scaled; {the left coordinate for this box}
 12334  @!save_h,@!save_v: scaled; {what |dvi_h| and |dvi_v| should pop to}
 12335  @!this_box: pointer; {pointer to containing box}
 12336  @!g_order: glue_ord; {applicable order of infinity for glue}
 12337  @!g_sign: normal..shrinking; {selects type of glue}
 12338  @!p:pointer; {current position in the hlist}
 12339  @!save_loc:integer; {\.{DVI} byte location upon entry}
 12340  @!leader_box:pointer; {the leader box being replicated}
 12341  @!leader_wd:scaled; {width of leader box being replicated}
 12342  @!lx:scaled; {extra space between leader boxes}
 12343  @!outer_doing_leaders:boolean; {were we doing leaders?}
 12344  @!edge:scaled; {left edge of sub-box, or right edge of leader space}
 12345  @!glue_temp:real; {glue value before rounding}
 12346  @!cur_glue:real; {glue seen so far}
 12347  @!cur_g:scaled; {rounded equivalent of |cur_glue| times the glue ratio}
 12348  begin cur_g:=0; cur_glue:=float_constant(0);
 12349  this_box:=temp_ptr; g_order:=glue_order(this_box);
 12350  g_sign:=glue_sign(this_box); p:=list_ptr(this_box);
 12351  incr(cur_s);
 12352  if cur_s>0 then dvi_out(push);
 12353  if cur_s>max_push then max_push:=cur_s;
 12354  save_loc:=dvi_offset+dvi_ptr; base_line:=cur_v; left_edge:=cur_h;
 12355  while p<>null do @<Output node |p| for |hlist_out| and move to the next node,
 12356    maintaining the condition |cur_v=base_line|@>;
 12357  prune_movements(save_loc);
 12358  if cur_s>0 then dvi_pop(save_loc);
 12359  decr(cur_s);
 12360  end;
 12361  
 12362  @ We ought to give special care to the efficiency of one part of |hlist_out|,
 12363  since it belongs to \TeX's inner loop. When a |char_node| is encountered,
 12364  we save a little time by processing several nodes in succession until
 12365  reaching a non-|char_node|. The program uses the fact that |set_char_0=0|.
 12366  @^inner loop@>
 12367  
 12368  @<Output node |p| for |hlist_out|...@>=
 12369  reswitch: if is_char_node(p) then
 12370    begin synch_h; synch_v;
 12371    repeat f:=font(p); c:=character(p);
 12372    if f<>dvi_f then @<Change font |dvi_f| to |f|@>;
 12373    if c>=qi(128) then dvi_out(set1);
 12374    dvi_out(qo(c));@/
 12375    cur_h:=cur_h+char_width(f)(char_info(f)(c));
 12376    p:=link(p);
 12377    until not is_char_node(p);
 12378    dvi_h:=cur_h;
 12379    end
 12380  else @<Output the non-|char_node| |p| for |hlist_out|
 12381      and move to the next node@>
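The inner loop exploits the fact that character codes below 128 are themselves the \.{DVI} commands that typeset them. A small Python sketch of the bytes produced for a run of characters, ignoring font changes and widths (the name |typeset_bytes| is hypothetical):

```python
SET1 = 128   # DVI opcode: typeset the character given by the next byte


def typeset_bytes(codes):
    """DVI bytes for a run of character codes in the range 0..255."""
    out = bytearray()
    for c in codes:
        if c >= 128:
            out.append(SET1)   # codes 128..255 need an explicit set1
        out.append(c)          # codes 0..127 double as set_char_0..set_char_127
    return bytes(out)


print(typeset_bytes([65, 200]).hex())
```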
 12382  
 12383  @ @<Change font |dvi_f| to |f|@>=
 12384  begin if not font_used[f] then
 12385    begin dvi_font_def(f); font_used[f]:=true;
 12386    end;
 12387  if f<=64+font_base then dvi_out(f-font_base-1+fnt_num_0)
 12388  else  begin dvi_out(fnt1); dvi_out(f-font_base-1);
 12389    end;
 12390  dvi_f:=f;
 12391  end
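The font-selection bytes follow the same pattern: the first 64 fonts have one-byte commands and the rest use |fnt1|. The Python sketch below assumes |font_base=0| by default and uses the \.{DVI} opcodes 171 and 235; the function name |select_font| is hypothetical.

```python
FNT_NUM_0, FNT1 = 171, 235


def select_font(f, font_base=0):
    """DVI bytes that make internal font f current (f > font_base)."""
    n = f - font_base - 1          # external font number in the DVI file
    if f <= 64 + font_base:
        return bytes([FNT_NUM_0 + n])
    return bytes([FNT1, n])


print(select_font(1).hex(), select_font(65).hex())
```

Internal font numbers are offset by one because |null_font| occupies the first slot and is never written to the \.{DVI} file.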
 12392  
 12393  @ @<Output the non-|char_node| |p| for |hlist_out|...@>=
 12394  begin case type(p) of
 12395  hlist_node,vlist_node:@<Output a box in an hlist@>;
 12396  rule_node: begin rule_ht:=height(p); rule_dp:=depth(p); rule_wd:=width(p);
 12397    goto fin_rule;
 12398    end;
 12399  whatsit_node: @<Output the whatsit node |p| in an hlist@>;
 12400  glue_node: @<Move right or output leaders@>;
 12401  kern_node,math_node:cur_h:=cur_h+width(p);
 12402  ligature_node: @<Make node |p| look like a |char_node| and |goto reswitch|@>;
 12403  othercases do_nothing
 12404  endcases;@/
 12405  goto next_p;
 12406  fin_rule: @<Output a rule in an hlist@>;
 12407  move_past: cur_h:=cur_h+rule_wd;
 12408  next_p:p:=link(p);
 12409  end
 12410  
 12411  @ @<Output a box in an hlist@>=
 12412  if list_ptr(p)=null then cur_h:=cur_h+width(p)
 12413  else  begin save_h:=dvi_h; save_v:=dvi_v;
 12414    cur_v:=base_line+shift_amount(p); {shift the box down}
 12415    temp_ptr:=p; edge:=cur_h;
 12416    if type(p)=vlist_node then vlist_out@+else hlist_out;
 12417    dvi_h:=save_h; dvi_v:=save_v;
 12418    cur_h:=edge+width(p); cur_v:=base_line;
 12419    end
 12420  
 12421  @ @<Output a rule in an hlist@>=
 12422  if is_running(rule_ht) then rule_ht:=height(this_box);
 12423  if is_running(rule_dp) then rule_dp:=depth(this_box);
 12424  rule_ht:=rule_ht+rule_dp; {this is the rule thickness}
 12425  if (rule_ht>0)and(rule_wd>0) then {we don't output empty rules}
 12426    begin synch_h; cur_v:=base_line+rule_dp; synch_v;
 12427    dvi_out(set_rule); dvi_four(rule_ht); dvi_four(rule_wd);
 12428    cur_v:=base_line; dvi_h:=dvi_h+rule_wd;
 12429    end
 12430  
 12431  @ @d billion==float_constant(1000000000)
 12432  @d vet_glue(#)== glue_temp:=#;
 12433    if glue_temp>billion then
 12434             glue_temp:=billion
 12435    else if glue_temp<-billion then
 12436             glue_temp:=-billion
 12437  
 12438  @<Move right or output leaders@>=
 12439  begin g:=glue_ptr(p); rule_wd:=width(g)-cur_g;
 12440  if g_sign<>normal then
 12441    begin if g_sign=stretching then
 12442      begin if stretch_order(g)=g_order then
 12443        begin cur_glue:=cur_glue+stretch(g);
 12444        vet_glue(float(glue_set(this_box))*cur_glue);
 12445  @^real multiplication@>
 12446        cur_g:=round(glue_temp);
 12447        end;
 12448      end
 12449    else if shrink_order(g)=g_order then
 12450        begin cur_glue:=cur_glue-shrink(g);
 12451        vet_glue(float(glue_set(this_box))*cur_glue);
 12452        cur_g:=round(glue_temp);
 12453        end;
 12454    end;
 12455  rule_wd:=rule_wd+cur_g;
 12456  if subtype(p)>=a_leaders then
 12457    @<Output leaders in an hlist, |goto fin_rule| if a rule
 12458      or to |next_p| if done@>;
 12459  goto move_past;
 12460  end
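The glue calculation rounds the running total of stretch (or shrink) and not each contribution separately, so rounding errors do not pile up across a box. A Python sketch of the stretching case, assuming that every glue item has the applicable order of infinity; |pascal_round| and |glue_widths| are hypothetical names, and rounding of halves away from zero is assumed for \PASCAL's |round|.

```python
import math

BILLION = 1000000000.0


def pascal_round(x):
    """Round to nearest integer, halves away from zero (assumed Pascal behavior)."""
    return int(math.floor(x + 0.5)) if x >= 0 else -int(math.floor(-x + 0.5))


def glue_widths(glue_set, items):
    """items is a list of (natural_width, stretch); returns the width used for each."""
    cur_g, cur_glue, widths = 0, 0.0, []
    for width, stretch in items:
        rule_wd = width - cur_g                       # undo the previous rounded total
        cur_glue += stretch
        glue_temp = max(-BILLION, min(BILLION, glue_set * cur_glue))   # vet_glue
        cur_g = pascal_round(glue_temp)
        widths.append(rule_wd + cur_g)                # add the new rounded total
    return widths


print(glue_widths(0.25, [(0, 1)] * 4))
```

With a glue ratio of 0.25 and four units of stretch, the widths sum to exactly 1, whereas rounding each 0.25 on its own would give 0.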
 12461  
 12462  @ @<Output leaders in an hlist...@>=
 12463  begin leader_box:=leader_ptr(p);
 12464  if type(leader_box)=rule_node then
 12465    begin rule_ht:=height(leader_box); rule_dp:=depth(leader_box);
 12466    goto fin_rule;
 12467    end;
 12468  leader_wd:=width(leader_box);
 12469  if (leader_wd>0)and(rule_wd>0) then
 12470    begin rule_wd:=rule_wd+10; {compensate for floating-point rounding}
 12471    edge:=cur_h+rule_wd; lx:=0;
 12472    @<Let |cur_h| be the position of the first box, and set |leader_wd+lx|
 12473      to the spacing between corresponding parts of boxes@>;
 12474    while cur_h+leader_wd<=edge do
 12475      @<Output a leader box at |cur_h|,
 12476        then advance |cur_h| by |leader_wd+lx|@>;
 12477    cur_h:=edge-10; goto next_p;
 12478    end;
 12479  end
 12480  
 12481  @ The calculations related to leaders require a bit of care. First, in the
 12482  case of |a_leaders| (aligned leaders), we want to move |cur_h| to
 12483  |left_edge| plus the smallest multiple of |leader_wd| for which the result
 12484  is not less than the current value of |cur_h|; i.e., |cur_h| should become
 12485  $|left_edge|+|leader_wd|\times\lceil
 12486  (|cur_h|-|left_edge|)/|leader_wd|\rceil$.  The program here should work in
 12487  all cases even though some implementations of \PASCAL\ give nonstandard
 12488  results for the |div| operation when |cur_h| is less than |left_edge|.
 12489  
 12490  In the case of |c_leaders| (centered leaders), we want to increase |cur_h|
 12491  by half of the excess space not occupied by the leaders; and in the
 12492  case of |x_leaders| (expanded leaders) we increase |cur_h|
 12493  by $1/(q+1)$ of this excess space, where $q$ is the number of times the
 12494  leader box will be replicated. Slight inaccuracies in the division might
 12495  accumulate; half of this rounding error is placed at each end of the leaders.
 12496  
 12497  @<Let |cur_h| be the position of the first box, ...@>=
 12498  if subtype(p)=a_leaders then
 12499    begin save_h:=cur_h;
 12500    cur_h:=left_edge+leader_wd*((cur_h-left_edge)@!div leader_wd);
 12501    if cur_h<save_h then cur_h:=cur_h+leader_wd;
 12502    end
 12503  else  begin lq:=rule_wd div leader_wd; {the number of box copies}
 12504    lr:=rule_wd mod leader_wd; {the remaining space}
 12505    if subtype(p)=c_leaders then cur_h:=cur_h+(lr div 2)
 12506    else  begin lx:=lr div (lq+1);
 12507      cur_h:=cur_h+((lr-(lq-1)*lx) div 2);
 12508      end;
 12509    end
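The three leader conventions can be compared side by side. The Python below is a sketch, not \TeX's code: |tdiv| imitates a truncating |div| (an assumption about the \PASCAL\ in use), |first_leader_pos| is a hypothetical name, and |rule_wd| is taken to be the value already in hand when this part of the program is reached, with |leader_wd>0| and |rule_wd>0|.

```python
def tdiv(a, b):
    """Integer division truncating toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def first_leader_pos(kind, cur_h, left_edge, leader_wd, rule_wd):
    """Return (cur_h, lx) for kind 'a' (aligned), 'c' (centered) or 'x' (expanded)."""
    lx = 0
    if kind == "a":
        save_h = cur_h
        cur_h = left_edge + leader_wd * tdiv(cur_h - left_edge, leader_wd)
        if cur_h < save_h:             # the division rounded down; move up one box
            cur_h += leader_wd
    else:
        lq = rule_wd // leader_wd      # the number of box copies
        lr = rule_wd % leader_wd       # the remaining space
        if kind == "c":
            cur_h += lr // 2
        else:
            lx = lr // (lq + 1)
            cur_h += (lr - (lq - 1) * lx) // 2
    return cur_h, lx


print(first_leader_pos("a", 25, 0, 10, 35))
print(first_leader_pos("x", 0, 0, 10, 35))
```

The correction step after the division is what makes the aligned case independent of how |div| treats negative operands.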
 12510  
 12511  @ The `\\{synch}' operations here are intended to decrease the number of
 12512  bytes needed to specify horizontal and vertical motion in the \.{DVI} output.
 12513  
 12514  @<Output a leader box at |cur_h|, ...@>=
 12515  begin cur_v:=base_line+shift_amount(leader_box); synch_v; save_v:=dvi_v;@/
 12516  synch_h; save_h:=dvi_h; temp_ptr:=leader_box;
 12517  outer_doing_leaders:=doing_leaders; doing_leaders:=true;
 12518  if type(leader_box)=vlist_node then vlist_out@+else hlist_out;
 12519  doing_leaders:=outer_doing_leaders;
 12520  dvi_v:=save_v; dvi_h:=save_h; cur_v:=base_line;
 12521  cur_h:=save_h+leader_wd+lx;
 12522  end
 12523  
 12524  @ The |vlist_out| routine is similar to |hlist_out|, but a bit simpler.
 12525  
 12526  @p procedure vlist_out; {output a |vlist_node| box}
 12527  label move_past, fin_rule, next_p;
 12528  var left_edge: scaled; {the left coordinate for this box}
 12529  @!top_edge: scaled; {the top coordinate for this box}
 12530  @!save_h,@!save_v: scaled; {what |dvi_h| and |dvi_v| should pop to}
 12531  @!this_box: pointer; {pointer to containing box}
 12532  @!g_order: glue_ord; {applicable order of infinity for glue}
 12533  @!g_sign: normal..shrinking; {selects type of glue}
 12534  @!p:pointer; {current position in the vlist}
 12535  @!save_loc:integer; {\.{DVI} byte location upon entry}
 12536  @!leader_box:pointer; {the leader box being replicated}
 12537  @!leader_ht:scaled; {height of leader box being replicated}
 12538  @!lx:scaled; {extra space between leader boxes}
 12539  @!outer_doing_leaders:boolean; {were we doing leaders?}
 12540  @!edge:scaled; {bottom boundary of leader space}
 12541  @!glue_temp:real; {glue value before rounding}
 12542  @!cur_glue:real; {glue seen so far}
 12543  @!cur_g:scaled; {rounded equivalent of |cur_glue| times the glue ratio}
 12544  begin cur_g:=0; cur_glue:=float_constant(0);
 12545  this_box:=temp_ptr; g_order:=glue_order(this_box);
 12546  g_sign:=glue_sign(this_box); p:=list_ptr(this_box);
 12547  incr(cur_s);
 12548  if cur_s>0 then dvi_out(push);
 12549  if cur_s>max_push then max_push:=cur_s;
 12550  save_loc:=dvi_offset+dvi_ptr; left_edge:=cur_h; cur_v:=cur_v-height(this_box);
 12551  top_edge:=cur_v;
 12552  while p<>null do @<Output node |p| for |vlist_out| and move to the next node,
 12553    maintaining the condition |cur_h=left_edge|@>;
 12554  prune_movements(save_loc);
 12555  if cur_s>0 then dvi_pop(save_loc);
 12556  decr(cur_s);
 12557  end;
 12558  
 12559  @ @<Output node |p| for |vlist_out|...@>=
 12560  begin if is_char_node(p) then confusion("vlistout")
 12561  @:this can't happen vlistout}{\quad vlistout@>
 12562  else @<Output the non-|char_node| |p| for |vlist_out|@>;
 12563  next_p:p:=link(p);
 12564  end
 12565  
 12566  @ @<Output the non-|char_node| |p| for |vlist_out|@>=
 12567  begin case type(p) of
 12568  hlist_node,vlist_node:@<Output a box in a vlist@>;
 12569  rule_node: begin rule_ht:=height(p); rule_dp:=depth(p); rule_wd:=width(p);
 12570    goto fin_rule;
 12571    end;
 12572  whatsit_node: @<Output the whatsit node |p| in a vlist@>;
 12573  glue_node: @<Move down or output leaders@>;
 12574  kern_node:cur_v:=cur_v+width(p);
 12575  othercases do_nothing
 12576  endcases;@/
 12577  goto next_p;
 12578  fin_rule: @<Output a rule in a vlist, |goto next_p|@>;
 12579  move_past: cur_v:=cur_v+rule_ht;
 12580  end
 12581  
 12582  @ The |synch_v| here allows the \.{DVI} output to use one-byte commands
 12583  for adjusting |v| in most cases, since the baselineskip distance will
 12584  usually be constant.
 12585  
 12586  @<Output a box in a vlist@>=
 12587  if list_ptr(p)=null then cur_v:=cur_v+height(p)+depth(p)
 12588  else  begin cur_v:=cur_v+height(p); synch_v;
 12589    save_h:=dvi_h; save_v:=dvi_v;
 12590    cur_h:=left_edge+shift_amount(p); {shift the box right}
 12591    temp_ptr:=p;
 12592    if type(p)=vlist_node then vlist_out@+else hlist_out;
 12593    dvi_h:=save_h; dvi_v:=save_v;
 12594    cur_v:=save_v+depth(p); cur_h:=left_edge;
 12595    end
 12596  
 12597  @ @<Output a rule in a vlist...@>=
 12598  if is_running(rule_wd) then rule_wd:=width(this_box);
 12599  rule_ht:=rule_ht+rule_dp; {this is the rule thickness}
 12600  cur_v:=cur_v+rule_ht;
 12601  if (rule_ht>0)and(rule_wd>0) then {we don't output empty rules}
 12602    begin synch_h; synch_v;
 12603    dvi_out(put_rule); dvi_four(rule_ht); dvi_four(rule_wd);
 12604    end;
 12605  goto next_p
 12606  
 12607  @ @<Move down or output leaders@>=
 12608  begin g:=glue_ptr(p); rule_ht:=width(g)-cur_g;
 12609  if g_sign<>normal then
 12610    begin if g_sign=stretching then
 12611      begin if stretch_order(g)=g_order then
 12612        begin cur_glue:=cur_glue+stretch(g);
 12613        vet_glue(float(glue_set(this_box))*cur_glue);
 12614  @^real multiplication@>
 12615        cur_g:=round(glue_temp);
 12616        end;
 12617      end
 12618    else if shrink_order(g)=g_order then
 12619        begin cur_glue:=cur_glue-shrink(g);
 12620        vet_glue(float(glue_set(this_box))*cur_glue);
 12621        cur_g:=round(glue_temp);
 12622        end;
 12623    end;
 12624  rule_ht:=rule_ht+cur_g;
 12625  if subtype(p)>=a_leaders then
 12626    @<Output leaders in a vlist, |goto fin_rule| if a rule
 12627      or to |next_p| if done@>;
 12628  goto move_past;
 12629  end
 12630  
 12631  @ @<Output leaders in a vlist...@>=
 12632  begin leader_box:=leader_ptr(p);
 12633  if type(leader_box)=rule_node then
 12634    begin rule_wd:=width(leader_box); rule_dp:=0;
 12635    goto fin_rule;
 12636    end;
 12637  leader_ht:=height(leader_box)+depth(leader_box);
 12638  if (leader_ht>0)and(rule_ht>0) then
 12639    begin rule_ht:=rule_ht+10; {compensate for floating-point rounding}
 12640    edge:=cur_v+rule_ht; lx:=0;
 12641    @<Let |cur_v| be the position of the first box, and set |leader_ht+lx|
 12642      to the spacing between corresponding parts of boxes@>;
 12643    while cur_v+leader_ht<=edge do
 12644      @<Output a leader box at |cur_v|,
 12645        then advance |cur_v| by |leader_ht+lx|@>;
 12646    cur_v:=edge-10; goto next_p;
 12647    end;
 12648  end
 12649  
 12650  @ @<Let |cur_v| be the position of the first box, ...@>=
 12651  if subtype(p)=a_leaders then
 12652    begin save_v:=cur_v;
 12653    cur_v:=top_edge+leader_ht*((cur_v-top_edge)@!div leader_ht);
 12654    if cur_v<save_v then cur_v:=cur_v+leader_ht;
 12655    end
 12656  else  begin lq:=rule_ht div leader_ht; {the number of box copies}
 12657    lr:=rule_ht mod leader_ht; {the remaining space}
 12658    if subtype(p)=c_leaders then cur_v:=cur_v+(lr div 2)
 12659    else  begin lx:=lr div (lq+1);
 12660      cur_v:=cur_v+((lr-(lq-1)*lx) div 2);
 12661      end;
 12662    end
 12663  
 12664  @ When we reach this part of the program, |cur_v| indicates the top of a
 12665  leader box, not its baseline.
 12666  
 12667  @<Output a leader box at |cur_v|, ...@>=
 12668  begin cur_h:=left_edge+shift_amount(leader_box); synch_h; save_h:=dvi_h;@/
 12669  cur_v:=cur_v+height(leader_box); synch_v; save_v:=dvi_v;
 12670  temp_ptr:=leader_box;
 12671  outer_doing_leaders:=doing_leaders; doing_leaders:=true;
 12672  if type(leader_box)=vlist_node then vlist_out@+else hlist_out;
 12673  doing_leaders:=outer_doing_leaders;
 12674  dvi_v:=save_v; dvi_h:=save_h; cur_h:=left_edge;
 12675  cur_v:=save_v-height(leader_box)+leader_ht+lx;
 12676  end
 12677  
 12678  @ The |hlist_out| and |vlist_out| procedures are now complete, so we are
 12679  ready for the |ship_out| routine that gets them started in the first place.
 12680  
 12681  @p procedure ship_out(@!p:pointer); {output the box |p|}
 12682  label done;
 12683  var page_loc:integer; {location of the current |bop|}
 12684  @!j,@!k:0..9; {indices to first ten count registers}
 12685  @!s:pool_pointer; {index into |str_pool|}
 12686  @!old_setting:0..max_selector; {saved |selector| setting}
 12687  begin if tracing_output>0 then
 12688    begin print_nl(""); print_ln;
 12689    print("Completed box being shipped out");
 12690  @.Completed box...@>
 12691    end;
 12692  if term_offset>max_print_line-9 then print_ln
 12693  else if (term_offset>0)or(file_offset>0) then print_char(" ");
 12694  print_char("["); j:=9;
 12695  while (count(j)=0)and(j>0) do decr(j);
 12696  for k:=0 to j do
 12697    begin print_int(count(k));
 12698    if k<j then print_char(".");
 12699    end;
 12700  update_terminal;
 12701  if tracing_output>0 then
 12702    begin print_char("]");
 12703    begin_diagnostic; show_box(p); end_diagnostic(true);
 12704    end;
 12705  @<Ship box |p| out@>;
 12706  if tracing_output<=0 then print_char("]");
 12707  dead_cycles:=0;
 12708  update_terminal; {progress report}
 12709  @<Flush the box from memory, showing statistics if requested@>;
 12710  end;
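
@ The progress report printed by |ship_out| drops trailing zero counts but
always keeps the first one. A minimal sketch of that rule in Python follows;
the function name |page_label| is hypothetical, and the closing bracket is
left out because |ship_out| prints it separately.

```python
def page_label(counts):
    """counts holds the ten values of count registers 0 to 9."""
    j = 9
    # skip trailing zeros, but never go below index 0
    while counts[j] == 0 and j > 0:
        j -= 1
    return "[" + ".".join(str(c) for c in counts[: j + 1])
```

For instance, the counts 1, 0, 3 followed by seven zeros are reported as
`[1.0.3', and a page whose counts are all zero is reported as `[0'.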
 12711  
 12712  @ @<Flush the box from memory, showing statistics if requested@>=
 12713  @!stat if tracing_stats>1 then
 12714    begin print_nl("Memory usage before: ");
 12715  @.Memory usage...@>
 12716    print_int(var_used); print_char("&");
 12717    print_int(dyn_used); print_char(";");
 12718    end;
 12719  tats@/
 12720  flush_node_list(p);
 12721  @!stat if tracing_stats>1 then
 12722    begin print(" after: ");
 12723    print_int(var_used); print_char("&");
 12724    print_int(dyn_used); print("; still untouched: ");
 12725    print_int(hi_mem_min-lo_mem_max-1); print_ln;
 12726    end;
 12727  tats
 12728  
 12729  @ @<Ship box |p| out@>=
 12730  @<Update the values of |max_h| and |max_v|; but if the page is too large,
 12731    |goto done|@>;
 12732  @<Initialize variables as |ship_out| begins@>;
 12733  page_loc:=dvi_offset+dvi_ptr;
 12734  dvi_out(bop);
 12735  for k:=0 to 9 do dvi_four(count(k));
 12736  dvi_four(last_bop); last_bop:=page_loc;
 12737  cur_v:=height(p)+v_offset; temp_ptr:=p;
 12738  if type(p)=vlist_node then vlist_out@+else hlist_out;
 12739  dvi_out(eop); incr(total_pages); cur_s:=-1;
 12740  done:
 12741  
 12742  @ Sometimes the user will generate a huge page because other error messages
 12743  are being ignored. Such pages are not output to the \.{dvi} file, since they
 12744  may confuse the printing software.
 12745  
 12746  @<Update the values of |max_h| and |max_v|; but if the page is too large...@>=
 12747  if (height(p)>max_dimen)or@|(depth(p)>max_dimen)or@|
 12748     (height(p)+depth(p)+v_offset>max_dimen)or@|
 12749     (width(p)+h_offset>max_dimen) then
 12750    begin print_err("Huge page cannot be shipped out");
 12751  @.Huge page...@>
 12752    help2("The page just created is more than 18 feet tall or")@/
 12753     ("more than 18 feet wide, so I suspect something went wrong.");
 12754    error;
 12755    if tracing_output<=0 then
 12756      begin begin_diagnostic;
 12757      print_nl("The following box has been deleted:");
 12758  @.The following...deleted@>
 12759      show_box(p);
 12760      end_diagnostic(true);
 12761      end;
 12762    goto done;
 12763    end;
 12764  if height(p)+depth(p)+v_offset>max_v then max_v:=height(p)+depth(p)+v_offset;
 12765  if width(p)+h_offset>max_h then max_h:=width(p)+h_offset
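
@ The ``18 feet'' in the help message can be recovered from the value of
|max_dimen|. The arithmetic below is a sketch in Python; it assumes
|max_dimen| is $2^{30}-1$ scaled points, as defined elsewhere in this
program, with 65536 scaled points per point and 72.27 points per inch.

```python
MAX_DIMEN = 2**30 - 1      # scaled points (assumed value of max_dimen)
SP_PER_PT = 65536
PT_PER_INCH = 72.27

def sp_to_feet(sp):
    """Convert scaled points to feet."""
    return sp / SP_PER_PT / PT_PER_INCH / 12

huge_page_limit_feet = sp_to_feet(MAX_DIMEN)  # between 18 and 19 feet
```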
 12766  
 12767  @ At the end of the program, we must finish things off by writing the
 12768  post\-amble. If |total_pages=0|, the \.{DVI} file was never opened.
 12769  If |total_pages>=65536|, the \.{DVI} file will lie. And if
 12770  |max_push>=65536|, the user deserves whatever chaos might ensue.
 12771  
 12772  An integer variable |k| will be declared for use by this routine.
 12773  
 12774  @<Finish the \.{DVI} file@>=
 12775  while cur_s>-1 do
 12776    begin if cur_s>0 then dvi_out(pop)
 12777    else  begin dvi_out(eop); incr(total_pages);
 12778      end;
 12779    decr(cur_s);
 12780    end;
 12781  if total_pages=0 then print_nl("No pages of output.")
 12782  @.No pages of output@>
 12783  else  begin dvi_out(post); {beginning of the postamble}
 12784    dvi_four(last_bop); last_bop:=dvi_offset+dvi_ptr-5; {|post| location}
 12785    dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp}
 12786    prepare_mag; dvi_four(mag); {magnification factor}
 12787    dvi_four(max_v); dvi_four(max_h);@/
 12788    dvi_out(max_push div 256); dvi_out(max_push mod 256);@/
 12789    dvi_out((total_pages div 256) mod 256); dvi_out(total_pages mod 256);@/
 12790    @<Output the font definitions for all fonts that were used@>;
 12791    dvi_out(post_post); dvi_four(last_bop); dvi_out(id_byte);@/
 12792    k:=4+((dvi_buf_size-dvi_ptr) mod 4); {the number of 223's}
 12793    while k>0 do
 12794      begin dvi_out(223); decr(k);
 12795      end;
 12796    @<Empty the last bytes out of |dvi_buf|@>;
 12797    print_nl("Output written on "); slow_print(output_file_name);
 12798  @.Output written on x@>
 12799    print(" ("); print_int(total_pages); print(" page");
 12800    if total_pages<>1 then print_char("s");
 12801    print(", "); print_int(dvi_offset+dvi_ptr); print(" bytes).");
 12802    b_close(dvi_file);
 12803    end
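
@ Three details of the postamble code above are easy to verify by hand: the
conversion ratio, the two-byte fields, and the count of trailing 223's. The
Python sketch below uses hypothetical helper names, and it assumes that
|dvi_buf_size| is a multiple of~4, which is what makes the padding rule round
the buffer position up to a multiple of~4.

```python
from fractions import Fraction

# Units of 10^-7 m per scaled point: 254000 such units per inch,
# 72.27 points per inch, 65536 scaled points per point.
SP_RATIO = Fraction(254000) / (Fraction(7227, 100) * 65536)

def two_bytes(n):
    """High and low byte as written for total_pages; n >= 65536 wraps."""
    return [(n // 256) % 256, n % 256]

def padding_count(dvi_buf_size, dvi_ptr):
    """Number of 223 bytes appended after id_byte (always 4 to 7)."""
    return 4 + (dvi_buf_size - dvi_ptr) % 4
```

Here |SP_RATIO| reduces to the same fraction as 25400000/473628672, and
|two_bytes(65537)| yields the bytes 0 and 1, which is the sense in which the
file ``will lie'' about a very large page count.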
 12804  
 12805  @ @<Output the font definitions...@>=
 12806  while font_ptr>font_base do
 12807    begin if font_used[font_ptr] then dvi_font_def(font_ptr);
 12808    decr(font_ptr);
 12809    end
 12810  
 12811  @* \[33] Packaging.
 12812  We're essentially done with the parts of \TeX\ that are concerned with
 12813  the input (|get_next|) and the output (|ship_out|). So it's time to
 12814  get heavily into the remaining part, which does the real work of typesetting.
 12815  
 12816  After lists are constructed, \TeX\ wraps them up and puts them into boxes.
 12817  Two major subroutines are given the responsibility for this task: |hpack|
 12818  applies to horizontal lists (hlists) and |vpack| applies to vertical lists
 12819  (vlists). The main duty of |hpack| and |vpack| is to compute the dimensions
 12820  of the resulting boxes, and to adjust the glue if one of those dimensions
 12821  is pre-specified. The computed sizes normally enclose all of the material
 12822  inside the new box; but some items may stick out if negative glue is used,
 12823  if the box is overfull, or if a \.{\\vbox} includes other boxes that have
 12824  been shifted left.
 12825  
 12826  The subroutine call |hpack(p,w,m)| returns a pointer to an |hlist_node|
 12827  for a box containing the hlist that starts at |p|. Parameter |w| specifies
 12828  a width; and parameter |m| is either `|exactly|' or `|additional|'.  Thus,
 12829  |hpack(p,w,exactly)| produces a box whose width is exactly |w|, while
 12830  |hpack(p,w,additional)| yields a box whose width is the natural width plus
 12831  |w|.  It is convenient to define a macro called `|natural|' to cover the
 12832  most common case, so that we can say |hpack(p,natural)| to get a box that
 12833  has the natural width of list |p|.
 12834  
 12835  Similarly, |vpack(p,w,m)| returns a pointer to a |vlist_node| for a
 12836  box containing the vlist that starts at |p|. In this case |w| represents
 12837  a height instead of a width; the parameter |m| is interpreted as in |hpack|.
 12838  
 12839  @d exactly=0 {a box dimension is pre-specified}
 12840  @d additional=1 {a box dimension is increased from the natural one}
 12841  @d natural==0,additional {shorthand for parameters to |hpack| and |vpack|}
 12842  
 12843  @ The parameters to |hpack| and |vpack| correspond to \TeX's primitives
 12844  like `\.{\\hbox} \.{to} \.{300pt}', `\.{\\hbox} \.{spread} \.{10pt}'; note
 12845  that `\.{\\hbox}' with no dimension following it is equivalent to
 12846  `\.{\\hbox} \.{spread} \.{0pt}'.  The |scan_spec| subroutine scans such
 12847  constructions in the user's input, including the mandatory left brace that
 12848  follows them, and it puts the specification onto |save_stack| so that the
 12849  desired box can later be obtained by executing the following code:
 12850  $$\vbox{\halign{#\hfil\cr
 12851  |save_ptr:=save_ptr-2;|\cr
 12852  |hpack(p,saved(1),saved(0)).|\cr}}$$
 12853  Special care is necessary to ensure that the special |save_stack| codes
 12854  are placed just below the new group code, because scanning can change
 12855  |save_stack| when \.{\\csname} appears.
 12856  
 12857  @p procedure scan_spec(@!c:group_code;@!three_codes:boolean);
 12858    {scans a box specification and left brace}
 12859  label found;
 12860  var @!s:integer; {temporarily saved value}
 12861  @!spec_code:exactly..additional;
 12862  begin if three_codes then s:=saved(0);
 12863  if scan_keyword("to") then spec_code:=exactly
 12864  @.to@>
 12865  else if scan_keyword("spread") then spec_code:=additional
 12866  @.spread@>
 12867  else  begin spec_code:=additional; cur_val:=0;
 12868    goto found;
 12869    end;
 12870  scan_normal_dimen;
 12871  found: if three_codes then
 12872    begin saved(0):=s; incr(save_ptr);
 12873    end;
 12874  saved(0):=spec_code; saved(1):=cur_val; save_ptr:=save_ptr+2;
 12875  new_save_level(c); scan_left_brace;
 12876  end;
 12877  
 12878  @ To figure out the glue setting, |hpack| and |vpack| determine how much
 12879  stretchability and shrinkability are present, considering all four orders
 12880  of infinity. The highest order of infinity that has a nonzero coefficient
 12881  is then used as if no other orders were present.
 12882  
 12883  For example, suppose that the given list contains six glue nodes with
 12884  the respective stretchabilities 3pt, 8fill, 5fil, 6pt, $-3$fil, $-8$fill.
 12885  Then the total is essentially 2fil; and if a total additional space of 6pt
 12886  is to be achieved by stretching, the actual amounts of stretch will be
 12887  0pt, 0pt, 15pt, 0pt, $-9$pt, and 0pt, since only `fil' glue will be
 12888  considered. (The `fill' glue is therefore not really stretching infinitely
 12889  with respect to `fil'; nobody would actually want that to happen.)
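
The six-glue example can be replayed with a few lines of Python. This is a
sketch only: the name |distribute_stretch| is hypothetical, amounts are plain
numbers of points rather than |scaled| values, and the order names are
written as strings.

```python
ORDERS = ("normal", "fil", "fill", "filll")  # lowest to highest

def distribute_stretch(glues, excess):
    """glues is a list of (amount, order) stretch components.

    Returns the stretch applied to each glue when the list must grow
    by excess; only the highest order with a nonzero total takes part.
    """
    totals = dict.fromkeys(ORDERS, 0)
    for amount, order in glues:
        totals[order] += amount
    # pick the highest order whose total is nonzero, else "normal"
    chosen = next((o for o in reversed(ORDERS) if totals[o] != 0), "normal")
    if totals[chosen] == 0:
        return [0 for _ in glues]  # there is nothing to stretch
    ratio = excess / totals[chosen]
    return [amount * ratio if o == chosen else 0 for amount, o in glues]

example = [(3, "normal"), (8, "fill"), (5, "fil"),
           (6, "normal"), (-3, "fil"), (-8, "fill")]
stretches = distribute_stretch(example, 6)  # [0, 0, 15.0, 0, -9.0, 0]
```

The `fill' components cancel, so the `fil' total of 2 decides the ratio 3,
in agreement with the amounts stated above.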
 12890  
 12891  The arrays |total_stretch| and |total_shrink| are used to determine how much
 12892  glue of each kind is present. A global variable |last_badness| is used
 12893  to implement \.{\\badness}.
 12894  
 12895  @<Glob...@>=
 12896  @!total_stretch, @!total_shrink: array[glue_ord] of scaled;
 12897    {glue found by |hpack| or |vpack|}
 12898  @!last_badness:integer; {badness of the most recently packaged box}
 12899  
 12900  @ If the global variable |adjust_tail| is non-null, the |hpack| routine
 12901  also removes all occurrences of |ins_node|, |mark_node|, and |adjust_node|
 12902  items and appends the resulting material onto the list that ends at
 12903  location |adjust_tail|.
 12904  
 12905  @<Glob...@>=
 12906  @!adjust_tail:pointer; {tail of adjustment list}
 12907  
 12908  @ @<Set init...@>=adjust_tail:=null; last_badness:=0;
 12909  
 12910  @ Here now is |hpack|, which contains few if any surprises.
 12911  
 12912  @p function hpack(@!p:pointer;@!w:scaled;@!m:small_number):pointer;
 12913  label reswitch, common_ending, exit;
 12914  var r:pointer; {the box node that will be returned}
 12915  @!q:pointer; {trails behind |p|}
 12916  @!h,@!d,@!x:scaled; {height, depth, and natural width}
 12917  @!s:scaled; {shift amount}
 12918  @!g:pointer; {points to a glue specification}
 12919  @!o:glue_ord; {order of infinity}
 12920  @!f:internal_font_number; {the font in a |char_node|}
 12921  @!i:four_quarters; {font information about a |char_node|}
 12922  @!hd:eight_bits; {height and depth indices for a character}
 12923  begin last_badness:=0; r:=get_node(box_node_size); type(r):=hlist_node;
 12924  subtype(r):=min_quarterword; shift_amount(r):=0;
 12925  q:=r+list_offset; link(q):=p;@/
 12926  h:=0; @<Clear dimensions to zero@>;
 12927  while p<>null do @<Examine node |p| in the hlist, taking account of its effect
 12928    on the dimensions of the new box, or moving it to the adjustment list;
 12929    then advance |p| to the next node@>;
 12930  if adjust_tail<>null then link(adjust_tail):=null;
 12931  height(r):=h; depth(r):=d;@/
 12932  @<Determine the value of |width(r)| and the appropriate glue setting;
 12933    then |return| or |goto common_ending|@>;
 12934  common_ending: @<Finish issuing a diagnostic message
 12935        for an overfull or underfull hbox@>;
 12936  exit: hpack:=r;
 12937  end;
 12938  
 12939  @ @<Clear dimensions to zero@>=
 12940  d:=0; x:=0;
 12941  total_stretch[normal]:=0; total_shrink[normal]:=0;
 12942  total_stretch[fil]:=0; total_shrink[fil]:=0;
 12943  total_stretch[fill]:=0; total_shrink[fill]:=0;
 12944  total_stretch[filll]:=0; total_shrink[filll]:=0
 12945  
 12946  @ @<Examine node |p| in the hlist, taking account of its effect...@>=
 12947  @^inner loop@>
 12948  begin reswitch: while is_char_node(p) do
 12949    @<Incorporate character dimensions into the dimensions of
 12950      the hbox that will contain~it, then move to the next node@>;
 12951  if p<>null then
 12952    begin case type(p) of
 12953    hlist_node,vlist_node,rule_node,unset_node:
 12954      @<Incorporate box dimensions into the dimensions of
 12955        the hbox that will contain~it@>;
 12956    ins_node,mark_node,adjust_node: if adjust_tail<>null then
 12957      @<Transfer node |p| to the adjustment list@>;
 12958    whatsit_node:@<Incorporate a whatsit node into an hbox@>;
 12959    glue_node:@<Incorporate glue into the horizontal totals@>;
 12960    kern_node,math_node: x:=x+width(p);
 12961    ligature_node: @<Make node |p| look like a |char_node|
 12962      and |goto reswitch|@>;
 12963    othercases do_nothing
 12964    endcases;@/
 12965    p:=link(p);
 12966    end;
 12967  end
 12968  
 12969  
 12970  @ @<Make node |p| look like a |char_node| and |goto reswitch|@>=
 12971  begin mem[lig_trick]:=mem[lig_char(p)]; link(lig_trick):=link(p);
 12972  p:=lig_trick; goto reswitch;
 12973  end
 12974  
 12975  @ The code here implicitly uses the fact that running dimensions are
 12976  indicated by |null_flag|, which will be ignored in the calculations
 12977  because it is a highly negative number.
 12978  
 12979  @<Incorporate box dimensions into the dimensions of the hbox...@>=
 12980  begin x:=x+width(p);
 12981  if type(p)>=rule_node then s:=0 @+else s:=shift_amount(p);
 12982  if height(p)-s>h then h:=height(p)-s;
 12983  if depth(p)+s>d then d:=depth(p)+s;
 12984  end
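
@ The effect of |shift_amount| and of running dimensions on the enclosing
hbox can be seen in a small model. The Python below is a sketch, not \TeX's
data structure: |Box| and |hbox_natural_size| are hypothetical names, rules
are modeled as boxes with zero shift, and |NULL_FLAG| is assumed to be
$-2^{30}$ as defined elsewhere in this program.

```python
from typing import NamedTuple

NULL_FLAG = -2**30  # marks a running dimension (assumed value)

class Box(NamedTuple):
    width: int
    height: int
    depth: int
    shift: int = 0  # positive means the box is lowered

def hbox_natural_size(boxes):
    """Return (width, height, depth) of an hbox holding the given boxes."""
    x = h = d = 0
    for b in boxes:
        x += b.width
        # a lowered box contributes less height and more depth
        h = max(h, b.height - b.shift)
        d = max(d, b.depth + b.shift)
    return x, h, d
```

A rule whose height and depth are |NULL_FLAG| never wins either comparison,
so it adds to the width only.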
 12985  
 12986  @ The following code is part of \TeX's inner loop; i.e., adding another
 12987  character of text to the user's input will cause each of these instructions
 12988  to be exercised one more time.
 12989  @^inner loop@>
 12990  
 12991  @<Incorporate character dimensions into the dimensions of the hbox...@>=
 12992  begin f:=font(p); i:=char_info(f)(character(p)); hd:=height_depth(i);
 12993  x:=x+char_width(f)(i);@/
 12994  s:=char_height(f)(hd);@+if s>h then h:=s;
 12995  s:=char_depth(f)(hd);@+if s>d then d:=s;
 12996  p:=link(p);
 12997  end
 12998  
 12999  @ Although node |q| is not necessarily the immediate predecessor of node |p|,
 13000  it always points to some node in the list preceding |p|. Thus, we can delete
 13001  nodes by moving |q| when necessary. The algorithm takes linear time, and the
 13002  extra computation does not intrude on the inner loop unless it is necessary
 13003  to make a deletion.
 13004  @^inner loop@>
 13005  
 13006  @<Transfer node |p| to the adjustment list@>=
 13007  begin while link(q)<>p do q:=link(q);
 13008  if type(p)=adjust_node then
 13009    begin link(adjust_tail):=adjust_ptr(p);
 13010    while link(adjust_tail)<>null do adjust_tail:=link(adjust_tail);
 13011    p:=link(p); free_node(link(q),small_node_size);
 13012    end
 13013  else  begin link(adjust_tail):=p; adjust_tail:=p; p:=link(p);
 13014    end;
 13015  link(q):=p; p:=q;
 13016  end
 13017  
 13018  @ @<Incorporate glue into the horizontal totals@>=
 13019  begin g:=glue_ptr(p); x:=x+width(g);@/
 13020  o:=stretch_order(g); total_stretch[o]:=total_stretch[o]+stretch(g);
 13021  o:=shrink_order(g); total_shrink[o]:=total_shrink[o]+shrink(g);
 13022  if subtype(p)>=a_leaders then
 13023    begin g:=leader_ptr(p);
 13024    if height(g)>h then h:=height(g);
 13025    if depth(g)>d then d:=depth(g);
 13026    end;
 13027  end
 13028  
 13029  @ When we get to the present part of the program, |x| is the natural width
 13030  of the box being packaged.
 13031  
 13032  @<Determine the value of |width(r)| and the appropriate glue setting...@>=
 13033  if m=additional then w:=x+w;
 13034  width(r):=w; x:=w-x; {now |x| is the excess to be made up}
 13035  if x=0 then
 13036    begin glue_sign(r):=normal; glue_order(r):=normal;
 13037    set_glue_ratio_zero(glue_set(r));
 13038    return;
 13039    end
 13040  else if x>0 then @<Determine horizontal glue stretch setting, then |return|
 13041      or \hbox{|goto common_ending|}@>
 13042  else @<Determine horizontal glue shrink setting, then |return|
 13043      or \hbox{|goto common_ending|}@>
 13044  
 13045  @ @<Determine horizontal glue stretch setting...@>=
 13046  begin @<Determine the stretch order@>;
 13047  glue_order(r):=o; glue_sign(r):=stretching;
 13048  if total_stretch[o]<>0 then glue_set(r):=unfloat(x/total_stretch[o])
 13049  @^real division@>
 13050  else  begin glue_sign(r):=normal;
 13051    set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch}
 13052    end;
 13053  if o=normal then if list_ptr(r)<>null then
 13054    @<Report an underfull hbox and |goto common_ending|, if this box
 13055      is sufficiently bad@>;
 13056  return;
 13057  end
 13058  
 13059  @ @<Determine the stretch order@>=
 13060  if total_stretch[filll]<>0 then o:=filll
 13061  else if total_stretch[fill]<>0 then o:=fill
 13062  else if total_stretch[fil]<>0 then o:=fil
 13063  else o:=normal
 13064  
 13065  @ @<Report an underfull hbox and |goto common_ending|, if...@>=
 13066  begin last_badness:=badness(x,total_stretch[normal]);
 13067  if last_badness>hbadness then
 13068    begin print_ln;
 13069    if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose");
 13070    print(" \hbox (badness "); print_int(last_badness);
 13071  @.Underfull \\hbox...@>
 13072  @.Loose \\hbox...@>
 13073    goto common_ending;
 13074    end;
 13075  end
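
@ The choice between ``Underfull'' and ``Loose'' depends only on the value
returned by |badness|. The Python sketch below transcribes the integer
arithmetic of the |badness| function that is defined elsewhere in this
program; the transcription and the helper name |underfull_label| are
assumptions of this sketch and should be checked against that definition.

```python
INF_BAD = 10000  # infinitely bad value

def badness(t, s):
    """Approximately 100*(t/s)**3 for t >= 0, using integers only."""
    if t == 0:
        return 0
    if s <= 0:
        return INF_BAD
    if t <= 7230584:
        r = (t * 297) // s
    elif s >= 1663497:
        r = t // (s // 297)
    else:
        r = t
    if r > 1290:
        return INF_BAD
    return (r * r * r + 0o400000) // 0o1000000

def underfull_label(x, total_stretch_normal, hbadness):
    """Start of the warning, or None when the box is good enough."""
    b = badness(x, total_stretch_normal)
    if b <= hbadness:
        return None
    return ("Underfull" if b > 100 else "Loose") + " \\hbox (badness %d" % b
```

Stretching finite glue by exactly its stretchability gives badness 100, and
stretching it by half gives badness 12, so the latter case is reported as
``Loose'' only when |hbadness| is below~12.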
 13076  
 13077  @ In order to provide a decent indication of where an overfull or underfull
 13078  box originated, we use a global variable |pack_begin_line| that is
 13079  set nonzero only when |hpack| is being called by the paragraph builder
 13080  or the alignment finishing routine.
 13081  
 13082  @<Glob...@>=
 13083  @!pack_begin_line:integer; {source file line where the current paragraph
 13084    or alignment began; a negative value denotes alignment}
 13085  
 13086  @ @<Set init...@>=
 13087  pack_begin_line:=0;
 13088  
 13089  @ @<Finish issuing a diagnostic message for an overfull or underfull hbox@>=
 13090  if output_active then print(") has occurred while \output is active")
 13091  else  begin if pack_begin_line<>0 then
 13092      begin if pack_begin_line>0 then print(") in paragraph at lines ")
 13093      else print(") in alignment at lines ");
 13094      print_int(abs(pack_begin_line));
 13095      print("--");
 13096      end
 13097    else print(") detected at line ");
 13098    print_int(line);
 13099    end;
 13100  print_ln;@/
 13101  font_in_short_display:=null_font; short_display(list_ptr(r)); print_ln;@/
 13102  begin_diagnostic; show_box(r); end_diagnostic(true)
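
@ The sign convention of |pack_begin_line| selects one of four endings for
the hbox warning. A sketch of that selection in Python, with the hypothetical
name |hbox_message_tail|, might read as follows.

```python
def hbox_message_tail(output_active, pack_begin_line, line):
    """Text that completes an overfull or underfull hbox warning."""
    if output_active:
        return ") has occurred while \\output is active"
    if pack_begin_line > 0:   # the paragraph builder called hpack
        return ") in paragraph at lines %d--%d" % (pack_begin_line, line)
    if pack_begin_line < 0:   # a negative value denotes an alignment
        return ") in alignment at lines %d--%d" % (-pack_begin_line, line)
    return ") detected at line %d" % line
```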
 13103  
 13104  @ @<Determine horizontal glue shrink setting...@>=
 13105  begin @<Determine the shrink order@>;
 13106  glue_order(r):=o; glue_sign(r):=shrinking;
 13107  if total_shrink[o]<>0 then glue_set(r):=unfloat((-x)/total_shrink[o])
 13108  @^real division@>
 13109  else  begin glue_sign(r):=normal;
 13110    set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink}
 13111    end;
 13112  if (total_shrink[o]<-x)and(o=normal)and(list_ptr(r)<>null) then
 13113    begin last_badness:=1000000;
 13114    set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
 13115    @<Report an overfull hbox and |goto common_ending|, if this box
 13116      is sufficiently bad@>;
 13117    end
 13118  else if o=normal then if list_ptr(r)<>null then
 13119    @<Report a tight hbox and |goto common_ending|, if this box
 13120      is sufficiently bad@>;
 13121  return;
 13122  end
 13123  
 13124  @ @<Determine the shrink order@>=
 13125  if total_shrink[filll]<>0 then o:=filll
 13126  else if total_shrink[fill]<>0 then o:=fill
 13127  else if total_shrink[fil]<>0 then o:=fil
 13128  else o:=normal
 13129  
 13130  @ @<Report an overfull hbox and |goto common_ending|, if...@>=
 13131  if (-x-total_shrink[normal]>hfuzz)or(hbadness<100) then
 13132    begin if (overfull_rule>0)and(-x-total_shrink[normal]>hfuzz) then
 13133      begin while link(q)<>null do q:=link(q);
 13134      link(q):=new_rule;
 13135      width(link(q)):=overfull_rule;
 13136      end;
 13137    print_ln; print_nl("Overfull \hbox (");
 13138  @.Overfull \\hbox...@>
 13139    print_scaled(-x-total_shrink[normal]); print("pt too wide");
 13140    goto common_ending;
 13141    end
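
@ Two separate thresholds govern an overfull hbox: the warning appears if the
excess is more than |hfuzz| or if |hbadness<100|, but the rule is appended
only in the first of those cases. The Python sketch below isolates that
decision; |overfull_decision| is a hypothetical name, and the caller is
assumed to have established already that the finite shrinkability is less
than the amount needed.

```python
def overfull_decision(x, total_shrink_normal, hfuzz, hbadness, overfull_rule):
    """x is the (negative) excess to be made up, in scaled points.

    Returns (report, add_rule, excess), where excess is the amount by
    which the box is too wide after all finite shrinkage is used.
    """
    excess = -x - total_shrink_normal
    report = excess > hfuzz or hbadness < 100
    add_rule = report and overfull_rule > 0 and excess > hfuzz
    return report, add_rule, excess
```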
 13142  
 13143  @ @<Report a tight hbox and |goto common_ending|, if...@>=
 13144  begin last_badness:=badness(-x,total_shrink[normal]);
 13145  if last_badness>hbadness then
 13146    begin print_ln; print_nl("Tight \hbox (badness "); print_int(last_badness);
 13147  @.Tight \\hbox...@>
 13148    goto common_ending;
 13149    end;
 13150  end
 13151  
 13152  @ The |vpack| subroutine is actually a special case of a slightly more
 13153  general routine called |vpackage|, which has four parameters. The fourth
 13154  parameter, which is |max_dimen| in the case of |vpack|, specifies the
 13155  maximum depth of the page box that is constructed. The depth is first
 13156  computed by the normal rules; if it exceeds this limit, the reference
 13157  point is simply moved down until the limiting depth is attained.
 13158  
 13159  @d vpack(#)==vpackage(#,max_dimen) {special case of unconstrained depth}
 13160  
 13161  @p function vpackage(@!p:pointer;@!h:scaled;@!m:small_number;@!l:scaled):
 13162    pointer;
 13163  label common_ending, exit;
 13164  var r:pointer; {the box node that will be returned}
 13165  @!w,@!d,@!x:scaled; {width, depth, and natural height}
 13166  @!s:scaled; {shift amount}
 13167  @!g:pointer; {points to a glue specification}
 13168  @!o:glue_ord; {order of infinity}
 13169  begin last_badness:=0; r:=get_node(box_node_size); type(r):=vlist_node;
 13170  subtype(r):=min_quarterword; shift_amount(r):=0;
 13171  list_ptr(r):=p;@/
 13172  w:=0; @<Clear dimensions to zero@>;
 13173  while p<>null do @<Examine node |p| in the vlist, taking account of its effect
 13174    on the dimensions of the new box; then advance |p| to the next node@>;
 13175  width(r):=w;
 13176  if d>l then
 13177    begin x:=x+d-l; depth(r):=l;
 13178    end
 13179  else depth(r):=d;
 13180  @<Determine the value of |height(r)| and the appropriate glue setting;
 13181    then |return| or |goto common_ending|@>;
 13182  common_ending: @<Finish issuing a diagnostic message
 13183        for an overfull or underfull vbox@>;
 13184  exit: vpackage:=r;
 13185  end;
 13186  
 13187  @ @<Examine node |p| in the vlist, taking account of its effect...@>=
 13188  begin if is_char_node(p) then confusion("vpack")
 13189  @:this can't happen vpack}{\quad vpack@>
 13190  else  case type(p) of
 13191    hlist_node,vlist_node,rule_node,unset_node:
 13192      @<Incorporate box dimensions into the dimensions of
 13193        the vbox that will contain~it@>;
 13194    whatsit_node:@<Incorporate a whatsit node into a vbox@>;
 13195    glue_node: @<Incorporate glue into the vertical totals@>;
 13196    kern_node: begin x:=x+d+width(p); d:=0;
 13197      end;
 13198    othercases do_nothing
 13199    endcases;
 13200  p:=link(p);
 13201  end
 13202  
 13203  @ @<Incorporate box dimensions into the dimensions of the vbox...@>=
 13204  begin x:=x+d+height(p); d:=depth(p);
 13205  if type(p)>=rule_node then s:=0 @+else s:=shift_amount(p);
 13206  if width(p)+s>w then w:=width(p)+s;
 13207  end
 13208  
 13209  @ @<Incorporate glue into the vertical totals@>=
 13210  begin x:=x+d; d:=0;@/
 13211  g:=glue_ptr(p); x:=x+width(g);@/
 13212  o:=stretch_order(g); total_stretch[o]:=total_stretch[o]+stretch(g);
 13213  o:=shrink_order(g); total_shrink[o]:=total_shrink[o]+shrink(g);
 13214  if subtype(p)>=a_leaders then
 13215    begin g:=leader_ptr(p);
 13216    if width(g)>w then w:=width(g);
 13217    end;
 13218  end
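
@ In a vlist the depth of each item is held back in |d| until something
follows it, and the depth limit |l| is applied only at the end. The following
Python sketch models that bookkeeping with tuples; the name
|vbox_natural_size| and the tuple layout are hypothetical, and leaders, whose
width can also affect |w|, are left out.

```python
def vbox_natural_size(items, depth_limit):
    """Return (width, natural height, depth) of a vbox.

    Each item is ("box", width, height, depth, shift),
    ("kern", amount) or ("glue", natural_width).
    """
    w = x = d = 0
    for item in items:
        if item[0] == "box":
            _, width, height, depth, shift = item
            x += d + height   # the pending depth becomes part of the height
            d = depth
            w = max(w, width + shift)
        else:                 # kern or glue
            x += d + item[1]
            d = 0
    if d > depth_limit:       # move the reference point down
        x += d - depth_limit
        d = depth_limit
    return w, x, d
```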
 13219  
 13220  @ When we get to the present part of the program, |x| is the natural height
 13221  of the box being packaged.
 13222  
 13223  @<Determine the value of |height(r)| and the appropriate glue setting...@>=
 13224  if m=additional then h:=x+h;
 13225  height(r):=h; x:=h-x; {now |x| is the excess to be made up}
 13226  if x=0 then
 13227    begin glue_sign(r):=normal; glue_order(r):=normal;
 13228    set_glue_ratio_zero(glue_set(r));
 13229    return;
 13230    end
 13231  else if x>0 then @<Determine vertical glue stretch setting, then |return|
 13232      or \hbox{|goto common_ending|}@>
 13233  else @<Determine vertical glue shrink setting, then |return|
 13234      or \hbox{|goto common_ending|}@>
 13235  
 13236  @ @<Determine vertical glue stretch setting...@>=
 13237  begin @<Determine the stretch order@>;
 13238  glue_order(r):=o; glue_sign(r):=stretching;
 13239  if total_stretch[o]<>0 then glue_set(r):=unfloat(x/total_stretch[o])
 13240  @^real division@>
 13241  else  begin glue_sign(r):=normal;
 13242    set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch}
 13243    end;
 13244  if o=normal then if list_ptr(r)<>null then
 13245    @<Report an underfull vbox and |goto common_ending|, if this box
 13246      is sufficiently bad@>;
 13247  return;
 13248  end
 13249  
 13250  @ @<Report an underfull vbox and |goto common_ending|, if...@>=
 13251  begin last_badness:=badness(x,total_stretch[normal]);
 13252  if last_badness>vbadness then
 13253    begin print_ln;
 13254    if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose");
 13255    print(" \vbox (badness "); print_int(last_badness);
 13256  @.Underfull \\vbox...@>
 13257  @.Loose \\vbox...@>
 13258    goto common_ending;
 13259    end;
 13260  end
 13261  
 13262  @ @<Finish issuing a diagnostic message for an overfull or underfull vbox@>=
 13263  if output_active then print(") has occurred while \output is active")
 13264  else  begin if pack_begin_line<>0 then {it's actually negative}
 13265      begin print(") in alignment at lines ");
 13266      print_int(abs(pack_begin_line));
 13267      print("--");
 13268      end
 13269    else print(") detected at line ");
 13270    print_int(line);
 13271    print_ln;@/
 13272    end;
 13273  begin_diagnostic; show_box(r); end_diagnostic(true)
 13274  
 13275  @ @<Determine vertical glue shrink setting...@>=
 13276  begin @<Determine the shrink order@>;
 13277  glue_order(r):=o; glue_sign(r):=shrinking;
 13278  if total_shrink[o]<>0 then glue_set(r):=unfloat((-x)/total_shrink[o])
 13279  @^real division@>
 13280  else  begin glue_sign(r):=normal;
 13281    set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink}
 13282    end;
 13283  if (total_shrink[o]<-x)and(o=normal)and(list_ptr(r)<>null) then
 13284    begin last_badness:=1000000;
 13285    set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
 13286    @<Report an overfull vbox and |goto common_ending|, if this box
 13287      is sufficiently bad@>;
 13288    end
 13289  else if o=normal then if list_ptr(r)<>null then
 13290    @<Report a tight vbox and |goto common_ending|, if this box
 13291      is sufficiently bad@>;
 13292  return;
 13293  end
 13294  
 13295  @ @<Report an overfull vbox and |goto common_ending|, if...@>=
 13296  if (-x-total_shrink[normal]>vfuzz)or(vbadness<100) then
 13297    begin print_ln; print_nl("Overfull \vbox (");
 13298  @.Overfull \\vbox...@>
 13299    print_scaled(-x-total_shrink[normal]); print("pt too high");
 13300    goto common_ending;
 13301    end
 13302  
 13303  @ @<Report a tight vbox and |goto common_ending|, if...@>=
 13304  begin last_badness:=badness(-x,total_shrink[normal]);
 13305  if last_badness>vbadness then
 13306    begin print_ln; print_nl("Tight \vbox (badness "); print_int(last_badness);
 13307  @.Tight \\vbox...@>
 13308    goto common_ending;
 13309    end;
 13310  end
 13311  
 13312  @ When a box is being appended to the current vertical list, the
 13313  baselineskip calculation is handled by the |append_to_vlist| routine.
 13314  
 13315  @p procedure append_to_vlist(@!b:pointer);
 13316  var d:scaled; {deficiency of space between baselines}
 13317  @!p:pointer; {a new glue node}
 13318  begin if prev_depth>ignore_depth then
 13319    begin d:=width(baseline_skip)-prev_depth-height(b);
 13320    if d<line_skip_limit then p:=new_param_glue(line_skip_code)
 13321    else  begin p:=new_skip_param(baseline_skip_code);
 13322      width(temp_ptr):=d; {|temp_ptr=glue_ptr(p)|}
 13323      end;
 13324    link(tail):=p; tail:=p;
 13325    end;
 13326  link(tail):=b; tail:=b; prev_depth:=depth(b);
 13327  end;
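
@ The interline glue rule in |append_to_vlist| reduces to one subtraction and
one comparison. Here is a sketch in Python that returns only the natural
width of the glue; the name |interline_glue| is hypothetical, stretch and
shrink components are ignored, and |IGNORE_DEPTH| is assumed to be $-1000$pt
as defined elsewhere in this program.

```python
PT = 65536                  # scaled points per point
IGNORE_DEPTH = -1000 * PT   # assumed value of ignore_depth

def interline_glue(prev_depth, height, baseline_skip, line_skip,
                   line_skip_limit):
    """Natural width of the glue placed before a box of the given height,
    or None when prev_depth says that no glue should be inserted."""
    if prev_depth <= IGNORE_DEPTH:
        return None
    d = baseline_skip - prev_depth - height  # deficiency between baselines
    if d < line_skip_limit:
        return line_skip     # boxes would come too close: use lineskip
    return d                 # otherwise keep baselines baseline_skip apart
```

With illustrative settings of 12pt, 1pt, and 0pt for the three parameters, a
box of height 7pt after a depth of 2pt receives 3pt of glue, whereas a box of
height 11pt receives the 1pt |line_skip| instead.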
 13328  
 13329  @* \[34] Data structures for math mode.
 13330  When \TeX\ reads a formula that is enclosed between \.\$'s, it constructs an
 13331  {\sl mlist}, which is essentially a tree structure representing that
 13332  formula.  An mlist is a linear sequence of items, but we can regard it as
 13333  a tree structure because mlists can appear within mlists. For example, many
 13334  of the entries can be subscripted or superscripted, and such ``scripts''
 13335  are mlists in their own right.
 13336  
 13337  An entire formula is parsed into such a tree before any of the actual
 13338  typesetting is done, because the current style of type is usually not
 13339  known until the formula has been fully scanned. For example, when the
 13340  formula `\.{\$a+b \\over c+d\$}' is being read, there is no way to tell
 13341  that `\.{a+b}' will be in script size until `\.{\\over}' has appeared.
 13342  
 13343  During the scanning process, each element of the mlist being built is
 13344  classified as a relation, a binary operator, an open parenthesis, etc.,
 13345  or as a construct like `\.{\\sqrt}' that must be built up. This classification
 13346  appears in the mlist data structure.
 13347  
 13348  After a formula has been fully scanned, the mlist is converted to an hlist
 13349  so that it can be incorporated into the surrounding text. This conversion is
 13350  controlled by a recursive procedure that decides all of the appropriate
 13351  styles by a ``top-down'' process starting at the outermost level and working
 13352  in towards the subformulas. The formula is ultimately pasted together using
 13353  combinations of horizontal and vertical boxes, with glue and penalty nodes
 13354  inserted as necessary.
 13355  
 13356  An mlist is represented internally as a linked list consisting chiefly
 13357  of ``noads'' (pronounced ``no-adds''), to distinguish them from the somewhat
 13358  similar ``nodes'' in hlists and vlists. Certain kinds of ordinary nodes are
 13359  allowed to appear in mlists together with the noads; \TeX\ tells the difference
 13360  by means of the |type| field, since a noad's |type| is always greater than
 13361  that of a node. An mlist does not contain character nodes, hlist nodes, vlist
 13362  nodes, math nodes, ligature nodes,
 13363  or unset nodes; in particular, each mlist item appears in the
 13364  variable-size part of |mem|, so the |type| field is always present.
 13365  
 13366  @ Each noad is four or more words long. The first word contains the |type|
 13367  and |subtype| and |link| fields that are already so familiar to us; the
 13368  second, third, and fourth words are called the noad's |nucleus|, |subscr|,
 13369  and |supscr| fields.
 13370  
 13371  Consider, for example, the simple formula `\.{\$x\^2\$}', which would be
 13372  parsed into an mlist containing a single element called an |ord_noad|.
 13373  The |nucleus| of this noad is a representation of `\.x', the |subscr| is
 13374  empty, and the |supscr| is a representation of `\.2'.
 13375  
 13376  The |nucleus|, |subscr|, and |supscr| fields are further broken into
 13377  subfields. If |p| points to a noad, and if |q| is one of its principal
 13378  fields (e.g., |q=subscr(p)|), there are several possibilities for the
 13379  subfields, depending on the |math_type| of |q|.
 13380  
 13381  \yskip\hang|math_type(q)=math_char| means that |fam(q)| refers to one of
 13382  the sixteen font families, and |character(q)| is the number of a character
 13383  within a font of that family, as in a character node.
 13384  
 13385  \yskip\hang|math_type(q)=math_text_char| is similar, but the character is
 13386  unsubscripted and unsuperscripted and it is followed immediately by another
 13387  character from the same font. (This |math_type| setting appears only
 13388  briefly during the processing; it is used to suppress unwanted italic
 13389  corrections.)
 13390  
 13391  \yskip\hang|math_type(q)=empty| indicates a field with no value (the
 13392  corresponding attribute of noad |p| is not present).
 13393  
 13394  \yskip\hang|math_type(q)=sub_box| means that |info(q)| points to a box
 13395  node (either an |hlist_node| or a |vlist_node|) that should be used as the
 13396  value of the field.  The |shift_amount| in the subsidiary box node is the
 13397  amount by which that box will be shifted downward.
 13398  
 13399  \yskip\hang|math_type(q)=sub_mlist| means that |info(q)| points to
 13400  an mlist; the mlist must be converted to an hlist in order to obtain
 13401  the value of this field.
 13402  
 13403  \yskip\noindent In the latter case, we might have |info(q)=null|. This
 13404  is not the same as |math_type(q)=empty|; for example, `\.{\$P\_\{\}\$}'
 13405  and `\.{\$P\$}' produce different results (the former will not have the
 13406  ``italic correction'' added to the width of |P|, but the ``script skip''
 13407  will be added).
 13408  
 13409  The definitions of subfields given here are evidently wasteful of space,
 13410  since a halfword is being used for the |math_type| although only three
 13411  bits would be needed. However, there are hardly ever many noads present at
 13412  once, since they are soon converted to nodes that take up even more space,
 13413  so we can afford to represent them in whatever way simplifies the
 13414  programming.
 13415  
 13416  @d noad_size=4 {number of words in a normal noad}
 13417  @d nucleus(#)==#+1 {the |nucleus| field of a noad}
 13418  @d supscr(#)==#+2 {the |supscr| field of a noad}
 13419  @d subscr(#)==#+3 {the |subscr| field of a noad}
 13420  @d math_type==link {a |halfword| in |mem|}
 13421  @d fam==font {a |quarterword| in |mem|}
 13422  @d math_char=1 {|math_type| when the attribute is simple}
 13423  @d sub_box=2 {|math_type| when the attribute is a box}
 13424  @d sub_mlist=3 {|math_type| when the attribute is a formula}
 13425  @d math_text_char=4 {|math_type| when italic correction is dubious}
 13426  
 13427  @ Each portion of a formula is classified as Ord, Op, Bin, Rel, Open,
 13428  Close, Punct, or Inner, for purposes of spacing and line breaking. An
 13429  |ord_noad|, |op_noad|, |bin_noad|, |rel_noad|, |open_noad|, |close_noad|,
 13430  |punct_noad|, or |inner_noad| is used to represent portions of the various
 13431  types. For example, an `\.=' sign in a formula leads to the creation of a
 13432  |rel_noad| whose |nucleus| field is a representation of an equals sign
 13433  (usually |fam=0|, |character=@'75|).  A formula preceded by \.{\\mathrel}
 13434  also results in a |rel_noad|.  When a |rel_noad| is followed by an
 13435  |op_noad|, say, and possibly separated by one or more ordinary nodes (not
 13436  noads), \TeX\ will insert a penalty node (with the current |rel_penalty|)
 13437  just after the formula that corresponds to the |rel_noad|, unless there
 13438  already was a penalty immediately following; and a ``thick space'' will be
 13439  inserted just before the formula that corresponds to the |op_noad|.
 13440  
 13441  A noad of type |ord_noad|, |op_noad|, \dots, |inner_noad| usually
 13442  has a |subtype=normal|. The only exception is that an |op_noad| might
 13443  have |subtype=limits| or |no_limits|, if the normal positioning of
 13444  limits has been overridden for this operator.
 13445  
 13446  @d ord_noad=unset_node+3 {|type| of a noad classified Ord}
 13447  @d op_noad=ord_noad+1 {|type| of a noad classified Op}
 13448  @d bin_noad=ord_noad+2 {|type| of a noad classified Bin}
 13449  @d rel_noad=ord_noad+3 {|type| of a noad classified Rel}
 13450  @d open_noad=ord_noad+4 {|type| of a noad classified Open}
 13451  @d close_noad=ord_noad+5 {|type| of a noad classified Close}
 13452  @d punct_noad=ord_noad+6 {|type| of a noad classified Punct}
 13453  @d inner_noad=ord_noad+7 {|type| of a noad classified Inner}
 13454  @d limits=1 {|subtype| of |op_noad| whose scripts are to be above, below}
 13455  @d no_limits=2 {|subtype| of |op_noad| whose scripts are to be normal}
 13456  
 13457  @ A |radical_noad| is five words long; the fifth word is the |left_delimiter|
 13458  field, which usually represents a square root sign.
 13459  
 13460  A |fraction_noad| is six words long; it has a |right_delimiter| field
 13461  as well as a |left_delimiter|.
 13462  
 13463  Delimiter fields are of type |four_quarters|, and they have four subfields
 13464  called |small_fam|, |small_char|, |large_fam|, |large_char|. These subfields
 13465  represent variable-size delimiters by giving the ``small'' and ``large''
 13466  starting characters, as explained in Chapter~17 of {\sl The \TeX book}.
 13467  @:TeXbook}{\sl The \TeX book@>
 13468  
 13469  A |fraction_noad| is actually quite different from all other noads. Not
 13470  only does it have six words, it has |thickness|, |denominator|, and
 13471  |numerator| fields instead of |nucleus|, |subscr|, and |supscr|. The
 13472  |thickness| is a scaled value that tells how thick to make a fraction
 13473  rule; however, the special value |default_code| is used to stand for the
 13474  |default_rule_thickness| of the current size. The |numerator| and
 13475  |denominator| point to mlists that define a fraction; we always have
 13476  $$\hbox{|math_type(numerator)=math_type(denominator)=sub_mlist|}.$$ The
 13477  |left_delimiter| and |right_delimiter| fields specify delimiters that will
 13478  be placed at the left and right of the fraction. In this way, a
 13479  |fraction_noad| is able to represent all of \TeX's operators \.{\\over},
 13480  \.{\\atop}, \.{\\above}, \.{\\overwithdelims}, \.{\\atopwithdelims}, and
 13481   \.{\\abovewithdelims}.
 13482  
 13483  @d left_delimiter(#)==#+4 {first delimiter field of a noad}
 13484  @d right_delimiter(#)==#+5 {second delimiter field of a fraction noad}
 13485  @d radical_noad=inner_noad+1 {|type| of a noad for square roots}
 13486  @d radical_noad_size=5 {number of |mem| words in a radical noad}
 13487  @d fraction_noad=radical_noad+1 {|type| of a noad for generalized fractions}
 13488  @d fraction_noad_size=6 {number of |mem| words in a fraction noad}
 13489  @d small_fam(#)==mem[#].qqqq.b0 {|fam| for ``small'' delimiter}
 13490  @d small_char(#)==mem[#].qqqq.b1 {|character| for ``small'' delimiter}
 13491  @d large_fam(#)==mem[#].qqqq.b2 {|fam| for ``large'' delimiter}
 13492  @d large_char(#)==mem[#].qqqq.b3 {|character| for ``large'' delimiter}
 13493  @d thickness==width {|thickness| field in a fraction noad}
 13494  @d default_code==@'10000000000 {denotes |default_rule_thickness|}
 13495  @d numerator==supscr {|numerator| field in a fraction noad}
 13496  @d denominator==subscr {|denominator| field in a fraction noad}
 13497  
 13498  @ The global variable |empty_field| is set up for initialization of empty
 13499  fields in new noads. Similarly, |null_delimiter| is for the initialization
 13500  of delimiter fields.
 13501  
 13502  @<Glob...@>=
 13503  @!empty_field:two_halves;
 13504  @!null_delimiter:four_quarters;
 13505  
 13506  @ @<Set init...@>=
 13507  empty_field.rh:=empty; empty_field.lh:=null;@/
 13508  null_delimiter.b0:=0; null_delimiter.b1:=min_quarterword;@/
 13509  null_delimiter.b2:=0; null_delimiter.b3:=min_quarterword;
 13510  
 13511  @ The |new_noad| function creates an |ord_noad| that is completely null.
 13512  
 13513  @p function new_noad:pointer;
 13514  var p:pointer;
 13515  begin p:=get_node(noad_size);
 13516  type(p):=ord_noad; subtype(p):=normal;
 13517  mem[nucleus(p)].hh:=empty_field;
 13518  mem[subscr(p)].hh:=empty_field;
 13519  mem[supscr(p)].hh:=empty_field;
 13520  new_noad:=p;
 13521  end;
 13522  
 13523  @ A few more kinds of noads will complete the set: An |under_noad| has its
 13524  nucleus underlined; an |over_noad| has it overlined. An |accent_noad| places
 13525  an accent over its nucleus; the accent character appears as
 13526  |fam(accent_chr(p))| and |character(accent_chr(p))|. A |vcenter_noad|
 13527  centers its nucleus vertically with respect to the axis of the formula;
 13528  in such noads we always have |math_type(nucleus(p))=sub_box|.
 13529  
 13530  And finally, we have |left_noad| and |right_noad| types, to implement
 13531  \TeX's \.{\\left} and \.{\\right}. The |nucleus| of such noads is
 13532  replaced by a |delimiter| field; thus, for example, `\.{\\left(}' produces
 13533  a |left_noad| such that |delimiter(p)| holds the family and character
 13534  codes for all left parentheses. A |left_noad| never appears in an mlist
 13535  except as the first element, and a |right_noad| never appears in an mlist
 13536  except as the last element; furthermore, we either have both a |left_noad|
 13537  and a |right_noad|, or neither one is present. The |subscr| and |supscr|
 13538  fields are always |empty| in a |left_noad| and a |right_noad|.
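The numbering of the noad types can be sketched outside of WEB as follows. This is only an illustration in Python, not part of the program; it assumes unset_node=13, a value taken from the node-type definitions earlier in the file rather than from this part, and the helper names are made up for the example.

```python
# Assumption: unset_node = 13, from the earlier node-type definitions.
UNSET_NODE = 13
STYLE_NODE = UNSET_NODE + 1
CHOICE_NODE = UNSET_NODE + 2

# Noad kinds in the order their type codes are defined, starting at unset_node+3.
NOAD_NAMES = ["ord", "op", "bin", "rel", "open", "close", "punct", "inner",
              "radical", "fraction", "under", "over", "accent", "vcenter",
              "left", "right"]
NOAD_TYPE = {name: UNSET_NODE + 3 + i for i, name in enumerate(NOAD_NAMES)}


def is_noad(type_code):
    """A noad's type is always greater than that of any ordinary node."""
    return type_code >= NOAD_TYPE["ord"]


def scripts_allowed(type_code):
    """Mirror of the scripts_allowed macro: ord_noad <= type < left_noad."""
    return NOAD_TYPE["ord"] <= type_code < NOAD_TYPE["left"]
```

Because the codes are consecutive, a single range test distinguishes the noads that may carry scripts from the left and right noads that close the sequence.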
 13539  
 13540  @d under_noad=fraction_noad+1 {|type| of a noad for underlining}
 13541  @d over_noad=under_noad+1 {|type| of a noad for overlining}
 13542  @d accent_noad=over_noad+1 {|type| of a noad for accented subformulas}
 13543  @d accent_noad_size=5 {number of |mem| words in an accent noad}
 13544  @d accent_chr(#)==#+4 {the |accent_chr| field of an accent noad}
 13545  @d vcenter_noad=accent_noad+1 {|type| of a noad for \.{\\vcenter}}
 13546  @d left_noad=vcenter_noad+1 {|type| of a noad for \.{\\left}}
 13547  @d right_noad=left_noad+1 {|type| of a noad for \.{\\right}}
 13548  @d delimiter==nucleus {|delimiter| field in left and right noads}
 13549  @d scripts_allowed(#)==(type(#)>=ord_noad)and(type(#)<left_noad)
 13550  
 13551  @ Math formulas can also contain instructions like \.{\\textstyle} that
 13552  override \TeX's normal style rules. A |style_node| is inserted into the
 13553  data structure to record such instructions; it is three words long, so it
 13554  is considered a node instead of a noad. The |subtype| is either |display_style|
 13555  or |text_style| or |script_style| or |script_script_style|. The
 13556  second and third words of a |style_node| are not used, but they are
 13557  present because a |choice_node| is converted to a |style_node|.
 13558  
 13559  \TeX\ uses even numbers 0, 2, 4, 6 to encode the basic styles
 13560  |display_style|, \dots, |script_script_style|, and adds~1 to get the
 13561  ``cramped'' versions of these styles. This gives a numerical order that
 13562  is backwards from the convention of Appendix~G in {\sl The \TeX book\/};
 13563  i.e., a smaller style has a larger numerical value.
 13564  @:TeXbook}{\sl The \TeX book@>
 13565  
 13566  @d style_node=unset_node+1 {|type| of a style node}
 13567  @d style_node_size=3 {number of words in a style node}
 13568  @d display_style=0 {|subtype| for \.{\\displaystyle}}
 13569  @d text_style=2 {|subtype| for \.{\\textstyle}}
 13570  @d script_style=4 {|subtype| for \.{\\scriptstyle}}
 13571  @d script_script_style=6 {|subtype| for \.{\\scriptscriptstyle}}
 13572  @d cramped=1 {add this to an uncramped style if you want to cramp it}
 13573  
 13574  @p function new_style(@!s:small_number):pointer; {create a style node}
 13575  var p:pointer; {the new node}
 13576  begin p:=get_node(style_node_size); type(p):=style_node;
 13577  subtype(p):=s; width(p):=0; depth(p):=0; {the |width| and |depth| are not used}
 13578  new_style:=p;
 13579  end;
 13580  
 13581  @ Finally, the \.{\\mathchoice} primitive creates a |choice_node|, which
 13582  has special subfields |display_mlist|, |text_mlist|, |script_mlist|,
 13583  and |script_script_mlist| pointing to the mlists for each style.
 13584  
 13585  @d choice_node=unset_node+2 {|type| of a choice node}
 13586  @d display_mlist(#)==info(#+1) {mlist to be used in display style}
 13587  @d text_mlist(#)==link(#+1) {mlist to be used in text style}
 13588  @d script_mlist(#)==info(#+2) {mlist to be used in script style}
 13589  @d script_script_mlist(#)==link(#+2) {mlist to be used in scriptscript style}
 13590  
 13591  @p function new_choice:pointer; {create a choice node}
 13592  var p:pointer; {the new node}
 13593  begin p:=get_node(style_node_size); type(p):=choice_node;
 13594  subtype(p):=0; {the |subtype| is not used}
 13595  display_mlist(p):=null; text_mlist(p):=null; script_mlist(p):=null;
 13596  script_script_mlist(p):=null;
 13597  new_choice:=p;
 13598  end;
 13599  
 13600  @ Let's consider now the previously unwritten part of |show_node_list|
 13601  that displays the things that can only be present in mlists; this
 13602  program illustrates how to access the data structures just defined.
 13603  
 13604  In the context of the following program, |p| points to a node or noad that
 13605  should be displayed, and the current string contains the ``recursion history''
 13606  that leads to this point. The recursion history consists of a dot for each
 13607  outer level in which |p| is subsidiary to some node, or in which |p| is
 13608  subsidiary to the |nucleus| field of some noad; the dot is replaced by
 13609  `\.\_' or `\.\^' or `\./' or `\.\\' if |p| is descended from the |subscr|
 13610  or |supscr| or |denominator| or |numerator| fields of noads. For example,
 13611  the current string would be `\.{.\^.\_/}' if |p| points to the |ord_noad| for
 13612  |x| in the (ridiculous) formula
 13613  `\.{\$\\sqrt\{a\^\{\\mathinner\{b\_\{c\\over x+y\}\}\}\}\$}'.
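The recursion history for this example can be reproduced with a small model. The Python sketch below is an illustration only: it represents each noad as a dictionary with a made-up "label" key, and it ignores everything about the real mem layout. It simply appends one mark per level of descent, as described above.

```python
# One history character per field through which we descend.
FIELD_MARK = {"nucleus": ".", "supscr": "^", "subscr": "_",
              "numerator": "\\", "denominator": "/"}


def history(mlist, target, prefix=""):
    """Return the recursion history leading to the noad labelled target."""
    for noad in mlist:
        if noad.get("label") == target:
            return prefix
        for field, mark in FIELD_MARK.items():
            sub = noad.get(field)
            if isinstance(sub, list):  # a subsidiary mlist
                found = history(sub, target, prefix + mark)
                if found is not None:
                    return found
    return None


# Simplified model of \sqrt{a^{\mathinner{b_{c\over x+y}}}}.
formula = [{"label": "sqrt", "nucleus": [
    {"label": "a", "supscr": [
        {"label": "inner", "nucleus": [
            {"label": "b", "subscr": [
                {"label": "frac",
                 "numerator": [{"label": "c"}],
                 "denominator": [{"label": "x"}, {"label": "+"},
                                 {"label": "y"}]}]}]}]}]}]
```

In this model, history(formula, "x") yields the five characters dot, hat, dot, underscore, slash quoted in the text, and asking for "c" instead ends in a backslash because the numerator is entered.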
 13614  
 13615  @<Cases of |show_node_list| that arise...@>=
 13616  style_node:print_style(subtype(p));
 13617  choice_node:@<Display choice node |p|@>;
 13618  ord_noad,op_noad,bin_noad,rel_noad,open_noad,close_noad,punct_noad,inner_noad,
 13619    radical_noad,over_noad,under_noad,vcenter_noad,accent_noad,
 13620    left_noad,right_noad:@<Display normal noad |p|@>;
 13621  fraction_noad:@<Display fraction noad |p|@>;
 13622  
 13623  @ Here are some simple routines used in the display of noads.
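The arithmetic used when a delimiter is printed packs four subfields into one 24-bit value. A hedged Python sketch of that packing and its inverse follows; the function names are hypothetical, and the sample values in the note are illustrative only.

```python
def pack_delimiter(small_fam, small_char, large_fam, large_char):
    """Combine the four delimiter subfields into one integer.

    Each (fam, char) pair occupies 12 bits: 4 for the family, 8 for the
    character. The small pair goes in the high half.
    """
    a = small_fam * 256 + small_char
    return a * 0x1000 + large_fam * 256 + large_char


def unpack_delimiter(a):
    """Recover (small_fam, small_char, large_fam, large_char)."""
    small, large = divmod(a, 0x1000)
    return (small // 256, small % 256, large // 256, large % 256)
```

For a small variant in family 0 with character code hex 28 and a large variant in family 3 with character code 0, the packed value is hex 28300.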
 13624  
 13625  @<Declare procedures needed for displaying the elements of mlists@>=
 13626  procedure print_fam_and_char(@!p:pointer); {prints family and character}
 13627  begin print_esc("fam"); print_int(fam(p)); print_char(" ");
 13628  print_ASCII(qo(character(p)));
 13629  end;
 13630  @#
 13631  procedure print_delimiter(@!p:pointer); {prints a delimiter as 24-bit hex value}
 13632  var a:integer; {accumulator}
 13633  begin a:=small_fam(p)*256+qo(small_char(p));
 13634  a:=a*@"1000+large_fam(p)*256+qo(large_char(p));
 13635  if a<0 then print_int(a) {this should never happen}
 13636  else print_hex(a);
 13637  end;
 13638  
 13639  @ The next subroutine will descend to another level of recursion when a
 13640  subsidiary mlist needs to be displayed. The parameter |c| indicates what
 13641  character is to become part of the recursion history. An empty mlist is
 13642  distinguished from a field with |math_type(p)=empty|, because these are
 13643  not equivalent (as explained above).
 13644  @^recursion@>
 13645  
 13646  @<Declare procedures needed for displaying...@>=
 13647  procedure@?show_info; forward;@t\2@>@?{|show_node_list(info(temp_ptr))|}
 13648  procedure print_subsidiary_data(@!p:pointer;@!c:ASCII_code);
 13649    {display a noad field}
 13650  begin if cur_length>=depth_threshold then
 13651    begin if math_type(p)<>empty then print(" []");
 13652    end
 13653  else  begin append_char(c); {include |c| in the recursion history}
 13654    temp_ptr:=p; {prepare for |show_info| if recursion is needed}
 13655    case math_type(p) of
 13656    math_char: begin print_ln; print_current_string; print_fam_and_char(p);
 13657      end;
 13658    sub_box: show_info; {recursive call}
 13659    sub_mlist: if info(p)=null then
 13660        begin print_ln; print_current_string; print("{}");
 13661        end
 13662      else show_info; {recursive call}
 13663    othercases do_nothing {|empty|}
 13664    endcases;@/
 13665    flush_char; {remove |c| from the recursion history}
 13666    end;
 13667  end;
 13668  
 13669  @ The inelegant introduction of |show_info| in the code above seems better
 13670  than the alternative of using \PASCAL's strange |forward| declaration for a
 13671  procedure with parameters. The \PASCAL\ convention about dropping parameters
 13672  from a post-|forward| procedure is, frankly, so intolerable to the author
 13673  of \TeX\ that he would rather stoop to communication via a global temporary
 13674  variable. (A similar stoopidity occurred with respect to |hlist_out| and
 13675  |vlist_out| above, and it will occur with respect to |mlist_to_hlist| below.)
 13676  @^Knuth, Donald Ervin@>
 13677  @:PASCAL}{\PASCAL@>
 13678  
 13679  @p procedure show_info; {the reader will kindly forgive this}
 13680  begin show_node_list(info(temp_ptr));
 13681  end;
 13682  
 13683  @ @<Declare procedures needed for displaying...@>=
 13684  procedure print_style(@!c:integer);
 13685  begin case c div 2 of
 13686  0: print_esc("displaystyle"); {|display_style=0|}
 13687  1: print_esc("textstyle"); {|text_style=2|}
 13688  2: print_esc("scriptstyle"); {|script_style=4|}
 13689  3: print_esc("scriptscriptstyle"); {|script_script_style=6|}
 13690  othercases print("Unknown style!")
 13691  endcases;
 13692  end;
 13693  
 13694  @ @<Display choice node |p|@>=
 13695  begin print_esc("mathchoice");
 13696  append_char("D"); show_node_list(display_mlist(p)); flush_char;
 13697  append_char("T"); show_node_list(text_mlist(p)); flush_char;
 13698  append_char("S"); show_node_list(script_mlist(p)); flush_char;
 13699  append_char("s"); show_node_list(script_script_mlist(p)); flush_char;
 13700  end
 13701  
 13702  @ @<Display normal noad |p|@>=
 13703  begin case type(p) of
 13704  ord_noad: print_esc("mathord");
 13705  op_noad: print_esc("mathop");
 13706  bin_noad: print_esc("mathbin");
 13707  rel_noad: print_esc("mathrel");
 13708  open_noad: print_esc("mathopen");
 13709  close_noad: print_esc("mathclose");
 13710  punct_noad: print_esc("mathpunct");
 13711  inner_noad: print_esc("mathinner");
 13712  over_noad: print_esc("overline");
 13713  under_noad: print_esc("underline");
 13714  vcenter_noad: print_esc("vcenter");
 13715  radical_noad: begin print_esc("radical"); print_delimiter(left_delimiter(p));
 13716    end;
 13717  accent_noad: begin print_esc("accent"); print_fam_and_char(accent_chr(p));
 13718    end;
 13719  left_noad: begin print_esc("left"); print_delimiter(delimiter(p));
 13720    end;
 13721  right_noad: begin print_esc("right"); print_delimiter(delimiter(p));
 13722    end;
 13723  end;
 13724  if subtype(p)<>normal then
 13725    if subtype(p)=limits then print_esc("limits")
 13726    else print_esc("nolimits");
 13727  if type(p)<left_noad then print_subsidiary_data(nucleus(p),".");
 13728  print_subsidiary_data(supscr(p),"^");
 13729  print_subsidiary_data(subscr(p),"_");
 13730  end
 13731  
 13732  @ @<Display fraction noad |p|@>=
 13733  begin print_esc("fraction, thickness ");
 13734  if thickness(p)=default_code then print("= default")
 13735  else print_scaled(thickness(p));
 13736  if (small_fam(left_delimiter(p))<>0)or@+
 13737    (small_char(left_delimiter(p))<>min_quarterword)or@|
 13738    (large_fam(left_delimiter(p))<>0)or@|
 13739    (large_char(left_delimiter(p))<>min_quarterword) then
 13740    begin print(", left-delimiter "); print_delimiter(left_delimiter(p));
 13741    end;
 13742  if (small_fam(right_delimiter(p))<>0)or@|
 13743    (small_char(right_delimiter(p))<>min_quarterword)or@|
 13744    (large_fam(right_delimiter(p))<>0)or@|
 13745    (large_char(right_delimiter(p))<>min_quarterword) then
 13746    begin print(", right-delimiter "); print_delimiter(right_delimiter(p));
 13747    end;
 13748  print_subsidiary_data(numerator(p),"\");
 13749  print_subsidiary_data(denominator(p),"/");
 13750  end
 13751  
 13752  @ That which can be displayed can also be destroyed.
 13753  
 13754  @<Cases of |flush_node_list| that arise...@>=
 13755  style_node: begin free_node(p,style_node_size); goto done;
 13756    end;
 13757  choice_node:begin flush_node_list(display_mlist(p));
 13758    flush_node_list(text_mlist(p));
 13759    flush_node_list(script_mlist(p));
 13760    flush_node_list(script_script_mlist(p));
 13761    free_node(p,style_node_size); goto done;
 13762    end;
 13763  ord_noad,op_noad,bin_noad,rel_noad,open_noad,close_noad,punct_noad,inner_noad,
 13764    radical_noad,over_noad,under_noad,vcenter_noad,accent_noad:@t@>@;@/
 13765    begin if math_type(nucleus(p))>=sub_box then
 13766      flush_node_list(info(nucleus(p)));
 13767    if math_type(supscr(p))>=sub_box then
 13768      flush_node_list(info(supscr(p)));
 13769    if math_type(subscr(p))>=sub_box then
 13770      flush_node_list(info(subscr(p)));
 13771    if type(p)=radical_noad then free_node(p,radical_noad_size)
 13772    else if type(p)=accent_noad then free_node(p,accent_noad_size)
 13773    else free_node(p,noad_size);
 13774    goto done;
 13775    end;
 13776  left_noad,right_noad: begin free_node(p,noad_size); goto done;
 13777    end;
 13778  fraction_noad: begin flush_node_list(info(numerator(p)));
 13779    flush_node_list(info(denominator(p)));
 13780    free_node(p,fraction_noad_size); goto done;
 13781    end;
 13782  
 13783  @* \[35] Subroutines for math mode.
 13784  In order to convert mlists to hlists, i.e., noads to nodes, we need several
 13785  subroutines that are conveniently dealt with now.
 13786  
 13787  Let us first introduce the macros that make it easy to get at the parameters and
 13788  other font information. A size code, which is a multiple of 16, is added to a
 13789  family number to get an index into the table of internal font numbers
 13790  for each combination of family and size.  (Be alert: Size codes get
 13791  larger as the type gets smaller.)
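As a rough illustration of the indexing scheme, the following Python sketch adds a size code to a family number. The function name and the range checks are assumptions made for the example; the program itself performs this addition inline.

```python
TEXT_SIZE = 0            # largest size in a family
SCRIPT_SIZE = 16         # medium size
SCRIPT_SCRIPT_SIZE = 32  # smallest size


def fam_fnt_index(fam, size):
    """Index of the font for family fam (0..15) at the given size code."""
    if not 0 <= fam < 16:
        raise ValueError("family number must be in 0..15")
    if size not in (TEXT_SIZE, SCRIPT_SIZE, SCRIPT_SCRIPT_SIZE):
        raise ValueError("size code must be 0, 16, or 32")
    return fam + size
```

With sixteen families and three sizes the index runs from 0 to 47, and the three fonts of one family sit 16 entries apart.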
 13792  
 13793  @d text_size=0 {size code for the largest size in a family}
 13794  @d script_size=16 {size code for the medium size in a family}
 13795  @d script_script_size=32 {size code for the smallest size in a family}
 13796  
 13797  @<Basic printing procedures@>=
 13798  procedure print_size(@!s:integer);
 13799  begin if s=text_size then print_esc("textfont")
 13800  else if s=script_size then print_esc("scriptfont")
 13801  else print_esc("scriptscriptfont");
 13802  end;
 13803  
 13804  @ Before an mlist is converted to an hlist, \TeX\ makes sure that
 13805  the fonts in family~2 have enough parameters to be math-symbol
 13806  fonts, and that the fonts in family~3 have enough parameters to be
 13807  math-extension fonts. The math-symbol parameters are referred to by using the
 13808  following macros, which take a size code as their parameter; for example,
 13809  |num1(cur_size)| gives the value of the |num1| parameter for the current size.
 13810  @^parameters for symbols@>
 13811  @^font parameters@>
 13812  
 13813  @d mathsy_end(#)==fam_fnt(2+#)]].sc
 13814  @d mathsy(#)==font_info[#+param_base[mathsy_end
 13815  @d math_x_height==mathsy(5) {height of `\.x'}
 13816  @d math_quad==mathsy(6) {\.{18mu}}
 13817  @d num1==mathsy(8) {numerator shift-up in display styles}
 13818  @d num2==mathsy(9) {numerator shift-up in non-display, non-\.{\\atop}}
 13819  @d num3==mathsy(10) {numerator shift-up in non-display \.{\\atop}}
 13820  @d denom1==mathsy(11) {denominator shift-down in display styles}
 13821  @d denom2==mathsy(12) {denominator shift-down in non-display styles}
 13822  @d sup1==mathsy(13) {superscript shift-up in uncramped display style}
 13823  @d sup2==mathsy(14) {superscript shift-up in uncramped non-display}
 13824  @d sup3==mathsy(15) {superscript shift-up in cramped styles}
 13825  @d sub1==mathsy(16) {subscript shift-down if superscript is absent}
 13826  @d sub2==mathsy(17) {subscript shift-down if superscript is present}
 13827  @d sup_drop==mathsy(18) {superscript baseline below top of large box}
 13828  @d sub_drop==mathsy(19) {subscript baseline below bottom of large box}
 13829  @d delim1==mathsy(20) {size of \.{\\atopwithdelims} delimiters
 13830    in display styles}
 13831  @d delim2==mathsy(21) {size of \.{\\atopwithdelims} delimiters in non-displays}
 13832  @d axis_height==mathsy(22) {height of fraction lines above the baseline}
 13833  @d total_mathsy_params=22
 13834  
 13835  @ The math-extension parameters have similar macros, but the size code is
 13836  omitted (since it is always |cur_size| when we refer to such parameters).
 13837  @^parameters for symbols@>
 13838  @^font parameters@>
 13839  
 13840  @d mathex(#)==font_info[#+param_base[fam_fnt(3+cur_size)]].sc
 13841  @d default_rule_thickness==mathex(8) {thickness of \.{\\over} bars}
 13842  @d big_op_spacing1==mathex(9) {minimum clearance above a displayed op}
 13843  @d big_op_spacing2==mathex(10) {minimum clearance below a displayed op}
 13844  @d big_op_spacing3==mathex(11) {minimum baselineskip above displayed op}
 13845  @d big_op_spacing4==mathex(12) {minimum baselineskip below displayed op}
 13846  @d big_op_spacing5==mathex(13) {padding above and below displayed limits}
 13847  @d total_mathex_params=13
 13848  
 13849  @ We also need to compute the change in style between mlists and their
 13850  subsidiaries. The following macros define the subsidiary style for
 13851  an overlined nucleus (|cramped_style|), for a subscript or a superscript
 13852  (|sub_style| or |sup_style|), or for a numerator or denominator (|num_style|
 13853  or |denom_style|).
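The style arithmetic is easy to check by hand. The Python sketch below transcribes the five macros of this part, with // standing in for div on nonnegative integers. It is an illustration, not part of the program.

```python
DISPLAY_STYLE, TEXT_STYLE, SCRIPT_STYLE, SCRIPT_SCRIPT_STYLE = 0, 2, 4, 6
CRAMPED = 1  # added to an uncramped style to cramp it


def cramped_style(s):
    return 2 * (s // 2) + CRAMPED


def sub_style(s):
    return 2 * (s // 4) + SCRIPT_STYLE + CRAMPED


def sup_style(s):
    return 2 * (s // 4) + SCRIPT_STYLE + (s % 2)


def num_style(s):
    return s + 2 - 2 * (s // 6)


def denom_style(s):
    return 2 * (s // 2) + CRAMPED + 2 - 2 * (s // 6)
```

A superscript in display or text style comes out in script style, a superscript in script style comes out in scriptscript style, and nothing gets smaller than scriptscript style.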
 13854  
 13855  @d cramped_style(#)==2*(# div 2)+cramped {cramp the style}
 13856  @d sub_style(#)==2*(# div 4)+script_style+cramped {smaller and cramped}
 13857  @d sup_style(#)==2*(# div 4)+script_style+(# mod 2) {smaller}
 13858  @d num_style(#)==#+2-2*(# div 6) {smaller unless already script-script}
 13859  @d denom_style(#)==2*(# div 2)+cramped+2-2*(# div 6) {smaller, cramped}
 13860  
 13861  @ When the style changes, the following piece of program computes associated
 13862  information:
 13863  
 13864  @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>=
 13865  begin if cur_style<script_style then cur_size:=text_size
 13866  else cur_size:=16*((cur_style-text_style) div 2);
 13867  cur_mu:=x_over_n(math_quad(cur_size),18);
 13868  end
 13869  
 13870  @ Here is a function that returns a pointer to a rule node having a given
 13871  thickness |t|. The rule will extend horizontally to the boundary of the vlist
 13872  that eventually contains it.
 13873  
 13874  @p function fraction_rule(@!t:scaled):pointer;
 13875    {construct the bar for a fraction}
 13876  var p:pointer; {the new node}
 13877  begin p:=new_rule; height(p):=t; depth(p):=0; fraction_rule:=p;
 13878  end;
 13879  
 13880  @ The |overbar| function returns a pointer to a vlist box that consists of
 13881  a given box |b|, above which has been placed a kern of height |k| under a
 13882  fraction rule of thickness |t| under additional space of height |t|.
 13883  
 13884  @p function overbar(@!b:pointer;@!k,@!t:scaled):pointer;
 13885  var p,@!q:pointer; {nodes being constructed}
 13886  begin p:=new_kern(k); link(p):=b; q:=fraction_rule(t); link(q):=p;
 13887  p:=new_kern(t); link(p):=q; overbar:=vpack(p,natural);
 13888  end;
 13889  
 13890  @ The |var_delimiter| function, which finds or constructs a sufficiently
 13891  large delimiter, is the most interesting of the auxiliary functions that
 13892  currently concern us. Given a pointer |d| to a delimiter field in some noad,
 13893  together with a size code |s| and a vertical distance |v|, this function
 13894  returns a pointer to a box that contains the smallest variant of |d| whose
 13895  height plus depth is |v| or more. (And if no variant is large enough, it
 13896  returns the largest available variant.) In particular, this routine will
 13897  construct arbitrarily large delimiters from extensible components, if
 13898  |d| leads to such characters.
 13899  
 13900  The value returned is a box whose |shift_amount| has been set so that
 13901  the box is vertically centered with respect to the axis in the given size.
 13902  If a built-up symbol is returned, the height of the box before shifting
 13903  will be the height of its topmost component.
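The selection rule can be sketched in isolation. The Python fragment below is a simplification under stated assumptions: the candidates are given as a flat list of hypothetical (name, height-plus-depth, is-extensible) triples in the order they would be probed, so it leaves out fonts, families, and missing characters entirely.

```python
def pick_variant(candidates, v):
    """Choose a delimiter variant for a requested height-plus-depth v.

    candidates: iterable of (name, total, extensible) in probing order.
    Returns the first candidate that is extensible or at least as large
    as v; otherwise the largest candidate seen; None if there are none.
    """
    best, w = None, 0  # best so far and its height-plus-depth
    for name, total, extensible in candidates:
        if extensible:
            return name  # an extensible recipe can reach any size
        if total > w:
            best, w = name, total
            if total >= v:
                return name  # large enough: stop searching
    return best
```

Since a candidate replaces the best so far only when it is strictly larger, the fallback stays correct even if the list is not sorted by size.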
 13904  
 13905  @p@t\4@>@<Declare subprocedures for |var_delimiter|@>
 13906  function var_delimiter(@!d:pointer;@!s:small_number;@!v:scaled):pointer;
 13907  label found,continue;
 13908  var b:pointer; {the box that will be constructed}
 13909  @!f,@!g: internal_font_number; {best-so-far and tentative font codes}
 13910  @!c,@!x,@!y: quarterword; {best-so-far and tentative character codes}
 13911  @!m,@!n: integer; {the number of extensible pieces}
 13912  @!u: scaled; {height-plus-depth of a tentative character}
 13913  @!w: scaled; {largest height-plus-depth so far}
 13914  @!q: four_quarters; {character info}
 13915  @!hd: eight_bits; {height-depth byte}
 13916  @!r: four_quarters; {extensible pieces}
 13917  @!z: small_number; {runs through font family members}
 13918  @!large_attempt: boolean; {are we trying the ``large'' variant?}
 13919  begin f:=null_font; w:=0; large_attempt:=false;
 13920  z:=small_fam(d); x:=small_char(d);
 13921  loop@+  begin @<Look at the variants of |(z,x)|; set |f| and |c| whenever
 13922      a better character is found; |goto found| as soon as a
 13923      large enough variant is encountered@>;
 13924    if large_attempt then goto found; {there were none large enough}
 13925    large_attempt:=true; z:=large_fam(d); x:=large_char(d);
 13926    end;
 13927  found: if f<>null_font then
 13928    @<Make variable |b| point to a box for |(f,c)|@>
 13929  else  begin b:=new_null_box;
 13930    width(b):=null_delimiter_space; {use this width if no delimiter was found}
 13931    end;
 13932  shift_amount(b):=half(height(b)-depth(b)) - axis_height(s);
 13933  var_delimiter:=b;
 13934  end;
 13935  
 13936  @ The search process is complicated slightly by the facts that some of the
 13937  characters might not be present in some of the fonts, and they might not
 13938  be probed in increasing order of height.
 13939  
 13940  @<Look at the variants of |(z,x)|; set |f| and |c|...@>=
 13941  if (z<>0)or(x<>min_quarterword) then
 13942    begin z:=z+s+16;
 13943    repeat z:=z-16; g:=fam_fnt(z);
 13944    if g<>null_font then
 13945      @<Look at the list of characters starting with |x| in
 13946        font |g|; set |f| and |c| whenever
 13947        a better character is found; |goto found| as soon as a
 13948        large enough variant is encountered@>;
 13949    until z<16;
 13950    end
 13951  
 13952  @ @<Look at the list of characters starting with |x|...@>=
 13953  begin y:=x;
 13954  if (qo(y)>=font_bc[g])and(qo(y)<=font_ec[g]) then
 13955    begin continue: q:=char_info(g)(y);
 13956    if char_exists(q) then
 13957      begin if char_tag(q)=ext_tag then
 13958        begin f:=g; c:=y; goto found;
 13959        end;
 13960      hd:=height_depth(q);
 13961      u:=char_height(g)(hd)+char_depth(g)(hd);
 13962      if u>w then
 13963        begin f:=g; c:=y; w:=u;
 13964        if u>=v then goto found;
 13965        end;
 13966      if char_tag(q)=list_tag then
 13967        begin y:=rem_byte(q); goto continue;
 13968        end;
 13969      end;
 13970    end;
 13971  end
 13972  
 13973  @ Here is a subroutine that creates a new box, whose list contains a
 13974  single character, and whose width includes the italic correction for
 13975  that character. The height or depth of the box will be negative, if
 13976  the height or depth of the character is negative; thus, this routine
 13977  may deliver a slightly different result than |hpack| would produce.
 13978  
 13979  @<Declare subprocedures for |var_delimiter|@>=
 13980  function char_box(@!f:internal_font_number;@!c:quarterword):pointer;
 13981  var q:four_quarters;
 13982  @!hd:eight_bits; {|height_depth| byte}
 13983  @!b,@!p:pointer; {the new box and its character node}
 13984  begin q:=char_info(f)(c); hd:=height_depth(q);
 13985  b:=new_null_box; width(b):=char_width(f)(q)+char_italic(f)(q);
 13986  height(b):=char_height(f)(hd); depth(b):=char_depth(f)(hd);
 13987  p:=get_avail; character(p):=c; font(p):=f; list_ptr(b):=p; char_box:=b;
 13988  end;
 13989  
 13990  @ When the following code is executed, |char_tag(q)| will be equal to
 13991  |ext_tag| if and only if a built-up symbol is supposed to be returned.
 13992  
 13993  @<Make variable |b| point to a box for |(f,c)|@>=
 13994  if char_tag(q)=ext_tag then
 13995    @<Construct an extensible character in a new box |b|,
 13996      using recipe |rem_byte(q)| and font |f|@>
 13997  else b:=char_box(f,c)
 13998  
 13999  @ When we build an extensible character, it's handy to have the
 14000  following subroutine, which puts a given character on top
 14001  of the characters already in box |b|:
 14002  
 14003  @<Declare subprocedures for |var_delimiter|@>=
 14004  procedure stack_into_box(@!b:pointer;@!f:internal_font_number;
 14005    @!c:quarterword);
 14006  var p:pointer; {new node placed into |b|}
 14007  begin p:=char_box(f,c); link(p):=list_ptr(b); list_ptr(b):=p;
 14008  height(b):=height(p);
 14009  end;
 14010  
 14011  @ Another handy subroutine computes the height plus depth of
 14012  a given character:
 14013  
 14014  @<Declare subprocedures for |var_delimiter|@>=
 14015  function height_plus_depth(@!f:internal_font_number;@!c:quarterword):scaled;
 14016  var q:four_quarters;
 14017  @!hd:eight_bits; {|height_depth| byte}
 14018  begin q:=char_info(f)(c); hd:=height_depth(q);
 14019  height_plus_depth:=char_height(f)(hd)+char_depth(f)(hd);
 14020  end;
 14021  
 14022  @ @<Construct an extensible...@>=
 14023  begin b:=new_null_box;
 14024  type(b):=vlist_node;
 14025  r:=font_info[exten_base[f]+rem_byte(q)].qqqq;@/
 14026  @<Compute the minimum suitable height, |w|, and the corresponding
 14027    number of extension steps, |n|; also set |width(b)|@>;
 14028  c:=ext_bot(r);
 14029  if c<>min_quarterword then stack_into_box(b,f,c);
 14030  c:=ext_rep(r);
 14031  for m:=1 to n do stack_into_box(b,f,c);
 14032  c:=ext_mid(r);
 14033  if c<>min_quarterword then
 14034    begin stack_into_box(b,f,c); c:=ext_rep(r);
 14035    for m:=1 to n do stack_into_box(b,f,c);
 14036    end;
 14037  c:=ext_top(r);
 14038  if c<>min_quarterword then stack_into_box(b,f,c);
 14039  depth(b):=w-height(b);
 14040  end
 14041  
 14042  @ The width of an extensible character is the width of the repeatable
 14043  module. If this module does not have positive height plus depth,
 14044  we don't use any copies of it, otherwise we use as few as possible
 14045  (in groups of two if there is a middle part).
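
Put another way, let $w_0$ be the total height plus depth of whichever of
the bottom, middle, and top pieces are present, let $u$ be that of the
repeatable piece, and let $k=2$ if there is a middle piece, $k=1$ otherwise.
When $u>0$, the loop below chooses the least $n\ge0$ such that
$$w=w_0+knu\ge v.$$
As an illustration with made-up dimensions (not taken from any real font),
suppose $w_0=30$\thinspace pt, $u=4$\thinspace pt, $v=45$\thinspace pt, and
a middle piece exists. Then $w$ runs through 30, 38, 46, so $n=2$ and the
repeatable piece is stacked four times in all, two copies below the middle
and two above it.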
 14046  
 14047  @<Compute the minimum suitable height, |w|, and...@>=
 14048  c:=ext_rep(r); u:=height_plus_depth(f,c);
 14049  w:=0; q:=char_info(f)(c); width(b):=char_width(f)(q)+char_italic(f)(q);@/
 14050  c:=ext_bot(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c);
 14051  c:=ext_mid(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c);
 14052  c:=ext_top(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c);
 14053  n:=0;
 14054  if u>0 then while w<v do
 14055    begin w:=w+u; incr(n);
 14056    if ext_mid(r)<>min_quarterword then w:=w+u;
 14057    end
 14058  
 14059  @ The next subroutine is much simpler; it is used for numerators and
 14060  denominators of fractions as well as for displayed operators and
 14061  their limits above and below. It takes a given box~|b| and
 14062  changes it so that the new box is centered in a box of width~|w|.
 14063  The centering is done by putting \.{\\hss} glue at the left and right
 14064  of the list inside |b|, then packaging the new box; thus, the
 14065  actual box might not really be centered, if it already contains
 14066  infinite glue.
 14067  
 14068  The given box might contain a single character whose italic correction
 14069  has been added to the width of the box; in this case a compensating
 14070  kern is inserted.
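
For instance, with illustrative dimensions, suppose |b| holds a single
character of width 5\thinspace pt whose italic correction brought
|width(b)| to 5.5\thinspace pt, and suppose |w| is 9\thinspace pt.
A kern of 0.5\thinspace pt is appended after the character, \.{\\hss} glue
is placed on both sides, and |hpack| sets the list to exactly
9\thinspace pt, so each glue item stretches by 1.75\thinspace pt.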
 14071  
 14072  @p function rebox(@!b:pointer;@!w:scaled):pointer;
 14073  var p:pointer; {temporary register for list manipulation}
 14074  @!f:internal_font_number; {font in a one-character box}
 14075  @!v:scaled; {width of a character without italic correction}
 14076  begin if (width(b)<>w)and(list_ptr(b)<>null) then
 14077    begin if type(b)=vlist_node then b:=hpack(b,natural);
 14078    p:=list_ptr(b);
 14079    if (is_char_node(p))and(link(p)=null) then
 14080      begin f:=font(p); v:=char_width(f)(char_info(f)(character(p)));
 14081      if v<>width(b) then link(p):=new_kern(width(b)-v);
 14082      end;
 14083    free_node(b,box_node_size);
 14084    b:=new_glue(ss_glue); link(b):=p;
 14085    while link(p)<>null do p:=link(p);
 14086    link(p):=new_glue(ss_glue);
 14087    rebox:=hpack(b,w,exactly);
 14088    end
 14089  else  begin width(b):=w; rebox:=b;
 14090    end;
 14091  end;
 14092  
 14093  @ Here is a subroutine that creates a new glue specification from another
 14094  one that is expressed in `\.{mu}', given the value of the math unit.
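
The macro |mu_mult| defined below multiplies a dimension $x$ by $m/2^{16}$
using integer arithmetic only: |m| is split as $m=2^{16}n+f$ with
$0\le f<2^{16}$, and the result is $nx$ plus the value of $xf/2^{16}$
delivered by |xn_over_d|. The adjustment made when |f<0| keeps the fraction
part nonnegative, as |xn_over_d| expects. For instance, taking an
illustrative math unit of 0.5\thinspace pt, i.e., $m=32768$, we get $n=0$
and $f=32768$; a glue width of \.{3mu} is stored as $x=196608$, so the
converted width is
$$0\cdot196608+196608\cdot32768/65536=98304$$
scaled points, which is 1.5\thinspace pt.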
 14095  
 14096  @d mu_mult(#)==nx_plus_y(n,#,xn_over_d(#,f,@'200000))
 14097  
 14098  @p function math_glue(@!g:pointer;@!m:scaled):pointer;
 14099  var p:pointer; {the new glue specification}
 14100  @!n:integer; {integer part of |m|}
 14101  @!f:scaled; {fraction part of |m|}
 14102  begin n:=x_over_n(m,@'200000); f:=remainder;@/
 14103  if f<0 then
 14104    begin decr(n); f:=f+@'200000;
 14105    end;
 14106  p:=get_node(glue_spec_size);
 14107  width(p):=mu_mult(width(g)); {convert \.{mu} to \.{pt}}
 14108  stretch_order(p):=stretch_order(g);
 14109  if stretch_order(p)=normal then stretch(p):=mu_mult(stretch(g))
 14110  else stretch(p):=stretch(g);
 14111  shrink_order(p):=shrink_order(g);
 14112  if shrink_order(p)=normal then shrink(p):=mu_mult(shrink(g))
 14113  else shrink(p):=shrink(g);
 14114  math_glue:=p;
 14115  end;
 14116  
 14117  @ The |math_kern| subroutine removes |mu_glue| from a kern node, given
 14118  the value of the math unit.
 14119  
 14120  @p procedure math_kern(@!p:pointer;@!m:scaled);
 14121  var @!n:integer; {integer part of |m|}
 14122  @!f:scaled; {fraction part of |m|}
 14123  begin if subtype(p)=mu_glue then
 14124    begin n:=x_over_n(m,@'200000); f:=remainder;@/
 14125    if f<0 then
 14126      begin decr(n); f:=f+@'200000;
 14127      end;
 14128    width(p):=mu_mult(width(p)); subtype(p):=explicit;
 14129    end;
 14130  end;
 14131  
 14132  @ Sometimes it is necessary to destroy an mlist. The following
 14133  subroutine empties the current list, assuming that |abs(mode)=mmode|.
 14134  
 14135  @p procedure flush_math;
 14136  begin flush_node_list(link(head)); flush_node_list(incompleat_noad);
 14137  link(head):=null; tail:=head; incompleat_noad:=null;
 14138  end;
 14139  
 14140  @* \[36] Typesetting math formulas.
 14141  \TeX's most important routine for dealing with formulas is called
 14142  |mlist_to_hlist|.  After a formula has been scanned and represented as an
 14143  mlist, this routine converts it to an hlist that can be placed into a box
 14144  or incorporated into the text of a paragraph. There are three implicit
 14145  parameters, passed in global variables: |cur_mlist| points to the first
 14146  node or noad in the given mlist (and it might be |null|); |cur_style| is a
 14147  style code; and |mlist_penalties| is |true| if penalty nodes for potential
 14148  line breaks are to be inserted into the resulting hlist. After
 14149  |mlist_to_hlist| has acted, |link(temp_head)| points to the translated hlist.
 14150  
 14151  Since mlists can be inside mlists, the procedure is recursive. And since this
 14152  is not part of \TeX's inner loop, the program has been written in a manner
 14153  that stresses compactness over efficiency.
 14154  @^recursion@>
 14155  
 14156  @<Glob...@>=
 14157  @!cur_mlist:pointer; {beginning of mlist to be translated}
 14158  @!cur_style:small_number; {style code at current place in the list}
 14159  @!cur_size:small_number; {size code corresponding to |cur_style|}
 14160  @!cur_mu:scaled; {the math unit width corresponding to |cur_size|}
 14161  @!mlist_penalties:boolean; {should |mlist_to_hlist| insert penalties?}
 14162  
 14163  @ The recursion in |mlist_to_hlist| is due primarily to a subroutine
 14164  called |clean_box| that puts a given noad field into a box using a given
 14165  math style; |mlist_to_hlist| can call |clean_box|, which can call
 14166  |mlist_to_hlist|.
 14167  @^recursion@>
 14168  
 14169  The box returned by |clean_box| is ``clean'' in the
 14170  sense that its |shift_amount| is zero.
 14171  
 14172  @p procedure@?mlist_to_hlist; forward;@t\2@>@/
 14173  function clean_box(@!p:pointer;@!s:small_number):pointer;
 14174  label found;
 14175  var q:pointer; {beginning of a list to be boxed}
 14176  @!save_style:small_number; {|cur_style| to be restored}
 14177  @!x:pointer; {box to be returned}
 14178  @!r:pointer; {temporary pointer}
 14179  begin case math_type(p) of
 14180  math_char: begin cur_mlist:=new_noad; mem[nucleus(cur_mlist)]:=mem[p];
 14181    end;
 14182  sub_box: begin q:=info(p); goto found;
 14183    end;
 14184  sub_mlist: cur_mlist:=info(p);
 14185  othercases begin q:=new_null_box; goto found;
 14186    end
 14187  endcases;@/
 14188  save_style:=cur_style; cur_style:=s; mlist_penalties:=false;@/
 14189  mlist_to_hlist; q:=link(temp_head); {recursive call}
 14190  cur_style:=save_style; {restore the style}
 14191  @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>;
 14192  found: if is_char_node(q)or(q=null) then x:=hpack(q,natural)
 14193    else if (link(q)=null)and(type(q)<=vlist_node)and(shift_amount(q)=0) then
 14194      x:=q {it's already clean}
 14195    else x:=hpack(q,natural);
 14196  @<Simplify a trivial box@>;
 14197  clean_box:=x;
 14198  end;
 14199  
 14200  @ Here we save memory space in a common case.
 14201  
 14202  @<Simplify a trivial box@>=
 14203  q:=list_ptr(x);
 14204  if is_char_node(q) then
 14205    begin r:=link(q);
 14206    if r<>null then if link(r)=null then if not is_char_node(r) then
 14207     if type(r)=kern_node then {unneeded italic correction}
 14208      begin free_node(r,small_node_size); link(q):=null;
 14209      end;
 14210    end
 14211  
 14212  @ It is convenient to have a procedure that converts a |math_char|
 14213  field to an ``unpacked'' form. The |fetch| routine sets |cur_f|, |cur_c|,
 14214  and |cur_i| to the font code, character code, and character information bytes of
 14215  a given noad field. It also takes care of issuing error messages for
 14216  nonexistent characters; in such cases, |char_exists(cur_i)| will be |false|
 14217  after |fetch| has acted, and the field will also have been reset to |empty|.
 14218  
 14219  @p procedure fetch(@!a:pointer); {unpack the |math_char| field |a|}
 14220  begin cur_c:=character(a); cur_f:=fam_fnt(fam(a)+cur_size);
 14221  if cur_f=null_font then
 14222    @<Complain about an undefined family and set |cur_i| null@>
 14223  else  begin if (qo(cur_c)>=font_bc[cur_f])and(qo(cur_c)<=font_ec[cur_f]) then
 14224      cur_i:=char_info(cur_f)(cur_c)
 14225    else cur_i:=null_character;
 14226    if not(char_exists(cur_i)) then
 14227      begin char_warning(cur_f,qo(cur_c));
 14228      math_type(a):=empty; cur_i:=null_character;
 14229      end;
 14230    end;
 14231  end;
 14232  
 14233  @ @<Complain about an undefined family...@>=
 14234  begin print_err(""); print_size(cur_size); print_char(" ");
 14235  print_int(fam(a)); print(" is undefined (character ");
 14236  print_ASCII(qo(cur_c)); print_char(")");
 14237  help4("Somewhere in the math formula just ended, you used the")@/
 14238  ("stated character from an undefined font family. For example,")@/
 14239  ("plain TeX doesn't allow \it or \sl in subscripts. Proceed,")@/
 14240  ("and I'll try to forget that I needed that character.");
 14241  error; cur_i:=null_character; math_type(a):=empty;
 14242  end
 14243  
 14244  @ The outputs of |fetch| are placed in global variables.
 14245  
 14246  @<Glob...@>=
 14247  @!cur_f:internal_font_number; {the |font| field of a |math_char|}
 14248  @!cur_c:quarterword; {the |character| field of a |math_char|}
 14249  @!cur_i:four_quarters; {the |char_info| of a |math_char|,
 14250    or a lig/kern instruction}
 14251  
 14252  @ We need to do a lot of different things, so |mlist_to_hlist| makes two
 14253  passes over the given mlist.
 14254  
 14255  The first pass does most of the processing: It removes ``mu'' spacing from
 14256  glue, it recursively evaluates all subsidiary mlists so that only the
 14257  top-level mlist remains to be handled, it puts fractions and square roots
 14258  and such things into boxes, it attaches subscripts and superscripts, and
 14259  it computes the overall height and depth of the top-level mlist so that
 14260  the size of delimiters for a |left_noad| and a |right_noad| will be known.
 14261  The hlist resulting from each noad is recorded in that noad's |new_hlist|
 14262  field, an integer field that replaces the |nucleus| or |thickness|.
 14263  @^recursion@>
 14264  
 14265  The second pass eliminates all noads and inserts the correct glue and
 14266  penalties between nodes.
 14267  
 14268  @d new_hlist(#)==mem[nucleus(#)].int {the translation of an mlist}
 14269  
 14270  @ Here is the overall plan of |mlist_to_hlist|, and the list of its
 14271  local variables.
 14272  
 14273  @d done_with_noad=80 {go here when a noad has been fully translated}
 14274  @d done_with_node=81 {go here when a node has been fully converted}
 14275  @d check_dimensions=82 {go here to update |max_h| and |max_d|}
 14276  @d delete_q=83 {go here to delete |q| and move to the next node}
 14277  
 14278  @p@t\4@>@<Declare math construction procedures@>
 14279  procedure mlist_to_hlist;
 14280  label reswitch, check_dimensions, done_with_noad, done_with_node, delete_q,
 14281    done;
 14282  var mlist:pointer; {beginning of the given list}
 14283  @!penalties:boolean; {should penalty nodes be inserted?}
 14284  @!style:small_number; {the given style}
 14285  @!save_style:small_number; {holds |cur_style| during recursion}
 14286  @!q:pointer; {runs through the mlist}
 14287  @!r:pointer; {the most recent noad preceding |q|}
 14288  @!r_type:small_number; {the |type| of noad |r|, or |op_noad| if |r=null|}
 14289  @!t:small_number; {the effective |type| of noad |q| during the second pass}
 14290  @!p,@!x,@!y,@!z: pointer; {temporary registers for list construction}
 14291  @!pen:integer; {a penalty to be inserted}
 14292  @!s:small_number; {the size of a noad to be deleted}
 14293  @!max_h,@!max_d:scaled; {maximum height and depth of the list translated so far}
 14294  @!delta:scaled; {offset between subscript and superscript}
 14295  begin mlist:=cur_mlist; penalties:=mlist_penalties;
 14296  style:=cur_style; {tuck global parameters away as local variables}
 14297  q:=mlist; r:=null; r_type:=op_noad; max_h:=0; max_d:=0;
 14298  @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>;
 14299  while q<>null do @<Process node-or-noad |q| as much as possible in preparation
 14300      for the second pass of |mlist_to_hlist|, then move to the next
 14301      item in the mlist@>;
 14302  @<Convert \(a)a final |bin_noad| to an |ord_noad|@>;
 14303  @<Make a second pass over the mlist, removing all noads and inserting the
 14304    proper spacing and penalties@>;
 14305  end;
 14306  
 14307  @ We use the fact that no character nodes appear in an mlist, hence
 14308  the field |type(q)| is always present.
 14309  
 14310  @<Process node-or-noad...@>=
 14311  begin @<Do first-pass processing based on |type(q)|; |goto done_with_noad|
 14312    if a noad has been fully processed, |goto check_dimensions| if it
 14313    has been translated into |new_hlist(q)|, or |goto done_with_node|
 14314    if a node has been fully processed@>;
 14315  check_dimensions: z:=hpack(new_hlist(q),natural);
 14316  if height(z)>max_h then max_h:=height(z);
 14317  if depth(z)>max_d then max_d:=depth(z);
 14318  free_node(z,box_node_size);
 14319  done_with_noad: r:=q; r_type:=type(r);
 14320  done_with_node: q:=link(q);
 14321  end
 14322  
 14323  @ One of the things we must do on the first pass is change a |bin_noad| to
 14324  an |ord_noad| if the |bin_noad| is not in the context of a binary operator.
 14325  The values of |r| and |r_type| make this fairly easy.
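
For instance, in \.{\$a=-b\$} the minus sign follows a |rel_noad|, so it
is reclassified as an |ord_noad| and gets no binary-operator spacing; in
\.{\$-x\$} the same thing happens because |r_type| is initially |op_noad|.
Likewise the plus sign in \.{\$(a+)\$} becomes ordinary when the
|close_noad| for the right parenthesis is reached, since a |bin_noad|
must also be followed by something that it can operate on.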
 14326  
 14327  @<Do first-pass processing...@>=
 14328  reswitch: delta:=0;
 14329  case type(q) of
 14330  bin_noad: case r_type of
 14331    bin_noad,op_noad,rel_noad,open_noad,punct_noad,left_noad:
 14332      begin type(q):=ord_noad; goto reswitch;
 14333      end;
 14334    othercases do_nothing
 14335    endcases;
 14336  rel_noad,close_noad,punct_noad,right_noad: begin@t@>@;@/
 14337    @<Convert \(a)a final |bin_noad| to an |ord_noad|@>;
 14338    if type(q)=right_noad then goto done_with_noad;
 14339    end;
 14340  @t\4@>@<Cases for noads that can follow a |bin_noad|@>@;
 14341  @t\4@>@<Cases for nodes that can appear in an mlist, after which we
 14342    |goto done_with_node|@>@;
 14343  othercases confusion("mlist1")
 14344  @:this can't happen mlist1}{\quad mlist1@>
 14345  endcases;@/
 14346  @<Convert \(n)|nucleus(q)| to an hlist and attach the sub/superscripts@>
 14347  
 14348  @ @<Convert \(a)a final |bin_noad| to an |ord_noad|@>=
 14349  if r_type=bin_noad then type(r):=ord_noad
 14350  
 14351  @ @<Cases for nodes that can appear in an mlist...@>=
 14352  style_node: begin cur_style:=subtype(q);
 14353    @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>;
 14354    goto done_with_node;
 14355    end;
 14356  choice_node: @<Change this node to a style node followed by the correct choice,
 14357     then |goto done_with_node|@>;
 14358  ins_node,mark_node,adjust_node,
 14359    whatsit_node,penalty_node,disc_node: goto done_with_node;
 14360  rule_node: begin if height(q)>max_h then max_h:=height(q);
 14361    if depth(q)>max_d then max_d:=depth(q); goto done_with_node;
 14362    end;
 14363  glue_node: begin @<Convert \(m)math glue to ordinary glue@>;
 14364    goto done_with_node;
 14365    end;
 14366  kern_node: begin math_kern(q,cur_mu); goto done_with_node;
 14367    end;
 14368  
 14369  @ @d choose_mlist(#)==begin p:=#(q); #(q):=null;@+end
 14370  
 14371  @<Change this node to a style node...@>=
 14372  begin case cur_style div 2 of
 14373  0: choose_mlist(display_mlist); {|display_style=0|}
 14374  1: choose_mlist(text_mlist); {|text_style=2|}
 14375  2: choose_mlist(script_mlist); {|script_style=4|}
 14376  3: choose_mlist(script_script_mlist); {|script_script_style=6|}
 14377  end; {there are no other cases}
 14378  flush_node_list(display_mlist(q));
 14379  flush_node_list(text_mlist(q));
 14380  flush_node_list(script_mlist(q));
 14381  flush_node_list(script_script_mlist(q));@/
 14382  type(q):=style_node; subtype(q):=cur_style; width(q):=0; depth(q):=0;
 14383  if p<>null then
 14384    begin z:=link(q); link(q):=p;
 14385    while link(p)<>null do p:=link(p);
 14386    link(p):=z;
 14387    end;
 14388  goto done_with_node;
 14389  end
 14390  
 14391  @ Conditional math glue (`\.{\\nonscript}') results in a |glue_node|
 14392  pointing to |zero_glue|, with |subtype(q)=cond_math_glue|; in such a case
 14393  the node following will be eliminated if it is a glue or kern node and if the
 14394  current size is different from |text_size|. Unconditional math glue
 14395  (`\.{\\muskip}') is converted to normal glue by multiplying the dimensions
 14396  by |cur_mu|.
 14397  @!@:non_script_}{\.{\\nonscript} primitive@>
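
For example, plain \TeX's `\.{\\;}' expands to `\.{\\mskip\\thickmuskip}',
so `\.{\\nonscript\\;}' yields a thick space in display and text styles
but nothing at all in script and scriptscript styles, where
|cur_size<>text_size| and the glue node that follows is therefore removed.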
 14398  
 14399  @<Convert \(m)math glue to ordinary glue@>=
 14400  if subtype(q)=mu_glue then
 14401    begin x:=glue_ptr(q);
 14402    y:=math_glue(x,cur_mu); delete_glue_ref(x); glue_ptr(q):=y;
 14403    subtype(q):=normal;
 14404    end
 14405  else if (cur_size<>text_size)and(subtype(q)=cond_math_glue) then
 14406    begin p:=link(q);
 14407    if p<>null then if (type(p)=glue_node)or(type(p)=kern_node) then
 14408      begin link(q):=link(p); link(p):=null; flush_node_list(p);
 14409      end;
 14410    end
 14411  
 14412  @ @<Cases for noads that can follow a |bin_noad|@>=
 14413  left_noad: goto done_with_noad;
 14414  fraction_noad: begin make_fraction(q); goto check_dimensions;
 14415    end;
 14416  op_noad: begin delta:=make_op(q);
 14417    if subtype(q)=limits then goto check_dimensions;
 14418    end;
 14419  ord_noad: make_ord(q);
 14420  open_noad,inner_noad: do_nothing;
 14421  radical_noad: make_radical(q);
 14422  over_noad: make_over(q);
 14423  under_noad: make_under(q);
 14424  accent_noad: make_math_accent(q);
 14425  vcenter_noad: make_vcenter(q);
 14426  
 14427  @ Most of the actual construction work of |mlist_to_hlist| is done
 14428  by procedures with names
 14429  like |make_fraction|, |make_radical|, etc. To illustrate
 14430  the general setup of such procedures, let's begin with a couple of
 14431  simple ones.
 14432  
 14433  @<Declare math...@>=
 14434  procedure make_over(@!q:pointer);
 14435  begin info(nucleus(q)):=@|
 14436    overbar(clean_box(nucleus(q),cramped_style(cur_style)),@|
 14437    3*default_rule_thickness,default_rule_thickness);
 14438  math_type(nucleus(q)):=sub_box;
 14439  end;
 14440  
 14441  @ @<Declare math...@>=
 14442  procedure make_under(@!q:pointer);
 14443  var p,@!x,@!y: pointer; {temporary registers for box construction}
 14444  @!delta:scaled; {overall height plus depth}
 14445  begin x:=clean_box(nucleus(q),cur_style);
 14446  p:=new_kern(3*default_rule_thickness); link(x):=p;
 14447  link(p):=fraction_rule(default_rule_thickness);
 14448  y:=vpack(x,natural);
 14449  delta:=height(y)+depth(y)+default_rule_thickness;
 14450  height(y):=height(x); depth(y):=delta-height(y);
 14451  info(nucleus(q)):=y; math_type(nucleus(q)):=sub_box;
 14452  end;
 14453  
 14454  @ @<Declare math...@>=
 14455  procedure make_vcenter(@!q:pointer);
 14456  var v:pointer; {the box that should be centered vertically}
 14457  @!delta:scaled; {its height plus depth}
 14458  begin v:=info(nucleus(q));
 14459  if type(v)<>vlist_node then confusion("vcenter");
 14460  @:this can't happen vcenter}{\quad vcenter@>
 14461  delta:=height(v)+depth(v);
 14462  height(v):=axis_height(cur_size)+half(delta);
 14463  depth(v):=delta-height(v);
 14464  end;
 14465  
 14466  @ According to the rules in the \.{DVI} file specifications, we ensure alignment
 14467  @^square roots@>
 14468  between a square root sign and the rule above its nucleus by assuming that the
 14469  baseline of the square-root symbol is the same as the bottom of the rule. The
 14470  height of the square-root symbol will be the thickness of the rule, and the
 14471  depth of the square-root symbol should exceed or equal the height-plus-depth
 14472  of the nucleus plus a certain minimum clearance~|clr|. The symbol will be
 14473  placed so that the actual clearance is |clr| plus half the excess.
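
In symbols, write $\theta$ for |default_rule_thickness|, $\psi$ for |clr|,
and $h+d$ for the height plus depth of the nucleus box~|x|. The delimiter
is requested at size $h+d+\psi+\theta$, where $\psi$ starts at $\theta$ plus
one quarter of the absolute value of |math_x_height| in display style, and at
$\theta+\vert\theta\vert/4$ otherwise. As an illustration with made-up
dimensions, take $\theta=0.4$\thinspace pt outside display style, so that
$\psi=0.5$\thinspace pt, and let $h=6$\thinspace pt, $d=2$\thinspace pt;
the requested size is 8.9\thinspace pt. If the symbol that comes back has
depth 11.5\thinspace pt, the excess is $11.5-8.5=3$\thinspace pt, so $\psi$
grows to 2\thinspace pt and |shift_amount(y)| becomes $-8$\thinspace pt.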
 14474  
 14475  @<Declare math...@>=
 14476  procedure make_radical(@!q:pointer);
 14477  var x,@!y:pointer; {temporary registers for box construction}
 14478  @!delta,@!clr:scaled; {dimensions involved in the calculation}
 14479  begin x:=clean_box(nucleus(q),cramped_style(cur_style));
 14480  if cur_style<text_style then {display style}
 14481    clr:=default_rule_thickness+(abs(math_x_height(cur_size)) div 4)
 14482  else  begin clr:=default_rule_thickness; clr:=clr + (abs(clr) div 4);
 14483    end;
 14484  y:=var_delimiter(left_delimiter(q),cur_size,height(x)+depth(x)+clr+
 14485    default_rule_thickness);
 14486  delta:=depth(y)-(height(x)+depth(x)+clr);
 14487  if delta>0 then clr:=clr+half(delta); {increase the actual clearance}
 14488  shift_amount(y):=-(height(x)+clr);
 14489  link(y):=overbar(x,clr,height(y));
 14490  info(nucleus(q)):=hpack(y,natural); math_type(nucleus(q)):=sub_box;
 14491  end;
 14492  
 14493  @ Slants are not considered when placing accents in math mode. The accenter is
 14494  centered over the accentee, and the accent width is treated as zero with
 14495  respect to the size of the final box.
 14496  
 14497  @<Declare math...@>=
 14498  procedure make_math_accent(@!q:pointer);
 14499  label done,done1;
 14500  var p,@!x,@!y:pointer; {temporary registers for box construction}
 14501  @!a:integer; {address of lig/kern instruction}
 14502  @!c:quarterword; {accent character}
 14503  @!f:internal_font_number; {its font}
 14504  @!i:four_quarters; {its |char_info|}
 14505  @!s:scaled; {amount to skew the accent to the right}
 14506  @!h:scaled; {height of character being accented}
 14507  @!delta:scaled; {space to remove between accent and accentee}
 14508  @!w:scaled; {width of the accentee, not including sub/superscripts}
 14509  begin fetch(accent_chr(q));
 14510  if char_exists(cur_i) then
 14511    begin i:=cur_i; c:=cur_c; f:=cur_f;@/
 14512    @<Compute the amount of skew@>;
 14513    x:=clean_box(nucleus(q),cramped_style(cur_style)); w:=width(x); h:=height(x);
 14514    @<Switch to a larger accent if available and appropriate@>;
 14515    if h<x_height(f) then delta:=h@+else delta:=x_height(f);
 14516    if (math_type(supscr(q))<>empty)or(math_type(subscr(q))<>empty) then
 14517      if math_type(nucleus(q))=math_char then
 14518        @<Swap the subscript and superscript into box |x|@>;
 14519    y:=char_box(f,c);
 14520    shift_amount(y):=s+half(w-width(y));
 14521    width(y):=0; p:=new_kern(-delta); link(p):=x; link(y):=p;
 14522    y:=vpack(y,natural); width(y):=width(x);
 14523    if height(y)<h then @<Make the height of box |y| equal to |h|@>;
 14524    info(nucleus(q)):=y;
 14525    math_type(nucleus(q)):=sub_box;
 14526    end;
 14527  end;
 14528  
 14529  @ @<Make the height of box |y|...@>=
 14530  begin p:=new_kern(h-height(y)); link(p):=list_ptr(y); list_ptr(y):=p;
 14531  height(y):=h;
 14532  end
 14533  
 14534  @ @<Switch to a larger accent if available and appropriate@>=
 14535  loop@+  begin if char_tag(i)<>list_tag then goto done;
 14536    y:=rem_byte(i);
 14537    i:=char_info(f)(y);
 14538    if not char_exists(i) then goto done;
 14539    if char_width(f)(i)>w then goto done;
 14540    c:=y;
 14541    end;
 14542  done:
 14543  
 14544  @ @<Compute the amount of skew@>=
 14545  s:=0;
 14546  if math_type(nucleus(q))=math_char then
 14547    begin fetch(nucleus(q));
 14548    if char_tag(cur_i)=lig_tag then
 14549      begin a:=lig_kern_start(cur_f)(cur_i);
 14550      cur_i:=font_info[a].qqqq;
 14551      if skip_byte(cur_i)>stop_flag then
 14552        begin a:=lig_kern_restart(cur_f)(cur_i);
 14553        cur_i:=font_info[a].qqqq;
 14554        end;
 14555      loop@+ begin if qo(next_char(cur_i))=skew_char[cur_f] then
 14556          begin if op_byte(cur_i)>=kern_flag then
 14557            if skip_byte(cur_i)<=stop_flag then s:=char_kern(cur_f)(cur_i);
 14558          goto done1;
 14559          end;
 14560        if skip_byte(cur_i)>=stop_flag then goto done1;
 14561        a:=a+qo(skip_byte(cur_i))+1;
 14562        cur_i:=font_info[a].qqqq;
 14563        end;
 14564      end;
 14565    end;
 14566  done1:
 14567  
 14568  @ @<Swap the subscript and superscript into box |x|@>=
 14569  begin flush_node_list(x); x:=new_noad;
 14570  mem[nucleus(x)]:=mem[nucleus(q)];
 14571  mem[supscr(x)]:=mem[supscr(q)];
 14572  mem[subscr(x)]:=mem[subscr(q)];@/
 14573  mem[supscr(q)].hh:=empty_field;
 14574  mem[subscr(q)].hh:=empty_field;@/
 14575  math_type(nucleus(q)):=sub_mlist; info(nucleus(q)):=x;
 14576  x:=clean_box(nucleus(q),cur_style); delta:=delta+height(x)-h; h:=height(x);
 14577  end
 14578  
 14579  @ The |make_fraction| procedure is a bit different because it sets
 14580  |new_hlist(q)| directly rather than making a sub-box.
 14581  
 14582  @<Declare math...@>=
 14583  procedure make_fraction(@!q:pointer);
 14584  var p,@!v,@!x,@!y,@!z:pointer; {temporary registers for box construction}
 14585  @!delta,@!delta1,@!delta2,@!shift_up,@!shift_down,@!clr:scaled;
 14586    {dimensions for box calculations}
 14587  begin if thickness(q)=default_code then thickness(q):=default_rule_thickness;
 14588  @<Create equal-width boxes |x| and |z| for the numerator and denominator,
 14589    and compute the default amounts |shift_up| and |shift_down| by which they
 14590    are displaced from the baseline@>;
 14591  if thickness(q)=0 then @<Adjust \(s)|shift_up| and |shift_down| for the case
 14592    of no fraction line@>
 14593  else @<Adjust \(s)|shift_up| and |shift_down| for the case of a fraction line@>;
 14594  @<Construct a vlist box for the fraction, according to |shift_up| and
 14595    |shift_down|@>;
 14596  @<Put the \(f)fraction into a box with its delimiters, and make |new_hlist(q)|
 14597    point to it@>;
 14598  end;
 14599  
 14600  @ @<Create equal-width boxes |x| and |z| for the numerator and denom...@>=
 14601  x:=clean_box(numerator(q),num_style(cur_style));
 14602  z:=clean_box(denominator(q),denom_style(cur_style));
 14603  if width(x)<width(z) then x:=rebox(x,width(z))
 14604  else z:=rebox(z,width(x));
 14605  if cur_style<text_style then {display style}
 14606    begin shift_up:=num1(cur_size); shift_down:=denom1(cur_size);
 14607    end
 14608  else  begin shift_down:=denom2(cur_size);
 14609    if thickness(q)<>0 then shift_up:=num2(cur_size)
 14610    else shift_up:=num3(cur_size);
 14611    end
 14612  
 14613  @ The numerator and denominator must be separated by a certain minimum
 14614  clearance, called |clr| in the following program. The difference between
 14615  |clr| and the actual clearance is twice |delta|.
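
For instance, with illustrative dimensions in text style and a default
rule thickness of 0.4\thinspace pt, we have $|clr|=1.2$\thinspace pt.
If |shift_up| is 4\thinspace pt, |depth(x)| is 1\thinspace pt,
|height(z)| is 5\thinspace pt, and |shift_down| is 3\thinspace pt, the
actual clearance is $(4-1)-(5-3)=1$\thinspace pt, so
$|delta|=0.1$\thinspace pt. Both shifts are increased by that amount,
to 4.1\thinspace pt and 3.1\thinspace pt, and the clearance becomes
$(4.1-1)-(5-3.1)=1.2$\thinspace pt, as required.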
 14616  
 14617  @<Adjust \(s)|shift_up| and |shift_down| for the case of no fraction line@>=
 14618  begin if cur_style<text_style then clr:=7*default_rule_thickness
 14619  else clr:=3*default_rule_thickness;
 14620  delta:=half(clr-((shift_up-depth(x))-(height(z)-shift_down)));
 14621  if delta>0 then
 14622    begin shift_up:=shift_up+delta;
 14623    shift_down:=shift_down+delta;
 14624    end;
 14625  end
 14626  
 14627  @ In the case of a fraction line, the minimum clearance depends on the actual
 14628  thickness of the line.
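
For instance, with illustrative dimensions in text style, let the line
thickness be 0.4\thinspace pt, so that $|clr|=0.4$\thinspace pt and
$|delta|=0.2$\thinspace pt, and let the axis height be 2.5\thinspace pt.
If |shift_up| is 4\thinspace pt and |depth(x)| is 1\thinspace pt, the gap
above the line is $(4-1)-(2.5+0.2)=0.3$\thinspace pt, hence
$|delta1|=0.1$\thinspace pt and |shift_up| grows to 4.1\thinspace pt.
If |height(z)| is 5\thinspace pt and |shift_down| is 3.5\thinspace pt, the
gap below the line is $(2.5-0.2)-(5-3.5)=0.8$\thinspace pt, hence |delta2|
is negative and |shift_down| is left alone.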
 14629  
 14630  @<Adjust \(s)|shift_up| and |shift_down| for the case of a fraction line@>=
 14631  begin if cur_style<text_style then clr:=3*thickness(q)
 14632  else clr:=thickness(q);
 14633  delta:=half(thickness(q));
 14634  delta1:=clr-((shift_up-depth(x))-(axis_height(cur_size)+delta));
 14635  delta2:=clr-((axis_height(cur_size)-delta)-(height(z)-shift_down));
 14636  if delta1>0 then shift_up:=shift_up+delta1;
 14637  if delta2>0 then shift_down:=shift_down+delta2;
 14638  end
 14639  
 14640  @ @<Construct a vlist box for the fraction...@>=
 14641  v:=new_null_box; type(v):=vlist_node;
 14642  height(v):=shift_up+height(x); depth(v):=depth(z)+shift_down;
 14643  width(v):=width(x); {this also equals |width(z)|}
 14644  if thickness(q)=0 then
 14645    begin p:=new_kern((shift_up-depth(x))-(height(z)-shift_down));
 14646    link(p):=z;
 14647    end
 14648  else  begin y:=fraction_rule(thickness(q));@/
 14649    p:=new_kern((axis_height(cur_size)-delta)-@|(height(z)-shift_down));@/
 14650    link(y):=p; link(p):=z;@/
 14651    p:=new_kern((shift_up-depth(x))-(axis_height(cur_size)+delta));
 14652    link(p):=y;
 14653    end;
 14654  link(x):=p; list_ptr(v):=x
 14655  
 14656  @ @<Put the \(f)fraction into a box with its delimiters...@>=
 14657  if cur_style<text_style then delta:=delim1(cur_size)
 14658  else delta:=delim2(cur_size);
 14659  x:=var_delimiter(left_delimiter(q), cur_size, delta); link(x):=v;@/
 14660  z:=var_delimiter(right_delimiter(q), cur_size, delta); link(v):=z;@/
 14661  new_hlist(q):=hpack(x,natural)
 14662  
 14663  @ If the nucleus of an |op_noad| is a single character, it is to be
 14664  centered vertically with respect to the axis, after first being enlarged
 14665  (via a character list in the font) if we are in display style.  The normal
 14666  convention for placing displayed limits is to put them above and below the
 14667  operator in display style.
 14668  
 14669  The italic correction is removed from the character if there is a subscript
 14670  and the limits are not being displayed. The |make_op|
 14671  routine returns the value that should be used as an offset between
 14672  subscript and superscript.
 14673  
 14674  After |make_op| has acted, |subtype(q)| will be |limits| if and only if
 14675  the limits have been set above and below the operator. In that case,
 14676  |new_hlist(q)| will already contain the desired final box.
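
The centering step can be checked by a short calculation. A box with
height~$h$ and depth~$d$ that is shifted down by~$s$ extends from $h-s$
above the baseline to $d+s$ below it, so its midpoint lies $(h-d)/2-s$
above the baseline. Setting this equal to the axis height~$a$ gives
$$s={h-d\over2}-a,$$
which is the value that |make_op| assigns to |shift_amount(x)|, up to the
rounding done by |half|.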
 14677  
 14678  @<Declare math...@>=
 14679  function make_op(@!q:pointer):scaled;
 14680  var delta:scaled; {offset between subscript and superscript}
 14681  @!p,@!v,@!x,@!y,@!z:pointer; {temporary registers for box construction}
 14682  @!c:quarterword;@+@!i:four_quarters; {registers for character examination}
 14683  @!shift_up,@!shift_down:scaled; {dimensions for box calculation}
 14684  begin if (subtype(q)=normal)and(cur_style<text_style) then
 14685    subtype(q):=limits;
 14686  if math_type(nucleus(q))=math_char then
 14687    begin fetch(nucleus(q));
 14688    if (cur_style<text_style)and(char_tag(cur_i)=list_tag) then {make it larger}
 14689      begin c:=rem_byte(cur_i); i:=char_info(cur_f)(c);
 14690      if char_exists(i) then
 14691        begin cur_c:=c; cur_i:=i; character(nucleus(q)):=c;
 14692        end;
 14693      end;
 14694    delta:=char_italic(cur_f)(cur_i); x:=clean_box(nucleus(q),cur_style);
 14695    if (math_type(subscr(q))<>empty)and(subtype(q)<>limits) then
 14696      width(x):=width(x)-delta; {remove italic correction}
 14697    shift_amount(x):=half(height(x)-depth(x)) - axis_height(cur_size);
 14698      {center vertically}
 14699    math_type(nucleus(q)):=sub_box; info(nucleus(q)):=x;
 14700    end
 14701  else delta:=0;
 14702  if subtype(q)=limits then
 14703    @<Construct a box with limits above and below it, skewed by |delta|@>;
 14704  make_op:=delta;
 14705  end;
 14706  
 14707  @ The following program builds a vlist box |v| for displayed limits. The
 14708  width of the box is not affected by the fact that the limits may be skewed.
 14709  
 14710  @<Construct a box with limits above and below it...@>=
 14711  begin x:=clean_box(supscr(q),sup_style(cur_style));
 14712  y:=clean_box(nucleus(q),cur_style);
 14713  z:=clean_box(subscr(q),sub_style(cur_style));
 14714  v:=new_null_box; type(v):=vlist_node; width(v):=width(y);
 14715  if width(x)>width(v) then width(v):=width(x);
 14716  if width(z)>width(v) then width(v):=width(z);
 14717  x:=rebox(x,width(v)); y:=rebox(y,width(v)); z:=rebox(z,width(v));@/
 14718  shift_amount(x):=half(delta); shift_amount(z):=-shift_amount(x);
 14719  height(v):=height(y); depth(v):=depth(y);
 14720  @<Attach the limits to |y| and adjust |height(v)|, |depth(v)| to
 14721    account for their presence@>;
 14722  new_hlist(q):=v;
 14723  end
 14724  
 14725  @ We use |shift_up| and |shift_down| in the following program for the
 14726  amount of glue between the displayed operator |y| and its limits |x| and
 14727  |z|. The vlist inside box |v| will consist of |x| followed by |y| followed
 14728  by |z|, with kern nodes for the spaces between and around them.
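
For instance, with the purely illustrative values |big_op_spacing1| equal
to 1pt and |big_op_spacing3| equal to 2pt, an upper limit whose depth is
0.5pt gets |shift_up| equal to 1.5pt, so that its baseline sits 2pt above
the top of the operator; an upper limit whose depth is 1.5pt would yield
only 0.5pt, which is then raised to the minimum of 1pt. In symbols,
$$\hbox{|shift_up|}=\max\bigl(\hbox{|big_op_spacing1|},\;
  \hbox{|big_op_spacing3|}-\hbox{|depth(x)|}\bigr),$$
and the lower limit is handled symmetrically with |big_op_spacing2|,
|big_op_spacing4|, and |height(z)|.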
 14729  
 14730  @<Attach the limits to |y| and adjust |height(v)|, |depth(v)|...@>=
 14731  if math_type(supscr(q))=empty then
 14732    begin free_node(x,box_node_size); list_ptr(v):=y;
 14733    end
 14734  else  begin shift_up:=big_op_spacing3-depth(x);
 14735    if shift_up<big_op_spacing1 then shift_up:=big_op_spacing1;
 14736    p:=new_kern(shift_up); link(p):=y; link(x):=p;@/
 14737    p:=new_kern(big_op_spacing5); link(p):=x; list_ptr(v):=p;
 14738    height(v):=height(v)+big_op_spacing5+height(x)+depth(x)+shift_up;
 14739    end;
 14740  if math_type(subscr(q))=empty then free_node(z,box_node_size)
 14741  else  begin shift_down:=big_op_spacing4-height(z);
 14742    if shift_down<big_op_spacing2 then shift_down:=big_op_spacing2;
 14743    p:=new_kern(shift_down); link(y):=p; link(p):=z;@/
 14744    p:=new_kern(big_op_spacing5); link(z):=p;
 14745    depth(v):=depth(v)+big_op_spacing5+height(z)+depth(z)+shift_down;
 14746    end
 14747  
 14748  @ A ligature found in a math formula does not create a |ligature_node|, because
 14749  there is no question of hyphenation afterwards; the ligature will simply be
 14750  stored in an ordinary |char_node|, after residing in an |ord_noad|.
 14751  
 14752  The |math_type| is converted to |math_text_char| here if we would not want to
 14753  apply an italic correction to the current character unless it belongs
 14754  to a math font (i.e., a font with |space=0|).
 14755  
 14756  No boundary characters enter into these ligatures.
 14757  
 14758  @<Declare math...@>=
 14759  procedure make_ord(@!q:pointer);
 14760  label restart,exit;
 14761  var a:integer; {address of lig/kern instruction}
 14762  @!p,@!r:pointer; {temporary registers for list manipulation}
 14763  begin restart:@t@>@;@/
 14764  if math_type(subscr(q))=empty then if math_type(supscr(q))=empty then
 14765   if math_type(nucleus(q))=math_char then
 14766    begin p:=link(q);
 14767    if p<>null then if (type(p)>=ord_noad)and(type(p)<=punct_noad) then
 14768      if math_type(nucleus(p))=math_char then
 14769      if fam(nucleus(p))=fam(nucleus(q)) then
 14770        begin math_type(nucleus(q)):=math_text_char;
 14771        fetch(nucleus(q));
 14772        if char_tag(cur_i)=lig_tag then
 14773          begin a:=lig_kern_start(cur_f)(cur_i);
 14774          cur_c:=character(nucleus(p));
 14775          cur_i:=font_info[a].qqqq;
 14776          if skip_byte(cur_i)>stop_flag then
 14777            begin a:=lig_kern_restart(cur_f)(cur_i);
 14778            cur_i:=font_info[a].qqqq;
 14779            end;
 14780          loop@+ begin @<If instruction |cur_i| is a kern with |cur_c|, attach
 14781              the kern after~|q|; or if it is a ligature with |cur_c|, combine
 14782              noads |q| and~|p| appropriately; then |return| if the cursor has
 14783              moved past a noad, or |goto restart|@>;
 14784            if skip_byte(cur_i)>=stop_flag then return;
 14785            a:=a+qo(skip_byte(cur_i))+1;
 14786            cur_i:=font_info[a].qqqq;
 14787            end;
 14788          end;
 14789        end;
 14790    end;
 14791  exit:end;
 14792  
 14793  @ Note that a ligature between an |ord_noad| and another kind of noad
 14794  is replaced by an |ord_noad|, when the two noads collapse into one.
 14795  But we could make a parenthesis (say) change shape when it follows
 14796  certain letters. Presumably a font designer will define such
 14797  ligatures only when this convention makes sense.
 14798  
 14799  \chardef\?='174 % vertical line to indicate character retention
 14800  
 14801  @<If instruction |cur_i| is a kern with |cur_c|, ...@>=
 14802  if next_char(cur_i)=cur_c then if skip_byte(cur_i)<=stop_flag then
 14803    if op_byte(cur_i)>=kern_flag then
 14804      begin p:=new_kern(char_kern(cur_f)(cur_i));
 14805      link(p):=link(q); link(q):=p; return;
 14806      end
 14807    else  begin check_interrupt; {allow a way out of infinite ligature loop}
 14808      case op_byte(cur_i) of
 14809    qi(1),qi(5): character(nucleus(q)):=rem_byte(cur_i); {\.{=:\?}, \.{=:\?>}}
 14810    qi(2),qi(6): character(nucleus(p)):=rem_byte(cur_i); {\.{\?=:}, \.{\?=:>}}
 14811    qi(3),qi(7),qi(11):begin r:=new_noad; {\.{\?=:\?}, \.{\?=:\?>}, \.{\?=:\?>>}}
 14812        character(nucleus(r)):=rem_byte(cur_i);
 14813        fam(nucleus(r)):=fam(nucleus(q));@/
 14814        link(q):=r; link(r):=p;
 14815        if op_byte(cur_i)<qi(11) then math_type(nucleus(r)):=math_char
 14816        else math_type(nucleus(r)):=math_text_char; {prevent combination}
 14817        end;
 14818      othercases begin link(q):=link(p);
 14819        character(nucleus(q)):=rem_byte(cur_i); {\.{=:}}
 14820        mem[subscr(q)]:=mem[subscr(p)]; mem[supscr(q)]:=mem[supscr(p)];@/
 14821        free_node(p,noad_size);
 14822        end
 14823      endcases;
 14824      if op_byte(cur_i)>qi(3) then return;
 14825      math_type(nucleus(q)):=math_char; goto restart;
 14826      end
 14827  
 14828  @ When we get to the following part of the program, we have ``fallen through''
 14829  from cases that did not lead to |check_dimensions| or |done_with_noad| or
 14830  |done_with_node|. Thus, |q|~points to a noad whose nucleus may need to be
 14831  converted to an hlist, and whose subscripts and superscripts need to be
 14832  appended if they are present.
 14833  
 14834  If |nucleus(q)| is not a |math_char|, the variable |delta| is the amount
 14835  by which a superscript should be moved right with respect to a subscript
 14836  when both are present.
 14837  @^subscripts@>
 14838  @^superscripts@>
 14839  
 14840  @<Convert \(n)|nucleus(q)| to an hlist and attach the sub/superscripts@>=
 14841  case math_type(nucleus(q)) of
 14842  math_char, math_text_char:
 14843    @<Create a character node |p| for |nucleus(q)|, possibly followed
 14844    by a kern node for the italic correction, and set |delta| to the
 14845    italic correction if a subscript is present@>;
 14846  empty: p:=null;
 14847  sub_box: p:=info(nucleus(q));
 14848  sub_mlist: begin cur_mlist:=info(nucleus(q)); save_style:=cur_style;
 14849    mlist_penalties:=false; mlist_to_hlist; {recursive call}
 14850  @^recursion@>
 14851    cur_style:=save_style; @<Set up the values...@>;
 14852    p:=hpack(link(temp_head),natural);
 14853    end;
 14854  othercases confusion("mlist2")
 14855  @:this can't happen mlist2}{\quad mlist2@>
 14856  endcases;@/
 14857  new_hlist(q):=p;
 14858  if (math_type(subscr(q))=empty)and(math_type(supscr(q))=empty) then
 14859    goto check_dimensions;
 14860  make_scripts(q,delta)
 14861  
 14862  @ @<Create a character node |p| for |nucleus(q)|...@>=
 14863  begin fetch(nucleus(q));
 14864  if char_exists(cur_i) then
 14865    begin delta:=char_italic(cur_f)(cur_i); p:=new_character(cur_f,qo(cur_c));
 14866    if (math_type(nucleus(q))=math_text_char)and(space(cur_f)<>0) then
 14867      delta:=0; {no italic correction in mid-word of text font}
 14868    if (math_type(subscr(q))=empty)and(delta<>0) then
 14869      begin link(p):=new_kern(delta); delta:=0;
 14870      end;
 14871    end
 14872  else p:=null;
 14873  end
 14874  
 14875  @ The purpose of |make_scripts(q,delta)| is to attach the subscript and/or
 14876  superscript of noad |q| to the list that starts at |new_hlist(q)|,
 14877  given that the subscript and superscript aren't both empty. The superscript
 14878  will appear to the right of the subscript by a given distance |delta|.
 14879  
 14880  We set |shift_down| and |shift_up| to the minimum amounts to shift the
 14881  baseline of subscripts and superscripts based on the given nucleus.
 14882  
 14883  @<Declare math...@>=
 14884  procedure make_scripts(@!q:pointer;@!delta:scaled);
 14885  var p,@!x,@!y,@!z:pointer; {temporary registers for box construction}
 14886  @!shift_up,@!shift_down,@!clr:scaled; {dimensions in the calculation}
 14887  @!t:small_number; {subsidiary size code}
 14888  begin p:=new_hlist(q);
 14889  if is_char_node(p) then
 14890    begin shift_up:=0; shift_down:=0;
 14891    end
 14892  else  begin z:=hpack(p,natural);
 14893    if cur_style<script_style then t:=script_size@+else t:=script_script_size;
 14894    shift_up:=height(z)-sup_drop(t);
 14895    shift_down:=depth(z)+sub_drop(t);
 14896    free_node(z,box_node_size);
 14897    end;
 14898  if math_type(supscr(q))=empty then
 14899    @<Construct a subscript box |x| when there is no superscript@>
 14900  else  begin @<Construct a superscript box |x|@>;
 14901    if math_type(subscr(q))=empty then shift_amount(x):=-shift_up
 14902    else @<Construct a sub/superscript combination box |x|, with the
 14903      superscript offset by |delta|@>;
 14904    end;
 14905  if new_hlist(q)=null then new_hlist(q):=x
 14906  else  begin p:=new_hlist(q);
 14907    while link(p)<>null do p:=link(p);
 14908    link(p):=x;
 14909    end;
 14910  end;
 14911  
 14912  @ When there is a subscript without a superscript, the top of the subscript
 14913  should not exceed the baseline plus four-fifths of the x-height.
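
As an illustration with assumed round numbers: if the x-height is 5pt and
the subscript box has height 6pt, then |clr| is $6-4=2$pt. With
|sub1(cur_size)| equal to 1.5pt and a character nucleus (so that
|shift_down| starts at zero), the subscript is lowered by 2pt, and its top
ends up 4pt above the baseline, exactly four-fifths of the x-height.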
 14914  
 14915  @<Construct a subscript box |x| when there is no superscript@>=
 14916  begin x:=clean_box(subscr(q),sub_style(cur_style));
 14917  width(x):=width(x)+script_space;
 14918  if shift_down<sub1(cur_size) then shift_down:=sub1(cur_size);
 14919  clr:=height(x)-(abs(math_x_height(cur_size)*4) div 5);
 14920  if shift_down<clr then shift_down:=clr;
 14921  shift_amount(x):=shift_down;
 14922  end
 14923  
 14924  @ The bottom of a superscript should never descend below the baseline plus
 14925  one-fourth of the x-height.
 14926  
 14927  @<Construct a superscript box |x|@>=
 14928  begin x:=clean_box(supscr(q),sup_style(cur_style));
 14929  width(x):=width(x)+script_space;
 14930  if odd(cur_style) then clr:=sup3(cur_size)
 14931  else if cur_style<text_style then clr:=sup1(cur_size)
 14932  else clr:=sup2(cur_size);
 14933  if shift_up<clr then shift_up:=clr;
 14934  clr:=depth(x)+(abs(math_x_height(cur_size)) div 4);
 14935  if shift_up<clr then shift_up:=clr;
 14936  end
 14937  
 14938  @ When both subscript and superscript are present, the subscript must be
 14939  separated from the superscript by at least four times |default_rule_thickness|.
 14940  If this condition would be violated, the subscript moves down, after which
 14941  both subscript and superscript move up so that the bottom of the superscript
 14942  is at least as high as the baseline plus four-fifths of the x-height.
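
A worked case may help; all numbers here are assumed for illustration. Let
|default_rule_thickness| be 0.4pt, so that the required separation is
1.6pt, and let the x-height be 5pt. Suppose that |shift_up| is 3.5pt,
|depth(x)| is zero, |height(y)| is 4.5pt, and |shift_down| has become 2.5pt
after the comparison with |sub2(cur_size)|. The gap is then
$3.5-(4.5-2.5)=1.5$pt, which is 0.1pt too small, so |shift_down| grows to
2.6pt. The bottom of the superscript is 3.5pt above the baseline, which is
0.5pt short of four-fifths of the x-height (4pt); hence |shift_up| becomes
4pt and |shift_down| becomes 2.1pt, leaving the 1.6pt gap unchanged.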
 14943  
 14944  @<Construct a sub/superscript combination box |x|...@>=
 14945  begin y:=clean_box(subscr(q),sub_style(cur_style));
 14946  width(y):=width(y)+script_space;
 14947  if shift_down<sub2(cur_size) then shift_down:=sub2(cur_size);
 14948  clr:=4*default_rule_thickness-
 14949    ((shift_up-depth(x))-(height(y)-shift_down));
 14950  if clr>0 then
 14951    begin shift_down:=shift_down+clr;
 14952    clr:=(abs(math_x_height(cur_size)*4) div 5)-(shift_up-depth(x));
 14953    if clr>0 then
 14954      begin shift_up:=shift_up+clr;
 14955      shift_down:=shift_down-clr;
 14956      end;
 14957    end;
 14958  shift_amount(x):=delta; {superscript is |delta| to the right of the subscript}
 14959  p:=new_kern((shift_up-depth(x))-(height(y)-shift_down)); link(x):=p; link(p):=y;
 14960  x:=vpack(x,natural); shift_amount(x):=shift_down;
 14961  end
 14962  
 14963  @ We have now tied up all the loose ends of the first pass of |mlist_to_hlist|.
 14964  The second pass simply goes through and hooks everything together with the
 14965  proper glue and penalties. It also handles the |left_noad| and |right_noad| that
 14966  might be present, since |max_h| and |max_d| are now known. Variable |p| points
 14967  to a node at the current end of the final hlist.
 14968  
 14969  @<Make a second pass over the mlist, ...@>=
 14970  p:=temp_head; link(p):=null; q:=mlist; r_type:=0; cur_style:=style;
 14971  @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>;
 14972  while q<>null do
 14973    begin @<If node |q| is a style node, change the style and |goto delete_q|;
 14974      otherwise if it is not a noad, put it into the hlist,
 14975      advance |q|, and |goto done|; otherwise set |s| to the size
 14976      of noad |q|, set |t| to the associated type (|ord_noad..
 14977      inner_noad|), and set |pen| to the associated penalty@>;
 14978    @<Append inter-element spacing based on |r_type| and |t|@>;
 14979    @<Append any |new_hlist| entries for |q|, and any appropriate penalties@>;
 14980    r_type:=t;
 14981    delete_q: r:=q; q:=link(q); free_node(r,s);
 14982    done: end
 14983  
 14984  @ Just before doing the big |case| switch in the second pass, the program
 14985  sets up default values so that most of the branches are short.
 14986  
 14987  @<If node |q| is a style node, change the style...@>=
 14988  t:=ord_noad; s:=noad_size; pen:=inf_penalty;
 14989  case type(q) of
 14990  op_noad,open_noad,close_noad,punct_noad,inner_noad: t:=type(q);
 14991  bin_noad: begin t:=bin_noad; pen:=bin_op_penalty;
 14992    end;
 14993  rel_noad: begin t:=rel_noad; pen:=rel_penalty;
 14994    end;
 14995  ord_noad,vcenter_noad,over_noad,under_noad: do_nothing;
 14996  radical_noad: s:=radical_noad_size;
 14997  accent_noad: s:=accent_noad_size;
 14998  fraction_noad: s:=fraction_noad_size;
 14999  left_noad,right_noad: t:=make_left_right(q,style,max_d,max_h);
 15000  style_node: @<Change the current style and |goto delete_q|@>;
 15001  whatsit_node,penalty_node,rule_node,disc_node,adjust_node,ins_node,mark_node,
 15002   glue_node,kern_node:@t@>@;@/
 15003    begin link(p):=q; p:=q; q:=link(q); link(p):=null; goto done;
 15004    end;
 15005  othercases confusion("mlist3")
 15006  @:this can't happen mlist3}{\quad mlist3@>
 15007  endcases
 15008  
 15009  @ The |make_left_right| function constructs a left or right delimiter of
 15010  the required size and returns the value |open_noad| or |close_noad|. The
 15011  |right_noad| and |left_noad| will both be based on the original |style|,
 15012  so they will have consistent sizes.
 15013  
 15014  We use the fact that |right_noad-left_noad=close_noad-open_noad|.
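
The size calculation can be followed on a small example. We assume here
the plain \TeX\ settings \.{\\delimiterfactor=901} and
\.{\\delimitershortfall=5pt}, together with the illustrative values
|max_h| equal to 10pt, |max_d| equal to 4pt, and an axis height of 2.5pt.
The formula extends 7.5pt above the axis and 6.5pt below it, so |delta1|
is 7.5pt. Since |(delta1 div 500)*delimiter_factor| approximates
$2\cdot7.5\cdot0.901$, the first candidate for |delta| is about 13.5pt;
the second candidate is $2\cdot7.5-5=10$pt. The larger of the two, about
13.5pt, is the minimum size passed to |var_delimiter|.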
 15015  
 15016  @<Declare math...@>=
 15017  function make_left_right(@!q:pointer;@!style:small_number;
 15018    @!max_d,@!max_h:scaled):small_number;
 15019  var delta,@!delta1,@!delta2:scaled; {dimensions used in the calculation}
 15020  begin if style<script_style then cur_size:=text_size
 15021  else cur_size:=16*((style-text_style) div 2);
 15022  delta2:=max_d+axis_height(cur_size);
 15023  delta1:=max_h+max_d-delta2;
 15024  if delta2>delta1 then delta1:=delta2; {|delta1| is max distance from axis}
 15025  delta:=(delta1 div 500)*delimiter_factor;
 15026  delta2:=delta1+delta1-delimiter_shortfall;
 15027  if delta<delta2 then delta:=delta2;
 15028  new_hlist(q):=var_delimiter(delimiter(q),cur_size,delta);
 15029  make_left_right:=type(q)-(left_noad-open_noad); {|open_noad| or |close_noad|}
 15030  end;
 15031  
 15032  @ @<Change the current style and |goto delete_q|@>=
 15033  begin cur_style:=subtype(q); s:=style_node_size;
 15034  @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>;
 15035  goto delete_q;
 15036  end
 15037  
 15038  @ The inter-element spacing in math formulas depends on an $8\times8$ table that
 15039  \TeX\ preloads as a 64-digit string. The elements of this string have the
 15040  following significance:
 15041  $$\vbox{\halign{#\hfil\cr
 15042  \.0 means no space;\cr
 15043  \.1 means a conditional thin space (\.{\\nonscript\\mskip\\thinmuskip});\cr
 15044  \.2 means a thin space (\.{\\mskip\\thinmuskip});\cr
 15045  \.3 means a conditional medium space
 15046    (\.{\\nonscript\\mskip\\medmuskip});\cr
 15047  \.4 means a conditional thick space
 15048    (\.{\\nonscript\\mskip\\thickmuskip});\cr
 15049  \.* means an impossible case.\cr}}$$
 15050  This is all pretty cryptic, but {\sl The \TeX book\/} explains what is
 15051  supposed to happen, and the string makes it happen.
 15052  @:TeXbook}{\sl The \TeX book@>
 15053  
 15054  A global variable |magic_offset| is computed so that if |a| and |b| are
 15055  in the range |ord_noad..inner_noad|, then |str_pool[a*8+b+magic_offset]|
 15056  is the digit for spacing between noad types |a| and |b|.
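
To see why this works, substitute the definition of |magic_offset| that is
given below: the index |a*8+b+magic_offset| equals
|str_start[math_spacing]+8*(a-ord_noad)+(b-ord_noad)|. Thus the left noad
selects a row of the table and the right noad selects a column, both
counted from zero starting at |ord_noad|. For example, an |ord_noad|
followed by a |bin_noad| selects row~0, column~2, which is the digit \.3
(a conditional medium space); a |rel_noad| followed by an |ord_noad|
selects row~3, column~0, which is the digit \.4 (a conditional thick
space).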
 15057  
 15058  If \PASCAL\ had provided a good way to preload constant arrays, this part of
 15059  the program would not have been so strange.
 15060  @:PASCAL}{\PASCAL@>
 15061  
 15062  @d math_spacing=@;@/
 15063  @t\hskip-35pt@>
 15064  "0234000122*4000133**3**344*0400400*000000234000111*1111112341011"
 15065  @t$ \hskip-35pt$@>
 15066  
 15067  @<Glob...@>=
 15068  @!magic_offset:integer; {used to find inter-element spacing}
 15069  
 15070  @ @<Compute the magic offset@>=
 15071  magic_offset:=str_start[math_spacing]-9*ord_noad
 15072  
 15073  @ @<Append inter-element spacing based on |r_type| and |t|@>=
 15074  if r_type>0 then {not the first noad}
 15075    begin case so(str_pool[r_type*8+t+magic_offset]) of
 15076    "0": x:=0;
 15077    "1": if cur_style<script_style then x:=thin_mu_skip_code@+else x:=0;
 15078    "2": x:=thin_mu_skip_code;
 15079    "3": if cur_style<script_style then x:=med_mu_skip_code@+else x:=0;
 15080    "4": if cur_style<script_style then x:=thick_mu_skip_code@+else x:=0;
 15081    othercases confusion("mlist4")
 15082  @:this can't happen mlist4}{\quad mlist4@>
 15083    endcases;
 15084    if x<>0 then
 15085      begin y:=math_glue(glue_par(x),cur_mu);
 15086      z:=new_glue(y); glue_ref_count(y):=null; link(p):=z; p:=z;@/
 15087      subtype(z):=x+1; {store a symbolic subtype}
 15088      end;
 15089    end
 15090  
 15091  @ We insert a penalty node after the hlist entries of noad |q| if |pen|
 15092  is not an ``infinite'' penalty, and if the node immediately following |q|
 15093  is not a penalty node or a |rel_noad| or absent entirely.
 15094  
 15095  @<Append any |new_hlist| entries for |q|, and any appropriate penalties@>=
 15096  if new_hlist(q)<>null then
 15097    begin link(p):=new_hlist(q);
 15098    repeat p:=link(p);
 15099    until link(p)=null;
 15100    end;
 15101  if penalties then if link(q)<>null then if pen<inf_penalty then
 15102    begin r_type:=type(link(q));
 15103    if r_type<>penalty_node then if r_type<>rel_noad then
 15104      begin z:=new_penalty(pen); link(p):=z; p:=z;
 15105      end;
 15106    end
 15107  
 15108  @* \[37] Alignment.
 15109  It's sort of a miracle whenever \.{\\halign} and \.{\\valign} work, because
 15110  they cut across so many of the control structures of \TeX.
 15111  
 15112  Therefore the
 15113  present page is probably not the best place for a beginner to start reading
 15114  this program; it is better to master everything else first.
 15115  
 15116  Let us focus our thoughts on an example of what the input might be, in order
 15117  to get some idea about how the alignment miracle happens. The example doesn't
 15118  do anything useful, but it is sufficiently general to indicate all of the
 15119  special cases that must be dealt with; please do not be disturbed by its
 15120  apparent complexity and meaninglessness.
 15121  $$\vbox{\halign{\.{#}\hfil\cr
 15122  {}\\tabskip 2pt plus 3pt\cr
 15123  {}\\halign to 300pt\{u1\#v1\&\cr
 15124  \hskip 50pt\\tabskip 1pt plus 1fil u2\#v2\&\cr
 15125  \hskip 50pt u3\#v3\\cr\cr
 15126  \hskip 25pt a1\&\\omit a2\&\\vrule\\cr\cr
 15127  \hskip 25pt \\noalign\{\\vskip 3pt\}\cr
 15128  \hskip 25pt b1\\span b2\\cr\cr
 15129  \hskip 25pt \\omit\&c2\\span\\omit\\cr\}\cr}}$$
 15130  Here's what happens:
 15131  
 15132  \yskip
 15133  (0) When `\.{\\halign to 300pt\{}' is scanned, the |scan_spec| routine
 15134  places the 300pt dimension onto the |save_stack|, and an |align_group|
 15135  code is placed above it. This will make it possible to complete the alignment
 15136  when the matching `\.\}' is found.
 15137  
 15138  (1) The preamble is scanned next. Macros in the preamble are not expanded,
 15139  @^preamble@>
 15140  except as part of a tabskip specification. For example, if \.{u2} had been
 15141  a macro in the preamble above, it would have been expanded, since \TeX\
 15142  must look for `\.{minus...}' as part of the tabskip glue. A ``preamble list''
 15143  is constructed based on the user's preamble; in our case it contains the
 15144  following seven items:
 15145  $$\vbox{\halign{\.{#}\hfil\qquad&(#)\hfil\cr
 15146  {}\\glue 2pt plus 3pt&the tabskip preceding column 1\cr
 15147  {}\\alignrecord, width $-\infty$&preamble info for column 1\cr
 15148  {}\\glue 2pt plus 3pt&the tabskip between columns 1 and 2\cr
 15149  {}\\alignrecord, width $-\infty$&preamble info for column 2\cr
 15150  {}\\glue 1pt plus 1fil&the tabskip between columns 2 and 3\cr
 15151  {}\\alignrecord, width $-\infty$&preamble info for column 3\cr
 15152  {}\\glue 1pt plus 1fil&the tabskip following column 3\cr}}$$
 15153  These ``alignrecord'' entries have the same size as an |unset_node|,
 15154  since they will later be converted into such nodes. However, at the
 15155  moment they have no |type| or |subtype| fields; they have |info| fields
 15156  instead, and these |info| fields are initially set to the value |end_span|,
 15157  for reasons explained below. Furthermore, the alignrecord nodes have no
 15158  |height| or |depth| fields; these are renamed |u_part| and |v_part|,
 15159  and they point to token lists for the templates of the alignment.
 15160  For example, the |u_part| field in the first alignrecord points to the
 15161  token list `\.{u1}', i.e., the template preceding the `\.\#' for column~1.
 15162  
 15163  (2) \TeX\ now looks at what follows the \.{\\cr} that ended the preamble.
 15164  It is not `\.{\\noalign}' or `\.{\\omit}', so this input is put back to
 15165  be read again, and the template `\.{u1}' is fed to the scanner. Just
 15166  before reading `\.{u1}', \TeX\ goes into restricted horizontal mode.
 15167  Just after reading `\.{u1}', \TeX\ will see `\.{a1}', and then (when the
 15168  {\.\&} is sensed) \TeX\ will see `\.{v1}'. Then \TeX\ scans an |endv|
 15169  token, indicating the end of a column. At this point an |unset_node| is
 15170  created, containing the contents of the current hlist (i.e., `\.{u1a1v1}').
 15171  The natural width of this unset node replaces the |width| field of the
 15172  alignrecord for column~1; in general, the alignrecords will record the
 15173  maximum natural width that has occurred so far in a given column.
 15174  
 15175  (3) Since `\.{\\omit}' follows the `\.\&', the templates for column~2
 15176  are now bypassed. Again \TeX\ goes into restricted horizontal mode and
 15177  makes an |unset_node| from the resulting hlist; but this time the
 15178  hlist contains simply `\.{a2}'. The natural width of the new unset box
 15179  is remembered in the |width| field of the alignrecord for column~2.
 15180  
 15181  (4) A third |unset_node| is created for column 3, using essentially the
 15182  mechanism that worked for column~1; this unset box contains `\.{u3\\vrule
 15183  v3}'. The vertical rule in this case has running dimensions that will later
 15184  extend to the height and depth of the whole first row, since each |unset_node|
 15185  in a row will eventually inherit the height and depth of its enclosing box.
 15186  
 15187  (5) The first row has now ended; it is made into a single unset box
 15188  comprising the following seven items:
 15189  $$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr
 15190  {}\\glue 2pt plus 3pt\cr
 15191  {}\\unsetbox for 1 column: u1a1v1\cr
 15192  {}\\glue 2pt plus 3pt\cr
 15193  {}\\unsetbox for 1 column: a2\cr
 15194  {}\\glue 1pt plus 1fil\cr
 15195  {}\\unsetbox for 1 column: u3\\vrule v3\cr
 15196  {}\\glue 1pt plus 1fil\cr}}$$
 15197  The width of this unset row is unimportant, but it has the correct height
 15198  and depth, so the correct baselineskip glue will be computed as the row
 15199  is inserted into a vertical list.
 15200  
 15201  (6) Since `\.{\\noalign}' follows the current \.{\\cr}, \TeX\ appends
 15202  additional material (in this case \.{\\vskip 3pt}) to the vertical list.
 15203  While processing this material, \TeX\ will be in internal vertical
 15204  mode, and |no_align_group| will be on |save_stack|.
 15205  
 15206  (7) The next row produces an unset box that looks like this:
 15207  $$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr
 15208  {}\\glue 2pt plus 3pt\cr
 15209  {}\\unsetbox for 2 columns: u1b1v1u2b2v2\cr
 15210  {}\\glue 1pt plus 1fil\cr
 15211  {}\\unsetbox for 1 column: {\rm(empty)}\cr
 15212  {}\\glue 1pt plus 1fil\cr}}$$
 15213  The natural width of the unset box that spans columns 1~and~2 is stored
 15214  in a ``span node,'' which we will explain later; the |info| field of the
 15215  alignrecord for column~1 now points to the new span node, and the |info|
 15216  of the span node points to |end_span|.
 15217  
 15218  (8) The final row produces the unset box
 15219  $$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr
 15220  {}\\glue 2pt plus 3pt\cr
 15221  {}\\unsetbox for 1 column: {\rm(empty)}\cr
 15222  {}\\glue 2pt plus 3pt\cr
 15223  {}\\unsetbox for 2 columns: u2c2v2\cr
 15224  {}\\glue 1pt plus 1fil\cr}}$$
 15225  A new span node is attached to the alignrecord for column 2.
 15226  
 15227  (9) The last step is to compute the true column widths and to change all the
 15228  unset boxes to hboxes, appending the whole works to the vertical list that
 15229  encloses the \.{\\halign}. The rules for deciding on the final widths of
 15230  each unset column box will be explained below.
 15231  
 15232  \yskip\noindent
 15233  Note that as \.{\\halign} is being processed, we fearlessly give up control
 15234  to the rest of \TeX. At critical junctures, an alignment routine is
 15235  called upon to step in and do some little action, but most of the time
 15236  these routines just lurk in the background. It's something like
 15237  post-hypnotic suggestion.
 15238  
 15239  @ We have mentioned that alignrecords contain no |height| or |depth| fields.
 15240  Their |glue_sign| and |glue_order| are pre-empted as well, since it
 15241  is necessary to store information about what to do when a template ends.
 15242  This information is called the |extra_info| field.
 15243  
 15244  @d u_part(#)==mem[#+height_offset].int {pointer to \<u_j> token list}
 15245  @d v_part(#)==mem[#+depth_offset].int {pointer to \<v_j> token list}
 15246  @d extra_info(#)==info(#+list_offset) {info to remember during template}
 15247  
 15248  @ Alignments can occur within alignments, so a small stack is used to access
 15249  the alignrecord information. At each level we have a |preamble| pointer,
 15250  indicating the beginning of the preamble list; a |cur_align| pointer,
 15251  indicating the current position in the preamble list; a |cur_span| pointer,
 15252  indicating the value of |cur_align| at the beginning of a sequence of
 15253  spanned columns; a |cur_loop| pointer, indicating the tabskip glue before
 15254  an alignrecord that should be copied next if the current list is extended;
 15255  and the |align_state| variable, which indicates the nesting of braces so
 15256  that \.{\\cr} and \.{\\span} and tab marks are properly intercepted.
 15257  There also are pointers |cur_head| and |cur_tail| to the head and tail
 15258  of a list of adjustments being moved out from horizontal mode to
 15259  vertical~mode.
 15260  
 15261  The current values of these seven quantities appear in global variables;
 15262  when they have to be pushed down, they are stored in 5-word nodes, and
 15263  |align_ptr| points to the topmost such node.
 15264  
 15265  @d preamble==link(align_head) {the current preamble list}
 15266  @d align_stack_node_size=5 {number of |mem| words to save alignment states}
 15267  
 15268  @<Glob...@>=
 15269  @!cur_align:pointer; {current position in preamble list}
 15270  @!cur_span:pointer; {start of currently spanned columns in preamble list}
 15271  @!cur_loop:pointer; {place to copy when extending a periodic preamble}
 15272  @!align_ptr:pointer; {most recently pushed-down alignment stack node}
 15273  @!cur_head,@!cur_tail:pointer; {adjustment list pointers}
 15274  
 15275  @ The |align_state| and |preamble| variables are initialized elsewhere.
 15276  
 15277  @<Set init...@>=
 15278  align_ptr:=null; cur_align:=null; cur_span:=null; cur_loop:=null;
 15279  cur_head:=null; cur_tail:=null;
 15280  
 15281  @ Alignment stack maintenance is handled by a pair of trivial routines
 15282  called |push_alignment| and |pop_alignment|.
 15283  
 15284  @p procedure push_alignment;
 15285  var p:pointer; {the new alignment stack node}
 15286  begin p:=get_node(align_stack_node_size);
 15287  link(p):=align_ptr; info(p):=cur_align;
 15288  llink(p):=preamble; rlink(p):=cur_span;
 15289  mem[p+2].int:=cur_loop; mem[p+3].int:=align_state;
 15290  info(p+4):=cur_head; link(p+4):=cur_tail;
 15291  align_ptr:=p;
 15292  cur_head:=get_avail;
 15293  end;
 15294  @#
 15295  procedure pop_alignment;
 15296  var p:pointer; {the top alignment stack node}
 15297  begin free_avail(cur_head);
 15298  p:=align_ptr;
 15299  cur_tail:=link(p+4); cur_head:=info(p+4);
 15300  align_state:=mem[p+3].int; cur_loop:=mem[p+2].int;
 15301  cur_span:=rlink(p); preamble:=llink(p);
 15302  cur_align:=info(p); align_ptr:=link(p);
 15303  free_node(p,align_stack_node_size);
 15304  end;
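
@ As a reading aid only, and not part of \TeX, the save-and-restore discipline of |push_alignment| and |pop_alignment| can be modelled in a few lines of Python. The class name |AlignState| and the dictionary layout are assumptions made for this sketch; the real routines pack the seven quantities into a 5-word |mem| node rather than copying a dictionary.

```python
# Hypothetical model: the names below mirror the seven saved quantities.
SAVED = ("cur_align", "preamble", "cur_span", "cur_loop",
         "align_state", "cur_head", "cur_tail")

class AlignState:
    def __init__(self):
        self.stack = []                    # plays the role of align_ptr
        self.cur = dict.fromkeys(SAVED)    # the "global variables"

    def push_alignment(self):
        self.stack.append(dict(self.cur))  # save all seven values
        self.cur["cur_head"] = object()    # fresh list head, like get_avail

    def pop_alignment(self):
        self.cur = self.stack.pop()        # restore all seven values
```

Nested alignments then behave like any other stack: the inner alignment may overwrite every quantity freely, and the outer values reappear after the matching pop.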
 15305  
 15306  @ \TeX\ has eight procedures that govern alignments: |init_align| and
 15307  |fin_align| are used at the very beginning and the very end; |init_row| and
 15308  |fin_row| are used at the beginning and end of individual rows; |init_span|
 15309  is used at the beginning of a sequence of spanned columns (possibly involving
 15310  only one column); |init_col| and |fin_col| are used at the beginning and
 15311  end of individual columns; and |align_peek| is used after \.{\\cr} to see
 15312  whether the next item is \.{\\noalign}.
 15313  
 15314  We shall consider these routines in the order they are first used during
 15315  the course of a complete \.{\\halign}, namely |init_align|, |align_peek|,
 15316  |init_row|, |init_span|, |init_col|, |fin_col|, |fin_row|, |fin_align|.
 15317  
 15318  @ When \.{\\halign} or \.{\\valign} has been scanned in an appropriate
 15319  mode, \TeX\ calls |init_align|, whose task is to get everything off to a
 15320  good start. This mostly involves scanning the preamble and putting its
 15321  information into the preamble list.
 15322  @^preamble@>
 15323  
 15324  @p @t\4@>@<Declare the procedure called |get_preamble_token|@>@t@>@/
 15325  procedure@?align_peek; forward;@t\2@>@/
 15326  procedure@?normal_paragraph; forward;@t\2@>@/
 15327  procedure init_align;
 15328  label done, done1, done2, continue;
 15329  var save_cs_ptr:pointer; {|warning_index| value for error messages}
 15330  @!p:pointer; {for short-term temporary use}
 15331  begin save_cs_ptr:=cur_cs; {\.{\\halign} or \.{\\valign}, usually}
 15332  push_alignment; align_state:=-1000000; {enter a new alignment level}
 15333  @<Check for improper alignment in displayed math@>;
 15334  push_nest; {enter a new semantic level}
 15335  @<Change current mode to |-vmode| for \.{\\halign}, |-hmode| for \.{\\valign}@>;
 15336  scan_spec(align_group,false);@/
 15337  @<Scan the preamble and record it in the |preamble| list@>;
 15338  new_save_level(align_group);
 15339  if every_cr<>null then begin_token_list(every_cr,every_cr_text);
 15340  align_peek; {look for \.{\\noalign} or \.{\\omit}}
 15341  end;
 15342  
 15343  @ In vertical modes, |prev_depth| already has the correct value. But
 15344  if we are in |mmode| (displayed formula mode), we reach out to the
 15345  enclosing vertical mode for the |prev_depth| value that produces the
 15346  correct baseline calculations.
 15347  
 15348  @<Change current mode...@>=
 15349  if mode=mmode then
 15350    begin mode:=-vmode; prev_depth:=nest[nest_ptr-2].aux_field.sc;
 15351    end
 15352  else if mode>0 then negate(mode)
 15353  
 15354  @ When \.{\\halign} is used as a displayed formula, there should be
 15355  no other pieces of mlists present.
 15356  
 15357  @<Check for improper alignment in displayed math@>=
 15358  if (mode=mmode)and((tail<>head)or(incompleat_noad<>null)) then
 15359    begin print_err("Improper "); print_esc("halign"); print(" inside $$'s");
 15360  @.Improper \\halign...@>
 15361    help3("Displays can use special alignments (like \eqalignno)")@/
 15362    ("only if nothing but the alignment itself is between $$'s.")@/
 15363    ("So I've deleted the formulas that preceded this alignment.");
 15364    error; flush_math;
 15365    end
 15366  
 15367  @ @<Scan the preamble and record it in the |preamble| list@>=
 15368  preamble:=null; cur_align:=align_head; cur_loop:=null; scanner_status:=aligning;
 15369  warning_index:=save_cs_ptr; align_state:=-1000000;
 15370    {at this point, |cur_cmd=left_brace|}
 15371  loop@+  begin @<Append the current tabskip glue to the preamble list@>;
 15372    if cur_cmd=car_ret then goto done; {\.{\\cr} ends the preamble}
 15373    @<Scan preamble text until |cur_cmd| is |tab_mark| or |car_ret|,
 15374      looking for changes in the tabskip glue; append an
 15375      alignrecord to the preamble list@>;
 15376    end;
 15377  done: scanner_status:=normal
 15378  
 15379  @ @<Append the current tabskip glue to the preamble list@>=
 15380  link(cur_align):=new_param_glue(tab_skip_code);
 15381  cur_align:=link(cur_align)
 15382  
 15383  @ @<Scan preamble text until |cur_cmd| is |tab_mark| or |car_ret|...@>=
 15384  @<Scan the template \<u_j>, putting the resulting token list in |hold_head|@>;
 15385  link(cur_align):=new_null_box; cur_align:=link(cur_align); {a new alignrecord}
 15386  info(cur_align):=end_span; width(cur_align):=null_flag;
 15387  u_part(cur_align):=link(hold_head);
 15388  @<Scan the template \<v_j>, putting the resulting token list in |hold_head|@>;
 15389  v_part(cur_align):=link(hold_head)
 15390  
 15391  @ We enter `\.{\\span}' into |eqtb| with |tab_mark| as its command code,
 15392  and with |span_code| as the command modifier. This makes \TeX\ interpret it
 15393  essentially the same as an alignment delimiter like `\.\&', yet it is
 15394  recognizably different when we need to distinguish it from a normal delimiter.
 15395  It also turns out to be useful to give a special |cr_code| to `\.{\\cr}',
 15396  and an even larger |cr_cr_code| to `\.{\\crcr}'.
 15397  
 15398  The end of a template is represented by two ``frozen'' control sequences
 15399  called \.{\\endtemplate}. The first has the command code |end_template|, which
 15400  is |>outer_call|, so it will not easily disappear in the presence of errors.
 15401  The |get_x_token| routine converts the first into the second, which has |endv|
 15402  as its command code.
 15403  
 15404  @d span_code=256 {distinct from any character}
 15405  @d cr_code=257 {distinct from |span_code| and from any character}
 15406  @d cr_cr_code=cr_code+1 {this distinguishes \.{\\crcr} from \.{\\cr}}
 15407  @d end_template_token==cs_token_flag+frozen_end_template
 15408  
 15409  @<Put each of \TeX's primitives into the hash table@>=
 15410  primitive("span",tab_mark,span_code);@/
 15411  @!@:span_}{\.{\\span} primitive@>
 15412  primitive("cr",car_ret,cr_code);
 15413  @!@:cr_}{\.{\\cr} primitive@>
 15414  text(frozen_cr):="cr"; eqtb[frozen_cr]:=eqtb[cur_val];@/
 15415  primitive("crcr",car_ret,cr_cr_code);
 15416  @!@:cr_cr_}{\.{\\crcr} primitive@>
 15417  text(frozen_end_template):="endtemplate"; text(frozen_endv):="endtemplate";
 15418  @.endtemplate@>
 15419  eq_type(frozen_endv):=endv; equiv(frozen_endv):=null_list;
 15420  eq_level(frozen_endv):=level_one;@/
 15421  eqtb[frozen_end_template]:=eqtb[frozen_endv];
 15422  eq_type(frozen_end_template):=end_template;
 15423  
 15424  @ @<Cases of |print_cmd_chr|...@>=
 15425  tab_mark: if chr_code=span_code then print_esc("span")
 15426    else chr_cmd("alignment tab character ");
 15427  car_ret: if chr_code=cr_code then print_esc("cr")
 15428    else print_esc("crcr");
 15429  
 15430  @ The preamble is copied directly, except that \.{\\tabskip} causes a change
 15431  to the tabskip glue, thereby possibly expanding macros that immediately
 15432  follow it. An appearance of \.{\\span} also causes such an expansion.
 15433  
 15434  Note that if the preamble contains `\.{\\global\\tabskip}', the `\.{\\global}'
 15435  token survives in the preamble and the `\.{\\tabskip}' defines new
 15436  tabskip glue (locally).
 15437  
 15438  @<Declare the procedure called |get_preamble_token|@>=
 15439  procedure get_preamble_token;
 15440  label restart;
 15441  begin restart: get_token;
 15442  while (cur_chr=span_code)and(cur_cmd=tab_mark) do
 15443    begin get_token; {this token will be expanded once}
 15444    if cur_cmd>max_command then
 15445      begin expand; get_token;
 15446      end;
 15447    end;
 15448  if cur_cmd=endv then
 15449    fatal_error("(interwoven alignment preambles are not allowed)");
 15450  @.interwoven alignment preambles...@>
 15451  if (cur_cmd=assign_glue)and(cur_chr=glue_base+tab_skip_code) then
 15452    begin scan_optional_equals; scan_glue(glue_val);
 15453    if global_defs>0 then geq_define(glue_base+tab_skip_code,glue_ref,cur_val)
 15454    else eq_define(glue_base+tab_skip_code,glue_ref,cur_val);
 15455    goto restart;
 15456    end;
 15457  end;
 15458  
 15459  @ Spaces are eliminated from the beginning of a template.
 15460  
 15461  @<Scan the template \<u_j>...@>=
 15462  p:=hold_head; link(p):=null;
 15463  loop@+  begin get_preamble_token;
 15464    if cur_cmd=mac_param then goto done1;
 15465    if (cur_cmd<=car_ret)and(cur_cmd>=tab_mark)and(align_state=-1000000) then
 15466     if (p=hold_head)and(cur_loop=null)and(cur_cmd=tab_mark)
 15467      then cur_loop:=cur_align
 15468     else  begin print_err("Missing # inserted in alignment preamble");
 15469  @.Missing \# inserted...@>
 15470      help3("There should be exactly one # between &'s, when an")@/
 15471      ("\halign or \valign is being set up. In this case you had")@/
 15472      ("none, so I've put one in; maybe that will work.");
 15473      back_error; goto done1;
 15474      end
 15475    else if (cur_cmd<>spacer)or(p<>hold_head) then
 15476      begin link(p):=get_avail; p:=link(p); info(p):=cur_tok;
 15477      end;
 15478    end;
 15479  done1:
 15480  
 15481  @ @<Scan the template \<v_j>...@>=
 15482  p:=hold_head; link(p):=null;
 15483  loop@+  begin continue: get_preamble_token;
 15484    if (cur_cmd<=car_ret)and(cur_cmd>=tab_mark)and(align_state=-1000000) then
 15485      goto done2;
 15486    if cur_cmd=mac_param then
 15487      begin print_err("Only one # is allowed per tab");
 15488  @.Only one \# is allowed...@>
 15489      help3("There should be exactly one # between &'s, when an")@/
 15490      ("\halign or \valign is being set up. In this case you had")@/
 15491      ("more than one, so I'm ignoring all but the first.");
 15492      error; goto continue;
 15493      end;
 15494    link(p):=get_avail; p:=link(p); info(p):=cur_tok;
 15495    end;
 15496  done2: link(p):=get_avail; p:=link(p);
 15497  info(p):=end_template_token {put \.{\\endtemplate} at the end}
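
@ The two scanning loops above can be approximated on plain strings. The Python sketch below is a simplified model under stated assumptions: characters stand in for tokens, the preamble is given without its final \.{\\cr}, and \.{\\tabskip} and \.{\\span} are ignored. Where \TeX\ would insert a missing \.\# and continue, the hypothetical function |split_preamble| simply raises an exception.

```python
def split_preamble(text):
    """Return (templates, loop_start) for a preamble given without its final cr."""
    fields, depth, cur = [], 0, ""
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "&" and depth == 0:       # a tab mark at brace level zero
            fields.append(cur)
            cur = ""
        else:
            cur += ch
    fields.append(cur)

    templates, loop_start = [], None
    for k, field in enumerate(fields):
        is_last = k == len(fields) - 1
        if "#" not in field:
            if field.strip(" ") == "" and loop_start is None and not is_last:
                loop_start = len(templates)  # like cur_loop:=cur_align
                continue
            raise ValueError("Missing # inserted in alignment preamble")
        u, v = field.split("#", 1)
        u = u.lstrip(" ")                  # spaces vanish from the start of u_j
        v = v.replace("#", "")             # TeX complains, then ignores extra #
        templates.append((u, v))
    return templates, loop_start
```

An empty field followed by a tab mark is how the sketch recognizes the doubled tab mark that starts a periodic preamble; a tab mark hidden inside braces does not end a template.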
 15498  
 15499  @ The tricky part about alignments is getting the templates into the
 15500  scanner at the right time, and recovering control when a row or column
 15501  is finished.
 15502  
 15503  We usually begin a row after each \.{\\cr} has been sensed, unless that
 15504  \.{\\cr} is followed by \.{\\noalign} or by the right brace that terminates
 15505  the alignment. The |align_peek| routine is used to look ahead and do
 15506  the right thing; it either gets a new row started, or gets a \.{\\noalign}
 15507  started, or finishes off the alignment.
 15508  
 15509  @<Declare the procedure called |align_peek|@>=
 15510  procedure align_peek;
 15511  label restart;
 15512  begin restart: align_state:=1000000; @<Get the next non-blank non-call token@>;
 15513  if cur_cmd=no_align then
 15514    begin scan_left_brace; new_save_level(no_align_group);
 15515    if mode=-vmode then normal_paragraph;
 15516    end
 15517  else if cur_cmd=right_brace then fin_align
 15518  else if (cur_cmd=car_ret)and(cur_chr=cr_cr_code) then
 15519    goto restart {ignore \.{\\crcr}}
 15520  else  begin init_row; {start a new row}
 15521    init_col; {start a new column and replace what we peeked at}
 15522    end;
 15523  end;
 15524  
 15525  @ To start a row (i.e., a `row' that rhymes with `dough' but not with `bough'),
 15526  we enter a new semantic level, copy the first tabskip glue, and change
 15527  from internal vertical mode to restricted horizontal mode or vice versa.
 15528  The |space_factor| and |prev_depth| are not used on this semantic level,
 15529  but we clear them to zero just to be tidy.
 15530  
 15531  @p @t\4@>@<Declare the procedure called |init_span|@>@t@>@/
 15532  procedure init_row;
 15533  begin push_nest; mode:=(-hmode-vmode)-mode;
 15534  if mode=-hmode then space_factor:=0 @+else prev_depth:=0;
 15535  tail_append(new_glue(glue_ptr(preamble)));
 15536  subtype(tail):=tab_skip_code+1;@/
 15537  cur_align:=link(preamble); cur_tail:=cur_head; init_span(cur_align);
 15538  end;
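
@ The assignment |mode:=(-hmode-vmode)-mode| in |init_row| swaps the two negative mode codes without a conditional. A small Python check makes the arithmetic concrete; the numeric values chosen for |VMODE| and |HMODE| are placeholders for this sketch, since the identity holds for any two codes.

```python
# Placeholder values; only the identity below matters.
VMODE, HMODE = 1, 102

def flip(mode):
    """Model of mode:=(-hmode-vmode)-mode for mode in {-vmode, -hmode}."""
    return (-HMODE - VMODE) - mode
```

Since the two codes sum to a constant, subtracting the current mode from that constant yields the other one.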
 15539  
 15540  @ The parameter to |init_span| is a pointer to the alignrecord where the
 15541  next column or group of columns will begin. A new semantic level is
 15542  entered, so that the columns will generate a list for subsequent packaging.
 15543  
 15544  @<Declare the procedure called |init_span|@>=
 15545  procedure init_span(@!p:pointer);
 15546  begin push_nest;
 15547  if mode=-hmode then space_factor:=1000
 15548  else  begin prev_depth:=ignore_depth; normal_paragraph;
 15549    end;
 15550  cur_span:=p;
 15551  end;
 15552  
 15553  @ When a column begins, we assume that |cur_cmd| is either |omit| or else
 15554  the current token should be put back into the input until the \<u_j>
 15555  template has been scanned.  (Note that |cur_cmd| might be |tab_mark| or
 15556  |car_ret|.)  We also assume that |align_state| is approximately 1000000 at
 15557  this time.  We remain in the same mode, and start the template if it is
 15558  called for.
 15559  
 15560  @p procedure init_col;
 15561  begin extra_info(cur_align):=cur_cmd;
 15562  if cur_cmd=omit then align_state:=0
 15563  else  begin back_input; begin_token_list(u_part(cur_align),u_template);
 15564    end; {now |align_state=1000000|}
 15565  end;
 15566  
 15567  @ The scanner sets |align_state| to zero when the \<u_j> template ends. When
 15568  a subsequent \.{\\cr} or \.{\\span} or tab mark occurs with |align_state=0|,
 15569  the scanner activates the following code, which fires up the \<v_j> template.
 15570  We need to remember the |cur_chr|, which is either |cr_cr_code|, |cr_code|,
 15571  |span_code|, or a character code, depending on how the column text has ended.
 15572  
 15573  This part of the program had better not be activated when the preamble
 15574  to another alignment is being scanned, or when no alignment preamble is active.
 15575  
 15576  @<Insert the \(v)\<v_j>...@>=
 15577  begin if (scanner_status=aligning) or (cur_align=null) then
 15578    fatal_error("(interwoven alignment preambles are not allowed)");
 15579  @.interwoven alignment preambles...@>
 15580  cur_cmd:=extra_info(cur_align); extra_info(cur_align):=cur_chr;
 15581  if cur_cmd=omit then begin_token_list(omit_template,v_template)
 15582  else begin_token_list(v_part(cur_align),v_template);
 15583  align_state:=1000000; goto restart;
 15584  end
 15585  
 15586  @ The token list |omit_template| just referred to is a constant token
 15587  list that contains the special control sequence \.{\\endtemplate} only.
 15588  
 15589  @<Initialize the special...@>=
 15590  info(omit_template):=end_template_token; {|link(omit_template)=null|}
 15591  
 15592  @ When the |endv| command at the end of a \<v_j> template comes through the
 15593  scanner, things really start to happen; and it is the |fin_col| routine
 15594  that makes them happen. This routine returns |true| if a row as well as a
 15595  column has been finished.
 15596  
 15597  @p function fin_col:boolean;
 15598  label exit;
 15599  var p:pointer; {the alignrecord after the current one}
 15600  @!q,@!r:pointer; {temporary pointers for list manipulation}
 15601  @!s:pointer; {a new span node}
 15602  @!u:pointer; {a new unset box}
 15603  @!w:scaled; {natural width}
 15604  @!o:glue_ord; {order of infinity}
 15605  @!n:halfword; {span counter}
 15606  begin if cur_align=null then confusion("endv");
 15607  q:=link(cur_align);@+if q=null then confusion("endv");
 15608  @:this can't happen endv}{\quad endv@>
 15609  if align_state<500000 then
 15610    fatal_error("(interwoven alignment preambles are not allowed)");
 15611  @.interwoven alignment preambles...@>
 15612  p:=link(q);
 15613  @<If the preamble list has been traversed, check that the row has ended@>;
 15614  if extra_info(cur_align)<>span_code then
 15615    begin unsave; new_save_level(align_group);@/
 15616    @<Package an unset box for the current column and record its width@>;
 15617    @<Copy the tabskip glue between columns@>;
 15618    if extra_info(cur_align)>=cr_code then
 15619      begin fin_col:=true; return;
 15620      end;
 15621    init_span(p);
 15622    end;
 15623  align_state:=1000000; @<Get the next non-blank non-call token@>;
 15624  cur_align:=p;
 15625  init_col; fin_col:=false;
 15626  exit: end;
 15627  
 15628  @ @<If the preamble list has been traversed, check that the row has ended@>=
 15629  if (p=null)and(extra_info(cur_align)<cr_code) then
 15630   if cur_loop<>null then @<Lengthen the preamble periodically@>
 15631   else  begin print_err("Extra alignment tab has been changed to ");
 15632  @.Extra alignment tab...@>
 15633    print_esc("cr");
 15634    help3("You have given more \span or & marks than there were")@/
 15635    ("in the preamble to the \halign or \valign now in progress.")@/
 15636    ("So I'll assume that you meant to type \cr instead.");
 15637    extra_info(cur_align):=cr_code; error;
 15638    end
 15639  
 15640  @ @<Lengthen the preamble...@>=
 15641  begin link(q):=new_null_box; p:=link(q); {a new alignrecord}
 15642  info(p):=end_span; width(p):=null_flag; cur_loop:=link(cur_loop);
 15643  @<Copy the templates from node |cur_loop| into node |p|@>;
 15644  cur_loop:=link(cur_loop);
 15645  link(p):=new_glue(glue_ptr(cur_loop));
 15646  subtype(link(p)):=tab_skip_code+1;
 15647  end
 15648  
 15649  @ @<Copy the templates from node |cur_loop| into node |p|@>=
 15650  q:=hold_head; r:=u_part(cur_loop);
 15651  while r<>null do
 15652    begin link(q):=get_avail; q:=link(q); info(q):=info(r); r:=link(r);
 15653    end;
 15654  link(q):=null; u_part(p):=link(hold_head);
 15655  q:=hold_head; r:=v_part(cur_loop);
 15656  while r<>null do
 15657    begin link(q):=get_avail; q:=link(q); info(q):=info(r); r:=link(r);
 15658    end;
 15659  link(q):=null; v_part(p):=link(hold_head)
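
@ The effect of periodic lengthening can be illustrated with a list-based model in Python. This is a sketch under simplifying assumptions: templates are stored as pairs in a list, tabskip glue is left out, and the hypothetical class |Preamble| raises an exception where \TeX\ would report an extra alignment tab and recover.

```python
class Preamble:
    def __init__(self, templates, loop_start):
        self.templates = list(templates)   # one (u, v) pair per alignrecord
        self.cur_loop = loop_start         # index of the next record to copy, or None

    def template_for(self, column):
        """Return the (u, v) pair for a 0-based column, lengthening if needed."""
        while column >= len(self.templates):
            if self.cur_loop is None:
                raise IndexError("Extra alignment tab")
            self.templates.append(self.templates[self.cur_loop])
            self.cur_loop += 1             # move on to the next record to copy
        return self.templates[column]
```

Because the copies are appended to the same list that |cur_loop| walks along, the copied columns are themselves copied later, and the pattern repeats indefinitely.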
 15660  
 15661  @ @<Copy the tabskip glue...@>=
 15662  tail_append(new_glue(glue_ptr(link(cur_align))));
 15663  subtype(tail):=tab_skip_code+1
 15664  
 15665  @ @<Package an unset...@>=
 15666  begin if mode=-hmode then
 15667    begin adjust_tail:=cur_tail; u:=hpack(link(head),natural); w:=width(u);
 15668    cur_tail:=adjust_tail; adjust_tail:=null;
 15669    end
 15670  else  begin u:=vpackage(link(head),natural,0); w:=height(u);
 15671    end;
 15672  n:=min_quarterword; {this represents a span count of 1}
 15673  if cur_span<>cur_align then @<Update width entry for spanned columns@>
 15674  else if w>width(cur_align) then width(cur_align):=w;
 15675  type(u):=unset_node; span_count(u):=n;@/
 15676  @<Determine the stretch order@>;
 15677  glue_order(u):=o; glue_stretch(u):=total_stretch[o];@/
 15678  @<Determine the shrink order@>;
 15679  glue_sign(u):=o; glue_shrink(u):=total_shrink[o];@/
 15680  pop_nest; link(tail):=u; tail:=u;
 15681  end
 15682  
 15683  @ A span node is a 2-word record containing |width|, |info|, and |link|
 15684  fields. The |link| field is not really a link; it indicates the number of
 15685  spanned columns; the |info| field points to a span node for the same
 15686  starting column, having a greater extent of spanning, or to |end_span|,
 15687  which has the largest possible |link| field; the |width| field holds the
 15688  largest natural width corresponding to a particular set of spanned columns.
 15689  
 15690  A list of the maximum widths so far, for spanned columns starting at a
 15691  given column, begins with the |info| field of the alignrecord for that
 15692  column.
 15693  
 15694  @d span_node_size=2 {number of |mem| words for a span node}
 15695  
 15696  @<Initialize the special list heads...@>=
 15697  link(end_span):=max_quarterword+1; info(end_span):=null;
 15698  
 15699  @ @<Update width entry for spanned columns@>=
 15700  begin q:=cur_span;
 15701  repeat incr(n); q:=link(link(q));
 15702  until q=cur_align;
 15703  if n>max_quarterword then confusion("256 spans"); {this can happen, but won't}
 15704  @^system dependencies@>
 15705  @:this can't happen 256 spans}{\quad 256 spans@>
 15706  q:=cur_span; while link(info(q))<n do q:=info(q);
 15707  if link(info(q))>n then
 15708    begin s:=get_node(span_node_size); info(s):=info(q); link(s):=n;
 15709    info(q):=s; width(s):=w;
 15710    end
 15711  else if width(info(q))<w then width(info(q)):=w;
 15712  end
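
@ The search-then-insert-or-maximize step above amounts to maintaining a list sorted by span count. In the following Python sketch, which is an illustration and not \TeX's data structure, a Python list of two-element lists stands in for the chain of span nodes, and |n| counts the columns spanned (at least two) instead of being offset by |min_quarterword|. The function name |record_span| is an assumption.

```python
def record_span(spans, n, w):
    """spans: list of [count, width] pairs sorted by count; updated in place."""
    k = 0
    while k < len(spans) and spans[k][0] < n:   # skip shorter spans
        k += 1
    if k == len(spans) or spans[k][0] > n:
        spans.insert(k, [n, w])                 # a new span node
    elif spans[k][1] < w:
        spans[k][1] = w                         # keep the larger width
```

Reaching the end of the Python list corresponds to meeting |end_span|, whose |link| field exceeds every legal span count.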
 15713  
 15714  @ At the end of a row, we append an unset box to the current vlist (for
 15715  \.{\\halign}) or the current hlist (for \.{\\valign}). This unset box
 15716  contains the unset boxes for the columns, separated by the tabskip glue.
 15717  Everything will be set later.
 15718  
 15719  @p procedure fin_row;
 15720  var p:pointer; {the new unset box}
 15721  begin if mode=-hmode then
 15722    begin p:=hpack(link(head),natural);
 15723    pop_nest; append_to_vlist(p);
 15724    if cur_head<>cur_tail then
 15725      begin link(tail):=link(cur_head); tail:=cur_tail;
 15726      end;
 15727    end
 15728  else  begin p:=vpack(link(head),natural); pop_nest;
 15729    link(tail):=p; tail:=p; space_factor:=1000;
 15730    end;
 15731  type(p):=unset_node; glue_stretch(p):=0;
 15732  if every_cr<>null then begin_token_list(every_cr,every_cr_text);
 15733  align_peek;
 15734  end; {note that |glue_shrink(p)=0| since |glue_shrink==shift_amount|}
 15735  
 15736  @ Finally, we will reach the end of the alignment, and we can breathe a
 15737  sigh of relief that memory hasn't overflowed. All the unset boxes will now be
 15738  set so that the columns line up, taking due account of spanned columns.
 15739  
 15740  @p procedure@?do_assignments; forward;@t\2@>@/
 15741  procedure@?resume_after_display; forward;@t\2@>@/
 15742  procedure@?build_page; forward;@t\2@>@/
 15743  procedure fin_align;
 15744  var @!p,@!q,@!r,@!s,@!u,@!v: pointer; {registers for the list operations}
 15745  @!t,@!w:scaled; {width of column}
 15746  @!o:scaled; {shift offset for unset boxes}
 15747  @!n:halfword; {matching span amount}
 15748  @!rule_save:scaled; {temporary storage for |overfull_rule|}
 15749  @!aux_save:memory_word; {temporary storage for |aux|}
 15750  begin if cur_group<>align_group then confusion("align1");
 15751  @:this can't happen align}{\quad align@>
 15752  unsave; {that |align_group| was for individual entries}
 15753  if cur_group<>align_group then confusion("align0");
 15754  unsave; {that |align_group| was for the whole alignment}
 15755  if nest[nest_ptr-1].mode_field=mmode then o:=display_indent
 15756    else o:=0;
 15757  @<Go through the preamble list, determining the column widths and
 15758    changing the alignrecords to dummy unset boxes@>;
 15759  @<Package the preamble list, to determine the actual tabskip glue amounts,
 15760    and let |p| point to this prototype box@>;
 15761  @<Set the glue in all the unset boxes of the current list@>;
 15762  flush_node_list(p); pop_alignment;
 15763  @<Insert the \(c)current list into its environment@>;
 15764  end;@/
 15765  @t\4@>@<Declare the procedure called |align_peek|@>
 15766  
 15767  @ It's time now to dismantle the preamble list and to compute the column
 15768  widths. Let $w_{ij}$ be the maximum of the natural widths of all entries
 15769  that span columns $i$ through $j$, inclusive. The alignrecord for column~$i$
 15770  contains $w_{ii}$ in its |width| field, and there is also a linked list of
 15771  the nonzero $w_{ij}$ for increasing $j$, accessible via the |info| field;
 15772  these span nodes contain the value $j-i+|min_quarterword|$ in their
 15773  |link| fields. The values of $w_{ii}$ were initialized to |null_flag|, which
 15774  we regard as $-\infty$.
 15775  
 15776  The final column widths are defined by the formula
 15777  $$w_j=\max_{1\L i\L j}\biggl( w_{ij}-\sum_{i\L k<j}(t_k+w_k)\biggr),$$
 15778  where $t_k$ is the natural width of the tabskip glue between columns
 15779  $k$ and~$k+1$. However, if $w_{ij}=-\infty$ for all |i| in the range
 15780  |1<=i<=j| (i.e., if every entry that involved column~|j| also involved
 15781  column~|j+1|), we let $w_j=0$, and we zero out the tabskip glue after
 15782  column~|j|.
 15783  
 15784  \TeX\ computes these values by using the following scheme: First $w_1=w_{11}$.
 15785  Then replace $w_{2j}$ by $\max(w_{2j},w_{1j}-t_1-w_1)$, for all $j>1$.
 15786  Then $w_2=w_{22}$. Then replace $w_{3j}$ by $\max(w_{3j},w_{2j}-t_2-w_2)$
 15787  for all $j>2$; and so on. If any $w_j$ turns out to be $-\infty$, its
 15788  value is changed to zero and so is the next tabskip.
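
The agreement between the closed formula and the step-by-step scheme can be checked with a small Python sketch. It is an illustration, not \TeX's implementation: a dictionary keyed by $(i,j)$ replaces the span-node lists, columns are numbered from 1, and the function names are assumptions. The list |tabskip| holds $t_k$ at index $k$, with index 0 unused.

```python
NEG_INF = float("-inf")

def widths_by_formula(n, spans, tabskip):
    """Evaluate the max-formula directly; spans maps (i, j) to w_ij."""
    t = list(tabskip)
    w = [0] * (n + 1)
    for j in range(1, n + 1):
        best = NEG_INF
        for i in range(1, j + 1):
            gap = sum(t[k] + w[k] for k in range(i, j))
            best = max(best, spans.get((i, j), NEG_INF) - gap)
        if best == NEG_INF:        # every entry through column j went on to j+1
            best = 0
            t[j] = 0
        w[j] = best
    return w[1:], t[1:]

def widths_by_merging(n, spans, tabskip):
    """Push each column's span widths into the next column, left to right."""
    t = list(tabskip)
    cur = dict(spans)
    w = [0] * (n + 1)
    for i in range(1, n + 1):
        w[i] = cur.get((i, i), NEG_INF)
        if w[i] == NEG_INF:
            w[i] = 0
            t[i] = 0
        for j in range(i + 1, n + 1):
            if (i, j) in cur:
                carried = cur[(i, j)] - t[i] - w[i]
                cur[(i + 1, j)] = max(cur.get((i + 1, j), NEG_INF), carried)
    return w[1:], t[1:]
```

For example, with $w_{11}=30$, $w_{12}=100$, $w_{22}=20$, and $t_1=10$, both functions give $w_2=\max(20,100-10-30)=60$.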
 15789  
 15790  @<Go through the preamble list,...@>=
 15791  q:=link(preamble);
 15792  repeat flush_list(u_part(q)); flush_list(v_part(q));
 15793  p:=link(link(q));
 15794  if width(q)=null_flag then
 15795    @<Nullify |width(q)| and the tabskip glue following this column@>;
 15796  if info(q)<>end_span then
 15797    @<Merge the widths in the span nodes of |q| with those of |p|,
 15798      destroying the span nodes of |q|@>;
 15799  type(q):=unset_node; span_count(q):=min_quarterword; height(q):=0;
 15800  depth(q):=0; glue_order(q):=normal; glue_sign(q):=normal;
 15801  glue_stretch(q):=0; glue_shrink(q):=0; q:=p;
 15802  until q=null
 15803  
 15804  @ @<Nullify |width(q)| and the tabskip glue following this column@>=
 15805  begin width(q):=0; r:=link(q); s:=glue_ptr(r);
 15806  if s<>zero_glue then
 15807    begin add_glue_ref(zero_glue); delete_glue_ref(s);
 15808    glue_ptr(r):=zero_glue;
 15809    end;
 15810  end
 15811  
 15812  @ Merging of two span-node lists is a typical exercise in the manipulation of
 15813  linearly linked data structures. The essential invariant in the following
 15814  |repeat| loop is that we want to dispense with node |r|, in |q|'s list,
 15815  and |u| is its successor; all nodes of |p|'s list up to and including |s|
 15816  have been processed, and the successor of |s| matches |r| or precedes |r|
 15817  or follows |r|, according as |link(r)=n| or |link(r)>n| or |link(r)<n|.
 15818  
 15819  @<Merge the widths...@>=
 15820  begin t:=width(q)+width(glue_ptr(link(q)));
 15821  r:=info(q); s:=end_span; info(s):=p; n:=min_quarterword+1;
 15822  repeat width(r):=width(r)-t; u:=info(r);
 15823  while link(r)>n do
 15824    begin s:=info(s); n:=link(info(s))+1;
 15825    end;
 15826  if link(r)<n then
 15827    begin info(r):=info(s); info(s):=r; decr(link(r)); s:=r;
 15828    end
 15829  else  begin if width(r)>width(info(s)) then width(info(s)):=width(r);
 15830    free_node(r,span_node_size);
 15831    end;
 15832  r:=u;
 15833  until r=end_span;
 15834  end
 15835  
 15836  @ Now the preamble list has been converted to a list of alternating unset
 15837  boxes and tabskip glue, where the box widths are equal to the final
 15838  column sizes. In case of \.{\\valign}, we change the widths to heights,
 15839  so that a correct error message will be produced if the alignment is
 15840  overfull or underfull.
 15841  
 15842  @<Package the preamble list...@>=
 15843  save_ptr:=save_ptr-2; pack_begin_line:=-mode_line;
 15844  if mode=-vmode then
 15845    begin rule_save:=overfull_rule;
 15846    overfull_rule:=0; {prevent rule from being packaged}
 15847    p:=hpack(preamble,saved(1),saved(0)); overfull_rule:=rule_save;
 15848    end
 15849  else  begin q:=link(preamble);
 15850    repeat height(q):=width(q); width(q):=0; q:=link(link(q));
 15851    until q=null;
 15852    p:=vpack(preamble,saved(1),saved(0));
 15853    q:=link(preamble);
 15854    repeat width(q):=height(q); height(q):=0; q:=link(link(q));
 15855    until q=null;
 15856    end;
 15857  pack_begin_line:=0
 15858  
 15859  @ @<Set the glue in all the unset...@>=
 15860  q:=link(head); s:=head;
 15861  while q<>null do
 15862    begin if not is_char_node(q) then
 15863      if type(q)=unset_node then
 15864        @<Set the unset box |q| and the unset boxes in it@>
 15865      else if type(q)=rule_node then
 15866        @<Make the running dimensions in rule |q| extend to the
 15867          boundaries of the alignment@>;
 15868    s:=q; q:=link(q);
 15869    end
 15870  
 15871  @ @<Make the running dimensions in rule |q| extend...@>=
 15872  begin if is_running(width(q)) then width(q):=width(p);
 15873  if is_running(height(q)) then height(q):=height(p);
 15874  if is_running(depth(q)) then depth(q):=depth(p);
 15875  if o<>0 then
 15876    begin r:=link(q); link(q):=null; q:=hpack(q,natural);
 15877    shift_amount(q):=o; link(q):=r; link(s):=q;
 15878    end;
 15879  end
 15880  
 15881  @ The unset box |q| represents a row that contains one or more unset boxes,
 15882  depending on how soon \.{\\cr} occurred in that row.
 15883  
 15884  @<Set the unset box |q| and the unset boxes in it@>=
 15885  begin if mode=-vmode then
 15886    begin type(q):=hlist_node; width(q):=width(p);
 15887    end
 15888  else  begin type(q):=vlist_node; height(q):=height(p);
 15889    end;
 15890  glue_order(q):=glue_order(p); glue_sign(q):=glue_sign(p);
 15891  glue_set(q):=glue_set(p); shift_amount(q):=o;
 15892  r:=link(list_ptr(q)); s:=link(list_ptr(p));
 15893  repeat @<Set the glue in node |r| and change it from an unset node@>;
 15894  r:=link(link(r)); s:=link(link(s));
 15895  until r=null;
 15896  end
 15897  
 15898  @ A box made from spanned columns will be followed by tabskip glue nodes and
 15899  by empty boxes as if there were no spanning. This permits perfect alignment
 15900  of subsequent entries, and it prevents values that depend on floating point
 15901  arithmetic from entering into the dimensions of any boxes.
 15902  
 15903  @<Set the glue in node |r|...@>=
 15904  n:=span_count(r); t:=width(s); w:=t; u:=hold_head;
 15905  while n>min_quarterword do
 15906    begin decr(n);
 15907    @<Append tabskip glue and an empty box to list |u|,
 15908      and update |s| and |t| as the prototype nodes are passed@>;
 15909    end;
 15910  if mode=-vmode then
 15911    @<Make the unset node |r| into an |hlist_node| of width |w|,
 15912      setting the glue as if the width were |t|@>
 15913  else @<Make the unset node |r| into a |vlist_node| of height |w|,
 15914      setting the glue as if the height were |t|@>;
 15915  shift_amount(r):=0;
 15916  if u<>hold_head then {append blank boxes to account for spanned nodes}
 15917    begin link(u):=link(r); link(r):=link(hold_head); r:=u;
 15918    end
 15919  
 15920  @ @<Append tabskip glue and an empty box to list |u|...@>=
 15921  s:=link(s); v:=glue_ptr(s); link(u):=new_glue(v); u:=link(u);
 15922  subtype(u):=tab_skip_code+1; t:=t+width(v);
 15923  if glue_sign(p)=stretching then
 15924    begin if stretch_order(v)=glue_order(p) then
 15925      t:=t+round(float(glue_set(p))*stretch(v));
 15926  @^real multiplication@>
 15927    end
 15928  else if glue_sign(p)=shrinking then
 15929    begin if shrink_order(v)=glue_order(p) then
 15930      t:=t-round(float(glue_set(p))*shrink(v));
 15931    end;
 15932  s:=link(s); link(u):=new_null_box; u:=link(u); t:=t+width(s);
 15933  if mode=-vmode then width(u):=width(s)@+else
 15934    begin type(u):=vlist_node; height(u):=width(s);
 15935    end
 15936  
 15937  @ @<Make the unset node |r| into an |hlist_node| of width |w|...@>=
 15938  begin height(r):=height(q); depth(r):=depth(q);
 15939  if t=width(r) then
 15940    begin glue_sign(r):=normal; glue_order(r):=normal;
 15941    set_glue_ratio_zero(glue_set(r));
 15942    end
 15943  else if t>width(r) then
 15944    begin glue_sign(r):=stretching;
 15945    if glue_stretch(r)=0 then set_glue_ratio_zero(glue_set(r))
 15946    else glue_set(r):=unfloat((t-width(r))/glue_stretch(r));
 15947  @^real division@>
 15948    end
 15949  else  begin glue_order(r):=glue_sign(r); glue_sign(r):=shrinking;
 15950    if glue_shrink(r)=0 then set_glue_ratio_zero(glue_set(r))
 15951    else if (glue_order(r)=normal)and(width(r)-t>glue_shrink(r)) then
 15952      set_glue_ratio_one(glue_set(r))
 15953    else glue_set(r):=unfloat((width(r)-t)/glue_shrink(r));
 15954    end;
 15955  width(r):=w; type(r):=hlist_node;
 15956  end
 15957  
 15958  @ @<Make the unset node |r| into a |vlist_node| of height |w|...@>=
 15959  begin width(r):=width(q);
 15960  if t=height(r) then
 15961    begin glue_sign(r):=normal; glue_order(r):=normal;
 15962    set_glue_ratio_zero(glue_set(r));
 15963    end
 15964  else if t>height(r) then
 15965    begin glue_sign(r):=stretching;
 15966    if glue_stretch(r)=0 then set_glue_ratio_zero(glue_set(r))
 15967    else glue_set(r):=unfloat((t-height(r))/glue_stretch(r));
 15968  @^real division@>
 15969    end
 15970  else  begin glue_order(r):=glue_sign(r); glue_sign(r):=shrinking;
 15971    if glue_shrink(r)=0 then set_glue_ratio_zero(glue_set(r))
 15972    else if (glue_order(r)=normal)and(height(r)-t>glue_shrink(r)) then
 15973      set_glue_ratio_one(glue_set(r))
 15974    else glue_set(r):=unfloat((height(r)-t)/glue_shrink(r));
 15975    end;
 15976  height(r):=w; type(r):=vlist_node;
 15977  end
 15978  
 15979  @ We now have a completed alignment, in the list that starts at |head|
 15980  and ends at |tail|. This list will be merged with the one that encloses
 15981  it. (In case the enclosing mode is |mmode|, for displayed formulas,
 15982  we will need to insert glue before and after the display; that part of the
 15983  program will be deferred until we're more familiar with such operations.)
 15984  
 15985  In restricted horizontal mode, the |clang| part of |aux| is undefined;
 15986  an over-cautious \PASCAL\ runtime system may complain about this.
 15987  @^dirty \PASCAL@>
 15988  
 15989  @<Insert the \(c)current list into its environment@>=
 15990  aux_save:=aux; p:=link(head); q:=tail; pop_nest;
 15991  if mode=mmode then @<Finish an alignment in a display@>
 15992  else  begin aux:=aux_save; link(tail):=p;
 15993    if p<>null then tail:=q;
 15994    if mode=vmode then build_page;
 15995    end
 15996  
 15997  @* \[38] Breaking paragraphs into lines.
 15998  We come now to what is probably the most interesting algorithm of \TeX:
 15999  the mechanism for choosing the ``best possible'' breakpoints that yield
 16000  the individual lines of a paragraph. \TeX's line-breaking algorithm takes
 16001  a given horizontal list and converts it to a sequence of boxes that are
 16002  appended to the current vertical list. In the course of doing this, it
 16003  creates a special data structure containing three kinds of records that are
 16004  not used elsewhere in \TeX. Such nodes are created while a paragraph is
 16005  being processed, and they are destroyed afterwards; thus, the other parts
 16006  of \TeX\ do not need to know anything about how line-breaking is done.
 16007  
 16008  The method used here is based on an approach devised by Michael F. Plass and
 16009  @^Plass, Michael Frederick@>
 16010  @^Knuth, Donald Ervin@>
 16011  the author in 1977, subsequently generalized and improved by the same two
 16012  people in 1980. A detailed discussion appears in {\sl Software---Practice
 16013  and Experience \bf11} (1981), 1119--1184, where it is shown that the
 16014  line-breaking problem can be regarded as a special case of the problem of
 16015  computing the shortest path in an acyclic network. The cited paper includes
 16016  numerous examples and describes the history of line breaking as it has been
 16017  practiced by printers through the ages. The present implementation adds two
 16018  new ideas to the algorithm of 1980: Memory space requirements are considerably
 16019  reduced by using smaller records for inactive nodes than for active ones,
 16020  and arithmetic overflow is avoided by using ``delta distances'' instead of
 16021  keeping track of the total distance from the beginning of the paragraph to the
 16022  current point.
 16023  
 16024  @ The |line_break| procedure should be invoked only in horizontal mode; it
 16025  leaves that mode and places its output into the current vlist of the
 16026  enclosing vertical mode (or internal vertical mode).
 16027  There is one explicit parameter:  |final_widow_penalty| is the amount of
 16028  additional penalty to be inserted before the final line of the paragraph.
 16029  
 16030  There are also a number of implicit parameters: The hlist to be broken
 16031  starts at |link(head)|, and it is nonempty. The value of |prev_graf| in the
 16032  enclosing semantic level tells where the paragraph should begin in the
 16033  sequence of line numbers, in case hanging indentation or \.{\\parshape}
 16034  is in use; |prev_graf| is zero unless this paragraph is being continued
 16035  after a displayed formula.  Other implicit parameters, such as the
 16036  |par_shape_ptr| and various penalties to use for hyphenation, etc., appear
 16037  in |eqtb|.
 16038  
 16039  After |line_break| has acted, it will have updated the current vlist and the
 16040  value of |prev_graf|. Furthermore, the global variable |just_box| will
 16041  point to the final box created by |line_break|, so that the width of this
 16042  line can be ascertained when it is necessary to decide whether to use
 16043  |above_display_skip| or |above_display_short_skip| before a displayed formula.
 16044  
 16045  @<Glob...@>=
 16046  @!just_box:pointer; {the |hlist_node| for the last line of the new paragraph}
 16047  
 16048  @ Since |line_break| is a rather lengthy procedure---sort of a small world unto
 16049  itself---we must build it up little by little, somewhat more cautiously
 16050  than we have done with the simpler procedures of \TeX. Here is the
 16051  general outline.
 16052  
 16053  @p@t\4@>@<Declare subprocedures for |line_break|@>
 16054  procedure line_break(@!final_widow_penalty:integer);
 16055  label done,done1,done2,done3,done4,done5,continue;
 16056  var @<Local variables for line breaking@>@;
 16057  begin pack_begin_line:=mode_line; {this is for over/underfull box messages}
 16058  @<Get ready to start line breaking@>;
 16059  @<Find optimal breakpoints@>;
 16060  @<Break the paragraph at the chosen breakpoints, justify the resulting lines
 16061  to the correct widths, and append them to the current vertical list@>;
 16062  @<Clean up the memory by removing the break nodes@>;
 16063  pack_begin_line:=0;
 16064  end;
 16065  
 16066  @ The first task is to move the list from |head| to |temp_head| and go
 16067  into the enclosing semantic level. We also append the \.{\\parfillskip}
 16068  glue to the end of the paragraph, removing a space (or other glue node) if
 16069  it was there, since spaces usually precede blank lines and instances of
 16070  `\.{\$\$}'. The |par_fill_skip| is preceded by an infinite penalty, so
 16071  it will never be considered as a potential breakpoint.
 16072  
 16073  This code assumes that a |glue_node| and a |penalty_node| occupy the
 16074  same number of |mem|~words.
 16075  @^data structure assumptions@>
 16076  
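The treatment of the end of the paragraph can be pictured with a small Python sketch. It assumes a hypothetical representation in which the hlist is a Python list of |(kind, value)| pairs; the function name |prepare_paragraph_end| is likewise invented for the illustration, and the reference counting done by |delete_glue_ref| is ignored.

```python
INF_PENALTY = 10000  # the value at which a penalty forbids a break

def prepare_paragraph_end(hlist, par_fill_skip):
    """Return a copy of `hlist` (a nonempty list of (kind, value) pairs)
    that ends with an infinite penalty followed by the parfillskip glue."""
    nodes = list(hlist)
    if nodes and nodes[-1][0] == "glue":
        # a final space is converted in place into the penalty
        nodes[-1] = ("penalty", INF_PENALTY)
    else:
        nodes.append(("penalty", INF_PENALTY))
    nodes.append(("glue", par_fill_skip))
    return nodes
```

Overwriting the final glue node in place, instead of deleting it and allocating a penalty, is what makes the assumption about equal node sizes necessary in the real program.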
 16077  @<Get ready to start...@>=
 16078  link(temp_head):=link(head);
 16079  if is_char_node(tail) then tail_append(new_penalty(inf_penalty))
 16080  else if type(tail)<>glue_node then tail_append(new_penalty(inf_penalty))
 16081  else  begin type(tail):=penalty_node; delete_glue_ref(glue_ptr(tail));
 16082    flush_node_list(leader_ptr(tail)); penalty(tail):=inf_penalty;
 16083    end;
 16084  link(tail):=new_param_glue(par_fill_skip_code);
 16085  init_cur_lang:=prev_graf mod @'200000;
 16086  init_l_hyf:=prev_graf div @'20000000;
 16087  init_r_hyf:=(prev_graf div @'200000) mod @'100;
 16088  pop_nest;
 16089  
 16090  @ When looking for optimal line breaks, \TeX\ creates a ``break node'' for
 16091  each break that is {\sl feasible}, in the sense that there is a way to end
 16092  a line at the given place without requiring any line to stretch more than
 16093  a given tolerance. A break node is characterized by three things: the position
 16094  of the break (which is a pointer to a |glue_node|, |math_node|, |penalty_node|,
 16095  or |disc_node|); the ordinal number of the line that will follow this
 16096  breakpoint; and the fitness classification of the line that has just
 16097  ended, i.e., |tight_fit|, |decent_fit|, |loose_fit|, or |very_loose_fit|.
 16098  
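As a rough illustration of the four fitness classes, here is a Python sketch that classifies a line by its adjustment ratio. The function |fitness_class| and the floating-point ratio are assumptions made for the example; \TeX\ itself works with integer badness values, so the boundaries at 0.5 and 1.0 are only the approximate ones stated in the definitions that follow.

```python
VERY_LOOSE_FIT, LOOSE_FIT, DECENT_FIT, TIGHT_FIT = 0, 1, 2, 3

def fitness_class(ratio):
    """Classify a line by its adjustment ratio: positive values are the
    fraction of stretchability used, negative values the fraction of
    shrinkability used."""
    if ratio > 1.0:
        return VERY_LOOSE_FIT
    if ratio > 0.5:
        return LOOSE_FIT
    if ratio < -0.5:
        return TIGHT_FIT
    return DECENT_FIT
```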
 16099  @d tight_fit=3 {fitness classification for lines shrinking 0.5 to 1.0 of their
 16100    shrinkability}
 16101  @d loose_fit=1 {fitness classification for lines stretching 0.5 to 1.0 of their
 16102    stretchability}
 16103  @d very_loose_fit=0 {fitness classification for lines stretching more than
 16104    their stretchability}
 16105  @d decent_fit=2 {fitness classification for all other lines}
 16106  
 16107  @ The algorithm essentially determines the best possible way to achieve
 16108  each feasible combination of position, line, and fitness. Thus, it answers
 16109  questions like, ``What is the best way to break the opening part of the
 16110  paragraph so that the fourth line is a tight line ending at such-and-such
 16111  a place?'' However, the fact that all lines are to be the same length
 16112  after a certain point makes it possible to regard all sufficiently large
 16113  line numbers as equivalent, when the looseness parameter is zero, and this
 16114  makes it possible for the algorithm to save space and time.
 16115  
 16116  An ``active node'' and a ``passive node'' are created in |mem| for each
 16117  feasible breakpoint that needs to be considered. Active nodes are three
 16118  words long and passive nodes are two words long. We need active nodes only
 16119  for breakpoints near the place in the paragraph that is currently being
 16120  examined, so they are recycled within a comparatively short time after
 16121  they are created.
 16122  
 16123  @ An active node for a given breakpoint contains six fields:
 16124  
 16125  \yskip\hang|link| points to the next node in the list of active nodes; the
 16126  last active node has |link=last_active|.
 16127  
 16128  \yskip\hang|break_node| points to the passive node associated with this
 16129  breakpoint.
 16130  
 16131  \yskip\hang|line_number| is the number of the line that follows this
 16132  breakpoint.
 16133  
 16134  \yskip\hang|fitness| is the fitness classification of the line ending at this
 16135  breakpoint.
 16136  
 16137  \yskip\hang|type| is either |hyphenated| or |unhyphenated|, depending on
 16138  whether this breakpoint is a |disc_node|.
 16139  
 16140  \yskip\hang|total_demerits| is the minimum possible sum of demerits over all
 16141  lines leading from the beginning of the paragraph to this breakpoint.
 16142  
 16143  \yskip\noindent
 16144  The value of |link(active)| points to the first active node on a linked list
 16145  of all currently active nodes. This list is in order by |line_number|,
 16146  except that nodes with |line_number>easy_line| may be in any order relative
 16147  to each other.
 16148  
 16149  @d active_node_size=3 {number of words in active nodes}
 16150  @d fitness==subtype {|very_loose_fit..tight_fit| on final line for this break}
 16151  @d break_node==rlink {pointer to the corresponding passive node}
 16152  @d line_number==llink {line that begins at this breakpoint}
 16153  @d total_demerits(#)==mem[#+2].int {the quantity that \TeX\ minimizes}
 16154  @d unhyphenated=0 {the |type| of a normal active break node}
 16155  @d hyphenated=1 {the |type| of an active node that breaks at a |disc_node|}
 16156  @d last_active==active {the active list ends where it begins}
 16157  
 16158  @ @<Initialize the special list heads...@>=
 16159  type(last_active):=hyphenated; line_number(last_active):=max_halfword;
 16160  subtype(last_active):=0; {the |subtype| is never examined by the algorithm}
 16161  
 16162  @ The passive node for a given breakpoint contains only four fields:
 16163  
 16164  \yskip\hang|link| points to the passive node created just before this one,
 16165  if any, otherwise it is |null|.
 16166  
 16167  \yskip\hang|cur_break| points to the position of this breakpoint in the
 16168  horizontal list for the paragraph being broken.
 16169  
 16170  \yskip\hang|prev_break| points to the passive node that should precede this
 16171  one in an optimal path to this breakpoint.
 16172  
 16173  \yskip\hang|serial| is equal to |n| if this passive node is the |n|th
 16174  one created during the current pass. (This field is used only when
 16175  printing out detailed statistics about the line-breaking calculations.)
 16176  
 16177  \yskip\noindent
 16178  There is a global variable called |passive| that points to the most
 16179  recently created passive node. Another global variable, |printed_node|,
 16180  is used to help print out the paragraph when detailed information about
 16181  the line-breaking computation is being displayed.
 16182  
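The relationship between active and passive nodes may be easier to see in a Python sketch, in which each node is a record instead of a few words of |mem|. The class names, the boolean |hyphenated| (standing for the |type| field), and the function |breaks_leading_to| are hypothetical; positions in the hlist are represented by plain integers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PassiveNode:
    cur_break: int                         # position of the break in the hlist
    prev_break: Optional["PassiveNode"]    # predecessor on the best path
    serial: int                            # creation order within this pass
    link: Optional["PassiveNode"] = None   # passive node created just before

@dataclass
class ActiveNode:
    break_node: Optional[PassiveNode]      # None stands for the paragraph start
    line_number: int
    fitness: int
    hyphenated: bool
    total_demerits: int

def breaks_leading_to(active):
    """Follow prev_break pointers and return the break positions in order."""
    positions = []
    p = active.break_node
    while p is not None:
        positions.append(p.cur_break)
        p = p.prev_break
    positions.reverse()
    return positions
```

Note that |link| records the order of creation, while |prev_break| records the best path; the two chains generally differ.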
 16183  @d passive_node_size=2 {number of words in passive nodes}
 16184  @d cur_break==rlink {in passive node, points to position of this breakpoint}
 16185  @d prev_break==llink {points to passive node that should precede this one}
 16186  @d serial==info {serial number for symbolic identification}
 16187  
 16188  @<Glob...@>=
 16189  @!passive:pointer; {most recent node on passive list}
 16190  @!printed_node:pointer; {most recent node that has been printed}
 16191  @!pass_number:halfword; {the number of passive nodes allocated on this pass}
 16192  
 16193  @ The active list also contains ``delta'' nodes that help the algorithm
 16194  compute the badness of individual lines. Such nodes appear only between two
 16195  active nodes, and they have |type=delta_node|. If |p| and |r| are active nodes
 16196  and if |q| is a delta node between them, so that |link(p)=q| and |link(q)=r|,
 16197  then |q| tells the space difference between lines in the horizontal list that
 16198  start after breakpoint |p| and lines that start after breakpoint |r|. In
 16199  other words, if we know the length of the line that starts after |p| and
 16200  ends at our current position, then the corresponding length of the line that
 16201  starts after |r| is obtained by adding the amounts in node~|q|. A delta node
 16202  contains six scaled numbers, since it must record the net change in glue
 16203  stretchability with respect to all orders of infinity. The natural width
 16204  difference appears in |mem[q+1].sc|; the stretch differences in units of
 16205  pt, fil, fill, and filll appear in |mem[q+2..q+5].sc|; and the shrink difference
 16206  appears in |mem[q+6].sc|. The |subtype| field of a delta node is not used.
 16207  
 16208  @d delta_node_size=7 {number of words in a delta node}
 16209  @d delta_node=2 {|type| field in a delta node}
 16210  
 16211  @ As the algorithm runs, it maintains a set of six delta-like registers
 16212  for the length of the line following the first active breakpoint to the
 16213  current position in the given hlist. When it makes a pass through the
 16214  active list, it also maintains a similar set of six registers for the
 16215  length following the active breakpoint of current interest. A third set
 16216  holds the length of an empty line (namely, the sum of \.{\\leftskip} and
 16217  \.{\\rightskip}); and a fourth set is used to create new delta nodes.
 16218  
 16219  When we pass a delta node we want to do operations like
 16220  $$\hbox{\ignorespaces|for
 16221  k:=1 to 6 do cur_active_width[k]:=cur_active_width[k]+mem[q+k].sc|};$$ and we
 16222  want to do this without the overhead of |for| loops. The |do_all_six|
 16223  macro makes such six-tuples convenient.
 16224  
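The effect of passing delta nodes can be sketched in Python as follows. The representation is an assumption made for the example: the active list is a Python list that alternates hypothetical string labels for active nodes with six-element lists for delta nodes, and |widths_from_each_active| is an invented name.

```python
def widths_from_each_active(active_width, active_list):
    """Walk an active list and return, for each active node label, the
    six-component length of the line from that node to the current position.

    `active_width` holds the six components for the first active node;
    `active_list` alternates labels (str) and deltas (lists of six numbers)."""
    cur_active_width = list(active_width)
    result = {}
    for item in active_list:
        if isinstance(item, str):
            result[item] = tuple(cur_active_width)
        else:
            # passing a delta node: add all six components
            cur_active_width = [c + d for c, d in zip(cur_active_width, item)]
    return result
```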
 16225  @d do_all_six(#)==#(1);#(2);#(3);#(4);#(5);#(6)
 16226  
 16227  @<Glob...@>=
 16228  @!active_width:array[1..6] of scaled;
 16229    {distance from first active node to~|cur_p|}
 16230  @!cur_active_width:array[1..6] of scaled; {distance from current active node}
 16231  @!background:array[1..6] of scaled; {length of an ``empty'' line}
 16232  @!break_width:array[1..6] of scaled; {length being computed after current break}
 16233  
 16234  @ Let's state the principles of the delta nodes more precisely and concisely,
 16235  so that the following programs will be less obscure. For each legal
 16236  breakpoint~|p| in the paragraph, we define two quantities $\alpha(p)$ and
 16237  $\beta(p)$ such that the length of material in a line from breakpoint~|p|
 16238  to breakpoint~|q| is $\gamma+\beta(q)-\alpha(p)$, for some fixed $\gamma$.
 16239  Intuitively, $\alpha(p)$ and $\beta(q)$ are the total length of material from
 16240  the beginning of the paragraph to a point ``after'' a break at |p| and to a
 16241  point ``before'' a break at |q|; and $\gamma$ is the width of an empty line,
 16242  namely the length contributed by \.{\\leftskip} and \.{\\rightskip}.
 16243  
 16244  Suppose, for example, that the paragraph consists entirely of alternating
 16245  boxes and glue skips; let the boxes have widths $x_1\ldots x_n$ and
 16246  let the skips have widths $y_1\ldots y_n$, so that the paragraph can be
 16247  represented by $x_1y_1\ldots x_ny_n$. Let $p_i$ be the legal breakpoint
 16248  at $y_i$; then $\alpha(p_i)=x_1+y_1+\cdots+x_i+y_i$, and $\beta(p_i)=
 16249  x_1+y_1+\cdots+x_i$. To check this, note that the length of material from
 16250  $p_2$ to $p_5$, say, is $\gamma+x_3+y_3+x_4+y_4+x_5=\gamma+\beta(p_5)
 16251  -\alpha(p_2)$.
 16252  
 16253  The quantities $\alpha$, $\beta$, $\gamma$ involve glue stretchability and
 16254  shrinkability as well as a natural width. If we were to compute $\alpha(p)$
 16255  and $\beta(p)$ for each |p|, we would need multiple precision arithmetic, and
 16256  the multiprecise numbers would have to be kept in the active nodes.
 16257  \TeX\ avoids this problem by working entirely with relative differences
 16258  or ``deltas.'' Suppose, for example, that the active list contains
 16259  $a_1\,\delta_1\,a_2\,\delta_2\,a_3$, where the |a|'s are active breakpoints
 16260  and the $\delta$'s are delta nodes. Then $\delta_1=\alpha(a_1)-\alpha(a_2)$
 16261  and $\delta_2=\alpha(a_2)-\alpha(a_3)$. If the line breaking algorithm is
 16262  currently positioned at some other breakpoint |p|, the |active_width| array
 16263  contains the value $\gamma+\beta(p)-\alpha(a_1)$. If we are scanning through
 16264  the list of active nodes and considering a tentative line that runs from
 16265  $a_2$ to~|p|, say, the |cur_active_width| array will contain the value
 16266  $\gamma+\beta(p)-\alpha(a_2)$. Thus, when we move from $a_2$ to $a_3$,
 16267  we want to add $\alpha(a_2)-\alpha(a_3)$ to |cur_active_width|; and this
 16268  is just $\delta_2$, which appears in the active list between $a_2$ and
 16269  $a_3$. The |background| array contains $\gamma$. The |break_width| array
 16270  will be used to calculate values of new delta nodes when the active
 16271  list is being updated.
 16272  
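The box-and-glue example above can be checked numerically with a short Python sketch. Only natural widths are modeled, and the helper names |alpha_beta| and |line_length| are hypothetical.

```python
from itertools import accumulate

def alpha_beta(x, y):
    """Given box widths x[0..n-1] and glue widths y[0..n-1], return the lists
    alpha and beta, where index i-1 corresponds to the breakpoint p_i."""
    totals = list(accumulate(xi + yi for xi, yi in zip(x, y)))
    alpha = totals                                # x_1+y_1+...+x_i+y_i
    beta = [t - yi for t, yi in zip(totals, y)]   # x_1+y_1+...+x_i
    return alpha, beta

def line_length(alpha, beta, gamma, i, j):
    """Length of the material in a line from p_i to p_j (1-based)."""
    return gamma + beta[j - 1] - alpha[i - 1]
```

With |x=[10,20,30,40,50]|, |y=[1,2,3,4,5]|, and $\gamma=7$, the line from $p_2$ to $p_5$ has length $7+160-33=134$, which agrees with $\gamma+x_3+y_3+x_4+y_4+x_5$. Taking $a_1=p_1$ and $a_2=p_2$ gives $\delta_1=11-33=-22$, and adding it to the |active_width| value $7+160-11=156$ again yields 134.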
 16273  @ Glue nodes in a horizontal list that is being paragraphed are not supposed to
 16274  include ``infinite'' shrinkability; that is why the algorithm maintains
 16275  four registers for stretching but only one for shrinking. If the user tries to
 16276  introduce infinite shrinkability, the shrinkability will be reset to finite
 16277  and an error message will be issued. A boolean variable |no_shrink_error_yet|
 16278  prevents this error message from appearing more than once per paragraph.
 16279  
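The recovery from infinite shrinkage can be sketched in Python under a few assumptions: glue specifications are modeled by a hypothetical immutable record |GlueSpec|, the error is merely appended to a list instead of being reported through |error|, and the class |ShrinkChecker| is invented to hold the once-per-paragraph flag.

```python
from dataclasses import dataclass, replace

NORMAL = 0  # finite order of infinity

@dataclass(frozen=True)
class GlueSpec:
    width: int
    stretch: int = 0
    stretch_order: int = NORMAL
    shrink: int = 0
    shrink_order: int = NORMAL

class ShrinkChecker:
    def __init__(self):
        self.no_shrink_error_yet = True
        self.messages = []

    def check_shrinkage(self, spec):
        """Return `spec`, or a finite-shrink copy if its shrinkage is infinite."""
        if spec.shrink_order != NORMAL and spec.shrink != 0:
            if self.no_shrink_error_yet:
                self.no_shrink_error_yet = False
                self.messages.append(
                    "Infinite glue shrinkage found in a paragraph")
            # keep the amount, but demote it to a finite order
            return replace(spec, shrink_order=NORMAL)
        return spec
```

As in the real procedure, the shrink amount itself is retained; only its order of infinity is reset.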
 16280  @d check_shrinkage(#)==if (shrink_order(#)<>normal)and(shrink(#)<>0) then
 16281    begin #:=finite_shrink(#);
 16282    end
 16283  
 16284  @<Glob...@>=
 16285  @!no_shrink_error_yet:boolean; {have we complained about infinite shrinkage?}
 16286  
 16287  @ @<Declare subprocedures for |line_break|@>=
 16288  function finite_shrink(@!p:pointer):pointer; {recovers from infinite shrinkage}
 16289  var q:pointer; {new glue specification}
 16290  begin if no_shrink_error_yet then
 16291    begin no_shrink_error_yet:=false;
 16292    @!stat if tracing_paragraphs>0 then end_diagnostic(true);@+tats@;
 16293    print_err("Infinite glue shrinkage found in a paragraph");
 16294  @.Infinite glue shrinkage...@>
 16295    help5("The paragraph just ended includes some glue that has")@/
 16296    ("infinite shrinkability, e.g., `\hskip 0pt minus 1fil'.")@/
 16297    ("Such glue doesn't belong there---it allows a paragraph")@/
 16298    ("of any length to fit on one line. But it's safe to proceed,")@/
 16299    ("since the offensive shrinkability has been made finite.");
 16300    error;
 16301    @!stat if tracing_paragraphs>0 then begin_diagnostic;@+tats@;
 16302    end;
 16303  q:=new_spec(p); shrink_order(q):=normal;
 16304  delete_glue_ref(p); finite_shrink:=q;
 16305  end;
 16306  
 16307  @ @<Get ready to start...@>=
 16308  no_shrink_error_yet:=true;@/
 16309  check_shrinkage(left_skip); check_shrinkage(right_skip);@/
 16310  q:=left_skip; r:=right_skip; background[1]:=width(q)+width(r);@/
 16311  background[2]:=0; background[3]:=0; background[4]:=0; background[5]:=0;@/
 16312  background[2+stretch_order(q)]:=stretch(q);@/
 16313  background[2+stretch_order(r)]:=@|background[2+stretch_order(r)]+stretch(r);@/
 16314  background[6]:=shrink(q)+shrink(r);
 16315  
 16316  @ A pointer variable |cur_p| runs through the given horizontal list as we look
 16317  for breakpoints. This variable is global, since it is used both by |line_break|
 16318  and by its subprocedure |try_break|.
 16319  
 16320  Another global variable called |threshold| is used to determine the feasibility
 16321  of individual lines: Breakpoints are feasible if there is a way to reach
 16322  them without creating lines whose badness exceeds |threshold|.  (The
 16323  badness is compared to |threshold| before penalties are added, so that
 16324  penalty values do not affect the feasibility of breakpoints, except that
 16325  no break is allowed when the penalty is 10000 or more.) If |threshold|
 16326  is 10000 or more, all legal breaks are considered feasible, since the
 16327  |badness| function specified above never returns a value greater than~10000.
 16328  
 16329  Up to three passes might be made through the paragraph in an attempt to find at
 16330  least one set of feasible breakpoints. On the first pass, we have
 16331  |threshold=pretolerance| and |second_pass=final_pass=false|.
 16332  If this pass fails to find a
 16333  feasible solution, |threshold| is set to |tolerance|, |second_pass| is set
 16334  |true|, and an attempt is made to hyphenate as many words as possible.
 16335  If that fails too, we add |emergency_stretch| to the background
 16336  stretchability and set |final_pass=true|.
 16337  
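The sequence of passes just described can be summarized by a deliberately simplified Python sketch. The callback |find_breaks| is hypothetical; it stands for one complete attempt to find feasible breakpoints and returns a boolean. The sketch always proceeds to a third attempt when the second fails, which is a simplification of the conditions under which the real program makes its passes.

```python
def run_passes(pretolerance, tolerance, emergency_stretch, find_breaks):
    """Try up to three passes; return the name of the pass that was used."""
    # first pass: threshold=pretolerance, no hyphenation
    if find_breaks(threshold=pretolerance, hyphenate=False, extra_stretch=0):
        return "first"
    # second pass: threshold=tolerance, words are hyphenated
    if find_breaks(threshold=tolerance, hyphenate=True, extra_stretch=0):
        return "second"
    # final pass: emergency_stretch is added to the background stretchability
    find_breaks(threshold=tolerance, hyphenate=True,
                extra_stretch=emergency_stretch)
    return "final"
```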
 16338  @<Glob...@>=
 16339  @!cur_p:pointer; {the current breakpoint under consideration}
 16340  @!second_pass:boolean; {is this our second attempt to break this paragraph?}
 16341  @!final_pass:boolean; {is this our final attempt to break this paragraph?}
 16342  @!threshold:integer; {maximum badness on feasible lines}
 16343  
 16344  @ The heart of the line-breaking procedure is `|try_break|', a subroutine
 16345  that tests if the current breakpoint |cur_p| is feasible, by running
 16346  through the active list to see what lines of text can be made from active
 16347  nodes to~|cur_p|.  If feasible breaks are possible, new break nodes are
 16348  created.  If |cur_p| is too far from an active node, that node is
 16349  deactivated.
 16350  
 16351  The parameter |pi| to |try_break| is the penalty associated
 16352  with a break at |cur_p|; we have |pi=eject_penalty| if the break is forced,
 16353  and |pi=inf_penalty| if the break is illegal.
 16354  
 16355  The other parameter, |break_type|, is set to |hyphenated| or |unhyphenated|,
 16356  depending on whether or not the current break is at a |disc_node|. The
 16357  end of a paragraph is also regarded as `|hyphenated|'; this case is
 16358  distinguishable by the condition |cur_p=null|.
 16359  
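The way |try_break| treats its penalty parameter can be sketched in a few lines of Python. The function name |normalize_penalty| and the use of |None| to signal ``return immediately'' are assumptions of the example; the constants have the values that |inf_penalty| and |eject_penalty| have in \TeX.

```python
INF_PENALTY = 10000
EJECT_PENALTY = -10000

def normalize_penalty(pi):
    """Return None if a break with penalty `pi` is forbidden; otherwise return
    the penalty, with every forced break reduced to EJECT_PENALTY."""
    if abs(pi) >= INF_PENALTY:
        if pi > 0:
            return None          # inhibited by infinite penalty
        return EJECT_PENALTY     # this break will be forced
    return pi
```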
 16360  @d copy_to_cur_active(#)==cur_active_width[#]:=active_width[#]
 16361  @d deactivate=60 {go here when node |r| should be deactivated}
 16362  
 16363  @<Declare subprocedures for |line_break|@>=
 16364  procedure try_break(@!pi:integer;@!break_type:small_number);
 16365  label exit,done,done1,continue,deactivate;
 16366  var r:pointer; {runs through the active list}
 16367  @!prev_r:pointer; {stays a step behind |r|}
 16368  @!old_l:halfword; {maximum line number in current equivalence class of lines}
 16369  @!no_break_yet:boolean; {have we found a feasible break at |cur_p|?}
 16370  @<Other local variables for |try_break|@>@;
 16371  begin @<Make sure that |pi| is in the proper range@>;
 16372  no_break_yet:=true; prev_r:=active; old_l:=0;
 16373  do_all_six(copy_to_cur_active);
 16374  loop@+  begin continue: r:=link(prev_r);
 16375    @<If node |r| is of type |delta_node|, update |cur_active_width|,
 16376      set |prev_r| and |prev_prev_r|, then |goto continue|@>;
 16377    @<If a line number class has ended, create new active nodes for
 16378      the best feasible breaks in that class; then |return|
 16379      if |r=last_active|, otherwise compute the new |line_width|@>;
 16380    @<Consider the demerits for a line from |r| to |cur_p|;
 16381      deactivate node |r| if it should no longer be active;
 16382      then |goto continue| if a line from |r| to |cur_p| is infeasible,
 16383      otherwise record a new feasible break@>;
 16384    end;
 16385  exit: @!stat @<Update the value of |printed_node| for
 16386    symbolic displays@>@+tats@;
 16387  end;
 16388  
 16389  @ @<Other local variables for |try_break|@>=
 16390  @!prev_prev_r:pointer; {a step behind |prev_r|, if |type(prev_r)=delta_node|}
 16391  @!s:pointer; {runs through nodes ahead of |cur_p|}
 16392  @!q:pointer; {points to a new node being created}
 16393  @!v:pointer; {points to a glue specification or a node ahead of |cur_p|}
 16394  @!t:integer; {node count, if |cur_p| is a discretionary node}
 16395  @!f:internal_font_number; {used in character width calculation}
 16396  @!l:halfword; {line number of current active node}
 16397  @!node_r_stays_active:boolean; {should node |r| remain in the active list?}
 16398  @!line_width:scaled; {the current line will be justified to this width}
 16399  @!fit_class:very_loose_fit..tight_fit; {possible fitness class of test line}
 16400  @!b:halfword; {badness of test line}
 16401  @!d:integer; {demerits of test line}
 16402  @!artificial_demerits:boolean; {has |d| been forced to zero?}
 16403  @!save_link:pointer; {temporarily holds value of |link(cur_p)|}
 16404  @!shortfall:scaled; {used in badness calculations}
 16405  
 16406  @ @<Make sure that |pi| is in the proper range@>=
 16407  if abs(pi)>=inf_penalty then
 16408    if pi>0 then return {this breakpoint is inhibited by infinite penalty}
 16409    else pi:=eject_penalty {this breakpoint will be forced}
 16410  
 16411  @ The following code uses the fact that |type(last_active)<>delta_node|.
 16412  
 16413  @d update_width(#)==@|
 16414    cur_active_width[#]:=cur_active_width[#]+mem[r+#].sc
 16415  
 16416  @<If node |r|...@>=
 16417  @^inner loop@>
 16418  if type(r)=delta_node then
 16419    begin do_all_six(update_width);
 16420    prev_prev_r:=prev_r; prev_r:=r; goto continue;
 16421    end
 16422  
 16423  @ As we consider various ways to end a line at |cur_p|, in a given line number
 16424  class, we keep track of the best total demerits known, in an array with
 16425  one entry for each of the fitness classifications. For example,
 16426  |minimal_demerits[tight_fit]| contains the fewest total demerits of feasible
 16427  line breaks ending at |cur_p| with a |tight_fit| line; |best_place[tight_fit]|
 16428  points to the passive node for the break before~|cur_p| that achieves such
 16429  an optimum; and |best_pl_line[tight_fit]| is the |line_number| field in the
 16430  active node corresponding to |best_place[tight_fit]|. When no feasible break
 16431  sequence is known, the |minimal_demerits| entries will be equal to
 16432  |awful_bad|, which is $2^{30}-1$. Another variable, |minimum_demerits|,
 16433  keeps track of the smallest value in the |minimal_demerits| array.
 16434  
 16435  @d awful_bad==@'7777777777 {more than a billion demerits}
 16436  
 16437  @<Global...@>=
 16438  @!minimal_demerits:array[very_loose_fit..tight_fit] of integer; {best total
 16439    demerits known for current line class and position, given the fitness}
 16440  @!minimum_demerits:integer; {best total demerits known for current line class
 16441    and position}
 16442  @!best_place:array[very_loose_fit..tight_fit] of pointer; {how to achieve
 16443    |minimal_demerits|}
 16444  @!best_pl_line:array[very_loose_fit..tight_fit] of halfword; {corresponding
 16445    line number}
 16446  
 16447  @ @<Get ready to start...@>=
 16448  minimum_demerits:=awful_bad;
 16449  minimal_demerits[tight_fit]:=awful_bad;
 16450  minimal_demerits[decent_fit]:=awful_bad;
 16451  minimal_demerits[loose_fit]:=awful_bad;
 16452  minimal_demerits[very_loose_fit]:=awful_bad;
 16453  
 16454  @ The first part of the following code is part of \TeX's inner loop, so
 16455  we don't want to waste any time. The current active node, namely node |r|,
 16456  contains the line number that will be considered next. At the end of the
 16457  list we have arranged the data structure so that |r=last_active| and
 16458  |line_number(last_active)>old_l|.
 16459  @^inner loop@>
 16460  
 16461  @<If a line number class...@>=
 16462  begin l:=line_number(r);
 16463  if l>old_l then
 16464    begin {now we are no longer in the inner loop}
 16465    if (minimum_demerits<awful_bad)and@|
 16466        ((old_l<>easy_line)or(r=last_active)) then
 16467      @<Create new active nodes for the best feasible breaks
 16468        just found@>;
 16469    if r=last_active then return;
 16470    @<Compute the new line width@>;
 16471    end;
 16472  end
 16473  
 16474  @ It is not necessary to create new active nodes having |minimal_demerits|
 16475  greater than
 16476  |minimum_demerits+abs(adj_demerits)|, since such active nodes will never
 16477  be chosen in the final paragraph breaks. This observation allows us to
 16478  omit a substantial number of feasible breakpoints from further consideration.
 16479  
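The pruning rule can be illustrated by a Python sketch that decides which fitness classes deserve a new active node. The function |classes_worth_keeping| is hypothetical, and the list |minimal_demerits| is indexed by fitness class from |very_loose_fit=0| to |tight_fit=3|; the guarded addition mirrors the test against |awful_bad| in the code below.

```python
AWFUL_BAD = 0o7777777777  # 2**30 - 1

def classes_worth_keeping(minimal_demerits, adj_demerits):
    """Return the fitness classes whose best demerits lie within
    abs(adj_demerits) of the overall minimum."""
    minimum = min(minimal_demerits)
    if abs(adj_demerits) >= AWFUL_BAD - minimum:
        limit = AWFUL_BAD - 1      # avoid exceeding awful_bad
    else:
        limit = minimum + abs(adj_demerits)
    return [c for c, d in enumerate(minimal_demerits) if d <= limit]
```

A class that is worse than the best one by more than |abs(adj_demerits)| could never win, since switching to the best class costs at most that much extra.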
 16480  @<Create new active nodes...@>=
 16481  begin if no_break_yet then @<Compute the values of |break_width|@>;
 16482  @<Insert a delta node to prepare for breaks at |cur_p|@>;
 16483  if abs(adj_demerits)>=awful_bad-minimum_demerits then
 16484    minimum_demerits:=awful_bad-1
 16485  else minimum_demerits:=minimum_demerits+abs(adj_demerits);
 16486  for fit_class:=very_loose_fit to tight_fit do
 16487    begin if minimal_demerits[fit_class]<=minimum_demerits then
 16488      @<Insert a new active node
 16489        from |best_place[fit_class]| to |cur_p|@>;
 16490    minimal_demerits[fit_class]:=awful_bad;
 16491    end;
 16492  minimum_demerits:=awful_bad;
 16493  @<Insert a delta node to prepare for the next active node@>;
 16494  end
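
The pruning rule above can be tried out away from the \TeX\ program. The following is a minimal sketch in Python, not part of tex.web; the function name |classes_worth_keeping| and the dictionary representation are assumptions made for illustration. It decides which fitness classes would receive a new active node, with the same overflow guard as the Pascal code.

```python
AWFUL_BAD = 0o7777777777  # same value as |awful_bad|: 2**30 - 1


def classes_worth_keeping(minimal_demerits, adj_demerits):
    """Return the fitness classes that would get a new active node.

    minimal_demerits maps a fitness class name to the best total
    demerits found for that class (AWFUL_BAD if none was found).
    """
    minimum = min(minimal_demerits.values())
    if minimum >= AWFUL_BAD:
        return []  # no feasible break was recorded at all
    # Guard against overflow in the same way as the Pascal code.
    if abs(adj_demerits) >= AWFUL_BAD - minimum:
        cutoff = AWFUL_BAD - 1
    else:
        cutoff = minimum + abs(adj_demerits)
    return [c for c, d in minimal_demerits.items() if d <= cutoff]
```

With a champion of 4000 demerits and |adj_demerits=10000|, a class whose best total is 14000 still survives, while one at 15000 is dropped.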
 16495  
 16496  @ When we insert a new active node for a break at |cur_p|, suppose this
 16497  new node is to be placed just before active node |a|; then we essentially
 16498  want to insert `$\delta\,|cur_p|\,\delta^\prime$' before |a|, where
 16499  $\delta=\alpha(a)-\alpha(|cur_p|)$ and $\delta^\prime=\alpha(|cur_p|)-\alpha(a)$
 16500  in the notation explained above.  The |cur_active_width| array now holds
 16501  $\gamma+\beta(|cur_p|)-\alpha(a)$; so $\delta$ can be obtained by
 16502  subtracting |cur_active_width| from the quantity $\gamma+\beta(|cur_p|)-
 16503  \alpha(|cur_p|)$. The latter quantity can be regarded as the length of a
 16504  line ``from |cur_p| to |cur_p|''; we call it the |break_width| at |cur_p|.
 16505  
 16506  The |break_width| is usually negative, since it consists of the background
 16507  (which is normally zero) minus the width of nodes following~|cur_p| that are
 16508  eliminated after a break. If, for example, node |cur_p| is a glue node, the
 16509  width of this glue is subtracted from the background; and we also look
 16510  ahead to eliminate all subsequent glue and penalty and kern and math
 16511  nodes, subtracting their widths as well.
 16512  
 16513  Kern nodes do not disappear at a line break unless they are |explicit|.
 16514  
 16515  @d set_break_width_to_background(#)==break_width[#]:=background[#]
 16516  
 16517  @<Compute the values of |break...@>=
 16518  begin no_break_yet:=false; do_all_six(set_break_width_to_background);
 16519  s:=cur_p;
 16520  if break_type>unhyphenated then if cur_p<>null then
 16521    @<Compute the discretionary |break_width| values@>;
 16522  while s<>null do
 16523    begin if is_char_node(s) then goto done;
 16524    case type(s) of
 16525    glue_node:@<Subtract glue from |break_width|@>;
 16526    penalty_node: do_nothing;
 16527    math_node: break_width[1]:=break_width[1]-width(s);
 16528    kern_node: if subtype(s)<>explicit then goto done
 16529      else break_width[1]:=break_width[1]-width(s);
 16530    othercases goto done
 16531    endcases;@/
 16532    s:=link(s);
 16533    end;
 16534  done: end
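
To see the look-ahead in isolation, here is a hedged Python sketch of the scan over discardable nodes. It is not part of tex.web: nodes are modelled as dictionaries, the name |break_width_after| is hypothetical, and the discretionary case is left out on purpose.

```python
def break_width_after(nodes, start, background):
    """Return break_width[0..6] for a break at nodes[start].

    Index 0 is unused so that indices 1..6 match the Pascal arrays.
    Each node is a dict with a "type" key; glue nodes may also carry
    "stretch", "stretch_order" and "shrink".
    """
    bw = list(background)
    for node in nodes[start:]:
        kind = node["type"]
        if kind == "glue":
            bw[1] -= node["width"]
            bw[2 + node.get("stretch_order", 0)] -= node.get("stretch", 0)
            bw[6] -= node.get("shrink", 0)
        elif kind == "penalty":
            pass  # penalties have no width
        elif kind == "math":
            bw[1] -= node["width"]
        elif kind == "kern" and node.get("explicit", False):
            bw[1] -= node["width"]
        else:
            break  # first node that survives the break
    return bw
```

The scan stops at the first character, box, or non-explicit kern, which is why the result is usually negative when the background is zero.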
 16535  
 16536  @ @<Subtract glue from |break...@>=
 16537  begin v:=glue_ptr(s); break_width[1]:=break_width[1]-width(v);
 16538  break_width[2+stretch_order(v)]:=break_width[2+stretch_order(v)]-stretch(v);
 16539  break_width[6]:=break_width[6]-shrink(v);
 16540  end
 16541  
 16542  @ When |cur_p| is a discretionary break, the length of a line ``from |cur_p| to
 16543  |cur_p|'' has to be defined properly so that the other calculations work out.
 16544  Suppose that the pre-break text at |cur_p| has length $l_0$, the post-break
 16545  text has length $l_1$, and the replacement text has length |l|. Suppose
 16546  also that |q| is the node following the replacement text. Then the length of a
 16547  line from |cur_p| to |q| will be computed as $\gamma+\beta(q)-\alpha(|cur_p|)$,
 16548  where $\beta(q)=\beta(|cur_p|)-l_0+l$. The actual length will be the background
 16549  plus $l_1$, so the length from |cur_p| to |cur_p| should be $\gamma+l_0+l_1-l$.
 16550  If the post-break text of the discretionary is empty, a break may also
 16551  discard~|q|; in that unusual case we subtract the length of~|q| and any
 16552  other nodes that will be discarded after the discretionary break.
 16553  
 16554  The value of $l_0$ need not be computed, since |line_break| will put
 16555  it into the global variable |disc_width| before calling |try_break|.
 16556  
 16557  @<Glob...@>=
 16558  @!disc_width:scaled; {the length of discretionary material preceding a break}
 16559  
 16560  @ @<Compute the discretionary |break...@>=
 16561  begin t:=replace_count(cur_p); v:=cur_p; s:=post_break(cur_p);
 16562  while t>0 do
 16563    begin decr(t); v:=link(v);
 16564    @<Subtract the width of node |v| from |break_width|@>;
 16565    end;
 16566  while s<>null do
 16567    begin @<Add the width of node |s| to |break_width|@>;
 16568    s:=link(s);
 16569    end;
 16570  break_width[1]:=break_width[1]+disc_width;
 16571  if post_break(cur_p)=null then s:=link(v);
 16572            {nodes may be discardable after the break}
 16573  end
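
The value $\gamma+l_0+l_1-l$ follows from the two facts stated above; the short derivation below merely spells out the algebra and introduces no new assumptions.

```latex
\begin{align*}
\gamma+\beta(q)-\alpha(\mathit{cur\_p}) &= \gamma+l_1
  && \text{(the computed length must equal the actual length)}\\
\beta(\mathit{cur\_p})-l_0+l-\alpha(\mathit{cur\_p}) &= l_1
  && \text{(substitute } \beta(q)=\beta(\mathit{cur\_p})-l_0+l\text{)}\\
\gamma+\beta(\mathit{cur\_p})-\alpha(\mathit{cur\_p}) &= \gamma+l_0+l_1-l
  && \text{(the break width at } \mathit{cur\_p}\text{)}
\end{align*}
```

This matches the code term by term: |break_width| starts at the background, the replacement nodes are subtracted ($-l$), the post-break nodes are added ($+l_1$), and |disc_width| is added ($+l_0$). With purely illustrative values $\gamma=0$, $l_0=6$, $l_1=5$, $l=8$, the result is 3.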
 16574  
 16575  @ Replacement texts and discretionary texts are supposed to contain
 16576  only character nodes, kern nodes, ligature nodes, and box or rule nodes.
 16577  
 16578  @<Subtract the width of node |v|...@>=
 16579  if is_char_node(v) then
 16580    begin f:=font(v);
 16581    break_width[1]:=break_width[1]-char_width(f)(char_info(f)(character(v)));
 16582    end
 16583  else  case type(v) of
 16584    ligature_node: begin f:=font(lig_char(v));@/
 16585      break_width[1]:=@|break_width[1]-
 16586        char_width(f)(char_info(f)(character(lig_char(v))));
 16587      end;
 16588    hlist_node,vlist_node,rule_node,kern_node:
 16589      break_width[1]:=break_width[1]-width(v);
 16590    othercases confusion("disc1")
 16591  @:this can't happen disc1}{\quad disc1@>
 16592    endcases
 16593  
 16594  @ @<Add the width of node |s| to |b...@>=
 16595  if is_char_node(s) then
 16596    begin f:=font(s);
 16597    break_width[1]:=@|break_width[1]+char_width(f)(char_info(f)(character(s)));
 16598    end
 16599  else  case type(s) of
 16600    ligature_node: begin f:=font(lig_char(s));
 16601      break_width[1]:=break_width[1]+
 16602        char_width(f)(char_info(f)(character(lig_char(s))));
 16603      end;
 16604    hlist_node,vlist_node,rule_node,kern_node:
 16605      break_width[1]:=break_width[1]+width(s);
 16606    othercases confusion("disc2")
 16607  @:this can't happen disc2}{\quad disc2@>
 16608    endcases
 16609  
 16610  @ We use the fact that |type(active)<>delta_node|.
 16611  
 16612  @d convert_to_break_width(#)==@|
 16613    mem[prev_r+#].sc:=@|@t\hskip10pt@>mem[prev_r+#].sc
 16614    -cur_active_width[#]+break_width[#]
 16615  @d store_break_width(#)==active_width[#]:=break_width[#]
 16616  @d new_delta_to_break_width(#)==@|
 16617    mem[q+#].sc:=break_width[#]-cur_active_width[#]
 16618  
 16619  @<Insert a delta node to prepare for breaks at |cur_p|@>=
 16620  if type(prev_r)=delta_node then {modify an existing delta node}
 16621    begin do_all_six(convert_to_break_width);
 16622    end
 16623  else if prev_r=active then {no delta node needed at the beginning}
 16624    begin do_all_six(store_break_width);
 16625    end
 16626  else  begin q:=get_node(delta_node_size); link(q):=r; type(q):=delta_node;@/
 16627    subtype(q):=0; {the |subtype| is not used}
 16628    do_all_six(new_delta_to_break_width);
 16629    link(prev_r):=q; prev_prev_r:=prev_r; prev_r:=q;
 16630    end
 16631  
 16632  @ When the following code is performed, we will have just inserted at
 16633  least one active node before |r|, so |type(prev_r)<>delta_node|.
 16634  
 16635  @d new_delta_from_break_width(#)==@|mem[q+#].sc:=
 16636      cur_active_width[#]-break_width[#]
 16637  
 16638  @<Insert a delta node to prepare for the next active node@>=
 16639  if r<>last_active then
 16640    begin q:=get_node(delta_node_size); link(q):=r; type(q):=delta_node;@/
 16641    subtype(q):=0; {the |subtype| is not used}
 16642    do_all_six(new_delta_from_break_width);
 16643    link(prev_r):=q; prev_prev_r:=prev_r; prev_r:=q;
 16644    end
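
The bookkeeping with delta nodes is easier to follow in a one-dimensional model. The Python sketch below is an illustration only, not the data structure of tex.web: the active list is a Python list of |("node", name)| and |("delta", value)| pairs, a single width replaces the six components, and only one new node is inserted where \TeX\ may insert up to four. The names |widths| and |insert_before| are hypothetical.

```python
def widths(active_width, items):
    """Recover each active node's width by accumulating the deltas."""
    out, w = {}, active_width
    for kind, val in items:
        if kind == "delta":
            w += val
        else:
            out[val] = w
    return out


def insert_before(active_width, items, i, name, break_width):
    """Insert node `name` with width break_width before position i.

    items[i] must be an active node, or i == len(items) for the end.
    Returns the new (active_width, items) pair.
    """
    # Width belonging to the node at position i: |cur_active_width|.
    cur = active_width + sum(v for k, v in items[:i] if k == "delta")
    items = list(items)
    if i == 0:
        active_width = break_width  # no delta needed at the beginning
    elif items[i - 1][0] == "delta":
        # modify the existing delta so that it leads to the new node
        items[i - 1] = ("delta", items[i - 1][1] - cur + break_width)
    else:
        items.insert(i, ("delta", break_width - cur))
        i += 1
    items.insert(i, ("node", name))
    i += 1
    if i < len(items):
        # restore the old width for the node that follows
        items.insert(i, ("delta", cur - break_width))
    return active_width, items
```

The point of the exercise is that every node already in the list keeps its width, while the new node gets exactly |break_width|.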
 16645  
 16646  @ When we create an active node, we also create the corresponding
 16647  passive node.
 16648  
 16649  @<Insert a new active node from |best_place[fit_class]| to |cur_p|@>=
 16650  begin q:=get_node(passive_node_size);
 16651  link(q):=passive; passive:=q; cur_break(q):=cur_p;
 16652  @!stat incr(pass_number); serial(q):=pass_number;@+tats@;@/
 16653  prev_break(q):=best_place[fit_class];@/
 16654  q:=get_node(active_node_size); break_node(q):=passive;
 16655  line_number(q):=best_pl_line[fit_class]+1;
 16656  fitness(q):=fit_class; type(q):=break_type;
 16657  total_demerits(q):=minimal_demerits[fit_class];
 16658  link(q):=r; link(prev_r):=q; prev_r:=q;
 16659  @!stat if tracing_paragraphs>0 then
 16660    @<Print a symbolic description of the new break node@>;
 16661  tats@;@/
 16662  end
 16663  
 16664  @ @<Print a symbolic description of the new break node@>=
 16665  begin print_nl("@@@@"); print_int(serial(passive));
 16666  @.\AT!\AT!@>
 16667  print(": line "); print_int(line_number(q)-1);
 16668  print_char("."); print_int(fit_class);
 16669  if break_type=hyphenated then print_char("-");
 16670  print(" t="); print_int(total_demerits(q));
 16671  print(" -> @@@@");
 16672  if prev_break(passive)=null then print_char("0")
 16673  else print_int(serial(prev_break(passive)));
 16674  end
 16675  
 16676  @ The length of lines depends on whether the user has specified
 16677  \.{\\parshape} or \.{\\hangindent}. If |par_shape_ptr| is not null, it
 16678  points to a $(2n+1)$-word record in |mem|, where the |info| in the first
 16679  word contains the value of |n|, and the other $2n$ words contain the left
 16680  margins and line lengths for the first |n| lines of the paragraph; the
 16681  specifications for line |n| apply to all subsequent lines. If
 16682  |par_shape_ptr=null|, the shape of the paragraph depends on the value of
 16683  |n=hang_after|; if |n>=0|, hanging indentation takes place on lines |n+1|,
 16684  |n+2|, \dots, otherwise it takes place on lines 1, \dots, $\vert
 16685  n\vert$. When hanging indentation is active, the left margin is
 16686  |hang_indent|, if |hang_indent>=0|, else it is 0; the line length is
 16687  $|hsize|-\vert|hang_indent|\vert$. The normal setting is
 16688  |par_shape_ptr=null|, |hang_after=1|, and |hang_indent=0|.
 16689  Note that if |hang_indent=0|, the value of |hang_after| is irrelevant.
 16690  @^length of lines@> @^hanging indentation@>
 16691  
 16692  @<Glob...@>=
 16693  @!easy_line:halfword; {line numbers |>easy_line| are equivalent in break nodes}
 16694  @!last_special_line:halfword; {line numbers |>last_special_line| all have
 16695    the same width}
 16696  @!first_width:scaled; {the width of all lines |<=last_special_line|, if
 16697    no \.{\\parshape} has been specified}
 16698  @!second_width:scaled; {the width of all lines |>last_special_line|}
 16699  @!first_indent:scaled; {left margin to go with |first_width|}
 16700  @!second_indent:scaled; {left margin to go with |second_width|}
 16701  
 16702  @ We compute the values of |easy_line| and the other global variables relating
 16703  to line length when the |line_break| procedure is initializing itself.
 16704  
 16705  @<Get ready to start...@>=
 16706  if par_shape_ptr=null then
 16707    if hang_indent=0 then
 16708      begin last_special_line:=0; second_width:=hsize;
 16709      second_indent:=0;
 16710      end
 16711    else @<Set line length parameters in preparation for hanging indentation@>
 16712  else  begin last_special_line:=info(par_shape_ptr)-1;
 16713    second_width:=mem[par_shape_ptr+2*(last_special_line+1)].sc;
 16714    second_indent:=mem[par_shape_ptr+2*last_special_line+1].sc;
 16715    end;
 16716  if looseness=0 then easy_line:=last_special_line
 16717  else easy_line:=max_halfword
 16718  
 16719  @ @<Set line length parameters in preparation for hanging indentation@>=
 16720  begin last_special_line:=abs(hang_after);
 16721  if hang_after<0 then
 16722    begin first_width:=hsize-abs(hang_indent);
 16723    if hang_indent>=0 then first_indent:=hang_indent
 16724    else first_indent:=0;
 16725    second_width:=hsize; second_indent:=0;
 16726    end
 16727  else  begin first_width:=hsize; first_indent:=0;
 16728    second_width:=hsize-abs(hang_indent);
 16729    if hang_indent>=0 then second_indent:=hang_indent
 16730    else second_indent:=0;
 16731    end;
 16732  end
 16733  
 16734  @ When we come to the following code, we have just encountered the first
 16735  active node~|r| whose |line_number| field contains |l|. Thus we want to
 16736  compute the length of the $l\mskip1mu$th line of the current paragraph. Furthermore,
 16737  we want to set |old_l| to the last number in the class of line numbers
 16738  equivalent to~|l|.
 16739  
 16740  @<Compute the new line width@>=
 16741  if l>easy_line then
 16742    begin line_width:=second_width; old_l:=max_halfword-1;
 16743    end
 16744  else  begin old_l:=l;
 16745    if l>last_special_line then line_width:=second_width
 16746    else if par_shape_ptr=null then line_width:=first_width
 16747    else line_width:=mem[par_shape_ptr+2*l@,].sc;
 16748    end
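
The rules for \.{\\parshape} and hanging indentation can be condensed into one small function. This is a sketch in Python under stated assumptions, not code from tex.web: the name |line_geometry| is hypothetical, and a paragraph shape is given as a list of (indent, width) pairs instead of a record in |mem|.

```python
def line_geometry(l, hsize, hang_indent=0, hang_after=1, par_shape=None):
    """Return (left indent, line width) for line number l (1-based)."""
    if par_shape:
        # the last pair applies to all subsequent lines
        return par_shape[min(l, len(par_shape)) - 1]
    if hang_indent == 0:
        return (0, hsize)  # hang_after is irrelevant here
    last_special_line = abs(hang_after)
    hanging = (max(hang_indent, 0), hsize - abs(hang_indent))
    plain = (0, hsize)
    if hang_after < 0:
        first, second = hanging, plain  # indent the first |n| lines
    else:
        first, second = plain, hanging  # indent lines n+1, n+2, ...
    return first if l <= last_special_line else second
```

For example, with |hsize=100|, |hang_indent=10| and |hang_after=1|, line 1 is full width and every later line is indented by 10 and shortened to 90.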
 16749  
 16750  @ The remaining part of |try_break| deals with the calculation of
 16751  demerits for a break from |r| to |cur_p|.
 16752  
 16753  The first thing to do is calculate the badness, |b|. This value will always
 16754  be between zero and |inf_bad+1|; the latter value occurs only in the
 16755  case of lines from |r| to |cur_p| that cannot shrink enough to fit the necessary
 16756  width. In such cases, node |r| will be deactivated.
 16757  We also deactivate node~|r| when a break at~|cur_p| is forced, since future
 16758  breaks must go through a forced break.
 16759  
 16760  @<Consider the demerits for a line from |r| to |cur_p|...@>=
 16761  begin artificial_demerits:=false;@/
 16762  @^inner loop@>
 16763  shortfall:=line_width-cur_active_width[1]; {we're this much too short}
 16764  if shortfall>0 then
 16765    @<Set the value of |b| to the badness for stretching the line,
 16766      and compute the corresponding |fit_class|@>
 16767  else @<Set the value of |b| to the badness for shrinking the line,
 16768      and compute the corresponding |fit_class|@>;
 16769  if (b>inf_bad)or(pi=eject_penalty) then
 16770    @<Prepare to deactivate node~|r|, and |goto deactivate| unless
 16771      there is a reason to consider lines of text from |r| to |cur_p|@>
 16772  else  begin prev_r:=r;
 16773    if b>threshold then goto continue;
 16774    node_r_stays_active:=true;
 16775    end;
 16776  @<Record a new feasible break@>;
 16777  if node_r_stays_active then goto continue; {|prev_r| has been set to |r|}
 16778  deactivate: @<Deactivate node |r|@>;
 16779  end
 16780  
 16781  @ When a line must stretch, the available stretchability can be found in the
 16782  subarray |cur_active_width[2..5]|, in units of points, fil, fill, and filll.
 16783  
 16784  The present section is part of \TeX's inner loop, and it is most often performed
 16785  when the badness is infinite; therefore it is worth while to make a quick
 16786  test for large width excess and small stretchability, before calling the
 16787  |badness| subroutine.
 16788  @^inner loop@>
 16789  
 16790  @<Set the value of |b| to the badness for stretching...@>=
 16791  if (cur_active_width[3]<>0)or(cur_active_width[4]<>0)or@|
 16792    (cur_active_width[5]<>0) then
 16793    begin b:=0; fit_class:=decent_fit; {infinite stretch}
 16794    end
 16795  else  begin if shortfall>7230584 then if cur_active_width[2]<1663497 then
 16796      begin b:=inf_bad; fit_class:=very_loose_fit; goto done1;
 16797      end;
 16798    b:=badness(shortfall,cur_active_width[2]);
 16799    if b>12 then
 16800      if b>99 then fit_class:=very_loose_fit
 16801      else fit_class:=loose_fit
 16802    else fit_class:=decent_fit;
 16803    done1:
 16804    end
 16805  
 16806  @ Shrinkability is never infinite in a paragraph;
 16807  we can shrink the line from |r| to |cur_p| by at most |cur_active_width[6]|.
 16808  
 16809  @<Set the value of |b| to the badness for shrinking...@>=
 16810  begin if -shortfall>cur_active_width[6] then b:=inf_bad+1
 16811  else b:=badness(-shortfall,cur_active_width[6]);
 16812  if b>12 then fit_class:=tight_fit@+else fit_class:=decent_fit;
 16813  end
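
The two badness computations and their fitness classes can be exercised together. In the Python sketch below, |badness| is a transcription of the |badness| function defined elsewhere in this program; the wrapper |classify| and its argument names are assumptions made for illustration, and the class names are returned as strings.

```python
INF_BAD = 10000


def badness(t, s):
    """Approximately 100*(t/s)**3, capped at INF_BAD, for t >= 0."""
    if t == 0:
        return 0
    if s <= 0:
        return INF_BAD
    if t <= 7230584:
        r = (t * 297) // s  # 297**3 is about 99.94 * 2**18
    elif s >= 1663497:
        r = t // (s // 297)
    else:
        r = t
    if r > 1290:
        return INF_BAD  # 1290**3 < 2**31 < 1291**3
    return (r * r * r + 0o400000) // 0o1000000


def classify(shortfall, stretch, shrink, infinite_stretch=False):
    """Return (b, fit_class) for a line that is `shortfall` too short."""
    if shortfall > 0:
        if infinite_stretch:
            return 0, "decent_fit"
        if shortfall > 7230584 and stretch < 1663497:
            return INF_BAD, "very_loose_fit"  # the quick test
        b = badness(shortfall, stretch)
        if b > 99:
            return b, "very_loose_fit"
        if b > 12:
            return b, "loose_fit"
        return b, "decent_fit"
    if -shortfall > shrink:
        b = INF_BAD + 1  # cannot shrink enough
    else:
        b = badness(-shortfall, shrink)
    return b, ("tight_fit" if b > 12 else "decent_fit")
```

A line that uses half of its stretchability has badness 12 and is still decent; one that uses all of it has badness 100 and is very loose.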
 16814  
 16815  @ During the final pass, we dare not lose all active nodes, lest we lose
 16816  touch with the line breaks already found. The code shown here makes sure
 16817  that such a catastrophe does not happen, by permitting overfull boxes as
 16818  a last resort. This particular part of \TeX\ was a source of several subtle
 16819  bugs before the correct program logic was finally discovered; readers
 16820  who seek to ``improve'' \TeX\ should therefore think thrice before daring
 16821  to make any changes here.
 16822  @^overfull boxes@>
 16823  
 16824  @<Prepare to deactivate node~|r|, and |goto deactivate| unless...@>=
 16825  begin if final_pass and (minimum_demerits=awful_bad) and@|
 16826     (link(r)=last_active) and
 16827     (prev_r=active) then
 16828      artificial_demerits:=true {set demerits to zero; this break is forced}
 16829  else if b>threshold then goto deactivate;
 16830  node_r_stays_active:=false;
 16831  end
 16832  
 16833  @ When we get to this part of the code, the line from |r| to |cur_p| is
 16834  feasible, its badness is~|b|, and its fitness classification is |fit_class|.
 16835  We don't want to make an active node for this break yet, but we will
 16836  compute the total demerits and record them in the |minimal_demerits| array,
 16837  if such a break is the current champion among all ways to get to |cur_p|
 16838  in a given line-number class and fitness class.
 16839  
 16840  @<Record a new feasible break@>=
 16841  if artificial_demerits then d:=0
 16842  else @<Compute the demerits, |d|, from |r| to |cur_p|@>;
 16843  @!stat if tracing_paragraphs>0 then
 16844    @<Print a symbolic description of this feasible break@>;
 16845  tats@;@/
 16846  d:=d+total_demerits(r); {this is the minimum total demerits
 16847    from the beginning to |cur_p| via |r|}
 16848  if d<=minimal_demerits[fit_class] then
 16849    begin minimal_demerits[fit_class]:=d;
 16850    best_place[fit_class]:=break_node(r); best_pl_line[fit_class]:=l;
 16851    if d<minimum_demerits then minimum_demerits:=d;
 16852    end
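
The champion bookkeeping is small enough to model directly. The following Python sketch is illustrative only; the class |Champions| and its method |record| are hypothetical names, and passive nodes are represented by arbitrary labels.

```python
AWFUL_BAD = 2**30 - 1
FIT_CLASSES = ("very_loose_fit", "loose_fit", "decent_fit", "tight_fit")


class Champions:
    """Best known way to reach the current breakpoint, per fitness class."""

    def __init__(self):
        self.minimal_demerits = {c: AWFUL_BAD for c in FIT_CLASSES}
        self.best_place = {}
        self.best_pl_line = {}
        self.minimum_demerits = AWFUL_BAD

    def record(self, fit_class, d, r_total, r_break_node, l):
        """Record a feasible break reached via active node r."""
        d += r_total  # total demerits from the beginning via r
        if d <= self.minimal_demerits[fit_class]:
            self.minimal_demerits[fit_class] = d
            self.best_place[fit_class] = r_break_node
            self.best_pl_line[fit_class] = l
            if d < self.minimum_demerits:
                self.minimum_demerits = d
```

Because the comparison is |d<=minimal_demerits[fit_class]|, a tie is resolved in favor of the candidate examined later.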
 16853  
 16854  @ @<Print a symbolic description of this feasible break@>=
 16855  begin if printed_node<>cur_p then
 16856    @<Print the list between |printed_node| and |cur_p|,
 16857      then set |printed_node:=cur_p|@>;
 16858  print_nl("@@");
 16859  @.\AT!@>
 16860  if cur_p=null then print_esc("par")
 16861  else if type(cur_p)<>glue_node then
 16862    begin if type(cur_p)=penalty_node then print_esc("penalty")
 16863    else if type(cur_p)=disc_node then print_esc("discretionary")
 16864    else if type(cur_p)=kern_node then print_esc("kern")
 16865    else print_esc("math");
 16866    end;
 16867  print(" via @@@@");
 16868  if break_node(r)=null then print_char("0")
 16869  else print_int(serial(break_node(r)));
 16870  print(" b=");
 16871  if b>inf_bad then print_char("*")@+else print_int(b);
 16872  @.*\relax@>
 16873  print(" p="); print_int(pi); print(" d=");
 16874  if artificial_demerits then print_char("*")@+else print_int(d);
 16875  end
 16876  
 16877  @ @<Print the list between |printed_node| and |cur_p|...@>=
 16878  begin print_nl("");
 16879  if cur_p=null then short_display(link(printed_node))
 16880  else  begin save_link:=link(cur_p);
 16881    link(cur_p):=null; print_nl(""); short_display(link(printed_node));
 16882    link(cur_p):=save_link;
 16883    end;
 16884  printed_node:=cur_p;
 16885  end
 16886  
 16887  @ When the data for a discretionary break is being displayed, we will have
 16888  printed the |pre_break| and |post_break| lists; we want to skip over the
 16889  third list, so that the discretionary data will not appear twice.  The
 16890  following code is performed at the very end of |try_break|.
 16891  
 16892  @<Update the value of |printed_node|...@>=
 16893  if cur_p=printed_node then if cur_p<>null then if type(cur_p)=disc_node then
 16894    begin t:=replace_count(cur_p);
 16895    while t>0 do
 16896      begin decr(t); printed_node:=link(printed_node);
 16897      end;
 16898    end
 16899  
 16900  @ @<Compute the demerits, |d|, from |r| to |cur_p|@>=
 16901  begin d:=line_penalty+b;
 16902  if abs(d)>=10000 then d:=100000000@+else d:=d*d;
 16903  if pi<>0 then
 16904    if pi>0 then d:=d+pi*pi
 16905    else if pi>eject_penalty then d:=d-pi*pi;
 16906  if (break_type=hyphenated)and(type(r)=hyphenated) then
 16907    if cur_p<>null then d:=d+double_hyphen_demerits
 16908    else d:=d+final_hyphen_demerits;
 16909  if abs(fit_class-fitness(r))>1 then d:=d+adj_demerits;
 16910  end
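
The demerits formula can be checked by hand with a direct transcription. The Python sketch below is not part of tex.web; the function name and keyword arguments are assumptions, and the default parameter values are those of plain \TeX, which are likewise an assumption about the format in use.

```python
EJECT_PENALTY = -10000


def demerits(b, pi, line_penalty=10, hyphenated_pair=False,
             at_par_end=False, fitness_gap=0,
             double_hyphen_demerits=10000, final_hyphen_demerits=5000,
             adj_demerits=10000):
    """Demerits of one line with badness b ending at a penalty pi.

    hyphenated_pair: this break and the previous one are both hyphenated.
    at_par_end: the break is the one at the end of the paragraph.
    fitness_gap: abs(fit_class - fitness(r)).
    """
    d = line_penalty + b
    d = 100000000 if abs(d) >= 10000 else d * d
    if pi > 0:
        d += pi * pi
    elif EJECT_PENALTY < pi < 0:
        d -= pi * pi  # a forced break earns no bonus
    if hyphenated_pair:
        d += final_hyphen_demerits if at_par_end else double_hyphen_demerits
    if fitness_gap > 1:
        d += adj_demerits
    return d
```

A perfect line with no penalty thus costs $10^2=100$ demerits, and a line of badness 100 costs $110^2=12100$.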
 16911  
 16912  @ When an active node disappears, we must delete an adjacent delta node if the
 16913  active node was at the beginning or the end of the active list, or if it
 16914  was surrounded by delta nodes. We also must preserve the property that
 16915  |cur_active_width| represents the length of material from |link(prev_r)|
 16916  to~|cur_p|.
 16917  
 16918  @d combine_two_deltas(#)==@|mem[prev_r+#].sc:=mem[prev_r+#].sc+mem[r+#].sc
 16919  @d downdate_width(#)==@|cur_active_width[#]:=cur_active_width[#]-
 16920    mem[prev_r+#].sc
 16921  
 16922  @<Deactivate node |r|@>=
 16923  link(prev_r):=link(r); free_node(r,active_node_size);
 16924  if prev_r=active then @<Update the active widths, since the first active
 16925    node has been deleted@>
 16926  else if type(prev_r)=delta_node then
 16927    begin r:=link(prev_r);
 16928    if r=last_active then
 16929      begin do_all_six(downdate_width);
 16930      link(prev_prev_r):=last_active;
 16931      free_node(prev_r,delta_node_size); prev_r:=prev_prev_r;
 16932      end
 16933    else if type(r)=delta_node then
 16934      begin do_all_six(update_width);
 16935      do_all_six(combine_two_deltas);
 16936      link(prev_r):=link(r); free_node(r,delta_node_size);
 16937      end;
 16938    end
 16939  
 16940  @ The following code uses the fact that |type(last_active)<>delta_node|. If the
 16941  active list has just become empty, we do not need to update the
 16942  |active_width| array, since it will be initialized when an active
 16943  node is next inserted.
 16944  
 16945  @d update_active(#)==active_width[#]:=active_width[#]+mem[r+#].sc
 16946  
 16947  @<Update the active widths,...@>=
 16948  begin r:=link(active);
 16949  if type(r)=delta_node then
 16950    begin do_all_six(update_active);
 16951    do_all_six(copy_to_cur_active);
 16952    link(active):=link(r); free_node(r,delta_node_size);
 16953    end;
 16954  end
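
The deletion rules can also be tried in a one-dimensional model. As before, this Python sketch is an illustration and not the structure used by tex.web: the active list is a list of |("node", name)| and |("delta", value)| pairs, and |deactivate| is a hypothetical name.

```python
def widths(active_width, items):
    """Recover each active node's width by accumulating the deltas."""
    out, w = {}, active_width
    for kind, val in items:
        if kind == "delta":
            w += val
        else:
            out[val] = w
    return out


def deactivate(active_width, items, i):
    """Remove the active node items[i] and tidy up adjacent deltas."""
    items = list(items)
    del items[i]
    if i == 0:
        # the first active node is gone: absorb a leading delta
        if items and items[0][0] == "delta":
            active_width += items[0][1]
            del items[0]
    elif items[i - 1][0] == "delta":
        if i == len(items):
            del items[i - 1]  # a delta at the end serves no purpose
        elif items[i][0] == "delta":
            # two deltas have become adjacent: combine them
            items[i - 1] = ("delta", items[i - 1][1] + items[i][1])
            del items[i]
    return active_width, items
```

In each case the widths of the surviving nodes are unchanged, and the list never begins or ends with a delta.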
 16955  
 16956  @* \[39] Breaking paragraphs into lines, continued.
 16957  So far we have gotten a little way into the |line_break| routine, having
 16958  covered its important |try_break| subroutine. Now let's consider the
 16959  rest of the process.
 16960  
 16961  The main loop of |line_break| traverses the given hlist,
 16962  starting at |link(temp_head)|, and calls |try_break| at each legal
 16963  breakpoint. A variable called |auto_breaking| is set to true except
 16964  within math formulas, since glue nodes are not legal breakpoints when
 16965  they appear in formulas.
 16966  
 16967  The current node of interest in the hlist is pointed to by |cur_p|. Another
 16968  variable, |prev_p|, is usually one step behind |cur_p|, but the real
 16969  meaning of |prev_p| is this: If |type(cur_p)=glue_node| then |cur_p| is a legal
 16970  breakpoint if and only if |auto_breaking| is true and |prev_p| does not
 16971  point to a glue node, penalty node, explicit kern node, or math node.
 16972  
 16973  The following declarations provide for a few other local variables that are
 16974  used in special calculations.
 16975  
 16976  @<Local variables for line breaking@>=
 16977  @!auto_breaking:boolean; {is node |cur_p| outside a formula?}
 16978  @!prev_p:pointer; {helps to determine when glue nodes are breakpoints}
 16979  @!q,@!r,@!s,@!prev_s:pointer; {miscellaneous nodes of temporary interest}
 16980  @!f:internal_font_number; {used when calculating character widths}
 16981  
 16982  @ The `\ignorespaces|loop|\unskip' in the following code is performed at most
 16983  thrice per call of |line_break|, since it is actually a pass over the
 16984  entire paragraph.
 16985  
 16986  @<Find optimal breakpoints@>=
 16987  threshold:=pretolerance;
 16988  if threshold>=0 then
 16989    begin @!stat if tracing_paragraphs>0 then
 16990      begin begin_diagnostic; print_nl("@@firstpass");@+end;@;@+tats@;@/
 16991    second_pass:=false; final_pass:=false;
 16992    end
 16993  else  begin threshold:=tolerance; second_pass:=true;
 16994    final_pass:=(emergency_stretch<=0);
 16995    @!stat if tracing_paragraphs>0 then begin_diagnostic;@+tats@;
 16996    end;
 16997  loop@+  begin if threshold>inf_bad then threshold:=inf_bad;
 16998    if second_pass then @<Initialize for hyphenating a paragraph@>;
 16999    @<Create an active breakpoint representing the beginning of the paragraph@>;
 17000    cur_p:=link(temp_head); auto_breaking:=true;@/
 17001    prev_p:=cur_p; {glue at beginning is not a legal breakpoint}
 17002    while (cur_p<>null)and(link(active)<>last_active) do
 17003      @<Call |try_break| if |cur_p| is a legal breakpoint;
 17004      on the second pass, also try to hyphenate the next
 17005      word, if |cur_p| is a glue node;
 17006      then advance |cur_p| to the next node of the paragraph
 17007      that could possibly be a legal breakpoint@>;
 17008    if cur_p=null then
 17009      @<Try the final line break at the end of the paragraph,
 17010      and |goto done| if the desired breakpoints have been found@>;
 17011    @<Clean up the memory by removing the break nodes@>;
 17012    if not second_pass then
 17013      begin@!stat if tracing_paragraphs>0 then print_nl("@@secondpass");@;@+tats@/
 17014      threshold:=tolerance; second_pass:=true; final_pass:=(emergency_stretch<=0);
 17015      end {if at first you don't succeed, \dots}
 17016    else begin @!stat if tracing_paragraphs>0 then
 17017        print_nl("@@emergencypass");@;@+tats@/
 17018      background[2]:=background[2]+emergency_stretch; final_pass:=true;
 17019      end;
 17020    end;
 17021  done: @!stat if tracing_paragraphs>0 then
 17022    begin end_diagnostic(true); normalize_selector;
 17023    end;@+tats@/
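
The control flow of the passes can be separated from the work done in each pass. The Python sketch below is a hedged model, not code from tex.web: |run_passes| and the callback |attempt| are hypothetical, and |attempt| stands for one traversal of the paragraph, returning true if the desired breakpoints were found.

```python
INF_BAD = 10000


def run_passes(attempt, pretolerance, tolerance, emergency_stretch):
    """Run up to three passes and return the settings used by each.

    attempt(threshold, second_pass, final_pass, extra_stretch) -> bool
    """
    extra_stretch = 0  # amount added to background[2]
    if pretolerance >= 0:
        threshold, second_pass, final_pass = pretolerance, False, False
    else:
        threshold, second_pass = tolerance, True
        final_pass = emergency_stretch <= 0
    log = []
    while True:
        threshold = min(threshold, INF_BAD)
        log.append((threshold, second_pass, final_pass, extra_stretch))
        if attempt(threshold, second_pass, final_pass, extra_stretch):
            return log
        if final_pass:
            # in TeX a final pass always finds breakpoints
            raise RuntimeError("a final pass must succeed")
        if not second_pass:
            threshold, second_pass = tolerance, True
            final_pass = emergency_stretch <= 0
        else:
            extra_stretch += emergency_stretch
            final_pass = True
```

The emergency pass keeps the threshold of the second pass; only the background stretchability changes.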
 17024  
 17025  @ The active node that represents the starting point does not need a
 17026  corresponding passive node.
 17027  
 17028  @d store_background(#)==active_width[#]:=background[#]
 17029  
 17030  @<Create an active breakpoint representing the beginning of the paragraph@>=
 17031  q:=get_node(active_node_size);
 17032  type(q):=unhyphenated; fitness(q):=decent_fit;
 17033  link(q):=last_active; break_node(q):=null;
 17034  line_number(q):=prev_graf+1; total_demerits(q):=0; link(active):=q;
 17035  do_all_six(store_background);@/
 17036  passive:=null; printed_node:=temp_head; pass_number:=0;
 17037  font_in_short_display:=null_font
 17038  
 17039  @ @<Clean...@>=
 17040  q:=link(active);
 17041  while q<>last_active do
 17042    begin cur_p:=link(q);
 17043    if type(q)=delta_node then free_node(q,delta_node_size)
 17044    else free_node(q,active_node_size);
 17045    q:=cur_p;
 17046    end;
 17047  q:=passive;
 17048  while q<>null do
 17049    begin cur_p:=link(q);
 17050    free_node(q,passive_node_size);
 17051    q:=cur_p;
 17052    end
 17053  
 17054  @ Here is the main switch in the |line_break| routine, where legal breaks
 17055  are determined. As we move through the hlist, we need to keep the |active_width|
 17056  array up to date, so that the badness of individual lines is readily calculated
 17057  by |try_break|. It is convenient to use the short name |act_width| for
 17058  the component of active width that represents real width as opposed to glue.
 17059  
 17060  @d act_width==active_width[1] {length from first active node to current node}
 17061  @d kern_break==begin if not is_char_node(link(cur_p)) and auto_breaking then
 17062      if type(link(cur_p))=glue_node then try_break(0,unhyphenated);
 17063    act_width:=act_width+width(cur_p);
 17064    end
 17065  
 17066  @<Call |try_break| if |cur_p| is a legal breakpoint...@>=
 17067  begin if is_char_node(cur_p) then
 17068    @<Advance \(c)|cur_p| to the node following the present
 17069      string of characters@>;
 17070  case type(cur_p) of
 17071  hlist_node,vlist_node,rule_node: act_width:=act_width+width(cur_p);
 17072  whatsit_node: @<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>;
 17073  glue_node: begin @<If node |cur_p| is a legal breakpoint, call |try_break|;
 17074    then update the active widths by including the glue in |glue_ptr(cur_p)|@>;
 17075    if second_pass and auto_breaking then
 17076      @<Try to hyphenate the following word@>;
 17077    end;
 17078  kern_node: if subtype(cur_p)=explicit then kern_break
 17079    else act_width:=act_width+width(cur_p);
 17080  ligature_node: begin f:=font(lig_char(cur_p));
 17081    act_width:=act_width+char_width(f)(char_info(f)(character(lig_char(cur_p))));
 17082    end;
 17083  disc_node: @<Try to break after a discretionary fragment, then |goto done5|@>;
 17084  math_node: begin auto_breaking:=(subtype(cur_p)=after); kern_break;
 17085    end;
 17086  penalty_node: try_break(penalty(cur_p),unhyphenated);
 17087  mark_node,ins_node,adjust_node: do_nothing;
 17088  othercases confusion("paragraph")
 17089  @:this can't happen paragraph}{\quad paragraph@>
 17090  endcases;@/
 17091  prev_p:=cur_p; cur_p:=link(cur_p);
 17092  done5:end
 17093  
 17094  @ The code that passes over the characters of words in a paragraph is
 17095  part of \TeX's inner loop, so it has been streamlined for speed. We use
 17096  the fact that `\.{\\parfillskip}' glue appears at the end of each paragraph;
 17097  it is therefore unnecessary to check if |link(cur_p)=null| when |cur_p| is a
 17098  character node.
 17099  @^inner loop@>
 17100  
 17101  @<Advance \(c)|cur_p| to the node following the present string...@>=
 17102  begin prev_p:=cur_p;
 17103  repeat f:=font(cur_p);
 17104  act_width:=act_width+char_width(f)(char_info(f)(character(cur_p)));
 17105  cur_p:=link(cur_p);
 17106  until not is_char_node(cur_p);
 17107  end
 17108  
 17109  @ When node |cur_p| is a glue node, we look at |prev_p| to see whether or not
 17110  a breakpoint is legal at |cur_p|, as explained above.
 17111  
 17112  @<If node |cur_p| is a legal breakpoint, call...@>=
 17113  if auto_breaking then
 17114    begin if is_char_node(prev_p) then try_break(0,unhyphenated)
 17115    else if precedes_break(prev_p) then try_break(0,unhyphenated)
 17116    else if (type(prev_p)=kern_node)and(subtype(prev_p)<>explicit) then
 17117      try_break(0,unhyphenated);
 17118    end;
 17119  check_shrinkage(glue_ptr(cur_p)); q:=glue_ptr(cur_p);
 17120  act_width:=act_width+width(q);@|
 17121  active_width[2+stretch_order(q)]:=@|
 17122    active_width[2+stretch_order(q)]+stretch(q);@/
 17123  active_width[6]:=active_width[6]+shrink(q)
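
The test on |prev_p| amounts to a small predicate. In this Python sketch the node types are plain strings and the function name is hypothetical; the set |PRECEDES_BREAK| lists the node types smaller than |math_node|, which is how |precedes_break| is defined elsewhere in this program.

```python
PRECEDES_BREAK = {"hlist", "vlist", "rule", "ins", "mark",
                  "adjust", "ligature", "disc", "whatsit"}


def glue_is_legal_breakpoint(prev_type, auto_breaking=True,
                             explicit_kern=False):
    """Is a glue node a legal breakpoint, given the node before it?

    prev_type is "char" for a character node, else the node type name;
    explicit_kern matters only when prev_type == "kern".
    """
    if not auto_breaking:
        return False  # we are inside a math formula
    if prev_type == "char" or prev_type in PRECEDES_BREAK:
        return True
    return prev_type == "kern" and not explicit_kern
```

Since |prev_p| is initially set equal to |cur_p|, glue at the very beginning of a paragraph sees a glue node as its predecessor and is rejected.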
 17124  
 17125  @ The following code knows that discretionary texts contain
 17126  only character nodes, kern nodes, box nodes, rule nodes, and ligature nodes.
 17127  
 17128  @<Try to break after a discretionary fragment...@>=
 17129  begin s:=pre_break(cur_p); disc_width:=0;
 17130  if s=null then try_break(ex_hyphen_penalty,hyphenated)
 17131  else  begin repeat @<Add the width of node |s| to |disc_width|@>;
 17132      s:=link(s);
 17133    until s=null;
 17134    act_width:=act_width+disc_width;
 17135    try_break(hyphen_penalty,hyphenated);
 17136    act_width:=act_width-disc_width;
 17137    end;
 17138  r:=replace_count(cur_p); s:=link(cur_p);
 17139  while r>0 do
 17140    begin @<Add the width of node |s| to |act_width|@>;
 17141    decr(r); s:=link(s);
 17142    end;
 17143  prev_p:=cur_p; cur_p:=s; goto done5;
 17144  end
 17145  
 17146  @ @<Add the width of node |s| to |disc_width|@>=
 17147  if is_char_node(s) then
 17148    begin f:=font(s);
 17149    disc_width:=disc_width+char_width(f)(char_info(f)(character(s)));
 17150    end
 17151  else  case type(s) of
 17152    ligature_node: begin f:=font(lig_char(s));
 17153      disc_width:=disc_width+
 17154        char_width(f)(char_info(f)(character(lig_char(s))));
 17155      end;
 17156    hlist_node,vlist_node,rule_node,kern_node:
 17157      disc_width:=disc_width+width(s);
 17158    othercases confusion("disc3")
 17159  @:this can't happen disc3}{\quad disc3@>
 17160    endcases
 17161  
 17162  @ @<Add the width of node |s| to |act_width|@>=
 17163  if is_char_node(s) then
 17164    begin f:=font(s);
 17165    act_width:=act_width+char_width(f)(char_info(f)(character(s)));
 17166    end
 17167  else  case type(s) of
 17168    ligature_node: begin f:=font(lig_char(s));
 17169      act_width:=act_width+
 17170        char_width(f)(char_info(f)(character(lig_char(s))));
 17171      end;
 17172    hlist_node,vlist_node,rule_node,kern_node:
 17173      act_width:=act_width+width(s);
 17174    othercases confusion("disc4")
 17175  @:this can't happen disc4}{\quad disc4@>
 17176    endcases
 17177  
 17178  @ The forced line break at the paragraph's end will reduce the list of
 17179  breakpoints so that all active nodes represent breaks at |cur_p=null|.
 17180  On the first pass, we insist on finding an active node that has the
 17181  correct ``looseness.'' On the final pass, there will be at least one active
 17182  node, and we will match the desired looseness as well as we can.
 17183  
 17184  The global variable |best_bet| will be set to the active node for the best
 17185  way to break the paragraph, and a few other variables are used to
 17186  help determine what is best.
 17187  
 17188  @<Glob...@>=
 17189  @!best_bet:pointer; {use |break_node(best_bet)| and its predecessors}
 17190  @!fewest_demerits:integer; {the demerits associated with |best_bet|}
 17191  @!best_line:halfword; {line number following the last line of the new paragraph}
 17192  @!actual_looseness:integer; {the difference between |line_number(best_bet)|
 17193    and the optimum |best_line|}
 17194  @!line_diff:integer; {the difference between the current line number and
 17195    the optimum |best_line|}
 17196  
 17197  @ @<Try the final line break at the end of the paragraph...@>=
 17198  begin try_break(eject_penalty,hyphenated);
 17199  if link(active)<>last_active then
 17200    begin @<Find an active node with fewest demerits@>;
 17201    if looseness=0 then goto done;
 17202    @<Find the best active node for the desired looseness@>;
 17203    if (actual_looseness=looseness)or final_pass then goto done;
 17204    end;
 17205  end
 17206  
 17207  @ @<Find an active node...@>=
 17208  r:=link(active); fewest_demerits:=awful_bad;
 17209  repeat if type(r)<>delta_node then if total_demerits(r)<fewest_demerits then
 17210    begin fewest_demerits:=total_demerits(r); best_bet:=r;
 17211    end;
 17212  r:=link(r);
 17213  until r=last_active;
 17214  best_line:=line_number(best_bet)
 17215  
 17216  @ The adjustment for a desired looseness is a slightly more complicated
 17217  version of the loop just considered. Note that if a paragraph is broken
 17218  into segments by displayed equations, each segment will be subject to the
 17219  looseness calculation, independently of the other segments.
 17220  
 17221  @<Find the best active node...@>=
 17222  begin r:=link(active); actual_looseness:=0;
 17223  repeat if type(r)<>delta_node then
 17224    begin line_diff:=line_number(r)-best_line;
 17225    if ((line_diff<actual_looseness)and(looseness<=line_diff))or@|
 17226    ((line_diff>actual_looseness)and(looseness>=line_diff)) then
 17227      begin best_bet:=r; actual_looseness:=line_diff;
 17228      fewest_demerits:=total_demerits(r);
 17229      end
 17230    else if (line_diff=actual_looseness)and@|
 17231      (total_demerits(r)<fewest_demerits) then
 17232      begin best_bet:=r; fewest_demerits:=total_demerits(r);
 17233      end;
 17234    end;
 17235  r:=link(r);
 17236  until r=last_active;
 17237  best_line:=line_number(best_bet);
 17238  end
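
@ As an illustration only (it is not part of \TeX), the two selection loops
above can be sketched in Python. The sketch assumes that the active list has
already been reduced to pairs of the form (line number, total demerits), with
delta nodes removed; the name |choose_best_bet| and the use of an infinite
float in place of |awful_bad| are inventions of this sketch.

```python
def choose_best_bet(active, looseness):
    """Pick an entry of `active`, a non-empty list of
    (line_number, total_demerits) pairs; delta nodes are assumed absent.

    Returns (index_of_best_entry, actual_looseness).
    """
    # First loop: the entry with the fewest demerits fixes best_line.
    best, fewest = 0, float("inf")  # stand-in for awful_bad
    for idx, (_, demerits) in enumerate(active):
        if demerits < fewest:
            best, fewest = idx, demerits
    best_line = active[best][0]
    if looseness == 0:
        return best, 0

    # Second loop: move toward the desired looseness without overshooting,
    # breaking ties between equally loose candidates by demerits.
    actual = 0
    for idx, (line, demerits) in enumerate(active):
        diff = line - best_line
        if (diff < actual and looseness <= diff) or \
           (diff > actual and looseness >= diff):
            best, actual, fewest = idx, diff, demerits
        elif diff == actual and demerits < fewest:
            best, fewest = idx, demerits
    return best, actual
```

If the requested looseness cannot be reached, the sketch returns the closest
value on the same side of zero, which is what the final pass accepts.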
 17239  
 17240  @ Once the best sequence of breakpoints has been found (hurray), we call on the
 17241  procedure |post_line_break| to finish the remainder of the work.
 17242  (By introducing this subprocedure, we are able to keep |line_break|
 17243  from getting extremely long.)
 17244  
 17245  @<Break the paragraph at the chosen...@>=
 17246  post_line_break(final_widow_penalty)
 17247  
 17248  @ The total number of lines that will be set by |post_line_break|
 17249  is |best_line-prev_graf-1|. The last breakpoint is specified by
 17250  |break_node(best_bet)|, and this passive node points to the other breakpoints
 17251  via the |prev_break| links. The finishing-up phase starts by linking the
 17252  relevant passive nodes in forward order, changing |prev_break| to
 17253  |next_break|. (The |next_break| fields actually reside in the same memory
 17254  space as the |prev_break| fields did, but we give them a new name because
 17255  of their new significance.) Then the lines are justified, one by one.
 17256  
 17257  @d next_break==prev_break {new name for |prev_break| after links are reversed}
 17258  
 17259  @<Declare subprocedures for |line_break|@>=
 17260  procedure post_line_break(@!final_widow_penalty:integer);
 17261  label done,done1;
 17262  var q,@!r,@!s:pointer; {temporary registers for list manipulation}
 17263  @!disc_break:boolean; {was the current break at a discretionary node?}
 17264  @!post_disc_break:boolean; {and did it have a nonempty post-break part?}
 17265  @!cur_width:scaled; {width of line number |cur_line|}
 17266  @!cur_indent:scaled; {left margin of line number |cur_line|}
 17267  @!t:quarterword; {used for replacement counts in discretionary nodes}
 17268  @!pen:integer; {use when calculating penalties between lines}
 17269  @!cur_line: halfword; {the current line number being justified}
 17270  begin @<Reverse the links of the relevant passive nodes, setting |cur_p| to the
 17271    first breakpoint@>;
 17272  cur_line:=prev_graf+1;
 17273  repeat @<Justify the line ending at breakpoint |cur_p|, and append it to the
 17274    current vertical list, together with associated penalties and other
 17275    insertions@>;
 17276  incr(cur_line); cur_p:=next_break(cur_p);
 17277  if cur_p<>null then if not post_disc_break then
 17278    @<Prune unwanted nodes at the beginning of the next line@>;
 17279  until cur_p=null;
 17280  if (cur_line<>best_line)or(link(temp_head)<>null) then
 17281    confusion("line breaking");
 17282  @:this can't happen line breaking}{\quad line breaking@>
 17283  prev_graf:=best_line-1;
 17284  end;
 17285  
 17286  @ The job of reversing links in a list is conveniently regarded as the job
 17287  of taking items off one stack and putting them on another. In this case we
 17288  take them off a stack pointed to by |q| and having |prev_break| fields;
 17289  we put them on a stack pointed to by |cur_p| and having |next_break| fields.
 17290  Node |r| is the passive node being moved from stack to stack.
 17291  
 17292  @<Reverse the links of the relevant passive nodes...@>=
 17293  q:=break_node(best_bet); cur_p:=null;
 17294  repeat r:=q; q:=prev_break(q); next_break(r):=cur_p; cur_p:=r;
 17295  until q=null
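
@ The stack-to-stack view of link reversal can be tried out in a few lines
of Python. This is a sketch under stated assumptions: the dictionary |link|
plays the role of the shared |prev_break|/|next_break| field, |None| stands
for |null|, and the function name |reverse_links| is hypothetical. A
\.{while} loop is used instead of \.{repeat}; the two agree whenever the
starting node is not |null|.

```python
def reverse_links(link, q):
    """Reverse a chain in place.

    `link` maps each node to its predecessor (None at the first break);
    `q` is the last node of the chain. Afterwards `link` maps each node
    to its successor, and the first node is returned.
    """
    cur_p = None
    while q is not None:
        r = q             # r is the node being moved from stack to stack
        q = link[r]       # pop r off the prev_break stack
        link[r] = cur_p   # push r onto the next_break stack
        cur_p = r
    return cur_p
```

Because the same dictionary entry is overwritten, no extra storage is
needed, just as the |next_break| fields reuse the space of |prev_break|.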
 17296  
 17297  @ Glue and penalty and kern and math nodes are deleted at the beginning of
 17298  a line, except in the anomalous case that the node to be deleted is actually
 17299  one of the chosen breakpoints. Otherwise
 17300  the pruning done here is designed to match
 17301  the lookahead computation in |try_break|, where the |break_width| values
 17302  are computed for non-discretionary breakpoints.
 17303  
 17304  @<Prune unwanted nodes at the beginning of the next line@>=
 17305  begin r:=temp_head;
 17306  loop@+  begin q:=link(r);
 17307    if q=cur_break(cur_p) then goto done1;
 17308      {|cur_break(cur_p)| is the next breakpoint}
 17309    {now |q| cannot be |null|}
 17310    if is_char_node(q) then goto done1;
 17311    if non_discardable(q) then goto done1;
 17312    if type(q)=kern_node then if subtype(q)<>explicit then goto done1;
 17313    r:=q; {now |type(q)=glue_node|, |kern_node|, |math_node|, or |penalty_node|}
 17314    end;
 17315  done1: if r<>temp_head then
 17316    begin link(r):=null; flush_node_list(link(temp_head));
 17317    link(temp_head):=q;
 17318    end;
 17319  end
 17320  
 17321  @ The current line to be justified appears in a horizontal list starting
 17322  at |link(temp_head)| and ending at |cur_break(cur_p)|. If |cur_break(cur_p)| is
 17323  a glue node, we reset the glue to equal the |right_skip| glue; otherwise
 17324  we append the |right_skip| glue at the right. If |cur_break(cur_p)| is a
 17325  discretionary node, we modify the list so that the discretionary break
 17326  is compulsory, and we set |disc_break| to |true|. We also append
 17327  the |left_skip| glue at the left of the line, unless it is zero.
 17328  
 17329  @<Justify the line ending at breakpoint |cur_p|, and append it...@>=
 17330  @<Modify the end of the line to reflect the nature of the break and to include
 17331    \.{\\rightskip}; also set the proper value of |disc_break|@>;
 17332  @<Put the \(l)\.{\\leftskip} glue at the left and detach this line@>;
 17333  @<Call the packaging subroutine, setting |just_box| to the justified box@>;
 17334  @<Append the new box to the current vertical list, followed by the list of
 17335    special nodes taken out of the box by the packager@>;
 17336  @<Append a penalty node, if a nonzero penalty is appropriate@>
 17337  
 17338  @ At the end of the following code, |q| will point to the final node on the
 17339  list about to be justified.
 17340  
 17341  @<Modify the end of the line...@>=
 17342  q:=cur_break(cur_p); disc_break:=false; post_disc_break:=false;
 17343  if q<>null then {|q| cannot be a |char_node|}
 17344    if type(q)=glue_node then
 17345      begin delete_glue_ref(glue_ptr(q));
 17346      glue_ptr(q):=right_skip;
 17347      subtype(q):=right_skip_code+1; add_glue_ref(right_skip);
 17348      goto done;
 17349      end
 17350    else  begin if type(q)=disc_node then
 17351        @<Change discretionary to compulsory and set
 17352          |disc_break:=true|@>
 17353      else if (type(q)=math_node)or(type(q)=kern_node) then width(q):=0;
 17354      end
 17355  else  begin q:=temp_head;
 17356    while link(q)<>null do q:=link(q);
 17357    end;
 17358  @<Put the \(r)\.{\\rightskip} glue after node |q|@>;
 17359  done:
 17360  
 17361  @ @<Change discretionary to compulsory...@>=
 17362  begin t:=replace_count(q);
 17363  @<Destroy the |t| nodes following |q|, and
 17364     make |r| point to the following node@>;
 17365  if post_break(q)<>null then @<Transplant the post-break list@>;
 17366  if pre_break(q)<>null then @<Transplant the pre-break list@>;
 17367  link(q):=r; disc_break:=true;
 17368  end
 17369  
 17370  @ @<Destroy the |t| nodes following |q|...@>=
 17371  if t=0 then r:=link(q)
 17372  else  begin r:=q;
 17373    while t>1 do
 17374      begin r:=link(r); decr(t);
 17375      end;
 17376    s:=link(r);
 17377    r:=link(s); link(s):=null;
 17378    flush_node_list(link(q)); replace_count(q):=0;
 17379    end
 17380  
 17381  @ We move the post-break list from inside node |q| to the main list by
 17382  re\-attaching it just before the present node |r|, then resetting |r|.
 17383  
 17384  @<Transplant the post-break list@>=
 17385  begin s:=post_break(q);
 17386  while link(s)<>null do s:=link(s);
 17387  link(s):=r; r:=post_break(q); post_break(q):=null; post_disc_break:=true;
 17388  end
 17389  
 17390  @ We move the pre-break list from inside node |q| to the main list by
 17391  re\-attaching it just after the present node |q|, then resetting |q|.
 17392  
 17393  @<Transplant the pre-break list@>=
 17394  begin s:=pre_break(q); link(q):=s;
 17395  while link(s)<>null do s:=link(s);
 17396  pre_break(q):=null; q:=s;
 17397  end
 17398  
 17399  @ @<Put the \(r)\.{\\rightskip} glue after node |q|@>=
 17400  r:=new_param_glue(right_skip_code); link(r):=link(q); link(q):=r; q:=r
 17401  
 17402  @ The following code begins with |q| at the end of the list to be
 17403  justified. It ends with |q| at the beginning of that list, and with
 17404  |link(temp_head)| pointing to the remainder of the paragraph, if any.
 17405  
 17406  @<Put the \(l)\.{\\leftskip} glue at the left...@>=
 17407  r:=link(q); link(q):=null; q:=link(temp_head); link(temp_head):=r;
 17408  if left_skip<>zero_glue then
 17409    begin r:=new_param_glue(left_skip_code);
 17410    link(r):=q; q:=r;
 17411    end
 17412  
 17413  @ @<Append the new box to the current vertical list...@>=
 17414  append_to_vlist(just_box);
 17415  if adjust_head<>adjust_tail then
 17416    begin link(tail):=link(adjust_head); tail:=adjust_tail;
 17417     end;
 17418  adjust_tail:=null
 17419  
 17420  @ Now |q| points to the hlist that represents the current line of the
 17421  paragraph. We need to compute the appropriate line width, pack the
 17422  line into a box of this size, and shift the box by the appropriate
 17423  amount of indentation.
 17424  
 17425  @<Call the packaging subroutine...@>=
 17426  if cur_line>last_special_line then
 17427    begin cur_width:=second_width; cur_indent:=second_indent;
 17428    end
 17429  else if par_shape_ptr=null then
 17430    begin cur_width:=first_width; cur_indent:=first_indent;
 17431    end
 17432  else  begin cur_width:=mem[par_shape_ptr+2*cur_line].sc;
 17433    cur_indent:=mem[par_shape_ptr+2*cur_line-1].sc;
 17434    end;
 17435  adjust_tail:=adjust_head; just_box:=hpack(q,cur_width,exactly);
 17436  shift_amount(just_box):=cur_indent
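
@ The choice of width and indentation for a line can be restated as a small
Python sketch. It is an illustration under assumptions, not \TeX's code: the
parameters |first| and |second| are (width, indent) pairs standing for
|first_width|, |first_indent|, |second_width|, and |second_indent|, and
|par_shape| is a list of (indent, width) pairs in the order in which the
|mem| words are read above. The name |line_geometry| is hypothetical.

```python
def line_geometry(cur_line, last_special_line, first, second, par_shape=None):
    """Return (width, indent) for the 1-based line number cur_line."""
    if cur_line > last_special_line:
        return second                      # past the special lines
    if par_shape is None:
        return first                       # hanging indentation or plain lines
    indent, width = par_shape[cur_line - 1]
    return (width, indent)                 # entry number cur_line of the shape
```

For example, with a three-entry shape and |last_special_line=2|, lines 1
and~2 read their own entries while every later line uses |second|.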
 17437  
 17438  @ Penalties between the lines of a paragraph come from club and widow lines,
 17439  from the |inter_line_penalty| parameter, and from lines that end at
 17440  discretionary breaks.  Breaking between lines of a two-line paragraph gets
 17441  both club-line and widow-line penalties. The local variable |pen| will
 17442  be set to the sum of all relevant penalties for the current line, except
 17443  that the final line is never penalized.
 17444  
 17445  @<Append a penalty node, if a nonzero penalty is appropriate@>=
 17446  if cur_line+1<>best_line then
 17447    begin pen:=inter_line_penalty;
 17448    if cur_line=prev_graf+1 then pen:=pen+club_penalty;
 17449    if cur_line+2=best_line then pen:=pen+final_widow_penalty;
 17450    if disc_break then pen:=pen+broken_penalty;
 17451    if pen<>0 then
 17452      begin r:=new_penalty(pen);
 17453      link(tail):=r; tail:=r;
 17454      end;
 17455    end
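
@ The penalty rule just given is easy to check by hand with a Python sketch.
This is only an illustration; the function name |interline_penalty| and the
convention of returning zero when no penalty node would be appended are
assumptions of the sketch. As in the code above, |best_line| is the line
number following the last line of the paragraph.

```python
def interline_penalty(cur_line, best_line, prev_graf, disc_break,
                      inter_line_penalty, club_penalty,
                      final_widow_penalty, broken_penalty):
    """Sum of the penalties that follow line cur_line (0 means no node)."""
    if cur_line + 1 == best_line:
        return 0                        # the final line is never penalized
    pen = inter_line_penalty
    if cur_line == prev_graf + 1:
        pen += club_penalty             # first line of the paragraph
    if cur_line + 2 == best_line:
        pen += final_widow_penalty      # next-to-last line
    if disc_break:
        pen += broken_penalty           # line ended at a discretionary
    return pen
```

In a two-line paragraph (|prev_graf=0|, |best_line=3|) the first line meets
both the club and the widow test, so both penalties are added.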
 17456  
 17457  @* \[40] Pre-hyphenation.
 17458  When the line-breaking routine is unable to find a feasible sequence of
 17459  breakpoints, it makes a second pass over the paragraph, attempting to
 17460  hyphenate the hyphenatable words. The goal of hyphenation is to insert
 17461  discretionary material into the paragraph so that there are more
 17462  potential places to break.
 17463  
 17464  The general rules for hyphenation are somewhat complex and technical,
 17465  because we want to be able to hyphenate words that are preceded or
 17466  followed by punctuation marks, and because we want the rules to work
 17467  for languages other than English. We also must contend with the fact
 17468  that hyphens might radically alter the ligature and kerning structure
 17469  of a word.
 17470  
 17471  A sequence of characters will be considered for hyphenation only if it
 17472  belongs to a ``potentially hyphenatable part'' of the current paragraph.
 17473  This is a sequence of nodes $p_0p_1\ldots p_m$ where $p_0$ is a glue node,
 17474  $p_1\ldots p_{m-1}$ are either character or ligature or whatsit or
 17475  implicit kern nodes, and $p_m$ is a glue or penalty or insertion or adjust
 17476  or mark or whatsit or explicit kern node.  (Therefore hyphenation is
 17477  disabled by boxes, math formulas, and discretionary nodes already inserted
 17478  by the user.) The ligature nodes among $p_1\ldots p_{m-1}$ are effectively
 17479  expanded into the original non-ligature characters; the kern nodes and
 17480  whatsits are ignored. Each character |c| is now classified as either a
 17481  nonletter (if |lc_code(c)=0|), a lowercase letter (if
 17482  |lc_code(c)=c|), or an uppercase letter (otherwise); an uppercase letter
 17483  is treated as if it were |lc_code(c)| for purposes of hyphenation. The
 17484  characters generated by $p_1\ldots p_{m-1}$ may begin with nonletters; let
 17485  $c_1$ be the first letter that is not in the middle of a ligature. Whatsit
 17486  nodes preceding $c_1$ are ignored; a whatsit found after $c_1$ will be the
 17487  terminating node $p_m$. All characters that do not have the same font as
 17488  $c_1$ will be treated as nonletters. The |hyphen_char| for that font
 17489  must be between 0 and 255, otherwise hyphenation will not be attempted.
 17490  \TeX\ looks ahead for as many consecutive letters $c_1\ldots c_n$ as
 17491  possible; however, |n| must be less than 64, so a character that would
 17492  otherwise be $c_{64}$ is effectively not a letter. Furthermore $c_n$ must
 17493  not be in the middle of a ligature.  In this way we obtain a string of
 17494  letters $c_1\ldots c_n$ that are generated by nodes $p_a\ldots p_b$, where
 17495  |1<=a<=b+1<=m|. If |n>=l_hyf+r_hyf|, this string qualifies for hyphenation;
 17496  however, |uc_hyph| must be positive, if $c_1$ is uppercase.
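
The classification of characters and the final length test can be sketched
in Python as follows. This is an illustration under assumptions: the table
built by |make_lc_code| merely imitates a typical setting in which each of
the 26 ASCII letters maps to its lowercase form and every other code maps
to zero, and the names |make_lc_code|, |classify|, and |qualifies| are
hypothetical.

```python
def make_lc_code():
    """An assumed lc_code table: ASCII letters only, everything else 0."""
    table = [0] * 256
    for c in range(ord("a"), ord("z") + 1):
        table[c] = c
    for c in range(ord("A"), ord("Z") + 1):
        table[c] = c + 32
    return table

def classify(c, lc_code):
    """Classify character code c as described in the text."""
    if lc_code[c] == 0:
        return "nonletter"
    if lc_code[c] == c:
        return "lowercase"
    return "uppercase"

def qualifies(n, l_hyf, r_hyf, first_is_upper, uc_hyph):
    """Does a string of n letters qualify for hyphenation?"""
    if n < l_hyf + r_hyf:
        return False
    return uc_hyph > 0 or not first_is_upper
```

With |l_hyf=2| and |r_hyf=3|, for instance, a four-letter word never
qualifies, and a capitalized word qualifies only when |uc_hyph| is positive.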
 17497  
 17498  The hyphenation process takes place in three stages. First, the candidate
 17499  sequence $c_1\ldots c_n$ is found; then potential positions for hyphens
 17500  are determined by referring to hyphenation tables; and finally, the nodes
 17501  $p_a\ldots p_b$ are replaced by a new sequence of nodes that includes the
 17502  discretionary breaks found.
 17503  
 17504  Fortunately, we do not have to do all this calculation very often, because
 17505  of the way it has been taken out of \TeX's inner loop. For example, when
 17506  the second edition of the author's 700-page book {\sl Seminumerical
 17507  Algorithms} was typeset by \TeX, only about 1.2 hyphenations needed to be
 17508  @^Knuth, Donald Ervin@>
 17509  tried per paragraph, since the line breaking algorithm needed to use two
 17510  passes on only about 5 per cent of the paragraphs.
 17511  
 17512  @<Initialize for hyphenating...@>=
 17513  begin @!init if trie_not_ready then init_trie;@+tini@;@/
 17514  cur_lang:=init_cur_lang; l_hyf:=init_l_hyf; r_hyf:=init_r_hyf;
 17515  end
 17516  
 17517  @ The letters $c_1\ldots c_n$ that are candidates for hyphenation are placed
 17518  into an array called |hc|; the number |n| is placed into |hn|; pointers to
 17519  nodes $p_{a-1}$ and~$p_b$ in the description above are placed into variables
 17520  |ha| and |hb|; and the font number is placed into |hf|.
 17521  
 17522  @<Glob...@>=
 17523  @!hc:array[0..65] of 0..256; {word to be hyphenated}
 17524  @!hn:0..64; {the number of positions occupied in |hc|;
 17525                                    not always a |small_number|}
 17526  @!ha,@!hb:pointer; {nodes |ha..hb| should be replaced by the hyphenated result}
 17527  @!hf:internal_font_number; {font number of the letters in |hc|}
 17528  @!hu:array[0..63] of 0..256; {like |hc|, before conversion to lowercase}
 17529  @!hyf_char:integer; {hyphen character of the relevant font}
 17530  @!cur_lang,@!init_cur_lang:ASCII_code; {current hyphenation table of interest}
 17531  @!l_hyf,@!r_hyf,@!init_l_hyf,@!init_r_hyf:integer; {limits on fragment sizes}
 17532  @!hyf_bchar:halfword; {boundary character after $c_n$}
 17533  
 17534  @ Hyphenation routines need a few more local variables.
 17535  
 17536  @<Local variables for line...@>=
 17537  @!j:small_number; {an index into |hc| or |hu|}
 17538  @!c:0..255; {character being considered for hyphenation}
 17539  
 17540  @ When the following code is activated, the |line_break| procedure is in its
 17541  second pass, and |cur_p| points to a glue node.
 17542  
 17543  @<Try to hyphenate...@>=
 17544  begin prev_s:=cur_p; s:=link(prev_s);
 17545  if s<>null then
 17546    begin @<Skip to node |ha|, or |goto done1| if no hyphenation
 17547      should be attempted@>;
 17548    if l_hyf+r_hyf>63 then goto done1;
 17549    @<Skip to node |hb|, putting letters into |hu| and |hc|@>;
 17550    @<Check that the nodes following |hb| permit hyphenation and that at least
 17551      |l_hyf+r_hyf| letters have been found, otherwise |goto done1|@>;
 17552    hyphenate;
 17553    end;
 17554  done1: end
 17555  
 17556  @ @<Declare subprocedures for |line_break|@>=
 17557  @t\4@>@<Declare the function called |reconstitute|@>
 17558  procedure hyphenate;
 17559  label common_ending,done,found,found1,found2,not_found,exit;
 17560  var @<Local variables for hyphenation@>@;
 17561  begin @<Find hyphen locations for the word in |hc|, or |return|@>;
 17562  @<If no hyphens were found, |return|@>;
 17563  @<Replace nodes |ha..hb| by a sequence of nodes that includes
 17564    the discretionary hyphens@>;
 17565  exit:end;
 17566  
 17567  @ The first thing we need to do is find the node |ha| just before the
 17568  first letter.
 17569  
 17570  @<Skip to node |ha|, or |goto done1|...@>=
 17571  loop@+  begin if is_char_node(s) then
 17572      begin c:=qo(character(s)); hf:=font(s);
 17573      end
 17574    else if type(s)=ligature_node then
 17575      if lig_ptr(s)=null then goto continue
 17576      else begin q:=lig_ptr(s); c:=qo(character(q)); hf:=font(q);
 17577        end
 17578    else if (type(s)=kern_node)and(subtype(s)=normal) then goto continue
 17579    else if type(s)=whatsit_node then
 17580      begin @<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>;
 17581      goto continue;
 17582      end
 17583    else goto done1;
 17584    if lc_code(c)<>0 then
 17585      if (lc_code(c)=c)or(uc_hyph>0) then goto done2
 17586      else goto done1;
 17587  continue: prev_s:=s; s:=link(prev_s);
 17588    end;
 17589  done2: hyf_char:=hyphen_char[hf];
 17590  if hyf_char<0 then goto done1;
 17591  if hyf_char>255 then goto done1;
 17592  ha:=prev_s
 17593  
 17594  @ The word to be hyphenated is now moved to the |hu| and |hc| arrays.
 17595  
 17596  @<Skip to node |hb|, putting letters...@>=
 17597  hn:=0;
 17598  loop@+  begin if is_char_node(s) then
 17599      begin if font(s)<>hf then goto done3;
 17600      hyf_bchar:=character(s); c:=qo(hyf_bchar);
 17601      if lc_code(c)=0 then goto done3;
 17602      if hn=63 then goto done3;
 17603      hb:=s; incr(hn); hu[hn]:=c; hc[hn]:=lc_code(c); hyf_bchar:=non_char;
 17604      end
 17605    else if type(s)=ligature_node then
 17606      @<Move the characters of a ligature node to |hu| and |hc|;
 17607        but |goto done3| if they are not all letters@>
 17608    else if (type(s)=kern_node)and(subtype(s)=normal) then
 17609      begin hb:=s;
 17610      hyf_bchar:=font_bchar[hf];
 17611      end
 17612    else goto done3;
 17613    s:=link(s);
 17614    end;
 17615  done3:
 17616  
 17617  @ We let |j| be the index of the character being stored when a ligature node
 17618  is being expanded, since we do not want to advance |hn| until we are sure
 17619  that the entire ligature consists of letters. Note that it is possible
 17620  to get to |done3| with |hn=0| and |hb| not set to any value.
 17621  
 17622  @<Move the characters of a ligature node to |hu| and |hc|...@>=
 17623  begin if font(lig_char(s))<>hf then goto done3;
 17624  j:=hn; q:=lig_ptr(s);@+if q>null then hyf_bchar:=character(q);
 17625  while q>null do
 17626    begin c:=qo(character(q));
 17627    if lc_code(c)=0 then goto done3;
 17628    if j=63 then goto done3;
 17629    incr(j); hu[j]:=c; hc[j]:=lc_code(c);@/
 17630    q:=link(q);
 17631    end;
 17632  hb:=s; hn:=j;
 17633  if odd(subtype(s)) then hyf_bchar:=font_bchar[hf]@+else hyf_bchar:=non_char;
 17634  end
 17635  
 17636  @ @<Check that the nodes following |hb| permit hyphenation...@>=
 17637  if hn<l_hyf+r_hyf then goto done1; {|l_hyf| and |r_hyf| are |>=1|}
 17638  loop@+  begin if not(is_char_node(s)) then
 17639      case type(s) of
 17640      ligature_node: do_nothing;
 17641      kern_node: if subtype(s)<>normal then goto done4;
 17642      whatsit_node,glue_node,penalty_node,ins_node,adjust_node,mark_node:
 17643        goto done4;
 17644      othercases goto done1
 17645      endcases;
 17646    s:=link(s);
 17647    end;
 17648  done4:
 17649  
 17650  @* \[41] Post-hyphenation.
 17651  If a hyphen may be inserted between |hc[j]| and |hc[j+1]|, the hyphenation
 17652  procedure will set |hyf[j]| to some small odd number. But before we look
 17653  at \TeX's hyphenation procedure, which is independent of the rest of the
 17654  line-breaking algorithm, let us consider what we will do with the hyphens
 17655  it finds, since it is better to work on this part of the program before
 17656  forgetting what |ha| and |hb|, etc., are all about.
 17657  
 17658  @<Glob...@>=
 17659  @!hyf:array [0..64] of 0..9; {odd values indicate discretionary hyphens}
 17660  @!init_list:pointer; {list of punctuation characters preceding the word}
 17661  @!init_lig:boolean; {does |init_list| represent a ligature?}
 17662  @!init_lft:boolean; {if so, did the ligature involve a left boundary?}
 17663  
 17664  @ @<Local variables for hyphenation@>=
 17665  @!i,@!j,@!l:0..65; {indices into |hc| or |hu|}
 17666  @!q,@!r,@!s:pointer; {temporary registers for list manipulation}
 17667  @!bchar:halfword; {boundary character of hyphenated word, or |non_char|}
 17668  
 17669  @ \TeX\ will never insert a hyphen that has fewer than
 17670  \.{\\lefthyphenmin} letters before it or fewer than
 17671  \.{\\righthyphenmin} after it; hence, a short word has
 17672  comparatively little chance of being hyphenated. If no hyphens have
 17673  been found, we can save time by not having to make any changes to the
 17674  paragraph.
 17675  
 17676  @<If no hyphens were found, |return|@>=
 17677  for j:=l_hyf to hn-r_hyf do if odd(hyf[j]) then goto found1;
 17678  return;
 17679  found1:
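
@ The test for usable hyphens can be restated as a one-line Python sketch.
It is an illustration only; the function name |has_usable_hyphen| is
hypothetical, and |hyf| is taken to be a list indexed like the array above.
Odd entries outside the range from |l_hyf| to |hn-r_hyf| are ignored,
which is how the \.{\\lefthyphenmin} and \.{\\righthyphenmin} limits are
enforced.

```python
def has_usable_hyphen(hyf, hn, l_hyf, r_hyf):
    """True if some j with l_hyf <= j <= hn - r_hyf has hyf[j] odd."""
    return any(hyf[j] % 2 == 1 for j in range(l_hyf, hn - r_hyf + 1))
```

When |hn-r_hyf| is less than |l_hyf| the range is empty and the result is
false, matching a \.{for} loop that is executed zero times.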
 17680  
 17681  @ If hyphens are in fact going to be inserted, \TeX\ first deletes the
 17682  subsequence of nodes between |ha| and~|hb|. An attempt is made to
 17683  preserve the effect that implicit boundary characters and punctuation marks
 17684  had on ligatures inside the hyphenated word, by storing a left boundary or
 17685  preceding character in |hu[0]| and by storing a possible right boundary
 17686  in |bchar|. We set |j:=0| if |hu[0]| is to be part of the reconstruction;
 17687  otherwise |j:=1|.
 17688  The variable |s| will point to the tail of the current hlist, and
 17689  |q| will point to the node following |hb|, so that
 17690  things can be hooked up after we reconstitute the hyphenated word.
 17691  
 17692  @<Replace nodes |ha..hb| by a sequence of nodes...@>=
 17693  q:=link(hb); link(hb):=null; r:=link(ha); link(ha):=null; bchar:=hyf_bchar;
 17694  if is_char_node(ha) then
 17695    if font(ha)<>hf then goto found2
 17696    else begin init_list:=ha; init_lig:=false; hu[0]:=qo(character(ha));
 17697      end
 17698  else if type(ha)=ligature_node then
 17699    if font(lig_char(ha))<>hf then goto found2
 17700    else begin init_list:=lig_ptr(ha); init_lig:=true; init_lft:=(subtype(ha)>1);
 17701      hu[0]:=qo(character(lig_char(ha)));
 17702      if init_list=null then if init_lft then
 17703        begin hu[0]:=256; init_lig:=false;
 17704        end; {in this case a ligature will be reconstructed from scratch}
 17705      free_node(ha,small_node_size);
 17706      end
 17707  else begin {no punctuation found; look for left boundary}
 17708    if not is_char_node(r) then if type(r)=ligature_node then
 17709     if subtype(r)>1 then goto found2;
 17710    j:=1; s:=ha; init_list:=null; goto common_ending;
 17711    end;
 17712  s:=cur_p; {we have |cur_p<>ha| because |type(cur_p)=glue_node|}
 17713  while link(s)<>ha do s:=link(s);
 17714  j:=0; goto common_ending;
 17715  found2: s:=ha; j:=0; hu[0]:=256; init_lig:=false; init_list:=null;
 17716  common_ending: flush_node_list(r);
 17717  @<Reconstitute nodes for the hyphenated word, inserting discretionary hyphens@>;
 17718  flush_list(init_list)
 17719  
 17720  @ We must now face the fact that the battle is not over, even though the
 17721  {\def\!{\kern-1pt}%
 17722  hyphens have been found: The process of reconstituting a word can be nontrivial
 17723  because ligatures might change when a hyphen is present. {\sl The \TeX book\/}
 17724  discusses the difficulties of the word ``difficult'', and
 17725  the discretionary material surrounding a
 17726  hyphen can be considerably more complex than that. Suppose
 17727  \.{abcdef} is a word in a font for which the only ligatures are \.{b\!c},
 17728  \.{c\!d}, \.{d\!e}, and \.{e\!f}. If this word permits hyphenation
 17729  between \.b and \.c, the two patterns with and without hyphenation are
 17730  $\.a\,\.b\,\.-\,\.{c\!d}\,\.{e\!f}$ and $\.a\,\.{b\!c}\,\.{d\!e}\,\.f$.
 17731  Thus the insertion of a hyphen might cause effects to ripple arbitrarily
 17732  far into the rest of the word. A further complication arises if additional
 17733  hyphens appear together with such rippling, e.g., if the word in the
 17734  example just given could also be hyphenated between \.c and \.d; \TeX\
 17735  avoids this by simply ignoring the additional hyphens in such weird cases.}
 17736  
 17737  Still further complications arise in the presence of ligatures that do not
 17738  delete the original characters. When punctuation precedes the word being
 17739  hyphenated, \TeX's method is not perfect under all possible scenarios,
 17740  because punctuation marks and letters can propagate information back and forth.
 17741  For example, suppose the original pre-hyphenation pair
 17742  \.{*a} changes to \.{*y} via a \.{\?=:} ligature, which changes to \.{xy}
 17743  via a \.{=:\?} ligature; if $p_{a-1}=\.x$ and $p_a=\.y$, the reconstitution
 17744  procedure isn't smart enough to obtain \.{xy} again. In such cases the
 17745  font designer should include a ligature that goes from \.{xa} to \.{xy}.
 17746  
 17747  @ The processing is facilitated by a subroutine called |reconstitute|. Given
 17748  a string of characters $x_j\ldots x_n$, there is a smallest index $m\ge j$
 17749  such that the ``translation'' of $x_j\ldots x_n$ by ligatures and kerning
 17750  has the form $y_1\ldots y_t$ followed by the translation of $x_{m+1}\ldots x_n$,
 17751  where $y_1\ldots y_t$ is some nonempty sequence of character, ligature, and
 17752  kern nodes. We call $x_j\ldots x_m$ a ``cut prefix'' of $x_j\ldots x_n$.
 17753  For example, if $x_1x_2x_3=\.{fly}$, and if the font contains `fl' as a
 17754  ligature and a kern between `fl' and `y', then $m=2$, $t=2$, and $y_1$ will
 17755  be a ligature node for `fl' followed by an appropriate kern node~$y_2$.
 17756  In the most common case, $x_j$~forms no ligature with $x_{j+1}$ and we
 17757  simply have $m=j$, $y_1=x_j$. If $m<n$ we can repeat the procedure on
 17758  $x_{m+1}\ldots x_n$ until the entire translation has been found.
 17759  
 17760  The |reconstitute| function returns the integer $m$ and puts the nodes
 17761  $y_1\ldots y_t$ into a linked list starting at |link(hold_head)|,
 17762  getting the input $x_j\ldots x_n$ from the |hu| array. If $x_j=256$,
 17763  we consider $x_j$ to be an implicit left boundary character; in this
 17764  case |j| must be strictly less than~|n|. There is a
 17765  parameter |bchar|, which is either 256 or an implicit right boundary character
 17766  assumed to be present just following~$x_n$. (The value |hu[n+1]| is never
 17767  explicitly examined, but the algorithm imagines that |bchar| is there.)
 17768  
 17769  If there exists an index |k| in the range $j\le k\le m$ such that |hyf[k]|
 17770  is odd and such that the result of |reconstitute| would have been different
 17771  if $x_{k+1}$ had been |hchar|, then |reconstitute| sets |hyphen_passed|
 17772  to the smallest such~|k|. Otherwise it sets |hyphen_passed| to zero.
 17773  
 17774  A special convention is used in the case |j=0|: Then we assume that the
 17775  translation of |hu[0]| appears in a special list of charnodes starting at
 17776  |init_list|; moreover, if |init_lig| is |true|, then |hu[0]| will be
 17777  a ligature character, involving a left boundary if |init_lft| is |true|.
 17778  This facility is provided for cases when a hyphenated
 17779  word is preceded by punctuation (like single or double quotes) that might
 17780  affect the translation of the beginning of the word.
 17781  
 17782  @<Glob...@>=
 17783  @!hyphen_passed:small_number; {first hyphen in a ligature, if any}
 17784  
 17785  @ @<Declare the function called |reconstitute|@>=
 17786  function reconstitute(@!j,@!n:small_number;@!bchar,@!hchar:halfword):
 17787    small_number;
 17788  label continue,done;
 17789  var @!p:pointer; {temporary register for list manipulation} 
 17790  @!t:pointer; {a node being appended to}
 17791  @!q:four_quarters; {character information or a lig/kern instruction}
 17792  @!cur_rh:halfword; {hyphen character for ligature testing}
 17793  @!test_char:halfword; {hyphen or other character for ligature testing}
 17794  @!w:scaled; {amount of kerning}
 17795  @!k:font_index; {position of current lig/kern instruction}
 17796  begin hyphen_passed:=0; t:=hold_head; w:=0; link(hold_head):=null;
 17797   {at this point |ligature_present=lft_hit=rt_hit=false|}
 17798  @<Set up data structures with the cursor following position |j|@>;
 17799  continue:@<If there's a ligature or kern at the cursor position, update the data
 17800    structures, possibly advancing~|j|; continue until the cursor moves@>;
 17801  @<Append a ligature and/or kern to the translation;
 17802    |goto continue| if the stack of inserted ligatures is nonempty@>;
 17803  reconstitute:=j;
 17804  end;
 17805  
 17806  @ The reconstitution procedure shares many of the global data structures
 17807  by which \TeX\ has processed the words before they were hyphenated.
 17808  There is an implied ``cursor'' between characters |cur_l| and |cur_r|;
 17809  these characters will be tested for possible ligature activity. If
 17810  |ligature_present| then |cur_l| is a ligature character formed from the
 17811  original characters following |cur_q| in the current translation list.
 17812  There is a ``ligature stack'' between the cursor and character |j+1|,
 17813  consisting of pseudo-ligature nodes linked together by their |link| fields.
 17814  This stack is normally empty unless a ligature command has created a new
 17815  character that will need to be processed later. A pseudo-ligature is
 17816  a special node having a |character| field that represents a potential
 17817  ligature and a |lig_ptr| field that points to a |char_node| or is |null|.
 17818  We have
 17819  $$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr
 17820    |qi(hu[j+1])|,&if |lig_stack=null| and |j<n|;\cr
 17821    bchar,&if |lig_stack=null| and |j=n|.\cr}$$
 17822  
 17823  @<Glob...@>=
 17824  @!cur_l,@!cur_r:halfword; {characters before and after the cursor}
 17825  @!cur_q:pointer; {where a ligature should be detached}
 17826  @!lig_stack:pointer; {unfinished business to the right of the cursor}
 17827  @!ligature_present:boolean; {should a ligature node be made for |cur_l|?}
 17828  @!lft_hit,@!rt_hit:boolean; {did we hit a ligature with a boundary character?}
 17829  
 17830  @ @d append_charnode_to_t(#)== begin link(t):=get_avail; t:=link(t);
 17831      font(t):=hf; character(t):=#;
 17832      end
 17833  @d set_cur_r==begin if j<n then cur_r:=qi(hu[j+1])@+else cur_r:=bchar;
 17834      if odd(hyf[j]) then cur_rh:=hchar@+else cur_rh:=non_char;
 17835      end
 17836  
 17837  @<Set up data structures with the cursor following position |j|@>=
 17838  cur_l:=qi(hu[j]); cur_q:=t;
 17839  if j=0 then
 17840    begin ligature_present:=init_lig; p:=init_list;
 17841    if ligature_present then lft_hit:=init_lft;
 17842    while p>null do
 17843      begin append_charnode_to_t(character(p)); p:=link(p);
 17844      end;
 17845    end
 17846  else if cur_l<non_char then append_charnode_to_t(cur_l);
 17847  lig_stack:=null; set_cur_r
 17848  
 17849  @ We may want to look at the lig/kern program twice, once for a hyphen
 17850  and once for a normal letter. (The hyphen might appear after the letter
 17851  in the program, so we'd better not try to look for both at once.)
 17852  
 17853  @<If there's a ligature or kern at the cursor position, update...@>=
 17854  if cur_l=non_char then
 17855    begin k:=bchar_label[hf];
 17856    if k=non_address then goto done@+else q:=font_info[k].qqqq;
 17857    end
 17858  else begin q:=char_info(hf)(cur_l);
 17859    if char_tag(q)<>lig_tag then goto done;
 17860    k:=lig_kern_start(hf)(q); q:=font_info[k].qqqq;
 17861    if skip_byte(q)>stop_flag then
 17862      begin k:=lig_kern_restart(hf)(q); q:=font_info[k].qqqq;
 17863      end;
 17864    end; {now |k| is the starting address of the lig/kern program}
 17865  if cur_rh<non_char then test_char:=cur_rh@+else test_char:=cur_r;
 17866  loop@+begin if next_char(q)=test_char then if skip_byte(q)<=stop_flag then
 17867      if cur_rh<non_char then
 17868        begin hyphen_passed:=j; hchar:=non_char; cur_rh:=non_char;
 17869        goto continue;
 17870        end
 17871      else begin if hchar<non_char then if odd(hyf[j]) then
 17872          begin hyphen_passed:=j; hchar:=non_char;
 17873          end;
 17874        if op_byte(q)<kern_flag then
 17875        @<Carry out a ligature replacement, updating the cursor structure
 17876          and possibly advancing~|j|; |goto continue| if the cursor doesn't
 17877          advance, otherwise |goto done|@>;
 17878        w:=char_kern(hf)(q); goto done; {this kern will be inserted below}
 17879       end;
 17880    if skip_byte(q)>=stop_flag then
 17881      if cur_rh=non_char then goto done
 17882      else begin cur_rh:=non_char; goto continue;
 17883        end;
 17884    k:=k+qo(skip_byte(q))+1; q:=font_info[k].qqqq;
 17885    end;
 17886  done:
 17887  
 17888  @ @d wrap_lig(#)==if ligature_present then
 17889      begin p:=new_ligature(hf,cur_l,link(cur_q));
 17890      if lft_hit then
 17891        begin subtype(p):=2; lft_hit:=false;
 17892        end;
 17893      if # then if lig_stack=null then
 17894        begin incr(subtype(p)); rt_hit:=false;
 17895        end;
 17896      link(cur_q):=p; t:=p; ligature_present:=false;
 17897      end
 17898  @d pop_lig_stack==begin if lig_ptr(lig_stack)>null then
 17899      begin link(t):=lig_ptr(lig_stack); {this is a charnode for |hu[j+1]|}
 17900      t:=link(t); incr(j);
 17901      end;
 17902    p:=lig_stack; lig_stack:=link(p); free_node(p,small_node_size);
 17903    if lig_stack=null then set_cur_r@+else cur_r:=character(lig_stack);
 17904    end {if |lig_stack| isn't |null| we have |cur_rh=non_char|}
 17905  
 17906  @<Append a ligature and/or kern to the translation...@>=
 17907  wrap_lig(rt_hit);
 17908  if w<>0 then
 17909    begin link(t):=new_kern(w); t:=link(t); w:=0;
 17910    end;
 17911  if lig_stack>null then
 17912    begin cur_q:=t; cur_l:=character(lig_stack); ligature_present:=true;
 17913    pop_lig_stack; goto continue;
 17914    end
 17915  
 17916  @ @<Carry out a ligature replacement, updating the cursor structure...@>=
 17917  begin if cur_l=non_char then lft_hit:=true;
 17918  if j=n then if lig_stack=null then rt_hit:=true;
 17919  check_interrupt; {allow a way out in case there's an infinite ligature loop}
 17920  case op_byte(q) of
 17921  qi(1),qi(5):begin cur_l:=rem_byte(q); {\.{=:\?}, \.{=:\?>}}
 17922    ligature_present:=true;
 17923    end;
 17924  qi(2),qi(6):begin cur_r:=rem_byte(q); {\.{\?=:}, \.{\?=:>}}
 17925    if lig_stack>null then character(lig_stack):=cur_r
 17926    else begin lig_stack:=new_lig_item(cur_r);
 17927      if j=n then bchar:=non_char
 17928      else begin p:=get_avail; lig_ptr(lig_stack):=p;
 17929        character(p):=qi(hu[j+1]); font(p):=hf;
 17930        end;
 17931      end;
 17932    end;
 17933  qi(3):begin cur_r:=rem_byte(q); {\.{\?=:\?}}
 17934    p:=lig_stack; lig_stack:=new_lig_item(cur_r); link(lig_stack):=p;
 17935    end;
 17936  qi(7),qi(11):begin wrap_lig(false); {\.{\?=:\?>}, \.{\?=:\?>>}}
 17937    cur_q:=t; cur_l:=rem_byte(q); ligature_present:=true;
 17938    end;
 17939  othercases begin cur_l:=rem_byte(q); ligature_present:=true; {\.{=:}}
 17940    if lig_stack>null then pop_lig_stack
 17941    else if j=n then goto done
 17942    else begin append_charnode_to_t(cur_r); incr(j); set_cur_r;
 17943      end;
 17944    end
 17945  endcases;
 17946  if op_byte(q)>qi(4) then if op_byte(q)<>qi(7) then goto done;
 17947  goto continue;
 17948  end
 17949  
 17950  @ Okay, we're ready to insert the potential hyphenations that were found.
 17951  When the following program is executed, we want to append the word
 17952  |hu[1..hn]| after node |ha|, and node |q| should be appended to the result.
 17953  During this process, the variable |i| will be a temporary
 17954  index into |hu|; the variable |j| will be an index to our current position
 17955  in |hu|; the variable |l| will be the counterpart of |j|, in a discretionary
 17956  branch; the variable |r| will point to new nodes being created; and
 17957  we need a few new local variables:
 17958  
 17959  @<Local variables for hyph...@>=
 17960  @!major_tail,@!minor_tail:pointer; {the end of lists in the main and
 17961    discretionary branches being reconstructed}
 17962  @!c:ASCII_code; {character temporarily replaced by a hyphen}
 17963  @!c_loc:0..63; {where that character came from}
 17964  @!r_count:integer; {replacement count for discretionary}
 17965  @!hyf_node:pointer; {the hyphen, if it exists}
 17966  
 17967  @ When the following code is performed, |hyf[0]| and |hyf[hn]| will be zero.
 17968  
 17969  @<Reconstitute nodes for the hyphenated word...@>=
 17970  repeat l:=j; j:=reconstitute(j,hn,bchar,qi(hyf_char))+1;
 17971  if hyphen_passed=0 then
 17972    begin link(s):=link(hold_head);
 17973    while link(s)>null do s:=link(s);
 17974    if odd(hyf[j-1]) then
 17975      begin l:=j; hyphen_passed:=j-1; link(hold_head):=null;
 17976      end;
 17977    end;
 17978  if hyphen_passed>0 then
 17979    @<Create and append a discretionary node as an alternative to the
 17980      unhyphenated word, and continue to develop both branches until they
 17981      become equivalent@>;
 17982  until j>hn;
 17983  link(s):=q
 17984  
 17985  @ In this repeat loop we will insert another discretionary if |hyf[j-1]| is
 17986  odd, when both branches of the previous discretionary end at position |j-1|.
 17987  Strictly speaking, we aren't justified in doing this, because we don't know
 17988  that a hyphen after |j-1| is truly independent of those branches. But in almost
 17989  all applications we would rather not lose a potentially valuable hyphenation
 17990  point. (Consider the word `difficult', where the letter `c' is in position |j|.)
 17991  
 17992  @d advance_major_tail==begin major_tail:=link(major_tail); incr(r_count);
 17993      end
 17994  
 17995  @<Create and append a discretionary node as an alternative...@>=
 17996  repeat r:=get_node(small_node_size);
 17997  link(r):=link(hold_head); type(r):=disc_node;
 17998  major_tail:=r; r_count:=0;
 17999  while link(major_tail)>null do advance_major_tail;
 18000  i:=hyphen_passed; hyf[i]:=0;
 18001  @<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>;
 18002  @<Put the \(c)characters |hu[i+1..@,]| into |post_break(r)|, appending to this
 18003    list and to |major_tail| until synchronization has been achieved@>;
 18004  @<Move pointer |s| to the end of the current list, and set |replace_count(r)|
 18005    appropriately@>;
 18006  hyphen_passed:=j-1; link(hold_head):=null;
 18007  until not odd(hyf[j-1])
 18008  
 18009  @ The new hyphen might combine with the previous character via ligature
 18010  or kern. At this point we have |l-1<=i<j| and |i<hn|.
 18011  
 18012  @<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>=
 18013  minor_tail:=null; pre_break(r):=null; hyf_node:=new_character(hf,hyf_char);
 18014  if hyf_node<>null then
 18015    begin incr(i); c:=hu[i]; hu[i]:=hyf_char; free_avail(hyf_node);
 18016    end;
 18017  while l<=i do
 18018    begin l:=reconstitute(l,i,font_bchar[hf],non_char)+1;
 18019    if link(hold_head)>null then
 18020      begin if minor_tail=null then pre_break(r):=link(hold_head)
 18021      else link(minor_tail):=link(hold_head);
 18022      minor_tail:=link(hold_head);
 18023      while link(minor_tail)>null do minor_tail:=link(minor_tail);
 18024      end;
 18025    end;
 18026  if hyf_node<>null then
 18027    begin hu[i]:=c; {restore the character in the hyphen position}
 18028    l:=i; decr(i);
 18029    end
 18030  
 18031  @ The synchronization algorithm begins with |l=i+1<=j|.
 18032  
 18033  @<Put the \(c)characters |hu[i+1..@,]| into |post_break(r)|...@>=
 18034  minor_tail:=null; post_break(r):=null; c_loc:=0;
 18035  if bchar_label[hf]<>non_address then {put left boundary at beginning of new line}
 18036    begin decr(l); c:=hu[l]; c_loc:=l; hu[l]:=256;
 18037    end;
 18038  while l<j do
 18039    begin repeat l:=reconstitute(l,hn,bchar,non_char)+1;
 18040    if c_loc>0 then
 18041      begin hu[c_loc]:=c; c_loc:=0;
 18042      end;
 18043    if link(hold_head)>null then
 18044      begin if minor_tail=null then post_break(r):=link(hold_head)
 18045      else link(minor_tail):=link(hold_head);
 18046      minor_tail:=link(hold_head);
 18047      while link(minor_tail)>null do minor_tail:=link(minor_tail);
 18048      end;
 18049    until l>=j;
 18050    while l>j do
 18051      @<Append characters of |hu[j..@,]| to |major_tail|, advancing~|j|@>;
 18052    end
 18053  
 18054  @ @<Append characters of |hu[j..@,]|...@>=
 18055  begin j:=reconstitute(j,hn,bchar,non_char)+1;
 18056  link(major_tail):=link(hold_head);
 18057  while link(major_tail)>null do advance_major_tail;
 18058  end
 18059  
 18060  @ Ligature insertion can cause a word to grow exponentially in size. Therefore
 18061  we must test the size of |r_count| here, even though the hyphenated text
 18062  was at most 63 characters long.
 18063  
 18064  @<Move pointer |s| to the end of the current list...@>=
 18065  if r_count>127 then {we have to forget the discretionary hyphen}
 18066    begin link(s):=link(r); link(r):=null; flush_node_list(r);
 18067    end
 18068  else begin link(s):=r; replace_count(r):=r_count;
 18069    end;
 18070  s:=major_tail
 18071  
 18072  @* \[42] Hyphenation.
 18073  When a word |hc[1..hn]| has been set up to contain a candidate for hyphenation,
 18074  \TeX\ first looks to see if it is in the user's exception dictionary. If not,
 18075  hyphens are inserted based on patterns that appear within the given word,
 18076  using an algorithm due to Frank~M. Liang.
 18077  @^Liang, Franklin Mark@>
 18078  
 18079  Let's consider Liang's method first, since it is much more interesting than the
 18080  exception-lookup routine.  The algorithm begins by setting |hyf[j]| to zero
 18081  for all |j|, and invalid characters are inserted into |hc[0]|
 18082  and |hc[hn+1]| to serve as delimiters. Then a reasonably fast method is
 18083  used to see which of a given set of patterns occurs in the word
 18084  |hc[0..(hn+1)]|. Each pattern $p_1\ldots p_k$ of length |k| has an associated
 18085  sequence of |k+1| numbers $n_0\ldots n_k$; and if the pattern occurs in
 18086  |hc[(j+1)..(j+k)]|, \TeX\ will set |hyf[j+i]:=@tmax@>(hyf[j+i],@t$n_i$@>)| for
 18087  |0<=i<=k|. After this has been done for each pattern that occurs, a
 18088  discretionary hyphen will be inserted between |hc[j]| and |hc[j+1]| when
 18089  |hyf[j]| is odd, as we have already seen.
 18090  
 18091  The set of patterns $p_1\ldots p_k$ and associated numbers $n_0\ldots n_k$
 18092  depends, of course, on the language whose words are being hyphenated, and
 18093  on the degree of hyphenation that is desired. A method for finding
 18094  appropriate |p|'s and |n|'s, from a given dictionary of words and acceptable
 18095  hyphenations, is discussed in Liang's Ph.D. thesis (Stanford University,
 18096  1983); \TeX\ simply starts with the patterns and works from there.
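The maximum rule described above can be tried out with a deliberately naive Python sketch that scans every pattern at every position instead of using a trie. The helper names |parse_pattern|, |hyphen_values| and |break_points| are invented here, the pattern set is a small one chosen for illustration, and a period is assumed to play the role of the delimiter stored in |hc[0]| and |hc[hn+1]|.

```python
def parse_pattern(pat):
    """Split a pattern such as 'hen5at' into its letters and n_0..n_k."""
    letters, nums = [], [0]
    for ch in pat:
        if ch.isdigit():
            nums[-1] = int(ch)
        else:
            letters.append(ch)
            nums.append(0)
    return "".join(letters), nums


def hyphen_values(word, patterns):
    """Return hyf[0..hn] for the given word."""
    hn = len(word)
    hc = "." + word + "."          # hc[0..hn+1], '.' as assumed delimiter
    hyf = [0] * (hn + 2)
    for pat in patterns:
        p, n = parse_pattern(pat)
        k = len(p)
        for j in range(-1, hn + 2 - k):      # pattern occupies hc[j+1..j+k]
            if hc[j + 1:j + 1 + k] == p:
                for i in range(k + 1):
                    if j + i >= 0:           # skip the nonexistent hyf[-1]
                        hyf[j + i] = max(hyf[j + i], n[i])
    return hyf[:hn + 1]


def break_points(hyf):
    """Positions j where a hyphen may follow the j-th letter."""
    return [j for j, value in enumerate(hyf) if value % 2 == 1]


PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
            "1tio", "2io", "o2n"]
```

For the word `hyphenation' this sketch produces odd values only at positions 2 and~6, corresponding to hy-phen-ation. It ignores the minimum fragment lengths and the language index that the real routine takes into account.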
 18097  
 18098  @ The patterns are stored in a compact table that is also efficient for
 18099  retrieval, using a variant of ``trie memory'' [cf.\ {\sl The Art of
 18100  Computer Programming \bf3} (1973), 481--505]. We can find each pattern
 18101  $p_1\ldots p_k$ by letting $z_0$ be one greater than the relevant language
 18102  index and then, for |1<=i<=k|,
 18103  setting |@t$z_i$@>:=trie_link@t$(z_{i-1})+p_i$@>|; the pattern will be
 18104  identified by the number $z_k$. Since all the pattern information is
 18105  packed together into a single |trie_link| array, it is necessary to
 18106  prevent confusion between the data from inequivalent patterns, so another
 18107  table is provided such that |trie_char@t$(z_i)=p_i$@>| for all |i|. There
 18108  is also a table |trie_op|$(z_k)$ to identify the numbers $n_0\ldots n_k$
 18109  associated with $p_1\ldots p_k$.
 18110  
 18111  Comparatively few different number sequences $n_0\ldots n_k$ actually occur,
 18112  since most of the |n|'s are generally zero. Therefore the number sequences
 18113  are encoded in such a way that |trie_op|$(z_k)$ is only one byte long.
 18114  If |trie_op(@t$z_k$@>)<>min_quarterword|, when $p_1\ldots p_k$ has matched
 18115  the letters in |hc[(l-k+1)..l@,]| of language |t|,
 18116  we perform all of the required operations
 18117  for this pattern by carrying out the following little program: Set
 18118  |v:=trie_op(@t$z_k$@>)|. Then set |v:=v+op_start[t]|,
 18119  |hyf[l-hyf_distance[v]]:=@tmax@>(hyf[l-hyf_distance[v]], hyf_num[v])|,
 18120  and |v:=hyf_next[v]|; repeat, if necessary, until |v=min_quarterword|.
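The little program just described is easy to mimic in Python. In this sketch the three tables are ordinary lists with an unused entry at index~0, |min_quarterword| is assumed to be zero, and |op_start_t| stands for |op_start[t]|; the function name |store_maxima| is invented for the illustration.

```python
MIN_QUARTERWORD = 0  # assumed value for this sketch


def store_maxima(hyf, l, v, op_start_t, hyf_distance, hyf_num, hyf_next):
    """Apply the chain of operations that starts with opcode v.

    l is the index in hyf of the last letter matched by the pattern.
    """
    while v != MIN_QUARTERWORD:
        v = v + op_start_t
        i = l - hyf_distance[v]
        if hyf_num[v] > hyf[i]:
            hyf[i] = hyf_num[v]
        v = hyf_next[v]


# Hand-built tables holding two operations (index 0 is unused):
# op 1 sets n=1 at distance 0 and stops; op 2 sets n=2 at distance 3,
# then continues with op 1.
HYF_DISTANCE = [None, 0, 3]
HYF_NUM = [None, 1, 2]
HYF_NEXT = [None, MIN_QUARTERWORD, 1]

hyf = [0] * 6
store_maxima(hyf, 5, 2, 0, HYF_DISTANCE, HYF_NUM, HYF_NEXT)
```

After the call, |hyf| holds 2 at index~2 and 1 at index~5, and zeros elsewhere; larger values already present in |hyf| would have been left alone.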
 18121  
 18122  @<Types...@>=
 18123  @!trie_pointer=0..trie_size; {an index into |trie|}
 18124  
 18125  @ @d trie_link(#)==trie[#].rh {``downward'' link in a trie}
 18126  @d trie_char(#)==trie[#].b1 {character matched at this trie location}
 18127  @d trie_op(#)==trie[#].b0 {program for hyphenation at this trie location}
 18128  
 18129  @<Glob...@>=
 18130  @!trie:array[trie_pointer] of two_halves; {|trie_link|, |trie_char|, |trie_op|}
 18131  @!hyf_distance:array[1..trie_op_size] of small_number; {position |k-j| of $n_j$}
 18132  @!hyf_num:array[1..trie_op_size] of small_number; {value of $n_j$}
 18133  @!hyf_next:array[1..trie_op_size] of quarterword; {continuation code}
 18134  @!op_start:array[ASCII_code] of 0..trie_op_size; {offset for current language}
 18135  
 18136  @ @<Local variables for hyph...@>=
 18137  @!z:trie_pointer; {an index into |trie|}
 18138  @!v:integer; {an index into |hyf_distance|, etc.}
 18139  
 18140  @ Assuming that these auxiliary tables have been set up properly, the
 18141  hyphenation algorithm is quite short. In the following code we set |hc[hn+2]|
 18142  to the impossible value 256, in order to guarantee that |hc[hn+3]| will
 18143  never be fetched.
 18144  
 18145  @<Find hyphen locations for the word in |hc|...@>=
 18146  for j:=0 to hn do hyf[j]:=0;
 18147  @<Look for the word |hc[1..hn]| in the exception table, and |goto found| (with
 18148    |hyf| containing the hyphens) if an entry is found@>;
 18149  if trie_char(cur_lang+1)<>qi(cur_lang) then return; {no patterns for |cur_lang|}
 18150  hc[0]:=0; hc[hn+1]:=0; hc[hn+2]:=256; {insert delimiters}
 18151  for j:=0 to hn-r_hyf+1 do
 18152    begin z:=trie_link(cur_lang+1)+hc[j]; l:=j;
 18153    while hc[l]=qo(trie_char(z)) do
 18154      begin if trie_op(z)<>min_quarterword then
 18155        @<Store \(m)maximum values in the |hyf| table@>;
 18156      incr(l); z:=trie_link(z)+hc[l];
 18157      end;
 18158    end;
 18159  found: for j:=0 to l_hyf-1 do hyf[j]:=0;
 18160  for j:=0 to r_hyf-1 do hyf[hn-j]:=0
 18161  
 18162  @ @<Store \(m)maximum values in the |hyf| table@>=
 18163  begin v:=trie_op(z);
 18164  repeat v:=v+op_start[cur_lang]; i:=l-hyf_distance[v];
 18165  if hyf_num[v]>hyf[i] then hyf[i]:=hyf_num[v];
 18166  v:=hyf_next[v];
 18167  until v=min_quarterword;
 18168  end
 18169  
 18170  @ The exception table that is built by \TeX's \.{\\hyphenation} primitive is
 18171  organized as an ordered hash table [cf.\ Amble and Knuth, {\sl The Computer
 18172  @^Amble, Ole@> @^Knuth, Donald Ervin@>
 18173  Journal\/ \bf17} (1974), 135--142] using linear probing. If $\alpha$ and
 18174  $\beta$ are words, we will say that $\alpha<\beta$ if $\vert\alpha\vert<
 18175  \vert\beta\vert$ or if $\vert\alpha\vert=\vert\beta\vert$ and
 18176  $\alpha$ is lexicographically smaller than $\beta$. (The notation $\vert
 18177  \alpha\vert$ stands for the length of $\alpha$.) The idea of ordered hashing
 18178  is to arrange the table so that a given word $\alpha$ can be sought by computing
 18179  a hash address $h=h(\alpha)$ and then looking in table positions |h|, |h-1|,
 18180  \dots, until encountering the first word $\L\alpha$. If this word is
 18181  different from $\alpha$, we can conclude that $\alpha$ is not in the table.
 18182  
 18183  The words in the table point to lists in |mem| that specify hyphen positions
 18184  in their |info| fields. The list for $c_1\ldots c_n$ contains the number |k| if
 18185  the word $c_1\ldots c_n$ has a discretionary hyphen between $c_k$ and
 18186  $c_{k+1}$.
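Ordered hashing with downward linear probing can be sketched as follows. This is an illustration only: the class |OrderedHash| and its methods are invented names, the hash function is passed in as a parameter, and Python strings and lists stand in for the string pool and the lists in |mem|.

```python
def word_key(w):
    """Order words by length first, then lexicographically."""
    return (len(w), w)


class OrderedHash:
    def __init__(self, size, hash_fn):
        self.size = size                    # positions 0..size are usable
        self.slots = [None] * (size + 1)    # None marks an empty position
        self.hash_fn = hash_fn              # must return a value in 0..size
        self.count = 0

    def _prev(self, h):
        return h - 1 if h > 0 else self.size

    def insert(self, word, value):
        if self.count == self.size:         # keep at least one slot empty
            raise OverflowError("exception dictionary is full")
        self.count += 1
        h = self.hash_fn(word)
        item = (word, value)
        while self.slots[h] is not None:
            # A resident word that is <= the incoming one gets displaced.
            if word_key(self.slots[h][0]) <= word_key(item[0]):
                self.slots[h], item = item, self.slots[h]
            h = self._prev(h)
        self.slots[h] = item

    def lookup(self, word):
        h = self.hash_fn(word)
        while True:
            entry = self.slots[h]
            if entry is None or word_key(entry[0]) < word_key(word):
                return None                 # word cannot be further down
            if entry[0] == word:
                return entry[1]
            h = self._prev(h)


# A constant hash function forces collisions, which makes the ordering visible.
table = OrderedHash(7, lambda w: 3)
for w, positions in [("ab", [1]), ("abc", [2]), ("aa", [1])]:
    table.insert(w, positions)
```

Because every probe sequence meets its words in decreasing order, an unsuccessful search for `ac' in this table stops at `ab' without examining `aa'.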
 18187  
 18188  @<Types...@>=
 18189  @!hyph_pointer=0..hyph_size; {an index into the ordered hash table}
 18190  
 18191  @ @<Glob...@>=
 18192  @!hyph_word:array[hyph_pointer] of str_number; {exception words}
 18193  @!hyph_list:array[hyph_pointer] of pointer; {lists of hyphen positions}
 18194  @!hyph_count:hyph_pointer; {the number of words in the exception dictionary}
 18195  
 18196  @ @<Local variables for init...@>=
 18197  @!z:hyph_pointer; {runs through the exception dictionary}
 18198  
 18199  @ @<Set init...@>=
 18200  for z:=0 to hyph_size do
 18201    begin hyph_word[z]:=0; hyph_list[z]:=null;
 18202    end;
 18203  hyph_count:=0;
 18204  
 18205  @ The algorithm for exception lookup is quite simple, as soon as we have
 18206  a few more local variables to work with.
 18207  
 18208  @<Local variables for hyph...@>=
 18209  @!h:hyph_pointer; {an index into |hyph_word| and |hyph_list|}
 18210  @!k:str_number; {an index into |str_start|}
 18211  @!u:pool_pointer; {an index into |str_pool|}
 18212  
 18213  @ First we compute the hash code |h|, then we search until we either
 18214  find the word or we don't. Words from different languages are kept
 18215  separate by appending the language code to the string.
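The hash computation can be written out in Python as a small check. The function name |exception_hash| is invented, and the table size 307 used in the example is merely a sample value for |hyph_size|.

```python
def exception_hash(hc, cur_lang, hyph_size):
    """Return (h, codes) where codes is hc[1..hn] with the language appended."""
    codes = list(hc) + [cur_lang]      # corresponds to incr(hn); hc[hn]:=cur_lang
    h = codes[0]
    for c in codes[1:]:
        h = (h + h + c) % hyph_size
    return h, codes


h, codes = exception_hash([97, 98], 0, 307)   # the letters 'a', 'b'; language 0
```

Here |h| comes out as 277: the first step gives $(97+97+98) \bmod 307 = 292$ and the second gives $(292+292+0) \bmod 307 = 277$.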
 18216  
 18217  @<Look for the word |hc[1...@>=
 18218  h:=hc[1]; incr(hn); hc[hn]:=cur_lang;
 18219  for j:=2 to hn do h:=(h+h+hc[j]) mod hyph_size;
 18220  loop@+  begin @<If the string |hyph_word[h]| is less than \(hc)|hc[1..hn]|,
 18221      |goto not_found|; but if the two strings are equal,
 18222      set |hyf| to the hyphen positions and |goto found|@>;
 18223    if h>0 then decr(h)@+else h:=hyph_size;
 18224    end;
 18225  not_found: decr(hn)
 18226  
 18227  @ @<If the string |hyph_word[h]| is less than \(hc)...@>=
 18228  k:=hyph_word[h]; if k=0 then goto not_found;
 18229  if length(k)<hn then goto not_found;
 18230  if length(k)=hn then
 18231    begin j:=1; u:=str_start[k];
 18232    repeat if so(str_pool[u])<hc[j] then goto not_found;
 18233    if so(str_pool[u])>hc[j] then goto done;
 18234    incr(j); incr(u);
 18235    until j>hn;
 18236    @<Insert hyphens as specified in |hyph_list[h]|@>;
 18237    decr(hn); goto found;
 18238    end;
 18239  done:
 18240  
 18241  @ @<Insert hyphens as specified...@>=
 18242  s:=hyph_list[h];
 18243  while s<>null do
 18244    begin hyf[info(s)]:=1; s:=link(s);
 18245    end
 18246  
 18247  @ @<Search |hyph_list| for pointers to |p|@>=
 18248  for q:=0 to hyph_size do
 18249    begin if hyph_list[q]=p then
 18250      begin print_nl("HYPH("); print_int(q); print_char(")");
 18251      end;
 18252    end
 18253  
 18254  @ We have now completed the hyphenation routine, so the |line_break| procedure
 18255  is finished at last. Since the hyphenation exception table is fresh in our
 18256  minds, it's a good time to deal with the routine that adds new entries to it.
 18257  
 18258  When \TeX\ has scanned `\.{\\hyphenation}', it calls on a procedure named
 18259  |new_hyph_exceptions| to do the right thing.
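As a rough picture of what must be extracted from each word of a \.{\\hyphenation} list, here is a Python sketch that turns one exception into its letters and its hyphen positions. It is an assumption-laden simplification: |parse_exception| is an invented name, Python's |lower| stands in for the \.{\\lccode} mapping, characters that are not letters are silently skipped rather than reported, and the limit of 63 letters follows the code below.

```python
MAX_LETTERS = 63


def parse_exception(text):
    """Return (letters, positions) for one exception such as 'man-u-script'.

    A position k means that a hyphen is allowed between letters k and k+1.
    """
    letters, positions = [], []
    for ch in text:
        if ch == "-":
            if len(letters) < MAX_LETTERS:
                positions.append(len(letters))
        elif ch.isalpha():
            if len(letters) < MAX_LETTERS:
                letters.append(ch.lower())
        # anything else is ignored in this sketch
    return "".join(letters), positions


word, positions = parse_exception("man-u-script")
```

The result is the word `manuscript' with positions 3 and~4; the procedure below builds the same information as a linked list, with the most recent position first.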
 18260  
 18261  @d set_cur_lang==if language<=0 then cur_lang:=0
 18262    else if language>255 then cur_lang:=0
 18263    else cur_lang:=language
 18264  
 18265  @p procedure new_hyph_exceptions; {enters new exceptions}
 18266  label reswitch, exit, found, not_found;
 18267  var n:0..64; {length of current word; not always a |small_number|}
 18268  @!j:0..64; {an index into |hc|}
 18269  @!h:hyph_pointer; {an index into |hyph_word| and |hyph_list|}
 18270  @!k:str_number; {an index into |str_start|}
 18271  @!p:pointer; {head of a list of hyphen positions}
 18272  @!q:pointer; {used when creating a new node for list |p|}
 18273  @!s,@!t:str_number; {strings being compared or stored}
 18274  @!u,@!v:pool_pointer; {indices into |str_pool|}
 18275  begin scan_left_brace; {a left brace must follow \.{\\hyphenation}}
 18276  set_cur_lang;
 18277  @<Enter as many hyphenation exceptions as are listed,
 18278  until coming to a right brace; then |return|@>;
 18279  exit:end;
 18280  
 18281  @ @<Enter as many...@>=
 18282  n:=0; p:=null;
 18283  loop@+  begin get_x_token;
 18284    reswitch: case cur_cmd of
 18285    letter,other_char,char_given:@<Append a new letter or hyphen@>;
 18286    char_num: begin scan_char_num; cur_chr:=cur_val; cur_cmd:=char_given;
 18287      goto reswitch;
 18288      end;
 18289    spacer,right_brace: begin if n>1 then @<Enter a hyphenation exception@>;
 18290      if cur_cmd=right_brace then return;
 18291      n:=0; p:=null;
 18292      end;
 18293    othercases @<Give improper \.{\\hyphenation} error@>
 18294    endcases;
 18295    end
 18296  
 18297  @ @<Give improper \.{\\hyph...@>=
 18298  begin print_err("Improper "); print_esc("hyphenation");
 18299  @.Improper \\hyphenation...@>
 18300    print(" will be flushed");
 18301  help2("Hyphenation exceptions must contain only letters")@/
 18302    ("and hyphens. But continue; I'll forgive and forget.");
 18303  error;
 18304  end
 18305  
 18306  @ @<Append a new letter or hyphen@>=
 18307  if cur_chr="-" then @<Append the value |n| to list |p|@>
 18308  else  begin if lc_code(cur_chr)=0 then
 18309      begin print_err("Not a letter");
 18310  @.Not a letter@>
 18311      help2("Letters in \hyphenation words must have \lccode>0.")@/
 18312        ("Proceed; I'll ignore the character I just read.");
 18313      error;
 18314      end
 18315    else if n<63 then
 18316      begin incr(n); hc[n]:=lc_code(cur_chr);
 18317      end;
 18318    end
 18319  
 18320  @ @<Append the value |n| to list |p|@>=
 18321  begin if n<63 then
 18322    begin q:=get_avail; link(q):=p; info(q):=n; p:=q;
 18323    end;
 18324  end
 18325  
 18326  @ @<Enter a hyphenation exception@>=
 18327  begin incr(n); hc[n]:=cur_lang; str_room(n); h:=0;
 18328  for j:=1 to n do
 18329    begin h:=(h+h+hc[j]) mod hyph_size;
 18330    append_char(hc[j]);
 18331    end;
 18332  s:=make_string;
 18333  @<Insert the \(p)pair |(s,p)| into the exception table@>;
 18334  end
 18335  
 18336  @ @<Insert the \(p)pair |(s,p)|...@>=
 18337  if hyph_count=hyph_size then overflow("exception dictionary",hyph_size);
 18338  @:TeX capacity exceeded exception dictionary}{\quad exception dictionary@>
 18339  incr(hyph_count);
 18340  while hyph_word[h]<>0 do
 18341    begin @<If the string |hyph_word[h]| is less than \(or)or equal to
 18342    |s|, interchange |(hyph_word[h],hyph_list[h])| with |(s,p)|@>;
 18343    if h>0 then decr(h)@+else h:=hyph_size;
 18344    end;
 18345  hyph_word[h]:=s; hyph_list[h]:=p
 18346  
 18347  @ @<If the string |hyph_word[h]| is less than \(or)...@>=
 18348  k:=hyph_word[h];
 18349  if length(k)<length(s) then goto found;
 18350  if length(k)>length(s) then goto not_found;
 18351  u:=str_start[k]; v:=str_start[s];
 18352  repeat if str_pool[u]<str_pool[v] then goto found;
 18353  if str_pool[u]>str_pool[v] then goto not_found;
 18354  incr(u); incr(v);
 18355  until u=str_start[k+1];
 18356  found:q:=hyph_list[h]; hyph_list[h]:=p; p:=q;@/
 18357  t:=hyph_word[h]; hyph_word[h]:=s; s:=t;
 18358  not_found:
 18359  
 18360  @* \[43] Initializing the hyphenation tables.
 18361  The trie for \TeX's hyphenation algorithm is built from a sequence of
 18362  patterns following a \.{\\patterns} specification. Such a specification
 18363  is allowed only in \.{INITEX}, since the extra memory for auxiliary tables
 18364  and for the initialization program itself would only clutter up the
 18365  production version of \TeX\ with a lot of deadwood.
 18366  
 18367  The first step is to build a trie that is linked, instead of packed
 18368  into sequential storage, so that insertions are readily made.
 18369  After all patterns have been processed, \.{INITEX}
 18370  compresses the linked trie by identifying common subtries. Finally the
 18371  trie is packed into the efficient sequential form that the hyphenation
 18372  algorithm actually uses.
 18373  
 18374  @<Declare subprocedures for |line_break|@>=
 18375  @!init @<Declare procedures for preprocessing hyphenation patterns@>@;
 18376  tini
 18377  
 18378  @ Before we discuss trie building in detail, let's consider the simpler
 18379  problem of creating the |hyf_distance|, |hyf_num|, and |hyf_next| arrays.
 18380  
 18381  Suppose, for example, that \TeX\ reads the pattern `\.{ab2cde1}'. This is
 18382  a pattern of length 5, with $n_0\ldots n_5=0\,0\,2\,0\,0\,1$ in the
 18383  notation above. We want the corresponding |trie_op| code |v| to have
 18384  |hyf_distance[v]=3|, |hyf_num[v]=2|, and |hyf_next[v]=@t$v^\prime$@>|,
 18385  where the auxiliary |trie_op| code $v^\prime$ has
 18386  |hyf_distance[@t$v^\prime$@>]=0|, |hyf_num[@t$v^\prime$@>]=1|, and
 18387  |hyf_next[@t$v^\prime$@>]=min_quarterword|.
 18388  
 18389  \TeX\ computes an appropriate value |v| with the |new_trie_op| subroutine
 18390  below, by setting
 18391  $$\hbox{|@t$v^\prime$@>:=new_trie_op(0,1,min_quarterword)|,\qquad
 18392  |v:=new_trie_op(3,2,@t$v^\prime$@>)|.}$$
 18393  This subroutine looks up its three
 18394  parameters in a special hash table, assigning a new value only if these
 18395  three have not appeared before for the current language.
 18396  
 18397  The hash table is called |trie_op_hash|, and the number of entries it contains
 18398  is |trie_op_ptr|.
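The effect of such a lookup can be imitated in Python with a dictionary in place of the hash table. This is only a sketch under stated assumptions: |OpTable| and |ops_for_pattern| are invented names, |min_quarterword| is taken to be zero, no overflow checks are made, and the right-to-left order of the calls is inferred from the example above.

```python
MIN_QUARTERWORD = 0  # assumed value for this sketch


class OpTable:
    def __init__(self):
        self.codes = {}   # (d, n, v, lang) -> opcode; stands in for the hash table
        self.used = {}    # lang -> largest opcode handed out so far
        self.ops = []     # stored quadruples, in order of creation

    def new_trie_op(self, d, n, v, lang=0):
        key = (d, n, v, lang)
        if key in self.codes:              # seen before for this language
            return self.codes[key]
        u = self.used.get(lang, MIN_QUARTERWORD) + 1
        self.used[lang] = u
        self.codes[key] = u
        self.ops.append(key)
        return u


def ops_for_pattern(table, pattern, lang=0):
    """Return the opcode for a pattern written like 'ab2cde1'."""
    nums = [0]
    for ch in pattern:
        if ch.isdigit():
            nums[-1] = int(ch)
        else:
            nums.append(0)
    k = len(nums) - 1                      # number of letters
    v = MIN_QUARTERWORD
    for l in range(k, -1, -1):             # rightmost nonzero number first
        if nums[l] != 0:
            v = table.new_trie_op(k - l, nums[l], v, lang)
    return v


ops = OpTable()
v = ops_for_pattern(ops, "ab2cde1")
```

Two quadruples are stored, $(0,1,0)$ and then $(3,2,1)$ for language~0, and |v| is the second opcode. Presenting the same pattern again, or another pattern that ends with the same number, reuses the existing codes instead of creating new ones.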
 18399  
 18400  @<Glob...@>=
 18401  @!init @!trie_op_hash:array[-trie_op_size..trie_op_size] of 0..trie_op_size;
 18402    {trie op codes for quadruples}
 18403  @!trie_used:array[ASCII_code] of quarterword;
 18404    {largest opcode used so far for this language}
 18405  @!trie_op_lang:array[1..trie_op_size] of ASCII_code;
 18406    {language part of a hashed quadruple}
 18407  @!trie_op_val:array[1..trie_op_size] of quarterword;
 18408    {opcode corresponding to a hashed quadruple}
 18409  @!trie_op_ptr:0..trie_op_size; {number of stored ops so far}
 18410  tini
 18411  
 18412  @ It's tempting to remove the |overflow| stops in the following procedure;
 18413  |new_trie_op| could return |min_quarterword| (thereby simply ignoring
 18414  part of a hyphenation pattern) instead of aborting the job. However, that would
 18415  lead to different hyphenation results on different installations of \TeX\
 18416  using the same patterns. The |overflow| stops are necessary for portability
 18417  of patterns.
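
The lookup in |new_trie_op| uses open addressing with a downward linear probe over the index range |-trie_op_size..trie_op_size|. A minimal Python sketch of just that probe order follows; the function name `probe_sequence` and the parameter `size` (standing for |trie_op_size|) are assumptions made for the illustration.

```python
from itertools import islice

def probe_sequence(d, n, v, lang, size):
    """Yield the hash slots that would be tried, in order, for one triple."""
    h = abs(n + 313 * d + 361 * v + 1009 * lang) % (2 * size) - size
    while True:
        yield h
        # step down by one, wrapping from -size around to +size
        h = h - 1 if h > -size else size


first_probes = list(islice(probe_sequence(3, 2, 1, 0, 4), 9))
```

With the tiny table size 4, the probes for the triple (3, 2, 1) in language 0 start at slot 2 and eventually wrap from -4 to 4.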
 18418  
 18419  @<Declare procedures for preprocessing hyph...@>=
 18420  function new_trie_op(@!d,@!n:small_number;@!v:quarterword):quarterword;
 18421  label exit;
 18422  var h:-trie_op_size..trie_op_size; {trial hash location}
 18423  @!u:quarterword; {trial op code}
 18424  @!l:0..trie_op_size; {pointer to stored data}
 18425  begin h:=abs(n+313*d+361*v+1009*cur_lang) mod (trie_op_size+trie_op_size)
 18426    - trie_op_size;
 18427  loop@+  begin l:=trie_op_hash[h];
 18428    if l=0 then {empty position found for a new op}
 18429      begin if trie_op_ptr=trie_op_size then
 18430        overflow("pattern memory ops",trie_op_size);
 18431      u:=trie_used[cur_lang];
 18432      if u=max_quarterword then
 18433        overflow("pattern memory ops per language",
 18434          max_quarterword-min_quarterword);
 18435      incr(trie_op_ptr); incr(u); trie_used[cur_lang]:=u;
 18436      hyf_distance[trie_op_ptr]:=d;
 18437      hyf_num[trie_op_ptr]:=n; hyf_next[trie_op_ptr]:=v;
 18438      trie_op_lang[trie_op_ptr]:=cur_lang; trie_op_hash[h]:=trie_op_ptr;
 18439      trie_op_val[trie_op_ptr]:=u; new_trie_op:=u; return;
 18440      end;
 18441    if (hyf_distance[l]=d)and(hyf_num[l]=n)and(hyf_next[l]=v)
 18442     and(trie_op_lang[l]=cur_lang) then
 18443      begin new_trie_op:=trie_op_val[l]; return;
 18444      end;
 18445    if h>-trie_op_size then decr(h)@+else h:=trie_op_size;
 18446    end;
 18447  exit:end;
 18448  
 18449  @ After |new_trie_op| has compressed the necessary opcode information,
 18450  plenty of information is available to unscramble the data into the
 18451  final form needed by our hyphenation algorithm.
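
The unscrambling step is an in-place permutation driven by a table of destinations. The following Python sketch shows the same cycle-following idea on a single list; the name `permute_in_place` is hypothetical, and one list stands in for the three parallel arrays |hyf_distance|, |hyf_num|, and |hyf_next|.

```python
def permute_in_place(items, dest):
    """Move items[j] to index dest[j] for j >= 1; index 0 is unused."""
    for j in range(1, len(items)):
        while dest[j] > j:
            k = dest[j]
            items[j], items[k] = items[k], items[j]  # item j reaches its home k
            dest[j], dest[k] = dest[k], k            # slot k is now settled


items = [None, "c", "a", "b"]
dest = [0, 3, 1, 2]
permute_in_place(items, dest)
```

Each swap puts one record into its final slot, so the total number of swaps is bounded by the number of records.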
 18452  
 18453  @<Sort \(t)the hyphenation op tables into proper order@>=
 18454  op_start[0]:=-min_quarterword;
 18455  for j:=1 to 255 do op_start[j]:=op_start[j-1]+qo(trie_used[j-1]);
 18456  for j:=1 to trie_op_ptr do
 18457    trie_op_hash[j]:=op_start[trie_op_lang[j]]+trie_op_val[j]; {destination}
 18458  for j:=1 to trie_op_ptr do while trie_op_hash[j]>j do
 18459    begin k:=trie_op_hash[j];@/
 18460    t:=hyf_distance[k]; hyf_distance[k]:=hyf_distance[j]; hyf_distance[j]:=t;@/
 18461    t:=hyf_num[k]; hyf_num[k]:=hyf_num[j]; hyf_num[j]:=t;@/
 18462    t:=hyf_next[k]; hyf_next[k]:=hyf_next[j]; hyf_next[j]:=t;@/
 18463    trie_op_hash[j]:=trie_op_hash[k]; trie_op_hash[k]:=k;
 18464    end
 18465  
 18466  @ Before we forget how to initialize the data structures that have been
 18467  mentioned so far, let's write down the code that gets them started.
 18468  
 18469  @<Initialize table entries...@>=
 18470  for k:=-trie_op_size to trie_op_size do trie_op_hash[k]:=0;
 18471  for k:=0 to 255 do trie_used[k]:=min_quarterword;
 18472  trie_op_ptr:=0;
 18473  
 18474  @ The linked trie that is used to preprocess hyphenation patterns appears
 18475  in several global arrays. Each node represents an instruction of the form
 18476  ``if you see character |c|, then perform operation |o|, move to the
 18477  next character, and go to node |l|; otherwise go to node |r|.''
 18478  The four quantities |c|, |o|, |l|, and |r| are stored in four arrays
 18479  |trie_c|, |trie_o|, |trie_l|, and |trie_r|. The root of the trie
 18480  is |trie_l[0]|, and the number of nodes is |trie_ptr|. Null trie
 18481  pointers are represented by zero. To initialize the trie, we simply
 18482  set |trie_l[0]| and |trie_ptr| to zero. We also set |trie_c[0]| to some
 18483  arbitrary value, since the algorithm may access it.
 18484  
 18485  The algorithms maintain the condition
 18486  $$\hbox{|trie_c[trie_r[z]]>trie_c[z]|\qquad
 18487  whenever |z<>0| and |trie_r[z]<>0|};$$ in other words, sibling nodes are
 18488  ordered by their |c| fields.
 18489  
 18490  @d trie_root==trie_l[0] {root of the linked trie}
 18491  
 18492  @<Glob...@>=
 18493  @!init @!trie_c:packed array[trie_pointer] of packed_ASCII_code;
 18494    {characters to match}
 18495  @t\hskip10pt@>@!trie_o:packed array[trie_pointer] of quarterword;
 18496    {operations to perform}
 18497  @t\hskip10pt@>@!trie_l:packed array[trie_pointer] of trie_pointer;
 18498    {left subtrie links}
 18499  @t\hskip10pt@>@!trie_r:packed array[trie_pointer] of trie_pointer;
 18500    {right subtrie links}
 18501  @t\hskip10pt@>@!trie_ptr:trie_pointer; {the number of nodes in the trie}
 18502  @t\hskip10pt@>@!trie_hash:packed array[trie_pointer] of trie_pointer;
 18503    {used to identify equivalent subtries}
 18504  tini
 18505  
 18506  @ Let us suppose that a linked trie has already been constructed.
 18507  Experience shows that we can often reduce its size by recognizing common
 18508  subtries; therefore another hash table is introduced for this purpose,
 18509  somewhat similar to |trie_op_hash|. The new hash table will be
 18510  initialized to zero.
 18511  
 18512  The function |trie_node(p)| returns |p| if |p| is distinct from other nodes
 18513  that it has seen; otherwise it returns the number of the first equivalent
 18514  node that it has seen.
 18515  
 18516  Notice that we might make subtries equivalent even if they correspond to
 18517  patterns for different languages, in which the trie ops might mean quite
 18518  different things. That's perfectly all right.
 18519  
 18520  @<Declare procedures for preprocessing hyph...@>=
 18521  function trie_node(@!p:trie_pointer):trie_pointer; {converts
 18522    to a canonical form}
 18523  label exit;
 18524  var h:trie_pointer; {trial hash location}
 18525  @!q:trie_pointer; {trial trie node}
 18526  begin h:=abs(trie_c[p]+1009*trie_o[p]+@|
 18527      2718*trie_l[p]+3142*trie_r[p]) mod trie_size;
 18528  loop@+  begin q:=trie_hash[h];
 18529    if q=0 then
 18530      begin trie_hash[h]:=p; trie_node:=p; return;
 18531      end;
 18532    if (trie_c[q]=trie_c[p])and(trie_o[q]=trie_o[p])and@|
 18533      (trie_l[q]=trie_l[p])and(trie_r[q]=trie_r[p]) then
 18534      begin trie_node:=q; return;
 18535      end;
 18536    if h>0 then decr(h)@+else h:=trie_size;
 18537    end;
 18538  exit:end;
 18539  
 18540  @ A neat recursive procedure is now able to compress a trie by
 18541  traversing it and applying |trie_node| to its nodes in ``bottom up''
 18542  fashion. We will compress the entire trie by clearing |trie_hash| to
 18543  zero and then saying `|trie_root:=compress_trie(trie_root)|'.
 18544  @^recursion@>
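
A rough Python rendering of the bottom-up compression, under simplifying assumptions: a dictionary named `seen` replaces |trie_hash|, and the four parallel lists `c`, `o`, `l`, `r` are passed explicitly. The tiny trie in the example is made up for the illustration; it holds the two patterns `ab` and `cb`, both ending with op code 5.

```python
def compress_trie(p, c, o, l, r, seen):
    """Return the canonical node equivalent to p, sharing equal subtries."""
    if p == 0:
        return 0
    l[p] = compress_trie(l[p], c, o, l, r, seen)
    r[p] = compress_trie(r[p], c, o, l, r, seen)
    key = (c[p], o[p], l[p], r[p])
    return seen.setdefault(key, p)  # first node seen with this key wins


# node 0 is the header; nodes 2 and 4 are both "b" leaves with op 5
c = [0, "a", "b", "c", "b"]
o = [0, 0, 5, 0, 5]
l = [1, 2, 0, 4, 0]
r = [0, 3, 0, 0, 0]
l[0] = compress_trie(l[0], c, o, l, r, {})
```

After the call, node 3 points to node 2 instead of node 4, so the two identical leaves are shared.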
 18545  
 18546  @<Declare procedures for preprocessing hyph...@>=
 18547  function compress_trie(@!p:trie_pointer):trie_pointer;
 18548  begin if p=0 then compress_trie:=0
 18549  else  begin trie_l[p]:=compress_trie(trie_l[p]);
 18550    trie_r[p]:=compress_trie(trie_r[p]);
 18551    compress_trie:=trie_node(p);
 18552    end;
 18553  end;
 18554  
 18555  @ The compressed trie will be packed into the |trie| array using a
 18556  ``top-down first-fit'' procedure. This is a little tricky, so the reader
 18557  should pay close attention: The |trie_hash| array is cleared to zero
 18558  again and renamed |trie_ref| for this phase of the operation; later on,
 18559  |trie_ref[p]| will be nonzero only if the linked trie node |p| is the
 18560  smallest character
 18561  in a family and if the characters |c| of that family have been allocated to
 18562  locations |trie_ref[p]+c| in the |trie| array. Locations of |trie| that
 18563  are in use will have |trie_link=0|, while the unused holes in |trie|
 18564  will be doubly linked with |trie_link| pointing to the next larger vacant
 18565  location and |trie_back| pointing to the next smaller one. This double
 18566  linking will have been carried out only as far as |trie_max|, where
 18567  |trie_max| is the largest index of |trie| that will be needed.
 18568  To save time at the low end of the trie, we maintain array entries
 18569  |trie_min[c]| pointing to the smallest hole that is greater than~|c|.
 18570  Another array |trie_taken| tells whether or not a given location is
 18571  equal to |trie_ref[p]| for some |p|; this array is used to ensure that
 18572  distinct nodes in the compressed trie will have distinct |trie_ref|
 18573  entries.
 18574  
 18575  @d trie_ref==trie_hash {where linked trie families go into |trie|}
 18576  @d trie_back(#)==trie[#].lh {backward links in |trie| holes}
 18577  
 18578  @<Glob...@>=
 18579  @!init @!trie_taken:packed array[1..trie_size] of boolean;
 18580    {does a family start here?}
 18581  @t\hskip10pt@>@!trie_min:array[ASCII_code] of trie_pointer;
 18582    {the first possible slot for each character}
 18583  @t\hskip10pt@>@!trie_max:trie_pointer; {largest location used in |trie|}
 18584  @t\hskip10pt@>@!trie_not_ready:boolean; {is the trie still in linked form?}
 18585  tini
 18586  
 18587  @ Each time \.{\\patterns} appears, it contributes further patterns to
 18588  the future trie, which will be built only when hyphenation is attempted or
 18589  when a format file is dumped. The boolean variable |trie_not_ready|
 18590  will change to |false| when the trie is compressed; this will disable
 18591  further patterns.
 18592  
 18593  @<Initialize table entries...@>=
 18594  trie_not_ready:=true; trie_root:=0; trie_c[0]:=si(0); trie_ptr:=0;
 18595  
 18596  @ Here is how the trie-compression data structures are initialized.
 18597  If storage is tight, it would be possible to overlap |trie_op_hash|,
 18598  |trie_op_lang|, and |trie_op_val| with |trie|, |trie_hash|, and |trie_taken|,
 18599  because we finish with the former just before we need the latter.
 18600  
 18601  @<Get ready to compress the trie@>=
 18602  @<Sort \(t)the hyphenation...@>;
 18603  for p:=0 to trie_size do trie_hash[p]:=0;
 18604  trie_root:=compress_trie(trie_root); {identify equivalent subtries}
 18605  for p:=0 to trie_ptr do trie_ref[p]:=0;
 18606  for p:=0 to 255 do trie_min[p]:=p+1;
 18607  trie_link(0):=1; trie_max:=0
 18608  
 18609  @ The |first_fit| procedure finds the smallest hole |z| in |trie| such that
 18610  a trie family starting at a given node |p| will fit into vacant positions
 18611  starting at |z|. If |c=trie_c[p]|, this means that location |z-c| must
 18612  not already be taken by some other family, and that |z-c+@t$c^\prime$@>|
 18613  must be vacant for all characters $c^\prime$ in the family. The procedure
 18614  sets |trie_ref[p]| to |z-c| when the first fit has been found.
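
The first-fit rule itself can be sketched in Python if we ignore the doubly linked list of holes and the |trie_min| shortcut, both of which only speed up the search. In this simplified version the name `first_fit` is reused for a function that takes the characters of one family directly; the sets `occupied` and `taken` are assumptions standing in for the in-use locations of |trie| and for |trie_taken|.

```python
def first_fit(chars, occupied, taken):
    """Return the smallest base h >= 1 at which the family fits, and pack it."""
    h = 1
    while h in taken or any(h + c in occupied for c in chars):
        h += 1  # try the next candidate base
    taken.add(h)                           # no other family may start here
    occupied.update(h + c for c in chars)  # claim one location per character
    return h


occupied, taken = set(), set()
first_base = first_fit([1, 3], occupied, taken)
second_base = first_fit([1, 2], occupied, taken)
```

The first family lands at base 1 and claims locations 2 and 4; the second family cannot use bases 1, 2, or 3 and ends up at base 4.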
 18615  
 18616  @<Declare procedures for preprocessing hyph...@>=
 18617  procedure first_fit(@!p:trie_pointer); {packs a family into |trie|}
 18618  label not_found,found;
 18619  var h:trie_pointer; {candidate for |trie_ref[p]|}
 18620  @!z:trie_pointer; {runs through holes}
 18621  @!q:trie_pointer; {runs through the family starting at |p|}
 18622  @!c:ASCII_code; {smallest character in the family}
 18623  @!l,@!r:trie_pointer; {left and right neighbors}
 18624  @!ll:1..256; {upper limit of |trie_min| updating}
 18625  begin c:=so(trie_c[p]);
 18626  z:=trie_min[c]; {get the first conceivably good hole}
 18627  loop@+  begin h:=z-c;@/
 18628    @<Ensure that |trie_max>=h+256|@>;
 18629    if trie_taken[h] then goto not_found;
 18630    @<If all characters of the family fit relative to |h|, then
 18631      |goto found|,\30\ otherwise |goto not_found|@>;
 18632    not_found: z:=trie_link(z); {move to the next hole}
 18633    end;
 18634  found: @<Pack the family into |trie| relative to |h|@>;
 18635  end;
 18636  
 18637  @ By making sure that |trie_max| is at least |h+256|, we can be sure that
 18638  |trie_max>z|, since |h=z-c|. It follows that location |trie_max| will
 18639  never be occupied in |trie|, and we will have |trie_max>=trie_link(z)|.
 18640  
 18641  @<Ensure that |trie_max>=h+256|@>=
 18642  if trie_max<h+256 then
 18643    begin if trie_size<=h+256 then overflow("pattern memory",trie_size);
 18644  @:TeX capacity exceeded pattern memory}{\quad pattern memory@>
 18645    repeat incr(trie_max); trie_taken[trie_max]:=false;
 18646    trie_link(trie_max):=trie_max+1; trie_back(trie_max):=trie_max-1;
 18647    until trie_max=h+256;
 18648    end
 18649  
 18650  @ @<If all characters of the family fit relative to |h|...@>=
 18651  q:=trie_r[p];
 18652  while q>0 do
 18653    begin if trie_link(h+so(trie_c[q]))=0 then goto not_found;
 18654    q:=trie_r[q];
 18655    end;
 18656  goto found
 18657  
 18658  @ @<Pack the family into |trie| relative to |h|@>=
 18659  trie_taken[h]:=true; trie_ref[p]:=h; q:=p;
 18660  repeat z:=h+so(trie_c[q]); l:=trie_back(z); r:=trie_link(z);
 18661  trie_back(r):=l; trie_link(l):=r; trie_link(z):=0;
 18662  if l<256 then
 18663    begin if z<256 then ll:=z @+else ll:=256;
 18664    repeat trie_min[l]:=r; incr(l);
 18665    until l=ll;
 18666    end;
 18667  q:=trie_r[q];
 18668  until q=0
 18669  
 18670  @ To pack the entire linked trie, we use the following recursive procedure.
 18671  @^recursion@>
 18672  
 18673  @<Declare procedures for preprocessing hyph...@>=
 18674  procedure trie_pack(@!p:trie_pointer); {pack subtries of a family}
 18675  var q:trie_pointer; {a local variable that need not be saved on recursive calls}
 18676  begin repeat q:=trie_l[p];
 18677  if (q>0)and(trie_ref[q]=0) then
 18678    begin first_fit(q); trie_pack(q);
 18679    end;
 18680  p:=trie_r[p];
 18681  until p=0;
 18682  end;
 18683  
 18684  @ When the whole trie has been allocated into the sequential table, we
 18685  must go through it once again so that |trie| contains the correct
 18686  information. Null pointers in the linked trie will be represented by the
 18687  value~0, which properly implements an ``empty'' family.
 18688  
 18689  @<Move the data into |trie|@>=
 18690  h.rh:=0; h.b0:=min_quarterword; h.b1:=min_quarterword; {|trie_link:=0|,
 18691    |trie_op:=min_quarterword|, |trie_char:=qi(0)|}
 18692  if trie_root=0 then {no patterns were given}
 18693    begin for r:=0 to 256 do trie[r]:=h;
 18694    trie_max:=256;
 18695    end
 18696  else begin trie_fix(trie_root); {this fixes the non-holes in |trie|}
 18697    r:=0; {now we will zero out all the holes}
 18698    repeat s:=trie_link(r); trie[r]:=h; r:=s;
 18699    until r>trie_max;
 18700    end;
 18701  trie_char(0):=qi("?"); {make |trie_char(c)<>c| for all |c|}
 18702  
 18703  @ The fixing-up procedure is, of course, recursive. Since the linked trie
 18704  usually has overlapping subtries, the same data may be moved several
 18705  times; but that causes no harm, and at most as much work is done as it
 18706  took to build the uncompressed trie.
 18707  @^recursion@>
 18708  
 18709  @<Declare procedures for preprocessing hyph...@>=
 18710  procedure trie_fix(@!p:trie_pointer); {moves |p| and its siblings into |trie|}
 18711  var q:trie_pointer; {a local variable that need not be saved on recursive calls}
 18712  @!c:ASCII_code; {another one that need not be saved}
 18713  @!z:trie_pointer; {|trie| reference; this local variable must be saved}
 18714  begin z:=trie_ref[p];
 18715  repeat q:=trie_l[p]; c:=so(trie_c[p]);
 18716  trie_link(z+c):=trie_ref[q]; trie_char(z+c):=qi(c); trie_op(z+c):=trie_o[p];
 18717  if q>0 then trie_fix(q);
 18718  p:=trie_r[p];
 18719  until p=0;
 18720  end;
 18721  
 18722  @ Now let's go back to the easier problem of building the linked
 18723  trie.  When \.{INITEX} has scanned the `\.{\\patterns}' control
 18724  sequence, it calls on |new_patterns| to do the right thing.
 18725  
 18726  @<Declare procedures for preprocessing hyph...@>=
 18727  procedure new_patterns; {initializes the hyphenation pattern data}
 18728  label done, done1;
 18729  var k,@!l:0..64; {indices into |hc| and |hyf|;
 18730                    not always in |small_number| range}
 18731  @!digit_sensed:boolean; {should the next digit be treated as a letter?}
 18732  @!v:quarterword; {trie op code}
 18733  @!p,@!q:trie_pointer; {nodes of trie traversed during insertion}
 18734  @!first_child:boolean; {is |p=trie_l[q]|?}
 18735  @!c:ASCII_code; {character being inserted}
 18736  begin if trie_not_ready then
 18737    begin set_cur_lang; scan_left_brace; {a left brace must follow \.{\\patterns}}
 18738    @<Enter all of the patterns into a linked trie, until coming to a right
 18739    brace@>;
 18740    end
 18741  else begin print_err("Too late for "); print_esc("patterns");
 18742    help1("All patterns must be given before typesetting begins.");
 18743    error; link(garbage):=scan_toks(false,false); flush_list(def_ref);
 18744    end;
 18745  end;
 18746  
 18747  @ Novices are not supposed to be using \.{\\patterns}, so the error
 18748  messages are terse. (Note that all error messages appear in \TeX's string
 18749  pool, even if they are used only by \.{INITEX}.)
 18750  
 18751  @<Enter all of the patterns into a linked trie...@>=
 18752  k:=0; hyf[0]:=0; digit_sensed:=false;
 18753  loop@+  begin get_x_token;
 18754    case cur_cmd of
 18755    letter,other_char:@<Append a new letter or a hyphen level@>;
 18756    spacer,right_brace: begin if k>0 then
 18757        @<Insert a new pattern into the linked trie@>;
 18758      if cur_cmd=right_brace then goto done;
 18759      k:=0; hyf[0]:=0; digit_sensed:=false;
 18760      end;
 18761    othercases begin print_err("Bad "); print_esc("patterns");
 18762  @.Bad \\patterns@>
 18763      help1("(See Appendix H.)"); error;
 18764      end
 18765    endcases;
 18766    end;
 18767  done:
 18768  
 18769  @ @<Append a new letter or a hyphen level@>=
 18770  if digit_sensed or(cur_chr<"0")or(cur_chr>"9") then
 18771    begin if cur_chr="." then cur_chr:=0 {edge-of-word delimiter}
 18772    else  begin cur_chr:=lc_code(cur_chr);
 18773      if cur_chr=0 then
 18774        begin print_err("Nonletter");
 18775  @.Nonletter@>
 18776        help1("(See Appendix H.)"); error;
 18777        end;
 18778      end;
 18779    if k<63 then
 18780      begin incr(k); hc[k]:=cur_chr; hyf[k]:=0; digit_sensed:=false;
 18781      end;
 18782    end
 18783  else if k<63 then
 18784    begin hyf[k]:=cur_chr-"0"; digit_sensed:=true;
 18785    end
 18786  
 18787  @ When the following code comes into play, the pattern $p_1\ldots p_k$
 18788  appears in |hc[1..k]|, and the corresponding sequence of numbers $n_0\ldots
 18789  n_k$ appears in |hyf[0..k]|.
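
Insertion into the linked trie keeps siblings sorted by character. A minimal Python sketch of that loop follows; the class name `LinkedTrie` is hypothetical, characters are plain integers, and the language byte that \TeX\ places in |hc[0]| is omitted. Unlike the code below, the sketch silently overwrites an existing op instead of reporting a duplicate pattern.

```python
class LinkedTrie:
    def __init__(self):
        # Node 0 is a header; its left link is the root of the trie.
        self.c, self.o, self.l, self.r = [0], [0], [0], [0]

    def insert(self, chars, op):
        q = 0
        for ch in chars:
            p = self.l[q]
            first_child = True
            while p > 0 and ch > self.c[p]:  # walk right along the siblings
                q, p, first_child = p, self.r[p], False
            if p == 0 or ch < self.c[p]:     # no node for ch yet: splice one in
                self.c.append(ch)
                self.o.append(0)
                self.l.append(0)
                self.r.append(p)
                p = len(self.c) - 1
                if first_child:
                    self.l[q] = p
                else:
                    self.r[q] = p
            q = p
        self.o[q] = op


t = LinkedTrie()
t.insert([ord("b")], 1)
t.insert([ord("a"), ord("c")], 2)
```

The second insertion places the node for `a` in front of the node for `b`, which preserves the ordering condition on |trie_r| links.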
 18790  
 18791  @<Insert a new pattern into the linked trie@>=
 18792  begin @<Compute the trie op code, |v|, and set |l:=0|@>;
 18793  q:=0; hc[0]:=cur_lang;
 18794  while l<=k do
 18795    begin c:=hc[l]; incr(l); p:=trie_l[q]; first_child:=true;
 18796    while (p>0)and(c>so(trie_c[p])) do
 18797      begin q:=p; p:=trie_r[q]; first_child:=false;
 18798      end;
 18799    if (p=0)or(c<so(trie_c[p])) then
 18800      @<Insert a new trie node between |q| and |p|, and
 18801        make |p| point to it@>;
 18802    q:=p; {now node |q| represents $p_1\ldots p_{l-1}$}
 18803    end;
 18804  if trie_o[q]<>min_quarterword then
 18805    begin print_err("Duplicate pattern");
 18806  @.Duplicate pattern@>
 18807    help1("(See Appendix H.)"); error;
 18808    end;
 18809  trie_o[q]:=v;
 18810  end
 18811  
 18812  @ @<Insert a new trie node between |q| and |p|...@>=
 18813  begin if trie_ptr=trie_size then overflow("pattern memory",trie_size);
 18814  @:TeX capacity exceeded pattern memory}{\quad pattern memory@>
 18815  incr(trie_ptr); trie_r[trie_ptr]:=p; p:=trie_ptr; trie_l[p]:=0;
 18816  if first_child then trie_l[q]:=p@+else trie_r[q]:=p;
 18817  trie_c[p]:=si(c); trie_o[p]:=min_quarterword;
 18818  end
 18819  
 18820  @ @<Compute the trie op code, |v|...@>=
 18821  if hc[1]=0 then hyf[0]:=0;
 18822  if hc[k]=0 then hyf[k]:=0;
 18823  l:=k; v:=min_quarterword;
 18824  loop@+  begin if hyf[l]<>0 then v:=new_trie_op(k-l,hyf[l],v);
 18825    if l>0 then decr(l)@+else goto done1;
 18826    end;
 18827  done1:
 18828  
 18829  @ Finally we put everything together: Here is how the trie gets to its
 18830  final, efficient form.
 18831  The following packing routine is rigged so that the root of the linked
 18832  trie gets mapped into location 1 of |trie|, as required by the hyphenation
 18833  algorithm. This happens because the first call of |first_fit| will
 18834  ``take'' location~1.
 18835  
 18836  @<Declare procedures for preprocessing hyphenation patterns@>=
 18837  procedure init_trie;
 18838  var @!p:trie_pointer; {pointer for initialization}
 18839  @!j,@!k,@!t:integer; {all-purpose registers for initialization}
 18840  @!r,@!s:trie_pointer; {used to clean up the packed |trie|}
 18841  @!h:two_halves; {template used to zero out |trie|'s holes}
 18842  begin @<Get ready to compress the trie@>;
 18843  if trie_root<>0 then
 18844    begin first_fit(trie_root); trie_pack(trie_root);
 18845    end;
 18846  @<Move the data into |trie|@>;
 18847  trie_not_ready:=false;
 18848  end;
 18849  
 18850  @* \[44] Breaking vertical lists into pages.
 18851  The |vsplit| procedure, which implements \TeX's \.{\\vsplit} operation,
 18852  is considerably simpler than |line_break| because it doesn't have to
 18853  worry about hyphenation, and because its mission is to discover a single
 18854  break instead of an optimum sequence of breakpoints.  But before we get
 18855  into the details of |vsplit|, we need to consider a few more basic things.
 18856  
 18857  @ A subroutine called |prune_page_top| takes a pointer to a vlist and
 18858  returns a pointer to a modified vlist in which all glue, kern, and penalty nodes
 18859  have been deleted before the first box or rule node. However, the first
 18860  box or rule is actually preceded by a newly created glue node designed so that
 18861  the topmost baseline will be at distance |split_top_skip| from the top,
 18862  whenever this is possible without backspacing.
 18863  
 18864  In this routine and those that follow, we make use of the fact that a
 18865  vertical list contains no character nodes, hence the |type| field exists
 18866  for each node in the list.
 18867  @^data structure assumptions@>
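
The effect of |prune_page_top| can be sketched in Python on a plain list of nodes. The dictionary representation of nodes and the explicit `split_top_skip` argument are assumptions made for the illustration; the real routine works on linked nodes and takes the glue from the current parameter settings.

```python
DISCARDABLE = {"glue", "kern", "penalty"}
BOXLIKE = {"hlist", "vlist", "rule"}
KEPT = {"whatsit", "mark", "ins"}

def prune_page_top(nodes, split_top_skip):
    """Drop discardable nodes before the first box or rule, then add top glue."""
    out = []
    for i, node in enumerate(nodes):
        kind = node["type"]
        if kind in BOXLIKE:
            # the glue never becomes negative ("without backspacing")
            width = max(split_top_skip - node["height"], 0)
            out.append({"type": "glue", "width": width})
            return out + nodes[i:]
        if kind in KEPT:
            out.append(node)
        elif kind not in DISCARDABLE:
            raise ValueError("unexpected node type: " + kind)
    return out


nodes = [
    {"type": "glue", "width": 5},
    {"type": "mark"},
    {"type": "penalty"},
    {"type": "hlist", "height": 7},
    {"type": "glue", "width": 3},
]
pruned = prune_page_top(nodes, 10)
```

Only material before the first box or rule is affected; the glue that follows the box in the example is left alone.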
 18868  
 18869  @p function prune_page_top(@!p:pointer):pointer; {adjust top after page break}
 18870  var prev_p:pointer; {lags one step behind |p|}
 18871  @!q:pointer; {temporary variable for list manipulation}
 18872  begin prev_p:=temp_head; link(temp_head):=p;
 18873  while p<>null do
 18874    case type(p) of
 18875    hlist_node,vlist_node,rule_node:@<Insert glue for |split_top_skip|
 18876      and set~|p:=null|@>;
 18877    whatsit_node,mark_node,ins_node: begin prev_p:=p; p:=link(prev_p);
 18878      end;
 18879    glue_node,kern_node,penalty_node: begin q:=p; p:=link(q); link(q):=null;
 18880      link(prev_p):=p; flush_node_list(q);
 18881      end;
 18882    othercases confusion("pruning")
 18883  @:this can't happen pruning}{\quad pruning@>
 18884    endcases;
 18885  prune_page_top:=link(temp_head);
 18886  end;
 18887  
 18888  @ @<Insert glue for |split_top_skip|...@>=
 18889  begin q:=new_skip_param(split_top_skip_code); link(prev_p):=q; link(q):=p;
 18890    {now |temp_ptr=glue_ptr(q)|}
 18891  if width(temp_ptr)>height(p) then width(temp_ptr):=width(temp_ptr)-height(p)
 18892  else width(temp_ptr):=0;
 18893  p:=null;
 18894  end
 18895  
 18896  @ The next subroutine finds the best place to break a given vertical list
 18897  so as to obtain a box of height~|h|, with maximum depth~|d|.
 18898  A pointer to the beginning of the vertical list is given,
 18899  and a pointer to the optimum breakpoint is returned. The list is effectively
 18900  followed by a forced break, i.e., a penalty node with the |eject_penalty|;
 18901  if the best break occurs at this artificial node, the value |null| is returned.
 18902  
 18903  An array of six |scaled| distances is used to keep track of the height
 18904  from the beginning of the list to the current place, just as in |line_break|.
 18905  In fact, we use one of the same arrays, only changing its name to reflect
 18906  its new significance.
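
How the six distances feed the cost of a trial break can be sketched in Python. The function `trial_cost` and its argument layout are hypothetical; the constants mirror definitions made elsewhere in the program, and `approx_badness` is only a floating-point stand-in for \TeX's integer |badness| routine, so its values may differ slightly from the real ones.

```python
INF_BAD = 10000
AWFUL_BAD = 2**30 - 1
INF_PENALTY = 10000
EJECT_PENALTY = -10000
DEPLORABLE = 100000

def approx_badness(t, s):
    """Roughly 100*(t/s)^3, capped at INF_BAD (an approximation)."""
    if t == 0:
        return 0
    if s <= 0:
        return INF_BAD
    return min(INF_BAD, round(100 * (t / s) ** 3))

def trial_cost(cur_height, h, stretch, shrink, pi, badness=approx_badness):
    """stretch = [finite, fil, fill, filll]; returns None if no break is legal."""
    if pi >= INF_PENALTY:
        return None
    if cur_height < h:
        # any infinite stretchability makes the box fit perfectly
        b = 0 if any(stretch[1:]) else badness(h - cur_height, stretch[0])
    elif cur_height - h > shrink:
        b = AWFUL_BAD  # too full even after shrinking
    else:
        b = badness(cur_height - h, shrink)
    if b < AWFUL_BAD:
        if pi <= EJECT_PENALTY:
            b = pi
        elif b < INF_BAD:
            b = b + pi
        else:
            b = DEPLORABLE
    return b
```

For instance, a box that is 10 units short with 10 units of finite stretch and no penalty costs 100 in this sketch, while the same box with any `fil` stretch costs only the penalty.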
 18907  
 18908  @d active_height==active_width {new name for the six distance variables}
 18909  @d cur_height==active_height[1] {the natural height}
 18910  @d set_height_zero(#)==active_height[#]:=0 {initialize the height to zero}
 18911  @#
 18912  @d update_heights=90 {go here to record glue in the |active_height| table}
 18913  
 18914  @p function vert_break(@!p:pointer; @!h,@!d:scaled):pointer;
 18915    {finds optimum page break}
 18916  label done,not_found,update_heights;
 18917  var prev_p:pointer; {if |p| is a glue node, |type(prev_p)| determines
 18918    whether |p| is a legal breakpoint}
 18919  @!q,@!r:pointer; {glue specifications}
 18920  @!pi:integer; {penalty value}
 18921  @!b:integer; {badness at a trial breakpoint}
 18922  @!least_cost:integer; {the smallest badness plus penalties found so far}
 18923  @!best_place:pointer; {the most recent break that leads to |least_cost|}
 18924  @!prev_dp:scaled; {depth of previous box in the list}
 18925  @!t:small_number; {|type| of the node following a kern}
 18926  begin prev_p:=p; {an initial glue node is not a legal breakpoint}
 18927  least_cost:=awful_bad; do_all_six(set_height_zero); prev_dp:=0;
 18928  loop@+  begin @<If node |p| is a legal breakpoint, check if this break is
 18929      the best known, and |goto done| if |p| is null or
 18930      if the page-so-far is already too full to accept more stuff@>;
 18931    prev_p:=p; p:=link(prev_p);
 18932    end;
 18933  done: vert_break:=best_place;
 18934  end;
 18935  
 18936  @ A global variable |best_height_plus_depth| will be set to the natural size
 18937  of the box that corresponds to the optimum breakpoint found by |vert_break|.
 18938  (This value is used by the insertion-splitting algorithm of the page builder.)
 18939  
 18940  @<Glob...@>=
 18941  @!best_height_plus_depth:scaled; {height of the best box, without stretching or
 18942    shrinking}
 18943  
 18944  @ A subtle point to be noted here is that the maximum depth~|d| might be
 18945  negative, so |cur_height| and |prev_dp| might need to be corrected even
 18946  after a glue or kern node.
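
The depth correction is small enough to state as a Python function; the name `clamp_depth` is hypothetical. It moves any depth in excess of |d| into the height, which is why a negative |d| forces a correction even when |prev_dp| is zero.

```python
def clamp_depth(cur_height, prev_dp, d):
    """Limit the pending depth to d, adding the excess to the height."""
    if prev_dp > d:
        cur_height = cur_height + prev_dp - d
        prev_dp = d
    return cur_height, prev_dp
```

With `d = -2`, a glue node that leaves `prev_dp = 0` still raises the height by 2, since 0 exceeds the permitted depth.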
 18947  
 18948  @<If node |p| is a legal breakpoint, check...@>=
 18949  if p=null then pi:=eject_penalty
 18950  else  @<Use node |p| to update the current height and depth measurements;
 18951      if this node is not a legal breakpoint, |goto not_found|
 18952      or |update_heights|,
 18953      otherwise set |pi| to the associated penalty at the break@>;
 18954  @<Check if node |p| is a new champion breakpoint; then \(go)|goto done|
 18955    if |p| is a forced break or if the page-so-far is already too full@>;
 18956  if (type(p)<glue_node)or(type(p)>kern_node) then goto not_found;
 18957  update_heights: @<Update the current height and depth measurements with
 18958    respect to a glue or kern node~|p|@>;
 18959  not_found: if prev_dp>d then
 18960      begin cur_height:=cur_height+prev_dp-d;
 18961      prev_dp:=d;
 18962      end;
 18963  
 18964  @ @<Use node |p| to update the current height and depth measurements...@>=
 18965  case type(p) of
 18966  hlist_node,vlist_node,rule_node: begin@t@>@;@/
 18967    cur_height:=cur_height+prev_dp+height(p); prev_dp:=depth(p);
 18968    goto not_found;
 18969    end;
 18970  whatsit_node:@<Process whatsit |p| in |vert_break| loop, |goto not_found|@>;
 18971  glue_node: if precedes_break(prev_p) then pi:=0
 18972    else goto update_heights;
 18973  kern_node: begin if link(p)=null then t:=penalty_node
 18974    else t:=type(link(p));
 18975    if t=glue_node then pi:=0@+else goto update_heights;
 18976    end;
 18977  penalty_node: pi:=penalty(p);
 18978  mark_node,ins_node: goto not_found;
 18979  othercases confusion("vertbreak")
 18980  @:this can't happen vertbreak}{\quad vertbreak@>
 18981  endcases
 18982  
 18983  @ @d deplorable==100000 {more than |inf_bad|, but less than |awful_bad|}
 18984  
 18985  @<Check if node |p| is a new champion breakpoint; then \(go)...@>=
 18986  if pi<inf_penalty then
 18987    begin @<Compute the badness, |b|, using |awful_bad|
 18988      if the box is too full@>;
 18989    if b<awful_bad then
 18990      if pi<=eject_penalty then b:=pi
 18991      else if b<inf_bad then b:=b+pi
 18992        else b:=deplorable;
 18993    if b<=least_cost then
 18994      begin best_place:=p; least_cost:=b;
 18995      best_height_plus_depth:=cur_height+prev_dp;
 18996      end;
 18997    if (b=awful_bad)or(pi<=eject_penalty) then goto done;
 18998    end
 18999  
 19000  @ @<Compute the badness, |b|, using |awful_bad| if the box is too full@>=
 19001  if cur_height<h then
 19002    if (active_height[3]<>0) or (active_height[4]<>0) or
 19003      (active_height[5]<>0) then b:=0
 19004    else b:=badness(h-cur_height,active_height[2])
 19005  else if cur_height-h>active_height[6] then b:=awful_bad
 19006  else b:=badness(cur_height-h,active_height[6])
 19007  
 19008  @ Vertical lists that are subject to the |vert_break| procedure should not
 19009  contain infinite shrinkability, since that would permit any amount of
 19010  information to ``fit'' on one page.
 19011  
 19012  @<Update the current height and depth measurements with...@>=
 19013  if type(p)=kern_node then q:=p
 19014  else  begin q:=glue_ptr(p);
 19015    active_height[2+stretch_order(q)]:=@|
 19016      active_height[2+stretch_order(q)]+stretch(q);@/
 19017    active_height[6]:=active_height[6]+shrink(q);
 19018    if (shrink_order(q)<>normal)and(shrink(q)<>0) then
 19019      begin@t@>@;@/
 19020      print_err("Infinite glue shrinkage found in box being split");@/
 19021  @.Infinite glue shrinkage...@>
 19022      help4("The box you are \vsplitting contains some infinitely")@/
 19023        ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")@/
 19024        ("Such glue doesn't belong there; but you can safely proceed,")@/
 19025        ("since the offensive shrinkability has been made finite.");
 19026      error; r:=new_spec(q); shrink_order(r):=normal; delete_glue_ref(q);
 19027      glue_ptr(p):=r; q:=r;
 19028      end;
 19029    end;
 19030  cur_height:=cur_height+prev_dp+width(q); prev_dp:=0
 19031  
 19032  @ Now we are ready to consider |vsplit| itself. Most of
 19033  its work is accomplished by the two subroutines that we have just considered.
 19034  
 19035  Given the number of a vlist box |n|, and given a desired page height |h|,
 19036  the |vsplit| function finds the best initial segment of the vlist and
 19037  returns a box for a page of height~|h|. The remainder of the vlist, if
 19038  any, replaces the original box, after removing glue and penalties and
 19039  adjusting for |split_top_skip|. Mark nodes in the split-off box are used to
 19040  set the values of |split_first_mark| and |split_bot_mark|; we use the
 19041  fact that |split_first_mark=null| if and only if |split_bot_mark=null|.
 19042  
 19043  The original box becomes ``void'' if and only if it has been entirely
 19044  extracted.  The extracted box is ``void'' if and only if the original
 19045  box was void (or if it was, erroneously, an hlist box).
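
The mark bookkeeping described above amounts to remembering the first and the last mark in the split-off part. A minimal Python sketch, with a made-up node representation and the hypothetical name `split_marks`, ignoring the reference counting that the real code must do:

```python
def split_marks(nodes):
    """Return (first mark, bottom mark) of a list, or (None, None) if none."""
    marks = [n["text"] for n in nodes if n["type"] == "mark"]
    if not marks:
        return None, None
    return marks[0], marks[-1]
```

Both results are `None` together or non-`None` together, which is the invariant relied on when the old marks are released.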
 19046  
 19047  @p function vsplit(@!n:eight_bits; @!h:scaled):pointer;
 19048    {extracts a page of height |h| from box |n|}
 19049  label exit,done;
 19050  var v:pointer; {the box to be split}
 19051  p:pointer; {runs through the vlist}
 19052  q:pointer; {points to where the break occurs}
 19053  begin v:=box(n);
 19054  if split_first_mark<>null then
 19055    begin delete_token_ref(split_first_mark); split_first_mark:=null;
 19056    delete_token_ref(split_bot_mark); split_bot_mark:=null;
 19057    end;
 19058  @<Dispense with trivial cases of void or bad boxes@>;
 19059  q:=vert_break(list_ptr(v),h,split_max_depth);
 19060  @<Look at all the marks in nodes before the break, and set the final
 19061    link to |null| at the break@>;
 19062  q:=prune_page_top(q); p:=list_ptr(v); free_node(v,box_node_size);
 19063  if q=null then box(n):=null {the |eq_level| of the box stays the same}
 19064  else box(n):=vpack(q,natural);
 19065  vsplit:=vpackage(p,h,exactly,split_max_depth);
 19066  exit: end;
 19067  
 19068  @ @<Dispense with trivial cases of void or bad boxes@>=
 19069  if v=null then
 19070    begin vsplit:=null; return;
 19071    end;
 19072  if type(v)<>vlist_node then
 19073    begin print_err(""); print_esc("vsplit"); print(" needs a ");
 19074    print_esc("vbox");
 19075  @:vsplit_}{\.{\\vsplit needs a \\vbox}@>
 19076    help2("The box you are trying to split is an \hbox.")@/
 19077    ("I can't split such a box, so I'll leave it alone.");
 19078    error; vsplit:=null; return;
 19079    end
 19080  
 19081  @ It's possible that the box begins with a penalty node that is the
 19082  ``best'' break, so we must be careful to handle this special case correctly.
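
As a rough illustration of the scan described here, the following Python sketch walks a singly linked list up to the break node, remembers the first and last mark it sees, and cuts the list just before the break. The |Node| class and the function name |cut_before| are hypothetical stand-ins, and token reference counting is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                      # e.g. "box", "mark", "penalty"
    mark: Optional[str] = None     # mark text, meaningful only for kind == "mark"
    link: Optional["Node"] = None  # next node in the vlist


def cut_before(head, q):
    """Detach the nodes that precede q; return (new_head, first_mark, bot_mark)."""
    first_mark = bot_mark = None
    if head is q:
        # Special case: the break is the very first node, so nothing is split off.
        return None, first_mark, bot_mark
    p = head
    while True:
        if p.kind == "mark":
            if first_mark is None:
                first_mark = p.mark
            bot_mark = p.mark      # the last mark seen so far
        if p.link is q:
            p.link = None          # terminate the split-off list at the break
            break
        p = p.link
    return head, first_mark, bot_mark
```

When the break node is the first node of the list, the split-off part is empty and both marks stay unset, which is the special case the paragraph above warns about.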
 19083  
 19084  @<Look at all the marks...@>=
 19085  p:=list_ptr(v);
 19086  if p=q then list_ptr(v):=null
 19087  else loop@+begin if type(p)=mark_node then
 19088      if split_first_mark=null then
 19089        begin split_first_mark:=mark_ptr(p);
 19090        split_bot_mark:=split_first_mark;
 19091        token_ref_count(split_first_mark):=@|
 19092          token_ref_count(split_first_mark)+2;
 19093        end
 19094      else  begin delete_token_ref(split_bot_mark);
 19095        split_bot_mark:=mark_ptr(p);
 19096        add_token_ref(split_bot_mark);
 19097        end;
 19098    if link(p)=q then
 19099      begin link(p):=null; goto done;
 19100      end;
 19101    p:=link(p);
 19102    end;
 19103  done:
 19104  
 19105  @* \[45] The page builder.
 19106  When \TeX\ appends new material to its main vlist in vertical mode, it uses
 19107  a method something like |vsplit| to decide where a page ends, except that
 19108  the calculations are done ``on line'' as new items come in.
 19109  The main complication in this process is that insertions must be put
 19110  into their boxes and removed from the vlist, in a more-or-less optimum manner.
 19111  
 19112  We shall use the term ``current page'' for that part of the main vlist that
 19113  is being considered as a candidate for being broken off and sent to the
 19114  user's output routine. The current page starts at |link(page_head)|, and
 19115  it ends at |page_tail|.  We have |page_head=page_tail| if this list is empty.
 19116  @^current page@>
 19117  
 19118  Utter chaos would reign if the user kept changing page specifications
 19119  while a page is being constructed, so the page builder keeps the pertinent
 19120  specifications frozen as soon as the page receives its first box or
 19121  insertion.  The global variable |page_contents| is |empty| when the
 19122  current page contains only mark nodes and content-less whatsit nodes; it
 19123  is |inserts_only| if the page contains only insertion nodes in addition to
 19124  marks and whatsits.  Glue nodes, kern nodes, and penalty nodes are
 19125  discarded until a box or rule node appears, at which time |page_contents|
 19126  changes to |box_there|.  As soon as |page_contents| becomes non-|empty|,
 19127  the current |vsize| and |max_depth| are squirreled away into |page_goal|
 19128  and |page_max_depth|; the latter values will be used until the page has
 19129  been forwarded to the user's output routine. The \.{\\topskip} adjustment
 19130  is made when |page_contents| changes to |box_there|.
 19131  
 19132  Although |page_goal| starts out equal to |vsize|, it is decreased by the
 19133  scaled natural height-plus-depth of the insertions considered so far, and by
 19134  the \.{\\skip} corrections for those insertions. Therefore it represents
 19135  the size into which the non-inserted material should fit, assuming that
 19136  all insertions in the current page have been made.
 19137  
 19138  The global variables |best_page_break| and |least_page_cost| correspond
 19139  respectively to the local variables |best_place| and |least_cost| in the
 19140  |vert_break| routine that we have already studied; i.e., they record the
 19141  location and value of the best place currently known for breaking the
 19142  current page. The value of |page_goal| at the time of the best break is
 19143  stored in |best_size|.
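
For orientation, the cost that is compared against |least_page_cost| can be sketched in Python as follows. The function name |page_cost| is hypothetical; the logic mirrors the ``new champion breakpoint'' code later in this part, and the constant values are assumed to match the definitions of |awful_bad|, |inf_bad|, |eject_penalty|, and |deplorable| made elsewhere in the program.

```python
AWFUL_BAD = 0o7777777777   # assumed value of |awful_bad|
INF_BAD = 10000            # assumed value of |inf_bad|
EJECT_PENALTY = -10000     # assumed value of |eject_penalty|
DEPLORABLE = 100000        # assumed value of |deplorable|


def page_cost(b, pi, insert_penalties):
    """Cost of breaking here, given badness b and penalty pi."""
    if b < AWFUL_BAD:
        if pi <= EJECT_PENALTY:
            c = pi                          # a forced break costs just its penalty
        elif b < INF_BAD:
            c = b + pi + insert_penalties   # the ordinary case
        else:
            c = DEPLORABLE                  # an underfull page, but still usable
    else:
        c = b                               # the page is too full
    if insert_penalties >= 10000:
        c = AWFUL_BAD                       # too many insertions were held over
    return c
```

A break becomes the new best candidate whenever its cost does not exceed the least cost seen so far.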
 19144  
 19145  @d inserts_only=1
 19146    {|page_contents| when an insert node has been contributed, but no boxes}
 19147  @d box_there=2 {|page_contents| when a box or rule has been contributed}
 19148  
 19149  @<Glob...@>=
 19150  @!page_tail:pointer; {the final node on the current page}
 19151  @!page_contents:empty..box_there; {what is on the current page so far?}
 19152  @!page_max_depth:scaled; {maximum box depth on page being built}
 19153  @!best_page_break:pointer; {break here to get the best page known so far}
 19154  @!least_page_cost:integer; {the score for this currently best page}
 19155  @!best_size:scaled; {its |page_goal|}
 19156  
 19157  @ The page builder has another data structure to keep track of insertions.
 19158  This is a list of four-word nodes, starting and ending at |page_ins_head|.
 19159  That is, the first element of the list is node |r@t$_1$@>=link(page_ins_head)|;
 19160  node $r_j$ is followed by |r@t$_{j+1}$@>=link(r@t$_j$@>)|; and if there are
 19161  |n| items we have |r@t$_{n+1}$@>=page_ins_head|. The |subtype| field of
 19162  each node in this list refers to an insertion number; for example, `\.{\\insert
 19163  250}' would correspond to a node whose |subtype| is |qi(250)|
 19164  (the same as the |subtype| field of the relevant |ins_node|). These |subtype|
 19165  fields are in increasing order, and |subtype(page_ins_head)=
 19166  qi(255)|, so |page_ins_head| serves as a convenient sentinel
 19167  at the end of the list. A record is present for each insertion number that
 19168  appears in the current page.
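
The sorted circular list with its sentinel can be modeled by a short Python sketch. The class |InsRecord| and the helpers |find_or_create| and |numbers| are hypothetical names; only the insertion number and the link are modeled, and the number is assumed to be less than 255, as it is for any legal insertion.

```python
class InsRecord:
    def __init__(self, number):
        self.number = number   # plays the role of the |subtype| field
        self.link = self       # a lone record points to itself


def make_list():
    """Create the sentinel, which plays the role of |page_ins_head|."""
    return InsRecord(255)


def find_or_create(head, n):
    """Return the record for insertion class n, creating it in sorted position."""
    assert 0 <= n < 255
    r = head
    while n >= r.link.number:  # the sentinel's 255 always stops the search
        r = r.link
    if r.number != n:
        q = InsRecord(n)
        q.link = r.link
        r.link = q
        r = q
    return r


def numbers(head):
    """List the insertion numbers in list order, excluding the sentinel."""
    out = []
    r = head.link
    while r is not head:
        out.append(r.number)
        r = r.link
    return out
```

Because the sentinel carries the largest possible number, the search loop needs no separate end-of-list test.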
 19169  
 19170  The |type| field in these nodes distinguishes two possibilities that
 19171  might occur as we look ahead before deciding on the optimum page break.
 19172  If |type(r)=inserting|, then |height(r)| contains the total of the
 19173  height-plus-depth dimensions of the box and all its inserts seen so far.
 19174  If |type(r)=split_up|, then no more insertions will be made into this box,
 19175  because at least one previous insertion was too big to fit on the current
 19176  page; |broken_ptr(r)| points to the node where that insertion will be
 19177  split, if \TeX\ decides to split it; |broken_ins(r)| points to the
 19178  insertion node that was tentatively split; and |height(r)| also includes the
 19179  natural height plus depth of the part that would be split off.
 19180  
 19181  In both cases, |last_ins_ptr(r)| points to the last |ins_node|
 19182  encountered for box |qo(subtype(r))| that would be at least partially
 19183  inserted on the next page; and |best_ins_ptr(r)| points to the last
 19184  such |ins_node| that should actually be inserted, to get the page with
 19185  minimum badness among all page breaks considered so far. We have
 19186  |best_ins_ptr(r)=null| if and only if no insertion for this box should
 19187  be made to produce this optimum page.
 19188  
 19189  The data structure definitions here use the fact that the |@!height| field
 19190  appears in the fourth word of a box node.
 19191  @^data structure assumptions@>
 19192  
 19193  @d page_ins_node_size=4 {number of words for a page insertion node}
 19194  @d inserting=0 {an insertion class that has not yet overflowed}
 19195  @d split_up=1 {an overflowed insertion class}
 19196  @d broken_ptr(#)==link(#+1)
 19197    {an insertion for this class will break here if anywhere}
 19198  @d broken_ins(#)==info(#+1) {this insertion might break at |broken_ptr|}
 19199  @d last_ins_ptr(#)==link(#+2) {the most recent insertion for this |subtype|}
 19200  @d best_ins_ptr(#)==info(#+2) {the optimum most recent insertion}
 19201  
 19202  @<Initialize the special list heads...@>=
 19203  subtype(page_ins_head):=qi(255);
 19204  type(page_ins_head):=split_up; link(page_ins_head):=page_ins_head;
 19205  
 19206  @ An array |page_so_far| records the heights and depths of everything
 19207  on the current page. This array contains six |scaled| numbers, like the
 19208  similar arrays already considered in |line_break| and |vert_break|; and it
 19209  also contains |page_goal| and |page_depth|, since these values are
 19210  all accessible to the user via |set_page_dimen| commands. The
 19211  value of |page_so_far[1]| is also called |page_total|.  The stretch
 19212  and shrink components of the \.{\\skip} corrections for each insertion are
 19213  included in |page_so_far|, but the natural space components of these
 19214  corrections are not, since they have been subtracted from |page_goal|.
 19215  
 19216  The variable |page_depth| records the depth of the current page; it has been
 19217  adjusted so that it is at most |page_max_depth|. The variable
 19218  |last_glue| points to the glue specification of the most recent node
 19219  contributed from the contribution list, if this was a glue node; otherwise
 19220  |last_glue=max_halfword|. (If the contribution list is nonempty,
 19221  however, the value of |last_glue| is not necessarily accurate.)
 19222  The variables |last_penalty| and |last_kern| are similar.  And
 19223  finally, |insert_penalties| holds the sum of the penalties associated with
 19224  all split and floating insertions.
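
As a simplified model of how these totals evolve, the Python sketch below keeps the eight entries of |page_so_far| in a list and updates them for boxes and for glue. The class |PageTotals| and its method names are hypothetical; dimensions are plain integers standing for |scaled| values, and error handling for infinite shrinkage is omitted.

```python
class PageTotals:
    def __init__(self, goal, max_depth):
        # Index 0 is the goal, 1 the total height, 2..5 the stretch by order,
        # 6 the shrink, and 7 the depth, following the layout described above.
        self.so_far = [goal, 0, 0, 0, 0, 0, 0, 0]
        self.max_depth = max_depth

    def add_box(self, height, depth):
        # The previous depth becomes part of the height once something follows it.
        self.so_far[1] += self.so_far[7] + height
        self.so_far[7] = depth
        self._clamp_depth()

    def add_glue(self, width, stretch=0, stretch_order=0, shrink=0):
        self.so_far[2 + stretch_order] += stretch
        self.so_far[6] += shrink
        self.so_far[1] += self.so_far[7] + width
        self.so_far[7] = 0

    def _clamp_depth(self):
        # Any depth beyond max_depth is moved into the total height.
        if self.so_far[7] > self.max_depth:
            self.so_far[1] += self.so_far[7] - self.max_depth
            self.so_far[7] = self.max_depth
```

For instance, a box of depth 3 on a page whose maximum depth is 2 contributes one extra unit to the total height.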
 19225  
 19226  @d page_goal==page_so_far[0] {desired height of information on page being built}
 19227  @d page_total==page_so_far[1] {height of the current page}
 19228  @d page_shrink==page_so_far[6] {shrinkability of the current page}
 19229  @d page_depth==page_so_far[7] {depth of the current page}
 19230  
 19231  @<Glob...@>=
 19232  @!page_so_far:array [0..7] of scaled; {height and glue of the current page}
 19233  @!last_glue:pointer; {used to implement \.{\\lastskip}}
 19234  @!last_penalty:integer; {used to implement \.{\\lastpenalty}}
 19235  @!last_kern:scaled; {used to implement \.{\\lastkern}}
 19236  @!insert_penalties:integer; {sum of the penalties for insertions
 19237    that were held over}
 19238  
 19239  @ @<Put each...@>=
 19240  primitive("pagegoal",set_page_dimen,0);
 19241  @!@:page_goal_}{\.{\\pagegoal} primitive@>
 19242  primitive("pagetotal",set_page_dimen,1);
 19243  @!@:page_total_}{\.{\\pagetotal} primitive@>
 19244  primitive("pagestretch",set_page_dimen,2);
 19245  @!@:page_stretch_}{\.{\\pagestretch} primitive@>
 19246  primitive("pagefilstretch",set_page_dimen,3);
 19247  @!@:page_fil_stretch_}{\.{\\pagefilstretch} primitive@>
 19248  primitive("pagefillstretch",set_page_dimen,4);
 19249  @!@:page_fill_stretch_}{\.{\\pagefillstretch} primitive@>
 19250  primitive("pagefilllstretch",set_page_dimen,5);
 19251  @!@:page_filll_stretch_}{\.{\\pagefilllstretch} primitive@>
 19252  primitive("pageshrink",set_page_dimen,6);
 19253  @!@:page_shrink_}{\.{\\pageshrink} primitive@>
 19254  primitive("pagedepth",set_page_dimen,7);
 19255  @!@:page_depth_}{\.{\\pagedepth} primitive@>
 19256  
 19257  @ @<Cases of |print_cmd_chr|...@>=
 19258  set_page_dimen: case chr_code of
 19259  0: print_esc("pagegoal");
 19260  1: print_esc("pagetotal");
 19261  2: print_esc("pagestretch");
 19262  3: print_esc("pagefilstretch");
 19263  4: print_esc("pagefillstretch");
 19264  5: print_esc("pagefilllstretch");
 19265  6: print_esc("pageshrink");
 19266  othercases print_esc("pagedepth")
 19267  endcases;
 19268  
 19269  @ @d print_plus_end(#)==print(#);@+end
 19270  @d print_plus(#)==if page_so_far[#]<>0 then
 19271    begin print(" plus "); print_scaled(page_so_far[#]); print_plus_end
 19272  
 19273  @p procedure print_totals;
 19274  begin print_scaled(page_total);
 19275  print_plus(2)("");
 19276  print_plus(3)("fil");
 19277  print_plus(4)("fill");
 19278  print_plus(5)("filll");
 19279  if page_shrink<>0 then
 19280    begin print(" minus "); print_scaled(page_shrink);
 19281    end;
 19282  end;
 19283  
 19284  @ @<Show the status of the current page@>=
 19285  if page_head<>page_tail then
 19286    begin print_nl("### current page:");
 19287    if output_active then print(" (held over for next output)");
 19288  @.held over for next output@>
 19289    show_box(link(page_head));
 19290    if page_contents>empty then
 19291      begin print_nl("total height "); print_totals;
 19292  @:total_height}{\.{total height}@>
 19293      print_nl(" goal height "); print_scaled(page_goal);
 19294  @.goal height@>
 19295      r:=link(page_ins_head);
 19296      while r<>page_ins_head do
 19297        begin print_ln; print_esc("insert"); t:=qo(subtype(r));
 19298        print_int(t); print(" adds ");
 19299        if count(t)=1000 then t:=height(r)
 19300        else t:=x_over_n(height(r),1000)*count(t);
 19301        print_scaled(t);
 19302        if type(r)=split_up then
 19303          begin q:=page_head; t:=0;
 19304          repeat q:=link(q);
 19305          if (type(q)=ins_node)and(subtype(q)=subtype(r)) then incr(t);
 19306          until q=broken_ins(r);
 19307          print(", #"); print_int(t); print(" might split");
 19308          end;
 19309        r:=link(r);
 19310        end;
 19311      end;
 19312    end
 19313  
 19314  @ Here is a procedure that is called when the |page_contents| is changing
 19315  from |empty| to |inserts_only| or |box_there|.
 19316  
 19317  @d set_page_so_far_zero(#)==page_so_far[#]:=0
 19318  
 19319  @p procedure freeze_page_specs(@!s:small_number);
 19320  begin page_contents:=s;
 19321  page_goal:=vsize; page_max_depth:=max_depth;
 19322  page_depth:=0; do_all_six(set_page_so_far_zero);
 19323  least_page_cost:=awful_bad;
 19324  @!stat if tracing_pages>0 then
 19325    begin begin_diagnostic;
 19326    print_nl("%% goal height="); print_scaled(page_goal);
 19327  @.goal height@>
 19328    print(", max depth="); print_scaled(page_max_depth);
 19329    end_diagnostic(false);
 19330    end;@;@+tats@;@/
 19331  end;
 19332  
 19333  @ Pages are built by appending nodes to the current list in \TeX's
 19334  vertical mode, which is at the outermost level of the semantic nest. This
 19335  vlist is split into two parts: the ``current page'' that we have been
 19336  talking so much about already, and the ``contribution list'' that receives
 19337  new nodes as they are created.  The current page contains everything that
 19338  the page builder has accounted for in its data structures, as described
 19339  above, while the contribution list contains other things that have been
 19340  generated by other parts of \TeX\ but have not yet been
 19341  seen by the page builder.
 19342  The contribution list starts at |link(contrib_head)|, and it ends at the
 19343  current node in \TeX's vertical mode.
 19344  
 19345  When \TeX\ has appended new material in vertical mode, it calls the procedure
 19346  |build_page|, which tries to catch up by moving nodes from the contribution
 19347  list to the current page. This procedure will succeed in its goal of
 19348  emptying the contribution list, unless a page break is discovered, i.e.,
 19349  unless the current page has grown to the point where the optimum next
 19350  page break has been determined. In the latter case, the nodes after the
 19351  optimum break will go back onto the contribution list, and control will
 19352  effectively pass to the user's output routine.
 19353  
 19354  We make |type(page_head)=glue_node|, so that an initial glue node on
 19355  the current page will not be considered a valid breakpoint.
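
The effect of this convention can be illustrated with a small Python sketch. It assumes the usual numbering of node types (with |math_node=9| and |glue_node=10|) and that |precedes_break| tests |type<math_node|; both are defined elsewhere in the program, so the table below should be read as an assumption rather than a quotation. The function name |glue_is_breakpoint| is hypothetical.

```python
# Assumed node type codes, following the definitions made earlier in the program.
NODE_TYPE = {
    "hlist": 0, "vlist": 1, "rule": 2, "ins": 3, "mark": 4, "adjust": 5,
    "ligature": 6, "disc": 7, "whatsit": 8, "math": 9, "glue": 10,
    "kern": 11, "penalty": 12,
}
MATH_NODE = NODE_TYPE["math"]


def precedes_break(kind):
    """True if a node of this kind is non-discardable."""
    return NODE_TYPE[kind] < MATH_NODE


def glue_is_breakpoint(page_tail_kind):
    """A glue node is a legal break only when the page tail is non-discardable."""
    return precedes_break(page_tail_kind)
```

Since |page_head| is typed as a glue node, glue arriving while |page_tail=page_head| fails this test and is never treated as a breakpoint.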
 19356  
 19357  @<Initialize the special list...@>=
 19358  type(page_head):=glue_node; subtype(page_head):=normal;
 19359  
 19360  @ The global variable |output_active| is true during the time the
 19361  user's output routine is driving \TeX.
 19362  
 19363  @<Glob...@>=
 19364  @!output_active:boolean; {are we in the midst of an output routine?}
 19365  
 19366  @ @<Set init...@>=
 19367  output_active:=false; insert_penalties:=0;
 19368  
 19369  @ The page builder is ready to start a fresh page if we initialize
 19370  the following state variables. (However, the page insertion list is initialized
 19371  elsewhere.)
 19372  
 19373  @<Start a new current page@>=
 19374  page_contents:=empty; page_tail:=page_head; link(page_head):=null;@/
 19375  last_glue:=max_halfword; last_penalty:=0; last_kern:=0;
 19376  page_depth:=0; page_max_depth:=0
 19377  
 19378  @ At certain times box 255 is supposed to be void (i.e., |null|),
 19379  or an insertion box is supposed to be ready to accept a vertical list.
 19380  If not, an error message is printed, and the following subroutine
 19381  flushes the unwanted contents, reporting them to the user.
 19382  
 19383  @p procedure box_error(@!n:eight_bits);
 19384  begin error; begin_diagnostic;
 19385  print_nl("The following box has been deleted:");
 19386  @.The following...deleted@>
 19387  show_box(box(n)); end_diagnostic(true);
 19388  flush_node_list(box(n)); box(n):=null;
 19389  end;
 19390  
 19391  @ The following procedure guarantees that a given box register
 19392  does not contain an \.{\\hbox}.
 19393  
 19394  @p procedure ensure_vbox(@!n:eight_bits);
 19395  var p:pointer; {the box register contents}
 19396  begin p:=box(n);
 19397  if p<>null then if type(p)=hlist_node then
 19398    begin print_err("Insertions can only be added to a vbox");
 19399  @.Insertions can only...@>
 19400    help3("Tut tut: You're trying to \insert into a")@/
 19401      ("\box register that now contains an \hbox.")@/
 19402      ("Proceed, and I'll discard its present contents.");
 19403    box_error(n);
 19404    end;
 19405  end;
 19406  
 19407  @ \TeX\ is not always in vertical mode at the time |build_page|
 19408  is called; the current mode reflects what \TeX\ should return to, after
 19409  the contribution list has been emptied. A call on |build_page| should
 19410  be immediately followed by `|goto big_switch|', which is \TeX's central
 19411  control point.
 19412  
 19413  @d contribute=80 {go here to link a node into the current page}
 19414  
 19415  @p @t\4@>@<Declare the procedure called |fire_up|@>@;@/
 19416  procedure build_page; {append contributions to the current page}
 19417  label exit,done,done1,continue,contribute,update_heights;
 19418  var p:pointer; {the node being appended}
 19419  @!q,@!r:pointer; {nodes being examined}
 19420  @!b,@!c:integer; {badness and cost of current page}
 19421  @!pi:integer; {penalty to be added to the badness}
 19422  @!n:min_quarterword..255; {insertion box number}
 19423  @!delta,@!h,@!w:scaled; {sizes used for insertion calculations}
 19424  begin if (link(contrib_head)=null)or output_active then return;
 19425  repeat continue: p:=link(contrib_head);@/
 19426  @<Update the values of |last_glue|, |last_penalty|, and |last_kern|@>;
 19427  @<Move node |p| to the current page; if it is time for a page break,
 19428    put the nodes following the break back onto the contribution list,
 19429    and |return| to the user's output routine if there is one@>;
 19430  until link(contrib_head)=null;
 19431  @<Make the contribution list empty by setting its tail to |contrib_head|@>;
 19432  exit:end;
 19433  
 19434  @ @d contrib_tail==nest[0].tail_field {tail of the contribution list}
 19435  
 19436  @<Make the contribution list empty...@>=
 19437  if nest_ptr=0 then tail:=contrib_head {vertical mode}
 19438  else contrib_tail:=contrib_head {other modes}
 19439  
 19440  @ @<Update the values of |last_glue|...@>=
 19441  if last_glue<>max_halfword then delete_glue_ref(last_glue);
 19442  last_penalty:=0; last_kern:=0;
 19443  if type(p)=glue_node then
 19444    begin last_glue:=glue_ptr(p); add_glue_ref(last_glue);
 19445    end
 19446  else  begin last_glue:=max_halfword;
 19447    if type(p)=penalty_node then last_penalty:=penalty(p)
 19448    else if type(p)=kern_node then last_kern:=width(p);
 19449    end
 19450  
 19451  @ The code here is an example of a many-way switch into routines that
 19452  merge together in different places. Some people call this unstructured
 19453  programming, but the author doesn't see much wrong with it, as long as
 19454  @^Knuth, Donald Ervin@>
 19455  the various labels have a well-understood meaning.
 19456  
 19457  @<Move node |p| to the current page; ...@>=
 19458  @<If the current page is empty and node |p| is to be deleted, |goto done1|;
 19459    otherwise use node |p| to update the state of the current page;
 19460    if this node is an insertion, |goto contribute|; otherwise if this node
 19461    is not a legal breakpoint, |goto contribute| or |update_heights|;
 19462    otherwise set |pi| to the penalty associated with this breakpoint@>;
 19463  @<Check if node |p| is a new champion breakpoint; then \(if)if it is time for
 19464    a page break, prepare for output, and either fire up the user's
 19465    output routine and |return| or ship out the page and |goto done|@>;
 19466  if (type(p)<glue_node)or(type(p)>kern_node) then goto contribute;
 19467  update_heights:@<Update the current page measurements with respect to the
 19468    glue or kern specified by node~|p|@>;
 19469  contribute: @<Make sure that |page_max_depth| is not exceeded@>;
 19470  @<Link node |p| into the current page and |goto done|@>;
 19471  done1:@<Recycle node |p|@>;
 19472  done:
 19473  
 19474  @ @<Link node |p| into the current page and |goto done|@>=
 19475  link(page_tail):=p; page_tail:=p;
 19476  link(contrib_head):=link(p); link(p):=null; goto done
 19477  
 19478  @ @<Recycle node |p|@>=
 19479  link(contrib_head):=link(p); link(p):=null; flush_node_list(p)
 19480  
 19481  @ The title of this section is already so long, it seems best to avoid
 19482  making it more accurate but still longer, by mentioning the fact that a
 19483  kern node at the end of the contribution list will not be contributed until
 19484  we know its successor.
 19485  
 19486  @<If the current page is empty...@>=
 19487  case type(p) of
 19488  hlist_node,vlist_node,rule_node: if page_contents<box_there then
 19489      @<Initialize the current page, insert the \.{\\topskip} glue
 19490        ahead of |p|, and |goto continue|@>
 19491    else @<Prepare to move a box or rule node to the current page,
 19492      then |goto contribute|@>;
 19493  whatsit_node: @<Prepare to move whatsit |p| to the current page,
 19494    then |goto contribute|@>;
 19495  glue_node: if page_contents<box_there then goto done1
 19496    else if precedes_break(page_tail) then pi:=0
 19497    else goto update_heights;
 19498  kern_node: if page_contents<box_there then goto done1
 19499    else if link(p)=null then return
 19500    else if type(link(p))=glue_node then pi:=0
 19501    else goto update_heights;
 19502  penalty_node: if page_contents<box_there then goto done1@+else pi:=penalty(p);
 19503  mark_node: goto contribute;
 19504  ins_node: @<Append an insertion to the current page and |goto contribute|@>;
 19505  othercases confusion("page")
 19506  @:this can't happen page}{\quad page@>
 19507  endcases
 19508  
 19509  @ @<Initialize the current page, insert the \.{\\topskip} glue...@>=
 19510  begin if page_contents=empty then freeze_page_specs(box_there)
 19511  else page_contents:=box_there;
 19512  q:=new_skip_param(top_skip_code); {now |temp_ptr=glue_ptr(q)|}
 19513  if width(temp_ptr)>height(p) then width(temp_ptr):=width(temp_ptr)-height(p)
 19514  else width(temp_ptr):=0;
 19515  link(q):=p; link(contrib_head):=q; goto continue;
 19516  end
 19517  
 19518  @ @<Prepare to move a box or rule node to the current page...@>=
 19519  begin page_total:=page_total+page_depth+height(p);
 19520  page_depth:=depth(p);
 19521  goto contribute;
 19522  end
 19523  
 19524  @ @<Make sure that |page_max_depth| is not exceeded@>=
 19525  if page_depth>page_max_depth then
 19526    begin page_total:=@|
 19527      page_total+page_depth-page_max_depth;@/
 19528    page_depth:=page_max_depth;
 19529    end;
 19530  
 19531  @ @<Update the current page measurements with respect to the glue...@>=
 19532  if type(p)=kern_node then q:=p
 19533  else begin q:=glue_ptr(p);
 19534    page_so_far[2+stretch_order(q)]:=@|
 19535      page_so_far[2+stretch_order(q)]+stretch(q);@/
 19536    page_shrink:=page_shrink+shrink(q);
 19537    if (shrink_order(q)<>normal)and(shrink(q)<>0) then
 19538      begin@t@>@;@/
 19539      print_err("Infinite glue shrinkage found on current page");@/
 19540  @.Infinite glue shrinkage...@>
 19541      help4("The page about to be output contains some infinitely")@/
 19542        ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")@/
 19543        ("Such glue doesn't belong there; but you can safely proceed,")@/
 19544        ("since the offensive shrinkability has been made finite.");
 19545      error;
 19546      r:=new_spec(q); shrink_order(r):=normal; delete_glue_ref(q);
 19547      glue_ptr(p):=r; q:=r;
 19548      end;
 19549    end;
 19550  page_total:=page_total+page_depth+width(q); page_depth:=0
 19551  
 19552  @ @<Check if node |p| is a new champion breakpoint; then \(if)...@>=
 19553  if pi<inf_penalty then
 19554    begin @<Compute the badness, |b|, of the current page,
 19555      using |awful_bad| if the box is too full@>;
 19556    if b<awful_bad then
 19557      if pi<=eject_penalty then c:=pi
 19558      else  if b<inf_bad then c:=b+pi+insert_penalties
 19559        else c:=deplorable
 19560    else c:=b;
 19561    if insert_penalties>=10000 then c:=awful_bad;
 19562    @!stat if tracing_pages>0 then @<Display the page break cost@>;@+tats@;@/
 19563    if c<=least_page_cost then
 19564      begin best_page_break:=p; best_size:=page_goal;
 19565      least_page_cost:=c;
 19566      r:=link(page_ins_head);
 19567      while r<>page_ins_head do
 19568        begin best_ins_ptr(r):=last_ins_ptr(r);
 19569        r:=link(r);
 19570        end;
 19571      end;
 19572    if (c=awful_bad)or(pi<=eject_penalty) then
 19573      begin fire_up(p); {output the current page at the best place}
 19574      if output_active then return; {user's output routine will act}
 19575      goto done; {the page has been shipped out by default output routine}
 19576      end;
 19577    end
 19578  
 19579  @ @<Display the page break cost@>=
 19580  begin begin_diagnostic; print_nl("%");
 19581  print(" t="); print_totals;@/
 19582  print(" g="); print_scaled(page_goal);@/
 19583  print(" b=");
 19584  if b=awful_bad then print_char("*")@+else print_int(b);
 19585  @.*\relax@>
 19586  print(" p="); print_int(pi);
 19587  print(" c=");
 19588  if c=awful_bad then print_char("*")@+else print_int(c);
 19589  if c<=least_page_cost then print_char("#");
 19590  end_diagnostic(false);
 19591  end
 19592  
 19593  @ @<Compute the badness, |b|, of the current page...@>=
 19594  if page_total<page_goal then
 19595    if (page_so_far[3]<>0) or (page_so_far[4]<>0) or@|
 19596      (page_so_far[5]<>0) then b:=0
 19597    else b:=badness(page_goal-page_total,page_so_far[2])
 19598  else if page_total-page_goal>page_shrink then b:=awful_bad
 19599  else b:=badness(page_total-page_goal,page_shrink)
 19600  
 19601  @ @<Append an insertion to the current page and |goto contribute|@>=
 19602  begin if page_contents=empty then freeze_page_specs(inserts_only);
 19603  n:=subtype(p); r:=page_ins_head;
 19604  while n>=subtype(link(r)) do r:=link(r);
 19605  n:=qo(n);
 19606  if subtype(r)<>qi(n) then
 19607    @<Create a page insertion node with |subtype(r)=qi(n)|, and
 19608      include the glue correction for box |n| in the
 19609      current page state@>;
 19610  if type(r)=split_up then insert_penalties:=insert_penalties+float_cost(p)
 19611  else  begin last_ins_ptr(r):=p;
 19612    delta:=page_goal-page_total-page_depth+page_shrink;
 19613      {this much room is left if we shrink the maximum}
 19614    if count(n)=1000 then h:=height(p)
 19615    else h:=x_over_n(height(p),1000)*count(n); {this much room is needed}
 19616    if ((h<=0)or(h<=delta))and(height(p)+height(r)<=dimen(n)) then
 19617      begin page_goal:=page_goal-h; height(r):=height(r)+height(p);
 19618      end
 19619    else @<Find the best way to split the insertion, and change
 19620      |type(r)| to |split_up|@>;
 19621    end;
 19622  goto contribute;
 19623  end
 19624  
 19625  @ We take note of the value of \.{\\skip} |n| and the height plus depth
 19626  of \.{\\box}~|n| only when the first \.{\\insert}~|n| node is
 19627  encountered for a new page. A user who changes the contents of \.{\\box}~|n|
 19628  after that first \.{\\insert}~|n| had better be either extremely careful
 19629  or extremely lucky, or both.
 19630  
 19631  @<Create a page insertion node...@>=
 19632  begin q:=get_node(page_ins_node_size); link(q):=link(r); link(r):=q; r:=q;
 19633  subtype(r):=qi(n); type(r):=inserting; ensure_vbox(n);
 19634  if box(n)=null then height(r):=0
 19635  else height(r):=height(box(n))+depth(box(n));
 19636  best_ins_ptr(r):=null;@/
 19637  q:=skip(n);
 19638  if count(n)=1000 then h:=height(r)
 19639  else h:=x_over_n(height(r),1000)*count(n);
 19640  page_goal:=page_goal-h-width(q);@/
 19641  page_so_far[2+stretch_order(q)]:=@|page_so_far[2+stretch_order(q)]+stretch(q);@/
 19642  page_shrink:=page_shrink+shrink(q);
 19643  if (shrink_order(q)<>normal)and(shrink(q)<>0) then
 19644    begin print_err("Infinite glue shrinkage inserted from "); print_esc("skip");
 19645  @.Infinite glue shrinkage...@>
 19646    print_int(n);
 19647    help3("The correction glue for page breaking with insertions")@/
 19648      ("must have finite shrinkability. But you may proceed,")@/
 19649      ("since the offensive shrinkability has been made finite.");
 19650    error;
 19651    end;
 19652  end
 19653  
 19654  @ Here is the code that will split a long footnote between pages, in an
 19655  emergency. The current situation deserves to be recapitulated: Node |p|
 19656  is an insertion into box |n|; the insertion will not fit, in its entirety,
 19657  either because it would make the total contents of box |n| greater than
 19658  \.{\\dimen} |n|, or because it would make the incremental amount of growth
 19659  |h| greater than the available space |delta|, or both. (This amount |h| has
 19660  been weighted by the insertion scaling factor, i.e., by \.{\\count} |n|
 19661  over 1000.) Now we will choose the best way to break the vlist of the
 19662  insertion, using the same criteria as in the \.{\\vsplit} operation.
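
The two scalings involved can be sketched in Python as follows. The helper |x_over_n| here is a simplified stand-in for the program's routine of the same name (it truncates toward zero but does not record a remainder), |MAX_DIMEN| is assumed to equal |max_dimen|, and the function names |scaled_need| and |split_target| are hypothetical.

```python
MAX_DIMEN = 0o7777777777   # assumed value of |max_dimen|


def x_over_n(x, n):
    """Integer quotient of x by n, truncated toward zero."""
    if n == 0:
        raise ZeroDivisionError("x_over_n with n = 0")
    if n < 0:
        x, n = -x, -n
    return x // n if x >= 0 else -((-x) // n)


def scaled_need(height, count):
    """Room charged to the page for an insertion of the given height."""
    if count == 1000:
        return height
    return x_over_n(height, 1000) * count


def split_target(page_goal, page_total, page_depth, count, dimen_n, height_r):
    """Height w to which the insertion's vlist should be split."""
    if count <= 0:
        w = MAX_DIMEN
    else:
        w = page_goal - page_total - page_depth
        if count != 1000:
            w = x_over_n(w, count) * 1000   # undo the scaling by count/1000
    return min(w, dimen_n - height_r)
```

Note that the division is done before the multiplication in both directions, so a small amount of precision is lost whenever the count differs from 1000.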
 19663  
 19664  @<Find the best way to split the insertion...@>=
 19665  begin if count(n)<=0 then w:=max_dimen
 19666  else  begin w:=page_goal-page_total-page_depth;
 19667    if count(n)<>1000 then w:=x_over_n(w,count(n))*1000;
 19668    end;
 19669  if w>dimen(n)-height(r) then w:=dimen(n)-height(r);
 19670  q:=vert_break(ins_ptr(p),w,depth(p));
 19671  height(r):=height(r)+best_height_plus_depth;
 19672  @!stat if tracing_pages>0 then @<Display the insertion split cost@>;@+tats@;@/
 19673  if count(n)<>1000 then
 19674    best_height_plus_depth:=x_over_n(best_height_plus_depth,1000)*count(n);
 19675  page_goal:=page_goal-best_height_plus_depth;
 19676  type(r):=split_up; broken_ptr(r):=q; broken_ins(r):=p;
 19677  if q=null then insert_penalties:=insert_penalties+eject_penalty
 19678  else if type(q)=penalty_node then insert_penalties:=insert_penalties+penalty(q);
 19679  end
 19680  
 19681  @ @<Display the insertion split cost@>=
 19682  begin begin_diagnostic; print_nl("% split"); print_int(n);
 19683  @.split@>
 19684  print(" to "); print_scaled(w);
 19685  print_char(","); print_scaled(best_height_plus_depth);@/
 19686  print(" p=");
 19687  if q=null then print_int(eject_penalty)
 19688  else if type(q)=penalty_node then print_int(penalty(q))
 19689  else print_char("0");
 19690  end_diagnostic(false);
 19691  end
 19692  
 19693  @ When the page builder has looked at as much material as could appear before
 19694  the next page break, it makes its decision. The break that gave minimum
 19695  badness will be used to put a completed ``page'' into box 255, with insertions
 19696  appended to their other boxes.
 19697  
 19698  We also set the values of |top_mark|, |first_mark|, and |bot_mark|. The
 19699  program uses the fact that |bot_mark<>null| implies |first_mark<>null|;
 19700  it also knows that |bot_mark=null| implies |top_mark=first_mark=null|.
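
The way these three values evolve from one page to the next can be
sketched in Python as follows. This is a simplified model, not the code
used below: |None| stands for a null mark, reference counts are ignored,
and the function name and its list argument are assumptions of the sketch.

```python
def marks_after_page(top, first, bot, marks_on_page):
    """Return the new (top, first, bot) triple once a page has been chosen.

    marks_on_page lists the mark texts found on the page, in order.
    """
    if bot is not None:
        top = bot      # the previous page's last mark becomes the new top mark
        first = None
    for m in marks_on_page:
        if first is None:
            first = m
        bot = m
    if top is not None and first is None:
        first = top    # a page without marks inherits its first mark from the top
    return top, first, bot
```

For instance, a page that contains no marks at all, following a page whose
last mark was `b', ends with all three values equal to `b'; both
implications stated above hold after every call.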
 19701  
 19702  The |fire_up| subroutine prepares to output the current page at the best
 19703  place; then it fires up the user's output routine, if there is one,
 19704  or it simply ships out the page. There is one parameter, |c|, which represents
 19705  the node that was being contributed to the page when the decision to
 19706  force an output was made.
 19707  
 19708  @<Declare the procedure called |fire_up|@>=
 19709  procedure fire_up(@!c:pointer);
 19710  label exit;
 19711  var p,@!q,@!r,@!s:pointer; {nodes being examined and/or changed}
 19712  @!prev_p:pointer; {predecessor of |p|}
 19713  @!n:min_quarterword..255; {insertion box number}
 19714  @!wait:boolean; {should the present insertion be held over?}
 19715  @!save_vbadness:integer; {saved value of |vbadness|}
 19716  @!save_vfuzz: scaled; {saved value of |vfuzz|}
 19717  @!save_split_top_skip: pointer; {saved value of |split_top_skip|}
 19718  begin @<Set the value of |output_penalty|@>;
 19719  if bot_mark<>null then
 19720    begin if top_mark<>null then delete_token_ref(top_mark);
 19721    top_mark:=bot_mark; add_token_ref(top_mark);
 19722    delete_token_ref(first_mark); first_mark:=null;
 19723    end;
 19724  @<Put the \(o)optimal current page into box 255, update |first_mark| and
 19725    |bot_mark|, append insertions to their boxes, and put the
 19726    remaining nodes back on the contribution list@>;
 19727  if (top_mark<>null)and(first_mark=null) then
 19728    begin first_mark:=top_mark; add_token_ref(top_mark);
 19729    end;
 19730  if output_routine<>null then
 19731    if dead_cycles>=max_dead_cycles then
 19732      @<Explain that too many dead cycles have occurred in a row@>
 19733    else @<Fire up the user's output routine and |return|@>;
 19734  @<Perform the default output routine@>;
 19735  exit:end;
 19736  
 19737  @ @<Set the value of |output_penalty|@>=
 19738  if type(best_page_break)=penalty_node then
 19739    begin geq_word_define(int_base+output_penalty_code,penalty(best_page_break));
 19740    penalty(best_page_break):=inf_penalty;
 19741    end
 19742  else geq_word_define(int_base+output_penalty_code,inf_penalty)
 19743  
 19744  @ As the page is finally being prepared for output,
 19745  pointer |p| runs through the vlist, with |prev_p| trailing behind;
 19746  pointer |q| is the tail of a list of insertions that
 19747  are being held over for a subsequent page.
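
The pointer discipline just described can be sketched in Python. The
sketch is a simplified stand-in for the loop below: it runs to the end of
the list instead of stopping at |best_page_break|, it ignores
|holding_inserts| and marks, and the class |Node|, the function name, and
the predicate |must_wait| are hypothetical names introduced here.

```python
class Node:
    def __init__(self, kind, link=None):
        self.kind = kind
        self.link = link


def hold_over(page_head, hold_head, must_wait):
    """Remove every "ins" node from the page list; those for which
    must_wait(node) is true are appended to the hold list, in order.

    Both lists begin with a dummy header node. Returns the hold list's tail.
    """
    q = hold_head
    q.link = None
    prev_p = page_head
    p = prev_p.link
    while p is not None:
        if p.kind == "ins":
            prev_p.link = p.link   # unlink p from the page
            p.link = None
            if must_wait(p):
                q.link = p         # append p to the hold list
                q = p
            p = prev_p             # the step below then reaches p's successor
        prev_p = p
        p = prev_p.link
    return q
```

Resetting |p| to |prev_p| after a deletion is what lets a single
advancing step serve both the deleting and the non-deleting case.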
 19748  
 19749  @<Put the \(o)optimal current page into box 255...@>=
 19750  if c=best_page_break then best_page_break:=null; {|c| not yet linked in}
 19751  @<Ensure that box 255 is empty before output@>;
 19752  insert_penalties:=0; {this will count the number of insertions held over}
 19753  save_split_top_skip:=split_top_skip;
 19754  if holding_inserts<=0 then
 19755    @<Prepare all the boxes involved in insertions to act as queues@>;
 19756  q:=hold_head; link(q):=null; prev_p:=page_head; p:=link(prev_p);
 19757  while p<>best_page_break do
 19758    begin if type(p)=ins_node then
 19759      begin if holding_inserts<=0 then
 19760         @<Either insert the material specified by node |p| into the
 19761           appropriate box, or hold it for the next page;
 19762           also delete node |p| from the current page@>;
 19763      end
 19764    else if type(p)=mark_node then @<Update the values of
 19765      |first_mark| and |bot_mark|@>;
 19766    prev_p:=p; p:=link(prev_p);
 19767    end;
 19768  split_top_skip:=save_split_top_skip;
 19769  @<Break the current page at node |p|, put it in box~255,
 19770    and put the remaining nodes on the contribution list@>;
 19771  @<Delete \(t)the page-insertion nodes@>
 19772  
 19773  @ @<Ensure that box 255 is empty before output@>=
 19774  if box(255)<>null then
 19775    begin print_err(""); print_esc("box"); print("255 is not void");
 19776  @:box255}{\.{\\box255 is not void}@>
 19777    help2("You shouldn't use \box255 except in \output routines.")@/
 19778      ("Proceed, and I'll discard its present contents.");
 19779    box_error(255);
 19780    end
 19781  
 19782  @ @<Update the values of |first_mark| and |bot_mark|@>=
 19783  begin if first_mark=null then
 19784    begin first_mark:=mark_ptr(p);
 19785    add_token_ref(first_mark);
 19786    end;
 19787  if bot_mark<>null then delete_token_ref(bot_mark);
 19788  bot_mark:=mark_ptr(p); add_token_ref(bot_mark);
 19789  end
 19790  
 19791  @ When the following code is executed, the current page runs from node
 19792  |link(page_head)| to node |prev_p|, and the nodes from |p| to |page_tail|
 19793  are to be placed back at the front of the contribution list. Furthermore
 19794  the heldover insertions appear in a list from |link(hold_head)| to |q|; we
 19795  will put them into the current page list for safekeeping while the user's
 19796  output routine is active.  We might have |q=hold_head|; and |p=null| if
 19797  and only if |prev_p=page_tail|. Error messages are suppressed within
 19798  |vpackage|, since the box might appear to be overfull or underfull simply
 19799  because the stretch and shrink from the \.{\\skip} registers for inserts
 19800  are not actually present in the box.
 19801  
 19802  @<Break the current page at node |p|, put it...@>=
 19803  if p<>null then
 19804    begin if link(contrib_head)=null then
 19805      if nest_ptr=0 then tail:=page_tail
 19806      else contrib_tail:=page_tail;
 19807    link(page_tail):=link(contrib_head);
 19808    link(contrib_head):=p;
 19809    link(prev_p):=null;
 19810    end;
 19811  save_vbadness:=vbadness; vbadness:=inf_bad;
 19812  save_vfuzz:=vfuzz; vfuzz:=max_dimen; {inhibit error messages}
 19813  box(255):=vpackage(link(page_head),best_size,exactly,page_max_depth);
 19814  vbadness:=save_vbadness; vfuzz:=save_vfuzz;
 19815  if last_glue<>max_halfword then delete_glue_ref(last_glue);
 19816  @<Start a new current page@>; {this sets |last_glue:=max_halfword|}
 19817  if q<>hold_head then
 19818    begin link(page_head):=link(hold_head); page_tail:=q;
 19819    end
 19820  
 19821  @ If many insertions are supposed to go into the same box, we want to know
 19822  the position of the last node in that box, so that we don't need to waste time
 19823  when linking further information into it. The |last_ins_ptr| fields of the
 19824  page insertion nodes are therefore used for this purpose during the
 19825  packaging phase.
 19826  
 19827  @<Prepare all the boxes involved in insertions to act as queues@>=
 19828  begin r:=link(page_ins_head);
 19829  while r<>page_ins_head do
 19830    begin if best_ins_ptr(r)<>null then
 19831      begin n:=qo(subtype(r)); ensure_vbox(n);
 19832      if box(n)=null then box(n):=new_null_box;
 19833      p:=box(n)+list_offset;
 19834      while link(p)<>null do p:=link(p);
 19835      last_ins_ptr(r):=p;
 19836      end;
 19837    r:=link(r);
 19838    end;
 19839  end
 19840  
 19841  @ @<Delete \(t)the page-insertion nodes@>=
 19842  r:=link(page_ins_head);
 19843  while r<>page_ins_head do
 19844    begin q:=link(r); free_node(r,page_ins_node_size); r:=q;
 19845    end;
 19846  link(page_ins_head):=page_ins_head
 19847  
 19848  @ We will set |best_ins_ptr:=null| and package the box corresponding to
 19849  insertion node~|r|, just after making the final insertion into that box.
 19850  If this final insertion is `|split_up|', the remainder after splitting
 19851  and pruning (if any) will be carried over to the next page.
 19852  
 19853  @<Either insert the material specified by node |p| into...@>=
 19854  begin r:=link(page_ins_head);
 19855  while subtype(r)<>subtype(p) do r:=link(r);
 19856  if best_ins_ptr(r)=null then wait:=true
 19857  else  begin wait:=false; s:=last_ins_ptr(r); link(s):=ins_ptr(p);
 19858    if best_ins_ptr(r)=p then
 19859      @<Wrap up the box specified by node |r|, splitting node |p| if
 19860      called for; set |wait:=true| if node |p| holds a remainder after
 19861      splitting@>
 19862    else  begin while link(s)<>null do s:=link(s);
 19863      last_ins_ptr(r):=s;
 19864      end;
 19865    end;
 19866  @<Either append the insertion node |p| after node |q|, and remove it
 19867    from the current page, or delete |node(p)|@>;
 19868  end
 19869  
 19870  @ @<Wrap up the box specified by node |r|, splitting node |p| if...@>=
 19871  begin if type(r)=split_up then
 19872    if (broken_ins(r)=p)and(broken_ptr(r)<>null) then
 19873      begin while link(s)<>broken_ptr(r) do s:=link(s);
 19874      link(s):=null;
 19875      split_top_skip:=split_top_ptr(p);
 19876      ins_ptr(p):=prune_page_top(broken_ptr(r));
 19877      if ins_ptr(p)<>null then
 19878        begin temp_ptr:=vpack(ins_ptr(p),natural);
 19879        height(p):=height(temp_ptr)+depth(temp_ptr);
 19880        free_node(temp_ptr,box_node_size); wait:=true;
 19881        end;
 19882      end;
 19883  best_ins_ptr(r):=null;
 19884  n:=qo(subtype(r));
 19885  temp_ptr:=list_ptr(box(n));
 19886  free_node(box(n),box_node_size);
 19887  box(n):=vpack(temp_ptr,natural);
 19888  end
 19889  
 19890  @ @<Either append the insertion node |p|...@>=
 19891  link(prev_p):=link(p); link(p):=null;
 19892  if wait then
 19893    begin link(q):=p; q:=p; incr(insert_penalties);
 19894    end
 19895  else  begin delete_glue_ref(split_top_ptr(p));
 19896    free_node(p,ins_node_size);
 19897    end;
 19898  p:=prev_p
 19899  
 19900  @ The list of heldover insertions, running from |link(page_head)| to
 19901  |page_tail|, must be moved to the contribution list when the user has
 19902  specified no output routine.
 19903  
 19904  @<Perform the default output routine@>=
 19905  begin if link(page_head)<>null then
 19906    begin if link(contrib_head)=null then
 19907      if nest_ptr=0 then tail:=page_tail@+else contrib_tail:=page_tail
 19908    else link(page_tail):=link(contrib_head);
 19909    link(contrib_head):=link(page_head);
 19910    link(page_head):=null; page_tail:=page_head;
 19911    end;
 19912  ship_out(box(255)); box(255):=null;
 19913  end
 19914  
 19915  @ @<Explain that too many dead cycles have occurred in a row@>=
 19916  begin print_err("Output loop---"); print_int(dead_cycles);
 19917  @.Output loop...@>
 19918  print(" consecutive dead cycles");
 19919  help3("I've concluded that your \output is awry; it never does a")@/
 19920  ("\shipout, so I'm shipping \box255 out myself. Next time")@/
 19921  ("increase \maxdeadcycles if you want me to be more patient!"); error;
 19922  end
 19923  
 19924  @ @<Fire up the user's output routine and |return|@>=
 19925  begin output_active:=true;
 19926  incr(dead_cycles);
 19927  push_nest; mode:=-vmode; prev_depth:=ignore_depth; mode_line:=-line;
 19928  begin_token_list(output_routine,output_text);
 19929  new_save_level(output_group); normal_paragraph;
 19930  scan_left_brace;
 19931  return;
 19932  end
 19933  
 19934  @ When the user's output routine finishes, it has constructed a vlist
 19935  in internal vertical mode, and \TeX\ will do the following:
 19936  
 19937  @<Resume the page builder after an output routine has come to an end@>=
 19938  begin if (loc<>null) or 
 19939   ((token_type<>output_text)and(token_type<>backed_up)) then
 19940    @<Recover from an unbalanced output routine@>;
 19941  end_token_list; {conserve stack space in case more outputs are triggered}
 19942  end_graf; unsave; output_active:=false; insert_penalties:=0;@/
 19943  @<Ensure that box 255 is empty after output@>;
 19944  if tail<>head then {current list goes after heldover insertions}
 19945    begin link(page_tail):=link(head);
 19946    page_tail:=tail;
 19947    end;
 19948  if link(page_head)<>null then {and both go before heldover contributions}
 19949    begin if link(contrib_head)=null then contrib_tail:=page_tail;
 19950    link(page_tail):=link(contrib_head);
 19951    link(contrib_head):=link(page_head);
 19952    link(page_head):=null; page_tail:=page_head;
 19953    end;
 19954  pop_nest; build_page;
 19955  end
 19956  
 19957  @ @<Recover from an unbalanced output routine@>=
 19958  begin print_err("Unbalanced output routine");
 19959  @.Unbalanced output routine@>
 19960  help2("Your sneaky output routine has problematic {'s and/or }'s.")@/
 19961  ("I can't handle that very well; good luck."); error;
 19962  repeat get_token;
 19963  until loc=null;
 19964  end {loops forever if reading from a file, since |null=min_halfword<=0|}
 19965  
 19966  @ @<Ensure that box 255 is empty after output@>=
 19967  if box(255)<>null then
 19968    begin print_err("Output routine didn't use all of ");
 19969    print_esc("box"); print_int(255);
 19970  @.Output routine didn't use...@>
 19971    help3("Your \output commands should empty \box255,")@/
 19972      ("e.g., by saying `\shipout\box255'.")@/
 19973      ("Proceed; I'll discard its present contents.");
 19974    box_error(255);
 19975    end
 19976  
 19977  @* \[46] The chief executive.
 19978  We come now to the |main_control| routine, which contains the master
 19979  switch that causes all the various pieces of \TeX\ to do their things,
 19980  in the right order.
 19981  
 19982  In a sense, this is the grand climax of the program: It applies all the
 19983  tools that we have worked so hard to construct. In another sense, this is
 19984  the messiest part of the program: It necessarily refers to other pieces
 19985  of code all over the place, so that a person can't fully understand what is
 19986  going on without paging back and forth to be reminded of conventions that
 19987  are defined elsewhere. We are now at the hub of the web, the central nervous
 19988  system that touches most of the other parts and ties them together.
 19989  @^brain@>
 19990  
 19991  The structure of |main_control| itself is quite simple. There's a label
 19992  called |big_switch|, at which point the next token of input is fetched
 19993  using |get_x_token|. Then the program branches at high speed into one of
 19994  about 100 possible directions, based on the value of the current
 19995  mode and the newly fetched command code; the sum |abs(mode)+cur_cmd|
 19996  indicates what to do next. For example, the case `|vmode+letter|' arises
 19997  when a letter occurs in vertical mode (or internal vertical mode); this
 19998  case leads to instructions that initialize a new paragraph and enter
 19999  horizontal mode.
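
A minimal Python sketch of such a switch is given here for orientation.
It assumes that the mode codes are spaced |max_command+1| apart, so that
each sum identifies one mode and one command; the command codes and the
action names in the table are stand-ins chosen for the sketch, not the
cases that |main_control| actually contains.

```python
MAX_COMMAND = 100            # largest command code
VMODE = 1
HMODE = VMODE + MAX_COMMAND + 1
MMODE = HMODE + MAX_COMMAND + 1

LETTER, SPACER = 11, 10      # stand-in command codes for this sketch


def make_switch():
    """Build a table from abs(mode)+cur_cmd to the name of an action."""
    return {
        VMODE + LETTER: "start_paragraph",  # hypothetical action names
        HMODE + LETTER: "main_loop",
        HMODE + SPACER: "append_space",
    }


def dispatch(switch, mode, cur_cmd):
    """Select an action; negative modes share the cases of positive ones."""
    return switch.get(abs(mode) + cur_cmd, "other_case")


switch = make_switch()
action = dispatch(switch, -VMODE, LETTER)  # internal vertical mode
```

Taking the absolute value is what makes internal vertical mode and
restricted horizontal mode share the cases of their positive counterparts.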
 20000  
 20001  The big |case| statement that contains this multiway switch has been labeled
 20002  |reswitch|, so that the program can |goto reswitch| when the next token
 20003  has already been fetched. Most of the cases are quite short; they call
 20004  an ``action procedure'' that does the work for that case, and then they
 20005  either |goto reswitch| or they ``fall through'' to the end of the |case|
 20006  statement, which returns control back to |big_switch|. Thus, |main_control|
 20007  is not an extremely large procedure, in spite of the multiplicity of things
 20008  it must do; it is small enough to be handled by \PASCAL\ compilers that put
 20009  severe restrictions on procedure size.
 20010  @!@^action procedure@>
 20011  
 20012  One case is singled out for special treatment, because it accounts for most
 20013  of \TeX's activities in typical applications. The process of reading simple
 20014  text and converting it into |char_node| records, while looking for ligatures
 20015  and kerns, is part of \TeX's ``inner loop''; the whole program runs
 20016  efficiently when its inner loop is fast, so this part has been written
 20017  with particular care.
 20018  
 20019  @ We shall concentrate first on the inner loop of |main_control|, deferring
 20020  consideration of the other cases until later.
 20021  @^inner loop@>
 20022  
 20023  @d big_switch=60 {go here to branch on the next token of input}
 20024  @d main_loop=70 {go here to typeset a string of consecutive characters}
 20025  @d main_loop_wrapup=80 {go here to finish a character or ligature}
 20026  @d main_loop_move=90 {go here to advance the ligature cursor}
 20027  @d main_loop_move_lig=95 {same, when advancing past a generated ligature}
 20028  @d main_loop_lookahead=100 {go here to bring in another character, if any}
 20029  @d main_lig_loop=110 {go here to check for ligatures or kerning}
 20030  @d append_normal_space=120 {go here to append a normal space between words}
 20031  
 20032  @p @t\4@>@<Declare action procedures for use by |main_control|@>@;
 20033  @t\4@>@<Declare the procedure called |handle_right_brace|@>@;
 20034  procedure main_control; {governs \TeX's activities}
 20035  label big_switch,reswitch,main_loop,main_loop_wrapup,
 20036    main_loop_move,main_loop_move+1,main_loop_move+2,main_loop_move_lig,
 20037    main_loop_lookahead,main_loop_lookahead+1,
 20038    main_lig_loop,main_lig_loop+1,main_lig_loop+2,
 20039    append_normal_space,exit;
 20040  var@!t:integer; {general-purpose temporary variable}
 20041  begin if every_job<>null then begin_token_list(every_job,every_job_text);
 20042  big_switch: get_x_token;@/
 20043  reswitch: @<Give diagnostic information, if requested@>;
 20044  case abs(mode)+cur_cmd of
 20045  hmode+letter,hmode+other_char,hmode+char_given: goto main_loop;
 20046  hmode+char_num: begin scan_char_num; cur_chr:=cur_val; goto main_loop;@+end;
 20047  hmode+no_boundary: begin get_x_token;
 20048    if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given)or
 20049     (cur_cmd=char_num) then cancel_boundary:=true;
 20050    goto reswitch;
 20051    end;
 20052  hmode+spacer: if space_factor=1000 then goto append_normal_space
 20053    else app_space;
 20054  hmode+ex_space,mmode+ex_space: goto append_normal_space;
 20055  @t\4@>@<Cases of |main_control| that are not part of the inner loop@>@;
 20056  end; {of the big |case| statement}
 20057  goto big_switch;
 20058  main_loop:@<Append character |cur_chr| and the following characters (if~any)
 20059    to the current hlist in the current font; |goto reswitch| when
 20060    a non-character has been fetched@>;
 20061  append_normal_space:@<Append a normal inter-word space to the current list,
 20062    then |goto big_switch|@>;
 20063  exit:end;
 20064  
 20065  @ When a new token has just been fetched at |big_switch|, we have an
 20066  ideal place to monitor \TeX's activity.
 20067  @^debugging@>
 20068  
 20069  @<Give diagnostic information, if requested@>=
 20070  if interrupt<>0 then if OK_to_interrupt then
 20071    begin back_input; check_interrupt; goto big_switch;
 20072    end;
 20073  @!debug if panicking then check_mem(false);@+@;@+gubed
 20074  if tracing_commands>0 then show_cur_cmd_chr
 20075  
 20076  @ The following part of the program was first written in a structured
 20077  manner, according to the philosophy that ``premature optimization is
 20078  the root of all evil.'' Then it was rearranged into pieces of
 20079  spaghetti so that the most common actions could proceed with little or
 20080  no redundancy.
 20081  
 20082  The original unoptimized form of this algorithm resembles the
 20083  |reconstitute| procedure, which was described earlier in connection with
 20084  hyphenation. Again we have an implied ``cursor'' between characters
 20085  |cur_l| and |cur_r|. The main difference is that the |lig_stack| can now
 20086  contain a charnode as well as pseudo-ligatures; that stack is now
 20087  usually nonempty, because the next character of input (if any) has been
 20088  appended to it. In |main_control| we have
 20089  $$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr
 20090    |font_bchar[cur_font]|,&otherwise;\cr}$$
 20091  except when |character(lig_stack)=font_false_bchar[cur_font]|.
 20092  Several additional global variables are needed.
 20093  
 20094  @<Glob...@>=
 20095  @!main_f:internal_font_number; {the current font}
 20096  @!main_i:four_quarters; {character information bytes for |cur_l|}
 20097  @!main_j:four_quarters; {ligature/kern command}
 20098  @!main_k:font_index; {index into |font_info|}
 20099  @!main_p:pointer; {temporary register for list manipulation}
 20100  @!main_s:integer; {space factor value}
 20101  @!bchar:halfword; {boundary character of current font, or |non_char|}
 20102  @!false_bchar:halfword; {nonexistent character matching |bchar|, or |non_char|}
 20103  @!cancel_boundary:boolean; {should the left boundary be ignored?}
 20104  @!ins_disc:boolean; {should we insert a discretionary node?}
 20105  
 20106  @ The boolean variables of the main loop are normally false, and always reset
 20107  to false before the loop is left. That saves us the extra work of initializing
 20108  each time.
 20109  
 20110  @<Set init...@>=
 20111  ligature_present:=false; cancel_boundary:=false; lft_hit:=false; rt_hit:=false;
 20112  ins_disc:=false;
 20113  
 20114  @ We leave the |space_factor| unchanged if |sf_code(cur_chr)=0|; otherwise we
 20115  set it equal to |sf_code(cur_chr)|, except that it should never change
 20116  from a value less than 1000 to a value exceeding 1000. The most common
 20117  case is |sf_code(cur_chr)=1000|, so we want that case to be fast.
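
The rule can be restated as a small Python function, given here only as
an illustration; the name and the argument order are assumptions of the
sketch, and the function returns the new space factor instead of
assigning to a global variable.

```python
def adjust_space_factor(space_factor, sf_code):
    """Return the space factor in force after a character with this sf code."""
    if sf_code == 1000:
        return 1000                      # the most common case
    if sf_code < 1000:
        # a zero code leaves the space factor unchanged
        return sf_code if sf_code > 0 else space_factor
    # here sf_code exceeds 1000
    if space_factor < 1000:
        return 1000                      # never jump from below 1000 to above it
    return sf_code
```

With the codes that plain \TeX\ assigns, a period (code 3000) after a
lowercase letter gives 3000, but after an uppercase letter (code 999) it
gives only 1000.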
 20118  
 20119  The overall structure of the main loop is presented here. Some program labels
 20120  are inside the individual sections.
 20121  @^inner loop@>
 20122  
 20123  @d adjust_space_factor==@t@>@;@/
 20124    main_s:=sf_code(cur_chr);
 20125    if main_s=1000 then space_factor:=1000
 20126    else if main_s<1000 then
 20127      begin if main_s>0 then space_factor:=main_s;
 20128      end
 20129    else if space_factor<1000 then space_factor:=1000
 20130    else space_factor:=main_s
 20131  
 20132  @<Append character |cur_chr|...@>=
 20133  adjust_space_factor;@/
 20134  main_f:=cur_font;
 20135  bchar:=font_bchar[main_f]; false_bchar:=font_false_bchar[main_f];
 20136  if mode>0 then if language<>clang then fix_language;
 20137  fast_get_avail(lig_stack); font(lig_stack):=main_f; cur_l:=qi(cur_chr);
 20138  character(lig_stack):=cur_l;@/
 20139  cur_q:=tail;
 20140  if cancel_boundary then
 20141    begin cancel_boundary:=false; main_k:=non_address;
 20142    end
 20143  else main_k:=bchar_label[main_f];
 20144  if main_k=non_address then goto main_loop_move+2; {no left boundary processing}
 20145  cur_r:=cur_l; cur_l:=non_char;
 20146  goto main_lig_loop+1; {begin with cursor after left boundary}
 20147  @#
 20148  main_loop_wrapup:@<Make a ligature node, if |ligature_present|;
 20149    insert a null discretionary, if appropriate@>;
 20150  main_loop_move:@<If the cursor is immediately followed by the right boundary,
 20151    |goto reswitch|; if it's followed by an invalid character, |goto big_switch|;
 20152    otherwise move the cursor one step to the right and |goto main_lig_loop|@>;
 20153  main_loop_lookahead:@<Look ahead for another character, or leave |lig_stack|
 20154    empty if there's none there@>;
 20155  main_lig_loop:@<If there's a ligature/kern command relevant to |cur_l| and
 20156    |cur_r|, adjust the text appropriately; exit to |main_loop_wrapup|@>;
 20157  main_loop_move_lig:@<Move the cursor past a pseudo-ligature, then
 20158    |goto main_loop_lookahead| or |main_lig_loop|@>
 20159  
 20160  @ If |link(cur_q)| is nonnull when |wrapup| is invoked, |cur_q| points to
 20161  the list of characters that were consumed while building the ligature
 20162  character~|cur_l|.
 20163  
 20164  A discretionary break is not inserted for an explicit hyphen when we are in
 20165  restricted horizontal mode. In particular, this avoids putting discretionary
 20166  nodes inside of other discretionaries.
 20167  @^inner loop@>
 20168  
 20169  @d pack_lig(#)== {the parameter is either |rt_hit| or |false|}
 20170    begin main_p:=new_ligature(main_f,cur_l,link(cur_q));
 20171    if lft_hit then
 20172      begin subtype(main_p):=2; lft_hit:=false;
 20173      end;
 20174    if # then if lig_stack=null then
 20175      begin incr(subtype(main_p)); rt_hit:=false;
 20176      end;
 20177    link(cur_q):=main_p; tail:=main_p; ligature_present:=false;
 20178    end
 20179  
 20180  @d wrapup(#)==if cur_l<non_char then
 20181    begin if link(cur_q)>null then
 20182      if character(tail)=qi(hyphen_char[main_f]) then ins_disc:=true;
 20183    if ligature_present then pack_lig(#);
 20184    if ins_disc then
 20185      begin ins_disc:=false;
 20186      if mode>0 then tail_append(new_disc);
 20187      end;
 20188    end
 20189  
 20190  @<Make a ligature node, if |ligature_present|;...@>=
 20191  wrapup(rt_hit)
 20192  
 20193  @ @<If the cursor is immediately followed by the right boundary...@>=
 20194  @^inner loop@>
 20195  if lig_stack=null then goto reswitch;
 20196  cur_q:=tail; cur_l:=character(lig_stack);
 20197  main_loop_move+1:if not is_char_node(lig_stack) then goto main_loop_move_lig;
 20198  main_loop_move+2:if(cur_chr<font_bc[main_f])or(cur_chr>font_ec[main_f]) then
 20199    begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch;
 20200    end;
 20201  main_i:=char_info(main_f)(cur_l);
 20202  if not char_exists(main_i) then
 20203    begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch;
 20204    end;
 20205  link(tail):=lig_stack; tail:=lig_stack {|main_loop_lookahead| is next}
 20206  
 20207  @ Here we are at |main_loop_move_lig|.
 20208  When we begin this code we have |cur_q=tail| and |cur_l=character(lig_stack)|.
 20209  
 20210  @<Move the cursor past a pseudo-ligature...@>=
 20211  main_p:=lig_ptr(lig_stack);
 20212  if main_p>null then tail_append(main_p); {append a single character}
 20213  temp_ptr:=lig_stack; lig_stack:=link(temp_ptr);
 20214  free_node(temp_ptr,small_node_size);
 20215  main_i:=char_info(main_f)(cur_l); ligature_present:=true;
 20216  if lig_stack=null then
 20217    if main_p>null then goto main_loop_lookahead
 20218    else cur_r:=bchar
 20219  else cur_r:=character(lig_stack);
 20220  goto main_lig_loop
 20221  
 20222  @ The result of \.{\\char} can participate in a ligature or kern, so we must
 20223  look ahead for it.
 20224  
 20225  @<Look ahead for another character...@>=
 20226  get_next; {set only |cur_cmd| and |cur_chr|, for speed}
 20227  if cur_cmd=letter then goto main_loop_lookahead+1;
 20228  if cur_cmd=other_char then goto main_loop_lookahead+1;
 20229  if cur_cmd=char_given then goto main_loop_lookahead+1;
 20230  x_token; {now expand and set |cur_cmd|, |cur_chr|, |cur_tok|}
 20231  if cur_cmd=letter then goto main_loop_lookahead+1;
 20232  if cur_cmd=other_char then goto main_loop_lookahead+1;
 20233  if cur_cmd=char_given then goto main_loop_lookahead+1;
 20234  if cur_cmd=char_num then
 20235    begin scan_char_num; cur_chr:=cur_val; goto main_loop_lookahead+1;
 20236    end;
 20237  if cur_cmd=no_boundary then bchar:=non_char;
 20238  cur_r:=bchar; lig_stack:=null; goto main_lig_loop;
 20239  main_loop_lookahead+1: adjust_space_factor;
 20240  fast_get_avail(lig_stack); font(lig_stack):=main_f;
 20241  cur_r:=qi(cur_chr); character(lig_stack):=cur_r;
 20242  if cur_r=false_bchar then cur_r:=non_char {this prevents spurious ligatures}
 20243  
 20244  @ Even though comparatively few characters have a lig/kern program, several
 20245  of the instructions here count as part of \TeX's inner loop, since a
 20246  @^inner loop@>
 20247  potentially long sequential search must be performed. For example, tests with
 20248  Computer Modern Roman showed that about 40 per cent of all characters
 20249  actually encountered in practice had a lig/kern program, and that about four
 20250  lig/kern commands were investigated for every such character.
 20251  
 20252  At the beginning of this code we have |main_i=char_info(main_f)(cur_l)|.
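
The sequential search can be sketched in Python as follows. This is a
simplified model under stated assumptions: a lig/kern program is
represented as a list of four-element tuples, the indirect start handled
by |lig_kern_restart| and the left-boundary entry point are omitted, and
the function name is hypothetical.

```python
STOP_FLAG = 128  # a skip byte at or above this value ends the program


def find_lig_kern(program, k, cur_r):
    """Return the index of the first command that applies to cur_r.

    program is a list of (skip_byte, next_char, op_byte, rem_byte) tuples
    and k is the index of the first instruction; None means no command applies.
    """
    while True:
        skip_byte, next_char, _op, _rem = program[k]
        if next_char == cur_r and skip_byte <= STOP_FLAG:
            return k
        if skip_byte >= STOP_FLAG:
            return None          # that was the last instruction
        k += skip_byte + 1       # a zero skip byte moves to the next one


prog = [(0, ord("i"), 0, 12), (0, ord("l"), 0, 13), (128, ord("f"), 0, 11)]
hit = find_lig_kern(prog, 0, ord("l"))
```

The cost of a lookup is proportional to the number of instructions
examined, which is why the loop below is kept as tight as possible.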
 20253  
 20254  @<If there's a ligature/kern command...@>=
 20255  if char_tag(main_i)<>lig_tag then goto main_loop_wrapup;
 20256  if cur_r=non_char then goto main_loop_wrapup;
 20257  main_k:=lig_kern_start(main_f)(main_i); main_j:=font_info[main_k].qqqq;
 20258  if skip_byte(main_j)<=stop_flag then goto main_lig_loop+2;
 20259  main_k:=lig_kern_restart(main_f)(main_j);
 20260  main_lig_loop+1:main_j:=font_info[main_k].qqqq;
 20261  main_lig_loop+2:if next_char(main_j)=cur_r then
 20262   if skip_byte(main_j)<=stop_flag then
 20263    @<Do ligature or kern command, returning to |main_lig_loop|
 20264    or |main_loop_wrapup| or |main_loop_move|@>;
 20265  if skip_byte(main_j)=qi(0) then incr(main_k)
 20266  else begin if skip_byte(main_j)>=stop_flag then goto main_loop_wrapup;
 20267    main_k:=main_k+qo(skip_byte(main_j))+1;
 20268    end;
 20269  goto main_lig_loop+1
 20270  
 20271  @ When a ligature or kern instruction matches a character, we know from
 20272  |read_font_info| that the character exists in the font, even though we
 20273  haven't verified its existence in the normal way.
 20274  
 20275  This section could be made into a subroutine, if the code inside
 20276  |main_control| needs to be shortened.
 20277  
 20278  \chardef\?='174 % vertical line to indicate character retention
 20279  
 20280  @<Do ligature or kern command...@>=
 20281  begin if op_byte(main_j)>=kern_flag then
 20282    begin wrapup(rt_hit);
 20283    tail_append(new_kern(char_kern(main_f)(main_j))); goto main_loop_move;
 20284    end;
 20285  if cur_l=non_char then lft_hit:=true
 20286  else if lig_stack=null then rt_hit:=true;
 20287  check_interrupt; {allow a way out in case there's an infinite ligature loop}
 20288  case op_byte(main_j) of
 20289  qi(1),qi(5):begin cur_l:=rem_byte(main_j); {\.{=:\?}, \.{=:\?>}}
 20290    main_i:=char_info(main_f)(cur_l); ligature_present:=true;
 20291    end;
 20292  qi(2),qi(6):begin cur_r:=rem_byte(main_j); {\.{\?=:}, \.{\?=:>}}
 20293    if lig_stack=null then {right boundary character is being consumed}
 20294      begin lig_stack:=new_lig_item(cur_r); bchar:=non_char;
 20295      end
 20296    else if is_char_node(lig_stack) then {|link(lig_stack)=null|}
 20297      begin main_p:=lig_stack; lig_stack:=new_lig_item(cur_r);
 20298      lig_ptr(lig_stack):=main_p;
 20299      end
 20300    else character(lig_stack):=cur_r;
 20301    end;
 20302  qi(3):begin cur_r:=rem_byte(main_j); {\.{\?=:\?}}
 20303    main_p:=lig_stack; lig_stack:=new_lig_item(cur_r);
 20304    link(lig_stack):=main_p;
 20305    end;
 20306  qi(7),qi(11):begin wrapup(false); {\.{\?=:\?>}, \.{\?=:\?>>}}
 20307    cur_q:=tail; cur_l:=rem_byte(main_j);
 20308    main_i:=char_info(main_f)(cur_l); ligature_present:=true;
 20309    end;
 20310  othercases begin cur_l:=rem_byte(main_j); ligature_present:=true; {\.{=:}}
 20311    if lig_stack=null then goto main_loop_wrapup
 20312    else goto main_loop_move+1;
 20313    end
 20314  endcases;
 20315  if op_byte(main_j)>qi(4) then
 20316    if op_byte(main_j)<>qi(7) then goto main_loop_wrapup;
 20317  if cur_l<non_char then goto main_lig_loop;
 20318  main_k:=bchar_label[main_f]; goto main_lig_loop+1;
 20319  end
 20320  
 20321  @ The occurrence of blank spaces is almost part of \TeX's inner loop,
 20322  @^inner loop@>
 20323  since we usually encounter about one space for every five non-blank characters.
 20324  Therefore |main_control| gives second-highest priority to ordinary spaces.
 20325  
 20326  When a glue parameter like \.{\\spaceskip} is set to `\.{0pt}', we will
 20327  see to it later that the corresponding glue specification is precisely
 20328  |zero_glue|, not merely a pointer to some specification that happens
 20329  to be full of zeroes. Therefore it is simple to test whether a glue parameter
 20330  is zero or~not.
 20331  
 20332  @<Append a normal inter-word space...@>=
 20333  if space_skip=zero_glue then
 20334    begin @<Find the glue specification, |main_p|, for
 20335      text spaces in the current font@>;
 20336    temp_ptr:=new_glue(main_p);
 20337    end
 20338  else temp_ptr:=new_param_glue(space_skip_code);
 20339  link(tail):=temp_ptr; tail:=temp_ptr;
 20340  goto big_switch
 20341  
 20342  @ Having |font_glue| allocated for each text font saves both time and memory.
 20343  If any of the three spacing parameters are subsequently changed by the
 20344  use of \.{\\fontdimen}, the |find_font_dimen| procedure deallocates the
 20345  |font_glue| specification allocated here.
 20346  
 20347  @<Find the glue specification...@>=
 20348  begin main_p:=font_glue[cur_font];
 20349  if main_p=null then
 20350    begin main_p:=new_spec(zero_glue); main_k:=param_base[cur_font]+space_code;
 20351    width(main_p):=font_info[main_k].sc; {that's |space(cur_font)|}
 20352    stretch(main_p):=font_info[main_k+1].sc; {and |space_stretch(cur_font)|}
 20353    shrink(main_p):=font_info[main_k+2].sc; {and |space_shrink(cur_font)|}
 20354    font_glue[cur_font]:=main_p;
 20355    end;
 20356  end
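
@ The caching scheme of the preceding section can be sketched as follows, in
Python rather than Pascal. This is only an illustration under stated
assumptions: the names |GlueSpec|, |FontGlueCache|, |find_glue|, and
|set_spacing_param| are hypothetical and do not occur in \TeX; the sketch
merely mirrors the idea that a specification is built on first use and
dropped when one of the three spacing parameters changes.

```python
from typing import Dict, List, NamedTuple, Optional


class GlueSpec(NamedTuple):
    """Hypothetical stand-in for a glue specification (units: scaled points)."""
    width: int
    stretch: int
    shrink: int


class FontGlueCache:
    """Hypothetical model of the font_glue array and its invalidation."""

    def __init__(self, params: Dict[str, List[int]]) -> None:
        # params[font] = [space, space_stretch, space_shrink]
        self.params = {font: list(p) for font, p in params.items()}
        self.font_glue: Dict[str, Optional[GlueSpec]] = {f: None for f in params}

    def find_glue(self, font: str) -> GlueSpec:
        spec = self.font_glue[font]
        if spec is None:  # first request for this font: build and remember
            space, stretch, shrink = self.params[font]
            spec = GlueSpec(space, stretch, shrink)
            self.font_glue[font] = spec
        return spec

    def set_spacing_param(self, font: str, index: int, value: int) -> None:
        # Changing a spacing parameter discards the cached specification,
        # so the next find_glue call rebuilds it from the new values.
        self.params[font][index] = value
        self.font_glue[font] = None
```

A second call of |find_glue| for the same font returns the same object until
a spacing parameter is changed, which is the saving in time and memory that
the text refers to.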
 20357  
 20358  @ @<Declare act...@>=
 20359  procedure app_space; {handle spaces when |space_factor<>1000|}
 20360  var@!q:pointer; {glue node}
 20361  begin if (space_factor>=2000)and(xspace_skip<>zero_glue) then
 20362    q:=new_param_glue(xspace_skip_code)
 20363  else  begin if space_skip<>zero_glue then main_p:=space_skip
 20364    else @<Find the glue specification...@>;
 20365    main_p:=new_spec(main_p);
 20366    @<Modify the glue specification in |main_p| according to the space factor@>;
 20367    q:=new_glue(main_p); glue_ref_count(main_p):=null;
 20368    end;
 20369  link(tail):=q; tail:=q;
 20370  end;
 20371  
 20372  @ @<Modify the glue specification in |main_p| according to the space factor@>=
 20373  if space_factor>=2000 then width(main_p):=width(main_p)+extra_space(cur_font);
 20374  stretch(main_p):=xn_over_d(stretch(main_p),space_factor,1000);
 20375  shrink(main_p):=xn_over_d(shrink(main_p),1000,space_factor)
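
@ The arithmetic of the preceding section can be sketched in Python as
follows. The function |adjust_for_space_factor| is a hypothetical name used
only for illustration, and the |xn_over_d| shown here is a simplified
stand-in: it truncates toward zero like the real routine, but it omits the
overflow check and the remainder that the real routine also maintains.

```python
from typing import Tuple


def xn_over_d(x: int, n: int, d: int) -> int:
    """Return x*n/d truncated toward zero (simplified; n >= 0 and d > 0 assumed)."""
    q = abs(x) * n // d
    return q if x >= 0 else -q


def adjust_for_space_factor(width: int, stretch: int, shrink: int,
                            space_factor: int, extra_space: int) -> Tuple[int, int, int]:
    """Apply the space factor to an inter-word glue (all dimensions in scaled points)."""
    if space_factor >= 2000:
        width += extra_space          # extra width only for large space factors
    stretch = xn_over_d(stretch, space_factor, 1000)   # stretch grows with the factor
    shrink = xn_over_d(shrink, 1000, space_factor)     # shrink diminishes with it
    return width, stretch, shrink
```

For instance, with illustrative values |width=100|, |stretch=50|, |shrink=30|,
and |extra_space=20|, a space factor of 3000 gives width 120, stretch 150,
and shrink 10, while a space factor of 1500 leaves the width at 100 and gives
stretch 75 and shrink 20.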
 20376  
 20377  @ Whew---that covers the main loop. We can now proceed at a leisurely
 20378  pace through the other combinations of possibilities.
 20379  
 20380  @d any_mode(#)==vmode+#,hmode+#,mmode+# {for mode-independent commands}
 20381  
 20382  @<Cases of |main_control| that are not part of the inner loop@>=
 20383  any_mode(relax),vmode+spacer,mmode+spacer,mmode+no_boundary:do_nothing;
 20384  any_mode(ignore_spaces): begin @<Get the next non-blank non-call...@>;
 20385    goto reswitch;
 20386    end;
 20387  vmode+stop: if its_all_over then return; {this is the only way out}
 20388  @t\4@>@<Forbidden cases detected in |main_control|@>@+@,any_mode(mac_param):
 20389    report_illegal_case;
 20390  @<Math-only cases in non-math modes, or vice versa@>: insert_dollar_sign;
 20391  @t\4@>@<Cases of |main_control| that build boxes and lists@>@;
 20392  @t\4@>@<Cases of |main_control| that don't depend on |mode|@>@;
 20393  @t\4@>@<Cases of |main_control| that are for extensions to \TeX@>@;
 20394  
 20395  @ Here is a list of cases where the user has probably gotten into or out of math
 20396  mode by mistake. \TeX\ will insert a dollar sign and rescan the current token.
 20397  
 20398  @d non_math(#)==vmode+#,hmode+#
 20399  
 20400  @<Math-only cases in non-math modes...@>=
 20401  non_math(sup_mark), non_math(sub_mark), non_math(math_char_num),
 20402  non_math(math_given), non_math(math_comp), non_math(delim_num),
 20403  non_math(left_right), non_math(above), non_math(radical),
 20404  non_math(math_style), non_math(math_choice), non_math(vcenter),
 20405  non_math(non_script), non_math(mkern), non_math(limit_switch),
 20406  non_math(mskip), non_math(math_accent),
 20407  mmode+endv, mmode+par_end, mmode+stop, mmode+vskip, mmode+un_vbox,
 20408  mmode+valign, mmode+hrule
 20409  
 20410  @ @<Declare action...@>=
 20411  procedure insert_dollar_sign;
 20412  begin back_input; cur_tok:=math_shift_token+"$";
 20413  print_err("Missing $ inserted");
 20414  @.Missing \$ inserted@>
 20415  help2("I've inserted a begin-math/end-math symbol since I think")@/
 20416  ("you left one out. Proceed, with fingers crossed."); ins_error;
 20417  end;
 20418  
 20419  @ When erroneous situations arise, \TeX\ usually issues an error message
 20420  specific to the particular error. For example, `\.{\\noalign}' should
 20421  not appear in any mode, since it is recognized by the |align_peek| routine
 20422  in all of its legitimate appearances; a special error message is given
 20423  when `\.{\\noalign}' occurs elsewhere. But sometimes the most appropriate
 20424  error message is simply that the user is not allowed to do what he or she
 20425  has attempted. For example, `\.{\\moveleft}' is allowed only in vertical mode,
 20426  and `\.{\\lower}' only in non-vertical modes.  Such cases are enumerated
 20427  here and in the other sections referred to under `See also \dots.'
 20428  
 20429  @<Forbidden cases...@>=
 20430  vmode+vmove,hmode+hmove,mmode+hmove,any_mode(last_item),
 20431  
 20432  @ The `|you_cant|' procedure prints a line saying that the current command
 20433  is illegal in the current mode; it identifies these things symbolically.
 20434  
 20435  @<Declare action...@>=
 20436  procedure you_cant;
 20437  begin print_err("You can't use `");
 20438  @.You can't use x in y mode@>
 20439  print_cmd_chr(cur_cmd,cur_chr);
 20440  print("' in "); print_mode(mode);
 20441  end;
 20442  
 20443  @ @<Declare act...@>=
 20444  procedure report_illegal_case;
 20445  begin you_cant;
 20446  help4("Sorry, but I'm not programmed to handle this case;")@/
 20447  ("I'll just pretend that you didn't ask for it.")@/
 20448  ("If you're in the wrong mode, you might be able to")@/
 20449  ("return to the right one by typing `I}' or `I$' or `I\par'.");@/
 20450  error;
 20451  end;
 20452  
 20453  @ Some operations are allowed only in privileged modes, i.e., in cases
 20454  where |mode>0|. The |privileged| function is used to detect violations
 20455  of this rule; it issues an error message and returns |false| if the
 20456  current |mode| is negative.
 20457  
 20458  @<Declare act...@>=
 20459  function privileged:boolean;
 20460  begin if mode>0 then privileged:=true
 20461  else  begin report_illegal_case; privileged:=false;
 20462    end;
 20463  end;
 20464  
 20465  @ Either \.{\\dump} or \.{\\end} will cause |main_control| to enter the
 20466  endgame, since both of them have `|stop|' as their command code.
 20467  
 20468  @<Put each...@>=
 20469  primitive("end",stop,0);@/
 20470  @!@:end_}{\.{\\end} primitive@>
 20471  primitive("dump",stop,1);@/
 20472  @!@:dump_}{\.{\\dump} primitive@>
 20473  
 20474  @ @<Cases of |print_cmd_chr|...@>=
 20475  stop:if chr_code=1 then print_esc("dump")@+else print_esc("end");
 20476  
 20477  @ We don't want to leave |main_control| immediately when a |stop| command
 20478  is sensed, because it may be necessary to invoke an \.{\\output} routine
 20479  several times before things really grind to a halt. (The output routine
 20480  might even say `\.{\\gdef\\end\{...\}}', to prolong the life of the job.)
 20481  Therefore |its_all_over| is |true| only when the current page
 20482  and contribution list are empty, and when the last output was not a
 20483  ``dead cycle.''
 20484  
 20485  @<Declare act...@>=
 20486  function its_all_over:boolean; {do this when \.{\\end} or \.{\\dump} occurs}
 20487  label exit;
 20488  begin if privileged then
 20489    begin if (page_head=page_tail)and(head=tail)and(dead_cycles=0) then
 20490      begin its_all_over:=true; return;
 20491      end;
 20492    back_input; {we will try to end again after ejecting residual material}
 20493    tail_append(new_null_box);
 20494    width(tail):=hsize;
 20495    tail_append(new_glue(fill_glue));
 20496    tail_append(new_penalty(-@'10000000000));@/
 20497    build_page; {append \.{\\hbox to \\hsize\{\}\\vfill\\penalty-'10000000000}}
 20498    end;
 20499  its_all_over:=false;
 20500  exit:end;
 20501  
 20502  @* \[47] Building boxes and lists.
 20503  The most important parts of |main_control| are concerned with \TeX's
 20504  chief mission of box-making. We need to control the activities that put
 20505  entries on vlists and hlists, as well as the activities that convert
 20506  those lists into boxes. All of the necessary machinery has already been
 20507  developed; it remains for us to ``push the buttons'' at the right times.
 20508  
 20509  @ As an introduction to these routines, let's consider one of the simplest
 20510  cases: What happens when `\.{\\hrule}' occurs in vertical mode, or
 20511  `\.{\\vrule}' in horizontal mode or math mode? The code in |main_control|
 20512  is short, since the |scan_rule_spec| routine already does most of what is
 20513  required; thus, there is no need for a special action procedure.
 20514  
 20515  Note that baselineskip calculations are disabled after a rule in vertical
 20516  mode, by setting |prev_depth:=ignore_depth|.
 20517  
 20518  @<Cases of |main_control| that build...@>=
 20519  vmode+hrule,hmode+vrule,mmode+vrule: begin tail_append(scan_rule_spec);
 20520    if abs(mode)=vmode then prev_depth:=ignore_depth
 20521    else if abs(mode)=hmode then space_factor:=1000;
 20522    end;
 20523  
 20524  @ The processing of things like \.{\\hskip} and \.{\\vskip} is slightly
 20525  more complicated. But the code in |main_control| is very short, since
 20526  it simply calls on the action routine |append_glue|. Similarly, \.{\\kern}
 20527  activates |append_kern|.
 20528  
 20529  @<Cases of |main_control| that build...@>=
 20530  vmode+vskip,hmode+hskip,mmode+hskip,mmode+mskip: append_glue;
 20531  any_mode(kern),mmode+mkern: append_kern;
 20532  
 20533  @ The |hskip| and |vskip| command codes are used for control sequences
 20534  like \.{\\hss} and \.{\\vfil} as well as for \.{\\hskip} and \.{\\vskip}.
 20535  The difference is in the value of |cur_chr|.
 20536  
 20537  @d fil_code=0 {identifies \.{\\hfil} and \.{\\vfil}}
 20538  @d fill_code=1 {identifies \.{\\hfill} and \.{\\vfill}}
 20539  @d ss_code=2 {identifies \.{\\hss} and \.{\\vss}}
 20540  @d fil_neg_code=3 {identifies \.{\\hfilneg} and \.{\\vfilneg}}
 20541  @d skip_code=4 {identifies \.{\\hskip} and \.{\\vskip}}
 20542  @d mskip_code=5 {identifies \.{\\mskip}}
 20543  
 20544  @<Put each...@>=
 20545  primitive("hskip",hskip,skip_code);@/
 20546  @!@:hskip_}{\.{\\hskip} primitive@>
 20547  primitive("hfil",hskip,fil_code);
 20548  @!@:hfil_}{\.{\\hfil} primitive@>
 20549  primitive("hfill",hskip,fill_code);@/
 20550  @!@:hfill_}{\.{\\hfill} primitive@>
 20551  primitive("hss",hskip,ss_code);
 20552  @!@:hss_}{\.{\\hss} primitive@>
 20553  primitive("hfilneg",hskip,fil_neg_code);@/
 20554  @!@:hfil_neg_}{\.{\\hfilneg} primitive@>
 20555  primitive("vskip",vskip,skip_code);@/
 20556  @!@:vskip_}{\.{\\vskip} primitive@>
 20557  primitive("vfil",vskip,fil_code);
 20558  @!@:vfil_}{\.{\\vfil} primitive@>
 20559  primitive("vfill",vskip,fill_code);@/
 20560  @!@:vfill_}{\.{\\vfill} primitive@>
 20561  primitive("vss",vskip,ss_code);
 20562  @!@:vss_}{\.{\\vss} primitive@>
 20563  primitive("vfilneg",vskip,fil_neg_code);@/
 20564  @!@:vfil_neg_}{\.{\\vfilneg} primitive@>
 20565  primitive("mskip",mskip,mskip_code);@/
 20566  @!@:mskip_}{\.{\\mskip} primitive@>
 20567  primitive("kern",kern,explicit);
 20568  @!@:kern_}{\.{\\kern} primitive@>
 20569  primitive("mkern",mkern,mu_glue);@/
 20570  @!@:mkern_}{\.{\\mkern} primitive@>
 20571  
 20572  @ @<Cases of |print_cmd_chr|...@>=
 20573  hskip: case chr_code of
 20574    skip_code:print_esc("hskip");
 20575    fil_code:print_esc("hfil");
 20576    fill_code:print_esc("hfill");
 20577    ss_code:print_esc("hss");
 20578    othercases print_esc("hfilneg")
 20579    endcases;
 20580  vskip: case chr_code of
 20581    skip_code:print_esc("vskip");
 20582    fil_code:print_esc("vfil");
 20583    fill_code:print_esc("vfill");
 20584    ss_code:print_esc("vss");
 20585    othercases print_esc("vfilneg")
 20586    endcases;
 20587  mskip: print_esc("mskip");
 20588  kern: print_esc("kern");
 20589  mkern: print_esc("mkern");
 20590  
 20591  @ All the work relating to glue creation has been relegated to the
 20592  following subroutine. It does not call |build_page|, because it is
 20593  used in at least one place where that would be a mistake.
 20594  
 20595  @<Declare action...@>=
 20596  procedure append_glue;
 20597  var s:small_number; {modifier of skip command}
 20598  begin s:=cur_chr;
 20599  case s of
 20600  fil_code: cur_val:=fil_glue;
 20601  fill_code: cur_val:=fill_glue;
 20602  ss_code: cur_val:=ss_glue;
 20603  fil_neg_code: cur_val:=fil_neg_glue;
 20604  skip_code: scan_glue(glue_val);
 20605  mskip_code: scan_glue(mu_val);
 20606  end; {now |cur_val| points to the glue specification}
 20607  tail_append(new_glue(cur_val));
 20608  if s>=skip_code then
 20609    begin decr(glue_ref_count(cur_val));
 20610    if s>skip_code then subtype(tail):=mu_glue;
 20611    end;
 20612  end;
 20613  
 20614  @ @<Declare act...@>=
 20615  procedure append_kern;
 20616  var s:quarterword; {|subtype| of the kern node}
 20617  begin s:=cur_chr; scan_dimen(s=mu_glue,false,false);
 20618  tail_append(new_kern(cur_val)); subtype(tail):=s;
 20619  end;
 20620  
 20621  @ Many of the actions related to box-making are triggered by the appearance
 20622  of braces in the input. For example, when the user says `\.{\\hbox}
 20623  \.{to} \.{100pt\{$\langle\,\hbox{\rm hlist}\,\rangle$\}}' in vertical mode,
 20624  the information about the box size (100pt, |exactly|) is put onto |save_stack|
 20625  with a level boundary word just above it, and |cur_group:=adjusted_hbox_group|;
 20626  \TeX\ enters restricted horizontal mode to process the hlist. The right
 20627  brace eventually causes |save_stack| to be restored to its former state,
 20628  at which time the information about the box size (100pt, |exactly|) is
 20629  available once again; a box is packaged and we leave restricted horizontal
 20630  mode, appending the new box to the current list of the enclosing mode
 20631  (in this case to the current list of vertical mode), followed by any
 20632  vertical adjustments that were removed from the box by |hpack|.
 20633  
 20634  The next few sections of the program are therefore concerned with the
 20635  treatment of left and right curly braces.
 20636  
 20637  @ If a left brace occurs in the middle of a page or paragraph, it simply
 20638  introduces a new level of grouping, and the matching right brace will not have
 20639  such a drastic effect. Such grouping affects neither the mode nor the
 20640  current list.
 20641  
 20642  @<Cases of |main_control| that build...@>=
 20643  non_math(left_brace): new_save_level(simple_group);
 20644  any_mode(begin_group): new_save_level(semi_simple_group);
 20645  any_mode(end_group): if cur_group=semi_simple_group then unsave
 20646    else off_save;
 20647  
 20648  @ We have to deal with errors in which braces and such things are not
 20649  properly nested. Sometimes the user makes an error of commission by
 20650  inserting an extra symbol, but sometimes the user makes an error of omission.
 20651  \TeX\ can't always tell one from the other, so it makes a guess and tries
 20652  to avoid getting into a loop.
 20653  
 20654  The |off_save| routine is called when the current group code is wrong. It tries
 20655  to insert something into the user's input that will help clean off
 20656  the top level.
 20657  
 20658  @<Declare act...@>=
 20659  procedure off_save;
 20660  var p:pointer; {inserted token}
 20661  begin if cur_group=bottom_level then
 20662    @<Drop current token and complain that it was unmatched@>
 20663  else  begin back_input; p:=get_avail; link(temp_head):=p;
 20664    print_err("Missing ");
 20665    @<Prepare to insert a token that matches |cur_group|,
 20666      and print what it is@>;
 20667    print(" inserted"); ins_list(link(temp_head));
 20668    help5("I've inserted something that you may have forgotten.")@/
 20669    ("(See the <inserted text> above.)")@/
 20670    ("With luck, this will get me unwedged. But if you")@/
 20671    ("really didn't forget anything, try typing `2' now; then")@/
 20672    ("my insertion and my current dilemma will both disappear.");
 20673    error;
 20674    end;
 20675  end;
 20676  
 20677  @ At this point, |link(temp_head)=p|, a pointer to an empty one-word node.
 20678  
 20679  @<Prepare to insert a token that matches |cur_group|...@>=
 20680  case cur_group of
 20681  semi_simple_group: begin info(p):=cs_token_flag+frozen_end_group;
 20682    print_esc("endgroup");
 20683  @.Missing \\endgroup inserted@>
 20684    end;
 20685  math_shift_group: begin info(p):=math_shift_token+"$"; print_char("$");
 20686  @.Missing \$ inserted@>
 20687    end;
 20688  math_left_group: begin info(p):=cs_token_flag+frozen_right; link(p):=get_avail;
 20689    p:=link(p); info(p):=other_token+"."; print_esc("right.");
 20690  @.Missing \\right\hbox{.} inserted@>
 20691  @^null delimiter@>
 20692    end;
 20693  othercases begin info(p):=right_brace_token+"}"; print_char("}");
 20694  @.Missing \} inserted@>
 20695    end
 20696  endcases
 20697  
 20698  @ @<Drop current token and complain that it was unmatched@>=
 20699  begin print_err("Extra "); print_cmd_chr(cur_cmd,cur_chr);
 20700  @.Extra x@>
 20701  help1("Things are pretty mixed up, but I think the worst is over.");@/
 20702  error;
 20703  end
 20704  
 20705  @ The routine for a |right_brace| character branches into many subcases,
 20706  since a variety of things may happen, depending on |cur_group|. Some
 20707  types of groups are not supposed to be ended by a right brace; error
 20708  messages are given in hopes of pinpointing the problem. Most branches
 20709  of this routine will be filled in later, when we are ready to understand
 20710  them; meanwhile, we must prepare ourselves to deal with such errors.
 20711  
 20712  @<Cases of |main_control| that build...@>=
 20713  any_mode(right_brace): handle_right_brace;
 20714  
 20715  @ @<Declare the procedure called |handle_right_brace|@>=
 20716  procedure handle_right_brace;
 20717  var p,@!q:pointer; {for short-term use}
 20718  @!d:scaled; {holds |split_max_depth| in |insert_group|}
 20719  @!f:integer; {holds |floating_penalty| in |insert_group|}
 20720  begin case cur_group of
 20721  simple_group: unsave;
 20722  bottom_level: begin print_err("Too many }'s");
 20723  @.Too many \}'s@>
 20724    help2("You've closed more groups than you opened.")@/
 20725    ("Such booboos are generally harmless, so keep going."); error;
 20726    end;
 20727  semi_simple_group,math_shift_group,math_left_group: extra_right_brace;
 20728  @t\4@>@<Cases of |handle_right_brace| where a |right_brace| triggers
 20729    a delayed action@>@;
 20730  othercases confusion("rightbrace")
 20731  @:this can't happen rightbrace}{\quad rightbrace@>
 20732  endcases;
 20733  end;
 20734  
 20735  @ @<Declare act...@>=
 20736  procedure extra_right_brace;
 20737  begin print_err("Extra }, or forgotten ");
 20738  @.Extra \}, or forgotten x@>
 20739  case cur_group of
 20740  semi_simple_group: print_esc("endgroup");
 20741  math_shift_group: print_char("$");
 20742  math_left_group: print_esc("right");
 20743  end;@/
 20744  help5("I've deleted a group-closing symbol because it seems to be")@/
 20745  ("spurious, as in `$x}$'. But perhaps the } is legitimate and")@/
 20746  ("you forgot something else, as in `\hbox{$x}'. In such cases")@/
 20747  ("the way to recover is to insert both the forgotten and the")@/
 20748  ("deleted material, e.g., by typing `I$}'."); error;
 20749  incr(align_state);
 20750  end;
 20751  
 20752  @ Here is where we clear the parameters that are supposed to revert to their
 20753  default values after every paragraph and when internal vertical mode is entered.
 20754  
 20755  @<Declare act...@>=
 20756  procedure normal_paragraph;
 20757  begin if looseness<>0 then eq_word_define(int_base+looseness_code,0);
 20758  if hang_indent<>0 then eq_word_define(dimen_base+hang_indent_code,0);
 20759  if hang_after<>1 then eq_word_define(int_base+hang_after_code,1);
 20760  if par_shape_ptr<>null then eq_define(par_shape_loc,shape_ref,null);
 20761  end;
 20762  
 20763  @ Now let's turn to the question of how \.{\\hbox} is treated. We actually
 20764  need to consider also a slightly larger context, since constructions like
 20765  `\.{\\setbox3=}\penalty0\.{\\hbox...}' and
 20766  `\.{\\leaders}\penalty0\.{\\hbox...}' and
 20767  `\.{\\lower3.8pt\\hbox...}'
 20768  are supposed to invoke quite
 20769  different actions after the box has been packaged. Conversely,
 20770  constructions like `\.{\\setbox3=}' can be followed by a variety of
 20771  different kinds of boxes, and we would like to encode such things in an
 20772  efficient way.
 20773  
 20774  In other words, there are two problems: to represent the context of a box,
 20775  and to represent its type.
 20776  
 20777  The first problem is solved by putting a ``context code'' on the |save_stack|,
 20778  just below the two entries that give the dimensions produced by |scan_spec|.
 20779  The context code is either a (signed) shift amount, or it is a large
 20780  integer |>=box_flag|, where |box_flag=@t$2^{30}$@>|. Codes |box_flag| through
 20781  |box_flag+255| represent `\.{\\setbox0}' through `\.{\\setbox255}';
 20782  codes |box_flag+256| through |box_flag+511| represent `\.{\\global\\setbox0}'
 20783  through `\.{\\global\\setbox255}';
 20784  code |box_flag+512| represents `\.{\\shipout}'; and codes |box_flag+513|
 20785  through |box_flag+515| represent `\.{\\leaders}', `\.{\\cleaders}',
 20786  and `\.{\\xleaders}'.
 20787  
 20788  The second problem is solved by giving the command code |make_box| to all
 20789  control sequences that produce a box, and by using the following |chr_code|
 20790  values to distinguish between them: |box_code|, |copy_code|, |last_box_code|,
 20791  |vsplit_code|, |vtop_code|, |vtop_code+vmode|, and |vtop_code+hmode|, where
 20792  the latter two are used to denote \.{\\vbox} and \.{\\hbox}, respectively.
 20793  
 20794  @d box_flag==@'10000000000 {context code for `\.{\\setbox0}'}
 20795  @d ship_out_flag==box_flag+512 {context code for `\.{\\shipout}'}
 20796  @d leader_flag==box_flag+513 {context code for `\.{\\leaders}'}
 20797  @d box_code=0 {|chr_code| for `\.{\\box}'}
 20798  @d copy_code=1 {|chr_code| for `\.{\\copy}'}
 20799  @d last_box_code=2 {|chr_code| for `\.{\\lastbox}'}
 20800  @d vsplit_code=3 {|chr_code| for `\.{\\vsplit}'}
 20801  @d vtop_code=4 {|chr_code| for `\.{\\vtop}'}
 20802  
 20803  @<Put each...@>=
 20804  primitive("moveleft",hmove,1);
 20805  @!@:move_left_}{\.{\\moveleft} primitive@>
 20806  primitive("moveright",hmove,0);@/
 20807  @!@:move_right_}{\.{\\moveright} primitive@>
 20808  primitive("raise",vmove,1);
 20809  @!@:raise_}{\.{\\raise} primitive@>
 20810  primitive("lower",vmove,0);
 20811  @!@:lower_}{\.{\\lower} primitive@>
 20812  @#
 20813  primitive("box",make_box,box_code);
 20814  @!@:box_}{\.{\\box} primitive@>
 20815  primitive("copy",make_box,copy_code);
 20816  @!@:copy_}{\.{\\copy} primitive@>
 20817  primitive("lastbox",make_box,last_box_code);
 20818  @!@:last_box_}{\.{\\lastbox} primitive@>
 20819  primitive("vsplit",make_box,vsplit_code);
 20820  @!@:vsplit_}{\.{\\vsplit} primitive@>
 20821  primitive("vtop",make_box,vtop_code);@/
 20822  @!@:vtop_}{\.{\\vtop} primitive@>
 20823  primitive("vbox",make_box,vtop_code+vmode);
 20824  @!@:vbox_}{\.{\\vbox} primitive@>
 20825  primitive("hbox",make_box,vtop_code+hmode);@/
 20826  @!@:hbox_}{\.{\\hbox} primitive@>
 20827  primitive("shipout",leader_ship,a_leaders-1); {|ship_out_flag=leader_flag-1|}
 20828  @!@:ship_out_}{\.{\\shipout} primitive@>
 20829  primitive("leaders",leader_ship,a_leaders);
 20830  @!@:leaders_}{\.{\\leaders} primitive@>
 20831  primitive("cleaders",leader_ship,c_leaders);
 20832  @!@:c_leaders_}{\.{\\cleaders} primitive@>
 20833  primitive("xleaders",leader_ship,x_leaders);
 20834  @!@:x_leaders_}{\.{\\xleaders} primitive@>
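
@ The ranges of context codes described above can be made concrete with a
small Python sketch. The function |describe_box_context| and the tuple it
returns are hypothetical, chosen only for illustration; the numeric ranges
are the ones stated in the text, and the tests are made in the same order
in which the ranges were listed.

```python
from typing import Optional, Tuple

BOX_FLAG = 0o10000000000        # 2**30, context code for setbox0
SHIP_OUT_FLAG = BOX_FLAG + 512  # context code for shipout
LEADER_FLAG = BOX_FLAG + 513    # context code for leaders
LEADER_NAMES = ("leaders", "cleaders", "xleaders")


def describe_box_context(box_context: int) -> Tuple[str, Optional[int]]:
    """Classify a context code; the second item is a shift amount or register number."""
    if box_context < BOX_FLAG:
        return ("shift", box_context)                       # signed shift amount
    if box_context < BOX_FLAG + 256:
        return ("setbox", box_context - BOX_FLAG)           # local assignment
    if box_context < SHIP_OUT_FLAG:
        return ("global setbox", box_context - BOX_FLAG - 256)
    if box_context == SHIP_OUT_FLAG:
        return ("shipout", None)
    if box_context <= LEADER_FLAG + 2:
        return (LEADER_NAMES[box_context - LEADER_FLAG], None)
    raise ValueError("not a context code described in the text")
```

Since every shift amount is smaller than |box_flag| in absolute value, a
single integer suffices to carry either a shift or one of the special
contexts.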
 20835  
 20836  @ @<Cases of |print_cmd_chr|...@>=
 20837  hmove: if chr_code=1 then print_esc("moveleft")@+else print_esc("moveright");
 20838  vmove: if chr_code=1 then print_esc("raise")@+else print_esc("lower");
 20839  make_box: case chr_code of
 20840    box_code: print_esc("box");
 20841    copy_code: print_esc("copy");
 20842    last_box_code: print_esc("lastbox");
 20843    vsplit_code: print_esc("vsplit");
 20844    vtop_code: print_esc("vtop");
 20845    vtop_code+vmode: print_esc("vbox");
 20846    othercases print_esc("hbox")
 20847    endcases;
 20848  leader_ship: if chr_code=a_leaders then print_esc("leaders")
 20849    else if chr_code=c_leaders then print_esc("cleaders")
 20850    else if chr_code=x_leaders then print_esc("xleaders")
 20851    else print_esc("shipout");
 20852  
 20853  @ Constructions that require a box are started by calling |scan_box| with
 20854  a specified context code. The |scan_box| routine verifies
 20855  that a |make_box| command comes next and then it calls |begin_box|.
 20856  
 20857  @<Cases of |main_control| that build...@>=
 20858  vmode+hmove,hmode+vmove,mmode+vmove: begin t:=cur_chr;
 20859    scan_normal_dimen;
 20860    if t=0 then scan_box(cur_val)@+else scan_box(-cur_val);
 20861    end;
 20862  any_mode(leader_ship): scan_box(leader_flag-a_leaders+cur_chr);
 20863  any_mode(make_box): begin_box(0);
 20864  
 20865  @ The global variable |cur_box| will point to a newly made box. If the box
 20866  is void, we will have |cur_box=null|. Otherwise we will have
 20867  |type(cur_box)=hlist_node| or |vlist_node| or |rule_node|; the |rule_node|
 20868  case can occur only with leaders.
 20869  
 20870  @<Glob...@>=
 20871  @!cur_box:pointer; {box to be placed into its context}
 20872  
 20873  @ The |box_end| procedure disposes of |cur_box| as directed by |box_context|,
 20874  the context code explained above: append, store in a register, ship out, or use as leaders.
 20875  
 20876  @<Declare act...@>=
 20877  procedure box_end(@!box_context:integer);
 20878  var p:pointer; {|ord_noad| for new box in math mode}
 20879  begin if box_context<box_flag then @<Append box |cur_box| to the current list,
 20880      shifted by |box_context|@>
 20881  else if box_context<ship_out_flag then @<Store \(c)|cur_box| in a box register@>
 20882  else if cur_box<>null then
 20883    if box_context>ship_out_flag then @<Append a new leader node that
 20884        uses |cur_box|@>
 20885    else ship_out(cur_box);
 20886  end;
 20887  
 20888  @ The global variable |adjust_tail| will be non-null if and only if the
 20889  current box might include adjustments that should be appended to the
 20890  current vertical list.
 20891  
 20892  @<Append box |cur_box| to the current...@>=
 20893  begin if cur_box<>null then
 20894    begin shift_amount(cur_box):=box_context;
 20895    if abs(mode)=vmode then
 20896      begin append_to_vlist(cur_box);
 20897      if adjust_tail<>null then
 20898        begin if adjust_head<>adjust_tail then
 20899          begin link(tail):=link(adjust_head); tail:=adjust_tail;
 20900          end;
 20901        adjust_tail:=null;
 20902        end;
 20903      if mode>0 then build_page;
 20904      end
 20905    else  begin if abs(mode)=hmode then space_factor:=1000
 20906      else  begin p:=new_noad;
 20907        math_type(nucleus(p)):=sub_box;
 20908        info(nucleus(p)):=cur_box; cur_box:=p;
 20909        end;
 20910      link(tail):=cur_box; tail:=cur_box;
 20911      end;
 20912    end;
 20913  end
 20914  
 20915  @ @<Store \(c)|cur_box| in a box register@>=
 20916  if box_context<box_flag+256 then
 20917    eq_define(box_base-box_flag+box_context,box_ref,cur_box)
 20918  else geq_define(box_base-box_flag-256+box_context,box_ref,cur_box)
 20919  
 20920  @ @<Append a new leader node ...@>=
 20921  begin @<Get the next non-blank non-relax...@>;
 20922  if ((cur_cmd=hskip)and(abs(mode)<>vmode))or@|
 20923     ((cur_cmd=vskip)and(abs(mode)=vmode)) then
 20924    begin append_glue; subtype(tail):=box_context-(leader_flag-a_leaders);
 20925    leader_ptr(tail):=cur_box;
 20926    end
 20927  else  begin print_err("Leaders not followed by proper glue");
 20928  @.Leaders not followed by...@>
 20929    help3("You should say `\leaders <box or rule><hskip or vskip>'.")@/
 20930    ("I found the <box or rule>, but there's no suitable")@/
 20931    ("<hskip or vskip>, so I'm ignoring these leaders."); back_error;
 20932    flush_node_list(cur_box);
 20933    end;
 20934  end
 20935  
 20936  @ Now that we can see what eventually happens to boxes, we can consider
 20937  the first steps in their creation. The |begin_box| routine is called when
 20938  |box_context| is a context specification, |cur_chr| specifies the type of
 20939  box desired, and |cur_cmd=make_box|.
 20940  
 20941  @<Declare act...@>=
 20942  procedure begin_box(@!box_context:integer);
 20943  label exit, done;
 20944  var @!p,@!q:pointer; {run through the current list}
 20945  @!m:quarterword; {the length of a replacement list}
 20946  @!k:halfword; {0 or |vmode| or |hmode|}
 20947  @!n:eight_bits; {a box number}
 20948  begin case cur_chr of
 20949  box_code: begin scan_eight_bit_int; cur_box:=box(cur_val);
 20950    box(cur_val):=null; {the box becomes void, at the same level}
 20951    end;
 20952  copy_code: begin scan_eight_bit_int; cur_box:=copy_node_list(box(cur_val));
 20953    end;
 20954  last_box_code: @<If the current list ends with a box node, delete it from
 20955    the list and make |cur_box| point to it; otherwise set |cur_box:=null|@>;
 20956  vsplit_code: @<Split off part of a vertical box, make |cur_box| point to it@>;
 20957  othercases @<Initiate the construction of an hbox or vbox, then |return|@>
 20958  endcases;@/
 20959  box_end(box_context); {in simple cases, we use the box immediately}
 20960  exit:end;
 20961  
 20962  @ Note that the condition |not is_char_node(tail)| implies that |head<>tail|,
 20963  since |head| is a one-word node.
 20964  
 20965  @<If the current list ends with a box node, delete it...@>=
 20966  begin cur_box:=null;
 20967  if abs(mode)=mmode then
 20968    begin you_cant; help1("Sorry; this \lastbox will be void."); error;
 20969    end
 20970  else if (mode=vmode)and(head=tail) then
 20971    begin you_cant;
 20972    help2("Sorry...I usually can't take things from the current page.")@/
 20973      ("This \lastbox will therefore be void."); error;
 20974    end
 20975  else  begin if not is_char_node(tail) then
 20976      if (type(tail)=hlist_node)or(type(tail)=vlist_node) then
 20977        @<Remove the last box, unless it's part of a discretionary@>;
 20978    end;
 20979  end
 20980  
 20981  @ @<Remove the last box...@>=
 20982  begin q:=head;
 20983  repeat p:=q;
 20984  if not is_char_node(q) then if type(q)=disc_node then
 20985    begin for m:=1 to replace_count(q) do p:=link(p);
 20986    if p=tail then goto done;
 20987    end;
 20988  q:=link(p);
 20989  until q=tail;
 20990  cur_box:=tail; shift_amount(cur_box):=0;
 20991  tail:=p; link(p):=null;
 20992  done:end
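
@ The scan of the preceding section can be sketched in Python as follows,
under simplifying assumptions: the current list is modeled as an ordinary
Python list of the nodes that follow |head|, and each node is a hypothetical
pair consisting of a kind and a replace count (zero except for discretionary
nodes). The function name |remove_last_box| is likewise an assumption made
for illustration.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, int]  # (kind, replace_count)


def remove_last_box(nodes: List[Node]) -> Optional[Node]:
    """Remove and return the final box, or return None and leave the list alone."""
    if not nodes or nodes[-1][0] not in ("hlist", "vlist"):
        return None                  # the list does not end with a box
    tail = len(nodes) - 1
    i = 0
    while i < tail:
        kind, replace_count = nodes[i]
        if kind == "disc":
            i += replace_count       # skip the nodes owned by the discretionary
            if i >= tail:            # the final box is one of them: keep it
                return None
        i += 1
    return nodes.pop()               # the predecessor becomes the new tail
```

The list must be scanned from the front because the nodes are singly linked;
the same scan reveals whether the final box belongs to the replacement text
of a discretionary, in which case nothing is removed.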
 20993  
 20994  @ Here we deal with things like `\.{\\vsplit 13 to 100pt}'.
 20995  
 20996  @<Split off part of a vertical box, make |cur_box| point to it@>=
 20997  begin scan_eight_bit_int; n:=cur_val;
 20998  if not scan_keyword("to") then
 20999  @.to@>
 21000    begin print_err("Missing `to' inserted");
 21001  @.Missing `to' inserted@>
 21002    help2("I'm working on `\vsplit<box number> to <dimen>';")@/
 21003    ("will look for the <dimen> next."); error;
 21004    end;
 21005  scan_normal_dimen;
 21006  cur_box:=vsplit(n,cur_val);
 21007  end
 21008  
 21009  @ Here is where we enter restricted horizontal mode or internal vertical
 21010  mode, in order to make a box.
 21011  
 21012  @<Initiate the construction of an hbox or vbox, then |return|@>=
 21013  begin k:=cur_chr-vtop_code; saved(0):=box_context;
 21014  if k=hmode then
 21015    if (box_context<box_flag)and(abs(mode)=vmode) then
 21016      scan_spec(adjusted_hbox_group,true)
 21017    else scan_spec(hbox_group,true)
 21018  else  begin if k=vmode then scan_spec(vbox_group,true)
 21019    else  begin scan_spec(vtop_group,true); k:=vmode;
 21020      end;
 21021    normal_paragraph;
 21022    end;
 21023  push_nest; mode:=-k;
 21024  if k=vmode then
 21025    begin prev_depth:=ignore_depth;
 21026    if every_vbox<>null then begin_token_list(every_vbox,every_vbox_text);
 21027    end
 21028  else  begin space_factor:=1000;
 21029    if every_hbox<>null then begin_token_list(every_hbox,every_hbox_text);
 21030    end;
 21031  return;
 21032  end
 21033  
 21034  @ @<Declare act...@>=
 21035  procedure scan_box(@!box_context:integer);
 21036    {the next input should specify a box or perhaps a rule}
 21037  begin @<Get the next non-blank non-relax...@>;
 21038  if cur_cmd=make_box then begin_box(box_context)
 21039  else if (box_context>=leader_flag)and((cur_cmd=hrule)or(cur_cmd=vrule)) then
 21040    begin cur_box:=scan_rule_spec; box_end(box_context);
 21041    end
 21042  else  begin@t@>@;@/
 21043    print_err("A <box> was supposed to be here");@/
 21044  @.A <box> was supposed to...@>
 21045    help3("I was expecting to see \hbox or \vbox or \copy or \box or")@/
 21046    ("something like that. So you might find something missing in")@/
 21047    ("your output. But keep trying; you can fix this later."); back_error;
 21048    end;
 21049  end;
 21050  
 21051  @ When the right brace occurs at the end of an \.{\\hbox} or \.{\\vbox} or
 21052  \.{\\vtop} construction, the |package| routine comes into action. We might
 21053  also have to finish a paragraph that hasn't ended.
 21054  
 21055  @<Cases of |handle...@>=
 21056  hbox_group: package(0);
 21057  adjusted_hbox_group: begin adjust_tail:=adjust_head; package(0);
 21058    end;
 21059  vbox_group: begin end_graf; package(0);
 21060    end;
 21061  vtop_group: begin end_graf; package(vtop_code);
 21062    end;
 21063  
 21064  @ @<Declare action...@>=
 21065  procedure package(@!c:small_number);
 21066  var h:scaled; {height of box}
 21067  @!p:pointer; {first node in a box}
 21068  @!d:scaled; {max depth}
 21069  begin d:=box_max_depth; unsave; save_ptr:=save_ptr-3;
 21070  if mode=-hmode then cur_box:=hpack(link(head),saved(2),saved(1))
 21071  else  begin cur_box:=vpackage(link(head),saved(2),saved(1),d);
 21072    if c=vtop_code then @<Readjust the height and depth of |cur_box|,
 21073      for \.{\\vtop}@>;
 21074    end;
 21075  pop_nest; box_end(saved(0));
 21076  end;
 21077  
 21078  @ The height of a `\.{\\vtop}' box is inherited from the first item on its list,
 21079  if that item is an |hlist_node|, |vlist_node|, or |rule_node|; otherwise
 21080  the \.{\\vtop} height is zero.
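
For instance (the dimensions here are invented purely for illustration),
suppose |vpackage| returns a box of height 20\thinspace pt and depth
3\thinspace pt whose first item is an |hlist_node| of height 7\thinspace pt.
The readjustment leaves the sum of height and depth unchanged:
$$\hbox{new height}=7\,{\rm pt},\qquad
\hbox{new depth}=(3-7+20)\,{\rm pt}=16\,{\rm pt}.$$
If the first item were glue instead, the new height would be 0\thinspace pt
and the new depth 23\thinspace pt.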
 21081  
 21082  
 21083  @<Readjust the height...@>=
 21084  begin h:=0; p:=list_ptr(cur_box);
 21085  if p<>null then if type(p)<=rule_node then h:=height(p);
 21086  depth(cur_box):=depth(cur_box)-h+height(cur_box); height(cur_box):=h;
 21087  end
 21088  
 21089  @ A paragraph begins when horizontal-mode material occurs in vertical mode,
 21090  or when the paragraph is explicitly started by `\.{\\indent}' or
 21091  `\.{\\noindent}'.
 21092  
 21093  @<Put each...@>=
 21094  primitive("indent",start_par,1);
 21095  @!@:indent_}{\.{\\indent} primitive@>
 21096  primitive("noindent",start_par,0);
 21097  @!@:no_indent_}{\.{\\noindent} primitive@>
 21098  
 21099  @ @<Cases of |print_cmd_chr|...@>=
 21100  start_par: if chr_code=0 then print_esc("noindent")@+ else print_esc("indent");
 21101  
 21102  @ @<Cases of |main_control| that build...@>=
 21103  vmode+start_par: new_graf(cur_chr>0);
 21104  vmode+letter,vmode+other_char,vmode+char_num,vmode+char_given,
 21105     vmode+math_shift,vmode+un_hbox,vmode+vrule,
 21106     vmode+accent,vmode+discretionary,vmode+hskip,vmode+valign,
 21107     vmode+ex_space,vmode+no_boundary:@t@>@;@/
 21108    begin back_input; new_graf(true);
 21109    end;
 21110  
 21111  @ @<Declare act...@>=
 21112  function norm_min(@!h:integer):small_number;
 21113  begin if h<=0 then norm_min:=1@+else if h>=63 then norm_min:=63@+
 21114  else norm_min:=h;
 21115  end;
 21116  @#
 21117  procedure new_graf(@!indented:boolean);
 21118  begin prev_graf:=0;
 21119  if (mode=vmode)or(head<>tail) then
 21120    tail_append(new_param_glue(par_skip_code));
 21121  push_nest; mode:=hmode; space_factor:=1000; set_cur_lang; clang:=cur_lang;
 21122  prev_graf:=(norm_min(left_hyphen_min)*@'100+norm_min(right_hyphen_min))
 21123               *@'200000+cur_lang; {hyphen minima and language, packed together}
 21124  if indented then
 21125    begin tail:=new_null_box; link(head):=tail; width(tail):=par_indent;@+
 21126    end;
 21127  if every_par<>null then begin_token_list(every_par,every_par_text);
 21128  if nest_ptr=1 then build_page; {put |par_skip| glue on current page}
 21129  end;
 21130  
 21131  @ @<Cases of |main_control| that build...@>=
 21132  hmode+start_par,mmode+start_par: indent_in_hmode;
 21133  
 21134  @ @<Declare act...@>=
 21135  procedure indent_in_hmode;
 21136  var p,@!q:pointer;
 21137  begin if cur_chr>0 then {\.{\\indent}}
 21138    begin p:=new_null_box; width(p):=par_indent;
 21139    if abs(mode)=hmode then space_factor:=1000
 21140    else  begin q:=new_noad; math_type(nucleus(q)):=sub_box;
 21141      info(nucleus(q)):=p; p:=q;
 21142      end;
 21143    tail_append(p);
 21144    end;
 21145  end;
 21146  
 21147  @ A paragraph ends when a |par_end| command is sensed, or when we are in
 21148  horizontal mode when reaching the right brace of vertical-mode routines
 21149  like \.{\\vbox}, \.{\\insert}, or \.{\\output}.
 21150  
 21151  @<Cases of |main_control| that build...@>=
 21152  vmode+par_end: begin normal_paragraph;
 21153    if mode>0 then build_page;
 21154    end;
 21155  hmode+par_end: begin if align_state<0 then off_save; {this tries to
 21156      recover from an alignment that didn't end properly}
 21157    end_graf; {this takes us to the enclosing mode, if |mode>0|}
 21158    if mode=vmode then build_page;
 21159    end;
 21160  hmode+stop,hmode+vskip,hmode+hrule,hmode+un_vbox,hmode+halign: head_for_vmode;
 21161  
 21162  @ @<Declare act...@>=
 21163  procedure head_for_vmode;
 21164  begin if mode<0 then
 21165    if cur_cmd<>hrule then off_save
 21166    else  begin print_err("You can't use `");
 21167      print_esc("hrule"); print("' here except with leaders");
 21168  @.You can't use \\hrule...@>
 21169      help2("To put a horizontal rule in an hbox or an alignment,")@/
 21170        ("you should use \leaders or \hrulefill (see The TeXbook).");
 21171      error;
 21172      end
 21173  else  begin back_input; cur_tok:=par_token; back_input; token_type:=inserted;
 21174    end;
 21175  end;
 21176  
 21177  @ @<Declare act...@>=
 21178  procedure end_graf;
 21179  begin if mode=hmode then
 21180    begin if head=tail then pop_nest {null paragraphs are ignored}
 21181    else line_break(widow_penalty);
 21182    normal_paragraph;
 21183    error_count:=0;
 21184    end;
 21185  end;
 21186  
 21187  @ Insertion and adjustment and mark nodes are constructed by the following
 21188  pieces of the program.
 21189  
 21190  @<Cases of |main_control| that build...@>=
 21191  any_mode(insert),hmode+vadjust,mmode+vadjust: begin_insert_or_adjust;
 21192  any_mode(mark): make_mark;
 21193  
 21194  @ @<Forbidden...@>=
 21195  vmode+vadjust,
 21196  
 21197  @ @<Declare act...@>=
 21198  procedure begin_insert_or_adjust;
 21199  begin if cur_cmd=vadjust then cur_val:=255
 21200  else  begin scan_eight_bit_int;
 21201    if cur_val=255 then
 21202      begin print_err("You can't "); print_esc("insert"); print_int(255);
 21203  @.You can't \\insert255@>
 21204      help1("I'm changing to \insert0; box 255 is special.");
 21205      error; cur_val:=0;
 21206      end;
 21207    end;
 21208  saved(0):=cur_val; incr(save_ptr);
 21209  new_save_level(insert_group); scan_left_brace; normal_paragraph;
 21210  push_nest; mode:=-vmode; prev_depth:=ignore_depth;
 21211  end;
 21212  
 21213  @ @<Cases of |handle...@>=
 21214  insert_group: begin end_graf; q:=split_top_skip; add_glue_ref(q);
 21215    d:=split_max_depth; f:=floating_penalty; unsave; decr(save_ptr);
 21216    {now |saved(0)| is the insertion number, or 255 for |vadjust|}
 21217    p:=vpack(link(head),natural); pop_nest;
 21218    if saved(0)<255 then
 21219      begin tail_append(get_node(ins_node_size));
 21220      type(tail):=ins_node; subtype(tail):=qi(saved(0));
 21221      height(tail):=height(p)+depth(p); ins_ptr(tail):=list_ptr(p);
 21222      split_top_ptr(tail):=q; depth(tail):=d; float_cost(tail):=f;
 21223      end
 21224    else  begin tail_append(get_node(small_node_size));
 21225      type(tail):=adjust_node;@/
 21226      subtype(tail):=0; {the |subtype| is not used}
 21227      adjust_ptr(tail):=list_ptr(p); delete_glue_ref(q);
 21228      end;
 21229    free_node(p,box_node_size);
 21230    if nest_ptr=0 then build_page;
 21231    end;
 21232  output_group: @<Resume the page builder...@>;
 21233  
 21234  @ @<Declare act...@>=
 21235  procedure make_mark;
 21236  var p:pointer; {new node}
 21237  begin p:=scan_toks(false,true); p:=get_node(small_node_size);
 21238  type(p):=mark_node; subtype(p):=0; {the |subtype| is not used}
 21239  mark_ptr(p):=def_ref; link(tail):=p; tail:=p;
 21240  end;
 21241  
 21242  @ Penalty nodes get into a list via the |break_penalty| command.
 21243  @^penalties@>
 21244  
 21245  @<Cases of |main_control| that build...@>=
 21246  any_mode(break_penalty): append_penalty;
 21247  
 21248  @ @<Declare action...@>=
 21249  procedure append_penalty;
 21250  begin scan_int; tail_append(new_penalty(cur_val));
 21251  if mode=vmode then build_page;
 21252  end;
 21253  
 21254  @ The |remove_item| command removes a penalty, kern, or glue node if it
 21255  appears at the tail of the current list, using a brute-force linear scan.
 21256  Like \.{\\lastbox}, this command is not allowed in vertical mode (except
 21257  internal vertical mode), since the current list in vertical mode is sent
 21258  to the page builder.  But if we happen to be able to implement it in
 21259  vertical mode, we do.
 21260  
 21261  @<Cases of |main_control| that build...@>=
 21262  any_mode(remove_item): delete_last;
 21263  
 21264  @ When |delete_last| is called, |cur_chr| is the |type| of node that
 21265  will be deleted, if present.
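
The loop locates the predecessor |p| of |tail|, stepping over the
|replace_count| nodes that follow each discretionary. As an illustrative
case (not taken from any particular document), suppose the list ends with
a |disc_node| whose |replace_count| is~1, followed by a single |kern_node|
that is the |tail|. When |q| reaches the discretionary, the |for| loop
advances |p| once, so that |p=tail|; the procedure then returns without
removing anything, because that kern belongs to the discretionary.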
 21266  
 21267  @<Declare action...@>=
 21268  procedure delete_last;
 21269  label exit;
 21270  var @!p,@!q:pointer; {run through the current list}
 21271  @!m:quarterword; {the length of a replacement list}
 21272  begin if (mode=vmode)and(tail=head) then
 21273    @<Apologize for inability to do the operation now,
 21274      unless \.{\\unskip} follows non-glue@>
 21275  else  begin if not is_char_node(tail) then if type(tail)=cur_chr then
 21276      begin q:=head;
 21277      repeat p:=q;
 21278      if not is_char_node(q) then if type(q)=disc_node then
 21279        begin for m:=1 to replace_count(q) do p:=link(p);
 21280        if p=tail then return;
 21281        end;
 21282      q:=link(p);
 21283      until q=tail;
 21284      link(p):=null; flush_node_list(tail); tail:=p;
 21285      end;
 21286    end;
 21287  exit:end;
 21288  
 21289  @ @<Apologize for inability to do the operation...@>=
 21290  begin if (cur_chr<>glue_node)or(last_glue<>max_halfword) then
 21291    begin you_cant;
 21292    help2("Sorry...I usually can't take things from the current page.")@/
 21293      ("Try `I\vskip-\lastskip' instead.");
 21294    if cur_chr=kern_node then help_line[0]:=
 21295      ("Try `I\kern-\lastkern' instead.")
 21296    else if cur_chr<>glue_node then help_line[0]:=@|
 21297      ("Perhaps you can make the output routine do it.");
 21298    error;
 21299    end;
 21300  end
 21301  
 21302  @ @<Put each...@>=
 21303  primitive("unpenalty",remove_item,penalty_node);@/
 21304  @!@:un_penalty_}{\.{\\unpenalty} primitive@>
 21305  primitive("unkern",remove_item,kern_node);@/
 21306  @!@:un_kern_}{\.{\\unkern} primitive@>
 21307  primitive("unskip",remove_item,glue_node);@/
 21308  @!@:un_skip_}{\.{\\unskip} primitive@>
 21309  primitive("unhbox",un_hbox,box_code);@/
 21310  @!@:un_hbox_}{\.{\\unhbox} primitive@>
 21311  primitive("unhcopy",un_hbox,copy_code);@/
 21312  @!@:un_hcopy_}{\.{\\unhcopy} primitive@>
 21313  primitive("unvbox",un_vbox,box_code);@/
 21314  @!@:un_vbox_}{\.{\\unvbox} primitive@>
 21315  primitive("unvcopy",un_vbox,copy_code);@/
 21316  @!@:un_vcopy_}{\.{\\unvcopy} primitive@>
 21317  
 21318  @ @<Cases of |print_cmd_chr|...@>=
 21319  remove_item: if chr_code=glue_node then print_esc("unskip")
 21320    else if chr_code=kern_node then print_esc("unkern")
 21321    else print_esc("unpenalty");
 21322  un_hbox: if chr_code=copy_code then print_esc("unhcopy")
 21323    else print_esc("unhbox");
 21324  un_vbox: if chr_code=copy_code then print_esc("unvcopy")
 21325    else print_esc("unvbox");
 21326  
 21327  @ The |un_hbox| and |un_vbox| commands unwrap one of the 256 current boxes.
 21328  
 21329  @<Cases of |main_control| that build...@>=
 21330  vmode+un_vbox,hmode+un_hbox,mmode+un_hbox: unpackage;
 21331  
 21332  @ @<Declare act...@>=
 21333  procedure unpackage;
 21334  label exit;
 21335  var p:pointer; {the box}
 21336  @!c:box_code..copy_code; {should we copy?}
 21337  begin c:=cur_chr; scan_eight_bit_int; p:=box(cur_val);
 21338  if p=null then return;
 21339  if (abs(mode)=mmode)or((abs(mode)=vmode)and(type(p)<>vlist_node))or@|
 21340     ((abs(mode)=hmode)and(type(p)<>hlist_node)) then
 21341    begin print_err("Incompatible list can't be unboxed");
 21342  @.Incompatible list...@>
 21343    help3("Sorry, Pandora. (You sneaky devil.)")@/
 21344    ("I refuse to unbox an \hbox in vertical mode or vice versa.")@/
 21345    ("And I can't open any boxes in math mode.");@/
 21346    error; return;
 21347    end;
 21348  if c=copy_code then link(tail):=copy_node_list(list_ptr(p))
 21349  else  begin link(tail):=list_ptr(p); box(cur_val):=null;
 21350    free_node(p,box_node_size);
 21351    end;
 21352  while link(tail)<>null do tail:=link(tail);
 21353  exit:end;
 21354  
 21355  @ @<Forbidden...@>=vmode+ital_corr,
 21356  
 21357  @ Italic corrections are converted to kern nodes when the |ital_corr| command
 21358  follows a character. In math mode the same effect is achieved by appending
 21359  a kern of zero here, since italic corrections are supplied later.
 21360  
 21361  @<Cases of |main_control| that build...@>=
 21362  hmode+ital_corr: append_italic_correction;
 21363  mmode+ital_corr: tail_append(new_kern(0));
 21364  
 21365  @ @<Declare act...@>=
 21366  procedure append_italic_correction;
 21367  label exit;
 21368  var p:pointer; {|char_node| at the tail of the current list}
 21369  @!f:internal_font_number; {the font in the |char_node|}
 21370  begin if tail<>head then
 21371    begin if is_char_node(tail) then p:=tail
 21372    else if type(tail)=ligature_node then p:=lig_char(tail)
 21373    else return;
 21374    f:=font(p);
 21375    tail_append(new_kern(char_italic(f)(char_info(f)(character(p)))));
 21376    subtype(tail):=explicit;
 21377    end;
 21378  exit:end;
 21379  
 21380  @ Discretionary nodes are easy in the common case `\.{\\-}', but in the
 21381  general case we must process three braces full of items.
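
For example, when the current font's hyphen character lies in the range
0 to~255, `\.{\\-}' behaves like
`\.{\\discretionary\{\\char\\hyphenchar\\font\}\{\}\{\}}'; otherwise it
behaves like `\.{\\discretionary\{\}\{\}\{\}}', an empty discretionary.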
 21382  
 21383  @<Put each...@>=
 21384  primitive("-",discretionary,1);
 21385  @!@:Single-character primitives -}{\quad\.{\\-}@>
 21386  primitive("discretionary",discretionary,0);
 21387  @!@:discretionary_}{\.{\\discretionary} primitive@>
 21388  
 21389  @ @<Cases of |print_cmd_chr|...@>=
 21390  discretionary: if chr_code=1 then
 21391    print_esc("-")@+else print_esc("discretionary");
 21392  
 21393  @ @<Cases of |main_control| that build...@>=
 21394  hmode+discretionary,mmode+discretionary: append_discretionary;
 21395  
 21396  @ The space factor does not change when we append a discretionary node,
 21397  but it starts out as 1000 in the subsidiary lists.
 21398  
 21399  @<Declare act...@>=
 21400  procedure append_discretionary;
 21401  var c:integer; {hyphen character}
 21402  begin tail_append(new_disc);
 21403  if cur_chr=1 then
 21404    begin c:=hyphen_char[cur_font];
 21405    if c>=0 then if c<256 then pre_break(tail):=new_character(cur_font,c);
 21406    end
 21407  else  begin incr(save_ptr); saved(-1):=0; new_save_level(disc_group);
 21408    scan_left_brace; push_nest; mode:=-hmode; space_factor:=1000;
 21409    end;
 21410  end;
 21411  
 21412  @ The three discretionary lists are constructed somewhat as if they were
 21413  hboxes. A~subroutine called |build_discretionary| handles the transitions.
 21414  (This is sort of fun.)
 21415  
 21416  @<Cases of |handle...@>=
 21417  disc_group: build_discretionary;
 21418  
 21419  @ @<Declare act...@>=
 21420  procedure build_discretionary;
 21421  label done,exit;
 21422  var p,@!q:pointer; {for link manipulation}
 21423  @!n:integer; {length of discretionary list}
 21424  begin unsave;
 21425  @<Prune the current list, if necessary, until it contains only
 21426    |char_node|, |kern_node|, |hlist_node|, |vlist_node|, |rule_node|,
 21427    and |ligature_node| items; set |n| to the length of the list,
 21428    and set |q| to the list's tail@>;
 21429  p:=link(head); pop_nest;
 21430  case saved(-1) of
 21431  0:pre_break(tail):=p;
 21432  1:post_break(tail):=p;
 21433  2:@<Attach list |p| to the current list, and record its length;
 21434    then finish up and |return|@>;
 21435  end; {there are no other cases}
 21436  incr(saved(-1)); new_save_level(disc_group); scan_left_brace;
 21437  push_nest; mode:=-hmode; space_factor:=1000;
 21438  exit:end;
 21439  
 21440  @ @<Attach list |p| to the current...@>=
 21441  begin if (n>0)and(abs(mode)=mmode) then
 21442    begin print_err("Illegal math "); print_esc("discretionary");
 21443  @.Illegal math \\disc...@>
 21444    help2("Sorry: The third part of a discretionary break must be")@/
 21445    ("empty, in math formulas. I had to delete your third part.");
 21446    flush_node_list(p); n:=0; error;
 21447    end
 21448  else link(tail):=p;
 21449  if n<=max_quarterword then replace_count(tail):=n
 21450  else  begin print_err("Discretionary list is too long");
 21451  @.Discretionary list is too long@>
 21452    help2("Wow---I never thought anybody would tweak me here.")@/
 21453    ("You can't seriously need such a huge discretionary list?");
 21454    error;
 21455    end;
 21456  if n>0 then tail:=q;
 21457  decr(save_ptr); return;
 21458  end
 21459  
 21460  @ During this loop, |p=link(q)| and there are |n| items preceding |p|.
 21461  
 21462  @<Prune the current list, if necessary...@>=
 21463  q:=head; p:=link(q); n:=0;
 21464  while p<>null do
 21465    begin if not is_char_node(p) then if type(p)>rule_node then
 21466      if type(p)<>kern_node then if type(p)<>ligature_node then
 21467        begin print_err("Improper discretionary list");
 21468  @.Improper discretionary list@>
 21469        help1("Discretionary lists must contain only boxes and kerns.");@/
 21470        error;
 21471        begin_diagnostic;
 21472        print_nl("The following discretionary sublist has been deleted:");
 21473  @.The following...deleted@>
 21474        show_box(p);
 21475        end_diagnostic(true);
 21476        flush_node_list(p); link(q):=null; goto done;
 21477        end;
 21478    q:=p; p:=link(q); incr(n);
 21479    end;
 21480  done:
 21481  
 21482  @ We need only one more thing to complete the horizontal mode routines, namely
 21483  the \.{\\accent} primitive.
 21484  
 21485  @<Cases of |main_control| that build...@>=
 21486  hmode+accent: make_accent;
 21487  
 21488  @ The positioning of accents is straightforward but tedious. Given an accent
 21489  of width |a|, designed for characters of height |x| and slant |s|;
 21490  and given a character of width |w|, height |h|, and slant |t|: We will shift
 21491  the accent down by |x-h|, and we will insert kern nodes that have the effect of
 21492  centering the accent over the character and shifting the accent to the
 21493  right by $\delta={1\over2}(w-a)+h\cdot t-x\cdot s$.  If either character is
 21494  absent from the font, we will simply use the other, without shifting.
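
As a worked instance, with values invented only for illustration, take an
upright font ($s=t=0$), an accent of width $a=4\,{\rm pt}$, and a character
of width $w=6\,{\rm pt}$ whose height $h$ equals~$x$. Then no vertical shift
is needed, and
$$\delta={1\over2}(6-4)+0-0=1\,{\rm pt},\qquad -a-\delta=-5\,{\rm pt}.$$
The list receives a kern of $\delta$, the accent, and a kern of $-a-\delta$,
whose widths sum to $\delta+a+(-a-\delta)=0$; hence the accented character
starts at the same position it would occupy without the accent.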
 21495  
 21496  @<Declare act...@>=
 21497  procedure make_accent;
 21498  var s,@!t: real; {amount of slant}
 21499  @!p,@!q,@!r:pointer; {character, box, and kern nodes}
 21500  @!f:internal_font_number; {relevant font}
 21501  @!a,@!h,@!x,@!w,@!delta:scaled; {heights and widths, as explained above}
 21502  @!i:four_quarters; {character information}
 21503  begin scan_char_num; f:=cur_font; p:=new_character(f,cur_val);
 21504  if p<>null then
 21505    begin x:=x_height(f); s:=slant(f)/float_constant(65536);
 21506  @^real division@>
 21507    a:=char_width(f)(char_info(f)(character(p)));@/
 21508    do_assignments;@/
 21509    @<Create a character node |q| for the next character,
 21510      but set |q:=null| if problems arise@>;
 21511    if q<>null then @<Append the accent with appropriate kerns,
 21512        then set |p:=q|@>;
 21513    link(tail):=p; tail:=p; space_factor:=1000;
 21514    end;
 21515  end;
 21516  
 21517  @ @<Create a character node |q| for the next...@>=
 21518  q:=null; f:=cur_font;
 21519  if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given) then
 21520    q:=new_character(f,cur_chr)
 21521  else if cur_cmd=char_num then
 21522    begin scan_char_num; q:=new_character(f,cur_val);
 21523    end
 21524  else back_input
 21525  
 21526  @ The kern nodes appended here must be distinguished from other kerns, lest
 21527  they be wiped away by the hyphenation algorithm or by a previous line break.
 21528  
 21529  The two kerns are computed with (machine-dependent) |real| arithmetic, but
 21530  their sum is machine-independent; the net effect is machine-independent,
 21531  because the user cannot remove these nodes nor access them via \.{\\lastkern}.
 21532  
 21533  @<Append the accent with appropriate kerns...@>=
 21534  begin t:=slant(f)/float_constant(65536);
 21535  @^real division@>
 21536  i:=char_info(f)(character(q));
 21537  w:=char_width(f)(i); h:=char_height(f)(height_depth(i));
 21538  if h<>x then {the accent must be shifted up or down}
 21539    begin p:=hpack(p,natural); shift_amount(p):=x-h;
 21540    end;
 21541  delta:=round((w-a)/float_constant(2)+h*t-x*s);
 21542  @^real multiplication@>
 21543  @^real addition@>
 21544  r:=new_kern(delta); subtype(r):=acc_kern; link(tail):=r; link(r):=p;
 21545  tail:=new_kern(-a-delta); subtype(tail):=acc_kern; link(p):=tail; p:=q;
 21546  end
 21547  
 21548  @ When `\.{\\cr}' or `\.{\\span}' or a tab mark comes through the scanner
 21549  into |main_control|, it might be that the user has foolishly inserted
 21550  one of them into something that has nothing to do with alignment. But it is
 21551  far more likely that a left brace or right brace has been omitted, since
 21552  |get_next| takes actions appropriate to alignment only when `\.{\\cr}'
 21553  or `\.{\\span}' or tab marks occur with |align_state=0|. The following
 21554  program attempts to make an appropriate recovery.
 21555  
 21556  @<Cases of |main_control| that build...@>=
 21557  any_mode(car_ret), any_mode(tab_mark): align_error;
 21558  any_mode(no_align): no_align_error;
 21559  any_mode(omit): omit_error;
 21560  
 21561  @ @<Declare act...@>=
 21562  procedure align_error;
 21563  begin if abs(align_state)>2 then
 21564    @<Express consternation over the fact that no alignment is in progress@>
 21565  else  begin back_input;
 21566    if align_state<0 then
 21567      begin print_err("Missing { inserted");
 21568  @.Missing \{ inserted@>
 21569      incr(align_state); cur_tok:=left_brace_token+"{";
 21570      end
 21571    else  begin print_err("Missing } inserted");
 21572  @.Missing \} inserted@>
 21573      decr(align_state); cur_tok:=right_brace_token+"}";
 21574      end;
 21575    help3("I've put in what seems to be necessary to fix")@/
 21576      ("the current column of the current alignment.")@/
 21577      ("Try to go on, since this might almost work."); ins_error;
 21578    end;
 21579  end;
 21580  
 21581  @ @<Express consternation...@>=
 21582  begin print_err("Misplaced "); print_cmd_chr(cur_cmd,cur_chr);
 21583  @.Misplaced \&@>
 21584  @.Misplaced \\span@>
 21585  @.Misplaced \\cr@>
 21586  if cur_tok=tab_token+"&" then
 21587    begin help6("I can't figure out why you would want to use a tab mark")@/
 21588    ("here. If you just want an ampersand, the remedy is")@/
 21589    ("simple: Just type `I\&' now. But if some right brace")@/
 21590    ("up above has ended a previous alignment prematurely,")@/
 21591    ("you're probably due for more error messages, and you")@/
 21592    ("might try typing `S' now just to see what is salvageable.");
 21593    end
 21594  else  begin help5("I can't figure out why you would want to use a tab mark")@/
 21595    ("or \cr or \span just now. If something like a right brace")@/
 21596    ("up above has ended a previous alignment prematurely,")@/
 21597    ("you're probably due for more error messages, and you")@/
 21598    ("might try typing `S' now just to see what is salvageable.");
 21599    end;
 21600  error;
 21601  end
 21602  
 21603  @ The help messages here contain a little white lie, since \.{\\noalign}
 21604  and \.{\\omit} are allowed also after `\.{\\noalign\{...\}}'.
 21605  
 21606  @<Declare act...@>=
 21607  procedure no_align_error;
 21608  begin print_err("Misplaced "); print_esc("noalign");
 21609  @.Misplaced \\noalign@>
 21610  help2("I expect to see \noalign only after the \cr of")@/
 21611    ("an alignment. Proceed, and I'll ignore this case."); error;
 21612  end;
 21613  procedure omit_error;
 21614  begin print_err("Misplaced "); print_esc("omit");
 21615  @.Misplaced \\omit@>
 21616  help2("I expect to see \omit only after tab marks or the \cr of")@/
 21617    ("an alignment. Proceed, and I'll ignore this case."); error;
 21618  end;
 21619  
 21620  @ We've now covered most of the abuses of \.{\\halign} and \.{\\valign}.
 21621  Let's take a look at what happens when they are used correctly.
 21622  
 21623  @<Cases of |main_control| that build...@>=
 21624  vmode+halign,hmode+valign:init_align;
 21625  mmode+halign: if privileged then
 21626    if cur_group=math_shift_group then init_align
 21627    else off_save;
 21628  vmode+endv,hmode+endv: do_endv;
 21629  
 21630  @ An |align_group| code is supposed to remain on the |save_stack|
 21631  during an entire alignment, until |fin_align| removes it.
 21632  
 21633  A devious user might force an |endv| command to occur just about anywhere;
 21634  we must defeat such hacks.
 21635  
 21636  @<Declare act...@>=
 21637  procedure do_endv;
 21638  begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input;
 21639  while (input_stack[base_ptr].index_field<>v_template) and
 21640        (input_stack[base_ptr].loc_field=null) and
 21641        (input_stack[base_ptr].state_field=token_list) do decr(base_ptr);
 21642  if (input_stack[base_ptr].index_field<>v_template) or
 21643        (input_stack[base_ptr].loc_field<>null) or
 21644        (input_stack[base_ptr].state_field<>token_list) then
 21645    fatal_error("(interwoven alignment preambles are not allowed)");
 21646  @.interwoven alignment preambles...@>
 21647   if cur_group=align_group then
 21648    begin end_graf;
 21649    if fin_col then fin_row;
 21650    end
 21651  else off_save;
 21652  end;
 21653  
 21654  @ @<Cases of |handle_right_brace|...@>=
 21655  align_group: begin back_input; cur_tok:=cs_token_flag+frozen_cr;
 21656    print_err("Missing "); print_esc("cr"); print(" inserted");
 21657  @.Missing \\cr inserted@>
 21658    help1("I'm guessing that you meant to end an alignment here.");
 21659    ins_error;
 21660    end;
 21661  
 21662  @ @<Cases of |handle_right_brace|...@>=
 21663  no_align_group: begin end_graf; unsave; align_peek;
 21664    end;
 21665  
 21666  @ Finally, \.{\\endcsname} is not supposed to get through to |main_control|.
 21667  
 21668  @<Cases of |main_control| that build...@>=
 21669  any_mode(end_cs_name): cs_error;
 21670  
 21671  @ @<Declare act...@>=
 21672  procedure cs_error;
 21673  begin print_err("Extra "); print_esc("endcsname");
 21674  @.Extra \\endcsname@>
 21675  help1("I'm ignoring this, since I wasn't doing a \csname.");
 21676  error;
 21677  end;
 21678  
 21679  @* \[48] Building math lists.
 21680  The routines that \TeX\ uses to create mlists are similar to those we have
 21681  just seen for the generation of hlists and vlists. But it is necessary to
 21682  make ``noads'' as well as nodes, so the reader should review the
 21683  discussion of math mode data structures before trying to make sense out of
 21684  the following program.
 21685  
 21686  Here is a little routine that needs to be done whenever a subformula
 21687  is about to be processed. The parameter is a code like |math_group|.
 21688  
 21689  @<Declare act...@>=
 21690  procedure push_math(@!c:group_code);
 21691  begin push_nest; mode:=-mmode; incompleat_noad:=null; new_save_level(c);
 21692  end;
 21693  
 21694  @ We get into math mode from horizontal mode when a `\.\$' (i.e., a
 21695  |math_shift| character) is scanned. We must check to see whether this
 21696  `\.\$' is immediately followed by another, in case display math mode is
 21697  called for.
 21698  
 21699  @<Cases of |main_control| that build...@>=
 21700  hmode+math_shift:init_math;
 21701  
 21702  @ @<Declare act...@>=
 21703  procedure init_math;
 21704  label reswitch,found,not_found,done;
 21705  var w:scaled; {new or partial |pre_display_size|}
 21706  @!l:scaled; {new |display_width|}
 21707  @!s:scaled; {new |display_indent|}
 21708  @!p:pointer; {current node when calculating |pre_display_size|}
 21709  @!q:pointer; {glue specification when calculating |pre_display_size|}
 21710  @!f:internal_font_number; {font in current |char_node|}
 21711  @!n:integer; {scope of paragraph shape specification}
 21712  @!v:scaled; {|w| plus possible glue amount}
 21713  @!d:scaled; {increment to |v|}
 21714  begin get_token; {|get_x_token| would fail on \.{\\ifmmode}\thinspace!}
 21715  if (cur_cmd=math_shift)and(mode>0) then @<Go into display math mode@>
 21716  else  begin back_input; @<Go into ordinary math mode@>;
 21717    end;
 21718  end;
 21719  
 21720  @ @<Go into ordinary math mode@>=
 21721  begin push_math(math_shift_group); eq_word_define(int_base+cur_fam_code,-1);
 21722  if every_math<>null then begin_token_list(every_math,every_math_text);
 21723  end
 21724  
 21725  @ We get into ordinary math mode from display math mode when `\.{\\eqno}' or
 21726  `\.{\\leqno}' appears. In such cases |cur_chr| will be 0 or~1, respectively;
 21727  the value of |cur_chr| is placed onto |save_stack| for safe keeping.
 21728  
 21729  @<Cases of |main_control| that build...@>=
 21730  mmode+eq_no: if privileged then
 21731    if cur_group=math_shift_group then start_eq_no
 21732    else off_save;
 21733  
 21734  @ @<Put each...@>=
 21735  primitive("eqno",eq_no,0);
 21736  @!@:eq_no_}{\.{\\eqno} primitive@>
 21737  primitive("leqno",eq_no,1);
 21738  @!@:leq_no_}{\.{\\leqno} primitive@>
 21739  
 21740  @ When \TeX\ is in display math mode, |cur_group=math_shift_group|,
 21741  so it is not necessary for the |start_eq_no| procedure to test for
 21742  this condition.
 21743  
 21744  @<Declare act...@>=
 21745  procedure start_eq_no;
 21746  begin saved(0):=cur_chr; incr(save_ptr);
 21747  @<Go into ordinary math mode@>;
 21748  end;
 21749  
 21750  @ @<Cases of |print_cmd_chr|...@>=
 21751  eq_no:if chr_code=1 then print_esc("leqno")@+else print_esc("eqno");
 21752  
 21753  @ @<Forbidden...@>=non_math(eq_no),
 21754  
 21755  @ When we enter display math mode, we need to call |line_break| to
 21756  process the partial paragraph that has just been interrupted by the
 21757  display. Then we can set the proper values of |display_width| and
 21758  |display_indent| and |pre_display_size|.
 21759  
 21760  @<Go into display math mode@>=
 21761  begin if head=tail then {`\.{\\noindent\$\$}' or `\.{\$\${ }\$\$}'}
 21762    begin pop_nest; w:=-max_dimen;
 21763    end
 21764  else  begin line_break(display_widow_penalty);@/
 21765    @<Calculate the natural width, |w|, by which the characters of the
 21766      final line extend to the right of the reference point,
 21767      plus two ems; or set |w:=max_dimen| if the non-blank information
 21768      on that line is affected by stretching or shrinking@>;
 21769    end;
 21770  {now we are in vertical mode, working on the list that will contain the display}
 21771  @<Calculate the length, |l|, and the shift amount, |s|, of the display lines@>;
 21772  push_math(math_shift_group); mode:=mmode;
 21773  eq_word_define(int_base+cur_fam_code,-1);@/
 21774  eq_word_define(dimen_base+pre_display_size_code,w);
 21775  eq_word_define(dimen_base+display_width_code,l);
 21776  eq_word_define(dimen_base+display_indent_code,s);
 21777  if every_display<>null then begin_token_list(every_display,every_display_text);
 21778  if nest_ptr=1 then build_page;
 21779  end
 21780  
 21781  @ @<Calculate the natural width, |w|, by which...@>=
 21782  v:=shift_amount(just_box)+2*quad(cur_font); w:=-max_dimen;
 21783  p:=list_ptr(just_box);
 21784  while p<>null do
 21785    begin @<Let |d| be the natural width of node |p|;
 21786      if the node is ``visible,'' |goto found|;
 21787      if the node is glue that stretches or shrinks, set |v:=max_dimen|@>;
 21788    if v<max_dimen then v:=v+d;
 21789    goto not_found;
 21790    found: if v<max_dimen then
 21791      begin v:=v+d; w:=v;
 21792      end
 21793    else  begin w:=max_dimen; goto done;
 21794      end;
 21795    not_found: p:=link(p);
 21796    end;
 21797  done:
 21798  
 21799  @ @<Let |d| be the natural width of node |p|...@>=
 21800  reswitch: if is_char_node(p) then
 21801    begin f:=font(p); d:=char_width(f)(char_info(f)(character(p)));
 21802    goto found;
 21803    end;
 21804  case type(p) of
 21805  hlist_node,vlist_node,rule_node: begin d:=width(p); goto found;
 21806    end;
 21807  ligature_node:@<Make node |p| look like a |char_node|...@>;
 21808  kern_node,math_node: d:=width(p);
 21809  glue_node:@<Let |d| be the natural width of this glue; if stretching
 21810    or shrinking, set |v:=max_dimen|; |goto found| in the case of leaders@>;
 21811  whatsit_node: @<Let |d| be the width of the whatsit |p|@>;
 21812  othercases d:=0
 21813  endcases
 21814  
 21815  @ We need to be careful that |w|, |v|, and |d| do not depend on any |glue_set|
 21816  values, since such values are subject to system-dependent rounding.
 21817  System-dependent numbers are not allowed to infiltrate parameters like
 21818  |pre_display_size|, since \TeX82 is supposed to make the same decisions on all
 21819  machines.
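
As a rough illustration only, the scan over the final line can be modeled
by the Python sketch below. The dictionary-based node layout, the field names,
and the function name |pre_display_width| are assumptions made for this
example; they are not part of \TeX. The sketch uses natural widths only, so
no |glue_set| value enters the result.

```python
MAX_DIMEN = 2**30 - 1  # same value as TeX's max_dimen


def pre_display_width(nodes, start, sign=None, order=0):
    """Model of the loop that computes w for pre_display_size.

    nodes: list of dicts with a "kind" and an optional natural "width"
    start: shift_amount(just_box) plus two ems, in scaled points
    sign, order: how the line's glue was set ("stretch", "shrink", or None)
    """
    v = start
    w = -MAX_DIMEN
    for node in nodes:
        kind = node["kind"]
        d = node.get("width", 0)
        if kind in ("glue", "leaders"):
            # Only the natural width is used; a nonzero amount of the
            # active kind and order makes everything after it uncertain.
            if sign in ("stretch", "shrink"):
                same_order = node.get(sign + "_order", 0) == order
                if same_order and node.get(sign, 0) != 0:
                    v = MAX_DIMEN
        if kind in ("char", "box", "rule", "leaders"):  # visible material
            if v >= MAX_DIMEN:
                return MAX_DIMEN
            v += d
            w = v
        elif v < MAX_DIMEN:
            v += d
    return w
```

A line whose only active glue is the final fill glue keeps a finite width,
while a line with active interword glue before visible material yields
|max_dimen|.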
 21820  
 21821  @<Let |d| be the natural width of this glue...@>=
 21822  begin q:=glue_ptr(p); d:=width(q);
 21823  if glue_sign(just_box)=stretching then
 21824    begin if (glue_order(just_box)=stretch_order(q))and@|
 21825       (stretch(q)<>0) then
 21826      v:=max_dimen;
 21827    end
 21828  else if glue_sign(just_box)=shrinking then
 21829    begin if (glue_order(just_box)=shrink_order(q))and@|
 21830       (shrink(q)<>0) then
 21831      v:=max_dimen;
 21832    end;
 21833  if subtype(p)>=a_leaders then goto found;
 21834  end
 21835  
 21836  @ A displayed equation is considered to be three lines long, so we
 21837  calculate the length and offset of line number |prev_graf+2|.
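
The following Python sketch models this calculation under simplifying
assumptions: the paragraph shape is passed as a list of (indent, length)
pairs instead of being read from |mem|, and the function name
|display_line_shape| and its parameter names are invented for the example.

```python
def display_line_shape(prev_graf, hsize, hang_indent=0, hang_after=1,
                       par_shape=None):
    """Return (l, s): the line length and indentation used for a display."""
    if par_shape:  # list of (indent, length) pairs, one per line
        n = len(par_shape)
        k = min(prev_graf + 2, n)  # lines past the end reuse the last pair
        s, l = par_shape[k - 1]
        return l, s
    hanging = hang_indent != 0 and (
        (hang_after >= 0 and prev_graf + 2 > hang_after)
        or (prev_graf + 1 < -hang_after)
    )
    if not hanging:
        return hsize, 0
    # A negative hang_indent shortens the line at the right, so s stays 0.
    return hsize - abs(hang_indent), max(hang_indent, 0)
```

For instance, with |hang_indent| positive and |hang_after=2|, a display that
follows one line of text is already subject to the hanging indentation,
because it is treated as line 3.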
 21838  
 21839  @<Calculate the length, |l|, ...@>=
 21840  if par_shape_ptr=null then
 21841    if (hang_indent<>0)and@|
 21842     (((hang_after>=0)and(prev_graf+2>hang_after))or@|
 21843      (prev_graf+1<-hang_after)) then
 21844      begin l:=hsize-abs(hang_indent);
 21845      if hang_indent>0 then s:=hang_indent@+else s:=0;
 21846      end
 21847    else  begin l:=hsize; s:=0;
 21848      end
 21849  else  begin n:=info(par_shape_ptr);
 21850    if prev_graf+2>=n then p:=par_shape_ptr+2*n
 21851    else p:=par_shape_ptr+2*(prev_graf+2);
 21852    s:=mem[p-1].sc; l:=mem[p].sc;
 21853    end
 21854  
 21855  @ Subformulas of math formulas cause a new level of math mode to be entered,
 21856  on the semantic nest as well as the save stack. These subformulas arise in
 21857  several ways: (1)~A left brace by itself indicates the beginning of a
 21858  subformula that will be put into a box, thereby freezing its glue and
 21859  preventing line breaks. (2)~A subscript or superscript is treated as a
 21860  subformula if it is not a single character; the same applies to
 21861  the nucleus of things like \.{\\underline}. (3)~The \.{\\left} primitive
 21862  initiates a subformula that will be terminated by a matching \.{\\right}.
 21863  The group codes placed on |save_stack| in these three cases are
 21864  |math_group|, |math_group|, and |math_left_group|, respectively.
 21865  
 21866  Here is the code that handles case (1); the other cases are not quite as
 21867  trivial, so we shall consider them later.
 21868  
 21869  @<Cases of |main_control| that build...@>=
 21870  mmode+left_brace: begin tail_append(new_noad);
 21871    back_input; scan_math(nucleus(tail));
 21872    end;
 21873  
 21874  @ Recall that the |nucleus|, |subscr|, and |supscr| fields in a noad are
 21875  broken down into subfields called |math_type| and either |info| or
 21876  |(fam,character)|. The job of |scan_math| is to figure out what to place
 21877  in one of these principal fields; it looks at the subformula that
 21878  comes next in the input, and places an encoding of that subformula
 21879  into a given word of |mem|.
 21880  
 21881  @d fam_in_range==((cur_fam>=0)and(cur_fam<16))
 21882  
 21883  @<Declare act...@>=
 21884  procedure scan_math(@!p:pointer);
 21885  label restart,reswitch,exit;
 21886  var c:integer; {math character code}
 21887  begin restart:@<Get the next non-blank non-relax...@>;
 21888  reswitch:case cur_cmd of
 21889  letter,other_char,char_given: begin c:=ho(math_code(cur_chr));
 21890      if c=@'100000 then
 21891        begin @<Treat |cur_chr| as an active character@>;
 21892        goto restart;
 21893        end;
 21894      end;
 21895  char_num: begin scan_char_num; cur_chr:=cur_val; cur_cmd:=char_given;
 21896    goto reswitch;
 21897    end;
 21898  math_char_num: begin scan_fifteen_bit_int; c:=cur_val;
 21899    end;
 21900  math_given: c:=cur_chr;
 21901  delim_num: begin scan_twenty_seven_bit_int; c:=cur_val div @'10000;
 21902    end;
 21903  othercases @<Scan a subformula enclosed in braces and |return|@>
 21904  endcases;@/
 21905  math_type(p):=math_char; character(p):=qi(c mod 256);
 21906  if (c>=var_code)and fam_in_range then fam(p):=cur_fam
 21907  else fam(p):=(c div 256) mod 16;
 21908  exit:end;
 21909  
 21910  @ An active character that is an |outer_call| is allowed here.
 21911  
 21912  @<Treat |cur_chr|...@>=
 21913  begin cur_cs:=cur_chr+active_base;
 21914  cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs);
 21915  x_token; back_input;
 21916  end
 21917  
 21918  @ The pointer |p| is placed on |save_stack| while a complex subformula
 21919  is being scanned.
 21920  
 21921  @<Scan a subformula...@>=
 21922  begin back_input; scan_left_brace;@/
 21923  saved(0):=p; incr(save_ptr); push_math(math_group); return;
 21924  end
 21925  
 21926  @ The simplest math formula is, of course, `\.{\${ }\$}', when no noads are
 21927  generated. The next simplest cases involve a single character, e.g.,
 21928  `\.{\$x\$}'. Even though such cases may not seem to be very interesting,
 21929  the reader can perhaps understand how happy the author was when `\.{\$x\$}'
 21930  was first properly typeset by \TeX. The code in this section was used.
 21931  @^Knuth, Donald Ervin@>
 21932  
 21933  @<Cases of |main_control| that build...@>=
 21934  mmode+letter,mmode+other_char,mmode+char_given:
 21935    set_math_char(ho(math_code(cur_chr)));
 21936  mmode+char_num: begin scan_char_num; cur_chr:=cur_val;
 21937    set_math_char(ho(math_code(cur_chr)));
 21938    end;
 21939  mmode+math_char_num: begin scan_fifteen_bit_int; set_math_char(cur_val);
 21940    end;
 21941  mmode+math_given: set_math_char(cur_chr);
 21942  mmode+delim_num: begin scan_twenty_seven_bit_int;
 21943    set_math_char(cur_val div @'10000);
 21944    end;
 21945  
 21946  @ The |set_math_char| procedure creates a new noad appropriate to a given
 21947  math code, and appends it to the current mlist. However, if the math code
 21948  is sufficiently large, the |cur_chr| is treated as an active character and
 21949  nothing is appended.
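
The decoding of a math code can be illustrated by the Python sketch below.
It is a model, not a transcription: the function name |classify_math_code|
and the convention of returning |None| for the active-character case are
assumptions of the example, and the class is reported as an offset from
|ord_noad| (0 for Ord, 1 for Op, 2 for Bin, and so on).

```python
VAR_CODE = 0o70000      # class 7: "use the current family"
ACTIVE_CODE = 0o100000  # math code "8000: treat as an active character


def classify_math_code(c, cur_fam=-1):
    """Split a math code into (class offset, family, character).

    Returns None when the code asks for active-character treatment.
    """
    if c >= ACTIVE_CODE:
        return None
    char = c % 256
    fam = (c // 256) % 16
    if c >= VAR_CODE:
        if 0 <= cur_fam < 16:  # the test called fam_in_range
            fam = cur_fam
        noad_class = 0  # variable-family characters become Ord noads
    else:
        noad_class = c // 0o10000
    return noad_class, fam, char
```

With the math code |"7178| the family is 1 unless the current family is in
range, in which case the current family wins.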
 21950  
 21951  @<Declare act...@>=
 21952  procedure set_math_char(@!c:integer);
 21953  var p:pointer; {the new noad}
 21954  begin if c>=@'100000 then
 21955    @<Treat |cur_chr|...@>
 21956  else  begin p:=new_noad; math_type(nucleus(p)):=math_char;
 21957    character(nucleus(p)):=qi(c mod 256);
 21958    fam(nucleus(p)):=(c div 256) mod 16;
 21959    if c>=var_code then
 21960      begin if fam_in_range then fam(nucleus(p)):=cur_fam;
 21961      type(p):=ord_noad;
 21962      end
 21963    else  type(p):=ord_noad+(c div @'10000);
 21964    link(tail):=p; tail:=p;
 21965    end;
 21966  end;
 21967  
 21968  @ Primitive math operators like \.{\\mathop} and \.{\\underline} are given
 21969  the command code |math_comp|, supplemented by the noad type that they
 21970  generate.
 21971  
 21972  @<Put each...@>=
 21973  primitive("mathord",math_comp,ord_noad);
 21974  @!@:math_ord_}{\.{\\mathord} primitive@>
 21975  primitive("mathop",math_comp,op_noad);
 21976  @!@:math_op_}{\.{\\mathop} primitive@>
 21977  primitive("mathbin",math_comp,bin_noad);
 21978  @!@:math_bin_}{\.{\\mathbin} primitive@>
 21979  primitive("mathrel",math_comp,rel_noad);
 21980  @!@:math_rel_}{\.{\\mathrel} primitive@>
 21981  primitive("mathopen",math_comp,open_noad);
 21982  @!@:math_open_}{\.{\\mathopen} primitive@>
 21983  primitive("mathclose",math_comp,close_noad);
 21984  @!@:math_close_}{\.{\\mathclose} primitive@>
 21985  primitive("mathpunct",math_comp,punct_noad);
 21986  @!@:math_punct_}{\.{\\mathpunct} primitive@>
 21987  primitive("mathinner",math_comp,inner_noad);
 21988  @!@:math_inner_}{\.{\\mathinner} primitive@>
 21989  primitive("underline",math_comp,under_noad);
 21990  @!@:underline_}{\.{\\underline} primitive@>
 21991  primitive("overline",math_comp,over_noad);@/
 21992  @!@:overline_}{\.{\\overline} primitive@>
 21993  primitive("displaylimits",limit_switch,normal);
 21994  @!@:display_limits_}{\.{\\displaylimits} primitive@>
 21995  primitive("limits",limit_switch,limits);
 21996  @!@:limits_}{\.{\\limits} primitive@>
 21997  primitive("nolimits",limit_switch,no_limits);
 21998  @!@:no_limits_}{\.{\\nolimits} primitive@>
 21999  
 22000  @ @<Cases of |print_cmd_chr|...@>=
 22001  math_comp: case chr_code of
 22002    ord_noad: print_esc("mathord");
 22003    op_noad: print_esc("mathop");
 22004    bin_noad: print_esc("mathbin");
 22005    rel_noad: print_esc("mathrel");
 22006    open_noad: print_esc("mathopen");
 22007    close_noad: print_esc("mathclose");
 22008    punct_noad: print_esc("mathpunct");
 22009    inner_noad: print_esc("mathinner");
 22010    under_noad: print_esc("underline");
 22011    othercases print_esc("overline")
 22012    endcases;
 22013  limit_switch: if chr_code=limits then print_esc("limits")
 22014    else if chr_code=no_limits then print_esc("nolimits")
 22015    else print_esc("displaylimits");
 22016  
 22017  @ @<Cases of |main_control| that build...@>=
 22018  mmode+math_comp: begin tail_append(new_noad);
 22019    type(tail):=cur_chr; scan_math(nucleus(tail));
 22020    end;
 22021  mmode+limit_switch: math_limit_switch;
 22022  
 22023  @ @<Declare act...@>=
 22024  procedure math_limit_switch;
 22025  label exit;
 22026  begin if head<>tail then if type(tail)=op_noad then
 22027    begin subtype(tail):=cur_chr; return;
 22028    end;
 22029  print_err("Limit controls must follow a math operator");
 22030  @.Limit controls must follow...@>
 22031  help1("I'm ignoring this misplaced \limits or \nolimits command."); error;
 22032  exit:end;
 22033  
 22034  @ Delimiter fields of noads are filled in by the |scan_delimiter| routine.
 22035  The first parameter of this procedure is the |mem| address where the
 22036  delimiter is to be placed; the second tells if this delimiter follows
 22037  \.{\\radical} or not.
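
The four assignments at the end of |scan_delimiter| unpack a delimiter code
into a small variant and a large variant. A minimal Python sketch of that
unpacking follows; the function name |split_delimiter_code| and the tuple
layout of the result are assumptions of the example. Any class digit above
the low 24 bits is discarded, as in the procedure itself.

```python
def split_delimiter_code(v):
    """Return ((small_fam, small_char), (large_fam, large_char))."""
    small_fam = (v // 0o4000000) % 16   # bits 20..23
    small_char = (v // 0o10000) % 256   # bits 12..19
    large_fam = (v // 256) % 16         # bits 8..11
    large_char = v % 256                # bits 0..7
    return (small_fam, small_char), (large_fam, large_char)
```

For the sample value |"028300| this gives character |"28| of family 0 as the
small variant and character |"00| of family 3 as the large variant.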
 22038  
 22039  @<Declare act...@>=
 22040  procedure scan_delimiter(@!p:pointer;@!r:boolean);
 22041  begin if r then scan_twenty_seven_bit_int
 22042  else  begin @<Get the next non-blank non-relax...@>;
 22043    case cur_cmd of
 22044    letter,other_char: cur_val:=del_code(cur_chr);
 22045    delim_num: scan_twenty_seven_bit_int;
 22046    othercases cur_val:=-1
 22047    endcases;
 22048    end;
 22049  if cur_val<0 then @<Report that an invalid delimiter code is being changed
 22050     to null; set~|cur_val:=0|@>;
 22051  small_fam(p):=(cur_val div @'4000000) mod 16;
 22052  small_char(p):=qi((cur_val div @'10000) mod 256);
 22053  large_fam(p):=(cur_val div 256) mod 16;
 22054  large_char(p):=qi(cur_val mod 256);
 22055  end;
 22056  
 22057  @ @<Report that an invalid delimiter...@>=
 22058  begin print_err("Missing delimiter (. inserted)");
 22059  @.Missing delimiter...@>
 22060  help6("I was expecting to see something like `(' or `\{' or")@/
 22061    ("`\}' here. If you typed, e.g., `{' instead of `\{', you")@/
 22062    ("should probably delete the `{' by typing `1' now, so that")@/
 22063    ("braces don't get unbalanced. Otherwise just proceed.")@/
 22064    ("Acceptable delimiters are characters whose \delcode is")@/
 22065    ("nonnegative, or you can use `\delimiter <delimiter code>'.");
 22066  back_error; cur_val:=0;
 22067  end
 22068  
 22069  @ @<Cases of |main_control| that build...@>=
 22070  mmode+radical:math_radical;
 22071  
 22072  @ @<Declare act...@>=
 22073  procedure math_radical;
 22074  begin tail_append(get_node(radical_noad_size));
 22075  type(tail):=radical_noad; subtype(tail):=normal;
 22076  mem[nucleus(tail)].hh:=empty_field;
 22077  mem[subscr(tail)].hh:=empty_field;
 22078  mem[supscr(tail)].hh:=empty_field;
 22079  scan_delimiter(left_delimiter(tail),true); scan_math(nucleus(tail));
 22080  end;
 22081  
 22082  @ @<Cases of |main_control| that build...@>=
 22083  mmode+accent,mmode+math_accent:math_ac;
 22084  
 22085  @ @<Declare act...@>=
 22086  procedure math_ac;
 22087  begin if cur_cmd=accent then
 22088    @<Complain that the user should have said \.{\\mathaccent}@>;
 22089  tail_append(get_node(accent_noad_size));
 22090  type(tail):=accent_noad; subtype(tail):=normal;
 22091  mem[nucleus(tail)].hh:=empty_field;
 22092  mem[subscr(tail)].hh:=empty_field;
 22093  mem[supscr(tail)].hh:=empty_field;
 22094  math_type(accent_chr(tail)):=math_char;
 22095  scan_fifteen_bit_int;
 22096  character(accent_chr(tail)):=qi(cur_val mod 256);
 22097  if (cur_val>=var_code)and fam_in_range then fam(accent_chr(tail)):=cur_fam
 22098  else fam(accent_chr(tail)):=(cur_val div 256) mod 16;
 22099  scan_math(nucleus(tail));
 22100  end;
 22101  
 22102  @ @<Complain that the user should have said \.{\\mathaccent}@>=
 22103  begin print_err("Please use "); print_esc("mathaccent");
 22104  print(" for accents in math mode");
 22105  @.Please use \\mathaccent...@>
 22106  help2("I'm changing \accent to \mathaccent here; wish me luck.")@/
 22107    ("(Accents are not the same in formulas as they are in text.)");
 22108  error;
 22109  end
 22110  
 22111  @ @<Cases of |main_control| that build...@>=
 22112  mmode+vcenter: begin scan_spec(vcenter_group,false); normal_paragraph;
 22113    push_nest; mode:=-vmode; prev_depth:=ignore_depth;
 22114    if every_vbox<>null then begin_token_list(every_vbox,every_vbox_text);
 22115    end;
 22116  
 22117  @ @<Cases of |handle...@>=
 22118  vcenter_group: begin end_graf; unsave; save_ptr:=save_ptr-2;
 22119    p:=vpack(link(head),saved(1),saved(0)); pop_nest;
 22120    tail_append(new_noad); type(tail):=vcenter_noad;
 22121    math_type(nucleus(tail)):=sub_box; info(nucleus(tail)):=p;
 22122    end;
 22123  
 22124  @ The routine that inserts a |style_node| holds no surprises.
 22125  
 22126  @<Put each...@>=
 22127  primitive("displaystyle",math_style,display_style);
 22128  @!@:display_style_}{\.{\\displaystyle} primitive@>
 22129  primitive("textstyle",math_style,text_style);
 22130  @!@:text_style_}{\.{\\textstyle} primitive@>
 22131  primitive("scriptstyle",math_style,script_style);
 22132  @!@:script_style_}{\.{\\scriptstyle} primitive@>
 22133  primitive("scriptscriptstyle",math_style,script_script_style);
 22134  @!@:script_script_style_}{\.{\\scriptscriptstyle} primitive@>
 22135  
 22136  @ @<Cases of |print_cmd_chr|...@>=
 22137  math_style: print_style(chr_code);
 22138  
 22139  @ @<Cases of |main_control| that build...@>=
 22140  mmode+math_style: tail_append(new_style(cur_chr));
 22141  mmode+non_script: begin tail_append(new_glue(zero_glue));
 22142    subtype(tail):=cond_math_glue;
 22143    end;
 22144  mmode+math_choice: append_choices;
 22145  
 22146  @ The routine that scans the four mlists of a \.{\\mathchoice} is very
 22147  much like the routine that builds discretionary nodes.
 22148  
 22149  @<Declare act...@>=
 22150  procedure append_choices;
 22151  begin tail_append(new_choice); incr(save_ptr); saved(-1):=0;
 22152  push_math(math_choice_group); scan_left_brace;
 22153  end;
 22154  
 22155  @ @<Cases of |handle_right_brace|...@>=
 22156  math_choice_group: build_choices;
 22157  
 22158  @ @<Declare act...@>=
 22159  @t\4@>@<Declare the function called |fin_mlist|@>@t@>@;@/
 22160  procedure build_choices;
 22161  label exit;
 22162  var p:pointer; {the current mlist}
 22163  begin unsave; p:=fin_mlist(null);
 22164  case saved(-1) of
 22165  0:display_mlist(tail):=p;
 22166  1:text_mlist(tail):=p;
 22167  2:script_mlist(tail):=p;
 22168  3:begin script_script_mlist(tail):=p; decr(save_ptr); return;
 22169    end;
 22170  end; {there are no other cases}
 22171  incr(saved(-1)); push_math(math_choice_group); scan_left_brace;
 22172  exit:end;
 22173  
 22174  @ Subscripts and superscripts are attached to the previous nucleus by the
 22175  @^superscripts@>@^subscripts@>
 22176  action procedure called |sub_sup|. We use the facts that |sub_mark=sup_mark+1|
 22177  and |subscr(p)=supscr(p)+1|.
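
The address arithmetic that these two facts make possible can be shown with
a small Python model. The numeric values mirror the definitions of
|sup_mark|, |sub_mark|, |supscr|, and |subscr| made elsewhere in this
program; the helper name |script_field| is an assumption of the example.

```python
SUP_MARK, SUB_MARK = 7, 8  # command codes for ^ and _


def supscr(p):
    return p + 2


def subscr(p):
    return p + 3


def script_field(p, cur_cmd):
    """Pick the superscript or subscript field of noad p without a test."""
    return supscr(p) + cur_cmd - SUP_MARK
```

One expression therefore serves both commands, which is why |sub_sup| needs
no case distinction when it chooses the field to fill.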
 22178  
 22179  @<Cases of |main_control| that build...@>=
 22180  mmode+sub_mark,mmode+sup_mark: sub_sup;
 22181  
 22182  @ @<Declare act...@>=
 22183  procedure sub_sup;
 22184  var t:small_number; {type of previous sub/superscript}
 22185  @!p:pointer; {field to be filled by |scan_math|}
 22186  begin t:=empty; p:=null;
 22187  if tail<>head then if scripts_allowed(tail) then
 22188    begin p:=supscr(tail)+cur_cmd-sup_mark; {|supscr| or |subscr|}
 22189    t:=math_type(p);
 22190    end;
 22191  if (p=null)or(t<>empty) then @<Insert a dummy noad to be sub/superscripted@>;
 22192  scan_math(p);
 22193  end;
 22194  
 22195  @ @<Insert a dummy...@>=
 22196  begin tail_append(new_noad);
 22197  p:=supscr(tail)+cur_cmd-sup_mark; {|supscr| or |subscr|}
 22198  if t<>empty then
 22199    begin if cur_cmd=sup_mark then
 22200      begin print_err("Double superscript");
 22201  @.Double superscript@>
 22202      help1("I treat `x^1^2' essentially like `x^1{}^2'.");
 22203      end
 22204    else  begin print_err("Double subscript");
 22205  @.Double subscript@>
 22206      help1("I treat `x_1_2' essentially like `x_1{}_2'.");
 22207      end;
 22208    error;
 22209    end;
 22210  end
 22211  
 22212  @ An operation like `\.{\\over}' causes the current mlist to go into a
 22213  state of suspended animation: |incompleat_noad| points to a |fraction_noad|
 22214  that contains the mlist-so-far as its numerator, while the denominator
 22215  is yet to come. Finally when the mlist is finished, the denominator will
 22216  go into the incompleat fraction noad, and that noad will become the
 22217  whole formula, unless it is surrounded by `\.{\\left}' and `\.{\\right}'
 22218  delimiters.
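
The idea of suspended animation can be modeled in a few lines of Python.
This is only a sketch under stated assumptions: the class |MathLevel|, its
method names, and the dictionary that stands for the fraction noad are all
invented for the example, and the `\.{\\left}' and `\.{\\right}' case is
left out.

```python
class MathLevel:
    """One level of math mode: the items so far plus a pending fraction."""

    def __init__(self):
        self.items = []         # plays the role of the current mlist
        self.incompleat = None  # plays the role of incompleat_noad

    def append(self, item):
        self.items.append(item)

    def fraction(self, kind="over"):
        """Suspend the list as a numerator; return False if ignored."""
        if self.incompleat is not None:
            return False  # TeX reports "Ambiguous" and ignores the operation
        self.incompleat = {"kind": kind, "numerator": self.items,
                           "denominator": None}
        self.items = []
        return True

    def finish(self):
        """Return the completed list once the formula has ended."""
        if self.incompleat is None:
            return self.items
        self.incompleat["denominator"] = self.items
        return [self.incompleat]
```

In this model a second fraction operation at the same level is ignored, and
whatever follows it simply joins the denominator.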
 22219  
 22220  @d above_code=0 { `\.{\\above}' }
 22221  @d over_code=1 { `\.{\\over}' }
 22222  @d atop_code=2 { `\.{\\atop}' }
 22223  @d delimited_code=3 { `\.{\\abovewithdelims}', etc.}
 22224  
 22225  @<Put each...@>=
 22226  primitive("above",above,above_code);@/
 22227  @!@:above_}{\.{\\above} primitive@>
 22228  primitive("over",above,over_code);@/
 22229  @!@:over_}{\.{\\over} primitive@>
 22230  primitive("atop",above,atop_code);@/
 22231  @!@:atop_}{\.{\\atop} primitive@>
 22232  primitive("abovewithdelims",above,delimited_code+above_code);@/
 22233  @!@:above_with_delims_}{\.{\\abovewithdelims} primitive@>
 22234  primitive("overwithdelims",above,delimited_code+over_code);@/
 22235  @!@:over_with_delims_}{\.{\\overwithdelims} primitive@>
 22236  primitive("atopwithdelims",above,delimited_code+atop_code);
 22237  @!@:atop_with_delims_}{\.{\\atopwithdelims} primitive@>
 22238  
 22239  @ @<Cases of |print_cmd_chr|...@>=
 22240  above: case chr_code of
 22241    over_code:print_esc("over");
 22242    atop_code:print_esc("atop");
 22243    delimited_code+above_code:print_esc("abovewithdelims");
 22244    delimited_code+over_code:print_esc("overwithdelims");
 22245    delimited_code+atop_code:print_esc("atopwithdelims");
 22246    othercases print_esc("above")
 22247    endcases;
 22248  
 22249  @ @<Cases of |main_control| that build...@>=
 22250  mmode+above: math_fraction;
 22251  
 22252  @ @<Declare act...@>=
 22253  procedure math_fraction;
 22254  var c:small_number; {the type of generalized fraction we are scanning}
 22255  begin c:=cur_chr;
 22256  if incompleat_noad<>null then
 22257    @<Ignore the fraction operation and complain about this ambiguous case@>
 22258  else  begin incompleat_noad:=get_node(fraction_noad_size);
 22259    type(incompleat_noad):=fraction_noad;
 22260    subtype(incompleat_noad):=normal;
 22261    math_type(numerator(incompleat_noad)):=sub_mlist;
 22262    info(numerator(incompleat_noad)):=link(head);
 22263    mem[denominator(incompleat_noad)].hh:=empty_field;
 22264    mem[left_delimiter(incompleat_noad)].qqqq:=null_delimiter;
 22265    mem[right_delimiter(incompleat_noad)].qqqq:=null_delimiter;@/
 22266    link(head):=null; tail:=head;
 22267    @<Use code |c| to distinguish between generalized fractions@>;
 22268    end;
 22269  end;
 22270  
 22271  @ @<Use code |c|...@>=
 22272  if c>=delimited_code then
 22273    begin scan_delimiter(left_delimiter(incompleat_noad),false);
 22274    scan_delimiter(right_delimiter(incompleat_noad),false);
 22275    end;
 22276  case c mod delimited_code of
 22277  above_code: begin scan_normal_dimen;
 22278    thickness(incompleat_noad):=cur_val;
 22279    end;
 22280  over_code: thickness(incompleat_noad):=default_code;
 22281  atop_code: thickness(incompleat_noad):=0;
 22282  end {there are no other cases}
 22283  
 22284  @ @<Ignore the fraction...@>=
 22285  begin if c>=delimited_code then
 22286    begin scan_delimiter(garbage,false); scan_delimiter(garbage,false);
 22287    end;
 22288  if c mod delimited_code=above_code then scan_normal_dimen;
 22289  print_err("Ambiguous; you need another { and }");
 22290  @.Ambiguous...@>
 22291  help3("I'm ignoring this fraction specification, since I don't")@/
 22292    ("know whether a construction like `x \over y \over z'")@/
 22293    ("means `{x \over y} \over z' or `x \over {y \over z}'.");
 22294  error;
 22295  end
 22296  
 22297  @ At the end of a math formula or subformula, the |fin_mlist| routine is
 22298  called upon to return a pointer to the newly completed mlist, and to
 22299  pop the nest back to the enclosing semantic level. The parameter to
 22300  |fin_mlist|, if not null, points to a |right_noad| that ends the
 22301  current mlist; this |right_noad| has not yet been appended.
 22302  
 22303  @<Declare the function called |fin_mlist|@>=
 22304  function fin_mlist(@!p:pointer):pointer;
 22305  var q:pointer; {the mlist to return}
 22306  begin if incompleat_noad<>null then @<Compleat the incompleat noad@>
 22307  else  begin link(tail):=p; q:=link(head);
 22308    end;
 22309  pop_nest; fin_mlist:=q;
 22310  end;
 22311  
 22312  @ @<Compleat...@>=
 22313  begin math_type(denominator(incompleat_noad)):=sub_mlist;
 22314  info(denominator(incompleat_noad)):=link(head);
 22315  if p=null then q:=incompleat_noad
 22316  else  begin q:=info(numerator(incompleat_noad));
 22317    if type(q)<>left_noad then confusion("right");
 22318  @:this can't happen right}{\quad right@>
 22319    info(numerator(incompleat_noad)):=link(q);
 22320    link(q):=incompleat_noad; link(incompleat_noad):=p;
 22321    end;
 22322  end
 22323  
 22324  @ Now at last we're ready to see what happens when a right brace occurs
 22325  in a math formula. Two special cases are simplified here: Braces are effectively
 22326  removed when they surround a single Ord without sub/superscripts, or when they
 22327  surround an accent that is the nucleus of an Ord atom.
 22328  
 22329  @<Cases of |handle...@>=
 22330  math_group: begin unsave; decr(save_ptr);@/
 22331    math_type(saved(0)):=sub_mlist; p:=fin_mlist(null); info(saved(0)):=p;
 22332    if p<>null then if link(p)=null then
 22333     if type(p)=ord_noad then
 22334      begin if math_type(subscr(p))=empty then
 22335       if math_type(supscr(p))=empty then
 22336        begin mem[saved(0)].hh:=mem[nucleus(p)].hh;
 22337        free_node(p,noad_size);
 22338        end;
 22339      end
 22340    else if type(p)=accent_noad then if saved(0)=nucleus(tail) then
 22341     if type(tail)=ord_noad then @<Replace the tail of the list by |p|@>;
 22342    end;
 22343  
 22344  @ @<Replace the tail...@>=
 22345  begin q:=head; while link(q)<>tail do q:=link(q);
 22346  link(q):=p; free_node(tail,noad_size); tail:=p;
 22347  end
 22348  
 22349  @ We have dealt with all constructions of math mode except `\.{\\left}' and
 22350  `\.{\\right}', so the picture is completed by the following sections of
 22351  the program.
 22352  
 22353  @<Put each...@>=
 22354  primitive("left",left_right,left_noad);
 22355  @!@:left_}{\.{\\left} primitive@>
 22356  primitive("right",left_right,right_noad);
 22357  @!@:right_}{\.{\\right} primitive@>
 22358  text(frozen_right):="right"; eqtb[frozen_right]:=eqtb[cur_val];
 22359  
 22360  @ @<Cases of |print_cmd_chr|...@>=
 22361  left_right: if chr_code=left_noad then print_esc("left")
 22362  else print_esc("right");
 22363  
 22364  @ @<Cases of |main_control| that build...@>=
 22365  mmode+left_right: math_left_right;
 22366  
 22367  @ @<Declare act...@>=
 22368  procedure math_left_right;
 22369  var t:small_number; {|left_noad| or |right_noad|}
 22370  @!p:pointer; {new noad}
 22371  begin t:=cur_chr;
 22372  if (t=right_noad)and(cur_group<>math_left_group) then
 22373    @<Try to recover from mismatched \.{\\right}@>
 22374  else  begin p:=new_noad; type(p):=t;
 22375    scan_delimiter(delimiter(p),false);
 22376    if t=left_noad then
 22377      begin push_math(math_left_group); link(head):=p; tail:=p;
 22378      end
 22379    else  begin p:=fin_mlist(p); unsave; {end of |math_left_group|}
 22380      tail_append(new_noad); type(tail):=inner_noad;
 22381      math_type(nucleus(tail)):=sub_mlist;
 22382      info(nucleus(tail)):=p;
 22383      end;
 22384    end;
 22385  end;
 22386  
 22387  @ @<Try to recover from mismatch...@>=
 22388  begin if cur_group=math_shift_group then
 22389    begin scan_delimiter(garbage,false);
 22390    print_err("Extra "); print_esc("right");
 22391  @.Extra \\right.@>
 22392    help1("I'm ignoring a \right that had no matching \left.");
 22393    error;
 22394    end
 22395  else off_save;
 22396  end
 22397  
 22398  @ Here is the only way out of math mode.
 22399  
 22400  @<Cases of |main_control| that build...@>=
 22401  mmode+math_shift: if cur_group=math_shift_group then after_math
 22402    else off_save;
 22403  
 22404  @ @<Declare act...@>=
 22405  procedure after_math;
 22406  var l:boolean; {`\.{\\leqno}' instead of `\.{\\eqno}'}
 22407  @!danger:boolean; {not enough symbol fonts are present}
 22408  @!m:integer; {|mmode| or |-mmode|}
 22409  @!p:pointer; {the formula}
 22410  @!a:pointer; {box containing equation number}
 22411  @<Local variables for finishing a displayed formula@>@;
 22412  begin danger:=false;
 22413  @<Check that the necessary fonts for math symbols are present;
 22414    if not, flush the current math lists and set |danger:=true|@>;
 22415  m:=mode; l:=false; p:=fin_mlist(null); {this pops the nest}
 22416  if mode=-m then {end of equation number}
 22417    begin @<Check that another \.\$ follows@>;
 22418    cur_mlist:=p; cur_style:=text_style; mlist_penalties:=false;
 22419    mlist_to_hlist; a:=hpack(link(temp_head),natural);
 22420    unsave; decr(save_ptr); {now |cur_group=math_shift_group|}
 22421    if saved(0)=1 then l:=true;
 22422    danger:=false;
 22423    @<Check that the necessary fonts for math symbols are present;
 22424      if not, flush the current math lists and set |danger:=true|@>;
 22425    m:=mode; p:=fin_mlist(null);
 22426    end
 22427  else a:=null;
 22428  if m<0 then @<Finish math in text@>
 22429  else  begin if a=null then @<Check that another \.\$ follows@>;
 22430    @<Finish displayed math@>;
 22431    end;
 22432  end;
 22433  
 22434  @ @<Check that the necessary fonts...@>=
 22435  if (font_params[fam_fnt(2+text_size)]<total_mathsy_params)or@|
 22436     (font_params[fam_fnt(2+script_size)]<total_mathsy_params)or@|
 22437     (font_params[fam_fnt(2+script_script_size)]<total_mathsy_params) then
 22438    begin print_err("Math formula deleted: Insufficient symbol fonts");@/
 22439  @.Math formula deleted...@>
 22440    help3("Sorry, but I can't typeset math unless \textfont 2")@/
 22441      ("and \scriptfont 2 and \scriptscriptfont 2 have all")@/
 22442      ("the \fontdimen values needed in math symbol fonts.");
 22443    error; flush_math; danger:=true;
 22444    end
 22445  else if (font_params[fam_fnt(3+text_size)]<total_mathex_params)or@|
 22446     (font_params[fam_fnt(3+script_size)]<total_mathex_params)or@|
 22447     (font_params[fam_fnt(3+script_script_size)]<total_mathex_params) then
 22448    begin print_err("Math formula deleted: Insufficient extension fonts");@/
 22449    help3("Sorry, but I can't typeset math unless \textfont 3")@/
 22450      ("and \scriptfont 3 and \scriptscriptfont 3 have all")@/
 22451      ("the \fontdimen values needed in math extension fonts.");
 22452    error; flush_math; danger:=true;
 22453    end
 22454  
 22455  @ The |unsave| is done after everything else here; hence an appearance of
 22456  `\.{\\mathsurround}' inside of `\.{\$...\$}' affects the spacing at these
 22457  particular \.\$'s. This is consistent with the conventions of
 22458  `\.{\$\$...\$\$}', since `\.{\\abovedisplayskip}' inside a display affects the
 22459  space above that display.
 22460  
 22461  @<Finish math in text@>=
 22462  begin tail_append(new_math(math_surround,before));
 22463  cur_mlist:=p; cur_style:=text_style; mlist_penalties:=(mode>0); mlist_to_hlist;
 22464  link(tail):=link(temp_head);
 22465  while link(tail)<>null do tail:=link(tail);
 22466  tail_append(new_math(math_surround,after));
 22467  space_factor:=1000; unsave;
 22468  end
 22469  
 22470  @ \TeX\ gets to the following part of the program when the first `\.\$' ending
 22471  a display has been scanned.
 22472  
 22473  @<Check that another \.\$ follows@>=
 22474  begin get_x_token;
 22475  if cur_cmd<>math_shift then
 22476    begin print_err("Display math should end with $$");
 22477  @.Display math...with \$\$@>
 22478    help2("The `$' that I just saw supposedly matches a previous `$$'.")@/
 22479      ("So I shall assume that you typed `$$' both times.");
 22480    back_error;
 22481    end;
 22482  end
 22483  
 22484  @ We have saved the worst for last: The fussiest part of math mode processing
 22485  occurs when a displayed formula is being centered and placed with an optional
 22486  equation number.
 22487  
 22488  @<Local variables for finishing...@>=
 22489  @!b:pointer; {box containing the equation}
 22490  @!w:scaled; {width of the equation}
 22491  @!z:scaled; {width of the line}
 22492  @!e:scaled; {width of equation number}
 22493  @!q:scaled; {width of equation number plus space to separate from equation}
 22494  @!d:scaled; {displacement of equation in the line}
 22495  @!s:scaled; {move the line right this much}
 22496  @!g1,@!g2:small_number; {glue parameter codes for before and after}
 22497  @!r:pointer; {kern node used to position the display}
 22498  @!t:pointer; {tail of adjustment list}
 22499  
 22500  @ At this time |p| points to the mlist for the formula; |a| is either
 22501  |null| or it points to a box containing the equation number; and we are in
 22502  vertical mode (or internal vertical mode).
 22503  
 22504  @<Finish displayed math@>=
 22505  cur_mlist:=p; cur_style:=display_style; mlist_penalties:=false;
 22506  mlist_to_hlist; p:=link(temp_head);@/
 22507  adjust_tail:=adjust_head; b:=hpack(p,natural); p:=list_ptr(b);
 22508  t:=adjust_tail; adjust_tail:=null;@/
 22509  w:=width(b); z:=display_width; s:=display_indent;
 22510  if (a=null)or danger then
 22511    begin e:=0; q:=0;
 22512    end
 22513  else  begin e:=width(a); q:=e+math_quad(text_size);
 22514    end;
 22515  if w+q>z then
 22516    @<Squeeze the equation as much as possible; if there is an equation
 22517      number that should go on a separate line by itself,
 22518      set~|e:=0|@>;
 22519  @<Determine the displacement, |d|, of the left edge of the equation, with
 22520    respect to the line size |z|, assuming that |l=false|@>;
 22521  @<Append the glue or equation number preceding the display@>;
 22522  @<Append the display and perhaps also the equation number@>;
 22523  @<Append the glue or equation number following the display@>;
 22524  resume_after_display
 22525  
 22526  @ @<Declare act...@>=
 22527  procedure resume_after_display;
 22528  begin if cur_group<>math_shift_group then confusion("display");
 22529  @:this can't happen display}{\quad display@>
 22530  unsave; prev_graf:=prev_graf+3;
 22531  push_nest; mode:=hmode; space_factor:=1000; set_cur_lang; clang:=cur_lang;
 22532  prev_graf:=(norm_min(left_hyphen_min)*@'100+norm_min(right_hyphen_min))
 22533               *@'200000+cur_lang;
 22534  @<Scan an optional space@>;
 22535  if nest_ptr=1 then build_page;
 22536  end;
 22537  
 22538  @ The user can force the equation number to go on a separate line
 22539  by causing its width to be zero.
 22540  
 22541  @<Squeeze the equation as much as possible...@>=
 22542  begin if (e<>0)and((w-total_shrink[normal]+q<=z)or@|
 22543     (total_shrink[fil]<>0)or(total_shrink[fill]<>0)or
 22544     (total_shrink[filll]<>0)) then
 22545    begin free_node(b,box_node_size);
 22546    b:=hpack(p,z-q,exactly);
 22547    end
 22548  else  begin e:=0;
 22549    if w>z then
 22550      begin free_node(b,box_node_size);
 22551      b:=hpack(p,z,exactly);
 22552      end;
 22553    end;
 22554  w:=width(b);
 22555  end
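
The squeezing decision above can be sketched outside of \TeX\ as follows. This is only an illustrative Python model, not part of the program: the name `squeeze_display` and its parameters are hypothetical, and the shrinkability of the list is passed in as plain values instead of being read from |total_shrink|. It assumes, as in the calling section, that |w+q>z| already holds.

```python
def squeeze_display(w, z, e, q, shrink_normal, infinite_shrink):
    """Model of the squeeze step for a display with w + q > z.

    Returns (target_width, e). target_width is the width the box should be
    repacked to, or None if the box is left alone. infinite_shrink stands in
    for "some fil, fill, or filll shrink total is nonzero" (an assumption of
    this sketch; TeX tests total_shrink[...] directly).
    """
    if e != 0 and (w - shrink_normal + q <= z or infinite_shrink):
        # The equation number fits on the same line once the list shrinks.
        return z - q, e
    # Otherwise the number is moved to a line by itself (e becomes 0);
    # the equation is repacked only if it is wider than the line.
    return (z if w > z else None), 0
```

A zero-width equation number (|e=0|) always takes the second branch, which is how the user can force the number onto its own line.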
 22556  
 22557  @ We try first to center the display without regard to the existence of
 22558  the equation number. If that would make it too close (where ``too close''
 22559  means that the space between display and equation number is less than the
 22560  width of the equation number), we either center it in the remaining space
 22561  or move it as far from the equation number as possible. The latter alternative
 22562  is taken only if the display begins with glue, since we assume that the
 22563  user put glue there to control the spacing precisely.
 22564  
 22565  @<Determine the displacement, |d|, of the left edge of the equation...@>=
 22566  d:=half(z-w);
 22567  if (e>0)and(d<2*e) then {too close}
 22568    begin d:=half(z-w-e);
 22569    if p<>null then if not is_char_node(p) then if type(p)=glue_node then d:=0;
 22570    end
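
The centering rule just described can be tried out with a small Python sketch. It is a hedged illustration only: `displacement` and `starts_with_glue` are hypothetical names, the node test on |p| is reduced to a boolean, and `half` is written to match the rounding of \TeX's |half| function (odd values round up).

```python
def half(x):
    """x/2 with odd values rounded up, e.g. half(3) = 2 and half(-3) = -1."""
    return (x + 1) // 2 if x % 2 else x // 2


def displacement(z, w, e, starts_with_glue=False):
    """Left-edge displacement d of an equation of width w in a line of width z.

    e is the width of the equation number (0 if there is none on this line).
    """
    d = half(z - w)              # first try: center in the whole line
    if e > 0 and d < 2 * e:      # too close to the equation number
        d = half(z - w - e)      # center in the space that remains
        if starts_with_glue:
            d = 0                # the user's own glue controls the spacing
    return d
```

For example, with `z=1000` and `w=400`, an equation number of width 100 leaves the display centered at 300, while one of width 200 moves it to 200.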
 22571  
 22572  @ If the equation number is set on a line by itself, either before or
 22573  after the formula, we append an infinite penalty so that no page break will
 22574  separate the display from its number; and we use the same size and
 22575  displacement for all three potential lines of the display, even though
 22576  `\.{\\parshape}' may specify them differently.
 22577  
 22578  @<Append the glue or equation number preceding the display@>=
 22579  tail_append(new_penalty(pre_display_penalty));@/
 22580  if (d+s<=pre_display_size)or l then {not enough clearance}
 22581    begin g1:=above_display_skip_code; g2:=below_display_skip_code;
 22582    end
 22583  else  begin g1:=above_display_short_skip_code;
 22584    g2:=below_display_short_skip_code;
 22585    end;
 22586  if l and(e=0) then {it follows that |type(a)=hlist_node|}
 22587    begin shift_amount(a):=s; append_to_vlist(a);
 22588    tail_append(new_penalty(inf_penalty));
 22589    end
 22590  else tail_append(new_param_glue(g1))
 22591  
 22592  @ @<Append the display and perhaps also the equation number@>=
 22593  if e<>0 then
 22594    begin r:=new_kern(z-w-e-d);
 22595    if l then
 22596      begin link(a):=r; link(r):=b; b:=a; d:=0;
 22597      end
 22598    else  begin link(b):=r; link(r):=a;
 22599      end;
 22600    b:=hpack(b,natural);
 22601    end;
 22602  shift_amount(b):=s+d; append_to_vlist(b)
 22603  
 22604  @ @<Append the glue or equation number following the display@>=
 22605  if (a<>null)and(e=0)and not l then
 22606    begin tail_append(new_penalty(inf_penalty));
 22607    shift_amount(a):=s+z-width(a);
 22608    append_to_vlist(a);
 22609    g2:=0;
 22610    end;
 22611  if t<>adjust_head then {migrating material comes after equation number}
 22612    begin link(tail):=link(adjust_head); tail:=t;
 22613    end;
 22614  tail_append(new_penalty(post_display_penalty));
 22615  if g2>0 then tail_append(new_param_glue(g2))
 22616  
 22617  @ When \.{\\halign} appears in a display, the alignment routines operate
 22618  essentially as they do in vertical mode. Then the following program is
 22619  activated, with |p| and |q| pointing to the beginning and end of the
 22620  resulting list, and with |aux_save| holding the |prev_depth| value.
 22621  
 22622  @<Finish an alignment in a display@>=
 22623  begin do_assignments;
 22624  if cur_cmd<>math_shift then @<Pontificate about improper alignment in display@>
 22625  else @<Check that another \.\$ follows@>;
 22626  pop_nest;
 22627  tail_append(new_penalty(pre_display_penalty));
 22628  tail_append(new_param_glue(above_display_skip_code));
 22629  link(tail):=p;
 22630  if p<>null then tail:=q;
 22631  tail_append(new_penalty(post_display_penalty));
 22632  tail_append(new_param_glue(below_display_skip_code));
 22633  prev_depth:=aux_save.sc; resume_after_display;
 22634  end
 22635  
 22636  @ @<Pontificate...@>=
 22637  begin print_err("Missing $$ inserted");
 22638  @.Missing {\$\$} inserted@>
 22639  help2("Displays can use special alignments (like \eqalignno)")@/
 22640    ("only if nothing but the alignment itself is between $$'s.");
 22641  back_error;
 22642  end
 22643  
 22644  @* \[49] Mode-independent processing.
 22645  The long |main_control| procedure has now been fully specified, except for
 22646  certain activities that are independent of the current mode. These activities
 22647  do not change the current vlist or hlist or mlist; if they change anything,
 22648  it is the value of a parameter or the meaning of a control sequence.
 22649  
 22650  Assignments to values in |eqtb| can be global or local. Furthermore, a
 22651  control sequence can be defined to be `\.{\\long}' or `\.{\\outer}', and
 22652  it might or might not be expanded. The prefixes `\.{\\global}', `\.{\\long}',
 22653  and `\.{\\outer}' can occur in any order. Therefore we assign binary numeric
 22654  codes, making it possible to accumulate the union of all specified prefixes
 22655  by adding the corresponding codes.  (\PASCAL's |set| operations could also
 22656  have been used.)
 22657  
 22658  @<Put each...@>=
 22659  primitive("long",prefix,1);
 22660  @!@:long_}{\.{\\long} primitive@>
 22661  primitive("outer",prefix,2);
 22662  @!@:outer_}{\.{\\outer} primitive@>
 22663  primitive("global",prefix,4);
 22664  @!@:global_}{\.{\\global} primitive@>
 22665  primitive("def",def,0);
 22666  @!@:def_}{\.{\\def} primitive@>
 22667  primitive("gdef",def,1);
 22668  @!@:gdef_}{\.{\\gdef} primitive@>
 22669  primitive("edef",def,2);
 22670  @!@:edef_}{\.{\\edef} primitive@>
 22671  primitive("xdef",def,3);
 22672  @!@:xdef_}{\.{\\xdef} primitive@>
 22673  
 22674  @ @<Cases of |print_cmd_chr|...@>=
 22675  prefix: if chr_code=1 then print_esc("long")
 22676    else if chr_code=2 then print_esc("outer")
 22677    else print_esc("global");
 22678  def: if chr_code=0 then print_esc("def")
 22679    else if chr_code=1 then print_esc("gdef")
 22680    else if chr_code=2 then print_esc("edef")
 22681    else print_esc("xdef");
 22682  
 22683  @ Every prefix, and every command code that might or might not be prefixed,
 22684  calls the action procedure |prefixed_command|. This routine accumulates
 22685  a sequence of prefixes until coming to a non-prefix, then it carries out
 22686  the command.
 22687  
 22688  @<Cases of |main_control| that don't...@>=
 22689  any_mode(toks_register),
 22690  any_mode(assign_toks),
 22691  any_mode(assign_int),
 22692  any_mode(assign_dimen),
 22693  any_mode(assign_glue),
 22694  any_mode(assign_mu_glue),
 22695  any_mode(assign_font_dimen),
 22696  any_mode(assign_font_int),
 22697  any_mode(set_aux),
 22698  any_mode(set_prev_graf),
 22699  any_mode(set_page_dimen),
 22700  any_mode(set_page_int),
 22701  any_mode(set_box_dimen),
 22702  any_mode(set_shape),
 22703  any_mode(def_code),
 22704  any_mode(def_family),
 22705  any_mode(set_font),
 22706  any_mode(def_font),
 22707  any_mode(register),
 22708  any_mode(advance),
 22709  any_mode(multiply),
 22710  any_mode(divide),
 22711  any_mode(prefix),
 22712  any_mode(let),
 22713  any_mode(shorthand_def),
 22714  any_mode(read_to_cs),
 22715  any_mode(def),
 22716  any_mode(set_box),
 22717  any_mode(hyph_data),
 22718  any_mode(set_interaction):prefixed_command;
 22719  
 22720  @ If the user says, e.g., `\.{\\global\\global}', the redundancy is
 22721  silently accepted.
 22722  
 22723  @<Declare act...@>=
 22724  @t\4@>@<Declare subprocedures for |prefixed_command|@>@t@>@;@/
 22725  procedure prefixed_command;
 22726  label done,exit;
 22727  var a:small_number; {accumulated prefix codes so far}
 22728  @!f:internal_font_number; {identifies a font}
 22729  @!j:halfword; {index into a \.{\\parshape} specification}
 22730  @!k:font_index; {index into |font_info|}
 22731  @!p,@!q:pointer; {for temporary short-term use}
 22732  @!n:integer; {ditto}
 22733  @!e:boolean; {should a definition be expanded? or was \.{\\let} not done?}
 22734  begin a:=0;
 22735  while cur_cmd=prefix do
 22736    begin if not odd(a div cur_chr) then a:=a+cur_chr;
 22737    @<Get the next non-blank non-relax...@>;
 22738    if cur_cmd<=max_non_prefixed_command then
 22739      @<Discard erroneous prefixes and |return|@>;
 22740    end;
 22741  @<Discard the prefixes \.{\\long} and \.{\\outer} if they are irrelevant@>;
 22742  @<Adjust \(f)for the setting of \.{\\globaldefs}@>;
 22743  case cur_cmd of
 22744  @t\4@>@<Assignments@>@;
 22745  othercases confusion("prefix")
 22746  @:this can't happen prefix}{\quad prefix@>
 22747  endcases;
 22748  done: @<Insert a token saved by \.{\\afterassignment}, if any@>;
 22749  exit:end;
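
The way |prefixed_command| accumulates prefix codes in |a| can be modeled in a few lines of Python. This is a sketch for illustration, not the program's code; `accumulate_prefixes` is a hypothetical name, and the three constants are the |chr| codes given to the prefix primitives above.

```python
LONG, OUTER, GLOBAL = 1, 2, 4  # chr codes of \long, \outer, \global


def accumulate_prefixes(codes):
    """Combine a sequence of prefix codes the way the while loop does."""
    a = 0
    for cur_chr in codes:
        # "not odd(a div cur_chr)": the bit for this prefix is not yet set
        if (a // cur_chr) % 2 == 0:
            a += cur_chr
    return a
```

Because a code is added only when its bit is still clear, a repeated prefix such as `[GLOBAL, GLOBAL]` yields 4, the same as a single `GLOBAL`, and the order of the prefixes does not matter.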
 22750  
 22751  @ @<Discard erroneous...@>=
 22752  begin print_err("You can't use a prefix with `");
 22753  @.You can't use a prefix with x@>
 22754  print_cmd_chr(cur_cmd,cur_chr); print_char("'");
 22755  help1("I'll pretend you didn't say \long or \outer or \global.");
 22756  back_error; return;
 22757  end
 22758  
 22759  @ @<Discard the prefixes...@>=
 22760  if (cur_cmd<>def)and(a mod 4<>0) then
 22761    begin print_err("You can't use `"); print_esc("long"); print("' or `");
 22762    print_esc("outer"); print("' with `");
 22763  @.You can't use \\long...@>
 22764    print_cmd_chr(cur_cmd,cur_chr); print_char("'");
 22765    help1("I'll pretend you didn't say \long or \outer here.");
 22766    error;
 22767    end
 22768  
 22769  @ The previous routine does not have to adjust |a| so that |a mod 4=0|,
 22770  since the following routines test for the \.{\\global} prefix as follows.
 22771  
 22772  @d global==(a>=4)
 22773  @d define(#)==if global then geq_define(#)@+else eq_define(#)
 22774  @d word_define(#)==if global then geq_word_define(#)@+else eq_word_define(#)
 22775  
 22776  @<Adjust \(f)for the setting of \.{\\globaldefs}@>=
 22777  if global_defs<>0 then
 22778    if global_defs<0 then
 22779      begin if global then a:=a-4;
 22780      end
 22781    else  begin if not global then a:=a+4;
 22782      end
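
As a rough illustration of the adjustment above, the following Python sketch applies the same rule to an accumulated prefix code. The function name `adjust_for_globaldefs` is hypothetical; the test `a >= 4` mirrors the |global| macro, which is valid because |a| never exceeds 7.

```python
def adjust_for_globaldefs(a, global_defs):
    """Force or suppress the global bit of a according to \\globaldefs."""
    is_global = a >= 4
    if global_defs < 0 and is_global:
        a -= 4   # negative \globaldefs: every assignment becomes local
    elif global_defs > 0 and not is_global:
        a += 4   # positive \globaldefs: every assignment becomes global
    return a
```

The \.{\\long} and \.{\\outer} bits (`a % 4`) are left untouched in every case.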
 22783  
 22784  @ When a control sequence is to be defined, by \.{\\def} or \.{\\let} or
 22785  something similar, the |get_r_token| routine will substitute a special
 22786  control sequence for a token that is not redefinable.
 22787  
 22788  @<Declare subprocedures for |prefixed_command|@>=
 22789  procedure get_r_token;
 22790  label restart;
 22791  begin restart: repeat get_token;
 22792  until cur_tok<>space_token;
 22793  if (cur_cs=0)or(cur_cs>frozen_control_sequence) then
 22794    begin print_err("Missing control sequence inserted");
 22795  @.Missing control...@>
 22796    help5("Please don't say `\def cs{...}', say `\def\cs{...}'.")@/
 22797    ("I've inserted an inaccessible control sequence so that your")@/
 22798    ("definition will be completed without mixing me up too badly.")@/
 22799    ("You can recover graciously from this error, if you're")@/
 22800    ("careful; see exercise 27.2 in The TeXbook.");
 22801  @:TeXbook}{\sl The \TeX book@>
 22802    if cur_cs=0 then back_input;
 22803    cur_tok:=cs_token_flag+frozen_protection; ins_error; goto restart;
 22804    end;
 22805  end;
 22806  
 22807  @ @<Initialize table entries...@>=
 22808  text(frozen_protection):="inaccessible";
 22809  @.inaccessible@>
 22810  
 22811  @ Here's an example of the way many of the following routines operate.
 22812  (Unfortunately, they aren't all as simple as this.)
 22813  
 22814  @<Assignments@>=
 22815  set_font: define(cur_font_loc,data,cur_chr);
 22816  
 22817  @ When a |def| command has been scanned,
 22818  |cur_chr| is odd if the definition is supposed to be global, and
 22819  |cur_chr>=2| if the definition is supposed to be expanded.
 22820  
 22821  @<Assignments@>=
 22822  def: begin if odd(cur_chr)and not global and(global_defs>=0) then a:=a+4;
 22823    e:=(cur_chr>=2); get_r_token; p:=cur_cs;
 22824    q:=scan_toks(true,e); define(p,call+(a mod 4),def_ref);
 22825    end;
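
The encoding of the four definition primitives in |cur_chr| can be made concrete with a short Python sketch. The names `decode_def` and `prefix_after_def` are hypothetical and serve only to illustrate the two tests in the code above; the table of codes is taken from the |primitive| calls for \.{\\def}, \.{\\gdef}, \.{\\edef}, and \.{\\xdef}.

```python
DEF_CODES = {"def": 0, "gdef": 1, "edef": 2, "xdef": 3}


def decode_def(chr_code):
    """Return (is_global, is_expanded) for a def command's chr code."""
    return chr_code % 2 == 1, chr_code >= 2


def prefix_after_def(a, chr_code, global_defs):
    """Add the global bit for \\gdef and \\xdef unless it is already set
    or a negative \\globaldefs has forced local definitions."""
    if chr_code % 2 == 1 and a < 4 and global_defs >= 0:
        a += 4
    return a
```

Under this model \.{\\xdef} decodes to both global and expanded, and a \.{\\gdef} issued while \.{\\globaldefs} is negative stays local.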
 22826  
 22827  @ Both \.{\\let} and \.{\\futurelet} share the command code |let|.
 22828  
 22829  @<Put each...@>=
 22830  primitive("let",let,normal);@/
 22831  @!@:let_}{\.{\\let} primitive@>
 22832  primitive("futurelet",let,normal+1);@/
 22833  @!@:future_let_}{\.{\\futurelet} primitive@>
 22834  
 22835  @ @<Cases of |print_cmd_chr|...@>=
 22836  let: if chr_code<>normal then print_esc("futurelet")@+else print_esc("let");
 22837  
 22838  @ @<Assignments@>=
 22839  let:  begin n:=cur_chr;
 22840    get_r_token; p:=cur_cs;
 22841    if n=normal then
 22842      begin repeat get_token;
 22843      until cur_cmd<>spacer;
 22844      if cur_tok=other_token+"=" then
 22845        begin get_token;
 22846        if cur_cmd=spacer then get_token;
 22847        end;
 22848      end
 22849    else  begin get_token; q:=cur_tok; get_token; back_input;
 22850      cur_tok:=q; back_input; {look ahead, then back up}
 22851      end; {note that |back_input| doesn't affect |cur_cmd|, |cur_chr|}
 22852    if cur_cmd>=call then add_token_ref(cur_chr);
 22853    define(p,cur_cmd,cur_chr);
 22854    end;
 22855  
 22856  @ A \.{\\chardef} creates a control sequence whose |cmd| is |char_given|;
 22857  a \.{\\mathchardef} creates a control sequence whose |cmd| is |math_given|;
 22858  and the corresponding |chr| is the character code or math code. A \.{\\countdef}
 22859  or \.{\\dimendef} or \.{\\skipdef} or \.{\\muskipdef} creates a control
 22860  sequence whose |cmd| is |assign_int| or \dots\ or |assign_mu_glue|, and the
 22861  corresponding |chr| is the |eqtb| location of the internal register in question.
 22862  
 22863  @d char_def_code=0 {|shorthand_def| for \.{\\chardef}}
 22864  @d math_char_def_code=1 {|shorthand_def| for \.{\\mathchardef}}
 22865  @d count_def_code=2 {|shorthand_def| for \.{\\countdef}}
 22866  @d dimen_def_code=3 {|shorthand_def| for \.{\\dimendef}}
 22867  @d skip_def_code=4 {|shorthand_def| for \.{\\skipdef}}
 22868  @d mu_skip_def_code=5 {|shorthand_def| for \.{\\muskipdef}}
 22869  @d toks_def_code=6 {|shorthand_def| for \.{\\toksdef}}
 22870  
 22871  @<Put each...@>=
 22872  primitive("chardef",shorthand_def,char_def_code);@/
 22873  @!@:char_def_}{\.{\\chardef} primitive@>
 22874  primitive("mathchardef",shorthand_def,math_char_def_code);@/
 22875  @!@:math_char_def_}{\.{\\mathchardef} primitive@>
 22876  primitive("countdef",shorthand_def,count_def_code);@/
 22877  @!@:count_def_}{\.{\\countdef} primitive@>
 22878  primitive("dimendef",shorthand_def,dimen_def_code);@/
 22879  @!@:dimen_def_}{\.{\\dimendef} primitive@>
 22880  primitive("skipdef",shorthand_def,skip_def_code);@/
 22881  @!@:skip_def_}{\.{\\skipdef} primitive@>
 22882  primitive("muskipdef",shorthand_def,mu_skip_def_code);@/
 22883  @!@:mu_skip_def_}{\.{\\muskipdef} primitive@>
 22884  primitive("toksdef",shorthand_def,toks_def_code);@/
 22885  @!@:toks_def_}{\.{\\toksdef} primitive@>
 22886  
 22887  @ @<Cases of |print_cmd_chr|...@>=
 22888  shorthand_def: case chr_code of
 22889    char_def_code: print_esc("chardef");
 22890    math_char_def_code: print_esc("mathchardef");
 22891    count_def_code: print_esc("countdef");
 22892    dimen_def_code: print_esc("dimendef");
 22893    skip_def_code: print_esc("skipdef");
 22894    mu_skip_def_code: print_esc("muskipdef");
 22895    othercases print_esc("toksdef")
 22896    endcases;
 22897  char_given: begin print_esc("char"); print_hex(chr_code);
 22898    end;
 22899  math_given: begin print_esc("mathchar"); print_hex(chr_code);
 22900    end;
 22901  
 22902  @ We temporarily define |p| to be |relax|, so that an occurrence of |p|
 22903  while scanning the definition will simply stop the scanning instead of
 22904  producing an ``undefined control sequence'' error or expanding the
 22905  previous meaning.  This allows, for instance, `\.{\\chardef\\foo=123\\foo}'.
 22906  
 22907  @<Assignments@>=
 22908  shorthand_def: begin n:=cur_chr; get_r_token; p:=cur_cs; define(p,relax,256);
 22909    scan_optional_equals;
 22910    case n of
 22911    char_def_code: begin scan_char_num; define(p,char_given,cur_val);
 22912      end;
 22913    math_char_def_code: begin scan_fifteen_bit_int; define(p,math_given,cur_val);
 22914      end;
 22915    othercases begin scan_eight_bit_int;
 22916      case n of
 22917      count_def_code: define(p,assign_int,count_base+cur_val);
 22918      dimen_def_code: define(p,assign_dimen,scaled_base+cur_val);
 22919      skip_def_code: define(p,assign_glue,skip_base+cur_val);
 22920      mu_skip_def_code: define(p,assign_mu_glue,mu_skip_base+cur_val);
 22921      toks_def_code: define(p,assign_toks,toks_base+cur_val);
 22922      end; {there are no other cases}
 22923      end
 22924    endcases;
 22925    end;
 22926  
 22927  @ @<Assignments@>=
 22928  read_to_cs: begin scan_int; n:=cur_val;
 22929    if not scan_keyword("to") then
 22930  @.to@>
 22931      begin print_err("Missing `to' inserted");
 22932  @.Missing `to'...@>
 22933      help2("You should have said `\read<number> to \cs'.")@/
 22934      ("I'm going to look for the \cs now."); error;
 22935      end;
 22936    get_r_token;
 22937    p:=cur_cs; read_toks(n,p); define(p,call,cur_val);
 22938    end;
 22939  
 22940  @ The token-list parameters, \.{\\output} and \.{\\everypar}, etc., receive
 22941  their values in the following way. (For safety's sake, we place an
 22942  enclosing pair of braces around an \.{\\output} list.)
 22943  
 22944  @<Assignments@>=
 22945  toks_register,assign_toks: begin q:=cur_cs;
 22946    if cur_cmd=toks_register then
 22947      begin scan_eight_bit_int; p:=toks_base+cur_val;
 22948      end
 22949    else p:=cur_chr; {|p=every_par_loc| or |output_routine_loc| or \dots}
 22950    scan_optional_equals;
 22951    @<Get the next non-blank non-relax non-call token@>;
 22952    if cur_cmd<>left_brace then @<If the right-hand side is a token parameter
 22953        or token register, finish the assignment and |goto done|@>;
 22954    back_input; cur_cs:=q; q:=scan_toks(false,false);
 22955    if link(def_ref)=null then {empty list: revert to the default}
 22956      begin define(p,undefined_cs,null); free_avail(def_ref);
 22957      end
 22958    else  begin if p=output_routine_loc then {enclose in curlies}
 22959        begin link(q):=get_avail; q:=link(q);
 22960        info(q):=right_brace_token+"}";
 22961        q:=get_avail; info(q):=left_brace_token+"{";
 22962        link(q):=link(def_ref); link(def_ref):=q;
 22963        end;
 22964      define(p,call,def_ref);
 22965      end;
 22966    end;
 22967  
 22968  @ @<If the right-hand side is a token parameter...@>=
 22969  begin if cur_cmd=toks_register then
 22970    begin scan_eight_bit_int; cur_cmd:=assign_toks; cur_chr:=toks_base+cur_val;
 22971    end;
 22972  if cur_cmd=assign_toks then
 22973    begin q:=equiv(cur_chr);
 22974    if q=null then define(p,undefined_cs,null)
 22975    else  begin add_token_ref(q); define(p,call,q);
 22976      end;
 22977    goto done;
 22978    end;
 22979  end
 22980  
 22981  @ Similar routines are used to assign values to the numeric parameters.
 22982  
 22983  @<Assignments@>=
 22984  assign_int: begin p:=cur_chr; scan_optional_equals; scan_int;
 22985    word_define(p,cur_val);
 22986    end;
 22987  assign_dimen: begin p:=cur_chr; scan_optional_equals;
 22988    scan_normal_dimen; word_define(p,cur_val);
 22989    end;
 22990  assign_glue,assign_mu_glue: begin p:=cur_chr; n:=cur_cmd; scan_optional_equals;
 22991    if n=assign_mu_glue then scan_glue(mu_val)@+else scan_glue(glue_val);
 22992    trap_zero_glue;
 22993    define(p,glue_ref,cur_val);
 22994    end;
 22995  
 22996  @ When a glue register or parameter becomes zero, it will always point to
 22997  |zero_glue| because of the following procedure. (Exception: The tabskip
 22998  glue isn't trapped while preambles are being scanned.)
 22999  
 23000  @<Declare subprocedures for |prefixed_command|@>=
 23001  procedure trap_zero_glue;
 23002  begin if (width(cur_val)=0)and(stretch(cur_val)=0)and(shrink(cur_val)=0) then
 23003    begin add_glue_ref(zero_glue);
 23004    delete_glue_ref(cur_val); cur_val:=zero_glue;
 23005    end;
 23006  end;
 23007  
 23008  @ The various character code tables are changed by the |def_code| commands,
 23009  and the font families are declared by |def_family|.
 23010  
 23011  @<Put each...@>=
 23012  primitive("catcode",def_code,cat_code_base);
 23013  @!@:cat_code_}{\.{\\catcode} primitive@>
 23014  primitive("mathcode",def_code,math_code_base);
 23015  @!@:math_code_}{\.{\\mathcode} primitive@>
 23016  primitive("lccode",def_code,lc_code_base);
 23017  @!@:lc_code_}{\.{\\lccode} primitive@>
 23018  primitive("uccode",def_code,uc_code_base);
 23019  @!@:uc_code_}{\.{\\uccode} primitive@>
 23020  primitive("sfcode",def_code,sf_code_base);
 23021  @!@:sf_code_}{\.{\\sfcode} primitive@>
 23022  primitive("delcode",def_code,del_code_base);
 23023  @!@:del_code_}{\.{\\delcode} primitive@>
 23024  primitive("textfont",def_family,math_font_base);
 23025  @!@:text_font_}{\.{\\textfont} primitive@>
 23026  primitive("scriptfont",def_family,math_font_base+script_size);
 23027  @!@:script_font_}{\.{\\scriptfont} primitive@>
 23028  primitive("scriptscriptfont",def_family,math_font_base+script_script_size);
 23029  @!@:script_script_font_}{\.{\\scriptscriptfont} primitive@>
 23030  
 23031  @ @<Cases of |print_cmd_chr|...@>=
 23032  def_code: if chr_code=cat_code_base then print_esc("catcode")
 23033    else if chr_code=math_code_base then print_esc("mathcode")
 23034    else if chr_code=lc_code_base then print_esc("lccode")
 23035    else if chr_code=uc_code_base then print_esc("uccode")
 23036    else if chr_code=sf_code_base then print_esc("sfcode")
 23037    else print_esc("delcode");
 23038  def_family: print_size(chr_code-math_font_base);
 23039  
 23040  @ The different types of code values have different legal ranges; the
 23041  following program is careful to check each case properly.
 23042  
 23043  @<Assignments@>=
 23044  def_code: begin @<Let |n| be the largest legal code value, based on |cur_chr|@>;
 23045    p:=cur_chr; scan_char_num; p:=p+cur_val; scan_optional_equals;
 23046    scan_int;
 23047    if ((cur_val<0)and(p<del_code_base))or(cur_val>n) then
 23048      begin print_err("Invalid code ("); print_int(cur_val);
 23049  @.Invalid code@>
 23050      if p<del_code_base then print("), should be in the range 0..")
 23051      else print("), should be at most ");
 23052      print_int(n);
 23053      help1("I'm going to use 0 instead of that illegal code value.");@/
 23054      error; cur_val:=0;
 23055      end;
 23056    if p<math_code_base then define(p,data,cur_val)
 23057    else if p<del_code_base then define(p,data,hi(cur_val))
 23058    else word_define(p,cur_val);
 23059    end;
 23060  
 23061  @ @<Let |n| be the largest...@>=
 23062  if cur_chr=cat_code_base then n:=max_char_code
 23063  else if cur_chr=math_code_base then n:=@'100000
 23064  else if cur_chr=sf_code_base then n:=@'77777
 23065  else if cur_chr=del_code_base then n:=@'77777777
 23066  else n:=255
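
The legal ranges checked above can be summarized in a small Python sketch. It is an illustration under stated assumptions: the tables are keyed by primitive name instead of by |eqtb| base address, `checked_code` is a hypothetical name, and |max_char_code| is taken to be 15, its value as defined elsewhere in this program.

```python
MAX_CODE = {
    "catcode": 15,           # max_char_code (assumed to be 15, see lead-in)
    "mathcode": 0o100000,    # @'100000
    "sfcode": 0o77777,       # @'77777
    "delcode": 0o77777777,   # @'77777777
    "lccode": 255,
    "uccode": 255,
}


def checked_code(table, value):
    """Return value if it is legal for the given table, else 0.

    Negative values are rejected for every table except delcode,
    mirroring the test (cur_val<0) and (p<del_code_base).
    """
    too_small = value < 0 and table != "delcode"
    if too_small or value > MAX_CODE[table]:
        return 0  # "I'm going to use 0 instead of that illegal code value."
    return value
```

Note that \.{\\mathcode} admits the single value |@'100000| (octal) above the fifteen-bit range, and that \.{\\delcode} is the only table whose entries may be negative.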
 23067  
 23068  @ @<Assignments@>=
 23069  def_family: begin p:=cur_chr; scan_four_bit_int; p:=p+cur_val;
 23070    scan_optional_equals; scan_font_ident; define(p,data,cur_val);
 23071    end;
 23072  
 23073  @ Next we consider changes to \TeX's numeric registers.
 23074  
 23075  @<Assignments@>=
 23076  register,advance,multiply,divide: do_register_command(a);
 23077  
 23078  @ We use the fact that |register<advance<multiply<divide|.
 23079  
 23080  @<Declare subprocedures for |prefixed_command|@>=
 23081  procedure do_register_command(@!a:small_number);
 23082  label found,exit;
 23083  var l,@!q,@!r,@!s:pointer; {for list manipulation}
 23084  @!p:int_val..mu_val; {type of register involved}
 23085  begin q:=cur_cmd;
 23086  @<Compute the register location |l| and its type |p|; but |return| if invalid@>;
 23087  if q=register then scan_optional_equals
 23088  else if scan_keyword("by") then do_nothing; {optional `\.{by}'}
 23089  @.by@>
 23090  arith_error:=false;
 23091  if q<multiply then @<Compute result of |register| or
 23092      |advance|, put it in |cur_val|@>
 23093  else @<Compute result of |multiply| or |divide|, put it in |cur_val|@>;
 23094  if arith_error then
 23095    begin print_err("Arithmetic overflow");
 23096  @.Arithmetic overflow@>
 23097    help2("I can't carry out that multiplication or division,")@/
 23098      ("since the result is out of range.");
 23099    if p>=glue_val then delete_glue_ref(cur_val);
 23100    error; return;
 23101    end;
 23102  if p<glue_val then word_define(l,cur_val)
 23103  else  begin trap_zero_glue; define(l,glue_ref,cur_val);
 23104    end;
 23105  exit: end;
 23106  
 23107  @ Here we use the fact that the consecutive codes |int_val..mu_val| and
 23108  |assign_int..assign_mu_glue| correspond to each other nicely.
 23109  
 23110  @<Compute the register location |l| and its type |p|...@>=
 23111  begin if q<>register then
 23112    begin get_x_token;
 23113    if (cur_cmd>=assign_int)and(cur_cmd<=assign_mu_glue) then
 23114      begin l:=cur_chr; p:=cur_cmd-assign_int; goto found;
 23115      end;
 23116    if cur_cmd<>register then
 23117      begin print_err("You can't use `"); print_cmd_chr(cur_cmd,cur_chr);
 23118  @.You can't use x after ...@>
 23119      print("' after "); print_cmd_chr(q,0);
 23120      help1("I'm forgetting what you said and not changing anything.");
 23121      error; return;
 23122      end;
 23123    end;
 23124  p:=cur_chr; scan_eight_bit_int;
 23125  case p of
 23126  int_val: l:=cur_val+count_base;
 23127  dimen_val: l:=cur_val+scaled_base;
 23128  glue_val: l:=cur_val+skip_base;
 23129  mu_val: l:=cur_val+mu_skip_base;
 23130  end; {there are no other cases}
 23131  end;
 23132  found:
 23133  
 23134  @ @<Compute result of |register| or |advance|...@>=
 23135  if p<glue_val then
 23136    begin if p=int_val then scan_int@+else scan_normal_dimen;
 23137    if q=advance then cur_val:=cur_val+eqtb[l].int;
 23138    end
 23139  else  begin scan_glue(p);
 23140    if q=advance then @<Compute the sum of two glue specs@>;
 23141    end
 23142  
 23143  @ @<Compute the sum of two glue specs@>=
 23144  begin q:=new_spec(cur_val); r:=equiv(l);
 23145  delete_glue_ref(cur_val);
 23146  width(q):=width(q)+width(r);
 23147  if stretch(q)=0 then stretch_order(q):=normal;
 23148  if stretch_order(q)=stretch_order(r) then stretch(q):=stretch(q)+stretch(r)
 23149  else if (stretch_order(q)<stretch_order(r))and(stretch(r)<>0) then
 23150    begin stretch(q):=stretch(r); stretch_order(q):=stretch_order(r);
 23151    end;
 23152  if shrink(q)=0 then shrink_order(q):=normal;
 23153  if shrink_order(q)=shrink_order(r) then shrink(q):=shrink(q)+shrink(r)
 23154  else if (shrink_order(q)<shrink_order(r))and(shrink(r)<>0) then
 23155    begin shrink(q):=shrink(r); shrink_order(q):=shrink_order(r);
 23156    end;
 23157  cur_val:=q;
 23158  end
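
The rule for adding two glue specifications is easier to follow on small values. The Python sketch below is a simplified model and not the program's data structure: `GlueSpec`, `add_glue`, and `_combine` are hypothetical names, reference counts are ignored, and the orders of infinity are represented by the integers 0 to 3. Here `q` plays the role of the newly scanned glue and `r` the old register value.

```python
from dataclasses import dataclass

NORMAL, FIL, FILL, FILLL = range(4)  # orders of infinity, lowest first


@dataclass
class GlueSpec:
    width: int = 0
    stretch: int = 0
    stretch_order: int = NORMAL
    shrink: int = 0
    shrink_order: int = NORMAL


def _combine(amount_q, order_q, amount_r, order_r):
    """Add one stretch or shrink component; the higher nonzero order wins."""
    if amount_q == 0:
        order_q = NORMAL
    if order_q == order_r:
        amount_q += amount_r
    elif order_q < order_r and amount_r != 0:
        amount_q, order_q = amount_r, order_r
    return amount_q, order_q


def add_glue(q, r):
    """Return the sum of glue q (new value) and glue r (old register)."""
    stretch, stretch_order = _combine(
        q.stretch, q.stretch_order, r.stretch, r.stretch_order)
    shrink, shrink_order = _combine(
        q.shrink, q.shrink_order, r.shrink, r.shrink_order)
    return GlueSpec(q.width + r.width, stretch, stretch_order,
                    shrink, shrink_order)
```

In this model, adding a finite stretch of 3 to a register holding a stretch of 1 at order `FIL` leaves 1 at order `FIL`: the finite component is dropped, since components of different orders are never mixed.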
 23159  
 23160  @ @<Compute result of |multiply| or |divide|...@>=
 23161  begin scan_int;
 23162  if p<glue_val then
 23163    if q=multiply then
 23164      if p=int_val then cur_val:=mult_integers(eqtb[l].int,cur_val)
 23165      else cur_val:=nx_plus_y(eqtb[l].int,cur_val,0)
 23166    else cur_val:=x_over_n(eqtb[l].int,cur_val)
 23167  else  begin s:=equiv(l); r:=new_spec(s);
 23168    if q=multiply then
 23169      begin width(r):=nx_plus_y(width(s),cur_val,0);
 23170      stretch(r):=nx_plus_y(stretch(s),cur_val,0);
 23171      shrink(r):=nx_plus_y(shrink(s),cur_val,0);
 23172      end
 23173    else  begin width(r):=x_over_n(width(s),cur_val);
 23174      stretch(r):=x_over_n(stretch(s),cur_val);
 23175      shrink(r):=x_over_n(shrink(s),cur_val);
 23176      end;
 23177    cur_val:=r;
 23178    end;
 23179  end
 23180  
 23181  @ The processing of boxes is somewhat different, because we may need
 23182  to scan and create an entire box before we actually change the value of the old
 23183  one.
 23184  
 23185  @<Assignments@>=
 23186  set_box: begin scan_eight_bit_int;
 23187    if global then n:=256+cur_val@+else n:=cur_val;
 23188    scan_optional_equals;
 23189    if set_box_allowed then scan_box(box_flag+n)
 23190    else begin print_err("Improper "); print_esc("setbox");
 23191  @.Improper \\setbox@>
 23192      help2("Sorry, \setbox is not allowed after \halign in a display,")@/
 23193      ("or between \accent and an accented character."); error;
 23194      end;
 23195    end;
 23196  
 23197  @ The |space_factor| or |prev_depth| settings are changed when a |set_aux|
 23198  command is sensed. Similarly, |prev_graf| is changed in the presence of
 23199  |set_prev_graf|, and |dead_cycles| or |insert_penalties| in the presence of
 23200  |set_page_int|. These definitions are always global.
 23201  
 23202  When some dimension of a box register is changed, the change isn't exactly
 23203  global; but \TeX\ does not look at the \.{\\global} switch.
 23204  
 23205  @<Assignments@>=
 23206  set_aux:alter_aux;
 23207  set_prev_graf:alter_prev_graf;
 23208  set_page_dimen:alter_page_so_far;
 23209  set_page_int:alter_integer;
 23210  set_box_dimen:alter_box_dimen;
 23211  
 23212  @ @<Declare subprocedures for |prefixed_command|@>=
 23213  procedure alter_aux;
 23214  var c:halfword; {|hmode| or |vmode|}
 23215  begin if cur_chr<>abs(mode) then report_illegal_case
 23216  else  begin c:=cur_chr; scan_optional_equals;
 23217    if c=vmode then
 23218      begin scan_normal_dimen; prev_depth:=cur_val;
 23219      end
 23220    else  begin scan_int;
 23221      if (cur_val<=0)or(cur_val>32767) then
 23222        begin print_err("Bad space factor");
 23223  @.Bad space factor@>
 23224        help1("I allow only values in the range 1..32767 here.");
 23225        int_error(cur_val);
 23226        end
 23227      else space_factor:=cur_val;
 23228      end;
 23229    end;
 23230  end;
 23231  
 23232  @ @<Declare subprocedures for |prefixed_command|@>=
 23233  procedure alter_prev_graf;
 23234  var p:0..nest_size; {index into |nest|}
 23235  begin nest[nest_ptr]:=cur_list; p:=nest_ptr;
 23236  while abs(nest[p].mode_field)<>vmode do decr(p);
 23237  scan_optional_equals; scan_int;
 23238  if cur_val<0 then
 23239    begin print_err("Bad "); print_esc("prevgraf");
 23240  @.Bad \\prevgraf@>
 23241    help1("I allow only nonnegative values here.");
 23242    int_error(cur_val);
 23243    end
 23244  else  begin nest[p].pg_field:=cur_val; cur_list:=nest[nest_ptr];
 23245    end;
 23246  end;
 23247  
 23248  @ @<Declare subprocedures for |prefixed_command|@>=
 23249  procedure alter_page_so_far;
 23250  var c:0..7; {index into |page_so_far|}
 23251  begin c:=cur_chr; scan_optional_equals; scan_normal_dimen;
 23252  page_so_far[c]:=cur_val;
 23253  end;
 23254  
 23255  @ @<Declare subprocedures for |prefixed_command|@>=
 23256  procedure alter_integer;
 23257  var c:0..1; {0 for \.{\\deadcycles}, 1 for \.{\\insertpenalties}}
 23258  begin c:=cur_chr; scan_optional_equals; scan_int;
 23259  if c=0 then dead_cycles:=cur_val
 23260  else insert_penalties:=cur_val;
 23261  end;
 23262  
 23263  @ @<Declare subprocedures for |prefixed_command|@>=
 23264  procedure alter_box_dimen;
 23265  var c:small_number; {|width_offset| or |height_offset| or |depth_offset|}
 23266  @!b:eight_bits; {box number}
 23267  begin c:=cur_chr; scan_eight_bit_int; b:=cur_val; scan_optional_equals;
 23268  scan_normal_dimen;
 23269  if box(b)<>null then mem[box(b)+c].sc:=cur_val;
 23270  end;
 23271  
 23272  @ Paragraph shapes are set up in the obvious way.
 23273  
 23274  @<Assignments@>=
 23275  set_shape: begin scan_optional_equals; scan_int; n:=cur_val;
 23276    if n<=0 then p:=null
 23277    else  begin p:=get_node(2*n+1); info(p):=n;
 23278      for j:=1 to n do
 23279        begin scan_normal_dimen;
 23280        mem[p+2*j-1].sc:=cur_val; {indentation}
 23281        scan_normal_dimen;
 23282        mem[p+2*j].sc:=cur_val; {width}
 23283        end;
 23284      end;
 23285    define(par_shape_loc,shape_ref,p);
 23286    end;
 23287  
 23288  @ Here's something that isn't quite so obvious. It guarantees that
 23289  |info(par_shape_ptr)| can hold any positive~|n| for which |get_node(2*n+1)|
 23290  doesn't overflow the memory capacity.
 23291  
 23292  @<Check the ``constant''...@>=
 23293  if 2*max_halfword<mem_top-mem_min then bad:=41;
 23294  
 23295  @ New hyphenation data is loaded by the |hyph_data| command.
 23296  
 23297  @<Put each...@>=
 23298  primitive("hyphenation",hyph_data,0);
 23299  @!@:hyphenation_}{\.{\\hyphenation} primitive@>
 23300  primitive("patterns",hyph_data,1);
 23301  @!@:patterns_}{\.{\\patterns} primitive@>
 23302  
 23303  @ @<Cases of |print_cmd_chr|...@>=
 23304  hyph_data: if chr_code=1 then print_esc("patterns")
 23305    else print_esc("hyphenation");
 23306  
 23307  @ @<Assignments@>=
 23308  hyph_data: if cur_chr=1 then
 23309      begin @!init new_patterns; goto done;@;@+tini@/
 23310      print_err("Patterns can be loaded only by INITEX");
 23311  @.Patterns can be...@>
 23312      help0; error;
 23313      repeat get_token; until cur_cmd=right_brace; {flush the patterns}
 23314      return;
 23315      end
 23316    else  begin new_hyph_exceptions; goto done;
 23317      end;
 23318  
 23319  @ All of \TeX's parameters are kept in |eqtb| except the font information,
 23320  the interaction mode, and the hyphenation tables; these are strictly global.
 23321  
 23322  @<Assignments@>=
 23323  assign_font_dimen: begin find_font_dimen(true); k:=cur_val;
 23324    scan_optional_equals; scan_normal_dimen; font_info[k].sc:=cur_val;
 23325    end;
 23326  assign_font_int: begin n:=cur_chr; scan_font_ident; f:=cur_val;
 23327    scan_optional_equals; scan_int;
 23328    if n=0 then hyphen_char[f]:=cur_val@+else skew_char[f]:=cur_val;
 23329    end;
 23330  
 23331  @ @<Put each...@>=
 23332  primitive("hyphenchar",assign_font_int,0);
 23333  @!@:hyphen_char_}{\.{\\hyphenchar} primitive@>
 23334  primitive("skewchar",assign_font_int,1);
 23335  @!@:skew_char_}{\.{\\skewchar} primitive@>
 23336  
 23337  @ @<Cases of |print_cmd_chr|...@>=
 23338  assign_font_int: if chr_code=0 then print_esc("hyphenchar")
 23339    else print_esc("skewchar");
 23340  
 23341  @ Here is where the information for a new font gets loaded.
 23342  
 23343  @<Assignments@>=
 23344  def_font: new_font(a);
 23345  
 23346  @ @<Declare subprocedures for |prefixed_command|@>=
 23347  procedure new_font(@!a:small_number);
 23348  label common_ending;
 23349  var u:pointer; {user's font identifier}
 23350  @!s:scaled; {stated ``at'' size, or negative of scaled magnification}
 23351  @!f:internal_font_number; {runs through existing fonts}
 23352  @!t:str_number; {name for the frozen font identifier}
 23353  @!old_setting:0..max_selector; {holds |selector| setting}
 23354  @!flushable_string:str_number; {string not yet referenced}
 23355  begin if job_name=0 then open_log_file;
 23356    {avoid confusing \.{texput} with the font name}
 23357  @.texput@>
 23358  get_r_token; u:=cur_cs;
 23359  if u>=hash_base then t:=text(u)
 23360  else if u>=single_base then
 23361    if u=null_cs then t:="FONT"@+else t:=u-single_base
 23362  else  begin old_setting:=selector; selector:=new_string;
 23363    print("FONT"); print(u-active_base); selector:=old_setting;
 23364  @.FONTx@>
 23365    str_room(1); t:=make_string;
 23366    end;
 23367  define(u,set_font,null_font); scan_optional_equals; scan_file_name;
 23368  @<Scan the font size specification@>;
 23369  @<If this font has already been loaded, set |f| to the internal
 23370    font number and |goto common_ending|@>;
 23371  f:=read_font_info(u,cur_name,cur_area,s);
 23372  common_ending: equiv(u):=f; eqtb[font_id_base+f]:=eqtb[u]; font_id_text(f):=t;
 23373  end;
 23374  
 23375  @ @<Scan the font size specification@>=
 23376  name_in_progress:=true; {this keeps |cur_name| from being changed}
 23377  if scan_keyword("at") then @<Put the \(p)(positive) `at' size into |s|@>
 23378  @.at@>
 23379  else if scan_keyword("scaled") then
 23380  @.scaled@>
 23381    begin scan_int; s:=-cur_val;
 23382    if (cur_val<=0)or(cur_val>32768) then
 23383      begin print_err("Illegal magnification has been changed to 1000");@/
 23384  @.Illegal magnification...@>
 23385      help1("The magnification ratio must be between 1 and 32768.");
 23386      int_error(cur_val); s:=-1000;
 23387      end;
 23388    end
 23389  else s:=-1000;
 23390  name_in_progress:=false
 23391  
 23392  @ @<Put the \(p)(positive) `at' size into |s|@>=
 23393  begin scan_normal_dimen; s:=cur_val;
 23394  if (s<=0)or(s>=@'1000000000) then
 23395    begin print_err("Improper `at' size (");
 23396    print_scaled(s); print("pt), replaced by 10pt");
 23397  @.Improper `at' size...@>
 23398    help2("I can only handle fonts at positive sizes that are")@/
 23399    ("less than 2048pt, so I've changed what you said to 10pt.");
 23400    error; s:=10*unity;
 23401    end;
 23402  end
 23403  
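The size conventions used by the font-loading code above can be restated as a small Python sketch. This is only an illustration under stated assumptions: the function name `font_size_spec` and its arguments are hypothetical, keyword scanning is replaced by a plain string argument, and the error reports that the original issues are omitted. A positive result is an `at' size in scaled points; a negative result is the negative of a `scaled' magnification.

```python
UNITY = 2 ** 16            # one point, in scaled points
MAX_AT = 0o1000000000      # the octal constant @'1000000000, i.e. 2048pt

def font_size_spec(keyword=None, value=0):
    """Return s as described above (hypothetical helper, not part of tex.web).

    keyword is "at", "scaled", or None; value is the scanned dimension
    (in scaled points) or the scanned integer magnification.
    """
    if keyword == "at":
        if value <= 0 or value >= MAX_AT:
            return 10 * UNITY          # improper size, replaced by 10pt
        return value
    if keyword == "scaled":
        if value <= 0 or value > 32768:
            return -1000               # illegal magnification, changed to 1000
        return -value
    return -1000                       # no size given: magnification 1000
```

For instance, `font_size_spec("scaled", 1200)` yields `-1200`, while an `at' size of 2048pt or more falls back to `10 * UNITY`.
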
 23404  @ When the user gives a new identifier to a font that was previously loaded,
 23405  the new name becomes the font identifier of record. Font names `\.{xyz}' and
 23406  `\.{XYZ}' are considered to be different.
 23407  
 23408  @<If this font has already been loaded...@>=
 23409  flushable_string:=str_ptr-1;
 23410  for f:=font_base+1 to font_ptr do
 23411    if str_eq_str(font_name[f],cur_name)and str_eq_str(font_area[f],cur_area) then
 23412      begin if cur_name=flushable_string then
 23413        begin flush_string; cur_name:=font_name[f];
 23414        end;
 23415      if s>0 then
 23416        begin if s=font_size[f] then goto common_ending;
 23417        end
 23418      else if font_size[f]=xn_over_d(font_dsize[f],-s,1000) then
 23419        goto common_ending;
 23420      end
 23421  
 23422  @ @<Cases of |print_cmd_chr|...@>=
 23423  set_font:begin print("select font "); slow_print(font_name[chr_code]);
 23424    if font_size[chr_code]<>font_dsize[chr_code] then
 23425      begin print(" at "); print_scaled(font_size[chr_code]);
 23426      print("pt");
 23427      end;
 23428    end;
 23429  
 23430  @ @<Put each...@>=
 23431  primitive("batchmode",set_interaction,batch_mode);
 23432  @!@:batch_mode_}{\.{\\batchmode} primitive@>
 23433  primitive("nonstopmode",set_interaction,nonstop_mode);
 23434  @!@:nonstop_mode_}{\.{\\nonstopmode} primitive@>
 23435  primitive("scrollmode",set_interaction,scroll_mode);
 23436  @!@:scroll_mode_}{\.{\\scrollmode} primitive@>
 23437  primitive("errorstopmode",set_interaction,error_stop_mode);
 23438  @!@:error_stop_mode_}{\.{\\errorstopmode} primitive@>
 23439  
 23440  @ @<Cases of |print_cmd_chr|...@>=
 23441  set_interaction: case chr_code of
 23442    batch_mode: print_esc("batchmode");
 23443    nonstop_mode: print_esc("nonstopmode");
 23444    scroll_mode: print_esc("scrollmode");
 23445    othercases print_esc("errorstopmode")
 23446    endcases;
 23447  
 23448  @ @<Assignments@>=
 23449  set_interaction: new_interaction;
 23450  
 23451  @ @<Declare subprocedures for |prefixed_command|@>=
 23452  procedure new_interaction;
 23453  begin print_ln;
 23454  interaction:=cur_chr;
 23455  @<Initialize the print |selector| based on |interaction|@>;
 23456  if log_opened then selector:=selector+2;
 23457  end;
 23458  
 23459  @ The \.{\\afterassignment} command puts a token into the global
 23460  variable |after_token|. This global variable is examined just after
 23461  every assignment has been performed.
 23462  
 23463  @<Glob...@>=
 23464  @!after_token:halfword; {zero, or a saved token}
 23465  
 23466  @ @<Set init...@>=
 23467  after_token:=0;
 23468  
 23469  @ @<Cases of |main_control| that don't...@>=
 23470  any_mode(after_assignment):begin get_token; after_token:=cur_tok;
 23471    end;
 23472  
 23473  @ @<Insert a token saved by \.{\\afterassignment}, if any@>=
 23474  if after_token<>0 then
 23475    begin cur_tok:=after_token; back_input; after_token:=0;
 23476    end
 23477  
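The interplay between saving a token and reinserting it after the next assignment can be modeled in a few lines of Python. This is a minimal sketch, not a transcription: the class `Interpreter` and its list-based input are hypothetical, and only the names `after_token`, `get_token`, and `back_input` echo the routines above.

```python
class Interpreter:
    """Toy model of the after_token mechanism (illustrative only)."""

    def __init__(self, tokens):
        self.input = list(tokens)   # pending input, next token first
        self.after_token = 0        # zero, or a saved token
        self.done = []              # assignments performed so far

    def get_token(self):
        return self.input.pop(0)

    def back_input(self, tok):
        # Push a token back so that it is the next one to be read.
        self.input.insert(0, tok)

    def after_assignment(self):
        # Save the next token; an earlier saved token is simply replaced.
        self.after_token = self.get_token()

    def assign(self, name, value):
        # Perform the assignment, then insert the saved token, if any.
        self.done.append((name, value))
        if self.after_token != 0:
            self.back_input(self.after_token)
            self.after_token = 0
```

In this model the saved token reappears at the front of the input exactly once, after the first assignment that follows; later assignments leave the input alone.
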
 23478  @ Here is a procedure that might be called `Get the next non-blank non-relax
 23479  non-call non-assignment token'.
 23480  
 23481  @<Declare act...@>=
 23482  procedure do_assignments;
 23483  label exit;
 23484  begin loop begin @<Get the next non-blank non-relax...@>;
 23485    if cur_cmd<=max_non_prefixed_command then return;
 23486    set_box_allowed:=false; prefixed_command; set_box_allowed:=true;
 23487    end;
 23488  exit:end;
 23489  
 23490  @ @<Cases of |main_control| that don't...@>=
 23491  any_mode(after_group):begin get_token; save_for_after(cur_tok);
 23492    end;
 23493  
 23494  @ Files for \.{\\read} are opened and closed by the |in_stream| command.
 23495  
 23496  @<Put each...@>=
 23497  primitive("openin",in_stream,1);
 23498  @!@:open_in_}{\.{\\openin} primitive@>
 23499  primitive("closein",in_stream,0);
 23500  @!@:close_in_}{\.{\\closein} primitive@>
 23501  
 23502  @ @<Cases of |print_cmd_chr|...@>=
 23503  in_stream: if chr_code=0 then print_esc("closein")
 23504    else print_esc("openin");
 23505  
 23506  @ @<Cases of |main_control| that don't...@>=
 23507  any_mode(in_stream): open_or_close_in;
 23508  
 23509  @ @<Declare act...@>=
 23510  procedure open_or_close_in;
 23511  var c:0..1; {1 for \.{\\openin}, 0 for \.{\\closein}}
 23512  @!n:0..15; {stream number}
 23513  begin c:=cur_chr; scan_four_bit_int; n:=cur_val;
 23514  if read_open[n]<>closed then
 23515    begin a_close(read_file[n]); read_open[n]:=closed;
 23516    end;
 23517  if c<>0 then
 23518    begin scan_optional_equals; scan_file_name;
 23519    if cur_ext="" then cur_ext:=".tex";
 23520    pack_cur_name;
 23521    if a_open_in(read_file[n]) then read_open[n]:=just_open;
 23522    end;
 23523  end;
 23524  
 23525  @ The user can issue messages to the terminal, regardless of the
 23526  current mode.
 23527  
 23528  @<Cases of |main_control| that don't...@>=
 23529  any_mode(message):issue_message;
 23530  
 23531  @ @<Put each...@>=
 23532  primitive("message",message,0);
 23533  @!@:message_}{\.{\\message} primitive@>
 23534  primitive("errmessage",message,1);
 23535  @!@:err_message_}{\.{\\errmessage} primitive@>
 23536  
 23537  @ @<Cases of |print_cmd_chr|...@>=
 23538  message: if chr_code=0 then print_esc("message")
 23539    else print_esc("errmessage");
 23540  
 23541  @ @<Declare act...@>=
 23542  procedure issue_message;
 23543  var old_setting:0..max_selector; {holds |selector| setting}
 23544  @!c:0..1; {identifies \.{\\message} and \.{\\errmessage}}
 23545  @!s:str_number; {the message}
 23546  begin c:=cur_chr; link(garbage):=scan_toks(false,true);
 23547  old_setting:=selector; selector:=new_string;
 23548  token_show(def_ref); selector:=old_setting;
 23549  flush_list(def_ref);
 23550  str_room(1); s:=make_string;
 23551  if c=0 then @<Print string |s| on the terminal@>
 23552  else @<Print string |s| as an error message@>;
 23553  flush_string;
 23554  end;
 23555  
 23556  @ @<Print string |s| on the terminal@>=
 23557  begin if term_offset+length(s)>max_print_line-2 then print_ln
 23558  else if (term_offset>0)or(file_offset>0) then print_char(" ");
 23559  slow_print(s); update_terminal;
 23560  end
 23561  
 23562  @ If \.{\\errmessage} occurs often in |scroll_mode|, without user-defined
 23563  \.{\\errhelp}, we don't want to give a long help message each time. So we
 23564  give a verbose explanation only once.
 23565  
 23566  @<Glob...@>=
 23567  @!long_help_seen:boolean; {has the long \.{\\errmessage} help been used?}
 23568  
 23569  @ @<Set init...@>=long_help_seen:=false;
 23570  
 23571  @ @<Print string |s| as an error message@>=
 23572  begin print_err(""); slow_print(s);
 23573  if err_help<>null then use_err_help:=true
 23574  else if long_help_seen then help1("(That was another \errmessage.)")
 23575  else  begin if interaction<error_stop_mode then long_help_seen:=true;
 23576    help4("This error message was generated by an \errmessage")@/
 23577    ("command, so I can't give any explicit help.")@/
 23578    ("Pretend that you're Hercule Poirot: Examine all clues,")@/
 23579  @^Poirot, Hercule@>
 23580    ("and deduce the truth by order and method.");
 23581    end;
 23582  error; use_err_help:=false;
 23583  end
 23584  
 23585  @ The |error| routine calls on |give_err_help| if help is requested from
 23586  the |err_help| parameter.
 23587  
 23588  @p procedure give_err_help;
 23589  begin token_show(err_help);
 23590  end;
 23591  
 23592  @ The \.{\\uppercase} and \.{\\lowercase} commands are implemented by
 23593  building a token list and then changing the cases of the letters in it.
 23594  
 23595  @<Cases of |main_control| that don't...@>=
 23596  any_mode(case_shift):shift_case;
 23597  
 23598  @ @<Put each...@>=
 23599  primitive("lowercase",case_shift,lc_code_base);
 23600  @!@:lowercase_}{\.{\\lowercase} primitive@>
 23601  primitive("uppercase",case_shift,uc_code_base);
 23602  @!@:uppercase_}{\.{\\uppercase} primitive@>
 23603  
 23604  @ @<Cases of |print_cmd_chr|...@>=
 23605  case_shift:if chr_code=lc_code_base then print_esc("lowercase")
 23606    else print_esc("uppercase");
 23607  
 23608  @ @<Declare act...@>=
 23609  procedure shift_case;
 23610  var b:pointer; {|lc_code_base| or |uc_code_base|}
 23611  @!p:pointer; {runs through the token list}
 23612  @!t:halfword; {token}
 23613  @!c:eight_bits; {character code}
 23614  begin b:=cur_chr; p:=scan_toks(false,false); p:=link(def_ref);
 23615  while p<>null do
 23616    begin @<Change the case of the token in |p|, if a change is appropriate@>;
 23617    p:=link(p);
 23618    end;
 23619  back_list(link(def_ref)); free_avail(def_ref); {omit reference count}
 23620  end;
 23621  
 23622  @ When the case of a |chr_code| changes, we don't change the |cmd|.
 23623  We also change active characters, using the fact that
 23624  |cs_token_flag+active_base| is a multiple of~256.
 23625  @^data structure assumptions@>
 23626  
 23627  @<Change the case of the token in |p|, if a change is appropriate@>=
 23628  t:=info(p);
 23629  if t<cs_token_flag+single_base then
 23630    begin c:=t mod 256;
 23631    if equiv(b+c)<>0 then info(p):=t-c+equiv(b+c);
 23632    end
 23633  
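The arithmetic behind the case change can be checked with a short Python sketch. The constant values below are assumptions made for illustration (they are believed to match definitions elsewhere in the program, but nothing here depends on that beyond the multiple-of-256 property), and `shift_token` with its dictionary argument is a hypothetical stand-in for the lookup `equiv(b+c)`.

```python
# Assumed values, for illustration only.
CS_TOKEN_FLAG = 0o7777              # assumed: added to a control sequence location
ACTIVE_BASE = 1                     # assumed: first location of active characters
SINGLE_BASE = ACTIVE_BASE + 256     # assumed: first single-character control sequence

# The data structure assumption that the section relies on.
assert (CS_TOKEN_FLAG + ACTIVE_BASE) % 256 == 0

def shift_token(t, code_table):
    """Return token t with its character code remapped, if appropriate.

    code_table maps a character code to its replacement; a missing entry
    or a value of 0 means "leave the token unchanged".
    """
    if t < CS_TOKEN_FLAG + SINGLE_BASE:
        c = t % 256                     # low byte is the character code
        new_c = code_table.get(c, 0)
        if new_c != 0:
            return t - c + new_c        # the high part (the cmd) is untouched
    return t
```

Because the flag plus the active base is a multiple of 256, the low byte of an active-character token is its character code, so the same subtraction and addition serve both ordinary character tokens and active characters.
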
 23634  @ We come finally to the last pieces missing from |main_control|, namely the
 23635  `\.{\\show}' commands that are useful when debugging.
 23636  
 23637  @<Cases of |main_control| that don't...@>=
 23638  any_mode(xray): show_whatever;
 23639  
 23640  @ @d show_code=0 { \.{\\show} }
 23641  @d show_box_code=1 { \.{\\showbox} }
 23642  @d show_the_code=2 { \.{\\showthe} }
 23643  @d show_lists_code=3 { \.{\\showlists} }
 23644  
 23645  @<Put each...@>=
 23646  primitive("show",xray,show_code);
 23647  @!@:show_}{\.{\\show} primitive@>
 23648  primitive("showbox",xray,show_box_code);
 23649  @!@:show_box_}{\.{\\showbox} primitive@>
 23650  primitive("showthe",xray,show_the_code);
 23651  @!@:show_the_}{\.{\\showthe} primitive@>
 23652  primitive("showlists",xray,show_lists_code);
 23653  @!@:show_lists_code_}{\.{\\showlists} primitive@>
 23654  
 23655  @ @<Cases of |print_cmd_chr|...@>=
 23656  xray: case chr_code of
 23657    show_box_code:print_esc("showbox");
 23658    show_the_code:print_esc("showthe");
 23659    show_lists_code:print_esc("showlists");
 23660    othercases print_esc("show")
 23661    endcases;
 23662  
 23663  @ @<Declare act...@>=
 23664  procedure show_whatever;
 23665  label common_ending;
 23666  var p:pointer; {tail of a token list to show}
 23667  begin case cur_chr of
 23668  show_lists_code: begin begin_diagnostic; show_activities;
 23669    end;
 23670  show_box_code: @<Show the current contents of a box@>;
 23671  show_code: @<Show the current meaning of a token, then |goto common_ending|@>;
 23672  othercases @<Show the current value of some parameter or register,
 23673    then |goto common_ending|@>
 23674  endcases;@/
 23675  @<Complete a potentially long \.{\\show} command@>;
 23676  common_ending: if interaction<error_stop_mode then
 23677    begin help0; decr(error_count);
 23678    end
 23679  else if tracing_online>0 then
 23680    begin@t@>@;@/
 23681    help3("This isn't an error message; I'm just \showing something.")@/
 23682    ("Type `I\show...' to show more (e.g., \show\cs,")@/
 23683    ("\showthe\count10, \showbox255, \showlists).");
 23684    end
 23685  else  begin@t@>@;@/
 23686    help5("This isn't an error message; I'm just \showing something.")@/
 23687    ("Type `I\show...' to show more (e.g., \show\cs,")@/
 23688    ("\showthe\count10, \showbox255, \showlists).")@/
 23689    ("And type `I\tracingonline=1\show...' to show boxes and")@/
 23690    ("lists on your terminal as well as in the transcript file.");
 23691    end;
 23692  error;
 23693  end;
 23694  
 23695  @ @<Show the current meaning of a token...@>=
 23696  begin get_token;
 23697  if interaction=error_stop_mode then wake_up_terminal;
 23698  print_nl("> ");
 23699  if cur_cs<>0 then
 23700    begin sprint_cs(cur_cs); print_char("=");
 23701    end;
 23702  print_meaning; goto common_ending;
 23703  end
 23704  
 23705  @ @<Cases of |print_cmd_chr|...@>=
 23706  undefined_cs: print("undefined");
 23707  call: print("macro");
 23708  long_call: print_esc("long macro");
 23709  outer_call: print_esc("outer macro");
 23710  long_outer_call: begin print_esc("long"); print_esc("outer macro");
 23711    end;
 23712  end_template: print_esc("outer endtemplate");
 23713  
 23714  @ @<Show the current contents of a box@>=
 23715  begin scan_eight_bit_int; begin_diagnostic;
 23716  print_nl("> \box"); print_int(cur_val); print_char("=");
 23717  if box(cur_val)=null then print("void")
 23718  else show_box(box(cur_val));
 23719  end
 23720  
 23721  @ @<Show the current value of some parameter...@>=
 23722  begin p:=the_toks;
 23723  if interaction=error_stop_mode then wake_up_terminal;
 23724  print_nl("> "); token_show(temp_head);
 23725  flush_list(link(temp_head)); goto common_ending;
 23726  end
 23727  
 23728  @ @<Complete a potentially long \.{\\show} command@>=
 23729  end_diagnostic(true); print_err("OK");
 23730  @.OK@>
 23731  if selector=term_and_log then if tracing_online<=0 then
 23732    begin selector:=term_only; print(" (see the transcript file)");
 23733    selector:=term_and_log;
 23734    end
 23735  
 23736  @* \[50] Dumping and undumping the tables.
 23737  After \.{INITEX} has seen a collection of fonts and macros, it
 23738  can write all the necessary information on an auxiliary file so
 23739  that production versions of \TeX\ are able to initialize their
 23740  memory at high speed. The present section of the program takes
 23741  care of such output and input. We shall consider simultaneously
 23742  the processes of storing and restoring,
 23743  so that the inverse relation between them is clear.
 23744  @.INITEX@>
 23745  
 23746  The global variable |format_ident| is a string that is printed right
 23747  after the |banner| line when \TeX\ is ready to start. For \.{INITEX} this
 23748  string says simply `\.{ (INITEX)}'; for other versions of \TeX\ it says,
 23749  for example, `\.{ (preloaded format=plain 1982.11.19)}', showing the year,
 23750  month, and day that the format file was created. We have |format_ident=0|
 23751  before \TeX's tables are loaded.
 23752  
 23753  @<Glob...@>=
 23754  @!format_ident:str_number;
 23755  
 23756  @ @<Set init...@>=
 23757  format_ident:=0;
 23758  
 23759  @ @<Initialize table entries...@>=
 23760  format_ident:=" (INITEX)";
 23761  
 23762  @ @<Declare act...@>=
 23763  @!init procedure store_fmt_file;
 23764  label found1,found2,done1,done2;
 23765  var j,@!k,@!l:integer; {all-purpose indices}
 23766  @!p,@!q: pointer; {all-purpose pointers}
 23767  @!x: integer; {something to dump}
 23768  @!w: four_quarters; {four ASCII codes}
 23769  begin @<If dumping is not allowed, abort@>;
 23770  @<Create the |format_ident|, open the format file,
 23771    and inform the user that dumping has begun@>;
 23772  @<Dump constants for consistency check@>;
 23773  @<Dump the string pool@>;
 23774  @<Dump the dynamic memory@>;
 23775  @<Dump the table of equivalents@>;
 23776  @<Dump the font information@>;
 23777  @<Dump the hyphenation tables@>;
 23778  @<Dump a couple more things and the closing check word@>;
 23779  @<Close the format file@>;
 23780  end;
 23781  tini
 23782  
 23783  @ Corresponding to the procedure that dumps a format file, we have a function
 23784  that reads one in. The function returns |false| if the dumped format is
 23785  incompatible with the present \TeX\ table sizes, etc.
 23786  
 23787  @d bad_fmt=6666 {go here if the format file is unacceptable}
 23788  @d too_small(#)==begin wake_up_terminal;
 23789    wterm_ln('---! Must increase the ',#);
 23790  @.Must increase the x@>
 23791    goto bad_fmt;
 23792    end
 23793  
 23794  @p @t\4@>@<Declare the function called |open_fmt_file|@>@;
 23795  function load_fmt_file:boolean;
 23796  label bad_fmt,exit;
 23797  var j,@!k:integer; {all-purpose indices}
 23798  @!p,@!q: pointer; {all-purpose pointers}
 23799  @!x: integer; {something undumped}
 23800  @!w: four_quarters; {four ASCII codes}
 23801  begin @<Undump constants for consistency check@>;
 23802  @<Undump the string pool@>;
 23803  @<Undump the dynamic memory@>;
 23804  @<Undump the table of equivalents@>;
 23805  @<Undump the font information@>;
 23806  @<Undump the hyphenation tables@>;
 23807  @<Undump a couple more things and the closing check word@>;
 23808  load_fmt_file:=true; return; {it worked!}
 23809  bad_fmt: wake_up_terminal;
 23810    wterm_ln('(Fatal format file error; I''m stymied)');
 23811  @.Fatal format file error@>
 23812  load_fmt_file:=false;
 23813  exit:end;
 23814  
 23815  @ The user is not allowed to dump a format file unless |save_ptr=0|.
 23816  This condition implies that |cur_level=level_one|, hence
 23817  the |xeq_level| array is constant and it need not be dumped.
 23818  
 23819  @<If dumping is not allowed, abort@>=
 23820  if save_ptr<>0 then
 23821    begin print_err("You can't dump inside a group");
 23822  @.You can't dump...@>
 23823    help1("`{...\dump}' is a no-no."); succumb;
 23824    end
 23825  
 23826  @ Format files consist of |memory_word| items, and we use the following
 23827  macros to dump words of different types:
 23828  
 23829  @d dump_wd(#)==begin fmt_file^:=#; put(fmt_file);@+end
 23830  @d dump_int(#)==begin fmt_file^.int:=#; put(fmt_file);@+end
 23831  @d dump_hh(#)==begin fmt_file^.hh:=#; put(fmt_file);@+end
 23832  @d dump_qqqq(#)==begin fmt_file^.qqqq:=#; put(fmt_file);@+end
 23833  
 23834  @<Glob...@>=
 23835  @!fmt_file:word_file; {for input or output of format information}
 23836  
 23837  @ The inverse macros are slightly more complicated, since we need to check
 23838  the range of the values we are reading in. We say `|undump(a)(b)(x)|' to
 23839  read an integer value |x| that is supposed to be in the range |a<=x<=b|.
 23840  System error messages should be suppressed when undumping.
 23841  @^system dependencies@>
 23842  
 23843  @d undump_wd(#)==begin get(fmt_file); #:=fmt_file^;@+end
 23844  @d undump_int(#)==begin get(fmt_file); #:=fmt_file^.int;@+end
 23845  @d undump_hh(#)==begin get(fmt_file); #:=fmt_file^.hh;@+end
 23846  @d undump_qqqq(#)==begin get(fmt_file); #:=fmt_file^.qqqq;@+end
 23847  @d undump_end_end(#)==#:=x;@+end
 23848  @d undump_end(#)==(x>#) then goto bad_fmt@+else undump_end_end
 23849  @d undump(#)==begin undump_int(x); if (x<#) or undump_end
 23850  @d undump_size_end_end(#)==too_small(#)@+else undump_end_end
 23851  @d undump_size_end(#)==if x># then undump_size_end_end
 23852  @d undump_size(#)==begin undump_int(x);
 23853    if x<# then goto bad_fmt; undump_size_end
 23854  
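The control flow hidden in the nested undump macros may be easier to follow in a conventional language. The following Python sketch is an approximation under stated assumptions: the exception `BadFormat` stands in for `goto bad_fmt`, the format file is modeled as an iterator of integers, and both function signatures are hypothetical.

```python
class BadFormat(Exception):
    """Stands in for the jump to bad_fmt."""

def undump(words, lo, hi):
    """Read the next integer from the iterator; require lo <= x <= hi."""
    x = next(words)
    if x < lo or x > hi:
        raise BadFormat(f"value {x} outside {lo}..{hi}")
    return x

def undump_size(words, lo, hi, what):
    """Like undump, but a value above hi names the capacity to increase."""
    x = next(words)
    if x < lo:
        raise BadFormat(f"value {x} below {lo}")
    if x > hi:
        # Corresponds to too_small(what) followed by the jump to bad_fmt.
        raise BadFormat(f"---! Must increase the {what}")
    return x
```

A call such as `undump_size(words, 0, pool_size, 'string pool size')` thus mirrors the macro call `undump_size(0)(pool_size)('string pool size')(pool_ptr)`, with the result returned instead of stored into the last argument.
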
 23855  @ The next few sections of the program should make it clear how we use the
 23856  dump/undump macros.
 23857  
 23858  @<Dump constants for consistency check@>=
 23859  dump_int(@$);@/
 23860  dump_int(mem_bot);@/
 23861  dump_int(mem_top);@/
 23862  dump_int(eqtb_size);@/
 23863  dump_int(hash_prime);@/
 23864  dump_int(hyph_size)
 23865  
 23866  @ Sections of a \.{WEB} program that are ``commented out'' still contribute
 23867  strings to the string pool; therefore \.{INITEX} and \TeX\ will have
 23868  the same strings. (And it is, of course, a good thing that they do.)
 23869  @.WEB@>
 23870  @^string pool@>
 23871  
 23872  @<Undump constants for consistency check@>=
 23873  x:=fmt_file^.int;
 23874  if x<>@$ then goto bad_fmt; {check that strings are the same}
 23875  undump_int(x);
 23876  if x<>mem_bot then goto bad_fmt;
 23877  undump_int(x);
 23878  if x<>mem_top then goto bad_fmt;
 23879  undump_int(x);
 23880  if x<>eqtb_size then goto bad_fmt;
 23881  undump_int(x);
 23882  if x<>hash_prime then goto bad_fmt;
 23883  undump_int(x);
 23884  if x<>hyph_size then goto bad_fmt
 23885  
 23886  @ @d dump_four_ASCII==
 23887    w.b0:=qi(so(str_pool[k])); w.b1:=qi(so(str_pool[k+1]));
 23888    w.b2:=qi(so(str_pool[k+2])); w.b3:=qi(so(str_pool[k+3]));
 23889    dump_qqqq(w)
 23890  
 23891  @<Dump the string pool@>=
 23892  dump_int(pool_ptr);
 23893  dump_int(str_ptr);
 23894  for k:=0 to str_ptr do dump_int(str_start[k]);
 23895  k:=0;
 23896  while k+4<pool_ptr do
 23897    begin dump_four_ASCII; k:=k+4;
 23898    end;
 23899  k:=pool_ptr-4; dump_four_ASCII;
 23900  print_ln; print_int(str_ptr); print(" strings of total length ");
 23901  print_int(pool_ptr)
 23902  
 23903  @ @d undump_four_ASCII==
 23904    undump_qqqq(w);
 23905    str_pool[k]:=si(qo(w.b0)); str_pool[k+1]:=si(qo(w.b1));
 23906    str_pool[k+2]:=si(qo(w.b2)); str_pool[k+3]:=si(qo(w.b3))
 23907  
 23908  @<Undump the string pool@>=
 23909  undump_size(0)(pool_size)('string pool size')(pool_ptr);
 23910  undump_size(0)(max_strings)('max strings')(str_ptr);
 23911  for k:=0 to str_ptr do undump(0)(pool_ptr)(str_start[k]);
 23912  k:=0;
 23913  while k+4<pool_ptr do
 23914    begin undump_four_ASCII; k:=k+4;
 23915    end;
 23916  k:=pool_ptr-4; undump_four_ASCII;
 23917  init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr
 23918  
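The string pool is written four characters to a word, and the final word is taken from the last four positions, so it may overlap the word before it. A minimal Python sketch of that packing follows; the function names are hypothetical, words are modeled as 4-tuples, and the pool is assumed to hold at least four characters.

```python
def dump_pool(pool):
    """Pack the codes in pool four to a word; requires len(pool) >= 4."""
    words = []
    k = 0
    while k + 4 < len(pool):
        words.append(tuple(pool[k:k + 4]))
        k += 4
    k = len(pool) - 4                  # the final word may overlap the previous one
    words.append(tuple(pool[k:k + 4]))
    return words

def undump_pool(words, pool_ptr):
    """Inverse of dump_pool, given the dumped pool length."""
    pool = [0] * pool_ptr
    it = iter(words)
    k = 0
    while k + 4 < pool_ptr:
        pool[k:k + 4] = next(it)
        k += 4
    k = pool_ptr - 4                   # rewrite the tail; overlapping codes agree
    pool[k:k + 4] = next(it)
    return pool
```

With a pool of ten codes, for example, three words are written: positions 0 to 3, 4 to 7, and 6 to 9. The overlap costs at most three redundant codes and avoids any padding.
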
 23919  @ By sorting the list of available spaces in the variable-size portion of
 23920  |mem|, we are usually able to get by without having to dump very much
 23921  of the dynamic memory.
 23922  
 23923  We recompute |var_used| and |dyn_used|, so that \.{INITEX} dumps valid
 23924  information even when it has not been gathering statistics.
 23925  
 23926  @<Dump the dynamic memory@>=
 23927  sort_avail; var_used:=0;
 23928  dump_int(lo_mem_max); dump_int(rover);
 23929  p:=mem_bot; q:=rover; x:=0;
 23930  repeat for k:=p to q+1 do dump_wd(mem[k]);
 23931  x:=x+q+2-p; var_used:=var_used+q-p;
 23932  p:=q+node_size(q); q:=rlink(q);
 23933  until q=rover;
 23934  var_used:=var_used+lo_mem_max-p; dyn_used:=mem_end+1-hi_mem_min;@/
 23935  for k:=p to lo_mem_max do dump_wd(mem[k]);
 23936  x:=x+lo_mem_max+1-p;
 23937  dump_int(hi_mem_min); dump_int(avail);
 23938  for k:=hi_mem_min to mem_end do dump_wd(mem[k]);
 23939  x:=x+mem_end+1-hi_mem_min;
 23940  p:=avail;
 23941  while p<>null do
 23942    begin decr(dyn_used); p:=link(p);
 23943    end;
 23944  dump_int(var_used); dump_int(dyn_used);
 23945  print_ln; print_int(x);
 23946  print(" memory locations dumped; current usage is ");
 23947  print_int(var_used); print_char("&"); print_int(dyn_used)
 23948  
 23949  @ @<Undump the dynamic memory@>=
 23950  undump(lo_mem_stat_max+1000)(hi_mem_stat_min-1)(lo_mem_max);
 23951  undump(lo_mem_stat_max+1)(lo_mem_max)(rover);
 23952  p:=mem_bot; q:=rover;
 23953  repeat for k:=p to q+1 do undump_wd(mem[k]);
 23954  p:=q+node_size(q);
 23955  if (p>lo_mem_max)or((q>=rlink(q))and(rlink(q)<>rover)) then goto bad_fmt;
 23956  q:=rlink(q);
 23957  until q=rover;
 23958  for k:=p to lo_mem_max do undump_wd(mem[k]);
 23959  if mem_min<mem_bot-2 then {make more low memory available}
 23960    begin p:=llink(rover); q:=mem_min+1;
 23961    link(mem_min):=null; info(mem_min):=null; {we don't use the bottom word}
 23962    rlink(p):=q; llink(rover):=q;@/
 23963    rlink(q):=rover; llink(q):=p; link(q):=empty_flag;
 23964    node_size(q):=mem_bot-q;
 23965    end;
 23966  undump(lo_mem_max+1)(hi_mem_stat_min)(hi_mem_min);
 23967  undump(null)(mem_top)(avail); mem_end:=mem_top;
 23968  for k:=hi_mem_min to mem_end do undump_wd(mem[k]);
 23969  undump_int(var_used); undump_int(dyn_used)
 23970  
 23971  @ @<Dump the table of equivalents@>=
 23972  @<Dump regions 1 to 4 of |eqtb|@>;
 23973  @<Dump regions 5 and 6 of |eqtb|@>;
 23974  dump_int(par_loc); dump_int(write_loc);@/
 23975  @<Dump the hash table@>
 23976  
 23977  @ @<Undump the table of equivalents@>=
 23978  @<Undump regions 1 to 6 of |eqtb|@>;
 23979  undump(hash_base)(frozen_control_sequence)(par_loc);
 23980  par_token:=cs_token_flag+par_loc;@/
 23981  undump(hash_base)(frozen_control_sequence)(write_loc);@/
 23982  @<Undump the hash table@>
 23983  
 23984  @ The table of equivalents usually contains repeated information, so we dump it
 23985  in compressed form: The sequence of $n+2$ values $(n,x_1,\ldots,x_n,m)$ in the
 23986  format file represents $n+m$ consecutive entries of |eqtb|, with |m| extra
 23987  copies of $x_n$, namely $(x_1,\ldots,x_n,x_n,\ldots,x_n)$.
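
The scheme just described can be sketched in a few lines of Python; this is an illustration only, not part of \TeX. The names `rle_dump` and `rle_undump` are hypothetical, and plain Python values stand in for both the counts and the |eqtb| words.

```python
def rle_dump(table):
    """Encode table as repeated groups (n, x_1..x_n, m)."""
    out = []
    k, end = 0, len(table)
    while k < end:
        j = k
        # Find the first j whose successor is an identical entry.
        while j < end - 1 and table[j] != table[j + 1]:
            j += 1
        lit_end = j + 1               # entries k..j are written literally
        # Extend j to the last entry of the run of copies.
        while j < end - 1 and table[j] == table[j + 1]:
            j += 1
        out.append(lit_end - k)       # n
        out.extend(table[k:lit_end])  # x_1 .. x_n
        k = j + 1
        out.append(k - lit_end)       # m extra copies of x_n
    return out


def rle_undump(words, size):
    """Rebuild a table of the given size, rejecting inconsistent counts."""
    table, pos = [], 0
    while len(table) < size:
        n = words[pos]; pos += 1
        if n < 1 or len(table) + n > size:
            raise ValueError("bad format")
        table.extend(words[pos:pos + n]); pos += n
        m = words[pos]; pos += 1
        if m < 0 or len(table) + m > size:
            raise ValueError("bad format")
        table.extend([table[-1]] * m)
    return table
```

For example, the seven entries 7, 7, 7, 3, 4, 4, 9 are encoded as the groups (1, 7, 2), (2, 3, 4, 1), and (1, 9, 0). The range checks in `rle_undump` mirror the ones that send the undumping code to |bad_fmt|.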
 23988  
 23989  @<Dump regions 1 to 4 of |eqtb|@>=
 23990  k:=active_base;
 23991  repeat j:=k;
 23992  while j<int_base-1 do
 23993    begin if (equiv(j)=equiv(j+1))and(eq_type(j)=eq_type(j+1))and@|
 23994      (eq_level(j)=eq_level(j+1)) then goto found1;
 23995    incr(j);
 23996    end;
 23997  l:=int_base; goto done1; {|j=int_base-1|}
 23998  found1: incr(j); l:=j;
 23999  while j<int_base-1 do
 24000    begin if (equiv(j)<>equiv(j+1))or(eq_type(j)<>eq_type(j+1))or@|
 24001      (eq_level(j)<>eq_level(j+1)) then goto done1;
 24002    incr(j);
 24003    end;
 24004  done1:dump_int(l-k);
 24005  while k<l do
 24006    begin dump_wd(eqtb[k]); incr(k);
 24007    end;
 24008  k:=j+1; dump_int(k-l);
 24009  until k=int_base
 24010  
 24011  @ @<Dump regions 5 and 6 of |eqtb|@>=
 24012  repeat j:=k;
 24013  while j<eqtb_size do
 24014    begin if eqtb[j].int=eqtb[j+1].int then goto found2;
 24015    incr(j);
 24016    end;
 24017  l:=eqtb_size+1; goto done2; {|j=eqtb_size|}
 24018  found2: incr(j); l:=j;
 24019  while j<eqtb_size do
 24020    begin if eqtb[j].int<>eqtb[j+1].int then goto done2;
 24021    incr(j);
 24022    end;
 24023  done2:dump_int(l-k);
 24024  while k<l do
 24025    begin dump_wd(eqtb[k]); incr(k);
 24026    end;
 24027  k:=j+1; dump_int(k-l);
 24028  until k>eqtb_size
 24029  
 24030  @ @<Undump regions 1 to 6 of |eqtb|@>=
 24031  k:=active_base;
 24032  repeat undump_int(x);
 24033  if (x<1)or(k+x>eqtb_size+1) then goto bad_fmt;
 24034  for j:=k to k+x-1 do undump_wd(eqtb[j]);
 24035  k:=k+x;
 24036  undump_int(x);
 24037  if (x<0)or(k+x>eqtb_size+1) then goto bad_fmt;
 24038  for j:=k to k+x-1 do eqtb[j]:=eqtb[k-1];
 24039  k:=k+x;
 24040  until k>eqtb_size
 24041  
 24042  @ A different scheme is used to compress the hash table, since its lower
 24043  region is usually sparse. When |text(p)<>0| for |p<=hash_used|, we output
 24044  two words, |p| and |hash[p]|. The hash table is, of course, densely packed
 24045  for |p>=hash_used|, so the remaining entries are output in a~block.
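
A minimal Python sketch of this two-part layout follows; it is an illustration, not part of \TeX. Empty slots are modeled as `None` instead of |text(p)=0|, each entry is an opaque value instead of a two-halfword record, the parameter `top` plays the role of |undefined_control_sequence|, and |cs_count| is left out. The reader stops when it reaches |hash_used|, so the sketch assumes that slot is occupied.

```python
def dump_hash(table, hash_base, hash_used, top):
    """Write hash_used, then (p, entry) pairs for the sparse part,
    then the densely packed entries as one block."""
    out = [hash_used]
    for p in range(hash_base, hash_used + 1):
        if table[p] is not None:
            out.append(p)
            out.append(table[p])
    out.extend(table[hash_used + 1:top])
    return out


def undump_hash(words, hash_base, top):
    """Inverse of dump_hash; returns (hash_used, table)."""
    it = iter(words)
    hash_used = next(it)
    table = [None] * top
    p = hash_base - 1
    while p != hash_used:
        q = next(it)
        # Positions must increase and stay within the sparse region.
        if not (p + 1 <= q <= hash_used):
            raise ValueError("bad format")
        p = q
        table[p] = next(it)
    for p in range(hash_used + 1, top):
        table[p] = next(it)
    return hash_used, table
```

The sparse part costs two words per occupied slot, which pays off only while fewer than half of the slots up to |hash_used| are occupied; the dense part costs one word per slot.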
 24046  
 24047  @<Dump the hash table@>=
 24048  dump_int(hash_used); cs_count:=frozen_control_sequence-1-hash_used;
 24049  for p:=hash_base to hash_used do if text(p)<>0 then
 24050    begin dump_int(p); dump_hh(hash[p]); incr(cs_count);
 24051    end;
 24052  for p:=hash_used+1 to undefined_control_sequence-1 do dump_hh(hash[p]);
 24053  dump_int(cs_count);@/
 24054  print_ln; print_int(cs_count); print(" multiletter control sequences")
 24055  
 24056  @ @<Undump the hash table@>=
 24057  undump(hash_base)(frozen_control_sequence)(hash_used); p:=hash_base-1;
 24058  repeat undump(p+1)(hash_used)(p); undump_hh(hash[p]);
 24059  until p=hash_used;
 24060  for p:=hash_used+1 to undefined_control_sequence-1 do undump_hh(hash[p]);
 24061  undump_int(cs_count)
 24062  
 24063  @ @<Dump the font information@>=
 24064  dump_int(fmem_ptr);
 24065  for k:=0 to fmem_ptr-1 do dump_wd(font_info[k]);
 24066  dump_int(font_ptr);
 24067  for k:=null_font to font_ptr do
 24068    @<Dump the array info for internal font number |k|@>;
 24069  print_ln; print_int(fmem_ptr-7); print(" words of font info for ");
 24070  print_int(font_ptr-font_base); print(" preloaded font");
 24071  if font_ptr<>font_base+1 then print_char("s")
 24072  
 24073  @ @<Undump the font information@>=
 24074  undump_size(7)(font_mem_size)('font mem size')(fmem_ptr);
 24075  for k:=0 to fmem_ptr-1 do undump_wd(font_info[k]);
 24076  undump_size(font_base)(font_max)('font max')(font_ptr);
 24077  for k:=null_font to font_ptr do
 24078    @<Undump the array info for internal font number |k|@>
 24079  
 24080  @ @<Dump the array info for internal font number |k|@>=
 24081  begin dump_qqqq(font_check[k]);
 24082  dump_int(font_size[k]);
 24083  dump_int(font_dsize[k]);
 24084  dump_int(font_params[k]);@/
 24085  dump_int(hyphen_char[k]);
 24086  dump_int(skew_char[k]);@/
 24087  dump_int(font_name[k]);
 24088  dump_int(font_area[k]);@/
 24089  dump_int(font_bc[k]);
 24090  dump_int(font_ec[k]);@/
 24091  dump_int(char_base[k]);
 24092  dump_int(width_base[k]);
 24093  dump_int(height_base[k]);@/
 24094  dump_int(depth_base[k]);
 24095  dump_int(italic_base[k]);
 24096  dump_int(lig_kern_base[k]);@/
 24097  dump_int(kern_base[k]);
 24098  dump_int(exten_base[k]);
 24099  dump_int(param_base[k]);@/
 24100  dump_int(font_glue[k]);@/
 24101  dump_int(bchar_label[k]);
 24102  dump_int(font_bchar[k]);
 24103  dump_int(font_false_bchar[k]);@/
 24104  print_nl("\font"); print_esc(font_id_text(k)); print_char("=");
 24105  print_file_name(font_name[k],font_area[k],"");
 24106  if font_size[k]<>font_dsize[k] then
 24107    begin print(" at "); print_scaled(font_size[k]); print("pt");
 24108    end;
 24109  end
 24110  
 24111  @ @<Undump the array info for internal font number |k|@>=
 24112  begin undump_qqqq(font_check[k]);@/
 24113  undump_int(font_size[k]);
 24114  undump_int(font_dsize[k]);
 24115  undump(min_halfword)(max_halfword)(font_params[k]);@/
 24116  undump_int(hyphen_char[k]);
 24117  undump_int(skew_char[k]);@/
 24118  undump(0)(str_ptr)(font_name[k]);
 24119  undump(0)(str_ptr)(font_area[k]);@/
 24120  undump(0)(255)(font_bc[k]);
 24121  undump(0)(255)(font_ec[k]);@/
 24122  undump_int(char_base[k]);
 24123  undump_int(width_base[k]);
 24124  undump_int(height_base[k]);@/
 24125  undump_int(depth_base[k]);
 24126  undump_int(italic_base[k]);
 24127  undump_int(lig_kern_base[k]);@/
 24128  undump_int(kern_base[k]);
 24129  undump_int(exten_base[k]);
 24130  undump_int(param_base[k]);@/
 24131  undump(min_halfword)(lo_mem_max)(font_glue[k]);@/
 24132  undump(0)(fmem_ptr-1)(bchar_label[k]);
 24133  undump(min_quarterword)(non_char)(font_bchar[k]);
 24134  undump(min_quarterword)(non_char)(font_false_bchar[k]);
 24135  end
 24136  
 24137  @ @<Dump the hyphenation tables@>=
 24138  dump_int(hyph_count);
 24139  for k:=0 to hyph_size do if hyph_word[k]<>0 then
 24140    begin dump_int(k); dump_int(hyph_word[k]); dump_int(hyph_list[k]);
 24141    end;
 24142  print_ln; print_int(hyph_count); print(" hyphenation exception");
 24143  if hyph_count<>1 then print_char("s");
 24144  if trie_not_ready then init_trie;
 24145  dump_int(trie_max);
 24146  for k:=0 to trie_max do dump_hh(trie[k]);
 24147  dump_int(trie_op_ptr);
 24148  for k:=1 to trie_op_ptr do
 24149    begin dump_int(hyf_distance[k]);
 24150    dump_int(hyf_num[k]);
 24151    dump_int(hyf_next[k]);
 24152    end;
 24153  print_nl("Hyphenation trie of length "); print_int(trie_max);
 24154  @.Hyphenation trie...@>
 24155  print(" has "); print_int(trie_op_ptr); print(" op");
 24156  if trie_op_ptr<>1 then print_char("s");
 24157  print(" out of "); print_int(trie_op_size);
 24158  for k:=255 downto 0 do if trie_used[k]>min_quarterword then
 24159    begin print_nl("  "); print_int(qo(trie_used[k]));
 24160    print(" for language "); print_int(k);
 24161    dump_int(k); dump_int(qo(trie_used[k]));
 24162    end
 24163  
 24164  @ Only ``nonempty'' parts of |op_start| need to be restored.
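
The restoration of |op_start| can be sketched in Python as follows; this is an illustration only, and the function name is hypothetical. It assumes what the dumping code above does: one (language, count) pair is written for each language that has hyphenation ops, in decreasing order of language number, and the counts add up to the total number of ops.

```python
def restore_op_start(total_ops, pairs):
    """Rebuild the starting offsets from (language, count) pairs
    listed in decreasing order of language."""
    op_start = {}
    remaining = total_ops
    prev_lang = 256
    it = iter(pairs)
    while remaining > 0:
        try:
            lang, count = next(it)
        except StopIteration:
            raise ValueError("bad format") from None
        if not (0 <= lang < prev_lang):      # languages must strictly decrease
            raise ValueError("bad format")
        if not (1 <= count <= remaining):    # counts may not exceed what is left
            raise ValueError("bad format")
        remaining -= count
        op_start[lang] = remaining           # ops of lower languages precede this one
        prev_lang = lang
    return op_start
```

Because the pairs arrive in decreasing order, subtracting each count from the running total leaves exactly the number of ops that belong to lower-numbered languages, which is the offset wanted. Languages that never appear keep whatever value they had before, which is why only the nonempty parts need restoring.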
 24165  
 24166  @<Undump the hyphenation tables@>=
 24167  undump(0)(hyph_size)(hyph_count);
 24168  for k:=1 to hyph_count do
 24169    begin undump(0)(hyph_size)(j);
 24170    undump(0)(str_ptr)(hyph_word[j]);
 24171    undump(min_halfword)(max_halfword)(hyph_list[j]);
 24172    end;
 24173  undump_size(0)(trie_size)('trie size')(j); @+init trie_max:=j;@+tini
 24174  for k:=0 to j do undump_hh(trie[k]);
 24175  undump_size(0)(trie_op_size)('trie op size')(j); @+init trie_op_ptr:=j;@+tini
 24176  for k:=1 to j do
 24177    begin undump(0)(63)(hyf_distance[k]); {a |small_number|}
 24178    undump(0)(63)(hyf_num[k]);
 24179    undump(min_quarterword)(max_quarterword)(hyf_next[k]);
 24180    end;
 24181  init for k:=0 to 255 do trie_used[k]:=min_quarterword;@+tini@;@/
 24182  k:=256;
 24183  while j>0 do
 24184    begin undump(0)(k-1)(k); undump(1)(j)(x);@+init trie_used[k]:=qi(x);@+tini@;@/
 24185    j:=j-x; op_start[k]:=qo(j);
 24186    end;
 24187  @!init trie_not_ready:=false @+tini
 24188  
 24189  @ We have already printed a lot of statistics, so we set |tracing_stats:=0|
 24190  to prevent them from appearing again.
 24191  
 24192  @<Dump a couple more things and the closing check word@>=
 24193  dump_int(interaction); dump_int(format_ident); dump_int(69069);
 24194  tracing_stats:=0
 24195  
 24196  @ @<Undump a couple more things and the closing check word@>=
 24197  undump(batch_mode)(error_stop_mode)(interaction);
 24198  undump(0)(str_ptr)(format_ident);
 24199  undump_int(x);
 24200  if (x<>69069)or eof(fmt_file) then goto bad_fmt
 24201  
 24202  @ @<Create the |format_ident|...@>=
 24203  selector:=new_string;
 24204  print(" (preloaded format="); print(job_name); print_char(" ");
 24205  print_int(year); print_char(".");
 24206  print_int(month); print_char("."); print_int(day); print_char(")");
 24207  if interaction=batch_mode then selector:=log_only
 24208  else selector:=term_and_log;
 24209  str_room(1);
 24210  format_ident:=make_string;
 24211  pack_job_name(format_extension);
 24212  while not w_open_out(fmt_file) do
 24213    prompt_file_name("format file name",format_extension);
 24214  print_nl("Beginning to dump on file ");
 24215  @.Beginning to dump...@>
 24216  slow_print(w_make_name_string(fmt_file)); flush_string;
 24217  print_nl(""); slow_print(format_ident)
 24218  
 24219  @ @<Close the format file@>=
 24220  w_close(fmt_file)
 24221  
 24222  @* \[51] The main program.
 24223  This is it: the part of \TeX\ that executes all those procedures we have
 24224  written.
 24225  
 24226  Well---almost. Let's leave space for a few more routines that we may
 24227  have forgotten.
 24228  
 24229  @p @<Last-minute procedures@>
 24230  
 24231  @ We have noted that there are two versions of \TeX82. One, called \.{INITEX},
 24232  @.INITEX@>
 24233  has to be run first; it initializes everything from scratch, without
 24234  reading a format file, and it has the capability of dumping a format file.
 24235  The other one is called `\.{VIRTEX}'; it is a ``virgin'' program that needs
 24236  @.VIRTEX@>
 24237  to input a format file in order to get started. \.{VIRTEX} typically has
 24238  more memory capacity than \.{INITEX}, because it does not need the space
 24239  consumed by the auxiliary hyphenation tables and the numerous calls on
 24240  |primitive|, etc.
 24241  
 24242  The \.{VIRTEX} program cannot read a format file instantaneously, of course;
 24243  the best implementations therefore allow for production versions of \TeX\ that
 24244  not only avoid the loading routine for \PASCAL\ object code, they also have
 24245  a format file pre-loaded. This is impossible to do if we stick to standard
 24246  \PASCAL; but there is a simple way to fool many systems into avoiding the
 24247  initialization, as follows:\quad(1)~We declare a global integer variable
 24248  called |ready_already|. The probability is negligible that this
 24249  variable holds any particular value like 314159 when \.{VIRTEX} is first
 24250  loaded.\quad(2)~After we have read in a format file and initialized
 24251  everything, we set |ready_already:=314159|.\quad(3)~Soon \.{VIRTEX}
 24252  will print `\.*', waiting for more input; and at this point we
 24253  interrupt the program and save its core image in some form that the
 24254  operating system can reload speedily.\quad(4)~When that core image is
 24255  activated, the program starts again at the beginning; but now
 24256  |ready_already=314159| and all the other global variables have
 24257  their initial values too. The former chastity has vanished!
 24258  
 24259  In other words, if we allow ourselves to test the condition
 24260  |ready_already=314159|, before |ready_already| has been
 24261  assigned a value, we can avoid the lengthy initialization. Dirty tricks
 24262  rarely pay off so handsomely.
 24263  @^dirty \PASCAL@>
 24264  @^system dependencies@>
 24265  
 24266  On systems that allow such preloading, the standard program called \.{TeX}
 24267  should be the one that has \.{plain} format preloaded, since that agrees
 24268  with {\sl The \TeX book}. Other versions, e.g., \.{AmSTeX}, should also
 24269  @:TeXbook}{\sl The \TeX book@>
 24270  @.AmSTeX@>
 24271  @.plain@>
 24272  be provided for commonly used formats.
 24273  
 24274  @<Glob...@>=
 24275  @!ready_already:integer; {a sacrifice of purity for economy}
 24276  
 24277  @ Now this is really it: \TeX\ starts and ends here.
 24278  
 24279  The initial test involving |ready_already| should be deleted if the
 24280  \PASCAL\ runtime system is smart enough to detect such a ``mistake.''
 24281  @^system dependencies@>
 24282  
 24283  @p begin @!{|start_here|}
 24284  history:=fatal_error_stop; {in case we quit during initialization}
 24285  t_open_out; {open the terminal for output}
 24286  if ready_already=314159 then goto start_of_TEX;
 24287  @<Check the ``constant'' values...@>@;
 24288  if bad>0 then
 24289    begin wterm_ln('Ouch---my internal constants have been clobbered!',
 24290      '---case ',bad:1);
 24291  @.Ouch...clobbered@>
 24292    goto final_end;
 24293    end;
 24294  initialize; {set global variables to their starting values}
 24295  @!init if not get_strings_started then goto final_end;
 24296  init_prim; {call |primitive| for each primitive}
 24297  init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr; fix_date_and_time;
 24298  tini@/
 24299  ready_already:=314159;
 24300  start_of_TEX: @<Initialize the output routines@>;
 24301  @<Get the first line of input and prepare to start@>;
 24302  history:=spotless; {ready to go!}
 24303  main_control; {come to life}
 24304  final_cleanup; {prepare for death}
 24305  end_of_TEX: close_files_and_terminate;
 24306  final_end: ready_already:=0;
 24307  end.
 24308  
 24309  @ Here we do whatever is needed to complete \TeX's job gracefully on the
 24310  local operating system. The code here might come into play after a fatal
 24311  error; it must therefore consist entirely of ``safe'' operations that
 24312  cannot produce error messages. For example, it would be a mistake to call
 24313  |str_room| or |make_string| at this time, because a call on |overflow|
 24314  might lead to an infinite loop.
 24315  @^system dependencies@>
 24316  (Actually there's one way to get error messages, via |prepare_mag|;
 24317  but that can't cause infinite recursion.)
 24318  @^recursion@>
 24319  
 24320  If |final_cleanup| is bypassed, this program doesn't bother to close
 24321  the input files that may still be open.
 24322  
 24323  @<Last-minute...@>=
 24324  procedure close_files_and_terminate;
 24325  var k:integer; {all-purpose index}
 24326  begin @<Finish the extensions@>; new_line_char:=-1;
 24327  @!stat if tracing_stats>0 then @<Output statistics about this job@>;@;@+tats@/
 24328  wake_up_terminal; @<Finish the \.{DVI} file@>;
 24329  if log_opened then
 24330    begin wlog_cr; a_close(log_file); selector:=selector-2;
 24331    if selector=term_only then
 24332      begin print_nl("Transcript written on ");
 24333  @.Transcript written...@>
 24334      slow_print(log_name); print_char(".");
 24335      end;
 24336    end;
 24337  end;
 24338  
 24339  @ The present section goes directly to the log file instead of using
 24340  |print| commands, because there's no need for these strings to take
 24341  up |str_pool| memory when a non-{\bf stat} version of \TeX\ is being used.
 24342  
 24343  @<Output statistics...@>=
 24344  if log_opened then
 24345    begin wlog_ln(' ');
 24346    wlog_ln('Here is how much of TeX''s memory',' you used:');
 24347  @.Here is how much...@>
 24348    wlog(' ',str_ptr-init_str_ptr:1,' string');
 24349    if str_ptr<>init_str_ptr+1 then wlog('s');
 24350    wlog_ln(' out of ', max_strings-init_str_ptr:1);@/
 24351    wlog_ln(' ',pool_ptr-init_pool_ptr:1,' string characters out of ',
 24352      pool_size-init_pool_ptr:1);@/
 24353    wlog_ln(' ',lo_mem_max-mem_min+mem_end-hi_mem_min+2:1,@|
 24354      ' words of memory out of ',mem_end+1-mem_min:1);@/
 24355    wlog_ln(' ',cs_count:1,' multiletter control sequences out of ',
 24356      hash_size:1);@/
 24357    wlog(' ',fmem_ptr:1,' words of font info for ',
 24358      font_ptr-font_base:1,' font');
 24359    if font_ptr<>font_base+1 then wlog('s');
 24360    wlog_ln(', out of ',font_mem_size:1,' for ',font_max-font_base:1);@/
 24361    wlog(' ',hyph_count:1,' hyphenation exception');
 24362    if hyph_count<>1 then wlog('s');
 24363    wlog_ln(' out of ',hyph_size:1);@/
 24364    wlog_ln(' ',max_in_stack:1,'i,',max_nest_stack:1,'n,',@|
 24365      max_param_stack:1,'p,',@|
 24366      max_buf_stack+1:1,'b,',@|
 24367      max_save_stack+6:1,'s stack positions out of ',@|
 24368      stack_size:1,'i,',
 24369      nest_size:1,'n,',
 24370      param_size:1,'p,',
 24371      buf_size:1,'b,',
 24372      save_size:1,'s');
 24373    end
 24374  
 24375  @ We get to the |final_cleanup| routine when \.{\\end} or \.{\\dump} has
 24376  been scanned and |its_all_over|\kern-2pt.
 24377  
 24378  @<Last-minute...@>=
 24379  procedure final_cleanup;
 24380  label exit;
 24381  var c:small_number; {0 for \.{\\end}, 1 for \.{\\dump}}
 24382  begin c:=cur_chr; if c<>1 then new_line_char:=-1;
 24383  if job_name=0 then open_log_file;
 24384  while input_ptr>0 do
 24385    if state=token_list then end_token_list@+else end_file_reading;
 24386  while open_parens>0 do
 24387    begin print(" )"); decr(open_parens);
 24388    end;
 24389  if cur_level>level_one then
 24390    begin print_nl("("); print_esc("end occurred ");
 24391    print("inside a group at level ");
 24392  @:end_}{\.{(\\end occurred...)}@>
 24393    print_int(cur_level-level_one); print_char(")");
 24394    end;
 24395  while cond_ptr<>null do
 24396    begin print_nl("("); print_esc("end occurred ");
 24397    print("when "); print_cmd_chr(if_test,cur_if);
 24398    if if_line<>0 then
 24399      begin print(" on line "); print_int(if_line);
 24400      end;
 24401    print(" was incomplete)");
 24402    if_line:=if_line_field(cond_ptr);
 24403    cur_if:=subtype(cond_ptr); temp_ptr:=cond_ptr;
 24404    cond_ptr:=link(cond_ptr); free_node(temp_ptr,if_node_size);
 24405    end;
 24406  if history<>spotless then
 24407   if ((history=warning_issued)or(interaction<error_stop_mode)) then
 24408    if selector=term_and_log then
 24409    begin selector:=term_only;
 24410    print_nl("(see the transcript file for additional information)");
 24411  @.see the transcript file...@>
 24412    selector:=term_and_log;
 24413    end;
 24414  if c=1 then
 24415    begin @!init for c:=top_mark_code to split_bot_mark_code do
 24416      if cur_mark[c]<>null then delete_token_ref(cur_mark[c]);
 24417    if last_glue<>max_halfword then delete_glue_ref(last_glue);
 24418    store_fmt_file; return;@+tini@/
 24419    print_nl("(\dump is performed only by INITEX)"); return;
 24420  @:dump_}{\.{\\dump...only by INITEX}@>
 24421    end;
 24422  exit:end;
 24423  
 24424  @ @<Last-minute...@>=
 24425  @!init procedure init_prim; {initialize all the primitives}
 24426  begin no_new_control_sequence:=false;
 24427  @<Put each...@>;
 24428  no_new_control_sequence:=true;
 24429  end;
 24430  tini
 24431  
 24432  @ When we begin the following code, \TeX's tables may still contain garbage;
 24433  the strings might not even be present. Thus we must proceed cautiously to get
 24434  bootstrapped in.
 24435  
 24436  But when we finish this part of the program, \TeX\ is ready to call on the
 24437  |main_control| routine to do its work.
 24438  
 24439  @<Get the first line...@>=
 24440  begin @<Initialize the input routines@>;
 24441  if (format_ident=0)or(buffer[loc]="&") then
 24442    begin if format_ident<>0 then initialize; {erase preloaded format}
 24443    if not open_fmt_file then goto final_end;
 24444    if not load_fmt_file then
 24445      begin w_close(fmt_file); goto final_end;
 24446      end;
 24447    w_close(fmt_file);
 24448    while (loc<limit)and(buffer[loc]=" ") do incr(loc);
 24449    end;
 24450  if end_line_char_inactive then decr(limit)
 24451  else  buffer[limit]:=end_line_char;
 24452  fix_date_and_time;@/
 24453  @<Compute the magic offset@>;
 24454  @<Initialize the print |selector|...@>;
 24455  if (loc<limit)and(cat_code(buffer[loc])<>escape) then start_input;
 24456    {\.{\\input} assumed}
 24457  end
 24458  
 24459  @* \[52] Debugging.
 24460  Once \TeX\ is working, you should be able to diagnose most errors with
 24461  the \.{\\show} commands and other diagnostic features. But for the initial
 24462  stages of debugging, and for the revelation of really deep mysteries, you
 24463  can compile \TeX\ with a few more aids, including the \PASCAL\ runtime
 24464  checks and its debugger. An additional routine called |debug_help|
 24465  will also come into play when you type `\.D' after an error message;
 24466  |debug_help| also occurs just before a fatal error causes \TeX\ to succumb.
 24467  @^debugging@>
 24468  @^system dependencies@>
 24469  
 24470  The interface to |debug_help| is primitive, but it is good enough when used
 24471  with a \PASCAL\ debugger that allows you to set breakpoints and to read
 24472  variables and change their values. After getting the prompt `\.{debug \#}', you
 24473  type either a negative number (this exits |debug_help|), or zero (this
 24474  goes to a location where you can set a breakpoint, thereby entering into
 24475  dialog with the \PASCAL\ debugger), or a positive number |m| followed by
 24476  an argument |n|. The meaning of |m| and |n| will be clear from the
 24477  program below. (If |m=13|, there is an additional argument, |l|.)
 24478  @.debug \#@>
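
The input convention just described can be summarized by a small Python sketch; it is an illustration only, not the routine itself. The hypothetical function `parse_debug_commands` takes the numbers a user might type and groups them the way |debug_help| would read them, without performing any of the numbered actions.

```python
def parse_debug_commands(numbers):
    """Group typed numbers into debug commands.

    A negative m ends the session, m == 0 stands for the breakpoint,
    and a positive m is followed by n (and by l when m == 13).
    """
    it = iter(numbers)
    commands = []
    for m in it:
        if m < 0:
            break                      # exit from the debugging dialog
        if m == 0:
            commands.append(("breakpoint",))
            continue
        n = next(it)                   # every positive m takes an argument n
        if m == 13:
            l = next(it)               # case 13 reads one more argument
            commands.append((m, n, l))
        else:
            commands.append((m, n))
    return commands
```

Anything typed after the negative number is left unread, just as it would be once the routine has returned.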
 24479  
 24480  @d breakpoint=888 {place where a breakpoint is desirable}
 24481  
 24482  @<Last-minute...@>=
 24483  @!debug procedure debug_help; {routine to display various things}
 24484  label breakpoint,exit;
 24485  var k,@!l,@!m,@!n:integer;
 24486  begin clear_terminal;
 24487    loop begin wake_up_terminal;
 24488    print_nl("debug # (-1 to exit):"); update_terminal;
 24489  @.debug \#@>
 24490    read(term_in,m);
 24491    if m<0 then return
 24492    else if m=0 then
 24493      begin goto breakpoint;@/ {go to every declared label at least once}
 24494      breakpoint: m:=0; @{'BREAKPOINT'@}@/
 24495      end
 24496    else  begin read(term_in,n);
 24497      case m of
 24498      @t\4@>@<Numbered cases for |debug_help|@>@;
 24499      othercases print("?")
 24500      endcases;
 24501      end;
 24502    end;
 24503  exit:end;
 24504  gubed
 24505  
 24506  @ @<Numbered cases...@>=
 24507  1: print_word(mem[n]); {display |mem[n]| in all forms}
 24508  2: print_int(info(n));
 24509  3: print_int(link(n));
 24510  4: print_word(eqtb[n]);
 24511  5: print_word(font_info[n]);
 24512  6: print_word(save_stack[n]);
 24513  7: show_box(n);
 24514    {show a box, abbreviated by |show_box_depth| and |show_box_breadth|}
 24515  8: begin breadth_max:=10000; depth_threshold:=pool_size-pool_ptr-10;
 24516    show_node_list(n); {show a box in its entirety}
 24517    end;
 24518  9: show_token_list(n,null,1000);
 24519  10: slow_print(n);
 24520  11: check_mem(n>0); {check wellformedness; print new busy locations if |n>0|}
 24521  12: search_mem(n); {look for pointers to |n|}
 24522  13: begin read(term_in,l); print_cmd_chr(n,l);
 24523    end;
 24524  14: for k:=0 to n do print(buffer[k]);
 24525  15: begin font_in_short_display:=null_font; short_display(n);
 24526    end;
 24527  16: panicking:=not panicking;
 24528  
 24529  @* \[53] Extensions.
 24530  The program above includes a bunch of ``hooks'' that allow further
 24531  capabilities to be added without upsetting \TeX's basic structure.
 24532  Most of these hooks are concerned with ``whatsit'' nodes, which are
 24533  intended to be used for special purposes; whenever a new extension to
 24534  \TeX\ involves a new kind of whatsit node, a corresponding change needs
 24535  to be made to the routines below that deal with such nodes,
 24536  but it will usually be unnecessary to make many changes to the
 24537  other parts of this program.
 24538  
 24539  In order to demonstrate how extensions can be made, we shall treat
 24540  `\.{\\write}', `\.{\\openout}', `\.{\\closeout}', `\.{\\immediate}',
 24541  `\.{\\special}', and `\.{\\setlanguage}' as if they were extensions.
 24542  These commands are actually primitives of \TeX, and they should
 24543  appear in all implementations of the system; but let's try to imagine
 24544  that they aren't. Then the program below illustrates how a person
 24545  could add them.
 24546  
 24547  Sometimes, of course, an extension will require changes to \TeX\ itself;
 24548  no system of hooks could be complete enough for all conceivable extensions.
 24549  The features associated with `\.{\\write}' are almost all confined to the
 24550  following paragraphs, but there are small parts of the |print_ln| and
 24551  |print_char| procedures that were introduced specifically to \.{\\write}
 24552  characters. Furthermore one of the token lists recognized by the scanner
 24553  is a |write_text|; and there are a few other miscellaneous places where we
 24554  have already provided for some aspect of \.{\\write}.  The goal of a \TeX\
 24555  extender should be to minimize alterations to the standard parts of the
 24556  program, and to avoid them completely if possible. He or she should also
 24557  be quite sure that there's no easy way to accomplish the desired goals
 24558  with the standard features that \TeX\ already has. ``Think thrice before
 24559  extending,'' because that may save a lot of work, and it will also keep
 24560  incompatible extensions of \TeX\ from proliferating.
 24561  @^system dependencies@>
 24562  @^extensions to \TeX@>
 24563  
 24564  @ First let's consider the format of whatsit nodes that are used to represent
 24565  the data associated with \.{\\write} and its relatives. Recall that a whatsit
 24566  has |type=whatsit_node|, and the |subtype| is supposed to distinguish
 24567  different kinds of whatsits. Each node occupies two or more words; the
 24568  exact number is immaterial, as long as it is readily determined from the
 24569  |subtype| or other data.
 24570  
 24571  We shall introduce five |subtype| values here, corresponding to the
 24572  control sequences \.{\\openout}, \.{\\write}, \.{\\closeout}, \.{\\special}, and
 24573  \.{\\setlanguage}. The second word of I/O whatsits has a |write_stream| field
 24574  that identifies the write-stream number (0 to 15, or 16 for out-of-range and
 24575  positive, or 17 for out-of-range and negative).
 24576  In the case of \.{\\write} and \.{\\special}, there is also a field that
 24577  points to the reference count of a token list that should be sent. In the
 24578  case of \.{\\openout}, we need three words and three auxiliary subfields
 24579  to hold the string numbers for name, area, and extension.
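
The stream numbering just described amounts to the following mapping, shown here as a minimal Python sketch with a hypothetical function name; it is an illustration, not part of \TeX.

```python
def normalize_write_stream(n):
    """Map an arbitrary integer to a stream number in 0..17."""
    if n < 0:
        return 17      # out of range and negative
    if n > 15:
        return 16      # out of range and positive
    return n           # one of the sixteen real streams
```

Keeping two distinct out-of-range codes lets later code tell a negative stream number from a large one, even though neither can ever refer to an open file.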
 24580  
 24581  @d write_node_size=2 {number of words in a write/whatsit node}
 24582  @d open_node_size=3 {number of words in an open/whatsit node}
 24583  @d open_node=0 {|subtype| in whatsits that represent files to \.{\\openout}}
 24584  @d write_node=1 {|subtype| in whatsits that represent things to \.{\\write}}
 24585  @d close_node=2 {|subtype| in whatsits that represent streams to \.{\\closeout}}
 24586  @d special_node=3 {|subtype| in whatsits that represent \.{\\special} things}
 24587  @d language_node=4 {|subtype| in whatsits that change the current language}
 24588  @d what_lang(#)==link(#+1) {language number, in the range |0..255|}
 24589  @d what_lhm(#)==type(#+1) {minimum left fragment, in the range |1..63|}
 24590  @d what_rhm(#)==subtype(#+1) {minimum right fragment, in the range |1..63|}
 24591  @d write_tokens(#) == link(#+1) {reference count of token list to write}
 24592  @d write_stream(#) == info(#+1) {stream number (0 to 17)}
 24593  @d open_name(#) == link(#+1) {string number of file name to open}
 24594  @d open_area(#) == info(#+2) {string number of file area for |open_name|}
 24595  @d open_ext(#) == link(#+2) {string number of file extension for |open_name|}
 24596  
 24597  @ The sixteen possible \.{\\write} streams are represented by the |write_file|
 24598  array. The |j|th file is open if and only if |write_open[j]=true|. The last
 24599  two streams are special; |write_open[16]| represents a stream number
 24600  greater than 15, while |write_open[17]| represents a negative stream number,
 24601  and both of these variables are always |false|.
 24602  
 24603  @<Glob...@>=
 24604  @!write_file:array[0..15] of alpha_file;
 24605  @!write_open:array[0..17] of boolean;
 24606  
 24607  @ @<Set init...@>=
 24608  for k:=0 to 17 do write_open[k]:=false;
 24609  
 24610  @ Extensions might introduce new command codes; but it's best to use
 24611  |extension| with a modifier, whenever possible, so that |main_control|
 24612  stays the same.
 24613  
 24614  @d immediate_code=4 {command modifier for \.{\\immediate}}
 24615  @d set_language_code=5 {command modifier for \.{\\setlanguage}}
 24616  
 24617  @<Put each...@>=
 24618  primitive("openout",extension,open_node);@/
 24619  @!@:open_out_}{\.{\\openout} primitive@>
 24620  primitive("write",extension,write_node); write_loc:=cur_val;@/
 24621  @!@:write_}{\.{\\write} primitive@>
 24622  primitive("closeout",extension,close_node);@/
 24623  @!@:close_out_}{\.{\\closeout} primitive@>
 24624  primitive("special",extension,special_node);@/
 24625  @!@:special_}{\.{\\special} primitive@>
 24626  primitive("immediate",extension,immediate_code);@/
 24627  @!@:immediate_}{\.{\\immediate} primitive@>
 24628  primitive("setlanguage",extension,set_language_code);@/
 24629  @!@:set_language_}{\.{\\setlanguage} primitive@>
 24630  
 24631  @ The variable |write_loc| just introduced is used to provide an
 24632  appropriate error message in case of ``runaway'' write texts.
 24633  
 24634  @<Glob...@>=
 24635  @!write_loc:pointer; {|eqtb| address of \.{\\write}}
 24636  
 24637  @ @<Cases of |print_cmd_chr|...@>=
 24638  extension: case chr_code of
 24639    open_node:print_esc("openout");
 24640    write_node:print_esc("write");
 24641    close_node:print_esc("closeout");
 24642    special_node:print_esc("special");
 24643    immediate_code:print_esc("immediate");
 24644    set_language_code:print_esc("setlanguage");
 24645    othercases print("[unknown extension!]")
 24646    endcases;
 24647  
 24648  @ When an |extension| command occurs in |main_control|, in any mode,
 24649  the |do_extension| routine is called.
 24650  
 24651  @<Cases of |main_control| that are for extensions...@>=
 24652  any_mode(extension):do_extension;
 24653  
 24654  @ @<Declare act...@>=
 24655  @t\4@>@<Declare procedures needed in |do_extension|@>@;
procedure do_extension;
var i,@!j,@!k:integer; {all-purpose integers}
@!p,@!q,@!r:pointer; {all-purpose pointers}
begin case cur_chr of
open_node:@<Implement \.{\\openout}@>;
write_node:@<Implement \.{\\write}@>;
close_node:@<Implement \.{\\closeout}@>;
special_node:@<Implement \.{\\special}@>;
immediate_code:@<Implement \.{\\immediate}@>;
set_language_code:@<Implement \.{\\setlanguage}@>;
othercases confusion("ext1")
@:this can't happen ext1}{\quad ext1@>
endcases;
end;

@ Here is a subroutine that creates a whatsit node having a given |subtype|
and a given number of words. It initializes only the first word of the whatsit,
and appends it to the current list.

@<Declare procedures needed in |do_extension|@>=
procedure new_whatsit(@!s:small_number;@!w:small_number);
var p:pointer; {the new node}
begin p:=get_node(w); type(p):=whatsit_node; subtype(p):=s;
link(tail):=p; tail:=p;
end;

@ The next subroutine uses |cur_chr| to decide what sort of whatsit is
involved, and also inserts a |write_stream| number.

@<Declare procedures needed in |do_ext...@>=
procedure new_write_whatsit(@!w:small_number);
begin new_whatsit(cur_chr,w);
if w<>write_node_size then scan_four_bit_int
else  begin scan_int;
  if cur_val<0 then cur_val:=17
  else if cur_val>15 then cur_val:=16;
  end;
write_stream(tail):=cur_val;
end;

@ @<Implement \.{\\openout}@>=
begin new_write_whatsit(open_node_size);
scan_optional_equals; scan_file_name;@/
open_name(tail):=cur_name; open_area(tail):=cur_area; open_ext(tail):=cur_ext;
end

@ When `\.{\\write 12\{...\}}' appears, we scan the token list `\.{\{...\}}'
without expanding its macros; the macros will be expanded later when this
token list is rescanned.

@<Implement \.{\\write}@>=
begin k:=cur_cs; new_write_whatsit(write_node_size);@/
cur_cs:=k; p:=scan_toks(false,false); write_tokens(tail):=def_ref;
end

@ @<Implement \.{\\closeout}@>=
begin new_write_whatsit(write_node_size); write_tokens(tail):=null;
end

@ When `\.{\\special\{...\}}' appears, we expand the macros in the token
list as in \.{\\xdef} and \.{\\mark}.

@<Implement \.{\\special}@>=
begin new_whatsit(special_node,write_node_size); write_stream(tail):=null;
p:=scan_toks(false,true); write_tokens(tail):=def_ref;
end

@ Each new type of node that appears in our data structure must be capable
of being displayed, copied, destroyed, and so on. The routines that we
need for write-oriented whatsits are somewhat like those for mark nodes;
other extensions might, of course, involve more subtlety here.

@<Basic printing...@>=
procedure print_write_whatsit(@!s:str_number;@!p:pointer);
begin print_esc(s);
if write_stream(p)<16 then print_int(write_stream(p))
else if write_stream(p)=16 then print_char("*")
@.*\relax@>
else print_char("-");
end;

@ @<Display the whatsit...@>=
case subtype(p) of
open_node:begin print_write_whatsit("openout",p);
  print_char("="); print_file_name(open_name(p),open_area(p),open_ext(p));
  end;
write_node:begin print_write_whatsit("write",p);
  print_mark(write_tokens(p));
  end;
close_node:print_write_whatsit("closeout",p);
special_node:begin print_esc("special");
  print_mark(write_tokens(p));
  end;
language_node:begin print_esc("setlanguage");
  print_int(what_lang(p)); print(" (hyphenmin ");
  print_int(what_lhm(p)); print_char(",");
  print_int(what_rhm(p)); print_char(")");
  end;
othercases print("whatsit?")
endcases

@ @<Make a partial copy of the whatsit...@>=
case subtype(p) of
open_node: begin r:=get_node(open_node_size); words:=open_node_size;
  end;
write_node,special_node: begin r:=get_node(write_node_size);
  add_token_ref(write_tokens(p)); words:=write_node_size;
  end;
close_node,language_node: begin r:=get_node(small_node_size);
  words:=small_node_size;
  end;
othercases confusion("ext2")
@:this can't happen ext2}{\quad ext2@>
endcases

@ @<Wipe out the whatsit...@>=
begin case subtype(p) of
open_node: free_node(p,open_node_size);
write_node,special_node: begin delete_token_ref(write_tokens(p));
  free_node(p,write_node_size); goto done;
  end;
close_node,language_node: free_node(p,small_node_size);
othercases confusion("ext3")
@:this can't happen ext3}{\quad ext3@>
endcases;@/
goto done;
end

@ @<Incorporate a whatsit node into a vbox@>=do_nothing

@ @<Incorporate a whatsit node into an hbox@>=do_nothing

@ @<Let |d| be the width of the whatsit |p|@>=d:=0

@ @d adv_past(#)==@+if subtype(#)=language_node then
    begin cur_lang:=what_lang(#); l_hyf:=what_lhm(#); r_hyf:=what_rhm(#);@+end

@<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>=@+
adv_past(cur_p)

@ @<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>=@+
adv_past(s)

@ @<Prepare to move whatsit |p| to the current page, then |goto contribute|@>=
goto contribute

@ @<Process whatsit |p| in |vert_break| loop, |goto not_found|@>=
goto not_found

@ @<Output the whatsit node |p| in a vlist@>=
out_what(p)

@ @<Output the whatsit node |p| in an hlist@>=
out_what(p)

@ After all this preliminary shuffling, we come finally to the routines
that actually send out the requested data. Let's do \.{\\special} first
(it's easier).

@<Declare procedures needed in |hlist_out|, |vlist_out|@>=
procedure special_out(@!p:pointer);
var old_setting:0..max_selector; {holds print |selector|}
@!k:pool_pointer; {index into |str_pool|}
begin synch_h; synch_v;@/
old_setting:=selector; selector:=new_string;
show_token_list(link(write_tokens(p)),null,pool_size-pool_ptr);
selector:=old_setting;
str_room(1);
if cur_length<256 then
  begin dvi_out(xxx1); dvi_out(cur_length);
  end
else  begin dvi_out(xxx4); dvi_four(cur_length);
  end;
for k:=str_start[str_ptr] to pool_ptr-1 do dvi_out(so(str_pool[k]));
pool_ptr:=str_start[str_ptr]; {erase the string}
end;

@ To write a token list, we must run it through \TeX's scanner, expanding
macros and \.{\\the} and \.{\\number}, etc. This might cause runaways,
if a delimited macro parameter isn't matched, and runaways would be
extremely confusing since we are calling on \TeX's scanner in the middle
of a \.{\\shipout} command. Therefore we will put a dummy control sequence as
a ``stopper,'' right after the token list. This control sequence is
artificially defined to be \.{\\outer}.
@:end_write_}{\.{\\endwrite}@>

@<Initialize table...@>=
text(end_write):="endwrite"; eq_level(end_write):=level_one;
eq_type(end_write):=outer_call; equiv(end_write):=null;

@ @<Declare procedures needed in |hlist_out|, |vlist_out|@>=
procedure write_out(@!p:pointer);
var old_setting:0..max_selector; {holds print |selector|}
@!old_mode:integer; {saved |mode|}
@!j:small_number; {write stream number}
@!q,@!r:pointer; {temporary variables for list manipulation}
begin @<Expand macros in the token list
  and make |link(def_ref)| point to the result@>;
old_setting:=selector; j:=write_stream(p);
if write_open[j] then selector:=j
else  begin {write to the terminal if file isn't open}
  if (j=17)and(selector=term_and_log) then selector:=log_only;
  print_nl("");
  end;
token_show(def_ref); print_ln;
flush_list(def_ref); selector:=old_setting;
end;

@ The final line of this routine is slightly subtle; at least, the author
didn't think about it until getting burnt! There is a used-up token list
@^Knuth, Donald Ervin@>
on the stack, namely the one that contained |end_write_token|. (We
insert this artificial `\.{\\endwrite}' to prevent runaways, as explained
above.) If it were not removed, and if there were numerous writes on a
single page, the stack would overflow.

@d end_write_token==cs_token_flag+end_write

@<Expand macros in the token list and...@>=
q:=get_avail; info(q):=right_brace_token+"}";@/
r:=get_avail; link(q):=r; info(r):=end_write_token; ins_list(q);@/
begin_token_list(write_tokens(p),write_text);@/
q:=get_avail; info(q):=left_brace_token+"{"; ins_list(q);
{now we're ready to scan
  `\.\{$\langle\,$token list$\,\rangle$\.{\} \\endwrite}'}
old_mode:=mode; mode:=0;
  {disable \.{\\prevdepth}, \.{\\spacefactor}, \.{\\lastskip}, \.{\\prevgraf}}
cur_cs:=write_loc; q:=scan_toks(false,true); {expand macros, etc.}
get_token;@+if cur_tok<>end_write_token then
  @<Recover from an unbalanced write command@>;
mode:=old_mode;
end_token_list {conserve stack space}

@ @<Recover from an unbalanced write command@>=
begin print_err("Unbalanced write command");
@.Unbalanced write...@>
help2("On this page there's a \write with fewer real {'s than }'s.")@/
("I can't handle that very well; good luck."); error;
repeat get_token;
until cur_tok=end_write_token;
end

@ The |out_what| procedure takes care of outputting whatsit nodes for
|vlist_out| and |hlist_out|\kern-.3pt.

@<Declare procedures needed in |hlist_out|, |vlist_out|@>=
procedure out_what(@!p:pointer);
var j:small_number; {write stream number}
begin case subtype(p) of
open_node,write_node,close_node:@<Do some work that has been queued up
  for \.{\\write}@>;
special_node:special_out(p);
language_node:do_nothing;
othercases confusion("ext4")
@:this can't happen ext4}{\quad ext4@>
endcases;
end;

@ We don't implement \.{\\write} inside of leaders. (The reason is that
the number of times a leader box appears might be different in different
implementations, due to machine-dependent rounding in the glue calculations.)
@^leaders@>

@<Do some work that has been queued up...@>=
if not doing_leaders then
  begin j:=write_stream(p);
  if subtype(p)=write_node then write_out(p)
  else  begin if write_open[j] then a_close(write_file[j]);
    if subtype(p)=close_node then write_open[j]:=false
    else if j<16 then
      begin cur_name:=open_name(p); cur_area:=open_area(p);
      cur_ext:=open_ext(p);
      if cur_ext="" then cur_ext:=".tex";
      pack_cur_name;
      while not a_open_out(write_file[j]) do
        prompt_file_name("output file name",".tex");
      write_open[j]:=true;
      end;
    end;
  end

@ The presence of `\.{\\immediate}' causes the |do_extension| procedure
to descend to one level of recursion. Nothing happens unless \.{\\immediate}
is followed by `\.{\\openout}', `\.{\\write}', or `\.{\\closeout}'.
@^recursion@>

@<Implement \.{\\immediate}@>=
begin get_x_token;
if (cur_cmd=extension)and(cur_chr<=close_node) then
  begin p:=tail; do_extension; {append a whatsit node}
  out_what(tail); {do the action immediately}
  flush_node_list(tail); tail:=p; link(p):=null;
  end
else back_input;
end

@ The \.{\\language} extension is somewhat different.
We need a subroutine that comes into play when a character of
a non-|clang| language is being appended to the current paragraph.

@<Declare action...@>=
procedure fix_language;
var @!l:ASCII_code; {the new current language}
begin if language<=0 then l:=0
else if language>255 then l:=0
else l:=language;
if l<>clang then
  begin new_whatsit(language_node,small_node_size);
  what_lang(tail):=l; clang:=l;@/
  what_lhm(tail):=norm_min(left_hyphen_min);
  what_rhm(tail):=norm_min(right_hyphen_min);
  end;
end;

@ @<Implement \.{\\setlanguage}@>=
if abs(mode)<>hmode then report_illegal_case
else begin new_whatsit(language_node,small_node_size);
  scan_int;
  if cur_val<=0 then clang:=0
  else if cur_val>255 then clang:=0
  else clang:=cur_val;
  what_lang(tail):=clang;
  what_lhm(tail):=norm_min(left_hyphen_min);
  what_rhm(tail):=norm_min(right_hyphen_min);
  end

@ @<Finish the extensions@>=
for k:=0 to 15 do if write_open[k] then a_close(write_file[k])

@* \[54] System-dependent changes.
This section should be replaced, if necessary, by any special
modifications of the program
that are necessary to make \TeX\ work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the published program. More extensive changes,
which introduce new sections, can be inserted here; then only the index
itself will get a new section number.
@^system dependencies@>

@* \[55] Index.
Here is where you can find all uses of each identifier in the program,
with underlined entries pointing to where the identifier was defined.
If the identifier is only one letter long, however, you get to see only
the underlined entries. {\sl All references are to section numbers instead of
page numbers.}

This index also lists error messages and other aspects of the program
that you might want to look up some day. For example, the entry
for ``system dependencies'' lists all sections that should receive
special attention from people who are installing \TeX\ in a new
operating environment. A list of various things that can't happen appears
under ``this can't happen''. Approximately 40 sections are listed under
``inner loop''; these account for about 60\pct! of \TeX's running time,
exclusive of input and output.