modernc.org/knuth@v0.0.4/weave/weave.web

modernc.org/knuth@v0.0.4/weave/weave.web (about)

     1  % This program by D. E. Knuth is not copyrighted and can be used freely.
     2  % Version 0 was released in December, 1981.
     3  % Version 1 was released in September, 1982, with version 0 of TeX.
     4  % Slight changes were made in October, 1982, for version 0.6 of TeX.
     5  % Version 1.1 changed "_" to "\_" if not within an identifier (November, 1982).
     6  % Version 1.2 added @@= and @@\ and marked changed modules (December, 1982).
     7  % Version 1.3 marked and indexed changed modules better (January, 1983).
     8  % Version 1.4 added "history" (February, 1983).
     9  % Version 1.5 conformed to TeX version 0.96 (March, 1983).
    10  % Version 1.6 conformed to TeX version 0.98 (May, 1983).
    11  % Version 1.7 introduced the new change file format (June, 1983).
    12  % Version 2 was released in July, 1983, with version 0.999 of TeX.
    13  % Version 2.1 corrected a bug in changed_module reckoning (August, 1983).
    14  % Version 2.2 corrected it better (August, 1983).
    15  % Version 2.3 starts the output with \input webmac (August, 1983).
    16  % Version 2.4 fixed a bug in compress(#) (September, 1983).
    17  % Version 2.5 cleared xrefswitch after module names (November, 1983).
    18  % Version 2.6 fixed a bug in declaration of trans array (January, 1984).
    19  % Version 2.7 fixed a bug in real constants (August, 1984).
    20  % Version 2.8 fixed a bug in change_buffer movement (August, 1985).
    21  % Version 2.9 increased max_refs and max_toks to 30000 each (January, 1987).
    22  % Version 3, for Sewell's book, fixed long-line bug in input_ln (March, 1989).
    23  % Version 3.1 fixed a bug for programs with only one module (April, 1989).
    24  % Version 4 was major change to allow 8-bit input (September, 1989).
    25  % Version 4.1, for Breitenlohner, avoids English-only output (March, 1990).
    26  % Version 4.2 conforms to ANSI standard for-loop rules (September, 1990).
    27  % Version 4.3 catches extra } in input (Breitenlohner, September, 1991).
    28  % Version 4.4 corrects changed_module logic, %-overflow (January, 1992).
    29  % Version 4.5 corrects archaic @@z logic and empty change file (January, 2021).
    30  
    31  % Here is TeX material that gets inserted after \input webmac
    32  \def\hang{\hangindent 3em\indent\ignorespaces}
    33  \font\ninerm=cmr9
    34  \let\mc=\ninerm % medium caps for names like SAIL
    35  \def\PASCAL{Pascal}
    36  \def\pb{$\.|\ldots\.|$} % Pascal brackets (|...|)
    37  \def\v{\.{\char'174}} % vertical (|) in typewriter font
    38  \def\dleft{[\![} \def\dright{]\!]} % double brackets
    39  \mathchardef\RA="3221 % right arrow
    40  \mathchardef\BA="3224 % double arrow
    41  \def\({} % kludge for alphabetizing certain module names
    42  
    43  \def\title{WEAVE}
    44  \def\contentspagenumber{15} % should be odd
    45  \def\topofcontents{\null\vfill
    46    \titlefalse % include headline on the contents page
    47    \def\rheader{\mainfont Appendix D\hfil \contentspagenumber}
    48    \centerline{\titlefont The {\ttitlefont WEAVE} processor}
    49    \vskip 15pt
    50    \centerline{(Version 4.5)}
    51    \vfill}
    52  \pageno=\contentspagenumber \advance\pageno by 1
    53  
    54  @* Introduction.
    55  This program converts a \.{WEB} file to a \TeX\ file. It was written
    56  by D. E. Knuth in October, 1981; a somewhat similar {\mc SAIL} program had
    57  been developed in March, 1979, although the earlier program used a top-down
    58  parsing method that is quite different from the present scheme.
    59  
    60  The code uses a few features of the local \PASCAL\ compiler that may need
    61  to be changed in other installations:
    62  
    63  \yskip\item{1)} Case statements have a default.
    64  \item{2)} Input-output routines may need to be adapted for use with a particular
    65  character set and/or for printing messages on the user's terminal.
    66  
    67  \yskip\noindent
    68  These features are also present in the \PASCAL\ version of \TeX, where they
    69  are used in a similar (but more complex) way. System-dependent portions
    70  of \.{WEAVE} can be identified by looking at the entries for `system
    71  dependencies' in the index below.
    72  @!@^system dependencies@>
    73  
    74  The ``banner line'' defined here should be changed whenever \.{WEAVE}
    75  is modified.
    76  
    77  @d banner=='This is WEAVE, Version 4.5'
    78  
    79  @ The program begins with a fairly normal header, made up of pieces that
    80  @^system dependencies@>
    81  will mostly be filled in later. The \.{WEB} input comes from files |web_file|
    82  and |change_file|, and the \TeX\ output goes to file |tex_file|.
    83  
    84  If it is necessary to abort the job because of a fatal error, the program
    85  calls the `|jump_out|' procedure, which goes to the label |end_of_WEAVE|.
    86  
    87  @d end_of_WEAVE = 9999 {go here to wrap it up}
    88  
    89  @p @t\4@>@<Compiler directives@>@/
    90  program WEAVE(@!web_file,@!change_file,@!tex_file);
    91  label end_of_WEAVE; {go here to finish}
    92  const @<Constants in the outer block@>@/
    93  type @<Types in the outer block@>@/
    94  var @<Globals in the outer block@>@/
    95  @<Error handling procedures@>@/
    96  procedure initialize;
    97    var @<Local variables for initialization@>@/
    98    begin @<Set initial values@>@/
    99    end;
   100  
   101  @ Some of this code is optional for use when debugging only;
   102  such material is enclosed between the delimiters |debug| and $|gubed|$.
   103  Other parts, delimited by |stat| and $|tats|$, are optionally included
   104  if statistics about \.{WEAVE}'s memory usage are desired.
   105  
   106  @d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging}
   107  @d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging}
   108  @f debug==begin
   109  @f gubed==end
   110  @#
   111  @d stat==@{ {change this to `$\\{stat}\equiv\null$'
   112    when gathering usage statistics}
   113  @d tats==@t@>@} {change this to `$\\{tats}\equiv\null$'
   114    when gathering usage statistics}
   115  @f stat==begin
   116  @f tats==end
   117  
   118  @ The \PASCAL\ compiler used to develop this system has ``compiler
   119  directives'' that can appear in comments whose first character is a dollar sign.
   120  In production versions of \.{WEAVE} these directives tell the compiler that
   121  @^system dependencies@>
   122  it is safe to avoid range checks and to leave out the extra code it inserts
   123  for the \PASCAL\ debugger's benefit, although interrupts will occur if
   124  there is arithmetic overflow.
   125  
   126  @<Compiler directives@>=
   127  @{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead}
   128  @!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging}
   129  
   130  @ Labels are given symbolic names by the following definitions. We insert
   131  the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a
   132  procedure in which we have used the `|return|' statement defined below;
   133  the label `|restart|' is occasionally used at the very beginning of a
   134  procedure; and the label `|reswitch|' is occasionally used just prior to
   135  a \&{case} statement in which some cases change the conditions and we wish to
   136  branch to the newly applicable case.
   137  Loops that are set up with the \&{loop} construction defined below are
   138  commonly exited by going to `|done|' or to `|found|' or to `|not_found|',
   139  and they are sometimes repeated by going to `|continue|'.
   140  
   141  @d exit=10 {go here to leave a procedure}
   142  @d restart=20 {go here to start a procedure again}
   143  @d reswitch=21 {go here to start a case statement again}
   144  @d continue=22 {go here to resume a loop}
   145  @d done=30 {go here to exit a loop}
   146  @d found=31 {go here when you've found it}
   147  @d not_found=32 {go here when you've found something else}
   148  
   149  @ Here are some macros for common programming idioms.
   150  
   151  @d incr(#) == #:=#+1 {increase a variable by unity}
   152  @d decr(#) == #:=#-1 {decrease a variable by unity}
   153  @d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
   154  @d do_nothing == {empty statement}
   155  @d return == goto exit {terminate a procedure call}
   156  @f return == nil
   157  @f loop == xclause
   158  
   159  @ We assume that |case| statements may include a default case that applies
   160  if no matching label is found. Thus, we shall use constructions like
   161  @^system dependencies@>
   162  $$\vbox{\halign{#\hfil\cr
   163  |case x of|\cr
   164  1: $\langle\,$code for $x=1\,\rangle$;\cr
   165  3: $\langle\,$code for $x=3\,\rangle$;\cr
   166  |othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
   167  |endcases|\cr}}$$
   168  since most \PASCAL\ compilers have plugged this hole in the language by
   169  incorporating some sort of default mechanism. For example, the compiler
   170  used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label,
   171  and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or
   172  `\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases|
   173  and |endcases| should be changed to agree with local conventions.
   174  (Of course, if no default mechanism is available, the |case| statements of
   175  this program must be extended by listing all remaining cases.)
   176  
   177  @d othercases == others: {default for cases not listed explicitly}
   178  @d endcases == @+end {follows the default case in an extended |case| statement}
   179  @f othercases == else
   180  @f endcases == end
   181  
   182  @ The following parameters are set big enough to handle \TeX, so they
   183  should be sufficient for most applications of \.{WEAVE}.
   184  
   185  @<Constants...@>=
   186  @!max_bytes=45000; {|1/ww| times the number of bytes in identifiers,
   187    index entries, and module names; must be less than 65536}
   188  @!max_names=5000; {number of identifiers, index entries, and module names;
   189    must be less than 10240}
   190  @!max_modules=2000;{greater than the total number of modules}
   191  @!hash_size=353; {should be prime}
   192  @!buf_size=100; {maximum length of input line}
   193  @!longest_name=400; {module names shouldn't be longer than this}
   194  @!long_buf_size=500; {|buf_size+longest_name|}
   195  @!line_length=80; {lines of \TeX\ output have at most this many characters,
   196    should be less than 256}
   197  @!max_refs=30000; {number of cross references; must be less than 65536}
   198  @!max_toks=30000; {number of symbols in \PASCAL\ texts being parsed;
   199    must be less than 65536}
   200  @!max_texts=2000; {number of phrases in \PASCAL\ texts being parsed;
   201    must be less than 10240}
   202  @!max_scraps=1000; {number of tokens in \PASCAL\ texts being parsed}
   203  @!stack_size=200; {number of simultaneous output levels}
   204  
   205  @ A global variable called |history| will contain one of four values
   206  at the end of every run: |spotless| means that no unusual messages were
   207  printed; |harmless_message| means that a message of possible interest
   208  was printed but no serious errors were detected; |error_message| means that
   209  at least one error was found; |fatal_message| means that the program
   210  terminated abnormally. The value of |history| does not influence the
   211  behavior of the program; it is simply computed for the convenience
   212  of systems that might want to use such information.
   213  
   214  @d spotless=0 {|history| value for normal jobs}
   215  @d harmless_message=1 {|history| value when non-serious info was printed}
   216  @d error_message=2 {|history| value when an error was noted}
   217  @d fatal_message=3 {|history| value when we had to stop prematurely}
   218  @#
   219  @d mark_harmless==@t@>@+if history=spotless then history:=harmless_message
   220  @d mark_error==history:=error_message
   221  @d mark_fatal==history:=fatal_message
   222  
   223  @<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?}
   224  
   225  @ @<Set init...@>=history:=spotless;
   226  
   227  @* The character set.
   228  One of the main goals in the design of \.{WEB} has been to make it readily
   229  portable between a wide variety of computers. Yet \.{WEB} by its very
   230  nature must use a greater variety of characters than most computer
   231  programs deal with, and character encoding is one of the areas in which
   232  existing machines differ most widely from each other.
   233  
   234  To resolve this problem, all input to \.{WEAVE} and \.{TANGLE} is
   235  converted to an internal eight-bit code that is essentially standard
   236  ASCII, the ``American Standard Code for Information Interchange.''
   237  The conversion is done immediately when each character is read in.
   238  Conversely, characters are converted from ASCII to the user's external
   239  representation just before they are output. (The original ASCII code
   240  was seven bits only; \.{WEB} now allows eight bits in an attempt to
   241  keep up with modern times.)
   242  
   243  Such an internal code is relevant to users of \.{WEB} only because it is
   244  the code used for preprocessed constants like \.{"A"}. If you are writing
   245  a program in \.{WEB} that makes use of such one-character constants, you
   246  should convert your input to ASCII form, like \.{WEAVE} and \.{TANGLE} do.
   247  Otherwise \.{WEB}'s internal coding scheme does not affect you.
   248  @^ASCII code@>
   249  
   250  Here is a table of the standard visible ASCII codes:
   251  $$\def\:{\char\count255\global\advance\count255 by 1}
   252  \count255='40
   253  \vbox{
   254  \hbox{\hbox to 40pt{\it\hfill0\/\hfill}%
   255  \hbox to 40pt{\it\hfill1\/\hfill}%
   256  \hbox to 40pt{\it\hfill2\/\hfill}%
   257  \hbox to 40pt{\it\hfill3\/\hfill}%
   258  \hbox to 40pt{\it\hfill4\/\hfill}%
   259  \hbox to 40pt{\it\hfill5\/\hfill}%
   260  \hbox to 40pt{\it\hfill6\/\hfill}%
   261  \hbox to 40pt{\it\hfill7\/\hfill}}
   262  \vskip 4pt
   263  \hrule
   264  \def\^{\vrule height 10.5pt depth 4.5pt}
   265  \halign{\hbox to 0pt{\hskip -24pt\O{#0}\hfill}&\^
   266  \hbox to 40pt{\tt\hfill#\hfill\^}&
   267  &\hbox to 40pt{\tt\hfill#\hfill\^}\cr
   268  04&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   269  05&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   270  06&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   271  07&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   272  10&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   273  11&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   274  12&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   275  13&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   276  14&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   277  15&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   278  16&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
   279  17&\:&\:&\:&\:&\:&\:&\:\cr}
   280  \hrule width 280pt}$$
   281  (Actually, of course, code @'040 is an invisible blank space.)  Code @'136
   282  was once an upward arrow (\.{\char'13}), and code @'137 was
   283  once a left arrow (\.^^X), in olden times when the first draft
   284  of ASCII code was prepared; but \.{WEB} works with today's standard
   285  ASCII in which those codes represent circumflex and underline as shown.
   286  
   287  @<Types...@>=
   288  @!ASCII_code=0..255; {eight-bit numbers, a subrange of the integers}
   289  
   290  @ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
   291  character sets were common, so it did not make provision for lowercase
   292  letters. Nowadays, of course, we need to deal with both capital and small
   293  letters in a convenient way, so \.{WEB} assumes that it is being used
   294  with a \PASCAL\ whose character set contains at least the characters of
   295  standard ASCII as listed above. Some \PASCAL\ compilers use the original
   296  name |char| for the data type associated with the characters in text files,
   297  while other \PASCAL s consider |char| to be a 64-element subrange of a larger
   298  data type that has some other name.
   299  
   300  In order to accommodate this difference, we shall use the name |text_char|
   301  to stand for the data type of the characters in the input and output
   302  files.  We shall also assume that |text_char| consists of the elements
   303  |chr(first_text_char)| through |chr(last_text_char)|, inclusive. The
   304  following definitions should be adjusted if necessary.
   305  @^system dependencies@>
   306  
   307  @d text_char == char {the data type of characters in text files}
   308  @d first_text_char=0 {ordinal number of the smallest element of |text_char|}
   309  @d last_text_char=255 {ordinal number of the largest element of |text_char|}
   310  
   311  @<Types...@>=
   312  @!text_file=packed file of text_char;
   313  
   314  @ The \.{WEAVE} and \.{TANGLE} processors convert between ASCII code and
   315  the user's external character set by means of arrays |xord| and |xchr|
   316  that are analogous to \PASCAL's |ord| and |chr| functions.
   317  
   318  @<Globals...@>=
   319  @!xord: array [text_char] of ASCII_code;
   320    {specifies conversion of input characters}
   321  @!xchr: array [ASCII_code] of text_char;
   322    {specifies conversion of output characters}
   323  
   324  @ If we assume that every system using \.{WEB} is able to read and write the
   325  visible characters of standard ASCII (although not necessarily using the
   326  ASCII codes to represent them), the following assignment statements initialize
   327  most of the |xchr| array properly, without needing any system-dependent
   328  changes. For example, the statement \.{xchr[@@\'101]:=\'A\'} that appears
   329  in the present \.{WEB} file might be encoded in, say, {\mc EBCDIC} code
   330  on the external medium on which it resides, but \.{TANGLE} will convert from
   331  this external code to ASCII and back again. Therefore the assignment
   332  statement \.{XCHR[65]:=\'A\'} will appear in the corresponding \PASCAL\ file,
   333  and \PASCAL\ will compile this statement so that |xchr[65]| receives the
   334  character \.A in the external (|char|) code. Note that it would be quite
   335  incorrect to say \.{xchr[@@\'101]:="A"}, because |"A"| is a constant of
   336  type |integer|, not |char|, and because we have $|"A"|=65$ regardless of
   337  the external character set.
   338  
   339  @<Set init...@>=
   340  xchr[@'40]:=' ';
   341  xchr[@'41]:='!';
   342  xchr[@'42]:='"';
   343  xchr[@'43]:='#';
   344  xchr[@'44]:='$';
   345  xchr[@'45]:='%';
   346  xchr[@'46]:='&';
   347  xchr[@'47]:='''';@/
   348  xchr[@'50]:='(';
   349  xchr[@'51]:=')';
   350  xchr[@'52]:='*';
   351  xchr[@'53]:='+';
   352  xchr[@'54]:=',';
   353  xchr[@'55]:='-';
   354  xchr[@'56]:='.';
   355  xchr[@'57]:='/';@/
   356  xchr[@'60]:='0';
   357  xchr[@'61]:='1';
   358  xchr[@'62]:='2';
   359  xchr[@'63]:='3';
   360  xchr[@'64]:='4';
   361  xchr[@'65]:='5';
   362  xchr[@'66]:='6';
   363  xchr[@'67]:='7';@/
   364  xchr[@'70]:='8';
   365  xchr[@'71]:='9';
   366  xchr[@'72]:=':';
   367  xchr[@'73]:=';';
   368  xchr[@'74]:='<';
   369  xchr[@'75]:='=';
   370  xchr[@'76]:='>';
   371  xchr[@'77]:='?';@/
   372  xchr[@'100]:='@@';
   373  xchr[@'101]:='A';
   374  xchr[@'102]:='B';
   375  xchr[@'103]:='C';
   376  xchr[@'104]:='D';
   377  xchr[@'105]:='E';
   378  xchr[@'106]:='F';
   379  xchr[@'107]:='G';@/
   380  xchr[@'110]:='H';
   381  xchr[@'111]:='I';
   382  xchr[@'112]:='J';
   383  xchr[@'113]:='K';
   384  xchr[@'114]:='L';
   385  xchr[@'115]:='M';
   386  xchr[@'116]:='N';
   387  xchr[@'117]:='O';@/
   388  xchr[@'120]:='P';
   389  xchr[@'121]:='Q';
   390  xchr[@'122]:='R';
   391  xchr[@'123]:='S';
   392  xchr[@'124]:='T';
   393  xchr[@'125]:='U';
   394  xchr[@'126]:='V';
   395  xchr[@'127]:='W';@/
   396  xchr[@'130]:='X';
   397  xchr[@'131]:='Y';
   398  xchr[@'132]:='Z';
   399  xchr[@'133]:='[';
   400  xchr[@'134]:='\';
   401  xchr[@'135]:=']';
   402  xchr[@'136]:='^';
   403  xchr[@'137]:='_';@/
   404  xchr[@'140]:='`';
   405  xchr[@'141]:='a';
   406  xchr[@'142]:='b';
   407  xchr[@'143]:='c';
   408  xchr[@'144]:='d';
   409  xchr[@'145]:='e';
   410  xchr[@'146]:='f';
   411  xchr[@'147]:='g';@/
   412  xchr[@'150]:='h';
   413  xchr[@'151]:='i';
   414  xchr[@'152]:='j';
   415  xchr[@'153]:='k';
   416  xchr[@'154]:='l';
   417  xchr[@'155]:='m';
   418  xchr[@'156]:='n';
   419  xchr[@'157]:='o';@/
   420  xchr[@'160]:='p';
   421  xchr[@'161]:='q';
   422  xchr[@'162]:='r';
   423  xchr[@'163]:='s';
   424  xchr[@'164]:='t';
   425  xchr[@'165]:='u';
   426  xchr[@'166]:='v';
   427  xchr[@'167]:='w';@/
   428  xchr[@'170]:='x';
   429  xchr[@'171]:='y';
   430  xchr[@'172]:='z';
   431  xchr[@'173]:='{';
   432  xchr[@'174]:='|';
   433  xchr[@'175]:='}';
   434  xchr[@'176]:='~';@/
   435  xchr[0]:=' '; xchr[@'177]:=' '; {these ASCII codes are not used}
   436  
   437  @ Some of the ASCII codes below @'40 have been given symbolic names in
   438  \.{WEAVE} and \.{TANGLE} because they are used with a special meaning.
   439  
   440  @d and_sign=@'4 {equivalent to `\.{and}'}
   441  @d not_sign=@'5 {equivalent to `\.{not}'}
   442  @d set_element_sign=@'6 {equivalent to `\.{in}'}
   443  @d tab_mark=@'11 {ASCII code used as tab-skip}
   444  @d line_feed=@'12 {ASCII code thrown away at end of line}
   445  @d form_feed=@'14 {ASCII code used at end of page}
   446  @d carriage_return=@'15 {ASCII code used at end of line}
   447  @d left_arrow=@'30 {equivalent to `\.{:=}'}
   448  @d not_equal=@'32 {equivalent to `\.{<>}'}
   449  @d less_or_equal=@'34 {equivalent to `\.{<=}'}
   450  @d greater_or_equal=@'35 {equivalent to `\.{>=}'}
   451  @d equivalence_sign=@'36 {equivalent to `\.{==}'}
   452  @d or_sign=@'37 {equivalent to `\.{or}'}
   453  
   454  @ When we initialize the |xord| array and the remaining parts of |xchr|,
   455  it will be convenient to make use of an index variable, |i|.
   456  
   457  @<Local variables for init...@>=
   458  @!i:0..255;
   459  
   460  @ Here now is the system-dependent part of the character set.
   461  If \.{WEB} is being implemented on a garden-variety \PASCAL\ for which
   462  only standard ASCII codes will appear in the input and output files, you
   463  don't need to make any changes here. But if you have, for example, an extended
   464  character set like the one in Appendix~C of {\sl The \TeX book}, the first
   465  line of code in this module should be changed to
   466  $$\hbox{|for i:=1 to @'37 do xchr[i]:=chr(i);|}$$
   467  \.{WEB}'s character set is essentially identical to \TeX's, even with respect to
   468  characters less than @'40.
   469  @^system dependencies@>
   470  
   471  Changes to the present module will make \.{WEB} more friendly on computers
   472  that have an extended character set, so that one can type things like
   473  \.^^Z\ instead of \.{<>}. If you have an extended set of characters that
   474  are easily incorporated into text files, you can assign codes arbitrarily
   475  here, giving an |xchr| equivalent to whatever characters the users of
   476  \.{WEB} are allowed to have in their input files, provided that unsuitable
   477  characters do not correspond to special codes like |carriage_return|
   478  that are listed above.
   479  
   480  (The present file \.{WEAVE.WEB} does not contain any of the non-ASCII
   481  characters, because it is intended to be used with all implementations of
   482  \.{WEB}.  It was originally created on a Stanford system that has a
   483  convenient extended character set, then ``sanitized'' by applying another
   484  program that transliterated all of the non-standard characters into
   485  standard equivalents.)
   486  
   487  @<Set init...@>=
   488  for i:=1 to @'37 do xchr[i]:=' ';
   489  for i:=@'200 to @'377 do xchr[i]:=' ';
   490  
   491  @ The following system-independent code makes the |xord| array contain a
   492  suitable inverse to the information in |xchr|.
   493  
   494  @<Set init...@>=
   495  for i:=first_text_char to last_text_char do xord[chr(i)]:=" ";
   496  for i:=1 to @'377 do xord[xchr[i]]:=i;
   497  xord[' ']:=" ";
   498  
   499  @* Input and output.
   500  The input conventions of this program are intended to be very much like those
   501  of \TeX\ (except, of course, that they are much simpler, because much less
   502  needs to be done). Furthermore they are identical to those of \.{TANGLE}.
   503  Therefore people who need to make modifications to all three systems
   504  should be able to do so without too many headaches.
   505  
   506  We use the standard \PASCAL\ input/output procedures in several places that
   507  \TeX\ cannot, since \.{WEAVE} does not have to deal with files that are named
   508  dynamically by the user, and since there is no input from the terminal.
   509  
   510  @ Terminal output is done by writing on file |term_out|, which is assumed to
   511  consist of characters of type |text_char|:
   512  @^system dependencies@>
   513  
   514  @d print(#)==write(term_out,#) {`|print|' means write on the terminal}
   515  @d print_ln(#)==write_ln(term_out,#) {`|print|' and then start new line}
   516  @d new_line==write_ln(term_out) {start new line}
   517  @d print_nl(#)==  {print information starting on a new line}
   518    begin new_line; print(#);
   519    end
   520  
   521  @<Globals...@>=
   522  @!term_out:text_file; {the terminal as an output file}
   523  
   524  @ Different systems have different ways of specifying that the output on a
   525  certain file will appear on the user's terminal. Here is one way to do this
   526  on the \PASCAL\ system that was used in \.{TANGLE}'s initial development:
   527  @^system dependencies@>
   528  
   529  @<Set init...@>=
   530  rewrite(term_out,'TTY:'); {send |term_out| output to the terminal}
   531  
   532  @ The |update_terminal| procedure is called when we want
   533  to make sure that everything we have output to the terminal so far has
   534  actually left the computer's internal buffers and been sent.
   535  @^system dependencies@>
   536  
   537  @d update_terminal == break(term_out) {empty the terminal output buffer}
   538  
   539  @ The main input comes from |web_file|; this input may be overridden
   540  by changes in |change_file|. (If |change_file| is empty, there are no changes.)
   541  
   542  @<Globals...@>=
   543  @!web_file:text_file; {primary input}
   544  @!change_file:text_file; {updates}
   545  
   546  @ The following code opens the input files.  Since these files were listed
   547  in the program header, we assume that the \PASCAL\ runtime system has
   548  already checked that suitable file names have been given; therefore no
   549  additional error checking needs to be done. We will see below that
   550  \.{WEAVE} reads through the entire input twice.
   551  @^system dependencies@>
   552  
   553  @p procedure open_input; {prepare to read |web_file| and |change_file|}
   554  begin reset(web_file); reset(change_file);
   555  end;
   556  
   557  @ The main output goes to |tex_file|.
   558  
   559  @<Globals...@>=
   560  @!tex_file: text_file;
   561  
   562  @ The following code opens |tex_file|.
   563  Since this file was listed in the program header, we assume that the
   564  \PASCAL\ runtime system has checked that a suitable external file name has
   565  been given.
   566  @^system dependencies@>
   567  
   568  @<Set init...@>=
   569  rewrite(tex_file);
   570  
   571  @ Input goes into an array called |buffer|.
   572  
   573  @<Globals...@>=@!buffer: array[0..long_buf_size] of ASCII_code;
   574  
   575  @ The |input_ln| procedure brings the next line of input from the specified
   576  file into the |buffer| array and returns the value |true|, unless the file has
   577  already been entirely read, in which case it returns |false|. The conventions
   578  of \TeX\ are followed; i.e., |ASCII_code| numbers representing the next line
   579  of the file are input into |buffer[0]|, |buffer[1]|, \dots,
   580  |buffer[limit-1]|; trailing blanks are ignored;
   581  and the global variable |limit| is set to the length of the
   582  @^system dependencies@>
   583  line. The value of |limit| must be strictly less than |buf_size|.
   584  
   585  We assume that none of the |ASCII_code| values
   586  of |buffer[j]| for |0<=j<limit| is equal to 0, @'177, |line_feed|, |form_feed|,
   587  or |carriage_return|. Since |buf_size| is strictly less than |long_buf_size|,
   588  some of \.{WEAVE}'s routines use the fact that it is safe to refer to
   589  |buffer[limit+2]| without overstepping the bounds of the array.
   590  
   591  @p function input_ln(var f:text_file):boolean;
   592    {inputs a line or returns |false|}
   593  var final_limit:0..buf_size; {|limit| without trailing blanks}
   594  begin limit:=0; final_limit:=0;
   595  if eof(f) then input_ln:=false
   596  else  begin while not eoln(f) do
   597      begin buffer[limit]:=xord[f^]; get(f);
   598      incr(limit);
   599      if buffer[limit-1]<>" " then final_limit:=limit;
   600      if limit=buf_size then
   601        begin while not eoln(f) do get(f);
   602        decr(limit); {keep |buffer[buf_size]| empty}
   603        if final_limit>limit then final_limit:=limit;
   604        print_nl('! Input line too long'); loc:=0; error;
   605  @.Input line too long@>
   606        end;
   607      end;
   608    read_ln(f); limit:=final_limit; input_ln:=true;
   609    end;
   610  end;
   611  
   612  @* Reporting errors to the user.
   613  The \.{WEAVE} processor operates in three phases: first it inputs the source
   614  file and stores cross-reference data, then it inputs the source once again and
   615  produces the \TeX\ output file, and finally it sorts and outputs the index.
   616  
   617  The global variables |phase_one| and |phase_three| tell which Phase we are in.
   618  
   619  @<Globals...@>=
   620  @!phase_one: boolean; {|true| in Phase I, |false| in Phases II and III}
   621  @!phase_three: boolean; {|true| in Phase III, |false| in Phases I and II}
   622  
   623  @ If an error is detected while we are debugging,
   624  we usually want to look at the contents of memory.
   625  A special procedure will be declared later for this purpose.
   626  
   627  @<Error handling...@>=
   628  @!debug@+ procedure debug_help; forward;@+gubed
   629  
   630  @ The command `|err_print('! Error message')|' will report a syntax error to
   631  the user, by printing the error message at the beginning of a new line and
   632  then giving an indication of where the error was spotted in the source file.
   633  Note that no period follows the error message, since the error routine
   634  will automatically supply a period.
   635  
   636  The actual error indications are provided by a procedure called |error|.
   637  However, error messages are not actually reported during phase one,
   638  since errors detected on the first pass will be detected again
   639  during the second.
   640  
   641  @d err_print(#)==
   642    begin if not phase_one then
   643      begin new_line; print(#); error;
   644      end;
   645    end
   646  
   647  @<Error handling...@>=
   648  procedure error; {prints `\..' and location of error message}
   649  var@!k,@!l: 0..long_buf_size; {indices into |buffer|}
   650  begin @<Print error location based on input buffer@>;
   651  update_terminal; mark_error;
   652  @!debug debug_skipped:=debug_cycle;debug_help;@+gubed
   653  end;
   654  
   655  @ The error locations can be indicated by using the global variables
   656  |loc|, |line|, and |changing|, which tell respectively the first
   657  unlooked-at position in |buffer|, the current line number, and whether or not
   658  the current line is from |change_file| or |web_file|.
   659  This routine should be modified on systems whose standard text editor
   660  has special line-numbering conventions.
   661  @^system dependencies@>
   662  
   663  @<Print error location based on input buffer@>=
   664  begin if changing then print('. (change file ')@+else print('. (');
   665  print_ln('l.', line:1, ')');
   666  if loc>=limit then l:=limit else l:=loc;
   667  for k:=1 to l do
   668    if buffer[k-1]=tab_mark then print(' ')
   669    else print(xchr[buffer[k-1]]); {print the characters already read}
   670  new_line;
   671  for k:=1 to l do print(' '); {space out the next line}
   672  for k:=l+1 to limit do print(xchr[buffer[k-1]]); {print the part not yet read}
   673  if buffer[limit]="|" then print(xchr["|"]);
   674    {end of \PASCAL\ text in module names}
   675  print(' '); {this space separates the message from future asterisks}
   676  end
   677  
   678  @ The |jump_out| procedure just cuts across all active procedure levels
   679  and jumps out of the program. This is the only non-local \&{goto} statement
   680  in \.{WEAVE}. It is used when no recovery from a particular error has
   681  been provided.
   682  
   683  Some \PASCAL\ compilers do not implement non-local |goto| statements.
   684  @^system dependencies@>
   685  In such cases the code that appears at label |end_of_WEAVE| should be
   686  copied into the |jump_out| procedure, followed by a call to a system procedure
   687  that terminates the program.
   688  
   689  @d fatal_error(#)==begin new_line; print(#); error; mark_fatal; jump_out;
   690    end
   691  
   692  @<Error handling...@>=
   693  procedure jump_out;
   694  begin goto end_of_WEAVE;
   695  end;
   696  
   697  @ Sometimes the program's behavior is far different from what it should be,
   698  and \.{WEAVE} prints an error message that is really for the \.{WEAVE}
   699  maintenance person, not the user. In such cases the program says
   700  |confusion('indication of where we are')|.
   701  
   702  @d confusion(#)==fatal_error('! This can''t happen (',#,')')
   703  @.This can't happen@>
   704  
   705  @ An overflow stop occurs if \.{WEAVE}'s tables aren't large enough.
   706  
   707  @d overflow(#)==fatal_error('! Sorry, ',#,' capacity exceeded')
   708  @.Sorry, x capacity exceeded@>
   709  
   710  @* Data structures.
   711  During the first phase of its processing, \.{WEAVE} puts identifier names,
   712  index entries, and module names into the large |byte_mem| array, which is
   713  packed with eight-bit integers. Allocation is sequential, since names are
   714  never deleted.
   715  
   716  An auxiliary array |byte_start| is used as a directory for |byte_mem|,
   717  and the |link|, |ilk|, and |xref| arrays give further information about names.
   718  These auxiliary arrays consist of sixteen-bit items.
   719  
   720  @<Types...@>=
   721  @!eight_bits=0..255; {unsigned one-byte quantity}
   722  @!sixteen_bits=0..65535; {unsigned two-byte quantity}
   723  
   724  @ \.{WEAVE} has been designed to avoid the need for indices that are more
   725  than sixteen bits wide, so that it can be used on most computers. But
   726  there are programs that need more than 65536 bytes; \TeX\ is one of these.
   727  To get around this problem, a slight complication has been added to the
   728  data structures:  |byte_mem| is a two-dimensional array, whose first index
   729  is either 0 or 1. (For generality, the first index is actually allowed to
   730  run between 0 and |ww-1|, where |ww| is defined to be 2; the program will
   731  work for any positive value of |ww|, and it can be simplified in obvious
   732  ways if |ww=1|.)
   733  
   734  @d ww=2 {we multiply the byte capacity by approximately this amount}
   735  
   736  @<Globals...@>=
   737  @!byte_mem: packed array [0..ww-1,0..max_bytes] of ASCII_code;
   738    {characters of names}
   739  @!byte_start: array [0..max_names] of sixteen_bits; {directory into |byte_mem|}
   740  @!link: array [0..max_names] of sixteen_bits; {hash table or tree links}
   741  @!ilk: array [0..max_names] of sixteen_bits; {type codes or tree links}
   742  @!xref: array [0..max_names] of sixteen_bits; {heads of cross-reference lists}
   743  
   744  @ The names of identifiers are found by computing a hash address |h| and
   745  then looking at strings of bytes signified by |hash[h]|, |link[hash[h]]|,
   746  |link[link[hash[h]]]|, \dots, until either finding the desired name
   747  or encountering a zero.
   748  
   749  A `|name_pointer|' variable, which signifies a name, is an index into
   750  |byte_start|. The actual sequence of characters in the name pointed to by
   751  |p| appears in positions |byte_start[p]| to |byte_start[p+ww]-1|, inclusive,
   752  in the segment of |byte_mem| whose first index is |p mod ww|. Thus, when
   753  |ww=2| the even-numbered name bytes appear in |byte_mem[0,@t$*$@>]|
   754  and the odd-numbered ones appear in |byte_mem[1,@t$*$@>]|.
   755  The pointer 0 is used for undefined module names; we don't
   756  want to use it for the names of identifiers, since 0 stands for a null
   757  pointer in a linked list.
   758  
   759  We usually have |byte_start[name_ptr+w]=byte_ptr[(name_ptr+w) mod ww]|
   760  for |0<=w<ww|, since these are the starting positions for the next |ww|
   761  names to be stored in |byte_mem|.
   762  
   763  @d length(#)==byte_start[#+ww]-byte_start[#] {the length of a name}
   764  
   765  @<Types...@>=
   766  @!name_pointer=0..max_names; {identifies a name}
   767  
   768  @ @<Global...@>=
   769  @!name_ptr:name_pointer; {first unused position in |byte_start|}
   770  @!byte_ptr:array [0..ww-1] of 0..max_bytes;
   771    {first unused position in |byte_mem|}
   772  
   773  @ @<Local variables for init...@>=
   774  @!wi: 0..ww-1; {to initialize the |byte_mem| indices}
   775  
   776  @ @<Set init...@>=
   777  for wi:=0 to ww-1 do
   778    begin byte_start[wi]:=0; byte_ptr[wi]:=0;
   779    end;
   780  byte_start[ww]:=0; {this makes name 0 of length zero}
   781  name_ptr:=1;
   782  
   783  @ Several types of identifiers are distinguished by their |ilk|:
   784  
   785  \yskip\hang |normal| identifiers are part of the \PASCAL\ program and
   786  will appear in italic type.
   787  
   788  \yskip\hang |roman| identifiers are index entries that appear after
   789  \.{@@\^} in the \.{WEB} file.
   790  
   791  \yskip\hang |wildcard| identifiers are index entries that appear after
   792  \.{@@:} in the \.{WEB} file.
   793  
   794  \yskip\hang |typewriter| identifiers are index entries that appear after
   795  \.{@@.} in the \.{WEB} file.
   796  
   797  \yskip\hang |array_like|, |begin_like|, \dots, |var_like|
   798  identifiers are \PASCAL\ reserved words whose |ilk| explains how they are
   799  to be treated when \PASCAL\ code is being formatted.
   800  
   801  \yskip\hang Finally, if |c| is an ASCII code, an |ilk| equal to
   802  |char_like+c| denotes a reserved word that will be converted to character
   803  |c|.
   804  
   805  @d normal=0 {ordinary identifiers have |normal| ilk}
   806  @d roman=1 {normal index entries have |roman| ilk}
   807  @d wildcard=2 {user-formatted index entries have |wildcard| ilk}
   808  @d typewriter=3 {`typewriter type' entries have |typewriter| ilk}
   809  @d reserved(#)==(ilk[#]>typewriter) {tells if a name is a reserved word}
   810  @d array_like=4 {\&{array}, \&{file}, \&{set}}
   811  @d begin_like=5 {\&{begin}}
   812  @d case_like=6 {\&{case}}
   813  @d const_like=7 {\&{const}, \&{label}, \&{type}}
   814  @d div_like=8 {\&{div}, \&{mod}}
   815  @d do_like=9 {\&{do}, \&{of}, \&{then}}
   816  @d else_like=10 {\&{else}}
   817  @d end_like=11 {\&{end}}
   818  @d for_like=12 {\&{for}, \&{while}, \&{with}}
   819  @d goto_like=13 {\&{goto}, \&{packed}}
   820  @d if_like=14 {\&{if}}
   821  @d intercal_like=15 {not used}
   822  @d nil_like=16 {\&{nil}}
   823  @d proc_like=17 {\&{function}, \&{procedure}, \&{program}}
   824  @d record_like=18 {\&{record}}
   825  @d repeat_like=19 {\&{repeat}}
   826  @d to_like=20 {\&{downto}, \&{to}}
   827  @d until_like=21 {\&{until}}
   828  @d var_like=22 {\&{var}}
   829  @d loop_like=23 {\&{loop}, \&{xclause}}
   830  @d char_like=24 {\&{and}, \&{or}, \&{not}, \&{in}}
   831  
   832  @ The names of modules are stored in |byte_mem| together
   833  with the identifier names, but a hash table is not used for them because
   834  \.{WEAVE} needs to be able to recognize a module name when given a prefix of
   835  that name. A conventional binary search tree is used to retrieve module names,
   836  with fields called |llink| and |rlink| in place of |link| and |ilk|. The
   837  root of this tree is |rlink[0]|.
   838  
   839  @d llink==link {left link in binary search tree for module names}
   840  @d rlink==ilk {right link in binary search tree for module names}
   841  @d root==rlink[0] {the root of the binary search tree for module names}
   842  
   843  @<Set init...@>=
   844  root:=0; {the binary search tree starts out with nothing in it}
   845  
   846  @ Here is a little procedure that prints the text of a given name on the
   847  user's terminal.
   848  
   849  @p procedure print_id(@!p:name_pointer); {print identifier or module name}
   850  var k:0..max_bytes; {index into |byte_mem|}
   851  @!w:0..ww-1; {row of |byte_mem|}
   852  begin if p>=name_ptr then print('IMPOSSIBLE')
   853  else  begin w:=p mod ww;
   854    for k:=byte_start[p] to byte_start[p+ww]-1 do
   855      print(xchr[byte_mem[w,k]]);
   856    end;
   857  end;
   858  
   859  @ We keep track of the current module number in
   860  |module_count|, which is the total number of modules that have started.
   861  Modules which have been altered by a change file entry
   862  have their |changed_module| flag turned on during the first phase.
   863  
   864  @<Globals...@>=
   865  @!module_count:0..max_modules; {the current module number}
   866  @!changed_module: packed array [0..max_modules] of boolean; {is it changed?}
   867  @!change_exists: boolean; {has any module changed?}
   868  
   869  @ The other large memory area in \.{WEAVE} keeps the cross-reference data.
   870  All uses of the name |p| are recorded in a linked list beginning at
   871  |xref[p]|, which points into the |xmem| array. Entries in |xmem| consist
   872  of two sixteen-bit items per word, called the |num| and |xlink| fields.
   873  If |x| is an index into |xmem|, reached from name |p|, the value of |num(x)|
   874  is either a module number where |p| is used, or it is |def_flag| plus a
   875  module number where |p| is defined; and |xlink(x)| points to the next such
   876  cross reference for |p|, if any. This list of cross references is in
   877  decreasing order by module number. The current number of cross references
   878  is |xref_ptr|.
   879  
   880  The global variable |xref_switch| is set either to |def_flag| or to zero,
   881  depending on whether the next cross reference to an identifier is to be
   882  underlined or not in the index. This switch is set to |def_flag| when
   883  \.{@@!} or \.{@@d} or \.{@@f} is scanned, and it is cleared to zero when
   884  the next identifier or index entry cross reference has been made. Similarly,
   885  the global variable |mod_xref_switch| is either |def_flag| or zero, depending
   886  on whether a module name is being defined or used.
   887  
   888  @d num(#)==xmem[#].num_field
   889  @d xlink(#)==xmem[#].xlink_field
   890  @d def_flag=10240 {must be strictly larger than |max_modules|}
   891  
   892  @ @<Types...@>=
   893  @!xref_number=0..max_refs;
   894  
   895  @ @<Globals...@>=
   896  @!xmem:array[xref_number] of packed record@t@>@/
   897    @!num_field: sixteen_bits; {module number plus zero or |def_flag|}
   898    @!xlink_field: sixteen_bits; {pointer to the previous cross reference}
   899    end;
   900  @!xref_ptr:xref_number; {the largest occupied position in |xmem|}
   901  @!xref_switch,@!mod_xref_switch:0..def_flag; {either zero or |def_flag|}
   902  
   903  @ @<Set init...@>=xref_ptr:=0; xref_switch:=0; mod_xref_switch:=0; num(0):=0;
   904  xref[0]:=0; {cross references to undefined modules}
   905  
   906  @ A new cross reference for an identifier is formed by calling |new_xref|,
   907  which discards duplicate entries and ignores non-underlined references
   908  to one-letter identifiers or \PASCAL's reserved words.
   909  
   910  @d append_xref(#)==if xref_ptr=max_refs then overflow('cross reference')
   911    else  begin incr(xref_ptr); num(xref_ptr):=#;
   912      end
   913  
   914  @p procedure new_xref(@!p:name_pointer);
   915  label exit;
   916  var q:xref_number; {pointer to previous cross reference}
   917  @!m,@!n: sixteen_bits; {new and previous cross-reference value}
   918  begin if (reserved(p)or(byte_start[p]+1=byte_start[p+ww]))and
   919    (xref_switch=0) then return;
   920  m:=module_count+xref_switch; xref_switch:=0; q:=xref[p];
   921  if q>0 then
   922    begin n:=num(q);
   923    if (n=m)or(n=m+def_flag) then return
   924    else if m=n+def_flag then
   925      begin num(q):=m; return;
   926      end;
   927    end;
   928  append_xref(m); xlink(xref_ptr):=q; xref[p]:=xref_ptr;
   929  exit: end;
   930  
   931  @ The cross reference lists for module names are slightly different. Suppose
   932  that a module name is defined in modules $m_1$, \dots, $m_k$ and used in
   933  modules $n_1$, \dots, $n_l$. Then its list will contain $m_1+|def_flag|$,
   934  $m_k+|def_flag|$, \dots, $m_2+|def_flag|$, $n_l$, \dots, $n_1$, in
   935  this order.  After Phase II, however, the order will be
   936  $m_1+|def_flag|$, \dots, $m_k+|def_flag|$, $n_1$, \dots, $n_l$.
   937  
   938  @p procedure new_mod_xref(@!p:name_pointer);
   939  var q,@!r:xref_number; {pointers to previous cross references}
   940  begin q:=xref[p]; r:=0;
   941  if q>0 then
   942    begin if mod_xref_switch=0 then while num(q)>=def_flag do
   943      begin r:=q; q:=xlink(q);
   944      end
   945    else if num(q)>=def_flag then
   946      begin r:=q; q:=xlink(q);
   947      end;
   948    end;
   949  append_xref(module_count+mod_xref_switch); xlink(xref_ptr):=q;
   950  mod_xref_switch:=0;
   951  if r=0 then xref[p]:=xref_ptr
   952  else xlink(r):=xref_ptr;
   953  end;
   954  
   955  @ A third large area of memory is used for sixteen-bit `tokens', which appear
   956  in short lists similar to the strings of characters in |byte_mem|. Token lists
   957  are used to contain the result of \PASCAL\ code translated into \TeX\ form;
   958  further details about them will be explained later. A |text_pointer| variable
   959  is an index into |tok_start|.
   960  
   961  @<Types...@>=
   962  @!text_pointer=0..max_texts; {identifies a token list}
   963  
   964  @ The first position of |tok_mem|
   965  that is unoccupied by replacement text is called |tok_ptr|, and the first
   966  unused location of |tok_start| is called |text_ptr|.
   967  Thus, we usually have |tok_start[text_ptr]=tok_ptr|.
   968  
   969  @<Glob...@>=
   970  @t\hskip1em@>@!tok_mem: packed array [0..max_toks] of sixteen_bits; {tokens}
   971  @t\hskip1em@>@!tok_start: array [text_pointer] of sixteen_bits;
   972    {directory into |tok_mem|}
   973  @t\hskip1em@>@!text_ptr:text_pointer; {first unused position in |tok_start|}
   974  @t\hskip1em@>@!tok_ptr:0..max_toks; {first unused position in |tok_mem|}
   975  stat@!max_tok_ptr,@!max_txt_ptr:0..max_toks; {largest values occurring}
   976  tats
   977  
   978  @ @<Set init...@>=
   979  tok_ptr:=1; text_ptr:=1; tok_start[0]:=1; tok_start[1]:=1;
   980  stat max_tok_ptr:=1; max_txt_ptr:=1;@+tats
   981  
   982  @* Searching for identifiers.
   983  The hash table described above is updated by the |id_lookup| procedure,
   984  which finds a given identifier and returns a pointer to its index in
   985  |byte_start|. The identifier is supposed to match character by character
   986  and it is also supposed to have a given |ilk| code; the same name may be
   987  present more than once if it is supposed to appear in the index with
   988  different typesetting conventions.
   989  If the identifier was not already present, it is inserted into the table.
   990  
   991  Because of the way \.{WEAVE}'s scanning mechanism works, it is most convenient
   992  to let |id_lookup| search for an identifier that is present in the |buffer|
   993  array. Two other global variables specify its position in the buffer: the
   994  first character is |buffer[id_first]|, and the last is |buffer[id_loc-1]|.
   995  
   996  @<Glob...@>=
   997  @!id_first:0..long_buf_size; {where the current identifier begins in the buffer}
   998  @!id_loc:0..long_buf_size; {just after the current identifier in the buffer}
   999  @#
  1000  @!hash:array [0..hash_size] of sixteen_bits; {heads of hash lists}
  1001  
  1002  @ Initially all the hash lists are empty.
  1003  
  1004  @<Local variables for init...@>=
  1005  @!h:0..hash_size; {index into hash-head array}
  1006  
  1007  @ @<Set init...@>=
  1008  for h:=0 to hash_size-1 do hash[h]:=0;
  1009  
  1010  @ Here now is the main procedure for finding identifiers (and index
  1011  entries).  The parameter |t| is set to the desired |ilk| code. The
  1012  identifier must either have |ilk=t|, or we must have
  1013  |t=normal| and the identifier must be a reserved word.
  1014  
  1015  @p function id_lookup(@!t:eight_bits):name_pointer; {finds current identifier}
  1016  label found;
  1017  var i:0..long_buf_size; {index into |buffer|}
  1018  @!h:0..hash_size; {hash code}
  1019  @!k:0..max_bytes; {index into |byte_mem|}
  1020  @!w:0..ww-1; {row of |byte_mem|}
  1021  @!l:0..long_buf_size; {length of the given identifier}
  1022  @!p:name_pointer; {where the identifier is being sought}
  1023  begin l:=id_loc-id_first; {compute the length}
  1024  @<Compute the hash code |h|@>;
  1025  @<Compute the name location |p|@>;
  1026  if p=name_ptr then @<Enter a new name into the table at position |p|@>;
  1027  id_lookup:=p;
  1028  end;
  1029  
  1030  @ A simple hash code is used: If the sequence of
  1031  ASCII codes is $c_1c_2\ldots c_n$, its hash value will be
  1032  $$(2^{n-1}c_1+2^{n-2}c_2+\cdots+c_n)\,\bmod\,|hash_size|.$$
  1033  
  1034  @<Compute the hash...@>=
  1035  h:=buffer[id_first]; i:=id_first+1;
  1036  while i<id_loc do
  1037    begin h:=(h+h+buffer[i]) mod hash_size; incr(i);
  1038    end
  1039  
  1040  @ If the identifier is new, it will be placed in position |p=name_ptr|,
  1041  otherwise |p| will point to its existing location.
  1042  
  1043  @<Compute the name location...@>=
  1044  p:=hash[h];
  1045  while p<>0 do
  1046    begin if (length(p)=l)and((ilk[p]=t)or((t=normal)and reserved(p))) then
  1047      @<Compare name |p| with current identifier,
  1048        |goto found| if equal@>;
  1049    p:=link[p];
  1050    end;
  1051  p:=name_ptr; {the current identifier is new}
  1052  link[p]:=hash[h]; hash[h]:=p; {insert |p| at beginning of hash list}
  1053  found:
  1054  
  1055  @ @<Compare name |p|...@>=
  1056  begin i:=id_first; k:=byte_start[p]; w:=p mod ww;
  1057  while (i<id_loc)and(buffer[i]=byte_mem[w,k]) do
  1058    begin incr(i); incr(k);
  1059    end;
  1060  if i=id_loc then goto found; {all characters agree}
  1061  end
  1062  
  1063  @ When we begin the following segment of the program, |p=name_ptr|.
  1064  
  1065  @<Enter a new name...@>=
  1066  begin w:=name_ptr mod ww;
  1067  if byte_ptr[w]+l>max_bytes then overflow('byte memory');
  1068  if name_ptr+ww>max_names then overflow('name');
  1069  i:=id_first; k:=byte_ptr[w]; {get ready to move the identifier into |byte_mem|}
  1070  while i<id_loc do
  1071    begin byte_mem[w,k]:=buffer[i]; incr(k); incr(i);
  1072    end;
  1073  byte_ptr[w]:=k; byte_start[name_ptr+ww]:=k; incr(name_ptr);
  1074  ilk[p]:=t; xref[p]:=0;
  1075  end
  1076  
  1077  @* Initializing the table of reserved words.
  1078  We have to get \PASCAL's reserved words into the hash table, and the
  1079  simplest way to do this is to insert them every time \.{WEAVE} is run.
  1080  A few macros permit us to do the initialization with a compact program.
  1081  
  1082  @d sid9(#)==buffer[9]:=#;cur_name:=id_lookup
  1083  @d sid8(#)==buffer[8]:=#;sid9
  1084  @d sid7(#)==buffer[7]:=#;sid8
  1085  @d sid6(#)==buffer[6]:=#;sid7
  1086  @d sid5(#)==buffer[5]:=#;sid6
  1087  @d sid4(#)==buffer[4]:=#;sid5
  1088  @d sid3(#)==buffer[3]:=#;sid4
  1089  @d sid2(#)==buffer[2]:=#;sid3
  1090  @d sid1(#)==buffer[1]:=#;sid2
  1091  @d id2==id_first:=8; sid8
  1092  @d id3==id_first:=7; sid7
  1093  @d id4==id_first:=6; sid6
  1094  @d id5==id_first:=5; sid5
  1095  @d id6==id_first:=4; sid4
  1096  @d id7==id_first:=3; sid3
  1097  @d id8==id_first:=2; sid2
  1098  @d id9==id_first:=1; sid1
  1099  
  1100  @<Globals...@>=
  1101  @!cur_name:name_pointer; {points to the identifier just inserted}
  1102  
  1103  @ The intended use of the macros above might not be immediately obvious,
  1104  but the riddle is answered by the following:
  1105  
  1106  @<Store all the reserved words@>=
  1107  id_loc:=10;@/
  1108  id3("a")("n")("d")(char_like+and_sign);@/
  1109  id5("a")("r")("r")("a")("y")(array_like);@/
  1110  id5("b")("e")("g")("i")("n")(begin_like);@/
  1111  id4("c")("a")("s")("e")(case_like);@/
  1112  id5("c")("o")("n")("s")("t")(const_like);@/
  1113  id3("d")("i")("v")(div_like);@/
  1114  id2("d")("o")(do_like);@/
  1115  id6("d")("o")("w")("n")("t")("o")(to_like);@/
  1116  id4("e")("l")("s")("e")(else_like);@/
  1117  id3("e")("n")("d")(end_like);@/
  1118  id4("f")("i")("l")("e")(array_like);@/
  1119  id3("f")("o")("r")(for_like);@/
  1120  id8("f")("u")("n")("c")("t")("i")("o")("n")(proc_like);@/
  1121  id4("g")("o")("t")("o")(goto_like);@/
  1122  id2("i")("f")(if_like);@/
  1123  id2("i")("n")(char_like+set_element_sign);@/
  1124  id5("l")("a")("b")("e")("l")(const_like);@/
  1125  id3("m")("o")("d")(div_like);@/
  1126  id3("n")("i")("l")(nil_like);@/
  1127  id3("n")("o")("t")(char_like+not_sign);@/
  1128  id2("o")("f")(do_like);@/
  1129  id2("o")("r")(char_like+or_sign);@/
  1130  id6("p")("a")("c")("k")("e")("d")(goto_like);@/
  1131  id9("p")("r")("o")("c")("e")("d")("u")("r")("e")(proc_like);@/
  1132  id7("p")("r")("o")("g")("r")("a")("m")(proc_like);@/
  1133  id6("r")("e")("c")("o")("r")("d")(record_like);@/
  1134  id6("r")("e")("p")("e")("a")("t")(repeat_like);@/
  1135  id3("s")("e")("t")(array_like);@/
  1136  id4("t")("h")("e")("n")(do_like);@/
  1137  id2("t")("o")(to_like);@/
  1138  id4("t")("y")("p")("e")(const_like);@/
  1139  id5("u")("n")("t")("i")("l")(until_like);@/
  1140  id3("v")("a")("r")(var_like);@/
  1141  id5("w")("h")("i")("l")("e")(for_like);@/
  1142  id4("w")("i")("t")("h")(for_like);@/
  1143  id7("x")("c")("l")("a")("u")("s")("e")(loop_like);@/
  1144  
  1145  @* Searching for module names.
  1146  The |mod_lookup| procedure finds the module name |mod_text[1..l]| in the
  1147  search tree, after inserting it if necessary, and returns a pointer to
  1148  where it was found.
  1149  
  1150  @<Glob...@>=
  1151  @!mod_text:array [0..longest_name] of ASCII_code; {name being sought for}
  1152  
  1153  @ According to the rules of \.{WEB}, no module name
  1154  should be a proper prefix of another, so a ``clean'' comparison should
  1155  occur between any two names. The result of |mod_lookup| is 0 if this
  1156  prefix condition is violated. An error message is printed when such violations
  1157  are detected during phase two of \.{WEAVE}.
  1158  
  1159  @d less=0 {the first name is lexicographically less than the second}
  1160  @d equal=1 {the first name is equal to the second}
  1161  @d greater=2 {the first name is lexicographically greater than the second}
  1162  @d prefix=3 {the first name is a proper prefix of the second}
  1163  @d extension=4 {the first name is a proper extension of the second}
  1164  
  1165  @p function mod_lookup(@!l:sixteen_bits):name_pointer; {finds module name}
  1166  label found;
  1167  var c:less..extension; {comparison between two names}
  1168  @!j:0..longest_name; {index into |mod_text|}
  1169  @!k:0..max_bytes; {index into |byte_mem|}
  1170  @!w:0..ww-1; {row of |byte_mem|}
  1171  @!p:name_pointer; {current node of the search tree}
  1172  @!q:name_pointer; {father of node |p|}
  1173  begin c:=greater; q:=0; p:=root;
  1174  while p<>0 do
  1175    begin @<Set variable |c| to the result of comparing the given name
  1176      to name |p|@>;
  1177    q:=p;
  1178    if c=less then p:=llink[q]
  1179    else if c=greater then p:=rlink[q]
  1180    else goto found;
  1181    end;
  1182  @<Enter a new module name into the tree@>;
  1183  found: if c<>equal then
  1184    begin err_print('! Incompatible section names'); p:=0;
  1185  @.Incompatible section names@>
  1186    end;
  1187  mod_lookup:=p;
  1188  end;
  1189  
  1190  @ @<Enter a new module name...@>=
  1191  w:=name_ptr mod ww; k:=byte_ptr[w];
  1192  if k+l>max_bytes then overflow('byte memory');
  1193  if name_ptr>max_names-ww then overflow('name');
  1194  p:=name_ptr;
  1195  if c=less then llink[q]:=p else rlink[q]:=p;
  1196  llink[p]:=0; rlink[p]:=0; xref[p]:=0; c:=equal;
  1197  for j:=1 to l do byte_mem[w,k+j-1]:=mod_text[j];
  1198  byte_ptr[w]:=k+l; byte_start[name_ptr+ww]:=k+l; incr(name_ptr);
  1199  
  1200  @ @<Set variable |c|...@>=
  1201  begin k:=byte_start[p]; w:=p mod ww; c:=equal; j:=1;
  1202  while (k<byte_start[p+ww]) and (j<=l) and (mod_text[j]=byte_mem[w,k]) do
  1203    begin incr(k); incr(j);
  1204    end;
  1205  if k=byte_start[p+ww] then
  1206    if j>l then c:=equal
  1207    else c:=extension
  1208  else if j>l then c:=prefix
  1209  else if mod_text[j]<byte_mem[w,k] then c:=less
  1210  else c:=greater;
  1211  end
  1212  
  1213  @ The |prefix_lookup| procedure is supposed to find exactly one module
  1214  name that has |mod_text[1..l]| as a prefix. Actually the algorithm
  1215  silently accepts also the situation that some module name is a prefix of
  1216  |mod_text[1..l]|, because the user who painstakingly typed in more than
  1217  necessary probably doesn't want to be told about the wasted effort.
  1218  
  1219  Recall that error messages are not printed during phase one. It is
  1220  possible that the |prefix_lookup| procedure will fail on the first pass,
  1221  because there is no match, yet the second pass might detect no error if a
  1222  matching module name has occurred after the offending prefix. In such a
  1223  case the cross-reference information will be incorrect and \.{WEAVE} will
  1224  report no error. However, such a mistake will be detected by the
  1225  \.{TANGLE} processor.
  1226  
  1227  @p function prefix_lookup(@!l:sixteen_bits):name_pointer; {finds name extension}
  1228  var c:less..extension; {comparison between two names}
  1229  @!count:0..max_names; {the number of hits}
  1230  @!j:0..longest_name; {index into |mod_text|}
  1231  @!k:0..max_bytes; {index into |byte_mem|}
  1232  @!w:0..ww-1; {row of |byte_mem|}
  1233  @!p:name_pointer; {current node of the search tree}
  1234  @!q:name_pointer; {another place to resume the search after one branch is done}
  1235  @!r:name_pointer; {extension found}
  1236  begin q:=0; p:=root; count:=0; r:=0; {begin search at root of tree}
  1237  while p<>0 do
  1238    begin @<Set variable |c| to the result of comparing...@>;
  1239    if c=less then p:=llink[p]
  1240    else if c=greater then p:=rlink[p]
  1241    else  begin r:=p; incr(count); q:=rlink[p]; p:=llink[p];
  1242      end;
  1243    if p=0 then
  1244      begin p:=q; q:=0;
  1245      end;
  1246    end;
  1247  if count<>1 then
  1248    if count=0 then err_print('! Name does not match')
  1249  @.Name does not match@>
  1250    else err_print('! Ambiguous prefix');
  1251  @.Ambiguous prefix@>
  1252  prefix_lookup:=r; {the result will be 0 if there was no match}
  1253  end;
  1254  
  1255  @* Lexical scanning.
  1256  Let us now consider the subroutines that read the \.{WEB} source file
  1257  and break it into meaningful units. There are four such procedures:
  1258  One simply skips to the next `\.{@@\ }' or `\.{@@*}' that begins a
  1259  module; another passes over the \TeX\ text at the beginning of a
  1260  module; the third passes over the \TeX\ text in a \PASCAL\ comment;
  1261  and the last, which is the most interesting, gets the next token of
  1262  a \PASCAL\ text.
  1263  
  1264  @ But first we need to consider the low-level routine |get_line|
  1265  that takes care of merging |change_file| into |web_file|. The |get_line|
  1266  procedure also updates the line numbers for error messages.
  1267  
  1268  @<Globals...@>=
  1269  @!ii:integer; {general purpose |for| loop variable in the outer block}
  1270  @!line:integer; {the number of the current line in the current file}
  1271  @!other_line:integer; {the number of the current line in the input file that
  1272    is not currently being read}
  1273  @!temp_line:integer; {used when interchanging |line| with |other_line|}
  1274  @!limit:0..long_buf_size; {the last character position occupied in the buffer}
  1275  @!loc:0..long_buf_size; {the next character position to be read from the buffer}
  1276  @!input_has_ended: boolean; {if |true|, there is no more input}
  1277  @!changing: boolean; {if |true|, the current line is from |change_file|}
  1278  @!change_pending: boolean; {if |true|, the current change is not yet
  1279    recorded in |changed_module[module_count]|}
  1280  
  1281  @ As we change |changing| from |true| to |false| and back again, we must
  1282  remember to swap the values of |line| and |other_line| so that the |err_print|
  1283  routine will be sure to report the correct line number.
  1284  
  1285  @d change_changing==
  1286    changing := not changing;
  1287    temp_line:=other_line; other_line:=line; line:=temp_line
  1288      {|line @t$\null\BA\null$@> other_line|}
  1289  
  1290  @ When |changing| is |false|, the next line of |change_file| is kept in
  1291  |change_buffer[0..change_limit]|, for purposes of comparison with the next
  1292  line of |web_file|. After the change file has been completely input, we
  1293  set |change_limit:=0|, so that no further matches will be made.
  1294  
  1295  @<Globals...@>=
  1296  @!change_buffer:array[0..buf_size] of ASCII_code;
  1297  @!change_limit:0..buf_size; {the last position occupied in |change_buffer|}
  1298  
  1299  @ Here's a simple function that checks if the two buffers are different.
  1300  
  1301  @p function lines_dont_match:boolean;
  1302  label exit;
  1303  var k:0..buf_size; {index into the buffers}
  1304  begin lines_dont_match:=true;
  1305  if change_limit<>limit then return;
  1306  if limit>0 then
  1307    for k:=0 to limit-1 do if change_buffer[k]<>buffer[k] then return;
  1308  lines_dont_match:=false;
  1309  exit: end;
  1310  
  1311  @ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation
  1312  for the next matching operation. Since blank lines in the change file are
  1313  not used for matching, we have |(change_limit=0)and not changing| if and
  1314  only if the change file is exhausted. This procedure is called only
  1315  when |changing| is true; hence error messages will be reported correctly.
  1316  
  1317  @p procedure prime_the_change_buffer;
  1318  label continue, done, exit;
  1319  var k:0..buf_size; {index into the buffers}
  1320  begin change_limit:=0; {this value will be used if the change file ends}
  1321  @<Skip over comment lines in the change file; |return| if end of file@>;
  1322  @<Skip to the next nonblank line; |return| if end of file@>;
  1323  @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>;
  1324  exit: end;
  1325  
  1326  @ While looking for a line that begins with \.{@@x} in the change file,
  1327  we allow lines that begin with \.{@@}, as long as they don't begin with
  1328  \.{@@y} or \.{@@z} (which would probably indicate that the change file is
  1329  fouled up).
  1330  
  1331  @<Skip over comment lines in the change file...@>=
  1332  loop@+  begin incr(line);
  1333    if not input_ln(change_file) then return;
  1334    if limit<2 then goto continue;
  1335    if buffer[0]<>"@@" then goto continue;
  1336    if (buffer[1]>="X")and(buffer[1]<="Z") then
  1337      buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
  1338    if buffer[1]="x" then goto done;
  1339    if (buffer[1]="y")or(buffer[1]="z") then
  1340      begin loc:=2; err_print('! Where is the matching @@x?');
  1341  @.Where is the match...@>
  1342      end;
  1343  continue: end;
  1344  done:
  1345  
  1346  @ Here we are looking at lines following the \.{@@x}.
  1347  
  1348  @<Skip to the next nonblank line...@>=
  1349  repeat incr(line);
  1350    if not input_ln(change_file) then
  1351      begin err_print('! Change file ended after @@x');
  1352  @.Change file ended...@>
  1353      return;
  1354      end;
  1355  until limit>0;
  1356  
  1357  @ @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>=
  1358  begin change_limit:=limit;
  1359  if limit>0 then for k:=0 to limit-1 do change_buffer[k]:=buffer[k];
  1360  end
  1361  
  1362  @ The following procedure is used to see if the next change entry should
  1363  go into effect; it is called only when |changing| is false.
  1364  The idea is to test whether or not the current
  1365  contents of |buffer| matches the current contents of |change_buffer|.
  1366  If not, there's nothing more to do; but if so, a change is called for:
  1367  All of the text down to the \.{@@y} is supposed to match. An error
  1368  message is issued if any discrepancy is found. Then the procedure
  1369  prepares to read the next line from |change_file|.
  1370  
  1371  When a match is found, the current module is marked as changed unless
  1372  the first line after the \.{@@x} and after the \.{@@y} both start with
  1373  either |'@@*'| or |'@@ '| (possibly preceded by whitespace).
  1374  
  1375  @d if_module_start_then_make_change_pending(#)==
  1376    loc:=0; buffer[limit]:="!";
  1377    while (buffer[loc]=" ")or(buffer[loc]=tab_mark) do incr(loc);
  1378    buffer[limit]:=" ";
  1379    if buffer[loc]="@@" then
  1380      if (buffer[loc+1]="*") or
  1381         (buffer[loc+1]=" ") or (buffer[loc+1]=tab_mark) then
  1382        change_pending:=#
  1383  
  1384  @p procedure check_change; {switches to |change_file| if the buffers match}
  1385  label exit;
  1386  var n:integer; {the number of discrepancies found}
  1387  @!k:0..buf_size; {index into the buffers}
  1388  begin if lines_dont_match then return;
  1389  change_pending:=false;
  1390  if not changed_module[module_count] then
  1391    begin if_module_start_then_make_change_pending(true);
  1392    if not change_pending then changed_module[module_count]:=true;
  1393    end;
  1394  n:=0;
  1395  loop@+  begin change_changing; {now it's |true|}
  1396    incr(line);
  1397    if not input_ln(change_file) then
  1398      begin err_print('! Change file ended before @@y');
  1399  @.Change file ended...@>
  1400      change_limit:=0;  change_changing; {|false| again}
  1401      return;
  1402      end;
  1403    @<If the current line starts with \.{@@y},
  1404      report any discrepancies and |return|@>;
  1405    @<Move |buffer| and |limit|...@>;
  1406    change_changing; {now it's |false|}
  1407    incr(line);
  1408    if not input_ln(web_file) then
  1409      begin err_print('! WEB file ended during a change');
  1410  @.WEB file ended...@>
  1411      input_has_ended:=true; return;
  1412      end;
  1413    if lines_dont_match then incr(n);
  1414    end;
  1415  exit: end;
  1416  
  1417  @ @<If the current line starts with \.{@@y}...@>=
  1418  if limit>1 then if buffer[0]="@@" then
  1419    begin if (buffer[1]>="X")and(buffer[1]<="Z") then
  1420      buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
  1421    if (buffer[1]="x")or(buffer[1]="z") then
  1422      begin loc:=2; err_print('! Where is the matching @@y?');
  1423  @.Where is the match...@>
  1424      end
  1425    else if buffer[1]="y" then
  1426      begin if n>0 then
  1427        begin loc:=2; err_print('! Hmm... ',n:1,
  1428          ' of the preceding lines failed to match');
  1429  @.Hmm... n of the preceding...@>
  1430        end;
  1431      return;
  1432      end;
  1433    end
  1434  
  1435  @ The |reset_input| procedure, which gets \.{WEAVE} ready to read the
  1436  user's \.{WEB} input, is used at the beginning of phases one and two.
  1437  
  1438  @p procedure reset_input;
  1439  begin open_input; line:=0; other_line:=0;@/
  1440  changing:=true; prime_the_change_buffer; change_changing;@/
  1441  limit:=0; loc:=1; buffer[0]:=" "; input_has_ended:=false;
  1442  end;
  1443  
  1444  @ The |get_line| procedure is called when |loc>limit|; it puts the next
  1445  line of merged input into the buffer and updates the other variables
  1446  appropriately. A space is placed at the right end of the line.
  1447  
  1448  @p procedure get_line; {inputs the next line}
  1449  label restart;
  1450  begin restart:if changing then
  1451    @<Read from |change_file| and maybe turn off |changing|@>;
  1452  if not changing then
  1453    begin @<Read from |web_file| and maybe turn on |changing|@>;
  1454    if changing then goto restart;
  1455    end;
  1456  loc:=0; buffer[limit]:=" ";
  1457  end;
  1458  
  1459  @ @<Read from |web_file|...@>=
  1460  begin incr(line);
  1461  if not input_ln(web_file) then input_has_ended:=true
  1462  else if change_limit>0 then check_change;
  1463  end
  1464  
  1465  @ @<Read from |change_file|...@>=
  1466  begin incr(line);
  1467  if not input_ln(change_file) then
  1468    begin err_print('! Change file ended without @@z');
  1469  @.Change file ended...@>
  1470    buffer[0]:="@@"; buffer[1]:="z"; limit:=2;
  1471    end;
  1472  if limit>0 then {check if the change has ended}
  1473    begin if change_pending then
  1474      begin if_module_start_then_make_change_pending(false);
  1475      if change_pending then
  1476        begin changed_module[module_count]:=true; change_pending:=false;
  1477        end;
  1478      end;
  1479    buffer[limit]:=" ";
  1480    if buffer[0]="@@" then
  1481      begin if (buffer[1]>="X")and(buffer[1]<="Z") then
  1482        buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
  1483      if (buffer[1]="x")or(buffer[1]="y") then
  1484        begin loc:=2; err_print('! Where is the matching @@z?');
  1485  @.Where is the match...@>
  1486        end
  1487      else if buffer[1]="z" then
  1488        begin prime_the_change_buffer; change_changing;
  1489        end;
  1490      end;
  1491    end;
  1492  end
  1493  
  1494  @ At the end of the program, we will tell the user if the change file
  1495  had a line that didn't match any relevant line in |web_file|.
  1496  
  1497  @<Check that all changes have been read@>=
  1498  if change_limit<>0 then {|changing| is false}
  1499    begin for ii:=0 to change_limit-1 do buffer[ii]:=change_buffer[ii];
  1500    limit:=change_limit; changing:=true; line:=other_line; loc:=change_limit;
  1501    err_print('! Change file entry did not match');
  1502  @.Change file entry did not match@>
  1503    end
  1504  
  1505  @ Control codes in \.{WEB}, which begin with `\.{@@}', are converted
  1506  into a numeric code designed to simplify \.{WEAVE}'s logic; for example,
  1507  larger numbers are given to the control codes that denote more significant
  1508  milestones, and the code of |new_module| should be the largest of
  1509  all. Some of these numeric control codes take the place of ASCII
  1510  control codes that will not otherwise appear in the output of the
  1511  scanning routines.
  1512  @^ASCII code@>
  1513  
  1514  @d ignore=0 {control code of no interest to \.{WEAVE}}
  1515  @d verbatim=@'2 {extended ASCII alpha will not appear}
  1516  @d force_line=@'3 {extended ASCII beta will not appear}
  1517  @d begin_comment=@'11 {ASCII tab mark will not appear}
  1518  @d end_comment=@'12 {ASCII line feed will not appear}
  1519  @d octal=@'14 {ASCII form feed will not appear}
  1520  @d hex=@'15 {ASCII carriage return will not appear}
  1521  @d double_dot=@'40 {ASCII space will not appear except in strings}
  1522  @d no_underline=@'175 {this code will be intercepted without confusion}
  1523  @d underline=@'176 {this code will be intercepted without confusion}
  1524  @d param=@'177 {ASCII delete will not appear}
  1525  @d xref_roman=@'203 {control code for `\.{@@\^}'}
  1526  @d xref_wildcard=@'204 {control code for `\.{@@:}'}
  1527  @d xref_typewriter=@'205 {control code for `\.{@@.}'}
  1528  @d TeX_string=@'206 {control code for `\.{@@t}'}
  1529  @d check_sum=@'207 {control code for `\.{@@\$}'}
  1530  @d join=@'210 {control code for `\.{@@\&}'}
  1531  @d thin_space=@'211 {control code for `\.{@@,}'}
  1532  @d math_break=@'212 {control code for `\.{@@\char'174}'}
  1533  @d line_break=@'213 {control code for `\.{@@/}'}
  1534  @d big_line_break=@'214 {control code for `\.{@@\#}'}
  1535  @d no_line_break=@'215 {control code for `\.{@@+}'}
  1536  @d pseudo_semi=@'216 {control code for `\.{@@;}'}
  1537  @d format=@'217 {control code for `\.{@@f}'}
  1538  @d definition=@'220 {control code for `\.{@@d}'}
  1539  @d begin_Pascal=@'221 {control code for `\.{@@p}'}
  1540  @d module_name=@'222 {control code for `\.{@@<}'}
  1541  @d new_module=@'223 {control code for `\.{@@\ }' and `\.{@@*}'}
  1542  
  1543  @ Control codes are converted from ASCII to \.{WEAVE}'s internal
  1544  representation by the |control_code| routine.
  1545  
  1546  @p function control_code(@!c:ASCII_code):eight_bits; {convert |c|
  1547    after \.{@@}}
  1548  begin case c of
  1549  "@@": control_code:="@@"; {`quoted' at sign}
  1550  "'": control_code:=octal; {precedes octal constant}
  1551  """": control_code:=hex; {precedes hexadecimal constant}
  1552  "$": control_code:=check_sum; {precedes check sum constant}
  1553  " ",tab_mark,"*": control_code:=new_module; {beginning of a new module}
  1554  "=": control_code:=verbatim;
  1555  "\": control_code:=force_line;
  1556  "D","d": control_code:=definition; {macro definition}
  1557  "F","f": control_code:=format; {format definition}
  1558  "{": control_code:=begin_comment; {begin-comment delimiter}
  1559  "}": control_code:=end_comment; {end-comment delimiter}
  1560  "P","p": control_code:=begin_Pascal; {\PASCAL\ text in unnamed module}
  1561  "&": control_code:=join; {concatenate two tokens}
  1562  "<": control_code:=module_name; {beginning of a module name}
  1563  ">": begin err_print('! Extra @@>'); control_code:=ignore;
  1564  @.Extra \AT!>@>
  1565    end; {end of module name should not be discovered in this way}
  1566  "T","t": control_code:=TeX_string; {\TeX\ box within \PASCAL}
  1567  "!": control_code:=underline; {set definition flag}
  1568  "?": control_code:=no_underline; {reset definition flag}
  1569  "^": control_code:=xref_roman; {index entry to be typeset normally}
  1570  ":": control_code:=xref_wildcard; {index entry to be in user format}
  1571  ".": control_code:=xref_typewriter; {index entry to be in typewriter type}
  1572  ",": control_code:=thin_space; {puts extra space in \PASCAL\ format}
  1573  "|": control_code:=math_break; {allows a break in a formula}
  1574  "/": control_code:=line_break; {forces end-of-line in \PASCAL\ format}
  1575  "#": control_code:=big_line_break; {forces end-of-line and some space besides}
  1576  "+": control_code:=no_line_break; {cancels end-of-line down to single space}
  1577  ";": control_code:=pseudo_semi; {acts like a semicolon, but is invisible}
  1578  @t\4@>@<Special control codes allowed only when debugging@>@;
  1579  othercases begin err_print('! Unknown control code'); control_code:=ignore;
  1580  @.Unknown control code@>
  1581    end
  1582  endcases;
  1583  end;
  1584  
  1585  @ If \.{WEAVE} is compiled with debugging commands, one can write
  1586  \.{@@2}, \.{@@1}, and \.{@@0} to turn tracing fully on, partly on,
  1587  and off, respectively.
  1588  @.\AT!2@>
  1589  @.\AT!1@>
  1590  
  1591  @<Special control codes...@>=
  1592  @!debug@t@>@/
  1593  "0","1","2": begin tracing:=c-"0"; control_code:=ignore;
  1594    end;
  1595  gubed
  1596  
  1597  @ The |skip_limbo| routine is used on the first pass to skip through
  1598  portions of the input that are not in any modules, i.e., that precede
  1599  the first module. After this procedure has been called, the value of
  1600  |input_has_ended| will tell whether or not a new module has
  1601  actually been found.
  1602  
  1603  @p procedure skip_limbo; {skip to next module}
  1604  label exit;
  1605  var c:ASCII_code; {character following \.{@@}}
  1606  begin loop if loc>limit then
  1607      begin get_line;
  1608      if input_has_ended then return;
  1609      end
  1610    else  begin buffer[limit+1]:="@@";
  1611      while buffer[loc]<>"@@" do incr(loc);
  1612      if loc<=limit then
  1613        begin loc:=loc+2; c:=buffer[loc-1];
  1614        if (c=" ")or(c=tab_mark)or(c="*") then return;
  1615        end;
  1616      end;
  1617  exit: end;
  1618  
  1619  @ The |skip_TeX| routine is used on the first pass to skip through
  1620  the \TeX\ code at the beginning of a module. It returns the next
  1621  control code or `\v' found in the input. A |new_module| is
  1622  assumed to exist at the very end of the file.
  1623  
  1624  @p function skip_TeX: eight_bits; {skip past pure \TeX\ code}
  1625  label done;
  1626  var c:eight_bits; {control code found}
  1627  begin loop begin if loc>limit then
  1628      begin get_line;
  1629      if input_has_ended then
  1630        begin c:=new_module; goto done;
  1631        end;
  1632      end;
  1633    buffer[limit+1]:="@@";
  1634    repeat c:=buffer[loc]; incr(loc);
  1635    if c="|" then goto done;
  1636    until c="@@";
  1637    if loc<=limit then
  1638      begin c:=control_code(buffer[loc]); incr(loc); goto done;
  1639      end;
  1640    end;
  1641  done:skip_TeX:=c;
  1642  end;
  1643  
  1644  @ The |skip_comment| routine is used on the first pass to skip
  1645  through \TeX\ code in \PASCAL\ comments. The |bal| parameter
  1646  tells how many left braces are assumed to have been scanned when
  1647  this routine is called, and the procedure returns a corresponding
  1648  value of |bal| at the point that scanning has stopped. Scanning
  1649  stops either at a `\v' that introduces \PASCAL\ text,
  1650  in which case the returned value is positive, or it stops at the
  1651  end of the comment, in which case the returned value is zero.
  1652  The scanning also stops in anomalous situations when the comment
  1653  doesn't end or when it contains an illegal use of \.{@@}.
  1654  One should call |skip_comment(1)| when beginning to scan a comment.
  1655  
  1656  @p function skip_comment(@!bal:eight_bits):eight_bits; {skips \TeX\
  1657    code in comments}
  1658  label done;
  1659  var c:ASCII_code; {the current character}
  1660  begin loop begin if loc>limit then
  1661      begin get_line;
  1662      if input_has_ended then
  1663        begin bal:=0; goto done;
  1664        end; {an error message will occur in phase two}
  1665      end;
  1666    c:=buffer[loc]; incr(loc);
  1667    if c="|" then goto done;
  1668    @<Do special things when |c="@@", "\", "{", "}"|; |goto done| at end@>;
  1669    end;
  1670  done: skip_comment:=bal;
  1671  end;
  1672  
  1673  @ @<Do special things when |c="@@"...@>=
  1674  if c="@@" then
  1675    begin c:=buffer[loc];
  1676    if (c<>" ")and(c<>tab_mark)and(c<>"*") then incr(loc)
  1677    else  begin decr(loc); bal:=0; goto done;
  1678      end {an error message will occur in phase two}
  1679    end
  1680  else if (c="\")and(buffer[loc]<>"@@") then incr(loc)
  1681  else if c="{" then incr(bal)
  1682  else if c="}" then
  1683    begin decr(bal);
  1684    if bal=0 then goto done;
  1685    end
  1686  
  1687  @* Inputting the next token.
  1688  As stated above, \.{WEAVE}'s most interesting lexical scanning routine is the
  1689  |get_next| function that inputs the next token of \PASCAL\ input. However,
  1690  |get_next| is not especially complicated.
  1691  
  1692  The result of |get_next| is either an ASCII code for some special character,
  1693  or it is a special code representing a pair of characters (e.g., `\.{:=}'
  1694  or `\.{..}'), or it is the numeric value computed by the |control_code|
  1695  procedure, or it is one of the following special codes:
  1696  
  1697  \yskip\hang |exponent|: The `\.E' in a real constant.
  1698  
  1699  \yskip\hang |identifier|: In this case the global variables |id_first|
  1700  and |id_loc| will have been set to the appropriate values needed by the
  1701  |id_lookup| routine.
  1702  
  1703  \yskip\hang |string|: In this case the global variables |id_first| and
  1704  |id_loc| will have been set to the beginning and ending-plus-one locations
  1705  in the buffer.  The string ends with the first reappearance of its initial
  1706  delimiter; thus, for example, $$\.{\'This isn\'\'t a single string\'}$$
  1707  will be treated as two consecutive strings, the first being \.{\'This
  1708  isn\'}.
  1709  
  1710  \yskip\noindent Furthermore, some of the control codes cause
  1711  |get_next| to take additional actions:
  1712  
  1713  \yskip\hang |xref_roman|, |xref_wildcard|,
  1714  |xref_typewriter|, |TeX_string|: The values of
  1715  |id_first| and |id_loc| will be set so that the string in question appears
  1716  in |buffer[id_first..(id_loc-1)]|.
  1717  
  1718  \yskip\hang |module_name|: In this case the global variable |cur_module| will
  1719  point to the |byte_start| entry for the module name that has just been scanned.
  1720  
  1721  \yskip\noindent If |get_next| sees `\.{@@!}' or `\.{@@?}',
  1722  it sets |xref_switch| to |def_flag| or zero and goes on to the next token.
  1723  
  1724  A global variable called |scanning_hex| is set |true| during the time that
  1725  the letters \.A through \.F should be treated as if they were digits.
  1726  
  1727  @d exponent=@'200 {\.E or \.e following a digit}
  1728  @d string=@'201 {\PASCAL\ string or \.{WEB} precomputed string}
  1729  @d identifier=@'202 {\PASCAL\ identifier or reserved word}
  1730  
  1731  @<Globals...@>=
  1732  @!cur_module: name_pointer; {name of module just scanned}
  1733  @!scanning_hex: boolean; {are we scanning a hexadecimal constant?}
  1734  
  1735  @ @<Set init...@>=
  1736  scanning_hex:=false;
  1737  
  1738  @ As one might expect, |get_next| consists mostly of a big switch
  1739  that branches to the various special cases that can arise.
  1740  
  1741  @d up_to(#)==#-24,#-23,#-22,#-21,#-20,#-19,#-18,#-17,#-16,#-15,#-14,
  1742    #-13,#-12,#-11,#-10,#-9,#-8,#-7,#-6,#-5,#-4,#-3,#-2,#-1,#
  1743  
  1744  @p function get_next:eight_bits; {produces the next input token}
  1745  label restart,done,found;
  1746  var c:eight_bits; {the current character}
  1747  @!d:eight_bits; {the next character}
  1748  @!j,@!k:0..longest_name; {indices into |mod_text|}
  1749  begin restart: if loc>limit then
  1750    begin get_line;
  1751    if input_has_ended then
  1752      begin c:=new_module; goto found;
  1753      end;
  1754    end;
  1755  c:=buffer[loc]; incr(loc);
  1756  if scanning_hex then @<Go to |found| if |c| is a hexadecimal digit,
  1757    otherwise set |scanning_hex:=false|@>;
  1758  case c of
  1759  "A",up_to("Z"),"a",up_to("z"): @<Get an identifier@>;
  1760  "'","""": @<Get a string@>;
  1761  "@@": @<Get control code and possible module name@>;
  1762  @t\4@>@<Compress two-symbol combinations like `\.{:=}'@>@;
  1763  " ",tab_mark: goto restart; {ignore spaces and tabs}
  1764  "}": begin err_print('! Extra }'); goto restart;
  1765  @.Extra \}@>
  1766    end;
  1767  othercases if c>=128 then goto restart {ignore nonstandard characters}
  1768    else do_nothing
  1769  endcases;
  1770  found:@!debug if trouble_shooting then debug_help;@;@+gubed@/
  1771  get_next:=c;
  1772  end;
  1773  
  1774  @ @<Go to |found| if |c| is a hexadecimal digit...@>=
  1775  if ((c>="0")and(c<="9"))or((c>="A")and(c<="F")) then goto found
  1776  else scanning_hex:=false
  1777  
  1778  @ Note that the following code substitutes \.{@@\{} and \.{@@\}} for the
  1779  respective combinations `\.{(*}' and `\.{*)}'. Explicit braces should be used
  1780  for \TeX\ comments in \PASCAL\ text.
  1781  
  1782  @d compress(#)==begin if loc<=limit then begin c:=#; incr(loc); end; end
  1783  
  1784  @<Compress two-symbol...@>=
  1785  ".": if buffer[loc]="." then compress(double_dot)
  1786    else if buffer[loc]=")" then compress("]");
  1787  ":": if buffer[loc]="=" then compress(left_arrow);
  1788  "=": if buffer[loc]="=" then compress(equivalence_sign);
  1789  ">": if buffer[loc]="=" then compress(greater_or_equal);
  1790  "<": if buffer[loc]="=" then compress(less_or_equal)
  1791    else if buffer[loc]=">" then compress(not_equal);
  1792  "(": if buffer[loc]="*" then compress(begin_comment)
  1793    else if buffer[loc]="." then compress("[");
  1794  "*": if buffer[loc]=")" then compress(end_comment);
  1795  
  1796  @ @<Get an identifier@>=
  1797  begin if ((c="E")or(c="e"))and(loc>1) then
  1798    if (buffer[loc-2]<="9")and(buffer[loc-2]>="0") then c:=exponent;
  1799  if c<>exponent then
  1800    begin decr(loc); id_first:=loc;
  1801    repeat incr(loc); d:=buffer[loc];
  1802    until ((d<"0")or((d>"9")and(d<"A"))or((d>"Z")and(d<"a"))or(d>"z"))and(d<>"_");
  1803    c:=identifier; id_loc:=loc;
  1804    end;
  1805  end
  1806  
  1807  @ A string that starts and ends with single or double quote marks is
  1808  scanned by the following piece of the program.
  1809  
  1810  @<Get a string@>=
  1811  begin id_first:=loc-1;
  1812  repeat d:=buffer[loc]; incr(loc);
  1813  if loc>limit then
  1814    begin err_print('! String constant didn''t end');
  1815  @.String constant didn't end@>
  1816    loc:=limit; d:=c;
  1817    end;
  1818  until d=c;
  1819  id_loc:=loc; c:=string;
  1820  end
  1821  
  1822  @ After an \.{@@} sign has been scanned, the next character tells us
  1823  whether there is more work to do.
  1824  
  1825  @<Get control code and possible module name@>=
  1826  begin c:=control_code(buffer[loc]); incr(loc);
  1827  if c=underline then
  1828    begin xref_switch:=def_flag; goto restart;
  1829    end
  1830  else if c=no_underline then
  1831    begin xref_switch:=0; goto restart;
  1832    end
  1833  else if (c<=TeX_string)and(c>=xref_roman) then
  1834    @<Scan to the next \.{@@>}@>
  1835  else if c=hex then scanning_hex:=true
  1836  else if c=module_name then
  1837    @<Scan the module name and make |cur_module| point to it@>
  1838  else if c=verbatim then @<Scan a verbatim string@>;
  1839  end
  1840  
  1841  @ The occurrence of a module name sets |xref_switch| to zero,
  1842  because the module name might (for example) follow \&{var}.
  1843  
  1844  @<Scan the module name...@>=
  1845  begin @<Put module name into |mod_text[1..k]|@>;
  1846  if k>3 then
  1847    begin if (mod_text[k]=".")and(mod_text[k-1]=".")and(mod_text[k-2]=".") then
  1848      cur_module:=prefix_lookup(k-3)
  1849    else cur_module:=mod_lookup(k);
  1850    end
  1851  else cur_module:=mod_lookup(k);
  1852  xref_switch:=0;
  1853  end
  1854  
  1855  @ Module names are placed into the |mod_text| array with consecutive spaces,
  1856  tabs, and carriage-returns replaced by single spaces. There will be no
  1857  spaces at the beginning or the end. (We set |mod_text[0]:=" "| to facilitate
  1858  this, since the |mod_lookup| routine uses |mod_text[1]| as the first
  1859  character of the name.)
  1860  
  1861  @<Set init...@>=mod_text[0]:=" ";
  1862  
  1863  @ @<Put module name...@>=
  1864  k:=0;
  1865  loop@+  begin if loc>limit then
  1866      begin get_line;
  1867      if input_has_ended then
  1868        begin err_print('! Input ended in section name');
  1869  @.Input ended in section name@>
  1870        loc:=1; goto done;
  1871        end;
  1872      end;
  1873    d:=buffer[loc];
  1874    @<If end of name, |goto done|@>;
  1875    incr(loc); if k<longest_name-1 then incr(k);
  1876    if (d=" ")or(d=tab_mark) then
  1877      begin d:=" "; if mod_text[k-1]=" " then decr(k);
  1878      end;
  1879    mod_text[k]:=d;
  1880    end;
  1881  done: @<Check for overlong name@>;
  1882  if (mod_text[k]=" ")and(k>0) then decr(k)
  1883  
  1884  @ @<If end of name,...@>=
  1885  if d="@@" then
  1886    begin d:=buffer[loc+1];
  1887    if d=">" then
  1888      begin loc:=loc+2; goto done;
  1889      end;
  1890    if (d=" ")or(d=tab_mark)or(d="*") then
  1891      begin err_print('! Section name didn''t end'); goto done;
  1892  @.Section name didn't end@>
  1893      end;
  1894    incr(k); mod_text[k]:="@@"; incr(loc); {now |d=buffer[loc]| again}
  1895    end
  1896  
  1897  @ @<Check for overlong name@>=
  1898  if k>=longest_name-2 then
  1899    begin print_nl('! Section name too long: ');
  1900  @.Section name too long@>
  1901    for j:=1 to 25 do print(xchr[mod_text[j]]);
  1902    print('...'); mark_harmless;
  1903    end
  1904  
  1905  @ @<Scan to the next...@>=
  1906  begin id_first:=loc; buffer[limit+1]:="@@";
  1907  while buffer[loc]<>"@@" do incr(loc);
  1908  id_loc:=loc;
  1909  if loc>limit then
  1910    begin err_print('! Control text didn''t end'); loc:=limit;
  1911  @.Control text didn't end@>
  1912    end
  1913  else  begin loc:=loc+2;
  1914    if buffer[loc-1]<>">" then
  1915      err_print('! Control codes are forbidden in control text');
  1916  @.Control codes are forbidden...@>
  1917    end;
  1918  end
  1919  
  1920  @ A verbatim \PASCAL\ string will be treated like ordinary strings, but
  1921  with no surrounding delimiters.  At the present point in the program we
  1922  have |buffer[loc-1]=verbatim|; we must set |id_first| to the beginning
  1923  of the string itself, and |id_loc| to its ending-plus-one location in the
  1924  buffer.  We also set |loc| to the position just after the ending delimiter.
  1925  
  1926  @<Scan a verbatim string@>=
  1927  begin id_first:=loc; incr(loc);
  1928  buffer[limit+1]:="@@"; buffer[limit+2]:=">";
  1929  while (buffer[loc]<>"@@")or(buffer[loc+1]<>">") do incr(loc);
  1930  if loc>=limit then err_print('! Verbatim string didn''t end');
  1931  @.Verbatim string didn't end@>
  1932  id_loc:=loc; loc:=loc+2;
  1933  end
  1934  
  1935  @* Phase one processing.
  1936  We now have accumulated enough subroutines to make it possible to carry out
  1937  \.{WEAVE}'s first pass over the source file. If everything works right,
  1938  both phase one and phase two of \.{WEAVE} will assign the same numbers to
  1939  modules, and these numbers will agree with what \.{TANGLE} does.
  1940  
  1941  The global variable |next_control| often contains the most recent output of
  1942  |get_next|; in interesting cases, this will be the control code that
  1943  ended a module or part of a module.
  1944  
  1945  @<Glob...@>=@!next_control:eight_bits; {control code waiting to be acting upon}
  1946  
  1947  @ The overall processing strategy in phase one has the following
  1948  straightforward outline.
  1949  
  1950  @<Phase I: Read all the user's text and store the cross references@>=
  1951  phase_one:=true; phase_three:=false;
  1952  reset_input;
  1953  module_count:=0; changed_module[0]:=false;
  1954  skip_limbo; change_exists:=false;
  1955  while not input_has_ended do
  1956    @<Store cross reference data for the current module@>;
  1957  changed_module[module_count]:=change_exists;
  1958    {the index changes if anything does}
  1959  phase_one:=false; {prepare for second phase}
  1960  @<Print error messages about unused or undefined module names@>;
  1961  
  1962  @ @<Store cross reference data...@>=
  1963  begin incr(module_count);
  1964  if module_count=max_modules then overflow('section number');
  1965  changed_module[module_count]:=changing;
  1966   {it will become |true| if any line changes}
  1967  if buffer[loc-1]="*" then
  1968    begin print('*',module_count:1);
  1969    update_terminal; {print a progress report}
  1970    end;
  1971  @<Store cross references in the \TeX\ part of a module@>;
  1972  @<Store cross references in the \(definition part of a module@>;
  1973  @<Store cross references in the \PASCAL\ part of a module@>;
  1974  if changed_module[module_count] then change_exists:=true;
  1975  end
  1976  
  1977  @ The |Pascal_xref| subroutine stores references to identifiers in
  1978  \PASCAL\ text material beginning with the current value of |next_control|
  1979  and continuing until |next_control| is `\.\{' or `\v', or until the next
  1980  ``milestone'' is passed (i.e., |next_control>=format|). If
  1981  |next_control>=format| when |Pascal_xref| is called, nothing will happen;
  1982  but if |next_control="|"| upon entry, the procedure assumes that this is
  1983  the `\v' preceding \PASCAL\ text that is to be processed.
  1984  
  1985  The program uses the fact that our internal code numbers satisfy
  1986  the relations |xref_roman=identifier+roman| and |xref_wildcard=identifier
  1987  +wildcard| and |xref_typewriter=identifier+
  1988  typewriter| and |normal=0|. An implied `\.{@@!}' is inserted after
  1989  \&{function}, \&{procedure}, \&{program}, and \&{var}.
  1990  
  1991  @p procedure Pascal_xref; {makes cross references for \PASCAL\ identifiers}
  1992  label exit;
  1993  var p:name_pointer; {a referenced name}
  1994  begin while next_control<format do
  1995    begin if (next_control>=identifier)and
  1996        (next_control<=xref_typewriter) then
  1997      begin p:=id_lookup(next_control-identifier); new_xref(p);
  1998      if (ilk[p]=proc_like)or(ilk[p]=var_like) then
  1999        xref_switch:=def_flag; {implied `\.{@@!}'}
  2000      end;
  2001    next_control:=get_next;
  2002    if (next_control="|")or(next_control="{") then return;
  2003    end;
  2004  exit:end;
  2005  
  2006  @ The |outer_xref| subroutine is like |Pascal_xref| but it begins
  2007  with |next_control<>"|"| and ends with |next_control>=format|. Thus, it
  2008  handles \PASCAL\ text with embedded comments.
  2009  
  2010  @p procedure outer_xref; {extension of |Pascal_xref|}
  2011  var bal:eight_bits; {brace level in comment}
  2012  begin while next_control<format do
  2013    if next_control<>"{" then Pascal_xref
  2014    else  begin bal:=skip_comment(1); next_control:="|";
  2015      while bal>0 do
  2016        begin Pascal_xref;
  2017        if next_control="|" then bal:=skip_comment(bal)
  2018        else bal:=0; {an error will be reported in phase two}
  2019        end;
  2020      end;
  2021  end;
  2022  
  2023  @ In the \TeX\ part of a module, cross reference entries are made only for
  2024  the identifiers in \PASCAL\ texts enclosed in \pb, or for control texts
  2025  enclosed in \.{@@\^}$\,\ldots\,$\.{@@>} or \.{@@.}$\,\ldots\,$\.{@@>}
  2026  or \.{@@:}$\,\ldots\,$\.{@@>}.
  2027  
  2028  @<Store cross references in the \T...@>=
  2029  repeat next_control:=skip_TeX;
  2030  case next_control of
  2031  underline: xref_switch:=def_flag;
  2032  no_underline: xref_switch:=0;
  2033  "|": Pascal_xref;
  2034  xref_roman, xref_wildcard, xref_typewriter, module_name:
  2035    begin loc:=loc-2; next_control:=get_next; {scan to \.{@@>}}
  2036    if next_control<>module_name then
  2037      new_xref(id_lookup(next_control-identifier));
  2038    end;
  2039  othercases do_nothing
  2040  endcases;
  2041  until next_control>=format
  2042  
  2043  @ During the definition and \PASCAL\ parts of a module, cross references
  2044  are made for all identifiers except reserved words; however, the
  2045  identifiers in a format definition are referenced even if they are
  2046  reserved. The \TeX\ code in comments is, of course, ignored, except for
  2047  \PASCAL\ portions enclosed in \pb; the text of a module name is skipped
  2048  entirely, even if it contains \pb\ constructions.
  2049  
  2050  The variables |lhs| and |rhs| point to the respective identifiers involved
  2051  in a format definition.
  2052  
  2053  @<Global...@>=
  2054  @!lhs,@!rhs:name_pointer; {indices into |byte_start| for format identifiers}
  2055  
  2056  @ When we get to the following code we have |next_control>=format|.
  2057  
  2058  @<Store cross references in the \(d...@>=
  2059  while next_control<=definition do {|format| or |definition|}
  2060    begin xref_switch:=def_flag; {implied \.{@@!}}
  2061    if next_control=definition then next_control:=get_next
  2062    else @<Process a format definition@>;
  2063    outer_xref;
  2064    end
  2065  
  2066  @ Error messages for improper format definitions will be issued in phase
  2067  two. Our job in phase one is to define the |ilk| of a properly formatted
  2068  identifier, and to fool the |new_xref| routine into thinking that the
  2069  identifier on the right-hand side of the format definition is not a
  2070  reserved word.
  2071  
  2072  @<Process a form...@>=
  2073  begin next_control:=get_next;
  2074  if next_control=identifier then
  2075    begin lhs:=id_lookup(normal); ilk[lhs]:=normal; new_xref(lhs);
  2076    next_control:=get_next;
  2077    if next_control=equivalence_sign then
  2078      begin next_control:=get_next;
  2079      if next_control=identifier then
  2080        begin rhs:=id_lookup(normal);
  2081        ilk[lhs]:=ilk[rhs]; ilk[rhs]:=normal; new_xref(rhs);
  2082        ilk[rhs]:=ilk[lhs]; next_control:=get_next;
  2083        end;
  2084      end;
  2085    end;
  2086  end
  2087  
  2088  @ Finally, when the \TeX\ and definition parts have been treated, we have
  2089  |next_control>=begin_Pascal|.
  2090  
  2091  @<Store cross references in the \P...@>=
  2092  if next_control<=module_name then {|begin_Pascal| or |module_name|}
  2093    begin if next_control=begin_Pascal then mod_xref_switch:=0
  2094    else mod_xref_switch:=def_flag;
  2095    repeat if next_control=module_name then new_mod_xref(cur_module);
  2096      next_control:=get_next; outer_xref;
  2097    until next_control>module_name;
  2098    end
  2099  
  2100  @ After phase one has looked at everything, we want to check that each
  2101  module name was both defined and used.
  2102  The variable |cur_xref| will point to cross references for the
  2103  current module name of interest.
  2104  
  2105  @<Glob...@>=@!cur_xref:xref_number; {temporary cross reference pointer}
  2106  
  2107  @ The following recursive procedure
  2108  walks through the tree of module names and prints out anomalies.
  2109  @^recursion@>
  2110  
  2111  @p procedure mod_check(@!p:name_pointer); {print anomalies in subtree |p|}
  2112  begin if p>0 then
  2113    begin mod_check(llink[p]);@/
  2114    cur_xref:=xref[p];
  2115    if num(cur_xref)<def_flag then
  2116      begin print_nl('! Never defined: <'); print_id(p);
  2117  @.Never defined: <section name>@>
  2118      print('>'); mark_harmless;
  2119      end;
  2120    while num(cur_xref)>=def_flag do cur_xref:=xlink(cur_xref);
  2121    if cur_xref=0 then
  2122      begin print_nl('! Never used: <'); print_id(p); print('>');
  2123  @.Never used: <section name>@>
  2124      mark_harmless;
  2125      end;
  2126    mod_check(rlink[p]);
  2127    end;
  2128  end;
  2129  
  2130  @ @<Print error messages about un...@>=@+mod_check(root)
  2131  
  2132  @* Low-level output routines.
  2133  The \TeX\ output is supposed to appear in lines at most |line_length|
  2134  characters long, so we place it into an output buffer. During the output
  2135  process, |out_line| will hold the current line number of the line about to
  2136  be output.
  2137  
  2138  @<Glo...@>=
  2139  @!out_buf:array[0..line_length] of ASCII_code; {assembled characters}
  2140  @!out_ptr:0..line_length; {number of characters in |out_buf|}
  2141  @!out_line: integer; {coordinates of next line to be output}
  2142  
  2143  @ The |flush_buffer| routine empties the buffer up to a given breakpoint,
  2144  and moves any remaining characters to the beginning of the next line.
  2145  If the |per_cent| parameter is |true|, a |"%"| is appended to the line
  2146  that is being output; in this case the breakpoint |b| should be strictly
  2147  less than |line_length|. If the |per_cent| parameter is |false|,
  2148  trailing blanks are suppressed.
  2149  The characters emptied from the buffer form a new line of output;
  2150  if the |carryover| parameter is true, a |"%"| in that line will be
  2151  carried over to the next line (so that \TeX\ will ignore the completion
  2152  of commented-out text).
  2153  
  2154  @p procedure flush_buffer(@!b:eight_bits;@!per_cent,@!carryover:boolean);
  2155    {outputs |out_buf[1..b]|, where |b<=out_ptr|}
  2156  label done,found;
  2157  var j,@!k:0..line_length;
  2158  begin j:=b;
  2159  if not per_cent then {remove trailing blanks}
  2160    loop@+  begin if j=0 then goto done;
  2161      if out_buf[j]<>" " then goto done;
  2162      decr(j);
  2163      end;
  2164  done: for k:=1 to j do write(tex_file,xchr[out_buf[k]]);
  2165  if per_cent then write(tex_file,xchr["%"]);
  2166  write_ln(tex_file); incr(out_line);
  2167  if carryover then
  2168    for k:=1 to j do
  2169      if out_buf[k]="%" then
  2170        if (k=1)or(out_buf[k-1]<>"\") then {comment mode should be preserved}
  2171          begin out_buf[b]:="%"; decr(b); goto found;
  2172          end;
  2173  found: if (b<out_ptr) then
  2174    for k:=b+1 to out_ptr do out_buf[k-b]:=out_buf[k];
  2175  out_ptr:=out_ptr-b;
  2176  end;
  2177  
  2178  @ When we are copying \TeX\ source material, we retain line breaks
  2179  that occur in the input, except that an empty line is not
  2180  output when the \TeX\ source line was nonempty. For example, a line
  2181  of the \TeX\ file that contains only an index cross-reference entry
  2182  will not be copied. The |finish_line| routine is called just before
  2183  |get_line| inputs a new line, and just after a line break token has
  2184  been emitted during the output of translated \PASCAL\ text.
  2185  
  2186  @p procedure finish_line; {do this at the end of a line}
  2187  label exit;
  2188  var k:0..buf_size; {index into |buffer|}
  2189  begin if out_ptr>0 then flush_buffer(out_ptr,false,false)
  2190  else  begin for k:=0 to limit do
  2191      if (buffer[k]<>" ")and(buffer[k]<>tab_mark) then return;
  2192    flush_buffer(0,false,false);
  2193    end;
  2194  exit:end;
  2195  
  2196  @ In particular, the |finish_line| procedure is called near the very
  2197  beginning of phase two. We initialize the output variables in a slightly
  2198  tricky way so that the first line of the output file will be
  2199  `\.{\\input webmac}'.
  2200  @.\\input webmac@>
  2201  @.webmac@>
  2202  
  2203  @<Set init...@>=
  2204  out_ptr:=1; out_line:=1; out_buf[1]:="c"; write(tex_file,'\input webma');
  2205  
  2206  @ When we wish to append the character |c| to the output buffer, we write
  2207  `$|out|(c)$'; this will cause the buffer to be emptied if it was already
  2208  full. Similarly, `$|out2|(c_1)(c_2)$' appends a pair of characters.
  2209  A line break will occur at a space or after a single-nonletter
  2210  \TeX\ control sequence.
  2211  
  2212  @d oot(#)==@;@/
  2213    if out_ptr=line_length then break_out;
  2214    incr(out_ptr); out_buf[out_ptr]:=#;
  2215  @d oot1(#)==oot(#)@+end
  2216  @d oot2(#)==oot(#)@,oot1
  2217  @d oot3(#)==oot(#)@,oot2
  2218  @d oot4(#)==oot(#)@,oot3
  2219  @d oot5(#)==oot(#)@,oot4
  2220  @d out==@+begin oot1
  2221  @d out2==@+begin oot2
  2222  @d out3==@+begin oot3
  2223  @d out4==@+begin oot4
  2224  @d out5==@+begin oot5
  2225  
  2226  @ The |break_out| routine is called just before the output buffer is about
  2227  to overflow. To make this routine a little faster, we initialize position
  2228  0 of the output buffer to `\.\\'; this character isn't really output.
  2229  
  2230  @<Set init...@>=
  2231  out_buf[0]:="\";
  2232  
  2233  @ A long line is broken at a blank space or just before a backslash that isn't
  2234  preceded by another backslash. In the latter case, a |"%"| is output at
  2235  the break.
  2236  
  2237  @p procedure break_out; {finds a way to break the output line}
  2238  label exit;
  2239  var k:0..line_length; {index into |out_buf|}
  2240  @!d:ASCII_code; {character from the buffer}
  2241  begin k:=out_ptr;
  2242  loop@+  begin if k=0 then
  2243      @<Print warning message, break the line, |return|@>;
  2244    d:=out_buf[k];
  2245    if d=" " then
  2246      begin flush_buffer(k,false,true); return;
  2247      end;
  2248    if (d="\")and(out_buf[k-1]<>"\") then {in this case |k>1|}
  2249      begin flush_buffer(k-1,true,true); return;
  2250      end;
  2251    decr(k);
  2252    end;
  2253  exit:end;
  2254  
  2255  @ We get to this module only in unusual cases that the entire output line
  2256  consists of a string of backslashes followed by a string of nonblank
  2257  non-backslashes. In such cases it is almost always safe to break the
  2258  line by putting a |"%"| just before the last character.
  2259  
  2260  @<Print warning message...@>=
  2261  begin print_nl('! Line had to be broken (output l.',out_line:1);
  2262  @.Line had to be broken@>
  2263  print_ln('):');
  2264  for k:=1 to out_ptr-1 do print(xchr[out_buf[k]]);
  2265  new_line; mark_harmless;
  2266  flush_buffer(out_ptr-1,true,true); return;
  2267  end
  2268  
  2269  @ Here is a procedure that outputs a module number in decimal notation.
  2270  
  2271  @<Glob...@>=@!dig:array[0..4] of 0..9; {digits to output}
  2272  
  2273  @ The number to be converted by |out_mod| is known to be less than
  2274  |def_flag|, so it cannot have more than five decimal digits.  If
  2275  the module is changed, we output `\.{\\*}' just after the number.
  2276  
  2277  @p procedure out_mod(@!m:integer); {output a module number}
  2278  var k:0..5; {index into |dig|}
  2279  @!a:integer; {accumulator}
  2280  begin k:=0; a:=m;
  2281  repeat dig[k]:=a mod 10; a:=a div 10; incr(k);
  2282  until a=0;
  2283  repeat decr(k); out(dig[k]+"0");
  2284  until k=0;
  2285  if changed_module[m] then out2("\")("*");
  2286  @.\\*@>
  2287  end;
  2288  
  2289  @ The |out_name| subroutine is used to output an identifier or index
  2290  entry, enclosing it in braces.
  2291  
  2292  @p procedure out_name(@!p:name_pointer); {outputs a name}
  2293  var k:0..max_bytes; {index into |byte_mem|}
  2294  @!w:0..ww-1; {row of |byte_mem|}
  2295  begin out("{"); w:=p mod ww;
  2296  for k:=byte_start[p] to byte_start[p+ww]-1 do
  2297    begin if byte_mem[w,k]="_" then out("\");
  2298  @.\\_@>
  2299    out(byte_mem[w,k]);
  2300    end;
  2301  out("}");
  2302  end;
  2303  
  2304  @* Routines that copy \TeX\ material.
  2305  During phase two, we use the subroutines |copy_limbo|, |copy_TeX|, and
  2306  |copy_comment| in place of the analogous |skip_limbo|, |skip_TeX|, and
  2307  |skip_comment| that were used in phase one.
  2308  
  2309  The |copy_limbo| routine, for example, takes \TeX\ material that is not
  2310  part of any module and transcribes it almost verbatim to the output file.
  2311  No `\.{@@}' signs should occur in such material except in `\.{@@@@}'
  2312  pairs; such pairs are replaced by singletons.
  2313  
  2314  @p procedure copy_limbo; {copy \TeX\ code until the next module begins}
  2315  label exit;
  2316  var c:ASCII_code; {character following \.{@@} sign}
  2317  begin loop if loc>limit then
  2318      begin finish_line; get_line;
  2319      if input_has_ended then return;
  2320      end
  2321    else  begin buffer[limit+1]:="@@";
  2322      @<Copy up to control code, |return| if finished@>;
  2323      end;
  2324  exit:end;
  2325  
  2326  @ @<Copy up to control...@>=
  2327  while buffer[loc]<>"@@" do
  2328    begin out(buffer[loc]); incr(loc);
  2329    end;
  2330  if loc<=limit then
  2331    begin loc:=loc+2; c:=buffer[loc-1];
  2332    if (c=" ")or(c=tab_mark)or(c="*") then return;
  2333    out("@@");
  2334    if c<>"@@" then err_print('! Double @@ required outside of sections');
  2335  @.Double \AT! required...@>
  2336    end
  2337  
  2338  @ The |copy_TeX| routine processes the \TeX\ code at the beginning of a
  2339  module; for example, the words you are now reading were copied in this
  2340  way. It returns the next control code or `\v' found in the input.
  2341  
  2342  @p function copy_TeX:eight_bits; {copy pure \TeX\ material}
  2343  label done;
  2344  var c:eight_bits; {control code found}
  2345  begin loop begin if loc>limit then
  2346      begin finish_line; get_line;
  2347      if input_has_ended then
  2348        begin c:=new_module; goto done;
  2349        end;
  2350      end;
  2351    buffer[limit+1]:="@@";
  2352    @<Copy up to `\v' or control code, |goto done| if finished@>;
  2353    end;
  2354  done:copy_TeX:=c;
  2355  end;
  2356  
  2357  @ We don't copy spaces or tab marks into the beginning of a line. This
  2358  makes the test for empty lines in |finish_line| work.
  2359  
  2360  @<Copy up to `\v'...@>=
  2361  repeat c:=buffer[loc]; incr(loc);
  2362  if c="|" then goto done;
  2363  if c<>"@@" then
  2364    begin out(c);
  2365    if (out_ptr=1)and((c=" ")or(c=tab_mark)) then decr(out_ptr);
  2366    end;
  2367  until c="@@";
  2368  if loc<=limit then
  2369    begin c:=control_code(buffer[loc]); incr(loc);
  2370    goto done;
  2371    end
  2372  
  2373  @ The |copy_comment| uses and returns a brace-balance value, following the
  2374  conventions of |skip_comment| above. Instead of copying the \TeX\ material
  2375  into the output buffer, this procedure copies it into the token memory.
  2376  The abbreviation |app_tok(t)| is used to append token |t| to the current
  2377  token list, and it also makes sure that it is possible to append at least
  2378  one further token without overflow.
  2379  
  2380  @d app_tok(#)==begin if tok_ptr+2>max_toks then overflow('token');
  2381    tok_mem[tok_ptr]:=#; incr(tok_ptr);
  2382    end
  2383  
  2384  @p function copy_comment(@!bal:eight_bits):eight_bits; {copies \TeX\ code in
  2385    comments}
  2386  label done;
  2387  var c:ASCII_code; {current character being copied}
  2388  begin loop begin if loc>limit then
  2389      begin get_line;
  2390      if input_has_ended then
  2391        begin err_print('! Input ended in mid-comment');
  2392  @.Input ended in mid-comment@>
  2393        loc:=1; @<Clear |bal| and |goto done|@>;
  2394        end;
  2395      end;
  2396    c:=buffer[loc]; incr(loc);
  2397    if c="|" then goto done;
  2398    app_tok(c);
  2399    @<Copy special things when |c="@@", "\", "{", "}"|;
  2400      |goto done| at end@>;
  2401    end;
  2402  done: copy_comment:=bal;
  2403  end;
  2404  
  2405  @ @<Copy special things when |c="@@"...@>=
  2406  if c="@@" then
  2407    begin incr(loc);
  2408    if buffer[loc-1]<>"@@" then
  2409      begin err_print('! Illegal use of @@ in comment');
  2410  @.Illegal use of \AT!...@>
  2411      loc:=loc-2; decr(tok_ptr); @<Clear |bal|...@>;
  2412      end;
  2413    end
  2414  else if (c="\")and(buffer[loc]<>"@@") then
  2415    begin app_tok(buffer[loc]); incr(loc);
  2416    end
  2417  else if c="{" then incr(bal)
  2418  else if c="}" then
  2419    begin decr(bal);
  2420    if bal=0 then goto done;
  2421    end
  2422  
  2423  @ When the comment has terminated abruptly due to an error, we output
  2424  enough right braces to keep \TeX\ happy.
  2425  
  2426  @<Clear |bal|...@>=
  2427  app_tok(" "); {this is done in case the previous character was `\.\\'}
  2428  repeat app_tok("}"); decr(bal);
  2429  until bal=0;
  2430  goto done;
  2431  
  2432  @* Parsing.
  2433  The most intricate part of \.{WEAVE} is its mechanism for converting
  2434  \PASCAL-like code into \TeX\ code, and we might as well plunge into this
  2435  aspect of the program now. A ``bottom up'' approach is used to parse the
  2436  \PASCAL-like material, since \.{WEAVE} must deal with fragmentary
  2437  constructions whose overall ``part of speech'' is not known.
  2438  
  2439  At the lowest level, the input is represented as a sequence of entities
  2440  that we shall call {\it scraps}, where each scrap of information consists
  2441  of two parts, its {\it category} and its {\it translation}. The category
  2442  is essentially a syntactic class, and the translation is a token list that
  2443  represents \TeX\ code. Rules of syntax and semantics tell us how to
  2444  combine adjacent scraps into larger ones, and if we are lucky an entire
  2445  \PASCAL\ text that starts out as hundreds of small scraps will join
  2446  together into one gigantic scrap whose translation is the desired \TeX\
  2447  code. If we are unlucky, we will be left with several scraps that don't
  2448  combine; their translations will simply be output, one by one.
  2449  
  2450  The combination rules are given as context-sensitive productions that are
  2451  applied from left to right. Suppose that we are currently working on the
  2452  sequence of scraps $s_1\,s_2\ldots s_n$. We try first to find the longest
  2453  production that applies to an initial substring $s_1\,s_2\ldots\,$; but if
  2454  no such productions exist, we try to find the longest production
  2455  applicable to the next substring $s_2\,s_3\ldots\,$; and if that fails, we
  2456  try to match $s_3\,s_4\ldots\,$, etc.
  2457  
  2458  A production applies if the category codes have a given pattern. For
  2459  example, one of the productions is
  2460  $$|open|\;|math|\;|semi|\;\RA\;|open|\;|math|$$
  2461  and it means that three consecutive scraps whose respective categories are
  2462  |open|, |math|, and |semi| are con\-verted to two scraps whose categories
  2463  are |open| and |math|. This production also has an associated rule that
  2464  tells how to combine the translation parts:
  2465  $$\eqalign{O_2&=O_1\cr
  2466  M_2&=M_1\,S\,\.{\\,}\,\hbox{|opt|\thinspace\tt5}\cr}$$
  2467  This means that the |open| scrap has not changed, while the new |math| scrap
  2468  has a translation $M_2$ composed of the translation $M_1$ of the original
  2469  |math| scrap followed by the translation |S| of the |semi| scrap followed
  2470  by `\.{\\,}' followed by `|opt|' followed by `\.5'. (In the \TeX\ file,
  2471  this will specify an additional thin space after the semicolon, followed
  2472  by an optional line break with penalty 50.) Translation rules use subscripts
  2473  to distinguish between translations of scraps whose categories have the
  2474  same initial letter; these subscripts are assigned from left to right.
  2475  
  2476  $\.{WEAVE}$ also has the production rule
  2477  $$|semi|\;\RA\;|terminator|$$
  2478  (meaning that a semicolon can terminate a \PASCAL\ statement). Since
  2479  productions are applied from left to right, this rule will be activated
  2480  only if the |semi| is not preceded by scraps that match other productions;
  2481  in particular, a |semi| that is preceded by `|open| |math|' will have
  2482  disappeared because of the production above, and such semicolons do not
  2483  act as statement terminators.  This incidentally is how \.{WEAVE} is able
  2484  to treat semicolons in two distinctly different ways, the first of which
  2485  is intended for semicolons in the parameter list of a procedure
  2486  declaration.
  2487  
  2488  The translation rule corresponding to $|semi|\;\RA\;|terminator|$ is
  2489  $$T=S$$
  2490  but we shall not mention translation rules in the common case that the
  2491  translation of the new scrap on the right-hand side is simply the
  2492  concatenation of the disappearing scraps on the left-hand side.
  2493  
  2494  @ Here is a list of the category codes that scraps can have.
  2495  
  2496  @d simp=1 {the translation can be used both in horizontal mode
  2497    and in math mode of \TeX}
  2498  @d math=2 {the translation should be used only in \TeX\ math mode}
  2499  @d intro=3 {a statement is expected to follow this, after a space and
  2500    an optional break}
  2501  @d open=4 {denotes an incomplete parenthesized quantity to be used in
  2502    math mode}
  2503  @d beginning=5 {denotes an incomplete compound statement to be used in
  2504    horizontal mode}
  2505  @d close=6 {ends a parenthesis or compound statement}
  2506  @d alpha=7 {denotes the beginning of a clause}
  2507  @d omega=8 {denotes the ending of a clause and possible comment following}
  2508  @d semi=9 {denotes a semicolon and possible comment following it}
  2509  @d terminator=10 {something that ends a statement or declaration}
  2510  @d stmt=11 {denotes a statement or declaration including its terminator}
  2511  @d cond=12 {precedes an \&{if} clause that might have a matching \&{else}}
  2512  @d clause=13 {precedes a statement after which indentation ends}
  2513  @d colon=14 {denotes a colon}
  2514  @d exp=15 {stands for the E in a floating point constant}
  2515  @d proc=16 {denotes a procedure or program or function heading}
  2516  @d case_head=17 {denotes a case statement or record heading}
  2517  @d record_head=18 {denotes a record heading without indentation}
  2518  @d var_head=19 {denotes a variable declaration heading}
  2519  @d elsie=20 {\&{else}}
  2520  @d casey=21 {\&{case}}
  2521  @d mod_scrap=22 {denotes a module name}
  2522  
  2523  @p @!debug procedure print_cat(@!c:eight_bits);
  2524    {symbolic printout of a category}
  2525  begin case c of
  2526  simp: print('simp');
  2527  math: print('math');
  2528  intro: print('intro');
  2529  open: print('open');
  2530  beginning: print('beginning');
  2531  close: print('close');
  2532  alpha: print('alpha');
  2533  omega: print('omega');
  2534  semi: print('semi');
  2535  terminator: print('terminator');
  2536  stmt: print('stmt');
  2537  cond: print('cond');
  2538  clause: print('clause');
  2539  colon: print('colon');
  2540  exp: print('exp');
  2541  proc: print('proc');
  2542  case_head: print('casehead');
  2543  record_head: print('recordhead');
  2544  var_head: print('varhead');
  2545  elsie: print('elsie');
  2546  casey: print('casey');
  2547  mod_scrap: print('module');
  2548  othercases print('UNKNOWN')
  2549  endcases;
  2550  end;
  2551  gubed
  2552  
  2553  @ The token lists for translated \TeX\ output contain some special control
  2554  symbols as well as ordinary characters. These control symbols are
  2555  interpreted by \.{WEAVE} before they are written to the output file.
  2556  
  2557  \yskip\hang |break_space| denotes an optional line break or an en space;
  2558  
  2559  \yskip\hang |force| denotes a line break;
  2560  
  2561  \yskip\hang |big_force| denotes a line break with additional vertical space;
  2562  
  2563  \yskip\hang |opt| denotes an optional line break (with the continuation
  2564  line indented two ems with respect to the normal starting position)---this
  2565  code is followed by an integer |n|, and the break will occur with penalty
  2566  $10n$;
  2567  
  2568  \yskip\hang |backup| denotes a backspace of one em;
  2569  
  2570  \yskip\hang |cancel| obliterates any |break_space| or |force| or |big_force|
  2571  tokens that immediately precede or follow it and also cancels any
  2572  |backup| tokens that follow it;
  2573  
  2574  \yskip\hang |indent| causes future lines to be indented one more em;
  2575  
  2576  \yskip\hang |outdent| causes future lines to be indented one less em.
  2577  
  2578  \yskip\noindent All of these tokens are removed from the \TeX\ output that
  2579  comes from \PASCAL\ text between \pb\ signs; |break_space| and |force| and
  2580  |big_force| become single spaces in this mode. The translation of other
  2581  \PASCAL\ texts results in \TeX\ control sequences \.{\\1}, \.{\\2},
  2582  \.{\\3}, \.{\\4}, \.{\\5}, \.{\\6}, \.{\\7} corresponding respectively to
  2583  |indent|, |outdent|, |opt|, |backup|, |break_space|, |force|, and
  2584  |big_force|. However, a sequence of consecutive `\.\ ', |break_space|,
  2585  |force|, and/or |big_force| tokens is first replaced by a single token
  2586  (the maximum of the given ones).
  2587  
  2588  The tokens |math_rel|, |math_bin|, |math_op| will be translated into
  2589  \.{\\mathrel\{}, \.{\\mathbin\{}, and \.{\\mathop\{}, respectively.
  2590  Other control sequences in the \TeX\ output will be `\.{\\\\\{}$\,\ldots\,$\.\}'
  2591  surrounding identifiers, `\.{\\\&\{}$\,\ldots\,$\.\}' surrounding
  2592  reserved words, `\.{\\.\{}$\,\ldots\,$\.\}' surrounding strings,
  2593  `\.{\\C\{}$\,\ldots\,$\.\}$\,$|force|' surrounding comments, and
  2594  `\.{\\X$n$:}$\,\ldots\,$\.{\\X}' surrounding module names, where
  2595  |n| is the module number.
  2596  
  2597  @d math_bin=@'203
  2598  @d math_rel=@'204
  2599  @d math_op=@'205
  2600  @d big_cancel=@'206 {like |cancel|, also overrides spaces}
  2601  @d cancel=@'207 {overrides |backup|, |break_space|, |force|, |big_force|}
  2602  @d indent=cancel+1 {one more tab (\.{\\1})}
  2603  @d outdent=cancel+2 {one less tab (\.{\\2})}
  2604  @d opt=cancel+3 {optional break in mid-statement (\.{\\3})}
  2605  @d backup=cancel+4 {stick out one unit to the left (\.{\\4})}
  2606  @d break_space=cancel+5 {optional break between statements (\.{\\5})}
  2607  @d force=cancel+6 {forced break between statements (\.{\\6})}
  2608  @d big_force=cancel+7 {forced break with additional space (\.{\\7})}
  2609  @d end_translation=big_force+1 {special sentinel token at end of list}
  2610  
  2611  @ The raw input is converted into scraps according to the following table,
  2612  which gives category codes followed by the translations. Sometimes a single
  2613  item of input produces more than one scrap.
  2614  \def\stars {\.{**}}%
  2615  (The symbol `\stars' stands for `\.{\\\&\{{\rm identifier}\}}',
  2616  i.e., the identifier itself treated as a reserved word. In a few cases the
  2617  category is given as `|@!comment|'; this is not an actual category code, it
  2618  means that the translation will be treated as a comment, as explained
  2619  below.)
  2620  
  2621  \yskip\halign{\quad#\hfil&\quad#\hfil\cr
  2622  \.{<>}&|math:|\.{\\I}\cr
  2623  \.{<=}&|math:|\.{\\L}\cr
  2624  \.{>=}&|math:|\.{\\G}\cr
  2625  \.{:=}&|math:|\.{\\K}\cr
  2626  \.{==}&|math:|\.{\\S}\cr
  2627  \.{(*}&|math:|\.{\\B}\cr
  2628  \.{*)}&|math:|\.{\\T}\cr
  2629  \.{(.}&|open:|\.[\cr
  2630  \.{.)}&|close:|\.]\cr
  2631  \."$\,$string$\,$\."&|simp:|\.{\\.\{"{\rm$\,$modified string$\,$}"\}}\cr
  2632  \.\'$\,$string$\,$\.\'&|simp:|\.{\\.\{\\\'{\rm$\,$modified
  2633    string$\,$}\\\'\}}\cr
  2634  \.{@@=}$\,$string$\,$\.{@@>}&|simp:|\.{\\=\{{\rm$\,$modified string$\,$}\}}\cr
  2635  \#&|math:|\.{\\\#}\cr
  2636  \.\$&|math:|\.{\\\$}\cr
  2637  \.\_&|math:|\.{\\\_}\cr
  2638  \.\%&|math:|\.{\\\%}\cr
  2639  \.\^&|math:|\.{\\\^}\cr
  2640  \.(&|open:|\.(\cr
  2641  \.)&|close:|\.)\cr
  2642  \.[&|open:|\.[\cr
  2643  \.]&|close:|\.]\cr
  2644  \.*&|math:|\.{\\ast}\cr
  2645  \.,&|math:|\.,|@,opt@,|\.9\cr
  2646  \.{..}&|math:|\.{\\to}\cr
  2647  \..&|simp:|\..\cr
  2648  \.:&|colon:|\.:\cr
  2649  \.;&|semi:|\.;\cr
  2650  identifier&|simp:|\.{\\\\\{{\rm$\,$identifier$\,$}\}}\cr
  2651  \.E in constant&|exp:|\.{\\E\{}\cr
  2652  digit $d$&|simp:|$d$\cr
  2653  other character $c$&|math:|$c$\cr
  2654  \.{and}&|math:|\.{\\W}\cr
  2655  \.{array}&|alpha:|\stars\cr
  2656  \.{begin}&|beginning:|$|force|\,\stars\,|cancel|$\qquad|intro:|\cr
  2657  \.{case}&|casey:|\qquad|alpha:|$|force|\,\stars$\cr
  2658  \.{const}&|intro:|$|force|\,|backup|\,\stars$\cr
  2659  \.{div}&|math:|$|math_bin|\,\stars\,\.\}$\cr
  2660  \.{do}&|omega:|\stars\cr
  2661  \.{downto}&|math:|$|math_rel|\,\stars\,\.\}$\cr
  2662  \.{else}&|terminator:|\qquad|elsie:|$|force|\,|backup|\,\stars$\cr
  2663  \.{end}&|terminator:|\qquad|close:|$|force|\,\stars$\cr
  2664  \.{file}&|alpha:|\stars\cr
  2665  \.{for}&|alpha:|$|force|\,\stars$\cr
  2666  \.{function}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
  2667    |intro:|$|indent|\,\.{\\\ }$\cr
  2668  \.{goto}&|intro:|\stars\cr
  2669  \.{if}&|cond:|\qquad|alpha:|$|force|\,\stars$\cr
  2670  \.{in}&|math:|\.{\\in}\cr
  2671  \.{label}&|intro:|$|force|\,|backup|\,\stars$\cr
  2672  \.{mod}&|math:|$|math_bin|\,\stars\,\.\}$\cr
  2673  \.{nil}&|simp:|\stars\cr
  2674  \.{not}&|math:|\.{\\R}\cr
  2675  \.{of}&|omega:|\stars\cr
  2676  \.{or}&|math:|\.{\\V}\cr
  2677  \.{packed}&|intro:|\stars\cr
  2678  \.{procedure}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
  2679    |intro:|$|indent|\,\.{\\\ }$\cr
  2680  \.{program}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
  2681    |intro:|$|indent|\,\.{\\\ }$\cr
  2682  \.{record}&|record_head:|\stars\qquad|intro:|\cr
  2683  \.{repeat}&|beginning:|$|force|\,|indent|\,\stars\,|cancel|$\qquad|intro:|\cr
  2684  \.{set}&|alpha:|\stars\cr
  2685  \.{then}&|omega:|\stars\cr
  2686  \.{to}&|math:|$|math_rel|\,\stars\,\.\}$\cr
  2687  \.{type}&|intro:|$|force|\,|backup|\,\stars$\cr
  2688  \.{until}&|terminator:|\qquad|close:|$|force|\,|backup|\,\stars$\qquad
  2689    |clause:|\cr
  2690  \.{var}&|var_head:|$|force|\,|backup|\,\stars\,|cancel|$\qquad|intro:|\cr
  2691  \.{while}&|alpha:|$|force|\,\stars$\cr
  2692  \.{with}&|alpha:|$|force|\,\stars$\cr
  2693  \.{xclause}&|alpha:|$|force|\,\.{\\\~}$\qquad|omega:|\stars\cr
  2694  \.{@@\'}$\,$const&|simp:|\.{\\O\{}\hbox{const}\.\}\cr
  2695  \.{@@"}$\,$const&|simp:|\.{\\H\{}\hbox{const}\.\}\cr
  2696  \.{@@\$}&|simp:|\.{\\)}\cr
  2697  \.{@@\\}&|simp:|\.{\\]}\cr
  2698  \.{@@,}&|math:|\.{\\,}\cr
  2699  \.{@@t}$\,$stuff$\,$\.{@@>}&|simp:|\.{\\hbox\{{\rm$\,$stuff$\,$}\}}\cr
  2700  \.{@@<}$\,$module$\,$\.{@@>}&|mod_scrap:|\.{\\X$n$:{\rm$\,$module$\,$}\\X}\cr
  2701  \.{@@\#}&|comment:||big_force|\cr
  2702  \.{@@/}&|comment:||force|\cr
  2703  \.{@@\char'174}&|simp:|$|opt|\,\.0$\cr
  2704  \.{@@+}&|comment:|$|big_cancel|\,\.{\\\ }\,|big_cancel|$\cr
  2705  \.{@@;}&|semi:|\cr
  2706  \.{@@\&}&|math:|\.{\\J}\cr
  2707  \.{@@\{}&|math:|\.{\\B}\cr
  2708  \.{@@\}}&|math:|\.{\\T}\cr}
  2709  \yskip\noindent When a string is output, certain characters are preceded by
  2710  `\.\\' signs so that they will print properly.
  2711  
  2712  A comment in the input will be combined with the preceding
  2713  |omega| or |semi| scrap, or with the following |terminator| scrap, if
  2714  possible; otherwise it will be inserted as a separate |terminator| scrap.
  2715  An additional ``comment'' is effectively appended at the end of the
  2716  \PASCAL\ text, just before translation begins; this consists of a |cancel|
  2717  token in the case of \PASCAL\ text in \pb, otherwise it consists of a
  2718  |force| token.
  2719  
  2720  From this table it is evident that \.{WEAVE} will parse a lot of non-\PASCAL\
  2721  programs. For example, the reserved words `\.{for}' and `\.{array}' are
  2722  treated in an identical way by \.{WEAVE} from a syntactic standpoint,
  2723  and semantically they are equivalent except that a forced line break occurs
  2724  just before `\&{for}'; \PASCAL\ programmers may well be surprised at this
  2725  similarity. The idea is to keep \.{WEAVE}'s rules as simple as possible,
  2726  consistent with doing a reasonable job on syntactically correct \PASCAL\
  2727  programs. The production rules below have been formulated in the same
  2728  spirit of ``almost anything goes.''
  2729  
  2730  @ Here is a table of all the productions. The reader can best get a feel for
  2731  @^productions, table of@>
  2732  how they work by trying them out by hand on small examples; no amount of
  2733  explanation will be as effective as watching the rules in action.
  2734  Parsing can also be watched by debugging with `\.{@@2}'.
  2735  
  2736  \def\[#1]{\quad$\dleft#1\dright$}
  2737  \def\sp{\.{\ }}
  2738  \yskip
  2739  \halign to\the\hsize{\hfil\it# &
  2740    #\hfil\hskip-200pt\tabskip 0pt plus 100pt&
  2741    #\hfil\tabskip0pt\cr
  2742  &Production categories\[\hbox{translations}]&Remarks\cr
  2743  \noalign{\yskip}
  2744  1&|alpha@,math@,colon| $\RA$ |alpha@,math|&e.g., |case v:boolean of|\cr
  2745  2&|alpha@,math@,omega| $\RA$ |clause|\[C=A\,\sp\,\.\$\,M\,\.\$\,\sp\,|indent|\,
  2746  O]&e.g., |while x>0 do|\cr
  2747  3&|alpha@,omega| $\RA$ |clause|\[C=A\,\sp\,|indent|\,O]&e.g., |file of|\cr
  2748  4&|alpha@,simp| $\RA$ |alpha@,math|&convert to math mode\cr
  2749  5&|beginning@,close@,(terminator@t or @>stmt)| $\RA$ |stmt|&compound statement
  2750  ends\cr
  2751  6&|beginning@,stmt| $\RA$ |beginning|\[B_2=B_1\,|break_space|\,S]&compound
  2752  statement grows\cr
  2753  7&|case_head@,casey@,clause| $\RA$ |case_head|\[C_4=C_1\,|outdent|\,C_2\,C_3]&
  2754  variant records\cr
  2755  8&|case_head@,close@,terminator| $\RA$ |stmt|\[S=C_1\,|cancel|\,|outdent|\,
  2756  C_2\,T]&end of case statement\cr
  2757  9&|case_head@,stmt| $\RA$ |case_head|\[C_2=C_1\,|force|\,S]&case statement
  2758  grows\cr
  2759  10&|casey@,clause| $\RA$ |case_head|&beginning of case statement\cr
  2760  11&|clause@,stmt| $\RA$ |stmt|\[S_2=C\,|break_space|\,S_1\,|cancel|\,|outdent|\,
  2761  |force|]&end of controlled statement\cr
  2762  12&|cond@,clause@,stmt@,elsie| $\RA$ |clause|\[C_3=C_1\,C_2\,|break_space|\,S\,
  2763  E\,\sp\,|cancel|]&complete conditional\cr
  2764  13&|cond@,clause@,stmt| $\RA$ |stmt|\cr
  2765  &\qquad\[S_2=C_1\,C_2\,|break_space|\,S_1\,
  2766  |cancel|\,|outdent|\,|force|]&incomplete conditional\cr
  2767  14&|elsie| $\RA$ |intro|&unmatched else\cr
  2768  15&|exp@,math@,simp|* $\RA$ |math|\[M_2=E\,M_1\,S\,\.\}]&signed exponent\cr
  2769  16&|exp@,simp|* $\RA$ |math|\[M=E\,S\,\.\}]&unsigned exponent\cr
  2770  17&|intro@,stmt| $\RA$ |stmt|\[S_2=I\,\sp\,|opt|\,\.7\,|cancel|\,S_1]&labeled
  2771  statement, etc.\cr
  2772  18&|math@,close| $\RA$ |stmt@,close|\[S=\.\$\,M\,\.\$]&end of field list\cr
  2773  19&|math@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,\.\$\,M\,\.\$\,C]&compound
  2774  label\cr
  2775  20&|math@,math| $\RA$ |math|&simple concatenation\cr
  2776  21&|math@,simp| $\RA$ |math|&simple concatenation\cr
  2777  22&|math@,stmt| $\RA$ |stmt|\cr
  2778  &\qquad\[S_2=\.\$\,M\,\.\$\,|indent|\,|break_space|\,
  2779  S_1\,|cancel|\,|outdent|\,|force|]&macro or type definition\cr
  2780  23&|math@,terminator| $\RA$ |stmt|\[S=\.\$\,M\,\.\$\,T]&statement involving
  2781  math\cr
  2782  24&|mod_scrap@,(terminator@t or @>semi)| $\RA$ |stmt|\[S=M\,T\,|force|]&module
  2783  like a statement\cr
  2784  25&|mod_scrap| $\RA$ |simp|&module unlike a statement\cr
  2785  26&|open@,case_head@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,C_1\,
  2786  |cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr
  2787  27&|open@,close| $\RA$ |math|\[M=O\,\.\\\,\.,\,C]&empty set |[]|\cr
  2788  28&|open@,math@,case_head@,close| $\RA$ |math|\cr
  2789  &\qquad\[M_2=O\,M_1\,\.\$\,|cancel|\,
  2790  C_1\,|cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr
  2791  29&|open@,math@,close| $\RA$ |math|&parenthesized group\cr
  2792  30&|open@,math@,colon| $\RA$ |open@,math|&colon in parentheses\cr
  2793  31&|open@,math@,proc@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\,|cancel|\,
  2794  P\,\.\}]&|procedure| in parentheses\cr
  2795  32&|open@,math@,semi| $\RA$ |open@,math|\[M_2=M_1\,S\,\.\\\,\.,\,|opt|\,
  2796  \.5]&semicolon in parentheses\cr
  2797  33&|open@,math@,var_head@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\,
  2798  |cancel|\,V\,\.\}]&|var| in parentheses\cr
  2799  34&|open@,proc@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\,
  2800  P\,\.\}]&|procedure| in parentheses\cr
  2801  35&|open@,simp| $\RA$ |open@,math|&convert to math mode\cr
  2802  36&|open@,stmt@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,S\,|cancel|\,
  2803  \.\$\,C]&field list\cr
  2804  37&|open@,var_head@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\,V\,
  2805  \.\}]&|var| in parentheses\cr
  2806  38&|proc@,beginning@,close@,terminator| $\RA$ |stmt|\[S=P\,|cancel|\,
  2807  |outdent|\,B\,C\,T]&end of procedure declaration\cr
  2808  39&|proc@,stmt| $\RA$ |proc|\[P_2=P_1\,|break_space|\,S]&procedure declaration
  2809  grows\cr
  2810  40&|record_head@,intro@,casey| $\RA$ |casey|\[C_2=R\,I\,\sp\,|cancel|\,C_1]&
  2811  \&{record case} $\ldots$\cr
  2812  41&|record_head| $\RA$ |case_head|\[C=|indent|\,R\,|cancel|]&other \&{record}
  2813  structures\cr
  2814  42&|semi| $\RA$ |terminator|&semicolon after statement\cr
  2815  43&|simp@,close| $\RA$ |stmt@,close|&end of field list\cr
  2816  44&|simp@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,S\,C]&simple label\cr
  2817  45&|simp@,math| $\RA$ |math|&simple concatenation\cr
  2818  46&|simp@,mod_scrap| $\RA$ |mod_scrap|&in emergencies\cr
  2819  47&|simp@,simp| $\RA$ |simp|&simple concatenation\cr
  2820  48&|simp@,terminator| $\RA$ |stmt|&simple statement\cr
  2821  49&|stmt@,stmt| $\RA$ |stmt|\[S_3=S_1\,|break_space|\,S_2]&adjacent
  2822  statements\cr
  2823  50&|terminator| $\RA$ |stmt|&empty statement\cr
  2824  51&|var_head@,beginning| $\RA$ |stmt@,beginning|&end of variable
  2825  declarations\cr
  2826  52&|var_head@,math@,colon| $\RA$ |var_head@,intro|\[I=\.\$\,M\,\.\$\,C]&
  2827  variable declaration\cr
  2828  53&|var_head@,simp@,colon| $\RA$ |var_head@,intro|&variable declaration\cr
  2829  54&|var_head@,stmt| $\RA$ |var_head|\[V_2=V_1\,|break_space|\,S]&variable
  2830  declarations grow\cr}
  2831  \yskip\noindent
  2832  Translations are not specified here when they are simple concatenations
  2833  of the scraps that change. For example, the full translation of
  2834  `|open@,math@,colon| $\RA$ |open@,math|' is $O_2=O_1$, $M_2=M_1C$.
  2835  
  2836  The notation `|simp|*', in the |exp|-related productions above,
  2837  stands for a |simp| scrap that isn't followed by another |simp|.
  2838  
  2839  @* Implementing the productions.
  2840  When \PASCAL\ text is to be processed with the grammar above, we put its
  2841  initial scraps $s_1\ldots s_n$ into two arrays |cat[1..n]| and |trans[1..n]|.
  2842  The value of |cat[k]| is simply a category code from the list above; the
  2843  value of |trans[k]| is a text pointer, i.e., an index into |tok_start|.
  2844  Our production rules have the nice property that the right-hand side is never
  2845  longer than the left-hand side. Therefore it is convenient to use sequential
  2846  allocation for the current sequence of scraps. Five pointers are used to
  2847  manage the parsing:
  2848  
  2849  \yskip\hang |pp| (the parsing pointer) is such that we are trying to match
  2850  the category codes |cat[pp]@,cat[pp+1]|$\,\ldots\,$ to the left-hand sides
  2851  of productions.
  2852  
  2853  \yskip\hang |scrap_base|, |lo_ptr|, |hi_ptr|, and |scrap_ptr| are such that
  2854  the current sequence of scraps appears in positions |scrap_base| through
  2855  |lo_ptr| and |hi_ptr| through |scrap_ptr|, inclusive, in the |cat| and
  2856  |trans| arrays. Scraps located between |scrap_base| and |lo_ptr| have
  2857  been examined, while those in positions |>=hi_ptr| have not yet been
  2858  looked at by the parsing process.
  2859  
  2860  \yskip\noindent Initially |scrap_ptr| is set to the position of the final
  2861  scrap to be parsed, and it doesn't change its value. The parsing process
  2862  makes sure that |lo_ptr>=pp+3|, since productions have as many as four terms,
  2863  by moving scraps from |hi_ptr| to |lo_ptr|. If there are
  2864  fewer than |pp+3| scraps left, the positions up to |pp+3| are filled with
  2865  blanks that will not match in any productions. Parsing stops when
  2866  |pp=lo_ptr+1| and |hi_ptr=scrap_ptr+1|.
  2867  
  2868  The |trans| array elements are declared to be of type |0..10239| instead
  2869  of type |text_pointer|, because the final sorting phase of \.{WEAVE}
  2870  uses this array to contain elements of type |name_pointer|. Both
  2871  of these types are subranges of |0..10239|.
  2872  
  2873  @<Glo...@>=
  2874  @!cat:array[0..max_scraps] of eight_bits; {category codes of scraps}
  2875  @!trans:array[0..max_scraps] of 0..10239; {translation texts of scraps}
  2876  @!pp:0..max_scraps; {current position for reducing productions}
  2877  @!scrap_base:0..max_scraps; {beginning of the current scrap sequence}
  2878  @!scrap_ptr:0..max_scraps; {ending of the current scrap sequence}
  2879  @!lo_ptr:0..max_scraps; {last scrap that has been examined}
  2880  @!hi_ptr:0..max_scraps; {first scrap that has not been examined}
  2881  stat@!max_scr_ptr:0..max_scraps; {largest value assumed by |scrap_ptr|}
  2882  tats
  2883  
  2884  @ @<Set init...@>=
  2885  scrap_base:=1; scrap_ptr:=0;
  2886  stat max_scr_ptr:=0; @+tats
  2887  
  2888  @ Token lists in |@!tok_mem| are composed of the following kinds of
  2889  items for \TeX\ output.
  2890  
  2891  \yskip\item{$\bullet$}ASCII codes and special codes like |force| and
  2892  |math_rel| represent themselves;
  2893  
  2894  \item{$\bullet$}|id_flag+p| represents \.{\\\\\{{\rm identifier $p$}\}};
  2895  
  2896  \item{$\bullet$}|res_flag+p| represents \.{\\\&\{{\rm identifier $p$}\}};
  2897  
  2898  \item{$\bullet$}|mod_flag+p| represents module name |p|;
  2899  
  2900  \item{$\bullet$}|tok_flag+p| represents token list number |p|;
  2901  
  2902  \item{$\bullet$}|inner_tok_flag+p| represents token list number |p|, to be
  2903  translated without line-break controls.
  2904  
  2905  @d id_flag=10240 {signifies an identifier}
  2906  @d res_flag=id_flag+id_flag {signifies a reserved word}
  2907  @d mod_flag=res_flag+id_flag {signifies a module name}
  2908  @d tok_flag==mod_flag+id_flag {signifies a token list}
  2909  @d inner_tok_flag==tok_flag+id_flag {signifies a token list in `\pb'}
  2910  @#
  2911  @d lbrace==xchr["{"] {this avoids possible \PASCAL\ compiler confusion}
  2912  @d rbrace==xchr["}"] {because these braces might occur within comments}
  2913  
  2914  @p @!debug procedure print_text(@!p:text_pointer); {prints a token list}
  2915  var j:0..max_toks; {index into |tok_mem|}
  2916  @!r:0..id_flag-1; {remainder of token after the flag has been stripped off}
  2917  begin if p>=text_ptr then print('BAD')
  2918  else for j:=tok_start[p] to tok_start[p+1]-1 do
  2919    begin r:=tok_mem[j] mod id_flag;
  2920    case tok_mem[j] div id_flag of
  2921    1: begin print('\\',lbrace); print_id(r); print(rbrace);
  2922      end; {|id_flag|}
  2923    2: begin print('\&',lbrace); print_id(r); print(rbrace);
  2924      end; {|res_flag|}
  2925    3: begin print('<'); print_id(r); print('>');
  2926      end; {|mod_flag|}
  2927    4: print('[[',r:1,']]'); {|tok_flag|}
  2928    5: print('|[[',r:1,']]|'); {|inner_tok_flag|}
  2929    othercases @<Print token |r| in symbolic form@>
  2930    endcases;
  2931    end;
  2932  end;
  2933  gubed
  2934  
  2935  @ @<Print token |r|...@>=
  2936  case r of
  2937  math_bin: print('\mathbin',lbrace);
  2938  math_rel: print('\mathrel',lbrace);
  2939  math_op: print('\mathop',lbrace);
  2940  big_cancel: print('[ccancel]');
  2941  cancel: print('[cancel]');
  2942  indent: print('[indent]');
  2943  outdent: print('[outdent]');
  2944  backup: print('[backup]');
  2945  opt: print('[opt]');
  2946  break_space: print('[break]');
  2947  force: print('[force]');
  2948  big_force: print('[fforce]');
  2949  end_translation: print('[quit]');
  2950  othercases print(xchr[r])
  2951  endcases
  2952  
  2953  @ The production rules listed above are embedded directly into the \.{WEAVE}
  2954  program, since it is easier to do this than to write an interpretive system
  2955  that would handle production systems in general. Several macros are defined
  2956  here so that the program for each production is fairly short.
  2957  
  2958  All of our productions conform to the general notion that some |k|
  2959  consecutive scraps starting at some position |j| are to be replaced by a
  2960  single scrap of some category |c| whose translation is composed from the
  2961  translations of the disappearing scraps. After this production has been
  2962  applied, the production pointer |pp| should change by an amount |d|. Such
  2963  a production can be represented by the quadruple $(j,k,c,d)$. For example,
  2964  the production `|simp@,math| $\RA$ |math|' would be represented by
  2965  `$(|pp|,2,|math|,-1)$'; in this case the pointer $pp$ should decrease by 1
  2966  after the production has been applied, because some productions with
  2967  |math| in their second positions might now match, but no productions have
  2968  |math| in the third or fourth position of their left-hand sides. Note that
  2969  the value of |d| is determined by the whole collection of productions, not
  2970  by an individual one. Consider the further example
  2971  `|var_head@,math@,colon| $\RA$ |var_head@,intro|', which is represented by
  2972  `$(|pp|+1,2,|intro|,+1)$'; the $+1$ here is deduced by looking at the
  2973  grammar and seeing that no matches could possibly occur at positions |<=pp|
  2974  after this production has been applied. The determination of |d| has been
  2975  done by hand in each case, based on the full set of productions but not on
  2976  the grammar of \PASCAL\ or on the rules for constructing the initial
  2977  scraps.
  2978  
  2979  We also attach a serial number to each production, so that additional
  2980  information is available when debugging. For example, the program below
  2981  contains the statement `|reduce(pp+1,2,intro,+1)(52)|' when it implements
  2982  the production just mentioned.
  2983  
  2984  Before calling |reduce|, the program should have appended the tokens of
  2985  the new translation to the |tok_mem| array. We commonly want to append
  2986  copies of several existing translations, and macros are defined to
  2987  simplify these common cases. For example, |app2(pp)| will append the
  2988  translations of two consecutive scraps, |trans[pp]| and |trans[pp+1]|, to
  2989  the current token list. If the entire new translation is formed in this
  2990  way, we write `$|squash|(j,k,c,d)$' instead of `$|reduce|(j,k,c,d)$'. For
  2991  example, `|squash(pp,2,math,-1)|' is an abbreviation for `|app2(pp);
  2992  reduce(pp,2,math,-1)|'.
  2993  
  2994  The code below is an exact translation of the production rules into
  2995  \PASCAL, using such macros, and the reader should have no difficulty
  2996  understanding the format by comparing the code with the symbolic
  2997  productions as they were listed earlier.
  2998  
  2999  {\sl Caution:\/} The macros |app|, |app1|, |app2|, and |app3| are
  3000  sequences of statements that are not enclosed with |begin| and $|end|$,
  3001  because such delimiters would make the \PASCAL\ program much longer. This
  3002  means that it is necessary to write |begin| and |end| explicitly when such
  3003  a macro is used as a single statement. Several mysterious bugs in the
  3004  original programming of \.{WEAVE} were caused by a failure to remember
  3005  this fact.  Next time the author will know better.
  3006  
  3007  @d production(#)==@!debug prod(#) gubed; goto found
  3008  @d reduce(#)==red(#); production
  3009  @d production_end(#)==@!debug prod(#) gubed; goto found;
  3010    end
  3011  @d squash(#)==begin sq(#); production_end
  3012  @d app(#)==tok_mem[tok_ptr]:=#; incr(tok_ptr) {this is like |app_tok|,
  3013    but it doesn't test for overflow}
  3014  @d app1(#)==tok_mem[tok_ptr]:=tok_flag+trans[#]; incr(tok_ptr)
  3015  @d app2(#)==app1(#);app1(#+1)
  3016  @d app3(#)==app2(#);app1(#+2)
  3017  
  3018  @ Let us consider the big case statement for productions now, before looking
  3019  at its context. We want to design the program so that this case statement
  3020  works, so we might as well not keep ourselves in suspense about exactly what
  3021  code needs to be provided with a proper environment.
  3022  
  3023  The code here is more complicated than it need be, since some popular
  3024  \PASCAL\ compilers are unable to deal with procedures that contain a lot
  3025  of program text. The |translate| procedure, which incorporates the |case|
  3026  statement here, would become too long for those compilers if we did
  3027  not do something to split the cases into parts. Therefore
  3028  a separate procedure called |five_cases| has been introduced.
  3029  @^split procedures@>
  3030  This auxiliary procedure contains approximately half of the program text
  3031  that |translate| would otherwise have had. There's also a procedure
  3032  called |alpha_cases|, which turned out to be necessary because the best
  3033  two-way split wasn't good enough. The procedure could be split further
  3034  in an analogous manner, but the present scheme works on all compilers
  3035  known to the author.
  3036  
  3037  @<Match a production at |pp|, or increase |pp| if there is no match@>=
  3038  if cat[pp]<=alpha then
  3039    if cat[pp]<alpha then five_cases@+else alpha_cases
  3040  else  begin case cat[pp] of
  3041    case_head: @<Cases for |case_head|@>;
  3042    casey: @<Cases for |casey|@>;
  3043    clause: @<Cases for |clause|@>;
  3044    cond: @<Cases for |cond|@>;
  3045    elsie: @<Cases for |elsie|@>;
  3046    exp: @<Cases for |exp|@>;
  3047    mod_scrap: @<Cases for |mod_scrap|@>;
  3048    proc: @<Cases for |proc|@>;
  3049    record_head: @<Cases for |record_head|@>;
  3050    semi: @<Cases for |semi|@>;
  3051    stmt: @<Cases for |stmt|@>;
  3052    terminator: @<Cases for |terminator|@>;
  3053    var_head: @<Cases for |var_head|@>;
  3054    othercases do_nothing
  3055    endcases;@/
  3056    incr(pp); {if no match was found, we move to the right}
  3057    found: end
  3058  
  3059  @ Here are the procedures that need to be present for the reason just
  3060  explained.
  3061  
  3062  @<Declaration of subprocedures for |translate|@>=
  3063  procedure five_cases; {handles almost half of the syntax}
  3064  label found;
  3065  begin case cat[pp] of
  3066  beginning: @<Cases for |beginning|@>;
  3067  intro: @<Cases for |intro|@>;
  3068  math: @<Cases for |math|@>;
  3069  open: @<Cases for |open|@>;
  3070  simp: @<Cases for |simp|@>;
  3071  othercases do_nothing
  3072  endcases;@/
  3073  incr(pp); {if no match was found, we move to the right}
  3074  found: end;
  3075  @#
  3076  procedure alpha_cases;
  3077  label found;
  3078  begin @<Cases for |alpha|@>;
  3079  incr(pp); {if no match was found, we move to the right}
  3080  found: end;
  3081  
  3082  @ Now comes the code that tries to match each production starting
  3083  with a particular type of scrap. Whenever a match is discovered,
  3084  the |squash| or |reduce| macro will cause the appropriate action
  3085  to be performed, followed by |goto found|.
  3086  
  3087  @<Cases for |alpha|@>=
  3088  if cat[pp+1]=math then
  3089    begin if cat[pp+2]=colon then squash(pp+1,2,math,0)(1)
  3090    else if cat[pp+2]=omega then
  3091      begin app1(pp); app(" "); app("$"); app1(pp+1);
  3092      app("$"); app(" "); app(indent); app1(pp+2);
  3093      reduce(pp,3,clause,-2)(2);
  3094      end;
  3095    end
  3096  else if cat[pp+1]=omega then
  3097    begin app1(pp); app(" "); app(indent); app1(pp+1);
  3098    reduce(pp,2,clause,-2)(3);
  3099    end
  3100  else if cat[pp+1]=simp then squash(pp+1,1,math,0)(4)
  3101  
  3102  @ @<Cases for |beginning|@>=
  3103  if cat[pp+1]=close then
  3104    begin if (cat[pp+2]=terminator)or(cat[pp+2]=stmt) then
  3105      squash(pp,3,stmt,-2)(5);
  3106    end
  3107  else if cat[pp+1]=stmt then
  3108    begin app1(pp); app(break_space); app1(pp+1);
  3109    reduce(pp,2,beginning,-1)(6);
  3110    end
  3111  
  3112  @ @<Cases for |case_head|@>=
  3113  if cat[pp+1]=casey then
  3114    begin if cat[pp+2]=clause then
  3115      begin app1(pp); app(outdent); app2(pp+1);
  3116      reduce(pp,3,case_head,0)(7);
  3117      end;
  3118    end
  3119  else if cat[pp+1]=close then
  3120    begin if cat[pp+2]=terminator then
  3121      begin app1(pp); app(cancel); app(outdent); app2(pp+1);
  3122      reduce(pp,3,stmt,-2)(8);
  3123      end;
  3124    end
  3125  else if cat[pp+1]=stmt then
  3126    begin app1(pp); app(force); app1(pp+1);
  3127    reduce(pp,2,case_head,0)(9);
  3128    end
  3129  
  3130  @ @<Cases for |casey|@>=
  3131  if cat[pp+1]=clause then squash(pp,2,case_head,0)(10)
  3132  
  3133  @ @<Cases for |clause|@>=
  3134  if cat[pp+1]=stmt then
  3135    begin app1(pp); app(break_space); app1(pp+1);
  3136    app(cancel); app(outdent);
  3137    app(force); reduce(pp,2,stmt,-2)(11);
  3138    end
  3139  
  3140  @ @<Cases for |cond|@>=
  3141  if (cat[pp+1]=clause)and(cat[pp+2]=stmt) then
  3142    if cat[pp+3]=elsie then
  3143      begin app2(pp); app(break_space); app2(pp+2); app(" ");
  3144      app(cancel); reduce(pp,4,clause,-2)(12);
  3145      end
  3146    else  begin app2(pp); app(break_space); app1(pp+2); app(cancel);
  3147      app(outdent); app(force); reduce(pp,3,stmt,-2)(13);
  3148      end
  3149  
  3150  @ @<Cases for |elsie|@>=
  3151  squash(pp,1,intro,-3)(14)
  3152  
  3153  @ @<Cases for |exp|@>=
  3154  if cat[pp+1]=math then
  3155    begin if cat[pp+2]=simp then if cat[pp+3]<>simp then
  3156      begin app3(pp); app("}"); reduce(pp,3,math,-1)(15);
  3157      end;
  3158    end
  3159  else if cat[pp+1]=simp then if cat[pp+2]<>simp then
  3160    begin app2(pp); app("}"); reduce(pp,2,math,-1)(16);
  3161    end
  3162  
  3163  @ @<Cases for |intro|@>=
  3164  if cat[pp+1]=stmt then
  3165    begin app1(pp); app(" "); app(opt); app("7");
  3166    app(cancel); app1(pp+1); reduce(pp,2,stmt,-2)(17);
  3167    end
  3168  
  3169  @ @<Cases for |math|@>=
  3170  if cat[pp+1]=close then
  3171    begin app("$"); app1(pp); app("$"); reduce(pp,1,stmt,-2)(18);
  3172    end
  3173  else if cat[pp+1]=colon then
  3174    begin app(force); app(backup); app("$"); app1(pp);
  3175    app("$"); app1(pp+1); reduce(pp,2,intro,-3)(19);
  3176    end
  3177  else if cat[pp+1]=math then squash(pp,2,math,-1)(20)
  3178  else if cat[pp+1]=simp then squash(pp,2,math,-1)(21)
  3179  else if cat[pp+1]=stmt then
  3180    begin app("$"); app1(pp); app("$"); app(indent);
  3181    app(break_space); app1(pp+1); app(cancel); app(outdent);
  3182    app(force); reduce(pp,2,stmt,-2)(22);
  3183    end
  3184  else if cat[pp+1]=terminator then
  3185    begin app("$"); app1(pp); app("$"); app1(pp+1);
  3186    reduce(pp,2,stmt,-2)(23);
  3187    end
  3188  
  3189  @ @<Cases for |mod_scrap|@>=
  3190  if (cat[pp+1]=terminator)or(cat[pp+1]=semi) then
  3191    begin app2(pp); app(force); reduce(pp,2,stmt,-2)(24);
  3192    end
  3193  else squash(pp,1,simp,-2)(25)
  3194  
  3195  @ @<Cases for |open|@>=
  3196  if (cat[pp+1]=case_head)and(cat[pp+2]=close) then
  3197    begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel);
  3198    app(outdent); app("$"); app1(pp+2); reduce(pp,3,math,-1)(26);
  3199    end
  3200  else if cat[pp+1]=close then
  3201    begin app1(pp); app("\"); app(","); app1(pp+1);
  3202  @.\\,@>
  3203    reduce(pp,2,math,-1)(27);
  3204    end
  3205  else if cat[pp+1]=math then @<Cases for |open@,math|@>
  3206  else if cat[pp+1]=proc then
  3207    begin if cat[pp+2]=intro then
  3208      begin app(math_op); app(cancel); app1(pp+1); app("}");
  3209      reduce(pp+1,2,math,0)(34);
  3210      end;
  3211    end
  3212  else if cat[pp+1]=simp then squash(pp+1,1,math,0)(35)
  3213  else if (cat[pp+1]=stmt)and(cat[pp+2]=close) then
  3214    begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel);
  3215    app("$"); app1(pp+2); reduce(pp,3,math,-1)(36);
  3216    end
  3217  else if cat[pp+1]=var_head then
  3218    begin if cat[pp+2]=intro then
  3219      begin app(math_op); app(cancel); app1(pp+1); app("}");
  3220      reduce(pp+1,2,math,0)(37);
  3221      end;
  3222    end
  3223  
  3224  @ @<Cases for |open@,math|@>=
  3225  begin if (cat[pp+2]=case_head)and(cat[pp+3]=close) then
  3226    begin app2(pp); app("$"); app(cancel); app1(pp+2); app(cancel);
  3227    app(outdent); app("$"); app1(pp+3); reduce(pp,4,math,-1)(28);
  3228    end
  3229  else if cat[pp+2]=close then squash(pp,3,math,-1)(29)
  3230  else if cat[pp+2]=colon then squash(pp+1,2,math,0)(30)
  3231  else if cat[pp+2]=proc then
  3232    begin if cat[pp+3]=intro then
  3233      begin app1(pp+1); app(math_op); app(cancel);
  3234      app1(pp+2); app("}"); reduce(pp+1,3,math,0)(31);
  3235      end;
  3236    end
  3237  else if cat[pp+2]=semi then
  3238    begin app2(pp+1); app("\"); app(","); app(opt); app("5");
  3239  @.\\,@>
  3240    reduce(pp+1,2,math,0)(32);
  3241    end
  3242  else if cat[pp+2]=var_head then
  3243    begin if cat[pp+3]=intro then
  3244      begin app1(pp+1); app(math_op); app(cancel);
  3245      app1(pp+2); app("}"); reduce(pp+1,3,math,0)(33);
  3246      end;
  3247    end;
  3248  end
  3249  
  3250  @ @<Cases for |proc|@>=
  3251  if cat[pp+1]=beginning then
  3252    begin if (cat[pp+2]=close)and(cat[pp+3]=terminator) then
  3253      begin app1(pp); app(cancel); app(outdent); app3(pp+1);
  3254      reduce(pp,4,stmt,-2)(38);
  3255      end;
  3256    end
  3257  else if cat[pp+1]=stmt then
  3258    begin app1(pp); app(break_space); app1(pp+1);
  3259    reduce(pp,2,proc,-2)(39);
  3260    end
  3261  
  3262  @ @<Cases for |record_head|@>=
  3263  if (cat[pp+1]=intro)and(cat[pp+2]=casey) then
  3264    begin app2(pp); app(" "); app(cancel); app1(pp+2);
  3265    reduce(pp,3,casey,-2)(40);
  3266    end
  3267  else  begin app(indent); app1(pp); app(cancel);
  3268    reduce(pp,1,case_head,0)(41);
  3269    end
  3270  
  3271  @ @<Cases for |semi|@>=
  3272  squash(pp,1,terminator,-3)(42)
  3273  
  3274  @ @<Cases for |simp|@>=
  3275  if cat[pp+1]=close then squash(pp,1,stmt,-2)(43)
  3276  else if cat[pp+1]=colon then
  3277    begin app(force); app(backup); app2(pp); reduce(pp,2,intro,-3)(44);
  3278    end
  3279  else if cat[pp+1]=math then squash(pp,2,math,-1)(45)
  3280  else if cat[pp+1]=mod_scrap then squash(pp,2,mod_scrap,0)(46)
  3281  else if cat[pp+1]=simp then squash(pp,2,simp,-2)(47)
  3282  else if cat[pp+1]=terminator then squash(pp,2,stmt,-2)(48)
  3283  
  3284  @ @<Cases for |stmt|@>=
  3285  if cat[pp+1]=stmt then
  3286    begin app1(pp); app(break_space); app1(pp+1);
  3287    reduce(pp,2,stmt,-2)(49);
  3288    end
  3289  
  3290  @ @<Cases for |terminator|@>=
  3291  squash(pp,1,stmt,-2)(50)
  3292  
  3293  @ @<Cases for |var_head|@>=
  3294  if cat[pp+1]=beginning then squash(pp,1,stmt,-2)(51)
  3295  else if cat[pp+1]=math then
  3296    begin if cat[pp+2]=colon then
  3297      begin app("$"); app1(pp+1); app("$"); app1(pp+2);
  3298      reduce(pp+1,2,intro,+1)(52);
  3299      end;
  3300    end
  3301  else if cat[pp+1]=simp then
  3302    begin if cat[pp+2]=colon then squash(pp+1,2,intro,+1)(53);
  3303    end
  3304  else if cat[pp+1]=stmt then
  3305    begin app1(pp); app(break_space); app1(pp+1);
  3306    reduce(pp,2,var_head,-2)(54);
  3307    end
  3308  
  3309  @ The `|freeze_text|' macro is used to give official status to a token list.
  3310  Before saying |freeze_text|, items are appended to the current token list,
  3311  and we know that the eventual number of this token list will be the current
  3312  value of |text_ptr|. But no list of that number really exists as yet,
  3313  because no ending point for the current list has been
  3314  stored in the |tok_start| array. After saying |freeze_text|, the
  3315  old current token list becomes legitimate, and its number is the current
  3316  value of |text_ptr-1| since |text_ptr| has been increased. The new
  3317  current token list is empty and ready to be appended to.
  3318  Note that |freeze_text| does not check to see that |text_ptr| hasn't gotten
  3319  too large, since it is assumed that this test was done beforehand.
  3320  
  3321  @d freeze_text==incr(text_ptr); tok_start[text_ptr]:=tok_ptr
  3322  
  3323  @ The `|reduce|' macro used in our code for productions actually calls on
  3324  a procedure named `|red|', which makes the appropriate changes to the
  3325  scrap list.
  3326  
  3327  @p procedure red(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits;
  3328    @!d:integer);
  3329  var i:0..max_scraps; {index into scrap memory}
  3330  begin cat[j]:=c; trans[j]:=text_ptr; freeze_text;
  3331  if k>1 then
  3332    begin for i:=j+k to lo_ptr do
  3333      begin cat[i-k+1]:=cat[i]; trans[i-k+1]:=trans[i];
  3334      end;
  3335    lo_ptr:=lo_ptr-k+1;
  3336    end;
  3337  @<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>;
  3338  end;
  3339  
  3340  @ @<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>=
  3341  if pp+d>=scrap_base then pp:=pp+d
  3342  else pp:=scrap_base
  3343  
  3344  @ Similarly, the `|squash|' macro invokes a procedure called `|sq|'. This
  3345  procedure takes advantage of the simplification that occurs when |k=1|.
  3346  
  3347  @p procedure sq(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits;
  3348    @!d:integer);
  3349  var i:0..max_scraps; {index into scrap memory}
  3350  begin if k=1 then
  3351    begin cat[j]:=c; @<Change |pp|...@>;
  3352    end
  3353  else  begin for i:=j to j+k-1 do
  3354      begin app1(i);
  3355      end;
  3356    red(j,k,c,d);
  3357    end;
  3358  end;
  3359  
  3360  @ Here now is the code that applies productions as long as possible. It
  3361  requires two local labels (|found| and |done|), as well as a local
  3362  variable (|i|).
  3363  
  3364  @<Reduce the scraps using the productions until no more rules apply@>=
  3365  loop@+begin @<Make sure the entries |cat[pp..(pp+3)]| are defined@>;
  3366    if (tok_ptr+8>max_toks)or(text_ptr+4>max_texts) then
  3367      begin stat if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
  3368      if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
  3369      tats@;@/
  3370      overflow('token/text');
  3371      end;
  3372    if pp>lo_ptr then goto done;
  3373    @<Match a production...@>;
  3374    end;
  3375  done:
  3376  
  3377  @ If we get to the end of the scrap list, category codes equal to zero are
  3378  stored, since zero does not match anything in a production.
  3379  
  3380  @<Make sure the entries...@>=
  3381  if lo_ptr<pp+3 then
  3382    begin repeat if hi_ptr<=scrap_ptr then
  3383      begin incr(lo_ptr);@/
  3384      cat[lo_ptr]:=cat[hi_ptr]; trans[lo_ptr]:=trans[hi_ptr];@/
  3385      incr(hi_ptr);
  3386      end;
  3387    until (hi_ptr>scrap_ptr)or(lo_ptr=pp+3);
  3388    for i:=lo_ptr+1 to pp+3 do cat[i]:=0;
  3389    end
  3390  
  3391  @ If \.{WEAVE} is being run in debugging mode, the production numbers and
  3392  current stack categories will be printed out when |tracing| is set to 2;
  3393  a sequence of two or more irreducible scraps will be printed out when
  3394  |tracing| is set to 1.
  3395  @.\AT!2@>
  3396  @.\AT!1@>
  3397  
  3398  @<Glo...@>=
  3399  @!debug@!tracing:0..2; {can be used to show parsing details}
  3400  gubed
  3401  
  3402  @ The |prod| procedure is called in debugging mode just after |reduce| or
  3403  |squash|; its parameter is the number of the production that has just
  3404  been applied.
  3405  
  3406  @p @!debug procedure prod(@!n:eight_bits); {shows current categories}
  3407  var k:1..max_scraps; {index into |cat|}
  3408  begin if tracing=2 then
  3409    begin print_nl(n:1,':');
  3410    for k:=scrap_base to lo_ptr do
  3411      begin if k=pp then print('*') @+ else print(' ');
  3412      print_cat(cat[k]);
  3413      end;
  3414    if hi_ptr<=scrap_ptr then print('...'); {indicate that more is coming}
  3415    end;
  3416  end;
  3417  gubed
  3418  
  3419  @ The |translate| function assumes that scraps have been stored in
  3420  positions |scrap_base| through |scrap_ptr| of |cat| and |trans|. It
  3421  appends a |terminator| scrap and begins to apply productions as much as
  3422  possible. The result is a token list containing the translation of
  3423  the given sequence of scraps.
  3424  
  3425  After calling |translate|, we will have |text_ptr+3<=max_texts| and
  3426  |tok_ptr+6<=max_toks|, so it will be possible to create up to three token
  3427  lists with up to six tokens without checking for overflow. Before calling
  3428  |translate|, we should have |text_ptr<max_texts| and |scrap_ptr<max_scraps|,
  3429  since |translate| might add a new text and a new scrap before it checks
  3430  for overflow.
  3431  
  3432  @p @<Declaration of subprocedures for |translate|@>@;
  3433  function translate:text_pointer; {converts a sequence of scraps}
  3434  label done,found;
  3435  var i: 1..max_scraps; {index into |cat|}
  3436  @!j:0..max_scraps; {runs through final scraps}
  3437  @!debug@!k:0..long_buf_size; {index into |buffer|}
  3438  gubed
  3439  begin pp:=scrap_base; lo_ptr:=pp-1; hi_ptr:=pp;
  3440  @<If tracing, print an indication of where we are@>;
  3441  @<Reduce the scraps...@>;
  3442  if (lo_ptr=scrap_base)and(cat[lo_ptr]<>math) then translate:=trans[lo_ptr]
  3443  else @<Combine the irreducible scraps that remain@>;
  3444  end;
  3445  
  3446  @ If the initial sequence of scraps does not reduce to a single scrap,
  3447  we concatenate the translations of all remaining scraps, separated by
  3448  blank spaces, with dollar signs surrounding the translations of |math|
  3449  scraps.
  3450  
  3451  @<Combine the irreducible...@>=
  3452  begin @<If semi-tracing, show the irreducible scraps@>;
  3453  for j:=scrap_base to lo_ptr do
  3454    begin if j<>scrap_base then
  3455      begin app(" ");
  3456      end;
  3457    if cat[j]=math then
  3458      begin app("$");
  3459      end;
  3460    app1(j);
  3461    if cat[j]=math then
  3462      begin app("$");
  3463      end;
  3464    if tok_ptr+6>max_toks then overflow('token');
  3465    end;
  3466  freeze_text; translate:=text_ptr-1;
  3467  end
  3468  
  3469  @ @<If semi-tracing, show the irreducible scraps@>=
  3470  @!debug if (lo_ptr>scrap_base)and(tracing=1) then
  3471    begin print_nl('Irreducible scrap sequence in section ',module_count:1);
  3472    print_ln(':'); mark_harmless;
  3473    for j:=scrap_base to lo_ptr do
  3474      begin print(' '); print_cat(cat[j]);
  3475      end;
  3476    end;
  3477  gubed
  3478  
  3479  @ @<If tracing,...@>=
  3480  @!debug if tracing=2 then
  3481    begin print_nl('Tracing after l.',line:1,':'); mark_harmless;
  3482    if loc>50 then
  3483      begin print('...');
  3484      for k:=loc-50 to loc do print(xchr[buffer[k-1]]);
  3485      end
  3486    else for k:=1 to loc do print(xchr[buffer[k-1]]);
  3487    end
  3488  gubed
  3489  
  3490  @* Initializing the scraps.
  3491  If we are going to use the powerful production mechanism just developed, we
  3492  must get the scraps set up in the first place, given a \PASCAL\ text. A table
  3493  of the initial scraps corresponding to \PASCAL\ tokens appeared above in the
  3494  section on parsing; our goal now is to implement that table. We shall do this
  3495  by implementing a subroutine called |Pascal_parse| that is analogous to the
  3496  |Pascal_xref| routine used during phase one.
  3497  
  3498  Like |Pascal_xref|, the |Pascal_parse| procedure starts with the current
  3499  value of |next_control| and it uses the operation |next_control:=get_next|
  3500  repeatedly to read \PASCAL\ text until encountering the next `\v' or
  3501  `\.\{', or until |next_control>=format|. The scraps corresponding to what
  3502  it reads are appended into the |cat| and |trans| arrays, and |scrap_ptr|
  3503  is advanced.
  3504  
  3505  Like |prod|, this procedure has to split into pieces so that each
  3506  part is short enough to be handled by \PASCAL\ compilers that discriminate
  3507  against long subroutines. This time there are two split-off routines,
  3508  called |easy_cases| and |sub_cases|.
  3509  @^split procedures@>
  3510  
  3511  After studying |Pascal_parse|, we will look at the sub-procedures
  3512  |app_comment|, |app_octal|, and |app_hex| that are used in some of its
  3513  branches.
  3514  
  3515  @p @<Declaration of the |app_comment| procedure@>@;
  3516  @<Declaration of the |app_octal| and |app_hex| procedures@>@;
  3517  @<Declaration of the |easy_cases| procedure@>@;
  3518  @<Declaration of the |sub_cases| procedure@>@;
  3519  procedure Pascal_parse; {creates scraps from \PASCAL\ tokens}
  3520  label reswitch, exit;
  3521  var j:0..long_buf_size; {index into |buffer|}
  3522  @!p:name_pointer; {identifier designator}
  3523  begin while next_control<format do
  3524    begin @<Append the scrap appropriate to |next_control|@>;
  3525    next_control:=get_next;
  3526    if (next_control="|")or(next_control="{") then return;
  3527    end;
  3528  exit:end;
  3529  
  3530  @ The macros defined here are helpful abbreviations for the operations
  3531  needed when generating the scraps. A scrap of category |c| whose
  3532  translation has three tokens $t_1$, $t_2$, $t_3$ is generated by
  3533  |sc3|$(t_1)(t_2)(t_3)(c)$, etc.
  3534  
  3535  @d s0(#)==incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=text_ptr;
  3536    freeze_text;
  3537    end
  3538  @d s1(#)==app(#);s0
  3539  @d s2(#)==app(#);s1
  3540  @d s3(#)==app(#);s2
  3541  @d s4(#)==app(#);s3
  3542  @d sc4==@+begin s4
  3543  @d sc3==@+begin s3
  3544  @d sc2==@+begin s2
  3545  @d sc1==@+begin s1
  3546  @d sc0(#)==begin incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=0;
  3547    end
  3548  @d comment_scrap(#)==begin app(#); app_comment;
  3549    end
  3550  
  3551  @ @<Append the scr...@>=
  3552  @<Make sure that there is room for at least four more scraps, six more
  3553  tokens, and four more texts@>;
  3554  reswitch: case next_control of
  3555  string,verbatim: @<Append a \(string scrap@>;
  3556  identifier: @<Append an identifier scrap@>;
  3557  TeX_string: @<Append a \TeX\ string scrap@>;
  3558  othercases easy_cases
  3559  endcases
  3560  
  3561  @ The |easy_cases| each result in straightforward scraps.
  3562  
  3563  @<Declaration of the |easy_cases| procedure@>=
  3564  procedure easy_cases; {a subprocedure of |Pascal_parse|}
  3565  begin case next_control of
  3566  set_element_sign: sc3("\")("i")("n")(math);
  3567  @.\\in@>
  3568  double_dot: sc3("\")("t")("o")(math);
  3569  @.\\to@>
  3570  "#","$","%","^","_": sc2("\")(next_control)(math);
  3571  @.\\\#@>
  3572  @.\\\$@>
  3573  @.\\\%@>
  3574  @.\\\^@>
  3575  ignore,"|",xref_roman,xref_wildcard,xref_typewriter: do_nothing;
  3576  "(","[": sc1(next_control)(open);
  3577  ")","]": sc1(next_control)(close);
  3578  "*": sc4("\")("a")("s")("t")(math);
  3579  @.\\ast@>
  3580  ",": sc3(",")(opt)("9")(math);
  3581  ".","0","1","2","3","4","5","6","7","8","9": sc1(next_control)(simp);
  3582  ";": sc1(";")(semi);
  3583  ":": sc1(":")(colon);
  3584  @t\4@>  @<Cases involving nonstandard ASCII characters@>@;
  3585  exponent: sc3("\")("E")("{")(exp);
  3586  @.\\E@>
  3587  begin_comment: sc2("\")("B")(math);
  3588  @.\\B@>
  3589  end_comment: sc2("\")("T")(math);
  3590  @.\\T@>
  3591  octal: app_octal;
  3592  hex: app_hex;
  3593  check_sum: sc2("\")(")")(simp);
  3594  @.\\)@>
  3595  force_line: sc2("\")("]")(simp);
  3596  @.\\]@>
  3597  thin_space: sc2("\")(",")(math);
  3598  @.\\,@>
  3599  math_break: sc2(opt)("0")(simp);
  3600  line_break: comment_scrap(force);
  3601  big_line_break: comment_scrap(big_force);
  3602  no_line_break: begin app(big_cancel); app("\"); app(" ");
  3603  @.\\\ @>
  3604    comment_scrap(big_cancel);
  3605    end;
  3606  pseudo_semi: sc0(semi);
  3607  join: sc2("\")("J")(math);
  3608  @.\\J@>
  3609  othercases sc1(next_control)(math)
  3610  endcases;
  3611  end;
  3612  
  3613  @ @<Make sure that there is room for at least four...@>=
  3614  if (scrap_ptr+4>max_scraps)or(tok_ptr+6>max_toks)or(text_ptr+4>max_texts) then
  3615    begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
  3616    if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
  3617    if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
  3618    tats@;@/
  3619    overflow('scrap/token/text');
  3620    end
  3621  
  3622  @ Some nonstandard ASCII characters may have entered \.{WEAVE} by means of
  3623  standard ones. They are converted to \TeX\ control sequences so that it is
  3624  possible to keep \.{WEAVE} from stepping beyond standard ASCII.
  3625  
  3626  @<Cases involving nonstandard...@>=
  3627  not_equal: sc2("\")("I")(math);
  3628  @.\\I@>
  3629  less_or_equal: sc2("\")("L")(math);
  3630  @.\\L@>
  3631  greater_or_equal: sc2("\")("G")(math);
  3632  @.\\G@>
  3633  equivalence_sign: sc2("\")("S")(math);
  3634  @.\\S@>
  3635  and_sign: sc2("\")("W")(math);
  3636  @.\\W@>
  3637  or_sign: sc2("\")("V")(math);
  3638  @.\\V@>
  3639  not_sign: sc2("\")("R")(math);
  3640  @.\\R@>
  3641  left_arrow: sc2("\")("K")(math);
  3642  @.\\K@>
  3643  
  3644  @ The following code must use |app_tok| instead of |app| in order to
  3645  protect against overflow. Note that |tok_ptr+1<=max_toks| after |app_tok|
  3646  has been used, so another |app| is legitimate before testing again.
  3647  
  3648  Many of the special characters in a string must be prefixed by `\.\\' so that
  3649  \TeX\ will print them properly.
  3650  @^special string characters@>
  3651  
  3652  @<Append a \(string scrap@>=
  3653  begin app("\");
  3654  if next_control=verbatim then
  3655    begin app("=");
  3656  @.\\=@>
  3657    end
  3658  else  begin app(".");
  3659  @.\\.@>
  3660    end;
  3661  app("{"); j:=id_first;
  3662  while j<id_loc do
  3663    begin case buffer[j] of
  3664    " ","\","#","%","$","^","'","`","{","}","~","&","_":
  3665        begin app("\");
  3666        end;
  3667  @.\\\ @>
  3668  @.\\\\@>
  3669  @.\\\#@>
  3670  @.\\\%@>
  3671  @.\\\$@>
  3672  @.\\\^@>
  3673  @.\\\'@>
  3674  @.\\\`@>
  3675  @.\\\{@>
  3676  @.\\\}@>
  3677  @.\\\~@>
  3678  @.\\\&@>
  3679  @.\\_@>
  3680    "@@": if buffer[j+1]="@@" then incr(j)
  3681      else err_print('! Double @@ should be used in strings');
  3682  @.Double \AT! should be used...@>
  3683    othercases do_nothing
  3684    endcases;@/
  3685    app_tok(buffer[j]); incr(j);
  3686    end;
  3687  sc1("}")(simp);
  3688  end
  3689  
  3690  @ @<Append a \TeX\ string scrap@>=
  3691  begin app("\"); app("h"); app("b"); app("o"); app("x");
  3692  app("{");
  3693  for j:=id_first to id_loc-1 do app_tok(buffer[j]);
  3694  sc1("}")(simp);
  3695  end
  3696  
  3697  @ @<Append an identifier scrap@>=
  3698  begin p:=id_lookup(normal);
  3699  case ilk[p] of
  3700  normal,array_like,const_like,div_like,
  3701    do_like,for_like,goto_like,nil_like,to_like: sub_cases(p);
  3702  @t\4@>@<Cases that generate more than one scrap@>@;
  3703  othercases begin next_control:=ilk[p]-char_like; goto reswitch;
  3704    end {\&{and}, \&{in}, \&{not}, \&{or}}
  3705  endcases;
  3706  end
  3707  
  3708  @ The |sub_cases| also result in straightforward scraps.
  3709  
  3710  @<Declaration of the |sub_cases| procedure@>=
  3711  procedure sub_cases(@!p:name_pointer); {a subprocedure of |Pascal_parse|}
  3712  begin case ilk[p] of
  3713  normal: sc1(id_flag+p)(simp); {not a reserved word}
  3714  array_like: sc1(res_flag+p)(alpha); {\&{array}, \&{file}, \&{set}}
  3715  const_like: sc3(force)(backup)(res_flag+p)(intro);
  3716    {\&{const}, \&{label}, \&{type}}
  3717  div_like: sc3(math_bin)(res_flag+p)("}")(math); {\&{div}, \&{mod}}
  3718  do_like: sc1(res_flag+p)(omega); {\&{do}, \&{of}, \&{then}}
  3719  for_like: sc2(force)(res_flag+p)(alpha); {\&{for}, \&{while}, \&{with}}
  3720  goto_like: sc1(res_flag+p)(intro); {\&{goto}, \&{packed}}
  3721  nil_like: sc1(res_flag+p)(simp); {\&{nil}}
  3722  to_like: sc3(math_rel)(res_flag+p)("}")(math); {\&{downto}, \&{to}}
  3723  end;
  3724  end;
  3725  
  3726  @ @<Cases that generate more than one scrap@>=
  3727  begin_like: begin sc3(force)(res_flag+p)(cancel)(beginning); sc0(intro);
  3728    end; {\&{begin}}
  3729  case_like: begin sc0(casey); sc2(force)(res_flag+p)(alpha);
  3730    end; {\&{case}}
  3731  else_like: begin @<Append |terminator| if not already present@>;
  3732    sc3(force)(backup)(res_flag+p)(elsie);
  3733    end; {\&{else}}
  3734  end_like: begin @<Append |term...@>;
  3735    sc2(force)(res_flag+p)(close);
  3736    end; {\&{end}}
  3737  if_like: begin sc0(cond); sc2(force)(res_flag+p)(alpha);
  3738    end; {\&{if}}
  3739  loop_like: begin sc3(force)("\")("~")(alpha);
  3740  @.\\\~@>
  3741    sc1(res_flag+p)(omega);
  3742    end; {\&{xclause}}
  3743  proc_like: begin sc4(force)(backup)(res_flag+p)(cancel)(proc);
  3744    sc3(indent)("\")(" ")(intro);
  3745  @.\\\ @>
  3746    end; {\&{function}, \&{procedure}, \&{program}}
  3747  record_like: begin sc1(res_flag+p)(record_head); sc0(intro);
  3748    end; {\&{record}}
  3749  repeat_like: begin sc4(force)(indent)(res_flag+p)(cancel)(beginning);
  3750    sc0(intro);
  3751    end; {\&{repeat}}
  3752  until_like: begin @<Append |term...@>;
  3753    sc3(force)(backup)(res_flag+p)(close); sc0(clause);
  3754    end; {\&{until}}
  3755  var_like: begin sc4(force)(backup)(res_flag+p)(cancel)(var_head); sc0(intro);
  3756    end; {\&{var}}
  3757  
  3758  @ If a comment or semicolon appears before the reserved words \&{end},
  3759  \&{else}, or \&{until}, the |semi| or |terminator| scrap that is already
  3760  present overrides the |terminator| scrap belonging to this reserved word.
  3761  
  3762  @<Append |termin...@>=
  3763  if (scrap_ptr<scrap_base)or((cat[scrap_ptr]<>terminator)and
  3764      (cat[scrap_ptr]<>semi)) then sc0(terminator)
  3765  
  3766  @ A comment is incorporated into the previous scrap if that scrap is of type
  3767  |omega| or |semi| or |terminator|. (These three categories have consecutive
  3768  category codes.) Otherwise the comment is entered as a separate scrap
  3769  of type |terminator|, and it will combine with a |terminator| scrap that
  3770  immediately follows~it.
  3771  
  3772  The |app_comment| procedure takes care of placing a comment at the end of the
  3773  current scrap list. When |app_comment| is called, we assume that the current
  3774  token list is the translation of the comment involved.
  3775  
  3776  @<Declaration of the |app_comment|...@>=
  3777  procedure app_comment; {append a comment to the scrap list}
  3778  begin freeze_text;
  3779  if (scrap_ptr<scrap_base)or(cat[scrap_ptr]<omega)or
  3780      (cat[scrap_ptr]>terminator) then sc0(terminator)
  3781  else  begin app1(scrap_ptr); {|cat[scrap_ptr]| is
  3782      |omega| or |semi| or |terminator|}
  3783    end;
  3784  app(text_ptr-1+tok_flag); trans[scrap_ptr]:=text_ptr; freeze_text;
  3785  end;
  3786  
  3787  @ We are now finished with |Pascal_parse|, except for two relatively
  3788  trivial subprocedures that convert constants into tokens.
  3789  
  3790  @<Declaration of the |app_octal| and...@>=
  3791  procedure app_octal;
  3792  begin app("\"); app("O"); app("{");
  3793  @.\\O@>
  3794  while (buffer[loc]>="0")and(buffer[loc]<="7") do
  3795    begin app_tok(buffer[loc]); incr(loc);
  3796    end;
  3797  sc1("}")(simp);
  3798  end;
  3799  @#
  3800  procedure app_hex;
  3801  begin app("\"); app("H"); app("{");
  3802  @.\\H@>
  3803  while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@|
  3804      ((buffer[loc]>="A")and(buffer[loc]<="F")) do
  3805    begin app_tok(buffer[loc]); incr(loc);
  3806    end;
  3807  sc1("}")(simp);
  3808  end;
  3809  
  3810  
  3811  @ When the `\v' that introduces \PASCAL\ text is sensed, a call on
  3812  |Pascal_translate| will return a pointer to the \TeX\ translation of
  3813  that text. If scraps exist in the |cat| and |trans| arrays, they are
  3814  unaffected by this translation process.
  3815  
  3816  @p function Pascal_translate: text_pointer;
  3817  var p:text_pointer; {points to the translation}
  3818  @!save_base:0..max_scraps; {holds original value of |scrap_base|}
  3819  begin save_base:=scrap_base; scrap_base:=scrap_ptr+1;
  3820  Pascal_parse; {get the scraps together}
  3821  if next_control<>"|" then err_print('! Missing "|" after Pascal text');
  3822  @.Missing "|"...@>
  3823  app_tok(cancel); app_comment; {place a |cancel| token as a final ``comment''}
  3824  p:=translate; {make the translation}
  3825  stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;@;@+tats@;@/
  3826  scrap_ptr:=scrap_base-1; scrap_base:=save_base; {scrap the scraps}
  3827  Pascal_translate:=p;
  3828  end;
  3829  
  3830  @ The |outer_parse| routine is to |Pascal_parse| as |outer_xref|
  3831  is to |Pascal_xref|: It constructs a sequence of scraps for \PASCAL\ text
  3832  until |next_control>=format|. Thus, it takes care of embedded comments.
  3833  
  3834  @p procedure outer_parse; {makes scraps from \PASCAL\ tokens and comments}
  3835  var bal:eight_bits; {brace level in comment}
  3836  @!p,@!q:text_pointer; {partial comments}
  3837  begin while next_control<format do
  3838    if next_control<>"{" then Pascal_parse
  3839    else  begin @<Make sure that there is room for at least seven more
  3840        tokens, three more texts, and one more scrap@>;
  3841      app("\"); app("C"); app("{");
  3842  @.\\C@>
  3843      bal:=copy_comment(1); next_control:="|";
  3844      while bal>0 do
  3845        begin p:=text_ptr; freeze_text; q:=Pascal_translate;
  3846        {at this point we have |tok_ptr+6<=max_toks|}
  3847        app(tok_flag+p); app(inner_tok_flag+q);
  3848        if next_control="|" then bal:=copy_comment(bal)
  3849        else bal:=0; {an error has been reported}
  3850        end;
  3851      app(force); app_comment; {the full comment becomes a scrap}
  3852      end;
  3853  end;
  3854  
  3855  @ @<Make sure that there is room for at least seven more...@>=
  3856  if (tok_ptr+7>max_toks)or(text_ptr+3>max_texts)or(scrap_ptr>=max_scraps) then
  3857    begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
  3858    if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
  3859    if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
  3860    tats@;@/
  3861    overflow('token/text/scrap');
  3862    end
  3863  
  3864  @* Output of tokens.
  3865  So far our programs have only built up multi-layered token lists in
  3866  \.{WEAVE}'s internal memory; we have to figure out how to get them into
  3867  the desired final form. The job of converting token lists to characters in
  3868  the \TeX\ output file is not difficult, although it is an implicitly
  3869  recursive process. Four main considerations had to be kept in mind when
  3870  this part of \.{WEAVE} was designed.  (a) There are two modes of output:
  3871  |outer| mode, which translates tokens like |force| into line-breaking
  3872  control sequences, and |inner| mode, which ignores them except that blank
  3873  spaces take the place of line breaks. (b) The |cancel| instruction applies
  3874  to adjacent token or tokens that are output, and this cuts across levels
  3875  of recursion since `|cancel|' occurs at the beginning or end of a token
  3876  list on one level. (c) The \TeX\ output file will be semi-readable if line
  3877  breaks are inserted after the result of tokens like |break_space| and
  3878  |force|.  (d) The final line break should be suppressed, and there should
  3879  be no |force| token output immediately after `\.{\\Y\\P}'.
  3880  
  3881  @ The output process uses a stack to keep track of what is going on at
  3882  different ``levels'' as the token lists are being written out. Entries on
  3883  this stack have three parts:
  3884  
  3885  \yskip\hang |end_field| is the |tok_mem| location where the token list of a
  3886  particular level will end;
  3887  
  3888  \yskip\hang |tok_field| is the |tok_mem| location from which the next token
  3889  on a particular level will be read;
  3890  
  3891  \yskip\hang |mode_field| is the current mode, either |inner| or |outer|.
  3892  
  3893  \yskip\noindent The current values of these quantities are referred to
  3894  quite frequently, so they are stored in a separate place instead of in the
  3895  |stack| array. We call the current values |cur_end|, |cur_tok|, and
  3896  |cur_mode|.
  3897  
  3898  The global variable |stack_ptr| tells how many levels of output are
  3899  currently in progress. The end of output occurs when an |end_translation|
  3900  token is found, so the stack is never empty except when we first begin the
  3901  output process.
  3902  
  3903  @d inner=0 {value of |mode| for \PASCAL\ texts within \TeX\ texts}
  3904  @d outer=1 {value of |mode| for \PASCAL\ texts in modules}
  3905  
  3906  @<Types...@>=
  3907  @!mode=inner..outer;@/
  3908  @!output_state=record@!end_field:sixteen_bits; {ending location of token list}
  3909    @!tok_field:sixteen_bits; {present location within token list}
  3910    @!mode_field:mode; {interpretation of control tokens}
  3911    end;
  3912  
  3913  @ @d cur_end==cur_state.end_field {current ending location in |tok_mem|}
  3914  @d cur_tok==cur_state.tok_field {location of next output token in |tok_mem|}
  3915  @d cur_mode==cur_state.mode_field {current mode of interpretation}
  3916  @d init_stack==stack_ptr:=0;cur_mode:=outer {do this to initialize the stack}
  3917  
  3918  @<Glob...@>=
  3919  @!cur_state:output_state; {|cur_end|, |cur_tok|, |cur_mode|}
  3920  @!stack:array[1..stack_size] of output_state; {info for non-current levels}
  3921  @!stack_ptr:0..stack_size; {first unused location in the output state stack}
  3922  stat@!max_stack_ptr:0..stack_size; {largest value assumed by |stack_ptr|}
  3923  tats
  3924  
  3925  @ @<Set init...@>=stat max_stack_ptr:=0;@+tats
  3926  
  3927  @ To insert token-list |p| into the output, the |push_level| subroutine
  3928  is called; it saves the old level of output and gets a new one going.
  3929  The value of |cur_mode| is not changed.
  3930  
  3931  @p procedure push_level(@!p:text_pointer); {suspends the current level}
  3932  begin if stack_ptr=stack_size then overflow('stack')
  3933  else  begin if stack_ptr>0 then
  3934      stack[stack_ptr]:=cur_state; {save |cur_end|$\,\ldots\,$|cur_mode|}
  3935    incr(stack_ptr);
  3936    stat if stack_ptr>max_stack_ptr then
  3937      max_stack_ptr:=stack_ptr;@;@+tats@;@/
  3938    cur_tok:=tok_start[p]; cur_end:=tok_start[p+1];
  3939    end;
  3940  end;
  3941  
  3942  @ Conversely, the |pop_level| routine restores the conditions that were in
  3943  force when the current level was begun. This subroutine will never be
  3944  called when |stack_ptr=1|. It is so simple, we declare it as a macro:
  3945  
  3946  @d pop_level==begin decr(stack_ptr); cur_state:=stack[stack_ptr];
  3947    end {do this when |cur_tok| reaches |cur_end|}
  3948  
  3949  @ The |get_output| function returns the next byte of output that is not a
  3950  reference to a token list. It returns the values |identifier| or |res_word|
  3951  or |mod_name| if the next token is to be an identifier (typeset in
  3952  italics), a reserved word (typeset in boldface) or a module name (typeset
  3953  by a complex routine that might generate additional levels of output).
  3954  In these cases |cur_name| points to the identifier or module name in
  3955  question.
  3956  
  3957  @d res_word=@'201 {returned by |get_output| for reserved words}
  3958  @d mod_name=@'200 {returned by |get_output| for module names}
  3959  
  3960  @p function get_output:eight_bits; {returns the next token of output}
  3961  label restart;
  3962  var a:sixteen_bits; {current item read from |tok_mem|}
  3963  begin restart: while cur_tok=cur_end do pop_level;
  3964  a:=tok_mem[cur_tok]; incr(cur_tok);
  3965  if a>=@'400 then
  3966    begin cur_name:=a mod id_flag;
  3967    case a div id_flag of
  3968    2: a:=res_word; {|a=res_flag+cur_name|}
  3969    3: a:=mod_name; {|a=mod_flag+cur_name|}
  3970    4: begin push_level(cur_name); goto restart;
  3971      end; {|a=tok_flag+cur_name|}
  3972    5: begin push_level(cur_name); cur_mode:=inner; goto restart;
  3973      end; {|a=inner_tok_flag+cur_name|}
  3974    othercases a:=identifier {|a=id_flag+cur_name|}
  3975    endcases;
  3976    end;
  3977  @!debug if trouble_shooting then debug_help; @+ gubed@/
  3978  get_output:=a;
  3979  end;
  3980  
  3981  @ The real work associated with token output is done by |make_output|.
  3982  This procedure appends an |end_translation| token to the current token list,
  3983  and then it repeatedly calls |get_output| and feeds characters to the output
  3984  buffer until reaching the |end_translation| sentinel. It is possible for
  3985  |make_output| to
  3986  be called recursively, since a module name may include embedded \PASCAL\
  3987  text; however, the depth of recursion never exceeds one level, since
  3988  module names cannot be inside of module names.
  3989  
  3990  A procedure called |output_Pascal| does the scanning, translation, and
  3991  output of \PASCAL\ text within `\pb' brackets, and this procedure uses
  3992  |make_output| to output the current token list. Thus, the recursive call
  3993  of |make_output| actually occurs when |make_output| calls |output_Pascal|
  3994  while outputting the name of a module.
  3995  @^recursion@>
  3996  
  3997  @p procedure make_output; forward; @t\2@>@#
  3998  procedure output_Pascal; {outputs the current token list}
  3999  var save_tok_ptr,@!save_text_ptr,@!save_next_control:sixteen_bits;
  4000    {values to be restored}
  4001  p:text_pointer; {translation of the \PASCAL\ text}
  4002  begin save_tok_ptr:=tok_ptr; save_text_ptr:=text_ptr;
  4003  save_next_control:=next_control; next_control:="|"; p:=Pascal_translate;
  4004  app(p+inner_tok_flag);
  4005  make_output; {output the list}
  4006  stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
  4007  if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;@;@+tats@;@/
  4008  text_ptr:=save_text_ptr; tok_ptr:=save_tok_ptr; {forget the tokens}
  4009  next_control:=save_next_control; {restore |next_control| to original state}
  4010  end;
  4011  
  4012  @ Here is \.{WEAVE}'s major output handler.
  4013  
  4014  @p procedure make_output; {outputs the equivalents of tokens}
  4015  label reswitch,exit,found;
  4016  var a:eight_bits; {current output byte}
  4017  @!b:eight_bits; {next output byte}
  4018  @!k,@!k_limit:0..max_bytes; {indices into |byte_mem|}
  4019  @!w:0..ww-1; {row of |byte_mem|}
  4020  @!j:0..long_buf_size; {index into |buffer|}
  4021  @!string_delimiter:ASCII_code; {first and last character of
  4022    string being copied}
  4023  @!save_loc,@!save_limit:0..long_buf_size; {|loc| and |limit| to be restored}
  4024  @!cur_mod_name:name_pointer; {name of module being output}
  4025  @!save_mode:mode; {value of |cur_mode| before a sequence of breaks}
  4026  begin app(end_translation); {append a sentinel}
  4027  freeze_text; push_level(text_ptr-1);
  4028  loop@+  begin a:=get_output;
  4029  reswitch: case a of
  4030    end_translation: return;
  4031    identifier,res_word:@<Output an identifier@>;
  4032    mod_name:@<Output a module name@>;
  4033    math_bin,math_op,math_rel:@<Output a \.{\\math} operator@>;
  4034    cancel: begin repeat a:=get_output;
  4035      until (a<backup)or(a>big_force);
  4036      goto reswitch;
  4037      end;
  4038    big_cancel: begin repeat a:=get_output;
  4039      until ((a<backup)and(a<>" "))or(a>big_force);
  4040      goto reswitch;
  4041      end;
  4042    indent,outdent,opt,backup,break_space,force,big_force:@<Output a
  4043      \(control, look ahead in case of line breaks,
  4044      possibly |goto reswitch|@>;
  4045    othercases out(a) {otherwise |a| is an ASCII character}
  4046    endcases;
  4047    end;
  4048  exit:end;
  4049  
  4050  @ An identifier of length one does not have to be enclosed in braces, and it
  4051  looks slightly better if set in a math-italic font instead of a (slightly
  4052  narrower) text-italic font. Thus we output `\.{\\\char'174a}' but
  4053  `\.{\\\\\{aa\}}'.
  4054  
  4055  @<Output an identifier@>=
  4056  begin out("\");
  4057  if a=identifier then
  4058    if length(cur_name)=1 then out("|")
  4059  @.\\|@>
  4060    else out("\")
  4061  @.\\\\@>
  4062  else out("&"); {|a=res_word|}
  4063  @.\\\&@>
  4064  if length(cur_name)=1 then out(byte_mem[cur_name mod ww,byte_start[cur_name]])
  4065  else out_name(cur_name);
  4066  end
  4067  
  4068  @ @<Output a \....@>=
  4069  begin out5("\")("m")("a")("t")("h");
  4070  if a=math_bin then out3("b")("i")("n")
  4071  else if a=math_rel then out3("r")("e")("l")
  4072  else out2("o")("p");
  4073  out("{");
  4074  end
  4075  
  4076  @ The current mode does not affect the behavior of \.{WEAVE}'s output routine
  4077  except when we are outputting control tokens.
  4078  
  4079  @<Output a \(control...@>=
  4080  if a<break_space then
  4081    begin if cur_mode=outer then
  4082      begin out2("\")(a-cancel+"0");
  4083  @.\\1@>
  4084  @.\\2@>
  4085  @.\\3@>
  4086  @.\\4@>
  4087  @.\\5@>
  4088  @.\\6@>
  4089  @.\\7@>
  4090      if a=opt then out(get_output) {|opt| is followed by a digit}
  4091      end
  4092    else if a=opt then b:=get_output {ignore digit following |opt|}
  4093    end
  4094  else @<Look ahead for strongest line break, |goto reswitch|@>
  4095  
  4096  @ If several of the tokens |break_space|, |force|, |big_force| occur in a
  4097  row, possibly mixed with blank spaces (which are ignored),
  4098  the largest one is used. A line break also occurs in the output file,
  4099  except at the very end of the translation. The very first line break
  4100  is suppressed (i.e., a line break that follows `\.{\\Y\\P}').
  4101  
  4102  @<Look ahead for st...@>=
  4103  begin b:=a; save_mode:=cur_mode;
  4104  loop@+  begin a:=get_output;
  4105    if (a=cancel)or(a=big_cancel) then goto reswitch;
  4106      {|cancel| overrides everything}
  4107    if ((a<>" ")and(a<break_space))or(a>big_force) then
  4108      begin if save_mode=outer then
  4109        begin if out_ptr>3 then
  4110          if (out_buf[out_ptr]="P")and
  4111            (out_buf[out_ptr-1]="\")and
  4112  @.\\P@>
  4113  @.\\Y@>
  4114            (out_buf[out_ptr-2]="Y")and
  4115            (out_buf[out_ptr-3]="\") then
  4116            goto reswitch;
  4117  @.\\1@>
  4118  @.\\2@>
  4119  @.\\3@>
  4120  @.\\4@>
  4121  @.\\5@>
  4122  @.\\6@>
  4123  @.\\7@>
  4124        out2("\")(b-cancel+"0");
  4125        if a<>end_translation then finish_line;
  4126        end
  4127      else if (a<>end_translation)and(cur_mode=inner) then out(" ");
  4128      goto reswitch;
  4129      end;
  4130    if a>b then b:=a; {if |a=" "| we have |a<b|}
  4131    end;
  4132  end
  4133  
  4134  @ The remaining part of |make_output| is somewhat more complicated. When we
  4135  output a module name, we may need to enter the parsing and translation
  4136  routines, since the name may contain \PASCAL\ code embedded in
  4137  \pb\ constructions. This \PASCAL\ code is placed at the end of the active
  4138  input buffer and the translation process uses the end of the active
  4139  |tok_mem| area.
  4140  
  4141  @<Output a module name@>=
  4142  begin out2("\")("X");
  4143  @.\\X@>
  4144  cur_xref:=xref[cur_name];
  4145  if num(cur_xref)>=def_flag then
  4146    begin out_mod(num(cur_xref)-def_flag);
  4147    if phase_three then
  4148      begin cur_xref:=xlink(cur_xref);
  4149      while num(cur_xref)>=def_flag do
  4150        begin out2(",")(" ");
  4151        out_mod(num(cur_xref)-def_flag);
  4152        cur_xref:=xlink(cur_xref);
  4153        end;
  4154      end;
  4155    end
  4156  else out("0"); {output the module number, or zero if it was undefined}
  4157  out(":"); @<Output the text of the module name@>;
  4158  out2("\")("X");
  4159  end
  4160  
  4161  @ @<Output the text...@>=
  4162  k:=byte_start[cur_name]; w:=cur_name mod ww; k_limit:=byte_start[cur_name+ww];
  4163  cur_mod_name:=cur_name;
  4164  while k<k_limit do
  4165    begin b:=byte_mem[w,k]; incr(k);
  4166    if b="@@" then @<Skip next character, give error if not `\.{@@}'@>;
  4167    if b<>"|" then out(b)
  4168    else  begin @<Copy the \PASCAL\ text into |buffer[(limit+1)..j]|@>;
  4169      save_loc:=loc; save_limit:=limit; loc:=limit+2; limit:=j+1;
  4170      buffer[limit]:="|"; output_Pascal;
  4171      loc:=save_loc; limit:=save_limit;
  4172      end;
  4173    end
  4174  
  4175  @ @<Skip next char...@>=
  4176  begin if byte_mem[w,k]<>"@@" then
  4177    begin print_nl('! Illegal control code in section name:');
  4178  @.Illegal control code...@>
  4179    print_nl('<'); print_id(cur_mod_name); print('> '); mark_error;
  4180    end;
  4181  incr(k);
  4182  end
  4183  
  4184  @ The \PASCAL\ text enclosed in \pb\ should not contain `\v' characters,
  4185  except within strings. We put a `\v' at the front of the buffer, so that an
  4186  error message that displays the whole buffer will look a little bit sensible.
  4187  The variable |string_delimiter| is zero outside of strings, otherwise it
  4188  equals the delimiter that began the string being copied.
  4189  
  4190  @<Copy the \PASCAL\ text into...@>=
  4191  j:=limit+1; buffer[j]:="|"; string_delimiter:=0;
  4192  loop@+  begin if k>=k_limit then
  4193      begin print_nl('! Pascal text in section name didn''t end:');
  4194  @.Pascal text...didn't end@>
  4195      print_nl('<'); print_id(cur_mod_name); print('> '); mark_error;
  4196      goto found;
  4197      end;
  4198    b:=byte_mem[w,k]; incr(k);
  4199    if b="@@" then @<Copy a control code into the buffer@>
  4200    else  begin if (b="""")or(b="'") then
  4201        if string_delimiter=0 then string_delimiter:=b
  4202        else if string_delimiter=b then string_delimiter:=0;
  4203      if (b<>"|")or(string_delimiter<>0) then
  4204        begin if j>long_buf_size-3 then overflow('buffer');
  4205        incr(j); buffer[j]:=b;
  4206        end
  4207      else goto found;
  4208      end;
  4209    end;
  4210  found:
  4211  
  4212  @ @<Copy a control code into the buffer@>=
  4213  begin if j>long_buf_size-4 then overflow('buffer');
  4214  buffer[j+1]:="@@"; buffer[j+2]:=byte_mem[w,k]; j:=j+2; incr(k);
  4215  end
  4216  
  4217  @* Phase two processing.
  4218  We have assembled enough pieces of the puzzle in order to be ready to specify
  4219  the processing in \.{WEAVE}'s main pass over the source file. Phase two
  4220  is analogous to phase one, except that more work is involved because we must
  4221  actually output the \TeX\ material instead of merely looking at the
  4222  \.{WEB} specifications.
  4223  
  4224  @<Phase II: Read all the text again and translate it to \TeX\ form@>=
  4225  reset_input; print_nl('Writing the output file...');
  4226  module_count:=0;
  4227  copy_limbo;
  4228  finish_line; flush_buffer(0,false,false); {insert a blank line, it looks nice}
  4229  while not input_has_ended do @<Translate the \(current module@>
  4230  
  4231  @ The output file will contain the control sequence \.{\\Y} between non-null
  4232  sections of a module, e.g., between the \TeX\ and definition parts if both
  4233  are nonempty. This puts a little white space between the parts when they are
  4234  printed. However, we don't want \.{\\Y} to occur between two definitions
  4235  within a single module. The variables |out_line| or |out_ptr| will
  4236  change if a section is non-null, so the following macros `|save_position|'
  4237  and `|emit_space_if_needed|' are able to handle the situation:
  4238  
  4239  @d save_position==save_line:=out_line; save_place:=out_ptr
  4240  @d emit_space_if_needed==if (save_line<>out_line)or(save_place<>out_ptr) then
  4241    out2("\")("Y")
  4242  @.\\Y@>
  4243  
  4244  @<Glo...@>=
  4245  @!save_line:integer; {former value of |out_line|}
  4246  @!save_place:sixteen_bits; {former value of |out_ptr|}
  4247  
  4248  @ @<Translate the \(current module@>=
  4249  begin incr(module_count);@/
  4250  @<Output the code for the beginning of a new module@>;
  4251  save_position;@/
  4252  @<Translate the \TeX\ part of the current module@>;
  4253  @<Translate the \(definition part of the current module@>;
  4254  @<Translate the \PASCAL\ part of the current module@>;
  4255  @<Show cross references to this module@>;
  4256  @<Output the code for the end of a module@>;
  4257  end
  4258  
  4259  @ Modules beginning with the \.{WEB} control sequence `\.{@@\ }' start in the
  4260  output with the \TeX\ control sequence `\.{\\M}', followed by the module
  4261  number. Similarly, `\.{@@*}' modules lead to the control sequence `\.{\\N}'.
  4262  If this is a changed module, we put \.{*} just before the module number.
  4263  
  4264  @<Output the code for the beginning...@>=
  4265  out("\");
  4266  if buffer[loc-1]<>"*" then out("M")
  4267  @.\\M@>
  4268  else  begin out("N"); print('*',module_count:1);
  4269  @.\\N@>
  4270    update_terminal; {print a progress report}
  4271    end;
  4272  out_mod(module_count); out2(".")(" ")
  4273  
  4274  @ In the \TeX\ part of a module, we simply copy the source text, except that
  4275  index entries are not copied and \PASCAL\ text within \pb\ is translated.
  4276  
  4277  @<Translate the \T...@>=
  4278  repeat next_control:=copy_TeX;
  4279  case next_control of
  4280  "|": begin init_stack; output_Pascal;
  4281    end;
  4282  "@@": out("@@");
  4283  octal: @<Translate an octal constant appearing in \TeX\ text@>;
  4284  hex: @<Translate a hexadecimal constant appearing in \TeX\ text@>;
  4285  TeX_string,xref_roman,xref_wildcard,xref_typewriter,module_name:
  4286    begin loc:=loc-2; next_control:=get_next; {skip to \.{@@>}}
  4287    if next_control=TeX_string then
  4288      err_print('! TeX string should be in Pascal text only');
  4289  @.TeX string should be...@>
  4290    end;
  4291  begin_comment,end_comment,check_sum,thin_space,math_break,line_break,
  4292    big_line_break,no_line_break,join,pseudo_semi:
  4293      err_print('! You can''t do that in TeX text');
  4294  @.You can't do that...@>
  4295  othercases do_nothing
  4296  endcases;
  4297  until next_control>=format
  4298  
  4299  @ @<Translate an octal constant appearing in \TeX\ text@>=
  4300  begin out3("\")("O")("{");
  4301  @.\\O@>
  4302  while (buffer[loc]>="0")and(buffer[loc]<="7") do
  4303    begin out(buffer[loc]); incr(loc);
  4304    end; {since |buffer[limit]=" "|, this loop will end}
  4305  out("}");
  4306  end
  4307  
  4308  @ @<Translate a hexadecimal constant appearing in \TeX\ text@>=
  4309  begin out3("\")("H")("{");
  4310  @.\\H@>
  4311  while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@|
  4312      ((buffer[loc]>="A")and(buffer[loc]<="F")) do
  4313    begin out(buffer[loc]); incr(loc);
  4314    end;
  4315  out("}");
  4316  end
  4317  @ When we get to the following code we have |next_control>=format|, and
  4318  the token memory is in its initial empty state.
  4319  
  4320  @<Translate the \(d...@>=
  4321  if next_control<=definition then {definition part non-empty}
  4322    begin emit_space_if_needed; save_position;
  4323    end;
  4324  while next_control<=definition do {|format| or |definition|}
  4325    begin init_stack;
  4326    if next_control=definition then @<Start a macro definition@>
  4327    else @<Start a format definition@>;
  4328    outer_parse; finish_Pascal;
  4329    end
  4330  
  4331  @ The |finish_Pascal| procedure outputs the translation of the current
  4332  scraps, preceded by the control sequence `\.{\\P}' and followed by the
  4333  control sequence `\.{\\par}'. It also restores the token and scrap
  4334  memories to their initial empty state.
  4335  
  4336  A |force| token is appended to the current scraps before translation
  4337  takes place, so that the translation will normally end with \.{\\6} or
  4338  \.{\\7} (the \TeX\ macros for |force| and |big_force|). This \.{\\6} or
  4339  \.{\\7} is replaced by the concluding \.{\\par} or by \.{\\Y\\par}.
  4340  
  4341  @p procedure finish_Pascal; {finishes a definition or a \PASCAL\ part}
  4342  var p:text_pointer; {translation of the scraps}
  4343  begin out2("\")("P"); app_tok(force); app_comment; p:=translate;
  4344  @.\\P@>
  4345  app(p+tok_flag); make_output; {output the list}
  4346  if out_ptr>1 then
  4347    if out_buf[out_ptr-1]="\" then
  4348  @.\\6@>
  4349  @.\\7@>
  4350  @.\\Y@>
  4351      if out_buf[out_ptr]="6" then out_ptr:=out_ptr-2
  4352      else if out_buf[out_ptr]="7" then out_buf[out_ptr]:="Y";
  4353  out4("\")("p")("a")("r"); finish_line;
  4354  stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
  4355  if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
  4356  if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
  4357  tats@;@/
  4358  tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; {forget the tokens and the scraps}
  4359  end;
  4360  
  4361  @ @<Start a macro...@>=
  4362  begin sc2("\")("D")(intro); {this will produce `\&{define }'}
  4363  @.\\D@>
  4364  next_control:=get_next;
  4365  if next_control<>identifier then err_print('! Improper macro definition')
  4366  @.Improper macro definition@>
  4367  else sc1(id_flag+id_lookup(normal))(math);
  4368  next_control:=get_next;
  4369  end
  4370  
  4371  @ @<Start a format...@>=
  4372  begin sc2("\")("F")(intro); {this will produce `\&{format }'}
  4373  @.\\F@>
  4374  next_control:=get_next;
  4375  if next_control=identifier then
  4376    begin sc1(id_flag+id_lookup(normal))(math);
  4377    next_control:=get_next;
  4378    if next_control=equivalence_sign then
  4379      begin sc2("\")("S")(math); {output an equivalence sign}
  4380  @.\\S@>
  4381      next_control:=get_next;
  4382      if next_control=identifier then
  4383        begin sc1(id_flag+id_lookup(normal))(math);
  4384        sc0(semi); {insert an invisible semicolon}
  4385        next_control:=get_next;
  4386        end;
  4387      end;
  4388    end;
  4389  if scrap_ptr<>5 then err_print('! Improper format definition');
  4390  @.Improper format definition@>
  4391  end
  4392  
  4393  @ Finally, when the \TeX\ and definition parts have been treated, we have
  4394  |next_control>=begin_Pascal|. We will make the global variable |this_module|
  4395  point to the current module name, if it has a name.
  4396  
  4397  @<Glob...@>=@!this_module:name_pointer; {the current module name, or zero}
  4398  
  4399  @ @<Translate the \P...@>=
  4400  this_module:=0;
  4401  if next_control<=module_name then
  4402    begin emit_space_if_needed; init_stack;
  4403    if next_control=begin_Pascal then next_control:=get_next
  4404    else  begin this_module:=cur_module;
  4405      @<Check that |=| or |==| follows this module name, and
  4406        emit the scraps to start the module definition@>;
  4407      end;
  4408    while next_control<=module_name do
  4409      begin outer_parse;
  4410      @<Emit the scrap for a module name if present@>;
  4411      end;
  4412    finish_Pascal;
  4413    end
  4414  
  4415  @ @<Check that |=|...@>=
  4416  repeat next_control:=get_next;
  4417  until next_control<>"+"; {allow optional `\.{+=}'}
  4418  if (next_control<>"=")and(next_control<>equivalence_sign) then
  4419    err_print('! You need an = sign after the section name')
  4420  @.You need an = sign...@>
  4421  else next_control:=get_next;
  4422  if out_ptr>1 then
  4423    if (out_buf[out_ptr]="Y")and(out_buf[out_ptr-1]="\") then
  4424  @.\\Y@>
  4425      begin app(backup); {the module name will be flush left}
  4426      end;
  4427  sc1(mod_flag+this_module)(mod_scrap);
  4428  cur_xref:=xref[this_module];
  4429  if num(cur_xref)<>module_count+def_flag then
  4430    begin sc3(math_rel)("+")("}")(math);
  4431      {module name is multiply defined}
  4432    this_module:=0; {so we won't give cross-reference info here}
  4433    end;
  4434  sc2("\")("S")(math); {output an equivalence sign}
  4435  @.\\S@>
  4436  sc1(force)(semi); {this forces a line break unless `\.{@@+}' follows}
  4437  
  4438  @ @<Emit the scrap...@>=
  4439  if next_control<module_name then
  4440    begin err_print('! You can''t do that in Pascal text');
  4441  @.You can't do that...@>
  4442    next_control:=get_next;
  4443    end
  4444  else if next_control=module_name then
  4445    begin sc1(mod_flag+cur_module)(mod_scrap); next_control:=get_next;
  4446    end
  4447  
  4448  @ Cross references relating to a named module are given after the module ends.
  4449  
  4450  @<Show cross...@>=
  4451  if this_module>0 then
  4452    begin @<Rearrange the list pointed to by |cur_xref|@>;
  4453    footnote(def_flag); footnote(0);
  4454    end
  4455  
  4456  @ To rearrange the order of the linked list of cross references, we need
  4457  four more variables that point to cross reference entries.  We'll end up
  4458  with a list pointed to by |cur_xref|.
  4459  
  4460  @<Glob...@>=
  4461  @!next_xref,@!this_xref,@!first_xref,@!mid_xref:xref_number;
  4462    {pointer variables for rearranging a list}
  4463  
  4464  @ We want to rearrange the cross reference list so that all the entries with
  4465  |def_flag| come first, in ascending order; then come all the other
  4466  entries, in ascending order.  There may be no entries in either one or both
  4467  of these categories.
  4468  
  4469  @<Rearrange the list...@>=
  4470  first_xref:=xref[this_module];
  4471  this_xref:=xlink(first_xref); {bypass current module number}
  4472  if num(this_xref)>def_flag then
  4473    begin mid_xref:=this_xref; cur_xref:=0; {this value doesn't matter}
  4474    repeat  next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
  4475      cur_xref:=this_xref; this_xref:=next_xref;
  4476    until num(this_xref)<=def_flag;
  4477    xlink(first_xref):=cur_xref;
  4478    end
  4479  else mid_xref:=0; {first list null}
  4480  cur_xref:=0;
  4481  while this_xref<>0 do
  4482    begin next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
  4483    cur_xref:=this_xref; this_xref:=next_xref;
  4484    end;
  4485  if mid_xref>0 then xlink(mid_xref):=cur_xref
  4486  else xlink(first_xref):=cur_xref;
  4487  cur_xref:=xlink(first_xref)
  4488  
  4489  @ The |footnote| procedure gives cross reference information about
  4490  multiply defined module names (if the |flag| parameter is |def_flag|), or about
  4491  the uses of a module name (if the |flag| parameter is zero). It assumes that
  4492  |cur_xref| points to the first cross-reference entry of interest, and it
  4493  leaves |cur_xref| pointing to the first element not printed.  Typical outputs:
  4494  `\.{\\A101.}'; `\.{\\Us370\\ET1009.}'; `\.{\\As8, 27\\*, 51\\ETs64.}'.
  4495  
  4496  @p procedure footnote(@!flag:sixteen_bits); {outputs module cross-references}
  4497  label done,exit;
  4498  var q:xref_number; {cross-reference pointer variable}
  4499  begin if num(cur_xref)<=flag then return;
  4500  finish_line; out("\");
  4501  @.\\A@>
  4502  @.\\U@>
  4503  if flag=0 then out("U")@+else out("A");
  4504  @<Output all the module numbers on the reference list |cur_xref|@>;
  4505  out(".");
  4506  exit:end;
  4507  
  4508  @ The following code distinguishes three cases, according as the number
  4509  of cross references is one, two, or more than two. Variable |q| points
  4510  to the first cross reference, and the last link is a zero.
  4511  
  4512  @<Output all the module numbers...@>=
  4513  q:=cur_xref; if num(xlink(q))>flag then out("s"); {plural}
  4514  @.\\As@>
  4515  @.\\Us@>
  4516  loop@+  begin out_mod(num(cur_xref)-flag);
  4517    cur_xref:=xlink(cur_xref); {point to the next cross reference to output}
  4518    if num(cur_xref)<=flag then goto done;
  4519    if num(xlink(cur_xref))>flag then out2(",")(" ") {not the last}
  4520    else begin out3("\")("E")("T"); {the last}
  4521  @.\\ET@>
  4522      if cur_xref<>xlink(q) then out("s"); {the last of more than two}
  4523  @.\\ETs@>
  4524      end;
  4525    end;
  4526  done:
  4527  
  4528  @ @<Output the code for the end of a module@>=
  4529  out3("\")("f")("i"); finish_line;
  4530  flush_buffer(0,false,false); {insert a blank line, it looks nice}
  4531  @.\\fi@>
  4532  
  4533  @* Phase three processing.
  4534  We are nearly finished! \.{WEAVE}'s only remaining task is to write out the
  4535  index, after sorting the identifiers and index entries.
  4536  
  4537  @<Phase III: Output the cross-reference index@>=
  4538  phase_three:=true; print_nl('Writing the index...');
  4539  if change_exists then
  4540    begin finish_line; @<Tell about changed modules@>;
  4541    end;
  4542  finish_line; out4("\")("i")("n")("x"); finish_line;
  4543  @.\\inx@>
  4544  @<Do the first pass of sorting@>;
  4545  @<Sort and output the index@>;
  4546  out4("\")("f")("i")("n"); finish_line;
  4547  @.\\fin@>
  4548  @<Output all the module names@>;
  4549  out4("\")("c")("o")("n"); finish_line;
  4550  @.\\con@>
  4551  print('Done.');
  4552  
  4553  @ Just before the index comes a list of all the changed modules, including
  4554  the index module itself.
  4555  
  4556  @<Glob...@>=
  4557  @!k_module:0..max_modules; {runs through the modules}
  4558  
  4559  @ @<Tell about changed modules@>=
  4560  begin {remember that the index is already marked as changed}
  4561  k_module:=1;
  4562  out4("\")("c")("h")(" ");
  4563  while k_module<module_count do
  4564    begin if changed_module[k_module] then
  4565      begin out_mod(k_module); out2(",")(" ");
  4566      end;
  4567    incr(k_module);
  4568    end;
  4569  out_mod(k_module);
  4570  out(".");
  4571  end
  4572  
  4573  @ A left-to-right radix sorting method is used, since this makes it easy to
  4574  adjust the collating sequence and since the running time will be at worst
  4575  proportional to the total length of all entries in the index. We put the
  4576  identifiers into 230 different lists based on their first characters.
  4577  (Uppercase letters are put into the same list as the corresponding lowercase
  4578  letters, since we want to have `$t<\\{TeX}<\&{to}$'.) The
  4579  list for character |c| begins at location |bucket[c]| and continues through
  4580  the |blink| array.
  4581  
  4582  @<Glob...@>=
  4583  @!bucket:array[ASCII_code] of name_pointer;
  4584  @!next_name: name_pointer; {successor of |cur_name| when sorting}
  4585  @!c:ASCII_code; {index into |bucket|}
  4586  @!h:0..hash_size; {index into |hash|}
  4587  @!blink:array[0..max_names] of sixteen_bits; {links in the buckets}
  4588  
  4589  @ To begin the sorting, we go through all the hash lists and put each entry
  4590  having a nonempty cross-reference list into the proper bucket.
  4591  
  4592  @<Do the first pass...@>=
  4593  for c:=0 to 255 do bucket[c]:=0;
  4594  for h:=0 to hash_size-1 do
  4595    begin next_name:=hash[h];
  4596    while next_name<>0 do
  4597      begin cur_name:=next_name; next_name:=link[cur_name];
  4598      if xref[cur_name]<>0 then
  4599        begin c:=byte_mem[cur_name mod ww,byte_start[cur_name]];
  4600        if (c<="Z")and(c>="A") then c:=c+@'40;
  4601        blink[cur_name]:=bucket[c]; bucket[c]:=cur_name;
  4602        end;
  4603      end;
  4604    end
  4605  
  4606  @ During the sorting phase we shall use the |cat| and |trans| arrays from
  4607  \.{WEAVE}'s parsing algorithm and rename them |depth| and |head|. They now
  4608  represent a stack of identifier lists for all the index entries that have
  4609  not yet been output. The variable |sort_ptr| tells how many such lists are
  4610  present; the lists are output in reverse order (first |sort_ptr|, then
  4611  |sort_ptr-1|, etc.). The |j|th list starts at |head[j]|, and if the first
  4612  |k| characters of all entries on this list are known to be equal we have
  4613  |depth[j]=k|.
  4614  
  4615  @d depth==cat {reclaims memory that is no longer needed for parsing}
  4616  @d head==trans {ditto}
  4617  @d sort_ptr==scrap_ptr {ditto}
  4618  @d max_sorts==max_scraps {ditto}
  4619  
  4620  @<Globals...@>=
  4621  @!cur_depth:eight_bits; {depth of current buckets}
  4622  @!cur_byte:0..max_bytes; {index into |byte_mem|}
  4623  @!cur_bank:0..ww-1; {row of |byte_mem|}
  4624  @!cur_val:sixteen_bits; {current cross reference number}
  4625  stat@!max_sort_ptr:0..max_sorts;@+tats {largest value of |sort_ptr|}
  4626  
  4627  @ @<Set init...@>=stat max_sort_ptr:=0;@+tats
  4628  
  4629  @ The desired alphabetic order is specified by the |collate| array; namely,
  4630  |collate[0]<collate[1]<@t$\cdots$@><collate[229]|.
  4631  
  4632  @<Glob...@>=@!collate:array[0..229] of ASCII_code; {collation order}
  4633  
  4634  @ @<Local variables for init...@>=
  4635  @!c:ASCII_code; {used to initialize |collate|}
  4636  
  4637  @ We use the order $\hbox{null}<\.\ <\hbox{other characters}<\.\_<
  4638  \.A=\.a<\cdots<\.Z=\.z<\.0<\cdots<\.9.$
  4639  
  4640  @<Set init...@>=
  4641  collate[0]:=0; collate[1]:=" ";
  4642  for c:=1 to " "-1 do collate[c+1]:=c;
  4643  for c:=" "+1 to "0"-1 do collate[c]:=c;
  4644  for c:="9"+1 to "A"-1 do collate[c-10]:=c;
  4645  for c:="Z"+1 to "_"-1 do collate[c-36]:=c;
  4646  collate["_"-36]:="_"+1;
  4647  for c:="z"+1 to 255 do collate[c-63]:=c;
  4648  collate[193]:="_";
  4649  for c:="a" to "z" do collate[c-"a"+194]:=c;
  4650  for c:="0" to "9" do collate[c-"0"+220]:=c;
  4651  
  4652  @ Procedure |unbucket| goes through the buckets and adds nonempty lists
  4653  to the stack, using the collating sequence specified in the |collate| array.
  4654  The parameter to |unbucket| tells the current depth in the buckets.
  4655  Any two sequences that agree in their first 255 character positions are
  4656  regarded as identical.
  4657  
  4658  @d infinity=255 {$\infty$ (approximately)}
  4659  
  4660  @p procedure unbucket(@!d:eight_bits); {empties buckets having depth |d|}
  4661  var c:ASCII_code; {index into |bucket|}
  4662  begin for c:=229 downto 0 do if bucket[collate[c]]>0 then
  4663    begin if sort_ptr>max_sorts then overflow('sorting');
  4664    incr(sort_ptr);
  4665    stat if sort_ptr>max_sort_ptr then max_sort_ptr:=sort_ptr;@;@+tats@;@/
  4666    if c=0 then depth[sort_ptr]:=infinity else depth[sort_ptr]:=d;
  4667    head[sort_ptr]:=bucket[collate[c]]; bucket[collate[c]]:=0;
  4668    end;
  4669  end;
  4670  
  4671  @ @<Sort and output...@>=
  4672  sort_ptr:=0; unbucket(1);
  4673  while sort_ptr>0 do
  4674    begin cur_depth:=cat[sort_ptr];
  4675    if (blink[head[sort_ptr]]=0)or(cur_depth=infinity) then
  4676      @<Output index entries for the list at |sort_ptr|@>
  4677    else @<Split the list at |sort_ptr| into further lists@>;
  4678    end
  4679  
  4680  @ @<Split the list...@>=
  4681  begin next_name:=head[sort_ptr];
  4682  repeat cur_name:=next_name; next_name:=blink[cur_name];
  4683    cur_byte:=byte_start[cur_name]+cur_depth; cur_bank:=cur_name mod ww;
  4684    if cur_byte=byte_start[cur_name+ww] then c:=0 {we hit the end of the name}
  4685    else  begin c:=byte_mem[cur_bank,cur_byte];
  4686      if (c<="Z")and(c>="A") then c:=c+@'40;
  4687      end;
  4688    blink[cur_name]:=bucket[c]; bucket[c]:=cur_name;
  4689  until next_name=0;
  4690  decr(sort_ptr); unbucket(cur_depth+1);
  4691  end
  4692  
  4693  @ @<Output index...@>=
  4694  begin cur_name:=head[sort_ptr];
  4695  @!debug if trouble_shooting then debug_help;@;@+gubed@/
  4696  repeat out2("\")(":");
  4697  @.\\:@>
  4698    @<Output the name at |cur_name|@>;
  4699    @<Output the cross-references at |cur_name|@>;
  4700    cur_name:=blink[cur_name];
  4701  until cur_name=0;
  4702  decr(sort_ptr);
  4703  end
  4704  
  4705  @ @<Output the name...@>=
  4706  case ilk[cur_name] of
  4707  normal: if length(cur_name)=1 then out2("\")("|")@+else out2("\")("\");
  4708  @.\\|@>
  4709  @.\\\\@>
  4710  roman: do_nothing;
  4711  wildcard: out2("\")("9");
  4712  @.\\9@>
  4713  typewriter: out2("\")(".");
  4714  @.\\.@>
  4715  othercases out2("\")("&")
  4716  @.\\\&@>
  4717  endcases;@/
  4718  out_name(cur_name)
  4719  
  4720  @ Section numbers that are to be underlined are enclosed in
  4721  `\.{\\[}$\,\ldots\,$\.]'.
  4722  
  4723  @<Output the cross-references...@>=
  4724  @<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>;
  4725  repeat out2(",")(" "); cur_val:=num(cur_xref);
  4726  if cur_val<def_flag then out_mod(cur_val)
  4727  else  begin out2("\")("["); out_mod(cur_val-def_flag); out("]");
  4728  @.\\[@>
  4729    end;
  4730  cur_xref:=xlink(cur_xref);
  4731  until cur_xref=0;
  4732  out("."); finish_line
  4733  
  4734  @ List inversion is best thought of as popping elements off one stack and
  4735  pushing them onto another. In this case |cur_xref| will be the head of
  4736  the stack that we push things onto.
  4737  
  4738  @<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>=
  4739  this_xref:=xref[cur_name]; cur_xref:=0;
  4740  repeat next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
  4741  cur_xref:=this_xref; this_xref:=next_xref;
  4742  until this_xref=0
  4743  
  4744  @ The following recursive procedure walks through the tree of module names and
  4745  prints them.
  4746  @^recursion@>
  4747  
  4748  @p procedure mod_print(p:name_pointer); {print all module names in subtree |p|}
  4749  begin if p>0 then
  4750    begin mod_print(llink[p]);@/
  4751    out2("\")(":");@/
  4752  @.\\:@>
  4753    tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; init_stack;
  4754    app(p+mod_flag); make_output;
  4755    footnote(0); {|cur_xref| was set by |make_output|}
  4756    finish_line;@/
  4757    mod_print(rlink[p]);
  4758    end;
  4759  end;
  4760  
  4761  @ @<Output all the module names@>=@+mod_print(root)
  4762  
  4763  @* Debugging.
  4764  The \PASCAL\ debugger with which \.{WEAVE} was developed allows breakpoints
  4765  to be set, and variables can be read and changed, but procedures cannot be
  4766  executed. Therefore a `|debug_help|' procedure has been inserted in the main
  4767  loops of each phase of the program; when |ddt| and |dd| are set to appropriate
  4768  values, symbolic printouts of various tables will appear.
  4769  
  4770  The idea is to set a breakpoint inside the |debug_help| routine, at the
  4771  place of `\ignorespaces|breakpoint:|\unskip' below.  Then when
  4772  |debug_help| is to be activated, set |trouble_shooting| equal to |true|.
  4773  The |debug_help| routine will prompt you for values of |ddt| and |dd|,
  4774  discontinuing this when |ddt<=0|; thus you type $2n+1$ integers, ending
  4775  with zero or a negative number. Then control either passes to the
  4776  breakpoint, allowing you to look at and/or change variables (if you typed
  4777  zero), or to exit the routine (if you typed a negative value).
  4778  
  4779  Another global variable, |debug_cycle|, can be used to skip silently
  4780  past calls on |debug_help|. If you set |debug_cycle>1|, the program stops
  4781  only every |debug_cycle| times |debug_help| is called; however,
  4782  any error stop will set |debug_cycle| to zero.
  4783  
  4784  @<Globals...@>=
  4785  @!debug@!trouble_shooting:boolean; {is |debug_help| wanted?}
  4786  @!ddt:integer; {operation code for the |debug_help| routine}
  4787  @!dd:integer; {operand in procedures performed by |debug_help|}
  4788  @!debug_cycle:integer; {threshold for |debug_help| stopping}
  4789  @!debug_skipped:integer; {we have skipped this many |debug_help| calls}
  4790  @!term_in:text_file; {the user's terminal as an input file}
  4791  gubed
  4792  
  4793  @ The debugging routine needs to read from the user's terminal.
  4794  @^system dependencies@>
  4795  @<Set init...@>=
  4796  @!debug trouble_shooting:=true; debug_cycle:=1; debug_skipped:=0; tracing:=0;@/
  4797  trouble_shooting:=false; debug_cycle:=99999; {use these when it almost works}
  4798  reset(term_in,'TTY:','/I'); {open |term_in| as the terminal, don't do a |get|}
  4799  gubed
  4800  
  4801  @ @d breakpoint=888 {place where a breakpoint is desirable}
  4802  @^system dependencies@>
  4803  
  4804  @p @!debug procedure debug_help; {routine to display various things}
  4805  label breakpoint,exit;
  4806  var k:integer; {index into various arrays}
  4807  begin incr(debug_skipped);
  4808  if debug_skipped<debug_cycle then return;
  4809  debug_skipped:=0;
  4810  loop@+  begin print_nl('#'); update_terminal; {prompt}
  4811    read(term_in,ddt); {read a debug-command code}
  4812    if ddt<0 then return
  4813    else if ddt=0 then
  4814      begin goto breakpoint;@\ {go to every label at least once}
  4815      breakpoint: ddt:=0;@\
  4816      end
  4817    else  begin read(term_in,dd);
  4818      case ddt of
  4819      1: print_id(dd);
  4820      2: print_text(dd);
  4821      3: for k:=1 to dd do print(xchr[buffer[k]]);
  4822      4: for k:=1 to dd do print(xchr[mod_text[k]]);
  4823      5: for k:=1 to out_ptr do print(xchr[out_buf[k]]);
  4824      6: for k:=1 to dd do
  4825        begin print_cat(cat[k]); print(' ');
  4826        end;
  4827      othercases print('?')
  4828      endcases;
  4829      end;
  4830    end;
  4831  exit:end;
  4832  gubed
  4833  
  4834  @* The main program.
  4835  Let's put it all together now: \.{WEAVE} starts and ends here.
  4836  @^system dependencies@>
  4837  
  4838  The main procedure has been split into three sub-procedures in order to
  4839  keep certain \PASCAL\ compilers from overflowing their capacity.
  4840  @^split procedures@>
  4841  
  4842  @p procedure Phase_I;
  4843  begin @<Phase I:...@>;
  4844  end;
  4845  @#
  4846  procedure Phase_II;
  4847  begin @<Phase II:...@>;
  4848  end;
  4849  @#
  4850  begin initialize; {beginning of the main program}
  4851  print_ln(banner); {print a ``banner line''}
  4852  @<Store all the reserved words@>;
  4853  Phase_I; Phase_II;@/
  4854  @<Phase III:...@>;
  4855  @<Check that all changes have been read@>;
  4856  end_of_WEAVE:
  4857  stat @<Print statistics about memory usage@>;@+tats@;@/
  4858  @t\4\4@>{here files should be closed if the operating system requires it}
  4859  @<Print the job |history|@>;
  4860  end.
  4861  
  4862  @ @<Print statistics about memory usage@>=
  4863  print_nl('Memory usage statistics: ',
  4864    name_ptr:1,' names, ', xref_ptr:1,' cross references, ',
  4865    byte_ptr[0]:1);
  4866  for cur_bank:=1 to ww-1 do print('+',byte_ptr[cur_bank]:1);
  4867    print(' bytes;');
  4868  print_nl('parsing required ',max_scr_ptr:1,' scraps, ',max_txt_ptr:1,
  4869    ' texts, ',max_tok_ptr:1,' tokens, ', max_stack_ptr:1,' levels;');
  4870  print_nl('sorting required ',max_sort_ptr:1, ' levels.')
  4871  
  4872  @ Some implementations may wish to pass the |history| value to the
  4873  operating system so that it can be used to govern whether or not other
  4874  programs are started. Here we simply report the history to the user.
  4875  @^system dependencies@>
  4876  
  4877  @<Print the job |history|@>=
  4878  case history of
  4879  spotless: print_nl('(No errors were found.)');
  4880  harmless_message: print_nl('(Did you see the warning message above?)');
  4881  error_message: print_nl('(Pardon me, but I think I spotted something wrong.)');
  4882  fatal_message: print_nl('(That was a fatal error, my friend.)');
  4883  end {there are no other cases}
  4884  
  4885  @* System-dependent changes.
  4886  This module should be replaced, if necessary, by changes to the program
  4887  that are necessary to make \.{WEAVE} work at a particular installation.
  4888  It is usually best to design your change file so that all changes to
  4889  previous modules preserve the module numbering; then everybody's version
  4890  will be consistent with the printed program. More extensive changes,
  4891  which introduce new modules, can be inserted here; then only the index
  4892  itself will get a new module number.
  4893  @^system dependencies@>
  4894  
  4895  @* Index.
  4896  If you have read and understood the code for Phase III above, you know what
  4897  is in this index and how it got here. All modules in which an identifier is
  4898  used are listed with that identifier, except that reserved words are
  4899  indexed only when they appear in format definitions, and the appearances
  4900  of identifiers in module names are not indexed. Underlined entries
  4901  correspond to where the identifier was declared. Error messages, control
  4902  sequences put into the output, and a few
  4903  other things like ``recursion'' are indexed here too.