% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 1 was implemented in June 1982.
% Slight changes were made in October, 1982, for version 0.6 of TeX.
% Version 2 (July 1983) is consistent with TeX version 0.999.
% Version 3 (September 1989) is consistent with 8-bit TeX.

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}

\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\def\title{POOL\lowercase{type}}
\def\contentspagenumber{101}
\def\topofcontents{\null
  \titlefalse % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont POOLtype} processor}
  \vskip 15pt
  \centerline{(Version 3, September 1989)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
    The preparation of this report
    was supported in part by the National Science
    Foundation under grants IST-8201926 and MCS-8300984,
    and by the System Development Foundation. `\TeX' is a
    trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
The \.{POOLtype} utility program converts string pool files output
by \.{TANGLE} into a slightly more symbolic format that may be useful
when \.{TANGLE}d programs are being debugged.

It's a pretty trivial routine, but people may want to try transporting
this program before they get up enough courage to tackle \TeX\ itself.
The first 256 strings are treated as \TeX\ treats them, using routines
copied from \TeX82.

@ \.{POOLtype} is written entirely in standard \PASCAL, except that it has
to do some slightly system-dependent character code conversion on input
and output. The input is read from |pool_file|, and the output is written
on |output|. If the input is erroneous, the |output| file will describe
the error.
@^system dependencies@>

@p program POOLtype(@!pool_file,@!output);
label 9999; {this labels the end of the program}
type @<Types in the outer block@>@/
var @<Globals in the outer block@>@/
procedure initialize; {this procedure gets things started properly}
  var @<Local variables for initialization@>@;
  begin @<Set initial values of key variables@>@/
  end;

@ Here are some macros for common programming idioms.

@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d do_nothing == {empty statement}

@* The character set.
(The following material is copied verbatim from \TeX82.
Thus, the same system-dependent changes should be made to both programs.)

In order to make \TeX\ readily portable to a wide variety of
computers, all of its input text is converted to an internal eight-bit
code that includes standard ASCII, the ``American Standard Code for
Information Interchange.''  This conversion is done immediately when each
character is read in. Conversely, characters are converted from ASCII to
the user's external representation just before they are output to a
text file.

Such an internal code is relevant to users of \TeX\ primarily because it
governs the positions of characters in the fonts. For example, the
character `\.A' has ASCII code $65=@'101$, and when \TeX\ typesets
this letter it specifies character number 65 in the current font.
If that font actually has `\.A' in a different position, \TeX\ doesn't
know what the real position is; the program that does the actual printing from
\TeX's device-independent files is responsible for converting from ASCII to
a particular font encoding.
@^ASCII code@>

\TeX's internal code also defines the value of constants
that begin with a reverse apostrophe; and it provides an index to the
\.{\\catcode}, \.{\\mathcode}, \.{\\uccode}, \.{\\lccode}, and \.{\\delcode}
tables.

@ Characters of text that have been converted to \TeX's internal form
are said to be of type |ASCII_code|, which is a subrange of the integers.

@<Types...@>=
@!ASCII_code=0..255; {eight-bit numbers}

@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lowercase
letters. Nowadays, of course, we need to deal with both capital and small
letters in a convenient way, especially in a program for typesetting;
so the present specification of \TeX\ has been written under the assumption
that the \PASCAL\ compiler and run-time system permit the use of text files
with more than 64 distinguishable characters. More precisely, we assume that
the character set contains at least the letters and symbols associated
with ASCII codes @'40 through @'176; all of these characters are now
available on most computer terminals.

Since we are dealing with more characters than were present in the first
\PASCAL\ compilers, we have to decide what to call the associated data
type. Some \PASCAL s use the original name |char| for the
characters in text files, even though there now are more than 64 such
characters, while other \PASCAL s consider |char| to be a 64-element
subrange of a larger data type that has some other name.

In order to accommodate this difference, we shall use the name |text_char|
to stand for the data type of the characters that are converted to and
from |ASCII_code| when they are input and output. We shall also assume
that |text_char| consists of the elements |chr(first_text_char)| through
|chr(last_text_char)|, inclusive. The following definitions should be
adjusted if necessary.
@^system dependencies@>

@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=255 {ordinal number of the largest element of |text_char|}

@<Local variables for init...@>=
@!i:integer;

@ The \TeX\ processor converts between ASCII code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.

@<Glob...@>=
@!xord: array [text_char] of ASCII_code;
  {specifies conversion of input characters}
@!xchr: array [ASCII_code] of text_char;
  {specifies conversion of output characters}

@ Since we are assuming that our \PASCAL\ system is able to read and
write the visible characters of standard ASCII (although not
necessarily using the ASCII codes to represent them), the following
assignment statements initialize the standard part of the |xchr| array
properly, without needing any system-dependent changes. On the other
hand, it is possible to implement \TeX\ with less complete character
sets, and in such cases it will be necessary to change something here.
@^system dependencies@>

@<Set init...@>=
xchr[@'40]:=' ';
xchr[@'41]:='!';
xchr[@'42]:='"';
xchr[@'43]:='#';
xchr[@'44]:='$';
xchr[@'45]:='%';
xchr[@'46]:='&';
xchr[@'47]:='''';@/
xchr[@'50]:='(';
xchr[@'51]:=')';
xchr[@'52]:='*';
xchr[@'53]:='+';
xchr[@'54]:=',';
xchr[@'55]:='-';
xchr[@'56]:='.';
xchr[@'57]:='/';@/
xchr[@'60]:='0';
xchr[@'61]:='1';
xchr[@'62]:='2';
xchr[@'63]:='3';
xchr[@'64]:='4';
xchr[@'65]:='5';
xchr[@'66]:='6';
xchr[@'67]:='7';@/
xchr[@'70]:='8';
xchr[@'71]:='9';
xchr[@'72]:=':';
xchr[@'73]:=';';
xchr[@'74]:='<';
xchr[@'75]:='=';
xchr[@'76]:='>';
xchr[@'77]:='?';@/
xchr[@'100]:='@@';
xchr[@'101]:='A';
xchr[@'102]:='B';
xchr[@'103]:='C';
xchr[@'104]:='D';
xchr[@'105]:='E';
xchr[@'106]:='F';
xchr[@'107]:='G';@/
xchr[@'110]:='H';
xchr[@'111]:='I';
xchr[@'112]:='J';
xchr[@'113]:='K';
xchr[@'114]:='L';
xchr[@'115]:='M';
xchr[@'116]:='N';
xchr[@'117]:='O';@/
xchr[@'120]:='P';
xchr[@'121]:='Q';
xchr[@'122]:='R';
xchr[@'123]:='S';
xchr[@'124]:='T';
xchr[@'125]:='U';
xchr[@'126]:='V';
xchr[@'127]:='W';@/
xchr[@'130]:='X';
xchr[@'131]:='Y';
xchr[@'132]:='Z';
xchr[@'133]:='[';
xchr[@'134]:='\';
xchr[@'135]:=']';
xchr[@'136]:='^';
xchr[@'137]:='_';@/
xchr[@'140]:='`';
xchr[@'141]:='a';
xchr[@'142]:='b';
xchr[@'143]:='c';
xchr[@'144]:='d';
xchr[@'145]:='e';
xchr[@'146]:='f';
xchr[@'147]:='g';@/
xchr[@'150]:='h';
xchr[@'151]:='i';
xchr[@'152]:='j';
xchr[@'153]:='k';
xchr[@'154]:='l';
xchr[@'155]:='m';
xchr[@'156]:='n';
xchr[@'157]:='o';@/
xchr[@'160]:='p';
xchr[@'161]:='q';
xchr[@'162]:='r';
xchr[@'163]:='s';
xchr[@'164]:='t';
xchr[@'165]:='u';
xchr[@'166]:='v';
xchr[@'167]:='w';@/
xchr[@'170]:='x';
xchr[@'171]:='y';
xchr[@'172]:='z';
xchr[@'173]:='{';
xchr[@'174]:='|';
xchr[@'175]:='}';
xchr[@'176]:='~';@/

@ Some of the ASCII codes without visible characters have been given symbolic
names in this program because they are used with a special meaning.

@d null_code=@'0 {ASCII code that might disappear}
@d carriage_return=@'15 {ASCII code used at end of line}
@d invalid_code=@'177 {ASCII code that many systems prohibit in text files}

@ The ASCII code is ``standard'' only to a certain extent, since many
computer installations have found it advantageous to have ready access
to more than 94 printing characters. Appendix~C of {\sl The \TeX book\/}
gives a complete specification of the intended correspondence between
characters and \TeX's internal representation.
@:TeXbook}{\sl The \TeX book@>

If \TeX\ is being used
on a garden-variety \PASCAL\ for which only standard ASCII
codes will appear in the input and output files, it doesn't really matter
what codes are specified in |xchr[0..@'37]|, but the safest policy is to
blank everything out by using the code shown below.

However, other settings of |xchr| will make \TeX\ more friendly on
computers that have an extended character set, so that users can type things
like `\.^^Z' instead of `\.{\\ne}'. People with extended character sets can
assign codes arbitrarily, giving an |xchr| equivalent to whatever
characters the users of \TeX\ are allowed to have in their input files.
It is best to make the codes correspond to the intended interpretations as
shown in Appendix~C whenever possible; but this is not necessary. For
example, in countries with an alphabet of more than 26 letters, it is
usually best to map the additional letters into codes less than~@'40.
To get the most ``permissive'' character set, change |' '| on the
right of these assignment statements to |chr(i)|.
@^character set dependencies@>
@^system dependencies@>

@<Set init...@>=
for i:=0 to @'37 do xchr[i]:=' ';
for i:=@'177 to @'377 do xchr[i]:=' ';

@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]|
where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be
|j| or more; hence, standard ASCII code numbers will be used instead of
codes below @'40 in case there is a coincidence.

@<Set init...@>=
for i:=first_text_char to last_text_char do xord[chr(i)]:=invalid_code;
for i:=@'200 to @'377 do xord[xchr[i]]:=i;
for i:=0 to @'176 do xord[xchr[i]]:=i;
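
@ For readers who want to experiment outside \PASCAL, here is a minimal
Python sketch of the table setup described above. It is not part of
\.{POOLtype}: the names |build_tables| and |INVALID_CODE| are made up for
illustration, and the sketch assumes that the external character set
coincides with ASCII, so that Python's |chr| can stand in for |text_char|.
The point to observe is the order of the last two loops: codes 0 to @'176
are assigned last and in increasing order, so a blank maps back to @'40
rather than to one of the blanked-out codes.

```python
# Illustrative sketch only; assumes the external character set is ASCII.
INVALID_CODE = 0o177  # plays the role of invalid_code


def build_tables(first_text_char=0, last_text_char=255):
    """Return (xchr, xord) built in the same order as the Pascal loops."""
    # xchr maps an internal code to an external character.
    # Everything outside 0o40..0o176 is blanked out, as in the default setup.
    xchr = [" "] * 256
    for code in range(0o40, 0o177):
        xchr[code] = chr(code)

    # xord maps an external character back to an internal code.
    # Characters that never occur in xchr keep the value INVALID_CODE.
    xord = {chr(i): INVALID_CODE
            for i in range(first_text_char, last_text_char + 1)}
    for code in range(0o200, 0o400):
        xord[xchr[code]] = code
    # This loop runs last, so the largest code below 0o177 wins a tie.
    for code in range(0, 0o177):
        xord[xchr[code]] = code
    return xchr, xord


xchr, xord = build_tables()
```

With these defaults every blanked-out code prints as a space, while
|xord[" "]| is @'40 because that assignment is the last one made for the
blank character.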

@* String handling.
(The following material is copied from the \\{get\_strings\_started} procedure
of \TeX82, with slight changes.)

@<Glob...@>=
@!k,@!l:0..255; {small indices or counters}
@!m,@!n:text_char; {characters input from |pool_file|}
@!s:integer; {number of strings treated so far}

@ The global variable |count| keeps track of the total number of characters
in strings.

@<Glob...@>=
@!count:integer; {how long the string pool is, so far}

@ @<Set init...@>=
count:=0;

@ This is the main program, where \.{POOLtype} starts and ends.

@d abort(#)==begin write_ln(#); goto 9999;
  end

@p begin initialize;@/
@<Make the first 256 strings@>;
s:=256;@/
@<Read the other strings from the \.{POOL} file,
  or give an error message and abort@>;
write_ln('(',count:1,' characters in all.)');
9999:end.

@ @d lc_hex(#)==l:=#;
  if l<10 then l:=l+"0" @+else l:=l-10+"a"

@<Make the first 256...@>=
for k:=0 to 255 do
  begin write(k:3,': "'); l:=k;
  if (@<Character |k| cannot be printed@>) then
    begin write(xchr["^"],xchr["^"]);
    if k<@'100 then l:=k+@'100
    else if k<@'200 then l:=k-@'100
    else begin lc_hex(k div 16); write(xchr[l]); lc_hex(k mod 16); incr(count);
      end;
    count:=count+2;
    end;
  if l="""" then write(xchr[l],xchr[l])
  else write(xchr[l]);
  incr(count); write_ln('"');
  end

@ The first 128 strings will contain 95 standard ASCII characters, and the
other 33 characters will be printed in three-symbol form like `\.{\^\^A}'
unless a system-dependent change is made here. Installations that have
an extended character set, where for example |xchr[@'32]=@t\.{\'^^Z\'}@>|,
would like string @'32 to be the single character @'32 instead of the
three characters @'136, @'136, @'132 (\.{\^\^Z}). On the other hand,
even people with an extended character set will want to represent string
@'15 by \.{\^\^M}, since @'15 is |carriage_return|; the idea is to
produce visible strings instead of tabs or line-feeds or carriage-returns
or bell-rings or characters that are treated anomalously in text files.

Unprintable characters of codes 128--255 are, similarly, rendered
\.{\^\^80}--\.{\^\^ff}.

The boolean expression defined here should be |true| unless \TeX\
internal code number~|k| corresponds to a non-troublesome visible
symbol in the local character set.  An appropriate formula for the
extended character set recommended in {\sl The \TeX book\/} would, for
example, be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177..@'377]|'.
If character |k| cannot be printed, and |k<@'200|, then character |k+@'100| or
|k-@'100| must be printable; moreover, ASCII codes |[@'41..@'46,
@'60..@'71, @'136, @'141..@'146, @'160..@'171]| must be printable.
Thus, at least 80 printable characters are needed.
@:TeXbook}{\sl The \TeX book@>
@^character set dependencies@>
@^system dependencies@>

@<Character |k| cannot be printed@>=
  (k<" ")or(k>"~")
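
@ The printed form chosen for each of the first 256 strings can be sketched
in Python as follows. This is an illustration rather than the program's own
code: the function names are hypothetical, the test is the default
|(k<" ")or(k>"~")| shown above, and the external character set is assumed
to be ASCII, so that Python's |chr| replaces |xchr|. The length of each
printed form is the amount that the original loop adds to |count|.

```python
# Illustrative sketch only; assumes ASCII output and the default test.

def cannot_be_printed(k):
    """Default test: only codes from ' ' through '~' print directly."""
    return k < ord(" ") or k > ord("~")


def printed_form(k):
    """Return the one-, three- or four-symbol form of internal code k."""
    if not cannot_be_printed(k):
        return chr(k)
    if k < 0o100:
        return "^^" + chr(k + 0o100)   # e.g. code 0o15 becomes ^^M
    if k < 0o200:
        return "^^" + chr(k - 0o100)   # e.g. code 0o177 becomes ^^?
    return "^^" + format(k, "02x")     # two lowercase hex digits


def listing_line(k):
    """Format one output line; a double-quote in the string is repeated."""
    text = printed_form(k).replace('"', '""')
    return f'{k:3d}: "{text}"'


# Total number of characters in the first 256 strings under these defaults.
total = sum(len(printed_form(k)) for k in range(256))
```

Under these assumptions the 95 visible codes contribute one character each,
the 33 remaining codes below @'200 contribute three each, and the 128 codes
from @'200 upward contribute four each.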

@ When the \.{WEB} system program called \.{TANGLE} processes a source file,
it outputs a \PASCAL\ program and also a string pool file. The present
program reads the latter file, where each string appears as a two-digit decimal
length followed by the string itself, and the information is output with its
associated index number. The strings are surrounded by double-quote marks;
double-quotes in the string itself are repeated.

@<Glob...@>=
@!pool_file:packed file of text_char;
  {the string-pool file output by \.{TANGLE}}
@!xsum:boolean; {has the check sum been found?}

@ @<Read the other strings...@>=
reset(pool_file); xsum:=false;
if eof(pool_file) then abort('! I can''t read the POOL file.');
repeat @<Read one string, but abort if there are problems@>;
until xsum;
if not eof(pool_file) then abort('! There''s junk after the check sum')

@ @<Read one string...@>=
if eof(pool_file) then abort('! POOL file contained no check sum');
read(pool_file,m,n); {read two digits of string length}
if m<>'*' then
  begin if (xord[m]<"0")or(xord[m]>"9")or(xord[n]<"0")or(xord[n]>"9") then
    abort('! POOL line doesn''t begin with two digits');
  l:=xord[m]*10+xord[n]-"0"*11; {compute the length}
  write(s:3,': "'); count:=count+l;
  for k:=1 to l do
    begin if eoln(pool_file) then
      begin write_ln('"'); abort('! That POOL line was too short');
      end;
    read(pool_file,m); write(xchr[xord[m]]);
    if xord[m]="""" then write(xchr[""""]);
    end;
  write_ln('"'); incr(s);
  end
else xsum:=true;
read_ln(pool_file)
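
@ The reading loop above can be approximated by the following Python sketch.
It is a simplification under stated assumptions, not a transcription: the
function name |read_pool| is hypothetical, the pool file is passed in as a
string, errors are raised as |ValueError| instead of jumping to label 9999,
the strings are collected rather than printed, and the returned count covers
only the strings read from the file (the original continues from the count
of the first 256 strings). The length is computed as in the \PASCAL\ code;
for the digits \.{1} and \.{2} this gives $49\cdot10+50-48\cdot11=12$.

```python
# Illustrative sketch only; assumes ASCII digits and newline-separated lines.

def read_pool(text, first_index=256):
    """Parse pool-file text; return (list of (index, string), char count)."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("! I can't read the POOL file.")
    strings = []
    count = 0
    for pos, line in enumerate(lines):
        if line[:1] == "*":
            # The check-sum line must be the last line of the file.
            if pos != len(lines) - 1:
                raise ValueError("! There's junk after the check sum")
            return strings, count
        m, n = line[0:1], line[1:2]
        if not ("0" <= m <= "9" and "0" <= n <= "9"):
            raise ValueError("! POOL line doesn't begin with two digits")
        # Same arithmetic as the original length computation.
        length = ord(m) * 10 + ord(n) - ord("0") * 11
        body = line[2:2 + length]
        if len(body) < length:
            raise ValueError("! That POOL line was too short")
        # Anything after the declared length is skipped, as read_ln does.
        strings.append((first_index + len(strings), body))
        count += length
    raise ValueError("! POOL file contained no check sum")


# A made-up three-line pool file; the digits after '*' are placeholders.
sample = "05hello\n03abcXYZ\n*123456789\n"
strings, count = read_pool(sample)
```

Here the second line declares a length of three, so only \.{abc} is kept
and the trailing \.{XYZ} is discarded with the rest of the line.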

@* System-dependent changes.
This section should be replaced, if necessary, by changes to the program
that are necessary to make \.{POOLtype} work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new sections, can be inserted here; then only the index
itself will get a new section number.
@^system dependencies@>

@* Index.
Indications of system dependencies appear here together with the section numbers
where each ident\-i\-fier is used.