% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 1 was implemented in June 1982.
% Slight changes were made in October, 1982, for version 0.6 of TeX.
% Version 2 (July 1983) is consistent with TeX version 0.999.
% Version 3 (September 1989) is consistent with 8-bit TeX.

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}

\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\def\title{POOL\lowercase{type}}
\def\contentspagenumber{101}
\def\topofcontents{\null
  \titlefalse % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont POOLtype} processor}
  \vskip 15pt
  \centerline{(Version 3, September 1989)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
    The preparation of this report
    was supported in part by the National Science
    Foundation under grants IST-8201926 and MCS-8300984,
    and by the System Development Foundation. `\TeX' is a
    trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
The \.{POOLtype} utility program converts string pool files output
by \.{TANGLE} into a slightly more symbolic format that may be useful
when \.{TANGLE}d programs are being debugged.

It's a pretty trivial routine, but people may want to try transporting
this program before they get up enough courage to tackle \TeX\ itself.
The first 256 strings are treated as \TeX\ treats them, using routines
copied from \TeX82.

@ \.{POOLtype} is written entirely in standard \PASCAL, except that it has
to do some slightly system-dependent character code conversion on input
and output. The input is read from |pool_file|, and the output is written
on |output|. If the input is erroneous, the |output| file will describe
the error.
@^system dependencies@>

@p program POOLtype(@!pool_file,@!output);
label 9999; {this labels the end of the program}
type @<Types in the outer block@>@/
var @<Globals in the outer block@>@/
procedure initialize; {this procedure gets things started properly}
  var @<Local variables for initialization@>@;
  begin @<Set initial values of key variables@>@/
  end;

@ Here are some macros for common programming idioms.

@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d do_nothing == {empty statement}

@* The character set.
(The following material is copied verbatim from \TeX82.
Thus, the same system-dependent changes should be made to both programs.)

In order to make \TeX\ readily portable to a wide variety of
computers, all of its input text is converted to an internal eight-bit
code that includes standard ASCII, the ``American Standard Code for
Information Interchange.'' This conversion is done immediately when each
character is read in. Conversely, characters are converted from ASCII to
the user's external representation just before they are output to a
text file.

Such an internal code is relevant to users of \TeX\ primarily because it
governs the positions of characters in the fonts. For example, the
character `\.A' has ASCII code $65=@'101$, and when \TeX\ typesets
this letter it specifies character number 65 in the current font.
If that font actually has `\.A' in a different position, \TeX\ doesn't
know what the real position is; the program that does the actual printing from
\TeX's device-independent files is responsible for converting from ASCII to
a particular font encoding.
@^ASCII code@>

\TeX's internal code also defines the value of constants
that begin with a reverse apostrophe; and it provides an index to the
\.{\\catcode}, \.{\\mathcode}, \.{\\uccode}, \.{\\lccode}, and \.{\\delcode}
tables.

@ Characters of text that have been converted to \TeX's internal form
are said to be of type |ASCII_code|, which is a subrange of the integers.

@<Types...@>=
@!ASCII_code=0..255; {eight-bit numbers}

@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lowercase
letters. Nowadays, of course, we need to deal with both capital and small
letters in a convenient way, especially in a program for typesetting;
so the present specification of \TeX\ has been written under the assumption
that the \PASCAL\ compiler and run-time system permit the use of text files
with more than 64 distinguishable characters. More precisely, we assume that
the character set contains at least the letters and symbols associated
with ASCII codes @'40 through @'176; all of these characters are now
available on most computer terminals.

Since we are dealing with more characters than were present in the first
\PASCAL\ compilers, we have to decide what to call the associated data
type. Some \PASCAL s use the original name |char| for the
characters in text files, even though there now are more than 64 such
characters, while other \PASCAL s consider |char| to be a 64-element
subrange of a larger data type that has some other name.

In order to accommodate this difference, we shall use the name |text_char|
to stand for the data type of the characters that are converted to and
from |ASCII_code| when they are input and output. We shall also assume
that |text_char| consists of the elements |chr(first_text_char)| through
|chr(last_text_char)|, inclusive. The following definitions should be
adjusted if necessary.
@^system dependencies@>

@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=255 {ordinal number of the largest element of |text_char|}

@<Local variables for init...@>=
@!i:integer;

@ The \TeX\ processor converts between ASCII code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.

@<Glob...@>=
@!xord: array [text_char] of ASCII_code;
  {specifies conversion of input characters}
@!xchr: array [ASCII_code] of text_char;
  {specifies conversion of output characters}

@ Since we are assuming that our \PASCAL\ system is able to read and
write the visible characters of standard ASCII (although not
necessarily using the ASCII codes to represent them), the following
assignment statements initialize the standard part of the |xchr| array
properly, without needing any system-dependent changes. On the other
hand, it is possible to implement \TeX\ with less complete character
sets, and in such cases it will be necessary to change something here.
@^system dependencies@>

@<Set init...@>=
xchr[@'40]:=' ';
xchr[@'41]:='!';
xchr[@'42]:='"';
xchr[@'43]:='#';
xchr[@'44]:='$';
xchr[@'45]:='%';
xchr[@'46]:='&';
xchr[@'47]:='''';@/
xchr[@'50]:='(';
xchr[@'51]:=')';
xchr[@'52]:='*';
xchr[@'53]:='+';
xchr[@'54]:=',';
xchr[@'55]:='-';
xchr[@'56]:='.';
xchr[@'57]:='/';@/
xchr[@'60]:='0';
xchr[@'61]:='1';
xchr[@'62]:='2';
xchr[@'63]:='3';
xchr[@'64]:='4';
xchr[@'65]:='5';
xchr[@'66]:='6';
xchr[@'67]:='7';@/
xchr[@'70]:='8';
xchr[@'71]:='9';
xchr[@'72]:=':';
xchr[@'73]:=';';
xchr[@'74]:='<';
xchr[@'75]:='=';
xchr[@'76]:='>';
xchr[@'77]:='?';@/
xchr[@'100]:='@@';
xchr[@'101]:='A';
xchr[@'102]:='B';
xchr[@'103]:='C';
xchr[@'104]:='D';
xchr[@'105]:='E';
xchr[@'106]:='F';
xchr[@'107]:='G';@/
xchr[@'110]:='H';
xchr[@'111]:='I';
xchr[@'112]:='J';
xchr[@'113]:='K';
xchr[@'114]:='L';
xchr[@'115]:='M';
xchr[@'116]:='N';
xchr[@'117]:='O';@/
xchr[@'120]:='P';
xchr[@'121]:='Q';
xchr[@'122]:='R';
xchr[@'123]:='S';
xchr[@'124]:='T';
xchr[@'125]:='U';
xchr[@'126]:='V';
xchr[@'127]:='W';@/
xchr[@'130]:='X';
xchr[@'131]:='Y';
xchr[@'132]:='Z';
xchr[@'133]:='[';
xchr[@'134]:='\';
xchr[@'135]:=']';
xchr[@'136]:='^';
xchr[@'137]:='_';@/
xchr[@'140]:='`';
xchr[@'141]:='a';
xchr[@'142]:='b';
xchr[@'143]:='c';
xchr[@'144]:='d';
xchr[@'145]:='e';
xchr[@'146]:='f';
xchr[@'147]:='g';@/
xchr[@'150]:='h';
xchr[@'151]:='i';
xchr[@'152]:='j';
xchr[@'153]:='k';
xchr[@'154]:='l';
xchr[@'155]:='m';
xchr[@'156]:='n';
xchr[@'157]:='o';@/
xchr[@'160]:='p';
xchr[@'161]:='q';
xchr[@'162]:='r';
xchr[@'163]:='s';
xchr[@'164]:='t';
xchr[@'165]:='u';
xchr[@'166]:='v';
xchr[@'167]:='w';@/
xchr[@'170]:='x';
xchr[@'171]:='y';
xchr[@'172]:='z';
xchr[@'173]:='{';
xchr[@'174]:='|';
xchr[@'175]:='}';
xchr[@'176]:='~';@/

@ Some of the ASCII codes without visible characters have been given symbolic
names in this program because they are used with a special meaning.

@d null_code=@'0 {ASCII code that might disappear}
@d carriage_return=@'15 {ASCII code used at end of line}
@d invalid_code=@'177 {ASCII code that many systems prohibit in text files}

@ The ASCII code is ``standard'' only to a certain extent, since many
computer installations have found it advantageous to have ready access
to more than 94 printing characters. Appendix~C of {\sl The \TeX book\/}
gives a complete specification of the intended correspondence between
characters and \TeX's internal representation.
@:TeXbook}{\sl The \TeX book@>

If \TeX\ is being used
on a garden-variety \PASCAL\ for which only standard ASCII
codes will appear in the input and output files, it doesn't really matter
what codes are specified in |xchr[0..@'37]|, but the safest policy is to
blank everything out by using the code shown below.

However, other settings of |xchr| will make \TeX\ more friendly on
computers that have an extended character set, so that users can type things
like `\.^^Z' instead of `\.{\\ne}'. People with extended character sets can
assign codes arbitrarily, giving an |xchr| equivalent to whatever
characters the users of \TeX\ are allowed to have in their input files.
It is best to make the codes correspond to the intended interpretations as
shown in Appendix~C whenever possible; but this is not necessary. For
example, in countries with an alphabet of more than 26 letters, it is
usually best to map the additional letters into codes less than~@'40.
To get the most ``permissive'' character set, change |' '| on the
right of these assignment statements to |chr(i)|.
@^character set dependencies@>
@^system dependencies@>

@<Set init...@>=
for i:=0 to @'37 do xchr[i]:=' ';
for i:=@'177 to @'377 do xchr[i]:=' ';

@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]|
where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be
|j| or more; hence, standard ASCII code numbers will be used instead of
codes below @'40 in case there is a coincidence.

@<Set init...@>=
for i:=first_text_char to last_text_char do xord[chr(i)]:=invalid_code;
for i:=@'200 to @'377 do xord[xchr[i]]:=i;
for i:=0 to @'176 do xord[xchr[i]]:=i;

@* String handling.
(The following material is copied from the \\{get\_strings\_started} procedure
of \TeX82, with slight changes.)

@<Glob...@>=
@!k,@!l:0..255; {small indices or counters}
@!m,@!n:text_char; {characters input from |pool_file|}
@!s:integer; {number of strings treated so far}

@ The global variable |count| keeps track of the total number of characters
in strings.

@<Glob...@>=
@!count:integer; {how long the string pool is, so far}

@ @<Set init...@>=
count:=0;

@ This is the main program, where \.{POOLtype} starts and ends.

@d abort(#)==begin write_ln(#); goto 9999;
  end

@p begin initialize;@/
@<Make the first 256 strings@>;
s:=256;@/
@<Read the other strings from the \.{POOL} file,
  or give an error message and abort@>;
write_ln('(',count:1,' characters in all.)');
9999:end.

@ @d lc_hex(#)==l:=#;
  if l<10 then l:=l+"0" @+else l:=l-10+"a"

@<Make the first 256...@>=
for k:=0 to 255 do
  begin write(k:3,': "'); l:=k;
  if (@<Character |k| cannot be printed@>) then
    begin write(xchr["^"],xchr["^"]);
    if k<@'100 then l:=k+@'100
    else if k<@'200 then l:=k-@'100
    else begin lc_hex(k div 16); write(xchr[l]); lc_hex(k mod 16); incr(count);
      end;
    count:=count+2;
    end;
  if l="""" then write(xchr[l],xchr[l])
  else write(xchr[l]);
  incr(count); write_ln('"');
  end

@ The first 128 strings will contain 95 standard ASCII characters, and the
other 33 characters will be printed in three-symbol form like `\.{\^\^A}'
unless a system-dependent change is made here. Installations that have
an extended character set, where for example |xchr[@'32]=@t\.{\'^^Z\'}@>|,
would like string @'32 to be the single character @'32 instead of the
three characters @'136, @'136, @'132 (\.{\^\^Z}). On the other hand,
even people with an extended character set will want to represent string
@'15 by \.{\^\^M}, since @'15 is |carriage_return|; the idea is to
produce visible strings instead of tabs or line-feeds or carriage-returns
or bell-rings or characters that are treated anomalously in text files.

Unprintable characters of codes 128--255 are, similarly, rendered
\.{\^\^80}--\.{\^\^ff}.

The boolean expression defined here should be |true| unless \TeX\
internal code number~|k| corresponds to a non-troublesome visible
symbol in the local character set. An appropriate formula for the
extended character set recommended in {\sl The \TeX book\/} would, for
example, be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177..@'377]|'.
If character |k| cannot be printed, and |k<@'200|, then character |k+@'100| or
|k-@'100| must be printable; moreover, ASCII codes |[@'41..@'46,
@'60..@'71, @'136, @'141..@'146, @'160..@'171]| must be printable.
Thus, at least 80 printable characters are needed.
@:TeXbook}{\sl The \TeX book@>
@^character set dependencies@>
@^system dependencies@>

@<Character |k| cannot be printed@>=
  (k<" ")or(k>"~")

@ When the \.{WEB} system program called \.{TANGLE} processes a source file,
it outputs a \PASCAL\ program and also a string pool file. The present
program reads the latter file, where each string appears as a two-digit decimal
length followed by the string itself, and the information is output with its
associated index number. The strings are surrounded by double-quote marks;
double-quotes in the string itself are repeated.

@<Glob...@>=
@!pool_file:packed file of text_char;
  {the string-pool file output by \.{TANGLE}}
@!xsum:boolean; {has the check sum been found?}

@ @<Read the other strings...@>=
reset(pool_file); xsum:=false;
if eof(pool_file) then abort('! I can''t read the POOL file.');
repeat @<Read one string, but abort if there are problems@>;
until xsum;
if not eof(pool_file) then abort('! There''s junk after the check sum')

@ @<Read one string...@>=
if eof(pool_file) then abort('! POOL file contained no check sum');
read(pool_file,m,n); {read two digits of string length}
if m<>'*' then
  begin if (xord[m]<"0")or(xord[m]>"9")or(xord[n]<"0")or(xord[n]>"9") then
    abort('! POOL line doesn''t begin with two digits');
  l:=xord[m]*10+xord[n]-"0"*11; {compute the length}
  write(s:3,': "'); count:=count+l;
  for k:=1 to l do
    begin if eoln(pool_file) then
      begin write_ln('"'); abort('! That POOL line was too short');
      end;
    read(pool_file,m); write(xchr[xord[m]]);
    if xord[m]="""" then write(xchr[""""]);
    end;
  write_ln('"'); incr(s);
  end
else xsum:=true;
read_ln(pool_file)

@* System-dependent changes.
This section should be replaced, if necessary, by changes to the program
that are necessary to make \.{POOLtype} work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new sections, can be inserted here; then only the index
itself will get a new section number.
@^system dependencies@>

@* Index.
Indications of system dependencies appear here together with the section numbers
where each ident\-i\-fier is used.