% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 0.0 was more-or-less debugged on June 4, 1985.
% Version 0.1 improved formatting of : and added \\ (June 15, 1985).
% Version 0.2 improved formatting of good, fixed @@ bug (August 4, 1985).
% Version 0.3 fixed minor bug in change_file move (August 30, 1985).
% Version 0.4 fixed minor bug regarding empty comments (April 8, 1989).
% Version 1.0 was tuned up for the METAFONTware report (April 16, 1989).
% Version 1.1 ditto, with input handled by Hosek's idea (April 27, 1989).
% Version 2 has the new primitives of METAFONT 2.0 (October 16, 1989).
% Version 2.1 corrects two of those primitives (January 20, 2021).

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}
\font\logo=manfnt % font used for the METAFONT logo
\def\MF{{\logo META}\-{\logo FONT}}
\def\pb{$\.|\ldots\.|$} % MF brackets (|...|)
\def\v{\.{\char'174}} % vertical (|) in typewriter font
\def\dleft{[\![} \def\dright{]\!]} % double brackets
\mathchardef\RA="3221 % right arrow
\mathchardef\BA="3224 % double arrow
\def\({} % kludge for alphabetizing certain module names

\def\title{MFT}
\def\contentspagenumber{401}
\def\topofcontents{\null
  \titlefalse % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont MFT} processor}
  \vskip 15pt
  \centerline{(Version 2.1, January 2021)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
    The preparation of this report
    was supported in part by the National Science
    Foundation under grants IST-8201926, MCS-8300984, and
    CCR-8610181,
    and by the System Development
    Foundation. `\TeX' is a
    trademark of the American Mathematical Society.
    `{\logo hijklmnj}\kern1pt' is a trademark of Addison-Wesley
    Publishing Company.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
This program converts a \MF\ source file to a \TeX\ file. It was written
by D.~E. Knuth in June, 1985; a somewhat similar {\mc SAIL} program had
@^Knuth, Donald Ervin@>
been developed in January, 1980.

The general idea is to input a file called, say, \.{foo.mf} and to produce an
output file called, say, \.{foo.tex}. The latter file, when processed by \TeX,
will yield a ``prettyprinted'' representation of the input file.
@^user manual@>

Line breaks in the input are carried over into the output; moreover,
blank spaces at the beginning of a line are converted to quads of indentation
in the output. Thus, the user has full control over the indentation and line
breaks. Each line of input is translated independently of the others.

A slight change to \MF's comment convention allows further control.
Namely, `\.{\%\%}' indicates that the remainder of an input line should be
copied verbatim to the output; this interrupts the translation and forces
\.{MFT} to produce a certain result.

Furthermore, `\.{\%\%\%} $\langle\,$token$_1\,\rangle\ldots
\langle\,$token$_n\,\rangle$'
introduces a change in \.{MFT}'s formatting rules; all tokens after the first
will henceforth be translated according to the current conventions for
$\langle\,$token$_1\,\rangle$. The tokens must be symbolic (i.e., not
numeric or string tokens). For example, the input line
$$\.{\%\%\% addto fill draw filldraw}$$
says that the `\.{fill}', `\.{draw}', and `\.{filldraw}' operations of
plain \MF\ should be formatted as the primitive token `\.{addto}', i.e.,
in boldface type. (Without such reformatting commands, \.{MFT} would treat
`\.{fill}' like an ordinary tag or variable name.
In fact, you need
a reformatting command even to get parentheses to act like delimiters!)

\MF\ comments, which follow a single \.\% sign, should be valid \TeX\
input. But \MF\ material can be included in \pb\ within a comment; this
will be translated by \.{MFT} as if it were not in a comment. For example,
a phrase like `\.{make} \.{\v x2r\v} \.{zero}' will be translated into
`\.{make \$x\_\{2r\}\$ zero}'.

The rules just stated apply to lines that contain one, two, or three \.\% signs
in a row. Comments to \.{MFT} can follow `\.{\%\%\%\%}'.
Five or more \.\% signs should not be used.

Beside the normal input file, \.{MFT} also looks for a change file
(e.g., `\.{foo.ch}'), which allows substitutions to be made in the
translation. The change file follows the conventions of \.{WEB}, and
it should be null if there are no changes. (Changes usually contain
verbatim instructions to compensate for the fact that \.{MFT} cannot
format everything in an optimum way.)

There's also a third input file (e.g., `\.{plain.mft}'), which is
input before the other two. This file normally contains the `\.{\%\%\%}'
formatting commands that are necessary to tune \.{MFT} to a particular
style of \MF\ code, so it is called the style file.

The output of \.{MFT} should be accompanied by the macros in a small
package called \.{mftmac.tex}.
@.mftmac@>

Caveat: This program is not as ``bulletproof'' as the other routines
produced by Stanford's \TeX\ project. It takes care of a great deal of
tedious formatting, but it can produce strange output, because \MF\ is
an extremely general language. Users should proofread their output carefully.

@ \.{MFT} uses a few features of the local \PASCAL\ compiler that may
need to be changed in other installations:

\yskip\item{1)} Case statements have a default.
\item{2)} Input-output routines may need to be adapted for use with a particular
character set and/or for printing messages on the user's terminal.

\yskip\noindent
These features are also present in the \PASCAL\ version of \TeX, where they
are used in a similar (but more complex) way. System-dependent portions
of \.{MFT} can be identified by looking at the entries for `system
dependencies' in the index below.
@!@^system dependencies@>

The ``banner line'' defined here should be changed whenever \.{MFT}
is modified.

@d banner=='This is MFT, Version 2.1'

@ The program begins with a fairly normal header, made up of pieces that
@^system dependencies@>
will mostly be filled in later. The \.{MF} input comes from files |mf_file|,
|change_file|, and |style_file|; the \TeX\ output goes to file |tex_file|.

If it is necessary to abort the job because of a fatal error, the program
calls the `|jump_out|' procedure, which goes to the label |end_of_MFT|.

@d end_of_MFT = 9999 {go here to wrap it up}

@p @t\4@>@<Compiler directives@>@/
program MFT(@!mf_file,@!change_file,@!style_file,@!tex_file);
label end_of_MFT; {go here to finish}
const @<Constants in the outer block@>@/
type @<Types in the outer block@>@/
var @<Globals in the outer block@>@/
@<Error handling procedures@>@/
procedure initialize;
  var @<Local variables for initialization@>@/
  begin @<Set initial values@>@/
  end;

@ The \PASCAL\ compiler used to develop this system has ``compiler
directives'' that can appear in comments whose first character is a dollar sign.
In our case these directives tell the compiler to detect
@^system dependencies@>
things that are out of range.

@<Compiler directives@>=
@{@&$C+,A+,D-@} {range check, catch arithmetic overflow, no debug overhead}

@ Labels are given symbolic names by the following definitions.
We insert
the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a
procedure in which we have used the `|return|' statement defined below;
the label `|restart|' is occasionally used at the very beginning of a
procedure; and the label `|reswitch|' is occasionally used just prior to
a \&{case} statement in which some cases change the conditions and we wish to
branch to the newly applicable case.
Loops that are set up with the \&{loop} construction defined below are
commonly exited by going to `|done|' or to `|found|' or to `|not_found|',
and they are sometimes repeated by going to `|continue|'.

@d exit=10 {go here to leave a procedure}
@d restart=20 {go here to start a procedure again}
@d reswitch=21 {go here to start a case statement again}
@d continue=22 {go here to resume a loop}
@d done=30 {go here to exit a loop}
@d found=31 {go here when you've found it}
@d not_found=32 {go here when you've found something else}

@ Here are some macros for common programming idioms.

@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
@d do_nothing == {empty statement}
@d return == goto exit {terminate a procedure call}
@f return == nil
@f loop == xclause

@ We assume that |case| statements may include a default case that applies
if no matching label is found. Thus, we shall use constructions like
@^system dependencies@>
$$\vbox{\halign{#\hfil\cr
|case x of|\cr
1: $\langle\,$code for $x=1\,\rangle$;\cr
3: $\langle\,$code for $x=3\,\rangle$;\cr
|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
|endcases|\cr}}$$
since most \PASCAL\ compilers have plugged this hole in the language by
incorporating some sort of default mechanism.
For example, the compiler
used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label,
and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or
`\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases|
and |endcases| should be changed to agree with local conventions.
(Of course, if no default mechanism is available, the |case| statements of
this program must be extended by listing all remaining cases.)

@d othercases == others: {default for cases not listed explicitly}
@d endcases == @+end {follows the default case in an extended |case| statement}
@f othercases == else
@f endcases == end

@ The following parameters are set big enough to handle the Computer
Modern fonts, so they should be sufficient for most applications of \.{MFT}.

@<Constants...@>=
@!max_bytes=10000; {the number of bytes in tokens; must be less than 65536}
@!max_names=1000; {number of tokens}
@!hash_size=353; {should be prime}
@!buf_size=100; {maximum length of input line}
@!line_length=80; {lines of \TeX\ output have at most this many characters,
  should be less than 256}

@ A global variable called |history| will contain one of four values
at the end of every run: |spotless| means that no unusual messages were
printed; |harmless_message| means that a message of possible interest
was printed but no serious errors were detected; |error_message| means that
at least one error was found; |fatal_message| means that the program
terminated abnormally. The value of |history| does not influence the
behavior of the program; it is simply computed for the convenience
of systems that might want to use such information.

@d spotless=0 {|history| value for normal jobs}
@d harmless_message=1 {|history| value when non-serious info was printed}
@d error_message=2 {|history| value when an error was noted}
@d fatal_message=3 {|history| value when we had to stop prematurely}
@#
@d mark_harmless==@t@>@+if history=spotless then history:=harmless_message
@d mark_error==history:=error_message
@d mark_fatal==history:=fatal_message

@<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?}

@ @<Set init...@>=history:=spotless;

@* The character set.
\.{MFT} works internally with ASCII codes, like all other programs
associated with \TeX\ and \MF. The present section has been lifted
almost verbatim from the \MF\ program.
@^ASCII code@>

@ Characters of text that have been converted to \MF's internal form
are said to be of type |ASCII_code|, which is a subrange of the integers.

@<Types...@>=
@!ASCII_code=0..255; {eight-bit numbers}

@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lowercase
letters. Nowadays, of course, we need to deal with both capital and small
letters in a convenient way, especially in a program for font design;
so the present specification of \.{MFT} has been written under the assumption
that the \PASCAL\ compiler and run-time system permit the use of text files
with more than 64 distinguishable characters. More precisely, we assume that
the character set contains at least the letters and symbols associated
with ASCII codes @'40 through @'176. If additional characters are present,
\.{MFT} can be configured to work with them too.

Since we are dealing with more characters than were present in the first
\PASCAL\ compilers, we have to decide what to call the associated data
type.
Some \PASCAL s use the original name |char| for the
characters in text files, even though there now are more than 64 such
characters, while other \PASCAL s consider |char| to be a 64-element
subrange of a larger data type that has some other name.

In order to accommodate this difference, we shall use the name |text_char|
to stand for the data type of the characters that are converted to and
from |ASCII_code| when they are input and output. We shall also assume
that |text_char| consists of the elements |chr(first_text_char)| through
|chr(last_text_char)|, inclusive. The following definitions should be
adjusted if necessary.
@^system dependencies@>

@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=255 {ordinal number of the largest element of |text_char|}

@<Types...@>=
@!text_file=packed file of text_char;

@ @<Local variables for init...@>=
@!i:0..255;

@ The \.{MFT} processor converts between ASCII code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.

@<Glob...@>=
@!xord: array [text_char] of ASCII_code;
  {specifies conversion of input characters}
@!xchr: array [ASCII_code] of text_char;
  {specifies conversion of output characters}

@ Since we are assuming that our \PASCAL\ system is able to read and write the
visible characters of standard ASCII (although not necessarily using the
ASCII codes to represent them), the following assignment statements initialize
most of the |xchr| array properly, without needing any system-dependent
changes. On the other hand, it is possible to implement \.{MFT} with
less complete character sets, and in such cases it will be necessary to
change something here.
@^system dependencies@>

@<Set init...@>=
xchr[@'40]:=' ';
xchr[@'41]:='!';
xchr[@'42]:='"';
xchr[@'43]:='#';
xchr[@'44]:='$';
xchr[@'45]:='%';
xchr[@'46]:='&';
xchr[@'47]:='''';@/
xchr[@'50]:='(';
xchr[@'51]:=')';
xchr[@'52]:='*';
xchr[@'53]:='+';
xchr[@'54]:=',';
xchr[@'55]:='-';
xchr[@'56]:='.';
xchr[@'57]:='/';@/
xchr[@'60]:='0';
xchr[@'61]:='1';
xchr[@'62]:='2';
xchr[@'63]:='3';
xchr[@'64]:='4';
xchr[@'65]:='5';
xchr[@'66]:='6';
xchr[@'67]:='7';@/
xchr[@'70]:='8';
xchr[@'71]:='9';
xchr[@'72]:=':';
xchr[@'73]:=';';
xchr[@'74]:='<';
xchr[@'75]:='=';
xchr[@'76]:='>';
xchr[@'77]:='?';@/
xchr[@'100]:='@@';
xchr[@'101]:='A';
xchr[@'102]:='B';
xchr[@'103]:='C';
xchr[@'104]:='D';
xchr[@'105]:='E';
xchr[@'106]:='F';
xchr[@'107]:='G';@/
xchr[@'110]:='H';
xchr[@'111]:='I';
xchr[@'112]:='J';
xchr[@'113]:='K';
xchr[@'114]:='L';
xchr[@'115]:='M';
xchr[@'116]:='N';
xchr[@'117]:='O';@/
xchr[@'120]:='P';
xchr[@'121]:='Q';
xchr[@'122]:='R';
xchr[@'123]:='S';
xchr[@'124]:='T';
xchr[@'125]:='U';
xchr[@'126]:='V';
xchr[@'127]:='W';@/
xchr[@'130]:='X';
xchr[@'131]:='Y';
xchr[@'132]:='Z';
xchr[@'133]:='[';
xchr[@'134]:='\';
xchr[@'135]:=']';
xchr[@'136]:='^';
xchr[@'137]:='_';@/
xchr[@'140]:='`';
xchr[@'141]:='a';
xchr[@'142]:='b';
xchr[@'143]:='c';
xchr[@'144]:='d';
xchr[@'145]:='e';
xchr[@'146]:='f';
xchr[@'147]:='g';@/
xchr[@'150]:='h';
xchr[@'151]:='i';
xchr[@'152]:='j';
xchr[@'153]:='k';
xchr[@'154]:='l';
xchr[@'155]:='m';
xchr[@'156]:='n';
xchr[@'157]:='o';@/
xchr[@'160]:='p';
xchr[@'161]:='q';
xchr[@'162]:='r';
xchr[@'163]:='s';
xchr[@'164]:='t';
xchr[@'165]:='u';
xchr[@'166]:='v';
xchr[@'167]:='w';@/
xchr[@'170]:='x';
xchr[@'171]:='y';
xchr[@'172]:='z';
xchr[@'173]:='{';
xchr[@'174]:='|';
xchr[@'175]:='}';
xchr[@'176]:='~';

@ The ASCII code is ``standard'' only to a certain extent, since many
computer installations have found it advantageous to have ready access
to more than 94 printing characters. If \.{MFT} is being used
on a garden-variety \PASCAL\ for which only standard ASCII
codes will appear in the input and output files, it doesn't really matter
what codes are specified in |xchr[0..@'37]|, but the safest policy is to
blank everything out by using the code shown below.

However, other settings of |xchr| will make \.{MFT} more friendly on
computers that have an extended character set, so that users can type things
like `\.^^Z' instead of `\.{<>}', and so that \.{MFT} can echo the
page breaks found in its input. People with extended character sets can
assign codes arbitrarily, giving an |xchr| equivalent to whatever
characters the users of \.{MFT} are allowed to have in their input files.
Appropriate changes to \.{MFT}'s |char_class| table should then be made.
(Unlike \TeX, each installation of \MF\ has a fixed assignment of category
codes, called the |char_class|.) Such changes make portability of programs
more difficult, so they should be introduced cautiously if at all.
@^character set dependencies@>
@^system dependencies@>

@<Set init...@>=
for i:=0 to @'37 do xchr[i]:=' ';
for i:=@'177 to @'377 do xchr[i]:=' ';

@ The following system-independent code makes the |xord| array contain a
suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]|
where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be
|j| or more; hence, standard ASCII code numbers will be used instead of
codes below @'40 in case there is a coincidence.

@<Set init...@>=
for i:=first_text_char to last_text_char do xord[chr(i)]:=@'177;
for i:=@'200 to @'377 do xord[xchr[i]]:=i;
for i:=1 to @'176 do xord[xchr[i]]:=i;

@* Input and output.
The I/O conventions of this program are essentially identical to those
of \.{WEAVE}. Therefore people who need to make modifications should be
able to do so without too many headaches.

@ Terminal output is done by writing on file |term_out|, which is assumed to
consist of characters of type |text_char|:
@^system dependencies@>

@d print(#)==write(term_out,#) {`|print|' means write on the terminal}
@d print_ln(#)==write_ln(term_out,#) {`|print|' and then start new line}
@d new_line==write_ln(term_out) {start new line on the terminal}
@d print_nl(#)== {print information starting on a new line}
  begin new_line; print(#);
  end

@<Globals...@>=
@!term_out:text_file; {the terminal as an output file}

@ Different systems have different ways of specifying that the output on a
certain file will appear on the user's terminal. Here is one way to do this
on the \PASCAL\ system that was used in \.{WEAVE}'s initial development:
@^system dependencies@>

@<Set init...@>=
rewrite(term_out,'TTY:'); {send |term_out| output to the terminal}

@ The |update_terminal| procedure is called when we want
to make sure that everything we have output to the terminal so far has
actually left the computer's internal buffers and been sent.
@^system dependencies@>

@d update_terminal == break(term_out) {empty the terminal output buffer}

@ The main input comes from |mf_file|; this input may be overridden
by changes in |change_file|. (If |change_file| is empty, there are no changes.)
Furthermore the |style_file| is input first; it is unchangeable.

@<Globals...@>=
@!mf_file:text_file; {primary input}
@!change_file:text_file; {updates}
@!style_file:text_file; {formatting bootstrap}

@ The following code opens the input files. Since these files were listed
in the program header, we assume that the \PASCAL\ runtime system has
already checked that suitable file names have been given; therefore no
additional error checking needs to be done.
@^system dependencies@>

@p procedure open_input; {prepare to read the inputs}
begin reset(mf_file); reset(change_file); reset(style_file);
end;

@ The main output goes to |tex_file|.

@<Globals...@>=
@!tex_file: text_file;

@ The following code opens |tex_file|.
Since this file was listed in the program header, we assume that the
\PASCAL\ runtime system has checked that a suitable external file name has
been given.
@^system dependencies@>

@<Set init...@>=
rewrite(tex_file);

@ Input goes into an array called |buffer|.

@<Globals...@>=@!buffer: array[0..buf_size] of ASCII_code;

@ The |input_ln| procedure brings the next line of input from the specified
file into the |buffer| array and returns the value |true|, unless the file has
already been entirely read, in which case it returns |false|. The conventions
of \TeX\ are followed; i.e., |ASCII_code| numbers representing the next line
of the file are input into |buffer[0]|, |buffer[1]|, \dots,
|buffer[limit-1]|; trailing blanks are ignored;
and the global variable |limit| is set to the length of the
@^system dependencies@>
line. The value of |limit| must be strictly less than |buf_size|.

@p function input_ln(var f:text_file):boolean;
  {inputs a line or returns |false|}
var final_limit:0..buf_size; {|limit| without trailing blanks}
begin limit:=0; final_limit:=0;
if eof(f) then input_ln:=false
else begin while not eoln(f) do
    begin buffer[limit]:=xord[f^]; get(f);
    incr(limit);
    if buffer[limit-1]<>" " then final_limit:=limit;
    if limit=buf_size then
      begin while not eoln(f) do get(f);
      decr(limit); {keep |buffer[buf_size]| empty}
      if final_limit>limit then final_limit:=limit;
      print_nl('! Input line too long'); loc:=0; error;
@.Input line too long@>
      end;
    end;
  read_ln(f); limit:=final_limit; input_ln:=true;
  end;
end;

@* Reporting errors to the user.
The command `|err_print('! Error message')|' will report a syntax error to
the user, by printing the error message at the beginning of a new line and
then giving an indication of where the error was spotted in the source file.
Note that no period follows the error message, since the error routine
will automatically supply a period.

The actual error indications are provided by a procedure called |error|.

@d err_print(#)==
  begin new_line; print(#); error;
  end

@<Error handling...@>=
procedure error; {prints `\..' and location of error message}
var@!k,@!l: 0..buf_size; {indices into |buffer|}
begin @<Print error location based on input buffer@>;
update_terminal; mark_error;
end;

@ The error locations can be indicated by using the global variables
|loc|, |line|, |styling|, and |changing|, which tell respectively the first
unlooked-at position in |buffer|, the current line number, and whether or not
the current line is from |style_file| or |change_file| or |mf_file|.
This routine should be modified on systems whose standard text editor
has special line-numbering conventions.
@^system dependencies@>

@<Print error location based on input buffer@>=
begin if styling then print('. (style file ')
else if changing then print('. (change file ')@+else print('. (');
print_ln('l.', line:1, ')');
if loc>=limit then l:=limit else l:=loc;
for k:=1 to l do
  print(xchr[buffer[k-1]]); {print the characters already read}
new_line;
for k:=1 to l do print(' '); {space out the next line}
for k:=l+1 to limit do print(xchr[buffer[k-1]]); {print the part not yet read}
end

@ The |jump_out| procedure just cuts across all active procedure levels
and jumps out of the program. This is the only non-local \&{goto} statement
in \.{MFT}. It is used when no recovery from a particular error has
been provided.

Some \PASCAL\ compilers do not implement non-local |goto| statements.
@^system dependencies@>
In such cases the code that appears at label |end_of_MFT| should be
copied into the |jump_out| procedure, followed by a call to a system procedure
that terminates the program.

@d fatal_error(#)==begin new_line; print(#); error; mark_fatal; jump_out;
  end

@<Error handling...@>=
procedure jump_out;
begin goto end_of_MFT;
end;

@ Sometimes the program's behavior is far different from what it should be,
and \.{MFT} prints an error message that is really for the \.{MFT}
maintenance person, not the user. In such cases the program says
|confusion('indication of where we are')|.

@d confusion(#)==fatal_error('! This can''t happen (',#,')')
@.This can't happen@>

@ An overflow stop occurs if \.{MFT}'s tables aren't large enough.

@d overflow(#)==fatal_error('! Sorry, ',#,' capacity exceeded')
@.Sorry, x capacity exceeded@>

@* Inserting the changes.
Let's turn now to the low-level routine |get_line|
that takes care of merging |change_file| into |mf_file|.
The |get_line|
procedure also updates the line numbers for error messages.
(This routine was copied from \.{WEAVE}, but updated to include |styling|.)

@<Globals...@>=
@!line:integer; {the number of the current line in the current file}
@!other_line:integer; {the number of the current line in the input file that
  is not currently being read}
@!temp_line:integer; {used when interchanging |line| with |other_line|}
@!limit:0..buf_size; {the last character position occupied in the buffer}
@!loc:0..buf_size; {the next character position to be read from the buffer}
@!input_has_ended: boolean; {if |true|, there is no more input}
@!changing: boolean; {if |true|, the current line is from |change_file|}
@!styling: boolean; {if |true|, the current line is from |style_file|}

@ As we change |changing| from |true| to |false| and back again, we must
remember to swap the values of |line| and |other_line| so that the |err_print|
routine will be sure to report the correct line number.

@d change_changing==
  changing := not changing;
  temp_line:=other_line; other_line:=line; line:=temp_line
    {|line @t$\null\BA\null$@> other_line|}

@ When |changing| is |false|, the next line of |change_file| is kept in
|change_buffer[0..change_limit]|, for purposes of comparison with the next
line of |mf_file|. After the change file has been completely input, we
set |change_limit:=0|, so that no further matches will be made.

@<Globals...@>=
@!change_buffer:array[0..buf_size] of ASCII_code;
@!change_limit:0..buf_size; {the last position occupied in |change_buffer|}

@ Here's a simple function that checks if the two buffers are different.

@p function lines_dont_match:boolean;
label exit;
var k:0..buf_size; {index into the buffers}
begin lines_dont_match:=true;
if change_limit<>limit then return;
if limit>0 then
  for k:=0 to limit-1 do if change_buffer[k]<>buffer[k] then return;
lines_dont_match:=false;
exit: end;

@ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation
for the next matching operation. Since blank lines in the change file are
not used for matching, we have |(change_limit=0)and not changing| if and
only if the change file is exhausted. This procedure is called only
when |changing| is true; hence error messages will be reported correctly.

@p procedure prime_the_change_buffer;
label continue, done, exit;
var k:0..buf_size; {index into the buffers}
begin change_limit:=0; {this value will be used if the change file ends}
@<Skip over comment lines in the change file; |return| if end of file@>;
@<Skip to the next nonblank line; |return| if end of file@>;
@<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>;
exit: end;

@ While looking for a line that begins with \.{@@x} in the change file,
we allow lines that begin with \.{@@}, as long as they don't begin with
\.{@@y} or \.{@@z} (which would probably indicate that the change file is
fouled up).

@<Skip over comment lines in the change file...@>=
loop@+ begin incr(line);
  if not input_ln(change_file) then return;
  if limit<2 then goto continue;
  if buffer[0]<>"@@" then goto continue;
  if (buffer[1]>="X")and(buffer[1]<="Z") then
    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
  if buffer[1]="x" then goto done;
  if (buffer[1]="y")or(buffer[1]="z") then
    begin loc:=2; err_print('! Where is the matching @@x?');
@.Where is the match...@>
    end;
continue: end;
done:

@ Here we are looking at lines following the \.{@@x}.
706 707 @<Skip to the next nonblank line...@>= 708 repeat incr(line); 709 if not input_ln(change_file) then 710 begin err_print('! Change file ended after @@x'); 711 @.Change file ended...@> 712 return; 713 end; 714 until limit>0; 715 716 @ @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>= 717 begin change_limit:=limit; 718 if limit>0 then for k:=0 to limit-1 do change_buffer[k]:=buffer[k]; 719 end 720 721 @ The following procedure is used to see if the next change entry should 722 go into effect; it is called only when |changing| is false. 723 The idea is to test whether or not the current 724 contents of |buffer| matches the current contents of |change_buffer|. 725 If not, there's nothing more to do; but if so, a change is called for: 726 All of the text down to the \.{@@y} is supposed to match. An error 727 message is issued if any discrepancy is found. Then the procedure 728 prepares to read the next line from |change_file|. 729 730 @p procedure check_change; {switches to |change_file| if the buffers match} 731 label exit; 732 var n:integer; {the number of discrepancies found} 733 @!k:0..buf_size; {index into the buffers} 734 begin if lines_dont_match then return; 735 n:=0; 736 loop@+ begin change_changing; {now it's |true|} 737 incr(line); 738 if not input_ln(change_file) then 739 begin err_print('! Change file ended before @@y'); 740 @.Change file ended...@> 741 change_limit:=0; change_changing; {|false| again} 742 return; 743 end; 744 @<If the current line starts with \.{@@y}, 745 report any discrepancies and |return|@>; 746 @<Move |buffer| and |limit|...@>; 747 change_changing; {now it's |false|} 748 incr(line); 749 if not input_ln(mf_file) then 750 begin err_print('! 
MF file ended during a change');
@.MF file ended...@>
    input_has_ended:=true; return;
    end;
  if lines_dont_match then incr(n);
  end;
exit: end;

@ @<If the current line starts with \.{@@y}...@>=
if limit>1 then if buffer[0]="@@" then
  begin if (buffer[1]>="X")and(buffer[1]<="Z") then
    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
  if (buffer[1]="x")or(buffer[1]="z") then
    begin loc:=2; err_print('! Where is the matching @@y?');
@.Where is the match...@>
    end
  else if buffer[1]="y" then
    begin if n>0 then
      begin loc:=2; err_print('! Hmm... ',n:1,
        ' of the preceding lines failed to match');
@.Hmm... n of the preceding...@>
      end;
    return;
    end;
  end

@ Here's what we do to get the input rolling.

@<Initialize the input system@>=
begin open_input; line:=0; other_line:=0;@/
changing:=true; prime_the_change_buffer; change_changing;@/
styling:=true; limit:=0; loc:=1; buffer[0]:=" "; input_has_ended:=false;
end

@ The |get_line| procedure is called when |loc>limit|; it puts the next
line of merged input into the buffer and updates the other variables
appropriately.

@p procedure get_line; {inputs the next line}
label restart;
begin restart: if styling then
  @<Read from |style_file| and maybe turn off |styling|@>;
if not styling then
  begin if changing then
    @<Read from |change_file| and maybe turn off |changing|@>;
  if not changing then
    begin @<Read from |mf_file| and maybe turn on |changing|@>;
    if changing then goto restart;
    end;
  end;
end;

@ @<Read from |mf_file|...@>=
begin incr(line);
if not input_ln(mf_file) then input_has_ended:=true
else if change_limit>0 then check_change;
end

@ @<Read from |style_file|...@>=
begin incr(line);
if not input_ln(style_file) then
  begin styling:=false; line:=0;
  end;
end

@ @<Read from |change_file|...@>=
begin incr(line);
if not input_ln(change_file) then
  begin err_print('! Change file ended without @@z');
@.Change file ended...@>
  buffer[0]:="@@"; buffer[1]:="z"; limit:=2;
  end;
if limit>1 then {check if the change has ended}
  if buffer[0]="@@" then
    begin if (buffer[1]>="X")and(buffer[1]<="Z") then
      buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
    if (buffer[1]="x")or(buffer[1]="y") then
      begin loc:=2; err_print('! Where is the matching @@z?');
@.Where is the match...@>
      end
    else if buffer[1]="z" then
      begin prime_the_change_buffer; change_changing;
      end;
    end;
end

@ At the end of the program, we will tell the user if the change file
had a line that didn't match any relevant line in |mf_file|.

@<Check that all changes have been read@>=
if change_limit<>0 then {|changing| is false}
  begin for loc:=0 to change_limit-1 do buffer[loc]:=change_buffer[loc];
  limit:=change_limit; changing:=true; line:=other_line; loc:=change_limit;
  err_print('! Change file entry did not match');
@.Change file entry did not match@>
  end

@* Data structures.
\.{MFT} puts token names
into the large |byte_mem| array, which is packed with eight-bit integers.
Allocation is sequential, since names are never deleted.

An auxiliary array |byte_start| is used as a directory for |byte_mem|;
the |link| and |ilk| arrays give further information about names.
These auxiliary arrays consist of sixteen-bit items.

@<Types...@>=
@!eight_bits=0..255; {unsigned one-byte quantity}
@!sixteen_bits=0..65535; {unsigned two-byte quantity}

@ \.{MFT} has been designed to avoid the need for indices that are more
than sixteen bits wide, so that it can be used on most computers.

@<Globals...@>=
@!byte_mem: packed array [0..max_bytes] of ASCII_code; {characters of names}
@!byte_start: array [0..max_names] of sixteen_bits; {directory into |byte_mem|}
@!link: array [0..max_names] of sixteen_bits; {hash table links}
@!ilk: array [0..max_names] of sixteen_bits; {type codes}

@ The names of tokens are found by computing a hash address |h| and
then looking at strings of bytes signified by |hash[h]|, |link[hash[h]]|,
|link[link[hash[h]]]|, \dots, until either finding the desired name
or encountering a zero.

A `|name_pointer|' variable, which signifies a name, is an index into
|byte_start|. The actual sequence of characters in the name pointed to by
|p| appears in positions |byte_start[p]| to |byte_start[p+1]-1|, inclusive,
of |byte_mem|.

We usually have |byte_start[name_ptr]=byte_ptr|, which is
the starting position for the next name to be stored in |byte_mem|.
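
The directory layout just described can be mimicked in a few lines of
Python. This is only a sketch for illustration, not part of \.{MFT}:
Python lists stand in for the fixed-size Pascal arrays, and the helpers
|store_name| and |name_of| are hypothetical names introduced here.

```python
# Hypothetical miniature of the name storage described above.
byte_mem = []        # characters of all names, stored back to back
byte_start = [0, 0]  # name 0 has length zero: byte_start[0] = byte_start[1] = 0
name_ptr = 1         # first unused position in byte_start


def store_name(text):
    """Append a name sequentially and return its name pointer."""
    global name_ptr
    p = name_ptr
    byte_mem.extend(ord(ch) for ch in text)
    name_ptr += 1
    byte_start.append(len(byte_mem))  # keeps byte_start[name_ptr] = byte_ptr
    return p


def length(p):
    """Number of characters in name p."""
    return byte_start[p + 1] - byte_start[p]


def name_of(p):
    """Characters byte_start[p] .. byte_start[p+1]-1 of byte_mem."""
    return "".join(chr(c) for c in byte_mem[byte_start[p]:byte_start[p + 1]])


addto = store_name("addto")
fill = store_name("fill")
```

Because names are never deleted, a name pointer stays valid for the whole
run, and the end of one name doubles as the start of the next.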

@d length(#)==byte_start[#+1]-byte_start[#] {the length of a name}

@<Types...@>=
@!name_pointer=0..max_names; {identifies a name}

@ @<Global...@>=
@!name_ptr:name_pointer; {first unused position in |byte_start|}
@!byte_ptr:0..max_bytes; {first unused position in |byte_mem|}

@ @<Set init...@>=
byte_start[0]:=0; byte_ptr:=0;
byte_start[1]:=0; {this makes name 0 of length zero}
name_ptr:=1;

@ The hash table described above is updated by the |lookup| procedure,
which finds a given name and returns a pointer to its index in
|byte_start|. The token is supposed to match character by character.
If it was not already present, it is inserted into the table.

Because of the way \.{MFT}'s scanning mechanism works, it is most convenient
to let |lookup| search for a token that is present in the |buffer|
array. Two other global variables specify its position in the buffer: the
first character is |buffer[id_first]|, and the last is |buffer[id_loc-1]|.

@<Glob...@>=
@!id_first:0..buf_size; {where the current token begins in the buffer}
@!id_loc:0..buf_size; {just after the current token in the buffer}
@#
@!hash:array [0..hash_size] of sixteen_bits; {heads of hash lists}

@ Initially all the hash lists are empty.

@<Local variables for init...@>=
@!h:0..hash_size; {index into hash-head array}

@ @<Set init...@>=
for h:=0 to hash_size-1 do hash[h]:=0;

@ Here now is the main procedure for finding tokens.

@p function lookup:name_pointer; {finds current token}
label found;
var i:0..buf_size; {index into |buffer|}
@!h:0..hash_size; {hash code}
@!k:0..max_bytes; {index into |byte_mem|}
@!l:0..buf_size; {length of the given token}
@!p:name_pointer; {where the token is being sought}
begin l:=id_loc-id_first; {compute the length}
@<Compute the hash code |h|@>;
@<Compute the name location |p|@>;
if p=name_ptr then @<Enter a new name into the table at position |p|@>;
lookup:=p;
end;

@ A simple hash code is used: If the sequence of
ASCII codes is $c_1c_2\ldots c_n$, its hash value will be
$$(2^{n-1}c_1+2^{n-2}c_2+\cdots+c_n)\,\bmod\,|hash_size|.$$

@<Compute the hash...@>=
h:=buffer[id_first]; i:=id_first+1;
while i<id_loc do
  begin h:=(h+h+buffer[i]) mod hash_size; incr(i);
  end

@ If the token is new, it will be placed in position |p=name_ptr|,
otherwise |p| will point to its existing location.

@<Compute the name location...@>=
p:=hash[h];
while p<>0 do
  begin if length(p)=l then
    @<Compare name |p| with current token,
      |goto found| if equal@>;
  p:=link[p];
  end;
p:=name_ptr; {the current token is new}
link[p]:=hash[h]; hash[h]:=p; {insert |p| at beginning of hash list}
found:

@ @<Compare name |p|...@>=
begin i:=id_first; k:=byte_start[p];
while (i<id_loc)and(buffer[i]=byte_mem[k]) do
  begin incr(i); incr(k);
  end;
if i=id_loc then goto found; {all characters agree}
end

@ When we begin the following segment of the program, |p=name_ptr|.
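
The hash computation and the search along the hash list can be summarized
by the following Python sketch. It is an illustration under stated
assumptions, not a transcription of \.{MFT}: the table size 353 is assumed
here, the list |names| is a hypothetical stand-in for the packed
|byte_mem| storage, and name 0 is reserved so that a zero link can mean
``end of list.''

```python
HASH_SIZE = 353  # assumed table size for this sketch


def hash_code(token):
    """Doubling hash: (2^(n-1)*c_1 + ... + c_n) mod HASH_SIZE."""
    h = ord(token[0])
    for ch in token[1:]:
        h = (h + h + ord(ch)) % HASH_SIZE
    return h


hash_heads = [0] * HASH_SIZE  # heads of hash lists; 0 means an empty list
link = [0]                    # link[p]: next name on the same hash list
names = [""]                  # names[p]: text of name p (stand-in for byte_mem)


def lookup(token):
    """Return the pointer of token, entering it into the table if it is new."""
    h = hash_code(token)
    p = hash_heads[h]
    while p != 0:
        if names[p] == token:  # the token must match character by character
            return p
        p = link[p]
    p = len(names)             # the token is new: it gets the next free slot
    names.append(token)
    link.append(hash_heads[h])  # insert p at the beginning of its hash list
    hash_heads[h] = p
    return p


first = lookup("fill")
second = lookup("draw")
again = lookup("fill")
```

Inserting at the head of the list keeps the insertion step constant-time;
the order of names within a list does not matter for correctness.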

@<Enter a new name...@>=
begin if byte_ptr+l>max_bytes then overflow('byte memory');
if name_ptr+1>max_names then overflow('name');
i:=id_first; {get ready to move the token into |byte_mem|}
while i<id_loc do
  begin byte_mem[byte_ptr]:=buffer[i]; incr(byte_ptr); incr(i);
  end;
incr(name_ptr); byte_start[name_ptr]:=byte_ptr;
@<Assign the default value to |ilk[p]|@>;
end

@* Initializing the primitive tokens.
Each token read by \.{MFT} is recognized as belonging to one of the
following ``types'':

@d indentation=0 {internal code for space at beginning of a line}
@d end_of_line=1 {internal code for hypothetical token at end of a line}
@d end_of_file=2 {internal code for hypothetical token at end of the input}
@d verbatim=3 {internal code for the token `\.{\%\%}'}
@d set_format=4 {internal code for the token `\.{\%\%\%}'}
@d mft_comment=5 {internal code for the token `\.{\%\%\%\%}'}
@d min_action_type=6 {smallest code for tokens that produce ``real'' output}
@d numeric_token=6 {internal code for tokens like `\.{3.14159}'}
@d string_token=7 {internal code for tokens like `|"pie"|'}
@d min_symbolic_token=8 {smallest internal code for a symbolic token}
@d op=8 {internal code for tokens like `\.{sqrt}'}
@d command=9 {internal code for tokens like `\.{addto}'}
@d endit=10 {internal code for tokens like `\.{fi}'}
@d binary=11 {internal code for tokens like `\.{and}'}
@d abinary=12 {internal code for tokens like `\.{+}'}
@d bbinary=13 {internal code for tokens like `\.{step}'}
@d ampersand=14 {internal code for the token `\.{\char`\&}'}
@d pyth_sub=15 {internal code for the token `\.{+-+}'}
@d as_is=16 {internal code for tokens like `\.{]}'}
@d bold=17 {internal code for tokens like `\.{nullpen}'}
@d type_name=18 {internal code for tokens like `\.{numeric}'}
@d path_join=19 {internal code for the token `\.{..}'}
@d
colon=20 {internal code for the token `\.:'}
@d semicolon=21 {internal code for the token `\.;'}
@d backslash=22 {internal code for the token `\.{\\}'}
@d double_back=23 {internal code for the token `\.{\\\\}'}
@d less_or_equal=24 {internal code for the token `\.{<=}'}
@d greater_or_equal=25 {internal code for the token `\.{>=}'}
@d not_equal=26 {internal code for the token `\.{<>}'}
@d sharp=27 {internal code for the token `\.{\char`\#}'}
@d comment=28 {internal code for the token `\.{\char`\%}'}
@d recomment=29 {internal code used to resume a comment after `\pb'}
@d min_suffix=30 {smallest code for symbolic tokens in suffixes}
@d internal=30 {internal code for tokens like `\.{pausing}'}
@d input_command=31 {internal code for tokens like `\.{input}'}
@d special_tag=32 {internal code for tags that take at most one subscript}
@d tag=33 {internal code for nonprimitive tokens}

@<Assign the default value to |ilk[p]|@>=ilk[p]:=tag

@ We have to get \MF's primitives into the hash table, and the
simplest way to do this is to insert them every time \.{MFT} is run.

A few macros permit us to do the initialization with a compact program.
We use the fact that the longest primitive is \.{intersectiontimes},
which is 17 letters long.

@d spr17(#)==buffer[17]:=#;cur_tok:=lookup;ilk[cur_tok]:=
@d spr16(#)==buffer[16]:=#;spr17
@d spr15(#)==buffer[15]:=#;spr16
@d spr14(#)==buffer[14]:=#;spr15
@d spr13(#)==buffer[13]:=#;spr14
@d spr12(#)==buffer[12]:=#;spr13
@d spr11(#)==buffer[11]:=#;spr12
@d spr10(#)==buffer[10]:=#;spr11
@d spr9(#)==buffer[9]:=#;spr10
@d spr8(#)==buffer[8]:=#;spr9
@d spr7(#)==buffer[7]:=#;spr8
@d spr6(#)==buffer[6]:=#;spr7
@d spr5(#)==buffer[5]:=#;spr6
@d spr4(#)==buffer[4]:=#;spr5
@d spr3(#)==buffer[3]:=#;spr4
@d spr2(#)==buffer[2]:=#;spr3
@d spr1(#)==buffer[1]:=#;spr2
@d pr1==id_first:=17; spr17
@d pr2==id_first:=16; spr16
@d pr3==id_first:=15; spr15
@d pr4==id_first:=14; spr14
@d pr5==id_first:=13; spr13
@d pr6==id_first:=12; spr12
@d pr7==id_first:=11; spr11
@d pr8==id_first:=10; spr10
@d pr9==id_first:=9; spr9
@d pr10==id_first:=8; spr8
@d pr11==id_first:=7; spr7
@d pr12==id_first:=6; spr6
@d pr13==id_first:=5; spr5
@d pr14==id_first:=4; spr4
@d pr15==id_first:=3; spr3
@d pr16==id_first:=2; spr2
@d pr17==id_first:=1; spr1

@ The intended use of the macros above might not be immediately obvious,
but the riddle is answered by the following:

@<Store all the primitives@>=
id_loc:=18;@/
pr2(".")(".")(path_join);@/
pr1("[")(as_is);@/
pr1("]")(as_is);@/
pr1("}")(as_is);@/
pr1("{")(as_is);@/
pr1(":")(colon);@/
pr2(":")(":")(colon);@/
pr3("|")("|")(":")(colon);@/
pr2(":")("=")(as_is);@/
pr1(",")(as_is);@/
pr1(";")(semicolon);@/
pr1("\")(backslash);@/
pr2("\")("\")(double_back);@/
pr5("a")("d")("d")("t")("o")(command);@/
pr2("a")("t")(bbinary);@/
pr7("a")("t")("l")("e")("a")("s")("t")(op);@/
pr10("b")("e")("g")("i")("n")("g")("r")("o")("u")("p")(command);
pr8("c")("o")("n")("t")("r")("o")("l")("s")(op);@/
pr4("c")("u")("l")("l")(command);@/
pr4("c")("u")("r")("l")(op);@/
pr10("d")("e")("l")("i")("m")("i")("t")("e")("r")("s")(command);@/
pr7("d")("i")("s")("p")("l")("a")("y")(command);@/
pr8("e")("n")("d")("g")("r")("o")("u")("p")(endit);@/
pr8("e")("v")("e")("r")("y")("j")("o")("b")(command);@/
pr6("e")("x")("i")("t")("i")("f")(command);@/
pr11("e")("x")("p")("a")("n")("d")("a")("f")("t")("e")("r")(command);@/
pr4("f")("r")("o")("m")(bbinary);@/
pr8("i")("n")("w")("i")("n")("d")("o")("w")(bbinary);@/
pr7("i")("n")("t")("e")("r")("i")("m")(command);@/
pr3("l")("e")("t")(command);@/
pr11("n")("e")("w")("i")("n")("t")("e")("r")("n")("a")("l")(command);@/
pr2("o")("f")(command);@/
pr10("o")("p")("e")("n")("w")("i")("n")("d")("o")("w")(command);@/
pr10("r")("a")("n")("d")("o")("m")("s")("e")("e")("d")(command);@/
pr4("s")("a")("v")("e")(command);@/
pr10("s")("c")("a")("n")("t")("o")("k")("e")("n")("s")(command);@/
pr7("s")("h")("i")("p")("o")("u")("t")(command);@/
pr4("s")("t")("e")("p")(bbinary);@/
pr3("s")("t")("r")(command);@/
pr7("t")("e")("n")("s")("i")("o")("n")(op);@/
pr2("t")("o")(bbinary);@/
pr5("u")("n")("t")("i")("l")(bbinary);@/
pr3("d")("e")("f")(command);@/
pr6("v")("a")("r")("d")("e")("f")(command);@/

@ (There are so many primitives, it's necessary to break this long
initialization code up into pieces so as not to overflow \.{WEAVE}'s capacity.)

@<Store all the primitives@>=
pr10("p")("r")("i")("m")("a")("r")("y")("d")("e")("f")(command);@/
pr12("s")("e")("c")("o")("n")("d")("a")("r")("y")("d")("e")("f")(command);@/
pr11("t")("e")("r")("t")("i")("a")("r")("y")("d")("e")("f")(command);@/
pr6("e")("n")("d")("d")("e")("f")(endit);@/
pr3("f")("o")("r")(command);@/
pr11("f")("o")("r")("s")("u")("f")("f")("i")("x")("e")("s")(command);@/
pr7("f")("o")("r")("e")("v")("e")("r")(command);@/
pr6("e")("n")("d")("f")("o")("r")(endit);@/
pr5("q")("u")("o")("t")("e")(command);@/
pr4("e")("x")("p")("r")(command);@/
pr6("s")("u")("f")("f")("i")("x")(command);@/
pr4("t")("e")("x")("t")(command);@/
pr7("p")("r")("i")("m")("a")("r")("y")(command);@/
pr9("s")("e")("c")("o")("n")("d")("a")("r")("y")(command);@/
pr8("t")("e")("r")("t")("i")("a")("r")("y")(command);@/
pr5("i")("n")("p")("u")("t")(input_command);@/
pr8("e")("n")("d")("i")("n")("p")("u")("t")(bold);@/
pr2("i")("f")(command);@/
pr2("f")("i")(endit);@/
pr4("e")("l")("s")("e")(command);@/
pr6("e")("l")("s")("e")("i")("f")(command);@/
pr4("t")("r")("u")("e")(bold);@/
pr5("f")("a")("l")("s")("e")(bold);@/
pr11("n")("u")("l")("l")("p")("i")("c")("t")("u")("r")("e")(bold);@/
pr7("n")("u")("l")("l")("p")("e")("n")(bold);@/
pr7("j")("o")("b")("n")("a")("m")("e")(bold);@/
pr10("r")("e")("a")("d")("s")("t")("r")("i")("n")("g")(bold);@/
pr9("p")("e")("n")("c")("i")("r")("c")("l")("e")(bold);@/
pr4("g")("o")("o")("d")(special_tag);@/
pr2("=")(":")(as_is);@/
pr3("=")(":")("|")(as_is);@/
pr4("=")(":")("|")(">")(as_is);@/
pr3("|")("=")(":")(as_is);@/
pr4("|")("=")(":")(">")(as_is);@/
pr4("|")("=")(":")("|")(as_is);@/
pr5("|")("=")(":")("|")(">")(as_is);@/
pr6("|")("=")(":")("|")(">")(">")(as_is);@/
pr4("k")("e")("r")("n")(binary);
pr6("s")("k")("i")("p")("t")("o")(command);@/
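
@ For readers puzzling over the |pr| macros: each one right-justifies a
primitive in |buffer| so that its last character lands in |buffer[17]|,
which is why |id_loc| is set to 18 once and an $n$-letter name gets
|id_first=18-n|. The Python sketch below imitates that placement. It is
only an illustration; |place_primitive| is a hypothetical helper, and the
subsequent |lookup| and |ilk| assignment are left out.

```python
ID_LOC = 18  # one past the last character of every primitive


def place_primitive(buffer, name):
    """Right-justify name so that its last character sits in buffer[17]."""
    n = len(name)
    assert 1 <= n <= 17  # the longest primitive has 17 letters
    id_first = ID_LOC - n  # what the macro for an n-letter name assigns
    for offset, ch in enumerate(name):
        buffer[id_first + offset] = ch  # one buffer store per character
    return id_first


buffer = [" "] * 32
first = place_primitive(buffer, "addto")
token = "".join(buffer[first:ID_LOC])
longest = place_primitive(buffer, "intersectiontimes")
```

With this layout only |id_first| varies from one primitive to the next, so
a single chain of seventeen one-character macros serves names of every
length.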

@ (Does anybody out there remember the commercials that went \.{LS-MFT}?)

@<Store all the prim...@>=
pr13("n")("o")("r")("m")("a")("l")("d")("e")("v")("i")("a")("t")("e")(op);@/
pr3("o")("d")("d")(op);@/
pr5("k")("n")("o")("w")("n")(op);@/
pr7("u")("n")("k")("n")("o")("w")("n")(op);@/
pr3("n")("o")("t")(op);@/
pr7("d")("e")("c")("i")("m")("a")("l")(op);@/
pr7("r")("e")("v")("e")("r")("s")("e")(op);@/
pr8("m")("a")("k")("e")("p")("a")("t")("h")(op);@/
pr7("m")("a")("k")("e")("p")("e")("n")(op);@/
pr11("t")("o")("t")("a")("l")("w")("e")("i")("g")("h")("t")(op);@/
pr3("o")("c")("t")(op);@/
pr3("h")("e")("x")(op);@/
pr5("A")("S")("C")("I")("I")(op);@/
pr4("c")("h")("a")("r")(op);@/
pr6("l")("e")("n")("g")("t")("h")(op);@/
pr13("t")("u")("r")("n")("i")("n")("g")("n")("u")("m")("b")("e")("r")(op);@/
pr5("x")("p")("a")("r")("t")(op);@/
pr5("y")("p")("a")("r")("t")(op);@/
pr6("x")("x")("p")("a")("r")("t")(op);@/
pr6("x")("y")("p")("a")("r")("t")(op);@/
pr6("y")("x")("p")("a")("r")("t")(op);@/
pr6("y")("y")("p")("a")("r")("t")(op);@/
pr4("s")("q")("r")("t")(op);@/
pr4("m")("e")("x")("p")(op);@/
pr4("m")("l")("o")("g")(op);@/
pr4("s")("i")("n")("d")(op);@/
pr4("c")("o")("s")("d")(op);@/
pr5("f")("l")("o")("o")("r")(op);@/
pr14("u")("n")("i")("f")("o")("r")("m")("d")("e")("v")("i")("a")("t")("e")(op);
@/
pr10("c")("h")("a")("r")("e")("x")("i")("s")("t")("s")(op);@/
pr5("a")("n")("g")("l")("e")(op);@/
pr5("c")("y")("c")("l")("e")(op);@/

@ (If you think this \.{WEB} code is ugly, you should see the Pascal code
it produces.)

@<Store all the primitives@>=
pr13("t")("r")("a")("c")("i")("n")("g")
 ("t")("i")("t")("l")("e")("s")(internal);@/
pr16("t")("r")("a")("c")("i")("n")("g")
 ("e")("q")("u")("a")("t")("i")("o")("n")("s")(internal);@/
pr15("t")("r")("a")("c")("i")("n")("g")
 ("c")("a")("p")("s")("u")("l")("e")("s")(internal);@/
pr14("t")("r")("a")("c")("i")("n")("g")
 ("c")("h")("o")("i")("c")("e")("s")(internal);@/
pr12("t")("r")("a")("c")("i")("n")("g")
 ("s")("p")("e")("c")("s")(internal);@/
pr11("t")("r")("a")("c")("i")("n")("g")
 ("p")("e")("n")("s")(internal);@/
pr15("t")("r")("a")("c")("i")("n")("g")
 ("c")("o")("m")("m")("a")("n")("d")("s")(internal);@/
pr13("t")("r")("a")("c")("i")("n")("g")
 ("m")("a")("c")("r")("o")("s")(internal);@/
pr12("t")("r")("a")("c")("i")("n")("g")
 ("e")("d")("g")("e")("s")(internal);@/
pr13("t")("r")("a")("c")("i")("n")("g")
 ("o")("u")("t")("p")("u")("t")(internal);@/
pr12("t")("r")("a")("c")("i")("n")("g")
 ("s")("t")("a")("t")("s")(internal);@/
pr13("t")("r")("a")("c")("i")("n")("g")
 ("o")("n")("l")("i")("n")("e")(internal);@/
pr15("t")("r")("a")("c")("i")("n")("g")
 ("r")("e")("s")("t")("o")("r")("e")("s")(internal);@/

@ @<Store all the primitives@>=
pr4("y")("e")("a")("r")(internal);@/
pr5("m")("o")("n")("t")("h")(internal);@/
pr3("d")("a")("y")(internal);@/
pr4("t")("i")("m")("e")(internal);@/
pr8("c")("h")("a")("r")("c")("o")("d")("e")(internal);@/
pr7("c")("h")("a")("r")("e")("x")("t")(internal);@/
pr6("c")("h")("a")("r")("w")("d")(internal);@/
pr6("c")("h")("a")("r")("h")("t")(internal);@/
pr6("c")("h")("a")("r")("d")("p")(internal);@/
pr6("c")("h")("a")("r")("i")("c")(internal);@/
pr6("c")("h")("a")("r")("d")("x")(internal);@/
pr6("c")("h")("a")("r")("d")("y")(internal);@/
pr10("d")("e")("s")("i")("g")("n")("s")("i")("z")("e")(internal);@/
pr4("h")("p")("p")("p")(internal);@/
pr4("v")("p")("p")("p")(internal);@/
pr7("x")("o")("f")("f")("s")("e")("t")(internal);@/
pr7("y")("o")("f")("f")("s")("e")("t")(internal);@/
pr7("p")("a")("u")("s")("i")("n")("g")(internal);@/
pr12("s")("h")("o")("w")
 ("s")("t")("o")("p")("p")("i")("n")("g")(internal);@/
pr10("f")("o")("n")("t")("m")("a")("k")("i")("n")("g")(internal);@/
pr8("p")("r")("o")("o")("f")("i")("n")("g")(internal);@/
pr9("s")("m")("o")("o")("t")("h")("i")("n")("g")(internal);@/
pr12("a")("u")("t")("o")("r")("o")("u")("n")("d")("i")("n")("g")(internal);@/
pr11("g")("r")("a")("n")("u")("l")("a")("r")("i")("t")("y")(internal);@/
pr6("f")("i")("l")("l")("i")("n")(internal);@/
pr12("t")("u")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal);@/
pr12("w")("a")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal);@/
pr12("b")("o")("u")("n")("d")("a")("r")("y")("c")("h")("a")("r")(internal);@/

@ Still more.

@<Store all the prim...@>=
pr1("+")(abinary);@/
pr1("-")(abinary);@/
pr1("*")(abinary);@/
pr1("/")(as_is);@/
pr2("+")("+")(binary);@/
pr3("+")("-")("+")(pyth_sub);@/
pr3("a")("n")("d")(binary);@/
pr2("o")("r")(binary);@/
pr1("<")(as_is);@/
pr2("<")("=")(less_or_equal);@/
pr1(">")(as_is);@/
pr2(">")("=")(greater_or_equal);@/
pr1("=")(as_is);@/
pr2("<")(">")(not_equal);@/
pr9("s")("u")("b")("s")("t")("r")("i")("n")("g")(command);@/
pr7("s")("u")("b")("p")("a")("t")("h")(command);@/
pr13("d")("i")("r")("e")("c")("t")("i")("o")("n")@|
 ("t")("i")("m")("e")(command);@/
pr5("p")("o")("i")("n")("t")(command);@/
pr10("p")("r")("e")("c")("o")("n")("t")("r")("o")("l")(command);@/
pr11("p")("o")("s")("t")("c")("o")("n")("t")("r")("o")("l")(command);@/
pr9("p")("e")("n")("o")("f")("f")("s")("e")("t")(command);@/
pr1("&")(ampersand);@/
pr7("r")("o")("t")("a")("t")("e")("d")(binary);@/
pr7("s")("l")("a")("n")("t")("e")("d")(binary);@/
pr6("s")("c")("a")("l")("e")("d")(binary);@/
pr7("s")("h")("i")("f")("t")("e")("d")(binary);@/
pr11("t")("r")("a")("n")("s")("f")("o")("r")("m")("e")("d")(binary);@/
pr7("x")("s")("c")("a")("l")("e")("d")(binary);@/
pr7("y")("s")("c")("a")("l")("e")("d")(binary);@/
pr7("z")("s")("c")("a")("l")("e")("d")(binary);@/
pr17("i")("n")("t")("e")("r")("s")("e")("c")("t")("i")("o")("n")@|
 ("t")("i")("m")("e")("s")(binary);@/
pr7("n")("u")("m")("e")("r")("i")("c")(type_name);@/
pr6("s")("t")("r")("i")("n")("g")(type_name);@/
pr7("b")("o")("o")("l")("e")("a")("n")(type_name);@/
pr4("p")("a")("t")("h")(type_name);@/
pr3("p")("e")("n")(type_name);@/
pr7("p")("i")("c")("t")("u")("r")("e")(type_name);@/
pr9("t")("r")("a")("n")("s")("f")("o")("r")("m")(type_name);@/
pr4("p")("a")("i")("r")(type_name);@/

@ At last we are done with the tedious
initialization of primitives.

@<Store all the prim...@>=
pr3("e")("n")("d")(endit);@/
pr4("d")("u")("m")("p")(endit);@/
pr9("b")("a")("t")("c")("h")("m")("o")("d")("e")(bold);
pr11("n")("o")("n")("s")("t")("o")("p")("m")("o")("d")("e")(bold);
pr10("s")("c")("r")("o")("l")("l")("m")("o")("d")("e")(bold);
pr13("e")("r")("r")("o")("r")("s")("t")("o")("p")@|
 ("m")("o")("d")("e")(bold);
pr5("i")("n")("n")("e")("r")(command);@/
pr5("o")("u")("t")("e")("r")(command);@/
pr9("s")("h")("o")("w")("t")("o")("k")("e")("n")(command);@/
pr9("s")("h")("o")("w")("s")("t")("a")("t")("s")(bold);@/
pr4("s")("h")("o")("w")(command);@/
pr12("s")("h")("o")("w")("v")("a")("r")("i")("a")("b")("l")("e")(command);@/
pr16("s")("h")("o")("w")@|
 ("d")("e")("p")("e")("n")("d")("e")("n")("c")("i")("e")("s")(bold);@/
pr7("c")("o")("n")("t")("o")("u")("r")(command);@/
pr10("d")("o")("u")("b")("l")("e")("p")("a")("t")("h")(command);@/
pr4("a")("l")("s")("o")(command);@/
pr7("w")("i")("t")("h")("p")("e")("n")(command);@/
pr10("w")("i")("t")("h")("w")("e")("i")("g")("h")("t")(command);@/
pr8("d")("r")("o")("p")("p")("i")("n")("g")(command);@/
pr7("k")("e")("e")("p")("i")("n")("g")(command);@/
pr7("m")("e")("s")("s")("a")("g")("e")(command);@/
pr10("e")("r")("r")("m")("e")("s")("s")("a")("g")("e")(command);@/
pr7("e")("r")("r")("h")("e")("l")("p")(command);@/
pr8("c")("h")("a")("r")("l")("i")("s")("t")(command);@/
pr8("l")("i")("g")("t")("a")("b")("l")("e")(command);@/
pr10("e")("x")("t")("e")("n")("s")("i")("b")("l")("e")(command);@/
pr10("h")("e")("a")("d")("e")("r")("b")("y")("t")("e")(command);@/
pr9("f")("o")("n")("t")("d")("i")("m")("e")("n")(command);@/
pr7("s")("p")("e")("c")("i")("a")("l")(command);@/
pr10("n")("u")("m")("s")("p")("e")("c")("i")("a")("l")(command);@/
pr1("%")(comment);@/
pr2("%")("%")(verbatim);@/
pr3("%")("%")("%")(set_format);@/
pr4("%")("%")("%")("%")(mft_comment);@/
pr1("#")(sharp);@/

@ We also want to store a few other strings of characters that are
used in \.{MFT}'s translation to \TeX\ code.

@d ttr1(#)==byte_mem[byte_ptr-1]:=#; cur_tok:=name_ptr;
  incr(name_ptr); byte_start[name_ptr]:=byte_ptr
@d ttr2(#)==byte_mem[byte_ptr-2]:=#; ttr1
@d ttr3(#)==byte_mem[byte_ptr-3]:=#; ttr2
@d ttr4(#)==byte_mem[byte_ptr-4]:=#; ttr3
@d ttr5(#)==byte_mem[byte_ptr-5]:=#; ttr4
@d tr1==incr(byte_ptr); ttr1
@d tr2==byte_ptr:=byte_ptr+2; ttr2
@d tr3==byte_ptr:=byte_ptr+3; ttr3
@d tr4==byte_ptr:=byte_ptr+4; ttr4
@d tr5==byte_ptr:=byte_ptr+5; ttr5

@<Glob...@>=
@!translation:array[ASCII_code] of name_pointer;
@!i:ASCII_code; {index into |translation|}

@ @<Store all the translations@>=
for i:=0 to 255 do translation[i]:=0;
tr2("\")("$"); translation["$"]:=cur_tok;@/
tr2("\")("#"); translation["#"]:=cur_tok;@/
tr2("\")("&"); translation["&"]:=cur_tok;@/
tr2("\")("{"); translation["{"]:=cur_tok;@/
tr2("\")("}"); translation["}"]:=cur_tok;@/
tr2("\")("_"); translation["_"]:=cur_tok;@/
tr2("\")("%"); translation["%"]:=cur_tok;@/
tr4("\")("B")("S")(" "); translation["\"]:=cur_tok;@/
tr4("\")("H")("A")(" "); translation["^"]:=cur_tok;@/
tr4("\")("T")("I")(" "); translation["~"]:=cur_tok;@/
tr5("\")("a")("s")("t")(" "); translation["*"]:=cur_tok;@/
tr4("\")("A")("M")(" "); tr_amp:=cur_tok;@/
@.\\AM, etc@>
tr4("\")("B")("L")(" "); tr_skip:=cur_tok;@/
tr4("\")("S")("H")(" "); tr_sharp:=cur_tok;@/
tr4("\")("P")("S")(" "); tr_ps:=cur_tok;@/
tr4("\")("l")("e")(" "); tr_le:=cur_tok;@/
tr4("\")("g")("e")(" "); tr_ge:=cur_tok;@/
tr4("\")("n")("e")(" "); tr_ne:=cur_tok;@/
tr5("\")("q")("u")("a")("d"); tr_quad:=cur_tok;@/

@ @<Glob...@>=
@!tr_le,@!tr_ge,@!tr_ne,@!tr_amp,@!tr_sharp,@!tr_skip,@!tr_ps,
  @!tr_quad:name_pointer; {special translations}

@* Inputting the next token.
\.{MFT}'s lexical scanning routine is called |get_next|. This procedure
inputs the next token of \MF\ input and puts its encoded meaning into
two global variables, |cur_type| and |cur_tok|.

@<Glob...@>=
@!cur_type:eight_bits; {type of token just scanned}
@!cur_tok:integer; {hash table or buffer location}
@!prev_type:eight_bits; {previous value of |cur_type|}
@!prev_tok:integer; {previous value of |cur_tok|}

@ @<Set init...@>=
cur_type:=end_of_line; cur_tok:=0;

@ Two global state variables affect the behavior of |get_next|: A space
will be considered significant when |start_of_line| is |true|,
and the buffer will be considered devoid of information when |empty_buffer|
is |true|.

@<Glob...@>=
@!start_of_line:boolean; {has the current line had nothing but spaces so far?}
@!empty_buffer:boolean; {is it time to input a new line?}

@ The 256 |ASCII_code| characters are grouped into classes by means of
the |char_class| table. Individual class numbers have no semantic
or syntactic significance, except in a few instances defined here.
There's also |max_class|, which can be used as a basis for additional
class numbers in nonstandard extensions of \MF.

@d digit_class=0 {the class number of \.{0123456789}}
@d period_class=1 {the class number of `\..'}
@d space_class=2 {the class number of spaces and nonstandard characters}
@d percent_class=3 {the class number of `\.\%'}
@d string_class=4 {the class number of `\."'}
@d right_paren_class=8 {the class number of `\.)'}
@d isolated_classes==5,6,7,8 {characters that make length-one tokens only}
@d letter_class=9 {letters and the underline character}
@d left_bracket_class=17 {`\.['}
@d right_bracket_class=18 {`\.]'}
@d invalid_class=20 {bad character in the input}
@d end_line_class=21 {end of an input line (\.{MFT} only)}
@d max_class=21 {the largest class number}

@<Glob...@>=
@!char_class:array[ASCII_code] of 0..max_class; {the class numbers}

@ If changes are made to accommodate non-ASCII character sets, they should be
essentially the same in \.{MFT} as in \MF. However, \.{MFT} has an additional
class number, the |end_line_class|, which is used only for the special
character |carriage_return| that is placed at the end of the input buffer.
@^character set dependencies@>
@^system dependencies@>

@d carriage_return=@'15 {special code placed in |buffer[limit]|}

@<Set init...@>=
for i:="0" to "9" do char_class[i]:=digit_class;
char_class["."]:=period_class;
char_class[" "]:=space_class;
char_class["%"]:=percent_class;
char_class[""""]:=string_class;@/
char_class[","]:=5;
char_class[";"]:=6;
char_class["("]:=7;
char_class[")"]:=right_paren_class;
for i:="A" to "Z" do char_class[i]:=letter_class;
for i:="a" to "z" do char_class[i]:=letter_class;
char_class["_"]:=letter_class;@/
char_class["<"]:=10;
char_class["="]:=10;
char_class[">"]:=10;
char_class[":"]:=10;
char_class["|"]:=10;@/
char_class["`"]:=11;
char_class["'"]:=11;@/
char_class["+"]:=12;
char_class["-"]:=12;@/
char_class["/"]:=13;
char_class["*"]:=13;
char_class["\"]:=13;@/
char_class["!"]:=14;
char_class["?"]:=14;@/
char_class["#"]:=15;
char_class["&"]:=15;
char_class["@@"]:=15;
char_class["$"]:=15;@/
char_class["^"]:=16;
char_class["~"]:=16;@/
char_class["["]:=left_bracket_class;
char_class["]"]:=right_bracket_class;@/
char_class["{"]:=19;
char_class["}"]:=19;@/
for i:=0 to " "-1 do char_class[i]:=invalid_class;
char_class[carriage_return]:=end_line_class;@/
for i:=127 to 255 do char_class[i]:=invalid_class;

@ And now we're ready to take the plunge into |get_next| itself.
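
Before taking that plunge, it may help to see how the class table drives
token formation: apart from the isolated classes, a symbolic token is a
maximal run of characters that share one class. The Python sketch below
shows only that rule. It is a simplified illustration, not \.{MFT}'s
scanner: digits, periods, strings, comments, and invalid characters are
deliberately left out, and |symbolic_tokens| is a hypothetical name.

```python
import string

LETTER_CLASS = 9
SPACE_CLASS = 2
ISOLATED_CLASSES = {5, 6, 7, 8}  # these characters form length-one tokens only

# Class numbers copied from the initialization above (symbolic part only).
char_class = {ch: LETTER_CLASS for ch in string.ascii_letters + "_"}
char_class[" "] = SPACE_CLASS
for group, cls in [(",", 5), (";", 6), ("(", 7), (")", 8),
                   ("<=>:|", 10), ("`'", 11), ("+-", 12), ("/*\\", 13),
                   ("!?", 14), ("#&@$", 15), ("^~", 16),
                   ("[", 17), ("]", 18), ("{}", 19)]:
    for ch in group:
        char_class[ch] = cls


def symbolic_tokens(line):
    """Split a line into runs of same-class characters, skipping spaces."""
    tokens = []
    loc = 0
    while loc < len(line):
        start = loc
        cls = char_class[line[loc]]  # KeyError for characters not modeled here
        loc += 1
        if cls == SPACE_CLASS:
            continue
        if cls not in ISOLATED_CLASSES:
            while loc < len(line) and char_class[line[loc]] == cls:
                loc += 1
        tokens.append(line[start:loc])
    return tokens


assignment = symbolic_tokens("z:=x+-+y;")
nested = symbolic_tokens("((a))")
```

Grouping by class is what lets `\.{:=}' and `\.{+-+}' come out as single
tokens, while two adjacent parentheses remain two tokens.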

@d switch=25 {a label in |get_next|}
@d pass_digits=85 {another}
@d pass_fraction=86 {and still another, although |goto| is considered harmful}

@p procedure get_next; {sets |cur_type| and |cur_tok| to next token}
label switch,pass_digits,pass_fraction,done,found,exit;
var @!c:ASCII_code; {the current character in the buffer}
@!class:ASCII_code; {its class number}
begin prev_type:=cur_type; prev_tok:=cur_tok;
if empty_buffer then
  @<Bring in a new line of input; |return| if the file has ended@>;
switch: c:=buffer[loc]; id_first:=loc; incr(loc); class:=char_class[c];
@<Branch on the |class|, scan the token; |return| directly if the
  token is special, or |goto found| if it needs to be looked up@>;
found:id_loc:=loc; cur_tok:=lookup; cur_type:=ilk[cur_tok];
exit:end;

@ @d emit(#)==@t@>@+begin cur_type:=#; cur_tok:=id_first; return;@+end

@<Branch on the |class|...@>=
case class of
digit_class:goto pass_digits;
period_class:begin class:=char_class[buffer[loc]];
  if class>period_class then goto switch {ignore isolated `\..'}
  else if class<period_class then goto pass_fraction; {|class=digit_class|}
  end;
space_class:if start_of_line then emit(indentation)
  else goto switch;
end_line_class: emit(end_of_line);
string_class:@<Get a string token and |return|@>;
isolated_classes: goto found;
invalid_class:@<Decry the invalid character and |goto switch|@>;
othercases do_nothing {letters, etc.}
endcases;@/
while char_class[buffer[loc]]=class do incr(loc);
goto found;
pass_digits: while char_class[buffer[loc]]=digit_class do incr(loc);
if buffer[loc]<>"."
  then goto done;
if char_class[buffer[loc+1]]<>digit_class then goto done;
incr(loc);
pass_fraction:repeat incr(loc);
until char_class[buffer[loc]]<>digit_class;
done:emit(numeric_token)

@ @<Get a string token and |return|@>=
loop@+begin if buffer[loc]="""" then
  begin incr(loc); emit(string_token);
  end;
if loc=limit then @<Decry the missing string delimiter and |goto switch|@>;
incr(loc);
end

@ @<Decry the missing string delimiter and |goto switch|@>=
begin err_print('! Incomplete string will be ignored'); goto switch;
@.Incomplete string...@>
end

@ @<Decry the invalid character and |goto switch|@>=
begin err_print('! Invalid character will be ignored'); goto switch;
@.Invalid character...@>
end

@ @<Bring in a new line of input; |return| if the file has ended@>=
begin get_line;
if input_has_ended then emit(end_of_file);
buffer[limit]:=carriage_return; loc:=0; start_of_line:=true;
empty_buffer:=false;
end

@* Low-level output routines.
The \TeX\ output is supposed to appear in lines at most |line_length|
characters long, so we place it into an output buffer. During the output
process, |out_line| will hold the current line number of the line about to
be output.

@<Glo...@>=
@!out_buf:array[0..line_length] of ASCII_code; {assembled characters}
@!out_ptr:0..line_length; {number of characters in |out_buf|}
@!out_line: integer; {coordinates of next line to be output}

@ The |flush_buffer| routine empties the buffer up to a given breakpoint,
and moves any remaining characters to the beginning of the next line.
If the |per_cent| parameter is |true|, a |"%"| is appended to the line
that is being output; in this case the breakpoint |b| should be strictly
less than |line_length|.
If the |per_cent| parameter is |false|,
trailing blanks are suppressed.
The characters emptied from the buffer form a new line of output.

@p procedure flush_buffer(@!b:eight_bits;@!per_cent:boolean);
  {outputs |out_buf[1..b]|, where |b<=out_ptr|}
label done;
var j,@!k:0..line_length;
begin j:=b;
if not per_cent then {remove trailing blanks}
  loop@+ begin if j=0 then goto done;
    if out_buf[j]<>" " then goto done;
    decr(j);
    end;
done: for k:=1 to j do write(tex_file,xchr[out_buf[k]]);
if per_cent then write(tex_file,xchr["%"]);
write_ln(tex_file); incr(out_line);
if b<out_ptr then for k:=b+1 to out_ptr do out_buf[k-b]:=out_buf[k];
out_ptr:=out_ptr-b;
end;

@ \.{MFT} calls |flush_buffer(out_ptr,false)| before it has input
anything. We initialize the output variables
so that the first line of the output file will be `\.{\\input mftmac}'.
@.\\input mftmac@>
@.mftmac@>

@<Set init...@>=
out_ptr:=1; out_buf[1]:=" "; out_line:=1; write(tex_file,'\input mftmac');

@ When we wish to append the character |c| to the output buffer, we write
`$|out|(c)$'; this will cause the buffer to be emptied if it was already
full. Similarly, `$|out2|(c_1)(c_2)$' appends a pair of characters.
A line break will occur at a space or after a single-nonletter
\TeX\ control sequence.

@d oot(#)==@;@/
  if out_ptr=line_length then break_out;
  incr(out_ptr); out_buf[out_ptr]:=#;
@d oot1(#)==oot(#)@+end
@d oot2(#)==oot(#)@,oot1
@d oot3(#)==oot(#)@,oot2
@d oot4(#)==oot(#)@,oot3
@d oot5(#)==oot(#)@,oot4
@d out==@+begin oot1
@d out2==@+begin oot2
@d out3==@+begin oot3
@d out4==@+begin oot4
@d out5==@+begin oot5

@ The |break_out| routine is called just before the output buffer is about
to overflow.
To make this routine a little faster, we initialize position
0 of the output buffer to `\.\\'; this character isn't really output.

@<Set init...@>=
out_buf[0]:="\";

@ A long line is broken at a blank space or just before a backslash that isn't
preceded by another backslash. In the latter case, a |"%"| is output at
the break. (This policy has a known bug, in the rare situation that the
backslash was in a string constant that's being output ``verbatim.'')

@p procedure break_out; {finds a way to break the output line}
label exit;
var k:0..line_length; {index into |out_buf|}
@!d:ASCII_code; {character from the buffer}
begin k:=out_ptr;
loop@+ begin if k=0 then
    @<Print warning message, break the line, |return|@>;
  d:=out_buf[k];
  if d=" " then
    begin flush_buffer(k,false); return;
    end;
  if (d="\")and(out_buf[k-1]<>"\") then {in this case |k>1|}
    begin flush_buffer(k-1,true); return;
    end;
  decr(k);
  end;
exit:end;

@ We get to this module only in the unusual case that the entire output line
consists of a string of backslashes followed by a string of nonblank
non-backslashes. In such cases it is almost always safe to break the
line by putting a |"%"| just before the last character.

@<Print warning message...@>=
begin print_nl('! Line had to be broken (output l.',out_line:1);
@.Line had to be broken@>
print_ln('):');
for k:=1 to out_ptr-1 do print(xchr[out_buf[k]]);
new_line; mark_harmless;
flush_buffer(out_ptr-1,true); return;
end

@ To output a string of bytes from |byte_mem|, we call |out_str|.

@p procedure out_str(@!p:name_pointer); {outputs a string}
var @!k:0..max_bytes; {index into |byte_mem|}
begin for k:=byte_start[p] to byte_start[p+1]-1 do out(byte_mem[k]);
end;

@ The |out_name| subroutine is used to output a symbolic token.
Unusual characters are translated into forms that won't screw up.

@p procedure out_name(@!p:name_pointer); {outputs a name}
var @!k:0..max_bytes; {index into |byte_mem|}
@!t:name_pointer; {translation of character being output, if any}
begin for k:=byte_start[p] to byte_start[p+1]-1 do
  begin t:=translation[byte_mem[k]];
  if t=0 then out(byte_mem[k])
  else out_str(t);
  end;
end;

@ We often want to output a name after calling a numeric macro
(e.g., `\.{\\1\{foo\}}').

@p procedure out_mac_and_name(@!n:ASCII_code; @!p:name_pointer);
begin out("\"); out(n);
if length(p)=1 then out_name(p)
else begin out("{"); out_name(p); out("}");
  end;
end;

@ Here's a routine that simply copies from the input buffer to the output
buffer.

@p procedure copy(@!first_loc:integer); {output |buffer[first_loc..loc-1]|}
var @!k:0..buf_size; {|buffer| location being copied}
begin for k:=first_loc to loc-1 do out(buffer[k]);
end;

@* Translation.
The main work of \.{MFT} is accomplished by a routine that translates
the tokens, one by one, with a limited amount of lookahead/lookbehind.
Automata theorists might loosely call this a ``finite state transducer,''
because the flow of control is comparatively simple.
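
To make the idea of token-by-token translation with lookbehind concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not MFT's code. The function names and the `MACRO_CHAR` table are hypothetical; the table entries are taken from the cases that translate primitive tokens, and the brace rule follows `out_mac_and_name`. Token types are supplied by hand here, whereas MFT obtains them from `lookup` and `ilk`. The sketch omits character translation, math shifts, spacing, and every case other than the simple primitives.

```python
# Macro character emitted for each simple primitive type
# (as in the cases that translate primitive tokens).
MACRO_CHAR = {
    "op": "1",
    "command": "2",
    "endit": "3",
    "bbinary": "4",
    "bold": "5",
    "binary": "6",
    "path_join": "8",
    "colon": "?",
}


def out_mac_and_name(n, name):
    """Backslash, macro character, then the name; braces unless it has length 1."""
    if len(name) == 1:
        return "\\" + n + name
    return "\\" + n + "{" + name + "}"


def translate(tokens):
    """Translate (type, name) pairs one at a time, remembering the previous type."""
    pieces = []
    prev_type = None
    for cur_type, name in tokens:
        if cur_type == "type_name":
            # The only lookbehind in this sketch: a type name that follows
            # a command is formatted like an operator.
            n = "1" if prev_type == "command" else "2"
        else:
            n = MACRO_CHAR[cur_type]
        pieces.append(out_mac_and_name(n, name))
        prev_type = cur_type
    return "".join(pieces)


if __name__ == "__main__":
    print(translate([("op", "sqrt")]))
    print(translate([("command", "show"), ("type_name", "numeric")]))
```

The single remembered value `prev_type` is all the state this fragment needs, which is the sense in which the flow of control is comparatively simple.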

@p procedure do_the_translation;
label restart,reswitch,done,exit;
var @!k:0..buf_size; {looks ahead in the buffer}
@!t:integer; {type that spreads to new tokens}
begin restart:if out_ptr>0 then flush_buffer(out_ptr,false);
empty_buffer:=true;
loop@+ begin get_next;
  if start_of_line then @<Do special actions at the start of a line@>;
  reswitch:case cur_type of
  numeric_token:@<Translate a numeric token or a fraction@>;
  string_token:@<Translate a string token@>;
  indentation:out_str(tr_quad);
  end_of_line,mft_comment:@<Wind up a line of translation and |goto restart|,
    or finish a \pb\ segment and |goto reswitch|@>;
  end_of_file:return;
  @t\4@> @<Cases that translate primitive tokens@>@;
  comment,recomment:@<Translate a comment and |goto restart|,
    unless there's a \pb\ segment@>;
  verbatim:@<Copy the rest of the current input line to the output,
    then |goto restart|@>;
  set_format:@<Change the translation format of tokens,
    and |goto restart| or |reswitch|@>;
  internal,special_tag,tag:@<Translate a tag and possible subscript@>;
  end; {all cases have been listed}
  end;
exit:end;

@ @<Do special actions at the start of a line@>=
if cur_type>=min_action_type then
  begin out("$"); start_of_line:=false;
  case cur_type of
  endit:out2("\")("!");
@.\\!@>
  binary,abinary,bbinary,ampersand,pyth_sub:out2("{")("}");
@.\{\}@>
  othercases do_nothing
  endcases;
  end
else if cur_type=end_of_line then
  begin out_str(tr_skip); goto restart;
  end
else if cur_type=mft_comment then goto restart

@ Let's start with some of the easier translations, so that the harder
ones will also be easy when we get to them. A string like |"cat"|
comes out `\.{\\7"cat"}'.
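
The string case is simple because a string token is identified by its position in the input buffer rather than by a name: `emit` sets `cur_tok` to `id_first`, and `copy(cur_tok)` then copies `buffer[cur_tok..loc-1]`. The Python sketch below combines the string scan of `get_next` with this translation. It is an assumption-laden illustration: the function name `translate_string` is hypothetical, the end of the Python string stands in for `limit`, and an exception replaces MFT's behavior of printing an error and resuming the scan.

```python
def translate_string(buffer, id_first):
    """Scan the string token that starts at buffer[id_first] (an opening quote).

    Returns the translated text and the position just past the closing quote.
    """
    limit = len(buffer)  # stands in for MFT's limit, where carriage_return sits
    loc = id_first + 1
    while True:
        if loc < limit and buffer[loc] == '"':
            loc += 1
            # Equivalent of out2("\")("7") followed by copy(cur_tok).
            return "\\7" + buffer[id_first:loc], loc
        if loc >= limit:
            # MFT reports this error and continues scanning instead of raising.
            raise ValueError("Incomplete string will be ignored")
        loc += 1


if __name__ == "__main__":
    print(translate_string('"cat" x', 0))
```

The quotes themselves are part of the copied text, which is why the result is `\7"cat"` rather than `\7cat`.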

@<Translate a string token@>=
begin out2("\")("7"); copy(cur_tok);
@.\\7@>
end

@ Similarly, the translation of `\.{sqrt}' is `\.{\\1\{sqrt\}}'.

@<Cases that translate primitive tokens@>=
op: out_mac_and_name("1",cur_tok);
@.\\1@>
command: out_mac_and_name("2",cur_tok);
@.\\2@>
type_name: if prev_type=command then out_mac_and_name("1",cur_tok)
  else out_mac_and_name("2",cur_tok);
endit: out_mac_and_name("3",cur_tok);
@.\\3@>
bbinary: out_mac_and_name("4",cur_tok);
@.\\4@>
bold: out_mac_and_name("5",cur_tok);
@.\\5@>
binary: out_mac_and_name("6",cur_tok);
@.\\6@>
path_join: out_mac_and_name("8",cur_tok);
@.\\8@>
colon: out_mac_and_name("?",cur_tok);
@.\\?@>

@ Here are a few more easy cases.

@<Cases that translate primitive tokens@>=
as_is,sharp,abinary: out_name(cur_tok);
double_back: out2("\")(";");
@.\\;@>
semicolon: begin out_name(cur_tok); get_next;
  if cur_type<>end_of_line then if cur_type<>endit then out2("\")(" ");
@.\\\char32@>
  goto reswitch;
  end;

@ Some of the primitives have a fixed output (independent of |cur_tok|):

@<Cases that translate primitive tokens@>=
backslash:out_str(translation["\"]);
pyth_sub:out_str(tr_ps);
less_or_equal:out_str(tr_le);
greater_or_equal:out_str(tr_ge);
not_equal:out_str(tr_ne);
ampersand:out_str(tr_amp);

@ The remaining primitive is slightly special.

@<Cases that translate primitive tokens@>=
input_command: begin out_mac_and_name("2",cur_tok);
  out5("\")("h")("b")("o")("x");
  @<Scan the file name and output it in \.{typewriter type}@>;
  end;

@ File names have different formats on different computers, so we don't scan
them with |get_next|.
Here we use
a rule that probably covers most cases satisfactorily: We ignore leading
blanks, then consider the file name to consist of all subsequent characters
up to the first blank, semicolon, comment, or end-of-line.
(A |carriage_return| appears at the end of the line.)

@<Scan the file name and output it in \.{typewriter type}@>=
while buffer[loc]=" " do incr(loc);
out5("{")("\")("t")("t")(" ");
while (buffer[loc]<>" ")and(buffer[loc]<>"%")and(buffer[loc]<>";")
  and(loc<limit) do
  begin out(buffer[loc]); incr(loc);
  end;
out("}")

@ @<Translate a numeric token or a fraction@>=
if buffer[loc]="/" then
  if char_class[buffer[loc+1]]=digit_class then {it's a fraction}
    begin out5("\")("f")("r")("a")("c"); copy(cur_tok); get_next;
@.\\frac@>
    out2("/")("{"); get_next; copy(cur_tok); out("}");
    end
  else copy(cur_tok)
else copy(cur_tok)

@ @<Translate a tag and possible subscript@>=
begin if length(cur_tok)=1 then out_name(cur_tok)
else out_mac_and_name("\",cur_tok);
@.\\\\@>
get_next;
if byte_mem[byte_start[prev_tok]]="'" then goto reswitch;
case prev_type of
internal:begin if (cur_type=numeric_token)or(cur_type>=min_suffix) then
    out2("\")(",");
@.\\,@>
  goto reswitch;
  end;
special_tag:if cur_type<min_suffix then goto reswitch
  else begin out("."); cur_type:=internal; goto reswitch;
@..@>
    end;
tag:begin if cur_type=tag then if byte_mem[byte_start[cur_tok]]="'" then
    goto reswitch; {a sequence of primes goes on the main line}
  if (cur_type=numeric_token)or(cur_type>=min_suffix) then
    @<Translate a subscript@>
  else if cur_type=sharp then out_str(tr_sharp)
  else goto reswitch;
  end;
end; {there are no other cases}
end

@ @<Translate a subscript@>=
begin out2("_")("{");
loop@+ begin if cur_type>=min_suffix then out_name(cur_tok)
    else copy(cur_tok);
  if prev_type=special_tag then
    begin get_next; goto done;
    end;
  get_next;
  if cur_type<min_suffix then if cur_type<>numeric_token then goto done;
  if cur_type=prev_type then
    if cur_type=numeric_token then out2("\")(",")
@.\\,@>
    else if char_class[byte_mem[byte_start[cur_tok]]]=@|
        char_class[byte_mem[byte_start[prev_tok]]] then
      if byte_mem[byte_start[prev_tok]]<>"." then out(".")
      else out2("\")(",");
  end;
done: out("}"); goto reswitch;
end

@ The tricky thing about comments is that they might contain \pb.
We scan ahead for this, and replace the second `\.{\char'174}'
by a |carriage_return|.

@<Translate a comment and |goto restart|...@>=
begin if cur_type=comment then out2("\")("9");
@.\\9@>
id_first:=loc;
while (loc<limit)and(buffer[loc]<>"|") do incr(loc);
copy(id_first);
if loc<limit then
  begin start_of_line:=true; incr(loc); k:=loc;
  while (k<limit)and(buffer[k]<>"|") do incr(k);
  buffer[k]:=carriage_return;
  end
else  begin if out_buf[out_ptr]="\" then out(" ");
  out4("\")("p")("a")("r"); goto restart;
@.\\par@>
  end;
end

@ @<Copy the rest of the current input line to the output...@>=
begin id_first:=loc; loc:=limit; copy(id_first);
if out_ptr=0 then
  begin out_ptr:=1; out_buf[1]:=" ";
  end;
goto restart;
end

@ @<Wind up a line of translation...@>=
begin out("$");
if (loc<limit)and(cur_type=end_of_line) then
  begin cur_type:=recomment; goto reswitch;
  end
else  begin out4("\")("p")("a")("r"); goto restart;
@.\\par@>
  end;
end

@ @<Change the translation format...@>=
begin start_of_line:=false; get_next; t:=cur_type;
while cur_type>=min_symbolic_token do
  begin get_next;
  if cur_type>=min_symbolic_token then ilk[cur_tok]:=t;
  end;
if cur_type<>end_of_line
  then if cur_type<>mft_comment then
  begin err_print('! Only symbolic tokens should appear after %%%');
@.Only symbolic tokens...@>
  goto reswitch;
  end;
empty_buffer:=true; goto restart;
end

@* The main program.
Let's put it all together now: \.{MFT} starts and ends here.
@^system dependencies@>

@p begin initialize; {beginning of the main program}
print_ln(banner); {print a ``banner line''}
@<Store all the primitives@>;
@<Store all the translations@>;
@<Initialize the input...@>;
do_the_translation;
@<Check that all changes have been read@>;
end_of_MFT:{here files should be closed if the operating system requires it}
@<Print the job |history|@>;
end.

@ Some implementations may wish to pass the |history| value to the
operating system so that it can be used to govern whether or not other
programs are started. Here we simply report the history to the user.
@^system dependencies@>

@<Print the job |history|@>=
case history of
spotless: print_nl('(No errors were found.)');
harmless_message: print_nl('(Did you see the warning message above?)');
error_message: print_nl('(Pardon me, but I think I spotted something wrong.)');
fatal_message: print_nl('(That was a fatal error, my friend.)');
end {there are no other cases}

@* System-dependent changes.
This module should be replaced, if necessary, by changes to the program
that are necessary to make \.{MFT} work at a particular installation.
It is usually best to design your change file so that all changes to
previous modules preserve the module numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new modules, can be inserted here; then only the index
itself will get a new module number.
@^system dependencies@>

@* Index.