modernc.org/knuth@v0.0.4/web/testdata/ctan.org/tex-archive/systems/knuth/dist/texware/pltotf.web (about) 1 % This program by D. E. Knuth is not copyrighted and can be used freely. 2 % Version 0 was implemented in January 1982. 3 % In February 1982 a new restriction on ligature steps was added. 4 % In June 1982 the routines were divided into smaller pieces for IBM people. 5 % Hex was added in September 1982, and the result became "Version 1". 6 % Version 1.1 fixed a bug in section 28 (since eoln is undefined after eof). 7 % Slight changes were made in October, 1982, for version 0.6 of TeX. 8 % Version 1.2 fixed a bug in section 115 (TOP, MID, and BOT can be zero) 9 % Version 1.3 (April 1983) blanked out unused BCPL header bytes 10 % Version 2 (July 1983) was released with TeX version 0.999. 11 % Version 2.1 (September 1983) changed TEXINFO to FONTDIMEN. 12 % Version 2.2 (May 1985) added checksum computation to match METAFONT. 13 % Version 2.3 (August 1985) introduced `backup' to fix a minor bug. 14 % Version 3 (October 1989) introduced extended ligature features. 15 % Version 3.1 (November 1989) fixed two bugs (notably min_nl:=0). 16 % Version 3.2 (December 1989) improved `shorten', increased max_letters. 17 % Version 3.3 (September 1990) fixed `nonexistent char 0' (John Gourlay). 18 % Version 3.4 (March 1991) has more robust `out_scaled' (Wayne Sullivan). 19 % Version 3.5 (March 1995) initialized lk_step_ended (Armin K\"ollner). 20 % Version 3.6 (January 2014) corrected possible end-of-line glitch (Ken Nakano), 21 % and get_fix now treats -- as + (Peter Breitenlohner). 22 23 % Here is TeX material that gets inserted after \input webmac 24 \def\hang{\hangindent 3em\indent\ignorespaces} 25 \font\ninerm=cmr9 26 \let\mc=\ninerm % medium caps for names like SAIL 27 \def\PASCAL{Pascal} 28 \font\logo=logo10 % for the METAFONT logo 29 \def\MF{{\logo METAFONT}} 30 31 \def\(#1){} % this is used to make section names sort themselves better 32 \def\9#1{} % this is used for sort keys in the index 33 34 \def\title{PL\lowercase{to}TF} 35 \def\contentspagenumber{301} 36 \def\topofcontents{\null 37 \titlefalse % include headline on the contents page 38 \def\rheader{\mainfont\hfil \contentspagenumber} 39 \vfill 40 \centerline{\titlefont The {\ttitlefont PLtoTF} processor} 41 \vskip 15pt 42 \centerline{(Version 3.6, January 2014)} 43 \vfill} 44 \def\botofcontents{\vfill 45 \centerline{\hsize 5in\baselineskip9pt 46 \vbox{\ninerm\noindent 47 The preparation of this report 48 was supported in part by the National Science 49 Foundation under grants IST-8201926 and MCS-8300984, 50 and by the System Development Foundation. `\TeX' is a 51 trademark of the American Mathematical Society.}}} 52 \pageno=\contentspagenumber \advance\pageno by 1 53 54 @* Introduction. 55 The \.{PLtoTF} utility program converts property-list (``\.{PL}'') 56 files into equivalent \TeX\ font metric (``\.{TFM}'') files. It also 57 makes a thorough check of the given \.{PL} file, so that the \.{TFM} 58 file should be acceptable to \TeX. 59 60 The first \.{PLtoTF} program was designed by Leo Guibas in the summer of 61 1978. Contributions by Frank Liang, Doug Wyatt, and Lyle Ramshaw 62 also had a significant effect on the evolution of the present code. 63 64 Extensions for an enhanced ligature mechanism were added by the author in 1989. 65 66 The |banner| string defined here should be changed whenever \.{PLtoTF} 67 gets modified. 68 69 @d banner=='This is PLtoTF, Version 3.6' {printed when the program starts} 70 71 @ This program is written entirely in standard \PASCAL, except that 72 it has to do some slightly system-dependent character code conversion 73 on input. Furthermore, lower case letters are used in error messages; 74 they could be converted to upper case if necessary. The input is read 75 from |pl_file|, and the output is written on |tfm_file|; error messages and 76 other remarks are written on the |output| file, which the user may 77 choose to assign to the terminal if the system permits it. 78 @^system dependencies@> 79 80 The term |print| is used instead of |write| when this program writes on 81 the |output| file, so that all such output can be easily deflected. 82 83 @d print(#)==write(#) 84 @d print_ln(#)==write_ln(#) 85 86 @p program PLtoTF(@!pl_file,@!tfm_file,@!output); 87 const @<Constants in the outer block@>@/ 88 type @<Types in the outer block@>@/ 89 var @<Globals in the outer block@>@/ 90 procedure initialize; {this procedure gets things started properly} 91 var @<Local variables for initialization@>@/ 92 begin print_ln(banner);@/ 93 @<Set initial values@>@/ 94 end; 95 96 @ The following parameters can be changed at compile time to extend or 97 reduce \.{PLtoTF}'s capacity. 98 99 @<Constants...@>= 100 @!buf_size=60; {length of lines displayed in error messages} 101 @!max_header_bytes=100; {four times the maximum number of words allowed in 102 the \.{TFM} file header block, must be 1024 or less} 103 @!max_param_words=30; {the maximum number of \.{fontdimen} parameters allowed} 104 @!max_lig_steps=5000; 105 {maximum length of ligature program, must be at most $32767-257=32510$} 106 @!max_kerns=500; {the maximum number of distinct kern values} 107 @!hash_size=5003; {preferably a prime number, a bit larger than the number 108 of character pairs in lig/kern steps} 109 110 @ Here are some macros for common programming idioms. 111 112 @d incr(#) == #:=#+1 {increase a variable by unity} 113 @d decr(#) == #:=#-1 {decrease a variable by unity} 114 @d do_nothing == {empty statement} 115 116 @* Property list description of font metric data. 117 The idea behind \.{PL} files is that precise details about fonts, i.e., the 118 facts that are needed by typesetting routines like \TeX, sometimes have to 119 be supplied by hand. The nested property-list format provides a reasonably 120 convenient way to do this. 121 122 A good deal of computation is necessary to parse and process a 123 \.{PL} file, so it would be inappropriate for \TeX\ itself to do this 124 every time it loads a font. \TeX\ deals only with the compact descriptions 125 of font metric data that appear in \.{TFM} files. Such data is so compact, 126 however, it is almost impossible for anybody but a computer to read it. 127 The purpose of \.{PLtoTF} is to convert from a human-oriented file of text 128 to a computer-oriented file of binary numbers. 129 130 @<Glob...@>= 131 @!pl_file:text; 132 133 @ @<Set init...@>= 134 reset(pl_file); 135 136 @ A \.{PL} file is a list of entries of the form 137 $$\.{(PROPERTYNAME VALUE)}$$ 138 where the property name is one of a finite set of names understood by 139 this program, and the value may itself in turn be a property list. 140 The idea is best understood by looking at an example, so let's consider 141 a fragment of the \.{PL} file for a hypothetical font. 142 $$\vbox{\halign{\.{#}\hfil\cr 143 (FAMILY NOVA)\cr 144 (FACE F MIE)\cr 145 (CODINGSCHEME ASCII)\cr 146 (DESIGNSIZE D 10)\cr 147 (DESIGNUNITS D 18)\cr 148 (COMMENT A COMMENT IS IGNORED)\cr 149 (COMMENT (EXCEPT THIS ONE ISN'T))\cr 150 (COMMENT (ACTUALLY IT IS, EVEN THOUGH\cr 151 \qquad\qquad IT SAYS IT ISN'T))\cr 152 (FONTDIMEN\cr 153 \qquad (SLANT R -.25)\cr 154 \qquad (SPACE D 6)\cr 155 \qquad (SHRINK D 2)\cr 156 \qquad (STRETCH D 3)\cr 157 \qquad (XHEIGHT R 10.55)\cr 158 \qquad (QUAD D 18)\cr 159 \qquad )\cr 160 (LIGTABLE\cr 161 \qquad (LABEL C f)\cr 162 \qquad (LIG C f O 200)\cr 163 \qquad (SKIP D 1)\cr 164 \qquad (LABEL O 200)\cr 165 \qquad (LIG C i O 201)\cr 166 \qquad (KRN O 51 R 1.5)\cr 167 \qquad (/LIG C ? C f)\cr 168 \qquad (STOP)\cr 169 \qquad )\cr 170 (CHARACTER C f\cr 171 \qquad (CHARWD D 6)\cr 172 \qquad (CHARHT R 13.5)\cr 173 \qquad (CHARIC R 1.5)\cr 174 \qquad )\cr}}$$ 175 This example says that the font whose metric information is being described 176 belongs to the hypothetical 177 \.{NOVA} family; its face code is medium italic extended; 178 and the characters appear in ASCII code positions. The design size is 10 points, 179 and all other sizes in this \.{PL} file are given in units such that 18 units 180 equals the design size. The font is slanted with a slope of $-.25$ (hence the 181 letters actually slant backward---perhaps that is why the family name is 182 \.{NOVA}). The normal space between words is 6 units (i.e., one third of 183 the 18-unit design size), with glue that shrinks by 2 units or stretches by 3. 184 The letters for which accents don't need to be raised or lowered are 10.55 185 units high, and one em equals 18 units. 186 187 The example ligature table is a bit trickier. It specifies that the 188 letter \.f followed by another \.f is changed to code @'200, while 189 code @'200 followed by \.i is changed to @'201; presumably codes @'200 190 and @'201 represent the ligatures `ff' and `ffi'. Moreover, in both cases 191 \.f and @'200, if the following character is the code @'51 (which is a 192 right parenthesis), an additional 1.5 units of space should be inserted 193 before the @'51. (The `\.{SKIP}~\.D~\.1' skips over one \.{LIG} or 194 \.{KRN} command, which in this case is the second \.{LIG}; in this way 195 two different ligature/kern programs can come together.) 196 Finally, if either \.f or @'200 is followed by a question mark, 197 the question mark is replaced by \.f and the ligature program is 198 started over. (Thus, the character pair `\.{f?}' would actually become 199 the ligature `ff', and `\.{ff?}' or `\.{f?f}' would become `fff'. To 200 avoid this restart procedure, the \.{/LIG} command could be replaced 201 by \.{/LIG>}; then `\.{f?} would become `f\kern0ptf' and `\.{f?f}' 202 would become `f\kern0ptff'.) 203 204 Character \.f itself is 6 units wide and 13.5 units tall, in this example. 205 Its depth is zero (since \.{CHARDP} is not given), and its italic correction 206 is 1.5 units. 207 208 @ The example above illustrates most of the features found in \.{PL} files. 209 Note that some property names, like \.{FAMILY} or \.{COMMENT}, take a 210 string as their value; this string continues until the first unmatched 211 right parenthesis. But most property names, like \.{DESIGNSIZE} and \.{SLANT} 212 and \.{LABEL}, take a number as their value. This number can be expressed in 213 a variety of ways, indicated by a prefixed code; \.D stands for decimal, 214 \.H for hexadecimal, \.O for octal, \.R for real, \.C for character, and 215 \.F for ``face.'' Other property names, like \.{LIG}, take two numbers as 216 their value. And still other names, like \.{FONTDIMEN} and \.{LIGTABLE} and 217 \.{CHARACTER}, have more complicated values that involve property lists. 218 219 A property name is supposed to be used only in an appropriate property 220 list. For example, \.{CHARWD} shouldn't occur on the outer level or 221 within \.{FONTDIMEN}. 222 223 The individual property-and-value pairs in a property list can appear in 224 any order. For instance, `\.{SHRINK}' precedes `\.{STRETCH}' in the above 225 example, although the \.{TFM} file always puts the stretch parameter first. 226 One could even give the information about characters like `\.f' before 227 specifying the number of units in the design size, or before specifying the 228 ligature and kerning table. However, the \.{LIGTABLE} itself is an exception 229 to this rule; the individual elements of the \.{LIGTABLE} property list 230 can be reordered only to a certain extent without changing the meaning 231 of that table. 232 233 If property-and-value pairs are omitted, a default value is used. For example, 234 we have already noted that the default for \.{CHARDP} is zero. The default 235 for {\sl every\/} numeric value is, in fact, zero, unless otherwise stated 236 below. 237 238 If the same property name is used more than once, \.{PLtoTF} will not notice 239 the discrepancy; it simply uses the final value given. Once again, however, the 240 \.{LIGTABLE} is an exception to this rule; \.{PLtoTF} will complain if there 241 is more than one label for some character. And of course many of the 242 entries in the \.{LIGTABLE} property list have the same property name. 243 244 From these rules, you can guess (correctly) that \.{PLtoTF} operates in four 245 main steps. First it assigns the default values to all properties; then it scans 246 through the \.{PL} file, changing property values as new ones are seen; then 247 it checks the information and corrects any problems; and finally it outputs 248 the \.{TFM} file. 249 250 @ Instead of relying on a hypothetical example, let's consider a complete 251 grammar for \.{PL} files. At the outer level, the following property names 252 are valid: 253 254 \yskip\hang\.{CHECKSUM} (four-byte value). The value, which should be a 255 nonnegative integer less than $2^{32}$, is used to identify a particular 256 version of a font; it should match the check sum value stored with the font 257 itself. An explicit check sum of zero is used to bypass 258 check sum testing. If no checksum is specified in the \.{PL} file, 259 \.{PLtoTF} will compute the checksum that \MF\ would compute from the 260 same data. 261 262 \yskip\hang\.{DESIGNSIZE} (numeric value, default is 10). The value, which 263 should be a real number in the range |1.0<=x<2048|, represents the default 264 amount by which all quantities will be scaled if the font is not loaded 265 with an `\.{at}' specification. For example, if one says 266 `\.{\\font\\A=cmr10 at 15pt}' in \TeX\ language, the design size in the \.{TFM} 267 file is ignored and effectively replaced by 15 points; but if one simply 268 says `\.{\\font\\A=cmr10}' the stated design size is used. This quantity is 269 always in units of printer's points. 270 271 \yskip\hang\.{DESIGNUNITS} (numeric value, default is 1). The value 272 should be a positive real number; it says how many units equals the design 273 size (or the eventual `\.{at}' size, if the font is being scaled). For 274 example, suppose you have a font that has been digitized with 600 pixels per 275 em, and the design size is one em; then you could say `\.{(DESIGNUNITS R 600)}' 276 if you wanted to give all of your measurements in units of pixels. 277 278 \yskip\hang\.{CODINGSCHEME} (string value, default is `\.{UNSPECIFIED}'). 279 The string should not contain parentheses, and its length must be less than 40. 280 It identifies the correspondence between the numeric codes and font characters. 281 (\TeX\ ignores this information, but other software programs make use of it.) 282 283 \yskip\hang\.{FAMILY} (string value, default is `\.{UNSPECIFIED}'). 284 The string should not contain parentheses, and its length must be less than 20. 285 It identifies the name of the family to which this font belongs, e.g., 286 `\.{HELVETICA}'. (\TeX\ ignores this information; but it is needed, for 287 example, when converting \.{DVI} files to \.{PRESS} files for Xerox 288 equipment.) 289 290 \yskip\hang\.{FACE} (one-byte value). This number, which must lie between 291 0 and 255 inclusive, is a subsidiary ident\-ifi\-ca\-tion of the font within its 292 family. For example, bold italic condensed fonts might have the same family name 293 as light roman extended fonts, differing only in their face byte. (\TeX\ 294 ignores this information; but it is needed, for example, when converting 295 \.{DVI} files to \.{PRESS} files for Xerox equipment.) 296 297 \yskip\hang\.{SEVENBITSAFEFLAG} (string value, default is `\.{FALSE}'). The 298 value should start with either `\.T' (true) or `\.F' (false). If true, character 299 codes less than 128 cannot lead to codes of 128 or more via ligatures or 300 charlists or extensible characters. (\TeX82 ignores this flag, but older 301 versions of \TeX\ would only accept \.{TFM} files that were seven-bit safe.) 302 \.{PLtoTF} computes the correct value of this flag and gives an error message 303 only if a claimed ``true'' value is incorrect. 304 305 \yskip\hang\.{HEADER} (a one-byte value followed by a four-byte value). 306 The one-byte value should be between 18 and a maximum limit that can be 307 raised or lowered depending on the compile-time setting of |max_header_bytes|. 308 The four-byte value goes into the header word whose index is the one-byte 309 value; for example, to set |header[18]:=1|, one may write 310 `\.{(HEADER D 18 O 1)}'. This notation is used for header information that 311 is presently unnamed. (\TeX\ ignores it.) 312 313 \yskip\hang\.{FONTDIMEN} (property list value). See below for the names 314 allowed in this property list. 315 316 \yskip\hang\.{LIGTABLE} (property list value). See below for the rules 317 about this special kind of property list. 318 319 \yskip\hang\.{BOUNDARYCHAR} (one-byte value). If this character appears in 320 a \.{LIGTABLE} command, it matches ``end of word'' as well as itself. 321 If no boundary character is given and no \.{LABEL} \.{BOUNDARYCHAR} occurs 322 within \.{LIGTABLE}, word boundaries will not affect ligatures or kerning. 323 324 \yskip\hang\.{CHARACTER}. The value is a one-byte integer followed by 325 a property list. The integer represents the number of a character that is 326 present in the font; the property list of a character is defined below. 327 The default is an empty property list. 328 329 @ Numeric property list values can be given in various forms identified by 330 a prefixed letter. 331 332 \yskip\hang\.C denotes an ASCII character, which should be a standard visible 333 character that is not a parenthesis. The numeric value will therefore be 334 between @'41 and @'176 but not @'50 or @'51. 335 336 \yskip\hang\.D denotes a decimal integer, which must be nonnegative and 337 less than 256. (Use \.R for larger values or for negative values.) 338 339 \yskip\hang\.F denotes a three-letter Xerox face code; the admissible codes 340 are \.{MRR}, \.{MIR}, \.{BRR}, \.{BIR}, \.{LRR}, \.{LIR}, \.{MRC}, \.{MIC}, 341 \.{BRC}, \.{BIC}, \.{LRC}, \.{LIC}, \.{MRE}, \.{MIE}, \.{BRE}, \.{BIE}, 342 \.{LRE}, and \.{LIE}, denoting the integers 0 to 17, respectively. 343 344 \yskip\hang\.O denotes an unsigned octal integer, which must be less than 345 $2^{32}$, i.e., at most `\.{O 37777777777}'. 346 347 \yskip\hang\.H denotes an unsigned hexadecimal integer, which must be less than 348 $2^{32}$, i.e., at most `\.{H FFFFFFFF}'. 349 350 \yskip\hang\.R denotes a real number in decimal notation, optionally preceded 351 by a `\.+' or `\.-' sign, and optionally including a decimal point. The 352 absolute value must be less than 2048. 353 354 @ The property names allowed in a \.{FONTDIMEN} property list correspond to 355 various \TeX\ parameters, each of which has a (real) numeric value. All 356 of the parameters except \.{SLANT} are in design units. The admissible 357 names are \.{SLANT}, \.{SPACE}, \.{STRETCH}, \.{SHRINK}, \.{XHEIGHT}, 358 \.{QUAD}, \.{EXTRASPACE}, \.{NUM1}, \.{NUM2}, \.{NUM3}, \.{DENOM1}, 359 \.{DENOM2}, \.{SUP1}, \.{SUP2}, \.{SUP3}, \.{SUB1}, \.{SUB2}, \.{SUPDROP}, 360 \.{SUBDROP}, \.{DELIM1}, \.{DELIM2}, and \.{AXISHEIGHT}, for parameters 361 1~to~22. The alternate names \.{DEFAULTRULETHICKNESS}, 362 \.{BIGOPSPACING1}, \.{BIGOPSPACING2}, \.{BIGOPSPACING3}, 363 \.{BIGOPSPACING4}, and \.{BIGOPSPACING5}, may also be used for parameters 364 8 to 13. 365 366 The notation `\.{PARAMETER} $n$' provides another way to specify the 367 $n$th parameter; for example, `\.{(PARAMETER} \.{D 1 R -.25)}' is another way 368 to specify that the \.{SLANT} is $-0.25$. The value of $n$ must be positive 369 and less than |max_param_words|. 370 371 @ The elements of a \.{CHARACTER} property list can be of six different types. 372 373 \yskip\hang\.{CHARWD} (real value) denotes the character's width in 374 design units. 375 376 \yskip\hang\.{CHARHT} (real value) denotes the character's height in 377 design units. 378 379 \yskip\hang\.{CHARDP} (real value) denotes the character's depth in 380 design units. 381 382 \yskip\hang\.{CHARIC} (real value) denotes the character's italic correction in 383 design units. 384 385 \yskip\hang\.{NEXTLARGER} (one-byte value), specifies the character that 386 follows the present one in a ``charlist.'' The value must be the number of a 387 character in the font, and there must be no infinite cycles of supposedly 388 larger and larger characters. 389 390 \yskip\hang\.{VARCHAR} (property list value), specifies an extensible character. 391 This option and \.{NEXTLARGER} are mutually exclusive; i.e., they cannot 392 both be used within the same \.{CHARACTER} list. 393 394 \yskip\noindent 395 The elements of a \.{VARCHAR} property list are either \.{TOP}, \.{MID}, 396 \.{BOT}, or \.{REP}; the values are integers, which must be zero or the number 397 of a character in the font. A zero value for \.{TOP}, \.{MID}, or \.{BOT} means 398 that the corresponding piece of the extensible character is absent. A nonzero 399 value, or a \.{REP} value of zero, denotes the character code used to make 400 up the top, middle, bottom, or replicated piece of an extensible character. 401 402 @ A \.{LIGTABLE} property list contains elements of four kinds, specifying a 403 program in a simple command language that \TeX\ uses for ligatures and kerns. 404 If several \.{LIGTABLE} lists appear, they are effectively concatenated into 405 a single list. 406 407 \yskip\hang\.{LABEL} (one-byte value) means that the program for the 408 stated character value starts here. The integer must be the number of a 409 character in the font; its \.{CHARACTER} property list must not have a 410 \.{NEXTLARGER} or \.{VARCHAR} field. At least one \.{LIG} or \.{KRN} step 411 must follow. 412 413 \yskip\hang\.{LABEL} \.{BOUNDARYCHAR} means that the program for 414 beginning-of-word ligatures starts here. 415 416 \yskip\hang\.{LIG} (two one-byte values). The instruction `\.{(LIG} $c$ $r$\.)' 417 means, ``If the next character is $c$, then insert character~$r$ and 418 possibly delete the current character and/or~$c$; 419 otherwise go on to the next instruction.'' 420 Characters $r$ and $c$ must be present in the font. \.{LIG} may be immediately 421 preceded or followed by a slash, and then immediately followed by \.> 422 characters not exceeding the number of slashes. Thus there are eight 423 possible forms: 424 $$\hbox to .8\hsize{\.{LIG}\hfil\.{/LIG}\hfil\.{/LIG>}\hfil 425 \.{LIG/}\hfil\.{LIG/>}\hfil\.{/LIG/}\hfil\.{/LIG/>}\hfil\.{/LIG/>>}}$$ 426 The slashes specify retention of the left or right original character; the 427 \.> signs specify passing over the result without further ligature processing. 428 429 \yskip\hang\.{KRN} (a one-byte value and a real value). The instruction 430 `\.{(KRN} $c$ $r$\.)' means, ``If the next character is $c$, then insert 431 a blank space of width $r$ between the current character and $c$; 432 otherwise go on to the next instruction.'' The value of $r$, which is in 433 design units, is often negative. Character code $c$ must exist 434 in the font. 435 436 \yskip\hang\.{STOP} (no value). This instruction ends a ligature/kern program. 437 It must follow either a \.{LIG} or \.{KRN} instruction, not a \.{LABEL} 438 or \.{STOP} or \.{SKIP}. 439 440 \yskip\hang\.{SKIP} (value in the range |0..127|). This instruction specifies 441 continuation of a ligature/kern program after the specified number of \.{LIG} 442 or \.{KRN} steps has been skipped over. The number of subsequent \.{LIG} and 443 \.{KRN} instructions must therefore exceed this specified amount. 444 445 @ In addition to all these possibilities, the property name \.{COMMENT} is 446 allowed in any property list. Such comments are ignored. 447 448 @ So that is what \.{PL} files hold. The next question is, ``What about 449 \.{TFM} files?'' A complete answer to that question appears in the 450 documentation of the companion program, \.{TFtoPL}, so it will not 451 be repeated here. Suffice it to say that a \.{TFM} file stores all of the 452 relevant font information in a sequence of 8-bit bytes. The number of 453 bytes is always a multiple of 4, so we could regard the \.{TFM} file 454 as a sequence of 32-bit words; but \TeX\ uses the byte interpretation, 455 and so does \.{PLtoTF}. Note that the bytes are considered to be unsigned 456 numbers. 457 458 @<Glob...@>= 459 @!tfm_file:packed file of 0..255; 460 461 @ On some systems you may have to do something special to write a 462 packed file of bytes. For example, the following code didn't work 463 when it was first tried at Stanford, because packed files have to be 464 opened with a special switch setting on the \PASCAL\ that was used. 465 @^system dependencies@> 466 467 @<Set init...@>= 468 rewrite(tfm_file); 469 470 @* Basic input routines. 471 For the purposes of this program, a |byte| is an unsigned eight-bit quantity, 472 and an |ASCII_code| is an integer between @'40 and @'177. Such ASCII codes 473 correspond to one-character constants like \.{"A"} in \.{WEB} language. 474 475 @<Types...@>= 476 @!byte=0..255; {unsigned eight-bit quantity} 477 @!ASCII_code=@'40..@'177; {standard ASCII code numbers} 478 479 @ One of the things \.{PLtoTF} has to do is convert characters of strings 480 to ASCII form, since that is the code used for the family name and the 481 coding scheme in a \.{TFM} file. An array |xord| is used to do the 482 conversion from |char|; the method below should work with little or no change 483 on most \PASCAL\ systems. 484 @^system dependencies@> 485 486 @d first_ord=0 {ordinal number of the smallest element of |char|} 487 @d last_ord=127 {ordinal number of the largest element of |char|} 488 489 @<Global...@>= 490 @!xord:array[char] of ASCII_code; {conversion table} 491 492 @ @<Local variables for init...@>= 493 @!k:integer; {all-purpose initialization index} 494 495 @ Characters that should not appear in \.{PL} files (except in comments) 496 are mapped into @'177. 497 498 @d invalid_code=@'177 {code deserving an error message} 499 500 @<Set init...@>= 501 for k:=first_ord to last_ord do xord[chr(k)]:=invalid_code; 502 xord[' ']:=" "; xord['!']:="!"; xord['"']:=""""; xord['#']:="#"; 503 xord['$']:="$"; xord['%']:="%"; xord['&']:="&"; xord['''']:="'"; 504 xord['(']:="("; xord[')']:=")"; xord['*']:="*"; xord['+']:="+"; xord[',']:=","; 505 xord['-']:="-"; xord['.']:="."; xord['/']:="/"; xord['0']:="0"; xord['1']:="1"; 506 xord['2']:="2"; xord['3']:="3"; xord['4']:="4"; xord['5']:="5"; xord['6']:="6"; 507 xord['7']:="7"; xord['8']:="8"; xord['9']:="9"; xord[':']:=":"; xord[';']:=";"; 508 xord['<']:="<"; xord['=']:="="; xord['>']:=">"; xord['?']:="?"; 509 xord['@@']:="@@"; xord['A']:="A"; xord['B']:="B"; xord['C']:="C"; 510 xord['D']:="D"; xord['E']:="E"; xord['F']:="F"; xord['G']:="G"; xord['H']:="H"; 511 xord['I']:="I"; xord['J']:="J"; xord['K']:="K"; xord['L']:="L"; xord['M']:="M"; 512 xord['N']:="N"; xord['O']:="O"; xord['P']:="P"; xord['Q']:="Q"; xord['R']:="R"; 513 xord['S']:="S"; xord['T']:="T"; xord['U']:="U"; xord['V']:="V"; xord['W']:="W"; 514 xord['X']:="X"; xord['Y']:="Y"; xord['Z']:="Z"; xord['[']:="["; xord['\']:="\"; 515 xord[']']:="]"; xord['^']:="^"; xord['_']:="_"; xord['`']:="`"; xord['a']:="a"; 516 xord['b']:="b"; xord['c']:="c"; xord['d']:="d"; xord['e']:="e"; xord['f']:="f"; 517 xord['g']:="g"; xord['h']:="h"; xord['i']:="i"; xord['j']:="j"; xord['k']:="k"; 518 xord['l']:="l"; xord['m']:="m"; xord['n']:="n"; xord['o']:="o"; xord['p']:="p"; 519 xord['q']:="q"; xord['r']:="r"; xord['s']:="s"; xord['t']:="t"; xord['u']:="u"; 520 xord['v']:="v"; xord['w']:="w"; xord['x']:="x"; xord['y']:="y"; xord['z']:="z"; 521 xord['{']:="{"; xord['|']:="|"; xord['}']:="}"; xord['~']:="~"; 522 523 @ In order to help catch errors of badly nested parentheses, \.{PLtoTF} 524 assumes that the user will begin each line with a number of blank spaces equal 525 to some constant times the number of open parentheses at the beginning of 526 that line. However, the program doesn't know in advance what the constant 527 is, nor does it want to print an error message on every line for a user 528 who has followed no consistent pattern of indentation. 529 530 Therefore the following strategy is adopted: If the user has been consistent 531 with indentation for ten or more lines, an indentation error will be 532 reported. The constant of indentation is reset on every line that should 533 have nonzero indentation. 534 535 @<Glob...@>= 536 @!line:integer; {the number of the current line} 537 @!good_indent:integer; {the number of lines since the last bad indentation} 538 @!indent: integer; {the number of spaces per open parenthesis, zero if unknown} 539 @!level: integer; {the current number of open parentheses} 540 541 @ @<Set init...@>= 542 line:=0; good_indent:=0; indent:=0; level:=0; 543 544 @ The input need not really be broken into lines of any maximum length, and 545 we could read it character by character without any buffering. But we shall 546 place it into a small buffer so that offending lines can be displayed in error 547 messages. 548 549 @<Glob...@>= 550 @!left_ln,@!right_ln:boolean; {are the left and right ends of the buffer 551 at end-of-line marks?} 552 @!limit:0..buf_size; {position of the last character present in the buffer} 553 @!loc:0..buf_size; {position of the last character read in the buffer} 554 @!buffer:array[1..buf_size] of char; 555 @!input_has_ended:boolean; {there is no more input to read} 556 557 @ @<Set init...@>= 558 limit:=0; loc:=0; left_ln:=true; right_ln:=true; input_has_ended:=false; 559 560 @ Just before each \.{CHARACTER} property list is evaluated, the character 561 code is printed in octal notation. Up to eight such codes appear on a line; 562 so we have a variable to keep track of how many are currently there. 563 564 @<Glob...@>= 565 @!chars_on_line:0..8; {the number of characters printed on the current line} 566 567 @ @<Set init...@>= 568 chars_on_line:=0; 569 570 @ The following routine prints an error message and an indication of 571 where the error was detected. The error message should not include any 572 final punctuation, since this procedure supplies its own. 573 574 @d err_print(#)==begin if chars_on_line>0 then print_ln(' '); 575 print(#); show_error_context; 576 end 577 578 @p procedure show_error_context; {prints the current scanner location} 579 var k:0..buf_size; {an index into |buffer|} 580 begin print_ln(' (line ',line:1,').'); 581 if not left_ln then print('...'); 582 for k:=1 to loc do print(buffer[k]); {print the characters already scanned} 583 print_ln(' '); 584 if not left_ln then print(' '); 585 for k:=1 to loc do print(' '); {space out the second line} 586 for k:=loc+1 to limit do print(buffer[k]); {print the characters yet unseen} 587 if right_ln then print_ln(' ')@+else print_ln('...'); 588 chars_on_line:=0; 589 end; 590 591 @ Here is a procedure that does the right thing when we are done 592 reading the present contents of the buffer. It keeps |buffer[buf_size]| 593 empty, in order to avoid range errors on certain \PASCAL\ compilers. 594 595 An infinite sequence of right parentheses is placed at the end of the 596 file, so that the program is sure to get out of whatever level of nesting 597 it is in. 598 599 On some systems it is desirable to modify this code so that tab marks 600 in the buffer are replaced by blank spaces. (Simply setting 601 |xord[chr(@'11)]:=" "| would not work; for example, two-line 602 error messages would not come out properly aligned.) 603 @^system dependencies@> 604 605 @p procedure fill_buffer; 606 begin left_ln:=right_ln; limit:=0; loc:=0; 607 if left_ln then 608 begin if line>0 then read_ln(pl_file); 609 incr(line); 610 end; 611 if eof(pl_file) then 612 begin limit:=1; buffer[1]:=')'; right_ln:=false; input_has_ended:=true; 613 end 614 else begin while (limit<buf_size-2)and(not eoln(pl_file)) do 615 begin incr(limit); read(pl_file,buffer[limit]); 616 end; 617 buffer[limit+1]:=' '; right_ln:=eoln(pl_file); 618 if right_ln then begin incr(limit); buffer[limit+1]:=' '; 619 end; 620 if left_ln then @<Set |loc| to the number of leading blanks in 621 the buffer, and check the indentation@>; 622 end; 623 end; 624 625 @ The interesting part about |fill_buffer| is the part that learns what 626 indentation conventions the user is following, if any. 627 628 @d bad_indent(#)==begin if good_indent>=10 then err_print(#); 629 good_indent:=0; indent:=0; 630 end 631 632 @<Set |loc|...@>= 633 begin while (loc<limit)and(buffer[loc+1]=' ') do incr(loc); 634 if loc<limit then 635 begin if level=0 then 636 if loc=0 then incr(good_indent) 637 else bad_indent('Warning: Indented line occurred at level zero') 638 @.Warning: Indented line...@> 639 else if indent=0 then 640 if loc mod level=0 then 641 begin indent:=loc div level; good_indent:=1; 642 end 643 else good_indent:=0 644 else if indent*level=loc then incr(good_indent) 645 else bad_indent('Warning: Inconsistent indentation; ', 646 @.Warning: Inconsistent indentation...@> 647 'you are at parenthesis level ',level:1); 648 end; 649 end 650 651 @* Basic scanning routines. 652 The global variable |cur_char| holds the ASCII code corresponding to the 653 character most recently read from the input buffer, or to a character that 654 has been substituted for the real one. 655 656 @<Global...@>= 657 @!cur_char:ASCII_code; {we have just read this} 658 659 @ Here is a procedure that sets |cur_char| to an ASCII code for the 660 next character of input, if that character is a letter or digit or slash 661 or \.>. Otherwise 662 it sets |cur_char:=" "|, and the input system will be poised to reread the 663 character that was rejected, whether or not it was a space. 664 Lower case letters are converted to upper case. 665 666 @p procedure get_keyword_char; 667 begin while (loc=limit)and(not right_ln) do fill_buffer; 668 if loc=limit then cur_char:=" " {end-of-line counts as a delimiter} 669 else begin cur_char:=xord[buffer[loc+1]]; 670 if cur_char>="a" then cur_char:=cur_char-@'40; 671 if ((cur_char>="0")and(cur_char<="9")) then incr(loc) 672 else if ((cur_char>="A")and(cur_char<="Z")) then incr(loc) 673 else if cur_char="/" then incr(loc) 674 else if cur_char=">" then incr(loc) 675 else cur_char:=" "; 676 end; 677 end; 678 679 @ The following procedure sets |cur_char| to the next character code, 680 and converts lower case to upper case. If the character is a left or 681 right parenthesis, it will not be ``digested''; the character will 682 be read again and again, until the calling routine does something 683 like `|incr(loc)|' to get past it. Such special treatment of parentheses 684 insures that the structural information they contain won't be lost in 685 the midst of other error recovery operations. 686 687 @d backup==begin if (cur_char>")")or(cur_char<"(") then decr(loc); 688 end {undoes the effect of |get_next|} 689 690 @p procedure get_next; {sets |cur_char| to next, balks at parentheses} 691 begin while loc=limit do fill_buffer; 692 incr(loc); cur_char:=xord[buffer[loc]]; 693 if cur_char>="a" then 694 if cur_char<="z" then cur_char:=cur_char-@'40 {uppercasify} 695 else begin if cur_char=invalid_code then 696 begin err_print('Illegal character in the file'); 697 @.Illegal character...@> 698 cur_char:="?"; 699 end; 700 end 701 else if (cur_char<=")")and(cur_char>="(") then decr(loc); 702 end; 703 704 @ The next procedure is used to ignore the text of a comment, or to pass over 705 erroneous material. As such, it has the privilege of passing parentheses. 706 It stops after the first right parenthesis that drops the level below 707 the level in force when the procedure was called. 708 709 @p procedure skip_to_end_of_item; 710 var l:integer; {initial value of |level|} 711 begin l:=level; 712 while level>=l do 713 begin while loc=limit do fill_buffer; 714 incr(loc); 715 if buffer[loc]=')' then decr(level) 716 else if buffer[loc]='(' then incr(level); 717 end; 718 if input_has_ended then err_print('File ended unexpectedly: No closing ")"'); 719 @.File ended unexpectedly...@> 720 cur_char:=" "; {now the right parenthesis has been read and digested} 721 end; 722 723 @ Sometimes we merely want to skip past characters in the input until we 724 reach a left or a right parenthesis. For example, we do this whenever we 725 have finished scanning a property value and we hope that a right parenthesis 726 is next (except for possible blank spaces). 727 728 @d skip_to_paren==repeat get_next@;@+ until (cur_char="(")or(cur_char=")") 729 @d skip_error(#)==begin err_print(#); skip_to_paren; 730 end {this gets to the right parenthesis if something goes wrong} 731 @d flush_error(#)==begin err_print(#); skip_to_end_of_item; 732 end {this gets past the right parenthesis if something goes wrong} 733 734 @ After a property value has been scanned, we want to move just past the 735 right parenthesis that should come next in the input (except for possible 736 blank spaces). 737 738 @p procedure finish_the_property; {do this when the value has been scanned} 739 begin while cur_char=" " do get_next; 740 if cur_char<>")" then err_print('Junk after property value will be ignored'); 741 @.Junk after property value...@> 742 skip_to_end_of_item; 743 end; 744 745 @* Scanning property names. 746 We have to figure out the meaning of names that appear in the \.{PL} file, 747 by looking them up in a dictionary of known keywords. Keyword number $n$ 748 appears in locations |start[n]| through |start[n+1]-1| of an array called 749 |dictionary|. 750 751 @d max_name_index=88 {upper bound on the number of keywords} 752 @d max_letters=600 {upper bound on the total length of all keywords} 753 754 @<Global...@>= 755 @!start:array[1..max_name_index] of 0..max_letters; 756 @!dictionary:array[0..max_letters] of ASCII_code; 757 @!start_ptr:0..max_name_index; {the first available place in |start|} 758 @!dict_ptr:0..max_letters; {the first available place in |dictionary|} 759 760 @ @<Set init...@>= 761 start_ptr:=1; start[1]:=0; dict_ptr:=0; 762 763 @ When we are looking for a name, we put it into the |cur_name| array. 764 When we have found it, the corresponding |start| index will go into 765 the global variable |name_ptr|. 766 767 @d longest_name=20 {length of \.{DEFAULTRULETHICKNESS}} 768 769 @<Glob...@>= 770 @!cur_name:array[1..longest_name] of ASCII_code; {a name to look up} 771 @!name_length:0..longest_name; {its length} 772 @!name_ptr:0..max_name_index; {its ordinal number in the dictionary} 773 774 @ A conventional hash table with linear probing (cf.\ Algorithm 6.4L 775 in {\sl The Art of Computer Pro\-gram\-ming\/}) is used for the dictionary 776 operations. If |nhash[h]=0|, the table position is empty, otherwise |nhash[h]| 777 points into the |start| array. 778 779 @d hash_prime=101 {size of the hash table} 780 781 @<Glob...@>= 782 @!nhash:array[0..hash_prime-1] of 0..max_name_index; 783 @!cur_hash:0..hash_prime-1; {current position in the hash table} 784 785 @ @<Local...@>= 786 @!h:0..hash_prime-1; {runs through the hash table} 787 788 @ @<Set init...@>= 789 for h:=0 to hash_prime-1 do nhash[h]:=0; 790 791 @ Since there is no chance of the hash table overflowing, the procedure 792 is very simple. After |lookup| has done its work, |cur_hash| will point 793 to the place where the given name was found, or where it should be inserted. 794 795 @p procedure lookup; {finds |cur_name| in the dictionary} 796 var k:0..longest_name; {index into |cur_name|} 797 @!j:0..max_letters; {index into |dictionary|} 798 @!not_found:boolean; {clumsy thing necessary to avoid |goto| statement} 799 begin @<Compute the hash code, |cur_hash|, for |cur_name|@>; 800 not_found:=true; 801 while not_found do 802 begin if cur_hash=0 then cur_hash:=hash_prime-1@+else decr(cur_hash); 803 if nhash[cur_hash]=0 then not_found:=false 804 else begin j:=start[nhash[cur_hash]]; 805 if start[nhash[cur_hash]+1]=j+name_length then 806 begin not_found:=false; 807 for k:=1 to name_length do 808 if dictionary[j+k-1]<>cur_name[k] then not_found:=true; 809 end; 810 end; 811 end; 812 name_ptr:=nhash[cur_hash]; 813 end; 814 815 @ @<Compute the hash...@>= 816 cur_hash:=cur_name[1]; 817 for k:=2 to name_length do 818 cur_hash:=(cur_hash+cur_hash+cur_name[k]) mod hash_prime 819 820 @ The ``meaning'' of the keyword that begins at |start[k]| in the 821 dictionary is kept in |equiv[k]|. The numeric |equiv| codes are given 822 symbolic meanings by the following definitions. 823 824 @d comment_code=0 825 @d check_sum_code=1 826 @d design_size_code=2 827 @d design_units_code=3 828 @d coding_scheme_code=4 829 @d family_code=5 830 @d face_code=6 831 @d seven_bit_safe_flag_code=7 832 @d header_code= 8 833 @d font_dimen_code=9 834 @d lig_table_code=10 835 @d boundary_char_code=11 836 @d character_code=12 837 @d parameter_code=20 838 @d char_info_code=50 839 @d width=1 840 @d height=2 841 @d depth=3 842 @d italic=4 843 @d char_wd_code=char_info_code+width 844 @d char_ht_code=char_info_code+height 845 @d char_dp_code=char_info_code+depth 846 @d char_ic_code=char_info_code+italic 847 @d next_larger_code=55 848 @d var_char_code=56 849 @d label_code=70 850 @d stop_code=71 851 @d skip_code=72 852 @d krn_code=73 853 @d lig_code=74 854 855 @<Glo...@>= 856 @!equiv:array[0..max_name_index] of byte; 857 @!cur_code:byte; {equivalent most recently found in |equiv|} 858 859 @ We have to get the keywords into the hash table and into the dictionary in 860 the first place (sigh). The procedure that does this has the desired 861 |equiv| code as a parameter. In order to facilitate \.{WEB} macro writing 862 for the initialization, the keyword being initialized is placed into the 863 last positions of |cur_name|, instead of the first positions. 864 865 @p procedure enter_name(v:byte); {|cur_name| goes into the dictionary} 866 var k:0..longest_name; 867 begin for k:=1 to name_length do 868 cur_name[k]:=cur_name[k+longest_name-name_length]; 869 {now the name has been shifted into the correct position} 870 lookup; {this sets |cur_hash| to the proper insertion place} 871 nhash[cur_hash]:=start_ptr; equiv[start_ptr]:=v; 872 for k:=1 to name_length do 873 begin dictionary[dict_ptr]:=cur_name[k]; incr(dict_ptr); 874 end; 875 incr(start_ptr); start[start_ptr]:=dict_ptr; 876 end; 877 878 @ Here are the macros to load a name of up to 20 letters into the 879 dictionary. For example, the macro |load5| is used for five-letter keywords. 880 881 @d tail(#)==enter_name(#) 882 @d t20(#)==cur_name[20]:=#;tail 883 @d t19(#)==cur_name[19]:=#;t20 884 @d t18(#)==cur_name[18]:=#;t19 885 @d t17(#)==cur_name[17]:=#;t18 886 @d t16(#)==cur_name[16]:=#;t17 887 @d t15(#)==cur_name[15]:=#;t16 888 @d t14(#)==cur_name[14]:=#;t15 889 @d t13(#)==cur_name[13]:=#;t14 890 @d t12(#)==cur_name[12]:=#;t13 891 @d t11(#)==cur_name[11]:=#;t12 892 @d t10(#)==cur_name[10]:=#;t11 893 @d t9(#)==cur_name[9]:=#;t10 894 @d t8(#)==cur_name[8]:=#;t9 895 @d t7(#)==cur_name[7]:=#;t8 896 @d t6(#)==cur_name[6]:=#;t7 897 @d t5(#)==cur_name[5]:=#;t6 898 @d t4(#)==cur_name[4]:=#;t5 899 @d t3(#)==cur_name[3]:=#;t4 900 @d t2(#)==cur_name[2]:=#;t3 901 @d t1(#)==cur_name[1]:=#;t2 902 @d load3==name_length:=3;t18 903 @d load4==name_length:=4;t17 904 @d load5==name_length:=5;t16 905 @d load6==name_length:=6;t15 906 @d load7==name_length:=7;t14 907 @d load8==name_length:=8;t13 908 @d load9==name_length:=9;t12 909 @d load10==name_length:=10;t11 910 @d load11==name_length:=11;t10 911 @d load12==name_length:=12;t9 912 @d load13==name_length:=13;t8 913 @d load14==name_length:=14;t7 914 @d load15==name_length:=15;t6 915 @d load16==name_length:=16;t5 916 @d load17==name_length:=17;t4 917 @d load18==name_length:=18;t3 918 @d load19==name_length:=19;t2 919 @d load20==name_length:=20;t1 920 921 @ (Thank goodness for keyboard macros in the text editor used to create this 922 \.{WEB} file.) 923 924 @<Enter all of the names and their equivalents, except the parameter names@>= 925 equiv[0]:=comment_code; {this is used after unknown keywords} 926 load8("C")("H")("E")("C")("K")("S")("U")("M")(check_sum_code);@/ 927 load10("D")("E")("S")("I")("G")("N")("S")("I")("Z")("E")(design_size_code);@/ 928 load11("D")("E")("S")("I")("G")("N") 929 ("U")("N")("I")("T")("S")(design_units_code);@/ 930 load12("C")("O")("D")("I")("N")("G") 931 ("S")("C")("H")("E")("M")("E")(coding_scheme_code);@/ 932 load6("F")("A")("M")("I")("L")("Y")(family_code);@/ 933 load4("F")("A")("C")("E")(face_code);@/ 934 load16("S")("E")("V")("E")("N")("B")("I")("T")@/@t\hskip2em@> 935 ("S")("A")("F")("E")("F")("L")("A")("G")(seven_bit_safe_flag_code);@/ 936 load6("H")("E")("A")("D")("E")("R")(header_code);@/ 937 load9("F")("O")("N")("T")("D")("I")("M")("E")("N")(font_dimen_code);@/ 938 load8("L")("I")("G")("T")("A")("B")("L")("E")(lig_table_code);@/ 939 load12("B")("O")("U")("N")("D")("A")("R")("Y")("C")("H")("A")("R") 940 (boundary_char_code);@/ 941 load9("C")("H")("A")("R")("A")("C")("T")("E")("R")(character_code);@/ 942 load9("P")("A")("R")("A")("M")("E")("T")("E")("R")(parameter_code);@/ 943 load6("C")("H")("A")("R")("W")("D")(char_wd_code);@/ 944 load6("C")("H")("A")("R")("H")("T")(char_ht_code);@/ 945 load6("C")("H")("A")("R")("D")("P")(char_dp_code);@/ 946 load6("C")("H")("A")("R")("I")("C")(char_ic_code);@/ 947 load10("N")("E")("X")("T")("L")("A")("R")("G")("E")("R")(next_larger_code);@/ 948 load7("V")("A")("R")("C")("H")("A")("R")(var_char_code);@/ 949 load3("T")("O")("P")(var_char_code+1);@/ 950 load3("M")("I")("D")(var_char_code+2);@/ 951 load3("B")("O")("T")(var_char_code+3);@/ 952 load3("R")("E")("P")(var_char_code+4);@/ 953 load3("E")("X")("T")(var_char_code+4); {compatibility with older \.{PL} format} 954 load7("C")("O")("M")("M")("E")("N")("T")(comment_code);@/ 955 load5("L")("A")("B")("E")("L")(label_code);@/ 956 load4("S")("T")("O")("P")(stop_code);@/ 957 load4("S")("K")("I")("P")(skip_code);@/ 958 load3("K")("R")("N")(krn_code);@/ 959 load3("L")("I")("G")(lig_code);@/ 960 load4("/")("L")("I")("G")(lig_code+2);@/ 961 load5("/")("L")("I")("G")(">")(lig_code+6);@/ 962 load4("L")("I")("G")("/")(lig_code+1);@/ 963 load5("L")("I")("G")("/")(">")(lig_code+5);@/ 964 load5("/")("L")("I")("G")("/")(lig_code+3);@/ 965 load6("/")("L")("I")("G")("/")(">")(lig_code+7);@/ 966 load7("/")("L")("I")("G")("/")(">")(">")(lig_code+11);@/ 967 968 @ @<Enter the parameter names@>= 969 load5("S")("L")("A")("N")("T")(parameter_code+1);@/ 970 load5("S")("P")("A")("C")("E")(parameter_code+2);@/ 971 load7("S")("T")("R")("E")("T")("C")("H")(parameter_code+3);@/ 972 load6("S")("H")("R")("I")("N")("K")(parameter_code+4);@/ 973 load7("X")("H")("E")("I")("G")("H")("T")(parameter_code+5);@/ 974 load4("Q")("U")("A")("D")(parameter_code+6);@/ 975 load10("E")("X")("T")("R")("A")("S")("P")("A")("C")("E")(parameter_code+7);@/ 976 load4("N")("U")("M")("1")(parameter_code+8);@/ 977 load4("N")("U")("M")("2")(parameter_code+9);@/ 978 load4("N")("U")("M")("3")(parameter_code+10);@/ 979 load6("D")("E")("N")("O")("M")("1")(parameter_code+11);@/ 980 load6("D")("E")("N")("O")("M")("2")(parameter_code+12);@/ 981 load4("S")("U")("P")("1")(parameter_code+13);@/ 982 load4("S")("U")("P")("2")(parameter_code+14);@/ 983 load4("S")("U")("P")("3")(parameter_code+15);@/ 984 load4("S")("U")("B")("1")(parameter_code+16);@/ 985 load4("S")("U")("B")("2")(parameter_code+17);@/ 986 load7("S")("U")("P")("D")("R")("O")("P")(parameter_code+18);@/ 987 load7("S")("U")("B")("D")("R")("O")("P")(parameter_code+19);@/ 988 load6("D")("E")("L")("I")("M")("1")(parameter_code+20);@/ 989 load6("D")("E")("L")("I")("M")("2")(parameter_code+21);@/ 990 load10("A")("X")("I")("S")("H")("E")("I")("G")("H")("T")(parameter_code+22);@/ 991 load20("D")("E")("F")("A")("U")("L")("T")("R")("U")("L")("E")@/@t\hskip2em@> 992 ("T")("H")("I")("C")("K")("N")("E")("S")("S")(parameter_code+8);@/ 993 load13("B")("I")("G")("O")("P") 994 ("S")("P")("A")("C")("I")("N")("G")("1")(parameter_code+9);@/ 995 load13("B")("I")("G")("O")("P") 996 ("S")("P")("A")("C")("I")("N")("G")("2")(parameter_code+10);@/ 997 load13("B")("I")("G")("O")("P") 998 ("S")("P")("A")("C")("I")("N")("G")("3")(parameter_code+11);@/ 999 load13("B")("I")("G")("O")("P") 1000 ("S")("P")("A")("C")("I")("N")("G")("4")(parameter_code+12);@/ 1001 load13("B")("I")("G")("O")("P") 1002 ("S")("P")("A")("C")("I")("N")("G")("5")(parameter_code+13);@/ 1003 1004 @ When a left parenthesis has been scanned, the following routine 1005 is used to interpret the keyword that follows, and to store the 1006 equivalent value in |cur_code|. 1007 1008 @p procedure get_name; 1009 begin incr(loc); incr(level); {pass the left parenthesis} 1010 cur_char:=" "; 1011 while cur_char=" " do get_next; 1012 if (cur_char>")")or(cur_char<"(") then decr(loc); {back up one character} 1013 name_length:=0; get_keyword_char; {prepare to scan the name} 1014 while cur_char<>" " do 1015 begin if name_length=longest_name then cur_name[1]:="X" {force error} 1016 else incr(name_length); 1017 cur_name[name_length]:=cur_char; 1018 get_keyword_char; 1019 end; 1020 lookup; 1021 if name_ptr=0 then err_print('Sorry, I don''t know that property name'); 1022 @.Sorry, I don't know...@> 1023 cur_code:=equiv[name_ptr]; 1024 end; 1025 1026 @* Scanning numeric data. 1027 The next thing we need is a trio of subroutines to read the one-byte, 1028 four-byte, and real numbers that may appear as property values. 1029 These subroutines are careful to stick to numbers between $-2^{31}$ 1030 and $2^{31}-1$, inclusive, so that a computer with two's complement 1031 32-bit arithmetic will not be interrupted by overflow. 1032 1033 @ The first number scanner, which returns a one-byte value, surely has 1034 no problems of arithmetic overflow. 1035 1036 @p function get_byte:byte; {scans a one-byte property value} 1037 var acc:integer; {an accumulator} 1038 @!t:ASCII_code; {the type of value to be scanned} 1039 begin repeat get_next; 1040 until cur_char<>" "; {skip the blanks before the type code} 1041 t:=cur_char; acc:=0; 1042 repeat get_next; 1043 until cur_char<>" "; {skip the blanks after the type code} 1044 if t="C" then @<Scan an ASCII character code@> 1045 else if t="D" then @<Scan a small decimal number@> 1046 else if t="O" then @<Scan a small octal number@> 1047 else if t="H" then @<Scan a small hexadecimal number@> 1048 else if t="F" then @<Scan a face code@> 1049 else skip_error('You need "C" or "D" or "O" or "H" or "F" here'); 1050 @.You need "C" or "D" ...here@> 1051 cur_char:=" "; get_byte:=acc; 1052 end; 1053 1054 @ The |get_next| routine converts lower case to upper case, but it leaves 1055 the character in the buffer, so we can unconvert it. 1056 1057 @<Scan an ASCII...@>= 1058 if (cur_char>=@'41)and(cur_char<=@'176)and 1059 ((cur_char<"(")or(cur_char>")")) then 1060 acc:=xord[buffer[loc]] 1061 else skip_error('"C" value must be standard ASCII and not a paren') 1062 @:C value}\.{"C" value must be...@> 1063 1064 @ @<Scan a small dec...@>= 1065 begin while (cur_char>="0")and(cur_char<="9") do 1066 begin acc:=acc*10+cur_char-"0"; 1067 if acc>255 then 1068 begin skip_error('This value shouldn''t exceed 255'); 1069 @.This value shouldn't...@> 1070 acc:=0; cur_char:=" "; 1071 end 1072 else get_next; 1073 end; 1074 backup; 1075 end 1076 1077 @ @<Scan a small oct...@>= 1078 begin while (cur_char>="0")and(cur_char<="7") do 1079 begin acc:=acc*8+cur_char-"0"; 1080 if acc>255 then 1081 begin skip_error('This value shouldn''t exceed ''377'); 1082 @.This value shouldn't...@> 1083 acc:=0; cur_char:=" "; 1084 end 1085 else get_next; 1086 end; 1087 backup; 1088 end 1089 1090 @ @<Scan a small hex...@>= 1091 begin while ((cur_char>="0")and(cur_char<="9"))or 1092 ((cur_char>="A")and(cur_char<="F")) do 1093 begin if cur_char>="A" then cur_char:=cur_char+"0"+10-"A"; 1094 acc:=acc*16+cur_char-"0"; 1095 if acc>255 then 1096 begin skip_error('This value shouldn''t exceed "FF'); 1097 @.This value shouldn't...@> 1098 acc:=0; cur_char:=" "; 1099 end 1100 else get_next; 1101 end; 1102 backup; 1103 end 1104 1105 @ @<Scan a face...@>= 1106 begin if cur_char="B" then acc:=2 1107 else if cur_char="L" then acc:=4 1108 else if cur_char<>"M" then acc:=18; 1109 get_next; 1110 if cur_char="I" then incr(acc) 1111 else if cur_char<>"R" then acc:=18; 1112 get_next; 1113 if cur_char="C" then acc:=acc+6 1114 else if cur_char="E" then acc:=acc+12 1115 else if cur_char<>"R" then acc:=18; 1116 if acc>=18 then 1117 begin skip_error('Illegal face code, I changed it to MRR'); 1118 @.Illegal face code...@> 1119 acc:=0; 1120 end; 1121 end 1122 1123 @ The routine that scans a four-byte value puts its output into |cur_bytes|, 1124 which is a record containing (yes, you guessed it) four bytes. 1125 1126 @<Types...@>= 1127 @!four_bytes=record @!b0:byte;@+@!b1:byte;@+@!b2:byte;@+@!b3:byte;@+end; 1128 1129 @ @d c0==cur_bytes.b0 1130 @d c1==cur_bytes.b1 1131 @d c2==cur_bytes.b2 1132 @d c3==cur_bytes.b3 1133 1134 @<Glob...@>= 1135 @!cur_bytes:four_bytes; {a four-byte accumulator} 1136 1137 @ Since the |get_four_bytes| routine is used very infrequently, no attempt 1138 has been made to make it fast; we only want it to work. 1139 1140 @p procedure get_four_bytes; {scans an octal constant and sets |four_bytes|} 1141 var c:integer; {leading byte} 1142 @!r:integer; {radix} 1143 @!q:integer; {|256/r|} 1144 begin repeat get_next; 1145 until cur_char<>" "; {skip the blanks before the type code} 1146 r:=0; c0:=0; c1:=0; c2:=0; c3:=0; {start with the accumulator zero} 1147 if cur_char="H" then r:=16 1148 else if cur_char="O" then r:=8 1149 else skip_error('An octal ("O") or hex ("H") value is needed here'); 1150 @.An octal ("O") or hex ("H")...@> 1151 if r>0 then 1152 begin q:=256 div r; 1153 repeat get_next; 1154 until cur_char<>" "; {skip the blanks after the type code} 1155 while ((cur_char>="0")and(cur_char<="9"))or@| 1156 ((cur_char>="A")and(cur_char<="F")) do 1157 @<Multiply by |r|, add |cur_char-"0"|, and |get_next|@>; 1158 end; 1159 end; 1160 1161 @ @<Multiply by |r|...@>= 1162 begin if cur_char>="A" then cur_char:=cur_char+"0"+10-"A"; 1163 c:=(r*c0)+(c1 div q); 1164 if c>255 then 1165 begin c0:=0; c1:=0; c2:=0; c3:=0; 1166 if r=8 then 1167 skip_error('Sorry, the maximum octal value is O 37777777777') 1168 @.Sorry, the maximum octal...@> 1169 else skip_error('Sorry, the maximum hex value is H FFFFFFFF'); 1170 @.Sorry, the maximum hex...@> 1171 end 1172 else if cur_char>="0"+r then skip_error('Illegal digit') 1173 @.Illegal digit@> 1174 else begin c0:=c; 1175 c1:=(r*(c1 mod q))+(c2 div q); 1176 c2:=(r*(c2 mod q))+(c3 div q); 1177 c3:=(r*(c3 mod q))+cur_char-"0"; 1178 get_next; 1179 end; 1180 end 1181 1182 @ The remaining scanning routine is the most interesting. It scans a real 1183 constant and returns the nearest |fix_word| approximation to that constant. 1184 A |fix_word| is a 32-bit integer that represents a real value that 1185 has been multiplied by $2^{20}$. Since \.{PLtoTF} restricts the magnitude 1186 of reals to 2048, the |fix_word| will have a magnitude less than $2^{31}$. 1187 1188 @d unity==@'4000000 {$2^{20}$, the |fix_word| 1.0} 1189 1190 @<Types...@>= 1191 @!fix_word=integer; {a scaled real value with 20 bits of fraction} 1192 1193 @ When a real value is desired, we might as well treat `\.D' and `\.R' 1194 formats as if they were identical. 1195 1196 @p function get_fix:fix_word; {scans a real property value} 1197 var negative:boolean; {was there a minus sign?} 1198 @!acc:integer; {an accumulator} 1199 @!int_part:integer; {the integer part} 1200 @!j:0..7; {the number of decimal places stored} 1201 begin repeat get_next; 1202 until cur_char<>" "; {skip the blanks before the type code} 1203 negative:=false; acc:=0; {start with the accumulators zero} 1204 if (cur_char<>"R")and(cur_char<>"D") then 1205 skip_error('An "R" or "D" value is needed here') 1206 @.An "R" or "D" ... needed here@> 1207 else begin @<Scan the blanks and/or signs after the type code@>; 1208 while (cur_char>="0") and (cur_char<="9") do 1209 @<Multiply by 10, add |cur_char-"0"|, and |get_next|@>; 1210 int_part:=acc; acc:=0; 1211 if cur_char="." then @<Scan the fraction part and put it in |acc|@>; 1212 if (acc>=unity)and(int_part=2047) then 1213 skip_error('Real constants must be less than 2048') 1214 @.Real constants must be...@> 1215 else acc:=int_part*unity+acc; 1216 end; 1217 if negative then get_fix:=-acc@+else get_fix:=acc; 1218 end; 1219 1220 @ @<Scan the blanks...@>= 1221 repeat get_next; 1222 if cur_char="-" then 1223 begin cur_char:=" "; negative:=not negative; 1224 end 1225 else if cur_char="+" then cur_char:=" "; 1226 until cur_char<>" " 1227 1228 @ @<Multiply by 10...@>= 1229 begin acc:=acc*10+cur_char-"0"; 1230 if acc>=2048 then 1231 begin skip_error('Real constants must be less than 2048'); 1232 @.Real constants must be...@> 1233 acc:=0; cur_char:=" "; 1234 end 1235 else get_next; 1236 end 1237 1238 @ To scan the fraction $.d_1d_2\ldots\,$, we keep track of up to seven 1239 of the digits $d_j$. A correct result is obtained if we first compute 1240 $f^\prime=\lfloor 2^{21}(d_1\ldots d_j)/10^j\rfloor$, after which 1241 $f=\lfloor(f^\prime+1)/2\rfloor$. It is possible to have $f=1.0$. 1242 1243 @<Glob...@>= 1244 @!fraction_digits:array[1..7] of integer; {$2^{21}$ times $d_j$} 1245 1246 @ @<Scan the frac...@>= 1247 begin j:=0; get_next; 1248 while (cur_char>="0")and(cur_char<="9") do 1249 begin if j<7 then 1250 begin incr(j); fraction_digits[j]:=@'10000000*(cur_char-"0"); 1251 end; 1252 get_next; 1253 end; 1254 acc:=0; 1255 while j>0 do 1256 begin acc:=fraction_digits[j]+(acc div 10); decr(j); 1257 end; 1258 acc:=(acc+10) div 20; 1259 end 1260 1261 @* Storing the property values. 1262 When property values have been found, they are squirreled away in a bunch 1263 of arrays. The header information is unpacked into bytes in an array 1264 called |header_bytes|. The ligature/kerning program is stored in an array 1265 of type |four_bytes|. 1266 Another |four_bytes| array holds the specifications of extensible characters. 1267 The kerns and parameters are stored in separate arrays of |fix_word| values. 1268 1269 Instead of storing the design size in the header array, we will keep it 1270 in a |fix_word| variable until the last minute. The number of units in the 1271 design size is also kept in a |fix_word|. 1272 1273 @<Glob...@>= 1274 @!header_bytes:array[header_index] of byte; {the header block} 1275 @!header_ptr:header_index; {the number of header bytes in use} 1276 @!design_size:fix_word; {the design size} 1277 @!design_units:fix_word; {reciprocal of the scaling factor} 1278 @!seven_bit_safe_flag:boolean; {does the file claim to be seven-bit-safe?} 1279 @!lig_kern:array[0..max_lig_steps] of four_bytes; {the ligature program} 1280 @!nl:0..32767; {the number of ligature/kern instructions so far} 1281 @!min_nl:0..32767; {the final value of |nl| must be at least this} 1282 @!kern:array[0..max_kerns] of fix_word; {the distinct kerning amounts} 1283 @!nk:0..max_kerns; {the number of entries of |kern|} 1284 @!exten:array[0..255] of four_bytes; {extensible character specs} 1285 @!ne:0..256; {the number of extensible characters} 1286 @!param:array[1..max_param_words] of fix_word; {\.{FONTDIMEN} parameters} 1287 @!np:0..max_param_words; {the largest parameter set nonzero} 1288 @!check_sum_specified:boolean; {did the user name the check sum?} 1289 @!bchar:0..256; {the right boundary character, or 256 if unspecified} 1290 1291 @ @<Types...@>= 1292 @!header_index=0..max_header_bytes; 1293 @!indx=0..@'77777; 1294 1295 @ @<Local...@>= 1296 @!d:header_index; {an index into |header_bytes|} 1297 1298 @ We start by setting up the default values. 1299 1300 @d check_sum_loc=0 1301 @d design_size_loc=4 1302 @d coding_scheme_loc=8 1303 @d family_loc=coding_scheme_loc+40 1304 @d seven_flag_loc=family_loc+20 1305 @d face_loc=seven_flag_loc+3 1306 1307 @<Set init...@>= 1308 for d:=0 to 18*4-1 do header_bytes[d]:=0; 1309 header_bytes[8]:=11; header_bytes[9]:="U"; 1310 header_bytes[10]:="N"; 1311 header_bytes[11]:="S"; 1312 header_bytes[12]:="P"; 1313 header_bytes[13]:="E"; 1314 header_bytes[14]:="C"; 1315 header_bytes[15]:="I"; 1316 header_bytes[16]:="F"; 1317 header_bytes[17]:="I"; 1318 header_bytes[18]:="E"; 1319 header_bytes[19]:="D"; 1320 @.UNSPECIFIED@> 1321 for d:=family_loc to family_loc+11 do header_bytes[d]:=header_bytes[d-40]; 1322 design_size:=10*unity; design_units:=unity; seven_bit_safe_flag:=false;@/ 1323 header_ptr:=18*4; nl:=0; min_nl:=0; nk:=0; ne:=0; np:=0;@/ 1324 check_sum_specified:=false; bchar:=256; 1325 1326 @ Most of the dimensions, however, go into the |memory| array. There are 1327 at most 257 widths, 257 heights, 257 depths, and 257 italic corrections, 1328 since the value 0 is required but it need not be used. So |memory| has room 1329 for 1028 entries, each of which is a |fix_word|. An auxiliary table called 1330 |link| is used to link these words together in linear lists, so that 1331 sorting and other operations can be done conveniently. 1332 1333 We also add four ``list head'' words to the |memory| and |link| arrays; 1334 these are in locations |width| through |italic|, i.e., 1 through 4. 1335 For example, |link[height]| points to the smallest element in 1336 the sorted list of distinct heights that have appeared so far, and 1337 |memory[height]| is the number of distinct heights. 1338 1339 @d mem_size=1028+4 {number of nonzero memory addresses} 1340 1341 @<Types...@>= 1342 @!pointer=0..mem_size; {an index into memory} 1343 1344 @ The arrays |char_wd|, |char_ht|, |char_dp|, and |char_ic| contain 1345 pointers to the |memory| array entries where the corresponding dimensions 1346 appear. Two other arrays, |char_tag| and |char_remainder|, hold 1347 the other information that \.{TFM} files pack into a |char_info_word|. 1348 1349 @d no_tag=0 {vanilla character} 1350 @d lig_tag=1 {character has a ligature/kerning program} 1351 @d list_tag=2 {character has a successor in a charlist} 1352 @d ext_tag=3 {character is extensible} 1353 @d bchar_label==char_remainder[256] 1354 {beginning of ligature program for left boundary} 1355 1356 @<Glob...@>= 1357 @!memory:array[pointer] of fix_word; {character dimensions and kerns} 1358 @!mem_ptr:pointer; {largest |memory| word in use} 1359 @!link:array[pointer] of pointer; {to make lists of |memory| items} 1360 @!char_wd:array[byte] of pointer; {pointers to the widths} 1361 @!char_ht:array[byte] of pointer; {pointers to the heights} 1362 @!char_dp:array[byte] of pointer; {pointers to the depths} 1363 @!char_ic:array[byte] of pointer; {pointers to italic corrections} 1364 @!char_tag:array[byte] of no_tag..ext_tag; {character tags} 1365 @!char_remainder:array[0..256] of 0..65535; {pointers to ligature labels, 1366 next larger characters, or extensible characters} 1367 1368 @ @<Local...@>= 1369 @!c:byte; {runs through all character codes} 1370 1371 @ @<Set init...@>= 1372 bchar_label:=@'77777; 1373 for c:=0 to 255 do 1374 begin char_wd[c]:=0; char_ht[c]:=0; char_dp[c]:=0; char_ic[c]:=0;@/ 1375 char_tag[c]:=no_tag; char_remainder[c]:=0; 1376 end; 1377 memory[0]:=@'17777777777; {an ``infinite'' element at the end of the lists} 1378 memory[width]:=0; link[width]:=0; {width list is empty} 1379 memory[height]:=0; link[height]:=0; {height list is empty} 1380 memory[depth]:=0; link[depth]:=0; {depth list is empty} 1381 memory[italic]:=0; link[italic]:=0; {italic list is empty} 1382 mem_ptr:=italic; 1383 1384 @ As an example of these data structures, let us consider the simple 1385 routine that inserts a potentially new element into one of the dimension 1386 lists. The first parameter indicates the list head (i.e., |h=width| for 1387 the width list, etc.); the second parameter is the value that is to be 1388 inserted into the list if it is not already present. The procedure 1389 returns the value of the location where the dimension appears in |memory|. 1390 The fact that |memory[0]| is larger than any legal dimension makes the 1391 algorithm particularly short. 1392 1393 We do have to handle two somewhat subtle situations. A width of zero must be 1394 put into the list, so that a zero-width character in the font will not appear 1395 to be nonexistent (i.e., so that its |char_wd| index will not be zero), but 1396 this does not need to be done for heights, depths, or italic corrections. 1397 Furthermore, it is necessary to test for memory overflow even though we 1398 have provided room for the maximum number of different dimensions in any 1399 legal font, since the \.{PL} file might foolishly give any number of 1400 different sizes to the same character. 1401 1402 @p function sort_in(@!h:pointer;@!d:fix_word):pointer; {inserts into list} 1403 var p:pointer; {the current node of interest} 1404 begin if (d=0)and(h<>width) then sort_in:=0 1405 else begin p:=h; 1406 while d>=memory[link[p]] do p:=link[p]; 1407 if (d=memory[p])and(p<>h) then sort_in:=p 1408 else if mem_ptr=mem_size then 1409 begin err_print('Memory overflow: more than 1028 widths, etc'); 1410 @.Memory overflow...@> 1411 print_ln('Congratulations! It''s hard to make this error.'); 1412 sort_in:=p; 1413 end 1414 else begin incr(mem_ptr); memory[mem_ptr]:=d; 1415 link[mem_ptr]:=link[p]; link[p]:=mem_ptr; incr(memory[h]); 1416 sort_in:=mem_ptr; 1417 end; 1418 end; 1419 end; 1420 1421 @ When these lists of dimensions are eventually written to the \.{TFM} 1422 file, we may have to do some rounding of values, because the \.{TFM} file 1423 allows at most 256 widths, 16 heights, 16 depths, and 64 italic 1424 corrections. The following procedure takes a given list head |h| and a 1425 given dimension |d|, and returns the minimum $m$ such that the elements of 1426 the list can be covered by $m$ intervals of width $d$. It also sets 1427 |next_d| to the smallest value $d^\prime>d$ such that the covering found 1428 by this procedure would be different. In particular, if $d=0$ it computes 1429 the number of elements of the list, and sets |next_d| to the smallest 1430 distance between two list elements. (The covering by intervals of width 1431 |next_d| is not guaranteed to have fewer than $m$ elements, but in practice 1432 this seems to happen most of the time.) 1433 1434 @<Glob...@>= 1435 @!next_d:fix_word; {the next larger interval that is worth trying} 1436 1437 @ Once again we can make good use of the fact that |memory[0]| is ``infinite.'' 1438 1439 @p function min_cover(@!h:pointer;@!d:fix_word):integer; 1440 var p:pointer; {the current node of interest} 1441 @!l:fix_word; {the least element covered by the current interval} 1442 @!m:integer; {the current size of the cover being generated} 1443 begin m:=0; p:=link[h]; next_d:=memory[0]; 1444 while p<>0 do 1445 begin incr(m); l:=memory[p]; 1446 while memory[link[p]]<=l+d do p:=link[p]; 1447 p:=link[p]; 1448 if memory[p]-l<next_d then next_d:=memory[p]-l; 1449 end; 1450 min_cover:=m; 1451 end; 1452 1453 @ The following procedure uses |min_cover| to determine the smallest $d$ 1454 such that a given list can be covered with at most a given number of 1455 intervals. 1456 1457 @p function shorten(@!h:pointer;m:integer):fix_word; {finds best way to round} 1458 var d:fix_word; {the current trial interval length} 1459 @!k:integer; {the size of a minimum cover} 1460 begin if memory[h]>m then 1461 begin excess:=memory[h]-m; 1462 k:=min_cover(h,0); d:=next_d; {now the answer is at least |d|} 1463 repeat d:=d+d; k:=min_cover(h,d); 1464 until k<=m; {first we ascend rapidly until finding the range} 1465 d:=d div 2; k:=min_cover(h,d); {now we run through the feasible steps} 1466 while k>m do 1467 begin d:=next_d; k:=min_cover(h,d); 1468 end; 1469 shorten:=d; 1470 end 1471 else shorten:=0; 1472 end; 1473 1474 @ When we are nearly ready to output the \.{TFM} file, we will set 1475 |index[p]:=k| if the dimension in |memory[p]| is being rounded to the 1476 |k|th element of its list. 1477 1478 @<Glob...@>= 1479 @!index:array[pointer] of byte; 1480 @!excess:byte; {number of words to remove, if list is being shortened} 1481 1482 @ Here is the procedure that sets the |index| values. It also shortens 1483 the list so that there is only one element per covering interval; 1484 the remaining elements are the midpoints of their clusters. 1485 1486 @p procedure set_indices(@!h:pointer;@!d:fix_word); {reduces and indexes a list} 1487 var p:pointer; {the current node of interest} 1488 @!q:pointer; {trails one step behind |p|} 1489 @!m:byte; {index number of nodes in the current interval} 1490 @!l:fix_word; {least value in the current interval} 1491 begin q:=h; p:=link[q]; m:=0; 1492 while p<>0 do 1493 begin incr(m); l:=memory[p]; index[p]:=m; 1494 while memory[link[p]]<=l+d do 1495 begin p:=link[p]; index[p]:=m; decr(excess); 1496 if excess=0 then d:=0; 1497 end; 1498 link[q]:=p; memory[p]:=l+(memory[p]-l) div 2; q:=p; p:=link[p]; 1499 end; 1500 memory[h]:=m; 1501 end; 1502 1503 @* The input phase. 1504 We're ready now to read and parse the \.{PL} file, storing property 1505 values as we go. 1506 1507 @<Glob...@>= 1508 @!c:byte; {the current character or byte being processed} 1509 1510 @ @<Read all the input@>= 1511 cur_char:=" "; 1512 repeat while cur_char=" " do get_next; 1513 if cur_char="(" then @<Read a font property value@> 1514 else if (cur_char=")")and not input_has_ended then 1515 begin err_print('Extra right parenthesis'); 1516 incr(loc); cur_char:=" "; 1517 end 1518 @.Extra right parenthesis@> 1519 else if not input_has_ended then junk_error; 1520 until input_has_ended 1521 1522 @ The |junk_error| routine just referred to is called when something 1523 appears in the forbidden area between properties of a property list. 1524 1525 @p procedure junk_error; {gets past no man's land} 1526 begin err_print('There''s junk here that is not in parentheses'); 1527 @.There's junk here...@> 1528 skip_to_paren; 1529 end; 1530 1531 @ For each font property, we are supposed to read the data from the 1532 left parenthesis that is the current value of |cur_char| to the right 1533 parenthesis that matches it in the input. The main complication is 1534 to recover with reasonable grace from various error conditions that might arise. 1535 1536 @<Read a font property value@>= 1537 begin get_name; 1538 if cur_code=comment_code then skip_to_end_of_item 1539 else if cur_code>character_code then 1540 flush_error('This property name doesn''t belong on the outer level') 1541 @.This property name doesn't belong...@> 1542 else begin @<Read the font property value specified by |cur_code|@>; 1543 finish_the_property; 1544 end; 1545 end 1546 1547 @ @<Read the font property value spec...@>= 1548 case cur_code of 1549 check_sum_code: begin check_sum_specified:=true; read_four_bytes(check_sum_loc); 1550 end; 1551 design_size_code: @<Read the design size@>; 1552 design_units_code: @<Read the design units@>; 1553 coding_scheme_code: read_BCPL(coding_scheme_loc,40); 1554 family_code: read_BCPL(family_loc,20); 1555 face_code:header_bytes[face_loc]:=get_byte; 1556 seven_bit_safe_flag_code: @<Read the seven-bit-safe flag@>; 1557 header_code: @<Read an indexed header word@>; 1558 font_dimen_code: @<Read font parameter list@>; 1559 lig_table_code: read_lig_kern; 1560 boundary_char_code: bchar:=get_byte; 1561 character_code: read_char_info; 1562 end 1563 1564 @ The |case| statement just given makes use of two subroutines that we 1565 haven't defined yet. The first of these puts a 32-bit octal quantity 1566 into four specified bytes of the header block. 1567 1568 @p procedure read_four_bytes(l:header_index); 1569 begin get_four_bytes; 1570 header_bytes[l]:=c0; 1571 header_bytes[l+1]:=c1; 1572 header_bytes[l+2]:=c2; 1573 header_bytes[l+3]:=c3; 1574 end; 1575 1576 @ The second little procedure is used to scan a string and to store it in 1577 the ``{\mc BCPL} format'' required by \.{TFM} files. The string is supposed 1578 to contain at most |n| bytes, including the first byte (which holds the 1579 length of the rest of the string). 1580 1581 @p procedure read_BCPL(l:header_index;n:byte); 1582 var k:header_index; 1583 begin k:=l; 1584 while cur_char=" " do get_next; 1585 while (cur_char<>"(")and(cur_char<>")") do 1586 begin if k<l+n then incr(k); 1587 if k<l+n then header_bytes[k]:=cur_char; 1588 get_next; 1589 end; 1590 if k=l+n then 1591 begin err_print('String is too long; its first ',n-1:1, 1592 @.String is too long...@> 1593 ' characters will be kept'); decr(k); 1594 end; 1595 header_bytes[l]:=k-l; 1596 while k<l+n-1 do {tidy up the remaining bytes by setting them to nulls} 1597 begin incr(k); header_bytes[k]:=0; 1598 end; 1599 end; 1600 1601 @ @<Read the design size@>= 1602 begin next_d:=get_fix; 1603 if next_d<unity then 1604 err_print('The design size must be at least 1') 1605 @.The design size must...@> 1606 else design_size:=next_d; 1607 end 1608 1609 @ @<Read the design units@>= 1610 begin next_d:=get_fix; 1611 if next_d<=0 then 1612 err_print('The number of units per design size must be positive') 1613 @.The number of units...@> 1614 else design_units:=next_d; 1615 end 1616 1617 @ @<Read the seven-bit-safe...@>= 1618 begin while cur_char=" " do get_next; 1619 if cur_char="T" then seven_bit_safe_flag:=true 1620 else if cur_char="F" then seven_bit_safe_flag:=false 1621 else err_print('The flag value should be "TRUE" or "FALSE"'); 1622 @.The flag value should be...@> 1623 skip_to_paren; 1624 end 1625 1626 @ @<Read an indexed header word@>= 1627 begin c:=get_byte; 1628 if c<18 then skip_error('HEADER indices should be 18 or more') 1629 @.HEADER indices...@> 1630 else if 4*c+4>max_header_bytes then 1631 skip_error('This HEADER index is too big for my present table size') 1632 @.This HEADER index is too big...@> 1633 else begin while header_ptr<4*c+4 do 1634 begin header_bytes[header_ptr]:=0; incr(header_ptr); 1635 end; 1636 read_four_bytes(4*c); 1637 end; 1638 end 1639 1640 @ The remaining kinds of font property values that need to be read are 1641 those that involve property lists on higher levels. Each of these has a 1642 loop similar to the one that was used at level zero. Then we put the 1643 right parenthesis back so that `|finish_the_property|' will be happy; 1644 there is probably a more elegant way to do this. 1645 1646 @d finish_inner_property_list==begin decr(loc); incr(level); cur_char:=")"; 1647 end 1648 1649 @<Read font parameter list@>= 1650 begin while level=1 do 1651 begin while cur_char=" " do get_next; 1652 if cur_char="(" then @<Read a parameter value@> 1653 else if cur_char=")" then skip_to_end_of_item 1654 else junk_error; 1655 end; 1656 finish_inner_property_list; 1657 end 1658 1659 @ @<Read a parameter value@>= 1660 begin get_name; 1661 if cur_code=comment_code then skip_to_end_of_item 1662 else if (cur_code<parameter_code)or(cur_code>=char_wd_code) then 1663 flush_error('This property name doesn''t belong in a FONTDIMEN list') 1664 @.This property name doesn't belong...@> 1665 else begin if cur_code=parameter_code then c:=get_byte 1666 else c:=cur_code-parameter_code; 1667 if c=0 then flush_error('PARAMETER index must not be zero') 1668 @.PARAMETER index must not...@> 1669 else if c>max_param_words then 1670 flush_error('This PARAMETER index is too big for my present table size') 1671 @.This PARAMETER index is too big...@> 1672 else begin while np<c do 1673 begin incr(np); param[np]:=0; 1674 end; 1675 param[c]:=get_fix; 1676 finish_the_property; 1677 end; 1678 end; 1679 end 1680 1681 @ @<Read ligature/kern list@>= 1682 begin lk_step_ended:=false; 1683 while level=1 do 1684 begin while cur_char=" " do get_next; 1685 if cur_char="(" then @<Read a ligature/kern command@> 1686 else if cur_char=")" then skip_to_end_of_item 1687 else junk_error; 1688 end; 1689 finish_inner_property_list; 1690 end 1691 1692 @ @<Read a ligature/kern command@>= 1693 begin get_name; 1694 if cur_code=comment_code then skip_to_end_of_item 1695 else if cur_code<label_code then 1696 flush_error('This property name doesn''t belong in a LIGTABLE list') 1697 @.This property name doesn't belong...@> 1698 else begin case cur_code of 1699 label_code:@<Read a label step@>; 1700 stop_code:@<Read a stop step@>; 1701 skip_code:@<Read a skip step@>; 1702 krn_code:@<Read a kerning step@>; 1703 lig_code,lig_code+1,lig_code+2,lig_code+3,lig_code+5,lig_code+6,lig_code+7, 1704 lig_code+11:@<Read a ligature step@>; 1705 end; {there are no other cases |>=label_code|} 1706 finish_the_property; 1707 end; 1708 end 1709 1710 @ When a character is about to be tagged, we call the following 1711 procedure so that an error message is given in case of multiple tags. 1712 1713 @p procedure check_tag(c:byte); {print error if |c| already tagged} 1714 begin case char_tag[c] of 1715 no_tag: do_nothing; 1716 lig_tag: err_print('This character already appeared in a LIGTABLE LABEL'); 1717 @.This character already...@> 1718 list_tag: err_print('This character already has a NEXTLARGER spec'); 1719 ext_tag: err_print('This character already has a VARCHAR spec'); 1720 end; 1721 end; 1722 1723 @ @<Read a label step@>= 1724 begin while cur_char=" " do get_next; 1725 if cur_char="B" then 1726 begin bchar_label:=nl; skip_to_paren; {\.{LABEL BOUNDARYCHAR}} 1727 end 1728 else begin backup; c:=get_byte; 1729 check_tag(c); char_tag[c]:=lig_tag; char_remainder[c]:=nl; 1730 end; 1731 if min_nl<=nl then min_nl:=nl+1; 1732 lk_step_ended:=false; 1733 end 1734 1735 @ @d stop_flag=128 {value indicating `\.{STOP}' in a lig/kern program} 1736 @d kern_flag=128 {op code for a kern step} 1737 1738 @<Globals...@>= 1739 @!lk_step_ended:boolean; 1740 {was the last \.{LIGTABLE} property \.{LIG} or \.{KRN}?} 1741 @!krn_ptr:0..max_kerns; {an index into |kern|} 1742 1743 @ @<Read a stop step@>= 1744 if not lk_step_ended then 1745 err_print('STOP must follow LIG or KRN') 1746 @.STOP must follow LIG or KRN@> 1747 else begin lig_kern[nl-1].b0:=stop_flag; lk_step_ended:=false; 1748 end 1749 1750 @ @<Read a skip step@>= 1751 if not lk_step_ended then 1752 err_print('SKIP must follow LIG or KRN') 1753 @.SKIP must follow LIG or KRN@> 1754 else begin c:=get_byte; 1755 if c>=128 then err_print('Maximum SKIP amount is 127') 1756 @.Maximum SKIP amount...@> 1757 else if nl+c>=max_lig_steps then 1758 err_print('Sorry, LIGTABLE too long for me to handle') 1759 @.Sorry, LIGTABLE too long...@> 1760 else begin lig_kern[nl-1].b0:=c; 1761 if min_nl<=nl+c then min_nl:=nl+c+1; 1762 end; 1763 lk_step_ended:=false; 1764 end 1765 1766 @ @<Read a ligature step@>= 1767 begin lig_kern[nl].b0:=0; 1768 lig_kern[nl].b2:=cur_code-lig_code; 1769 lig_kern[nl].b1:=get_byte; 1770 lig_kern[nl].b3:=get_byte; 1771 if nl>=max_lig_steps-1 then 1772 err_print('Sorry, LIGTABLE too long for me to handle') 1773 @.Sorry, LIGTABLE too long...@> 1774 else incr(nl); 1775 lk_step_ended:=true; 1776 end 1777 1778 @ @<Read a kerning step@>= 1779 begin lig_kern[nl].b0:=0; lig_kern[nl].b1:=get_byte; 1780 kern[nk]:=get_fix; krn_ptr:=0; 1781 while kern[krn_ptr]<>kern[nk] do incr(krn_ptr); 1782 if krn_ptr=nk then 1783 begin if nk<max_kerns then incr(nk) 1784 else begin err_print('Sorry, too many different kerns for me to handle'); 1785 @.Sorry, too many different kerns...@> 1786 decr(krn_ptr); 1787 end; 1788 end; 1789 lig_kern[nl].b2:=kern_flag+(krn_ptr div 256); 1790 lig_kern[nl].b3:=krn_ptr mod 256; 1791 if nl>=max_lig_steps-1 then 1792 err_print('Sorry, LIGTABLE too long for me to handle') 1793 @.Sorry, LIGTABLE too long...@> 1794 else incr(nl); 1795 lk_step_ended:=true; 1796 end 1797 1798 @ Finally we come to the part of \.{PLtoTF}'s input mechanism 1799 that is used most, the processing of individual character data. 1800 1801 @<Read character info list@>= 1802 begin c:=get_byte; {read the character code that is being specified} 1803 @<Print |c| in octal notation@>; 1804 while level=1 do 1805 begin while cur_char=" " do get_next; 1806 if cur_char="(" then @<Read a character property@> 1807 else if cur_char=")" then skip_to_end_of_item 1808 else junk_error; 1809 end; 1810 if char_wd[c]=0 then char_wd[c]:=sort_in(width,0); {legitimatize |c|} 1811 finish_inner_property_list; 1812 end 1813 1814 @ @<Read a character prop...@>= 1815 begin get_name; 1816 if cur_code=comment_code then skip_to_end_of_item 1817 else if (cur_code<char_wd_code)or(cur_code>var_char_code) then 1818 flush_error('This property name doesn''t belong in a CHARACTER list') 1819 @.This property name doesn't belong...@> 1820 else begin case cur_code of 1821 char_wd_code:char_wd[c]:=sort_in(width,get_fix); 1822 char_ht_code:char_ht[c]:=sort_in(height,get_fix); 1823 char_dp_code:char_dp[c]:=sort_in(depth,get_fix); 1824 char_ic_code:char_ic[c]:=sort_in(italic,get_fix); 1825 next_larger_code:begin check_tag(c); char_tag[c]:=list_tag; 1826 char_remainder[c]:=get_byte; 1827 end; 1828 var_char_code:@<Read an extensible recipe for |c|@>; 1829 end;@/ 1830 finish_the_property; 1831 end; 1832 end 1833 1834 @ @<Read an extensible r...@>= 1835 begin if ne=256 then 1836 err_print('At most 256 VARCHAR specs are allowed') 1837 @.At most 256 VARCHAR specs...@> 1838 else begin check_tag(c); char_tag[c]:=ext_tag; char_remainder[c]:=ne;@/ 1839 exten[ne].b0:=0; exten[ne].b1:=0; exten[ne].b2:=0; exten[ne].b3:=0; 1840 while level=2 do 1841 begin while cur_char=" " do get_next; 1842 if cur_char="(" then @<Read an extensible piece@> 1843 else if cur_char=")" then skip_to_end_of_item 1844 else junk_error; 1845 end; 1846 incr(ne); 1847 finish_inner_property_list; 1848 end; 1849 end 1850 1851 @ @<Read an extensible p...@>= 1852 begin get_name; 1853 if cur_code=comment_code then skip_to_end_of_item 1854 else if (cur_code<var_char_code+1)or(cur_code>var_char_code+4) then 1855 flush_error('This property name doesn''t belong in a VARCHAR list') 1856 @.This property name doesn't belong...@> 1857 else begin case cur_code-(var_char_code+1) of 1858 0:exten[ne].b0:=get_byte; 1859 1:exten[ne].b1:=get_byte; 1860 2:exten[ne].b2:=get_byte; 1861 3:exten[ne].b3:=get_byte; 1862 end;@/ 1863 finish_the_property; 1864 end; 1865 end 1866 1867 @ The input routine is now complete except for the following code, 1868 which prints a progress report as the file is being read. 1869 1870 @p procedure print_octal(c:byte); {prints three octal digits} 1871 begin print('''',(c div 64):1,((c div 8) mod 8):1,(c mod 8):1); 1872 end; 1873 1874 @ @<Print |c| in octal...@>= 1875 begin if chars_on_line=8 then 1876 begin print_ln(' '); chars_on_line:=1; 1877 end 1878 else begin if chars_on_line>0 then print(' '); 1879 incr(chars_on_line); 1880 end; 1881 print_octal(c); {progress report} 1882 end 1883 1884 @* The checking and massaging phase. 1885 Once the whole \.{PL} file has been read in, we must check it for consistency 1886 and correct any errors. This process consists mainly of running through 1887 the characters that exist and seeing if they refer to characters that 1888 don't exist. We also compute the true value of |seven_unsafe|; we make sure 1889 that the charlists and ligature programs contain no loops; and we 1890 shorten the lists of widths, heights, depths, and italic corrections, 1891 if necessary, to keep from exceeding the required maximum sizes. 1892 1893 @<Glob...@>= 1894 @!seven_unsafe:boolean; {do seven-bit characters generate eight-bit ones?} 1895 1896 @ @<Correct and check the information@>= 1897 if nl>0 then @<Make sure the ligature/kerning program ends appropriately@>; 1898 seven_unsafe:=false; 1899 for c:=0 to 255 do if char_wd[c]<>0 then 1900 @<For all characters |g| generated by |c|, 1901 make sure that |char_wd[g]| is nonzero, and 1902 set |seven_unsafe| if |c<128<=g|@>; 1903 if bchar_label<@'77777 then 1904 begin c:=256; @<Check ligature program of |c|@>; 1905 end; 1906 if seven_bit_safe_flag and seven_unsafe then 1907 print_ln('The font is not really seven-bit-safe!'); 1908 @.The font is not...safe@> 1909 @<Check for infinite ligature loops@>; 1910 @<Doublecheck the lig/kern commands and the extensible recipes@>; 1911 for c:=0 to 255 do 1912 @<Make sure that |c| is not the largest element of a charlist cycle@>; 1913 @<Put the width, height, depth, and italic lists into final form@> 1914 1915 @ The checking that we need in several places is accomplished by three 1916 macros that are only slightly tricky. 1917 1918 @d existence_tail(#)==begin char_wd[g]:=sort_in(width,0); 1919 print(#,' '); print_octal(c); 1920 print_ln(' had no CHARACTER spec.'); 1921 end; 1922 end 1923 @d check_existence_and_safety(#)==begin g:=#; 1924 if (g>=128)and(c<128) then seven_unsafe:=true; 1925 if char_wd[g]=0 then existence_tail 1926 @d check_existence(#)==begin g:=#; 1927 if char_wd[g]=0 then existence_tail 1928 1929 @<For all characters |g| generated by |c|...@>= 1930 case char_tag[c] of 1931 no_tag: do_nothing; 1932 lig_tag: @<Check ligature program of |c|@>; 1933 list_tag: check_existence_and_safety(char_remainder[c]) 1934 ('The character NEXTLARGER than'); 1935 @.The character NEXTLARGER...@> 1936 ext_tag:@<Check the pieces of |exten[c]|@>; 1937 end 1938 1939 @ @<Check the pieces...@>= 1940 begin if exten[char_remainder[c]].b0>0 then 1941 check_existence_and_safety(exten[char_remainder[c]].b0) 1942 ('TOP piece of character'); 1943 @.TOP piece of character...@> 1944 if exten[char_remainder[c]].b1>0 then 1945 check_existence_and_safety(exten[char_remainder[c]].b1) 1946 ('MID piece of character'); 1947 @.MID piece of character...@> 1948 if exten[char_remainder[c]].b2>0 then 1949 check_existence_and_safety(exten[char_remainder[c]].b2) 1950 ('BOT piece of character'); 1951 @.BOT piece of character...@> 1952 check_existence_and_safety(exten[char_remainder[c]].b3) 1953 ('REP piece of character'); 1954 @.REP piece of character...@> 1955 end 1956 1957 @ @<Make sure that |c| is not the largest element of a charlist cycle@>= 1958 if char_tag[c]=list_tag then 1959 begin g:=char_remainder[c]; 1960 while (g<c)and(char_tag[g]=list_tag) do g:=char_remainder[g]; 1961 if g=c then 1962 begin char_tag[c]:=no_tag; 1963 print('A cycle of NEXTLARGER characters has been broken at '); 1964 @.A cycle of NEXTLARGER...@> 1965 print_octal(c); print_ln('.'); 1966 end; 1967 end 1968 1969 @ @<Glob...@>= 1970 @!delta:fix_word; {size of the intervals needed for rounding} 1971 1972 @ @d round_message(#)==if delta>0 then print_ln('I had to round some ', 1973 @.I had to round...@> 1974 #,'s by ',(((delta+1) div 2)/@'4000000):1:7,' units.') 1975 1976 @<Put the width, height, depth, and italic lists into final form@>= 1977 delta:=shorten(width,255); set_indices(width,delta); round_message('width');@/ 1978 delta:=shorten(height,15); set_indices(height,delta); round_message('height');@/ 1979 delta:=shorten(depth,15); set_indices(depth,delta); round_message('depth');@/ 1980 delta:=shorten(italic,63); set_indices(italic,delta); 1981 round_message('italic correction'); 1982 1983 @ @d clear_lig_kern_entry== {make an unconditional \.{STOP}} 1984 lig_kern[nl].b0:=255; lig_kern[nl].b1:=0; 1985 lig_kern[nl].b2:=0; lig_kern[nl].b3:=0 1986 1987 @<Make sure the ligature/kerning program ends...@>= 1988 begin if bchar_label<@'77777 then {make room for it} 1989 begin clear_lig_kern_entry; incr(nl); 1990 end; {|bchar_label| will be stored later} 1991 while min_nl>nl do 1992 begin clear_lig_kern_entry; incr(nl); 1993 end; 1994 if lig_kern[nl-1].b0=0 then lig_kern[nl-1].b0:=stop_flag; 1995 end 1996 1997 @ It's not trivial to check for infinite loops generated by repeated 1998 insertion of ligature characters. But fortunately there is a nice 1999 algorithm for such testing, copied here from the program \.{TFtoPL} 2000 where it is explained further. 2001 2002 @d simple=0 {$f(x,y)=z$} 2003 @d left_z=1 {$f(x,y)=f(z,y)$} 2004 @d right_z=2 {$f(x,y)=f(x,z)$} 2005 @d both_z=3 {$f(x,y)=f(f(x,z),y)$} 2006 @d pending=4 {$f(x,y)$ is being evaluated} 2007 2008 2009 @ @<Glo...@>= 2010 @!lig_ptr:0..max_lig_steps; {an index into |lig_kern|} 2011 @!hash:array[0..hash_size] of 0..66048; {$256x+y+1$ for $x\le257$ and $y\le255$} 2012 @!class:array[0..hash_size] of simple..pending; 2013 @!lig_z:array[0..hash_size] of 0..257; 2014 @!hash_ptr:0..hash_size; {the number of nonzero entries in |hash|} 2015 @!hash_list:array[0..hash_size] of 0..hash_size; {list of those nonzero entries} 2016 @!h:0..hash_size; {index into the hash table} 2017 @!tt:indx; {temporary register} 2018 @!x_lig_cycle,@!y_lig_cycle:0..256; {problematic ligature pair} 2019 2020 @ @<Set init...@>= 2021 hash_ptr:=0; y_lig_cycle:=256; 2022 for k:=0 to hash_size do hash[k]:=0; 2023 2024 @ @d lig_exam==lig_kern[lig_ptr].b1 2025 @d lig_gen==lig_kern[lig_ptr].b3 2026 2027 @<Check lig...@>= 2028 begin lig_ptr:=char_remainder[c]; 2029 repeat if hash_input(lig_ptr,c) then 2030 begin if lig_kern[lig_ptr].b2<kern_flag then 2031 begin if lig_exam<>bchar then 2032 check_existence(lig_exam)('LIG character examined by'); 2033 @.LIG character examined...@> 2034 check_existence(lig_gen)('LIG character generated by'); 2035 @.LIG character generated...@> 2036 if lig_gen>=128 then if(c<128)or(c=256) then 2037 if(lig_exam<128)or(lig_exam=bchar) then seven_unsafe:=true; 2038 end 2039 else if lig_exam<>bchar then 2040 check_existence(lig_exam)('KRN character examined by'); 2041 @.KRN character examined...@> 2042 end; 2043 if lig_kern[lig_ptr].b0>=stop_flag then lig_ptr:=nl 2044 else lig_ptr:=lig_ptr+1+lig_kern[lig_ptr].b0; 2045 until lig_ptr>=nl; 2046 end 2047 2048 @ The |hash_input| procedure is copied from \.{TFtoPL}, but it is made 2049 into a boolean function that returns |false| if the ligature command 2050 was masked by a previous one. 2051 2052 @p function hash_input(@!p,@!c:indx):boolean; 2053 {enter data for character |c| and command in location |p|, unless it isn't new} 2054 label 30; {go here for a quick exit} 2055 var @!cc:simple..both_z; {class of data being entered} 2056 @!zz:0..255; {function value or ligature character being entered} 2057 @!y:0..255; {the character after the cursor} 2058 @!key:integer; {value to be stored in |hash|} 2059 @!t:integer; {temporary register for swapping} 2060 begin if hash_ptr=hash_size then 2061 begin hash_input:=false; goto 30;@+end; 2062 @<Compute the command parameters |y|, |cc|, and |zz|@>; 2063 key:=256*c+y+1; h:=(1009*key) mod hash_size; 2064 while hash[h]>0 do 2065 begin if hash[h]<=key then 2066 begin if hash[h]=key then 2067 begin hash_input:=false; goto 30; {unused ligature command} 2068 end; 2069 t:=hash[h]; hash[h]:=key; key:=t; {do ordered-hash-table insertion} 2070 t:=class[h]; class[h]:=cc; cc:=t; {namely, do a swap} 2071 t:=lig_z[h]; lig_z[h]:=zz; zz:=t; 2072 end; 2073 if h>0 then decr(h)@+else h:=hash_size; 2074 end; 2075 hash[h]:=key; class[h]:=cc; lig_z[h]:=zz; 2076 incr(hash_ptr); hash_list[hash_ptr]:=h; 2077 hash_input:=true; 2078 30:end; 2079 2080 @ @<Compute the command param...@>= 2081 y:=lig_kern[p].b1; t:=lig_kern[p].b2; cc:=simple; 2082 zz:=lig_kern[p].b3; 2083 if t>=kern_flag then zz:=y 2084 else begin case t of 2085 0,6:do_nothing; {\.{LIG},\.{/LIG>}} 2086 5,11:zz:=y; {\.{LIG/>}, \.{/LIG/>>}} 2087 1,7:cc:=left_z; {\.{LIG/}, \.{/LIG/>}} 2088 2:cc:=right_z; {\.{/LIG}} 2089 3:cc:=both_z; {\.{/LIG/}} 2090 end; {there are no other cases} 2091 end 2092 2093 @ (More good stuff from \.{TFtoPL}.) 2094 2095 @p function f(@!h,@!x,@!y:indx):indx; forward;@t\2@> 2096 {compute $f$ for arguments known to be in |hash[h]|} 2097 function eval(@!x,@!y:indx):indx; {compute $f(x,y)$ with hashtable lookup} 2098 var @!key:integer; {value sought in hash table} 2099 begin key:=256*x+y+1; h:=(1009*key) mod hash_size; 2100 while hash[h]>key do 2101 if h>0 then decr(h)@+else h:=hash_size; 2102 if hash[h]<key then eval:=y {not in ordered hash table} 2103 else eval:=f(h,x,y); 2104 end; 2105 2106 @ Pascal's beastly convention for |forward| declarations prevents us from 2107 saying |function f(h,x,y:indx):indx| here. 2108 2109 @p function f; 2110 begin case class[h] of 2111 simple: do_nothing; 2112 left_z: begin class[h]:=pending; lig_z[h]:=eval(lig_z[h],y); class[h]:=simple; 2113 end; 2114 right_z: begin class[h]:=pending; lig_z[h]:=eval(x,lig_z[h]); class[h]:=simple; 2115 end; 2116 both_z: begin class[h]:=pending; lig_z[h]:=eval(eval(x,lig_z[h]),y); 2117 class[h]:=simple; 2118 end; 2119 pending: begin x_lig_cycle:=x; y_lig_cycle:=y; lig_z[h]:=257; class[h]:=simple; 2120 end; {the value 257 will break all cycles, since it's not in |hash|} 2121 end; {there are no other cases} 2122 f:=lig_z[h]; 2123 end; 2124 2125 @ @<Check for infinite...@>= 2126 if hash_ptr<hash_size then for hh:=1 to hash_ptr do 2127 begin tt:=hash_list[hh]; 2128 if class[tt]>simple then {make sure $f$ is well defined} 2129 tt:=f(tt,(hash[tt]-1)div 256,(hash[tt]-1)mod 256); 2130 end; 2131 if(hash_ptr=hash_size)or(y_lig_cycle<256) then 2132 begin if hash_ptr<hash_size then 2133 begin print('Infinite ligature loop starting with '); 2134 @.Infinite ligature loop...@> 2135 if x_lig_cycle=256 then print('boundary')@+else print_octal(x_lig_cycle); 2136 print(' and '); print_octal(y_lig_cycle); print_ln('!'); 2137 end 2138 else print_ln('Sorry, I haven''t room for so many ligature/kern pairs!'); 2139 @.Sorry, I haven't room...@> 2140 print_ln('All ligatures will be cleared.'); 2141 for c:=0 to 255 do if char_tag[c]=lig_tag then 2142 begin char_tag[c]:=no_tag; char_remainder[c]:=0; 2143 end; 2144 nl:=0; bchar:=256; bchar_label:=@'77777; 2145 end 2146 2147 @ The lig/kern program may still contain references to nonexistent characters, 2148 if parts of that program are never used. Similarly, there may be extensible 2149 characters that are never used, because they were overridden by 2150 \.{NEXTLARGER}, say. This would produce an invalid \.{TFM} file; so we 2151 must fix such errors. 2152 2153 @d double_check_tail(#)==@t\1@>if char_wd[0]=0 2154 then char_wd[0]:=sort_in(width,0); 2155 print('Unused ',#,' refers to nonexistent character '); 2156 print_octal(c); print_ln('!'); 2157 end; 2158 end 2159 @d double_check_lig(#)==begin c:=lig_kern[lig_ptr].#; 2160 if char_wd[c]=0 then if c<>bchar then 2161 begin lig_kern[lig_ptr].#:=0; double_check_tail 2162 @d double_check_ext(#)==begin c:=exten[g].#; 2163 if c>0 then if char_wd[c]=0 then 2164 begin exten[g].#:=0; double_check_tail 2165 @d double_check_rep(#)==begin c:=exten[g].#; 2166 if char_wd[c]=0 then 2167 begin exten[g].#:=0; double_check_tail 2168 2169 @<Doublecheck...@>= 2170 if nl>0 then for lig_ptr:=0 to nl-1 do 2171 if lig_kern[lig_ptr].b2<kern_flag then 2172 begin if lig_kern[lig_ptr].b0<255 then 2173 begin double_check_lig(b1)('LIG step'); double_check_lig(b3)('LIG step'); 2174 end; 2175 end 2176 else double_check_lig(b1)('KRN step'); 2177 @.Unused LIG step...@> 2178 @.Unused KRN step...@> 2179 if ne>0 then for g:=0 to ne-1 do 2180 begin double_check_ext(b0)('VARCHAR TOP'); 2181 double_check_ext(b1)('VARCHAR MID'); 2182 double_check_ext(b2)('VARCHAR BOT'); 2183 double_check_rep(b3)('VARCHAR REP'); 2184 @.Unused VARCHAR...@> 2185 end 2186 2187 @* The output phase. 2188 Now that we know how to get all of the font data correctly stored in 2189 \.{PLtoTF}'s memory, it only remains to write the answers out. 2190 2191 First of all, it is convenient to have an abbreviation for output to the 2192 \.{TFM} file: 2193 2194 @d out(#)==write(tfm_file,#) 2195 2196 @ The general plan for producing \.{TFM} files is long but simple: 2197 2198 @<Do the output@>= 2199 @<Compute the twelve subfile sizes@>; 2200 @<Output the twelve subfile sizes@>; 2201 @<Output the header block@>; 2202 @<Output the character info@>; 2203 @<Output the dimensions themselves@>; 2204 @<Output the ligature/kern program@>; 2205 @<Output the extensible character recipes@>; 2206 @<Output the parameters@> 2207 2208 @ A \.{TFM} file begins with 12 numbers that tell how big its subfiles are. 2209 We already know most of these numbers; for example, the number of distinct 2210 widths is |memory[width]+1|, where the $+1$ accounts for the zero width that 2211 is always supposed to be present. But we still should compute the beginning 2212 and ending character codes (|bc| and |ec|), the number of header words (|lh|), 2213 and the total number of words in the \.{TFM} file (|lf|). 2214 2215 @<Gl...@>= 2216 @!bc:byte; {the smallest character code in the font} 2217 @!ec:byte; {the largest character code in the font} 2218 @!lh:byte; {the number of words in the header block} 2219 @!lf:0..32767; {the number of words in the entire \.{TFM} file} 2220 @!not_found:boolean; {has a font character been found?} 2221 @!temp_width:fix_word; {width being used to compute a check sum} 2222 2223 @ It might turn out that no characters exist at all. But \.{PLtoTF} keeps 2224 going and writes the \.{TFM} anyway. In this case |ec| will be~0 and |bc| 2225 will be~1. 2226 2227 @<Compute the twelve...@>= 2228 lh:=header_ptr div 4;@/ 2229 not_found:=true; bc:=0; 2230 while not_found do 2231 if (char_wd[bc]>0)or(bc=255) then not_found:=false 2232 else incr(bc); 2233 not_found:=true; ec:=255; 2234 while not_found do 2235 if (char_wd[ec]>0)or(ec=0) then not_found:=false 2236 else decr(ec); 2237 if bc>ec then bc:=1; 2238 incr(memory[width]); incr(memory[height]); incr(memory[depth]); 2239 incr(memory[italic]);@/ 2240 @<Compute the ligature/kern program offset@>; 2241 lf:=6+lh+(ec-bc+1)+memory[width]+memory[height]+memory[depth]+ 2242 memory[italic]+nl+lk_offset+nk+ne+np; 2243 2244 @ @d out_size(#)==out((#) div 256); out((#) mod 256) 2245 2246 @<Output the twelve subfile sizes@>= 2247 out_size(lf); out_size(lh); out_size(bc); out_size(ec); 2248 out_size(memory[width]); out_size(memory[height]); 2249 out_size(memory[depth]); out_size(memory[italic]); 2250 out_size(nl+lk_offset); out_size(nk); out_size(ne); out_size(np); 2251 2252 @ The routines that follow need a few temporary variables of different types. 2253 2254 @<Gl...@>= 2255 @!j:0..max_header_bytes; {index into |header_bytes|} 2256 @!p:pointer; {index into |memory|} 2257 @!q:width..italic; {runs through the list heads for dimensions} 2258 @!par_ptr:0..max_param_words; {runs through the parameters} 2259 2260 @ The header block follows the subfile sizes. The necessary information all 2261 appears in |header_bytes|, except that the design size and the seven-bit-safe 2262 flag must still be set. 2263 2264 @<Output the header block@>= 2265 if not check_sum_specified then @<Compute the check sum@>; 2266 header_bytes[design_size_loc]:=design_size div @'100000000; 2267 {this works since |design_size>0|} 2268 header_bytes[design_size_loc+1]:=(design_size div @'200000) mod 256; 2269 header_bytes[design_size_loc+2]:=(design_size div 256) mod 256; 2270 header_bytes[design_size_loc+3]:=design_size mod 256; 2271 if not seven_unsafe then header_bytes[seven_flag_loc]:=128; 2272 for j:=0 to header_ptr-1 do out(header_bytes[j]); 2273 2274 @ @<Compute the check sum@>= 2275 begin c0:=bc; c1:=ec; c2:=bc; c3:=ec; 2276 for c:=bc to ec do if char_wd[c]>0 then 2277 begin temp_width:=memory[char_wd[c]]; 2278 if design_units<>unity then 2279 temp_width:=round((temp_width/design_units)*1048576.0); 2280 temp_width:=temp_width + (c+4)*@'20000000; {this should be positive} 2281 c0:=(c0+c0+temp_width) mod 255; 2282 c1:=(c1+c1+temp_width) mod 253; 2283 c2:=(c2+c2+temp_width) mod 251; 2284 c3:=(c3+c3+temp_width) mod 247; 2285 end; 2286 header_bytes[check_sum_loc]:=c0; 2287 header_bytes[check_sum_loc+1]:=c1; 2288 header_bytes[check_sum_loc+2]:=c2; 2289 header_bytes[check_sum_loc+3]:=c3; 2290 end 2291 2292 @ The next block contains packed |char_info|. 2293 2294 @<Output the character info@>= 2295 index[0]:=0; 2296 for c:=bc to ec do 2297 begin out(index[char_wd[c]]); 2298 out(index[char_ht[c]]*16+index[char_dp[c]]); 2299 out(index[char_ic[c]]*4+char_tag[c]); 2300 out(char_remainder[c]); 2301 end 2302 2303 @ When a scaled quantity is output, we may need to divide it by |design_units|. 2304 The following subroutine takes care of this, using floating point arithmetic 2305 only if |design_units<>1.0|. 2306 2307 @p procedure out_scaled(x:fix_word); {outputs a scaled |fix_word|} 2308 var @!n:byte; {the first byte after the sign} 2309 @!m:0..65535; {the two least significant bytes} 2310 begin if abs(x/design_units)>=16.0 then 2311 begin print_ln('The relative dimension ',x/@'4000000:1:3, 2312 ' is too large.'); 2313 @.The relative dimension...@> 2314 print(' (Must be less than 16*designsize'); 2315 if design_units<>unity then print(' =',design_units/@'200000:1:3, 2316 ' designunits'); 2317 print_ln(')'); x:=0; 2318 end; 2319 if design_units<>unity then x:=round((x/design_units)*1048576.0); 2320 if x<0 then 2321 begin out(255); x:=x+@'100000000; 2322 if x<=0 then x:=1; 2323 end 2324 else begin out(0); 2325 if x>=@'100000000 then x:=@'77777777; 2326 end; 2327 n:=x div @'200000; m:=x mod @'200000; 2328 out(n); out(m div 256); out(m mod 256); 2329 end; 2330 2331 @ We have output the packed indices for individual characters. 2332 The scaled widths, heights, depths, and italic corrections are next. 2333 2334 @<Output the dimensions themselves@>= 2335 for q:=width to italic do 2336 begin out(0); out(0); out(0); out(0); {output the zero word} 2337 p:=link[q]; {head of list} 2338 while p>0 do 2339 begin out_scaled(memory[p]); 2340 p:=link[p]; 2341 end; 2342 end; 2343 2344 @ One embarrassing problem remains: The ligature/kern program might be very 2345 long, but the starting addresses in |char_remainder| can be at most~255. 2346 Therefore we need to output some indirect address information; we want to 2347 compute |lk_offset| so that addition of |lk_offset| to all remainders makes 2348 all but |lk_offset| distinct remainders less than~256. 2349 2350 For this we need a sorted table of all relevant remainders. 2351 2352 @<Glob...@>= 2353 @!label_table:array[0..256] of record 2354 @!rr: -1..@'77777; {sorted label values} 2355 @!cc: byte; {associated characters} 2356 end; 2357 @!label_ptr:0..256; {index of highest entry in |label_table|} 2358 @!sort_ptr:0..256; {index into |label_table|} 2359 @!lk_offset:0..256; {smallest offset value that might work} 2360 @!t:0..@'77777; {label value that is being redirected} 2361 @!extra_loc_needed:boolean; {do we need a special word for |bchar|?} 2362 2363 @ @<Compute the ligature/kern program offset@>= 2364 @<Insert all labels into |label_table|@>; 2365 if bchar<256 then 2366 begin extra_loc_needed:=true; lk_offset:=1; 2367 end 2368 else begin extra_loc_needed:=false; lk_offset:=0; 2369 end; 2370 @<Find the minimum |lk_offset| and adjust all remainders@>; 2371 if bchar_label<@'77777 then 2372 begin lig_kern[nl-1].b2:=(bchar_label+lk_offset)div 256; 2373 lig_kern[nl-1].b3:=(bchar_label+lk_offset)mod 256; 2374 end 2375 2376 @ @<Insert all labels...@>= 2377 label_ptr:=0; label_table[0].rr:=-1; {sentinel} 2378 for c:=bc to ec do if char_tag[c]=lig_tag then 2379 begin sort_ptr:=label_ptr; {there's a hole at position |sort_ptr+1|} 2380 while label_table[sort_ptr].rr>char_remainder[c] do 2381 begin label_table[sort_ptr+1]:=label_table[sort_ptr]; 2382 decr(sort_ptr); {move the hole} 2383 end; 2384 label_table[sort_ptr+1].cc:=c; 2385 label_table[sort_ptr+1].rr:=char_remainder[c]; 2386 incr(label_ptr); 2387 end 2388 2389 @ @<Find the minimum |lk_offset| and adjust all remainders@>= 2390 begin sort_ptr:=label_ptr; {the largest unallocated label} 2391 if label_table[sort_ptr].rr+lk_offset > 255 then 2392 begin lk_offset:=0; extra_loc_needed:=false; {location 0 can do double duty} 2393 repeat char_remainder[label_table[sort_ptr].cc]:=lk_offset; 2394 while label_table[sort_ptr-1].rr=label_table[sort_ptr].rr do 2395 begin decr(sort_ptr); char_remainder[label_table[sort_ptr].cc]:=lk_offset; 2396 end; 2397 incr(lk_offset); decr(sort_ptr); 2398 until lk_offset+label_table[sort_ptr].rr<256; 2399 {N.B.: |lk_offset=256| satisfies this when |sort_ptr=0|} 2400 end; 2401 if lk_offset>0 then while sort_ptr>0 do 2402 begin char_remainder[label_table[sort_ptr].cc]:= 2403 char_remainder[label_table[sort_ptr].cc]+lk_offset; 2404 decr(sort_ptr); 2405 end; 2406 end 2407 2408 @ @<Output the ligature/kern program@>= 2409 if extra_loc_needed then {|lk_offset=1|} 2410 begin out(255); out(bchar); out(0); out(0); 2411 end 2412 else for sort_ptr:=1 to lk_offset do {output the redirection specs} 2413 begin t:=label_table[label_ptr].rr; 2414 if bchar<256 then 2415 begin out(255); out(bchar); 2416 end 2417 else begin out(254); out(0); 2418 end; 2419 out_size(t+lk_offset); 2420 repeat decr(label_ptr); until label_table[label_ptr].rr<t; 2421 end; 2422 if nl>0 then for lig_ptr:=0 to nl-1 do 2423 begin out(lig_kern[lig_ptr].b0); 2424 out(lig_kern[lig_ptr].b1); 2425 out(lig_kern[lig_ptr].b2); 2426 out(lig_kern[lig_ptr].b3); 2427 end; 2428 if nk>0 then for krn_ptr:=0 to nk-1 do out_scaled(kern[krn_ptr]) 2429 2430 @ @<Output the extensible character recipes@>= 2431 if ne>0 then for c:=0 to ne-1 do 2432 begin out(exten[c].b0); 2433 out(exten[c].b1); 2434 out(exten[c].b2); 2435 out(exten[c].b3); 2436 end; 2437 2438 @ For our grand finale, we wind everything up by outputting the parameters. 2439 2440 @<Output the parameters@>= 2441 for par_ptr:=1 to np do 2442 begin if par_ptr=1 then 2443 @<Output the slant (|param[1]|) without scaling@> 2444 else out_scaled(param[par_ptr]); 2445 end 2446 2447 @ @<Output the slant...@>= 2448 begin if param[1]<0 then 2449 begin param[1]:=param[1]+@'10000000000; 2450 out((param[1] div @'100000000)+256-64); 2451 end 2452 else out(param[1] div @'100000000); 2453 out((param[1] div @'200000) mod 256); 2454 out((param[1] div 256) mod 256); 2455 out(param[1] mod 256); 2456 end 2457 2458 @* The main program. 2459 The routines sketched out so far need to be packaged into separate procedures, 2460 on some systems, since some \PASCAL\ compilers place a strict limit on the 2461 size of a routine. The packaging is done here in an attempt to avoid some 2462 system-dependent changes. 2463 2464 @p procedure param_enter; 2465 begin @<Enter the parameter names@>; 2466 end; 2467 @# 2468 procedure name_enter; {enter all names and their equivalents} 2469 begin @<Enter all of the names...@>; 2470 param_enter; 2471 end; 2472 @# 2473 procedure read_lig_kern; 2474 var @!krn_ptr:0..max_kerns; {an index into |kern|} 2475 @!c:byte; {runs through all character codes} 2476 begin @<Read ligature/kern list@>; 2477 end; 2478 @# 2479 procedure read_char_info; 2480 var @!c:byte; {the char} 2481 begin @<Read character info list@>; 2482 end; 2483 @# 2484 procedure read_input; 2485 var @!c:byte; {header or parameter index} 2486 begin @<Read all the input@>; 2487 end; 2488 @# 2489 procedure corr_and_check; 2490 var @!c:0..256; {runs through all character codes} 2491 @!hh:0..hash_size; {an index into |hash_list|} 2492 @!lig_ptr:0..max_lig_steps; {an index into |lig_kern|} 2493 @!g:byte; {a character generated by the current character |c|} 2494 begin @<Correct and check the information@> 2495 end; 2496 2497 @ Here is where \.{PLtoTF} begins and ends. 2498 2499 @p begin initialize;@/ 2500 name_enter;@/ 2501 read_input; print_ln('.');@/ 2502 corr_and_check;@/ 2503 @<Do the output@>; 2504 end. 2505 2506 @* System-dependent changes. 2507 This section should be replaced, if necessary, by changes to the program 2508 that are necessary to make \.{PLtoTF} work at a particular installation. 2509 It is usually best to design your change file so that all changes to 2510 previous sections preserve the section numbering; then everybody's version 2511 will be consistent with the printed program. More extensive changes, 2512 which introduce new sections, can be inserted here; then only the index 2513 itself will get a new section number. 2514 @^system dependencies@> 2515 2516 @* Index. 2517 Pointers to error messages appear here together with the section numbers 2518 where each ident\-i\-fier is used.