modernc.org/knuth@v0.0.4/web/testdata/ctan.org/tex-archive/systems/knuth/dist/mfware/gftopk.web (about)

     1  % This program is by Tomas Rokicki.  A few routines were borrowed from
     2  % GFtoPXL by Arthur Samuel, who borrowed from GFtype by DRF and DEK,
     3  % who borrowed from DVItype, and so on.
     4  
     5  % Version 0.0 (development): started 26 July 1985 TGR.
     6  % Version 1.0: finished 29 July 1985 TGR.
     7  % Version 1.1: revised for new pk format 9 August 1985 TGR.
     8  % Version 1.2: fixed two's complement bug 23 January 1985 TGR.
     9  % Version 1.3: fixed bounding box calculations and some documentation.
    10  %                                     7 September 1986 TGR
    11  % Version 1.4: fixed row to glyph conversion 14 November 1987 TGR
    12  % Version 1.5: eliminated semicolons before endcases 12 July 1988 TGR
    13  % Version 2.0: slightly tuned up for METAFONTware report 17 Apr 1989 DEK/TGR
    14  % Version 2.1: fixed paint0/endrow bug reported by John Hobby 31 Jul 1989 TGR
    15  % Version 2.2: minor tune up; retain previous source info 21 Nov 1989 don
    16  % Version 2.3: fixed a few bugs with selection of preamble types, if
    17  %  gf_ch < 0, or if comp_size = 1016 (both unlikely).  Removed some
    18  %  code that would never get executed since bad_gf terminates.  Also
    19  %  some other nits that don't really affect functionality.  29 Jul 1990  TGR
    20  %  Bugs and fixes reported by Peter Breitenlohner (PEB).
    21  %  Corrected two typos -- 21 Dec 96 (don)
    22  % Version 2.4: fixed cases that might move to negative. 06 January 2014 PEB
    23  
    24  \def\versiondate{06 January 2014}
    25  
    26  % Here is TeX material that gets inserted after \input webmac
    27  \def\hang{\hangindent 3em\noindent\ignorespaces}
    28  \def\textindent#1{\hangindent2.5em\noindent\hbox to2.5em{\hss#1 }\ignorespaces}
    29  \font\ninerm=cmr9
    30  \let\mc=\ninerm % medium caps for names like SAIL
    31  \font\tenss=cmss10 % for `The METAFONTbook'
    32  \def\PASCAL{Pascal}
    33  \def\ph{{\mc PASCAL-H}}
    34  \font\logo=manfnt % font used for the METAFONT logo
    35  \def\MF{{\logo META}\-{\logo FONT}}
    36  \def\<#1>{$\langle#1\rangle$}
    37  \def\section{\mathhexbox278}
    38  \let\swap=\leftrightarrow
    39  \def\round{\mathop{\rm round}\nolimits}
    40  
    41  \def\(#1){} % this is used to make section names sort themselves better
    42  \def\9#1{} % this is used for sort keys in the index via @@:sort key}{entry@@>
    43  
    44  \def\title{GFtoPK}
    45  \def\contentspagenumber{201}
    46  \def\topofcontents{\null
    47    \titlefalse % include headline on the contents page
    48    \def\rheader{\mainfont\hfil \contentspagenumber}
    49    \vfill
    50    \centerline{\titlefont The {\ttitlefont GFtoPK} processor}
    51    \vskip 15pt
    52    \centerline{(Version 2.4, \versiondate)}
    53    \vfill}
    54  \def\botofcontents{\vfill
    55    \centerline{\hsize 5in\baselineskip9pt
    56      \vbox{\ninerm\noindent
    57      The preparation of this report
    58      was supported in part by the National Science
    59      Foundation under grants IST-8201926, MCS-8300984, and
    60      CCR-8610181,
    61      and by the System Development Foundation. `\TeX' is a
    62      trademark of the American Mathematical Society.
    63      `{\logo hijklmnj}\kern1pt' is a trademark of Addison-Wesley
    64      Publishing Company.}}}
    65  \pageno=\contentspagenumber \advance\pageno by 1
    66  
    67  @* Introduction.
    68  This program reads a \.{GF} file and packs it into a \.{PK} file.  \.{PK} files
    69  are significantly smaller than \.{GF} files, and they are much easier to
    70  interpret.  This program is meant to be the bridge between \MF\ and \.{DVI}
    71  drivers that read \.{PK} files.  Here are some statistics comparing typical
    72  input and output file sizes:
    73  
    74  $$\vbox{
    75  \halign{#\hfil\quad&\hfil#\qquad&&\hfil#\quad\cr
    76  Font&\omit\hfil Resolution\hfil\quad
    77   &\.{GF} size&\.{PK} size&Reduction factor\cr
    78  \noalign{\medskip}
    79  cmr10&300&13200&5484&42\char`\%\cr
    80  cmr10&360&15342&6496&42\char`\%\cr
    81  cmr10&432&18120&7808&43\char`\%\cr
    82  cmr10&511&21020&9440&45\char`\%\cr
    83  cmr10&622&24880&11492&46\char`\%\cr
    84  cmr10&746&29464&13912&47\char`\%\cr
    85  cminch&300&48764&22076&45\char`\%\cr
    86  }}$$
    87  It is hoped that the simplicity and small size of the \.{PK} files will make
    88  them widely accepted.
    89  
    90  The \.{PK} format was designed and implemented by Tomas Rokicki during
    91  @^Rokicki, Tomas Gerhard Paul@>
    92  the summer of 1985. This program borrows a few routines from \.{GFtoPXL} by
    93  Arthur Samuel.
    94  @^Samuel, Arthur Lee@>
    95  
    96  The |banner| string defined here should be changed whenever \.{GFtoPK}
    97  gets modified. The |preamble_comment| macro (near the end of the program)
    98  should be changed too.
    99  
   100  @d banner=='This is GFtoPK, Version 2.4' {printed when the program starts}
   101  
   102  @ Some of the diagnostic information is printed using
   103  |d_print_ln|.  When debugging, it should be set the same as
   104  |print_ln|, defined later.
   105  @^debugging@>
   106  
   107  @d d_print_ln(#)==
   108  
   109  @ This program is written in standard \PASCAL, except where it is
   110  necessary to use extensions; for example, one extension is to use a
   111  default |case| as in \.{TANGLE}, \.{WEAVE}, etc.  All places where
   112  nonstandard constructions are used should be listed in the index under
   113  ``system dependencies.''
   114  @!@^system dependencies@>
   115  
   116  @d othercases == others: {default for cases not listed explicitly}
   117  @d endcases == @+end {follows the default case in an extended |case| statement}
   118  @f othercases == else
   119  @f endcases == end
   120  
   121  @ The binary input comes from |gf_file|, and the output font is written
   122  on |pk_file|.  All text output is written on \PASCAL's standard |output|
   123  file.  The term |print| is used instead of |write| when this program writes
   124  on |output|, so that all such output could easily be redirected if desired.
   125  
   126  @d print(#)==write(#)
   127  @d print_ln(#)==write_ln(#)
   128  
   129  @p program GFtoPK(@!gf_file,@!pk_file,@!output);
   130  label @<Labels in the outer block@>@/
   131  const @<Constants in the outer block@>@/
   132  type @<Types in the outer block@>@/
   133  var @<Globals in the outer block@>@/
   134  procedure initialize; {this procedure gets things started properly}
   135    var i:integer; {loop index for initializations}
   136    begin print_ln(banner);@/
   137    @<Set initial values@>@/
   138    end;
   139  
   140  @ If the program has to stop prematurely, it goes to the
   141  `|final_end|'.
   142  
   143  @d final_end=9999 {label for the end of it all}
   144  
   145  @<Labels...@>=final_end;
   146  
   147  @ The following parameters can be changed at compile time to extend or
   148  reduce \.{GFtoPK}'s capacity.  The values given here should be quite
   149  adequate for most uses.  Assuming an average of about three strokes per
   150  raster line, there are six run-counts per line, and therefore |max_row|
   151  will be sufficient for a character 2600 pixels high.
   152  
   153  @<Constants...@>=
   154  @!line_length=79; {bracketed lines of output will be at most this long}
   155  @!max_row=16000; {largest index in the main |row| array}
   156  
   157  @ Here are some macros for common programming idioms.
   158  
   159  @d incr(#) == #:=#+1 {increase a variable by unity}
   160  @d decr(#) == #:=#-1 {decrease a variable by unity}
   161  
   162  @ If the \.{GF} file is badly malformed, the whole process must be aborted;
   163  \.{GFtoPK} will give up, after issuing an error message about the symptoms
   164  that were noticed.
   165  
   166  Such errors might be discovered inside of subroutines inside of subroutines,
   167  so a procedure called |jump_out| has been introduced. This procedure, which
   168  simply transfers control to the label |final_end| at the end of the program,
   169  contains the only non-local |goto| statement in \.{GFtoPK}.
   170  @^system dependencies@>
   171  
   172  @d abort(#)==begin print(' ',#); jump_out;
   173      end
   174  @d bad_gf(#)==abort('Bad GF file: ',#,'!')
   175  @.Bad GF file@>
   176  
   177  @p procedure jump_out;
   178  begin goto final_end;
   179  end;
   180  
   181  @* The character set.
   182  Like all programs written with the  \.{WEB} system, \.{GFtoPK} can be
   183  used with any character set. But it uses ASCII code internally, because
   184  the programming for portable input-output is easier when a fixed internal
   185  code is used.
   186  
   187  The next few sections of \.{GFtoPK} have therefore been copied from the
   188  analogous ones in the \.{WEB} system routines. They have been considerably
   189  simplified, since \.{GFtoPK} need not deal with the controversial
   190  ASCII codes less than @'40 or greater than @'176.
   191  If such codes appear in the \.{GF} file,
   192  they will be printed as question marks.
   193  
   194  @<Types...@>=
   195  @!ASCII_code=" ".."~"; {a subrange of the integers}
   196  
   197  @ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
   198  character sets were common, so it did not make provision for lower case
   199  letters. Nowadays, of course, we need to deal with both upper and lower case
   200  alphabets in a convenient way, especially in a program like \.{GFtoPK}.
   201  So we shall assume that the \PASCAL\ system being used for \.{GFtoPK}
   202  has a character set containing at least the standard visible characters
   203  of ASCII code (|"!"| through |"~"|).
   204  
   205  Some \PASCAL\ compilers use the original name |char| for the data type
   206  associated with the characters in text files, while other \PASCAL s
   207  consider |char| to be a 64-element subrange of a larger data type that has
   208  some other name.  In order to accommodate this difference, we shall use
   209  the name |text_char| to stand for the data type of the characters in the
   210  output file.  We shall also assume that |text_char| consists of
   211  the elements |chr(first_text_char)| through |chr(last_text_char)|,
   212  inclusive. The following definitions should be adjusted if necessary.
   213  @^system dependencies@>
   214  
   215  @d text_char == char {the data type of characters in text files}
   216  @d first_text_char=0 {ordinal number of the smallest element of |text_char|}
   217  @d last_text_char=127 {ordinal number of the largest element of |text_char|}
   218  
   219  @<Types...@>=
   220  @!text_file=packed file of text_char;
   221  
   222  @ The \.{GFtoPK} processor converts between ASCII code and
   223  the user's external character set by means of arrays |xord| and |xchr|
   224  that are analogous to \PASCAL's |ord| and |chr| functions.
   225  
   226  @<Globals...@>=
   227  @!xord: array [text_char] of ASCII_code;
   228    {specifies conversion of input characters}
   229  @!xchr: array [0..255] of text_char;
   230    {specifies conversion of output characters}
   231  
   232  @ Under our assumption that the visible characters of standard ASCII are
   233  all present, the following assignment statements initialize the
   234  |xchr| array properly, without needing any system-dependent changes.
   235  
   236  @<Set init...@>=
   237  for i:=0 to @'37 do xchr[i]:='?';
   238  xchr[@'40]:=' ';
   239  xchr[@'41]:='!';
   240  xchr[@'42]:='"';
   241  xchr[@'43]:='#';
   242  xchr[@'44]:='$';
   243  xchr[@'45]:='%';
   244  xchr[@'46]:='&';
   245  xchr[@'47]:='''';@/
   246  xchr[@'50]:='(';
   247  xchr[@'51]:=')';
   248  xchr[@'52]:='*';
   249  xchr[@'53]:='+';
   250  xchr[@'54]:=',';
   251  xchr[@'55]:='-';
   252  xchr[@'56]:='.';
   253  xchr[@'57]:='/';@/
   254  xchr[@'60]:='0';
   255  xchr[@'61]:='1';
   256  xchr[@'62]:='2';
   257  xchr[@'63]:='3';
   258  xchr[@'64]:='4';
   259  xchr[@'65]:='5';
   260  xchr[@'66]:='6';
   261  xchr[@'67]:='7';@/
   262  xchr[@'70]:='8';
   263  xchr[@'71]:='9';
   264  xchr[@'72]:=':';
   265  xchr[@'73]:=';';
   266  xchr[@'74]:='<';
   267  xchr[@'75]:='=';
   268  xchr[@'76]:='>';
   269  xchr[@'77]:='?';@/
   270  xchr[@'100]:='@@';
   271  xchr[@'101]:='A';
   272  xchr[@'102]:='B';
   273  xchr[@'103]:='C';
   274  xchr[@'104]:='D';
   275  xchr[@'105]:='E';
   276  xchr[@'106]:='F';
   277  xchr[@'107]:='G';@/
   278  xchr[@'110]:='H';
   279  xchr[@'111]:='I';
   280  xchr[@'112]:='J';
   281  xchr[@'113]:='K';
   282  xchr[@'114]:='L';
   283  xchr[@'115]:='M';
   284  xchr[@'116]:='N';
   285  xchr[@'117]:='O';@/
   286  xchr[@'120]:='P';
   287  xchr[@'121]:='Q';
   288  xchr[@'122]:='R';
   289  xchr[@'123]:='S';
   290  xchr[@'124]:='T';
   291  xchr[@'125]:='U';
   292  xchr[@'126]:='V';
   293  xchr[@'127]:='W';@/
   294  xchr[@'130]:='X';
   295  xchr[@'131]:='Y';
   296  xchr[@'132]:='Z';
   297  xchr[@'133]:='[';
   298  xchr[@'134]:='\';
   299  xchr[@'135]:=']';
   300  xchr[@'136]:='^';
   301  xchr[@'137]:='_';@/
   302  xchr[@'140]:='`';
   303  xchr[@'141]:='a';
   304  xchr[@'142]:='b';
   305  xchr[@'143]:='c';
   306  xchr[@'144]:='d';
   307  xchr[@'145]:='e';
   308  xchr[@'146]:='f';
   309  xchr[@'147]:='g';@/
   310  xchr[@'150]:='h';
   311  xchr[@'151]:='i';
   312  xchr[@'152]:='j';
   313  xchr[@'153]:='k';
   314  xchr[@'154]:='l';
   315  xchr[@'155]:='m';
   316  xchr[@'156]:='n';
   317  xchr[@'157]:='o';@/
   318  xchr[@'160]:='p';
   319  xchr[@'161]:='q';
   320  xchr[@'162]:='r';
   321  xchr[@'163]:='s';
   322  xchr[@'164]:='t';
   323  xchr[@'165]:='u';
   324  xchr[@'166]:='v';
   325  xchr[@'167]:='w';@/
   326  xchr[@'170]:='x';
   327  xchr[@'171]:='y';
   328  xchr[@'172]:='z';
   329  xchr[@'173]:='{';
   330  xchr[@'174]:='|';
   331  xchr[@'175]:='}';
   332  xchr[@'176]:='~';
   333  for i:=@'177 to 255 do xchr[i]:='?';
   334  
   335  @ The following system-independent code makes the |xord| array contain a
   336  suitable inverse to the information in |xchr|.
   337  
   338  @<Set init...@>=
   339  for i:=first_text_char to last_text_char do xord[chr(i)]:=@'40;
   340  for i:=" " to "~" do xord[xchr[i]]:=i;
   341  
   342  @* Generic font file format.
   343  The most important output produced by a typical run of \MF\ is the
   344  ``generic font'' (\.{GF}) file that specifies the bit patterns of the
   345  characters that have been drawn. The term {\sl generic\/} indicates that
   346  this file format doesn't match the conventions of any name-brand manufacturer;
   347  but it is easy to convert \.{GF} files to the special format required by
   348  almost all digital phototypesetting equipment. There's a strong analogy
   349  between the \.{DVI} files written by \TeX\ and the \.{GF} files written
   350  by \MF; and, in fact, the file formats have a lot in common.
   351  
   352  A \.{GF} file is a stream of 8-bit bytes that may be
   353  regarded as a series of commands in a machine-like language. The first
   354  byte of each command is the operation code, and this code is followed by
   355  zero or more bytes that provide parameters to the command. The parameters
   356  themselves may consist of several consecutive bytes; for example, the
   357  `|boc|' (beginning of character) command has six parameters, each of
   358  which is four bytes long. Parameters are usually regarded as nonnegative
   359  integers; but four-byte-long parameters can be either positive or
   360  negative, hence they range in value from $-2^{31}$ to $2^{31}-1$.
   361  As in \.{TFM} files, numbers that occupy
   362  more than one byte position appear in BigEndian order,
   363  and negative numbers appear in two's complement notation.
   364  
   365  A \.{GF} file consists of a ``preamble,'' followed by a sequence of one or
   366  more ``characters,'' followed by a ``postamble.'' The preamble is simply a
   367  |pre| command, with its parameters that introduce the file; this must come
   368  first.  Each ``character'' consists of a |boc| command, followed by any
   369  number of other commands that specify ``black'' pixels,
   370  followed by an |eoc| command. The characters appear in the order that \MF\
   371  generated them. If we ignore no-op commands (which are allowed between any
   372  two commands in the file), each |eoc| command is immediately followed by a
   373  |boc| command, or by a |post| command; in the latter case, there are no
   374  more characters in the file, and the remaining bytes form the postamble.
   375  Further details about the postamble will be explained later.
   376  
   377  Some parameters in \.{GF} commands are ``pointers.'' These are four-byte
   378  quantities that give the location number of some other byte in the file;
   379  the first file byte is number~0, then comes number~1, and so on.
   380  
   381  @ The \.{GF} format is intended to be both compact and easily interpreted
   382  by a machine. Compactness is achieved by making most of the information
   383  relative instead of absolute. When a \.{GF}-reading program reads the
   384  commands for a character, it keeps track of two quantities: (a)~the current
   385  column number,~|m|; and (b)~the current row number,~|n|.  These are 32-bit
   386  signed integers, although most actual font formats produced from \.{GF}
   387  files will need to curtail this vast range because of practical
   388  limitations. (\MF\ output will never allow $\vert m\vert$ or $\vert
   389  n\vert$ to get extremely large, but the \.{GF} format tries to be more
   390  general.)
   391  
   392  How do \.{GF}'s row and column numbers correspond to the conventions
   393  of \TeX\ and \MF? Well, the ``reference point'' of a character, in \TeX's
   394  view, is considered to be at the lower left corner of the pixel in row~0
   395  and column~0. This point is the intersection of the baseline with the left
   396  edge of the type; it corresponds to location $(0,0)$ in \MF\ programs.
   397  Thus the pixel in \.{GF} row~0 and column~0 is \MF's unit square, comprising
   398  the region of the plane whose coordinates both lie between 0 and~1. The
   399  pixel in \.{GF} row~|n| and column~|m| consists of the points whose \MF\
   400  coordinates |(x,y)| satisfy |m<=x<=m+1| and |n<=y<=n+1|.  Negative values of
   401  |m| and~|x| correspond to columns of pixels {\sl left\/} of the reference
   402  point; negative values of |n| and~|y| correspond to rows of pixels {\sl
   403  below\/} the baseline.
   404  
   405  Besides |m| and |n|, there's also a third aspect of the current
   406  state, namely the @!|paint_switch|, which is always either \\{black} or
   407  \\{white}. Each \\{paint} command advances |m| by a specified amount~|d|,
   408  and blackens the intervening pixels if |paint_switch=black|; then
   409  the |paint_switch| changes to the opposite state. \.{GF}'s commands are
   410  designed so that |m| will never decrease within a row, and |n| will never
   411  increase within a character; hence there is no way to whiten a pixel that
   412  has been blackened.
   413  
   414  @ Here is a list of all the commands that may appear in a \.{GF} file. Each
   415  command is specified by its symbolic name (e.g., |boc|), its opcode byte
   416  (e.g., 67), and its parameters (if any). The parameters are followed
   417  by a bracketed number telling how many bytes they occupy; for example,
   418  `|d[2]|' means that parameter |d| is two bytes long.
   419  
   420  \yskip\hang|paint_0| 0. This is a \\{paint} command with |d=0|; it does
   421  nothing but change the |paint_switch| from \\{black} to \\{white} or
   422  vice~versa.
   423  
   424  \yskip\hang\\{paint\_1} through \\{paint\_63} (opcodes 1 to 63).
   425  These are \\{paint} commands with |d=1| to~63, defined as follows: If
   426  |paint_switch=black|, blacken |d|~pixels of the current row~|n|,
   427  in columns |m| through |m+d-1| inclusive. Then, in any case,
   428  complement the |paint_switch| and advance |m| by~|d|.
   429  
   430  \yskip\hang|paint1| 64 |d[1]|. This is a \\{paint} command with a specified
   431  value of~|d|; \MF\ uses it to paint when |64<=d<256|.
   432  
   433  \yskip\hang|@!paint2| 65 |d[2]|. Same as |paint1|, but |d|~can be as high
   434  as~65535.
   435  
   436  \yskip\hang|@!paint3| 66 |d[3]|. Same as |paint1|, but |d|~can be as high
   437  as $2^{24}-1$. \MF\ never needs this command, and it is hard to imagine
   438  anybody making practical use of it; surely a more compact encoding will be
   439  desirable when characters can be this large. But the command is there,
   440  anyway, just in case.
   441  
   442  \yskip\hang|boc| 67 |c[4]| |p[4]| |min_m[4]| |max_m[4]| |min_n[4]|
   443  |max_n[4]|. Beginning of a character:  Here |c| is the character code, and
   444  |p| points to the previous character beginning (if any) for characters having
   445  this code number modulo 256.  (The pointer |p| is |-1| if there was no
   446  prior character with an equivalent code.) The values of registers |m| and |n|
   447  defined by the instructions that follow for this character must
   448  satisfy |min_m<=m<=max_m| and |min_n<=n<=max_n|.  (The values of |max_m| and
   449  |min_n| need not be the tightest bounds possible.)  When a \.{GF}-reading
   450  program sees a |boc|, it can use |min_m|, |max_m|, |min_n|, and |max_n| to
   451  initialize the bounds of an array. Then it sets |m:=min_m|, |n:=max_n|, and
   452  |paint_switch:=white|.
   453  
   454  \yskip\hang|boc1| 68 |c[1]| |@!del_m[1]| |max_m[1]| |@!del_n[1]| |max_n[1]|.
   455  Same as |boc|, but |p| is assumed to be~$-1$; also |del_m=max_m-min_m|
   456  and |del_n=max_n-min_n| are given instead of |min_m| and |min_n|.
   457  The one-byte parameters must be between 0 and 255, inclusive.
   458  \ (This abbreviated |boc| saves 19~bytes per character, in common cases.)
   459  
   460  \yskip\hang|eoc| 69. End of character: All pixels blackened so far
   461  constitute the pattern for this character. In particular, a completely
   462  blank character might have |eoc| immediately following |boc|.
   463  
   464  \yskip\hang|skip0| 70. Decrease |n| by 1 and set |m:=min_m|,
   465  |paint_switch:=white|. \ (This finishes one row and begins another,
   466  ready to whiten the leftmost pixel in the new row.)
   467  
   468  \yskip\hang|skip1| 71 |d[1]|. Decrease |n| by |d+1|, set |m:=min_m|, and set
   469  |paint_switch:=white|. This is a way to produce |d| all-white rows.
   470  
   471  \yskip\hang|@!skip2| 72 |d[2]|. Same as |skip1|, but |d| can be as large
   472  as 65535.
   473  
   474  \yskip\hang|@!skip3| 73 |d[3]|. Same as |skip1|, but |d| can be as large
   475  as $2^{24}-1$. \MF\ obviously never needs this command.
   476  
   477  \yskip\hang|new_row_0| 74. Decrease |n| by 1 and set |m:=min_m|,
   478  |paint_switch:=black|. \ (This finishes one row and begins another,
   479  ready to {\sl blacken\/} the leftmost pixel in the new row.)
   480  
   481  \yskip\hang|@!new_row_1| through |@!new_row_164| (opcodes 75 to 238). Same as
   482  |new_row_0|, but with |m:=min_m+1| through |min_m+164|, respectively.
   483  
   484  \yskip\hang|xxx1| 239 |k[1]| |x[k]|. This command is undefined in
   485  general; it functions as a $(k+2)$-byte |no_op| unless special \.{GF}-reading
   486  programs are being used. \MF\ generates \\{xxx} commands when encountering
   487  a \&{special} string; this occurs in the \.{GF} file only between
   488  characters, after the preamble, and before the postamble. However,
   489  \\{xxx} commands might appear within characters,
   490  in \.{GF} files generated by other
   491  processors. It is recommended that |x| be a string having the form of a
   492  keyword followed by possible parameters relevant to that keyword.
   493  
   494  \yskip\hang|@!xxx2| 240 |k[2]| |x[k]|. Like |xxx1|, but |0<=k<65536|.
   495  
   496  \yskip\hang|xxx3| 241 |k[3]| |x[k]|. Like |xxx1|, but |0<=k<@t$2^{24}$@>|.
   497  \MF\ uses this when sending a \&{special} string whose length exceeds~255.
   498  
   499  \yskip\hang|@!xxx4| 242 |k[4]| |x[k]|. Like |xxx1|, but |k| can be
   500  ridiculously large; |k| mustn't be negative.
   501  
   502  \yskip\hang|yyy| 243 |y[4]|. This command is undefined in general;
   503  it functions as a 5-byte |no_op| unless special \.{GF}-reading programs
   504  are being used. \MF\ puts |scaled| numbers into |yyy|'s, as a
   505  result of \&{numspecial} commands; the intent is to provide numeric
   506  parameters to \\{xxx} commands that immediately precede.
   507  
   508  \yskip\hang|no_op| 244. No operation, do nothing. Any number of |no_op|'s
   509  may occur between \.{GF} commands, but a |no_op| cannot be inserted between
   510  a command and its parameters or between two parameters.
   511  
   512  \yskip\hang|char_loc| 245 |c[1]| |dx[4]| |dy[4]| |w[4]| |p[4]|.
   513  This command will appear only in the postamble, which will be explained
   514  shortly.
   515  
   516  \yskip\hang|@!char_loc0| 246 |c[1]| |@!dm[1]| |w[4]| |p[4]|.
   517  Same as |char_loc|, except that |dy| is assumed to be zero, and the value
   518  of~|dx| is taken to be |65536*dm|, where |0<=dm<256|.
   519  
   520  \yskip\hang|pre| 247 |i[1]| |k[1]| |x[k]|.
   521  Beginning of the preamble; this must come at the very beginning of the
   522  file. Parameter |i| is an identifying number for \.{GF} format, currently
   523  131. The other information is merely commentary; it is not given
   524  special interpretation like \\{xxx} commands are. (Note that \\{xxx}
   525  commands may immediately follow the preamble, before the first |boc|.)
   526  
   527  \yskip\hang|post| 248. Beginning of the postamble, see below.
   528  
   529  \yskip\hang|post_post| 249. Ending of the postamble, see below.
   530  
   531  \yskip\noindent Commands 250--255 are undefined at the present time.
   532  
   533  @d gf_id_byte=131 {identifies the kind of \.{GF} files described here}
   534  
   535  @ Here are the opcodes that \.{GFtoPK} actually refers to.
   536  
   537  @d paint_0=0 {beginning of the \\{paint} commands}
   538  @d paint1=64 {move right a given number of columns, then
   539    black${}\swap{}$white}
   540  @d boc=67 {beginning of a character}
   541  @d boc1=68 {abbreviated |boc|}
   542  @d eoc=69 {end of a character}
   543  @d skip0=70 {skip no blank rows}
   544  @d skip1=71 {skip over blank rows}
   545  @d new_row_0=74 {move down one row and then right}
   546  @d max_new_row=238 {move down one row and then right}
   547  @d xxx1=239 {for \&{special} strings}
   548  @d yyy=243 {for \&{numspecial} numbers}
   549  @d no_op=244 {no operation}
   550  @d char_loc=245 {character locators in the postamble}
   551  @d char_loc0=246 {character locators in the postamble}
   552  @d pre=247 {preamble}
   553  @d post=248 {postamble beginning}
   554  @d post_post=249 {postamble ending}
   555  @d undefined_commands==250,251,252,253,254,255
   556  
   557  @ The last character in a \.{GF} file is followed by `|post|'; this command
   558  introduces the postamble, which summarizes important facts that \MF\ has
   559  accumulated. The postamble has the form
   560  $$\vbox{\halign{\hbox{#\hfil}\cr
   561    |post| |p[4]| |@!ds[4]| |@!cs[4]| |@!hppp[4]| |@!vppp[4]|
   562     |@!min_m[4]| |@!max_m[4]| |@!min_n[4]| |@!max_n[4]|\cr
   563    $\langle\,$character locators$\,\rangle$\cr
   564    |post_post| |q[4]| |i[1]| 223's$[{\G}4]$\cr}}$$
   565  Here |p| is a pointer to the byte following the final |eoc| in the file
   566  (or to the byte following the preamble, if there are no characters);
   567  it can be used to locate the beginning of \\{xxx} commands
   568  that might have preceded the postamble. The |ds| and |cs| parameters
   569  @^design size@> @^check sum@>
   570  give the design size and check sum, respectively, which are exactly the
   571  values put into the header of any \.{TFM} file that shares information with
   572  this \.{GF} file. Parameters |hppp| and |vppp| are the ratios of
   573  pixels per point, horizontally and vertically, expressed as |scaled| integers
   574  (i.e., multiplied by $2^{16}$); they can be used to correlate the font
   575  with specific device resolutions, magnifications, and ``at sizes.''  Then
   576  come |min_m|, |max_m|, |min_n|, and |max_n|, which bound the values that
   577  registers |m| and~|n| assume in all characters in this \.{GF} file.
   578  (These bounds need not be the best possible; |max_m| and |min_n| may, on the
   579  other hand, be tighter than the similar bounds in |boc| commands. For
   580  example, some character may have |min_n=-100| in its |boc|, but it might
   581  turn out that |n| never gets lower than |-50| in any character; then
   582  |min_n| can have any value |<=-50|. If there are no characters in the file,
   583  it's possible to have |min_m>max_m| and/or |min_n>max_n|.)
   584  
   585  @ Character locators are introduced by |char_loc| commands,
   586  which specify a character residue~|c|, character escapements (|dx,dy|),
   587  a character width~|w|, and a pointer~|p|
   588  to the beginning of that character. (If two or more characters have the
   589  same code~|c| modulo 256, only the last will be indicated; the others can be
   590  located by following backpointers. Characters whose codes differ by a
   591  multiple of 256 are assumed to share the same font metric information,
   592  hence the \.{TFM} file contains only residues of character codes modulo~256.
   593  This convention is intended for oriental languages, when there are many
   594  character shapes but few distinct widths.)
   595  @^oriental characters@>@^Chinese characters@>@^Japanese characters@>
   596  
   597  The character escapements (|dx,dy|) are the values of \MF's \&{chardx}
   598  and \&{chardy} parameters; they are in units of |scaled| pixels;
   599  i.e., |dx| is in horizontal pixel units times $2^{16}$, and |dy| is in
   600  vertical pixel units times $2^{16}$.  This is the intended amount of
   601  displacement after typesetting the character; for \.{DVI} files, |dy|
   602  should be zero, but other document file formats allow nonzero vertical
   603  escapement.
   604  
   605  The character width~|w| duplicates the information in the \.{TFM} file; it
   606  is $2^{24}$ times the ratio of the true width to the font's design size.
   607  
   608  The backpointer |p| points to the character's |boc|, or to the first of
   609  a sequence of consecutive \\{xxx} or |yyy| or |no_op| commands that
   610  immediately precede the |boc|, if such commands exist; such ``special''
   611  commands essentially belong to the characters, while the special commands
   612  after the final character belong to the postamble (i.e., to the font
   613  as a whole). This convention about |p| applies also to the backpointers
   614  in |boc| commands, even though it wasn't explained in the description
   615  of~|boc|. @^backpointers@>
   616  
   617  Pointer |p| might be |-1| if the character exists in the \.{TFM} file
   618  but not in the \.{GF} file. This unusual situation can arise in \MF\ output
   619  if the user had |proofing<0| when the character was being shipped out,
   620  but then made |proofing>=0| in order to get a \.{GF} file.
   621  
   622  @ The last part of the postamble, following the |post_post| byte that
   623  signifies the end of the character locators, contains |q|, a pointer to the
   624  |post| command that started the postamble.  An identification byte, |i|,
   625  comes next; this currently equals~131, as in the preamble.
   626  
   627  The |i| byte is followed by four or more bytes that are all equal to
   628  the decimal number 223 (i.e., @'337 in octal). \MF\ puts out four to seven of
   629  these trailing bytes, until the total length of the file is a multiple of
   630  four bytes, since this works out best on machines that pack four bytes per
   631  word; but any number of 223's is allowed, as long as there are at least four
   632  of them. In effect, 223 is a sort of signature that is added at the very end.
   633  @^Fuchs, David Raymond@>
   634  
   635  This curious way to finish off a \.{GF} file makes it feasible for
   636  \.{GF}-reading programs to find the postamble first, on most computers,
   637  even though \MF\ wants to write the postamble last. Most operating
   638  systems permit random access to individual words or bytes of a file, so
   639  the \.{GF} reader can start at the end and skip backwards over the 223's
   640  until finding the identification byte. Then it can back up four bytes, read
   641  |q|, and move to byte |q| of the file. This byte should, of course,
   642  contain the value 248 (|post|); now the postamble can be read, so the
   643  \.{GF} reader can discover all the information needed for individual
   644  characters.
   645  
   646  Unfortunately, however, standard \PASCAL\ does not include the ability to
   647  @^system dependencies@>
   648  access a random position in a file, or even to determine the length of a file.
   649  Almost all systems nowadays provide the necessary capabilities, so \.{GF}
   650  format has been designed to work most efficiently with modern operating
   651  systems.  \.{GFtoPK} first reads the postamble, and then scans the file from
   652  front to back.
   653  
   654  @* Packed file format.
   655  The packed file format is a compact representation of the data contained in a
   656  \.{GF} file.  The information content is the same, but packed (\.{PK}) files
   657  are almost always less than half the size of their \.{GF} counterparts.  They
   658  are also easier to convert into a raster representation because they do not
   659  have a profusion of \\{paint}, \\{skip}, and \\{new\_row} commands to be
   660  separately interpreted.  In addition, the \.{PK} format expressly forbids
   661  \&{special} commands within a character.  The minimum bounding box for each
   662  character is explicit in the format, and does not need to be scanned for as in
   663  the \.{GF} format.  Finally, the width and escapement values are combined with
   664  the raster information into character ``packets'', making it simpler in many
   665  cases to process a character.
   666  
   667  A \.{PK} file is organized as a stream of 8-bit bytes.  At times, these bytes
   668  might be split into 4-bit nybbles or single bits, or combined into multiple
   669  byte parameters.  When bytes are split into smaller pieces, the `first' piece
   670  is always the most significant of the byte.  For instance, the first bit of
   671  a byte is the bit with value 128; the first nybble can be found by dividing
   672  a byte by 16.  Similarly, when bytes are combined into multiple byte
   673  parameters, the first byte is the most significant of the parameter.  If the
   674  parameter is signed, it is represented by two's-complement notation.
   675  
   676  The set of possible eight-bit values is separated into two sets, those that
   677  introduce a character definition, and those that do not.  The values that
   678  introduce a character definition range from 0 to 239; byte values
   679  above 239 are interpreted as commands.  Bytes that introduce character
   680  definitions are called flag bytes, and various fields within the byte indicate
   681  various things about how the character definition is encoded.  Command bytes
   682  have zero or more parameters, and can never appear within a character
   683  definition or between parameters of another command, where they would be
   684  interpreted as data.
   685  
   686  A \.{PK} file consists of a preamble, followed by a sequence of one or more
   687  character definitions, followed by a postamble.  The preamble command must
   688  be the first byte in the file, followed immediately by its parameters.
   689  Any number of character definitions may follow, and any command but the
   690  preamble command and the postamble command may occur between character
   691  definitions.  The very last command in the file must be the postamble.
   692  
   693  @ The packed file format is intended to be easy to read and interpret by
   694  device drivers.  The small size of the file reduces the input/output overhead
   695  each time a font is loaded.  For those drivers that load and save each font
   696  file into memory, the small size also helps reduce the memory requirements.
   697  The length of each character packet is specified, allowing the character raster
   698  data to be loaded into memory by simply counting bytes, rather than
   699  interpreting each command; then, each character can be interpreted on a demand
   700  basis.  This also makes it possible for a driver to skip a particular
   701  character quickly if it knows that the character is unused.
   702  
   703  @ First, the command bytes will be presented; then the format of the
   704  character definitions will be defined.  Eight of the possible sixteen
   705  commands (values 240 through 255) are currently defined; the others are
   706  reserved for future extensions.  The commands are listed below.  Each command
   707  is specified by its symbolic name (e.g., \\{pk\_no\_op}), its opcode byte,
   708  and any parameters.  The parameters are followed by a bracketed number
   709  telling how many bytes they occupy, with the number preceded by a plus sign if
   710  it is a signed quantity.  (Four byte quantities are always signed, however.)
   711  
   712  \yskip\hang|pk_xxx1| 240 |k[1]| |x[k]|.  This command is undefined in general;
   713  it functions as a $(k+2)$-byte \\{no\_op} unless special \.{PK}-reading
   714  programs are being used.  \MF\ generates \\{xxx} commands when encountering
   715  a \&{special} string.  It is recommended that |x| be a string having the form
   716  of a keyword followed by possible parameters relevant to that keyword.
   717  
   718  \yskip\hang\\{pk\_xxx2} 241 |k[2]| |x[k]|.  Like |pk_xxx1|, but |0<=k<65536|.
   719  
   720  \yskip\hang\\{pk\_xxx3} 242 |k[3]| |x[k]|.  Like |pk_xxx1|, but
   721  |0<=k<@t$2^{24}$@>|.  \MF\ uses this when sending a \&{special} string whose
   722  length exceeds~255.
   723  
   724  \yskip\hang\\{pk\_xxx4} 243 |k[4]| |x[k]|.  Like |pk_xxx1|, but |k| can be
   725  ridiculously large; |k| mustn't be negative.
   726  
   727  \yskip\hang|pk_yyy| 244 |y[4]|.  This command is undefined in general; it
   728  functions as a five-byte \\{no\_op} unless special \.{PK} reading programs
   729  are being used.  \MF\ puts |scaled| numbers into |yyy|'s, as a result of
   730  \&{numspecial} commands; the intent is to provide numeric parameters to
   731  \\{xxx} commands that immediately precede.
   732  
   733  \yskip\hang|pk_post| 245.  Beginning of the postamble.  This command is
   734  followed by enough |pk_no_op| commands to make the file a multiple
   735  of four bytes long.  Zero through three bytes are usual, but any number
   736  is allowed.
   737  This should make the file easy to read on machines that pack four bytes to
   738  a word.
   739  
   740  \yskip\hang|pk_no_op| 246.  No operation, do nothing.  Any number of
   741  |pk_no_op|'s may appear between \.{PK} commands, but a |pk_no_op| cannot be
   742  inserted between a command and its parameters, between two parameters, or
   743  inside a character definition.
   744  
   745  \yskip\hang|pk_pre| 247 |i[1]| |k[1]| |x[k]| |ds[4]| |cs[4]| |hppp[4]|
   746  |vppp[4]|.  Preamble command.  Here, |i| is the identification byte of the
   747  file, currently equal to 89.  The string |x| is merely a comment, usually
   748  indicating the source of the \.{PK} file.  The parameters |ds| and |cs| are
   749  the design size of the file in $1/2^{20}$ points, and the checksum of the
   750  file, respectively.  The checksum should match the \.{TFM} file and the
   751  \.{GF} files for this font.  Parameters |hppp| and |vppp| are the ratios
   752  of pixels per point, horizontally and vertically, multiplied by $2^{16}$; they
   753  can be used to correlate the font with specific device resolutions,
   754  magnifications, and ``at sizes''.  Usually, the name of the \.{PK} file is
   755  formed by concatenating the font name (e.g., cmr10) with the resolution at
   756  which the font is prepared in pixels per inch multiplied by the magnification
   757  factor, and the letters \.{pk}.  For instance, cmr10 at 300 dots per inch
   758  should be named \.{cmr10.300pk}; at one thousand dots per inch and magstephalf,
   759  it should be named \.{cmr10.1095pk}.
   760  
   761  @ We put a few of the above opcodes into definitions for symbolic use by
   762  this program.
   763  
   764  @d pk_id = 89 {the version of \.{PK} file described}
   765  @d pk_xxx1 = 240 {\&{special} commands}
   766  @d pk_yyy = 244 {\&{numspecial} commands}
   767  @d pk_post = 245 {postamble}
   768  @d pk_no_op = 246 {no operation}
   769  @d pk_pre = 247 {preamble}
   770  
   771  @ The \.{PK} format has two conflicting goals: to pack character raster and
   772  size information as compactly as possible, while retaining ease of translation
   773  into raster and other forms.  A suitable compromise was found in the use of
   774  run-encoding of the raster information.  Instead of packing the individual
   775  bits of the character, we instead count the number of consecutive `black' or
   776  `white' pixels in a horizontal raster row, and then encode this number.  Run
   777  counts are found for each row from left to right, traversing rows from the
   778  top to bottom. This is essentially the way the \.{GF} format works.
   779  Instead of presenting each row individually, however, we concatenate all
   780  of the horizontal raster rows into one long string of pixels, and encode this
   781  row.  With knowledge of the width of the bit-map, the original character glyph
   782  can easily be reconstructed.  In addition, we do not need special commands to
   783  mark the end of one row and the beginning of the next.
   784  
   785  Next, we place the burden of finding the minimum bounding box on the part
   786  of the font generator, since the characters will usually be used much more
   787  often than they are generated.  The minimum bounding box is the smallest
   788  rectangle that encloses all `black' pixels of a character.  We also
   789  eliminate the need for a special end of character marker, by supplying
   790  exactly as many bits as are required to fill the minimum bounding box, from
   791  which the end of the character is implicit.
   792  
   793  Let us next consider the distribution of the run counts.  Analysis of several
   794  dozen pixel files at 300 dots per inch yields a distribution peaking at four,
   795  falling off slowly until ten, then a bit more steeply until twenty, and then
   796  asymptotically approaching the horizontal.  Thus, the great majority of our
   797  run counts will fit in a four-bit nybble.  The eight-bit byte is attractive for
   798  our run-counts, as it is the standard on many systems; however, the wasted four
   799  bits in the majority of cases seem a high price to pay.  Another possibility
   800  is to use a Huffman-type encoding scheme with a variable number of bits for
   801  each run-count; this was rejected because of the overhead in fetching and
   802  examining individual bits in the file.  Thus, the character raster definitions
   803  in the \.{PK} file format are based on the four-bit nybble.
   804  
   805  @ An analysis of typical pixel files yielded another interesting statistic:
   806  Fully 37\char`\%\
   807  of the raster rows were duplicates of the previous row.  Thus, the \.{PK}
   808  format allows the specification of repeat counts, which indicate how many times
   809  a horizontal raster row is to be repeated.  These repeated rows are taken out
   810  of the character glyph before individual rows are concatenated into the long
   811  string of pixels.
   812  
   813  For elegance, we disallow a run count of zero.  The case of a null raster
   814  description should be gleaned from the character width and height being equal
   815  to zero, and no raster data should be read.  No other zero counts are ever
   816  necessary.  Also, in the absence of repeat counts, the repeat value is set to
   817  be zero (only the original row is sent.)  If a repeat count is seen, it takes
   818  effect on the current row.  The current row is defined as the row on which the
   819  first pixel of the next run count will lie.  The repeat count is set back to
   820  zero when the last pixel in the current row is seen, and the row is sent out.
   821  
   822  This poses a problem for entirely black and entirely white rows, however.  Let
   823  us say that the current row ends with four white pixels, and then we have five
   824  entirely empty rows, followed by a black pixel at the beginning of the next
   825  row, and the character width is ten pixels.  We would like to use a repeat
   826  count, but there is no legal place to put it.  If we put it before the white
   827  run count, it will apply to the current row.  If we put it after, it applies
   828  to the row with the black pixel at the beginning.  Thus, entirely white or
   829  entirely black repeated rows are always packed as large run counts (in this
   830  case, a white run count of 54) rather than repeat counts.
   831  
   832  @ Now we turn our attention to the actual packing of the run counts and
   833  repeat counts into nybbles.  There are only sixteen possible nybble values.
   834  We need to indicate run counts and repeat counts.  Since the run counts are
   835  much more common, we will devote the majority of the nybble values to them.
   836  We therefore indicate a repeat count by a nybble of 14 followed by a packed
   837  number, where a packed number will be explained later.  Since the repeat
   838  count value of one is so common, we indicate a repeat one command by a single
   839  nybble of 15.  A 14 followed by the packed number 1 is still legal for a
   840  repeat one count.  The run counts are coded directly as packed
   841  numbers.
   842  
   843  For packed numbers, therefore, we have the nybble values 0 through 13.  We
   844  need to represent the positive integers up to, say, $2^{31}-1$.  We would
   845  like the more common smaller numbers to take only one or two nybbles, and
   846  the infrequent large numbers to take three or more.  We could therefore
   847  allocate one nybble value to indicate a large run count taking three or more
   848  nybbles.  We do this with the value 0.
   849  
   850  @ We are left with the values 1 through 13.  We can allocate some of these, say
   851  |dyn_f|, to be one-nybble run counts.
   852  These will work for the run counts |1..dyn_f|.  For subsequent run
   853  counts, we will use a nybble greater than |dyn_f|, followed by a second nybble,
   854  whose value can run from 0 through 15.  Thus, the two-nybble values will
   855  run from |dyn_f+1..(13-dyn_f)*16+dyn_f|.  We have our definition of large run
   856  count values now, being all counts greater than |(13-dyn_f)*16+dyn_f|.
   857  
   858  We can analyze our several dozen pixel files and determine an optimal value of
   859  |dyn_f|, and use this value for all of the characters.  Unfortunately, values
   860  of |dyn_f| that pack small characters well tend to pack the large characters
   861  poorly, and values that pack large characters well are not efficient for the
   862  smaller characters.  Thus, we choose the optimal |dyn_f| on a character basis,
   863  picking the value that will pack each individual character in the smallest
   864  number of nybbles.  Legal values of |dyn_f| run from 0 (with no one-nybble run
   865  counts) to 13 (with no two-nybble run counts).
   866  
   867  @ Our only remaining task in the coding of packed numbers is the large run
   868  counts.  We use a scheme suggested by D.~E.~Knuth
   869  @^Knuth, Donald Ervin@>
   870  that simply and elegantly represents arbitrarily large values.  The
   871  general scheme to represent an integer |i| is to write its hexadecimal
   872  representation, with leading zeros removed.  Then we count the number of
   873  digits, and prepend one less than that many zeros before the hexadecimal
   874  representation.  Thus, the values from one to fifteen occupy one nybble;
   875  the values sixteen through 255 occupy three, the values 256 through 4095
   876  require five, etc.
   877  
   878  For our purposes, however, we have already represented the numbers one
   879  through |(13-dyn_f)*16+dyn_f|.  In addition, the one-nybble values have
   880  already been taken by our other commands, which means that only the values
   881  from sixteen up are available to us for long run counts.  Thus, we simply
   882  normalize our long run counts, by subtracting |(13-dyn_f)*16+dyn_f+1| and
   883  adding 16, and then we represent the result according to the scheme above.
   884  
   885  @ The final algorithm for decoding the run counts based on the above scheme
   886  might look like this, assuming that a procedure called \\{get\_nyb} is
   887  available to get the next nybble from the file, and assuming that the global
   888  |repeat_count| indicates whether a row needs to be repeated.  Note that this
   889  routine is recursive, but since a repeat count can never directly follow
   890  another repeat count, it can only be recursive to one level.
   891  
   892  @p@{ function pk_packed_num : integer ;
   893  var i,@!j : integer ;
   894  begin
   895     i := get_nyb ;
   896     if i = 0 then begin
   897        repeat j := get_nyb ; incr(i) ; until j <> 0 ;
   898        while i > 0 do begin j := j * 16 + get_nyb ; decr(i) ; end ;
   899        pk_packed_num := j - 15 + (13-dyn_f)*16 + dyn_f ;
   900     end else if i <= dyn_f then
   901        pk_packed_num := i
   902     else if i < 14 then
   903        pk_packed_num := (i-dyn_f-1)*16+get_nyb+dyn_f+1
   904     else begin
   905        if i = 14 then
   906           repeat_count := pk_packed_num
   907        else
   908           repeat_count := 1 ;
   909        pk_packed_num := pk_packed_num ;
   910     end ;
   911  end ; @}
   912  
   913  @ For low resolution fonts, or characters with `gray' areas, run encoding can
   914  often make the character many times larger.  Therefore, for those characters
   915  that cannot be encoded efficiently with run counts, the \.{PK} format allows
   916  bit-mapping of the characters.  This is indicated by a |dyn_f| value of
   917  14.  The bits are packed tightly, by concatenating all of the horizontal raster
   918  rows into one long string, and then packing this string eight bits to a byte.
   919  The number of bytes required can be calculated by |(width*height+7) div 8|.
   920  This format should only be used when packing the character by run counts takes
   921  more bytes than this, although, of course, it is legal for any character.
   922  Any extra bits in the last byte should be set to zero.
   923  
   924  @ At this point, we are ready to introduce the format for a character
   925  descriptor.  It consists of three parts: a flag byte, a character preamble,
   926  and the raster data.  The most significant four bits of the flag byte
   927  yield the |dyn_f| value for that character.  (Notice that only values of
   928  0 through 14 are legal for |dyn_f|, with 14 indicating a bit mapped character;
   929  thus, the flag bytes do not conflict with the command bytes, whose upper nybble
   930  is always 15.)  The next bit (with weight 8) indicates whether the first run
   931  count is a black count or a white count, with a one indicating a black count.
   932  For bit-mapped characters, this bit should be set to a zero.  The next bit
   933  (with weight 4) indicates whether certain later parameters (referred to as size
   934  parameters) are given in one-byte or two-byte quantities, with a one indicating
   935  that they are in two-byte quantities.  The last two bits are concatenated on to
   936  the beginning of the packet-length parameter in the character preamble,
   937  which will be explained below.
   938  
   939  However, if the last three bits of the flag byte are all set (normally
   940  indicating that the size parameters are two-byte values and that a 3 should be
   941  prepended to the length parameter), then a long format of the character
   942  preamble should be used instead of one of the short forms.
   943  
   944  Therefore, there are three formats for the character preamble; the one that
   945  is used depends on the least significant three bits of the flag byte.  If the
   946  least significant three bits are in the range zero through three, the short
   947  format is used.  If they are in the range four through six, the extended short
   948  format is used.  Otherwise, if the least significant bits are all set, then
   949  the long form of the character preamble is used.  The preamble formats are
   950  explained below.
   951  
   952  \yskip\hang Short form: |flag[1]| |pl[1]| |cc[1]| |tfm[3]| |dm[1]| |w[1]|
   953  |h[1]| |hoff[+1]| |voff[+1]|.
   954  If this format of the character preamble is used, the above
   955  parameters must all fit in the indicated number of bytes, signed or unsigned
   956  as indicated.  Almost all of the standard \TeX\ font characters fit; the few
   957  exceptions are fonts such as \.{cminch}.
   958  
   959  \yskip\hang Extended short form: |flag[1]| |pl[2]| |cc[1]| |tfm[3]| |dm[2]|
   960  |w[2]| |h[2]| |hoff[+2]| |voff[+2]|.  Larger characters use this extended
   961  format.
   962  
   963  \yskip\hang Long form: |flag[1]| |pl[4]| |cc[4]| |tfm[4]| |dx[4]| |dy[4]|
   964  |w[4]| |h[4]| |hoff[4]| |voff[4]|.  This is the general format that
   965  allows all of the
   966  parameters of the \.{GF} file format, including vertical escapement.
   967  \vskip\baselineskip
   968  The |flag| parameter is the flag byte.  The parameter |pl| (packet length)
   969  contains the offset
   970  of the byte following this character descriptor, with respect to the beginning
   971  of the |tfm| width parameter.  This is given so a \.{PK} reading program can,
   972  once it has read the flag byte, packet length, and character code (|cc|), skip
   973  over the character by simply reading this many more bytes.  For the two short
   974  forms of the character preamble, the last two bits of the flag byte should be
   975  considered the two most-significant bits of the packet length.  For the short
   976  format, the true packet length might be calculated as |(flag mod 4)*256+pl|;
   977  for the short extended format, it might be calculated as
   978  |(flag mod 4)*65536+pl|.
   979  
   980  The |w| parameter is the width and the |h| parameter is the height in pixels
   981  of the minimum bounding box.  The |dx| and |dy| parameters are the horizontal
   982  and vertical escapements, respectively.  In the short formats, |dy| is assumed
   983  to be zero and |dm| is |dx| but in pixels;
   984  in the long format, |dx| and |dy| are both
   985  in pixels multiplied by $2^{16}$.  The |hoff| is the horizontal offset from the
   986  upper left pixel to the reference pixel; the |voff| is the vertical offset.
   987  They are both given in pixels, with right and down being positive.  The
   988  reference pixel is the pixel that occupies the unit square in \MF; the
   989  \MF\ reference point is the lower left hand corner of this pixel.  (See the
   990  example below.)
   991  
   992  @ \TeX\ requires all characters that have the same character codes
   993  modulo 256 to have also the same |tfm| widths and escapement values.  The \.{PK}
   994  format does not itself make this a requirement, but in order for the font to
   995  work correctly with the \TeX\ software, this constraint should be observed.
   996  (The standard version of \TeX\ cannot output character codes greater
   997  than 255, but extended versions do exist.)
   998  
   999  Following the character preamble is the raster information for the
  1000  character, packed by run counts or by bits, as indicated by the flag byte.
  1001  If the character is packed by run counts and the required number of nybbles
  1002  is odd, then the last byte of the raster description should have a zero
  1003  for its least significant nybble.
  1004  
  1005  @ As an illustration of the \.{PK} format, the character \char4\ from the font
  1006  amr10 at 300 dots per inch will be encoded.  This character was chosen
  1007  because it illustrates some
  1008  of the borderline cases.  The raster for the character looks like this (the
  1009  row numbers are chosen for convenience, and are not \MF's row numbers.)
  1010  
  1011  \vskip\baselineskip
  1012  {\def\smbox{\vrule height 7pt width 7pt depth 0pt \hskip 3pt}%
  1013  \catcode`\*=\active \let*=\smbox
  1014  \centerline{\vbox{\baselineskip=10pt
  1015  \halign{\hfil#\quad&&\hfil#\hfil\cr
  1016  0& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1017  1& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1018  2& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1019  3& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1020  4& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1021  5& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1022  6& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1023  7\cr
  1024  8\cr
  1025  9& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1026  10& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1027  11& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1028  12& & & & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*& & \cr
  1029  13& & & & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*& & \cr
  1030  14& & & & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*& & \cr
  1031  15& & & & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*& & \cr
  1032  16& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1033  17& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1034  18& & & & &*&*& & & & & & & & & & & & &*&*& & \cr
  1035  19\cr
  1036  20\cr
  1037  21\cr
  1038  22& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1039  23& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1040  24& & &*&*& & & & & & & & & & & & & & & & &*&*\cr
  1041  25& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1042  26& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1043  27& & &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1044  28&+& &*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*&*\cr
  1045  &\hphantom{*}&\hphantom{*}\cr
  1046  }}}}
  1047  The width of the minimum bounding box for this character is 20; its height
  1048  is 29.  The `+' represents the reference pixel; notice how it lies outside the
  1049  minimum bounding box.  The |hoff| value is $-2$, and the |voff| is~28.
  1050  
  1051  The first task is to calculate the run counts and repeat counts.  The repeat
  1052  counts are placed at the first transition (black to white or white to black)
  1053  in a row, and are enclosed in brackets.  White counts are enclosed in
  1054  parentheses.  It is relatively easy to generate the counts list:
  1055  \vskip\baselineskip
  1056  \centerline{82 [2] (16) 2 (42) [2] 2 (12) 2 (4) [3]}
  1057  \centerline{16 (4) [2] 2 (12) 2 (62) [2] 2 (16) 82}
  1058  \vskip\baselineskip
  1059  Note that any duplicated rows that are not all white or all black are removed
  1060  before the run counts are calculated.  The rows thus removed are rows 5, 6,
  1061  10, 11, 13, 14, 15, 17, 18, 23, and 24.
  1062  
  1063  @ The next step in the encoding of this character is to calculate the optimal
  1064  value of |dyn_f|.  The details of how this calculation is done are not
  1065  important here; suffice it to say that there is a simple algorithm that can
  1066  determine the best value of |dyn_f| in one pass over the count list.  For this
  1067  character, the optimal value turns out to be 8 (atypically low).  Thus, all
  1068  count values less than or equal to 8 are packed in one nybble; those from
  1069  nine to $(13-8)*16+8$ or 88 are packed in two nybbles.  The run encoded values
  1070  now become (in hex, separated according to the above list):
  1071  \vskip\baselineskip
  1072  \centerline{\tt D9 E2 97 2 B1 E2 2 93 2 4 E3}
  1073  \centerline{\tt 97 4 E2 2 93 2 C5 E2 2 97 D9}
  1074  \vskip\baselineskip\noindent
  1075  which comes to 36 nybbles, or 18 bytes.  This is shorter than the 73 bytes
  1076  required for the bit map, so we use the run count packing.
  1077  
  1078  @ The short form of the character preamble is used because all of the
  1079  parameters fit in their respective lengths.  The packet length is therefore
  1080  18 bytes for the raster, plus
  1081  eight bytes for the character preamble parameters following the character
  1082  code, or 26.  The |tfm| width for this character is 640796, or {\tt 9C71C} in
  1083  hexadecimal.  The horizontal escapement is 25 pixels.  The flag byte is
  1084  88 hex, indicating the short preamble, the black first count, and the
  1085  |dyn_f| value of 8.  The final total character packet, in hexadecimal, is:
  1086  \vskip\baselineskip
  1087  $$\vbox{\halign{\hfil #\quad&&{\tt #\ }\cr
  1088  Flag byte&88\cr
  1089  Packet length&1A\cr
  1090  Character code&04\cr
  1091  |tfm| width&09&C7&1C\cr
  1092  Horizontal escapement (pixels)&19\cr
  1093  Width of bit map&14\cr
  1094  Height of bit map&1D\cr
  1095  Horizontal offset (signed)&FE\cr
  1096  Vertical offset&1C\cr
  1097  Raster data&D9&E2&97\cr
  1098  &2B&1E&22\cr
  1099  &93&24&E3\cr
  1100  &97&4E&22\cr
  1101  &93&2C&5E\cr
  1102  &22&97&D9\cr}}$$
  1103  
  1104  @* Input and output for binary files.
  1105  We have seen that a \.{GF} file is a sequence of 8-bit bytes. The bytes
  1106  appear physically in what is called a `|packed file of 0..255|'
  1107  in \PASCAL\ lingo.  The \.{PK} file is also a sequence of 8-bit bytes.
  1108  
  1109  Packing is system dependent, and many \PASCAL\ systems fail to implement
  1110  such files in a sensible way (at least, from the viewpoint of producing
  1111  good production software).  For example, some systems treat all
  1112  byte-oriented files as text, looking for end-of-line marks and such
  1113  things. Therefore some system-dependent code is often needed to deal with
  1114  binary files, even though most of the program in this section of
  1115  \.{GFtoPK} is written in standard \PASCAL.
  1116  @^system dependencies@>
  1117  
  1118  We shall stick to simple \PASCAL\ in this program, for reasons of clarity,
  1119  even if such simplicity is sometimes unrealistic.
  1120  
  1121  @<Types...@>=
  1122  @!eight_bits=0..255; {unsigned one-byte quantity}
  1123  @!byte_file=packed file of eight_bits; {files that contain binary data}
  1124  
  1125  @ The program deals with two binary file variables: |gf_file| is the
  1126  input file that we are translating into \.{PK} format, to be written
  1127  on |pk_file|.
  1128  
  1129  @<Glob...@>=
  1130  @!gf_file:byte_file; {the stuff we are \.{GFtoPK}ing}
  1131  @!pk_file:byte_file; {the stuff we have \.{GFtoPK}ed}
  1132  
  1133  @ To prepare the |gf_file| for input, we |reset| it.
  1134  
  1135  @p procedure open_gf_file; {prepares to read packed bytes in |gf_file|}
  1136  begin reset(gf_file);
  1137  gf_loc := 0 ;
  1138  end;
  1139  
  1140  @ To prepare the |pk_file| for output, we |rewrite| it.
  1141  
  1142  @p procedure open_pk_file; {prepares to write packed bytes in |pk_file|}
  1143  begin rewrite(pk_file);
  1144  pk_loc := 0 ; pk_open := true ;
  1145  end;
  1146  
  1147  @ The variable |pk_loc| contains the number of the byte about to
  1148  be written to the |pk_file|, and |gf_loc| is the byte about to be read
  1149  from the |gf_file|.  Also, |pk_open| indicates that the packed file has
  1150  been opened and is ready for output.
  1151  
  1152  @<Glob...@>=
  1153  @!pk_loc:integer; {where we are about to write, in |pk_file|}
  1154  @!gf_loc:integer; {where are we in the |gf_file|}
  1155  @!pk_open:boolean; {is the packed file open?}
  1156  
  1157  @ We do not open the |pk_file| until after the postamble of the |gf_file|
  1158  has been read.  This can be used, for instance, to calculate a resolution
  1159  to put in the suffix of the |pk_file| name.  This also means, however, that
  1160  specials in the postamble (which \MF\ never generates) do not get sent to
  1161  the |pk_file|.
  1162  
  1163  @<Set init...@>=
  1164  pk_open := false ;
  1165  
  1166  @ We shall use two simple functions to read the next byte or
  1167  bytes from |gf_file|.  We either need to get an individual byte or a
  1168  set of four bytes.
  1169  @^system dependencies@>
  1170  
  1171  @p function gf_byte:integer; {returns the next byte, unsigned}
  1172  var b:eight_bits;
  1173  begin if eof(gf_file) then bad_gf('Unexpected end of file!')
  1174  @.Unexpected end of file@>
  1175  else  begin read(gf_file,b); gf_byte:=b;
  1176    end;
  1177  incr(gf_loc);
  1178  end;
  1179  @#
  1180  function gf_signed_quad:integer; {returns the next four bytes, signed}
  1181  var a,@!b,@!c,@!d:eight_bits;
  1182  begin read(gf_file,a); read(gf_file,b); read(gf_file,c); read(gf_file,d);
  1183  if a<128 then gf_signed_quad:=((a*256+b)*256+c)*256+d
  1184  else gf_signed_quad:=(((a-256)*256+b)*256+c)*256+d;
  1185  gf_loc := gf_loc + 4 ;
  1186  end;
  1187  
  1188  @ We also need a few routines to write data to the \.{PK} file.  We write
  1189  data in 4-, 8-, 16-, 24-, and 32-bit chunks, so we define the appropriate
  1190  routines. We must be careful not to let the sign bit mess us up, as some
  1191  \PASCAL s implement division of a negative integer differently.
  1192  
  1193  @p procedure pk_byte(a:integer) ;
  1194  begin
  1195     if pk_open then begin
  1196        if a < 0 then a := a + 256 ;
  1197        write(pk_file, a) ;
  1198        incr(pk_loc) ;
  1199     end ;
  1200  end ;
  1201  @#
  1202  procedure pk_halfword(a:integer) ;
  1203  begin
  1204     if a < 0 then a := a + 65536 ;
  1205     write(pk_file, a div 256) ;
  1206     write(pk_file, a mod 256) ;
  1207     pk_loc := pk_loc + 2 ;
  1208  end ;
  1209  @#
  1210  procedure pk_three_bytes(a:integer);
  1211  begin
  1212     write(pk_file, a div 65536 mod 256) ;
  1213     write(pk_file, a div 256 mod 256) ;
  1214     write(pk_file, a mod 256) ;
  1215     pk_loc := pk_loc + 3 ;
  1216  end ;
  1217  @#
  1218  procedure pk_word(a:integer) ;
  1219  var b : integer ;
  1220  begin
  1221     if pk_open then begin
  1222        if a < 0 then begin
  1223           a := a + @'10000000000 ;
  1224           a := a + @'10000000000 ;
  1225           b := 128 + a div 16777216 ;
  1226        end else b := a div 16777216 ;
  1227        write(pk_file, b) ;
  1228        write(pk_file, a div 65536 mod 256) ;
  1229        write(pk_file, a div 256 mod 256) ;
  1230        write(pk_file, a mod 256) ;
  1231        pk_loc := pk_loc + 4 ;
  1232     end ;
  1233  end ;
  1234  @#
  1235  procedure pk_nyb(a:integer) ;
  1236  begin
  1237     if bit_weight = 16 then begin
  1238        output_byte := a * 16 ;
  1239        bit_weight := 1 ;
  1240     end else begin
  1241        pk_byte(output_byte + a) ;
  1242        bit_weight := 16 ;
  1243     end ;
  1244  end ;
  1245  
  1246  @ We need the globals |bit_weight| and |output_byte| for buffering.
  1247  
  1248  @<Glob...@>=
  1249  @!bit_weight : integer ; {output bit weight}
  1250  @!output_byte : integer ; {output byte for pk file}
  1251  
  1252  @ Finally we come to the routines that are used for random access of the
  1253  |gf_file|.  To correctly find and read the postamble of the file, we need
  1254  two routines, one to find the length of the |gf_file|, and one to position
  1255  the |gf_file|.  We assume that the first byte of the file is numbered zero.
  1256  
  1257  Such routines are, of course, highly system dependent.  They are implemented
  1258  here in terms of two assumed system routines called |set_pos| and |cur_pos|.
  1259  The call |set_pos(f,n)| moves to item |n| in file |f|, unless |n| is negative
  1260  or larger than the total number of items in |f|; in the latter case,
  1261  |set_pos(f,n)| moves to the end of file |f|.  The call |cur_pos(f)| gives the
  1262  total number of items in |f|, if |eof(f)| is true; we use |cur_pos| only in
  1263  such a situation.
  1264  @^system dependencies@>
  1265  
  1266  @p procedure find_gf_length ;
  1267  begin
  1268     set_pos(gf_file, -1) ; gf_len := cur_pos(gf_file) ;
  1269  end ;
  1270  @#
  1271  procedure move_to_byte(@!n : integer) ;
  1272  begin
  1273     set_pos(gf_file, n); gf_loc := n ;
  1274  end ;
  1275  
  1276  @ The global |gf_len| contains the final total length of the |gf_file|.
  1277  
  1278  @<Glob...@>=
  1279  @!gf_len : integer ; {length of |gf_file|}
  1280  
  1281  @* Plan of attack.
  1282  It would seem at first that converting a \.{GF} file to \.{PK} format should
  1283  be relatively easy, since they both use a form of run-encoding.  Unfortunately,
  1284  several idiosyncrasies of the \.{GF} format make this conversion slightly
  1285  cumbersome.
  1286  The \.{GF} format separates the raster information from the escapement values
  1287  and \.{TFM} widths; the \.{PK} format combines all information about a single
  1288  character into one character packet.  The \.{GF} run-encoding is
  1289  on a row-by-row basis, and the \.{PK} format is on a glyph basis, as if all
  1290  of the raster rows in the glyph were concatenated into one long row.  The
  1291  encoding of the run-counts in the \.{GF} files is fixed, whereas the \.{PK}
  1292  format uses a dynamic encoding scheme that must be adjusted for each
  1293  character.  And,
  1294  finally, any repeated rows can be marked and sent with a single command in
  1295  the \.{PK} format.
  1296  
  1297  There are four major steps in the conversion process.  First, the postamble
  1298  of the |gf_file| is found and read, and the data from the character locators
  1299  is stored in memory.  Next, the preamble of the |pk_file| is written.  The
  1300  third and by far
  1301  the most difficult step reads the raster representation of all of the
  1302  characters from the \.{GF} file, packs them, and writes them to the |pk_file|.
  1303  Finally, the postamble is written to the |pk_file|.
  1304  
  1305  The conversion of the character raster information from the |gf_file| to the
  1306  format required by the |pk_file| takes several smaller steps.
  1307  The \.{GF} file is read, the commands are interpreted, and the run
  1308  counts are stored in the working |row| array.  Each row is terminated by a
  1309  |end_of_row| value, and the character glyph is terminated by an
  1310  |end_of_char| value.  Then, this representation of the character glyph
  1311  is scanned to determine the minimum bounding box in which it will fit,
  1312  correcting the |min_m|, |max_m|, |min_n|, and |max_n| values, and calculating
  1313  the offset values.  The third sub-step is to restructure the row list from
  1314  a list based on rows to a list based on the entire glyph.  Then, an optimal
  1315  value of |dyn_f| is calculated, and the final
  1316  size of the counts is found for the \.{PK} file format, and compared with
  1317  the bit-wise packed glyph.  If the run-encoding scheme is shorter, the
  1318  character is written to the |pk_file| as row counts; otherwise, it is written
  1319  using a bit-packed scheme.
  1320  
  1321  To save various information while the \.{GF} file is being loaded, we need
  1322  several arrays.  The |tfm_width|, |dx|, and |dy| arrays store the obvious
  1323  values.  The |status| array contains
  1324  the current status of the particular character.  A value of 0 indicates
  1325  that the character has never been defined; a 1 indicates that the character
  1326  locator for that character was read in; and a 2 indicates that the raster
  1327  information for at least
  1328  one character was read from the |gf_file| and written to the |pk_file|.
  1329  The |row| array contains row counts.  It is filled anew
  1330  for each character, and is used as a general workspace.  The \.{GF} counts are
  1331  stored starting at location 2 in this array, so that the \.{PK} counts can be
  1332  written to the same array, overwriting the \.{GF} counts, without destroying
  1333  any counts before they are used.  (A possible repeat count in the first row
  1334  might make the first row of the \.{PK} file one count longer; all succeeding
  1335  rows are guaranteed to be the same length or shorter because of the
  1336  |end_of_row| flags in the \.{GF} format that are unnecessary in the \.{PK}
  1337  format.)
  1338  
  1339  @d virgin==0 {never heard of this character yet}
  1340  @d located==1 {locators read for this character}
  1341  @d sent==2 {at least one of these characters has been sent}
  1342  
  1343  @<Glob...@>=
  1344  @!tfm_width: array[0..255] of integer; {the \.{TFM} widths of characters}
  1345  @!dx, @!dy: array[0..255] of integer; {the horizontal and vertical escapements}
  1346  @!status: array[0..255] of virgin..sent; {character status}
  1347  @!row: array[0..max_row] of integer; {the row counts for working}
  1348  
  1349  @ Here we initialize all of the character |status| values to |virgin|.
  1350  
  1351  @<Set init...@>=
  1352  for i := 0 to 255 do
  1353     status[i] := virgin ;
  1354  
  1355  @ And, finally, we need to define the |end_of_row| and |end_of_char| values.
  1356  These cannot be values that can be taken on either by legitimate run counts,
  1357  even when wrapping around an entire character.  Nor can they be values that
  1358  repeat counts can take on.  Since repeat counts can be arbitrarily large, we
  1359  restrict ourselves to negative values whose absolute values are greater than
  1360  the largest possible repeat count.
  1361  
  1362  @d end_of_row==(-99999) {indicates the end of a row}
  1363  @d end_of_char==(-99998) {indicates the end of a character}
  1364  
  1365  @* Reading the generic font file.
  1366  There are two major procedures in this program that do all of the work.
  1367  The first is |convert_gf_file|, which interprets the \.{GF} commands and
  1368  puts row counts into the |row| array.  The second, which we only
  1369  anticipate at the moment, actually packs the row counts into nybbles and
  1370  writes them to the packed file.
  1371  
  1372  @p @<Packing procedures@> ;
  1373  procedure convert_gf_file;
  1374  var
  1375     @!i, @!j, @!k : integer ; {general purpose indices}
  1376     @!gf_com : integer ; {current gf command}
  1377     @<Locals to |convert_gf_file|@>
  1378  begin
  1379     open_gf_file ;
  1380     if gf_byte <> pre then bad_gf('First byte is not preamble');
  1381  @.First byte is not preamble@>
  1382     if gf_byte <> gf_id_byte then
  1383          bad_gf('Identification byte is incorrect');
  1384  @.Identification byte incorrect@>
  1385     @<Find and interpret postamble@> ;
  1386     move_to_byte(2) ;
  1387     open_pk_file ;
  1388     @<Write preamble@> ;
  1389     repeat
  1390       gf_com := gf_byte ;
  1391       case gf_com of
  1392          boc, boc1 : @<Interpret character@> ;
  1393          @<Specials and |no_op| cases@> ;
  1394          post : ; {we will actually do the work for this one later}
  1395       othercases bad_gf('Unexpected ',gf_com:1,' command between characters')
  1396  @.Unexpected command@>
  1397       endcases ;
  1398     until gf_com = post ;
  1399     @<Write postamble@> ;
  1400  end ;
  1401  
  1402  @ We need a few easy macros to expand some case statements:
  1403  
  1404  @d four_cases(#)==#,#+1,#+2,#+3
  1405  @d sixteen_cases(#)==four_cases(#),four_cases(#+4),four_cases(#+8),
  1406           four_cases(#+12)
  1407  @d sixty_four_cases(#)==sixteen_cases(#),sixteen_cases(#+16),
  1408           sixteen_cases(#+32),sixteen_cases(#+48)
  1409  @d one_sixty_five_cases(#)==sixty_four_cases(#),sixty_four_cases(#+64),
  1410           sixteen_cases(#+128),sixteen_cases(#+144),four_cases(#+160),#+164
  1411  
  1412  @ In this program, all special commands are passed unchanged and any |no_op|
  1413  bytes are ignored, so we write some code to handle these:
  1414  
  1415  @<Specials and |no_op| cases@>=
  1416  four_cases(xxx1) : begin
  1417     pk_byte(gf_com - xxx1 + pk_xxx1) ;
  1418     i := 0 ; for j := 0 to gf_com - xxx1 do begin
  1419        k := gf_byte ; pk_byte(k) ; i := i * 256 + k ;
  1420     end ;
  1421     for j := 1 to i do pk_byte(gf_byte) ; end ;
  1422  yyy : begin pk_byte(pk_yyy) ; pk_word(gf_signed_quad) ; end ;
  1423  no_op :
  1424  
  1425  @ Now we need the routine that handles the character commands.  Again,
  1426  only a subset of the gf commands are permissible inside character
  1427  definitions, so we only look for these.
  1428  
  1429  @<Interpret character@>=
  1430  begin
  1431    if gf_com = boc then begin
  1432      gf_ch := gf_signed_quad ;
  1433      i := gf_signed_quad ; {dispose of back pointer}
  1434      min_m := gf_signed_quad ;
  1435      max_m := gf_signed_quad ;
  1436      min_n := gf_signed_quad ;
  1437      max_n := gf_signed_quad ;
  1438    end else begin
  1439      gf_ch := gf_byte ;
  1440      i := gf_byte ;
  1441      max_m := gf_byte ;
  1442      min_m := max_m - i ;
  1443      i := gf_byte ;
  1444      max_n := gf_byte ;
  1445      min_n := max_n - i ;
  1446    end ;
  1447    d_print_ln('Character ',gf_ch:1) ;
  1448    if gf_ch>=0 then gf_ch_mod_256 := gf_ch mod 256
  1449    else gf_ch_mod_256 := 255-((-(1+gf_ch)) mod 256);
  1450    if status[gf_ch_mod_256] = virgin then
  1451      bad_gf('no character locator for character ',gf_ch:1) ;
  1452  @.no character locator...@>
  1453    @<Convert character to packed form@> ;
  1454  end
  1455  
  1456  @ Communication between the procedures |convert_gf_file| and
  1457  |pack_and_send_character| is done with a few global variables.
  1458  
  1459  @<Glob...@>=
  1460  @!gf_ch : integer ; {the character we are working with}
  1461  @!gf_ch_mod_256 : integer ; {locator pointer}
  1462  @!pred_pk_loc : integer ; {where we predict the end of the character to be.}
  1463  @!max_n, @!min_n : integer ; {the maximum and minimum horizontal rows}
  1464  @!max_m, @!min_m : integer ; {the maximum and minimum vertical rows}
  1465  @!row_ptr : integer ; {where we are in the |row| array.}
  1466  
  1467  @ Now we are at the beginning of a character that we need the raster for.
  1468  Before we get into the complexities of decoding the |paint|, |skip|, and
  1469  |new_row| commands, let's define a macro that will help us fill up the
  1470  |row| array.  Note that we check that |row_ptr| never exceeds |max_row|;
  1471  Instead of
  1472  calling |bad_gf| directly, as this macro is repeated eight times, we simply
  1473  set the |bad| flag true.
  1474  
  1475  @d put_in_rows(#)==begin if row_ptr > max_row then bad := true else begin
  1476  row[row_ptr]:=#; incr(row_ptr); end ; end
  1477  
  1478  @ Now we have the procedure that decodes the various commands and puts counts
  1479  into the |row| array.  This would be a trivial procedure, except for
  1480  the |paint_0| command.  Because the |paint_0| command exists, it is possible
  1481  to have a sequence like |paint| 42, |paint_0|, |paint| 38, |paint_0|,
  1482  |paint_0|, |paint_0|, |paint| 33, |skip_0|.  This would be an entirely empty
  1483  row, but if we left the zeros in the |row| array, it would be difficult
  1484  to recognize the row as empty.
  1485  
  1486  This type of situation probably would never
  1487  occur in practice, but it is defined by the \.{GF} format, so we must be able
  1488  to handle it.  The extra code is really quite simple, just difficult to
  1489  understand; and it does not cut down the speed appreciably.  Our goal is
  1490  this: to collapse sequences like |paint| 42, |paint_0|, |paint| 32 to a single
  1491  count of 74, and to insure that the last count of a row is a black count rather
  1492  than a white count.  A buffer variable |extra|, and two state flags, |on| and
  1493  |state|, enable us to accomplish this.
  1494  
  1495  The |on| variable is essentially the |paint_switch| described in the \.{GF}
  1496  description.  If it is true, then we are currently painting black pixels.
  1497  The |extra| variable holds a count that is about to be placed into the
  1498  |row| array.  We hold it in this array until we get a |paint| command
  1499  of the opposite color that is greater than 0.  If we get a |paint_0| command,
  1500  then the |state| flag is turned on, indicating that the next count we receive
  1501  can be added to the |extra| variable as it is the same color.
  1502  
  1503  @<Convert character to packed form@>=
  1504  begin
  1505    bad := false ;
  1506    row_ptr := 2 ;
  1507    on := false ;
  1508    extra := 0 ;
  1509    state := true ;
  1510    repeat
  1511      gf_com := gf_byte ;
  1512      case gf_com of
  1513  @t\4@>@<Cases for |paint| commands@>;
  1514  four_cases(skip0) : begin
  1515    i := 0 ; for j := 1 to gf_com - skip0 do i := i * 256 + gf_byte ;
  1516    if on = state then put_in_rows(extra) ;
  1517    for j := 0 to i do put_in_rows(end_of_row) ;
  1518    on := false ; extra := 0 ; state := true ;
  1519  end ;
  1520  one_sixty_five_cases(new_row_0) : begin
  1521    if on = state then put_in_rows(extra) ;
  1522    put_in_rows(end_of_row) ;
  1523    on := true ; extra := gf_com - new_row_0 ; state := false ;
  1524  end ;
  1525  @t\4@>@<Specials and |no_op| cases@> ;
  1526  eoc : begin
  1527    if on = state then put_in_rows(extra) ;
  1528    if ( row_ptr > 2 ) and ( row[row_ptr - 1] <> end_of_row) then
  1529      put_in_rows(end_of_row) ;
  1530    put_in_rows(end_of_char) ;
  1531    if bad then abort('Ran out of internal memory for row counts!') ;
  1532  @.Ran out of memory@>
  1533    pack_and_send_character ;
  1534    status[gf_ch_mod_256] := sent ;
  1535    if pk_loc <> pred_pk_loc then
  1536      abort('Internal error while writing character!') ;
  1537  @.Internal error@>
  1538  end ;
  1539  othercases bad_gf('Unexpected ',gf_com:1,' command in character definition')
  1540  @.Unexpected command@>
  1541      endcases ;
  1542    until gf_com = eoc ;
  1543  end
  1544  
  1545  @ A few more locals used above and below:
  1546  
  1547  @<Locals to |convert_gf_file|@>=
  1548  @!on : boolean ; {indicates whether we are white or black}
  1549  @!state : boolean ; {a state variable---is the next count the same race as
  1550     the one in the |extra| buffer?}
  1551  @!extra : integer ; {where we pool our counts}
  1552  @!bad : boolean ; {did we run out of space?}
  1553  
  1554  @ @<Cases for |paint| commands@>=
  1555  paint_0 : begin
  1556    state := not state ;
  1557    on := not on ;
  1558  end ;
  1559  sixty_four_cases(paint_0+1),paint1+1,paint1+2 : begin
  1560    if gf_com < paint1 then i := gf_com - paint_0
  1561    else begin
  1562      i := 0 ; for j := 0 to gf_com - paint1 do i := i * 256 + gf_byte ;
  1563    end ;
  1564    if state then begin
  1565      extra := extra + i ;
  1566      state := false ;
  1567    end else begin
  1568      put_in_rows(extra) ;
  1569      extra := i ;
  1570    end ;
  1571    on := not on ;
  1572  end
  1573  
  1574  @ Our last remaining task is to interpret the postamble commands.  The only
  1575  things that may appear in the postamble are |post_post|, |char_loc|,
  1576  |char_loc0|, and the special commands.
  1577  Note that any special commands that might appear in the postamble are
  1578  not written to the |pk_file|.  Since \MF\ does not generate special commands
  1579  in the postamble, this should not be a major difficulty.
  1580  
  1581  @<Find and interpret postamble@>=
  1582  find_gf_length ;
  1583  if gf_len<8 then bad_gf('only ',gf_len:1,' bytes long');
  1584  @.only n bytes long@>
  1585  post_loc := gf_len - 4 ;
  1586  repeat
  1587     if post_loc = 0 then bad_gf('all 223''s');
  1588  @.all 223\char39s@>
  1589     move_to_byte(post_loc); k := gf_byte; decr(post_loc) ;
  1590  until k <> 223 ;
  1591  if k <> gf_id_byte then bad_gf('ID byte is ',k:1);
  1592  @.ID byte is wrong@>
  1593  if post_loc<5 then bad_gf('post location is ',post_loc:1) ;
  1594  @.post location is@>
  1595  move_to_byte(post_loc - 3);
  1596  q := gf_signed_quad ;
  1597  if (q<0) or (q>post_loc-3) then bad_gf('post pointer is ',q:1) ;
  1598  @.post pointer is wrong@>
  1599  move_to_byte(q) ; k := gf_byte ;
  1600  if k <> post then bad_gf('byte at ',q:1,' is not post') ;
  1601  @.byte is not post@>
  1602  i := gf_signed_quad ; {skip over junk}
  1603  design_size := gf_signed_quad ;
  1604  check_sum := gf_signed_quad ;
  1605  hppp := gf_signed_quad ;
  1606  h_mag := round ( hppp * 72.27 / 65536 ) ;
  1607  vppp := gf_signed_quad ;
  1608  if hppp <> vppp then print_ln('Odd aspect ratio!') ;
  1609  @.Odd aspect ratio@>
  1610  i := gf_signed_quad ; i := gf_signed_quad ; {skip over junk}
  1611  i := gf_signed_quad ; i := gf_signed_quad ;
  1612  repeat
  1613    gf_com := gf_byte ;
  1614    case gf_com of
  1615  char_loc, char_loc0 : begin
  1616    gf_ch := gf_byte ;
  1617    if status[gf_ch] <> virgin then
  1618      bad_gf('Locator for this character already found.');
  1619  @.Locator...already found@>
  1620    if gf_com = char_loc then begin
  1621      dx[gf_ch] := gf_signed_quad ;
  1622      dy[gf_ch] := gf_signed_quad ;
  1623    end else begin
  1624      dx[gf_ch] := gf_byte * 65536 ;
  1625      dy[gf_ch] := 0 ;
  1626    end ;
  1627    tfm_width[gf_ch] := gf_signed_quad ;
  1628    i := gf_signed_quad ;
  1629    status[gf_ch] := located ;
  1630  end ;
  1631  @<Specials and |no_op| cases@> ;
  1632  post_post : ;
  1633  othercases bad_gf('Unexpected ',gf_com:1,' in postamble')
  1634  @.Unexpected command@>
  1635    endcases ;
  1636  until gf_com = post_post
  1637  
  1638  @ Just a few more locals:
  1639  
  1640  @<Locals to |convert_gf_file|@>=
  1641  @!hppp, @!vppp : integer ; {horizontal and vertical pixels per point}
  1642  @!q : integer ; {quad temporary}
  1643  @!post_loc : integer ; {where the postamble was}
  1644  
  1645  @* Converting the counts to packed format.
  1646  This procedure is passed the set of row counts from the \.{GF} file.  It
  1647  writes the character to the \.{PK} file.  First, the minimum bounding box
  1648  is determined.  Next, the row-oriented count list is converted to a count
  1649  list based on the entire glyph.  Finally, we calculate
  1650  the optimal |dyn_f| and send the character.
  1651  
  1652  @<Packing procedures@>=
  1653  procedure pack_and_send_character ;
  1654  var i, @!j, @!k : integer ; {general indices}
  1655  @<Locals to |pack_and_send_character|@>
  1656  begin
  1657    @<Scan for bounding box@> ;
  1658    @<Convert row-list to glyph-list@> ;
  1659    @<Calculate |dyn_f| and packed size and write character@> ;
  1660  end
  1661  
  1662  @ Now we have the row counts in our |row| array.  To find the real |max_n|,
  1663  we look for
  1664  the first non-|end_of_row| value in the |row|.  If it is an |end_of_char|,
  1665  the entire character is blank.  Otherwise, we first eliminate all of the blank
  1666  rows at the end of the character.  Next, for each remaining row, we check the
  1667  first white count for a new |min_m|, and the total length of the row
  1668  for a new |max_m|.
  1669  
  1670  @<Scan for bounding box@>=
  1671  i := 2 ; decr(row_ptr) ;
  1672  while row[i] = end_of_row do incr(i) ;
  1673  if row[i] <> end_of_char then begin
  1674    max_n := max_n - i + 2 ;
  1675    while row[row_ptr - 2] = end_of_row do begin
  1676      decr(row_ptr) ;  row[row_ptr] := end_of_char ;
  1677    end ;
  1678    min_n := max_n + 1 ;
  1679    extra := max_m - min_m + 1 ;
  1680    max_m := 0 ;
  1681    j := i ;
  1682    while row[j] <> end_of_char do begin
  1683      decr(min_n) ;
  1684      if row[j] <> end_of_row then begin
  1685        k := row[j] ;
  1686        if k < extra then extra := k ;
  1687        incr(j) ;
  1688        while row[j] <> end_of_row do begin
  1689          k := k + row[j] ; incr(j) ;
  1690        end ;
  1691        if max_m < k then max_m := k ;
  1692      end ;
  1693      incr(j) ;
  1694    end ;
  1695    min_m := min_m + extra ;
  1696    max_m := min_m + max_m - 1 - extra ;
  1697    height := max_n - min_n + 1 ;
  1698    width := max_m - min_m + 1 ;
  1699    x_offset := - min_m ;
  1700    y_offset := max_n ;
  1701    d_print_ln('W ',width:1,' H ',height:1,' X ',x_offset:1, ' Y ',y_offset:1);
  1702  end else begin
  1703    height := 0 ; width := 0 ; x_offset := 0 ; y_offset := 0 ;
  1704    d_print_ln('Empty raster.');
  1705  end
  1706  
  1707  @ We must convert the run-count array from a row orientation to a glyph
  1708  orientation, with repeat counts for repeated rows.  We separate this task
  1709  into two smaller tasks, on a per row basis.  But first, we define a new
  1710  macro to help us fill up this new array.  Here, we have no fear that we will
  1711  run out of space, as the glyph representation is provably smaller than the
  1712  rows representation.
  1713  
  1714  @d put_count(#)==begin row[put_ptr] := #; incr(put_ptr);
  1715  if repeat_flag > 0 then begin
  1716     row[put_ptr] := - repeat_flag ; repeat_flag := 0 ; incr(put_ptr) ; end ;
  1717  end
  1718  
  1719  @<Convert row-list to glyph-list@>=
  1720  put_ptr := 0 ; row_ptr := 2 ; repeat_flag := 0 ;
  1721  state := true ; buff := 0 ;
  1722  while row[row_ptr] = end_of_row do incr(row_ptr) ;
  1723  while row[row_ptr] <> end_of_char do begin
  1724     @<Skip over repeated rows@> ;
  1725     @<Reformat count list@> ;
  1726  end ;
  1727  if buff > 0 then
  1728     put_count(buff) ;
  1729  put_count(end_of_char)
  1730  
  1731  @ Some more locals for |pack_and_send_character| used above:
  1732  
  1733  @<Locals to |pack_and_send_character|@>=
  1734  @!extra : integer ; {little buffer for count values}
  1735  @!put_ptr : integer ; {next location to fill in |row|}
  1736  @!repeat_flag : integer ; {how many times the current row is repeated}
  1737  @!h_bit : integer ; {horizontal bit count for each row}
  1738  @!buff : integer ; {our count accumulator}
  1739  
  1740  @ In this short section of code, we are at the beginning of a new row.
  1741  We scan forward, looking for repeated rows.  If there are any, |repeat_flag|
  1742  gets the count, and the |row_ptr| points to the beginning of the last of the
  1743  repeated rows.  Two points must be made here.  First, we do not count all-black
  1744  or all-white rows as repeated, as a large ``paint'' count will take care of
  1745  them, and also there is no black to white or white to black transition in the
  1746  row where we could insert a repeat count.  That is the meaning of the big
  1747  if statement that conditions this section.  Secondly, the |while row[i] =
  1748  row[j] do| loop is guaranteed to terminate, as $|j| > |i|$ and the character
  1749  is terminated by a unique |end_of_char| value.
  1750  
  1751  @<Skip over repeated rows@>=
  1752  i := row_ptr ;
  1753  if ( row[i] <> end_of_row ) and ( ( row[i] <> extra ) or ( row[i+1] <>
  1754     width ) ) then begin
  1755     j := i + 1 ;
  1756     while row[j-1] <> end_of_row do incr(j) ;
  1757     while row[i] = row[j] do begin
  1758        if row[i] = end_of_row then begin
  1759           incr(repeat_flag) ;
  1760           row_ptr := i + 1 ;
  1761        end ;
  1762        incr(i) ; incr(j) ;
  1763     end ;
  1764  end
  1765  
  1766  @ Here we actually spit out a row.  The routine is somewhat similar to the
  1767  routine where we actually interpret the \.{GF} commands in the count buffering.
  1768  We must make sure to keep track of how many bits have actually been sent, so
  1769  when we hit the end of a row, we can send a white count for the remaining
  1770  bits, and possibly add the white count of the next row to it.  And, finally,
  1771  we must not forget to subtract the |extra| white space at the beginning of
  1772  each row from the first white count.
  1773  
  1774  @<Reformat count list@>=
  1775  if row[row_ptr] <> end_of_row then row[row_ptr] := row[row_ptr] - extra ;
  1776  h_bit := 0;
  1777  while row[row_ptr] <> end_of_row do begin
  1778     h_bit := h_bit + row[row_ptr] ;
  1779     if state then begin
  1780        buff := buff + row[row_ptr] ;
  1781        state := false ;
  1782     end else if row[row_ptr] > 0 then begin
  1783        put_count(buff) ;
  1784        buff := row[row_ptr] ;
  1785     end else state := true ;
  1786     incr(row_ptr) ;
  1787  end ;
  1788  if h_bit < width then
  1789     if state then
  1790        buff := buff + width - h_bit
  1791     else begin
  1792        put_count(buff) ;
  1793        buff := width - h_bit ;
  1794        state := true ;
  1795     end
  1796  else state := false ;
  1797  incr(row_ptr)
  1798  
  1799  @ Here is another piece of rather intricate code.  We determine the
  1800  smallest size in which we can pack the data, calculating |dyn_f| in the
  1801  process.  To do this, we calculate the size required if |dyn_f| is 0, and put
  1802  this in |comp_size|.  Then, we calculate the changes in the size for each
  1803  increment of |dyn_f|, and stick these values in the |deriv| array.  Finally,
  1804  we scan through this array and find the final minimum value, which we then
  1805  use to send the character data.
  1806  
  1807  @<Calculate |dyn_f| and packed size and write character@>=
  1808  for i := 1 to 13 do deriv[i] := 0 ;
  1809  i := 0 ;
  1810  first_on := row[i] = 0 ;
  1811  if first_on then incr(i) ;
  1812  comp_size := 0 ;
  1813  while row[i] <> end_of_char do
  1814     @<Process count for best |dyn_f| value@> ;
  1815  b_comp_size := comp_size ;
  1816  dyn_f := 0 ;
  1817  for i := 1 to 13 do begin
  1818     comp_size := comp_size + deriv[i] ;
  1819     if comp_size <= b_comp_size then begin
  1820        b_comp_size := comp_size ;
  1821        dyn_f := i ;
  1822     end ;
  1823  end ;
  1824  comp_size := (b_comp_size + 1) div 2 ;
  1825  if (comp_size > (height * width + 7) div 8) or (height * width = 0) then begin
  1826     comp_size := (height * width + 7) div 8 ;
  1827     dyn_f := 14 ;
  1828  end ;
  1829  d_print_ln('Best packing is dyn_f of ',dyn_f:1,' with length '
  1830      ,comp_size:1);
  1831  @<Write character preamble@> ;
  1832  if dyn_f <> 14 then
  1833     @<Send compressed format@>
  1834  else if height > 0 then
  1835     @<Send bit map@>
  1836  
  1837  @ When we enter this module, we have a count at |row[i]|.  First, we add to
  1838  the |comp_size| the number of
  1839  nybbles that this count would require, assuming |dyn_f| to be zero.  When
  1840  |dyn_f| is zero, there are no one nybble counts, so we simply choose between
  1841  two-nybble and extensible counts and add the appropriate value.
  1842  
  1843  Next, we take the count value and determine the value of |dyn_f| (if any) that
  1844  would cause this count to take either more or less nybbles.  If a valid value
  1845  for |dyn_f| exists in this range, we accumulate this change in the |deriv|
  1846  array.
  1847  
  1848  One special case handled here is a repeat count of one.
  1849  A repeat count of one will never change the length of the raster
  1850  representation, no matter what |dyn_f| is, because it is always
  1851  represented by the nybble value 15.
  1852  
  1853  @<Process count for best |dyn_f| value@>=
  1854  begin
  1855     j := row[i] ;
  1856     if j = -1 then incr(comp_size)
  1857     else begin
  1858        if j < 0 then begin
  1859           incr(comp_size) ;
  1860           j := - j ;
  1861        end ;
  1862        if j < 209 then comp_size := comp_size + 2
  1863        else begin
  1864           k := j - 193 ;
  1865           while k >= 16 do begin
  1866              k := k div 16 ;
  1867              comp_size := comp_size + 2 ;
  1868           end ;
  1869           incr(comp_size) ;
  1870        end ;
  1871        if j < 14 then decr(deriv[j])
  1872        else if j < 209 then incr(deriv[(223 - j) div 15])
  1873        else begin
  1874           k := 16 ;
  1875           while ( k * 16 < j + 3 ) do k := k * 16 ;
  1876           if j-k <= 192 then deriv[(207-j+k) div 15] := deriv[(207-j+k) div 15]
  1877              + 2 ;
  1878         end ;
  1879     end ;
  1880     incr(i) ;
  1881  end
  1882  
  1883  @ We need a handful of locals:
  1884  
  1885  @<Locals to |pack_and_send_character|@>=
  1886  @!dyn_f : integer ; {packing value}
  1887  @!height, @!width : integer ; {height and width of character}
  1888  @!x_offset, @!y_offset : integer ; {offsets}
  1889  @!deriv : array[1..13] of integer ; {derivative}
  1890  @!b_comp_size : integer ; {best size}
  1891  @!first_on : boolean ; {indicates that the first bit is on}
  1892  @!flag_byte : integer ; {flag byte for character}
  1893  @!state : boolean ; {state variable}
  1894  @!on : boolean ; {white or black?}
  1895  
  1896  @ Now we write the character preamble information.  First we need to determine
  1897  which of the three formats we should use.
  1898  
  1899  @<Write character preamble@>=
  1900  flag_byte := dyn_f * 16 ;
  1901  if first_on then flag_byte := flag_byte + 8 ;
  1902  if (gf_ch <> gf_ch_mod_256) or (tfm_width[gf_ch_mod_256] > 16777215) or
  1903        (tfm_width[gf_ch_mod_256] < 0) or (dy[gf_ch_mod_256] <> 0) or
  1904        (dx[gf_ch_mod_256] < 0) or (dx[gf_ch_mod_256] mod 65536 <> 0) or
  1905        (comp_size > 196594) or (width > 65535) or
  1906        (height > 65535) or (x_offset > 32767) or (y_offset > 32767) or
  1907        (x_offset < -32768) or (y_offset < -32768) then
  1908     @<Write long character preamble@>
  1909  else if (dx[gf_ch] > 16777215) or (width > 255) or (height > 255) or
  1910        (x_offset > 127) or (y_offset > 127) or (x_offset < -128) or
  1911        (y_offset < -128) or (comp_size > 1015) then
  1912     @<Write two-byte short character preamble@>
  1913  else
  1914     @<Write one-byte short character preamble@>
  1915  
  1916  @ If we must write a long character preamble, we
  1917  adjust a few parameters, then write the data.
  1918  
  1919  @<Write long character preamble@>=
  1920  begin
  1921     flag_byte := flag_byte + 7 ;
  1922     pk_byte(flag_byte) ;
  1923     comp_size := comp_size + 28 ;
  1924     pk_word(comp_size) ;
  1925     pk_word(gf_ch) ;
  1926     pred_pk_loc := pk_loc + comp_size ;
  1927     pk_word(tfm_width[gf_ch_mod_256]) ;
  1928     pk_word(dx[gf_ch_mod_256]) ;
  1929     pk_word(dy[gf_ch_mod_256]) ;
  1930     pk_word(width) ;
  1931     pk_word(height) ;
  1932     pk_word(x_offset) ;
  1933     pk_word(y_offset) ;
  1934  end
  1935  
  1936  @ Here we write a short short character preamble, with one-byte size
  1937  parameters.
  1938  
  1939  @<Write one-byte short character preamble@>=
  1940  begin
  1941     comp_size := comp_size + 8 ;
  1942     flag_byte := flag_byte + comp_size div 256 ;
  1943     pk_byte(flag_byte) ;
  1944     pk_byte(comp_size mod 256) ;
  1945     pk_byte(gf_ch) ;
  1946     pred_pk_loc := pk_loc + comp_size ;
  1947     pk_three_bytes(tfm_width[gf_ch_mod_256]) ;
  1948     pk_byte(dx[gf_ch_mod_256] div 65536) ;
  1949     pk_byte(width) ;
  1950     pk_byte(height) ;
  1951     pk_byte(x_offset) ;
  1952     pk_byte(y_offset) ;
  1953  end
  1954  
  1955  @ Here we write an extended short character preamble, with two-byte
  1956  size parameters.
  1957  
  1958  @<Write two-byte short character preamble@>=
  1959  begin
  1960     comp_size := comp_size + 13 ;
  1961     flag_byte := flag_byte + comp_size div 65536 + 4 ;
  1962     pk_byte(flag_byte) ;
  1963     pk_halfword(comp_size mod 65536) ;
  1964     pk_byte(gf_ch) ;
  1965     pred_pk_loc := pk_loc + comp_size ;
  1966     pk_three_bytes(tfm_width[gf_ch_mod_256]) ;
  1967     pk_halfword(dx[gf_ch_mod_256] div 65536) ;
  1968     pk_halfword(width) ;
  1969     pk_halfword(height) ;
  1970     pk_halfword(x_offset) ;
  1971     pk_halfword(y_offset) ;
  1972  end
  1973  
  1974  @ At this point, we have decided that the run-encoded format is smaller.  (This
  1975  is almost always the case.)  We send out the data, a nybble at a time.
  1976  
  1977  @<Send compressed format@>=
  1978  begin
  1979     bit_weight := 16 ;
  1980     max_2 := 208 - 15 * dyn_f ;
  1981     i := 0 ;
  1982     if row[i] = 0 then incr(i) ;
  1983     while row[i] <> end_of_char do begin
  1984        j := row[i] ;
  1985        if j = -1 then
  1986           pk_nyb(15)
  1987        else begin
  1988           if j < 0 then begin
  1989              pk_nyb(14) ;
  1990              j := - j ;
  1991           end ;
  1992           if j <= dyn_f then pk_nyb(j)
  1993           else if j <= max_2 then begin
  1994              j := j - dyn_f - 1 ;
  1995              pk_nyb(j div 16 + dyn_f + 1) ;
  1996              pk_nyb(j mod 16) ;
  1997           end else begin
  1998              j := j - max_2 + 15 ;
  1999              k := 16 ;
  2000              while k <= j do begin
  2001                 k := k * 16 ;
  2002                 pk_nyb(0) ;
  2003              end ;
  2004              while k > 1 do begin
  2005                 k := k div 16 ;
  2006                 pk_nyb(j div k) ;
  2007                 j := j mod k ;
  2008              end ;
  2009           end ;
  2010        end ;
  2011        incr(i) ;
  2012     end ;
  2013     if bit_weight <> 16 then pk_byte(output_byte) ;
  2014  end
  2015  
  2016  @ This code is for the case where we have decided to send the character raster
  2017  packed by bits.  It uses the bit counts as well, sending eight at a time.
  2018  Here we have a miniature packed format interpreter, as we must repeat any rows
  2019  that are repeated.  The algorithm to do this was a lot of fun to generate.  Can
  2020  you figure out how it works?
  2021  
  2022  @<Send bit map@>=
  2023  begin
  2024     buff := 0 ;
  2025     p_bit := 8 ;
  2026     i := 1 ;
  2027     h_bit := width ;
  2028     on := false ;
  2029     state := false ;
  2030     count := row[0] ;
  2031     repeat_flag := 0 ;
  2032     while ( row[i] <> end_of_char ) or state or ( count > 0 ) do begin
  2033        if state then begin
  2034           count := r_count ; i := r_i ; on := r_on ;
  2035           decr(repeat_flag) ;
  2036        end else begin
  2037           r_count := count ; r_i := i ; r_on := on ;
  2038        end ;
  2039        @<Send one row by bits@> ;
  2040        if state and ( repeat_flag = 0 ) then begin
  2041           count := s_count ; i := s_i ; on := s_on ;
  2042           state := false ;
  2043        end else if not state and ( repeat_flag > 0 ) then begin
  2044           s_count := count ; s_i := i ; s_on := on ;
  2045           state := true ;
  2046        end ;
  2047     end ;
  2048     if p_bit <> 8 then pk_byte(buff) ;
  2049  end
  2050  
  2051  @ All of the remaining locals:
  2052  
  2053  @<Locals to |pack_and_send_character|@>=
  2054  @!comp_size : integer ; {length of the packed representation in bytes}
  2055  @!count : integer ; {number of bits in current state to send}
  2056  @!p_bit : integer ; {what bit are we about to send out?}
  2057  @!r_on, @!s_on : boolean ; {state saving variables}
  2058  @!r_count, @!s_count : integer ; {ditto}
  2059  @!r_i, @!s_i : integer ; {and again.}
  2060  @!max_2 : integer ; {the highest count that fits in two bytes}
  2061  
  2062  @ We make the |power| array global.
  2063  
  2064  @<Glob...@>=
  2065  @!power : array[0..8] of integer ; {easy powers of two}
  2066  
  2067  @ We initialize the power array.
  2068  
  2069  @<Set init...@>=
  2070  power[0] := 1 ;
  2071  for i := 1 to 8 do power[i] := power[i-1] + power[i-1] ;
  2072  
  2073  @ Here we are at the beginning of a row and simply output the next |width| bits.
  2074  We break the possibilities up into three cases: we finish a byte but not
  2075  the row, we finish a row, and we finish neither a row nor a byte.  But,
  2076  first, we insure that we have a |count| value.
  2077  
  2078  @<Send one row by bits@>=
  2079  repeat
  2080     if count = 0 then begin
  2081        if row[i] < 0 then begin
  2082           if not state then repeat_flag := - row[i] ;
  2083           incr(i) ;
  2084        end ;
  2085        count := row[i] ;
  2086        incr(i) ;
  2087        on := not on ;
  2088     end ;
  2089     if ( count >= p_bit ) and ( p_bit < h_bit ) then begin
  2090  { we end a byte, we don't end the row }
  2091        if on then buff := buff + power[p_bit] - 1 ;
  2092        pk_byte(buff) ; buff := 0 ;
  2093        h_bit := h_bit - p_bit ; count := count - p_bit ; p_bit := 8 ;
  2094     end else if ( count < p_bit ) and ( count < h_bit ) then begin
  2095  { we end neither the row nor the byte }
  2096        if on then buff := buff + power[p_bit] - power[p_bit - count] ;
  2097        p_bit := p_bit - count ; h_bit := h_bit - count ; count := 0 ;
  2098     end else begin
  2099  { we end a row and maybe a byte }
  2100        if on then buff := buff + power[p_bit] - power[p_bit - h_bit] ;
  2101        count := count - h_bit ; p_bit := p_bit - h_bit ; h_bit := width ;
  2102        if p_bit = 0 then begin
  2103           pk_byte(buff) ; buff := 0 ; p_bit := 8 ;
  2104        end ;
  2105     end ;
  2106  until h_bit = width
  2107  
  2108  @ Now we are ready for the routine that writes the preamble of the packed
  2109  file.
  2110  
  2111  @d preamble_comment == 'GFtoPK 2.4 output from '
  2112  @d comm_length = 23 {length of |preamble_comment|}
  2113  @d from_length = 6 {length of its |' from '| part}
  2114  
  2115  @<Write preamble@>=
  2116  pk_byte(pk_pre) ;
  2117  pk_byte(pk_id) ;
  2118  i := gf_byte ; {get length of introductory comment}
  2119  repeat if i=0 then j:="."@+else j:=gf_byte;
  2120  decr(i); {some people think it's wise to avoid |goto| statements}
  2121  until j<>" "; {remove leading blanks}
  2122  incr(i); {this many bytes to copy}
  2123  if i=0 then k:=comm_length-from_length
  2124  else k := i+comm_length;
  2125  if k>255 then pk_byte(255)@+else pk_byte(k);
  2126  for k := 1 to comm_length do
  2127    if(i>0)or(k<=comm_length-from_length) then pk_byte(xord[comment[k]]) ;
  2128  print('''') ;
  2129  for k := 1 to i do
  2130    begin if k>1 then j:=gf_byte;
  2131    print(xchr[j]);
  2132    if k<256-comm_length then pk_byte(j);
  2133    end;
  2134  print_ln('''') ;@/
  2135  pk_word(design_size) ;
  2136  pk_word(check_sum) ;
  2137  pk_word(hppp) ;
  2138  pk_word(vppp)
  2139  
  2140  @ Of course, we need an array to hold the comment.
  2141  
  2142  @<Glob...@>=
  2143  @!comment : packed array[1..comm_length] of char ;
  2144  
  2145  @ @<Set init...@>=
  2146  comment := preamble_comment ;
  2147  
  2148  @ Writing the postamble is even easier.
  2149  
  2150  @<Write postamble@>=
  2151  pk_byte(pk_post) ;
  2152  while (pk_loc mod 4 <> 0) do pk_byte(pk_no_op)
  2153  
  2154  @ Once we are finished with the \.{GF} file, we check the status of each
  2155  character to insure that each character that had a locator also had raster
  2156  information.
  2157  
  2158  @<Check for unrasterized locators@>=
  2159  for i := 0 to 255 do
  2160     if status[i] = located then
  2161        print_ln('Character ',i:1,' missing raster information!')
  2162  @.missing raster information@>
  2163  
  2164  @ Finally, the main program.
  2165  
  2166  @p begin
  2167    initialize ;
  2168    convert_gf_file ;
  2169    @<Check for unrasterized locators@> ;
  2170    print_ln(gf_len:1,' bytes packed to ',pk_loc:1,' bytes.') ;
  2171  final_end : end .
  2172  
  2173  @ A few more globals.
  2174  
  2175  @<Glob...@>=
  2176  @!check_sum : integer ; {the checksum of the file}
  2177  @!design_size : integer ; {the design size of the font}
  2178  @!h_mag : integer ; {the pixel magnification in pixels per inch}
  2179  @!i : integer ;
  2180  
  2181  @* System-dependent changes.
  2182  This section should be replaced, if necessary, by changes to the program
  2183  that are necessary to make \.{GFtoPK} work at a particular installation.
  2184  It is usually best to design your change file so that all changes to
  2185  previous sections preserve the section numbering; then everybody's version
  2186  will be consistent with the printed program. More extensive changes,
  2187  which introduce new sections, can be inserted here; then only the index
  2188  itself will get a new section number.
  2189  @^system dependencies@>
  2190  
  2191  @* Index.
  2192  Pointers to error messages appear here together with the section numbers
  2193  where each ident\-i\-fier is used.