% The TRAP manual: How to validate MF --- last updated by D E Knuth on 4 Dec 89
\font\eighttt= cmtt8
\font\eightrm= cmr8
\font\titlefont=cmssdc10 at 40pt
\let\mc=\eightrm
\font\logo=manfnt % font used for the METAFONT logo
\def\MF{{\logo META}\-{\logo FONT}}
\rm
\let\mainfont=\tenrm

\def\.#1{\hbox{\tt#1}}
\def\\#1{\hbox{\it#1\/\hskip.05em}} % italic type for identifiers

\parskip 2pt plus 1pt
\baselineskip 12pt plus .25pt

\def\verbatim#1{\begingroup \frenchspacing
  \def\do##1{\catcode`##1=12 } \dospecials
  \parskip 0pt \parindent 0pt
  \catcode`\ =\active \catcode`\^^M=\active
  \tt \def\par{\ \endgraf} \obeylines \obeyspaces
  \input #1 \endgroup}
% a blank line will be typeset at the end of the file;
% if you're unlucky it will appear on a page by itself!
{\obeyspaces\global\let =\ }

\output{\shipout\box255\global\advance\pageno by 1} % for the title page only
\null
\vfill
\centerline{\titlefont A Torture Test}
\vskip8pt
\centerline{\titlefont for \logo ()*+,-.*}
\vskip 24pt
\centerline{by Donald E. Knuth}
\centerline{Stanford University}
\vskip 6pt
\centerline{({\sl Version 2, January 1990\/})}
\vfill
\centerline{\vbox{\hsize 4in
\noindent Programs that claim to be implementations of \MF84 are
supposed to be able to process the test routine contained in this
report, producing the outputs contained in this report.}}
\vskip 24pt
{\baselineskip 9pt
\eightrm\noindent
The preparation of this report was supported in part by the National Science
Foundation under grants IST-8201926 and MCS-8300984,
and by the System Development Foundation.
{\logo opqrstuq} is a trademark of Addison-Wesley Publishing Company.


}\pageno=0\eject

\output{\shipout\vbox{ % for subsequent pages
    \baselineskip0pt\lineskip0pt
    \hbox to\hsize{\strut
      \ifodd\pageno \hfil\eightrm\firstmark\hfil
        \mainfont\the\pageno
      \else\mainfont\the\pageno\hfil
        \eightrm\firstmark\hfil\fi}
    \vskip 10pt
    \box255}
  \global\advance\pageno by 1}
\let\runninghead=\mark
\outer\def\section#1.{\noindent{\bf#1.}\quad
  \runninghead{\uppercase{#1} }\ignorespaces}

\section Introduction.
People often think that their programs are ``debugged'' when large applications
have been run successfully. But system programmers know that a typical large
application tends to use at most about 50 per cent of the instructions
in a typical compiler. Although the other half of the code---which tends
to be the ``harder half''---might be riddled with errors, the system seems
to be working quite impressively until an unusual case shows up on the
next day. And on the following day another error manifests itself, and so on;
months or years go by before certain parts of the compiler are even
activated, much less tested in combination with other portions of the system,
if user applications provide the only tests.

How then shall we go about testing a compiler? Ideally we would like to
have a formal proof of correctness, certified by a computer.
This would give us a lot of confidence,
although of course the formal verification program might itself be incorrect.
A more serious drawback of automatic verification is that the formal
specifications of the compiler are likely to be wrong, since they aren't
much easier to write than the compiler itself. Alternatively, we can
substitute an informal proof of correctness: The programmer writes his or
her code in a structured manner and checks that appropriate relations
remain invariant, etc. This helps greatly to reduce errors, but it cannot
be expected to remove them completely; the task of checking a large
system is sufficiently formidable that human beings cannot do it without
making at least a few slips here and there.

Thus, we have seen that test programs are unsatisfactory if they are simply
large user applications; yet some sort of test program is needed because
proofs of correctness aren't adequate either. People have proposed schemes
for constructing test data automatically from a program text, but such
approaches run the risk of circularity, since they cannot assume that a
given program has the right structure.

I have been having good luck with a somewhat different approach,
first used in 1960 to debug an {\mc ALGOL} compiler. The idea is to
construct a test file that is about as different from a typical user
application as could be imagined. Instead of testing things that people
normally want to do, the file tests complicated things that people would
never dare to think of, and it embeds these complexities in still
more arcane constructions. Instead of trying to make the compiler do the
right thing, the goal is to make it fail (until the bugs have all been found).

To write such a fiendish test routine, one simply gets into a nasty frame
of mind and tries to do everything in the unexpected way. Parameters
that are normally positive are set negative or zero; borderline cases
are pushed to the limit; deliberate errors are made in hopes that the
compiler will not be able to recover properly from them.

A user's application tends to exercise 50\%\ of a compiler's logic,
but my first fiendish tests tend to improve this to about 90\%. As the
next step I generally make use of frequency-counting software to identify
the instructions that have still not been called upon. Then I add ever more
fiendishness to the test routine, until more than 99\%\ of the code
has been used at least once. (The remaining bits are things that
can occur only if the source program is really huge, or if certain
fatal errors are detected; or they are cases so similar to other well-tested
things that there can be little doubt of their validity.)

Of course, this is not guaranteed to work. But my experience in 1960 was
that only two bugs were ever found in that {\mc ALGOL} compiler after it
correctly translated that original fiendish test. And one of those bugs
was actually present in the results of the test; I simply had failed to
notice that the output was incorrect. Similar experiences occurred later
during the 60s and 70s, with respect to a few assemblers, compilers,
and simulators that I wrote.

This method of debugging, combined with the methodology of structured
programming and informal proofs (otherwise known as careful desk checking),
leads to greater reliability of production software than any other
method I know. Therefore I have used it in developing \MF84, and the
main bulk of this report is simply a presentation of the test program
that was used to get the bugs out of \MF.

Such a test file is useful also after a program has been debugged, since
it can be used to give some assurance that subsequent modifications don't
mess things up.

The test file is called \.{TRAP.MF}, because of my warped sense of humor:
\MF's companion system, \TeX, has a similar test file called \.{TRIP}, and I
couldn't help thinking about Billy Goat Gruff and the story of ``trip,
trap, trip, trap.''

The contents of this test file are so remote from what people actually
do with \MF, I feel apologetic if I have to explain the correct
translation of \.{TRAP.MF}; nobody really cares about most of the
nitty-gritty rules that are involved. Yet I believe \.{TRAP} exemplifies
the sort of test program that has outstanding diagnostic ability, as
explained above.

If somebody claims to have a correct implementation of \MF, I will not
believe it until I see that \.{TRAP.MF} is translated properly.
I propose, in fact, that a program must meet two criteria before it
can justifiably be called \MF: (1)~The person who wrote it must be
happy with the way it works at his or her installation; and (2)~the
program must produce the correct results from \.{TRAP.MF}.

\MF\ is in the public domain, and its algorithms are published;
I've done this since I do not want to discourage its use by placing
proprietary restrictions on the software. However, I don't want
faulty imitations to masquerade as \MF\ processors, since users
want \MF\ to produce identical results on different machines.
Hence I am planning to do whatever I can to suppress any systems that
call themselves \MF\ without meeting conditions (1) and~(2).
I have copyrighted the programs so that I have some chance to forbid
unauthorized copies; I explicitly authorize copying of correct
\MF\ implementations, and not of incorrect ones!

The remainder of this report consists of appendices, whose contents ought
to be described briefly here:

Appendix A explains in detail how to carry out a test of \MF, given
a tape that contains copies of the other appendices.

Appendix B is \.{TRAP.MF}, the fiendish test file that has already
been mentioned. People who think that they understand \MF\ are challenged
to see if they know what \MF\ is supposed to do with this file.
People who know only a little about \MF\ might still find it
interesting to study Appendix~B, just to get some insights into the
methodology advocated here.

Appendix C is \.{TRAPIN.LOG}, a correct transcript file \.{TRAP.LOG}
that results if \.{INIMF} is applied to \.{TRAP.MF}. (\.{INIMF} is
the name of a version of \MF\ that does certain initializations;
this run of \.{INIMF} also creates a binary base file called \.{TRAP.BASE}.)

Appendix D is a correct transcript file \.{TRAP.LOG} that results if
\.{INIMF} or any other version of \MF\ is applied to \.{TRAP.MF}
with base file \.{TRAP.BASE}.

Appendix E is \.{TRAP.TYP}, the symbolic version of a correct output
file \.{TRAP.72270GF} that was produced at the same time as the \.{TRAP.LOG}
file of Appendix~D.

Appendix F is \.{TRAP.PL}, the symbolic version of a correct output
file \.{TRAP.TFM} that was produced at the same time as the \.{TRAP.LOG}
file of Appendix~D.

Appendix G is \.{TRAP.FOT}, an abbreviated version of Appendix D that
appears on the user's terminal during the run that produces \.{TRAP.LOG},
\.{TRAP.72270GF}, and \.{TRAP.TFM}.

The debugging of \MF\ and the testing of the adequacy of \.{TRAP.MF}
could not have been done nearly as well as reported here except for
the magnificent software support provided by my colleague David R. Fuchs.
In particular, he extended our local Pascal compiler so that
frequency counting and a number of other important features were added
to its online debugging abilities.

The method of testing advocated here has one chief difficulty that deserves
comment: I had to verify by hand that \MF\ did the right things
to \.{TRAP.MF}. This took many hours, and perhaps I have missed
something (as I did in 1960); I must confess that I have not checked
every single number in Appendices D, E, and~F. However, I'm willing to pay
$\$$81.92 to the first finder of any remaining bug in \MF, and I will
be surprised if that bug doesn't show up also in one of these appendices.

\vfill\eject

\section Appendix A: How to test \MF.

\item{0.} Let's assume that you have a tape containing \.{TRAP.MF},
\.{TRAPIN.LOG}, \.{TRAP.LOG}, \.{TRAP.TYP}, \.{TRAP.PL}, and \.{TRAP.FOT},
as in Appendices B, C, D, E, F, and~G. Furthermore, let's suppose that you
have a working \.{WEB} system, and that you have working programs
\.{TFtoPL} and \.{GFtype}, as described in the \TeX ware and \MF ware reports.

\item{1.} Prepare a version of \.{INIMF}. (This means that your \.{WEB}
change file should have {\bf init} and {\bf tini} defined to be null.)
The {\bf debug} and {\bf gubed} macros should be null, in order to
activate special printouts that occur when $\\{tracingedges}>1.0$.
The {\bf stat} and {\bf tats} macros should also be null, so that
statistics are kept. Set \\{mem\_top} and \\{mem\_max} to 3000
(or to \\{mem\_min} plus 3000, if \\{mem\_min} isn't zero),
for purposes of this test version.
Also set $\\{error\_line}=64$, $\\{half\_error\_line}=32$,
$\\{max\_print\_line}=72$, $\\{screen\_width}=100$, and
$\\{screen\_depth}=200$; these parameters affect many of the lines of
the test output, so your job will be much easier if you use the same
settings that were used to produce Appendix~D. Also (if possible) set
$\\{gf\_buf\_size}=8$, since this tests more parts of the program.
You probably should also use the ``normal'' settings of other parameters
found in \.{MF.WEB} (e.g., $\\{max\_internal}=100$, $\\{buf\_size}=500$,
etc.), since these show up in a few lines of the test output. Finally,
change \MF's screen-display routines by putting the following simple lines
in the change file:
$$\vbox{\halign{\tt#\hfil\cr
\char`\@x Screen routines:\cr
begin init\char`\_screen:=false;\cr
\char`\@y\cr
begin init\char`\_screen:=true;
 \char`\{screen instructions will be logged\char`\}\cr
\char`\@z\cr}}$$
None of the other screen routines (\\{update\_screen}, \\{blank\_rectangle},
\\{paint\_row}) should be changed in any way; the effect will be to have
\MF's actions recorded in the transcript files instead of on the screen,
in a machine-independent way.

\item{2.} Run the \.{INIMF} prepared in step 1. In response to the first
`\.{**}' prompt, type carriage return (thus getting another `\.{**}').
Then type `\.{\char`\\input trap}'. You should get an output that matches
the file \.{TRAPIN.LOG} (Appendix~C). Don't be alarmed by the error
messages that you see, unless they are different from those in Appendix~C.

\def\sp{{\char'40}}
\item{3.} Run \.{INIMF} again. This time type `\.{\sp\&trap\sp\sp trap\sp}'.
(The spaces in this input help to check certain parts of \MF\ that
aren't otherwise used.) You should get outputs \.{TRAP.LOG}, \.{TRAP.72270GF},
and \.{TRAP.TFM}.
Furthermore, your terminal should receive output that matches \.{TRAP.FOT}
(Appendix~G). During the middle part of this test, however, the terminal
will not be getting output, because \.{batchmode} is being
tested; don't worry if nothing seems to be happening for a while---nothing
is supposed to.

\item{4.} Compare the \.{TRAP.LOG} file from step 3 with the ``master''
\.{TRAP.LOG} file of step~0. (Let's hope you put that master file in a
safe place so that it wouldn't be clobbered.) There should be perfect
agreement between these files except in the following respects:

\itemitem{a)} The dates and possibly the file names will
naturally be different.

\itemitem{b)} If you had different values for \\{stack\_size}, \\{buf\_size},
etc., the corresponding capacity values will be different when they
are printed out at the end.

\itemitem{c)} Help messages may be different; indeed, the author encourages
non-English help messages in versions of \MF\ for people who don't
understand English as well as some other language.

\itemitem{d)} The total number and length of strings at the end and/or
``still untouched'' may well be different.

\itemitem{e)} If your \MF\ uses a different memory allocation or
packing scheme, the memory usage statistics may change.

\itemitem{f)} If you use a different storage allocation scheme, the
capsule numbers will probably be different, but the order of variables
should be unchanged when dependent variables are shown. \MF\ should also
choose the same variables to be dependent.

\itemitem{g)} If your computer handles integer division of negative operands
in a nonstandard way, you may get results that are rounded differently.
Although \TeX\ is careful to be machine-independent in this regard,
\MF\ is not, because integer divisions are present in so many places.
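
Exception (g) can be made concrete with a small sketch. It is written in Python and is not part of \MF; the helper names \.{trunc\_div}, \.{floor\_div}, and \.{disagreements} are hypothetical. It assumes that the ``standard'' convention is truncation toward zero and that the typical nonstandard one rounds toward minus infinity. Under that assumption the two conventions differ only when the division is inexact and exactly one operand is negative.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer quotient rounded toward zero (assumed 'standard' convention)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floor_div(a: int, b: int) -> int:
    """Integer quotient rounded toward minus infinity (Python's own //)."""
    return a // b


def disagreements(pairs):
    """Return the (a, b) pairs on which the two conventions differ."""
    return [(a, b) for a, b in pairs if trunc_div(a, b) != floor_div(a, b)]


if __name__ == "__main__":
    samples = [(7, 2), (-7, 2), (7, -2), (-7, -2), (-8, 2)]
    for a, b in samples:
        # Columns: dividend, divisor, truncated quotient, floored quotient.
        print(a, b, trunc_div(a, b), floor_div(a, b))
    print(disagreements(samples))
```

A port whose division behaves like \.{floor\_div} could therefore show small rounding differences in the transcript without any other bug being present.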

\item{5.} Use \.{GFtype} to convert your file \.{TRAP.72270GF} to a file
\.{TRAP.TYP}. (Both of \.{GFtype}'s options, i.e., mnemonic output and image
output, should be enabled so that you get the maximum amount of output.)
The resulting file should agree with the master \.{TRAP.TYP} file of step~0,
assuming that your \.{GFtype} has the ``normal'' values of compile-time
constants ($\\{top\_pixel}=69$, etc.).

\item{6.} Use \.{TFtoPL} to convert your file \.{TRAP.TFM} to a file
\.{TRAP.PL}. The resulting file should agree with the master \.{TRAP.PL}
file of step~0.

\item{7.} You might also wish to test \.{TRAP} with other versions of
\MF\ (i.e., \.{VIRMF} or a production version with another base file
preloaded). It should work unless \MF's primitives have been redefined in
the base file. However, this step isn't essential, since all the code of
\.{VIRMF} appears in \.{INIMF}; you probably won't catch any more errors
this way, unless they would already become obvious from normal use of
the~system.
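
The file comparisons of steps 4, 5, and~6 can be mechanized roughly as follows. This is only a sketch in Python, not a tool that accompanies \MF. The function names, and the idea of a caller-supplied predicate \.{is\_expected} that marks differences covered by exceptions (a) to (g) of step~4, are assumptions made for illustration. The sketch does not try to guess what an exempt line looks like; that judgment stays with the person running the test.

```python
from itertools import zip_longest


def differing_lines(master, candidate):
    """Yield (line_number, master_line, candidate_line) wherever two texts differ.

    Both arguments are lists of lines; a missing line is reported as None.
    """
    for number, (m, c) in enumerate(zip_longest(master, candidate), start=1):
        if m != c:
            yield number, m, c


def unexpected_differences(master, candidate, is_expected=lambda n, m, c: False):
    """Keep only the differences that the caller has not declared acceptable."""
    return [d for d in differing_lines(master, candidate) if not is_expected(*d)]


def read_lines(path):
    """Read a text file as a list of lines without trailing newlines."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read().splitlines()


if __name__ == "__main__":
    # Made-up stand-ins for a master file and a freshly produced file.
    master = ["header 1 JAN", "line two", "line three"]
    mine = ["header 2 FEB", "line two", "line 3"]
    # Hypothetical rule: accept any difference on line 1 as a date difference.
    first_line_only = lambda n, m, c: n == 1
    print(unexpected_differences(master, mine, first_line_only))
```

Steps 5 and~6 list no exceptions, so for \.{TRAP.TYP} and \.{TRAP.PL} the default predicate applies and an empty result is the goal; the paths handed to \.{read\_lines} depend on where the master copies were stored.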

\vfill\eject

\section Appendix B: The \.{TRAP.MF} file.
The contents of the test routine are prefixed here with line numbers, for
ease in comparing this file with the error messages printed later; the
line numbers aren't actually present.
\runninghead{APPENDIX B: \.{TRAP.MF} (CONTINUED)}

\vskip 8pt
\begingroup\count255=0
\everypar{\global\advance\count255 by 1
  \hbox to 20pt{\sevenrm\hfil\the\count255\ \ }}
\verbatim{trap.mf}
\endgroup
\vfill\eject

\section Appendix C: The \.{TRAPIN.LOG} file.
When \.{INIMF} makes the \.{TRAP.BASE} file, it also creates a file called
\.{TRAP.LOG} that looks like this.
\runninghead{APPENDIX C: \.{TRAPIN.LOG} (CONTINUED)}

\vskip8pt
\verbatim{trapin.log}
\vfill\eject

\section Appendix D: The \.{TRAP.LOG} file.
Here is the major output of the \.{TRAP} test; it is generated by running
\.{INIMF} and loading \.{TRAP.BASE}, then reading \.{TRAP.MF}.
\runninghead{APPENDIX D: \.{TRAP.LOG} (CONTINUED)}

{\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
\vskip8pt
\verbatim{trap.log}
}
\vfill\eject

\section Appendix E: The \.{TRAP.TYP} file.
Here is another major component of the test. It shows the output of \.{GFtype}
applied to the file \.{TRAP.72270GF} that is created at the same time
Appendix D was produced.
\runninghead{APPENDIX E: \.{TRAP.TYP} (CONTINUED)}

{\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
\vskip8pt
\verbatim{trap.typ}
}
\vfill\eject

\section Appendix F: The \.{TRAP.PL} file.
In this case we have the output of \.{TFtoPL}
applied to the file \.{TRAP.TFM} that is created at the same time
Appendix D was produced.
\runninghead{APPENDIX F: \.{TRAP.PL} (CONTINUED)}

{\let\tt=\eighttt\leftskip 1in\baselineskip 9pt plus .1pt minus .1pt
\vskip8pt
\verbatim{trap.pl}
}
\vfill\eject

\section Appendix G: The \.{TRAP.FOT} file.
This shows what appeared on the terminal while Appendix D was being produced.
\runninghead{APPENDIX G: \.{TRAP.FOT} (CONTINUED)}

\vskip8pt
\verbatim{trap.fot}

\vfill\end