github.com/mvdan/u-root-coreutils@v0.0.0-20230122170626-c2eef2898555/pkg/boot/linux/doc.go

// Copyright 2022 the u-root Authors. All rights reserved
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
//
// SPDX-License-Identifier: BSD-3-Clause
//

// Package linux loads bzImage-based Linux kernels using the kexec_load
// system call.
//
// Callers may choose a 64-bit or 32-bit purgatory to use at runtime.
//
// kexec_load is conceptually simple to use: give it some code segments and an
// entry point, then tell the kernel to jump to the entry point.
//
// The kernel's contract on x86_64 is that kexec will jump to the entry point
// in 64-bit mode with identity-mapped page tables and unspecified garbage in
// registers.
//
// Theoretically, one could just load the ELF segments of a kernel into
// kexec_load, give it the entry point, and let it run. However, a loaded Linux
// kernel may expect to be started in either 32-bit or 64-bit mode, and it
// expects an argument in rsi: a pointer to the Linux boot_params struct.
//
// To drop into the right addressing mode and pass this argument to the kernel,
// some additional code, called a purgatory, runs before the new Linux kernel.
// The sequence is: the old kernel jumps to the purgatory, which jumps to the
// new kernel.
//
// Our purgatory has only these three responsibilities:
// - drop to the intended addressing mode (32-bit or 64-bit)
// - set rsi to point to the Linux boot_params struct
// - jump to the entry point
//
// The purgatory has to be supplied by us as part of kexec_load, and set up to
// do the right thing. Two purgatories written in assembly are part of this
// package: one that sets up the Linux args and remains in 64-bit mode, and one
// that sets up the Linux args and drops to 32-bit mode.
//
// ## History of kexec
//
// The original kexec was designed to work with ELF, and was even hooked
// into the exec system call: starting a new kernel was as easy as typing
//
//	./vmlinux
//
// i.e., kexec was truly just a variant on exec!
//
// In the earliest kexec, as in all the early "kernel boots kernel"
// implementations (1) (and still in Plan 9 today), the kernel directly loaded
// and started the next kernel. For a number of reasons, kexec introduced the
// concept of a purgatory. The purgatory in principle is both simple and
// elegant: a small bit of code, supplied by user space, that manages the
// transition from one kernel to the next, and vanishes.
//
// The purgatory has a few main responsibilities:
// o (optionally) copy the new kernel over top of the old kernel
// o do any special device setup that neither kernel can manage (mainly console)
// o run a SHA256 over the kernel
// o communicate arguments to the new kernel (on x86, assemble the Linux params at 0x90000)
// o be able to return to the caller if things go wrong
// o run anywhere, because we may be booting a 16-bit kernel
//
// That last item is the one that causes a lot of trouble. In 2000, systems
// with 16 MiB of memory were still common, Linux kernels had to load at
// 0x100000, memtest86 had to load in the low 640K, and finding a place to put
// the purgatory required that it be a position-independent program. Rather
// than being written as such, it was instead compiled as a relocatable ELF,
// the relocation being done at kexec time. I.e., kexec includes a link step.
//
// Because processors have changed a lot since 2000, when kexec was first
// written, these old assumptions are worth re-examining.
//
// First, systems that use kexec come with at least 1 GiB of memory nowadays.
// Further, newer kernels always avoid using the low 1 MiB, since buggy BIOSes
// might corrupt memory. Finally, nobody cares about booting 16-bit kernels any
// more -- even memtest86 runs as a Linux binary. Hence: we can always get
// space in the low 1 MiB for the purgatory, and in fact we can assume that
// memory is available at 0x3000. The low 640K must always be there. That means
// we can link the purgatory to run at a fixed place -- since most kexec users
// load it at a fixed place anyway.
//
// Second, with relocatable kernels, the copy function is no longer needed.
//
// Third, we can dispense with ideas of returning. If things are so messed up
// that we cannot kexec, it is likely time to reset the machine. Should we
// desire to implement return later, however, we need not use the messy
// mechanisms in the current purgatories to save registers. If we mark the
// function with the returns_twice attribute, gcc will use caller-save semantics
// for the call, not callee-save, removing any need to worry about saving
// registers.
//
// Hence, we can, should we care, arrange for the purgatory to be able to
// return right up to the point that it drops to 32-bit unpaged mode. Because
// so few operations lie between entering 32-bit mode and calling the next
// kernel, we do not feel it is necessary to return past that point.
//
// Fourth, parameter passing is unnecessarily messy in the current purgatories.
// We can rewrite that contract: if we consider the first 8 uint64_t in the
// purgatory, the first can be used for a relative jump around the next 7, and
// those seven quadwords can be used for parameter passing.
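//
// As a minimal sketch of that layout (not the code this package uses), a
// loader could patch the seven parameter quadwords into the purgatory image
// before handing it to kexec_load. The names patchParams and readParam, the
// little-endian encoding (assumed because the target is x86), and the meaning
// given to each slot in main are all assumptions made for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerWords = 8               // one jump slot followed by seven parameter slots
	wordSize    = 8               // bytes per uint64
	paramCount  = headerWords - 1 // quadwords available for parameters
)

// patchParams writes params into quadwords 1..7 of image, leaving
// quadword 0 (the jump around the parameter block) untouched.
func patchParams(image []byte, params [paramCount]uint64) error {
	if len(image) < headerWords*wordSize {
		return errors.New("purgatory image is smaller than its 64-byte header")
	}
	for i, p := range params {
		off := (i + 1) * wordSize
		binary.LittleEndian.PutUint64(image[off:off+wordSize], p)
	}
	return nil
}

// readParam returns parameter i (0-based) from a patched image.
func readParam(image []byte, i int) uint64 {
	off := (i + 1) * wordSize
	return binary.LittleEndian.Uint64(image[off : off+wordSize])
}

func main() {
	image := make([]byte, 4096) // stand-in for the purgatory bytes

	// Hypothetical slot assignment: slot 0 holds the kernel entry point,
	// slot 1 holds the address of boot_params. The rest stay zero.
	params := [paramCount]uint64{0x1000000, 0x90000}
	if err := patchParams(image, params); err != nil {
		panic(err)
	}
	fmt.Printf("entry=%#x params=%#x\n", readParam(image, 0), readParam(image, 1))
}
```

// Quadword 0 is never written, so the relative jump assembled into the
// purgatory survives patching.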
//
// These changes should let us:
// o build the purgatory as a non-relocatable ELF, i.e. a statically linked
//   program with one ELF program (segment)
// o link it at 0x3000; the code was putting the current relocatable ELF in a
//   fixed place anyway
// o use the ELF program header to tell us where to put the purgatory
// o communicate arguments in the seven quadwords mentioned above
// o provide several purgatory variants, rather than the one does-it-all
//   purgatory we have today, so we get one suited to the job at hand
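//
// The "use the ELF program header" step might look like the sketch below. It
// assumes the purgatory has exactly one PT_LOAD entry and that the physical
// address (Paddr) is the one to honor; loadAddr and exampleProgs are
// hypothetical names, and the sample header values are made up.

```go
package main

import (
	"debug/elf"
	"errors"
	"fmt"
)

// loadAddr returns the physical load address of the only PT_LOAD entry.
// It rejects inputs with zero or multiple loadable segments.
func loadAddr(progs []elf.ProgHeader) (uint64, error) {
	var addr uint64
	found := false
	for _, p := range progs {
		if p.Type != elf.PT_LOAD {
			continue
		}
		if found {
			return 0, errors.New("purgatory has more than one PT_LOAD segment")
		}
		addr, found = p.Paddr, true
	}
	if !found {
		return 0, errors.New("purgatory has no PT_LOAD segment")
	}
	return addr, nil
}

// exampleProgs mimics the program headers of a purgatory linked at 0x3000.
// The values are illustrative only.
func exampleProgs() []elf.ProgHeader {
	return []elf.ProgHeader{
		{Type: elf.PT_NOTE},
		{
			Type:   elf.PT_LOAD,
			Flags:  elf.PF_R | elf.PF_X,
			Vaddr:  0x3000,
			Paddr:  0x3000,
			Filesz: 0x200,
			Memsz:  0x200,
		},
	}
}

func main() {
	addr, err := loadAddr(exampleProgs())
	if err != nil {
		panic(err)
	}
	fmt.Printf("place purgatory at %#x\n", addr)
}
```

// With a real file, the headers would come from the Progs field of the
// *elf.File returned by elf.Open or elf.NewFile; each *elf.Prog embeds a
// ProgHeader.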
//
// This should result in a dramatically simpler purgatory implementation. Also,
// being much simpler, it can be entirely Go assembly, obviating the need for a
// C compiler. This preserves a desired property of u-root: that it can always
// be built with only the Go toolchain.
//
// (1) "Give your bootstrap the boot: using the operating system to boot the operating system"
// Ron Minnich, 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935)
package linux