// Copyright 2022 the u-root Authors. All rights reserved
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
//
// SPDX-License-Identifier: BSD-3-Clause
//

// Package linux loads bzImage-based Linux kernels using the kexec_load
// system call.
//
// Callers may choose a 64bit or 32bit purgatory to use at runtime.
//
// kexec_load is conceptually simple to use: give it some code segments and an
// entry point, then tell the kernel to jump to the entry point.
//
// The kernel's contract on x86_64 is that kexec will jump to the entry point
// in 64bit mode with identity-mapped page tables and unspecified garbage in
// registers.
//
// Theoretically, one could just load the ELF segments of a kernel into
// kexec_load, give it the entry point, and let it run. However, a loaded Linux
// kernel may expect to be started in either 32bit or 64bit mode and expects an
// argument in rsi: a pointer to the Linux boot_params struct.
//
// To drop into the right addressing mode and pass this argument to the kernel,
// some additional code, called a purgatory, runs before the new Linux kernel.
// The sequence is: the old kernel jumps to the purgatory, which jumps to the
// new kernel.
//
// Our purgatory has only these three responsibilities:
//   - drop to the intended addressing mode (32bit or 64bit)
//   - set rsi to point to the boot_params struct
//   - jump to the entry point.
//
// The purgatory has to be supplied by us as part of kexec_load, and set up to
// do the right thing. Two purgatories written in assembly are part of this
// package: one that sets up the Linux args and remains in 64bit mode, and one
// that sets up the Linux args and drops to 32bit mode.
//
// ## History of kexec
//
// The original kexec was designed to work with ELF, and was even hooked
// into the exec system call: starting a new kernel was as easy as typing
//
//     ./vmlinux
//
// i.e., kexec was truly just a variant on exec!
//
// In the earliest kexec, as in all the early "kernel boots kernel"
// implementations (1) (and still in Plan 9 today), the kernel directly loaded
// and started the next kernel. For a number of reasons, kexec introduced the
// concept of a purgatory. The purgatory in principle is both simple and
// elegant: a small bit of code, supplied by user space, that manages the
// transition from one kernel to the next, and vanishes.
//
// The purgatory has a few main responsibilities:
//   o (optionally) copy the new kernel over top of the old kernel
//   o do any special device setup that neither kernel can manage (mainly console)
//   o run a SHA256 over the kernel
//   o communicate arguments to the new kernel (on x86, assemble the Linux
//     params at 0x90000)
//   o be able to return to the caller if things go wrong
//   o run anywhere, because we may be booting a 16-bit kernel
//
// That last item is the one that causes a lot of trouble. In 2000, systems
// with 16 MiB were still common, Linux kernels had to load at 0x100000,
// memtest86 had to load in the low 640K, and finding a place to put the
// purgatory required that it be a position-independent program. Rather than
// being written as such, it was instead compiled as a relocatable ELF, the
// relocation being done at kexec time. That is, kexec includes a link step.
//
// Because processors have changed a lot since 2000, when kexec was first
// written, these old assumptions are worth re-examining.
//
// First, systems that use kexec come with at least 1 GiB of memory nowadays.
// Further, newer kernels always avoid using the low 1 MiB, since buggy BIOSes
// might corrupt memory.
// Finally, nobody cares about booting 16-bit kernels any more; even
// memtest86 runs as a Linux binary. Hence: we can always get space in the low
// 1 MiB for the purgatory, and in fact we can assume that memory is available
// at 0x3000. The low 640K must always be there. That means we can link the
// purgatory to run at a fixed place, since most kexec users load it at a fixed
// place anyway.
//
// Second, with relocatable kernels, the copy function is no longer needed.
//
// Third, we can dispense with ideas of returning. If things are so messed up
// that we cannot kexec, it is likely time to reset the machine. Should we
// desire to implement return later, however, we need not use the messy
// mechanisms in the current purgatories to save registers. If we mark the
// function with a returnstwice attribute, gcc will use caller-save semantics
// for the call, not callee-save, removing any need to worry about saving
// registers.
//
// Hence, we can, should we care, arrange for the purgatory to be able to
// return right up to the point that it drops to 32-bit unpaged mode. Because
// so few operations remain between dropping to 32-bit mode and calling the
// next kernel, we do not feel it is necessary to return past that point.
//
// Fourth, parameter passing is unnecessarily messy in the current purgatories.
// We can rewrite that contract: if we consider the first 8 uint64_t in the
// purgatory, the first can be used for a relative jump around the next 7, and
// those seven quadwords can be used for parameter passing.
//
// These changes should let us:
//   o build the purgatory as a non-relocatable ELF, i.e.
//     a statically linked program with one ELF program segment
//   o link it at 0x3000; the code was putting the current relocatable ELF in a
//     fixed place anyway
//   o use the ELF program header to tell us where to put the purgatory
//   o communicate arguments in the seven quadwords mentioned above
//   o provide several purgatory variants, rather than the one does-it-all
//     purgatory we have today, so we get one suited to the job at hand.
//
// This should result in a dramatically simpler purgatory implementation. Also,
// being much simpler, it can be entirely Go assembly, obviating the need for a
// C compiler. This preserves a desired property of u-root: that it can always
// be built with only the Go toolchain.
//
// (1) "Give your bootstrap the boot: using the operating system to boot the operating system"
// Ron Minnich, 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935)
package linux