---
title: 'Using profile block-io'
weight: 20
description: >
  Analyze block I/O performance through a latency distribution.
---

The profile block-io gadget gathers information about the usage of the
block device I/O (disk I/O) and generates a histogram of the I/O latency
distribution when the gadget is stopped.

Notice that the latency of the disk I/O is measured from when the call is
issued to the device until its completion; it does not include the time spent
in the kernel queue. This means that the histogram reflects only the
performance of the device and not the effective latency suffered by the
applications.

The histogram shows the number of I/O operations (`count` column) that lie in
the latency range `interval-start` -> `interval-end` (`usecs` column), which,
as the column name indicates, is given in microseconds.

For this guide, we will use [the `stress` tool](https://man.archlinux.org/man/stress.1), which allows
us to load and stress the system in many different ways. In particular, we will use the `--io` flag,
which spawns a given number of workers spinning on the [sync()
syscall](https://man7.org/linux/man-pages/man2/sync.2.html). In this way, we will generate disk I/O
that we will analyse using the biolatency gadget.

For further details, please refer to [the BCC
documentation](https://github.com/iovisor/bcc/blob/master/tools/biolatency_example.txt).

### On Kubernetes

First, let's use the profile block-io gadget to see the I/O latency on our
testing node under its normal workload:

```bash
# Run the gadget on the worker-node node
$ kubectl gadget profile block-io --node worker-node
Tracing block device I/O... Hit Ctrl-C to end

# Wait for around 1 minute and hit Ctrl+C to stop the gadget and see the results
^C

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 17       |*                                       |
        64 -> 127        : 261      |*******************                     |
       128 -> 255        : 546      |****************************************|
       256 -> 511        : 426      |*******************************         |
       512 -> 1023       : 227      |****************                        |
      1024 -> 2047       : 18       |*                                       |
      2048 -> 4095       : 8        |                                        |
      4096 -> 8191       : 23       |*                                       |
      8192 -> 16383      : 15       |*                                       |
     16384 -> 32767      : 2        |                                        |
     32768 -> 65535      : 1        |                                        |
```

This output shows that the bulk of the I/O was between 64 and 1023 us, and
that there were 1544 I/O operations during the time the gadget was running.
Notice that we waited for 1 minute, but a longer run would produce more
stable results.
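
If you want to double-check that total, the `count` column can be summed with
standard shell tools. The snippet below is only a convenience sketch, not part
of the gadget: it assumes you redirect the gadget output to a file
(hypothetically named `biolatency.txt` here) before hitting Ctrl-C.

```bash
# Capture the gadget output in a file (hypothetical name), wait, then hit Ctrl-C
$ kubectl gadget profile block-io --node worker-node > biolatency.txt
^C

# Histogram rows are the ones containing "->"; the count is the 5th field
$ awk '$2 == "->" { total += $5 } END { print total, "I/O operations" }' biolatency.txt
1544 I/O operations
```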

Now, let's increase the I/O operations using the stress tool:

```bash
# Start by creating our testing namespace
$ kubectl create ns test-biolatency

# Run stress with 1 worker that will generate I/O operations
$ kubectl run --restart=Never --image=polinux/stress stress-io -n test-biolatency -- stress --io 1
$ kubectl wait --timeout=-1s -n test-biolatency --for=condition=ready pod/stress-io
pod/stress-io condition met
$ kubectl get pod -n test-biolatency -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE          NOMINATED NODE   READINESS GATES
stress-io   1/1     Running   0          2s    10.244.1.7   worker-node   <none>           <none>
```

Using the profile block-io gadget, we can generate another histogram to analyse the
disk I/O with this load:

```bash
# Run the gadget again
$ kubectl gadget profile block-io --node worker-node
Tracing block device I/O... Hit Ctrl-C to end

# Wait again for 1 minute and hit Ctrl+C to stop the gadget and see the results
^C

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 411      |                                        |
        32 -> 63         : 310822   |****************************************|
        64 -> 127        : 293404   |*************************************   |
       128 -> 255        : 194881   |*************************               |
       256 -> 511        : 96520    |************                            |
       512 -> 1023       : 33756    |****                                    |
      1024 -> 2047       : 4414     |                                        |
      2048 -> 4095       : 1007     |                                        |
      4096 -> 8191       : 1025     |                                        |
      8192 -> 16383      : 176      |                                        |
     16384 -> 32767      : 13       |                                        |
     32768 -> 65535      : 7        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 1        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 1        |                                        |

# Remove load
$ kubectl delete pod/stress-io -n test-biolatency
```

The new histogram shows how the number of I/O operations increased
significantly, rising from 1544 (normal load) to 936438 (stressing the I/O).
On the other hand, even though this histogram shows that the bulk of the I/O
still took less than 1023 us, we can observe that several I/O operations
suffered a high latency due to the load, one of them even taking more than 1
second.

Delete the test namespace:
```bash
$ kubectl delete ns test-biolatency
namespace "test-biolatency" deleted
```

### With `ig`

* Generate some I/O load:

```bash
$ docker run -d --rm --name stresstest polinux/stress stress --io 10
```

* Start `ig`:

```bash
$ sudo ./ig profile block-io
```

* Observe the results:

```bash
$ sudo ./ig profile block-io
Tracing block device I/O... Hit Ctrl-C to end.^C
     usecs               : count     distribution
         1 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 113      |                                        |
        64 -> 127        : 7169     |****************************************|
       128 -> 255        : 3724     |********************                    |
       256 -> 511        : 2198     |************                            |
       512 -> 1023       : 712      |***                                     |
      1024 -> 2047       : 203      |*                                       |
      2048 -> 4095       : 23       |                                        |
      4096 -> 8191       : 7        |                                        |
      8192 -> 16383      : 3        |                                        |
```

* Remove the docker container:

```bash
$ docker stop stresstest
```
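
As a final note, the `usecs` intervals in all of the histograms above are
power-of-two buckets: each row covers a range twice as wide as the previous
one. The following standalone shell sketch, independent of Inspektor Gadget,
prints the bucket that a given latency (in microseconds) would fall into:

```bash
# Find the power-of-two bucket for a latency expressed in microseconds
$ latency_us=300
$ lower=1; while [ $((lower * 2)) -le "$latency_us" ]; do lower=$((lower * 2)); done
$ echo "${lower} -> $((lower * 2 - 1))"
256 -> 511
```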