For reference, whoever's changing base/file code may occasionally run this benchmark and
update the snapshot below. It can be useful for showing code reviewers the result of a change,
or just so readers can get a sense of performance without running the benchmarks themselves.

Of course, since we're not totally controlling the environment or the data, be careful to
set a baseline before evaluating your change.

Some context for the numbers below:
  * S3 performance guidelines [1] suggest that each request should result in around 85–90 MB/s
    of read throughput. Our numbers are MiB/s (it's unclear whether they mean M = 1000^2 or
    Mi = 1024^2), and it looks like our sequential reads ramp up to the right vicinity.
  * EC2 documentation [2] offers network performance expectations (Gbps):
      * m5.x:      1.25 (base) - 10 (burst)
      * m5.4x:     5           - 10
      * m5.12x:   12
      * m5.24x:   25
      * m5n.24x: 100
    Note that 1000 in the tables below is in MiB/s, which is 8*1.024^2 ~= 8.4 Gbps (spelled out
    below).

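For clarity, the unit conversion in that note, as a throwaway Go snippet (illustration only, not
part of the benchmark):

    package main

    import "fmt"

    func main() {
        // 1000 MiB/s -> bytes/s -> bits/s -> decimal gigabits/s.
        const mibPerSec = 1000.0
        gbps := mibPerSec * 1024 * 1024 * 8 / 1e9
        fmt.Printf("%v MiB/s ~= %.2f Gbps\n", mibPerSec, gbps)
    }
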
[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance-design-patterns.html#optimizing-performance-parallelization
[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/general-purpose-instances.html#general-purpose-network-performance

Very brief summary and speculations about our current S3 performance for non-parallel(!) clients:
  * Small sequential reads (a few MiB at a time) could be improved in the current io.Reader model
    by adding (read-ahead) buffering, i.e. speculatively fetching data beyond the current read
    (see the sketch after this list).
  * Large reads (GiBs) with large buffers (100s of MiB) can't get much better on small instance
    types, because we're already close to the documented network performance.
  * Large reads with large buffers can probably get better on large, network-optimized (*n)
    instances, where we're not especially close to network limits yet. Maybe this requires more
    careful CPU and allocation usage.
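
A very rough sketch of the buffering idea, assuming base/file's file.Open / File.Reader API; the
bucket path and the 8 MiB buffer size are made up, and bufio only buffers synchronously (true
read-ahead would prefetch the next chunk before the client asks for it):

    package main

    import (
        "bufio"
        "context"
        "fmt"
        "io"
        "log"

        "github.com/grailbio/base/file"
    )

    func main() {
        ctx := context.Background()
        f, err := file.Open(ctx, "s3://some-bucket/some-object") // hypothetical object
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close(ctx)

        // Many small client reads are served from an 8 MiB buffer that is refilled
        // with one larger underlying S3 request.
        r := bufio.NewReaderSize(f.Reader(ctx), 8<<20)

        buf := make([]byte, 64<<10) // the client reads 64 KiB at a time
        var total int64
        for {
            n, err := r.Read(buf)
            total += int64(n)
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatal(err)
            }
        }
        fmt.Println("bytes read:", total)
    }
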
Parallel clients might matter more:
  * Clients reading many small files in parallel (like machine learning training reading V3
    fragments file columns) may already do chunked reads pretty effectively. We should measure to
    verify, though; maybe the chunks are not right-sized.
  * Clients reading very large files may benefit even more from a parallel-friendly API (ReaderAt;
    see the sketch after this list):
      * Copies to disk could be accelerated without huge memory buffers.
        Examples: `grail-file cp`, biofs/gfilefs, reflow interning, bigslice shuffling.
      * Copies to memory could be accelerated without duplicated memory buffers.
        Examples: bio/align/mapper's index (if it were Go), FASTA.
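
A rough sketch of what a parallel-friendly read could look like given an io.ReaderAt (illustration
only: today's file API doesn't expose ReaderAt, which is the point of the bullet above, and the
local path, 8 MiB chunk size, and parallelism of 32 are all made up):

    package main

    import (
        "fmt"
        "io"
        "log"
        "os"
        "sync"
    )

    // readAllParallel reads [0, size) from r in fixed-size chunks, with up to
    // `parallelism` concurrent ReadAt calls writing into disjoint slices of a
    // single destination buffer (no per-goroutine copies).
    func readAllParallel(r io.ReaderAt, size, chunk int64, parallelism int) ([]byte, error) {
        dst := make([]byte, size)
        sem := make(chan struct{}, parallelism)
        var (
            wg       sync.WaitGroup
            mu       sync.Mutex
            firstErr error
        )
        for off := int64(0); off < size; off += chunk {
            end := off + chunk
            if end > size {
                end = size
            }
            wg.Add(1)
            sem <- struct{}{}
            go func(off, end int64) {
                defer wg.Done()
                defer func() { <-sem }()
                if _, err := r.ReadAt(dst[off:end], off); err != nil && err != io.EOF {
                    mu.Lock()
                    if firstErr == nil {
                        firstErr = err
                    }
                    mu.Unlock()
                }
            }(off, end)
        }
        wg.Wait()
        return dst, firstErr
    }

    func main() {
        f, err := os.Open("/tmp/s3/some-object") // stand-in for a ReaderAt-capable handle
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
        info, err := f.Stat()
        if err != nil {
            log.Fatal(err)
        }
        data, err := readAllParallel(f, info.Size(), 8<<20, 32)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("bytes read:", len(data))
    }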

Results:
Rows are read chunk sizes (MiB); columns are the number of sequential chunks read.
Data cells are average MiB/s over the entire sequential read. For example, in the m5.4xlarge table
below, row 8, column 64 under s3:// means that 64 sequential 8 MiB reads averaged 41 MiB/s.

Note: During development, the results seemed worse on some days, peaking around 75 MiB/s rather
than 95 MiB/s. The slowdown was consistent across several runs within the same day. It's unclear
whether the cause was something local to the benchmarking machines (or our builds) or the S3
service.

The numbers below were generated using the bigmachine runner in us-west-2, reading a GRAIL-internal
113 GiB file residing in an S3 bucket in us-west-2. The first group of columns (s3://) reads
directly from S3; the second group (/tmp/s3) reads through FUSE.

A couple of the FUSE numbers seem surprisingly high. I suspect parts of those reads were served by
the page cache because they happened to overlap with earlier benchmark tasks. We could try to
confirm this and clear the page cache between runs (e.g. by writing to /proc/sys/vm/drop_caches on
Linux); for now the snapshot is mainly useful for flagging issues that cause widespread slowness.

[0] m5.4xlarge
        s3://                                           /tmp/s3
        1     8     64    512   p1    p8    p64   p512  1     8     64    512   p1    p8    p64   p512
0       0     0     0     3     0     0     0     1     0     0     0     2     0     0     0     2
1       4     24    41    46    5     34    198   142   5     26    41    74    4     30    167   417
8       22    37    41    44    21    140   620   942   23    44    65          22    112   756
16      35    46    49          35    203   867         34    48                28    182
32      55    50    65          51    317   999         40    51                43    267
128     177   245               217   960               51                      53
512     728   415               447   1075              48                      38
1024    855                     839
4096    1077                    1025

[1] m5.12xlarge
        s3://                                           /tmp/s3
        1     8     64    512   p1    p8    p64   p512  1     8     64    512   p1    p8    p64   p512
0       0     0     0     2     0     0     0     1     0     0     0     2     0     0     0     2
1       5     27    45    59    4     34    213   648   5     26    50    71    4     34    213   537
8       29    47    41    50    27    142   823   1209  29    45    83          29    152   732
16      37    53    64          32    165   822         31    48                28    230
32      74    64    84          65    346   1258        45    57                50    236
128     231   181               202   854               55                      46
512     360   615               541   1297              52                      57
1024    1000                    1076
4096    1297                    1280

[2] m5.24xlarge
        s3://                                           /tmp/s3
        1     8     64    512   p1    p8    p64   p512  1     8     64    512   p1    p8    p64   p512
0       0     0     0     2     0     0     0     1     0     0     0     2     0     0     0     2
1       5     26    46    52    5     37    188   492   3     30    50    69    4     30    170   661
8       31    46    52    50    28    169   897   2119  25    50    62          27    158   811
16      41    54    54          37    166   1365        36    50                39    208
32      66    83    29          55    279   1873        42    69                44    282
128     168   199               182   1224              54                      52
512     555   643               495   2448              59                      55
1024    789                     907
4096    2395                    2410

[3] m5n.24xlarge
        s3://                                           /tmp/s3
        1     8     64    512   p1    p8    p64   p512  1     8     64    512   p1    p8    p64   p512
0       0     0     0     2     0     0     0     1     0     0     0     2     0     0     0     1
1       4     28    53    55    5     32    214   954   4     28    50    52    4     31    188   849
8       24    44    60    55    24    165   865   2811  25    42    43          26    144   788
16      38    55    62          43    181   992         38    52                39    202
32      55    80    64          60    314   2407        42    59                48    283
128     171   179               190   1005              56                      51
512     462   549               469   4068              56                      70
1024    1343                    821
4096    2921                    3010