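flagstat
========

This example replicates the output of the [samtools](https://samtools.github.io) `flagstat` command.
Core-for-core the C implementation outperforms the Go implementation: in the timings below, samtools is roughly 3.3 to 3.8 times faster in wall-clock time at each matching thread setting.
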
On an example BAM file the output of samtools (1.3.2-199-gec1d68e/htslib 1.3.2-199-gec1d68e) is:
```
$ time samtools flagstat 9827_2#49.bam
56463236 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
269248 + 0 duplicates
55357963 + 0 mapped (98.04% : N/A)
56463236 + 0 paired in sequencing
28231618 + 0 read1
28231618 + 0 read2
54363468 + 0 properly paired (96.28% : N/A)
55062652 + 0 with itself and mate mapped
295311 + 0 singletons (0.52% : N/A)
360264 + 0 with mate mapped to a different chr
300699 + 0 with mate mapped to a different chr (mapQ>=5)

real	1m31.517s
user	1m30.268s
sys	0m1.180s
```

The following options give the same flagstat output in less wall-clock time, at the cost of more total CPU (`user`) time.

`--input-fmt-option nthreads=2`
```
real	0m46.057s
user	1m49.684s
sys	0m4.432s
```

`--input-fmt-option nthreads=4`
```
real	0m26.816s
user	1m55.148s
sys	0m3.856s
```

`--input-fmt-option nthreads=8`
```
real	0m23.006s
user	2m10.352s
sys	0m5.648s
```

The output of this example (built with Go 1.8) on the same file is:
```
$ go build github.com/biogo/hts/paper/examples/flagstat
$ export GOMAXPROCS=1
$ time ./flagstat 9827_2#49.bam
56463236 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 in total secondary
0 + 0 in total supplementary
269248 + 0 duplicates
55357963 + 0 mapped (98.04% : N/A)
56463236 + 0 paired in sequencing
28231618 + 0 read1
28231618 + 0 read2
54363468 + 0 properly paired (96.28% : N/A)
55062652 + 0 with itself and mate mapped
295311 + 0 singletons (0.52% : N/A)
360264 + 0 with mate mapped to a different chr
300699 + 0 with mate mapped to a different chr (mapQ >= 5)

real	5m2.323s
user	5m0.312s
sys	0m2.148s
```

The following `GOMAXPROCS` settings give the same flagstat output in less wall-clock time, at the cost of more total CPU (`user`) time.

`GOMAXPROCS=2`
```
real	2m41.310s
user	5m18.948s
sys	0m2.600s
```

`GOMAXPROCS=4`
```
real	1m40.957s
user	6m21.232s
sys	0m3.688s
```

`GOMAXPROCS=8`
```
real	1m28.465s
user	9m7.480s
sys	0m8.056s
```

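To compare the two sets of timings, the wall-clock (`real`) values can be reduced to ratios. The sketch below is one way to do that arithmetic; it is not part of the example. The `run` type and the `seconds` helper are hypothetical names, and pairing samtools `nthreads=N` with `GOMAXPROCS=N` is an assumption made for the comparison, since the two settings do not necessarily control the same kind of parallelism. The durations are copied verbatim from the timings in this README.

```go
package main

import (
	"fmt"
	"time"
)

// run (hypothetical type) pairs a thread setting with the "real" time
// reported for it, in the form accepted by time.ParseDuration.
type run struct {
	setting int
	real    string
}

// Wall-clock times copied from the timing output in this README.
var samtools = []run{{1, "1m31.517s"}, {2, "0m46.057s"}, {4, "0m26.816s"}, {8, "0m23.006s"}}
var goRuns = []run{{1, "5m2.323s"}, {2, "2m41.310s"}, {4, "1m40.957s"}, {8, "1m28.465s"}}

// seconds converts a duration string such as "1m31.517s" to seconds.
func seconds(s string) float64 {
	d, err := time.ParseDuration(s)
	if err != nil {
		panic(err)
	}
	return d.Seconds()
}

func main() {
	cBase := seconds(samtools[0].real)
	goBase := seconds(goRuns[0].real)
	for i := range samtools {
		c := seconds(samtools[i].real)
		g := seconds(goRuns[i].real)
		// go/samtools: how many times longer the Go run took.
		// speedup: single-thread time divided by the time at this setting.
		fmt.Printf("setting=%d  go/samtools=%.2f  samtools speedup=%.2f  go speedup=%.2f\n",
			samtools[i].setting, g/c, cBase/c, goBase/g)
	}
}
```
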
The file used in the benchmark was `9827_2#49.bam`, available from ftp://ftp.sra.ebi.ac.uk/vol1/ERA242/ERA242167/bam/9827_2%2349.bam