![bíogo](https://raw.githubusercontent.com/biogo/biogo/master/biogo.png)

# bíogo

[![GoDoc](https://godoc.org/github.com/biogo/biogo?status.png)](http://godoc.org/github.com/biogo/biogo)
[![Build Status](https://travis-ci.org/biogo/biogo.svg?branch=master)](https://travis-ci.org/biogo/biogo)

## Installation

        $ go get github.com/biogo/biogo/...

## Overview

bíogo is a bioinformatics library for the Go language.

## Getting help

Please direct requests for help and similar questions to the biogo-user Google Group.

https://groups.google.com/forum/#!forum/biogo-user

## Contributing

If you find any bugs, feel free to file an issue on the GitHub issue tracker.
Pull requests are welcome, though if they involve changes to the API or the addition of features, please first open a discussion at the biogo-dev Google Group.

https://groups.google.com/forum/#!forum/biogo-dev

## Citing

If you use bíogo, please cite Kortschak, Snyder, Maragkakis and Adelson, "bíogo: a simple high-performance bioinformatics toolkit for the Go language", doi:[10.21105/joss.00167](http://dx.doi.org/10.21105/joss.00167), and Kortschak and Adelson, "bíogo: a simple high-performance bioinformatics toolkit for the Go language", doi:[10.1101/005033](http://biorxiv.org/content/early/2014/05/12/005033).

## The Purpose of bíogo

bíogo stems from the need to address the size and structure of modern genomic
and metagenomic data sets. These properties impose requirements on the
libraries and languages used for analysis:

* speed, because the data sets are large
* concurrency, because the problems are often embarrassingly parallelisable

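As a rough illustration of the concurrency point, many per-sequence computations are independent of one another and can be spread across goroutines. The sketch below uses only the Go standard library, not the bíogo API; the names `gcFraction` and `gcAll` are hypothetical and chosen for this example only.

```go
package main

import (
	"fmt"
	"sync"
)

// gcFraction returns the fraction of G and C bases in seq.
// It is a hypothetical helper, not part of bíogo.
func gcFraction(seq string) float64 {
	if len(seq) == 0 {
		return 0
	}
	n := 0
	for i := 0; i < len(seq); i++ {
		switch seq[i] {
		case 'G', 'C', 'g', 'c':
			n++
		}
	}
	return float64(n) / float64(len(seq))
}

// gcAll computes gcFraction for every sequence concurrently.
// Each goroutine writes only to its own slot in out, so no locking is
// needed. One goroutine per sequence keeps the sketch short; a real
// pipeline would more likely use a bounded pool of workers.
func gcAll(seqs []string) []float64 {
	out := make([]float64, len(seqs))
	var wg sync.WaitGroup
	for i, s := range seqs {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			out[i] = gcFraction(s)
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(gcAll([]string{"ACGT", "GGCC", "ATAT"}))
}
```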
In addition to the computational burden of massive data sets in modern
genomics, there is an increasing need for complex pipelines to resolve questions
in a tightening problem space, and a growing need to develop new algorithms
that allow novel approaches to interesting questions. These issues suggest the
need for simplicity in syntax to facilitate:

* ease of coding
* checking for correctness in development and particularly in peer review

Related to the second issue is the [reluctance of some researchers to release
code because of quality
concerns](http://www.nature.com/news/2010/101013/full/467753a.html "Publish your computer code: it is good enough. Nature 2010.").

The issue of code release is the first of the principles formalised in the
[Science Code Manifesto](http://sciencecodemanifesto.org/).

    Code  All source code written specifically to process data for a published
          paper must be available to the reviewers and readers of the paper.

A language with a simple, yet expressive, syntax should facilitate development
of higher quality code and thus help reduce this barrier to research code
release.

## Articles

[bíogo: a simple high-performance bioinformatics toolkit for the Go language](http://biorxiv.org/content/early/2014/05/12/005033)

[Analysis of Illumina sequencing data using bíogo](http://talks.godoc.org/github.com/biogo/talks/illumination/illumina.article)

[Using and extending types in bíogo](http://talks.godoc.org/github.com/biogo/talks/types/types.article)

## Yet Another Bioinformatics Library

It seems that nearly every language has its own bioinformatics library, some of
which are very mature, for example [BioPerl](http://bioperl.org) and
[BioPython](http://biopython.org). Why add another one?

The different libraries excel in different areas: acting as scripting glue for
applications in a pipeline (much of [[1], [2], [3]]), interacting with external
hosts [[1], [2], [4], [5]], wrapping lower-level high-performance languages in
more user-friendly syntax [[1], [2], [3], [4]], or providing bioinformatics
functions for high-performance languages [[5], [6]].

The intended niche for bíogo lies somewhere between the scripting libraries and
the high-performance language libraries: it aims to be easy to use for both
small and large projects while offering reasonable performance on
computationally intensive tasks.

The intent is to reduce the level of investment required to develop new
research software for computationally intensive tasks.

[1]: http://bioperl.org/ "BioPerl"
[2]: http://biopython.org/ "BioPython"
[3]: http://bioruby.org/ "BioRuby"
[4]: http://pycogent.sourceforge.net/ "PyCogent"
[5]: http://biojava.org/ "BioJava"
[6]: http://www.seqan.de/ "SeqAn"

1. BioPerl
    http://genome.cshlp.org/content/12/10/1611.full
    http://www.springerlink.com/content/pp72033m171568p2

2. BioPython
    http://bioinformatics.oxfordjournals.org/content/25/11/1422

3. BioRuby
    http://bioinformatics.oxfordjournals.org/content/26/20/2617

4. PyCogent
    http://genomebiology.com/2007/8/8/R171

5. BioJava
    http://bioinformatics.oxfordjournals.org/content/24/18/2096

6. SeqAn
    http://www.biomedcentral.com/1471-2105/9/11

## Library Structure and Coding Style

The bíogo library structure is influenced by the Go core libraries.

The coding style should follow normal Go idioms, as represented in the Go core
libraries.

## Quality Scores

Quality scores are supported for all sequence types, including protein. Both
Phred and Solexa scores can be read from files; however, quality scores are
represented internally as Phred scores, so converting from Solexa loses
precision. A Solexa quality score type is provided for use where this would be
a problem.

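To make the precision loss concrete, here is a minimal sketch of the arithmetic, assuming the usual definitions of the two scales: Phred Q = -10·log10(p) and Solexa Q = -10·log10(p/(1-p)), where p is the error probability. It uses only the Go standard library and is not the bíogo implementation; the function names `SolexaToPhred`, `PhredToSolexa` and `RoundTrip` are hypothetical, and rounding the stored score to an integer is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// SolexaToPhred converts a Solexa-scaled score to the Phred scale.
// From the two definitions: Qphred = 10*log10(10^(Qsolexa/10) + 1).
func SolexaToPhred(q float64) float64 {
	return 10 * math.Log10(math.Pow(10, q/10)+1)
}

// PhredToSolexa is the inverse conversion:
// Qsolexa = 10*log10(10^(Qphred/10) - 1).
// It has no finite value for a Phred score of 0.
func PhredToSolexa(q float64) float64 {
	return 10 * math.Log10(math.Pow(10, q/10)-1)
}

// RoundTrip converts an integer Solexa score to Phred, rounds it to an
// integer (assumed storage precision), converts back and rounds again.
// It is only meaningful when the rounded Phred score is at least 1.
func RoundTrip(solexa int) int {
	stored := math.Round(SolexaToPhred(float64(solexa)))
	return int(math.Round(PhredToSolexa(stored)))
}

func main() {
	for _, s := range []int{-5, -2, 0, 5, 20} {
		fmt.Printf("solexa %3d -> phred %.4f -> solexa %3d after rounding\n",
			s, SolexaToPhred(float64(s)), RoundTrip(s))
	}
}
```

The two scales diverge most at low quality, which is where rounding to an integer Phred value is most likely to change the Solexa score recovered afterwards.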
## Copyright and License

Copyright ©2011-2013 The bíogo Authors except where otherwise noted. All rights
reserved. Use of this source code is governed by a BSD-style license that can be
found in the LICENSE file.

The bíogo logo is derived from Bitstream Charter, Copyright ©1989-1992
Bitstream Inc., Cambridge, MA.

BITSTREAM CHARTER is a registered trademark of Bitstream Inc.