[![Build Status](https://travis-ci.org/client9/misspell.svg?branch=master)](https://travis-ci.org/client9/misspell) [![Go Report Card](https://goreportcard.com/badge/github.com/client9/misspell)](https://goreportcard.com/report/github.com/client9/misspell) [![GoDoc](https://godoc.org/github.com/client9/misspell?status.svg)](https://godoc.org/github.com/client9/misspell) [![Coverage](http://gocover.io/_badge/github.com/client9/misspell)](http://gocover.io/github.com/client9/misspell) [![license](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](https://raw.githubusercontent.com/client9/misspell/master/LICENSE)

Correct commonly misspelled English words... quickly.

### Install with `go get -u github.com/client9/misspell/cmd/misspell`

```bash
$ misspell all.html your.txt important.md files.go
your.txt:42:10 found "langauge" a misspelling of "language"

# ^ file, line, column
```

You'll need [golang 1.5 or newer](https://golang.org/) installed to compile
it.  After that, it's a standalone binary.

If you want pre-compiled binaries, please [file a
ticket](https://github.com/client9/misspell/issues).

## FAQ

* [Automatic Corrections](#correct)
* [Converting UK spellings to US](#locale)
* [Using pipes and stdin](#stdin)
* [Golang special support](#golang)
* [gometalinter support](#gometalinter)
* [CSV Output](#csv)
* [Using SQLite3](#sqlite)
* [Changing output format](#output)
* [Checking a folder recursively](#recursive)
* [Performance](#performance)
* [Known Issues](#issues)
* [Debugging](#debug)
* [False Negatives and missing words](#missing)
* [Origin of Word Lists](#words)
* [Software License](#license)
* [Problem statement](#problem)
* [Other spelling correctors](#others)
* [Other ideas](#otherideas)

<a name="correct"></a>
### How can I make the corrections automatically?

Just add the `-w` flag!

```
$ misspell -w all.html your.txt important.md files.go
your.txt:9:21:corrected "langauge" to "language"

# ^booyah
```

<a name="locale"></a>
### How do I convert British spellings to American (or vice-versa)?

To convert British spellings to American, add the `-locale US` flag!

```bash
$ misspell -locale US important.txt
important.txt:10:20 found "colour" a misspelling of "color"
```

To go the other way, add the `-locale UK` flag!

```bash
$ echo "My favorite color is blue" | misspell -locale UK
stdin:1:3:found "favorite color" a misspelling of "favourite colour"
```

Help is appreciated as I'm neither British nor an
expert in the English language.

<a name="recursive"></a>
### How do you check an entire folder recursively?

Just list the directories you'd like to check:

```bash
misspell .
misspell aDirectory anotherDirectory aFile
```

You can also run misspell recursively using the following shell tricks:

```bash
misspell directory/**/*
```

or

```bash
find . -type f | xargs misspell
```

<a name="stdin"></a>
### Can I use pipes or `stdin` for input?

Yes!

Print messages to `stderr` only:

```bash
$ echo "zeebra" | misspell
stdin:1:0:found "zeebra" a misspelling of "zebra"
```

Print messages to `stderr`, and corrected text to `stdout`:

```bash
$ echo "zeebra" | misspell -w
stdin:1:0:corrected "zeebra" to "zebra"
zebra
```

Only print the corrected text to `stdout`:

```bash
$ echo "zeebra" | misspell -w -q
zebra
```

<a name="golang"></a>
### Are there special rules for golang source files?

Yes!  If the file ends in `.go`, then misspell will only check spelling in
comments.

If you want to force a file to be checked as a golang source, use `-source=go`
on the command line.  Conversely, you can check a golang source as if it were
pure text by using `-source=text`.  You might want to do this since many
variable names have misspellings in them!
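
To make the "comments only" idea concrete, here is a minimal sketch that pulls just the comments out of a Go file using the standard `go/scanner` package. This is an illustration under stated assumptions, not misspell's implementation (its own comment detection may well work differently); the `comments` helper is hypothetical.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// comments returns the text of every comment in a Go source file and skips
// identifiers, string literals and all other tokens.
func comments(src []byte) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("src.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, src, nil, scanner.ScanComments) // nil: ignore scan errors

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok == token.COMMENT {
			out = append(out, lit)
		}
	}
	return out
}

func main() {
	// The variable name is misspelled too, but only the comment is returned.
	src := []byte("package p\n\n// a langauge helper\nvar langauge = 1\n")
	for _, c := range comments(src) {
		fmt.Println(c) // prints "// a langauge helper"
	}
}
```

A checker that looks only at the returned strings would flag the comment and leave the identifier `langauge` alone, which is the behaviour described above for `.go` files.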

### Can I check only comments in other programming languages?

I'm told that using `-source=go` works well for ruby, javascript, java, c and
c++.

It doesn't work well for python and bash.

<a name="gometalinter"></a>
### Does this work with gometalinter?

[gometalinter](https://github.com/alecthomas/gometalinter) runs
multiple golang linters.  Starting on [2016-06-12](https://github.com/alecthomas/gometalinter/pull/134),
gometalinter supports `misspell` natively, but it is disabled by default.

```bash
# update your copy of gometalinter
go get -u github.com/alecthomas/gometalinter

# install updates and misspell
gometalinter --install --update
```

To use it, just enable `misspell`:

```
gometalinter --enable misspell ./...
```

Note that gometalinter only checks golang files, and uses the default options
of `misspell`.

You may wish to run this on your plaintext (.txt) and/or markdown files too.

<a name="csv"></a>
### How can I get CSV output?

Using `-f csv`, the output is standard comma-separated values with headers in the first row.

```
$ misspell -f csv *
file,line,column,typo,corrected
"README.md",9,22,langauge,language
"README.md",47,25,langauge,language
```

<a name="sqlite"></a>
### How can I export to SQLite3?

Using `-f sqlite`, the output is a [sqlite3](https://www.sqlite.org/index.html) dump-file.

```bash
$ misspell -f sqlite * > /tmp/misspell.sql
$ cat /tmp/misspell.sql

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE misspell(
  "file" TEXT,
  "line" INTEGER,
  "column" INTEGER,
  "typo" TEXT,
  "corrected" TEXT
);
INSERT INTO misspell VALUES("install.txt",202,31,"immediatly","immediately");
# etc...
COMMIT;
```

```bash
$ sqlite3 -init /tmp/misspell.sql :memory: 'select count(*) from misspell'
1
```

With some tricks you can directly pipe output to sqlite3 by using `-init /dev/stdin`:

```
misspell -f sqlite * | sqlite3 -init /dev/stdin -column -cmd '.width 60 15' ':memory:' \
    'select substr(file,35),typo,count(*) as count from misspell group by file, typo order by count desc;'
```

<a name="output"></a>
### How can I change the output format?

Using the `-f template` flag you can pass in a
[golang text template](https://golang.org/pkg/text/template/) to format the output.

One can use `printf "%q" VALUE` to safely quote a value.

The default template is compatible with [gometalinter](https://github.com/alecthomas/gometalinter):

```
{{ .Filename }}:{{ .Line }}:{{ .Column }}:corrected {{ printf "%q" .Original }} to {{ printf "%q" .Corrected }}
```

To just print probable misspellings:

```
-f '{{ .Original }}'
```
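
If you want to try a template before passing it to `-f`, you can execute it yourself with Go's `text/template`. The sketch below assumes a hypothetical `diff` struct; only the five field names (`Filename`, `Line`, `Column`, `Original`, `Corrected`) come from the template above, and the value misspell actually passes may differ.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// diff is a hypothetical stand-in for the value the template receives.
type diff struct {
	Filename  string
	Line      int
	Column    int
	Original  string
	Corrected string
}

// render parses a template string and executes it against one diff.
func render(format string, d diff) (string, error) {
	t, err := template.New("out").Parse(format)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	d := diff{"your.txt", 9, 21, "langauge", "language"}
	formats := []string{
		// printf "%q" adds the surrounding double quotes itself.
		`{{ .Filename }}:{{ .Line }}:{{ .Column }}:corrected {{ printf "%q" .Original }} to {{ printf "%q" .Corrected }}`,
		`{{ .Original }}`,
	}
	for _, f := range formats {
		s, err := render(f, d)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(s)
	}
}
```

The first format prints `your.txt:9:21:corrected "langauge" to "language"` and the second prints just `langauge`.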

<a name="problem"></a>
### What problem does this solve?

This corrects commonly misspelled English words in computer source
code, and other text-based formats (`.txt`, `.md`, etc).

It is designed to run quickly so it can be
used as a [pre-commit hook](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks)
with minimal burden on the developer.

It does not work with binary formats (e.g. Word, etc).

It is not a complete spell-checking program nor a grammar checker.

<a name="others"></a>
### What are other misspelling correctors and what's wrong with them?

Some other misspelling correctors:

* https://github.com/vlajos/misspell_fixer
* https://github.com/lyda/misspell-check
* https://github.com/lucasdemarchi

They all work but had problems that prevented me from using them at scale:

* slow: all of the above check one misspelling at a time (i.e. linear) using regexps
* not MIT/Apache2 licensed (or equivalent)
* have dependencies that don't work for me (python3, bash, linux sed, etc)
* don't understand American vs. British English and sometimes make unwelcome "corrections"

That said, they might be perfect for you and many have more features
than this project!

<a name="performance"></a>
### How fast is it?

Misspell is easily 100x to 1000x faster than other spelling correctors.  You
should be able to check and correct 1000 files in under 250ms.

This uses the mighty power of golang's
[strings.Replacer](https://golang.org/pkg/strings/#Replacer) which is
an implementation or variation of the
[Aho–Corasick algorithm](https://en.wikipedia.org/wiki/Aho–Corasick_algorithm).
This makes multiple substring matches *simultaneously*.

In addition, this uses multiple CPU cores to work on multiple files.
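
The two ideas above can be sketched in a few lines of Go. This is only an illustration, not misspell's source: the three word pairs are taken from examples in this README (the real list is far larger), and `newCorrector` and `correctAll` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// newCorrector builds one strings.Replacer from (misspelling, correction)
// pairs, so every pattern is searched for in the same pass over the input.
func newCorrector() *strings.Replacer {
	return strings.NewReplacer(
		"langauge", "language",
		"zeebra", "zebra",
		"immediatly", "immediately",
	)
}

// correctAll corrects every document concurrently, one goroutine each.
// A *strings.Replacer is safe for concurrent use by multiple goroutines.
func correctAll(r *strings.Replacer, docs []string) []string {
	out := make([]string, len(docs))
	var wg sync.WaitGroup
	for i, d := range docs {
		wg.Add(1)
		go func(i int, d string) {
			defer wg.Done()
			out[i] = r.Replace(d)
		}(i, d)
	}
	wg.Wait()
	return out
}

func main() {
	docs := []string{"a langauge for zeebra fans", "reply immediatly"}
	for _, s := range correctAll(newCorrector(), docs) {
		fmt.Println(s)
	}
}
```

Adding more word pairs does not add more passes over the text, which is why this approach scales better than applying one regexp per misspelling.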

<a name="issues"></a>
### What problems does it have?

Unlike the other projects, this doesn't know what a "word" is.  There may be
more false positives and false negatives due to this.  On the other hand, it
sometimes catches things others don't.

Either way, please file bugs and we'll fix them!

Since it operates in parallel to make corrections, it can be hard to
determine exactly which word was corrected.

<a name="debug"></a>
### It's making mistakes.  How can I debug?

Run with the `-debug` flag on the file you want.  It should then print what word
it is trying to correct.  Then [file a
bug](https://github.com/client9/misspell/issues) describing the problem.
Thanks!

<a name="missing"></a>
### Why is it making mistakes or missing items in golang files?

The matching function is *case-sensitive*, so variable names that are multiple
words either in all-upper or all-lower case can sometimes cause false
positives.  For instance, a variable named `bodyreader` could trigger a false
positive since `yrea` is in the middle and could be corrected to `year`.
Other problems happen if the variable name uses an English contraction that
should use an apostrophe.  The best way of fixing this is to use the
[Effective Go naming
conventions](https://golang.org/doc/effective_go.html#mixed-caps) and use
[camelCase](https://en.wikipedia.org/wiki/CamelCase) for variable names.  You
can check your code using [golint](https://github.com/golang/lint).
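
The `bodyreader` case is easy to reproduce with a plain substring replacer. The rule `yrea` to `year` below is the hypothetical one from the paragraph above, and `naiveFix` is a name made up for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveFix applies one case-sensitive substring rule with no notion of
// word boundaries.
func naiveFix(s string) string {
	return strings.NewReplacer("yrea", "year").Replace(s)
}

func main() {
	fmt.Println(naiveFix("bodyreader")) // prints "bodyearder": the rule fires inside the name
	fmt.Println(naiveFix("bodyReader")) // prints "bodyReader": "yRea" does not match "yrea"
}
```

This is why mixed caps help: the capital letter at the word boundary breaks the accidental match.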

<a name="license"></a>
### What license is this?

[MIT](https://github.com/client9/misspell/blob/master/LICENSE)

<a name="words"></a>
### Where do the word lists come from?

It started with a word list from
[Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines).
Unfortunately, this list had to be highly edited as many of the words are
obsolete or based on mistakes made on mechanical typewriters (I'm guessing).

Additional words were added based on actual mistakes seen in
the wild (meaning self-generated).

Variations of UK and US spellings are based on many sources including:

* http://www.tysto.com/uk-us-spelling-list.html (with heavy editing, many are incorrect)
* http://www.oxforddictionaries.com/us/words/american-and-british-spelling-american (excellent site but incomplete)
* Diffing US and UK [scowl dictionaries](http://wordlist.aspell.net)

American English is more accepting of spelling variations than is British
English, so "what is American or not" is subject to opinion.  Corrections and help welcome.

<a name="otherideas"></a>
### What are some other enhancements that could be done?

Here are some ideas for enhancements:

*Capitalization of proper nouns* could be done (e.g. weekday and month names, country names, language names).

*Opinionated US spellings*   US English has a number of words with alternate
spellings.  Think [adviser vs.
advisor](http://grammarist.com/spelling/adviser-advisor/).  While "advisor" is not wrong, the opinionated US
locale would correct "advisor" to "adviser".

*Versioning*  Some type of versioning is needed so reporting mistakes and errors is easier.

*Feedback*  Mistakes would be sent to some server for aggregation and feedback review.


## Github Emoji Test :imp:

:imp:  😻

Bold **:imp::**

```
This is an :imp:
```

This is an `:imp:`