github.com/scottcagno/storage@v1.8.0/pkg/search/search.go

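package search

// Searcher is the common interface for the string-search algorithms described below.
type Searcher interface {
	// FindIndex searches text for pattern and returns an index into text.
	FindIndex(text, pattern []byte) int
	// FindIndexString is the string counterpart of FindIndex.
	FindIndexString(text, pattern string) int
}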

// Boyer-Moore:
// Works by pre-analyzing the pattern and comparing it against the text from right to left. When a
// mismatch occurs, the pre-analysis determines how far the pattern can be shifted relative to the
// text being searched. This works particularly well for long search patterns: the search can be
// sublinear, because not every character of the text has to be read. In the best case, a mismatch
// lets the search skip ahead by the full length of the pattern. Conversely, if the pattern is only
// one or two characters long, the possible skips are so small that it effectively becomes a
// linear search.
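
// Example (illustrative sketch, not this package's implementation): the right-to-left comparison
// and skip-ahead behaviour described above, shown with the simplified Boyer-Moore-Horspool
// variant, which uses only a bad-character shift table. The name horspoolIndex and the
// "-1 when not found" convention are assumptions made for this example.

```go
package main

import "fmt"

// horspoolIndex returns the index of the first occurrence of pattern in text.
// Assumption: -1 means "not found" and an empty pattern matches at index 0.
func horspoolIndex(text, pattern []byte) int {
	m, n := len(pattern), len(text)
	if m == 0 {
		return 0
	}
	if m > n {
		return -1
	}

	// Pre-analysis: shift[c] is how far the window may slide when the text
	// byte aligned with the last pattern position is c. Bytes that do not
	// occur in pattern[:m-1] allow a full skip of m positions.
	var shift [256]int
	for i := range shift {
		shift[i] = m
	}
	for i := 0; i < m-1; i++ {
		shift[pattern[i]] = m - 1 - i
	}

	for pos := 0; pos+m <= n; {
		// Compare the window against the pattern from right to left.
		j := m - 1
		for j >= 0 && text[pos+j] == pattern[j] {
			j--
		}
		if j < 0 {
			return pos // every position matched
		}
		// Mismatch: slide by the precomputed amount for the window's last byte.
		pos += shift[text[pos+m-1]]
	}
	return -1
}

func main() {
	text := []byte("here is a simple example")
	fmt.Println(horspoolIndex(text, []byte("example"))) // 17
	fmt.Println(horspoolIndex(text, []byte("sample")))  // -1
}
```

// With the 7-byte pattern "example", the first window ends in 's', which does not occur in the
// pattern, so the search jumps 7 bytes at once without reading the bytes in between.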

// Knuth-Morris-Pratt:
// Also works by pre-analyzing the pattern, but re-uses whatever was already matched in the initial
// part of the pattern so that those characters never have to be rematched. This can work quite
// well if your alphabet is small (e.g. DNA bases), as there is a higher chance that your search
// patterns contain re-usable sub-patterns. KMP is best suited for searching texts that have a lot
// of tight repetition.
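
// Example (illustrative sketch, not this package's implementation): the pattern pre-analysis
// described above is commonly stored as a "failure table" that records, for each prefix of the
// pattern, the longest proper prefix that is also a suffix. The names failureTable and kmpIndex
// and the "-1 when not found" convention are assumptions made for this example.

```go
package main

import "fmt"

// failureTable returns, for every i, the length of the longest proper prefix
// of pattern[:i+1] that is also a suffix of it.
func failureTable(pattern []byte) []int {
	fail := make([]int, len(pattern))
	k := 0 // length of the currently matched prefix
	for i := 1; i < len(pattern); i++ {
		for k > 0 && pattern[i] != pattern[k] {
			k = fail[k-1] // fall back to the next shorter border
		}
		if pattern[i] == pattern[k] {
			k++
		}
		fail[i] = k
	}
	return fail
}

// kmpIndex returns the index of the first occurrence of pattern in text.
// Assumption: -1 means "not found" and an empty pattern matches at index 0.
func kmpIndex(text, pattern []byte) int {
	if len(pattern) == 0 {
		return 0
	}
	fail := failureTable(pattern)
	k := 0 // number of pattern bytes matched so far
	for i := 0; i < len(text); i++ {
		// On a mismatch, re-use the part already matched instead of
		// moving backwards in the text.
		for k > 0 && text[i] != pattern[k] {
			k = fail[k-1]
		}
		if text[i] == pattern[k] {
			k++
		}
		if k == len(pattern) {
			return i - k + 1
		}
	}
	return -1
}

func main() {
	fmt.Println(failureTable([]byte("ABABAC")))                        // [0 0 1 2 3 0]
	fmt.Println(kmpIndex([]byte("GATTACAGATTACCA"), []byte("GATTACC"))) // 7
}
```

// The text index i only ever moves forward, which is why repetitive texts over a small alphabet
// do not cause repeated rescanning.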

// Rabin-Karp:
// Works by efficiently computing hash values of the successive substrings of the text and
// comparing each one against the hash of the pattern; only substrings whose hashes agree need a
// full comparison. It is best on large texts in which you are looking for multiple pattern
// matches, as in plagiarism detection.
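
// Example (illustrative sketch, not this package's implementation): one common way to compute
// the successive substring hashes efficiently is a polynomial rolling hash, which updates the
// hash in constant time as the window slides by one byte. The name rabinKarpAll and the constants
// base and mod are assumptions chosen for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

const (
	base = 256           // one digit per byte value (assumed for this sketch)
	mod  = 1_000_000_007 // a large prime modulus (assumed for this sketch)
)

// rabinKarpAll returns the start index of every occurrence of pattern in text.
// It returns nil for an empty pattern or a pattern longer than the text.
func rabinKarpAll(text, pattern []byte) []int {
	m, n := len(pattern), len(text)
	if m == 0 || m > n {
		return nil
	}

	// high = base^(m-1) mod mod, the weight of the window's leading byte.
	var high uint64 = 1
	for i := 0; i < m-1; i++ {
		high = high * base % mod
	}

	// Hash the pattern and the first window of the text.
	var ph, th uint64
	for i := 0; i < m; i++ {
		ph = (ph*base + uint64(pattern[i])) % mod
		th = (th*base + uint64(text[i])) % mod
	}

	var matches []int
	for pos := 0; ; pos++ {
		// Equal hashes only suggest a match; verify to rule out collisions.
		if ph == th && bytes.Equal(text[pos:pos+m], pattern) {
			matches = append(matches, pos)
		}
		if pos+m >= n {
			break
		}
		// Roll the window: remove text[pos], then append text[pos+m].
		th = (th + mod - uint64(text[pos])*high%mod) % mod
		th = (th*base + uint64(text[pos+m])) % mod
	}
	return matches
}

func main() {
	fmt.Println(rabinKarpAll([]byte("abracadabra"), []byte("abra"))) // [0 7]
}
```

// Searching for several patterns of the same length could re-use the same rolling text hash and
// look it up in a set of pattern hashes, which is where this approach tends to pay off.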