$ go tool pprof --alloc_objects uidassigner heap.prof

(pprof) top10
196427053 of 207887723 total (94.49%)
Dropped 41 nodes (cum <= 1039438)
Showing top 10 nodes out of 31 (cum >= 8566234)
      flat  flat%   sum%        cum   cum%
  55529704 26.71% 26.71%   55529704 26.71%  github.com/dgraph-io/dgraph/rdf.Parse
  28255068 13.59% 40.30%   30647245 14.74%  github.com/dgraph-io/dgraph/posting.(*List).getPostingList
  20406729  9.82% 50.12%   20406729  9.82%  github.com/zond/gotomic.newRealEntryWithHashCode
  17777182  8.55% 58.67%   17777182  8.55%  strings.makeCutsetFunc
  17582839  8.46% 67.13%   17706815  8.52%  github.com/dgraph-io/dgraph/loader.(*state).readLines
  15139047  7.28% 74.41%   88445933 42.55%  github.com/dgraph-io/dgraph/loader.(*state).parseStream
  12927366  6.22% 80.63%   12927366  6.22%  github.com/zond/gotomic.(*element).search
  10789028  5.19% 85.82%   66411362 31.95%  github.com/dgraph-io/dgraph/posting.GetOrCreate
   9453856  4.55% 90.37%    9453856  4.55%  github.com/zond/gotomic.(*hashHit).search
   8566234  4.12% 94.49%    8566234  4.12%  github.com/dgraph-io/dgraph/uid.stringKey


(pprof) list rdf.Parse
Total: 207887723
ROUTINE ======================== github.com/dgraph-io/dgraph/rdf.Parse in /home/mrjn/go/src/github.com/dgraph-io/dgraph/rdf/parse.go
  55529704   55529704 (flat, cum) 26.71% of Total
         .          .    118:	}
         .          .    119:	return val[1 : len(val)-1]
         .          .    120:}
         .          .    121:
         .          .    122:func Parse(line string) (rnq NQuad, rerr error) {
  54857942   54857942    123:	l := lex.NewLexer(line)
         .          .    124:	go run(l)
         .          .    125:	var oval string
         .          .    126:	var vend bool


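This shows that lex.NewLexer(..) is expensive in terms of memory allocation: the call on
line 123 accounts for 54857942 of the 55529704 objects allocated in rdf.Parse.
So, let's use sync.Pool here to reuse lexers instead of allocating a new one per line.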
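
As a rough illustration of the sync.Pool idea, here is a minimal sketch. The lexer
struct, its fields, and the getLexer/putLexer helpers are hypothetical stand-ins; the
real lex package's type is not shown in this README and may look quite different.

```go
package main

import (
	"fmt"
	"sync"
)

// lexer is a hypothetical stand-in for the type returned by lex.NewLexer.
type lexer struct {
	input string
	items []string // scratch space we would rather not reallocate per line
}

// lexerPool hands out reusable lexers; New only runs when the pool is empty.
var lexerPool = sync.Pool{
	New: func() interface{} { return &lexer{items: make([]string, 0, 16)} },
}

// getLexer fetches a lexer from the pool and resets it for a new line.
func getLexer(line string) *lexer {
	l := lexerPool.Get().(*lexer)
	l.input = line
	l.items = l.items[:0] // keep the backing array, drop the old contents
	return l
}

// putLexer hands a lexer back to the pool once the line has been parsed.
func putLexer(l *lexer) {
	l.input = ""
	lexerPool.Put(l)
}

func main() {
	l := getLexer(`<alice> <knows> <bob> .`)
	l.items = append(l.items, "alice", "knows", "bob")
	fmt.Println(len(l.items), l.input)
	putLexer(l)
}
```

The pool only cuts allocations if every lexer is put back after use, and the runtime
is free to drop pooled objects at any GC, so some allocations will still show up.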

After using sync.Pool, this is the output:

422808936 of 560381333 total (75.45%)
Dropped 63 nodes (cum <= 2801906)
Showing top 10 nodes out of 62 (cum >= 18180150)
      flat  flat%   sum%        cum   cum%
 103445194 18.46% 18.46%  103445194 18.46%  github.com/Sirupsen/logrus.(*Entry).WithFields
  65448918 11.68% 30.14%  163184489 29.12%  github.com/Sirupsen/logrus.(*Entry).WithField
  48366300  8.63% 38.77%  203838187 36.37%  github.com/dgraph-io/dgraph/posting.(*List).get
  39789719  7.10% 45.87%   49276181  8.79%  github.com/dgraph-io/dgraph/posting.(*List).getPostingList
  36642638  6.54% 52.41%   36642638  6.54%  github.com/dgraph-io/dgraph/lex.NewLexer
  35190301  6.28% 58.69%   35190301  6.28%  github.com/google/flatbuffers/go.(*Builder).growByteBuffer
  31392455  5.60% 64.29%   31392455  5.60%  github.com/zond/gotomic.newRealEntryWithHashCode
  25895676  4.62% 68.91%   25895676  4.62%  github.com/zond/gotomic.(*element).search
  18546971  3.31% 72.22%   72863016 13.00%  github.com/dgraph-io/dgraph/loader.(*state).parseStream
  18090764  3.23% 75.45%   18180150  3.24%  github.com/dgraph-io/dgraph/loader.(*state).readLines

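After a few more discussions, I realized that the lexer didn't need to be allocated on the heap.
So, I switched it to be allocated on the stack. These are the results (this run was made without
--alloc_objects, so pprof reports sizes in MB instead of object counts):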

$ go tool pprof uidassigner heap.prof
Entering interactive mode (type "help" for commands)
(pprof) top10
1308.70MB of 1696.59MB total (77.14%)
Dropped 73 nodes (cum <= 8.48MB)
Showing top 10 nodes out of 52 (cum >= 161.50MB)
      flat  flat%   sum%        cum   cum%
  304.56MB 17.95% 17.95%   304.56MB 17.95%  github.com/dgraph-io/dgraph/posting.NewList
  209.55MB 12.35% 30.30%   209.55MB 12.35%  github.com/Sirupsen/logrus.(*Entry).WithFields
  207.55MB 12.23% 42.54%   417.10MB 24.58%  github.com/Sirupsen/logrus.(*Entry).WithField
     108MB  6.37% 48.90%      108MB  6.37%  github.com/dgraph-io/dgraph/uid.(*lockManager).newOrExisting
      88MB  5.19% 54.09%       88MB  5.19%  github.com/zond/gotomic.newMockEntry
   85.51MB  5.04% 59.13%    85.51MB  5.04%  github.com/google/flatbuffers/go.(*Builder).growByteBuffer
   78.01MB  4.60% 63.73%    78.01MB  4.60%  github.com/dgraph-io/dgraph/posting.Key
   78.01MB  4.60% 68.32%    78.51MB  4.63%  github.com/dgraph-io/dgraph/uid.stringKey
      76MB  4.48% 72.80%       76MB  4.48%  github.com/zond/gotomic.newRealEntryWithHashCode
   73.50MB  4.33% 77.14%   161.50MB  9.52%  github.com/zond/gotomic.(*Hash).getBucketByIndex

Now rdf.Parse no longer shows up in the memory profile. Win!
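
To make the heap vs. stack difference concrete, here is a small sketch that counts
allocations for both styles using testing.AllocsPerRun. Everything in it (the lexer
struct, countSpaces, the sink variable) is made up for illustration and is not the
actual lex or rdf code. It assumes the lexer value never leaves the function that
creates it, which is what lets the compiler keep it on the stack.

```go
package main

import (
	"fmt"
	"testing"
)

// lexer is a hypothetical, simplified lexer state.
type lexer struct {
	input string
	pos   int
}

// countSpaces walks the input; it stands in for real lexing work.
func (l *lexer) countSpaces() int {
	n := 0
	for l.pos = 0; l.pos < len(l.input); l.pos++ {
		if l.input[l.pos] == ' ' {
			n++
		}
	}
	return n
}

// sink keeps a reference alive so the heap variant's lexer must escape.
var sink *lexer

// countSpacesHeap builds the lexer behind a pointer that escapes,
// so each call costs one heap allocation.
func countSpacesHeap(line string) int {
	l := &lexer{input: line}
	sink = l
	return l.countSpaces()
}

// countSpacesStack keeps the lexer as a local value that never escapes,
// so the compiler can place it on the stack.
func countSpacesStack(line string) int {
	var l lexer
	l.input = line
	return l.countSpaces()
}

// heapAllocs and stackAllocs report average allocations per call.
func heapAllocs() float64 {
	return testing.AllocsPerRun(100, func() { countSpacesHeap("<a> <b> <c> .") })
}

func stackAllocs() float64 {
	return testing.AllocsPerRun(100, func() { countSpacesStack("<a> <b> <c> .") })
}

func main() {
	fmt.Println("heap variant allocs/op: ", heapAllocs())
	fmt.Println("stack variant allocs/op:", stackAllocs())
}
```

Building with go build -gcflags=-m prints the compiler's escape analysis decisions,
which is a quick way to check whether a value really stays on the stack.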