This is a description of the profile.proto format.

# Overview

Profile.proto is a data representation for profile data. It is independent of
the type of data being collected and the sampling process used to collect that
data. On disk, it is represented as a gzip-compressed protocol buffer, described
at src/proto/profile.proto.

A profile in this context refers to a collection of samples, each one
representing measurements performed at a certain point in the life of a job. A
sample associates a set of measurement values with a list of locations, commonly
representing the program call stack when the sample was taken.

Tools such as pprof analyze these samples and display this information in
multiple forms, such as identifying the hottest locations or building graphical
call graphs or trees.

# General structure of a profile

A profile is represented by a Profile message, which contains the following
fields:

* *sample*: A profile sample, with the values measured and the associated call
  stack as a list of location ids. Samples with identical call stacks can be
  merged by adding their respective values, element by element.
* *location*: A unique place in the program, commonly mapped to a single
  instruction address. It has a unique nonzero id, to be referenced from the
  samples. It contains source information in the form of lines, and a mapping id
  that points to a binary.
* *function*: A program function as defined in the program source. It has a
  unique nonzero id, referenced from the location lines. It contains a
  human-readable name for the function (e.g. a C++ demangled name), a system name
  (e.g. a C++ mangled name), the name of the corresponding source file, and other
  function attributes.
* *mapping*: A binary that is part of the program during the profile
  collection. It has a unique nonzero id, referenced from the locations. It
  includes details on how the binary was mapped during program execution. By
  convention the main program binary is the first mapping, followed by any
  shared libraries.
* *string_table*: All strings in the profile are represented as indices into
  this repeated field. The first string is empty, so index == 0 always
  represents the empty string.

# Measurement values

Measurement values are represented as 64-bit integers. The profile contains an
explicit description of each value represented, using a ValueType message, with
two fields:

* *Type*: A human-readable description of the type semantics. For example “cpu”
  to represent CPU time, “wall” or “time” for wallclock time, or “memory” for
  bytes allocated.
* *Unit*: A human-readable name of the unit represented by the 64-bit integer
  values. For example, it could be “nanoseconds” or “milliseconds” for a time
  value, or “bytes” or “megabytes” for a memory size. If this is just
  representing a number of events, the recommended unit name is “count”.

A profile can represent multiple measurements per sample, but all samples must
have the same number and type of measurements. The actual values are stored in
the Sample.value fields, each one described by the corresponding
Profile.sample_type field.

Some profiles have a uniform period that describes the granularity of the data
collection. For example, a CPU profile may have a period of 100ms, or a memory
allocation profile may have a period of 512kb. Profiles can optionally describe
such a value in the Profile.period and Profile.period_type fields. The profile
period is meant for human consumption and does not affect the interpretation of
the profiling data.
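
The string table, value types and sample merging described so far can be
sketched in a few lines of Python. This is only an illustration under stated
assumptions: `ProfileSketch` and its methods are hypothetical names invented
for this example, not part of pprof or of any generated protocol buffer code,
and samples are keyed by their list of location ids only.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileSketch:
    """Hypothetical in-memory stand-in for a Profile message (not the pprof API)."""

    # Index 0 is always the empty string.
    string_table: list = field(default_factory=lambda: [""])
    # One (type index, unit index) pair per measurement, mirroring ValueType.
    sample_type: list = field(default_factory=list)
    # Maps a call stack (tuple of location ids) to its list of values.
    samples: dict = field(default_factory=dict)

    def intern(self, s):
        """Return the string table index for s, appending it if needed."""
        try:
            return self.string_table.index(s)
        except ValueError:
            self.string_table.append(s)
            return len(self.string_table) - 1

    def add_value_type(self, type_name, unit):
        """Describe one measurement as a pair of string table indices."""
        self.sample_type.append((self.intern(type_name), self.intern(unit)))

    def add_sample(self, location_ids, values):
        """Add a sample, merging it with any sample on the same call stack."""
        if len(values) != len(self.sample_type):
            raise ValueError("each sample needs one value per sample_type")
        key = tuple(location_ids)
        if key in self.samples:
            # Identical call stacks merge by element-wise addition.
            self.samples[key] = [a + b for a, b in zip(self.samples[key], values)]
        else:
            self.samples[key] = list(values)


profile = ProfileSketch()
profile.add_value_type("samples", "count")
profile.add_value_type("cpu", "nanoseconds")
profile.add_sample([3, 2, 1], [1, 10_000_000])
profile.add_sample([3, 2, 1], [2, 20_000_000])  # same stack, gets merged
profile.add_sample([4, 2, 1], [1, 10_000_000])  # different stack, kept apart
```

After these calls the two samples on stack `[3, 2, 1]` are stored as a single
entry, and every value type refers to its strings by index rather than by
value.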

By convention, the first value on all profiles is the number of samples
collected at this call stack, with unit “count”. Because the profile does not
describe the sampling process beyond the optional period, it must include
unsampled values for all measurements. For example, a CPU profile could have
value[0] == samples, and value[1] == time in milliseconds.

## Locations, functions and mappings

Each sample lists the id of each location where the sample was collected, in
bottom-up order. Each location has an explicit unique nonzero integer id,
independent of its position in the profile, and holds additional information to
identify the corresponding source.

The profile source is expected to perform any adjustment required to the
locations in order to point to the calls in the stack. For example, if the
profile source extracts the call stack by walking back over the program stack,
it must adjust the instruction addresses to point to the actual call
instruction, instead of the instruction that each call will return to.

Sources usually generate profiles that fall into these two categories:

* *Unsymbolized profiles*: These only contain instruction addresses, and are to
  be symbolized by a separate tool. It is critical for each location to point to
  a valid mapping, which will provide the information required for
  symbolization. These are used for profiles of compiled languages, such as C++
  and Go.

* *Symbolized profiles*: These contain all the symbol information available for
  the profile. Mappings and instruction addresses are optional for symbolized
  locations. These are used for profiles of interpreted or jitted languages,
  such as Java or Python. Also, the profile format allows the generation of
  mixed profiles, with symbolized and unsymbolized locations.

The symbol information is represented in the repeated lines field of the
Location message.
A location has multiple lines if it reflects multiple program
sources, for example if representing inlined call stacks. Lines reference
functions by their unique nonzero id, and the source line number within the
source file listed by the function. A function contains the source attributes
for a function, including its name, source file, etc. Functions include both a
user and a system form of the name, for example to include C++ demangled and
mangled names. For profiles where only a single name exists, both should be set
to the same string.

Mappings are also referenced from locations by their unique nonzero id, and
include all information needed to symbolize addresses within the mapping. They
include information similar to the Linux /proc/self/maps file. Locations
associated with a mapping should have addresses that land between the mapping
start and limit. Also, if available, mappings should include a build id to
uniquely identify the version of the binary being used.

## Labels

Samples optionally contain labels, which are annotations to discriminate samples
with identical locations. For example, a label can be used on a malloc profile
to indicate allocation size, so two samples on the same call stack with sizes
2MB and 4MB do not get merged into a single sample with two allocations and a
size of 6MB.

Labels can be string-based or numeric. They are represented by the Label
message, with a key identifying the label and either a string or numeric
value. For numeric labels, by convention the key represents the measurement unit
of the numeric value. So for the previous example, the samples would have labels
{“bytes”, 2097152} and {“bytes”, 4194304}.

## Keep and drop expressions

Some profile sources may have knowledge of locations that are uninteresting or
irrelevant.
However, if symbolization is needed in order to identify these
locations, the profile source may not be able to remove them when the profile is
generated. The profile format provides a mechanism to identify these frames by
name, through regular expressions.

These expressions must match the function name in its entirety. Frames that
match Profile.drop\_frames will be dropped from the profile, along with any
frames below them. Frames that match Profile.keep\_frames will be kept, even if
they match drop\_frames.
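
One possible reading of these rules is sketched below in Python. It is not
pprof's implementation: `prune_frames` is a hypothetical helper, the stack is
assumed to be a list of function names ordered leaf first, and "below" is
assumed to mean the frames closer to the leaf than the matching frame.
`re.fullmatch` is used so that an expression has to match the whole function
name.

```python
import re


def prune_frames(frames, drop_frames, keep_frames=""):
    """Apply drop/keep expressions to one stack of function names.

    Assumptions for this sketch: frames is ordered leaf first, and an
    empty expression means "not set".
    """
    if not drop_frames:
        return list(frames)
    drop = re.compile(drop_frames)
    keep = re.compile(keep_frames) if keep_frames else None

    # Walk from the root (last element) toward the leaf (first element).
    for i in range(len(frames) - 1, -1, -1):
        name = frames[i]
        kept = keep is not None and keep.fullmatch(name) is not None
        if drop.fullmatch(name) and not kept:
            # Drop the matching frame and every frame below it.
            return list(frames[i + 1:])
    return list(frames)


stack = ["memcpy", "malloc_internal", "malloc", "parse", "main"]

# "malloc" matches drop_frames but is protected by keep_frames, so pruning
# starts at "malloc_internal" instead.
pruned = prune_frames(stack, r"malloc.*", r"malloc")
```

With these inputs `pruned` holds `malloc`, `parse` and `main`. A drop
expression of `malloc` alone leaves a stack containing only `malloc_internal`
and `main` unchanged, because the expression does not match the whole name
`malloc_internal`.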