// Copyright 2015 The LUCI Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package bigtable provides an implementation of the Storage interface backed
// by Google Cloud Platform's BigTable.
//
// # Intermediate Log Table
//
// The Intermediate Log Table stores LogEntry protobufs that have been ingested,
// but haven't yet been archived. It is a tall table whose rows are keyed off
// of the log's (Path,Stream-Index) in that order.
//
// Each entry in the table will contain the following schema:
//   - Column Family "log"
//   - Column "data": the LogEntry raw protobuf data. Soft size limit of ~1MB.
//
// The log path is the composite of the log's (Prefix, Name) properties. Logs
// belonging to the same stream will share the same path, so they will be
// clustered together and suitable for efficient iteration. Immediately
// following the path will be the log's stream index.
//
//	[          20 bytes           ]    ~    [      1-5 bytes       ]
//	B64(SHA256(Path(Prefix, Name))) + '~' + HEX(cmpbin(StreamIndex))
//
// As there is no (technical) size constraint to either the Prefix or Name
// values, they will both be hashed using SHA256 to produce a unique key
// representing that specific log stream.
//
// This allows a key to be generated representing "immediately after the row"
// by appending two '~' characters to the base hash. Since the second '~'
// character is always greater than any HEX(cmpbin(*)) value, this will
// effectively upper-bound the row.
//
// "cmpbin" (go.chromium.org/luci/common/cmpbin) will be used to format the
// stream index. It is a variable-width number encoding scheme that offers the
// guarantee that byte-sorted encoded numbers will maintain the same order as
// the numbers themselves.
package bigtable
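
The row key layout described in the package comment can be illustrated with a small standalone sketch. It is not the package's actual implementation. The function names (`pathHash`, `rowKey`, `upperBound`, `encodeIndex`) are hypothetical, the URL-safe base64 alphabet is an assumption, and `encodeIndex` is a fixed-width big-endian stand-in used only because it is order-preserving like cmpbin; the real cmpbin encoding is variable-width and produces different bytes.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// encodeIndex is a stand-in for cmpbin (assumption, not the real encoding):
// a fixed-width big-endian uint64, which also sorts byte-wise in numeric order.
func encodeIndex(index uint64) []byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], index)
	return buf[:]
}

// pathHash returns the base64 encoding of the SHA256 digest of the log path.
// The URL-safe alphabet is an assumption made for this sketch.
func pathHash(path string) string {
	sum := sha256.Sum256([]byte(path))
	return base64.URLEncoding.EncodeToString(sum[:])
}

// rowKey builds hash + '~' + hex(encoded index), following the layout in the
// package comment.
func rowKey(path string, index uint64) string {
	return pathHash(path) + "~" + hex.EncodeToString(encodeIndex(index))
}

// upperBound builds the key "immediately after" every row of the stream by
// appending two '~' characters to the hash. '~' (0x7E) sorts after every
// lowercase hex digit, so this key is greater than any rowKey for the path.
func upperBound(path string) string {
	return pathHash(path) + "~~"
}

func main() {
	const path = "example/prefix/example/name" // hypothetical log path string

	first := rowKey(path, 1)
	second := rowKey(path, 256)

	fmt.Println("keys sort by stream index:", first < second)
	fmt.Println("upper bound follows all rows:", second < upperBound(path))
}
```

Because every key for a stream shares the same hash prefix, a range scan from `rowKey(path, start)` up to `upperBound(path)` would cover exactly the remaining entries of that stream in index order.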