github.com/quay/claircore@v1.5.28/docs/concepts/indexer_architecture.md (about)

# Indexer
`claircore/indexer`

The Indexer package performs Libindex's heavy lifting. It is responsible for retrieving Manifest layers, parsing the contents of each layer, and computing an IndexReport.

To perform this work in incremental steps, the Indexer is implemented as a finite state machine. At each state transition the Indexer persists an updated IndexReport to its datastore.

## States
The following diagram expresses the possible states of the Indexer:
```mermaid
stateDiagram-v2
	state if_indexed <<choice>>
	[*] --> CheckManifest
	CheckManifest --> if_indexed
	if_indexed --> [*]: Indexed
	if_indexed --> FetchLayers: Unindexed
	FetchLayers --> ScanLayers
	ScanLayers --> Coalesce
	Coalesce --> IndexManifest
	IndexManifest --> IndexFinished
	IndexFinished --> [*]
%% These notes make the diagram unreadable :/
%% note left of CheckManifest: Determine if this manifest has been indexed previously.
%% note right of FetchLayers: Determine which layers need to be indexed and fetch them.
%% note right of ScanLayers: Concurrently run needed Indexers on layers.
%% note right of Coalesce: Compute the final contents of the container image.
%% note right of IndexManifest: Associate all the discovered data.
%% note right of IndexFinished: Persist the results.
```

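The transitions in the diagram can be sketched as a small state machine in Go. This is a minimal illustration, not the package's actual implementation: the `State`, `Report`, `Store`, `next`, and `Run` names are hypothetical, the store is an in-memory stand-in for the datastore, and the work done inside each state is omitted so that only the transition order and the persist-on-transition behavior remain.

```go
package main

import "fmt"

// State enumerates the states from the diagram. The Go identifiers are
// illustrative; Terminal stands in for the diagram's final [*] node.
type State int

const (
	CheckManifest State = iota
	FetchLayers
	ScanLayers
	Coalesce
	IndexManifest
	IndexFinished
	Terminal
)

var stateNames = [...]string{
	"CheckManifest", "FetchLayers", "ScanLayers", "Coalesce",
	"IndexManifest", "IndexFinished", "Terminal",
}

func (s State) String() string {
	if s < 0 || int(s) >= len(stateNames) {
		return fmt.Sprintf("State(%d)", int(s))
	}
	return stateNames[s]
}

// Report is a hypothetical stand-in for an IndexReport.
type Report struct {
	Manifest string
	State    State
}

// Store is a hypothetical in-memory datastore.
type Store struct {
	Indexed map[string]bool // manifests that finished indexing
	History []Report        // every report persisted, in order
}

func NewStore() *Store { return &Store{Indexed: make(map[string]bool)} }

func (s *Store) Persist(r Report) { s.History = append(s.History, r) }

// next encodes the edges of the diagram, including the Indexed/Unindexed choice.
func next(cur State, alreadyIndexed bool) (State, error) {
	switch cur {
	case CheckManifest:
		if alreadyIndexed {
			return Terminal, nil
		}
		return FetchLayers, nil
	case FetchLayers:
		return ScanLayers, nil
	case ScanLayers:
		return Coalesce, nil
	case Coalesce:
		return IndexManifest, nil
	case IndexManifest:
		return IndexFinished, nil
	case IndexFinished:
		return Terminal, nil
	}
	return Terminal, fmt.Errorf("no transition from %v", cur)
}

// Run drives the machine for one manifest and persists an updated report
// at every transition. It returns the states visited, in order.
func Run(store *Store, manifest string) ([]State, error) {
	var visited []State
	cur := CheckManifest
	for cur != Terminal {
		visited = append(visited, cur)
		nxt, err := next(cur, store.Indexed[manifest])
		if err != nil {
			return visited, err
		}
		if cur == IndexFinished {
			store.Indexed[manifest] = true
		}
		store.Persist(Report{Manifest: manifest, State: nxt})
		cur = nxt
	}
	return visited, nil
}

func main() {
	store := NewStore()
	for i := 0; i < 2; i++ {
		visited, err := Run(store, "sha256:abc")
		fmt.Println(visited, err)
	}
}
```

Running the same manifest twice shows the choice node at work: the first run walks every state, while the second stops after `CheckManifest` because the manifest is already marked as indexed.
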
## Data Model
The Indexer data model focuses on content-addressable hashes as primary keys, the deduplication of package/distribution/repository information, and the recording of scan artifacts.
Scan artifacts are unique artifacts found within a layer which point to a deduplicated general package/distribution/repository record.

The following diagram outlines the current Indexer data model.
```mermaid
%%{init: {"er":{"layoutDirection":"RL"}} }%%
erDiagram
	ManifestLayer many to 1 Manifest: ""
	ManifestLayer many to 1 Layer: ""
	ScannedLayer many to 1 Layer: ""
	ScannedLayer many to 1 Scanner: ""
	ScannedManifest many to 1 Manifest: ""
	ScannedManifest many to 1 Scanner: ""

	TYPE_ScanArtifact 1 to 1 Layer: ""
	TYPE_ScanArtifact 1 to 1 Scanner: ""
	TYPE_ScanArtifact 1 to 1 TYPE: ""

	ManifestIndex many to 1 Manifest: ""
	ManifestIndex 1 to zero or one TYPE: ""

	IndexReport 1 to 1 Manifest: "cached result"
```
Note that `TYPE` stands in for each of the Indexer types (e.g. `Package`, `Repository`).

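To make the deduplication idea concrete, the following Go sketch models a single `TYPE` (a package) in memory. It is an assumption-laden illustration rather than the real schema: the `Package`, `ScanArtifact`, and `Index` types, their fields, and the `LayerHash` helper are all hypothetical, and SHA-256 is used only as an example of a content-addressable key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Package is a hypothetical deduplicated record; the fields are illustrative.
type Package struct {
	Name    string
	Version string
}

// ScanArtifact ties one finding to the layer and scanner that produced it,
// and points at the shared Package record instead of copying it.
type ScanArtifact struct {
	LayerHash string // content-addressable key of the layer
	Scanner   string
	PackageID int
}

// Index is a hypothetical in-memory stand-in for the datastore.
type Index struct {
	packages  []Package
	ids       map[Package]int
	seen      map[ScanArtifact]bool
	Artifacts []ScanArtifact
}

func NewIndex() *Index {
	return &Index{ids: make(map[Package]int), seen: make(map[ScanArtifact]bool)}
}

// LayerHash derives a content-addressable key from the layer's bytes.
func LayerHash(content []byte) string {
	sum := sha256.Sum256(content)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// internPackage returns the ID for p, creating the record only once.
func (ix *Index) internPackage(p Package) int {
	if id, ok := ix.ids[p]; ok {
		return id
	}
	id := len(ix.packages)
	ix.packages = append(ix.packages, p)
	ix.ids[p] = id
	return id
}

// Record stores a scan artifact, ignoring exact duplicates so that each
// (layer, scanner, package) combination appears at most once.
func (ix *Index) Record(layer, scanner string, p Package) {
	a := ScanArtifact{LayerHash: layer, Scanner: scanner, PackageID: ix.internPackage(p)}
	if ix.seen[a] {
		return
	}
	ix.seen[a] = true
	ix.Artifacts = append(ix.Artifacts, a)
}

// PackageCount reports how many distinct package records exist.
func (ix *Index) PackageCount() int { return len(ix.packages) }

func main() {
	ix := NewIndex()
	pkg := Package{Name: "openssl", Version: "3.0.7"}
	// Two different layers contain the same package.
	ix.Record(LayerHash([]byte("layer-a")), "rpm", pkg)
	ix.Record(LayerHash([]byte("layer-b")), "rpm", pkg)
	fmt.Println("packages:", ix.PackageCount(), "artifacts:", len(ix.Artifacts))
}
```

Keying layers by content hash means identical layers shared across manifests resolve to the same key, and interning packages keeps one record per distinct package no matter how many layers report it.
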
## HTTP Resources

Indexers as currently built may make network requests.
This is an outstanding issue.
The following are the URLs used.

{{# injecturls indexer }}