# A Tour of Various Selection Strategies

Almost every package manager has four main actors:

**Project code** is the code whose dependencies we want to manage.

**Manifest file** is a file that lists the dependencies of the project code.

**Lock file** is a file generated by the package manager that contains all the information necessary to reproduce the same dependency tree on any platform.

**Dependency code** is the fetched code of the resolved dependencies.

A package manager uses a selection strategy to resolve dependencies, prevent dependency conflicts, and optimize the dependency tree.

Let's study the selection strategies of some well-known package managers:

### Cargo

In Rust, dependencies are specified in the `Cargo.toml` file in the format `<name> = "<version>"`. [SemVer](https://semver.org/) is used when specifying version numbers.

To update a dependency safely, Cargo relies on the concept of version compatibility.

Cargo uses Semantic Versioning to define compatibility between different versions of a package. It looks at the leftmost nonzero component of the version number: versions 1.0.16 and 1.1.16, for example, are considered compatible. Cargo treats an update within the compatible range as safe, but it does not update automatically to a version outside that range.
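
As a rough illustration of the leftmost-nonzero rule, the Go sketch below compares two versions. The `Version` type and the `compatible` function are hypothetical names introduced for this example and are not part of Cargo; pre-release and build metadata are ignored.

```go
package main

import "fmt"

// Version is a simplified major.minor.patch triple (hypothetical type).
// Pre-release and build metadata are ignored in this sketch.
type Version struct{ Major, Minor, Patch int }

// compatible reports whether a and b fall in the same compatibility range,
// decided by the leftmost nonzero component of the version numbers.
func compatible(a, b Version) bool {
	switch {
	case a.Major != 0 || b.Major != 0:
		return a.Major == b.Major
	case a.Minor != 0 || b.Minor != 0:
		return a.Minor == b.Minor
	default:
		return a.Patch == b.Patch
	}
}

func main() {
	fmt.Println(compatible(Version{1, 0, 16}, Version{1, 1, 16})) // same major
	fmt.Println(compatible(Version{1, 0, 16}, Version{2, 0, 0}))  // different major
	fmt.Println(compatible(Version{0, 1, 0}, Version{0, 2, 0}))   // major is 0, minor differs
}
```

Under these assumptions, only the first pair is reported as compatible: for `0.x` versions the minor number plays the role that the major number plays for `1.x` and above.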

Let's see how the SemVer requirement is taken into account during dependency resolution.

• When multiple packages require a common dependency, the resolver tries to make them use the same version within a SemVer compatibility range, preferring the latest version in that range. For example, if package 1 depends on `foo = "1.0"` and package 2 depends on `foo = "1.1"`, and the highest version available when the lock file is generated is 1.2.1, both packages use 1.2.1. Even if a new version such as 2.0.0 is released later, it is not chosen automatically because it is considered incompatible.
<br>
• If multiple packages have a common dependency with SemVer-incompatible versions, Cargo allows this but builds two separate copies of the dependency.
<br>
• If the resolver is constrained to two different versions within the same compatibility range, for example by two different exact `=` requirements, it raises an error, because multiple versions within one range are not permitted.
<br>
• Cargo does not select pre-release versions by default. To use a pre-release, the user must request that pre-release version explicitly, which often means accepting an unstable version.
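
The first two rules above can be sketched in Go as follows. This is a simplified illustration, not Cargo's resolver: `Version`, `satisfies`, and `resolve` are hypothetical names, every requirement is treated as a default caret-style requirement, and exact `=` requirements, pre-releases, and features are not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// Version is a simplified major.minor.patch triple (hypothetical type).
type Version struct{ Major, Minor, Patch int }

func (v Version) String() string {
	return fmt.Sprintf("%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// less orders versions by major, then minor, then patch.
func less(a, b Version) bool {
	if a.Major != b.Major {
		return a.Major < b.Major
	}
	if a.Minor != b.Minor {
		return a.Minor < b.Minor
	}
	return a.Patch < b.Patch
}

// satisfies reports whether candidate c meets a caret-style requirement req:
// c is at least req and stays inside req's compatibility range.
func satisfies(c, req Version) bool {
	if less(c, req) {
		return false
	}
	switch {
	case req.Major != 0:
		return c.Major == req.Major
	case req.Minor != 0:
		return c.Major == 0 && c.Minor == req.Minor
	default:
		return c == req
	}
}

// resolve picks, for each requirement on one package, the highest available
// version that satisfies it and returns the distinct versions selected.
// Requirements in the same range share one version; requirements in
// different ranges produce separate copies.
func resolve(reqs, available []Version) ([]Version, error) {
	var selected []Version
	seen := map[Version]bool{}
	for _, req := range reqs {
		var best *Version
		for i := range available {
			c := available[i]
			if satisfies(c, req) && (best == nil || less(*best, c)) {
				best = &available[i]
			}
		}
		if best == nil {
			return nil, errors.New("no available version satisfies " + req.String())
		}
		if !seen[*best] {
			seen[*best] = true
			selected = append(selected, *best)
		}
	}
	return selected, nil
}

func main() {
	available := []Version{{1, 0, 0}, {1, 1, 0}, {1, 2, 1}, {2, 0, 0}}

	// foo = "1.0" and foo = "1.1" unify on a single version.
	shared, _ := resolve([]Version{{1, 0, 0}, {1, 1, 0}}, available)
	fmt.Println(shared)

	// foo = "1.0" and foo = "2.0" are incompatible, so two copies are kept.
	split, _ := resolve([]Version{{1, 0, 0}, {2, 0, 0}}, available)
	fmt.Println(split)
}
```

With the versions assumed here, the first call selects only 1.2.1 even though 2.0.0 is available, while the second call selects both 1.2.1 and 2.0.0.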

Cargo's dependency resolver considers factors beyond SemVer requirements, including package features, dependency kinds, the resolver version, and numerous other rules.

Running `cargo build` resolves the dependencies listed in the manifest file and saves the result in the `Cargo.lock` file.

#### Advantages

• **Compatibility Assurance**: Cargo resolves dependencies according to Semantic Versioning (SemVer) rules, which promotes compatibility and reduces potential conflicts between packages.

• **Integration with Rust Ecosystem**: Cargo is tightly integrated with the Rust ecosystem, which makes dependency management for Rust projects seamless. Its integration with tools such as rustc, the Rust compiler, and rustup, the Rust toolchain installer, improves developer productivity and simplifies the development workflow.

#### Disadvantages

• **Security Risks in Package Ecosystem**: The use of yanked versions and of the `unsafe` keyword in real-world Rust libraries and applications contributes to these risks.

• **Dependency Bloat**: In some cases, Cargo's dependency resolution may pull in unnecessary or overly large dependencies, leading to larger binaries or longer build times. This can hurt the performance and efficiency of the final application, especially in resource-constrained environments.

### Go Package Manager

The Go package manager uses Minimal Version Selection (MVS) to determine which module versions go into the final build list. MVS aims to produce builds that closely mirror the dependencies the package author used during development: when a user builds a project, the dependencies chosen are as close as possible to the ones the original author developed against.

MVS assumes that each module specifies only the minimum versions of its dependencies and follows the import compatibility rule, under which newer versions are expected to be compatible with older ones. Dependency requirements therefore contain only minimum versions, never maximum versions or lists of incompatible later versions.
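
The core idea can be shown with a very small Go sketch. It assumes versions are simplified to plain integers and uses a hypothetical helper, `selectMinimal`; it is not code from the Go toolchain.

```go
package main

import "fmt"

// selectMinimal returns the version MVS would pick for one module:
// the highest of the minimum versions that were requested.
// Versions are simplified to integers in this sketch.
func selectMinimal(minimums []int) int {
	selected := 0
	for _, v := range minimums {
		if v > selected {
			selected = v
		}
	}
	return selected
}

func main() {
	// Two modules require the same dependency, at minimum versions 3 and 4.
	// A newer version 5 has been published, but nobody asked for it.
	requested := []int{3, 4}
	latest := 5

	fmt.Println("selected:", selectMinimal(requested))
	fmt.Println("latest available, not selected:", latest)
}
```

The selected version is the oldest one that satisfies every stated minimum, which is why a newly published release does not change an existing build on its own.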

The version selection strategy provides algorithms for four operations on build lists:

1. Construct the current build list:

    The rough build list for package M is the list of all modules reachable in the requirement graph, starting at M and following the arrows. It can be computed with a straightforward recursive traversal of the graph that skips nodes already visited. The rough build list is then converted to the final build list by keeping only the newest version of each module that appears more than once.

2. Upgrade all modules to their latest versions:

    Running `go get -u` upgrades all the modules to their latest versions.
    After the upgrade, all arrows in the dependency graph point to the latest versions of the modules. This yields an upgraded dependency graph, but a change to the graph alone won't cause future builds to use the updated modules. To achieve that, we must change our own requirements so that our build list picks up the upgrades, without affecting the build lists of other packages, since the upgrade should be limited to our package alone.

    At first glance, it seems intuitive to list every updated module in our requirements. However, not all of them are necessary, and we want to add as few modules as possible. To produce a minimal requirement list, a helper algorithm R is introduced.

    **Algorithm R**:

    To compute a minimal requirement list that induces a given build list below the target, the graph is traversed in reverse postorder, so that a module is visited only after all the modules that point to it. A module is added only if it is not implied by the modules already visited.

3. Upgrade one module to a specific newer version:

    Upgrading all modules to their latest versions can be risky, so developers often choose to upgrade only one module.

    Upgrading one module means that the arrow that previously pointed to that module now points to the upgraded version. We can construct a build list from the updated dependency graph and feed it to Algorithm R to obtain a minimal requirement list.

4. Downgrade one module to a specific older version:

    The downgrade algorithm examines each of the target's requirements separately. If a requirement conflicts with the proposed downgrade, meaning that its build list contains a version of a module that is no longer allowed, the algorithm steps back through older versions of that requirement until it finds one that is compatible with the downgrade.

    Downgrades change the build list by removing requirements.
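
The build list construction, a single-module upgrade, and Algorithm R can be sketched together in Go. This is a simplified illustration under stated assumptions, not the Go toolchain's implementation: `ModVer`, `Graph`, `buildList`, and `minimalRequirements` are hypothetical names, versions are plain integers, the requirement graph is assumed to be acyclic, and the modules A to E are made up. The downgrade operation is not covered.

```go
package main

import (
	"fmt"
	"sort"
)

// ModVer names one version of a module; V is a simplified integer version.
type ModVer struct {
	Name string
	V    int
}

// Graph maps a module version to the minimum versions it requires.
type Graph map[ModVer][]ModVer

// buildList collects everything reachable from roots (the rough build list)
// and keeps the newest version seen for each module (the final build list).
func buildList(g Graph, roots ...ModVer) map[string]int {
	selected := map[string]int{}
	visited := map[ModVer]bool{}
	var visit func(ModVer)
	visit = func(m ModVer) {
		if visited[m] {
			return // skip nodes that were already visited
		}
		visited[m] = true
		if v, ok := selected[m.Name]; !ok || m.V > v {
			selected[m.Name] = m.V
		}
		for _, dep := range g[m] {
			visit(dep)
		}
	}
	for _, r := range roots {
		visit(r)
	}
	return selected
}

// minimalRequirements sketches Algorithm R for the given build list.
func minimalRequirements(g Graph, target ModVer, build map[string]int) []ModVer {
	names := make([]string, 0, len(build))
	for name := range build {
		if name != target.Name {
			names = append(names, name)
		}
	}
	sort.Strings(names) // deterministic traversal order

	// Postorder over the build list, following requirements at selected versions.
	var post []ModVer
	visited := map[string]bool{target.Name: true}
	var visit func(string)
	visit = func(name string) {
		if visited[name] {
			return
		}
		visited[name] = true
		for _, dep := range g[ModVer{name, build[name]}] {
			visit(dep.Name)
		}
		post = append(post, ModVer{name, build[name]})
	}
	for _, name := range names {
		visit(name)
	}

	// Reverse postorder: keep a module only if the requirements kept so far
	// do not already imply it at this version or newer.
	implied := map[string]int{}
	var reqs []ModVer
	for i := len(post) - 1; i >= 0; i-- {
		m := post[i]
		if v, ok := implied[m.Name]; ok && v >= m.V {
			continue
		}
		reqs = append(reqs, m)
		implied = buildList(g, reqs...)
	}
	sort.Slice(reqs, func(i, j int) bool { return reqs[i].Name < reqs[j].Name })
	return reqs
}

func main() {
	g := Graph{
		ModVer{"A", 1}: {{"B", 2}, {"C", 2}},
		ModVer{"B", 2}: {{"D", 3}},
		ModVer{"C", 2}: {{"D", 4}},
		ModVer{"D", 3}: {{"E", 1}},
		ModVer{"D", 4}: {{"E", 2}},
	}
	target := ModVer{"A", 1}
	fmt.Println(buildList(g, target)) // operation 1: current build list
	// Operation 3: upgrade E to version 3 by adding an arrow from the target.
	upgraded := buildList(g, target, ModVer{"E", 3})
	fmt.Println(minimalRequirements(g, target, upgraded))
}
```

In this made-up graph the current build list selects D at version 4 and E at version 2. After E is upgraded to version 3, the sketch reports B, C, and E as the minimal requirements: D is omitted because C already implies it, while E must be listed because no kept requirement implies version 3.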