# Intel RDT (Resource Director Technology)

## Background

Intel® RDT provides capabilities for cache and memory allocation and
monitoring. On Linux systems the functionality is exposed to user space via
the [resctrl](https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt)
filesystem. Cache and memory allocation in RDT is handled by using resource
control groups or classes of service (CLOSes). Resource allocation is specified
on the group level and each task (process/thread) is assigned to one group. In
the context of goresctrl the term 'RDT class' is used instead of 'resource
control group' or 'CLOS'.

Goresctrl supports all available RDT technologies, i.e. L2 and L3 Cache
Allocation (CAT) with Code and Data Prioritization (CDP) and Memory Bandwidth
Allocation (MBA) plus Cache Monitoring (CMT) and Memory Bandwidth Monitoring
(MBM).

## API

The API is described in
[pkg.go.dev](https://pkg.go.dev/github.com/intel/goresctrl/pkg/rdt).

# Configuration

## RDT Classes

Goresctrl provides a hierarchical approach for managing RDT resources. The RDT
configuration is a two-level hierarchy consisting of partitions and classes: a
set of partitions, each having a set of classes.

### Partitions

A partition consists of available resources and classes that share the
resources. Resources include portions of caches (L2 and L3) and memory
bandwidth (MB). Cache partitioning is exclusive: cache portions of two
partitions are not allowed to overlap. However, by design of the underlying
technology, MB allocations are not exclusive. Thus, it is possible to assign
all partitions 100% of memory bandwidth, for example.

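The exclusivity rule for caches can be illustrated with a few lines of Python.
This is a minimal sketch and not goresctrl's implementation: the function name
`find_overlaps` and the example bitmasks are hypothetical, and each partition is
represented only by the cache bitmask it owns on a single cache id.

```python
from itertools import combinations


def find_overlaps(partition_masks):
    """Return pairs of partition names whose cache bitmasks share a bit.

    partition_masks maps a (hypothetical) partition name to the integer
    bitmask of the cache portion it owns on one cache id.
    """
    overlaps = []
    for (name_a, mask_a), (name_b, mask_b) in combinations(
        sorted(partition_masks.items()), 2
    ):
        if mask_a & mask_b:
            overlaps.append((name_a, name_b))
    return overlaps


# Disjoint masks: a valid cache partitioning.
ok = find_overlaps({"exclusive": 0xFFF00, "shared": 0x000FF})
# Bit 8 is claimed by both partitions: an invalid cache partitioning.
bad = find_overlaps({"exclusive": 0xFFF00, "shared": 0x001FF})
```

No comparable check is needed for memory bandwidth, because MB allocations of
different partitions are allowed to add up to more than 100%.
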
### Classes

Classes represent the actual RDT classes that processes are assigned to. In
contrast to partitions, cache allocations of classes under a specific
partition may overlap (and they usually do).

Requirements for class specifications:

- Names of classes must be unique across all partitions.
- Total number of classes (CLOSes) supported by the underlying hardware must
  not be exceeded.
  - **NOTE:** resctrl root and possible groups managed outside goresctrl are also
    counted against this limit.
- The reserved name `DEFAULT` or an empty string refers to the resctrl root.

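The requirements above can be sketched as a small validation routine. This is
an illustration only, not goresctrl code: `validate_classes`, its parameters
and its error messages are hypothetical, and the number of CLOSes supported by
the hardware is passed in as a plain integer.

```python
def validate_classes(partitions, max_closids, external_groups=0):
    """Check the class requirements for an already parsed configuration.

    partitions: dict mapping partition name -> list of class names.
    max_closids: number of CLOSes supported by the hardware (assumed input).
    external_groups: resctrl groups managed outside this configuration.
    Returns a list of error strings; the list is empty if all checks pass.
    """
    errors = []
    seen = {}
    for partition, classes in partitions.items():
        for name in classes:
            # DEFAULT and the empty string both refer to the resctrl root.
            key = "DEFAULT" if name in ("", "DEFAULT") else name
            if key in seen:
                errors.append(
                    f"class {key!r} defined in both {seen[key]!r} and {partition!r}"
                )
            else:
                seen[key] = partition
    # The resctrl root occupies one CLOS whether or not it is configured.
    non_root = len([key for key in seen if key != "DEFAULT"])
    needed = non_root + 1 + external_groups
    if needed > max_closids:
        errors.append(f"{needed} classes needed but only {max_closids} available")
    return errors
```

For example, three named classes plus the root fit exactly into hardware with
four CLOSes, but no longer fit if one more group is managed outside goresctrl.
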
## Configuration format

```yaml
# Common options
options:
  l2:
    # Set to false if L2 CAT must be available (Default is true).
    optional: [true|false]
  l3:
    # Set to false if L3 CAT must be available (Default is true).
    optional: [true|false]
  mb:
    # Set to false if MBA must be available (Default is true).
    optional: [true|false]
partitions:
  <partition-name>:
    # L2 CAT configuration of the partition
    l2Allocation:
      <cache-ids>:
        # L2 allocation spec used when CDP is not enabled, or, if CDP is
        # enabled but separate code and data specs are not specified
        unified: <cat-allocation-spec>
        # L2 allocation spec for the code path when CDP is enabled (optional)
        code: <cat-allocation-spec>
        # L2 allocation spec for the data path when CDP is enabled (optional)
        data: <cat-allocation-spec>
    # L3 CAT configuration of the partition
    l3Allocation:
      <cache-ids>:
        # L3 allocation spec used when CDP is not enabled, or, if CDP is
        # enabled but separate code and data specs are not specified
        unified: <cat-allocation-spec>
        # L3 allocation spec for the code path when CDP is enabled (optional)
        code: <cat-allocation-spec>
        # L3 allocation spec for the data path when CDP is enabled (optional)
        data: <cat-allocation-spec>
    # MBA configuration of the partition
    mbAllocation:
      # MB allocation spec
      <cache-ids>: <mb-allocation-spec>
    classes:
      <class-name>:
        l2Allocation:
          <cache-ids>:
            # L2 allocation spec used when CDP is not enabled, or, if CDP is
            # enabled but separate code and data specs are not specified
            unified: <cat-allocation-spec>
            # L2 allocation spec for the code path when CDP is enabled (optional)
            code: <cat-allocation-spec>
            # L2 allocation spec for the data path when CDP is enabled (optional)
            data: <cat-allocation-spec>
        l3Allocation:
          <cache-ids>:
            # L3 allocation spec used when CDP is not enabled, or, if CDP is
            # enabled but separate code and data specs are not specified
            unified: <cat-allocation-spec>
            # L3 allocation spec for the code path when CDP is enabled (optional)
            code: <cat-allocation-spec>
            # L3 allocation spec for the data path when CDP is enabled (optional)
            data: <cat-allocation-spec>
        mbAllocation:
          # MB allocation spec of the class
          <cache-ids>: <mb-allocation-spec>

        # Settings for the Kubernetes helper functions. Have no effect on the resctrl
        # configuration and control interface.
        kubernetes:
          # Set to true to deny assigning to this class via container annotation
          denyContainerAnnotation: [true|false]
          # Set to true to deny assigning to this class via pod annotation
          denyPodAnnotation: [true|false]
```

| Field | Format | Example | Description |
| ----- | ------ | ------- | ----------- |
| `<partition-name>` | string | `exclusive` | Name of a higher level RDT partition. |
| `<class-name>` | string | `guaranteed` | Name of an RDT class, mapping to a directory in the resctrl fs. Reserved name `DEFAULT` or an empty string can be used to refer to the root class. |
| `<cache-ids>` | cpuset (string) | `0,2,4,8-11` | Set of cache ids. The special value `all` denotes a default used for "all the rest" of the cache ids. |
| `<cat-allocation-spec>` | percentage (string) | `"60%"` | Cache allocation spec, may be specified as relative (percentage) or absolute (bitmask). An absolute bitmask must be contiguous. |
| | hex bitmask (string) | `"0xf0"` | |
| | bit numbers (string) | `"0-3"` | |
| `<mb-allocation-spec>` | list of strings | `[50%, 1000MBps]` | Memory bandwidth allocation spec, separate values for percentage and MBps based allocation. The *MBps* value is in effect when resctrl is mounted with `-o mba_MBps`. |

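To make the string formats in the table concrete, the following sketch parses
`<cache-ids>` and `<cat-allocation-spec>` values. It is a hedged illustration
rather than goresctrl's parser: the function names are hypothetical, the
special cache id `all` is assumed to be handled by the caller, and the accepted
percentage range is an assumption.

```python
def parse_cache_ids(spec):
    """Parse a cpuset-style string such as "0,2,4,8-11" into a sorted list."""
    ids = set()
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        ids.update(range(int(first), int(last or first) + 1))
    return sorted(ids)


def is_contiguous(mask):
    """Return True if the set bits of mask form one unbroken run."""
    if mask <= 0:
        return False
    shifted = mask // (mask & -mask)  # drop the trailing zero bits
    return shifted & (shifted + 1) == 0


def parse_cat_spec(spec):
    """Classify a cache allocation spec.

    Returns ("percent", n) for a relative spec and ("mask", bitmask) for an
    absolute one. Raises ValueError for a non-contiguous absolute bitmask.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        if not 0 <= percent <= 100:  # assumed valid range
            raise ValueError(f"percentage out of range: {spec}")
        return ("percent", percent)
    if spec.lower().startswith("0x"):
        mask = int(spec, 16)
    else:
        # Bit numbers use the same list/range syntax as cache ids.
        mask = sum(1 << bit for bit in parse_cache_ids(spec))
    if not is_contiguous(mask):
        raise ValueError(f"bitmask is not contiguous: {spec}")
    return ("mask", mask)
```

With these helpers, `"0xf0"` and `"4-7"` describe the same absolute allocation,
while a spec such as `"0x5"` is rejected because its bits are not adjacent.
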
## Short forms

The configuration accepts short forms in order to allow easier and more readable
configuration of the common and simple use cases.

1. Separate unified/code/data specs can be omitted when no separate CDP config
   is desired, i.e.

      ```
          <cache-ids>: <cat-allocation-spec>
      ```

   is equal to

      ```
          <cache-ids>:
            unified: <cat-allocation-spec>
      ```

1. `<cache-ids>` may be omitted if no cache id specific configuration (and no
   CDP config for CAT) is desired, i.e.

      ```
        l3Allocation: "60%"
        mbAllocation: ["50%"]
      ```

   is equal to

      ```
        l3Allocation:
          all:
            unified: "60%"
        mbAllocation:
          all: ["50%"]
      ```

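The two short forms amount to a simple normalization step. The sketch below
shows one way to expand them on data that has already been loaded from YAML
into Python dicts, lists and strings; `expand_cat` and `expand_mb` are
hypothetical helper names and do not mirror goresctrl's internals.

```python
def expand_cat(value):
    """Expand a cache allocation value to {cache-ids: {"unified": spec, ...}}."""
    if isinstance(value, str):
        # Second short form: no cache ids given, so the spec applies to "all".
        value = {"all": value}
    expanded = {}
    for cache_ids, spec in value.items():
        if isinstance(spec, str):
            # First short form: a plain spec stands for the unified spec.
            expanded[str(cache_ids)] = {"unified": spec}
        else:
            expanded[str(cache_ids)] = dict(spec)
    return expanded


def expand_mb(value):
    """Expand a memory bandwidth value to {cache-ids: [specs]}."""
    if isinstance(value, list):
        return {"all": list(value)}
    return {str(cache_ids): list(specs) for cache_ids, specs in value.items()}
```

Calling `expand_cat("60%")` yields `{"all": {"unified": "60%"}}`, matching the
long form shown in the second item above.
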
## Examples

Below is a config snippet that allocates approximately 60% of the L3 cache
lines exclusively to the `guaranteed` class. The remaining 40% of L3 is shared
by `burstable` and `besteffort`, with `besteffort` getting only 50% of it. The
`guaranteed` class gets full memory bandwidth whereas the other classes are
throttled to 50%.

```yaml
options:
  l2:
    optional: true
  l3:
    optional: true
  mb:
    optional: true
partitions:
  exclusive:
    # Allocate 80% of L2 (on all cache IDs) to the "exclusive" partition
    l2Allocation: "80%"
    # Allocate 60% of L3 (on all cache IDs) to the "exclusive" partition
    l3Allocation: "60%"
    mbAllocation: ["100%"]
    classes:
      guaranteed:
        # Allocate all of the partition's cache lines and memory bandwidth to "guaranteed"
        l2Allocation: "100%"
        l3Allocation: "100%"
        # The class will get 100% by default
        #mbAllocation: ["100%"]
  shared:
    # Allocate 20% of L2 and 40% of L3 (on all cache IDs) to the "shared" partition
    # These will NOT overlap with the cache lines allocated for "exclusive" partition
    l2Allocation: "20%"
    l3Allocation: "40%"
    mbAllocation: ["50%"]
    classes:
      burstable:
        # Allow "burstable" to use all cache lines of the "shared" partition
        l2Allocation: "100%"
        l3Allocation: "100%"
        # The class will get 100% by default
        #mbAllocation: ["100%"]
      besteffort:
        # Allow "besteffort" to use all L2 but only half of the L3 cache
        # lines of the "shared" partition.
        # These will overlap with those used by "burstable"
        l2Allocation: "100%"
        l3Allocation: "50%"
        # The class will get 100% by default
        #mbAllocation: ["100%"]
      DEFAULT:
        # Also configure the resctrl root that all processes in the system are
        # placed in by default
        l2Allocation: "50%"
        l3Allocation: "30%"
        # The class will get 100% by default
        #mbAllocation: ["100%"]
```

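How percentages could translate into cache bitmasks is sketched below. This is
an illustration under stated assumptions and not the algorithm goresctrl uses:
the cache is assumed to have 20 allocatable bits, partitions are placed
consecutively starting from bit 0, shares are rounded down, and both function
names are hypothetical. It shows why partitions end up disjoint while classes
inside one partition overlap, and why the result is only approximately 60%/40%
on hardware where the bit count does not divide evenly.

```python
def carve(total_bits, percentages):
    """Split total_bits into consecutive, non-overlapping bit ranges.

    percentages maps a partition name to its share in percent. Returns a
    dict of name -> (first_bit, number_of_bits). Rounding down and the
    placement order are assumptions of this sketch.
    """
    ranges = {}
    next_bit = 0
    for name, percent in percentages.items():
        nbits = max(1, total_bits * percent // 100)
        ranges[name] = (next_bit, nbits)
        next_bit += nbits
    if next_bit > total_bits:
        raise ValueError("partitions do not fit into the cache")
    return ranges


def class_mask(partition_range, percent):
    """Bitmask of a class that gets percent of its partition's bits."""
    first_bit, nbits = partition_range
    class_bits = max(1, nbits * percent // 100)
    return ((1 << class_bits) - 1) << first_bit


# L3 values from the example above on a hypothetical 20-bit cache.
parts = carve(20, {"exclusive": 60, "shared": 40})
guaranteed = class_mask(parts["exclusive"], 100)  # bits 0-11
burstable = class_mask(parts["shared"], 100)      # bits 12-19
besteffort = class_mask(parts["shared"], 50)      # bits 12-15, inside burstable
```
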
The configuration also supports far more fine-grained control, e.g. per
cache-ID configuration (i.e. different cache ids, or sockets, having different
allocation) and Code and Data Prioritization (CDP) allowing different cache
allocation for code and data paths.

```yaml
...
    partitions:
      exclusive:
        l3Allocation: "60%"
        mbAllocation: ["100%"]
        classes:
          # Automatically gets 100% of what was allocated for the partition
          guaranteed:
      shared:
        l3Allocation:
          # 'all' denotes the default and must be specified
          all: "40%"
          # Specific cache allocation for cache-ids 2 and 3
          2-3: "20%"
        mbAllocation: ["100%"]
        classes:
          burstable:
            l3Allocation:
              all:
                unified: "100%"
                code: "100%"
                data: "80%"
            mbAllocation:
              all: ["80%"]
              2-3: ["50%"]
...
```

In addition, if the hardware details are known, raw bitmasks or bit numbers
(`0x1f` or `0-4`) can be used instead of percentages in order to be able to
configure cache allocations exactly as required. The bits in this case
correspond to those in the bitmasks under `/sys/fs/resctrl/`. You can also mix
relative (percentage) and absolute (bitmask) allocations. For cases where the
resctrl filesystem is mounted with `-o mba_MBps`, memory bandwidth must be
specified in MBps.

```yaml
...
    partitions:
      exclusive:
        # Specify bitmask in bit numbers
        l3Allocation: "8-19"
        # MBps value takes effect when resctrl mount option mba_MBps is used
        mbAllocation: ["100%", "100000MBps"]
        classes:
          # Automatically gets 100% of what was allocated for the partition
          guaranteed:
      shared:
        # Explicit bitmask
        l3Allocation: "0xff"
        mbAllocation: ["50%", "2000MBps"]
        classes:
          # burstable gets 100% of what was allocated for the partition
          burstable:
          besteffort:
            l3Allocation: "50%"
            # besteffort gets 50% of the 50% (i.e. 25% of total) or 1000MBps
            mbAllocation: ["50%", "1000MBps"]
```

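The comment on `besteffort` above can be worked through in code. The sketch
below selects the list entry that matches the mount mode and combines the
partition and class values; it is not goresctrl's implementation. The function
name `effective_mb` is hypothetical, and using the class's MBps value as given
is an assumption based on that comment.

```python
def effective_mb(partition_spec, class_spec, mbps_mode):
    """Resolve the memory bandwidth value of a class.

    partition_spec, class_spec: lists such as ["50%", "2000MBps"].
    mbps_mode: True if resctrl is mounted with -o mba_MBps.
    Returns MBps in MBps mode, otherwise a percentage of total bandwidth.
    """

    def pick(spec, suffix):
        for item in spec:
            if item.endswith(suffix):
                return int(item[: -len(suffix)])
        raise ValueError(f"no {suffix} value in {spec}")

    if mbps_mode:
        # Assumption: the class's MBps value is used as given.
        return pick(class_spec, "MBps")
    # A class percentage is relative to the percentage of its partition.
    return pick(partition_spec, "%") * pick(class_spec, "%") // 100
```

For the `shared` partition and the `besteffort` class this gives 25 (percent)
in the default mode and 1000 (MBps) when `mba_MBps` is in use.
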
## Dynamic Configuration

RDT supports dynamic configuration, i.e. the parameters of existing classes may
be changed on-the-fly.