github.com/1aal/kubeblocks@v0.0.0-20231107070852-e1c03e598921/docs/developer_docs/integration/multi-component.md

---
title: Multi-component configuration
description: How to configure multi-component in KubeBlocks with NebulaGraph as an example
keywords: [multi-component,add-on]
sidebar_position: 7
sidebar_label: Multi-component configuration
---

# Multi-component configuration

So far, you've learned how to define, back up, and configure single-component clusters (e.g., Oracle-MySQL).

This tutorial uses NebulaGraph as an example to show how to integrate a multi-component cluster and how to address several common issues in multi-component configurations. You can find more details in [this repository](https://github.com/apecloud/kubeblocks/tree/main/deploy/nebula).

## Before you start

- Finish [Tutorial 1](./how-to-add-an-add-on.md).
- Be familiar with basic KubeBlocks concepts, such as ClusterDefinition, Cluster, ComponentRef, and Component.

## NebulaGraph Architecture

First, take a look at the overall architecture of NebulaGraph.

NebulaGraph separates storage from computing and consists of three services: the Graph Service, the Meta Service, and the Storage Service. The following figure shows the architecture of a typical NebulaGraph cluster.

![NebulaGraph Architecture (source: https://github.com/vesoft-inc/nebula)](./../../img/nebula-aichitecture.png)

- Metad: A component based on the Raft protocol that is responsible for data management tasks such as schema operations, cluster management, and user permission management.
- Graphd: The compute component, responsible for handling query requests, including query parsing, validation, and generating and executing query plans.
- Storaged: The distributed storage component based on Multi Group Raft, responsible for storing data.

If the client is counted, there is a fourth component:

- Client: A stateless component used to connect to Graphd and send graph queries.

## Configure cluster topology

You now know the four components of NebulaGraph and how each one is started and configured.

Similar to a single-component cluster, you can quickly assemble the definition for a multi-component cluster.

```yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: ClusterDefinition
metadata:
  name: nebula
spec:
  componentDefs:
    - name: nebula-console    # client
      workloadType: Stateless
      characterType: nebula
      podSpec: ...
    - name: nebula-graphd     # graphd
      workloadType: Stateful
      podSpec: ...
    - name: nebula-metad      # metad
      workloadType: Stateful
      podSpec: ...
    - name: nebula-storaged   # storaged
      workloadType: Stateful
      podSpec: ...
---
apiVersion: apps.kubeblocks.io/v1alpha1
kind: ClusterVersion
metadata:
  name: nebula-3.5.0
spec:
  clusterDefinitionRef: nebula   # clusterdef name
  componentVersions:
    - componentDefRef: nebula-console  # Specify image for client
      versionsContext:
        containers:
        - name: nebula-console
          image: ...
    - componentDefRef: nebula-graphd  # Specify image for graphd
      versionsContext:
        containers:
        - name: nebula-graphd
          image: ...
    - componentDefRef: nebula-metad   # Specify image for metad
      versionsContext:
        containers:
        - name: nebula-metad
          image: ...
    - componentDefRef: nebula-storaged  # Specify image for storaged
      versionsContext:
        containers:
        - name: nebula-storaged
          image: ...
```

The YAML above outlines the ClusterDefinition and ClusterVersion for NebulaGraph. Corresponding to Figure 1, it specifies four components (including the client) and their version information.
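
To see how these definitions are consumed, here is a minimal sketch of a Cluster that instantiates all four components. The cluster name `mycluster`, the replica counts, and the `terminationPolicy` value are illustrative assumptions, and storage settings such as `volumeClaimTemplates` are omitted, so adjust the manifest before applying it.

```yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: mycluster                    # hypothetical cluster name
spec:
  clusterDefinitionRef: nebula       # ClusterDefinition name from the outline above
  clusterVersionRef: nebula-3.5.0    # ClusterVersion name from the outline above
  terminationPolicy: Delete          # assumption; pick the policy you need
  componentSpecs:
    - name: nebula-console           # component name within this cluster
      componentDefRef: nebula-console
      replicas: 1
    - name: nebula-graphd
      componentDefRef: nebula-graphd
      replicas: 1
    - name: nebula-metad
      componentDefRef: nebula-metad
      replicas: 3                    # assumption: an odd count suits a Raft group
    - name: nebula-storaged
      componentDefRef: nebula-storaged
      replicas: 3                    # assumption
```

In this sketch each component name matches its definition name for simplicity; the two do not have to be identical.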

If each component could be started independently, the information provided in Figure 2 would be sufficient.

However, components in a multi-component cluster often reference one another. So how do you specify these references?

## Configure inter-component references

Components may refer to each other, and Figure 3 shows the inter-component references in a NebulaGraph cluster. For example:

1. Nebula-Console needs to know the port number and service name of Nebula-Graphd.
2. Nebula-Graphd needs to know the DNS name of each Pod of Nebula-Metad.
3. Nebula-Storaged also needs to know the DNS name of each Pod of Nebula-Metad.

![Nebula Inter-Component References](./../../img/nebula-inter-component-ref.png)

This gives three common types of inter-component references:

1. **Service Reference**
   e.g., Nebula-Console needs to obtain the service name of Nebula-Graphd.
2. **HostName Reference**
   e.g., Nebula-Graphd needs to configure the DNS names of all Pods of Nebula-Metad. This reference typically points to a stateful component.
3. **Field Reference**
   e.g., Nebula-Console needs to obtain a service port of Nebula-Graphd.

To ensure that the cluster starts normally, this information needs to be injected into the Pod through environment variables (whether loaded from a ConfigMap or defined as Pod env).
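
From the container's point of view, an injected reference is just an ordinary environment variable. The fragment below is a hypothetical sketch of a client container reading two such variables. The variable names `GRAPHD_SVC_NAME` and `GRAPHD_SVC_PORT`, the shell entrypoint, and the elided image are assumptions for illustration, not the actual NebulaGraph add-on definition.

```yaml
# Hypothetical podSpec fragment for a client component.
containers:
  - name: nebula-console
    image: ...                       # elided, as in the outline above
    command: ["/bin/sh", "-c"]       # assumes the image ships a POSIX shell
    args:
      - |
        # Both variables are assumed to be injected before the container starts.
        echo "graphd endpoint: ${GRAPHD_SVC_NAME}:${GRAPHD_SVC_PORT}"
        # Keep the Pod alive so that you can exec into it and run queries.
        while true; do sleep 3600; done
```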

In KubeBlocks, you can use the `ComponentDefRef` API for this purpose. It introduces the following fields:

- `componentDefName` specifies the name of the component definition being referenced.
- `componentRefEnv` defines a set of environment variables to be injected.
  - `name` defines the name of the injected environment variable.
  - `valueFrom` defines the source of the variable value.

Next, you will learn how `ComponentDefRef` deals with the three types of references mentioned above.
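
Before going through the cases one by one, it may help to see the general shape once. Because `componentRefEnv` is a list, a single `componentDefRef` entry can carry several variables taken from the same referenced component. The sketch below combines the two `nebula-console` references discussed in the following sections; treat it as an illustration of the structure rather than the exact add-on definition.

```yaml
# Sketch: one referencing component, one referenced component, two injected variables.
- name: nebula-console
  workloadType: Stateless
  characterType: nebula
  componentDefRef:
    - componentDefName: nebula-graphd      # referenced component definition
      componentRefEnv:
        - name: GRAPHD_SVC_NAME            # injected variable name
          valueFrom:
            type: ServiceRef               # value source: service name
        - name: GRAPHD_SVC_PORT
          valueFrom:
            type: FieldRef                 # value source: a field selected by JSONPath
            fieldPath: $.componentDef.service.ports[?(@.name == "thrift")].port
```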

### Service Reference

Case 1: Nebula-Console needs to obtain the service name of Nebula-Graphd.

When defining `nebula-console`, add the following definitions (see `componentDefRef`):

```yaml
    - name: nebula-console
      workloadType: Stateless
      characterType: nebula
      componentDefRef:
        - componentDefName: nebula-graphd
          componentRefEnv:
            - name: GRAPHD_SVC_NAME
              valueFrom:
                type: ServiceRef
```

- The component being referenced is `nebula-graphd`.
- The name of the injected environment variable is `GRAPHD_SVC_NAME`.
- The value type of the variable is `ServiceRef`, indicating that the value comes from the service name of the referenced component.

:::note

In KubeBlocks, if you've defined the `service` for a component, KubeBlocks creates a service named `{clusterName}-{componentName}` for that component when you create a cluster.

:::
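
As a worked example of this naming rule, assume a cluster named `mycluster` whose graphd component is named `nebula-graphd` (both names are hypothetical). The `ServiceRef` above would then be expected to resolve to roughly the following value, shown here as a Pod env entry for readability:

```yaml
# Illustrative result only, assuming clusterName=mycluster and componentName=nebula-graphd.
env:
  - name: GRAPHD_SVC_NAME
    value: mycluster-nebula-graphd   # {clusterName}-{componentName}
```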

### HostName Reference

Case 2: Nebula-Graphd needs to configure the DNS names of all Pods of Nebula-Metad.

```yaml
    - name: nebula-graphd
      workloadType: Stateful
      componentDefRef:
        - componentDefName: nebula-metad
          componentRefEnv:
            - name: NEBULA_METAD_SVC
              valueFrom:
                type: HeadlessServiceRef
                format: $(POD_FQDN):9559    # Optional, specify value format
```

- The component being referenced is `nebula-metad`.
- The name of the injected environment variable is `NEBULA_METAD_SVC`.
- The value type of the variable is `HeadlessServiceRef`.
  - It indicates that the value comes from the FQDNs of all Pods of the referenced component. Multiple values are joined with `,` by default.
  - If the default FQDN format does not meet your needs, customize it through `format` (the last line of the example above).

:::note

KubeBlocks provides three built-in variables as placeholders, which are replaced with specific values when the cluster is created:

- `$(POD_ORDINAL)`, the ordinal number of the Pod.
- `$(POD_NAME)`, the name of the Pod, formatted as `{clusterName}-{componentName}-{podOrdinal}`.
- `$(POD_FQDN)`, the Fully Qualified Domain Name (FQDN) of the Pod.

In KubeBlocks, each stateful component has a Headless Service named `headlessServiceName = {clusterName}-{componentName}-headless` by default.

Therefore, the format of the Pod FQDN of each stateful component is:
`POD_FQDN = {clusterName}-{componentName}-{podOrdinal}.{headlessServiceName}.{namespace}.svc`.

:::
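
To make the naming rules concrete, assume a cluster named `mycluster` in the `default` namespace whose metad component is named `nebula-metad` and runs three replicas (all of these values are hypothetical). With the format `$(POD_FQDN):9559` and the default `,` separator, the injected value would be expected to look roughly like this:

```yaml
# Illustrative result only; derived from the FQDN format described in the note above.
env:
  - name: NEBULA_METAD_SVC
    value: "mycluster-nebula-metad-0.mycluster-nebula-metad-headless.default.svc:9559,mycluster-nebula-metad-1.mycluster-nebula-metad-headless.default.svc:9559,mycluster-nebula-metad-2.mycluster-nebula-metad-headless.default.svc:9559"
```

Each entry is one Pod's FQDN with the port suffix from `format` appended, so the list grows with the number of metad replicas.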

### Field Reference

Case 3: Nebula-Console needs to obtain a service port of Nebula-Graphd.

When defining `nebula-console`, add the following configurations (see `componentDefRef`):

```yaml
    - name: nebula-console
      workloadType: Stateless
      characterType: nebula
      componentDefRef:
        - componentDefName: nebula-graphd
          componentRefEnv:
            - name: GRAPHD_SVC_PORT
              valueFrom:
                type: FieldRef
                fieldPath: $.componentDef.service.ports[?(@.name == "thrift")].port
```

- The component being referenced is `nebula-graphd`.
- The name of the injected environment variable is `GRAPHD_SVC_PORT`.
- The value type of the variable is `FieldRef`, indicating that the value comes from a property of the referenced component, as specified by `fieldPath`.

`fieldPath` uses JSONPath syntax to select the property value.
When parsing JSONPath, KubeBlocks registers two root objects by default:

- **componentDef**, the componentDef object being referenced.
- **components**, all components corresponding to the componentDef in the created cluster.

Therefore, in `fieldPath`, you can use `$.componentDef.service.ports[?(@.name == "thrift")].port` to obtain the number of the port named `thrift` in the service defined by this component.
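
For the JSONPath filter to match, the referenced component definition must declare a service port named `thrift`. The sketch below assumes such a definition. The port number 9669 is NebulaGraph's default graph service port and is used here as an assumption, since the earlier outline elides the service section.

```yaml
# Assumed excerpt of the referenced component definition.
- name: nebula-graphd
  workloadType: Stateful
  service:
    ports:
      - name: thrift        # matched by [?(@.name == "thrift")]
        port: 9669          # assumption: NebulaGraph's default graphd port
```

Under this assumption, `GRAPHD_SVC_PORT` would be expected to resolve to `9669`.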

## Summary

This tutorial uses NebulaGraph as an example to introduce the common types of inter-component references and how to configure them.

In addition to NebulaGraph, engines like GreptimeDB, Pulsar, RisingWave, and StarRocks also adopt the `componentDefRef` API to deal with component references. You can refer to their solutions as well.

For more information about `componentDefRef`, refer to [ComponentDefRef API](https://kubeblocks.io/docs/release-0.6/developer_docs/api-reference/cluster#apps.kubeblocks.io/v1alpha1.ComponentDefRef).

## Appendix

### A1. YAML tips

Nebula-Graphd, Nebula-Metad, and Nebula-Storaged all require the FQDN of each Pod in Nebula-Metad, but you don't need to repeat the configuration for each of them.

Instead, define it once and reuse it with YAML anchors.

```yaml
- name: nebula-graphd
  # ...
  componentDefRef:
    - &metadRef # Define an anchor with `&`
      componentDefName: nebula-metad
      componentRefEnv:
        - name: NEBULA_METAD_SVC
          valueFrom:
            type: HeadlessServiceRef
            format: $(POD_FQDN){{ .Values.clusterDomain }}:9559
            joinWith: ","
- name: nebula-storaged
  componentDefRef:
    - *metadRef # Use the anchor with `*` to avoid duplication
```
   251  - name: nebula-storaged
   252    componentDefRef:
   253      - *metadRef # Use the anchor with `*` to avoid duplication
   254  ```