---
layout: post
title: GETTING STARTED
permalink: /docs/getting-started
redirect_from:
 - /getting_started.md/
 - /docs/getting_started.md/
---

AIStore runs on a single Linux or Mac machine. Bare-metal Kubernetes and [GCP/GKE](https://cloud.google.com/kubernetes-engine) Cloud-based deployments are also supported, and there are numerous other options.

Generally, when deciding how to deploy a system like AIS with so many possibilities to choose from, a good place to start would be answering the following two fundamental questions:

* what's the dataset size, or sizes?
* what hardware will I use?

For datasets, say, below 50TB a single host may suffice and should, therefore, be considered a viable option. On the other hand, the [Cloud deployment](#cloud-deployment) option may sound attractive for its ubiquitous convenience and for _not_ having to think about hardware and sizes - at least, not right away.

> Note as well that **you can always start small**: a single-host deployment, a 3-node cluster in the Cloud or on-premises, etc. AIStore supports many options to inter-connect existing clusters - the capability called *unified global namespace* - or migrate existing datasets (on-demand or via supported storage services). For introductions and further pointers, please refer to the [AIStore Overview](/docs/overview.md).

## Prerequisites

AIStore runs on commodity Linux machines with no special requirements whatsoever. It is expected that within a given cluster, all AIS [targets](/docs/overview.md#key-concepts-and-diagrams) are identical, hardware-wise.
* [Linux](#linux) distribution with `GCC`, `sysstat`, `attr`, and `util-linux` packages (**)
* Linux kernel 5.15+
* [Go 1.21 or later](https://golang.org/dl/)
* Extended attributes (`xattrs` - see next section)
* Optionally, Amazon (AWS), Google Cloud Platform (GCP), and/or Azure Cloud Storage accounts.

> (**) [Mac](#macos) is also supported, albeit in a limited (development-only) way.

See also:

* Section [assorted command lines](#assorted-command-lines), and
* `CROSS_COMPILE` comment below.

### Linux

Depending on your Linux distribution, you may or may not have the `GCC`, `sysstat`, and/or `attr` packages preinstalled. These packages must be installed.

Speaking of distributions, our current default recommendation is Ubuntu Server 20.04 LTS. But Ubuntu 18.04 and CentOS 8.x (or later) will also work, as will numerous others.

For the [local filesystem](/docs/performance.md), we currently recommend xfs. But again, this (default) recommendation shall not be interpreted as a limitation of any kind: other fine choices include zfs, ext4, f2fs, and more.

Since AIS itself provides n-way mirroring and erasure coding, hardware RAID is _not_ recommended. But it can be used, and it will work.

The capability called [extended attributes](https://en.wikipedia.org/wiki/Extended_file_attributes), or `xattrs`, is a long-time POSIX legacy supported by all mainstream filesystems with no exceptions. Unfortunately, `xattrs` may not always be enabled in the Linux kernel configuration - a fact that can be easily found out by running the `setfattr` command.

If disabled, please make sure to enable xattrs in your Linux kernel configuration. To quickly check:

```console
$ touch foo
$ setfattr -n user.bar -v ttt foo
$ getfattr -n user.bar foo
```

### macOS

For developers, there's also the macOS aka Darwin option.
Certain capabilities related to querying the state and status of local hardware resources (memory, CPU, disks) may be missing. In fact, it is easy to review the specifics with a quick check on the sources:

```console
$ find . -name "*darwin*"

./fs/fs_darwin.go
./cmn/cos/err_darwin.go
./sys/proc_darwin.go
./sys/cpu_darwin.go
./sys/mem_darwin.go
./ios/diskstats_darwin.go
./ios/dutils_darwin.go
./ios/fsutils_darwin.go
...
```

Benchmarking and stress-testing is also done on Linux only - another reason to consider Linux (and only Linux) for production deployments.

The rest of this document is structured as follows:

------------------------------------------------

## Table of Contents

- [Local Playground](#local-playground)
  - [From source](#from-source)
  - [Running Local Playground with emulated disks](#running-local-playground-with-emulated-disks)
  - [Running Local Playground remotely](#running-local-playground-remotely)
- [Make](#make)
- [System environment variables](#system-environment-variables)
- [Multiple deployment options](#multiple-deployment-options)
  - [Kubernetes deployments](#kubernetes-deployments)
  - [Minimal all-in-one-docker Deployment](#minimal-all-in-one-docker-deployment)
  - [Local Playground Demo](#local-playground-demo)
  - [Manual deployment](#manual-deployment)
  - [Testing your cluster](#testing-your-cluster)
- [Kubernetes Playground](#kubernetes-playground)
- [Setting Up HTTPS Locally](#setting-up-https-locally)
- [Build, Make and Development Tools](#build-make-and-development-tools)
- [Containerized Deployments: Host Resource Sharing](#containerized-deployments-host-resource-sharing)
- [Assorted command lines](#assorted-command-lines)

## Local Playground

If you're looking for speedy evaluation, want to experiment with [supported features](https://github.com/NVIDIA/aistore/tree/main?tab=readme-ov-file#features), get
a feel of initial usage, or do some development - for any and all of these reasons, running AIS from its GitHub source might be a good option.

Hence, we introduced and keep maintaining the **Local Playground** - one of the several supported [deployment options](#multiple-deployment-options).

> Some of the most popular deployment options are also **summarized** in this [table](https://github.com/NVIDIA/aistore/tree/main/deploy#readme). The list includes Local Playground, and its complementary guide [here](https://github.com/NVIDIA/aistore/blob/main/deploy/dev/local/README.md).

> Local Playground is **not intended** for production and is not meant to provide optimal performance.

To run AIStore from source, one would typically need to have **Go**: compiler, linker, tools, and required packages. However:

> The `CROSS_COMPILE` option (see below) can be used to build AIStore without having to install [Go](https://golang.org/dl/) and its toolchain (requires Docker).

To install Go(lang) on Linux:

* Download the latest `go1.21.<x>.linux-amd64.tar.gz` from [Go downloads](https://golang.org/dl/)
* Follow the [installation instructions](https://go.dev/doc/install)
* **Or** simply run `tar -C /usr/local -xzf go1.21.<x>.linux-amd64.tar.gz` and add `/usr/local/go/bin` to `$PATH`

Next, if not done yet, export the [`GOPATH`](https://go.dev/doc/gopath_code#GOPATH) environment variable.

Here's an additional [5-minute introduction](/deploy/dev/local/README.md) that talks more in-depth about setting up the Go environment variables.

Once done, we can run AIS as follows:

## Step 1: Clone the AIStore repository and preload dependencies

We want to clone the repository into the following path so we can access
some of the associated binaries through the environment variables we set up earlier.
```console
$ cd $GOPATH/src/github.com/NVIDIA
$ git clone https://github.com/NVIDIA/aistore.git
$ cd aistore
# Optionally, run `make mod-tidy` to preload dependencies
```

## Step 2: Deploy cluster and verify its running status using the `ais` CLI

> **NOTE**: For a local deployment, we do not need production filesystem paths. For more information, read about [configuration basics](/docs/configuration.md#rest-of-this-document-is-structured-as-follows). If you need a physical disk or virtual block device, you must add them to the fspaths config. See [running local playground with emulated disks](#running-local-playground-with-emulated-disks) for more information.

Many useful commands are provided via the top [Makefile](https://github.com/NVIDIA/aistore/blob/main/Makefile) (for details, see the [Make](#make) section below).

In particular, we can use `make` to deploy our very first 1-node cluster:

```console
$ make kill clean cli deploy <<< $'1\n1\n1\nn\nn\nn\n0\n'
```

OR (same):

```console
$ make kill clean   # kill previous cluster and clean any binaries
$ make cli          # build the ais CLI
$ make deploy
Enter number of storage targets:
1
Enter number of proxies (gateways):
1
Number of local mountpaths (enter 0 for preconfigured filesystems):
1
Select backend providers:
Amazon S3: (y/n) ?
n
Google Cloud Storage: (y/n) ?
n
Azure: (y/n) ?
n
Loopback device size, e.g. 10G, 100M (press Enter to skip):
```

You should see a successful deployment in the output, with the AIS proxy listening on port 8080 (unless changed in config).

```
Building aisnode 2ce1aaeb1 [build tags: mono]
done.
Proxy is listening on port: 8080
Primary endpoint: http://localhost:8080
```

Note that [`clean_deploy.sh`](/docs/development.md#clean-deploy) with no arguments also builds the AIStore binaries (such as `aisnode` and the `ais` CLI). You can pass in arguments to configure the same options that the `make deploy` command above uses.

```console
$ ./scripts/clean_deploy.sh --target-cnt 1 --proxy-cnt 1 --mountpath-cnt 1 --deployment local
```

We can verify that the cluster is running using:

```console
$ ais show cluster

# Example output below
PROXY            MEM USED(%)    MEM AVAIL    LOAD AVERAGE    UPTIME    STATUS    VERSION               BUILD TIME
p[FAEp8080][P]   0.13%          29.31GiB     [0.4 0.2 0.2]   -         online    3.23.rc3.2ce1aaeb1    2024-05-15T16:19:05-0700

TARGET           MEM USED(%)    MEM AVAIL    CAP USED(%)    CAP AVAIL    LOAD AVERAGE    REBALANCE    UPTIME    STATUS    VERSION               BUILD TIME
t[hgjt8081]      0.13%          29.31GiB     39%            21.999GiB    [0.4 0.2 0.2]   -            -         online    3.23.rc3.2ce1aaeb1    2024-05-15T16:19:05-0700

Summary:
   Proxies:             1
   Targets:             1 (one disk)
   Capacity:            used 14.61GiB (39%), available 22.00GiB
   Cluster Map:         version 5, UUID PrVDVYkcT, primary p[FAEp8080]
   Deployment:          dev
   Status:              2 online
   Rebalance:           n/a
   Authentication:      disabled
   Version:             3.23.rc3.2ce1aaeb1
   Build:               2024-05-15T16:19:05-0700
```

If you get repeated errors indicating that the cluster is taking a long time to start up, your configuration is likely incorrect.

## Step 3: Run the `aisloader` tool

We can now run the `aisloader` tool to benchmark our new cluster.
```console
$ make aisloader   # build the aisloader tool

$ aisloader -bucket=ais://abc -duration 2m -numworkers=8 -minsize=1K -maxsize=1K -pctput=100 --cleanup=false   # run aisloader for 2 minutes (8 workers, 1KB objects, 100% write, no cleanup)
```

## Step 4: Run iostat (or use any of the multiple [documented](/docs/prometheus.md) ways to monitor AIS performance)

```console
$ iostat -dxm 10 sda sdb
```

### Running Local Playground with emulated disks

Here's a quick walk-through (with more references included below).

* Step 1: patch `deploy/dev/local/aisnode_config.sh` as follows:

```diff
diff --git a/deploy/dev/local/aisnode_config.sh b/deploy/dev/local/aisnode_config.sh
index c5e0e4fae..46085e19c 100755
--- a/deploy/dev/local/aisnode_config.sh
+++ b/deploy/dev/local/aisnode_config.sh
@@ -192,11 +192,12 @@ cat > $AIS_LOCAL_CONF_FILE <<EOL
     "port_intra_data": "${PORT_INTRA_DATA:-10080}"
   },
   "fspaths": {
-    $AIS_FS_PATHS
+    "/tmp/ais/mp1": "",
+    "/tmp/ais/mp2": ""
   },
   "test_fspaths": {
     "root": "${TEST_FSPATH_ROOT:-/tmp/ais$NEXT_TIER/}",
-    "count": ${TEST_FSPATH_COUNT:-0},
+    "count": 0,
     "instance": ${INSTANCE:-0}
   }
 }
```

* Step 2: deploy a single target with two loopback devices (1GB size each):

```console
$ make kill clean cli deploy <<< $'1\n1\n2\ny\ny\nn\n1G\n'

$ mount | grep dev/loop
/dev/loop23 on /tmp/ais/mp1 type ext4 (rw,relatime)
/dev/loop25 on /tmp/ais/mp2 type ext4 (rw,relatime)
```

* Step 3: observe the running cluster; notice the deployment [type](#multiple-deployment-options) and the number of disks:

```console
$ ais show cluster
PROXY            MEM USED(%)    MEM AVAIL    LOAD AVERAGE    UPTIME    STATUS    VERSION           BUILD TIME
p[BOxqibgv][P]   0.14%          27.28GiB     [1.2 1.1 1.1]   -         online    3.22.bf26375e5    2024-02-29T11:11:52-0500

TARGET           MEM USED(%)    MEM AVAIL    CAP USED(%)    CAP AVAIL    LOAD AVERAGE    REBALANCE    UPTIME    STATUS    VERSION
t[IwzSpiIm]      0.14%          27.28GiB     6%             1.770GiB     [1.2 1.1 1.1]   -            -         online    3.22.bf26375e5

Summary:
   Proxies:             1
   Targets:             1 (num disks: 2)
   Cluster Map:         version 4, UUID g7sPH9dTY, primary p[BOxqibgv]
   Deployment:          linux
   Status:              2 online
   Rebalance:           n/a
   Authentication:      disabled
   Version:             3.22.bf26375e5
   Build:               2024-02-29T11:11:52-0500
```

See also:

> [for developers](development.md);
> [cluster and node configuration](configuration.md);
> [supported deployments: summary table and links](https://github.com/NVIDIA/aistore/blob/main/deploy/README.md).

### Running Local Playground remotely

AIStore (the product and solution) is fully based on HTTP(S), utilizing the protocol both externally (to support frontend interfaces and communications with remote backends) and internally, for [intra-cluster streaming](/transport).

Connectivity-wise, what that means is that your local deployment at `localhost:8080` can just as easily run at any **arbitrary HTTP(S)** address.

Here's the quick change you'd make to deploy the Local Playground at (e.g.)
`10.0.0.207`, whereby the main gateway's listening port would still remain the default `8080`:

```diff
diff --git a/deploy/dev/local/aisnode_config.sh b/deploy/dev/local/aisnode_config.sh
index 9198c0de4..be63f50d0 100755
--- a/deploy/dev/local/aisnode_config.sh
+++ b/deploy/dev/local/aisnode_config.sh
@@ -181,7 +181,7 @@ cat > $AIS_LOCAL_CONF_FILE <<EOL
 	"confdir": "${AIS_CONF_DIR:-/etc/ais/}",
 	"log_dir": "${AIS_LOG_DIR:-/tmp/ais$NEXT_TIER/log}",
 	"host_net": {
-		"hostname": "${HOSTNAME_LIST}",
+		"hostname": "10.0.0.207",
 		"hostname_intra_control": "${HOSTNAME_LIST_INTRA_CONTROL}",
 		"hostname_intra_data": "${HOSTNAME_LIST_INTRA_DATA}",
 		"port": "${PORT:-8080}",
diff --git a/deploy/dev/local/deploy.sh b/deploy/dev/local/deploy.sh
index e0b467d82..b18361155 100755
--- a/deploy/dev/local/deploy.sh
+++ b/deploy/dev/local/deploy.sh
@@ -68,7 +68,7 @@ else
 	PORT_INTRA_DATA=${PORT_INTRA_DATA:-13080}
 	NEXT_TIER="_next"
 fi
-AIS_PRIMARY_URL="http://localhost:$PORT"
+AIS_PRIMARY_URL="http://10.0.0.207:$PORT"
 if $AIS_USE_HTTPS; then
 	AIS_PRIMARY_URL="https://localhost:$PORT"
```

## Make

AIS comes with its own build system that we use to build both standalone binaries and container images for a variety of deployment options.
The very first `make` command you may want to execute could as well be:

```console
$ make help
```

This shows all subcommands, environment variables, and numerous usage examples, including:

```console
Examples:
# Deploy cluster locally
$ make deploy

# Stop locally deployed cluster and cleanup all cluster-related data and bucket metadata (but not cluster map)
$ make kill clean

# Stop and then deploy (non-interactively) a cluster consisting of 7 targets (4 mountpaths each) and 2 proxies; build the `aisnode` executable with support for GCP and AWS backends
$ make kill deploy <<< $'7\n2\n4\ny\ny\nn\n0\n'
```

## System environment variables

The variables include `AIS_ENDPOINT`, `AIS_AUTHN_TOKEN_FILE`, and [more](/api/env).

In almost all cases, there's an "AIS_" prefix (hint: `git grep AIS_`).

And in all cases with no exception, the variable takes precedence over the corresponding configuration, if it exists. For instance:

```console
AIS_ENDPOINT=https://10.0.1.138 ais show cluster
```

overrides the default endpoint as per `ais config cli` or (same) `ais config cli --json`.

> Endpoints are equally provided by each and every running AIS gateway (aka "proxy"), and each endpoint can be (equally) used to access the cluster. To find out what's currently configured, run (e.g.):

```console
$ ais config node <NODE> local host_net --json
```

where `NODE` is, effectively, any clustered proxy (that'll show up if you type `ais config node` and press `<TAB-TAB>`).

Other variables, such as [`AIS_PRIMARY_EP`](environment-vars.md#primary) and [`AIS_USE_HTTPS`](environment-vars.md#https), can prove to be useful at deployment time.
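Since any such variable silently takes precedence over configuration, it can be useful to check what is currently exported before debugging unexpected CLI behavior. A trivial sketch (not part of AIS itself):

```shell
# List any AIS_* variables currently set in this shell - these will override
# the corresponding cluster/CLI configuration for every command you run.
env | grep '^AIS_' || echo "no AIS_ overrides in effect"
```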
For developers, the CLI command `ais config cluster log.modules ec xs` (for instance) allows you to selectively raise and/or reduce logging verbosity on a per-module basis - in this particular case, for the modules EC (erasure coding) and xactions (batch jobs).

> To list all log modules, type `ais config cluster` or `ais config node` and press `<TAB-TAB>`.

Finally, there's also HTTPS configuration (including **X509** certificates and options), and the corresponding [environment](#tls-testing-with-self-signed-certificates).

For details, please see the section [TLS: testing with self-signed certificates](#tls-testing-with-self-signed-certificates) below.

## Multiple deployment options

AIStore deploys anywhere, anytime, supporting multiple deployment options [summarized and further referenced here](/deploy/README.md).

All [containerized deployments](/deploy/README.md) have their own separate `Makefiles`. With the exception of the [local playground](#local-playground), each specific build-able development (`dev/`) and production (`prod/`) option under the `deploy` folder contains a pair: {`Dockerfile`, `Makefile`}.

> This separation is typically small in size and easily readable and maintainable.

Also supported is the option *not* to have the [required](#prerequisites) [Go](https://go.dev) installed and configured. To still be able to build AIS binaries without [Go](https://go.dev) on your machine, make sure that you have `docker` and simply uncomment the `CROSS_COMPILE` line in the top [`Makefile`](https://github.com/NVIDIA/aistore/blob/main/Makefile).

In the software, the _type of deployment_ is also present in some minimal way. In particular, to overcome certain limitations of the [Local Playground](#local-playground) (single disk shared by multiple targets, etc.) - we need to know the _type_.
Which can be:

| enumerated type | comment |
| --- | --- |
| `dev` | development |
| `k8s` | Kubernetes |
| `linux` | Linux |

> The most recently updated enumeration can be found in the [source](https://github.com/NVIDIA/aistore/blob/main/ais/utils.go#L329).

> The _type_ shows up in the `show cluster` output - see the example above.

### Kubernetes deployments

For any Kubernetes deployment (including, of course, production deployments), please use the separate and dedicated [AIS-K8s GitHub](https://github.com/NVIDIA/ais-k8s/blob/master/docs/README.md) repository. The repo contains detailed [Ansible playbooks](https://github.com/NVIDIA/ais-k8s/tree/master/playbooks) that cover a variety of use cases and configurations.

In particular, the [AIS-K8s GitHub repository](https://github.com/NVIDIA/ais-k8s/blob/master/terraform/README.md) provides a single-line command to deploy a Kubernetes cluster and the underlying infrastructure with the AIStore cluster running inside (see below). The only requirement is having a few dependencies preinstalled (in particular, `helm`) and a Cloud account.

The following GIF illustrates the steps to deploy AIS on the Google Cloud Platform (GCP):

![Kubernetes cloud deployment](images/ais-k8s-deploy.gif)

Finally, the [repository](https://github.com/NVIDIA/ais-k8s) hosts the [Kubernetes Operator](https://github.com/NVIDIA/ais-k8s/tree/master/operator) project that will eventually replace Helm charts and become the main deployment, lifecycle, and operation management "vehicle" for AIStore.

### Minimal all-in-one-docker Deployment

This option has the unmatched convenience of requiring an absolute minimum of time and resources - please see this [README](/deploy/prod/docker/single/README.md) for details.
### Manual deployment

You can also run `make deploy` in the root directory of the repository to deploy a cluster:

```console
$ make deploy
Enter number of storage targets:
10
Enter number of proxies (gateways):
3
Number of local cache directories (enter 0 to use preconfigured filesystems):
2
Select backend providers:
Amazon S3: (y/n) ?
n
Google Cloud Storage: (y/n) ?
n
Azure: (y/n) ?
n
HDFS: (y/n) ?
n
Create loopback devices (note that it may take some time): (y/n) ?
n
Building aisnode: version=df24df77 providers=
```

> Notice the backend-provider prompts above and the fact that access to 3rd party Cloud storage is a deployment-time option.

Run `make help` for supported (make) options and usage examples, including:

```console
# Restart a cluster of 7 targets (4 mountpaths each) and 2 proxies; utilize previously generated (pre-shutdown) local configurations
$ make restart <<< $'7\n2\n4\ny\ny\nn\n0\n'

...
```

Further:

* `make kill` - terminate the local AIStore.
* `make restart` - shut it down and immediately restart using the existing configuration.
* `make help` - show make options and usage examples.

For even more development options and tools, please refer to:

* [development docs](/docs/development.md)

### Testing your cluster

For development, health-checking a new deployment, or any other (functional and performance testing) related reason, you can run any or all of the included tests.

For example:

```console
$ go test ./ais/test -v -run=Mirror
```

The `go test` above will create an AIS bucket, configure it as a two-way mirror, generate thousands of random objects, read them all several times, and then destroy the replicas and eventually the bucket as well.
Alternatively, if you happen to have an Amazon and/or Google Cloud account, make sure to specify the corresponding (S3 or GCS) bucket name when running `go test` commands.
For example, the following will download objects from your (presumably existing) S3 bucket and distribute them across AIStore:

```console
$ BUCKET=aws://myS3bucket go test ./ais/test -v -run=download
```

To run all tests in the [short tests](https://pkg.go.dev/testing#Short) category:

```console
# using a randomly named ais://nnn bucket (that will be created on the fly and destroyed in the end):
$ BUCKET=ais://nnn make test-short

# with an existing Google Cloud bucket gs://myGCPbucket
$ BUCKET=gs://myGCPbucket make test-short
```

The command randomly shuffles existing short tests and then, depending on your platform, usually takes anywhere between 15 and 30 minutes. To terminate, press Ctrl-C at any time.

> Ctrl-C or any other (kind of) abnormal termination of a running test may have the side effect of leaving some test data in the test bucket.

## Kubernetes Playground

In our development and testing, we make use of [Minikube](https://kubernetes.io/docs/tutorials/hello-minikube/) and the capability, further documented [here](/deploy/dev/k8s/README.md), to run a Kubernetes cluster on a single development machine. There's a distinct advantage: AIStore extensions that require Kubernetes - such as [Extract-Transform-Load](/docs/etl.md), for example - can be developed rather efficiently.

* [AIStore on Minikube](/deploy/dev/k8s/README.md)

## Setting Up HTTPS Locally

In the end, all of the examples above run a bunch of local web servers that listen for plain HTTP requests. The following are quick steps for developers to engage HTTPS.

This is still a so-called _local playground_ type of deployment _from scratch_, whereby we are not trying to switch an existing cluster from HTTP to HTTPS, or vice versa.
All we do here is deploy a brand new HTTPS-based aistore.

> Note: If you need to switch an existing AIS cluster to HTTPS, please refer to [these steps](switch_https.md).

### Generate Certificates

Creating a self-signed certificate along with its private key and a Certificate Authority (CA) certificate using [OpenSSL](https://www.openssl.org/) involves several steps. First, we create a self-signed Certificate Authority (CA) certificate (`ca.crt`). Then we create a Certificate Signing Request (CSR), and finally, based on the CSR and CA, we create the server certs (`server.key` and `server.crt`).

```console
$ openssl req -x509 -newkey rsa:2048 -keyout ca.key -out ca.crt -days 1024 -nodes -subj "/CN=localhost" -extensions v3_ca -config <(printf "[req]\ndistinguished_name=req\nx509_extensions=v3_ca\n[ v3_ca ]\nsubjectAltName=DNS:localhost,DNS:127.0.0.1,IP:127.0.0.1\nbasicConstraints=CA:TRUE\n")
$ openssl req -new -newkey rsa:2048 -nodes -keyout server.key -out server.csr -subj "/C=US/ST=California/L=Santa Clara/O=NVIDIA/OU=AIStore/CN=localhost" -config <(printf "[req]\ndistinguished_name=req\nreq_extensions = v3_req\n[ v3_req ]\nsubjectAltName=DNS:localhost,DNS:127.0.0.1,IP:127.0.0.1\n")
$ openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 365 -sha256 -extfile <(printf "[ext]\nsubjectAltName=DNS:localhost,DNS:127.0.0.1,IP:127.0.0.1\nbasicConstraints=CA:FALSE\nkeyUsage=digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment\nextendedKeyUsage=serverAuth,clientAuth\n") -extensions ext
```

**Important:** Ensure you specify the appropriate DNS names relevant to your deployment context. For deployments intended to run locally, the domain name `localhost` and the address `127.0.0.1` should be included in the certificates for them to be valid.
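To confirm that a generated chain is usable before pointing the cluster at it, `openssl verify` and the `-checkhost` option come in handy. Below is a self-contained sketch (it generates a throwaway CA and server cert in a temp directory, skipping the SAN/extension details above; for the real check, point the last two commands at the `ca.crt` and `server.crt` you just created):

```shell
# Throwaway demo: build a minimal CA + server cert, then verify the chain
# and the "localhost" hostname match. Requires OpenSSL 1.1 or later.
dir=$(mktemp -d)
# throwaway CA (modern `req -x509` marks it CA:TRUE by default)
openssl req -x509 -newkey rsa:2048 -nodes -keyout "$dir/ca.key" \
    -out "$dir/ca.crt" -days 7 -subj "/CN=demo-ca" 2>/dev/null
# throwaway server CSR, then a cert signed by that CA
openssl req -new -newkey rsa:2048 -nodes -keyout "$dir/server.key" \
    -out "$dir/server.csr" -subj "/CN=localhost" 2>/dev/null
openssl x509 -req -in "$dir/server.csr" -CA "$dir/ca.crt" -CAkey "$dir/ca.key" \
    -CAcreateserial -out "$dir/server.crt" -days 7 -sha256 2>/dev/null
# the actual checks - substitute your real ca.crt/server.crt here:
openssl verify -CAfile "$dir/ca.crt" "$dir/server.crt"
openssl x509 -in "$dir/server.crt" -noout -checkhost localhost
```

If either command fails for your real certificates, re-check the SANs (`subjectAltName`) passed to the commands above.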
### Deploy Cluster (4 targets, 1 gateway, 6 mountpaths, AWS)

```console
$ # shutdown prev running AIS cluster
$ make kill
$ # delete smaps of prev AIS clusters
$ find ~/.ais* -type f -name ".ais.smap" | xargs rm
$ AIS_USE_HTTPS=true AIS_SKIP_VERIFY_CRT=true AIS_SERVER_CRT=<path-to-cert>/server.crt AIS_SERVER_KEY=<path-to-key>/server.key make deploy <<< $'4\n1\n6\ny\nn\nn\n0\n'
```

> Notice the environment variables above: **AIS_USE_HTTPS**, **AIS_SKIP_VERIFY_CRT**, **AIS_SERVER_CRT**, and **AIS_SERVER_KEY**.

### Accessing the Cluster

To use the CLI, first try any command with an HTTPS-based cluster endpoint, for instance:

```console
$ AIS_ENDPOINT=https://127.0.0.1:8080 ais show cluster
```

If it fails with a "failed to verify certificate" message, perform a simple step to configure the CLI to skip HTTPS cert validation:

```console
$ ais config cli set cluster.skip_verify_crt true
"cluster.skip_verify_crt" set to: "true" (was: "false")
```

And then try again.

### TLS: testing with self-signed certificates

In the previous example, the cluster simply ignores SSL certificate verification due to `AIS_SKIP_VERIFY_CRT` being `true`.
For a more secure setup, consider validating certificates by configuring the necessary environment variables as shown in the table below:

| var name | description | the corresponding cluster configuration |
| -- | -- | -- |
| `AIS_USE_HTTPS` | when false, we use plain HTTP, with all the TLS config (below) simply **ignored** | "net.http.use_https" |
| `AIS_SERVER_CRT` | aistore cluster X509 certificate | "net.http.server_crt" |
| `AIS_SERVER_KEY` | certificate's private key | "net.http.server_key" |
| `AIS_DOMAIN_TLS` | NOTE: not supported, must be empty (domain, hostname, or SAN registered with the certificate) | "net.http.domain_tls" |
| `AIS_CLIENT_CA_TLS` | Certificate authority that authorized (signed) the certificate | "net.http.client_ca_tls" |
| `AIS_CLIENT_AUTH_TLS` | Client authentication during TLS handshake: a range from 0 (no authentication) to 4 (request and validate the client's certificate) | "net.http.client_auth_tls" |
| `AIS_SKIP_VERIFY_CRT` | when true: skip X509 cert verification (usually enabled to circumvent limitations of self-signed certs) | "net.http.skip_verify" |

> More info on [`AIS_CLIENT_AUTH_TLS`](https://pkg.go.dev/crypto/tls#ClientAuthType).

In the following example, we run an HTTPS-based deployment where `AIS_SKIP_VERIFY_CRT` is `false`:

```console
$ make kill
$ # delete smaps
$ find ~/.ais* -type f -name ".ais.smap" | xargs rm
$ # substitute variables in the files below to point to the correct certificates
$ source ais/test/tls-env/server.conf
$ source ais/test/tls-env/client.conf
$ AIS_USE_HTTPS=true make deploy <<< $'6\n6\n4\ny\ny\nn\n\n'
```

Notice that when the cluster is deployed for the first time, the `server.conf` environment (above) overrides the aistore cluster configuration.

> The environment is ignored upon cluster restarts and upgrades.

On the other hand, `ais/test/tls-env/client.conf` contains environment variables to override the CLI config.
The correspondence between environment and config names is easy to see as well.

> See also: [Client-side TLS environment](/docs/cli.md#environment-variables)

## Build, Make and Development Tools

As noted, the project utilizes GNU `make` to build and run things both locally and remotely (e.g., when deploying AIStore via [Kubernetes](/deploy/dev/k8s/Dockerfile)). As the very first step, run `make help` for help on:

* **building** the AIS node binary (called `aisnode`), deployable as both a storage target **or** an AIS gateway (most of the time referred to as "proxy");
* **building** the [CLI](/docs/cli.md);
* **building** [benchmark tools](/bench/tools/README.md).

In particular, `make` provides a growing number of developer-friendly commands to:

* **deploy** the AIS cluster on your local development machine;
* **run** all or selected tests;
* **instrument** the AIS binary with race detection, CPU and/or memory profiling, and more.

## Containerized Deployments: Host Resource Sharing

The following **applies to all containerized deployments**:

1. AIS nodes always automatically detect *containerization*.
2. If deployed as a container, each AIS node independently discovers whether its own container's memory and/or CPU resources are restricted.
3. Finally, the node then abides by those restrictions.

To that end, each AIS node at startup loads and parses the [cgroup](https://www.kernel.org/doc/Documentation/cgroup-v2.txt) settings for the container and, if the number of CPUs is restricted, adjusts the number of allocated system threads for its goroutines.

> This adjustment is accomplished via the Go runtime [GOMAXPROCS variable](https://golang.org/pkg/runtime/). For in-depth information on CPU bandwidth control and scheduling in a multi-container environment, please refer to the [CFS Bandwidth Control](https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt) document.
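The quota/period arithmetic behind that adjustment can be reproduced from a shell. A minimal sketch, assuming a cgroup-v2 (unified) hierarchy mounted at `/sys/fs/cgroup` - this mirrors the idea, not AIStore's actual Go implementation:

```shell
# Sketch: derive the effective CPU limit the way a containerized process would,
# i.e., ceil(quota/period) from cgroup-v2 cpu.max; fall back to the host CPU
# count when unrestricted. (Illustration only - not AIStore's actual code.)
effective_cpus() {
  if [ -r /sys/fs/cgroup/cpu.max ]; then
    read -r quota period < /sys/fs/cgroup/cpu.max
    if [ "$quota" != "max" ]; then
      echo $(( (quota + period - 1) / period ))   # e.g., 150000/100000 -> 2
      return
    fi
  fi
  nproc   # unrestricted: use the host CPU count
}
effective_cpus
```

Inside a container with, say, `--cpus=1.5`, `cpu.max` would read `150000 100000`, and the function rounds the limit up to 2 usable CPUs, which is the value a Go runtime would want for `GOMAXPROCS`.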
Further, given the container's cgroup/memory limitation, each AIS node adjusts the amount of memory available for itself.

> Memory limits may affect [dSort](/docs/dsort.md) performance, forcing it to "spill" the content associated with in-progress resharding onto local drives. The same is true for erasure-coding, which also requires memory to rebuild objects from slices, etc.

> For technical details on AIS memory management, please see [this readme](/memsys/README.md).

## Assorted command lines

AIStore targets may execute (and parse the output of) the following 3 commands:

```console
$ du -bc
$ lsblk -Jt
$ df -PT   # e.g., `df -PT /tmp/foo`
```

**Tip**:

> In fact, prior to deploying an AIS cluster on a given Linux distribution for the very first time, it makes sense to run these 3 commands and check their output for "invalid option" errors - or the lack thereof.
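That preflight check can be scripted. A small sketch (the `check_cmd` helper is hypothetical, not part of AIS) that probes each command with the exact flags AIS targets use:

```shell
# Preflight sketch: confirm the commands AIS targets shell out to are installed
# and accept the expected flags on this distribution.
check_cmd() {
  # $1 = command name; all args together = the exact invocation to probe
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "MISSING: $1"
  elif "$@" >/dev/null 2>&1; then
    echo "OK: $*"
  else
    echo "WARN: '$*' exited non-zero - check for unsupported options"
  fi
}

check_cmd du -bc .
check_cmd df -PT .
check_cmd lsblk -Jt
```

Any `MISSING` or `WARN` line points at a package to install (e.g., `util-linux` for `lsblk`) or a non-GNU variant of the tool.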