---
layout: post
title: "Promoting local and shared files"
date: Mar 17, 2022
author: Alex Aizman
categories: aistore
---

When it comes to working with files, the first question is often *how*: how to easily and quickly move or copy existing file datasets into AIS clusters?
There are, in fact, several distinct ways to handle [existing datasets](/docs/overview.md#existing-datasets). But if those datasets are files, we recommend taking a look at [`promote`](/docs/overview.md).

Introduced in v3.0 to exclusively handle local files and directories, `promote` has over time become the preferred method. Any file source is supported, local or remote, shared or not.

Here's a quick and commented CLI illustration:

```console
# assume we have files:
$ ls /mnt/share/abc
1000.test 1001.test 1002.test ... 9999.test

# promote them all into ais bucket (named ais-bucket), with simultaneous renaming abc/ => xyz/
$ ais object promote /mnt/share/abc ais://ais-bucket/xyz
promoted "/mnt/share/abc" => ais://ais-bucket, xaction ID "L5pptahI8"

# the result:
$ ais ls ais://ais-bucket
NAME             SIZE
xyz/1000.test    123k
xyz/1001.test    456k
...
xyz/9999.test    789k
```

But if we want to, for instance:

- promote recursively to an s3 bucket - with nested subdirectories but without renaming the destination base, and
- delete sources upon successful promotion, and
- overwrite destination objects (if they exist), and finally
- skip file-share auto-detection

then we can do something like this:

```console
$ ais object promote /mnt/share/abc s3://s3-bucket -r --delete-src --overwrite-dst --not-file-share
```

> `--not-file-share` translates as follows: each target acts autonomously, skipping auto-detection and promoting the entire file source as "seen" by that target.
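
The `abc/ => xyz/` renaming in the first CLI example can be pictured with a minimal Python sketch. It only mirrors the listing shown above and is not AIS code: the helper `object_name` is hypothetical, and the rule it applies (path relative to the promoted directory, prefixed with the destination name) is an assumption inferred from that listing.

```python
import posixpath


def object_name(src_dir, file_path, dst_prefix):
    """Hypothetical helper: map a promoted file to its object name.

    Assumption: the object name is the file path relative to the promoted
    directory, prefixed with the destination name (here, "xyz").
    """
    rel = posixpath.relpath(file_path, start=src_dir)
    return posixpath.join(dst_prefix, rel)


# Files taken from the CLI example above
files = [f"/mnt/share/abc/{n}.test" for n in (1000, 1001, 9999)]
names = [object_name("/mnt/share/abc", f, "xyz") for f in files]
print(names)
```

Under this assumption, the printed names match the `NAME` column of the `ais ls` output above.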

Historically, AIS `promote` is reminiscent of what's usually called **server-side copy** - a time-honored technique to engage sources and destinations directly, thus avoiding network roundtrips and client-side bottlenecks.

AIS, of course, takes it to a different level by distributing the work between clustered nodes that act independently and in parallel:

![Promote file share](/assets/promote-file-share.png)

Here we have a client promoting an NFS or SMB share called `mnt/share`.

The client can use the CLI (as shown above), [HTTP](/docs/http_api.md), or call the native API directly via [api.Promote](https://github.com/NVIDIA/aistore/tree/main/api). Either way:

1. Given the correct HTTP address, the request finds the designated AIS cluster (shown as a green/gray bubble) and the AIS gateway (aka "proxy") with that address.

2. The AIS proxy then initiates a 2-phase transaction where the *begin* phase performs a range of validations to make sure that storage targets are ready to execute.

3. In particular, unless auto-detection is disabled, each target computes a digest of all sorted filenames under `mnt/share`. For example, `target-x` would compute `digest-x`, `target-y` would compute `digest-y`, and so on.

4. This concludes the *begin* phase, after which the cluster, and each target individually, starts *committing* - i.e., reading files from `mnt/share` and writing them locally as objects.

Further details are indicated on the line diagram above. The one nuance worth reiterating is that each target handles those, and **only** those, files that map to itself, location-wise.
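
To make steps 3 and 4 more concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the AIS implementation. It assumes SHA-256 for the filename digest, assumes that identical digests across targets indicate a shared file source, and uses rendezvous (highest-random-weight) hashing as a stand-in for the file-to-target mapping. All function names and target IDs are hypothetical.

```python
import hashlib


def filenames_digest(filenames):
    """Digest of the sorted filenames (SHA-256 is an assumption of this sketch)."""
    h = hashlib.sha256()
    for name in sorted(filenames):
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator, so that ["ab", "c"] differs from ["a", "bc"]
    return h.hexdigest()


def is_file_share(digests):
    """Assumed rule: if every target computed the same digest, the source is shared."""
    return len(set(digests)) == 1


def owner(filename, targets):
    """Stand-in mapping: rendezvous hashing picks one target per filename."""
    def weight(target):
        return hashlib.sha256(f"{target}/{filename}".encode("utf-8")).digest()
    return max(targets, key=weight)


def files_for(target, filenames, targets):
    """Files that a given target would promote: only those that map to itself."""
    return [f for f in sorted(filenames) if owner(f, targets) == target]


targets = ["target-x", "target-y", "target-z"]
files = [f"{n}.test" for n in range(1000, 1010)]

# Begin phase: every target lists the same share and computes its digest
digests = [filenames_digest(files) for _ in targets]
shared = is_file_share(digests)

# Commit phase: each target handles only the files that map to itself
assignment = {t: files_for(t, files, targets) for t in targets}
```

Because every target evaluates the same deterministic mapping, no coordination is needed during the commit phase: each file ends up with exactly one target, and the targets can read from the share in parallel.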