# Backing a Noms Block Store with AWS

How to use S3 and DynamoDB as the persistent storage layer for a Noms Block Store (NBS).

## Overview

When running atop AWS, NBS stores immutable chunk data in S3 objects and mutable state -- essentially a 'manifest' indicating which S3 objects are live -- in DynamoDB. Many separate Noms Block Stores can share a single bucket/table pair, as long as you give each store a distinct name. You could also choose to spin up a separate bucket/table pair for each NBS, though this is not required -- and, indeed, probably overkill.

## AWS Setup

This assumes a setup in a single AWS region.

### Create an S3 bucket and DynamoDB table

There are no special requirements on the S3 bucket you create. Just choose a name and, once the bucket is created, note its ARN for later use.

The DynamoDB table you create, on the other hand, does need a particular structure: it must have a *primary partition key* that is a *string* named *db*. (A scripted sketch of this setup appears at the end of this section.) Again, note the table's ARN for later use.

### Access control

The NBS code honors AWS credentials files, so when running on your development machine the easiest approach is to drop the credentials of the user that created the bucket and table above into `~/.aws/credentials` and run that way. This isn't a great approach for running on an EC2 instance in production, however. The right way to do that is to create an IAM Role and run your instance as that role.

Create such a role using the IAM Management Console (or the command-line tool of your choice) and make sure it has a policy with at least the following permissions:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1453230562000",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:DeleteItem",
                "dynamodb:GetItem",
                "dynamodb:PutItem"
            ],
            "Resource": [
                "[ARN for your DynamoDB table]"
            ]
        },
        {
            "Sid": "Stmt1454457944000",
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:CompleteMultipartUpload",
                "s3:CreateMultipartUpload",
                "s3:GetObject",
                "s3:PutObject",
                "s3:UploadPart",
                "s3:UploadPartCopy"
            ],
            "Resource": [
                "[ARN for your S3 bucket]"
            ]
        }
    ]
}
```

This is where the ARNs for your bucket and table come in.
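If you'd rather script the table setup than click through the console, the following is a minimal sketch using the AWS SDK for Go. It is not part of NBS itself: the table name, region, and `PAY_PER_REQUEST` billing mode are placeholder choices, and only the required `db` string partition key is defined.

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

func main() {
	sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-west-2")))
	svc := dynamodb.New(sess)

	// The manifest table needs exactly one key attribute: a *string*
	// partition key named "db", as described above.
	_, err := svc.CreateTable(&dynamodb.CreateTableInput{
		TableName: aws.String("dynamo-table"), // placeholder table name
		AttributeDefinitions: []*dynamodb.AttributeDefinition{
			{AttributeName: aws.String("db"), AttributeType: aws.String("S")},
		},
		KeySchema: []*dynamodb.KeySchemaElement{
			{AttributeName: aws.String("db"), KeyType: aws.String("HASH")},
		},
		BillingMode: aws.String("PAY_PER_REQUEST"), // placeholder billing choice
	})
	if err != nil {
		log.Fatal(err)
	}
}
```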
## Instantiating an NBS-on-AWS ChunkStore

### On the command line

```shell
noms ds aws://[dynamo-table:s3-bucket]/store-name
```

### NewAWSStore

If your code only needs to create a store pointing to a single named store, you can write code similar to the following:

```go
sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-west-2")))
store := nbs.NewAWSStore("dynamo-table", "store-name", "s3-bucket", s3.New(sess), dynamodb.New(sess), 1<<28)
```

### NewAWSStoreFactory

If you find yourself wanting to create NBS instances pointing to multiple different named stores, you can use `nbs.NewAWSStoreFactory()`, which also supports caching Noms data on disk in some cases:

```go
sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-west-2")))
fact := nbs.NewAWSStoreFactory(
	sess, "dynamo-table", "s3-bucket",
	128 /* Maximum number of open files in cache */,
	1<<28 /* Amount of index data to cache in memory */,
	1<<30 /* Amount of Noms data to cache on disk */,
	"/path/to/cache" /* Directory in which to cache Noms data */,
)
store := fact.CreateStore("store-name")
```
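For reference, the Go snippets above assume imports along the following lines. The AWS packages are from the AWS SDK for Go (v1), and the `nbs` import path matches this repository; adjust both to the versions you actually build against.

```go
import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/aws/aws-sdk-go/service/s3"

	"github.com/dolthub/dolt/go/store/nbs"
)
```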