# Getting S3 Clients in Spark Clients

## Why?

We have 2 clients for Spark that work on lakeFS but also need to access AWS
S3 directly:

<table>
<tr>
<th>Client</th><th>Where used</th><th>Why it needs an S3 client</th>
</tr><tr>
<td>Spark metadata client</td>
<td>
GC (both committed and uncommitted), Spark Export, and also
available for users to use to access lakeFS metadata directly.
</td><td>
Accesses stored metadata directly on S3 and deletes data objects.
</td>
</tr><tr>
<td>
lakeFSFS
</td>
<td>
Reading and writing directly on lakeFS.
</td>
<td>
Reads ETags of uploaded objects to put them in lakeFS metadata. In
_some_ Hadoop versions, S3AFileSystem returns FileStatus objects
with an S3AFileStatus.getETag method. Otherwise a separate call to S3
is needed.
</td>
</tr>
</table>

![David Niven as Sir Charles Litton, The Pink Panther][pink-panther-img]

[pink-panther-img]: https://static.wikia.nocookie.net/pinkpanther/images/7/76/David_Niven_-_01.webp/revision/latest?cb=20220531105637

These Spark clients cannot work without a working S3 client[^1]. This is:

* **Different** between our two clients.

  The Spark metadata client supports _only_ authentication to S3 using
  access keys or STS; lakeFSFS supports _only_ taking ("_stealing_")
  clients from S3AFileSystem.

* **Brittle**.

  Some users cannot use the authentication methods that we make available
  to them. The thievery code in lakeFSFS is subtle and greatly depends on
  an assumed underlying implementation; it can break when Databricks
  introduces new features.

* **Uninformative** in the case of system or user error.

  Users receive very poor error reports. If they get as far as an S3
  client but it is misconfigured, S3 happily generates "400 Bad Request"
  messages. If client theft fails, it generates a report of the _last_
  failure -- probably not the most _important_ failure.

There are numerous bug reports and user questions about this area.

[^1]: lakeFSFS _might_ not need the client, if it can find an ETag on
    returned FileStatus objects.

## What?

We propose to:

1. **Reduce friction.** When S3A already works on a Spark installation,
   users should typically not have to add _any_ S3-related configuration in
   order to use lakeFS clients.
1. **Unify** S3 client acquisition between the two clients. Both clients
   will support the same configuration options: clients "stolen" from the
   underlying S3AFileSystem, and explicitly created clients with static
   access keys or STS. Prefer stealing clients to creating them -- these
   are the most likely to work.
1. **Improve** error reporting. Report the stages attempted and how each
   one failed.
1. **Create a more general scheme** for generating clients. Over time we
   can hope to support more underlying implementations.

## Design principles

Unify client generation code into a single library. We will be able to test
this library individually on various Spark setups. This will probably not
be automatic -- there is no automatic source for _new_ Spark setups, and it
is not clear how often _existing_ Spark setups change. But even being able
to run a single command on a Spark cluster and get useful information will
be very useful for investigation, for helping customers probe their setup,
and for further development to support setups where we fail.

This library will define an interface for _client acquisition_: given
various parameters TBD (perhaps a SparkContext or a Hadoop configuration),
a path, and optionally also a FileSystem on that path, a client acquisition
attempt returns a client or a failure message.
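To make the shape of that interface concrete, here is a minimal sketch in
Java. Everything in it -- the names `ClientAcquisitionStrategy` and
`ClientAcquisitionException`, and the exact parameter list -- is an
illustrative assumption, not a committed API; as noted above, the
parameters are still TBD.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** One attempt to acquire an S3 client for a path. Hypothetical sketch. */
interface ClientAcquisitionStrategy {
    /** Human-readable name, used to report which strategy succeeded. */
    String name();

    /**
     * Tries to acquire a client. Returns it as Object because its actual
     * class may belong to a different package or version than the caller's
     * AmazonS3Client (see the discussion of reflection below).
     *
     * @param conf Hadoop configuration (might instead be a SparkContext)
     * @param path a path on the underlying object store
     * @param fs   a FileSystem already opened on path, or null
     * @throws ClientAcquisitionException with a detailed failure message
     */
    Object acquire(Configuration conf, Path path, FileSystem fs)
        throws ClientAcquisitionException;
}

/** Failure of a single acquisition attempt, with a detailed message. */
class ClientAcquisitionException extends Exception {
    ClientAcquisitionException(String message) {
        super(message);
    }

    ClientAcquisitionException(String message, Throwable cause) {
        super(message, cause);
    }
}
```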
A future version may well generalize to acquiring a client for other
underlying storage types from other FileSystems.

The library will include code that tries each of a list of strategies, in
order of desirability. It will return a client or throw an exception with a
detailed message. And it will report which strategy was actually used to
acquire the client. To improve performance, the library will cache the
acquired client for each FileSystem. This will typically mean that the
acquisition code is called just once.

The list of strategies will be configurable via a Hadoop property.
Additionally, we will create pre-populated lists: one recommended for
no-hassle production use, and another consisting of all (or almost all)
strategies, recommended for debugging. Users who explicitly wish to use a
single strategy will simply configure that one strategy as the only option.

One complication is that many FileSystems are _layered_, and strategies to
detect them may require some recursion or at least iteration. For instance,
while S3A may support `S3AFileSystem.getAmazonS3Client`, on Databricks we
might have to unwrap it from `CredentialScopeFileSystem` using
`CredentialScopeFileSystem.getReadDelegate`, and then try to acquire an S3
client from whatever is returned.

The type of the returned client is indeterminate to the caller. It _is_ an
AmazonS3Client with the desired authentication and the ability to connect
to the bucket. But it may well be of a different version or package than
the caller expects, and if the caller so much as attempts to cast it to
AmazonS3Client it will get a nasty ClassCastException. Similarly for its
*Request and *Response objects. Everything should be done using reflection,
and the library should also help with this call.[^2]

[^2]: The current code assumes that the expected Request object has the
    same actual type and is compatible. This is a bug and will surely break
    somewhere.

### Example: information a strategy might return

One method of generating a FileSystem is to call `getWrappedFs` _if_ that
FileSystem has such a method, and to recurse on that. When such a strategy
fails, it should report:

1. The dynamic type of FileSystem that it received.
1. What failed:
   1. `getWrappedFs`? For instance, if it received a FileSystem that does
      not have this method.
   1. Acquiring a FileSystem from the wrapped instance? This is a
      recursive attempt, and its failures will also include information
      about the failure.
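To show how a strategy might gather this information, here is a sketch in
terms of the hypothetical `ClientAcquisitionStrategy` interface above. The
class name `UnwrapStrategy` and the error texts are illustrative only; the
important part is that every exit path names the dynamic FileSystem type
and which step failed, and that the recursive attempt's failure is chained
rather than discarded.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Unwraps a layered FileSystem by calling its getWrappedFs() method (if it
 * has one) and recursing on the result. Reflection is used throughout,
 * because the wrapper class is typically not on our compile-time classpath.
 */
class UnwrapStrategy implements ClientAcquisitionStrategy {
    private final ClientAcquisitionStrategy inner; // the recursive attempt

    UnwrapStrategy(ClientAcquisitionStrategy inner) {
        this.inner = inner;
    }

    @Override
    public String name() {
        return "unwrap getWrappedFs";
    }

    @Override
    public Object acquire(Configuration conf, Path path, FileSystem fs)
            throws ClientAcquisitionException {
        if (fs == null) {
            throw new ClientAcquisitionException(
                "no FileSystem to unwrap for " + path);
        }
        // Every failure below reports the *dynamic* type we received.
        String fsType = fs.getClass().getName();

        Method getWrappedFs;
        try {
            getWrappedFs = fs.getClass().getMethod("getWrappedFs");
        } catch (NoSuchMethodException e) {
            // Report: this FileSystem has no getWrappedFs method.
            throw new ClientAcquisitionException(
                fsType + " has no getWrappedFs method", e);
        }

        FileSystem wrapped;
        try {
            wrapped = (FileSystem) getWrappedFs.invoke(fs);
        } catch (IllegalAccessException | InvocationTargetException
                 | ClassCastException e) {
            // Report: getWrappedFs exists but calling it failed.
            throw new ClientAcquisitionException(
                "calling " + fsType + ".getWrappedFs failed", e);
        }

        try {
            return inner.acquire(conf, path, wrapped);
        } catch (ClientAcquisitionException e) {
            // Report: the recursive attempt failed; chain its own failure
            // information onto ours instead of discarding it.
            throw new ClientAcquisitionException(
                "unwrapped " + fsType + " to "
                + wrapped.getClass().getName()
                + " but could not acquire a client from it", e);
        }
    }
}
```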