---
title: "HDFS Remote"
description: "Remote for Hadoop Distributed Filesystem"
versionIntroduced: "v1.54"
---

# {{< icon "fa fa-globe" >}} HDFS

[HDFS](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) is a
distributed file-system, part of the [Apache Hadoop](https://hadoop.apache.org/) framework.

Paths are specified as `remote:` or `remote:path/to/dir`.

## Configuration

Here is an example of how to make a remote called `remote`. First run:

    rclone config

This will guide you through an interactive setup process:

```
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[skip]
XX / Hadoop distributed file system
   \ "hdfs"
[skip]
Storage> hdfs
** See help for hdfs backend at: https://rclone.org/hdfs/ **

hadoop name node and port
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / Connect to host namenode at port 8020
   \ "namenode:8020"
namenode> namenode.hadoop:8020
hadoop user name
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / Connect to hdfs as root
   \ "root"
username> root
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
--------------------
[remote]
type = hdfs
namenode = namenode.hadoop:8020
username = root
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:

Name                 Type
====                 ====
hadoop               hdfs

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q
```

This remote is called `remote` and can now be used like this:

See all the top level directories:

    rclone lsd remote:

List the contents of a directory:

    rclone ls remote:directory

Sync the remote `directory` to `/home/local/directory`, deleting any excess files:

    rclone sync --interactive remote:directory /home/local/directory

### Setting up your own HDFS instance for testing

You may start with a [manual setup](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html)
or use the docker image from the tests.

If you want to build the docker image:

```
git clone https://github.com/rclone/rclone.git
cd rclone/fstest/testserver/images/test-hdfs
docker build --rm -t rclone/test-hdfs .
```

Or you can just use the latest one pushed:

```
docker run --rm --name "rclone-hdfs" -p 127.0.0.1:9866:9866 -p 127.0.0.1:8020:8020 --hostname "rclone-hdfs" rclone/test-hdfs
```

**NB** it needs a few seconds to start up.

For this docker image the remote needs to be configured like this:

```
[remote]
type = hdfs
namenode = 127.0.0.1:8020
username = root
```

You can stop this image with `docker kill rclone-hdfs` (**NB** it does not use volumes, so all data
uploaded will be lost.)

### Modification times

Modification times are stored accurate to 1 second.
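Any finer precision in a timestamp is therefore lost on upload. As a quick check (a sketch, assuming the remote is called `remote`; the path and timestamp are placeholders):

```
# Set an explicit modification time - only whole seconds are preserved on HDFS
# (rclone touch creates the file if it does not already exist)
rclone touch --timestamp 2021-01-02T15:04:05 remote:path/file.txt

# List files with their sizes and stored modification times
rclone lsl remote:path
```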
### Checksum

No checksums are implemented.

### Usage information

You can use the `rclone about remote:` command which will display filesystem size and current usage.

### Restricted filename characters

In addition to the [default restricted characters set](/overview/#restricted-characters)
the following characters are also replaced:

| Character | Value | Replacement |
| --------- |:-----:|:-----------:|
| :         | 0x3A  | ：          |

Invalid UTF-8 bytes will also be [replaced](/overview/#invalid-utf8).

{{< rem autogenerated options start" - DO NOT EDIT - instead edit fs.RegInfo in backend/hdfs/hdfs.go then run make backenddocs" >}}
### Standard options

Here are the Standard options specific to hdfs (Hadoop distributed file system).

#### --hdfs-namenode

Hadoop name nodes and ports.

E.g. "namenode-1:8020,namenode-2:8020,..." to connect to host namenodes at port 8020.

Properties:

- Config:      namenode
- Env Var:     RCLONE_HDFS_NAMENODE
- Type:        CommaSepList
- Default:

#### --hdfs-username

Hadoop user name.

Properties:

- Config:      username
- Env Var:     RCLONE_HDFS_USERNAME
- Type:        string
- Required:    false
- Examples:
    - "root"
        - Connect to hdfs as root.

### Advanced options

Here are the Advanced options specific to hdfs (Hadoop distributed file system).

#### --hdfs-service-principal-name

Kerberos service principal name for the namenode.

Enables KERBEROS authentication. Specifies the Service Principal Name
(SERVICE/FQDN) for the namenode. E.g. \"hdfs/namenode.hadoop.docker\"
for namenode running as service 'hdfs' with FQDN 'namenode.hadoop.docker'.

Properties:

- Config:      service_principal_name
- Env Var:     RCLONE_HDFS_SERVICE_PRINCIPAL_NAME
- Type:        string
- Required:    false

#### --hdfs-data-transfer-protection

Kerberos data transfer protection: authentication|integrity|privacy.

Specifies whether or not authentication, data signature integrity
checks, and wire encryption are required when communicating with
the datanodes. Possible values are 'authentication', 'integrity'
and 'privacy'. Used only with KERBEROS enabled.

Properties:

- Config:      data_transfer_protection
- Env Var:     RCLONE_HDFS_DATA_TRANSFER_PROTECTION
- Type:        string
- Required:    false
- Examples:
    - "privacy"
        - Ensure authentication, integrity and encryption enabled.

#### --hdfs-encoding

The encoding for the backend.

See the [encoding section in the overview](/overview/#encoding) for more info.

Properties:

- Config:      encoding
- Env Var:     RCLONE_HDFS_ENCODING
- Type:        Encoding
- Default:     Slash,Colon,Del,Ctl,InvalidUtf8,Dot

#### --hdfs-description

Description of the remote.

Properties:

- Config:      description
- Env Var:     RCLONE_HDFS_DESCRIPTION
- Type:        string
- Required:    false

{{< rem autogenerated options stop >}}

## Limitations

- No server-side `Move` or `DirMove`, so renames fall back to copy + delete (see the note below).
- Checksums not implemented.
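Because server-side `Move` is missing, commands such as `rclone move` on this backend copy the data to the destination and then delete the source, which routes the data through the machine running rclone. You can inspect which optional features a configured remote reports with the generic `backend features` command (assuming the remote is called `remote`):

    rclone backend features remote: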