
---
title: "HDFS Remote"
description: "Remote for Hadoop Distributed Filesystem"
versionIntroduced: "v1.54"
---

# {{< icon "fa fa-globe" >}} HDFS

[HDFS](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) is a
distributed file-system, part of the [Apache Hadoop](https://hadoop.apache.org/) framework.

Paths are specified as `remote:` or `remote:path/to/dir`.

## Configuration

Here is an example of how to make a remote called `remote`. First run:

    rclone config

This will guide you through an interactive setup process:

```
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[skip]
XX / Hadoop distributed file system
   \ "hdfs"
[skip]
Storage> hdfs
** See help for hdfs backend at: https://rclone.org/hdfs/ **

hadoop name node and port
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / Connect to host namenode at port 8020
   \ "namenode:8020"
namenode> namenode.hadoop:8020
hadoop user name
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / Connect to hdfs as root
   \ "root"
username> root
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
--------------------
[remote]
type = hdfs
namenode = namenode.hadoop:8020
username = root
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:

Name                 Type
====                 ====
hadoop               hdfs

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q
```
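
Alternatively, the same remote can be created non-interactively with `rclone config create`, passing the options as `key=value` pairs (a sketch using the values from the session above):

```
rclone config create remote hdfs namenode=namenode.hadoop:8020 username=root
```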

This remote is called `remote` and can now be used like this.

See all the top level directories:

    rclone lsd remote:

List the contents of a directory:

    rclone ls remote:directory

Sync the remote `directory` to `/home/local/directory`, deleting any excess files:

    rclone sync --interactive remote:directory /home/local/directory
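
For example, to copy a single local file into that directory (the file name is illustrative):

    rclone copy /home/local/directory/file.txt remote:directory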

### Setting up your own HDFS instance for testing

You may start with a [manual setup](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html)
or use the docker image from the tests.

If you want to build the docker image yourself:

```
git clone https://github.com/rclone/rclone.git
cd rclone/fstest/testserver/images/test-hdfs
docker build --rm -t rclone/test-hdfs .
```

Or you can just use the latest one pushed:

```
docker run --rm --name "rclone-hdfs" -p 127.0.0.1:9866:9866 -p 127.0.0.1:8020:8020 --hostname "rclone-hdfs" rclone/test-hdfs
```

**NB** it needs a few seconds to start up.

For this docker image the remote needs to be configured like this:

```
[remote]
type = hdfs
namenode = 127.0.0.1:8020
username = root
```
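
As a quick check that the container is up, you can also point rclone at it with an on-the-fly connection string instead of a saved remote (connection strings require rclone v1.55 or later; the value is quoted because it contains a `:`):

```
rclone lsd ':hdfs,namenode="127.0.0.1:8020",username=root:'
```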

You can stop this image with `docker kill rclone-hdfs` (**NB** it does not use volumes, so all data
uploaded will be lost).

### Modification times

Modification times are stored, accurate to 1 second.
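
For example, `rclone lsl` lists the size, modification time and path of each object:

    rclone lsl remote:directory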

### Checksum

No checksums are implemented.

### Usage information

You can use the `rclone about remote:` command which will display filesystem size and current usage.
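
For example (the `--json` flag gives the same figures in machine-readable form):

    rclone about remote:
    rclone about remote: --json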

### Restricted filename characters

In addition to the [default restricted characters set](/overview/#restricted-characters)
the following characters are also replaced:

| Character | Value | Replacement |
| --------- |:-----:|:-----------:|
| :         | 0x3A  | ：          |

Invalid UTF-8 bytes will also be [replaced](/overview/#invalid-utf8).
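
For example, a file uploaded as `report:2021.txt` is stored on HDFS under the fullwidth-colon name `report：2021.txt` and mapped back transparently when listed or downloaded (the names are illustrative):

    rclone copyto /tmp/source.txt "remote:dir/report:2021.txt"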

{{< rem autogenerated options start" - DO NOT EDIT - instead edit fs.RegInfo in backend/hdfs/hdfs.go then run make backenddocs" >}}
### Standard options

Here are the Standard options specific to hdfs (Hadoop distributed file system).

#### --hdfs-namenode

Hadoop name nodes and ports.

E.g. "namenode-1:8020,namenode-2:8020,..." to connect to host namenodes at port 8020.

Properties:

- Config:      namenode
- Env Var:     RCLONE_HDFS_NAMENODE
- Type:        CommaSepList
- Default:     

#### --hdfs-username

Hadoop user name.

Properties:

- Config:      username
- Env Var:     RCLONE_HDFS_USERNAME
- Type:        string
- Required:    false
- Examples:
    - "root"
        - Connect to hdfs as root.

### Advanced options

Here are the Advanced options specific to hdfs (Hadoop distributed file system).

#### --hdfs-service-principal-name

Kerberos service principal name for the namenode.

Enables KERBEROS authentication. Specifies the Service Principal Name
(SERVICE/FQDN) for the namenode. E.g. \"hdfs/namenode.hadoop.docker\"
for namenode running as service 'hdfs' with FQDN 'namenode.hadoop.docker'.

Properties:

- Config:      service_principal_name
- Env Var:     RCLONE_HDFS_SERVICE_PRINCIPAL_NAME
- Type:        string
- Required:    false

#### --hdfs-data-transfer-protection

Kerberos data transfer protection: authentication|integrity|privacy.

Specifies whether or not authentication, data signature integrity
checks, and wire encryption are required when communicating with
the datanodes. Possible values are 'authentication', 'integrity'
and 'privacy'. Used only with KERBEROS enabled.

Properties:

- Config:      data_transfer_protection
- Env Var:     RCLONE_HDFS_DATA_TRANSFER_PROTECTION
- Type:        string
- Required:    false
- Examples:
    - "privacy"
        - Ensure authentication, integrity and encryption enabled.

#### --hdfs-encoding

The encoding for the backend.

See the [encoding section in the overview](/overview/#encoding) for more info.

Properties:

- Config:      encoding
- Env Var:     RCLONE_HDFS_ENCODING
- Type:        Encoding
- Default:     Slash,Colon,Del,Ctl,InvalidUtf8,Dot

#### --hdfs-description

Description of the remote

Properties:

- Config:      description
- Env Var:     RCLONE_HDFS_DESCRIPTION
- Type:        string
- Required:    false

{{< rem autogenerated options stop >}}
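
As an illustration of the Kerberos options above, a remote for a secured cluster might look like this (host and principal names are placeholders, and a valid Kerberos ticket, e.g. obtained with `kinit`, is assumed to be available):

```
[secure-hdfs]
type = hdfs
namenode = namenode.hadoop.docker:8020
service_principal_name = hdfs/namenode.hadoop.docker
data_transfer_protection = privacy
```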

## Limitations

- No server-side `Move` or `DirMove`.
- Checksums not implemented.