# MinIO HDFS Gateway [![Slack](https://slack.minio.io/slack?type=svg)](https://slack.minio.io)
MinIO HDFS gateway adds Amazon S3 API support to the Hadoop HDFS filesystem. Applications can use both the S3 and file APIs concurrently without requiring any data migration. Since the gateway is stateless and shared-nothing, you may elastically provision as many MinIO instances as needed to distribute the load.

> NOTE: The intention of this gateway implementation is to make it easy to migrate your existing data on HDFS clusters to MinIO clusters using standard tools like `mc` or `aws-cli`. If the goal is to use HDFS perpetually, we recommend using HDFS directly for all write operations.

## Run MinIO Gateway for HDFS Storage

### Using Binary
Namenode information is obtained automatically by reading `core-site.xml` from your Hadoop environment, which is located through the environment variable *$HADOOP_HOME*.
```
export MINIO_ROOT_USER=minio
export MINIO_ROOT_PASSWORD=minio123
minio gateway hdfs
```
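
If the gateway cannot determine the namenode, it can help to confirm where `core-site.xml` actually lives before starting MinIO. The helper below is a minimal sketch: the function name `find_core_site` is hypothetical, and the searched directories (`$HADOOP_CONF_DIR`, `$HADOOP_HOME/conf`, `$HADOOP_HOME/etc/hadoop`) are assumptions based on common Hadoop layouts, not a statement of the exact lookup order MinIO uses.

```sh
# Print the first core-site.xml found in common Hadoop config locations.
# The candidate directories are assumptions; adjust them to your install.
find_core_site() {
  for dir in "${HADOOP_CONF_DIR:-}" \
             "${HADOOP_HOME:+$HADOOP_HOME/conf}" \
             "${HADOOP_HOME:+$HADOOP_HOME/etc/hadoop}"; do
    if [ -n "$dir" ] && [ -f "$dir/core-site.xml" ]; then
      echo "$dir/core-site.xml"
      return 0
    fi
  done
  echo "core-site.xml not found; check HADOOP_HOME" >&2
  return 1
}
```

Running `find_core_site && minio gateway hdfs` starts the gateway only when a config file was located.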

You can also override the namenode endpoint as shown below.
```
export MINIO_ROOT_USER=minio
export MINIO_ROOT_PASSWORD=minio123
minio gateway hdfs hdfs://namenode:8200
```

### Using Docker
Using Docker is experimental. Most Hadoop environments are not dockerized and may require additional steps to get this working properly. You are better off using the binary in this situation.
```
docker run -p 9000:9000 \
 --name hdfs-s3 \
 -e "MINIO_ROOT_USER=minio" \
 -e "MINIO_ROOT_PASSWORD=minio123" \
 minio/minio gateway hdfs hdfs://namenode:8200
```

### Setup Kerberos

MinIO supports two Kerberos authentication methods: keytab and ccache.

To enable Kerberos authentication, you need to set `hadoop.security.authentication=kerberos` in the HDFS config file.

```xml
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
```

MinIO will load `krb5.conf` from the path given in the environment variable `KRB5_CONFIG`, or from the default location `/etc/krb5.conf` if the variable is not set.
```sh
export KRB5_CONFIG=/path/to/krb5.conf
```

If you want MinIO to use ccache for authentication, set the environment variable `KRB5CCNAME` to the credential cache file path. Otherwise MinIO will use the default location `/tmp/krb5cc_%{uid}`.
```sh
export KRB5CCNAME=/path/to/krb5cc
```
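
The two lookups described above (the `krb5.conf` path and the credential cache path) can be sketched as a small preflight check to run before starting the gateway. This is only an illustration of the documented defaults: the function names are hypothetical, and `%{uid}` is assumed to mean the numeric ID of the user running MinIO, as printed by `id -u`.

```sh
# Path of the krb5.conf file: $KRB5_CONFIG if set, else /etc/krb5.conf.
krb5_conf_path() {
  echo "${KRB5_CONFIG:-/etc/krb5.conf}"
}

# Path of the credential cache: $KRB5CCNAME if set, else /tmp/krb5cc_<uid>.
krb5_ccache_path() {
  echo "${KRB5CCNAME:-/tmp/krb5cc_$(id -u)}"
}

# Report both paths and whether each file is readable by the current user.
# Returns non-zero if either file is missing or unreadable.
krb5_preflight() {
  status=0
  for path in "$(krb5_conf_path)" "$(krb5_ccache_path)"; do
    if [ -r "$path" ]; then
      echo "ok: $path"
    else
      echo "missing: $path"
      status=1
    fi
  done
  return "$status"
}
```

Calling `krb5_preflight` as the same user that will run `minio gateway hdfs` shows which files that user can actually read.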

If you prefer to use a keytab, which is renewed automatically, you need to configure three environment variables:

- `KRB5KEYTAB`: the location of the keytab file
- `KRB5USERNAME`: the username
- `KRB5REALM`: the realm

Please note that the username is not the principal name.

```sh
export KRB5KEYTAB=/path/to/keytab
export KRB5USERNAME=hdfs
export KRB5REALM=REALM.COM
```
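
A quick sanity check of these three variables can catch the most common mistake, which is putting a full principal into `KRB5USERNAME`. The sketch below is illustrative only: `check_keytab_env` is a hypothetical helper, and rejecting any value containing `@` is an assumed interpretation of "the username is not the principal name".

```sh
# Verify the three keytab-related variables before starting the gateway.
check_keytab_env() {
  for name in KRB5KEYTAB KRB5USERNAME KRB5REALM; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      return 1
    fi
  done
  # Assumption: a value such as hdfs@REALM.COM is a principal, not a username.
  case $KRB5USERNAME in
    *@*)
      echo "KRB5USERNAME must not contain '@'; put the realm in KRB5REALM" >&2
      return 1
      ;;
  esac
  echo "keytab login: $KRB5USERNAME@$KRB5REALM using $KRB5KEYTAB"
}
```

With the example values above, the function prints `keytab login: hdfs@REALM.COM using /path/to/keytab`.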

## Test using MinIO Browser
*MinIO gateway* comes with an embedded web-based object browser. Point your web browser to http://127.0.0.1:9000 to ensure that your server has started successfully.

![Screenshot](https://raw.githubusercontent.com/minio/minio/master/docs/screenshots/minio-browser-gateway.png)

## Test using MinIO Client `mc`

`mc` provides a modern alternative to UNIX commands such as ls, cat, cp, mirror, diff, etc. It supports filesystems and Amazon S3 compatible cloud storage services.

### Configure `mc`

```
mc alias set myhdfs http://gateway-ip:9000 access_key secret_key
```

### List buckets on HDFS

```
mc ls myhdfs
[2017-02-22 01:50:43 PST]     0B user/
[2017-02-26 21:43:51 PST]     0B datasets/
[2017-02-26 22:10:11 PST]     0B assets/
```
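
Since the gateway is meant to ease migration from HDFS to MinIO, a listed bucket can then be copied with `mc mirror`. The following is a minimal sketch under stated assumptions: `migrate_bucket` is a hypothetical wrapper, `myhdfs` is the alias configured above, and `target` is a hypothetical alias for a destination MinIO cluster that you have already added with `mc alias set`.

```sh
# Copy one bucket from the HDFS gateway to another MinIO deployment.
# "target" is a hypothetical alias for the destination cluster.
# Set DRY_RUN=1 to print the command instead of running it.
migrate_bucket() {
  bucket=$1
  ${DRY_RUN:+echo} mc mirror "myhdfs/$bucket" "target/$bucket"
}
```

For example, `DRY_RUN=1 migrate_bucket datasets` prints the `mc mirror` command for the `datasets` bucket so it can be reviewed before any data is copied.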

### Known limitations
The gateway inherits the following limitations of the HDFS storage layer:
- No bucket policy support (HDFS has no such concept)
- No bucket notification API support (HDFS has no support for fsnotify)
- No server side encryption support (intentionally not implemented)
- No server side compression support (intentionally not implemented)
- No support for concurrent multipart operations (HDFS lacks safe locking support, or implements it poorly)

## Explore Further
- [`mc` command-line interface](https://docs.minio.io/docs/minio-client-quickstart-guide)
- [`aws` command-line interface](https://docs.minio.io/docs/aws-cli-with-minio)
- [`minio-go` Go SDK](https://docs.minio.io/docs/golang-client-quickstart-guide)