Mounting CHDFS Instance

Last updated: 2023-12-21 16:07:06

Scenario

CHDFS is a high-performance distributed file system that provides the standard HDFS access protocol and a hierarchical namespace. EMR supports reading data from and writing data to CHDFS. This document describes how to mount CHDFS to an EMR cluster.

Instructions

Scenario One: Mounting CHDFS on a New Cluster

Note
New cluster: a cluster created on or after December 31, 2019. For these clusters, EMR sets the default CHDFS mount address to /data/emr/hdfs/tmp/chdfs.
New EMR clusters are already adapted to CHDFS. You only need to create a CHDFS instance and configure its permissions so that the CHDFS instance and the EMR cluster can communicate over the network. The configuration steps are as follows:
1. Create a CHDFS instance in the same region as the EMR cluster. Refer to Creating CHDFS for guidance.
2. Create a permission group as needed. Refer to Creating a Permission Group for guidance.
3. Create permission rules as needed. Refer to Creating Permission Rules for guidance.
4. Create a mount point in the same network as the EMR cluster. Refer to Creating a Mount Point for guidance.
5. Use the hadoop fs command-line tool to check connectivity between CHDFS and the EMR cluster. Run the hadoop fs -ls ofs://${mountpoint}/ command, where mountpoint is the mount address of the mount point created in step 4. If the file list is displayed without errors, CHDFS has been mounted successfully.
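
The connectivity check in step 5 can be wrapped in a small shell function so that it is easy to rerun. This is a minimal sketch, not an official tool: the function name check_chdfs_mount and its messages are hypothetical, and the mount address passed to it is a placeholder that you replace with your own. The only real command it relies on is hadoop fs -ls with the ofs:// scheme, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of EMR or CHDFS): lists the root directory
# of a CHDFS mount point to confirm that the EMR node can reach it.
check_chdfs_mount() {
    mountpoint="$1"

    # The mount address is required.
    if [ -z "$mountpoint" ]; then
        echo "usage: check_chdfs_mount <mountpoint>" >&2
        return 2
    fi

    # The check only makes sense on a node where the hadoop client exists.
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop command not found; run this on an EMR node" >&2
        return 127
    fi

    # Same command as in step 5: a successful listing means the mount works.
    if hadoop fs -ls "ofs://${mountpoint}/"; then
        echo "CHDFS mounted: ofs://${mountpoint}/ is reachable"
    else
        echo "CHDFS check failed for ofs://${mountpoint}/" >&2
        return 1
    fi
}
```

To use it, run check_chdfs_mount with your mount address as the only argument on a node of the EMR cluster. A non-zero exit status means the listing failed, in which case the permission group, permission rules, and mount point network from steps 2 to 4 are the first things to review.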

Scenario Two: Mounting CHDFS on an Existing Cluster

Note
Existing cluster: a cluster created before December 31, 2019.
To mount CHDFS on an existing EMR cluster, refer to Mounting CHDFS.