HDFS Java API in Practice

1. Starting the Hadoop Cluster

Cluster installation: https://cloud.tencent.com/developer/article/1872854

Start commands:

start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
# The third command above is deprecated; use the following one instead
mapred --daemon start historyserver

2. Using the HDFS Shell

  • Create a local folder and two files
[dnn@master ~]$ mkdir /opt/hadoop-3.3.0/HelloHadoop
[dnn@master ~]$ vim /opt/hadoop-3.3.0/HelloHadoop/file1.txt

File contents:

hello hadoop
i am Michael
[dnn@master ~]$ vim /opt/hadoop-3.3.0/HelloHadoop/file2.txt

File contents:

learning BigData
very cool
  • Create an HDFS directory: hadoop fs -mkdir -p /InputData (-p creates parent directories as needed)
  • Check that it was created
[dnn@master ~]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x   - dnn supergroup          0 2021-03-13 06:50 /InputData
drwxr-xr-x   - dnn supergroup          0 2021-03-12 06:53 /InputDataTest
drwxr-xr-x   - dnn supergroup          0 2021-03-12 07:12 /OutputDataTest
drwxrwx---   - dnn supergroup          0 2021-03-12 06:19 /tmp
  • Upload and view the files
[dnn@master ~]$ hadoop fs -put /opt/hadoop-3.3.0/HelloHadoop/* /InputData
[dnn@master ~]$ hadoop fs -cat /InputData/file1.txt
hello hadoop
i am Michael
[dnn@master ~]$ hadoop fs -cat /InputData/file2.txt
learning BigData
very cool
  • View overall cluster status with hdfs dfsadmin -report
[dnn@master ~]$ hdfs dfsadmin -report
Configured Capacity: 36477861888 (33.97 GB)
Present Capacity: 23138791499 (21.55 GB)
DFS Remaining: 23136948224 (21.55 GB)
DFS Used: 1843275 (1.76 MB)
DFS Used%: 0.01%
Replicated Blocks:
	Under replicated blocks: 12
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.253.128:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 929792 (908 KB)
Non DFS Used: 6669701120 (6.21 GB)
DFS Remaining: 11568300032 (10.77 GB)
DFS Used%: 0.01%
DFS Remaining%: 63.43%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 13 06:57:09 CST 2021
Last Block Report: Sat Mar 13 06:49:24 CST 2021
Num of Blocks: 12


Name: 192.168.253.129:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 913483 (892.07 KB)
Non DFS Used: 6669369269 (6.21 GB)
DFS Remaining: 11568648192 (10.77 GB)
DFS Used%: 0.01%
DFS Remaining%: 63.43%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 13 06:57:09 CST 2021
Last Block Report: Sat Mar 13 06:45:42 CST 2021
Num of Blocks: 12

3. Using the HDFS Web UI

In the Web UI you can see that the replication factor is 3 and the block size is 128 MB.
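
The same two values can also be read through the Java API that the next section introduces. A minimal sketch (the class name ShowReplication is just for illustration), assuming the NameNode address hdfs://master:9000 used later in this article and the /InputData/file1.txt uploaded above:

package com.michael.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowReplication {

	public static void main(String[] args) throws IOException {
		String uri = "hdfs://master:9000/InputData/file1.txt";
		Configuration conf = new Configuration();
		FileSystem fs = FileSystem.get(URI.create(uri), conf);
		// Replication factor and block size recorded by the NameNode for this file
		FileStatus status = fs.getFileStatus(new Path(uri));
		System.out.println("replication: " + status.getReplication());
		System.out.println("block size : " + status.getBlockSize() / (1024 * 1024) + " MB");
		fs.close();
	}
}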

4. Installing the Eclipse IDE

4.1 Uploading a File

Write the code to upload a file:

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Copies a local file into HDFS and lists the result.
 * @author dnn
 */
public class UploadFile {

	public static void main(String[] args) throws IOException {
		Configuration conf = new Configuration();
		FileSystem hdfs = FileSystem.get(conf);
		// Local source file and HDFS destination (relative to the user's home directory)
		Path src = new Path("/opt/hadoop-3.3.0/HelloHadoop/file1.txt");
		Path dest = new Path("file1.txt");
		hdfs.copyFromLocalFile(src, dest);
		System.out.println("Upload to " + conf.get("fs.defaultFS"));
		// List the destination to confirm the copy
		FileStatus files[] = hdfs.listStatus(dest);
		for (FileStatus file : files) {
			System.out.println(file.getPath());
		}
	}
}

Run it in Eclipse: the file is not copied into HDFS, because with an empty Configuration, fs.defaultFS defaults to the local filesystem (file:///):

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Upload to file:///
file:/home/dnn/eclipse-workspace/HDFS_example/file1.txt

Listing HDFS confirms there is no file1.txt:

[dnn@master ~]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x   - dnn supergroup          0 2021-03-13 06:54 /InputData
drwxr-xr-x   - dnn supergroup          0 2021-03-12 06:53 /InputDataTest
drwxr-xr-x   - dnn supergroup          0 2021-03-12 07:12 /OutputDataTest
drwxrwx---   - dnn supergroup          0 2021-03-12 06:19 /tmp

Fix: set the default filesystem address explicitly.

		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://192.168.253.130:9000"); // add this line
		FileSystem hdfs = FileSystem.get(conf);

Output: correct this time, the file has been uploaded into HDFS:

Upload to hdfs://192.168.253.130:9000
hdfs://192.168.253.130:9000/user/dnn/file1.txt
[dnn@master Desktop]$ hadoop fs -ls -R /user
drwxr-xr-x   - dnn supergroup          0 2021-03-16 07:43 /user/dnn
-rw-r--r--   3 dnn supergroup         26 2021-03-16 07:43 /user/dnn/file1.txt
  • Running on the cluster:

1. Export the program from Eclipse as a jar file.
2. Run it from the shell with hadoop jar:

[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_uploadfile.jar com.michael.hdfs.UploadFile
Upload to hdfs://192.168.253.130:9000
hdfs://192.168.253.130:9000/user/dnn/file1.txt
[dnn@master Desktop]$ hadoop fs -ls -R /user
drwxr-xr-x   - dnn supergroup          0 2021-03-16 07:59 /user/dnn
-rw-r--r--   3 dnn supergroup         26 2021-03-16 07:59 /user/dnn/file1.txt
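
When the jar is launched with hadoop jar on the cluster, the cluster's configuration directory is already on the classpath, so fs.defaultFS is picked up automatically. For runs from inside Eclipse, an alternative to hardcoding the NameNode address is to load the cluster's configuration file explicitly; a minimal sketch, assuming core-site.xml is readable at the path below:

		Configuration conf = new Configuration();
		// Assumed local path to the cluster's core-site.xml; adjust to your installation
		conf.addResource(new Path("/opt/hadoop-3.3.0/etc/hadoop/core-site.xml"));
		FileSystem hdfs = FileSystem.get(conf); // fs.defaultFS now comes from the XML file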

4.2 Querying File Locations

package com.michael.hdfs;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileLoc {

	public static void main(String[] args) throws IOException{
		// Ask the NameNode where each block of file1.txt is stored
		String uri = "hdfs://master:9000/user/dnn/file1.txt";
		Configuration conf = new Configuration();
		try {
			FileSystem fs = FileSystem.get(URI.create(uri), conf);
			Path fpath = new Path(uri);
			FileStatus filestatus = fs.getFileStatus(fpath);
			BlockLocation [] blklocations = fs.getFileBlockLocations(filestatus, 0, filestatus.getLen());
			int blockLen = blklocations.length;
			for(int i = 0; i < blockLen; ++i) {
				String [] hosts = blklocations[i].getHosts();
				System.out.println("block" + i + "_location:" + hosts[0]);
			}
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_filelocation.jar com.michael.hdfs.FileLoc
block0_location:slave2
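
getHosts() returns every DataNode that stores a replica of a block; the loop above only prints the first entry. A small variation of the same loop, shown as a sketch, lists all replica locations:

			for (int i = 0; i < blklocations.length; ++i) {
				// One entry per DataNode holding a replica of block i
				String[] hosts = blklocations[i].getHosts();
				System.out.println("block" + i + "_locations: " + String.join(", ", hosts));
			}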

4.3 Creating a Directory

package com.michael.hdfs;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;


public class CreatDir {

	public static void main(String[] args) {
		// Create the /test directory on HDFS
		String uri = "hdfs://master:9000";
		Configuration conf = new Configuration();
		try {
			FileSystem fs = FileSystem.get(URI.create(uri), conf);
			Path dfs = new Path("/test");
			boolean flag = fs.mkdirs(dfs);
			System.out.println(flag ? "create success" : "create failure");
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_mkdir.jar com.michael.hdfs.CreatDir
create success
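
mkdirs() behaves like mkdir -p and also reports success when the directory already exists, so running the program twice prints "create success" both times. To tell the two cases apart, a sketch using exists() inside the same try block:

			Path dfs = new Path("/test");
			if (fs.exists(dfs)) {
				System.out.println("/test already exists");
			} else {
				System.out.println(fs.mkdirs(dfs) ? "create success" : "create failure");
			}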

4.4 Reading File Contents

package com.michael.hdfs;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FSDataInputStream;


public class ReadFile {

	public static void main(String[] args) {
		// Open file1.txt in the HDFS user home directory and print its first line
		try {
			Configuration conf = new Configuration();
			conf.set("fs.defaultFS", "hdfs://master:9000");
			conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
			FileSystem fs = FileSystem.get(conf);
			Path file = new Path("file1.txt");
			FSDataInputStream getIt = fs.open(file);
			BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
			String content = d.readLine();
			System.out.println(content);
			d.close();
			fs.close();
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_readfile.jar com.michael.hdfs.ReadFile
hello hadoop
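
readLine() returns only the first line of file1.txt. To print the whole file, the body of the try block can keep reading until readLine() returns null; a sketch of that variation:

			FSDataInputStream in = fs.open(file);
			BufferedReader reader = new BufferedReader(new InputStreamReader(in));
			String line;
			// Keep reading until the end of the file is reached
			while ((line = reader.readLine()) != null) {
				System.out.println(line);
			}
			reader.close();
			fs.close();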

4.5 Writing a File

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import com.michael.hdfs.ReadFile;

public class WriteFile {

	public static void main(String[] args) {
		// Create test_file.txt on HDFS, write one line into it, then read it back
		try {
			Configuration conf = new Configuration();
			conf.set("fs.defaultFS", "hdfs://master:9000");
			conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
			FileSystem fs = FileSystem.get(conf);
			byte[] buffer = "hello Michael !!!".getBytes();
			String filename = "test_file.txt";
			FSDataOutputStream os = fs.create(new Path(filename));
			os.write(buffer, 0 , buffer.length);
			System.out.println("create: " + filename);
			os.close();
			fs.close();
			ReadFile r = new ReadFile();
			r.read(filename);
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}
}
ReadFile is extended with a read(String filename) method so that WriteFile can reuse it:
package com.michael.hdfs;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FSDataInputStream;


public class ReadFile {

	public static void main(String[] args) {
		// Standalone entry point: read test_file.txt and print its first line
		try {
			Configuration conf = new Configuration();
			conf.set("fs.defaultFS", "hdfs://master:9000");
			conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
			FileSystem fs = FileSystem.get(conf);
			Path file = new Path("test_file.txt");
			FSDataInputStream getIt = fs.open(file);
			BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
			String content = d.readLine();
			System.out.println(content);
			d.close();
			fs.close();
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}

	public void read(String filename) {
		// Read the first line of the given HDFS file
		try {
			Configuration conf = new Configuration();
			conf.set("fs.defaultFS", "hdfs://master:9000");
			conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
			FileSystem fs = FileSystem.get(conf);
			Path file = new Path(filename);
			FSDataInputStream getIt = fs.open(file);
			BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
			String content = d.readLine();
			System.out.println(content);
			d.close();
			fs.close();
		}
		catch(IOException e) {
			e.printStackTrace();
		}
	}
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_writefile.jar com.michael.hdfs.WriteFile
create: test_file.txt
hello Michael !!!
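
To copy the newly written file back to the local filesystem for inspection, copyToLocalFile() is the counterpart of the copyFromLocalFile() call from section 4.1. A minimal sketch, using the same FileSystem setup as WriteFile and a hypothetical local destination path:

			// Download test_file.txt from the HDFS user home directory to a local path
			fs.copyToLocalFile(new Path("test_file.txt"), new Path("/home/dnn/Downloads/test_file.txt"));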