Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Out

Author: 别先生
Published: 2018-05-28 16:54:27

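1. When I was first learning Sqoop, I ran an import from MySQL into HDFS, but afterwards I could not find the corresponding data on HDFS. I hit the bug below today, so I am taking the opportunity to resolve that earlier problem as well. My earlier write-up is at https://cloud.tencent.com/developer/article/1352041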

[hadoop@slaver1 sqoop-1.4.5-cdh5.3.6]$ bin/sqoop import \
> --connect jdbc:mysql://slaver1:3306/test \
> --username root \
> --password 123456 \
> --table tb_user \
> --m 1
Warning: /home/hadoop/soft/sqoop-1.4.5-cdh5.3.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/soft/sqoop-1.4.5-cdh5.3.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/05/18 19:32:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
18/05/18 19:32:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/05/18 19:32:51 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/05/18 19:32:51 INFO tool.CodeGenTool: Beginning code generation
18/05/18 19:32:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/05/18 19:32:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/05/18 19:32:52 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/soft/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-hadoop/compile/cb147e9deb144db0034d6f38cb47ad68/tb_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/05/18 19:33:03 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/cb147e9deb144db0034d6f38cb47ad68/tb_user.jar
18/05/18 19:33:03 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/05/18 19:33:03 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/05/18 19:33:03 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/05/18 19:33:03 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/05/18 19:33:03 INFO mapreduce.ImportJobBase: Beginning import of tb_user
18/05/18 19:33:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/18 19:33:04 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/05/18 19:33:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/05/18 19:33:07 INFO client.RMProxy: Connecting to ResourceManager at slaver1/192.168.19.131:8032
18/05/18 19:33:08 WARN security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://slaver1:9000/user/hadoop/tb_user already exists
18/05/18 19:33:08 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://slaver1:9000/user/hadoop/tb_user already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:554)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:430)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:198)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:171)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:268)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

[hadoop@slaver1 sqoop-1.4.5-cdh5.3.6]$
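
The path in the error message is not arbitrary. When no `--target-dir` is given, Sqoop writes the import to a directory named after the table inside the user's HDFS home directory, which is why user `hadoop` importing `tb_user` ends up at `/user/hadoop/tb_user`. A minimal sketch of that rule, assuming the usual `/user/<name>` home directory layout (`default_target_dir` is a hypothetical helper name, not a Sqoop command):

```shell
#!/usr/bin/env bash
# default_target_dir is a hypothetical helper (not part of Sqoop or Hadoop).
# It reproduces the default import location: /user/<hdfs user>/<table>.
# Assumption: the HDFS home directory follows the common /user/<name> layout.
default_target_dir() {
  local user="$1" table="$2"
  echo "/user/$user/$table"
}

# Values taken from the log above: user "hadoop", table "tb_user".
default_target_dir hadoop tb_user

# Possible use on a cluster (not executed here):
#   hdfs dfs -ls "$(default_target_dir "$(whoami)" tb_user)"
```

Listing that directory is also a quick way to find data from an earlier import that seemed to be missing.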

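2. The error says the output directory already exists: `hdfs://slaver1:9000/user/hadoop/tb_user already exists`. The stack trace shows the check comes from `FileOutputFormat.checkOutputSpecs`, which refuses to submit a job whose output directory is already present, so a leftover directory from an earlier import blocks the new one. Following the path in the message, I deleted that directory. Running the import again then loaded the MySQL table data into HDFS.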

[hadoop@slaver1 ~]$ hdfs dfs -rm -r /user/hadoop/tb_user
18/05/18 19:34:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/18 19:34:07 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /user/hadoop/tb_user
[hadoop@slaver1 ~]$
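
If the delete is going into a script, it may be worth guarding it so that it only runs when the directory is actually there. The sketch below uses `hdfs dfs -test -d`, which exits with status 0 when the path exists and is a directory. `remove_if_exists` is a hypothetical helper name introduced for illustration. Note that the log above reports a trash deletion interval of 0 minutes, so on this cluster the removal is permanent.

```shell
#!/usr/bin/env bash
# remove_if_exists is a hypothetical helper (not part of Hadoop or Sqoop).
# It deletes an HDFS directory only when it exists, so that a later import
# does not fail with FileAlreadyExistsException.
remove_if_exists() {
  local dir="$1"
  # `hdfs dfs -test -d` returns 0 if the path exists and is a directory.
  if hdfs dfs -test -d "$dir"; then
    hdfs dfs -rm -r "$dir"
  else
    echo "nothing to remove: $dir"
  fi
}

# Usage on a cluster (path taken from the error message):
#   remove_if_exists /user/hadoop/tb_user
```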

3. Running the import again now succeeds:

[hadoop@slaver1 sqoop-1.4.5-cdh5.3.6]$ bin/sqoop import \
> --connect jdbc:mysql://slaver1:3306/test \
> --username root \
> --password 123456 \
> --table tb_user \
> --m 1
Warning: /home/hadoop/soft/sqoop-1.4.5-cdh5.3.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/soft/sqoop-1.4.5-cdh5.3.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/05/18 19:39:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
18/05/18 19:39:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/05/18 19:39:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/05/18 19:39:26 INFO tool.CodeGenTool: Beginning code generation
18/05/18 19:39:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/05/18 19:39:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/05/18 19:39:27 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/soft/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-hadoop/compile/464d9aff412a6285fd9f6f4c6d16b4e6/tb_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/05/18 19:39:29 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/464d9aff412a6285fd9f6f4c6d16b4e6/tb_user.jar
18/05/18 19:39:29 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/05/18 19:39:29 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/05/18 19:39:29 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/05/18 19:39:29 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/05/18 19:39:29 INFO mapreduce.ImportJobBase: Beginning import of tb_user
18/05/18 19:39:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/18 19:39:30 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/05/18 19:39:31 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/05/18 19:39:31 INFO client.RMProxy: Connecting to ResourceManager at slaver1/192.168.19.131:8032
18/05/18 19:39:37 INFO db.DBInputFormat: Using read commited transaction isolation
18/05/18 19:39:37 INFO mapreduce.JobSubmitter: number of splits:1
18/05/18 19:39:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1526642793183_0001
18/05/18 19:39:39 INFO impl.YarnClientImpl: Submitted application application_1526642793183_0001
18/05/18 19:39:39 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1526642793183_0001/
18/05/18 19:39:39 INFO mapreduce.Job: Running job: job_1526642793183_0001
18/05/18 19:39:58 INFO mapreduce.Job: Job job_1526642793183_0001 running in uber mode : false
18/05/18 19:39:58 INFO mapreduce.Job:  map 0% reduce 0%
18/05/18 19:40:09 INFO mapreduce.Job:  map 100% reduce 0%
18/05/18 19:40:09 INFO mapreduce.Job: Job job_1526642793183_0001 completed successfully
18/05/18 19:40:09 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=132974
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=153
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=8377
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=8377
        Total vcore-seconds taken by all map tasks=8377
        Total megabyte-seconds taken by all map tasks=8578048
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=102
        CPU time spent (ms)=1310
        Physical memory (bytes) snapshot=102690816
        Virtual memory (bytes) snapshot=841768960
        Total committed heap usage (bytes)=15794176
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=153
18/05/18 19:40:09 INFO mapreduce.ImportJobBase: Transferred 153 bytes in 38.0904 seconds (4.0168 bytes/sec)
18/05/18 19:40:09 INFO mapreduce.ImportJobBase: Retrieved 10 records.
[hadoop@slaver1 sqoop-1.4.5-cdh5.3.6]$ 
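
As an alternative to deleting the directory by hand before every run, the Sqoop 1.4.x user guide documents a `--delete-target-dir` import option that removes the target directory if it exists. The sketch below only prints a variant of the command from this post with that option added. It assumes the installed Sqoop build supports the flag (worth confirming with `bin/sqoop help import`), and it swaps `--password` for `-P`, as the warning in the log suggests, so the password is prompted for instead of typed on the command line. `build_import_cmd` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# build_import_cmd is a hypothetical helper (not part of Sqoop).
# It prints a re-runnable variant of the import command for review;
# it does not execute anything.
# Assumption: the installed Sqoop supports --delete-target-dir.
build_import_cmd() {
  local table="$1" target_dir="$2"
  echo "bin/sqoop import" \
    "--connect jdbc:mysql://slaver1:3306/test" \
    "--username root -P" \
    "--table $table" \
    "--target-dir $target_dir" \
    "--delete-target-dir" \
    "-m 1"
}

# Connection details are the ones used in this post.
build_import_cmd tb_user /user/hadoop/tb_user
```

Because `--delete-target-dir` discards whatever was imported previously, it suits full reloads of a table rather than directories that hold data you still need.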

4. The imported data looks like this:

[hadoop@slaver1 ~]$ hdfs dfs -cat /user/hadoop/tb_user/part-m-00000
18/05/18 19:41:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1,张三,15236083001
2,李四,15236083001
3,王五,15236083001
4,小明,15236083001
5,小红,15236083001
6,小别,15236083001
7,7,7
8,8,8
9,9,9
10,10,10
[hadoop@slaver1 ~]$ 
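
A quick sanity check is to compare the number of lines in the output files with the `Retrieved 10 records.` line in the Sqoop log. The sketch below assumes a plain-text import with one record per line, as in the output above. `count_records` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_records is a hypothetical helper (not part of Hadoop or Sqoop).
# It counts the lines across all map output files in an HDFS import directory.
# Assumption: text-format import, one record per line.
count_records() {
  local dir="$1"
  # The glob is quoted so that the HDFS shell expands it, not the local shell.
  hdfs dfs -cat "$dir/part-m-*" | wc -l | tr -d ' '
}

# Usage on a cluster (not executed here):
#   [ "$(count_records /user/hadoop/tb_user)" -eq 10 ] && echo "row count matches"
```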

Originally published 2018-05-18 on the author's personal site/blog.
