我试图在Hadoop集群上启动H2O。遗憾的是,它不起作用,给了我一个错误,即找不到类water.hadoop.h2omapper。
Hadoop环境是版本2.6中的HDP,包含5个节点,其中1个运行纱线资源管理器,3个节点是纱线客户端的数据节点。每个数据节点都有32 of和4个CPU核的资源。它们上没有运行其他应用程序。我在Ambari的每个节点上配置了最多16 in和3芯的纱线应用程序。
我从终端启动H2O集群(在所有节点上都进行了尝试,每个节点都有相同的错误),输出如下:
[root@host3 h2o-3.14.0.6-hdp2.6]# sudo -u hdfs hadoop jar h2odriver.jar -nodes 3 -mapperXmx 6g -output h2o-test
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 192.168.20.35]
[Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 192.168.20.35:46619
(You can override these with -driverif and -driverport/-driverportrange.)
Memory Settings:
mapreduce.map.java.opts: -Xms6g -Xmx6g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dlog4j.defaultInitOverride=true
Extra memory percent: 10
mapreduce.map.memory.mb: 6758
17/10/13 07:49:14 INFO client.RMProxy: Connecting to ResourceManager at host2/192.168.20.34:8050
17/10/13 07:49:14 INFO client.AHSProxy: Connecting to Application History server at host2/192.168.20.34:10200
17/10/13 07:49:15 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
17/10/13 07:49:15 INFO mapreduce.JobSubmitter: number of splits:3
17/10/13 07:49:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507793796947_0002
17/10/13 07:49:15 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
17/10/13 07:49:15 INFO impl.YarnClientImpl: Submitted application application_1507793796947_0002
17/10/13 07:49:15 INFO mapreduce.Job: The url to track the job: http://host2:8088/proxy/application_1507793796947_0002/
Job name 'H2O_86929' submitted
JobTracker job ID is 'job_1507793796947_0002'
For YARN users, logs command is 'yarn logs -applicationId application_1507793796947_0002'
Waiting for H2O cluster to come up...
17/10/13 07:49:29 INFO client.RMProxy: Connecting to ResourceManager at host2/192.168.20.34:8050
17/10/13 07:49:29 INFO client.AHSProxy: Connecting to Application History server at host2/192.168.20.34:10200
----- YARN cluster metrics -----
Number of YARN worker nodes: 3
----- Nodes -----
Node: http://host5:8042 Rack: /default-rack, RUNNING, 1 containers used, 4,0 / 16,0 GB used, 1 / 3 vcores used
Node: http://host4:8042 Rack: /default-rack, RUNNING, 0 containers used, 0,0 / 16,0 GB used, 0 / 3 vcores used
Node: http://host3:8042 Rack: /default-rack, RUNNING, 0 containers used, 0,0 / 16,0 GB used, 0 / 3 vcores used
----- Queues -----
Queue name: default
Queue state: RUNNING
Current capacity: 0,11
Capacity: 1,00
Maximum capacity: 1,00
Application count: 1
----- Applications in this queue -----
Application ID: application_1507793796947_0002 (H2O_86929)
Started: hdfs (Fri Oct 13 07:49:15 CEST 2017)
Application state: FINISHED
Tracking URL: http://host2:8088/proxy/application_1507793796947_0002/
Queue name: default
Used/Reserved containers: 1 / 0
Needed/Used/Reserved memory: 4,0 GB / 4,0 GB / 0,0 GB
Needed/Used/Reserved vcores: 1 / 1 / 0
Queue 'default' approximate utilization: 4,0 / 48,0 GB used, 1 / 9 vcores used
----------------------------------------------------------------------
ERROR: Unable to start any H2O nodes; please contact your YARN administrator.
A common cause for this is the requested container size (6,6 GB)
exceeds the following YARN settings:
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mbYarn应用程序的系统日志中相应的错误条目:
2017-10-13 07:49:24,505 FATAL [IPC Server handler 1 on 40503] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1507793796947_0002_m_000002_0 - exited : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more
2017-10-13 07:49:24,506 INFO [IPC Server handler 1 on 40503] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1507793796947_0002_m_000002_0: Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more
2017-10-13 07:49:24,507 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1507793796947_0002_m_000002_0: Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 8 more完整日志是可用的这里。
任何帮助都将不胜感激。
向你问好,马库斯
发布于 2017-10-12 18:33:46
根据您提供的部分日志,下面的行可以帮助您了解一些内容:
2017-10-12 07:45:02,172 FATAL [IPC Server handler 1 on 39365]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
attempt_1507726330188_0001_m_000002_0 - exited :
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class water.hadoop.h2omapper not found显然,由于缺少类,任务id #2实例#0 (或第一个实例)失败。因此,如果任务#2显示错误,则意味着另一个任务#1已经在运行。这也意味着作业已经开始或处于运行状态。因此,问题发生在作业执行阶段。这意味着执行此特定任务的DataNode无法找到H2O驱动程序库。因此,由于某些原因,节点上的节点或文件系统可能不可用。如果你研究你的详细日志,你将能够看到为什么会发生这种情况。
更多细节
由于权限或文件系统问题,任何映射器都无法访问h2odriver.jar,这就是您的作业未启动的原因。您应该确保正确地修复Hadoop权限和可访问性,这样就可以在没有root用户的情况下启动任何"hadoop“命令,并且需要超级用户别名。
发布于 2017-10-13 10:25:40
你还能做其他的纱线工作吗?(比如pi示例)。
应用程序id显示,您的Hadoop集群是在2017年10月12日清华启动的07:36:36世界协调时(即昨天),这是集群尝试运行的第一个(和第二个)作业。
而且,集群中节点的大小真的很小。
在我看来,所有这些看起来都像是在尝试成为您自己的Hadoop管理员,但是还没有让它发挥作用。:)
继续尝试,当您的集群配置正确时,H2O将运行。
https://stackoverflow.com/questions/46702563
复制相似问题