When I try to build Nutch 2.1 on my Windows 7 machine, I get the following error:
Buildfile: C:\apache-nutch-2.1\build.xml
[taskdef] Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found.
ivy-probe-antlib:
ivy-download:
[taskdef] Could not load definitions from resource org/sonar/ant/antlib.xml. It co
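I am trying to use Apache Nutch to crawl some websites. While following a tutorial on YouTube I ran into many errors and managed to resolve them, but the one I am facing now is really hard for me to understand. Please help. The tutorial I am following is:
After getting HBase running successfully, the next step is to run the following command in the nutch folder: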
sudo ant runtime
Every time I run this command, I get this error:
Build Failed
/nutch/apache-nutch-2.3/build.xml:101: Compile failed; see the compiler error output for details.
Details are below. Note: I have left out the warnings and listed only the errors.
I am trying to inject seeds into Nutch.
The command I used:
bin/nutch inject /root/project/nutch-old/runtime/local/conf/urls/
The result is:
InjectorJob: starting at 2017-01-06 05:29:21
InjectorJob: Injecting urlDir: /root/project/nutch-old/runtime/local/conf/urls
InjectorJob: Using class org.apache.gora.accumulo.store.AccumuloStore as the Go
I am using Nutch 2.0 and Solr 4.0 and have had only minimal success. I have 3 URLs, and regex-urlfilter.xml is set to allow everything. I ran this script:
#!/bin/bash
# Nutch crawl
export NUTCH_HOME=~/java/workspace/Nutch2.0/runtime/local
# depth in the web exploration
n=1
# number of selected urls for fetching
maxUrls=50000
# solr server
solrUrl=http://localhost:8983
When I try to generate URLs with the generate command, I get the following error:
GeneratorJob: java.lang.RuntimeException: job failed: name=generate: 1357474131-234134646, org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54) at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:191) at org.apache.nutch.crawl.GeneratorJob.generate(G
When I run the nutch command to create the crawldb folder and its contents:
soporte@CNEOSYLAP /usr/local/apache-nutch-2.2.1/runtime/local
$ bin/nutch crawl urls -dir crawl -depth 3 -topN 5
I get this error:
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
Exception in thread "main" org.apache.hadoop.map
Hello, I am following a guide to running Nutch in Eclipse and working through it step by step.
I completed this step (Nutch 1.X) without any problems:
svn co https://svn.apache.org/repos/asf/nutch/trunk
cd trunk
Since I am working with 1.X, I skipped ahead to step #5.
Add “http.agent.name” and “http.robots.agents” with appropriate values in “conf/nutch-site.xml”. See conf/nutch-default.xml for the description of
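For reference, the configuration step quoted above could look roughly like the sketch below. The agent name `MyNutchSpider` and the `CONF_DIR` variable are placeholders I made up for illustration, not values from the guide. The script overwrites any existing `nutch-site.xml` in the target directory, so by default it writes to a temporary directory rather than a real Nutch `conf/` folder.

```shell
#!/bin/bash
# Hypothetical target directory; set CONF_DIR to your real Nutch conf/
# folder only after backing up any existing nutch-site.xml.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR"

# Write a minimal nutch-site.xml with the two properties named in step 5.
# "MyNutchSpider" is a placeholder agent name, not a required value.
cat > "$CONF_DIR/nutch-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>MyNutchSpider</value>
  </property>
  <property>
    <name>http.robots.agents</name>
    <value>MyNutchSpider,*</value>
  </property>
</configuration>
EOF

# Show where the file went and list the properties that were written.
echo "Wrote $CONF_DIR/nutch-site.xml"
grep '<name>' "$CONF_DIR/nutch-site.xml"
```

The descriptions in conf/nutch-default.xml remain the authority on what each property expects. The values here only show the shape of the file.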
I am installing Nutch 2.2.1 on my CentOS virtual machine and get an error when injecting the seed URLs (directory name). I used this command:
/usr/share/apache-nutch-2.1/src/bin/nutch inject root/apache-nutch-2.1/src/testresources/testcrawl urls
I got this error:
Error: Could not find or load main class org.apache.nutch.crawl.InjectorJob
Similarly, for the command
/usr/share/apache-nutch-2.1/src/bin/nutch r