
MapReduce in Java to search for a string

Stack Overflow user
Asked 2016-04-30 11:09:15
2 answers · 301 views · 0 followers · 0 votes

I am trying to search a text file for a particular string and count its occurrences, but after running this code I get a ClassCastException involving LongWritable:

Error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
        at searchaString.SearchDriver$searchMap.map(SearchDriver.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

16/04/30 02:48:17 INFO mapreduce.Job:  map 0% reduce 0%
16/04/30 02:48:23 INFO mapreduce.Job: Task Id : attempt_1461630807194_0021_m_000000_2, Status : FAILED
Error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

package samples.wordcount;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
//import org.apache.hadoop.util.GenericOptionsParser;
//import org.apache.hadoop.mapred.lib.NLineInputFormat;
import java.io.IOException;
import java.util.Iterator;


public class WordCount {

    public static void main(String[] args) throws Exception {

        @SuppressWarnings("unused")
        JobClient jobC =new JobClient();

        Configuration conf = new Configuration();
        //String args[] = parser.getRemainingArgs();

        Job job = Job.getInstance(conf);
        job.setJobName("WordCount");


        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setJarByClass(WordCount.class);

        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        //job.setInputFormatClass(TextInputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        /*String MyWord = args[2];
        TokenizerMapper.find = MyWord;*/

        System.exit(job.waitForCompletion(true) ?  0:1);
    }

    public static class TokenizerMapper extends Mapper<Text, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        //  private Text word = new Text();
        static String find="txt was not created";
        public int i;

        public void map(Text key, Text value,OutputCollector<Text, IntWritable> output,Reporter reporter) throws IOException, InterruptedException
        {
            String cleanLine = value.toString();        

            String[] cleanL =cleanLine.split("home");

            output.collect(new Text(cleanL[1]), one);

        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {



        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,Reporter reporter)
                throws IOException, InterruptedException {

            int sum = 0;

            String wordText="txt was not created";

            while(values.hasNext()) {

                Boolean check = values.toString().contains("txt was not created");

                if(check)
                {
                    String[] cleanL =values.toString().split("\\.");

                    for(String w : cleanL)
                    {
                        if(w.length()>=wordText.length())

                        {
                            String wrd = w.substring(0,wordText.length()); 

                            if(wrd.equals(wordText))
                            {
                                IntWritable value=values.next();
                                sum += value.get();

                            }

                        }
                    }
                }
            }
            output.collect(key,new IntWritable(sum));
        }
    }
}

I am new to MapReduce and don't know what to do here.

This is also what my text file looks like:

txt was not created tab/hdhd/hip/home.slkj.skjdh.dgsyququ/djkdjjd.**text**

where "text" is the particular text whose occurrences I have to search for.

Please reply. If you share a solution, please briefly explain what I should change in the code.

Thanks.


2 Answers

Stack Overflow user
Answered 2016-04-30 17:48:42

You have declared the Mapper class signature as:

public static class TokenizerMapper extends Mapper<Text, Text, Text, IntWritable>

The input key that the map method receives is the byte offset of the line within the file. For example, if the following is the content of your file:

Hello World!

the map function receives the byte offset of the first line as the key and "Hello World!" as the value. The byte offset is a long value.

Change the input key type to LongWritable.
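The byte offsets that TextInputFormat hands to the mapper as its LongWritable key can be reproduced in plain Java. This is a standalone sketch with no Hadoop dependency; the class and method names are mine:

```java
import java.util.ArrayList;
import java.util.List;

// Computes the byte offset at which each line of the input starts --
// the same value TextInputFormat supplies as the mapper's LongWritable key.
class ByteOffsets {
    static List<Long> lineOffsets(String text) {
        List<Long> offsets = new ArrayList<>();
        long offset = 0;
        for (String line : text.split("\n", -1)) {
            offsets.add(offset);
            offset += line.getBytes().length + 1; // +1 for the newline byte
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "Hello World!" is 12 bytes plus a newline, so the second
        // line starts at byte offset 13.
        System.out.println(lineOffsets("Hello World!\nsecond line"));
    }
}
```

This is why declaring the map input key as Text fails at runtime: the framework passes a LongWritable, and the cast to Text throws the ClassCastException shown in the question.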

0 votes

Stack Overflow user
Answered 2016-05-01 20:36:31

Your new mapper: public class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable>, and your write call is cont.write(new Text(cleanL[1]), one);.

The value you write must match the declared output value type. Either change your signature to public class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> and write

cont.write(new Text(cleanL[1]), new Text("one"));

or keep public class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> and write

cont.write(new Text(cleanL[1]), new IntWritable(1));

(The generic type parameters in this answer were stripped by the page extraction and are reconstructed here from the surrounding code.)
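As a side note on the question's map logic: what cleanL[1] holds after cleanLine.split("home"), and the key/value pair that gets written, can be checked in plain Java. This is a hedged sketch without Hadoop; the sample line is hypothetical, and the length guard is mine (the original code would throw ArrayIndexOutOfBoundsException on any line that does not contain "home"):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Mirrors the question's mapper: split each line on the literal "home"
// and emit the text after it, paired with a count of 1.
class SplitDemo {
    static Map.Entry<String, Integer> mapLine(String line) {
        String[] parts = line.split("home");
        // parts[1] only exists when "home" occurs in the line,
        // so guard against lines without a match.
        return new SimpleEntry<>(parts.length > 1 ? parts[1] : "", 1);
    }

    public static void main(String[] args) {
        System.out.println(mapLine("txt was not created tab/hdhd/hip/home.slkj.skjdh"));
        // prints .slkj.skjdh=1
    }
}
```

Whichever output value type you settle on, the pair written here is what the reducer later receives grouped by key.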

0 votes
Original content provided by Stack Overflow; translation supplied by Tencent Cloud's translation engine.
Original link: https://stackoverflow.com/questions/36950261
