
Compiling and Running MapReduce Programs with Eclipse

By 不愿意做鱼的小鲸鱼 · Published 2022-09-24 09:59:27 · Column: web全栈

Detailed configuration document

I studied MapReduce quite a while back; the detailed notes and step-by-step operations are in the document below. Click to download. Link: https://pan.baidu.com/s/1BIBpClKy2xcqAJtxUJoYVA Extraction code: ctca

1. WordCount

Counts how many times each word appears across a set of files. The code is as follows.
  • TokenizerMapper.java

Code language: Java
package com.test;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Split each input line on single spaces and emit (word, 1) per token.
        String line = value.toString();
        String[] words = line.split(" ");
        for (String word : words) {
            context.write(new Text(word), new IntWritable(1));
        }
    }

}
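Splitting on a single space emits empty tokens whenever words are separated by multiple spaces or tabs. As a minimal sketch (the class below is hypothetical, not part of the original project), java.util.StringTokenizer, which the class name hints at, skips runs of whitespace:

Code language: Java

package com.test;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical variant of TokenizerMapper: StringTokenizer skips runs of
// whitespace, so consecutive spaces or tabs produce no empty "words".
public class WhitespaceTokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();  // reused to avoid per-token allocation

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);  // emit (word, 1) for each token
        }
    }
}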
  • IntSumReducer.java
Code language: Java
package com.test;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Running sequence number prepended to each output key.
    public static Integer num = 0;

    @Override
    public void reduce(Text key2, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        Integer count = 0;
        num++;
        // Sum the 1s the mapper emitted for this word.
        for (IntWritable value : values) {
            count += value.get();
        }
        // The output key has the form "<sequence number> <word>".
        Text key1 = new Text(num.toString() + " " + key2);
        context.write(key1, new IntWritable(count));
    }

}
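One caveat: num is a static field, so each reduce-task JVM keeps its own counter, and the sequence numbers are only globally consistent when the job runs a single reduce task. A one-line sketch for the driver (assuming the Job instance built in WordCount.java below):

Code language: Java

// Sketch (assumption): force exactly one reduce task so the static num
// counter in IntSumReducer yields a single, gap-free global numbering.
job.setNumReduceTasks(1);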
  • WordCount.java
Code language: Java
package com.test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        // Both the map output and the final output are (Text, IntWritable).
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Paths live on HDFS; the output directory must not exist before the run.
        FileInputFormat.addInputPath(job, new Path("hdfs://192.168.119.128:9000/input"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.119.128:9000/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}
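Hard-coding the HDFS URIs ties the driver to one cluster. A common alternative (a sketch; WordCountDriver is a hypothetical name, not from the original post) reads the paths from the command line via org.apache.hadoop.util.GenericOptionsParser:

Code language: Java

package com.test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// Hypothetical driver variant: input/output paths come from args
// instead of being hard-coded.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options (-D, -fs, ...) and keep the rest.
        String[] rest = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (rest.length < 2) {
            System.err.println("Usage: WordCountDriver <input path> <output path>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(rest[0]));
        FileOutputFormat.setOutputPath(job, new Path(rest[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}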

Run results

[Screenshot: output of the WordCount job]

2. RemoveSame

Removes duplicated words from a set of files. The code is as follows.
  • rsmapper.java

Code language: Java
package removesame;

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class rsmapper extends Mapper<Object, Text, Text, NullWritable> {

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Emit the whole line as the key: the shuffle groups identical lines,
        // so duplicates collapse before reaching the reducer.
        String line = value.toString();
        context.write(new Text(line), NullWritable.get());
    }
}
  • rsreduce.java
Code language: Java
package removesame;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class rsreduce extends Reducer<Text, NullWritable, IntWritable, Text> {
    // Sequence number for the deduplicated lines, starting at 0.
    public static int num = 0;

    @Override
    public void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
        // Each unique line arrives here exactly once; write it with its number.
        context.write(new IntWritable(num), key);
        num++;
    }
}
  • rsmapreduce.java
Code language: Java
package removesame;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class rsmapreduce {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the local filesystem (file:///) so the Windows paths below
        // resolve without an HDFS cluster.
        conf.set("fs.defaultFS", "file:///");
        Job job = Job.getInstance(conf, "JobName");
        job.setJarByClass(rsmapreduce.class);
        job.setMapperClass(rsmapper.class);
        job.setReducerClass(rsreduce.class);
        // Map output is (Text, NullWritable); final output is (IntWritable, Text).
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job, new Path("F:\\native_file\\removesame\\input"));
        FileOutputFormat.setOutputPath(job, new Path("F:\\native_file\\removesame\\output"));

        if (!job.waitForCompletion(true))
            return;
    }

}
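Every duplicate line crosses the network as its own (line, NullWritable) record. A map-side combiner can collapse duplicates per map task before the shuffle; it must keep the map output types (Text, NullWritable), so rsreduce, whose output key is IntWritable, cannot be reused. A minimal sketch (rscombiner is a hypothetical class, not part of the original post):

Code language: Java

package removesame;

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical combiner: emits each line once per map task, preserving the
// map output types (Text, NullWritable) as combiners must.
public class rscombiner extends Reducer<Text, NullWritable, Text, NullWritable> {
    @Override
    public void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
        context.write(key, NullWritable.get());  // one copy per unique line
    }
}

It would be registered in the driver with job.setCombinerClass(rscombiner.class) before submission.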

The results are as follows:

[Screenshot: output of the RemoveSame job]

3. Sort

Uses MapReduce to sort a set of numbers.

The code is as follows:

Code language: Java
package sort;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Sort {

    public static class Map extends Mapper<Object, Text, IntWritable, NullWritable> {
        private static IntWritable data = new IntWritable();

        // Parse each line as an integer and emit it as the map output key;
        // the shuffle sorts IntWritable keys ascending, which does the real work.
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            data.set(Integer.parseInt(line));
            context.write(data, NullWritable.get());
        }
    }

    public static class Reduce extends Reducer<IntWritable, NullWritable, IntWritable, NullWritable> {
        // Keys arrive already sorted; each unique value is written once,
        // so duplicates collapse into a single output line.
        @Override
        public void reduce(IntWritable key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
            context.write(key, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the local filesystem (file:///) so the Windows paths below resolve.
        conf.set("fs.defaultFS", "file:///");
        Job job = Job.getInstance(conf, "Data Sort");
        job.setJarByClass(Sort.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(job, new Path("F:\\native_file\\sort\\input"));
        FileOutputFormat.setOutputPath(job, new Path("F:\\native_file\\sort\\output"));
        boolean finish = job.waitForCompletion(true);
        if (finish) {
            System.out.println("Congratulations");
        }
    }

}
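The ascending order falls out of the framework: the shuffle sorts map output keys, and IntWritable compares numerically. To sort in descending order instead, one hedged option is to register a reversing comparator with job.setSortComparatorClass (DescendingIntComparator below is a hypothetical class, not from the original post):

Code language: Java

package sort;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Hypothetical comparator: inverts IntWritable's natural order so the
// shuffle hands keys to the reducer largest-first.
public class DescendingIntComparator extends WritableComparator {
    public DescendingIntComparator() {
        super(IntWritable.class, true);  // true: instantiate keys for compare()
    }

    @Override
    @SuppressWarnings({ "rawtypes", "unchecked" })
    public int compare(WritableComparable a, WritableComparable b) {
        return b.compareTo(a);  // reverse of the natural ascending order
    }
}

In main, job.setSortComparatorClass(DescendingIntComparator.class); would go in before waitForCompletion.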

Run results

[Screenshot: console output of the Sort job]

Sorted results

[Screenshot: the sorted output file]