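I wrote a MapReduce program to estimate pi (3.141592...), shown below, but I ran into a problem:
The framework launched 11 map tasks, and the output below has 35 lines in total. I expected 11 lines of output. Is there something I'm missing?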
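INCIRCLE 78534096
INCIRCLE 78539304
INCIRCLE 78540871
INCIRCLE 78537925
INCIRCLE 78537161
INCIRCLE 78544419
INCIRCLE 78537045
INCIRCLE 78534861
INCIRCLE 78545779
INCIRCLE 78528890
INCIRCLE 78542686
INCIRCLE 78534539
INCIRCLE 78538255
INCIRCLE 78543392
INCIRCLE 78543191
INCIRCLE 78534882
INCIRCLE 78540938
INCIRCLE 785534882
INCIRCLE 78536155
INCIRCLE 78545739
INCIRCLE 7854739
INCIRCLE 78541807
INCIRCLE 78540635
INCIRCLE 78547561
INCIRCLE 7840521
INCIRCLE 78541320
INCIRCLE 78537605
INCIRCLE 78541379
INCIRCLE 78536238
INCIRCLE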
import java.io.IOException;
import java.util.Iterator;
import java.util.Random;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class PiEstimation {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, LongWritable> {

        private final static Text INCIRCLE = new Text("INCIRCLE");
        // Number of random points sampled per map() call.
        private final static LongWritable TimesInAMap = new LongWritable(100000000);
        private static Random random = new Random();

        // A point in the unit square.
        public class MyPoint {
            private double x = 0.0;
            private double y = 0.0;

            MyPoint(double _x, double _y) {
                this.x = _x;
                this.y = _y;
            }

            // True if the point lies inside the circle of radius 0.5 centered at (0.5, 0.5).
            public boolean inCircle() {
                if (((x - 0.5) * (x - 0.5) + (y - 0.5) * (y - 0.5)) <= 0.25)
                    return true;
                else
                    return false;
            }

            public void setPoint(double _x, double _y) {
                this.x = _x;
                this.y = _y;
            }
        }

        // Called once per input record; the key and value are ignored.
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, LongWritable> output, Reporter reporter)
                throws IOException {
            long i = 0;
            long N = TimesInAMap.get();
            MyPoint myPoint = new MyPoint(random.nextDouble(), random.nextDouble());
            long sum = 0;
            while (i < N) {
                if (myPoint.inCircle()) {
                    sum++;
                }
                myPoint.setPoint(random.nextDouble(), random.nextDouble());
                i++;
            }
            // One output record per map() call.
            output.collect(INCIRCLE, new LongWritable(sum));
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, LongWritable, Text, LongWritable> {

        // Passes every value through unchanged (the summing code is commented out).
        public void reduce(Text key, Iterator<LongWritable> values,
                           OutputCollector<Text, LongWritable> output, Reporter reporter)
                throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                //sum += values.next().get();
                output.collect(key, values.next());
            }
            //output.collect(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(PiEstimation.class);
        conf.setJobName("PiEstimation");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        conf.setNumMapTasks(10);
        conf.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
Posted on 2012-03-26 18:23:44
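The number of map tasks launched is determined by several factors: primarily the input format, the block size that the input file is chunked into, and whether the input file itself is splittable.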
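To make this concrete, here is a rough sketch of how a file-based input format might turn file sizes into a number of splits. It is a simplified model, written from my understanding of the old `org.apache.hadoop.mapred.FileInputFormat` rule (split size = max(minSize, min(goalSize, blockSize)), with about 10% slack on the last split), not a copy of the Hadoop code. The class name `SplitEstimate`, the method names, and the file sizes are all hypothetical. Note that `setNumMapTasks` is only a hint to the framework.

```java
// Simplified model of split counting; all names and numbers are illustrative.
class SplitEstimate {
    // Assumption: the last chunk may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size is the goal size, capped by the block size and floored by minSize.
    static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Counts splits over all input files; splits never span two files.
    static int countSplits(long[] fileLengths, int requestedMaps,
                           long minSize, long blockSize) {
        long total = 0;
        for (long len : fileLengths) {
            total += len;
        }
        long goalSize = total / Math.max(1, requestedMaps);
        long size = splitSize(goalSize, minSize, blockSize);

        int splits = 0;
        for (long len : fileLengths) {
            long remaining = len;
            while ((double) remaining / size > SPLIT_SLOP) {
                splits++;
                remaining -= size;
            }
            if (remaining != 0) {
                splits++; // the tail of each file becomes its own split
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;
        // One 1000-byte file, 10 maps requested.
        System.out.println(countSplits(new long[] {1000}, 10, 1, blockSize));
        // Same request, plus a second small file.
        System.out.println(countSplits(new long[] {1000, 50}, 10, 1, blockSize));
    }
}
```

Under this model the first call yields 10 splits and the second yields 11, which shows one hypothetical way a request for 10 map tasks can end up as 11. Whether that is what happened in the question depends on the actual input files.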
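Also, the number of times map is called depends on the number of records in each map split (the data that mapper is processing).
Say you have a 100-line text file as input. Most probably it will be processed by a single Mapper, but the map method will be called 100 times, once for each line in the input file.
If you count the number of lines in your input file(s), that is the number of times map will be called across all mappers. Your map method emits one record per call and your reducer passes each value through unchanged, so the number of output lines follows the number of input lines, not the number of map tasks. It is difficult to determine exactly how many times map will be called in each individual Mapper.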
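The difference between map tasks and map calls can be shown with a small stand-alone simulation. This is only a sketch with no Hadoop dependency: `MapCallCount`, `runJob`, and `roundRobin` are hypothetical names, and the way 35 lines are spread over 11 splits is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model: one "task" per split, one map() call per line in the split.
class MapCallCount {
    // Stand-in for the question's map(): ignores its input and emits one record.
    static void map(String line, List<String> output) {
        output.add("INCIRCLE");
    }

    // Runs every split through map() and collects all emitted records.
    static List<String> runJob(List<List<String>> splits) {
        List<String> output = new ArrayList<>();
        for (List<String> split : splits) {   // one map task per split
            for (String line : split) {       // one map() call per record
                map(line, output);
            }
        }
        return output;
    }

    // Distributes the given number of lines over the given number of splits.
    static List<List<String>> roundRobin(int lines, int tasks) {
        List<List<String>> splits = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            splits.add(new ArrayList<>());
        }
        for (int i = 0; i < lines; i++) {
            splits.get(i % tasks).add("line " + i);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<List<String>> splits = roundRobin(35, 11);
        System.out.println("map tasks: " + splits.size());           // map tasks: 11
        System.out.println("output lines: " + runJob(splits).size()); // output lines: 35
    }
}
```

However the lines are distributed, the output size equals the total line count, because each line triggers exactly one emit.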
https://stackoverflow.com/questions/9869986