I am writing a map function that emits some user_id as the key, where the value is also of type Text. This is how I did it:
/* imports needed at the top of popularCategories.java */
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class UserMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text userid = new Text();
    private Text catid = new Text();

    /* map method */
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ","); /* fields separated by "," */
        int count = 0;
        userid.set(itr.nextToken()); /* first field is the user id */
        while (itr.hasMoreTokens()) {
            if (++count == 3) {
                catid.set(itr.nextToken());
                context.write(userid, catid);
            } else {
                itr.nextToken(); /* skip the other fields */
            }
        }
    }
}
Then, in the main program, I set the mapper's output classes like this:
Job job = new Job(conf, "Customer Analyzer");
job.setJarByClass(popularCategories.class);
job.setMapperClass(UserMapper.class);
job.setCombinerClass(UserReducer.class);
job.setReducerClass(UserReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
So even though I have already set the map output value class to Text.class, I still get the following error when compiling:
popularCategories.java:39: write(org.apache.hadoop.io.Text,org.apache.hadoop.io.IntWritable)
in org.apache.hadoop.mapreduce.TaskInputOutputContext<java.lang.Object,
org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,
org.apache.hadoop.io.IntWritable>
cannot be applied to (org.apache.hadoop.io.Text,org.apache.hadoop.io.Text)
context.write(userid, catid);
^
Based on this error, the compiler is still resolving the mapper's context.write against this signature: write(org.apache.hadoop.io.Text, org.apache.hadoop.io.IntWritable).
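If I read the error correctly, the Context passed to map is typed with the same generic parameters as the Mapper class itself, so the compiler resolves write() from the fourth type parameter (IntWritable) no matter what the Job is configured with. Below is a minimal stand-alone analogue that shows the same behaviour (ToyContext, ToyMapper, and ToyUserMapper are names made up for illustration, not Hadoop classes):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class ToyGenerics {
    /* plays the role of TaskInputOutputContext: write() is fixed by the
       class-level type parameters KEYOUT / VALUEOUT */
    static class ToyContext<KEYOUT, VALUEOUT> {
        void write(KEYOUT key, VALUEOUT value) { /* no-op, illustration only */ }
    }

    /* plays the role of Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> */
    static class ToyMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
        ToyContext<KEYOUT, VALUEOUT> context = new ToyContext<KEYOUT, VALUEOUT>();
    }

    /* declared like my UserMapper: the value-out parameter is IntWritable */
    static class ToyUserMapper extends ToyMapper<Object, Text, Text, IntWritable> {
        void demo() {
            context.write(new Text("user"), new IntWritable(1));  /* compiles */
            /* context.write(new Text("user"), new Text("cat"));     same compile error as mine */
        }
    }

    public static void main(String[] args) {
        new ToyUserMapper().demo();
    }
}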
So when I changed the class definition as follows, the problem went away:
public static class UserMapper extends Mapper<Object, Text, Text, Text> {
}
So I would like to understand the difference between the class definition (its generic type parameters) and setting the mapper output value class on the Job.
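For completeness, here is my understanding of how the two pieces relate, as a sketch (only the relevant lines are shown, assuming the rest of popularCategories stays the same): the generic parameters on the class declaration are what javac type-checks context.write against, while the setMapOutput*Class calls only describe the intermediate key/value types to the framework at runtime, so the two have to agree but neither replaces the other.

/* class declaration: the third and fourth type parameters (Text, Text)
   are what the compiler checks context.write(userid, catid) against */
public static class UserMapper extends Mapper<Object, Text, Text, Text> {
    /* same map method as above, now accepted by the compiler */
}

/* driver: these calls are runtime configuration only; they are not visible
   to the compiler when it checks the map method, so they must simply match
   the generics above */
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);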
https://stackoverflow.com/questions/35762385