返回第一个元素
scala
scala> val rdd = sc.parallelize(List(1,2,3,3))
scala> rdd.first()
res1: Int = 1
java
JavaRDD...val rdd = sc.parallelize(List(1,2,3,3))
scala> rdd.take(2)
res3: Array[Int] = Array(1, 2)
java
JavaRDD...= sc.parallelize(List(1,2,3,3))
scala> rdd.collect()
res4: Array[Int] = Array(1, 2, 3, 3)
java
JavaRDD...scala> rdd.countByValue()
res6: scala.collection.Map[Int,Long] = Map(1 -> 1, 2 -> 1, 3 -> 2)
java
JavaRDD...Integer> top = rdd.top(2);
takeOrdered
rdd.take(n)
对RDD元素进行升序排序,取出前n个元素并返回,也可以自定义比较器(这里不介绍),类似于top的相反的方法