I have set things up so that I can develop locally, with all the benefits of IntelliJ, while still leveraging the power of a large Spark cluster on Azure. But when I try to read from or write to Azure Data Lake, e.g. spark.read.csv("abfss://blah.csv"), I get:

    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(…)
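Errors raised inside FileSystem$Cache while resolving an abfss:// URI usually mean the local driver cannot instantiate the ABFS filesystem: either hadoop-azure is missing from the classpath, or the URI does not name the container and fully qualified account host. A minimal sketch of a local session, assuming a hypothetical account "myaccount" and container "mycontainer" (the key value is a placeholder):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class AbfsLocalRead {
        public static void main(String[] args) {
            // Requires the hadoop-azure artifact (the ABFS connector) on the driver classpath.
            SparkSession spark = SparkSession.builder()
                    .appName("abfs-local")
                    .master("local[*]")
                    .getOrCreate();

            // Hypothetical account and key: register the storage key with the Hadoop conf.
            spark.sparkContext().hadoopConfiguration().set(
                    "fs.azure.account.key.myaccount.dfs.core.windows.net",
                    "<storage-account-key>");

            // abfss URIs must carry both the container and the account host,
            // not just a bare file name.
            Dataset<Row> df = spark.read().csv(
                    "abfss://mycontainer@myaccount.dfs.core.windows.net/path/blah.csv");
            df.show();
        }
    }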
A related question: the assignment

    JavaRDD<Tuple2<String, String>> pairRDD = someRDD.flatMap(…)

fails with:

    at org.apache.spark.rdd.RDD.flatMap(RDD.scala:295)
    at org.apache.spark.api.java.JavaRDDLike$class.flatMap(…)
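In the Java API, flatMap always yields a JavaRDD even when the elements are tuples, and on Spark 2.x the lambda must return an Iterator rather than an Iterable; a JavaPairRDD only comes from flatMapToPair. A minimal runnable sketch (someRDD and the line-to-token pairing are hypothetical stand-ins for the original code):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;

    import scala.Tuple2;

    public class FlatMapPairs {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("flatmap-pairs").master("local[*]").getOrCreate();
            JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

            JavaRDD<String> someRDD = jsc.parallelize(Arrays.asList("a b", "c d"));

            // flatMap yields a plain JavaRDD of tuples; the lambda returns an Iterator.
            JavaRDD<Tuple2<String, String>> pairRDD = someRDD.flatMap(line -> {
                List<Tuple2<String, String>> out = new ArrayList<>();
                for (String token : line.split(" ")) {
                    out.add(new Tuple2<>(line, token));
                }
                return out.iterator();
            });

            // For key-based operations (reduceByKey etc.), use flatMapToPair instead.
            JavaPairRDD<String, String> pairs = someRDD.flatMapToPair(line -> {
                List<Tuple2<String, String>> out = new ArrayList<>();
                for (String token : line.split(" ")) {
                    out.add(new Tuple2<>(line, token));
                }
                return out.iterator();
            });

            System.out.println(pairRDD.collect());
            System.out.println(pairs.collect());
        }
    }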
("com.databricks.spark.avro").load(files: _*)java.lang.IllegalArgumentException: java.net.URISyntaxExceptioncom.amazon.ws.emr.hadoop.fs.EmrFileSystem.globStatus(EmrFileSystem.java:362)
at org.apache.spark.deploy.SparkHadoopUtil.globPa
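An IllegalArgumentException wrapping a URISyntaxException inside globStatus usually means one of the strings handed to load contains a character that is illegal in a URI (a space, a %, an unescaped brace). One defensive option is to validate each path with java.net.URI before passing the list to the reader; a sketch under that assumption, with hypothetical S3 paths:

    import java.net.URI;
    import java.net.URISyntaxException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class AvroLoad {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("avro-load").master("local[*]").getOrCreate();

            // Hypothetical input list; real paths would come from a manifest or listing.
            List<String> files = Arrays.asList(
                    "s3://bucket/data/part-00000.avro",
                    "s3://bucket/data/part-00001.avro");

            // new URI(...) rejects the same characters that later blow up in globStatus,
            // so malformed entries can be reported up front instead of deep in a job.
            List<String> safe = new ArrayList<>();
            for (String f : files) {
                try {
                    new URI(f);
                    safe.add(f);
                } catch (URISyntaxException e) {
                    System.err.println("Skipping malformed path: " + f);
                }
            }

            // DataFrameReader.load takes varargs, the Java analogue of load(files: _*).
            // Requires the spark-avro (com.databricks) package on the classpath.
            Dataset<Row> df = spark.read()
                    .format("com.databricks.spark.avro")
                    .load(safe.toArray(new String[0]));
            df.printSchema();
        }
    }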
I am trying to use flatMap with Spark 2.1.0 in Java 8:

    JavaDStream<String> words = lines.flatMap(x -> Arrays.asList(x.split(" ")).iterator());

but compilation fails with:

    Error:(31, 25) java: method flatMap in class org.apache.spark…
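On Spark 2.x, FlatMapFunction.call returns an Iterator, so the iterator-returning lambda above is correct; this compile error usually means an older 1.x spark-streaming artifact, whose flatMap expected an Iterable, is still on the classpath. A minimal sketch against the 2.1.0 API, with a hypothetical socket source:

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class WordSplit {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("word-split").setMaster("local[2]");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

            // Hypothetical source; any JavaDStream<String> behaves the same way.
            JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);

            // Spark 2.x: the lambda returns an Iterator<String>.
            // On Spark 1.x the identical lambda fails, because flatMap expected an Iterable.
            JavaDStream<String> words = lines.flatMap(x -> Arrays.asList(x.split(" ")).iterator());

            words.print();
            jssc.start();
            jssc.awaitTermination();
        }
    }

Checking that spark-core, spark-streaming, and any Kafka integration all resolve to the same 2.1.0 version in the build file is the usual fix.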
"bootstrap.servers"-> "127.0.0.1:9092"), )Exception in thread "main" java.lang.ClassCastException$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap:
at org.apache.spark.rdd.RDD$$anonfun$flatMap$1.apply(RDD.scala:333) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala(RDD.scala:316)
at org.apache.<
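A ClassCastException out of a Kafka direct stream often comes from declaring the parameter map with the wrong value type, or from mixing incompatible kafka-client artifacts. A minimal sketch assuming the spark-streaming-kafka-0-10 integration and a hypothetical topic "events"; note the parameter map must be a Map<String, Object>, since the deserializers are passed as classes:

    import java.util.Arrays;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaDirect {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("kafka-direct").setMaster("local[2]");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

            // Values are Objects, not Strings: deserializers go in as Class objects.
            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "127.0.0.1:9092");
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "example-group"); // hypothetical group id
            kafkaParams.put("auto.offset.reset", "latest");

            Collection<String> topics = Arrays.asList("events"); // hypothetical topic

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                    KafkaUtils.createDirectStream(
                            jssc,
                            LocationStrategies.PreferConsistent(),
                            ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

            stream.map(ConsumerRecord::value).print();
            jssc.start();
            jssc.awaitTermination();
        }
    }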
A further excerpt:

    Ignoring this directory.
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:…)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap…
And finally, an encoder-derivation failure:

    …$$extractorFor$1.apply(ScalaReflection.scala:502)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(…)
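ScalaReflection's extractorFor runs while Spark derives an encoder, and it fails when a field of the target class cannot be mapped to a Catalyst type. In the Java API the usual fix is to keep the class a plain serializable bean with supported field types and build the encoder explicitly. A minimal sketch with a hypothetical Person bean:

    import java.io.Serializable;
    import java.util.Arrays;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.SparkSession;

    public class BeanEncoder {
        // Every field must be a type Catalyst understands (primitives, String,
        // java.sql.Timestamp, nested beans, ...); anything else breaks derivation.
        public static class Person implements Serializable {
            private String name;
            private int age;
            public Person() {}
            public Person(String name, int age) { this.name = name; this.age = age; }
            public String getName() { return name; }
            public void setName(String name) { this.name = name; }
            public int getAge() { return age; }
            public void setAge(int age) { this.age = age; }
        }

        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("bean-encoder").master("local[*]").getOrCreate();

            // Building the encoder explicitly surfaces unsupported fields immediately,
            // instead of deep inside reflection during a query.
            Dataset<Person> people = spark.createDataset(
                    Arrays.asList(new Person("Ada", 36), new Person("Linus", 54)),
                    Encoders.bean(Person.class));
            people.show();
        }
    }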