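Is it possible to create a Spark Streaming application using the Spark IPython boilerplate? Because the Spark context comes preconfigured with the notebook, this does not seem to be possible. I am trying a simple application: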
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("129.41.138.175", 9999)
// Split each line into words
val words = lines.flatMap(_.split(" "))
// Count each word in each batch
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
// Print the first ten elements of each RDD generated in this DStream 
wordCounts.print()
ssc.start()             // Start the computation
ssc.awaitTermination()  // Wait for the computation to terminate
Error:
Name: akka.actor.InvalidActorNameException
Message: actor name [JobScheduler] is not unique!
StackTrace: akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130)
akka.actor.dungeon.Children$class.reserveChild(Children.scala:77)
akka.actor.ActorCell.reserveChild(ActorCell.scala:369)
akka.actor.dungeon.Children$class.makeChild(Children.scala:202)
akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
akka.actor.ActorCell.attachChild(ActorCell.scala:369)
akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552)
org.apache.spark.streaming.scheduler.JobScheduler.start(JobScheduler.scala:58)
...
Posted on 2015-10-30 04:10:35
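We do support Spark Streaming. It is built into the Spark that we deploy on Bluemix. However, it depends on the Spark version and the language used. Earlier Spark versions, such as 1.3.1, do not support streaming for Python. The current version is 1.4.1.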
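One possible cause of the `actor name [JobScheduler] is not unique!` error, not confirmed by the answer above, is that a `StreamingContext` started by an earlier run of the cell is still active in the same notebook kernel. The following is a minimal sketch under that assumption: it stops the old streaming context while keeping the notebook's preconfigured `sc` alive, then starts a fresh one with a bounded wait so the cell returns. The names `oldSsc` and `freshSsc`, the `"localhost"` host, and the 10-second timeout are placeholders for illustration, not values from the original post.

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumption: `sc` is the SparkContext that the notebook preconfigures.
// Assumption: `oldSsc` is a hypothetical name for a StreamingContext left
// over from a previous run of the cell (called `ssc` in the question).
// Stop only the streaming layer; do not shut down the notebook's SparkContext.
oldSsc.stop(stopSparkContext = false)

// Build a fresh streaming context on top of the same SparkContext.
val freshSsc = new StreamingContext(sc, Seconds(1))

// Placeholder host and port; substitute the real socket source.
val lines = freshSsc.socketTextStream("localhost", 9999)
val wordCounts = lines
  .flatMap(_.split(" "))      // split each line into words
  .map(word => (word, 1))     // pair each word with a count of 1
  .reduceByKey(_ + _)         // sum the counts per word in each batch
wordCounts.print()

freshSsc.start()
// Wait at most 10 seconds (value in milliseconds) instead of blocking the
// notebook cell indefinitely with awaitTermination().
freshSsc.awaitTerminationOrTimeout(10000)
// Stop streaming again so that re-running the cell starts from a clean state.
freshSsc.stop(stopSparkContext = false)
```

Passing `stopSparkContext = false` matters in a notebook because the default for `stop()` also shuts down the shared `sc`, which would leave later cells without a usable context.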
https://stackoverflow.com/questions/31978638