A First SparkR Test Example: Computing Pi with Spark

sparkexpert
Published 2022-05-07 13:37:51
Included in the column: 大数据智能实战 (Big Data Intelligence in Practice)

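Installing SparkR took considerable effort. I tried the various installation methods found online, and none of them worked well when I tested them, possibly because some of the required sites are blocked in mainland China.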

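For example, the command install_github("amplab-extras/SparkR-pkg", subdir="pkg") gets stuck at the SBT step, and downloading and installing SBT separately does not help either. The error message is: "Invalid or corrupt jarfile sbt/sbt-launch-0.13.6.jar"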
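An "Invalid or corrupt jarfile" error means the JVM could not read the file as a JAR, which is a ZIP archive. The sketch below is one cheap way to check that from R before retrying the build. It assumes the launcher sits at the relative path shown in the error message, and `looks_like_jar` is a hypothetical helper written for this illustration, not part of SparkR or SBT. It only inspects the first four bytes, so it can catch a missing file or an HTML error page saved under a .jar name, but it does not prove the archive is complete.

```r
# Hypothetical helper (not part of SparkR or SBT): returns TRUE when the file
# starts with the ZIP local-file-header signature "PK\003\004".
looks_like_jar <- function(path) {
  if (!file.exists(path) || dir.exists(path)) {
    return(FALSE)
  }
  header <- readBin(path, what = "raw", n = 4)
  identical(header, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Path taken from the error message above; adjust it to your own checkout.
jar_path <- "sbt/sbt-launch-0.13.6.jar"
if (looks_like_jar(jar_path)) {
  message(jar_path, " starts with a ZIP signature")
} else {
  message(jar_path, " is missing or does not look like a ZIP/JAR file")
}
```

If the check fails, deleting the file and fetching it again over a working connection is a reasonable next step, although the source does not confirm that a bad download was the cause here.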

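Building it separately inside the Spark source tree did not succeed either. A command such as R -e "devtools::install('.')" does produce the SparkR package, but when I tested it the network connection failed. The root cause is that sparkr-assembly-0.1.jar was never generated, and without that JAR none of the end-to-end testing can work.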
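Whether the assembly JAR made it into the installed package can be checked directly from R. The sketch below assumes the JAR lives at the top level of the installed SparkR package directory, which matches the classpath printed in the session log of this article. `find_sparkr_jar` is a hypothetical helper name; `system.file` is the standard base R function and returns an empty string when the file or the package is not found.

```r
# Hypothetical helper: locate the assembly JAR inside an installed package.
# Returns the full path, or "" if the package or the JAR is missing.
find_sparkr_jar <- function(pkg = "SparkR",
                            jar_name = "sparkr-assembly-0.1.jar") {
  system.file(jar_name, package = pkg)
}

jar <- find_sparkr_jar()
if (nzchar(jar)) {
  message("Found assembly JAR at ", jar)
} else {
  message("Assembly JAR not found; the JVM backend cannot be started")
}
```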

Once the build finally completed, I ran the first test in RStudio: the Spark Pi example in SparkR. The test passed, which was very satisfying.

This is the output produced in SparkR.

    > library(SparkR)
    [SparkR] Initializing with classpath /home/ndscbigdata/R/x86_64-pc-linux-gnu-library/3.2/SparkR/sparkr-assembly-0.1.jar
    >
    > #args <- commandArgs(trailing = TRUE)
    >
    > #if (length(args) < 1) {
    > #  print("Usage: pi <master> [<slices>]")
    > #  q("no")
    > #}
    >
    > #sc <- sparkR.init(master="spark://ubuntu-bigdata-5:7077","PiR");
    > sc <- sparkR.init(master="local", "PiR")
    Launching java with command  /usr/lib/jvm/java-8-oracle/bin/java   -Xmx512m -cp '/home/ndscbigdata/R/x86_64-pc-linux-gnu-library/3.2/SparkR/sparkr-assembly-0.1.jar:' edu.berkeley.cs.amplab.sparkr.SparkRBackend /tmp/RtmpGq7K9F/backend_port4ae6710a585b
    15/10/09 09:31:27 INFO Slf4jLogger: Slf4jLogger started
    >
    > slices <- ifelse(length(args) > 1, as.integer(args[[2]]), 2)
    >
    > n <- 100000 * slices
    >
    > piFunc <- function(elem) {
    +   rands <- runif(n = 2, min = -1, max = 1)
    +   val <- ifelse((rands[1]^2 + rands[2]^2) < 1, 1.0, 0.0)
    +   val
    + }
    >
    >
    > piFuncVec <- function(elems) {
    +   message(length(elems))
    +   rands1 <- runif(n = length(elems), min = -1, max = 1)
    +   rands2 <- runif(n = length(elems), min = -1, max = 1)
    +   val <- ifelse((rands1^2 + rands2^2) < 1, 1.0, 0.0)
    +   sum(val)
    + }
    >
    > rdd <- parallelize(sc, 1:n, slices)
    > count <- reduce(lapplyPartition(rdd, piFuncVec), sum)
    15/10/09 09:31:28 WARN TaskSetManager: Stage 0 contains a task of very large size (391 KB). The maximum recommended task size is 100 KB.
    100000
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.768 s, init = 0.003 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.066 s, write-output = 0.000 s, total = 0.838 s
    100000
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.004 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.062 s, write-output = 0.000 s, total = 0.069 s
    > cat("Pi is roughly", 4.0 * count / n, "\n")
    Pi is roughly 3.14792
    > cat("Num elements in RDD ", count(rdd), "\n")
    15/10/09 09:31:29 WARN TaskSetManager: Stage 1 contains a task of very large size (391 KB). The maximum recommended task size is 100 KB.
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.005 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.008 s
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.004 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.007 s
    Num elements in RDD  200000
    15/10/09 09:31:29 WARN TaskSetManager: Stage 2 contains a task of very large size (391 KB). The maximum recommended task size is 100 KB.
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.004 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.007 s
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.006 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.009 s
    15/10/09 09:31:29 WARN TaskSetManager: Stage 3 contains a task of very large size (391 KB). The maximum recommended task size is 100 KB.
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.004 s, init = 0.001 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.006 s
    15/10/09 09:31:29 INFO RRDD: Times: boot = 0.004 s, init = 0.002 s, broadcast = 0.000 s, read-input = 0.001 s, compute = 0.000 s, write-output = 0.000 s, total = 0.008 s
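The estimate in the session rests on a simple area argument: a point drawn uniformly from the square [-1, 1] x [-1, 1] lands inside the unit circle with probability pi/4, so four times the observed hit fraction approximates pi. The sketch below reproduces the same computation in plain R with no Spark dependency, which can be handy for sanity-checking the logic while the SparkR build is still broken. It is only an illustration under stated assumptions: `count_hits` is a hypothetical stand-in for `piFuncVec`, and the `split` / `lapply` / `Reduce` calls imitate `parallelize`, `lapplyPartition`, and `reduce` on a single machine.

```r
set.seed(42)  # fixed seed so repeated runs give the same estimate

slices <- 2
n <- 100000 * slices

# Stand-in for piFuncVec: count the random points that fall inside the
# unit circle, drawing one point per element of the partition.
count_hits <- function(elems) {
  x <- runif(length(elems), min = -1, max = 1)
  y <- runif(length(elems), min = -1, max = 1)
  sum(x^2 + y^2 < 1)
}

# Imitate parallelize(sc, 1:n, slices): cut 1:n into equally sized chunks.
partitions <- split(seq_len(n), rep(seq_len(slices), each = n / slices))

# Imitate reduce(lapplyPartition(rdd, piFuncVec), sum).
hits <- Reduce(`+`, lapply(partitions, count_hits))

pi_estimate <- 4 * hits / n
cat("Pi is roughly", pi_estimate, "\n")
```

With n = 200000 the standard error of the estimate is about 0.004, so a value such as the 3.14792 in the log is within the expected range. The repeated 391 KB task-size warning probably comes from shipping the integer data itself with each task: 100000 four-byte integers per slice is about 390.6 KB.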

Originally published: 2015-10-09
