
Spark on K8S

sparkle123
Published 2021-01-26 14:48:28

Spark on K8S Timeline

KICKOFF

  • Spark Standalone on Kubernetes (via the k8s community)
  • SPIP: SPARK-18278
  • https://github.com/apache-spark-on-k8s/spark (fork)

Spark 2.3.0

  • Official native Kubernetes support (first release)
  • Experimental
  • Kubernetes 1.7+

Spark 2.4.3

  • Latest release version
  • PySpark/SparkR application support
  • Client mode support (for interactive applications and notebooks)
  • Support for mounting certain types of Kubernetes volumes

Spark 3.0+

  • SPARK-25826 Kerberos HDFS support
  • Dynamic allocation support

Submitting and Running

bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///path/to/examples.jar

Issues

No Logs in the UI

With Spark on K8S, the Executors page of the Spark UI shows no logs.

Driver Does Not Exit After an Error

  • SPARK-27927: driver pod hangs with pyspark 2.4.3 and master on kubernetes
  • SPARK-27812: kubernetes client import non-daemon thread which block jvm exit

hostPath as LOCAL_DIRS

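By default, Spark on K8S mounts emptyDir volumes for its local directories, which in practice map to a temporary path on a single disk of the physical host.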

Spark 3.0

hostPath support: SPARK-27499 Support mapping spark.local.dir to hostPath volume
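As a rough illustration of hostPath-backed local storage: the Spark 3.0 documentation describes a naming convention in which a volume whose name starts with spark-local-dir- is used as local (spill and shuffle) storage. The sketch below applies that convention to an executor hostPath volume. The host directory /data1/spark-local and the function name are hypothetical, and the directory is assumed to exist on every node. Whether this matches the exact mechanism tracked in SPARK-27499 is not confirmed here.

```shell
SPARK_SUBMIT="${SPARK_SUBMIT:-bin/spark-submit}"
HOST_DIR="${HOST_DIR:-/data1/spark-local}"   # hypothetical path on each node

# Submit SparkPi with one hostPath volume used as executor local storage.
# The volume name must start with "spark-local-dir-" to be picked up.
submit_with_hostpath_local_dir() {
  "$SPARK_SUBMIT" \
    --master "k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>" \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf "spark.kubernetes.container.image=<spark-image>" \
    --conf "spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.mount.path=${HOST_DIR}" \
    --conf "spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.options.path=${HOST_DIR}" \
    local:///path/to/examples.jar
}
```

To spread local data over several disks, additional volumes (spark-local-dir-2, and so on) can be declared the same way, one per host directory.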

External shuffle service: SPARK-25299 Use remote storage for persisting shuffle data

Dynamic resource allocation:
  • SPARK-24432 Add support for dynamic resource allocation
  • SPARK-27963 Allow dynamic allocation without an external shuffle service
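A minimal sketch of what dynamic allocation without an external shuffle service could look like on Spark 3.0, using the shuffle-tracking option introduced by SPARK-27963. The executor bounds (1 and 5) and the function name are illustrative assumptions, not values from the original post.

```shell
SPARK_SUBMIT="${SPARK_SUBMIT:-bin/spark-submit}"

# Submit SparkPi with dynamic allocation driven by shuffle tracking,
# so no external shuffle service is required.
submit_with_dynamic_allocation() {
  "$SPARK_SUBMIT" \
    --master "k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>" \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf "spark.kubernetes.container.image=<spark-image>" \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=1 \
    --conf spark.dynamicAllocation.maxExecutors=5 \
    local:///path/to/examples.jar
}
```

With shuffle tracking, executors that still hold shuffle data needed by active jobs are kept alive, so scale-down can be slower than with a dedicated shuffle service.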

Access to secure HDFS clusters: SPARK-25826 Kerberos HDFS support
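For orientation, a submission against a Kerberized HDFS cluster on Spark 3.0 might look like the sketch below, which follows the keytab-based pattern in the Spark security documentation. The keytab, principal, and HDFS file location are placeholders to be filled in, the krb5.conf path is an assumption about the submitting machine, and the function name is hypothetical.

```shell
SPARK_SUBMIT="${SPARK_SUBMIT:-bin/spark-submit}"
KRB5_CONF="${KRB5_CONF:-/etc/krb5.conf}"   # assumed location on the client

# Submit the HdfsTest example with a keytab and principal so that Spark
# can obtain delegation tokens for the secure HDFS cluster.
submit_with_kerberos() {
  "$SPARK_SUBMIT" \
    --master "k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>" \
    --deploy-mode cluster \
    --name spark-hdfs \
    --class org.apache.spark.examples.HdfsTest \
    --conf "spark.kubernetes.container.image=<spark-image>" \
    --conf "spark.kerberos.keytab=<KEYTAB_FILE>" \
    --conf "spark.kerberos.principal=<PRINCIPAL>" \
    --conf "spark.kubernetes.kerberos.krb5.path=${KRB5_CONF}" \
    local:///path/to/examples.jar \
    "<HDFS_FILE_LOCATION>"
}
```

The Hadoop client configuration also has to be visible to the submission, for example through HADOOP_CONF_DIR on the client or a pre-created ConfigMap.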

  • SPARK-28949 Kubernetes CGroup leaking leads to Spark Pods hang in Pending status
  • SPARK-28992 Support update dependencies from hdfs when task run on executor pods
  • SPARK-28947 Status logging occurs on every state change but not at an interval for liveness.
  • SPARK-28896 Spark client process is unable to upload jars to hdfs while using ConfigMap not HADOOP_CONF_DIR

This article is part of the Tencent Cloud Self-Media Sharing Program and is shared from the author's personal site/blog.