Marathon Error Log

[2017-03-03 19:42:42,812] INFO Client session timed out, have not heard from server in 6666ms for sessionid 0x35a933430d50004, closing socket connection and 
attempting reconnect (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.91.99:2181))
[2017-03-03 19:42:42,912] INFO State change: SUSPENDED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,947] INFO Opening socket connection to server 192.168.52.92/192.168.52.92:2181. Will not attempt to authenticate using SASL (unknown error
) (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,948] INFO Socket connection established to 192.168.92/192.168.52.92:2181, initiating session (org.apache.zookeeper.ClientCnxn:pool-1-th
read-1-SendThread(10.125.52.92:2181))
[2017-03-03 19:42:42,951] INFO Session establishment complete on server 192.168.52.92/192.168.52.92:2181, sessionid = 0x35a933430d50004, negotiated timeout = 1
0000 (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,951] INFO State change: RECONNECTED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,953] INFO Leader defeated. New leader: 192.168.48.125:8080 (mesosphere.marathon.core.election.impl.CuratorElectionService:pool-1-thread-1
)
[2017-03-03 19:42:42,957] INFO Deleting existing tombstone for old twitter commons leader election (mesosphere.marathon.core.election.impl.CuratorElectionSer
vice:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO Lost leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO All actors suspended:
* Actor[akka://marathon/user/taskTracker#989799113]
* Actor[akka://marathon/user/reviveOffersWhenWanted#-1681045213]
* Actor[akka://marathon/user/taskKillServiceActor#-1306622116]
* Actor[akka://marathon/user/launchQueue#819767243]
* Actor[akka://marathon/user/offersWantedForReconciliation#-2099816564]
* Actor[akka://marathon/user/rateLimiter#503420309]
* Actor[akka://marathon/user/groupManager#-752628876]
* Actor[akka://marathon/user/offerMatcherLaunchTokens#-562928907]
* Actor[akka://marathon/user/killOverdueStagedTasks#-1773633501]
* Actor[akka://marathon/user/offerMatcherManager#123957678]
* Actor[akka://marathon/user/expungeOverdueLostTasks#-1479038444] (mesosphere.marathon.core.leadership.impl.LeadershipCoordinatorActor:marathon-akka.actor.de
fault-dispatcher-9)
[2017-03-03 19:42:42,960] INFO Stopping driver (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
I0303 19:42:42.960778 10617 sched.cpp:1987] Asked to stop the driver
I0303 19:42:42.961051 10679 sched.cpp:1187] Stopping framework '041eee2c-d32b-413b-931e-dc1f47a97971-0000'
[2017-03-03 19:42:42,961] ERROR Terminating after loss of leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1
)
[2017-03-03 19:42:42,961] INFO ExpungeOverdueLostTasksActor has stopped (mesosphere.marathon.core.task.jobs.impl.ExpungeOverdueLostTasksActor:marathon-akka.a
ctor.default-dispatcher-19)
[2017-03-03 19:42:42,964] INFO Driver future completed with result=Success(()). (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:Fork
JoinPool-2-worker-37)
[2017-03-03 19:42:42,964] INFO Stopped appTaskLaunchActor for /php-test version 2017-03-03T09:45:32.125Z (mesosphere.marathon.core.launchqueue.impl.TaskLaunc
herActor:marathon-akka.actor.default-dispatcher-21)
[2017-03-03 19:42:42,964] INFO Call postDriverRuns callbacks on EntityStoreCache(MarathonStore(app:)), EntityStoreCache(MarathonStore(group:)), EntityStoreCa
che(MarathonStore(deployment:)), EntityStoreCache(MarathonStore(framework:)), EntityStoreCache(MarathonStore(taskFailure:)), EntityStoreCache(MarathonStore(e
vents:)) (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,965] INFO Finished postDriverRuns callbacks (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-work
er-37)
[2017-03-03 19:42:42,965] INFO Shutting down services (mesosphere.marathon.Main$:shutdownHook1)
[2017-03-03 19:42:42,965] INFO Shutting down actor system akka://marathon (mesosphere.marathon.core.base.ActorsModule:Thread-3)

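Here is what is going on. If your ZooKeeper cluster is unstable and a Marathon cluster has already been deployed against it, you will run into this failure often. When Marathon runs in cluster mode (--ha=true) and its connection to the ZooKeeper nodes suffers from latency or some other problem, Marathon can no longer determine the state of the other nodes, loses its ability to take part in the leader election, and then shuts itself down (the "Terminating after loss of leadership" line in the log above).

When you deploy ZooKeeper, pay particular attention to how it fits together with the Marathon cluster. And if you are not actually using Marathon's cluster mode, you had better turn it off.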
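To spot this failure chain quickly in a long log, something like the following Python sketch may help. It only looks for message texts that appear in the log above; the function name `summarize` and the `SAMPLE` text are made up for illustration, the sample lines are abbreviated copies of the log with the logger suffix dropped, and the sketch assumes one log entry per line (not wrapped by the terminal as in the paste above).

```python
import re

# Message fragments taken from the log above.
SESSION_TIMEOUT = re.compile(
    r"Client session timed out, have not heard from server in (\d+)ms")
NEGOTIATED = re.compile(r"negotiated timeout = (\d+)")
NEW_LEADER = re.compile(r"Leader defeated\. New leader: (\S+)")
TERMINATED = "Terminating after loss of leadership"


def summarize(log_text):
    """Collect the key facts of a ZooKeeper-timeout / lost-leadership episode."""
    summary = {"idle_ms": None, "negotiated_ms": None,
               "new_leader": None, "terminated": False}
    for line in log_text.splitlines():
        m = SESSION_TIMEOUT.search(line)
        if m:
            summary["idle_ms"] = int(m.group(1))
        m = NEGOTIATED.search(line)
        if m:
            summary["negotiated_ms"] = int(m.group(1))
        m = NEW_LEADER.search(line)
        if m:
            summary["new_leader"] = m.group(1)
        if TERMINATED in line:
            summary["terminated"] = True
    return summary


# Abbreviated, unwrapped copies of four lines from the log above.
SAMPLE = """\
[2017-03-03 19:42:42,812] INFO Client session timed out, have not heard from server in 6666ms for sessionid 0x35a933430d50004, closing socket connection and attempting reconnect
[2017-03-03 19:42:42,951] INFO Session establishment complete on server 192.168.52.92/192.168.52.92:2181, sessionid = 0x35a933430d50004, negotiated timeout = 10000
[2017-03-03 19:42:42,953] INFO Leader defeated. New leader: 192.168.48.125:8080
[2017-03-03 19:42:42,961] ERROR Terminating after loss of leadership
"""
```

Calling `summarize(SAMPLE)` reports an idle time of 6666 ms against a negotiated timeout of 10000 ms. That ratio appears consistent with the ZooKeeper Java client giving up on a server after roughly two thirds of the negotiated session timeout, so a ZooKeeper node that stays silent for only a few seconds can be enough to start the sequence shown in the log.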

Keep one thing in mind: Marathon's leader election depends on ZooKeeper.
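Because of that dependency, it can be worth probing each ZooKeeper node before blaming Marathon. The sketch below sends ZooKeeper's `ruok` four-letter command over a plain TCP socket and checks for the `imok` reply. The function name `zk_is_ok` and the default timeout are assumptions for illustration, and on newer ZooKeeper versions four-letter commands may need to be allowed explicitly in the server configuration, so a `False` result does not always mean the node is down.

```python
import socket


def zk_is_ok(host, port=2181, timeout=2.0):
    """Return True if the ZooKeeper server answers 'ruok' with 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The reply is exactly four bytes; stop early if the peer closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False
```

For example, `zk_is_ok("192.168.52.92")` would probe one of the ZooKeeper addresses that appear in the log. Running the check in a loop against every node and recording how long each call takes gives a rough picture of whether the ensemble is slow or flapping.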
