Marathon Error Log

domain0
Published 2018-08-02 11:04:01
Collected in the column: 运维一切
[2017-03-03 19:42:42,812] INFO Client session timed out, have not heard from server in 6666ms for sessionid 0x35a933430d50004, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.91.99:2181))
[2017-03-03 19:42:42,912] INFO State change: SUSPENDED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,947] INFO Opening socket connection to server 192.168.52.92/192.168.52.92:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,948] INFO Socket connection established to 192.168.92/192.168.52.92:2181, initiating session (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(10.125.52.92:2181))
[2017-03-03 19:42:42,951] INFO Session establishment complete on server 192.168.52.92/192.168.52.92:2181, sessionid = 0x35a933430d50004, negotiated timeout = 10000 (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,951] INFO State change: RECONNECTED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,953] INFO Leader defeated. New leader: 192.168.48.125:8080 (mesosphere.marathon.core.election.impl.CuratorElectionService:pool-1-thread-1)
[2017-03-03 19:42:42,957] INFO Deleting existing tombstone for old twitter commons leader election (mesosphere.marathon.core.election.impl.CuratorElectionService:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO Lost leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO All actors suspended:
* Actor[akka://marathon/user/taskTracker#989799113]
* Actor[akka://marathon/user/reviveOffersWhenWanted#-1681045213]
* Actor[akka://marathon/user/taskKillServiceActor#-1306622116]
* Actor[akka://marathon/user/launchQueue#819767243]
* Actor[akka://marathon/user/offersWantedForReconciliation#-2099816564]
* Actor[akka://marathon/user/rateLimiter#503420309]
* Actor[akka://marathon/user/groupManager#-752628876]
* Actor[akka://marathon/user/offerMatcherLaunchTokens#-562928907]
* Actor[akka://marathon/user/killOverdueStagedTasks#-1773633501]
* Actor[akka://marathon/user/offerMatcherManager#123957678]
* Actor[akka://marathon/user/expungeOverdueLostTasks#-1479038444] (mesosphere.marathon.core.leadership.impl.LeadershipCoordinatorActor:marathon-akka.actor.default-dispatcher-9)
[2017-03-03 19:42:42,960] INFO Stopping driver (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
I0303 19:42:42.960778 10617 sched.cpp:1987] Asked to stop the driver
I0303 19:42:42.961051 10679 sched.cpp:1187] Stopping framework '041eee2c-d32b-413b-931e-dc1f47a97971-0000'
[2017-03-03 19:42:42,961] ERROR Terminating after loss of leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
[2017-03-03 19:42:42,961] INFO ExpungeOverdueLostTasksActor has stopped (mesosphere.marathon.core.task.jobs.impl.ExpungeOverdueLostTasksActor:marathon-akka.actor.default-dispatcher-19)
[2017-03-03 19:42:42,964] INFO Driver future completed with result=Success(()). (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,964] INFO Stopped appTaskLaunchActor for /php-test version 2017-03-03T09:45:32.125Z (mesosphere.marathon.core.launchqueue.impl.TaskLauncherActor:marathon-akka.actor.default-dispatcher-21)
[2017-03-03 19:42:42,964] INFO Call postDriverRuns callbacks on EntityStoreCache(MarathonStore(app:)), EntityStoreCache(MarathonStore(group:)), EntityStoreCache(MarathonStore(deployment:)), EntityStoreCache(MarathonStore(framework:)), EntityStoreCache(MarathonStore(taskFailure:)), EntityStoreCache(MarathonStore(events:)) (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,965] INFO Finished postDriverRuns callbacks (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,965] INFO Shutting down services (mesosphere.marathon.Main$:shutdownHook1)
[2017-03-03 19:42:42,965] INFO Shutting down actor system akka://marathon (mesosphere.marathon.core.base.ActorsModule:Thread-3)

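Here is what is going on. If your ZooKeeper ensemble is unstable and a Marathon cluster has already been deployed against it, this problem will show up often. When Marathon runs in cluster (HA) mode (--ha=true) and its connections to the ZooKeeper nodes suffer from latency or other problems, Marathon can no longer determine the state of the other nodes, loses its ability to take part in the leader election, and then shuts itself down. In the log above, the ZooKeeper client hears nothing from the server for 6666 ms, the connection state goes SUSPENDED and then RECONNECTED, a new leader is announced, and this instance logs "Terminating after loss of leadership". When deploying ZooKeeper, pay particular attention to how it fits together with the Marathon cluster. And if you do not actually need Marathon's cluster mode, it is best to turn it off explicitly.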
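As a rough aid for reading logs like the one above, the following Python sketch pulls out the handful of events that tell the story: the idle time reported by the ZooKeeper client, the negotiated session timeout, the Curator state changes, and the loss of leadership. The function names (`summarize`, `read_timeout_ms`) and the abridged `SAMPLE` text are illustrative assumptions, not part of Marathon or ZooKeeper. The two-thirds rule in `read_timeout_ms` reflects how the ZooKeeper Java client derives its read timeout from the session timeout, which is consistent with the 6666 ms and 10000 ms values in this log.

```python
import re

# Lines abridged from the log above (logger names trimmed for brevity).
SAMPLE = """\
[2017-03-03 19:42:42,812] INFO Client session timed out, have not heard from server in 6666ms for sessionid 0x35a933430d50004, closing socket connection and attempting reconnect
[2017-03-03 19:42:42,912] INFO State change: SUSPENDED
[2017-03-03 19:42:42,951] INFO Session establishment complete on server 192.168.52.92/192.168.52.92:2181, sessionid = 0x35a933430d50004, negotiated timeout = 10000
[2017-03-03 19:42:42,951] INFO State change: RECONNECTED
[2017-03-03 19:42:42,959] INFO Lost leadership
[2017-03-03 19:42:42,961] ERROR Terminating after loss of leadership
"""

TIMED_OUT = re.compile(r"have not heard from server in (\d+)ms")
NEGOTIATED = re.compile(r"negotiated timeout = (\d+)")
STATE = re.compile(r"State change: (\w+)")


def read_timeout_ms(session_timeout_ms):
    """Read timeout the ZooKeeper Java client derives from the session timeout (2/3)."""
    return session_timeout_ms * 2 // 3


def summarize(log_text):
    """Collect the ZooKeeper and election events that precede the shutdown."""
    summary = {
        "idle_ms": None,             # how long the client heard nothing
        "session_timeout_ms": None,  # timeout negotiated with the server
        "states": [],                # Curator connection state changes, in order
        "lost_leadership": False,
    }
    for line in log_text.splitlines():
        m = TIMED_OUT.search(line)
        if m:
            summary["idle_ms"] = int(m.group(1))
        m = NEGOTIATED.search(line)
        if m:
            summary["session_timeout_ms"] = int(m.group(1))
        m = STATE.search(line)
        if m:
            summary["states"].append(m.group(1))
        if "Lost leadership" in line:
            summary["lost_leadership"] = True
    return summary
```

Calling `summarize(SAMPLE)` on the abridged lines yields an idle time of 6666 ms, a session timeout of 10000 ms, the state sequence SUSPENDED then RECONNECTED, and `lost_leadership` set to true, which matches `read_timeout_ms(10000)`.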

Keep one thing in mind: Marathon's leader election depends on ZooKeeper.
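Because the election depends on ZooKeeper, it can help to check how quickly each ZooKeeper node answers before blaming Marathon. The sketch below is one possible probe, not something the original post used: it sends ZooKeeper's `ruok` four-letter command over a plain TCP socket and reports whether the node replied `imok` and how long that took. The function name `zk_ruok` and its defaults are assumptions, and on newer ZooKeeper releases four-letter commands may need to be whitelisted on the server before they are answered.

```python
import socket
import time


def zk_ruok(host, port=2181, timeout=2.0):
    """Send 'ruok' to one ZooKeeper node; return (ok, elapsed_ms).

    ok is True only if the node answered 'imok' within the timeout.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, (time.monotonic() - start) * 1000.0
    return reply == b"imok", (time.monotonic() - start) * 1000.0
```

Run against each node of the ensemble (for example the addresses that appear in the log), a reply time that approaches the client's read timeout, or no reply at all, would point to the kind of ZooKeeper instability described above.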

Originally published on the author's personal site/blog, 2017/03/06.