Marathon error log

domain0

Published 2018-08-02 11:04:01


Included in the column: 运维一切

[2017-03-03 19:42:42,812] INFO Client session timed out, have not heard from server in 6666ms for sessionid 0x35a933430d50004, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.91.99:2181))
[2017-03-03 19:42:42,912] INFO State change: SUSPENDED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,947] INFO Opening socket connection to server 192.168.52.92/192.168.52.92:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,948] INFO Socket connection established to 192.168.92/192.168.52.92:2181, initiating session (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(10.125.52.92:2181))
[2017-03-03 19:42:42,951] INFO Session establishment complete on server 192.168.52.92/192.168.52.92:2181, sessionid = 0x35a933430d50004, negotiated timeout = 10000 (org.apache.zookeeper.ClientCnxn:pool-1-thread-1-SendThread(192.168.52.92:2181))
[2017-03-03 19:42:42,951] INFO State change: RECONNECTED (org.apache.curator.framework.state.ConnectionStateManager:pool-1-thread-1-EventThread)
[2017-03-03 19:42:42,953] INFO Leader defeated. New leader: 192.168.48.125:8080 (mesosphere.marathon.core.election.impl.CuratorElectionService:pool-1-thread-1)
[2017-03-03 19:42:42,957] INFO Deleting existing tombstone for old twitter commons leader election (mesosphere.marathon.core.election.impl.CuratorElectionService:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO Lost leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
[2017-03-03 19:42:42,959] INFO All actors suspended:
* Actor[akka://marathon/user/taskTracker#989799113]
* Actor[akka://marathon/user/reviveOffersWhenWanted#-1681045213]
* Actor[akka://marathon/user/taskKillServiceActor#-1306622116]
* Actor[akka://marathon/user/launchQueue#819767243]
* Actor[akka://marathon/user/offersWantedForReconciliation#-2099816564]
* Actor[akka://marathon/user/rateLimiter#503420309]
* Actor[akka://marathon/user/groupManager#-752628876]
* Actor[akka://marathon/user/offerMatcherLaunchTokens#-562928907]
* Actor[akka://marathon/user/killOverdueStagedTasks#-1773633501]
* Actor[akka://marathon/user/offerMatcherManager#123957678]
* Actor[akka://marathon/user/expungeOverdueLostTasks#-1479038444] (mesosphere.marathon.core.leadership.impl.LeadershipCoordinatorActor:marathon-akka.actor.default-dispatcher-9)
[2017-03-03 19:42:42,960] INFO Stopping driver (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
I0303 19:42:42.960778 10617 sched.cpp:1987] Asked to stop the driver
I0303 19:42:42.961051 10679 sched.cpp:1187] Stopping framework '041eee2c-d32b-413b-931e-dc1f47a97971-0000'
[2017-03-03 19:42:42,961] ERROR Terminating after loss of leadership (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:pool-1-thread-1)
[2017-03-03 19:42:42,961] INFO ExpungeOverdueLostTasksActor has stopped (mesosphere.marathon.core.task.jobs.impl.ExpungeOverdueLostTasksActor:marathon-akka.actor.default-dispatcher-19)
[2017-03-03 19:42:42,964] INFO Driver future completed with result=Success(()). (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,964] INFO Stopped appTaskLaunchActor for /php-test version 2017-03-03T09:45:32.125Z (mesosphere.marathon.core.launchqueue.impl.TaskLauncherActor:marathon-akka.actor.default-dispatcher-21)
[2017-03-03 19:42:42,964] INFO Call postDriverRuns callbacks on EntityStoreCache(MarathonStore(app:)), EntityStoreCache(MarathonStore(group:)), EntityStoreCache(MarathonStore(deployment:)), EntityStoreCache(MarathonStore(framework:)), EntityStoreCache(MarathonStore(taskFailure:)), EntityStoreCache(MarathonStore(events:)) (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,965] INFO Finished postDriverRuns callbacks (mesosphere.marathon.MarathonSchedulerService$$EnhancerByGuice$$ea97d137:ForkJoinPool-2-worker-37)
[2017-03-03 19:42:42,965] INFO Shutting down services (mesosphere.marathon.Main$:shutdownHook1)
[2017-03-03 19:42:42,965] INFO Shutting down actor system akka://marathon (mesosphere.marathon.core.base.ActorsModule:Thread-3)

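Here is what is going on. If your ZooKeeper ensemble is unstable and you have deployed a Marathon cluster on top of it, you will see this problem often. When Marathon runs in cluster (HA) mode (--ha=true) and its connection to the ZooKeeper nodes suffers from latency or some other fault, Marathon can no longer determine the state of the other nodes, loses its ability to take part in the leader election, and then shuts itself down. In the log above, the ZooKeeper client heard nothing from the server for 6666 ms, the connection was suspended, and once it reconnected another node (192.168.48.125:8080) was reported as the new leader, so this instance logged "Terminating after loss of leadership" and exited. When you deploy ZooKeeper, pay particular attention to how it is combined with the Marathon cluster. And if you do not actually need Marathon's cluster mode, it is better to turn it off.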
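To confirm that a Marathon exit followed this pattern, the log can be scanned for the relevant messages. The following is a minimal sketch in Python, not part of Marathon: the function names `leadership_events` and `lost_leadership_after_timeout` are hypothetical, and the marker strings are copied from the log excerpt above, so they may differ in other Marathon versions.

```python
import re

# Message fragments copied from the log excerpt above, each mapped to a short label.
# Assumption: other Marathon/Curator versions may word these messages differently.
MARKERS = [
    ("Client session timed out", "zk_session_timeout"),
    ("State change: SUSPENDED", "zk_suspended"),
    ("State change: RECONNECTED", "zk_reconnected"),
    ("Leader defeated", "leader_defeated"),
    ("Lost leadership", "lost_leadership"),
    ("Terminating after loss of leadership", "terminating"),
]

# Marathon entries start with a bracketed timestamp such as [2017-03-03 19:42:42,812].
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")


def leadership_events(lines):
    """Return (timestamp, label) pairs for ZooKeeper and leadership events."""
    events = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            # Actor list entries and Mesos driver lines have no bracketed timestamp.
            continue
        for needle, label in MARKERS:
            if needle in line:
                events.append((match.group(1), label))
                break
    return events


def lost_leadership_after_timeout(events):
    """True if a ZooKeeper session timeout is later followed by termination."""
    labels = [label for _, label in events]
    if "zk_session_timeout" not in labels or "terminating" not in labels:
        return False
    return labels.index("zk_session_timeout") < labels.index("terminating")
```

Feeding the lines of a Marathon log file to `leadership_events` and passing the result to `lost_leadership_after_timeout` gives a quick yes/no answer. The markers all sit near the start of a log entry, so the check still works on logs that were hard-wrapped by a terminal.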

Keep one thing in mind: Marathon's leader election depends on ZooKeeper.
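Because of that dependency, it can help to probe the ZooKeeper nodes directly when Marathon keeps losing leadership. The sketch below uses ZooKeeper's `ruok` four-letter command, to which a healthy server answers `imok`. The helper names `zk_ruok` and `unhealthy_nodes` are hypothetical, and the addresses are simply the ones that appear in the log; note also that newer ZooKeeper releases (3.5.3 and later) only answer four-letter commands listed in `4lw.commands.whitelist`, so a `False` result there may only mean that `ruok` is not whitelisted.

```python
import socket

# Assumption: these are the ZooKeeper nodes, taken from the addresses in the log above.
ENSEMBLE = [("192.168.91.99", 2181), ("192.168.52.92", 2181)]


def zk_ruok(host, port=2181, timeout=2.0):
    """Send the 'ruok' four-letter command; True if the server replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = b""
            # Read until we have four bytes or the server closes the connection.
            while len(reply) < 4:
                chunk = conn.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    return reply == b"imok"


def unhealthy_nodes(nodes, probe=zk_ruok):
    """Return the (host, port) pairs for which the probe fails."""
    return [(host, port) for host, port in nodes if not probe(host, port)]
```

Calling `unhealthy_nodes(ENSEMBLE)` from a Marathon host returns the nodes that did not answer within the timeout. A node that answers `imok` is only known to be running, so slow responses or session timeouts like the one in the log still need to be checked separately.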

Originally published on the author's personal site/blog on 2017/03/06.
