Mesos Development Guide

1 Framework Development Guide

In this document we refer to Mesos applications as “frameworks”.

Mesos supports Java, Python, and C++. The matching examples under MESOS_HOME/src/examples/ show how a framework's scheduler and executor are developed.

2 Step 1: Create a Framework Scheduler

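A framework scheduler can be written in C, C++, Java, Scala, or Python. Your framework scheduler must inherit from the Scheduler class. The scheduler must create an instance of SchedulerDriver (which relays communication between your scheduler and the Mesos master) and then call SchedulerDriver.run().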
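The subclass-plus-driver pattern described above can be sketched without the Mesos headers. In the sketch below, `Scheduler`, `SchedulerDriver`, `Offer`, and `FakeDriver` are simplified stand-ins written for illustration; they are not the real Mesos classes, which have more callbacks and richer argument types. `FakeDriver` replays a fixed script of callbacks instead of talking to a master, and the IDs it passes are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real Mesos types.
struct Offer {
  std::string id;
  double cpus;
};

class SchedulerDriver;

// Stand-in for the Scheduler interface that a framework subclasses.
class Scheduler {
public:
  virtual ~Scheduler() {}
  virtual void registered(SchedulerDriver* driver,
                          const std::string& frameworkId) = 0;
  virtual void resourceOffers(SchedulerDriver* driver,
                              const std::vector<Offer>& offers) = 0;
};

// Stand-in for the driver that sits between the scheduler and the master.
class SchedulerDriver {
public:
  virtual ~SchedulerDriver() {}
  virtual int run() = 0;
};

// Hypothetical driver: replays a fixed script instead of contacting a master.
class FakeDriver : public SchedulerDriver {
public:
  explicit FakeDriver(Scheduler* scheduler) : scheduler_(scheduler) {
    assert(scheduler != nullptr);
  }

  int run() override {
    scheduler_->registered(this, "framework-0001");
    scheduler_->resourceOffers(
        this, {Offer{"offer-1", 2.0}, Offer{"offer-2", 0.5}});
    return 0;
  }

private:
  Scheduler* scheduler_;
};

// The framework's own scheduler: it only records what the driver reports.
class MyScheduler : public Scheduler {
public:
  void registered(SchedulerDriver*, const std::string& frameworkId) override {
    frameworkId_ = frameworkId;
  }

  void resourceOffers(SchedulerDriver*,
                      const std::vector<Offer>& offers) override {
    offersSeen_ += offers.size();
  }

  std::string frameworkId_;
  std::size_t offersSeen_ = 0;
};
```

To use it, construct a `MyScheduler`, hand its address to a `FakeDriver`, and call `run()`. A real framework would follow the same shape with the driver class that ships with Mesos.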

2.1 Scheduler API

Declared in MESOS_HOME/include/mesos/scheduler.hpp

/**
 * Empty virtual destructor (necessary to instantiate subclasses).
 */
virtual ~Scheduler() {}

/**
 * Invoked when the scheduler successfully registers with a Mesos
 * master. A unique ID (generated by the master) used for
 * distinguishing this framework from others and MasterInfo
 * with the ip and port of the current master are provided as arguments.
 */
virtual void registered(SchedulerDriver* driver,
                        const FrameworkID& frameworkId,
                        const MasterInfo& masterInfo) = 0;

/**
 * Invoked when the scheduler re-registers with a newly elected Mesos master.
 * This is only called when the scheduler has previously been registered.
 * MasterInfo containing the updated information about the elected master
 * is provided as an argument.
 */
virtual void reregistered(SchedulerDriver* driver,
                          const MasterInfo& masterInfo) = 0;

/**
 * Invoked when the scheduler becomes "disconnected" from the master
 * (e.g., the master fails and another is taking over).
 */
virtual void disconnected(SchedulerDriver* driver) = 0;

/**
 * Invoked when resources have been offered to this framework. A
 * single offer will only contain resources from a single slave.
 * Resources associated with an offer will not be re-offered to
 * _this_ framework until either (a) this framework has rejected
 * those resources (see SchedulerDriver::launchTasks) or (b) those
 * resources have been rescinded (see Scheduler::offerRescinded).
 * Note that resources may be concurrently offered to more than one
 * framework at a time (depending on the allocator being used). In
 * that case, the first framework to launch tasks using those
 * resources will be able to use them while the other frameworks
 * will have those resources rescinded (or if a framework has
 * already launched tasks with those resources then those tasks will
 * fail with a TASK_LOST status and a message saying as much).
 */
virtual void resourceOffers(SchedulerDriver* driver,
                            const std::vector<Offer>& offers) = 0;

/**
 * Invoked when an offer is no longer valid (e.g., the slave was
 * lost or another framework used resources in the offer). If for
 * whatever reason an offer is never rescinded (e.g., dropped
 * message, failing over framework, etc.), a framework that attempts
 * to launch tasks using an invalid offer will receive TASK_LOST
 * status updates for those tasks (see Scheduler::resourceOffers).
 */
virtual void offerRescinded(SchedulerDriver* driver,
                            const OfferID& offerId) = 0;

/**
 * Invoked when the status of a task has changed (e.g., a slave is
 * lost and so the task is lost, a task finishes and an executor
 * sends a status update saying so, etc). Note that returning from
 * this callback _acknowledges_ receipt of this status update! If
 * for whatever reason the scheduler aborts during this callback (or
 * the process exits) another status update will be delivered (note,
 * however, that this is currently not true if the slave sending the
 * status update is lost/fails during that time).
 */
virtual void statusUpdate(SchedulerDriver* driver,
                          const TaskStatus& status) = 0;

/**
 * Invoked when an executor sends a message. These messages are best
 * effort; do not expect a framework message to be retransmitted in
 * any reliable fashion.
 */
virtual void frameworkMessage(SchedulerDriver* driver,
                              const ExecutorID& executorId,
                              const SlaveID& slaveId,
                              const std::string& data) = 0;

/**
 * Invoked when a slave has been determined unreachable (e.g.,
 * machine failure, network partition). Most frameworks will need to
 * reschedule any tasks launched on this slave on a new slave.
 */
virtual void slaveLost(SchedulerDriver* driver,
                       const SlaveID& slaveId) = 0;

/**
 * Invoked when an executor has exited/terminated. Note that any
 * tasks running will have TASK_LOST status updates automagically
 * generated.
 */
virtual void executorLost(SchedulerDriver* driver,
                          const ExecutorID& executorId,
                          const SlaveID& slaveId,
                          int status) = 0;

/**
 * Invoked when there is an unrecoverable error in the scheduler or
 * scheduler driver. The driver will be aborted BEFORE invoking this
 * callback.
 */
virtual void error(SchedulerDriver* driver,
                   const std::string& message) = 0;
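The resourceOffers comment above notes that a single offer only contains resources from a single slave. One consequence is that a task has to fit inside one offer; resources from different offers cannot be added together. The following is a minimal sketch of that check, assuming a flattened `SimpleOffer` struct and a `TaskNeed` struct that are both hypothetical (the real Offer is a protocol buffer message with a list of named resources).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a Mesos Offer (illustration only).
struct SimpleOffer {
  std::string offerId;
  std::string slaveId;
  double cpus;
  double memMb;
};

// Hypothetical description of what one task needs.
struct TaskNeed {
  double cpus;
  double memMb;
};

// Returns the index of the first offer that can hold the task, or -1 if no
// single offer is large enough. Offers are checked one at a time because each
// offer belongs to exactly one slave, so they must not be summed.
int pickOffer(const std::vector<SimpleOffer>& offers, const TaskNeed& need) {
  assert(need.cpus >= 0 && need.memMb >= 0);
  for (std::size_t i = 0; i < offers.size(); ++i) {
    if (offers[i].cpus >= need.cpus && offers[i].memMb >= need.memMb) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

A scheduler could call a helper like `pickOffer` from inside its resourceOffers callback and launch the task against the chosen offer. First-fit is used here only because it is short; it is not presented as what any particular Mesos framework does.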

3 Step 2: Create a Framework Executor

Your framework executor must inherit from the Executor class and override the launchTask() method. Inside your executor, you can use the $MESOS_HOME environment variable to determine where Mesos is running from.
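As a rough illustration of overriding launchTask(), the sketch below defines its own tiny `Executor`, `ExecutorDriver`, `TaskInfo`, and `TaskState` types. These are stand-ins invented for this example and are not the real Mesos declarations. In particular, the stand-in `sendStatusUpdate` takes a task ID and a state directly, whereas the real driver takes a TaskStatus message.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real Mesos types.
enum TaskState { TASK_RUNNING, TASK_FINISHED };

struct TaskInfo {
  std::string taskId;
};

// Stand-in driver: records the status updates an executor sends.
class ExecutorDriver {
public:
  void sendStatusUpdate(const std::string& taskId, TaskState state) {
    updates.push_back(std::make_pair(taskId, state));
  }

  std::vector<std::pair<std::string, TaskState> > updates;
};

// Stand-in for the Executor interface that a framework subclasses.
class Executor {
public:
  virtual ~Executor() {}
  virtual void launchTask(ExecutorDriver* driver, const TaskInfo& task) = 0;
};

// The framework's executor: reports RUNNING, does its work, reports FINISHED.
class MyExecutor : public Executor {
public:
  void launchTask(ExecutorDriver* driver, const TaskInfo& task) override {
    assert(driver != nullptr);
    driver->sendStatusUpdate(task.taskId, TASK_RUNNING);
    // The task's actual work would go here. This sketch does it inline;
    // long-running work would normally be moved to a thread or process.
    driver->sendStatusUpdate(task.taskId, TASK_FINISHED);
  }
};
```

Calling `launchTask` on a `MyExecutor` with a stand-in driver leaves two entries in `driver.updates`, one for each state the task passed through.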

3.1 Executor API

Declared in MESOS_HOME/include/mesos/executor.hpp

/**
 * Invoked once the executor driver has been able to successfully
 * connect with Mesos. In particular, a scheduler can pass some
 * data to its executors through the FrameworkInfo.ExecutorInfo's
 * data field.
 */
virtual void registered(ExecutorDriver* driver,
                        const ExecutorInfo& executorInfo,
                        const FrameworkInfo& frameworkInfo,
                        const SlaveInfo& slaveInfo) = 0;

/**
 * Invoked when the executor re-registers with a restarted slave.
 */
virtual void reregistered(ExecutorDriver* driver,
                          const SlaveInfo& slaveInfo) = 0;

/**
 * Invoked when the executor becomes "disconnected" from the slave
 * (e.g., the slave is being restarted due to an upgrade).
 */
virtual void disconnected(ExecutorDriver* driver) = 0;

/**
 * Invoked when a task has been launched on this executor (initiated
 * via Scheduler::launchTasks). Note that this task can be realized
 * with a thread, a process, or some simple computation, however, no
 * other callbacks will be invoked on this executor until this
 * callback has returned.
 */
virtual void launchTask(ExecutorDriver* driver,
                        const TaskInfo& task) = 0;

/**
 * Invoked when a task running within this executor has been killed
 * (via SchedulerDriver::killTask). Note that no status update will
 * be sent on behalf of the executor, the executor is responsible
 * for creating a new TaskStatus (i.e., with TASK_KILLED) and
 * invoking ExecutorDriver::sendStatusUpdate.
 */
virtual void killTask(ExecutorDriver* driver,
                      const TaskID& taskId) = 0;

/**
 * Invoked when a framework message has arrived for this
 * executor. These messages are best effort; do not expect a
 * framework message to be retransmitted in any reliable fashion.
 */
virtual void frameworkMessage(ExecutorDriver* driver,
                              const std::string& data) = 0;

/**
 * Invoked when the executor should terminate all of its currently
 * running tasks. Note that after Mesos has determined that an
 * executor has terminated any tasks that the executor did not send
 * terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED,
 * TASK_FAILED, etc) a TASK_LOST status update will be created.
 */
virtual void shutdown(ExecutorDriver* driver) = 0;

/**
 * Invoked when a fatal error has occurred with the executor and/or
 * executor driver. The driver will be aborted BEFORE invoking this
 * callback.
 */
virtual void error(ExecutorDriver* driver,
                   const std::string& message) = 0;
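The killTask and shutdown comments above imply some bookkeeping: the executor itself has to report TASK_KILLED, and any task that never reported a terminal state gets a TASK_LOST once the executor is gone. The sketch below models that rule with a plain map from task ID to last reported state. The `TaskState` enum is a reduced stand-in with only the states named in the comments plus TASK_RUNNING, and the helper names `reportKilled` and `tasksThatBecomeLost` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Reduced stand-in for the task states (illustration only; the real enum is
// defined in mesos.proto and has more values).
enum TaskState { TASK_RUNNING, TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST };

// True for states after which no further updates are expected for a task.
bool isTerminal(TaskState state) {
  switch (state) {
    case TASK_FINISHED:
    case TASK_FAILED:
    case TASK_KILLED:
    case TASK_LOST:
      return true;
    case TASK_RUNNING:
      return false;
  }
  return false;
}

// Models killTask: nobody sends TASK_KILLED on the executor's behalf, so the
// executor records (and would send) that status itself.
void reportKilled(std::map<std::string, TaskState>& lastStatus,
                  const std::string& taskId) {
  assert(lastStatus.count(taskId) == 1);
  lastStatus[taskId] = TASK_KILLED;
}

// Models the shutdown rule: tasks whose last reported state is not terminal
// are the ones for which a TASK_LOST update would be created.
std::vector<std::string> tasksThatBecomeLost(
    const std::map<std::string, TaskState>& lastStatus) {
  std::vector<std::string> lost;
  for (std::map<std::string, TaskState>::const_iterator it = lastStatus.begin();
       it != lastStatus.end(); ++it) {
    if (!isTerminal(it->second)) {
      lost.push_back(it->first);
    }
  }
  return lost;
}
```

For example, if tasks `t1` and `t4` are still running when the executor exits, `tasksThatBecomeLost` returns both. If `t1` was reported through `reportKilled` first, only `t4` remains.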

4 Installing the Framework

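You must put your framework somewhere that every slave in the cluster can fetch it from. For example, if you are running HDFS, you can put the executor on HDFS. You then pass its location through the ExecutorInfo parameter of the MesosSchedulerDriver constructor. ExecutorInfo is a Protocol Buffer message class (defined in include/mesos/mesos.proto); set its URI field to something like HDFS://path/to/executor/. Alternatively, you can pass the frameworks_home option to mesos-slave when starting the slaves to specify where executors are stored, and then set ExecutorInfo to a relative path. The slave prepends frameworks_home to the relative path.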
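The prepending rule can be made concrete with a small helper. This is a sketch of the behavior described above, not code taken from Mesos: the function name `resolveExecutorPath` is hypothetical, and two details are assumptions made for the example, namely that locations with a scheme or a leading slash are used unchanged, and that an empty frameworks_home leaves a relative path untouched.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, NOT Mesos code: shows how a relative executor location
// could be combined with the frameworks_home option.
std::string resolveExecutorPath(const std::string& frameworksHome,
                                const std::string& location) {
  assert(!location.empty());

  // Assumption: locations with a scheme (e.g. HDFS://...) and absolute paths
  // are used as-is.
  if (location.find("://") != std::string::npos || location[0] == '/') {
    return location;
  }

  // Assumption: without frameworks_home, the relative path is left untouched.
  if (frameworksHome.empty()) {
    return location;
  }

  // Relative path: prepend frameworks_home, avoiding a doubled slash.
  if (frameworksHome[frameworksHome.size() - 1] == '/') {
    return frameworksHome + location;
  }
  return frameworksHome + "/" + location;
}
```

With a made-up frameworks_home of `/opt/frameworks`, the relative location `myfw/executor` resolves to `/opt/frameworks/myfw/executor`, while `HDFS://path/to/executor/` is returned unchanged.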

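Once you have confirmed that the executors are available to the mesos-slaves, you can run your scheduler. It registers with the Mesos master and then starts receiving resource offers.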

Originally published on the WeChat official account 大数据和云计算技术 (jiezhu2007)

Original publication date: 2014-03-31
