Mesos Development Guide

1 Framework Development Guide

In this document we refer to Mesos applications as “frameworks”.

Mesos supports Java, Python, and C++. Examples for each language are available under MESOS_HOME/src/examples/; they show how a framework's scheduler and executor are developed.

2 Step 1: Create a Framework Scheduler

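A framework scheduler can be written in C, C++, Java, Scala, or Python. Your framework scheduler must inherit from the Scheduler class. The scheduler must create an instance of SchedulerDriver (which mediates communication between your scheduler and the Mesos master) and then call SchedulerDriver.run().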
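The flow described above (inherit from Scheduler, create a driver, call run()) can be sketched roughly as follows. This is only an illustration under stated assumptions: the Scheduler and SchedulerDriver classes here are simplified stand-ins with a single callback, and FakeDriver, MyScheduler, runFramework, and the ID "framework-0001" are hypothetical names. The real classes are declared in scheduler.hpp and have more callbacks and arguments.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Simplified stand-ins for illustration only; the real Mesos classes are
// declared in MESOS_HOME/include/mesos/scheduler.hpp.
class SchedulerDriver;

class Scheduler {
public:
  virtual ~Scheduler() {}
  virtual void registered(SchedulerDriver* driver,
                          const std::string& frameworkId) = 0;
};

class SchedulerDriver {
public:
  virtual ~SchedulerDriver() {}
  virtual int run() = 0;
};

// Hypothetical driver: it "registers" the scheduler immediately instead of
// contacting a Mesos master.
class FakeDriver : public SchedulerDriver {
public:
  explicit FakeDriver(Scheduler* scheduler) : scheduler_(scheduler) {
    assert(scheduler != nullptr);
  }
  int run() override {
    scheduler_->registered(this, "framework-0001");  // made-up ID
    return 0;
  }
private:
  Scheduler* scheduler_;
};

// The framework's scheduler inherits from Scheduler and overrides callbacks.
class MyScheduler : public Scheduler {
public:
  void registered(SchedulerDriver*, const std::string& frameworkId) override {
    frameworkId_ = frameworkId;
    std::cout << "Registered as " << frameworkId << std::endl;
  }
  const std::string& frameworkId() const { return frameworkId_; }
private:
  std::string frameworkId_;
};

// Creates a driver for the scheduler and runs it.
int runFramework(MyScheduler& scheduler) {
  FakeDriver driver(&scheduler);
  return driver.run();
}
```

In a real framework the driver would be the Mesos-provided one, and run() would block while callbacks arrive from the master.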

2.1 Scheduler API

Declared in MESOS_HOME/include/mesos/scheduler.hpp

/**
 * Empty virtual destructor (necessary to instantiate subclasses).
 */
virtual ~Scheduler() {}

/**
 * Invoked when the scheduler successfully registers with a Mesos
 * master. A unique ID (generated by the master) used for
 * distinguishing this framework from others and MasterInfo
 * with the ip and port of the current master are provided as arguments.
 */
virtual void registered(SchedulerDriver* driver,
                        const FrameworkID& frameworkId,
                        const MasterInfo& masterInfo) = 0;

/**
 * Invoked when the scheduler re-registers with a newly elected Mesos master.
 * This is only called when the scheduler has previously been registered.
 * MasterInfo containing the updated information about the elected master
 * is provided as an argument.
 */
virtual void reregistered(SchedulerDriver* driver,
                          const MasterInfo& masterInfo) = 0;

/**
 * Invoked when the scheduler becomes "disconnected" from the master
 * (e.g., the master fails and another is taking over).
 */
virtual void disconnected(SchedulerDriver* driver) = 0;

/**
 * Invoked when resources have been offered to this framework. A
 * single offer will only contain resources from a single slave.
 * Resources associated with an offer will not be re-offered to
 * _this_ framework until either (a) this framework has rejected
 * those resources (see SchedulerDriver::launchTasks) or (b) those
 * resources have been rescinded (see Scheduler::offerRescinded).
 * Note that resources may be concurrently offered to more than one
 * framework at a time (depending on the allocator being used). In
 * that case, the first framework to launch tasks using those
 * resources will be able to use them while the other frameworks
 * will have those resources rescinded (or if a framework has
 * already launched tasks with those resources then those tasks will
 * fail with a TASK_LOST status and a message saying as much).
 */
virtual void resourceOffers(SchedulerDriver* driver,
                            const std::vector<Offer>& offers) = 0;

/**
 * Invoked when an offer is no longer valid (e.g., the slave was
 * lost or another framework used resources in the offer). If for
 * whatever reason an offer is never rescinded (e.g., dropped
 * message, failing over framework, etc.), a framework that attempts
 * to launch tasks using an invalid offer will receive TASK_LOST
 * status updates for those tasks (see Scheduler::resourceOffers).
 */
virtual void offerRescinded(SchedulerDriver* driver,
                            const OfferID& offerId) = 0;

/**
 * Invoked when the status of a task has changed (e.g., a slave is
 * lost and so the task is lost, a task finishes and an executor
 * sends a status update saying so, etc). Note that returning from
 * this callback _acknowledges_ receipt of this status update! If
 * for whatever reason the scheduler aborts during this callback (or
 * the process exits) another status update will be delivered (note,
 * however, that this is currently not true if the slave sending the
 * status update is lost/fails during that time).
 */
virtual void statusUpdate(SchedulerDriver* driver,
                          const TaskStatus& status) = 0;

/**
 * Invoked when an executor sends a message. These messages are best
 * effort; do not expect a framework message to be retransmitted in
 * any reliable fashion.
 */
virtual void frameworkMessage(SchedulerDriver* driver,
                              const ExecutorID& executorId,
                              const SlaveID& slaveId,
                              const std::string& data) = 0;

/**
 * Invoked when a slave has been determined unreachable (e.g.,
 * machine failure, network partition). Most frameworks will need to
 * reschedule any tasks launched on this slave on a new slave.
 */
virtual void slaveLost(SchedulerDriver* driver,
                       const SlaveID& slaveId) = 0;

/**
 * Invoked when an executor has exited/terminated. Note that any
 * tasks running will have TASK_LOST status updates automagically
 * generated.
 */
virtual void executorLost(SchedulerDriver* driver,
                          const ExecutorID& executorId,
                          const SlaveID& slaveId,
                          int status) = 0;

/**
 * Invoked when there is an unrecoverable error in the scheduler or
 * scheduler driver. The driver will be aborted BEFORE invoking this
 * callback.
 */
virtual void error(SchedulerDriver* driver,
                   const std::string& message) = 0;
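As one illustration of the kind of decision a resourceOffers implementation might make, the sketch below works out how many fixed-size tasks fit into each offer. It relies on several assumptions: SimpleOffer, TaskSize, tasksThatFit, and planLaunches are hypothetical names, and the offer is flattened to two numbers, whereas a real Offer carries resource messages and tasks are launched through SchedulerDriver::launchTasks.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, flattened view of an offer; not the Mesos Offer type.
struct SimpleOffer {
  std::string offerId;
  double cpus;
  double memMb;
};

// Hypothetical per-task resource requirement.
struct TaskSize {
  double cpus;
  double memMb;
};

// Returns how many tasks of the given size fit into one offer.
int tasksThatFit(const SimpleOffer& offer, const TaskSize& size) {
  assert(size.cpus > 0 && size.memMb > 0);
  int byCpu = static_cast<int>(offer.cpus / size.cpus);
  int byMem = static_cast<int>(offer.memMb / size.memMb);
  return std::max(0, std::min(byCpu, byMem));
}

// Plans a task count for each offer, stopping once `wanted` tasks are placed.
// An entry of 0 means the offer would not be used.
std::vector<int> planLaunches(const std::vector<SimpleOffer>& offers,
                              const TaskSize& size, int wanted) {
  std::vector<int> plan;
  for (const SimpleOffer& offer : offers) {
    int n = std::min(tasksThatFit(offer, size), wanted);
    plan.push_back(n);
    wanted -= n;
  }
  return plan;
}
```

Because a single offer only contains resources from a single slave, planning per offer also decides which slave each task would land on.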

3 Step 2: Create a Framework Executor

Your framework executor must inherit from the Executor class and must override the launchTask() method. Inside your executor, you can read the $MESOS_HOME environment variable to determine where Mesos is running from.
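A minimal sketch of an executor that overrides launchTask() might look like the following. It is written under assumptions: Executor and ExecutorDriver here are simplified stand-ins (the real ones are declared in executor.hpp and pass TaskInfo and TaskStatus messages), and RecordingDriver and MyExecutor are hypothetical names. The sketch reports TASK_RUNNING, does its work inline, and then reports TASK_FINISHED.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Subset of task states used by this sketch.
enum TaskState { TASK_RUNNING, TASK_FINISHED, TASK_KILLED };

// Simplified stand-ins for illustration only; not the real Mesos classes.
class ExecutorDriver {
public:
  virtual ~ExecutorDriver() {}
  virtual void sendStatusUpdate(const std::string& taskId,
                                TaskState state) = 0;
};

class Executor {
public:
  virtual ~Executor() {}
  virtual void launchTask(ExecutorDriver* driver,
                          const std::string& taskId) = 0;
};

// Hypothetical driver that records updates instead of sending them to a slave.
class RecordingDriver : public ExecutorDriver {
public:
  void sendStatusUpdate(const std::string& taskId, TaskState state) override {
    updates.push_back(std::make_pair(taskId, state));
  }
  std::vector<std::pair<std::string, TaskState> > updates;
};

// The framework's executor inherits from Executor and overrides launchTask().
class MyExecutor : public Executor {
public:
  void launchTask(ExecutorDriver* driver, const std::string& taskId) override {
    assert(driver != nullptr);
    driver->sendStatusUpdate(taskId, TASK_RUNNING);
    // ... the task's actual work would go here ...
    driver->sendStatusUpdate(taskId, TASK_FINISHED);
  }
};
```

Doing the work inline keeps the sketch short; an executor with long-running tasks would more likely hand the work to a thread or process so that launchTask() returns promptly.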

3.1 Executor API

Declared in MESOS_HOME/include/mesos/executor.hpp

/**
 * Invoked once the executor driver has been able to successfully
 * connect with Mesos. In particular, a scheduler can pass some
 * data to its executors through the FrameworkInfo.ExecutorInfo's
 * data field.
 */
virtual void registered(ExecutorDriver* driver,
                        const ExecutorInfo& executorInfo,
                        const FrameworkInfo& frameworkInfo,
                        const SlaveInfo& slaveInfo) = 0;

/**
 * Invoked when the executor re-registers with a restarted slave.
 */
virtual void reregistered(ExecutorDriver* driver,
                          const SlaveInfo& slaveInfo) = 0;

/**
 * Invoked when the executor becomes "disconnected" from the slave
 * (e.g., the slave is being restarted due to an upgrade).
 */
virtual void disconnected(ExecutorDriver* driver) = 0;

/**
 * Invoked when a task has been launched on this executor (initiated
 * via Scheduler::launchTasks). Note that this task can be realized
 * with a thread, a process, or some simple computation, however, no
 * other callbacks will be invoked on this executor until this
 * callback has returned.
 */
virtual void launchTask(ExecutorDriver* driver,
                        const TaskInfo& task) = 0;

/**
 * Invoked when a task running within this executor has been killed
 * (via SchedulerDriver::killTask). Note that no status update will
 * be sent on behalf of the executor, the executor is responsible
 * for creating a new TaskStatus (i.e., with TASK_KILLED) and
 * invoking ExecutorDriver::sendStatusUpdate.
 */
virtual void killTask(ExecutorDriver* driver,
                      const TaskID& taskId) = 0;

/**
 * Invoked when a framework message has arrived for this
 * executor. These messages are best effort; do not expect a
 * framework message to be retransmitted in any reliable fashion.
 */
virtual void frameworkMessage(ExecutorDriver* driver,
                              const std::string& data) = 0;

/**
 * Invoked when the executor should terminate all of its currently
 * running tasks. Note that after Mesos has determined that an
 * executor has terminated any tasks that the executor did not send
 * terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED,
 * TASK_FAILED, etc) a TASK_LOST status update will be created.
 */
virtual void shutdown(ExecutorDriver* driver) = 0;

/**
 * Invoked when a fatal error has occurred with the executor and/or
 * executor driver. The driver will be aborted BEFORE invoking this
 * callback.
 */
virtual void error(ExecutorDriver* driver,
                   const std::string& message) = 0;
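The killTask and shutdown comments imply that an executor should know which of its tasks have not yet reached a terminal state, since it is responsible for sending those updates itself. One possible way to keep track is sketched below. TaskTracker and isTerminal are hypothetical helpers, not part of the Mesos API, and the TaskState enum is a local stand-in that reuses the state names mentioned in the comments.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Local stand-in for the task states named in the API comments.
enum TaskState {
  TASK_RUNNING, TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST
};

// True for states after which no further updates are expected for a task.
bool isTerminal(TaskState state) {
  switch (state) {
    case TASK_FINISHED:
    case TASK_FAILED:
    case TASK_KILLED:
    case TASK_LOST:
      return true;
    default:
      return false;
  }
}

// Hypothetical helper that remembers the latest state of each task.
class TaskTracker {
public:
  void update(const std::string& taskId, TaskState state) {
    assert(!taskId.empty());
    states_[taskId] = state;
  }

  // Tasks that would still need a terminal status update (for example
  // TASK_KILLED) if the executor were asked to shut down now.
  std::vector<std::string> nonTerminalTasks() const {
    std::vector<std::string> result;
    std::map<std::string, TaskState>::const_iterator it;
    for (it = states_.begin(); it != states_.end(); ++it) {
      if (!isTerminal(it->second)) {
        result.push_back(it->first);
      }
    }
    return result;
  }

private:
  std::map<std::string, TaskState> states_;
};
```

In a shutdown() callback, an executor could iterate over nonTerminalTasks(), stop each task, and send a TASK_KILLED update for it rather than leaving Mesos to generate TASK_LOST.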

4 Installing a Framework

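You need to put your framework somewhere that all slaves in the cluster can fetch it from. For example, if you are running HDFS, you can put your executor into HDFS. You then tell Mesos where it is through the ExecutorInfo parameter of the MesosSchedulerDriver constructor. ExecutorInfo is a Protocol Buffer message class (defined in include/mesos/mesos.proto); set its URI field to something like HDFS://path/to/executor/. Alternatively, you can pass the frameworks_home configuration option to the mesos-slave daemons when you launch them to specify where your framework executors are stored, and then set ExecutorInfo to a relative path. The slave will prepend the value of frameworks_home to that relative path.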
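The prepending rule for frameworks_home can be illustrated with a small helper. This is a sketch only, not the mesos-slave implementation: resolveExecutorPath is a hypothetical function, and leaving absolute paths and values containing "://" untouched is an assumption made here for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins frameworks_home with a relative executor path.
// Assumption: absolute paths and URI-like values are returned unchanged.
std::string resolveExecutorPath(const std::string& frameworksHome,
                                const std::string& executorPath) {
  assert(!executorPath.empty());
  bool hasScheme = executorPath.find("://") != std::string::npos;
  bool isAbsolute = executorPath[0] == '/';
  if (hasScheme || isAbsolute || frameworksHome.empty()) {
    return executorPath;
  }
  // Avoid a doubled slash when frameworks_home already ends with one.
  bool needsSlash = frameworksHome[frameworksHome.size() - 1] != '/';
  return frameworksHome + (needsSlash ? "/" : "") + executorPath;
}
```

For instance, with frameworks_home set to /opt/frameworks (a made-up location), a relative value of my/executor would resolve to /opt/frameworks/my/executor.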

Once you are sure that your executors are available to the mesos-slaves, you can run your scheduler. It will register with the Mesos master and start receiving resource offers.

Originally published on the WeChat official account 大数据和云计算技术 (jiezhu2007)

Originally published: 2014-03-31

