Flink Study Notes: 2. Introduction to Flink

2 Introduction to Flink

Some of you might already be using Apache Spark in your day-to-day work and might be wondering: if I have Spark, why do I need Flink? The question is expected and the comparison is natural. Let me try to answer it briefly. The first thing to understand is that Flink is built on a streaming-first principle, which means it is a true stream processing engine, not a fast batch engine that collects streams into mini-batches. Flink treats batch processing as a special case of streaming, whereas Spark does the opposite: its stream processing is essentially micro-batching.

2.1 History

Flink started as a research project named Stratosphere, with the goal of building a next-generation big data analytics platform at universities in the Berlin area. It was accepted as an Apache Incubator project on April 16, 2014.

From version 0.6, Stratosphere was renamed Flink. The latest versions of Flink focus on supporting various features such as batch processing, stream processing, graph processing, machine learning, and so on. Flink 0.7 introduced Flink's most important feature: its streaming API. The initial release had only a Java API; later releases added a Scala API as well. Now let's look at Flink's current architecture in the next section.

2.2 Architecture

Flink 1.X's architecture consists of various components such as deploy, core processing, and APIs. The following diagram shows the components, APIs, and libraries:

Flink has a layered architecture in which each component is part of a specific layer. Each layer is built on top of the others for clear abstraction. Flink is designed to run on local machines, in a YARN cluster, or on the cloud. The runtime is Flink's core data processing engine, which receives a program through the APIs in the form of a JobGraph. A JobGraph is a simple parallel data flow with a set of tasks that produce and consume data streams.

The DataStream and DataSet APIs are the interfaces programmers use to define a job. JobGraphs are generated from these APIs when the programs are compiled. Once compiled, the DataSet API lets an optimizer generate the optimal execution plan, while the DataStream API uses a stream builder for efficient execution plans.

The optimized JobGraph is then submitted to the executors according to the deployment model. You can choose a local, remote, or YARN mode of deployment. If you already have a Hadoop cluster running, it is usually better to use the YARN mode of deployment.

2.3 Distributed execution

Flink's distributed execution consists of two important kinds of processes: masters and workers. When a Flink program is executed, various processes take part in the execution, namely the Job Manager, the Task Managers, and the Job Client.

The following diagram shows the Flink program execution:

The Flink program needs to be submitted to a Job Client. The Job Client then submits the job to the Job Manager. It is the Job Manager's responsibility to orchestrate resource allocation and job execution. The very first thing it does is allocate the required resources. Once resource allocation is done, the tasks are submitted to the respective Task Managers. On receiving a task, a Task Manager initiates a thread to start the execution. While the execution is in progress, the Task Managers keep reporting state changes to the Job Manager. There can be various states, such as starting the execution, in progress, or finished. Once the job execution is complete, the results are sent back to the client.


2.3.1 Job Manager

The master processes, also known as Job Managers, coordinate and manage the execution of the program. Their main responsibilities include scheduling tasks, managing checkpoints, failure recovery, and so on.

There can be many masters running in parallel, sharing these responsibilities, which helps in achieving high availability. One of the masters is the leader; if the leader node goes down, a standby master is elected as the new leader.

The Job Manager consists of the following important components: the actor system, the scheduler, and checkpointing.

Flink internally uses the Akka actor system for communication between the Job Managers and the Task Managers.

2.3.2 Actor system

An actor system is a container of actors with various roles. It provides services such as scheduling, configuration, logging, and so on. It also contains a thread pool from which all actors are initiated. All actors reside in a hierarchy, and each newly created actor is assigned to a parent. Actors talk to each other using a messaging system: each actor has its own mailbox from which it reads all its messages. If the actors are local, messages are shared through shared memory; if the actors are remote, messages are passed through RPC calls.

Each parent is responsible for the supervision of its children. If any error happens in a child, the parent gets notified. If an actor can solve its own problem, it can restart its children. If it cannot solve the problem, it can escalate the issue to its own parent:

In Flink, an actor is a container having state and behavior. An actor's thread sequentially keeps processing the messages it receives in its mailbox. The state and the behavior are determined by the messages it has received.
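The mailbox model above can be illustrated with a toy sketch in plain Scala (this is not Akka or Flink code; `ToyActor` and its messages are invented for illustration): messages are queued in a mailbox and processed strictly one at a time, and the actor's state is determined entirely by the messages it has received.

```scala
import scala.collection.mutable

// Toy actor: state and behavior are driven entirely by mailbox messages,
// which are processed sequentially in arrival order.
class ToyActor {
  private val mailbox = mutable.Queue[String]()
  var state: Int = 0

  def send(msg: String): Unit = mailbox.enqueue(msg)

  // Sequentially drain the mailbox; each message may change the state.
  def processAll(): Unit =
    while (mailbox.nonEmpty) mailbox.dequeue() match {
      case "inc"   => state += 1
      case "reset" => state = 0
      case _       => () // unknown messages are ignored
    }
}

val a = new ToyActor
a.send("inc"); a.send("inc"); a.send("reset"); a.send("inc")
a.processAll()
println(a.state)  // 1
```

Because the mailbox is drained sequentially, the final state depends only on the order of messages, which is the property the actor model relies on.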

2.3.3 Scheduler

Executors in Flink are defined as task slots. Each Task Manager needs to manage one or more task slots. Internally, Flink decides which tasks need to share a slot and which tasks must be placed into a specific slot. It defines this through the SlotSharingGroup and CoLocationGroup.
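As a sketch of how slot sharing surfaces in user code, Flink's 1.x DataStream API lets an operator start a new slot sharing group with `slotSharingGroup`; downstream operators inherit the group of their inputs. This fragment assumes a Flink runtime and hypothetical `stream`, `parse`, and `isValid` definitions, so it is not runnable on its own:

```scala
// Operators are in the "default" slot sharing group unless told otherwise.
// Calling slotSharingGroup on an operator starts a new group from that
// operator onwards, so its subtasks are scheduled into separate slots.
val parsed = stream
  .map(parse _)                // stays in the "default" group
  .filter(_.isValid)
  .slotSharingGroup("heavy")   // this operator and downstream ones use "heavy"
```

Isolating expensive operators into their own sharing group is a common way to keep them from competing for slot resources with lightweight ones.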

2.3.4 Checkpointing

Checkpointing is Flink's backbone for providing consistent fault tolerance. It keeps taking consistent snapshots of the distributed data streams and executor states. It is inspired by the Chandy-Lamport algorithm, but has been modified for Flink's tailored requirements.

The fault-tolerant mechanism keeps creating lightweight snapshots of the data flows, so processing can continue without any significant overhead. Generally, the state of the data flow is kept in a configured place such as HDFS.

In case of any failure, Flink stops the executors, resets them, and starts executing from the latest available checkpoint.

Stream barriers are the core elements of Flink's snapshots. They are injected into the data streams without affecting the flow. Barriers never overtake the records; they group sets of records into a snapshot. Each barrier carries a unique ID. The following diagram shows how the barriers are injected into the data stream for snapshots:

Each snapshot state is reported to the checkpoint coordinator in the Flink Job Manager. While drawing snapshots, Flink handles the alignment of records in order to avoid re-processing the same records after a failure. This alignment generally takes some milliseconds. But for some intense applications, where even millisecond latency is not acceptable, there is an option to trade exactly-once processing for lower latency. By default, Flink processes each record exactly once. If an application needs low latency and can live with at-least-once delivery, that trigger can be switched off; this skips the alignment and improves latency.

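The role of barriers can be sketched with a toy simulation in plain Scala (this is not Flink's implementation): barriers flow with the records, never overtaking them, and split the stream into checkpoint epochs so that every record belongs to exactly one snapshot.

```scala
// A stream element is either a data record or a checkpoint barrier with an ID.
sealed trait Element
case class Record(value: String) extends Element
case class Barrier(id: Int)      extends Element

// Group records into snapshots: each record is assigned to the epoch opened by
// the last barrier seen before it (epoch 0 before the first barrier).
def assignEpochs(stream: Seq[Element]): Map[Int, Vector[String]] = {
  var epoch = 0
  val grouped = scala.collection.mutable.Map[Int, Vector[String]]()
    .withDefaultValue(Vector())
  for (e <- stream) e match {
    case Barrier(id)   => epoch = id
    case Record(value) => grouped(epoch) = grouped(epoch) :+ value
  }
  grouped.toMap
}

val stream = Seq(Record("a"), Record("b"), Barrier(1), Record("c"),
                 Barrier(2), Record("d"))
val epochs = assignEpochs(stream)
println(epochs(0))  // Vector(a, b) -- records before barrier 1
println(epochs(1))  // Vector(c)    -- records between barriers 1 and 2
```

Because barriers never overtake records, each snapshot covers a well-defined prefix of the stream; on failure, replaying from the last completed barrier reprocesses only the records after it.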

2.3.5 Task manager

Task Managers are worker nodes that execute the tasks in one or more threads in a JVM. Parallelism of task execution is determined by the task slots available on each Task Manager. Each task slot represents a fixed share of the Task Manager's resources; for example, if a Task Manager has four slots, it allocates 25% of its memory to each slot. One or more threads can run in a task slot. Threads in the same slot share the same JVM, and tasks in the same JVM share TCP connections and heartbeat messages:
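The memory split described above is a plain even division. As a quick illustration (the 4096 MB figure is an invented example, not a Flink default):

```scala
// A Task Manager divides its managed memory evenly among its task slots.
val taskManagerMemoryMb = 4096                              // hypothetical TM memory
val numberOfSlots       = 4
val memoryPerSlotMb     = taskManagerMemoryMb / numberOfSlots
println(s"$memoryPerSlotMb MB per slot")                    // 1024 MB per slot
```

Note that slots isolate memory, not CPU: threads in different slots of the same Task Manager still share the JVM and its CPU resources.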

2.3.6 Job client

The Job Client is not an internal part of Flink's program execution, but it is the starting point of the execution. The Job Client is responsible for accepting the program from the user, creating a data flow from it, and then submitting the data flow to the Job Manager for further execution. Once the execution is completed, the Job Client returns the results to the user. A data flow is a plan of execution. Consider a very simple word count program:

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val text = env.readTextFile("input.txt")                    // Source
val counts = text
  .flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
  .map { (_, 1) }
  .groupBy(0)
  .sum(1)                                                   // Transformation
counts.writeAsCsv("output.txt", "\n", " ")                  // Sink
env.execute("word count")

When the client accepts the program from the user, it transforms it into a data flow. The data flow for the aforementioned program may look like this:

The preceding diagram shows how a program gets transformed into a data flow. Flink data flows are parallel and distributed by default. For parallel data processing, Flink partitions the operators and streams. Operator partitions are called sub-tasks. Streams can distribute the data in a one-to-one or a re-distributed manner.

The data flows directly from the source to the map operators, as there is no need to shuffle the data. But for a GroupBy operation, Flink may need to redistribute the data by keys in order to get the correct results:

2.4 Features

In the earlier sections, we tried to understand the Flink architecture and its execution model. Because of its robust architecture, Flink is full of various features.

2.4.1 High performance

Flink is designed to achieve high performance and low latency. Unlike other streaming frameworks such as Spark, you don't need to do much manual configuration to get the best performance. Flink's pipelined data processing gives better performance than its counterparts.

2.4.2 Exactly-once stateful computation

As we discussed in the previous section, Flink's distributed checkpoint processing helps to guarantee that each record is processed exactly once. For high-throughput applications, Flink provides a switch that relaxes this to at-least-once processing.
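In Flink 1.x's Scala API, this switch is part of the checkpoint configuration. A configuration sketch (the 1000 ms interval is an arbitrary example; this fragment needs a Flink runtime to run):

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.CheckpointingMode

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Default mode: exactly-once checkpoints, here every 1000 ms.
env.enableCheckpointing(1000)

// For latency-sensitive jobs, relax to at-least-once to skip barrier alignment.
env.enableCheckpointing(1000, CheckpointingMode.AT_LEAST_ONCE)
```

At-least-once mode trades possible duplicate processing after a failure for lower latency, since records no longer wait for barrier alignment.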

2.4.3 Flexible streaming windows

Flink supports data-driven windows. This means we can design a window based on time, counts, or sessions. A window can also be customized, which allows us to detect specific patterns in event streams.
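Assuming Flink 1.x's Scala DataStream API, the three window types can be sketched as follows (`keyedStream` is a hypothetical keyed stream, the sizes and gaps are illustrative, and the fragment needs a Flink runtime):

```scala
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows

// Time window: aggregate elements per key over 10-second windows.
keyedStream.timeWindow(Time.seconds(10))

// Count window: fire every 100 elements per key.
keyedStream.countWindow(100)

// Session window: close a window after 5 minutes of inactivity per key.
keyedStream.window(EventTimeSessionWindows.withGap(Time.minutes(5)))
```

Custom window assigners, triggers, and evictors can be combined in the same `window(...)` call to detect more specific patterns.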

2.4.4 Fault tolerance

Flink's distributed, lightweight snapshot mechanism helps achieve a great degree of fault tolerance. It allows Flink to provide high-throughput performance with guaranteed delivery.

2.4.5 Memory management

Flink has its own memory management inside the JVM, which makes it independent of Java's default garbage collector. It does memory management efficiently by using hashing, indexing, caching, and sorting.

2.4.6 Optimizer

Flink's batch data processing API is optimized to avoid memory-consuming operations such as shuffle, sort, and so on. It also makes sure that caching is used to avoid heavy disk I/O operations.

2.4.7 Stream and batch in one platform

Flink provides APIs for both batch and stream data processing, so once you set up the Flink environment, it can easily host both stream and batch processing applications. In fact, Flink works on a streaming-first principle and considers batch processing the special case of streaming.

2.4.8 Libraries

Flink has a rich set of libraries for machine learning, graph processing, relational data processing, and so on. Because of its architecture, it is very easy to perform complex event processing and alerting. We will see more about these libraries in subsequent chapters.

2.4.9 Event time semantics

Flink supports event time semantics, which helps in processing streams where events arrive out of order. Sometimes events may arrive delayed. Flink's architecture allows us to define windows based on time, counts, and sessions, which helps in dealing with such scenarios.
