
Some Reading Notes on Distributed Systems

Author: 哒呵呵
Published: 2018-08-06 15:43:06

Distributed systems at a high level:
- scalability, availability, performance, latency and fault tolerance
- the CAP theorem and the FLP impossibility result
- time and order
- the replication problem: preventing divergence and accepting divergence

Note: Most things are trivial at a small scale - and the same problem becomes much harder once you surpass a certain size, volume or other physically constrained thing.

Scalability: the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth. There are three kinds:
- Size scalability: adding more nodes should make the system linearly faster; growing the dataset should not increase latency.
- Geographic scalability: it should be possible to use multiple data centers to reduce the time it takes to respond to user queries, while dealing with cross-data center latency in some sensible manner.
- Administrative scalability: adding more nodes should not increase the administrative costs of the system (e.g. the administrators-to-machines ratio).
Of course, all three are rarely achieved together; for geographic scalability, Google's Spanner is about the only example.

Performance: characterized by the amount of useful work accomplished by a computer system compared to the time and resources used. There are three criteria:
- short response time / low latency for a given piece of work
- high throughput (rate of processing work)
- low utilization of computing resource(s)

Availability (and fault tolerance): the proportion of time a system is in a functioning condition. If a user cannot access the system, it is said to be unavailable.
Note: Distributed systems can take a bunch of unreliable components, and build a reliable system on top of them.
Availability = uptime / (uptime + downtime)
Fault tolerance: the ability of a system to behave in a well-defined manner once faults occur.
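
To make the availability formula above concrete, here is a small sketch; the downtime figures are illustrative examples, not from the original notes:

```python
# Availability = uptime / (uptime + downtime)
def availability(uptime_hours: float, downtime_hours: float) -> float:
    return uptime_hours / (uptime_hours + downtime_hours)

# Illustrative figures: one year of operation with different amounts of downtime.
HOURS_PER_YEAR = 365 * 24
for downtime in (87.6, 8.76, 0.876):  # roughly 99%, 99.9%, 99.99% availability
    a = availability(HOURS_PER_YEAR - downtime, downtime)
    print(f"{downtime:>7.3f} h downtime/year -> availability {a:.4%}")
```

Each extra "nine" of availability cuts the allowed downtime per year by a factor of ten.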
Two factors constrain a distributed system:
- the number of nodes (which increases with the required storage and computation capacity)
- the distance between nodes (information travels, at best, at the speed of light)

For example:
- an increase in the number of independent nodes increases the probability of failure in a system (reducing availability and increasing administrative costs)
- an increase in the number of independent nodes may increase the need for communication between nodes (reducing performance as scale increases)
- an increase in geographic distance increases the minimum latency for communication between distant nodes (reducing performance for certain operations)

The two basic techniques are partitioning and replication.

Partitioning is dividing the dataset into smaller, distinct, independent sets; this is used to reduce the impact of dataset growth, since each partition is a subset of the data. The benefits of partitioning:
- Partitioning improves performance by limiting the amount of data to be examined and by locating related data in the same partition.
- Partitioning improves availability by allowing partitions to fail independently, increasing the number of nodes that need to fail before availability is sacrificed.

Replication is making copies of the same data on multiple machines. The benefits of replication:
- Replication improves performance by making additional computing power and bandwidth applicable to a new copy of the data.
- Replication improves availability by creating additional copies of the data, increasing the number of nodes that need to fail before availability is sacrificed.
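
A minimal sketch of the two techniques, assuming a fixed set of nodes and simple hash partitioning; the node names and replication factor are illustrative, not from the original notes:

```python
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]  # illustrative cluster
REPLICAS = 2                                      # illustrative replication factor

def partition_of(key: str) -> int:
    """Partitioning: each key deterministically maps to one partition."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % len(NODES)

def replicas_of(key: str) -> list[str]:
    """Replication: store the key on REPLICAS consecutive nodes, so a single
    node failure does not make the key unavailable."""
    first = partition_of(key)
    return [NODES[(first + i) % len(NODES)] for i in range(REPLICAS)]

print(replicas_of("user:42"))  # e.g. ['node-3', 'node-0']
```

Real systems typically use consistent hashing or range partitioning so that adding or removing a node moves as little data as possible, but the basic idea is the same.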
The abstractions of a distributed system:
- each node executes a program concurrently
- knowledge is local: nodes have fast access only to their local state, and any information about global state is potentially out of date
- nodes can fail and recover from failure independently
- messages can be delayed or lost (independently of node failure; it is not easy to distinguish network failure from node failure)
- clocks are not synchronized across nodes (local timestamps do not correspond to the global real-time order, which cannot be easily observed)

Nodes have:
- the ability to execute a program
- the ability to store data into volatile memory (which can be lost upon failure) and into stable state (which can be read after a failure)
- a clock (which may or may not be assumed to be accurate)

Communication links connect individual nodes to each other and allow messages to be sent in either direction.

Timing / ordering:
- Synchronous system model: processes execute in lock-step; there is a known upper bound on message transmission delay; each process has an accurate clock.
- Asynchronous system model: no timing assumptions - e.g. processes execute at independent rates; there is no bound on message transmission delay; useful clocks do not exist.

The consensus problem:
- Agreement: every correct process must agree on the same value.
- Integrity: every correct process decides at most one value, and if it decides some value, then it must have been proposed by some process.
- Termination: all processes eventually reach a decision.
- Validity: if all correct processes propose the same value V, then all correct processes decide V.

The FLP impossibility result: there does not exist a (deterministic) algorithm for the consensus problem in an asynchronous system subject to failures, even if messages can never be lost, at most one process may fail, and it can only fail by crashing (stopping executing).

The CAP theorem:
- Consistency: all nodes see the same data at the same time.
- Availability: node failures do not prevent survivors from continuing to operate.
- Partition tolerance: the system continues to operate despite message loss due to network and/or node failure.

The three pairwise combinations:
- CA (consistency + availability). Examples include full strict quorum protocols, such as two-phase commit.
- CP (consistency + partition tolerance). Examples include majority quorum protocols in which minority partitions are unavailable, such as Paxos.
- AP (availability + partition tolerance). Examples include protocols using conflict resolution, such as Dynamo.

A CA system does not distinguish between node failures and network failures, and hence must stop accepting writes everywhere to avoid introducing divergence (multiple copies). A CP system prevents divergence (e.g. maintains single-copy consistency) by forcing asymmetric behavior on the two sides of the partition.

Consistency and availability are not really binary choices, unless you limit yourself to strong consistency.

Strong consistency vs. other consistency models:
- Strong consistency models (capable of maintaining a single copy)
  - Linearizable consistency
  - Sequential consistency
- Weak consistency models (not strong)
  - Client-centric consistency models
  - Causal consistency: the strongest model available
  - Eventual consistency models

The two strong consistency models:
- Linearizable consistency: all operations appear to have executed atomically in an order that is consistent with the global real-time ordering of operations.
- Sequential consistency: all operations appear to have executed atomically in some order that is consistent with the order seen at individual nodes and that is equal at all nodes.

The key difference between the two: linearizable consistency requires that the order in which operations take effect is equal to the actual real-time ordering of operations, while sequential consistency allows operations to be reordered as long as the order observed on each node remains consistent. The only way someone can distinguish between the two is by observing all the inputs and timings going into the system; from the perspective of a client interacting with a node, the two are equivalent.

Client-centric consistency models are consistency models that involve the notion of a client or session in some way.
Note: this is often implemented by building additional caching into the client library, so that if a client moves to a replica node that contains old data, the client library returns its cached value rather than the old value from the replica.

The eventual consistency model says that if you stop changing values, then after some undefined amount of time all replicas will agree on the same value.

Total and partial order are defined as follows: the natural state in a distributed system is partial order; a total order is a binary relation that defines an order for every element in some set.

What is time? Time is a source of order. It can be understood in three ways:
- order
- duration
- interpretation

The global clock is basically a source of total order (the exact order of every operation on all nodes, even if those nodes have never communicated). In other words, assuming that clocks on distributed nodes are perfectly synchronized means assuming that clocks start at the same value and never drift apart.

The local clock assigns a partial order: events on each system are ordered, but events cannot be ordered across systems by only using a clock.

With no clock, events can be ordered on a single system using a counter and no communication, but ordering events across systems requires a message exchange.

Why do we need time?
- Time can define order across a system (without communication).
- Time can define boundary conditions for algorithms.

A logical clock is just a counter (a Lamport clock is a single counter per process; a vector clock extends this to one counter per node). The Lamport clock rules:
- whenever a process does work, increment the counter
- whenever a process sends a message, include the counter
- when a message is received, set the counter to max(local_counter, received_counter) + 1
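
A minimal sketch of a Lamport clock implementing the three rules above; the class and method names are my own, not from the original notes:

```python
class LamportClock:
    """A single logical counter per process (Lamport clock)."""

    def __init__(self) -> None:
        self.counter = 0

    def tick(self) -> int:
        """Rule 1: whenever the process does work, increment the counter."""
        self.counter += 1
        return self.counter

    def send(self) -> int:
        """Rule 2: whenever the process sends a message, include the counter."""
        return self.counter

    def receive(self, received_counter: int) -> int:
        """Rule 3: on receipt, set the counter to max(local, received) + 1."""
        self.counter = max(self.counter, received_counter) + 1
        return self.counter

# Two processes exchanging one message: the receive is ordered after the send.
a, b = LamportClock(), LamportClock()
a.tick()           # a does some local work -> a.counter == 1
ts = a.send()      # the message carries counter 1
b.receive(ts)      # b.counter becomes max(0, 1) + 1 == 2
```

The counters give a partial order: if event x causally precedes event y, then counter(x) < counter(y); the converse does not hold, which is what vector clocks fix.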
Failure detectors are characterized by completeness and accuracy:
- Strong completeness: every crashed process is eventually suspected by every correct process.
- Weak completeness: every crashed process is eventually suspected by some correct process.
- Strong accuracy: no correct process is suspected ever.
- Weak accuracy: some correct process is never suspected.

Three properties of time to pay attention to:
- the causal ordering of events
- failure detection (e.g. approximations of upper bounds on message delivery)
- consistent snapshots (e.g. the ability to examine the state of a system at some point in time; not discussed here)
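
As a concrete illustration of the completeness/accuracy trade-off, here is a minimal timeout-based (heartbeat) failure detector sketch; the timeout value and API are illustrative assumptions, not from the original notes:

```python
import time

class HeartbeatFailureDetector:
    """Suspect a peer when no heartbeat arrives within `timeout` seconds.

    A crashed peer stops sending and is eventually suspected (completeness),
    but a slow-but-correct peer may also be suspected (accuracy is only
    approximate), which is exactly the trade-off described above."""

    def __init__(self, peers: list[str], timeout: float = 5.0) -> None:
        self.timeout = timeout
        self.last_seen = {peer: time.monotonic() for peer in peers}

    def on_heartbeat(self, peer: str) -> None:
        """Record a heartbeat received from `peer`."""
        self.last_seen[peer] = time.monotonic()

    def suspected(self) -> set[str]:
        """Return the peers currently suspected of having crashed."""
        now = time.monotonic()
        return {p for p, t in self.last_seen.items() if now - t > self.timeout}

# Illustrative usage: feed in heartbeats as they arrive, poll suspected().
detector = HeartbeatFailureDetector(["node-0", "node-1"], timeout=2.0)
detector.on_heartbeat("node-0")
print(detector.suspected())  # likely empty right after the heartbeat
```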

This article is part of the Tencent Cloud Self-Media Sharing Program and was originally published on 2017-07-08 in the WeChat official account 鸿的学习笔记. For infringement concerns, contact cloudcommunity@tencent.com for removal.
