
"HBase: The Definitive Guide" Study Notes, Part 1: Introduction

Author: 架构师刀哥
Published: 2018-03-20 17:29:16
Collected in the column: 坚毅的PHP

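To solve a database multi-write problem, a colleague recommended HBase and ran an HBase training session. I also saw that our boss Tim, speaking at a conference, said Taobao uses HBase to replace part of its core MySQL applications. These notes are my attempt to study HBase and see whether it fits our case.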

The fallacies of distributed computing:

1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.

Downloaded version 0.92.1: 889 Java files (find . -name '*.java' | wc -l) and 285,749 lines of Java code.

Outline summary of "HBase: The Definitive Guide":

  1. Evolution of HBase
     - November 2006: Google releases paper on BigTable
     - February 2007: Initial HBase prototype created as Hadoop contrib
     - October 2007: First "usable" HBase (Hadoop 0.15.0)
     - January 2008: Hadoop becomes an Apache top-level project, HBase becomes subproject
     - October 2008: HBase 0.18.1 released
     - January 2009: HBase 0.19.0 released
     - September 2009: HBase 0.20.0 released, the performance release
     - May 2010: HBase becomes an Apache top-level project
     - June 2010: HBase 0.89.20100621, first developer release
     - January 2011: HBase 0.90.0 released, the durability and stability release
     - Mid 2011: HBase 0.92.0 released, tagged as coprocessor and security release
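  2. Limitations of an RDBMS. The book uses the example application "Hush, the HBase URL Shortener": as traffic grows you add slaves, then a cache, then restrict yourself to simple queries. You keep optimizing and scaling reads and writes, split tables and databases, change application code to do sharding, buy better hardware, and after that the nightmare never ends.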
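  3. HBase's column-oriented tables. "The most basic unit is a column. One or more columns form a row that is addressed uniquely by a row key. A number of rows, in turn, form a table, and there can be many of them. Each column may have multiple versions, with each distinct value contained in a separate cell." Access is by (Table, RowKey, Family, Column, Timestamp) → Value, which can be expressed in a programming language as SortedMap<RowKey, List<SortedMap<Column, List<Value, Timestamp>>>> (p19). The same row key can hold data with different timestamps, each corresponding to a different version. Data is stored in HFiles. Each HFile is a sequence of blocks (64 KB by default) with a block index stored at the end of the file, and that index is loaded into memory when the file is opened. HFiles are in turn stored in the Hadoop Distributed File System (HDFS), which replicates data across servers so that written data is not lost.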
  4. HBase auto-sharding. Regions are the unit HBase uses to manage, monitor, and perform sharding. "Each region is served by exactly one region server, and each of these servers can serve many regions at any time."
  5. Data write path. "When data is updated it is first written to a commit log, called a write-ahead log (WAL) in HBase, and then stored in the in-memory memstore. Once the data in memory has exceeded a given maximum value, it is flushed as an HFile to disk. After the flush, the commit logs can be discarded up to the last unflushed modification. While the system is flushing the memstore to disk, it can continue to serve readers and writers without having to block them." "Since flushing memstores to disk causes more and more HFiles to be created, HBase has a housekeeping mechanism that merges the files into larger ones using compaction. There are two types of compaction: minor compactions and major compactions." (p24)
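  6. HBase components: "the client library, one master server, and many region servers." The HBase master server uses ZooKeeper to help manage the region servers, and it handles load balancing by moving regions off busy servers onto less loaded ones. Compared with Google's BigTable, HBase adds "push-down predicates, that is, filters, reducing data transferred over the network".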
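
The (Table, RowKey, Family, Column, Timestamp) → Value mapping from the column-oriented table notes above can be sketched in a few lines of Python. This is a toy in-memory model and not HBase's API: the class `ToyTable`, its methods, and the `max_versions` parameter are hypothetical names chosen for illustration, and a column is written as a plain "family:qualifier" string.

```python
from collections import defaultdict


class ToyTable:
    """Toy model (an assumption for illustration, not the HBase client API).

    Maps row key -> "family:qualifier" column -> {timestamp: value}.
    """

    def __init__(self, max_versions=3):
        # Hypothetical knob: how many versions of a cell are retained.
        self.max_versions = max_versions
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        versions = self.rows[row][column]
        versions[timestamp] = value  # same timestamp overwrites that version
        # Drop everything older than the newest max_versions timestamps.
        for old_ts in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[old_ts]

    def get(self, row, column, max_timestamp=None):
        """Return the newest value whose timestamp is <= max_timestamp."""
        versions = self.rows.get(row, {}).get(column, {})
        for ts in sorted(versions, reverse=True):
            if max_timestamp is None or ts <= max_timestamp:
                return versions[ts]
        return None

    def scan(self, start_row, stop_row):
        """Yield (row key, {column: newest value}) for start_row <= key < stop_row."""
        # Sorting on every scan keeps the toy short; a real store keeps keys sorted.
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                newest = {col: vers[max(vers)] for col, vers in self.rows[row].items()}
                yield row, newest
```

Rows are visited in row-key order, which is what makes range scans over adjacent keys natural in this model; versions of one cell are told apart only by timestamp.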
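
The write path quoted above (commit log first, then memstore, then a flush to an immutable file, then compaction) can also be sketched as a toy. Everything here is an assumption for illustration: `ToyRegionStore`, `flush_threshold`, and `compaction_threshold` are hypothetical names, thresholds count entries instead of bytes, deletes are not modeled, and the compaction simply merges all files, which is closer in spirit to a major compaction than a minor one.

```python
class ToyRegionStore:
    """Toy write path: WAL -> memstore -> flushed sorted files -> compaction.

    Illustrative sketch only; none of these names come from HBase itself.
    """

    def __init__(self, flush_threshold=3, compaction_threshold=3):
        self.flush_threshold = flush_threshold            # entries, not bytes
        self.compaction_threshold = compaction_threshold  # number of files
        self.wal = []        # append-only log of edits not yet flushed
        self.memstore = {}   # in-memory buffer: key -> latest value
        self.hfiles = []     # immutable sorted files, oldest first

    def put(self, key, value):
        self.wal.append((key, value))  # 1. record the edit in the log first
        self.memstore[key] = value     # 2. then apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # Write the memstore out as one sorted, immutable file.
        self.hfiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}
        self.wal.clear()  # every logged edit is now persisted in a file
        if len(self.hfiles) >= self.compaction_threshold:
            self.compact()

    def compact(self):
        merged = {}
        for hfile in self.hfiles:  # oldest first, so newer values win
            merged.update(hfile)
        self.hfiles = [tuple(sorted(merged.items()))]

    def get(self, key):
        if key in self.memstore:   # freshest data lives in memory
            return self.memstore[key]
        for hfile in reversed(self.hfiles):  # then newest file to oldest
            for k, v in hfile:
                if k == key:
                    return v
        return None
```

Reads have to consult the memstore and then every file from newest to oldest, which is why merging files through compaction matters as flushes accumulate.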
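
The auto-sharding note says each region is served by exactly one region server while a server can hold many regions. A minimal sketch of that relationship follows, assuming regions are contiguous row-key ranges identified by their start key. `ToyRegionMap`, the split keys, the server names, and the round-robin assignment are all hypothetical; HBase's real assignment and balancing are done by the master and are not modeled here.

```python
from bisect import bisect_right


class ToyRegionMap:
    """Toy lookup from row key to region to region server (illustration only)."""

    def __init__(self, split_keys, servers):
        # Region i covers [start_keys[i], start_keys[i + 1]); the first
        # region starts at the empty key and the last one is open-ended.
        self.start_keys = [""] + sorted(split_keys)
        # Assumption: simple round-robin placement, one server per region.
        self.assignment = [
            servers[i % len(servers)] for i in range(len(self.start_keys))
        ]

    def region_for(self, row_key):
        # Index of the last region whose start key is <= row_key.
        return bisect_right(self.start_keys, row_key) - 1

    def server_for(self, row_key):
        return self.assignment[self.region_for(row_key)]
```

Because the start keys are sorted, locating the region for a row key is a binary search, and every key maps to exactly one region and therefore to exactly one server.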
Originally published: 2012-05-28
