HBase Access Methods: HBase Shell (Repost)

双面人

Published on 2019-04-10 14:18:36


Included in the column: 热爱IT

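HBase access methods

1. Native Java API: the most conventional and efficient access method.
2. HBase Shell: the HBase command-line tool and the simplest interface, well suited to HBase administration.
3. Thrift Gateway: uses Thrift serialization and supports C++, PHP, Python, and other languages, so heterogeneous systems can access HBase table data online.
4. REST Gateway: exposes a REST-style HTTP API for HBase, which removes the language restriction.
5. MapReduce: process HBase data directly with MapReduce jobs.
6. Pig/Hive: process HBase data with Pig or Hive.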

HBase shell basics

The built-in help of the HBase shell documents the syntax thoroughly. Once you understand the examples shown by help, you can use the shell fluently. Shell commands fall into 10 groups. This article covers only the first four (general, ddl, namespace, dml). The groups are:

Group name: commands
general: status, table_help, version, whoami
ddl: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
namespace: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
dml: append, count, delete, deleteall, get, get_counter, incr, put, scan, truncate, truncate_preserve
tools: assign, balance_switch, balancer, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump
replication: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs
snapshots: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot
configuration: update_all_config, update_config
security: grant, revoke, user_permission
visibility labels: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

HBase shell commands in detail

general

status
Purpose: query the current server status. Example:

    hbase(main):006:0> status
    1 servers, 0 dead, 5.0000 average load

More usage:

    hbase(main):002:0> help 'status'
    hbase> status
    hbase> status 'simple'
    hbase> status 'summary'
    hbase> status 'detailed'
    hbase> status 'replication'
    hbase> status 'replication', 'source'
    hbase> status 'replication', 'sink'

version
Purpose: show the HBase version. Example:

    hbase(main):010:0> version
    1.0.3, rf1e1312f9790a7c40f6a4b5a1bab2ea1dd559890, Tue Jan 19 19:26:53 PST 2016

whoami
Purpose: show the current HBase user. Example:

    hbase(main):011:0> whoami
    datanode1 (auth:SIMPLE)
        groups: datanode1

ddl

create
Purpose: create a table. Example:

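    # In namespace ns1, create table t1 with one column family f1 that keeps 5 versions
    hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

    # In the default namespace, create table t1 with three column families f1, f2, f3
    hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
    # Equivalent short form
    hbase> create 't1', 'f1', 'f2', 'f3'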
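Because the HBase shell is a JRuby REPL, the arguments above are ordinary Ruby values: a bare word such as NAME is a constant, and `{...}` is a hash. The plain-Ruby sketch below illustrates why the long and short spellings can be treated as equivalent. It does not talk to HBase; the constant definition and the `normalize_families` helper are assumptions made for illustration, not the shell's actual code.

```ruby
# Stand-in for the constant the shell predefines (assumed here to hold the string 'NAME').
NAME = 'NAME'

# Hypothetical helper: turn every column-family argument into a hash.
# A bare string such as 'f1' becomes { NAME => 'f1' }; hashes pass through unchanged.
def normalize_families(*args)
  args.map { |arg| arg.is_a?(Hash) ? arg : { NAME => arg } }
end

long_form  = normalize_families({ NAME => 'f1' }, { NAME => 'f2' }, { NAME => 'f3' })
short_form = normalize_families('f1', 'f2', 'f3')

puts long_form == short_form
```

The short form is only available when a column family needs no attributes; as soon as VERSIONS, TTL, or similar settings are required, the hash form is needed.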

    # Create table t1 with column family f1, setting VERSIONS to 1, TTL to 2592000, and BLOCKCACHE to true.
    # The meaning of these attributes is not explained here.
    hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
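The TTL of a column family is expressed in seconds, so the value used above is easier to read after a quick conversion. This is only arithmetic, shown as a small Ruby sketch with illustrative variable names:

```ruby
# TTL value taken from the create example above (seconds).
ttl_seconds = 2_592_000
seconds_per_day = 24 * 60 * 60

ttl_days = ttl_seconds / seconds_per_day
puts "TTL of #{ttl_seconds} s = #{ttl_days} days"
```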

    # Create table t1 with column family f1 and set the f1 configuration hbase.hstore.blockingStoreFiles to 10
    hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

    # Table-level settings can be placed at the end of the create command, for example:
    hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
    hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
    hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
    hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }

    # Optionally pre-split the table into NUMREGIONS regions, using
    # SPLITALGO ("HexStringSplit", "UniformSplit" or a class name).
    # The following commands set the number of pre-split regions and the split algorithm.
    hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
    hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
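To make the two pre-splitting styles above more concrete, here is a plain-Ruby sketch that runs without HBase. The first part shows how explicit SPLITS points divide the key space into regions, with row keys compared as byte strings. The second part approximates the idea behind HexStringSplit by spacing split points evenly over an assumed 8-hex-digit key space. Both helper names are hypothetical, and the second function is a simplified approximation rather than HBase's exact implementation.

```ruby
SPLIT_POINTS = %w[10 20 30 40].freeze

# N split points produce N + 1 regions.
def region_count(splits = SPLIT_POINTS)
  splits.length + 1
end

# Index of the region that would hold row_key: region i covers
# [splits[i - 1], splits[i]), compared lexicographically as strings.
def region_index(row_key, splits = SPLIT_POINTS)
  splits.count { |split| split <= row_key }
end

# Evenly spaced split points over 00000000..ffffffff (assumed key range).
def hex_split_points(num_regions, first = 0x00000000, last = 0xFFFFFFFF)
  raise ArgumentError, 'need at least 2 regions' if num_regions < 2

  step = (last - first + 1) / num_regions
  (1...num_regions).map { |i| format('%08x', first + step * i) }
end

puts region_count
%w[05 10 25 100 99].each { |key| puts "#{key} -> region #{region_index(key)}" }
puts hex_split_points(4).inspect
puts hex_split_points(15).length
```

Note that the key '100' lands in the same region as '10', because keys are compared byte by byte rather than numerically. This is one reason fixed-width or hashed row keys are commonly used with pre-split tables.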

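    # create also returns a reference to the newly created table, which can be kept in a variable.
    # Here the new table t2 (with column family f1) is created and the variable t1 refers to it.
    hbase> t1 = create 't2', 'f1'

alter
Purpose: modify, add, or delete the column families, attributes, and configuration of a table. Example: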

    # If table t1 already has column family f1, set the number of versions of f1 to 5.
    # If t1 has no column family f1, add f1 to t1 with 5 versions.
    hbase> alter 't1', NAME => 'f1', VERSIONS => 5

    # Add or modify several column families
    hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}

    # Two ways to delete column family f1 of table t1 in namespace ns1
    hbase> alter 'ns1:t1', NAME => 'f1', METHOD => 'delete'
    hbase> alter 'ns1:t1', 'delete' => 'f1'

    # Change the MAX_FILESIZE attribute of table t1 (134217728 bytes = 128 MiB)
    hbase> alter 't1', MAX_FILESIZE => '134217728'

    # Change the configuration of table t1 or of column family f2
    hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}
    hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

    # Remove a table attribute
    hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
    hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'

    # Change several attributes in one command
    hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 }, { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' }, OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' }

alter_async
Purpose: asynchronous variant of alter. It makes the same changes but returns without waiting for all regions to receive the schema change; alter_status reports the progress.

describe / desc
Purpose: show the attributes of a table and of its column families. Example:

    # Command: show information about table t3
    hbase> describe 't3'
    # Output:
    Table t3 is ENABLED
    t3
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'colfa', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'false', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    1 row(s) in 0.0200 seconds

disable
Purpose: disable a table. A table must be disabled before it can be dropped. Example:

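    # Disable table t1
    hbase> disable 't1'

disable_all
Purpose: disable several tables; accepts a regular expression. Example:

    # Disable all tables whose names start with t
    hbase> disable_all 't.*'

drop
Purpose: drop a table. The table must be disabled first. Example:

    # Drop table t2
    hbase(main):005:0> disable 't2'
    0 row(s) in 1.2270 seconds
    hbase(main):006:0> drop 't2'
    0 row(s) in 0.1750 seconds

drop_all
Purpose: drop several tables; accepts a regular expression. Example:

    # Drop all tables whose names start with t
    hbase> drop_all 't.*'

enable
Purpose: the opposite of disable; enables a table.

enable_all
Purpose: enable several tables; accepts a regular expression.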
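Since disable_all, drop_all, and enable_all act on every table a pattern selects, it can help to preview the selection before running a destructive command. The plain-Ruby sketch below does this offline for a list of table names. It assumes the pattern must match the whole table name (which is consistent with 't.*' meaning "names starting with t"); the helper name and the extra table `user_t` are hypothetical.

```ruby
# Return the table names selected by a pattern.
# Assumption: the pattern has to match the entire name, not just a part of it.
def matching_tables(table_names, pattern)
  regex = /\A(?:#{pattern})\z/
  table_names.select { |name| regex.match?(name) }
end

tables = %w[peoples t1 t3 user_t]

puts matching_tables(tables, 't.*').inspect
puts matching_tables(tables, '.*t').inspect
```

In a live shell, running list with the same pattern first gives a comparable preview of which tables would be affected.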

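exists
Purpose: check whether a table exists. Example:

    # Check whether a table named t1 exists
    hbase(main):003:0> exists 't1'
    Table t1 does exist
    0 row(s) in 0.3170 seconds

get_table
Purpose: return a reference object for a table. Example:

    # Assign a reference to table t1 to the variable t1d
    hbase> t1d = get_table 't1'
    # Operations on t1d
    t1d.scan
    t1d.describe
    ...

is_disabled
Purpose: check whether a table is disabled.

is_enabled
Purpose: check whether a table is enabled.

list
Purpose: list the tables in HBase; accepts a regular expression. Example:

    # List all tables in all namespaces
    hbase> list
    # List tables whose names start with abc
    hbase> list 'abc.*'
    # List tables in namespace ns whose names start with abc
    hbase> list 'ns:abc.*'
    # List all tables in namespace ns
    hbase> list 'ns:.*'

show_filters
Purpose: show all available filters. Example:

    # Show all filters
    hbase> show_filters

namespace

create_namespace
Purpose: create a namespace. Example: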

    # Create namespace ns1
    hbase> create_namespace 'ns1'

    # Create namespace ns1 and set a property on it
    hbase> create_namespace 'ns1', {'PROPERTY_NAME'=>'PROPERTY_VALUE'}

alter_namespace
Purpose: modify, add, or remove properties of a namespace. Example:

    # Set a property of namespace ns1
    hbase> alter_namespace 'ns1', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}

    # Remove a property of namespace ns1
    hbase> alter_namespace 'ns1', {METHOD => 'unset', NAME=>'PROPERTY_NAME'}

describe_namespace
Purpose: describe a namespace. Example:

    # Describe namespace ns1
    hbase(main):008:0> describe_namespace 'ns1'
    DESCRIPTION
    {NAME => 'ns1', PROPERTY_NAME => 'PROPERTY_VALUE'}
    1 row(s) in 0.0040 seconds

drop_namespace
Purpose: drop a namespace. The namespace must be empty, that is, it must contain no tables.

list_namespace
Purpose: list all namespaces; accepts an optional regular expression. Example:

    # List all namespaces
    hbase> list_namespace
    # List namespaces whose names start with abc
    hbase> list_namespace 'abc.*'

list_namespace_tables
Purpose: list all tables in a given namespace. Example:

    # List all tables in the default namespace
    hbase(main):004:0> list_namespace_tables 'default'
    TABLE
    peoples
    t1
    t3
    3 row(s) in 0.0210 seconds

dml

scan
Purpose: scan a table. Example:

    # Scan the meta table in namespace hbase and show all of its data
    hbase> scan 'hbase:meta'

    # Scan only column regioninfo of column family info in the meta table
    hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}

    # Scan column families c1 and c2 of table t1 in namespace ns1 and show all of their data
    hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2']}

    # Same scan, but show only the first 10 row keys
    hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10}

    # Same scan, but show only the first 10 row keys starting from row key 'xyz'
    hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}

    # Scan column family c1 of table t1 in the default namespace for cells whose
    # timestamp is in the range from 1303668804 (inclusive) to 1303668904 (exclusive)
    hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
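A TIMERANGE selects cells whose timestamp lies in a half-open interval: the lower bound is included and the upper bound is excluded. The plain-Ruby sketch below mimics that selection on a few made-up cell versions; the data and the `in_time_range` helper are hypothetical and no HBase connection is involved.

```ruby
# Hypothetical versions of one column: timestamp => value.
cells = {
  1303668803 => 'a',
  1303668804 => 'b',
  1303668850 => 'c',
  1303668904 => 'd'
}

# Keep only the cells whose timestamp is in [min_ts, max_ts).
def in_time_range(cells, min_ts, max_ts)
  cells.select { |ts, _value| ts >= min_ts && ts < max_ts }
end

selected = in_time_range(cells, 1303668804, 1303668904)
puts selected.values.inspect
```

To include a cell written exactly at the upper bound, the upper bound has to be raised by one.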

    # Show the data of table t1 in reverse row order
    hbase> scan 't1', {REVERSED => true}

    # Show the data of table t1 through a filter
    hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}

    # With RAW set to true, show all data of table t1, including deleted cells
    hbase> scan 't1', {RAW => true, VERSIONS => 10}
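The RAW option makes more sense once deletes are seen as markers rather than immediate removals: a delete writes a tombstone, and the masked cells stay on disk until a major compaction removes them. The plain-Ruby sketch below is a simplified, hypothetical model of a single column (the `Entry` struct and both helpers are made up for illustration) that contrasts a normal read with a raw read.

```ruby
# Hypothetical log of one column of one row; a delete only adds a marker.
Entry = Struct.new(:timestamp, :type, :value)

log = [
  Entry.new(1, :put, 'v1'),
  Entry.new(2, :put, 'v2'),
  Entry.new(3, :delete, nil), # tombstone hiding every version with timestamp <= 3
  Entry.new(4, :put, 'v4')
]

# Normal read: drop tombstones and the puts they mask, newest first.
def visible(log)
  newest_delete = log.select { |e| e.type == :delete }.map(&:timestamp).max
  live = log.select { |e| e.type == :put }
  live = live.reject { |e| e.timestamp <= newest_delete } if newest_delete
  live.sort_by { |e| -e.timestamp }
end

# Raw read: every entry, tombstones included, newest first.
def raw(log)
  log.sort_by { |e| -e.timestamp }
end

puts visible(log).map(&:value).inspect
puts raw(log).map { |e| [e.timestamp, e.type] }.inspect
```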

    # Scan through a reference to table t1
    hbase> t11 = get_table 't1'
    hbase> t11.scan

append
Purpose: append a value to the end of the existing value of a cell. Example:

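    # Append the string 'value' to the value of column c1 in row r1 of table t1
    hbase> append 't1', 'r1', 'c1', 'value'

    # The same append through t11, a reference to table t1
    hbase> t11.append 'r1', 'c1', 'value'

count
Purpose: count the rows of a table. Example:

    # Count the rows of table t1
    hbase> count 't1'

    # Count the rows of table t1 with the following parameters:
    # INTERVAL sets how many rows pass between progress reports (each report shows the
    # current count and row key); the default is 1000.
    # CACHE sets the scan cache size used per fetch; the default is 10, and raising it can speed up the count.
    # For example, count the rows of t1, reporting every 10 rows with a cache of 1000:
    hbase> count 't1', INTERVAL => 10, CACHE => 1000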
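The effect of INTERVAL can be pictured with a small plain-Ruby sketch: rows are counted one by one, and a progress entry is recorded every `interval` rows together with the row key reached. The `count_rows` helper, the generated row keys, and the wording of the printed lines are all illustrative assumptions; CACHE is not modelled because it only affects how many rows are fetched per round trip, not the result.

```ruby
# Count rows and collect a progress report every `interval` rows.
def count_rows(row_keys, interval: 1000)
  progress = []
  count = 0
  row_keys.each do |key|
    count += 1
    progress << [count, key] if (count % interval).zero?
  end
  [count, progress]
end

# 25 made-up row keys: row001 .. row025
rows = (1..25).map { |i| format('row%03d', i) }

total, progress = count_rows(rows, interval: 10)
progress.each { |n, key| puts "Current count: #{n}, row: #{key}" }
puts "#{total} row(s)"
```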

    # The same through a table reference object
    hbase> t.count
    hbase> t.count INTERVAL => 100000
    hbase> t.count CACHE => 1000
    hbase> t.count INTERVAL => 10, CACHE => 1000

delete
Purpose: delete a cell from a table. Example:

    # Delete the cell at row r1, column c1, timestamp ts1 in table t1 of namespace ns1
    hbase> delete 'ns1:t1', 'r1', 'c1', ts1

    # Delete the cell at row r1, column c1, timestamp ts1 in table t1 of the default namespace
    hbase> delete 't1', 'r1', 'c1', ts1

    # The same through a table reference object
    hbase> t.delete 'r1', 'c1', ts1

deleteall
Purpose: delete several cells at once (a whole row, or all cells of one column in a row). Example:

    # Delete all data of row r1 in table t1 of namespace ns1
    hbase> deleteall 'ns1:t1', 'r1'

    # Delete all data of row r1 in table t1 of the default namespace
    hbase> deleteall 't1', 'r1'

    # Delete all data of column c1 in row r1 of table t1 in the default namespace
    hbase> deleteall 't1', 'r1', 'c1'

    # Delete column c1 in row r1 of table t1 in the default namespace, using timestamp ts1
    hbase> deleteall 't1', 'r1', 'c1', ts1

    # The same through a table reference object
    hbase> t.deleteall 'r1'
    hbase> t.deleteall 'r1', 'c1'
    hbase> t.deleteall 'r1', 'c1', ts1

get
Purpose: get the contents of a row or of a cell. Example:

    # Get the data of row r1 in table t1 of namespace ns1
    hbase> get 'ns1:t1', 'r1'

    # Get the data of row r1 in table t1 of the default namespace
    hbase> get 't1', 'r1'

    # Get the data of row r1 in table t1 whose timestamp is between ts1 and ts2
    hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}

    # Get column c1 of row r1 in table t1
    hbase> get 't1', 'r1', {COLUMN => 'c1'}

    # Get columns c1, c2, and c3 of row r1 in table t1
    hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}

    # Get column c1 of row r1 in table t1 at timestamp ts1
    hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}

    # Get up to 4 versions of column c1 of row r1 in table t1 within the time range ts1 to ts2
    hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
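The VERSIONS and TIMERANGE options are easier to follow with a mental model of a cell as a set of timestamped versions, of which a read returns the newest ones. The plain-Ruby sketch below implements such a model in memory. The `VersionedCell` class is hypothetical and deliberately simplified: it ignores the per-column-family VERSIONS limit that decides how many versions HBase actually retains.

```ruby
# Hypothetical in-memory model of one cell: every put adds a timestamped version.
class VersionedCell
  def initialize
    @versions = {} # timestamp => value
  end

  def put(value, timestamp)
    @versions[timestamp] = value
  end

  # Newest first, limited to `versions` entries, optionally within [min_ts, max_ts).
  def get(versions: 1, time_range: nil)
    hits = @versions.sort_by { |ts, _value| -ts }
    if time_range
      min_ts, max_ts = time_range
      hits = hits.select { |ts, _value| ts >= min_ts && ts < max_ts }
    end
    hits.first(versions)
  end
end

cell = VersionedCell.new
(1..6).each { |ts| cell.put("v#{ts}", ts) }

p cell.get
p cell.get(versions: 4)
p cell.get(versions: 4, time_range: [2, 5])
```

The last call returns fewer than 4 versions because only three versions fall inside the requested time range; VERSIONS is an upper bound, not a guarantee.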

    # The same through a table reference object
    hbase> t.get 'r1'
    hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
    hbase> t.get 'r1', {COLUMN => 'c1'}
    hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
    hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
    hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
    hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}

put
Purpose: write a cell. Example:

    # Write a value to column c1 of row r1 in table t1 of namespace ns1
    hbase> put 'ns1:t1', 'r1', 'c1', 'value'

    # Write a value to column c1 of row r1 in table t1 of the default namespace
    hbase> put 't1', 'r1', 'c1', 'value'

    # Same, with the timestamp set to ts1
    hbase> put 't1', 'r1', 'c1', 'value', ts1

    # Same, with the timestamp set to ts1 and additional attributes
    hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

    # The same through a table reference object
    hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

truncate
Purpose: remove all data from a table. The command disables, drops, and recreates the table itself, so no manual disable is needed. Example:

    # Empty table t3 without disabling it first
    hbase> truncate 't3'

References:
http://blog.csdn.net/woshiwanxin102213/article/details/17611457
split的三种方式 (Three ways to split)
