HBase

Notes on Common HBase Shell Commands

Category: Big Data; posted by 空心菜 on 2016-09-13 00:19

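1. Enter the HBase shell console
$HBASE_HOME/bin/hbase shell
If Kerberos authentication is enabled, first authenticate with the appropriate keytab (using the kinit command), then start the HBase shell. Once inside, the whoami command shows the current user.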

hbase(main):002:0> whoami

2016-09-12 13:09:42,440 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

root (auth:SIMPLE)

    groups: root

2. Table management
1) List the tables

hbase(main):001:0> list

TABLE                                                                                                              

pythonTrace                                                                                   

1 row(s) in 0.1320 seconds

2) Create a table
Syntax: create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}
For example, create table t1 with two column families, f1 and f2, each keeping 2 versions:

hbase(main):001:0> create 't1',{NAME => 'f1', VERSIONS => 2},{NAME => 'f2', VERSIONS => 2}

0 row(s) in 0.4400 seconds



=> Hbase::Table - t1

3) Drop a table
This takes two steps: disable the table first, then drop it. For example, drop table t1:

hbase(main):002:0> disable 't1'

0 row(s) in 1.2030 seconds



hbase(main):003:0> drop 't1'

0 row(s) in 0.1870 seconds

4) View a table's structure
Syntax: describe <table>
For example, view the structure of table t1:

hbase(main):005:0> describe 't1'

Table t1 is ENABLED                                                                                                

t1                                                                                                                 

COLUMN FAMILIES DESCRIPTION                                                                                        

{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '2', COMP

RESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_M

EMORY => 'false', BLOCKCACHE => 'true'}                                                                            

{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '2', COMP

RESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_M

EMORY => 'false', BLOCKCACHE => 'true'}                                                                            

2 row(s) in 0.0240 seconds

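5) Alter a table's schema
The table must be disabled before its schema is altered.
Syntax: alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
For example, set the TTL of the column families of table t1 to 180 days (15552000 seconds; the column families in this session are named body and meta):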

hbase(main):006:0> disable 't1'

0 row(s) in 1.1950 seconds



hbase(main):007:0> alter 't1',{NAME=>'body',TTL=>'15552000'},{NAME=>'meta', TTL=>'15552000'}

Updating all regions with the new schema...

1/1 regions updated.

Done.

Updating all regions with the new schema...

1/1 regions updated.

Done.

0 row(s) in 2.1910 seconds



hbase(main):008:0> enable 't1'

0 row(s) in 0.3930 seconds
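
The TTL passed to alter is expressed in seconds. As a quick sanity check (a minimal sketch; the helper name days_to_ttl_seconds is made up for illustration and is not part of HBase), 180 days works out to the 15552000 used in the session above:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def days_to_ttl_seconds(days):
    """Convert a retention period in days into the seconds value used for TTL."""
    return days * SECONDS_PER_DAY

print(days_to_ttl_seconds(180))  # 15552000
```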

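3. Permission management
1) Grant permissions
Syntax: grant <user>, <permissions>, <table>, <column family>, <column qualifier>
Arguments are separated by commas.
Permissions are expressed with five letters, "RWXCA": READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
For example, grant user 'test' read and write permission on table t1: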

hbase(main)> grant 'test','RW','t1'

2) View permissions
Syntax: user_permission <table>
For example, list the permissions on table t1:

hbase(main)> user_permission 't1'

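3) Revoke permissions
Similar to granting. Syntax: revoke <user>, <table>, <column family>, <column qualifier>
For example, revoke user test's permissions on table t1: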

hbase(main)> revoke 'test','t1'
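
To make the five permission letters concrete, here is a small sketch that expands a permission string such as 'RW' into the names listed in the grant section. The function describe_permissions is a hypothetical helper for illustration only; it does not call HBase.

```python
# Mapping taken from the "RWXCA" list in the grant section.
PERMISSIONS = {"R": "READ", "W": "WRITE", "X": "EXEC", "C": "CREATE", "A": "ADMIN"}

def describe_permissions(flags):
    """Expand a permission string like 'RW' into a list of permission names."""
    unknown = [flag for flag in flags if flag not in PERMISSIONS]
    if unknown:
        raise ValueError("unknown permission letter(s): %s" % "".join(unknown))
    return [PERMISSIONS[flag] for flag in flags]

print(describe_permissions("RW"))  # ['READ', 'WRITE']
```

Validating the string before pasting it into a grant command is a cheap way to catch typos such as a lowercase letter.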

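4. Creating, reading, updating and deleting table data
1) Add data
Syntax: put <table>, <rowkey>, <family:column>, <value>, <timestamp>
For example, add a row to table t1 with rowkey rowkey001, family name f1, column name col1, value value01, and the system default timestamp: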

hbase(main)> put 't1','rowkey001','f1:col1','value01'

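put has only this one form.
2) Query data
a) Query a single row
Syntax: get <table>, <rowkey>, [<family:column>, ...]
For example, query the value of col1 under f1 in row rowkey001 of table t1: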

hbase(main)> get 't1','rowkey001', 'f1:col1'

# Or:

hbase(main)> get 't1','rowkey001', {COLUMN=>'f1:col1'}

Query all column values in row rowkey001 of table t1:

hbase(main)> get 't1','rowkey001'

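b) Scan a table
Syntax: scan <table>, {COLUMNS => [<family:column>, ...], LIMIT => num}
Advanced options such as STARTROW, TIMERANGE and FILTER can also be added. For example, scan the first 5 rows of table t1: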

hbase(main)> scan 't1',{LIMIT=>5}

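c) Count the rows in a table
Syntax: count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}
INTERVAL sets how many rows are counted between progress lines (each showing the current count and rowkey); the default is 1000. CACHE is the scanner cache size for each fetch; the default is 10, and raising it can speed up the count.
For example, count the rows in table t1, reporting every 100 rows with a cache of 500: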

hbase(main)> count 't1', {INTERVAL => 100, CACHE => 500}

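3) Delete data
a) Delete a column value from a row
Syntax: delete <table>, <rowkey>, <family:column>, <timestamp>
The column name must be specified.
For example, delete the data in f1:col1 of row rowkey001 in table t1: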

hbase(main)> delete 't1','rowkey001','f1:col1'

Note: this deletes all versions of the f1:col1 column in that row.

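b) Delete a row
Syntax: deleteall <table>, <rowkey>, <family:column>, <timestamp>
The column name may be omitted, in which case the whole row is deleted.
For example, delete row rowkey001 of table t1: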

hbase(main)> deleteall 't1','rowkey001'

c) Delete all data in a table
Syntax: truncate <table>
Internally this runs disable table -> drop table -> create table. For example, delete all data in table t1:

hbase(main)> truncate 't1'

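5. Region management
1) Move a region
Syntax: move 'encodeRegionName', 'ServerName'
encodeRegionName is the encoded suffix at the end of the region name; ServerName is an entry from the Region Servers list on the master-status page.
For example: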

hbase(main)>move '4343995a58be8e5bbc739af1e91cd72d', 'db-41.xxx.xxx.org,60020,1390274516739'

2) Enable/disable the balancer
Syntax: balance_switch true|false

hbase(main)> balance_switch true

3) Manual split
Syntax: split 'regionName', 'splitKey'
4) Manually trigger a major compaction
# Syntax:

#Compact all regions in a table:

hbase> major_compact 't1'

#Compact an entire region:

hbase> major_compact 'r1'

#Compact a single column family within a region:

hbase> major_compact 'r1', 'c1'

#Compact a single column family within a table:

hbase> major_compact 't1', 'c1'

6. Configuration management and node restarts
1) Change the HDFS configuration
HDFS configuration location: /etc/hadoop/conf
# Sync the HDFS configuration

cat /home/hadoop/slaves|xargs -i -t scp /etc/hadoop/conf/hdfs-site.xml hadoop@{}:/etc/hadoop/conf/hdfs-site.xml

# Stop:

cat /home/hadoop/slaves|xargs -i -t ssh hadoop@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode"

# Start:

cat /home/hadoop/slaves|xargs -i -t ssh hadoop@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"

2) Change the HBase configuration
HBase configuration location:
# Sync the HBase configuration

cat /home/hadoop/hbase/conf/regionservers|xargs -i -t scp /home/hadoop/hbase/conf/hbase-site.xml hadoop@{}:/home/hadoop/hbase/conf/hbase-site.xml

# Graceful restart

cd ~/hbase

bin/graceful_stop.sh --restart --reload --debug inspurXXX.xxx.xxx.org
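
The xargs/scp one-liners above push one file to every host listed in a file. The same idea can be sketched in Python in a way that only builds the commands, so they can be reviewed before anything is run. The function name build_scp_commands and the example hostnames are assumptions made for illustration; the hadoop user and the hbase-site.xml path come from the commands above.

```python
import shlex

def build_scp_commands(hosts, local_path, remote_path, user="hadoop"):
    """Return one scp argument list per host, skipping blank lines and comments."""
    commands = []
    for host in hosts:
        host = host.strip()
        if not host or host.startswith("#"):
            continue
        commands.append(["scp", local_path, "%s@%s:%s" % (user, host, remote_path)])
    return commands

# Hypothetical contents of a regionservers file.
hosts = ["rs1.example.org", "", "# decommissioned", "rs2.example.org"]
path = "/home/hadoop/hbase/conf/hbase-site.xml"
for cmd in build_scp_commands(hosts, path, path):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Each argument list could then be handed to subprocess.run once the printed commands look right.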

Notes on Hadoop Operations Experience

Category: Big Data; posted by chris on 2016-04-07 00:57

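Hadoop at 蓝汛 (ChinaCache)

System architecture:

Cloudera and its products

The relationship between Apache Hadoop and CDH versions

Why is CDH better?

Easier installation and upgrades:

Four installation methods: yum, tar, rpm, Cloudera Manager

Faster access to new features and bug fixes

Yearly releases, quarterly updates

Yum installation automatically matches compatible ecosystem versions

Automatic directory setup (logs, conf) and creation of the hdfs and mapred users

Detailed documentation

Major improvements in CDH3u3
Major improvements in CDH3u4
Cloudera Manager

Cloudera Training

About the training

Two courses: Administrator and Development

About the certification exam

About the certificate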

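Operations incidents

1. Painful memory problems

Symptom 1

On the second day after the system went live, the JobTracker stopped working and its web page would not open.

Cause

Too many jobs were submitted at once, and the JobTracker ran out of memory.

Solution

Increase the JobTracker heap; limit the number of running jobs.

Symptom 2

The NameNode ran out of memory; after a restart, the 50030 page showed that the fsimage was corrupted, and investigation found that the SecondaryNameNode's fsimage was corrupted as well.

Cause

Too many small files caused the NN/SNN to run out of memory, which corrupted the fsimage file, although the restarted NN could still serve requests normally.

Solution

Asked the Cloudera Google Group for help and obtained a backdoor script.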

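2. Inefficient MapReduce jobs

Symptom

MapReduce jobs took too long to run.

Cause

The MR code used Spring; small files made the map phase inefficient; reading and writing GZ files was slow.

Solution

Remove Spring from the MR code; enable JVM reuse; use LZO for the input and for the map output; increase the number of parallel copy threads in reduce.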

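Compression and MapReduce performance

Premise: a large number of small files

Input of 147 GB in 45047 files, about 3 MB each on average

8 CPU cores; 32 GB memory; 7200 rpm disks; 28 slave machines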
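
The figures above show why small files hurt. The sketch below redoes the arithmetic under two assumptions that are not stated in the slides: a 64 MB HDFS block size, and the default behaviour in which every file smaller than a block becomes its own input split (and therefore its own map task).

```python
input_gb = 147
file_count = 45047
block_mb = 64  # assumed HDFS block size; not given in the slides

total_mb = input_gb * 1024
avg_file_mb = total_mb / file_count

# Assumption: one split (one map task) per small file.
maps_with_small_files = file_count
# If the same data were stored in block-sized files instead (ceiling division).
maps_if_block_sized = -(-total_mb // block_mb)

print("average file size: %.1f MB" % avg_file_mb)
print("map tasks, one per small file: %d" % maps_with_small_files)
print("map tasks if files were block-sized: %d" % maps_if_block_sized)
```

Under these assumptions the job launches roughly 19 times as many map tasks as the data volume alone would require, which is consistent with the JVM reuse and file-format changes listed in the solution.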

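3. OMG, the whole cluster went down

Symptom

In the morning all DataNodes were found dead; within 10 minutes of a restart the DNs died again one after another. Investigation found about 8% packet loss on the nodes.

Cause

A faulty switch module; the DNs could not cope with the large number of small files.

Solution

Upgrade from CDH3u2 to CDH3u4; set the DN heap to 2 GB.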

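What to do when you hit a problem you cannot get past

Join the official Hadoop mailing list

Join the Cloudera Google Group

Monitoring and alerting

Monitoring: Ganglia

Device and service alerts: Nagios

Business alerts: implemented in-house

Nagios alerts:

Business monitoring: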

Question: stopping an HBase regionserver with ./bin/graceful_stop.sh had1 fails

Category: Big Data; replied to by 空心菜 on 2016-03-02 21:40

Question: HBase RegionServer node fails to start

Category: Big Data; replied to by 空心菜 on 2016-02-27 16:21

Introducing Starbase, a Python API Module for HBase

Category: Programming; posted by chris on 2016-02-20 16:14

The following guest post is provided by Artur Barseghyan, a web developer currently employed by Goldmund, Wyldebeast & Wunderliebe in The Netherlands.

Python is my personal (and primary) programming language of choice and also happens to be the primary programming language at my company. So, when starting to work with a new technology, I prefer to use a clean and easy (Pythonic!) API.

After studying tons of articles on the web, reading (and writing) white papers, and doing basic performance tests (sometimes hard if you’re on a tight schedule), my company recently selected Cloudera for our Big Data platform (including using Apache HBase as our data store for Apache Hadoop), with Cloudera Manager serving a role as “one console to rule them all.”

However, I was surprised shortly thereafter to learn about the absence of a working Python wrapper around the REST API for HBase (aka Stargate). I decided to write one in my free time, and the result, ladies and gentlemen, was Starbase (GPL).

In this post, I will provide some code samples and briefly explain what work has been done on Starbase. I assume that the reader of this post already has some basic understanding of HBase (that is, of tables, column families, qualifiers, and so on).

1. Installation

Next, I’ll show you some frequently used commands and use cases. But first, install the current version of Starbase from CheeseShop (PyPi).

# pip install starbase

Import the module:

>>> from starbase import Connection

…and create a connection instance. Starbase defaults to 127.0.0.1:8000; if your settings are different, specify them here.

>>> c = Connection()

2. API usage examples

2.1 List all tables
Assuming there are two existing tables named table1 and table2, the following will be printed.

>>> c.tables()

['table1', 'table2']

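2.2 Table schema operations
Whenever you need to operate on a table, you first need to create a table instance.
Create a table instance (note that no table is created at this step):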

>>> t = c.table('table3')

Create a new table:
Create a table with columns 'column1', 'column2', 'column3' (here the table is actually created):

>>> t.create('column1', 'column2', 'column3')

201

Check whether the table exists:

>>> t.exists()

True

Show the table's columns:

>>> t.columns()

['column1', 'column2', 'column3']

Add columns to the table ('column4', 'column5', 'column6', 'column7'):

>>> t.add_columns('column4', 'column5', 'column6', 'column7')

200

Drop columns ('column6', 'column7'):

>>> t.drop_columns('column6', 'column7')

201

Drop the entire table:

>>> t.drop()

200

2.3 Table data operations
Insert data into a single row:

>>> t.insert(

>>>     'my-key-1',

>>>     {

>>>         'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

>>>         'column2': {'key21': 'value 21', 'key22': 'value 22'},

>>>         'column3': {'key31': 'value 31', 'key32': 'value 32'}

>>>     }

>>> )

200

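Note that you may also use the "native" way of naming columns and cells (qualifiers). The result of the following is equal to the result of the previous example.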

>>> t.insert(

>>>     'my-key-1a',

>>>     {

>>>         'column1:key11': 'value 11', 'column1:key12': 'value 12', 'column1:key13': 'value 13',

>>>         'column2:key21': 'value 21', 'column2:key22': 'value 22',

>>>         'column3:key31': 'value 31', 'column3:key32': 'value 32'

>>>     }

>>> )

200

Update a row:

>>> t.update(

>>>     'my-key-1',

>>>     {'column4': {'key41': 'value 41', 'key42': 'value 42'}}

>>> )

200

Remove a row cell (qualifier):

>>> t.remove('my-key-1', 'column4', 'key41')

200

Remove a row column (column family):

>>> t.remove('my-key-1', 'column4')

200

Remove an entire row:

>>> t.remove('my-key-1')

200

Fetch a single row with all columns:

>>> t.fetch('my-key-1')

  {

      'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

      'column2': {'key21': 'value 21', 'key22': 'value 22'},

      'column3': {'key31': 'value 31', 'key32': 'value 32'}

  }

Fetch a single row with selected columns (limit to the 'column1' and 'column2' columns):

>>> t.fetch('my-key-1', ['column1', 'column2'])

  {

      'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

      'column2': {'key21': 'value 21', 'key22': 'value 22'},

  }

Narrow the result set even more (limit to cells 'key11' and 'key13' of column 'column1' and cell 'key32' of column 'column3'):

>>> t.fetch('my-key-1', {'column1': ['key11', 'key13'], 'column3': ['key32']})

  {

      'column1': {'key11': 'value 11', 'key13': 'value 13'},

      'column3': {'key32': 'value 32'}

  }

Note that you may also use the native means of naming the columns and cells (qualifiers). The example below does exactly the same thing as the example above.

>>>  t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'])

  {

      'column1': {'key11': 'value 11', 'key13': 'value 13'},

      'column3': {'key32': 'value 32'}

  }

If you set the perfect_dict argument to False, you’ll get the native data structure:

>>>  t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'], perfect_dict=False)

{

    'column1:key11': 'value 11', 'column1:key13': 'value 13',

    'column3:key32': 'value 32'

}
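
The two result shapes shown above (nested by column family versus flat 'family:qualifier' keys) carry the same information. The sketch below converts between them in plain Python; to_nested and to_flat are hypothetical helpers written for illustration and are not part of the Starbase API.

```python
def to_nested(flat):
    """Turn {'cf:qualifier': value} into {'cf': {'qualifier': value}}."""
    nested = {}
    for name, value in flat.items():
        family, _, qualifier = name.partition(":")
        nested.setdefault(family, {})[qualifier] = value
    return nested

def to_flat(nested):
    """Turn {'cf': {'qualifier': value}} back into {'cf:qualifier': value}."""
    return {
        "%s:%s" % (family, qualifier): value
        for family, cells in nested.items()
        for qualifier, value in cells.items()
    }

flat = {
    "column1:key11": "value 11",
    "column1:key13": "value 13",
    "column3:key32": "value 32",
}
nested = to_nested(flat)
print(nested)
print(to_flat(nested) == flat)
```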

2.4 Batch operations on table data
Batch operations (insert and update) work similarly to routine insert and update, but are done in a batch. You are advised to operate in batch as much as possible.

In the example below, we will insert 5,000 records in a batch:

>>> data = {

>>>     'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

>>>     'column2': {'key21': 'value 21', 'key22': 'value 22'},

>>> }

>>> b = t.batch()

>>> for i in range(0, 5000):

>>>     b.insert('my-key-%s' % i, data)

>>> b.commit(finalize=True)

{'method': 'PUT', 'response': [200], 'url': 'table3/bXkta2V5LTA='}

In the example below, we will update 5,000 records in a batch:

>>> data = {

>>>     'column3': {'key31': 'value 31', 'key32': 'value 32'},

>>> }

>>> b = t.batch()

>>> for i in range(0, 5000):

>>>     b.update('my-key-%s' % i, data)

>>> b.commit(finalize=True)

{'method': 'POST', 'response': [200], 'url': 'table3/bXkta2V5LTA='}
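
The url field in the two sample responses ends in bXkta2V5LTA=, which looks like the Base64 encoding of the first row key, my-key-0. This is an observation about the sample output rather than a statement about Starbase internals; the helper names below are made up for illustration, and only the standard library is used.

```python
import base64

def encode_row_key(row_key):
    """Base64-encode a row key the way it appears in the sample URL."""
    return base64.b64encode(row_key.encode("utf-8")).decode("ascii")

def decode_row_key(encoded):
    """Reverse of encode_row_key."""
    return base64.b64decode(encoded).decode("utf-8")

print(encode_row_key("my-key-0"))      # bXkta2V5LTA=
print(decode_row_key("bXkta2V5LTA="))  # my-key-0
```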

Note: The table batch method accepts an optional size argument (int). If set, an auto-commit is fired each time the stack is full.
2.5 Searching table data (row scanning)
A table scanning feature is in development. At the moment it's only possible to fetch all rows from a table (a full table scan); range scans over row keys are not yet supported. The result set returned is a generator.

>>> t.fetch_all_rows()

That is all for this introduction. I did not have time to translate the rest, but the English is quite simple!

Sharing a Python Script for Accessing HBase Data

Category: Programming; posted by chris on 2016-02-20 15:33

#!/usr/bin/python

 

import getopt,sys,time

from thrift.transport.TSocket import TSocket

from thrift.transport.TTransport import TBufferedTransport

from thrift.protocol import TBinaryProtocol

from hbase import Hbase

 

def usage():

        print '''Usage :

        -h: Show help information;

        -l: Show all tables in HBase;

        -t {table} Show table descriptors;

        -t {table} -k {key} : show cell;

        -t {table} -k {key} -c {column} : Show the column;

        -t {table} -k {key} -c {column} -v {versions} : Show more versions;

    (write by liuhuorong@koudai.com)

        '''

 

class geilihbase:

        def __init__(self):

                self.transport = TBufferedTransport(TSocket("127.0.0.1", "9090"))

                self.transport.open()

                self.protocol = TBinaryProtocol.TBinaryProtocol(self.transport)

                self.client = Hbase.Client(self.protocol)

        def __del__(self):

                self.transport.close()

        def glisttable(self):

                for table in self.client.getTableNames():

                        print table

        def ggetColumnDescriptors(self,table):

                rarr=self.client.getColumnDescriptors(table)

                if rarr:

                        for (k,v) in rarr.items():

                                print "%-20s\t%s" % (k,v)

        def gget(self,table,key,coulmn):

                rarr=self.client.get(table,key,coulmn)

                if rarr:

                        print "%-15s %-20s\t%s" % (rarr[0].timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(rarr[0].timestamp/1000)),rarr[0].value)

        def ggetrow(self,table,key):

                rarr=self.client.getRow(table, key)

                if rarr:

                        for (k,v) in rarr[0].columns.items():

                                print "%-20s\t%-15s %-20s\t%s" % (k,v.timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(v.timestamp/1000)),v.value)

        def ggetver(self, table, key, coulmn, versions):

                rarr=self.client.getVer(table,key,coulmn, versions);

                if rarr:

                        for row in rarr:

                                print "%-15s %-20s\t%s" % (row.timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(row.timestamp/1000)),row.value)

 

def main(argv):

        tablename=""

        key=""

        coulmn=""

        versions=""

        try:

                opts, args = getopt.getopt(argv, "lht:k:c:v:", ["help","list"])

        except getopt.GetoptError:

                usage()

                sys.exit(2)

        for opt, arg in opts:

                if opt in ("-h", "--help"):

                        usage()

                        sys.exit(0)

                elif opt in ("-l", "--list"):

                        ghbase=geilihbase()

                        ghbase.glisttable()

                        sys.exit(0)

                elif opt == '-t':

                        tablename = arg

                elif opt == '-k':

                        key = arg

                elif opt == '-c':

                        coulmn = arg

                elif opt == '-v':

                        versions = int(arg)

        if ( tablename and key and coulmn and versions ):

                ghbase=geilihbase()

                ghbase.ggetver(tablename, key, coulmn, versions)

                sys.exit(0)

        if (tablename and key and coulmn ):

                ghbase=geilihbase()

                ghbase.gget(tablename, key, coulmn)

                sys.exit(0)

        if (tablename and key ):

                ghbase=geilihbase()

                ghbase.ggetrow(tablename, key)

                sys.exit(0)

        if (tablename ):

                ghbase=geilihbase()

                ghbase.ggetColumnDescriptors(tablename)

                sys.exit(0)

        usage()

        sys.exit(1)

 

if __name__ == "__main__":

        main(sys.argv[1:])

Question: HBase HRegionServer fails to start

Category: Big Data; replied to by koyo on 2015-12-15 14:33

Answer by 空心菜, 2015-12-15 14:32:

Your problem is clear from the first line of the log:
2015-12-15 05:32:50,674 FATAL regionserver.HRegionServer: Master rejected startup because clock is out of sync
In other words, the Master refused to let the RegionServer start because their clocks are out of sync. The time on your HBase HRegionServer node differs from the time on your HMaster node. To fix it:
# yum -y install ntpdate && chkconfig ntpdate off
# crontab -e
#add sync time cron scripts
*/2 * * * * ntpdate asia.pool.ntp.org
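
The check behind this error can be sketched as a comparison of two timestamps against a tolerance. The 30000 ms default below is an assumption based on the hbase.master.maxclockskew setting; verify the value for your HBase version. The function name clock_skew_ok is made up for illustration.

```python
DEFAULT_MAX_SKEW_MS = 30000  # assumed default of hbase.master.maxclockskew

def clock_skew_ok(master_ms, regionserver_ms, max_skew_ms=DEFAULT_MAX_SKEW_MS):
    """Return True when the two clocks differ by no more than max_skew_ms."""
    return abs(master_ms - regionserver_ms) <= max_skew_ms

master_now = 1450157570674                       # example epoch milliseconds
print(clock_skew_ok(master_now, master_now + 5000))    # True: 5 s apart
print(clock_skew_ok(master_now, master_now - 120000))  # False: 2 min apart
```

Running ntpdate from cron, as in the answer, keeps the difference well inside such a tolerance.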

CDH Hadoop + HBase HA Deployment in Detail

Category: Big Data; posted by 空心菜 on 2016-11-07 21:07

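Deploying CDH is no different from deploying Apache Hadoop. This article focuses on the HA deployment; note in particular that NameNode HA depends on ZooKeeper.

Preparation

Hosts file configuration: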

cat > /etc/hosts << _HOSTS_

127.0.0.1          localhost

10.0.2.59          cdh-m1

10.0.2.60          cdh-m2

10.0.2.61          cdh-s1

_HOSTS_

Services on each node

cdh-m1 Zookeeper JournalNode NameNode DFSZKFailoverController HMaster

cdh-m2 Zookeeper JournalNode NameNode DFSZKFailoverController HMaster

cdh-s1 Zookeeper JournalNode DataNode HRegionServer

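A few notes on the new services:

JournalNode synchronizes the NameNode metadata. Like ZooKeeper, it should be deployed on 2N+1 nodes; the cluster stays available as long as a majority (N+1) of them are alive.
DFSZKFailoverController (ZKFC) handles active/standby failover, playing a role similar to Keepalived.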
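
The majority rule for an ensemble of JournalNodes or ZooKeeper servers can be written down in two lines. This is a minimal sketch of the arithmetic only; the function names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    """How many nodes may be down while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)

for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A 4-node ensemble tolerates no more failures than a 3-node one, which is why odd sizes (2N+1) are the usual recommendation. The three-node layout in this article tolerates one failed node.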

NTP service

Set the time zone

rm -f /etc/localtime

ln -s  /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Configure the NTP server

yum install -y ntp

cat > /etc/ntp.conf << _NTP_

driftfile /var/lib/ntp/drift



restrict default nomodify

restrict -6 default nomodify



server cn.ntp.org.cn prefer

server news.neu.edu.cn iburst

server dns.sjtu.edu.cn iburst

server 127.127.1.1 iburst



tinker dispersion 100

tinker step 1800

tinker stepout 3600

includefile /etc/ntp/crypto/pw



keys /etc/ntp/keys

_NTP_



# Sync immediately when NTP starts

cat >> /etc/ntp/step-tickers << _NTP_

server cn.ntp.org.cn prefer

server news.neu.edu.cn iburst

server dns.sjtu.edu.cn iburst

_NTP_



# Sync the hardware clock

cat >> /etc/sysconfig/ntpd << _NTPHW_

SYNC_HWCLOCK=yes

_NTPHW_

Start the service and enable it at boot

/etc/init.d/ntpd start

chkconfig ntpd on

Configure the NTP client

yum install -y ntp

# Remember to change this to the address of your internal NTP server

cat > /etc/ntp.conf << _NTP_

driftfile /var/lib/ntp/drift



restrict default nomodify

restrict -6 default nomodify



restrict 127.0.0.1

restrict -6 ::1



server 10.0.2.59 prefer



tinker dispersion 100

tinker step 1800

tinker stepout 3600

includefile /etc/ntp/crypto/pw



keys /etc/ntp/keys

_NTP_



# Sync immediately when NTP starts

cat >> /etc/ntp/step-tickers << _NTP_

server 10.0.2.59 prefer

_NTP_



# Sync the hardware clock

cat >> /etc/sysconfig/ntpd << _NTPHW_

SYNC_HWCLOCK=yes

_NTPHW_

Start the service and enable it at boot

/etc/init.d/ntpd start

chkconfig ntpd on

Check NTP synchronization

ntpq -p



# Result

     remote           refid      st t when poll reach   delay   offset  jitter

==============================================================================

*time7.aliyun.co 10.137.38.86     2 u   17   64    3   44.995    5.178   0.177

 news.neu.edu.cn .INIT.          16 u    -   64    0    0.000    0.000   0.000

 202.120.2.90    .INIT.          16 u    -   64    0    0.000    0.000   0.000
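
In the ntpq -p output above, the leading * marks the peer currently selected for synchronization, while stratum 16 with reach 0 means a peer has not been reached yet. The sketch below parses that sample into dictionaries so the offset can be checked by a script. parse_ntpq is a hypothetical helper; the column positions are assumed from the sample shown here.

```python
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*time7.aliyun.co 10.137.38.86     2 u   17   64    3   44.995    5.178   0.177
 news.neu.edu.cn .INIT.          16 u    -   64    0    0.000    0.000   0.000
 202.120.2.90    .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""

def parse_ntpq(output):
    """Parse `ntpq -p` text into a list of per-peer dictionaries."""
    peers = []
    for line in output.splitlines():
        # Skip blank lines, the header and the ==== separator.
        if not line.strip() or line.startswith("=") or "refid" in line:
            continue
        tally, fields = line[0], line[1:].split()
        peers.append({
            "selected": tally == "*",
            "remote": fields[0],
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),   # reach is printed in octal
            "offset_ms": float(fields[8]),
        })
    return peers

for peer in parse_ntpq(SAMPLE):
    print(peer)
```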

JDK configuration

Create the directories

mkdir -p /data/{install,app,logs,pid,appData}

mkdir /data/appData/tmp



cd /data/install

wget -c http://oracle.com/jdk-7u51-linux-x64.gz

tar xf jdk-7u51-linux-x64.gz -C /data/app

cd /data/app

ln -s jdk1.7.0_51 jdk1.7

cat >> /etc/profile << _PATH_

export JAVA_HOME=/data/app/jdk1.7

export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar

export PATH=\$JAVA_HOME/bin:\$PATH

_PATH_

source /etc/profile

Create the service account

useradd -u 600 run

Download the packages

# http://archive.cloudera.com/cdh5/cdh/5/

cd /data/install

wget -c http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.4.5.tar.gz

wget -c http://archive.apache.org/dist/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz

wget -c http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.4.5.tar.gz

Install ZooKeeper

cd /data/install

tar xf zookeeper-3.4.5.tar.gz -C /data/app

cd /data/app

ln -s zookeeper-3.4.5 zookeeper

Set environment variables

sed -i '/^export PATH=/i\export ZOOKEEPER_HOME=/data/app/zookeeper' /etc/profile

sed -i 's#export PATH=#&\$ZOOKEEPER_HOME/bin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $ZOOKEEPER_HOME

rm -rf *xml *txt zookeeper-3.4.5.jar.* src recipes docs dist-maven contrib

rm -f $ZOOKEEPER_HOME/bin/*.cmd $ZOOKEEPER_HOME/bin/*.txt

rm -f $ZOOKEEPER_HOME/conf/zoo_sample.cfg

Create the data directories

mkdir -p /data/appData/zookeeper/{data,logs}

Configuration

cat > $ZOOKEEPER_HOME/conf/zoo.cfg << _ZOO_

tickTime=2000

initLimit=10

syncLimit=5

clientPort=2181

dataDir=/data/appData/zookeeper/data

dataLogDir=/data/appData/zookeeper/logs

server.1=cdh-m1:2888:3888

server.2=cdh-m2:2888:3888

server.3=cdh-s1:2888:3888

_ZOO_

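Change how ZooKeeper logs and where it writes its logs by editing

$ZOOKEEPER_HOME/bin/zkEnv.sh

and adding two variables after line 27

ZOO_LOG_DIR=/data/logs/zookeeper

ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

Create the myid file

# The myid must match this host's server.N entry in the configuration file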

echo 1 >/data/appData/zookeeper/data/myid
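
Because the myid value has to agree with the server.N lines in zoo.cfg, it can be derived from the configuration instead of typed by hand. The sketch below does this for the three hosts used in this article; myid_for_host is a hypothetical helper, and the embedded text is a shortened copy of the zoo.cfg above.

```python
import re

ZOO_CFG = """\
tickTime=2000
clientPort=2181
server.1=cdh-m1:2888:3888
server.2=cdh-m2:2888:3888
server.3=cdh-s1:2888:3888
"""

def myid_for_host(zoo_cfg_text, hostname):
    """Return the N of the server.N entry whose host part equals hostname."""
    for line in zoo_cfg_text.splitlines():
        match = re.match(r"server\.(\d+)=([^:]+):", line.strip())
        if match and match.group(2) == hostname:
            return int(match.group(1))
    raise KeyError("%s is not listed in zoo.cfg" % hostname)

for host in ("cdh-m1", "cdh-m2", "cdh-s1"):
    print(host, myid_for_host(ZOO_CFG, host))
```

On cdh-m2, for example, the value written to the myid file would be 2 rather than the 1 shown above.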

Set directory ownership

chown -R run.run /data/{app,appData,logs}

Start and stop

# Start

runuser - run -c 'zkServer.sh start'

# Stop

runuser - run -c 'zkServer.sh stop'

Install Hadoop

tar xf hadoop-2.6.0-cdh5.4.5.tar.gz -C /data/app

cd /data/app

ln -s hadoop-2.6.0-cdh5.4.5 hadoop

Set environment variables

sed -i '/^export PATH=/i\export HADOOP_HOME=/data/app/hadoop' /etc/profile

sed -i 's#export PATH=#&\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $HADOOP_HOME

rm -rf *txt share/doc src examples* include bin-mapreduce1 cloudera

find . -name "*.cmd"|xargs rm -f

Create the data directories

mkdir -p /data/appData/hdfs/{name,edits,data,jn,tmp}

Configuration

Switch to the configuration directory

cd $HADOOP_HOME/etc/hadoop

Edit core-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HDFS cluster name; a port may be specified -->

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://hdfs-cdh</value>

    </property>



    <!-- Temporary file directory -->

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/data/appData/hdfs/tmp</value>

    </property>



    <!-- Trash setting: 0 disables the trash; 1440 means files are deleted after 1440 minutes -->

    <property>

        <name>fs.trash.interval</name>

        <value>1440</value>

    </property>



    <!-- Buffer size used when reading and writing SequenceFiles, in bytes; default 4096 -->

    <property>

        <name>io.file.buffer.size</name>

        <value>131072</value>

    </property>



    <!-- Available compression codecs; enabled in hdfs-site.xml; the native libraries must be built before they can be used -->

    <property>

        <name>io.compression.codecs</name>

        <value>org.apache.hadoop.io.compress.SnappyCodec</value>

    </property>

</configuration>

Edit hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HDFS cluster (nameservice) name; must match the one in core-site.xml -->

    <property>

        <name>dfs.nameservices</name>

        <value>hdfs-cdh</value>

    </property>



    <!-- ZooKeeper quorum used for NameNode HA. The official docs put this in core-site.xml; it is placed in hdfs-site.xml here for readability, which also works -->

    <property>

        <name>ha.zookeeper.quorum</name>

        <value>cdh-m1:2181,cdh-m2:2181,cdh-s1:2181</value>

    </property>



    <!-- hdfs-cdh has two NameNodes, nn1 and nn2 -->

    <property>

        <name>dfs.ha.namenodes.hdfs-cdh</name>

        <value>nn1,nn2</value>

    </property>



    <!-- RPC address of nn1 -->

    <property>

        <name>dfs.namenode.rpc-address.hdfs-cdh.nn1</name>

        <value>cdh-m1:9000</value>

    </property>



    <!-- HTTP address of nn1 -->

    <property>

        <name>dfs.namenode.http-address.hdfs-cdh.nn1</name>

        <value>cdh-m1:50070</value>

    </property>



    <!-- RPC address of nn2 -->

    <property>

        <name>dfs.namenode.rpc-address.hdfs-cdh.nn2</name>

        <value>cdh-m2:9000</value>

    </property>



    <!-- HTTP address of nn2 -->

    <property>

        <name>dfs.namenode.http-address.hdfs-cdh.nn2</name>

        <value>cdh-m2:50070</value>

    </property>



    <!-- Where the NameNode metadata (shared edits) is stored on the JournalNodes -->

    <property>

        <name>dfs.namenode.shared.edits.dir</name>

        <value>qjournal://cdh-m1:8485;cdh-m2:8485;cdh-s1:8485/hdfs-cdh</value>

    </property>



    <!-- Enable automatic failover when a NameNode fails -->

    <property>

        <name>dfs.ha.automatic-failover.enabled</name>

        <value>true</value>

    </property>



    <!-- Implementation used for active/standby failover -->

    <property>

        <name>dfs.client.failover.proxy.provider.hdfs-cdh</name>

        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

    </property>



    <!-- Fencing methods used during failover, one per line -->

    <property>

        <name>dfs.ha.fencing.methods</name>

        <value>

            sshfence

            shell(/bin/true)

        </value>

    </property>



    <!-- Private key of the run user; the NameNodes need passwordless SSH to each other in both directions for automatic failover -->

    <property>

        <name>dfs.ha.fencing.ssh.private-key-files</name>

        <value>/home/run/.ssh/id_rsa</value>

    </property>



    <!-- Connection timeout for sshfence, in milliseconds -->

    <property>

        <name>dfs.ha.fencing.ssh.connect-timeout</name>

        <value>50000</value>

    </property>



    <!-- Local storage path for NameNode data -->

    <property>

        <name>dfs.namenode.name.dir</name>

        <value>/data/appData/hdfs/name</value>

    </property>



    <!-- Local storage path for DataNode data -->

    <property>

        <name>dfs.datanode.data.dir</name>

        <value>/data/appData/hdfs/data</value>

    </property>



    <!-- Local storage path for JournalNode data -->

    <property>

        <name>dfs.journalnode.edits.dir</name>

        <value>/data/appData/hdfs/jn</value>

    </property>



    <!-- Skip explicit channel flush calls for the edit log; the edit log file is opened for synchronous writes instead -->

    <property>

        <name>dfs.namenode.edits.noeditlogchannelflush</name>

        <value>true</value>

    </property>



    <!-- Local storage path for edits -->

    <property>

        <name>dfs.namenode.edits.dir</name>

        <value>/data/appData/hdfs/edits</value>

    </property>



    <!-- Enable block location metadata so that Impala knows which disk a block is on; disabled by default -->

    <property>

        <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>

        <value>true</value>

    </property>



    <!-- Permission checking; enabled by default (disabled here) -->

    <property>

        <name>dfs.permissions.enabled</name>

        <value>false</value>

    </property>



    <!-- Block size -->

    <property>

        <name>dfs.blocksize</name>

        <value>64m</value>

    </property>

</configuration>
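
The HA settings above are linked by naming convention: every nameservice in dfs.nameservices needs a dfs.ha.namenodes.<nameservice> entry, and every NameNode listed there needs its own rpc-address property. The sketch below checks that convention on a trimmed copy of the configuration. load_properties and missing_ha_keys are hypothetical helpers written for illustration, not Hadoop tools.

```python
import xml.etree.ElementTree as ET

HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>hdfs-cdh</value></property>
  <property><name>dfs.ha.namenodes.hdfs-cdh</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.hdfs-cdh.nn1</name><value>cdh-m1:9000</value></property>
  <property><name>dfs.namenode.rpc-address.hdfs-cdh.nn2</name><value>cdh-m2:9000</value></property>
</configuration>"""

def load_properties(xml_text):
    """Read a Hadoop-style configuration document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): (prop.findtext("value") or "").strip()
        for prop in root.iter("property")
    }

def missing_ha_keys(props):
    """List the HA property names that the naming convention expects but are absent."""
    missing = []
    for ns in filter(None, props.get("dfs.nameservices", "").split(",")):
        nn_key = "dfs.ha.namenodes.%s" % ns
        if nn_key not in props:
            missing.append(nn_key)
            continue
        for nn in props[nn_key].split(","):
            rpc_key = "dfs.namenode.rpc-address.%s.%s" % (ns, nn.strip())
            if rpc_key not in props:
                missing.append(rpc_key)
    return missing

props = load_properties(HDFS_SITE)
print(missing_ha_keys(props))  # []
```

Pointing load_properties at the real hdfs-site.xml text before the first format step can catch a mistyped nameservice or NameNode ID early.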

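With fewer than 5 DataNodes, adding the following settings is recommended

<!-- Number of replicas; must not exceed the number of DataNodes; large clusters should keep the default of 3 -->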

    <property>

        <name>dfs.replication</name>

        <value>2</value>

    </property>



    <!-- Do not allocate a replacement node when writing a replica fails; suitable for small clusters -->

    <property>

        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>

        <value>NEVER</value>

    </property>

Add the following variables to hadoop-env.sh

export JAVA_HOME=/data/app/jdk1.7

export HADOOP_LOG_DIR=/data/logs/hadoop

export HADOOP_PID_DIR=/data/pid

# SSH port; optional

export HADOOP_SSH_OPTS="-p 22"

Heap size, in MB

export HADOOP_HEAPSIZE=1024

Set permissions

chown -R run.run /data/{app,appData,logs}

chmod 777 /data/pid

Formatting

Formatting only needs to be done once. Start ZooKeeper before formatting.

Switch user

su - run

Start all JournalNodes

hadoop-daemon.sh start journalnode

Format ZooKeeper (creates the znode for ZKFC)

hdfs zkfc -formatZK

Format and start the primary NameNode

hdfs namenode -format

hadoop-daemon.sh start namenode

Sync data to the standby NameNode and start it

hdfs namenode -bootstrapStandby

hadoop-daemon.sh start namenode

Start ZKFC

hadoop-daemon.sh start zkfc

Start the DataNodes

hadoop-daemon.sh start datanode

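Starting and stopping

Switch user

su - run

Starting the whole cluster
This requires passwordless SSH keys for the run user and a populated $HADOOP_HOME/etc/hadoop/slaves file

# Start

start-dfs.sh

# Stop

stop-dfs.sh

Starting and stopping individual services
Start HDFS

hadoop-daemon.sh start journalnode

hadoop-daemon.sh start namenode

hadoop-daemon.sh start zkfc

hadoop-daemon.sh start datanode

Stop HDFS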

hadoop-daemon.sh stop datanode

hadoop-daemon.sh stop namenode

hadoop-daemon.sh stop journalnode

hadoop-daemon.sh stop zkfc

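Testing

HDFS HA test
Open the NameNode status pages (the HTTP addresses configured in hdfs-site.xml):
http://cdh-m1:50070
http://cdh-m2:50070

After "Overview" you will see either active or standby; active is the current master. Stop the NameNode on the active host and check that the standby becomes active.

HDFS test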

hadoop fs -mkdir /test

hadoop fs -put /etc/hosts /test

hadoop fs -ls /test

Result:

-rw-r--r--   2 java supergroup         89 2016-06-15 10:30 /test/hosts

# The column after the permissions (2 here) is the replication factor, i.e. the number of replicas of the file.
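
To read the listing programmatically, each line can be split into its eight columns. This is a minimal sketch assuming the column layout of the sample line above; parse_ls_line is a hypothetical helper. Directories show "-" in the replication column, which is mapped to None here.

```python
LS_LINE = "-rw-r--r--   2 java supergroup         89 2016-06-15 10:30 /test/hosts"

def parse_ls_line(line):
    """Split one `hadoop fs -ls` line into named fields."""
    perms, replication, owner, group, size, date, clock, path = line.split(None, 7)
    return {
        "permissions": perms,
        "replication": None if replication == "-" else int(replication),
        "owner": owner,
        "group": group,
        "size_bytes": int(size),
        "modified": "%s %s" % (date, clock),
        "path": path,
    }

entry = parse_ls_line(LS_LINE)
print(entry["path"], "replication =", entry["replication"])
```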

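HDFS administration commands

# Re-read the DataNode include/exclude host lists (dfs.hosts, dfs.hosts.exclude) without restarting the NameNode

hadoop dfsadmin -refreshNodes

HBase installation and configuration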

cd /data/install

tar xf hbase-1.0.0-cdh5.4.5.tar.gz -C /data/app

cd /data/app

ln -s hbase-1.0.0-cdh5.4.5 hbase

Set environment variables

sed -i '/^export PATH=/i\export HBASE_HOME=/data/app/hbase' /etc/profile

sed -i 's#export PATH=#&\$HBASE_HOME/bin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $HBASE_HOME

rm -rf *.txt pom.xml src docs cloudera dev-support hbase-annotations hbase-assembly hbase-checkstyle hbase-client hbase-common hbase-examples hbase-hadoop2-compat hbase-hadoop-compat hbase-it hbase-prefix-tree hbase-protocol hbase-rest hbase-server hbase-shell hbase-testing-util hbase-thrift

find . -name "*.cmd"|xargs rm -f

Configuration
Enter the configuration directory

cd $HBASE_HOME/conf

Edit hbase-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HBase data storage path -->

    <property>

        <name>hbase.rootdir</name>

        <value>hdfs://hdfs-cdh/hbase</value>

    </property>



    <!-- Fully distributed mode -->

    <property>

        <name>hbase.cluster.distributed</name>

        <value>true</value>

    </property>



    <!-- HMaster nodes -->

    <property>

        <name>hbase.master</name>

        <value>cdh-m1:60000,cdh-m2:60000</value>

    </property>



    <!-- ZooKeeper nodes -->

    <property>

        <name>hbase.zookeeper.quorum</name>

        <value>cdh-m1:2181,cdh-m2:2181,cdh-s1:2181</value>

    </property>



    <!-- znode path; when several HBase clusters share one ZooKeeper ensemble, each needs a different znode -->

    <property>

        <name>zookeeper.znode.parent</name>

        <value>/hbase</value>

    </property>



    <!-- HBase coprocessor -->

    <property>

        <name>hbase.coprocessor.user.region.classes</name>

        <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>

    </property>

</configuration>

Add the following variables to hbase-env.sh

export JAVA_HOME=/data/app/jdk1.7

export HBASE_LOG_DIR=/data/logs/hbase

export HBASE_PID_DIR=/data/pid

export HBASE_MANAGES_ZK=false

# SSH options (port); optional

export HBASE_SSH_OPTS="-o ConnectTimeout=1 -p 36000"

Heap size, in MB

export HBASE_HEAPSIZE=1024

Optionally, add the hostnames of all RegionServers to the regionservers file; it is used to start and stop the whole cluster.

启动与停止
切换用户

su - run

集群批量启动
需要配置运行用户ssh-key免密码登录，与$HBASE_HOME/conf/regionservers

# 启动

start-hbase.sh

# 停止

stop-hbase.sh

单服务启动停止
HMaster

# 启动

hbase-daemon.sh start master

# 停止

hbase-daemon.sh stop master

HRegionServer

# 启动

hbase-daemon.sh start regionserver

# 停止

hbase-daemon.sh stop regionserver

测试
HBase HA 测试
浏览器打开两个HMaster状态页:
http://cdh-m1:60010
http://cdh-m2:60010

可以在Master后面看见其中一个主机名，Backup Masters中看见另一个。
停止当前Master，刷新另一个HMaster状态页会发现Master后面已经切换，HA成功。

HBase 测试
进入hbase shell 执行：

create 'users','user_id','address','info'

list

put 'users','anton','info:age','24'

get 'users','anton'



# 最终结果

COLUMN                     CELL

 info:age                  timestamp=1465972035945, value=24

1 row(s) in 0.0170 seconds

清除测试数据：

disable 'users'

drop 'users'

到这里安装就全部完成。

Removing Nodes from HBase/HDFS

Big Data · Article by 空心菜 · 0 comments · 9423 views · 2015-11-30 00:16

One of our production servers could fail at any moment, so the HBase RegionServer and the HDFS DataNode running on it need to be removed. A new server will then be deployed to take over.

An earlier article, http://openskill.cn/article/178, covered how to add an HDFS DataNode, so I will first describe how to add an HBase RegionServer and then explain how to remove one.

Adding an HBase RegionServer node

The steps are as follows:
1. On the HBase master, edit the regionservers file

# cd hbase_install_dir/conf

# echo "new_hbase_node_hostname" >> ./regionservers

2. If your HBase cluster uses its own HBase-managed ZooKeeper ensemble, you also need to edit hbase-site.xml; otherwise skip this step.

# cd hbase_install_dir/conf

# vim hbase-site.xml

Find the hbase.zookeeper.quorum property and add the new node to it.

3. Sync the modified files to every node in the HBase cluster
4. Start the HBase RegionServer on the new node

# cd hbase_install_dir/bin/

# ./hbase-daemon.sh start regionserver

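5. Start the HBase shell on the HBase master

Run the status command to confirm the state of the cluster.

That completes adding a new RegionServer to HBase. Next comes removing HBase and HDFS nodes.

The cluster runs both Hadoop and HBase. Because HBase stores its data on Hadoop HDFS, remove the HBase node first and the Hadoop node afterwards. When adding nodes, work in the reverse order.

Removing an HBase RegionServer node

1. Before 0.90.2, the only option was to run the following on the node being decommissioned (my HBase version is 0.98.7):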

# cd hbase_install_dir

# ./bin/hbase-daemon.sh stop regionserver

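After this command runs, the RegionServer first closes all the regions it is serving and then shuts itself down. On shutdown, its ephemeral node in ZooKeeper expires. The Master then detects that the RegionServer is gone, treats it as a crashed node, and reassigns its regions to other RegionServers.

Note: be sure to disable the HBase load balancer before using this method. To disable it: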

hbase(main):001:0> balance_switch false

true

0 row(s) in 0.3290 seconds

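Summary:

The major drawback of this method is that the regions on the node stay offline for a long time. Regions are closed sequentially, so if the RegionServer hosts many regions, the first region closed has to wait until the last one has been closed and assigned before they all come back online. That can take a very long time. In my experiment a RegionServer hosted about 1000 regions on average and each region took about 4 s to assign, so assignment alone takes more than one hour.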
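The estimate above is plain arithmetic. A minimal sketch, assuming regions are assigned strictly one after another and using the figures quoted from this experiment; the function name is illustrative only:

```python
def sequential_assign_minutes(region_count, seconds_per_region):
    """Lower bound on reassignment time when regions are assigned one by one."""
    return region_count * seconds_per_region / 60.0


# Figures quoted in the text: about 1000 regions, about 4 s per assignment.
minutes = sequential_assign_minutes(1000, 4)
print(f"{minutes:.1f} minutes")  # 4000 s in total, a little over one hour
```

Any time spent closing regions comes on top of this, which is why the figure is only a lower bound.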

2. Since 0.90.2, HBase provides a new method, graceful_stop. Run it on the server you are removing:

# cd hbase_install_dir

# ./bin/graceful_stop.sh hostname

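The script disables the load balancer automatically, moves the regions off the node, and then stops the node. You can also follow the progress of the removal: how many regions have been reassigned, how many remain, and how long each assignment took.

Additional graceful_stop command-line options: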

# ./bin/graceful_stop.sh

Usage: graceful_stop.sh [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname>

 thrift      If we should stop/start thrift before/after the hbase stop/start

 rest        If we should stop/start rest before/after the hbase stop/start

 restart     If we should restart after graceful stop

 reload      Move offloaded regions back on to the stopped server

 debug       Print helpful debug information

 hostname    Hostname of server we are to stop

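Either way, the load balancer has to be re-enabled manually afterwards. balance_switch prints the previous state of the balancer, so switching it off while it is on returns true: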

hbase(main):001:0> balance_switch false

true

0 row(s) in 0.3590 seconds

Switching it back on then returns the previous state, false:

hbase(main):001:0> balance_switch true

false

0 row(s) in 0.3290 seconds

Comparing the two methods, graceful_stop is the recommended way to remove an HBase RegionServer.
Official documentation: http://hbase.apache.org/0.94/book/node.management.html http://hbase.apache.org/book.html#decommission

Removing an HDFS DataNode

1. Add the following to core-site.xml



<property>
    <name>dfs.hosts.exclude</name>
    <value>/hdfs_install_dir/conf/excludes</value>
</property>

2. Create the excludes file and write the hostnames of the nodes to be removed into it

# cd hdfs_install_dir/conf

# vim excludes

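Add the hostname of each node to remove, for example hdnode1, then save and exit.

3. Run the following on the NameNode to force it to re-read the configuration; the cluster does not need to be restarted.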

# cd hdfs_install_dir/bin/

# ./hadoop dfsadmin -refreshNodes

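Blocks are then moved off the node in the background.
4. Check the status
Once the operation from step 3 has finished, the machine being retired can be shut down safely.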

# ./hadoop dfsadmin -report

This lists the nodes currently connected to the cluster.

While decommissioning is in progress, it shows:

Decommission Status : Decommission in progress  



When it has finished, it shows:

Decommission Status : Decommissioned

For example:

Name: 10.0.180.6:50010

Decommission Status : Decommission in progress

Configured Capacity: 917033340928 (10.83 TB)

DFS Used: 7693401063424 (7 TB)

Non DFS Used: 118121652224 (110.00 GB)

DFS Remaining: 4105510625280(3.63 TB)

DFS Used%: 64.56%

DFS Remaining%: 34.45%

Last contact: Mon Nov 29 23:53:52 CST 2015
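When many nodes are involved, reading the report by eye gets tedious. The following is a minimal sketch that extracts the status per node, assuming the report text has the "Name:" and "Decommission Status :" layout shown above (the exact layout may differ between Hadoop versions; the function name is hypothetical):

```python
import re


def decommission_status(report_text):
    """Map each DataNode address to its 'Decommission Status' value."""
    statuses = {}
    current = None
    for line in report_text.splitlines():
        line = line.strip()
        name = re.match(r"Name:\s*(\S+)", line)
        if name:
            current = name.group(1)
            continue
        status = re.match(r"Decommission Status\s*:\s*(.+)", line)
        if status and current is not None:
            statuses[current] = status.group(1).strip()
    return statuses


# Sample text copied from the report excerpt above.
sample = """
Name: 10.0.180.6:50010
Decommission Status : Decommission in progress
DFS Used%: 64.56%
"""
print(decommission_status(sample))
```

In practice the text would come from the output of the report command; here a literal string stands in for it.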

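You can also check through the Hadoop web UI:
Live nodes are listed at http://master_ip:50070/dfsnodelist.jsp?whatNodes=LIVE
The node being removed shows the state Decommission in progress.
Once the node has been removed, it is listed at http://master_ip:50070/dfsnodelist.jsp?whatNodes=DEAD

When decommissioning completes, the removed node appears under dead nodes and its services are stopped; Live Nodes now contains only had2 and had3.
That is the whole procedure for removing a node from a Hadoop cluster. One caveat is essential:
the dfs.replication value in hdfs-site.xml must be less than or equal to the number of healthy DataNodes left after the removal, that is: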

dfs.replication <= number of DataNodes remaining in the cluster
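The condition can be checked before touching the excludes file. A minimal sketch with a hypothetical helper; the node counts below are illustrative and based on the three-node had1/had2/had3 example:

```python
def can_decommission(replication, live_datanodes, nodes_to_remove):
    """Return True if dfs.replication still fits the DataNodes left after removal."""
    remaining = live_datanodes - nodes_to_remove
    return remaining >= replication


# Three DataNodes, one of them being removed, leaves two.
print(can_decommission(replication=3, live_datanodes=3, nodes_to_remove=1))  # False
print(can_decommission(replication=2, live_datanodes=3, nodes_to_remove=1))  # True
```

If the check fails, lower the replication factor first or add a replacement DataNode before decommissioning.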

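For changing the replication factor, see http://heylinux.com/archives/2047.html

Re-adding a removed DataNode

1. Edit core-site.xml on the NameNode and delete or comment out the content added earlier; I chose to comment it out.

2. Make the NameNode reload its configuration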

# ./bin/hadoop dfsadmin -refreshNodes

3. Finally, start the DataNode process on the DataNode host

# ./bin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-datanode-had1.out

4. Check that it started

# jps

18653 Jps

19687 DataNode  ----> started normally

Re-adding an HBase RegionServer
Simply restart the regionserver process.
References: http://www.edureka.co/blog/commissioning-and-decommissioning-nodes-in-a-hadoop-cluster/
https://pravinchavan.wordpress.com/2013/06/03/removing-node-from-hadoop-cluster/


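CDH Hadoop + HBase HA Deployment in Detail

Big Data · Article by 空心菜 · 0 comments · 8140 views · 2016-11-07 21:07

Deploying CDH is no different from deploying Apache Hadoop. The focus here is on the HA deployment; note in particular that NameNode HA depends on ZooKeeper.

Preparation

Hosts file configuration: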

cat > /etc/hosts << _HOSTS_

127.0.0.1          localhost

10.0.2.59          cdh-m1

10.0.2.60          cdh-m2

10.0.2.61          cdh-s1

_HOSTS_

Services on each node

cdh-m1 Zookeeper JournalNode NameNode DFSZKFailoverController HMaster

cdh-m2 Zookeeper JournalNode NameNode DFSZKFailoverController HMaster

cdh-s1 Zookeeper JournalNode DataNode HRegionServer

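A few notes on the newer services:

JournalNode synchronizes NameNode metadata. As with ZooKeeper, deploy 2N+1 nodes; the ensemble stays available as long as a majority (N+1) of them are alive.
DFSZKFailoverController (ZKFC) handles active/standby failover, playing a role similar to Keepalived.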
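The majority rule explains why odd ensemble sizes are preferred. A minimal sketch of the arithmetic (the helper names are illustrative, not part of any Hadoop or ZooKeeper API):

```python
def majority(ensemble_size):
    """Smallest number of live nodes that still forms a quorum."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes may fail while a quorum remains."""
    return ensemble_size - majority(ensemble_size)


for size in (3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

With the three nodes used in this article, one node may fail. A fourth node would not raise that number, which is why ensembles are usually sized 2N+1.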

NTP service

Set the time zone

rm -f /etc/localtime

ln -s  /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Configure the NTP server

yum install -y ntp

cat > /etc/ntp.conf << _NTP_

driftfile /var/lib/ntp/drift



restrict default nomodify

restrict -6 default nomodify



server cn.ntp.org.cn prefer

server news.neu.edu.cn iburst

server dns.sjtu.edu.cn iburst

server 127.127.1.1 iburst



tinker dispersion 100

tinker step 1800

tinker stepout 3600

includefile /etc/ntp/crypto/pw



keys /etc/ntp/keys

_NTP_



# Sync immediately when NTP starts (step-tickers takes one host per line)

cat >> /etc/ntp/step-tickers << _NTP_
cn.ntp.org.cn
news.neu.edu.cn
dns.sjtu.edu.cn
_NTP_



# Sync the hardware clock

cat >> /etc/sysconfig/ntpd << _NTPHW_

SYNC_HWCLOCK=yes

_NTPHW_

Start the service and enable it at boot

/etc/init.d/ntpd start

chkconfig ntpd on

Configure the NTP clients

yum install -y ntp

# Remember to change this to your internal NTP server address

cat > /etc/ntp.conf << _NTP_

driftfile /var/lib/ntp/drift



restrict default nomodify

restrict -6 default nomodify



restrict 127.0.0.1

restrict -6 ::1



server 10.0.2.59 prefer



tinker dispersion 100

tinker step 1800

tinker stepout 3600

includefile /etc/ntp/crypto/pw



keys /etc/ntp/keys

_NTP_



# Sync immediately when NTP starts (step-tickers takes one host per line)

cat >> /etc/ntp/step-tickers << _NTP_
10.0.2.59
_NTP_



# Sync the hardware clock

cat >> /etc/sysconfig/ntpd << _NTPHW_

SYNC_HWCLOCK=yes

_NTPHW_

Start the service and enable it at boot

/etc/init.d/ntpd start

chkconfig ntpd on

Check NTP synchronization

ntpq -p



# Output

     remote           refid      st t when poll reach   delay   offset  jitter

==============================================================================

*time7.aliyun.co 10.137.38.86     2 u   17   64    3   44.995    5.178   0.177

 news.neu.edu.cn .INIT.          16 u    -   64    0    0.000    0.000   0.000

 202.120.2.90    .INIT.          16 u    -   64    0    0.000    0.000   0.000

JDK setup

Create directories

mkdir -p /data/{install,app,logs,pid,appData}

mkdir /data/appData/tmp



cd /data/install

wget -c http://oracle.com/jdk-7u51-linux-x64.gz

tar xf jdk-7u51-linux-x64.gz -C /data/app

cd /data/app

ln -s jdk1.7.0_51 jdk1.7

cat >> /etc/profile << _PATH_

export JAVA_HOME=/data/app/jdk1.7

export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar

export PATH=\$JAVA_HOME/bin:\$PATH

_PATH_

source /etc/profile

Create the service account

useradd -u 600 run

Download the packages

# http://archive.cloudera.com/cdh5/cdh/5/

cd /data/install

wget -c http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.4.5.tar.gz

wget -c http://archive.apache.org/dist/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz

wget -c http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.4.5.tar.gz

Install ZooKeeper

cd /data/install

tar xf zookeeper-3.4.5.tar.gz -C /data/app

cd /data/app

ln -s zookeeper-3.4.5 zookeeper

Set environment variables

sed -i '/^export PATH=/i\export ZOOKEEPER_HOME=/data/app/zookeeper' /etc/profile

sed -i 's#export PATH=#&\$ZOOKEEPER_HOME/bin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $ZOOKEEPER_HOME

rm -rf *xml *txt zookeeper-3.4.5.jar.* src recipes docs dist-maven contrib

rm -f $ZOOKEEPER_HOME/bin/*.cmd $ZOOKEEPER_HOME/bin/*.txt

rm -f $ZOOKEEPER_HOME/conf/zoo_sample.cfg

Create data directories

mkdir -p /data/appData/zookeeper/{data,logs}

Configuration

cat > $ZOOKEEPER_HOME/conf/zoo.cfg << _ZOO_

tickTime=2000

initLimit=10

syncLimit=5

clientPort=2181

dataDir=/data/appData/zookeeper/data

dataLogDir=/data/appData/zookeeper/logs

server.1=cdh-m1:2888:3888

server.2=cdh-m2:2888:3888

server.3=cdh-s1:2888:3888

_ZOO_

To change how ZooKeeper logs and where the logs are written, edit

$ZOOKEEPER_HOME/bin/zkEnv.sh

and add two variables after line 27

ZOO_LOG_DIR=/data/logs/zookeeper

ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

Create the myid file

# The myid must match this host's server.N entry in zoo.cfg (1 on cdh-m1, 2 on cdh-m2, 3 on cdh-s1)

echo 1 >/data/appData/zookeeper/data/myid
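Since a mismatched myid is an easy mistake when the command is repeated on every host, the value can be derived from zoo.cfg instead of typed by hand. A minimal sketch, assuming the server.N=host:port:port lines shown above; the helper name is hypothetical:

```python
import re

# The server lines from the zoo.cfg shown earlier in this article.
ZOO_CFG = """\
server.1=cdh-m1:2888:3888
server.2=cdh-m2:2888:3888
server.3=cdh-s1:2888:3888
"""


def myid_for_host(zoo_cfg_text, hostname):
    """Return the N of the server.N entry that names hostname, or None."""
    for line in zoo_cfg_text.splitlines():
        match = re.match(r"\s*server\.(\d+)\s*=\s*([^:\s]+):", line)
        if match and match.group(2) == hostname:
            return int(match.group(1))
    return None


print(myid_for_host(ZOO_CFG, "cdh-m2"))
```

The returned number is what belongs in the data directory's myid file on that host.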

Set directory ownership

chown -R run.run /data/{app,appData,logs}

Start and stop

# Start

runuser - run -c 'zkServer.sh start'

# Stop

runuser - run -c 'zkServer.sh stop'

Install Hadoop

tar xf hadoop-2.6.0-cdh5.4.5.tar.gz -C /data/app

cd /data/app

ln -s hadoop-2.6.0-cdh5.4.5 hadoop

Set environment variables

sed -i '/^export PATH=/i\export HADOOP_HOME=/data/app/hadoop' /etc/profile

sed -i 's#export PATH=#&\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $HADOOP_HOME

rm -rf *txt share/doc src examples* include bin-mapreduce1 cloudera

find . -name "*.cmd"|xargs rm -f

Create data directories

mkdir -p /data/appData/hdfs/{name,edits,data,jn,tmp}

Configuration

Change to the configuration directory

cd $HADOOP_HOME/etc/hadoop

Edit core-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HDFS cluster (nameservice) name; a port may be specified -->

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://hdfs-cdh</value>

    </property>



    <!-- Temporary file directory -->

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/data/appData/hdfs/tmp</value>

    </property>



    <!-- Trash: 0 disables it; 1440 means trashed files are deleted after 1440 minutes (24 hours) -->

    <property>

        <name>fs.trash.interval</name>

        <value>1440</value>

    </property>



    <!-- Buffer size used when reading and writing SequenceFiles, in bytes (default 4096) -->

    <property>

        <name>io.file.buffer.size</name>

        <value>131072</value>

    </property>



    <!-- Available compression codecs; enabled in hdfs-site.xml; the native libraries must be compiled for them to work -->

    <property>

        <name>io.compression.codecs</name>

        <value>org.apache.hadoop.io.compress.SnappyCodec</value>

    </property>

</configuration>

Edit hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HDFS nameservice name; must match the one in core-site.xml -->

    <property>

        <name>dfs.nameservices</name>

        <value>hdfs-cdh</value>

    </property>



    <!-- ZooKeeper quorum used for NameNode HA; officially configured in core-site.xml, but placing it in hdfs-site.xml for readability also works -->

    <property>

        <name>ha.zookeeper.quorum</name>

        <value>cdh-m1:2181,cdh-m2:2181,cdh-s1:2181</value>

    </property>



    <!-- The hdfs-cdh nameservice has two NameNodes: nn1 and nn2 -->

    <property>

        <name>dfs.ha.namenodes.hdfs-cdh</name>

        <value>nn1,nn2</value>

    </property>



    <!-- RPC address of nn1 -->

    <property>

        <name>dfs.namenode.rpc-address.hdfs-cdh.nn1</name>

        <value>cdh-m1:9000</value>

    </property>



    <!-- HTTP address of nn1 -->

    <property>

        <name>dfs.namenode.http-address.hdfs-cdh.nn1</name>

        <value>cdh-m1:50070</value>

    </property>



    <!-- RPC address of nn2 -->

    <property>

        <name>dfs.namenode.rpc-address.hdfs-cdh.nn2</name>

        <value>cdh-m2:9000</value>

    </property>



    <!-- HTTP address of nn2 -->

    <property>

        <name>dfs.namenode.http-address.hdfs-cdh.nn2</name>

        <value>cdh-m2:50070</value>

    </property>



    <!-- Where the NameNode metadata (shared edit log) is stored on the JournalNodes -->

    <property>

        <name>dfs.namenode.shared.edits.dir</name>

        <value>qjournal://cdh-m1:8485;cdh-m2:8485;cdh-s1:8485/hdfs-cdh</value>

    </property>



    <!-- Enable automatic NameNode failover -->

    <property>

        <name>dfs.ha.automatic-failover.enabled</name>

        <value>true</value>

    </property>



    <!-- Client failover proxy provider implementation -->

    <property>

        <name>dfs.client.failover.proxy.provider.hdfs-cdh</name>

        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

    </property>



    <!-- Fencing methods used during failover, one per line -->

    <property>

        <name>dfs.ha.fencing.methods</name>

        <value>

            sshfence

            shell(/bin/true)

        </value>

    </property>



    <!-- Private key of the service user; the NameNodes need passwordless SSH to each other in both directions for automatic failover -->

    <property>

        <name>dfs.ha.fencing.ssh.private-key-files</name>

        <value>/home/run/.ssh/id_rsa</value>

    </property>



    <!-- sshfence connect timeout, in milliseconds -->

    <property>

        <name>dfs.ha.fencing.ssh.connect-timeout</name>

        <value>50000</value>

    </property>



    <!-- Local storage path for NameNode data -->

    <property>

        <name>dfs.namenode.name.dir</name>

        <value>/data/appData/hdfs/name</value>

    </property>



    <!-- Local storage path for DataNode data -->

    <property>

        <name>dfs.datanode.data.dir</name>

        <value>/data/appData/hdfs/data</value>

    </property>



    <!-- Local storage path for JournalNode data -->

    <property>

        <name>dfs.journalnode.edits.dir</name>

        <value>/data/appData/hdfs/jn</value>

    </property>



    <!-- Skip explicit flushes of the edit log file channel; the edit log is written with synchronous disk writes instead -->

    <property>

        <name>dfs.namenode.edits.noeditlogchannelflush</name>

        <value>true</value>

    </property>



    <!-- Local storage path for the edit log -->

    <property>

        <name>dfs.namenode.edits.dir</name>

        <value>/data/appData/hdfs/edits</value>

    </property>



    <!-- Enable block location metadata so that Impala can tell which disk a block is on (disabled by default) -->

    <property>

        <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>

        <value>true</value>

    </property>



    <!-- Permission checking (enabled by default; disabled here) -->

    <property>

        <name>dfs.permissions.enabled</name>

        <value>false</value>

    </property>



    <!-- Block size -->

    <property>

        <name>dfs.blocksize</name>

        <value>64m</value>

    </property>

</configuration>

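With fewer than 5 DataNodes, adding the following settings is recommended

<!-- Number of replicas; must not exceed the number of DataNodes; large clusters should keep the default (3) -->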

    <property>

        <name>dfs.replication</name>

        <value>2</value>

    </property>



    <!-- Do not replace a DataNode in the write pipeline when a replica write fails; suitable for small clusters -->

    <property>

        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>

        <value>NEVER</value>

    </property>
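To get a feel for what the block size and replication settings above mean for storage, here is a minimal sketch using the values from this article (64 MB blocks, 2 replicas). The helper is illustrative; it assumes the last block of a file only occupies the bytes it actually holds:

```python
MIB = 1024 * 1024


def hdfs_footprint(file_size_bytes, block_size_bytes=64 * MIB, replication=2):
    """Return (number of blocks, raw bytes stored across all DataNodes)."""
    # Integer ceiling division: a partially filled last block still counts as a block.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    raw_bytes = file_size_bytes * replication
    return blocks, raw_bytes


blocks, raw = hdfs_footprint(200 * MIB)
print(blocks, raw // MIB)  # 4 blocks, 400 MiB of raw storage
```

Each block is also an object the NameNode has to track in memory, which is one reason large numbers of small files are costly.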

Add the following variables to hadoop-env.sh

export JAVA_HOME=/data/app/jdk1.7

export HADOOP_LOG_DIR=/data/logs/hadoop

export HADOOP_PID_DIR=/data/pid

# SSH port (optional)

export HADOOP_SSH_OPTS="-p 22"

Heap size, in MB

export HADOOP_HEAPSIZE=1024

Set permissions

chown -R run.run /data/{app,appData,logs}

chmod 777 /data/pid

Formatting

Formatting only needs to be done once. Start ZooKeeper before formatting.

Switch user

su - run

Start all JournalNodes

hadoop-daemon.sh start journalnode

Format ZooKeeper (creates the znode for ZKFC)

hdfs zkfc -formatZK

Format and start the primary NameNode

hdfs namenode -format

hadoop-daemon.sh start namenode

Sync data to the standby NameNode and start it

hdfs namenode -bootstrapStandby

hadoop-daemon.sh start namenode

Start ZKFC

hadoop-daemon.sh start zkfc

Start the DataNode

hadoop-daemon.sh start datanode

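Starting and stopping

Switch user

su - run

Starting the whole cluster
This requires passwordless SSH key login for the service user and a populated $HADOOP_HOME/etc/hadoop/slaves

# Start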

start-dfs.sh

# Stop

stop-dfs.sh

Starting and stopping individual services
Start HDFS

hadoop-daemon.sh start journalnode

hadoop-daemon.sh start namenode

hadoop-daemon.sh start zkfc

hadoop-daemon.sh start datanode

Stop HDFS

hadoop-daemon.sh stop datanode

hadoop-daemon.sh stop namenode

hadoop-daemon.sh stop journalnode

hadoop-daemon.sh stop zkfc

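Testing

HDFS HA test
Open the NameNode status pages (the HTTP addresses configured in hdfs-site.xml above):
http://cdh-m1:50070
http://cdh-m2:50070

Next to Overview you will see either active or standby; active marks the current active NameNode. Stop the NameNode on the active host and check that the standby becomes active.

HDFS test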

hadoop fs -mkdir /test

hadoop fs -put /etc/hosts /test

hadoop fs -ls /test

Output:

-rw-r--r--   2 java supergroup         89 2016-06-15 10:30 /test/hosts

# The column after the permissions (2 here) is the file's replication factor, that is, the number of replicas.
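The columns of such a listing line can be pulled apart with a simple split. A minimal sketch, assuming the eight-column layout of the sample line above (the field names are my own labels, not Hadoop terminology):

```python
def parse_ls_line(line):
    """Split one 'hadoop fs -ls' file line into named fields."""
    fields = line.split(None, 7)
    return {
        "permissions": fields[0],
        "replication": fields[1],  # kept as text; directories show "-" here
        "owner": fields[2],
        "group": fields[3],
        "size": int(fields[4]),
        "modified": fields[5] + " " + fields[6],
        "path": fields[7],
    }


# The sample line from the listing above.
entry = parse_ls_line(
    "-rw-r--r--   2 java supergroup         89 2016-06-15 10:30 /test/hosts"
)
print(entry["replication"], entry["size"], entry["path"])
```

For the sample line this yields a replication factor of 2, matching the dfs.replication value suggested for small clusters.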

HDFS administration commands

# Re-read the DataNode include/exclude host lists referenced by the configuration, without a restart

hadoop dfsadmin -refreshNodes

HBase installation and configuration

cd /data/install

tar xf hbase-1.0.0-cdh5.4.5.tar.gz -C /data/app

cd /data/app

ln -s hbase-1.0.0-cdh5.4.5 hbase

Set environment variables

sed -i '/^export PATH=/i\export HBASE_HOME=/data/app/hbase' /etc/profile

sed -i 's#export PATH=#&\$HBASE_HOME/bin:#' /etc/profile

source /etc/profile

Remove unneeded files

cd $HBASE_HOME

rm -rf *.txt pom.xml src docs cloudera dev-support hbase-annotations hbase-assembly hbase-checkstyle hbase-client hbase-common hbase-examples hbase-hadoop2-compat hbase-hadoop-compat hbase-it hbase-prefix-tree hbase-protocol hbase-rest hbase-server hbase-shell hbase-testing-util hbase-thrift

find . -name "*.cmd"|xargs rm -f

Configuration
Change to the configuration directory

cd $HBASE_HOME/conf

Edit hbase-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- HBase data storage path -->

    <property>

        <name>hbase.rootdir</name>

        <value>hdfs://hdfs-cdh/hbase</value>

    </property>



    <!-- Fully distributed mode -->

    <property>

        <name>hbase.cluster.distributed</name>

        <value>true</value>

    </property>



    <!-- HMaster nodes -->

    <property>

        <name>hbase.master</name>

        <value>cdh-m1:60000,cdh-m2:60000</value>

    </property>



    <!-- ZooKeeper nodes -->

    <property>

        <name>hbase.zookeeper.quorum</name>

        <value>cdh-m1:2181,cdh-m2:2181,cdh-s1:2181</value>

    </property>



    <!-- znode path; if several HBase clusters share one ZooKeeper ensemble, each needs a different znode -->

    <property>

        <name>zookeeper.znode.parent</name>

        <value>/hbase</value>

    </property>



    <!-- HBase coprocessor -->

    <property>

        <name>hbase.coprocessor.user.region.classes</name>

        <value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>

    </property>

</configuration>

Add the following variables to hbase-env.sh

export JAVA_HOME=/data/app/jdk1.7

export HBASE_LOG_DIR=/data/logs/hbase

export HBASE_PID_DIR=/data/pid

export HBASE_MANAGES_ZK=false

# SSH options, including the port (optional)

export HBASE_SSH_OPTS="-o ConnectTimeout=1 -p 36000"

Heap size, in MB

export HBASE_HEAPSIZE=1024

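Optionally, list the hostnames of all RegionServers in the regionservers file; it is used to start and stop the whole cluster at once.

Starting and stopping
Switch user

su - run

Starting the whole cluster
This requires passwordless SSH key login for the service user and a populated $HBASE_HOME/conf/regionservers

# Start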

start-hbase.sh

# Stop

stop-hbase.sh

Starting and stopping individual services
HMaster

# Start

hbase-daemon.sh start master

# Stop

hbase-daemon.sh stop master

HRegionServer

# Start

hbase-daemon.sh start regionserver

# Stop

hbase-daemon.sh stop regionserver

Testing
HBase HA test
Open the status pages of both HMasters in a browser:
http://cdh-m1:60010
http://cdh-m2:60010

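One hostname is shown next to Master and the other under Backup Masters.
Stop the current Master and refresh the other HMaster's status page; the name next to Master will have switched, which means HA works.

HBase test
Enter the HBase shell and run: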

create 'users','user_id','address','info'

list

put 'users','anton','info:age','24'

get 'users','anton'



# Final output

COLUMN                     CELL

 info:age                  timestamp=1465972035945, value=24

1 row(s) in 0.0170 seconds

Clean up the test data:

disable 'users'

drop 'users'

That completes the installation.

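Notes on Common HBase Shell Commands

Big Data · Article by 空心菜 · 0 comments · 3235 views · 2016-09-13 00:19

1. Entering the HBase shell console
$HBASE_HOME/bin/hbase shell
If Kerberos authentication is enabled, first authenticate with the appropriate keytab (using kinit), then start the HBase shell. Once inside, the whoami command shows the current user.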

hbase(main):002:0> whoami

2016-09-12 13:09:42,440 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

root (auth:SIMPLE)

    groups: root

2. Table management
1) List the tables

hbase(main):001:0> list

TABLE                                                                                                              

pythonTrace                                                                                   

1 row(s) in 0.1320 seconds

2) Create a table
Syntax: create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}
Example: create table t1 with two column families, f1 and f2, each keeping 2 versions

hbase(main):001:0> create 't1',{NAME => 'f1', VERSIONS => 2},{NAME => 'f2', VERSIONS => 2}

0 row(s) in 0.4400 seconds



=> Hbase::Table - t1

3) Drop a table
This takes two steps: first disable, then drop. Example: drop table t1

hbase(main):002:0> disable 't1'

0 row(s) in 1.2030 seconds



hbase(main):003:0> drop 't1'

0 row(s) in 0.1870 seconds

4) Describe a table
Syntax: describe <table>. Example: describe table t1

hbase(main):005:0> describe 't1'

Table t1 is ENABLED                                                                                                

t1                                                                                                                 

COLUMN FAMILIES DESCRIPTION                                                                                        

{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '2', COMP

RESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_M

EMORY => 'false', BLOCKCACHE => 'true'}                                                                            

{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '2', COMP

RESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_M

EMORY => 'false', BLOCKCACHE => 'true'}                                                                            

2 row(s) in 0.0240 seconds

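5) Alter a table
The table must be disabled before its schema is altered
Syntax: alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}. Example: set the TTL of the column families of table t1 to 180 days (15552000 seconds); the transcript below uses column families named body and meta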

hbase(main):006:0> disable 't1'

0 row(s) in 1.1950 seconds



hbase(main):007:0> alter 't1',{NAME=>'body',TTL=>'15552000'},{NAME=>'meta', TTL=>'15552000'}

Updating all regions with the new schema...

1/1 regions updated.

Done.

Updating all regions with the new schema...

1/1 regions updated.

Done.

0 row(s) in 2.1910 seconds



hbase(main):008:0> enable 't1'

0 row(s) in 0.3930 seconds
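The TTL in the alter command above is given in seconds. A small sketch of the conversion from days, with an illustrative helper name:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ttl_seconds(days):
    """Convert a retention period in days to the seconds value used for TTL."""
    return days * SECONDS_PER_DAY


print(ttl_seconds(180))  # 15552000, the value used in the alter command
```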

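3. Permission management
1) Grant permissions
Syntax: grant <user>, <permissions>, <table>[, <column family>[, <column qualifier>]]; arguments are separated by commas
Permissions are expressed with five letters, "RWXCA": READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
Example: grant user 'test' read and write permission on table t1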

hbase(main)> grant 'test','RW','t1'
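When permission strings are generated by scripts, it helps to validate them before they reach the shell. A minimal sketch based only on the five letters listed above; the helper is hypothetical and not part of HBase:

```python
PERMISSIONS = {"R": "READ", "W": "WRITE", "X": "EXEC", "C": "CREATE", "A": "ADMIN"}


def describe_permissions(spec):
    """Expand a permission string such as 'RW' into the names it stands for."""
    unknown = [letter for letter in spec if letter not in PERMISSIONS]
    if unknown:
        raise ValueError("unknown permission letters: " + "".join(unknown))
    return [PERMISSIONS[letter] for letter in spec]


print(describe_permissions("RW"))  # ['READ', 'WRITE']
```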

2) View permissions
Syntax: user_permission <table>. Example: list the permissions on table t1

hbase(main)> user_permission 't1'

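3) Revoke permissions
Similar to granting. Syntax: revoke <user>, <table>[, <column family>[, <column qualifier>]]
Example: revoke user test's permissions on table t1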

hbase(main)> revoke 'test','t1'

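4. Creating, reading, updating, and deleting table data
1) Add data
Syntax: put <table>, <rowkey>, <family:column>, <value>, <timestamp>
Example: add a row to table t1 with rowkey rowkey001, family name f1, column name col1, value value01, and the system default timestamp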

hbase(main)> put 't1','rowkey001','f1:col1','value01'

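The usage is straightforward.
2) Query data
a) Get a single row
Syntax: get <table>, <rowkey>, [<family:column>, ...]. Example: get the value of col1 under f1 for rowkey001 in table t1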

hbase(main)> get 't1','rowkey001', 'f1:col1'

# or:

hbase(main)> get 't1','rowkey001', {COLUMN=>'f1:col1'}

Get all column values of rowkey001 in table t1

hbase(main)> get 't1','rowkey001'

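b) Scan a table
Syntax: scan <table>, {COLUMNS => [<family:column>, ...], LIMIT => num}
Advanced options such as STARTROW, TIMERANGE, and FILTER can also be added. Example: scan the first 5 rows of table t1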

hbase(main)> scan 't1',{LIMIT=>5}

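c) Count the rows in a table
Syntax: count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}. Example: count the rows in t1 with an interval of 100 and a cache of 500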

hbase(main)> count 't1', {INTERVAL => 100, CACHE => 500}

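3) Delete data
a) Delete a column value from a row
Syntax: delete <table>, <rowkey>, <family:column>, <timestamp>; the column name is required
Example: delete the data in f1:col1 of rowkey001 in table t1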

hbase(main)> delete 't1','rowkey001','f1:col1'

Note: this deletes all versions of the f1:col1 column in that row

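b) Delete a row
Syntax: deleteall <table>, <rowkey>, <family:column>, <timestamp>; the column name may be omitted to delete the whole row
Example: delete the data of rowkey001 in table t1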

hbase(main)> deleteall 't1','rowkey001'

c) Delete all data in a table
Syntax: truncate <table>

Internally this runs disable table -> drop table -> create table. Example: delete all data in table t1

hbase(main)> truncate 't1'

5. Region management
1) Move a region
Syntax: move 'encodedRegionName', 'ServerName'

hbase(main)> move '4343995a58be8e5bbc739af1e91cd72d', 'db-41.xxx.xxx.org,60020,1390274516739'

2) Enable or disable the load balancer
Syntax: balance_switch true|false

hbase(main)> balance_switch

3) Split manually
Syntax: split 'regionName', 'splitKey'
4) Trigger a major compaction manually
# Syntax:

#Compact all regions in a table:

hbase> major_compact 't1'

#Compact an entire region:

hbase> major_compact 'r1'

#Compact a single column family within a region:

hbase> major_compact 'r1', 'c1'

#Compact a single column family within a table:

hbase> major_compact 't1', 'c1'

6. Configuration management and node restarts
1) Change the HDFS configuration
HDFS configuration location: /etc/hadoop/conf
# Sync the HDFS configuration

cat /home/hadoop/slaves|xargs -i -t scp /etc/hadoop/conf/hdfs-site.xml hadoop@{}:/etc/hadoop/conf/hdfs-site.xml

# Stop:

cat /home/hadoop/slaves|xargs -i -t ssh hadoop@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode"

# Start:

cat /home/hadoop/slaves|xargs -i -t ssh hadoop@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"

2) Change the HBase configuration
HBase configuration location:
# Sync the HBase configuration

cat /home/hadoop/hbase/conf/regionservers|xargs -i -t scp /home/hadoop/hbase/conf/hbase-site.xml hadoop@{}:/home/hadoop/hbase/conf/hbase-site.xml

# Graceful restart

cd ~/hbase

bin/graceful_stop.sh --restart --reload --debug inspurXXX.xxx.xxx.org

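Miscellaneous Notes on Hadoop Operations

Big Data · Article by chris · 0 comments · 3286 views · 2016-04-07 00:57

Hadoop at 蓝汛 (ChinaCache)

System architecture:

Cloudera and its products

Relationship between Apache Hadoop and CDH versions

Why is CDH better?

- Easier installation and upgrades: four installation methods (yum, tar, rpm, Cloudera Manager)
- Faster access to new features and bug fixes
- Annual releases, quarterly updates
- Yum installation automatically selects matching versions of the ecosystem components
- Automatic directory setup (logs, conf) and creation of the hdfs and mapred users
- Detailed documentation

Major improvements in CDH3u3
Major improvements in CDH3u4
Cloudera Manager

Cloudera Training

- About the training: two courses, Administrator and Development
- About the certification exam
- About the certificate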

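Operational incidents

1. Memory problems

Symptom 1

On the second day after the system went live, the JobTracker stopped working and its web page would not open.

Cause

Too many jobs were submitted at once, causing the JobTracker to run out of memory.

Fix

Increase the JobTracker heap; limit the number of running jobs.

Symptom 2

The NameNode ran out of memory. After a restart, the 50030 page reported a corrupted fsimage, and investigation showed that the SecondaryNameNode's fsimage was corrupted as well.

Cause

Too many small files caused the NN/SNN to run out of memory, which corrupted the fsimage file; the restarted NN could still serve requests, however.

Fix

Asked for help on the Cloudera Google Group and obtained a backdoor script.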

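2. Inefficient MapReduce jobs

Symptom

MapReduce jobs took too long to run.

Cause

The MR code used Spring; small files made the map method inefficient; GZ files were slow to read and write.

Fix

Remove Spring from the MR code; enable JVM reuse; use LZO for the input and for the map output; increase the number of parallel copy threads in reduce.

Compression and MapReduce performance

- Premise: a large number of small files
- Input: 147 GB in 45047 files, about 3 MB each on average
- 8 CPU cores; 32 GB RAM; 7200 rpm disks; 28 slave machines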
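The figures above show why small files hurt: each file typically becomes at least one input split and therefore one map task. A rough sketch of the numbers, where the 64 MB split size is an illustrative assumption and the helper names are my own:

```python
def average_file_mb(total_gb, file_count):
    """Average file size in MB for an input of total_gb spread over file_count files."""
    return total_gb * 1024 / file_count


def packed_splits(total_gb, split_mb):
    """Number of splits if the same data were packed into split_mb-sized chunks."""
    total_mb = total_gb * 1024
    return -(-total_mb // split_mb)  # ceiling division


print(round(average_file_mb(147, 45047), 2))  # about 3.34 MB per file
print(packed_splits(147, 64))                 # 2352 splits instead of 45047 files
```

Under that assumption, packing the input would cut the number of map tasks by roughly a factor of 19.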

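3. OMG, the whole cluster went down

Symptom

In the morning all DataNodes were found dead. Within 10 minutes of a restart, the DNs died again one after another. Investigation showed about 8% packet loss on the nodes.

Cause

A faulty switch module; the DNs could not cope with the large number of small files.

Fix

Upgrade from 3u2 to 3u4; set the DN heap to 2 GB.

What to do when you hit a problem you cannot get past

- Join the official Hadoop mailing list
- Join the Cloudera Google Group

Monitoring and alerting

- Monitoring: ganglia
- Hardware and service alerts: nagios
- Business alerts: implemented in-house

Nagios alerts:

Business monitoring: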

Introduction to Starbase, a Python API Module for HBase

Programming · Article by chris · 0 comments · 6054 views · 2016-02-20 16:14

The following guest post is provided by Artur Barseghyan, a web developer currently employed by Goldmund, Wyldebeast & Wunderliebe ...

1. Installation

Next, I’ll show you some frequently used commands and use cases. But first, install the current version of Starbase from CheeseShop (PyPi).

# pip install starbase

Import the module:

>>> from starbase import Connection

…and create a connection instance. Starbase defaults to 127.0.0.1:8000; if your settings are different, specify them here.

>>> c = Connection()

2. API usage examples

2.1 List all tables
Assuming two tables named table1 and table2 already exist, the following prints them.

>>> c.tables()

['table1', 'table2']

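2.2 Table schema operations
Whenever you need to operate on a table, first create a table instance.
Create a table instance (note that this step does not create the table itself):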

>>> t = c.table('table3')

Create a new table with columns 'column1', 'column2', 'column3' (here the table is actually created):

>>> t.create('column1', 'column2', 'column3')

201

Check whether the table exists:

>>> t.exists()

True

Show the table's columns:

>>> t.columns()

['column1', 'column2', 'column3']

Add columns to the table ('column4', 'column5', 'column6', 'column7'):

>>> t.add_columns('column4', 'column5', 'column6', 'column7')

200

Drop columns ('column6', 'column7'):

>>> t.drop_columns('column6', 'column7')

201

Drop the entire table:

>>> t.drop()

200

2.3 Table data operations
Insert data into a single row:

>>> t.insert(
...     'my-key-1',
...     {
...         'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
...         'column2': {'key21': 'value 21', 'key22': 'value 22'},
...         'column3': {'key31': 'value 31', 'key32': 'value 32'}
...     }
... )

200

Note that you may also use the native way of naming columns and cells (qualifiers). The result of the following is equivalent to that of the previous example.

>>> t.insert(
...     'my-key-1a',
...     {
...         'column1:key11': 'value 11', 'column1:key12': 'value 12', 'column1:key13': 'value 13',
...         'column2:key21': 'value 21', 'column2:key22': 'value 22',
...         'column3:key31': 'value 31', 'column3:key32': 'value 32'
...     }
... )

200

Update a row:

>>> t.update(
...     'my-key-1',
...     {'column4': {'key41': 'value 41', 'key42': 'value 42'}}
... )

200

Remove a row cell (qualifier):

>>> t.remove('my-key-1', 'column4', 'key41')

200

Remove a row column (column family):

>>> t.remove('my-key-1', 'column4')

200

Remove an entire row:

>>> t.remove('my-key-1')

200

Fetch a single row with all columns:

>>> t.fetch('my-key-1')

  {

      'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

      'column2': {'key21': 'value 21', 'key22': 'value 22'},

      'column3': {'key31': 'value 31', 'key32': 'value 32'}

  }

Fetch a single row with selected columns (limit to the 'column1' and 'column2' columns):

>>> t.fetch('my-key-1', ['column1', 'column2'])

  {

      'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},

      'column2': {'key21': 'value 21', 'key22': 'value 22'},

  }

Narrow the result set even more (limit to cells 'key11' and 'key13' of column 'column1' and cell 'key32' of column 'column3'):

>>> t.fetch('my-key-1', {'column1': ['key11', 'key13'], 'column3': ['key32']})

  {

      'column1': {'key11': 'value 11', 'key13': 'value 13'},

      'column3': {'key32': 'value 32'}

  }

Note that you may also use the native means of naming the columns and cells (qualifiers). The example below does exactly the same thing as the example above.

>>>  t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'])

  {

      'column1': {'key11': 'value 11', 'key13': 'value 13'},

      'column3': {'key32': 'value 32'}

  }

If you set the perfect_dict argument to False, you’ll get the native data structure:

>>>  t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'], perfect_dict=False)

{

    'column1:key11': 'value 11', 'column1:key13': 'value 13',

    'column3:key32': 'value 32'

}
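The two result shapes shown above, nested by column family versus flat "family:qualifier" names, carry the same information. The following sketch converts between them in plain Python; it illustrates the data structures only and is not Starbase's own implementation:

```python
def to_native(nested):
    """Flatten {'family': {'qualifier': value}} into {'family:qualifier': value}."""
    return {
        f"{family}:{qualifier}": value
        for family, cells in nested.items()
        for qualifier, value in cells.items()
    }


def to_nested(native):
    """Group {'family:qualifier': value} back into {'family': {'qualifier': value}}."""
    nested = {}
    for name, value in native.items():
        family, qualifier = name.split(":", 1)
        nested.setdefault(family, {})[qualifier] = value
    return nested


row = {
    "column1": {"key11": "value 11", "key13": "value 13"},
    "column3": {"key32": "value 32"},
}
print(to_native(row))
```

Applying to_nested to the flat form gives back the original nested dictionary.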

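Batch operations work like the normal insert and update but are committed together. In the example below, we will insert 5,000 records in a batch:

>>> data = {
...     'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
...     'column2': {'key21': 'value 21', 'key22': 'value 22'},
... }

>>> b = t.batch()

>>> for i in range(0, 5000):
...     b.insert('my-key-%s' % i, data)

>>> b.commit(finalize=True)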

{'method': 'PUT', 'response': [200], 'url': 'table3/bXkta2V5LTA='}
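The last path segment of the url field in the commit result looks like a base64-encoded row key. A quick check with the standard library (this is an observation about the sample output above, not a statement about Starbase internals; the helper name is hypothetical):

```python
import base64


def decode_row_key(url):
    """Decode the base64 text after the last '/' of the reported url."""
    encoded = url.rsplit("/", 1)[-1]
    return base64.b64decode(encoded).decode("utf-8")


print(decode_row_key("table3/bXkta2V5LTA="))  # my-key-0
```

The decoded value is the first row key generated by the loop, 'my-key-%s' % 0.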

In the example below, we will update 5,000 records in a batch:

>>> data = {
...     'column3': {'key31': 'value 31', 'key32': 'value 32'},
... }

>>> b = t.batch()

>>> for i in range(0, 5000):
...     b.update('my-key-%s' % i, data)

>>> b.commit(finalize=True)

{'method': 'POST', 'response': [200], 'url': 'table3/bXkta2V5LTA='}

Fetch all rows:

>>> t.fetch_all_rows()

That is all for this introduction.

Sharing a Python Script for Accessing HBase Data

Programming · Article by chris · 0 comments · 3082 views · 2016-02-20 15:33

#!/usr/bin/python

 

import getopt,sys,time

from thrift.transport.TSocket import TSocket

from thrift.transport.TTransport import TBufferedTransport

from thrift.protocol import TBinaryProtocol

from hbase import Hbase

 

def usage():

        print '''Usage :

        -h: Show help information;

        -l: Show all tables in HBase;

        -t {table} : Show table descriptors;

        -t {table} -k {key} : Show the row;

        -t {table} -k {key} -c {column} : Show the column;

        -t {table} -k {key} -c {column} -v {versions} : Show multiple versions;

    (written by liuhuorong@koudai.com)

        '''

 

class geilihbase:

        def __init__(self):

                self.transport = TBufferedTransport(TSocket("127.0.0.1", 9090))

                self.transport.open()

                self.protocol = TBinaryProtocol.TBinaryProtocol(self.transport)

                self.client = Hbase.Client(self.protocol)

        def __del__(self):

                self.transport.close()

        def glisttable(self):

                for table in self.client.getTableNames():

                        print table

        def ggetColumnDescriptors(self,table):

                rarr=self.client.getColumnDescriptors(table)

                if rarr:

                        for (k,v) in rarr.items():

                                print "%-20s\t%s" % (k,v)

        def gget(self,table,key,coulmn):

                rarr=self.client.get(table,key,coulmn)

                if rarr:

                        print "%-15s %-20s\t%s" % (rarr[0].timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(rarr[0].timestamp/1000)),rarr[0].value)

        def ggetrow(self,table,key):

                rarr=self.client.getRow(table, key)

                if rarr:

                        for (k,v) in rarr[0].columns.items():

                                print "%-20s\t%-15s %-20s\t%s" % (k,v.timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(v.timestamp/1000)),v.value)

        def ggetver(self, table, key, coulmn, versions):

                rarr=self.client.getVer(table,key,coulmn, versions);

                if rarr:

                        for row in rarr:

                                print "%-15s %-20s\t%s" % (row.timestamp,time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(row.timestamp/1000)),row.value)

 

def main(argv):

        tablename=""

        key=""

        coulmn=""

        versions=""

        try:

                opts, args = getopt.getopt(argv, "lht:k:c:v:", ["help","list"])

        except getopt.GetoptError:

                usage()

                sys.exit(2)

        for opt, arg in opts:

                if opt in ("-h", "--help"):

                        usage()

                        sys.exit(0)

                elif opt in ("-l", "--list"):

                        ghbase=geilihbase()

                        ghbase.glisttable()

                        sys.exit(0)

                elif opt == '-t':

                        tablename = arg

                elif opt == '-k':

                        key = arg

                elif opt == '-c':

                        coulmn = arg

                elif opt == '-v':

                        versions = int(arg)

        if ( tablename and key and coulmn and versions ):

                ghbase=geilihbase()

                ghbase.ggetver(tablename, key, coulmn, versions)

                sys.exit(0)

        if (tablename and key and coulmn ):

                ghbase=geilihbase()

                ghbase.gget(tablename, key, coulmn)

                sys.exit(0)

        if (tablename and key ):

                ghbase=geilihbase()

                ghbase.ggetrow(tablename, key)

                sys.exit(0)

        if (tablename ):

                ghbase=geilihbase()

                ghbase.ggetColumnDescriptors(tablename)

                sys.exit(0)

        usage()

        sys.exit(1)

 

if __name__ == "__main__":

        main(sys.argv[1:])
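The script divides every cell timestamp by 1000 before formatting it, because HBase stores timestamps as milliseconds since the Unix epoch. A minimal Python 3 sketch of that conversion follows; the helper name `format_hbase_timestamp` is hypothetical, and it formats in UTC (the script above uses local time) so the result is the same on every machine.

```python
from datetime import datetime, timezone


def format_hbase_timestamp(ts_ms):
    """Render a millisecond epoch timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(format_hbase_timestamp(1455982380000))  # 2016-02-20 15:33:00
```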

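Removing HBase/HDFS nodes

Big Data, 空心菜 posted an article, 0 comments, 9423 views, 2015-11-30 00:16, from related topics

Adding an HBase regionserver node

The steps are as follows:
1. On the HBase master, edit the regionservers file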

# cd hbase_install_dir/conf

# echo "new_hbase_node_hostname" >> ./regionservers

2. If your HBase cluster uses the ZooKeeper ensemble that HBase manages itself, you also need to edit hbase-site.xml; otherwise skip this step.

# cd hbase_install_dir/conf

# vim hbase-site.xml

Find the hbase.zookeeper.quorum property and add the new node to it.

3. Sync the modified files to every node of the HBase cluster
4. Start the HBase regionserver on the new node

# cd hbase_install_dir/bin/

# ./hbase-daemon.sh start regionserver

5. Start hbase shell on the HBase master

Use the status command to confirm the state of the cluster.
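Step 1 appends the new host name with `echo ... >>`, which adds a duplicate line if it is run twice. A small sketch of an idempotent alternative is shown below; the function name `add_regionserver` and the host names are hypothetical, and the demo writes to a temporary file instead of a real conf directory.

```python
import os
import tempfile


def add_regionserver(conf_path, hostname):
    """Append hostname to the regionservers file if missing; return True when a line was added."""
    text = ""
    if os.path.exists(conf_path):
        with open(conf_path) as handle:
            text = handle.read()
    if hostname in text.split():
        return False
    with open(conf_path, "a") as handle:
        # Make sure the new entry starts on its own line.
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(hostname + "\n")
    return True


# Demo against a throwaway file rather than a real HBase installation.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "regionservers")
with open(demo_file, "w") as handle:
    handle.write("hbase1\nhbase2")  # deliberately no trailing newline

print(add_regionserver(demo_file, "hbase3"))  # True
print(add_regionserver(demo_file, "hbase3"))  # False
```

The file still has to be synced to the other nodes afterwards, as described in step 3.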

Removing an HBase regionserver node

1. Before 0.90.2, the only option was to run the following on the node being removed (my HBase version is 0.98.7):

# cd hbase_install_dir

# ./bin/hbase-daemon.sh stop regionserver

hbase(main):001:0> balance_switch false

true

0 row(s) in 0.3290 seconds

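Summary:

A major drawback of this method is that the Regions on the node stay offline for a long time. If the RegionServer holds a large number of Regions, they are closed sequentially, so the first Region to be closed has to wait until the last one has been closed and assigned before they all come back online together. That is a very long wait. In my experiment, a RegionServer held about 1000 Regions on average and each Region took 4 s to be assigned, so the assignment alone took at least an hour.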
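The estimate in the summary is plain arithmetic and easy to redo for other cluster sizes. A minimal sketch, assuming assignments run strictly one after another as described (the function name is hypothetical):

```python
def assignment_minutes(region_count, seconds_per_region):
    """Total time in minutes to assign regions one after another."""
    return region_count * seconds_per_region / 60


# Figures from the summary: 1000 regions at 4 seconds each.
print(round(assignment_minutes(1000, 4), 1))  # 66.7
```

1000 regions at 4 s each come to 4000 s, roughly 67 minutes, which matches the "at least an hour" figure.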

2. Since 0.90.2, HBase has provided a new method, "graceful_stop". Run it for the server you want to remove:

# cd hbase_install_dir

# ./bin/graceful_stop.sh hostname

# ./bin/graceful_stop.sh

Usage: graceful_stop.sh [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname>

 thrift      If we should stop/start thrift before/after the hbase stop/start

 rest        If we should stop/start rest before/after the hbase stop/start

 restart     If we should restart after graceful stop

 reload      Move offloaded regions back on to the stopped server

 debug       Move offloaded regions back on to the stopped server

 hostname    Hostname of server we are to stop

Either way, the load balancer has to be managed by hand in the end. It is switched off like this:

hbase(main):001:0> balance_switch false

true

0 row(s) in 0.3590 seconds

Then switch it back on:

hbase(main):001:0> balance_switch true

false

0 row(s) in 0.3290 seconds

Removing an HDFS datanode

1. Add the following to core-site.xml

<property>
       <name>dfs.hosts.exclude</name>
       <value>/hdfs_install_dir/conf/excludes</value>
</property>

2. Create the excludes file and write in the host names of the nodes to be removed

# cd hdfs_install_dir/conf

# vim excludes

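Add the host name of each node to be removed, for example hdnode1, then save and exit.

3. Then run the following command on the namenode to force it to re-read its configuration; the cluster does not need to be restarted.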

# cd hdfs_install_dir/bin/

# ./hadoop dfsadmin -refreshNodes

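This moves the blocks in the background.
4. Check the status
Once the operation from step 3 has finished, the machine being retired can be shut down safely.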

# ./hadoop dfsadmin -report

This lists the nodes currently connected to the cluster.

While the decommission is running, it shows:

Decommission Status : Decommission in progress  



When it has finished, it shows:

Decommission Status : Decommissioned

For example:

Name: 10.0.180.6:50010

Decommission Status : Decommission in progress

Configured Capacity: 917033340928 (10.83 TB)

DFS Used: 7693401063424 (7 TB)

Non DFS Used: 118121652224 (110.00 GB)

DFS Remaining: 4105510625280(3.63 TB)

DFS Used%: 64.56%

DFS Remaining%: 34.45%

Last contact: Mon Nov 29 23:53:52 CST 2015
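Checking the report by eye gets tedious when several nodes are being retired. The sketch below pulls the "Decommission Status" of every node out of report text shaped like the sample above; it assumes each node block starts with a `Name:` line, and the function names are hypothetical.

```python
def decommission_status(report_text):
    """Map each datanode address to its 'Decommission Status' value."""
    statuses = {}
    current = None
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Decommission Status") and current is not None:
            statuses[current] = line.split(":", 1)[1].strip()
    return statuses


def safe_to_stop(statuses, node):
    """A node can be shut down only once it reports 'Decommissioned'."""
    return statuses.get(node) == "Decommissioned"


sample = """\
Name: 10.0.180.6:50010
Decommission Status : Decommission in progress
DFS Used%: 64.56%
"""
statuses = decommission_status(sample)
print(statuses)
print(safe_to_stop(statuses, "10.0.180.6:50010"))  # False
```

In practice the text would come from the output of the `dfsadmin -report` command shown earlier.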

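Note that the decommission can only complete if dfs.replication <= the number of nodes remaining in the cluster.

For changing the replication factor, see: http://heylinux.com/archives/2047.html

Re-adding a removed datanode

1. Edit core-site.xml on the namenode and delete or comment out the content we added earlier; I chose to comment it out.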

2. Then reload the namenode configuration again

# ./bin/hadoop dfsadmin -refreshNodes

3. Finally, start the datanode process on the datanode host

# ./bin/hadoop-daemon.sh start datanode

starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-datanode-had1.out

4. Check that it started

# jps

18653 Jps

19687 DataNode  ---->started normally

Accessing HBase through its Java API

Big Data, 空心菜 posted an article, 0 comments, 4258 views, 2015-11-08 23:35, from related topics


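HBase is a distributed, column-oriented open-source database. The technology comes from the Google paper "Bigtable: A Distributed Storage System for Structured Data" written by Fay Chang. Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase started as a subproject of Apache Hadoop (it is now a top-level Apache project). Unlike a typical relational database, HBase is suited to storing unstructured data. Another difference is that HBase is column-based rather than row-based.

General characteristics of HBase tables:

Large: a table can have hundreds of millions of rows and millions of columns

Column-oriented: storage and access control are per column (family), and column families are retrieved independently

Sparse: empty (null) columns take up no storage space, so tables can be designed to be very sparse

HBase is a non-relational database, and it provides a Java API that makes it convenient to use from Java. Below is what I put together from material around the web for creating, modifying and deleting tables and querying data in HBase through Java. The wrapper code is as follows: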

package yoodb.hbase;

 

import java.io.IOException;

 

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.HColumnDescriptor;

import org.apache.hadoop.hbase.HTableDescriptor;

import org.apache.hadoop.hbase.KeyValue;

import org.apache.hadoop.hbase.client.Delete;

import org.apache.hadoop.hbase.client.Get;

import org.apache.hadoop.hbase.client.HBaseAdmin;

import org.apache.hadoop.hbase.client.HTable;

import org.apache.hadoop.hbase.client.HTablePool;

import org.apache.hadoop.hbase.client.Put;

import org.apache.hadoop.hbase.client.Result;

import org.apache.hadoop.hbase.client.ResultScanner;

import org.apache.hadoop.hbase.client.Scan;

import org.apache.hadoop.hbase.util.Bytes;

 

public class HBaseTest {

 

    // Static configuration shared by all methods

    static Configuration conf = null;

    static final HTablePool tablePool;

    static {

        conf = HBaseConfiguration.create();

        conf.set("hbase.zookeeper.quorum", "yoodb");

        tablePool = new HTablePool(conf, 15);

    }

 

    /*

     * Create a table

     * @tableName table name

     * @family array of column family names

     */

    public static void creatTable(String tableName, String[] family)

            throws Exception {

        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor(tableName);

        for (int i = 0; i < family.length; i++) {

            desc.addFamily(new HColumnDescriptor(family[i]));

        }

        if (admin.tableExists(tableName)) {

            System.out.println("table Exists!");

            System.exit(0);

        } else {

            admin.createTable(desc);

            System.out.println("create table Success!");

        }

    }

 

    /*

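     * Add data to a table

     * @rowKey rowKey

     * @tableName table name

     * @column1 column names for the first column family, realname

     * @value1 values for the columns of the first family

     * @column2 column names for the second column family, address

     * @value2 values for the columns of the second family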

     */

    public static void addTableData(String rowKey, String tableName,String[] column1, String[] value1, String[] column2, String[] value2)

            throws IOException {

        Put put = new Put(Bytes.toBytes(rowKey));

        HTable table = (HTable) tablePool.getTable(tableName);

        HColumnDescriptor[] columnFamilies = table.getTableDescriptor()

                .getColumnFamilies();

 

        for (int i = 0; i < columnFamilies.length; i++) {

            String familyName = columnFamilies[i].getNameAsString();

            if (familyName.equals("realname")) {

                for (int j = 0; j < column1.length; j++) {

                    put.add(Bytes.toBytes(familyName),Bytes.toBytes(column1[j]), Bytes.toBytes(value1[j]));

                }

            }

            if (familyName.equals("address")) {

                for (int j = 0; j < column2.length; j++) {

                    put.add(Bytes.toBytes(familyName),Bytes.toBytes(column2[j]), Bytes.toBytes(value2[j]));

                }

            }

        }

        table.put(put);

    }

 

    /*

     * Update a single column of a table

     * @tableName table name

     * @rowKey rowKey

     * @familyName column family name

     * @columnName column name

     * @value the new value

     */

    public static void updateTable(String tableName, String rowKey,

            String familyName, String columnName, String value)

            throws IOException {

        HTable table = (HTable) tablePool.getTable(tableName);

        Put put = new Put(Bytes.toBytes(rowKey));

        put.add(Bytes.toBytes(familyName), Bytes.toBytes(columnName),Bytes.toBytes(value));

        table.put(put);

        System.out.println("update table Success!");

    }

 

    /*

     * Query by rowKey

     * @rowKey rowKey

     * @tableName table name

     */

    public static Result getResult(String tableName, String rowKey)

            throws IOException {

        Get get = new Get(Bytes.toBytes(rowKey));

        HTable table = (HTable) tablePool.getTable(tableName);

        Result result = table.get(get);

        for (KeyValue kv : result.list()) {

            System.out.println("family==>" + Bytes.toString(kv.getFamily()));

            System.out.println("qualifier==>" + Bytes.toString(kv.getQualifier()));

            System.out.println("value==>" + Bytes.toString(kv.getValue()));

            System.out.println("Timestamp==>" + kv.getTimestamp());

        }

        return result;

    }

 

    /*

     * Scan an HBase table and print every row

     * @tableName table name

     */

    public static void getResultScann(String tableName) throws IOException {

        Scan scan = new Scan();

        ResultScanner rs = null;

        HTable table = (HTable) tablePool.getTable(tableName);

        try {

            rs = table.getScanner(scan);

            for (Result r : rs) {

                for (KeyValue kv : r.list()) {

                    System.out.println("family==>" + Bytes.toString(kv.getFamily()));

                    System.out.println("qualifier==>" + Bytes.toString(kv.getQualifier()));

                    System.out.println("value==>" + Bytes.toString(kv.getValue()));

                    System.out.println("timestamp==>" + kv.getTimestamp());

                }

            }

        } finally {

            // rs is still null if getScanner() threw, so guard the close
            if (rs != null) {

                rs.close();

            }

        }

    }

 

    /*

     * Query a single column of a table

     * @tableName table name

     * @rowKey rowKey

     */

    public static void getResultByColumn(String tableName, String rowKey,

            String familyName, String columnName) throws IOException {

        HTable table = (HTable) tablePool.getTable(tableName);

        Get get = new Get(Bytes.toBytes(rowKey));

        get.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columnName)); // restrict the Get to this column family and qualifier

        Result result = table.get(get);

        for (KeyValue kv : result.list()) {

            System.out.println("family==>" + Bytes.toString(kv.getFamily()));

            System.out.println("qualifier==>" + Bytes.toString(kv.getQualifier()));

            System.out.println("value==>" + Bytes.toString(kv.getValue()));

            System.out.println("Timestamp==>" + kv.getTimestamp());

        }

    }

 

    /*

     * Query multiple versions of a column

     * @tableName table name

     * @rowKey rowKey

     * @familyName column family name

     * @columnName column name

     */

    public static void getResultByVersion(String tableName, String rowKey,

            String familyName, String columnName) throws IOException {

        HTable table = (HTable) tablePool.getTable(tableName);

        Get get = new Get(Bytes.toBytes(rowKey));

        get.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columnName));

        get.setMaxVersions(5);

        Result result = table.get(get);

        for (KeyValue kv : result.list()) {

            System.out.println("family==>" + Bytes.toString(kv.getFamily()));

            System.out.println("qualifier==>" + Bytes.toString(kv.getQualifier()));

            System.out.println("value==>" + Bytes.toString(kv.getValue()));

            System.out.println("Timestamp==>" + kv.getTimestamp());

        }

         

    }

 

    /*

     * Delete the specified column

     * @tableName table name

     * @rowKey rowKey

     * @familyName column family name

     * @columnName column name

     */

    public static void deleteColumn(String tableName, String rowKey,

            String familyName, String columnName) throws IOException {

        HTable table = (HTable) tablePool.getTable(tableName);

        Delete deleteColumn = new Delete(Bytes.toBytes(rowKey));

        deleteColumn.deleteColumns(Bytes.toBytes(familyName),Bytes.toBytes(columnName));

        table.delete(deleteColumn);

        System.out.println(familyName + "==>" + columnName + " is deleted!");

    }

 

    /*

     * Delete all columns of a row

     * @tableName table name

     * @rowKey rowKey

     */

    public static void deleteAllColumn(String tableName, String rowKey) throws IOException {

        HTable table = (HTable) tablePool.getTable(tableName);

        Delete deleteAll = new Delete(Bytes.toBytes(rowKey));

        table.delete(deleteAll);

        System.out.println("all columns are deleted!");

    }

 

    /*

     * Delete a table

     * 

     * @tableName table name

     */

    public static void deleteTable(String tableName) throws IOException {

        HBaseAdmin admin = new HBaseAdmin(conf);

        admin.disableTable(tableName);

        admin.deleteTable(tableName);

        System.out.println(tableName + " is deleted!");

    }

}

A test class with a main function for the Java HBase wrapper; the code is as follows:

package com.yoodb;

 

import yoodb.hbase.HBaseTest;

 

public class Test {

    public static void main(String[] args) throws Exception {

        // Create the table

        String tableName = "yoodbblog";

        String[] family = { "realname","address" };

        HBaseTest.creatTable(tableName,family);

        // Add data: column1/value1 go to "realname", column2/value2 go to "address"

        String[] column1 = { "title", "author", "content" }; 

        String[] value1 = {"素文宅","yoodb","www.yoodb.com" }; 

        String[] column2 = { "name", "nickname" };

        String[] value2 = { "真实名称", "昵称" }; 

        HBaseTest.addTableData("rowkey1","yoodbblog",column1, value1, column2, value2);

        // Query the row

        HBaseTest.getResult("yoodbblog", "rowkey1");

        // Query the value of a single column (nickname is stored in "address")

        HBaseTest.getResultByColumn("yoodbblog", "rowkey1", "address", "nickname");

        // Update the value of a single column

        HBaseTest.updateTable("yoodbblog", "rowkey1", "address", "nickname","假昵称");

        // Scan the whole table

        HBaseTest.getResultScann("yoodbblog");

        // Query multiple versions of a column

        HBaseTest.getResultByVersion("yoodbblog", "rowkey1", "address", "name");

        // Delete one column

        HBaseTest.deleteColumn("yoodbblog", "rowkey1", "address", "name");

        // Delete all columns of the row

        HBaseTest.deleteAllColumn("yoodbblog", "rowkey1");

        // Delete the table last, so the calls above still have a table to work on

        HBaseTest.deleteTable("yoodbblog");

    }

}


Summary of two HBase errors

Big Data, 空心菜 posted an article, 0 comments, 5783 views, 2015-10-24 17:01, from related topics


I. An HBase HRegionServer node fails to start

2015-10-23 17:24:33,147 WARN  [regionserver60020] zookeeper.RecoverableZooKeeper: Node /hbase/rs/SlaveServer,60020,1413095376898 already deleted, retry=false

2015-10-23 17:24:33,147 WARN  [regionserver60020] regionserver.HRegionServer: Failed deleting my ephemeral node

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/rs/SlaveServer,60020,1413095376898

	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)

	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)

	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)

	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:156)

	at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1273)

	at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1262)

	at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1298)

	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1012)

	at java.lang.Thread.run(Thread.java:662)

2015-10-23 17:24:33,158 INFO  [regionserver60020] zookeeper.ZooKeeper: Session: 0x249020a2cfd0014 closed

2015-10-23 17:24:33,158 INFO  [regionserver60020-EventThread] zookeeper.ClientCnxn: EventThread shut down

2015-10-23 17:24:33,158 INFO  [regionserver60020] regionserver.HRegionServer: stopping server null; zookeeper connection closed.

2015-10-23 17:24:33,158 INFO  [regionserver60020] regionserver.HRegionServer: regionserver60020 exiting

2015-10-23 17:24:33,158 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting

java.lang.RuntimeException: HRegionServer Aborted

	at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)

	at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)

	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)

	at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2422)

2015-10-23 17:24:33,160 INFO  [Thread-9] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@8d5aad

2015-10-23 17:24:33,160 INFO  [Thread-9] regionserver.ShutdownHook: Starting fs shutdown hook thread.

2015-10-23 17:24:33,160 INFO  [Thread-9] regionserver.ShutdownHook: Shutdown hook finished.

This usually happens because the clocks of the cluster nodes differ too much, that is, time is not synchronized. The fix:

# yum -y install ntpdate  && chkconfig ntpdate off

# crontab -e     #add sync time cron scripts

*/2 * * * * ntpdate asia.pool.ntp.org
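To see whether clock drift is a plausible cause before changing anything, the node clocks can be compared with the master's. The sketch below assumes the limit is HBase's `hbase.master.maxclockskew` setting with its default of 30000 ms (the log above does not show this explicitly); the host names and timestamps are made up for illustration.

```python
MAX_CLOCK_SKEW_MS = 30000  # assumed default of hbase.master.maxclockskew


def skewed_regionservers(master_ms, regionserver_ms, limit_ms=MAX_CLOCK_SKEW_MS):
    """Return the regionservers whose clock differs from the master's by more than limit_ms."""
    return sorted(
        name
        for name, clock_ms in regionserver_ms.items()
        if abs(clock_ms - master_ms) > limit_ms
    )


# Hypothetical readings in milliseconds since the epoch.
master_clock = 1445592273000
regionserver_clocks = {
    "slave1": master_clock + 500,    # half a second ahead: fine
    "slave2": master_clock + 60000,  # one minute ahead: too far
}
print(skewed_regionservers(master_clock, regionserver_clocks))  # ['slave2']
```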

If you ran into this for a different reason, please share it in the replies below!

II. Host name configuration problem

failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

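Following the linked page http://wiki.apache.org/hadoop/ConnectionRefused to troubleshoot, I removed the line 127.0.0.1 hbase1 from /etc/hosts (and the corresponding line on the slave nodes), after which the program ran normally. I then tried starting HBase and there were no problems; creating tables worked too!
I already knew that the 127.0.1.1 entry in the hosts file had to be removed, but I had not expected that the 127.0.0.1 host name entry had to go as well.

Another thing that can stop the cluster from starting is a non-standard host name. In my experience neither _ nor - worked in host names for a Hadoop cluster (strictly speaking, only the underscore is invalid in a DNS host name); for example, hbase_host1 is not supported!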
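Host names can be checked up front. The sketch below applies the usual RFC 1123 rules (letters, digits and hyphens per label, no leading or trailing hyphen, at most 63 characters per label and 253 overall); it is a generic validity check, not something Hadoop itself ships, and the function name is hypothetical.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_hostname(name):
    """Return True if every dot-separated label of name is a valid host name label."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


for candidate in ("hbase1", "hbase-host1", "hbase_host1"):
    print(candidate, is_valid_hostname(candidate))
```

Only `hbase_host1` is rejected by this check, so a hyphenated name that still causes trouble would point to something other than host name syntax.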

Case study: HBase file layout needs to be upgraded

Big Data, 空心菜 posted an article, 0 comments, 7652 views, 2015-10-15 01:00, from related topics


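Today, on an internal test environment, the Kafka river plugin was in an abnormal state. A colleague rebuilt the Kafka river, but its status still would not return to normal. Since my colleague was not yet very familiar with the service, I stepped in to take a look.

The Kafka river plugin acts as a consumer of Kafka message data and stores the consumed data into HDFS by way of HBase.
The river's HBase configuration is shown below: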





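<property>
 <name>hbase.rootdir</name>
 <value>hdfs://10.2.2.39:9000/hbase</value>
</property>

<property>
 <name>hbase.cluster.distributed</name>
 <value>true</value>
</property>

<property>
 <name>hbase.master</name>
 <value>10.2.2.39:60000</value>
</property>

<property>
 <name>hbase.zookeeper.quorum</name>
 <value>10.2.2.56,10.2.2.94,10.2.2.225</value>
</property>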

This shows that the river plugin depends on HBase. I then ran the command to create the river and tailed hbase-root-master-hbase1.log on the HBase master, which showed:

2015-10-14 17:34:13,980 INFO  [master:hbase1:60000] util.FSUtils: Waiting for dfs to exit safe mode...

The log shows that HBase is waiting for HDFS to leave safe mode. Why is HDFS in safe mode at all? To look for clues, I checked the NameNode log, hadoop-root-namenode-had1.log, which recorded:

2015-10-14 17:33:52,283 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: Log not rolled. Name node is in safe mode.

Run the following command to wait for safe mode to exit:

bin/hadoop dfsadmin -safemode wait

After half a minute there was still no response, so I ran the following command to check the health of HDFS:

bin/hadoop fsck /

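It reported many corrupt blocks, but fortunately the replica count was greater than 1. At this point HDFS needs to bring the replica count back up to 2 on its own, which requires writing data, so safe mode has to be left first: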

bin/hadoop  dfsadmin -safemode leave

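After leaving safe mode, wait for the cluster to re-replicate the data up to 2 replicas. Be patient; how long it takes depends on the amount of data. Once it reaches 2, run: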

bin/hadoop  fsck -move

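This moves the corrupt blocks into the /lost+found directory. Alternatively, you can run a health check that deletes the corrupt blocks: bin/hdfs fsck  /  -delete. Note: that approach loses data, because the corrupt blocks are deleted.

I then started HBase and found that the HMaster process disappeared shortly after starting. The log showed: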

2015-10-14 17:48:29,476 FATAL [master:hbase1:60000] master.HMaster: Unhandled exception. Starting shutdown.

org.apache.hadoop.hbase.util.FileSystemVersionException: HBase file layout needs to be upgraded. You have version null and I want version 8. Consult http://hbase.apache.org/book.html for further information about upgrading HBase. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'.

	at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:600)

	at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:462)

	at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:153)

	at org.apache.hadoop.hbase.master.MasterFileSystem.(MasterFileSystem.java:129)

	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:800)

	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:605)

	at java.lang.Thread.run(Thread.java:744)

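From the log above, it looks like the hbase.version file has gone missing. Many people online deal with this by wiping /hbase and restarting, but that loses all the existing data, which is hardly reasonable. So I ran: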

bin/hadoop fs -ls /hbase

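/hbase/hbase.version was indeed gone. Then it dawned on me: the file had probably been corrupted earlier. It could be found under /lost+found, but it seemed to be damaged and -ls could not show it. So I came up with a workaround and did the following: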

bin/hadoop fs -mv /hbase /hbase.bk

Restart HBase, which generates a new /hbase/hbase.version file, then:

bin/hadoop fs -cp /hbase/hbase.version /hbase.bk/



bin/hadoop fs -rmr /hbase 



bin/hadoop fs -mv /hbase.bk /hbase

After restarting HBase once more, it began splitting hlogs and the data was recovered. I then rebuilt the river and its status was back to normal!
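Because the order of these steps matters, it can help to write them down as a dry-run plan before touching a cluster. The sketch below only builds the command strings from the walkthrough and prints them; nothing is executed, and the function name and the default paths are simply the ones used in this case.

```python
def version_file_recovery_plan(root="/hbase", backup="/hbase.bk"):
    """Return the steps of the hbase.version recovery, in order, without running them."""
    return [
        "bin/hadoop fs -mv %s %s" % (root, backup),
        "# restart HBase so that it writes a fresh %s/hbase.version" % root,
        "bin/hadoop fs -cp %s/hbase.version %s/" % (root, backup),
        "bin/hadoop fs -rmr %s" % root,
        "bin/hadoop fs -mv %s %s" % (backup, root),
        "# restart HBase again",
    ]


for step in version_file_recovery_plan():
    print(step)
```

The key point is that the original data directory is only moved aside, never deleted; the directory removed in the fourth step is the freshly generated, empty one.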


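OpenSkill: a professional Q&A platform for learning open-source technology

Focused on open-source products and technologies

2021 OpenSkill Technology Changing World

Fun, useful, with attitude. Do What You Want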
