Getting Started with Hive, Part 3: Integrating Hive with HBase

WBOY

Jun 07, 2016, 04:25 PM

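Introduction:
Hive and HBase are integrated by having the two systems talk to each other through their public APIs. That communication relies mainly on the hive_hbase-handler.jar utility (Hive Storage Handlers), roughly as shown in the figure:
[Figure: hive-hbase]

Aside:
I am rather curious about hive_hbase-handler.jar and plan to dig into it when I have time.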

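I. Two things to note:
1. The required software is Hadoop, Hive, HBase, and ZooKeeper. The integration depends on the Hive version: Hive 0.6.0 is the first release that supports HBase, so do not download anything older. From 0.6.0 on, Hive's lib directory contains an extra jar, hive_hbase-handler.jar, which is the handler for Hive's extended storage. For HBase, version 0.20.6 is recommended. I did not start an HDFS cluster this time; all tests ran on a single machine.

2. When you run Hive, you may see the following error, which means the requested heap size is larger than your JVM can allocate: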
Invalid maximum heap size: -Xmx4096m
The specified size exceeds the maximum representable size.
Could not create the Java virtual machine.

Fix:
/work/hive/bin/ext# vim util/execHiveCmd.sh (go to line 33 of the file)
Change
HADOOP_HEAPSIZE=4096
to
HADOOP_HEAPSIZE=256
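
The same edit can be scripted instead of made by hand in vim. The sketch below is only an illustration: `set_heapsize` is a hypothetical helper (not part of Hive), and it is demonstrated on a throwaway file rather than on the real util/execHiveCmd.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): set HADOOP_HEAPSIZE in a script file.
# $1 = file to edit, $2 = new heap size in MB.
set_heapsize() {
  # -i.bak edits in place and keeps a backup copy next to the file.
  sed -i.bak "s/HADOOP_HEAPSIZE=[0-9][0-9]*/HADOOP_HEAPSIZE=$2/" "$1"
}

# Demonstrate on a temporary file instead of the real execHiveCmd.sh.
demo=$(mktemp)
echo 'HADOOP_HEAPSIZE=4096' > "$demo"
set_heapsize "$demo" 256
cat "$demo"   # prints: HADOOP_HEAPSIZE=256
rm -f "$demo" "$demo.bak"
```

To apply it for real, you would pass the path of execHiveCmd.sh as the first argument; the .bak copy makes the change easy to undo.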

Also, add export HIVE_HOME=/work/hive to /etc/profile.

II. Start the runtime environment
1. Start Hive
hive --auxpath /work/hive/lib/hive_hbase-handler.jar,/work/hive/lib/hbase-0.20.3.jar,/work/hive/lib/zookeeper-3.2.2.jar -hiveconf hbase.master=127.0.0.1:60000
This loads the jars Hive needs and points Hive at the HBase master's address. My HBase master runs on the same machine as Hive, so I point it at localhost.
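
Because the jar list is long and comma-separated, it is easy to mistype. A minimal sketch of assembling the command from variables follows, assuming the paths used in this article; `join_jars` is a hypothetical helper, the option is written with two ASCII hyphens (--auxpath), and the script only prints the command rather than launching Hive.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas,
# the separator used between jars in the --auxpath value.
join_jars() {
  out=""
  for jar in "$@"; do
    if [ -z "$out" ]; then
      out="$jar"
    else
      out="$out,$jar"
    fi
  done
  printf '%s\n' "$out"
}

HIVE_LIB=/work/hive/lib            # path taken from this article
HBASE_MASTER=127.0.0.1:60000       # local HBase master, as in this article

AUX_JARS=$(join_jars \
  "$HIVE_LIB/hive_hbase-handler.jar" \
  "$HIVE_LIB/hbase-0.20.3.jar" \
  "$HIVE_LIB/zookeeper-3.2.2.jar")

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "hive --auxpath $AUX_JARS -hiveconf hbase.master=$HBASE_MASTER"
```

Changing HIVE_LIB or HBASE_MASTER in one place then updates the whole command.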

2. Start HBase
/work/hbase/bin/hbase master start

3. Start ZooKeeper
/work/zookeeper/bin/zkServer.sh start

III. Run it
Create a table in Hive that is linked to a table in HBase:
CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz");

Next, create an ordinary Hive table and load data into it.
Create the table:
CREATE TABLE pokes (foo INT, bar STRING);
Load the data:
LOAD DATA LOCAL INPATH '/work/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;

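Insert a row into the Hive table that is linked to HBase:
INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes WHERE foo=98;
After it runs successfully, the result looks like this:
[Figure: hive]

The insert is executed as a MapReduce job, which writes the data into HBase, as shown:
[Figure: Map-Reduce Job for INSERT]

In the HBase shell, run scan 'xyz' and describe "xyz" to view the table's contents and structure. The output is shown here:
[Figure: hive]

xyz is the table that Hive created in HBase. The Hive CREATE TABLE statement above specified the column mapping property "hbase.columns.mapping" = ":key,cf1:val" and the name of the table to create in HBase, "hbase.table.name" = "xyz".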
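
The mapping string pairs Hive columns with HBase columns by position: the first entry, :key, binds the first Hive column (key) to the HBase row key, and cf1:val binds the second Hive column (value) to the qualifier val in column family cf1. The sketch below only illustrates that pairing; `show_mapping` is a hypothetical helper written for this explanation, not part of Hive or HBase.

```shell
#!/bin/sh
# Hypothetical helper: walk a comma-separated list of Hive columns and an
# hbase.columns.mapping value in parallel and describe each pairing.
# $1 = Hive column names, $2 = mapping string.
show_mapping() {
  cols="$1"
  maps="$2"
  while [ -n "$cols" ] && [ -n "$maps" ]; do
    col=${cols%%,*}   # first remaining Hive column
    map=${maps%%,*}   # first remaining mapping entry
    if [ "$map" = ":key" ]; then
      echo "$col -> row key"
    else
      # An entry has the form family:qualifier.
      echo "$col -> family ${map%%:*}, qualifier ${map#*:}"
    fi
    # Drop the entry just handled; finish when no comma is left.
    case "$cols" in *,*) cols=${cols#*,} ;; *) cols="" ;; esac
    case "$maps" in *,*) maps=${maps#*,} ;; *) maps="" ;; esac
  done
}

show_mapping "key,value" ":key,cf1:val"
# prints:
# key -> row key
# value -> family cf1, qualifier val
```

This is also why the put command later in the article writes to 'cf1:val': that is the HBase column the Hive column value reads from.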

In HBase, run a put command to insert a record:
put 'xyz','10001','cf1:val','www.javabloger.com'

Run a query in Hive to check whether the record just inserted in HBase shows up:
select * from hbase_table_1 WHERE key=10001;
As shown:
[Figure: hive]

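The end result
   The integration steps are now complete. A record added in Hive appears in HBase, and a record added in HBase appears in Hive, which shows that Hive and HBase are integrated successfully. For very large data sets, could we write through HBase and query through Hive? HBase does not support complex queries, but it can fetch one or more rows by key, scan a range of keys, and apply filters. Complex queries can be left to Hive: one system serves as the storage entry point (HBase) and the other as the query entry point (Hive), as in the figure below.
    [Figure: hive mapreduce]

   Forgive me if this seems naive; it is only my one-sided view.

That is all for now. I will keep updating this later. Thank you for reading.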

–end–

Original article: Getting Started with Hive, Part 3: Integrating Hive with HBase. Thanks to the original author for sharing.
