Hive 1.2.1 & Spark & Sqoop Installation Guide

WBOY | Original: 2016-07-12 08:58:26

1. Preface

This installation follows the "Hive0.12.0 Installation Guide" and the official GettingStarted documentation, and installs Hive 1.2.1 on Hadoop 2.7.1. This article configures Hive in server mode, uses MySQL as the metadata database, and connects to MySQL remotely.

For the installation of Hadoop 2.7.1, please refer to the article "Hadoop-2.7.1 Distributed Installation Manual".

2. Conventions

This article assumes that Hadoop is installed in /data/hadoop/current and Hive 1.2.1 in /data/hadoop/hive (which actually points to /data/hadoop/hive-1.2.1-bin). MySQL 5.7.10 is installed in /data/mysql. In an actual deployment you can choose other directories.

3. Service port

Port   Configuration item         Description
10000  hive.server2.thrift.port   Opened when hiveserver2 is executed
9083   hive.metastore.uris        Opened when "hive --service metastore" is executed

4. Install MySQL

A single MySQL instance is a single point of failure, so in practice it should be configured in active-standby mode.

4.1. Install MySQL

In this article MySQL is installed on the machine 172.25.39.166. Hive uses MySQL to store metadata, so MySQL needs to be installed first. MySQL 5.7.10, the latest version at the time of writing, is installed here. The download URL is: http://dev.mysql.com/downloads/mysql/. This article chooses "Linux - Generic (glibc 2.5) (x86, 64-bit), Compressed TAR Archive" under "Linux - Generic"; its binary installation package is named mysql-5.7.10-linux-glibc2.5-x86_64.tar.gz.

After decompressing the binary installation package, you can see a file named INSTALL-BINARY, which explains how to install MySQL. This article basically follows it.

The officially provided binary package was compiled with "--prefix" set to "/usr/local/mysql", so it is best to install MySQL in the /usr/local directory; otherwise the installation can easily run into problems. The data directory, however, should be created on a sufficiently large partition.

The data directory can also be a soft link to a directory on a large enough partition, and the soft link method is recommended. Otherwise you often need to specify the parameter "--datadir" when using MySQL commands; mysqld, mysqld_safe and mysql_ssl_rsa_setup, for example, all need "--datadir".

If MySQL is not installed in /usr/local/mysql, you need to specify values for --basedir, --character-sets-dir, --language, --lc-messages-dir, --plugin-dir and many other parameters of mysqld.

If you cannot install it as the root user, you also need to specify values for --slow-query-log-file, --socket, --pid-file, --plugin-dir, --general-log-file and other parameters of mysqld.

The default values of these parameters can be viewed by executing MySQL's "bin/mysqld --verbose --help".

# The MySQL installation directory is /usr/local/mysql; the data directory is actually /data/mysql/data
# Note that MySQL should be installed as the root user; installing as another user easily leads to trouble
# Also note that installation of versions before 5.7.6 is slightly different!

# Create the mysql user group
groupadd mysql

# Create the mysql user and make it unusable as a Linux login user
useradd -r -g mysql -s /bin/false mysql

# Enter the parent of the MySQL installation directory
cd /usr/local

# Unpack the binary installation package
tar xzf mysql-5.7.10-linux-glibc2.5-x86_64.tar.gz

# Create a short, easy-to-remember, version-independent link
ln -s mysql-5.7.10-linux-glibc2.5-x86_64 mysql

# Enter the mysql directory
cd mysql

# Create the data directory
mkdir -p /data/mysql/data

# Create the data directory soft link so that /usr/local/mysql/data points to /data/mysql/data
ln -s /data/mysql/data /usr/local/mysql/data

# Set directory permissions
chmod 770 /data/mysql/data
chown -R mysql /data/mysql/data
chgrp -R mysql /data/mysql/data
chown -R mysql .
chgrp -R mysql .

# Initialize (after mysqld completes successfully it prints a temporary root password; be sure to note it down)
# The temporary password expires, so change the root password as soon as possible
# After entering the MySQL CLI, execute the following command to set a new password:
# SET PASSWORD FOR 'root'@'localhost' = PASSWORD('new_password');
bin/mysqld --initialize --user=mysql --explicit_defaults_for_timestamp

# Install and configure SSL
bin/mysql_ssl_rsa_setup

# Reset directory permissions
chown -R root .
chown -R mysql /data/mysql/data

# Start MySQL
bin/mysqld_safe --user=mysql &

# Check whether the port is up (3306 by default, unless the configuration is changed or --port is specified)
netstat -lpnt | grep 3306

# Stop MySQL
support-files/mysql.server stop

# Make MySQL start automatically with the system
cp support-files/mysql.server /etc/init.d/mysql.server

The above uses the MySQL default configuration. Customization is done by modifying the file my.cnf. MySQL 5.7.10 does not ship a my.cnf, only support-files/my-default.cnf. Executing "bin/mysqld --verbose --help" shows that MySQL searches for my.cnf in the following order: /etc/my.cnf, /etc/mysql/my.cnf, /usr/local/mysql/etc/my.cnf, ~/.my.cnf. So you can make a copy of my-default.cnf and then modify it, such as: cp support-files/my-default.cnf /etc/my.cnf.

4.2. Create Hive metadata database

Create database hive:

create database if not exists hive;

Create database user hive:

create user hive identified by 'hive2016';

Authorize the IPs and users that may access the database hive (the actual IP of localhost is 172.25.39.166):

grant all on hive.* to 'hive'@'localhost' identified by 'hive2016';
grant all on hive.* to 'hive'@'172.25.39.166' identified by 'hive2016';
grant all on hive.* to 'hive'@'172.25.40.171' identified by 'hive2016';

Enter the hive database:

1) From the local machine: mysql -uhive -phive2016

2) From another machine: mysql -uhive -h172.25.39.166 -phive2016

Note that if MySQL master-master or another kind of synchronization is configured and the synchronized databases do not include the mysql database, the database and the user must be created once on each MySQL instance.
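When more client machines need access, the grant statements can be generated rather than typed by hand. The following is only a sketch: the function name hive_grants is made up for this illustration, and it simply prints statements in the same form as the ones above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MySQL or Hive): print one GRANT
# statement per client host, in the same form as the statements above.
# Usage: hive_grants PASSWORD HOST [HOST...]
hive_grants() {
    password=$1
    shift
    for host in "$@"; do
        printf "grant all on hive.* to 'hive'@'%s' identified by '%s';\n" "$host" "$password"
    done
}
```

The output can be reviewed first and then piped into the client, for example: hive_grants hive2016 localhost 172.25.39.166 172.25.40.171 | mysql -uroot -p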

5. Installation steps

5.1. Download the Hive1.2.1 binary installation package

Download URL: http://hive.apache.org/downloads.html. The downloaded package is named apache-hive-1.2.1-bin.tar.gz; upload it to the /data directory.

5.2. Install Hive

1) Switch to the /data directory: cd /data

2) Unpack the binary installation package: tar xzf apache-hive-1.2.1-bin.tar.gz

3) Rename: mv apache-hive-1.2.1-bin hive-1.2.1

4) Create a soft link: ln -s hive-1.2.1 hive

5.3. Install MySQL-Connector

MySQL-Connector download URL: http://dev.mysql.com/downloads/connector/.

Select "Connector/J", then "Platform Independent". This article downloads "mysql-connector-java-5.1.38.tar.gz".

The archive "mysql-connector-java-5.1.38.tar.gz" contains mysql-connector-java-5.1.38-bin.jar. After unpacking, upload mysql-connector-java-5.1.38-bin.jar to Hive's lib directory. This is the JDBC driver for MySQL.


5.4. Modify the configuration

5.4.1. Modify /etc/profile or ~/.profile

Set the environment variable HIVE_HOME and add Hive to PATH:

export HIVE_HOME=/data/hadoop/hive
export PATH=$HIVE_HOME/bin:$PATH

5.4.2. Modify other configuration files

Enter the /data/hadoop/hive/conf directory, where you can see the following:

hadoop@VM-40-171-sles10-64:~/hive/conf> ls
hive-default.xml.template  hive-exec-log4j.properties.template
hive-env.sh.template       hive-log4j.properties.template

There are 4 template files. Copy and rename them into configuration files:

cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
cp hive-log4j.properties.template hive-log4j.properties
cp hive-exec-log4j.properties.template hive-exec-log4j.properties
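The copy step can be wrapped so that it is safe to repeat. This is a minimal sketch: init_hive_conf is a hypothetical name, and skipping files that already exist is a choice made for this example so that an edited hive-site.xml is never overwritten.

```shell
#!/bin/sh
# Hypothetical helper: copy the four Hive templates to their real names,
# leaving any configuration file that already exists untouched.
# Usage: init_hive_conf CONF_DIR
init_hive_conf() {
    conf_dir=$1
    for pair in \
        hive-env.sh.template:hive-env.sh \
        hive-default.xml.template:hive-site.xml \
        hive-log4j.properties.template:hive-log4j.properties \
        hive-exec-log4j.properties.template:hive-exec-log4j.properties
    do
        src=$conf_dir/${pair%%:*}   # part before the colon: template name
        dst=$conf_dir/${pair#*:}    # part after the colon: target name
        if [ -e "$dst" ]; then
            echo "skip: $dst already exists"
        else
            cp "$src" "$dst" || return 1
        fi
    done
}
```

With the layout used in this article it would be called as: init_hive_conf /data/hadoop/hive/conf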

5.4.2.1. Modify hive-env.sh

If the HADOOP_HOME environment variable has not been set before, it can be set in hive-env.sh:

HADOOP_HOME=/data/hadoop/current

5.4.2.2. Modify hive-site.xml

1) Modify javax.jdo.option.ConnectionURL

Set the value to:

jdbc:mysql://172.25.39.166:3306/hive?useSSL=false

Note "useSSL=false"; other parameters such as characterEncoding=UTF-8 can be added as well.

2) Modify javax.jdo.option.ConnectionDriverName

Set the value to: com.mysql.jdbc.Driver.

3) Modify javax.jdo.option.ConnectionUserName

Set the value to the user name for accessing the hive database: hive.

4) Modify javax.jdo.option.ConnectionPassword

Set the value to the password for accessing the hive database: hive2016.

5) Modify hive.metastore.schema.verification

Modify this value as needed.

6) Modify hive.zookeeper.quorum

Set the value to: 10.12.154.77,10.12.154.78,10.12.154.79. ZooKeeper is installed on these three machines. Host names are recommended over IPs, because decommissioning a machine may change the IPs.

7) Modify hive.metastore.uris

Set the value to: thrift://172.25.40.171:9083. 9083 is the RPC service port of the Hive metastore.

8) Modify hive.metastore.warehouse.dir

Set the value to: /data/hadoop/hive/warehouse. Note that the directory must be created before starting (mkdir /data/hadoop/hive/warehouse).

9) Modify hive.server2.thrift.bind.host

This value defaults to localhost. To access Hive remotely from other machines, change it to an IP address. This article changes it to 172.25.40.171; 0.0.0.0 can also be considered.

10) Modify hive.exec.scratchdir

This step is optional; you can use the default value /tmp/hive directly. Otherwise set it to /data/hadoop/hive/tmp or another directory, and create the directory.

11) Modify hive.exec.local.scratchdir

Set the value to /data/hadoop/hive/tmp/scratch or another directory, and create the directory.

12) Modify hive.downloaded.resources.dir

Set the value to /data/hadoop/hive/tmp/resources or another directory, and create the directory.

13) Modify hive.querylog.location

Set the value to /data/hadoop/hive/tmp/querylog or another directory, and create the directory.

14) Modify hive.server2.logging.operation.log.location

Set the value to /data/hadoop/hive/tmp/operation or another directory, and create the directory.
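Several of the settings above require a directory to exist before Hive starts. As a rough sketch, they can be created in one pass; make_hive_dirs is a hypothetical name, and the subdirectory list assumes exactly the paths chosen in this article.

```shell
#!/bin/sh
# Hypothetical helper: create the directories named in the hive-site.xml
# settings above (warehouse, scratch, resources, querylog, operation logs)
# under a base directory.
# Usage: make_hive_dirs BASE_DIR
make_hive_dirs() {
    base=$1
    for sub in warehouse tmp tmp/scratch tmp/resources tmp/querylog tmp/operation; do
        mkdir -p "$base/$sub" || return 1
    done
}
```

For the layout used here the call would be: make_hive_dirs /data/hadoop/hive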

5.4.2.3. Modify hive-log4j.properties

Change the log directory from the default /tmp/${user.name} to /data/hadoop/hive/logs:

hive.log.dir=/data/hadoop/hive/logs

Then create the directory /data/hadoop/hive/logs.

5.4.2.4. Modify hive-exec-log4j.properties

Change the log directory from the default /tmp/${user.name} to /data/hadoop/hive/logs/exec:

hive.log.dir=/data/hadoop/hive/logs/exec

Then create the directory /data/hadoop/hive/logs/exec.

6. Start and run

1) Initialize metastore

After installation and configuration, and before starting the Hive server, execute "schematool -dbType mysql -initSchema" on the server to initialize the metastore.

If MySQL master-master synchronization is configured, this only needs to be executed on one Hive machine; executing it again results in an error.

2) Start metastore

Execute: hive --service metastore &

3) Start Hive service

Execute: hiveserver2 &

4) Enter the Hive command line interface (similar to mysql)

Execute: hive
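The start-up order above (initialize once, then metastore, then hiveserver2) can be sketched as a small script. Everything here other than the schematool, hive and hiveserver2 commands is an assumption made for illustration: the function names, the DRY_RUN switch, the log file names and the fixed wait between the two services.

```shell
#!/bin/sh
# Set DRY_RUN=1 to print the commands instead of executing them.
LOG_DIR="${LOG_DIR:-/data/hadoop/hive/logs}"

# Run a command in the foreground, or print it in dry-run mode.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run a command in the background with output sent to $LOG_DIR/<name>.out.
run_bg() {
    name=$1
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $* > $LOG_DIR/$name.out 2>&1 &"
    else
        nohup "$@" > "$LOG_DIR/$name.out" 2>&1 &
    fi
}

# One-time step: initialize the metastore schema in MySQL.
init_metastore() {
    run schematool -dbType mysql -initSchema
}

# Start the metastore first, then HiveServer2, which connects to it.
start_hive_services() {
    run_bg metastore hive --service metastore
    run sleep "${METASTORE_WAIT:-10}"   # crude wait; 10 seconds is an assumption
    run_bg hiveserver2 hiveserver2
}
```

Running with DRY_RUN=1 first shows exactly what would be executed. init_metastore is kept separate from start_hive_services because the initialization must not be repeated.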

In addition to the hive command line interface, hiveserver2 provides beeline (hive is the user name and hive2016 the password; see HiveServer2 Clients for more information):

hadoop@VM-40-171-sles10-64:~/hive/bin> ./beeline
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://172.25.40.171:10000 hive hive2016 org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://172.25.40.171:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/hadoop/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/hadoop/hive-1.2.1-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Connected to: Hive (version 1.2.1)
Driver: Hive (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://172.25.40.171:10000> select * from invites limit 2;
+------+----------+-------+
| foo  | bar      | ds    |
+------+----------+-------+
| 474  | val_475  | 2014  |
| 281  | val_282  | 2014  |
+------+----------+-------+
2 rows selected (1.779 seconds)
0: jdbc:hive2://172.25.40.171:10000>

7. Remotely execute HSQL

Package hive/bin, hive/lib, hive/conf and hive/examples, such as: tar czf hive-bin.tar.gz hive/bin hive/lib hive/conf hive/examples. Then upload hive-bin.tar.gz to the other machines and use beeline to execute HSQL remotely. (Using hive instead may cause problems: while preparing this article, executing HSQL with hive always hung; the log recorded no specific reason, and the cause has not been located yet.)

8. Basic commands

The following content is from the official website (GettingStarted). Note that the commands are not case-sensitive:

CREATE TABLE pokes (foo INT, bar STRING);
CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
SHOW TABLES;
SHOW TABLES '.*s';
DESCRIBE invites;
DROP TABLE pokes;

There is an examples subdirectory under the Hive installation directory, which stores the data files used in the examples. To test loading data, load the file ../examples/files/kv2.txt into the table invites:

LOAD DATA LOCAL INPATH '../examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2014');

You can check the result with "select * from invites;" or "select count(1) from invites;".
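For checks like the count above, beeline can also run a single statement non-interactively with its -u (URL), -n (user), -p (password) and -e (statement) options. The wrapper below is a sketch only: the function name hive_sql and the BEELINE and HIVE_URL variables are introduced here for illustration, with the address and credentials taken from this article.

```shell
#!/bin/sh
# BEELINE can be overridden, for example with the full path to hive/bin/beeline.
BEELINE="${BEELINE:-beeline}"
# HiveServer2 address used in this article; change it for other deployments.
HIVE_URL="${HIVE_URL:-jdbc:hive2://172.25.40.171:10000}"

# Hypothetical helper: run one statement through HiveServer2 and exit.
# Usage: hive_sql 'STATEMENT'
hive_sql() {
    "$BEELINE" -u "$HIVE_URL" -n hive -p hive2016 -e "$1"
}
```

A check after loading could then be written as: hive_sql 'select count(1) from invites;'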

9. Single point solution

The single point of failure can be removed by deploying two Hive instances. The metadata database uses MySQL, with MySQL and Hive deployed on the same machine and the two MySQL instances configured for master-master synchronization.

Hive runs as one master and one hot standby. It is best to ensure that only one Hive instance provides service at any time, although in many cases both can serve at once and work normally.

10. Integrate with Spark

Integrating Hive with Spark is very simple and takes only the following steps:

1) Add HIVE_HOME to spark-env.sh, such as: export HIVE_HOME=/data/hadoop/hive

2) Copy Hive's hive-site.xml and hive-log4j.properties files to Spark's conf directory.

After completion, execute spark-sql to enter Spark's SQL CLI, and run the command "show tables" to see the tables created in Hive.

Example:

./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar

11. Integrate with Sqoop

Taking sqoop-1.4.6.bin__hadoop-2.0.4-alpha as an example: Sqoop supports incremental import, and can import data not only into Hive but also into HBase, or from a DB into HDFS. In short, Sqoop is very powerful, but it is only briefly introduced here.

Download sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz from Sqoop's official website (download URL: http://www.apache.org/dyn/closer.lua/sqoop/1.4.6).

Unpack it, then enter Sqoop's conf directory and make the following modifications:

11.1. Modify sqoop-env.sh

Copy sqoop-env-template.sh to sqoop-env.sh, and set the following environment variables in sqoop-env.sh:

1) HADOOP_COMMON_HOME

The value is the Hadoop installation directory, example: export HADOOP_COMMON_HOME=/data/hadoop

2) HADOOP_MAPRED_HOME

The value is the directory containing the hadoop-common-*.jar file, which is located under the Hadoop installation directory.

Example: export HADOOP_MAPRED_HOME=/data/hadoop/share/hadoop/common

3) HBASE_HOME

The value is the installation directory of HBase, example: export HBASE_HOME=/data/hbase

4) HIVE_HOME

The value is the installation directory of Hive, example: export HIVE_HOME=/data/hive

5) ZOOCFGDIR

The value is the configuration directory of ZooKeeper, example: export ZOOCFGDIR=/data/zookeeper/conf

11.2. Modify sqoop-site.xml

Copy sqoop-site-template.xml to sqoop-site.xml; no modification is required.

11.3. Verification test

1) List MySQL databases

./sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username zhangsan --password zhangsan2016

2) Create a Hive table based on the MySQL table

./sqoop create-hive-table --connect jdbc:mysql://127.0.0.1:3306/test --username zhangsan --password zhangsan2016 --table t_test --hive-table t_test_2016

If the Hive table needs to be partitioned, specify it with the parameters --hive-partition-key and --hive-partition-value. "--hive-partition-key" is the partition name, of type string by default, and "--hive-partition-value" is the partition value.

To overwrite an existing Hive table, add the parameter "--hive-overwrite".

3) Import data from MySQL to Hive

./sqoop import --connect jdbc:mysql://127.0.0.1:3306/test --username zhangsan --password 'zhangsan2016' --table t_test --hive-import -m 6 --hive-table t_test_2016 --direct

It is recommended to add the parameter "--direct", which enables the fast path; for MySQL, for example, it uses the tool mysqldump to export data.

"-m" sets how many map tasks import data in parallel. The default is 4. It is best not to set it higher than the maximum number of map tasks in the cluster.

"--table" specifies the name of the DB table to be imported, and "--hive-import" means importing data from the DB into Hive. You can also use the parameter "--query" to export from the DB conditionally using SQL.

To specify the character set, use the parameter "--default-character-set", such as: --default-character-set=utf8. In direct mode, Sqoop passes arguments placed after a standalone "--" through to the underlying tool such as mysqldump.
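The partition and overwrite parameters mentioned above can be combined in a single import. The following is a sketch under stated assumptions: import_partition and the SQOOP variable are names invented for this example, the connection details are the sample values from this article, and "-P" (prompt for the password) is used in place of "--password" so that the password does not appear on the command line.

```shell
#!/bin/sh
# SQOOP can be overridden (for example SQOOP=echo to only print the arguments).
SQOOP="${SQOOP:-./sqoop}"

# Hypothetical helper: import one MySQL table into one Hive partition,
# overwriting what is already in the Hive table.
# Usage: import_partition TABLE HIVE_TABLE PARTITION_KEY PARTITION_VALUE
import_partition() {
    "$SQOOP" import \
        --connect jdbc:mysql://127.0.0.1:3306/test \
        --username zhangsan -P \
        --table "$1" \
        --hive-import --hive-overwrite \
        --hive-table "$2" \
        --hive-partition-key "$3" \
        --hive-partition-value "$4" \
        -m 6
}
```

For example, import_partition t_test t_test_2016 ds 2016 would load the table t_test into the partition ds='2016'; the partition name ds and the value 2016 are illustrative only.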

12. Common Errors

1) TIMESTAMP with implicit DEFAULT value is deprecated

This error is reported when executing MySQL's "bin/mysqld --initialize --user=mysql".

The reason is that, starting with MySQL 5.6, the implicit default behavior of TIMESTAMP columns is deprecated. Under the old behavior, a TIMESTAMP column that is not explicitly declared NULL is NOT NULL, and assigning NULL to such a column stores the current timestamp. Adding the parameter --explicit_defaults_for_timestamp turns this behavior off and removes the warning.

2) Can't find error-message file '/usr/local/mysql/share/errmsg.sys'

This error is reported when executing MySQL's "bin/mysqld --initialize --user=mysql --explicit_defaults_for_timestamp".

This may be because the data directory is not empty, since the command has been executed before. "bin/mysqld --verbose --help | grep datadir" shows that the default data directory is /var/lib/mysql/. Make sure the /var/lib/mysql/ directory is empty, or change the data directory by specifying --datadir, such as "bin/mysqld --initialize --user=mysql --explicit_defaults_for_timestamp --datadir=/data/mysql/data".

3) Can't find error-message file '/usr/local/mysql/share/errmsg.sys'

For the error:

Can't find error-message file '/usr/local/mysql/share/errmsg.sys'. Check error-message file location and 'lc-messages-dir' configuration directive.

The MySQL binary downloaded from the official website assumes the installation directory /usr/local/mysql. If it is actually installed elsewhere, specify the directory with --basedir, otherwise you will encounter many installation problems. "bin/mysqld --verbose --help | grep basedir" shows that the default value of "--basedir" is /usr/local/mysql/.

4) Failed to connect to the MetaStore Server

If hiveserver2 reports the following errors, turn on the DEBUG log level to see more detailed information: in the log configuration file hive-log4j.properties, change "hive.root.logger=WARN,DRFA" to "hive.root.logger=DEBUG,WARN,DRFA".

2014-04-23 06:00:04,169 WARN hive.metastore (HiveMetaStoreClient.java:open(291)) - Failed to connect to the MetaStore Server...
2014-04-23 06:00:05,173 WARN hive.metastore (HiveMetaStoreClient.java:open(291)) - Failed to connect to the MetaStore Server...
2014-04-23 06:00:06,177 WARN hive.metastore (HiveMetaStoreClient.java:open(291)) - Failed to connect to the MetaStore Server...
2014-04-23 06:00:07,181 WARN hive.metastore (HiveMetaStoreClient.java:open(291)) - Failed to connect to the MetaStore Server...
2014-04-23 06:00:08,185 WARN hive.metastore (HiveMetaStoreClient.java:open(291)) - Failed to connect to the MetaStore Server...
2014-04-23 06:00:09,194 ERROR service.CompositeService (CompositeService.java:start(74)) - Error starting services HiveServer2
org.apache.hive.service.ServiceException: Unable to connect to MetaStore!
        at org.apache.hive.service.cli.CLIService.start(CLIService.java:85)
        at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
        at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
        at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:103)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

After the change, run hiveserver2 again. The more detailed log suggests that the metastore is not running; start it by executing "hive --service metastore".

2014-04-23 06:04:27,053 INFO hive.metastore (HiveMetaStoreClient.java:open(244)) - Trying to connect to metastore with URI thrift://172.25.40.171:9083
2014-04-23 06:04:27,085 WARN hive.metastore (HiveMetaStoreClient.java:open(288)) - Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
        at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:283)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:104)
        at org.apache.hive.service.cli.CLIService.start(CLIService.java:82)
        at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
        at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
        at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:103)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

5) Version information not found in metastore

Executing "./hive --service metastore" reports the following error because the metastore has not been initialized; "schematool -dbType mysql -initSchema" needs to be executed once.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
MetaException(message:Version information not found in metastore.)
        at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:5638)
        at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:5622)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
        at com.sun.proxy.$Proxy2.verifySchema(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:403)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:441)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:326)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:286)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4060)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:4263)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:4197)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

6) java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir}/${system:user.name}

Solution: replace every occurrence of ${system:java.io.tmpdir} in hive-site.xml with an absolute path; hive-1.2.1 has 4 occurrences in total.
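The replacement can be done with sed rather than by hand. This is a minimal sketch: fix_tmpdir is a hypothetical name, and it assumes the target directory contains neither "|" nor "&", which would need escaping in the sed replacement.

```shell
#!/bin/sh
# Hypothetical helper: replace every ${system:java.io.tmpdir} in a
# hive-site.xml with an absolute directory. It writes through a temporary
# file because "sed -i" is not portable across sed implementations.
# Usage: fix_tmpdir FILE ABSOLUTE_DIR
fix_tmpdir() {
    file=$1
    dir=$2
    sed 's|${system:java\.io\.tmpdir}|'"$dir"'|g' "$file" > "$file.new" &&
        mv "$file.new" "$file"
}
```

With the directories used in this article the call might be: fix_tmpdir /data/hadoop/hive/conf/hive-site.xml /data/hadoop/hive/tmp. Keeping a backup copy of hive-site.xml beforehand is advisable.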

7) Establishing SSL connection without server's identity verification is not recommended

Problem:

Wed Feb 17 10:39:37 CST 2016 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

The solution is to add "useSSL=false" to the value of the configuration item javax.jdo.option.ConnectionURL in hive-site.xml, such as:

jdbc:mysql://127.0.0.1:3306/hive?characterEncoding=UTF-8&amp;useSSL=false

Connector/J separates URL parameters with "&", which must be written as "&amp;" inside the XML file.
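When a JDBC URL carries more than one parameter, the "&" between them has to be escaped in hive-site.xml. The sketch below prints a property element with the value escaped; xml_property is a hypothetical helper written for this illustration, not a Hive tool.

```shell
#!/bin/sh
# Hypothetical helper: print a hive-site.xml <property> element,
# escaping &, < and > in the value (& first, so that the entities
# produced for < and > are not escaped again).
# Usage: xml_property NAME VALUE
xml_property() {
    value=$(printf '%s' "$2" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$value"
}
```

For example: xml_property javax.jdo.option.ConnectionURL 'jdbc:mysql://127.0.0.1:3306/hive?characterEncoding=UTF-8&useSSL=false'. The printed element can be compared against the corresponding entry in hive-site.xml.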

8) SPARK_CLASSPATH was detected

SPARK_CLASSPATH was detected (set to '/data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar:').
This is deprecated in Spark 1.0+.

Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath

This means that setting the environment variable SPARK_CLASSPATH in spark-env.sh is no longer recommended. Use the recommended method instead:

./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar

13. Related documents

"HBase-0.98.0 Distributed Installation Guide"

"Hive1.2.1 Installation Guide"

"ZooKeeper-3.4.6 Distributed Installation Guide"

"Hadoop2.3.0 Source Code Reverse Engineering"

"Compiling Hadoop-2.7.1 on Linux"

"Accumulo-1.5.1 Installation Guide"

"Drill1.0.0 Installation Guide"

"Shark0.9.1 Installation Guide"

For more, see the technology blog: http://aquester.cublog.cn.
