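Sqoop is a tool for transferring data between Hadoop (HDFS, Hive, HBase) and relational databases. It can import data from a relational database (for example MySQL, Oracle or Postgres) into HDFS, and it can export data from HDFS back into a relational database. Sqoop is now an Apache top-level project; at the time of writing the current versions are 1.4.4 and Sqoop2 1.99.3. This article uses version 1.4.4 to walk through basic installation, configuration and a few simple usage demos.
  • Installation and configuration
  • Preparing test data
  • Importing data into HDFS
  • Importing data into Hive
  • Importing data into HBase
[1] Installation and configuration
This walkthrough uses the Sqoop 1.4.4 build sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz.
1.1. Download the archive, unpack it and create a symlink: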
tar -zxvf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz -C /usr/local/
cd /usr/local
ln -s sqoop-1.4.4.bin__hadoop-2.0.4-alpha sqoop
1.2. Environment variables: open ~/.bash_profile (vi ~/.bash_profile) and add:
#Sqoop  add by micmiu.com
export SQOOP_HOME=/usr/local/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
1.3. Sqoop configuration: copy $SQOOP_HOME/conf/sqoop-env-template.sh to $SQOOP_HOME/conf/sqoop-env.sh, then edit the copy with vi $SQOOP_HOME/conf/sqoop-env.sh:
# Set each environment variable to its actual value
# Set Hadoop-specific environment variables here.
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/local/hadoop
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/local/hadoop
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
#export HIVE_HOME=
# Hadoop  
export HADOOP_PREFIX="/usr/local/hadoop"  
# Native Path  
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native" 
# Hadoop end
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$PATH
export HBASE_HOME=/usr/local/hbase
export PATH=$HBASE_HOME/bin:$PATH
#add by micmiu.com
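Before moving on, it can help to confirm that the directories referenced in sqoop-env.sh actually exist. The following is only a minimal sketch and not part of Sqoop: check_home is a hypothetical helper name, and the fallback paths are the ones assumed in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): reports whether the directory
# behind a configured *_HOME variable exists on this machine.
check_home() {
    name="$1"
    dir="$2"
    if [ -d "$dir" ]; then
        echo "OK       $name=$dir"
        return 0
    fi
    echo "MISSING  $name=$dir"
    return 1
}

# Paths as configured in this article; adjust them to your installation.
# "|| true" keeps the script going so that every entry is reported.
check_home SQOOP_HOME    "${SQOOP_HOME:-/usr/local/sqoop}"     || true
check_home HADOOP_PREFIX "${HADOOP_PREFIX:-/usr/local/hadoop}" || true
check_home HIVE_HOME     "${HIVE_HOME:-/usr/local/hive}"       || true
check_home HBASE_HOME    "${HBASE_HOME:-/usr/local/hbase}"     || true
```

Any line reported as MISSING points at a path that probably needs correcting in ~/.bash_profile or sqoop-env.sh before running the imports.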
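1.4. JDBC driver jar
The tests below use MySQL, so copy the MySQL JDBC driver jar into the $SQOOP_HOME/lib directory.
[2] Preparing test data
Using MySQL as the example:
  • database: test
  • user: root, password: micmiu
Prepare two test tables: demo_blog, which has a primary key, and demo_log, which has none.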
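-- The id column definition was missing; an integer type is assumed here,
-- matching the PRIMARY KEY clause and the inserts below.
CREATE TABLE `demo_blog` (
  `id` int NOT NULL,
  `blog` varchar(100) NOT NULL,
  PRIMARY KEY (`id`)
);
CREATE TABLE `demo_log` (
  `operator` varchar(16) NOT NULL,
  `log` varchar(100) NOT NULL
);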
insert into demo_blog (id, blog) values (1, "micmiu.com");
insert into demo_blog (id, blog) values (2, "ctosun.com");
insert into demo_blog (id, blog) values (3, "baby.micmiu.com");
insert into demo_log (operator, log) values ("micmiu", "create");
insert into demo_log (operator, log) values ("micmiu", "update");
insert into demo_log (operator, log) values ("michael", "edit");
insert into demo_log (operator, log) values ("michael", "delete");
[3] Importing data into HDFS
3.1. Importing a table with a primary key
To import the demo_blog table (which has a primary key) into HDFS, run:
sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_blog
$ sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_blog
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 09:58:43 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 09:58:43 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 09:58:43 INFO tool.CodeGenTool: Beginning code generation
14/04/09 09:58:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 09:58:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 09:58:43 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/e8fd26a5bca5b7f51cdb03bf847ce389/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 09:58:44 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/e8fd26a5bca5b7f51cdb03bf847ce389/demo_blog.jar
14/04/09 09:58:44 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 09:58:44 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 09:58:44 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 09:58:44 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 09:58:44 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 09:58:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 09:58:45 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 09:58:45 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/
14/04/09 09:58:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 09:58:47 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 09:58:47 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 09:58:47 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 09:58:47 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 09:58:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0001
14/04/09 09:58:47 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0001 to ResourceManager at Master.Hadoop/
14/04/09 09:58:47 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0001/
14/04/09 09:58:47 INFO mapreduce.Job: Running job: job_1396936838233_0001
14/04/09 09:59:00 INFO mapreduce.Job: Job job_1396936838233_0001 running in uber mode : false
14/04/09 09:59:00 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 09:59:14 INFO mapreduce.Job:  map 33% reduce 0%
14/04/09 09:59:16 INFO mapreduce.Job:  map 67% reduce 0%
14/04/09 09:59:19 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 09:59:19 INFO mapreduce.Job: Job job_1396936838233_0001 completed successfully
14/04/09 09:59:19 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=271866
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=44
		HDFS: Number of read operations=12
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=6
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=43032
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=590
		CPU time spent (ms)=6330
		Physical memory (bytes) snapshot=440934400
		Virtual memory (bytes) snapshot=3882573824
		Total committed heap usage (bytes)=160563200
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=44
14/04/09 09:59:19 INFO mapreduce.ImportJobBase: Transferred 44 bytes in 34.454 seconds (1.2771 bytes/sec)
14/04/09 09:59:19 INFO mapreduce.ImportJobBase: Retrieved 3 records.
$ hdfs dfs -ls /user/hadoop/demo_blog
Found 4 items
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 09:59 /user/hadoop/demo_blog/_SUCCESS
-rw-r--r--   3 hadoop supergroup         13 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00000
-rw-r--r--   3 hadoop supergroup         13 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00001
-rw-r--r--   3 hadoop supergroup         18 2014-04-09 09:59 /user/hadoop/demo_blog/part-m-00002
[hadoop@Master ~]$ hdfs dfs -cat /user/hadoop/demo_blog/part-m-0000*
Note: by default the data is imported to /user/username/tablename/(files) on HDFS. My current user is hadoop, so the actual path is /user/hadoop/demo_blog/(files). To use a different parent directory, add the --warehouse-dir option, for example:
sqoop import --connect jdbc:mysql://Master.Hadoop/test --username root --password micmiu --table demo_blog --warehouse-dir /user/micmiu/sqoop
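The path rule described above can be sketched as a small shell function. This is only an illustration of the naming convention, not something Sqoop ships: sqoop_output_dir and the HDFS_USER variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: prints the HDFS directory an import is expected to
# write to, following the rule described in this article.
#   $1 = table name
#   $2 = optional value of --warehouse-dir
sqoop_output_dir() {
    table="$1"
    warehouse_dir="${2:-}"
    if [ -n "$warehouse_dir" ]; then
        # --warehouse-dir is a parent directory; the table name is appended.
        printf '%s/%s\n' "${warehouse_dir%/}" "$table"
    else
        # Default: /user/<current user>/<table>
        printf '/user/%s/%s\n' "${HDFS_USER:-$(id -un)}" "$table"
    fi
}

sqoop_output_dir demo_blog                      # default location
sqoop_output_dir demo_blog /user/micmiu/sqoop   # prints /user/micmiu/sqoop/demo_blog
```

The printed directory is the one to pass to hdfs dfs -ls when checking the result of an import.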
3.2. Importing a table without a primary key
To import the demo_log table (no primary key) into HDFS, run:
sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_log --warehouse-dir /user/micmiu/sqoop --split-by operator
Note: importing a table without a primary key requires either --split-by xxx or -m 1. Execution output:
$ sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_log --warehouse-dir /user/micmiu/sqoop --split-by operator
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 15:02:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 15:02:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 15:02:06 INFO tool.CodeGenTool: Beginning code generation
14/04/09 15:02:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_log` AS t LIMIT 1
14/04/09 15:02:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_log` AS t LIMIT 1
14/04/09 15:02:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/dddc1bcdba30515f95a2d604f22e4fe9/demo_log.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 15:02:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/dddc1bcdba30515f95a2d604f22e4fe9/demo_log.jar
14/04/09 15:02:07 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 15:02:07 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 15:02:07 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 15:02:07 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 15:02:07 INFO mapreduce.ImportJobBase: Beginning import of demo_log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 15:02:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 15:02:08 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 15:02:08 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/
14/04/09 15:02:10 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`operator`), MAX(`operator`) FROM `demo_log`
14/04/09 15:02:10 WARN db.TextSplitter: Generating splits for a textual index column.
14/04/09 15:02:10 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
14/04/09 15:02:10 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
14/04/09 15:02:10 INFO mapreduce.JobSubmitter: number of splits:4
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 15:02:10 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 15:02:10 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 15:02:10 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 15:02:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0003
14/04/09 15:02:10 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0003 to ResourceManager at Master.Hadoop/
14/04/09 15:02:10 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0003/
14/04/09 15:02:10 INFO mapreduce.Job: Running job: job_1396936838233_0003
14/04/09 15:02:17 INFO mapreduce.Job: Job job_1396936838233_0003 running in uber mode : false
14/04/09 15:02:17 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 15:02:28 INFO mapreduce.Job:  map 25% reduce 0%
14/04/09 15:02:30 INFO mapreduce.Job:  map 50% reduce 0%
14/04/09 15:02:33 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 15:02:33 INFO mapreduce.Job: Job job_1396936838233_0003 completed successfully
14/04/09 15:02:33 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=362536
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=516
		HDFS: Number of bytes written=56
		HDFS: Number of read operations=16
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=8
	Job Counters 
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=44481
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=4
		Map output records=4
		Input split bytes=516
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=429
		CPU time spent (ms)=6650
		Physical memory (bytes) snapshot=587669504
		Virtual memory (bytes) snapshot=5219356672
		Total committed heap usage (bytes)=205848576
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=56
14/04/09 15:02:33 INFO mapreduce.ImportJobBase: Transferred 56 bytes in 25.2746 seconds (2.2157 bytes/sec)
14/04/09 15:02:33 INFO mapreduce.ImportJobBase: Retrieved 4 records.
$ hdfs dfs -ls /user/micmiu/sqoop/demo_log
Found 5 items
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/_SUCCESS
-rw-r--r--   3 hadoop supergroup         28 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00000
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00001
-rw-r--r--   3 hadoop supergroup          0 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00002
-rw-r--r--   3 hadoop supergroup         28 2014-04-09 15:02 /user/micmiu/sqoop/demo_log/part-m-00003
$ hdfs dfs -cat /user/micmiu/sqoop/demo_log/part-m-0000*
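As an alternative to --split-by for the table without a primary key in section 3.2, the import can run with a single map task (-m 1). The sketch below also swaps --password for -P, which the log warning above recommends. It is a dry run by default: run is a hypothetical wrapper, the JDBC URL is the one from the --warehouse-dir example in section 3.1, and /user/micmiu/sqoop_single is an assumed directory chosen so that it does not collide with the earlier import.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command by default and
# only executes it when RUN=1 is set, so the sketch is safe to try anywhere.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Connection string taken from the --warehouse-dir example in section 3.1;
# replace it with your own host and database.
JDBC_URL="jdbc:mysql://Master.Hadoop/test"

# -m 1 : use a single map task, so no split column is needed
# -P   : prompt for the password instead of passing it on the command line
run sqoop import --connect "$JDBC_URL" --username root -P \
    --table demo_log --warehouse-dir /user/micmiu/sqoop_single -m 1
```

With one map task the import should write a single part-m-00000 file, which avoids the empty part files seen above when a text column is used for splitting.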
[4] Importing data into Hive
To import the demo_blog table into Hive, add the --hive-import option:
sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_blog  --warehouse-dir /user/sqoop --hive-import --create-hive-table
$ sqoop import --connect jdbc:mysql:// --username root --password micmiu --table demo_blog  --warehouse-dir /user/sqoop --hive-import --create-hive-table 
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 10:44:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 10:44:21 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/04/09 10:44:21 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/04/09 10:44:21 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following:
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-home
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-overwrite
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--create-hive-table
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-table
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-partition-key
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--hive-partition-value
14/04/09 10:44:21 WARN tool.BaseSqoopTool: 	--map-column-hive
14/04/09 10:44:21 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that
14/04/09 10:44:21 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either
14/04/09 10:44:21 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them
14/04/09 10:44:21 WARN tool.BaseSqoopTool: from command line to remove this warning.
14/04/09 10:44:21 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key, 
14/04/09 10:44:21 INFO tool.BaseSqoopTool: 	 hive-partition-value and --map-column-hive options are 
14/04/09 10:44:21 INFO tool.BaseSqoopTool: 	 are also valid for HCatalog imports and exports
14/04/09 10:44:21 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 10:44:21 INFO tool.CodeGenTool: Beginning code generation
14/04/09 10:44:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/c071f02ecad006293202fd2c2fad0dce/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 10:44:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/c071f02ecad006293202fd2c2fad0dce/demo_blog.jar
14/04/09 10:44:22 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 10:44:22 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 10:44:22 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 10:44:22 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 10:44:22 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 10:44:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 10:44:23 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 10:44:23 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/
14/04/09 10:44:25 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 10:44:25 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 10:44:25 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/09 10:44:25 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 10:44:25 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 10:44:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0002
14/04/09 10:44:25 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0002 to ResourceManager at Master.Hadoop/
14/04/09 10:44:25 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0002/
14/04/09 10:44:25 INFO mapreduce.Job: Running job: job_1396936838233_0002
14/04/09 10:44:33 INFO mapreduce.Job: Job job_1396936838233_0002 running in uber mode : false
14/04/09 10:44:33 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 10:44:46 INFO mapreduce.Job:  map 67% reduce 0%
14/04/09 10:44:48 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 10:44:49 INFO mapreduce.Job: Job job_1396936838233_0002 completed successfully
14/04/09 10:44:49 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=271860
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=44
		HDFS: Number of read operations=12
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=6
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=34047
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=505
		CPU time spent (ms)=5350
		Physical memory (bytes) snapshot=427388928
		Virtual memory (bytes) snapshot=3881439232
		Total committed heap usage (bytes)=171638784
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=44
14/04/09 10:44:49 INFO mapreduce.ImportJobBase: Transferred 44 bytes in 26.0401 seconds (1.6897 bytes/sec)
14/04/09 10:44:49 INFO mapreduce.ImportJobBase: Retrieved 3 records.
14/04/09 10:44:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 10:44:49 INFO hive.HiveImport: Loading uploaded data into Hive
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/09 10:44:52 INFO hive.HiveImport: 14/04/09 10:44:52 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
14/04/09 10:44:53 INFO hive.HiveImport: 14/04/09 10:44:53 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect.  Use hive.hmshandler.retry.* instead
14/04/09 10:44:53 INFO hive.HiveImport: 
14/04/09 10:44:53 INFO hive.HiveImport: Logging initialized using configuration in file:/usr/local/hive-0.13.0-bin/conf/hive-log4j.properties
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
14/04/09 10:44:53 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 10:44:57 INFO hive.HiveImport: OK
14/04/09 10:44:57 INFO hive.HiveImport: Time taken: 0.773 seconds
14/04/09 10:44:57 INFO hive.HiveImport: Loading data to table default.demo_blog
14/04/09 10:44:57 INFO hive.HiveImport: Table default.demo_blog stats: [numFiles=4, numRows=0, totalSize=44, rawDataSize=0]
14/04/09 10:44:57 INFO hive.HiveImport: OK
14/04/09 10:44:57 INFO hive.HiveImport: Time taken: 0.25 seconds
14/04/09 10:44:57 INFO hive.HiveImport: Hive import complete.
14/04/09 10:44:57 INFO hive.HiveImport: Export directory is empty, removing it
Verify the imported data in the Hive CLI:
hive> show tables;
Time taken: 0.073 seconds, Fetched: 8 row(s)
hive> select * from demo_blog;
1	micmiu.com
2	ctosun.com
3	baby.micmiu.com
Time taken: 0.506 seconds, Fetched: 3 row(s)
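The command above uses --create-hive-table, which makes the job fail if the Hive table already exists. For a repeat import into the same table, one option is to drop that flag and add --hive-overwrite, one of the Hive options listed in the log output above. The following is a dry-run sketch under the same assumptions as the rest of this article: run is a hypothetical wrapper, and the JDBC URL is the one from the --warehouse-dir example in section 3.1.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command by default and
# only executes it when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Replace with your own host and database.
JDBC_URL="jdbc:mysql://Master.Hadoop/test"

# --hive-table     : name of the target Hive table
# --hive-overwrite : replace the existing data in that table
run sqoop import --connect "$JDBC_URL" --username root -P \
    --table demo_blog --warehouse-dir /user/sqoop \
    --hive-import --hive-table demo_blog --hive-overwrite
```

After a real run (RUN=1), the same select * from demo_blog query in the Hive CLI can be used to confirm that the table holds exactly the rows currently in MySQL.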
[5] Importing data into HBase
The following command imports the demo_blog table into HBase, writing to an HBase table named demo_sqoop2hbase:
sqoop  import  --connect jdbc:mysql:// --username root --password micmiu --table demo_blog --hbase-table demo_sqoop2hbase --hbase-create-table --hbase-row-key id --column-family url
$ sqoop  import  --connect jdbc:mysql:// --username root --password micmiu --table demo_blog --hbase-table demo_sqoop2hbase --hbase-create-table --hbase-row-key id --column-family url
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/04/09 16:23:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/09 16:23:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/09 16:23:38 INFO tool.CodeGenTool: Beginning code generation
14/04/09 16:23:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 16:23:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `demo_blog` AS t LIMIT 1
14/04/09 16:23:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/85408c854ee8fba75bbb2458e5e25093/demo_blog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/09 16:23:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/85408c854ee8fba75bbb2458e5e25093/demo_blog.jar
14/04/09 16:23:40 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/09 16:23:40 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/09 16:23:40 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/09 16:23:40 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/09 16:23:40 INFO mapreduce.ImportJobBase: Beginning import of demo_blog
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.0-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/09 16:23:40 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/09 16:23:40 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:host.name=Master.Hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_20
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.home=/java/jdk1.6.0_20/jre
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/hadoop/etc/hadoop: .......
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/local/hadoop/lib/native
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:java.compiler=
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-71.el6.x86_64
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.name=hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessionTimeout=90000 watcher=hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave5.Hadoop/ Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:41 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Socket connection established to Slave5.Hadoop/, initiating session
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave5.Hadoop/, sessionid = 0x453fecb6c50009, negotiated timeout = 90000
14/04/09 16:23:41 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessionTimeout=90000 watcher=catalogtracker-on-hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave7.Hadoop/ Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:41 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Socket connection established to Slave7.Hadoop/, initiating session
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave7.Hadoop/, sessionid = 0x2453fecb6f50008, negotiated timeout = 90000
14/04/09 16:23:41 INFO zookeeper.ZooKeeper: Session: 0x2453fecb6f50008 closed
14/04/09 16:23:41 INFO zookeeper.ClientCnxn: EventThread shut down
14/04/09 16:23:41 INFO mapreduce.HBaseImportJob: Creating missing HBase table demo_sqoop2hbase
14/04/09 16:23:42 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181 sessionTimeout=90000 watcher=catalogtracker-on-hconnection-0x57c8b24d, quorum=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181, baseZNode=/hbase
14/04/09 16:23:42 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x57c8b24d connecting to ZooKeeper ensemble=Slave6.Hadoop:2181,Slave5.Hadoop:2181,Slave7.Hadoop:2181
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave7.Hadoop/ Will not attempt to authenticate using SASL (Unable to locate a login configuration)
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Socket connection established to Slave7.Hadoop/, initiating session
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: Session establishment complete on server Slave7.Hadoop/, sessionid = 0x2453fecb6f50009, negotiated timeout = 90000
14/04/09 16:23:42 INFO zookeeper.ZooKeeper: Session: 0x2453fecb6f50009 closed
14/04/09 16:23:42 INFO zookeeper.ClientCnxn: EventThread shut down
14/04/09 16:23:42 INFO client.RMProxy: Connecting to ResourceManager at Master.Hadoop/
14/04/09 16:23:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `demo_blog`
14/04/09 16:23:47 INFO mapreduce.JobSubmitter: number of splits:3
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files
14/04/09 16:23:47 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/09 16:23:47 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/09 16:23:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1396936838233_0005
14/04/09 16:23:47 INFO impl.YarnClientImpl: Submitted application application_1396936838233_0005 to ResourceManager at Master.Hadoop/
14/04/09 16:23:47 INFO mapreduce.Job: The url to track the job: http://Master.Hadoop:8088/proxy/application_1396936838233_0005/
14/04/09 16:23:47 INFO mapreduce.Job: Running job: job_1396936838233_0005
14/04/09 16:23:55 INFO mapreduce.Job: Job job_1396936838233_0005 running in uber mode : false
14/04/09 16:23:55 INFO mapreduce.Job:  map 0% reduce 0%
14/04/09 16:24:05 INFO mapreduce.Job:  map 33% reduce 0%
14/04/09 16:24:12 INFO mapreduce.Job:  map 100% reduce 0%
14/04/09 16:24:12 INFO mapreduce.Job: Job job_1396936838233_0005 completed successfully
14/04/09 16:24:12 INFO mapreduce.Job: Counters: 27
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=354636
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=295
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=3
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Launched map tasks=3
		Other local map tasks=3
		Total time spent by all maps in occupied slots (ms)=35297
		Total time spent by all reduces in occupied slots (ms)=0
	Map-Reduce Framework
		Map input records=3
		Map output records=3
		Input split bytes=295
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=381
		CPU time spent (ms)=11050
		Physical memory (bytes) snapshot=543367168
		Virtual memory (bytes) snapshot=3918925824
		Total committed heap usage (bytes)=156958720
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
14/04/09 16:24:12 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 29.7126 seconds (0 bytes/sec)
14/04/09 16:24:12 INFO mapreduce.ImportJobBase: Retrieved 3 records.
Verify the imported data in the hbase shell:
hbase(main):009:0> list
5 row(s) in 0.0310 seconds
=> ["demo_sqoop2hbase", "table_02", "table_03", "test_table", "xyz"]
hbase(main):010:0> scan "demo_sqoop2hbase"
ROW                          COLUMN+CELL                                                                    
 1                           column=url:blog, timestamp=1397031850700, value=micmiu.com                     
 2                           column=url:blog, timestamp=1397031844106, value=ctosun.com                     
 3                           column=url:blog, timestamp=1397031849888, value=baby.micmiu.com                
3 row(s) in 0.0730 seconds
hbase(main):011:0> describe "demo_sqoop2hbase"
DESCRIPTION                                                            ENABLED                              
 'demo_sqoop2hbase', {NAME => 'url', DATA_BLOCK_ENCODING => 'NONE', BL true                                 
 OOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRE                                      
 SSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELET                                      
 ED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOC                                      
 KCACHE => 'true'}                                                                                          
1 row(s) in 0.0580 seconds
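The manual check in the hbase shell can also be scripted. The following is a minimal sketch, not part of the original walkthrough: it counts the distinct row keys in the text of a `scan` result and compares that count with the "Retrieved 3 records." line from the Sqoop log. The helper name `count_scan_rows` is hypothetical, and the sample text is copied from the scan output shown above. On a live cluster the same text could presumably be produced by piping the scan command into `hbase shell`; that step is an assumption and is not executed here. The sketch also assumes row keys contain no whitespace.

```shell
#!/bin/sh
# Count the distinct row keys in the text output of an HBase shell `scan`.
# Data lines look like: " 1   column=url:blog, timestamp=..., value=..."
# Header and summary lines are skipped because their second field does not
# start with "column=". (count_scan_rows is a hypothetical helper name.)
count_scan_rows() {
  awk '$2 ~ /^column=/ { rows[$1] = 1 }
       END { n = 0; for (r in rows) n++; print n }'
}

# Sample input copied from the scan output above. On a live cluster the text
# could instead come from something like (assumption, not run here):
#   echo "scan 'demo_sqoop2hbase'" | hbase shell
scan_output='ROW   COLUMN+CELL
 1  column=url:blog, timestamp=1397031850700, value=micmiu.com
 2  column=url:blog, timestamp=1397031844106, value=ctosun.com
 3  column=url:blog, timestamp=1397031849888, value=baby.micmiu.com
3 row(s) in 0.0730 seconds'

expected=3   # from "Retrieved 3 records." in the Sqoop log
actual=$(printf '%s\n' "$scan_output" | count_scan_rows)

if [ "$actual" -eq "$expected" ]; then
  echo "OK: $actual rows"
else
  echo "MISMATCH: expected $expected, got $actual"
fi
```

Counting distinct row keys rather than lines keeps the check valid when a row has more than one column, since the HBase shell prints one line per cell.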
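The import is verified.
This concludes the demonstration of the three basic cases of migrating data from MySQL: into HDFS, into Hive, and into HBase.
Reference: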
  • http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html
----- EOF @Michael Sun -----
