Sqoop: A First Look at a Tool for Exchanging Data Between HDFS and Relational Databases


WBOY

Jun 07, 2016 pm 04:29 PM


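Sqoop is a tool for transferring data between Hadoop and relational databases (RDBMS).
Configuration is fairly simple.
Download the latest Sqoop package from the Apache website.
Download URL: http://www.apache.org/dist/sqoop/1.99.1/
Unpack it on the server. The server must already have a JDK, Hadoop, and Hive installed.
Configuration:
conf/sqoop-env.sh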
#Set path to where bin/hadoop is available
export HADOOP_HOME=/home/hadoop/hadoop-0.20.205.0
#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/hive-0.8.1
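At this point you can start experimenting. Here Sqoop is used mainly together with Hive: data from a relational database is imported into Hive and stored in HDFS, where it is available for large-scale computation.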
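
Before the first import it can help to confirm that the two paths set in conf/sqoop-env.sh really point at Hadoop and Hive installations. The following is a minimal sketch, not part of Sqoop itself; the function name check_tool_home is hypothetical, and it only assumes that each home directory contains bin/hadoop or bin/hive, as the comments in sqoop-env.sh state.

```shell
#!/bin/sh
# check_tool_home NAME DIR TOOL
# Succeeds when DIR/bin/TOOL exists and is executable. Otherwise it prints a
# diagnostic to stderr and returns 1. The function name is illustrative only.
check_tool_home() {
    name=$1
    dir=$2
    tool=$3
    if [ -z "$dir" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/$tool" ]; then
        echo "$name=$dir does not contain an executable bin/$tool" >&2
        return 1
    fi
    echo "$name OK: $dir/bin/$tool"
}

# Typical use after sourcing conf/sqoop-env.sh:
#   check_tool_home HADOOP_HOME "$HADOOP_HOME" hadoop
#   check_tool_home HIVE_HOME   "$HIVE_HOME"   hive
```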

Sqoop provides the following commands (features).

 codegen             Generate code to interact with database records
 create-hive-table   Import a table definition into Hive
 eval                Evaluate a SQL statement and display the results
 export              Export an HDFS directory to a database table
 help                List available commands
 import              Import a table from a database to HDFS
 import-all-tables   Import tables from a database to HDFS
 job                 Work with saved jobs
 list-databases      List available databases on a server
 list-tables         List available tables in a database
 merge               Merge results of incremental imports
 metastore           Run a standalone Sqoop metastore
 version             Display version information

This article mainly uses the import command. The export command has a similar syntax.

Example

./sqoop import --connect jdbc:mysql://localhost:3306/dbname --username dbuser --password dbpassword --table tablename --hive-import --hive-table hivedb.hivetable --hive-drop-import-delims --hive-overwrite --num-mappers 6

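The command above imports the data of table tablename in the local database dbname into table hivetable of the Hive database hivedb.
The common connection parameters are not explained here. The remaining ones are as follows.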

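--hive-import: marks Hive as the destination of this import.
--hive-table: names the target table in Hive.
--hive-drop-import-delims: this one is important. When data moves from the database into HDFS, special characters inside the values can break MapReduce and Hive parsing. For example, a text column in the database may contain characters such as \t or \n. With this option, Sqoop drops \n, \r, and \01 from string fields during the import.
--hive-overwrite: if the Hive table already exists, its data is overwritten.
--num-mappers: sets the number of mapper tasks used for this import.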
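
Hive's default text format separates fields with \01 and rows with \n, so a newline embedded in a database value would be read as the start of a new row. The sketch below only imitates the effect of --hive-drop-import-delims on a single value with tr; it is an illustration, not how Sqoop implements the option, and strip_hive_delims is a hypothetical helper name.

```shell
#!/bin/sh
# Illustration only: mimic the effect of --hive-drop-import-delims on one
# string value by deleting \n, \r and \001. Sqoop applies this during the
# import itself; strip_hive_delims is a hypothetical helper name.
strip_hive_delims() {
    printf '%s' "$1" | tr -d '\n\r\001'
}

# A value as it might come from a MySQL text column, with an embedded newline.
raw=$(printf 'first line\nsecond line')

# After stripping, the value fits on a single line and stays one Hive row.
cleaned=$(strip_hive_delims "$raw")
printf '%s\n' "$cleaned"
```

Note that the characters are deleted, not replaced, so the two lines are joined without a space. When that matters, --hive-delims-replacement (listed in the argument table) substitutes a string of your choice instead.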

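Another important option is --direct. It performs the import through the database's own dump facility, which is faster than the example above, but it cannot be combined with --hive-drop-import-delims. Choose between the two according to the data in your database.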
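
The trade-off can be written down as a small decision helper. This is a sketch under a stated assumption: you already know whether any column of the table may contain \n, \r, or \01, and pass that in as "yes" or "no". The function name hive_import_mode_flags and its argument are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/sh
# hive_import_mode_flags HAS_FREE_TEXT
# Prints the extra flag for a Hive import. HAS_FREE_TEXT ("yes" or "no") is a
# hypothetical input saying whether any column may contain \n, \r or \001.
hive_import_mode_flags() {
    case $1 in
        yes) echo "--hive-drop-import-delims" ;;  # safe parsing, regular import
        no)  echo "--direct" ;;                   # faster, dump-based import
        *)   echo "usage: hive_import_mode_flags yes|no" >&2; return 2 ;;
    esac
}

# Dry run: print the command instead of executing it.
echo sqoop import --table tablename --hive-import "$(hive_import_mode_flags yes)"
```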

The arguments of the sqoop import command are listed below.

Argument	Description
`--connect <jdbc-uri>`	Specify JDBC connect string
`--connection-manager <class-name>`	Specify connection manager class to use
`--driver <class-name>`	Manually specify JDBC driver class to use
`--hadoop-home <dir>`	Override `$HADOOP_HOME`
`--help`	Print usage instructions
`-P`	Read password from console
`--password <password>`	Set authentication password
`--username <username>`	Set authentication username
`--verbose`	Print more information while working
`--connection-param-file <filename>`	Optional properties file that provides connection parameters

Argument	Description
`--hive-home <dir>`	Override `$HIVE_HOME`
`--hive-import`	Import tables into Hive (uses Hive's default delimiters if none are set)
`--hive-overwrite`	Overwrite existing data in the Hive table
`--create-hive-table`	If set, the job fails if the target Hive table exists. By default this property is false.
`--hive-table <table-name>`	Sets the table name to use when importing to Hive
`--hive-drop-import-delims`	Drops \n, \r, and \01 from string fields when importing to Hive
`--hive-delims-replacement`	Replaces \n, \r, and \01 in string fields with a user-defined string when importing to Hive
`--hive-partition-key`	Name of the Hive field that partitions are sharded on
`--hive-partition-value <v>`	String value that serves as the partition key for the data imported into Hive in this job
`--map-column-hive <map>`	Override default mapping from SQL type to Hive type for configured columns
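
As a hedged illustration of how the partition arguments fit together, the sketch below prints (but does not run) an import that loads one Hive partition. The connection details are the placeholders from the earlier example, the partition column dt and the date value are hypothetical, and -P is used so that the password is read from the console instead of appearing on the command line.

```shell
#!/bin/sh
# print_partitioned_import DT
# Prints (does not run) a sqoop import that loads a single Hive partition.
# The partition column "dt" and all connection details are placeholders.
print_partitioned_import() {
    dt=$1
    echo sqoop import \
        --connect jdbc:mysql://localhost:3306/dbname \
        --username dbuser -P \
        --table tablename \
        --hive-import --hive-table hivedb.hivetable \
        --hive-partition-key dt \
        --hive-partition-value "$dt"
}

print_partitioned_import 2016-06-07
```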

Some reference examples follow.

Filtering columns and rows
sqoop import --table test --columns "id,name" --where "id>400"
Using the dump facility (direct mode)
sqoop import --connect jdbc:mysql://server.foo.com/db --table bar --direct -- --default-character-set=latin1
Overriding column types
sqoop import … --map-column-java id=String,value=Integer
Setting delimiters
sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES --fields-terminated-by '\t' --lines-terminated-by '\n' --optionally-enclosed-by '\"'
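
The article notes that export follows a similar syntax, so a rough sketch of the reverse direction may be useful. It prints (but does not run) a sqoop export. The connection details are the placeholders used earlier, the helper name print_export is hypothetical, and --input-fields-terminated-by '\001' assumes the files were written with Hive's default field delimiter.

```shell
#!/bin/sh
# print_export TABLE EXPORT_DIR
# Prints (does not run) a sqoop export that writes the files under
# EXPORT_DIR back into the database table TABLE.
print_export() {
    table=$1
    export_dir=$2
    # Build the argument list, then print it as one line.
    set -- sqoop export \
        --connect jdbc:mysql://localhost:3306/dbname \
        --username dbuser -P \
        --table "$table" \
        --export-dir "$export_dir" \
        --input-fields-terminated-by "'\\001'"
    printf '%s\n' "$*"
}

# The directory below is a placeholder for wherever the Hive table's files live.
print_export tablename /user/hive/warehouse/hivedb.db/hivetable
```

The target table must already exist in the database before an export, since Sqoop does not create it.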

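Original article: "Sqoop: A First Look at a Tool for Exchanging Data Between HDFS and Relational Databases". Thanks to the original author for sharing.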
