search
HomeDatabaseMysql Tutorial为已存在的Hadoop集群配置HDFS Federation

一、实验目的 1. 现有Hadoop集群只有一个NameNode,现在要增加一个NameNode。 2. 两个NameNode构成HDFS Federation。 3. 不重启现有集群,不影响数据访问。 二、实验环境 4台CentOS release 6.4虚拟机,IP地址为 192.168.56.101 master 192.168.56.102 slave

一、实验目的
1. 现有Hadoop集群只有一个NameNode,现在要增加一个NameNode。
2. 两个NameNode构成HDFS Federation。
3. 不重启现有集群,不影响数据访问。

二、实验环境
4台CentOS release 6.4虚拟机,IP地址为
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
192.168.56.104 kettle

其中kettle是新增的一台“干净”的机器,已经配置好免密码ssh,将作为新增的NameNode。

软件版本:
hadoop 2.7.2
hbase 1.1.4
hive 2.0.0
spark 1.5.0
zookeeper 3.4.8
kylin 1.5.1

现有配置:
master作为hadoop的NameNode、SecondaryNameNode、ResourceManager,hbase的HMaster
slave1、slave2作为hadoop的DataNode、NodeManager,hbase的HRegionServer
同时master、slave1、slave2作为三台zookeeper服务器

三、配置步骤
1. 编辑master上的hdfs-site.xml文件,修改后的文件内容如下所示。
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
	<name>dfs.namenode.name.dir</name>
	<value>file:/home/grid/hadoop-2.7.2/hdfs/name</value>
</property>
<property>
	<name>dfs.datanode.data.dir</name>
	<value>file:/home/grid/hadoop-2.7.2/hdfs/data</value>
</property>
<property>
	<name>dfs.replication</name>
	<value>1</value>
</property>
<property>
	<name>dfs.webhdfs.enabled</name>
	<value>true</value>
</property>

<!-- 新增属性 -->
<property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>master:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>master:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>master:9001</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>kettle:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>kettle:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>kettle:9001</value>
</property>
</configuration>
2. 拷贝master上的hdfs-site.xml文件到集群上的其它节点
scp hdfs-site.xml slave1:/home/grid/hadoop-2.7.2/etc/hadoop/
scp hdfs-site.xml slave2:/home/grid/hadoop-2.7.2/etc/hadoop/
3. 将Java目录、Hadoop目录、环境变量文件从master拷贝到kettle
scp -rp /home/grid/hadoop-2.7.2 kettle:/home/grid/
scp -rp /home/grid/jdk1.7.0_75 kettle:/home/grid/
# 用root执行
scp -p /etc/profile.d/* kettle:/etc/profile.d/
4. 启动新的NameNode、SecondaryNameNode
# 在kettle上执行
source /etc/profile
ln -s hadoop-2.7.2 hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode

执行后启动了NameNode、SecondaryNameNode进程,如图1所示。


图1

5. 刷新DataNode收集新添加的NameNode
# 在集群中任意一台机器上执行均可
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave1:50020
$HADOOP_HOME/bin/hdfs dfsadmin -refreshNamenodes slave2:50020
至此,HDFS Federation配置完成,从web查看两个NameNode的状态分别如图2、图3所示。


图2


图3


四、测试
# 向HDFS上传一个文本文件
hadoop dfs -put /home/grid/hadoop/NOTICE.txt /
# 分别在两台NameNode节点上运行Hadoop自带的例子
# 在master上执行
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /NOTICE.txt /output
# 在kettle上执行
hadoop jar /home/grid/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /NOTICE.txt /output1
用下面的命令查看两个输出结果,分别如图4、图5所示。
hadoop dfs -cat /output/part-r-00000
hadoop dfs -cat /output1/part-r-00000
图4


图5


参考:
http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html
Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
MySQL: BLOB and other no-sql storage, what are the differences?MySQL: BLOB and other no-sql storage, what are the differences?May 13, 2025 am 12:14 AM

MySQL'sBLOBissuitableforstoringbinarydatawithinarelationaldatabase,whileNoSQLoptionslikeMongoDB,Redis,andCassandraofferflexible,scalablesolutionsforunstructureddata.BLOBissimplerbutcanslowdownperformancewithlargedata;NoSQLprovidesbetterscalabilityand

MySQL Add User: Syntax, Options, and Security Best PracticesMySQL Add User: Syntax, Options, and Security Best PracticesMay 13, 2025 am 12:12 AM

ToaddauserinMySQL,use:CREATEUSER'username'@'host'IDENTIFIEDBY'password';Here'showtodoitsecurely:1)Choosethehostcarefullytocontrolaccess.2)SetresourcelimitswithoptionslikeMAX_QUERIES_PER_HOUR.3)Usestrong,uniquepasswords.4)EnforceSSL/TLSconnectionswith

MySQL: How to avoid String Data Types common mistakes?MySQL: How to avoid String Data Types common mistakes?May 13, 2025 am 12:09 AM

ToavoidcommonmistakeswithstringdatatypesinMySQL,understandstringtypenuances,choosetherighttype,andmanageencodingandcollationsettingseffectively.1)UseCHARforfixed-lengthstrings,VARCHARforvariable-length,andTEXT/BLOBforlargerdata.2)Setcorrectcharacters

MySQL: String Data Types and ENUMs?MySQL: String Data Types and ENUMs?May 13, 2025 am 12:05 AM

MySQloffersechar, Varchar, text, Anddenumforstringdata.usecharforfixed-Lengthstrings, VarcharerForvariable-Length, text forlarger text, AndenumforenforcingdataAntegritywithaetofvalues.

MySQL BLOB: how to optimize BLOBs requestsMySQL BLOB: how to optimize BLOBs requestsMay 13, 2025 am 12:03 AM

Optimizing MySQLBLOB requests can be done through the following strategies: 1. Reduce the frequency of BLOB query, use independent requests or delay loading; 2. Select the appropriate BLOB type (such as TINYBLOB); 3. Separate the BLOB data into separate tables; 4. Compress the BLOB data at the application layer; 5. Index the BLOB metadata. These methods can effectively improve performance by combining monitoring, caching and data sharding in actual applications.

Adding Users to MySQL: The Complete TutorialAdding Users to MySQL: The Complete TutorialMay 12, 2025 am 12:14 AM

Mastering the method of adding MySQL users is crucial for database administrators and developers because it ensures the security and access control of the database. 1) Create a new user using the CREATEUSER command, 2) Assign permissions through the GRANT command, 3) Use FLUSHPRIVILEGES to ensure permissions take effect, 4) Regularly audit and clean user accounts to maintain performance and security.

Mastering MySQL String Data Types: VARCHAR vs. TEXT vs. CHARMastering MySQL String Data Types: VARCHAR vs. TEXT vs. CHARMay 12, 2025 am 12:12 AM

ChooseCHARforfixed-lengthdata,VARCHARforvariable-lengthdata,andTEXTforlargetextfields.1)CHARisefficientforconsistent-lengthdatalikecodes.2)VARCHARsuitsvariable-lengthdatalikenames,balancingflexibilityandperformance.3)TEXTisidealforlargetextslikeartic

MySQL: String Data Types and Indexing: Best PracticesMySQL: String Data Types and Indexing: Best PracticesMay 12, 2025 am 12:11 AM

Best practices for handling string data types and indexes in MySQL include: 1) Selecting the appropriate string type, such as CHAR for fixed length, VARCHAR for variable length, and TEXT for large text; 2) Be cautious in indexing, avoid over-indexing, and create indexes for common queries; 3) Use prefix indexes and full-text indexes to optimize long string searches; 4) Regularly monitor and optimize indexes to keep indexes small and efficient. Through these methods, we can balance read and write performance and improve database efficiency.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

MinGW - Minimalist GNU for Windows

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

Dreamweaver Mac version

Dreamweaver Mac version

Visual web development tools

MantisBT

MantisBT

Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

WebStorm Mac version

WebStorm Mac version

Useful JavaScript development tools

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment