Adding a New Node to a Hadoop Cluster in Practice

WBOY
2016-06-07 16:02:49

The cluster already has namenode and datanode1; the goal is to add a new node, datanode2.
Step 1: Change the hostname of the machine that is being added
hadoop@datanode1:~$ vim /etc/hostname
datanode2
Step 2: Edit the hosts file
hadoop@datanode1:~$ vim /etc/hosts
192.168.8.4 datanode2
127.0.0.1 localhost
127.0.1.1 ubuntu
192.168.8.2 namenode
192.168.8.3 datanode1
192.168.8.4 datanode2   # newly added entry

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
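The same hosts entry is normally needed on namenode and datanode1 as well, so that every machine can resolve datanode2. As a minimal sketch, assuming a POSIX shell and awk, the edit could be scripted so that running it twice does not duplicate the line. The function name `add_host_entry` is hypothetical and not part of the original article.

```shell
# add_host_entry FILE IP NAME  (hypothetical helper)
# Appends "IP NAME" to FILE unless a non-comment entry already maps NAME.
add_host_entry() {
  file=$1 ip=$2 name=$3
  if awk -v n="$name" '
       { sub(/#.*/, "") }                        # ignore comments
       { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit found ? 0 : 1 }' "$file"; then
    return 0                                     # already present, nothing to do
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (as root, on each node; values taken from the article):
# add_host_entry /etc/hosts 192.168.8.4 datanode2
```

The check compares whole host names, so an existing entry such as datanode20 would not be mistaken for datanode2.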
Step 3: Change the IP address (192.168.8.4 for datanode2, matching the hosts file above)
Step 4: Reboot the node
Step 5: Configure passwordless SSH
1. Generate a key pair
hadoop@datanode2:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
34:45:84:85:6e:f3:9e:7a:c0:f1:a4:ef:bf:30:a6:74 hadoop@datanode2
The key's randomart image is:
+--[ RSA 2048]----+
| *= |
| o. |
| .o |
| .=.. |
| oSB |
| + o |
| .+E. |
| . +=o |
| o+..o. |
+-----------------+
2. Copy the public key to namenode
hadoop@datanode2:~$ cd ~/.ssh
hadoop@datanode2:~/.ssh$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
hadoop@datanode2:~/.ssh$ scp ./id_rsa.pub hadoop@namenode:/home/hadoop
hadoop@namenode's password:
id_rsa.pub 100% 398 0.4KB/s 00:00
3. Append the public key to authorized_keys
hadoop@namenode:~/.ssh$ cat ../id_rsa.pub >> authorized_keys
hadoop@namenode:~/.ssh$ cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDuOOD8R7OfNSUhGPZhQWCfC0yTeM6+txWSo3LiJjEWZbH512ymKIEiNRjCzTiRjLEqWGadAPVbip3jLuOHFpk89v7D6q8QH4ilBjLtsaVxmhb77w3yGrXlHJ8+g3QtS8VmjGEyZ86oeM5F9UM8F8QmK9mxXOWhqt3xvufetr7o7acV3APEHH1hvvkFImim2sT/iNi/Nxsch176byUS6y86gOTgznVH8OIx8MDmdKSLjqWPSCTrpvXPESlZvpLm4YSN2cYoKaxcedaynzOhXgAC0GLdq1k07eFmerUwpBT+xTzTRJPquYawK+MPf6+lnLm89u+bewdBZLdunCKhbCK3 hadoop@ubuntu3
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCssQnDzo5uhPn93bVqj+nEpzgQBipc1WgasOeFQV7ljyNlFHhOPVS6G3oHpvSrbjg3aK1MqxmCw0VokuuO5eoHwqh0alQw46eEmunzrnwuhhFpAU9V4t7LJ5pYuxZOioXbsJKxCetOY6G2lKRmyk2Z/MIMpPW+UFebt150+oYXcKKYSBBJoLmThH3bWW2CesAokIe8gCQ3rIYsHfA8rNuwxEnrL8fC2XlWODTahjHD5bymBO4rd3uiJxuTv7/r243t0hrimjhJ7uUIyPcIRYDchPmmO9DFVEBtYloLmqQQs/ZOxDiX7GF+YK7KC7Ayo1kL8VuwP90dqIhpaJmP96zV hadoop@ubuntu2
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDbeTMrOtMZ8gurJyzoSVFpJbtXzUYDElXJcfm0O+FRpigxoIePPHiQc5vi7kabnLSiEv+94YDMclxZpXFjR0TXz6IJOVdPxFPqovY+GzrYVXEXj3HhbBWKC4sFUvGFGSZr8rM3R5OE2wYIZzOKdX9c6Ak5uIE7BUSuXzaiFctYXIvu37TObYZ44vDQGv9/mPsqP4Qnyx4czTLD1VmOeUHA5iQTKLt4K0HNE3i+a3mEEBMxBwETUI/6dcmvTxjEe7cy48YPadr5UT0/xgTub/OdmkBfvfT6fPDVlHtRP5jQiFapFyzL/BXiObqkSlrJbLKWTczS8J6SfsKWsSZfOPzL hadoop@datanode2
4. Copy the merged authorized_keys file to the other nodes
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode1:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s 00:00
hadoop@namenode:~$ scp ./.ssh/authorized_keys hadoop@datanode2:/home/hadoop/.ssh/authorized_keys
authorized_keys 100% 1190 1.2KB/s 00:00
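Appending with `cat >>` adds the key again every time it is run, and sshd may refuse a `~/.ssh` directory or `authorized_keys` file with loose permissions. The following is a minimal sketch, under the assumption of a POSIX shell, of an append that is safe to repeat. The function name `install_pubkey` is hypothetical.

```shell
# install_pubkey PUBKEY_FILE SSH_DIR  (hypothetical helper)
# Adds the key in PUBKEY_FILE to SSH_DIR/authorized_keys exactly once
# and tightens the permissions sshd usually expects.
install_pubkey() {
  pub=$1 dir=$2
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  touch "$dir/authorized_keys"
  # -x: match the whole line, -F: treat the key as a fixed string
  grep -qxF -e "$(cat "$pub")" "$dir/authorized_keys" ||
    cat "$pub" >> "$dir/authorized_keys"
  chmod 600 "$dir/authorized_keys"
}

# Example on namenode, after the scp in sub-step 2:
# install_pubkey ~/id_rsa.pub ~/.ssh
```

OpenSSH also ships `ssh-copy-id`, which performs a similar job from the client side, for example `ssh-copy-id hadoop@namenode`.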
5. An error you may encounter
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0644 for '/home/jiangqixiang/.ssh/id_dsa' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /home/youraccount/.ssh/id_dsa
Solution: make the private key readable and writable by its owner only (a private key does not need the execute bit, so 600 is preferable to 700)
chmod 600 id_rsa
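The warning appears because ssh ignores a private key that other users can read. As a small sketch, assuming a POSIX shell, the permissions of the key and of its directory could be fixed in one go. The function name `secure_private_key` is hypothetical.

```shell
# secure_private_key KEYFILE  (hypothetical helper)
# Restricts the private key and its directory to the owner.
secure_private_key() {
  key=$1
  [ -f "$key" ] || { echo "no such key: $key" >&2; return 1; }
  chmod 700 "$(dirname "$key")"              # directory: owner only
  chmod 600 "$key"                           # private key: owner read/write
  [ -f "$key.pub" ] && chmod 644 "$key.pub"  # public key may stay readable
  return 0
}

# Example on datanode2:
# secure_private_key ~/.ssh/id_rsa
```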

Step 6: Edit the configuration on namenode (add datanode2 to the slaves file)

hadoop@namenode:~$ cd hadoop-1.2.1/conf

hadoop@namenode:~/hadoop-1.2.1/conf$ vim slaves

datanode1

datanode2
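The slaves file is only read by the cluster-wide start and stop scripts. If the cluster is already running, the worker daemons can usually be started on the new node alone with `hadoop-daemon.sh`, which ships in the `bin` directory of Hadoop 1.x. The sketch below assumes the installation path used elsewhere in this article; the variable `HADOOP_DAEMON` and the function name are hypothetical.

```shell
# Assumed location of hadoop-daemon.sh; adjust to your installation.
HADOOP_DAEMON="${HADOOP_DAEMON:-$HOME/hadoop-1.2.1/bin/hadoop-daemon.sh}"

# start_worker_daemons  (hypothetical helper, run on the new node)
# Starts the HDFS and MapReduce worker daemons of Hadoop 1.x.
start_worker_daemons() {
  "$HADOOP_DAEMON" start datanode &&
  "$HADOOP_DAEMON" start tasktracker
}

# Example on datanode2:
# start_worker_daemons
```

With this approach the existing nodes keep running, whereas the test in step 8 of this article restarts everything with start-all.sh.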

Step 7: Rebalance the data

hadoop@namenode:~/hadoop-1.2.1/conf$ start-balancer.sh

Warning: $HADOOP_HOME is deprecated.

starting balancer, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-balancer-namenode.out

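The following notes are excerpted from other blogs:

1) If you do not run the balancer, the cluster tends to place new data on the new node, which lowers MapReduce efficiency.

2) threshold is the balancing threshold, 10% by default. The lower the value, the more evenly the nodes are balanced, but the longer balancing takes. The value is given in percent, so the command below asks for 0.1%, not 10%.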

/app/hadoop/bin/start-balancer.sh -threshold 0.1

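3) In hdfs-site.xml on namenode you can set the bandwidth available to the balancer (the default is already 1 MB/s):

<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
  <description>
    Specifies the maximum amount of bandwidth that each datanode
    can utilize for the balancing purpose in term of
    the number of bytes per second.
  </description>
</property>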
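To make the two settings concrete: the balancer treats a datanode as balanced when its disk utilization differs from the cluster-wide utilization by no more than the threshold, and the bandwidth cap bounds how fast one datanode can move blocks. The sketch below works through that arithmetic with awk. Both function names are hypothetical, and the time estimate is only a lower bound for a single datanode.

```shell
# within_threshold NODE_PCT CLUSTER_PCT THRESHOLD_PCT  (hypothetical helper)
# Succeeds when |NODE_PCT - CLUSTER_PCT| <= THRESHOLD_PCT.
within_threshold() {
  awk -v n="$1" -v c="$2" -v t="$3" 'BEGIN {
    d = n - c; if (d < 0) d = -d
    exit (d <= t) ? 0 : 1
  }'
}

# transfer_seconds BYTES BYTES_PER_SEC  (hypothetical helper)
# Seconds one datanode needs to move BYTES at the configured cap.
transfer_seconds() {
  awk -v b="$1" -v bw="$2" 'BEGIN { printf "%d\n", b / bw }'
}

# A node at 55% in a cluster at 50% is balanced for the default threshold of 10
within_threshold 55 50 10 && echo "balanced at threshold 10"
# The same node is not balanced for a threshold of 0.1
within_threshold 55 50 0.1 || echo "not balanced at threshold 0.1"
# Moving 10 GiB at 1048576 bytes/s takes at least this many seconds
transfer_seconds 10737418240 1048576
```

At the default cap, moving 10 GiB off one node takes close to three hours, which is why the bandwidth is often raised for the duration of a rebalance.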

Step 8: Test that the new node works

1. Start Hadoop

hadoop@namenode:~/hadoop-1.2.1$ start-all.sh

Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-namenode.out

datanode2: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode2.out

datanode1: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-datanode1.out

namenode: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-namenode.out

starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-namenode.out

datanode2: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode2.out

datanode1: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-datanode1.out

hadoop@namenode:~/hadoop-1.2.1$

2. An error

Running the wordcount example fails:

hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out

Warning: $HADOOP_HOME is deprecated.

14/09/12 08:40:39 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode

at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)

at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode

at org.apache.hadoop.mapred.JobTracker.checkSafeMode(JobTracker.java:5188)

at org.apache.hadoop.mapred.JobTracker.getStagingAreaDir(JobTracker.java:3677)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

at org.apache.hadoop.ipc.Client.call(Client.java:1113)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)

at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)

at org.apache.hadoop.mapred.$Proxy2.getStagingAreaDir(Unknown Source)

at org.apache.hadoop.mapred.JobClient.getStagingAreaDir(JobClient.java:1309)

at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:102)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)

at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)

at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)

at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)

at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)

at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Solution: take HDFS out of safe mode

hadoop@namenode:~/hadoop-1.2.1$ hadoop dfsadmin -safemode leave

Warning: $HADOOP_HOME is deprecated.

Safe mode is OFF
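Right after startup the namenode is in safe mode while the datanodes report their blocks, and it normally leaves safe mode on its own. Instead of forcing it off, a script can wait or check the status first. `hadoop dfsadmin -safemode` accepts `get` and `wait` in addition to `leave`. The sketch below parses the status line shown above; the function name `safemode_is_off` is hypothetical.

```shell
# safemode_is_off STATUS_TEXT  (hypothetical helper)
# Succeeds when the dfsadmin status text reports that safe mode is off.
safemode_is_off() {
  printf '%s\n' "$1" | grep -q 'Safe mode is OFF'
}

# Example on namenode:
# hadoop dfsadmin -safemode wait     # block until HDFS leaves safe mode
# safemode_is_off "$(hadoop dfsadmin -safemode get 2>&1)" && echo "ready for jobs"
```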

3. Test again

hadoop@namenode:~/hadoop-1.2.1$ hadoop jar hadoop-examples-1.2.1.jar wordcount in out

Warning: $HADOOP_HOME is deprecated.

14/09/12 08:48:26 INFO input.FileInputFormat: Total input paths to process : 2

14/09/12 08:48:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/09/12 08:48:26 WARN snappy.LoadSnappy: Snappy native library not loaded

14/09/12 08:48:28 INFO mapred.JobClient: Running job: job_201409120827_0003

14/09/12 08:48:29 INFO mapred.JobClient: map 0% reduce 0%

14/09/12 08:48:47 INFO mapred.JobClient: map 50% reduce 0%

14/09/12 08:48:48 INFO mapred.JobClient: map 100% reduce 0%

14/09/12 08:48:57 INFO mapred.JobClient: map 100% reduce 33%

14/09/12 08:48:59 INFO mapred.JobClient: map 100% reduce 100%

14/09/12 08:49:02 INFO mapred.JobClient: Job complete: job_201409120827_0003

14/09/12 08:49:02 INFO mapred.JobClient: Counters: 30

14/09/12 08:49:02 INFO mapred.JobClient: Job Counters

14/09/12 08:49:02 INFO mapred.JobClient: Launched reduce tasks=1

14/09/12 08:49:02 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=27285

14/09/12 08:49:02 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0

14/09/12 08:49:02 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0

14/09/12 08:49:02 INFO mapred.JobClient: Rack-local map tasks=1

14/09/12 08:49:02 INFO mapred.JobClient: Launched map tasks=2

14/09/12 08:49:02 INFO mapred.JobClient: Data-local map tasks=1

14/09/12 08:49:02 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=12080

14/09/12 08:49:02 INFO mapred.JobClient: File Output Format Counters

14/09/12 08:49:02 INFO mapred.JobClient: Bytes Written=48

14/09/12 08:49:02 INFO mapred.JobClient: FileSystemCounters

14/09/12 08:49:02 INFO mapred.JobClient: FILE_BYTES_READ=104

14/09/12 08:49:02 INFO mapred.JobClient: HDFS_BYTES_READ=265

14/09/12 08:49:02 INFO mapred.JobClient: FILE_BYTES_WRITTEN=177680

14/09/12 08:49:02 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=48

14/09/12 08:49:02 INFO mapred.JobClient: File Input Format Counters

14/09/12 08:49:02 INFO mapred.JobClient: Bytes Read=45

14/09/12 08:49:02 INFO mapred.JobClient: Map-Reduce Framework

14/09/12 08:49:02 INFO mapred.JobClient: Map output materialized bytes=110

14/09/12 08:49:02 INFO mapred.JobClient: Map input records=2

14/09/12 08:49:02 INFO mapred.JobClient: Reduce shuffle bytes=110

14/09/12 08:49:02 INFO mapred.JobClient: Spilled Records=18

14/09/12 08:49:02 INFO mapred.JobClient: Map output bytes=80

14/09/12 08:49:02 INFO mapred.JobClient: Total committed heap usage (bytes)=248127488

14/09/12 08:49:02 INFO mapred.JobClient: CPU time spent (ms)=8560

14/09/12 08:49:02 INFO mapred.JobClient: Combine input records=9

14/09/12 08:49:02 INFO mapred.JobClient: SPLIT_RAW_BYTES=220

14/09/12 08:49:02 INFO mapred.JobClient: Reduce input records=9

14/09/12 08:49:02 INFO mapred.JobClient: Reduce input groups=7

14/09/12 08:49:02 INFO mapred.JobClient: Combine output records=9

14/09/12 08:49:02 INFO mapred.JobClient: Physical memory (bytes) snapshot=322252800

14/09/12 08:49:02 INFO mapred.JobClient: Reduce output records=7

14/09/12 08:49:02 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1042149376

14/09/12 08:49:02 INFO mapred.JobClient: Map output records=9

hadoop@namenode:~/hadoop-1.2.1$ hadoop fs -cat out/*

Warning: $HADOOP_HOME is deprecated.

heheh 1

hello 2

it's 1

ll 1

the 2

think 1

why 1

cat: File does not exist: /user/hadoop/out/_logs
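The last line of the output is not a job failure. The pattern `out/*` also matches the `_logs` directory that the job created inside the output directory, and cat cannot print a directory. A narrower pattern such as `hadoop fs -cat 'out/part-*'` should avoid the message. The sketch below reproduces the idea on the local file system only; the directory layout and file name are illustrative assumptions.

```shell
# Local stand-in for the HDFS output directory (names are illustrative)
demo=$(mktemp -d)
mkdir "$demo/_logs"                          # plays the role of out/_logs
printf 'hello\t2\n' > "$demo/part-r-00000"   # plays the role of a result file

# The narrower pattern matches only result files, so cat never sees _logs
result=$(cat "$demo"/part-*)
echo "$result"
rm -rf "$demo"
```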
