Getting Started with Hadoop: Configuring and Deploying HDFS (Single Node), Part 1

WBOY
2016-06-07 15:22:17

1. Configure SSH

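Install the SSH server and client:

sudo apt-get install openssh-server openssh-client

Verify the installation:

ssh username@192.168.30.128

Enter the password for username when prompted. A banner like the one below means the login succeeded. (Changing the SSH port is not recommended here. Hadoop's scripts connect on the default port 22, and starting Hadoop fails on any other port unless the port is also set on the Hadoop side, for example through HADOOP_SSH_OPTS in conf/hadoop-env.sh.)

Welcome to Ubuntu 13.04 (GNU/Linux 3.8.0-34-generic i686)

* Documentation: https://help.ubuntu.com/

New release '13.10' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Sun Dec 8 10:27:38 2013 from ubuntu.local

Key-based login without a passphrase:

ssh-keygen -t rsa -P ""

Press Enter at the prompt to accept the default file location. This creates id_rsa and id_rsa.pub in /home/username/.ssh/; known_hosts appears in the same directory after the first connection to a host. Then append the public key to the authorization file that sshd reads, /home/username/.ssh/authorized_keys:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Only the public key id_rsa.pub belongs in this file; the private key id_rsa is never copied anywhere.

Key-based login with a passphrase:

ssh-keygen -t rsa

At the prompt "Enter passphrase (empty for no passphrase):", type the passphrase you want. The remaining steps are the same as above. (The author has not tested this variant.)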
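The append step above can be made safe to repeat. The following is a minimal sketch, not part of the original article: authorize_key is a hypothetical helper name, and the 700/600 permissions assume sshd runs with its default strict permission checks.

```shell
# Hypothetical helper: append a public key to authorized_keys exactly once
# and apply the permissions sshd is assumed to require.
# $1 = public key file, $2 = .ssh directory
authorize_key() {
  pub=$1
  ssh_dir=$2
  mkdir -p "$ssh_dir"
  touch "$ssh_dir/authorized_keys"
  # -x matches whole lines, -F treats the key as a fixed string:
  # skip the append if the key is already listed
  if ! grep -qxF -e "$(cat "$pub")" "$ssh_dir/authorized_keys"; then
    cat "$pub" >> "$ssh_dir/authorized_keys"
  fi
  chmod 700 "$ssh_dir"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Example call (commented out so the sketch has no side effects):
# authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh"
```

After the call, `ssh localhost` should log in without asking for a password. If it still prompts, the permissions on the home directory and on .ssh are the first thing to check.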

2. Install the JDK (OpenJDK is used here; for why not Oracle's JDK, search Baidu or Google)

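Install the JDK (openjdk-7 was the latest version at the time of writing):

sudo apt-get install openjdk-7-jdk

Set the environment variables by adding the following lines to the end of ~/.bashrc (open it with vim ~/.bashrc; sudo is not needed for a file in your own home directory):

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

JAVA_HOME must point at the directory the package actually installed. On 32-bit Ubuntu with openjdk-7-jdk this is typically /usr/lib/jvm/java-7-openjdk-i386; list /usr/lib/jvm to confirm.

Apply the changes:

source ~/.bashrc

Test the installation:

java -version

Output like the following means the JDK is installed:

java version "1.7.0_25"
OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.13.04.2)
OpenJDK Client VM (build 23.7-b01, mixed mode, sharing)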
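Rather than typing the JAVA_HOME path by hand, it can be derived from wherever the java binary actually lives. This is a rough sketch under assumptions: java_home_from is a hypothetical helper, readlink -f is the GNU coreutils form available on Ubuntu, and the /jre/bin/java layout is what Ubuntu's OpenJDK 7 packages typically use.

```shell
# Hypothetical helper: turn the resolved path of the java binary
# into a JAVA_HOME candidate.
java_home_from() {
  jh=${1%/bin/java}   # strip the trailing /bin/java
  jh=${jh%/jre}       # OpenJDK 7 typically resolves to .../jre/bin/java
  printf '%s\n' "$jh"
}

# Only attempt the lookup when java is on the PATH.
if command -v java >/dev/null 2>&1; then
  JAVA_HOME=$(java_home_from "$(readlink -f "$(command -v java)")")
  export JAVA_HOME
  echo "JAVA_HOME=$JAVA_HOME"
fi
```

The printed value can then be pasted into the export line in ~/.bashrc, which avoids a mismatch between the installed JDK version and the configured path.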

3. Install Hadoop and Configure HDFS

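Download Hadoop, then unpack and install it:

tar -zxvf hadoop-1.2.1.tar.gz (unpacks into the hadoop-1.2.1 directory)
mv hadoop-1.2.1 hadoop (renames the hadoop-1.2.1 directory to hadoop)
sudo cp -r hadoop /usr/local (copies hadoop into /usr/local; -r is required because hadoop is a directory)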
Edit the HDFS configuration files (hadoop/conf/core-site.xml, hadoop/conf/hdfs-site.xml, hadoop/conf/mapred-site.xml).

sudo vim /usr/local/hadoop/conf/core-site.xml (change the contents to the following)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.30.128:9000</value>
</property>
</configuration>

sudo vim /usr/local/hadoop/conf/hdfs-site.xml (change the contents to the following)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/username/hadoop_tmp</value><!-- this directory must be created -->
<description>A base for other temporary directories.</description>
</property>
<property>
<name>dfs.name.dir</name>
<value>/tmp/hadoop/dfs/datalog1,/tmp/hadoop/dfs/datalog2</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/tmp/hadoop/dfs/data1,/tmp/hadoop/dfs/data2</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>

With a single DataNode, HDFS can store only one replica of each block, so a dfs.replication of 2 leaves every block reported as under-replicated; 1 is the usual value for a single-node setup.

sudo vim /usr/local/hadoop/conf/mapred-site.xml (change the contents to the following)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.30.128:9001</value>
</property>
</configuration>
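The configuration above names several local directories, and hadoop.tmp.dir in particular must exist before Hadoop starts. The sketch below shows one way to create them from the same comma-separated lists used by dfs.name.dir and dfs.data.dir. make_dirs_from_list is a hypothetical helper, not a Hadoop command.

```shell
# Hypothetical helper: create every directory in a comma-separated list.
# The body runs in a subshell, so the IFS change does not leak to the caller.
make_dirs_from_list() (
  IFS=,
  for d in $1; do
    mkdir -p "$d"
  done
)

# Example calls, using the paths from hdfs-site.xml
# (commented out so the sketch has no side effects):
# make_dirs_from_list "/home/username/hadoop_tmp"
# make_dirs_from_list "/tmp/hadoop/dfs/datalog1,/tmp/hadoop/dfs/datalog2"
# make_dirs_from_list "/tmp/hadoop/dfs/data1,/tmp/hadoop/dfs/data2"
```

Run the calls as the user that will start Hadoop, so the daemons own the directories they write to.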

4. Run wordcount

Create an input directory in HDFS. Do not create the output directory: wordcount reports an error if it already exists. The commands below use ./hadoop, so they are presumably run from /usr/local/hadoop/bin, and they assume the NameNode has already been formatted (./hadoop namenode -format) and the daemons started (./start-all.sh).

./hadoop fs -mkdir /input
./hadoop fs -put myword.txt /input
./hadoop jar /usr/local/hadoop/hadoop-examples-1.2.1.jar wordcount /input /output
./hadoop fs -cat /output/part-r-00000
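To sanity-check the job's result, the same counts can be computed locally and compared with part-r-00000. This is a rough sketch, assuming the example job splits on whitespace and writes word, tab, count sorted by byte order, which is my understanding of the Hadoop 1.x example rather than something the article states. local_wordcount is a hypothetical helper.

```shell
# Hypothetical helper: count whitespace-separated words in a local file
# and print "word<TAB>count", sorted bytewise.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" |   # one token per line
    grep -v '^$' |                  # drop a possible leading empty line
    LC_ALL=C sort |                 # bytewise order
    uniq -c |                       # prefix each word with its count
    awk '{ print $2 "\t" $1 }'      # reorder to word, tab, count
}

# Example comparison (commented out; needs a running cluster):
# local_wordcount myword.txt > expected.txt
# ./hadoop fs -cat /output/part-r-00000 | diff - expected.txt
```

An empty diff suggests the job processed the whole input. Small differences may come from tokenization details, such as how punctuation or unusual whitespace characters are handled.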