Manually Configuring Hadoop on Ubuntu

Configuring Hadoop

Prerequisite: the JDK and SSH are already configured.

(How to configure the JDK: http://www.cnblogs.com/xxx0624/p/4164744.html)

(How to configure SSH: http://www.cnblogs.com/xxx0624/p/4165252.html)

 

1. Add a Hadoop user

sudo addgroup hadoop 
sudo adduser --ingroup hadoop hadoop
sudo usermod -aG admin hadoop

2. Download the Hadoop tarball (example: Hadoop 1.2.1; I put it under /home/xxx0624/hadoop)

sudo tar -zxf hadoop-1.2.1.tar.gz
sudo mv hadoop-1.2.1 /home/xxx0624/hadoop

Make sure all subsequent operations are performed as the hadoop user:

sudo chown -R hadoop:hadoop /home/xxx0624/hadoop

3. Set the Hadoop and Java environment variables

sudo gedit /home/xxx0624/hadoop/conf/hadoop-env.sh

Append the following to the end of the file:

export JAVA_HOME=/usr/lib/jvm   # adjust to your own Java install path
export HADOOP_HOME=/home/xxx0624/hadoop
export PATH=$PATH:/home/xxx0624/hadoop/bin

Make the variables take effect (they must be in effect every time you run a Hadoop command!):

source /home/xxx0624/hadoop/conf/hadoop-env.sh
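Why `source` (rather than executing the file) is required: sourcing runs the script in the current shell, so the exports persist for subsequent commands. A minimal, self-contained demo using a throwaway file and variable (`/tmp/demo-env.sh` and `DEMO_HOME` are hypothetical, not Hadoop names):

```shell
# write a tiny env file, then compare executing it vs. sourcing it
echo 'export DEMO_HOME=/tmp/demo' > /tmp/demo-env.sh

sh /tmp/demo-env.sh                    # runs in a child shell; the export is lost
echo "after sh:     '${DEMO_HOME}'"    # prints an empty value

. /tmp/demo-env.sh                     # sourcing runs it in the current shell
echo "after source: '${DEMO_HOME}'"    # prints '/tmp/demo'
```

The same reasoning applies to hadoop-env.sh: a plain execution would set the variables in a child shell and throw them away.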

4. Pseudo-distributed mode configuration

core-site.xml: configuration for Hadoop Core, such as I/O settings common to HDFS and MapReduce.

hdfs-site.xml: configuration for the HDFS daemons: the namenode, the secondary namenode, and the datanodes.

mapred-site.xml: configuration for the MapReduce daemons: the jobtracker and the tasktrackers.

4.1 First create these directories

mkdir tmp
mkdir hdfs
mkdir hdfs/name
mkdir hdfs/data
# all created inside the hadoop directory
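The four calls above can also be collapsed into a single command; `-p` creates any missing parent directories and is harmless to re-run:

```shell
# run from inside the hadoop directory; -p creates parents as needed
mkdir -p tmp hdfs/name hdfs/data
```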

4.2 Edit the configuration files

 core-site.xml:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/xxx0624/hadoop/tmp</value>
    </property>
</configuration>

 

hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/xxx0624/hadoop/hdfs/data</value>
    </property>
</configuration>

 

mapred-site.xml:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
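A truncated or unclosed tag in any of these three files will make the daemons fail at startup, so it is worth checking well-formedness before moving on. A sketch using python3 (which ships with Ubuntu); the file below is a throwaway copy written to /tmp purely for demonstration:

```shell
# write a sample config to /tmp and confirm it parses as XML
cat > /tmp/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# parse the file; prints "well-formed" only if the XML is valid
python3 -c "import xml.dom.minidom as m; m.parse('/tmp/core-site.xml'); print('well-formed')"
```

The same one-liner can be pointed at the real files under the conf/ directory.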

 

5. Format HDFS

hadoop namenode -format 

If you see an error like this:

ERROR namenode.NameNode: java.io.IOException: Cannot create directory /home/xxx0624/hadoop/hdfs/name/current

then grant the current user write permission on the hadoop directory: sudo chmod -R a+w /home/xxx0624/hadoop

 

6. Start Hadoop

cd /home/xxx0624/hadoop/bin
start-all.sh

The expected output looks like this:

Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-namenode-xxx0624-ThinkPad-Edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting datanode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-datanode-xxx0624-ThinkPad-Edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting secondarynamenode, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-secondarynamenode-xxx0624-ThinkPad-Edge.out
starting jobtracker, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-jobtracker-xxx0624-ThinkPad-Edge.out
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting tasktracker, logging to /home/xxx0624/hadoop/logs/hadoop-xxx0624-tasktracker-xxx0624-ThinkPad-Edge.out

You can verify success with the jps command:

If all five daemons appear (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker), everything is running normally.
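The jps check can also be scripted. Below is a hypothetical helper, `check_daemons`, that matches the five Hadoop 1.x daemon names against a jps listing; on a live cluster you would call it as `check_daemons "$(jps)"`:

```shell
# report each of the five Hadoop 1.x daemons as up or down,
# given a jps listing passed as the first argument
check_daemons() {
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    echo "$1" | grep -qw "$d" && echo "$d: up" || echo "$d: down"
  done
}

# example with a canned listing in which only two daemons are running:
check_daemons "1234 NameNode
5678 DataNode"
```

`grep -w` matches whole words, so checking for NameNode does not falsely match SecondaryNameNode.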

 

7. Check the running status

http://localhost:50030/    - Hadoop admin interface (JobTracker)
http://localhost:50060/    - Hadoop TaskTracker status
http://localhost:50070/    - Hadoop DFS status

 

8. Stop Hadoop

stop-all.sh
