Saturday, November 7, 2015

Hadoop changes on Single-Node Cluster Installations


There are many guides on HADOOP single-node cluster installation, mostly for Ubuntu machines, and all the ones I've tried worked. But as time passed, the Hadoop developers also made some structural and logical changes to the application.

Here I wish to sum up those changes (e.g. changes to configuration, file structure etc.) so a newer user can easily find the required commands and files while doing a HADOOP single-node installation.


First, download Hadoop =>
---------------------------------------------------
The best option is to find the nearest mirror for Hadoop at -

http://www.apache.org/dyn/closer.cgi/hadoop/common/

For me, the one below is the nearest:

$ wget http://apache.lauf-forum.at/hadoop/common/hadoop-2.6.2/hadoop-2.6.2.tar.gz
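Once downloaded, unpack the archive and move it into place. The ~/hadoop location below is just the one I use throughout this post; adjust it if your setup differs.


$ tar -xzf hadoop-2.6.2.tar.gz
$ mv hadoop-2.6.2 ~/hadoop     # ~/hadoop is the install location assumed in this post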




Where is my JAVA ?
-------------------------------------

While installing HADOOP, you need to set the JAVA_HOME environment variable in your .bashrc and export it. If you follow somebody else's example, their path may not match your own machine.

So the best option is to search for where your JVM actually lives.


$ find / -name jvm


It will bring up the JVMs installed on your machine. Check those paths against the example you are following. In my case, it's


/usr/lib/jvm/java-7-oracle


So I change my .bashrc to:


export JAVA_HOME=/usr/lib/jvm/java-7-oracle
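If you also want the Hadoop commands on your PATH, a minimal sketch of the .bashrc additions looks like the following. HADOOP_HOME pointing at ~/hadoop is my assumption here; change it to wherever you unpacked Hadoop.


# example ~/.bashrc entries - adjust paths to your own machine
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=$HOME/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin


Reload it afterwards with: source ~/.bashrc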


You also need to change JAVA_HOME in hadoop-env.sh. In the latest Hadoop versions you'll find this file at


~/hadoop/etc/hadoop/hadoop-env.sh
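The relevant line inside that file looks like this; the path here is my machine's, so use whatever your find command returned:


# in ~/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-oracle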


The latest Hadoop ships only a template for mapred-site.xml, which you need to copy as mapred-site.xml:



cp /home/hduser/hadoop/etc/hadoop/mapred-site.xml.template /home/hduser/hadoop/etc/hadoop/mapred-site.xml
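For a single-node setup, the copied file usually only needs to tell MapReduce to run on top of YARN. A minimal sketch of what goes inside the <configuration> block of the copied template:


<!-- ~/hadoop/etc/hadoop/mapred-site.xml -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>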


All Hadoop start/stop scripts are in the ~/hadoop/sbin/ folder:



./start-dfs.sh 
./start-yarn.sh
./stop-all.sh
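After starting the daemons you can check that they are actually running with jps (it ships with the JDK). On a healthy single-node setup it should list processes such as NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager.


jps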



ERROR
===============
Of course, the first time you'll run into some errors.

The most common one is

Cannot create directory /app/hadoop/tmp/dfs/name/current

while trying the command


~/hadoop/bin/hadoop namenode -format


This usually happens because hduser lacks write permission on /app/hadoop/tmp, and can be worked around by running the command with sudo:


sudo ~/hadoop/bin/hadoop namenode -format
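An alternative to formatting as root is to give hduser ownership of the directory once and then run the format normally. The hduser:hadoop user/group pair below is the one most single-node guides use; it is an assumption here, so adjust it to your own user and group:


# hduser:hadoop is assumed - replace with your own user and group
sudo mkdir -p /app/hadoop/tmp
sudo chown -R hduser:hadoop /app/hadoop/tmp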


Sometimes running with sudo replies that hduser is not in the sudoers file. In that case you need to add hduser to the sudo group so it is privileged to use sudo (run this from an account that already has sudo rights, and log out and back in for the change to take effect):

 sudo usermod -a -G sudo hduser


More errors and updates will be added ....
