
How to Install Hadoop on a Standalone Machine?

Steps for Installing Hadoop on a Standalone Machine

  • Description:
    To install Hadoop on a standalone machine, ensure Java is installed, download the latest Hadoop version, and extract it to a directory. Set environment variables for JAVA_HOME and HADOOP_HOME in the .bashrc file. Configure Hadoop by editing core-site.xml (set fs.defaultFS to hdfs://localhost:9000) and hdfs-site.xml (set the replication factor to 1). Format the Hadoop filesystem using hdfs namenode -format and start the Hadoop services with start-dfs.sh. Finally, verify the installation by checking the Hadoop processes with jps and accessing the web interface at http://localhost:9870.
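
At a glance, the flow looks like this (a condensed sketch; the hdoop user, version 3.4.0, and the paths shown are the ones used in the steps below):

  • Command:
    sudo apt install openjdk-8-jdk        # install Java (Step 2)
    wget <mirror-link>                    # download Hadoop 3.4.0 (Steps 10-11)
    tar xzf hadoop-3.4.0.tar.gz           # unpack (Step 12)
    nano ~/.bashrc                        # set JAVA_HOME, HADOOP_HOME, PATH (Steps 13-16)
    hdfs namenode -format                 # initialize HDFS (Step 31)
    start-dfs.sh && start-yarn.sh         # start the daemons (Steps 32-33)
    jps                                   # verify (Step 34)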

Step 1: Update Package Lists for APT

  • Command:
    sudo apt update
  • Hadoop Step 1 Screenshot

Step 2: Install Java JDK 8

  • Command:
    sudo apt install openjdk-8-jdk
  • Hadoop Step 2 Screenshot

Step 3: Verify Java Installation

  • Command:
    java -version
  • Hadoop Step 3 Screenshot

Step 4: Install OpenSSH

  • Command:
    sudo apt install openssh-server openssh-client -y
  • Hadoop Step 4 Screenshot

Step 5: Create Hadoop User

  • Command:
    sudo adduser hdoop
  • Hadoop Step 5 Screenshot
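
The remaining steps do not require sudo for the hdoop user, but if you also want to administer the machine from that account, you can optionally add it to the sudo group:

  • Command:
    sudo usermod -aG sudo hdoop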

Step 6: Switch to the Hadoop User

  • Command:
    su - hdoop
  • Hadoop Step 6 Screenshot

Step 7: Generate SSH Key Pair

  • Command:
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  • Hadoop Step 7 Screenshot

Step 8: Store Public Key

  • Command:
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • Hadoop Step 8 Screenshot

Step 9: Verify SSH Configuration

  • Command:
    ssh localhost
  • Hadoop Step 9 Screenshot
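
If ssh localhost prompts for a password or is refused, confirm the SSH service is running before retrying (the service name assumes Ubuntu's openssh-server package from Step 4):

  • Command:
    sudo systemctl status ssh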

Step 10: The steps outlined in this tutorial use the binary download for Hadoop version 3.4.0. Select your preferred option, and you will be presented with a mirror link to download the Hadoop tar package.

  • Hadoop Step 10 Screenshot

Step 11: Use the provided mirror link and download the Hadoop package using the wget command
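
For example, downloading from the Apache download area (the URL below is an assumption based on Apache's usual layout; substitute the mirror link you were given in Step 10):

  • Command:
    wget https://dlcdn.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz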

Step 12: Once the download completes, extract the .tar.gz file

  • Command:
    tar xzf hadoop-3.4.0.tar.gz

Step 13: Edit the .bashrc shell configuration file using a text editor

  • Command:
    nano ~/.bashrc
  • Edit .bashrc File Screenshot

Step 14: Define the Hadoop environment variables by adding the following content at the end of the file

  • Configuration:
    # Hadoop Related Options
    export HADOOP_HOME=/home/hdoop/hadoop-3.4.0
    export HADOOP_INSTALL=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
              

Step 15: Once you add the variables, save and exit the .bashrc file

  • Save and Exit .bashrc Screenshot

Step 16: Run the command below to apply the changes to the current environment

  • Command:
    source ~/.bashrc
  • Apply Changes Screenshot
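
You can confirm the variables are active in the current shell; the expected output is the path defined in Step 14:

  • Command:
    echo $HADOOP_HOME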

Step 17: Use the previously created $HADOOP_HOME variable to access the hadoop-env.sh file:

  • Command:
    nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
  • Edit hadoop-env.sh Screenshot

Step 18: Uncomment the JAVA_HOME variable and set the full path to your OpenJDK installation

  • Configuration:
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
              
  • Set JAVA_HOME Screenshot
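
If you are unsure where OpenJDK lives on your system, you can resolve the path from the javac binary (a common trick; the result should match the path set above):

  • Command:
    readlink -f /usr/bin/javac | sed "s:/bin/javac::"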

Step 19: Open the core-site.xml file for editing

  • Command:
    nano $HADOOP_HOME/etc/hadoop/core-site.xml
  • Edit core-site.xml Screenshot

Step 20: Add the following configuration to the core-site.xml file

  • Configuration:
    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hdoop/tmpdata</value>
      </property>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://127.0.0.1:9000</value>
      </property>
    </configuration>
        

Step 21: Save and close the core-site.xml file; it resides in $HADOOP_HOME/etc/hadoop

  • Save Core-Site Configuration Screenshot
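
The hadoop.tmp.dir directory set in Step 20 must exist before Hadoop starts; create it now if it is not already present (the path assumes the configuration shown above):

  • Command:
    mkdir -p /home/hdoop/tmpdata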

Step 22: Use the following command to open the hdfs-site.xml file for editing

  • Command:
    nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
  • Edit hdfs-site.xml Screenshot

Step 23: Add the following configuration to the file and adjust directories as needed

  • Configuration:
    <configuration>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hdoop/dfsdata/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hdoop/dfsdata/datanode</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

Step 24: Save and close the file; it resides at $HADOOP_HOME/etc/hadoop/hdfs-site.xml

  • Save hdfs-site.xml Screenshot
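
Likewise, create the NameNode and DataNode directories referenced in Step 23 before starting HDFS (paths assume the configuration shown above):

  • Command:
    mkdir -p /home/hdoop/dfsdata/namenode /home/hdoop/dfsdata/datanode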

Step 25: Use the following command to access the mapred-site.xml file and define MapReduce values

  • Command:
    nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
  • Edit mapred-site.xml Screenshot

Step 26: Add the following configuration to change the default MapReduce framework name value to YARN

  • Configuration:
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
        

Step 27: Save and close the file; it resides at $HADOOP_HOME/etc/hadoop/mapred-site.xml

  • Save mapred-site.xml Screenshot

Step 28: Open the yarn-site.xml file in a text editor

  • Command:
    nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
  • Edit yarn-site.xml Screenshot

Step 29: Append the following configuration to the yarn-site.xml file

  • Configuration:
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>127.0.0.1</value>
      </property>
      <property>
        <name>yarn.acl.enable</name>
        <value>0</value>
      </property>
      <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
      </property>
    </configuration>

Step 30: Save and close the file; it resides at $HADOOP_HOME/etc/hadoop/yarn-site.xml

  • Save yarn-site.xml Screenshot

Step 31: Format HDFS NameNode

  • Command:
    hdfs namenode -format
  • Format NameNode Screenshot

Step 32: Navigate to the hadoop-3.4.0/sbin directory and execute the following command to start the NameNode and DataNode:

  • Command:
    ./start-dfs.sh
  • Start DFS Screenshot

Step 33: Once the NameNode, DataNode, and secondary NameNode are up and running, start the YARN ResourceManager and NodeManagers by typing:

  • Command:
    ./start-yarn.sh
  • Start YARN Screenshot

Step 34: Run the following command to check if all the daemons are active and running as Java processes:

  • Command:
    jps
  • Verify Hadoop Cluster Screenshot
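
With all five daemons running, jps should list six Java processes in total. The listing below is an illustrative sketch only; the process IDs will differ on your machine:

  • Example Output:
    12001 NameNode
    12210 DataNode
    12456 SecondaryNameNode
    12789 ResourceManager
    13012 NodeManager
    13345 Jps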

Step 35: Use your preferred browser and navigate to your localhost URL or IP. The default port number 9870 gives you access to the Hadoop NameNode UI:

  • URL:
    http://localhost:9870/
  • Hadoop Web Interface Screenshot

Step 36: The ResourceManager is an invaluable tool for monitoring all running processes in your Hadoop cluster; its web UI listens on port 8088 by default (http://localhost:8088)

  • Summary: You have configured Hadoop, formatted the NameNode, and started both the DFS and YARN services.
  • Hadoop Web Interface Screenshot
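
When you are finished, you can stop all services from the same sbin directory; the stop scripts mirror the start scripts used in Steps 32 and 33:

  • Command:
    ./stop-yarn.sh
    ./stop-dfs.sh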