Hadoop Cluster Kickstart
Preface
We work with Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version as of February 2013.
In our setup the secondarynamenode runs on a different machine than the namenode. Both the namenode and the secondarynamenode machines are also datanodes and tasktrackers. The jobtracker runs on the same machine as the namenode.
Configuration on all machines in the cluster
- We have to add the user hadoop, in the group hadoop, on all machines in the cluster:
groupadd -g 790 hadoop
useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop
- In zshrc we have to add some variables:
export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
export PATH=$PATH:$HADOOP_INSTALL/bin
export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
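Once the account and the variables are in place, both can be checked from a shell on each machine. The following is only a sketch: the function names user_in_group, path_contains and check_hadoop_env are made up for this illustration and are not part of Hadoop. They rely only on the standard id command and plain POSIX shell syntax, so they should behave the same in zsh and bash.

```shell
# Hypothetical helpers for illustration; not part of Hadoop.

# Succeeds if user $1 belongs to group $2 (uses the standard `id` command).
user_in_group() {
    for g in $(id -nG "$1" 2>/dev/null); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Succeeds if directory $2 is an entry of the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints one message per problem found; prints nothing if all checks pass.
check_hadoop_env() {
    user_in_group hadoop hadoop || echo "user hadoop is not in group hadoop"
    path_contains "$PATH" "$HADOOP_INSTALL/bin" || echo "HADOOP_INSTALL/bin is not on PATH"
    [ "$HADOOP_CONF_DIR" = "$HADOOP_INSTALL/conf" ] || echo "HADOOP_CONF_DIR is not HADOOP_INSTALL/conf"
}
```

Running check_hadoop_env as the hadoop user after a fresh login should print nothing if the steps above were applied on that machine.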
Configuration of Hadoop framework
conf/hadoop-env.sh
The following lines have to be added to hadoop-env.sh:
- Setting JAVA_HOME
export JAVA_HOME=/etc/alternatives/jre_oracle
- Setting cluster members
export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
- Setting the host:path from which the Hadoop installation should be rsync'd
export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
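The HADOOP_MASTER value has the form host:path, and JAVA_HOME must point at a directory containing bin/java. A small sketch of how these two settings could be sanity-checked is shown below. The helper names master_host, master_path and java_home_ok are hypothetical and not part of Hadoop; the values passed to them are the ones used on this page.

```shell
# Hypothetical helpers for sanity-checking hadoop-env.sh values; not part of Hadoop.

# Print the host part of a host:path value (everything before the first colon).
master_host() {
    printf '%s\n' "${1%%:*}"
}

# Print the path part of a host:path value (everything after the first colon).
master_path() {
    printf '%s\n' "${1#*:}"
}

# Succeeds if directory $1 contains an executable bin/java.
java_home_ok() {
    [ -x "$1/bin/java" ]
}

# Example with the values from this page.
master_host "ssu03:/usr/local/hadoop/hadoop-1.0.4"   # prints ssu03
master_path "ssu03:/usr/local/hadoop/hadoop-1.0.4"   # prints /usr/local/hadoop/hadoop-1.0.4
java_home_ok /etc/alternatives/jre_oracle || echo "no executable bin/java under JAVA_HOME"
```

Splitting at the first colon with parameter expansion avoids calling external tools, so the check also works in a minimal environment.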