Revision 2 as of 2013-02-12 11:16:04

Hadoop Cluster Kickstart

Preface

We work with Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version in February 2013.
In our setup the secondarynamenode runs on a different machine from the namenode. Both the namenode and the secondarynamenode machines also act as datanodes and tasktrackers. The jobtracker runs on the same machine as the namenode.
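
The layout described above maps onto the conf/masters and conf/slaves files of Hadoop 1.x. The following is only a sketch: all host names and the variable names are hypothetical placeholders, and the files are written to a scratch directory rather than to the real conf directory.

```shell
# Hypothetical host names; replace them with the real machines of the cluster.
NAMENODE=nn.example.org        # also runs the jobtracker
SECONDARY=snn.example.org      # runs the secondarynamenode
WORKER1=worker1.example.org
WORKER2=worker2.example.org

# Write into a scratch directory so no real configuration is overwritten.
OUT_DIR=$(mktemp -d)

# conf/masters: in Hadoop 1.x this file names the secondarynamenode host(s).
printf '%s\n' "$SECONDARY" > "$OUT_DIR/masters"

# conf/slaves: every host that runs a datanode and a tasktracker, which in
# this setup includes the namenode and secondarynamenode machines.
printf '%s\n' "$NAMENODE" "$SECONDARY" "$WORKER1" "$WORKER2" > "$OUT_DIR/slaves"

echo "wrote $OUT_DIR/masters and $OUT_DIR/slaves"
```

Once the contents look right, the two files can be copied into the conf directory of the installation.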

Configuration on all machines in the cluster

  groupadd -g 790 hadoop

  useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop

  export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
  export PATH=$PATH:$HADOOP_INSTALL/bin
  export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
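
The three exports only hold for the shell in which they are typed, so it can help to check them after login. The following is a minimal sketch; the function name check_hadoop_env is our own and not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): checks the three variables set above.
check_hadoop_env() {
  # HADOOP_INSTALL must be set and non-empty.
  if [ -z "${HADOOP_INSTALL:-}" ]; then
    echo "HADOOP_INSTALL is not set"
    return 1
  fi
  # PATH must contain $HADOOP_INSTALL/bin as one of its entries.
  case ":$PATH:" in
    *":$HADOOP_INSTALL/bin:"*) ;;
    *) echo "PATH does not contain $HADOOP_INSTALL/bin"; return 1 ;;
  esac
  # HADOOP_CONF_DIR must point at the conf directory of this installation.
  if [ "${HADOOP_CONF_DIR:-}" != "$HADOOP_INSTALL/conf" ]; then
    echo "HADOOP_CONF_DIR is not $HADOOP_INSTALL/conf"
    return 1
  fi
  echo "ok"
}
```

Calling check_hadoop_env as the hadoop user prints "ok" when all three settings agree, and otherwise names the first one that does not.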

Configuration of Hadoop framework

conf/hadoop-env.sh

Add the following lines to hadoop-env.sh:

  export JAVA_HOME=/etc/alternatives/jre_oracle

  export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves

  export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
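
The value of HADOOP_MASTER has the rsync-style form host:path. As far as we understand the Hadoop 1.x daemon scripts, a daemon started with this variable set first syncs its installation from that location, so it is worth checking that both parts are what you expect. A small sketch that splits the value (the names master_host and master_path are our own):

```shell
# Same value as in hadoop-env.sh above.
HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4

# Split at the first colon using POSIX parameter expansion.
master_host=${HADOOP_MASTER%%:*}   # text before the first ':'
master_path=${HADOOP_MASTER#*:}    # text after the first ':'

echo "host: $master_host"   # prints "host: ssu03"
echo "path: $master_path"   # prints "path: /usr/local/hadoop/hadoop-1.0.4"
```

The path part should match HADOOP_INSTALL on the other machines, since the installation is expected to live in the same place everywhere.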