== Hadoop Cluster Kickstart ==

=== Preface ===

We work with Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version as of February 2013.

In our setup the ''secondarynamenode'' runs on a different machine than the ''namenode''. Both the namenode and the secondarynamenode also act as ''datanodes'' and ''tasktrackers''. The ''jobtracker'' runs on the same machine as the namenode.

=== Configuration on all machines in cluster ===

 * Add the user hadoop, in the group hadoop, on all machines in the cluster:
 {{{
groupadd -g 790 hadoop
 }}}
 {{{
useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop
 }}}
 * In zshrc, add the following variables:
 {{{
export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
export PATH=$PATH:$HADOOP_INSTALL/bin
export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
 }}}

=== Configuration of Hadoop framework ===

==== conf/hadoop-env.sh ====

Add the following lines to hadoop-env.sh (the file is a shell script, as the export statements show):

 * Setting JAVA_HOME
 {{{
export JAVA_HOME=/etc/alternatives/jre_oracle
 }}}
 * Setting the file that lists the cluster members
 {{{
export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
 }}}
 * Setting the host:path from which the Hadoop installation is rsync'd
 {{{
export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
 }}}
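
Since the same three settings have to be added on every machine, it can help to script the step so that running it twice does not duplicate lines. The following is only a minimal sketch under that assumption: {{{add_export}}} is a hypothetical helper name, not part of Hadoop, and it only looks for an uncommented {{{export NAME=}}} line at the start of a line.

```shell
#!/bin/sh
# Sketch: append "export NAME=VALUE" to a file unless an uncommented
# export of NAME is already present. add_export is a hypothetical
# helper for illustration, not a Hadoop tool.
add_export() {
    env_file=$1
    var_name=$2
    var_value=$3

    # An existing uncommented export of this variable means nothing to do.
    if grep -q "^export ${var_name}=" "$env_file" 2>/dev/null; then
        return 0
    fi

    # printf with %s writes the value literally, so a value such as
    # $HADOOP_HOME/conf/slaves is stored unexpanded.
    printf 'export %s=%s\n' "$var_name" "$var_value" >> "$env_file"
}
```

For example, {{{add_export "$HADOOP_CONF_DIR/hadoop-env.sh" HADOOP_SLAVES '$HADOOP_HOME/conf/slaves'}}} would add the second setting above; the single quotes keep {{{$HADOOP_HOME}}} from being expanded by the calling shell.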