<<TableOfContents>>
== Hadoop Cluster Kickstart ==
=== Preface ===
We use Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version as of February 2013.<<BR>>
In our setup the ''secondarynamenode'' runs on a different machine than the ''namenode''. Both the namenode and the secondarynamenode machines also act as ''datanodes'' and ''tasktrackers''. The ''jobtracker'' runs on the same machine as the namenode.

=== Configuration on all machines in the cluster ===
 * Create the group hadoop and add the user hadoop to it on every machine in the cluster:
{{{
  groupadd -g 790 hadoop
}}}
{{{
  useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop
}}}
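The following is a minimal sketch for checking that the account was created as intended on a machine. The helper name `check_account` is an assumption made for illustration and is not part of Hadoop; it relies only on the standard `id` utility.

```shell
#!/bin/sh
# check_account USER GROUP
# Prints "missing" if USER does not exist, "ok" if USER's primary group
# is GROUP, and "wrong-group" otherwise. (Hypothetical helper.)
check_account() {
  user=$1
  group=$2
  if ! id "$user" >/dev/null 2>&1; then
    echo "missing"
    return 0
  fi
  if [ "$(id -gn "$user")" = "$group" ]; then
    echo "ok"
  else
    echo "wrong-group"
  fi
}
```

Running `check_account hadoop hadoop` on each machine after the two commands above should print `ok`.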

 * Add the following variables to zshrc:
{{{
  export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
  export PATH=$PATH:$HADOOP_INSTALL/bin
  export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
}}}
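Because these exports have to be added on every machine, it can help to append them idempotently, so that re-running the setup does not duplicate lines. A sketch, assuming a hypothetical helper named `append_once`:

```shell
#!/bin/sh
# append_once LINE FILE
# Appends LINE to FILE unless an identical line is already present.
# (Hypothetical helper, not part of Hadoop or zsh.)
append_once() {
  line=$1
  file=$2
  # -x: match the whole line, -F: treat LINE as a fixed string.
  # If FILE does not exist yet, grep fails and the line is appended.
  grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

A call could look like `append_once 'export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf' /usr/local/hadoop/.zshrc`, where the rc-file path is an assumption based on the home directory given to `useradd`.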
=== Configuration of Hadoop framework ===
==== conf/hadoop-env.sh ====
Add the following lines to hadoop-env.sh:
 * Setting JAVA_HOME
{{{
  export JAVA_HOME=/etc/alternatives/jre_oracle
}}}
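A wrong JAVA_HOME is usually noticed only when the daemons fail to start, so a quick check beforehand can save time. The sketch below uses a hypothetical helper `java_home_ok`; it only tests that a `bin/java` executable exists under the given directory.

```shell
#!/bin/sh
# java_home_ok DIR
# Succeeds (exit status 0) if DIR is non-empty and DIR/bin/java exists
# and is executable. (Hypothetical helper for illustration.)
java_home_ok() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}
```

For example, `java_home_ok /etc/alternatives/jre_oracle && echo "JAVA_HOME looks usable"` prints the message only if the path from the setting above contains a Java binary.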

 * Setting the file that lists the cluster members (slaves)
{{{
  export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
}}}
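The slaves file is a plain list of host names, one per line. As an illustration, the sketch below prints the hosts from such a file while skipping blank lines and `#` comments. The helper name `list_slaves` is hypothetical, and treating `#` as a comment marker is an assumption here.

```shell
#!/bin/sh
# list_slaves FILE
# Prints one host name per line, dropping '#' comments, surrounding
# whitespace, and empty lines. (Hypothetical helper.)
list_slaves() {
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' "$1"
}
```

`list_slaves "$HADOOP_SLAVES" | wc -l` then gives the number of configured hosts.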

 * Setting the host:path from which the Hadoop installation should be rsync'd
{{{
  export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
}}}
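The value has the form host:path. A small sketch of how the two parts can be separated with POSIX parameter expansion, for example to check that the master host is reachable before a sync. The variable names `master_host` and `master_path` are illustrative only.

```shell
#!/bin/sh
# Value taken from the setting above.
HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4

# Everything before the first ':' is the host, everything after it is the path.
master_host=${HADOOP_MASTER%%:*}
master_path=${HADOOP_MASTER#*:}

echo "host=$master_host path=$master_path"
```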