Diff for "Hadoop Cluster Kickstart"

Differences between revisions 1 and 2

Hadoop Cluster Kickstart

Preface

We work with Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version in February 2013.
In our setup the secondarynamenode runs on a different machine than the namenode. Both the namenode and the secondarynamenode are also datanodes and tasktrackers. The jobtracker runs on the same machine as the namenode.

Configuration on all machines in cluster

Add the user hadoop, in the group hadoop, on every machine in the cluster:

  groupadd -g 790 hadoop

  useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop
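
To confirm on a given machine that the account ended up in the expected group, a small check like the following can be used. This is a minimal sketch and not part of the original setup: the function name user_in_group is made up for illustration, and it relies only on the standard id utility.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop):
# succeeds if user $1 is a member of group $2.
user_in_group() {
  # id -Gn prints the names of all groups of the user;
  # it fails if the user does not exist
  member_of=$(id -Gn "$1" 2>/dev/null) || return 1
  # Pad with spaces so only whole group names match
  case " $member_of " in
    *" $2 "*) return 0 ;;
  esac
  return 1
}

if user_in_group hadoop hadoop; then
  echo "user hadoop is in group hadoop"
else
  echo "user hadoop is missing or not in group hadoop"
fi
```

The check only reads account data, so it does not need root and can be repeated on every machine after the groupadd and useradd steps.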

Add the following variables to zshrc:

  export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
  export PATH=$PATH:$HADOOP_INSTALL/bin
  export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
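
Whether the variables took effect can be verified from a fresh shell. The following is a minimal sketch under the assumption that HADOOP_INSTALL is exported as shown above; the helper name path_contains is hypothetical and not something Hadoop provides.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 is an entry
# of the colon-separated list $2.
path_contains() {
  # Wrap the list in colons so only whole entries match
  case ":$2:" in
    *":$1:"*) return 0 ;;
  esac
  return 1
}

# HADOOP_INSTALL may be unset in a shell that has not read zshrc yet
if [ -n "${HADOOP_INSTALL:-}" ] && path_contains "$HADOOP_INSTALL/bin" "$PATH"; then
  echo "Hadoop bin directory is on PATH"
else
  echo "Hadoop bin directory is not on PATH"
fi
```

Once the directory is on PATH, running hadoop version should report release 1.0.4.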

Configuration of Hadoop framework

conf/hadoop-env.sh

Add the following lines to hadoop-env.sh:

Setting JAVA_HOME

  export JAVA_HOME=/etc/alternatives/jre_oracle

Setting cluster members

  export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves

Setting the host:path from which Hadoop should be rsync'd

  export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
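
The file named by HADOOP_SLAVES lists the cluster members, one hostname per line. To see which hosts the start scripts would contact, the file can be printed without comments and blank lines. This is a minimal sketch: the function name list_slaves is hypothetical, and treating # as a comment marker is our assumption about the file format, which we believe matches how Hadoop's own scripts read it.

```shell
#!/bin/sh
# Hypothetical helper: prints the hostnames in slaves file $1,
# one per line, without comments and blank lines.
list_slaves() {
  # 1. drop everything from '#' to the end of the line
  # 2. drop trailing whitespace
  # 3. delete lines that are now empty
  sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Only print the list if HADOOP_SLAVES points to a readable file
if [ -r "${HADOOP_SLAVES:-}" ]; then
  list_slaves "$HADOOP_SLAVES"
fi
```

Comparing this output on each machine is a quick way to notice a slaves file that was not synchronized.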

-  Revision 1 as of 2013-02-12 10:18:57
  Size: 41
  Editor: AndreasKnoepke
  Comment:
+  Revision 2 as of 2013-02-12 11:16:04
  Size: 1296
  Editor: AndreasKnoepke
  Comment:
 Line 1:
-* User:group on all machines in Cluster
+<<TableOfContents>>
== Hadoop Cluster Kickstart ==
=== Preface ===
We work with Apache Hadoop release 1.0.4 from http://hadoop.apache.org/, which was the stable version in February 2013.<<BR>>
In our setup the ''secondarynamenode'' runs on a different machine than the ''namenode''. Both the namenode and the secondarynamenode are also ''datanodes'' and ''tasktrackers''. The ''jobtracker'' runs on the same machine as the namenode.

=== Configuration on all machines in cluster ===
 * Add the user hadoop, in the group hadoop, on every machine in the cluster:
{{{
  groupadd -g 790 hadoop
}}}
{{{
  useradd --comment "Hadoop" --shell /bin/zsh -m -r -g 790 -G hadoop --home /usr/local/hadoop hadoop
}}}

 * Add the following variables to zshrc:
{{{
  export HADOOP_INSTALL=/usr/local/hadoop/hadoop-1.0.4
  export PATH=$PATH:$HADOOP_INSTALL/bin
  export HADOOP_CONF_DIR=$HADOOP_INSTALL/conf
}}}
=== Configuration of Hadoop framework ===
==== conf/hadoop-env.sh ====
Add the following lines to hadoop-env.sh:
 * Setting JAVA_HOME
{{{
  export JAVA_HOME=/etc/alternatives/jre_oracle
}}}

 * Setting cluster members
{{{
  export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
}}}

 * Setting the host:path from which Hadoop should be rsync'd
{{{
  export HADOOP_MASTER=ssu03:/usr/local/hadoop/hadoop-1.0.4
}}}
