Jan 10, 2011

Installing Hadoop on Windows / cygwin

Highly unnecessary... unless you're stuck with a Windows machine :-/

1) Install Cygwin

This is as straightforward as it gets, but don't forget to select your favorite editor in the package list: vim is not included in the default install (!).

The default directory is c:\cygwin, and there is no reason to change it.

2) Install the Java Development Kit

Avoid spaces in the installation directory (so not the default location under c:\Program Files): c:\jdk1.6 will do nicely.

3) Install Hadoop

Download the latest release (0.21.0 at the time of writing) and extract it to d:\work (or something similar).

4) Fix your environment variables

Start cygwin and append the following lines to the end of your ~/.bashrc file, then open a new cygwin window (or run "source ~/.bashrc") so they take effect:

export JAVA_HOME=/cygdrive/c/jdk1.6
export HADOOP_INSTALL=/cygdrive/d/work/hadoop-0.21.0
export PATH=$PATH:$HADOOP_INSTALL/bin
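
Before moving on, it can help to check that both variables point at directories that actually exist. Here is a minimal sketch; check_dir and check_hadoop_env are hypothetical helper names of my own, not part of Cygwin or Hadoop:

```shell
# Hypothetical helper: report on one variable.
# $1 = variable name (used in messages only), $2 = its current value
check_dir() {
  if [ -z "$2" ]; then
    echo "$1 is not set"
    return 1
  fi
  if [ ! -d "$2" ]; then
    echo "$1 points to a missing directory: $2"
    return 1
  fi
  echo "$1 OK: $2"
}

# Hypothetical helper: check both variables set in .bashrc.
# Returns 0 only if both are set and both directories exist.
check_hadoop_env() {
  rc=0
  check_dir JAVA_HOME "${JAVA_HOME:-}" || rc=1
  check_dir HADOOP_INSTALL "${HADOOP_INSTALL:-}" || rc=1
  return $rc
}
```

Paste the two functions into a cygwin shell and run check_hadoop_env: a typo in a /cygdrive path shows up as a "missing directory" message.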


5) Fix the hadoop-config script

$ vi $HADOOP_INSTALL/bin/hadoop-config.sh

Locate the section starting with "# cygwin path translation" and add the following line to it:

CLASSPATH=`cygpath -wp "$CLASSPATH"`

Save and exit.
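
Why this line? Java on Windows expects a semicolon-separated list of Windows paths, while the script builds a colon-separated list of cygwin paths; cygpath -wp converts the whole list in one go. The sketch below wraps the same call in a uname check so it is harmless outside cygwin. The function name to_windows_classpath and the two example paths are my own illustration, not something taken from the Hadoop scripts:

```shell
# Hypothetical helper: convert a POSIX-style path list to Windows form,
# but only when running under cygwin; anywhere else, return it unchanged.
to_windows_classpath() {
  case "$(uname)" in
    CYGWIN*)
      # -w = produce Windows paths, -p = treat the argument as a path list
      cygpath -wp "$1"
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

# Example paths only; the real CLASSPATH is assembled by hadoop-config.sh.
CLASSPATH="/cygdrive/d/work/hadoop-0.21.0/conf:/cygdrive/d/work/hadoop-0.21.0/lib"
CLASSPATH=$(to_windows_classpath "$CLASSPATH")
echo "$CLASSPATH"
```

Under cygwin the echoed value should come back as Windows-style paths separated by semicolons; on any other system it is printed as-is.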

6) Test your installation

$ hadoop version
Hadoop 0.21.0
etc etc.
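
If you get "command not found" instead, the usual suspects are a PATH that does not include the Hadoop bin directory or a wrong HADOOP_INSTALL. A small sketch to tell the two apart; diagnose_command is a hypothetical helper of my own, not a Hadoop or cygwin tool:

```shell
# Hypothetical helper: explain why a command might not be found.
# $1 = command name, $2 = directory where it is expected to live
diagnose_command() {
  cmd=$1
  dir=$2
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd found at $(command -v "$cmd")"
    return 0
  fi
  if [ -x "$dir/$cmd" ]; then
    echo "$cmd exists in $dir but that directory is not on PATH"
  else
    echo "$cmd not found in $dir; check HADOOP_INSTALL"
  fi
  return 1
}
```

For this install you would run: diagnose_command hadoop "$HADOOP_INSTALL/bin"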


That's it. Happy hadoop'ing :)

5 comments:

  1. Thank you so much!! I had the update to the config file in the wrong location.

  2. Thank You!
    The statement:
    CLASSPATH=`cygpath -wp "$CLASSPATH"`
    helped me a lot. I was struggling with it while installing Hadoop 2.2.0 on Cygwin.

    Replies
    1. Hi NKR,
      I am trying to install Hadoop-2.2.0 on cygwin. I tried following all the steps above. I could not locate the section where the statement "CLASSPATH=`cygpath -wp "$CLASSPATH"`" should go, so I added it just after the statement

      # CLASSPATH initially contains $HADOOP_CONF_DIR
      CLASSPATH="${HADOOP_CONF_DIR}"

      and saved and exited.

      Now, when I try the command "hadoop version" on my localhost, I get the response "-bash: hadoop: command not found".

      Please help me out in resolving this.

      Thx,
      Neelima

  3. The following article describes how to build a native binary distribution from source, then install, configure and run Hadoop 2.2.0 on the Windows platform:

    http://www.srccodes.com/p/article/38/build-install-configure-run-apache-hadoop-2.2.0-microsoft-windows-os

  4. As we follow this blog alongside attending a Hadoop online training center, our knowledge of Hadoop has increased considerably. Thanks for the way the information is presented on this blog.
