Showing posts from October, 2016

Install Apache Hadoop Mahout on macOS without Brew

Here are the steps to install Apache Mahout:
1. Download the latest package from
2. Extract the package
3. Create a directory in HDFS and put the data file into it
<hadoop_directory>/bin/hdfs dfs -mkdir /mahout_data
<hadoop_directory>/bin/hdfs dfs -put /home/Hadoop/data/mydata.txt /mahout_data/
4. Run clustering in Mahout
<mahout_directory>/bin/mahout seqdirectory -i hdfs://localhost:9000/mahout_data/ -o hdfs://localhost:9000/clustered_data/
5. The output file will be in the clustered_data directory

Read .properties file in Java

Java class Properties file
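
A minimal sketch of reading a .properties file with java.util.Properties. The class name ReadProperties and the default file name config.properties are hypothetical, chosen only for illustration; pass the path of your own file as the first argument.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class ReadProperties {

    // Loads all key=value pairs from the given file into a Properties object.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        // try-with-resources closes the stream even if load() throws
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // "config.properties" is a placeholder default, not a required name
        Path file = Paths.get(args.length > 0 ? args[0] : "config.properties");
        Properties props = load(file);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

When a key may be absent, props.getProperty(key, defaultValue) returns the default instead of null.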

[Solved] Hive installation error: Relative path in absolute URI

Exception in thread "main" java.lang.IllegalArgumentException: Relative path in absolute URI: ${$
at org.apache.hadoop.fs.Path.initialize(
Solution: Edit hive-site.xml and update the following properties
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive-${}</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${}</value>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/${}_resources</value>
</property>

Read a file and print all lines in Hadoop

#1. Compile this code and run it in Hadoop with the command below
#2. Put "transaction.csv" into HDFS before running

hadoop jar Cat.jar org.myorg.Cat

#3. You will get the output in the console itself

Hadoop - Class not found Exception with $ symbol

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
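To solve this, tell Hadoop which jar contains the job classes, typically by calling job.setJarByClass(WordCount.class) in main()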

How to verify Hadoop is running properly?

To verify that all of Hadoop's nodes are running properly, type


The output should look like
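
The check can also be automated. The sketch below assumes the command in question is the JDK tool jps, which prints one "<pid> <name>" pair per running JVM, and that a single-node Hadoop 2.x setup runs the five daemons listed in EXPECTED. Both the class name JpsCheck and that daemon list are assumptions made for illustration; adjust the list to your own cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JpsCheck {

    // Assumed daemon set for a single-node (pseudo-distributed) Hadoop 2.x install.
    static final List<String> EXPECTED = Arrays.asList(
            "NameNode", "DataNode", "SecondaryNameNode", "ResourceManager", "NodeManager");

    // Parses jps output ("<pid> <name>" per line) and returns the expected
    // daemons that are not listed. Names are compared exactly, so
    // "SecondaryNameNode" does not count as "NameNode".
    static List<String> missingDaemons(String jpsOutput) {
        Set<String> running = new HashSet<>();
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!running.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Runs jps from the PATH and captures everything it prints
        Process p = new ProcessBuilder("jps").redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        List<String> missing = missingDaemons(out.toString());
        System.out.println(missing.isEmpty()
                ? "All expected daemons are running"
                : "Not running: " + missing);
    }
}
```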

How to set an environment variable permanently in Mac OS X

$ cd ~
$ open -e .bash_profile

This opens the file in TextEdit,
where you can edit your environment variables

Ex: export JAVA_HOME=/etc/java/java_path

Save the file

$ source .bash_profile
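
To confirm that the variable is visible to programs started from a new terminal, a small check like the following can help. It is only a sketch: the class name EnvCheck is hypothetical, and JAVA_HOME is used as the default because it is the variable from the example above.

```java
public class EnvCheck {

    // Builds a one-line report for a variable; value is null when it is unset.
    static String describe(String name, String value) {
        if (value == null || value.isEmpty()) {
            return name + " is not set";
        }
        return name + "=" + value;
    }

    public static void main(String[] args) {
        // Check the variable named on the command line, or JAVA_HOME by default
        String name = args.length > 0 ? args[0] : "JAVA_HOME";
        System.out.println(describe(name, System.getenv(name)));
    }
}
```

Compile it with javac EnvCheck.java and run java EnvCheck JAVA_HOME from a freshly opened terminal window.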

Install Hadoop in Mac

#1. Download the latest Hadoop distribution from [Ex: hadoop-2.7.3/]

#2. Extract the compressed file anywhere
#3. Then change the following files

hadoop_distro/etc/hadoop/hdfs-site.xml