Hive + MongoDB

In this article I'll show you how to connect MongoDB and Hive using MongoStorageHandler.

Suppose you want to make MongoDB data available in Hive so that you can run Hive queries (HQL) against it as Hadoop MapReduce jobs.

First, set up Hadoop, Hive, and MongoDB.

Then download the mongo-hadoop release that matches your Hadoop version.

Extract it and copy all three JAR files into the hadoop/lib and hive/lib folders.

Next, create some sample records in MongoDB.

 use hive-test  
 db.books.insert({ "_id": 1, "name": "Java 7", "author": "author1" });  
 db.books.insert({ "_id": 2, "name": "Hadoop", "author": "author2" });  
 db.books.insert({ "_id": 3, "name": "Hive", "author": "author3" });  

Then start the Hive console and run the following commands.

 create schema books;  
 use books;  
 CREATE TABLE book (id int, name string, author string)  
 STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler"  
 WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"name","author":"author"}')  
 TBLPROPERTIES ( "mongo.uri" = "mongodb://localhost:27017/hive-test.books");  
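
To make the column mapping more concrete, here is a small plain-Java sketch of the idea behind 'mongo.columns.mapping': each Hive column name is looked up in the mapping to find the MongoDB field to read. It is only an illustration and not the storage handler's actual code. The class `ColumnMappingSketch`, the method `toRow`, and the fallback to the column name for unmapped columns are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: mimics how a Hive-column -> Mongo-field mapping selects values.
class ColumnMappingSketch {

    // Returns the document's values in Hive column order; missing fields become null.
    static List<Object> toRow(List<String> hiveColumns,
                              Map<String, String> mapping,
                              Map<String, Object> document) {
        List<Object> row = new ArrayList<>();
        for (String column : hiveColumns) {
            // Assumption: fall back to the column name when no mapping entry exists.
            String field = mapping.getOrDefault(column, column);
            row.add(document.get(field));
        }
        return row;
    }

    public static void main(String[] args) {
        // Same mapping as in the SERDEPROPERTIES above.
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("id", "_id");
        mapping.put("name", "name");
        mapping.put("author", "author");

        // One of the sample documents from the books collection.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", 1);
        doc.put("name", "Java 7");
        doc.put("author", "author1");

        // Prints [1, Java 7, author1]
        System.out.println(toRow(Arrays.asList("id", "name", "author"), mapping, doc));
    }
}
```

The only column that really needs a mapping entry here is id, because its MongoDB field is called _id.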

Check that the data is visible from Hive with:

 select * from book;

If a command fails with an error like this:

 Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage
 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

make sure you copied the three JAR files mentioned above into both hadoop/lib and hive/lib.

Sending Email with Java and Akka actors

Akka is a concurrency framework written in Scala.

Here I demonstrate a sample application, implemented in Java, that sends emails with Akka.

Besides concurrency, I decided to use Akka for these reasons:
  • A built-in, configurable supervisor strategy monitors child workers and decides which policy applies when one of them throws an exception.
  • Delivery can be rescheduled when the application throws certain exceptions.
  • Actor routers can spread the work across a pool of actors.

Here is how to create the supervision strategy.

 class EmailServiceActor extends UntypedActor {
   // Allow at most 10 restarts of a child within 1 minute.
   private static SupervisorStrategy strategy =
       new OneForOneStrategy(10, Duration.create("1 minute"),
           new Function<Throwable, Directive>() {
             public Directive apply(Throwable t) {
               if (t instanceof MessagingException) {
                 return resume();
               } else if (t instanceof Exception) {
                 return stop();
               } else {
                 return escalate();
               }
             }
           });

   // Hand each incoming message to a new child worker.
   public void onReceive(Object message) {
     getContext().actorOf(new Props(EmailServiceWorker.class)).tell(message, self());
   }

   public SupervisorStrategy supervisorStrategy() {
     return strategy;
   }
 }
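
The decision logic of the strategy above can be tried out without Akka on the classpath. The following is a minimal plain-Java sketch of it. The `Directive` enum and `MailDeliveryException` are hypothetical stand-ins for Akka's directives and for `javax.mail.MessagingException`; they are not part of either library.

```java
// Plain-Java sketch of the decider; assumes neither Akka nor JavaMail is available.
class SupervisionSketch {

    // Hypothetical stand-in for Akka's supervisor directives.
    enum Directive { RESUME, STOP, ESCALATE }

    // Hypothetical stand-in for javax.mail.MessagingException.
    static class MailDeliveryException extends Exception {
        MailDeliveryException(String message) {
            super(message);
        }
    }

    // Same branch order as the strategy: mail failures resume, other exceptions
    // stop the child, and everything else (for example an Error) is escalated.
    static Directive decide(Throwable t) {
        if (t instanceof MailDeliveryException) {
            return Directive.RESUME;
        } else if (t instanceof Exception) {
            return Directive.STOP;
        } else {
            return Directive.ESCALATE;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(new MailDeliveryException("smtp timeout"))); // RESUME
        System.out.println(decide(new IllegalStateException("bug")));          // STOP
        System.out.println(decide(new OutOfMemoryError()));                    // ESCALATE
    }
}
```

The order of the checks matters: the mail exception is itself an Exception, so testing for Exception first would stop the worker instead of resuming it.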

Here is how the child worker is created.

 class EmailServiceWorker extends UntypedActor {
   public void onReceive(Object message) {
     try {
       EmailService emailService = new EmailService();
     } catch (IOException e) {
       e.printStackTrace();
     } catch (MessagingException e) {
       e.printStackTrace();
     }
   }

   public void preStart() {
 //    getContext().system().scheduler().scheduleOnce(Duration.create(5, TimeUnit.SECONDS), self(), "emailWorker", getContext().system().dispatcher(), null);
   }

   public void postStop() {
   }
 }
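
The commented-out scheduleOnce call in preStart hints at the rescheduling idea. As a rough sketch of the same idea using only the JDK, the code below retries a delivery after a delay with a `ScheduledExecutorService` instead of the Akka scheduler. `RetryingSender`, `Delivery`, and the retry parameters are hypothetical names chosen for this example, and one `RetryingSender` instance is assumed to handle a single delivery.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only sketch: retry one delivery after a fixed delay, up to a maximum number of attempts.
class RetryingSender {

    // Hypothetical delivery task; it throws when sending fails.
    interface Delivery {
        void send() throws Exception;
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger attempts = new AtomicInteger();
    private final CountDownLatch finished = new CountDownLatch(1);

    // Tries the delivery right away; failures are rescheduled after delayMillis.
    void submit(Delivery delivery, int maxAttempts, long delayMillis) {
        scheduler.execute(() -> attempt(delivery, maxAttempts, delayMillis));
    }

    private void attempt(Delivery delivery, int maxAttempts, long delayMillis) {
        int n = attempts.incrementAndGet();
        try {
            delivery.send();
            finished.countDown(); // delivered
        } catch (Exception e) {
            if (n < maxAttempts) {
                // Reschedule the same delivery instead of giving up.
                scheduler.schedule(() -> attempt(delivery, maxAttempts, delayMillis),
                        delayMillis, TimeUnit.MILLISECONDS);
            } else {
                finished.countDown(); // attempts exhausted
            }
        }
    }

    // Waits until the delivery succeeded or ran out of attempts, then returns the attempt count.
    int awaitAttempts(long timeoutMillis) {
        try {
            finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdownNow();
        return attempts.get();
    }
}
```

For example, `sender.submit(task, 5, 5000)` would try the task immediately and then again every 5 seconds, stopping after the first success or the fifth failure.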

Sample application -

Install CouchDB 1.3 in Ubuntu 12.04 64bit

Here are the steps to install CouchDB 1.3 on Ubuntu 12.04 64-bit.
Step 1
Run the following commands to install the required dependencies.

 sudo apt-get install g++  
 sudo apt-get install erlang-base erlang-dev erlang-eunit erlang-nox  
 sudo apt-get install libmozjs185-dev libicu-dev libcurl4-gnutls-dev libtool  

Step 2
Then go to the CouchDB site and download the source.

Step 3
Extract it and run the following commands from the extracted directory.

 ./configure && make
 sudo make install

Step 4
To start the CouchDB service, run the following command.

 sudo couchdb
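
Once the service is running, CouchDB answers HTTP requests on port 5984 by default, and its root URL returns a small JSON welcome document. A minimal Java sketch of such a check is shown below. The class `CouchDbCheck` and its methods are hypothetical names, and the URL assumes a default local install.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: ask a local CouchDB for its welcome document and read the version from it.
class CouchDbCheck {

    private static final Pattern VERSION =
            Pattern.compile("\"version\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the first version string in a welcome document, or null if the body is not one.
    static String parseVersion(String body) {
        if (body == null || !body.contains("\"couchdb\"")) {
            return null;
        }
        Matcher m = VERSION.matcher(body);
        return m.find() ? m.group(1) : null;
    }

    // Fetches the body served at the given base URL.
    static String fetchRoot(String baseUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
            return sb.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes CouchDB listens on its default address and port.
        String version = parseVersion(fetchRoot("http://127.0.0.1:5984/"));
        System.out.println(version == null ? "Not a CouchDB server" : "CouchDB " + version);
    }
}
```

If the connection is refused, the service is most likely not running or is listening on a different port.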