Hive + MongoDB

In this article I'll show you how to connect MongoDB and Hive using MongoStorageHandler.

Suppose you want to make MongoDB data available in Hive, so that you can run any Hive query (HQL) against it using Hadoop MapReduce.

First, set up Hadoop, Hive and MongoDB.

Then download the mongo-hadoop release that matches your Hadoop version.

Extract it and copy all 3 jar files into the hadoop/lib and hive/lib folders.

Next, create some sample records in MongoDB.


 use hive-test  
 db.books.insert({ "_id": 1, "name": "Java 7", "author": "author1" });  
 db.books.insert({ "_id": 2, "name": "Hadoop", "author": "author2" });  
 db.books.insert({ "_id": 3, "name": "Hive", "author": "author3" });  

Then start the Hive console and run the following commands.


 create schema books;  
 use books;  
 CREATE TABLE book (id int, name string, author string)  
 STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler"  
 WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"name","author":"author"}')  
 TBLPROPERTIES ( "mongo.uri" = "mongodb://localhost:27017/hive-test.books");  

Check that Hive can read the MongoDB data by running:

 select * from book;   
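
The 'mongo.columns.mapping' value in the CREATE TABLE statement above is a JSON object that maps each Hive column to a MongoDB field. For tables with many columns it can be easier to generate that string than to type it. Here is a minimal sketch in Java; ColumnMappingBuilder and toMappingJson are hypothetical names of my own, not part of mongo-hadoop, and the sketch assumes column and field names contain no quotes or backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of mongo-hadoop): builds the JSON value for
// the 'mongo.columns.mapping' SerDe property from Hive column -> MongoDB field pairs.
class ColumnMappingBuilder {

    // Assumes names contain no characters that need JSON escaping.
    static String toMappingJson(Map<String, String> hiveToMongo) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> entry : hiveToMongo.entrySet()) {
            if (!first) {
                json.append(',');
            }
            json.append('"').append(entry.getKey()).append("\":\"")
                .append(entry.getValue()).append('"');
            first = false;
        }
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("id", "_id");
        mapping.put("name", "name");
        mapping.put("author", "author");
        // Prints {"id":"_id","name":"name","author":"author"}
        System.out.println(toMappingJson(mapping));
    }
}
```

The printed string can be pasted between the single quotes of the SERDEPROPERTIES clause.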

Troubleshooting

 Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.com.mongodb.hadoop.hive.MongoStorageHandler  
 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask  

If you see this error, make sure you have added the 3 jars mentioned above to both hadoop/lib and hive/lib.
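
One quick way to confirm whether the storage handler class is visible is to try loading it by name. The following is only a sketch: StorageHandlerCheck and isOnClasspath are hypothetical names, and the result is only meaningful if you run it with the same jars on the classpath that Hive uses.

```java
// Hypothetical diagnostic: reports whether a class can be loaded by name
// from the classpath this program was started with.
class StorageHandlerCheck {

    static boolean isOnClasspath(String className) {
        try {
            // 'false' = load the class without running its static initializers.
            Class.forName(className, false, StorageHandlerCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers the case where the class itself is present
            // but something it depends on is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        String handler = "com.mongodb.hadoop.hive.MongoStorageHandler";
        System.out.println(handler + (isOnClasspath(handler) ? " found" : " NOT found"));
    }
}
```

If it reports NOT found even though the jars are in place, a dependency of the handler may be missing from the classpath rather than the handler jar itself.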

Sending Email with Java and Akka actors

Akka is a concurrency framework written in Scala.

Here I demonstrate a sample application, implemented in Java, that sends emails with Akka.

Reasons I decided to use Akka, besides concurrency:
  • A built-in, configurable supervisor strategy that monitors child workers and decides which policy applies when one of them throws an exception.
  • Delivery can be rescheduled when the application throws a specific exception.
  • Actor routers, which can distribute messages across a pool of worker actors.
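
To make the rescheduling point concrete without pulling in Akka, here is a plain java.util.concurrent sketch of the same idea: retry a delivery after a delay when a specific exception is thrown, and give up on anything else. Every name in it (RetryingSender, TransientMailException and so on) is hypothetical and it is not how the sample application is implemented.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java sketch of "reschedule delivery on a specific exception".
// All names here are hypothetical; this is not Akka code.
class RetryingSender {

    // Stand-in for a retryable failure such as a mail transport error.
    static class TransientMailException extends Exception {
        TransientMailException(String message) { super(message); }
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger attempts = new AtomicInteger();
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean delivered;

    // Starts the first attempt; later attempts are scheduled after delayMillis.
    void submit(Callable<Void> delivery, int maxAttempts, long delayMillis) {
        scheduler.execute(() -> attempt(delivery, maxAttempts, delayMillis));
    }

    private void attempt(Callable<Void> delivery, int maxAttempts, long delayMillis) {
        int n = attempts.incrementAndGet();
        try {
            delivery.call();
            delivered = true;
            finish();
        } catch (TransientMailException e) {
            if (n < maxAttempts) {
                // Retryable failure: try again later instead of giving up.
                scheduler.schedule(() -> attempt(delivery, maxAttempts, delayMillis),
                        delayMillis, TimeUnit.MILLISECONDS);
            } else {
                finish();
            }
        } catch (Exception e) {
            finish(); // Any other exception is treated as permanent.
        }
    }

    private void finish() {
        finished.countDown();
        scheduler.shutdown();
    }

    // Waits for the outcome; returns true only if a delivery attempt succeeded.
    boolean awaitDelivered(long timeoutMillis) {
        try {
            finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delivered;
    }

    int attempts() { return attempts.get(); }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        RetryingSender sender = new RetryingSender();
        // Fails twice with the retryable exception, then succeeds.
        sender.submit(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new TransientMailException("temporary failure");
            }
            return null;
        }, 5, 100);
        System.out.println("delivered=" + sender.awaitDelivered(5000)
                + " attempts=" + sender.attempts());
    }
}
```

With Akka, the supervisor and the scheduler take over this bookkeeping, which is the attraction of the first two bullets.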

Here is how to create the supervision strategy.

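 import akka.actor.OneForOneStrategy;
 import akka.actor.Props;
 import akka.actor.SupervisorStrategy;
 import akka.actor.SupervisorStrategy.Directive;
 import akka.actor.UntypedActor;
 import akka.japi.Function;
 // Assumes Akka 2.1; on Akka 2.0 the class is akka.util.Duration.
 import scala.concurrent.duration.Duration;
 // Assumes the JavaMail exception type.
 import javax.mail.MessagingException;
 import static akka.actor.SupervisorStrategy.escalate;
 import static akka.actor.SupervisorStrategy.resume;
 import static akka.actor.SupervisorStrategy.stop;

 class EmailServiceActor extends UntypedActor {
   // Allow at most 10 failures per child within 1 minute.
   private static SupervisorStrategy strategy =
       new OneForOneStrategy(10, Duration.create("1 minute"),
           new Function<Throwable, Directive>() {
             @Override
             public Directive apply(Throwable t) {
               if (t instanceof MessagingException) {
                 return resume();   // mail failure: keep the worker running
               } else if (t instanceof Exception) {
                 return stop();     // any other exception: stop the worker
               } else {
                 return escalate(); // errors: hand over to the parent supervisor
               }
             }
           });
   @Override
   public void onReceive(Object message) {
     // Create a child worker for each incoming message and forward the message to it.
     getContext().actorOf(new Props(EmailServiceWorker.class)).tell(message, self());
   }
   @Override
   public SupervisorStrategy supervisorStrategy() {
     return strategy;
   }
 }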
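
The order of the instanceof checks in the strategy matters: MessagingException is itself an Exception, so it has to be tested first or it would never be reached. The dependency-free sketch below mirrors only the decision logic so that it can be run without Akka. The Directive enum and the nested MessagingException class are hypothetical stand-ins for the Akka and mail types, not the real ones.

```java
import java.io.IOException;

// Stand-alone sketch of the decision logic only; no Akka involved.
class SupervisionDecision {

    // Hypothetical mirror of the three directives used in the strategy.
    enum Directive { RESUME, STOP, ESCALATE }

    // Hypothetical stand-in for the mail library's MessagingException.
    static class MessagingException extends Exception { }

    static Directive decide(Throwable t) {
        if (t instanceof MessagingException) {
            return Directive.RESUME;   // most specific case first
        } else if (t instanceof Exception) {
            return Directive.STOP;     // every other exception
        } else {
            return Directive.ESCALATE; // Errors and other Throwables
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(new MessagingException())); // RESUME
        System.out.println(decide(new IOException("disk")));  // STOP
        System.out.println(decide(new OutOfMemoryError()));   // ESCALATE
    }
}
```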

Here is how the child worker is created.


 import akka.actor.UntypedActor;

 class EmailServiceWorker extends UntypedActor {
   @Override
   public void onReceive(Object message) throws Exception {
     // Let IOException and MessagingException propagate: an exception that is
     // caught here never reaches the supervisor strategy of EmailServiceActor.
     EmailService emailService = new EmailService();
     emailService.sendEmail();
   }
   @Override
   public void preStart() {
 //    getContext().system().scheduler().scheduleOnce(Duration.create(5, TimeUnit.SECONDS), self(), "emailWorker", getContext().system().dispatcher(), null);
   }
   @Override
   public void postStop() {
   }
 }


Sample application -  https://github.com/rajithd/email-service-akka

Install CouchDB 1.3 in Ubuntu 12.04 64bit

Here are the steps to install CouchDB 1.3 on Ubuntu 12.04 64-bit.
Step 1
Run the following commands to install the required dependencies.


 sudo apt-get install g++  
 sudo apt-get install erlang-base erlang-dev erlang-eunit erlang-nox  
 sudo apt-get install libmozjs185-dev libicu-dev libcurl4-gnutls-dev libtool  

Step 2
Then go to the CouchDB site and download the source.

Step 3
Extract it and run the following commands.

 ./configure  
 make  
 sudo make install  

Step 4
To start the CouchDB service, run the following command.

 sudo couchdb
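
Once the service is up, CouchDB answers HTTP requests with a small JSON welcome document. A rough way to check this from Java is sketched below. It assumes the default address http://127.0.0.1:5984/ (pass another URL as the first argument if your configuration differs), and CouchDbCheck, fetch and looksLikeCouchDb are hypothetical names.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical check: fetches the CouchDB root URL and looks for the welcome document.
class CouchDbCheck {

    // The welcome document is JSON with a "couchdb" key.
    static boolean looksLikeCouchDb(String body) {
        return body != null && body.contains("\"couchdb\"");
    }

    // Reads the whole response body of a GET request as a UTF-8 string.
    static String fetch(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Assumed default address; override with the first argument.
        String url = args.length > 0 ? args[0] : "http://127.0.0.1:5984/";
        try {
            String body = fetch(url);
            System.out.println(looksLikeCouchDb(body)
                    ? "CouchDB is answering: " + body.trim()
                    : "Got a response, but it does not look like CouchDB");
        } catch (Exception e) {
            System.out.println("Could not reach " + url + ": " + e);
        }
    }
}
```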