How to Set Up a Hadoop Cluster

In this tutorial I will show you the steps required to set up a multi-node Hadoop cluster using the Hadoop Distributed File System (HDFS) on Linux-based operating systems.

What is Hadoop?
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.

Source : http://en.wikipedia.org/wiki/Hadoop

The sections below guide you through the steps required to set up a multi-node cluster.

STEP 1
To set up Hadoop we need some prerequisites.

1. Download and configure the JDK
    Java 1.6.x is recommended.

2. Download Hadoop
    Download the latest stable release of Hadoop here

All the nodes must have the same version of the JDK and of Hadoop core.

STEP 2
Establish Authentication Among Nodes


When a user on node_A logs in to a remote node_B over SSH, node_B asks for a password. The master node has to run commands on the slave nodes all the time, so typing a password for every operation is impractical. To solve this we use public key authentication. Every node generates a public/private key pair, and node_A can log in to node_B without a password only if node_B holds a copy of node_A's public key. In a Hadoop cluster, every slave node must hold a copy of the master node's public key.

To do this, log in to each node and run the following command.
ssh-keygen -t rsa
When prompted, simply press Enter to accept the defaults. Two files, "id_rsa" and "id_rsa.pub", are created under /home/username/.ssh/

Now log in to the master node and run the following commands.

  • cat /home/username/.ssh/id_rsa.pub >> /home/username/.ssh/authorized_keys
  • scp /home/username/.ssh/id_rsa.pub ip_address_of_slavenode:/home/username/.ssh/master.pub
Then log in to each slave node and run the following command.
cat /home/username/.ssh/master.pub >> /home/username/.ssh/authorized_keys
Then log back in to the master node and run the following command to test whether the master node can log in to the slave node without a password.
ssh ip_address_of_slave_node

STEP 5
In this step we install Hadoop on each slave node. Download Hadoop, extract it to a directory, and set the HADOOP_INSTALL variable to that directory.

STEP 6
Hadoop Configuration

Set the JAVA_HOME and HADOOP_INSTALL system variables.

Modify "hadoop-env.sh" in $HADOOP_INSTALL/conf/. Under the comment "The java implementation to use", remove the leading '#' from the export JAVA_HOME line and fill in the appropriate path.

Modify hdfs-site.xml, mapred-site.xml and core-site.xml using the files from the link below.
Download Link : http://hotfile.com/dl/85903416/b760647/XMLs.tar.gz.html
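
The author's actual XML files are only available through the download link. As a rough sketch of the general shape of such files, the Java snippet below renders a Hadoop-style <configuration> document from name/value pairs. The property names fs.default.name, mapred.job.tracker and dfs.replication are assumed to be the ones relevant to a Hadoop release of this era; the host name "master", the ports and the replication factor are purely hypothetical values, not taken from the author's files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Renders a Hadoop-style *-site.xml document from name/value pairs. */
class SiteXml {

    // Values are written as-is; this sketch assumes they contain no XML special characters.
    static String render(Map<String, String> properties) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n<configuration>\n");
        for (Map.Entry<String, String> p : properties.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(p.getKey()).append("</name>\n")
               .append("    <value>").append(p.getValue()).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        // Hypothetical host name and ports; replace them with your own.
        Map<String, String> core = new LinkedHashMap<>();
        core.put("fs.default.name", "hdfs://master:9000");
        System.out.println("core-site.xml:\n" + render(core));

        Map<String, String> mapred = new LinkedHashMap<>();
        mapred.put("mapred.job.tracker", "master:9001");
        System.out.println("mapred-site.xml:\n" + render(mapred));

        Map<String, String> hdfs = new LinkedHashMap<>();
        hdfs.put("dfs.replication", "2");
        System.out.println("hdfs-site.xml:\n" + render(hdfs));
    }
}
```

Whatever values you choose, the same files should be present on every node so that the whole cluster agrees on where the namenode and the jobtracker live.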

STEP 7
Start Hadoop

First you have to format the namenode. To do this, run
hadoop namenode -format
Then start the cluster.
start-all.sh


Some useful links:
http://ip_add_of_namenode:50070 (NameNode web UI)
http://ip_add_of_jobtracker:50030 (JobTracker web UI)
http://ip_add_of_map_reduce:50060 (TaskTracker web UI)

HBase

HBase is a distributed, column-oriented database built on top of the Hadoop Distributed File System. HBase is the Hadoop application to use when you require real-time, random-access reads and writes to very large datasets.

More info : http://hbase.apache.org/

Installation


Download a stable release from the page above and extract it somewhere on your file system.
For Linux users
                 tar xzf hbase-x.y.z.tar.gz


First you have to set the JAVA_HOME environment variable.

After that, set the HBASE_INSTALL environment variable.
Linux users can edit .bash_profile or simply type the lines in the console.
                     export HBASE_INSTALL=/home/hbase/hbase-x.y.z
                     export PATH=$PATH:$HBASE_INSTALL/bin


To verify your installation, run the hbase command. It prints a list of options.

To start a temporary instance of HBase, go to your HBase bin directory and type start-hbase.sh.


This starts the HBase instance.


To administer your HBase instance, launch the HBase shell by typing hbase shell in the console. This brings up an interpreter that has some HBase-specific commands added to it.
For help, just type help in the shell to see the list of commands.

To create a table named test with a single column family named data, type
create 'test','data'
and press Enter.

To list the tables, use the list command.

To insert data into the table
put 'test','row1','data:1','val1'

To view the contents of the table
scan 'test'
To drop the table, you first have to disable it and then drop it.
disable 'test'
drop 'test'

Shut down the HBase instance
stop-hbase.sh


Using Java we can write a program to retrieve, insert, delete and update data in HBase tables.

First create the table as below.
create 'DemoInsert', 'details'


To insert data, create a class called InsertData:
import java.util.Random;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;


public class InsertData {

    public static void main(String[] args) throws Exception {
        HBaseConfiguration hbaseConfig = new HBaseConfiguration();
        HTable htable = new HTable(hbaseConfig, "DemoInsert");
        // Buffer puts on the client and send them in batches
        htable.setAutoFlush(false);
        htable.setWriteBufferSize(1024 * 1024 * 12);

        int totalRecords = 100000;
        int maxID = totalRecords / 1000;
        Random rand = new Random();
        System.out.println("importing " + totalRecords + " records ....");
        for (int i = 0; i < totalRecords; i++) {
            int userID = rand.nextInt(maxID) + 1;
            // Row key = user ID followed by the record counter
            byte[] rowkey = Bytes.add(Bytes.toBytes(userID), Bytes.toBytes(i));
            // Random page number, stored as text
            String randomPage = String.valueOf(rand.nextInt(1000));
            Put put = new Put(rowkey);
            put.add(Bytes.toBytes("details"), Bytes.toBytes("page"), Bytes.toBytes(randomPage));
            htable.put(put);
        }
        // Send any puts still held in the write buffer
        htable.flushCommits();
        htable.close();
        System.out.println("done");
    }
}

This is a basic example that shows how to insert data into a table from Java.
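
The row key in the example is built by joining two integers, the user ID and the record counter, into one byte array. The sketch below shows the same layout using only the JDK, on the assumption that each int is encoded as four big-endian bytes. RowKeySketch and its method names are made up for illustration and are not part of HBase.

```java
import java.nio.ByteBuffer;

/** Plain-JDK illustration of a composite row key made of two ints. */
class RowKeySketch {

    /** Concatenates two 4-byte big-endian ints into one 8-byte key. */
    static byte[] compositeKey(int userId, int sequence) {
        // ByteBuffer writes in big-endian order by default
        return ByteBuffer.allocate(8).putInt(userId).putInt(sequence).array();
    }

    /** Reads the user ID back from the first four bytes of the key. */
    static int userIdOf(byte[] key) {
        return ByteBuffer.wrap(key).getInt(0);
    }

    /** Reads the record counter back from the last four bytes of the key. */
    static int sequenceOf(byte[] key) {
        return ByteBuffer.wrap(key).getInt(4);
    }

    public static void main(String[] args) {
        byte[] key = compositeKey(42, 7);
        System.out.println("key length: " + key.length);
        System.out.println("user id:    " + userIdOf(key));
        System.out.println("sequence:   " + sequenceOf(key));
    }
}
```

Because HBase keeps rows sorted by the bytes of the row key, putting the user ID first means that all records for one user are stored next to each other.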

You can find more similar examples at : http://jgray.la/javadoc/hbase-0.20.0/org/apache/hadoop/hbase/client/package-summary.html







 

World's Smallest Projector to Date

iGo has introduced the UP-2020, a palm-sized pocket projector based on Digital Light Processing (DLP).


The projector has a native resolution of 854x480 and can project a viewable image of up to 70 inches.
It also has built-in media playback for MP4, JPG and BMP files, a micro-SD card slot, a USB port and a built-in speaker.









 

What is ZooKeeper?

ZooKeeper is a coordination service used to build general distributed applications.
In this tutorial we will look at how to build a sample distributed application.
First, look at the definition.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.

Source : http://hadoop.apache.org/zookeeper/

ZooKeeper gives us tools to handle partial failure. Partial failure means something like this.
Suppose we send a message across the network from one node to another. If the network fails, the sender does not know whether the receiver got the message. The only way to find out is to reconnect to the receiver and ask it. This is called partial failure.

ZooKeeper Characteristics
1. Simple
             ZooKeeper exposes a few simple operations, plus extra abstractions such as ordering and notifications.
2. Expressive
             Its primitives can be used to build a large class of coordination data structures and protocols.
3. Highly available
             It runs on a collection of machines and is designed to be highly available.
4. Loosely coupled interactions
             ZooKeeper participants do not need to know about one another.

Installing ZooKeeper

Here are the steps to install ZooKeeper.
1. ZooKeeper requires Java 6 or later. You can find the latest Java version at http://www.oracle.com/technetwork/java/javase/downloads/index.html

2. After installing the JDK, set the path.
   Windows : Set the path in the environment variables
   Unix : Open a terminal and type vi ~/.bash_profile, then add export JAVA_HOME=/path/to/java/dir and export PATH=$PATH:$JAVA_HOME/bin

3. Download ZooKeeper from http://hadoop.apache.org/zookeeper/releases.html

4. Unzip it and set the ZooKeeper environment variable.

5. Before running the ZooKeeper service you have to create the file conf/zoo.cfg with the following contents
   tickTime=2000
   dataDir=/path/zookeeper/dir
   clientPort=2181

6. Now everything is ready to start the ZooKeeper server
    zkServer.sh start

7. Use zkServer.sh status to check whether ZooKeeper is running.
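
The zoo.cfg file from step 5 uses the plain key=value layout of a Java properties file, so it can be sanity-checked with java.util.Properties before the server is started. The following is a minimal sketch under that assumption; ZooCfgCheck is a made-up helper for this tutorial and not part of ZooKeeper.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Parses zoo.cfg-style text and reads back the client port. */
class ZooCfgCheck {

    static Properties parse(String text) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cfg;
    }

    /** Port that clients such as the Java examples connect to. */
    static int clientPort(Properties cfg) {
        // trim() guards against trailing spaces after the value
        return Integer.parseInt(cfg.getProperty("clientPort").trim());
    }

    public static void main(String[] args) {
        String text = "tickTime=2000\n"
                    + "dataDir=/path/zookeeper/dir\n"
                    + "clientPort=2181\n";
        Properties cfg = parse(text);
        System.out.println("dataDir    = " + cfg.getProperty("dataDir"));
        System.out.println("clientPort = " + clientPort(cfg));
    }
}
```

The clientPort value is the port to use in the host string passed to the example programs, for instance localhost:2181 when the server runs on the same machine.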

Group Membership



Create a Group

import java.io.IOException;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class CreateGroup implements Watcher {
    private static final int SESSION_TIMEOUT = 5000;
    private ZooKeeper zk;
    private CountDownLatch connectedSignal = new CountDownLatch(1);

    public void connect(String hosts) throws IOException, InterruptedException {
        zk = new ZooKeeper(hosts, SESSION_TIMEOUT, this);
        // Block until process() reports that the connection is established
        connectedSignal.await();
    }

    @Override
    public void process(WatchedEvent event) { // Watcher interface
        if (event.getState() == KeeperState.SyncConnected) {
            connectedSignal.countDown();
        }
    }

    public void create(String groupName) throws KeeperException,
            InterruptedException {
        String path = "/" + groupName;
        String createdPath = zk.create(path, null/*data*/, Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT);
        System.out.println("Created " + createdPath);
    }

    public void close() throws InterruptedException {
        zk.close();
    }

    public static void main(String[] args) throws Exception {
        CreateGroup createGroup = new CreateGroup();
        createGroup.connect(args[0]);
        createGroup.create(args[1]);
        createGroup.close();
    }
}

Zookeeper API : http://hadoop.apache.org/zookeeper/docs/r3.2.1/api/index.html

Join the Group

import java.io.IOException;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooKeeper;

public class ConnectionWatcher implements Watcher {
    private static final int SESSION_TIMEOUT = 5000;
    protected ZooKeeper zk;
    private CountDownLatch connectedSignal = new CountDownLatch(1);

    public void connect(String hosts) throws IOException, InterruptedException {
        zk = new ZooKeeper(hosts, SESSION_TIMEOUT, this);
        connectedSignal.await();
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getState() == KeeperState.SyncConnected) {
            connectedSignal.countDown();
        }
    }

    public void close() throws InterruptedException {
        zk.close();
    }
}


import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs.Ids;

public class JoinGroup extends ConnectionWatcher {
    public void join(String groupName, String memberName) throws KeeperException,
            InterruptedException {
        String path = "/" + groupName + "/" + memberName;
        // The member znode is ephemeral, so it disappears when this process exits
        String createdPath = zk.create(path, null/*data*/, Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL);
        System.out.println("Created " + createdPath);
    }

    public static void main(String[] args) throws Exception {
        JoinGroup joinGroup = new JoinGroup();
        joinGroup.connect(args[0]);
        joinGroup.join(args[1], args[2]);
        // stay alive until process is killed or thread is interrupted
        Thread.sleep(Long.MAX_VALUE);
    }
}

Retrieve Members in a Group


import java.util.List;

import org.apache.zookeeper.KeeperException;

public class ListGroup extends ConnectionWatcher {
    public void list(String groupName) throws KeeperException,
            InterruptedException {
        String path = "/" + groupName;
        try {
            List<String> children = zk.getChildren(path, false);
            if (children.isEmpty()) {
                System.out.printf("No members in group %s\n", groupName);
                System.exit(1);
            }
            for (String child : children) {
                System.out.println(child);
            }
        } catch (KeeperException.NoNodeException e) {
            System.out.printf("Group %s does not exist\n", groupName);
            System.exit(1);
        }
    }

    public static void main(String[] args) throws Exception {
        ListGroup listGroup = new ListGroup();
        listGroup.connect(args[0]);
        listGroup.list(args[1]);
        listGroup.close();
    }
}

Znodes can be one of two types: persistent or ephemeral. The type is set at creation time and cannot be changed later. An ephemeral znode is deleted when the session of the client that created it ends, for example when the client exits the application. Ephemeral znodes cannot have children. A persistent znode is not deleted when the client's session ends or the client exits the application.
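
The lifetime rules above can be tried out without a running server. The following is a toy in-memory model written for this tutorial. It is not the ZooKeeper API, and ToyZnodeStore and all of its method names are made up. It assumes every path starts with "/".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of znode lifetimes; not the ZooKeeper API. */
class ToyZnodeStore {
    // Maps each path to the session that owns it; null marks a persistent znode.
    private final Map<String, Long> owners = new HashMap<>();

    void createPersistent(String path) {
        requireNonEphemeralParent(path);
        owners.put(path, null);
    }

    void createEphemeral(String path, long sessionId) {
        requireNonEphemeralParent(path);
        owners.put(path, sessionId);
    }

    /** Ending a session removes every ephemeral znode that session created. */
    void closeSession(long sessionId) {
        owners.values().removeIf(owner -> owner != null && owner == sessionId);
    }

    boolean exists(String path) {
        return owners.containsKey(path);
    }

    // Ephemeral znodes cannot have children. (The toy does not check that the parent exists.)
    private void requireNonEphemeralParent(String path) {
        String parent = path.substring(0, path.lastIndexOf('/'));
        if (owners.get(parent) != null) {
            throw new IllegalStateException("ephemeral znode cannot have children: " + parent);
        }
    }

    public static void main(String[] args) {
        ToyZnodeStore store = new ToyZnodeStore();
        store.createPersistent("/zoo");          // the group, like CreateGroup
        store.createEphemeral("/zoo/duck", 1L);  // a member, like JoinGroup
        store.closeSession(1L);                  // the member's client goes away
        System.out.println(store.exists("/zoo"));      // true
        System.out.println(store.exists("/zoo/duck")); // false
    }
}
```

This is why the group examples create the group as a persistent znode and each member as an ephemeral one: the list of children then always reflects the members whose sessions are still alive.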

Watches

Watches allow a client to get a notification when a znode changes in some way. Watches are set by operations on the ZooKeeper service and are triggered by other operations on the service.
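
To make the idea concrete, here is a toy model of a watch on a single znode, again written for this tutorial and not the ZooKeeper API (ToyWatchedNode and its methods are made-up names). It assumes the behaviour ZooKeeper documents for watches: a read operation registers the watch, a later write triggers it, and a watch fires only once.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a one-shot watch on a single znode; not the ZooKeeper API. */
class ToyWatchedNode {
    private String data = "";
    private final List<Runnable> watches = new ArrayList<>();

    /** A read operation: returns the data and, if a watch is given, registers it. */
    String getData(Runnable watch) {
        if (watch != null) {
            watches.add(watch);
        }
        return data;
    }

    /** A write operation: changes the data and fires each pending watch once. */
    void setData(String newData) {
        data = newData;
        List<Runnable> toFire = new ArrayList<>(watches);
        watches.clear(); // fired watches are discarded; a client re-registers by reading again
        for (Runnable watch : toFire) {
            watch.run();
        }
    }

    public static void main(String[] args) {
        ToyWatchedNode node = new ToyWatchedNode();
        node.getData(() -> System.out.println("znode changed"));
        node.setData("v1"); // fires the watch
        node.setData("v2"); // no watch is registered any more, so nothing is printed
    }
}
```

A client that wants to keep following a znode therefore has to set a new watch each time it is notified.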

Useful Links
http://hadoop.apache.org/zookeeper/docs/current/
http://hadoop.apache.org/zookeeper/docs/r3.2.1/api/index.html







            

What is Failover and Clustering?

First we go to the definition of failover.
Failover is the capability to switch over automatically to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network.

source : http://en.wikipedia.org/wiki/Failover
Put simply, failover works like this. When a client requests something from a server and that server has failed, the request automatically goes to another server, which sends the response to the client. In the background, the system keeps checking whether the failed server has recovered. Once it has recovered, client requests are automatically redirected back to it.

The main advantages are availability and a high degree of reliability.

There are different types of failovers.

source : http://knol.google.com/k/failover

Some failovers are not automatic and require manual intervention.
In cold failover, the switch must be done manually.
Warm failover switches to another server automatically, but the transaction in progress may abort because the data is not fully synchronized.
Hot failover is the most reliable kind, because it switches to another server automatically with the data 100% synchronized. So the client sees no failure.

In practice failover matters to all of us. Suppose we are streaming video or audio, making a VoIP call, or downloading a file from the internet. If the server fails, the connection between the server and the client is dropped, so the whole operation aborts. But with failover capability, the request is automatically redirected to another server and the operation continues.
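
The redirect step can be sketched in a few lines of Java. This is only an illustration of the idea under simple assumptions: each server is modelled as a function that either answers or throws, and the servers are tried in order. FailoverClient is a made-up name, and real failover systems also deal with health checks and data synchronization, which are left out here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Tries each server in order and returns the first successful response. */
class FailoverClient {

    static <Q, R> R call(List<Function<Q, R>> servers, Q request) {
        RuntimeException lastFailure = null;
        for (Function<Q, R> server : servers) {
            try {
                return server.apply(request);
            } catch (RuntimeException e) {
                lastFailure = e; // this server failed, so fall through to the next one
            }
        }
        throw new IllegalStateException("all servers failed", lastFailure);
    }

    public static void main(String[] args) {
        // Hypothetical servers: the primary is down, the standby answers.
        Function<String, String> primary = req -> {
            throw new RuntimeException("primary is down");
        };
        Function<String, String> standby = req -> "standby answered: " + req;

        List<Function<String, String>> servers = Arrays.asList(primary, standby);
        System.out.println(call(servers, "GET /video"));
    }
}
```

The caller never sees the failure of the primary server; it only sees an error if every server in the list has failed.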
So that's all about failover.

Now we will look into clustering.
Here is the definition.
A computer cluster is a group of linked computers, working together closely thus in many respects forming a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks.

source : http://en.wikipedia.org/wiki/Computer_cluster

Clustering means a group of computers working together to provide some enterprise service.
Clusters can be built with redundancy and failover capability.

All clusters provide two main advantages:
scalability and high availability.
Scalability means the application can support an increasing number of users, because clusters provide extra capacity when new servers are added.


Cluster Categorization
High Availability Clusters
Load Balancing Clusters
Compute Clusters

There are two types of cluster architecture.
1. Shared nothing
Every application server has its own copy of the files, so updating and maintaining the files is very hard.
2. Shared disk
A shared storage disk contains the files, and the application servers all use those files. So updating and maintaining the files is easy.

Create a Setup Application for Your Project Using Microsoft Visual Studio 2008

To create a setup project for your application you can use the Visual Studio 2008 Setup Wizard. In this article I show you how to do this. This is the simplest way of doing it, but if you want to add more features you can also customize it.

The Setup Wizard creates two files
  • Setup.exe - includes a bootstrapper that checks for prerequisites
  • <application>.msi
You can also customize the setup project properties.
To do this you must already have an application open. For the demo, just create a simple application. In these examples I created an email sender application that sends emails.

Select File -> Add New Project, then Other Project Types -> Setup and Deployment -> Setup Wizard.
Name the project and click OK.

The Setup Wizard opens. Click Next.

Select "Create a setup for a Windows application" and click Next.

Tick "Primary output from email" and click Next.


Here you can add additional files to your setup, such as a Readme file. After adding additional files, click Next to continue.

Click Finish.

Now you can see the setup project's files in the Solution Explorer.

In the Properties window you can change the author, the company name and many other things.


Now in the menu select Build -> Build Setup (your setup name). In my project it is Build -> Build Email Setup.

After that go to your project directory, where you can see the setup folder (named after your setup project). Inside that folder, open the Debug folder. There you can see the .exe file and the .msi file. Double-click setup.exe and it will install the application on your computer.



Now go to the default installation directory and you can see the application's .exe file along with the additional files that you added to the installer.


 
You can uninstall the application from the Control Panel, or simply right-click the setup project and select Uninstall.

 

Build a Mobile Database Application Using Microsoft Visual Studio 2008 in Just 5 Minutes - Video Tutorial

Create a Windows Mobile Database Application Using Microsoft Visual Studio 2008

In this article I'll show you how to create a Windows Mobile device application using Visual Studio 2008.
But first you need to know some basics.
Follow this link for an introduction:
http://dummiestutorials.blogspot.com/2010/09/create-simple-windows-mobile.html

Before creating the application you also need to know some basics about SQL Server Compact, because it is the database you will use.

SQL Server Compact 3.5

  • Data is stored in a .sdf file
  • You create and manage the database using SQL Server Management Studio or Visual Studio
  • No stored procedures or views

First launch Visual Studio 2008, select New -> Project -> Smart Device, name the project MobileDatabaseApp and click OK.

Select Windows Mobile 6 Professional SDK as the target platform, select Device Application and click OK.


After that you need to create a database file. To do that, select Data -> Add New Data Source in the menu to open the Data Source Configuration Wizard.


Click Next. In the Data Source Configuration Wizard select New Connection, and the Add Connection window appears. Choose Microsoft SQL Server Compact 3.5 (.NET Framework) as the data source.
Here you can create a database or browse to an existing one. I chose Browse because I use the sample Northwind database.




After choosing the database click Test Connection. If it reports that the test connection succeeded, you have configured the database successfully. Otherwise it shows an error message.


After that click Next. Here you can select the tables that you want. For the demo I selected the Customers table and clicked Finish. When it asks whether you want to copy the database into your project, select Yes.
The Northwind database now appears in the Solution Explorer.

Now simply drag and drop the Customers table from the Data Sources window onto Form1. After that open the grid's properties (right-click the grid -> Properties).
In the properties select TableStyles and click the (...) button.

In the editor that opens, click the (...) button next to GridColumnStyles, remove the members you do not need, keep the members you want, and click OK.


After that select Generate Data Forms.

Finally you get the form, and Visual Studio 2008 automatically creates two additional forms for you.

Click the debug button to see the output.

You can download the sample project application here:
http://hotfile.com/dl/70497068/721f746/MobileDatabaseApp.rar.html

Create a Simple Windows Mobile Application Using Microsoft Visual Studio 2008

In this article I'll teach you how to create a simple Windows Mobile application using Visual Studio 2008.
But first you have to install the following software to develop mobile applications.
Windows SDK
Windows DTK 

First launch Visual Studio 2008 and select New -> Project.
Under the Visual C# project types you can see Smart Device. Click it and select Smart Device Project. Enter a name for your project and click OK.

After that select Device Application under the device templates and click OK.


The project opens with an empty form in the designer.


Simply drag and drop a Label, a TextBox and a Button onto the design area and change their names and text as follows.
Label name - lblText , Text - empty
TextBox name - txtText
Button name - btnclick ,  Text - Click!


Now double-click the button to go to the form's source code. Make the btnclick_Click() handler look like this.
private void btnclick_Click(object sender, EventArgs e)
{
    // Greet the user with the text typed into the text box
    lblText.Text = "Hi " + txtText.Text;
}
After that right-click the project and select Build. If there are no errors it reports that the build succeeded.
Now click the debug button to see the output.

It takes a little while to compile and run the application, so you have to wait. Once the application is running you can try it with sample inputs.

You can download the sample project here:
http://hotfile.com/dl/70330219/2a5960d/MobileApp.rar.html

Read an XML File from C# Using Microsoft Visual Studio 2008

The title says what I'm going to teach you. But first you need to know some theory about XML and C#.
  • The System.Xml.XmlDocument class provides XML handling
  • There are two main .NET APIs for working with XML
  1. Stream-based XML handling - XmlReader and XmlWriter
  2. Tree-based XML handling - the XML Document Object Model (DOM)
  • With the DOM, the XML data is loaded into memory.
  • You can search for any node.
What is the XML DOM?
  • It provides a standard mechanism for working with XML data programmatically.
Read XML data
  • XML data is loaded into an XmlDocument object
  • So first we have to create an XmlDocument object.

XML data looks like this...
<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

The document contains a root node and multiple child nodes (the root node is bookstore).
Each node has a parent node (except the root node).
Each node can have siblings (except the root node).
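
The tree model is the same in every DOM implementation. As a small, language-neutral illustration of root, child and parent nodes, the sketch below parses a shortened version of the bookstore document with Java's standard DOM API (the language of the earlier examples on this page), while the tutorial's own project that follows is written in C#. XmlTreeSketch is a made-up class name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Walks a small XML document to show root, child and parent nodes. */
class XmlTreeSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the bookstore document shown above
        String xml = "<bookstore>"
                + "<book category=\"CHILDREN\"><title>Harry Potter</title></book>"
                + "<book category=\"WEB\"><title>Learning XML</title></book>"
                + "</bookstore>";
        Document doc = parse(xml);

        Element root = doc.getDocumentElement();
        System.out.println("root node: " + root.getNodeName());

        NodeList books = root.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            String title = book.getElementsByTagName("title").item(0).getTextContent();
            System.out.println(title + " [" + book.getAttribute("category") + "]"
                    + ", parent: " + book.getParentNode().getNodeName());
        }
    }
}
```

The two book elements are siblings of each other, and both report bookstore as their parent.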

If you want to learn XML in more depth, go to w3schools.com. There are a lot of tutorials available.

OK, let's do this. Now we are going to create a new project in Microsoft Visual Studio 2008 to read a simple XML file.
First launch Visual Studio 2008 and select New Project.

After that select Visual C# Windows Forms Application, give the project a name and save it wherever you want.


Right-click the project and select Add -> New Folder. Rename the folder to XML.

Right-click the XML folder that you created and select Add -> New Item. Select XML File, name it BookStore.xml and click Add.


Copy and paste this XML code into the BookStore.xml file.

<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
  <book publicationdate="2000" ISBN="212">
    <title>C#</title>
    <author>Author 1</author>
    <price>5000</price>
  </book>
  <book publicationdate="2009" ISBN="312">
    <title>Java for dummies</title>
    <author>Author 2</author>
    <price>12000</price>
  </book>
  <book publicationdate="2008" ISBN="412">
    <title>ASP.NET</title>
    <author>Author 3</author>
    <price>6000</price>
  </book>
</bookstore>

Now right-click Form1.cs and select View Designer. Add a button, name it btnLoad and set its Text to Load Data.
Also drag and drop a TextBox, click the small arrow in its upper right corner and tick MultiLine. Change its name to txtResult.



Double-click the button and it will automatically take you to the code.
First you need two using directives and a field.

using System.Xml;
using System.IO;

private string _xmlFile;

After that go to the design view and double-click the form. It will automatically go to the code and create the Form1_Load event handler. In Form1_Load paste this code.
if (!SetxmlFilePath())
{
    MessageBox.Show("Unable to load XML file.");
}
Now create the private bool SetxmlFilePath() method and paste this code inside that method.
// The XML folder sits two levels above the build output directory
FileInfo fi = new FileInfo(Application.StartupPath + @"\..\..\XML\BookStore.xml");
if (fi.Exists)
{
    _xmlFile = fi.FullName;
}
else
{
    return false;
}
return true;
Now go to the design view and double-click the Load Data button to get the btnLoad_Click handler. Inside it paste this code.
DumpContents(_xmlFile);
Now create the private void DumpContents(string _xmlFile) method and paste this code inside that method.
string publishDate = null;
string title = null;
string author = null;
string ISBN = null;
string price = null;
XmlElement element = null;

XmlDocument doc = new XmlDocument();
doc.Load(_xmlFile);

// Visit every <book> element in the document
foreach (System.Xml.XmlNode node in doc.SelectNodes("//book"))
{
    // Child elements
    title = GetNodeValue(node, "title");
    author = GetNodeValue(node, "author");
    price = GetNodeValue(node, "price");

    // Attributes of the <book> element itself
    element = (XmlElement)node;
    publishDate = element.GetAttribute("publicationdate");
    ISBN = element.GetAttribute("ISBN");

    DisplayBook(title, author, publishDate, ISBN, price);
}
Create this method.
private void DisplayBook(string title, string author, string publishDate, string ISBN, string price)
{
    string results = string.Format("{0} ({1}) by {2} (ISBN: {3}), ${4}",
        title, publishDate, author, ISBN, price);
    AddText(results);
}
After that create the AddText method like this.
private void AddText(string Text)
{
    txtResult.AppendText(Text + Environment.NewLine);
}
Finally create this method to get the value of a child node.
private string GetNodeValue(XmlNode parentNode, string nodeName)
{
    // Stays null when the parent has no child element with that name
    string retval = null;

    XmlNode node = parentNode.SelectSingleNode(nodeName);
    if (node != null)
    {
        retval = node.InnerText;
    }
    return retval;
}
If you have done everything correctly, right-click the project and select Build. If there are any compilation errors you have to fix them first.
If there are no errors, click the debug button. When the form appears, click the Load Data button and the result text box shows the contents of the XML file.



You can download the sample project here:
http://hotfile.com/dl/70230733/5270b8d/ReadXML.rar.html