
How To Set Up a Hadoop Cluster

In this tutorial I will show you the steps required to set up a multi-node Hadoop cluster using the Hadoop Distributed File System (HDFS) on Linux-based operating systems.

What is Hadoop?
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license.[1] It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.

Source : http://en.wikipedia.org/wiki/Hadoop

The steps below walk you through setting up a multi-node cluster.

STEP 1
To set up Hadoop we need some prerequisites.

1. Download and configure the JDK
    Java 1.6.x is recommended.

2. Download Hadoop
    Download the latest stable release of Hadoop.

All the nodes must have the same version of the JDK and of Hadoop core.
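A quick way to catch a version mismatch is to collect one version string per node and compare them. The function below is only a sketch under my own assumptions: the name check_versions, the "node version" input format, and the node names and version numbers in the usage line are all hypothetical, and how you collect the strings (for instance by running "hadoop version" over SSH on each node) is up to you.

```shell
#!/bin/sh
# Report every node whose version differs from the first node's version.
# Input (stdin): one "node version" pair per line.
# Exit status is 0 when all versions agree and 1 otherwise.
check_versions() {
    awk '
        NR == 1   { ref = $2 }
        $2 != ref { print $1 " has " $2 ", expected " ref; bad = 1 }
        END       { exit bad + 0 }
    '
}
```

Usage with made-up data: printf '%s\n' 'master 1.2.1' 'slave1 1.2.1' 'slave2 1.0.4' | check_versions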

STEP 2
Establish Authentication among nodes


Suppose a user on node_A wants to log in to a remote node_B using SSH. By default, node_B asks for a password for authentication. Entering a password every time the master node needs to operate on a slave node is impractical, so we use public key authentication instead. Every node generates a key pair, one public key and one private key, and node_A can log in to node_B without a password only if node_B has a copy of node_A's public key. In a Hadoop cluster, all the slave nodes must have a copy of the master node's public key.

To do this, log in to each node and run the following command.
ssh-keygen -t rsa
When prompted, simply press Enter to accept the defaults. Two files, "id_rsa" and "id_rsa.pub", are then created under /home/username/.ssh/

Now log in to the master node and run the following commands.

  • cat /home/username/.ssh/id_rsa.pub >> /home/username/.ssh/authorized_keys
  • scp /home/username/.ssh/id_rsa.pub ip_address_of_slavenode:/home/username/.ssh/master.pub
Then log in to each slave node and run the following command.
cat /home/username/.ssh/master.pub >> /home/username/.ssh/authorized_keys
Then log back in to the master node and run the following command to test whether the master node can log in to a slave node without a password.
ssh ip_address_of_slave_node
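If the password prompt still appears, two common causes are a key appended more than once or in the wrong file, and permissions on the .ssh directory that sshd considers too open. The helper below is a sketch, not part of the original steps: the function name authorize_key is my own, and the 700/600 permissions are the conventional values that usually satisfy sshd's default StrictModes check.

```shell
#!/bin/sh
# Append a public key to authorized_keys unless it is already listed,
# then tighten the permissions on the directory and the file.
# Usage: authorize_key <public_key_file> <ssh_dir>
authorize_key() {
    pub_key_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -x: whole-line match, -F: fixed strings, -f: read patterns from file
    if ! grep -qxFf "$pub_key_file" "$auth_file"; then
        cat "$pub_key_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

On a slave node this could be called as: authorize_key /home/username/.ssh/master.pub /home/username/.ssh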

STEP 5
In this step we have to install Hadoop on each slave node. Download Hadoop, extract it to a directory, and set the HADOOP_INSTALL variable to that directory.
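The extract-and-set step might look like the sketch below. The function name unpack_hadoop, the tarball name, and the target directory are placeholders of mine; the function only prints the export lines so you can decide where to put them (for example in your shell profile).

```shell
#!/bin/sh
# Unpack a Hadoop release tarball into a target directory and print the
# export lines for HADOOP_INSTALL and PATH.
# Usage: unpack_hadoop <tarball> <target_dir>
unpack_hadoop() {
    tarball=$1
    target=$2
    mkdir -p "$target"
    # --strip-components=1 drops the top-level directory inside the archive
    tar -xzf "$tarball" -C "$target" --strip-components=1
    echo "export HADOOP_INSTALL=$target"
    # Single quotes keep the variables unexpanded in the printed line.
    echo 'export PATH=$PATH:$HADOOP_INSTALL/bin'
}
```

For example: unpack_hadoop hadoop-x.y.z.tar.gz /home/username/hadoop (both arguments are hypothetical).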

STEP 6
Hadoop Configuration

Set the JAVA_HOME and HADOOP_INSTALL system variables.

Modify "hadoop-env.sh" in HADOOP_HOME/conf/. Find the JAVA_HOME line under the comment "The java implementation to use", delete the leading '#' to uncomment it, and fill in the appropriate path to your JDK.
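If you have many nodes, the same edit can be scripted. This is a sketch assuming the file contains a line of the form "# export JAVA_HOME=...", as the stock hadoop-env.sh of this era does; the function name set_java_home and the JDK path in the usage line are hypothetical.

```shell
#!/bin/sh
# Uncomment the JAVA_HOME line in hadoop-env.sh and point it at a JDK.
# Usage: set_java_home <path/to/hadoop-env.sh> <jdk_path>
set_java_home() {
    env_file=$1
    jdk_path=$2
    tmp_file="$env_file.tmp"
    # Match "export JAVA_HOME=..." with or without a leading "#",
    # write the result to a temporary file, then replace the original.
    sed "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=$jdk_path|" "$env_file" > "$tmp_file" &&
        mv "$tmp_file" "$env_file"
}
```

For example: set_java_home HADOOP_HOME/conf/hadoop-env.sh /usr/lib/jvm/java-6-sun (substitute your real paths).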

Modify hdfs-site.xml, mapred-site.xml, and core-site.xml to match the sample files available from the link below.
Download link: http://hotfile.com/dl/85903416/b760647/XMLs.tar.gz.html
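In case the download is unavailable, here is a sketch that writes minimal versions of the three files. These are not the files from the link, whose contents I cannot reproduce here. The property names (fs.default.name, dfs.replication, mapred.job.tracker) are the Hadoop 1.x-style names that match the JobTracker-based layout used in this tutorial; the function name write_site_files, the ports 9000 and 9001, and the replication factor 2 are my own illustrative choices.

```shell
#!/bin/sh
# Write minimal Hadoop 1.x-style site files into a conf directory.
# Usage: write_site_files <conf_dir> <master_host>
# Ports 9000/9001 and replication factor 2 are illustrative assumptions.
write_site_files() {
    conf_dir=$1
    master=$2

    # Where clients find the NameNode.
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$master:9000</value>
  </property>
</configuration>
EOF

    # How many copies of each block HDFS keeps.
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF

    # Where TaskTrackers find the JobTracker.
    cat > "$conf_dir/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$master:9001</value>
  </property>
</configuration>
EOF
}
```

For example: write_site_files HADOOP_HOME/conf ip_address_of_masternode. The same files would normally be copied to every node so that all of them point at the same master.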

STEP 7
Start Hadoop

First you have to format the NameNode. To do this, run
hadoop namenode -format
Then start the cluster.
start-all.sh
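To confirm that the daemons actually came up, you can list the running Java processes with the JDK's jps tool and look for the expected names. The helper below is a sketch: the function name missing_daemons is my own, and the daemon lists in the usage lines assume the usual Hadoop 1.x layout (NameNode, SecondaryNameNode, and JobTracker on the master; DataNode and TaskTracker on each slave).

```shell
#!/bin/sh
# Read `jps` output on stdin and print each expected daemon that is absent.
# Usage: jps | missing_daemons <daemon_name>...
# Exit status is 0 when every daemon is present and 1 otherwise.
missing_daemons() {
    listing=$(cat)
    status=0
    for daemon in "$@"; do
        # jps prints "<pid> <class name>" per line; compare the name exactly
        # so that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            status=1
        fi
    done
    return $status
}
```

On the master: jps | missing_daemons NameNode SecondaryNameNode JobTracker
On a slave: jps | missing_daemons DataNode TaskTracker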


Some useful links.
http://ip_add_of_namenode:50070 (NameNode web interface)
http://ip_add_of_jobtracker:50030 (JobTracker web interface)
http://ip_add_of_map_reduce:50060 (TaskTracker web interface)
