EWU Institutional Repository

Hadoop Cluster Implementation

dc.contributor.author Sayed, Aysha Binta
dc.date.accessioned 2017-10-04T06:30:06Z
dc.date.available 2017-10-04T06:30:06Z
dc.date.issued 2017-04-22
dc.identifier.uri http://dspace.ewubd.edu/handle/2525/2346
dc.description This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering of East West University, Dhaka, Bangladesh. en_US
dc.description.abstract Data-driven science is an interdisciplinary field concerned with gathering, processing, managing, and analyzing unstructured data in order to extract its inherent meaning and express it as structured information. That information can then be used in many practical applications to solve real-life problems. Hadoop is an open-source data science tool that can process large data sets in a distributed manner across a cluster of computers (a single server and several worker machines). Hadoop allows several tasks to run in parallel and processes large amounts of complex data efficiently with respect to time, performance, and cost. Learning Hadoop and its sub-modules is therefore important. This project covers the implementation of a Hadoop cluster with SSH public key authentication for processing large volumes of data, using cheap, readily available personal computer hardware (Intel/AMD-based PCs) and freely available open-source software (Ubuntu Linux, Apache Hadoop, etc.). In addition, MapReduce- and YARN-based distributed applications are ported to the cluster and used to test that it works. en_US
dc.language.iso en_US en_US
dc.publisher East West University en_US
dc.relation.ispartofseries ;00128 CSE
dc.subject Hadoop Cluster Implementation en_US
dc.title Hadoop Cluster Implementation en_US
dc.type Thesis en_US

