Energy efficient technique for Hadoop MapReduce cluster management
Authors: Alalawi, Manal Tawalai
Hadoop MapReduce clusters
MapReduce performance model
data intensive applications
Subject Categories: G490 Computing Science not elsewhere classified
Abstract: Big data analytics, with datasets of terabyte and petabyte size, is now a reality for businesses. A widely used solution for data centres is the MapReduce model on open‐source Hadoop. Many organisations processing real‐time data of this magnitude rely on the Hadoop MapReduce model, and the massive increase in data generation means that even small to medium enterprises (SMEs) have a requirement for big data analysis. The business insights gained from this real‐time data analysis are vital in the modern world, and although this can be outsourced to data centres, SMEs will be more sustainable if they can do this for themselves. However, the increase in the amount of data has resulted in a corresponding increase in the amount of energy used for processing. The need to minimise the use of energy, both in terms of cost and ecology, is the main rationale behind this research, and energy‐efficiency will be the key to sustainability in the twenty‐first century. The initial categorisation of energy‐efficient methods for Hadoop components has been the starting point for a comparative evaluation in this research. The research has used Hadoop MapReduce performance modelling in a series of mathematical analyses and experimental tests, and these have led to the identification and design of an energy‐efficient model. This proposed model uses a novel method of data partitioning using virtual chunks. The idea is that, rather than accessing the entire data file, the system accesses blocks, or chunks, of data that are virtually linked. The accuracy and efficiency of the proposed design have been evaluated mathematically and the results presented graphically, and the method has been shown to minimise the processing time and complete the different data operations. This reduction of processing time has resulted in minimising the I/O bottleneck of workload applications, thus reducing the amount of energy needed for processing big data.
This improved energy‐efficiency can be maintained for datasets of all sizes and in multiple applications. The results of this research are transferable and can be used by SMEs of any kind in any area of business.
Citation: Alalawi, M.T. (2020) 'Energy Efficient Technique for Hadoop MapReduce Cluster Management'. PhD thesis. University of Bedfordshire.
Publisher: University of Bedfordshire
Type: Thesis or dissertation
Description: A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy.
The following license files are associated with this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International