Research Area:  Cloud Computing
The stunning growth in data has had an immense impact on organizations. Their infrastructure and traditional data management systems cannot keep up with the scale of Big Data. They must either invest heavily in their infrastructure or move their Big Data analytics to the Cloud, where they can benefit from both on-demand scalability and contemporary data management techniques. However, to make Cloud-hosted Big Data analytics available to a wider range of enterprises, we have to carefully capture their preferences in terms of budget and service level objectives. Therefore, this study proposes an SLA- and cost-aware resource provisioning and task scheduling approach tailored for Big Data applications in the Cloud. Current approaches assume that data is pre-stored on cluster nodes prior to the deployment of Big Data applications. In addition, they focus purely on task scheduling and not on virtual machine provisioning. We argue that these assumptions do not hold in the Cloud computing context, because nodes are provisioned dynamically (so data cannot be pre-stored), and leaving provisioning to the user may lead to under- or over-provisioning, either of which can cause SLA or budget constraint violations. Therefore, in this study we first model the user request, which consists of Big Data analytics jobs with a budget and a deadline. Then, we model the infrastructure as a list of data centers, virtual machines (offered in a pay-as-you-go model), data sources, and network throughputs. Finally, to address the aforementioned issues, we propose and compare cost-aware and SLA-based algorithms that provision cloud resources and schedule analytics tasks.
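The modeling described in the abstract (a request with a budget and a deadline, pay-as-you-go virtual machines, and data that must first travel over the network) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class names, fields, per-started-hour billing, linear scaling across VMs, and the exhaustive search are all assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class VmType:
    # Hypothetical pay-as-you-go VM offering (all fields are assumptions).
    name: str
    price_per_hour: float          # charged per started hour, per VM
    throughput_gb_per_hour: float  # data one VM can process per hour


@dataclass
class Request:
    # User request: one analytics job with a budget and a deadline.
    data_size_gb: float
    budget: float
    deadline_hours: float


def plan(request, vm, vm_count, network_gb_per_hour):
    """Return (hours, cost) for running the job on vm_count identical VMs.

    Data is not pre-stored on the nodes, so the transfer time from the
    data source is added before processing (assumed: no overlap).
    """
    transfer = request.data_size_gb / network_gb_per_hour
    compute = request.data_size_gb / (vm.throughput_gb_per_hour * vm_count)
    hours = transfer + compute
    cost = math.ceil(hours) * vm.price_per_hour * vm_count
    return hours, cost


def cheapest_feasible(request, vm_types, network_gb_per_hour, max_vms=16):
    """Exhaustively pick the cheapest (VM type, count) meeting both constraints.

    Returns (vm_name, vm_count, cost, hours), or None if nothing is feasible.
    """
    best = None
    for vm in vm_types:
        for count in range(1, max_vms + 1):
            hours, cost = plan(request, vm, count, network_gb_per_hour)
            within_sla = hours <= request.deadline_hours
            within_budget = cost <= request.budget
            if within_sla and within_budget and (best is None or cost < best[2]):
                best = (vm.name, count, cost, hours)
    return best


if __name__ == "__main__":
    vms = [VmType("small", 0.5, 10.0), VmType("large", 2.0, 50.0)]
    job = Request(data_size_gb=100.0, budget=10.0, deadline_hours=5.0)
    print(cheapest_feasible(job, vms, network_gb_per_hour=50.0))
```

In this toy setting, under-provisioning (too few small VMs) misses the deadline while over-provisioning (too many VMs) exceeds the budget, which is the trade-off the abstract says should not be left to the user.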
Keywords:  
Author(s) Name:  Mohammed Alrokayan, Amir Vahid Dastjerdi and Rajkumar Buyya
Journal name:  
Conference name:  2014 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM)
Publisher name:  IEEE
DOI:  10.1109/CCEM.2014.7015497
Volume Information:  
Paper Link:   https://ieeexplore.ieee.org/document/7015497