TY - JOUR
T1 - Deadline Scheduling for Aperiodic Tasks in inter-Cloud Environments: a new approach to resource management
AU - Pop, Florin
AU - Dobre, Ciprian
AU - Cristea, Valentin
AU - Bessis, Nik
AU - Xhafa, Fatos
AU - Barolli, Leonard
N1 - Funding Information:
The research presented in this paper is supported by projects: “SideSTEP–Scheduling Methods for Dynamic Distributed Systems: a self-* approach”, ID: PN-II-CT-RO-FR-2012-1-0084; CyberWater: grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow–PN-II-PT-PCCA-2013-4-0321; clueFarm: Information system based on cloud services accessible through mobile devices, to increase product quality and business development farms–PN-II-PT-PCCA-2013-4-0870.
Publisher Copyright:
© 2014, Springer Science+Business Media New York.
PY - 2015/5/1
Y1 - 2015/5/1
N2 - In the big data era, the speed of analytical processing is influenced by the storage and retrieval capabilities needed to handle large amounts of data. While distributed crunching applications themselves can yield useful information, analysts face difficult challenges: they need to predict how much data to process and where, so as to obtain an optimal data-crunching cost while also respecting deadlines and service level agreements within a limited budget. In today’s data centers, on-demand data processing and data transfer requests coming from distributed applications are usually expressed as aperiodic tasks. In this paper, we address the problem of scheduling aperiodic tasks with deadline constraints within inter-Cloud environments. In massively multithreaded computing systems that deal with data-intensive applications, Hadoop and BaTs tasks arrive periodically, which challenges traditional scheduling approaches previously proposed for supercomputing. Here, we consider the deadline as the main constraint, and propose a method to estimate the number of resources needed to schedule a set of aperiodic tasks, taking into account both execution and data transfer costs. Starting from classical scheduling techniques, and considering asynchronous task handling, we analyze the possibility of decoupling task arrival from task creation, scheduling and execution, sets of actions that can be put into a peer-to-peer relation over a network or over a client–server architecture in the Cloud. 
Based on a mathematical model, and using different simulation scenarios, we prove the following statements: (1) multiple sources of independent aperiodic tasks can be considered similar to a single one; (2) with respect to the global deadline, task migration between different regional centers is the appropriate solution when the estimated number of resources exceeds a data center’s capacity; and (3) in a heterogeneous data center, we need a higher number of resources for the same request in order to respect the deadline constraints. We believe such results will benefit researchers and practitioners alike who are interested in optimizing resource management in data centers in response to novel challenges coming from next-generation big data applications.
AB - In the big data era, the speed of analytical processing is influenced by the storage and retrieval capabilities needed to handle large amounts of data. While distributed crunching applications themselves can yield useful information, analysts face difficult challenges: they need to predict how much data to process and where, so as to obtain an optimal data-crunching cost while also respecting deadlines and service level agreements within a limited budget. In today’s data centers, on-demand data processing and data transfer requests coming from distributed applications are usually expressed as aperiodic tasks. In this paper, we address the problem of scheduling aperiodic tasks with deadline constraints within inter-Cloud environments. In massively multithreaded computing systems that deal with data-intensive applications, Hadoop and BaTs tasks arrive periodically, which challenges traditional scheduling approaches previously proposed for supercomputing. Here, we consider the deadline as the main constraint, and propose a method to estimate the number of resources needed to schedule a set of aperiodic tasks, taking into account both execution and data transfer costs. Starting from classical scheduling techniques, and considering asynchronous task handling, we analyze the possibility of decoupling task arrival from task creation, scheduling and execution, sets of actions that can be put into a peer-to-peer relation over a network or over a client–server architecture in the Cloud. 
Based on a mathematical model, and using different simulation scenarios, we prove the following statements: (1) multiple sources of independent aperiodic tasks can be considered similar to a single one; (2) with respect to the global deadline, task migration between different regional centers is the appropriate solution when the estimated number of resources exceeds a data center’s capacity; and (3) in a heterogeneous data center, we need a higher number of resources for the same request in order to respect the deadline constraints. We believe such results will benefit researchers and practitioners alike who are interested in optimizing resource management in data centers in response to novel challenges coming from next-generation big data applications.
UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-84928775675&partnerID=MN8TOARS
U2 - 10.1007/s11227-014-1285-8
DO - 10.1007/s11227-014-1285-8
M3 - Article (journal)
SN - 0920-8542
VL - 71
SP - 1754
EP - 1765
JO - Journal of Supercomputing
JF - Journal of Supercomputing
IS - 5
ER -