Predictive Technique Of Task Scheduling For BigData In Cloud

Big data processing today commonly relies on MapReduce for task scheduling, most notably through Apache Hadoop, a software library and framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Even so, the MapReduce technique still suffers from long wait times: scheduling huge volumes of data in the Cloud frequently obstructs efficient computing, leading to prolonged makespan, longer waiting times, and expenses incurred at both the client and the server end. The aim of this work is to overcome the issues that hinder effective CPU/task scheduling of VMs in a Cloud environment with massive data, without resorting to complex algorithms. The paper presents a predictive task scheduling approach that applies PCA (principal component analysis) together with nine different machine learning (ML) classifiers, and compares the accuracy and time obtained by each classifier with and without PCA. The results are visualized, and the percentage variation between the two settings is discussed. Experiments are carried out in a Hadoop environment: the dataset is generated using MapReduce, and the ML classifiers are executed in Python.
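The comparison described above (accuracy and time of a classifier with and without PCA) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation. It makes several assumptions: the task records are synthetic, a single 1-nearest-neighbor classifier stands in for the nine classifiers (which the abstract does not name), PCA is approximated from scratch by power iteration so that the example needs only the standard library, and every function name (`make_tasks`, `pca_fit`, `pca_transform`, `evaluate`) is hypothetical.

```python
import random
import time


def make_tasks(n, seed=0):
    """Synthetic task records (assumption): 6 features, binary label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = i % 2
        base = 10.0 * label  # the classes differ only in the first two features
        row = [base + rng.gauss(0, 1), base + rng.gauss(0, 1)]
        row += [rng.gauss(0, 1) for _ in range(4)]  # four pure-noise features
        rows.append(row)
        labels.append(label)
    return rows, labels


def pca_fit(rows, k, iters=200):
    """Approximate the top-k principal components by power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(k):
        v = [1.0] * d
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate: remove the found direction so the next pass finds the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return mean, components


def pca_transform(rows, mean, components):
    """Project centered rows onto the fitted components."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * comp[j] for j in range(d)) for comp in components]
            for r in rows]


def knn_predict(train_x, train_y, x):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


def evaluate(train_x, train_y, test_x, test_y):
    """Return (accuracy, prediction time in seconds) on the test split."""
    start = time.perf_counter()
    preds = [knn_predict(train_x, train_y, x) for x in test_x]
    elapsed = time.perf_counter() - start
    acc = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
    return acc, elapsed


rows, labels = make_tasks(200)
train_x, test_x = rows[:150], rows[150:]
train_y, test_y = labels[:150], labels[150:]

# Baseline: classifier on all six raw features.
acc_raw, t_raw = evaluate(train_x, train_y, test_x, test_y)

# With PCA: fit on the training split only, then project both splits to 2 dimensions.
mean, comps = pca_fit(train_x, k=2)
train_p = pca_transform(train_x, mean, comps)
test_p = pca_transform(test_x, mean, comps)
acc_pca, t_pca = evaluate(train_p, train_y, test_p, test_y)

print(f"without PCA: accuracy={acc_raw:.3f}, time={t_raw:.4f}s")
print(f"with PCA:    accuracy={acc_pca:.3f}, time={t_pca:.4f}s")
```

Fitting PCA on the training split alone keeps information from the test records out of the projection. In practice one would more likely use a library such as scikit-learn for both PCA and the classifiers, and repeat the same loop once per classifier.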
