Paper information - Advances in cloud computing and big data analytics

Recently, big data have been harnessed from a variety of sources, including social networks, sensor data, and scientific data. To extract useful latent information quickly, big data analytics applications typically process a tremendous amount of data on clusters of tens, hundreds, or thousands of machines. Given the urgent demand for high-capacity computation and storage resources, cloud computing is a strong candidate for storing and processing massive data under varying circumstances and requirements.¹ Cloud computing is a powerful infrastructure that centralizes both data storage and computing on the cloud server side. Cloud clients benefit from the powerful and secure services that cloud servers provide. A cloud application runs its primary computing tasks and stores its data on cloud servers to reduce the workload on the client side. To store and analyze big data effectively on a cloud computing architecture and thereby realize intelligent applications, new methodologies and technologies for both cloud computing and big data need to be proposed and developed.

This special track focuses on a new strategic research area that addresses “Advances in Cloud Computing and Big Data Analytics”. From the papers submitted to the 5th International Conference on Advanced Cloud and Big Data (CBD 2017), held in Shanghai, China, on August 13-16, 2017, nine papers were selected that target the following research issues in cloud computing and big data:

• innovative cloud applications and experiences;
• scheduling optimization in cloud computing;
• cloud computing security;
• cloud storage system design and optimization;
• approximate big data analysis and processing.

Edge computing is a new computing paradigm that performs data processing at the edge of the network to lower data processing latency.
Prior research focused largely on offloading tasks from terminals to edge servers, yet most of it ignored how to store the data those tasks need, especially for data-intensive tasks such as deep learning and augmented reality (AR). If an edge server does not hold the data a task needs, it must either offload the task to cloud data centers or download the data from the cloud. Either case can increase data processing latency. To address this problem, Jin et al propose an edge-side collaborative storage framework called Edge-side Cooperative Storage (ECS).² ECS models cooperative storage as a graph and solves the data placement problem with a graph-based iterative algorithm. The algorithm extends easily to a distributed version without a centralized scheduler and is reported to perform two times better than a nonshared storage framework in terms of cache hit rate.

The main challenge for an access control system is granting authorized users access quickly and accurately. Several biometric technologies, such as fingerprint identification, facial recognition, and iris recognition, are widely deployed in access control systems to improve system security. However, these methods are costly to install and to maintain afterward. To make the access control system easier to use, Zhu et al propose a cloud access control system that combines the acceleration-sensing capability of Wireless Identification and Sensing Platform (WISP) tags with customized motions.³ Users are allowed to pass the access control system only if they perform the user-defined motions correctly. Authorized users can define the authentication motions themselves, which not only facilitates daily use but also improves the security of the cloud access control system.

Cloud computing is a powerful infrastructure tool for scientific computing and research because cloud servers centralize data storage and computing resources.
The demand for cloud computing has increased rapidly in recent years because a growing number of companies are moving their products from local servers to cloud servers to reduce maintenance costs. Migrating legacy code is difficult because it requires not only implementing the original programs on the cloud server but also transforming their programming model from sequential to parallel. Compared with de novo development, automatic translation is more productive and economical. Li et al introduce a new Java to Spark (J2S) translator that automatically translates sequential Java code into Spark cloud applications⁴ and can handle three types of computing-intensive programs. Their evaluation of the three types of translation shows that the translated programs all work well in their respective domains. The authors regard this as a new step toward automatic code migration in cloud computing.

In recent years, with the development of cloud computing, enterprises have sought high I/O capability and huge storage capacity by deploying their own cloud systems. In practice, most enterprise cloud systems are heterogeneous, consisting partly of private cloud and partly of public cloud. An enterprise needs to know the optimal mixing ratio to achieve the lowest cost. However, the optimal ratio is a dynamic value because the popularity of data changes constantly and the data volume grows rapidly. To provide a cost-saving solution for enterprise cloud systems, Gao et al formulate the maximization of the net present value (NPV) of the investment revenue as a dynamic decision-making problem and propose a model that simplifies it.⁵ The k-means algorithm clusters the devices into groups, and the k-nearest neighbors (kNN) algorithm is then applied to decide where to locate data.
Subsequently, as the system grows, the decision-making solution can be applied again to determine how best to extend the cloud system at the lowest cost.
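
As a minimal sketch of the clustering-then-classification idea described for Gao et al's solution, the Python snippet below groups devices with a plain k-means and then uses a kNN majority vote to pick a device group for a data item. The device features (I/O throughput and cost per GB), the sample values, and the names `kmeans`, `knn_group`, `devices`, and `hot_item` are assumptions made for illustration; the paper's actual features, parameters, and placement rule are not given here.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def knn_group(query, points, labels, k=3):
    """Return the majority group among the k points nearest to the query."""
    nearest = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical device features: (I/O throughput in MB/s, cost per GB in cents).
devices = [
    (500.0, 9.0), (120.0, 2.0), (520.0, 10.0),
    (110.0, 1.5), (480.0, 8.5), (130.0, 2.5),
]
device_groups, centroids = kmeans(devices, k=2)

# Hypothetical profile of a frequently accessed data item in the same feature space.
hot_item = (490.0, 9.5)
hot_item_group = knn_group(hot_item, devices, device_groups, k=3)
```

With these sample values, the devices separate into a fast, costly group and a slow, cheap group, and the hot item is assigned to the former. The features are left unscaled to keep the sketch short; in practice they would usually be normalized so that one dimension does not dominate the distance.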

Qiang He | Fang Dong | Jun Shen

[1] Yang Gao, et al. A decision-making solution for cloud storage system, 2018, Concurr. Comput. Pract. Exp.

[2] Jie Cao, et al. Spotting review spammer groups: A cosine pattern and network based method, 2018, Concurr. Comput. Pract. Exp.

[3] Jun Zhang, et al. A real-time bus-subway transfer scheme recommendation systems, 2018, Concurr. Comput. Pract. Exp.

[4] Yi Pan, et al. Automatic translation from Java to Spark, 2018, Concurr. Comput. Pract. Exp.

[5] Defu Zhang, et al. Cost optimization heuristics for deadline constrained workflow scheduling on clouds and their comparative evaluation, 2018, Concurr. Comput. Pract. Exp.

[6] Junzhou Luo, et al. HaDaap: A hotness-aware data placement strategy for improving storage efficiency in heterogeneous Hadoop clusters, 2018, Concurr. Comput. Pract. Exp.

[7] N. B. Anuar, et al. The rise of "big data" on cloud computing: Review and open research issues, 2015, Inf. Syst.

[8] Xiaoliang Xu, et al. Skew-aware online aggregation over joins through guided sampling, 2018, Concurr. Comput. Pract. Exp.

[9] Junzhou Luo, et al. Cooperative storage by exploiting graph-based data placement algorithm for edge computing environment, 2018, Concurr. Comput. Pract. Exp.

[10] Hong Wang, et al. Cloud access control authentication system using dynamic accelerometers data, 2018, Concurr. Comput. Pract. Exp.