Big Data Analysis: New Algorithms for a New Society

This edited volume is devoted to Big Data Analysis from a Machine Learning standpoint, as presented by some of the most eminent researchers in the area. It demonstrates that Big Data Analysis opens up new research problems that were either never considered before or were considered only within a limited range. In addition to providing methodological discussions on the principles of mining Big Data and on the difference between traditional statistical data analysis and newer computing frameworks, the book presents recently developed algorithms affecting such areas as business, financial forecasting, human mobility, the Internet of Things, information networks, bioinformatics, medical systems and life science. Through a number of specific examples, it explores how the study of Big Data Analysis has evolved and how it has started, and will most likely continue, to affect society. While the benefits brought about by Big Data Analysis are underlined, the book also discusses some of the warnings that have been issued concerning its potential dangers, along with its pitfalls and challenges.
