Defining, Understanding, and Addressing Big Data

“Big Data” is an emerging term used in business, engineering, and other domains. Although the term is popular today, the concept is not new. However, the means by which data can be collected are more readily available than ever, which makes Big Data more relevant than ever: it can be used to improve decisions and insights within the domains in which it is applied. Big Data can be loosely defined as data that is too large for traditional analysis methods and techniques. This article shares a variety of prominent but loose definitions of Big Data and provides a comprehensive overview of related issues, including the forms, locations, and methods of analyzing and exploiting Big Data, as well as current research on it. Big Data also touches on a myriad of tangential issues, from privacy to analysis methods, which are also reviewed. Best practices are considered as well. Additionally, the epistemology and history of Big Data are examined, along with the technical and societal problems that exist with Big Data.
