Chemometrics: From Data Preprocessing to Fog Computing

The accumulation of data from analytical instruments has paved the way for the application of chemometrics. Challenges remain, however, in processing, analyzing, visualizing, and storing these data. Chemometrics is a relatively young area of analytical chemistry that involves the use of statistics and computer applications in chemistry. This article discusses computational and storage tools of big data analytics in the context of analytical chemistry, with examples, applications, and usage details in relation to fog computing. It places particular emphasis on preprocessing techniques, statistical and machine learning methods for data mining and analysis, tools for big data visualization, and state-of-the-art applications for data storage using fog computing. The future of fog computing in chemometrics is also discussed.
