A GMDH-type neural network with multi-filter feature selection for the prediction of transition temperatures of bent-core liquid crystals

A novel strategy for predicting the transition temperatures of bent-core liquid crystals (LCs), based on the combination of multi-filter feature selection and group method of data handling (GMDH) type neural networks, is reported. The full set of 243 compounds was randomly divided into a training set of 207 compounds and a test set of 36 compounds. Descriptors were selected from three pools: one of 2D descriptors only, and two of combined 2D and 3D descriptors, for which the molecular structures were optimized by molecular mechanics (MM) and a semi-empirical (SE) method, respectively. Each descriptor pool was first reduced using multiple filters based on the chi-square and v-WSH algorithms, while the final subset selection was performed by the GMDH algorithm during the learning process. The resulting 2D, MM and SE GMDH models have 11, 13 and 16 descriptors, respectively, and demonstrate good generalization and predictive ability (R² = 0.92). The final models were validated with a randomization test. These models appear to be suitable not only for prediction, but also for identifying key structural features that alter the transition temperature of bent-core LCs.
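To make concrete how a GMDH-type network can select descriptors while it learns, the following is a minimal single-layer sketch, not the authors' implementation. It assumes the classical Ivakhnenko quadratic polynomial in two inputs as the neuron, validation mean squared error as the external selection criterion, and synthetic stand-in data. Only the 243/207/36 split sizes come from the abstract; the six descriptors, the 150/57 partition of the training set, and the names `quad_design` and `fit_gmdh_layer` are hypothetical.

```python
from itertools import combinations

import numpy as np


def quad_design(xi, xj):
    """Design matrix of the quadratic two-input polynomial:
    a0 + a1*xi + a2*xj + a3*xi*xj + a4*xi^2 + a5*xj^2."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])


def fit_gmdh_layer(X_fit, y_fit, X_val, y_val, keep=3):
    """Fit one neuron per descriptor pair by least squares and keep the
    `keep` neurons with the lowest validation MSE (external criterion)."""
    neurons = []
    for i, j in combinations(range(X_fit.shape[1]), 2):
        coef, *_ = np.linalg.lstsq(
            quad_design(X_fit[:, i], X_fit[:, j]), y_fit, rcond=None
        )
        pred = quad_design(X_val[:, i], X_val[:, j]) @ coef
        neurons.append((float(np.mean((y_val - pred) ** 2)), (i, j), coef))
    neurons.sort(key=lambda n: n[0])
    return neurons[:keep]


# Synthetic stand-in data (assumption): 243 "compounds", 6 "descriptors",
# with a target that depends only on descriptors 1 and 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(243, 6))
y = (2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.8 * X[:, 1] * X[:, 4]
     + 0.05 * rng.normal(size=243))

# Random 207/36 train/test split, as in the abstract.
perm = rng.permutation(243)
test_idx, train_idx = perm[:36], perm[36:]

# Assumed partition of the training set for fitting vs. neuron selection.
fit_idx, val_idx = train_idx[:150], train_idx[150:]

best = fit_gmdh_layer(X[fit_idx], y[fit_idx], X[val_idx], y[val_idx])
top_mse, top_pair, top_coef = best[0]

# Descriptors that survive the layer form the selected subset.
selected = sorted({k for _, pair, _ in best for k in pair})

# R^2 of the best neuron on the held-out test set.
y_test = y[test_idx]
pred_test = quad_design(X[test_idx, top_pair[0]], X[test_idx, top_pair[1]]) @ top_coef
r2_test = 1.0 - np.sum((y_test - pred_test) ** 2) / np.sum((y_test - y_test.mean()) ** 2)

print("best pair:", top_pair, "selected descriptors:", selected)
print("test R^2:", round(float(r2_test), 4))
```

In a full GMDH network the outputs of the retained neurons would feed a further layer, and layers would be added until the external criterion stops improving. Descriptors that never appear in a retained neuron drop out of the model, which is the sense in which subset selection happens during learning.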
