AlPOs Synthetic Factor Analysis Based on Maximum Weight and Minimum Redundancy Feature Selection

The relationship between synthetic factors and the resulting structures is critical for the rational synthesis of zeolites and related microporous materials. In this paper, we develop a new feature selection method for synthetic factor analysis of (6,12)-ring-containing microporous aluminophosphates (AlPOs). The proposed method is based on a maximum weight and minimum redundancy criterion: it selects the feature subset whose features are most relevant to the resulting structure while keeping the redundancy among the selected features minimal. Using the database of AlPO syntheses, we take (6,12)-ring-containing AlPOs as the target class and use 21 synthetic factors, including gel composition, solvent, and organic template, to predict their formation. Of these 21 features, 12 are selected as the optimized subset for distinguishing (6,12)-ring-containing AlPOs from AlPOs without such rings. With this optimal feature subset, the prediction model achieves a classification accuracy of 91.12%. Comprehensive experiments demonstrate the effectiveness of the proposed algorithm, and an in-depth analysis of the selected synthetic factors is given.
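The maximum weight and minimum redundancy idea can be illustrated with a minimal greedy sketch. This is not the algorithm of the paper, whose weighting scheme and optimization procedure are not specified in this abstract. As assumptions for illustration only, the weight of a feature is taken to be its absolute Pearson correlation with the class label, the redundancy is the mean absolute correlation with the features already chosen, and the names `abs_corr` and `select_features` are hypothetical.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float(abs((a * b).sum()) / denom)

def select_features(X, y, k):
    """Greedily pick k columns of X with high weight and low redundancy."""
    n_features = X.shape[1]
    # Assumed weight: |correlation| between each feature and the class label.
    weight = np.array([abs_corr(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(weight))]  # start from the heaviest feature
    while len(selected) < min(k, n_features):
        best_j, best_score = -1, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Assumed redundancy: mean |correlation| with the chosen features.
            redundancy = np.mean([abs_corr(X[:, j], X[:, s]) for s in selected])
            score = weight[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data (made up): column 1 duplicates column 0, column 2 is weaker
# but carries partly independent information about the label.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
f0 = np.array([0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
f2 = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=float)
X = np.column_stack([f0, f0.copy(), f2])

selected = select_features(X, y, 2)
print(selected)
```

On this toy input the duplicated column is passed over in favour of the less redundant column 2, which is the behaviour the criterion is meant to encourage. Subtracting redundancy from weight is only one possible way to trade the two off; a ratio, or a global optimization over the whole subset, would be equally plausible readings of the criterion.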
