Ensemble partial least squares regression for descriptor selection, outlier detection, applicability domain assessment, and ensemble modeling in QSAR/QSPR modeling
Dong-Sheng Cao | Jie Dong | Zhi-Jiang Yao | Min-Feng Zhu | Zhen-Ke Deng | Rui-Gang Zhao
[1] W. Cai,et al. A variable selection method based on uninformative variable elimination for multivariate calibration of near-infrared spectra , 2008 .
[2] Dong-Sheng Cao,et al. Variable importance analysis based on rank aggregation with applications in metabolomics for biomarker discovery. , 2016, Analytica chimica acta.
[3] Dong-Sheng Cao,et al. ChemSAR: an online pipelining platform for molecular SAR modeling , 2017, Journal of Cheminformatics.
[4] Yi-Zeng Liang,et al. Monte Carlo cross validation , 2001 .
[5] Dong-Sheng Cao,et al. Predicting human intestinal absorption with modified random forest approach: a comprehensive evaluation of molecular representation, unbalanced data, and applicability domain issues , 2017 .
[6] Dong-Sheng Cao,et al. A bootstrapping soft shrinkage approach for variable selection in chemical modeling. , 2016, Analytica chimica acta.
[7] Dong-Sheng Cao,et al. The model adaptive space shrinkage (MASS) approach: a new method for simultaneous variable selection and outlier detection based on model population analysis. , 2016, The Analyst.
[8] Ruisheng Zhang,et al. QSAR Models for the Prediction of Binding Affinities to Human Serum Albumin Using the Heuristic Method and a Support Vector Machine , 2004, J. Chem. Inf. Model..
[9] Dong-Sheng Cao,et al. In silico toxicity prediction of chemicals from EPA toxicity database by kernel fusion-based support vector machines , 2015 .
[10] Yizeng Liang,et al. Comparison of quantitative structure-retention relationship models on four stationary phases with different polarity for a diverse set of flavor compounds. , 2012, Journal of chromatography. A.
[11] Desire L. Massart,et al. ROBUST PRINCIPAL COMPONENTS REGRESSION AS A DETECTION TOOL FOR OUTLIERS , 1995 .
[12] P. Legendre,et al. Forward selection of explanatory variables. , 2008, Ecology.
[13] Robert P. Sheridan,et al. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling , 2003, J. Chem. Inf. Comput. Sci..
[14] Dong-Sheng Cao,et al. ChemDes: an integrated web-based platform for molecular descriptor and fingerprint computation , 2015, Journal of Cheminformatics.
[15] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[16] Ferran Sanz,et al. Applicability Domain Analysis (ADAN): A Robust Method for Assessing the Reliability of Drug Property Predictions , 2014, J. Chem. Inf. Model..
[17] Roberto Todeschini,et al. Comparison of Different Approaches to Define the Applicability Domain of QSAR Models , 2012, Molecules.
[18] D. Massart,et al. Elimination of uninformative variables for multivariate calibration. , 1996, Analytical chemistry.
[19] Gergana Dimitrova,et al. A Stepwise Approach for Defining the Applicability Domain of SAR and QSAR Models , 2005, J. Chem. Inf. Model..
[20] Ting Wang,et al. Boosting: An Ensemble Learning Tool for Compound Classification and QSAR Modeling , 2005, J. Chem. Inf. Model..
[21] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[22] M. Kenward,et al. An Introduction to the Bootstrap , 2007 .
[23] Jaroslaw Polanski,et al. Modeling Robust QSAR 3: SOM-4D-QSAR with Iterative Variable Elimination IVE-PLS: Application to Steroid, Azo Dye, and Benzoic Acid Series , 2007, J. Chem. Inf. Model..
[24] Youngjo Lee,et al. Sparse partial least-squares regression and its applications to high-throughput data analysis , 2011 .
[25] D L Massart,et al. Boosting partial least squares. , 2005, Analytical chemistry.
[26] Dong-Sheng Cao,et al. A new strategy of outlier detection for QSAR/QSPR , 2009, J. Comput. Chem..
[27] Paola Gramatica,et al. QSAR study of malonyl‐CoA decarboxylase inhibitors using GA‐MLR and a new strategy of consensus modeling , 2008, J. Comput. Chem..
[28] J. Shao. Linear Model Selection by Cross-validation , 1993 .
[29] Kimito Funatsu,et al. GA Strategy for Variable Selection in QSAR Studies: GA-Based PLS Analysis of Calcium Channel Antagonists , 1997, J. Chem. Inf. Comput. Sci..
[30] Qingsong Xu,et al. Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions , 2015, Bioinform..
[31] Ke Wang,et al. Bagging for robust non-linear multivariate calibration of spectroscopy , 2011 .
[32] Xin Yao,et al. A Survey on Evolutionary Computation Approaches to Feature Selection , 2016, IEEE Transactions on Evolutionary Computation.
[33] Dong-Sheng Cao,et al. ChemoPy: freely available python package for computational biology and chemoinformatics , 2013, Bioinform..
[34] B. Ripley,et al. Robust Statistics , 2018, Encyclopedia of Mathematical Geosciences.
[35] Jaroslaw Polanski,et al. The Comparative Molecular Surface Analysis (CoMSA) with Modified Uniformative Variable Elimination-PLS (UVE-PLS) Method: Application to the Steroids Binding the Aromatase Enzyme , 2003, J. Chem. Inf. Comput. Sci..
[36] Dong-Sheng Cao,et al. Model population analysis for variable selection , 2010 .
[37] Peter J. Rousseeuw,et al. Robust Regression and Outlier Detection , 2005, Wiley Series in Probability and Statistics.
[38] Peter J. Rousseeuw,et al. Robust regression and outlier detection , 1987 .
[39] Hiromasa Kaneko,et al. Applicability Domain Based on Ensemble Learning in Classification and Regression Analyses , 2014, J. Chem. Inf. Model..
[40] R. Yu,et al. An ensemble of Monte Carlo uninformative variable elimination for wavelength selection. , 2008, Analytica chimica acta.
[41] Dong-Sheng Cao,et al. Prediction of aqueous solubility of druglike organic compounds using partial least squares, back‐propagation network and support vector machine , 2010 .
[42] Bahram Hemmateenejad,et al. Ant colony optimisation: a powerful tool for wavelength selection , 2006 .
[43] Ingo Krossing,et al. Is universal, simple melting point prediction possible? , 2011, Chemphyschem : a European journal of chemical physics and physical chemistry.
[44] Ping Zhang. Model Selection Via Multifold Cross Validation , 1993 .
[45] M. Hubert,et al. Robust methods for partial least squares regression , 2003 .
[46] R. Didziapetris,et al. Estimation of reliability of predictions and model applicability domain evaluation in the analysis of acute toxicity (LD 50) , 2010, SAR and QSAR in environmental research.
[47] Romà Tauler,et al. Detection of Olive Oil Adulteration Using FT-IR Spectroscopy and PLS with Variable Importance of Projection (VIP) Scores , 2012 .
[48] Yi-Zeng Liang,et al. Monte Carlo cross‐validation for selecting a model and estimating the prediction error in multivariate calibration , 2004 .
[49] L. Buydens,et al. Use of the bootstrap and permutation methods for a more robust variable importance in the projection metric for partial least squares regression. , 2013, Analytica chimica acta.
[50] Muthukumarasamy Karthikeyan,et al. General Melting Point Prediction Based on a Diverse Compound Data Set and Artificial Neural Networks , 2005, J. Chem. Inf. Model..
[51] Dong-Sheng Cao,et al. Automatic feature subset selection for decision tree-based ensemble methods in the prediction of bioactivity , 2010 .
[52] Qing-Song Xu,et al. Robust principal components regression based on principal sensitivity vectors , 2003 .
[53] Ramón Carrasco-Velar,et al. Quantitative study of the structure-retention index relationship in the imine family. , 2006, Journal of chromatography. A.
[54] José Julio Espina Agulló,et al. The multivariate least-trimmed squares estimator , 2008 .
[55] Dong-Sheng Cao,et al. BioTriangle: a web-accessible platform for generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions , 2016, Journal of Cheminformatics.
[56] Xueguang Shao,et al. A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples. , 2007, Talanta.
[57] Hongdong Li,et al. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. , 2009, Analytica chimica acta.
[58] Dong-Sheng Cao,et al. ADME Properties Evaluation in Drug Discovery: Prediction of Caco-2 Cell Permeability Using a Combination of NSGA-II and Boosting , 2016, J. Chem. Inf. Model..
[59] Dong-Sheng Cao,et al. Support Vector Machines and Their Application in Chemistry and Biotechnology , 2011 .
[60] David E. Clark,et al. Evolutionary algorithms in computer-aided molecular design , 1996, J. Comput. Aided Mol. Des..
[61] P. Rousseeuw. Least Median of Squares Regression , 1984 .
[62] Dong-Sheng Cao,et al. The boosting: A new idea of building models , 2010 .
[63] B. Kowalski,et al. Partial least-squares regression: a tutorial , 1986 .
[64] Igor V. Tetko,et al. Critical Assessment of QSAR Models of Environmental Toxicity against Tetrahymena pyriformis: Focusing on Applicability Domain and Overfitting by Variable Selection , 2008, J. Chem. Inf. Model..
[65] Nina Nikolova-Jeliazkova,et al. QSAR Applicability Domain Estimation by Projection of the Training Set in Descriptor Space: A Review , 2005, Alternatives to laboratory animals : ATLA.
[66] Rosario Romera,et al. On robust partial least squares (PLS) methods , 1998 .
[67] S. Keleş,et al. Sparse partial least squares regression for simultaneous dimension reduction and variable selection , 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.
[68] Scott D. Kahn,et al. Current Status of Methods for Defining the Applicability Domain of (Quantitative) Structure-Activity Relationships , 2005, Alternatives to laboratory animals : ATLA.
[69] J. Sutherland,et al. A comparison of methods for modeling quantitative structure-activity relationships. , 2004, Journal of medicinal chemistry.
[70] S. Wold,et al. PLS-regression: a basic tool of chemometrics , 2001 .
[71] John C Dearden,et al. Quantitative structure‐property relationships for prediction of boiling point, vapor pressure, and melting point , 2003, Environmental toxicology and chemistry.
[72] W. Cai,et al. An improved boosting partial least squares method for near-infrared spectroscopic quantitative analysis. , 2010, Analytica chimica acta.
[73] Jie Dong,et al. TargetNet: a web service for predicting potential drug–target interaction profiling via multi-target SAR models , 2016, Journal of Computer-Aided Molecular Design.
[74] Hongdong Li,et al. Toward better QSAR/QSPR modeling: simultaneous outlier detection and variable selection using distribution of model features , 2011, J. Comput. Aided Mol. Des..
[75] Richard Jensen,et al. Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regressions , 2009 .
[76] Robert Tibshirani,et al. Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy , 1986 .