[Isomap-PLS nonlinear modeling method for near infrared spectroscopy].

To model the nonlinear relationship between the near infrared (NIR) spectra of samples and their chemical or physical properties, the present paper proposes a novel modeling method, referred to as Isomap-PLS, which builds a model by combining Isomap with partial least squares (PLS). Isomap is a recently proposed nonlinear dimension reduction algorithm that belongs to the manifold learning family, a new branch of machine learning. Isomap is based on the multidimensional scaling (MDS) algorithm; however, it replaces the Euclidean distance used in MDS with an approximated geodesic distance, so it can effectively recover the intrinsic low-dimensional structure of high-dimensional data. In this method, Isomap was used to extract nonlinear information from high-dimensional NIR spectra while preserving their geometric properties, and PLS was then adopted to remove linear information redundancy and build a calibration model. The parameters of Isomap, i.e. the number of nearest neighbors k and the output dimension d, affect the performance of the method, so a grid search approach was used for parameter optimization. The Isomap-PLS modeling method was applied to two public benchmark NIR datasets, and the modeling results were compared with those of PLS. The results demonstrated that in both datasets, each model built with Isomap-PLS had a smaller root mean square error of cross-validation (RMSECV) than the corresponding model built with PLS. Moreover, for some properties, the RMSECV of Isomap-PLS was lower than that of PLS by a factor of 2-5.
It can be concluded that, because Isomap reflects the intrinsic nonlinear structure of NIR spectra, Isomap-PLS can effectively model the nonlinear correlations between the spectra and the physicochemical properties of the samples, and therefore has greater calibration and prediction power than PLS.