Comparative analysis of mass spectral matching-based compound identification in gas chromatography-mass spectrometry.

Compound identification in gas chromatography-mass spectrometry (GC-MS) is usually achieved by matching query spectra against spectra in a reference library. Although several spectral similarity measures have been developed and compared using a small reference library, it remains unknown how the choice of spectral similarity measure and the size of the reference library together affect identification accuracy and the optimal weight factor. We used three reference libraries to investigate how the optimal weight factor depends on the spectral similarity measure and the size of the reference library. Our study demonstrated that the optimal weight factor depends not only on the spectral similarity measure but also on the size of the reference library. The mixture semi-partial correlation measure outperformed all existing spectral similarity measures in all tested reference libraries, despite its computational expense. Furthermore, we estimated the accuracy of compound identification with larger future reference libraries by varying the size of the reference library. The simulation study indicates that the mixture semi-partial correlation measure will continue to perform best as reference libraries grow.
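As a rough illustration of what "weight factor" and "spectral similarity measure" mean here, the sketch below scores a query spectrum against a small library using a weighted cosine similarity. It does not implement the mixture semi-partial correlation measure evaluated in the study. The weighting form (intensity raised to a power `a`, multiplied by m/z raised to a power `b`) is a commonly used scheme assumed for this example, and the function names, the exponent values, and the toy spectra are all hypothetical.

```python
import math

def weighted_spectrum(mz, intensity, a, b):
    """Apply weight factors: each peak becomes intensity**a * mz**b.
    (Assumed weighting form; a and b are the weight factors.)"""
    return [(i ** a) * (m ** b) for m, i in zip(mz, intensity)]

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length spectrum vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)

def best_match(query, library, mz, a=0.5, b=1.0):
    """Return (name, score) of the library spectrum most similar to the query.
    All spectra are assumed to be aligned on the same m/z grid.
    The default a and b are illustrative, not optimized values."""
    wq = weighted_spectrum(mz, query, a, b)
    scores = {
        name: cosine_similarity(wq, weighted_spectrum(mz, ref, a, b))
        for name, ref in library.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

# Toy data (hypothetical): intensities on a shared m/z grid.
mz = [41, 43, 57, 71]
library = {"A": [100, 50, 0, 10], "B": [0, 20, 100, 5]}
query = [90, 55, 0, 12]

print(best_match(query, library, mz))
```

In a setting like the one the abstract describes, `a` and `b` would be tuned per similarity measure and per library size, for example by a grid search that maximizes identification accuracy on spectra with known identities.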
