Automated Kerogen Classification in Microscope Images of Dispersed Kerogen Preparation

We develop the classification part of a system that analyses transmitted light microscope images of dispersed kerogen preparations. The system automatically extracts kerogen pieces from the image and labels each piece as either inertinite or vitrinite. Image pre-processing consists of background removal, identification of kerogen material, object segmentation, object extraction (producing an individual image of each piece of kerogen) and feature calculation for each object. An expert palynologist labelled the objects as inertinite or vitrinite, which provided the ground truth for the classification experiment. Ten state-of-the-art classifiers and classifier ensembles were compared: Naïve Bayes, decision tree, nearest neighbour, the logistic classifier, multilayer perceptron (MLP), support vector machines (SVM), AdaBoost, Bagging, LogitBoost and Random Forest. The logistic classifier was singled out as the most accurate, with an accuracy greater than 90%. Using the 10 times 10-fold cross-validation provided within the Weka software, we found that the logistic classifier was significantly better than five of the other classifiers (p<0.05) and statistically indistinguishable from the remaining four. The initial set of 32 features was subsequently reduced to 6 features without compromising the classification accuracy. A further evaluation of the system alerted us to the possible sensitivity of the classification to the ground truth, which might vary from one human expert to another. The analysis also revealed that the logistic classifier made most of its correct classifications with high certainty.
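The abstract does not state which significance test produced the p<0.05 comparisons. As an illustration only, the sketch below assumes the corrected resampled paired t-test, a common choice for repeated cross-validation and one of the testers available in Weka's Experimenter. The function name `corrected_resampled_t`, the synthetic accuracies and the critical value are assumptions made for this example; none of the numbers are results from the paper.

```python
import math
import random
import statistics


def corrected_resampled_t(acc_a, acc_b, test_train_ratio):
    """Corrected resampled paired t statistic for two classifiers.

    acc_a, acc_b: per-fold accuracies of the two classifiers, measured on
    the same folds (100 values each for 10 times 10-fold cross-validation).
    test_train_ratio: test set size divided by training set size
    (1/9 for 10-fold cross-validation).
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    var_d = statistics.variance(diffs)  # sample variance, n - 1 denominator
    if var_d == 0:
        raise ValueError("all fold differences are identical")
    # The extra test_train_ratio term widens the variance estimate to
    # account for the overlap between training sets across folds.
    return mean_d / math.sqrt((1.0 / n + test_train_ratio) * var_d)


# Synthetic per-fold accuracies, for illustration only (NOT the paper's data).
rng = random.Random(0)
acc_logistic = [rng.gauss(0.91, 0.03) for _ in range(100)]
acc_other = [rng.gauss(0.87, 0.03) for _ in range(100)]

t_stat = corrected_resampled_t(acc_logistic, acc_other, 1 / 9)

# Approximate two-sided 5% critical value of Student's t with 99 degrees
# of freedom (assumed here instead of computing an exact p-value).
T_CRIT = 1.984
print(f"t = {t_stat:.3f}, significant at 0.05: {abs(t_stat) > T_CRIT}")
```

With 100 paired fold results the statistic is compared against a t distribution with 99 degrees of freedom. The correction makes the test more conservative than a plain paired t-test, which tends to overstate significance when folds share most of their training data.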
