An improved classification scheme for chromosomes with missing data

Karyotyping, or the automatic classification of human chromosomes, relies mainly on the analysis of the chromosome-specific banding pattern. Unfortunately, in the most informative phases of the cell division cycle the chromosomes are long and easily overlap: the banding pattern in the overlapping regions is corrupted, which drastically increases the classification error. Assuming a probabilistic classifier is available, improving the classification of chromosomes with corrupted data would additionally require estimating, for each chromosome class, the joint probability density of the observed and missing data. Given the number of classes, the possible positions and extents of the corrupted data within a chromosome, and the dimensionality of the feature space, a reliable estimate would need an impractically large number of training samples. We circumvent this estimation problem by developing a statistical generative model of the pattern of each class, so that the corrupted part can be replaced with a partial pattern synthetically generated from the model. This makes it possible to obtain a Monte Carlo estimate of the maximum a posteriori probability of the class given the observation and the missing data, which reduces to a simple voting scheme when all classes have equal a priori probability. Moreover, this Monte Carlo classification is superior to the voting scheme based on simply imputing the class mean to the missing data.
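The voting scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that each class is modelled by a multivariate Gaussian over the feature vector, that the classifier is the corresponding equal-prior Gaussian posterior, and that the corrupted features are marked by a boolean mask. The names `gaussian_posterior`, `sample_missing`, and `monte_carlo_classify` are hypothetical.

```python
import numpy as np

def gaussian_posterior(x, means, covs):
    """Class posteriors under Gaussian class models with equal priors (assumed classifier)."""
    log_lik = []
    for mean, cov in zip(means, covs):
        diff = x - mean
        _, logdet = np.linalg.slogdet(cov)
        log_lik.append(-0.5 * (diff @ np.linalg.solve(cov, diff) + logdet))
    log_lik = np.array(log_lik)
    weights = np.exp(log_lik - log_lik.max())  # subtract max for numerical stability
    return weights / weights.sum()

def sample_missing(x, missing, mean, cov, rng):
    """Draw the missing features from one class model, conditioned on the observed ones."""
    obs = ~missing
    gain = cov[np.ix_(missing, obs)] @ np.linalg.inv(cov[np.ix_(obs, obs)])
    cond_mean = mean[missing] + gain @ (x[obs] - mean[obs])
    cond_cov = cov[np.ix_(missing, missing)] - gain @ cov[np.ix_(obs, missing)]
    return rng.multivariate_normal(cond_mean, cond_cov)

def monte_carlo_classify(x, missing, means, covs, n_draws=200, seed=0):
    """Fill the corrupted part with synthetic draws from every class model and vote."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(len(means), dtype=int)
    for mean, cov in zip(means, covs):       # same number of draws per class (equal priors)
        for _ in range(n_draws):
            completed = x.copy()
            completed[missing] = sample_missing(x, missing, mean, cov, rng)
            votes[np.argmax(gaussian_posterior(completed, means, covs))] += 1
    return int(np.argmax(votes)), votes

# Toy example: two classes, three features, the third feature is corrupted.
means = [np.zeros(3), np.full(3, 4.0)]
covs = [np.eye(3), np.eye(3)]
x = np.array([0.2, -0.1, np.nan])
missing = np.array([False, False, True])
label, votes = monte_carlo_classify(x, missing, means, covs)
print(label, votes)
```

Drawing the same number of synthetic completions from every class model corresponds to the equal-prior case, where the Monte Carlo estimate reduces to counting votes. With unequal priors, one option would be to weight each class's votes by its prior.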
