High Throughput Analysis of Breast Cancer Specimens on the Grid

Breast cancer accounts for about 30% of all cancers and 15% of all cancer deaths in women in the United States. Advances in computer-assisted diagnosis (CAD) hold promise for early detection and for staging disease progression. In this paper we introduce a Grid-enabled CAD system that performs automatic analysis of imaged histopathology breast tissue specimens. More than 100,000 digitized samples (1200 x 1200 pixels) have already been processed on the Grid. We have analyzed results for 3744 breast tissue samples, which originated from four different institutions and were stained with diaminobenzidine (DAB) and hematoxylin. Both linear and nonlinear dimension reduction techniques were compared, and the best one (ISOMAP) was applied to reduce the dimensionality of the features. The experimental results show that Gentle Boosting with an eight-node CART decision tree as the weak learner gives the best classification result. The algorithm achieves an accuracy of 86.02% using only 20% of the specimens as the training set.
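
To make the classification step concrete, the following is a minimal sketch of Gentle Boosting trained on 20% of a data set and evaluated on the remaining 80%. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic two-dimensional points rather than ISOMAP-reduced tissue features, the weak learner is a two-leaf regression stump rather than an eight-node CART tree, and all function names (`fit_stump`, `gentle_boost`, `predict`) are hypothetical.

```python
import math
import random


def fit_stump(X, y, w):
    """Weighted least-squares regression stump (a simplified stand-in for CART).

    Returns (feature index, threshold, left value, right value).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            wl = wr = sl = sr = 0.0
            for xi, yi, wi in zip(X, y, w):
                if xi[j] <= t:
                    wl += wi
                    sl += wi * yi
                else:
                    wr += wi
                    sr += wi * yi
            # Each leaf predicts the weighted mean of the labels it contains.
            a = sl / wl if wl > 0 else 0.0
            b = sr / wr if wr > 0 else 0.0
            err = sum(
                wi * (yi - (a if xi[j] <= t else b)) ** 2
                for xi, yi, wi in zip(X, y, w)
            )
            if best is None or err < best[0]:
                best = (err, j, t, a, b)
    return best[1:]


def stump_predict(stump, x):
    j, t, a, b = stump
    return a if x[j] <= t else b


def gentle_boost(X, y, rounds=20):
    """Gentle Boosting: add one regression stump per round, then reweight."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        model.append(stump)
        # Increase the weight of samples the new stump handles poorly.
        w = [
            wi * math.exp(-yi * stump_predict(stump, xi))
            for xi, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    score = sum(stump_predict(s, x) for s in model)
    return 1 if score >= 0 else -1


# Synthetic stand-in data: two Gaussian classes with labels +1 and -1.
random.seed(0)
data = []
for i in range(200):
    label = 1 if i % 2 == 0 else -1
    center = 2.0 * label
    data.append(([random.gauss(center, 1.0), random.gauss(center, 1.0)], label))
random.shuffle(data)

# 20% of the samples for training, the remaining 80% for testing.
split = len(data) // 5
train, test = data[:split], data[split:]
model = gentle_boost([x for x, _ in train], [lab for _, lab in train])
accuracy = sum(predict(model, x) == lab for x, lab in test) / len(test)
print(f"test accuracy: {accuracy:.3f}")
```

Replacing the stump with a deeper tree only changes `fit_stump` and `stump_predict`; the reweighting loop in `gentle_boost` stays the same.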
