Automatic Summarization of Changes in Biological Image Sequences Using Algorithmic Information Theory

An algorithmic information-theoretic method is presented for object-level summarization of meaningful changes in image sequences. Object extraction and tracking data are represented as an attributed tracking graph (ATG). Time courses of object states are compared using an adaptive information distance measure, aided by a closed-form multidimensional quantization. The notion of meaningful summarization is captured by using the gap statistic to estimate the randomness deficiency from algorithmic statistics. The summary is the clustering result and feature subset that together maximize the gap statistic. The approach was validated on four bioimaging applications: 1) on a synthetic data set containing two populations of cells differing in growth rate, it correctly identified the two populations and the single feature out of 23 that separated them; 2) on 59 movies of three types of neuroprosthetic devices being inserted into brain tissue at three speeds each, it correctly identified insertion speed as the primary factor affecting tissue strain; 3) on movies of cultured neural progenitor cells, it correctly distinguished neurons from progenitors without requiring a fixative stain; and 4) on intracellular molecular transport in cultured neurons undergoing axon specification, it automatically confirmed the role of kinesins in axon specification.
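
The comparison of quantized time courses by an information distance can be illustrated with a minimal sketch. It does not reproduce the adaptive distance measure or the closed-form multidimensional quantization of the method summarized above. As a stand-in, it uses the normalized compression distance (NCD), a standard computable approximation of information distance, with `zlib` as the compressor and a simple uniform scalar quantizer. The function names `quantize` and `ncd`, the track names, and all numeric values are assumptions introduced for illustration only.

```python
import random
import zlib


def quantize(series, lo, hi, levels=16):
    """Map each value to one of `levels` uniform bins over the shared range [lo, hi]."""
    span = (hi - lo) or 1.0
    return bytes(
        min(max(int((x - lo) / span * levels), 0), levels - 1) for x in series
    )


def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance, with zlib standing in for an ideal compressor."""
    ca = len(zlib.compress(a, 9))
    cb = len(zlib.compress(b, 9))
    cab = len(zlib.compress(a + b, 9))
    return (cab - min(ca, cb)) / max(ca, cb)


# Hypothetical per-cell area time courses (made-up numbers, not data from the paper).
rng = random.Random(0)
steps = range(500)
tracks = {
    "slow_1": [100 + 0.2 * t + rng.gauss(0, 2) for t in steps],
    "slow_2": [100 + 0.2 * t + rng.gauss(0, 2) for t in steps],
    "fast_1": [100 + 1.0 * t + rng.gauss(0, 2) for t in steps],
}

# Quantize on a range shared by all tracks so differences in scale survive.
lo = min(min(v) for v in tracks.values())
hi = max(max(v) for v in tracks.values())
symbols = {name: quantize(v, lo, hi) for name, v in tracks.items()}

# Print the pairwise distances between the quantized time courses.
names = list(symbols)
for i, p in enumerate(names):
    for q in names[i + 1:]:
        print(f"NCD({p}, {q}) = {ncd(symbols[p], symbols[q]):.3f}")
```

In a full pipeline, pairwise distances of this kind would feed a clustering step whose result is then scored, for example by the gap statistic. A general-purpose compressor on short, noisy sequences gives only a rough approximation, so the printed values should be read as illustrative rather than as a reliable separation of the hypothetical populations.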
