Flexible Pattern Discovery with (Extended) Disjunctive Logic Programming

The post-genomic era has raised a wide range of challenging new issues for the areas of knowledge discovery and intelligent information management. Among them, the discovery of complex pattern repetitions in string databases plays an important role, particularly in contexts where even the classes of patterns that should be considered interesting are unknown. This paper contributes to this setting by proposing a novel approach, based on disjunctive logic programming extended with several advanced features, for discovering interesting pattern classes from a given data set.
