A Journey in Pattern Mining

The traditional research paradigm in the sciences was hypothesis-driven. Over the last decade or so, this view has been replaced by a data-driven view of scientific research. In almost all fields of scientific endeavor, large research teams are systematically collecting data on questions of great import. Knowledge and insights are gained through data analysis and mining, which feeds this inversion of science: rather than going from hypothesis to data, we use data to generate and validate hypotheses, and to build knowledge and understanding. The same can be said for applications in the commercial realm.
