Mining Association Rules in Analysis of Transcription Factors Essential to Gene Expressions

Knowledge discovery from gene expression databases has become an important research area for biologists as the number of available gene sequences has grown. This paper studies the transcription factors required for expression of target genes using association rule mining. To mine the transcription factors essential to the expression of particular genes, we defined each tissue type as a set of transactions, or dataset. Each dataset consists of transcription factors and target genes. The Apriori mining algorithm was prototyped and tested on gene sequence data. Itemsets were pruned both before and after applying the Apriori algorithm in order to eliminate false results. The transcription factors identified by the program were compared with those obtained through experimental research. The comparison indicates that association rule mining may be an effective way to identify transcription factors associated with gene expression.
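
To make the approach concrete, the following is a minimal sketch of Apriori-style mining over one tissue dataset, followed by extraction of rules whose consequent is a target gene. It is not the authors' prototype: the function names (apriori, rules_for_target), the item labels (TF_A, TF_B, TF_C, GENE_X), the thresholds, and the example transactions are all hypothetical, and the additional pruning applied before and after Apriori in the paper is not reproduced here.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def rules_for_target(frequent, target, min_confidence):
    """Return rules (factors -> target) meeting the confidence threshold."""
    rules = []
    for itemset, support in frequent.items():
        if target in itemset and len(itemset) > 1:
            antecedent = itemset - {target}
            # The antecedent is frequent because it is a subset of a
            # frequent itemset, so its support count is available.
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, target, confidence))
    return rules


# Hypothetical transactions for a single tissue type.
tissue = [
    {"TF_A", "TF_B", "GENE_X"},
    {"TF_A", "TF_B", "GENE_X"},
    {"TF_A", "TF_C"},
    {"TF_B", "GENE_X"},
]

frequent = apriori(tissue, min_support=2)
for factors, gene, conf in rules_for_target(frequent, "GENE_X", 0.9):
    print(sorted(factors), "->", gene, round(conf, 2))
```

In this sketch, the confidence of a rule is the support count of the full itemset divided by the support count of the transcription factors alone, so a rule is kept only when the target gene appears in nearly every transaction containing those factors.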