Decoding gene expression regulation through motif discovery and classification

Biological systems are complex machines whose numerous components interact with one another. By regulating gene expression, these systems behave differently under different conditions. The regulatory rules are largely determined by DNA, the primary carrier of heritable information. It is therefore of interest to infer these rules by connecting DNA sequences to gene expression. Modern high-throughput technologies provide massive amounts of data on sequence features and gene expression. However, the scale of these data also brings challenges in variable selection and computational efficiency.