Determining Significant Fold Differences in Gene Expression Analysis

A typical use of RNA expression microarrays is to compare gene expression measurements between two groups. No study has yet reproduced an entire experiment and modeled the distribution of reproducibility of fold differences. Our goal was to create a model of significance for fold differences and then to maximize the number of ESTs exceeding the resulting threshold. We tested multiple strategies for filtering out ESTs that contribute to noise, thereby lowering the fold difference required for significance. We found that although RNA expression levels appear consistent in duplicate measurements, the calculated fold differences are less consistent when entire experiments are duplicated. It is therefore critically important to repeat as many data points as possible to ensure that genes and ESTs labeled as significant are truly so. We successfully used duplicated expression measurements to model the duplicated fold differences and to calculate the fold difference needed to reach significance. This approach can be applied to many other experiments to ascertain significance without a priori assumptions.
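As a rough illustration of the idea of deriving a fold-difference threshold from a duplicated experiment, the following minimal sketch may help. It is not the model used in this study: the function names (`log2_fold`, `min_reproducible_fold`), the use of log2 ratios, the sign-agreement definition of reproducibility, and the 95% target are all assumptions made for the example.

```python
import math

def log2_fold(group_a, group_b):
    """Per-EST log2 fold difference between two groups (hypothetical helper)."""
    return [math.log2(a / b) for a, b in zip(group_a, group_b)]

def min_reproducible_fold(folds_1, folds_2, target=0.95):
    """Smallest |log2 fold| cutoff in experiment 1 at which at least `target`
    of the selected ESTs change in the same direction in experiment 2.

    Assumption: "reproducible" means the two fold differences share a sign.
    Returns None if no cutoff reaches the target.
    """
    pairs = list(zip(folds_1, folds_2))
    # Try each observed magnitude as a candidate cutoff, smallest first.
    for cutoff in sorted({abs(f1) for f1, _ in pairs}):
        selected = [(f1, f2) for f1, f2 in pairs if abs(f1) >= cutoff]
        agreeing = sum(1 for f1, f2 in selected if f1 * f2 > 0)
        if agreeing / len(selected) >= target:
            return cutoff
    return None

# Illustrative values only, not data from the study.
experiment_1 = [0.2, -0.3, 1.0, -1.5, 2.0]
experiment_2 = [-0.1, 0.4, 0.8, -1.2, 1.9]
print(min_reproducible_fold(experiment_1, experiment_2))
```

Because the agreement rate need not rise monotonically with the cutoff, this sketch simply reports the first cutoff that meets the target; a fuller treatment would fit a distribution to the duplicated fold differences rather than scan empirical cutoffs.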