We use a differential equation model to determine the critical parameters characterizing each gene expression profile. This method allows us to analyze the transcriptional events underlying the large volume of time-course gene expression data and to determine key parameters, such as the turn-on ($t_{\mathrm{on}}$) and turn-off ($t_{\mathrm{off}}$) times of important genes during cerebellum development. A brief introduction to the model follows:
Our transcriptome profiling of the developing cerebellum produced a temporal sequence of expression levels for each gene in the B6 and D2 strains. We adapted a kinetic model (Sasik et al. 2002; Obitko 1999) to determine the critical parameters characterizing each gene's expression profile.
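A first-order differential equation is used to model the abundance $x(t)$ of a gene transcript at time $t$,
$$\frac{dx}{dt} = R(t) - \lambda\, x(t), \tag{1}$$
where $R(t)$ is a gene-specific transcription regulation term and $\lambda$ is a gene-specific decay rate. It is reasonable to assume that the transcription regulation is a sharp function of time,
$$R(t) = \begin{cases} R_1, & t_{\mathrm{on}} < t \le t_{\mathrm{off}}, \\ R_0, & \text{otherwise}, \end{cases} \tag{2}$$
where $R_0$ and $R_1$ are the basal and stimulated transcription rates, i.e. there are two levels of regulation: basal transcription, and stimulated transcription that starts at $t_{\mathrm{on}}$ and ends at $t_{\mathrm{off}}$.
Taking the transcript to be at its basal steady state $R_0/\lambda$ before $t_{\mathrm{on}}$, the solution of (1) and (2) is
$$x(t) = \begin{cases} \dfrac{R_0}{\lambda}, & t \le t_{\mathrm{on}}, \\[1ex] \dfrac{R_0}{\lambda} + \dfrac{R_1 - R_0}{\lambda}\left(1 - e^{-\lambda (t - t_{\mathrm{on}})}\right), & t_{\mathrm{on}} < t \le t_{\mathrm{off}}, \\[1ex] \dfrac{R_0}{\lambda} + \dfrac{R_1 - R_0}{\lambda}\left(1 - e^{-\lambda (t_{\mathrm{off}} - t_{\mathrm{on}})}\right) e^{-\lambda (t - t_{\mathrm{off}})}, & t > t_{\mathrm{off}}. \end{cases}$$
This solution is characterized by the five parameters $R_0$, $R_1$, $\lambda$, $t_{\mathrm{on}}$ and $t_{\mathrm{off}}$. To determine the parameters, we fit the solution to the measured mRNA abundance levels $x_i^{\mathrm{obs}}$ at the sampled time points $t_i$ by minimizing the sum of squares
$$\chi^2 = \sum_{i=1}^{N} \left[ x(t_i) - x_i^{\mathrm{obs}} \right]^2$$
in the space of the five parameters. The quality of the fit is measured by a statistic defined in terms of $\bar{x}$, the average expression level of the gene, and $N$, the number of time points.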
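The turn-on-turn-off model and its least-squares objective can be sketched in a few lines of Python. This is a minimal illustration, not the fitting code used for the analysis. The names `r0`, `r1`, `lam`, `t_on` and `t_off` are our own labels for the basal rate, stimulated rate, decay rate and switching times. The closed form assumes the transcript sits at its basal steady state before turn-on. The grid search over switching times is a stand-in for whatever optimizer was actually used.

```python
import math


def transcript_level(t, r0, r1, lam, t_on, t_off):
    """Closed-form solution of dx/dt = R(t) - lam * x for a two-level R(t).

    Assumption (ours): x equals the basal steady state r0 / lam before t_on.
    """
    basal = r0 / lam
    gain = (r1 - r0) / lam  # extra steady-state abundance under stimulation
    if t <= t_on:
        return basal
    if t <= t_off:
        # Relaxation toward the stimulated steady state r1 / lam.
        return basal + gain * (1.0 - math.exp(-lam * (t - t_on)))
    # Decay of the excess reached at t_off back toward the basal level.
    peak_excess = gain * (1.0 - math.exp(-lam * (t_off - t_on)))
    return basal + peak_excess * math.exp(-lam * (t - t_off))


def sum_of_squares(params, times, observed):
    """Sum of squared residuals between the model and observed abundances."""
    return sum(
        (transcript_level(t, *params) - y) ** 2
        for t, y in zip(times, observed)
    )


def fit_switch_times(times, observed, r0, r1, lam, candidates):
    """Illustrative brute-force search over (t_on, t_off) with rates held fixed.

    Returns (best_score, t_on, t_off). A real fit would vary all five
    parameters with a numerical minimizer.
    """
    best = None
    for t_on in candidates:
        for t_off in candidates:
            if t_off <= t_on:
                continue
            score = sum_of_squares((r0, r1, lam, t_on, t_off), times, observed)
            if best is None or score < best[0]:
                best = (score, t_on, t_off)
    return best
```

Holding the rates fixed keeps the example short. In practice all five parameters would be varied together, and the result can depend on the starting guess because the objective is not smooth in the switching times.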
Analysis & Results
We used the differential equation model to analyze the B6 and D2 cerebellum developmental time-series microarray data and found that the expression profiles of 560 genes (in both strains) fit the turn-on-turn-off model. We then removed genes with low expression levels, keeping only those with expression above 400. The final list contains 187 genes. According to Gene Ontology, this list is significantly enriched for genes involved in "nervous system development" (22 genes, p-value = 1E-9).
The following figure contains 374 panels, two for each gene (one per mouse strain). Each panel shows the curve generated by the differential equation model together with the normalized expression data for a gene (red stars). Each panel is labeled in the upper-left corner with the gene name:strain (genes associated with the GO term "nervous system development" are highlighted in green), the turn-on time, the turn-off time and the average expression value. The number in the upper-right corner is the "time difference" between the two strains, defined as the between-strain difference in turn-on time plus the between-strain difference in turn-off time. The panels are sorted by this time difference.
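The "time difference" used to order the panels can be sketched as follows. This is an illustration under stated assumptions: each difference is taken as an absolute value and the sort is ascending, neither of which the text specifies. The gene names and switching times in the usage example are hypothetical placeholders, not results from the data.

```python
from typing import Dict, List, NamedTuple, Tuple


class SwitchTimes(NamedTuple):
    """Fitted turn-on and turn-off times for one gene in one strain."""
    t_on: float
    t_off: float


def time_difference(b6: SwitchTimes, d2: SwitchTimes) -> float:
    """Turn-on difference plus turn-off difference between the two strains.

    Assumption (ours): absolute differences, so the two terms cannot cancel.
    """
    return abs(b6.t_on - d2.t_on) + abs(b6.t_off - d2.t_off)


def sort_by_time_difference(
    fits: Dict[str, Tuple[SwitchTimes, SwitchTimes]]
) -> List[str]:
    """Return gene names ordered by between-strain time difference (ascending)."""
    return sorted(fits, key=lambda gene: time_difference(*fits[gene]))


# Hypothetical placeholder values, for illustration only.
example_fits = {
    "geneA": (SwitchTimes(3.0, 10.0), SwitchTimes(5.0, 9.0)),
    "geneB": (SwitchTimes(4.0, 8.0), SwitchTimes(4.0, 8.0)),
}
ordered = sort_by_time_difference(example_fits)
```

Taking absolute values means a gene that turns on earlier but turns off later in one strain still scores as different.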
Genes with large between-strain time differences are of particular interest because they might be responsible for differences in the cerebellum development process between B6 and D2. Genes with very small between-strain time differences may also be important: their expression timing is highly conserved between B6 and D2, which suggests that precise timing control of these genes is required for cerebellum development in both strains.
See figure on the following pages.