Conserved and Differential Interactions Detected from Microarray Data in Mouse Cerebellar
Development
In this report, we compare gene expression among three developmental stages (according
to Thomass request in August, 2010), to identify conserved and differential gene
interactions using the comparative dynamical system modeling (DSM) method we have
developed. The three stages include early embryonic (EE), late embryonic (LE) and
postnatal (PN). By gene expression time courses, 1435 transcription factors
(TFs) including alternative splicing variants were grouped into 291 clusters
for DBA and 251 for BL6. We applied comparative DSM modeling on EE versus
LE+PN, LE versus EE+PN, and PN versus EE+LE for the two strains separately. For
each of the three comparisons on the DBA strain, we detected 272, 201, 260
differential and 19, 90, 31 conserved interactions among the cluster representatives,
at a p-value of less than 0.05; for the BL6 strain, we identified 220, 206, 243
differential and 30, 45, 8 conserved interactions. From this result, we found
a significant interaction between pax6 and klf4. We also tested interactions between
splicing variants of klf4 and pax6 and found that some transcripts interact differentially
during the three developmental stages.
1. Designation of development stages
This study divides the developmental time window into three stages: early embryonic,
late embryonic and postnatal. The early embryonic stage contains E12, E13
and E14; the late embryonic stage consists of E15, E16, E17, E18 and E19; the postnatal
stage includes P0, P3, P6 and P9. (This is a revised developmental stage designation,
according to Thomass email in August 2010, from our previous studies.)
2. Modeling separately gene interactions in
DBA and BL6 strains
We treated the DBA and BL6 strains separately in comparing gene interactions in the
developmental stages. We observed that expression patterns of some genes
in the two strains are quite different. We tried two methods to deal with the strain
difference: one was to filter out genes showed significant difference between the
two strains; the other is to perform comparative modeling on the two strains separately.
The results in this report is produced with the second method, because many of the
1435 TFs are very different between the two strains and would have been filtered
out if we had to use the first method.
3. Clustering
We clustered transcripts of the two strains separately into linearly correlated groups.
The clustering results are saved in TOTAL_cluster_DBA.txt and TOTAL_cluster_BL6.txt.
Each cluster includes genes and their probes with linearly correlated transcripts
across the entire time course. Technical replicates were likely to be grouped into
the same cluster.
4. Comparative dynamical system modeling
Comparative DSM was applied on analyzing interactions during cerebellar development
from gene expression data. A DSM consists of a set of ordinary differential
equations (ODE). And a nonlinear sigmoidal ODE model is used to describe the expression
rate of a gene as
(1)
where
is the
expression level of gene i, influenced by gene j and k,
s are
constant rate coefficients to be estimated, and the last term models mRNA degradation
with rate constant
>0.
Self-influence is allowed, i.e., a gene can influence its own rate of change. Comparative
DSM is a framework to detect differential and conserved interactions through three
statistical tests, explained in Table 1.
Table 1. Statistical tests to determine the strengths of total interaction,
homogeneity and heterogeneity for a pair of interactions across experimental conditions.
Null hypothesis
|
Alternative hypothesis
|
Test statistic
|
Significance
|
No interaction
|
Active interaction
|
Ft
|
pt
|
No homogeneity
in interaction
|
Homogeneous interaction
|
Fc
|
pc
|
No heterogeneity
in interaction
|
Heterogeneous interaction
|
Fd
|
pd
|
Mathematically, any of the three test statistics can be derived from the other two.
Comparative DSM searches the best model including the influential genes for each
gene representative, according to pt, which measures the total interaction strength.
Then according to pd and pc, differential (pd <= 0.05) or conserved interactions
(pd > 0.05, pc <= 0.05) will be determined. The smaller the pd value is, the
more significant is a differential interaction, likewise for pc to a conserved interaction.
5. Results
5.1 Comparative modeling results across three development stages for
individual strains
For studying interactions among clusters of transcription factors, we did three comparative
runs for each strain: early embryonic, late embryonic, and postnatal versus the
corresponding remaining stages. This resulted in, for each strain, 3 comma
separated value (CSV) files describing interactions between transcript clusters.
So there are a total of 6 cluster interaction files for both strains. The
name of a CSV file, e.g., Early_Embryonic_DBA_0.csv, contains the development stage
(Early Embryonic), a strain name (DBA), and a zero, for a comparative result between
early embryonic and the other two stages for the DBA strain. Each file contains
6 columns, including symbol, probeID, cluster, cluster representative, parent cluster,
total interaction strength p-value (pt), homogeneity p-value (pc) and heterogeneity
p-value (pd). Symbol and probeID together determine a unique transcript of a gene.
The cluster column contains the labels of the clusters to which each transcript
is assigned; Cluster representative is the transcript chosen to represent a cluster;
parent cluster is the cluster whose representative statistically significantly influences
the expression pattern of the current cluster representative, as determined by comparative
modeling; pt, pc and pd are described in Table 1.
In the BL6 strain, transcripts of klf4 were assigned to clusters C77, C95 and C119;
and transcripts of pax6 were assigned to clusters C1, C28 and C35. A differential
interaction between C28 and C119 showed up significantly when we compared
early embryonic with the other two stages. And a more significantly differential
interaction between C35 and C95 also appeared when we compared postnatal
with the other two stages. We also noticed that cluster C165 is consistently
active in influencing cluster C119 in three comparative analyses. It might
thus be biologically interesting to examine those genes in C165 for their
role in regulation of klf4.
In the DBA strain, transcripts of klf4 were assigned to clusters C64, C163 and C225;
and pax6 into C31, C92 and C106. A conserved interaction between C31
and C163 was detected by comparative DSM in both comparisons of late embryonic
v.s. the other two stages and postnatal v.s. the other two stages. Two differential
interactions between C31 and C64 and between C31 and C225 were detected
when we compared postnatal with the other two stages.
Thus, comparative modeling on both strains suggested that interactions between various
transcripts of pax6 and klf4 might be actively involved in cerebellar development.
This is based on the strong statistically differential expression pattern between
the embryonic (early + late, i.e., EE+LE) and the postnatal (PN) stages, we detected
above.
5.2 The interaction between various transcripts of pax6 and klf4
between pre-EGL and EGL expansion
We also studied interactions between the transcripts of pax6 and klf4, by contrasting
the interactions between the time duration of pre-EGL expansion (E12, E13, E14,
E15) and that of EGL expansion (E16, E17, E18, E19, P0, P3, P6, P9). We
detected conserved and differential interactions between various transcripts of
the two genes in both strains, as shown in Table 2 and Table 3. These results are
obtained differently from previous ones in four aspects. First, manually choosing
the order of smoothing splines, shown in Figure 3, may introduce inconsistency
in estimating the expression rate, the derivative in equation (2). Second,
>0
is required which means if it is not satisfied, all p-values would be set
as 1. Third, dissociation consistent
and hill
coefficient
was applied.
and
were
chosen from {5,6,7,8,9,10,11} and {2,3,4,5} respectively. The range was decided
in order to capture the non-linear dynamical change of TFs, since the average of
all transcripts is 8. Forth, self-influence may exist, shown in equation (2). According
to overall significance,
, self-influence
or not will be chosen along with a combination of
and
. Two
interaction patterns are shown in the phase diagram in Figure 1, to illustrate possible
differential and conserved interactions between pre-EGL expansion and EGL expansion
involving different transcripts of the two genes by observations. These interaction
patterns are also shown in Figure 2 through model predictions.
(2)
Table 2. Interactions between transcripts of pax6 and klf4 in the BL6 strain.
Interactions in blue background are differential, and the ones in yellow background
are conserved. The rest are inactive interactions.
child transcript
|
parent transcript
|
pt
|
pc
|
pd
|
Klf4.1850022
|
Pax6.101660253
|
0
|
0
|
0.163
|
Klf4.1850022
|
Pax6.102340114
|
0
|
0
|
0.05
|
Klf4.1850022
|
Pax6.105720411
|
1
|
1
|
1
|
Klf4.1850022
|
Pax6.1190025
|
0
|
0.018
|
0.003
|
Klf4.3830239
|
Pax6.101660253
|
0.004
|
0
|
0.181
|
Klf4.3830239
|
Pax6.102340114
|
1
|
1
|
1
|
Klf4.3830239
|
Pax6.105720411
|
0
|
0
|
0.19
|
Klf4.3830239
|
Pax6.1190025
|
0.029
|
0.017
|
0.096
|
Klf4.5570750
|
Pax6.101660253
|
1
|
1
|
1
|
Klf4.5570750
|
Pax6.102340114
|
1
|
1
|
1
|
Klf4.5570750
|
Pax6.105720411
|
1
|
1
|
1
|
Klf4.5570750
|
Pax6.1190025
|
0
|
0
|
0.178
|
Table 3. Interactions between transcripts of pax6 and klf4 in the DBA strain.
The ones in yellow background are conserved and the rest are inactive interactions.
child transcript
|
parent transcript
|
pt
|
pc
|
pd
|
Klf4.1850022
|
Pax6.101660253
|
0
|
0
|
0.177
|
Klf4.1850022
|
Pax6.102340114
|
0.001
|
0.001
|
0.111
|
Klf4.1850022
|
Pax6.105720411
|
0
|
0
|
0.183
|
Klf4.1850022
|
Pax6.1190025
|
0.02
|
0.007
|
0.089
|
Klf4.3830239
|
Pax6.101660253
|
0.004
|
0
|
0.167
|
Klf4.3830239
|
Pax6.102340114
|
0.016
|
0.001
|
0.148
|
Klf4.3830239
|
Pax6.105720411
|
0
|
0
|
0.175
|
Klf4.3830239
|
Pax6.1190025
|
0.033
|
0.008
|
0.106
|
Klf4.5570750
|
Pax6.101660253
|
0.011
|
0
|
0.171
|
Klf4.5570750
|
Pax6.102340114
|
0.009
|
0
|
0.179
|
Klf4.5570750
|
Pax6.105720411
|
0.082
|
0
|
0.195
|
Klf4.5570750
|
Pax6.1190025
|
0.054
|
0.034
|
0.092
|


Figure 1. Differential and conserved interactions between different transcripts
of pax6 and klf4 between pre-EGL expansion and EGL expansion viewed through the
observations. A label on the plots represents the time of an observation, and its
coordinates are the expression level of two transcripts of klf4 and pax6. An arrow
represents a transition in expression level over time: the dashed is for pre-EGL
expansion, and the solid is for EGL expansion. Left: In a differential interaction
between transcripts klf4.1850022 and pax6.1190025, transition in time
course shows a diverging trend between the two stages. Right: In a conserved interaction
between transcripts klf4.1850022 and pax6.101660253, transitions in
both time courses converge to the top right corner. Despite the difference in working
zone between the two stages, we still consider this interaction conserved, as they
appear to approach a common steady state.


Figure 2. Differential and conserved interactions between different transcripts
of pax6 and klf4 between pre-EGL expansion and EGL expansion viewed through the
model predictions. In these plots, the x- and y-axis represent expression levels
of a parent transcript and a child transcript while the z-axis is the expression
rate of the child transcript. Since we consider degradation and self-influence,
both observations of the parent and its child are taken into account. Each point
represents observed expression levels of a parent and its child, and the corresponding
child expression rate at each time point. The different colors of the points represent
different developmental stages, pre-EGL expansion and EGL expansion. The child expression
rate is estimated by smoothing splines shown in Figure 3. The surfaces are model
predictions. Red and blue surfaces are produced by models for pre-EGL expansion
and EGL expansion data, respectively, and the yellow one is generated by a model
for pooled data. Top: In a differential interaction for klf4.1850022 from
pax6.1190025, the expression levels of the pax6.1190025 have different
influence patterns on the expression rate of klf4.1850022, as the blue and
red surfaces substantially deviate from the yellow surface. Bottom: In a conserved
interaction for klf4.1850022 from pax6.101660253, the expression levels
of the pax6.101660253 have the same influence patterns on the expression
rate of klf4.185002, as the blue and red surfaces both substantially overlap
the yellow surface, though the ranges of the expression levels of pax6.101660253
are different for the two developmental stages.



Figure 3. Three smoothing splines fitted from time course observations. The
x-axis in the plots represents time and y-axis the expression levels. The day of
birth (P0) is marked as 0. Points are observed gene expression levels. The red lines
are smoothing splines. A derivative, or expression rate, at each time point is estimated
by smoothing splines. The three plots are for transcript pax6.1190025, pax6.101660253
and klf4.1850022, respectively, in the BL6 strain.