<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1746-4811-2-8</ui>
   <ji>1746-4811</ji>
   <fm>
      <dochead>Methodology</dochead>
      <bibl>
         <title>
            <p><it>ModuleFinder </it>and <it>CoReg</it>: alternative tools for linking gene expression modules with promoter sequences motifs to uncover gene regulation mechanisms in plants</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Holt</snm>
               <mi>E</mi>
               <fnm>Kathryn</fnm>
               <insr iid="I1"/>
               <email>katholt@graduate.uwa.edu.au</email>
            </au>
            <au id="A2">
               <snm>Millar</snm>
               <fnm>A Harvey</fnm>
               <insr iid="I1"/>
               <email>hmillar@cyllene.uwa.edu.au</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Whelan</snm>
               <fnm>James</fnm>
               <insr iid="I1"/>
               <email>seamus@cyllene.uwa.edu.au</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>ARC Centre of Excellence in Plant Energy Biology, CMS Building M310 University of Western Australia, 35 Stirling Highway, Crawley 6009, Western Australia, Australia</p>
            </ins>
         </insg>
         <source>Plant Methods</source>
         <issn>1746-4811</issn>
         <pubdate>2006</pubdate>
         <volume>2</volume>
         <issue>1</issue>
         <fpage>8</fpage>
         <url>http://www.plantmethods.com/content/2/1/8</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16606469</pubid>
               <pubid idtype="doi">10.1186/1746-4811-2-8</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>05</day>
               <month>11</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>11</day>
               <month>4</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>4</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Holt et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Uncovering the key sequence elements in gene promoters that regulate the expression of plant genomes is a huge task that will require a series of complementary methods for prediction, substantial innovations in experimental validation and a much greater understanding of the role of combinatorial control in the regulation of plant gene expression.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>To add to this larger process and to provide alternatives to existing prediction methods, we have developed several tools in the statistical package R. <it>ModuleFinder </it>identifies sets of genes and treatments that we have found to form valuable sets for analysis of the mechanisms underlying gene co-expression. <it>CoReg </it>then links the hierarchical clustering of these co-expressed sets with frequency tables of promoter elements. These promoter elements can be drawn from known elements or all possible combinations of nucleotides in an element of various lengths. These sets of promoter elements represent putative <it>cis</it>-acting regulatory elements common to sets of co-expressed genes and can be prioritised for experimental testing. We have used these new tools to analyze the response of transcripts for nuclear genes encoding mitochondrial proteins in <it>Arabidopsis </it>to a range of chemical stresses. <it>ModuleFinder </it>provided a subset of co-expressed gene modules that are more logically related to biological functions than did subsets derived from traditional hierarchical clustering techniques. Importantly <it>ModuleFinder </it>linked responses in transcripts for electron transport chain components, carbon metabolism enzymes and solute transporter proteins. <it>CoReg </it>identified several promoter motifs that helped to explain the patterns of expression observed.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p><it>ModuleFinder </it>identifies sets of genes and treatments that form useful sets for analysis of the mechanisms behind co-expression. <it>CoReg </it>links the clustering tree of expression-based relationships in these sets with frequency tables of promoter elements. These sets of promoter elements represent putative <it>cis</it>-acting regulatory elements for sets of genes, and can then be tested experimentally. We consider these tools, both built on an open source software product, to provide valuable alternatives for the prioritisation of promoter elements for experimental analysis.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The regulation of gene expression is one of the most intensively studied areas of biology. The regulation of transcription, the first committed step in gene expression, is achieved via the interaction of transcription factors with <it>cis </it>acting regulatory elements (CAREs) <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. A complete understanding of the interaction between transcription factors and regulatory sequences will ultimately lead to a picture of the regulatory networks operating in a biological system. Genome wide studies on the expression of transcription factors are currently underway in attempts to gain data that can be used to understand the complex nature of gene regulation that exists to coordinate cellular functions <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. The structure of such regulatory networks (multi-component regulatory factors that have overlapping but also discrete activities) for a plant can begin to be hypothesized using the ~1,500 transcription factors in <it>Arabidopsis </it>in a combinatorial manner to achieve regulation of the 28,000 or more genes <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>The completion of the <it>Arabidopsis </it>nuclear genome sequence means that the analysis of plant gene expression has changed from probing the expression of a single or few genes at a time to simultaneous analysis of the expression of virtually every gene <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. This change in the amount of data available represents a considerable challenge for biologists to extract knowledge from these data and use it in a productive manner to investigate the mechanisms underlying gene regulation, i.e. the further dissection of a complex network of combinatorial control.</p>
         <p>The analysis of <it>Arabidopsis </it>microarray expression data sets can be carried out at scales ranging from single-gene analysis to whole-genome approaches. At a single gene level many researchers can simply look up how their gene or genes of interest are changing under a large number of conditions. This approach has been facilitated by the use of tools such as Genevestigator, which enables complex array data to be easily interrogated for a gene of interest <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. At a wider genome level hierarchical clustering has been applied to complete genome transcriptomic data during growth and development <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, following various biotic and abiotic treatments <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> and after alterations in transcript abundances due to changes in nutrient availability <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Development of analysis packages such as MAPMAN has allowed plant biologists to visualize transcriptomic data on metabolic pathways, which should lead to a greater understanding and use of transcriptomic data <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>.</p>
         <p>Even though large-scale analyses like those above can identify, and have identified, novel associations of biological significance, the clustering methods used also tend to split or miss relationships in such data. The transcripts from a group of genes may respond to a number of treatments in a similar manner, but in additional treatments their responses may differ. In a hierarchical cluster analysis of all these treatments the relationship between these genes will often be masked and they will be separated to different parts of the clustering tree. This loss of association is compounded by the fact that clustering of gene expression data is often carried out with the intent of identifying co-expressed genes and then using these data to elucidate the regulation of these genes, i.e. to identify CAREs and the transcription factors that bind them. As transcription factor binding sites are small in size (6 to 10 bp <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>) compared to the large number of DNA bases in promoter regions, there is a significant challenge in identifying these regions of important sequence. Direct experimental confirmation requires considerable effort, so computational efforts to identify the most likely putative CAREs are essential. The identification of similar CAREs in co-expressed genes thus becomes crucial as it will determine the quality of input for such analysis.</p>
         <p>An alternative approach to hierarchical clustering for analysing array expression data is to define associations based on similarities in transcript abundance in a subset of treatments. Such two-way clustering, or biclustering, uses iterative approaches to define relationships between subsets of genes and subsets of treatments. This approach has been most widely used in the analysis of transcript datasets from cancer samples <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>. Various approaches such as the progressive iterative signature algorithm (PISA) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, gene expression mining server (GEMS) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, coupled two-way clustering (CTWC) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> and X-Motifs <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> use this principle to search for relationships that go largely undetected using hierarchical clustering.</p>
         <p>We have taken a biclustering approach to identify co-expressed genes and to predict their CAREs. Firstly, we have reduced the number of genes analyzed by using only a subset, in this example those that encode proteins located in mitochondria <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. Secondly, we have identified genes that are co-expressed in response to subsets of treatments using a novel approach via a tool we have developed and named <it>ModuleFinder</it>. The pattern of co-expressed genes produced in <it>ModuleFinder </it>can be exported to visualize functional groups in tools such as MAPMAN. To predict CAREs we have used the hierarchical clustering produced in <it>ModuleFinder </it>and the assumption that the resulting hierarchical tree structure of the expression data is a reflection of patterns of CAREs in promoter regions. Thus the hierarchical relationships identified based on the expression data can be used to identify these promoter elements. We have developed a tool named <it>CoReg </it>to undertake this CARE prediction.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Existing approaches are not well suited to identifying shared responses among numerous non-linear-related treatments</p>
            </st>
            <p>Cluster analysis is a useful technique for identifying genes whose expression patterns across a given set of treatments are similar. For example, such analysis will cluster together all those genes whose expression is up-regulated in response to treatments A, B and C, down-regulated in response to treatments D, E and F, and unaffected by treatments G and H (Figure <figr fid="F1">1</figr>, cluster 1). However, since expression data from all treatments is used in the analysis, this cluster will not include genes that are up-regulated in response to A, B, C and G, and down-regulated in response to D, E, F and H (Figure <figr fid="F1">1</figr>, cluster 3). These will be grouped together into a separate cluster since their expression patterns differ under treatments G and H (Figure <figr fid="F1">1</figr>, cluster 3). The similarity between the clusters in response to treatments A to F is masked in the analysis and the cluster tree. Yet from a biological point of view, the fact that both clusters display co-ordinated expression in response to treatments A to F is very interesting. It may indicate that they are co-regulated by a factor that is induced or activated under treatments A-C and repressed or inactivated under treatments D-F. Thus it would be informative to identify the genes of clusters 1 and 3, and the treatments A-F, as a co-ordinated gene expression module. Such a module contains more member genes, and it can be argued that a biologically significant mechanism is more likely to become apparent in the analysis of this larger set than in the analysis of the two separate smaller groups produced by classical cluster analysis.</p>
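            <p>The masking effect described above can be illustrated with a minimal Python sketch. It is not part of <it>ModuleFinder</it>: the gene names, the response calls (+1, -1, 0) and the helper function group_by_pattern are all invented for illustration, and grouping by identical response pattern is only a crude stand-in for hierarchical clustering.</p>

```python
from collections import defaultdict

TREATMENTS = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Invented response calls: +1 = up-regulated, -1 = down-regulated, 0 = unaffected.
# g01/g02 mimic cluster 1 of Figure 1; g09/g10 mimic cluster 3.
PROFILES = {
    "g01": [1, 1, 1, -1, -1, -1, 0, 0],
    "g02": [1, 1, 1, -1, -1, -1, 0, 0],
    "g09": [1, 1, 1, -1, -1, -1, 1, -1],
    "g10": [1, 1, 1, -1, -1, -1, 1, -1],
}


def group_by_pattern(profiles, treatments, subset):
    """Group genes whose response calls are identical over the chosen treatments."""
    columns = [treatments.index(name) for name in subset]
    groups = defaultdict(list)
    for gene, calls in profiles.items():
        key = tuple(calls[i] for i in columns)
        groups[key].append(gene)
    return sorted(sorted(members) for members in groups.values())


# Using every treatment keeps the two groups apart ...
all_treatments = group_by_pattern(PROFILES, TREATMENTS, TREATMENTS)
# ... whereas restricting the comparison to treatments A-F merges them.
shared_subset = group_by_pattern(PROFILES, TREATMENTS, ["A", "B", "C", "D", "E", "F"])

print(all_treatments)
print(shared_subset)
```

            <p>With these toy values the genes fall into two groups when all eight treatments are compared, but form a single group of four genes when only treatments A to F are considered.</p>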
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Shared gene expression responses can be split in simple cluster analysis</p>
               </caption>
               <text>
                  <p><it>Shared gene expression responses can be split in simple cluster analysis</it>. A) Classical cluster analysis groups together genes whose expression patterns are similar across all available experiments. This cluster analysis of genes 1 to 12 in treatments A to H splits the genes into three separate clusters. B) Clusters 1 and 3 (genes 1&#8211;4 and 9&#8211;12) are co-ordinately expressed in response to treatments A-F.</p>
               </text>
               <graphic file="1746-4811-2-8-1"/>
            </fig>
            <p>We have thus developed <it>ModuleFinder </it>in <it>R </it>with the aim of identifying gene expression modules in a way that facilitates the subsequent interpretation of results. The method was designed to allow easy visualization not only of the expression patterns of discrete modules, but also of the relationships between the modules. The aim of <it>ModuleFinder </it>is to identify gene expression responses that are shared among subsets of treatments and genes; the approach is to first identify gene clusters that are co-expressed in a small subset (often a pair) of treatments, then look for other treatments in which these gene clusters are expressed in a similar co-ordinated manner. This approach ignores the differences in treatment effects and focuses on the shared effects on gene expression, which are expected to be related to the activation of common gene regulatory pathways.</p>
         </sec>
         <sec>
            <st>
               <p>The ModuleFinder algorithm</p>
            </st>
            <p><it>ModuleFinder </it>takes as its input a matrix of expression data from a set of experiments, for example the set of average log expression ratios for genes from a range of experimental treatments compared to a control. It also requires a matrix of p-values associated with each data point, providing an assessment of how likely it would be to observe the gene expression values if there were really no change in experimental compared to control conditions. <it>P</it>-values may be calculated from the original expression measures via an appropriate statistical method, e.g. <it>t</it>-tests.</p>
            <p>The algorithm begins with a subset of experiments and extracts the genes whose expression levels differ from control conditions in those experiments, according to the p-values provided. <it>ModuleFinder </it>then clusters the genes hierarchically and splits them into co-expressed modules based on the resulting clustering tree. Next, the algorithm searches for another experiment (outside the initial subset) that fits the expression patterns of these modules. The new experiment is added to the module and the genes are re-clustered. Experiments are added one by one in an iterative procedure of searching for matching experiments and re-clustering the genes, until no more experiments can be found that fit the module expression patterns. The resulting subsets of genes and experiments are referred to as gene expression modules, as they define not only gene clusters but also subsets of genes whose expression is co-ordinated in a specific subset of experiments. A general scheme of the program is illustrated in Figure <figr fid="F2">2A</figr>.</p>
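            <p>The iterative procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the <it>ModuleFinder </it>implementation (which is written in <it>R</it>): genes are selected only if they pass the p-value cut-off in every seed experiment, hierarchical clustering is replaced by a simple split on the direction of the mean response, and an experiment is taken to "fit" when a chosen fraction of module genes respond in the direction of their module. All function names, the min_fit parameter and the toy data are hypothetical.</p>

```python
def sign(x):
    """Return +1 for positive values and -1 otherwise."""
    return 1 if x > 0 else -1


def select_genes(pvals, experiments, cutoff):
    """Keep genes whose p-value is at or below the cut-off in every listed experiment."""
    return sorted(g for g in pvals if all(cutoff >= pvals[g][e] for e in experiments))


def split_modules(expr, genes, experiments):
    """Stand-in for hierarchical clustering: split genes by direction of mean response."""
    modules = {1: [], -1: []}
    for g in genes:
        mean = sum(expr[g][e] for e in experiments) / len(experiments)
        modules[sign(mean)].append(g)
    return {d: m for d, m in modules.items() if m}


def fit_score(expr, modules, experiment):
    """Fraction of module genes whose direction in `experiment` matches their module."""
    total = sum(len(m) for m in modules.values())
    hits = sum(1 for d, m in modules.items() for g in m if sign(expr[g][experiment]) == d)
    return hits / total


def grow_module(expr, pvals, seed, cutoff=0.1, min_fit=0.9):
    """Add the best-fitting experiment one at a time, re-splitting genes after each."""
    experiments = list(seed)
    genes = select_genes(pvals, experiments, cutoff)
    modules = split_modules(expr, genes, experiments)
    all_experiments = sorted(next(iter(expr.values())))
    while modules:
        candidates = [e for e in all_experiments if e not in experiments]
        if not candidates:
            break
        best = max(candidates, key=lambda e: fit_score(expr, modules, e))
        if min_fit > fit_score(expr, modules, best):
            break  # no remaining experiment fits the module patterns
        experiments.append(best)
        modules = split_modules(expr, genes, experiments)
    return experiments, modules


# Invented log ratios and p-values for four genes in four experiments.
EXPR = {
    "g1": {"SA": 2.0, "Rot": 1.5, "X": 1.8, "Y": -0.2},
    "g2": {"SA": 1.2, "Rot": 1.0, "X": 0.9, "Y": 0.4},
    "g3": {"SA": -1.5, "Rot": -1.1, "X": -1.3, "Y": 0.6},
    "g4": {"SA": 0.1, "Rot": -0.1, "X": 0.0, "Y": 2.0},
}
PVALS = {
    "g1": {"SA": 0.01, "Rot": 0.02, "X": 0.01, "Y": 0.60},
    "g2": {"SA": 0.04, "Rot": 0.05, "X": 0.06, "Y": 0.40},
    "g3": {"SA": 0.02, "Rot": 0.03, "X": 0.02, "Y": 0.30},
    "g4": {"SA": 0.80, "Rot": 0.70, "X": 0.90, "Y": 0.01},
}

experiments, modules = grow_module(EXPR, PVALS, ["SA", "Rot"])
print(experiments)
print(modules)
```

            <p>With the toy data, the seed experiments SA and Rot select three genes, experiment X is added because all three genes respond in the direction of their module, and experiment Y is rejected, so the run ends with two modules (one induced, one repressed) across three experiments.</p>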
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>An overview of the operation of ModuleFinder</p>
               </caption>
               <text>
                  <p><it>An overview of the operation of ModuleFinder</it>. A) Flow diagram of <it>ModuleFinder</it>. Sets of expression data are taken as input, subsets of genes and experiments are hierarchically clustered, and experiments with similar expression profiles are then added consecutively (indicated with an asterisk). Expression data and TreeView cluster files are saved after each addition, and the entire run is documented in a PDF file. B) The PDF output file includes a heatmap, clustering tree and functional breakdown of the modules at each stage of the run.</p>
               </text>
               <graphic file="1746-4811-2-8-2"/>
            </fig>
            <p>The <it>ModuleFinder </it>algorithm can be run in either a supervised or unsupervised fashion. In an unsupervised run, the algorithm first searches for pairs of experiments in which gene expression was similar (i.e. highly correlated), then builds gene expression modules based on these correlated pairs. Alternatively, users can identify a particular subset of experiments they are interested in, and run the algorithm in a supervised manner by specifying the names of the experiments to provide an initial subset. Related experiments are then added iteratively to this initial set.</p>
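            <p>As an illustration of the unsupervised seeding step, the following Python sketch picks the pair of experiments whose expression values are most highly correlated. The use of the Pearson coefficient is an assumption (the text states only that seed pairs are "highly correlated"), and the function names, experiment labels and values are hypothetical.</p>

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_seed_pair(columns):
    """Return the pair of experiments with the highest Pearson correlation."""
    return max(
        combinations(sorted(columns), 2),
        key=lambda pair: pearson(columns[pair[0]], columns[pair[1]]),
    )


# Invented log ratios for the same four genes in three experiments.
COLUMNS = {
    "SA_10uM": [2.0, 1.2, -1.5, 0.1],
    "rotenone_3h": [1.6, 1.0, -1.1, -0.1],
    "other": [-0.2, 0.4, 0.6, 2.0],
}

seed = best_seed_pair(COLUMNS)
print(seed)
```

            <p>In a supervised run this step would simply be skipped, with the user-supplied experiment names serving as the initial subset.</p>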
            <p>The main output of <it>ModuleFinder </it>is a PDF file containing clustering trees and expression heat maps of the modules produced after the addition of each new experiment. It also includes pie charts displaying the breakdown of each module according to the functional categories of its member genes (Figure <figr fid="F2">2B</figr>). In addition, cluster files are written at each stage for easy viewing of clusters, heat maps and gene annotations in tree viewing programs compatible with TreeView and its Java versions, which can run on any platform <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Excel-compatible, comma-separated files containing the expression data for the subsets of genes and experiments are also saved at each stage (Figure <figr fid="F2">2A</figr>).</p>
         </sec>
         <sec>
            <st>
               <p>Using ModuleFinder to identify modules within the expression of a set of nuclear genes encoding mitochondrial proteins (NGMP)</p>
            </st>
            <p>Traditional hierarchical clustering was compared to <it>ModuleFinder </it>in analysing the expression of 374 <it>Arabidopsis </it>NGMPs in a set of microarray experiments where <it>Arabidopsis </it>suspension cell cultures were subjected to 16 different chemical stresses <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The clustering trees and expression heat maps produced using a standard clustering method (hierarchical clustering using a Euclidean distance measure and the McQuitty method of linkage) are shown in Figure <figr fid="F3">3A</figr> <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. The full clustering tree and heat map showing all individual gene responses are shown in Supplementary Figure <figr fid="F1">1A</figr>. A set of 16 clusters, ranging in size from 4 to 89 genes, can be defined from this analysis. For the <it>ModuleFinder </it>analysis, four experiments comprising salicylic acid at 10 and 100 &#956;M and rotenone after 3 and 12 hours were selected as an initial experiment subset. These treatments had been shown to induce similar responses in the expression of the alternative respiratory pathway components of plant mitochondria <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Using a p-value cut-off of 0.1, a total of 51 (14%) of the genes could be selected in these experiments and clustered into eight modules. A further seven treatments were added by <it>ModuleFinder </it>and are shown in Figure <figr fid="F3">3B</figr>. A number of genes that were separated into several different clusters using hierarchical clustering (Figure <figr fid="F3">3A</figr>) were placed into the same modules or closely related modules by <it>ModuleFinder </it>(Figure <figr fid="F3">3B</figr>). The similarity in biological response to these treatments therefore becomes readily apparent, with genes that are uniformly induced and genes that are uniformly repressed by the treatments clearly identified.
Analysis of the same data set using the coupled two-way clustering (CTWC) algorithm yielded results intermediate between those of traditional hierarchical clustering and those of <it>ModuleFinder </it>(data not shown). Of the 24 genes whose transcript abundance was increased as shown in Figure <figr fid="F3">3B</figr>, 7 and 4 were placed in two close clusters, indicating that the CTWC algorithm and ModuleFinder were both placing together genes that were split in the traditional hierarchical clustering <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. However, using several iterative clustering steps with treatments (sample clusters in CTWC terminology), the relationships between treatments defined by ModuleFinder were not evident in the CTWC results. This may be due to the fact that the initial clustering of treatments and genes is based on the entire data set, so the problems illustrated in Figure <figr fid="F1">1</figr> remain. Furthermore, CTWC works by clustering genes into subsets, then clustering samples into subsets. Each gene subset-sample subset pair is then considered as a sub-matrix and genes and samples are re-clustered within that sub-matrix <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B27">27</abbr></abbrgrp>. The result is a collection of subsets of genes and samples (gene expression modules), which theoretically should display co-ordinated expression patterns. However, this fragmentation of the data into small discrete modules makes it difficult to interpret the CTWC results, and particularly difficult to see the overall trends in expression patterns that ModuleFinder displays.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Traditional hierarchical cluster versus ModuleFinder analysis of 374 Nuclear Genes encoding Mitochondrial Proteins (NGMPs)</p>
               </caption>
               <text>
                  <p><it>Traditional hierarchical cluster versus ModuleFinder analysis of 374 Nuclear Genes encoding Mitochondrial Proteins (NGMPs)</it>. A) The 374 NGMPs were clustered into 16 clusters in response to 16 treatments. This analysis split genes that have related function in a subset of treatments. B) Using the same starting set of genes and treatments, but seeding <it>ModuleFinder </it>with salicylic acid and rotenone treatments (indicated in red), a different grouping of genes is produced. This output contained only 51 genes, divided into 8 modules; this used a p-value cut-off of 0.1, Euclidean distance, complete linkage and a between-to-within-groups variance ratio > 4. The number to the left of the heat map indicates the cluster number that these genes belonged to in the analysis carried out in A above.</p>
               </text>
               <graphic file="1746-4811-2-8-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Visualization of ModuleFinder sets in MAPMAN</p>
            </st>
            <p>The output from <it>ModuleFinder </it>can be visualized using MAPMAN to display functional categories. MAPMAN is a Java program that allows users to annotate images with data from a text file <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Once the appropriate images and annotation files are loaded into the program, users can load a file containing a list of gene identifiers with values assigned to each (e.g. an expression value from a particular experiment), and the genes are mapped onto the pathway image, coloured according to the value (e.g. expression level) in the loaded file. This helps users to recognize whether a number of genes in a pathway were induced or repressed in an experiment. A number of mappings and annotated images come with the standard MAPMAN download, but users can also add their own. To facilitate functional interpretation of the results presented here, which focus specifically on this set of mitochondrial-targeted genes, a new MAPMAN annotation was developed to aid visualization of changes in expression of components of the various pathways of plant mitochondria (Figure <figr fid="F4">4</figr>). The mapping includes the classical and alternative mitochondrial electron transport chains in some detail, as well as components of the mitochondrial import machinery, substrate transporters and TCA cycle enzymes, and can be used to visualize data from any source in which genes are labelled with AGI locus identifiers. The necessary files are available as part of the <it>ModuleFinder </it>package. The annotated image highlighted that the most highly up-regulated group contained two genes that together form an alternative respiratory pathway: alternative oxidase 1a (<it>Aox1a</it>) and an external class alternative NADH dehydrogenase (<it>NDB2</it>). The next most up-regulated group contained several mitochondrial substrate carriers and genes involved in metabolism.
The annotated image also suggested some down-regulation of genes involved in import of proteins and substrates into the mitochondria, as well as functions associated with expression of the mitochondrial genome (DNA/RNA processing, transcription and protein synthesis).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Visualization of the expression of Nuclear Genes encoding Mitochondrial Proteins using MAPMAN</p>
               </caption>
               <text>
                  <p><it>Visualization of the expression of Nuclear Genes encoding Mitochondrial Proteins using MAPMAN</it>. Pictorial representation, arranged by mitochondrial function, of the changes in gene expression identified by <it>ModuleFinder </it>in Figure 3B. The Translocase of the Outer Membrane (TOM) complex, together with the Translocases of the Inner Membrane 17:23 and 22 (TIM17:23 and TIM22), is responsible for the import of all mitochondrial proteins synthesised in the cytosol [60]. The substrate carriers refer to the family of mitochondrial carrier proteins characterised by six transmembrane regions and responsible for the import and export of various metabolites into and out of mitochondria [61]. The four multi-subunit complexes of the mitochondrial electron transport chain and the ATP synthase complex are labelled I to V. The alternative electron transport chain components, alternative NAD(P)H dehydrogenases (NDH) and alternative oxidase (Aox), are shown. The TCA cycle and a range of other functions of mitochondria are listed [62]. The boxes represent the average gene expression for the 47 genes in Figure 3B, grouped according to the functional annotation of these genes in MAPMAN.</p>
               </text>
               <graphic file="1746-4811-2-8-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Building a framework for understanding the biological implications of the gene regulation observed</p>
            </st>
            <p>Combining the MAPMAN overview with a more detailed analysis using the wider literature provided a deeper view of the biological response to rotenone and salicylic acid, showing that this process aids a biologist's interpretation of the dataset. Rotenone is an inhibitor of complex I function, and thus prevents matrix-located NADH from the TCA cycle from entering the classical respiratory chain. Salicylic acid can have a similar effect, as it appears that, along with its defence signalling functions, this compound can inhibit the respiratory chain in plants <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. This effect appears to be through inhibition of the dehydrogenases of the mitochondrial electron transport chain <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Induction of Aox and NADH dehydrogenase is the clearest direct response to this targeted inhibition of mitochondrial function evident from both types of cluster analysis (Figure <figr fid="F3">3</figr>). Using the classical cluster analysis it appeared that the up-regulation of gene expression in response to respiratory poisons was split across clusters 2, 4, 5, 15 and 16, and down-regulation across clusters 1, 3, 10 and 14 (Figure <figr fid="F3">3A</figr>, Supplementary Figure <figr fid="F1">1A</figr>). Many of the genes in clusters 9 and 15 are involved in protein synthesis or mitochondrial biogenesis (Supplementary Figure <figr fid="F1">1B</figr>). We have previously reported that changes in protein import into mitochondria and a general up-regulation of genes encoding components involved in mitochondrial biogenesis occur as a result of chemical and environmental stresses <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>.</p>
            <p>Using <it>ModuleFinder </it>a larger picture of the effects of these chemical stresses on the expression of mitochondrial components becomes evident. In the defined subset of co-expressed genes the induction of the alternative transport chain components is coupled to the induction of transcripts encoding eight different substrate dehydrogenases, providing new avenues for NADH generation, or in the case of the electron transfer flavoprotein (At1g50940), provision of electrons to ubiquinone. Significantly, the new carbon substrates for these NADH generating pathways, while including the organic acids of the TCA cycle, are likely to be generated by catabolism of amino acids. Enzymes involved in valine, isoleucine, cysteine, tyrosine, alanine and glutamate catabolism are induced. Concomitant with this change in substrate for energy generation is the up-regulation of transcripts for 4 mitochondrial carrier proteins, most of unknown function. Down-regulation is observed for components of the classical electron transport chain complexes I and III, a separate set of five mitochondrial substrate carriers (most of unknown function) and lipid biosynthesis pathways for phosphatidylglycerol and phosphatidylethanolamine. Interestingly, both genes for NAD-malic enzyme (At4g00570, At2g13560) are down-regulated. This protein normally bridges the TCA cycle to allow the anaplerotic removal of organic acids for functions elsewhere in the cell. Together, the insights from this analysis suggest that these simple chemical inhibitors initiate the signals for a complicated re-organisation of mitochondrial function within the plant cell that can now be investigated independently.</p>
         </sec>
         <sec>
            <st>
               <p>Searching for common regulatory elements in the promoters of co-expressed genes</p>
            </st>
            <p>Genes whose transcription is co-ordinately regulated may exhibit co-ordinated expression patterns. Thus co-expression of a group of genes may be indicative of co-regulation at the transcriptional level <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. To determine whether this is the case for a given cluster of co-expressed genes, such as those shown above, the promoter regions of the genes need to be analyzed. Transcription factors (TFs) bind to specific DNA sequences, which are usually only 6 to 10 base pairs long <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. These short sequences are often referred to as promoter motifs or sequence elements. Transcriptional regulation in eukaryotes most often occurs through the combinatorial action of multiple TFs <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. For example, the induction and repression of <it>Arabidopsis </it>genes in response to red and blue light or abscisic acid (ABA) is dependent on combinations of multiple light-responsive or ABA-responsive promoter elements <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>. It is therefore expected that the promoter regions of co-expressed genes may share numerous TF binding sites, including some that are also present in the promoter regions of genes whose expression patterns are quite different. A limitation of this type of approach is that genes may be regulated by the same transcription factor(s) but display different patterns of transcript abundance, because the post-transcriptional processes that affect their transcript stability may differ.</p>
         </sec>
         <sec>
            <st>
               <p>Aims of promoter analysis</p>
            </st>
            <p>Modules of co-expressed genes identified using <it>ModuleFinder </it>(Figure <figr fid="F3">3B</figr>), or groups of genes identified as co-expressed by other methods, provide an opportunity to discover potential regulatory sequence elements that may be responsible for the observed co-expression. The aims of such an analysis could be:</p>
            <p>(a) to identify promoter sequence elements (possible TF binding sites) that are common to genes within a module,</p>
            <p>(b) to identify promoter sequence elements that are common to up-regulated genes or down-regulated genes but not both,</p>
            <p>(c) to identify combinations of promoter sequence elements that are common within a module but not shared by other modules, and</p>
            <p>(d) to use the identified promoter motifs to construct testable hypothetical models of gene regulation that explain observed expression patterns in terms of patterns of regulatory elements.</p>
            <p>Various motif recognition tools are available which can identify promoter sequence elements that are common among a group of genes, many of them available as web-based programs <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. However this becomes difficult when there are large numbers of large groups to be analyzed, as the processing times for these programs generally increase exponentially with the number of sequences taken as input data. Assuming such programs could be employed, it would be possible to build up a model of the regulatory network responsible for observed patterns of gene expression by applying these tools repeatedly to gene clusters defined by cluster analysis or gene modules defined by <it>ModuleFinder </it>analysis. Unfortunately such a process would be time consuming and error-prone. The identification of motifs conserved in multiple sequences is a complicated computing task and can consume significant processing time. To achieve the aims outlined above, this task must be repeated for each module and subset of modules and each potential motif would then have to be searched against all the other promoter sequences. Keeping track of module memberships and relationships, promoter sequences and motifs is a complicated task in itself. If this involves using the current web-based tools it requires considerable uploading, copying and pasting of gene lists and sequences, which can also introduce errors. A more attractive alternative is to try to identify sequence elements whose presence in gene promoter regions can be correlated with observed gene expression levels <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. This approach was implemented using clustering-based methods in a novel tool called <it>CoReg </it>(<ul>Co</ul>-<ul>R</ul>egulation of Co-<ul>E</ul>xpressed <ul>G</ul>enes) to undertake promoter analysis by deducing models of gene co-regulation to explain observed patterns of gene co-expression.</p>
         </sec>
         <sec>
            <st>
               <p>The CoReg algorithm</p>
            </st>
            <p><it>CoReg </it>aims to identify regulatory elements in the promoter regions of a set of co-expressed genes, which explain the observed expression patterns of those genes. It is based on the assumption that there is a relationship between the degree of similarity in gene expression and the degree of similarity in the combination of transcription factors binding within gene promoters. <it>CoReg </it>takes as its starting point a hierarchical clustering of a set of genes according to their expression in a set of experiments, for example the output of <it>ModuleFinder</it>. The user is then asked to break the hierarchical tree down into discrete groups of genes (Figure <figr fid="F5">5A</figr>). The assignment of genes into discrete groups can be recorded in a text file, which can be loaded into MAPMAN to aid interpretation of the functional significance of these groups (as indicated above). The <it>CoReg </it>algorithm then navigates down the tree, stopping at the first point at which the tree splits into two branches, and searches for sequence elements whose frequency of occurrence in promoter sequences varies greatly between the two groups of genes defined by the branches. For example, depending on the parameters set it will identify any sequence elements that are present in the promoters of all the genes in one group but none in the other group, or in promoters of >80% of the genes in one group but &lt;20% of the second group. The two branches resulting from the first split are then each broken down into two groups and sequence elements identified in each, then the process is repeated until the specified groups are reached. This process is illustrated in Figure <figr fid="F5">5B</figr>. <it>CoReg </it>can also search for sequence elements that are 'characteristic' of each group, in that their frequency is particularly high or low in that group. A separate frequency tolerance value may be set for this purpose. 
The user provides CoReg with a list of sequence elements to search for. These may be known elements from databases such as PlantCARE <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, PLACE <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, AGRIS <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> or AthaMap <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, or a list of all possible combinations of nucleotides ranging from 3 to X nucleotides in length, where X would be an upper limit to the size of a transcription factor binding site. Degenerate binding sites can also be included where N can be any nucleotide. All these potential motifs can be included in a single file and the user can select the elements that match the expression profile (Figures <figr fid="F5">5</figr> and <figr fid="F6">6</figr>). The example provided contains a built-in list of all hexamers, that is, all possible sequences of the bases A, C, G, T of length six.</p>
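            <p>As an illustration of the built-in hexamer list, the following is a minimal Python sketch (not the authors' R code) that enumerates every sequence of a given length over the bases A, C, G and T. The function name all_kmers is hypothetical and chosen only for this example.</p>

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Return every sequence of length k over the alphabet, in lexicographic order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


hexamers = all_kmers(6)
print(len(hexamers))          # 4096, i.e. 4**6 candidate elements
print("GATGAC" in hexamers)   # True
```

            <p>Lists for other lengths (for example 3 to X nucleotides) could be built by calling the same function for each length and concatenating the results.</p>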
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>An overview of the operation of CoReg</p>
               </caption>
               <text>
                  <p><it>An overview of the operation of CoReg</it>. A) The expression data output from <it>ModuleFinder </it>is taken as input and the user defines the number of groups for analysis by <it>CoReg</it>. Files are also saved for visualization of expression data by MAPMAN. B) Sequence elements are identified that are unevenly distributed between the promoters of groups defined in A. The frequencies of these elements in each group are recorded. C) Various combinations of elements can be selected and saved.</p>
               </text>
               <graphic file="1746-4811-2-8-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>CoReg analysis of the salicylic acid/rotenone module group from Nuclear Genes encoding Mitochondrial Proteins</p>
               </caption>
               <text>
                  <p><it>CoReg analysis of the salicylic acid/rotenone module group from Nuclear Genes encoding Mitochondrial Proteins</it>. The eight modules produced by <it>ModuleFinder </it>(Figure 3) were analyzed by <it>CoReg</it>. A) From the variety of elements detected in the upstream regions, five were selected which produced a tree structure that closely resembled the tree structure produced from the expression data. B) Hypothetical models of elements governing gene expression are shown based on these elements, which can be tested by experimental analysis. The sequences previously identified as regulatory elements in plants are indicated.</p>
               </text>
               <graphic file="1746-4811-2-8-6"/>
            </fig>
            <p>The frequency of each of the identified sequence elements in each of the gene groups is then calculated, and displayed as a greyscale heatmap (dubbed frequency map) in which black corresponds to a frequency of 80&#8211;100%, shades of grey intermediate values and white 0&#8211;20% (Figure <figr fid="F5">5B&#8211;C</figr>). The gene groups are then clustered according to the frequencies of the identified elements in the promoter regions of their member genes (Figure <figr fid="F5">5C</figr>). At this point, the algorithm has done the bulk of its work, and it is up to the user to drive the selection of a final subset of the identified sequence elements. The user can choose to try random subsets of sequence elements, chosen by CoReg using random sampling methods, or can select their own subsets to try. For each subset of sequence elements, the image window is updated to display a frequency map for the subset of elements, and a hierarchical tree showing the gene groups clustered according to these frequencies. The aim here is to try to find a subset of sequence elements such that, when the gene groups are clustered according to the frequencies of the elements in the promoter regions of their member genes, the resulting tree has the same structure as the expression-based hierarchical clustering tree. It can then be proposed that the selected sequence elements capture the structure of the observed gene expression patterns, and it can be hypothesised that the sequences correspond to regulatory elements that are responsible for these patterns of gene expression. Experiments may then be designed to test these hypotheses in the laboratory.</p>
            <p>While the criterion of tree matching provides a good visual cue to spot relationships between gene expression and the occurrence of sequence elements, it is up to the user to decide when they have found a set of sequence elements that might explain the observed expression patterns. The frequency maps themselves provide visual cues, helping the user to spot other patterns that may be useful. Thus, rather than providing the user with a definitive list of promoter elements that might be regulatory, <it>CoReg </it>is a tool for the user-driven exploration of patterns relating gene co-expression and co-regulation. <it>CoReg </it>scans for the specific elements present and thus will not identify degenerate elements.</p>
         </sec>
         <sec>
            <st>
               <p>Using CoReg to identify putative sequence elements in subsets of co-expressed nuclear genes encoding mitochondrial proteins</p>
            </st>
            <p>We have used <it>CoReg </it>to analyze the gene expression modules identified by <it>ModuleFinder </it>analysis as described in the example above (Figure <figr fid="F3">3B</figr>). To do this, the file containing expression data for the 51 genes in the module, created during the <it>ModuleFinder </it>run, was loaded into CoReg along with the promoter sequences for these genes. The built-in list of hexamers was taken as the list of sequence elements for the search. The hierarchical clustering tree was broken down into eight groups &#8211; four up-regulated (Group 1 to 4) and four down-regulated (Group 5 to 8) in response to the various treatments. The resulting tree is shown in Figure <figr fid="F6">6A</figr>. The maximum frequency tolerance was set to 0.35 and the characteristic frequency tolerance to 0.1, meaning that at each split in the tree, any sequence element present in promoters of >65% of the genes in one group but &lt;35% of the other group would be identified as interesting, as would any sequence elements with a frequency of >90% in one group but &lt;10% in all other groups. A subset of 6 of these elements was identified which resulted in a clustering of the gene groups that was quite similar to the expression-based clustering (Figure <figr fid="F6">6A</figr>). This suggests that although the element-based tree did not precisely match the expression-based tree, the uniqueness of each group's expression pattern is reflected in the uniqueness of its promoter composition, relative to the other groups. Therefore the high frequency of the elements TTCTGC and ATGTAC correlates with the down-regulation of modules SR 5 to 8, while the high frequency of AAAAGC, TTCCAG and AACTAT correlates with the up-regulation of modules SR 1 to 4. GATGAC is present in all except the most highly down-regulated module SR5.</p>
            <p>These correlation patterns can then be used to model gene regulatory networks that can be prioritised for experimental testing (Figure <figr fid="F6">6B</figr>). Of the six elements chosen to define the expression patterns obtained from the microarray analysis two have been previously identified to be involved in regulation of gene expression. The motif GATGAC, identified in CoReg analysis as a regulatory element present in all except the most highly downregulated module SR5, is part of two regulatory elements documented in the PlantCARE database: the As-1-box of tobacco (PlantCARE ID: NT~as-1-box) and OCS-element of <it>Arabidopsis </it>(PlantCARE ID: AT~ocs-element). These were both identified as being involved in the induction of gene expression in response to salicylic acid, auxin and oxidative stress <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp>. The alternative oxidase gene (<it>Aox1a</it>) is a member of SR4, contains this GATGAC element and transcript abundance of <it>Aox </it>is known to be induced by salicylic acid in several species <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B55">55</abbr><abbr bid="B56">56</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Using a large number of plant microarray analyses to help pinpoint the mechanisms of gene regulation is limited by the range of tools currently available. We have developed <it>ModuleFinder </it>to identify sets of genes and treatments that in our hands contain more biologically related functions for analysis of the mechanisms behind co-expression in non-linear-related sets. We then developed <it>CoReg </it>to link the clustering tree of expression-based relationships in these gene sets with frequency tables of promoter elements. These sets of promoter elements represent putative CAREs for sets of genes, and can then be tested experimentally. We consider that these tools, both built on an open source software product, provide a valuable alternative to those widely available for the prioritisation of promoter elements for experimental analysis.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Data sources and processing</p>
            </st>
            <p>The changes in gene expression in response to the addition of various compounds to <it>Arabidopsis </it>suspension cells were measured as outlined previously <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Data for the addition of chitin to 50 mg/mL (Sigma, Sydney) and flagellin22 peptide to 1 &#956;M (Auspep, Parkville, Victoria) are included here, and arrays were carried out as described in Clifton et al. 2005 <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Average gene expression levels were calculated across replicate chips; in each case, a minimum of two replicates was available. For each experimental variable or time point, the log ratio of expression under experimental conditions to appropriate control conditions was determined for each gene. These log ratios formed the input for <it>ModuleFinder </it>and <it>CoReg </it>analysis. Only a subset of the >22,000 genes on the Affymetrix gene chips was analyzed in the examples presented here. This gene subset comprised 374 genes, derived from a set of proteins identified in isolated <it>Arabidopsis </it>mitochondria by liquid chromatography-tandem mass spectrometry <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. For <it>CoReg </it>analysis, promoter sequences were taken as the 3000 base-pair sequences upstream of each gene, retrieved from TAIR.</p>
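            <p>The log ratio calculation described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: it assumes linear-scale expression values and a base-2 logarithm, neither of which is stated in the text, and the function names are hypothetical.</p>

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def log_ratio(treated_replicates, control_replicates):
    """log2 of (mean treated expression / mean control expression).

    Assumption: inputs are linear-scale intensities; base 2 is an assumption too.
    """
    return math.log2(mean(treated_replicates) / mean(control_replicates))


# Invented replicate values for a single gene, for illustration only.
print(log_ratio([400.0, 400.0], [100.0, 100.0]))   # 2.0 (four-fold induction)
print(log_ratio([50.0, 50.0], [100.0, 100.0]))     # -1.0 (two-fold repression)
```

            <p>If the expression measures are already on a log scale, the same quantity would instead be the difference between the treated and control means.</p>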
         </sec>
         <sec>
            <st>
               <p>Programming in R</p>
            </st>
            <p><it>ModuleFinder </it>and <it>CoReg </it>were developed in <it>R</it>, a computer language and environment for statistical computing <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. An advantage of <it>R </it>is that it is available as free software and runs on a wide variety of UNIX platforms and similar systems, Windows and MacOS. Most importantly <it>R </it>provides a variety of built-in statistical and graphical techniques, including a variety of cluster analysis methods and facilities for displaying cluster trees and heat maps, while also allowing users to extend <it>R</it>'s capabilities by defining their own functions.</p>
         </sec>
         <sec>
            <st>
               <p>Statistical methods used in ModuleFinder</p>
            </st>
            <p><it>ModuleFinder </it>first filters out genes whose expression did not change significantly in every experiment in the initial subset. This is done by considering a matrix of p-values provided by the user, which reflects the results of a test for differential expression (including correction for multiple testing if appropriate), and filtering out all genes whose p-values are above a user-defined cut-off in any of the experiments in the subset. The default p-value cut-off is 0.05, but can be set by the user to any value between zero and one. In the examples presented here, the p-values used were derived from two-sided t-tests comparing the robust multiarray analysis-processed expression measures from replicates of control and experimental conditions <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. In each case, a minimum of two replicates was available.</p>
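            <p>The filtering rule can be sketched as below. This is a minimal Python illustration under the stated rule (a gene is retained only if none of its p-values exceeds the cut-off); the function name, gene labels and p-values are hypothetical and are not taken from the study.</p>

```python
def filter_genes(p_values, cutoff=0.05):
    """Keep genes whose p-value is at or below the cutoff in every experiment.

    p_values maps a gene label to its list of p-values, one per experiment
    in the initial subset.
    """
    return [gene for gene, ps in p_values.items() if all(cutoff >= p for p in ps)]


# Invented p-values for three hypothetical genes across two experiments.
example = {
    "geneA": [0.01, 0.03],   # changes in both experiments: kept
    "geneB": [0.01, 0.20],   # fails in the second experiment: removed
    "geneC": [0.50, 0.60],   # fails in both: removed
}
print(filter_genes(example))   # ['geneA']
```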
            <p><it>ModuleFinder </it>uses <it>R</it>'s hclust function for hierarchical clustering of genes based on the expression values provided in the input expression data file. The default clustering method uses a Euclidean distance measure and the Ward linkage method <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>, but can be set by the user to any of the hierarchical clustering methods available in <it>R</it>. (These include Minkowski, Canberra, maximum, binary and Manhattan distances, and the complete, single, average, centroid and McQuitty <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> methods of linkage.)</p>
            <p>Having defined modules containing genes that are co-ordinately expressed in response to a subset of experiments, <it>ModuleFinder </it>searches for further experiments in which these modules also display co-ordinated expression responses. For each experiment not already in the module, the variance of the gene expression measures within each module is calculated using the <it>var </it>function in R (var(x<sub>1</sub>,.., x<sub>n</sub>) = sum((x<sub>i</sub> - mean(x))<sup>2</sup>)/(n-1)). A small within-module variance can be interpreted as a high level of co-expression among the genes in the module. The sum of these within-module variance measures is calculated as an overall measure of how well gene expression in the experiment fits the set of modules. A measure of between-module variance is also calculated for each experiment (between-module var = sum((mean<sub>module i </sub>- mean<sub>all modules</sub>)<sup>2</sup>)). Large values here indicate that the modules had distinct expression patterns in the experiment. The experiment that most closely 'fits' the module structure will display co-ordinated gene expression within modules and, ideally, distinct patterns of gene expression between modules. That is, it will have small within-module variances and a large between-module variance. The algorithm thus looks for the experiment with the highest ratio of between-module variance to sum of within-module variances.</p>
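            <p>The experiment-selection criterion can be sketched as follows. This is a minimal Python illustration rather than the authors' R implementation. It assumes that the mean over all modules is the mean of the module means, which the text does not specify, and that every module contains at least two genes. The function name fit_score and the expression values are hypothetical.</p>

```python
from statistics import mean, variance


def fit_score(modules):
    """Ratio of between-module variance to the sum of within-module variances.

    modules is a list of lists: the log ratios of the genes in each module
    for a single candidate experiment. variance() is the sample variance
    (divisor n-1), matching the formula quoted for R's var.
    """
    within = sum(variance(values) for values in modules)
    module_means = [mean(values) for values in modules]
    grand_mean = mean(module_means)   # assumption: mean of the module means
    between = sum((m - grand_mean) ** 2 for m in module_means)
    return between / within


# Invented log ratios for two modules in two candidate experiments.
experiments = {
    "exp1": [[1.0, 1.2], [-1.0, -0.8]],   # tight modules, well separated
    "exp2": [[1.0, -1.0], [0.5, -0.5]],   # scattered modules, same means
}
best = max(experiments, key=lambda name: fit_score(experiments[name]))
print(best)   # exp1
```

            <p>A within-module variance sum of zero would cause a division by zero in this sketch; how such a case is handled in <it>ModuleFinder </it>is not described in the text.</p>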
         </sec>
         <sec>
            <st>
               <p>CoReg algorithm</p>
            </st>
            <p>The primary data set used by <it>CoReg </it>is a table representing the incidence of each of a list of potential sequence elements (e.g. hexamers, known motifs) in a list of gene promoters. This is a table of sequence elements on the horizontal axis, gene names on the vertical axis and values of TRUE or FALSE indicating whether or not the element was found in a search of the gene's promoter sequence. String matching is used to search for sequence elements in promoter sequences. This table can be prepared independently, or <it>CoReg </it>can build one from a list of sequence elements and a file containing gene promoter sequences in FASTA format input by the user.</p>
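            <p>Building the incidence table by string matching can be sketched as below. This is a minimal Python illustration, not the authors' code. It searches the given strand only and ignores case, both of which are assumptions, and it does not parse FASTA files. The function name, gene labels and sequences are hypothetical.</p>

```python
def incidence_table(promoters, elements):
    """Return gene -> element -> True/False for exact substring matches.

    promoters maps a gene label to its promoter sequence; elements is the
    list of candidate sequence elements. Only the given strand is searched.
    """
    return {
        gene: {element: element in sequence.upper() for element in elements}
        for gene, sequence in promoters.items()
    }


# Invented short sequences standing in for upstream regions.
promoters = {"gene1": "ttGATGACaa", "gene2": "AAAAGCTT"}
table = incidence_table(promoters, ["GATGAC", "AAAAGC"])
print(table["gene1"])   # {'GATGAC': True, 'AAAAGC': False}
print(table["gene2"])   # {'GATGAC': False, 'AAAAGC': True}
```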
            <p>The user is then asked to input a table of expression data. The genes in this table must appear in the incidence table, and must be labelled in the same way (e.g. AGI locus identifier). <it>CoReg</it>, like <it>ModuleFinder</it>, uses <it>R</it>'s hclust function for hierarchical clustering of genes based on the expression values in this table. Distance and linkage methods can be set by the user to any of those available in <it>R </it>(see above). The user is also asked to indicate branches defined by the tree that they consider to be gene expression clusters.</p>
            <p>The resulting hierarchical clustering tree is split into two branches, separating the genes into two discrete groups (say A and B). The incidence table is then used to determine, for each sequence element in the table, the proportion of genes in each group that contain that element. This is dubbed the 'frequency' of the element in those two groups (say f<sub>i, A </sub>and f<sub>i, B</sub>, where i denotes sequence element i). These frequencies are then compared to a user-defined tolerance level, <it>f</it>. Any sequence element that occurs with frequencies f<sub>i, A </sub>&lt;<it>f </it>and f<sub>i, B </sub>> (1-<it>f</it>), or vice versa, is recorded in a list of sequence elements that may be able to explain the difference in expression patterns of the two groups. The same process (splitting into two branches and searching for elements whose frequencies are different in the two groups defined by the split) is repeated for each of the two branches in an iterative procedure, stopping when the final user-defined clusters are reached.</p>
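            <p>The comparison made at a single split can be sketched as follows. This is a minimal Python illustration, not the authors' code. It accepts an element when its frequency is below the tolerance in either group and above one minus the tolerance in the other; treating the two groups symmetrically is an assumption based on the description of the algorithm. All names and data are hypothetical.</p>

```python
def frequency(table, genes, element):
    """Proportion of the given genes whose promoter contains the element."""
    return sum(table[gene][element] for gene in genes) / len(genes)


def discriminating_elements(table, group_a, group_b, elements, f):
    """Elements that are rare (below f) in one group and common (above 1-f) in the other."""
    hits = []
    for element in elements:
        freq_a = frequency(table, group_a, element)
        freq_b = frequency(table, group_b, element)
        low, high = min(freq_a, freq_b), max(freq_a, freq_b)
        if f > low and high > 1 - f:
            hits.append(element)
    return hits


# Invented incidence table: X is confined to group A, Y is evenly spread.
table = {
    "g1": {"X": True, "Y": True},
    "g2": {"X": True, "Y": False},
    "g3": {"X": False, "Y": True},
    "g4": {"X": False, "Y": False},
}
print(discriminating_elements(table, ["g1", "g2"], ["g3", "g4"], ["X", "Y"], 0.35))   # ['X']
```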
            <p>In addition, the frequency of each sequence element in each of the user-defined clusters is compared to a second user-defined tolerance level <it>g</it>. Any sequence element whose frequency is below <it>g </it>or above <it>1-g </it>in exactly one of these clusters is added to the list of interesting sequence elements.</p>
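            <p>The 'characteristic element' rule can be sketched as below. This is a minimal Python illustration that follows the rule as worded here (a frequency below g or above 1-g in exactly one cluster); it is not the authors' code, and the function name and frequencies are hypothetical.</p>

```python
def characteristic_elements(frequencies, g):
    """Elements whose frequency is extreme (below g or above 1-g) in exactly one cluster.

    frequencies maps an element to its list of frequencies, one per cluster.
    """
    hits = []
    for element, values in frequencies.items():
        extreme = [v for v in values if g > v or v > 1 - g]
        if len(extreme) == 1:
            hits.append(element)
    return hits


# Invented frequencies for three elements across three clusters.
frequencies = {
    "E1": [0.95, 0.50, 0.40],   # extreme in one cluster only: reported
    "E2": [0.95, 0.05, 0.50],   # extreme in two clusters: not reported
    "E3": [0.50, 0.50, 0.50],   # never extreme: not reported
}
print(characteristic_elements(frequencies, 0.1))   # ['E1']
```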
            <p>The gene expression clusters defined by the user are then themselves clustered, according to the frequencies of all the recorded sequence elements. The same method chosen for expression-based hierarchical clustering is used at this step. The user is then given the opportunity to select subsets of the recorded sequence elements and cluster according to those, the aim being to isolate a subset of sequence elements leading to a hierarchical structure similar to that defined by the expression-based hierarchical clustering tree.</p>
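            <p>In <it>CoReg </it>the comparison of the two trees is made visually by the user. Purely as an illustration of what 'similar hierarchical structure' can mean, the following Python sketch compares two trees by the sets of groups found under each internal node. The nested-tuple representation, the function names and the example trees are all hypothetical and are not part of <it>CoReg</it>.</p>

```python
def clades(tree):
    """Return (leaves, clade_set) for a tree written as nested tuples of group names.

    clade_set holds one frozenset of leaf names per internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves = leaves | child_leaves
        found = found | child_clades
    found.add(leaves)
    return leaves, found


def same_topology(tree_a, tree_b):
    """True when both trees group the same leaves under their internal nodes."""
    return clades(tree_a)[1] == clades(tree_b)[1]


expression_tree = (("G1", "G2"), ("G3", "G4"))
element_tree = (("G2", "G1"), ("G4", "G3"))   # same grouping, branches drawn in another order
other_tree = (("G1", "G3"), ("G2", "G4"))     # different grouping

print(same_topology(expression_tree, element_tree))   # True
print(same_topology(expression_tree, other_tree))     # False
```

            <p>This check ignores branch lengths and the left-to-right order of branches, so it captures only the branching pattern.</p>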
            <p>ModuleFinder and CoReg are available for downloading from <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. Alternatively a package will be emailed on request containing program files, instruction files and example files. We request that users cite this manuscript if using these programs.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>Aox alternative oxidase</p>
         <p>CARE(s) <it>cis</it>-acting regulatory element(s)</p>
         <p>NDH alternative NAD(P)H dehydrogenases</p>
         <p>NGMP nuclear genes encoding mitochondrial proteins</p>
         <p>TOM translocase of the outer mitochondrial membrane</p>
         <p>TIM translocase of the inner mitochondrial membrane</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author(s) declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>KEH was responsible for designing and writing the code and analyzing the data. AHM and JW contributed in designing the project, obtaining the expression data, interpretation of analysis and writing the manuscript.</p>
         <suppl id="S1">
            <title>
               <p>Additional File 1</p>
            </title>
            <text>
               <p><b>Supplementary Figure </b><figr fid="F1">1A</figr> A .eps file with a supplementary figure referred to in text</p>
            </text>
            <file name="1746-4811-2-8-S1.eps">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional File 2</p>
            </title>
            <text>
               <p><b>Supplementary Figure </b><figr fid="F1">1B</figr> A .eps file with a supplementary figure referred to in text</p>
            </text>
            <file name="1746-4811-2-8-S2.eps">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional File 3</p>
            </title>
            <text>
               <p><b>Manual.pdf </b>A manual that describes how to install and use programs and the outputs they produce.</p>
            </text>
            <file name="1746-4811-2-8-S3.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional File 4</p>
            </title>
            <text>
               <p><b>MF and CoReg code </b>A .zip file containing the code</p>
            </text>
            <file name="1746-4811-2-8-S4.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional File 5</p>
            </title>
            <text>
               <p><b>MapMan files.zip </b>Annotation files for visualisation in Mapman</p>
            </text>
            <file name="1746-4811-2-8-S5.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional File 6</p>
            </title>
            <text>
               <p><b>User guide (htm files).zip </b>Instructions for use in htm format</p>
            </text>
            <file name="1746-4811-2-8-S6.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by funding to JW and AHM through the Australian Research Council (ARC) Centres of Excellence Program. AHM is also funded as an ARC Queen Elizabeth II Research Fellow.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Transcriptional Regulation: a Genomic Overview.</p>
            </title>
            <aug>
               <au>
                  <snm>Riechmann</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>The Arabidopsis Book</source>
            <publisher>Rockville, MD , American Society of Plant Biologists</publisher>
            <editor>Somerville CR, Meyerowitz EM</editor>
            <pubdate>2002</pubdate>
            <volume>doi: 10.1199/tab.0085,  http://www.aspb.org/publications/arabidopsis/</volume>
            <fpage>1</fpage>
            <lpage>46</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1199/tab.0085</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Real-time RT-PCR profiling of over 1400 Arabidopsis transcription factors: unprecedented sensitivity reveals novel root- and shoot-specific genes</p>
            </title>
            <aug>
               <au>
                  <snm>Czechowski</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bari</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Stitt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Scheible</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Udvardi</snm>
                  <fnm>MK</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2004</pubdate>
            <volume>38</volume>
            <issue>2</issue>
            <fpage>366</fpage>
            <lpage>379</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-313X.2004.02051.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15078338</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Plant metabolic diversity: a regulatory perspective</p>
            </title>
            <aug>
               <au>
                  <snm>Grotewold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Trends Plant Sci</source>
            <pubdate>2005</pubdate>
            <volume>10</volume>
            <issue>2</issue>
            <fpage>57</fpage>
            <lpage>62</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tplants.2004.12.009</pubid>
                  <pubid idtype="pmpid" link="fulltext">15708342</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The Arabidopsis basic/helix-loop-helix transcription factor family</p>
            </title>
            <aug>
               <au>
                  <snm>Toledo-Ortiz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Huq</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>PH</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2003</pubdate>
            <volume>15</volume>
            <issue>8</issue>
            <fpage>1749</fpage>
            <lpage>1770</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">167167</pubid>
                  <pubid idtype="pmpid" link="fulltext">12897250</pubid>
                  <pubid idtype="doi">10.1105/tpc.013839</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>affy&#8211;Analysis of Affymetrix GeneChip data at the probe level</p>
            </title>
            <aug>
               <au>
                  <snm>Gautier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cope</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>307</fpage>
            <lpage>315</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg405</pubid>
                  <pubid idtype="pmpid" link="fulltext">14960456</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Riechmann</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Heard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Reuber</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Keddie</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Adam</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pineda</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ratcliffe</snm>
                  <fnm>OJ</fnm>
               </au>
               <au>
                  <snm>Samaha</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Creelman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pilgrim</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Broun</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>JZ</fnm>
               </au>
               <au>
                  <snm>Ghandehari</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>BK</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <issue>5499</issue>
            <fpage>2105</fpage>
            <lpage>2110</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5499.2105</pubid>
                  <pubid idtype="pmpid" link="fulltext">11118137</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Transcriptional regulation in plants: the importance of combinatorial control</p>
            </title>
            <aug>
               <au>
                  <snm>Singh</snm>
                  <fnm>KB</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>1998</pubdate>
            <volume>118</volume>
            <issue>4</issue>
            <fpage>1111</fpage>
            <lpage>1120</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1104/pp.118.4.1111</pubid>
                  <pubid idtype="pmpid" link="fulltext">9847085</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Analysis of the genome sequence of the flowering plant Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Initiative</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>408</volume>
            <issue>6814</issue>
            <fpage>796</fpage>
            <lpage>815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35048692</pubid>
                  <pubid idtype="pmpid" link="fulltext">11130711</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox</p>
            </title>
            <aug>
               <au>
                  <snm>Zimmermann</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hirsch-Hoffmann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hennig</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Gruissem</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2004</pubdate>
            <volume>136</volume>
            <issue>1</issue>
            <fpage>2621</fpage>
            <lpage>2632</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">523327</pubid>
                  <pubid idtype="pmpid" link="fulltext">15375207</pubid>
                  <pubid idtype="doi">10.1104/pp.104.046367</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Transcriptional profiling of Arabidopsis tissues reveals the unique characteristics of the pollen transcriptome</p>
            </title>
            <aug>
               <au>
                  <snm>Becker</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Boavida</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Carneiro</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Haury</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Feijo</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2003</pubdate>
            <volume>133</volume>
            <issue>2</issue>
            <fpage>713</fpage>
            <lpage>725</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">219046</pubid>
                  <pubid idtype="pmpid" link="fulltext">14500793</pubid>
                  <pubid idtype="doi">10.1104/pp.103.028241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Arabidopsis HAF2 gene encoding TATA-binding protein (TBP)-associated factor TAF1, is required to integrate light signals to regulate gene expression and growth</p>
            </title>
            <aug>
               <au>
                  <snm>Bertrand</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Benhamed</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>YF</fnm>
               </au>
               <au>
                  <snm>Ayadi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lemonnier</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Renou</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Delarue</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>DX</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>280</volume>
            <issue>2</issue>
            <fpage>1465</fpage>
            <lpage>1473</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M409000200</pubid>
                  <pubid idtype="pmpid" link="fulltext">15525647</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Transcriptome analysis of haploid male gametophyte development in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Honys</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Twell</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>11</issue>
            <fpage>R85</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545776</pubid>
                  <pubid idtype="pmpid" link="fulltext">15535861</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-11-r85</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Arabidopsis whole-transcriptome profiling defines the features of coordinated regulations that occur during secondary growth</p>
            </title>
            <aug>
               <au>
                  <snm>Ko</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Han</snm>
                  <fnm>KH</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>55</volume>
            <issue>3</issue>
            <fpage>433</fpage>
            <lpage>453</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s11103-004-1051-z</pubid>
                  <pubid idtype="pmpid" link="fulltext">15604691</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Arabidopsis transcriptome profiling indicates that multiple regulatory pathways are activated during cold acclimation in addition to the CBF cold response pathway</p>
            </title>
            <aug>
               <au>
                  <snm>Fowler</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thomashow</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2002</pubdate>
            <volume>14</volume>
            <issue>8</issue>
            <fpage>1675</fpage>
            <lpage>1690</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151458</pubid>
                  <pubid idtype="pmpid" link="fulltext">12172015</pubid>
                  <pubid idtype="doi">10.1105/tpc.003483</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Transcriptome changes for Arabidopsis in response to salt, osmotic, and cold stress</p>
            </title>
            <aug>
               <au>
                  <snm>Kreps</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Harper</snm>
                  <fnm>JF</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2002</pubdate>
            <volume>130</volume>
            <issue>4</issue>
            <fpage>2129</fpage>
            <lpage>2141</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166725</pubid>
                  <pubid idtype="pmpid" link="fulltext">12481097</pubid>
                  <pubid idtype="doi">10.1104/pp.008532</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Characterizing the stress/defense transcriptome of Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Mahalingam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gomez-Buitrago</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eckardt</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Shah</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Guevara-Garcia</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Raina</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fedoroff</snm>
                  <fnm>NV</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>3</issue>
            <fpage>R20</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">153460</pubid>
                  <pubid idtype="pmpid" link="fulltext">12620105</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-3-r20</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Genome-wide reprogramming of primary and secondary metabolism, protein synthesis, cellular growth processes, and the regulatory infrastructure of Arabidopsis in response to nitrogen</p>
            </title>
            <aug>
               <au>
                  <snm>Scheible</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Morcuende</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Czechowski</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fritz</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Osuna</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Palacios-Rojas</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schindelasch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Thimm</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Udvardi</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Stitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2004</pubdate>
            <volume>136</volume>
            <issue>1</issue>
            <fpage>2483</fpage>
            <lpage>2499</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">523316</pubid>
                  <pubid idtype="pmpid" link="fulltext">15375205</pubid>
                  <pubid idtype="doi">10.1104/pp.104.047019</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes</p>
            </title>
            <aug>
               <au>
                  <snm>Thimm</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Blasing</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Gibon</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nagel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kruger</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Selbig</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Muller</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Rhee</snm>
                  <fnm>SY</fnm>
               </au>
               <au>
                  <snm>Stitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2004</pubdate>
            <volume>37</volume>
            <issue>6</issue>
            <fpage>914</fpage>
            <lpage>939</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-313X.2004.02016.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">14996223</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Evolutionary biclustering of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Aguilar-Ruiz</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Divina</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Applications of Evolutionary Computing, Proceedings. Lecture Notes in Computer Science</source>
            <pubdate>2005</pubdate>
            <volume>3449</volume>
            <fpage>1</fpage>
            <lpage>10</lpage>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Coupled two-way clustering analysis of breast cancer and colon cancer gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Getz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gal</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kela</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Notterman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>9</issue>
            <fpage>1079</fpage>
            <lpage>1089</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btf876</pubid>
                  <pubid idtype="pmpid" link="fulltext">12801868</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Coupled two-way clustering analysis of gene microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Getz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <issue>22</issue>
            <fpage>12079</fpage>
            <lpage>12084</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">17297</pubid>
                  <pubid idtype="pmpid" link="fulltext">11035779</pubid>
                  <pubid idtype="doi">10.1073/pnas.210134797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Spectral biclustering of microarray data: coclustering genes and conditions</p>
            </title>
            <aug>
               <au>
                  <snm>Kluger</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Basri</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <issue>4</issue>
            <fpage>703</fpage>
            <lpage>716</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430175</pubid>
                  <pubid idtype="pmpid" link="fulltext">12671006</pubid>
                  <pubid idtype="doi">10.1101/gr.648603</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Interrelated two-way clustering and its application on gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Tang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>International Journal on Artificial Intelligence Tools</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>577</fpage>
            <lpage>597</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1142/S0218213005002272</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Biclustering models for structured microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Turner</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Krzanowski</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Hemingway</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>IEEE/ACM Transactions on Computational Biology and Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>2</volume>
            <fpage>316</fpage>
            <lpage>329</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1109/TCBB.2005.49</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>GEMS: a web server for biclustering analysis of expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Kasif</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Web Server issue</issue>
            <fpage>W596</fpage>
            <lpage>W599</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160230</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980544</pubid>
                  <pubid idtype="doi">10.1093/nar/gki469</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Finding regulatory modules through large-scale gene-expression data analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Kloster</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wingreen</snm>
                  <fnm>NS</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>7</issue>
            <fpage>1172</fpage>
            <lpage>1179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti096</pubid>
                  <pubid idtype="pmpid" link="fulltext">15513996</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Coupled two-way clustering server</p>
            </title>
            <aug>
               <au>
                  <snm>Getz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>9</issue>
            <fpage>1153</fpage>
            <lpage>1154</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg143</pubid>
                  <pubid idtype="pmpid" link="fulltext">12801877</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Extracting conserved gene expression motifs from gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Murali</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Kasif</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2003</pubdate>
            <fpage>77</fpage>
            <lpage>88</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12603019</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>AMPDB: the Arabidopsis Mitochondrial Protein Database</p>
            </title>
            <aug>
               <au>
                  <snm>Heazlewood</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D605</fpage>
            <lpage>D610</lpage>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Experimental analysis of the Arabidopsis mitochondrial proteome highlights signaling and regulatory components, provides assessment of targeting prediction programs, and indicates plant-specific mitochondrial proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Heazlewood</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Tonti-Filippini</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Gout</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2004</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>241</fpage>
            <lpage>256</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">301408</pubid>
                  <pubid idtype="pmpid" link="fulltext">14671022</pubid>
                  <pubid idtype="doi">10.1105/tpc.016055</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <issue>25</issue>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Stress-induced co-expression of alternative respiratory chain components in Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Clifton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lister</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Sappl</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Elhafez</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>58</volume>
            <fpage>193</fpage>
            <lpage>212</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s11103-005-5514-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">16027974</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Capabilities and improvement of linkage analysis as a clustering method</p>
            </title>
            <aug>
               <au>
                  <snm>McQuitty</snm>
                  <fnm>LL</fnm>
               </au>
            </aug>
            <source>Educ Psychol Meas</source>
            <pubdate>1964</pubdate>
            <volume>24</volume>
            <fpage>441</fpage>
            <lpage>456</lpage>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Salicylic acid induces rapid inhibition of mitochondrial electron transport and oxidative phosphorylation in tobacco cells</p>
            </title>
            <aug>
               <au>
                  <snm>Xie</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>1999</pubdate>
            <volume>120</volume>
            <issue>1</issue>
            <fpage>217</fpage>
            <lpage>226</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">59253</pubid>
                  <pubid idtype="pmpid" link="fulltext">10318699</pubid>
                  <pubid idtype="doi">10.1104/pp.120.1.217</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Salicylic acid is an uncoupler and inhibitor of mitochondrial electron transport</p>
            </title>
            <aug>
               <au>
                  <snm>Norman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Howell</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2004</pubdate>
            <volume>134</volume>
            <issue>1</issue>
            <fpage>492</fpage>
            <lpage>501</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">316328</pubid>
                  <pubid idtype="pmpid" link="fulltext">14684840</pubid>
                  <pubid idtype="doi">10.1104/pp.103.031039</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>A transcriptomic and proteomic characterization of the Arabidopsis mitochondrial protein import apparatus and its response to mitochondrial dysfunction</p>
            </title>
            <aug>
               <au>
                  <snm>Lister</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chew</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Heazlewood</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Clifton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2004</pubdate>
            <volume>134</volume>
            <issue>2</issue>
            <fpage>777</fpage>
            <lpage>789</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">344553</pubid>
                  <pubid idtype="pmpid" link="fulltext">14730085</pubid>
                  <pubid idtype="doi">10.1104/pp.103.033910</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Environmental stresses inhibit and stimulate different protein import pathways in plant mitochondria</p>
            </title>
            <aug>
               <au>
                  <snm>Taylor</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Rudhe</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hulett</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Lithgow</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Glaser</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2003</pubdate>
            <volume>547</volume>
            <issue>1-3</issue>
            <fpage>125</fpage>
            <lpage>130</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0014-5793(03)00691-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">12860399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Orchestrated transcription of key pathways in Arabidopsis by the circadian clock</p>
            </title>
            <aug>
               <au>
                  <snm>Harmer</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Hogenesch</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Straume</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Han</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Kreps</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Kay</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <issue>5499</issue>
            <fpage>2110</fpage>
            <lpage>2113</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5499.2110</pubid>
                  <pubid idtype="pmpid" link="fulltext">11118138</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Identifying regulatory networks by combinatorial analysis of promoter elements</p>
            </title>
            <aug>
               <au>
                  <snm>Pilpel</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sudarsanam</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <issue>2</issue>
            <fpage>153</fpage>
            <lpage>159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng724</pubid>
                  <pubid idtype="pmpid" link="fulltext">11547334</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Combinatorial control of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Remenyi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Scholer</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Wilmanns</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Struct Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>11</volume>
            <issue>9</issue>
            <fpage>812</fpage>
            <lpage>815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nsmb820</pubid>
                  <pubid idtype="pmpid" link="fulltext">15332082</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Regulation and role of the Arabidopsis abscisic acid-insensitive 5 gene in abscisic acid, sugar, and stress response</p>
            </title>
            <aug>
               <au>
                  <snm>Brocard</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Lynch</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Finkelstein</snm>
                  <fnm>RR</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2002</pubdate>
            <volume>129</volume>
            <issue>4</issue>
            <fpage>1533</fpage>
            <lpage>1543</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166741</pubid>
                  <pubid idtype="pmpid" link="fulltext">12177466</pubid>
                  <pubid idtype="doi">10.1104/pp.005793</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Combinatorial interaction of light-responsive elements plays a critical role in determining the response characteristics of light-regulated promoters in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Chattopadhyay</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Puente</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>XW</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>1998</pubdate>
            <volume>15</volume>
            <issue>1</issue>
            <fpage>69</fpage>
            <lpage>77</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.1998.00180.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">9744096</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Assessing computational tools for the discovery of transcription factor binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eskin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Favorov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Makeev</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Noble</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Pavesi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pesole</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Regnier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Simonis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vandenbogaert</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Workman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <issue>1</issue>
            <fpage>137</fpage>
            <lpage>144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1053</pubid>
                  <pubid idtype="pmpid" link="fulltext">15637633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Applied bioinformatics for the identification of regulatory elements</p>
            </title>
            <aug>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>4</issue>
            <fpage>276</fpage>
            <lpage>287</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1315</pubid>
                  <pubid idtype="pmpid" link="fulltext">15131651</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies</p>
            </title>
            <aug>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Andre</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>281</volume>
            <issue>5</issue>
            <fpage>827</fpage>
            <lpage>842</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.1947</pubid>
                  <pubid idtype="pmpid" link="fulltext">9719638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Lescot</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dehais</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Marchal</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Moreau</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Van de Peer</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Rouze</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rombauts</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>325</fpage>
            <lpage>327</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99092</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752327</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.325</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Plant cis-acting regulatory DNA elements (PLACE) database: 1999</p>
            </title>
            <aug>
               <au>
                  <snm>Higo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ugawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Iwamoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Korenaga</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <issue>1</issue>
            <fpage>297</fpage>
            <lpage>300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">148163</pubid>
                  <pubid idtype="pmpid" link="fulltext">9847208</pubid>
                  <pubid idtype="doi">10.1093/nar/27.1.297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>AGRIS: Arabidopsis gene regulatory information server, an information resource of Arabidopsis cis-regulatory elements and transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Davuluri</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Palaniswamy</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Molina</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kurtz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grotewold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>25</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166152</pubid>
                  <pubid idtype="pmpid" link="fulltext">12820902</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-4-25</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Steffens</snm>
                  <fnm>NO</fnm>
               </au>
               <au>
                  <snm>Galuschka</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Schindler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bulow</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Web Server issue</issue>
            <fpage>W397</fpage>
            <lpage>402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160156</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980498</pubid>
                  <pubid idtype="doi">10.1093/nar/gki395</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>The promoter of a H2O2-inducible, Arabidopsis glutathione S-transferase gene contains closely linked OBF- and OBP1-binding sites</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chao</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>KB</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>1996</pubdate>
            <volume>10</volume>
            <issue>6</issue>
            <fpage>955</fpage>
            <lpage>966</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.1996.10060955.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">9011080</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Characterization of salicylic acid-responsive, Arabidopsis Dof domain proteins: overexpression of OBP3 leads to growth defects</p>
            </title>
            <aug>
               <au>
                  <snm>Kang</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>KB</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2000</pubdate>
            <volume>21</volume>
            <issue>4</issue>
            <fpage>329</fpage>
            <lpage>339</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313x.2000.00678.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10758484</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Analysis of the promoter of the auxin-inducible gene, parC, of tobacco</p>
            </title>
            <aug>
               <au>
                  <snm>Sakai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nagata</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Plant Cell Physiol</source>
            <pubdate>1996</pubdate>
            <volume>37</volume>
            <issue>7</issue>
            <fpage>906</fpage>
            <lpage>913</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8979393</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Interactions between distinct types of DNA binding proteins enhance binding to ocs element promoter sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Foley</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Buttner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>KB</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>1995</pubdate>
            <volume>7</volume>
            <issue>12</issue>
            <fpage>2241</fpage>
            <lpage>2252</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">161076</pubid>
                  <pubid idtype="pmpid" link="fulltext">8718629</pubid>
                  <pubid idtype="doi">10.1105/tpc.7.12.2241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Isolation and characterization of two related Arabidopsis ocs-element bZIP binding proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Foley</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>KB</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>1993</pubdate>
            <volume>4</volume>
            <issue>4</issue>
            <fpage>711</fpage>
            <lpage>716</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.1993.04040711.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">8252072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Evidence of mitochondrial involvement in the transduction of signals required for the induction of genes associated with pathogen attack and senescence</p>
            </title>
            <aug>
               <au>
                  <snm>Maxwell</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Nickels</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>McIntosh</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2002</pubdate>
            <volume>29</volume>
            <issue>3</issue>
            <fpage>269</fpage>
            <lpage>279</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.2002.01216.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11844105</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Salicylic acid: a natural inducer of heat production in arum lilies</p>
            </title>
            <aug>
               <au>
                  <snm>Raskin</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ehmann</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Melander</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Meeuse</snm>
                  <fnm>BJD</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1987</pubdate>
            <volume>237</volume>
            <fpage>1601</fpage>
            <lpage>1602</lpage>
         </bibl>
         <bibl id="B57">
            <title>
               <p/>
            </title>
            <aug>
               <au>
                  <cnm>http://www.r-project.org</cnm>
               </au>
            </aug>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Hierarchical grouping to optimize an objective function</p>
            </title>
            <aug>
               <au>
                  <snm>Ward</snm>
                  <fnm>JH Jr</fnm>
               </au>
            </aug>
            <source>J Am Stat Assoc</source>
            <pubdate>1963</pubdate>
            <volume>58</volume>
            <fpage>236</fpage>
            <lpage>244</lpage>
            <xrefbib>
               <pubid idtype="doi">10.2307/2282967</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p/>
            </title>
            <aug>
               <au>
                  <cnm>http://www.plantenergy.uwa.edu.au/webpages/research/downloads.html</cnm>
               </au>
            </aug>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Protein import into mitochondria: origins and functions today</p>
            </title>
            <aug>
               <au>
                  <snm>Lister</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hulett</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Lithgow</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Mol Membr Biol</source>
            <pubdate>2005</pubdate>
            <volume>22</volume>
            <fpage>87</fpage>
            <lpage>100</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16092527</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Genomic and proteomic analysis of mitochondrial carrier proteins in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Heazlewood</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2003</pubdate>
            <volume>131</volume>
            <issue>2</issue>
            <fpage>443</fpage>
            <lpage>453</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166821</pubid>
                  <pubid idtype="pmpid" link="fulltext">12586869</pubid>
                  <pubid idtype="doi">10.1104/pp.009985</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Mitochondrial biogenesis and function in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Millar</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>The Arabidopsis Book</source>
            <publisher>Rockville, MD, American Society of Plant Biologists</publisher>
            <editor>Somerville CR, Meyerowitz EM</editor>
            <pubdate>2004</pubdate>
            <volume>doi: 10.1199/tab.0105, http://www.aspb.org/publications/arabidopsis/</volume>
            <fpage>1</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1199/tab.0105</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
