- Johanson, U. Molecular analysis of FRIGIDA, a major determinant of natural variation in Arabidopsis flowering time. Science 290, 344-347 (2000).
  - . . . However, this may cause bias, because genes may be inactive in the reference but expressed in the population (ref. 1), suggesting that sequencing and re-annotating individual genomes is necessary . . .
- Bentley, D. R. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53-59 (2008).
  - . . . Advances in sequencing (ref. 2) make this tractable for Arabidopsis thaliana (refs 3, 4, 5), whose natural accessions (strains) are typically homozygous . . .
  - . . . Accessions were sequenced with Illumina paired-end reads (ref. 2) (Supplementary Table 1), generally with two libraries with 200-bp and 400-bp inserts and reads of 36 and 51 bp, respectively, to between 27-fold and 60-fold coverage . . .
- Ossowski, S. Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Res. 18, 2024-2033 (2008).
  - . . . Bur-0 survey (blue line): 1,442 survey sequences (about 417 bp each) in predominantly genic regions (ref. 19); Bur-0 divergent (red line): 188 sequences (each about 254 bp) highly divergent from Col-0 (ref. 3); Ler-0 nonrepetitive (orange line): a predominantly single-copy 175-kb Ler-0 sequence on chromosome 5; Ler-0 repetitive (purple line): a highly repetitive 339-kb Ler-0 locus on chromosome 3 (ref. 18; Supplementary Information section 4) . . .
  - . . . Advances in sequencing (ref. 2) make this tractable for Arabidopsis thaliana (refs 3, 4, 5), whose natural accessions (strains) are typically homozygous . . .
  - . . . Relative to the 119-megabase (Mb) high-quality reference sequence from Col-0 (ref. 6), diverse accessions harbour a single nucleotide polymorphism (SNP) about every 200 base pairs (bp) (ref. 3), and indel variation is pervasive (refs 3, 7, 8) . . .
  - . . . The assembled genomes also contribute to the A. thaliana 1001 Genomes Project (refs 3, 4, 5, 13). . . .
  - . . . At unique loci, polymorphic regions probably reflect complex polymorphisms (refs 3, 8) . . .
  - . . . As assessed with about 1.2 Mb of genomic dideoxy data (refs 3, 18, 19) (Supplementary Information section 4), the substitution error rate was about 1 per 10 kb in single-copy regions, and about tenfold higher in transposable-element-rich regions . . .
  - . . . As expected (refs 3, 7), disease resistance genes of the coiled-coil and Toll interleukin 1 receptor subfamilies of the Nucleotide-Binding Leucine Rich Repeat (NB-LRR) gene family were predicted to encode the most variable proteins (Fig. 4a and Supplementary Fig. 26) . . .
- Schneeberger, K. Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proc. Natl Acad. Sci. USA 108, 10249-10254 (2011).
  - . . . Advances in sequencing (ref. 2) make this tractable for Arabidopsis thaliana (refs 3, 4, 5), whose natural accessions (strains) are typically homozygous . . .
  - . . . The assembled genomes also contribute to the A. thaliana 1001 Genomes Project (refs 3, 4, 5, 13). . . .
  - . . . The substitution error rate for our assemblies was comparable to that reported for four other A. thaliana genome assemblies (ref. 4). . . .
  - . . . Our study goes beyond cataloguing polymorphisms (refs 7, 17) to provide genome sequences for a moderately sized population sample (see also refs 4, 16) . . .
- Weigel, D.; Mott, R. The 1001 genomes project for Arabidopsis thaliana. Genome Biol. 10, 107 (2009).
  - . . . Advances in sequencing (ref. 2) make this tractable for Arabidopsis thaliana (refs 3, 4, 5), whose natural accessions (strains) are typically homozygous . . .
  - . . . The assembled genomes also contribute to the A. thaliana 1001 Genomes Project (refs 3, 4, 5, 13). . . .
  - . . . The methods we developed are of immediate relevance to the broader A. thaliana 1001 Genomes Project (ref. 5) and to other organisms, and highlight the importance of RNA-seq data for annotation. . . .
- The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815 (2000).
  - . . . Relative to the 119-megabase (Mb) high-quality reference sequence from Col-0 (ref. 6), diverse accessions harbour a single nucleotide polymorphism (SNP) about every 200 base pairs (bp) (ref. 3), and indel variation is pervasive (refs 3, 7, 8) . . .
- Clark, R. M. Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science 317, 338-342 (2007).
  - . . . Relative to the 119-megabase (Mb) high-quality reference sequence from Col-0 (ref. 6), diverse accessions harbour a single nucleotide polymorphism (SNP) about every 200 base pairs (bp) (ref. 3), and indel variation is pervasive (refs 3, 7, 8) . . .
  - . . . The probability of recent co-ancestry is slightly higher than expected for a few pairs of accessions, with extended haplotype sharing at a minority of loci (Supplementary Figs 11–15), perhaps reflecting selective sweeps (ref. 7) . . .
  - . . . Variation among the 18 accessions is similar to a diverse global A. thaliana sample (refs 7, 8) in nucleotide diversity (Supplementary Figs 11–15), correlation with genomic features (Supplementary Tables 9–12) and structural variants (Supplementary Fig. 17). . . .
  - . . . As expected (refs 3, 7), disease resistance genes of the coiled-coil and Toll interleukin 1 receptor subfamilies of the Nucleotide-Binding Leucine Rich Repeat (NB-LRR) gene family were predicted to encode the most variable proteins (Fig. 4a and Supplementary Fig. 26) . . .
  - . . . Our data suggest that high turnover for some F-box families in the A. thaliana lineage (ref. 7) extends to gene expression as well. . . .
  - . . . Our study goes beyond cataloguing polymorphisms (refs 7, 17) to provide genome sequences for a moderately sized population sample (see also refs 4, 16) . . .
- Zeller, G. Detecting polymorphic regions in Arabidopsis thaliana with resequencing microarrays. Genome Res. 18, 918-929 (2008).
  - . . . Relative to the 119-megabase (Mb) high-quality reference sequence from Col-0 (ref. 6), diverse accessions harbour a single nucleotide polymorphism (SNP) about every 200 base pairs (bp) (ref. 3), and indel variation is pervasive (refs 3, 7, 8) . . .
  - . . . We aligned reads to the final assemblies to detect polymorphic regions (ref. 8) lacking read coverage (2.1–3.7 Mb per accession; Supplementary Table 3 and Supplementary Fig. 2) . . .
  - . . . At unique loci, polymorphic regions probably reflect complex polymorphisms (refs 3, 8) . . .
  - . . . Variation among the 18 accessions is similar to a diverse global A. thaliana sample (refs 7, 8) in nucleotide diversity (Supplementary Figs 11–15), correlation with genomic features (Supplementary Tables 9–12) and structural variants (Supplementary Fig. 17). . . .
- Kover, P. X. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 5, e1000551 (2009).
  - . . . Characterizing this variation is crucial for dissecting the genetic architecture of traits by quantitative trait locus mapping in recombinant inbred lines (see, for example, ref. 9) or genome-wide association in natural accessions (ref. 10). . . .
  - . . . Here we have sequenced and accurately assembled the single-copy genomes of 18 accessions that, with Col-0, are the parents of more than 700 Multiparent Advanced Generation Inter-Cross (MAGIC) lines (ref. 9), similar to the maize Nested Association Mapping (NAM) population (ref. 11) and the murine Collaborative Cross (ref. 12) . . .
  - . . . These accessions comprise a geographically and phenotypically diverse sample across the species (ref. 9) . . .
  - . . . Our findings indicate that the MAGIC lines, for which population structure is largely mitigated (ref. 9), will be an important and complementary resource to genome-wide association studies in A. thaliana populations (ref. 10). . . .
- Atwell, S. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465, 627-631 (2010).
  - . . . Characterizing this variation is crucial for dissecting the genetic architecture of traits by quantitative trait locus mapping in recombinant inbred lines (see, for example, ref. 9) or genome-wide association in natural accessions (ref. 10). . . .
  - . . . Our findings indicate that the MAGIC lines, for which population structure is largely mitigated (ref. 9), will be an important and complementary resource to genome-wide association studies in A. thaliana populations (ref. 10). . . .
- McMullen, M. D. Genetic properties of the maize nested association mapping population. Science 325, 737-740 (2009).
  - . . . Here we have sequenced and accurately assembled the single-copy genomes of 18 accessions that, with Col-0, are the parents of more than 700 Multiparent Advanced Generation Inter-Cross (MAGIC) lines (ref. 9), similar to the maize Nested Association Mapping (NAM) population (ref. 11) and the murine Collaborative Cross (ref. 12) . . .
- Collaborative cross mice and their power to map host susceptibility to Aspergillus fumigatus infection. Genome Res. 21, 1239-1248 (2011).
  - . . . Here we have sequenced and accurately assembled the single-copy genomes of 18 accessions that, with Col-0, are the parents of more than 700 Multiparent Advanced Generation Inter-Cross (MAGIC) lines (ref. 9), similar to the maize Nested Association Mapping (NAM) population (ref. 11) and the murine Collaborative Cross (ref. 12) . . .
- Cao, J. Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nature Genet. (28 August 2011).
  - . . . The assembled genomes also contribute to the A. thaliana 1001 Genomes Project (refs 3, 4, 5, 13). . . .
- Lunter, G.; Goodson, M. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 21, 936-939 (2011).
  - . . . Each genome was assembled by using five cycles of iterative read mapping (ref. 14) combined with de novo assembly (ref. 15) (Supplementary Information sections 2 and 3, and Supplementary Tables 1 and 2) . . .
- Li, R. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20, 265-272 (2010).
  - . . . Each genome was assembled by using five cycles of iterative read mapping (ref. 14) combined with de novo assembly (ref. 15) (Supplementary Information sections 2 and 3, and Supplementary Tables 1 and 2) . . .
- Keane, T. M. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature (in the press).
  - . . . The density of sequence differences is greater than between classical inbred strains of mice (ref. 16), but less than between lines of maize (ref. 17). . . .
  - . . . Our study goes beyond cataloguing polymorphisms (refs 7, 17) to provide genome sequences for a moderately sized population sample (see also refs 4, 16) . . .
- Gore, M. A. A first-generation haplotype map of maize. Science 326, 1115-1117 (2009).
  - . . . The density of sequence differences is greater than between classical inbred strains of mice (ref. 16), but less than between lines of maize (ref. 17). . . .
  - . . . Our study goes beyond cataloguing polymorphisms (refs 7, 17) to provide genome sequences for a moderately sized population sample (see also refs 4, 16) . . .
- Lai, A. G.; Denton-Giles, M.; Mueller-Roeber, B.; Schippers, J. H.; Dijkwel, P. P. Positional information resolves structural variations and uncovers an evolutionarily divergent genetic locus in accessions of Arabidopsis thaliana. Genome Biol. Evol. (27 May 2011).
  - . . . Bur-0 survey (blue line): 1,442 survey sequences (about 417 bp each) in predominantly genic regions (ref. 19); Bur-0 divergent (red line): 188 sequences (each about 254 bp) highly divergent from Col-0 (ref. 3); Ler-0 nonrepetitive (orange line): a predominantly single-copy 175-kb Ler-0 sequence on chromosome 5; Ler-0 repetitive (purple line): a highly repetitive 339-kb Ler-0 locus on chromosome 3 (ref. 18; Supplementary Information section 4) . . .
  - . . . As assessed with about 1.2 Mb of genomic dideoxy data (refs 3, 18, 19) (Supplementary Information section 4), the substitution error rate was about 1 per 10 kb in single-copy regions, and about tenfold higher in transposable-element-rich regions . . .
- Nordborg, M. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 3, e196 (2005).
  - . . . Bur-0 survey (blue line): 1,442 survey sequences (about 417 bp each) in predominantly genic regions (ref. 19); Bur-0 divergent (red line): 188 sequences (each about 254 bp) highly divergent from Col-0 (ref. 3); Ler-0 nonrepetitive (orange line): a predominantly single-copy 175-kb Ler-0 sequence on chromosome 5; Ler-0 repetitive (purple line): a highly repetitive 339-kb Ler-0 locus on chromosome 3 (ref. 18; Supplementary Information section 4) . . .
  - . . . As assessed with about 1.2 Mb of genomic dideoxy data (refs 3, 18, 19) (Supplementary Information section 4), the substitution error rate was about 1 per 10 kb in single-copy regions, and about tenfold higher in transposable-element-rich regions . . .
- Song, Y. S.; Hein, J. Constructing minimal ancestral recombination graphs. J. Comput. Biol. 12, 147-169 (2005).
  - . . . We computed phylogenies (ref. 20) across 1.25 million biallelic, non-private SNPs (Supplementary Information section 6) . . .
- Jean, G.; Kahles, A.; Sreedharan, V. T.; De Bona, F.; Ratsch, G. Current Protocols in Bioinformatics (2010).
  - . . . We integrated read alignments (ref. 21) with sequence-based gene predictions (ref. 22) by using mGene.ngs (Supplementary Information sections 9–10.3, and Supplementary Fig. 19) . . .
- Schweikert, G. mGene: accurate SVM-based gene finding with an application to nematode genomes. Genome Res. 19, 2133-2143 (2009).
  - . . . We integrated read alignments (ref. 21) with sequence-based gene predictions (ref. 22) by using mGene.ngs (Supplementary Information sections 9–10.3, and Supplementary Fig. 19) . . .
  - . . . Comparison of Col-0 de novo predictions with TAIR10 annotations (Supplementary Table 16) showed that these predictions are more accurate (transcript F-score 65.2%) than using the genome sequence (mGene (ref. 22), 59.6%) or RNA-seq alignments alone (Cufflinks (ref. 23), 37.5%; Supplementary Table 17) . . .
- Trapnell, C. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnol. 28, 511-515 (2010).
  - . . . Comparison of Col-0 de novo predictions with TAIR10 annotations (Supplementary Table 16) showed that these predictions are more accurate (transcript F-score 65.2%) than using the genome sequence (mGene (ref. 22), 59.6%) or RNA-seq alignments alone (Cufflinks (ref. 23), 37.5%; Supplementary Table 17) . . .
- Hu, T. T. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nature Genet. 43, 476-481 (2011).
  - . . . As expected, variation between A. thaliana and its congener A. lyrata (ref. 24) exceeds that observed among A. thaliana accessions (Fig. 2c and Supplementary Fig. 23) . . .
- Silverstein, K. A.; Graham, M. A.; Paape, T. D.; VandenBosch, K. A. Genome organization of more than 300 defensin-like genes in Arabidopsis. Plant Physiol. 138, 600-610 (2005).
  - . . . F-box and defensin-like genes implicated in diverse processes including defence (refs 25, 26) were also highly variable . . .
  - . . . F-box and defensin-like genes were exceptional in that expression was restricted in a minority of genes (41% and 12%, respectively; Fig. 4b), perhaps reflecting tissue-specific or environment-specific expression (refs 25, 37) . . .
- Gagne, J. M.; Downes, B. P.; Shiu, S. H.; Durski, A. M.; Vierstra, R. D. The F-box subunit of the SCF E3 complex is encoded by a diverse superfamily of genes in Arabidopsis. Proc. Natl Acad. Sci. USA 99, 11519-11524 (2002).
  - . . . F-box and defensin-like genes implicated in diverse processes including defence (refs 25, 26) were also highly variable . . .
- Anders, S.; Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010).
  - . . . In total, 75% (20,550) of protein-coding genes (and 21% of non-coding RNAs and 21% of pseudogenes) were expressed in at least one accession (false discovery rate (FDR) 5%), and 46% (9,360) of expressed protein-coding genes were differentially expressed between at least one pair of accessions (ref. 27) (Fig. 3a; FDR 5%, Supplementary Information section 11) . . .
- Keurentjes, J. J. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc. Natl Acad. Sci. USA 104, 1708-1713 (2007).
  - . . . Our results corroborate the general findings (refs 28, 29, 30, 31) of extensive cis regulation of gene expression in A. thaliana . . .
- Plantegenet, S. Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance. Mol. Syst. Biol. 5, 242 (2009).
  - . . . Our results corroborate the general findings (refs 28, 29, 30, 31) of extensive cis regulation of gene expression in A. thaliana . . .
  - . . . Copy-number and structural variants were associated with expression in 3% (240) of differentially expressed genes, including 45% (64 out of 142) of genes with more than 100-fold differences (Fig. 3b), consistent with array studies (ref. 29). . . .
- West, M. A. Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175, 1441-1450 (2007).
  - . . . Our results corroborate the general findings (refs 28, 29, 30, 31) of extensive cis regulation of gene expression in A. thaliana . . .
- Zhang, X.; Cal, A. J.; Borevitz, J. O. Genetic architecture of regulatory variation in Arabidopsis thaliana. Genome Res. 21, 725-733 (2011).
  - . . . Our results corroborate the general findings (refs 28, 29, 30, 31) of extensive cis regulation of gene expression in A. thaliana . . .
- Howe, G. A.; Jander, G. Plant immunity to insect herbivores. Annu. Rev. Plant Biol. 59, 41-66 (2008).
  - . . . Seventeen of the 18 GO classifications that were enriched for differential expression (P < 10⁻³) concerned response to the biotic environment, including pathogen defence and the production of glucosinolates (ref. 32) to deter herbivores (Supplementary Table 24) . . .
- Kaufmann, K.; Melzer, R.; Theissen, G. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene 347, 183-198 (2005).
  - . . . The type II MADS box transcription factor family (ref. 33) showed striking expression polymorphisms (Fig. 4b–d), including for the FLOWERING LOCUS C (FLC) (ref. 34) and MADS AFFECTING FLOWERING (MAF) genes (ref. 35) . . .
- Sheldon, C. C. The FLF MADS box gene: a repressor of flowering in Arabidopsis regulated by vernalization and methylation. Plant Cell 11, 445-458 (1999).
  - . . . The type II MADS box transcription factor family (ref. 33) showed striking expression polymorphisms (Fig. 4b–d), including for the FLOWERING LOCUS C (FLC) (ref. 34) and MADS AFFECTING FLOWERING (MAF) genes (ref. 35) . . .
- Ratcliffe, O. J.; Kumimoto, R. W.; Wong, B. J.; Riechmann, J. L. Analysis of the Arabidopsis MADS AFFECTING FLOWERING gene family: MAF2 prevents vernalization by short periods of cold. Plant Cell 15, 1159-1169 (2003).
  - . . . The type II MADS box transcription factor family (ref. 33) showed striking expression polymorphisms (Fig. 4b–d), including for the FLOWERING LOCUS C (FLC) (ref. 34) and MADS AFFECTING FLOWERING (MAF) genes (ref. 35) . . .
- Lempe, J. Diversity of flowering responses in wild Arabidopsis thaliana strains. PLoS Genet. 1, 109-118 (2005).
  - . . . FLC, a floral inhibitor expressed highly in accessions that require prolonged cold (vernalization) to flower (ref. 36), varied more than 400-fold (Supplementary Fig. 42) . . .
- Schmid, M. A gene expression map of Arabidopsis thaliana development. Nature Genet. 37, 501-506 (2005).
  - . . . F-box and defensin-like genes were exceptional in that expression was restricted in a minority of genes (41% and 12%, respectively; Fig. 4b), perhaps reflecting tissue-specific or environment-specific expression (refs 25, 37) . . .