NGS developments in tomato genome sequencing

1 NGS developments in tomato genome sequencing, Sandra Smit
[Slide background: DNA sequence text]

2 Solanaceae genome sequencing

4 Tomato
- Solanaceae family
- Diploid genome
- 12 chromosome pairs
- Genome size: 950 Mb
- Euchromatin size: 220 Mb
- Approx. 35,000 genes

5 Tomato genome sequencing project
- 2004: Hierarchical BAC-by-BAC approach
- The International Tomato Genome Sequencing Consortium

6 Tomato genome sequencing project
- 2009: NGS approach
- Coverage per platform, as listed on the slide: x, Sanger 3.6x, Illumina 82x, SOLiD 140x
- Library types, in slide order: Shotgun, Matepair, BACs, Paired-end, Fosmid ends, BAC ends, Paired-end, Matepair, Shotgun, Matepair
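
The coverage figures on this slide can be turned into rough sequencing yields with the usual relation bases = coverage x genome size. The sketch below assumes the coverages are relative to the 950 Mb genome size from slide 4, which the slides do not state. The name `yield_gb` is made up for this illustration, and the platform whose value is missing on the slide is left out.

```python
# Rough sequencing yield implied by a fold-coverage figure:
#   bases sequenced = coverage x genome size
# Assumption: coverage is relative to the 950 Mb genome size from slide 4.
GENOME_SIZE_MB = 950

# Coverage values as listed on the slide (one platform's value is missing there).
coverage = {"Sanger": 3.6, "Illumina": 82, "SOLiD": 140}


def yield_gb(cov, genome_mb=GENOME_SIZE_MB):
    """Return the approximate number of sequenced bases in Gb."""
    return cov * genome_mb / 1000


for platform, cov in coverage.items():
    print(f"{platform}: {cov}x is roughly {yield_gb(cov):.1f} Gb")
```

Under that assumption, the 82x Illumina data correspond to roughly 78 Gb of sequence.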

7 Tomato genome assembly
- De novo assembly (newbler, CABOG): 454 shotgun, 454 matepair, Sanger matepair (31x, 22x, 3.3x)
- Base error correction (k-mer correction, read(-pair) alignment): Illumina paired-end, SOLiD matepair (70x, 42x, 118x, 61x)
- Long-range scaffolding: Sanger clone ends (0.3x)
- Gap filling: Sanger BACs (117 Mb)
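
The slide names k-mer correction as one of the base error correction steps but gives no detail. As a generic illustration of the idea, and not the consortium's actual pipeline, the sketch below counts k-mers across a set of reads and flags positions in a read where the k-mer is rare, which is a common sign of a sequencing error. The reads, the function names and the `min_count` threshold are all made up for this example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_positions(read, counts, k, min_count=2):
    """Return start positions of k-mers in `read` seen fewer than `min_count` times."""
    return [
        i for i in range(len(read) - k + 1)
        if counts[read[i:i + k]] < min_count
    ]


# Toy data: three identical reads and one read with a single substitution.
reads = ["CCGATATTTAGC", "CCGATATTTAGC", "CCGATATTTAGC", "CCGATCTTTAGC"]
counts = kmer_counts(reads, k=5)

print(weak_positions(reads[3], counts, k=5))  # k-mers overlapping the substitution
print(weak_positions(reads[0], counts, k=5))  # no rare k-mers in an error-free read
```

A run of consecutive rare k-mers points at the erroneous base they all overlap. A real corrector would then replace that base with the one that makes the k-mers frequent again.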

8 Tomato genome build SL
- Mb assembled
- ~900 Mb genome
- 97% anchored
- ~7 scaffolds (scf) per chromosome
- 34,727 genes
- 30,855 supported by RNAseq

9 So, are we done? Is it complete? Is it perfect?
Genome finishing categories (Chain et al. 2009):
- Standard Draft
- High-Quality Draft
- Improved High-Quality Draft
- Annotation-Directed Improvement
- Noncontiguous Finished
- Finished

12 BAC sequencing for gap closure
- Sequencing 1000 BACs (EUSOL)

13 BAC sequencing for gap closure

14 High-throughput small gap closure
[Diagram: Contig A, Gap, Contig B, with 120 nt probes]

16 High-throughput small gap closure
[Diagram: Contig A, Gap, Contig B, with 120 nt probes]
- Read assembly
- Consensus sequence to fill the gap
- CCGATATTTAGCTCTAGGGAA
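
The step from read assembly to a gap-filling consensus can be sketched as a per-column majority vote over reads that have already been placed relative to each other. This is a minimal illustration, not the method used in the project. The read placements are invented toy data, and treating the sequence shown on the slide as the target consensus is an assumption.

```python
from collections import Counter


def consensus(placed_reads):
    """Majority-vote consensus over (offset, sequence) pairs.

    Columns without any read coverage are reported as 'N'.
    """
    length = max(offset + len(seq) for offset, seq in placed_reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in placed_reads:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )


# Toy reads spanning the gap; the last one carries a single base error.
placed_reads = [
    (0, "CCGATATTTAGC"),
    (2, "GATATTTAGCTC"),
    (4, "TATTTAGCTCTA"),
    (9, "AGCTCTAGGGAA"),
    (6, "TTTAGGTCTAGG"),  # C->G error at consensus position 11
]

print(consensus(placed_reads))
```

Because four reads support C at the position where the fifth read has G, the vote recovers the sequence shown on the slide.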

17 A single reference genome is not enough

18

19 150 tomato genome project
- Public-private partnership initiated by TTI green genetics and BGI China
- (Re)sequencing 150 tomato accessions
  - Cultivated tomatoes
  - Land races
  - Wild tomatoes
  - Herbarium material
  - RIL population
- De novo sequencing of 3 genomes

20 NGS facility

21 Acknowledgements
- Wageningen UR
- International tomato genome sequencing consortium
- SOL
- EUSOL
- CBSG
- 150 tomato genome project
- KeyGene and CAT-AgroFood