Functional genomics to improve wheat disease resistance. Dina Raats, Postdoctoral Scientist, Krasileva Group
Talk plan. Goal: to contribute to crop improvement by isolating yellow rust (YR) resistance genes from cultivated wheat. Forward screens for yellow rust resistance in the large Kronos TILLING population: adult plant resistance (APR) phenotyping in US and UK field trials; seedling phenotyping in CER. Speed breeding for development of bi-parental mapping populations. Mapping-by-sequencing in wheat with newly developed EI resources and HPC: CS42 TGACv1 genome assembly and annotation; Kronos genome assembly; DRAGEN BioIT processor.
Yellow rust disease of wheat. Figure labels: susceptible; resistant.
Mapping-by-sequencing strategy. Resistant mutant x susceptible wild type -> F1/F2 (selfing) -> F3 phenotyping -> resistant (R) F2 bulk and susceptible (S) F2 bulk -> exome capture -> Illumina sequencing -> mapping to reference genome -> mutation identification -> allelic frequency scoring, mapping region of interest.
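The allelic frequency scoring step can be illustrated with a minimal sketch. It assumes that the causal mutation, and sites linked to it, are enriched in the R bulk and depleted in the S bulk, so the difference in mutant-allele frequency between bulks is large at linked sites and near zero elsewhere. The function names, site names and read counts below are hypothetical and are not taken from the talk.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting the mutant allele at one site."""
    if total_reads == 0:
        raise ValueError("no coverage at this site")
    return alt_reads / total_reads


def delta_af(r_alt: int, r_total: int, s_alt: int, s_total: int) -> float:
    """Mutant-allele frequency in the R bulk minus that in the S bulk."""
    return allele_frequency(r_alt, r_total) - allele_frequency(s_alt, s_total)


# Hypothetical sites: (name, R alt reads, R depth, S alt reads, S depth)
sites = [
    ("site_linked", 38, 40, 2, 40),     # mutant allele nearly fixed in R bulk only
    ("site_unlinked", 20, 40, 21, 42),  # mutant allele at ~0.5 in both bulks
]

scores = {name: delta_af(ra, rt, sa, st) for name, ra, rt, sa, st in sites}
for name, score in scores.items():
    print(f"{name}: delta AF = {score:.2f}")
```

Sites with a delta close to 1 would be candidates for the region of interest under this assumption; the threshold used in practice is not stated in the slides.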
Kronos TILLING population. EMS treatment: G to A and C to T base changes. M1 -> M2. Over 1,500 lines: genomic DNA and seed. Resources: www.dubcovskylab.ucdavis.edu, www.wheat-tilling.com. Krasileva et al., PNAS, 2017.
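As an illustration of the EMS signature named on this slide, the sketch below flags single-base changes that are G to A or C to T. The helper name and the example variants are hypothetical.

```python
# EMS-type substitutions named on the slide: G to A and C to T.
EMS_CHANGES = {("G", "A"), ("C", "T")}


def is_ems_type(ref: str, alt: str) -> bool:
    """Return True if a single-base change is G>A or C>T (case-insensitive)."""
    return (ref.upper(), alt.upper()) in EMS_CHANGES


# Hypothetical variants as (reference base, alternate base) pairs.
variants = [("G", "A"), ("C", "T"), ("A", "G"), ("g", "a"), ("C", "G")]
ems_flags = [is_ems_type(ref, alt) for ref, alt in variants]
print(ems_flags)  # [True, True, False, True, False]
```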
Yellow rust resistance field trials. Mutant lines with APR in 2 environments: California, US (UCD) and Norwich, Norfolk (EI). Figure labels: mutant with APR; mutant with increased APR; susceptible.
Seedling resistance. Figure labels: resistant mutant; susceptible Kronos WT; mutant with heterozygous resistant phenotype.
Mapping population development. Resistant mutant x Kronos WT. Speed breeding (Riaz et al., Plant Methods, 2016): cross -> F1 -> F2 -> F3 (~9 months). Figure labels: 7 July 2016; 7 August 2016; phenotyping; field; growth chambers.
Chinese Spring 42 assembly and annotation (M. Clark). Open access in Genome Research (http://genome.cshlp.org/content/early/2017/04/04/gr.217117.116.abstract). The reference sequence and annotation: (http://opendata.earlham.ac.uk/triticum_aestivum/tgac/v1/annotation/) and (http://plants.ensembl.org/triticum_aestivum/info/index). The RNA sequencing reads can be downloaded from ENA, project PRJEB15048 (http://www.ebi.ac.uk/ena/data/view/prjeb15048).
W2RAP (w2rap): from raw Illumina data to scaffolds (B. Clavijo). Inputs: PCR-free PE 2x250bp (~700bp) and Nextera LMP (long mate-pair; improved library method by EI). Steps: w2rap-contigger -> contigs (includes PE gaps); Nextera LMP pre-processing; SOAPdenovo scaffolding; N-stretches re-mapping -> scaffolds. Accuracy and contiguity; metrics and traceability all along. Do-it-yourself: (https://github.com/bioinfologics/w2rap). Clavijo et al., in preparation / soon on bioRxiv; w2rap-contigger available on GitHub (https://github.com/bioinfologics/w2rap-contigger).
Improving genome annotation (D. Swarbreck). 217,907 loci; 104,091 high-confidence protein-coding genes. High quality: RNA-seq with PacBio and strand-specific Illumina data. Inputs: genomic assembly; PacBio Iso-Seq; Illumina strand-specific; cross-species proteins. Pipeline stages: alignment; splice junction filtering; repeat identification; transcript assembly and selection (multiple assembly methods, selection by Mikado); gene predictor training; gene prediction; gene model refinement + addition of splice variants; gene confidence classification; functional annotation. Tools developed to improve transcript reconstruction: Portcullis (https://github.com/maplesond/portcullis) for splice junction quality filtering; Mikado (https://github.com/lucventurini/mikado), which scores transcript quality, robustly integrates multiple RNA-Seq assemblies, and detects and resolves chimeric transcripts.
APR mapping in selected mutant (J. Hegarty). Resistant mutant x susceptible Kronos wt -> F1 (selfing) -> F2 (selfing) -> F3: phenotyping of 30 progenies. Figure labels: F2 homozygous resistant; F2 homozygous susceptible.
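One way to see why progeny testing can separate homozygous from heterozygous F2 plants is a small probability sketch. It assumes a single gene with complete dominance, which the slides do not state: under that model each selfed progeny of a heterozygous F2 shows the dominant phenotype with probability 3/4, so the chance that all 30 phenotyped F3 progeny do so (making the F2 look homozygous) is (3/4)^30. The function name is hypothetical.

```python
from fractions import Fraction


def prob_het_looks_fixed(n_progeny: int) -> Fraction:
    """Chance that a heterozygous F2 yields no recessive-phenotype F3 progeny.

    Assumes a single gene with complete dominance (an assumption made for
    this illustration, not a statement from the talk).
    """
    return Fraction(3, 4) ** n_progeny


p = prob_het_looks_fixed(30)
print(f"P(misclassified heterozygote, n=30) = {float(p):.2e}")
```

Under these assumptions the misclassification risk with 30 progeny is on the order of 1 in 10,000, so F2 plants scored as uniformly resistant or uniformly susceptible in F3 can reasonably be treated as homozygous.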
APR mapping in selected mutant. 48 individually barcoded libraries; 16-plex pools; 82.4 Mb exome-capture design (Krasileva et al., Genome Biology, 2013); 3 Illumina HiSeq 2500 lanes; mapping to CS42 TGAC and Kronos (bwa and DRAGEN bwa); variant calling (GATK, DRAGEN GATK and FreeBayes); identification of chromosomal region; VEP analysis -> causative mutation.
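The pooling arithmetic on this slide can be checked with a short sketch: 48 barcoded libraries split into 16-plex pools give 3 pools, which matches the 3 HiSeq 2500 lanes if each pool is run on its own lane (that one-pool-per-lane layout is an assumption). The library identifiers and the helper name are hypothetical.

```python
def assign_pools(library_ids: list[str], plex: int) -> list[list[str]]:
    """Split barcoded libraries into consecutive pools of at most `plex` each."""
    return [library_ids[i:i + plex] for i in range(0, len(library_ids), plex)]


# Hypothetical library names lib01 .. lib48.
libraries = [f"lib{n:02d}" for n in range(1, 49)]
pools = assign_pools(libraries, plex=16)

for lane, pool in enumerate(pools, start=1):
    print(f"lane {lane}: {len(pool)} libraries ({pool[0]} .. {pool[-1]})")
```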
Mapping and variant calling with CS reference. Aligned samples: Kronos wild type; mutant 1; mutant 2; mutant 3. Signal: EMS mutations. Noise: paralog/homoeolog SNPs; varietal SNPs; PCR errors; off-target repeats.
Assembling multiple Wheat genomes
Mapping and variant calling with Kronos reference. Aligned samples: Kronos wild type; mutant 1; mutant 2; mutant 3. Signal: EMS mutations. Noise: paralog/homoeolog SNPs; PCR errors; off-target repeats.
Pipeline and timescale (C. Schudoma). Input: 2.3 billion reads, 2x126bp. Quality control: hours / sample (HPC). Mapping to reference: 5 min / sample (DRAGEN) vs. days / sample (HPC). Mutation detection: 5 min / sample (DRAGEN) vs. 2-3 days (HPC). Filtering and allele frequency analysis: <1h -> putative region. DRAGEN (Dynamic Read Analysis for Genomics): the DRAGEN Processor uses a field-programmable gate array (FPGA) to provide hardware-accelerated implementations of genome pipeline algorithms.
Chromosomal region and putative causative mutations. 45 F2 samples, 3 Kronos wt. 180 SNPs on 42 scaffolds. 2 genetic bins, 2.5 cM (TGACv1 assembly mapped to POPSEQ; Chapman et al., Genome Biology, 2015). 120 SNPs in high-confidence genes (TGACv1 CS annotation). VEP analysis -> putative causative mutations: 18 genes with a missense variant. Annotations shown: cysteine-rich receptor-like protein kinase; protein kinase family protein; CLV1 receptor kinase-like protein.
Validation. Refining the genetic map by screening additional phenotyped F2 plants with KASP markers. Backcrosses: Kronos wt; other susceptible tetraploid cultivars.
Acknowledgments: Ksenia Krasileva, Christian Schudoma, Andrew Deatker, Amelie Heckmann, Anthony Hall, Matt Clark, Bernardo Clavijo, David Swarbreck, Luca Venturini, Federica di Palma, Rob Davey, Leah Clissold, Jorge Dubcovsky, Joshua Hegarty
Filtering with IncredibleBulk.py (Kronos reference; Kronos wild type, mutant 1, mutant 2, mutant 3). Only EMS-type (G>A / C>T) mutations polymorphic between Kronos wt and Kronos mutant; R bulk mutant allele: keep; S bulk mutant allele: out; 30x on-site coverage for each line in a bulk; HomMut ratio 0.9 (2 reads).
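The filtering criteria on this slide can be sketched as a single predicate. This is not the code of IncredibleBulk.py, which is not shown in the talk; it is a minimal illustration under stated assumptions: the depth threshold is applied to each bulk as a whole rather than to each line, "polymorphic between wt and mutant" is read as no mutant-allele reads in Kronos wt, "S bulk mut allele out" is read as rejecting sites with any mutant-allele reads in the S bulk, and the "(2 reads)" qualifier is not modelled. All names and example values are hypothetical.

```python
from dataclasses import dataclass

EMS_CHANGES = {("G", "A"), ("C", "T")}  # EMS-type changes named on the slide
MIN_DEPTH = 30        # "30x on-site coverage" (applied per bulk here, an assumption)
HOM_MUT_RATIO = 0.9   # "HomMut ratio 0.9"


@dataclass
class Site:
    ref: str
    alt: str
    wt_alt_reads: int  # mutant-allele reads in Kronos wild type
    r_alt: int         # mutant-allele reads in the R bulk
    r_depth: int       # total reads in the R bulk
    s_alt: int         # mutant-allele reads in the S bulk
    s_depth: int       # total reads in the S bulk


def passes_filter(site: Site) -> bool:
    """Apply the assumed reading of the slide's criteria to one site."""
    if (site.ref, site.alt) not in EMS_CHANGES:
        return False  # not an EMS-type change
    if site.wt_alt_reads > 0:
        return False  # not polymorphic between wt and mutant
    if site.r_depth < MIN_DEPTH or site.s_depth < MIN_DEPTH:
        return False  # insufficient coverage
    if site.r_alt / site.r_depth < HOM_MUT_RATIO:
        return False  # mutant allele not near-fixed in the R bulk
    if site.s_alt > 0:
        return False  # mutant allele present in the S bulk
    return True


# Hypothetical example sites.
candidate = Site("G", "A", 0, 29, 30, 0, 35)
wrong_type = Site("A", "G", 0, 30, 30, 0, 30)
low_depth = Site("C", "T", 0, 20, 20, 0, 40)
in_s_bulk = Site("C", "T", 0, 30, 30, 5, 30)

results = [passes_filter(s) for s in (candidate, wrong_type, low_depth, in_s_bulk)]
print(results)  # [True, False, False, False]
```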