Analysis of Differential Gene Expression in Cattle Using mRNA-seq
mRNA-seq: a rough guide for greenhorns
Animal and Grassland Research and Innovation Centre, Animal and Bioscience Research Department, Teagasc, Grange, Dunsany, Co. Meath, Ireland
Animal and Bioscience Research Department (Grange site)
Department head: Richard Dewhurst
Research officers: Kieran Meade, David Kenny, Sinead Waters, Orla Keane, Bernie Early, Chris Creevey, David Lynn
Research technologists: Matt McCabe
NGS technologies
Illumina (Solexa) Genome Analyser II (GAII): UCD, Conway (part-owned by Teagasc)
Roche 454 FLX: Teagasc Moorepark
Progress
86 poly(A) RNA-seq libraries
  SR = 12 liver, 48 leukocyte, 21 hypothalamus
  PE = 5 pooled sheep lymphocyte
14 mi/piwi RNA libraries
16 pooled target-enriched DNA libraries
Shotgun rumen metagenomic libraries
Important traits: feed efficiency, milk yield, fertility, disease resistance
Negative energy balance (NEB)
A metabolic disorder affecting high-yielding dairy cows in the postpartum period, caused by an imbalance between energy intake and the energy requirements for lactation and body maintenance.
McCarthy et al. (2010)
12 multiparous cows (parity no. = 4.3-5.2) selected randomly from a pool of 24 Holstein-Friesians with an average lactation yield of 6477 ± 354 kg
Treatment applied the morning after the 2nd or 3rd milking post-parturition
MNEB (mild NEB): 6 cows were fed grass silage ad libitum with 8 kg/day of a 21% crude protein dairy concentrate and milked once daily
SNEB (severe NEB): 6 cows were fed 25 kg/day silage with 4 kg/day concentrate and milked 3x daily
Energy balance was calculated
Cows slaughtered on days 6-7 of the first follicular wave after calving
RNA extracted from endometrium, spleen and liver
[Figure: mRNA processing. Labels: DNA (exons 1, 2, 3); primary mRNA; mature mRNA with 5' cap and poly(A) tail (exon combinations 1-2-3 and 1-3); nucleus; cytoplasm]
Library preparation: UCD protocol using individually purchased reagents from Invitrogen and New England Biolabs (NEB)
1. 10 µg total RNA (RIN ~ 8.0)
2. mRNA (Dynal oligo(dT) beads)
3. Fragment mRNA (zinc)
4. First-strand cDNA synthesis (SuperScript II)
5. Second-strand cDNA synthesis
6. End repair (Klenow T4 and E. coli polymerase)
7. Add a single A base
8. Ligate adaptors
9. Gel purify 200-300 bp adaptor-ligated fragments?
10. PCR amplify (12-18 cycles), quantify on Qubit Fluorometer
Cattle leukocyte SR mRNA library preps (Aran O'Loughlin)
Yields = 20-30 ng/µl (Qubit)
Single Read Library Adapter Sequence

Adapter:
5' -------------------- -----ACACTCTTTCCCTAC ACGACGCTCTTCCGATCT (-) -------------------- -------------- 3'
3' -------------------- -----TGTGAGAAAGGGATG TGCTGCGAGAAGGCTAGp (-) -------------------- -------------- 5'

Adapter:
5' -------------------- -------------------- ------------------ (-) pGATCGGAAGAGCTCGTATG CCGTCTTCTGCTTG 3'
3' -------------------- -------------------- ------------------ (-) TCTAGCCTTCTCGAGCATAC GGCAGAAGACGAAC 5'

PCR Primer:
5' AATGATACGGCGACCACCGA GATCTACACTCTTTCCCTAC ACGACGCTCTTCCGATCT (-) -------------------- -------------- 3'
3' -------------------- -------------------- ------------------ (-) -------------------- -------------- 5'

PCR Primer:
5' -------------------- -------------------- ------------------ (-) -------------------- -------------- 3'
3' -------------------- -------------------- ------------------ (-) TCTAGCCTTCTCGAGCATAC GGCAGAAGACGAAC 5'

Result Library:
5' AATGATACGGCGACCACCGA GATCTACACTCTTTCCCTAC ACGACGCTCTTCCGATCT (N) AGATCGGAAGAGCTCGTATG CCGTCTTCTGCTTG 3'
3' TTACTATGCCGCTGGTGGCT CTAGATGTGAGAAAGGGATG TGCTGCGAGAAGGCTAGA (N) TCTAGCCTTCTCGAGCATAC GGCAGAAGACGAAC 5'

Genomic DNA Sequencing Primer:
5' -------------------- -----ACACTCTTTCCCTAC ACGACGCTCTTCCGATCT (-) -------------------- -------------- 3'
3' -------------------- -------------------- ------------------ (-) -------------------- -------------- 5'
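The two strands of the first adapter can be checked against each other with a few lines of Python. This is only an illustrative sketch: the sequences are copied from the adapter layout above, while the function and variable names are made up for the example. It shows that the bottom strand is the base-by-base complement of the top strand, one base short, which leaves the single 3' T that pairs with the A added to the cDNA fragments.

```python
# Complement lookup for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement (not reversed) of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)


# Top strand of the first adapter, written 5'->3' (from the layout above).
top_strand = "ACACTCTTTCCCTAC" + "ACGACGCTCTTCCGATCT"
# Bottom strand as drawn, written 3'->5', without the terminal phosphate "p".
bottom_strand = "TGTGAGAAAGGGATG" + "TGCTGCGAGAAGGCTAG"

# The part of the top strand that is base-paired, and what is left over.
paired_region = complement(top_strand[:len(bottom_strand)])
overhang = top_strand[len(bottom_strand):]

print("strands complementary:", paired_region == bottom_strand)
print("3' overhang on top strand:", overhang)
```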
Paired-end mRNA-seq library prep
Cattle leukocyte mRNA library preps
Sheep lymphocyte PE mRNA library preps
How much coverage?
Haploid bovine genome ~2.87 Gb (Elsik et al., 2009)
22,000 protein-coding genes
180,000 protein-coding exons (~30 Mb)
30 Mb / 36 bp reads = 0.83 M reads for 1x coverage
Rare transcripts + 5' and 3' UTRs?
Alternatively spliced isoforms
Only 70% of reads align to the bovine genome
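The 1x coverage figure is simple arithmetic, sketched below in Python. The 30 Mb target, 36 bp read length and 70% alignment rate come from the slide; the function name is made up for the example, and dividing by the alignment rate is an added assumption (that only aligned reads contribute to coverage), not something the slide states.

```python
def reads_for_coverage(target_bp, read_length_bp, fold=1.0):
    """Number of reads needed to cover target_bp at the given fold coverage."""
    return fold * target_bp / read_length_bp


exon_target_bp = 30_000_000   # ~30 Mb of protein-coding exons
read_length_bp = 36           # short single-read length
aligned_fraction = 0.70       # only 70% of reads align to the genome

reads_1x = reads_for_coverage(exon_target_bp, read_length_bp)
# Assumption: unaligned reads add nothing, so scale up by the aligned fraction.
reads_1x_adjusted = reads_1x / aligned_fraction

print(f"1x coverage: {reads_1x / 1e6:.2f} M reads")
print(f"1x coverage allowing for unaligned reads: {reads_1x_adjusted / 1e6:.2f} M reads")
```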
Ran 12 libraries on 3 flowcells
libraries 1-6 = MNEB
libraries 7-12 = SNEB

Lane  Flowcell 1             Flowcell 2             Flowcell 3
1     library 1 (4.5 pM)*    library 12 (5.5 pM)*   library 2 (5.5 pM)*
2     library 1 (5.0 pM)*    library 6 (5.5 pM)*    library 3 (5.5 pM)
3     library 5 (5.5 pM)*    library 2 (5.5 pM)*    library 6 (5.5 pM)*
4     library 5 (6.0 pM)*    library 7 (5.5 pM)*    library 7 (5.5 pM)*
5     library 10 (4.0 pM)*   library 9 (5.5 pM)     PhiX
6     PhiX                   PhiX                   library 8 (5.5 pM)
7     library 10 (5.5 pM)*   library 11 (5.5 pM)*   library 11 (5.5 pM)*
8     library 10 (6.5 pM)*   library 4 (5.5 pM)     library 12 (5.5 pM)*
Bowtie: alignment (to BCM4)
TopHat: splice junction mapper
Cufflinks: assembles aligned reads into transcripts and computes expression values
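A rough sketch of how the TopHat and Cufflinks steps chain together for one lane is given below. It only builds and prints the command lines; it does not run them. The file names, index name and output directories are hypothetical, and the exact options used for this study are not given on the slides. TopHat calls Bowtie internally, and recent TopHat versions write accepted_hits.bam (older releases wrote a SAM file instead).

```python
import shlex


def build_pipeline_commands(reads_fastq, bowtie_index, out_dir):
    """Build (but do not run) TopHat and Cufflinks commands for one lane."""
    tophat_dir = f"{out_dir}/tophat"
    cufflinks_dir = f"{out_dir}/cufflinks"
    # TopHat aligns reads with Bowtie and maps splice junctions.
    tophat_cmd = ["tophat", "-o", tophat_dir, bowtie_index, reads_fastq]
    # Cufflinks assembles the aligned reads and computes FPKM values.
    cufflinks_cmd = ["cufflinks", "-o", cufflinks_dir,
                     f"{tophat_dir}/accepted_hits.bam"]
    return [tophat_cmd, cufflinks_cmd]


# Hypothetical inputs: one lane of library 1 and a prebuilt Bowtie index.
commands = build_pipeline_commands(
    reads_fastq="library1_lane1.fastq",
    bowtie_index="bcm4_index",
    out_dir="results/library1_lane1",
)
for command in commands:
    print(shlex.join(command))
```

To actually run the steps, each command list could be passed to subprocess.run(command, check=True) on a machine with both tools installed.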
Cufflinks
Short SR [figure: exons 1, 2 and 3]
Alternatively spliced isoforms [figure: isoform with exons 1 and 3; isoform with exons 1, 2 and 3]
Alternative splicing predictions generated by TopHat, displayed on the Integrative Genomics Viewer
Summary data (average for 21 lanes)
Total number of reads: 13,664,739
Reads with only one alignment and no duplicates: 8,539,432 (62%)
Number of islands detected: 212,491
Number of exon/exon junctions detected: 19,606
# of unique splice junctions detected for all 21 lanes
Total: 110,365

Alternative splicing
# of splice junctions for the exon   Count
1                                    99,801
2                                    9,077
3                                    1,185
4                                    248
5                                    39
6                                    8
7                                    4
8                                    3
Short PE sequencing [figure: exons 1, 2 and 3]
Long PE sequencing [figure: exons 1, 2 and 3]
COST
Sequencing kit = 1K
SR cluster gen kit = 2K
PE cluster gen kit = 4K

Short SR = 1 seq kit + 1 SR cluster kit = 3K
Long SR = 2 seq kits + 1 SR cluster kit = 4K
Short PE = 2 seq kits + 1 PE cluster kit = 6K
Long PE = 4 seq kits + 1 PE cluster kit = 8K
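The run costs follow directly from the kit prices, as the short Python sketch below shows. Kit prices and kit counts are taken from the slide (in thousands, currency as on the slide); the dictionary and function names are made up for the example.

```python
# Kit prices in thousands (K), as listed on the slide.
KIT_COST_K = {"seq": 1, "sr_cluster": 2, "pe_cluster": 4}

# Kits consumed by each run type.
RUN_KITS = {
    "Short SR": {"seq": 1, "sr_cluster": 1},
    "Long SR": {"seq": 2, "sr_cluster": 1},
    "Short PE": {"seq": 2, "pe_cluster": 1},
    "Long PE": {"seq": 4, "pe_cluster": 1},
}


def run_cost_k(kits):
    """Total cost in K for a run, given how many of each kit it uses."""
    return sum(KIT_COST_K[kit] * count for kit, count in kits.items())


run_costs = {run: run_cost_k(kits) for run, kits in RUN_KITS.items()}
for run, cost in run_costs.items():
    print(f"{run}: {cost}K")
```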
Microarray vs. mRNA-seq
RNAseq = 16,720 genes
Microarray analysis = 12,679 genes
RNAseq + microarray = 10,950 genes
RNAseq only = 5,770 genes
Microarray analysis only = 1,729 genes
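The five gene counts are related by simple set arithmetic, which can be checked as follows. The counts are from the slide; the variable names, and the union and overlap fraction at the end, are additions for illustration only.

```python
# Gene counts from the slide.
rnaseq_only = 5_770
microarray_only = 1_729
detected_by_both = 10_950

# Each platform total is its exclusive genes plus the shared genes.
rnaseq_total = rnaseq_only + detected_by_both
microarray_total = microarray_only + detected_by_both
# Genes detected by at least one platform.
either_platform = rnaseq_only + microarray_only + detected_by_both

print("RNAseq total:", rnaseq_total)
print("Microarray total:", microarray_total)
print("Detected by either platform:", either_platform)
print(f"Share of microarray genes also seen by RNAseq: "
      f"{detected_by_both / microarray_total:.1%}")
```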
Analysis of significantly differentially expressed genes in MNEB and SNEB animals
Expression values: FPKM (Fragments Per Kilobase of exon per Million fragments mapped), calculated using Cufflinks
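The FPKM definition can be written out as a small Python function. This is only the basic normalisation implied by the name; Cufflinks itself estimates transcript abundances with a statistical model, so its reported values will not in general equal this simple ratio. The function name and the example numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped."""
    kilobases = exon_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


# Hypothetical gene: 500 fragments on 2 kb of exon, 10 M fragments mapped.
example_fpkm = fpkm(fragments=500,
                    exon_length_bp=2_000,
                    total_mapped_fragments=10_000_000)
print(example_fpkm)  # 25.0
```

Dividing by exon length and by library size is what makes values comparable between genes of different length and between lanes of different depth.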
FPKM values
PCA of FPKM values in all lanes in mild (M) and severe (S) NEB animals
348 significantly differentially expressed (SDE) genes (FDR < 0.05)
438 SDE genes (FDR < 0.1)
SDE of 350 of these genes was also shown in the microarray
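For readers new to FDR thresholds, the sketch below shows how a list of per-gene p-values turns into gene counts at FDR < 0.05 and FDR < 0.1. It assumes the Benjamini-Hochberg procedure, which the slides do not name, and the p-values are invented for illustration; they are not data from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for eight genes (not from the study).
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adjusted_pvalues = benjamini_hochberg(example_pvalues)

genes_at_005 = sum(q < 0.05 for q in adjusted_pvalues)
genes_at_010 = sum(q < 0.10 for q in adjusted_pvalues)
print("genes at FDR < 0.05:", genes_at_005)
print("genes at FDR < 0.1:", genes_at_010)
```

As on the slide, relaxing the threshold from 0.05 to 0.1 can only keep or enlarge the gene list.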
Correlation of 350 SDE genes in microarray and mRNA-seq
Lane effect
Used edgeR to compare FPKM values of libraries run on 2 lanes
Libraries 2, 5, 6, 7, 9, 10, 11, 12 (1st lane) vs. libraries 2, 5, 6, 7, 9, 10, 11, 12 (2nd lane)
Found 40 differentially expressed genes!
Pathway analysis
IPA analysis of microarray and RNAseq data shows the same top networks affected by SNEB
InnateDB analysis
GOseq
What next? FRT-seq?
Acknowledgements
Chris Creevey, David Lynn - Teagasc
Paul Cormican - Trinity
Amanda Lohan, Alison Murphy - UCD
Elaine Kenny - Trinity
Sinead Waters, Sean McArthy, Dermot Morris - Teagasc