GENOMICS for DUMMIES


1 ØGC seminar, 31 October 2013. GENOMICS for DUMMIES. Torben A. Kruse. Department of Clinical Genetics, Odense University Hospital; Institute of Clinical Research, University of Southern Denmark; Human MicroArray Center, OUH / SDU

2

3

4

5

6 Cause: gene mutations. Stepwise process. High heterogeneity.

7

8 -omics means global, genome-wide

9 GENOME SIZE AND VARIATION. Size: 3 x 10^9 base pairs; 25,000 genes. Variation: % of base pairs polymorphic (SNPs); higher number of rare variants.

10 Is it likely that a somatic mutation happens in a cell? Evidence from NGS: mutation frequency = 10^-8 per base pair per division, i.e. 60 mutations per diploid genome per division, i.e. a 60% risk of a mutation in some gene in one division (if genes constitute 1% of the genome). cell divisions / lifetime
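The arithmetic on this slide can be retraced in a few lines. The sketch below is not part of the talk; it takes the slide's figures (10^-8 mutations per base pair per division, genes as 1% of the genome) and the 3 x 10^9 base pair haploid genome size from slide 9. The expected number of gene mutations per division comes out at 0.6, which is what the slide rounds to a "60% risk". If mutations are additionally assumed to follow a Poisson distribution (an assumption made here, not stated on the slide), the chance of at least one gene mutation in a division is closer to 45%.

```python
import math

# Figures taken from the slides
MUTATION_RATE = 1e-8      # mutations per base pair per cell division
HAPLOID_GENOME_BP = 3e9   # base pairs in one haploid genome
GENE_FRACTION = 0.01      # share of the genome assumed to be genes


def expected_mutations_per_division(rate=MUTATION_RATE, haploid_bp=HAPLOID_GENOME_BP):
    """Expected number of new mutations in a diploid genome per division."""
    return rate * 2 * haploid_bp


def expected_gene_mutations(gene_fraction=GENE_FRACTION):
    """Expected number of those mutations that fall inside genes."""
    return expected_mutations_per_division() * gene_fraction


def prob_at_least_one(expected_count):
    """Poisson probability of one or more events (Poisson model is an assumption)."""
    return 1.0 - math.exp(-expected_count)


if __name__ == "__main__":
    print("mutations per division:", expected_mutations_per_division())
    print("expected gene mutations:", expected_gene_mutations())
    print("P(at least one gene mutation):", prob_at_least_one(expected_gene_mutations()))
```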

11 >gi ref NM_ Homo sapiens breast cancer 1, early onset (BRCA1), transcript variant BRCA1a, mrna AAAACTGCGACTGCGCGGCGTGAGCTCGCTGAGACTTCCTGGACCCCGCACCAGGCTGTGGGGTTTCTCAGATAACTGGGCCCCTGCGCTCAGGAGGCCTTCACCCT CTGCTCTGGGTAAAGTTCATTGGAACAGAAAGAAATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTC CCATCTGTCTGGAGTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTGCATGCTGAAACTTCTCAACCAGAAGAAAGGGCCTTCACAGT GTCCTTTATGTAAGAATGATATAACCAAAAGGAGCCTACAAGAAAGTACGAGATTTAGTCAACTTGTTGAAGAGCTATTGAAAATCATTTGTGCTTTTCAGCTTGAC ACAGGTTTGGAGTATGCAAACAGCTATAATTTTGCAAAAAAGGAAAATAACTCTCCTGAACATCTAAAAGATGAAGTTTCTATCATCCAAAGTATGGGCTACAGAAA CCGTGCCAAAAGACTTCTACAGAGTGAACCCGAAAATCCTTCCTTGCAGGAAACCAGTCTCAGTGTCCAACTCTCTAACCTTGGAACTGTGAGAACTCTGAGGACAA AGCAGCGGATACAACCTCAAAAGACGTCTGTCTACATTGAATTGGGATCTGATTCTTCTGAAGATACCGTTAATAAGGCAACTTATTGCAGTGTGGGAGATCAAGAA TTGTTACAAATCACCCCTCAAGGAACCAGGGATGAAATCAGTTTGGATTCTGCAAAAAAGGCTGCTTGTGAATTTTCTGAGACGGATGTAACAAATACTGAACATCA TCAACCCAGTAATAATGATTTGAACACCACTGAGAAGCGTGCAGCTGAGAGGCATCCAGAAAAGTATCAGGGTAGTTCTGTTTCAAACTTGCATGTGGAGCCATGTG GCACAAATACTCATGCCAGCTCATTACAGCATGAGAACAGCAGTTTATTACTCACTAAAGACAGAATGAATGTAGAAAAGGCTGAATTCTGTAATAAAAGCAAACA GCCTGGCTTAGCAAGGAGCCAACATAACAGATGGGCTGGAAGTAAGGAAACATGTAATGATAGGCGGACTCCCAGCACAGAAAAAAAGGTAGATCTGAATGCTGA TCCCCTGTGTGAGAGAAAAGAATGGAATAAGCAGAAACTGCCATGCTCAGAGAATCCTAGAGATACTGAAGATGTTCCTTGGATAACACTAAATAGCAGCATTCAG AAAGTTAATGAGTGGTTTTCCAGAAGTGATGAACTGTTAGGTTCTGATGACTCACATGATGGGGAGTCTGAATCAAATGCCAAAGTAGCTGATGTATTGGACGTTCT AAATGAGGTAGATGAATATTCTGGTTCTTCAGAGAAAATAGACTTACTGGCCAGTGATCCTCATGAGGCTTTAATATGTAAAAGTGAAAGAGTTCACTCCAAATCAG TAGAGAGTAATATTGAAGACAAAATATTTGGGAAAACCTATCGGAAGAAGGCAAGCCTCCCCAACTTAAGCCATGTAACTGAAAATCTAATTATAGGAGCATTTGTT ACTGAGCCACAGATAATACAAGAGCGTCCCCTCACAAATAAATTAAAGCGTAAAAGGAGACCTACATCAGGCCTTCATCCTGAGGATTTTATCAAGAAAGCAGATT TGGCAGTTCAAAAGACTCCTGAAATGATAAATCAGGGAACTAACCAAACGGAGCAGAATGGTCAAGTGATGAATATTACTAATAGTGGTCATGAGAATAAAACAAA 
AGGTGATTCTATTCAGAATGAGAAAAATCCTAACCCAATAGAATCACTCGAAAAAGAATCTGCTTTCAAAACGAAAGCTGAACCTATAAGCAGCAGTATAAGCAAT ATGGAACTCGAATTAAATATCCACAATTCAAAAGCACCTAAAAAGAATAGGCTGAGGAGGAAGTCTTCTACCAGGCATATTCATGCGCTTGAACTAGTAGTCAGTA GAAATCTAAGCCCACCTAATTGTACTGAATTGCAAATTGATAGTTGTTCTAGCAGTGAAGAGATAAAGAAAAAAAAGTACAACCAAATGCCAGTCAGGCACAGCAG AAACCTACAACTCATGGAAGGTAAAGAACCTGCAACTGGAGCCAAGAAGAGTAACAAGCCAAATGAACAGACAAGTAAAAGACATGACAGCGATACTTTCCCAGA GCTGAAGTTAACAAATGCACCTGGTTCTTTTACTAAGTGTTCAAATACCAGTGAACTTAAAGAATTTGTCAATCCTAGCCTTCCAAGAGAAGAAAAAGAAGAGAAAC TAGAAACAGTTAAAGTGTCTAATAATGCTGAAGACCCCAAAGATCTCATGTTAAGTGGAGAAAGGGTTTTGCAAACTGAAAGATCTGTAGAGAGTAGCAGTATTTC ATTGGTACCTGGTACTGATTATGGCACTCAGGAAAGTATCTCGTTACTGGAAGTTAGCACTCTAGGGAAGGCAAAAACAGAACCAAATAAATGTGTGAGTCAGTGTG CAGCATTTGAAAACCCCAAGGGACTAATTCATGGTTGTTCCAAAGATAATAGAAATGACACAGAAGGCTTTAAGTATCCATTGGGACATGAAGTTAACCACAGTCG GGAAACAAGCATAGAAATGGAAGAAAGTGAACTTGATGCTCAGTATTTGCAGAATACATTCAAGGTTTCAAAGCGCCAGTCATTTGCTCCGTTTTCAAATCCAGGAA ATGCAGAAGAGGAATGTGCAACATTCTCTGCCCACTCTGGGTCCTTAAAGAAACAAAGTCCAAAAGTCACTTTTGAATGTGAACAAAAGGAAGAAAATCAAGGAAA GAATGAGTCTAATATCAAGCCTGTACAGACAGTTAATATCACTGCAGGCTTTCCTGTGGTTGGTCAGAAAGATAAGCCAGTTGATAATGCCAAATGTAGTATCAAAG GAGGCTCTAGGTTTTGTCTATCATCTCAGTTCAGAGGCAACGAAACTGGACTCATTACTCCAAATAAACATGGACTTTTACAAAACCCATATCGTATACCACCACTTT TTCCCATCAAGTCATTTGTTAAAACTAAATGTAAGAAAAATCTGCTAGAGGAAAACTTTGAGGAACATTCAATGTCACCTGAAAGAGAAATGGGAAATGAGAACAT TCCAAGTACAGTGAGCACAATTAGCCGTAATAACATTAGAGAAAATGTTTTTAAAGAAGCCAGCTCAAGCAATATTAATGAAGTAGGTTCCAGTACTAATGAAGTG GGCTCCAGTATTAATGAAATAGGTTCCAGTGATGAAAACATTCAAGCAGAACTAGGTAGAAACAGAGGGCCAAAATTGAATGCTATGCTTAGATTAGGGGTTTTGC AACCTGAGGTCTATAAACAAAGTCTTCCTGGAAGTAATTGTAAGCATCCTGAAATAAAAAAGCAAGAATATGAAGAAGTAGTTCAGACTGTTAATACAGATTTCTCT CCATATCTGATTTCAGATAACTTAGAACAGCCTATGGGAAGTAGTCATGCATCTCAGGTTTGTTCTGAGACACCTGATGACCTGTTAGATGATGGTGAAATAAAGGA AGATACTAGTTTTGCTGAAAATGACATTAAGGAAAGTTCTGCTGTTTTTAGCAAAAGCGTCCAGAAAGGAGAGCTTAGCAGGAGTCCTAGCCCTTTCACCCATACAC 
ATTTGGCTCAGGGTTACCGAAGAGGGGCCAAGAAATTAGAGTCCTCAGAAGAGAACTTATCTAGTGAGGATGAAGAGCTTCCCTGCTTCCAACACTTGTTATTTGGT AAAGTAAACAATATACCTTCTCAGTCTACTAGGCATAGCACCGTTGCTACCGAGTGTCTGTCTAAGAACACAGAGGAGAATTTATTATCATTGAAGAATAGCTTAAA TGACTGCAGTAACCAGGTAATATTGGCAAAGGCATCTCAGGAACAT
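A sequence record like the one above is in FASTA layout: a header line starting with ">" followed by the bases. As an illustration only (the talk shows no code), the sketch below reads such a record and reports its length and GC content. The function names and the short example record are hypothetical; the 20 bases are simply the first 20 of the sequence shown on the slide.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA-formatted string."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at the second record; only the first is returned
            header = line[1:].strip()
        else:
            # drop spaces left over from slide layout and normalise case
            chunks.append(line.replace(" ", "").upper())
    return header, "".join(chunks)


def gc_fraction(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base in "GC") / len(sequence)


if __name__ == "__main__":
    # Hypothetical header; bases are the start of the sequence on the slide
    record = ">example BRCA1 mRNA fragment\nAAAACTGCGA CTGCGCGGCG\n"
    header, seq = parse_fasta(record)
    print(header, len(seq), gc_fraction(seq))
```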

12 GENOME SIZE AND VARIATION. Size: 3 x 10^9 base pairs; 25,000 genes. Variation: % of base pairs polymorphic (SNPs); higher number of rare variants.

13

14 APPLICATIONS: Tumour/metastasis biology; targeted treatment. Prognosis (diagnosis, subtypes). Prediction of treatment response. Detection. Monitoring.

15 Traditional analysis of tumour. Hypothesis driven. DNA analysis (somatic mutations): single-gene sequence analysis; single-gene copy number (amplification/deletion). RNA analysis: single-gene expression at RNA level. Protein analysis: e.g. IHC. Outcome: cancer +/-, metastasis +/-, response +/-. Data analysis: 2 by 2, chi-square.
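The "2 by 2, chi-square" analysis named on this slide can be sketched as follows. This is an illustration rather than the speaker's procedure: it computes the Pearson chi-square statistic without continuity correction for a 2 x 2 table, and the p-value for one degree of freedom via the complementary error function. The counts in the example (mutation present/absent against metastasis +/-) are invented.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                  outcome +   outcome -
    marker +          a           b
    marker -          c           d

    Returns (chi2, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Invented counts: 20 of 25 mutation-positive tumours metastasised,
    # 5 of 25 mutation-negative tumours did.
    chi2, p = chi_square_2x2(20, 5, 5, 20)
    print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

With small expected cell counts, an exact test would normally be preferred over this approximation.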

16 Technological breakthrough: DNA chips (microarrays), allowing a global, data-driven approach. aCGH (array comparative genomic hybridization): global gene copy number analysis. RNA profiling: 25,000 genes; mRNA, microRNA, long non-coding RNA. SNP typing (GWAS): common genetic variation, 500,000 SNPs, germ line (not somatic mutations).

17 RNA profiling

18 Bioinformatician by M. Haemakers

19 Unsupervised and supervised analysis

20 Molecular heterogeneity

21 Technological breakthrough: next-generation sequencing (NGS/MPS), allowing a global, data-driven approach. Whole genome sequencing: 3 x 10^9 base pairs. Whole exome sequencing: 60 x 10^6 base pairs. RNA sequencing: transcriptome analysis incl. copy number analysis.
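To put the two target sizes side by side, the sketch below works out what share of the genome the exome represents and how many bases have to be read to reach a chosen average depth. It is an illustration only; the genome and exome sizes are the slide's, while the depth of 40 used in the example is just a sample value.

```python
GENOME_BP = 3e9   # whole genome, base pairs (from the slide)
EXOME_BP = 60e6   # whole exome, base pairs (from the slide)


def exome_fraction(exome_bp=EXOME_BP, genome_bp=GENOME_BP):
    """Share of the genome covered by the exome."""
    return exome_bp / genome_bp


def bases_required(target_bp, coverage):
    """Total bases to sequence so each target base is read `coverage` times on average."""
    return target_bp * coverage


def mean_coverage(total_bases_sequenced, target_bp):
    """Average read depth obtained from a given amount of sequence."""
    return total_bases_sequenced / target_bp


if __name__ == "__main__":
    depth = 40  # example depth, chosen for illustration
    print("exome share of genome:", exome_fraction())
    print("bases for genome at depth", depth, ":", bases_required(GENOME_BP, depth))
    print("bases for exome at depth", depth, ":", bases_required(EXOME_BP, depth))
```

On these figures the exome is 2% of the genome, so the same average depth needs one fiftieth of the sequence.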

22 Sequencing throughput: 2008: 3 x 10^4 base pairs/day; 2013: 3 x base pairs/day. At 40x coverage, i.e. a 25,000-fold increase in capacity.

23 Identification of clinically relevant profiles. Selection of informative gene sets: RNA features; driver mutations (vs passengers). Definition of model/rule/predictor. Validation: training/test sets; leave-one-out.
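Leave-one-out validation, as listed on this slide, can be sketched in a few lines. The example is an assumption-laden illustration, not the method used in the talk: the predictor is a simple nearest-centroid rule chosen here for brevity, and the two-feature "expression" values and class labels are invented. Each sample is held out in turn, the rule is built from the remaining samples, and the held-out sample is then classified.

```python
import math


def centroid(rows):
    """Feature-wise mean of a list of equal-length numeric vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose training centroid is closest (Euclidean)."""
    by_label = {}
    for features, label in zip(train_x, train_y):
        by_label.setdefault(label, []).append(features)
    best_label, best_distance = None, math.inf
    for label in sorted(by_label):
        distance = math.dist(centroid(by_label[label]), sample)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


if __name__ == "__main__":
    # Invented expression values for two genes in six tumours
    xs = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]]
    ys = ["low risk"] * 3 + ["high risk"] * 3
    print("leave-one-out accuracy:", leave_one_out_accuracy(xs, ys))
```

If informative genes are selected from the data, that selection has to be repeated inside each round of the loop; selecting on all samples first gives an over-optimistic accuracy.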

24 David Cameron: "The genome profile will give doctors a new, advanced understanding of a patient's genetic make-up, condition and treatment needs, ensuring they have access to the right drugs and personalised care far quicker than ever before. By unlocking the power of DNA data, the NHS will lead the global race for better tests, better drugs and above all better care. We are turning an important scientific breakthrough into a potentially life-saving reality for NHS patients across the country."

25 Dec 2012, press release: "DNA tests to revolutionise fight against cancer and help 100,000 NHS patients". The government has earmarked £100 million to: train a new generation of British genetic scientists to lead on the development of new drugs, treatments and cures, building the UK as the world leader in the field, and train the wider healthcare community in harnessing this technology; pump-prime DNA sequencing for cancer and rare inherited diseases; build the NHS data infrastructure to ensure that this new technology leads to better care for patients.

26

27

28

29

30

31