Benno Pütz. MPI of Psychiatry

Size: px
Start display at page:

Download "Benno Pütz. MPI of Psychiatry"

Transcription

1

2

3 Benno Pütz

4

5

6 Lifetime prevalence ~20%

7 Treatment response CC CT TT Binder et al., Nature Genetics 2004

8 Drug transport Dosierung

9 Drug transport Text Uhr et al., Neuron, 2008

10 150 years ago

11 Gregor Mendel

12 Inheritance

13 Heritability

14 Gene mrna Protein Trait

15 Gene mrna Protein Trait

16 Hypothesis-driven research

17 candidate genes

18

19 Problem

20 Complex Diseases

21 many genes

22 weak contributions

23 heritability rate

24 Genetic contribution % 80% Bipolar Disorder Schizophrenia Unipolar Depression % 50 40% ,5 1,25 2 2,75 3,5

25 environment

26 observation

27 treatment response

28 identify mechanisms

29 understand processes

30 genetic level

31 targets for intervention

32 individually tailored

33 Personalized Medicine

34 unbiased approaches

35 Gene Genome Transcript mrna Protein Trait

36 Genome Transcriptome Protein Trait 36

37 Genome Transcriptome Proteome Trait

38 Genome Transcriptome Proteome Trait

39 ...omics Genomics Genome Transcriptome Proteome Trait

40 ...omics Genomics Transcriptomics Genome Transcriptome Proteome Trait

41 ...omics Genomics Transcriptomics Proteomics Genome Transcriptome Proteome Trait

42 High-Throughput Genotyping Gene Expression Proteomics

43 hypotheses 43

44 unbiased approaches

45 High-Throughput Genotyping Gene Expression Proteomics

46 High-Throughput Genotyping Gene Expression Proteomics NGS

47 High-Throughput Genotyping Gene Expression Proteomics NGS

48 Clinical limitations! Patient data Privacy concerns (Bay. Krankenhausgesetz) in-house solution

49 Hardware cluster 192 cores (4 48) 256GB RAM each 48 TB

50 Hardware cluster GPU GTX 295

51 Hardware GTX 580 cluster GPU

52 Hardware cluster GPU

53 Genotyping

54 Genetic variations Human genome > 99% conserved across individuals

55 Genetic variations SNPs CNVs Epigenetic modifications Inversions,

56 Genetic variations 2 copies SNPs CNVs Epigene odifics Inversio 0 copies

57 Genetic variations SNPs CNVs Mutations Epigenetic modifications

58 Genetic variations SNPs CNVs Mutations Epigenetic modifications

59 Genetic variations SNPs CNVs Mutations Epigenetic modifications

60 Genetic variations SNPs CNVs methylated cytosine Mutations Epigenetic modifications acetylated lysine

61 Challenges

62 Imputation

63 SNP Chips Genome-wide genotyping Affymetrix 10k, 100k, 500k, 500k +420k 900k +950k Illumina 100k, 300k, 500k, 1M, only partial overlap

64 Large studies June 2007

65 SNP Chips Affymetrix 10k, 100k, 500k 500k 2 * 250k +420k 900k +950k Illumina 100k, 300k, 600k, 1M, unify

66 Imputation Combine studies different SNP coverage Extent to reference samples HAPMAP project 1000 genomes 1,386,077 7,611,725 7,835,704

67 Linkage Caucasian Asian African

68 Observation vs. Reference

69 Find matches

70 Phase and impute

71 Example Text Li et al., Annu. Rev. Genom. Human Genet :

72 Software Impute Mach Beagle Plink snpmatrix

73 MARS sample 1600 subjects Illumina 675 ksnps 100k / 330k / 600k

74 MARS sample 1600 subjects MSNPs ksnps ~2 weeks 48 cores

75 rare variants

76 Family trees

77 Restless Leg Syndrome

78 Restless Leg Syndrome exact solution: series of conditional likelihoods no parallel solution yet

79 Classification

80 Prediction (differential) diagnosis treatment treatment outcome

81 Variables high-dimensional problem Text find relevant subset

82 Decision trees

83 NGS Sequence alignment

84 Idea Shendure and Ji, Nat Biotech, 2008

85 Idea Shendure and Ji, Nat Biotech, 2008

86 Idea Shendure and Ji, Nat Biotech, 2008

87 Resequencing Text

88 Aligned reads

89 Divide and Conquer Text Naturally parallel

90 Divide and Conquer Text bwa multi-core

91 Divide and Conquer MapReduce Hadoop Software Searching for SNPs with cloud computing Open Access Text Published: 20 November 2009 Genome Biology 2009, 10:R134 (doi: /gb r134) The electronic version of this article is the complete one and can be found online at Received: 30 September 2009 Revised: 5 November 2009 Accepted: 20 November Langmead et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract As DNA sequencing outpaces improvements in computer speed, there is a critical need to accelerate tasks like alignment and SNP calling. Crossbow is a cloud-computing software tool that combines the aligner Bowtie and the SNP caller SOAPsnp. Executing in parallel using Hadoop, Crossbow analyzes data comprising 38-fold coverage of the human genome in three hours using a 320-CPU cluster rented from a cloud computing service for about $85. Crossbow is available from

92 De-novo sequencing no reference sequence to align to computionally more demanding

93 Epistasis

94 Epistasis

95 Model Linear regression Phenotype β 0 + β 1 SNP 1 + β 2 SNP 2 + β 12 (S

96 Binary Phenotype Approximation Difference of correlation between two SNPs in cases vs. controls Zhao, J., Jin, L. and Xiong, M.: Test for interaction between two unlinked loci, Am J Hum Genet, 79, (2006)

97 Correlation coefficients cases controls Gretton, A., Borgwardt K. M., Rasch M., Schölkopf B. and Smola A.: A Kernel Method for the Two-Sample-Problem. Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, (Eds.) Schölkopf, B., J. Platt, T. Hofmann, MIT Press, Cambridge, MA, USA ( )

98 CC

99 Approximation

100 GPU Implementation in using gputools gpucor(x, cor y,...)

101 Validation Explicit test for good interactions explicit logistic regression

102 Application Overestimating

103 Epistasis: GPU Very fast approximation Very fast correlation for continuous phenoype applicable to eqtl studies

104 GPU Strong dependence Text on problem Serious efforts required

105 Problems Classification Epistasis Family trees SNP Imputation Sequence alignment Higher-order interactions eqtl

106 Problems Multi-core GPU-Parallel Too big/complex? Sequence alignment SNP Imputation eqtl Classification Epistasis Higher-order interactions Family trees

107 Thanks

108 Thank you

109