Individual and Allele specific chromatin. EBI is an Outstation of the European Molecular Biology Laboratory.


1 Individual and Allele specific chromatin EBI is an Outstation of the European Molecular Biology Laboratory.

3 ENCODE production

4 Structure of the data - with no variation. Axes: Cell Type (Neuronal, Hepatocyte, Lymphocyte, Fibroblast), Genome, Assay.

5 Genome-wide analysis

6 We want to add an individual axis to this picture. Axes: Cell Type (Neuronal, Hepatocyte, Lymphocyte, Fibroblast), Genome, Assays, Individual Genotype.

7 Gaining more power and understanding via molecular phenotypes. Diagram labels: Gene Expression, A/T, GWAS, Chromatin, Type I Diabetes, Environment?

8 But it comes with a confounding axis. Axes: Cell Type (Neuronal, Hepatocyte, Lymphocyte, Fibroblast), Genome, Assays, Individual Genotype, Environmental effects.

9 Variance in chromatin: Heritable? Interesting environment (e.g., diet aged 3-5)? Boring environment (e.g., sample handling)?

10 Experimental set up. Leverage the 1,000 Genomes: take the 2 deep trios. 4 unrelated individuals: 2 CEPH, 2 YRI; 2 male, 2 female. 2 children (both female). Lymphoblastoid cell lines. Not ideal - infectious status? Transformation? EBV effects? Only feasible thing at the moment. In good company with the eQTL people. Chromatin assays: DNase I, CTCF.

11 Variable sites

13 Allele specific information

14 The power of allelic information. Different: Genetics, Environment, Sample handling. Same: Environment, Sample handling, Trans genetics. Different: Cis genetics.

15 Distinguishing alleles. A/T site: Allele T: 7 (70%), Allele A: 3.

16 (small nightmare - getting rid of mapping bias)

17 Allele specific sites. Binomial test vs a 0.5 split, 0.1 FDR level. 7% of DNase I sites with a het; 11% of CTCF sites with a het. Example A/T site: Allele T: 7 (70%), Allele A: 3.
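
The test named on this slide can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: the function names are hypothetical, every read count except the 7 vs 3 split is made up, and the Benjamini-Hochberg procedure is an assumption, since the slide only says "0.1 FDR level".

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test of k reads out of n against a 0.5 split."""
    # Under p = 0.5 every outcome j has probability C(n, j) / 2^n, so
    # "as or more extreme than k" is the same as C(n, j) <= C(n, k).
    threshold = comb(n, k)
    extreme = sum(comb(n, j) for j in range(n + 1) if comb(n, j) <= threshold)
    return extreme / 2 ** n

def benjamini_hochberg(pvalues, fdr=0.1):
    """Return one boolean per p-value: True where the site passes the FDR level."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank r whose p-value is <= (r / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed

# (allele T reads, total reads). The first pair is the slide's example;
# the other three are hypothetical.
sites = [(7, 10), (30, 32), (15, 30), (2, 40)]
pvals = [binomial_two_sided_p(k, n) for k, n in sites]
allele_specific = benjamini_hochberg(pvals, fdr=0.1)
```

With only ten reads, the 7 vs 3 split from the slide gives p = 0.34375 on its own, so calling a site allele specific depends heavily on read depth.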

18 Between unrelated individuals? Matched: Cis genetics. Different: Trans genetics, Environment, Sample handling.

20 Allele specific behaviour between individuals. Fisher's exact test, FDR corrected. 2% show opposite bias between unrelated individuals; 1% show opposite bias Parent->Child.
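
A between-individual comparison of this kind can be sketched as a 2x2 Fisher's exact test on allele read counts at a shared heterozygous site. This is an illustrative sketch only: the table layout (rows are individuals, columns are alleles), the function names, and the counts are assumptions, not details from the talk.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are individuals; columns are read counts for the two alleles.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weights, kept as exact integers.
    weights = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    observed = weights[a]
    # Sum every table that is no more likely than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)

def opposite_bias(a, b, c, d):
    """True when the two individuals favour different alleles."""
    return (a - b) * (c - d) < 0

# Hypothetical counts: individual 1 is 7 vs 3, individual 2 is 3 vs 7.
p_opposite = fisher_exact_two_sided(7, 3, 3, 7)
# Hypothetical counts: both individuals are 7 vs 3.
p_same = fisher_exact_two_sided(7, 3, 7, 3)
```

The resulting p-values would then go through the same kind of FDR correction as the per-site binomial tests.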

21 Parents to Child. Does the direction of the child's allele specificity correlate with the parental difference?

22 Combined over both populations: P<e-5, Spearman's correlation, rho ~0.5.
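
For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation of the rank-transformed values. The sketch below computes it in plain Python; the variable names and the five data points are hypothetical and are not data from the talk.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-site values: difference between the parents, and the
# allelic bias measured in the child.
parental_difference = [-0.4, -0.1, 0.0, 0.2, 0.5]
child_bias = [-0.3, 0.1, -0.2, 0.3, 0.2]
rho = spearman_rho(parental_difference, child_bias)
```

Because only ranks matter, the statistic is insensitive to the scale on which allelic bias is measured.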

23 Other biological signals. X is more allele specific than other chromosomes. Allele specificity is biased towards the active X in this cell line.

24 CTCF motif

25 Basic biology in the human genetics context. N cell lines, high-throughput assay: PeakA, PeakB, PeakZ. Correlation/analysis of variance, in particular with respect to genotypes: cQTLs, eQTLs.
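
One simple way to read "correlation with respect to genotypes" is to regress each peak's signal on allele dosage across cell lines. The sketch below does that with ordinary least squares. The additive 0/1/2 genotype coding, the function name, and all numbers are assumptions for illustration; the talk does not specify the model used.

```python
def dosage_association(dosages, signal):
    """Least-squares slope and Pearson r of peak signal against allele dosage."""
    n = len(dosages)
    mx, my = sum(dosages) / n, sum(signal) / n
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in signal)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, signal))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data: six cell lines, genotype coded as copies of one allele.
dosages = [0, 0, 1, 1, 2, 2]
peaks = {
    "PeakA": [1.0, 1.2, 2.0, 2.2, 3.0, 3.2],  # signal tracks genotype
    "PeakB": [2.0, 2.2, 2.1, 1.9, 2.0, 2.2],  # signal does not
}
results = {name: dosage_association(dosages, values) for name, values in peaks.items()}
```

In a genome-wide setting the same calculation would be repeated for every peak and nearby variant, followed by multiple-testing correction.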

26 Back to mid. Would ideally do sequencing, but at 2 lanes a sample this is a bit expensive. Nimblegen 12-plex arrays (currently ~5 fold cheaper). Switched chromatin assay from DNase I to FAIRE.

27 Initial results from a 60 cell line experiment. Similar distributions of P-values as eQTL data.

28 Gaining more power and understanding via molecular phenotypes. Diagram labels: Gene Expression, A/T, GWAS, Chromatin, Cell Shape, Type I Diabetes. Or Drosophila (A. Ephrussi, E. Furlong). Or Mouse (J. Flint).

29 Trudy's flies. 192 wild isogenic lines. 40 whole-genome sequenced (now), 192 coming along.

30 Drosophila image analysis: 114 parameters.

31 GWAS in Drosophila

32 Conclusions. One can reliably measure individual differences in chromatin effects without being swamped by experimental noise. At least some proportion of the variance in TF binding and chromatin accessibility is due to cis-genetic effects. Genome-wide association is not just for human disease genetics - it is a powerful tool for basic biology in humans and other organisms.

33 Thanks. University of Texas: Ryan McDaniell, Bum-Kyu Lee, Zheng Liu, Anna Batterhouse, Vishy Iyer. Duke University: Lingyn Song, Alan Boyle, Katerina S. Kucera, Terry Furey, Greg Crawford, Hunt. Willard. University of North Carolina: Linda Grasfeder, Jason Lieb. NIH: Mike Erdos, Francis Collins. U. Michigan: Laura Scott. EMBL-HD: Anne Ephrussi. EBI: Damian Keefe, Dace Ruklisa. Funding: NIH-NHGRI, EMBL. McDaniell et al., Science, Apr 9;328:

34 What is a qualitative effect? Large effect means significance at low sample numbers (rarely 1 though!). Perfect 4 low vs 2 high (Wilcoxon test). Perfect 10 low vs 1 high. Perfect 10 low vs 2 high.
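
One way to see why group sizes matter here: in an exact Wilcoxon rank-sum test without ties, a perfectly separated sample is the most extreme arrangement possible, so its p-value is fixed by the group sizes alone. The sketch below computes that floor for the three splits on the slide. The function name is hypothetical, and the use of a two-sided exact test is an assumption, since the slide only says "wilcox test".

```python
from math import comb

def min_two_sided_p(n_low, n_high):
    """Smallest two-sided exact rank-sum p-value for two groups of these sizes.

    With no ties, every assignment of ranks to the smaller group is equally
    likely under the null, so perfect separation has one-sided probability
    1 / C(n_low + n_high, n_high); the two-sided value doubles it.
    """
    return min(1.0, 2 / comb(n_low + n_high, n_high))

# The three "perfect" splits listed on the slide.
for n_low, n_high in [(4, 2), (10, 1), (10, 2)]:
    print(n_low, n_high, round(min_two_sided_p(n_low, n_high), 4))
```

Under these assumptions, 4 vs 2 (p = 2/15) and 10 vs 1 (p = 2/11) cannot reach p < 0.05 however clean the split, while 10 vs 2 (p = 1/33) can, which fits the slide's "rarely 1 though!" remark.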

35 Dissecting effects. Diagram labels: SNP, Chromatin, Gene Expression, Genotype A/A, Genotype T/T.

36 Comparisons of FAIRE, RNA, Genotypes

37 More complex conditional correlation

38 Screen shots

41 Figure labels: Pilot, Genome wide. Pilot: Constraint (4.9%). Jan 2010: Constraint (8%). Jan 2010: Constraint (6.8%).

42 DNase I hypersensitivity, CTCF ChIP-seq

43 Current ENCODE Dataset: Non ChIP-seq

44 ChIP-seq datasets