GATCGTGCACGATCTCGGCAATTCGGGATGCCGGCTCGTCACCGGTCGCT

Problem 1. (pts)
A. (5pts) Your colleague, Professor Eugene Mathew Lateed, generated a genome-wide DNA methylation map for normal colon cells using MRE-seq and MeDIP-seq. In an intergenic region, he found an interesting locus of about 10kb. On one end of the locus there is a 2kb CpG-rich stretch that has intermediate levels of both MRE-seq and MeDIP-seq signal. The remaining 8kb has a high level of MeDIP-seq signal. Based on what you learned in class, you suspect that this region encodes a novel gene. Why do you suspect so? You decide to look at histone modification patterns across this region for more evidence. Several genome-wide datasets are available for this cell type: H3K4me1, H3K4me3, H3K27me3, H3K9me3, H3K36me3, and H3K9Ac. Which histone mark would you investigate for this locus, and why? Can you suggest at least one other type of data that may help you? Why do you think it can help?

B. (5pts) You decide to use bisulfite sequencing to validate the methylation status of the 2kb region that has both intermediate MRE-seq and MeDIP-seq signal. Bisulfite treatment converts unmethylated C to U, which is then sequenced as T. You did this experiment in both normal colon cells and a colon cancer cell line. Here is some data after aligning bisulfite reads to the region. For simplicity, we only consider data on one strand.

Template:
GATCGTGCACGATCTCGGCAATTCGGGATGCCGGCTCGTCACCGGTCGCT
Reads (normal):
GATTGTGTATGATTTTGGTAATTTGGG
GATTGTGTATGATTTTGGTAATTTGGG
GATTGTGTATGATTTTGGTAATTTGGG
GATTGTGTATGATTTCGGTAATTCGGG
GATTGTGTATGATTTCGGTAATTCGGG

Template:
GATCGTGCACGATCTCGGCAATTCGGGATGCCGGCTCGTCACCGGTCGCT
Reads (cancer):

Please calculate the level of methylation, defined as the percentage of reads methylated, at each of the 8 CpG sites in both normal cells and cancer cells. Based on these data, you suggest to Professor Lateed that this promoter is potentially imprinted. Why? Describe how you might obtain additional support for the imprinted status of this promoter. What is the observed change in cancer cells? If this novel gene plays a role in tumorigenesis, do you think it is a tumor suppressor gene or an oncogene? Why? Propose at least two mechanisms that could lead to the observed change.
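The per-site calculation asked for above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `cpg_positions` and `methylation_levels` are hypothetical, every read is assumed to align to the start of the template (offset 0) on the same strand, and only the normal-cell reads that survive in this transcription are used. At a CpG, a C in the read is counted as methylated and a T as unmethylated.

```python
TEMPLATE = "GATCGTGCACGATCTCGGCAATTCGGGATGCCGGCTCGTCACCGGTCGCT"

# Normal-cell reads as they appear in the transcription above.
NORMAL_READS = [
    "GATTGTGTATGATTTTGGTAATTTGGG",
    "GATTGTGTATGATTTTGGTAATTTGGG",
    "GATTGTGTATGATTTTGGTAATTTGGG",
    "GATTGTGTATGATTTCGGTAATTCGGG",
    "GATTGTGTATGATTTCGGTAATTCGGG",
]


def cpg_positions(template):
    """Return the 0-based index of the C in every CpG of the template."""
    return [i for i in range(len(template) - 1) if template[i:i + 2] == "CG"]


def methylation_levels(template, reads, offset=0):
    """Percent methylation per CpG; None where no read covers the site.

    Assumption: all reads start at `offset` on the template strand.
    """
    levels = {}
    for pos in cpg_positions(template):
        # Collect the base each read reports at this CpG cytosine.
        calls = [r[pos - offset] for r in reads if 0 <= pos - offset < len(r)]
        methylated = sum(base == "C" for base in calls)   # C survived bisulfite
        covered = sum(base in "CT" for base in calls)     # ignore other calls
        levels[pos] = 100.0 * methylated / covered if covered else None
    return levels


levels = methylation_levels(TEMPLATE, NORMAL_READS)
for pos, pct in levels.items():
    print(pos, "no coverage" if pct is None else f"{pct:.0f}%")
```

The 27-base reads listed here reach only the first four of the eight CpG sites, so the remaining sites are reported as uncovered rather than as 0% methylated. The same function could be applied to the cancer reads once they are available.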

Problem 2 (pts) Two recent papers addressed conservation and divergence of methylation patterning in plants and animals. Whole-genome bisulfite sequencing was performed on 8 organisms. In this question, you will study figures from one of the papers and describe your conclusions.

Figure 1: phylogenetic relationship of the 8 species used in the study.
Figure 2: DNA methylation pattern on genes.
Figure 3: DNA methylation pattern on repeats.
Figure 4: DNA methylation pattern of exons and introns.

Questions:
(1) Describe the general DNA methylation pattern in and around genes and repeats.
(2) What kind of DNA methylation pattern around genes and repeats would you expect to observe in (a) the last common ancestor of animals; (b) the last common ancestor of plants; (c) the last common ancestor of animals and plants?
(3) Exon/intron boundaries are recognized in splicing, which happens post-transcriptionally on RNA; DNA methylation happens on the genome. Give a hypothesis that explains why there is a difference in methylation between exons and introns.

Fig. 1. Eight eukaryotic organisms used in this study. Tree topology is from NCBI Taxonomy. All tissues are wild type.

Fig. 2. Distribution of methylation along protein-coding genes. Upstream and downstream regions are the same length as the gene. Only data up to halfway to the next nonoverlapping gene are used in this analysis. Two vertical purple lines mark the gene boundaries.

Fig. 3. Distribution of methylation along repetitive DNA. Upstream and downstream regions are the same length as the repeat. Only data up to halfway to the next nonoverlapping repeat are used in this analysis. Two vertical purple lines mark the repeat boundaries.

Fig. 4. Comparison of methylation levels across exons and introns. Only internal exons (flanked by introns on both ends) that do not contain any 5'- or 3'-UTR bases are used. Upstream and downstream regions are the same length as the exon. Only data up to halfway to the next exon are used in this analysis. Two vertical purple lines mark the intron-exon and exon-intron boundaries.

Problem 4 (pts). DNA methylation has been implicated as an epigenetic component of mechanisms that stabilize cell-fate decisions. Greg Hannon and colleagues have characterized the methylomes of human female hematopoietic stem/progenitor cells (HSPCs) and mature cells from the myeloid and lymphoid lineages. For a brief review of haematopoiesis, see Fig. 1. The technology they used is whole-genome bisulfite shotgun sequencing. Fig. 2 displays methylation levels of the CD19 gene across 6 samples. These samples are:
1. ESCs: embryonic stem cells;
2. HSPCs: hematopoietic stem/progenitor cells, representing the earliest self-renewing, multipotent populations, from pooled female samples;
3. CD33+: also representing HSPCs, but from male umbilical cord blood;
4. B cells: derived, mature cell type from the lymphoid lineage;
5. Neutrophils: derived, mature cell type from the myeloid lineage;
6. Sperm cells.

Questions (2pts)
A. (2) Describe why we can use bisulfite sequencing to detect DNA methylation.
B. (2) Given the double-stranded genomic region ACmGTTCGCTTGAG (Cm denotes a methylated C), what does it look like in bisulfite reads that fully cover it?
C. (2) If you use bp long single-end reads and your goal is to generate enough reads to cover each strand of the genome 5 times on average, how many mappable reads do you need to generate?
D. (4) Describe what you can learn from Fig. 2. You need to at least address why CD19 is used as a B-lymphocyte-specific antigen marker.
E. (2) Suggest at least one alternative genome-wide DNA methylation assay that can give you a similar result and lead to the same conclusion as in (D). Discuss why, how, and the pros and cons.

Fig. 1.
Fig. 2.
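The conversion rule behind question B can be sketched in Python. This is a minimal illustration, not a worked answer key: the function names are hypothetical, the methylated cytosine is assumed to be the C of the first CpG (index 1 on the top strand), and the CpG is assumed to be symmetrically methylated, so the paired C on the bottom strand is treated as methylated too. Each strand is converted independently and shown 5' to 3', as a directional bisulfite read would report it.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bisulfite_convert(seq, methylated=frozenset()):
    """Turn every unmethylated C into T; indices in `methylated` stay C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


top = "ACGTTCGCTTGAG"
bottom = reverse_complement(top)  # bottom strand, written 5' to 3'

# Assumption: the methylated C sits at index 1 of the top strand.
top_methylated = {1}
# Assumption: symmetric CpG methylation. The G at top index 2 pairs with
# the methylated C on the bottom strand, at index len(top) - 1 - 2.
bottom_methylated = {len(top) - 1 - 2}

top_read = bisulfite_convert(top, top_methylated)
bottom_read = bisulfite_convert(bottom, bottom_methylated)

print("top strand read:   ", top_read)     # ACGTTTGTTTGAG
print("bottom strand read:", bottom_read)  # TTTAAGTGAACGT
```

After conversion the two strands are no longer complementary to each other, which is why bisulfite reads have to be mapped against separately converted versions of each strand.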

[Fig. 1: Human Hematopoie]
[Fig. 2, panel A: genome browser view of chr16 at CD19. Tracks: RefSeq Genes; UCSC CpG Islands; CGIs (HMM-based); methylation level and HMRs for ESCs, HSPCs, CD33+, B cells, neutrophils, and sperm; individual CpG sites.]
[Fig. 2, panel B: genome browser view of chr19 at CEBPA and LOC854, with the same tracks as panel A.]
[Fig. 2, panels C and D: HSPC; UCSC CGI.]