Applied Bioinformatics


1 Applied Bioinformatics: In silico and in clinico characterization of genetic variations. Assistant Professor, Department of Biomedical Informatics, Center for Human Genetics Research

2 Single Nucleotide Polymorphisms (SNPs): ATCAAAATTATGGAAGAA vs. ATCAAAATCATGGAAGAA. About every 300th nucleotide pair is polymorphic. Used as markers for genetic studies. 14,653,228 validated SNPs in dbSNP. William S. Bush
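
The two sequences on this slide differ at a single base. As a minimal sketch (the function name `polymorphic_sites` is an illustrative assumption, not something from the lecture), the polymorphic position can be located by comparing two already-aligned sequences of equal length:

```python
def polymorphic_sites(seq_a, seq_b):
    """Return (0-based position, base in seq_a, base in seq_b) for each mismatch.

    Assumes the two sequences are already aligned and have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# The two example sequences from the slide.
sites = polymorphic_sites("ATCAAAATTATGGAAGAA", "ATCAAAATCATGGAAGAA")
print(sites)  # [(8, 'T', 'C')]
```

The single mismatch (T versus C at 0-based position 8) is the SNP; real variant calling works on aligned reads with quality scores and is considerably more involved.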

3 dbSNP: Contains submitted data and computed content. Identified by RefSNP numbers (e.g., rs1234). Information on position, frequency, and relevant populations. NBK21101

4 What does this SNP/Allele do? Ensembl Consequence Type: Intergenic, Downstream, Within non-coding gene, Upstream, Intronic, Splice site, Intronic, Non-synonymous coding, Synonymous coding, 5' UTR, 3' UTR, Frameshift coding, Upstream regulatory region

5 SIFT and PolyPhen: SIFT; PolyPhen-2 (genetics.bwh.harvard.edu/pph2/). Use known deleterious mutations to predict the impact of similar mutations in similar proteins.

6 Expression QTLs: ~3.3 million SNPs + ~14,925 expression probes

7 U Chicago eQTL Browser (eqtl/): Searchable by SNP, gene, region, etc. Mostly cis-eQTLs (within the regulatory region). Multiple studies and tissue types.

8 Linkage Disequilibrium: When a mutation occurs near an existing SNP, the two become linked on the chromosome. Two SNPs that travel together through the population over successive generations are said to be in LD. Assuming recombination occurs at random points throughout the genome, the LD between two SNPs eventually fades.
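
LD between two SNPs is commonly summarized by the coefficient D and the squared correlation r². The following is a minimal sketch, assuming phased haplotype counts for two biallelic SNPs (alleles A/a and B/b) are already available; the function name `ld_stats` and the example counts are illustrative assumptions, not values from the lecture.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Compute D and r^2 from counts of the four two-SNP haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at SNP 2

    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two SNPs were independent.
    d = p_AB - p_A * p_B

    # r^2 normalizes D^2 by the allele-frequency variances; it is
    # undefined when either SNP is monomorphic.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Made-up counts: alleles always travel together (complete LD).
d_full, r2_full = ld_stats(50, 0, 0, 50)
# Made-up counts: all four haplotypes equally common (no LD).
d_none, r2_none = ld_stats(25, 25, 25, 25)
```

With the first set of counts r² is 1, and with the second it is 0; as recombination breaks up haplotypes over generations, r² drifts from the former toward the latter.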

9 The International HapMap Project: Catalog of SNPs in 11 human subpopulations. Contains roughly 4 million SNPs. Allows calculation of LD, facilitating GWAS.

10 Genome-wide Association Studies: Linkage disequilibrium. Compare cases with controls.
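
The case/control comparison at a single SNP can be sketched as an allelic chi-square test on a 2x2 table of allele counts. This is one common choice of test, not necessarily the one used in the lecture; the function name `allelic_chi_square` and the counts are illustrative assumptions, and the sketch assumes every row and column total is non-zero.

```python
import math


def allelic_chi_square(case_ref, case_alt, ctrl_ref, ctrl_alt):
    """Pearson chi-square test (1 degree of freedom) on allele counts."""
    table = [[case_ref, case_alt], [ctrl_ref, ctrl_alt]]
    n = case_ref + case_alt + ctrl_ref + ctrl_alt
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele and case/control status are independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts for one SNP: 60/40 in cases, 40/60 in controls.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
print(chi2, p_value)
```

For these made-up counts the statistic is 8.0 and the p-value is about 0.0047. A real GWAS repeats such a test across hundreds of thousands of SNPs, so a far stricter significance threshold is needed than for a single test.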

11 GWAS Associations: Thousands of confirmed associations for hundreds of human phenotypes.

12 Genetic Association Database: A catalog of mostly non-GWAS studies. Contains both positive and negative results. A variety of traits and disease classes are represented.

13 OMIM (Online Mendelian Inheritance in Man): Contains narratives and extensive references for the genetic basis of nearly every disease. Manually curated by the community (think Wikipedia). Also contains narratives by gene.

14 Human Gene Mutation Database: From the makers of Transfac, another subscription-based database. Public and private versions. Has extensive lists of references for known mutations and their categories.

15 In Clinico Characterization: Left-over blood samples from clinical care. DNA linked to deidentified electronic medical records.

16 Phenome-Wide Association Studies: Associate a single variant to many traits. Popular in electronic medical records.

17 Genetic PREDICTion: Uses specific genetic variants to predict a person's drug response. Examples: clopidogrel response, warfarin dosing.

18 Direct-to-Consumer Genetics

19 Sequencing: GWAS is quickly becoming boring. Sequencing is the new fun thing to do. Whole-exon designs, whole-genome sequencing, RNA-seq.

20 1000 Genomes Project: Goal is to sequence the whole genomes of ~2500 people. Currently have sequence for 180 samples. Exons will be sequenced with better quality. Data are being loaded into a special Ensembl-style browser.