Using Big Data technologies to uncover genetic causes of Amyotrophic lateral sclerosis
1 Using Big Data technologies to uncover genetic causes of Amyotrophic lateral sclerosis
Dr Natalie Twine, Transformational Bioinformatics
11 October 2017
HEALTH & BIOSECURITY
2 Genomics will outpace other BigData disciplines
Stephens et al., PLOS Biology 2015
[Figure: Astronomy, Twitter, YouTube, Genomics]
3 Population-scale genomic data analysis requires BigData solutions
Property | Desktop compute | High-performance compute cluster | Hadoop/Spark compute cluster
Focus | small data | Compute-intensive | Data-intensive
Fault tolerant | No | No | Yes
Node-bound | Yes | Yes | No
Parallelization | 10 CPU | 100 CPU | 1000 CPU
Parallelization procedure | bespoke | bespoke | standardized
CSIRO solution: Hadoop/Spark compute cluster
4 Amyotrophic lateral sclerosis (ALS)
- ALS is a devastating motor neurone disease
- Leads to death within 3 years
- Affects more than 200,000 people worldwide
- Causes largely unknown; genetic component
- Project MinE: sequencing 15,000 ALS genomes worldwide
5 What is our hypothesis?
- Sporadic cases are potentially related but separated over generations
- ALS is reported to be 5% familial and 95% sporadic
- Familial component is potentially higher than 5%
- Australia is a small population and disease is late onset
What are our aims?
- Uncover hidden patient relationships to increase detection power
- Identify novel disease-causing variants
Datasets available:
- Exomes (Familial, n=137)
- WGS (Sporadic, n=800)
- Project MinE WGS (Sporadic, n=15,000)
[Figure: GOI with SNP A, SNP B, SNP C]
6 Existing methods for measuring relatedness
- PLINK (Chang CC et al., GigaScience, 2015)
- KING (Manichaikul A et al., Bioinformatics, 2010)
- SNPduo, ERSA, GRAB, XIBD (etc.)
Limitations of these tools
- Designed to identify and remove relatives as part of GWAS workflow
- Identifying more distant relatives is challenging
- Tools effective at distant relationship detection are SLOW
We want to expand existing family structures
- Identify more distant relationships with confidence
7 Relatedness between ALS patients using KING
- KING identifies close relationships
- 172 Familial and Sporadic ALS Exomes
[Figure: each blue dot represents a relationship between a pair of ALS patients.]
8 Relatedness between ALS patients using KING
n=172 (137 Familial and 35 Sporadic)
Degree of relationship | Number | True positives | False positives | Unknown
Duplicates | 6 | 6 (100%) | 0 | 0
1st degree | [missing] | [missing] (100%) | 0 | 0
2nd degree | [missing] | [missing] (100%) | 0 | [missing]
3rd degree | [missing] | [missing] (44%) | 9 (33%) | 6 (22%)
4th degree | 1310 | n/a | n/a | n/a
5th degree | 7852 | n/a | n/a | n/a
KING identified 8 novel relationships at 3rd degree. 3 we have ruled out as false positives. 5 are potentially REAL, as they can't be classified as FP (no mutation status known).
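As a rough illustration of how a kinship coefficient maps onto the degrees of relationship tabulated above, the sketch below uses the cut-offs commonly quoted for KING (0.354, 0.177, 0.0884, 0.0442), which follow the pattern 2^-(degree + 1.5). Extending the same formula to 4th and 5th degree is an assumption made here for illustration, and the function names are hypothetical; this is not the analysis code behind the slides.

```python
def kinship_lower_bound(degree: int) -> float:
    """Lower kinship bound for a degree of relationship.

    degree 0 = duplicate / monozygotic twin, 1 = 1st degree, and so on.
    Gives 0.354, 0.177, 0.0884, 0.0442 for degrees 0..3.
    Degrees 4 and 5 reuse the same formula (an extrapolation).
    """
    return 2.0 ** -(degree + 1.5)


def degree_from_kinship(phi: float, max_degree: int = 5):
    """Return the closest degree whose lower bound phi exceeds, else None."""
    for degree in range(max_degree + 1):
        if phi > kinship_lower_bound(degree):
            return degree
    return None  # treated as unrelated at this resolution


if __name__ == "__main__":
    # Expected kinship for duplicates, parent-child, half-sibs, first cousins
    for phi in (0.5, 0.25, 0.125, 0.0625, 0.0):
        print(phi, degree_from_kinship(phi))
```

The bounds for 4th and 5th degree sit very close together (about 0.022 and 0.011), which is one way to see why distant relationships are hard to call with confidence.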
9 How can we improve on this result?
- True positive rate of 44% needs to improve
- Whole Genome Sequencing >> Exome (SNP density)
- WGS (n=800) cohort has 42 million variants
- High density data -> more informative
- BUT: existing tools struggle/fail with Big Data volumes
We are implementing relatedness testing in VariantSpark to identify novel relationships in:
- 800 WGS Sporadic and Familial ALS
- 15,000 WGS samples (Project MinE)
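To give a feel for the data volumes involved, the following back-of-the-envelope sketch counts the sample pairs that an all-against-all relatedness test has to consider, and the size of the genotype matrix for the 800-sample WGS cohort. The cohort sizes and the 42 million variant count come from the slides; the function names are illustrative only.

```python
def n_pairs(n_samples: int) -> int:
    """Number of unordered sample pairs in an all-against-all comparison."""
    return n_samples * (n_samples - 1) // 2


def n_genotypes(n_samples: int, n_variants: int) -> int:
    """Number of genotype calls in a samples x variants matrix."""
    return n_samples * n_variants


if __name__ == "__main__":
    for cohort, n in (("Exomes", 172), ("WGS", 800), ("Project MinE", 15_000)):
        print(f"{cohort}: {n} samples, {n_pairs(n):,} pairs")
    print(f"WGS genotype calls: {n_genotypes(800, 42_000_000):,}")
```

The pair count grows quadratically with cohort size, so moving from 800 to 15,000 samples multiplies the number of comparisons by roughly 350.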
10 Bringing BigLearning to genomics applications
BMC Genomics 2015, 16:1052 PMID: (IF=4)
- VariantSpark learns from 3000 individuals and 80 million mutations in under 30 minutes
- Association testing
- Clustering
- Classification
[Figure: Speed and Accuracy]
11 Using VariantSpark to identify relatedness
- Data-driven rather than model-driven approach
- VariantSpark can handle 80 million variants x 3000 individuals
What is the genetic distance between samples? (allele sharing)
- Euclidean distance
- Identity by descent (IBD) (as in PLINK)
- Sliding window for IBD segments (as in ERSA)
- Include data from 1000 Genomes as controls (family and ancestry known)
Approaches currently being tested for feasibility
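The distance measures listed above can be made concrete with a small sketch. It assumes genotypes are encoded as alternate-allele counts (0, 1 or 2) per variant, and computes the Euclidean distance plus a simple allele-sharing proportion. The allele-sharing function measures identity by state, which is only a crude stand-in for the IBD estimates produced by PLINK or ERSA; neither function is taken from VariantSpark.

```python
import math


def euclidean_distance(g1, g2):
    """Euclidean distance between two genotype vectors coded 0/1/2."""
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))


def allele_sharing(g1, g2):
    """Proportion of alleles shared identical by state (0.0 to 1.0).

    At each variant the pair shares 2 - |a - b| of 2 alleles.
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and equal length")
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


if __name__ == "__main__":
    sample_a = [0, 1, 2, 1, 0, 2]  # made-up genotypes for illustration
    sample_b = [0, 1, 1, 1, 1, 2]
    print(euclidean_distance(sample_a, sample_b))
    print(allele_sharing(sample_a, sample_b))
```

Related samples are expected to show a smaller Euclidean distance and a higher sharing proportion than unrelated ones, which is why known families from 1000 Genomes are useful as controls.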
12 Testing using different distance measures
[Figure: distance and accuracy plotted against degree of relationship for Euclidean distance and IBD; Exomes (n=137 Familial ALS)]
- Euclidean distance performs well until 4th degree
- PLINK (IBD) performs better than other distances
13 Effective methods are compute intensive
Ramstetter et al., Genetics 2017
- IBD segment based methods are most accurate for more distant relatives
- BUT they are also the most compute intensive (a.k.a. SLOW)
14 Next steps in tool development
- Simulate a large pedigree using whole genome data
- Calculate genetic distance between each pair of samples
- Measure performance of different distance measures: sensitivity and specificity (AUC)
- Generate relationship degree metrics from simulated cohort
- Implement these methods in VariantSpark: speed and scalability
- Proof of principle cohort: Familial ALS WGS (n=89)
- Identify novel relationships in Sporadic ALS WGS (n=800)
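The performance measurement step above could be sketched as follows, assuming a simulated pedigree gives a true related/unrelated label for every sample pair and that a smaller distance is taken to mean a closer relationship. The function names and the toy numbers are hypothetical; the AUC is computed with the pairwise (Mann-Whitney) definition rather than any particular library.

```python
def sensitivity_specificity(distances, is_related, threshold):
    """Call a pair related when distance <= threshold.

    Assumes at least one truly related and one truly unrelated pair.
    """
    tp = fn = tn = fp = 0
    for d, related in zip(distances, is_related):
        called = d <= threshold
        if related and called:
            tp += 1
        elif related:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    distances = [1.0, 2.0, 5.0, 6.0]          # toy values, not real data
    is_related = [True, True, False, False]   # truth from the simulation
    print(sensitivity_specificity(distances, is_related, threshold=3.0))
    # Negate distances so that related pairs receive the higher score
    pos = [-d for d, r in zip(distances, is_related) if r]
    neg = [-d for d, r in zip(distances, is_related) if not r]
    print(auc(pos, neg))
```

Running the same evaluation separately for each simulated degree of relationship would show where a given distance measure stops separating relatives from unrelated pairs.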
15 VariantSpark application: genetic association
- Bone Mineral Density (BMD) as the phenotype; 1,936 individuals with 7.2 million variants (imputed from array)
- Joint-loci analysis (machine learning: random forest)
- Replicate known BMD genes identified by traditional GWAS (single-locus regression)
- Amplify signal over traditional methods so smaller cohorts give robust insights
- Random forests identify interactions of 2 or more loci
We will use this methodology to identify novel & modulating ALS variants
16 In summary: BigLearning to understand ALS
- Identify related individuals
- Novel disease-causing variants
- Preventative measures
- Personalised treatment
17 Transformational Bioinformatics Team
- Denis Bauer, Oscar Luo, Laurence Wilson, Aidan O'Brien, Natalie Twine, Rob Dunne, Piotr Szul, Kaitao Lai
Collaborators
- Ian Blair, Kelly Williams, Emily McCann, Jenn Fifita, Adrian White, Mia Champion
News
Software