MRC CAiTE Centre Workshop

Why this workshop is needed now

The workshop proposed here is designed to bring together some of the leading figures in genetic research to discuss novel issues resulting from the most recent advances in population-level genetic data collection and analysis. The workshop will focus mainly on methodological and ethical issues relating to the collection of whole genome sequence data and the likely role of this type of data in the future of public health research.

Technological advances now make it possible to undertake population-based whole genome sequencing. This has brought a new set of ethical, analytical and practical issues into the academic and public eye, and these need to be addressed. They are relevant to population health scientists because whole genome sequence data have the potential to advance understanding of the determinants of health and disease, and because sequencing is increasingly seen as the next logical step after genome-wide SNP analyses for assessing genomic variation in large cohort studies.

What is genomic sequencing?

The collection of sequence-level genetic data has expanded rapidly in recent years [1]. Michael Metzker's review of sequencing technologies in Nature Reviews Genetics [2] charts the development of technologies that are radically changing the scope of sequencing as an investigative tool. The ability to collect representations of the whole genome sequence of participants in large epidemiological cohorts and case-control studies has important implications for population health research in terms of:

(i) Understanding how genetic variation influences health and disease.
(ii) Clarifying important gene-environment interactions.
(iii) Identifying genetic variation that can be used as valid instrumental variables for determining causal effects of non-genetic modifiable risk factors [3,4].
(iv) Identifying genetic contributions to off-target drug side-effects [5].
(v) Exploring the potential of genetic variation to add to risk prediction models for common complex diseases.

In comparison with the genome-wide approaches that have been used widely over the last five or more years, next generation sequencing technologies allow near-complete sequence capture and offer the potential to identify rare variants with moderate effects, as well as variation not currently covered by genome-wide SNP chips. Furthermore, the profiling of structural variation, and the application of similar technologies to the analysis of genome-wide epigenetic marks (again across large numbers of individuals), will offer new insight into the importance of these mechanisms for the risk of common complex diseases. At the same time, this approach poses major challenges in terms of how to handle and analyse the data appropriately and how to develop the best ethical guidelines.

Aims and objectives of the workshop

The aim of this workshop is to discuss, with the most qualified experts, the impact and challenges of collecting whole genome sequence data on a population scale. This discussion is urgently needed: whilst technology has been leading the way, the ethical, practical and analytical issues surrounding these new data remain largely unaddressed.

The objectives:

a) Next steps for genetic analysis: From genome-wide SNP based association analyses to affordable next generation sequencing. What are the implications of this step in terms of data collection, analysis and interpretation? Theme-based discussion will be facilitated by presentations and by the experience of those working with population-scale sequence data. The aim here is to understand what whole genome sequence data are and to consider how they differ from existing resources.

b) Genetic epidemiology: Discussion will consider the main advantages of resequencing for advancing understanding of the causes of health and disease at a population level. This should include consideration of the implications of total genetic assessment for participants in a study (as opposed to candidate gene approaches or genome-wide chip based SNP genotyping).

c) Next generation genome-wide resequencing: A technological update, including consideration of the following data issues.
Sequence production: The performance of the new platforms is not sufficiently well understood to know which approaches are most appropriate at the population level.
Data quality: The accuracy of the data needs to be measured. Quality scores need to be related to the accuracy of sequencing, and the definition of good sequence needs to be established.
Sequence capture: Methods to capture regions of the genome for regional approaches to sequencing also require more development. A few methods are being worked on now and they should be evaluated. In addition, the number and frequency spectrum of rare variants are unknown. There are about 3-5 common (MAF > 5%) coding SNPs per gene, and there may be ten times as many sites with rare variants. Each person may have about 3000 rare coding variants. What proportion of rare variants are unique to individuals, and what proportion are at low frequency but polymorphic in the population?
Phasing and imputing genotypes: What data accuracy and depth of coverage are needed to phase variants accurately? How well can rare alleles be imputed and placed on the correct haplotypes?
Data technicalities: How much data will be collected, and how will they be stored, formatted and processed?

d) Ethical issues: Issues surrounding the availability and interpretation of population-level genome-wide sequence data. The collection of whole genome sequence data for variants with minor allele frequencies that can be less than 1% has implications for the likely clinical utility of these data. A number of possible solutions have been proposed to deal with issues arising from the collection of sensitive genetic data, but as yet there has been no detailed discussion or plan for exploring which of these would be most acceptable and ethical. Issues that will be covered at the workshop include the following.
Clinical utility: Exactly what is the value of the data collected?
Consent: Whether existing projects require alterations of consent and, with this, consideration of data suppression, zero feedback, data freezes and consent to inform.
Feedback: If there are to be systems of participant feedback, how should these be formed and what services are required?
Participant information: How should participants be informed about the projects in which they are being enrolled and about the implications or benefits that their data and contribution might bring?

Intended workshop participants

The intended participants of this workshop have been chosen to provide the expertise necessary to address its key objectives. Participants from the UK contributing centres (the MRC CAiTE Centre (Bristol), TWINS UK (and honorary staff), The Wellcome Trust Sanger Centre (Cambridge), The Wellcome Trust Centre for Human Genetics (Oxford), Department of Statistics, Oxford University, The Ethox Centre (Oxford), King's College (London) and Clinical Genetics, University Hospitals, Bristol) are expert analysts and researchers in this field. They are currently at the leading edge of applying these new technologies to UK-based population samples and of considering their ethical implications.

Other guests from leading institutions will be invited to lead on specific issues relating to their areas of expertise. Junior members of the host and contributing institutions will also be invited, since it is clear that the next generation of researchers needs to understand the implications of these new technologies. There will be a limited open invitation to individuals from each of the participating institutions in order to foster an atmosphere of open access and open participation. These individuals will also be given the opportunity to give feedback on the workshop (with a view to future meetings) and to contribute to the workshop report.

Intended outputs

The workshop will also aim to produce a written report/position paper that summarises the major advances and discussion points raised at the workshop and provides a set of initial consensus guidelines for undertaking this type of investigation. This report will be submitted to a peer-reviewed journal for publication.

References:

1. Schuster, S.C. Next-generation sequencing transforms today's biology. Nature Methods 5, 16-18 (2008).
2. Metzker, M.L. Sequencing technologies - the next generation. Nature Reviews Genetics 11 (2010).
3. Davey Smith, G. et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Medicine 4, e352 (2007).
4. Davey Smith, G. & Ebrahim, S. Mendelian Randomisation: prospects, potentials and limitations. International Journal of Epidemiology 33 (2004).
5. Sofat, R. et al. Separating the Mechanism-Based and Off-Target Actions of Cholesteryl Ester Transfer Protein Inhibitors With CETP Gene Polymorphisms. Circulation AOP (2010).

TIMETABLE OF EVENTS (confirmed participants):

March 16th

Coffee & Introductions
Presentation (George Davey Smith & Nic Timpson): Welcome & epidemiology in the world of next generation genetics
Presentation (Jeff Barrett): What is next generation genetics and where will it take us?
Discussion and question and answer session for previous presentations. Chair: David Evans
Lunch
Seminar (Gil McVean): Feeding experiences from the 1000 genomes project into sequencing in populations
Small group work: Focused on population based sequence collection. Themed group chairs: Catherine Heeney (ethics), Fernando Rivadeneira (phenotypes vs case/control), Carl Anderson & Andrew Morris (methods)
Coffee, with discussion subject: Dennis Mook (Sequencing the consortia)
Large group discussion (feedback from small groups). Chair: Ele Zeggini
Pub/Dinner

March 17th

Presentation (Paul de Bakker): Advances in complex disease genetics
Presentation (Catherine Heeney): Ethical issues for next generation genetics
Discussion and question and answer session for previous presentations. Chair: Tim Spector
Coffee
Seminar (Matt Hurles): Promise and complicating issues surrounding genome-wide sequence data in large population collections
Lunch
Small group work: Focused on the pragmatics of large sequencing projects, utility and ethics. Themed group chairs: Nicole Soranzo (Harmonisation of studies), Debbie Lawlor (epidemiological considerations), Andrew Hattersley & Ian Day (moves towards function)
Coffee, with discussion subject: UoB Research & Enterprise Development (Funding the effort)
Presentation (Paul Franks): Study designs exploiting sequence data and its new findings
Large group discussion & close (feedback from small groups and round up). Chair: Cecilia Lindgren