BA, BSc, and MSc Degree Examinations

Examination Candidate Number:
Desk Number:

BA, BSc, and MSc Degree Examinations

Department: BIOLOGY
Title of Exam: Human genetics
Time Allowed: 2 hours

Marking Scheme:
Total marks available for this paper: 100
Section A: Short Answer / Problem / Experimental Design questions (50 marks)
Section B: Essay question (marked out of 100, weighted 50 marks)
The marks available for each question are indicated on the paper

Instructions:
Section A: Answer all questions in the spaces provided on the examination paper
Section B: Answer either question A or question B. Write your answer on the separate paper provided and attach it to the back of the question paper using the treasury tag provided.

Materials supplied: CALCULATOR

For marker use only:
For office use only:
Module total as %

DO NOT WRITE ON THIS BOOKLET BEFORE THE EXAM BEGINS
DO NOT TURN OVER THIS PAGE UNTIL INSTRUCTED TO DO SO BY AN INVIGILATOR

page 1 of 15

SECTION A: Short Answer / Problem / Experimental Design questions
Answer all questions in the spaces provided
Mark total for this section: 50

Module LOs
1 Describe the human genome and the processes that affect it.
2 Explain how comparative genomics has led to our current picture of the origin of modern humans.
3 Discuss examples of genes that have been under selection in human history.
4 Calculate genetic risk by using the genotype probability in human pedigrees in simple and complex scenarios.
5 Describe, with examples, the basis of single gene and complex disorders.
6 Explain how genes associated with Mendelian or multifactorial disorders can be identified.

1. a) The population size of humans is more than four orders of magnitude greater than that of chimpanzees and gorillas. However, the genomes of chimpanzees and gorillas contain much greater genetic diversity than human genomes. How can you explain this unexpected pattern? (3 marks)

Large populations are expected to harbour greater genetic diversity, as the loss of diversity due to genetic drift is slower than in a smaller population (1 mark). Although humans currently have a very large population size, historically humans have had small population sizes, which would have lost diversity due to drift (1 mark). Diversity would have been lost during population bottlenecks, such as when we migrated out of Africa (1 mark).

A study genotyped humans from 10 east Asian populations at 50,000 autosomal SNPs. In the figure below, a) shows the location of the genotyped populations. Genetic diversity, measured as heterozygosity, was calculated for each of the populations and is plotted against latitude in b).

b) What processes are likely to be responsible for the latitudinal gradient in genetic diversity across the 10 east Asian populations? (3 marks)

Modern human populations may have colonised east Asia from south to north in a stepping-stone fashion (1 mark). Southern populations would have acted as source populations, with northern populations being founded through a series of bottlenecks (1 mark). Tropical populations may be larger than those at higher latitudes, therefore loss of genetic diversity due to genetic drift will be greater at higher latitudes (1 mark).

LO1 Describe the human genome and the processes that affect it.
LO2 Explain how comparative genomics has led to our current picture of the origin of modern humans.

2. a) One sign of recent positive selection acting on a section of a genome is extended tracts of homozygosity flanking the genetic variant under selection. Explain why these extended tracts of homozygosity develop as a result of recent selection. (5 marks)

The genetic variant under selection is genetically linked to other variants adjacent to it (1 mark). This is because recombination events are unlikely to break up associations between variants that are physically close together on a chromosome (1 mark). Strong selection will rapidly increase the frequency of the focal variant, but flanking variants that are linked to it will also increase in frequency (1 mark). This selective sweep will result in the removal of polymorphisms flanking the focal variant and the formation of tracts of homozygosity (1 mark). Over time the tracts of homozygosity will be eroded as a result of recombination and new mutations (1 mark).

The figure below shows tracts of homozygosity flanking the lactase-persistence associated SNPs in African (top panel) and Eurasian (bottom panel) populations. Each horizontal line depicts a single individual: African lactase-persistent individuals (red), African non-persistent individuals (blue), Eurasian lactase-persistent individuals (green), Eurasian non-persistent individuals (orange). Note that some homozygosity tracts are too short to be visible as plotted.

b) Describe any differences in the homozygosity tracts between African and Eurasian lactase-persistent individuals. (2 marks)

On average, the tracts of homozygosity in African lactase-persistent individuals are longer than those in Eurasian individuals (1 mark). The African tracts are about twice as long (1 mark).

c) From these differences, what can you infer about the evolution of lactase persistence in these populations? (3 marks)

Over time, recombination and new mutations will reduce the length of these homozygosity tracts. The above differences indicate that selection acting on the African mutation happened more recently than selection on the Eurasian mutation (2 marks). Alternatively, selection acting on the African lactase-persistence allele could have been stronger (1 mark).

LO3 Discuss examples of genes that have been under selection in human history.

3. The figure below shows the results of a genome-wide association study in which 297,086 polymorphic SNPs were genotyped in samples from 1522 case subjects with rheumatoid arthritis and 1850 control subjects. The blue horizontal line indicated with the arrow marks the threshold for genome-wide significance after Bonferroni correction (P < 5 x 10^-8). SNPs at three loci (PTPN22, MHC and TRAF1-C5) appear to be associated with the disease.

a) What other lines of evidence are required to ascertain the validity of these SNPs associated with the disease? (4 marks)

Population structure analysis to ensure lack of stratification (1 mark). Have these associations been replicated in independent population samples (1 mark)? Are there plausible functions associated with these gene regions (1 mark)? Have previously reported SNPs associated with the disease been detected (1 mark)?

b) Any associations with P > 5 x 10^-8 have been ignored. Discuss whether or not this is appropriate. (4 marks)

Owing to multiple testing of thousands of SNPs, a Bonferroni correction can be applied; in this case it would result in a P-value threshold of 0.05/297,086 ≈ 1.7 x 10^-7 (1 mark). Therefore the applied threshold is too stringent (1 mark). In any case, the Bonferroni correction assumes all the SNPs are independent, but we know that adjacent SNPs will be in linkage disequilibrium. Therefore it is likely that the applied threshold is too conservative, resulting in false negatives (1 mark). For example, SNP associations on chromosome 20 with P < 5 x 10^-5 may be false negatives (1 mark).

4. In a recent study, linkage mapping of a non-syndromic mental retardation mutation was attempted in a single family (MR-D). The LOD score table below was obtained using two markers around candidate gene TUC3. Haplotype analysis with additional markers from the region was then carried out and the results are shown on the pedigree.

a) What is the pattern of inheritance in family MR-D? (1 mark)

Autosomal recessive

b) Was haplotype analysis with further markers from the same region an appropriate strategy to follow? Briefly justify your answer. (1 mark)

No. The LOD score does not show significant linkage to that region at any recombination fraction.

c) If a mutation in TUC3 were responsible for this condition, would you expect the haplotypes obtained? Briefly explain. (2 marks)

No. This recessive trait emerged from a consanguineous marriage (1), so we would expect the haplotypes surrounding the mutation to be identical by descent, and therefore homozygous in affected individuals (1).

d) What experimental approach would be most feasible to apply in this case? Briefly explain. (2 marks)

Autozygosity mapping using genome-wide SNP analysis (1). This can be combined with whole exome NGS to confirm the presence of a homozygous mutation consistent with the status of each individual in the pedigree (1).

LO: Identify the pattern of inheritance in a human pedigree
LO: Articulate the process of positional cloning
LO: Use haplotype analysis to delimit the physical chromosomal segment containing the mutation

5. Pedigree A shows segregation of a dynamic repeat expansion causing a neurodegenerative disease in a Chinese family (American Journal of Medical Genetics 141B: (2006)). B shows the results of a PCR amplification with primers flanking the expansion for individuals from generations II and III. Individuals from generation II range in age from 50 to 65 years. All individuals from generation III are less than 30 years old.

a) What is the most likely pattern of inheritance? Explain. (2 marks)

Late-onset autosomal dominant (1). III.3 and III.4 carry long repeats but do not yet show the phenotype, as they are younger than 35 (1).

b) Why is III.5 the only individual from generation III showing symptoms? (2 marks)

The allele has expanded (1) and is showing anticipation (1).

c) What would be the most likely outcome (affected or unaffected) later in life for: (1 mark)

III.1: unaffected
III.4: affected

LO: Identify the pattern of inheritance in a human pedigree
LO: Distinguish a pathogenic variant from a polymorphic change

6. Myotonic dystrophy (DM) is caused by an unstable CTG trinucleotide repeat in the 3' untranslated region of the dystrophia myotonica protein kinase (DMPK) gene, which is pathogenic when above 35 repeats.

Segregation of this disease in a family is shown (Amiel et al., J Med Gen 2001). Each haplotype shows in brackets the length of the repeat for both alleles. (Fetus III.3 was genotyped using chorionic villus sampling.)

a) What is the pattern of inheritance? (1 mark)

Autosomal dominant

b) Which haplotype bears the causative alleles? (1 mark)

c) Would you anticipate individual III.3 to show the disease later in life? Explain. (3 marks)

III.3 has inherited the haplotype linked with the mutation (1). However, the allele contracted in the gamete from II.2 (1) and now falls within the normal range (fewer than 35 repeats), so the individual will not show the disease (1).

LO: Identify the pattern of inheritance in a human pedigree
LO: Distinguish a pathogenic variant from a polymorphic change
LO: Use haplotype analysis to delimit the physical chromosomal segment containing the mutation

Module Code: BIO00042H

7. The pedigree below shows a family in which severe X-linked Haemophilia A is segregating. Linkage analysis was used to establish the carrier status of II.3 with an RFLP located 15 cM from the mutation. The genotype results are shown on the pedigree. What is the risk of II.3 being a carrier? (10 marks)

Probability                    II.3 is a carrier    II.3 is not a carrier
Prior                          1/2                  1/2
Conditional: 3.7 kb allele
Conditional: normal son
Joint
Posterior

LO: Apply Bayesian calculations to predict genetic risk in a pedigree when linkage information is provided
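The prior, conditional, joint and posterior steps asked for in question 7 can be sketched in code. This is a minimal illustration and not the model answer: the function name `bayes_posterior` is hypothetical, and the conditional probabilities in the example describe an invented scenario, because the real values must be read from the marker genotypes on the pedigree. The sketch also assumes that a map distance of 15 cM corresponds to a recombination fraction of roughly 0.15.

```python
from fractions import Fraction


def bayes_posterior(prior_carrier, cond_carrier, cond_noncarrier):
    """Return (joint_carrier, joint_noncarrier, posterior_carrier).

    prior_carrier   -- prior probability that the woman is a carrier
    cond_carrier    -- conditional probabilities of each observation if she is a carrier
    cond_noncarrier -- the same observations if she is not a carrier
    """
    joint_c = prior_carrier
    for p in cond_carrier:
        joint_c *= p
    joint_n = 1 - prior_carrier
    for p in cond_noncarrier:
        joint_n *= p
    # Posterior = joint for the hypothesis divided by the sum of both joints
    posterior = joint_c / (joint_c + joint_n)
    return joint_c, joint_n, posterior


# Assumption: 15 cM is treated as a recombination fraction of 0.15
theta = Fraction(15, 100)

# Hypothetical scenario (NOT taken from the exam pedigree):
#  - the woman inherited the marker allele that travels with the normal
#    allele in her mother, so she is a carrier only after a recombination
#  - she has one unaffected son (probability 1/2 if she is a carrier)
joint_c, joint_n, post = bayes_posterior(
    Fraction(1, 2),
    [theta, Fraction(1, 2)],
    [1 - theta, Fraction(1)],
)
print(joint_c, joint_n, post)  # 3/80 17/40 3/37
```

Exact fractions are used so that each row of the table can be checked by hand; with the pedigree's actual marker phase and offspring, only the two lists of conditional probabilities would change.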

SECTION B: Essay question
Answer one question on the separate paper provided
Remember to write your candidate number at the top of the page and indicate whether you have answered question A or B
Mark total for this section: 50

EITHER

A) Discuss what the composition of the human genome can tell us about our evolutionary history.

Repeat elements comprise about two-thirds of the human genome. Most of these transposable element families are ancient and are, for example, shared with rodents. Most transposon families are no longer active, except for SINE Alus, which show some activity since humans split from chimpanzees. Such activity is generally thought to be deleterious, but in some cases there is evidence that transposon activity may be beneficial.

Although large sequencing projects have uncovered millions of genetic variants (mainly SNPs, but also small and large indels and CNVs), human genomes harbour low levels of genetic polymorphism relative to the other great apes. This low diversity is a result of our historically small population size and migration history. African populations are the most diverse. Non-Africans are less diverse, as they are derived from a relatively small number of migrants who moved out of Africa and colonised the rest of the world.

Admixture between archaic hominid lineages (Neanderthals and Denisovans) and modern humans has occurred. This admixture has primarily affected non-African genomes. Both adaptive alleles (MHC, EPAS1) and maladaptive alleles have entered the human gene pool. The latter is evident in the relative paucity of Neanderthal alleles in gene-rich regions, reduced admixture on the X chromosome and purging of Neanderthal alleles from human genomes over the past 35,000 years.

Comparisons between human and non-human genomes, as well as comparisons among human populations, have revealed evidence of recent positive and balancing selection acting on a variety of genes affecting adaptations to climate, disease and diet. Although most genes are under purifying selection, some estimates suggest that perhaps up to 10% of the human genome is linked to elements under positive selection.

1 Describe the human genome and the processes that affect it.
2 Explain how comparative genomics has led to our current picture of the origin of modern humans.
3 Discuss examples of genes that have been under selection in human history.

OR

B) Discuss the benefits and challenges of discovering rare disease-causing mutations in the era of next generation sequencing.

Benefits (a good essay will be expected to explain the rate of discovery and the simplicity of the experimental pipeline):

Many rare genetic diseases have been refractory to traditional gene discovery approaches for several reasons: locus heterogeneity, the availability of only a small number of patients or families to study, and substantially reduced reproductive fitness as a result of such diseases. The advent of next-generation sequencing (NGS) has changed the landscape of rare-genetic-disease research, with causative genes being identified at an accelerating rate and promising to complete the full catalogue of causative genes of rare disease by [...].

Standard pipelines are now in place to process the sequencing data generated by Whole Exome Sequencing (WES) or Whole Genome Sequencing (WGS), including mapping, variant calling and annotation. The sequence data can be compared with various public databases, including the single-nucleotide polymorphism (SNP) database (dbSNP), the 1000 Genomes Project, the Exome Variant Server and the International HapMap Project, as well as with internal control databases. These comparisons reduce the ~20,000 variants that are typically identified by WES (when compared with a reference genome) to <500 rare variants (defined as occurring at a frequency below 1% in controls) per exome. Initially, both inherited variants and de novo variants are catalogued; the subsequent validation of a variant as definitively disease causing is frequently the rate-limiting step.
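The frequency-based filtering step described above can be sketched as follows. This is an illustration only: the record layout, the field names (`id`, `gene`, `control_freqs`) and the helper `filter_rare_variants` are hypothetical, the example variants are invented, and the 1% cut-off follows the definition of a rare variant quoted above. Real pipelines operate on annotated variant call files rather than Python dictionaries.

```python
from typing import Dict, List

RARE_THRESHOLD = 0.01  # assumption: "rare" means below 1% in control databases


def filter_rare_variants(variants: List[Dict],
                         threshold: float = RARE_THRESHOLD) -> List[Dict]:
    """Keep variants whose highest control-database frequency is below threshold.

    A variant that is absent from every database (empty frequency table)
    is treated as having frequency 0 and is therefore kept.
    """
    rare = []
    for variant in variants:
        freqs = variant.get("control_freqs", {}).values()
        if max(freqs, default=0.0) < threshold:
            rare.append(variant)
    return rare


# Invented example records (hypothetical genes and frequencies)
variants = [
    {"id": "var1", "gene": "GENE_A", "control_freqs": {"dbSNP": 0.12, "1000G": 0.10}},
    {"id": "var2", "gene": "GENE_B", "control_freqs": {"1000G": 0.002}},
    {"id": "var3", "gene": "GENE_C", "control_freqs": {}},
]

rare = filter_rare_variants(variants)
print([v["id"] for v in rare])  # ['var2', 'var3']
```

Taking the maximum frequency across databases is a conservative design choice: a variant that is common in any one control set is discarded, which is what shrinks the candidate list before the slower validation step.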
Challenges (any of the issues below could be mentioned, the more the better):

Whilst NGS has been successful in recessive conditions, NGS-based gene identification for familial autosomal dominant disorders has been challenging; most successes have been associated with a defined disease interval identified by linkage analyses of a large pedigree (or pedigrees). Moreover, due to the short length of reads, NGS technology is, at least for the moment, unable to detect dynamic trinucleotide expansions. However, third generation sequencing technology may overcome this limitation.

Non-coding causal mutations might prove especially refractory to analysis, as their identification will require WGS, which, compared with WES, will identify many more sequence polymorphisms that have no connection to the disease. Because there is currently no rigorous means of filtering these variants (as there is for coding mutations) and no clear means of delineating biological causation, pathogenic validation will be a rate-limiting step.

Exome sequences account for just 1% of the genome, so it is only logical that many disease-causing mutations will be missed. According to data from the ENCODE project, about 80% of the human genome may have a regulatory role. Thus, with the usual assays one can miss potential protein-coding regions that have not yet been annotated as genes, as well as regulatory regions such as non-coding RNAs or transcription factor binding sites.

Many splicing mutations are missed in clinical settings due to the limitations of in silico prediction algorithms or because they are located in non-coding regions. Subtle changes in splice sites, 3' untranslated regulatory regions, non-coding RNAs and direct interaction of transcription factors may have significant effects on gene expression patterns that can only be assessed by transcriptome interrogation.

Other limitations that may be mentioned:

1) Capture efficiency: an important subset of GC-rich exons of the coding genes is missed in NGS studies.
2) Coverage shortfalls generated by the presence of highly homologous regions. Although these regions are captured and covered by multiple reads, QC filters discard them because the same read can be aligned to multiple different regions. Therefore, coverage drops and variants present in those regions may be missed.
3) Some genes are dispensable, meaning that truncating variants in even one of these genes can lead to erroneous categorization of variant pathogenicity.
4) WGS, WES, and gene panels cannot capture epigenetic phenomena. Other orthogonal technologies such as RNA-seq or ChIP-seq can detect gene fusions, expression differences, or changes in regulatory regions that are missed by exome sequencing alone.

LO: Articulate with examples the impact of modern technologies on rare disease diagnosis