GENETICS OF ABCA4-ASSOCIATED DISEASES AND RETINITIS PIGMENTOSA


GENETICS OF ABCA4-ASSOCIATED DISEASES AND RETINITIS PIGMENTOSA

Yajing (Angela) Xie

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2016

© 2016
Yajing (Angela) Xie
All rights reserved

ABSTRACT

Genetics of ABCA4-Associated Diseases and Retinitis Pigmentosa

Yajing (Angela) Xie

Inherited retinal dystrophies encompass a broad group of genetic disorders affecting visual function in as many as 1 in 3,000 individuals around the world. Common symptoms include loss of central, peripheral, or night vision, and in severe cases progression to complete blindness. Syndromic forms involving abnormalities in other parts of the body also exist. Currently, more than 250 genes representing a wide variety of functional roles have been shown to be responsible for the disease phenotypes. Moreover, mutations in the same gene sometimes cause different phenotypes, while mutations in multiple genes can give rise to the same clinical subtype, further demonstrating the level of complexity in these disorders. Such genetic heterogeneity has substantially complicated the process of pinpointing the precise genetic causes underlying these conditions.

The goal of my thesis research is to clarify the genetic causes underlying retinal dystrophies, with a primary focus on phenotypes resembling ABCA4-associated diseases and retinitis pigmentosa in both syndromic and non-syndromic forms. Recent advances in next-generation sequencing (NGS), a high-throughput, deep sequencing technology, have enabled the identification of several novel genes and of new mutations in known genes. Nevertheless, a substantial fraction of cases remains unsolved. The primary work in this thesis involves utilizing NGS, particularly whole-exome sequencing (WES), to identify disease-causing mutations in families where at least one parent and affected or unaffected siblings are available. Determining all genetic variation underlying retinal diseases is necessary for precise molecular genetic diagnosis and improved prognosis of these conditions.

The first part of my thesis highlights the complexity in genetic inheritance of diseases caused by mutations in the ABCA4 gene. Among Stargardt disease cases with only one mutation identified in the ABCA4 coding region, deep sequencing of the entire locus identified the second mutation in an intronic region of the gene in 10% of cases. The genetic heterogeneity of ABCA4 was further demonstrated by the identification of 4 different pathogenic ABCA4 mutations and 4 phenotypes in a single family. These findings epitomize the extremely complex mutational spectrum underlying ABCA4-associated diseases and argue for thorough sequencing of the entire genomic locus, including copy number variant analysis.

In the second part of my thesis, exome sequencing led to findings of phenotypic expansion in known disease genes, and in one case the precise molecular diagnosis resulted in immediate treatment. A family with 2 affected siblings presented a novel phenotype of macular dystrophy caused by mutations in CRB1. In another family, where 9 members were affected with late-onset bull's eye maculopathy (BEM), a mutation was found in CRX, with incomplete penetrance. In one family with an affected adult, two well-documented mutations in MMACHC, a gene causal for a potentially debilitating disorder of cobalamin deficiency, were found to segregate with BEM and minimal systemic features in the proband. Early diagnosis in this patient resulted in hydroxocobalamin treatment for her condition, and possibly an improvement of her systemic prognosis. Together, these findings revealed that clinical phenotypes can diverge widely from those previously described, and that only genetic testing can unequivocally determine the cause of a disease.

The third part of my thesis work highlights first-time discovery and co-discovery of new genes associated with retinal diseases. A new form of syndromic RP was investigated in a family presenting a previously undescribed constellation of phenotypic features. Exome sequencing analysis of 3 affected siblings and their unaffected parents revealed deleterious mutations in the RDH11 gene. In another family, where 2 affected siblings presented with a remarkably similar phenotype, no mutations in RDH11 were detected. However, analysis of absence of heterozygosity revealed causal mutations in the CWC27 gene. In the search for novel genes in cone-rod dystrophy cases negative for ABCA4 mutations, WES identified new rare, deleterious mutations in RAB28 in two families of Spanish descent. These findings revealed novel genetic causes underlying hereditary retinal diseases and demonstrated the effectiveness of WES analysis in rare disease gene discovery.

In summary, this work represents a comprehensive mutational analysis of inherited retinal dystrophies with complex genotype and phenotype correlations, utilizing next-generation DNA sequencing in large study cohorts. The power of whole-exome sequencing for gene discovery was well demonstrated by unequivocally identifying the genetic cause in close to 50% of all patients examined in this study. Establishing precise correlations between genotype and clinical phenotype is important for facilitating patient care, counseling, and therapeutic intervention for inherited diseases.

TABLE OF CONTENTS

LIST OF FIGURES AND TABLES
LIST OF APPENDICES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
DEDICATION

CHAPTER I. GENETICS OF HEREDITARY RETINAL DISEASES
  Overview of Inherited retinal dystrophies (IRDs)
  Heterogeneity of ABCA4-associated diseases
  Heterogeneity of Retinitis pigmentosa
  Genomics approach to gene discovery in IRDs
  Work described in this dissertation

CHAPTER II. DESCRIPTION OF STUDY COHORTS AND GENETIC METHODS
  Recruitment of study subjects and clinical evaluation
  Whole-Exome Sequencing
  Variant analysis
  Homozygosity mapping

CHAPTER III. COMPLICATIONS IN GENETIC INHERITANCE OF ABCA4-ASSOCIATED DISEASES
  Preface
  Analysis of the ABCA4 genomic locus in Stargardt disease
  Complex inheritance of ABCA4 disease: four mutations in a family with multiple macular phenotypes
  Discussion

CHAPTER IV. EXPANSION OF THE PHENOTYPIC SPECTRUM IN KNOWN DISEASE GENES
  Preface
  Whole-exome sequencing identifies defect in an unusual maculopathy phenotype
  A Drosophila genetic resource of mutants to study mechanisms underlying human genetic diseases
  Whole exome sequencing identifies an adult-onset case of methylmalonic aciduria and homocystinuria type C (cblC) with non-syndromic bull's eye maculopathy
  Discussion

CHAPTER V. DISCOVERY OF NOVEL GENES FOR NON-SYNDROMIC AND SYNDROMIC RETINAL DISORDERS
  Preface
  New mutations in the RAB28 gene in 2 Spanish families with cone-rod dystrophy
  New syndrome with retinitis pigmentosa is caused by nonsense mutations in retinol dehydrogenase RDH
  Mutations in spliceosome-associated protein homolog CWC27 cause syndromic autosomal recessive retinitis pigmentosa
  Discussion

CHAPTER VI. SUMMARY AND CONCLUSION
REFERENCES
APPENDICES

LIST OF FIGURES AND TABLES

Chapter I & II
  Figure 1.1. Anatomy of the eye
  Figure 1.2. Vertical section of the human retina
  Figure 1.3. Genotypic and phenotypic heterogeneity in Mendelian IRDs
  Figure 1.4. Phenotype categories of eye disease genes solved by WES
  Table 2.1. Phenotype categories in ABCA4-associated disease cohort
  Figure 2.1. Schematic overview of whole-exome sequencing cohort and outcome

Chapter III.2
  Table 1. Analysis of the new intronic ABCA4 variants which were either detected twice or more in the cohort of 114 STGD1 patients, and/or have a predicted effect on splicing
  Figure 1. Pedigrees segregating the new ABCA4 intronic variants with STGD
  Table 2. Frequency of the variants described in Braun et al. (2013), and in the current study
  Figure 2. Analysis of the c g.a variant in M. mulatta
  Figure S1. Examples of the possible predicted effect of new non-coding ABCA4 variants on splicing, assessed using 5 different algorithms
  Table S1. The 141 non-coding new ABCA4 locus variants that have not been reported in 1000 Genomes Database, with C-scores and DNaseI hypersensitivity and Transcription Factor (TF) scores
  Table S2. Detailed information of the 8X60K custom aCGH microarray

Chapter III.3
  Fig. 1 Pedigree of the family illustrating the segregation of four disease-causing ABCA4 alleles with two Stargardt disease (STGD1) phenotypes of varying severity
  Table 1 Demographic and genetic characteristics of individuals within the family
  Fig. 2 Advanced ABCA4 phenotype of the proband, paternal aunt and uncles
  Fig. 3 The paternal cousin (36-year-old son of the paternal uncle) harboring the p.G1961E allele and the c _c.6670del/insTGTGCACCTCCCTAG variant presented at a comparatively milder disease stage
  Table 2 Summary of ABCA4-associated phenotypes in affected individuals
  Fig. 4 Macular findings in the mother and father of the proband who each carry a single ABCA4 variant, p.C54Y and c.[302+68C>T; C>T], respectively
  Fig. 5 The deletion/insertion variant in the ABCA4 locus in the family

Chapter IV.2
  Figure 1. Pedigree of the family
  Table 1. Summary of Demographic, Clinical, and Genetic Data
  Figure 2. Fundus photographs, fundus autofluorescence (FAF) images, and spectral-domain optical coherence tomography (SD-OCT) scans of the right and left eyes of a 45-year-old woman (proband) and her 41-year-old affected brother with the same CRB1 mutations
  Figure 3. Fundus photographs, 488 nm fundus autofluorescence (FAF) images and spectral-domain optical coherence tomography (SD-OCT) scans in the parental carriers of the CRB1 mutation
  Figure 5. Color-coded macular thickness (mm) maps of the CRB1-affected proband and her brother compared with an age-matched normal subject
  Figure 6. Functional and macular progression assessment through fundus autofluorescence (FAF) imaging and microperimetry (MP)-1 mapping
  Table 3. Variants Identified in the 2 Affected Individuals
  Table 5. Potential Modifier Genes for CRB1-Associated Phenotype
  Figure 4: Pre- and post-treatment spectral domain-optical coherence tomography images of the left eye of the affected sibling with cystoid macular edema
  Table 2. Summary statistics for exome sequencing
  Table 4. Genes investigated for a possible modifier effect

Chapter IV.3
  Figure 1. Summary of the Drosophila X Chromosome Screen
  Table 1. List of 165 Fly Genes and 259 Corresponding Human Homologs Identified from the Screen
  Figure 2. Comparison of Results from This EMS Screen and Previous RNAi Screens
  Figure 3. Essential Fly Genes Associated with More Than One Human Homolog Are More Likely to be Linked to Human Diseases
  Figure 4. Flowchart for Discovery and Functional Studies of Disease Genes Using the Drosophila Resource and Human Exome Data
  Figure 5. Mutations in CRX Cause Bull's Eye Maculopathy
  Figure 6. ANKLE2 and Microcephaly
  Figure S1. Flow Chart of the F3 Adult Mosaic Genetic Screen on the X Chromosome of Drosophila, Related to Figure
  Figure S2. Phenotypic Screening of Morphological and Electrophysiological Defects in Mutant Clones, Related to Figure
  Figure S3. Flow Chart of Mapping of X-Linked Recessive Lethal Mutants, Related to Figure
  Figure S4. Missense Mutations in DNM2 Associated with Charcot-Marie-Tooth Disease, Related to Figure
  Figure S5. dAnkle2 Regulates Brain Size, Related to Figure

Chapter IV.4
  FIGURE 1. Pedigree of the family and segregation of the MMACHC mutations with the disease phenotype
  FIGURE 2. Goldmann visual fields of the left (A) and right (B) eyes from the initial visit in 2006, showing ring scotomas consistent with the bull's eye-appearing macular lesions
  FIGURE 3. Spectral domain optical coherence tomography line scans
  FIGURE 4. Full-field electroretinogram (Espion E3, Diagnosys LLC, Littleton, MA) of the patient's right eye at age 35 during a follow-up visit

Chapter V.2
  Figure 1. Pedigrees of the 2 Families and Segregation of the RAB28 Mutations With the Disease Phenotype
  Table. Clinical Data From the 2 Probands With CRD and Mutations in the RAB28 Gene
  Figure 2. Clinical Presentation of the Disease in the Patient From Family MD
  Figure 3. B-Allele Frequencies Generated From Whole-Exome Sequencing Data for Both Affected Patients From Families MD-0312 and MD

Chapter V.3
  Figure 1. Pedigree of the family and segregation of the RDH11 mutations with the disease phenotype
  Figure 2. Clinical presentation of the syndromic features in affected family members
  Figure 3. Ophthalmic examination of affected family members consistent with an early-onset retinal dystrophy
  Supplemental Table 1: Number of variants (genes) identified in the family at various data filtering stages

Chapter V.4
  Figure 5.1. Pedigree of the family
  Figure 5.2. Absence of heterozygosity (AOH) regions in chromosome
  Figure 5.3. Analysis of splicing of the CWC27 gene

LIST OF APPENDICES

Appendix 1. Clinical categories and mutations of solved WES cases
Appendix 2. Supplemental Information for Chapter IV

LIST OF ABBREVIATIONS

A2E: N-retinylidene-N-retinylethanolamine
aCGH: array-comparative genomic hybridization
adRP: autosomal dominant retinitis pigmentosa
AllinScore: Allikmets In-house Score
AOH: absence of heterozygosity
arCRD: autosomal recessive cone-rod dystrophy
arRP: autosomal recessive retinitis pigmentosa
BBS: Bardet-Biedl syndrome
BEM: bull's eye maculopathy
BWA: Burrows-Wheeler Aligner
CADD: Combined Annotation Dependent Depletion
cblC: methylmalonic aciduria and homocystinuria type C
CD: cone dystrophy
cDNA: coding DNA
CNV: copy number variation
CRD: cone-rod dystrophy
CSNB: congenital stationary night blindness
DNA: deoxyribonucleic acid
ENCODE: the Encyclopedia of DNA Elements
ERDC: European Retinal Disease Consortium
EST: expressed sequence tag
EVS: Exome Variant Server
ExAC: Exome Aggregation Consortium
FEVR: familial exudative vitreoretinopathy
HGMD: Human Gene Mutation Database
IBD: identity-by-descent
INDEL: insertion/deletion
IRD: inherited retinal dystrophies
LCA: Leber congenital amaurosis
MAF: minor allele frequency
MD: macular dystrophy
NGS: next-generation sequencing
N-ret-PE: N-retinylidene-phosphatidylethanolamine
OCT: optical coherence tomography
PCR: polymerase chain reaction
RetNet: The Retinal Network
RP: retinitis pigmentosa
RPE: retinal pigment epithelium
RNA: ribonucleic acid
RT-PCR: reverse transcription polymerase chain reaction
SDCCAG10: colon cancer antigen 10
SNP: single-nucleotide polymorphism
SNV: single-nucleotide variation
snRNP: small nuclear ribonucleoprotein
STGD1: Stargardt disease type 1
VCF: variant call format
WES: whole-exome sequencing

ACKNOWLEDGEMENTS

Throughout my PhD training, many people have spent significant amounts of time supporting and helping me, and in general being there for me through ups and downs. I am deeply grateful to all my mentors, colleagues, family members, and close friends, from whom I have learned so much and received immense support, and without whom none of my achievements in the past five years would have been possible.

First and foremost, I would like to express my deepest gratitude to my mentor and PhD advisor Professor Rando Allikmets, for his patience, motivation, and continuous support of my graduate study. Dr. Allikmets accepted me as a student in his lab when I had no in-depth knowledge of his research topic. Throughout my training, Dr. Allikmets has devoted an immense amount of time and patience to teaching me and has untiringly shared his knowledge of human genetics with me. He has provided me with an abundance of guidance and the best possible resources to support my training and development as a research scientist. I have grown tremendously in the time I have worked in his lab and am forever grateful for all he has done to enable my accomplishments and promote my overall professional development.

I have received an incredible amount of support and encouragement from my committee members. I am grateful to Dr. Angela Christiano for her willingness to serve as my thesis committee chair, for her guidance on my thesis writing, and for her kindness and patience throughout my training. I am greatly indebted to Dr. Yufeng Shen for his help and support in multiple areas of computational analyses and his guidance on my development as a bioinformatics scientist. I am grateful to Dr. Peter Nagy for his mentorship and support, and to Marissa Pang, a member of his lab, for technical help and collaboration on the NGS work. I am thankful to Dr. Chaolin Zhang for his support and encouragement as a member of my thesis committee.

I would like to express gratitude to current and former members of the Allikmets lab who have taught me, assisted me, and been there for me throughout my training. I want to give special thanks to Winston Lee, our clinical coordinator and researcher, who has provided extensive support on the clinical materials and other matters, which has been tremendously helpful for my work. I am especially grateful for the extensive help and guidance I have received from Dr. Takayuki Nagasaki, who has been there to support me throughout my thesis writing process. I want to thank Carolyn Cai for helping me learn and for performing many experiments together with me, and Jana Zernant, Dr. Huicong Cai, Dr. Jian Kong, Dr. Joanna Merriam, and the rest of the Allikmets lab for their wonderful companionship and support.

My training would not have been possible without the support of the Integrated Program in Cellular, Molecular, and Biomedical Studies at the Columbia Medical Center. I would like to thank Dr. Ron Liem, our program director, for giving me the wonderful opportunity to train as a PhD student, and Zaia Sivo for her support on administrative and other matters. I would also like to express gratitude to members of the Baylor College of Medicine, Dr. James Lupski and Dr. Tomasz Gambin, for hosting me in Houston and providing me with training and collaboration on critical parts of my thesis work. I am grateful for receiving support from the NIH Vision Science Training Grant for three years.

I thank all my friends, research scientists, physicians and fellows at the Harkness Eye Institute for their help and companionship. Without the work of all present and past colleagues, my research project would not have been possible. I am also deeply grateful to all the patients and families in this study, from whom I have learned so much.

Finally, I would like to give my deepest gratitude to my parents for their life-long love and support. Without their constant encouragement, influence and provision, I would never have obtained the background and the spirit that led me to pursue the path of biomedical research.

DEDICATION

To my parents Donghai Xie and Xuejin Lu
for their unconditional love and support in my every endeavor

~*~

To my best friends Adara Liao and Huan M. Duong
for nine years of unfailing friendship

CHAPTER I

GENETICS OF HEREDITARY RETINAL DISEASES

1. Overview of Inherited retinal dystrophies (IRDs)

Vision plays a vital role in human life. Visual perception is mediated by the retina, a light-sensitive tissue located in the inner coat of the eye (Figure 1.1). Light striking the retina initiates a cascade of phototransduction events that ultimately trigger nerve impulses passing through the optic nerve to the brain. The retina contains two types of photoreceptor cells: rods and cones. Rods mediate vision in dim light, while cones support daytime vision and the perception of color. Most of the cone photoreceptor cells are located in the center of the retina, known as the macula, and in the center of the macula is the fovea, where the highest cone density is found. The retinal pigment epithelium (RPE) lines the back of the retina below the choroid and provides support to the photoreceptor cells (Figure 1.2). The RPE supports the visual cycle, a process taking place in both the RPE and the photoreceptors in which the retinoid substrates are recycled (Grossniklaus, Geisert et al. 2015).

Inherited retinal dystrophies (IRDs) encompass a diverse set of Mendelian retinal disorders that are associated with photoreceptor degeneration. Caused by early damage or progressive deterioration of cellular components important for retinal function, IRDs may result in loss of central, peripheral or night vision. In some cases, symptoms may progress to complete blindness (den Hollander, Black et al. 2010, Wright, Chakarova et al. 2010). The estimated rate of disease occurrence varies, but in some conditions it can be as high as 1 in 3,000 individuals (Daiger, Bowne et al. 2007, Chizzolini, Galan et al. 2011, Ferrari, Di Iorio et al. 2011). IRDs vary widely in their clinical manifestations, including severity, age of onset, pathogenesis, symptom progression and inheritance pattern.

Figure 1.1. Anatomy of the eye. (Left) Overview of the eye. Light contacts the cornea, and passes through the pupil, lens, and vitreous gel to reach the retina. Photoreceptors in the retina receive light and transmit information through the optic nerve to the brain. (Right) Color fundus photography of a normal retina. The center of the image is the macula, with the fovea at its central pit. The light-orange half circle in the upper left corner of the image is the optic disk, where the optic nerve reaches the eye (Photo credits: National Eye Institute).

Based on the type of photoreceptor affected and the manifestations and atrophy within the retina, IRDs can be generally categorized into cone-dominating, rod-dominating, or generalized retinal degenerations involving both rod and cone photoreceptors. Cone-dominating dystrophies are primarily characterized by loss of central vision and perception of color. Progressive forms include macular dystrophies (MDs; localized loss of central/macular cones), cone dystrophies (CDs; involvement of only cone photoreceptors), and cone-rod dystrophies (CRDs; central and peripheral cone involvement followed by rod degeneration) (Thiadens, Phan et al. 2012). The most common group of mostly juvenile-onset macular dystrophies is collectively called Stargardt disease type 1 (STGD1) (Allikmets, Singh et al. 1997). Non-progressive forms of cone-dominating subtypes include achromatopsia, characterized by loss in perception of either all or only one specific color (Remmer, Rastogi et al. 2015).

Rod-dominating dystrophies primarily affect rod photoreceptors. The most common progressive form is retinitis pigmentosa (RP), characterized by gradual loss of peripheral vision resulting in tunnel vision (Hartong, Berson et al. 2006). Non-progressive forms of rod-dominating diseases include congenital stationary night blindness (CSNB), a form of night blindness also known as nyctalopia (Boycott, Sauve et al. 1993).

Generalized non-syndromic retinal dystrophies involve the simultaneous degeneration of both rod and cone photoreceptor functions, the most common form of which is Leber congenital amaurosis (LCA). As the most severe subtype of retinal dystrophy, LCA can result in complete blindness in childhood, with an onset of symptoms in the first year of life (Koenekoop 2004).

Retinal dystrophies can also manifest as syndromic forms involving abnormalities in other parts of the human body. Examples of syndromic types of IRD include Bardet-Biedl syndrome, where an RP phenotype is accompanied by obesity, polydactyly, hypogonadism, and in some cases renal failure. Other syndromic IRDs include Joubert syndrome, Usher syndrome and Senior-Loken syndrome. While there are general groupings of the phenotypes, substantial overlaps exist among the clinical presentations of certain conditions, which often complicates precise clinical diagnosis (Nash, Wright et al. 2015).

Figure 1.2. Vertical section of the human retina. The deepest layers (back of the eye) are at the top of the image; the superficial layers are at the bottom. (Image courtesy of Bryan William Jones, Ph.D. Marc Laboratory, Moran Eye Center. Used with permission.)

The inheritance of IRDs is predominantly Mendelian, where cases may be familial with autosomal recessive, autosomal dominant, or X-linked modes of inheritance. Polygenic inheritance, mitochondrial or de novo mutations, as well as environmental factors such as drug toxicity, have also been reported in a few instances. Currently, over 250 genes have been identified to be associated with one or more subtypes of IRDs (The Retinal Network, RetNet, accessed on: July, 2016). To add to the complexities of these disorders, mutations in multiple genes can give rise to similar groups of phenotypes, while mutations in one gene can cause multiple conditions with overlapping or distinct phenotypes (Figure 1.3) (Nash, Wright et al. 2015). Almost all causal mutations associated with IRDs are rare (minor allele frequency < 1%), consistent with mutation-selection balance (Human Gene Mutation Database, HGMD).

The biological functions of retinal disease genes are highly diverse. While some genes belong to a visual pathway, such as the visual cycle or phototransduction, a great number of them support general functions such as protein folding, lipid metabolism or the extracellular matrix. The functional category with the most genes is ciliary trafficking, accounting for ~20% of all IRD genes discovered to date (RetNet). The combined genotypic and phenotypic heterogeneity has substantially complicated the molecular diagnosis and establishment of precise treatment for these diseases. There is currently no cure for most patients with IRDs. A number of experimental treatments are in development; however, the evidence supporting their effectiveness is variable and limited (Ku and Pennesi 2015).
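The rarity criterion quoted above (minor allele frequency < 1%) is commonly applied as a simple filter over candidate variants. The following is a minimal sketch of such a filter, not the pipeline used in this work; the `Variant` class, the function names, the placeholder gene names, and the example frequencies are all hypothetical and chosen only for illustration. Treating a variant absent from the reference population as rare is likewise an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

# Cutoff taken from the "< 1%" criterion quoted in the text.
RARE_MAF_THRESHOLD = 0.01


@dataclass
class Variant:
    """Hypothetical minimal record for a candidate variant."""
    gene: str
    change: str
    maf: Optional[float]  # None means not observed in the reference population


def is_rare(variant: Variant, threshold: float = RARE_MAF_THRESHOLD) -> bool:
    """Return True if the variant is unobserved or strictly below the threshold."""
    return variant.maf is None or variant.maf < threshold


def filter_rare(variants: List[Variant],
                threshold: float = RARE_MAF_THRESHOLD) -> List[Variant]:
    """Keep only the variants that pass the rarity criterion, preserving order."""
    return [v for v in variants if is_rare(v, threshold)]


# Illustrative input only: placeholder genes and made-up frequencies.
candidates = [
    Variant("GENE_A", "c.1A>G", 0.0004),
    Variant("GENE_B", "c.2C>T", 0.12),
    Variant("GENE_C", "c.3G>A", None),
    Variant("GENE_D", "c.4T>C", 0.01),
]
rare = filter_rare(candidates)
```

In this sketch the comparison is strict, so a variant with a frequency of exactly 1% is not counted as rare; a real analysis would also have to decide which reference population the frequency is drawn from.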

25 Figure 1.3. Genotypic and phenotypic heterogeneity in Mendelian IRDs. Diagrammatic representation of overlap in genetic causes and phenotypic expression of various forms of retinal dystrophies. Example genes are shown for each disease categories. Achm: achromatopsia; CSNB: congenital stationary night blindness (Nash, Wright et al. 2015). 2. Heterogeneity of ABCA4-associated diseases The ABCA4 gene was first identified in 1997 by Allikmets and colleagues (Allikmets, Singh et al. 1997) as the causal gene for Stargardt Disease 1 (STGD1) the 7

26 most common juvenile-onset macular dystrophy that accounts up to 7% of retinopathies with a prevalence of 1 in 5,000 to 10,000 (Blacharski 1988). STGD1 is characterized by rapid central visual impairment, progressive bilateral atrophy of the foveal retinal pigment epithelium (RPE), and frequent appearance of yellowish flecks, defined as lipofuscin deposits, around the macula and/or in the central and near peripheral areas of the retina (Allikmets, Singh et al. 1997). In addition to STGD1, mutations in ABCA4 were found to segregate with a wide variety of other retinal phenotypes such as autosomal recessive CRD (Cremers, van de Pol et al. 1998, Maugeri, Klevering et al. 2000) and atypical autosomal recessive RP (Cremers, van de Pol et al. 1998, Martinez- Mir, Paloma et al. 1998, Shroyer, Lewis et al. 2001). With a high carrier frequency of 1 in 20 people across all populations, variation in the ABCA4 locus has emerged as the most prevalent cause of Mendelian retinal disease (Maugeri, van Driel et al. 1999, Yatsenko, Shroyer et al. 2001, Jaakson, Zernant et al. 2003). The economic burden from these diseases is difficult to estimate, but is clearly substantial. Currently no approved treatment exists for all range of ABCA4-associated phenotypes, but several promising therapeutic options are in or are close to be in clinical trials (Binley, Widdowson et al. 2013, Auricchio, Trapani et al. 2015). ABCA4 encodes the ATP-binding cassette, sub-family A, member 4 transporter. Studies have shown that the ABCA4 protein is localized to the rim of the photoreceptor discs in the outer segments of rods and cones (Sun and Nathans 1997), where it plays the role of rate-keeper of retinal transport in the visual cycle. Specifically, ABCA4 is proposed to act as a transporter for N-retinylidne-phosphatidylethanolamine (N-ret-PE) across the photoreceptor disc membranes, thereby preventing the formation of toxic 8

27 bisretinoid compounds in photoreceptors and RPE cells. Retinal degenerations arisen from mutations in the ABCA4 gene are proposed to be caused by the accumulation of lipofuscin fluorophore, N-retinylidene-N-retinylethanolamine (A2E) in the RPE cells and subsequent RPE cell death as well as secondary loss of photoreceptors (Weng, Mata et al. 1999, Mata, Weng et al. 2000, Tsybovsky, Molday et al. 2010, Molday 2015). To date, over 1000 disease-associated ABCA4 mutations have been identified (Allikmets, Singh et al. 1997, Zernant, Schubert et al. 2011) (personal communication), demonstrating extensive allelic heterogeneity in the locus. The most frequent diseaseassociated ABCA4 allele p.g1961e, has been described in approximately 20% of STGD1 patents (10% of ABCA4 disease alleles) (Allikmets 2007). Several studies have identified frequent ethnic group-specific ABCA4 alleles, such as the c.2588g>c variant resulting in a dual effect, p.g863a/delg863, as a founder mutation in Northern European patients with STGD1 (Maugeri, van Driel et al. 1999), and p.l541p/a1038v - a complex allele (two variants on the same chromosome) in both STGD1 and CRD patients of German origin (Cremers, van de Pol et al. 1998, Rivera, White et al. 2000). Complex ABCA4 alleles are not uncommon; they are detected in approximately 10% of all STGD1 patients (Lewis, Shroyer et al. 1999, Shroyer, Lewis et al. 2001). Depending on the severity of the ABCA4 mutations and the stage of the disease diagnosis, ABCA4-asociated pathology presents in a wide range of phenotypes (Burke and Tsang 2011). ABCA4 is the most frequently identified disease gene in patients with CRD, with studies reporting its frequency in about one-third of cases (Thiadens, Phan et al. 2012). In comparison to mutations that cause classical STGD1, more severe mutations, such as those resulting in a frameshift (insertions/deletions), or in a 9

28 premature stop codon, are seen to segregate more commonly with CRD (Riveiro- Alvarez, Lopez-Martinez et al. 2013). The p.g1961e mutation in either heterozygous or homozygous state has been found to cause bull s eye maculopathy (BEM) (Cella, Greenstein et al. 2009), a condition characterized by localized central lesion in the macula that has been reported in early stages of STGD1 and CRD, but have also been attributed to other causes such as chloroquine drug toxicity (Kearns and Hollenhorst 1966, Michaelides, Chen et al. 2007). Mutations in ABCA4 have been associated with the RP phenotype on several occasions, typically as a result of two ABCA4 null alleles (Cremers, van de Pol et al. 1998). Studies have shown that the ABCA4-associated RP cases are atypical subtypes that happen when patients at late stages of a severe form of Stargardt disease present typical RP-associated features, such as panretinal degeneration and bone spicule-shaped pigment deposits. In patients who present classical RP symptoms from the disease onset, the genetic cause is more likely non- ABCA4 (Riveiro-Alvarez, Lopez-Martinez et al. 2013). Allelic heterogeneity in ABCA4-associated disorders has substantially complicated genetic analyses and molecular diagnosis of patients. The Allikmets lab has in the past performed a combination of screening methods on patients diagnosed with STGD1 (Jaakson, Zernant et al. 2003) (Zernant, Schubert et al. 2011). Sequencing of all ABCA4 exons and adjacent splice sites finds the expected two pathogenic mutations in only 65 70% of patients diagnosed with likely ABCA4 disease. In 15% to 20% of patients, only one mutation is found, and in the remaining ~15%, no pathogenic ABCA4 mutations are identified (Zernant, Schubert et al. 2011). Studies assessing the fraction of copy number variants (CNVs) (large deletions or insertions of exons and 10

chromosomal segments), which elude PCR-based methods such as direct sequencing, have found those in only approximately 1% of all STGD1 patients (Yatsenko, Shroyer et al. 2003).

Based on the above findings, we suggested several hypotheses. For patients with 1 identified ABCA4 mutation in the coding region, there could be 2 explanations: 1) the other pathogenic mutation is outside the coding region but within the ABCA4 locus, as an intronic single-nucleotide variation (SNV) or small insertion/deletion (INDEL), or as a large copy number variation (CNV) in a small subset (~1%); or 2) these patients could be carriers of this ABCA4 mutation by chance, and the actual cause could be mutations in another gene. For patients with 0 identified ABCA4 mutations in the coding region, we also suggested 2 explanations: 1) most of these patients are phenocopies, that is, they have phenotypes resembling ABCA4 disease that are caused by mutations in other known or novel disease genes; 2) both mutations are still in the ABCA4 locus but are outside of coding regions or are large CNVs. The fraction of phenocopies in a given cohort depends on a number of factors. The quality of clinical diagnosis is important, as diagnostic criteria overlap for multiple macular diseases. In some cases, however, even extensive clinical data are not enough to pinpoint the possible genetic cause, since mutations in other genes can lead to phenotypes resembling ABCA4-associated diseases. Given the substantially overlapping phenotypes and several treatment options currently in late stages of preclinical development or in clinical trials, the correct and comprehensive molecular diagnosis of ABCA4-associated diseases is crucial. In this study, we describe work

conducted to address the above hypotheses on determining all genetic variation underlying ABCA4-associated disease.

3. Heterogeneity of Retinitis pigmentosa

Retinitis pigmentosa is arguably the most common IRD phenotype, affecting approximately 1 in 3,000-5,000 people worldwide (Daiger, Bowne et al. 2007, Chizzolini, Galan et al. 2011, Ferrari, Di Iorio et al. 2011). Variation exists at multiple levels, with locus and allelic heterogeneity, incomplete penetrance, and variable expression all observed. The onset of the disease varies: early-onset or juvenile RP can affect patients from as early as the first years of life, whereas symptoms of adult or late-onset RP develop significantly later. Clinical presentation begins with progressive deterioration of the ability to see in dim light, causing night blindness, followed by loss of peripheral vision that slowly encroaches toward the center of the visual field, resulting in tunnel vision. Complete blindness can result at later stages of the disease, when the cone photoreceptors are also affected. The affected region of the retina may be restricted to a specific site, adding further complexity to disease identification (Hamel 2006). Mutations in over 100 disease genes have been identified to cause RP (RetNet). Rather than being a single disease entity, RP is now considered a common clinical pathway that arises from a number of causes that lead to rod photoreceptor degeneration. The two most commonly mutated genes, RPGR (retinitis pigmentosa GTPase regulator) and RHO (rhodopsin), account for 6-11% and 5-8% of all RP cases, respectively (Daiger, Bowne et al. 2007). Known functions of the encoded proteins are

highly varied, the main categories of which include phototransduction, retinal metabolism, RNA splicing, tissue development and maintenance, and cellular structure (RetNet). Nonsyndromic, nonsystemic RP cases amount to 65% of all cases. Of these, roughly 30% are adRP, 20% are arRP, 15% are X-linked RP, and 5% are early-onset severe forms of RP that are diagnosed as recessive LCA. The remaining cases, at least 30%, are isolated or simplex cases, almost all of which most likely represent arRP (Daiger, Bowne et al. 2007). Digenic inheritance has also been observed in rare cases (Kajiwara, Berson et al. 1994). In addition to simple forms of RP, there are syndromic forms in which the retinal phenotypes are accompanied by manifestations of a system-wide pathology. The two best characterized systemic RP forms are Usher syndrome, caused by mutations in 12 genes (Bonnet and El-Amraoui 2012), and Bardet-Biedl syndrome (BBS), which is caused by mutations in at least 17 genes (RetNet) (Leitch, Zaghloul et al. 2008, Zaghloul and Katsanis 2009). The most prominent non-retinal phenotypes of Usher syndrome include congenital or early-onset hearing loss (Friedman, Schultz et al. 2011), while BBS often presents with obesity, developmental delay, and polydactyly (Guo and Rahmouni 2011, Putoux, Attie-Bitach et al. 2012). Other forms of syndromic RP include those associated with mitochondrial diseases (Kearns-Sayre, Wolfram, etc., syndromes) (Puddu, Barboni et al. 1993, Inoue, Tanizawa et al. 1998) and some forms of renal or neurodegenerative phenotypes (Joubert, Jeune, etc., syndromes) (Keeler, Marsh et al. 2003, Dixon-Salazar, Silhavy et al. 2004, Bredrup, Saunier et al. 2011). The presence of specific systemic phenotypic features in patients varies, and often mutations in the same gene can cause different

phenotypes, sometimes classified as separate diseases (Coppieters, Lefever et al. 2010), or the same syndrome can be caused by mutations in different genes (Katsanis 2004). Due to the exceptionally complex clinical presentations and genetic factors underlying RP, even the latest genetic diagnostic techniques are only able to achieve molecular diagnosis in approximately 50% of RP patients (Neveling, Collin et al. 2012). Moreover, by current estimates known genes explain between 60% and 80% of RP cases, which suggests that many RP loci remain to be found (Daiger, Sullivan et al. 2013). In this study, we aim to identify and characterize novel RP genes to facilitate genotype-phenotype correlations and improved molecular diagnosis for these conditions.

4. Genomics approach to gene discovery in IRDs

Advances in next-generation sequencing (NGS) have revolutionized analyses of human genetic variation in recent years. Since massively parallel sequencing platforms became widely available in 2005, individual research groups have been able to carry out large-scale sequencing within reasonable cost and time (Metzker 2010). Evidence suggests that in rare monogenic disorders a majority of disease-causing mutations reside in the protein-coding regions of the human genome (Stenson, Ball et al. 2009), which make up only about 2% of the 3 billion base pairs of human genome sequence (Bamshad, Ng et al. 2011). This has made whole-exome sequencing (WES), the targeted capture and sequencing of all coding regions in the genome, a popular approach for the identification of pathogenic mutations in rare genetic diseases. The first monogenic disorder resolved by WES was the multiple malformation disorder Miller

syndrome in 2009, where sequencing only a selected few individuals was enough to pinpoint the causal gene (Ng, Buckingham et al. 2010). Since then, more than 800 novel monogenic disease genes have been found through similar approaches (Chong, Buckingham et al. 2015, Stranneheim and Wedell 2016). In gene discovery for retinal degenerations, exome sequencing can be particularly effective, since different genetic loci can independently lead to closely related phenotypes, and genotype-phenotype correlation is often ambiguous. From 2011 to 2016, WES was employed in the identification of over 60 novel eye disease genes, whose phenotypes spanned all major subtypes in both non-syndromic and syndromic forms (Figure 1.4; RetNet). Non-syndromic retinitis pigmentosa genes encompass the largest group, with 20 genes identified in the past 5 years. A majority of the novel genes are of autosomal recessive inheritance, discovered through homozygosity mapping and/or linkage analysis coupled with WES (RetNet). This study will describe the use of the whole-exome sequencing approach to identify pathogenic mutations in known and new retinal disease genes.

CRD; 6
BBS; 4
US; 2
CSNB; 3
FEVR; 2
Joubert syndrome; 3
LCA; 3
MD; 4
optic atrophy; 2
other syndromic; 14
RP; 20
other nonsyndromic; 4

Figure 1.4. Phenotype categories of eye disease genes solved by WES. Phenotype and the number of genes discovered are separated by a semicolon (;). Abbreviations: US, Usher syndrome; BBS, Bardet-Biedl syndrome; CRD, cone-rod dystrophy; CSNB, congenital stationary night blindness; FEVR, familial exudative vitreoretinopathy; LCA, Leber congenital amaurosis; MD, macular dystrophy (RetNet, date accessed: July, 2016).

5. Work described in this dissertation

ABCA4-associated macular dystrophies and retinitis pigmentosa together represent the most prevalent causes of cone-dominated and rod-dominated retinopathies. The prevalence of these conditions and the lack of a cure present a major public health problem. The primary goal of this project is to determine all genetic variation underlying these disorders in order to obtain a precise molecular genetic diagnosis, which will be crucial for selecting patients for emerging therapeutic applications and for improving prognosis. In this work, we present several studies conducted to address the hypotheses raised in Chapters I.2 and I.3. The layout of this thesis is as follows:

Chapter II includes the methods I employed and the experiments I conducted throughout this work. Experiments conducted by colleagues in the Allikmets lab and by collaborators, such as targeted locus screening, array comparative genomic hybridization (aCGH), and some functional and animal studies, will be described within the studies in which they occurred, in the relevant chapters.

Chapter III presents a large cohort study on the ABCA4 locus in STGD1/CRD patients with 1 pathogenic ABCA4 coding mutation, and a genotype-phenotype study on ABCA4 variations.

Chapters IV and V present the key findings from a large exome sequencing cohort study aiming to identify other genes causing phenotypes that overlap with ABCA4-associated diseases and retinitis pigmentosa. Since ABCA4 mutations have been identified in STGD1, BEM, CRD, and RP, this cohort included patients with phenotypes that fall under any of those categories. In addition to ABCA4-associated arRP phenotypes, patients with other forms of RP, including those with dominant inheritance and syndromic features, were included in the cohort.

Chapter VI concludes the work described in Chapters III-V, and discusses findings from the large exome sequencing cohort study. Complexities of ABCA4 variants, phenotypic expansion of known disease genes, and discovery of novel genes are discussed in the context of genetic inheritance underlying ABCA4-associated diseases and retinitis pigmentosa.

CHAPTER II
DESCRIPTION OF STUDY COHORTS AND GENETIC METHODS

1. Recruitment of study subjects and clinical evaluation

This study conformed to the tenets of the Declaration of Helsinki. After giving written informed consent, the patients in this study were recruited and clinically examined during a 15-year period in centers in the USA and Spain. Specifically, patients were recruited at Columbia University, the University of Illinois at Chicago and the Pangere Center at the Chicago Lighthouse, and Centro de Investigacion Biomedica en Red de Enfermedades Raras, Instituto de Salud Carlos III in Madrid, Spain. All patients and, whenever possible, relatives of the proband were given a full ophthalmic examination. Retinal images were taken and transmitted to the Edward S. Harkness Eye Institute at Columbia University in New York. Imaging techniques consisted of color fundus photography, fluorescein angiography, blue light autofluorescence imaging, blue light and infrared reflectance imaging, and optical coherence tomography (OCT), as indicated by an attending ophthalmologist. Electroretinography was performed on some patients to characterize the severity and the type of retinal defects. Peripheral blood of patients and relatives was obtained by venipuncture at the respective centers, where DNA was isolated for candidate gene screening and, when applicable, peripheral blood RNA for functional assessments. A total of 125 probands were included in the study, for 90 of whom we were able to enroll and screen more than one affected or unaffected relative. The continuum of disease manifestation in this cohort ranges from milder macular dystrophies such as bull's eye maculopathy (BEM) and Stargardt disease (STGD1)-like maculopathies to more severe phenotypes like cone-rod dystrophy (CRD) and retinitis pigmentosa (RP) (Table 2.1). Ninety-two of 125 cases are maculopathies consistent with phenotypes

associated with ABCA4 mutations, while 33 out of 125 are nonsyndromic or syndromic RP cases of unknown genetic cause.

Table 2.1. Phenotype categories in ABCA4-associated disease cohort
Categories | BEM | STGD1-like | CRD | RP | Total
Families
Sporadic cases
Total

For all enrolled cases the disease is assumed to be inherited in a Mendelian fashion. This is based on the family structure, and on prior knowledge that other rare, familial disorders with similar phenotypes are Mendelian conditions caused by rare variants with large effect sizes. In general, four inheritance patterns are considered upon observation of the disease pattern in each family pedigree: autosomal dominant, autosomal recessive, X-linked recessive, and X-linked dominant. Complications to Mendelian inheritance, such as incomplete penetrance, pseudo-dominance, and oligogenic and modifier effects, are considered if evidence points to such patterns. When possible, we included more relatives, both closely and distantly related, to increase the power of genetic analysis. This is especially important given that each family may represent a distinct gene/disease.

2. Whole-Exome Sequencing

In this study, we performed whole-exome sequencing in 90 families where at least one parent and affected or unaffected siblings were available and 35 sporadic cases where no family members were available (Figure 2.1), adding up to a total of 327 exomes. Exome capture and sequencing were conducted at various centers/companies: 252 samples were sequenced at Baylor College of Medicine, 49 at Macrogen, and 26 at the Columbia University Laboratory of Personalized Genomic Medicine. Sample processing and sequence analysis followed in-house protocols developed at the respective centers/companies. The basic workflow at all centers can be briefly summarized as follows. DNA extracted from peripheral blood of the patients was submitted to sequence capture. Exome capture arrays, either Agilent SureSelect V4, Agilent SureSelect V4+UTR (Agilent, Santa Clara), or the Baylor custom array HGSC-CORE designed based on NimbleGen VCRome 2.1 (Roche NimbleGen, Madison), were employed according to instructions from each center/company. Depending on the specific array, exome capture covered 30-70 Mb of sequence target regions. Massively parallel sequencing of the enriched library was performed on the Illumina HiSeq platform (HiSeq 2000 or 2500) with paired-end reads (Illumina, San Diego). Bioinformatics sequence analyses were performed according to in-house pipelines from each center/company. At Macrogen, raw sequencing reads were converted to the fastq format and mapped against the human reference genome (hg19) with the short-read aligner Burrows-Wheeler Aligner (BWA) (Li and Durbin 2009). Single-nucleotide variations and

small insertions/deletions were detected from the aligned file with SAMtools (Li 2011), and the resulting variants were formatted in variant call format (VCF) (Danecek, Auton et al. 2011). At Baylor College of Medicine, analyses were performed using the HGSC Mercury analysis pipeline, which addressed all aspects of data processing and analyses. At the Columbia University Laboratory of Personalized Genomic Medicine, sequence alignment and variant calling were conducted using the NextGENe software (SoftGenetics, State College, PA).

Families: 90 families. Solved: 44 cases (known genes: 26 genes; new genes: 5 genes). In analysis: 46 cases (candidates: 10 genes).
Sporadics: 35 cases. Solved: 15 cases (known genes: 9 genes). In analysis: 20 cases.

Figure 2.1. Schematic overview of whole-exome sequencing cohort and outcome. Whole-exome sequencing was performed on 90 families and 35 sporadic cases. Of the 90 families, 44 were solved, where the causative genes included 26 known genes and 5 new genes. The remaining 46 families were in analysis, with 10 strong candidate genes currently being pursued. Of the 35 sporadic cases, 15 were solved with 9 known genes, and 20 were in analysis. Detailed findings from this study are described in Chapter VI.
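As a rough illustration of the variant call format produced by the workflow above, the following Python sketch parses one VCF data line and labels each alternate allele as an SNV or an INDEL. The record is invented for illustration (its position and alleles do not come from this study), the helper names `classify_allele` and `parse_vcf_line` are hypothetical, and the classification by allele length is a simplification; the centers' actual pipelines are not reproduced here.

```python
def classify_allele(ref, alt):
    """Label an alternate allele by comparing its length with the reference allele."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # e.g. multi-nucleotide substitutions


def parse_vcf_line(line):
    """Parse the fixed leading columns of one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    alts = alt.split(",")  # multiple alternate alleles are comma-separated
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alts": alts,
        "kinds": [classify_allele(ref, a) for a in alts],
    }


# Hypothetical record: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
example = "1\t1000\t.\tG\tA,GTT\t50\tPASS\t.\n"
record = parse_vcf_line(example)
print(record["kinds"])
```

A real analysis would more likely rely on an established VCF parser than on hand-written string splitting; the sketch only shows how SNVs and small INDELs are distinguishable from the REF and ALT columns.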

In all samples, sequencing achieved a minimum mean read depth of 50X across the target region, and >90% of the target region was covered at a depth of at least 10X. We generated on average of from million total reads for each sample, yielding 5-10 gigabytes per sample. With these sequencing yields, samples achieved 90% of the target regions covered to a depth of 10 times or more. A total of 80, ,000 variants were identified for each sample. Of all coding variants, over 98% were SNPs and less than 1% were indels. The ratio of nonsynonymous to synonymous variants was close to 1:1. Key findings from this cohort study will be described in detail in Chapters III-IV, and a summary of all findings will be given in the concluding chapter.

3. Variant analysis

Exome sequencing identifies around 80, ,000 variants per individual on average. Pathogenic mutations therefore need to be distinguished from benign variants among the large pool of information generated. We considered several parameters when filtering for pathogenic mutations: 1) allele frequency, 2) types of variants, 3) segregation with the disease, and 4) evidence from known genes and biological pathways.

Allele Frequency

In the majority of Mendelian eye disorders, the highly penetrant mutations that cause the disease are very rare in the general population. Our exome cohort encompassed a wide range of retinal phenotypes. Based on our prior knowledge of the extensive locus heterogeneity in IRD, we expected to find a number of genes harboring very rare mutations, each of which accounts for a small number of cases in this cohort. The

increase in variant information submitted to the public databases, as well as large cohort studies that sequenced thousands of individuals, have made population frequency information on coding as well as non-coding variants publicly available. In this study, we excluded variants that are present at a minor allele frequency (MAF) > 0.5% in databases such as 1000 Genomes (Genomes Project, Abecasis et al. 2010), Exome Variant Server (EVS), and Exome Aggregation Consortium (ExAC). Exomes sequenced at Baylor College of Medicine were compared against the Baylor internal ARIC database. To take into account sequencing artifacts from batch effects, we curated an internal database of all mutations seen in exomes sequenced in this project, the Allikmets In-house Score (AllinScore). A variant that is present in more than 5 families/sporadic cases out of 118 independent cases in this cohort, but is absent in public databases, would be considered a likely sequencing artifact.

Types of variants

Based on the observation that most causal variants in monogenic disorders affect protein coding and canonical splice sites, we focused on missense, frameshift/non-frameshift, nonsense, and splice site variants when prioritizing them for disease causality. INDELs that introduce a frameshift, and stop codon-inducing (stop-gain) or altering (stop-loss) SNVs, are considered loss-of-function variants, as they disrupt the reading frame of a transcript and can be largely regarded as null alleles. The pathogenicity of SNVs that alter the amino acid sequence in the coding region (missense variants) can be more difficult to assess. They can be evaluated based on

the biochemical changes and the location of the altered position relative to functionally and/or structurally important domains of the protein. In this study, we prioritized all novel missense variants by their predicted functional impact on the protein using variant pathogenicity prediction software such as SIFT (Kumar, Henikoff et al. 2009), PolyPhen-2 (Adzhubei, Schmidt et al. 2010), MutationTaster (Schwarz, Cooper et al. 2014), and CADD (Kircher, Witten et al. 2014), and used quantitative estimates of the level of sequence conservation of the alleles from phastCons (Siepel, Bejerano et al. 2005) and phyloP (Pollard, Hubisz et al. 2010). For variants found adjacent to canonical splice sites, we examined their predicted effect using splicing prediction tools such as MaxEntScan (Yeo and Burge 2004) and Human Splicing Finder (Desmet, Hamroun et al. 2009). Missense and splice site variant prediction tools were accessed via the Alamut Visual software for variant analyses in this study. Other types of variants that are sometimes considered for potential disease-causal effects are synonymous and deep intronic variants (outside of splice consensus sequences). Synonymous variants do not produce an amino acid change, but they can be disease associated in several ways. First, they can alter the splice site sequence, resulting in incorrect splicing. Second, it has been shown that synonymous variants can affect translation efficiency by introducing less frequent codons, resulting in delayed translation and folding of the protein (Chen, Davydov et al. 2010). In this study, pathogenic synonymous variants were discovered through next-generation sequencing, and evaluated by a combination of in-silico predictions and reverse-transcription PCR.
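The frequency and variant-type criteria described in this section can be sketched as a first-pass filter. The Python example below is a minimal illustration under stated assumptions, not the scripts used in this study: the `Variant` record, the gene names, and the consequence labels are hypothetical, while the 0.5% MAF cutoff and the "more than 5 in-house cases, absent from public databases" artifact rule are taken from the text. Synonymous and deep intronic variants, which the study evaluated separately, are simply dropped here.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.005        # 0.5% minor allele frequency, the cutoff stated in the text
ARTIFACT_CASE_LIMIT = 5   # "more than 5" independent in-house cases

# Hypothetical consequence labels; annotation tools use their own vocabularies.
PRIORITIZED = {"missense", "frameshift", "non_frameshift",
               "nonsense", "stop_loss", "splice_site"}


@dataclass
class Variant:
    gene: str
    consequence: str
    public_maf: Optional[float]  # None means absent from the public databases
    inhouse_cases: int           # independent cohort cases carrying the variant


def passes_filters(v: Variant) -> bool:
    # 1) Drop variants that are too common in public databases.
    if v.public_maf is not None and v.public_maf > MAF_CUTOFF:
        return False
    # 2) Drop likely artifacts: recurrent in-house but never seen publicly.
    if v.public_maf is None and v.inhouse_cases > ARTIFACT_CASE_LIMIT:
        return False
    # 3) Keep only the variant types prioritized for causality.
    return v.consequence in PRIORITIZED


variants = [
    Variant("GENE_A", "missense", 0.0001, 1),   # rare missense: kept
    Variant("GENE_B", "missense", 0.02, 1),     # too common: dropped
    Variant("GENE_C", "frameshift", None, 9),   # likely artifact: dropped
    Variant("GENE_D", "synonymous", None, 1),   # not prioritized in this pass
    Variant("GENE_E", "nonsense", None, 1),     # novel loss-of-function: kept
]
kept = [v.gene for v in variants if passes_filters(v)]
print(kept)
```

Keeping the thresholds as named constants makes it easy to see how the candidate list changes if, for example, a stricter MAF cutoff is preferred for a dominant disorder.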

Key findings for the ABCA4 gene as well as the novel disease gene CWC27 are described in Chapters III and V. Variants in deep intronic regions have been found to cause ABCA4-associated diseases (Zernant, Schubert et al. 2011, Braun, Mullins et al. 2013) by affecting splicing or regulatory regions, such as promoters or enhancers (Braun, Mullins et al. 2013). In this study, we evaluated the potential effect on gene regulation of new ABCA4 intronic variants by assessing their location in the DNaseI hypersensitivity and transcription factor binding regions from the ENCODE project, a comprehensive list of functional elements in the human genome (Consortium 2012). Details of the use of this method are described for the ABCA4 gene in Chapter III.

Segregation with the disease

An important criterion for assessing causality of a gene for a disease is to determine how the variants in the gene segregate with the disease phenotype in a family. For a fully penetrant, autosomal dominant disorder, a segregating heterozygous variant is shared by the proband and other affected members of the family, but not by unaffected members. For an autosomal recessive disease, two pathogenic alleles are inherited, one from each of the two parents. These can be two compound heterozygous mutations in the same gene or, in the case where the two alleles are identical, a homozygous mutation in the same gene. For an X-linked recessive pattern, a hemizygous mutation on the X chromosome causes the phenotype to be expressed in affected males. In rare occurrences of affected females, the mother would have to be an obligate carrier and the father affected. X-linked dominant inheritance, while being

much less common than X-linked recessive inheritance, happens when a dominant gene is carried on the X chromosome. In this study, segregation analysis was performed individually on the exome sequencing data for each of the 90 families. Sequencing was carried out for a number of relatives ranging from one parent/sibling in addition to the proband to 8 members in multi-generation pedigrees. Whenever possible, we included one affected/unaffected sibling and at least one parent to examine the phase of segregating variants. In-house scripts written in the R language were developed based on the patterns of inheritance described above to search for variants segregating in each family. Segregating mutations in the same gene in several unrelated cases are often necessary to establish a causal relationship between a gene and a disease. Therefore, obligatory follow-up analyses involved screening more patients with the same phenotype. We searched sporadic cases both in our cohort and from collaborating sites, such as the European Retinal Disease Consortium (ERDC), which consists of 16 research groups from Europe and North America collaborating in the field of IRDs. Specific attention was given to patients with matching ancestry.

Evidence from known genes and biological pathways

In prioritizing candidate genes for disease causality, variants in genes already known to produce the phenotype of interest are given priority consideration. Variants in several retinal disease genes are known to sometimes result in phenotypes closely resembling those in ABCA4-associated diseases. These genes are PRPH2 (gene for multifocal pattern dystrophy) (Grover, Fishman et al. 2002, Boon, van Schooneveld et al.

2007), ELOVL4 (dominant STGD-like disease gene) (Bernstein, Tammur et al. 2001, Zhang, Kniazeva et al. 2001), BEST1 (Best disease gene, recessive forms resemble STGD), RS1 (retinoschisis gene) (Tsang, Vaclavik et al. 2007), and CNGB3 (achromatopsia gene) (Michaelides, Aligianis et al. 2004). If no causal mutations were found in the above genes, all variants from known retinal disease genes in the dataset were considered for a possible phenotype expansion, and all known RP genes were carefully assessed when the case in question had an RP phenotype. An important database we accessed was the Retinal Information Network (RetNet), which curates all known retinal disease genes to date. In this study, genes recorded in RetNet were surveyed for previously published disease-associated mutations or likely pathogenic novel mutations; those were given strong consideration before variants in other genes. If no pathogenic variants emerged from genes in the RetNet database, all genes harboring variants that segregated with the disease were taken into consideration. Candidate variants were stratified by their predicted role in biological pathways involved in eye functions or their interactions with genes or proteins that are known to cause similar phenotypes. The ABCA4 protein plays the role of a rate-keeper of retinal transport in the visual cycle, a series of reactions that converts all-trans retinal, a derivative of vitamin A, to 11-cis retinal for recycling of rhodopsin (Molday 2015). However, ABCA4 is not known to directly interact with any proteins in the visual cycle, other than with ATP and the substrate N-ret-PE. Among the key players of the visual cycle (RDH8, LRAT, RPE65, RDH5, IRBP), RPE65 and LRAT, when mutated, cause recessive Leber congenital amaurosis and retinitis pigmentosa (Gu, Thompson et al. 1997, Ruiz, Kuehn

et al. 2001, Bowne, Humphries et al. 2011). Currently known RP genes encompass a broad spectrum of biological pathways. The search for new candidates could first prioritize genes from the 5 main categories of pathways listed in Chapter I.

Experimental validation

Current next-generation sequencing technologies are prone to false positive errors in variant calling. Variants called from NGS therefore need to be validated experimentally, especially INDELs and variants in complex, repetitive regions, where most errors are produced. In this study, all causal variants detected by next-generation sequencing methods were validated with direct Sanger sequencing in the family members submitted for exome sequencing as well as in available additional family members. In cases where a variant was predicted to have an effect on splicing, and where we were able to obtain RNA from peripheral blood leucocytes, the effect on splicing was examined by reverse-transcription PCR (RT-PCR). In this study, complementary DNA was synthesized using oligo-dT primers. PCR amplification of the cDNA was conducted with primers designed to span the adjacent exons encompassing the mutation, and the pattern of the spliced product was visualized with DNA gel electrophoresis.

4. Homozygosity mapping

Discovering genes that cause rare recessive traits in large outbred populations is often a difficult task due to the paucity of families with affected individuals. An efficient gene mapping strategy is homozygosity mapping in individuals from consanguineous marriages, often from isolated populations. This method takes advantage of the fact

that individuals who are relatives to a known degree contain common genomic regions inherited from the ancestors they share, and this increases the likelihood that the two copies of the disease allele are identical-by-descent (IBD) from the same ancestor (Lander and Botstein 1987). The basic idea of searching for the disease allele is thus to identify regions of homozygosity that are shared by different affected individuals. Gene mapping is usually carried out by scanning the genotypes at known polymorphic sites, such as those obtained via SNP arrays, to search for regions with contiguous sharing of the same genotype. In this study, we identified homozygous regions in individuals from SNVs obtained from whole-exome sequencing data. WES provides genotype information on SNVs that are present in the coding region of the individual genome, and a significant lack of heterozygosity in contiguous SNVs, within a gene or spanning multiple genes, would indicate a region of absence of heterozygosity (AOH). To detect such AOH regions, we adopted the circular binary segmentation algorithm (Olshen, Venkatraman et al. 2004), utilizing the DNAcopy Bioconductor package and in-house R scripts. Originally designed for the analysis of array-based DNA copy number data, the algorithm assumes that genomic events can be seen as discrete gains and losses in contiguous segments of the genome, and identifies such segments by splitting chromosomal regions at change-points, the points after which the test-over-reference signal ratios change (Olshen, Venkatraman et al. 2004). In applying the algorithm to exome data, a B-allele frequency was computed from normalized read depth ratios between variant reads and total reads of each SNV present in the variant call format (VCF) file. Resulting regions above a

certain threshold presented consecutive homozygous SNVs constituting AOH blocks. Chromosomal coordinates of AOH regions in affected and unaffected relatives were compared. A region could contain the disease allele if it is shared only by affected siblings and is not present in unaffected siblings. All rare variants within the region would be surveyed, and strong candidates would be carefully assessed with the variant analysis criteria described in Chapter II to identify the causal mutation. This method was used in part for discovering two novel genes, RAB28 and CWC27. Details will be described in Chapter V.
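The idea of flagging runs of homozygous SNVs from exome read counts can be sketched as follows. This Python example is a deliberately simplified stand-in: it uses a plain run-length scan with a fixed threshold rather than the circular binary segmentation of the DNAcopy package that was actually applied in this study. The function names, the 0.9 B-allele frequency threshold, the minimum run length, and the read counts are all assumptions made for illustration.

```python
def b_allele_frequency(variant_reads, total_reads):
    """Fraction of reads supporting the variant allele at one SNV."""
    return variant_reads / total_reads if total_reads else 0.0


def find_aoh_blocks(snvs, hom_threshold=0.9, min_snvs=3):
    """Return (chrom, start, end, n_snvs) for runs of consecutive homozygous SNVs.

    snvs: list of (chrom, pos, variant_reads, total_reads), sorted by chrom and pos.
    """
    blocks, run = [], []
    # A sentinel record with a B-allele frequency of 0 closes the final run.
    for chrom, pos, var, tot in list(snvs) + [(None, 0, 0, 1)]:
        hom = chrom is not None and b_allele_frequency(var, tot) >= hom_threshold
        # Close the current run at a heterozygous SNV or a chromosome change.
        if run and (not hom or chrom != run[0][0]):
            if len(run) >= min_snvs:
                blocks.append((run[0][0], run[0][1], run[-1][1], len(run)))
            run = []
        if hom:
            run.append((chrom, pos))
    return blocks


# Hypothetical read counts for eight SNVs on two chromosomes.
snvs = [
    ("chr1", 100, 30, 30), ("chr1", 200, 28, 29), ("chr1", 300, 40, 40),
    ("chr1", 400, 15, 30),  # heterozygous SNV interrupts the run
    ("chr1", 500, 20, 20),
    ("chr2", 100, 20, 20), ("chr2", 200, 20, 20), ("chr2", 300, 20, 20),
]
blocks = find_aoh_blocks(snvs)
print(blocks)
```

Blocks found this way in each relative could then be compared by coordinate, keeping regions shared by affected siblings and absent in unaffected ones; a segmentation method such as circular binary segmentation is more tolerant of isolated genotyping errors than this strict run-length rule.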

CHAPTER III
COMPLICATIONS IN GENETIC INHERITANCE OF ABCA4-ASSOCIATED DISEASES

1. Preface

Since its discovery in 1997 as the causal gene for STGD1 (Allikmets, Singh et al. 1997), the ABCA4 gene has been a subject of intensive genetic research. Similarly, phenotypes caused by ABCA4 mutations have also been extensively studied and found to be notable for a high degree of variability of expression. In molecular diagnosis of STGD1 patients, complete sequencing of ABCA4 identifies both disease alleles in only 65-70% of STGD1 patients, and only one mutation in 15-20% of patients. Furthermore, the high carrier frequency of ABCA4 alleles in the general population (~1:20) sometimes results in pseudo-dominant inheritance patterns in patients or, in very rare instances, in patients with other retinal diseases being carriers of an ABCA4 allele. Such genetic heterogeneity makes precise molecular diagnosis of affected individuals a challenging task.

The work described in this chapter was designed to identify the missing disease-causing alleles in STGD1 patients who carry one pathogenic ABCA4 mutation. By employing multimodal genetics approaches, we seek to complete the picture of genotype-phenotype correlations underlying ABCA4-associated diseases.

In the published work described in Chapter III.2, a combination of targeted next-generation sequencing, array comparative genomic hybridization (aCGH), in silico and RNA analyses, and segregation analyses in families was employed to screen the entire 140 kb ABCA4 genomic locus in 114 STGD1 patients with one known ABCA4 exonic mutation to search for intronic variations/CNVs. My work in this study involved in silico characterization of deep intronic variants found in the ABCA4 locus for regulatory potential, such as their effect on promoter or enhancer/silencer regions.

In the published work described in Chapter III.3, we presented a genotype-phenotype analysis of a large family presenting with four distinct macular disease phenotypes across two generations. Sequencing of the ABCA4 coding region initially revealed two known missense mutations, but these alone were not sufficient to account for the phenotypes presented by individuals in the pedigree. Initially suspecting that multiple genes played a role, we performed exome sequencing in available family members in addition to ABCA4 locus sequencing and aCGH screening. My contribution to this work included in silico analyses of putative ABCA4 regulatory variants, pedigree-based exome sequence analyses, and RNA analysis for the identified deep intronic ABCA4 variants.

CHAPTER III.2 Analysis of the ABCA4 genomic locus in Stargardt disease (Published Paper)

Human Molecular Genetics, doi: /hmg/ddu396

Analysis of the ABCA4 genomic locus in Stargardt disease

Jana Zernant 1, Yajing (Angela) Xie 1, Carmen Ayuso 3,4, Rosa Riveiro-Alvarez 3,4, Miguel-Angel Lopez-Martinez 3,4, Francesca Simonelli 5, Francesco Testa 5, Michael B. Gorin 6,7, Samuel P. Strom 6,7,8, Mette Bertelsen 9, Thomas Rosenberg 9, Philip M. Boone 10, Bo Yuan 10, Radha Ayyagari 11, Peter L. Nagy 2, Stephen H. Tsang 1,2, Peter Gouras 1, Frederick T. Collison 12, James R. Lupski 10, Gerald A. Fishman 12 and Rando Allikmets 1,2

1 Department of Ophthalmology and 2 Department of Pathology and Cell Biology, Columbia University, New York, NY, USA, 3 Department of Genetics, Instituto de Investigacion Sanitaria-University Hospital Fundacion Jimenez Diaz, UAM (IIS-FJD), Madrid, Spain, 4 Centro de Investigacion Biomedica en Red (CIBER) de Enfermedades Raras, ISCIII, Madrid, Spain, 5 Eye Clinic, Multidisciplinary Department of Medical, Surgical and Dental Sciences, Second University of Naples, Naples, Italy, 6 Department of Ophthalmology, 7 Department of Human Genetics, Jules Stein Eye Institute and 8 Department of Pathology, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA, 9 Kennedy Center Eye Clinic, Glostrup Hospital, Glostrup, Denmark, 10 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA, 11 Department of Ophthalmology, University of California San Diego, La Jolla, CA, USA and 12 The Pangere Center for Hereditary Retinal Diseases, The Chicago Lighthouse for People Who are Blind or Visually Impaired, Chicago, IL, USA

Received May 29, 2014; Revised May 29, 2014; Accepted July 29, 2014

Autosomal recessive Stargardt disease (STGD1, MIM ) is caused by mutations in the ABCA4 gene. Complete sequencing of ABCA4 in STGD patients identifies compound heterozygous or homozygous disease-associated alleles in 65-70% of patients and only one mutation in 15-20% of patients.
This study was designed to find the missing disease-causing ABCA4 variation by a combination of next-generation sequencing (NGS), array-comparative genomic hybridization (aCGH) screening, familial segregation and in silico analyses. The entire 140 kb ABCA4 genomic locus was sequenced in 114 STGD patients with one known ABCA4 exonic mutation, revealing, on average, 200 intronic variants per sample. Filtering of these data resulted in 141 candidates for new mutations. Two variants were detected in four samples, two in three samples, and 20 variants in two samples; the remaining 117 new variants were detected only once. Multimodal analysis suggested 12 new likely pathogenic intronic ABCA4 variants, some of which were specific to (isolated) ethnic groups. No copy number variation (large deletions and insertions) was detected in any patient, suggesting that it is a very rare event in the ABCA4 locus. Many variants were excluded since they were not conserved in non-human primates, were frequent in African populations and, therefore, represented ancestral, and not disease-associated, variants. The sequence variability in the ABCA4 locus is extensive and the non-coding sequences do not harbor frequent mutations in STGD patients of European-American descent. Defining disease-associated alleles in the ABCA4 locus requires exceptionally well characterized large cohorts and extensive analyses by a combination of various approaches.

To whom correspondence should be addressed: rla22@columbia.edu. © The Author. Published by Oxford University Press. All rights reserved.

INTRODUCTION

Mutations in the ABCA4 gene are responsible for a wide variety of retinal dystrophy phenotypes, from autosomal recessive Stargardt disease (STGD1) (1) to cone-rod dystrophy (CRD) (2,3) and, in some advanced cases, retinitis pigmentosa (RP) (2,4,5). While CRD and RP phenotypes are also caused by mutations in many other genes, ABCA4 is the only recognized gene responsible for STGD1 (MIM ), a predominantly juvenile-onset macular dystrophy frequently associated with early-onset central visual impairment, progressive bilateral atrophy of the foveal retinal pigment epithelium, and the presence of yellowish flecks, defined as lipofuscin deposits, around the macula and/or in the central and near-peripheral areas of the retina. Over 800 disease-associated ABCA4 variants have already been identified (6), and the most frequent of these have been described in only 10% of STGD1 patients (7). Several studies have identified frequent ethnic group-specific ABCA4 alleles, such as the p.G863A/G863del founder mutation in Northern European patients (8), the p.[L541P;A1038V] complex allele in patients of mostly German origin (3,9), the p.R1129L founder mutation in Spain (10), the p.N965S variant in the Danish population (11) and the p.A1773V variant in Mexico (12). Complete sequencing of the ABCA4 coding and adjacent intronic sequences in patients with STGD1 routinely discovers 80% of mutations, with the fraction of patients harboring the expected two disease-associated alleles at 65-70%, with one mutation at 15-20%, and with no mutations in the remaining 15% (13). These fractions depend on many variables, most importantly the quality of the clinical diagnosis and the ethnic composition of the cohort. Most of the cases with no detected ABCA4 mutations likely represent phenocopies (13); i.e. in those patients mutations in other gene(s) cause a STGD1-like phenotype. However, based on the known carrier frequency of pathogenic ABCA4 variants, in most cases with one ABCA4 allele the second allele is expected to reside in the ABCA4 locus.
It can present as a copy number variant (CNV, large deletion or insertion of one exon or more) which eludes detection by PCR-based sequencing techniques, a synonymous variant in the coding region, or a (deep) intronic variant, which may affect splicing or a regulatory region, such as a promoter or an enhancer (14). Very few of these have been identified (13,15,16). This study was designed to find the missing ABCA4 mutations by a combination of next-generation sequencing, array-comparative genomic hybridization (aCGH), in silico and RNA analyses, and segregation analyses in families.

RESULTS

Discovery of new disease-associated variants by next-generation sequencing

Sequencing of the entire ABCA4 genomic locus, at an average depth of coverage of 100×, in 130 patients with ABCA4-associated disease harboring one previously known ABCA4 disease-associated allele, and 6 patients with no known ABCA4 mutations, resulted in detecting 1745 different variants. Eighty-three of these were previously known disease-associated or benign variants from coding regions and pathogenic splice site variants. Six hundred and ninety-five (695) variants were also detected in the 1000 Genomes Project or the Exome Sequencing Project, with no statistically significant differences in allele frequencies between the general population and the patient cohort, unless the variants were on the same allele (haplotype) with the frequent known ABCA4 coding mutation, p.G1961E. Five hundred and twenty-six (526) variants were incorrectly called deletions or insertions from single nucleotide repeat areas (homopolymers), which have proven to be difficult for the NGS approach. We also experienced a relatively high A>C/C>A/T>G/G>T false-positive calling rate with Illumina sequencing. The number of false positives can be reduced by more stringent criteria for variant calling; however, this may also exclude some real variants.
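The homopolymer problem noted above lends itself to a simple pre-filter. The following is a minimal sketch under stated assumptions, not the filtering actually applied in the study: the function names and the `min_run` threshold of 6 are hypothetical, positions are 0-based, and indels are assumed to be left-aligned with an anchor base, as in VCF.

```python
def homopolymer_run_length(seq: str, pos: int) -> int:
    """Length of the single-nucleotide run that covers 0-based index pos."""
    base = seq[pos]
    left = pos
    while left > 0 and seq[left - 1] == base:
        left -= 1
    right = pos
    while right + 1 < len(seq) and seq[right + 1] == base:
        right += 1
    return right - left + 1


def is_suspect_indel(seq: str, pos: int, ref: str, alt: str,
                     min_run: int = 6) -> bool:
    """Flag an indel call that sits in or next to a long homopolymer.

    min_run is an illustrative threshold, not a value from the source.
    """
    if len(ref) == len(alt):
        return False  # substitutions are not handled by this filter
    # With an anchor base at pos, the repeated bases start at pos + 1.
    nxt = min(pos + 1, len(seq) - 1)
    run = max(homopolymer_run_length(seq, pos),
              homopolymer_run_length(seq, nxt))
    return run >= min_run


# Hypothetical reference fragment with a run of eight T bases.
reference = "ACG" + "T" * 8 + "GCA"
flagged = is_suspect_indel(reference, 2, "GT", "G")
```

Raising `min_run` keeps more indel calls at the cost of more false positives, which mirrors the trade-off between stringency and lost real variants described in the text.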
After the filtering and verification steps, 141 new intronic ABCA4 variants remained in 114 patients. In 22 patients with one previously known ABCA4 mutation, the second pathogenic ABCA4 allele was also found in the coding sequence or adjacent splice sites. In 6/22 cases this was due to reevaluation of several variants which had been classified as benign, e.g. p.G991R and p.A1773V. The remaining 16 cases represented false-negative results, probably due to technical reasons in the initial sequencing of the ABCA4 coding regions. Of the 141 new possible candidates for disease-associated variants, two variants, c c.t and c c.a, were detected together (on the same chromosome) as a complex allele in four patients of Spanish or Italian descent (Table 1). The c c.a variant is in an evolutionarily less conserved area; the c c.t variant is adjacent to the recently reported c c.t and c G.A variants from a conserved area (14). According to predictive programs, none of these variants has any effect on splicing, whether on existing cryptic splice sites or by creating new sites. The c c.t and c c.a haplotype segregated with the disease in all three STGD1 families from Spain (Fig. 1A-C), and was absent in 100 matched Spanish control samples, making these variants very likely candidates for intronic ABCA4 mutations. Two variants, c a.g and c t.a, were detected in 3/114 unrelated patients each and were absent in 368 matched control samples. The c a.g variant segregated with the disease in two families; i.e. it was on a different chromosome than the proband's other ABCA4 mutation (Fig. 1D). This variant is also predicted to strengthen a cryptic splice donor (Supplementary Material, Fig. S1A). The variant was found, in addition, in 1/119 patients from our replication cohort of STGD1 patients with one known ABCA4 mutation and has been recently reported as a disease-associated allele (14).
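Segregation analysis as used above asks whether a new variant lies on a different chromosome (in trans) from the proband's known mutation. A minimal sketch of that reasoning follows, assuming a proband heterozygous for both variants, genotyped parents who are at most heterozygous carriers, and no de novo events. The function and variant labels are hypothetical.

```python
from typing import Set


def infer_phase(mother: Set[str], father: Set[str],
                known: str, candidate: str) -> str:
    """Infer cis/trans phase of two proband variants from parental carrier status."""
    m_known, m_cand = known in mother, candidate in mother
    f_known, f_cand = known in father, candidate in father

    # Each variant inherited from a different parent: different chromosomes.
    if (m_known and not f_known and f_cand and not m_cand) or \
       (f_known and not m_known and m_cand and not f_cand):
        return "trans"
    # Both variants carried by one parent only: same chromosome (complex allele).
    if (m_known and m_cand and not f_known and not f_cand) or \
       (f_known and f_cand and not m_known and not m_cand):
        return "cis"
    # Any other pattern cannot be resolved from parental genotypes alone.
    return "undetermined"


# Hypothetical family: the mother carries the known exonic mutation,
# the father carries the new intronic variant.
phase = infer_phase({"known_exonic"}, {"new_intronic"},
                    "known_exonic", "new_intronic")
```

A "trans" result is what the text describes as segregating with the disease; a "cis" result is the pattern that leads to a variant being excluded because it travels with the known mutation.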
The aggregate evidence suggests that the c A.G variant is a rare, deep intronic disease-associated allele. The c t.a variant, which does not have a predicted effect on splicing, was detected in 6/119 additional unrelated samples from our replication cohort of STGD1 patients with one known ABCA4 mutation. Two of these samples were from the Columbia University patient cohort and four from European cohorts, including two from Denmark. The variant was not present in 368 European-American control samples, but was detected in 2/182 Danish control samples. While the frequency of this variant is 10× elevated in STGD1 patients as compared with all controls (1.9 versus 0.18%) and 3.5× if compared with Danish controls (0.55%), it is premature to unequivocally call the variant disease-associated.

Table 1. Analysis of the new intronic ABCA4 variants which were either detected twice or more in the cohort of 114 STGD1 patients, and/or have a predicted effect on splicing
Position on chr1 Variant Effect on splicing (combined Alamut prediction) C score Primary ABCA4 locus cohort (114) Validation (replication) cohort (119) Segregation with disease Controls a Conservation in primates Disease association
c t.g Cryptic donor strongly activated /368 No Probably not
c a.g New donor site /368 Yes Yes
c c.t No effect /368 Yes Possibly yes
c t.c No effect /368 Yes No
c.859-9T>C Weakens the acceptor by 14% Yes 0/368, 0/120 b Yes Yes
c a.g Cryptic donor strongly activated Yes 0/368 Yes Yes
c a.g New donor site /368 Yes Yes
c g.a Weakens the acceptor by 50% /368 Yes Yes
c t.g Weakens the acceptor by 35% /368 Yes Yes
c t.a Cryptic donor strongly activated /368 No Probably not
c c.t New donor site /368 Yes Yes
c g.a Cryptic donor strongly activated /368 No Probably not
c g.t New donor site /368 Yes Yes
c g.a c No effect /368 Yes Possibly yes
c c.t c No effect NS Yes Possibly yes
c c.t No effect; the most frequent new variant Yes 0/100 d Yes Yes
c a.g c Cryptic donor strongly activated Yes 0/368 Yes Yes
c g.a c Cryptic acceptor strongly activated /368, 8/200 e No No
c c.a f No effect; the most frequent new variant Yes 0/368, 0/100 d No No
c c.a Weakens the acceptor by 26% NS No Yes
c t.a No effect, second most frequent variant /368, 2/180 g Yes Possibly yes
c c.t Cryptic donor strongly activated /368 Yes Yes
c t.c Cryptic acceptor strongly activated No 0/368 Yes No
a Control cohort of 368 samples of European ancestry was screened unless indicated otherwise.
b Cohort of 120 control samples of Asian-Indian origin.
c Variant was previously reported by Braun et al. (2013).
d Cohort of 100 controls from Spain.
e Cohort of 200 African-American general population controls.
f Variant is on the same allele with c c.t.
g Cohort of 180 controls from Denmark.
NS, not screened.

Three variants, c a.g, c g.a, and c.859-9T>C, were each detected in 2/114 different patients in our primary ABCA4 locus screening cohort, but were absent in 368 control samples and in 119 additional STGD1 samples from the replication cohort (Table 1). The c a.g variant creates a very strong new donor splice site according to all predictive software (Supplementary Material, Fig. S1B). The c g.a variant weakens the existing acceptor by 50%, and the c.859-9T>C change by 14% (Supplementary Material, Fig. S1C and D). The c g.a and c.859-9T>C variants are adjacent to ABCA4 exon sequences and can also be detected by sequencing of ABCA4 coding regions. However, neither of these variants has been detected in the Exome Sequencing Project, currently containing 4300 individuals of European-American descent and 2203 individuals of African-American descent. In addition to the 2/114 patients from the ABCA4 locus screening cohort, c.859-9T>C has been detected twice, in homozygosity, in STGD1 patients with no other known ABCA4 mutations, and in three more STGD1 patients in the heterozygous state with one known ABCA4 mutation. Segregation analysis was possible in one family, and confirmed that the c.859-9T>C variant was not on the same chromosome as the proband's other mutation (Fig. 1E). All evidence points to these three variants as being intronic disease-associated ABCA4 variants; the c.859-9T>C variant is discussed in detail below.

Figure 1. Pedigrees segregating the new ABCA4 intronic variants with STGD1.

Seventeen more new variants were detected twice in 114 patients from the locus screening cohort. Five of these variants were detected together in the same two patients, both of whom also carried the most likely benign p.V931M variant, so these variants were eliminated from the pool of possible mutation candidates. Three more variants were on the same chromosome with previously known ABCA4 exon variants (p.R212C, p.T1253M, and p.[L541P;A1038V], respectively). Other variants were either in the same patients who already carried a stronger intronic mutation candidate (5), not conserved in non-human primates (NHPs) (1), or were found in controls with similar frequency (1).
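The enrichment figures quoted earlier for the variant seen in 3/114 primary-cohort and 6/119 replication-cohort patients (1.9% in patients versus 0.18% in all controls and 0.55% in Danish controls) can be reproduced with a few lines of arithmetic. The sketch below assumes that every carrier is heterozygous and that each individual contributes two alleles; the function name is hypothetical.

```python
def allele_frequency(carriers: int, individuals: int) -> float:
    """Allele frequency assuming heterozygous carriers and two alleles per person."""
    return carriers / (2 * individuals)


# Patients: 3/114 in the primary cohort plus 6/119 in the replication cohort.
patients = allele_frequency(3 + 6, 114 + 119)
# Controls: 0/368 European-American plus 2/182 Danish.
all_controls = allele_frequency(0 + 2, 368 + 182)
danish_controls = allele_frequency(2, 182)

fold_vs_all = patients / all_controls
fold_vs_danish = patients / danish_controls

print(f"patients: {patients:.2%}, all controls: {all_controls:.2%}, "
      f"Danish controls: {danish_controls:.2%}")
print(f"fold enrichment: {fold_vs_all:.1f} (all), {fold_vs_danish:.1f} (Danish)")
```

With only two carrier controls, the fold values rest on very small counts, which is consistent with the authors' caution about calling the variant disease-associated.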
None of them had a predicted effect on splicing. Among these variants are also the c c.t and c g.a variants recently reported by Braun et al. Both variants were found in 2/114 samples in our primary cohort. The c c.t change was also detected in 2/119 patients of the replication cohort, a frequency significantly lower than reported by Braun et al. (14). The remaining 117 new ABCA4 intronic variants were only detected once each in 62 different patients with one previously known ABCA4 mutation. Twelve of these variants were predicted to have an effect on splicing (Table 1). The c c.a and c t.g variants weaken the existing splice acceptor sites on average by 25 and 35%, respectively (Supplementary Material, Fig. S1E and F). Neither of these two variants has been detected by the Exome Sequencing Project, nor in our entire STGD1 patient cohort (780 patients) where all ABCA4 coding regions and adjacent splice sites have been sequenced. The variants c t.g, c a.g, c C.T, c g.a and c t.a all strengthen the existing cryptic splice donor sites (Supplementary Material, Fig. S1G and H). None of these variants was found in 368 control samples, nor in the replication cohort of 119 additional STGD1 patients with one ABCA4 disease-associated allele. The c a.g variant segregated with the disease in one family (Fig. 1F). The c t.g and c t.a variants are not conserved in non-human primates and are, therefore, not likely disease-causing in humans. At the same time, these two patients do not carry any other possible ABCA4 mutant alleles. The variants c a.g and c G.T are predicted to create new strong splice donors (Supplementary Material, Fig. S1I and J). The variants are absent in 368 control samples and in the replication cohort of 119 STGD1 patients (Table 1). Since these positions are highly conserved among species, we suggest that the c a.g and c g.t variants are very good candidates for intronic ABCA4 mutations. The variants c g.a, c c.t and c t.c have a predicted effect of strengthening the existing cryptic splice acceptors (Supplementary Material, Fig. S1K-M), and were each detected in one patient in the primary locus screening cohort. The c t.c variant was on the same chromosome with the proband's other known ABCA4 mutation, and can therefore be excluded from the list of possible new mutations. The c c.t and c t.c variants were not detected in 368 control samples, nor in 119 additional STGD1 samples. The recently reported (14) c g.a variant was additionally found in nine STGD1 samples with one previously known ABCA4 variant (Table 1). Three of these patients were from Denmark, three were of African-American origin, and three of unknown ethnicity. Screening matched control cohorts revealed no carriers for this allele in 180 Danish control samples, but it was detected in 8/200 (MAF = 2%) general population controls of African-American descent, strongly suggesting non-pathogenicity. Additional support for this variant not being pathogenic comes from evolutionary analysis (see below). The human major allele c g is the minor allele in Macaca mulatta and the suggested mutant allele A is the major allele in macaques.

Analysis of regulatory sequences

To assess the effect of the new ABCA4 intronic variants on putative regulatory regions, we compared their location against the chromosome coordinates of the DNaseI hypersensitivity and transcription factor binding regions from the ENCODE project. Since regulatory regions, in particular promoters, tend to be DNase sensitive, variants that fall within such regions may affect regulatory potential.
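Comparing variant positions against annotated regions, as described above, amounts to a point-in-interval lookup. The sketch below is illustrative only: it assumes BED-style half-open, non-overlapping regions on one chromosome, and the coordinates, scores and function names are hypothetical rather than taken from ENCODE or the study.

```python
import bisect
from typing import List, Optional, Tuple

Region = Tuple[int, int, int]  # (start, end, score), half-open [start, end)


def build_index(regions: List[Region]) -> Tuple[List[int], List[Region]]:
    """Sort regions by start so lookups can use binary search."""
    ordered = sorted(regions)
    starts = [region[0] for region in ordered]
    return starts, ordered


def score_at(index: Tuple[List[int], List[Region]], pos: int) -> Optional[int]:
    """Return the score of the region containing pos, or None if there is none."""
    starts, ordered = index
    i = bisect.bisect_right(starts, pos) - 1  # last region starting at or before pos
    if i >= 0:
        start, end, score = ordered[i]
        if start <= pos < end:
            return score
    return None


# Hypothetical DNaseI hypersensitivity regions and variant positions.
dnase_index = build_index([(500, 650, 900), (100, 200, 350)])
variant_positions = [150, 200, 500, 50]
annotated = {pos: score_at(dnase_index, pos) for pos in variant_positions}
```

Running the same lookup against a second index built from transcription factor binding regions would separate variants that fall in both kinds of region from those that fall in only one.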
Also, variants that are located in transcription factor binding sites may potentially have an effect on protein binding. The combined DNaseI hypersensitivity data were derived from 125 cell types, and the dataset for transcription factor binding regions involved 161 transcription factors in 91 cell types. Unfortunately, the datasets did not contain eye-specific cell types or transcription factors. The defined regions were assigned normalized scores in the range of , with higher scores indicating stronger signal strength. Twenty-four of the 141 new ABCA4 intronic variants are located in regions with both DNaseI hypersensitivity and transcription factor binding scores of various strength (Supplementary Material, Table S1). Family members were available for segregation analyses in three cases. Variants c c.a and c c.t were on the same chromosome with the proband's other mutation, while c c.t and c t.c were both detected in one patient and on a different chromosome from this patient's other mutation, suggesting possible pathogenicity. Twenty-one variants fall within regions with only a DNaseI hypersensitivity score, and 12 variants within regions of transcription factor binding consensus sequences. The new ABCA4 intronic variants were also subjected to the Combined Annotation Dependent Depletion (CADD) algorithm (17). The CADD algorithm combines a diverse array of annotations into one metric (C score) for each variant, ranking a variant relative to all possible substitutions of the human genome (17). The C score correlates with allelic diversity, pathogenicity and experimentally measured regulatory effects. A C score of >10 indicates that the variant is among the top 10% of most deleterious substitutions in the human genome. C scores for the new ABCA4 intronic variants range from to (Supplementary Material, Table S1), with nine variants resulting in a C score greater than 10.
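The statement that a C score above 10 places a variant in the top 10% follows from the PHRED-like scaling of CADD scores, in which the scaled score equals minus ten times the base-10 logarithm of the variant's rank fraction among all possible substitutions. The sketch below only illustrates that conversion; the function names are hypothetical and it does not compute CADD scores themselves.

```python
import math


def scaled_c_score(rank_fraction: float) -> float:
    """PHRED-like scaled score for a variant ranked in the top rank_fraction."""
    return -10 * math.log10(rank_fraction)


def top_fraction(c_score: float) -> float:
    """Fraction of all possible substitutions ranked at or above this scaled score."""
    return 10 ** (-c_score / 10)


for score in (10, 20, 30):
    print(f"C score {score}: top {top_fraction(score):.1%} of substitutions")
```

On this scale each additional 10 points corresponds to a tenfold smaller fraction, so the threshold of 10 used in the text is a fairly permissive cut-off.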
Four of those variants were already classified as possibly disease-associated due to their strong predicted effect on splicing (Table 1). The other five variants with higher C scores included c T.C, c t.g, c a.g, c t.c and c t.a (Supplementary Material, Table S1). In summary, we found a very strong or a probable intronic mutation candidate in 27/114 (23.7%) patients with one existing definite ABCA4 mutation (Table 1). No immediately plausible candidates for intronic mutations were found in 36/114 (31.6%) patients. The remaining 51 (44.7%) patients possessed one or more new intronic variants that were only detected once, had no effect on splicing according to prediction programs and, therefore, are very difficult to confirm or refute as disease-associated alleles with the available methods and the impossibility of obtaining the patient RNA. However, it is highly likely that a fraction of these are associated with STGD1.

Analysis of previously reported variants

Recently several ABCA4 intronic variants were suggested to account for a substantial fraction of pathogenic ABCA4 alleles (14). Since we have directly sequenced the ABCA4 gene and flanking intronic sequences in >780 STGD1 patients and the entire ABCA4 genomic locus in 114 STGD patients with one mutation, we compared the Braun et al. data to ours (Table 2). The silent exonic p.V2114V variant, which we had described in our earlier study as a possibly disease-associated mutation (13), is very rare; it was detected in 1 out of 780 STGD1 patients (Table 2). The near-exonic c a.g variant was not detected in any of our 780 patients. We were also unable to detect one of the remaining five deep intronic variants (c c.a) in any patient with one mutation, and the other four were seen, in total, in 9/114 (7.9%) STGD1 patients, a statistically significant difference from the Braun et al. data 10/28 (17.9%; P = 0.001).

Table 2. Frequency of the variants described in Braun et al. (2013), and in the current study.
Variant # (Braun et al.) Location on chr1 Variant Braun et al. primary cohort (n = 28) CU locus cohort (n = 114) CU exon seq cohort (n = 780)
V c g.a 4 1
V c c.a 1 0
V c a.g 1 3
V c g.a 1 2
V c c.t 3 2
V c.6342G>A/p.V2114V
V c a.g

We then analyzed the evolutionary and/or ethnic origin of the variants, focusing first on the most frequent variant in the Braun et al. study, c g.a. Evolutionary conservation of a nucleotide is one of the most important criteria for determining the pathogenicity of a variant, especially in a highly conserved gene such as ABCA4. The ABCA4 protein performs a very specialized function in the visual cycle; therefore it is exceptionally conserved in mammals and in all vertebrates with a visual cycle. For example, the mouse and human ABCA4 proteins are 88% identical, allowing the human protein to perform the transport function in mouse (18). The conservation extends beyond coding sequences and includes splice sites; in fact, the ABCA4 gene has the same structure, consisting of exactly the same 50 exons, in non-human primates such as Pan troglodytes and M. mulatta. The evolutionary conservation extends, in some regions, deep into the introns; for example, the 200 bp sequences surrounding the c g.a variant are 96% identical between human and macaque. Most importantly, M. mulatta has the adenosine nucleotide in human position c (Fig. 2) as a major allele, with guanosine as a minor allele; i.e. a situation exactly the reverse of that observed in humans, suggesting that this is an ancestral variant and not a disease-causing mutation. To further test this assumption, we screened 200 unrelated individuals from the general population of African-American descent and identified eight heterozygotes, resulting in an allele frequency of 2%. The cohort of STGD1 patients at Columbia University contains 46 African-American patients, of whom one carried the c g.a variant, resulting in an allele frequency of 1%, comparable to that in the general population. No other disease-associated ABCA4 variant was identified in this patient. In addition, we detected the c g.a variant only once in 114 STGD1 patients (allele frequency 0.4%) of European descent in the Columbia cohort (Tables 1 and 2). However, it was found in 3 out of 24 STGD1 patients (allele frequency 6.25%) in a STGD1 cohort from Denmark (Table 1). Interestingly, all three patients derived from the same region in Denmark, suggesting either admixture, or that this variant may also be more frequent in some (isolated) ethnic groups other than those of African descent, since the variant was not detected in 180 unaffected Danish individuals from Copenhagen. Finally, RNA analysis from a macaque homozygous for the A allele at the human position c clearly showed no effect on splicing (Fig. 2), which eliminates the c G.A variant as possibly pathogenic in STGD1 patients.

Figure 2. Analysis of the c g.a variant in M. mulatta. (A) Alignment of the Homo sapiens chromosome1: , GRCh37.p13 Primary Assembly, with M. mulatta respective sequences. ABCA4 intronic position c G is marked with large font in bold. Differences in macaque sequence compared with human are designated with letters. (B) Confirmation of correct splicing of ABCA4 exons 36 and 37 in a c a macaque retina in RT-PCR analysis. No alternate splicing products were detected.

The three remaining deep intronic variants described by Braun et al. (14), c a.g, c c.t and c g.a, are discussed in the previous section. All of these are very rare and are conserved in NHPs. The c a.g variant, which affects splicing, is likely a disease-associated mutation. The other two are not predicted to affect splicing and are too rare to investigate by other means, so the pathogenicity of these variants remains unconfirmed.

New frequent intronic variant in STGD1 patients of Asian Indian descent

As described above, we detected the c.859-9T>C variant in 2/114 patients in our primary ABCA4 locus screening cohort. Since the c.859-9T>C variant weakens the existing splice acceptor only by 14%, we initially did not consider the variant a strong candidate for disease association. However, the variant was subsequently detected in the homozygous state in two STGD1 patients who did not harbor any other exonic or intronic ABCA4 mutations and, heterozygously, in three more STGD1 patients with one known ABCA4 mutation. Review of the ethnic origin of all these patients determined that they were all of Asian Indian descent, originating from either Pakistan, India or Bangladesh. Segregation analysis was possible in one family, and confirmed that the c.859-9T>C variant segregated with the disease (Fig. 1E); i.e. it was in trans configuration with the second ABCA4 mutation, c.5917del, in the proband. The c.859-9T>C variant is adjacent to ABCA4 exon sequences and is, therefore, also detected by exome sequencing, or by direct sequencing of ABCA4 coding regions. However, it has not been detected in the Exome Sequencing Project, currently containing 4300 individuals of European-American descent and 2200 individuals of African-American descent. It was also absent in 368 control samples and in 119 additional STGD1 samples of European-American descent with one known ABCA4 mutation (Table 1). Further screening in the Asian-Indian population did not detect this variant in 120 subjects, either from the general population (50 individuals) or from patients affected with Leber congenital amaurosis or Leber hereditary optic neuropathy (70 cases).
Altogether, the data suggest that the variant is not frequent in the ethnically matched general population and is very frequent in Indian patients with STGD1 (7 alleles in 15 patients, 23.3%), suggesting that the c.859-9T>C variant is a frequent disease-associated ABCA4 allele in patients of Asian-Indian origin.

Analysis of the copy number variants by aCGH arrays

A few lines of evidence have suggested that some disease-associated ABCA4 alleles can present as CNVs, mostly in the form of large deletions encompassing one or more exons (15). These reports have been rare, so we reasoned that a small fraction of missing ABCA4 alleles might present as CNVs. In total, 104 STGD1 patients with one known ABCA4 allele and 5 patients with no ABCA4 mutations were screened on the custom CGH arrays; 57 of these were also included in the locus sequencing. aCGH data from all 109 samples were analyzed using Agilent Genomic Workbench (version 7.0), where CNVs were called using the ADM-2 algorithm with a threshold of 4.0. No large (>500 bp) CNVs were identified in the ABCA4 locus, while ultra-small, seemingly true-positive CNV calls were further validated by PCR, which did not confirm any, reflecting the decreased accuracy of aCGH for ultra-small CNVs. Since the array contained several genes in the CFH locus with known frequent CNVs and confirmed in a positive control a heterozygous 1030 bp deletion in the ABCA4 locus, we can exclude any technical issues and conclude that, despite reports of CNVs (mainly large deletions) in 1-2% of STGD1 patients, these events are likely much more rare and CNVs do not account for a reasonable fraction of missing ABCA4 alleles in all populations studied.

DISCUSSION

The ABCA4 gene was described as the causal gene for STGD1 17 years ago (1) and has been subjected to intensive genetic research; however, the complete understanding of genetic causality has yet to be elucidated.
Major factors challenging genetic analysis are: (1) the size of the gene, (2) the exceptional genetic heterogeneity, (3) the expression of the gene in only photoreceptors, rods and cones and, consequently and most importantly, (4) the impossibility of obtaining retinal tissue samples in vivo from patients for RNA analyses. Until recently, even the analysis of amino acid-changing variants was stymied by the lack of a direct functional assay; these analyses were limited to indirect in vitro studies of the transporter's ability to bind and/or hydrolyze ATP (19,20). This problem has been somewhat alleviated by the recent studies of Molday et al. (21,22), where direct transport assays, albeit still in vitro, have been proposed and successfully used for several ABCA4 missense alleles. Large-scale use of the assay for hundreds of documented ABCA4 variants is still time- and cost-prohibitive. However, the functional analyses of variants that do not affect the protein sequence directly, such as splicing defects and variants in regulatory regions likely modulating levels of expression, remain refractory to experimental verification. Some studies (9,14) have employed the illegitimate transcript or minigene strategies to assess the effect of certain variants on splicing, but these studies have serious limitations (in part stemming from the word 'illegitimate') and the results, while suggestive, cannot be considered unequivocal, as also demonstrated in the current study. With the lack of patient RNA for direct structural and expression studies, the analysis is limited to the assessment of variant frequencies in STGD1 patients and in the matched general populations, to in silico suggestions by predictive software programs, and to segregation analyses in families.
The latter approach is also seriously hampered by the fact that most variants in ABCA4 coding and non-coding regions are extremely rare, most often occurring in singleton cases, and the available families are usually nuclear, i.e. very small, so that segregation analysis has very limited power. In the current study of non-coding genetic variation in the largest STGD1 cohort analyzed to date, we were able to (almost) unequivocally assign pathogenicity to only 12/141 variants in intronic sequences of ABCA4. Most of the variants that occurred only once in the ABCA4 locus cannot be classified, especially when the predictive programs do not suggest any effect on splicing. Moreover, the assessment based on predictive programs is not unequivocal; several studies have suggested that only 70 to 80% of predictions are molecularly confirmed (23). One also has to take into account that an effect on splicing is just one possible mechanism for non-coding sequences. The effect of these variants on other regulatory elements affecting gene expression, such as transcription factor binding sites, enhancers, promoters, etc., is still very difficult to assess in most cases (24).

The best known example of an intronic variant in the ABCA4 locus is the c t.c variant in intron 38, which is the second or third most frequent variant (found in 7% of STGD1 patients of European descent) after the p.G1961E and p.[L541P;A1038V] mutations (13). The c t.c variant always segregates with the disease phenotype in families, is very rare in the general population (<0.001) and has been shown not to affect splicing in a minigene approach (9). Since there is no mutation in the ABCA4 coding sequence on the same chromosome as the c t.c variant, the latter has to be a disease-associated mutation, although its functional consequences remain unknown.

A recent study suggested a few deep intronic variants in the ABCA4 locus which were associated with STGD1 in a large fraction, up to 50%, of patients with one exonic mutation. Our detailed analysis of these data with a multi-faceted approach on a much larger cohort of European-American descent could not confirm these conclusions. Specifically, the reported variants were much, times, rarer than originally claimed (Table 2) and some of them were deemed not associated with STGD1 after evolutionary and RNA analyses employing NHPs (Fig. 2). Concerns also include the ethnic origin and relatedness of patients in the cohorts described by Braun et al. (14), since a statistically significant difference in allele frequencies between two studies can occur, excluding technical issues, for two main reasons: (1) the cohorts include related subjects, or (2) the cohorts differ in racial or ethnic origin. For example, several missense ABCA4 variants that were originally considered pathogenic because they were very rare in populations of European descent have recently been reclassified as non-pathogenic since they are major alleles (i.e.
the wild-type variants) in non-human primates (e.g. p.R1300Q) and/or frequent in some racial and ethnic groups, for example, in African-Americans (e.g. p.L1201R, p.R1300Q and p.V643M) (25).

In summary, the analysis of the entire ABCA4 locus in large cohorts of STGD1 patients revealed the following:
(1) The genetic variability in the non-coding sequences of the ABCA4 locus, similar to the coding sequences, is exceptionally vast.
(2) There are no frequent pathogenic variants in the non-coding sequences of the ABCA4 locus; all definitely or likely disease-associated variants are individually rare in populations of European descent.
(3) Some variants are more frequent in specific racial and ethnic groups.
(4) Copy number variations in the ABCA4 locus are very rare.

Analysis of the pathogenicity of intronic ABCA4 variants, which may affect splicing or regulatory regions influencing gene expression, is complicated by the impossibility of obtaining RNA from the photoreceptor cells of affected individuals in vivo. Studying iPS cells obtained from individual patients, which are then directed to differentiate into photoreceptors expressing ABCA4, could be a plausible approach.

MATERIALS AND METHODS

Patients

A total of 255 STGD1 patients were recruited, after written informed consent, and clinically examined during a 10-year period in different centers in the USA, Italy, Spain and Denmark. Control cohorts included samples from centers in the USA, Spain and Denmark. In total, 255 patients and 918 control samples were included in the analyses. Our primary locus sequencing cohort of 136 samples consisted of 49 STGD1 patients from Columbia University, 22 from the University of Illinois at Chicago and the Pangere Center at the Chicago Lighthouse, 22 from UCLA, 25 from Italy and 18 from Spain. The second, validation/replication cohort consisted of 119 STGD1 patients from the same centers.
Age of onset was defined as the age at which symptoms were first reported. Visual acuity was measured using the Early Treatment Diabetic Retinopathy Study Chart 1 or a Snellen acuity chart. Clinical examination, fundus photography, fundus autofluorescence and spectral domain-optical coherence tomography (SD-OCT) (Heidelberg Spectralis HRA + OCT) were performed using standard acquisition protocols following pupil dilation with Tropicamide 1% and Neosynephrine 2%. All research was carried out with the approval of the Institutional Review Boards at the respective centers and in accordance with the Declaration of Helsinki.

M. mulatta samples

DNA samples from 86 unrelated rhesus macaques (M. mulatta) from NHP colonies at the National Institutes of Health and the Primate Facility at the University of Oregon were genotyped for the c g.a variant.

Sequencing

The first set of 48 patients was sequenced using RainDance microdroplet-PCR target enrichment (RainDance Technologies, Billerica, MA) with subsequent sequencing on the Roche 454 platform (454 Life Sciences, Branford, CT). We targeted the genomic region chr1: (GRCh37/hg19 Assembly); including the ABCA4 genomic locus chr1: and 9027 bp of 5′ UTR and 4667 bp of 3′ UTR sequences. The design covered 100% of the targeted area via 473 amplicons of bp. The second set of 95 patient samples was analyzed by the Illumina TruSeq Custom Amplicon target enrichment strategy followed by sequencing on the Illumina MiSeq platform (Illumina, San Diego, CA). The Illumina design targeted the genomic region , including the ABCA4 locus, and 4895 bp of 5′ UTR and 1694 bp of 3′ UTR sequences. The region was divided into nine targets, with seven 500 bp gaps and one 1100 bp gap introduced at the repeating elements. The cumulative target of the Illumina design involved bp, covered at 94% by 421 amplicons of 425 bp. The next-generation sequencing reads were analyzed using the variant discovery software NextGENe (SoftGenetics, State College, PA).
The reads were aligned against the targeted region in the reference genome GRCh37/hg19. For a variant to be called, we required the read containing the variant to match at least 85% to the aligned position, the variant to be covered by at least 10 reads and the variant to be present in at least 20% of all reads aligned to the given position. We also applied the NextGENe overall confidence score threshold of 10 as a further filter. We used the previously determined ABCA4 coding variants as controls to set these filtering criteria. On average, 200 variants were called per individual patient.

Analysis of the ABCA4 variants

All variants and their allele frequencies were compared with the 1000 Genomes database (26) and with the Exome Variant Server (EVS) dataset, NHLBI Exome Sequencing Project, Seattle, WA, USA (accessed November 2013). New variants that were not recorded in these databases were further analyzed by a combination of predictive in silico methods and statistical analyses. The possible effect of all new non-coding ABCA4 variants on splicing was assessed using five different algorithms (SpliceSiteFinder, MaxEntScan, NNSPLICE, GeneSplicer, Human Splicing Finder) via Alamut software. In order to assess the regulatory potential of the new ABCA4 intronic variants, we compared their chromosome coordinates against the predicted regulatory regions from two ENCODE datasets: (1) combined DNaseI hypersensitivity clusters from 125 cell types ("Digital DNaseI Hypersensitivity Clusters in 125 cell types from ENCODE", Ui?hgsid= &c=chr1&g=wgEncodeRegDnaseClusteredV2, filename: wgEncodeRegDnaseClusteredV2.bed.gz, last accessed on 23 August 2012); and (2) ChIP-seq clustered regions for 161 transcription factors in 91 cell types ("Transcription Factor ChIP-seq V4 (161 factors) with Factorbook motifs from ENCODE", hgsid= &c=chr1&g=wgEncodeRegTfbsClusteredV3, filename: wgEncodeRegTfbsClusteredV3.bed.gz, last accessed on 21 July 2013). Evolutionary conservation of the variants was noted via the UCSC Genome Browser.
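
The comparison of variant coordinates against the ENCODE regulatory regions amounts to an interval lookup. The following is a minimal Python sketch of that idea, not the procedure actually used in the study. It assumes the regions for one chromosome have already been read from a BED file as 0-based, half-open (start, end) pairs and that variant positions are 1-based; the function names and all coordinates in the usage note are hypothetical.

```python
from bisect import bisect_right


def merge_regions(regions):
    """Sort 0-based half-open (start, end) intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def in_regulatory_region(merged, pos1):
    """Return True if the 1-based position pos1 lies inside any merged interval."""
    pos0 = pos1 - 1  # convert to the 0-based convention used by BED files
    # Index of the last interval whose start is <= pos0 (or -1 if there is none).
    i = bisect_right(merged, (pos0, float("inf"))) - 1
    return i >= 0 and merged[i][0] <= pos0 < merged[i][1]
```

For example, with the made-up regions `merge_regions([(100, 200), (150, 300), (500, 600)])`, a variant at 1-based position 101 falls inside a region, whereas one at position 301 does not. Merging first keeps the lookup to a single binary search per variant, which is convenient when many variants are checked against the same track.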
The Combined Annotation Dependent Depletion (CADD) algorithm was used to estimate the combined predicted general deleteriousness of every variant. Segregation of the variants with the disease in available families was analyzed by Sanger sequencing, and screening of patient and control cohorts for allele frequencies in various populations was performed using TaqMan Genotyping technology (Life Technologies, Carlsbad, CA). For genotyping of the c g.a variant we used PCR-RFLP with forward primer 5′-GTGGGCCTAGCTCCTTTTAT-3′, reverse primer 5′-GGAGACCAACACAAATGACC-3′ (Life Technologies, Carlsbad, CA), and the DNA restriction endonuclease BssSI (New England Biolabs, Ipswich, MA).

Array-comparative genome hybridization (aCGH)

Custom aCGH arrays (Agilent Technologies, CA), in an 8 × 60K format, were designed with high-density probes tiling the critical genetic loci of ABCA4-associated disease to assess for CNVs involving these loci. The ABCA4 locus and eight other loci, comprising known genes causing macular disease (ELOVL4, PRPH2, BEST1, RS1, CNGB3) and major age-related macular degeneration-associated loci (CFH, ARMS2, C2/CFB), were considered primary loci and were therefore probed at ultra-high density throughout the entire genomic length of the genes plus the 5′ promoter regions and 3′ downstream regions, as well as at slightly lower density for flanking 5′ and 3′ conserved regions. Eighteen genetic loci (C3, APOE, CFI, LIPC, SYN3/TIMP3, CETP, COL8A1, BBX, PLD1, SPEF2, ADAM19, VEGFA, FRK, MEPCE, CHMP7/LOXL2, TGFBR1, NPS, PICK1) were considered secondary loci and were probed at ultra-high density in exons and at lower resolution throughout each gene plus the flanking 5′ promoter regions and 3′ downstream regions. The details of the array design are described in the Supplementary Material, Table S2. Array-CGH analysis was performed on DNA from 104 individuals diagnosed with STGD1, for each of whom only one mutation in ABCA4 had been found by sequencing.
A DNA sample with a known, previously reported 1030 bp heterozygous deletion of exon 18 in ABCA4 was used as a positive control (15). Experimental procedures for aCGH were performed as described previously (27) with minor modifications. Agilent Genomic Workbench version 7.0 software (Agilent Technologies, CA) was used for data analysis, and PCR validations were performed for the plausible true-positive CNVs after they had been filtered against several criteria.

ABCA4 RNA analysis from rhesus macaque retina

Total RNA was isolated from a snap-frozen rhesus macaque (M. mulatta) retina using the AllPrep DNA/RNA Mini Kit (QIAGEN Cat. No 80204) with a fast spin-column procedure. cDNA synthesis was achieved using TaqMan Reverse Transcription Reagents (Life Technologies, Carlsbad, CA) in a 30 min incubation at 48°C. The primer pair for further PCR amplification was designed to encompass ABCA4 exons 36 and 37, with the forward primer in exon 36 of the ABCA4 gene, 5′-GATTTTCTCCATGTCCTTCG-3′, and the reverse primer in exon 37, 5′-CTTTCTTCTGAAACCCGATG-3′, resulting in an amplicon of 205 bp in the case of correct splicing.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG online.

ACKNOWLEDGEMENTS

The authors thank Dr John Fingert for the analysis of a subset of Asian-Indian samples and Dr Dwight Stambolian for providing African-American general population samples.

Conflict of Interest statement. None declared.

FUNDING

This work was supported, in part, by grants from the National Eye Institute/NIH EY021163, EY019861, EY and EY (Core Support for Vision Research); Foundation Fighting Blindness (Owings Mills, MD), Harold and Pauline Price Foundation, unrestricted funds from Research to Prevent Blindness (New York, NY) to the Departments of Ophthalmology, Columbia University and UCLA, the Pangere Family Foundation, Chicago Lighthouse, FIS PI13/00226 & CIBERER from ISCIII, Madrid, Spain, ONCE and Fundaluce (Spain).

REFERENCES

1. Allikmets, R., Singh, N., Sun, H., Shroyer, N.F., Hutchinson, A., Chidambaram, A., Gerrard, B., Baird, L., Stauffer, D., Peiffer, A. et al. (1997) A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat. Genet., 15.
2. Cremers, F.P., van de Pol, D.J., van Driel, M., den Hollander, A.I., van Haren, F.J., Knoers, N.V., Tijmes, N., Bergen, A.A., Rohrschneider, K., Blankenagel, A. et al. (1998) Autosomal recessive retinitis pigmentosa and cone-rod dystrophy caused by splice site mutations in the Stargardt's disease gene ABCR. Hum. Mol. Genet., 7.
3. Maugeri, A., Klevering, B.J., Rohrschneider, K., Blankenagel, A., Brunner, H.G., Deutman, A.F., Hoyng, C.B. and Cremers, F.P. (2000) Mutations in the ABCA4 (ABCR) gene are the major cause of autosomal recessive cone-rod dystrophy. Am. J. Hum. Genet., 67.
4. Martinez-Mir, A., Paloma, E., Allikmets, R., Ayuso, C., del Rio, T., Dean, M., Vilageliu, L., Gonzalez-Duarte, R. and Balcells, S. (1998) Retinitis pigmentosa caused by a homozygous mutation in the Stargardt disease gene ABCR. Nat. Genet., 18.
5. Shroyer, N.F., Lewis, R.A., Yatsenko, A.N. and Lupski, J.R. (2001) Null missense ABCR (ABCA4) mutations in a family with Stargardt disease and retinitis pigmentosa. Invest. Ophthalmol. Vis. Sci., 42.
6. Allikmets, R. (2007) In Tombran-Tink, J. and Barnstable, C.J. (eds), Retinal Degenerations: Biology, Diagnostics and Therapeutics. Humana Press, Totowa, NJ, in press.
7. Burke, T.R., Fishman, G.A., Zernant, J., Schubert, C., Tsang, S.H., Smith, R.T., Ayyagari, R., Koenekoop, R.K., Umfress, A., Ciccarelli, M.L. et al. (2012) Retinal phenotypes in patients homozygous for the G1961E mutation in the ABCA4 gene. Invest. Ophthalmol. Vis. Sci., 53.
8. Maugeri, A., van Driel, M.A., van de Pol, D.J., Klevering, B.J., van Haren, F.J., Tijmes, N., Bergen, A.A., Rohrschneider, K., Blankenagel, A., Pinckers, A.J. et al. (1999) The 2588G->C mutation in the ABCR gene is a mild frequent founder mutation in the Western European population and allows the classification of ABCR mutations in patients with Stargardt disease. Am. J. Hum. Genet., 64.
9. Rivera, A., White, K., Stohr, H., Steiner, K., Hemmrich, N., Grimm, T., Jurklies, B., Lorenz, B., Scholl, H.P., Apfelstedt-Sylla, E. et al. (2000) A comprehensive survey of sequence variation in the ABCA4 (ABCR) gene in Stargardt disease and age-related macular degeneration. Am. J. Hum. Genet., 67.
10. Valverde, D., Riveiro-Alvarez, R., Bernal, S., Jaakson, K., Baiget, M., Navarro, R. and Ayuso, C. (2006) Microarray-based mutation analysis of the ABCA4 gene in Spanish patients with Stargardt disease: evidence of a prevalent mutated allele. Mol. Vis., 12.
11. Rosenberg, T., Klie, F., Garred, P. and Schwartz, M. (2007) N965S is a common ABCA4 variant in Stargardt-related retinopathies in the Danish population. Mol. Vis., 13.
12. Chacon-Camacho, O.F., Granillo-Alvarez, M., Ayala-Ramirez, R. and Zenteno, J.C. (2013) ABCA4 mutational spectrum in Mexican patients with Stargardt disease: identification of 12 novel mutations and evidence of a founder effect for the common p.A1773V mutation. Exp. Eye Res., 109.
13. Zernant, J., Schubert, C., Im, K.M., Burke, T., Brown, C.M., Fishman, G.A., Tsang, S.H., Gouras, P., Dean, M. and Allikmets, R. (2011) Analysis of the ABCA4 gene by next-generation sequencing. Invest. Ophthalmol. Vis. Sci., 52.
14. Braun, T.A., Mullins, R.F., Wagner, A.H., Andorf, J.L., Johnston, R.M., Bakall, B.B., Deluca, A.P., Fishman, G.A., Lam, B.L., Weleber, R.G. et al. (2013) Non-exomic and synonymous variants in ABCA4 are an important cause of Stargardt disease. Hum. Mol. Genet., 22.
15. Yatsenko, A.N., Shroyer, N.F., Lewis, R.A. and Lupski, J.R. (2003) An ABCA4 genomic deletion in patients with Stargardt disease. Hum. Mutat., 21.
16. Stenirri, S., Battistella, S., Fermo, I., Manitto, M.P., Martina, E., Brancato, R., Ferrari, M. and Cremonesi, L. (2006) De novo deletion removes a conserved motif in the C-terminus of ABCA4 and results in cone-rod dystrophy. Clin. Chem. Lab. Med., 44.
17. Kircher, M., Witten, D.M., Jain, P., O'Roak, B.J., Cooper, G.M. and Shendure, J. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet., 46.
18. Kong, J., Kim, S.R., Binley, K., Pata, I., Doi, K., Mannik, J., Zernant-Rajang, J., Kan, O., Iqball, S., Naylor, S. et al. (2008) Correction of the disease phenotype in the mouse model of Stargardt disease by lentiviral gene therapy. Gene Ther., 15.
19. Sun, H., Smallwood, P.M. and Nathans, J. (2000) Biochemical defects in ABCR protein variants associated with human retinopathies. Nat. Genet., 26.
20. Shroyer, N.F., Lewis, R.A., Yatsenko, A.N., Wensel, T.G. and Lupski, J.R. (2001) Cosegregation and functional analysis of mutant ABCR (ABCA4) alleles in families that manifest both Stargardt disease and age-related macular degeneration. Hum. Mol. Genet., 10.
21. Quazi, F. and Molday, R.S. (2013) Differential phospholipid substrates and directional transport by ATP-binding cassette proteins ABCA1, ABCA7, and ABCA4 and disease-causing mutants. J. Biol. Chem., 288.
22. Quazi, F. and Molday, R.S. (2014) ATP-binding cassette transporter ABCA4 and chemical isomerization protect photoreceptor cells from the toxic accumulation of excess 11-cis-retinal. Proc. Natl. Acad. Sci. U. S. A., 111.
23. Liu, Y.H., Li, C.G. and Zhou, S.F. (2009) Prediction of deleterious functional effects of non-synonymous single nucleotide polymorphisms in human nuclear receptor genes using a bioinformatics approach. Drug Metab. Lett., 3.
24. Hardison, R.C. and Taylor, J. (2012) Genomic approaches towards finding cis-regulatory modules in animals. Nat. Rev. Genet., 13.
25. Utz, V.M., Chappelow, A.V., Marino, M.J., Beight, C.D., Sturgill-Short, G.M., Pauer, G.J., Crowe, S., Hagstrom, S.A. and Traboulsi, E.I. (2013) Identification of three ABCA4 sequence variations exclusive to African American patients in a cohort of patients with Stargardt disease. Am. J. Ophthalmol., 156.
26. 1000 Genomes Project Consortium, Abecasis, G.R., Auton, A., Brooks, L.D., DePristo, M.A., Durbin, R.M., Handsaker, R.E., Kang, H.M., Marth, G.T. and McVean, G.A. (2012) An integrated map of genetic variation from 1,092 human genomes. Nature, 491.
27. Gonzaga-Jauregui, C., Zhang, F., Towne, C.F., Batish, S.D. and Lupski, J.R. (2010) GJB1/connexin 32 whole gene deletions in patients with X-linked Charcot-Marie-Tooth disease. Neurogenetics, 11.


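[Table residue, array design loci. Primary loci: ABCA4, ELOVL4, PRPH2 (RDS), BEST1 (VMD2), RS1, CNGB3, CFH (a), C2/CFB (b), ARMS2, ARMS2/HTRA1 (c). Secondary loci: C3, APOE, CFI, LIPC, SYN3/TIMP3, CETP, COL8A1, BBX, PLD1, SPEF2, ADAM19, VEGFA, FRK, MEPCE, CHMP7/LOXL2, TGFBR1, NPS, PICK1. Footnotes: (a) CFH, F13B; (b) C2, C4A, C4B.]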

CHAPTER III.3 Complex inheritance of ABCA4 disease: four mutations in a family with multiple macular phenotypes (Published Paper)

Hum Genet (2016) 135:9–19
DOI /s y

ORIGINAL INVESTIGATION

Complex inheritance of ABCA4 disease: four mutations in a family with multiple macular phenotypes

Winston Lee1, Yajing Xie1, Jana Zernant1, Bo Yuan2, Srilaxmi Bearelly1, Stephen H. Tsang1,3, James R. Lupski2, Rando Allikmets1,3

Received: 17 July 2015 / Accepted: 1 October 2015 / Published online: 2 November 2015
© Springer-Verlag Berlin Heidelberg 2015

* Rando Allikmets, rla22@cumc.columbia.edu
1 Department of Ophthalmology, Columbia University, New York, NY, USA
2 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
3 Department of Pathology and Cell Biology, Columbia University, New York, NY, USA

Abstract

Over 800 mutations in the ABCA4 gene cause autosomal recessive Stargardt disease. Due to extensive genetic heterogeneity, observed variant-associated phenotypes can manifest tremendous variability of expression. Furthermore, the high carrier frequency of pathogenic ABCA4 alleles in the general population (~1:20) often results in pseudo-dominant inheritance patterns, further complicating the diagnosis and characterization of affected individuals. This study describes a genotype/phenotype analysis of an unusual family with multiple macular disease phenotypes spanning two generations and segregating four distinct ABCA4 mutant alleles. Complete sequencing of ABCA4 discovered two known missense mutations, p.C54Y and p.G1961E. Array comparative genomic hybridization revealed a large novel deletion combined with a small insertion, c _c.6670del/insTGTGCACCTCCCTAG, and complete sequencing of the entire ABCA4 genomic locus uncovered a new deep intronic variant, c c>t. Patients with the p.G1961E mutation had the mildest, confined maculopathy phenotype with peripheral flecks, while those with all other mutant allele combinations exhibited a more advanced stage of generalized retinal and choriocapillaris atrophy. This family epitomizes the clinical and genetic complexity of ABCA4-associated diseases. It contained variants from all classes of mutations (coding and deep intronic, single nucleotide variants and copy number variants) that accounted for varying phenotypes segregating in an apparently dominant fashion. Unequivocally defining disease-associated alleles in the ABCA4 locus requires a multifaceted approach that includes advanced mutation detection methods and a thorough analysis of clinical phenotypes.

Introduction

Mutations in the ABCA4 gene are responsible for a wide range of autosomal recessive retinal dystrophy phenotypes, from Stargardt disease (STGD1; OMIM #248200) (Allikmets et al. 1997) to cone-rod dystrophy (CRD) (Cremers et al. 1998; Maugeri et al. 2000) and, in some advanced cases, generalized choriocapillaris dystrophy (GCCD) (Bertelsen et al. 2014) and pan-retinal dystrophies with extensive pigment migration resembling retinitis pigmentosa (RP) (Cremers et al. 1998; Martinez-Mir et al. 1998; Shroyer et al. 2001). STGD1 is a predominantly juvenile-onset macular dystrophy associated with central visual impairment, progressive bilateral atrophy of the retinal pigment epithelium, and the accumulation of yellow, pisciform flecks, defined as lipofuscin deposits, in or around the macula or posterior pole of the retina. Complete sequencing of the ABCA4 coding and adjacent intronic sequences in patients with STGD1 usually identifies the expected two disease-associated alleles in % of patients, one mutation in ~15–20 %, and no mutations in the remaining ~15 % (Zernant et al. 2011). Clinically diagnosed cases of STGD1 with no detected ABCA4 mutations most often represent phenocopies (Zernant et al.
2011, 2014), whereby mutations in other gene(s) manifest a STGD1-like phenotype (Riveiro-Alvarez et al. 2015; Tsang et al. 2014; Yamamoto et al. 2014). However, in STGD1 cases with one ABCA4 mutant allele, the second usually resides in the ABCA4 locus. Such non-coding alleles in ABCA4 belong to two main classes of mutations: (1) copy number variants (CNV), i.e., large deletions or insertions of one exon or more, which elude detection by PCR-based sequencing techniques, and (2) deep intronic variants more than 10 bp away from exons, i.e., outside of splice sites. While the ABCA4 gene and locus show extensive variability in single nucleotide variants (SNV), large CNVs are rare in the ABCA4 gene (<1 % of all disease-associated alleles); there have been only a few reports describing these (Stenirri et al. 2006; Yatsenko et al. 2003; Zernant et al. 2011, 2014). Recently, many deep intronic disease-associated variants which may affect splicing or other regulatory functions have been described (Braun et al. 2013; Zernant et al. 2014). However, these are also individually very rare and often difficult to associate unequivocally with the disease due to the inaccessibility of ABCA4 RNA (the gene/protein is expressed only in photoreceptors) and the lack of a direct functional assay (Zernant et al. 2014). All of the above make the molecular analysis of the ABCA4 locus in STGD1 patients very challenging. Moreover, although all ABCA4-associated diseases are recessive, the high carrier frequency of pathogenic ABCA4 alleles in the general population (1:20) (Maugeri et al. 1999; Yatsenko et al. 2001; Zernant et al. 2011) can often result in a dominant-appearing pattern of inheritance in families.

Here, we describe a genotype/phenotype analysis of a large family presenting with four distinct macular disease phenotypes spanning two successive generations. Four distinct ABCA4 mutant alleles, two of which are novel, were identified from three classes of mutations; these account for two of the four phenotypes and for the seemingly pseudo-dominant inheritance pattern. This study demonstrates the genetic and clinical complexity of ABCA4-associated disorders and highlights the extensive molecular analyses required to unequivocally solve such challenging cases involving multiple mutations and phenotypes in a single family.

Materials and methods

Patients and clinical evaluation

Ten members of a two-generation family were consented and enrolled under the protocol IRB-AAAI9906, approved by the Institutional Review Board at Columbia University and adhering to the tenets of the Declaration of Helsinki. Each affected and unaffected relative underwent a complete ophthalmic examination by a retina specialist (S.H.T. and S.B.), which included fundus autofluorescence (AF) images obtained using a confocal scanning-laser ophthalmoscope (cSLO, Heidelberg Retina Angiograph 2, Heidelberg Engineering, Dossenheim, Germany) by illuminating the fundus with argon laser light (488 nm) and viewing the resultant fluorescence through a band pass filter with a short wavelength cutoff at 495 nm. Simultaneous AF and spectral domain-optical coherence tomography (SD-OCT) images were acquired using a Spectralis HRA + OCT (Heidelberg Engineering, Heidelberg, Germany). Color fundus photography was acquired using an FF 450plus Fundus Camera (Carl Zeiss Meditec AG, Jena, Germany). Full-field electroretinograms (ERG) were acquired with the Diagnosys Espion Electrophysiology System (Diagnosys LLC, Littleton, MA, USA). For all recordings, the pupils were maximally dilated before full-field ERG testing using guttate tropicamide (1 %) and phenylephrine hydrochloride (2.5 %), and the corneas were anesthetized with guttate proparacaine 0.5 %. Silver impregnated fiber electrodes (DTL; Diagnosys LLC, Littleton, MA) were used with a ground electrode on the forehead. Full-field ERGs to test generalized retinal function were performed using extended testing protocols incorporating the International Society for Clinical Electrophysiology of Vision standard (McCulloch et al. 2015).
Sequencing

Patient samples were analyzed by the Illumina TruSeq Custom Amplicon target enrichment strategy followed by sequencing on the Illumina MiSeq platform (Illumina, San Diego, CA). For sequencing of the ABCA4 gene (coding region), 58 amplicons of 425 bp were designed, resulting in an 11,258 bp cumulative target with 100 % coverage. For sequencing of the entire ABCA4 locus, the genomic region 94,456,700–94,591,600 on chromosome 1 was targeted, including the ABCA4 locus, 4895 bp of 5′ UTR and 1694 bp of 3′ UTR sequences. The genomic region was divided into 9 targets, with seven 500 bp gaps and one 1100 bp gap introduced by DNA repeats. The cumulative target included 130,319 bp and was covered at 94 % by 421 amplicons of 425 bp each. The next-generation sequencing reads were analyzed with the NextGENe variant discovery software (SoftGenetics, State College, PA). The reads were aligned against the targeted region in the haploid reference genome GRCh37/hg19. For a variant to be called, we required the read containing the variant to match at least 85 % to the aligned position, the variant to be covered by at least 10 reads and the variant to be present in at least 20 % of all reads aligned to the given position. We also used the NextGENe overall confidence score threshold of 10, and the previously determined ABCA4 coding variants served as controls for setting further filtering criteria.
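
The calling criteria above can be read as a simple per-variant filter. The following is a minimal Python sketch under stated assumptions: the record fields (read_match, depth, alt_reads, score) and the function passes_filters are hypothetical and are not part of NextGENe, and each threshold is interpreted as an inclusive minimum, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical per-variant summary; the field names are illustrative only."""
    read_match: float  # fraction of the variant-bearing read that matches its aligned position
    depth: int         # total number of reads aligned at the variant position
    alt_reads: int     # number of those reads that carry the variant
    score: float       # overall confidence score reported by the variant caller


def passes_filters(call, min_match=0.85, min_depth=10,
                   min_fraction=0.20, min_score=10.0):
    """Return True if the call meets all four thresholds (assumed to be inclusive)."""
    # Reject uncovered or under-covered positions before dividing by depth.
    if call.depth <= 0 or call.depth < min_depth:
        return False
    fraction = call.alt_reads / call.depth
    return (call.read_match >= min_match
            and fraction >= min_fraction
            and call.score >= min_score)
```

As an illustration with made-up numbers, `passes_filters(VariantCall(0.92, 40, 12, 15.0))` keeps a variant seen in 12 of 40 reads, while `passes_filters(VariantCall(0.92, 100, 10, 15.0))` rejects one seen in only 10 of 100 reads. Keeping the thresholds as keyword arguments makes it easy to re-tune them against known control variants, as the text describes.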

Analysis of the ABCA4 variants

All variants and their allele frequencies were compared to the 1000 Genomes database (Genomes Project et al. 2012) and to the Exome Variant Server (EVS) dataset, NHLBI Exome Sequencing Project, Seattle, WA, USA (gs.washington.edu/evs/; accessed April 2015). New variants that were not recorded in these databases were further analyzed by a combination of predictive in silico methods and statistical analyses. The possible effect of the two new non-coding ABCA4 variants on splicing was assessed using 5 different algorithms (SpliceSiteFinder, MaxEntScan, NNSPLICE, GeneSplicer, Human Splicing Finder) via Alamut software. To assess the regulatory potential of the new ABCA4 intronic variants, we compared their chromosome coordinates against the predicted regulatory regions from two ENCODE datasets: (1) combined DNaseI hypersensitivity clusters from 125 cell types ("Digital DNaseI Hypersensitivity Clusters in 125 cell types from ENCODE", genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid= &c=chr1&g=wgEncodeRegDnaseClusteredV2, filename: wgEncodeRegDnaseClusteredV2.bed.gz); and (2) ChIP-seq clustered regions for 161 transcription factors in 91 cell types ("Transcription Factor ChIP-seq V4 (161 factors) with Factorbook motifs from ENCODE", ENCODE project; Date submitted , cgi-bin/hgTrackUi?hgsid= &c=chr1&g=wgEncodeRegTfbsClusteredV3, filename: wgEncodeRegTfbsClusteredV3.bed.gz). Evolutionary conservation of the variants was noted via the UCSC Genome Browser (genome.ucsc.edu). The Combined Annotation-Dependent Depletion (CADD) algorithm was used to estimate the combined predicted general deleteriousness of every variant. Segregation of the variants with the disease was analyzed by Sanger sequencing.

Array comparative genome hybridization (aCGH)

Custom CGH arrays (Agilent Technologies), in an 8 × 60K format, were designed with high-density probes tiling the critical genetic locus of ABCA4-associated diseases.
The details of the array design have been described previously (Zernant et al. 2014). Array CGH analysis was performed on DNA from patients I-3, II-1 and II-3, for each of whom only one mutation had been found by sequencing of the entire ABCA4 coding region. A DNA sample with a known, previously reported 1030 bp heterozygous deletion of exon 18 in ABCA4 was used as a positive control (Yatsenko et al. 2003). Experimental procedures for aCGH were performed as described previously (Gonzaga-Jauregui et al. 2010) with minor modifications. Agilent Genomic Workbench version software (Agilent Technologies) was used for data analysis. CNVs were called using the ADM-2 algorithm with a threshold of 4.0. PCR validations were performed for the plausible true-positive CNVs after they had been filtered against several criteria. TaKaRa LA Taq (Clontech) was used for the PCR amplifications. Sanger sequencing was performed on the PCR products. DNA sequences were compared to the reference genome (hg19) to map the breakpoint junctions. PCR primers used for breakpoint mapping were: ABCA4-bkpt-F, 5′-ACCCCAATAAACAGAGGGCAAGAGTT-3′; ABCA4-bkpt-R, 5′-TTTAGGAGTGAAGGGCTGTGATGAGT-3′. A PCR genotyping assay using the same pair of breakpoint junction-specific primers was performed for the family members with available DNA samples.

Results

Patients and disease phenotypes

Ten members of a two-generation family presented to the clinic with a history of dominant retinal disease (Fig. 1, pedigree). A summary of demographic and genetic characteristics is provided in Table 1 (below). Fundus examinations were abnormal in five members of the family, who exhibited retinal phenotypes within the spectrum described for ABCA4-associated disease. The proband (II-1), a 43-year-old man of German descent, and his paternal aunt (I-4) and uncle (I-3) reported experiencing symptoms of uncorrectable vision loss within the first decade of life.
At the time of examination, each was found to have poor best-corrected visual acuity (BCVA), ranging from 20/400 to hand motion and peripheral light perception with eccentric fixation. Funduscopy and autofluorescence imaging in the proband revealed a large area of well-delineated chorioretinal atrophy and central pigment clumping within the vascular arcades. The macular lesions in each eye were circumscribed by a dense border of hyperautofluorescence adjacent to a reticular pattern of granular flecks in the periphery. The peripapillary region around the optic nerve was notably spared of any apparent disease-associated changes (Fig. 2a, yellow arrowheads). Full-field electroretinogram (ERG) testing in the proband revealed significant reductions in both rod and cone function, indicative of advanced-stage, generalized photoreceptor loss throughout the retina (Fig. 2b). Atrophic chorioretinal lesions in the proband's aunt and uncle were comparatively more advanced, extending further into the posterior pole of the retina. Retinal vessels were visibly attenuated and residual areas of fundus tissue (RPE) around the numerous lesions were speckled with densely resorbed, hyperautofluorescent flecks. The peripapillary regions in both were affected with flecks or partially spared, respectively (Fig. 2c, d, yellow arrowheads).

The paternal cousins of the proband (sons of the uncle) presented with a comparatively milder ABCA4 phenotype. Aged 36 years (II-3) and 39 years (II-2) at the time of examination, both reported an onset of symptoms within the third decade of life, signifying a shorter disease duration than in the other affected relatives described above. Their phenotypes were similar to one another; measured BCVA ranged from 20/100 to 20/150 in both eyes. A lesion of mottled atrophy and photoreceptor loss was confined to the central macula in each eye (Fig. 3a, b) with a scattered, confluent pattern of subretinal pisciform flecks centrifugally distributed throughout the mid-periphery. Sparing of the peripapillary region was also apparent in both cases. A comparative summary of ABCA4-associated phenotypic features in these affected individuals is provided in Table 2.

Fig. 1 Pedigree of the family illustrating the segregation of four disease-causing ABCA4 alleles with two Stargardt disease (STGD1) phenotypes of varying severity. Filled squares and circles denote affected males and females, respectively, and centrally filled shapes denote heterozygous carriers. ABCA4 variants, and their order on the chromosome, are listed in the key. The symbol (+) denotes the presence of a disease-causing ABCA4 variant while (−) represents the wild-type allele. The mother of the proband (I-2, arrowhead) was affected with pattern macular dystrophy, which was phenotypically distinct from the other affected individuals, and the father (I-1) was affected with chronic central serous chorioretinopathy.

Table 1 Demographic and genetic characteristics of individuals within the family
Subject | Age (y) | Relationship to proband | Gender | Diagnosis | ABCA4 mutation (allele 1, allele 2)
I-1 | 75 | Father | M | CSC/AMD | c.[302+68C>T; C>T]
I-2 | 75 | Mother | F | PD | p.C54Y
I-3 | 69 | Uncle (P) | M | STGD1 | c _c.6670del/insTGTGCACCTCCCTAG, c.[302+68C>T; C>T]
I-4 | 68 | Aunt | F | - | p.G1961E
I-5 | 67 | Aunt (P) | F | STGD1 | c _c.6670del/insTGTGCACCTCCCTAG, c.[302+68C>T; C>T]
I-6 | 63 | Uncle (P) | M | - | c _c.6670del/insTGTGCACCTCCCTAG
I-7 | 59 | Aunt | F | - | c _c.6670del/insTGTGCACCTCCCTAG
II-1 | 43 | Proband | M | STGD1 | p.C54Y, c.[302+68C>T; C>T]
II-2 | 40 | Cousin | M | STGD1 | c.[302+68C>T; C>T], p.G1961E
II-3 | 37 | Cousin | M | STGD1 | c _c.6670del/insTGTGCACCTCCCTAG, p.G1961E
y years, (P) paternal lineage, M male, F female, CSC central serous chorioretinopathy, AMD age-related macular degeneration, PD pattern dystrophy, STGD1 Stargardt disease

Fig. 2 Advanced ABCA4 phenotype of the proband, paternal aunt and uncle. a The proband, a 43-year-old man of German descent harboring the p.C54Y and c.[302+68C>T; C>T] variants, presented with large well-delineated lesions of chorioretinal atrophy and central pigment clumping in both eyes on autofluorescence imaging and color fundus photographs. Surrounding areas of dense hyperautofluorescence preceding a reticular pattern of granular flecks were found throughout the periphery. b Full-field electroretinogram testing in the proband (II-1) revealed significantly reduced amplitudes in rod and cone responses in the right (blue trace) and left (red trace) eyes when compared to an age-matched control (dotted gray trace). c The paternal aunt (c.[302+68C>T; C>T]; c _c.6670del/insTGTGCACCTCCCTAG) presented with a similarly large area of chorioretinal atrophy, while d the paternal uncle with the same compound heterozygous genotype exhibited generalized atrophy of the posterior pole. Various degrees of sparing of the peripapillary region in each patient are marked with yellow arrowheads.

Fig. 3 The paternal cousin (36-year-old son of the paternal uncle) harboring the p.G1961E allele and the c _c.6670del/insTGTGCACCTCCCTAG variant presented at a comparatively milder disease stage. a Autofluorescence imaging revealed a localized lesion of retinal pigment epithelium mottling and photoreceptor loss in the central macula, b as correlated with spectral domain-optical coherence tomography. The lesion is surrounded by a confluent pattern of round, pisciform flecks centrifugally distributed across the mid-periphery of the retina.

Table 2 Summary of ABCA4-associated phenotypes in affected individuals
Subject | Onset (y) | Duration (y) | BCVA OD | BCVA OS | Geographic atrophy | Extent of atrophy | Fleck distribution | Peripapillary sparing | ABCA4 allele 1 | ABCA4 allele 2
I-3 | | | CF | CF | Posterior pole | Choriocapillaris | Resorbed | Partial | del/ins (a) | intronic (b)
I-5 | | | CF | 20/400 | Extra-macular | Choriocapillaris | Resorbed | Partial | del/ins (a) | intronic (b)
II-1 | | | 20/400 | 20/200 | Macular | Choriocapillaris | Resorbed/reticular | Spared | p.C54Y | intronic (b)
II-2 | | | 20/100 | 20/100 | n/a | Outer retina | Scattered | Spared | intronic (b) | p.G1961E
II-3 | | | 20/150 | 20/150 | n/a | Outer retina | Scattered | Spared | del/ins (a) | p.G1961E
y years, BCVA best-corrected visual acuity, OD right eye, OS left eye, CF counting fingers, n/a not applicable
(a) c _c.6670del/insTGTGCACCTCCCTAG
(b) c.[302+68C>T; C>T]

Remarkably, both parents of the proband were also affected with macular diseases; however, their phenotypes are not usually associated with ABCA4 mutations. The mother (I-2) reported a late onset of visual symptoms and had measured BCVAs of 20/20 and 20/30 in the right and left eyes, respectively. A confined area of central atrophy surrounded by large amorphous hyperautofluorescent flecks was found in each eye. The fovea was spared of disease-associated changes in both eyes, which likely accounted for her preserved visual acuity and stable fixation. It was noted that atrophy from the macular lesion extended to the peripapillary region around the optic disc (Fig. 4a), resulting in a phenotype described as pattern dystrophy. The proband's father (I-1), aged 75 years, also presented with a recent onset of macular findings consistent with nonexudative central serous chorioretinopathy in the left eye (Fig. 4b); SD-OCT revealed a minor pigment epithelial detachment in this eye (Fig. 4c). Minor pigmentary changes and drusenoid deposits were found in both eyes, which are likely attributable to the onset of dry age-related macular degeneration (AMD).

Discovery of new disease-associated variants by next-generation sequencing

Sequencing of the ABCA4 gene and the entire genomic locus in patients II-1 and II-3, at an average depth of coverage of 100×, identified the disease-associated missense variants p.C54Y and p.G1961E, respectively. Both of these variants are known to cause STGD1, p.G1961E being the most frequent disease-associated allele (Burke et al. 2012). Interestingly, both of these mutations originated from the carrier mothers (I-2, I-4) of the two STGD1 patients (Fig. 1). Locus sequencing in II-1 detected a new c C>T variant and a previously described c C>T allele (Braun et al. 2013; Zernant et al. 2014). Both of these variants were on the same chromosome, forming a complex allele. This haplotype was on the other allele from the p.C54Y variant in patient II-1, and was also detected in 5/350 other STGD1 cases, all of whom had a second ABCA4 mutation. According to in silico analyses, neither c C>T nor c C>T was predicted to have any effect on splicing, either involving existing cryptic splice sites or creating new sites. 
This was also confirmed by RNA analysis from leukocytes via an illegitimate transcript experimental approach (data not shown); however, this method has been shown to have limited value for the analysis of the ABCA4 gene, which is expressed only in photoreceptors, often producing results that can lead to erroneous interpretations (Albert et al. 2015). Comparing these two positions and flanking sequences in primates reveals that c C>T lies in a relatively less conserved area than c C>T. Neither variant was detected in the 1000 Genomes Project, and both segregated with the disease in this large family (and in 5 other STGD1 patients), making them very likely candidates for intronic ABCA4 mutations.

Analysis of regulatory sequences

To assess the potential functional effects of the two newly described ABCA4 intronic variants on putative regulatory regions, we compared their location against the chromosome coordinates of the DNaseI hypersensitivity and transcription factor-binding clusters from ENCODE. The c C>T variant maps to a region of weak DNaseI hypersensitivity (score 192; scale from 0 to 1000), while the c C>T variant overlaps neither DNaseI hypersensitivity nor transcription factor-binding consensus sequences. The new ABCA4 intronic variants were also subjected to the Combined Annotation-Dependent Depletion (CADD) algorithm. The CADD algorithm combines a diverse array of annotations into one metric (C score) for each variant, which correlates with allelic diversity, pathogenicity, and experimentally measured regulatory effects (Kircher et al. 2014); a score >10 indicates that the variant is among the top 10% of the most deleterious substitutions in the human genome. C scores for c C>T and c C>T were similar, 3.15 and 3.5, respectively, suggesting no significant deleterious effect for either of these variants.

Fig. 4 Macular findings in the mother and father of the proband, who each carry a single ABCA4 variant, p.C54Y and c.[302+68C>T; C>T], respectively. a The mother presented with a confined lesion of atrophy and large hyperfluorescent flecks sparing the fovea in each eye. Notably, atrophy extends into the peripapillary region of the optic nerve in both eyes. b The father exhibited a recent onset of central serous chorioretinopathy in the left eye with fluid and pigment epithelial detachment over the fovea. Pigmentary changes noted in the macula are likely attributable to early, dry age-related macular degeneration.

Analysis of the copy number variants by aCGH arrays

Copy number variations in ABCA4, specifically in the form of large deletions encompassing one or more exons, have been shown to account for a small portion of Stargardt disease alleles (Yatsenko et al. 2003). Although such reports have been rare, we screened patients I-3, II-1 and II-3 using custom-designed high-density array CGH, which revealed a ~5 kb heterozygous deletion at the ABCA4 locus (Fig. 5a) in patients I-3 and II-3. Breakpoint junction sequencing mapped the precise genomic interval (Chr1: 94,463,476–94,468,246), revealing a 4771 bp deletion (Fig. 5b). A short inserted DNA fragment (TGTGCACCTCCCTAG) was revealed at the deletion breakpoint junction, indicating a deletion/insertion haplotype. The insertion could be split into two halves; both were potentially from regions close to the proximal end of the deletion (Fig. 5b). Microhomologies were identified at the breakpoint junctions (Fig. 5b).
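The reported deletion size can be reproduced from the breakpoint coordinates with simple interval arithmetic. The sketch below assumes that the quoted coordinates are 1-based and that both endpoints fall within the deleted segment; under that assumption the interval length matches the 4771 bp stated in the text. The function name `inclusive_length` is hypothetical, and the net change is a genomic length only, with no claim about the coding sequence.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Values quoted in the text (chr1, hg19).
DEL_START, DEL_END = 94_463_476, 94_468_246
INSERTED = "TGTGCACCTCCCTAG"  # fragment found at the breakpoint junction

deleted_bp = inclusive_length(DEL_START, DEL_END)
inserted_bp = len(INSERTED)
net_change_bp = inserted_bp - deleted_bp  # negative: net loss of genomic sequence

print(deleted_bp, inserted_bp, net_change_bp)  # prints: 4771 15 -4756
```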

Fig. 5 The deletion/insertion variant in the ABCA4 locus in the family. a aCGH log2 ratio plot showing the heterozygous deletion identified in patient II-3. The breakpoint junction sequence alignment underneath the plot reveals two inserted DNA fragments (green) conjugated by microhomologies (purple). The potential origins of the inserted DNA fragments are marked with underlines labeled 1 and 2, respectively. DIST/PROX, human genome reference sequences at the distal/proximal end of the junction. b Potential molecular mechanism explaining the deletion etiology. FoSTeS fork stalling and template switching. c Pedigree of the family identified with the deletion/insertion variant allele (c _c.6670del/insTGTGCACCTCCCTAG). The gel pictures underneath the pedigree demonstrate the segregation of the deletion allele with three other potential disease-causing alleles identified in this family. wt wild-type allele, del deletion/insertion variant allele.

These findings suggest fork stalling and template switching/microhomology-mediated break-induced replication (FoSTeS/MMBIR) as the potential CNV-generating mechanism via three template switches (Hastings et al. 2009; Lee et al. 2007; Zhang et al. 2009). The deletion/insertion spanned from the intron upstream of exon 45 to exon 48. Exons 45–47 were completely deleted, while exon 48 was partially deleted. Integrating the insertion, the variant may be represented as c _c.6670del/insTGTGCACCTCCCTAG. Genotyping analysis of c _c.6670del/insTGTGCACCTCCCTAG revealed its presence in unaffected individuals I-6 and I-7, who were carriers of this variant, and in affected individuals I-3 and I-5 in combination with a second variant consisting of a complex deep intronic mutant allele c.[302+68C>T; C>T] (Fig. 5c).

Discussion

The ABCA4 gene has been the subject of intensive genetic research since it was first described as the causal gene for STGD1 18 years ago (Allikmets et al. 1997). 
Similarly, phenotypes caused by ABCA4 mutant alleles have also been extensively studied and found to be notable for a high degree of variability of expression (Fujinami et al. 2014, 2015; Mullins et al. 2012; Noupuu et al. 2014; Zahid et al. 2013). Genetic analyses are challenging due to the size of the gene and the extensive genetic heterogeneity. Moreover, the gene is expressed only in rod and cone photoreceptors, and photoreceptor samples cannot be obtained in vivo from patients for RNA analyses. Without patient RNA for direct structural and expression studies, the analysis is limited to the assessment of variant frequencies in STGD1 patients and in matched general populations, to in silico suggestions by predictive software programs, and to segregation analyses in families. The latter approach is often complicated because variants in ABCA4 coding and non-coding regions are extremely rare, most often represented in singleton cases and/or in nuclear families where segregation analysis has limited power.

In the current study, we analyzed the genotypes and phenotypes in a large two-generation family segregating two distinct forms of STGD1 in a pseudo-dominant inheritance pattern. The disease in its variable expression was due to different combinations of four ABCA4 mutant alleles: two well-known missense mutations and two new variant alleles, a large deletion/insertion (including exons 45–48) and a complex allele containing two deep intronic variants. Segregation of the four mutant alleles explained the seemingly dominant (in fact, pseudo-dominant) inheritance pattern and variable expression of the disease. The complex allele with the two deep intronic variants (c.[302+68C>T; C>T]) appeared to impart an aggressive, early-onset phenotype in patients II-1, I-3 and I-5 (Fig. 2). The deletion variant (c _c.6670del/insTGTGCACCTCCCTAG) is also likely a null allele since it causes a shift of the reading frame and is predicted to result in protein truncation (Fig. 5); however, II-2 and II-3 presented with comparatively milder disease phenotypes (Fig. 3) despite harboring the intronic and deletion variants, respectively. This is likely due to the p.G1961E mutation shared by II-2 and II-3 on the opposite, maternal allele, which has been previously associated with a late-onset, milder disease phenotype characterized by more localized disease confined to the central macula (Burke et al. 2012; Cella et al. 2009). This apparent resistance to disease severity conferred by p.G1961E is clearly exemplified in this family; this could potentially be attributed to G1961E representing a hypomorphic allele, although its precise mechanism remains to be elucidated (Allikmets 2000; Lewis et al. 1999).

While neither of the two deep intronic variants c C>T and c C>T, which compose a complex allele, had an effect on splicing or regulatory elements as determined by in silico analyses, the complex allele segregated with the disease. Moreover, this complex allele has also been detected as the second ABCA4 disease-associated allele in five more cases from our large (>700 cases) STGD1 cohort at Columbia University (data not shown). In addition, there is precedent at the ABCA4 locus for complex alleles to have more severe functional consequences than the individual constituent variants and also to have more severe clinical phenotypic consequences (Shroyer et al. 2001). The assessment of a potential effect on splicing using prediction programs is correct in only % of variants examined (Liu et al. 2009). The same is true for predicting the effect of intronic variants on regulatory elements affecting gene expression, such as transcription factor-binding sites, enhancers, promoters, etc. (Hardison and Taylor 2012). 
A good example of a disease-associated intronic variant in the ABCA4 locus which, until most recently, had no proven functional effect is the c T>C variant in intron 38, which is one of the most frequently observed variants in STGD1 patients (Zernant et al. 2011). This variant always segregates with the disease phenotype in families, is very rare in the general population (<0.001) and was shown not to affect splicing using the minigene approach (Rivera et al. 2000). Therefore, it had been assumed that this variant is not itself a pathogenic allele, but is in linkage disequilibrium (LD) with an unknown ABCA4 variant. However, it was most recently shown, by obtaining RNA from iPS-derived photoreceptor progenitor cells from patients carrying this allele, that the variant causes skipping of exons 39 and 40 in the ABCA4 gene (Albert et al. 2015). Taken together, these data strongly suggest pathogenic consequences for the deep intronic complex allele. Unequivocal proof will be obtained by analyzing RNA derived from patients' iPS cells, as described (Albert et al. 2015).

The two additional macular phenotypes found in patients I-1 and I-2 (Fig. 4), which were chronic central serous chorioretinopathy (CSC) and pattern dystrophy (PD), respectively, initially provided yet another challenge in assessing the disease entity in the family. While we were not able to determine the exact genetic cause of the phenotypes in these patients, we speculate that the heterozygous ABCA4 variants were not causal in these cases. The genetic cause of CSC is largely unknown. Variants in the CDH5 gene, currently the only gene reliably associated with the CSC phenotype (Schubert et al. 2014), were not detected in patient I-1. PD, as presented in patient I-2, refers to a group of slowly progressive macular diseases that manifest with minimal diminution of visual acuity and photoreceptor atrophy in older individuals (Marmor and McNamara 1996; Watzke et al. 1982). 
Many PD cases have been attributed to mutations in the PRPH2/RDS gene (Boon et al. 2007; Francis et al. 2005). The genetic cause of PD varies, and the condition is often difficult to diagnose based on phenotype alone. However, while the patient carried one ABCA4 allele, p.C54Y, and exhibited certain phenotypic characteristics similar to STGD1, the distinct lack of peripapillary sparing (a pathognomonic characteristic of STGD1; Cideciyan et al. 2005) is clinically contraindicative of ABCA4-associated disease. No other potentially disease-associated variants in either the ABCA4 or the PRPH2 gene were detected in this patient.

Due to the high carrier frequency of disease-causing ABCA4 alleles in the general population (1:20), pseudo-dominant inheritance in STGD1 has been described (Cremers et al. 1998; Lewis et al. 1999; Yatsenko et al. 2001). However, similar families mostly occur in isolated populations with an elevated occurrence of consanguineous marriages, and contain no more than 3 ABCA4 mutations (Cremers et al. 1998; Yatsenko et al. 2001). The family reported here, which segregates four pathogenic ABCA4 variant alleles, (i) is derived from an outbred population and (ii) harbors two known ABCA4 missense mutations that were introduced into the family by marriage, which is a very rare event (1:400).

In summary, the analysis of the entire ABCA4 locus in a large family with multiple members presenting four very different phenotypes in a seemingly dominant fashion revealed four different ABCA4 variants belonging to 3 separate classes of mutations segregating with the disease and accounting for at least 2 out of 4 phenotypes. This family epitomizes the extremely complex mutational spectrum underlying ABCA4-associated diseases and calls for a thorough analysis of all cases where the phenotype falls into the spectrum of ABCA4-associated diseases. Genetic analysis should include not only complete sequencing of the gene and adjacent canonical splice junctions, but also the variation in the entire genomic locus, including copy number variant analysis.
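The 1:400 figure quoted in the Discussion follows from the 1:20 carrier frequency if the two spouses who married into the family are treated as unrelated, independent draws from the general population. The sketch below makes that arithmetic explicit. The independence assumption and the variable names are ours, and the second quantity (the risk to a child of an affected parent and a random partner) is an added textbook illustration of pseudo-dominance under Mendelian transmission, not a number given in the text.

```python
from fractions import Fraction

carrier_freq = Fraction(1, 20)  # carrier frequency quoted in the text

# Two unrelated spouses marrying into the family are both carriers
# (assumes their carrier statuses are independent).
both_spouses_carriers = carrier_freq * carrier_freq

# Pseudo-dominance illustration: an affected parent always transmits a
# mutant allele, so a child is affected if the random partner is a carrier
# and transmits the mutant allele (probability 1/2).
affected_child_risk = carrier_freq * Fraction(1, 2)

print(both_spouses_carriers, affected_child_risk)  # prints: 1/400 1/40
```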

Acknowledgments This work was supported, in part, by grants from the National Eye Institute/NIH EY021163, EY019861, EY and EY (Core Support for Vision Research); National Human Genome Research Institute/NIH HG ; Robert L. Burch III Fund, Columbia University, New York, NY, New York Community Trust Fredrick J. and Theresa Dow Wallace Fund, Columbia University, New York, NY; Foundation Fighting Blindness (Owings Mills, Maryland), and unrestricted funds from Research to Prevent Blindness (New York, NY) to the Department of Ophthalmology, Columbia University.

Compliance with ethical standards

Conflict of interest None declared.

References

Albert S, Sangermano R, Bax N, Roosing S, van den Born L, den Engelsman-van Dijk A, Ramlal A, Stone E, Hoyng C, Cremers F (2015) Towards the identification of deep-intronic ABCA4 mutations in Stargardt patients by using induced pluripotent stem cell-derived photoreceptor progenitor cells. Association for Research in Vision and Ophthalmology Allikmets R (2000) Further evidence for an association of ABCR alleles with age-related macular degeneration. The International ABCR Screening Consortium. Am J Hum Genet 67: Allikmets R, Singh N, Sun H, Shroyer NF, Hutchinson A, Chidambaram A, Gerrard B, Baird L, Stauffer D, Peiffer A, Rattner A, Smallwood P, Li Y, Anderson KL, Lewis RA, Nathans J, Leppert M, Dean M, Lupski JR (1997) A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat Genet 15: Bertelsen M, Zernant J, Larsen M, Duno M, Allikmets R, Rosenberg T (2014) Generalized choriocapillaris dystrophy, a distinct phenotype in the spectrum of ABCA4-associated retinopathies. 
Investig Ophthalmol Vis Sci 55: doi: / iovs Boon CJ, van Schooneveld MJ, den Hollander AI, van Lith-Verhoeven JJ, Zonneveld-Vrieling MN, Theelen T, Cremers FP, Hoyng CB, Klevering BJ (2007) Mutations in the peripherin/rds gene are an important cause of multifocal pattern dystrophy simulating STGD1/fundus flavimaculatus. Br J Ophthalmol 91: doi: /bjo Braun TA, Mullins RF, Wagner AH, Andorf JL, Johnston RM, Bakall BB, Deluca AP, Fishman GA, Lam BL, Weleber RG, Cideciyan AV, Jacobson SG, Sheffield VC, Tucker BA, Stone EM (2013) Non-exomic and synonymous variants in ABCA4 are an important cause of Stargardt disease. Hum Mol Genet 22: doi: /hmg/ddt367 Burke TR, Fishman GA, Zernant J, Schubert C, Tsang SH, Smith RT, Ayyagari R, Koenekoop RK, Umfress A, Ciccarelli ML, Baldi A, Iannaccone A, Cremers FP, Klaver CC, Allikmets R (2012) Retinal phenotypes in patients homozygous for the G1961E mutation in the ABCA4 gene. Investig Ophthalmol Vis Sci 53: doi: /iovs Cella W, Greenstein VC, Zernant-Rajang J, Smith TR, Barile G, Allikmets R, Tsang SH (2009) G1961E mutant allele in the Stargardt disease gene ABCA4 causes bull s eye maculopathy. Exp Eye Res 89: doi: /j.exer Cideciyan AV, Swider M, Aleman TS, Sumaroka A, Schwartz SB, Roman MI, Milam AH, Bennett J, Stone EM, Jacobson SG (2005) ABCA4-associated retinal degenerations spare structure and function of the human parapapillary retina. Investig Ophthalmol Vis Sci 46: Cremers FP, van de Pol DJ, van Driel M, den Hollander AI, van Haren FJ, Knoers NV, Tijmes N, Bergen AA, Rohrschneider K, Blankenagel A, Pinckers AJ, Deutman AF, Hoyng CB (1998) Autosomal recessive retinitis pigmentosa and cone-rod dystrophy caused by splice site mutations in the Stargardt s disease gene ABCR. Hum Mol Genet 7: Francis PJ, Schultz DW, Gregory AM, Schain MB, Barra R, Majewski J, Ott J, Acott T, Weleber RG, Klein ML (2005) Genetic and phenotypic heterogeneity in pattern dystrophy. 
Br J Ophthalmol 89: doi: /bjo Fujinami K, Singh R, Carroll J, Zernant J, Allikmets R, Michaelides M, Moore AT (2014) Fine central macular dots associated with childhood-onset Stargardt Disease. Acta Ophthalmol 92:e157 e159. doi: /aos Fujinami K, Zernant J, Chana RK, Wright GA, Tsunoda K, Ozawa Y, Tsubota K, Robson AG, Holder GE, Allikmets R, Michaelides M, Moore AT (2015) Clinical and molecular characteristics of childhood-onset Stargardt disease. Ophthalmology 122: doi: /j.ophtha Genomes Project C, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491: doi: / nature11632 Gonzaga-Jauregui C, Zhang F, Towne CF, Batish SD, Lupski JR (2010) GJB1/Connexin 32 whole gene deletions in patients with X-linked Charcot Marie Tooth disease. Neurogenetics 11: doi: /s Hardison RC, Taylor J (2012) Genomic approaches towards finding cis-regulatory modules in animals. Nat Rev Genet 13: doi: /nrg3242 Hastings PJ, Ira G, Lupski JR (2009) A microhomology-mediated break-induced replication model for the origin of human copy number variation. PLoS Genet 5:e doi: /journal. pgen Kircher M, Witten DM, Jain P, O Roak BJ, Cooper GM, Shendure J (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46: doi: /ng.2892 Lee JA, Carvalho CM, Lupski JR (2007) A DNA replication mechanism for generating nonrecurrent rearrangements associated with genomic disorders. Cell 131: doi: /j. cell Lewis RA, Shroyer NF, Singh N, Allikmets R, Hutchinson A, Li Y, Lupski JR, Leppert M, Dean M (1999) Genotype/phenotype analysis of a photoreceptor-specific ATP-binding cassette transporter gene, ABCR, in Stargardt disease. Am J Hum Genet 64: doi: / Liu YH, Li CG, Zhou SF (2009) Prediction of deleterious functional effects of non-synonymous single nucleotide polymorphisms in human nuclear receptor genes using a bioinformatics approach. 
Drug Metab Lett 3: Marmor MF, McNamara JA (1996) Pattern dystrophy of the retinal pigment epithelium and geographic atrophy of the macula. Am J Ophthalmol 122: Martinez-Mir A, Paloma E, Allikmets R, Ayuso C, del Rio T, Dean M, Vilageliu L, Gonzalez-Duarte R, Balcells S (1998) Retinitis pigmentosa caused by a homozygous mutation in the Stargardt disease gene ABCR. Nat Genet 18: doi: /ng Maugeri A, van Driel MA, van de Pol DJ, Klevering BJ, van Haren FJ, Tijmes N, Bergen AA, Rohrschneider K, Blankenagel A, Pinckers AJ, Dahl N, Brunner HG, Deutman AF, Hoyng CB, Cremers FP (1999) The 2588G >C mutation in the ABCR gene is a mild frequent founder mutation in the Western European population and allows the classification of ABCR mutations in patients with Stargardt disease. Am J Hum Genet 64:

Maugeri A, Klevering BJ, Rohrschneider K, Blankenagel A, Brunner HG, Deutman AF, Hoyng CB, Cremers FP (2000) Mutations in the ABCA4 (ABCR) gene are the major cause of autosomal recessive cone-rod dystrophy. Am J Hum Genet 67: doi: / McCulloch DL, Marmor MF, Brigell MG, Hamilton R, Holder GE, Tzekov R, Bach M (2015) ISCEV standard for full-field clinical electroretinography (2015 update). Doc Ophthalmol 130:1–12. doi: /s Mullins RF, Kuehn MH, Radu RA, Enriquez GS, East JS, Schindler EI, Travis GH, Stone EM (2012) Autosomal recessive retinitis pigmentosa due to ABCA4 mutations: clinical, pathologic, and molecular characterization. Investig Ophthalmol Vis Sci 53: doi: /iovs Noupuu K, Lee W, Zernant J, Tsang SH, Allikmets R (2014) Structural and genetic assessment of the ABCA4-associated optical gap phenotype. Investig Ophthalmol Vis Sci 55: doi: /iovs Riveiro-Alvarez R, Xie YA, Lopez-Martinez MA, Gambin T, Perez-Carro R, Avila-Fernandez A, Lopez-Molina MI, Zernant J, Jhangiani S, Muzny D, Yuan B, Boerwinkle E, Gibbs R, Lupski JR, Ayuso C, Allikmets R (2015) New mutations in the RAB28 gene in 2 Spanish families with cone-rod dystrophy. JAMA Ophthalmol 133: doi: /jamaophthalmol Rivera A, White K, Stohr H, Steiner K, Hemmrich N, Grimm T, Jurklies B, Lorenz B, Scholl HP, Apfelstedt-Sylla E, Weber BH (2000) A comprehensive survey of sequence variation in the ABCA4 (ABCR) gene in Stargardt disease and age-related macular degeneration. Am J Hum Genet 67: Schubert C, Pryds A, Zeng S, Xie Y, Freund KB, Spaide RF, Merriam JC, Barbazetto I, Slakter JS, Chang S, Munch IC, Drack AV, Hernandez J, Yzer S, Merriam JE, Linneberg A, Larsen M, Yannuzzi LA, Mullins RF, Allikmets R (2014) Cadherin 5 is regulated by corticosteroids and associated with central serous chorioretinopathy. Hum Mutat 35: doi: /humu Shroyer NF, Lewis RA, Yatsenko AN, Lupski JR (2001) Null missense ABCR (ABCA4) mutations in a family with Stargardt disease and retinitis pigmentosa. 
Investig Ophthalmol Vis Sci 42: Stenirri S, Battistella S, Fermo I, Manitto MP, Martina E, Brancato R, Ferrari M, Cremonesi L (2006) De novo deletion removes a conserved motif in the C-terminus of ABCA4 and results in conerod dystrophy. Clin Chem Lab Med 44: doi: / CCLM Tsang SH, Burke T, Oll M, Yzer S, Lee W, Xie YA, Allikmets R (2014) Whole exome sequencing identifies CRB1 defect in an unusual maculopathy phenotype. Ophthalmology 121: doi: /j.ophtha Watzke RC, Folk JC, Lang RM (1982) Pattern dystrophy of the retinal pigment epithelium. Ophthalmology 89: Yamamoto S, Jaiswal M, Charng WL, Gambin T, Karaca E, Mirzaa G, Wiszniewski W, Sandoval H, Haelterman NA, Xiong B, Zhang K, Bayat V, David G, Li T, Chen K, Gala U, Harel T, Pehlivan D, Penney S, Vissers LE, de Ligt J, Jhangiani SN, Xie Y, Tsang SH, Parman Y, Sivaci M, Battaloglu E, Muzny D, Wan YW, Liu Z, Lin-Moore AT, Clark RD, Curry CJ, Link N, Schulze KL, Boerwinkle E, Dobyns WB, Allikmets R, Gibbs RA, Chen R, Lupski JR, Wangler MF, Bellen HJ (2014) A drosophila genetic resource of mutants to study mechanisms underlying human genetic diseases. Cell 159: doi: /j.cell Yatsenko AN, Shroyer NF, Lewis RA, Lupski JR (2001) Late-onset Stargardt disease is associated with missense mutations that map outside known functional regions of ABCR (ABCA4). Hum Genet 108: Yatsenko AN, Shroyer NF, Lewis RA, Lupski JR (2003) An ABCA4 genomic deletion in patients with Stargardt disease. Hum Mutat 21: doi: /humu Zahid S, Jayasundera T, Rhoades W, Branham K, Khan N, Niziol LM, Musch DC, Heckenlively JR (2013) Clinical phenotypes and prognostic full-field electroretinographic findings in Stargardt disease. Am J Ophthalmol 155( ):e3. doi: /j. ajo Zernant J, Schubert C, Im KM, Burke T, Brown CM, Fishman GA, Tsang SH, Gouras P, Dean M, Allikmets R (2011) Analysis of the ABCA4 gene by next-generation sequencing. 
Investig Ophthalmol Vis Sci 52: doi: /iovs Zernant J, Xie YA, Ayuso C, Riveiro-Alvarez R, Lopez-Martinez MA, Simonelli F, Testa F, Gorin MB, Strom SP, Bertelsen M, Rosenberg T, Boone PM, Yuan B, Ayyagari R, Nagy PL, Tsang SH, Gouras P, Collison FT, Lupski JR, Fishman GA, Allikmets R (2014) Analysis of the ABCA4 genomic locus in Stargardt disease. Hum Mol Genet 23: doi: /hmg/ddu396 Zhang F, Khajavi M, Connolly AM, Towne CF, Batish SD, Lupski JR (2009) The DNA replication FoSTeS/MMBIR mechanism can generate genomic, genic and exonic complex rearrangements in humans. Nat Genet 41: doi: /ng

4. Discussion

The work in Chapter III.2 presented a study of non-coding genetic variation in the largest STGD1 cohort analyzed to date. We identified 117 new variants in the intronic regions of ABCA4, of which we were able to unequivocally assign pathogenicity to 12 through multimodal analysis. No large CNV was detected in any patient, suggesting that such events are very rare at the ABCA4 locus. In silico analyses gave predictions of the regulatory impact of variants and of their deleteriousness relative to all possible substitutions in the human genome, but no definite proof of pathogenicity for any variant.

Analysis of the entire ABCA4 locus in the large family presented in Chapter III.3 revealed four different ABCA4 variants belonging to 3 separate classes of mutations segregating with the disease, which accounted for at least 2 out of 4 phenotypes in the pseudo-dominant inheritance pedigree. Analyzing off-target regions in exome sequences aided the identification of the variant (c C>T) on the complex intronic allele (c C>T/c ) that was initially missed by ABCA4 locus sequencing. RNA analysis via an illegitimate transcript approach of either of the two complex intronic alleles did not reveal any effect on splicing.

The large number of novel intronic variants found shows that the genetic variability in the non-coding sequences of the ABCA4 locus, like that in the coding sequences, is exceptionally vast. All definitely or likely disease-associated variants are individually rare in populations of European descent, and no pathogenic intronic variant accounts for significantly more cases than the others.

The pathogenicity assessment of intronic variants is challenging due to a number of factors. The exclusive expression of ABCA4 protein in the cone and rod photoreceptors, and consequently the impossibility of obtaining in vivo tissues from patients for RNA analyses, precludes direct structural and expression studies of ABCA4 intronic variants. This has limited the analyses to the assessment of variant frequencies in patients and in matched general populations, to in silico suggestions by predictive software programs, and to segregation analyses in families. The latter approach is further hampered by the extremely rare occurrence of most individual variants in ABCA4 coding and non-coding regions. The majority of intronic variants discovered in this study occurred in singleton cases and/or in nuclear families where segregation analysis has limited power.

Findings from the large family described in Chapter III.3 further demonstrated the necessity of using a multi-pronged approach for variant analysis in ABCA4-associated disorders. Based on the apparent dominant transmission of disease and the presentation of several phenotypes (STGD1, central serous chorioretinopathy, and pattern dystrophy), we initially explored the possibility of causal genes other than ABCA4 alone by exome sequencing all available affected individuals. In addition to the variant analysis approach described in Chapter II, analyses included dominant segregation of variants among only those individuals who belong to the same phenotype group, and recessive analyses separated by generation, to account for the possibility of a combination of dominant and recessive genes. Off-target reads in the ABCA4 locus in exome sequences were also carefully surveyed for rare intronic variants, which led to the confirmed segregation of a novel intronic variant c C>T on the same allele as the previously identified c C>T variant. As locus sequencing was initially performed only on the proband of the family, and the pathogenicity of this allele was unknown at that time because in silico analyses predicted no effect, the finding of this variant segregating in exome sequences directly aided the confirmation of disease causality of this complex allele in the family.

According to in silico analyses, neither c C>T nor c C>T was predicted to have any effect on splicing, either involving existing cryptic splice sites or creating new sites. This was also confirmed by RNA analysis from leukocytes via an illegitimate transcript experimental approach; however, this method has been shown to have limited value for the analysis of the ABCA4 gene, which is expressed only in photoreceptors, and it often produces results that can lead to erroneous interpretations. This complex allele has also been detected as the second ABCA4 disease-associated allele in five more cases from our large (>700 cases) STGD1 cohort at Columbia University. In addition, there is precedent at the ABCA4 locus for complex alleles to have more severe functional and clinical phenotypic consequences than their individual constituent variants (Shroyer, Lewis et al. 2001).

In summary, the analyses in this Chapter epitomize the extremely complex mutational spectrum underlying ABCA4-associated diseases, and highlight the extensive molecular analyses required to unequivocally solve these cases. Studying iPS cells obtained from individual patients, which are then directed to differentiate into photoreceptors with the goal of expressing ABCA4, is a plausible approach to studying the pathogenicity of ABCA4 intronic variants (Sangermano, Bax et al. 2016).

CHAPTER IV

EXPANSION OF THE PHENOTYPIC SPECTRUM IN KNOWN DISEASE GENES

1. Preface

The term phenocopy refers to an individual with a genetic defect who has clinical manifestations that resemble those of another disease. Several reasons or mechanisms could lead to the presence of phenocopies in a disease cohort. Given the wide phenotypic continuum in many inherited retinal disorders, it is often challenging to unambiguously attribute a phenotype to a single gene because the diagnostic presentations of multiple diseases overlap. Such cases could represent clinical misdiagnosis, or the clinical features could have been ambiguous or even novel. In other cases, the phenotypes observed in patients could have been the result of phenotypic expansion in other genes that usually give rise to different diseases, or of other unknown mechanisms. A fraction of patients whose phenotypes were consistent with ABCA4 disease but who did not carry mutations in ABCA4 was expected to express the disease due to mutations in other known disease genes. In the work summarized in this chapter we performed whole-exome sequencing on families and sporadic cases with ABCA4-like phenotypes and attributed the phenotype to other known retinal disease genes.

In the work presented in Chapter IV.2, we described a unique maculopathy phenotype in a family with 2 affected siblings. The proband and her affected brother presented with relative sparing of the retinal periphery beyond the vascular arcades. The family was initially screened for mutations in ABCA4 and PRPH2, revealing no pathogenic mutations. DNA from 3 members of the family, including both affected siblings and their mother, was then screened with whole-exome sequencing. Segregation analysis following a recessive inheritance pattern and analysis of the resulting candidate genes were performed as described in Chapter II, resulting in the identification of 2 missense mutations in CRB1, a gene associated with severe and generalized retinopathy. The effect of the CRB1 mutations, as well as a potential modifier mechanism, was investigated and discussed.

In the work described in Chapter IV.3, we investigated a large, 3-generation family with multiple individuals affected with a late-onset, slowly progressive BEM, the features of which resembled early-stage STGD1. The multi-generation pedigree exhibited a dominant inheritance pattern. WES was performed on 8 family members with available DNA, revealing a single dominant stop-gain mutation in the CRX gene. Concurrent WES analysis in sporadic cases identified 2 other individuals with a remarkably similar phenotype, each harboring a frame-shift mutation in CRX. We discussed the expansion of the phenotype in CRX, a gene that is typically associated with much more severe childhood vision loss, as well as the incomplete penetrance of the dominant inheritance pattern observed in the large family. The results of this work were published as part of a larger study that used a mosaic genetic screen of lethal mutations on the Drosophila X chromosome to reveal fly genes whose human homologs are potentially associated with human diseases. The Drosophila study, including a section on the CRX gene, constitutes the body of Chapter IV.3.

In the work presented in Chapter IV.4, we described one family in which an adult female presented with BEM at 28 years of age. Her medical history was otherwise unremarkable except for anemia and both urinary tract and kidney infections. After negative screening of the ABCA4 gene, we performed WES on the family. WES analysis following a recessive inheritance pattern revealed 2 previously published compound heterozygous mutations in the MMACHC gene, defects in which are associated with methylmalonic aciduria and homocystinuria type C (cblC). As this is an unprecedented case of adult cblC with only macular lesions and no remarkable systemic symptoms, the situation of a retinal phenotype as part of a systemic complication was discussed in detail.

CHAPTER IV.2

Whole-exome sequencing identifies CRB1 defect in an unusual maculopathy phenotype (Published Paper)

Whole Exome Sequencing Identifies CRB1 Defect in an Unusual Maculopathy Phenotype

Stephen H. Tsang, MD, PhD,1,2 Tomas Burke, MD,1,3 Maris Oll, MD,1,4 Suzanne Yzer, MD, PhD,1,5 Winston Lee, MA,1 Yajing (Angela) Xie, MA,1 Rando Allikmets, PhD1,2

Objective: To report a new phenotype caused by mutations in the CRB1 gene in a family with 2 affected siblings.

Design: Molecular genetics and observational case studies.

Participants: Two affected siblings and 3 unaffected family members.

Methods: Each subject received a complete ophthalmic examination together with color fundus photography, fundus autofluorescence (FAF), and spectral-domain optical coherence tomography (SD-OCT). Microperimetry 1 (MP-1) mapping and electroretinogram (ERG) analysis were performed on the proband. Screening for disease-causing mutations was performed by whole exome sequencing in 3 family members followed by segregation analyses in the entire family.

Main Outcome Measures: Appearance of the macula as examined by clinical examination, fundus photography, FAF imaging, SD-OCT, and visual function by MP-1 and ERG.

Results: The proband and her affected brother exhibited unusual, previously unreported, findings of a macular dystrophy with relative sparing of the retinal periphery beyond the vascular arcades. The FAF imaging showed severely affected areas of hypoautofluorescence that extended nasally beyond the optic disc in both eyes. A central macular patch of retinal pigment epithelium (RPE) sparing was evident in both eyes on FAF, whereas photoreceptor sparing was documented in the right eye only using SD-OCT. The affected brother presented with irregular patterns of autofluorescence in both eyes characterized by concentric rings of alternating hyper- and hypoautofluorescence, and foveal sparing of photoreceptors and RPE, as seen on SD-OCT, bilaterally.
After negative results in screening for mutations in candidate genes including ABCA4 and PRPH2, DNA from 3 members of the family, including both affected siblings and their mother, was screened by whole exome sequencing, resulting in identification of 2 CRB1 missense mutations, c.C3991T:p.R1331C and c.C4142T:p.P1381L, which segregated with the disease in the family. Of the 2, the p.R1331C CRB1 mutation has not been described before and the p.P1381L variant has been described in 1 patient with Leber congenital amaurosis.

Conclusions: This report illustrates a novel presentation of a macular dystrophy caused by CRB1 mutations. Both affected siblings exhibited a relatively well-developed retinal structure and preservation of generalized retinal function. An unusual 5-year progression of macular atrophy alone was observed that has not been described in any other CRB1-associated phenotypes.

Ophthalmology 2014;121: © 2014 by the American Academy of Ophthalmology. Published by Elsevier Inc.

Supplemental material is available at

Mutations in the CRB1 gene (Mendelian Inheritance in Man #604210) have been associated with a variety of generalized retinal dystrophies ranging from retinitis pigmentosa (RP) to Leber congenital amaurosis (LCA).1–4 Retinitis pigmentosa refers to a group of clinically and genetically heterogeneous disorders affecting 1.5 million people worldwide. Reported cases of RP associated with mutations in CRB1 (RP12 phenotype) present with an early disease onset, including nystagmus, hyperopia, optic nerve head drusen, relative attenuation of the vessels, a maculopathy, and nummular type of pigmentation in the periphery.2,5,6 CRB1 mutations also have been correlated to retinal vascular sheathing, preserved para-arteriolar retinal pigment epithelium (RPE), and the development of Coats-like exudative vasculopathy, a condition of abnormally permeable blood vessels leading to exudation and retinal detachment.4,5,7 Mutations in the CRB1 gene have been detected in 10% to 13% of patients with LCA, one of the most severe forms of retinal dystrophy characterized by onset in the first year of life, nystagmus, sluggish pupillary and oculodigital reflexes, and an extinguished electroretinogram (ERG).4,8–10 Phenotypically, patients with LCA with CRB1 mutations usually show the described RP12 characteristics, including the early-onset maculopathy (macular dysplasia).

CRB1 is the human homologue of the gene encoding the crumbs (Crb) protein in Drosophila melanogaster and is expressed in the fetal brain and the inner segments of photoreceptors in humans.2,11 CRB1 also is expressed in the brain, kidney, colon, stomach, lung, and testis.12,13 CRB1 maps to chromosome 1q31.3 and is composed of 12 exons

Ophthalmology Volume 121, Number 9, September 2014

that are translated into 2 protein isoforms, the larger of which possesses a cytoplasmic domain containing FERM- and PDZ-binding motifs that enable adherens junction formation and actin skeleton association.14 CRB1 is crucial for the assembly of the zonula adherens in Drosophila and has been found to be localized at the apical membrane.15,16 A similar distribution in the outer limiting membranes of epithelial cells, Müller cells, and photoreceptor inner segments has been observed in mouse and human retinas.11,17,18 Developmentally, CRB1 has been shown to determine embryonic epithelium and peripheral neurons in Drosophila.15,16 In addition, both human and mouse CRB1 proteins are involved in photoreceptor morphogenesis.11,17,19

Mouse models of CRB1 have been extensively studied and used to show developmental defects and disorganization of the retina in mutants, particularly disruptions of the outer limiting membrane and the formation of retinal folds or pseudorosettes.17,19 These findings correlate with the appearance of the developmentally immature retinas in patients with CRB1 mutations. The retinas of such patients appear thickened and often exhibit altered laminar organization through the disruption of developmental apoptosis.19–22

More than 150 disease-associated variants have been described to date in the CRB1 gene, the most common of which is the p.C948Y variant in exon 9.1,2,5,11,20,23–26 We describe the clinical appearance of a combination of novel CRB1 variants that were associated with an unusual and previously undescribed phenotype in 2 affected siblings of Irish descent.

Methods

Patients and Clinical Evaluation

Two patients, the proband and her affected brother, along with an unaffected sister, mother, and father, were enrolled in the study under the protocol #AAAB6560 after obtaining full consent.
The protocol was approved by the institutional review board at Columbia University and adhered to tenets set out in the Declaration of Helsinki. Each patient underwent a complete ophthalmic examination by a retinal physician (S.H.T.), which included color fundus photography with an FF 450plus Fundus Camera (Carl Zeiss Meditec AG, Jena, Germany). Fundus autofluorescence (FAF) images were obtained using a confocal scanning-laser ophthalmoscope (Heidelberg Retina Angiograph 2, Heidelberg Engineering, Dossenheim, Germany) by illuminating the fundus with argon laser light (488 nm) and viewing the resultant fluorescence through a band pass filter with a short wavelength cutoff at 495 nm. Simultaneous FAF and spectral-domain optical coherence tomography (SD-OCT) images were acquired using a Spectralis HRA+OCT (Heidelberg Engineering, Heidelberg, Germany). Color-coded retinal thickness maps were exported from Heidelberg Explorer v (Heidelberg Engineering, Heidelberg, Germany) software that had automatically calculated the thickness of the retina (from internal limiting membrane to Bruch's membrane) using raster SD-OCT scans acquired on the Heidelberg Spectralis. The segmentations performed by the software were manually adjusted where necessary. The retinal thickness measurements are automatically mapped onto an infrared fundus image and color-coded according to thickness (micrometers).

Electroretinography was carried out using the Diagnosys Espion Electrophysiology System (Diagnosys LLC, Littleton, MA). For all recordings, the pupils were maximally dilated before full-field ERG testing using guttate tropicamide (1%) and phenylephrine hydrochloride (2.5%), and the corneas were anesthetized with guttate proparacaine 0.5%. Silver-impregnated fiber electrodes (DTL; Diagnosys LLC) were used with a ground electrode on the forehead.
Full-field ERGs to test generalized retinal function were performed using extended testing protocols incorporating the International Society for Clinical Electrophysiology of Vision standard.27 Microperimetry (Nidek Instruments Inc, Padova, Italy; NAVIS software version ) mapping was carried out in the proband using the 10-2 pattern after pupil dilation with 1% tropicamide and after a 15-minute adaptation period to the background luminance.

Genetic Analyses

The proband was initially screened for variants in the ABCA4 and PRPH2 genes by direct Sanger sequencing, revealing no disease-associated variants. Because the phenotype did not suggest testing any other candidate genes, the family was subjected to whole exome sequencing and analysis. Exome sequencing was performed for the 2 affected siblings and their unaffected mother. Three to 5 µg of genomic DNA extracted from peripheral blood were exome captured and sequenced at Axeq Technologies (Rockville, MD; available at: accessed September 10, 2013). In-solution sequence capture was performed using the Nimblegen capture array (SeqCap EZ Exome Library v3.0) with a 64 Mb target region. Massively parallel sequencing of the enriched library was performed on the Illumina HiSeq platform with 100 base pair paired-end reads. Sequencing reads were generated in the fastq format after nucleotide calling, and quality score assessment was performed using instrument-specific Real Time Analysis software (Illumina, San Diego, CA). Read pairs were aligned to the human reference genome (hg19) using Burrows-Wheeler Aligner (available at: ) and duplicate reads were removed with PICARD tools (available at: ). Uniquely mapped on-target reads were extracted, and single nucleotide polymorphism (SNP) and in/del calling were performed with Samtools ( ). Variants were annotated using ANNOVAR ( annovar/). All variants of interest were confirmed by Sanger sequencing, and segregation analyses were performed in the entire family.
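The path from annotated calls to segregation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the thresholds (SNP quality 20, read depth 5, population frequency below 0.5%) and the family genotypes are those reported in the Results and in Table 1, while the record layout, the function names, the gene GENE_X, and every quality, depth, and frequency value in the toy calls are assumptions made for the example.

```python
from collections import defaultdict

MIN_QUAL = 20      # minimum SNP quality reported in the Results
MIN_DEPTH = 5      # minimum total read depth reported in the Results
MAX_FREQ = 0.005   # "<0.5%" population frequency reported in the Results


def passes_filters(v):
    """Keep rare, adequately supported, non-synonymous calls.

    Simplification: splice-site assessment is not modeled here.
    """
    return (v["qual"] >= MIN_QUAL and v["depth"] >= MIN_DEPTH
            and v["freq"] < MAX_FREQ and v["effect"] != "synonymous")


def candidate_genes(sib1, sib2, mother):
    """Genes with >= 2 filtered variants shared by both affected siblings,
    where at least one shared variant is maternal and at least one is not
    (a phase consistent with compound heterozygosity)."""
    sib2_ids = {v["id"] for v in sib2 if passes_filters(v)}
    maternal_ids = {v["id"] for v in mother}
    shared = defaultdict(set)
    for v in sib1:
        if passes_filters(v) and v["id"] in sib2_ids:
            shared[v["gene"]].add(v["id"])
    return sorted(gene for gene, ids in shared.items()
                  if len(ids) >= 2 and ids & maternal_ids and ids - maternal_ids)


def segregates_recessively(family, allele_a, allele_b):
    """True if exactly the affected members carry both alleles."""
    return all(person["affected"] == ({allele_a, allele_b} <= person["alleles"])
               for person in family)


def call(gene, name, qual=60, depth=40, freq=0.0, effect="missense"):
    """Build one toy variant record (hypothetical layout and values)."""
    return {"id": f"{gene}:{name}", "gene": gene, "qual": qual,
            "depth": depth, "freq": freq, "effect": effect}


# Toy calls, used for both siblings; only the two CRB1 changes are from the paper.
SIB_CALLS = [
    call("CRB1", "R1331C"), call("CRB1", "P1381L"),
    call("GENE_X", "v1"), call("GENE_X", "v2"),      # both maternal: wrong phase
    call("CRB1", "common", freq=0.20),               # too frequent
    call("GENE_X", "low_depth", depth=3),            # poorly covered
]
MOTHER_CALLS = [call("CRB1", "P1381L"), call("GENE_X", "v1"), call("GENE_X", "v2")]

# Genotypes as listed in Table 1.
FAMILY = [
    {"name": "proband", "affected": True,  "alleles": {"R1331C", "P1381L"}},
    {"name": "brother", "affected": True,  "alleles": {"R1331C", "P1381L"}},
    {"name": "sister",  "affected": False, "alleles": set()},
    {"name": "mother",  "affected": False, "alleles": {"P1381L"}},
    {"name": "father",  "affected": False, "alleles": {"R1331C"}},
]

print(candidate_genes(SIB_CALLS, SIB_CALLS, MOTHER_CALLS))   # ['CRB1']
print(segregates_recessively(FAMILY, "R1331C", "P1381L"))    # True
```

In this toy setting GENE_X is rejected because both of its shared variants are maternal, which is the kind of exclusion that phasing against a single sequenced parent allows.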
Results

Clinical Examination

The proband, a 45-year-old woman of Irish descent, had noticed a decrease in her vision starting around her mid-20s. Her brother, aged 41 years at the time of examination, had reported similar onset of visual symptoms in his early 30s. Neither the proband nor the affected sibling had undergone ophthalmic examination before the onset of symptoms; therefore, there are no data on the presymptomatic retinal state in either case. The third sibling and parents reported no major issues with vision (Fig 1). The family members recalled a significant vision loss in their maternal grandmother and a paternal aunt beginning in the 6th and 7th decades of life. However, in both cases the cause remained unknown, and both of these individuals were deceased with no available clinical records. Neither the proband nor her affected sibling had any contributory ophthalmic or systemic illnesses, and both denied a history of smoking.

Table 1 summarizes the demographic and clinical findings from the initial evaluation of both affected siblings. Best-corrected visual acuity in the proband was 20/40 in the right eye and 20/400 in the left eye at presentation. A slit-lamp examination of the anterior segment showed unremarkable results. Dilated fundoscopy revealed a clear vitreous and healthy vascular arcades. There was no optic disc swelling; however, temporal pallor was present in both eyes (Fig 2A, F). There were no signs of macular edema in the right eye, although both maculae exhibited a mottled, granularly speckled appearance with a continuous annulus of atrophy circumscribing an island of preserved RPE in the foveal region. A more generalized wipe-out of RPE in the left macula was observed. No peripapillary sparing was noted in either eye, and, most notable, atrophy of the retina and RPE was observed nasal to the disc in both eyes. Central fixation was spared according to the patient's ability to follow a fixation target (Fig 2A, F).

The younger brother of the proband presented with similar ocular findings. His best-corrected visual acuity was 20/40 in the right eye and 20/70 in the left eye. Cystoid macular edema (CME) was present in the left eye that responded to treatment with oral acetazolamide (500 mg) and guttate nepafenac, resulting in the improvement of visual acuity to 20/30 with an associated reduction in the size of intraretinal cysts. The anterior segment examination results were unremarkable. No vitreous opacities were found, and both the optic nerves and the fundus vasculature appeared healthy. Both maculae exhibited granular characteristics similar to those observed in the proband; however, clear parafoveal atrophy was seen in the maculae (right eye > left eye) along with a few foci of hyperpigmentation (Fig 2L, Q).

Both parents of the affected siblings were examined with dilated fundus examination. There was no vascular attenuation or intraretinal pigment migration in either parent. However, peripapillary atrophy was present in both parents. Choroidal thinning due to myopia was observed in the mother's left eye. A nonsignificant epiretinal membrane was observed in the temporal macula of her left eye. Normal retinal layers and intact photoreceptors were apparent in both parents (Fig 3B, D, F, H).

Figure 1. Pedigree of the family. Open circles and squares represent the unaffected female and male family members, respectively; closed circles and squares represent the affected female and male patients. The mother and father in this pedigree are heterozygous for the p.P1381L and p.R1331C mutations, respectively. The affected siblings are compound heterozygous for both mutations.

Fundus Autofluorescence

Fundus autofluorescence imaging in the proband showed hyperautofluorescence suggestive of preservation of a central island of RPE in the right eye and parafoveal preservation of an RPE island in the left eye. The macula in both eyes appeared to be enshrouded with a large cloud of hypoautofluorescence extending temporally past the optic discs. Peripheral areas of relatively normal FAF pattern were marked with dark punctate changes in areas approaching the hypoautofluorescent cloud (Fig 2B, C, G, H). In the affected brother, the central maculae of both eyes appeared hypoautofluorescent, more so in the left eye. There was associated hypoautofluorescence along the peripapillary area and the proximal superotemporal arcades in both eyes. However, the more peripheral maculae appeared relatively hyperautofluorescent compared with the surrounding extramacular retina. There were multiple discrete punctate hyperautofluorescent foci in the maculae, predominating nasally and temporally (Fig 2N, S).
A petaloid pattern of alternating hyper- and hypoautofluorescence was seen in both eyes but was, again, more obvious in the left eye (Fig 2S).

Table 1. Summary of Demographic, Clinical, and Genetic Data

Family    Age (yrs)  BCVA Snellen (logMAR) OD  BCVA Snellen (logMAR) OS  Condition   Age of Onset  CRB1 Mutation
Proband   45         20/40                     20/400                    Affected    Mid-20s       R1331C; P1381L
Brother   41         20/40                     20/70                     Affected    Early 30s     R1331C; P1381L
Sister*   33         –                         –                         Unaffected  –             wt; wt
Mother    69         20/50                     20/40                     Unaffected  –             wt; P1381L
Father    85         20/20                     20/20                     Unaffected  –             R1331C; wt

– = not applicable; BCVA = best-corrected visual acuity; logMAR = logarithm of the minimum angle of resolution; OD = right eye; OS = left eye.
*Not clinically examined; underwent only genetic testing.

Figure 2. Fundus photographs, fundus autofluorescence (FAF) images, and spectral-domain optical coherence tomography (SD-OCT) scans of the right and left eyes of a 45-year-old woman (proband) and her 41-year-old affected brother with the same CRB1 mutations. Fundus photography in the proband exhibited stable fixation (needle position), temporal pallor of the optic discs, attenuated retinal vasculature, and extensive retinal pigment epithelium (RPE) changes (A, F). The FAF imaging reveals predominant hypoautofluorescence across the macula, consistent with widespread RPE atrophy, although relative hyperautofluorescence was documented in the foveal and parafoveal regions of the right and left eyes, respectively, consistent with sparing of the RPE (B, C, G, H). The SD-OCT scans reveal relative sparing of foveal photoreceptors (outer nuclear layer and inner segment ellipsoid band) and RPE in the right eye (D, E), and an absence of all except RPE in the left eye (I, J). Fundus photographs in the affected brother also showed extensive loss of macular RPE (L, Q). The FAF revealed concentric rings of alternating hyper- and hypoautofluorescence encircling the macula (M, R), and there were petaloid patterns centrally, greatest in the left eye (N, S). The SD-OCT scans through the fovea in the right eye revealed an area of relative foveal sparing (O, P), and cystoid macular edema (CME) was detected in the left eye (T, U). There was possible early cystoid change in the temporal parafoveal macula of the right eye (O). Relatively normal laminar architecture is preserved in the maculae of both affected siblings. Electroretinography in the proband reveals overall preservation of generalized cone and rod function without any obvious implicit time shifts (K). ISe = inner segment ellipsoid; OD = right eye; ONL = outer nuclear layer; OS = left eye.

Spectral-Domain Optical Coherence Tomography

Horizontal SD-OCT line scans through the central maculae of both eyes of the proband showed an absence of the outer nuclear layer, inner segment ellipsoid band, or inner/outer segment junction of the photoreceptors and RPE, except in the foveal region of the right eye where the structure of both the photoreceptors and RPE appeared anomalous, but present, suggestive of relative sparing (Fig 2D, E). This sparing co-localized with the region of hyperautofluorescence. Of note, similar sparing of the photoreceptors was not seen in the foveal region of the left eye despite a hyperautofluorescent signal (Fig 2I, J). The SD-OCT scans in the right eye of the affected sibling resembled those of the proband. However, the photoreceptor sparing on SD-OCT was less obvious. Nonetheless, in the foveal region of the sibling's right eye there was evidence of a residual inner segment ellipsoid band and a disorganized atrophic outer nuclear layer. On close inspection of the outer nuclear layer in the temporal parafoveal macula of the right eye, there was possible early cystoid change (Fig 2O).
There was definite CME in the left eye visible on SD-OCT (Fig 2T, U). Despite the presence of CME in the left eye, the outer nuclear layer and inner segment ellipsoid band appeared well preserved in the central macula. A 3-year follow-up scan of the affected sibling's left eye was acquired after treatment (Fig 4, available at ) and showed that CME persisted, although it had reduced. Even with progressive photoreceptor and RPE layer disruption, retinal thickness appeared to be within normal limits in both affected siblings (excluding the sibling's left eye with CME), with reasonably well-preserved retinal lamination. Retinal thickness in the proband and affected brother was assessed using color-coded thickness maps over the macula of each eye (Fig 5). Compared with an age-matched control, both eyes of the proband and the right eye of her affected sibling exhibited a reduction in retinal thickness in the central macula consistent with atrophy. There was a trend toward increased thickness relative to the control in the more peripheral macula. The left eye of the proband's sibling had grossly increased macular thickness centrally due to CME.

Electroretinogram Analysis

Scotopic responses in the proband showed a generalized preservation of retinal rod function. A marginal implicit time delay was detected in the photopic responses, suggestive of cone involvement; however, both the waveform and amplitudes were within normal limits compared with age-matched controls (Fig 2K).

Figure 3. Fundus photographs, 488 nm fundus autofluorescence (FAF) images, and spectral-domain optical coherence tomography (SD-OCT) scans in the parental carriers of the CRB1 mutation. No significant retinal pigment epithelium (RPE) changes or abnormal AF patterns were observed. Both parents exhibited peripapillary atrophy in both eyes, greatest in the mother and related to her high myopia. Minor choroidal thinning in the mother also was attributed to this myopia (B, D). The SD-OCT of her left eye revealed a nonsignificant epiretinal membrane. Otherwise, normal retinal layers and intact photoreceptors were present in both parents. FAF = fundus autofluorescence; OD = right eye; OS = left eye.

Progression and Visual Function

Microperimetry 1 (MP-1) mapping results (in decibels) were recorded in the proband at the initial visit and registered to a corresponding FAF image with the built-in Navis software. Preserved visual function was documented within the region of hyperautofluorescence in both eyes. However, across the atrophic hypoautofluorescent macula, retinal sensitivities of 0 decibels were recorded. The test documented steady foveal fixation as assessed by the fixation tracker on the mapping instrument. A subsequent recording after 5 years showed a notable reduction in sensitivity in the right eye and an almost complete loss of function in the left eye (Fig 6).

Genetic Analyses

Whole exome sequencing was performed in the 2 affected siblings and their unaffected mother. An average of 113 million total reads were generated for each sample, with an average of 65 million (58%) nonredundant, unique reads mapped to the 64 Mb exome region (Table 2; available at ). More than 90% of the target regions had >10× coverage, with an average mean depth of coverage of 80×.
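The read and coverage figures above reduce to simple arithmetic, shown here as a short Python sketch. It is illustrative only: the function names and the eight per-position depths are made up for the example, and only the 113 million and 65 million read counts come from the text.

```python
def unique_fraction(unique_reads, total_reads):
    """Fraction of all reads that are nonredundant and uniquely mapped."""
    return unique_reads / total_reads


def coverage_summary(depths, min_depth=10):
    """Return (mean depth, fraction of positions with depth > min_depth)."""
    if not depths:
        raise ValueError("no target positions given")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d > min_depth) / len(depths)
    return mean_depth, covered


# Reported averages: 65 million unique reads out of 113 million total reads.
print(round(100 * unique_fraction(65e6, 113e6)))   # 58

# Hypothetical per-position depths, chosen only to illustrate the calculation.
toy_depths = [0, 8, 12, 40, 90, 120, 150, 220]
print(coverage_summary(toy_depths))                # (80.0, 0.75)
```

A real calculation would take per-base depths over the capture targets from the aligned reads rather than a hand-written list.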
On average, there were a total of variants identified for each sample, of which (21%) were in protein coding sequences. Some 98% of all coding variants were SNPs and 2% were insertions and deletions

Ophthalmology Volume 121, Number 9, September 2014

Figure 5. Color-coded macular thickness (μm) maps of the CRB1-affected proband and her brother compared with an age-matched normal subject. The maps show a reduced (darker colors) thickness in the central macula in both eyes of the proband and in the right eye of her sibling compared with the control, with a trend toward increased (brighter colors) thickness relative to the control in the more peripheral macula. The left eye of the proband's sibling had grossly increased macular thickness due to cystoid macular edema (CME).

(indels). The ratio of nonsynonymous to synonymous variants was close to 1:1 in all samples. For initial filtering of variants, the minimum SNP quality value was set at 20 and the minimum total read depth was set at 5. Considering the autosomal recessive inheritance pattern of the disease and that both affected siblings had to have the same causal mutations, we focused on genes that had at least 2 shared variants in the 2 affected siblings (Table 3). We further assumed that the disease-causing variants have to be rare, thereby filtering the sequence data for new variants and those found at a frequency of <0.5% in dbSNP135 and in the 1000 Genomes databases. After excluding synonymous variants and all coding and intronic variants that did not affect splicing, the number of candidate genes was narrowed down to 19. After assessing the phase of the remaining variants using the sequences of the mother, only 3 possible candidate genes remained: CRB1, NADK, and DNAH12. The segregation of variants in these genes with the disease was tested in the entire family, including the unaffected sibling and the father. Dynein, axonemal, heavy chain 12 protein is a large protein with multiple rare variants that is involved in microtubule-associated motor protein complexes composed of several heavy, light, and intermediate chains. NADK catalyzes the synthesis of NADP from

Figure 6. Functional and macular progression assessment through fundus autofluorescence (FAF) imaging and microperimetry (MP)-1 mapping. Over the course of 5 years, preservation of the central island of hyperautofluorescent retinal pigment epithelium (RPE) was maintained; however, the spatial function had become more constricted. A comparative overlay of MP-1 and FAF showed that the scotomata (0 decibels) recorded on the MP-1 corresponded to atrophic (hypoautofluorescent) regions at the initial examination in the right (A) and left (B) eyes and in the same area of both eyes, respectively (C, D), after 5 years. OD = right eye; OS = left eye.

NAD and ATP and exists in several isoforms. Neither of these 2 genes has been associated with eye disease phenotypes, and there is no obvious mechanism by which they would be involved in retinal dystrophies. This left CRB1 as the only plausible candidate gene for the disease phenotype segregating with the 2 variants, because it is a well-known gene in retinal dystrophies (RetNet; uth.edu/retnet/; accessed September 10, 2013). The 2 CRB1 variants shared by the affected siblings (c.C3991T:p.R1331C and c.C4142T:p.P1381L) are not present in the ESP6500, dbSNP135, or 1000 Genomes databases, although the p.P1381L variant was recently identified as disease associated in 1 patient with LCA [28]. Both variants are predicted to be deleterious or damaging by the Sorting Intolerant from Tolerant (SIFT), PolyPhen-2, and MutationTaster programs. Nucleotide positions of the 2 mutations are highly conserved as predicted by PhyloP. The novel c.C3991T variant that results in the p.R1331C mutation is 1 nucleotide before a previously reported benign variant, c.A3992G, which results in the p.R1331H amino acid change. The latter is considered benign; however, unlike the arginine-to-histidine change (Grantham distance 29), there is a large physicochemical difference between arginine and cysteine (Grantham distance 180). Creation of a cysteine residue at this position also results in 2 consecutive cysteines in the amino acid chain. Because the EGF-like domain is characterized by 6 conserved cysteines forming 3 disulfide bonds and modification from arginine to histidine at this position is considered benign, p.R1331C also could be a relatively milder mutation, which could explain the milder phenotype compared with other CRB1-associated phenotypes.

Table 3. Variants Identified in the 2 Affected Individuals

Variant Filters                                             No. SNVs   No. Genes
Total variants shared between both affected siblings
Not present in dbSNP135 common and <0.5% in 1000 Genomes
Missense, nonsense, indels, and splicing
Recessive
Phase-assessed                                              6          3

SNV = single nucleotide variation.

Discussion

The clinical and genetic findings were summarized in a family in whom 2 rare, likely disease-associated CRB1 missense variants segregated with an unusual macular dystrophy phenotype. Previously reported phenotypes associated with mutations in CRB1 include those consistent with an early-onset RP, associated with preserved para-arteriolar RPE or Coats-like exudative vasculopathy. The patients described in this study exhibited unusual phenotypic characteristics that have not been described before and therefore eluded clinical diagnosis by several retinal specialists. Both affected siblings presented with decreased visual acuity but without nyctalopia. Furthermore, on fundoscopy, there was a relatively normal-appearing periphery without any of the RP12 characteristics, such as nummular hyperpigmentation and some degree of vascular attenuation. The disease was largely confined to the macula. In both cases, the reported age of onset was between

Table 5. Potential Modifier Genes for CRB1-Associated Phenotype

Gene      DNA Change  Protein Change  PP-2               SIFT         Mutation Taster  PhyloP                In Proband/Brother
MPDZ      C1970T      P657L           Benign             Deleterious  Disease causing  Weakly conserved      Proband
USH2A     A10999C     T3667P          N/A                Deleterious  Polymorphism     Not conserved         Both
RPGRIP1   C2794G      P932A           Probably damaging  Tolerated    Disease causing  Moderately conserved  Both
TOPORS    G2138A      R713K           Benign             Tolerated    Polymorphism     Moderately conserved  Both
CDHR1     A1868G      N623S           Benign             Tolerated    Polymorphism     Highly conserved      Both
CEP290    G4237C      D1413H          Benign             Deleterious  Disease causing  Moderately conserved  Proband
SNRNP200  A3315G      A1105A          Synonymous         Synonymous   Synonymous       Synonymous            Proband
c2orf71   G3789A      L1263L          Synonymous         Synonymous   Synonymous       Synonymous            Brother

N/A = not applicable; PP-2 = Polymorphism Phenotyping v2 (PolyPhen-2); PhyloP = Phylogenetic p-values (compgen.bscb.cornell.edu/phast/phylop/); SIFT = Sorting Intolerant from Tolerant.

the mid-20s and early 30s; however, the proband had progressed to a more advanced disease stage as evidenced by her comparatively lower visual acuity and the extent of macular atrophy on FAF imaging and SD-OCT. Of note, a small patch of RPE was preserved in the central macular regions, and the overlying retina was found to be functionally intact through microperimetry testing in the proband. Spectral-domain OCT documented relative foveal sparing of the photoreceptors in the right eye of the proband and in both eyes of the affected sibling, although the latter also had CME in his left eye. Although the proband did not exhibit any CME at the time of examination, retinal laminar changes seen in her foveal SD-OCT scans may be due to macular edema from an earlier disease stage. Hyperautofluorescence in the nasal retina and foveal sparing are rare phenomena observed in some cases of ABCA4- and PRPH2-associated maculopathies.
Clinical findings in the affected brother exhibited unusual alternating patterns of autofluorescent rings marked with punctate changes in the central macula, resembling a bull's-eye lesion reported in other related retinal conditions. Patients with CRB1-associated phenotypes typically present with a thick, underdeveloped retina characteristically exhibiting loss of laminar layering. A study of mice carrying the rd8 mutation in Crb1 has shown an analogous phenotype of a thickened retina and loss of distinct retinal layering [22]. This developmental disorganization has been attributed to the role of CRB1 in embryonic retinal development in mice and humans, more specifically, photoreceptor morphogenesis. Abnormal thickening and loss of laminar layering were not seen in the presented cases. Although some laminar disruption is evident, the distinct layers can, in general, be distinguished and retinal thickness appears normal. Preserved electrophysiologic function was observed in full-field ERG measurements. These findings are consistent with the presence of peripheral retinal sparing but distinct from previous reports of CRB1-associated phenotypes in which ERG was extinguished and undetectable because of abnormal development of the embryonic retina. The molecular genetic reasons for the observed distinct phenotype may be attributed to a specific combination of CRB1 alleles, to modifier genes, or both. Furthermore, the modifying effect of nongenetic factors (e.g., environmental) has been suggested as a reason for phenotype variation in CRB1 retinopathy [24]. As discussed previously, the combination of the alleles, specifically the new p.R1331C variant, may have a different effect on the protein function than other CRB1 alleles.
The exome sequences also were analyzed for variants in possible modifier genes, especially those that have been shown to interact with the CRB1 protein or belong to the same pathway(s) (Table 4). Specifically, genes that encode for proteins involved in the CRUMBS network, those that have been coimmunoprecipitated with CRB1 or retinal ciliopathy proteins in the ciliary compartment, were analyzed for possible modifier variants (Table 5). Crb1 in mice localizes to the outer limiting membrane between the subapical surface or region and adherens junction of Müller glia cells [19]. In the outer limiting membrane, Crb1 interacts with MPP5 via its PDZ binding domain and EPB41L5 with its FERM binding site. At this location, MPP5 organizes a protein scaffold that includes the MAGUK family members MPP3 and MPP4. In addition, MPP5 also has been found to interact with LIN-7, PAR6, PATJ, MUPP1, EZRIN, and the neuronal GABA transporter GAT1 [14]. No mutations shared by both affected siblings were found in these genes; the only finding was a rare heterozygous missense mutation in MPDZ, a gene that codes for multi-PDZ domain protein-1 (MUPP1), in the proband (Table 5). MUPP1 interacts with the intracellular domain of CRB1 via association with the PDZ domain of MPP5. Some rare heterozygous variants in genes known to cause retinal dystrophies, such as RP, LCA, and CRD, were shared by both affected siblings. The specific variants and in silico prediction of their pathogenicity are shown in Table 5. One study has suggested that the USHERIN protein network has a physical connection to the CRUMBS protein complex via interaction of MPP5 with MPP1 and the multi-PDZ protein whirlin at the outer limiting membrane [14]. WHRN and USH2A co-localize at the outer limiting membrane and the connecting cilium of photoreceptors [29]. Mutations in USH2A are associated with recessive Usher syndrome type 2a and recessive RP. A study of Spanish families with LCA has suggested that

variants in RPGRIP1, along with GUCY2D and AIPL1, could be modifier alleles of CRB1 [30]. Several studies have identified mutations in the TOPORS gene that cause dominant RP [31,32]. TOPORS is a cilia-centrosomal protein that localizes to the basal bodies of the connecting cilium and to the centrosomes of cultured cells. Morpholino-mediated silencing of topors in zebrafish embryos demonstrated defective retinal development and failure to form outer segments [33]. CDHR1 (PCDH21) encodes a photoreceptor-specific cadherin that co-localizes at the base of the outer segment with Prominin 1. It is involved in disc morphogenesis and, when mutated, causes cone-rod dystrophy [34]. In addition, sequence changes in CEP290 and SNRNP200 were detected in the proband and 1 variant in the c2orf71 gene in the affected brother. Recessive mutations in all these genes have been associated with LCA, RP, or CRD. However, because both patients exhibited a similar disease phenotype and the combination of CRB1 variants was unique to this family, we were not able to assign a modifier role to any of the identified variants or to nongenetic modifiers. In conclusion, manifestation of only focal disease in CRB1-associated degeneration in this family was distinct from all previously described retinal dysfunction caused by mutations in CRB1. Instead of a generalized retinal degeneration with dysplastic retinae seen in other CRB1-associated cases, patients in this study exhibited a slowly progressive focal disease. Patients also were lacking all well-known phenotypic features of CRB1-associated disease, such as peripheral nummular pigmentation, preserved para-arteriolar RPE, or Coats-like vasculopathy. Although no unequivocal evidence was found for any modifier alleles that could explain the clinical findings, variants in several genes were identified that could modulate the phenotype in this family.
Identification of gene- and especially mutation-specific phenotypes will aid in directing future DNA testing and selecting treatment options for patients with macular dystrophies.

References

1. Clark GR, Crowe P, Muszynska D, et al. Development of a diagnostic genetic test for simplex and autosomal recessive retinitis pigmentosa. Ophthalmology 2010;117.
2. den Hollander AI, ten Brink JB, de Kok YJ, et al. Mutations in a human homologue of Drosophila crumbs cause retinitis pigmentosa (RP12). Nat Genet 1999;23.
3. Lotery AJ, Jacobson SG, Fishman GA, et al. Mutations in the CRB1 gene cause Leber congenital amaurosis. Arch Ophthalmol 2001;119.
4. Lotery AJ, Malik A, Shami SA, et al. CRB1 mutations may result in retinitis pigmentosa without para-arteriolar RPE preservation. Ophthalmic Genet 2001;22.
5. Bernal S, Calaf M, Garcia-Hoyos M, et al. Study of the involvement of the RGR, CRPB1, and CRB1 genes in the pathogenesis of autosomal recessive retinitis pigmentosa [report online]. J Med Genet 2003;40:e89.
6. Heckenlively JR. Preserved para-arteriole retinal pigment epithelium (PPRPE) in retinitis pigmentosa. Birth Defects Orig Artic Ser 1982;18.
7. den Hollander AI, Heckenlively JR, van den Born LI, et al. Leber congenital amaurosis and retinitis pigmentosa with Coats-like exudative vasculopathy are associated with mutations in the crumbs homologue 1 (CRB1) gene. Am J Hum Genet 2001;69.
8. den Hollander AI, Johnson K, de Kok YJ, et al. CRB1 has a cytoplasmic domain that is functionally conserved between human and Drosophila. Hum Mol Genet 2001;10.
9. den Hollander AI, Roepman R, Koenekoop RK, Cremers FP. Leber congenital amaurosis: genes, proteins and disease mechanisms. Prog Retin Eye Res 2008;27.
10. Franceschetti A, Dieterle P. Diagnostic and prognostic importance of the electroretinogram in tapetoretinal degeneration with reduction of the visual field and hemeralopia [in French]. Confin Neurol 1954;14.
11. den Hollander AI, Ghiani M, de Kok YJ, et al. Isolation of Crb1, a mouse homologue of Drosophila crumbs, and analysis of its expression pattern in eye and brain. Mech Dev 2002;110.
12. Roh MH, Makarova O, Liu CJ, et al. The Maguk protein, Pals1, functions as an adapter, linking mammalian homologues of Crumbs and Discs Lost. J Cell Biol 2002;157.
13. Watanabe T, Miyatani S, Katoh I, et al. Expression of a novel secretory form (Crb1s) of mouse Crumbs homologue Crb1 in skin development. Biochem Biophys Res Commun 2004;313.
14. Gosens I, den Hollander AI, Cremers FP, Roepman R. Composition and function of the Crumbs protein complex in the mammalian retina. Exp Eye Res 2008;86.
15. Tepass U. Crumbs, a component of the apical membrane, is required for zonula adherens formation in primary epithelia of Drosophila. Dev Biol 1996;177.
16. Tepass U, Theres C, Knust E. crumbs encodes an EGF-like protein expressed on apical membranes of Drosophila epithelial cells and required for organization of epithelia. Cell 1990;61.
17. Mehalow AK, Kameya S, Smith RS, et al. CRB1 is essential for external limiting membrane integrity and photoreceptor morphogenesis in the mammalian retina. Hum Mol Genet 2003;12.
18. Pellikka M, Tanentzapf G, Pinto M, et al. Crumbs, the Drosophila homologue of human CRB1/RP12, is essential for photoreceptor morphogenesis. Nature 2002;416.
19. van de Pavert SA, Kantardzhieva A, Malysheva A, et al. Crumbs homologue 1 is required for maintenance of photoreceptor cell polarization and adhesion during light exposure. J Cell Sci 2004;117.
20. Jacobson SG, Cideciyan AV, Aleman TS, et al. Crumbs homolog 1 (CRB1) mutations result in a thick human retina with abnormal lamination. Hum Mol Genet 2003;12.
21. van de Pavert SA, Meuleman J, Malysheva A, et al. A single amino acid substitution (Cys249Trp) in Crb1 causes retinal degeneration and deregulates expression of pituitary tumor transforming gene Pttg1. J Neurosci 2007;27.
22. Aleman TS, Cideciyan AV, Aguirre GK, et al. Human CRB1-associated retinal degeneration: comparison with the rd8 Crb1-mutant mouse model. Invest Ophthalmol Vis Sci 2011;52.
23. Booij JC, Florijn RJ, ten Brink JB, et al. Identification of mutations in the AIPL1, CRB1, GUCY2D, RPE65, and RPGRIP1 genes in patients with juvenile retinitis pigmentosa [report online]. J Med Genet 2005;42:e67.
24. Bujakowska K, Audo I, Mohand-Said S, et al. CRB1 mutations in inherited retinal dystrophies. Hum Mutat 2012;33.
25. Tosi J, Tsui I, Lima LH, et al. Case report: autofluorescence imaging and phenotypic variance in a sibling pair with early-onset retinal dystrophy due to defective CRB1 function. Curr Eye Res 2009;34.
26. Zernant J, Kulm M, Dharmaraj S, et al. Genotyping microarray (disease chip) for Leber congenital amaurosis: detection of modifier alleles. Invest Ophthalmol Vis Sci 2005;46.
27. Marmor MF, Fulton AB, Holder GE, et al; International Society for Clinical Electrophysiology of Vision. ISCEV Standard for full-field clinical electroretinography (2008 update). Doc Ophthalmol 2009;118.
28. Henderson RH, Mackay DS, Li Z, et al. Phenotypic variability in patients with retinal dystrophies due to mutations in CRB1. Br J Ophthalmol 2011;95.
29. van Wijk E, van der Zwaag B, Peters T, et al. The DFNB31 gene product whirlin connects to the Usher protein network in the cochlea and retina by direct association with USH2A and VLGR1. Hum Mol Genet 2006;15.
30. Vallespin E, Cantalapiedra D, Riveiro-Alvarez R, et al. Mutation screening of 299 Spanish families with retinal dystrophies by Leber congenital amaurosis genotyping microarray. Invest Ophthalmol Vis Sci 2007;48.
31. Chakarova CF, Papaioannou MG, Khanna H, et al. Mutations in TOPORS cause autosomal dominant retinitis pigmentosa with perivascular retinal pigment epithelium atrophy. Am J Hum Genet 2007;81.
32. Papaioannou M, Chakarova CF, Prescott DC, et al. A new locus (RP31) for autosomal dominant retinitis pigmentosa maps to chromosome 9p. Hum Genet 2005;118.
33. Chakarova CF, Khanna H, Shah AZ, et al. TOPORS, implicated in retinal degeneration, is a cilia-centrosomal protein. Hum Mol Genet 2011;20.
34. Henderson RH, Li Z, Abd El Aziz MM, et al. Biallelic mutation of protocadherin-21 (PCDH21) causes retinal degeneration in humans. Mol Vis [serial online] 2010;16.

Footnotes and Financial Disclosures

Originally received: June 18. Final revision: March 7. Accepted: March 7. Available online: May 6. Manuscript no.

1 Department of Ophthalmology, Columbia University, New York, New York.
2 Department of Pathology & Cell Biology, Columbia University, New York, New York.
3 Department of Ophthalmology, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire, United Kingdom.
4 University Eye Clinic, Tartu University, Tartu, Estonia.
5 Rotterdam Eye Hospital, Rotterdam, The Netherlands.

Financial Disclosure(s): The author(s) have no proprietary or commercial interest in any materials discussed in this article.

Supported in part by grants from the National Eye Institute/National Institutes of Health EY021163, EY019861, EY018213, and EY (Core Support for Vision Research); Stichting Wetenschappelijk Onderzoek Oogziekenhuis Rotterdam; Rotterdamse Blindenbelangen; Stichting Blindenhulp; Gelderse Blinden Stichting; Landelijke Stichting voor Blinden en Slechtzienden; Foundation Fighting Blindness (Owings Mills, MD); and unrestricted funds from Research to Prevent Blindness (New York, NY) to the Department of Ophthalmology, Columbia University.

Abbreviations and Acronyms: CME = cystoid macular edema; ERG = electroretinogram; FAF = fundus autofluorescence; LCA = Leber congenital amaurosis; MP-1 = microperimetry 1; MUPP1 = multi-PDZ domain protein-1; RP = retinitis pigmentosa; RPE = retinal pigment epithelium; SD-OCT = spectral-domain optical coherence tomography; SNP = single nucleotide polymorphism.

Correspondence: Rando Allikmets, PhD, Eye Institute Research, Room 202, 160 Fort Washington Avenue, New York, NY. rla22@columbia.edu

Figure 4: Pre- and post-treatment spectral-domain optical coherence tomography images of the left eye of the affected sibling with cystoid macular edema. Three years following treatment (B), there has been a reduction in the size of the intraretinal cysts from baseline (A).

Table 2. Summary statistics for exome sequencing

Columns: Family; Sequence Reads (Total (1), Nonredundant unique (2), On-target (3)); Target Coverage (% >1X, % >10X, Mean read depth); SNVs (Total SNPs, Coding SNPs, Total Indels, Coding Indels)

Mother   76,288,358   58,494,139  45,379,  X  89,694  20,479  8,
Proband  134,719,072  95,007,901  79,366,  X  91,157  20,400  8,
Brother  129,753,988  84,508,093  70,033,  X  89,720  20,090  8,

1. Total number of sequence reads generated for a sample.
2. Number of unique sequence reads that can be mapped to a single genomic location.
3. Number of non-redundant unique reads that can be mapped to the exome capture region.

SNP = single nucleotide polymorphism; SNV = single nucleotide variation.
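As a rough cross-check of the per-sample averages quoted in the Genetic Analyses text, the following minimal sketch recomputes the mean total read count and the coding fraction of SNPs from the columns of Table 2 that survive intact in this transcription. The dictionary layout and variable names are illustrative assumptions, and the on-target, coverage, and indel columns are left out because their values are truncated here.

```python
from statistics import mean

# Values copied from the fully preserved columns of Table 2.
# The dictionary layout is a hypothetical convenience, not the authors' format.
TABLE2 = {
    "Mother":  {"total_reads": 76_288_358,  "total_snps": 89_694, "coding_snps": 20_479},
    "Proband": {"total_reads": 134_719_072, "total_snps": 91_157, "coding_snps": 20_400},
    "Brother": {"total_reads": 129_753_988, "total_snps": 89_720, "coding_snps": 20_090},
}

# Mean number of total reads per sample (the text reports about 113 million).
mean_total_reads = mean(row["total_reads"] for row in TABLE2.values())

# Fraction of SNPs that fall in coding sequence, pooled over the three samples.
# Indels are excluded, so this need not equal the 21% quoted for all variants.
coding_snp_fraction = (
    sum(row["coding_snps"] for row in TABLE2.values())
    / sum(row["total_snps"] for row in TABLE2.values())
)

print(f"Mean total reads per sample: {mean_total_reads / 1e6:.1f} million")
print(f"Coding fraction of SNPs: {coding_snp_fraction:.1%}")
```

Because the indel counts are incomplete, the SNP-only coding fraction is only an approximation of the figure given for all variants.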

Table 4. Genes investigated for a possible modifier effect.

Gene: MPP5, MPP4, MPP1, MPP3, CRB2, CRB3, EPB41L5, MPDZ, INADL, LIN7C, DFNB31, CASK, SDCBP, DLG1, DLG4, EXT2, PSMD13, DDX56, TSC22D1, PRKAR1A, CD2BP2, ALB, RP1L1, RP1, MAK, OFD1, LCA5, c2orf71, USH2A, TULP1, CLRN1, IQCB1, RPGR, RPGRIP1, CEP290, TTC8, BBS1, BBS9, ARL6, FAM161A, RP2, TOPORS, c8orf37

Network: Crumbs protein network; anti-bait co-IP; retinal ciliopathy proteins in ciliary compartment
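The variant filtering cascade described under Genetic Analyses and summarized in Table 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, and the toy data are hypothetical, and the phase check is simplified to requiring at least one shared variant carried by the unaffected mother and at least one that she does not carry (a compound heterozygous configuration). The thresholds (quality 20, depth 5, frequency below 0.5%, at least 2 shared variants per gene) are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant record used only for this sketch."""
    gene: str
    quality: float    # SNP quality value
    depth: int        # total read depth
    pop_freq: float   # frequency in reference databases (0.0 if novel)
    effect: str       # "missense", "nonsense", "indel", "splicing", "synonymous", ...
    in_proband: bool
    in_brother: bool
    in_mother: bool   # carried by the unaffected mother


# Effect classes retained after excluding synonymous and non-splicing changes.
KEPT_EFFECTS = {"missense", "nonsense", "indel", "splicing"}


def candidate_genes(variants, min_quality=20, min_depth=5, max_freq=0.005):
    """Apply the quality, sharing, rarity, effect, and phase filters in turn."""
    per_gene = defaultdict(list)
    for v in variants:
        if v.quality < min_quality or v.depth < min_depth:
            continue  # fails the initial quality and depth thresholds
        if not (v.in_proband and v.in_brother):
            continue  # must be shared by both affected siblings
        if v.pop_freq >= max_freq:
            continue  # must be novel or rare (<0.5%)
        if v.effect not in KEPT_EFFECTS:
            continue  # synonymous and other silent changes are excluded
        per_gene[v.gene].append(v)

    genes = []
    for gene, kept in per_gene.items():
        if len(kept) < 2:
            continue  # recessive model: at least 2 shared variants per gene
        # Assumed phase rule: one variant on the maternal allele, one not.
        if any(v.in_mother for v in kept) and any(not v.in_mother for v in kept):
            genes.append(gene)
    return sorted(genes)


# Toy input with placeholder gene names; not data from the study.
toy_variants = [
    Variant("GENE_A", 50, 30, 0.0, "missense", True, True, True),
    Variant("GENE_A", 45, 25, 0.001, "missense", True, True, False),
    Variant("GENE_B", 60, 40, 0.0, "missense", True, True, True),
    Variant("GENE_B", 55, 35, 0.0, "nonsense", True, True, True),
    Variant("GENE_C", 40, 20, 0.0, "synonymous", True, True, True),
    Variant("GENE_C", 40, 20, 0.0, "missense", True, True, False),
    Variant("GENE_D", 10, 3, 0.0, "missense", True, True, True),
    Variant("GENE_D", 40, 20, 0.2, "missense", True, True, False),
]

print(candidate_genes(toy_variants))
```

In the toy input only GENE_A survives: GENE_B has both variants on the maternal allele, GENE_C is left with a single qualifying variant, and GENE_D fails the quality and frequency thresholds. A real analysis would also need to handle homozygous variants and segregation in the father and unaffected sibling, which this sketch omits.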

CHAPTER IV.3 A Drosophila genetic resource of mutants to study mechanisms underlying human genetic diseases (Published Paper)

Resource

A Drosophila Genetic Resource of Mutants to Study Mechanisms Underlying Human Genetic Diseases

Shinya Yamamoto,1,2,3,24 Manish Jaiswal,2,4,24 Wu-Lin Charng,1,2 Tomasz Gambin,2,5 Ender Karaca,2 Ghayda Mirzaa,6,7 Wojciech Wiszniewski,2,8 Hector Sandoval,2 Nele A. Haelterman,1 Bo Xiong,1 Ke Zhang,9 Vafa Bayat,1 Gabriela David,1 Tongchao Li,1 Kuchuan Chen,1 Upasana Gala,1 Tamar Harel,2,8 Davut Pehlivan,2 Samantha Penney,2,8 Lisenka E.L.M. Vissers,10 Joep de Ligt,10 Shalini N. Jhangiani,11 Yajing Xie,12 Stephen H. Tsang,12,13 Yesim Parman,14 Merve Sivaci,15 Esra Battaloglu,15 Donna Muzny,2,11 Ying-Wooi Wan,3,16 Zhandong Liu,3,17 Alexander T. Lin-Moore,2 Robin D. Clark,18 Cynthia J. Curry,19,20 Nichole Link,2 Karen L. Schulze,2,4 Eric Boerwinkle,11,21 William B. Dobyns,6,7,22 Rando Allikmets,12,13 Richard A. Gibbs,2,11 Rui Chen,1,2,11 James R. Lupski,2,8,11 Michael F. Wangler,2,8,* and Hugo J. Bellen,1,2,3,4,9,23,*

1 Program in Developmental Biology, Baylor College of Medicine (BCM), Houston, TX 77030, USA
2 Department of Molecular and Human Genetics, BCM, Houston, TX 77030, USA
3 Jan and Dan Duncan Neurological Research Institute, Houston, TX 77030, USA
4 Howard Hughes Medical Institute, Houston, TX 77030, USA
5 Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
6 Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
7 Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, WA 98101, USA
8 Texas Children's Hospital, Houston, TX 77030, USA
9 Program in Structural and Computational Biology and Molecular Biophysics, BCM, Houston, TX 77030, USA
10 Department of Human Genetics, Radboudumc, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
11 Human Genome Sequencing Center, BCM, Houston, TX 77030, USA
12 Department of Ophthalmology, Columbia University College of Physicians and Surgeons, New York, NY 10032, USA
13 Department of Pathology and Cell Biology, Columbia University College of Physicians and Surgeons, New York, NY 10032, USA
14 Neurology Department and Neuropathology Laboratory, Istanbul University Medical School, Istanbul 34390, Turkey
15 Department of Molecular Biology and Genetics, Bogazici University, Istanbul 34342, Turkey
16 Department of Obstetrics and Gynecology, BCM, Houston, TX 77030, USA
17 Department of Pediatrics, BCM, Houston, TX 77030, USA
18 Division of Medical Genetics, Department of Pediatrics, Loma Linda University Medical Center, Loma Linda, CA 92354, USA
19 Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA
20 Genetic Medicine Central California, Fresno, CA 93701, USA
21 Human Genetics Center, University of Texas, Health Science Center, Houston, TX 77030, USA
22 Department of Neurology, University of Washington, Seattle, WA 98195, USA
23 Department of Neuroscience, BCM, Houston, TX 77030, USA
24 Co-first author
*Correspondence: michael.wangler@bcm.edu (M.F.W.), hbellen@bcm.edu (H.J.B.)

SUMMARY

Invertebrate model systems are powerful tools for studying human disease owing to their genetic tractability and ease of screening. We conducted a mosaic genetic screen of lethal mutations on the Drosophila X chromosome to identify genes required for the development, function, and maintenance of the nervous system. We identified 165 genes, most of whose function has not been studied in vivo. In parallel, we investigated rare variant alleles in 1,929 human exomes from families with unsolved Mendelian disease. Genes that are essential in flies and have multiple human homologs were found to be likely to be associated with human diseases. Merging the human data sets with the fly genes allowed us to identify disease-associated mutations in six families and to provide insights into microcephaly associated with brain dysgenesis.
This bidirectional synergism between fly genetics and human genomics facilitates the functional annotation of evolutionarily conserved genes involved in human health.

Cell 159, September 25, 2014 ©2014 Elsevier Inc.

INTRODUCTION

Unbiased genetic chemical mutagenesis screens in flies have led to the discovery of the vast majority of genes in developmental signaling pathways (Nüsslein-Volhard and Wieschaus, 1980). Most genes important to these pathways have now been shown to function as oncogenes or tumor suppressors

(Pastor-Pareja and Xu, 2013). Similarly, in some areas of neurobiology, genetic screens in flies have led to the discovery of genes important to nervous system function including TRP channels, potassium channels, and pathways that affect diurnal rhythmicity. Subsequent studies have identified many diseases that are associated with mutations or deletions of human homologs (Bellen et al., 2010). However, our molecular understanding of neurological disorders such as neurodegenerative disease has mostly relied on reverse genetics (Lu and Vogel, 2009). Although some genes required for neuronal maintenance have been identified from genetic screens for viable mutations that exhibit shortened life span, electroretinogram defects, abnormal phototaxis, retinal histology defects, or temperature-sensitive paralysis, no large-scale systematic screens to directly probe neurodegeneration have been carried out (reviewed in Jaiswal et al., 2012). In addition, because of lethal phenotypes, the role of numerous essential genes in neuronal maintenance is not known. We therefore implemented a genetic mosaic screen to identify essential genes required for neuronal maintenance on the X chromosome. One major limitation in chemical mutagenesis screens has been the inability to systematically identify an abundance of causative mutations. However, with the advent of numerous mapping tools and whole-genome sequencing (WGS), it should be possible to identify hundreds of causative mutations from a single mutagenesis experiment in which a multitude of phenotypes are scored in parallel for each mutation. In humans, the study of Mendelian traits has led to the discovery of thousands of disease genes. Currently, identification of rare disease-causing mutations is rapidly evolving because whole-exome sequencing (WES) technologies are driving the process (Bainbridge et al., 2011a; Lupski et al., 2013).
However, the capability to detect rare variants in personal genomes has provided a diagnostic challenge. Traditionally, the identification of causative or associated genetic variation has relied on gene identification in families or patient cohorts followed by genetic studies in model organisms to define the function of the gene in vivo. Several studies have made use of phenotypic information in Drosophila to identify genes associated with human diseases or traits (Bayat et al., 2012; Neely et al., 2010). However, the large number of variants detected by WES with poorly defined phenotypic consequences makes it challenging to tie a specific variant/gene to a given disease phenotype. Yet, these rare variants contribute strongly to disease (Lupski et al., 2011). The interpretation of such genome-wide variation is hindered by our lack of understanding of gene function for the majority of annotated genes in the human genome. We identified mutations in 165 genes, most of which have not been characterized previously in vivo. We provide data that suggest this gene set can be utilized as a resource to study numerous disease-causing genes. In addition, we present data that there is a fundamental difference between ethyl methanesulfonate (EMS) screens and RNAi screens. Moreover, we show that fly genes with more than one homolog are much more likely to be associated with human genetic disorders. Finally, we demonstrate that merging data sets (genes identified in the fly screen and rare variant alleles in the human homologs in families with Mendelian disease) can assist in human disease gene discovery and provide biological insights into disease mechanisms.
RESULTS

A Mosaic Genetic Screen on the X Chromosome

To isolate mutations in essential genes that are required for proper development, function, and maintenance of the Drosophila nervous system, we performed an F3 adult mosaic screen on an isogenic (iso) y w FRT19A X chromosome (Figure 1 and Figures S1 and S2 available online). We mutagenized males using a low concentration of ethyl methanesulfonate (EMS), established 31,530 mutagenized stocks, and identified 5,857 stocks that carry recessive lethal mutations. To identify a broad spectrum of mutations and isolate genes that affect multiple biological processes, we screened for numerous phenotypes that affect the nervous system. We also screened for seemingly unrelated phenotypes, such as wing and pigmentation defects. Genes that affect wing veins and notching have been shown to play roles in critical pathways that affect numerous organs, including the nervous system. To assess phenotypes in the tissues of interest, we induced mitotic clones in the thorax and wing with Ultrabithorax-flippase (Ubx-FLP) (Jafar-Nejad et al., 2005) and in the eye with eyeless-flippase (ey-FLP) (Newsome et al., 2000). We did not pursue mutations that caused cell lethality or showed no/minor phenotypes (Figure 1A). While these genes are clearly important, they are difficult to study and these mutants were not kept. We selected 2,083 lethal lines with interesting phenotypes for further characterization (Figures 1A and 1B). In the Ubx-FLP screen, we assessed the number and size of mechanosensory organs (bristles) on the fly cuticle to identify genes required for neural development (Figures 1C, 1D, and S2A–S2C) (Charng et al., 2014). We also screened for alterations in the color of bristles and cuticle to permit identification of genes involved in dopamine synthesis, secretion, metabolism, or melanization (Yamamoto and Seto, 2014) (Figure S2D).
In addition, we selected mutations that affect wing morphogenesis to isolate genes that regulate core signaling pathways, including Notch, Wnt, Hedgehog, and BMP/TGF-β (Bier, 2005) (Figures S2E–S2J). Indeed, these pathways have been implicated in synaptic plasticity and neuronal maintenance in both fly and vertebrate nervous systems. In the ey-flp screen, we assessed morphological defects in the eye and head to isolate genes involved in neuronal patterning, specification, and differentiation (Figures S2K–S2O). Moreover, we screened for mutations that cause glossy eye patches (Figure S2P) or mutations that cause a head overgrowth (Figures S2Q–S2S). Glossy eye phenotypes are associated with mitochondrial mutations (Liao et al., 2006), while head overgrowth is linked to genes in Hippo signaling, TOR signaling, intracellular trafficking, and cell polarity/adhesion, and these pathways are implicated in disorders such as autism, intellectual disability, and neurodegenerative diseases (Emoto, 2012; Saksena and Emr, 2009). To isolate mutations that affect neuronal development, function, and maintenance in the visual system, we recorded electroretinograms (ERGs) in mutant eye clones in 3- to 4-week-old flies (Figures 1E–1I). By analyzing the on and off transients of ERGs

Cell 159, September 25, 2014 ©2014 Elsevier Inc.


(Figure 1H), one can assess photoreceptor synaptic activity and axon guidance. A loss or reduction in the amplitude of depolarization (Figure 1G) is typically associated with genes that play a role in phototransduction, loss of which typically causes retinal degeneration (Wang and Montell, 2007). To identify mutations that cause a progressive demise of neurons, we screened young and old animals for ERG defects (Figures 1F and 1I). Ultrastructural defects in the photoreceptor terminals of young and old flies were also examined in some mutants with strong ERG phenotypes (Figures 1J–1M). Based on both the morphology screen and the ERG screen, we attempted to map 1,918 mutations (Figures S1 and S3).

Mutation Identification

On the X chromosome, complementation testing requires a genomic duplication on another chromosome to rescue male lethality. We selected 21 large (0.5 Mb to 2 Mb) duplications that cover 95% of the X chromosome (Cook et al., 2010), crossed them into the mutant backgrounds, and rescued the lethality of 1,385 mutations (Figure S3). This permitted mapping of the lethality to 26 cytological intervals of the X chromosome. Complementation tests between mutants with similar phenotypes rescued by the same duplication allowed us to establish complementation groups. We grouped 450 mutations into 109 multiple-allele complementation groups. The remaining 935 mutant strains include single alleles and a large number of mutations not yet assigned to complementation groups. To map the genes, we first performed deficiency mapping and Sanger sequencing. This allowed identification of the locus for 63 complementation groups. For the remaining groups and single alleles, we performed WGS (Haelterman et al., 2014) and rescued the phenotypes with molecularly defined 80 kb P[acman] duplications (Venken et al., 2010).
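The grouping step described above can be read as a clustering problem: alleles that fail to complement one another are placed in the same group. The following is a minimal sketch of that idea using union-find, not the procedure the authors used. The allele labels, the function name, and the input format (a list of pairs that failed to complement) are hypothetical, and the sketch assumes that failure to complement can be treated as transitive.

```python
from collections import defaultdict


def complementation_groups(alleles, fails_to_complement):
    """Cluster alleles into complementation groups with union-find.

    alleles: iterable of allele labels (hypothetical names).
    fails_to_complement: iterable of (a, b) pairs for which the cross
        did not complement, i.e. the two alleles are assumed to hit
        the same gene.
    Returns (multi_allele_groups, single_alleles).
    """
    parent = {a: a for a in alleles}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in fails_to_complement:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for a in parent:
        groups[find(a)].add(a)

    multi = [g for g in groups.values() if len(g) > 1]
    single = [next(iter(g)) for g in groups.values() if len(g) == 1]
    return multi, single
```

With this representation, the 109 multiple-allele groups reported above would correspond to the first returned list, and alleles that complement everything tested against them would appear in the second.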
By using both approaches, we were able to map 614 mutations to 165 genes, including 81 loci that have not been characterized in vivo (Tables 1 and S1) and are predicted to be involved in many diverse processes based on gene ontology analysis (Figures S2T and S2U).

Chemical Mutagenesis versus RNAi Screens

Two of the phenotypes that we screened, bristle development and depigmentation, allow a direct comparison between this screen and a genome-wide RNAi screen (Mummery-Widmer et al., 2009). This RNAi screen covered 80% of all X chromosome protein-coding genes. Interestingly, only 14% of the genes we identified in the bristle screen were also isolated in the RNAi screen (Figures 2A and 2B). Similarly, only 18% of the genes that we identified from the pigmentation screen were also identified in the RNAi screen (Figures 2C and 2D). Conversely, we did not identify the vast majority of genes that were identified by RNAi. In addition, a comparison of our gene list with those of two RNAi screens for wing margin (Saj et al., 2010) and eye morphological defects (Oortveld et al., 2013) shows that these screens also identified very different sets of genes (Figures 2E and 2F). In summary, chemical screens identify a distinctive set of genes when compared to RNAi-based screens.

Links to Human Diseases Based on Online Mendelian Inheritance in Man

We next sought to determine if the 165 genes we identified in flies could enhance the understanding of human disease-associated genes. Strikingly, 93% (153) of the fly genes isolated have homologs in humans (Tables 1 and S1; Figure 3A). This is a strong enrichment (χ² = 129, p < 0.001) for evolutionarily conserved genes between humans and flies when compared to the whole fly genome, as only 48% of all fly genes have human homologs (Figure 3B).
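To make the scale of this enrichment concrete, the sketch below computes a one-degree-of-freedom chi-square goodness-of-fit statistic from the rounded figures quoted in the text (153 of 165 genes conserved, against a genome-wide expectation of 48%). The text does not state which form of the test was used or give the exact genome-wide counts, so the goodness-of-fit form is an assumption, and the result should only be expected to land near the reported value rather than reproduce it. The function name is hypothetical.

```python
def chi_square_gof(observed_hits, total, expected_fraction):
    """Chi-square goodness-of-fit statistic for a two-category outcome.

    observed_hits: number of genes with a human homolog in the screen.
    total: number of genes identified in the screen.
    expected_fraction: fraction of genes with a human homolog expected
        from the whole genome (rounded value taken from the text).
    """
    expected_hits = total * expected_fraction
    expected_misses = total - expected_hits
    observed_misses = total - observed_hits
    return (
        (observed_hits - expected_hits) ** 2 / expected_hits
        + (observed_misses - expected_misses) ** 2 / expected_misses
    )


# Rounded inputs from the text: 153 of 165 genes, 48% genome-wide.
stat = chi_square_gof(observed_hits=153, total=165, expected_fraction=0.48)
print(f"chi-square statistic: {stat:.1f}")
```

Under these assumptions the statistic comes out at roughly 132, the same order as the reported χ² = 129; the remaining gap is consistent with rounding of the 48% figure or with a different test layout.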
Moreover, the human homologs of 31% (48/153) of the identified fly genes have been associated with human disease in Online Mendelian Inheritance in Man (OMIM), 79% (38/48) of which exhibit neurological signs and symptoms (Figure 3A; Table S1). Of the genes that are conserved but not yet associated with Mendelian diseases with neurological symptoms, 65 genes have potential relationships to neurologic diseases (Figure 3A; Table S2). Therefore, the essential genes that we identified in this screen are highly conserved and many of their homologs have already been implicated in human disorders, showing that the screening strategy is effective. Data analysis revealed a striking difference in the number of genes associated with disease depending on the number of human homologs for each fly gene. Fly genes that have a single human homolog have many fewer disease genes represented in the OMIM database than those that have more than one homolog. There is a 2-fold enrichment (χ² = 10.7, p < 0.001) of fly genes with more than one human homolog associated with diseases in the OMIM database compared to fly genes that have

Figure 1. Summary of the Drosophila X Chromosome Screen
(A and B) Pie chart (A) and bar graph (B) of phenotypes scored in the screen. The numbers represent mutations in each phenotypic category. Note that one strain may show more than one phenotype in (B).
(C and D) Examples of phenotypes observed in the notum. (C) Clones induced in a wild-type background; clone borders are marked by a white dotted line. (D) Example of bristle loss in mutant clones (white arrows) (see Extended Experimental Procedures).
(E–I) Examples of ERG traces from mutant clones in the eye. A typical ERG has an on transient (blue arrows), depolarization (orange line), and an off transient (blue arrowhead). ERGs were recorded in young (1- to 3-day-old) and old (3- to 4-week-old) flies for each genotype. (E) ERGs of young and aged flies that show no obvious difference.
(F) ERGs showing amplitude reduction in aged flies. (G) ERGs showing amplitude and on- and off-transient reduction in both young and aged mutants. (H) ERGs showing no or very small on transient in both young and aged flies. (I) ERGs showing on and off transients that are either absent or very small in aged flies carrying mutant clones in the eye.
(J–M) Ultrastructural analysis using transmission electron microscopy (TEM) on young (2-day-old) and aged (3-week-old) mosaic flies. Red arrowheads indicate the rhabdomeres. (J) Young wild-type control eye: regular array of ommatidial structures with seven rhabdomeres surrounded by pigment (glia) cells. (K) Young mutant rhabdomeres showing intact structures. (L) Aged control eye tissue with intact rhabdomeres. (M) Aged mutant eye tissue with a strong degeneration of rhabdomeres.
See also Figures S1, S2, and S3.

114 Table 1. List of 165 Fly Genes and 259 Corresponding Human Homologs Identified from the Screen Fly Gene Human Homologs (*OMIM) Fly Gene Human Homologs (*OMIM) Fly Gene Human Homologs (*OMIM) Aats-his HARS*, HARS2* COQ7 COQ7 para SCN1A*, SCN2A*, SCN3A, AP-1g AP1G1, AP1G2, AP4E1* Crag DENND4A, DENND4B, SCN4A*,SCN5A*,SCN7A, ari-1 ARIH1 DENND4C SCN8A*, SCN9A*, SCN10A*, SCN11A* arm CTNNB1* Cyp4d2 CYP4V2* Arp2 ACTR2 DAAM DAAM1, DAAM2 parvin PARVA, PARVB, PARVG ATP7 ATP7A*, ATP7B* dlg1 DLG1, DLG2, DLG3*, DLG4 pck CLDN12 baz PARD3, PARD3B Dlic DYNC1LI1, DYNC1LI2 Pgd PGD* ben UBE2N dor VPS18 phl ARAF, BRAF*, RAF1* br - dsh DVL1, DVL2, DVL3 PI4KIIIalpha PI4KA Brms1 BRMS1, BRMS1L Dsor1 MAP2K1*, MAP2K2* por PORCN* cac CACNA1A*, CACNA1B dwg MZF1, ZSCAN22 pot - CACNA1E Efr SLC35B4 PpV PPP6C Cap SMC3* egh - Prosa4 PSMA7, PSMA8 car VPS33A eif2b-ε EIF2B5* Psf3 GINS3 CDC45L CDC45L elav ELAVL1, ELAVL2, rap FZR1 Cdk7 CDK7 ELAVL3, ELAVL4 Rbcn-3A DMXL1, DMXL2 CG ewg NRF1 Rbcn-3B WDR7, WDR72* CG11417 ESF1 fh FXN* Rbf RB1*, RBL1, RBL2 CG11418 MTPAP* flii FLII Rhp RHPN1, RHPN2 CG12125 FAM73A, FAM73B flw PPP1CB RpII215 POLR2A CG fs(1)h BRD2, BRD3, BRD4, BRDT RpS5a RPS5 CG Sas10 UTP3 CG14442 ZNF821 Fur2 PCSK5, PCSK6 schlank CERS1, CERS2, CERS3*, CG14786 LRPPRC* Gtp-bp SRPR CERS4, CERS5, CERS6 CG15208 C21orf2 hfw - scu HSD17B10* CG15896 KIAA0391 Hlc DDX56 Sec16 SEC16A, SEC16B CG1597 MOGS* hop JAK1, JAK2*, JAK3*, TYK2* sgg GSK3A, GSK3B CG1703 ABCF1 Hr4 NR6A1 shi DNM1, DNM2*, DNM3 CG1749 UBA5 Hsc70-3 HSPA5 sicily NDUFAF6* CG Inx2 - skpa SKP1 CG17829 HINFP kdn CS Smox SMAD2, SMAD3* CG18624 NDUFB1 l(1)1bi MYBBP1A smr NCOR1, NCOR2 CG2025 NRD1 l(1)g0156 IDH3A SNF1A PRKAA1, PRKAA2 CG2918 HYOU1 l(1)g0222 ANKLE2 sno SBNO1, SBNO2 CG3011 SHMT1, SHMT2 l(1)g0230 ATP5D Sp1 SP7*, SP8, SP9 CG3149 RFT1* l(1)g0255 FH* stim STIM1*, STIM2 CG32649 ADCK3*, ADCK4 l(1)g0334 PDHA1*, PDHA2 svr CPD CG Marf MFN1, MFN2* tay AUTS2 CG32795 TMEM120A, TMEM120B Mcm6 MCM6* temp PTAR1 CG34401 ZSWIM8 mew ITGA3*, ITGA6*, 
ITGA7* TH1 NELFCD CG3446 NDUFA13* mrna-cap RNGTT tko MRPS12 CG3704 GPN1 mrpl38 MRPL38 trr KMT2C, KMT2D* CG3857 SMG9 mrps25 MRPS25 ubqn UBQLN1, UBQLN2*, CG4078 RTEL1* mrps30 MRPS30 UBQLN3, UBQLN4, UBQLNL CG4165 USP16 mst MSTO1 Upf1 UPF1 CG42237 PLA2G3, PROCA1 mus101 TOPBP1 Upf2 UPF2 CG42593 UBR3 mxc NPAT Usf USF1, USF2 (Continued on next page) 204 Cell 159, , September 25, 2014 ª2014 Elsevier Inc. 96

Table 1. Continued
Fly Gene Human Homologs (*OMIM) Fly Gene Human Homologs (*OMIM) Fly Gene Human Homologs (*OMIM) CG Myb MYB*, MYBL1, MYBL2 Usp7 USP7 CG4407 FLAD1 mys ITGB1, ITGB2*, ITGB4*, vnd NKX2-2, NKX2-8 CG4542 ALG8* ITGB5, ITGB6, ITGB7, ITGB8 Vps26 VPS26A, VPS26B CG wapl WAPAL CG7358 ZC3H13 N NOTCH1*, NOTCH2*, wds WDR5, WDR5B CG8184 HUWE1* NOTCH3*, NOTCH4 wus DNAJC22 CG8636 EIF3G nej CREBBP*, EP300* Ykt6 YKT6 CG8949 WAC Nmd3 NMD3 Zpr1 ZNF259 CG9650 BCL11A, BCL11B, ZNF296 nonc SMG1 bcop COPB1 Chc CLTC, CLTCL1 Nrg CHL1, L1CAM*, NFASC, b-spec SPTB*, SPTBN1, SPTBN2*, CkIIbeta CSNK2B NRCAM SPTBN4 CkIa CSNK1A1, CSNK1A1L Nup93-1 NUP93 dcop ARCN1 comt NSF oc CRX*, OTX1, OTX2*
Human genes associated with Mendelian disease are marked with an asterisk and bold type; the corresponding fly gene is shown in bold. See also Tables S1 and S5.

only one human homolog, 47% versus 22% (Figure 3C). This prompted us to assess if the bias is conserved for all fly genes. We found that a similar bias holds throughout the genome. Fly genes with more than one human homolog are more likely to be associated with diseases in the OMIM database than those with a single homolog, 40% versus 20% (χ² = 386, p < 0.001) (Figure 3D and Extended Experimental Procedures). Indeed, 57 fly genes with more than one human homolog account for 100 diseases in the OMIM database (1.75 diseases per fly gene), an 8-fold enrichment when compared to fly genes with a single homolog (0.22 diseases per fly gene) (Figure 3E). This enrichment is not simply due to an absolute increase in the total number of human homologs, because evolutionarily conserved genes that have more than one homolog are three times more enriched for OMIM diseases, 0.62 versus 0.22 diseases per human gene (Figure 3E). The difference between 1.75 and 0.62 is due to the number of homologs. Indeed, there are on average 3 human homologs for every fly gene that has more than one human homolog (data not shown).
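The ratios in this paragraph follow from a few divisions, which the short sketch below reproduces from the figures quoted in the text. It is only a consistency check on the rounded published numbers; the function and key names are hypothetical.

```python
def omim_enrichment(
    diseases=100,               # OMIM diseases linked to multi-homolog fly genes
    fly_genes_multi=57,         # fly genes with more than one human homolog
    per_fly_gene_single=0.22,   # diseases per fly gene, single homolog (from text)
    per_human_gene_multi=0.62,  # diseases per human gene, multi-homolog (from text)
):
    """Recompute the per-gene disease rates and fold enrichments."""
    per_fly_gene_multi = diseases / fly_genes_multi
    return {
        "per_fly_gene_multi": per_fly_gene_multi,
        "fold_per_fly_gene": per_fly_gene_multi / per_fly_gene_single,
        "fold_per_human_gene": per_human_gene_multi / per_fly_gene_single,
        "implied_homologs_per_fly_gene": per_fly_gene_multi / per_human_gene_multi,
    }


for name, value in omim_enrichment().items():
    print(f"{name}: {value:.2f}")
```

The per-fly-gene rate comes out near 1.75 and the fold change near 8, while the per-human-gene comparison and the implied number of homologs per fly gene both come out near 2.8, in line with the "three times" and "on average 3 human homologs" statements.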
These data suggest that evolutionary gene duplication with divergence and further specialization of gene function may allow tolerance of mutation and viability versus lethality. Since all of the mutations we isolated cause homozygous lethality, we analyzed the correlation between lethality, the number of human homologs, and their links to OMIM diseases for the entire fly genome. The number of essential genes in Drosophila has been estimated to be approximately 5,000 (Benos et al., 2001). Currently only 2,000 essential genes in FlyBase have transposable elements or EMS/X-ray-induced mutations (Marygold et al., 2013), representing about 40% of all essential fly genes. The proportion of essential genes varies with evolutionary conservation: an estimated 11% of the genes that do not have human homologs are essential, whereas 38% of the genes that have a single human homolog are essential (χ² = 354, p < 0.001) (Figure 3F). Finally, an estimated 61% of the fly genes with more than one human homolog are essential. These data show that fly genes that have more than one human homolog are more likely to cause lethality when mutated. Finally, human homologs of essential genes in Drosophila are more likely to be associated with human genetic diseases (χ² = 88, p < 0.001) (Figure 3G). Therefore, we conclude that genes that are essential in flies and have multiple human homologs are the most likely to be associated with human diseases, potentially due to gene duplication and redundancy.

Combining Fly and Human Mutant Screen Data Sets to Identify Disease Genes

We next utilized the fly gene data set uncovered from the forward genetic screen in combination with a human exome data set to identify new human disease genes. We undertook a systematic search of all the variants in the human homologs of the genes identified from the Drosophila screen within WES data generated from undiagnosed cases of Mendelian diseases.
This included 1,929 individuals in the Baylor-Hopkins Centers for Mendelian Genomics (BHCMG) (Figure 4). BHCMG uses next-generation sequencing to discover the genetic basis of as many Mendelian diseases as possible (Bamshad et al., 2012). The study population includes singleton cases with sporadic disease, single families, and when possible, larger cohorts of affected individuals with a range of rare Mendelian phenotypes. A wide range of disorders are under investigation ( In general, patients are recruited when a Mendelian disease seems highly likely and all reasonable efforts at a molecular diagnosis have failed. Due to the rare nature of the phenotypes, information from other patients or additional biological information from model organisms is required to fulfill the burden of proof for gene/disease association in such cases. For this reason, our Drosophila resource of mutant genes was integrated with our human exome variant and Mendelian phenotype (Hamosh et al., 2013) databases, and the combination approach was used to solve some of the cases. We analyzed 237 out of the 259 (Table 1) homologs of fly genes identified through the X chromosome screen as they were validated at the time of analysis. We included all 237 genes,

Figure 2. Comparison of Results from This EMS Screen and Previous RNAi Screens
(A and B) Venn diagram (A) and bar graph (B) showing overlap between two screens for bristle loss defects. The genes that were identified in the EMS screen were also screened by RNAi (Mummery-Widmer et al., 2009) and 10 caused a bristle loss whereas 57 showed no phenotype or caused lethality.
(C and D) Venn diagram (C) and bar graph (D) showing overlap between two screens for pigmentation defects (this screen and the RNAi screen of Mummery-Widmer).
(E) Comparison of the results of these screens for wing notching defects.
(F) Comparison of the results of these two screens for eye morphological defects.

regardless of whether they were previously identified to be associated with Mendelian diseases in OMIM, to avoid any bias. We filtered out variants reported as having greater than 1% allele frequency in databases of control individuals (see Extended Experimental Procedures). For the recessive model data set, we included all variants that met these criteria and were homozygous or had two heterozygous variants affecting the same gene. The latter set was not tested for cis or trans orientation of the variants prior to analysis. A dominant model included heterozygous variants. These were filtered even more stringently for allele frequency such that only variants that had not been observed in the control data sets were studied (Table S3). To explore potential associations with disease, we prioritized variants for segregation analysis within families (Figure 4). We performed Sanger sequencing or explored segregation in families for 64 variants in 24 genes within 34 individuals in the recessive data set and found that 15 variants in 8 genes within 10 individuals fulfilled Mendelian expectations for recessive inheritance. Likewise, for the dominant data set, we explored the segregation for 158 variants in 85 genes within 99 individuals.
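The two filtering models described above can be sketched as simple predicates over a list of variant calls. This is an illustrative sketch only: the `Variant` record, its field names, and the two function names are hypothetical, and the actual pipeline (see Extended Experimental Procedures) is not reproduced here. As in the text, paired heterozygous variants are not checked for cis or trans phase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    individual: str    # sample identifier (hypothetical)
    gene: str          # human homolog of a fly screen gene
    zygosity: str      # "hom" or "het"
    control_af: float  # highest allele frequency seen in control databases


def recessive_candidates(variants, max_af=0.01):
    """Keep rare variants that are homozygous, or genes carrying at
    least two heterozygous rare variants in the same individual."""
    by_key = defaultdict(list)
    for v in variants:
        if v.control_af <= max_af:  # drop variants above 1% in controls
            by_key[(v.individual, v.gene)].append(v)

    hits = {}
    for key, group in by_key.items():
        n_het = sum(v.zygosity == "het" for v in group)
        if any(v.zygosity == "hom" for v in group) or n_het >= 2:
            hits[key] = group  # phase (cis/trans) is not tested here
    return hits


def dominant_candidates(variants):
    """Keep heterozygous variants never observed in control data sets."""
    return [v for v in variants if v.zygosity == "het" and v.control_af == 0.0]
```

Candidates returned by either function would then go forward to segregation analysis within the family, which is the step that decides whether a variant meets Mendelian expectations.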
We found 22 variants in 15 genes within 21 individuals that fulfilled Mendelian expectations of a dominantly inherited disorder in the family under investigation. Interestingly, 22/31 individuals in which the variant met Mendelian expectations had a neurological disease. As a proof-of-principle, we report six patients/families with mutations in three genes. In addition, we identified 25 other individuals in which the variant in the homolog of the fly gene met Mendelian expectation. Some of these individuals were found to have candidate variants in multiple genes, some had too few living relatives for further study, and for others, studies are ongoing. Therefore, a systematic search of variants within the genes identified in the Drosophila screen was able to identify and prioritize a subset of variants with Mendelian inheritance in families that could be studied. Among these, we found examples of known disease genes (DNM2), a novel disease association to a known disease gene (CRX), and novel candidate genes for disease (ANKLE2).

DNM2 and Charcot-Marie-Tooth Neuropathy

Examination of a homolog of Drosophila shibire (shi), the gene that encodes Dynamin, led to a molecular diagnosis for two individuals with heterozygous mutations in DNM2 (Figures S4A and S4B). Both patients were diagnosed with a distal symmetric polyneuropathy consistent with Charcot-Marie-Tooth disease (CMT) (see Extended Results). Mutations in DNM2 are associated with CMT Type 2M (OMIM ), an axonal form primarily affecting neurons (Figure S4C). Patient 1, the proband in Figure S4A, presented at age 12 with hand tremor, calf cramps, lower limb paresthesias, and difficulty with heel walking. She is a member of a family with three generations of neuropathy, and the heterozygous G358R variant cosegregated with CMT (Figure S4A). Patient 2, the proband in Figure S4B (currently 88 years old), presented at age 40 with lower extremity weakness.
His nerve conduction studies showed low amplitudes and borderline

Figure 3. Essential Fly Genes Associated with More Than One Human Homolog Are More Likely to be Linked to Human Diseases
(A) Classification of genes identified in the screen based on human homologs and associated diseases.
(B) Classification of the whole fly genome according to the same criteria as in (A).
(C and D) Relationship between the number of human homologs per fly gene and their association with human diseases for genes identified in the screen (C) and the whole fly genome (D).
(E) The number of human homologs per fly gene and their enrichment in OMIM-associated human diseases.
(F) Relationship between the number of human homologs per fly gene and lethality in flies.
(G) Relationship between genes associated with lethality in flies and OMIM-associated human diseases.
See also Table S2.

slowed velocities. He carries an E341K mutation in DNM2 (Figure S4D). In addition to DNM2, WES revealed a variant in another CMT gene, LRSAM1, in this patient (Figure S4B). Interestingly, dominant as well as recessive mutations in LRSAM1 can cause CMT2P (OMIM ). Hence, either one or a combination of both genes may cause CMT in this family. While some clinical features of the probands made diagnosis difficult, the phenotypes of these cases were indeed consistent with CMT type 2.

CRX and Bull's Eye Maculopathy

Examination of one of the human homologs of Drosophila ocelliless (oc, CRX in humans) led to the identification of three cases of bull's eye maculopathy associated with dominant CRX alleles. oc encodes a homeobox transcription factor that regulates photoreceptor development (Vandendries et al., 1996). Identifying cases of bull's eye maculopathy, a late-onset, slowly progressive retinal disorder, with CRX alleles was surprising because CRX is typically associated with much more severe childhood vision loss seen in dominant cone-rod dystrophy, Leber congenital amaurosis, and autosomal dominant retinitis pigmentosa (OMIM , ).
The three cases of bull's eye maculopathy included two individuals with no family history of retinal disease (patients 3 and 4) and one multigenerational pedigree (patient 5 [S150X]) (Figure 5A). The affected individuals in the family of patient 5 developed symptoms at age 50 (range years), and three family members with the S150X mutation had minimal symptoms at initial evaluation between the age of Despite having near normal vision, ophthalmologic examination of the retina in these individuals revealed advanced bull's eye maculopathy with foveal sparing, explaining the modest effect on vision. Patient 5 exhibits retinal abnormalities (Figures 5B–B′), abnormal autofluorescence in the fundus (Figures 5C–C′), aberrant optical coherence tomography (OCT, Figures 5D–D′), and abnormal electroretinograms (Figure 5E), all consistent with bull's eye maculopathy. All three new alleles encode predicted truncations of the OTX transcription factor domain (Figure 5F). Functional analysis of homozygous oc mutant clones reveals that the ERGs in young animals are nearly normal (Figure 5G) but defective in 7-day-old flies, including reduced amplitude and loss of on transients (Figure 5G, blue arrows). This suggests that the photoreceptors become impaired over time. In summary, the defects in flies and humans show similarities.

ANKLE2 and Microcephaly

The Drosophila screen identified a mutation in l(1)g0222, the homolog of ANKLE2 (dankle2) (Table 1). The mutation causes a loss of thoracic bristles and underdevelopment of the sensory organs in clones (Figure 6A). The human WES data identified

Figure 4. Flowchart for Discovery and Functional Studies of Disease Genes Using the Drosophila Resource and Human Exome Data
See also Table S3 and Figure S4.

variants in ANKLE2 in a family with apparent recessive microcephaly (Figures 6B and 6C). The proband, patient 6, has an extremely small head circumference, a low sloping forehead, ptosis, small jaw, multiple hyper- and hypopigmented macules over all areas of his body, and spastic quadriplegia (Figures 6D–6H; Extended Results, "Clinical Case Histories"). During his first year of life, he had unexplained anemia and glaucoma. At 3 years, he had onset of seizures, and at 5.5 years, his weight was 10.7 kg (−4 SD), length 83.8 cm (−6 SD), and fronto-occipital circumference 38.2 cm (−9 SD). Brain MRI in the newborn period demonstrated a low forehead, several scalp rugae, and mildly enlarged extra-axial space with communication between the posterior lateral ventricles and the mesial extra-axial space. Other brain abnormalities included a simplified gyral pattern, mildly thickened cortex, small frontal horns of the lateral ventricles with mildly enlarged posterior horns of the lateral ventricles, and agenesis of the corpus callosum. The brainstem and cerebellum appeared relatively normal (Figures 6G and 6H). A younger sister born a year later had severe microcephaly, spasticity, and similar hyper- and hypopigmented macules over all areas of her body. She died 24 hr after delivery from cardiac failure associated with poor contractility, although the basis for this was not known. WES data of the proband, his affected sister, and both parents revealed four candidate genes that meet Mendelian expectation and are expressed in the CNS (Table S4). Table S4 shows the variants with their scores from four prediction programs (Liu et al., 2011). ANKLE2 was prioritized as a good candidate. To assess if dankle2 is involved in CNS development, we examined the brains of Drosophila mutant larvae.
Brain size in early third instar larval stages is similar to that of controls (Figure S5A). However, later in the third larval stage, the brain becomes progressively smaller than that of control larvae (Figure S5A and Figures 6I and 6J). To confirm that dankle2 is an ortholog of human ANKLE2, we ubiquitously expressed human ANKLE2 in mutant flies and observed rescue of lethality and the small brain phenotype (Figures 6K and 6L). These data indicate that ANKLE2 is implicated in CNS development and its molecular function is evolutionarily conserved. To explore the cause of the small brain phenotype in dankle2 mutants, we assessed defects in processes that can cause small brain phenotypes: mitosis, asymmetric cell division, and apoptosis (Rujano et al., 2013). The number of neuroblasts, marked by Miranda (Ceron et al., 2001), is severely reduced in late third instar brain lobes (Figures 6M–6O and S5B and S5C). In the few neuroblasts that undergo division, the spindles are properly oriented toward the polarity axis (Figures S5D and S5E). In addition, centriole duplication, impaired in many primary human microcephaly syndromes (Kaindl et al., 2010), is not affected in dankle2 mutants (Figures S5F and S5G). Hence, loss of dankle2 causes a severe reduction in neuroblast number but does not seem to affect asymmetric division or centriole number. To assess proliferation in the CNS, we induced mitotic clones of dankle2 in the brain and labeled them with bromodeoxyuridine (BrdU) (Figures 6P–6R). As shown in Figure 6R, BrdU incorporation is strongly reduced in mutant clones when

Figure 5. Mutations in CRX Cause Bull's Eye Maculopathy
(A) Pedigree of the family of patient 5 (red arrow) with multiple individuals with bull's eye maculopathy. The S150X mutation in CRX was identified in eight family members. DNA was not available for family members for whom screening results are not indicated.
(B–D) Clinical phenotypes of patient 5. (B–B′) Fundus photography shows fine granularity in the outer retina and speckled glistening deposits arranged in a ring around the macula. Peripheral fundi appear unaffected. (C–C′) Autofluorescence images reveal a bull's eye phenotype with hypofluorescent macula surrounded by a hyperautofluorescent ring, suggesting a continuously atrophic macular area. (D–D′) Optical coherence tomography shows central loss of the outer nuclear layer, ellipsoid line, external limiting membrane, and retinal pigment epithelium atrophy corresponding to the area of hypoautofluorescence in (C–C′).
(E) ERG of the proband: electroretinographic traces showed implicit time delay and amplitude reduction in both scotopic and especially photopic responses, in keeping with generalized cone-rod dysfunction.
(F) Structure of CRX protein and mutations in patients 3–5.
(G) ERG of control and oc mutant clone in 2-day-old and 7-day-old (in light) adult flies. Blue arrows indicate the on transient in the ERG. On transients are lost in 7-day-old flies. The orange line indicates the amplitude of the ERG.

Figure 6. ANKLE2 and Microcephaly
(A) dankle2 mutant clone of the peripheral nervous system in the thorax of a fly. In wild-type tissue (GFP, shown in blue), sensory organs are composed of four cells marked by Cut (green), one of which is a neuron marked by ELAV (red). In the mutant clone (−/−, nonblue), the number of cells per sensory organ is reduced to two and the organ does not contain a differentiated neuron.
(B) Pedigree of the family of patient 6 (red arrow) with a severe microcephaly phenotype. Both affected individuals inherited variants from both parents in ANKLE2.
(C) Structure of ANKLE2 protein and mutations in patient 6. Abbreviations: transmembrane domain (TMD), LAP2/emerin/MAN1 domain (LEM), ankyrin repeats (ANK).
(D and E) Clinical phenotypes of the proband with a severe sloping forehead, microcephaly, and micrognathia.
(F) Scattered hyperpigmented macules on the trunk.
(G) Sagittal brain MRI of the proband in infancy with severe microcephaly, agenesis of the corpus callosum, and a collapsed skull with scalp rugae.
(H) Axial brain MRI showing polymicrogyria-like cortical brain malformations.
(I–L) Third instar larval brain of (I) control (y w FRT19Aiso); scale bar, 100 microns, (J) dankle2 mutant, and (K) dankle2 mutant in which the human ANKLE2 cDNA is ubiquitously expressed (Rescue). Note that brain lobe (arrow in I) size is reduced in the dankle2 mutant (J) and the phenotype is rescued by ANKLE2 expression (K). Relative brain lobe volume of control, dankle2, and rescue using 3D confocal images is quantified in (L).
(M–O) Larval CNS neuroblasts (arrowheads) in control and dankle2 mutant. Neuroblasts are marked by Miranda (Mira, green), chromosomes in dividing cells are marked by Phospho-Histone3 (PH3, blue), and spindles in dividing cells are marked by α-tubulin (αtub, red). Relative number of neuroblasts in control and dankle2 is shown in (O).
(P–R) BrdU incorporation (red) in control (P) and dankle2 mutant clones (Q) marked by GFP (green, dotted lines) in larval brains. Differentiated neurons are marked by ELAV (blue). Neuroblast (nb), ganglion mother cells (gmc), and neurons (n) are marked. Quantification of relative BrdU incorporation is shown in (R).
(S–V) TUNEL assay in third instar larval brain lobes of (S) control, (T) dankle2 mutant, and (U) Rescue. Quantification of TUNEL-positive cells/volume (cell death) is shown in (V).
In (L, O, R, and V), error bars indicate SEM, *** indicates a p value < and ** indicates a p value < 0.01. See also Table S4 and Figure S5.

compared to wild-type clones, indicating that cell proliferation is severely impaired. In addition, the mutant clones (Figure 6Q) that contain a neuroblast and its progeny, the ganglion mother cells and neurons, contain many fewer cells than wild-type clones (Figure 6P). Finally, we observe a dramatic increase in apoptotic cells marked by TUNEL in the dankle2 mutant brain lobes

(Figures 6S, 6T, and 6V). This cell death is rescued by the expression of the human cDNA encoding ANKLE2 (Figures 6U and 6V). Therefore, defects in proliferation and excessive apoptosis both contribute to the loss of CNS cells in dankle2 mutants.

DISCUSSION

Here we describe the generation of a large set of chemically induced lethal mutations on the Drosophila X chromosome that were screened for predominantly neurological phenotypes in adult mosaic flies. The mutations were assigned to complementation groups, mapped, and sequenced to associate as many genes as possible with specific phenotypes. We identified and rescued the lethality associated with mutations in 165 genes using a variety of mapping and sequencing methods. These mutations are available through the Bloomington Drosophila Stock Center and provide a valuable resource to study the function of human genes in Drosophila, especially since 93% of the genes are evolutionarily conserved in humans. This mutant collection contains 21 genes associated with human diseases for which no mutations were previously available. The fly mutants thus enable the study of the basic molecular mechanism of 26 human diseases, including Leigh syndrome (CG14786/LRPPRC, l(1)g0334/PDHA1, and sicily/NDUFAF6), congenital disorders of glycosylation (CG1597/MOGS and CG3149/RFT1), Usher syndrome (Aats-his/HARS), Friedreich ataxia (fh/FXN), and amyotrophic lateral sclerosis (ubqn/UBQLN2). Based on the gene list from the Drosophila screen, we explored a database of 1,929 human exomes from a Mendelian disease resource of patients with rare diseases. We examined the personal genomes for rare variants in the human homologs of the fly genes and prioritized a subset of human rare variant alleles for segregation analysis. We report six families with distinct diseases in which the variants segregate and are likely responsible for causing the associated Mendelian disease.
The approach described here provides a valuable resource to study the function of many disease genes in different tissues. We propose that the screen strategy be expanded to the autosomes, and a number of guiding principles should be considered based on this study. First, the use of low concentrations of EMS is important as it minimizes the number of second-site lethal and visible mutations (Haelterman et al., 2014). Second, screening for lethal mutations has major advantages as 93% of the isolated genes that are essential for viability are conserved, whereas only 48% of all Drosophila genes have evolutionarily conserved human homologs. Third, the isolation of lethal mutations also greatly facilitates genetic mapping. Fourth, screening for many different phenotypes casts a broader net and permits isolation of mutations in many different genes, a strategy that is also used in mice (White et al., 2013). Fifth, analyzing different phenotypes revealed that mutations in the majority of the genes cause more than one phenotype, consistent with extensive pleiotropism.

Comparison of the gene list identified from our EMS screen with those from several RNAi screens has shown that these approaches reveal very distinct sets of genes. There are multiple reasons that may lead to this difference. For example, since our screen was aimed at identifying mutations that cause lethality, we have not screened for genes that are nonessential. Thus, a number of genes that are nonessential but cause morphological defects are missed in our screen. On the other hand, RNAi may be inefficient or may cause off-target effects (Green et al., 2014; Mohr, 2014). Regardless of the methods that are being used, rescue experiments and independent validation are critical to determine that the phenotype one observes is due to loss of the gene of interest when performing a genetic screen.
It is interesting to note that from our screen, essential fly genes with two or more homologs in humans have a significantly higher likelihood of being associated with Mendelian diseases than those that only have a single human homolog (Figure 3). This suggests that gene duplications of essential genes and subsequent evolutionary divergence may lead to genes that are partially redundant and more likely to be disease associated. Hence, when analyzing human exomes, it would seem more productive to start with homologs of evolutionarily conserved essential Drosophila genes that have two or more human homologs. In addition to these relationships to Mendelian traits, 17% (26/153) of the fly genes that have human homologs have been identified in GWAS (genome-wide association studies) for neurological disorders (Table S5). Hence, the collection of mutations described here may permit us to study genes for complex traits.

We uncovered a genetic basis in a few cases for which the gene was previously known. For example, the study of DNM2 revealed previously studied phenotypes associated with mutations in the gene (CMT, Figure S4). In another case we observed that mutations in a gene caused unexpected phenotypes. Indeed, we identified three families with bull's eye maculopathy, a condition that is much milder and with a later age of onset than conditions typically associated with CRX truncations such as Leber congenital amaurosis (leading to blindness before a year of life) and cone rod dystrophy (a condition with onset in the first or second decade). Interestingly, other truncating alleles have been reported both N- and C-terminally to the OTX transcription factor domain in patients with these severe phenotypes. Therefore, while CRX mutations can produce variable phenotypes (Huang et al., 2012), bull's eye maculopathy has not been associated with deleterious CRX variants.
Our data suggest that some symptoms may manifest at older ages, and the phenotypic spectrum of CRX mutations includes late-onset mild retinopathy.

We identified deleterious alleles in ANKLE2 in two individuals in a family affected by severe microcephaly. In flies, we observed severe defects in neuroblast proliferation and excessive apoptosis in the third instar larval brain of dankle2 mutants. This knowledge, combined with the observation that expression of human ANKLE2 in dankle2 mutants rescues lethality, brain size, and apoptosis, provides strong evidence that ANKLE2 is responsible for the microcephaly in the family. Moreover, ANKLE2 has been shown to physically and genetically interact with VRK1 in C. elegans and vertebrates (Asencio et al., 2012), and loss of fly VRK1 (also known as ballchen (ball) or nhk-1 in flies) also causes a small brain phenotype in third instar larvae (Cullen et al., 2005). It is therefore interesting to note that mutations in VRK1 also cause microcephaly in patients (Figure S5H) (Gonzaga-Jauregui et al., 2013).

The pattern of brain abnormalities and microcephaly in our patient with ANKLE2 mutations is somewhat similar to that in patients with autosomal recessive CLP1 mutations. CLP1 encodes an RNA kinase involved in tRNA splicing (Karaca et al., 2014; Schaffer et al., 2014). The Clp1 homozygous kinase-dead mouse exhibits microcephaly that worsens with age due to apoptosis. Hence, apoptosis may be a common denominator in these forms of microcephaly.

Phenotypic information of Drosophila mutants allows researchers to understand the potential in vivo function of their human homologs. The cases of oc/CRX and dankle2/ANKLE2 are examples in which some direct phenotypic comparisons are possible between the fly mutant and human conditions. However, one of the major drawbacks of comparing phenotypes in different species is that a comparison between different tissues and organs is not always obvious. How do we relate wing vein defects or a rough eye with the phenotypes observed in human genetic diseases? Numerous strategies have been outlined by Lehner (Lehner, 2013) and one of the most compelling strategies is based on orthologous phenotypes or phenologs (McGary et al., 2010). Genes tend to work in evolutionarily conserved pathways, allowing the direct transfer of genotype-phenotype relations between species. For example, mutations in a subset of genes that function in mitochondrial quality control cause a high incidence of muscle mitochondrial defects in adult flies and Parkinson disease (PD) in humans (Jaiswal et al., 2012), suggesting that new genes that affect muscle mitochondria in adult flies are good candidates for PD. Indeed, it may well be that phenotypic similarities between fly and man will be the exception rather than the rule.
Regardless, we provide evidence that the use of unbiased screens in the fly and the resulting genetic resources will provide opportunities to prioritize human exome variants and to explore the underlying function of these and many other disease-causing genes in vivo.

Gene Identification
When a complementation group was mapped to a small region ( kb, varies depending on available resources), we searched for publicly available lethal mutations that map to the same region using FlyBase (Marygold et al., 2013). We performed complementation tests using >1 mutant allele when possible. For complementation groups that complemented all available lethal mutations in the region, we performed Sanger sequencing using standard methods. To expedite gene identification we also used Illumina-based whole-genome sequencing technology (Haelterman et al., 2014).

Ethics Statement
Informed consent was obtained prior to participation from all subjects or parents of recruited subjects under an Institutional Review Board approved protocol at BCM.

Study Subjects
The analysis of 1,929 exomes from BHCMG described was performed in a database from the WES of over 160 separate phenotypic cohorts. The sequencing data included family-based studies in which both affected and unaffected family members were sequenced, single individuals with unique phenotypes, as well as larger cohorts of up to cases with the same phenotype. Selection of subjects was performed by a phenotypic review committee based on the likelihood of the Mendelian inheritance for the disease phenotype.

Whole-Exome Capture, Sequencing and Data Analysis
All of the subjects enrolled in the BHCMG underwent WES using methods previously described (Lupski et al., 2013) (Extended Experimental Procedures). Produced sequence reads were mapped and aligned to the GRCh37 (hg19) human genome reference assembly using the HGSC Mercury analysis pipeline. Variants were determined and called using the Atlas2 suite to produce a variant call file (VCF).
High-quality variants were annotated using an in-house developed suite of annotation tools (Bainbridge et al., 2011a).

ANKLE2 Construct and Transgenesis
Human ANKLE2 cDNA was cloned into pUASTattB (Bischof et al., 2007) tagged vectors (N-terminal FLAG) using In-Fusion HD Cloning Kit (Clontech) and vector was linearized with NotI and XhoI. The construct was inserted in VK33 (Venken et al., 2006).

EXPERIMENTAL PROCEDURES

Fly Strains
The strains used in this study, including the mutations, duplications, and deletion strains used for mapping, are described in FlyBase (Marygold et al., 2013) (see also Extended Experimental Procedures).

Isogenization and Mutagenesis
Isogenization of the y w FRT19A chromosome was performed using standard genetic crosses. Mutagenesis was performed by feeding isogenized y w FRT19A iso males with sucrose solution containing a low concentration ( mM) of EMS as described (Bökel, 2008). After recovery from mutagenesis, these males were mated en masse with Df(1)JA27/FM7c Kr > GFP virgin females for 3 days. In the F1 generation, y w mut* FRT19A/FM7c Kr > GFP (mut* indicates the EMS-induced mutation) virgins were collected and 33,887 individual females were crossed with FM7c Kr > GFP males to establish independent balanced stocks. A total of 5,859 lines carried lethal mutations and the remaining stocks were discarded.

Complementation and Mapping
Lines that exhibited a strong morphological and/or ERG phenotype were subjected initially to duplication mapping. Subsequently, lines that were rescued by the same duplication and exhibited similar phenotypes were crossed inter se to establish complementation groups based on lethality. Complementation groups were further fine mapped using deficiencies that cover the region of interest.

ACCESSION NUMBERS
The dbGaP accession number for the data reported in this paper is phs v1.p1. Additional details are available at gov/projects/gap/cgi-bin/study.cgi?study_id=phs v1.p1.
Biosample IDs for patient 1 (BAB3655), patient 2 (BAB3659), patient 6 (LR06-300a1), and patient 6 family data (LR06-300a2, LR06-300f, LR06-300m) are in Table S6 and data for these individuals are available at this link: ncbi.nlm.nih.gov/sra?db=sra&dbfrom=bioproject&cmd=link&linkname=bioproject_sra&linkreadablename=sra&ordinalpos=1&idsfromresult=

SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Results, Extended Experimental Procedures, five figures, and five tables and can be found with this article online.

AUTHOR CONTRIBUTIONS
Thorax and wing screen: S.Y. and W.L.C. Eye and ERG screen: M.J., B.X., K.Z., and V.B. Fly mapping: S.Y., M.J., W.L.C., B.X., K.Z., V.B., H.S., N.A.H., G.D., T.L., K.C., U.G., A.T.L.-M., K.L.S., and R. Chen. Bioinformatic analysis: S.Y., M.J., M.F.W., Y.-W.W. and Z.L. Whole-exome sequencing: T.G., S.N.J., D.M., E. Boerwinkle, R.A.G. and J.R.L. Human genome analysis: S.Y., M.J., W.L.C., T.G., E.K., W.W., L.E.L.M.V., J.d.D., T.H., H.S., N.H., G.D.,

T.L., K.C., U.G., and M.F.W. Clinical data and segregation analysis: E.K., D.P., Y.P., M.S. and E. Battaloglu, Y.X., S.H.T. and R.A. S.P., G.M., R.D. Clark, C.J.C. and W.B.D. Fly experiments on oc mutant: M.J., and dankle2 mutant: M.J., N.L., W.L.C. Designed the study and wrote the manuscript: S.Y., M.J., J.R.L., M.F.W. and H.J.B. S.Y. and M.J. contributed equally.

ACKNOWLEDGMENTS
We thank Y. Chen, C. Benitez, X. Shi, S. Gibbs, A. Jawaid, H. Wang, Y.Q. Lin, D. Bei, L. Wang, Y. He, and H. Pan for technical support; Y-N. Jan, T. Kaufman, C. Doe, U. Banerjee, J. Olson, K. Cook, and D. Bilder for reagents; and J. Shulman, J. Zallen, E. Seto, and H.Y. Zoghbi for critical reading of this manuscript. This study was supported by the National Institutes of Health (NIH) 1RC4GM (H.J.B. and R. Chen), U54HG (BHCMG), R01NBS (J.R.L.), 5P30HD (Confocal microscopy at the Intellectual and Developmental Disabilities Research Center), K23NS (W.W.), 5R01GM (H.S.), T32 NS (H.S.), and 5K12GM (H.S.), K08NS076547 (M.F.W.), EY (R.A.), EY (R.A.), and EY (R.A. and S.H.T.). Additional support: Nakajima Foundation and the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital (S.Y.), National Science Centre Poland (DEC-2012/06/M/NZ2/00101) (W.W.), Houston Laboratory and Population Science Training Program in Gene-Environment Interaction from the Burroughs Wellcome Fund (Grant No ) (B.X.), NSF DMS# (Z.L.), Bogazici University Research Foundation (09B101P) (E. Battaloglu), Research to Prevent Blindness to the Department of Ophthalmology, Columbia University (R.A., S.H.T.). H.J.B. is a Howard Hughes Medical Institute Investigator and received funds from the Robert and Renee Belfer Family Foundation, the Huffington Foundation, and Target ALS.
Received: February 7, 2014 Revised: June 4, 2014 Accepted: September 2, 2014 Published: September 25, 2014

REFERENCES
Asencio, C., Davidson, I.F., Santarella-Mellwig, R., Ly-Hartig, T.B., Mall, M., Wallenfang, M.R., Mattaj, I.W., and Gorjánácz, M. (2012). Coordination of kinase and phosphatase activities by Lem4 enables nuclear envelope reassembly during mitosis. Cell 150, Bainbridge, M.N., Wiszniewski, W., Murdock, D.R., Friedman, J., Gonzaga-Jauregui, C., Newsham, I., Reid, J.G., Fink, J.K., Morgan, M.B., Gingras, M.C., et al. (2011a). Whole-genome sequencing for optimized patient management. Sci. Transl. Med. 3, re3. Bamshad, M.J., Shendure, J.A., Valle, D., Hamosh, A., Lupski, J.R., Gibbs, R.A., Boerwinkle, E., Lifton, R.P., Gerstein, M., Gunel, M., et al.; Centers for Mendelian Genomics (2012). The Centers for Mendelian Genomics: a new large-scale initiative to identify the genes underlying rare Mendelian conditions. Am. J. Med. Genet. A. 158A, Bayat, V., Thiffault, I., Jaiswal, M., Tétreault, M., Donti, T., Sasarman, F., Bernard, G., Demers-Lamarche, J., Dicaire, M.J., Mathieu, J., et al. (2012). Mutations in the mitochondrial methionyl-tRNA synthetase cause a neurodegenerative phenotype in flies and a recessive ataxia (ARSAL) in humans. PLoS Biol. 10, e Bellen, H.J., Tong, C., and Tsuda, H. (2010). 100 years of Drosophila research and its impact on vertebrate neuroscience: a history lesson for the future. Nat. Rev. Neurosci. 11, Benos, P.V., Gatt, M.K., Murphy, L., Harris, D., Barrell, B., Ferraz, C., Vidal, S., Brun, C., D le, J., Cadieu, E., et al. (2001). From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies. Genome Res. 11, Bier, E. (2005). Drosophila, the golden bug, emerges as a tool for human genetics. Nat. Rev. Genet. 6, Bischof, J., Maeda, R.K., Hediger, M., Karch, F., and Basler, K. (2007).
An optimized transgenesis system for Drosophila using germ-line-specific φC31 integrases. Proc. Natl. Acad. Sci. USA 104, Bökel, C. (2008). EMS screens: from mutagenesis to screening and mapping. Methods Mol. Biol. 420, Ceron, J., González, C., and Tejedor, F.J. (2001). Patterns of cell division and expression of asymmetric cell fate determinants in postembryonic neuroblast lineages of Drosophila. Dev. Biol. 230, Charng, W.L., Yamamoto, S., and Bellen, H.J. (2014). Shared mechanisms between Drosophila peripheral nervous system development and human neurodegenerative diseases. Curr. Opin. Neurobiol. 27C, Cook, R.K., Deal, M.E., Deal, J.A., Garton, R.D., Brown, C.A., Ward, M.E., Andrade, R.S., Spana, E.P., Kaufman, T.C., and Cook, K.R. (2010). A new resource for characterizing X-linked genes in Drosophila melanogaster: systematic coverage and subdivision of the X chromosome with nested, Y-linked duplications. Genetics 186, Cullen, C.F., Brittle, A.L., Ito, T., and Ohkura, H. (2005). The conserved kinase NHK-1 is essential for mitotic progression and unifying acentrosomal meiotic spindles in Drosophila melanogaster. J. Cell Biol. 171, Emoto, K. (2012). Signaling mechanisms that coordinate the development and maintenance of dendritic fields. Curr. Opin. Neurobiol. 22, Gonzaga-Jauregui, C., Lotze, T., Jamal, L., Penney, S., Campbell, I.M., Pehlivan, D., Hunter, J.V., Woodbury, S.L., Raymond, G., Adesina, A.M., et al. (2013). Mutations in VRK1 associated with complex motor and sensory axonal neuropathy plus microcephaly. JAMA Neurol. 70, Green, E.W., Fedele, G., Giorgini, F., and Kyriacou, C.P. (2014). A Drosophila RNAi collection is subject to dominant phenotypic effects. Nat. Methods 11, Haelterman, N., Jiang, L., Li, S., Bayat, V., Ugur, B., Tan, K.L., Zhang, K., Bei, D., Xiong, B., Charng, W.L., et al. (2014). Large-scale identification of chemically induced mutations in Drosophila melanogaster. Genome Res.
24, Hamosh, A., Sobreira, N., Hoover-Fong, J., Sutton, V.R., Boehm, C., Schiettecatte, F., and Valle, D. (2013). PhenoDB: a new web-based tool for the collection, storage, and analysis of phenotypic features. Hum. Mutat. 34, Huang, L., Xiao, X., Li, S., Jia, X., Wang, P., Guo, X., and Zhang, Q. (2012). CRX variants in cone-rod dystrophy and mutation overview. Biochem. Biophys. Res. Commun. 426, Jafar-Nejad, H., Andrews, H.K., Acar, M., Bayat, V., Wirtz-Peitz, F., Mehta, S.Q., Knoblich, J.A., and Bellen, H.J. (2005). Sec15, a component of the exocyst, promotes notch signaling during the asymmetric division of Drosophila sensory organ precursors. Dev. Cell 9, Jaiswal, M., Sandoval, H., Zhang, K., Bayat, V., and Bellen, H.J. (2012). Probing mechanisms that underlie human neurodegenerative diseases in Drosophila. Annu. Rev. Genet. 46, Kaindl, A.M., Passemard, S., Kumar, P., Kraemer, N., Issa, L., Zwirner, A., Gerard, B., Verloes, A., Mani, S., and Gressens, P. (2010). Many roads lead to primary autosomal recessive microcephaly. Prog. Neurobiol. 90, Karaca, E., Weitzer, S., Pehlivan, D., Shiraishi, H., Gogakos, T., Hanada, T., Jhangiani, S.N., Wiszniewski, W., Withers, M., Campbell, I.M., et al.; Baylor Hopkins Center for Mendelian Genomics (2014). Human CLP1 mutations alter tRNA biogenesis, affecting both peripheral and central nervous system function. Cell 157, Lehner, B. (2013). Genotype to phenotype: lessons from model organisms for human genetics. Nat. Rev. Genet. 14, Liao, T.S., Call, G.B., Guptan, P., Cespedes, A., Marshall, J., Yackle, K., Owusu-Ansah, E., Mandal, S., Fang, Q.A., Goodstein, G.L., et al. (2006). An efficient genetic screen in Drosophila to identify nuclear-encoded genes with mitochondrial function. Genetics 174, Liu, X., Jian, X., and Boerwinkle, E. (2011). dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum. Mutat. 32,

Lu, B., and Vogel, H. (2009). Drosophila models of neurodegenerative diseases. Annu. Rev. Pathol. 4, Lupski, J.R., Belmont, J.W., Boerwinkle, E., and Gibbs, R.A. (2011). Clan genomics and the complex architecture of human disease. Cell 147, Lupski, J.R., Gonzaga-Jauregui, C., Yang, Y., Bainbridge, M.N., Jhangiani, S., Buhay, C.J., Kovar, C.L., Wang, M., Hawes, A.C., Reid, J.G., et al. (2013). Exome sequencing resolves apparent incidental findings and reveals further complexity of SH3TC2 variant alleles causing Charcot-Marie-Tooth neuropathy. Genome Med 5, 57. Marygold, S.J., Leyland, P.C., Seal, R.L., Goodman, J.L., Thurmond, J., Strelets, V.B., and Wilson, R.J.; FlyBase consortium (2013). FlyBase: improvements to the bibliography. Nucleic Acids Res. 41 (Database issue), D751–D757. McGary, K.L., Park, T.J., Woods, J.O., Cha, H.J., Wallingford, J.B., and Marcotte, E.M. (2010). Systematic discovery of nonobvious human disease models through orthologous phenotypes. Proc. Natl. Acad. Sci. USA 107, Mohr, S.E. (2014). RNAi screening in Drosophila cells and in vivo. Methods 68, Mummery-Widmer, J.L., Yamazaki, M., Stoeger, T., Novatchkova, M., Bhalerao, S., Chen, D., Dietzl, G., Dickson, B.J., and Knoblich, J.A. (2009). Genome-wide analysis of Notch signalling in Drosophila by transgenic RNAi. Nature 458, Neely, G.G., Hess, A., Costigan, M., Keene, A.C., Goulas, S., Langeslag, M., Griffin, R.S., Belfer, I., Dai, F., Smith, S.B., et al. (2010). A genome-wide Drosophila screen for heat nociception identifies α2δ3 as an evolutionarily conserved pain gene. Cell 143, Newsome, T.P., Asling, B., and Dickson, B.J. (2000). Analysis of Drosophila photoreceptor axon guidance in eye-specific mosaics. Development 127, Nüsslein-Volhard, C., and Wieschaus, E. (1980). Mutations affecting segment number and polarity in Drosophila.
Nature 287, Oortveld, M.A., Keerthikumar, S., Oti, M., Nijhof, B., Fernandes, A.C., Kochinke, K., Castells-Nobau, A., van Engelen, E., Ellenkamp, T., Eshuis, L., et al. (2013). Human intellectual disability genes form conserved functional modules in Drosophila. PLoS Genet. 9, e Pastor-Pareja, J.C., and Xu, T. (2013). Dissecting social cell biology and tumors using Drosophila genetics. Annu. Rev. Genet. 47, Rujano, M.A., Sanchez-Pulido, L., Pennetier, C., le Dez, G., and Basto, R. (2013). The microcephaly protein Asp regulates neuroepithelium morphogenesis by controlling the spatial distribution of myosin II. Nat. Cell Biol. 15, Saj, A., Arziman, Z., Stempfle, D., van Belle, W., Sauder, U., Horn, T., Dürrenberger, M., Paro, R., Boutros, M., and Merdes, G. (2010). A combined ex vivo and in vivo RNAi screen for notch regulators in Drosophila reveals an extensive notch interaction network. Dev. Cell 18, Saksena, S., and Emr, S.D. (2009). ESCRTs and human disease. Biochem. Soc. Trans. 37, Schaffer, A.E., Eggens, V.R., Caglayan, A.O., Reuter, M.S., Scott, E., Coufal, N.G., Silhavy, J.L., Xue, Y., Kayserili, H., Yasuno, K., et al. (2014). CLP1 founder mutation links trna splicing and maturation to cerebellar development and neurodegeneration. Cell 157, Vandendries, E.R., Johnson, D., and Reinke, R. (1996). orthodenticle is required for photoreceptor cell development in the Drosophila eye. Dev. Biol. 173, Venken, K.J., He, Y., Hoskins, R.A., and Bellen, H.J. (2006). P[acman]: a BAC transgenic platform for targeted insertion of large DNA fragments in D. melanogaster. Science 314, Venken, K.J., Popodi, E., Holtzman, S.L., Schulze, K.L., Park, S., Carlson, J.W., Hoskins, R.A., Bellen, H.J., and Kaufman, T.C. (2010). A molecularly defined duplication set for the X chromosome of Drosophila melanogaster. Genetics 186, Wang, T., and Montell, C. (2007). Phototransduction and retinal degeneration in Drosophila. Pflugers Arch. 
454, White, J.K., Gerdin, A.K., Karp, N.A., Ryder, E., Buljan, M., Bussell, J.N., Salisbury, J., Clare, S., Ingham, N.J., Podrini, C., et al.; Sanger Institute Mouse Genetics Project (2013). Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes. Cell 154, Yamamoto, S., and Seto, E.S. (2014). Dopamine dynamics and signaling in Drosophila: an overview of genes, drugs and behavioral paradigms. Exp. Anim. 63,

Supplemental Information

EXTENDED RESULTS

Clinical Case Histories

Patient 1- G358R Variant in DNM2
Patient 1, the proband in Figure S4A, is a 14-year-old female who presented with hand tremor, calf cramps, paresthesias of the lower limbs, and difficulty with heel walking at age 12. Patient 1 is a member of a Turkish family with three generations of neuropathy. Her first neurological exam showed distal weakness of all limbs with more prominent weakness in the lower limbs. Nerve conduction studies showed low amplitude, and velocities of the median nerve were normal (39 m/s). A sural nerve biopsy revealed rare onion bulbs. The patient was noted by WES to have a heterozygous disease-causing mutation in DNM2 (DNM2:NM_ :exon8:c.G1072A:p.G358R). The heterozygous G358R mutation in the DNM2 gene cosegregated with the CMT phenotype in the family in all six affected individuals who were genotyped.

Patient 2- E341K Variant in DNM2
Patient 2, the proband in Figure S4B, is a now 88-year-old male who developed weakness in the lower extremities starting at age 40 years. His mother, who is deceased, was also reportedly affected but had not been formally evaluated. The patient remains ambulatory. He has no living relatives. The patient has an E341K mutation in DNM2 (DNM2:NM_ :exon8:c.G1021A:p.E341K). In addition to the DNM2 variant, WES revealed a variation in LRSAM1 (c.G334A, p.E112K), a novel mutation altering an amino acid in the fourth leucine-rich repeat region of the protein. Both mutations were confirmed in the proband by Sanger sequencing but no other relatives were alive.

DNM2 Case Summary
Two cases of CMT type 2 were found to have DNM2 mutations. Their phenotypes were consistent with those reported for CMT associated with DNM2 mutations. DNM2 is associated with CMT type 2 and dominant intermediate CMT (Züchner et al., 2005). In patient 1, the G358R mutation has been previously associated with CMT type 2 (Gallardo et al., 2008).
For patient 2, a mutation was found in the same domain, but another mutation in another CMT locus (LRSAM1) was also noted. Either one or a combination of both genes may cause CMT in this family.

Patient 3- Y221fs Variant in CRX
Patient 3 is an individual of European descent with visual symptoms at age 61 years. Noted to have bull's eye maculopathy. No other affected relatives. A CRX frameshift allele was noted (CRX:NM_000554:exon4:c.661delT:p.Y221fs).

Patient 4- D219fs Variant in CRX
Patient 4 is an individual of Asian Indian descent who presented with visual symptoms at age 26 years. Noted to have bull's eye maculopathy. No other affected relatives. A CRX frameshift allele was noted (CRX:NM_000554:exon4:c.657delC:p.D219fs).

Patient 5- S150X Variant in CRX
Patient 5 is the proband in Figure 5A, who presented with visual symptoms at age 43 years. Patient 5 is a member of a large family with Spanish heritage (from the Dominican Republic) with a dominant mode of inheritance. A CRX nonsense allele was noted in the proband (CRX:NM_000554:exon4:c.C449G:p.S150X). In this family, the S150X mutation segregates with the phenotype in the seven affected individuals tested, with ages of onset that range from 28 years to 63 years. The unaffected mother also carried the S150X variant, consistent with incomplete penetrance.

CRX Summary
Three individuals with bull's eye maculopathy were found to have truncating CRX alleles. CRX has been associated with a range of early-onset retinal phenotypes including cone-rod dystrophy (Kitiratschky et al., 2008), Leber's congenital amaurosis (Freund et al., 1998), and retinitis pigmentosa (Huang et al., 2012). While parents carrying the same alleles have been noted to be without visual impairment, presumably in early adulthood (Freund et al., 1998; Silva et al., 2000), the late-onset bull's eye maculopathy has never been noted in association with CRX.
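The variant annotations quoted in the case histories above share a colon-delimited layout (gene, transcript, exon, cDNA change, protein change). As a minimal sketch, assuming that five-field layout holds for every string, the Python below splits one annotation into named fields and assigns a coarse consequence label. The names `Variant`, `parse_annotation`, and `consequence` are hypothetical helpers introduced here for illustration; they are not part of the study's annotation pipeline, and the suffix rule is a simplification, not a full nomenclature parser.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    """One annotation split into its five colon-delimited fields."""
    gene: str
    transcript: str
    exon: int
    cdna: str
    protein: str


def parse_annotation(text: str) -> Variant:
    """Parse a 'gene:transcript:exonN:cDNA:protein' string (assumed layout)."""
    fields = [field.strip() for field in text.split(":")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {text!r}")
    gene, transcript, exon, cdna, protein = fields
    match = re.fullmatch(r"exon(\d+)", exon)
    if match is None:
        raise ValueError(f"unexpected exon field: {exon!r}")
    return Variant(gene, transcript, int(match.group(1)), cdna, protein)


def consequence(protein: str) -> str:
    """Coarse label from the protein-level change (illustrative rule only)."""
    change = protein[2:] if protein.startswith("p.") else protein
    if change.endswith("fs"):
        return "frameshift"
    if change.endswith("X"):
        return "nonsense"
    return "missense or other"


if __name__ == "__main__":
    # Annotation string taken from the Patient 5 case history.
    variant = parse_annotation("CRX:NM_000554:exon4:c.C449G:p.S150X")
    print(variant.gene, variant.exon, consequence(variant.protein))
```

Under this rule the S150X allele is labeled nonsense and a protein change ending in "fs" is labeled frameshift, which matches how the case histories describe the CRX alleles as truncating.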
Patient 6- Compound Heterozygous Variants in ANKLE2
Patient 6 (patient LR06-300a1 in Dobyns database) is a boy of Mexican descent with a birth weight of 2.67 kg (3rd percentile) and a very small head circumference. Examination demonstrated severe microcephaly with low sloping forehead, ptosis, small jaw, multiple hyper- and hypopigmented macules over all areas of his body, and spastic quadriplegia. During his first year of life, he had unexplained anemia, glaucoma, and surgery for ptosis and undescended testes. At 3 years, he had onset of seizures consisting of multiple staring episodes with a few episodes of facial twitching. When evaluated at 5.5 years (Figures 6C–6G), his weight was 10.7 kg (−4 standard deviations, SD), length 83.8 cm (−6 SD), and fronto-occipital circumference (FOC) 38.2 cm (−9 SD). He was awake and had good eye contact and symmetric movements, but severe spastic quadriplegia, adducted thumbs, and flexion contractures at the knees. He had severe microcephaly with low sloping forehead, normal ears, bilateral ptosis, telecanthus, open mouth with drooling, prominent vertebral bodies in the midthoracic region, and unchanged hyper- and hypopigmented macules. Brain MRI in the newborn period demonstrated a low forehead, several scalp rugae, and mildly enlarged extra-axial space with a wide open communication between the posterior lateral ventricles and the mesial extra-axial space. Other changes included a markedly simplified gyral pattern, mildly thickened cortex, small frontal horns of the lateral ventricles with mildly enlarged posterior horns of the lateral ventricles, and agenesis of the corpus callosum. The brainstem and cerebellum appeared relatively normal.

A younger sister born a year later had severe microcephaly, spasticity, and similar hyper- and hypopigmented macules over all areas of her body. She died in the first few weeks of life from cardiac failure associated with poor contractility, although the basis for this was not known. Whole-exome sequencing was performed on the proband, his affected sister, and both parents. Homozygous and compound heterozygous variants were prioritized based on segregation in the family and then by expression in the nervous system. This led to four candidate genes which met Mendelian expectation and were expressed in the CNS. Table S4 shows the variants with their scores and predictions from the phyloP, SIFT, Polyphen2, likelihood ratio test (LRT), and MutationTaster algorithms on dbNSFP (Liu et al., 2011). The ANKLE2 variants noted in the proband (ANKLE2:NM_015114:exon11:c.C2344T:p.Q782X; and NM_015114:exon10:c.C1717G:p.L573V) were prioritized for further study.

EXTENDED EXPERIMENTAL PROCEDURES

Fly Strains
We used the following Drosophila melanogaster strains in this study.

Mutagenesis and Phenotypic Analysis
y w P{neoFRT}19A Df(1)JA27/FM7c Kr-GAL4, UAS-GFP w sn P{neoFRT}19A; Ubx-FLP (Yamamoto et al., 2012) Tub-Gal80 hsflp FRT19A; Act-Gal4, UAS-GFP/CyO cl Ubi-GFP FRT19A/ Dp (1:Y)y + v + ; Ubx-FLP (II chr) cl(1)* P{neoFRT}19A/ Dp(1;Y)y + v + ; ey-flp (Call et al., 2007) (gift from Drs. John Olson and Utpal Banerjee, UCLA). cl(1)* is a recessive cell lethal mutation that is caused by a P-element transposon insertion in RpII215, the major subunit of RNA polymerase II. P{neoFRT}19A and Kr-GAL4, UAS-GFP are abbreviated as FRT19A and Kr>GFP, respectively.
Duplication Mapping Df(1)svr, N spl-1 ras 2 fw 1 /Dp(1;Y)y 2 67 g19.1/c(1)dx, y 1 f 1 (Dp901), Dp(1;f)R, y + /y 1 dor 8 (Dp761), Df(1)64c18, g 1 sd 1 /Dp(1;2;Y)w + /C(1)DX, y 1 w 1 f 1 (Dp936) Df(1)dhd81, w 1118 /C(1)DX, y 1 f 1 ; Dp(1;2)4FRDup/+ (Dp5594) Df(1)JC70/Dp(1;Y)dx + 5, y + /C(1)M5 (Dp5279) Df(1)ct-J4, In(1)dl-49, f 1 /C(1)DX, y 1 w 1 f 1 ; Dp(1;3)sn 13a1 /+ (Dp948) winscy/dp(1;y)bsc174 /C(1)DX, y 1 w 1 f 1 (Dp(1;Y)BSC174) (Gift from Dr. Kevin Cook, Indiana University) Dp(1;Y)619, y + B S /w 1 oc 9 /C(1)DX, y 1 f 1 (Dp5678) y 1 nej Q7 v 1 f 1 /Dp(1;Y)FF1, y + /C(1)DX, y 1 w 1 f 1 (Dp5292) Df(1)v-L15, y 1 /C(1)DX, y 1 w 1 f 1 ; Dp(1;2)v + 75d/+ (Dp929) Df(1)v-N48, f*/dp(1;y)y + v + #3/C(1)DX, y 1 f 1 (Dp3560) Dp(1;Y)BSC1, y + /w 67c23 P{lacW}Smr G0060 /C(1)RA, y 1 (Dp5596), C(1;Y)6, y 1 w* P{white-un4}BE1305 mew 023 /C(1)RM, y 1 pn 1 v 1 ; Dp(1;f)y + (Dp5459) w* l(1)dd4 xr16 / FM7a/Dp(1;Y)y + g + (Dp26276) Dp(1;Y)BSC231, y + P{3 0.RS }BSC27, B S /Df(1)ED7265, w 1118 P{3 0.RS }ED7265/C(1)RA, In(1)sc J1, In(1)sc 8, l(1)1ac 1, sc J1 sc 8 (Dp33250) Dp(1;Y)BSC223, y + P{3 0.RS }BSC16, B S /Df(1)ED7344 w 1118 P{3 0.RS }ED7344/ C(1)RA, In(1)sc J1, In(1)sc 8, l(1)1ac 1, sc J1 sc 8 (Dp33244) Df(1)19, f 1 /C(1)DX, y 1 w 1 f 1 ; Dp(1;4)r + l (Dp5273) Dp(1;Y)W73, y 31d B 1,f +, B S /C(1)DX, y 1 f 1 /y 1 baz EH171 (Dp1537) Df(1)os UE69 /C(1)DX, y 1 f 1 /Dp(1;Y)W39, y + (Dp1538) Dp(1;Y)BSC129, y + P{3 0.RS }BSC22, B S /Df(1)ED7441 w 1118 P{3 0.RS }ED7441/C(1)RA, In(1)sc J1, In(1)sc 8, l(1)1ac 1, sc J1 sc 8 (Dp30450) Df(1)R20, y 1 /C(1)DX, y 1 w 1 f 1 /Dp(1;Y)y + mal + (Dp3033) Bx 3 /C(1)DX, y 1 w 1 f 1 Deficiency Mapping and Complementation Test Lines that carry deficiencies or lethal mutations in specific regions of interest were identified using Cytosearch ( static_pages/cytosearch/cytosearch15.html) in FlyBase (Marygold et al., 2013) and publically available lines were obtained from BDSC. 
Information on the specific lines used for mapping of each complementation group can be obtained upon request.

Evaluation of Isogenized y w FRT19A Lines

Isogenization of the y w FRT19A chromosome was performed using standard genetic crosses. We established 10 independent lines and selected one line (line F1) as the starter line for mutagenesis. We examined the external structure under a light microscope to confirm

the line showed normal morphology. ERG was performed (see below) in both young and aged flies to confirm that the line exhibited normal ERGs. To test whether the line exhibited normal synaptic transmission, we measured excitatory junctional potentials (EJPs), resting potentials, and paired-pulse stimulation (PPS) at the third instar larval neuromuscular junction.

Phenotypic Analysis of Morphological Defects in Mutant Clones

To induce homozygous mutant clones of the recessive lethal mutations obtained, we collected virgin females from each y w mut* FRT19A/FM7c Kr>GFP strain and crossed them with two different FLP lines. To generate clones in the thorax and wing, virgin females were crossed with w sn FRT19A; Ubx-FLP males and we screened y w mut* FRT19A/w sn FRT19A; Ubx-FLP/+ progeny for morphological defects (Figures 1 and S2). Homozygous mutant tissues were marked by y- sn+ bristles, heterozygous tissues by y+ sn+ bristles, and homozygous wild-type tissues by y+ sn- bristles. Since homozygous mutant and wild-type cells are progeny of the same mitotic division, the homozygous mutant clones should be similar in size to the homozygous wild-type clones if the mutation does not affect cell division or cell survival. Flies that comprised mostly homozygous wild-type and heterozygous tissue were annotated as cell lethal. To generate clones in the eye and head, virgin females were crossed with cl(1)* FRT19A/Dp(1;Y)y+ v+; ey-FLP males and we screened y w mut* FRT19A/cl(1)* FRT19A; ey-FLP/+ progeny for morphological defects. Homozygous mutant cells were marked by w- and heterozygous cells were marked by w+. Homozygous wild-type cells were eliminated by the recessive cell lethal mutation (cl(1)*) to give the mutant clones a growth advantage. Morphological defects were documented and recorded in a database that is publicly accessible ( bellenxscreendata/mutantsandphenotypes.xlsx).
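The bristle-marker logic described above for the Ubx-FLP notum clones can be sketched in a few lines of Python. This is only an illustration: the function names, the boolean encoding of the markers, and the bristle counts are hypothetical and are not part of the original screen.

```python
from collections import Counter

# Marker logic from the text (y = yellow, sn = singed):
#   y- sn+  -> homozygous mutant clone
#   y+ sn+  -> heterozygous tissue
#   y+ sn-  -> homozygous wild-type twin spot
GENOTYPE_BY_MARKERS = {
    (False, True): "homozygous mutant",
    (True, True): "heterozygous",
    (True, False): "homozygous wild-type",
}


def classify_bristle(yellow_plus: bool, singed_plus: bool) -> str:
    """Map the visible bristle markers to the inferred clone genotype."""
    try:
        return GENOTYPE_BY_MARKERS[(yellow_plus, singed_plus)]
    except KeyError:
        raise ValueError("y- sn- bristles are not expected in this cross") from None


def clone_size_ratio(bristles) -> float:
    """Ratio of mutant to wild-type twin-spot bristles (hypothetical metric).

    A ratio near 1 is what the text expects when the mutation does not
    affect cell division or survival; a ratio near 0 suggests cell lethality.
    """
    counts = Counter(classify_bristle(y, sn) for y, sn in bristles)
    wild_type = counts["homozygous wild-type"]
    if wild_type == 0:
        raise ValueError("no wild-type twin-spot bristles were scored")
    return counts["homozygous mutant"] / wild_type
```

Any threshold used to call a line "cell lethal" from such a ratio would be a further assumption; the text describes that annotation qualitatively.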
ERG Analysis of Mutant Clones

y w mut* FRT19A/FM7c Kr>GFP virgins were crossed with cl(1)* FRT19A/Dp(1;Y)y+ v+; ey-FLP males to obtain y w mut* FRT19A/cl(1)* FRT19A; ey-FLP/+ flies. Flies were aged for 3-4 weeks at room temperature under a normal light-dark cycle and then ERG was recorded. ERG recordings were performed as described earlier (Xiong et al., 2012).

Duplication Rescue and Rough Mapping Using Large Duplications

Lines that exhibited a strong morphological and/or ERG phenotype were subjected to duplication mapping. Virgin females from the mutant lines were crossed to males carrying different X chromosome duplications (Cook et al., 2010). Progeny were scored to determine whether the duplication rescued the lethality of the mutation. The duplication mapping was performed in 3 rounds. Round 1: Dp901, Dp936, Dp5279, Dp5678, Dp5292, Dp3560, Dp5596, Dp1537, Dp1538, Dp3033. Round 2: Dp761, Dp5594, Dp948, Dp8-28-8A, Dp929, Dp5459, Dp26276, Dp5273. Round 3: Dp33250, Dp33244, Dp Rescued males were crossed to a stock that carries a compound X chromosome (C(1)DX) or to the original mutant stock to establish stocks that stably produce rescued male flies. For Dp5459, this was not possible due to technical reasons.

Complementation Testing

Lines that were rescued by the same duplication and exhibited similar phenotypes were crossed inter se to establish complementation groups based on lethality. We did not perform complementation tests for mutations rescued by Dp5594, Dp948, Dp929, and Dp5273 since the X chromosome duplication did not possess any useful visible markers. In addition, we did not perform complementation tests for mutants rescued by Dp5459 since we were not able to obtain lines that stably produce rescued male flies. In cases where mutations with different phenotypes were fine mapped to similar regions, we performed complementation tests between these lines.
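Grouping lines by inter se complementation, as described above, amounts to merging every pair of lines that fails to complement (the trans-heterozygote is lethal). The sketch below uses a small union-find for this. The line names are hypothetical, and treating failure to complement as transitive is a simplifying assumption that interallelic complementation can violate in practice.

```python
def complementation_groups(lines, failed_pairs):
    """Group mutant lines from pairwise complementation results.

    lines:        iterable of line names (hypothetical identifiers)
    failed_pairs: pairs (a, b) whose trans-heterozygotes were lethal,
                  i.e. the two lines failed to complement each other
    """
    lines = list(lines)
    parent = {line: line for line in lines}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in failed_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for line in lines:
        groups.setdefault(find(line), []).append(line)
    return sorted(sorted(members) for members in groups.values())
```

A line that complements every other line tested ends up in a group of its own.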
Fine Mapping Using Deletions and P[acman] Duplications

Complementation groups were further fine mapped using deficiencies that cover the region of interest. We selected 5 deficiencies to further subdivide the rough mapped regions into smaller regions. Most of the deficiencies we selected were molecularly defined (Cook et al., 2012; Parks et al., 2004). Whenever a molecularly defined deficiency was not available, we selected cytologically mapped deficiencies to cover the gap regions (Lindsley and Zimm, 1992). Rescued males from the mutant lines were crossed to virgin females that carried X chromosome deficiencies. In addition, we occasionally used strains carrying BACs that cover a portion (80 kb) of the X chromosome generated using the P[acman] technology (P[acman] mapping) (Venken et al., 2010). Females from mutant strains were crossed with males that carry the P[acman] duplication and we scored the rescue of lethality in the subsequent generation.

Gene Identification

When a complementation group was mapped to a small region ( kb, varies depending on available resources), we searched for publicly available lethal mutations that map to the same region using FlyBase (Marygold et al., 2013). We performed complementation tests using >1 mutant allele when possible. For complementation groups that complemented all available lethal mutations in the region, we performed Sanger sequencing using standard methods. To expedite gene identification we also used Illumina-based whole-genome sequencing technology (Haelterman et al., 2014).

Cell 159, September 25, 2014 ©2014 Elsevier Inc.
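The deficiency-based fine mapping described above can be viewed as interval arithmetic: a deficiency that fails to complement the mutation places the lesion inside the deleted segment, while a deficiency that complements it excludes that segment. The following is a minimal sketch under that assumption; the coordinates are hypothetical positions in kb, intervals are half-open, and the sketch ignores uncertainty in the breakpoints of cytologically mapped deficiencies.

```python
def narrow_interval(candidate, deficiencies):
    """Narrow a rough-mapped interval using deficiency complementation results.

    candidate:    (start, end) of the rough-mapped region
    deficiencies: list of (start, end, complements) tuples, where complements
                  is True if mutant/deficiency flies were viable
    Returns the list of sub-intervals that may still contain the lesion.
    """
    regions = [candidate]
    for d_start, d_end, complements in deficiencies:
        updated = []
        for start, end in regions:
            if complements:
                # Lesion lies outside the deletion: subtract the deleted segment.
                if d_start > start:
                    updated.append((start, min(end, d_start)))
                if d_end < end:
                    updated.append((max(start, d_end), end))
            else:
                # Lesion lies inside the deletion: keep only the overlap.
                low, high = max(start, d_start), min(end, d_end)
                if low < high:
                    updated.append((low, high))
        regions = updated
    return regions
```

An empty result would indicate inconsistent complementation data, for example a second-site lethal on one of the chromosomes.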

Gene Ontology Analysis

The molecular functions (MF) and biological processes (BF) annotated for each gene were retrieved using the online tool DAVID (the Database for Annotation, Visualization and Integrated Discovery) with FlyBase ID as the identifier (Dennis et al., 2003). MF and BF terms for genes that are not annotated in DAVID were manually extracted through FlyBase. MF and BF were further classified manually. MF and BF associated with individual genes can be downloaded from the following website: bellenxscreendata/go.xlsx

Identification of Human Homologs of Fly Genes and Their Association with OMIM Diseases

The human homologs of the fly genes were identified using the HGNC Comparison of Orthology Predictions (HCOP) search tool (Wright et al., 2005). Once we assembled a human homolog list for the genes identified from our screen and for the whole fly genome based on data downloaded from FlyBase (Marygold et al., 2013), we searched for human diseases that have been associated with each human homolog based on data downloaded from OMIM. Estimation of the number of genes in the fly genome that are lethal versus viable was based on the following criteria. The number of essential loci in the fly genome has repeatedly been estimated to be 5,000 based on saturation mutagenesis experiments (Benos et al., 2001). Currently, 1,934 loci have been associated with a lethal mutation (excluding uncharacterized transposon insertions and RNAi-based phenotypes) according to FlyBase. This is roughly 40% of all essential loci (1,934 of an estimated 5,000) based on the predicted total number of essential genes. The raw data we used to generate the graphs and tables in Figure 3 can be found in Table S3 (genes from the screen) or can be downloaded from the following website (for all genes in the fly genome):

Imaging of Larval Brains

Larval brains (Figures 6I-6K) were dissected in PBS from similar sized late third instar larvae, fixed (3.7% formaldehyde in PBS) for 20 min, and washed in PBS.
DIC images of brains were taken with a Zeiss microscope (Axio Imager-Z2) equipped with the AxioCam MRm digital camera. Images were acquired using the image acquisition software Zen and processed with Adobe Photoshop.

Immunohistochemistry

For immunostaining of the fly PNS in the notum (Figure 6H), white fly pupae (0 hr after puparium formation) were aged at 25°C for hr before dissection. For larval brain immunostainings, wandering third instar larvae brains were dissected, fixed in 3.7% formaldehyde in PBS for 20 min, and washed in PBS with 0.3% Triton X-100 (PBT). Fixed larvae were blocked in 1× PBS containing 5% normal goat serum and 0.3% Triton X-100 (PBTS) for 1 hr. Samples were incubated in primary antibody diluted in PBTS overnight at 4°C. Samples were washed in PBT, incubated in secondary antibody diluted in PBT for 2 hr, and then washed in PBT prior to mounting. Primary antibodies were used at the following dilutions: rat anti-Elav 1:500 (DSHB) (O'Neill et al., 1994), mouse anti-Cut 1:500 (DSHB) (Blochlinger et al., 1990), chicken anti-GFP 1:1,000 (Abcam), rat anti-Mira 1:250 (Chabu and Doe, 2008), rabbit anti-phospho-histone 3 (PH3) 1:1,000 (Upstate Biotechnology), mouse anti-α-tubulin 1:5,000 (Sigma), and Cnn 1:100 (Heuer et al., 1995). Images were taken with a confocal microscope (Zeiss 510 or Zeiss LSM 710) and processed using ImageJ or Imaris (Bitplane).

TUNEL Staining

Wandering 3rd instar larvae brains were dissected, fixed in 100 mM PIPES, 1 mM EGTA, 0.3% Triton X-100, and 1 mM MgSO4 containing 4% formaldehyde for 20 min, and blocked in 1× PBS containing 1% BSA and 0.3% Triton X-100 supplemented with 0.01 M glycine and 0.1% normal serum for 1 hr.
Fixed brains were treated with 20 mg/ml proteinase K for 2 min, rinsed 4× in 1× PBS containing 0.3% Triton X-100 (PBST), re-fixed for 20 min, rinsed 3× in PBST, equilibrated in TdT Equilibration Buffer (Calbiochem Fluorescein-FragEL Kit) for 30 min, and incubated with TdT enzyme and fluorescein-labeled dNTPs at 37°C for 2 hr. Brain images were acquired using a confocal microscope (Zeiss LSM 710) and TUNEL-positive cells were quantified using Imaris (Bitplane).

BrdU Incorporation

Tub-Gal80 hsflp FRT19A; Act-Gal4, UAS-GFP/CyO was crossed to y, w, dankle2 A FRT19A or y, w, FRT19A (control) to generate clones that are marked by GFP (MARCM; Lee and Luo, 1999; Figures 6P and 6Q). Embryos were collected for 24 hr, aged hr, heat shocked at 37°C for 2 hr, and the resulting 3rd instar larvae containing MARCM clones were shifted to blue food containing 1 mg/ml BrdU for 4 hr. Brains were dissected, fixed in 100 mM PIPES, 1 mM EGTA, 0.3% Triton X-100, and 1 mM MgSO4 containing 4% formaldehyde for 20 min, and blocked in 1× PBS containing 1% BSA and 0.3% Triton X-100 supplemented with 0.01 M glycine and 0.1% normal serum for 1 hr. Fixed brains were treated with 2N HCl for 30 min and blocked in 1× PBS containing 1% BSA and 0.3% Triton X-100 with 0.1% normal serum for 1 hr. Samples were incubated with mouse anti-BrdU (DSHB, 1:250), rat anti-Elav (DSHB, 1:250), and rabbit anti-GFP (Invitrogen, 1:1,000) overnight at 4°C. Images were acquired with a Zeiss LSM 710 or Apotome.2 and analyzed using Imaris (Bitplane).

Whole-Exome Sequencing

Briefly, 1 µg of genomic DNA in 100 µl volume was sheared into fragments of approximately base pairs in a Covaris plate with E210 system (Covaris, Inc., Woburn, MA). Genomic DNA samples were constructed into Illumina paired-end precapture libraries

according to the manufacturer's protocol (Illumina Multiplexing_SamplePrep_Guide_ _D) with modifications as described in the BCM-HGSC Illumina Barcoded Paired-End Capture Library Preparation protocol. Libraries were prepared using Beckman robotic workstations (Biomek NXp and FXp models). The complete protocol and oligonucleotide sequences are accessible from the HGSC website. Four precapture libraries were pooled together (approximately 500 ng/sample, 2 µg per pool) and hybridized in solution to the HGSC CORE design (Bainbridge et al., 2011b) (52 Mb, NimbleGen) according to the manufacturer's protocol, NimbleGen SeqCap EZ Exome Library SR User's Guide (Version 2.2), with minor revisions. Captured DNA fragments were sequenced using paired-end mode on an Illumina HiSeq 2000 platform (TruSeq SBS Kits, Part no. FC ), producing 9-10 Gb per sample and achieving an average of 90% of the targeted exome bases covered to a depth of 20× or greater. Illumina sequence analysis was performed using the HGSC Mercury analysis pipeline, which addresses all aspects of data processing and analysis from the initial sequence generation on the instrument to annotated variant calls (SNPs and intraread in/dels). This pipeline uses .bcl files to generate sequence reads and base-call confidence values (qualities) using Illumina primary analysis software (CASAVA). Reads were mapped to the GRCh37 human reference genome using the Burrows-Wheeler aligner (BWA; Li and Durbin, 2009), producing a BAM (binary alignment/map; Li et al., 2009) file. BAM postprocessing, including in/del realignment and quality recalibration, was done using a variety of tools (SAMtools, GATK, etc.). Variants were called using the Atlas2 suite (Atlas-SNP and Atlas-indel; Challis et al., 2012) to produce a variant call file (VCF) (Danecek et al., 2011). Finally, annotation data were added to the VCF using the Cassandra suite of annotation tools (Bainbridge et al., 2011a).
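The coverage figure quoted above (the share of targeted exome bases covered to a depth of 20× or greater) is a simple proportion over per-base depths. The sketch below illustrates that calculation only; the function name and the depth values are hypothetical, and it is not a description of how the Mercury pipeline computes the metric.

```python
def fraction_covered(depths, min_depth=20):
    """Fraction of targeted bases whose read depth is at least min_depth.

    depths: per-base read depths over the capture target (hypothetical input,
            e.g. parsed from a per-base depth report)
    """
    depths = list(depths)
    if not depths:
        raise ValueError("no target bases supplied")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)
```

For a toy target of ten bases with depths `[25, 30, 19, 20, 0, 40, 22, 21, 50, 20]`, eight bases reach 20×, so the function returns 0.8.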
Sanger Confirmation

Primers for Sanger confirmation for all variants reported were designed using Primer3 (Untergasser et al., 2012).

Variant Analysis

Variants were filtered out for having greater than 1% allele frequency in the 1000 Genomes Project, the Exome Variant Server of the NHLBI GO Exome Sequencing Project, or within the Atherosclerosis Risk in Communities Study (ARIC).

SUPPLEMENTAL REFERENCES

Bainbridge, M.N., Wang, M., Wu, Y., Newsham, I., Muzny, D.M., Jefferies, J.L., Albert, T.J., Burgess, D.L., and Gibbs, R.A. (2011b). Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities. Genome Biol. 12, R68.
Blochlinger, K., Bodmer, R., Jan, L.Y., and Jan, Y.N. (1990). Patterns of expression of cut, a protein required for external sensory organ development in wild-type and cut mutant Drosophila embryos. Genes Dev. 4,
Call, G.B., Olson, J.M., Chen, J., Villarasa, N., Ngo, K.T., Yabroff, A.M., Cokus, S., Pellegrini, M., Bibikova, E., Bui, C., et al. (2007). Genomewide clonal analysis of lethal mutations in the Drosophila melanogaster eye: comparison of the X chromosome and autosomes. Genetics 177,
Chabu, C., and Doe, C.Q. (2008). Dap160/intersectin binds and activates aPKC to regulate cell polarity and cell cycle progression. Development 135,
Challis, D., Yu, J., Evani, U.S., Jackson, A.R., Paithankar, S., Coarfa, C., Milosavljevic, A., Gibbs, R.A., and Yu, F. (2012). An integrative variant analysis suite for whole exome next-generation sequencing data. BMC Bioinformatics 13, 8.
Cook, R.K., Christensen, S.J., Deal, J.A., Coburn, R.A., Deal, M.E., Gresens, J.M., Kaufman, T.C., and Cook, K.R. (2012). The generation of chromosomal deletions to provide extensive coverage and subdivision of the Drosophila melanogaster genome. Genome Biol. 13, R21.
Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., et al.; 1000 Genomes Project Analysis Group (2011). The variant call format and VCFtools. Bioinformatics 27,
Dennis, G., Jr., Sherman, B.T., Hosack, D.A., Yang, J., Gao, W., Lane, H.C., and Lempicki, R.A. (2003). DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 4, 3.
Freund, C.L., Wang, Q.L., Chen, S., Muskat, B.L., Wiles, C.D., Sheffield, V.C., Jacobson, S.G., McInnes, R.R., Zack, D.J., and Stone, E.M. (1998). De novo mutations in the CRX homeobox gene associated with Leber congenital amaurosis. Nat. Genet. 18,
Gallardo, E., Claeys, K.G., Nelis, E., García, A., Canga, A., Combarros, O., Timmerman, V., De Jonghe, P., and Berciano, J. (2008). Magnetic resonance imaging findings of leg musculature in Charcot-Marie-Tooth disease type 2 due to dynamin 2 mutation. J. Neurol. 255,
Heuer, J.G., Li, K., and Kaufman, T.C. (1995). The Drosophila homeotic target gene centrosomin (cnn) encodes a novel centrosomal protein with leucine zippers and maps to a genomic region required for midgut morphogenesis. Development 121,
Kitiratschky, V.B., Nagy, D., Zabel, T., Zrenner, E., Wissinger, B., Kohl, S., and Jägle, H. (2008). Cone and cone-rod dystrophy segregating in the same pedigree due to the same novel CRX gene mutation. Br. J. Ophthalmol. 92,
Lee, T., and Luo, L. (1999). Mosaic analysis with a repressible cell marker for studies of gene function in neuronal morphogenesis. Neuron 22,
Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25,
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R.; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25,
Lindsley, D.L., and Zimm, G.G. (1992).
The Genome of Drosophila melanogaster (San Diego: Academic Press).
O'Neill, E.M., Rebay, I., Tjian, R., and Rubin, G.M. (1994). The activities of two Ets-related transcription factors required for Drosophila eye development are modulated by the Ras/MAPK pathway. Cell 78,

Parks, A.L., Cook, K.R., Belvin, M., Dompe, N.A., Fawcett, R., Huppert, K., Tan, L.R., Winter, C.G., Bogart, K.P., Deal, J.E., et al. (2004). Systematic generation of high-resolution deletion coverage of the Drosophila melanogaster genome. Nat. Genet. 36,
Silva, E., Yang, J.M., Li, Y., Dharmaraj, S., Sundin, O.H., and Maumenee, I.H. (2000). A CRX null mutation is associated with both Leber congenital amaurosis and a normal ocular phenotype. Invest. Ophthalmol. Vis. Sci. 41,
Untergasser, A., Cutcutache, I., Koressaar, T., Ye, J., Faircloth, B.C., Remm, M., and Rozen, S.G. (2012). Primer3 new capabilities and interfaces. Nucleic Acids Res. 40, e115.
Wright, M.W., Eyre, T.A., Lush, M.J., Povey, S., and Bruford, E.A. (2005). HCOP: the HGNC comparison of orthology predictions search tool. Mamm. Genome 16,
Xiong, B., Bayat, V., Jaiswal, M., Zhang, K., Sandoval, H., Charng, W.L., Li, T., David, G., Duraine, L., Lin, Y.Q., et al. (2012). Crag is a GEF for Rab11 required for rhodopsin trafficking and maintenance of adult photoreceptor cells. PLoS Biol. 10, e
Yamamoto, S., Charng, W.L., Rana, N.A., Kakuda, S., Jaiswal, M., Bayat, V., Xiong, B., Zhang, K., Sandoval, H., David, G., et al. (2012). A mutation in EGF repeat-8 of Notch discriminates between Serrate/Jagged and Delta family ligands. Science 338,
Züchner, S., Noureddine, M., Kennerson, M., Verhoeven, K., Claeys, K., De Jonghe, P., Merory, J., Oliveira, S.A., Speer, M.C., Stenger, J.E., et al. (2005). Mutations in the pleckstrin homology domain of dynamin 2 cause dominant intermediate Charcot-Marie-Tooth disease. Nat. Genet. 37,

Figure S1. Flow Chart of the F3 Adult Mosaic Genetic Screen on the X Chromosome of Drosophila, Related to Figure 1
(A and B) The y w FRT19A chromosome was isogenized (A) and male flies were mutagenized (B).
(C and D) The mutagenized X chromosomes were balanced with the FM7c Kr>GFP balancer (C) and strains with X-linked recessive lethal mutations were kept (D).
(E) Mosaic flies with Ubx-FLP were used to screen mutant clones in wing and thorax, and mosaic flies with ey-FLP were used to screen mutant clones in head and eye.
(F and G) We assessed morphological and ERG defects in mosaic flies.

Figure S2. Phenotypic Screening of Morphological and Electrophysiological Defects in Mutant Clones, Related to Figure 1
(A-D) Examples of phenotypes observed in the fly notum. Homozygous wild-type bristles are marked by singed. Homozygous mutant bristles are marked by yellow (encircled by dotted lines). Heterozygous bristles are wild-type for these two markers. (A) Macrochaetae loss. (B) Short bristles. (C) Cell lethal. (D) Depigmentation.
(E-J) Examples of phenotypes observed in wings. The exact clonal boundaries are not obvious since yellow does not show a strong phenotype in the wing. (E) Notching. (F) Ectopic wing margin. (G) Vein loss (arrow) and gain (arrowhead). (H) Ectopic bristles on the wing blade. (I) Wing blistering. (J) Crinkled wings.
(K-S) Examples of phenotypes observed in eyes and heads. Homozygous wild-type cells are eliminated by a recessive cell lethal mutation. Homozygous mutant clones in the eyes are marked by white. Heterozygous clones appear red (white+). (K) Wild-type eye and head clones. (L) Rough eye. (M) Cell lethal. (N) Small eye. (O) Ectopic eye (black arrow). (P) Glossy eye. (Q) Ectopic antenna formation (two left antennae are marked by two black arrows) and overgrowth of the eye and head. (R) Non-cell-autonomous overgrowth of the eye (marked by a white arrow). (S) Overgrowth of the head cuticle (marked by a white arrow).
(T and U) Gene Ontology (GO) analysis based on (T) molecular functions and (U) biological processes.

Figure S3. Flow Chart of Mapping of X-Linked Recessive Lethal Mutants, Related to Figure 1.

Figure S4. Missense Mutations in DNM2 Associated with Charcot-Marie-Tooth Disease, Related to Figure 4
(A) Pedigree of the family of patient 1, a 14-year-old (red arrow) who was diagnosed with CMT neuropathy, demonstrating 13 individuals affected with neuropathy (black indicates clinical neuropathy). Six affected individuals were genotyped and all six carry the G358R allele. Two additional unaffected individuals did not carry the allele.
(B) Pedigree of the family of patient 2 with CMT neuropathy (black indicates clinical neuropathy). This individual (red arrow) was also found to be heterozygous for an E341K allele in DNM2 and a heterozygous variant in LRSAM1 (E112K).
(C) Sural nerve biopsy of control and patient 1 showing rare onion bulb structures (red arrows).
(D) Structure of the DNM2 protein and position and nature of the mutations in patients 1 and 2.

Figure S5. dankle2 Regulates Brain Size, Related to Figure 6
(A) Quantification of control and dankle2 A larval brain lobe volume from early, mid, and late third instar larvae. ** indicates p value < 0.01 and * indicates p value <
(B and C) Neuroblast cells in control (B) and dankle2 A mutants (C) are marked by Miranda (Mira, green). PH3 (red) marks chromosomes in dividing cells. Quantification of the number of neuroblasts in these brains is shown in Figure 6O.
(D and E) Neuroblasts undergoing mitosis in control (D) and dankle2 A larval brains (E). Mitotic spindles (α-tub, red) are oriented toward the polarity axis both in control and dankle2 A. Mira (green) marks the basal side of asymmetrically dividing neuroblast cells. Condensed chromosomes are marked by PH3 (blue).
(F and G) Centrioles in mitotic neuroblasts in larval brains are marked by Cnn (red). Mitotic spindles are marked by α-tub (green) and DNA is marked by DAPI (blue).
(H) A comparison of clinical features observed in patients carrying variants in VRK1 and ANKLE2.

CHAPTER IV.4 Whole exome sequencing identifies an adult-onset case of methylmalonic aciduria and homocystinuria type C (cblC) with non-syndromic bull's eye maculopathy (Published Paper)

Ophthalmic Genetics, 2015; 36(3): Published with license by Taylor & Francis ISSN: print / online DOI: /

CASE REPORT

Whole Exome Sequencing Identifies an Adult-Onset Case of Methylmalonic Aciduria and Homocystinuria Type C (cblC) with Non-Syndromic Bull's Eye Maculopathy

Frederick T. Collison 1, Yajing (Angela) Xie 2, Tomasz Gambin 3, Shalini Jhangiani 3, Donna Muzny 4, Richard Gibbs 3,4, James R. Lupski 3,4, Gerald A. Fishman 1,5 and Rando Allikmets 2,6

1 The Pangere Center for Hereditary Retinal Diseases, The Chicago Lighthouse for People Who Are Blind or Visually Impaired, Chicago, IL, USA, 2 Department of Ophthalmology, Columbia University, New York, NY, USA, 3 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA, 4 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA, 5 Department of Ophthalmology, University of Illinois at Chicago, Chicago, IL, USA, and 6 Department of Pathology and Cell Biology, Columbia University, New York, NY, USA

ABSTRACT

Background: Methylmalonic aciduria and homocystinuria type C (cblC), a disorder of vitamin B12 (cobalamin) metabolism caused by mutations in the MMACHC gene, presents with many systemic symptoms, including neurological, cognitive, psychiatric, and thromboembolic events. Retinal phenotypes, including maculopathy, pigmentary retinopathy, and optic atrophy, are common in the early-onset form of the disease but are rare in adult-onset forms.

Materials and Methods: An adult Hispanic female presented with decreased central vision, bilateral pericentral ring scotomas and bull's eye-appearing macular lesions at 28 years of age. Her medical history was otherwise unremarkable except for iron deficiency anemia and both urinary tract and kidney infections. Screening of the ABCA4 gene, mutations in which frequently cause bull's eye maculopathy, was negative. Subsequently, analysis with whole exome sequencing was performed.
Results: Whole exome sequencing discovered compound heterozygous mutations in MMACHC, c.G482A:p.Arg161Gln and c.270_271insA:p.Arg91Lysfs*14, which segregated with the disease in the family. The genetic diagnosis was confirmed by biochemical laboratory testing, showing highly elevated urine methylmalonic acid/creatinine and homocysteine levels, and suggesting disease management with hydroxycobalamin injections and carnitine supplementation.

Conclusions: In summary, a unique case of an adult patient with bull's eye macular lesions and no clinically relevant systemic symptoms was diagnosed with cblC by genetic screening and follow-up biochemical laboratory tests.

Keywords: Bull's eye maculopathy, methylmalonic aciduria and homocystinuria type C (MMACHC), whole-exome sequencing

Received 30 October 2014; revised 22 December 2014; accepted 18 January 2015; published online 12 February 2015

© F. T. Collison, Y. Xie, T. Gambin, S. Jhangiani, D. Muzny, R. Gibbs, J. R. Lupski, G. A. Fishman, R. Allikmets. This is an Open Access article. Non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly attributed, cited, and is not altered, transformed, or built upon in any way, is permitted. The moral rights of the named author(s) have been asserted.

Correspondence: Rando Allikmets, Columbia University, Eye Institute Research, Rm. 202, 160 Fort Washington Avenue, New York, NY 10032, USA. E-mail: rla22@columbia.edu

INTRODUCTION

Methylmalonic aciduria and homocystinuria type C (cblC; OMIM #277400) is a recessively inherited disorder of vitamin B12 (cobalamin) metabolism [1], which is caused by mutations in the MMACHC gene [2]. Maculopathy (including bull's eye maculopathy), pigmentary retinopathy, and optic atrophy are all common in early-onset disease [3]. Conversely, ocular findings are rare [4,5] in the late-onset type, where systemic symptoms become apparent after age 4 [1]. Although optic atrophy and peripheral retinal pigmentary changes have been noted in late-onset cblC disease [4,6], a maculopathy has not been described [1]. Patients with the late-onset form present with a variety of systemic deficits, including neurological complaints, cognitive decline, psychiatric disturbances, thromboembolic events, and gait disturbances [4,7]. Some of the late-onset cblC cases also present retinal changes, but no late-onset cblC case has been described presenting without systemic symptoms but with a retinal degenerative phenotype [1].

MATERIALS AND METHODS

Case Report

A 28-year-old Hispanic female (Figure 1) presented in 2006 with decreased central vision and mild photoaversion of approximately two years' duration. She reported subjectively normal color vision, night vision and peripheral vision. Her medical history was positive for iron deficiency anemia, occasional urinary tract infections, and recurrent kidney infections in Best corrected Snellen visual acuities (BCVA) were 20/80 OD and 20/70-2 OS with a low myopic refraction. She correctly read only two out of twelve Ishihara color plates OD, and zero out of twelve OS.

FIGURE 1. Pedigree of the family and segregation of the MMACHC mutations with the disease phenotype. The open circle and squares represent the unaffected female and male family members, respectively; the closed circle represents the affected female patient.
Goldmann visual field testing demonstrated a pericentral ring scotoma in each eye (Figure 2A and B). Funduscopy showed bilateral bull's eye-appearing macular lesions (Figure 2C and D) with no visible fundus flecks, vessel attenuation, optic nerve pallor or peripheral pigmentary changes. A scotopic and photopic full-field electroretinogram (ERG) of the right eye showed normal a- and b-wave amplitudes and implicit times. The patient reported no current or previous use of hydroxychloroquine or any other medication or toxic exposure that could be potentially responsible for the bull's eye-appearing lesions in her maculae. The patient reported no family history of a retinal dystrophy, and examination of both parents and a brother (Figure 1) showed essentially normal maculae; only a few fine drusen near the fovea were observed in the examinations of both parents. By the most recent visit in 2013 at age 35, the patient was still working at her job as a production planner, although her BCVA had dropped to 10/100-2 OD and 10/80 OS. Fundus photos (Figure 2E and F) and near-infrared autofluorescence (Figure 2G and H) demonstrated bull's eye macular dystrophy with relative preservation of the retinal pigment epithelium (RPE) at the foveolar region. Spectral-domain optical coherence tomography also showed preservation of the RPE at the fovea in the presence of marked foveal thinning (Figure 3). However, the patient's eccentric viewing and level of visual acuity evidenced a notable degree of compromised foveal function. A repeat full-field ERG (Figure 4) of the right eye at that visit still showed amplitudes and implicit times that were within normal limits under both scotopic rod-isolated and photopic cone-isolated test conditions, whereas the scotopic rod and cone combined response b-wave was mildly reduced and delayed compared to normal.
Genetic Analysis

Since the bull's eye maculopathy phenotype is frequent in patients with diseases caused by mutations in the ABCA4 gene [8,9], the patient was first screened for mutations in ABCA4 by direct sequencing with negative results. Consequently, all members of the family were subjected to whole exome sequencing (WES) at the Baylor College of Medicine Human Genome Sequencing Center as part of the Baylor-Hopkins Center for Mendelian Genomics. Exome capture was performed on the NimbleGen array with minor revisions, and sequencing was performed on the Illumina platform. Sequence analysis was performed using the HGSC Mercury analysis pipeline. Samples achieved an average of 90% of the targeted exome bases covered to a depth of 20× or greater. Possible disease-associated variants were determined by filtering based on minor allele

FIGURE 2. Goldmann visual fields of the left (A) and right (B) eyes from the initial visit in 2006, showing ring scotomas consistent with the bull's eye-appearing macular lesions. Shaded areas indicate the target was not seen in that area. Fundus photos from 2006 (C, right eye and D, left eye) and 2013 (E and F) show some extension of RPE atrophy, apparent at the peripheral boundaries, as well as in the foveal region, in both eyes over 7 years. Near-infrared autofluorescence (G and H) from the 2013 follow-up visit shows hyper-autofluorescence at the fovea, surrounded by hypo-autofluorescence (bull's eye), further surrounded by a ring of hyper-autofluorescence.

140 Adult-Onset cblc with Bull s Eye Maculopathy 273 FIGURE 3. Spectral domain optical coherence tomography line scans (right eye top and left eye bottom, corresponding to the location of the horizontal lines on the near-infrared autofluorescence images in Figures 2G and H, respectively) show intact RPE at the foveola in the presence of marked foveal thinning, surrounded by atrophy of the outer retinal layers and retinal pigment epithelium, with relatively more normal retinal structure at the margins of the line scans in both eyes. FIGURE 4. Full-field electroretinogram (Espion E3, Diagnosys LLC, Littleton, MA) of the patient s right eye at age 35 during a followup visit. Boxes represent ranges of 13 visually-normal subjects. The cone isolated amplitudes and implicit times (top), and the rod isolated single flash response (bottom left) are within the normal range. The combined rod and cone single flash response (bottom right) is mildly reduced and delayed compared to normal. Published with license by Taylor & Francis 122

141 274 F. T. Collison et al. frequency, predicted pathogenicity and by segregation analysis in all family members. RESULTS Screening of the ABCA4 gene on an array revealed no disease-associated variants. WES analysis uncovered two compound heterozygous mutations, c.g482a:p.arg161gln and c.270_271insa:p.arg9 1Lysfs*14, in the MMACHC gene in the proband, which segregated with the disease in the family (Figure 1). Sanger sequencing confirmed compound heterozygous mutant alleles in the proband and heterozygous alleles in carrier parents again documenting trans mutations in the patient. Both of these mutations have been frequently described as causes of cblc ( Consequently, based upon the molecular diagnosis established by genetic studies, the patient was referred for biochemical laboratory testing, which showed a urine methylmalonic acid/creatinine level of 510 mmol/mol creatinine (reference range ) and a urine homocysteine level of 51 mg/day (reference range 0 32). The patient was then referred for disease management by a pediatric genetics specialist. Physical examination at that visit was normal; specifically, no focal neurological abnormalities were found. She was subsequently prescribed intramuscular hydroxycobalamin (vitamin B12) injections, as well as carnitine supplementation due to low carnitine levels. DISCUSSION To our knowledge, no cases with ocular findings as the only presenting sign in an adult patient with methylmalonic aciduria and homocystinuria type C have been described before. In addition, maculopathy has not been described in late-onset cblc disease. 10 It is possible that subtle, sub-clinical bull s eye macular changes have been missed in previously reported cases of late-onset cblc disease if sensitive tests such as SD-OCT, autofluorescence imaging or multifocal ERG were not performed. The late-onset form of cblc is usually discovered due to neurological complaints and cognitive decline. 
The history of iron deficiency anemia and urinary tract infections in our case has been described in other cases of late-onset cblC disease, 11 although these are not unusual findings in an otherwise healthy adult female. Phenotypes associated with the two disease-causing mutations in the MMACHC gene, c.G482A:p.Arg161Gln and c.270_271insA:p.Arg91Lysfs*14, have been previously characterized. The c.270_271insA:p.Arg91Lysfs*14 variant is the most common MMACHC mutation (>40% of all disease-associated alleles), and is associated with a more severe, early-onset phenotype, especially when found in homozygosity. 2,12 The c.G482A:p.Arg161Gln mutation is more often associated with the milder, late-onset phenotype. 2,11-13 Interestingly, the exact same combination of mutations has been described before in at least six cases with teenage or adult onset of cblC. 13-15 All cases, however, presented with severe systemic symptoms, often including muscular abnormalities in lower extremities, loss of bowel and/or bladder function, progressive encephalopathy, neurological anomalies, and thromboembolic complications. No degenerative retinal changes were described in any of these patients. 13,14 Patients who were compound heterozygous for the c.G482A:p.Arg161Gln and c.270_271insA:p.Arg91Lysfs*14 variants had variable ethnic origin, including Italian, German, Hispanic, and Chinese descent. Perhaps the most relevant study with regard to the patient described in this report is the description of a patient with late-onset cblC presented by Tsai and colleagues, 11 since that patient matched the case described here by age, gender, and Hispanic ethnicity. However, in that case a 36-year-old woman had a constellation of systemic features including joint hypermobility, arthritis, chronic anemia, urogenital fistula, and spinal cord infarct. She also presented with emotional difficulties beginning in her teens and was diagnosed with depression and psychosis requiring hospitalization.
11 Bilateral cataracts were the only ocular findings described. Conversely, the patient described here, although matched by age, ethnicity, and gender, did not report any systemic abnormalities, other than iron deficiency anemia and urinary tract infections, but instead presented with an ocular finding (bull's eye maculopathy) as a primary finding. More macular dystrophy patients with mutations in the MMACHC gene will need to be identified before cblC becomes a serious consideration for a disease underlying a non-syndromic macular dystrophy. However, patients with bull's eye-appearing macular lesions may be questioned about systemic symptoms, possibly prompting testing for methylmalonic acid and homocysteine levels. Even with overt neurological symptoms, the diagnosis of late-onset cblC disease can be easily missed or delayed, 7,16 but urine or plasma markers of the disease can be easily detected with simple and inexpensive clinical biochemical laboratory tests. The implementation of whole exome and whole genome sequencing has allowed identification of causes of metabolic diseases which previously have remained obscure. 17 These include new genes, including those for methylmalonic aciduria, 18 and many cases of phenotypic expansion of known loci; i.e. where the clinical phenotype has not directed the screening of underlying genetic causes. 19 Importantly,

as shown here, clinical phenotype can be very divergent from those described, and genetic testing can unequivocally determine the cause of a disease. Screening of the MMACHC gene is an accessible option for confirmation of cblC disease. An early diagnosis in patients with cblC disease has added importance because often systemic symptoms can not only improve, 7,16 but also may be prevented, delayed, or ameliorated with hydroxycobalamin treatment. 1 In conclusion, a patient with bull's eye macular lesions and no neurological, psychiatric or cognitive symptoms was determined to have cblC, a potentially debilitating disorder if left untreated. Due to the precise molecular diagnosis, achieved through genetic studies and confirmed by follow-up biochemical laboratory tests, she was treated for her condition, thereby possibly improving her systemic prognosis.

DECLARATION OF INTEREST

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper. This study is supported in part by NIH grants EY021163, EY019861, HG and EY (Core Support for Vision Research), by unrestricted funds from Research to Prevent Blindness (New York, NY) to the Department of Ophthalmology, Columbia University, and by the Pangere Family Corporation, The Chicago Lighthouse for People Who Are Blind or Visually Impaired (Chicago, IL).

REFERENCES

1. Carrillo-Carrasco N, Chandler RJ, Venditti CP. Combined methylmalonic acidemia and homocystinuria, cblC type. I. Clinical presentations, diagnosis and management. J Inherit Metab Dis 2012;35:
2. Lerner-Ellis JP, Tirone JC, Pawelek PD, et al. Identification of the gene responsible for methylmalonic aciduria and homocystinuria, cblC type. Nat Genet 2006;38:
3. Gizicki R, Robert MC, Gomez-Lopez L, et al. Long-term visual outcome of methylmalonic aciduria and homocystinuria, cobalamin C type. Ophthalmology 2014;121:
4. Thauvin-Robinet C, Roze E, Couvreur G, et al. The adolescent and adult form of cobalamin C disease: clinical and molecular spectrum. J Neurol Neurosurg Psychiatry 2008;79:
5. Gerth C, Morel CF, Feigenbaum A, et al. Ocular phenotype in patients with methylmalonic aciduria and homocystinuria, cobalamin C type. J AAPOS 2008;12:
6. Van Hove JL, Van Damme-Lombaerts R, Grunewald S, et al. Cobalamin disorder Cbl-C presenting with late-onset thrombotic microangiopathy. Am J Med Genet A 2002;111:
7. Ben-Omran TI, Wong H, Blaser S, et al. Late-onset cobalamin-C disorder: a challenging diagnosis. Am J Med Genet A 2007;143A:
8. Allikmets R, Singh N, Sun H, et al. A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat Genet 1997;15:
9. Michaelides M, Chen LL, Brantley Jr. MA, et al. ABCA4 mutations and discordant ABCA4 alleles in patients and siblings with bull's-eye maculopathy. Br J Ophthalmol 2007;91:
10. Carrillo-Carrasco N, Venditti CP. Combined methylmalonic acidemia and homocystinuria, cblC type. II. Complications, pathophysiology, and outcomes. J Inherit Metab Dis 2012;35:
11. Tsai AC, Morel CF, Scharer G, et al. Late-onset combined homocystinuria and methylmalonic aciduria (cblC) and neuropsychiatric disturbance. Am J Med Genet A 2007;143A:
12. Lerner-Ellis JP, Anastasio N, Liu J, et al. Spectrum of mutations in MMACHC, allelic expression, and evidence for genotype-phenotype correlations. Hum Mutat 2009;30:
13. Morel CF, Lerner-Ellis JP, Rosenblatt DS. Combined methylmalonic aciduria and homocystinuria (cblC): phenotype-genotype correlations and ethnic-specific observations. Mol Genet Metab 2006;88:
14. Bodamer OA, Rosenblatt DS, Appel SH, et al. Adult-onset combined methylmalonic aciduria and homocystinuria (cblC). Neurology 2001;56:
15. Nogueira C, Aiello C, Cerone R, et al. Spectrum of MMACHC mutations in Italian and Portuguese patients with combined methylmalonic aciduria and homocystinuria, cblC type. Mol Genet Metab 2008;93:
16. Wang X, Sun W, Yang Y, et al. A clinical and gene analysis of late-onset combined methylmalonic aciduria and homocystinuria, cblC type, in China. J Neurol Sci 2012;318:
17. Yang Y, Muzny DM, Reid JG, et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N Engl J Med 2013;369:
18. Sloan JL, Johnston JJ, Manoli I, et al. Exome sequencing identifies ACSF3 as a cause of combined malonic and methylmalonic aciduria. Nat Genet 2011;43:
19. Bainbridge MN, Wiszniewski W, Murdock DR, et al. Whole-genome sequencing for optimized patient management. Sci Transl Med 2011;3:87re83.

5. Discussion

The advent of massively parallel sequencing has allowed us to discover many interesting, sometimes unexpected, biological outcomes associated with previously established knowledge. Phenotypic expansion, or new clinical manifestation associated with known disease genes, is being discovered at a higher rate as the number of genomes sequenced has exponentially increased (Bainbridge, Wiszniewski et al. 2011). In this study, we have identified previously unreported clinical features associated with three genes that normally give rise to phenotypes different from those we observed. We found a family with a new maculopathy caused by mutations in the CRB1 gene that has not been described among other CRB1-associated phenotypes. Instead of the generalized retinal degeneration with dysplastic retinae seen in other CRB1 cases, patients in this family exhibited a slowly progressive focal disease. In three other independent cases, we found truncating mutations in the CRX gene causing late-onset BEM, a phenotype significantly milder than the usual early-onset, severe CRX-associated retinopathies. We also found a unique case of a disorder of vitamin B12 (cobalamin) metabolism caused by mutations in MMACHC that presented with only a bull's eye macular lesion and otherwise unremarkable systemic features.

The molecular genetic reasons for the observed distinct phenotypes could be attributed to the specific combination of the alleles, to modifier genes, or to other nongenetic factors such as environmental effects. The p.P1381L variant in CRB1 shared by the 2 affected siblings in Chapter IV.2 was previously identified in 1 patient with LCA (Henderson, Mackay et al. 2011), the most severe generalized retinopathy, which leads to childhood blindness. It is possible that p.P1381L, in combination with the novel variant p.R1331C, gave rise, through mechanisms we do not know, to the milder retinopathy in this case compared with other CRB1-associated phenotypes. We investigated the potential contribution of modifier genes, especially those that have been shown to interact with the protein or belong to the same pathway(s), through querying exome sequencing data. Crb1 in mice localizes to the outer limiting membrane between the subapical surface region and adherens junction of Müller glia cells (van de Pavert, Kantardzhieva et al. 2004). We searched the exome sequences for variants in genes that encode proteins involved in the CRUMBS network, especially those that interact with CRB1 in the outer limiting membrane, such as MPP5, EPB41L5, and the MAGUK family members MPP3 and MPP4 (Gosens, den Hollander et al. 2008). No mutations that were shared between both affected siblings were found in these genes. There were rare heterozygous variants in genes known to cause retinal dystrophies that were shared by both affected siblings, such as USH2A, RPGRIP1, TOPORS, and CDHR1. Studies have shown that the USHERINE protein network that includes USH2A has a physical connection to the CRUMBS protein complex at the outer limiting membrane, while variants in RPGRIP1 have been suggested to be modifier alleles of CRB1. Recessive mutations in all these genes have been associated with LCA, RP, or CRD. While no definite evidence was found in this family that could explain the clinical findings, variants in several genes identified here could modulate this maculopathy phenotype.

Modifier effects could play a role in the mild phenotype produced by CRX in the 3 cases, as well as in the variable phenotypic expression among multiple affected members in the large family described in Chapter IV.3. The three new CRX alleles all generate

stop codons in the OTX transcription factor domain of the protein. Other truncating alleles have been reported both N- and C-terminally to the OTX domain in patients with phenotypes at the severe end of the CRX phenotypic spectrum (Huang, Xiao et al. 2012), and bull's eye maculopathy had not been associated with deleterious CRX variants until this study and another recent study published in the same year (Hull, Arno et al. 2014). That study also showed no evident association between age of onset and position or type of CRX mutation, and no clear correlation between genotype and phenotype (Hull, Arno et al. 2014). The incomplete penetrance observed in the large family could be due to variable levels of expression of the mutant allele (Tran, Zhang et al. 2014) or of the wild-type allele, a mechanism that has been described for the autosomal dominant RP gene PRPF31 (Vithana, Abu-Safieh et al. 2003). Variability in expression of the PRPF31 wild-type allele was shown to be modulated by the presence of a modifier allele in CNOT3, through the action of a common intronic SNP via a mechanism of transcriptional repression (Venturini, Rose et al. 2012). While no such modifier gene has been reported for CRX, study of RNA transcripts, and/or of the co-expressed transcription factors such as NRL and NR2E3 (Chen, Wang et al. 1997, Peng, Ahmad et al. 2005), could reveal further insights.

We identified the first case of adult methylmalonic aciduria and homocystinuria type C (cblC) with mutations in MMACHC and with ocular findings as the only presenting sign, as described in Chapter VI.5. The late-onset form of cblC associated with MMACHC is usually discovered due to neurological complaints and cognitive decline. Retinal phenotypes, including maculopathy, pigmentary retinopathy, and optic atrophy, are common in the early-onset form of the disease but rarely found in adult-onset forms

(Carrillo-Carrasco and Venditti 2012). Of the two disease-causing mutations in the Hispanic, adult female patient, the c.271dupA variant is the most common MMACHC mutation, accounting for >40% of all disease-associated alleles. It also often produces a more severe, early-onset phenotype. The c.G482A:p.Arg161Gln mutation is more often associated with the milder, late-onset phenotype (Lerner-Ellis, Tirone et al. 2006, Tsai, Morel et al. 2007, Lerner-Ellis, Anastasio et al. 2009). The exact same combination of mutations has been described before in at least 6 cases with teenage or adult onset of cblC. All cases, however, presented with severe systemic symptoms (Bodamer, Rosenblatt et al. 2001, Morel, Lerner-Ellis et al. 2006, Nogueira, Aiello et al. 2008). An early diagnosis in patients with cblC disease can be of added importance, as treatment with hydroxycobalamin may prevent, delay, or ameliorate the systemic symptoms (Carrillo-Carrasco, Chandler et al. 2012). In this case, precise molecular diagnosis was achieved through genetic studies and confirmed by follow-up biochemical laboratory tests. The patient was treated with vitamin B12, thereby possibly improving her systemic prognosis.

In conclusion, we identified novel phenotypes in three genes, CRB1, CRX, and MMACHC, through family-based analysis of exome sequencing data. Our findings clarified some of the underlying genetic causes of retinopathies with phenotypes resembling ABCA4-associated diseases, while revealing insights into the molecular mechanisms behind phenotypic heterogeneity both between and within pedigrees. These findings indicate that clinical phenotype can be very divergent from those previously described, and only genetic testing can unambiguously determine the cause of a disease. Future studies can include the promoter regions of these genes for variants that may influence the phenotype. Studies of RNA transcripts of these genes, measuring expression levels of the wild-type and mutant alleles, could also help to elucidate the mechanism of variable penetrance.

CHAPTER V
DISCOVERY OF NOVEL GENES FOR NON-SYNDROMIC AND SYNDROMIC RETINAL DISORDERS

1. Preface

The combined genetic and clinical heterogeneity in retinal disorders has in the past made gene discovery a challenging endeavor. Phenotypes resulting from mutations in new genes can be indistinguishable from those caused by known genes, which most often occurs in retinitis pigmentosa. Traditional methods for mapping genetic causality have frequently involved first screening obvious candidate genes based on the phenotype, and in the case of a negative result, screening a group of genes implicated in similar phenotypes or in the same pathway. The advent of next-generation sequencing technology has allowed the simultaneous screening of many genetic loci in a time- and cost-efficient way. This has resulted in the identification of many rare disease genes in the past few years, including those in inherited retinal diseases. In the work described in this chapter, we performed whole-exome sequencing on families with retinal disorders of unknown cause, and successfully identified three novel disease genes by combining evidence from in-silico analyses, functional characterizations, and functional evidence from prior investigations.

In the work described in Chapter V.2, two families of Spanish descent were identified as having a classic early-onset, rapidly progressing cone-rod dystrophy. CRD is often considered a severe or advanced form of Stargardt disease, and the gene most frequently mutated in autosomal recessive CRD is ABCA4, accounting for roughly one-third of all cases. Comprehensive ABCA4 screening in all patients in the two families revealed no causal mutations, suggesting the involvement of another gene. We performed whole exome sequencing followed by segregation analysis of WES-identified variants with the disease in the families. Some degree of relatedness was suspected between the parents of the affected probands, and regions with absence of heterozygosity (AOH) were detected by analyzing coding SNPs from WES data.

In the work described in Chapter V.3, a form of syndromic retinitis pigmentosa was investigated in a family presenting atypical RP with facial dysmorphologies, psychomotor developmental delays, learning disabilities and short stature. Initially, the affected siblings were thought to have non-syndromic RP, as their systemic symptoms escaped the notice of our retinal physicians. While their eye phenotype contains several common features of classical RP, peripapillary sparing around the optic disk is characteristic of phenotypes caused by ABCA4 mutations. For that reason, the proband was screened for variants in the ABCA4 gene and on an array containing most known arRP genes, revealing no disease-associated variants. To search for a novel gene for the new syndrome, we performed whole-exome sequencing in the family, followed by segregation analyses using variant data from WES.

In the work described in Chapter V.4, we investigated one family with RP and syndromic features, whose phenotypes strikingly resembled those of the family in Chapter V.3. We performed exome sequencing on all siblings, including two affected individuals and one unaffected. Since consanguinity was suspected, AOH analysis was performed using coding SNPs from WES, and the results were analyzed according to the principle of homozygosity mapping. These analyses identified a new gene that is different from the one discovered in Chapter V.3. Functional characterization of the mutation involved RT-PCR to determine the effect on splicing. While working on this family we became aware of several other cases and families segregating mutations in the same gene. Collaborators generated zebrafish and mouse models to characterize the functional effect(s) of the variants. Chapter V.4 presents a report of my contribution to the manuscript in preparation.
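The AOH analysis mentioned in this preface rests on a per-SNP B-allele frequency, that is, the ratio of variant-supporting reads to total reads at each coding SNP. The following Python sketch illustrates that idea under stated assumptions: the `Snp` record, the 0.25 to 0.75 heterozygosity window, and the `min_snps` threshold are hypothetical choices made for illustration, and a simple run-length scan stands in for the segmentation step used in the actual analyses.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical record for one coding SNP taken from a variant call file."""
    chrom: str
    pos: int
    ref_reads: int
    alt_reads: int


def b_allele_frequency(snp):
    """Variant-to-total read ratio; None when the site has no coverage."""
    total = snp.ref_reads + snp.alt_reads
    return snp.alt_reads / total if total else None


def is_heterozygous(baf, low=0.25, high=0.75):
    """Assumed window: a BAF near 0.5 is treated as a heterozygous call."""
    return low <= baf <= high


def find_aoh_regions(snps, min_snps=3):
    """Return (chrom, start, end, n_snps) for runs of consecutive
    non-heterozygous SNPs on one chromosome (a simplification)."""
    regions = []
    run = []

    def flush():
        # Keep the current run only if it contains enough SNPs.
        if len(run) >= min_snps:
            regions.append((run[0].chrom, run[0].pos, run[-1].pos, len(run)))
        run.clear()

    for snp in sorted(snps, key=lambda s: (s.chrom, s.pos)):
        baf = b_allele_frequency(snp)
        if baf is None:
            continue
        if run and run[-1].chrom != snp.chrom:
            flush()  # a run never spans two chromosomes
        if is_heterozygous(baf):
            flush()  # a heterozygous SNP ends the current run
        else:
            run.append(snp)
    flush()
    return regions
```

In practice, candidate homozygous variants falling inside such regions would be prioritized, in line with the principle of homozygosity mapping; real data would also call for minimum-depth filters and a length threshold in base pairs, which this sketch omits.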

CHAPTER V.2
New mutations in the RAB28 gene in 2 Spanish families with cone-rod dystrophy (Published Paper)

Research Original Investigation

New Mutations in the RAB28 Gene in 2 Spanish Families With Cone-Rod Dystrophy

Rosa Riveiro-Álvarez, PhD; Yajing (Angela) Xie, MA; Miguel-Ángel López-Martínez, MD; Tomasz Gambin, PhD; Raquel Pérez-Carro, MSc; Almudena Ávila-Fernández, PhD; María-Isabel López-Molina, MD; Jana Zernant, MS; Shalini Jhangiani, PhD; Donna Muzny, MS; Bo Yuan, BS; Eric Boerwinkle, PhD; Richard Gibbs, PhD; James R. Lupski, MD, PhD; Carmen Ayuso, MD, PhD; Rando Allikmets, PhD

IMPORTANCE The families evaluated in this study represent the second report of cone-rod dystrophy (CRD) cases caused by mutations in RAB28, a recently discovered gene associated with CRD.

OBJECTIVE To determine the disease-causing gene in 2 families of Spanish descent presenting with CRD who do not have ABCA4 mutations.

DESIGN, SETTING, AND PARTICIPANTS Molecular genetics and observational case studies of 2 families, each with 1 affected proband with CRD and 3 or 5 unaffected family members. The affected individual from each family received a complete ophthalmic examination including assessment of refractive errors and best-corrected visual acuity, biomicroscopy, color fundus photography, electroretinography analysis, and visual-evoked potential analysis. After complete sequencing of the ABCA4 gene with negative results, the screening for disease-causing mutations was performed by whole-exome sequencing. Possible disease-associated variants were determined by filtering based on minor allele frequency, predicted pathogenicity, and segregation analysis in all family members.

MAIN OUTCOMES AND MEASURES The appearance of the macula was evaluated by clinical examination, fundus photography, and fundus autofluorescence imaging, and visual function was assessed by electroretinography. Disease-causing mutations were assessed by sequence analyses.
RESULTS Ophthalmologic findings included markedly reduced visual acuity, bull's eye maculopathy, foveal hyperpigmentation, peripapillary atrophy, dyschromatopsia, extinguished photopic responses, and reduced scotopic responses observed on electroretinography, consistent with the CRD phenotype often associated with ABCA4 mutations. Although no ABCA4 mutations were detected in either patient, whole-exome sequencing analysis identified 2 new homozygous mutations in the recently described RAB28 gene, the c G>C splice site variant in IVS2 and the missense c.T651G:p.C217W substitution. Both variants were predicted to be deleterious by predictive programs and segregated with the disease in both families. Sequencing of 107 additional patients of Spanish descent with CRD did not reveal other cases with RAB28 mutations.

CONCLUSIONS AND RELEVANCE Deleterious mutations in RAB28 result in a classic CRD phenotype and are an infrequent cause of CRD in the Spanish population.

JAMA Ophthalmol. doi: /jamaophthalmol Published online October 30,

Author Affiliations: Author affiliations are listed at the end of this article.
Corresponding Author: Rando Allikmets, PhD, Department of Ophthalmology, Columbia University, 160 Fort Washington Ave, Room 202, New York, NY 10032 (rla22@columbia.edu).
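The filtering strategy named in the abstract (minor allele frequency, predicted pathogenicity, and segregation in the family) can be pictured as three successive tests on each variant. The Python sketch below is only an illustration under stated assumptions: the `Variant` record, the 1% frequency cutoff, the family member labels, and the placeholder gene names are all hypothetical, and the segregation test covers only the homozygous recessive model reported for these two families.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant from a family exome data set."""
    gene: str
    change: str
    maf: float                   # minor allele frequency in a reference population
    predicted_deleterious: bool  # summary call from prediction programs
    genotypes: dict              # family member -> alternate allele count (0, 1, 2)


def segregates_recessive_homozygous(variant, affected, unaffected):
    """True if every affected member is homozygous for the variant
    and no unaffected member is."""
    return (all(variant.genotypes.get(m) == 2 for m in affected)
            and all(variant.genotypes.get(m, 0) < 2 for m in unaffected))


def filter_candidates(variants, affected, unaffected, max_maf=0.01):
    """Keep rare, predicted-deleterious variants that segregate with disease.
    The 1% cutoff is an assumed value, not one stated in the text."""
    return [v for v in variants
            if v.maf <= max_maf
            and v.predicted_deleterious
            and segregates_recessive_homozygous(v, affected, unaffected)]
```

A call such as `filter_candidates(variants, ["proband"], ["mother", "father", "sibling"])` would return only variants that pass all three tests; a compound heterozygous model would need a separate, gene-level test that this sketch does not attempt.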

Autosomal recessive cone-rod dystrophies (arCRDs) represent a group of diseases involving a primary loss of cone photoreceptors resulting in severely reduced visual acuity, defects in color vision, atrophy in the macular region, and reduced cone responses observed on electroretinography (ERG). 1 Genetic causes of arCRDs vary, but most cases (30%-50%) with a known genetic basis carry homozygous or compound heterozygous mutations in the ABCA4 (MIM ) gene. 2-4 The goal of our study was to identify genes that cause phenotypes usually attributed to ABCA4 mutations in families lacking ABCA4 disease-associated alleles. The approach combines a complete sequencing of the ABCA4 gene and adjacent intronic sequences 5 or the entire ABCA4 genomic locus followed by whole-exome sequencing (WES) in cases in which no ABCA4 alleles are identified in either coding regions or introns. Recently, RAB28 (MIM ), a newly identified gene responsible for a small fraction of arCRDs, was described. 6 The RAB28 protein, encoded in 3 isoforms with alternative C termini, is a member of the Rab subfamily of the RAS-related small guanosine triphosphatases. 7,8 The RAB28 protein is localized to the basal body and the ciliary rootlet of the photoreceptors and may be involved in ciliary transport. 6 We describe 2 new, likely deleterious homozygous RAB28 mutations in 2 families of Spanish descent resulting in the CRD phenotype, suggesting that disease-associated variation in RAB28 is rare in the Spanish population.

Methods

Recruitment of Participants

The study was reviewed and approved by the ethics committee of the University Hospital Fundacion Jimenez Diaz in 2008, and it was performed according to the tenets of the Declaration of Helsinki. The participants signed a written informed consent form after the nature of procedures had been fully explained. The participants did not receive financial compensation.
The collection of samples belongs to the Biobank of the University Hospital Fundacion Jimenez Diaz. Members of 2 families, each harboring 1 affected person with CRD and compatible with a recessive mode of inheritance, were enrolled in the study in 2008 and followed up thereafter. Both families were of Spanish descent with no reported consanguinity; however, parents of 1 of the 2 families (MD-0312) were from the same small town. Family members included the affected probands, both of their parents, and their unaffected siblings (Figure 1).

Clinical Evaluation

The diagnosis of arCRD was based on initial reports of poor visual acuity, blurred central vision, impairment of color vision, and intense photophobia since childhood without a history of night blindness. In addition, funduscopic evidence of atrophic macular degeneration and peripheral pigment clumping and earlier loss of cone than rod ERG amplitude were noticed.

Figure 1. Pedigrees of the 2 Families and Segregation of the RAB28 Mutations With the Disease Phenotype. Open circles and squares represent the unaffected female and male family members, respectively; closed circles and squares represent the affected female and male patients, respectively. WT indicates wild-type. [Pedigree diagram not reproduced; individual genotype labels for the c G>C and C217W alleles are not recoverable from the transcription.]

Ophthalmologic examinations included assessment of refractive error, best-corrected visual acuity (BCVA [representative of the Snellen equivalent on a decimal scale]), biomicroscopic slitlamp examination, fundus examination, fundus autofluorescence, fluorescein angiography, chromatic sensitivity (Ishihara test), Goldmann perimetry, Ganzfeld ERG according to the International Society for Clinical Electrophysiology of Vision standards, 9,10 and visual-evoked potentials.
Table. Clinical Data From the 2 Probands With CRD and Mutations in the RAB28 Gene

Family MD-0448; Spanish ancestry
  Age at Diagnosis/Last Visit, y: 8/17
  Refractive Error at Last Visit: RE: 0.3 with 8.75 ( 1 to 50); LE: 0.2 with ( 1 to 160)
  BCVA (a): 0.2 (20/100) (stable at 17 y); 0.2 (20/100) (12 y); 0.3 (20/65) (9 y); 0.5 (20/40) (8 y)
  Observations: Biomicroscopy: normal; funduscopy: bull's eye maculopathy, peripapillary atrophy, optic pallor, thin vessels, tigroid fundus, no signs of peripheral involvement; color vision (Ishihara test): dyschromatopsia; electroretinogram, photopic response: absent; scotopic response: residual; visual-evoked potentials: pathologic, low amplitude of P100 wave, enlarged latency (implicit time); other: magna myopia, diplopia

Family MD-0312; Spanish ancestry
  Age at Diagnosis/Last Visit, y: 12/22
  Refractive Error at Last Visit: RE: 1 ( ); LE: 0.50 ( )
  BCVA (a): RE: 0.1 (20/200); LE: 0.3 (20/65) (22 y); RE: 0.3 (20/65); LE: 0.4 (20/80) (21 y); RE: 0.5 (20/40); LE: 0.5 (20/40) (20 y)
  Observations: Biomicroscopy: normal; funduscopy: loss of foveal reflex, atrophic areas in the macula, optic pallor, no flecks; fluorescein angiography: hyperfluorescent area at central and peripapillary areas, no choroidal silence; color vision (Ishihara test): dyschromatopsia (deuteranopia); visual field (Goldmann perimetry): central scotoma, development of peripheral field involvement; electroretinogram: photopic response absent, scotopic response diminished; visual-evoked potentials: diminished amplitude of P100 wave, enlarged latency (implicit time); other: intense photophobia

Abbreviations: BCVA, best-corrected visual acuity; CRD, cone-rod dystrophy; LE, left eye; NA, not available; RE, right eye.
(a) Best-corrected visual acuity is reported as the Snellen equivalent on a decimal scale (Snellen equivalent).

JAMA Ophthalmology. Published online October 30, 2014. jamaophthalmology.com

Genetic Analyses

The probands were initially screened for variants in the ABCA4 gene by ABCA4 genotyping microarray (Asper Biotech Inc; or by direct sequencing of all exons and adjacent intronic sequences, revealing no disease-associated variants. Since there are no other major CRD genes, the family underwent WES, which was performed on the affected proband, 1 unaffected parent, and 1 unaffected sibling in both families at the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC). Genomic DNA samples were constructed into paired-end precapture libraries according to the manufacturer's protocol (Multiplexing_SamplePrep_Guide_ _D; Illumina, Inc) with modifications as described in the BCM-HGSC Illumina Barcoded Paired-End Capture Library Preparation 11 protocol. Libraries were prepared using robotic workstations (Biomek NXp and FXp models; Beckman). The complete protocol and oligonucleotide sequences are accessible from the HGSC website ( /files/documents/illumina_barcoded_paired-end_capture _Library_Preparation.pdf). Precaptured libraries were pooled and hybridized in solution to the HGSC CORE design (52 megabases; Roche NimbleGen), and exome capture was performed according to the manufacturer's protocol 12 with minor revisions. Library templates were prepared for sequencing using the cBot cluster generation system (TruSeq PE Cluster Generation Kits, part No. PE ; Illumina, Inc).
Real-time analysis software was used to perform image analysis and base calling. Sequencing runs generated approximately 300 million to 400 million successful reads on each lane of a flow cell, yielding 9 to 10 gigabytes per sample. With these sequencing yields, samples achieved a mean of 90% of the targeted exome bases covered to a depth of 20 times or greater. Illumina sequence analysis was performed using the HGSC Mercury analysis pipeline ( /Mercury-Pipeline), which addresses all aspects of data processing and analyses, moving data step by step through various analysis tools from the initial sequence generation on the instrument to annotated variant calls (single-nucleotide polymorphisms [SNPs] and intraread insertions and deletions). The pathogenicity of novel variants was assessed with predictive programs for splice sites and coding sequences (Alamut, version 2.3). All variants of interest were confirmed by Sanger sequencing, and segregation analyses were performed on all members of the 2 families.

Regions with absence of heterozygosity (AOH) were detected by analyzing the B-allele frequency data obtained from coding SNPs and WES. The B-allele frequency from WES was determined by computing the ratio of variant reads to total reads for each SNP present in the variant call format file. The regions with AOH were identified using in-house scripts in the R language and the circular binary segmentation algorithm. 13 A cohort of 107 unrelated patients of Spanish descent with arCRD was screened for variants in the RAB28 gene by direct Sanger sequencing of the 7 exons, including all 3 isoforms of RAB28.

Results

Clinical Examination

The clinical features of the affected individuals of both families are summarized in the Table. The affected member of the family MD-0448 was an 18-year-old woman with CRD diagnosed in her first decade of life (at 8 years).
She presented with decreased visual acuity (BCVA, 0.5 on the Snellen-equivalent decimal scale [Snellen equivalent, 20/40]), high myopia, diplopia, and dyschromatopsia since childhood. Rapid deterioration of her vision was noticed at an early age; the BCVA progressed to 0.3 (Snellen equivalent, 20/65) at age 9 years and to 0.2 (Snellen equivalent, 20/100) at 12 years but has remained stable since her last examination at age 12. The funduscopic examination revealed a tigroid fundus, bull's eye maculopathy, thin vessels, peripapillary atrophy, and remarkable optic pallor. Her peripheral retina had no signs of abnormality during initial examinations. Photopic responses were undetectable by ERG, and scotopic responses were residual. Visual-evoked potential recorded pathologic responses, including low amplitude and enlarged latency (implicit time).

The affected member of family MD-0312 was a 30-year-old man presenting with low visual acuity, deuteranopia, and intense photophobia since diagnosis at age 12 years. At age 20, the patient's BCVA was 0.5 (Snellen equivalent, 20/40), which decreased to 0.3 OS (Snellen equivalent, 20/65) and 0.1 OD (Snellen equivalent, 20/200) in 2 years. Funduscopic examination demonstrated a loss of the foveal reflex and optic pallor. Fluorescein angiography revealed hyperfluorescent central and peripapillary areas, with no silent choroid (Figure 2). Goldmann perimetry, performed when the patient was aged 20 years, showed central scotoma developing toward the peripheral field. At his last visit at age 22, full-field ERG testing revealed no photopic responses, and both scotopic and flicker responses were reduced. Visual-evoked potential examination recorded diminished amplitude and enlargement of the latency (implicit time).

In both patients, the onset of the disease started during childhood (ages 8 and 12 years), with markedly reduced visual acuity and dyschromatopsia as the first symptoms.
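The acuity values above are given on a decimal scale with a Snellen equivalent in parentheses. As a reading aid, the relation between the two can be sketched as follows. This is a minimal illustration using the standard definitions (decimal acuity is the Snellen fraction expressed as a quotient, and logMAR is the negative base-10 logarithm of the decimal value); the function names are hypothetical and nothing here comes from the study's own methods.

```python
import math


def snellen_denominator(decimal_acuity: float, numerator: float = 20.0) -> float:
    """Denominator x of the Snellen fraction numerator/x for a decimal acuity."""
    if decimal_acuity <= 0:
        raise ValueError("decimal acuity must be positive")
    return numerator / decimal_acuity


def logmar(decimal_acuity: float) -> float:
    """logMAR value corresponding to a decimal acuity."""
    if decimal_acuity <= 0:
        raise ValueError("decimal acuity must be positive")
    return -math.log10(decimal_acuity)


# Decimal acuities that appear in the clinical descriptions
for acuity in (0.5, 0.2, 0.1):
    denominator = snellen_denominator(acuity)
    print(f"decimal {acuity} = 20/{denominator:.0f}, logMAR {logmar(acuity):.2f}")
```

A decimal value of 0.3 works out to roughly 20/67 under this definition; the 20/65 quoted in the text presumably reflects rounding to a conventional chart line, although that reading is an assumption.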
The proband from the MD-0312 family reported intense photophobia as one of the worst initial clinical signs. However, this patient was not affected by high myopia, which was a common feature among previously described patients with mutations in the RAB28 gene. 6 The proband from the MD-0448 family presented with both high myopia and diplopia.

Figure 2. Clinical Presentation of the Disease in the Patient From Family MD-0312. A and B, Funduscopy revealed normal retinal papilla and parenchyma, with normal vessels. A, Left eye; note changes at the foveal and perifoveal retinal pigment epithelium, including yellowish deposits, together with both nasal and inferior peripapillary atrophy. B, Right eye; note similar changes with the exception of the peripapillary atrophy, which was observed only at the nasal region. C (left eye) and D (right eye), Fluorescein angiography demonstrated foveal hyperfluorescent dots and peripapillary hyperfluorescence in both eyes. D, A higher extension of the affected area was observed in the right eye.

No signs of peripheral involvement were initially observed in either patient by funduscopic examination. The patients did not report night blindness despite the reduced scotopic responses. However, peripheral visual field involvement was observed in subsequent ophthalmologic examinations (Table). In summary, both patients presented with a remarkably similar CRD phenotype despite differences in the molecular genetic findings (splice-site mutation vs missense variant).

Genetic Analyses

Considering the apparent autosomal recessive inheritance of CRD, the observed WES variants were filtered by all of the following criteria: (1) variants were present at a frequency of less than 0.5% in ESP6500, 1000 Genomes, 14 and ARIC, an internal control database of 6250 exomes at Baylor Genome Center; (2) variants were in protein coding regions, in canonical splice sites, or both; (3) variants were missense, nonsense, frameshift, or splice site; and (4) variants were compound heterozygous or homozygous in the affected proband, heterozygous in a parent, and either heterozygous or absent in unaffected siblings. These analyses identified likely disease-associated homozygous variants in RAB28 in both families (Figure 1).
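The four filtering criteria listed above can be sketched in code. The following Python sketch is an illustration only and is not the pipeline used in the study: the `Variant` record, its field names, the genotype encoding (0, 1, or 2 alternate alleles), and the example records are all hypothetical. It covers only the homozygous case of criterion 4; compound heterozygous candidates would additionally require pairing two variants within the same gene.

```python
from dataclasses import dataclass

# Values taken from the criteria in the text; the names are illustrative.
MAX_CONTROL_FREQ = 0.005  # "less than 0.5%" in each control database
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str
    consequence: str         # e.g. "missense", "synonymous"
    max_control_freq: float  # highest frequency seen in any control database
    proband: int             # alternate-allele count: 0, 1, or 2
    parent: int
    sibling: int


def passes_recessive_filter(v: Variant) -> bool:
    """Return True if a variant fits the homozygous recessive pattern."""
    rare = v.max_control_freq < MAX_CONTROL_FREQ  # criterion 1
    damaging = v.consequence in DAMAGING          # criteria 2 and 3
    # Criterion 4, homozygous case: proband homozygous, parent a carrier,
    # unaffected sibling heterozygous or lacking the variant.
    segregates = v.proband == 2 and v.parent == 1 and v.sibling in (0, 1)
    return rare and damaging and segregates


# Made-up records for illustration only
variants = [
    Variant("GENE_A", "missense", 0.0, 2, 1, 1),
    Variant("GENE_B", "missense", 0.02, 2, 1, 0),   # too common in controls
    Variant("GENE_C", "synonymous", 0.0, 2, 1, 0),  # not a damaging class
    Variant("GENE_D", "nonsense", 0.001, 2, 1, 2),  # unaffected sibling homozygous
]
candidates = [v.gene for v in variants if passes_recessive_filter(v)]
print(candidates)
```

Keeping each criterion as a separate boolean makes it easy to report how many variants survive each step, which is how such filters are usually audited.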
In MD-0312, the homozygous c G>C variant is located in intron 2 in the splice donor site of exon 2. The change is predicted to eliminate the splice donor site, resulting in skipping of the second exon and, consequently, by conceptual translation, a nonfunctional protein. In family MD-0448, the c.T651G substitution results in a missense variant, p.C217W, in exon 8 of RAB28. There is a large physicochemical difference between cysteine and tryptophan (Grantham 15 distance, 215), and the variant is predicted to be deleterious by all predictive programs used: SIFT (score, 0.00), 16 PolyPhen2 (score, 1.00), 17 and MutationTaster (probability value, 1.00). 18 The amino acid Cys217 is highly conserved up to Caenorhabditis elegans. Neither variant is present in 1000 Genomes, in the Exome Variant Server, or in the 6520 Baylor in-house database of control exomes.

The WES and coding SNP data confirmed that the RAB28 mutation in family MD-0312 is located within a large, approximately 20-kb block of AOH (Figure 3), suggesting that the parents of the affected proband are distantly related. Considering that both parents are from the same small town and that the variant is very rare in the general population, the causal c G>C mutation probably arose relatively recently in the local population. The AOH was much smaller, approximately 1 kb, around the RAB28 locus in family MD-0448 (Figure 3).

The RAB28 gene encodes 3 isoforms with alternative C termini. 7 The c G>C variant affects all 3 isoforms, and the p.C217W variant is in exon 8, which is present only in isoform 2 (RAB28L [RefSeq NM_ ]). Sanger sequencing of all coding exons of RAB28 in 107 unrelated Spanish patients with CRD did not identify additional mutations. No pathogenic variants were identified in the genes known to be associated with the arCRD phenotype (ABCA4, ADAM9, C8orf37, CERKL, EYS, RPGRIP1, and TULP1).
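The AOH blocks reported above were derived from B-allele frequencies, which the Methods define as the ratio of variant reads to total reads at each SNP in the variant call file. A minimal Python sketch of that ratio follows. The study segmented the data with circular binary segmentation in R; the sketch deliberately substitutes a much simpler scan for runs of consecutive non-heterozygous SNPs, so it should be read as an illustration of the idea rather than of the authors' scripts. The heterozygosity cutoffs, the minimum run length, and the example read counts are assumptions.

```python
from typing import List, Tuple


def b_allele_frequency(variant_reads: int, total_reads: int) -> float:
    """Ratio of variant reads to total reads at one SNP."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def is_heterozygous(baf: float, low: float = 0.2, high: float = 0.8) -> bool:
    """Assumed cutoffs: a BAF near 0.5 is treated as a heterozygous call."""
    return low < baf < high


def aoh_runs(
    snps: List[Tuple[int, int, int]], min_snps: int = 3
) -> List[Tuple[int, int]]:
    """Find runs of consecutive non-heterozygous SNPs.

    snps holds (position, variant_reads, total_reads) sorted by position.
    Returns (start, end) positions of runs with at least min_snps SNPs.
    """
    runs: List[Tuple[int, int]] = []
    current: List[int] = []
    for position, variant_reads, total_reads in snps:
        if is_heterozygous(b_allele_frequency(variant_reads, total_reads)):
            # A heterozygous SNP ends the current run
            if len(current) >= min_snps:
                runs.append((current[0], current[-1]))
            current = []
        else:
            current.append(position)
    if len(current) >= min_snps:
        runs.append((current[0], current[-1]))
    return runs


# Made-up read counts for illustration only
example = [
    (100, 20, 40),
    (200, 40, 40),
    (300, 38, 40),
    (400, 40, 40),
    (500, 19, 40),
    (600, 40, 40),
]
print(aoh_runs(example))
```

A run-length scan like this is sensitive to single miscalled genotypes, which is one reason a segmentation method that models noise, such as the one cited in the text, is preferable on real data.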
After the multimodal analysis under the hypothesis of autosomal recessive inheritance, no other plausible candidate gene remained in either family.

Figure 3. B-Allele Frequencies Generated From Whole-Exome Sequencing Data for Both Affected Patients From Families MD-0312 and MD-0448. Each panel plots variant/total reads against chromosomal position (base pairs), with the AOH region and the RAB28 gene marked. Chromosome 4 of both probands is shown. For each proband, the top bar is the entire chromosome 4, and the bottom bar shows the magnified region around the causal mutation in RAB28 (marked by red vertical lines). A large absence of heterozygosity (AOH) block (gray area) of approximately 20 kb was identified in the proband from family MD-0312; a much smaller AOH block of approximately 1 kb was seen in the proband from family MD-0448.

Discussion

The advent of next-generation sequencing has allowed the identification of many rare disease genes, including genes for retinal dystrophies. 6,19,20 Most often, the phenotype resulting from mutations in these genes is indistinguishable from that caused by known genes, therefore offering few clues for gene discovery. Cone-rod dystrophies are a good example of a group of diseases in which causal genes have many different functions but mutations in them cause a very similar phenotype. 21 The gene most frequently mutated in arCRD is ABCA4, the gene for recessive Stargardt disease 22 ; approximately one-third of all arCRD cases are caused by pathogenic ABCA4 variants. 4,21 Cone-rod dystrophy is often considered a severe or advanced form of Stargardt disease and is usually associated with deleterious mutations in ABCA4. 4,21

Screening for genetic causality is often performed in several steps, starting with the most obvious candidate gene based on the phenotype and followed, in case of a negative result, by screening of a group of genes implicated in similar phenotypes or of all genes known to cause retinal phenotypes, such as the entire RetNet panel. However, because WES is now extremely robust and affordable, it is becoming more labor- and cost-efficient to prefer WES to other means of screening, especially in large families with recessive diseases in whom comprehensive segregation analysis can be performed.

The present study exemplifies this approach. We first screened ABCA4 in all patients presenting with phenotypes compatible with ABCA4 defects and, in case of negative results, performed WES on all or most available family members. The affected members of the 2 families of Spanish descent described here were identified as having classic early-onset, rapidly progressing CRD, which offered no specific clue as to the genetic causality after the ABCA4 gene was ruled out. Evaluation using WES identified 2 new deleterious mutations in RAB28, a very recently described CRD gene. 6 Screening of 107 unrelated individuals with CRD did not identify any more Spanish patients with RAB28 mutations. Roosing and colleagues 6 screened more than 600 patients with CRD and cone dystrophy and assessed the sequence data on more than 400 additional individuals with CRD, Leber congenital amaurosis, and retinitis pigmentosa that are available at the European Retinal Disease Consortium; this screening resulted in identification of only 2 families segregating RAB28 mutations with a CRD phenotype. With our data and those of Roosing and colleagues, we can conclude that CRD-associated mutations in RAB28 are very rare and that they are exclusively or at least predominantly deleterious.

The clinical phenotype of the 2 patients in the present study has extensive similarities with those described by Roosing et al, 6 including high myopia in 1 of the 2 cases, foveal hyperpigmentation, bull's eye maculopathy, defects in color vision, extinguished photopic responses on ERG, and peripapillary atrophy. The accumulated clinical and genetic data allow us to also conclude that RAB28 mutations cause only a CRD phenotype and are not associated with related diseases, such as Stargardt disease or retinitis pigmentosa.
Expression analysis presented by Roosing and colleagues 6 (as shown in Figure S1 in their article) suggests that all RAB28 isoforms, including isoform 2 (also called RAB28L), in which the mutation was detected in family MD-0448, are expressed in various human tissues including, most importantly for associated eye disease traits, retinal pigment epithelium and photoreceptors. The functional role of RAB28 and its isoforms in photoreceptors and retinal pigment epithelium is unknown, but it has been suggested 6 that the localization of RAB28 to the basal body and ciliary rootlet plays a role in ciliary transport. Three other proteins (ie, C8orf37, RPGRIP1, and TULP1), with mutations causing CRD, are located in the connecting cilia of photoreceptor cells.

Conclusions

Whole-exome sequencing identified new rare, deleterious mutations in RAB28 in 2 families of Spanish descent. Whole-exome sequencing is a powerful tool for identification of rare genes in which disease-associated variants can convey a shared clinical phenotype and is especially effective in family-based analyses.
ARTICLE INFORMATION

Submitted for Publication: April 8, 2014; final revision received July 25, 2014; accepted July 26, 2014.

Published Online: October 30, 2014. doi: /jamaophthalmol

Author Affiliations: Department of Genetics, Instituto de Investigacion Sanitaria-University Hospital Fundacion Jimenez Diaz, Madrid, Spain (Riveiro-Álvarez, López-Martínez, Pérez-Carro, Ávila-Fernández, Ayuso); Centro de Investigacion Biomedica en Red de Enfermedades Raras, Instituto de Salud Carlos III, Madrid, Spain (Riveiro-Álvarez, López-Martínez, Pérez-Carro, Ávila-Fernández, López-Molina, Ayuso); Department of Ophthalmology, Columbia University, New York, New York (Xie, Zernant, Allikmets); Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas (Gambin, Jhangiani, Yuan, Gibbs, Lupski); Department of Ophthalmology, University Hospital Fundacion Jimenez Diaz, Madrid, Spain (López-Molina); Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas (Muzny, Boerwinkle, Gibbs, Lupski); Department of Pathology and Cell Biology, Columbia University, New York, New York (Allikmets).

Author Contributions: Dr Riveiro-Álvarez and Ms Xie contributed equally to this study. Drs Ayuso and Allikmets had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Riveiro-Álvarez, López-Martínez, Ayuso, Allikmets.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Riveiro-Álvarez, Xie, Pérez-Carro, Ávila-Fernández, López-Molina, Zernant, Allikmets.
Critical revision of the manuscript for important intellectual content: Riveiro-Álvarez, Xie, López-Martínez, Gambin, Jhangiani, Muzny, Yuan, Boerwinkle, Gibbs, Lupski, Ayuso, Allikmets.
Statistical analysis: Xie, Gambin.
Obtained funding: Boerwinkle, Ayuso, Allikmets.
Administrative, technical, or material support: Riveiro-Álvarez, Ávila-Fernández, López-Molina, Zernant, Jhangiani, Muzny, Boerwinkle, Gibbs, Lupski, Allikmets.
Study supervision: Riveiro-Álvarez, Muzny, Gibbs, Ayuso, Allikmets.

Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.

Funding/Support: This work was supported in part by National Institutes of Health grants EY021163, EY019861, HG006542, and EY (Core Support for Vision Research); by research grants FIS PI13/00226, FIS PS09/00459, and RD (Retics Biobank); CIBERER Intra/07/704.1 and Intra/09/702.1; ONCE; Fundaluce and Fundacion Conchita Rabago de Jimenez Diaz; and unrestricted funds from Research to Prevent Blindness to the Department of Ophthalmology, Columbia University.

Role of the Funder/Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

REFERENCES

1. Hamel CP. Cone rod dystrophies. Orphanet J Rare Dis. 2007;2:7.
2. Cremers FP, van de Pol DJ, van Driel M, et al. Autosomal recessive retinitis pigmentosa and cone-rod dystrophy caused by splice site mutations in the Stargardt's disease gene ABCR. Hum Mol Genet. 1998;7(3):
3. Maugeri A, Klevering BJ, Rohrschneider K, et al. Mutations in the ABCA4 (ABCR) gene are the major cause of autosomal recessive cone-rod dystrophy. Am J Hum Genet. 2000;67(4):
4. Riveiro-Álvarez R, López-Martínez MA, Zernant J, et al. Outcome of ABCA4 disease-associated alleles in autosomal recessive retinal dystrophies: retrospective analysis in 420 Spanish families. Ophthalmology. 2013;120(11):
5. Zernant J, Schubert C, Im KM, et al. Analysis of the ABCA4 gene by next-generation sequencing. Invest Ophthalmol Vis Sci. 2011;52(11):
6. Roosing S, Rohrschneider K, Beryozkin A, et al; European Retinal Disease Consortium. Mutations in RAB28, encoding a farnesylated small GTPase, are associated with autosomal-recessive cone-rod dystrophy. Am J Hum Genet. 2013;93(1):
7. Brauers A, Schürmann A, Massmann S, et al. Alternative mRNA splicing of the novel GTPase Rab28 generates isoforms with different C-termini. Eur J Biochem. 1996;237(3):
8. Touchot N, Chardin P, Tavitian A. Four additional members of the ras gene superfamily isolated by an oligonucleotide strategy: molecular cloning of YPT-related cDNAs from a rat brain library. Proc Natl Acad Sci U S A. 1987;84(23):
9. Marmor MF, Brigell MG, McCulloch DL, Westall CA, Bach M; International Society for Clinical Electrophysiology of Vision. ISCEV standard for clinical electro-oculography (2010 update). Doc Ophthalmol. 2011;122(1):
10. Marmor MF, Holder GE, Seeliger MW, Yamamoto S; International Society for Clinical Electrophysiology of Vision. Standard for clinical electroretinography (2004 update). Doc Ophthalmol. 2004;108(2):
11. Lupski JR, Gonzaga-Jauregui C, Yang Y, et al. Exome sequencing resolves apparent incidental findings and reveals further complexity of SH3TC2 variant alleles causing Charcot-Marie-Tooth neuropathy. Genome Med. 2013;5(6):
12. NimbleGen SeqCap EZ Library LR user's guide. Roche. / _SeqCapEZLibraryLR_Guide_v2p0.pdf. Accessed September 19,
13. Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004;5(4):
14. Abecasis GR, Altshuler D, Auton A, et al; 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319):
15. Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185(4154):
16. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):
17. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):
18. Schwarz JM, Rödelsperger C, Schuelke M, Seelow D. MutationTaster evaluates disease-causing potential of sequence alterations. Nat Methods. 2010;7(8):
19. Ozgül RK, Siemiatkowska AM, Yücel D, et al; European Retinal Disease Consortium. Exome sequencing and cis-regulatory mapping identify mutations in MAK, a gene encoding a regulator of ciliary length, as a cause of retinitis pigmentosa. Am J Hum Genet. 2011;89(2):
20. Chiang PW, Wang J, Chen Y, et al. Exome sequencing identifies NMNAT1 mutations as a cause of Leber congenital amaurosis. Nat Genet. 2012;44(9):
21. Thiadens AAHJ, Phan TML, Zekveld-Vroon RC, et al; Writing Committee for the Cone Disorders Study Group Consortium. Clinical course, genetic etiology, and visual outcome in cone and cone-rod dystrophy. Ophthalmology. 2012;119(4):
22. Allikmets R, Singh N, Sun H, et al. A photoreceptor cell specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat Genet. 1997;15(3):

CHAPTER V.3 New syndrome with retinitis pigmentosa is caused by nonsense mutations in retinol dehydrogenase RDH11 (Published Paper)

Human Molecular Genetics, 2014, Vol. 23, No. 21
doi: /hmg/ddu291
Advance Access published on June 10, 2014

New syndrome with retinitis pigmentosa is caused by nonsense mutations in retinol dehydrogenase RDH11

Yajing (Angela) Xie 1, Winston Lee 1, Carolyn Cai 1, Tomasz Gambin 3, Kalev Nõupuu 1, Tharikarn Sujirakul 1, Carmen Ayuso 5,6, Shalini Jhangiani 3, Donna Muzny 4, Eric Boerwinkle 3, Richard Gibbs 3,4, Vivienne C. Greenstein 1, James R. Lupski 3,4, Stephen H. Tsang 1,2 and Rando Allikmets 1,2

1 Department of Ophthalmology, 2 Department of Pathology and Cell Biology, Columbia University, New York, NY 10032, USA, 3 Department of Molecular and Human Genetics, 4 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA, 5 Department of Genetics, Instituto de Investigacion Sanitaria-University Hospital Fundacion Jimenez Diaz (IIS-FJD), Madrid, Spain and 6 Centro de Investigacion Biomedica en Red (CIBER) de Enfermedades Raras, ISCIII, Madrid, Spain

To whom correspondence should be addressed at: Columbia University, Eye Institute Research, Rm. 202, 160 Fort Washington Avenue, New York, NY 10032, USA. Tel: ; Fax: ; Email: rla22@columbia.edu

© The Author. Published by Oxford University Press. All rights reserved. For Permissions, please email journals.permissions@oup.com

Received April 25, 2014; Revised and Accepted June 6, 2014

Retinitis pigmentosa (RP), a genetically heterogeneous group of retinopathies that occur in both non-syndromic and syndromic forms, is caused by mutations in approximately 100 genes. Although recent advances in next-generation sequencing have aided in the discovery of novel RP genes, a number of the underlying contributing genes and loci remain to be identified. We investigated three siblings, born to asymptomatic parents of Italian American descent, who each presented with atypical RP with systemic features, including facial dysmorphologies, psychomotor developmental delays recognized since early childhood, learning disabilities and short stature. RP-associated ophthalmological findings included salt-and-pepper retinopathy, attenuation of the arterioles and generalized rod-cone dysfunction as determined by almost extinguished electroretinogram in 2 of 3 siblings. Features atypical for RP included mottled macula at an early age and peripapillary sparing of the retinal pigment epithelium. Whole-exome sequencing data, queried under a recessive model of inheritance, identified compound heterozygous stop mutations, c.C199T:p.R67* and c.C322T:p.R108*, in the retinol dehydrogenase 11 (RDH11) gene, resulting in a non-functional protein, in all affected children. In summary, deleterious mutations in RDH11, an important enzyme for vision-related and systemic retinoic acid metabolism, cause a new syndrome with RP.

INTRODUCTION

Retinitis pigmentosa (RP) is a genetically and clinically heterogeneous group of inherited retinopathies (1). It includes both non-syndromic and syndromic forms with all modes of inheritance (autosomal recessive, autosomal dominant and X-linked) and is caused by mutations in close to 100 genes. By current estimates, known genes explain between 60 and 80% of RP cases, suggesting that many RP loci remain to be found (1). Syndromic forms of RP present with many heterogeneous phenotypes, the two best characterized of which are Usher syndrome, caused by mutations in 12 genes (2), and Bardet-Biedl syndrome (BBS), which is caused by mutations in at least 17 genes ( retnet/) (3,4). The most prominent systemic phenotypes of Usher syndrome include congenital or early-onset hearing loss (5), whereas BBS often presents with obesity, developmental delay and polydactyly (6,7). Other forms of syndromic RP include those associated with mitochondrial diseases (Kearns-Sayre, Wolfram syndromes, etc.) (8,9) and some forms of renal or neurodegenerative phenotypes (Joubert, Jeune syndromes, etc.; sph.uth.edu/retnet/) (10-12). The presence of specific systemic phenotypic features in patients varies, and often mutations in the same gene can cause different phenotypes (allelic affinity), sometimes classified as separate diseases (13), or the same syndrome can be caused by mutations in different genes (locus heterogeneity) (14).

Recent advances in next-generation sequencing (NGS), a high-throughput, deep sequencing technology, have enabled the identification of several novel RP genes (15-17) and of new mutations in known genes; nevertheless, a substantial fraction of cases remains unsolved. Here, we investigated the clinical phenotype and the genetic cause of RP with syndromic features in a family in which all three children presented with atypical RP and a unique combination of systemic features potentially describing a novel syndrome.

RESULTS

Physical evaluation

A family with three affected children of Italian American descent (Fig. 1) presented to the clinic for evaluation of retinal degeneration. The oldest sibling is a 19-year-old woman (Case 1), the middle sibling a 17-year-old boy (Case 2) and the youngest an 8-year-old boy (Case 3). Ophthalmic examinations in both parents were unremarkable, and they did not exhibit any physical or systemic abnormalities. The siblings were born at term after uneventful pregnancies but had psychomotor developmental delays since early childhood. At presentation, the main complaints were lack of fine motor skills and coordination (writing, drawing), for which they are receiving physical and occupational therapy. Each sibling is enrolled in special educational programs due to learning difficulties.

The three affected siblings share a striking resemblance with distinctive facial features (Fig. 2). Each appears to be abnormally short in stature relative to their age group; the oldest and middle siblings were measured to be 57 and 61 inches tall, respectively, and the youngest sibling 47 inches tall. The two older siblings fall below the 5th percentile for stature within their respective age and gender groups (Fig. 2C and D). The youngest sibling is just above the 5th percentile (Fig. 2D). The older siblings consistently exhibited short stature throughout development until early puberty (9-10 years) and then again around the early teenage years. The youngest sibling appears to be following a similar trend (Fig. 2C and D). Skeletal X-rays of the hands from an earlier age in Cases 1 and 2 were reviewed, with no suggestive evidence for skeletal dysplasias (e.g. brachydactyly).

Figure 1. Pedigree of the family and segregation of the RDH11 mutations with the disease phenotype. Open circles and squares represent the unaffected female and male family members, respectively; closed circles and squares represent the affected female and male patients.

Excessive dental spacing and malocclusion were observed in Case 3; according to their reported health history, these also occurred in Cases 1 and 2 but were corrected with dental braces. All exhibited apparent facial dysmorphologies, which included distinct formation of the nose with hypoplasia of the alae nasi (Fig. 2A). Cases 2 and 3 exhibited malar hypoplasia, whereas Cases 1 and 2 had attached ear lobes. The palpebral fissures in each sibling appeared slightly upslanted. An extensive review of pediatric health records from birth for each sibling did not reveal any additional significant systemic abnormalities.

Ophthalmic evaluation

A progressive decrease in visual acuity and reported difficulties with night vision developed at the age of 10 years in Case 1 and at the age of 8 years in Cases 2 and 3, and juvenile cataracts were diagnosed and operated on within the same year. After cataract surgery, night vision became progressively worse and RP was diagnosed at the age of 16 years (Case 1), at the age of 13 years (Case 2) and at the age of 8 years (Case 3) according to the fundus appearance and electroretinogram (ERG) findings (Fig. 3). The oldest sibling presented at a later disease stage.
The best-corrected visual acuity (BCVA) was 20/20 RE and 20/20 LE. Fundus examination revealed attenuation and narrowing of the retinal arterioles in both eyes (Fig. 3A). The retina exhibited a mottled, pigmented appearance consistent with localized retinal pigment epithelium (RPE) atrophy, but relatively spared fovea. Particular areas of the mid-periphery had an abnormal, grayish sheen. A confluent pattern of bone-spicule pigmentation was observed in the periphery (Fig. 3B). Autofluorescence imaging revealed the presence of an amorphous hyperfluorescent ring beyond the peripheral vascular arcades with sparing of RPE in the peripapillary region of the optic nerve (Fig. 3C). Spectral domain-optical coherence tomography (SD-OCT) cross sections revealed intact retinal lamination in the central macula proceeding to an abrupt absence or disruption of outer layers in the parafoveal and peripheral retina (Fig. 3D). The parafoveal outer retinal layers were progressively atrophic or absent, resulting in a loss of laminar architecture of the retina with increasing spatial eccentricity. Such affected areas also exhibited decreased retinal thickening in the mid- and far-peripheral retina.

Cases 2 and 3 presented at a relatively milder stage on fundus examination. BCVA in Case 2 was 20/30 RE and 20/25 LE, and in Case 3 was 20/25 in both eyes. Optic discs had a pinkish waxy appearance with distinct borders. The retina appeared mottled and was marked with diffuse pigment clumping without bone-spicule pigmentation in the periphery. A similar but more uniformly symmetric hyperfluorescent ring just beyond the superior and inferior vascular arcades and around the optic disc was seen in autofluorescence imaging. SD-OCT scans through the macula revealed relatively spared foveae but abnormal outer retinal lamination in the parafoveal area.
Areas of pigment clumping seen in fundus examination spatially correspond to hyperreflective deposits which bulged up anteriorly to an otherwise undisrupted photoreceptor layer. The external limiting membrane (ELM) underlying these lesions appeared to be thinner.

Figure 2. Clinical presentation of the syndromic features in affected family members. Affected siblings shared distinct craniofacial and physical dysmorphologies. (A) Case 2, middle sibling, presented with prominent alae nasi and malar hypoplasia; the oldest and youngest sibling also exhibited these features in a less pronounced manner. (B) The body frame of each sibling suggests irregular physical development according to their respective ages. Growth curves (C, Case 1 and D, Cases 2 and 3) documenting stature from age 2 to the present illustrate a consistent pattern of abnormal height, falling below the 5th percentile of the age- and gender-matched reference population.

Figure 3. Ophthalmic examination of affected family members consistent with an early-onset retinal dystrophy. (A) Funduscopy of Case 1 revealed an abnormal, mottled appearance of the retina in addition to narrowing and attenuation of the retinal vessels (white arrows). (B) A more peripheral examination of the same eye revealed a confluent pattern of bone-spicule pigment deposition (white arrows). (C) Autofluorescence imaging of the same eye uncovered a large amorphous ring (white arrows) situated just beyond the vascular arcades surrounding an area of granular RPE atrophy with sparing of the fovea and regions surrounding the optic nerve. (D) SD-OCT scans revealed relatively intact retina layers in areas immediately adjacent to the fovea, which proceed to an abrupt loss (white arrow) of the ELM, inner segment ellipsoid layer (ISe) and thinning of the RPE with increasing eccentricity from the fovea (inset). (E) Full-field ERG results in each sibling were compared with an age-matched control (dotted gray line). Case 1 (solid red line) exhibited generalized retinal dysfunction with severe waveform reductions and delays in both the scotopic and photopic systems. Case 2 (solid green line) and Case 3 (solid blue line) appeared to be less affected, with mild delays in the 30 Hz flicker response and amplitudinal reductions in scotopic function.

Full-field ERG conducted in each affected sibling showed changes mostly consistent with their retinal pathology (Fig. 3E). Severe amplitude reductions and delays of the a- and b-waves in both the rod (scotopic) and cone (photopic) system were apparent, especially in the oldest sibling. The scotopic system was clearly more affected than photopic, consistent with the RP phenotype (Fig. 3E). The 30-2 Humphrey visual field for Case 1 was constricted (Supplementary Material, Fig. S1), consistent with the RP phenotype. There was a marked decline in visual sensitivity with increasing eccentricity from the fovea. Foveal sensitivity was slightly decreased compared with values for age-similar healthy observers.
The two-color dark-adapted threshold results also showed decreased sensitivity with increasing eccentricity to both chromatic stimuli compared with the mean value for healthy observers, and at 15° nasal field 650 nm stimuli at the maximum intensity were not detected (Supplementary Material, Fig. S1). The differences in sensitivity of 2 log units between the 500 and 650 nm test lights along the horizontal meridian indicate rod mediation.

Genetic analyses

Considering the apparent autosomal recessive inheritance of the disease (Fig. 1), the variants detected by NGS in affected children and parents were selected for further analysis when they met all of the following criteria: (1) variants were present in <0.5% in ESP6500 (last accessed 17 April 2014), 1000 Genomes (18) and the Atherosclerosis Risk in Communities study (ARIC), an internal control database of 3996 exomes at the Baylor College of Medicine Human Genome Sequencing Center, (2) variants were in protein coding regions and/or in canonical splice sites, (3) variants were either missense, nonsense, frameshift or splice site variants and (4) variants were compound heterozygous or homozygous in the same gene in all affected children. The filtering strategy and numbers of identified variants and genes at each step are provided in Supplementary Material, Table S1. After assessing the phase of remaining variants using the sequence data of the parents and the segregation analyses in the entire family, only two genes, dual specificity (all-trans- and 11-cis-) retinol dehydrogenase (RDH11) (19) and cadherin 23 (CDH23) (20), remained as possible candidates. The CDH23 gene encodes a Ca2+-dependent cell-cell adhesion glycoprotein that is involved in stereocilia organization and hair bundle formation (21). Mutations in CDH23 are causal for two allelic recessive disorders: Usher syndrome, Type 1D (USH1D), presenting with congenital deafness with a variable degree of retinal degeneration, and non-syndromic autosomal recessive deafness 12 (DFNB12), presenting with high-frequency progressive sensorineural hearing loss with normal retinal and vestibular function. Specifically, missense mutations in CDH23 mostly cause DFNB12, whereas nonsense, frameshift, splice site and some more severe missense mutations of CDH23 cause USH1D (22,23). All affected children in the family were compound heterozygous for two rare missense variants, c.G574C:p.E192Q and c.T9185C:p.M3062T. Predictive programs suggested that the c.G574C:p.E192Q variant is possibly disease associated and that c.T9185C:p.M3062T is likely benign.
This information, coupled with the normal hearing in all siblings, suggested that the CDH23 gene is not associated with the RP and syndromic phenotype in the family. This left RDH11 as the only plausible candidate gene for the disease phenotype segregating with the two variants. The two RDH11 variants shared by the affected siblings, c.C199T:p.R67* and c.C322T:p.R108*, are predicted to result either in a truncated protein and, consequently, an inactive enzyme due to a severed di-nucleotide-binding domain which is necessary for catalysis (19), or in a null allele due to nonsense-mediated decay. Neither variant is present in the 1000 Genomes database, in the Exome Sequencing Project (ESP) database, or in the ARIC database of 3996 control exomes at the Baylor Genome Center, where both specific nucleotide positions were well covered in all control exomes. In fact, truncating mutations (nonsense or frameshift alleles) in RDH11 are extremely rare, since only one deleterious allele has been described in the ESP database of 6500 exomes and none was present in 3996 ARIC and 2940 BH-CMG exomes in the Baylor Human Genome Sequencing Center database. Moreover, we did not find any bi-allelic events in either dataset at the Baylor Genome Center, which further confirms that disease-associated variants in RDH11 are exceptionally rare. RDH11 encodes a dual specificity retinol dehydrogenase which is ubiquitously expressed and whose expression is hormonally regulated (e.g. by androgens) (19,24). In the eye, RDH11 has an oxidoreductive function in the visual cycle (19). In mouse eyes, the protein is expressed in the RPE and Müller cells (25). In the RPE, RDH11 is proposed to aid RDH5 in the regeneration of chromophore by oxidizing 11-cis-retinol to 11-cis-retinal (26,27). Rdh11−/− mice have a mild phenotype, with normal retinal morphology and ERGs under baseline conditions and normal retinoid profiles. The only RP-related defect in Rdh11−/− mice is a delayed dark adaptation (28).
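The four filtering criteria listed under Genetic analyses can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Variant` record, its field names and the toy gene labels are hypothetical, and the two-heterozygous-variants test only flags candidate compound heterozygosity. Phase still has to be confirmed from parental data, as was done in the study.

```python
from dataclasses import dataclass

# Hypothetical record layout; field names are illustrative only.
@dataclass
class Variant:
    gene: str
    name: str
    effect: str            # "missense", "nonsense", "frameshift", "splice_site", ...
    max_control_af: float  # highest allele frequency seen in any control database
    genotypes: dict        # sample id -> number of alternate alleles (0, 1 or 2)

RARE_AF = 0.005  # criterion (1): present in <0.5% of controls
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}  # criteria (2)-(3)

def passes_site_filters(v):
    """Rare in controls and of a potentially protein-altering class."""
    return v.max_control_af < RARE_AF and v.effect in DAMAGING

def candidate_genes(variants, affected):
    """Criterion (4): every affected child is homozygous for one retained
    variant, or carries at least two retained heterozygous variants, in the gene."""
    by_gene = {}
    for v in variants:
        if passes_site_filters(v):
            by_gene.setdefault(v.gene, []).append(v)
    hits = []
    for gene, vs in by_gene.items():
        def fits(sample):
            hom = any(v.genotypes.get(sample, 0) == 2 for v in vs)
            hets = sum(1 for v in vs if v.genotypes.get(sample, 0) == 1)
            return hom or hets >= 2  # phase must still be checked in the parents
        if all(fits(s) for s in affected):
            hits.append(gene)
    return sorted(hits)

# Toy data, invented for illustration only.
AFFECTED = ["child1", "child2"]
TOY = [
    Variant("GENE_A", "v1", "nonsense", 0.0, {"child1": 1, "child2": 1}),
    Variant("GENE_A", "v2", "nonsense", 0.0, {"child1": 1, "child2": 1}),
    Variant("GENE_B", "v3", "missense", 0.02, {"child1": 2, "child2": 2}),
    Variant("GENE_C", "v4", "synonymous", 0.0, {"child1": 2, "child2": 2}),
    Variant("GENE_D", "v5", "missense", 0.001, {"child1": 2, "child2": 1}),
]
print(candidate_genes(TOY, AFFECTED))
```

In the toy data only GENE_A is retained: GENE_B is too common in controls, GENE_C is synonymous, and GENE_D does not fit a recessive model in the second child.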
DISCUSSION

A new form of syndromic RP was investigated in a family presenting with a previously undescribed constellation of phenotypic features. Several common features characteristic of RP, such as salt-and-pepper retinopathy, attenuation of the arterioles and generalized rod-cone dysfunction based on ERG, were mixed with features relatively atypical for RP, e.g. a mottled macula at an early age and peripapillary sparing of the RPE. The array of systemic features included developmental delay, which can often be seen in cases with RP, but also short stature and craniofacial features; the latter phenotypes are not usually associated with syndromic RP. Overall, the combination of retinal and systemic features was quite unique. One family has been described in which two affected siblings presented with a remarkably similar phenotype (29); however, no mutations in RDH11 were detected in that family, suggesting that the phenocopy could be due to mutations in genes from the same pathway, retinoic acid metabolism. The assessment of RDH11 (dual all-trans- and 11-cis-retinol-specific dehydrogenase) as the likely causal gene for the syndromic RP phenotype in this family was aided by two main facts: first, the protein has a well-characterized, albeit auxiliary, role in the visual cycle and second, the compound heterozygous nonsense mutations predict null alleles that render the protein non-functional. Although the RP phenotype is clearly due to deleterious mutations in RDH11, the systemic features were somewhat unexpected, because the role of RDH11 in all-trans-retinoic acid metabolism, and thus in the development of other organs, remains obscure. However, no other variants in any gene segregated with the phenotype and could plausibly explain it, and mutations in several genes from the retinoic acid metabolism pathway have been associated with severe developmental disorders.
Prominent examples include the retinol transporter STRA6 (30) and another retinol dehydrogenase, RDH10 (31), which, in addition to developmental phenotypes, also have specific eye phenotypes, albeit different from RP. RDH11 is one of the 11-cis-retinol dehydrogenases that catalyze the last oxidation step, the conversion of 11-cis-retinol to 11-cis-retinal, in the retinoid cycle in the RPE (28). In mice, RDH11 plays an auxiliary role in this enzymatic reaction, because most of it is carried out by RDH5 (32). Mutations in RDH5 cause autosomal recessive fundus albipunctatus, a relatively stationary night blindness which can, in some patients, develop into progressive cone dystrophy (33,34). Interestingly, knockout mouse models of either Rdh5 or Rdh11 do not replicate the human phenotypes, except for the night blindness/delayed dark adaptation (27,32). In double-knockout Rdh5−/−Rdh11−/− mice, the phenotype was enhanced: the animals exhibited a greater delay of dark adaptation and increased accumulation of cis-retinols and retinyl esters, suggesting epistasis and a cooperative role between RDH5 and RDH11 (32). Slowed recovery of rod responses and abnormal retinoid profiling suggest that RDH11 plays a complementary role to RDH5 in the flow of retinoids during dark adaptation (26). It is also possible that the RP phenotype results not from a lack of the 11-cis-retinol visual cycle activity of RDH11, but from abnormalities in the eye of its all-trans-retinol activity related to retinoic acid metabolism. In conclusion, we describe a new syndromic phenotype with RP caused by deleterious mutations in the RDH11 gene. Given that the Rdh11−/− mouse phenotype is mild, the phenotype in all three affected siblings was surprisingly and uniformly severe.

MATERIALS AND METHODS

Patients and clinical evaluation

Three affected siblings, together with an unaffected mother and father (Fig. 1), were enrolled in the study under the protocol IRB-AAAB6560 after obtaining informed consent. The protocol was approved by the Institutional Review Board at Columbia University and adheres to the tenets of the Declaration of Helsinki. Each patient underwent a complete ophthalmic examination by a retina specialist (S.H.T.), which included color fundus photography with an FF 450plus Fundus Camera (Carl Zeiss Meditec AG, Jena, Germany). Fundus autofluorescence (FAF) images were obtained using a confocal scanning laser ophthalmoscope (Heidelberg Retina Angiograph 2, Heidelberg Engineering, Dossenheim, Germany) by illuminating the fundus with argon laser light (488 nm) and viewing the resultant fluorescence through a band-pass filter with a short wavelength cutoff at 495 nm. Simultaneous FAF and SD-OCT images were acquired using a Spectralis HRA+OCT (Heidelberg Engineering, Heidelberg, Germany). Electroretinography was carried out using the Diagnosys Espion Electrophysiology System (Diagnosys LLC, Littleton, MA, USA). For all recordings, the pupils were maximally dilated before full-field ERG testing using guttate tropicamide (1%) and phenylephrine hydrochloride (2.5%), and the corneas were anesthetized with guttate proparacaine 0.5%.
Silver impregnated fiber electrodes (DTL; Diagnosys LLC) were used with a ground electrode on the forehead. Full-field ERGs to test generalized retinal function were performed using extended testing protocols incorporating the International Society for Clinical Electrophysiology of Vision standard (35). Standard automated perimetry was attempted on all three siblings using the 30-2 Swedish Interactive Threshold Algorithm field program in the Humphrey Visual Field Analyzer (Carl Zeiss Meditec, Inc., Dublin, CA, USA). Reliable visual fields (global indices <33%) were obtained only for the oldest sibling. In addition, to assess the loss in rod- and cone-mediated sensitivity, a two-color dark-adapted threshold technique on a modified Octopus perimeter (Haag-Streit AG, Köniz, Switzerland) was used. Following pupil dilation and 45 min of dark adaptation, sensitivities to 500 and 650 nm size V, 200 ms targets were obtained along the horizontal meridian (2° intervals) through the fovea to 15° eccentricity. The sensitivity difference to the two chromatic test stimuli determined whether rods, cones or both photoreceptor systems mediated threshold at a given location (36). A comprehensive review of pediatric health records in each sibling was conducted to assess systemic health.

Genetic analyses

The proband was initially screened for variants in the ABCA4 gene and on an array containing most known arRP genes (Asper Biotech, Inc., Tartu, Estonia; last accessed 12 February 2013), revealing no disease-associated variants. Subsequently, whole-exome sequencing was performed on family members at the Baylor College of Medicine Human Genome Sequencing Center. Genomic DNA samples were constructed into Illumina paired-end pre-capture libraries according to the manufacturer's protocol (Illumina Multiplexing_SamplePrep_Guide_ _ D) with modifications as described in the BCM-HGSC Illumina Barcoded Paired-End Capture Library Preparation protocol.
Libraries were prepared using Beckman robotic workstations (Biomek NXp and FXp models). The complete protocol and oligonucleotide sequences are accessible from the HGSC website ( ina_barcoded_paired-end_capture_library_preparation.pdf; last accessed 20 October 2013). Pre-captured libraries were pooled together and hybridized in solution to the HGSC CORE design (52 Mb, NimbleGen), and exome capture was performed according to the manufacturer's protocol NimbleGen SeqCap EZ Exome Library SR User's Guide (Version 2.2) with minor revisions. Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE ). Real-time analysis software was used to process the image analysis and base calling. Sequencing runs generated million successful reads on each lane of a flow cell, yielding 9-10 Gb per sample. With these sequencing yields, samples achieved an average of 90% of the targeted exome bases covered to a depth of 20× or greater. Illumina sequence analysis was performed using the HGSC Mercury analysis pipeline ( Pipeline; last accessed 12 May 2013) that addresses all aspects of data processing and analyses, moving data step by step through various analysis tools from the initial sequence generation on the instrument to annotated variant calls [single nucleotide polymorphisms (SNPs) and intra-read in/dels]. Pathogenicity of novel variants was assessed with predictive programs for splice sites and coding sequences, accessed via Alamut software (Alamut 2.3; last accessed 20 April 2014). All variants of interest were confirmed by Sanger sequencing, and segregation analyses were performed on all members of the family.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG online.
ACKNOWLEDGEMENTS

The authors thank the family members for participating in this study and acknowledge the insightful discussions, advice and the critical reading of the manuscript by Drs Krzysztof Palczewski and Uta Francke.

Conflict of Interest Statement. None declared.

FUNDING

This work was supported in part by National Institutes of Health grants EY021163, EY019861, HG and EY (Core Support for Vision Research), by research grants FIS PI13/00226, FIS PS09/00459, RD (Retics Biobank), CIBERER Intra/07/704.1 and Intra/09/702.1, ONCE, Fundaluce and Fundacion Conchita Rabago de Jimenez Diaz; and unrestricted funds from Research to Prevent Blindness (New York, NY) to the Department of Ophthalmology, Columbia University.

REFERENCES

1. Daiger, S.P., Sullivan, L.S. and Bowne, S.J. (2013) Genes and mutations causing retinitis pigmentosa. Clin. Genet., 84.
2. Bonnet, C. and El-Amraoui, A. (2012) Usher syndrome (sensorineural deafness and retinitis pigmentosa): pathogenesis, molecular diagnosis and therapeutic approaches. Curr. Opin. Neurol., 25.
3. Leitch, C.C., Zaghloul, N.A., Davis, E.E., Stoetzel, C., Diaz-Font, A., Rix, S., Alfadhel, M., Lewis, R.A., Eyaid, W., Banin, E. et al. (2008) Hypomorphic mutations in syndromic encephalocele genes are associated with Bardet-Biedl syndrome. Nat. Genet., 40.
4. Zaghloul, N.A. and Katsanis, N. (2009) Mechanistic insights into Bardet-Biedl syndrome, a model ciliopathy. J. Clin. Invest., 119.
5. Friedman, T.B., Schultz, J.M., Ahmed, Z.M., Tsilou, E.T. and Brewer, C.C. (2011) Usher syndrome: hearing loss with vision loss. Adv. Otorhinolaryngol., 70.
6. Guo, D.F. and Rahmouni, K. (2011) Molecular basis of the obesity associated with Bardet-Biedl syndrome. Trends Endocrinol. Metab., 22.
7. Putoux, A., Attie-Bitach, T., Martinovic, J. and Gubler, M.C. (2012) Phenotypic variability of Bardet-Biedl syndrome: focusing on the kidney. Pediatr. Nephrol., 27.
8. Puddu, P., Barboni, P., Mantovani, V., Montagna, P., Cerullo, A., Bragliani, M., Molinotti, C. and Caramazza, R. (1993) Retinitis pigmentosa, ataxia, and mental retardation associated with mitochondrial DNA mutation in an Italian family. Br. J. Ophthalmol., 77.
9. Inoue, H., Tanizawa, Y., Wasson, J., Behn, P., Kalidas, K., Bernal-Mizrachi, E., Mueckler, M., Marshall, H., Donis-Keller, H., Crock, P. et al. (1998) A gene encoding a transmembrane protein is mutated in patients with diabetes mellitus and optic atrophy (Wolfram syndrome). Nat. Genet., 20.
10. Keeler, L.C., Marsh, S.E., Leeflang, E.P., Woods, C.G., Sztriha, L., Al-Gazali, L., Gururaj, A. and Gleeson, J.G. (2003) Linkage analysis in families with Joubert syndrome plus oculo-renal involvement identifies the CORS2 locus on chromosome 11p12-q13.3. Am. J. Hum. Genet., 73.
11. Dixon-Salazar, T., Silhavy, J.L., Marsh, S.E., Louie, C.M., Scott, L.C., Gururaj, A., Al-Gazali, L., Al-Tawari, A.A., Kayserili, H., Sztriha, L. et al. (2004) Mutations in the AHI1 gene, encoding jouberin, cause Joubert syndrome with cortical polymicrogyria. Am. J. Hum. Genet., 75.
12. Bredrup, C., Saunier, S., Oud, M.M., Fiskerstrand, T., Hoischen, A., Brackman, D., Leh, S.M., Midtbo, M., Filhol, E., Bole-Feysot, C. et al. (2011) Ciliopathies with skeletal anomalies and renal insufficiency due to mutations in the IFT-A gene WDR19. Am. J. Hum. Genet., 89.
13. Coppieters, F., Lefever, S., Leroy, B.P. and De Baere, E. (2010) CEP290, a gene with many faces: mutation overview and presentation of CEP290base. Hum. Mutat., 31.
14. Katsanis, N. (2004) The oligogenic properties of Bardet-Biedl syndrome. Hum. Mol. Genet., 13 (Spec No. 1), R65 R
15. Siemiatkowska, A.M., van den Born, L.I., van Hagen, P.M., Stoffels, M., Neveling, K., Henkes, A., Kipping-Geertsema, M., Hoefsloot, L.H., Hoyng, C.B., Simon, A. et al. (2013) Mutations in the mevalonate kinase (MVK) gene cause nonsyndromic retinitis pigmentosa. Ophthalmology, 120.
16. Davidson, A.E., Schwarz, N., Zelinger, L., Stern-Schneider, G., Shoemark, A., Spitzbarth, B., Gross, M., Laxer, U., Sosna, J., Sergouniotis, P.I. et al. (2013) Mutations in ARL2BP, encoding ADP-ribosylation-factor-like 2 binding protein, cause autosomal-recessive retinitis pigmentosa. Am. J. Hum. Genet., 93.
17. Zuchner, S., Dallman, J., Wen, R., Beecham, G., Naj, A., Farooq, A., Kohli, M.A., Whitehead, P.L., Hulme, W., Konidari, I. et al. (2011) Whole-exome sequencing links a variant in DHDDS to retinitis pigmentosa. Am. J. Hum. Genet., 88.
18. 1000 Genomes Project Consortium, Abecasis, G.R., Altshuler, D., Auton, A., Brooks, L.D., Durbin, R.M., Gibbs, R.A., Hurles, M.E. and McVean, G.A. (2010) A map of human genome variation from population-scale sequencing. Nature, 467.
19. Haeseleer, F., Jang, G.F., Imanishi, Y., Driessen, C.A., Matsumura, M., Nelson, P.S. and Palczewski, K. (2002) Dual-substrate specificity short chain retinol dehydrogenases from the vertebrate retina. J. Biol. Chem., 277.
20. Bolz, H., Reiners, J., Wolfrum, U. and Gal, A. (2002) Role of cadherins in Ca2+-mediated cell adhesion and inherited photoreceptor degeneration. Adv. Exp. Med. Biol., 514.
21. Shin, J.B. and Gillespie, P.G. (2009) Unraveling cadherin 23's role in development and mechanotransduction. Proc. Natl. Acad. Sci. U. S. A., 106.
22. Bork, J.M., Peters, L.M., Riazuddin, S., Bernstein, S.L., Ahmed, Z.M., Ness, S.L., Polomeno, R., Ramesh, A., Schloss, M., Srisailpathy, C.R. et al. (2001) Usher syndrome 1D and nonsyndromic autosomal recessive deafness DFNB12 are caused by allelic mutations of the novel cadherin-like gene CDH23. Am. J. Hum. Genet., 68.
23. Schultz, J.M., Bhatti, R., Madeo, A.C., Turriff, A., Muskett, J.A., Zalewski, C.K., King, K.A., Ahmed, Z.M., Riazuddin, S., Ahmad, N. et al. (2011) Allelic hierarchy of CDH23 mutations causing non-syndromic deafness DFNB12 or Usher syndrome USH1D in compound heterozygotes. J. Med. Genet., 48.
24. Lin, B., White, J.T., Ferguson, C., Wang, S., Vessella, R., Bumgarner, R., True, L.D., Hood, L. and Nelson, P.S. (2001) Prostate short-chain dehydrogenase reductase 1 (PSDR1): a new member of the short-chain steroid dehydrogenase/reductase family highly expressed in normal and neoplastic prostate epithelium. Cancer Res., 61.
25. Kasus-Jacobi, A., Ou, J., Bashmakov, Y.K., Shelton, J.M., Richardson, J.A., Goldstein, J.L. and Brown, M.S. (2003) Characterization of mouse short-chain aldehyde reductase (SCALD), an enzyme regulated by sterol regulatory element-binding proteins. J. Biol. Chem., 278.
26. Parker, R.O. and Crouch, R.K. (2010) Retinol dehydrogenases (RDHs) in the visual cycle. Exp. Eye Res., 91.
27. Kiser, P.D., Golczak, M., Maeda, A. and Palczewski, K. (2012) Key enzymes of the retinoid (visual) cycle in vertebrate retina. Biochim. Biophys. Acta, 1821.
28. Kasus-Jacobi, A., Ou, J., Birch, D.G., Locke, K.G., Shelton, J.M., Richardson, J.A., Murphy, A.J., Valenzuela, D.M., Yancopoulos, G.D. and Edwards, A.O. (2005) Functional characterization of mouse RDH11 as a retinol dehydrogenase involved in dark adaptation in vivo. J. Biol. Chem., 280.
29. Lorda-Sanchez, I., Trujillo, M.J., Gimenez, A., Garcia-Sandoval, B., Franco, A., Sanz, R., Rodriguez de Alba, M., Ramos, C. and Ayuso, C. (1999) Retinitis pigmentosa, mental retardation, marked short stature, and brachydactyly in two sibs. Ophthalmic Genet., 20.
30. Berry, D.C. and Noy, N. (2012) Signaling by vitamin A and retinol-binding protein in regulation of insulin responses and lipid homeostasis. Biochim. Biophys. Acta, 1821.
31. Ashique, A.M., May, S.R., Kane, M.A., Folias, A.E., Phamluong, K., Choe, Y., Napoli, J.L. and Peterson, A.S. (2012) Morphological defects in a novel Rdh10 mutant that has reduced retinoic acid biosynthesis and signaling. Genesis, 50.
32. Kim, T.S., Maeda, A., Maeda, T., Heinlein, C., Kedishvili, N., Palczewski, K. and Nelson, P.S. (2005) Delayed dark adaptation in 11-cis-retinol dehydrogenase-deficient mice: a role of RDH11 in visual processes in vivo. J. Biol. Chem., 280.
33. Nakamura, M., Hotta, Y., Tanikawa, A., Terasaki, H. and Miyake, Y. (2000) A high association with cone dystrophy in Fundus albipunctatus caused by mutations of the RDH5 gene. Invest. Ophthalmol. Vis. Sci., 41.
34. Cideciyan, A.V., Haeseleer, F., Fariss, R.N., Aleman, T.S., Jang, G.F., Verlinde, C.L., Marmor, M.F., Jacobson, S.G. and Palczewski, K. (2000) Rod and cone visual cycle consequences of a null mutation in the 11-cis-retinol dehydrogenase gene in man. Vis. Neurosci., 17.
35. Marmor, M.F., Fulton, A.B., Holder, G.E., Miyake, Y., Brigell, M. and Bach, M. and International Society for Clinical Electrophysiology of Vision. (2009) ISCEV Standard for full-field clinical electroretinography (2008 update). Doc. Ophthalmol., 118.
36. Jacobson, S.G., Voigt, W.J., Parel, J.M., Apathy, P.P., Nghiem-Phu, L., Myers, S.W. and Patella, V.M. (1986) Automated light- and dark-adapted perimetry for evaluating retinitis pigmentosa. Ophthalmology, 93.


CHAPTER V.4

Mutations in spliceosome-associated protein homolog CWC27 cause syndromic autosomal recessive retinitis pigmentosa (Manuscript in Preparation)

Introduction

Retinitis pigmentosa (RP) is a genetically and clinically heterogeneous group of inherited retinopathies affecting 1 in 2,500-3,000 people worldwide (Daiger, Sullivan et al. 2013). It includes both non-syndromic and syndromic forms of all modes of inheritance, and is caused by mutations in >100 genes. By current estimates, known genes explain between 60 and 80% of RP cases, suggesting that many, mostly rare, RP loci remain to be found (Daiger, Sullivan et al. 2013). Syndromic forms of RP present with many heterogeneous phenotypes, the two best characterized of which are Usher syndrome, caused by mutations in 12 genes (Bonnet and El-Amraoui 2012), and Bardet-Biedl syndrome (BBS), which is caused by mutations in at least 17 genes (Leitch, Zaghloul et al. 2008, Zaghloul and Katsanis 2009). The specific systemic phenotypic features in patients vary: mutations in the same gene can often cause different phenotypes (allelic affinity), sometimes classified as separate diseases (Coppieters, Lefever et al. 2010), or the same syndrome can be caused by mutations in different genes (locus heterogeneity) (Katsanis 2004). Recent advances in next-generation sequencing (NGS) have enabled identification of several novel RP genes (Zuchner, Dallman et al. 2011, Davidson, Schwarz et al. 2013, Siemiatkowska, van den Born et al. 2013). Here, we investigated the clinical phenotype and the genetic cause of RP with syndromic features in a family where phenotypes strikingly resembled those from the RDH11 family (see Chapter V.3).

Results and Discussions

Two affected individuals in a consanguineous family were diagnosed with syndromic RP (Figure 5.1). To identify the molecular origin of the RP and syndromic phenotype, after excluding the RDH11 gene, we performed exome sequencing on the affected individuals and an unaffected sibling. Considering the apparent autosomal recessive inheritance of the disease, the variants detected by NGS in affected individuals were initially selected for further analysis when they met all of the following criteria: (1) variants were either missense, nonsense, frameshift or splice site variants, (2) variants were present in <0.5% in the Exome Aggregation Consortium (ExAC, available at: ), (3) variants were in protein coding regions and/or in canonical splice sites, and (4) variants were compound heterozygous or homozygous in the same gene in the affected children. An overlapping block of absence of heterozygosity (AOH) of approximately 7.6 Mb in chromosome 5 was shared by the affected individuals (Figure 5.2). Close examination of rare variants within that region revealed a homozygous synonymous variant in CWC27, transcript NM_ c.495G>A, p.E165E, in both affected individuals. This variant is located in the last coding nucleotide of exon 5 and is predicted to abolish the splice donor site downstream of exon 5 by MaxEntScan (Yeo and Burge 2004) and NNSPLICE (Reese, Eeckman et al. 1997), accessed through the Alamut Visual software (available at ). A search in the ExAC database (ExAC, available at: ) showed that p.E165E has never been reported.
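The statement that c.495G>A is synonymous (p.E165E) can be checked with simple codon arithmetic, sketched below. The reference codon GAG is not quoted in the text; it is inferred here from the reported reference base (G at the third codon position) and the reported amino acid (glutamate, encoded by GAA or GAG). The helper names are illustrative.

```python
# Entries of the standard genetic code needed for this illustration only.
CODON_TABLE = {"GAA": "E", "GAG": "E", "GAC": "D", "GAT": "D"}

def codon_position(c_pos):
    """Map a 1-based coding (c.) position to (codon number, position within codon)."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1

def substitute(codon, pos_in_codon, alt_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]

codon_no, pos = codon_position(495)          # c.495 -> codon 165, third base
ref_codon = "GAG"                            # inferred, see lead-in
alt_codon = substitute(ref_codon, pos, "A")  # G>A at the third base
is_synonymous = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]

print(codon_no, pos, ref_codon, alt_codon, is_synonymous)
```

Because 495 is divisible by 3, codon 165 ends exactly at the last nucleotide of exon 5, which is consistent with the variant leaving the amino acid unchanged while still sitting in the splice donor consensus.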

Figure 5.1. Pedigree of the family. Open circles and squares represent the unaffected female and male family members, respectively; closed circles and squares represent the affected female and male patients. The mother and father in this pedigree are both heterozygous for the p.E165E synonymous mutation in CWC27. The affected siblings are homozygous for the mutation.

Functional analysis of the effect of the p.E165E mutation using RT-PCR with primers flanking CWC27 exons 5 and 6 revealed a non-spliced PCR product of 1,098 bp in the affected siblings, while a spliced product of the expected size (154 bp) was detected in the unaffected sister and a WT control individual. The size of the PCR product in the affected siblings suggested a complete absence of splicing and the inclusion of the entire intron 5 in the mutated CWC27 transcript (Figure 5.3); the mutation should therefore be described as p.L167Gfs*3. Screening CWC27 in our collections of recessive and simplex cases of RP did not identify additional mutants.
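The interpretation of the RT-PCR result rests on a size comparison, which the following minimal sketch makes explicit. The two product sizes come from the text; the tolerance used to call a band and the function names are assumptions for illustration, and the implied length of the retained sequence is simply the difference between the two reported sizes, not an independently verified intron length.

```python
SPLICED_BP = 154     # product in the unaffected sister and WT control (from the text)
UNSPLICED_BP = 1098  # product in the affected siblings (from the text)

def retained_bp(observed_bp, spliced_bp=SPLICED_BP):
    """Extra sequence carried by a product relative to the correctly spliced one."""
    return observed_bp - spliced_bp

def classify_product(observed_bp, tolerance_bp=10):
    """Call a band as spliced or intron-retaining; the tolerance is an assumption."""
    if abs(observed_bp - SPLICED_BP) <= tolerance_bp:
        return "spliced"
    if abs(observed_bp - UNSPLICED_BP) <= tolerance_bp:
        return "intron retained"
    return "unexpected size"

print(retained_bp(UNSPLICED_BP))       # extra bases implied by the two band sizes
print(classify_product(154), "/", classify_product(1098))
```

Under the assumption that the larger product differs from the smaller one only by the retained intron, the two sizes imply about 944 bp of intronic sequence in the mutant transcript.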

Figure 5.2. Absence of heterozygosity (AOH) regions in chromosome 5. Chromosome 5 of the proband (II-1), the affected sister (II-2), and the unaffected sister (II-3) is shown. A large AOH block (gray area) of approximately 7.6 Mb was shared by the proband and the affected sister, but is absent in the unaffected sister. The causal mutation in CWC27 is marked by red vertical lines.
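The idea behind locating a shared AOH block can be sketched as follows. This is a simplified illustration and not the method used to produce Figure 5.2: it treats any heterozygous call as ending a run, whereas practical tools tolerate occasional genotyping errors, and the positions in the toy data are invented.

```python
def homozygous_runs(calls, min_len_bp):
    """calls: list of (position, is_homozygous) sorted by position.
    Returns (start, end) for runs of consecutive homozygous calls
    that span at least min_len_bp."""
    runs, start, prev = [], None, None
    for pos, hom in calls:
        if hom:
            if start is None:
                start = pos
            prev = pos
        else:
            if start is not None and prev - start >= min_len_bp:
                runs.append((start, prev))
            start = None
    if start is not None and prev - start >= min_len_bp:
        runs.append((start, prev))
    return runs

def shared_interval(a, b):
    """Overlap of two (start, end) intervals, or None if they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

# Toy genotype calls for two affected siblings (invented positions).
sib1 = [(1_000_000, True), (2_000_000, True), (9_000_000, True),
        (10_000_000, False), (11_000_000, True)]
sib2 = [(1_000_000, False), (2_000_000, True), (9_000_000, True),
        (12_000_000, True), (13_000_000, False)]

run1 = homozygous_runs(sib1, 5_000_000)
run2 = homozygous_runs(sib2, 5_000_000)
print(run1, run2, shared_interval(run1[0], run2[0]))
```

A candidate region is one that appears in every affected sibling's runs and in none of the unaffected sibling's, which matches the pattern described for the 7.6 Mb block on chromosome 5.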

Figure 5.3. Analysis of splicing of the CWC27 gene. Complementary DNA from proband II-1 (A, Lane 2), affected sibling II-2 (A, Lane 3), normal sibling II-3 (A, Lane 4), and a normal control individual (A, Lane 5) was amplified by primers located in exons 5 and 6 of the CWC27 gene. Amplification of the cDNA in the normal sibling and normal control produced the expected PCR fragment of 154 bp (A, Lanes 4 and 5; B). Amplification of the cDNA of the patients homozygous for the E165E variant yielded a larger 1,098 bp PCR product which includes the entire IVS5 (A, Lanes 2 and 3; B). Lane 6 of A shows the PCR fragment from amplification of the gDNA in the normal control, using the same primers. Lane 7 of A shows the negative control. Lanes 1 and 8 of A show DNA size standards.