Ahr gene Russell S. Thomas a, Sharron G. Penn a, Kevin Holden a, Christopher A. Bradfield b and David R. Rank a


Original paper

Sequence variation and phylogenetic history of the mouse Ahr gene

Russell S. Thomas a, Sharron G. Penn a, Kevin Holden a, Christopher A. Bradfield b and David R. Rank a

The Ahr locus encodes the aryl hydrocarbon receptor (AHR), which plays an important toxicological and developmental role. Sequence variation in this gene was studied in 13 different mouse lines that included eight laboratory strains, two Mus musculus subspecies and three additional Mus species. The data presented represent the largest study of sequence variation across multiple mouse lines in a single gene (15.9 kb/mouse line). Among all mice, the average frequency of all polymorphisms in the intronic regions was 20.3 variants/kb and the average exonic frequency was 14.1 variants/kb. For substitutions alone, the average frequencies in the intronic and exonic regions for all mice were 13.3 and 8.9 substitutions/kb, respectively. Between laboratory strains, the average intronic and exonic frequencies for all polymorphisms dropped to 5.4 and 2.9 variants/kb, respectively. There were 111 non-synonymous polymorphisms that resulted in 42 different amino acid changes, of which only 10 had been previously identified. Based on the nucleotide sequence, the phylogenetic history of the gene placed mice with the Ahr b2 and Ahr d alleles in separate branches, while mice with the Ahr b1 and Ahr b3 alleles exhibited a more complex history. Evolutionarily, the AHR protein as a whole appears to be under purifying selective pressure (Ka:Ks ratio 0.237). Despite significant functional constraint in the basic helix-loop-helix and PAS domains, ligand binding is not constrained to the high-affinity allele, which further supports the role of the AHR in development and its importance beyond the adaptive response to environmental toxicants.
Pharmacogenetics 2002, 12:151–163 © 2002 Lippincott Williams & Wilkins

Keywords: aryl hydrocarbon receptor, phylogeny, polymorphism

a Aeomica, Sunnyvale, California and b McArdle Laboratory for Cancer Research, University of Wisconsin, Madison, Wisconsin, USA

Correspondence to David R. Rank, Aeomica, 928 E. Arques Avenue, Mail Stop #9, Sunnyvale, California, USA
E-mail: david.rank@am.amershambiosciences.com

Received 18 September 2001 Accepted 16 October 2001

Introduction

The Ahr locus encodes a ligand-activated transcription factor, the aryl hydrocarbon receptor (AHR), which plays an important role in both the toxic response to environmental pollutants and developmental signaling. In the toxic response, environmental contaminants such as planar aromatic hydrocarbons (PAHs), chlorinated dioxins and other halogenated aromatic hydrocarbons activate the AHR and can eventually lead to cancer, thymic involution, wasting, chloracne, ovarian failure and birth defects [1–4]. The role of the AHR in developmental signaling is less well characterized and the endogenous ligand for the receptor remains unknown. However, targeted disruption of the AHR in mice led to liver and immune system effects [5,6], and additional studies have pointed to a role in embryonic development [7] and vascular remodeling [8].

The AHR is a basic helix-loop-helix (bHLH) protein and a member of the PAS superfamily. The term PAS is derived from the founding members of this family: Per, Arnt and Sim [9–11]. Each member of this family possesses a homologous domain of 250–300 amino acids that acts as a dimerization surface for homotypic interactions with other PAS proteins and heterotypic interactions with cellular chaperones. For the AHR, the PAS domain also functions as a binding surface for the aromatic ligands [12].
The bHLH domain acts as an additional dimerization surface in homotypic interactions with other PAS proteins and positions the basic region to allow contact with DNA. In general, the PAS superfamily of proteins can be viewed as environmental sensors because of their roles in circadian rhythms, hypoxia signaling and responses to environmental contaminants [13].

Early genetic analysis revealed that inbred strains of mice had a polymorphic response to administration of PAHs based on variability in their induction of mono-oxygenase activity [14]. Based on the mono-oxygenase response and other biochemical characteristics, the Ahr gene was classified into four phenotypic alleles (Ahr d, Ahr b1, Ahr b2 and Ahr b3) [15] (Table 1). The relative lack of mono-oxygenase response in the Ahr d allele is due to an A375V substitution in the ligand-binding domain, resulting in a decrease in binding affinity [16].

0960-314X © 2002 Lippincott Williams & Wilkins

Pharmacogenetics 2002, Vol 12 No 2

Table 1 Allelic summary of the Ahr gene and the associated mouse lines

               Ahr d                           Ahr b1               Ahr b2              Ahr b3
Phenotype (a)  Low affinity                    High affinity        High affinity       High affinity
               Stabilized by sodium molybdate  High heat stability  Low heat stability  Intermediate heat stability
               M r protein                     M r protein          M r protein         M r protein
Mice           DBA/2J                          C57BL/6J             BALB/cBy            MOLF/Ei
               129/SvJ                                              C3H/HeJ             SPRETUS/Ei
               SJL/J                                                A/J                 PANCEVO/Ei (b)
               CAST/Ei                                              CBA/J               CAROLI/Ei

(a) Phenotypic data summarized from Poland and Glover [15]; (b) not analyzed previously [15]. Phenotype inferred based on nucleic acid sequence.

Linkage studies using these mouse lines mapped the response to a single autosomal locus, Ahr, on mouse chromosome 12 [17]. The gene encoding the AHR has been cloned and characterized in a number of species including mouse, rat and human [12,18–20]. In mice, the structural gene spans more than 30 kb of genomic sequence [17]. The gene is composed of 11 exons and codes for an mRNA transcript of more than 5 kb in length with an open reading frame of approximately 2.5 kb [17]. The genomic structure of the Ahr gene with respect to the mRNA transcript and the various domains is depicted in Fig. 1.

Inbred strains of mice have become a significant genetic resource within the scientific community because of the isogenicity within strains and the heterogeneity between strains. This allows the mapping of loci controlling quantitative and qualitative traits through the crossing of phenotypically different inbred strains and the eventual identification of the responsible gene(s). Since an understanding of the genealogy is important in the genetic analysis, there has been significant effort to understand the phylogenetic histories of these strains [21] and to verify the presumed histories using a variety of genetic techniques [22–24].
However, a comprehensive database of genetic variation within inbred mice is incomplete, with information on genetic variation available for less than 10% of inbred strains [21]. For single nucleotide polymorphisms (SNPs) in particular, the largest mouse study to date was carried out by Lindblad-Toh and colleagues, who characterized the polymorphism rates in eight mouse strains [25]. They identified 2848 SNPs in 1755 sequence-tagged sites (STSs) using oligonucleotide arrays and mapped a subset to chromosomal locations for use in genetic linkage studies.

Fig. 1 Structural organization of the mouse Ahr, depicting exon/intron boundaries and domains relative to both the genomic sequence and mRNA transcript. For the genomic sequence, solid boxes represent protein-coding regions and open boxes represent the 5′ and 3′ untranslated regions. Numbers within the mRNA identify the exons, while introns are identified using letters below the genomic sequence. Above the mRNA, the important domains and the associated exons are highlighted.

Although recent large-scale efforts to find DNA sequence variants in both human and mouse have identified a number of useful polymorphisms for mapping studies, the vast majority are not known to be functionally important. A more directed approach has been to discover sequence variants within one gene or sets of genes by high-throughput sequencing of multiple individuals [26,27]. Similar to the directed sequencing approach in humans to identify and catalog the population variability within a disease-related gene, we have identified allelic differences in a toxicologically and developmentally important gene, Ahr, across multiple mouse lines. The eight strains, two subspecies and three additional Mus species selected for this study cover all of the previously defined phenotypic alleles for Ahr and span diverse evolutionary distances. The objectives of this approach were four-fold: 1) to identify the polymorphic differences in the mouse Ahr; 2) to compare the phylogenetic relationship of this gene with known mouse pedigrees; 3) to compare polymorphism rates within this gene to rates observed in whole genome scans; and 4) to evaluate the evolutionary pressures associated with the AHR.

Materials and methods

Polymerase chain reaction and DNA sequencing

The Ahr nucleotide sequence for the reference Mus musculus strain, 129/SvJ, was obtained from GenBank (AF325111) and used for polymerase chain reaction (PCR) primer design. Primers were designed using Primer3 ( ware/other/primer3.html) to amplify a tiling set of overlapping 600 bp regions across all exons, a portion of the 5′ and 3′ flanking regions, and a portion of the intronic sequence. Custom universal primers were added to the PCR primers to facilitate sequencing.
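The tiling layout described above can be illustrated with a short sketch. This is not the authors' primer-design procedure (that used Primer3); it only shows how a region might be covered by overlapping 600 bp windows. The function name `tile_amplicons`, the 100 bp overlap and the 2000 bp example region are assumptions made for illustration, since the paper does not state the overlap used.

```python
def tile_amplicons(start, end, amplicon_len=600, overlap=100):
    """Return (start, end) coordinates of overlapping windows covering [start, end).

    amplicon_len follows the 600 bp amplicon size given in the text;
    overlap is a hypothetical value chosen only for this illustration.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be non-negative and smaller than amplicon_len")

    step = amplicon_len - overlap  # distance between consecutive window starts
    tiles = []
    pos = start
    while True:
        tile_end = min(pos + amplicon_len, end)  # clip the last window to the region
        tiles.append((pos, tile_end))
        if tile_end >= end:
            break
        pos += step
    return tiles


# Example: cover a hypothetical 2000 bp region.
for tile in tile_amplicons(0, 2000):
    print(tile)
```

In practice the window boundaries would shift to wherever acceptable primers can be placed, so the coordinates produced here are only a starting grid.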
The PCR primers were used to amplify DNA from genomic templates from 12 mouse lines: A/J, BALB/cBy, C3H/HeJ, C57BL/6J, CAST/Ei (Mus musculus castaneus), CBA/J, DBA/2J, MOLF/Ei (Mus musculus molossinus), CAROLI/Ei (Mus caroli), PANCEVO/Ei (Mus hortulanus), SJL/J and SPRETUS/Ei (Mus spretus). The mouse genomic DNA was purchased from Jackson Labs (Bar Harbor, ME, USA). The PCR products were sequenced on a MegaBACE 1000 instrument (Molecular Dynamics, Sunnyvale, California, USA) using dye-terminator chemistry.

Polymorphism detection and verification

After sequencing, the nucleotide sequence of each mouse was assembled using the Phred/Phrap/Consed package [28–30] and the assembled sequences were compared against the reference nucleotide sequence of strain 129/SvJ. A list of potential polymorphisms was generated for each mouse using the cross_match program and compiled using a set of custom Perl scripts. The polymorphisms were manually verified by visually checking the assembled electropherograms. Only manually verified polymorphic bases were reported. All polymorphisms found in this study have been uploaded to dbSNP and the Ahr coding sequence for each mouse was deposited in GenBank (AF to AF405571).

A subset of the polymorphisms was verified using a single-base extension assay (SNuPE; Molecular Dynamics). Six polymorphisms in 12 mice were interrogated. Of the 72 genotype assays, 66 were successful. The six failed genotype assays were due to strain-specific polymorphisms within the primers. Of the 66 successful assays, 100% of the SNuPE calls agreed with the sequence-verified SNPs.

Phylogenetic and evolutionary analysis

To estimate the genealogical relationship among mouse lines, the nucleotide sequences of the Ahr protein-coding regions were aligned using the CLUSTAL (software for aligning groups of sequences) method.
A neighbor-joining phylogenetic tree was constructed based on the alignment using the program MEGA2 (version 2.0; in preparation) and the Kimura 2-parameter model. Bootstrap values based on 500 replicates were calculated for each interior branch.

The frequencies of synonymous substitutions per synonymous site (Ks) and non-synonymous substitutions per non-synonymous site (Ka) were estimated by first aligning nucleotide sequences of the open reading frame using the CLUSTAL method. Nucleotide sequences were used from the reference 129/SvJ strain, the two Mus musculus subspecies (CAST/Ei and MOLF/Ei) and the three separate Mus species sequenced in this study (CAROLI/Ei, PANCEVO/Ei and SPRETUS/Ei), as well as Rattus norvegicus (GenBank #NM_013149) and Homo sapiens (GenBank #NM_001621). For the individual domains, subsets of the same nucleotide sequences were used. The bHLH domain was defined as codons 10–80 in all mouse sequences and the rat sequence, and codons 11–81 in the human sequence. The first half of the PAS domain was defined as codons 115–248 in 129/SvJ, CAST/Ei and MOLF/Ei, codons 115–252 in PANCEVO/Ei, CAROLI/Ei, SPRETUS/Ei and rat, and codons 117–254 in human. The second half of the PAS domain was defined as codons 249–380 in 129/SvJ, CAST/Ei and MOLF/Ei, codons 253–384 in PANCEVO/Ei, CAROLI/Ei, SPRETUS/Ei and rat, and codons 255–386 in human. The transactivation domain was defined as codons 521–669 in 129/SvJ, CAST/Ei and MOLF/Ei, codons 525–675 in PANCEVO/Ei, CAROLI/Ei and SPRETUS/Ei, codons 525–674 in rat, and codons 529–667 in human. After aligning, the Ka and Ks values were calculated using the K-Estimator program (version 5.5) [31].

Amino acid alignment and defining conservative and non-conservative changes

The amino acid alignment of the AHR was performed using the CLUSTAL method. Conservative and non-conservative amino acid substitutions were defined using the BLOSUM62 matrix [32]. Conservative changes were defined as those having a positive or zero score in the matrix, while non-conservative changes were those having a negative score.

Results

The Ahr gene was sequenced across 12 different mouse lines and compared with the reference 129/SvJ sequence from GenBank (#AF325111). The resulting sequencing coverage included the entire open reading frame of the gene for all mice and differing amounts of untranslated, intronic and flanking region coverage. On average, the following coverage was achieved in the introns and flanking regions: intron A (227 bp), B (976 bp), C (522 bp), D (1574 bp), E (761 bp), F (401 bp), G (588 bp), H (744 bp), I (659 bp), J (2386 bp) and flanking (1994 bp). The amount of sequencing coverage in each mouse line studied is summarized in Table 2.

All polymorphisms in the Ahr were determined both relative to the reference 129/SvJ strain (Table 2) and by pair-wise comparisons (Table 3). In summary, 1426 polymorphic sites and a total of 2213 polymorphisms were identified within the 13 mouse lines. The polymorphisms can be broken down by region, with 501 (22.6%) located in exonic regions, 1250 (56.5%) in intronic regions and 462 (20.9%) in the 5′ and 3′ flanking regions. Of the exonic polymorphisms, 207 (41.3%) were in the coding region, of which 111 (53.6%) were non-synonymous. The non-synonymous changes produced 34 different amino acid substitutions and eight amino acid insertions. Non-conservative changes accounted for eight of the 34 amino acid substitutions (23.5%).
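The conservative versus non-conservative split reported above follows the BLOSUM62 sign rule from the methods. The sketch below applies that rule to six substitutions named later in the Results (D109N, A193T, I292S, M324I, L348F and V375A). The dictionary holds only the BLOSUM62 scores needed for those pairs rather than the full matrix, and the names `BLOSUM62_SUBSET` and `classify` are illustrative, not part of any published tool.

```python
# BLOSUM62 scores for the residue pairs used below (a small subset of the
# full 20 x 20 matrix; the matrix is symmetric, so pairs are stored unordered).
BLOSUM62_SUBSET = {
    frozenset("DN"): 1,
    frozenset("AT"): 0,
    frozenset("IS"): -2,
    frozenset("MI"): 1,
    frozenset("LF"): 0,
    frozenset("VA"): 0,
}


def classify(substitution):
    """Classify a substitution such as 'I292S' by the sign of its BLOSUM62 score.

    A positive or zero score is treated as conservative and a negative
    score as non-conservative, as described in the methods.
    """
    ref, alt = substitution[0], substitution[-1]
    score = BLOSUM62_SUBSET[frozenset((ref, alt))]
    return "conservative" if score >= 0 else "non-conservative"


for sub in ["D109N", "A193T", "I292S", "M324I", "L348F", "V375A"]:
    print(sub, classify(sub))
```

Of these six examples, only I292S has a negative score and is therefore classed as non-conservative under this rule.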
The majority of the substitutions in the coding region occurred at the third codon position (57.6%). A subset of polymorphic sites having sequence coverage in all mice is summarized in Fig. 2, with different types represented by different color codes.

A breakdown of the observed DNA sequence variants showed that 934 (42.2%) were transitions, 518 (23.4%) were transversions, 501 (22.6%) were insertions and 260 (11.7%) were deletions. The global analysis yielded a transition to transversion ratio of 1.8 : 1 and an insertion to deletion ratio of 1.9 : 1.

A comparison of the polymorphism frequencies in the various mouse lines studied shows that SPRETUS/Ei, CAROLI/Ei, PANCEVO/Ei and MOLF/Ei are all highly polymorphic across the entire gene, with more subtle differences between the standard laboratory strains (Tables 2 and 3). Among the laboratory strains studied, C57BL/6J was the most distinctive. Transitions, transversions, insertions and deletions are broken down for all mouse lines relative to the reference 129/SvJ strain in Table 2. Although the absolute ratio of transitions to transversions was highly variable across the mouse lines, transitions outnumbered transversions in every mouse line studied. In contrast, exonic deletions outnumbered insertions in only eight of the 12 mouse lines. Notably, the remaining four mouse lines that had a greater number of insertions than deletions included a Mus musculus subspecies and three distinct Mus species.

Examination of the exonic portions of the Ahr and the associated domains shows highly disparate polymorphism frequencies across mouse lines (Table 4). The coding portion of exon 1, which includes the methionine codon for translation initiation and the basic region of the bHLH domain (Fig. 1), contained no DNA sequence variants in any of the mouse lines studied.
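The variant breakdown reported above can be checked with a few lines of arithmetic. The sketch below first shows the usual purine/pyrimidine rule for telling a transition from a transversion, then recomputes the percentages and ratios from the four counts given in the text. The function name `substitution_type` is illustrative; the counts are taken directly from the Results.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Return 'transition' for purine<->purine or pyrimidine<->pyrimidine changes,
    and 'transversion' for purine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in PURINES | PYRIMIDINES or alt not in PURINES | PYRIMIDINES:
        raise ValueError("bases must be A, C, G or T")
    if ref == alt:
        raise ValueError("reference and variant bases are identical")
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Variant counts as reported in the text.
counts = {"transitions": 934, "transversions": 518, "insertions": 501, "deletions": 260}
total = sum(counts.values())

# Share of each variant class, in percent of all polymorphisms.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

ts_tv = counts["transitions"] / counts["transversions"]
ins_del = counts["insertions"] / counts["deletions"]

print(total, percentages)
print(f"transition:transversion = {ts_tv:.1f} : 1")
print(f"insertion:deletion = {ins_del:.1f} : 1")
```

The four counts sum to the 2213 polymorphisms reported, and the recomputed ratios round to the 1.8 : 1 and 1.9 : 1 values given in the text.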
Exon 2 contains the helix-loop-helix portion of the bHLH domain and showed a single transition in the first helix in A/J, BALB/cBy, C3H/HeJ and CBA/J, and a single transition in the second helix in CAROLI/Ei.

The PAS domain of the Ahr is contained within exons 3–9, with the ligand-binding region encoded by exons 7 and 8 (Fig. 1). For seven of the 12 mouse lines, there were no polymorphisms in exons 3–8 (Table 4). In the remaining mouse lines, a number of polymorphisms were observed. Exon 3 contained a single transition in SPRETUS/Ei and CAROLI/Ei, but these polymorphisms lay between the bHLH and PAS domains and not in the PAS region itself. Exon 4 contained a single transition in MOLF/Ei, and exon 5 contained a 12-base insertion in SPRETUS/Ei, CAROLI/Ei and PANCEVO/Ei. In addition, CAROLI/Ei contained five transitions within exon 5, and SPRETUS/Ei and PANCEVO/Ei each contained one transition and one transversion. Exon 6 contained four transitions in CAROLI/Ei plus one transition and one transversion in SPRETUS/Ei. In the ligand-binding region of the PAS domain, exon 7 contained four transitions in CAROLI/Ei, two transitions in PANCEVO/Ei and one transition in SPRETUS/Ei, MOLF/Ei and C57BL/6J. Exon 8 had two transitions in PANCEVO/Ei, SPRETUS/Ei, CAROLI/Ei and C57BL/6J and only one transition in MOLF/Ei. After the ligand-binding region within the PAS domain, exon 9 was more polymorphic across the various mouse lines. The laboratory mice A/J, BALB/cBy and CBA/J all had two

Table 2 Total sequence coverage and polymorphism breakdown relative to the reference 129/SvJ strain

Mouse line    Exon variant frequency (1)   Intron variant frequency (1)   Flanking variant frequency (1,2)
A/J           (0.94)                       (2.32)                         (12.29)
BALB/cBy      1.57 (1.18)                  1.92 (1.37)                    (13.13)
C3H/HeJ       1.35 (0.97)                  3.29 (2.23)                    (14.56)
CBA/J         1.14 (0.95)                  3.40 (2.18)                    4.98 (1.66)
DBA/2J        0.57 (0.19)                  0.10 (0.10)                    0.00 (0.00)
CAST/Ei       0.76 (0.38)                  0.24 (0.12)                    0.00 (0.00)
SJL/J         0.58 (0.19)                  0.11 (0.00)                    0.00 (0.00)
C57BL/6J      8.00 (4.60)                  (9.76)                         (16.61)
MOLF/Ei       7.56 (5.62)                  (8.39)                         (10.42)
SPRETUS/Ei    (10.39)                      (18.16)                        (24.14)
PANCEVO/Ei    (9.10)                       (14.89)                        (23.04)
CAROLI/Ei     (26.51)                      (31.59)                        (44.46)

Further rows of the table: transitions, transversions, insertions and deletions for the exon, intron and flanking variants; total sequence coverage (bp) for exonic, intronic and flanking sequence.

(1) Number outside brackets refers to total sequence variant frequency (variants/kb). Number in brackets refers to substitution frequency (substitutions/kb); (2) includes 5′ and 3′ flanking regions.

Table 3 Polymorphism frequency in exonic (A) and intronic (B) sequence between two mouse lines

A
            129/SvJ (1)  A/J          BALB/cBy     C3H/HeJ      CBA/J        DBA/2J       CAST/Ei      SJL/J        C57BL/6J     MOLF/Ei      SPRETUS/Ei   PANCEVO/Ei
CAROLI/Ei   42.2 (26.5)  43.6 (26.8)  43.6 (27.5)  43.9 (27.6)  43.0 (27.3)  43.1 (27.0)  43.2 (27.1)  43.4 (27.2)  49.4 (29.8)  48.5 (31.4)  54.5 (33.5)  49.3 (31.9)
PANCEVO/Ei  13.3 (9.1)   14.7 (9.9)   14.2 (9.7)   14.5 (9.9)   14.0 (9.7)   14.0 (9.4)   14.2 (9.6)   14.1 (9.5)   19.0 (11.8)  19.2 (13.0)  20.8 (13.3)
SPRETUS/Ei  17.9 (10.4)  19.7 (11.0)  19.2 (11.4)  18.5 (10.8)  18.3 (10.9)  18.2 (10.6)  18.4 (10.8)  18.0 (10.4)  23.4 (13.0)  23.2 (13.9)
MOLF/Ei     7.6 (5.6)    7.3 (6.0)    8.3 (6.3)    8.2 (6.3)    8.1 (6.2)    7.8 (5.8)    8.0 (6.0)    7.8 (5.8)    13.7 (8.7)
C57BL/6J    8.0 (4.6)    8.1 (4.5)    8.6 (5.3)    8.7 (5.2)    8.6 (5.2)    8.2 (4.8)    8.4 (5.0)    8.2 (4.8)
SJL/J       0.6 (0.2)    1.5 (1.0)    1.8 (1.4)    1.6 (1.2)    1.5 (1.2)    0.8 (0.4)    1.0 (0.6)
CAST/Ei     0.8 (0.4)    1.7 (1.3)    2.0 (1.6)    1.8 (1.4)    1.7 (1.3)    1.0 (0.6)
DBA/2J      0.6 (0.2)    1.4 (1.0)    1.8 (1.4)    1.5 (1.2)    1.5 (1.1)
CBA/J       1.1 (1.0)    1.4 (1.0)    1.5 (1.2)    1.4 (1.0)
C3H/HeJ     1.4 (1.0)    1.4 (1.0)    1.4 (1.0)
BALB/cBy    1.6 (1.2)    1.7 (1.2)
A/J         1.4 (0.9)

B
            129/SvJ (1)  A/J          BALB/cBy     C3H/HeJ      CBA/J        DBA/2J       CAST/Ei      SJL/J        C57BL/6J     MOLF/Ei      SPRETUS/Ei   PANCEVO/Ei
CAROLI/Ei   47.4 (31.6)  48.4 (35.1)  48.4 (33.4)  52.1 (34.2)  48.8 (32.7)  46.4 (31.8)  46.2 (31.4)  44.1 (31.5)  59.5 (39.6)  58.2 (38.4)  66.0 (46.6)  62.8 (42.1)
PANCEVO/Ei  23.3 (14.9)  25.9 (16.1)  25.6 (15.4)  25.6 (16.0)  26.7 (16.2)  23.9 (15.3)  23.9 (14.9)  23.5 (14.7)  33.3 (22.0)  33.8 (20.8)  35.3 (23.5)
SPRETUS/Ei  25.8 (18.2)  28.5 (18.7)  29.0 (19.9)  28.3 (19.3)  28.7 (19.3)  26.8 (18.6)  27.1 (18.4)  26.9 (18.4)  36.4 (24.9)  37.1 (24.1)
MOLF/Ei     15.4 (8.4)   17.2 (9.6)   19.1 (10.2)  18.1 (9.6)   17.5 (8.9)   14.1 (8.1)   16.5 (8.8)   15.1 (8.7)   25.9 (14.6)
C57BL/6J    15.1 (9.8)   19.0 (12.0)  17.6 (11.1)  17.9 (11.4)  19.0 (11.8)  13.9 (9.1)   16.7 (9.7)   14.6 (9.6)
SJL/J       0.1 (0.0)    3.6 (2.3)    2.1 (1.4)    3.6 (2.4)    3.6 (2.3)    0.2 (0.1)    0.4 (0.1)
CAST/Ei     0.3 (0.1)    1.6 (0.9)    2.3 (1.5)    1.8 (1.0)    1.6 (0.9)    0.4 (0.3)
DBA/2J      0.1 (0.1)    3.7 (2.5)    2.1 (1.6)    3.5 (2.3)    3.6 (2.3)
CBA/J       3.4 (2.2)    4.6 (3.3)    2.2 (1.5)    4.2 (2.9)
C3H/HeJ     3.3 (2.2)    4.8 (3.4)    2.2 (1.5)
BALB/cBy    1.9 (1.4)    2.7 (2.1)
A/J         3.6 (2.3)

(1) Reference sequence from GenBank. Numbers outside brackets refer to total sequence variant frequency (variants/kb). Numbers in brackets refer to substitution frequency (substitutions/kb).
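The two numbers in each cell of Table 3 are per-kilobase frequencies for a pair of mouse lines. A minimal sketch of how such values could be derived from a pair of aligned sequences is given below. It is not the authors' pipeline (which used cross_match and custom Perl scripts); the function name `pairwise_frequencies`, the gap character and the choice to count every gapped alignment column as one insertion/deletion variant are assumptions made for illustration.

```python
def pairwise_frequencies(seq_a, seq_b, gap="-"):
    """Return (variants_per_kb, substitutions_per_kb) for two aligned sequences.

    Assumptions for this sketch: the sequences are already aligned to equal
    length, each column gapped in exactly one sequence counts as one
    insertion/deletion variant, and columns gapped in both are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    substitutions = 0
    indels = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # no information in this column
        compared += 1
        if a == gap or b == gap:
            indels += 1
        elif a != b:
            substitutions += 1

    if compared == 0:
        raise ValueError("no comparable alignment columns")

    # Scale counts to events per 1000 compared bases.
    total_per_kb = (substitutions + indels) * 1000 / compared
    subs_per_kb = substitutions * 1000 / compared
    return total_per_kb, subs_per_kb


# Toy example: one substitution and one gapped column in 10 aligned positions.
print(pairwise_frequencies("ACGTACGTAC", "ACGAAC-TAC"))
```

With real data the denominator would be the number of bases covered in both lines, which is why the table values differ by region and by pair.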

Fig. 2 Pictorial representation of the genotypes of the 12 mouse lines sequenced relative to the reference 129/SvJ strain using the subset of polymorphic sites having sequence coverage in all mice. Each mouse is represented by a row of boxes and the polymorphic loci by the columns. A blue box denotes the wild-type base relative to the reference 129/SvJ strain, a yellow box denotes a substitution, a green box an inserted base and a red box a deleted base. The mice are listed by row from top to bottom as follows: BALB/cBy, CBA/J, A/J, C3H/HeJ, SJL/J, DBA/2J, CAST/Ei, 129/SvJ, MOLF/Ei, C57BL/6J, PANCEVO/Ei, SPRETUS/Ei and CAROLI/Ei. Contiguous groups of blue boxes below each section identify the single nucleotide polymorphisms within the various exons of the Ahr, starting with exon 2 and ending with exon 11 (no polymorphisms were identified in exon 1).

transitions, C3H/HeJ had three transitions and C57BL/6J had only one transition. In the inbred subspecies, MOLF/Ei and SPRETUS/Ei each had a single transition while PANCEVO/Ei had two transitions and CAROLI/Ei had three transitions and two transversions.

The rest of the Ahr open reading frame is contained within exon 10 and a portion of exon 11. For exon 10, A/J, BALB/cBy, C3H/HeJ, CBA/J, CAST/Ei and SJL/J all have a single transition, while C57BL/6J, MOLF/Ei, SPRETUS/Ei, PANCEVO/Ei and CAROLI/Ei are more polymorphic, having six, three, 10, eight and 21 transitions, respectively. In addition, C57BL/6J had two transversions, SPRETUS/Ei and PANCEVO/Ei had four transversions and a six-base-pair insertion, while CAROLI/Ei had 10 transversions and a six-base-pair insertion. The portion of exon 11 that contained open reading frame had no polymorphisms in seven of the 12 mice. In C57BL/6J, a single transition was identified that caused a premature stop codon, while MOLF/Ei had four transitions and a two-base-pair insertion.
The insertion in MOLF/Ei caused an extension of the open reading frame relative to the remaining mouse lines. Finally, CAROLI/Ei had three transitions and one transversion, and SPRETUS/Ei and PANCEVO/Ei had two and three transitions, respectively.

Using the nucleotide sequence from the open reading frame of the Ahr, a neighbor-joining tree was constructed to characterize the phylogenetic relationship within the mouse lines (Fig. 3). Mouse lines possessing

Table 4 Polymorphism frequency by exon relative to the reference 129/SvJ strain (variants/kb). Columns: A/J, BALB/cBy, C3H/HeJ, CBA/J, DBA/2J, CAST/Ei, SJL/J, C57BL/6J, MOLF/Ei, SPRETUS/Ei, PANCEVO/Ei and CAROLI/Ei; one row per exon.

Fig. 3 Neighbor-joining phylogenetic tree of the 13 mouse lines used in this study based on the nucleotide sequence within the open reading frame of the Ahr. Bootstrap values based on 500 replicates are given for each interior branch. Mouse lines from top to bottom: BALB/cBy, CBA/J, A/J, C3H/HeJ, SJL/J, DBA/2J, CAST/Ei, 129/SvJ, MOLF/Ei, C57BL/6J, PANCEVO/Ei, SPRETUS/Ei and CAROLI/Ei.

the Ahr b2 allele were clustered near the top of the tree and were most closely related to lines having the Ahr d allele (Fig. 3; Table 1). Interestingly, the MOLF/Ei subspecies having the Ahr b3 allele was more closely related to the Ahr d allele than was the laboratory mouse C57BL/6J, which has the Ahr b1 allele. Two distinct Mus species with the Ahr b3 allele, PANCEVO/Ei and SPRETUS/Ei, also clustered together, while CAROLI/Ei was the most divergent of the mouse lines studied.

Alterations in the AHR protein resulting from the non-synonymous polymorphisms are represented by an amino acid alignment in Fig. 4. Notably, no amino acid changes were found in the bHLH region in any of the mouse lines studied. Between the bHLH and PAS domains, a single D109N substitution was observed in SPRETUS/Ei, with 129/SvJ defined as the reference strain. In the PAS domain, a total of five amino acid substitutions and one insertion were observed. The A-repeat region of the PAS domain had an additional four amino acids in PANCEVO/Ei, SPRETUS/Ei and CAROLI/Ei. Between the A- and B-repeats, a single A193T substitution was found in CAROLI/Ei. Within the B-repeat region, an I292S substitution was found in CAROLI/Ei and an M324I substitution was observed in C57BL/6J.
In the C-terminal portion of the PAS domain, the mouse lines composing the Ahr b2 allelic class shared a common L348F substitution while all Ahr b alleles shared a common V375A substitution. Outside the PAS domain in the C-terminal end of the protein, a total of 28 amino acid substitutions, two insertions, a nonsense mutation causing a premature

9 Sequence variation of the mouse Ahr gene Thomas et al. 159 Fig. 4 Majority MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI BALB/cBy MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI CBA/J MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI A/J MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI C3H/HeJ MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI SJL/J MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI DBA/2J MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI CAST/Ei MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI 129/SvJ MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI MOLF/Ei MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI C57BL/6J MSSGANITYASRKRRKPVQKTVKPIPAEGIKSNPSKRHRDRLNTELDRLASLLPFPQDVINKLDKLSVLRLSVSYLRAKSFFDVALKSTPADRNGGQDQCRAQIRDWQDLQEGEFLLQALNGFVLVVTADALVFYASSTIQDYLGFQQSDVIHQSVYELI PANCEVO/Ei 
[Figure: amino acid alignment of the AHR in the 13 mouse lines; alignment rows not reproduced here.]

Amino acid alignment of the aryl hydrocarbon receptor in the 13 mouse lines using the CLUSTAL method. The sequence identity is shown on the left side of the alignment and amino acids differing from the consensus are highlighted with a box. The relative positions of the various domains within the AHR are highlighted using colored boxes as follows: red, basic region; blue, helix-loop-helix; green, PAS domain; black, A-repeat; brown, B-repeat; violet, transactivation domain. Non-conservative substitutions are highlighted in yellow.

Pharmacogenetics 2002, Vol 12 No 2

stop and a polymorphism eliminating the stop codon were found in the various mouse lines (Fig. 4). Of the C-terminal changes, only two of the substitutions were shared among all members of the defined phenotypic allele classes. Among the mouse lines comprising the Ahr b2 allelic class, an A758T substitution was found, while a conservative I808V substitution was found in all mouse lines with the Ahr b3 allele. Within the transactivation domain [33], there were eight substitutions and two insertions.

To evaluate the evolutionary pressure on the AHR as a whole and its individual domains, the ratios of non-synonymous (Ka) to synonymous (Ks) substitutions were calculated using the coding sequences from a Mus musculus strain (129/SvJ), the two Mus musculus subspecies and three Mus species sequenced in this study, as well as Rattus norvegicus and Homo sapiens. The AHR as a whole possessed a Ka:Ks ratio of 0.237, with the bHLH domain having the lowest Ka:Ks ratio (0.022) followed by the PAS domain (0.064 and for the N-terminal and C-terminal halves, respectively) and the transactivation domain (0.201). For comparison, the C-terminal end of the protein had a Ka:Ks ratio of

Discussion

In this report, we have produced a comprehensive description of the sequence variation in a single gene, Ahr, across 13 different mouse lines that include inbred laboratory strains, Mus musculus subspecies and separate Mus species. To our knowledge, the DNA sequence data presented represent the largest study of sequence variation across multiple mouse lines in a single gene (approximately 15.9 kb/mouse line). As a result, the sequence information will be useful in establishing polymorphism rates and characteristics within the Mus genus, comparing these rates with those observed in whole-genome scans, and understanding the evolutionary pressures that underlie the biological and toxicological role of this particular gene.
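Polymorphism rates of the kind discussed in this report are expressed as variants per kilobase of sequence. The following is a minimal sketch of that normalization; the function name `variants_per_kb` and the counts in the example are hypothetical and are not taken from the paper.

```python
def variants_per_kb(variant_count: int, region_length_bp: int) -> float:
    """Return a polymorphism frequency in variants per kilobase."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return variant_count / (region_length_bp / 1000.0)


# Hypothetical counts, for illustration only (not values from the study).
intronic = variants_per_kb(variant_count=61, region_length_bp=3000)
exonic = variants_per_kb(variant_count=12, region_length_bp=1500)
print(f"intronic: {intronic:.1f} variants/kb")
print(f"exonic:   {exonic:.1f} variants/kb")
```

Normalizing by region length is what allows intronic and exonic regions of different sizes to be compared on the same scale.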
For a number of the polymorphism statistics, our results agree with previous studies in both mice and humans. The transition to transversion ratio in our study (1.8 : 1) is consistent with that reported in a previous study of mouse polymorphisms (2 : 1) [25], and the number of coding polymorphisms found at the third codon position (57.6%) is in close agreement with that observed by Garg and colleagues for human expressed sequence tags (ESTs) (59%) [34]. In addition, the relative number of non-synonymous polymorphisms (53.6%) was also similar to that observed in human ESTs (43%) [34].

Despite the similarities between our data and previous studies, there were two important differences. First, the proportional distribution of the various types of DNA sequence variants observed in our study differed from previous findings. Our study found that 42.4% of the sequence variants were transitions, 23.4% were transversions, 22.6% were insertions and 11.7% were deletions. In the human lipoprotein lipase gene, the number of transitions was similar (59%), but the numbers of transversions (41%) and insertions/deletions (10%) were significantly different [26]. An explanation for these differences is not immediately apparent, but it is difficult to compare across species, and different loci are also known to have considerable variation in the relative proportion of substitutions, insertions and deletions [35].

An additional difference in our study was the relatively high polymorphism frequency. Among all mouse lines, the average frequency of all polymorphisms in the intronic regions is 20.3 variants/kb and the average exonic frequency is 14.1 variants/kb (Table 3). For substitutions alone, the average frequencies in the intronic and exonic regions were 13.3 and 8.9 substitutions/kb, respectively. These estimates are highly skewed, since the subspecies in the study contribute disproportionately due to the high frequencies of polymorphisms.
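The transition to transversion ratio quoted above follows from the reported proportions of transitions (42.4%) and transversions (23.4%). The sketch below shows how single-nucleotide substitutions are conventionally classified and how the ratio is derived; the function names and the toy substitution list are illustrative assumptions, not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not ({ref, alt} <= (PURINES | PYRIMIDINES)):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Transition: purine<->purine or pyrimidine<->pyrimidine.
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(pairs):
    """Return transitions divided by transversions for (ref, alt) pairs."""
    ts = sum(1 for r, a in pairs if classify_substitution(r, a) == "transition")
    tv = len(pairs) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# Toy substitutions (hypothetical): four transitions, two transversions.
toy = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
print(ts_tv_ratio(toy))  # 2.0

# Ratio implied by the percentages reported in the text.
print(round(42.4 / 23.4, 2))  # 1.81
```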
Between laboratory strains, the average intronic and exonic frequencies for all polymorphisms were 5.4 and 2.8 variants/kb, respectively. For substitutions alone, the average frequencies in the intronic and exonic regions were 3.5 and 1.8 substitutions/kb, respectively. By comparison, the average intronic and exonic frequencies for all polymorphisms between the various Mus musculus subspecies and separate Mus species were 40.7 and 29.9 variants/kb, respectively. As expected, the polymorphism frequencies in the exonic regions are consistently lower than those in the intronic regions. However, the substitution frequencies we measured within laboratory mice are two to four times higher in this gene than those measured by Lindblad-Toh and colleagues for both STSs (0.95 substitutions/kb) and ESTs (0.81 substitutions/kb) [25]. One possibility is that the Ahr gene has a higher than average mutation frequency (i.e. is hypermutable) due to some inherent property of the gene. An alternative and more likely possibility is that the differences in rates reflect the genetic history of the Ahr gene during the selection and breeding of the various mouse lines. Coalescent times for various alleles can vary across the genome. Genes with a more distant common ancestor have had more time to accumulate polymorphisms, suggesting that one or more of the laboratory strains in our study has remained relatively diverged during its selection and breeding from the original mouse stocks in Asia and Europe [21]. In support of this argument, the differences within the laboratory mice are primarily driven by the high polymorphism frequencies between the various laboratory strains and C57BL/6J. Without

C57BL/6J, the substitution frequencies within laboratory strains fall to 1.6 and 1.0 substitutions/kb for intronic and exonic regions, respectively; these are more in line with the whole-genome estimates. In addition, the substitution frequency between CAROLI/Ei and SPRETUS/Ei is very similar to that reported previously (46.6 vs 50 substitutions/kb) [25]. If the Ahr had a higher than average mutation rate, the polymorphic frequency between CAROLI/Ei and SPRETUS/Ei would also be higher.

Based on the nucleotide sequence of the Ahr coding region, we were able to reconstruct the phylogenetic history of the gene within the various mouse lines (Fig. 3). BALB/cBy, CBA/J, A/J and C3H/HeJ are clustered together on a single branch and their position reflects the known phylogenetic history from whole-genome scans [24,25]. Conversely, the mouse lines contained in the Ahr d allele cluster are relatively different from the genome-based phylogenetic histories. The CAST/Ei subspecies should be considerably diverged from all the laboratory strains and, among laboratory mice, 129/SvJ should be the most divergent, with C57BL/6J and DBA/2J following in succession [21,24,25]. However, for the Ahr, 129/SvJ, DBA/2J, SJL/J and CAST/Ei all share a common phylogenetic history, while C57BL/6J is highly diverged from the mouse lines in both the Ahr d and Ahr b2 alleles. Interestingly, the MOLF/Ei subspecies is more closely related to the laboratory mice and CAST/Ei at the Ahr than C57BL/6J is. Finally, PANCEVO/Ei and SPRETUS/Ei show a common ancestral gene following the split from the laboratory mouse strains and the CAST/Ei and MOLF/Ei subspecies, while CAROLI/Ei is the most divergent.

Of the 42 amino acid changes found in the AHR, only 10 have been reported previously [16].
These changes define the potential functional differences between the various alleles and can be used to better understand the role of the AHR in development and toxicity. For the bHLH domain, which acts as the region of contact between the AHR and its response element as well as a homotypic dimerization surface, no amino acid changes were found (Fig. 4). In contrast, five amino acid substitutions and a four-amino-acid insertion were found in the PAS domain, which acts as a dimerization surface for homotypic interactions with other PAS proteins, heterotypic interactions with cellular chaperones and a binding surface for the halogenated aromatic ligands. Nevertheless, the amino acid changes were not spread uniformly throughout the PAS domain. The N-terminal portion of the PAS domain was relatively well conserved, with the only major change being an additional four amino acids present in PANCEVO/Ei, SPRETUS/Ei and CAROLI/Ei, in the latter part of the A-repeat. Interestingly, the additional amino acids have significant structural consequences as predicted by protein secondary structure models [36]. Without the additional residues, the region is in the middle of a coil-turn-coil that is flanked on the N-terminus by an α-helix and on the C-terminus by a β-sheet. The additional residues disrupt the structure by eliminating the second coil and extending the predicted turn. Given the location of important residues for DNA binding in the central portion of the PAS domain [37], significant conformational changes upstream, such as the elimination of a coil, may have functional consequences. The C-terminal end had multiple changes, including the V375A substitution that has been shown to be responsible for the change in ligand binding [16]. Finally, the transactivation domain had eight amino acid substitutions and a pair of two-amino-acid insertions. Additional studies will be required to identify the functional consequences of these amino acid changes.
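Substitutions such as V375A are written as reference residue, position, variant residue. As a minimal sketch of how such labels can be read off a pairwise alignment, the hypothetical helper below walks two aligned sequences and numbers positions along the ungapped reference. The sequences are toy data, and the numbering convention is an assumption; the paper's own numbering may follow a different reference.

```python
def list_substitutions(reference: str, variant: str, gap: str = "-") -> list[str]:
    """List substitutions between two aligned sequences, e.g. 'V9A'.

    Positions are counted along the reference, ignoring gap columns.
    Insertions and deletions are skipped rather than reported.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have equal length")
    changes = []
    ref_pos = 0
    for ref_aa, var_aa in zip(reference, variant):
        if ref_aa != gap:
            ref_pos += 1  # advance only on real reference residues
        if ref_aa == gap or var_aa == gap:
            continue  # indel column, not a substitution
        if ref_aa != var_aa:
            changes.append(f"{ref_aa}{ref_pos}{var_aa}")
    return changes


# Toy aligned sequences (hypothetical, not AHR data).
reference = "MSSG--ANITV"
variant = "MSSGQCANLTA"
print(list_substitutions(reference, variant))  # ['I7L', 'V9A']
```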
At the molecular level, the evolutionary pressure on a gene is reflected in its degree of tolerance towards nucleotide substitution. In protein-coding genes, the evolutionary pressure is usually identified by the ratio of non-synonymous (Ka) to synonymous (Ks) substitutions [38]. Low Ka:Ks ratios (< 1) reflect purifying selection during periods of divergent evolution where the physiological role of the protein or domain is relatively constant. High Ka:Ks ratios (> 1) suggest an adaptive evolutionary pressure where specific changes in the amino acid sequence are favored due to the development of a different biological function. To determine the type and strength of evolutionary pressure on the AHR, the Ka:Ks ratios were calculated for the open reading frame as a whole and for the various domains within the protein. The sequences chosen for this study spanned relatively short evolutionary distances, with the Mus musculus strain (129/SvJ) and the two Mus musculus subspecies sequenced in this study, as well as more distant relatives using three additional Mus species, Rattus norvegicus and Homo sapiens. The time of divergence between humans and mice has been estimated at approximately 96 million years, while the rat and mouse divergence ranges from 23 to 33 million years [39,40]. Based on this approach, the Ka:Ks ratio for the Ahr open reading frame as a whole suggests that it is undergoing purifying selection and the function of the protein is highly constrained. Notably, ligand binding within the mice is not constrained to the high-affinity allele, further supporting the role of the AHR in development and its potential importance beyond the adaptive response to environmental toxicants. Similar conclusions have been reached by studying cDNA sequences of AHR in early vertebrates [41]. Broken down by domain, the bHLH domain is the most constrained, followed by the PAS domain and the transactivation domain.
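The distinction between synonymous and non-synonymous changes, and the reading of a Ka:Ks ratio against the threshold of 1, can be sketched as follows. This is a simplified illustration under stated assumptions: it only counts differing codons using the standard genetic code and does not normalize by the number of synonymous and non-synonymous sites, so it is not a Ka:Ks estimator and is not the method used in the study. The function names and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count differing codons as (synonymous, non_synonymous).

    Assumes two aligned, in-frame, ungapped coding sequences.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1  # same amino acid encoded
        else:
            nonsyn += 1  # amino acid changed
    return syn, nonsyn


def selection_regime(ka_ks: float) -> str:
    """Interpret a Ka:Ks ratio using the threshold of 1 described in the text."""
    if ka_ks < 1:
        return "purifying"
    if ka_ks > 1:
        return "adaptive"
    return "neutral"


# Toy coding sequences (hypothetical): GTT>GTC is silent, AAA>AGA is K>R.
print(count_codon_changes("ATGGTTGCTAAA", "ATGGTCGCTAGA"))  # (1, 1)
print(selection_regime(0.237))  # purifying
```

Published Ka:Ks estimates additionally correct for the number of available synonymous and non-synonymous sites and for multiple substitutions at the same site, which this sketch omits.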
Interestingly, both halves of the PAS domain are equally constrained. Given the importance of the C-terminal half of the PAS domain in


More information

Mutation. ! Mutation occurs when a DNA gene is damaged or changed in such a way as to alter the genetic message carried by that gene

Mutation. ! Mutation occurs when a DNA gene is damaged or changed in such a way as to alter the genetic message carried by that gene Mutations Mutation The term mutation is derived from Latin word meaning to change.! Mutation occurs when a DNA gene is damaged or changed in such a way as to alter the genetic message carried by that gene!

More information




