The Cephalopod Loligo bleekeri Mitochondrial Genome: Multiplied Noncoding Regions and Transposition of tRNA Genes

J Mol Evol (2002) 54:486-500
DOI: 10.1007/s00239-001-0039-4
Springer-Verlag New York Inc. 2002

Kozo Tomita,1,* Shin-ichi Yokobori,2 Tairo Oshima,2 Takuya Ueda,3 Kimitsuna Watanabe1,3

1 Department of Chemistry and Biotechnology, Graduate School of Engineering, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
2 Department of Molecular Biology, School of Life Science, Tokyo University of Pharmacy and Life Science, 1432-1, Horinouchi, Hachioji, Tokyo 192-0392, Japan
3 Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Received: 9 May 2001 / Accepted: 3 October 2001

Abstract. We previously reported the sequence of a 9240-bp fragment of mitochondrial (mt) DNA of the cephalopod Loligo bleekeri [J. Sasuga et al. (1999) J. Mol. Evol. 48:692-702]. To clarify further the characteristics of Loligo mtDNA, we have sequenced an 8148-bp fragment to reveal the complete mt genome sequence. Loligo mtDNA is 17,211 bp long and possesses a standard set of metazoan mt genes. Its gene arrangement is not identical to any other metazoan mt gene arrangement reported so far. Three of the 19 noncoding regions longer than 10 bp are 515, 507, and 509 bp long, and their sequences are nearly identical, suggesting that multiplication of these noncoding regions occurred in an ancestral Loligo mt genome. Comparison of the gene arrangements of Loligo, Katharina tunicata, and Littorina saxatilis mt genomes revealed that 17 tRNA genes of the Loligo mt genome are adjacent to noncoding regions. A majority (15 tRNA genes) of their counterparts is found in two tRNA gene clusters of the Katharina mt genome. Therefore, these 17 tRNA genes of the Loligo mt genome may have spread over the genome, and this may have been coupled with the multiplication of the noncoding regions.
Maximum likelihood analysis of mt protein genes supports the clade Mollusca + Annelida + Brachiopoda but fails to infer the relationships among Katharina, Loligo, and three gastropod species.

Key words: Cephalopod; Loligo bleekeri; Mitochondrial DNA; Molluscan phylogeny; Gene rearrangement; Noncoding region; tRNA

*Present address: Department of Biochemistry, University of Washington, Seattle, WA 98195-7350, USA
Correspondence to: Shin-ichi Yokobori; email: yokobori@ls.toyaku.ac.jp

Introduction

Typical metazoan mitochondrial (mt) genomes are circular, are 14-18 kb in size, and encode 13 protein, 2 rRNA [small and large subunit rRNAs (rrnS and rrnL)], and 22 tRNA (trnA, trnC, etc.) genes but no introns (Wolstenholme 1992; Boore 1999). The 13 polypeptides are involved in ATP synthesis coupled with electron transfer during O2 consumption [ATP synthetase subunits 6 and 8 (atp6 and atp8), cytochrome oxidase subunits I-III (cox1-cox3), apocytochrome b (cob), and NADH dehydrogenase subunits 1-6 and 4L (nad1-6 and nad4L)]. Although the mt gene order is well conserved in several phyla (e.g., Arthropoda and Vertebrata), large variations in mt genome structure have been found within and between several specific groups of Mollusca. Complete mtDNA sequences have been reported for various mollusks: a polyplacophoran [Katharina tunicata (Boore and Brown 1994)] and three gastropods, Cepaea nemoralis [Pulmonata (Terrett et al. 1996; Yamazaki et al. 1997)], Albinaria coerulea [Pulmonata

Fig. 1. Amplification strategy of the Loligo mtDNA segment sequenced in this study (upper panel) and complete gene organization of the Loligo mtDNA (lower panel). The relative position of each PCR fragment is shown by a horizontal bar. Sequences of the PCR primers are listed in the text. The circular genome is shown in a linear form. Genes encoded on the opposite strand are in gray boxes. tRNA genes are shown using one-letter abbreviations above the panel, or below the panel in the case of those encoded on the opposite strand. trnL(UAA), trnL(UAG), trnS(UGA), and trnS(GCU) are designated L1, L2, S1, and S2, respectively. The noncoding regions (>10 bp) are designated NC1, etc., and shown in black boxes.

(Hatzoglou et al. 1995)], and Pupa strigosa [Opisthobranchia (Kurabayashi and Ueshima 2000)]. In addition, the complete mtDNA gene arrangements of the bivalve Mytilus edulis (Hoffmann et al. 1992; see also Beagley et al. 1999) and a third pulmonate, Euhadra herklotsi (Yamazaki et al. 1997), have been reported. The size of molluscan mtDNA ranges from 14 kb [pulmonate gastropods (e.g., Terrett et al. 1996)] to approximately 34 kb [the scallop Placopecten magellanicus (e.g., La Roche et al. 1990)]. The gene arrangement of Mytilus mtDNA is notably different from that of other known metazoan mtDNAs in that atp8 is absent and an additional tRNA-Met gene is present (Hoffmann et al. 1992). Katharina, Mytilus, and opisthobranch/pulmonate gastropods exhibit marked differences in their mt gene arrangements (see Kurabayashi and Ueshima 2000), and the pulmonate mt gene orders differ even among species belonging to the same family (Yamazaki et al. 1997; Kurabayashi and Ueshima 2000). For this reason, comparison of mt gene organization has become a popular means of inferring metazoan phylogeny (see Boore 1999).

Although Cephalopoda is one of the major classes of Mollusca, no complete cephalopod mt genome has been sequenced so far. We previously reported a 9240-bp mtDNA fragment of the squid Loligo bleekeri (Sasuga et al. 1999) and found that the Loligo mt genome has a gene arrangement different from that of other mollusks such as Katharina. To characterize the cephalopod mt genome, we have sequenced the remaining region of the Loligo mt genome. The Loligo mt genome carries several long noncoding regions, which appear to be related to differences in mt gene arrangement between Loligo and other mollusks.

Materials and Methods

DNA Preparation, PCR, Cloning, and Sequencing

Total DNA was prepared from livers of Loligo by the conventional phenol-extraction method (Sambrook et al. 1989). The region of Loligo mtDNA that had not been determined previously was amplified by PCR (Saiki et al.
1988) as seven fragments, using the following primers designed according to the partial sequence of Loligo mtDNA (Sasuga et al. 1999; unpublished results): fragment A, 5'-gggaattcTAAATTATTCACATAATTCTGCC-3' and 5'-gggaagcttgGATCCTTGGTTTCATTCAT-3'; fragment B, 5'-gggaattcAAATATACAATCATAGCAAGTC-3' and 5'-gggaagcttgTATATCTTTATTTGATTATGGTT-3'; fragment C, 5'-gggaattcccgtaaaggaccttcac-3' and 5'-gggaagctTGGGGAATCTGAACTTGTATCT-3'; fragment D, 5'-TTCTTCGATCCTTTCGTA-3' and 5'-TTTATCAAAAACATCTCTCTTTG-3'; fragment E, 5'-gggctgcagacaaactaataaccaatacCCTTA-3' and 5'-gggctgcagcagaccggcgtgagccagGTTG-3'; fragment F, 5'-gggggatccttatgctaccttAGTACAGTTAA-3' and 5'-gggctgcagggttgtaggaataTATAATAATAGATG-3'; and fragment a, 5'-gggaattcttaacTATTCTCTTAATTGGCCT-3' and 5'-gggctgcagggtgttttTAGTACGCCCCT-3'. Underlined letters indicate restriction enzyme sites introduced for the convenience of ligation with the cloning vectors; lowercase letters denote additional 5' sequences inserted to ensure efficient digestion by the restriction enzymes. The relative locations of the PCR fragments are shown in Fig. 1 (upper panel).

PCR was carried out as described by Saiki et al. (1988) in 50 µL of a solution containing 10 mM Tris-Cl, pH 8.4 (at 25°C), 2 mM MgCl2, 400 µM dNTPs, 25 pmol of each PCR primer, 2.5 U of Taq DNA polymerase, and 150 ng of total Loligo DNA. The mixtures were subjected to 30 cycles of PCR (one cycle: 94, 50-55, and 72°C for 1, 1, and 1.5 min, respectively). PCR-amplified fragments, purified on a Qiagen spin column (Qiagen) according to the manufacturer's protocol, were digested with restriction endonuclease and then ligated to pUC18. Escherichia coli JM109 was transformed with the recombinant plasmids. DNA was sequenced using the dideoxy-termination method with Sequenase version 2.0 (Amersham). Synthetic oligonucleotide primers based on the newly obtained sequence were used for sequence extension. More than three independent clones were analyzed for each DNA fragment.
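The primer design described above, with a restriction site built into the 5' tail, can be illustrated with a short Python sketch that scans a primer for recognition sequences. The recognition sequences listed are the standard ones for these enzymes, but the text does not name the enzymes the authors used, so the enzyme assignments are an assumption; the name `find_sites` is hypothetical.

```python
# Recognition sequences of four common cloning enzymes (standard values).
# Assumption: the text does not say which enzymes were used; these are
# simply candidates to look for in the primer tails.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
}


def find_sites(primer):
    """Return {enzyme: 0-based start} for every recognition site in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Forward primer of fragment A as printed in the text (case is ignored).
fragment_a_forward = "gggaattcTAAATTATTCACATAATTCTGCC"
print(find_sites(fragment_a_forward))
```

The extra bases upstream of the site correspond to the additional 5' sequence that the text says was added to ensure efficient digestion.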
Data Analysis

The nucleotide sequence of Loligo mtDNA was analyzed using the GENETYX software package (Software Development Co. Ltd., Tokyo). tRNA genes were identified by the formation of cloverleaf secondary structures. Clustal X (Thompson et al. 1997) was used to align amino acid sequences inferred from the Loligo mtDNA sequence with the counterparts of various metazoans. The complete sequence of Loligo mtDNA is available through the DDBJ/EMBL/GenBank DNA databases under accession number AB029616.

Phylogenetic Analyses Based on Primary Sequences

Amino acid sequences of mt protein genes were subjected to maximum likelihood (ML) analysis. Each protein gene was extracted from the following complete nucleotide sequences retrieved from the GenBank database: Metridium senile (accession number AF000023), Homo sapiens (J01415), Eumeces egregius (AB016606), Cyprinus carpio (X61010), Petromyzon marinus (U11880), Branchiostoma lanceolatum (Y16474), Balanoglossus carnosus (AF051097), Asterina pectinifera (D16387), Florometra serratissima (AF049132), Drosophila yakuba (X03240), D. melanogaster (U37541), Ceratitis capitata (AJ242872), Anopheles gambiae (L20934), A. quadrimaculatus (L04272), Locusta migratoria (X80245), Artemia franciscana (X69067), Daphnia pulex (AF117817), Ixodes hexagonus (AF081828), Rhipicephalus sanguineus (AF081829), Lumbricus terrestris (U24570), Platynereis dumerilii (AF178678), Katharina tunicata (U09810), Albinaria coerulea (X83390), and Terebratulina retusa (AJ245743). A sequence fragment of the gastropod Littorina saxatilis (LSA132137) was also retrieved. Together with the counterparts of the Loligo mt genome, all protein genes of interest were extracted and translated to amino acid sequences. Each protein gene was aligned using Clustal X (Thompson et al. 1997). After the alignments were slightly modified by hand, regions where the alignment was not satisfactory were removed. Two data sets were prepared.
One was a combination of atp6, atp8, cox1, cox2, cob, nad1, and nad6 data (1052 sites in total); the other was a combination of all genes (2301 sites in total). The alignments used for the phylogenetic analyses are available from S.Y. on request. The phylogenetic analyses were carried out by the ML method using PROTML in MOLPHY 2.3b (Adachi and Hasegawa 1996). First, the ML distances between all pairs of taxa were estimated by PROTML using the distance (D) option with the mtREV-F model. Then, a neighbor-joining (NJ) tree was reconstructed by NJDIST in MOLPHY. The NJ trees were used as the start topologies for the local rearrangement

search (R) option of ML trees by PROTML with the mtREV-F model as the substitution model. To compare the substitution rates among molluscan sequences, the 1052-site data prepared for ML analysis were analyzed with RRTree (Robinson-Rechavi and Huchon 2000).

Strand-Specific Bias

AT skew and GC skew (Perna and Kocher 1995) were calculated at each codon position of protein genes from the Loligo (this study; Sasuga et al. 1999), Katharina (Boore and Brown 1994), Littorina (partial) (Wilding et al. 1999), Pupa (Kurabayashi and Ueshima 2000), Lumbricus (Boore and Brown 1995), and Terebratulina (Stechmann and Schlegel 1999) mt genomes. The genes atp6, atp8, cox3, and nad3 (atp6 and atp8 only for Littorina) were encoded by the same strand in each of the mt genomes listed above. Similarly, cox1, cox2, and nad2 [cox1 (partial) and nad2 for Littorina] were encoded by the same strand, and nad1, nad4L, nad4, nad6, and cob [nad1, nad6, and cob (partial) for Littorina] were also encoded by the same strand. Therefore, we categorized the protein genes into three groups: A, B, and C. For the first and second positions, all the codons except those used for initiation and termination were included in the analyses. For the third position, only fourfold degenerate codon boxes were used. The data set sizes (number of codons) were as follows: for Loligo, 664, 1083, and 1972 for groups A, B, and C, respectively; for Katharina, 657, 1076, and 1968; for Littorina, 281, 598, and 690; for Pupa, 647, 1030, and 1909; for Lumbricus, 656, 1072, and 1990; and for Terebratulina, 660, 1067, and 1970. The same data set used to analyze the AT skew and GC skew was used to calculate the frequencies of amino acids specified by GT-rich codons (Phe TTY, Leu TTR, Val GTN, Cys TGY, Trp TGR, and Gly GGN) vs those specified by AC-rich codons (Pro CCN, Thr ACN, His CAY, Gln CAR, Asn AAY, and Lys AAR).

Results and Discussion

Sequence and Gene Content of the Loligo mt Genome

Table 1. Amino acid identities (%) of Loligo mt protein genes to those of Katharina, Littorina, Pupa, Terebratulina, and Lumbricus (a)

Gene    Katharina  Littorina  Pupa   Terebratulina  Lumbricus
atp6    44.9       51.7       33.8   39.5           41.8
atp8    35.2       35.8       24.5   16.9           29.6
cox1    76.9       -          70.4   75.6           73.3
cox2    60.4       63.5       50.0   56.4           55.7
cox3    69.5       -          58.2   69.1           71.8
cob     61.5       -          53.0   58.9           57.3
nad1    50.6       55.2       50.9   50.9           51.1
nad2    36.5       -          23.9   24.9           30.1
nad3    47.5       -          42.4   47.9           49.6
nad4    39.6       -          35.4   34.6           39.0
nad4L   37.0       -          36.4   33.7           32.3
nad5    43.2       -          34.2   39.0           42.0
nad6    38.8       34.1       21.2   20.7           26.8

(a) Data are from the following sources: Katharina (Boore and Brown 1994), Littorina (Wilding et al. 1999), Pupa (Kurabayashi and Ueshima 2000), Terebratulina (Stechmann and Schlegel 1999), and Lumbricus (Boore and Brown 1995). The highest identity for each protein gene is indicated by boldface numbers, and the lowest by italic numbers.

We newly determined the sequence of an 8148-bp fragment of Loligo mtDNA. This sequence, together with that determined previously [9240 bp (Sasuga et al. 1999)], with which it overlaps by 177 bp, provided the complete sequence of Loligo mtDNA, which has a total of 17,211 bp. The nucleotide composition of the complete Loligo mt genome is 38.8% A, 19.4% C, 9.2% G, and 32.5% T (A+T = 71.3%, G+T = 41.7%, AT skew = 0.089, and GC skew = -0.358) in the sense strand on which a majority of protein genes is encoded. In the Katharina mt genome, the nucleotide composition of the major coding strand is 31.4% A, 11.9% C, 18.6% G, and 38.1% T (A+T = 69.5%, G+T = 56.7%, AT skew = -0.095, and GC skew = 0.199). It was thus confirmed that the ratios of A and T and of G and C in the Loligo and Katharina mt genomes are inverted, as pointed out previously (Sasuga et al. 1999). In the newly sequenced fragment, six protein genes (cox3, nad3, cob, nad6, nad1, and the 5' half of nad2), two rRNA genes, and eight tRNA genes [trnQ, trnI, trnK, trnP, trnS(UGA), trnS(GCU), trnW, and trnV] were identified.
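The summary statistics quoted above can be recomputed from the reported base percentages. The Python sketch below uses the usual skew definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the function names are hypothetical, and recomputed values may differ somewhat from the printed ones, for example because the published percentages are rounded.

```python
def skew(x, y):
    """Return (x - y) / (x + y), the strand-asymmetry (skew) statistic."""
    return (x - y) / (x + y)


def composition_summary(a, c, g, t):
    """Summarise the base percentages of one strand."""
    return {
        "A+T": a + t,
        "G+T": g + t,
        "AT skew": skew(a, t),
        "GC skew": skew(g, c),
    }


# Base percentages reported in the text for the major coding strand.
loligo = composition_summary(a=38.8, c=19.4, g=9.2, t=32.5)
katharina = composition_summary(a=31.4, c=11.9, g=18.6, t=38.1)

# Genome length from the two sequenced fragments minus their overlap.
genome_length = 8148 + 9240 - 177

for name, summary in (("Loligo", loligo), ("Katharina", katharina)):
    print(name, {key: round(value, 3) for key, value in summary.items()})
print("assembled length (bp):", genome_length)
```

The signs of the two skews come out opposite in the two species, which is the inversion the text refers to.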
Together with our previous results (Sasuga et al. 1999), Loligo mtDNA is concluded to encode a standard set of metazoan mt genes: 13 protein, 2 rRNA, and 22 tRNA genes. The complete gene organization of Loligo mtDNA is presented in Fig. 1 (lower panel).

Protein Genes

The genes cox3, cob, nad2, and nad6 start with an ATG codon, nad3 with an ATA codon, and nad1 with an ATT codon. All these protein genes have complete termination codons (TAA or TAG). The sizes of the newly identified Loligo mt protein genes are very similar to those of their counterparts in the Katharina, Littorina, and Pupa mt genomes (data not shown). As shown in Table 1, four of the five Littorina mt protein genes (atp6, atp8, cox2, and nad1) show the highest similarity at the amino acid sequence level to their Loligo counterparts. In the case of nad6, the Katharina gene is more similar to its Loligo counterpart than the Littorina gene is. With regard to the remaining eight mt protein genes, six Katharina genes and two Lumbricus genes show the highest similarity to their Loligo counterparts. Although Littorina and Pupa both belong to Gastropoda, 8 of the 13 Pupa mt protein genes show the lowest similarity to their Loligo counterparts. This suggests that the Pupa mt protein genes might have evolved more rapidly than those of the other species in Table 1 (Loligo, Katharina, Lumbricus, and Terebratulina). The possible evolutionary patterns of these mt protein genes are discussed in more detail later.

rRNA Genes

Loligo mt rrnL and rrnS appear to be 1334 and 978 bp long, respectively, which is similar to their Katharina counterparts (Boore and Brown 1994) but longer than

those of pulmonate gastropods (e.g., Yamazaki et al. 1997).

Fig. 2. Cloverleaf structures of eight Loligo mt tRNA genes found in the region sequenced in this study (see Fig. 1).

tRNA Genes

The gene trnS(UGA) as well as trnS(GCU) in the Loligo mt genome can be folded into a cloverleaf structure similar to that of trnS(GCU) of other metazoan mitochondria (Fig. 2). Likewise, mt trnS(UGA) of Katharina (Boore and Brown 1994), the pulmonates Cepaea and Euhadra (Yamazaki et al. 1997), and the bivalves Mytilus edulis and M. californianus (Beagley et al. 1999) have been reported to lack a D stem, as is the case in nematode mt trnS(UGA) (Okimoto et al. 1992). However, the pulmonate Albinaria has a D stem in its mt trnS(UGA) (Hatzoglou et al. 1995); thus, lack of the D stem in this gene is not a common feature among molluscan mt genomes. Loligo trnS(UGA) has a GG sequence in the D loop and a TTCGA sequence in the T loop; interaction between these two conserved sequences may stabilize the tertiary structure of the tRNA, as in the cases of typical tRNAs (Dirheimer et al. 1995). Only 2 bp can be formed in the D stems of trnQ and trnK (Fig. 2). Three of the eight tRNA genes [trnS(UGA), trnS(GCU), and trnQ] have conserved GG nucleotides in the D loop. The T stems consist of 5 bp in seven of the tRNA genes; only trnI has a 4-bp T stem. Unlike several tRNAs in Cepaea and Euhadra mtDNAs (Yamazaki et al. 1997), none of the Loligo tRNA genes have lost their T stem.

Metazoan mitochondria use the modified wobble rule, as summarized by Yokobori et al. (2001). While the anticodon sequences of most of the 22 Loligo mt tRNA species are sufficient for reading their codons according to the mt wobble rule, there are two exceptions related to tRNAs involved in the translation of nonuniversal genetic codes, namely, the AUA codon specifying Met and the AGR codon specifying Ser (see Sasuga et al. 1999).
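The two nonuniversal assignments mentioned above can be made concrete with a small translation sketch in Python. It builds the standard genetic code and then applies only the two reassignments discussed in the text (AUA to Met, AGR to Ser); the full invertebrate mitochondrial code differs from the standard code in further ways that are deliberately left out here. The names `STANDARD`, `LOLIGO_EXCEPTIONS`, and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Only the two reassignments discussed in the text: AUA = Met, AGR = Ser.
LOLIGO_EXCEPTIONS = dict(STANDARD, ATA="M", AGA="S", AGG="S")


def translate(seq, table):
    """Translate an RNA or DNA coding sequence codon by codon."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


example = "AUAAGAAGGAGU"  # AUA, AGA, AGG, AGU
print(translate(example, STANDARD))
print(translate(example, LOLIGO_EXCEPTIONS))
```

Under the modified table all four AGN codons in the example read as Ser, which is the situation a single tRNA-Ser(GCU) species would have to decode.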
We have found that the C at the anticodon wobble position of Loligo mt tRNA-Met is modified to 5-formylcytidine (f5C) (Tomita et al. 1997). This f5C nucleotide modification has also been observed in bovine (Moriya et al. 1994), Ascaris (Watanabe et al. 1994), and D. melanogaster (Tomita et al. 1999) mt tRNAs-Met, in which the AUA codon is read as Met. Thus, it is most likely that a modification from C to f5C is involved in the recognition of the AUA codon in Loligo mitochondria. We have found that guanosine at the anticodon wobble position of Loligo mt tRNA-Ser(GCU) is modified to 7-methylguanosine (m7G) (Tomita et al. 1998). Modification to m7G also occurs at the first anticodon position of mt tRNA-Ser(GCU) of the starfish Asterias amurensis (Matsuyama et al. 1998). In both cases, all of the AGN codons specify Ser, and it may be that all of the AGN codons are recognized by a single tRNA species [tRNA-Ser(GCU)]. On the other hand, tRNA-Ser(GCU) decoding only AGY codons has been reported to carry an unmodified G at the first anticodon position in several metazoans, including bovine (Ueda et al. 1985) and urochordate Halocynthia roretzi mitochondria (Kondow et al. 1999). Thus, it

is most likely that in Loligo mitochondria, the modification from G to m7G is responsible for decoding AGR in addition to AGY codons.

Noncoding Region

The Loligo mt genome has 19 noncoding regions (NC) (Figs. 1 and 3) longer than 10 bp, 3 of which are longer than 500 bp, namely, NC4 (515 bp), located between trnQ and trnI; NC8 (507 bp), between trnW and trnK; and NC16 (509 bp), between trnG and trnA (Fig. 1). These three long NCs have nearly identical sequences (Fig. 3A; the pairwise similarities of NC4/NC8, NC4/NC16, and NC8/NC16 are 96.5, 97.8, and 95.7%, respectively). Therefore, it is likely that these three NCs originate from a single noncoding region. The differences among the sequences occur mainly at their 5' and 3' ends (Fig. 3A). For instance, NC4 contains (AT)10 and NC8 contains (AT)8 (three nucleotides of which overlap trnK) at their 3' ends (Fig. 3A), whereas NC16 does not contain any (AT)n sequence. In addition to NC4 and NC8, two other noncoding regions contain AT repeats: NC10, between trnQ and trnA, contains (AT)14, and NC19, between trnL(UAG) and cox3, contains (AT)12 (including two nucleotides that overlap cox3), as reported previously (Sasuga et al. 1999) (Fig. 3B).

When the secondary structures of NC4/8/16 were predicted for both strands, several stem-and-loop structures could be formed at the 5' and 3' regions of the strand shown in Fig. 3A (data not shown). In the case of the opposite strand also, several large stem-and-loop structures could be formed. In NC4/8/16, the sequence 5'-ATATAACCATCCACACTCACCCTCCATAAAC-3' occurs twice (boxed in Fig. 3A). However, neither copy was part of any of the stem-and-loop structures predicted for the strand shown in Fig. 3A, and the two copies were at different positions in the stem-and-loop structures predicted for the opposite strand (data not shown). When a BLAST search (BLASTN) was performed for NC4/8/16 against the mitochondrial database in NCBI (Altschul et al. 1997), there were three regions that matched other mt sequences.
Sequences similar to the central region (5'-ATAAACAAATAAATAAATACATAATA-3'; underlined in Fig. 3A) were found in the noncoding regions of various mt genomes, such as that of Saccharomyces cerevisiae (Foury et al. 1998) (data not shown). For the 5' and 3' regions, several hits were obtained, most of which were noncoding regions. However, no molluscan mtDNA sequences hit the NC4 sequence. From these results, it is difficult to assess the significance of the similarities between the NC4 sequence and other mtDNA sequences. It may be that the noncoding sequences determined and the predicted possible secondary structures play some role(s) in the early stages of the replication and transcription process. However, further experiments are needed to test this speculation.

Phylogenetic Analyses Based on Primary Sequences

When combined data on the inferred amino acid sequences of the 13 mt protein genes are used for the ML analysis (Adachi and Hasegawa 1996), Loligo forms a group with Katharina (Fig. 4A). However, the local bootstrap probability (LBP) support for the group is only 62%. In addition, the position of gastropods (Albinaria and Pupa) is far from that of Loligo/Katharina in the ML tree. Quartet puzzling (QP) analysis (Strimmer and von Haeseler 1996), which permits rate variation among sites, did not resolve the relationship among the Loligo/Katharina group, gastropods (Albinaria and Pupa), annelids (Lumbricus and Platynereis), and a brachiopod (Terebratulina) (data not shown). When the gastropod Littorina (1052 residues) is included in the ML analysis, Littorina and Katharina form a group (Fig. 4B) with an LBP of 83%. In the ML tree and the QP tree (data not shown), the relationship among Katharina/Littorina, Loligo, and Terebratulina is not resolved. In addition, the gastropods (Littorina, Albinaria, and Pupa) do not form a monophyletic group in these trees.
These results are essentially the same as those of Stechmann and Schlegel (1999), in which cephalopod data are not included. Thus, the monophyly of Mollusca (Polyplacophora, Gastropoda, and Cephalopoda in these analyses) as well as the phylogenetic position of Cephalopoda within Mollusca could not be resolved, as was the case with 18S rRNA analyses (e.g., Winnepenninckx et al. 1996), although the monophyly of Mollusca is widely accepted from morphological studies (e.g., Nielsen 1995; Willmer 1990). A close relationship among Annelida, Brachiopoda, and Mollusca has been suggested by 18S rRNA analysis (e.g., Cohen 2000). A close relationship between Annelida and Mollusca has also been suggested by analysis of partial elongation factor 1α (EF-1α) sequences (Kojima et al. 1993), as well as 18S rRNA analyses (e.g., Aguinaldo et al. 1997). In addition, an Annelida-Mollusca grouping rather than an Annelida-Arthropoda grouping has been suggested by various analyses, based not only on molecular phylogenetics but also on morphological studies (e.g., Eernisse et al. 1992).

Why is the monophyly of Mollusca not supported in the above analyses, even though the monophyletic origin of Mollusca is widely accepted (e.g., Nielsen 1995; Willmer 1990)? As noted earlier, either Katharina or Littorina protein genes exhibit the highest similarity to their Loligo counterparts except for cox3 and nad3; in these two cases, the Lumbricus genes show the highest similarity. Eight of the 13 Pupa mt protein genes have

Fig. 3. Comparison of the long noncoding sequences in Loligo mtDNA. (A) NC4, NC8, and NC16. Each location is shown in Fig. 1. Positions where the three noncoding sequences have the same nucleotide are denoted by asterisks. Repeated regions are boxed. The sequence 5'-ATAAACAAATAAATAAATACATAATA-3' mentioned in the text is underlined. (B) Noncoding sequences other than NC4, NC8, and NC16. For positions of noncoding regions, see Fig. 1.
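The asterisk line of an alignment such as Fig. 3A, and pairwise similarity percentages like those quoted for NC4, NC8, and NC16, can be sketched in Python as follows. How the authors actually computed the similarities (for example, how gaps were counted) is not stated, so the gap handling here is an assumption, and the toy sequences are invented for illustration rather than taken from the real noncoding regions.

```python
def identity_line(*aligned):
    """Return '*' for columns where all sequences share a nucleotide, else ' '."""
    return "".join(
        "*" if len(set(column)) == 1 and column[0] != "-" else " "
        for column in zip(*aligned)
    )


def percent_identity(seq1, seq2):
    """Percent identical columns; columns that are gaps in both are skipped.

    Assumption: a gap facing a nucleotide counts as a mismatch.
    """
    pairs = [(a, b) for a, b in zip(seq1, seq2) if not (a == "-" and b == "-")]
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Invented toy alignment (NOT the real NC4/NC8/NC16 sequences).
toy = {
    "NC_a": "ATATAACC-TCC",
    "NC_b": "ATATAACCATCC",
    "NC_c": "ATTTAACCATCC",
}

for name, sequence in toy.items():
    print(f"{name}  {sequence}")
print(f"      {identity_line(*toy.values())}")
print(round(percent_identity(toy["NC_a"], toy["NC_b"]), 1))
```

With real data the input would be the three aligned noncoding regions, all padded to the same length.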

the lowest identity among the species compared in Table 1. Such heterogeneity in the rate of evolution among molluscan mt genome sequences might have led to the construction of an incorrect tree; the Albinaria and Pupa sequences might have evolved more rapidly than the Katharina, Loligo, and Littorina sequences (Fig. 4).

Fig. 4. ML trees of protostomes based on the inferred amino acid sequences of mt protein genes. (A) Without Littorina. (B) With Littorina. See Materials and Methods for details.

To address this issue, the aligned sequence data used for the ML analysis presented in Fig. 4B were used for a relative rate test (Robinson-Rechavi and Huchon 2000). When D. yakuba is treated as the outgroup, there are no significant differences in the substitution rates among the molluscan species (data not shown). However, D. yakuba may be too distant an outgroup for this analysis. If the monophyly of Mollusca is accepted, although the ML trees in Fig. 4 do not support this, Annelida might be a much better outgroup for Mollusca than D. yakuba. When the annelid species (Lumbricus and Platynereis) are used as the outgroup, Albinaria shows a significantly higher substitution rate than Katharina (p ≪ 0.01, where the null hypothesis is that the two species compared show the same substitution rate), Loligo (p ≪ 0.01), and Littorina (p = 0.010). In addition, Pupa also exhibits a significantly higher substitution rate than Katharina (p = 0.043) and Littorina (p = 0.014), although the difference in substitution rates between Pupa and Loligo is not significant (p = 0.232). Thus, the substitution rates of the molluscan species compared here vary, the rates of Pupa and Albinaria being higher than those of the other mollusks. This could affect the shape of the recovered phylogenetic trees.

Another factor, related to the tempo of evolutionary change, which could have led to the construction of an

incorrect tree is the difference in nucleotide composition among molluscan and other protostome mt genes. The second position of a codon is the most conserved of the three codon positions, since changing the nucleotide at the second position changes the property of the encoded amino acid. Although AT and GC skews are in most cases negative at the second codon position, the GC skew of Katharina in groups A (consisting of atp6, atp8, cox3, and nad3) and B (consisting of cox1, cox2, and nad2) and the GC skew of Loligo in group C (consisting of nad1, nad4-nad6, and nad4L) are positive (Fig. 5A). The AT skews of Katharina at the first and third codon positions in groups A and B are negative, and the GC skews positive, as at the second codon position. For the same gene sets (groups A and B), the AT skew is very small and the GC skew is negative at the first and third positions of Loligo codons. In group C, on the other hand, the AT skew is very small and the GC skew is negative at the first and third positions of Katharina codons, whereas the AT skews of Loligo at the first and third codon positions are negative, and the GC skews positive, as at the second codon position. Thus, the AT and GC skews, and hence the direction of the compositional bias, are inverted between the same genes of Loligo and Katharina. The different direction of bias found in Loligo and Katharina mt protein genes affects the amino acid composition of the resultant polypeptides (Fig. 5B). The frequencies of GT amino acids (those encoded by GGN, GTN, TGN, and TTN codons) and AC amino acids (those encoded by AAN, ACN, CAN, and CCN codons) differ markedly between groups A/B and group C in both the Loligo and the Katharina mt genomes.
Furthermore, the direction of the bias in GT amino acid content, and likewise that in AC amino acid content, is opposite between the corresponding gene groups of the Loligo and Katharina mt genomes. Thus, nucleotide usage in Katharina and Loligo mt protein genes is governed not only by the gene type but also by another constraint. This is most likely to be strand-specific directional mutation pressure (Asakawa et al. 1991) operating on the genes. All three codon positions appear to have been affected by this directional mutation pressure, which would have given rise to the different amino acid compositions of the Loligo and Katharina mt protein genes. All the mt protein genes in Littorina, Terebratulina, and Lumbricus are encoded by a single strand; hence the AC/GT bias of these genomes is likely to change the nucleotide and amino acid compositions in the same direction for all the genes (Figs. 5A and B). On the other hand, the AC/GT bias is not an apparent constraint on amino acid usage in the Pupa mt protein genes (Figs. 5A and B).

The branching orders of vertebrate species in the ML trees presented in Fig. 4 differ from the widely accepted view. An unusual branching order of vertebrate species in phylogenetic analysis using mt sequences has been noted when lampreys and nonvertebrate metazoan species, such as echinoderms, are included (e.g., Takezaki and Gojobori 1999). This might also be affected, in part, by differences in amino acid composition between vertebrates and other species (see Takezaki and Gojobori 1999), as postulated for molluscan species (see above).

Evolution of Loligo mt Gene Arrangement

If only protein and rRNA genes are considered, a single inversion event would explain the difference in mt gene order between Katharina and the brachiopod Terebratulina, which appears as a species very closely related to mollusks in the phylogenetic tree based on mt protein gene sequences, as discussed above (Fig. 6).
Similarly, the difference in protein/rRNA mt gene arrangement between Katharina and Littorina (reported region) is explained by one inversion (Fig. 6). The difference in arrangement between the Katharina and the Loligo mt genomes is more complex, since five gene blocks are recognized and are arranged in different orders in the two genomes (Fig. 6). Two transpositions and one inversion are necessary to explain the difference in gene organization between the Loligo and the Littorina mt genomes (Fig. 6), whereas rearrangement of five gene blocks and two inversions are necessary to explain the difference between the Loligo and the Terebratulina mt genomes (Fig. 6). These findings suggest that the Loligo mt genome has a highly scrambled gene arrangement compared with those of the Katharina, Littorina, and Terebratulina mt genomes.

tRNA genes are known to transpose more frequently than protein and rRNA genes in metazoan mt genomes (e.g., Pääbo et al. 1991). In addition, tRNA genes are often found at the ends of deleted/duplicated regions, suggesting that they may be hot spots for gene rearrangement events (e.g., Stanton et al. 1994). When tRNA genes are also included in a comparison of the gene arrangements, the locations of 7 of the 22 tRNA genes in the Loligo mt genome can be directly compared with the locations of their counterparts in Katharina (Boore and Brown 1994) and Littorina (Wilding et al. 1999) (Fig. 7A). (Genes encoded by the opposite strand are underlined.) The order rrnS-trnV-rrnL is shared by these mt genomes as well as by various nonmolluscan mt genomes (i.e., most vertebrate and arthropod mt genomes, as well as annelid and Terebratulina mt genomes). The Loligo mt genome shares the orders trnS(gcu)-nad2, trnT-nad4L, nad5-trnF, and cob-trnS(uga) with the Katharina mt genome (regions containing these genes are not known for Littorina); these orders are also found in

Fig. 5. A Comparison of AT skew and GC skew (Perna and Kocher 1995) among the three codon positions, among three gene groups, and among the Loligo (this study; Sasuga et al. 1999), Katharina (Boore and Brown 1994), Littorina [partial (Wilding et al. 1999)], Pupa (Kurabayashi and Ueshima 2000), Lumbricus (Boore and Brown 1995), and Terebratulina (Stechmann and Schlegel 1999) mt genomes. The mt protein genes are divided into three groups (A, B, and C). Black and white bars indicate AT skews and GC skews, respectively. For details, see Materials and Methods. B Comparison of the frequencies of GT amino acids and AC amino acids among Loligo, Katharina, Littorina (partial), Pupa, Lumbricus, and Terebratulina. The same data set as that for A was used for the analysis. Black and white bars indicate the percentages of GT amino acids and AC amino acids, respectively.
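The two quantities plotted in Fig. 5 can be sketched in a few lines. The skews follow the usual definitions of Perna and Kocher (1995), AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), computed separately for each codon position. GT and AC amino acids are counted here directly from codons whose first two nucleotides are both G/T or both A/C, which matches the codon families listed in the text (GGN, GTN, TGN, TTN and AAN, ACN, CAN, CCN). The function names, the choice of denominator (all complete codons), and the 12-nt toy sequence are assumptions made for illustration, not the authors' procedure or data.

```python
from collections import Counter


def skew(x, y):
    """Return (x - y) / (x + y), or 0.0 when both counts are zero."""
    total = x + y
    return (x - y) / total if total else 0.0


def codon_position_skews(cds):
    """AT and GC skew at codon positions 1, 2, and 3 of an in-frame sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    result = {}
    for pos in range(3):
        counts = Counter(cds[pos:usable:3])
        result[pos + 1] = {
            "AT": skew(counts["A"], counts["T"]),
            "GC": skew(counts["G"], counts["C"]),
        }
    return result


def gt_ac_percentages(cds):
    """Percentage of codons of the GT type (GGN, GTN, TGN, TTN) and
    of the AC type (AAN, ACN, CAN, CCN), relative to all complete codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0, 0.0
    gt = sum(1 for c in codons if c[0] in "GT" and c[1] in "GT")
    ac = sum(1 for c in codons if c[0] in "AC" and c[1] in "AC")
    return 100.0 * gt / len(codons), 100.0 * ac / len(codons)


# Hypothetical 4-codon toy sequence (illustration only).
toy_cds = "GGATTTAAACCC"
print(codon_position_skews(toy_cds))
print(gt_ac_percentages(toy_cds))
```

Because the GT/AC classification depends only on the first two codon positions, no genetic code table is needed in this sketch; a real analysis would also have to decide how to treat termination codons and genes on different strands.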

Fig. 6. Comparison of the arrangement of rRNA and protein genes among the Loligo, Katharina (Boore and Brown 1994), Littorina (Wilding et al. 1999), and Terebratulina (Stechmann and Schlegel 1999) mt genomes. Genes encoded on the opposite strand are shown in gray boxes.
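The two rearrangement operations invoked in the text for Fig. 6, inversion and transposition, can be made concrete with a small sketch on signed gene blocks. Each block is a (name, strand) pair, with strand +1 or -1. The block names A to D and the function names are hypothetical placeholders for illustration; they are not the gene blocks of Fig. 6, and no attempt is made here to find a minimal rearrangement scenario.

```python
def invert(order, start, end):
    """Invert order[start:end]: reverse the block order and flip each strand."""
    segment = [(name, -strand) for name, strand in reversed(order[start:end])]
    return order[:start] + segment + order[end:]


def transpose(order, start, end, dest):
    """Cut order[start:end] and reinsert it before index `dest`
    of the remaining blocks, keeping strands unchanged."""
    segment = order[start:end]
    rest = order[:start] + order[end:]
    return rest[:dest] + segment + rest[dest:]


# Hypothetical gene blocks, all on the same strand initially.
blocks = [("A", 1), ("B", 1), ("C", 1), ("D", 1)]

inverted = invert(blocks, 1, 3)       # B and C swap places and change strand
moved = transpose(blocks, 0, 1, 2)    # A is moved to lie between C and D

print(inverted)
print(moved)
```

An inversion changes both the order and the coding strand of the affected blocks, whereas a transposition changes only their position, which is why the two kinds of event are counted separately when gene orders are compared.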

Fig. 7. A Comparison of gene arrangement among the Loligo, Katharina (Boore and Brown 1994), and Littorina (Wilding et al. 1999) mt genomes. Gene orders conserved between two genomes (at least the order of a gene pair) are indicated by bars. 4L, nad4L. Genes encoded on the opposite strand are shown in gray boxes. B Comparison of the partial orders of noncoding regions and flanking tRNA genes in the Loligo mt genome with those of the two tRNA gene clusters (see text), trnL(uaa), trnL(uag), and trnH in the Katharina mt genome (Boore and Brown 1994). Lengths of noncoding regions (NC4, etc.) are shown.
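The idea of a conserved gene pair, as marked by bars in Fig. 7A, can be sketched by comparing the sets of neighbouring genes in two orders. The example uses the Katharina segment from trnM to trnI and the rearranged order that the text proposes for the ancestral Loligo genome. Treating the segments as linear, ignoring coding strand, and the function name `adjacent_pairs` are simplifying assumptions made for this illustration.

```python
def adjacent_pairs(order):
    """Return the set of unordered neighbouring gene pairs in a linear order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


# Segment of the Katharina mt genome as given in the text (NC = noncoding region).
katharina = ["trnM", "trnC", "trnY", "trnW", "trnQ", "trnG", "trnE",
             "NC", "cox3", "trnK", "trnA", "trnR", "trnN", "trnI"]

# Rearranged order proposed in the text for the ancestral Loligo mt genome.
proposed_ancestor = ["trnM", "trnC", "trnY", "trnW", "trnQ", "trnG", "trnE",
                     "NC", "trnK", "trnA", "trnR", "trnN", "trnI", "cox3"]

shared = adjacent_pairs(katharina) & adjacent_pairs(proposed_ancestor)
broken = adjacent_pairs(katharina) - shared

print(len(shared), "gene pairs conserved")
print(sorted(sorted(pair) for pair in broken))
```

In this comparison only the two adjacencies involving cox3 are lost, which reflects the text's point that moving cox3 alone would bring the noncoding region next to both tRNA gene clusters.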

most arthropod mt genomes [trnS(gcu)-nad2, trnT-nad4L, nad5-trnF, and cob-trnS(uga)], annelid mt genomes [trnS(gcu)-nad2 and nad5-trnF], and the Terebratulina mt genome [trnS(gcu)-nad2 and nad5-trnF]. Therefore, these orders might be ancestral features of molluscan mtDNA. The Loligo and Littorina mt genomes share the positions of two tRNA genes, in the orders trnD-atp8 and trnP-nad6, which are not shared by the Katharina mt genome (Fig. 7A). The order trnD-atp8 is also found in the Lumbricus (Boore and Brown 1995) and Terebratulina (Stechmann and Schlegel 1999) mt genomes as well as in most arthropod mt genomes. This suggests that the order trnD-atp8 might be the ancestral gene order for molluscan mt genomes. On the other hand, the direction of trnP in the Katharina mt genome is inverted relative to that in the Loligo and Littorina mt genomes (Fig. 7A). Arthropod mt genomes, such as that of Drosophila, have trnP and nad6 in the same arrangement as the Katharina mt genome (Clary and Wolstenholme 1985). Therefore, the trnP-nad6 arrangement found in the Loligo and Littorina mt genomes may be a synapomorphic trait derived from the arrangement found in the Katharina and Drosophila mt genomes. It is notable that the order trnP-nad6 is found in the mt genome of the opisthobranch gastropod Pupa (Kurabayashi and Ueshima 2000), whose gene arrangement differs greatly from those of other molluscan mt genomes.

tRNA Genes Flanking Noncoding Regions in the Loligo mt Genome

Fifteen of the 22 tRNA genes in the Loligo mt genome are positioned differently from their counterparts in the Katharina/Littorina mt genomes. Twelve of these 15 Loligo mt tRNA genes are found in two tRNA gene clusters in the Katharina mt genome (Fig. 7B).
One of these is the cluster between rrnS and cox3 (trnM-trnC-trnY-trnW-trnQ-trnG-trnE) (hereafter, the rrnS-cox3 cluster); the other is between cox3 and nad3 (trnK-trnA-trnR-trnN-trnI) (hereafter, the cox3-nad3 cluster) (Boore and Brown 1994). A short (141-bp) noncoding region containing an AT stretch (34 repeats of the TA dinucleotide) is found between trnE and cox3 in the Katharina mt genome (Boore and Brown 1994). As shown in Fig. 7B, one end of each of the noncoding regions NC4, NC8, NC10, and NC12 is flanked by a tRNA gene found in one of these clusters, and the other end is flanked by a tRNA gene found in the other cluster. Furthermore, the relative directions of the tRNA gene originating from one cluster and that originating from the other are the same as the original relative directions of the two clusters. Because they are highly similar, NC4, NC8, and NC16 are likely to have originated from a single noncoding region. Hence, multiplication of the noncoding regions would correlate with the transposition of the tRNA genes flanking them (discussed later). The genes trnL(uaa) and trnL(uag), located between nad1 and rrnL in the Katharina and Littorina mt genomes, are replaced by the block NC3-trnQ-NC4-trnI-NC5 in the Loligo mt genome (Figs. 7A and B). trnL(uaa) is located downstream of trnG (Figs. 7A and B), and trnL(uag) flanks NC19 together with trnH (Figs. 7A and B), as reported previously (Sasuga et al. 1999). As noted above, the Katharina mt gene arrangement might retain more ancestral features than the Loligo mt gene arrangement (see Fig. 6). Let us suppose that the order trnM-trnC-trnY-trnW-trnQ-trnG-trnE-NC-cox3-trnK-trnA-trnR-trnN-trnI in the Katharina mt genome was rearranged to trnM-trnC-trnY-trnW-trnQ-trnG-trnE-NC-trnK-trnA-trnR-trnN-trnI-cox3 in the ancestral Loligo mt genome.
If this were the case, the noncoding region might be the ancestor of NC4, NC8, and NC16 of the Loligo mt genome as well as of the other, shorter noncoding regions (NC10, NC12, and probably NC19). Sequential multiplication of the noncoding region together with the flanking tRNA gene clusters might have occurred in the ancestral Loligo mt genome. After or during this multiplication, some tRNA genes might have been lost from each copy, and the tRNA-Leu genes might have been transposed together with a noncoding region and its flanking tRNA genes after insertion of the ancestral NC4 and its flanking tRNA genes near the tRNA-Leu genes between nad1 and rrnL. tRNA genes have been considered hot spots located at the ends of duplicated fragments in various mt genomes (e.g., Stanton et al. 1994). In addition, gene rearrangement has frequently been found around noncoding regions that contain the origins and/or regulatory elements for replication and/or transcription in various mt genomes (e.g., Zevering et al. 1991). Among the six long noncoding regions with flanking tRNA genes (NC4, -8, -10, -12, -16, and -19), four (NC8, -12, -16, and -19) are located at the junctions of blocks of protein and rRNA genes conserved between the Loligo and the Katharina mt genomes. This suggests that the rearrangement of gene blocks shown in Fig. 6 might somehow correlate with the spread of noncoding regions. NC4, with flanking trnQ and trnI, is located downstream of rrnL. In vertebrate mt genomes, trnL(uaa), located downstream of rrnL, is known to contain an element for the termination of transcription, so that more rRNAs than mRNAs are transcribed (Attardi 1985). Therefore, the region trnQ-NC4-trnI may play the same role.
Alternatively, if both NC4 and NC8 contain initiation points for transcription, and those in NC4 are controlled differently from those in NC8, the amounts of rRNA transcripts relative to those of mRNAs could be controlled so that rRNAs are in excess. Because of the high degrees of similarity among NC4, NC8, and NC16, some stages in the multiplication of the

noncoding regions and cotranslocation of flanking tRNA genes are considered not to be ancient events. Alternatively, concerted evolution of these noncoding regions may have maintained the identity of their primary sequences, as proposed for the two near-identical noncoding regions of the Dinodon mt genome by Kumazawa et al. (1998). Homologous recombination between mtDNA molecules (Thyagarajan et al. 1996) is also a possible mechanism for retention of sequence identity among these noncoding regions. Analysis of other squid and cuttlefish mt genomes might improve our understanding of the evolution of the Loligo bleekeri mt genome structure and might reveal the usefulness of mt genome structure for elucidating cephalopod phylogeny.

Acknowledgments. This work was supported by a Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Culture, Sports, Science and Technology of Japan to K.W., a grant from the Human Frontier Science Program Organization to K.W., and grants from the Ministry of Education, Culture, Sports, Science and Technology of Japan to T.O. and S.Y.

References

Adachi J, Hasegawa M (1996) MOLPHY 2.3b. Institute of Statistical Mathematics, Tokyo
Aguinaldo AM, Turbeville JM, Linford LS, Rivera MC, Garey JR, Raff RA, Lake JA (1997) Evidence for a clade of nematodes, arthropods and other moulting animals. Nature 387:489-493
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 25:3389-3402
Asakawa S, Kumazawa Y, Araki T, Himeno H, Miura K, Watanabe K (1991) Strand-specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes. J Mol Evol 32:511-520
Attardi G (1985) Animal mitochondrial DNA: An extreme example of genetic economy.
Int Rev Cytol 93:93-145
Beagley CT, Okimoto R, Wolstenholme DR (1999) Mytilus mitochondrial DNA contains a functional gene for a tRNA-Ser(UCN) with a dihydrouridine arm-replacement loop and a pseudo-tRNA-Ser(UCN) gene. Genetics 152:641-652
Boore JL (1999) Animal mitochondrial genomes. Nucleic Acids Res 27:1767-1780
Boore JL, Brown WM (1994) Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata. Genetics 138:423-443
Boore JL, Brown WM (1995) Complete sequence of the mitochondrial DNA of the annelid worm Lumbricus terrestris. Genetics 141:305-319
Clary DO, Wolstenholme DR (1985) The mitochondrial DNA molecule of Drosophila yakuba: Nucleotide sequence, gene organization, and genetic code. J Mol Evol 22:252-271
Cohen BL (2000) Monophyly of brachiopods and phoronids: Reconciliation of molecular evidence with Linnaean classification (the subphylum Phoroniformea nov.). Proc R Soc Lond B Biol Sci 267:225-231
Dirheimer G, Keith G, Dumas P, Westhof E (1995) Primary, secondary, and tertiary structures of tRNAs. In: Söll D, RajBhandary UL (eds) tRNA: Structure, biosynthesis and function. ASM Press, Washington, DC, pp 93-126
Eernisse DJ, Albert JS, Anderson FE (1992) Annelida and Arthropoda are not sister taxa: A phylogenetic analysis of spiralian metazoan morphology. Syst Biol 41:305-330
Foury F, Roganti T, Lecrenier N, Purnelle B (1998) The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae. FEBS Lett 440:325-331
Hatzoglou E, Rodakis GC, Lecanidou R (1995) Complete sequence and gene organization of the mitochondrial genome of the land snail Albinaria coerulea. Genetics 140:1353-1366
Hoffmann RJ, Boore JL, Brown WM (1992) A novel mitochondrial genome organization for the blue mussel, Mytilus edulis.
Genetics 131:397-412
Kojima S, Hashimoto T, Hasegawa M, Murata S, Ohta S, Seki H, Okada N (1993) Close phylogenetic relationship between Vestimentifera (tube worms) and Annelida revealed by the amino acid sequence of elongation factor-1α. J Mol Evol 37:66-70
Kondow A, Suzuki T, Yokobori S, Ueda T, Watanabe K (1999) An extra tRNA-Gly(U*CU) found in ascidian mitochondria responsible for decoding non-universal codons AGA/AGG as glycine. Nucleic Acids Res 27:2554-2559
Kumazawa Y, Ota H, Nishida M, Ozawa T (1998) The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions. Genetics 150:313-329
Kurabayashi A, Ueshima R (2000) Complete sequence of the mitochondrial DNA of the primitive opisthobranch gastropod Pupa strigosa: Systematic implication of the genome organization. Mol Biol Evol 17:266-277
La Roche J, Snyder M, Cook DI, Fuller K, Zouros E (1990) Molecular characterization of a repeat element causing large-scale size variation in the mitochondrial DNA of the sea scallop Placopecten magellanicus. Mol Biol Evol 7:45-64
Matsuyama S, Ueda T, Crain PF, McCloskey JA, Watanabe K (1998) A novel wobble rule found in starfish mitochondria. Presence of 7-methylguanosine at the anticodon wobble position expands decoding capability of tRNA. J Biol Chem 273:3363-3368
Moriya J, Yokogawa T, Wakita K, Ueda T, Nishikawa K, Crain PF, Hashizume T, Pomerantz SC, McCloskey JA, Kawai G, Hayashi N, Yokoyama S, Watanabe K (1994) A novel modified nucleoside found at the first position of the anticodon of methionine tRNA from bovine liver mitochondria. Biochemistry 33:2234-2239
Nielsen C (1995) Animal evolution: Interrelationships of the living phyla. Oxford University Press, Oxford
Okimoto R, Macfarlane JL, Clary DO, Wolstenholme DR (1992) The mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum.
Genetics 130:471-498
Pääbo S, Thomas WK, Whitfield KM, Kumazawa Y, Wilson AC (1991) Rearrangements of mitochondrial transfer RNA genes in marsupials. J Mol Evol 33:426-430
Perna NT, Kocher TD (1995) Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J Mol Evol 41:353-358
Robinson-Rechavi M, Huchon D (2000) RRTree: Relative rate tests between groups of sequences on a phylogenetic tree. Bioinformatics 16:296-297
Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: A laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Sasuga J, Yokobori S, Kaifu M, Ueda T, Nishikawa K, Watanabe K (1999) Gene contents and organization of a mitochondrial DNA segment of the squid Loligo bleekeri. J Mol Evol 48:692-702
Stanton DJ, Daehler LL, Moritz CC, Brown WM (1994) Sequences with the potential to form stem-and-loop structures are associated with coding-region duplications in animal mitochondrial DNA. Genetics 137:233-241

Stechmann A, Schlegel M (1999) Analysis of the complete mitochondrial DNA sequence of the brachiopod Terebratulina retusa places Brachiopoda within the protostomes. Proc R Soc Lond B Biol Sci 266:2043-2052
Strimmer K, von Haeseler A (1996) Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Mol Biol Evol 13:964-969
Takezaki N, Gojobori T (1999) Correct and incorrect vertebrate phylogenies obtained by the entire mitochondrial DNA sequences. Mol Biol Evol 16:590-601
Terrett JA, Miles S, Thomas RH (1996) Complete DNA sequence of the mitochondrial genome of Cepaea nemoralis (Gastropoda: Pulmonata). J Mol Evol 42:160-168
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876-4882
Thyagarajan B, Padua RA, Campbell C (1996) Mammalian mitochondria possess homologous DNA recombination activity. J Biol Chem 271:27536-27543
Tomita K, Ueda T, Watanabe K (1997) 5-Formylcytidine (f5C) found at the wobble position of the anticodon of squid mitochondrial tRNA-Met(CAU). Nucleic Acids Symp Ser 37:197-198
Tomita K, Ueda T, Watanabe K (1998) 7-Methylguanosine at the anticodon wobble position of squid mitochondrial tRNA-Ser(GCU): Molecular basis for assignment of AGA/AGG codons as serine in invertebrate mitochondria. Biochim Biophys Acta 1399:78-82
Tomita K, Ueda T, Ishiwa S, Crain PF, McCloskey JA, Watanabe K (1999) Codon reading patterns in Drosophila melanogaster mitochondria based on their tRNA sequences: A unique wobble rule in animal mitochondria. Nucleic Acids Res 27:4291-4297
Ueda T, Ohta T, Watanabe K (1985) Large scale isolation and some properties of AGY-specific serine tRNA from bovine heart mitochondria.
J Biochem (Tokyo) 98:1275-1284
Watanabe Y, Tsurui H, Ueda T, Furushima R, Takamiya S, Kita K, Nishikawa K, Watanabe K (1994) Primary and higher order structures of nematode (Ascaris suum) mitochondrial tRNAs lacking either the T or D stem. J Biol Chem 269:22902-22906
Wilding CS, Mill PJ, Grahame J (1999) Partial sequence of the mitochondrial genome of Littorina saxatilis: Relevance to gastropod phylogenetics. J Mol Evol 48:348-359
Willmer P (1990) Invertebrate relationships: Patterns in animal evolution. Cambridge University Press, Cambridge
Winnepenninckx B, Backeljau T, De Wachter R (1996) Investigation of molluscan phylogeny on the basis of 18S rRNA sequences. Mol Biol Evol 13:1306-1317
Wolstenholme DR (1992) Animal mitochondrial DNA: Structure and evolution. Int Rev Cytol 141:173-216
Yamazaki N, Ueshima R, Terrett JA, Yokobori S, Kaifu M, Segawa R, Kobayashi T, Numachi K, Ueda T, Nishikawa K, Watanabe K, Thomas RH (1997) Evolution of pulmonate gastropod mitochondrial genomes: Comparisons of gene organizations of Euhadra, Cepaea and Albinaria and implications of unusual tRNA secondary structures. Genetics 145:749-758
Yokobori S, Suzuki T, Watanabe K (2001) Genetic code variations in mitochondria: tRNA as a major determinant of genetic code plasticity. J Mol Evol 53:314-326
Zevering CE, Moritz C, Heideman A, Sturm RA (1991) Parallel origins of duplications and the formation of pseudogenes in mitochondrial DNA from parthenogenetic lizards (Heteronotia binoei; Gekkonidae). J Mol Evol 33:431-441