Species-Specific Signals for the Splicing of a Short Drosophila Intron In Vitro


MOLECULAR AND CELLULAR BIOLOGY, Feb. 1993, Vol. 13, No. 2. Copyright 1993, American Society for Microbiology

Species-Specific Signals for the Splicing of a Short Drosophila Intron In Vitro

MING GUO, PATRICK C. H. LO, AND STEPHEN M. MOUNT*
Department of Biological Sciences, Columbia University, New York, New York 10027

Received 3 August 1992/Returned for modification 17 September 1992/Accepted 25 November 1992

The effects of branchpoint sequence, the pyrimidine stretch, and intron size on the splicing efficiency of the Drosophila white gene second intron were examined in nuclear extracts from Drosophila and human cells. This 74-nucleotide intron is typical of many Drosophila introns in that it lacks a significant pyrimidine stretch and is below the minimum size required for splicing in human nuclear extracts. Alteration of sequences adjacent to the 3' splice site to create a pyrimidine stretch was necessary for splicing in human, but not Drosophila, extracts. Increasing the size of this intron with insertions between the 5' splice site and the branchpoint greatly reduced the efficiency of splicing of introns longer than 79 nucleotides in Drosophila extracts but had an opposite effect in human extracts, in which introns longer than 78 nucleotides were spliced with much greater efficiency. The white-apricot copia insertion is immediately adjacent to the branchpoint normally used in the splicing of this intron, and a copia long terminal repeat insertion prevents splicing in Drosophila, but not human, extracts. However, a consensus branchpoint does not restore the splicing of introns containing the copia long terminal repeat, and alteration of the wild-type branchpoint sequence alone does not eliminate splicing. These results demonstrate species specificity of splicing signals, particularly pyrimidine stretch and size requirements, and raise the possibility that variant mechanisms not found in mammals may operate in the splicing of small introns in Drosophila and possibly other species.
* Corresponding author.

The splicing of eukaryotic mRNA precursors in mammalian cells has been studied extensively (for reviews, see references 18 and 79), and the signals that govern the identification of splice sites are generally known. A 5' splice site that conforms to the consensus sequence MAG|GURAGU (M = C or A; R = A or G; | marks the splice junction) and includes the GU dinucleotide immediately downstream of the junction is required. 3' splice sites conform to the consensus sequence YAG|G (Y = C or U) and are typically found at the site of the first AG dinucleotide downstream of the branchpoint. Branchpoints fit the consensus sequence UNCURAC (in which branch formation occurs at the A in the sixth position) and usually reside between 18 and 38 nucleotides upstream of the 3' splice site. Between the branchpoint and the 3' splice site is a pyrimidine-rich region. The way in which sequences at the 5' splice site, the branchpoint, the pyrimidine-rich stretch, and the 3' splice site act together in mammalian splicing to specify intron boundaries has been investigated in detail, and much is known of the factors that recognize these sites. For example, the 5' splice site is recognized by the U1 small nuclear ribonucleoprotein (snRNP) via base pairing in both mammals and yeasts (8, 34, 47, 76, 78, 9), and the branchpoint is similarly recognized by the U2 snRNP (5, 58, 84, 89, 91). Binding of the U2 snRNP to the branchpoint requires a number of factors, including the U1 snRNP (2, 67, 75) and U2AF, a factor that binds to the pyrimidine-rich stretch (7). Despite the relatively large amount that is known about the sequence requirements for splicing, it is still not possible to accurately predict the positions of introns from sequence information alone, and the basis of alternative, or regulated, splicing is still being elucidated. The genetics of Drosophila melanogaster has allowed regulatory factors to be identified for several examples of alternative splicing (4, 1, 11, 41, 49, 85), and it appears that this organism will prove useful for the study of alternative splicing.
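The consensus sequences quoted above use IUPAC ambiguity codes, so they can be turned into simple pattern searches. The following is a minimal sketch, not a method used in the paper: the helper names `consensus_to_regex` and `find_matches` are hypothetical, and the test sequence is the 74-nucleotide white second intron as printed in Fig. 1D, converted to RNA letters. An exact-match search is far cruder than real splice site recognition, which tolerates mismatches.

```python
import re

# IUPAC codes that appear in the consensus sequences quoted in the text
# (RNA alphabet). N stands for any nucleotide.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "M": "[AC]", "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}


def consensus_to_regex(consensus):
    """Translate a consensus such as 'UNCURAC' into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus)


def find_matches(consensus, rna):
    """Return the 0-based start of every match, including overlapping ones."""
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(rna)]


# white second intron of pMG1 (Fig. 1D), written 5' to 3' as RNA.
INTRON = (
    "GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGG"
    "TCCAATTACCAATTTGAAACTCAG"
).replace("T", "U")

# Intron half of the 5' splice site consensus MAG|GURAGU.
five_prime_hits = find_matches("GURAGU", INTRON)
# Mammalian branchpoint consensus.
branch_hits = find_matches("UNCURAC", INTRON)

print(len(INTRON), five_prime_hits, branch_hits)
```

In this sketch the 5' splice site pattern is found only at the start of the intron, while the mammalian branchpoint consensus has no exact match anywhere in the intron.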
However, the basic information on splicing signals which exists in yeast and mammalian systems has no counterpart in D. melanogaster, and splicing signals do vary between species. For example, the relative A+T richness of plant introns is critical to their proper recognition (17), and animal introns are not properly recognized in transfected plant cells (16, 82). Similarly, most mammalian introns are not recognized by the yeast Saccharomyces cerevisiae (3, 32). This is due, at least in part, to the fact that yeast introns almost always use the precise sequence UACUAAC as a branchpoint, and this sequence is the primary determinant of yeast 3' splice site selection (25, 51, 57). In contrast, the branchpoint sequence of mammalian introns has greater flexibility (26, 28, 5, 65, 69, 88), and the pyrimidine-rich stretch is more important (13, 64, 68). A thorough analysis of Drosophila introns in GenBank (45) revealed that although Drosophila splice sites are like those in mammalian genes, Drosophila introns as a whole differ from mammalian introns in several significant ways. First, Drosophila introns tend not to have a G in the position preceding the branched nucleotide, although G is the most common nucleotide at that position in mammalian introns. Second, many Drosophila introns are shorter than the smallest mammalian introns. Third, Drosophila introns differ from mammalian introns in base composition, with a 17% greater A+T content in introns than in flanking exons and a much less extreme preference for pyrimidines in the region between the branchpoint and the 3' splice site.

In this study, we used in vitro splicing in both Drosophila and human cell nuclear extracts to explore the signals required for the splicing of a small Drosophila intron. The second intron of the Drosophila white gene was chosen for this study because it is characteristic of small pyrimidine-poor Drosophila introns that lack sequence features required for mammalian splicing. In addition, the splicing of this 74-bp intron is altered in the white-apricot (w^a) allele by an insertion of the transposable element copia in the same

transcriptional orientation (5, 15, 56, 6). In w^a, the level of normally spliced mRNA is greatly reduced, and a number of aberrant RNAs that are polyadenylated within copia are observed (36, 46, 61, 86). Interest in w^a derives from the existence of mutations in unlinked genes which alter its expression, resulting in increased or decreased eye pigmentation (6, 7, 11, 38, 59, 62, 81, 85). Correlation of the structure of a number of derivatives of w^a with their phenotypes, the RNAs that they produce, and their response to genetic modifiers has led to the conclusion that there is competition between polyadenylation within copia and the splicing of this intron (31). Furthermore, it appears that the copia insertion in w^a has not only provided a polyadenylation site but has also interfered with splicing in some way. For example, derivatives of w^a that show little or no polyadenylation within copia (including those in which the copia element has been replaced by a single long terminal repeat [sLTR]) are expressed at less than fully wild-type levels (31, 46, 86). In addition, because a number of studies have shown that polyadenylation sites within introns can be spliced out (1, 33, 37), even minimal use of the copia polyadenylation site suggests a disruption of splicing.

In this report, we show that the w^a copia insertion has altered the branchpoint normally used in the splicing of this intron. Additional results indicate that the increased size of the intron may also contribute to the splicing defect in w^a. A series of introns with alterations between the 5' splice site and the branchpoint revealed little or no splicing of introns longer than 79 nucleotides in Drosophila cell nuclear extract but an opposite effect in human cell extracts, in which introns shorter than 78 nucleotides were not spliced. Furthermore, a pyrimidine stretch adjacent to the 3' splice site was found to be essential in human extracts and unnecessary (but stimulatory) for splicing in Drosophila extracts.
These in vitro differences between nuclear extracts from Drosophila and human cells in their responses to variation in the sequence of RNA substrates confirm the differences suggested by comparison of the intron sequences of these species.

MATERIALS AND METHODS

Constructions. Plasmids pMG1 and pMG2 (Fig. 1) were constructed by inserting the PvuII (white nucleotide 1178)-SalI (position 11867) DNA fragments from plasmids pml2.5 (35) and pclltr (46), respectively, between the SmaI and SalI sites of the vector pIBI24 (International Biotechnologies, Inc.), downstream of the promoter for T7 RNA polymerase. The resulting clones contained portions of the second and third exons of the wild-type white gene with either the wild-type second intron (pMG1) or the same intron carrying an sLTR insertion (pMG2).

The linker insertion mutation pMG3 was constructed by using the polymerase chain reaction (71). pMG1 was used as a template for amplification in two reactions. A 12-bp fragment containing exon 2 and 29 bp of the second intron (1178 to 11182) was amplified by using oligonucleotide MG2 (5'-GGATCCATCGATATCAGATCAGCCGACTGCGA-3') and reverse sequencing primer 121 (New England Biolabs). A 67-bp fragment containing the remainder of intron 2 and all of exon 3 (11182 to 11867) was amplified by using sequencing primer 1211 (New England Biolabs) and oligonucleotide MG1 (5'-GATATCGATGGATCCTGTGTGAAATCTTAAT-3'). Oligonucleotides MG1 and MG2 carried ClaI sites within a nonannealing region at their 5' ends. pMG3 was then constructed by ligating the 12-bp amplified fragment cut with EcoRI and ClaI, the 67-bp amplified fragment cut with ClaI and HindIII, and pIBI24 cut with EcoRI and HindIII. The resulting construct, pMG3, contains a 16-bp polylinker with sites for EcoRV, ClaI, and BamHI 29 bp downstream of the 5' splice site. Sequences were confirmed by DNA sequencing (73).

The linker substitution mutations pMG7 and pMG37 were also constructed by using the polymerase chain reaction (71).
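The primer design described above for pMG3 can be checked directly from the printed sequences. The sketch below is an illustration only and makes two assumptions that the text does not state outright: that the nonannealing tail of each primer is its first 15 nucleotides, and that the intron sequence is the one printed for pMG1 in Fig. 1D. The names `reverse_complement`, `sites_in`, and `TAIL` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Primer sequences as given in Materials and Methods (5' to 3').
MG1 = "GATATCGATGGATCCTGTGTGAAATCTTAAT"
MG2 = "GGATCCATCGATATCAGATCAGCCGACTGCGA"

# white second intron of pMG1 as printed in Fig. 1D.
INTRON = (
    "GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGG"
    "TCCAATTACCAATTTGAAACTCAG"
)

# Recognition sequences of the three enzymes named for the polylinker.
SITES = {"EcoRV": "GATATC", "ClaI": "ATCGAT", "BamHI": "GGATCC"}

TAIL = 15  # assumption: length of the nonannealing 5' tail of each primer


def sites_in(seq):
    """Map each enzyme name to the 0-based position of its site (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


mg1_tail, mg1_anneal = MG1[:TAIL], MG1[TAIL:]
mg2_tail, mg2_anneal = MG2[:TAIL], MG2[TAIL:]

# MG1 reads along the intron; MG2 reads the opposite strand.
mg1_start = INTRON.find(mg1_anneal)
mg2_start = INTRON.find(reverse_complement(mg2_anneal))
mg2_end = mg2_start + len(mg2_anneal)

print(sites_in(mg1_tail), sites_in(mg2_tail))
print(reverse_complement(mg2_tail) == mg1_tail, mg1_start, mg2_end)
```

Under these assumptions the two tails are reverse complements of each other, each carries all three restriction sites, and the MG2 annealing region ends 29 nucleotides into the intron, which agrees with the stated position of the linker.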
A fragment of pMG1/31 containing exon 2 and 13 bp of the second intron (1178 to 11169) was amplified by using oligonucleotide MG9 (5'-GGATCCATCGATATCAATAGAAACTCACCGTTC-3') and reverse sequencing primer 121 (New England Biolabs). Oligonucleotide MG9 carried a ClaI site within a nonannealing region at its 5' end. pMG7 and pMG37 were then constructed by ligating the amplified fragment cut with EcoRI and ClaI and pMG3/33 cut with EcoRI and ClaI. The resulting construct, pMG7/37, contains a 16-bp polylinker substitution between the 5' splice site and branchpoint, 13 bp downstream of the 5' splice site. Sequences were confirmed by DNA sequencing.

Mutants with increased pyrimidine content (Fig. 1D) or branchpoint alterations (Fig. 3A and C) were generated by oligonucleotide-directed mutagenesis by using oligonucleotides MG3 (for making mutants with increased pyrimidine content; 5'-TTACCAATTTTTTCCTCAGTTTGC-3'), DCB1 (for making mutants pMG4/34 from pMG2/32; 5'-GTAATT GGACCCT'ITATTAGTAATITTATAATFIA-3'), and MG6 (for making mutants pMG5/35 from pMG1/31; 5'-CTGTGTGAAAACAACATAAAGGGTCC-3'). The template for mutagenesis was generated by subcloning an EcoRI-HindIII fragment of pMG1 and pMG2 into M13mp18 or M13mp19. Mutagenesis was performed essentially as described by Kunkel et al. (3), and mutants were identified by DNA sequencing.

Deletion mutants derived from pMG3 and pMG33 (Fig. 5C) were made by nuclease BAL 31 digestion of pMG3 and pMG33 DNA linearized at the EcoRV site. Deletion mutagenesis was performed essentially as described by Sambrook et al. (72), and mutants were identified by DNA sequencing.

Nuclear extracts. Drosophila Kc cells were grown in D22 medium. Drosophila Kc cell and human HeLa or 293 cell nuclear extracts were prepared by a modification of the protocol of Dignam et al. (12) in which 42 mM (NH4)2SO4 was substituted for 0.1 M KCl in the final dialysis step (53).

Precursor preparation and in vitro splicing.
Capped precursor RNAs were produced by runoff transcription with T7 RNA polymerase (Promega) of template linearized at the PvuI site in the third exon of the white gene (Fig. 1A and B) or at the XhoI site of the fushi tarazu (ftz) gene (Fig. 1C). 32P labeling was provided by inclusion of [α-32P]GTP at a final specific activity of 12 Ci/mmol. Pre-mRNA (5, cpm) was incubated in 25-µl reaction mixtures containing 1 µl of nuclear extract, 2.5% polyvinyl alcohol, 2 mM N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid (HEPES)-KOH (pH 7.6), 5 mM creatine phosphate, 1 mM MgCl2, and 3 mM ATP as described previously (66). Reactions were terminated after 3 to 3.5 h of incubation at 20°C, and RNA was extracted after proteinase K digestion (29) and analyzed on denaturing polyacrylamide gels. K+ ion concentrations between 0 and 1 mM and Mg2+ ion concentrations between 0 and 4 mM were tested, and reactions were performed under optimal conditions (described above). Debranching of lariat RNAs was carried out in 25-µl reaction mixtures containing 2 mM HEPES (pH 7.9), 2 mM KCl, 1 mM EDTA, 2% glycerol, 1 mM dithiothreitol, and 1 µl of HeLa S100 extract (12, 68) at 30°C for 6 min.

Analysis of polyadenylation activity. A nonspecific (AAU

[FIG. 1 image: schematic diagrams of the T7 templates, pre-mRNAs, splicing intermediates, and products for (A) pMG1 and pMG31 (wild type), (B) pMG2 and pMG32 (solo LTR), and (C) ftz.]

D. Partial nucleotide sequences (the copia-white intron boundary is marked on the pMG2 and pMG32 sequences in the original figure)
consensus: CUAAU YYYYYYYYYYYYYYYYNYAG/G
pMG1   CG/GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGGTCCAATTACCAATTTGAAACTCAG/TT
pMG31  CG/GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGGTCCAATTACCAATTTTTTCCTCAG/TT
pMG2   TTTATTTATTTATTAAGAAAGGAAATATAAATTATAAATTACAACATAAAGGGTCCAATTACCAATTTGAAACTCAG/TT
pMG32  TTTATTTATTTATTAAGAAAGGAAATATAAATTATAAATTACAACATAAAGGGTCCAATTACCAATTTTTTCCTCAG/TT

FIG. 1. Structures and partial nucleotide sequences of constructs. (A to C) Structures and expected RNA molecules produced during splicing of the pre-mRNAs. Structures of the wild-type and mutant white introns are indicated at the top of each panel. (A) Plasmids containing the second intron of the wild-type white gene (pMG1) or the same intron with a consensus pyrimidine stretch (pMG31). Synthetic pre-mRNA synthesized by using T7 RNA polymerase from templates truncated at the PvuI site yields a 264-nucleotide precursor RNA. The first step in the splicing reaction should yield a 5' exon fragment (E1) of 97 nucleotides and an intron-3' exon (IVS-E2) lariat RNA fragment of 167 nucleotides. The second step should yield a 190-nucleotide mRNA (E1-E2) and the excised lariat intron (IVS) of 74 nucleotides. (B) Plasmids containing a copia LTR insertion (pMG2 and pMG32). Synthetic pre-mRNA synthesized by using T7 RNA polymerase from templates truncated at the PvuI site yields a 540-nucleotide precursor RNA.
The first step in the splicing reaction should yield a 5' exon fragment (E1) of 97 nucleotides and an intron-3' exon (IVS-E2) lariat RNA fragment of 443 nucleotides. The second step should yield a 190-nucleotide mRNA (E1-E2) and the excised lariat intron (IVS) of 350 nucleotides. (C) Plasmid containing the ftz intron (pGEM2 V61 S/B). The plasmid and sizes are as described by Rio (66). (D) Partial nucleotide sequences of constructs pMG1, pMG31, pMG2, and pMG32. The 5' splice site and 3' splice site are denoted by slashes. Consensus sequences for the branchpoint (27, 45) and 3' splice site (45, 74) are shown on the top line. The site of the branch nucleotide within that consensus is indicated by an asterisk. Plasmids pMG31 and pMG32 differ from pMG1 and pMG2 by a substitution mutation that increases pyrimidine content in the -20 to -5 region from 50 to 75%. The consensus branchpoint in wild-type introns is underlined. The sequences are aligned by their 3' splice sites, and only portions of the large introns from pMG2 and pMG32 are shown. Sizes are indicated in nucleotides.

AAA-independent) polyadenylation activity in our nuclear extract was partially characterized (data not shown). Slowly migrating bands seen in Fig. 2B, 2C, and 3A are ATP dependent but disappear when reactions are carried out in the presence of an inhibitor of polyadenylation, 3'-dATP. Splicing reactions shown in all figures were performed without 3'-dATP. Although concentrations of 3'-dATP above 1 mM inhibit splicing (data not shown), the addition of 0.2 mM 3'-dATP eliminates the slowly migrating bands without significant effects on splicing efficiency, and analysis of the splicing of transcripts from pMG1, pMG31, pMG3, pMG33, pMG5, pMG35, pMG2, pMG32, pMG4, pMG34, pMG7, and pMG37 in combination with a titration of 3'-dATP between 0 and 0.5 mM confirms the major conclusions of this study (data not shown).

RNase T1 digestion and primer extension experiments.
Excised lariat intron (intervening sequence [IVS]) and the intron-exon 2 intermediate (E2-IVS) were gel purified following splicing of RNA from the wild-type introns pMG1 and pMG31 and analyzed with and without treatment with S100 extract (which contains debranching activity). RNase T1 digestion was performed as described previously (63). RNase T1 oligonucleotide products containing the branched and debranched nucleotide before and after S100 extract treatment were then analyzed on a 20% denaturing polyacrylamide gel. Primer extension was performed by the method of Inoue and Cech (24). Oligonucleotide primers purified from 20% denaturing polyacrylamide gels were labeled with [γ-32P]ATP. The primer used for extension on the lariat is a 15-mer (5'-AAATTGGTAATTGGA-3') complementary to the region near the 3' splice site of the intron. The primer used for extension on the lariat-E2 intermediate is a 32-mer (5'-GCAGGGTCGTCTTTCCGGCACCGGAACTGCCC-3') complementary to a region of the 3' exon 4 nucleotides downstream of the 3' splice site. Twenty-five to 50% of each splicing reaction product and 1 ng of primer were used for each reaction.

RESULTS

The wild-type second intron of the Drosophila white gene is accurately removed in Kc cell nuclear extracts. As a first step toward the investigation of species specificity in splicing signals, an in vitro splicing system was established by using Drosophila Kc cell nuclear extracts (12, 66; see Materials and Methods). The Drosophila ftz intron was efficiently spliced in these extracts (e.g., Fig. 2C), as previously reported (66). Intron-containing transcripts with white second-intron sequences were made from the constructs pMG1, which contains the 74-nucleotide wild-type white second intron, and pMG2, which contains the same intron with a 276-nucleotide copia sLTR at the position of the w^a copia insertion. The structures, splicing pathways, and partial sequences of these constructs are shown in Fig. 1. All 18 white intron derivatives described in this study are flanked by the same 97-nucleotide 5' exon (with 2 nucleotides of 5' plasmid sequence) and 93-nucleotide 3' exon. Therefore, they would all be expected to generate the same mRNA product (190 nucleotides) but different lariat introns (between 72 and 350 nucleotides).

32P-labeled synthetic precursor RNAs corresponding to the wild-type white second intron (pMG1) were accurately spliced in this Kc cell nuclear extract, yielding the expected products in an ATP-dependent reaction (Fig. 2A, lanes 1 and 2). The products and intermediates expected from an accurate splicing of the white second intron are designated in Fig. 2A: the 5' exon (E1), the intron-3' exon (E2-IVS), the lariat intron (IVS), and the mRNA. In addition, the splicing products were incubated with HeLa cell cytoplasmic S100 extract, which contains a 2'-5' phosphodiesterase activity that debranches the lariat to generate linear RNA (68). After treatment with debranching activity, the E2-IVS and IVS migrate as linear RNAs at the expected sizes of 167 and 74 nucleotides, respectively (Fig. 2B, lane 1).
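The expected sizes quoted above follow from simple addition of the exon and intron lengths. As a minimal sketch (the class name `PreMRNA` and its field names are hypothetical, chosen only for this illustration), the sizes of the precursor, the two intermediates, and the two products can be derived for the wild-type and sLTR-containing substrates:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreMRNA:
    """Lengths, in nucleotides, of the three parts of a one-intron precursor."""
    exon1: int
    intron: int
    exon2: int

    def expected_sizes(self):
        """Linear sizes of the RNA species expected from the two-step pathway."""
        return {
            "pre-mRNA": self.exon1 + self.intron + self.exon2,
            "E1": self.exon1,                     # step 1: free 5' exon
            "IVS-E2": self.intron + self.exon2,   # step 1: lariat intermediate
            "mRNA": self.exon1 + self.exon2,      # step 2: spliced exons
            "IVS": self.intron,                   # step 2: excised lariat intron
        }


# Lengths taken from the text: 97-nt 5' exon, 93-nt 3' exon,
# 74-nt wild-type intron, and 74 + 276 = 350 nt with the sLTR insertion.
WILD_TYPE = PreMRNA(exon1=97, intron=74, exon2=93)
SLTR = PreMRNA(exon1=97, intron=74 + 276, exon2=93)

for name, substrate in (("pMG1", WILD_TYPE), ("pMG2", SLTR)):
    print(name, substrate.expected_sizes())
```

These are linear lengths only. Lariat species migrate anomalously on gels until they are debranched, so the IVS-E2 and IVS values correspond to the sizes seen after S100 treatment.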
In the absence of debranching activity, the E2-IVS and IVS retain their lariat structure and migrate anomalously during gel electrophoresis (lane 2). The heterogeneous bands around 90 bp in lane 1 may be due to an exonucleolytic activity that has been shown to remove the intron lariat tail in vitro (68). Together with these expected products, an unexpected ATP-dependent product of approximately 180 nucleotides was observed. Characterization by T1 digestion and primer extension (data not shown) revealed that this RNA lacks sequences upstream of a site in the first exon near the 5' splice site. Consideration of prior results in human cell extracts indicates that such an RNA is likely to be due to the activity of an exonuclease endogenous to the nuclear extract (52, 54). Accordingly, we have designated this band EPP, for exonuclease protection product.

T7 transcripts of pMG2, containing a 276-nucleotide copia sLTR insertion at the position of the w^a copia insertion, were tested in parallel with pMG1 transcripts in Kc nuclear extract. No splicing intermediates or products were detected from reactions carried out and analyzed in the same way as for pMG1 transcripts (Fig. 2C, lanes 1 and 2). Thus, we conclude that a copia sLTR insertion in the second intron of the Drosophila white gene eliminates in vitro splicing in Kc cell nuclear extracts.

A consensus pyrimidine stretch enhances, but is not essential for, the removal of the white second intron in Drosophila extracts. Like many short Drosophila introns (reference 45 and references therein), the wild-type white second intron does not have a consensus pyrimidine stretch. To investigate whether a conventional pyrimidine stretch would nevertheless enhance the splicing of this intron, mutations were made in each of these constructs to increase the pyrimidine content. A change of four consecutive purines to four consecutive pyrimidines at positions -9 through -6 increased the fraction of pyrimidines from 50 to 75% in the critical region between -5 and -20 relative to the 3' splice site (Fig. 1D).
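The change in pyrimidine content can be reproduced from the Fig. 1D sequences. The sketch below is only an illustration: the function name `pyrimidine_fraction` is hypothetical, positions are counted backward from the 3' splice site (the last intron nucleotide is -1), and the pMG31 intron is built here by substituting TTTC at positions -9 through -6 of the pMG1 intron, which matches the sequence printed in the figure.

```python
def pyrimidine_fraction(intron, first=-20, last=-5):
    """Fraction of C/T between positions first and last (negative, inclusive).

    Positions are relative to the 3' splice site: -1 is the last
    nucleotide of the intron.
    """
    stop = last + 1
    window = intron[first:stop if stop != 0 else None]
    return sum(base in "CT" for base in window) / len(window)


# white second intron of pMG1 as printed in Fig. 1D (DNA letters).
PMG1 = (
    "GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGG"
    "TCCAATTACCAATTTGAAACTCAG"
)

# pMG31: purines GAAA at -9..-6 replaced by pyrimidines TTTC.
PMG31 = PMG1[:-9] + "TTTC" + PMG1[-5:]

print(PMG1[-9:-5], "->", PMG31[-9:-5])
print(pyrimidine_fraction(PMG1), pyrimidine_fraction(PMG31))
```

With the default 16-nucleotide window the fraction rises from 8 of 16 to 12 of 16, the 50 to 75% change described in the text.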
In Kc cell nuclear extracts, this change enhanced wild-type second-intron splicing to a variable extent (Fig. 2A and B, pMG1 versus pMG31) but did not allow splicing of the sLTR-containing intron (Fig. 2C, lanes 1 to 4, pMG2 and pMG32).

All four of these constructs were also tested in extracts from two human cell lines. HeLa and 293 nuclear extracts gave identical results with white-derived substrates (Fig. 2C and data not shown), but results with these human cell extracts differ from those obtained for the Drosophila nuclear extract. Neither of the 74-nucleotide introns (pMG1 or pMG31) was excised (data not shown, but see Fig. 5), but the 350-nucleotide sLTR-containing intron with a consensus pyrimidine tract (pMG32) was efficiently spliced, and the lariat forms E2-IVS and IVS migrated at the correct linear size after treatment with debranching activity (Fig. 2C, lanes 9 and 10). In these human extracts, a pyrimidine stretch appears to be absolutely required; no splicing of pre-mRNA from pMG2 was observed (lanes 7 and 8). Although these results are in sharp contrast to those obtained with the Kc cell nuclear extract, they are in good agreement with previous results from human cell extracts, which have indicated that both a minimum intron size of between 66 and 80 nucleotides and a good pyrimidine tract are required for efficient splicing (14, 79). They also show that the inability of our Drosophila extracts to process this intron was not due to a general defect in the RNA preparation. Transcripts containing the ftz intron (66) were used as a positive control and were spliced with similar efficiencies in Drosophila Kc and human HeLa or 293 cell nuclear extracts (Fig. 2C, lanes 5, 6, 11, and 12; also data not shown).

The branchpoint of the wild-type white second intron is at nucleotide -32, immediately adjacent to the site of the w^a copia insertion.
To understand the molecular basis of the inefficient splicing of the sLTR-containing introns (pMG2 and pMG32) in Drosophila Kc cell nuclear extracts, we sought to determine whether the insertion of the copia element had disrupted any of the normal splicing signals required for removal of the wild-type second white intron. The insertion is relatively far from both splice sites (48 nucleotides from the 5' splice site and 31 nucleotides from the 3' splice site). However, the sequence UUAAU, which is 32 nucleotides upstream of the 3' splice site and is disrupted by the copia insertion, is an excellent candidate for the branchpoint sequence (27, 45, 5). Thus, alteration of the natural branchpoint sequence appeared to be a likely explanation for the inefficient expression of w^a in vivo and the lack of splicing of transcripts from the sLTR-containing introns in vitro. We therefore determined the branchpoint used by the second intron of the wild-type white allele in vitro.

The branchpoint of the second white intron was first localized to the 14-nucleotide RNase T1 product AAAUCUUAAUAAAG of the pMG1 lariat intron, which contains the sequence UUAAU (data not shown). Identical results were obtained with the corresponding products from pMG31 (data not shown). Primer extension experiments (24) were then used to precisely localize the branchpoint. A 15-mer complementary to the region near the 3' splice site of the intron and a 32-mer complementary to a region of the 3' exon 4 nucleotides downstream of the 3' splice site gave identical results. Figure 2D shows results obtained with the 32-mer on

[FIG. 2 image: gel panels A to D with lane labels, size markers, and band identities (pre-mRNA, E1, mRNA, EPP, E2-IVS, IVS).]

FIG. 2. In vitro splicing and characterization of white second introns. (A) In vitro splicing of the wild-type intron in Drosophila Kc cell nuclear extracts. 32P-labeled pMG1 and pMG31 precursors were synthesized and incubated in the presence (+) or absence (-) of ATP. Splicing products were analyzed on a 6% polyacrylamide gel. Lane M, 32P-labeled MspI digest of pBR322. Molecular sizes (in nucleotides) in all panels are indicated on the left. The positions of substrate, intermediates, and products are shown on the right. (Abbreviations are as described in the legend to Fig. 1 except for EPP [exonuclease protection product]. See text for explanation.) (B) Debranching assay. Splicing products were treated with (+) or without (-) S100 debranching activity and analyzed on a 10% polyacrylamide gel. The structures of substrate, intermediates, and products are shown on the right. Lariat E2-IVS (lanes 2 and 4) runs close to (just above) mRNA (190 nucleotides) on this 10% polyacrylamide gel. The unshifted RNA between mRNA and lariat E2-IVS is the exonuclease protection product (see text). After treatment with debranching activity (lanes 1 and 3), the linear E2-IVS runs at the expected size of 167 nucleotides. Slowly migrating bands visible in this gel are ATP dependent (see Fig. 3A and B) and are apparently due to a nonspecific polyadenylation activity present in the Drosophila extracts (see Materials and Methods). The inset at the bottom shows a longer exposure of the lower portion of the gel. (C) In vitro splicing of sLTR-containing introns. pMG2 and pMG32 precursors were synthesized, incubated in Kc (lanes 1 to 4) or 293 (lanes 7 to 10) nuclear extracts, and debranched in parallel with pMG1 and pMG31 (B) and ftz positive controls (lanes 5, 6, 11, and 12). The structures of substrate, intermediates, and products of pMG32 and ftz are shown on the right, inside and outside, respectively.
(D) Primer extension mapping of branchpoints. RNAs from in vitro splicing reactions in the presence (+) or absence (-) of ATP were analyzed by primer extension using a primer complementary to the 3' exon, 4 nucleotides downstream of the 3' splice site (see Materials and Methods). Primer extension products corresponding to the branched nucleotide are indicated by the arrow. Lanes 3, 4, 7, and 8, markers from dideoxy sequencing reactions performed using the same primer with precursor RNA as the template. A partial sequence of the white second intron is shown at the bottom. The position of the branchpoint, which is at nucleotide -32, is indicated (*). Sequence with similarity to the Drosophila branch site consensus (27, 45) is underlined.

the E2-IVS intermediates from both pMG1 and pMG31. The precise location of the branchpoint was found to be position -32 in both cases. Thus, our suspicion that the copia insertion disrupted the natural branchpoint sequence was confirmed.

Nevertheless, human nuclear extracts were able to splice the sLTR-containing transcript from pMG32. This observation could be explained either by use of an alternate branchpoint in these extracts or by an insensitivity to alteration of the branchpoint sequence. Analysis of the branchpoint used

when the sLTR-containing intron pMG32 is spliced in mammalian 293 cell extracts confirmed the latter possibility (data not shown); position -32 was used even though the sequence upstream of the branched A is completely altered by the sLTR insertion (ACAACAU, as compared with the wild-type UCUUAAU and consensus UNCURAC; see Fig. 1 for the complete sequences). Because previous results with HeLa cell nuclear extracts have shown flexibility in the branchpoint sequence of mammalian introns (reviewed by Nelson and Green [5]), this result was not unexpected, but it is nevertheless quite striking.

The branchpoint disruption found in w^a is not sufficient to prevent splicing. To investigate whether the lack of splicing in sLTR-containing introns was due to branchpoint disruption by the copia LTR, we constructed two groups of mutants. In the first, we placed a consensus branchpoint sequence (UACUAAU) at the 3' end of the LTR in the sLTR-containing introns pMG2 and pMG32, generating pMG4 and pMG34. In these mutants, the consensus branchpoint sequence is in the same location relative to the 3' splice site as in the wild-type intron. In a complementary experiment, the branchpoint sequences of the wild-type intron constructs were changed to ACAACAU, the sequence at this position in the sLTR intron. If the splicing defect were due to the disruption of the natural branchpoint sequence, then mutants pMG4 and pMG34 should restore splicing activities. Conversely, mutants pMG5 and pMG35 would be expected to eliminate splicing in Kc cell nuclear extracts.
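The branchpoint coordinates used in this section are positions counted backward from the 3' splice site, which maps naturally onto negative string indices. The sketch below is an illustration under stated assumptions: `region` is a hypothetical helper, the pMG2 tail is the last 49 nucleotides of the partial sequence in Fig. 1D, and the pMG5 intron is built here by writing ACAACAT over positions -37 through -31 of pMG1, as the text describes for the complementary experiment.

```python
def region(seq, first, last):
    """Nucleotides from position first to last (negative, inclusive).

    Positions are relative to the 3' splice site: -1 is the last
    nucleotide of the intron, so Python's negative indices apply directly.
    """
    stop = last + 1
    return seq[first:stop if stop != 0 else None]


# Sequences from Fig. 1D (DNA letters), aligned at the 3' splice site.
PMG1 = (
    "GTGAGTTTCTATTCGCAGTCGGCTGATCTGTGTGAAATCTTAATAAAGGG"
    "TCCAATTACCAATTTGAAACTCAG"
)
PMG2_TAIL = "AAATTATAAATTACAACATAAAGGGTCCAATTACCAATTTGAAACTCAG"

# pMG5: wild-type intron with the heptamer at -37..-31 changed to ACAACAT.
PMG5 = PMG1[:-37] + "ACAACAT" + PMG1[-30:]

for name, seq in (("pMG1", PMG1), ("pMG2", PMG2_TAIL), ("pMG5", PMG5)):
    print(name, region(seq, -37, -31), "branch candidate at -32:", seq[-32])

# Context of the -35 branchpoint mapped for pMG5 and pMG35.
print(region(PMG5, -40, -32), PMG5[-35])
```

The heptamers recovered this way correspond to the wild-type UCUUAAU and the sLTR-derived ACAACAU quoted in the text, and position -32 is an A in both the wild-type and sLTR-containing sequences.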
We found that restoration of the branchpoint consensus did not compensate for the sLTR insertion; sLTR-containing introns with a consensus branchpoint (pMG4 and pMG34), like the parental constructs (pMG2 and pMG32), did not splice in Drosophila Kc cell nuclear extracts at all (Fig. 3A, lanes 1 to 8). Splicing of the pMG34 transcripts was, however, observed in human cell nuclear extracts (Fig. 3B). Furthermore, 74-nucleotide introns without a consensus branchpoint (pMG5 and pMG35) retained a large fraction of the splicing activity shown by the corresponding wild-type introns in Drosophila Kc cell nuclear extracts (pMG1 and pMG31; Fig. 3C, lanes 1 to 4).

To map the branchpoints used by RNA from pMG5 and pMG35, primer extension experiments were carried out with the same 3' exon 32-mer described previously. Figure 3D shows that the precise location of the branchpoint is at nucleotide -35, within the sequence AAAACAACA, at a location (underlined) with only a remote resemblance to the branchpoint consensus. These results suggest that, contrary to our initial hypothesis, a consensus branchpoint sequence is not essential for efficient splicing of the second intron of white. It is also interesting that the branchpoints of pMG5 and pMG35 are at

[FIG. 3 image: gel panels A to D. Sequence labels recoverable from the figure: pMG2/32, TACAACATAA; pMG4/34, TA-CTAATAA; pMG1/31, AATCTTAATAA; pMG5/35, AAACAACATAA; mapped branch site, 5' AAAACAACAUA 3' at -35.]

a location different from those observed when the sLTR-containing intron (pMG32) is spliced in mammalian extracts (AACAU; data not shown).

Increasing the size of the white second intron eliminates splicing in Drosophila Kc cell nuclear extracts but improves splicing in mammalian HeLa cell nuclear extracts. Because the size of the wild-type second white intron (74 nucleotides) is close to the minimum size which can be spliced in HeLa cell nuclear extracts, we were originally concerned that the small intron size might also prevent splicing in our Drosophila in vitro system. Therefore, we made two additional mutants, pMG3 and pMG33, by inserting a 16-bp polylinker into pMG1 and pMG31, respectively, at a point between the 5' splice site and the branchpoint, 29 bp downstream of the 5' splice site, resulting in 90-nucleotide introns (Fig. 4A). To our great surprise, transcripts from pMG3 and pMG33 were not spliced in Kc cell nuclear extracts, although wild-type introns from pMG1 and pMG31 spliced well in parallel reactions (Fig. 4B, lanes 1, 3, 4, and 6). In contrast, the pMG33 intron was a good substrate for human extracts (data not shown and Fig. 5B).

To distinguish the influence of intron size from the possibility that the 16-bp polylinker insertion in pMG3 and pMG33 prevented splicing by disrupting a previously unrecognized sequence element, two substitution mutants were made in which a 16-bp sequence between the 5' splice site and the branchpoint was replaced by the same 16-bp polylinker, resulting in altered 74-nucleotide substitution introns (pMG7 and pMG37; Fig. 4A). If the linker insertion altered a previously unrecognized splicing signal, then these two mutants would be expected to be defective for splicing, but if the linker insertion prevented splicing by lengthening the intron, then transcripts from pMG7 and pMG37 should splice with efficiencies similar to those of the parental 74-nucleotide introns pMG1 and pMG31.
Figure 4B shows that these substituted transcripts are indeed spliced in Drosophila Kc cell nuclear extracts with efficiencies comparable to those of the corresponding wild-type introns (lanes 1, 2, 4, and 5). Like the wild-type introns, these substituted introns are not spliced in HeLa cell nuclear extracts (data not shown).

Thus, both 90-nucleotide introns containing an innocuous linker insertion (pMG3/33) and 350-nucleotide introns bearing an insertion of the copia LTR (pMG2/32) were defective for splicing in Drosophila Kc cell nuclear extracts. In each case, the sequence change per se did not eliminate splicing in control constructs of wild-type length (pMG7/37 and pMG5/35). In each case, the elongated intron, if provided with a pyrimidine-rich region, was an effective substrate for splicing in human cell nuclear extracts (pMG33 and pMG32). Consideration of these results led us to the hypothesis that intron size is a critical factor in the splicing of these introns in our Drosophila nuclear extracts. To test this idea, a series of deletion mutations around the polylinker insertion region of pMG3 and pMG33 was constructed (Fig. 5C). Each of the resulting mutants contains a different distance between the 5' splice site and branchpoint. If size is the critical factor, mutants with smaller introns should be able to restore splicing activity in Kc cell nuclear extracts. The results obtained with three deletions from each of pMG3 and pMG33 are shown in Fig. 5A. As expected, splicing efficiency increases with decreasing intron size in Kc cell nuclear extracts. The 78- and 79-nucleotide introns, pMG3Δ14, pMG3Δ28, pMG33Δ14, and pMG33Δ28 (lanes 3, 4, 8, and 9), showed splicing activity close to that of the corresponding wild-type introns pMG1 and pMG31 (lanes 5 and 10).
Note that although the exonuclease protection product generated from some substrates of intermediate length obscures the mRNA band, the exon 1 intermediate and intron products are better isolated on the gel and serve as excellent, and equally valid, indicators of splicing efficiency. The 84-nucleotide introns from pMG3Δ6 and pMG33Δ23 showed very little splicing activity (lanes 2 and 7) but were better substrates than their 90-nucleotide parental introns, which showed almost no splicing activity at all (lanes 1 and 6). In agreement with earlier results, a consensus pyrimidine stretch confers greater splicing efficiency in all cases (lanes 6 to 10 versus lanes 1 to 5).

All of these 10 constructs were also tested in HeLa cell nuclear extracts (Fig. 5B). In agreement with previous studies using mammalian cell nuclear extracts, but in contrast to our results for Drosophila Kc cell nuclear extracts, we found that splicing efficiency decreases with decreasing intron size. Derivatives of pMG3 lack a consensus pyrimidine stretch and did not show any splicing activity in HeLa cell extracts (lanes 1 to 5). As observed earlier, the 90-nucleotide pMG33 intron was spliced in HeLa nuclear extracts. Deletion of the intron in construct pMG33 to less than 84 nucleotides resulted in decreased splicing efficiencies in HeLa nuclear extracts (Fig. 5B, lanes 6 to 10). These results suggest that removal of the white second intron is exquisitely sensitive to intron length in both Drosophila cell nuclear extracts and human cell nuclear extracts but that the relationship between length and efficiency is species specific.

FIG. 3. Evidence that mutations in the branchpoint consensus do not control white second intron splicing. (A) Splicing of sLTR-containing introns with and without a consensus branchpoint in Drosophila Kc nuclear extracts. Constructs pMG4 and pMG34 encode sLTR-containing introns with a consensus branchpoint sequence and were made from pMG2 and pMG32. The sequences of the branchpoint regions are shown underneath. Splicing reactions were performed and analyzed as described in the legend to Fig. 2A. The positions of precursors and splicing products of pMG1 and pMG31 are shown on the right. (B) Splicing of pMG4 and pMG34 in HeLa nuclear extracts. (C) Splicing of introns with and without a consensus branchpoint in Drosophila Kc nuclear extracts. Constructs pMG5 and pMG35 carry 74-nucleotide introns lacking a consensus branchpoint sequence and were made from pMG1 and pMG31. The sequences of the branchpoint regions are shown underneath. Splicing reactions were performed and analyzed as described in the legend to Fig. 2A. The exonuclease protection product of pMG33 runs with or slightly slower than mRNA (19 nucleotides), and the best estimate of splicing efficiency is obtained from noting the quantity of IVS product or intermediates (E1 and E2-IVS). (D) Primer extension mapping of the branchpoint in pMG35 RNA. Primer extension experiments were performed as described in the legend to Fig. 2D. Primer extension products corresponding to the branched nucleotide are indicated by the arrow. Lanes 3 and 4, markers. Partial sequence of the intron is shown underneath. The position of the branchpoint, which is at nucleotide -35, is indicated (*). Sizes are indicated in nucleotides.

DISCUSSION

Species-specific splicing signals. We have examined the effects on in vitro splicing of sequence alterations in the 74-nucleotide second intron of the Drosophila white gene. This intron differs from mammalian introns in two ways that are characteristic of a large subset of Drosophila introns (45). It is shorter than the length of approximately 80 nucleotides required for efficient splicing in mammalian cells (68, 83), and it lacks a tract of pyrimidines adjacent to the 3' splice site. In results summarized in Fig. 6, we have confirmed that this Drosophila intron is not spliced by extracts from human cells, but becomes a substrate for splicing in human extracts if these two sequence features are modified to correspond to the requirements of human cells. Thus, all introns tested that were longer than 80 nucleotides and had a good pyrimidine stretch (12 pyrimidines among 15 nucleotides in positions -5 to -19) were efficiently spliced in human cell extracts. This set of introns includes the LTR-containing introns pMG32 (Fig. 2C, lanes 9 and 10, and Fig. 3B, lane 3, show the pMG32 intron spliced in extracts from 293 cells and HeLa cells, respectively) and pMG34 (Fig. 3B, lane 7), as well as introns enlarged to 90 or 84 nucleotides by the insertion of a linker (Fig. 5B, lanes 6 and 7, pMG33 and pMG33Δ23). Although not present in the wild-type Drosophila intron, a pyrimidine tract is absolutely required for these introns to be recognized by human extracts; no intron with the wild-type Drosophila sequence (GAAA) at -9 to -6 relative to the 3' splice site showed detectable splicing (Fig. 2C, lanes 7 and 8 versus 9 and 10; Fig. 3B, lanes 1 versus 3 and 5 versus 7; Fig. 5B, lanes 1 to 5 versus 6 to 10). Likewise, introns shorter than 80 nucleotides (pMG31, pMG33Δ14, and pMG33Δ28 [Fig. 5B] and pMG37 [data not shown]) are not spliced well in HeLa cell nuclear extracts.
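The pyrimidine-stretch criterion quoted above (12 pyrimidines among the 15 nucleotides at positions -5 to -19) can be checked with a short script. This is a minimal sketch under stated assumptions: position -1 is taken to be the last nucleotide of the intron, both example sequences are hypothetical and are not the white intron, and the cutoff of 12 simply reuses the figure quoted in the text rather than a threshold defined by the authors.

```python
# Minimal sketch: count pyrimidines in a window upstream of the 3' splice site.
# Assumption: position -1 is the last nucleotide of the intron.
GOOD_TRACT_MIN = 12  # assumed cutoff, reusing the "12 among 15" figure from the text


def pyrimidine_count(intron: str, first: int = -19, last: int = -5) -> int:
    """Count C/T (or U) residues between positions `first` and `last`, inclusive."""
    seq = intron.upper().replace("U", "T")
    stop = last + 1 if last < -1 else None  # a slice ending at -1 must run to the end
    window = seq[first:stop]
    return sum(base in "CT" for base in window)


# Hypothetical intron 3' ends, written 5' to 3' and ending in the AG dinucleotide.
hypothetical_py_rich = "GTAAGTAAAC" + "TTCTATTCGTTCATC" + "GCAG"
hypothetical_purine_rich = "GTAAGTAAAC" + "GAAAGAAAGAAAGAA" + "GCAG"

for name, seq in [("py_rich", hypothetical_py_rich),
                  ("purine_rich", hypothetical_purine_rich)]:
    count = pyrimidine_count(seq)
    print(name, count, count >= GOOD_TRACT_MIN)
```

Real sequences would need to be aligned so that the final AG of the intron sits at positions -2 and -1 before the window is applied.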
Thus, our results confirm previous reports on the sequence features required in nuclear extracts derived from human cells and indicate that the ability of this intron to be properly spliced in Drosophila extracts is indeed due to species specificity in the recognition of splicing signals rather than to some unrecognized feature of this intron.

Indeed, extracts derived from Drosophila Kc cells behave very differently from those derived from human cells (summarized in Fig. 6). Most significantly, the wild-type intron is spliced efficiently in Drosophila extracts, despite its short length and the absence of a pyrimidine tract. To explore this species specificity, which was consistently observed in five independent preparations of Kc cell nuclear extract, and to characterize the sequence requirements for the splicing of this intron in Drosophila nuclear extracts, mutations in the pyrimidine stretch, in the branchpoint, and in the overall length were analyzed. Each of these three sequence features makes a significant contribution to splicing efficiency in Drosophila extracts, but Drosophila extracts differ from human extracts in that the effect of the pyrimidine stretch is quantitative rather than absolute. When the ratio of pMG31 to pMG1 splicing produced by five different extract preparations was measured by quantitation of the excised intron lariat, measurements varied between 7.73 and 18.4, with a mean of 12.2 and a standard deviation of 3.6 (data not shown). Examples are visible in Fig. 2A, 2B, 3C (lanes 1 versus 3 and 2 versus 4), 4B (lanes 1 to 3 versus 4 to 6), and 5A (lanes 1 to 5 versus 6 to 10). A strong preference for small, rather than large, introns was also observed in Drosophila extracts. This critical dependence of splicing efficiency on length is made clear by data from the deletion series derived from pMG3 and pMG33, which indicate a maximum length of between 79 and 84 nucleotides for efficient splicing of the second white intron in Kc cell nuclear extracts (Fig. 5).

FIG. 4. Splicing of introns carrying a 16-nucleotide insertion in Drosophila Kc cell nuclear extracts. (A) Schematic diagram of insertion construct pMG3/33 and the control (substitution) construct pMG7/37. pMG3 and pMG33 contain 90-nucleotide (nt) introns which were made from pMG1 and pMG31 by a 16-nucleotide insertion (shown by a black box) at a point between the 5' splice site and the branchpoint, 29 nucleotides downstream of the 5' splice site. Constructs pMG7 and pMG37 contain 74-nucleotide introns which were made from pMG1 and pMG31 by substitution of the same 16 nucleotides (shown by a black box) for sequences in the region between the 5' splice site and the branchpoint, starting at 13 nucleotides downstream of the 5' splice site. Exons are shown by open boxes. The intron is shown by a line. The sequence of the 16-nucleotide insertion/substitution is underlined, and the position of the insertion in pMG1/31 is indicated by the vertical bar. (B) Splicing of pMG3, pMG33, pMG7, and pMG37 transcripts in Drosophila Kc nuclear extracts. Splicing reactions were performed and analyzed as described in the legend to Fig. 2A. The exonuclease protection products from pMG3 and pMG33 run with or slightly slower than mRNA (19 nucleotides), and the best estimate of splicing efficiency is obtained from the quantity of IVS or intermediates. Sizes are indicated in nucleotides.

FIG. 5. Effect of intron size on splicing of white second introns in vitro. (A) Splicing of deletion mutants derived from pMG3 and pMG33 in Drosophila Kc cell nuclear extracts. Splicing reactions were performed and analyzed as described in the legend to Fig. 2A. (B) Splicing of the same set of deletion mutants in mammalian HeLa nuclear extracts. (C) Intron size (length in nucleotides) and partial nucleotide sequences of the introns in deletion mutants. Introns derived from pMG3 (lanes 1 to 5 in panels A and B) have the wild-type sequence at positions -9 to -6 upstream of the 3' splice site, while introns derived from pMG33 (lanes 6 to 10 in panels A and B) have the pyrimidines at positions -9 to -6 upstream of the 3' splice site (see Fig. 1D).

In summary, the species specificity of splicing signals, particularly pyrimidine stretch and size requirements, implied by sequence differences between Drosophila and mammalian introns has now been demonstrated in vitro. We have also discovered an unexpected inability of Drosophila extracts to recognize elongated forms of this short intron.

This is not the first report of species-specific splicing in Drosophila versus human cell extracts. Previously, it was observed that substrates containing the regulated intron of the Drosophila P transposable element were spliced accurately in human extracts but not in Drosophila extracts (77). However, the P intron conforms well to the features characteristic of mammalian introns, and the species specificity observed is contrary to expectation (splicing occurs in the heterologous extract). These results are best attributed to the regulated splicing of the intron; the Drosophila extracts used were made from somatic cells, which do not normally splice the P-element third intron, and an inhibitory factor can also be isolated from human cells (80).

The molecular basis of the w^a mutation. We have duplicated the splicing defect observed in the wz allele in vitro with introns containing an sLTR at the site of the copia insertion, and we have mapped the branchpoint of the second white intron to a location disrupted by the copia insertion in the w^a allele. However, when the branchpoint of an intron of wild-type length is altered to match the sequence alteration induced by the copia insertion, splicing is not abolished, either in vitro (Fig. 3C) or in vivo (data not shown). Instead, a nonconsensus branchpoint sequence is used (Fig. 3D). Thus, the branchpoint disruption per se cannot be held accountable for the splicing defect in w'.
Similarly, the provision of a consensus branchpoint to LTR-disrupted introns does not restore splicing in vitro (Fig. 3A). Like all introns derived from the second white intron and greater than 80 nucleotides in size, these LTR-containing introns yielded no spliced products in Drosophila extracts that carry out efficient removal of the wild-type intron in parallel (Fig. 2C and 3A). Thus, our observations are most consistent with the idea that introns carrying the copia LTR are defective for splicing in Kc cell nuclear extracts not because of their altered branchpoint sequence but because of their overall size. This effect is accentuated in the case of the full copia insertion w^a because of copia's polyadenylation activity (31).

Is there a specific mechanism for the splicing of small introns in D. melanogaster? It is well established that Drosophila and vertebrate splicing signals and mechanisms are generally similar. Consensus sequences for Drosophila splice sites are extremely similar to those found in mammals, and similar branchpoint sequences can also be found in Drosophila genes. Furthermore, identified components of the splicing machinery appear to be conserved. The collection of small RNAs that is known to function in mammalian splicing has been identified in Drosophila cells, and these RNAs are highly conserved in sequence (19, 39, 48), as are several proteins involved in splicing that have been identified in both Drosophila and human cells (20, 22, 23, 40, 42-44, 87). In addition, there are natural introns that are recognized by the splicing machinery of both species (63, 77) (Fig. 2C).

However, many Drosophila introns are very small relative to mammalian introns. Approximately half of all sequenced Drosophila introns are less than 80 nucleotides, with a modal length between 60 and 65 nucleotides (21, 45), which is smaller than the size of all but a few mammalian introns (21; reviewed in reference 14). An even more extreme situation exists in Caenorhabditis elegans, in which intron lengths of less than 50 nucleotides are common (9).
Consistent with these observations, a C. elegans intron of 53 nucleotides was efficiently spliced in HeLa cell nuclear extracts only when expanded to 84 nucleotides (55). The size dependence that we have observed for the naturally small white intron not only is species specific but also is specific to small introns. We have obtained similar results with the fifth intron of the myosin heavy-chain gene, in which case expansion from 68 to 84 nucleotides permitted accurate splicing in extracts from HeLa cells but prevented splicing in Kc cell extracts (data not shown). However, the Drosophila ftz intron, which is 150 nucleotides, is efficiently spliced in extracts from both Drosophila and human cells (Fig. 2C). This finding rules out the possibility that Kc extracts are generally unable to splice introns larger than 84 nucleotides but is consistent with the notion of an as yet unidentified independent signal for small introns. Consideration of all of the data (most of them summarized in Fig. 6) leads us to favor the hypothesis that the splicing of small introns, or a subset of small introns, in Drosophila cells differs in mechanistic detail from the splicing of mammalian introns and that Drosophila nuclear extracts are sensitive to this difference.

The effect of size may be more significant in vitro than in vivo. For example, although RNA transcripts from pMG2 and pMG32 are not spliced at all in Drosophila nuclear extracts, and detectable levels of intron-containing transcripts are observed in flies carrying the corresponding white allele with an LTR-containing intron (3), these same flies do have significant levels of normally spliced RNA (46, 86). Similarly, analysis of the splicing of introns described in this study (pMG1, -31, -3, -33, -5, -35, -7, and -37) in transfected Drosophila Schneider cells (38) indicates that lengthening the intron to 90 nucleotides decreases splicing efficiency roughly twofold, comparable in magnitude to the effects of branchpoint and pyrimidine tract alterations in the same study.
Thus, it appears that although the length effect can be observed in vivo, it is less pronounced.

We consider three possibilities for size-specific splicing signals. One is a positive signal (such as a mammalian-style pyrimidine stretch) that operates in the splicing of long introns but is not required for short introns. In small introns, a simple direct contact between U1 snRNP at the 5' splice site and U2 snRNP at the branchpoint might supersede the requirement for this signal. In this case, increasing the distance between the 5' splice site and the branchpoint beyond 53 nucleotides (the distance in pMG3Δ6 and pMG33Δ23) might prevent splicing unless the additional signal were present. A positive signal specific for the splicing of small introns might also exist. This second hypothesis (which is not inconsistent with the first) is attractive in that it explains how pMG5, which lacks both a consensus branchpoint and a pyrimidine stretch, might be recognized. The observation that pMG5 is spliced better than pMG33 would imply that such a small intron-specific positive signal plays a greater role in Drosophila extracts than does either the branchpoint sequence or the pyrimidine stretch. Finally, our results are consistent with a negative signal that inhibits the pairing of one of the two splice sites of a short intron with distant partners. Having shown that a size-dependent signal acts in the splicing of this intron but not the ftz control intron, we are now in a position to localize such a signal and to identify the factor or factors responsible for its activity. The existence of trans-acting genetic modifiers of w^a, at least some of which may act by overcoming the size limitations of this splicing event, should prove useful to this approach. In any case, if there is indeed a distinct mechanism for the splicing of small introns in species such as D. melanogaster and C. elegans, the elucidation of that mechanism is likely to reveal aspects of splicing common to all species.

ACKNOWLEDGMENTS

We thank Don Rio and Chris Siebel for providing plasmid pGEM2 V61 S/B and for technical advice; Hui Ge for an introduction to preparation of S100 and nuclear extracts and for technical advice; Mathew Wang for HeLa and 293 cells; and Zhenqing Pan, James Manley, Frank Laski, Daniel Kalderon, and Jym Mohler for valuable suggestions and critical comments on the manuscript. This work was supported by Public Health Service grant GM from the National Institute of General Medical Sciences, by a National Science Foundation Presidential Young Investigator award to S.M.M., and by Basil O'Connor Starter Scholar research award 5-63 from the March of Dimes Birth Defects Foundation.

REFERENCES

1. Adami, G., and J. R. Nevins. Splicing site selection dominates over poly(A) choice in RNA production from complex adenovirus transcription units. EMBO J. 7:
2. Barabino, S. L., B. J. Blencowe, U. Ryder, B. S. Sproat, and A. I. Lamond. Targeted snRNP depletion reveals an additional role for mammalian U1 snRNP in spliceosome assembly. Cell 63:
3. Beggs, J. D., J. V. D. Berg, A. V. Ooyen, and C. Weissmann. Abnormal expression of chromosomal rabbit β-globin gene in Saccharomyces cerevisiae. Nature (London) 283:
4. Bell, L. R., E. M. Maine, P. Schedl, and T. W. Cline. Sex-lethal, a Drosophila sex determination switch gene, exhibits sex-specific RNA splicing and sequence similarity to RNA binding proteins. Cell 55:
5. Bingham, P. M., and B. H. Judd. A copy of the copia transposable element is very tightly linked to the w^a allele at the white locus of D. melanogaster. Cell 25:
6. Birchler, J. A., and J.
C. Hiebert. Interaction of the Enhancer-of-white-apricot with transposable element alleles at the white locus in Drosophila melanogaster. Genetics 122:
7. Birchler, J. A., J. C. Hiebert, and L. Rabinow. Interaction of the mottler-of-white with transposable element alleles at the white locus in Drosophila melanogaster. Genes Dev. 3:
8. Black, D. L., B. Chabot, and J. A. Steitz. U2 as well as U1 small nuclear ribonucleoproteins are involved in pre-messenger RNA splicing. Cell 42:
9. Blumenthal, T., and J. Thomas. Cis and trans mRNA splicing in C. elegans. Trends Genet. 4:
10. Boggs, R. T., P. Gregor, S. Idriss, J. Belote, and M. McKeown. Regulation of sexual differentiation in D. melanogaster via alternative splicing of RNA from the transformer gene. Cell 50:
11. Chou, T., Z. Zachar, and P. M. Bingham. Developmental expression of a regulatory gene is programmed at the level of splicing. EMBO J. 7:
12. Dignam, J. D., R. M. Lebovitz, and R. G. Roeder. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:
13. Frendewey, D., and W. Keller. The stepwise assembly of a pre-mRNA splicing complex requires U-snRNPs and specific intron sequences. Cell 42:
14. Ge, H., J. Noble, J. Colgan, and J. L. Manley. Polyoma virus small tumor antigen pre-mRNA splicing requires cooperation between two 3' splice sites. Proc. Natl. Acad. Sci. USA 87:
15. Gehring, W. J., and R. Paro. Isolation of a hybrid plasmid with homologous sequences to a transposing element of Drosophila. Cell 19:
16. Goodall, G. J., and W. Filipowicz. The AU-rich sequences present in the introns of plant nuclear pre-mRNAs are required for splicing. Cell 58:
17. Goodall, G. J., and W. Filipowicz. Different effects of intron nucleotide composition and secondary structure on pre-mRNA splicing in monocot and dicot plants. EMBO J. 10:
18. Green, M. R. Biochemical mechanisms of constitutive and regulated pre-mRNA splicing. Annu. Rev. Cell Biol. 7:
19. Guthrie, C., and B. Patterson. Spliceosomal snRNAs. Annu. Rev. Genet. 22:
20. Harper, D. S., L. D. Fresco, and J. D.
Keene. RNA binding specificity of a Drosophila snRNP protein that shares homology with mammalian U1-A and U2-B' proteins. Nucleic Acids Res. 20:
21. Hawkins, J. D. A survey on intron and exon lengths. Nucleic Acids Res. 16:
22. Haynes, S. R., D. Johnson, G. Raychaudhuri, and A. L. Beyer. The Drosophila Hrb87F gene encodes a new member of the A and B hnRNP proteins group. Nucleic Acids Res. 19:
23. Haynes, S. R., G. Raychaudhuri, and A. L. Beyer. The Drosophila Hrb98DE locus encodes four protein isoforms homologous to the A1 protein of mammalian heterogeneous nuclear ribonucleoprotein complexes. Mol. Cell. Biol. 10:
24. Inoue, T., and T. R. Cech. Secondary structure of the circular form of the Tetrahymena rRNA intervening sequence: a technique for RNA structure analysis using chemical probes and reverse transcriptase. Proc. Natl. Acad. Sci. USA 82:
25. Jacquier, A., J. R. Rodriguez, and M. Rosbash. A quantitative analysis of the effects of 5' junction and TACTAAC box mutants and mutant combinations on yeast mRNA splicing. Cell 43:
26. Keller, E. B., and W. A. Noon. Intron splicing: a conserved internal signal in introns of animal pre-mRNAs. Proc. Natl. Acad. Sci. USA 81:
27. Keller, E. B., and W. A. Noon. Intron splicing: a conserved internal signal in introns of Drosophila pre-mRNAs. Nucleic Acids Res. 13:
28. Konarska, M. M., P. J. Grabowski, R. A. Padgett, and P. A. Sharp. Characterization of the branch site in lariat RNAs produced by splicing of mRNA precursors. Nature (London) 313:
29. Krainer, A. R., T. Maniatis, B. Ruskin, and M. R. Green. Normal and mutant human β-globin pre-mRNAs are faithfully and efficiently spliced in vitro. Cell 36:
Kunkel, T. A., J. D. Roberts, and R. A. Zakour. Rapid and efficient site-specific mutagenesis without phenotypic selection. Methods Enzymol. 154:
Kurkulos, M. Unpublished data.
31. Kurkulos, M., J. M. Weinberg, M. E. Pepling, and S. M. Mount. Polyadenylation in copia requires unusually distant upstream sequences. Proc. Natl. Acad. Sci. USA 88:
32. Langford, C. J., and D. Gallwitz. Evidence for an intron-contained sequence required for the splicing of yeast RNA polymerase II transcripts.
Cell 33:
33. Leff, S. E., R. M. Evans, and M. G. Rosenfeld. Splice commitment dictates neuron-specific alternative RNA processing in calcitonin/CGRP gene expression. Cell 48:
34. Legrain, P., B. Seraphin, and M. Rosbash. Early commitment of yeast pre-mRNA to the spliceosome pathway does not require U2 small nuclear ribonucleoprotein. Mol. Cell. Biol. 8:
35. Levis, R., P. M. Bingham, and G. M. Rubin. Physical map of the white locus of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 79:
36. Levis, R., K. O'Hare, and G. M. Rubin. Effects of transposable element insertions on RNA encoded by the white gene of Drosophila melanogaster. Cell 38:

Levitt, N., D. Briggs, A. Gil, and N. J. Proudfoot. Definition of an efficient synthetic poly(A) site. Genes Dev. 3:
Lindsley, D. L., and G. G. Zimm. The genome of Drosophila melanogaster. Academic Press, New York.
38. Lo, P. C. H. Unpublished data.
39. Lo, P. C. H., and S. M. Mount. Drosophila melanogaster genes for U1 snRNA variants and their expression during development. Nucleic Acids Res. 18:
40. Mancebo, R., P. C. H. Lo, and S. M. Mount. Structure and expression of the Drosophila melanogaster gene for the U1 small nuclear ribonucleoprotein particle 70K protein. Mol. Cell. Biol. 10:
41. Mattox, W., and B. S. Baker. Autoregulation of the splicing of transcripts from the transformer-2 gene of Drosophila. Genes Dev. 5:
42. Matunis, E. L., M. J. Matunis, and G. Dreyfuss. Characterization of the major hnRNP proteins from Drosophila melanogaster. J. Cell Biol. 116:
43. Matunis, M. J., E. L. Matunis, and G. Dreyfuss. Isolation of hnRNP complexes from Drosophila melanogaster. J. Cell Biol. 116:
44. Mayeda, A., A. M. Zahler, A. R. Krainer, and M. B. Roth. Two members of a conserved family of nuclear phosphoproteins are involved in general and alternative splicing. Proc. Natl. Acad. Sci. USA 89:
45. Mount, S. M., C. Burks, G. Hertz, G. D. Stormo, O. White, and C. Fields. Splicing signals in Drosophila: intron size, information content, and consensus sequences. Nucleic Acids Res. 20:
46. Mount, S. M., M. M. Green, and G. M. Rubin. Partial revertants of the transposable element-associated suppressible allele white-apricot in Drosophila melanogaster: structure and responsiveness to genetic modifiers. Genetics 118:
47. Mount, S. M., I. Pettersson, M. Hinterberger, A. Karmas, and J. A. Steitz. The U1 small nuclear RNA-protein complex selectively binds a 5' splice site in vitro. Cell 33:
48. Mount, S. M., and J. A. Steitz. Sequence of U1 RNA from Drosophila melanogaster: implications for U1 secondary structure and possible involvement in splicing. Nucleic Acids Res. 9:
49. Nagoshi, R. N., M. McKeown, K. C. Burtis, J. M. Belote, and B. S. Baker. The control of alternative splicing at genes regulating sexual differentiation in D.
melanogaster. Cell 53:
50. Nelson, K. K., and M. R. Green. Mammalian U2 snRNP has a sequence-specific RNA-binding activity. Genes Dev. 3:
51. Newman, A. J., R.-J. Lin, S.-C. Cheng, and J. Abelson. Molecular consequences of specific intron mutations on yeast mRNA splicing in vivo and in vitro. Cell 42:
52. Noble, J. C. S., H. Ge, M. Chaudhuri, and J. L. Manley. Factor interactions with the simian virus 40 early pre-mRNA influence branch site selection and alternative splicing. Mol. Cell. Biol. 9:
53. Noble, J. C. S., Z. Pan, C. Prives, and J. L. Manley. Splicing of SV40 early pre-mRNA to large T and small t mRNA utilizes different patterns of lariat branch sites. Cell 50:
54. Noble, J. C. S., C. Prives, and J. L. Manley. In vitro splicing of simian virus 40 early pre-mRNA. Nucleic Acids Res. 14:
55. Ogg, S. C., P. Anderson, and M. P. Wickens. Splicing of a C. elegans myosin pre-mRNA in a human nuclear extract. Nucleic Acids Res. 18:
56. O'Hare, K., C. Murphy, R. Levis, and G. M. Rubin. DNA sequence of the white locus of Drosophila melanogaster. J. Mol. Biol. 180:
57. Parker, R., and C. Guthrie. A point mutation in the conserved hexanucleotide at a yeast 5' splice junction uncouples recognition, cleavage and ligation. Cell 41:
58. Parker, R., P. G. Siliciano, and C. Guthrie. Recognition of the TACTAAC box during mRNA splicing in yeast involves base-pairing to the U2-like snRNA. Cell 49:
59. Peng, X., and S. M. Mount. Characterization of Enhancer-of-white-apricot in Drosophila melanogaster. Genetics 136:
60. Pepling, M. E., and S. M. Mount. Sequence of a cDNA from the Drosophila melanogaster white gene. Nucleic Acids Res. 18:
61. Pirrotta, V., and C. Brockl. Transcription of the Drosophila white locus and some of its mutants. EMBO J. 3:
62. Rabinow, L., and J. A. Birchler. A dosage-sensitive modifier of the retrotransposon-induced alleles of the Drosophila white locus. EMBO J. 8:
63. Reed, R., and T. Maniatis. Intron sequences involved in lariat formation during pre-mRNA splicing. Cell 41:
64. Reed, R., and T. Maniatis. A role for exon sequences and splice site proximity in splice site selection.
Cell 46:
65. Reed, R., and T. Maniatis. The role of mammalian branchpoint sequences in pre-mRNA splicing. Genes Dev. 2:
66. Rio, D. C. Accurate and efficient pre-mRNA splicing in Drosophila cell-free extracts. Proc. Natl. Acad. Sci. USA 85:
67. Ruby, S. W., and J. Abelson. An early hierarchic role of U1 small nuclear ribonucleoprotein in spliceosome assembly. Science 242:
68. Ruskin, B., J. M. Greene, and M. R. Green. Cryptic branch point activation allows accurate in vitro splicing of human β-globin intron mutants. Cell 52:
69. Ruskin, B., A. R. Krainer, T. Maniatis, and M. R. Green. Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro. Cell 38:
70. Ruskin, B., P. D. Zamore, and M. R. Green. A factor, U2AF, is required for U2 snRNP binding and splicing complex assembly. Cell 52:
71. Saiki, R. K., D. H. Gelfand, S. Stoffel, S. Scharf, R. Higuchi, G. T. Horn, K. B. Mullis, and H. A. Erlich. Primer-directed enzymatic amplification of DNA with a thermostable polymerase. Science 239:
72. Sambrook, J., E. F. Fritsch, and T. Maniatis. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
73. Sanger, F., S. Nicklen, and A. R. Coulson. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:
74. Senapathy, P., M. B. Shapiro, and N. L. Harris. Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to the human genome project. Methods Enzymol. 183:
75. Seraphin, B., L. Kretzner, and M. Rosbash. A U1 snRNA:pre-mRNA base pairing interaction is required early in yeast spliceosome assembly but does not uniquely define the 5' splice site. EMBO J. 7:
76. Seraphin, B., and M. Rosbash. Mutational analysis of the interactions between U1 small nuclear RNA and pre-mRNA of yeast. Gene 82:
77. Siebel, C. W., and D. C. Rio. Regulated splicing of the Drosophila P transposable element third intron in vitro: somatic repression. Science 248:
78. Siliciano, P. G., and C. Guthrie. 5' splice site selection in yeast: genetic alterations in base-pairing with U1 reveal additional requirements. Genes Dev. 2:
79. Smith, C. W. J., J. G.
Patton, and B. Nadal-Ginard. Alternative splicing in the control of gene expression. Annu. Rev. Genet. 23:
80. Tseng, J. C., S. Zollman, A. C. Chain, and F. A. Laski. Splicing of the Drosophila P element ORF2-ORF3 intron is inhibited in a human cell extract. Mech. Dev. 35:
81. von Halle, E. S. Pursuing the Enhancer-of-white-apricot. Drosophila Inf. Serv. 44:
82. Wiebauer, K., J.-J. Herrero, and W. Filipowicz. Nuclear pre-mRNA processing in plants: distinct modes of 3' splice site selection in plants and animals. Mol. Cell. Biol. 8:
83. Wieringa, B., E. Hofer, and C. Weissmann. A minimal

intron length but no specific internal sequence is required for splicing the large rabbit β-globin intron. Cell 37:
84. Wu, J., and J. L. Manley. Mammalian pre-mRNA branch site selection by U2 snRNP involves base pairing. Genes Dev. 3:
85. Zachar, Z., T. B. Chou, and P. M. Bingham. Evidence that a regulatory gene autoregulates splicing of its transcript. EMBO J. 6:
86. Zachar, Z., D. Davidson, D. Garza, and P. M. Bingham. A detailed developmental and structural study of the transcriptional effects of insertion of the copia transposon into the white locus of Drosophila melanogaster. Genetics 111:
87. Zamore, P. D., and M. R. Green. Biochemical characterization of U2 snRNP auxiliary factor: an essential pre-mRNA splicing factor with a novel intramolecular distribution. EMBO J. 10:
88. Zeitlin, S., and A. Efstratiadis. In vivo splicing products of the rabbit β-globin gene. Cell 39:
89. Zhuang, Y., A. M. Goldstein, and A. M. Weiner. UACUAAC is the preferred branch site for mammalian mRNA splicing. Proc. Natl. Acad. Sci. USA 86:
90. Zhuang, Y., and A. M. Weiner. A compensatory base change in U1 snRNA suppresses a 5' splice site mutation. Cell 46:
91. Zhuang, Y., and A. M. Weiner. A compensatory base change in human U2 snRNA can suppress a branch site mutation. Genes Dev. 3: