WSSP-10 Chapter 9 Determine ORF and BLASTP


1 WSSP-10 Chapter 9 Determine ORF and BLASTP

2 Steps and terms used in protein expression: 1st ATG in the mRNA (p. 9-1). Cloning the cDNA library (p. 9-1).

3 Possible reading frames (p. 9-2). Possible types of clones in the cDNA library (p. 9-2).

4 DSAP Define ORF page: link to the Toolbox translation program (p. 9-3). Toolbox DNA Sequence Translation Program: polyA tail at the 3' end; reading frames (p. 9-3).

5 EX reading frame: longest ORF, translation stop (p. 9-3). Could this ORF code for the protein? (p. 9-4)
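
The reading-frame search on these slides can be sketched in a few lines of Python. This is only an illustration, not the Toolbox program itself: it assumes the clone reads 5' to 3' (polyA tail at the 3' end), so only the three forward frames are translated, and it uses the standard genetic code. The names `translate` and `longest_orf` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate one forward frame (1, 2 or 3); unknown codons become X."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame - 1, len(dna) - 2, 3)
    )


def longest_orf(dna):
    """Return (frame, protein) for the longest stretch that starts at M.

    A stretch that runs off the end of the clone without a stop codon
    is still reported, since the cDNA may be incomplete.
    """
    best_frame, best_protein = 0, ""
    for frame in (1, 2, 3):
        for segment in translate(dna, frame).split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best_protein):
                best_frame, best_protein = frame, segment[start:]
    return best_frame, best_protein


toy = "GGATGCTTGAAGGTTAAGC"  # hypothetical sequence, not EX1.10
for frame in (1, 2, 3):
    print(frame, translate(toy, frame))
print(longest_orf(toy))
```

The longest ORF is only a candidate; the slides that follow check it against the BLASTx matches before accepting it.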

6 Does this region match the BLASTx matches? Region of DNA that codes for the protein sequence highlighted in BLASTx (p. 9-4). Could the DNA code for a partial protein? (p. 9-4)

7 Does this region match the BLASTx matches? Region of DNA that codes for the protein sequence highlighted in BLASTx (p. 9-4).

8 An example of a partial coding sequence. Similar Seq. Is this a partial ORF cDNA clone? What about this region?

9 The first part of the protein may not have matches because it is not conserved. (Figure labels: Query, Sbjct, region of similarity.)

The BLASTx helps determine which reading frame is correct:

>ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 158 bits (400), Expect = 5e-37
 Identities = 73/93 (78%), Positives = 83/93 (89%), Gaps = 0/93 (0%)
 Frame = +2

Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

It also helps suggest the start point (p. 9-6).
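
The link between the BLASTx query coordinates and the reported frame is simple arithmetic, sketched below. The helper names are hypothetical, and the sketch assumes a plus-strand hit (a positive frame), as in the alignment on this slide.

```python
def blastx_frame(query_start):
    """Forward reading frame (1, 2 or 3) implied by the first aligned base."""
    return (query_start - 1) % 3 + 1


def aligned_codons(query_start, query_end):
    """Number of codons spanned by the aligned query range (1-based, inclusive)."""
    span = query_end - query_start + 1
    if span % 3 != 0:
        raise ValueError("aligned range is not a whole number of codons")
    return span // 3


# Coordinates taken from the BLASTx alignment on the slide: Query 11 .. 289.
print(blastx_frame(11))         # 2, matching "Frame = +2"
print(aligned_codons(11, 289))  # 93, matching the subject length of 93
```

Because the alignment begins at Sbjct 1 with an M at query base 11, it also points to base 11 as the likely start of the ORF.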

10 Choose the reading frame and paste in the protein sequence. Do not include the * (stop codon). Make sure to include the bases that code for the stop codon (p. 9-7). DSAP BLASTp page (p. 9-8).
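
The two rules on this slide (no * in the protein, stop codon bases included in the DNA) can be shown together. This is a minimal sketch under stated assumptions: the standard genetic code, a known 1-based start position, and a hypothetical helper `extract_orf` with a made-up toy sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def extract_orf(dna, start):
    """Return (protein, orf_dna, (start, end)) for an ORF beginning at `start`.

    `start` is the 1-based position of the first base of the first codon.
    The protein omits the stop (*); the DNA and the end coordinate include
    the three bases of the stop codon when one is found.
    """
    dna = dna.upper()
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            return "".join(protein), dna[start - 1:i + 3], (start, i + 3)
        protein.append(amino_acid)
    # No stop codon before the end of the clone: report what is there.
    return "".join(protein), dna[start - 1:], (start, len(dna))


toy = "GGATGCTTGAAGGTTAAGC"  # hypothetical sequence, not EX1.10
protein, orf_dna, coords = extract_orf(toy, 3)
print(protein)  # MLEG            (paste this into BLASTp, no *)
print(orf_dna)  # ATGCTTGAAGGTTAA (ends with the stop codon TAA)
print(coords)   # (3, 17)
```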

11 NCBI BLASTp page: paste in the protein sequence (p. 9-8). BLASTp results of EX ORF: link to the Conserved Domain Database (p. 9-9).

12 BLASTp results of EX ORF. BLASTp results of EX ORF: no matches.

13 Enter BLASTp data into table. (Figure: protein from M to *, drawn above possible DNA clones, each ending in AAAAAA.) (p. 9-10)

Suppose the cDNA was missing the first 13 bp. Does this DNA code for the start of the protein?

>gi ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 139 bits (351), Expect = 8e-32
 Identities = 63/81 (77%), Positives =

Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
            MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

Query  61   GSFIYFRLETLHFLIFKGAAA  81
            GSFIYFRLE+L FL+FKGAAA
Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

14 Suppose the cDNA was missing the first 13 bp. Did they choose the correct ORF?

BLASTp starting here (query begins MPRKM...):

>gi ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 139 bits (351), Expect = 8e-32
 Identities = 63/81 (77%), Positives =

Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
            MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

Query  61   GSFIYFRLETLHFLIFKGAAA  81
            GSFIYFRLE+L FL+FKGAAA
Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

BLASTp starting here (query begins LEGRA...):

>gi ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
 Score = 156 bits (395), Expect = 6e-37
 Identities = 72/92 (78%), Positives = 82/92 (89%), Gaps = 0/92 (0%)

Query  1    LEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVG  60
            LEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVG
Sbjct  2    LEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVG  61

Query  61   SSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  92
            S FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  62   SGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93
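
The two BLASTp queries compared on this slide can be derived from the same truncated translation. The sketch below is illustrative only: the protein string is assembled from the Query lines of the second alignment above, and the helper names are hypothetical. It shows why starting at the first in-frame M drops residues that still match the database protein.

```python
# Translation of the truncated clone, copied from the Query lines on the slide.
TRUNCATED = (
    "LEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVG"
    "SSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA"
)


def candidate_queries(frame_translation):
    """Return (from_first_residue, from_first_met) up to the first stop."""
    orf = frame_translation.split("*")[0]
    met = orf.find("M")
    from_met = orf[met:] if met != -1 else ""
    return orf, from_met


def residues_missing_from_start(sbjct_start):
    """Subject residues that precede the alignment (1-based start)."""
    return sbjct_start - 1


full, from_met = candidate_queries(TRUNCATED)
print(len(full), full[:6])          # 92 LEGRAR
print(len(from_met), from_met[:6])  # 81 MPRKMQ

# Sbjct start positions reported in the two alignments on the slide.
print(residues_missing_from_start(2))   # 1 residue of the subject uncovered
print(residues_missing_from_start(13))  # 12 residues of the subject uncovered
```

An alignment that begins at Sbjct 13 rather than Sbjct 1 is the hint that the chosen M is an internal methionine and that the clone is missing the true start.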

15 Compare the BLASTx and BLASTp results for EX1.10: Are the matches to the same proteins? (p. 9-11) Compare the BLASTx and BLASTp results for EX1.10: Are the e-values similar? (p. 9-12)

16 Compare the BLASTx and BLASTp results for EX1.10: Are the alignments similar? (p. 9-12)

BLASTx

>ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93

Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

BLASTp

>gi ref NP_ dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 158 bits (400), Expect = 2e-37

Query  1    MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  60
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  61   GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  93
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

DSAP Review Page
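
One way to check that two alignments agree is to recount identities and positives from the middle line, where a letter marks an identical residue, a + marks a similar one, and a space marks a mismatch. The sketch below does this for the midline shown on the slide (it is the same in the BLASTx and BLASTp alignments). The function name is hypothetical, and the integer-division percentages are an assumption chosen because they reproduce the values printed on the slides.

```python
# Middle line of the alignment, copied from the slide (two blocks joined).
MIDLINE = (
    "MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV"
    "GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA"
)


def alignment_stats(midline):
    """Return (identities, positives, length) counted from a BLAST midline."""
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives, len(midline)


identities, positives, length = alignment_stats(MIDLINE)
print(f"Identities = {identities}/{length} ({100 * identities // length}%)")
print(f"Positives = {positives}/{length} ({100 * positives // length}%)")
# Identities = 73/93 (78%)
# Positives = 83/93 (89%)
```

The counts match the Identities and Positives lines reported for the BLASTx hit, so the two programs are reporting the same residue-by-residue alignment; only the query coordinates differ (nucleotides for BLASTx, amino acids for BLASTp).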

17 Determine ranges of the 5' UTR and 3' UTR. Highlight ranges in the cDNA text box (p. 9-14).
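
Once the ORF range is fixed, the UTR ranges follow by subtraction. This is a minimal sketch, assuming 1-based inclusive coordinates and an ORF range that already includes the stop codon; `utr_ranges` and the numbers in the example are hypothetical, not values from EX1.10.

```python
def utr_ranges(cdna_length, orf_start, orf_end):
    """Return (five_prime_utr, three_prime_utr) as 1-based inclusive ranges.

    `orf_end` is the last base of the stop codon. A UTR is None when the
    ORF reaches that end of the clone, as can happen with a partial cDNA.
    """
    if not 1 <= orf_start <= orf_end <= cdna_length:
        raise ValueError("ORF range must lie within the cDNA")
    five_prime = (1, orf_start - 1) if orf_start > 1 else None
    three_prime = (orf_end + 1, cdna_length) if orf_end < cdna_length else None
    return five_prime, three_prime


# Hypothetical 19-bp clone with an ORF at bases 3-17 (stop codon included).
print(utr_ranges(19, 3, 17))  # ((1, 2), (18, 19))
# A clone whose ORF starts at its first base has no 5' UTR to highlight.
print(utr_ranges(19, 1, 17))  # (None, (18, 19))
```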