WSSP-10 Chapter 9 Determine ORF and BLASTP

Steps and terms used in protein expression: 1st ATG in mRNA (p. 9-1)
Cloning the cDNA library (p. 9-1)

Possible reading frames (p. 9-2)
Possible types of clones in the cDNA library (p. 9-2)

DSAP Define ORF page: Link to Toolbox translation program (p. 9-3)
Toolbox: DNA Sequence Translation Program. PolyA tail at 3' end. Reading frames (p. 9-3)
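What a translation program does with the three forward reading frames can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code; the function name `translate_frames` and the short test sequence are made up for the example and are not part of the DSAP Toolbox.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_frames(dna):
    """Translate the forward strand in frames +1, +2 and +3.

    Returns a dict mapping the frame number to a one-letter protein
    string in which '*' marks a stop codon.
    """
    dna = dna.upper()
    frames = {}
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        # Codons with ambiguous bases (for example N) are shown as X.
        frames[offset + 1] = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return frames

example = "CATGGCTTAAG"  # hypothetical sequence, not from EX1.10
for frame, protein in translate_frames(example).items():
    print(f"+{frame}: {protein}")
```

Only the forward frames are shown here because a cDNA clone with a polyA tail at its 3' end is already oriented; reverse frames would need the reverse complement first.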

EX1.10 +1 Reading Frame: Longest ORF, Translation stop (p. 9-3)
Could this ORF code for the protein? (p. 9-4)
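Picking the longest ORF out of one translated frame can be sketched as follows. This is an assumption about how one might automate the step, not the Toolbox's own method; `longest_orf` and the sample frame are hypothetical.

```python
def longest_orf(protein):
    """Return (start_index, orf) for the longest M-to-stop stretch.

    `protein` is a one-letter frame translation in which '*' marks a
    stop codon. The returned ORF excludes the '*'. A final stretch with
    no stop is still considered, since a clone may end before the stop.
    If the frame has no M at all, (-1, "") is returned.
    """
    best_start, best_orf = -1, ""
    position = 0  # index of the current segment's first residue
    for segment in protein.split("*"):
        met = segment.find("M")
        if met != -1 and len(segment) - met > len(best_orf):
            best_start, best_orf = position + met, segment[met:]
        position += len(segment) + 1  # +1 skips the '*'
    return best_start, best_orf

frame = "LKMAG*PSMLEGRARV*TT"  # hypothetical frame translation
print(longest_orf(frame))
```

The longest ORF is only a candidate. Whether it actually codes for the protein still has to be checked against the BLASTx matches.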

Does this region match the BLASTx matches? Region of DNA that codes for the protein sequence highlighted in BLASTx (p. 9-4)
Could the DNA code for a partial protein? (p. 9-4)

An example of a partial coding sequence
Similar Seq. Is this a partial ORF cDNA clone? What about this region?

The first part of the protein may not have matches because it is not conserved.
[Figure: Query and Sbjct bars with the region of similarity marked; coordinates 2, 60, 410, 475]

The BLASTx helps determine which reading frame is correct

>ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 158 bits (400), Expect = 5e-37
 Identities = 73/93 (78%), Positives = 83/93 (89%), Gaps = 0/93 (0%)
 Frame = +2

Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

It also helps suggest the start point (p. 9-6)
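The link between the BLASTx query coordinates and the reading frame is simple arithmetic, sketched here in Python. The helper names are hypothetical; the numbers 11, 190 and 289 are the Query coordinates from the alignment above, and the calculation assumes a forward-strand hit.

```python
def blastx_frame(query_start):
    """Forward reading frame (+1, +2 or +3) implied by a 1-based query start."""
    return (query_start - 1) % 3 + 1

def aligned_codons(query_start, query_end):
    """Number of codons spanned by a forward-strand BLASTx query range."""
    length = query_end - query_start + 1
    if length % 3:
        raise ValueError("range is not a whole number of codons")
    return length // 3

print(blastx_frame(11))         # frame implied by Query 11
print(aligned_codons(11, 289))  # codons between bases 11 and 289
```

A query start of 11 falls in frame +2, and bases 11 to 289 span 93 codons, the same as the Length=93 of the matched protein. That agreement is why this hit suggests both the frame and the start point.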

Choose the reading frame and paste in the protein sequence. Do not include the * (stop codon) in the protein sequence. Make sure the DNA range does include the bases that code for the stop codon (p. 9-7)
DSAP BLASTp page (p. 9-8)
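The two rules on this slide pull in opposite directions: the protein drops the `*`, while the DNA range keeps the stop codon. A small Python sketch makes that concrete. The function names are hypothetical, and the example values (ORF starting at base 11, 93 amino acids) assume the +2 ORF of EX1.10 runs from the ATG found by BLASTx straight to a stop codon.

```python
def clean_protein(translation):
    """Drop a trailing '*' so that only amino acids are pasted into BLASTp."""
    return translation.rstrip("*")

def orf_range_with_stop(orf_start, protein_length):
    """1-based (start, end) of the ORF in the DNA, including the stop codon.

    `protein_length` counts amino acids only (no '*'), so three extra
    bases are added for the stop codon.
    """
    coding_bases = protein_length * 3
    return orf_start, orf_start + coding_bases + 3 - 1

print(clean_protein("MLEGRARV*"))   # hypothetical short translation
print(orf_range_with_stop(11, 93))  # assumed start and length, see above
```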

NCBI BLASTp page: Paste in protein sequence (p. 9-8)
BLASTp results of EX1.10 +2 ORF: Link to Conserved Domain Database (p. 9-9)

BLASTp results of EX1.10 +1 ORF
BLASTp results of EX1.10 +3 ORF
No matches

Enter BLASTp data into table
[Figure: protein from M (start) to * (stop), with possible DNA clones, each ending in AAAAAA] (p. 9-10)

Suppose the cDNA was missing the first 13 bp. Does this DNA code for the start of the protein?

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 139 bits (351), Expect = 8e-32
 Identities = 63/81 (77%), Positives =

Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
            MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

Query  61   GSFIYFRLETLHFLIFKGAAA  81
            GSFIYFRLE+L FL+FKGAAA
Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

Suppose the cDNA was missing the first 13 bp. Did they choose the correct ORF?

BLASTP starting here (at the first M):

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 139 bits (351), Expect = 8e-32
 Identities = 63/81 (77%), Positives =

Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
            MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

Query  61   GSFIYFRLETLHFLIFKGAAA  81
            GSFIYFRLE+L FL+FKGAAA
Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

BLASTP starting here (at the first codon of the truncated clone):

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
 Score = 156 bits (395), Expect = 6e-37
 Identities = 72/92 (78%), Positives = 82/92 (89%), Gaps = 0/92 (0%)

Query  1    LEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVG  60
            LEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVG
Sbjct  2    LEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVG  61

Query  61   SSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  92
            S FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  62   SGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93
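The effect of losing the first 13 bp can be worked through in Python. This sketch assumes the intact ORF starts at base 11, as the BLASTx hit for EX1.10 indicated; `truncate_clone` is a hypothetical helper, and the protein string is the 93-residue EX1.10 query from the alignments.

```python
FULL_ORF_START = 11  # assumed 1-based position of the ATG in the intact clone
PROTEIN = (
    "MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV"
    "GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA"
)

def truncate_clone(orf_start, protein, missing_bp):
    """Describe the ORF left after `missing_bp` bases are lost from the 5' end.

    Returns (frame, remaining, from_first_met): the reading frame in the
    shortened clone, the protein still encoded by intact codons, and the
    part of that protein starting at its first M.
    """
    lost_into_orf = max(0, missing_bp - (orf_start - 1))  # ORF bases removed
    codons_lost = -(-lost_into_orf // 3)                   # round up to whole codons
    remaining = protein[codons_lost:]
    # First base of the first intact codon, in the shortened clone's coordinates.
    new_start = orf_start + codons_lost * 3 - missing_bp
    frame = (new_start - 1) % 3 + 1
    first_met = remaining.find("M")
    from_first_met = remaining[first_met:] if first_met != -1 else ""
    return frame, remaining, from_first_met

frame, remaining, from_first_met = truncate_clone(FULL_ORF_START, PROTEIN, 13)
print(frame, len(remaining), len(from_first_met))
```

Under these assumptions the original ATG (bases 11 to 13) is lost, the ORF shifts to frame +1, and 92 residues remain. Starting at the first remaining M gives only 81 residues, which matches Sbjct 13 to 93 rather than the start of the protein. This is the signature of a partial clone.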

Compare the BLASTx and BLASTp results for EX1.10: Are the matches to the same proteins? (p. 9-11)
Compare the BLASTx and BLASTp results for EX1.10: Are the e-values similar? (p. 9-12)

Compare the BLASTx and BLASTp results for EX1.10: Are the alignments similar? (p. 9-12)

BLASTx

>ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93

Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

BLASTp

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
 Score = 158 bits (400), Expect = 2e-37

Query  1    MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  60
            MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

Query  61   GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  93
            GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

DSAP Review Page
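One way to check that the two alignments agree is to recount the identities directly from the aligned sequences. The sketch below does this for the gap-free EX1.10 alignment; `identities` is a hypothetical helper, and it only works when the alignment has no gaps, as is the case here (Gaps = 0/93).

```python
QUERY = (
    "MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV"
    "GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA"
)
SBJCT = (
    "MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV"
    "GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA"
)

def identities(a, b):
    """Count identical positions in two gap-free aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))

same = identities(QUERY, SBJCT)
# Integer division truncates the percentage, as in the figures quoted above.
print(f"Identities = {same}/{len(QUERY)} ({100 * same // len(QUERY)}%)")
```

Because BLASTx translates the same DNA that produced the pasted protein, both searches align the identical 93 residues. Only the Query coordinates differ (bases 11 to 289 versus residues 1 to 93).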

Determine ranges of 5' UTR and 3' UTR. Highlight ranges in the cDNA text box (p. 9-14)