PREDICT Host DNA Barcoding Guide
Contents:
1. Rationale for Barcoding
2. Implementation
3. PCR Protocols
4. Data Interpretation
5. Data Entry into EIDITH
1. Rationale

Identification of host species based on morphological traits can be challenging. Bats and rodents pose a particular problem because there are so many species, and the distinguishing characteristics (traits) of many are poorly described. As a result, in PREDICT-1 nearly 20% of bat and rodent species sampled were not identified to the species level in the field, and many others were potentially incorrectly identified. For this reason, in PREDICT-2 we have implemented DNA barcoding to confirm species identifications made in the field.

What is barcoding?

DNA barcoding is the process by which species are confirmed using genetic sequence. Two host genes are commonly used for this purpose: Cytochrome B (CytB) and Cytochrome oxidase subunit 1 (CO1). Protocols targeting both genes are provided below.

What do we barcode?

We will barcode all bats and rodents that are positive for a virus (PREDICT priority viral families). We will also barcode a subset of individuals of each species that are negative for the PREDICT priority viral families (5 individuals per species; again, bats and rodents only).

Note: Further barcoding may be required if discrepancies between the field IDs and the genetic barcode are identified.

2. Implementation

i) Run the CytB PCR assay on all samples. This will also serve as an extraction control PCR and can replace the PCRs for Beta Actin. If the CytB PCR fails, run the CO1 PCR assay on those samples instead. The objective is to have a CytB or CO1 amplicon for every sample.

ii) Following confirmation that CytB or CO1 was amplified (i.e. a band on a gel), place all PCR products in the freezer until viral family testing has been completed.

iii) Upon completion of viral family testing, including sequencing, select one CytB or CO1 PCR product from each virus-positive individual (rodents and bats only) for sequencing. Ideally, this would be the same sample in which you detected the virus; however, this is not required. If necessary, you can use any sample from that same individual. Then select CytB/CO1 PCR products from five virus-negative individuals per species for sequencing (rodents and bats only).
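The selection rule in step iii can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys (animal_id, field_species, virus_positive, has_amplicon) are hypothetical and are not EIDITH field names, and taking the first five virus-negative individuals in list order is an arbitrary choice that the guide does not prescribe.

```python
from collections import defaultdict


def select_for_sequencing(individuals, negatives_per_species=5):
    """Return the animal IDs whose CytB/CO1 product should be sequenced.

    `individuals` is a list of dicts with hypothetical keys:
    'animal_id', 'field_species', 'virus_positive', 'has_amplicon'.
    """
    selected = []
    negatives_taken = defaultdict(int)  # field species -> negatives chosen so far
    for ind in individuals:
        if not ind["has_amplicon"]:
            continue  # no CytB or CO1 product available for this individual
        if ind["virus_positive"]:
            # Every virus-positive individual is barcoded.
            selected.append(ind["animal_id"])
        elif negatives_taken[ind["field_species"]] < negatives_per_species:
            # Only a subset (default 5) of virus-negative individuals per species.
            negatives_taken[ind["field_species"]] += 1
            selected.append(ind["animal_id"])
    return selected
```

In practice a lab may prefer to spread the five virus-negative individuals across sites or sampling dates; the function can be fed a pre-sorted list to achieve that.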
3. PCR Protocols

Cytochrome B (CytB) RT-PCR Protocol

Methods: Reverse transcription performed separately using the Invitrogen SuperScript III First-Strand Synthesis kit (Cat# 18080-400), followed by PCR.

Reference: Townzen, JS et al. (2008). Med. Vet. Entomol. 22:386-393

Primer sequences:
CytB_F: 5'-GAGGMCAAATATCATTCTGAGG-3'
CytB_R: 5'-TAGGGCVAGGACTCCTCCTAGT-3'

Invitrogen Platinum Taq kit (Cat# 10966-026)

For 25 µL reaction:
2.5 µL     10X PCR Buffer
0.75 µL    MgCl2 (50 mM)
0.5 µL     dNTP (10 mM)
0.1 µL     Platinum Taq DNA polymerase
18.15 µL   Molecular grade water
1 µL       Forward primer @ 10 µM
1 µL       Reverse primer @ 10 µM
1 µL       Template

PCR reaction conditions:
94 °C for 2 min
50 cycles:
    94 °C for 30 sec   denature
    52 °C for 50 sec   annealing
    72 °C for 60 sec   elongation
72 °C for 7 min        final elongation
10 °C                  for cooling

Target: Mitochondrial Cytochrome b
Size: ~457 bp

Visualizing results: Run 10 µL of PCR product on a 1.5% agarose gel.
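When many samples are run at once, the 25 µL recipe is usually prepared as a master mix. The sketch below scales the per-reaction volumes listed in the protocol for a given number of reactions. The 10% overage for pipetting loss is an assumption (a common bench habit, not something this guide specifies), and the function name is illustrative. Template (1 µL) is left out of the mix because it is added to each tube individually.

```python
from decimal import Decimal

# Per-reaction volumes in microlitres, copied from the 25 uL recipe.
# Template is excluded because it differs for every tube.
PER_REACTION_UL = {
    "10X PCR Buffer": Decimal("2.5"),
    "MgCl2 (50 mM)": Decimal("0.75"),
    "dNTP (10 mM)": Decimal("0.5"),
    "Platinum Taq DNA polymerase": Decimal("0.1"),
    "Molecular grade water": Decimal("18.15"),
    "Forward primer (10 uM)": Decimal("1"),
    "Reverse primer (10 uM)": Decimal("1"),
}
TEMPLATE_UL = Decimal("1")


def master_mix(n_reactions, overage=Decimal("0.10")):
    """Scale each reagent for n_reactions plus a fractional overage.

    The default 10% overage is an assumption, not part of the protocol.
    """
    factor = Decimal(n_reactions) * (1 + overage)
    return {name: volume * factor for name, volume in PER_REACTION_UL.items()}


if __name__ == "__main__":
    for name, volume in master_mix(10).items():
        print(f"{volume:>8} uL  {name}")
```

Each tube then receives 24 µL of master mix plus 1 µL of template. Decimal is used instead of float so that the volumes print exactly as pipetted.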
Cytochrome Oxidase I (COI) RT-PCR Protocol

Methods: Reverse transcription performed separately using the Invitrogen SuperScript III First-Strand Synthesis kit (Cat# 18080-400), followed by nested PCR.

Reference: Townzen, JS et al. (2008). Med. Vet. Entomol. 22:386-393

Primer sequences:
Round 1:
COI_long_F: 5'-AACCACAAAGACATTGGCAC-3'
COI_long_R: 5'-AAGAATCAGAATARGTGTTG-3'
Round 2:
COI_short_F: 5'-GCAGGAACAGGWTGAACCG-3'
COI_short_R: 5'-AATCAGAAYAGGTGTTGGTATAG-3'

Invitrogen Platinum Taq kit (Cat# 10966-026)

For 25 µL reaction:
2.5 µL     10X PCR Buffer
0.75 µL    MgCl2 (50 mM)
0.5 µL     dNTP (10 mM)
0.1 µL     Platinum Taq DNA polymerase
18.15 µL   Molecular grade water
1 µL       Forward primer @ 10 µM
1 µL       Reverse primer @ 10 µM
1 µL       Template

PCR reaction conditions (same protocol for Rounds 1 and 2):
94 °C for 2 min
45 cycles:
    94 °C for 30 sec   denature
    48 °C for 50 sec   annealing
    72 °C for 60 sec   elongation
72 °C for 7 min        final elongation
10 °C                  for cooling

Target: Mitochondrial Cytochrome Oxidase Subunit 1
Size: Round 1 ~663 bp; Round 2 ~324 bp

Visualizing results:
Run 10 µL of Round 1 PCR product on a 1.5% agarose gel.
Run 10 µL of Round 2 PCR product on a 1.5% agarose gel.
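Several of the primers above contain IUPAC ambiguity codes (M, V, R, W, Y), meaning each is synthesized as a mixture of sequences. As an aid to reading them, the following sketch expands a degenerate primer into the explicit sequences it represents, using the standard IUPAC nucleotide code table. The function name is illustrative and is not part of the protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for seq in expand_primer("GAGGMCAAATATCATTCTGAGG"):  # CytB_F
        print(seq)
```

For CytB_F this yields two sequences, one with A and one with C at the M position.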
4. Data Interpretation

We use GenBank as our reference database and BLAST (Basic Local Alignment Search Tool) to search it.

- Go to https://blast.ncbi.nlm.nih.gov/blast.cgi
- Click on Nucleotide BLAST under Web BLAST.

Figure 1. GenBank website

- Paste your sequence in the Enter Query Sequence box (Box 1 in Figure 2).
- Select Others (nr etc.) and the Nucleotide collection (nr/nt) in the dropdown menu (Box 2 in Figure 2).
- Select Highly similar sequences (megablast) as the BLAST algorithm (Box 3 in Figure 2). If you get very few (or no) results, try Somewhat similar sequences (blastn). Both searches should provide the same top results.
- Click the BLAST button (Box 4 in Figure 2).

Figure 2. BLAST tool website
- Once the BLAST search is done, the results are displayed on a new page (Figure 3). If you searched several sequences simultaneously, you can select the results for each sequence under Results for (Box 1 in Figure 3).
- The graphic summary (Box 2 in Figure 3) shows the quality of the alignment of the query sequence with the sequences in the database. The color corresponds to the alignment score.

Figure 3. BLAST results

- Below the graphic summary, the result descriptions list all sequences in the database producing significant alignments with the query sequence (Figure 4). The list starts with the best matches. You should examine both sequence coverage (Query cover box, Figure 4) and identity (Ident box, Figure 4) to determine the quality of your match. We use a threshold of 97% identity to confirm the species. In the example below, Hipposideros cervinus is the identified species to be entered into EIDITH.
Figure 4. Species with 97% identity

- Between 95% and 97%, the exact species is uncertain (cf. species). In the example below, the individual should be entered into EIDITH as Rhinolophus cf. creaghi.

Figure 5. Species with 95% identity

- Below 95%, the species remains unidentified, as in the example below. In this case, please sequence the other gene (CO1) to see if a more specific result can be obtained. If the result is the same, the individual should be entered into EIDITH as unidentified.

Figure 6. Species with less than 95% identity

- In some cases, your sequence may match more than one species, as in the example below (Figure 7). The query sequence may belong to Tupaia minor or Tupaia tana (97% identity). In this case, please sequence the other gene (CO1) to see if a more specific result is obtained. If you get the same result, the individual should be identified only to the genus (e.g. Tupaia sp.) and entered into EIDITH as such.
Figure 7. Sequence matches more than one species
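The identity thresholds described in this section can be summarized as a small decision function. This is a sketch under several assumptions, each flagged in the comments: 97% and 95% are treated as inclusive lower bounds, query coverage is assumed to have been checked by eye beforehand, the step of sequencing the second gene is not modelled, and a tie between species from different genera (a case the guide does not cover) is reported as unidentified.

```python
def classify_hit(top_hits, confirm=97.0, uncertain=95.0):
    """Suggest the name to record from BLAST top hits.

    `top_hits` is a list of (species_name, percent_identity) tuples.
    Assumption: thresholds are inclusive lower bounds (>= 97, >= 95).
    """
    if not top_hits:
        return "unidentified"
    best_identity = max(identity for _, identity in top_hits)
    if best_identity < uncertain:
        return "unidentified"

    # All distinct species sharing the best identity score.
    best_species = sorted({name for name, identity in top_hits
                           if identity == best_identity})
    if len(best_species) > 1:
        genera = {name.split()[0] for name in best_species}
        if len(genera) == 1:
            return f"{genera.pop()} sp."  # identify to genus only
        return "unidentified"  # assumption: case not covered by the guide

    genus, epithet = best_species[0].split(maxsplit=1)
    if best_identity >= confirm:
        return f"{genus} {epithet}"
    return f"{genus} cf. {epithet}"  # 95% to below 97%: species uncertain
```

For example, a single top hit of Rhinolophus creaghi at 95% identity gives "Rhinolophus cf. creaghi", and equal 97% hits for Tupaia minor and Tupaia tana give "Tupaia sp.". A result of "unidentified" or a genus-only name is the cue to sequence the other gene before entering the record.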
5. Data Entry into EIDITH

Manual Data Entry

1. In the Barcoding Dashboard, press Create Barcoding Result Batch.
2. You will now be taken to the data entry screen. Enter your batch name (1) and choose the lab that performed the barcoding tests (2). If the lab does not exist in the dropdown list, contact technology@eidith.org and we will add the lab for you.
3. Choose the specimen ID for the first test (or type in the specimen ID). Once you choose/enter the specimen ID, the animal ID, scientific name & common name for that specimen will appear in the text boxes so that you can verify you have the correct specimen before proceeding.
4. Press Add Barcoding Result.
5. You will now be shown a row with the data to be entered. All fields are mandatory. Once you have completed the data entry, press Save. To enter another result, go back to Step 3. If you need to change any of the data entered, press Edit.
6. Once the batch is complete, press Back to Dashboard.
7. You can now upload your batch to EIDITH. You will be notified when the Barcoding results have been reviewed and uploaded into EIDITH where you can review the final animal data with updated species names (if applicable).
Using Templates

1. Download the template under Templates.
2. Fill in the template with your results. Please note: your template can only contain one lab name. If you have results from more than one lab, please create a separate template for each lab. To avoid upload errors, it is also a good idea to go to EIDITH.org → Uploaded Data → Specimens, download a list of specimens in EIDITH, and check it against the specimens entered in your template to ensure that all specimens exist in the database before you upload.
3. In the Barcoding Dashboard, press Import Tests from Excel (1). You will be prompted to navigate to your template. Choose the file, then press Open (2).
4. Your batch will now appear in the Dashboard and will be named the same as the template file name. To enter the batch, click on the batch name link.
5. You are now in the batch where you will see the results entered in the template. You can now add more results (see step 3 of manual data entry) or edit the results (see step 5 of manual data entry) if necessary.
6. Once the batch is complete, press Back to Dashboard where you can submit the batch. You will be notified when the Barcoding results have been reviewed and uploaded into EIDITH where you can review the final animal data with updated species names (if applicable).
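The specimen cross-check recommended in step 2 of Using Templates can be done by eye, or with a short script such as the sketch below. It assumes you have already read the specimen IDs out of your template and out of the list downloaded from EIDITH into two Python lists; how you read them depends on the column names in those files, which are not given in this guide. The function name is illustrative, and stripping surrounding whitespace before comparison is an assumption about how IDs should be matched.

```python
def missing_specimens(template_ids, eidith_ids):
    """Return template specimen IDs that are absent from the EIDITH list.

    Assumption: IDs are compared exactly after stripping surrounding
    whitespace; the order of the template is preserved in the result.
    """
    known = {specimen_id.strip() for specimen_id in eidith_ids}
    return [specimen_id.strip() for specimen_id in template_ids
            if specimen_id.strip() not in known]


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    template = ["SPEC-001", " SPEC-002 ", "SPEC-003"]
    downloaded = ["SPEC-001", "SPEC-002"]
    print(missing_specimens(template, downloaded))
```

Any IDs returned should be corrected in the template, or the missing specimens added to EIDITH, before the template is imported.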