GDMS Templates Documentation
GDMS Templates Release 1.0
Table of Contents

1. SSR Genotyping Template
2. DArT Genotyping Template
3. SNP Genotyping Template
4. QTL Template
5. Map Template
6. Mapping Template
7. SSR Marker Template
8. SNP Marker Template
9. CISR Marker Template
10. CAP Marker Template
11. MTA Template
SSR Genotyping Template

This template is suitable for SSR fingerprinting data that comes directly from genotyping software or is entered manually from traditional gels.

Sections available in this template

Section Name | Description
SSR_Source | General experiment data
SSR_Data List | The actual data as a list

SSR_Source
Section Description: General experiment data

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator |  | Character | 50
Dataset name | Descriptive name of the dataset | Mandatory | Character | 30
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Species | Taxonomic or common name of Species analysed. |  | Character | 25
Missing data | Symbol or characters used to represent missing data | Mandatory | Character | 20
Remark | Notes or comments on dataset. |  | Character | 255

Dataset name
SSR_Data List
Section Description: The actual data as a list

Field | Description | Required | Type | Size
GID | Germplasm identifier | Mandatory | Integer | 4
Accession | The number or name of the accession. | Mandatory | Character | 40
Marker | The name of the marker used. | Mandatory | Character | 40
Gel/Run | The name or number of the gel or gel run from which the data was taken |  | Character | 50
Dye | The dye used for detection of the peak. |  | Character | 50
Called Allele | The binned allele value, which is normally the expected size of SSR fragment |  | Character | 20
Raw Data | The raw allele value, which is normally the expected size of SSR fragment. |  | Character | 20
Quality | The quality scale takes values from 1 to 100 attributed by the genotyping software |  | Integer | 4
Height | Height of chromatogram peak |  | Integer | 4
Volume | The area under the chromatogram peak. |  | Integer | 4
Amount | The relative allele contribution of this allele to all alleles at this locus. | Mandatory | Integer | 4

GID, Accession (Germplasm Names), Marker
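The SSR_Data List specification above can be checked mechanically before a file is uploaded. The following is a minimal sketch, not part of GDMS: `FieldSpec` and `validate_row` are hypothetical names, rows are assumed to be dictionaries keyed by the field names, and the Size of a Character field is assumed to be a maximum length in characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One row of a template table (hypothetical helper, not a GDMS type)."""
    name: str
    required: bool
    kind: str  # "Character" or "Integer"
    size: int


# Transcribed from the SSR_Data List table.
SSR_DATA_SPEC = [
    FieldSpec("GID", True, "Integer", 4),
    FieldSpec("Accession", True, "Character", 40),
    FieldSpec("Marker", True, "Character", 40),
    FieldSpec("Gel/Run", False, "Character", 50),
    FieldSpec("Dye", False, "Character", 50),
    FieldSpec("Called Allele", False, "Character", 20),
    FieldSpec("Raw Data", False, "Character", 20),
    FieldSpec("Quality", False, "Integer", 4),
    FieldSpec("Height", False, "Integer", 4),
    FieldSpec("Volume", False, "Integer", 4),
    FieldSpec("Amount", True, "Integer", 4),
]


def validate_row(row, spec=SSR_DATA_SPEC):
    """Return a list of human-readable problems found in one data row."""
    errors = []
    for field in spec:
        value = row.get(field.name)
        if value is None or str(value).strip() == "":
            if field.required:
                errors.append(f"{field.name}: mandatory value missing")
            continue
        text = str(value).strip()
        if field.kind == "Integer":
            try:
                int(text)
            except ValueError:
                errors.append(f"{field.name}: expected an integer, got {text!r}")
        elif len(text) > field.size:
            # Assumption: Size is the maximum length of a Character field.
            errors.append(f"{field.name}: longer than {field.size} characters")
    return errors
```

Calling `validate_row` on each row and collecting the non-empty results gives a simple pre-upload report; the same pattern applies to the other templates by swapping in their field lists.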
DArT Genotyping Template

DArT (Diversity Arrays Technology) is a generic and cost-effective genotyping technology. It was invented by Dr Andrzej Kilian to overcome some of the limitations of other molecular marker technologies such as RFLP, AFLP and SSR.

Sections available in this template

Section Name | Description
DArT_Source | Information on the source of the dataset, the species it concerns and the name of the dataset
DArT_Data | The actual data
DArT_GIDs | Information on Germplasm identifier

DArT_Source
Section Description: Information on the source of the dataset

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator |  | Character | 50
Dataset name | Descriptive name of the dataset | Mandatory | Character | 30
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Species | Taxonomic or common name of Species analysed. |  | Character | 25
Remark | Notes or comments on dataset. |  | Character | 255

Dataset name
DArT_Data
Section Description: Information on actual data

Field | Description | Required | Type | Size
Clone ID | Clone identification number | Mandatory | Integer | 4
Marker Name | The name of the marker used. | Mandatory | Character | 40
Q | An estimate of marker quality, which reflects how well the two phases (Present = 1 vs. Absent = 0) of the marker are separated in this sample set. It measures the fraction of the total variation across all individuals due to bimodality. Q is based on ANOVA. | Mandatory | Float | 4
Reproducibility | Measure in % of how reproducible the scoring for replicated samples is (100 means 100% reproducible). A small number of markers were also analysed in duplicate. A score of X means a scoring discordance between 2 copies of an extract or between 2 copies of a marker. | Mandatory | Float | 4
Call Rate | Percentage of valid scores in all possible scores for a marker. | Mandatory | Float | 4
PIC | Polymorphism information content (PIC): a maximum of 0.5 when a marker scores 50% 0 and 50% 1. | Mandatory | Float | 4
Discordance | Measure of reproducibility expressing overall variation of scores within replicated samples. | Mandatory | Float | 4
Genotype 1 | Allele data for Genotype 1 | Mandatory | Character | 20
Genotype 2 | Allele data for Genotype 2 | Mandatory | Character | 20
... | ... | ... | ... | ...
Genotype n | Allele data for Genotype n | Mandatory | Character | 20

Accession (Germplasm Names), Marker
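The Call Rate and PIC definitions above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the formula used by any particular DArT pipeline: scores are assumed to be the strings "1" (present) and "0" (absent), "-" is an assumed missing-data symbol, and PIC is computed as 1 - p^2 - q^2, chosen because it reaches the stated maximum of 0.5 when half the scores are 1 and half are 0.

```python
def call_rate(scores, missing="-"):
    """Percentage of valid (non-missing) scores among all scores for a marker."""
    if not scores:
        raise ValueError("at least one score is required")
    valid = [s for s in scores if s != missing]
    return 100.0 * len(valid) / len(scores)


def pic(scores, missing="-"):
    """PIC for a present/absent marker, assumed here to be 1 - p^2 - q^2."""
    valid = [s for s in scores if s != missing]
    if not valid:
        return 0.0
    p = sum(1 for s in valid if s == "1") / len(valid)  # frequency of "present"
    q = 1.0 - p                                         # frequency of "absent"
    return 1.0 - p * p - q * q
```

With these definitions a marker scored `["1", "0", "1", "0"]` has a PIC of 0.5, and a marker scored identically in every sample has a PIC of 0.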
DArT_GIDs
Section Description: Information on Germplasm identifier

Field | Description | Required | Type | Size
GID | Germplasm identifier | Mandatory | Integer | 4
Germplasm Name | Name of the Germplasm | Mandatory | Character | 40

GID, Accession (Germplasm Names)
SNP Genotyping Template

Template for genotyping data using SNPs. Because SNP genotyping produces a large number of data points, a tab-delimited text file is used for uploading.

Information on the source of the dataset

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator | Mandatory | Character | 50
Email | Email of Principal Investigator | Mandatory |  |
Incharge person | Name of the Incharge person | Mandatory |  |
Dataset name | Descriptive name of the dataset | Mandatory | Character | 30
Purpose of study | Purpose of study | Mandatory |  |
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Species | Taxonomic or common name of Species analysed. | Mandatory | Character | 25
Missing data | This is the symbol or characters used by the scientist to represent missing data | Mandatory | Character | 20
Creation date | Year, month and day when the measurement/observation was made. | Mandatory | Date |

Marker/Genotype Matrix

Field | Description | Required | Type | Size
GID | Germplasm identifier | Mandatory | Integer | 4
Genotype | Name of the Germplasm | Mandatory | Character | 40
Marker Name | The name of the marker used. | Mandatory | Character | 40
SNP Detected | The single nucleotide polymorphism (SNP) being detected by the marker | Mandatory | Character | 4

Dataset name, GID, Genotype (Germplasm Names), Marker
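Because the SNP data is supplied as a tab-delimited text file, it can be read with a standard CSV parser. The template does not spell out the column layout, so the sketch below assumes one: the first two columns are GID and Genotype, and every further column is named after a marker and holds the SNP call. The function name `read_snp_matrix`, the marker names and the "?" missing-data symbol are all illustrative assumptions.

```python
import csv
import io


def read_snp_matrix(handle, missing="?"):
    """Turn an assumed GID/Genotype/marker-column matrix into one record per call."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    markers = header[2:]  # assumed: columns after GID and Genotype are markers
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        gid, genotype, calls = int(row[0]), row[1], row[2:]
        for marker, call in zip(markers, calls):
            if call == missing:
                continue  # drop cells holding the missing-data symbol
            records.append({
                "GID": gid,
                "Genotype": genotype,
                "Marker Name": marker,
                "SNP Detected": call,
            })
    return records


# Hypothetical two-line, two-marker file held in memory for illustration.
sample = "GID\tGenotype\tSNP_001\tSNP_002\n101\tLine-A\tA\tG\n102\tLine-B\t?\tT\n"
records = read_snp_matrix(io.StringIO(sample))
```

The missing-data symbol passed to the reader should match the value declared in the Missing data field of the dataset information.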
QTL Template

Template for QTL Data

Sections available in this template

Section Name | Description
QTL_Source | Information on the source of the dataset, the species it concerns and the name of the dataset
QTL_Data | The actual data

QTL_Source
Section Description: Information on the source of the dataset

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator |  | Character | 50
Dataset Name | Descriptive name of the dataset | Mandatory | Character | 30
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Method | Method by which the QTL was discovered | Mandatory | Character | 25
Score | Value of Log 10 base ratio | Mandatory | Float | 4
Species | Taxonomic or common name of Species analysed. | Mandatory | Character | 25
Remark | To add notes or comments on dataset |  | Character | 25
QTL_Data
Section Description: The actual data

Field | Description | Required | Type | Size
Name | Name of the QTL | Mandatory | Integer | 30
Chromosome | Chromosome where the QTL was mapped. | Mandatory | Character | 50
Map-Name | The Map on which the QTL data can be projected. | Mandatory | Character | 30
Position | Position of QTL | Mandatory | Float | 4
Pos-Min | Minimum position of QTL | Mandatory | Float | 4
Pos-Max | Maximum position of QTL | Mandatory | Float | 4
Trait | Name of the Trait | Mandatory | Character | 40
Experiment | Dataset id from DMS or QTL mapping software, or location or environment | Mandatory | Character | 100
CLEN | Chromosome length |  | Float | 4
LFM | Marker which is left of the QTL | Mandatory | Character | 50
RFM | Marker which is right of the QTL | Mandatory | Character | 50
Effect | Positive or negative effect on the trait (numeric value). | Mandatory | Float | 4
SE additive | Standard error of additive effect | Mandatory | Character | 15
High value parent | Parent of High value allele |  | Character | 255
High value allele | High value allele for particular |  | Character | 20
Low value parent | Parent of Low value allele |  | Character | 255
Low value allele | Low value allele for particular |  | Character | 20
Score | Value of Log 10 base ratio | Mandatory | Float | 4
R2 | % of total phenotypic variation | Mandatory | Float | 4
Interactions | Epistatic or environmental conjunction |  | Character | 255

Name (QTL name)
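The Position, Pos-Min and Pos-Max fields above describe an interval around the QTL, and R2 is a percentage, so a row can be sanity-checked before upload. The rules in this sketch are assumptions that follow from the field descriptions rather than constraints stated by the template, and `check_qtl_row` is a hypothetical helper name.

```python
def check_qtl_row(row):
    """Return consistency problems for one QTL_Data row (assumed rules)."""
    problems = []
    position = float(row["Position"])
    pos_min = float(row["Pos-Min"])
    pos_max = float(row["Pos-Max"])
    if pos_min > pos_max:
        problems.append("Pos-Min is greater than Pos-Max")
    if not pos_min <= position <= pos_max:
        problems.append("Position lies outside the Pos-Min to Pos-Max interval")
    r2 = float(row["R2"])
    if not 0.0 <= r2 <= 100.0:
        # Assumption: R2 is a percentage of total phenotypic variation.
        problems.append("R2 is not between 0 and 100")
    return problems
```

A row with Position 12.0, Pos-Min 10.0, Pos-Max 15.5 and R2 18.2 passes with no problems reported.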
Map Template

Template for Map Data

Sections available in this template

Section Name | Description
Map | Information on the map

Map
Section Description: Information on the map

Field | Description | Required | Type | Size
Map Name | Name of the Map | Mandatory | Character | 30
Map Description | Description of the Map | Mandatory | Character | 150
Crop | Name of the crop | Mandatory | Character | 25
Map Unit | Linkage group units, e.g. cM (centimorgan), bp (base pair) | Mandatory | Character | 15
Marker Name | The name of the marker used. | Mandatory | Character | 40
Linkage Group | Chromosome number / Linkage group number | Mandatory | Character | 50
Position | The position in cM of the marker on the map | Mandatory | Float | 4

Map Name
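Each Map row places one marker at a position within a linkage group, so the rows can be regrouped into an ordered marker list per group. A minimal sketch, assuming rows are dictionaries keyed by the field names above; `order_markers` is a hypothetical helper and the template itself does not require rows to be sorted.

```python
from collections import defaultdict


def order_markers(rows):
    """Group Map rows by Linkage Group and order markers by Position."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Linkage Group"]].append(
            (float(row["Position"]), row["Marker Name"])
        )
    # Sorting the (position, name) pairs orders markers along each group.
    return {
        group: [name for _, name in sorted(entries)]
        for group, entries in groups.items()
    }
```

The result maps each linkage group label to its marker names in ascending position order.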
Mapping Template

Template for Genetic Mapping Data

Sections available in this template

Section Name | Description
Mapping_Source | Information on the source of the dataset, the species it concerns and the name of the dataset
Mapping_Datalist | The actual data

Mapping_Source
Section Description: Information on the source of the dataset, the species it concerns and the name of the dataset

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator |  | Character | 50
Email contact | Email contact of PI | Mandatory | Character | 30
Dataset Name | Descriptive name of the dataset | Mandatory | Character | 30
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Species | Taxonomic or common name of Species analysed. | Mandatory | Character | 25
Population ID | An identifier of the population, usually consisting of an identifier for the cross and population type | Mandatory | Character | 25
Parent A GID | Germplasm id for parent A (usually female) in the population. | Mandatory | Integer | 4
Parent A | Germplasm name of parent A (usually female) in the population. | Mandatory | Character | 50
Parent B GID | Germplasm id for parent B (usually male) in the population. | Mandatory | Integer | 4
Parent B | Germplasm name of parent B (usually male) in the population. | Mandatory | Character | 50
Population Size | Number of individuals in the mapping population |  | Character | 4
Population Type | Type of population used for mapping. |  | Character | 50
Purpose of the study | Description of the reason for the study. | Mandatory | Character | 50
Scoring Scheme | The name of the scoring scheme used |  |  |
Missing data | This is the symbol or characters used by the scientist to represent missing data | Mandatory | Character | 20
Creation date | Year, month and day when the measurement/observation was made. | Mandatory | Date |
Remark | To add notes or comments on dataset. |  | Character | 255

Dataset name

Mapping_Datalist
Section Description: The actual data

Genotype/Marker Matrix

Field | Description | Required | Type | Size
Alias | Alias name | Mandatory | Character | 40
GID | Germplasm identifier | Mandatory | Character | 40
Line | Name of the Germplasm | Mandatory | Character | 40
Marker | The name of the marker used. | Mandatory | Character | 40
Score | The called genotype value, e.g. A, B or H | Mandatory | Character | 4

GID, Line (Germplasm Names), Marker
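The Score field in Mapping_Datalist holds a called genotype such as A, B or H, so a per-marker tally of score classes is a quick way to spot markers with many missing calls. A minimal sketch, assuming rows are dictionaries keyed by the field names above and that "-" is the symbol declared in the Missing data field; `score_summary` is a hypothetical helper.

```python
from collections import Counter


def score_summary(rows, missing="-"):
    """Count score classes per marker, pooling the missing symbol as 'missing'."""
    summary = {}
    for row in rows:
        counts = summary.setdefault(row["Marker"], Counter())
        score = row["Score"]
        counts["missing" if score == missing else score] += 1
    return summary
```

For a marker scored A, H and "-" across three lines, the summary holds one count each for "A", "H" and "missing".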
SSR Marker Template

Information about SSR Markers

Sections available in this template

Section Name | Description
SSR Markers | Information on the SSR Markers

SSR Markers
Section Description: Information on the SSR Markers

Field | Description | Required | Type | Size
Marker Name | The name of the marker used | Mandatory | Character | 40
Alias | Alias name |  | Character | 40
Crop | Name of the crop | Mandatory | Character | 25
Genotype | Name of the genotype/accession/germplasm on which the marker was originally discovered. |  | Character | 40
Ploidy | Ploidy of the species, e.g. haploid, diploid etc. |  | Character | 25
GID | Germplasm identifier |  | Character | 40
Principal Investigator | Name of the Principal investigator | Mandatory | Character | 50
Contact | Contact details |  | Character | 255
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Incharge Person | Name of the Incharge person |  | Character | 50
Assay Type | Type of assay, e.g. GoldenGate, KASPar etc. |  | Character | 50
Repeat | Motif values |  | Character | 250
No of Repeats | Number of repeats |  | Integer | 4
SSR Type | Type of motif, e.g. di, tri, tetra, etc. |  | Character | 20
Sequence | Marker sequence in which the motif is identified |  | Character | 2500
Sequence Length | Number of base pairs in the sequence |  | Integer | 4
Min Allele | Minimum allele value |  | Integer | 4
Max Allele | Maximum allele value |  | Integer | 4
SSR number | Number of identified SSR out of the total number of identified SSRs |  | Integer | 4
Size of Repeat Motif | Number of base pairs in the motif |  | Character | 255
Forward Primer | The sequence of forward primer used in the PCR to detect the allele |  | Character | 25
Reverse Primer | The sequence of reverse primer used in the PCR to detect the allele |  | Character | 25
Product Size | Size of the Product |  | Character | 20
Primer Length | Length of the Primer |  | Integer | 4
Forward Primer Temperature | Melting temperature of forward primer |  | Float | 4
Reverse Primer Temperature | Melting temperature of reverse primer |  | Float | 4
Annealing Temperature | The annealing temperature of the PCR used to detect the allele |  | Float | 4
Elongation Temperature | Elongation Temperature |  | Float | 4
Fragment Size Expected | Expected size of fragment in bp |  | Integer | 4
Fragment Size Observed | Observed size of fragment in bp |  | Integer | 4
Amplification | Amplification status |  | Character | 12
Reference | Formal citation for a paper or electronic publication |  | Character | 255

Note: the source flags two of the fields from Sequence through Reference as Mandatory, but the flattened layout does not show which two.

GID, Genotype (Germplasm Names), Marker
SNP Marker Template

Information about SNP Markers

Sections available in this template

Section Name | Description
SNP Markers | Information on the SNP Markers

SNP Markers
Section Description: Information on the SNP Markers

Field | Description | Required | Type | Size
Marker Name | The name of the marker used | Mandatory | Character | 40
Alias | Alias name |  | Character | 40
Crop | Name of the crop | Mandatory | Character | 25
Genotype | Name of the genotype/accession/germplasm on which the marker was originally discovered. |  | Character | 40
Ploidy | Ploidy of the species, e.g. haploid, diploid etc. |  | Character | 25
GID | Germplasm identifier |  | Character | 40
Principal Investigator | Name of the Principal investigator | Mandatory | Character | 50
Contact | Contact details |  | Character | 255
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Incharge Person | Name of the Incharge person |  | Character | 50
Assay Type | Type of assay, e.g. GoldenGate, KASPar etc. |  | Character | 50
Forward Primer | The sequence of forward primer used in the PCR to detect the allele | Mandatory | Character | 25
Reverse Primer | The sequence of reverse primer used in the PCR to detect the allele | Mandatory | Character | 25
Product Size | Size of the Product |  | Character | 20
Expected Product Size | Expected product size |  | Integer | 4
Position on Reference Sequence | Position number of SNP on reference sequence |  | Integer | 4
Motif | Motif values |  | Character | 250
Annealing Temperature | The annealing temperature of the PCR used to detect the allele |  | Float | 4
Sequence | Marker sequence in which the motif is identified |  | Character | 2500
Reference | Formal citation for a paper or electronic publication |  | Character | 255

GID, Genotype (Germplasm Names), Marker
CISR Marker Template

Information about CISR Markers

Sections available in this template

Section Name | Description
CISR Markers | Information on the CISR Markers

CISR Markers
Section Description: Information on the CISR Markers

Field | Description | Required | Type | Size
Marker Name | The name of the marker used | Mandatory | Character | 40
Primer ID | Name of the primer |  | Character | 40
Alias | Alias name |  | Character | 40
Crop | Name of the crop | Mandatory | Character | 25
Genotype | Name of the genotype/accession/germplasm on which the marker was originally discovered. |  | Character | 40
Ploidy | Ploidy of the species, e.g. haploid, diploid etc. |  | Character | 25
GID | Germplasm identifier |  | Character | 40
Principal Investigator | Name of the Principal investigator | Mandatory | Character | 50
Contact | Contact details |  | Character | 255
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Incharge Person | Name of the Incharge person |  | Character | 50
Assay Type | Type of assay, e.g. GoldenGate, KASPar etc. |  | Character | 50
Repeat | Motif values |  | Character | 250
No of Repeats | Number of repeats |  | Integer | 4
Sequence | Marker sequence in which the motif is identified |  | Character | 2500
Sequence Length | Number of base pairs in the sequence |  | Integer | 4
Min Allele | Minimum allele value |  | Integer | 4
Max Allele | Maximum allele value |  | Integer | 4
Size of Repeat Motif | Number of base pairs in the motif |  | Character | 255
Forward Primer | The sequence of forward primer used in the PCR to detect the allele |  | Character | 25
Reverse Primer | The sequence of reverse primer used in the PCR to detect the allele |  | Character | 25
Product Size | Size of the Product |  | Character | 20
Primer Length | Length of the Primer |  | Integer | 4
Forward Primer Temperature | Melting temperature of forward primer |  | Float | 4
Reverse Primer Temperature | Melting temperature of reverse primer |  | Float | 4
Annealing Temperature | The annealing temperature of the PCR used to detect the allele |  | Float | 4
Fragment Size Expected | Expected size of fragment in bp |  | Integer | 4
Amplification | Amplification status |  | Character | 12
Reference | Formal citation for a paper or electronic publication |  | Character | 255
Remarks | To add notes or comments on dataset. |  | Character | 255

Note: the source flags two of the fields from Sequence through Remarks as Mandatory, but the flattened layout does not show which two.

GID, Genotype (Germplasm Names), Marker
CAP Marker Template

Information about CAP Markers

Sections available in this template

Section Name | Description
CAP Markers | Information on the CAP Markers

CAP Markers
Section Description: Information on the CAP Markers

Field | Description | Required | Type | Size
Marker Name | The name of the marker used | Mandatory | Character | 40
Primer ID | Name of the primer |  | Character | 40
Alias | Alias name |  | Character | 40
Crop | Name of the crop | Mandatory | Character | 25
Genotype | Name of the genotype/accession/germplasm on which the marker was originally discovered. |  | Character | 40
Ploidy | Ploidy of the species, e.g. haploid, diploid etc. |  | Character | 25
GID | Germplasm identifier |  | Character | 40
Principal Investigator | Name of the Principal investigator | Mandatory | Character | 50
Contact | Contact details |  | Character | 255
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Incharge Person | Name of the Incharge person |  | Character | 50
Assay Type | Type of assay, e.g. GoldenGate, KASPar etc. |  | Character | 50
Forward Primer | The sequence of forward primer used in the PCR to detect the allele | Mandatory | Character | 25
Reverse Primer | The sequence of reverse primer used in the PCR to detect the allele | Mandatory | Character | 25
Product Size | Size of the Product |  | Character | 20
Expected Product Size | Expected product size |  | Integer | 4
Restriction enzyme for assay | The restriction enzymes used to create the diversity array library |  | Character | 20
Position on Reference Sequence | Position number of SNP on reference sequence |  | Integer | 4
Motif | Motif values |  | Character | 250
Annealing Temperature | The annealing temperature of the PCR used to detect the allele |  | Float | 4
Sequence | Marker sequence in which the motif is identified |  | Character | 2500
Reference | Formal citation for a paper or electronic publication |  | Character | 255
Remarks | To add notes or comments on dataset. |  | Character | 255

GID, Genotype (Germplasm Names), Marker
MTA Template

Template for marker-trait association (MTA) data.

Sections available in this template

Section Name | Description
MTA_Source | Information on the source of the dataset, the species it concerns and the name of the dataset
MTA_Data | The actual data

MTA_Source
Section Description: Information on the source of the dataset, the species it concerns and the name of the dataset

Field | Description | Required | Type | Size
Institute | Institute which carried out the analysis | Mandatory | Character | 100
Principal investigator | Name of the Principal investigator |  | Character | 50
Dataset Name | Descriptive name of the dataset | Mandatory | Character | 30
Dataset description | Description of dataset | Mandatory | Character | 255
Genus | Genus name for taxon | Mandatory | Character | 25
Method | Method of MTA | Mandatory | Character | 25
Score | Value of Log 10 base ratio | Mandatory | Float | 4
Species | Taxonomic or common name of Species analysed. | Mandatory | Character | 25
Remark | To add notes or comments on dataset |  | Character | 25

Dataset name
MTA_Datalist
Section Description: The actual data

Field | Description | Required | Type | Size
Marker | The name of the marker used. | Mandatory | Character | 40
Chromosome | Germplasm identifier | Mandatory | Character | 50
Map-Name | The Map on which the MTA data can be projected. | Mandatory | Character | 30
Position | Position of QTL | Mandatory | Float | 4
Trait | Name of the Trait | Mandatory | Character | 40
Effect | Positive or negative effect on the trait (numeric value). | Mandatory | Float | 4
High value allele | High value allele for particular | Mandatory | Character | 20
Experiment | Dataset id from DMS or QTL mapping software, or location or environment | Mandatory | Character | 100
Score | Value of Log 10 base ratio | Mandatory | Float | 4
R2 | % of total phenotypic variation | Mandatory | Float | 4

Marker, Trait