1. (20 points) What is each of the following molecular markers? Indicate (a) what the abbreviation stands for; (b) the nature of the molecular polymorphism; (c) the methods of detection (such as gel electrophoresis, PCR, restriction digest, etc.); and (d) the primary applications.

RFLP
a. Restriction fragment length polymorphism
b. Changes in fragment size due to loss or gain of a restriction site
c. Southern analysis
d. Linkage mapping, genotyping

CAPS
a. Cleaved amplified polymorphic sequence
b. Changes in PCR fragment size due to loss or gain of a restriction site
c. PCR, then restriction digest
d. Genotyping, linkage mapping

SNP
a. Single nucleotide polymorphism
b. Single nucleotide substitutions
c. PCR / ASO hybridization, primer extension, Southern analysis
d. Linkage mapping

Micro-satellite marker
b. Small repeated units (2-3 bp)
c. PCR and gel electrophoresis
d. Linkage mapping, highly polymorphic DNA markers

Mini-satellite marker
b. Highly polymorphic repeating units (20-100 bp)
c. Restriction digest and Southern analysis
d. DNA fingerprinting

2. uvs- is an Arabidopsis mutant that is hypersensitive to UV irradiation. To determine the map distance between UVS and the microsatellite marker E (both UVS and E are located on the same chromosome), Dr. Franks made the following cross:
He let the F1 plants self-cross and then isolated DNA from 19 F2 UV-hypersensitive plants (uvs-/uvs-). PCR primers were used to amplify the E locus from these 19 uvs-/uvs- mutant plants. The PCR reactions were run on a 3% agarose gel, an image of which is shown below (the E and e lanes are controls).

(a) (3 points) Which of these plants (by number) show the e/e pattern? Which show the E/E pattern, and which show the heterozygous E/e pattern?
e/e: 1 2 3 5 6 8 9 10 11 12 13 14 16 17 18 19
E/e: 4 7
E/E: 15

(b) (3 points) Which type (E/E, E/e, or e/e) shows one recombination event, and which type shows double recombination?
e/e: none
E/e: one
E/E: double

(5 points) Calculate the distance (in % recombination) between E and UVS1.
19 individuals, 2 chromosomes each: 19*2 = 38 chromosomes
2 individuals (E/e) with 1 crossover event each: 2
1 individual (E/E) with 2 crossover events: 2
(2+2)/(19*2) = 4/38 = 0.105 = 10.5%

3. (24 points) Dr. Liu's lab works on two different genes named LEUNIG and SEUSS.

(a) Use a PubMed search to find out how many journal articles describe research on the LEUNIG gene (hint: exclude articles whose authors have LEUNIG as their last name).
16

(b) How many journal articles describe both the LEUNIG and SEUSS genes?
3

(c) What are the GenBank accession numbers for the Arabidopsis LEUNIG protein? (List at least three accession numbers.)
First 10
1: AAG32022. Reports LEUNIG [Arabidops...[gi:11141605]
3: Q9FUY2. Reports Transcriptional c...[gi:30580400]
4: NP_567896. Reports LUG (LEUNIG) [Ara...[gi:18418034]

(d) What is the amino acid sequence of the LEUNIG protein with accession Q9FUY2? Please print it out in FASTA format and attach it to this homework.
1: Q9FUY2. Reports Transcriptional c...[gi:30580400]
>gi 30580400:1-931 Transcriptional corepressor LEUNIG
MSQTNWEADKMLDVYIHDYLVKRDLKATAQAFQAEGKVSSDPVAIDAPGGFLFEWWSVFWDIFIARTNEK
HSEVAASYIETQMIKAREQQLQQSQHPQVSQQQQQQQQQQIQMQQLLLQRAQQQQQQQQQQHHHHQQQQQ
QQQQQQQQQQQQQQQHQNQPPSQQQQQQSTPQHQQQPTPQQQPQRRDGSHLANGSANGLVGNNSEPVMRQ
NPGSGSSLASKAYEERVKMPTQRESLDEAAMKRFGDNVGQLLDPSHASILKSAAASGQPAGQVLHSTSGG
MSPQVQTRNQQLPGSAVDIKSEINPVLTPRTAVPEGSLIGIPGSNQGSNNLTLKGWPLTGFDQLRSGLLQ
QQKPFMQSQSFHQLNMLTPQHQQQLMLAQQNLNSQSVSEENRRLKMLLNNRSMTLGKDGLGSSVGDVLPN
VGSSLQPGGSLLPRGDTDMLLKLKMALLQQQQQNQQQGGGNPPQPQPQPQPLNQLALTNPQPQSSNHSIH
QQEKLGGGGSITMDGSISNSFRGNEQVLKNQSGRKRKQPVSSSGPANSSGTANTAGPSPSSAPSTPSTHT
PGDVISMPNLPHSGGSSKSMMMFGTEGTGTLTSPSNQLADMDRFVEDGSLDDNVESFLSQEDGDQRDAVT
RCMDVSKGFTFTEVNSVRASTTKVTCCHFSSDGKMLASAGHDKKAVLWYTDTMKPKTTLEEHTAMITDIR
FSPSQLRLATSSFDKTVRVWDADNKGYSLRTFMGHSSMVTSLDFHPIKDDLICSCDNDNEIRYWSINNGS
CTRVYKGGSTQIRFQPRVGKYLAASSANLVNVLDVETQAIRHSLQGHANPINSVCWDPSGDFLASVSEDM
VKVWTLGTGSEGECVHELSCNGNKFQSCVFHPAYPSLLVIGCYQSLELWNMSENKTMTLPAHEGLITSLA
VSTATGLVASASHDKLVKLWK

(e) Use the LEUNIG protein (accession: Q9FUY2) as a query to perform a Blastp search. How many types of protein domains does the LEUNIG protein have? What are the names of these domains?
2: SSDP, WD40

(f) The STYLOSA protein from a different plant, Antirrhinum majus, is highly similar to LEUNIG from Arabidopsis thaliana. What are the score and the e-value for the alignment between STYLOSA and LEUNIG? What is the percent identity between these two proteins? What is the percent positives between these two proteins?
Score: 959
e-value: 0.0
% identity: 72
% positive: 83

5. (28 points) Use OMIM to search for the human hereditary disease "cystic fibrosis (CF)".

(a) What is the name of the gene responsible for cystic fibrosis?
CFTR: cystic fibrosis transmembrane conductance regulator

(b) Briefly describe the function of this protein.
ATP-binding cassette transporter; functions as a chloride channel and regulates other transport pathways.

(c) Indicate the exact chromosomal location of this gene in humans.
7q31.2

(d) What are the names of its neighboring genes on the human chromosome map? (Provide one protein on each side.)
ASZ1, CTTNBP2
Zooming in or out on the map in the website may show different neighbors.

(e) What are the symptoms of CF?
Disrupts the exocrine function of the pancreas and also affects the intestinal glands (meconium ileus), biliary tree (biliary cirrhosis), bronchial
glands (chronic bronchopulmonary infection with emphysema), and sweat glands (high sweat electrolyte with depletion in a hot environment). Infertility occurs in males and females.

(f) Is the mutation causing cystic fibrosis dominant or recessive?
Recessive

(g) Use HGMD (first click on the gene name in the human chromosome map, then click on the HGMD link on the right) to find out how many mutations have been found in this gene in humans. Also list at least three types of mutations found in humans (for example, frameshift, missense, repeat, etc.).
1203
Any 3 of these: missense/nonsense, splicing, regulatory, small deletion, small insertion, small indels, gross deletions, complex rearrangements, repeat variations

6. (10 points) Perform the following BLAST searches and indicate the number of hits and the highest score for each.

(a) Run a Blastn search using the Drosophila Dynein mRNA (NM_137686) as a query.
2 pnts  hits: 101  score: 3727

(b) Run a Blastp search using the Drosophila protein (AAF21334) as a query.
2 pnts  hits: 520  score: 1090

(c) What is the most striking difference between the Blastn and Blastp search results? Explain why.
6 pnts  Blastp gave more sequences with significant similarity to the query; Blastn gave very few. During evolution, many silent mutations accumulate, so that different nucleotide sequences come to encode similar proteins. Because of the degeneracy of the genetic code, similar proteins can be encoded by very different nucleotide sequences. As a result, the nucleotide sequences can be too divergent to show significant similarity.
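The codon-degeneracy argument can be illustrated with a small Python sketch. The two DNA sequences below are made up for illustration (they are not taken from the BLAST results), and the codon table lists only the six standard codons that the example uses. Both sequences translate to the same peptide, yet they match at fewer than a quarter of their nucleotide positions.

```python
# Partial codon table: only the six standard codons used by the toy sequences.
CODON_TABLE = {
    "CTG": "L", "TTA": "L",  # leucine
    "AGC": "S", "TCT": "S",  # serine
    "CGT": "R", "AGA": "R",  # arginine
}


def translate(dna):
    """Translate a DNA string codon by codon using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical sequences that use different synonymous codons throughout.
seq_a = "CTGAGCCGTCTGAGCCGT"
seq_b = "TTATCTAGATTATCTAGA"

protein_a = translate(seq_a)
protein_b = translate(seq_b)

print("protein A:", protein_a)
print("protein B:", protein_b)
print("nucleotide identity: %.1f%%" % percent_identity(seq_a, seq_b))
print("protein identity: %.1f%%" % percent_identity(protein_a, protein_b))
```

This is an extreme case chosen to make the point; real homologs differ at only some synonymous positions, but the same effect lowers nucleotide-level similarity relative to protein-level similarity.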
4 pnts  Blastn gave lower scores and Blastp gave higher scores because nucleotide sequences consist of only 4 different nucleotides, whereas protein sequences consist of 20 different amino acids (higher complexity). As a result, any change in a nucleotide could increase the E-value (i.e., homology by chance).
2 pnts  Blastn searches nucleotide sequences; Blastp searches protein sequences.

7. Search the "Structure" database with "Drosophila AND Homeodomain" as a query.

(a) (3 points) How many different homeodomain structure entries do you obtain?
26

(b) (4 points) Look further into the structure of 1JGG and then view the 3D structure using the Cn3D program. Cn3D shows two homeodomains bound to a short stretch of double-stranded DNA. How many alpha-helices or beta-sheets are in each homeodomain? What is the DNA sequence bound by the two homeodomains shown in 1JGG?
2 pnts: 3 alpha helices each
2 pnts:
5' naattgaatt 3'
3' attaacttan 5'
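
As a check on the arithmetic in question 2, the map distance can be recomputed with a short Python sketch. The function name and its parameters are illustrative. The counts (2 E/e plants, 1 E/E plant, 19 plants in total) come from the gel scoring in 2(a), and the calculation assumes, as the answer key does, that each E allele found in a uvs-/uvs- plant marks one recombinant chromosome.

```python
def recombination_percent(n_single, n_double, n_plants):
    """Percent recombination among the chromosomes of the scored F2 plants.

    n_single: plants carrying one recombinant chromosome (E/e here)
    n_double: plants carrying two recombinant chromosomes (E/E here)
    n_plants: total number of plants scored
    """
    recombinant_chromosomes = n_single * 1 + n_double * 2
    total_chromosomes = 2 * n_plants  # diploid: two chromosomes per plant
    return 100.0 * recombinant_chromosomes / total_chromosomes


# Counts from question 2: plants 4 and 7 are E/e, plant 15 is E/E.
distance = recombination_percent(n_single=2, n_double=1, n_plants=19)
print("%.1f%% recombination" % distance)
```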