Lecture 2: Biology Basics Continued
Central Dogma
DNA: The Code of Life
The structure and the four genomic letters code for all living organisms: Adenine, Guanine, Thymine, and Cytosine, which pair A-T and C-G on complementary strands.
DNA: The Code of Life
DNA has a double-helix structure. Each nucleotide is composed of a sugar molecule, a phosphate group, and a base (A, C, G, T). Sequences are always written and read from the 5' end to the 3' end, the direction in which new strands are built during transcription and replication.
DNA Replication
DNA can replicate by splitting and rebuilding each strand. Note that the rebuilding of each strand uses slightly different mechanisms due to the 5'-3' asymmetry, but each daughter molecule is an exact replica of the original.
http://users.rcn.com/jkimball.ma.ultranet/biologypages/d/dnareplication.html
Inverse Complement of DNA
What is the inverse complement sequence of TATAGCCCG?
Answer: CGGGCTATA
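The answer can be checked with a few lines of Python. This is a minimal sketch, assuming an uppercase sequence over A, C, G, T only; the function name `inverse_complement` is chosen here for illustration and is not from the lecture.

```python
# Watson-Crick pairing from the lecture: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def inverse_complement(seq: str) -> str:
    """Read the sequence backwards and swap each base for its partner."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


print(inverse_complement("TATAGCCCG"))  # CGGGCTATA
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.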
Genotype/Phenotype
To prevent confusion between genes (which are inherited) and developmental outcomes (which are not), geneticists make a distinction between the genotype and the phenotype of an organism.
Genotype: complete set of genes inherited by an individual.
Phenotype: all aspects of the individual's physiology, behavior, and ecological relationships.
DNA: the Genetic Makeup
Genes are inherited and are expressed: genotype (genetic makeup), phenotype (physical expression).
On the left are the eye phenotypes of the green and black eye genes.
Two organisms whose genes differ at one locus are said to have different genotypes. A locus (plural: loci) is the specific location of a gene or DNA sequence on a chromosome. A variant of the DNA sequence at a given location is called an allele. The ordered list of loci known for a particular genome is called a genetic map.
Diploid and polyploid cells whose chromosomes have the same allele of a given gene at some locus are called homozygous with respect to that gene (otherwise, they are heterozygous).
The chromosomal locus of a gene might be written "6p21.3":
6: chromosome number
p: position on the chromosome's short arm ("p") or long arm ("q")
21.3: the position on the arm: region 2, band 1, sub-band 3
The bands are visible under a microscope when the chromosome is stained.
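The locus notation can be taken apart mechanically. The sketch below is illustrative only: it assumes the simple form shown on the slide (chromosome, arm, one-digit region, one-digit band, optional sub-band), and the names `Locus` and `parse_locus` are hypothetical helpers, not part of any standard library.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    chromosome: str
    arm: str       # "short" for p, "long" for q
    region: int
    band: int
    sub_band: str  # digits after the dot, "" when absent


# Assumed pattern: chromosome (1-2 digits, X or Y), arm, region, band, optional sub-band.
_LOCUS = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_locus(text: str) -> Locus:
    """Split a locus such as '6p21.3' into its named parts."""
    match = _LOCUS.match(text)
    if match is None:
        raise ValueError(f"unrecognized locus: {text!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    arm_name = "short" if arm == "p" else "long"
    return Locus(chromosome, arm_name, int(region), int(band), sub_band or "")


print(parse_locus("6p21.3"))
```

For "6p21.3" this yields chromosome 6, short arm, region 2, band 1, sub-band 3, matching the breakdown on the slide.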
Genotype/Phenotype
Phenotype: Blue eyes. Genotype: bb (recessive).
Phenotype: Brown eyes. Genotype: Bb or BB (dominant).
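The mapping on this slide can be written as a small function. This is a sketch of the single-gene, complete-dominance model shown here only; the function names `eye_phenotype` and `is_homozygous` are made up for illustration.

```python
def eye_phenotype(genotype: str) -> str:
    """Map a two-allele genotype (B = dominant, b = recessive) to a phenotype."""
    if len(genotype) != 2 or set(genotype) - {"B", "b"}:
        raise ValueError(f"expected two alleles from B/b, got {genotype!r}")
    # One dominant allele is enough to give the dominant phenotype.
    return "brown" if "B" in genotype else "blue"


def is_homozygous(genotype: str) -> bool:
    """True when both alleles at the locus are the same (BB or bb)."""
    return genotype[0] == genotype[1]


for g in ("BB", "Bb", "bb"):
    print(g, eye_phenotype(g), "homozygous" if is_homozygous(g) else "heterozygous")
```

Two different genotypes (BB and Bb) give the same phenotype, which is why the genotype cannot always be read off from the phenotype.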
Genes are shown in relative order and distance from each other based on pedigree studies.
The chance of the chromosome breaking between A & C is higher than the chance of the chromosome breaking between A & B during meiosis.
Similarly, the chance of the chromosome breaking between E & F is higher than the chance of the chromosome breaking between F & G.
The closer two genes are, the more likely they are to be inherited together (co-occurrence).
If pedigree studies show a high incidence of co-occurrence, those genes will be located close together on a genetic map.
Pleiotropy: when one gene affects many different traits.
Polygenic traits: when one trait is governed by multiple genes, which may be on the same chromosome or on different chromosomes. The additive effects of numerous genes on a single phenotype create a continuum of possible outcomes. Polygenic traits are also most susceptible to environmental influences.
Pleiotropy in humans: Phenylketonuria
A disorder that is caused by a deficiency of the enzyme phenylalanine hydroxylase, which is necessary to convert the essential amino acid phenylalanine to tyrosine. A defect in the single gene that codes for this enzyme therefore results in the multiple phenotypes associated with PKU, including mental retardation, eczema, and pigment defects that make affected individuals lighter skinned.
Polygenic Inheritance in Humans
Height is controlled by polygenes for skeleton height, but their effect may be altered by malnutrition, injury, and disease. Weight, skin color, and intelligence are further examples. Birth defects like clubfoot, cleft palate, or neural tube defects are also the result of multiple gene interactions.
Complex diseases and traits tend to have low heritability (tendency to be inherited) compared to single-gene disorders (e.g. sickle-cell anemia, cystic fibrosis, PKU, hemophilia, and many extremely rare genetic disorders).
Selection
Some genes may be subject to selection, where individuals with advantageous or adaptive traits tend to be more successful reproductively than their peers. When these traits have a genetic basis, selection can increase the prevalence of those traits, because the offspring will inherit them. This may correlate with the organism's ability to survive in its environment.
Several different genotypes (and possibly phenotypes) may then coexist in a population. In this case, their genetic differences are called polymorphisms.
Genetic Mutation
The simplest is the point mutation or substitution: a single nucleotide in the genome is changed (single nucleotide polymorphisms, SNPs).
Other types of mutations include the following:
Insertion. A piece of DNA is inserted into the genome at a certain position.
Deletion. A piece of DNA is cut from the genome at a certain position.
Inversion. A piece of DNA is cut, flipped around, and then re-inserted, thereby converting it into its reverse complement.
Translocation. A piece of DNA is moved to a different position.
Duplication. A copy of a piece of DNA is inserted into the genome.
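Each mutation type can be modeled as a simple operation on a string. The following is a toy sketch, assuming one strand written 5' to 3' as an uppercase string with 0-based positions; all function names are illustrative. Inversion is modeled as replacing the piece with its reverse complement, which is what flipping a double-stranded piece leaves on the strand being read.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def substitute(seq: str, pos: int, base: str) -> str:
    """Point mutation: change the single nucleotide at pos."""
    return seq[:pos] + base + seq[pos + 1:]


def insert(seq: str, pos: int, piece: str) -> str:
    """Insertion: add piece before position pos."""
    return seq[:pos] + piece + seq[pos:]


def delete(seq: str, pos: int, length: int) -> str:
    """Deletion: cut length bases starting at pos."""
    return seq[:pos] + seq[pos + length:]


def invert(seq: str, pos: int, length: int) -> str:
    """Inversion: replace a piece by its reverse complement."""
    piece = seq[pos:pos + length]
    flipped = "".join(COMPLEMENT[b] for b in reversed(piece))
    return seq[:pos] + flipped + seq[pos + length:]


def translocate(seq: str, pos: int, length: int, new_pos: int) -> str:
    """Translocation: cut a piece and re-insert it at new_pos in the remainder."""
    piece = seq[pos:pos + length]
    rest = seq[:pos] + seq[pos + length:]
    return rest[:new_pos] + piece + rest[new_pos:]


def duplicate(seq: str, pos: int, length: int) -> str:
    """Duplication: insert a second copy of a piece right after the original."""
    piece = seq[pos:pos + length]
    return seq[:pos + length] + piece + seq[pos + length:]


genome = "AACCGGTT"
print(invert(genome, 0, 3))     # GTTCGGTT
print(duplicate(genome, 2, 2))  # AACCCCGGTT
```

Placing the duplicate directly after the original (a tandem duplication) is a simplifying choice here; a copy can in principle land anywhere in the genome.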
Mutations and Selection
While mutations can be detrimental to the affected individual, they can also in rare cases be beneficial; more frequently, they are neutral. Often mutations have no or a negligible impact on survival and reproduction.
Thereby mutations can increase the genetic diversity of a population, that is, the number of polymorphisms present. In combination with selection, this allows a species to adapt to changing environmental conditions and to survive in the long term.
Raw Sequence Data
4 bases: A, C, G, T, plus other codes (e.g. N = any, R = G or A (purine), Y = T or C (pyrimidine))
kb (= kbp) = kilo base pairs = 1,000 bp
Mb = mega base pairs = 1,000,000 bp
Gb = giga base pairs = 1,000,000,000 bp
Size:
E. coli: 4.6 Mbp (4,600,000)
Fish: 130 Gbp (130,000,000,000)
Paris japonica (plant): 150 Gbp
Human: 3.2 Gbp
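The units and ambiguity codes above translate directly into lookup tables. This is a small sketch covering only the codes listed on this slide; the names `to_bp` and `matches` are hypothetical helpers.

```python
# Size units from the slide, expressed in base pairs.
UNITS = {"bp": 1, "kb": 1_000, "Mb": 1_000_000, "Gb": 1_000_000_000}

# Only the codes mentioned on the slide; each maps to the bases it can stand for.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def to_bp(value: float, unit: str) -> int:
    """Convert a genome size such as (4.6, 'Mb') to a base-pair count."""
    return round(value * UNITS[unit])


def matches(code: str, base: str) -> bool:
    """True when the concrete base is allowed by the (possibly ambiguous) code."""
    return base in AMBIGUITY[code]


print(to_bp(4.6, "Mb"))   # 4600000
print(to_bp(3.2, "Gb"))   # 3200000000
print(matches("Y", "C"))  # True
```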
Fasta File
A sequence in FASTA format begins with a single-line description (starting with '>'), followed by lines of sequence data (file extension is .fa). It is recommended that all lines of text be shorter than 80 characters in length.
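Reading the format takes only a short loop. The sketch below is a minimal reader for illustration, not a replacement for an established library; the function name `parse_fasta` and the example record are made up.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each description line (without '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence data may be wrapped over several lines; join them.
            records[header] += line
    return records


# Hypothetical two-line record, wrapped the way FASTA files usually are.
example = """>seq1 hypothetical example record
TATAGCCCG
CGGGCTATA
""".splitlines()

print(parse_fasta(example))
```

To read a real file, the same function can be passed an open file handle, since iterating over a handle yields lines.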
Fastq File
Each record typically contains 4 lines (file extension is .fq):
Line 1 begins with a '@' character and is followed by a sequence identifier and an optional description.
Line 2 is the sequence.
Line 3 is the delimiter '+', with an optional description.
Line 4 holds the quality scores, one character per base in line 2.
@SEQ_ID
GATTTGGGGTTCAAAGCTTCAAAGCTTCAAAGC
+
!''*((((***+))%%%++++++++!!!++***
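The four-line layout can be checked and decoded as follows. This is a sketch under one explicit assumption: the quality characters use the common Phred+33 encoding (score = ASCII code minus 33), which the slide does not state. The function names are illustrative.

```python
from typing import List, Sequence, Tuple


def parse_fastq_record(lines: Sequence[str]) -> Tuple[str, str, str]:
    """Validate one 4-line record and return (identifier line, sequence, quality)."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@"):
        raise ValueError("line 1 must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("line 3 must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return header[1:], sequence, quality


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters, ASSUMING Phred+33 unless another offset is given."""
    return [ord(char) - offset for char in quality]


record = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCTTCAAAGCTTCAAAGC",
    "+",
    "!''*((((***+))%%%++++++++!!!++***",
]
name, seq, qual = parse_fastq_record(record)
print(name, len(seq), phred_scores(qual)[:4])
```

Under this assumption the leading '!' decodes to 0, the lowest possible score.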
Proteins: Primary Structure
Peptide sequence: sequence of amino acids = sequences from a 20-letter alphabet (i.e. ACDEFGHIKLMNPQRSTVWY)
Average protein has ~300 amino acids
Typically stored as fasta files
>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY
Proteins: Secondary Structure
Polypeptide chains fold into regular local structures
Common types: alpha helix, beta sheet, turn, loop
Defined by the creation of hydrogen bonds
Proteins: Tertiary Structure
3D structure of a polypeptide sequence
Interactions between non-local and foreign atoms
Proteins: Quaternary Structure
Arrangement of protein subunits
Genes and Proteins
One gene encodes one protein. It begins with a start codon (e.g. ATG); after that, each group of three nucleotides (a codon) codes for one amino acid. A stop codon (e.g. TGA) then signifies the end of the gene.
In the middle of a (eukaryotic) gene, there are segments that are spliced out of the RNA transcript.
Introns: segments that are spliced out.
Exons: segments that are kept.
Detecting the introns and exons is a task for gene finding.
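The start-codon, triplet, stop-codon reading described above can be sketched as a scan over the sequence. This is a deliberately simplified illustration: it ignores introns entirely, looks only at the first ATG on the given strand, and uses the three standard stop codons TAA, TAG, and TGA. The function name is hypothetical.

```python
from typing import List

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_open_reading_frame(dna: str) -> List[str]:
    """Return the codons from the first ATG up to and including the first
    in-frame stop codon, or an empty list if no complete frame is found."""
    start = dna.find(START_CODON)
    if start == -1:
        return []
    codons = []
    # Step through the sequence three bases at a time, starting at the ATG.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons
    return []  # ran off the end without meeting a stop codon


print(first_open_reading_frame("CCATGGCTTGATT"))  # ['ATG', 'GCT', 'TGA']
```

Real gene finders have to do considerably more, since in eukaryotic genes the codons of one protein are spread across several exons.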
Genomics:
- Assembly
- Detection of variation
- GWAS
RNA:
- Gene expression
- Transcriptome assembly
- Pathway analysis
Protein:
- Mass spectrometry
- Structure prediction
- Protein-protein interaction