Bioinformatics Practical Course. 80 Practical Hours


Course Description: This course introduces the fundamental concepts and methods of structural bioinformatics and their advanced applications. Topics covered include sequence, structure, and function databases of DNA and protein molecules; advanced sequence and structure alignment methods; methods for protein folding and protein structure prediction (homology modeling, threading, and ab initio folding); the basics of molecular dynamics and Monte Carlo simulation; the principles and applications of machine learning; and techniques of protein structure determination (X-ray crystallography, NMR, and cryo-EM). Emphasis is on understanding the concepts taught and on their practical use, with the objective of helping students use cutting-edge bioinformatics tools and methods to solve problems in their own research.

Prerequisite: Good computer programming skills (in at least one language, such as MATLAB) are highly recommended.

Instructor: Dr. Qais Yousef, Ph.D. in Computer Engineering. Specialized in bioinformatics and enabling systems, with numerous research projects in this field.

Schedule and location: Three months. The practical course will be held at ATIT Academy.

Projects and Assignments: Each session will include a practical project. In addition, there will be homework assignments, including code writing and literature reading.

Textbook: No single reference textbook is used for this course; related materials will be shared with students.

Table of Contents

1. Bioinformatics databases
  1.1. Introduction
    Motivation
    Central dogma of life
    Types of bioinformatics databases
  1.2. Nucleotide sequence databases
    EMBL
    GenBank
    DDBJ
  1.3. Protein amino acid sequence databases
    How protein sequences are determined
      DNA/mRNA coding
      Edman degradation reaction
      Mass spectrometry
    SwissProt/TrEMBL
    PIR
    UniProt
      UniProtKB/Swiss-Prot and UniProtKB/TrEMBL
      UniParc
      UniRef
  1.4. Protein structure databases
    History of structural biology
    Protein Data Bank
    SCOP
    CATH
  1.5. Protein function databases
    Pfam: protein family database
    GO: gene ontology
    PROSITE: protein function pattern and profile
    ENZYME: Enzyme Commission
    BioLiP: ligand-protein binding interaction

2. Pair-wise sequence alignments and database search
  2.1. Biological motivation: why sequence alignment?
  2.2. What is a sequence alignment?
    Scoring matrix
      PAM
      BLOSUM
    Gap penalty
  2.3. Dynamic programming
    Needleman-Wunsch: global alignment algorithm
    Smith-Waterman: local alignment algorithm
    Gotoh algorithm
  2.4. Heuristic methods
    FASTA
    BLAST
  2.5. Statistics of sequence alignment score
    E-value
    P-value

3. Phylogenetic tree & multiple sequence alignments
  3.1. Neighbor-joining method and phylogenetic tree
  3.2. How to construct multiple sequence alignments?
    ClustalW
    PSI-BLAST
      PSI-BLAST pipeline
      Profile pseudocount
      PSSM: position-specific scoring matrix
      Installing and running PSI-BLAST programs
      Interpreting PSI-BLAST output
    Hidden Markov Models
      Viterbi algorithm
      HMM-based multiple sequence alignment
      Creating HMM by iteration
      HMMER
      SAM
  3.3. Sequence profile & profile-based alignments
    What is a sequence profile?
    Henikoff weighting scheme
    Profile-to-sequence alignment
    Profile-to-profile alignment

4. Protein structure alignments
  4.1. Structure superposition versus structural alignment
  4.2. Structure superposition methods
    RMSD
    TM-score
  4.3. Structure alignment methods
    DALI
    CE
    TM-align
  4.4. How to define the fold of proteins?
  4.5. Number of protein folds in the PDB

5. Protein secondary structure predictions
  5.1. What is protein secondary structure?
  5.2. Hydrogen bond
  5.3. How to define a secondary structure element?
  5.4. Basics of machine learning and neural network methods
  5.5. Methods for predicting secondary structure
    Chou and Fasman method
    PHD
    PSIPRED
    PSSpred

6. Introduction to Monte Carlo Simulation
  6.1. Introduction: why Monte Carlo simulation?
  6.2. Monte Carlo Sampling of Probabilities
    Random number generator
    How to test a random number generator?
    Sampling of rectangular distributions
    Sampling of probability distribution
      Reverse transform method
      Rejection sampling method
  6.3. Boltzmann distribution
  6.4. Metropolis method
  6.5. Advanced Metropolis methods
    Replica exchange simulation
    Simulated annealing

7. Protein folding and protein structure modeling
  7.1. Basic concepts
  7.2. Ab initio modeling
    Anfinsen thermodynamic hypothesis
    Molecular dynamics simulation
      CHARMM
      AMBER
    Knowledge-based free modeling
      Bowie-Eisenberg approach
      ROSETTA
      QUARK
    Why is beta-protein so difficult to fold?
  7.3. Comparative modeling (homology modeling)
    Principle of homology modeling
    PSI-BLAST
    MODELLER
  7.4. Threading and fold recognition
    What is threading?
    Threading programs
      Bowie-Luthy-Eisenberg
      HHpred
      MUSTER
    Meta-server threading
      3D-Jury
      LOMETS
  7.5. Combined modeling approaches
    TASSER/I-TASSER
      Force field design
      Search engine: replica-exchange Monte Carlo simulation
      Major issues and recent development
  7.6. CASP: a blind test on protein structure predictions

8. Protein function and structure-based function annotation
  8.1. Gene ontology
  8.2. Enzyme classification
  8.3. Ligand-protein interaction
  8.4. Structure-based function prediction
    ConCavity
    FINDSITE
    COFACTOR
    COACH

9. Principle of X-ray Crystallography & Molecular Replacement
  9.1. What is X-ray crystallography?
  9.2. Why can a wave be represented by exp(iα)?
  9.3. How to calculate scattering on two electrons?
  9.4. What is the Laue condition?
  9.5. What is Bragg's law?
  9.6. How to calculate electron density of crystal?
  9.7. What is the Patterson function?
  9.8. How to calculate electron density of crystal?
  9.9. What is the idea of Molecular Replacement?
  9.10. How to judge quality of MR?
  9.11. What software is often used for MR?

10. Introduction to nuclear magnetic resonance (NMR)
  Basic magnetic property of nuclei
  Magnetic moment
  Nuclei in external magnetic field
  Nuclear shielding of magnetic field
  Chemical shift
  NMR spectrum
  Correlation spectroscopy (COSY)
  Heteronuclear single-quantum correlation spectroscopy (HSQC)
  Nuclear Overhauser effect spectroscopy (NOESY)
  From NOE to 3D structure model