Hybrid Intelligent Systems for DNA Microarray Data Analysis
1 Hybrid Intelligent Systems for DNA Microarray Data Analysis November 27, 2007 Sung-Bae Cho Computer Science Department, Yonsei University Soft Computing Lab
2 What Do I Think about Bioinformatics?
[Diagram: biological objects are treated as a black box between cause and function. Expression data feed modeling, clustering, and classification for disease identification. Goals: predict cancer (classify disease), drug design (personal medicine), identify risk factors. Approach: optimal features and classifiers, ensemble approach.]
3 Acknowledgements
Bioinformatics team members (including OBs): C.-H. Park, K.-J. Kim, J.-H. Hong, H.-S. Park, S.-H. Yoo, H.-H. Won, J. Ryu, and H.-J. Kwon
4 Outline Overview of DNA microarray technology Classification Comprehensive comparisons Ensemble approaches S.-B. Cho, Soft Computing Lab 4
5 DNA Microarray Technology Soft Computing Lab
6 Data Mining in Biological Data
- cells in the human body
- 3 × 10^9 letters in the DNA code of every cell in the human body
- Only 0.2% of the code differs between humans
- Human DNA is 98% identical to that of chimpanzees
- 97% of human DNA has no known function
Bioinformatics: solving problems arising from biology using methodology from computer science
- Applications: drug design, identification of risk factors, personal medicine, etc.
- Related topics: classification, clustering, gene modeling, gene identification
7 New Paradigm in Biology
- Before: one-gene analysis, very slow, local analysis
- With microarray technology: analysis of thousands of genes, very fast, global analysis
- Consequence: computational methods are needed (machine learning)
8 Overview (DNA Microarray)
- DNA microarray: a chip or slide that has been printed with a large number of DNA spots
- DNA microarray technology: enables the simultaneous analysis of thousands of gene expression levels for genetic and genomic research and for diagnostics
- Gene: a sequence of DNA that includes genetic information
- Two major techniques
  - Hybridization method: cDNA microarray / oligonucleotide microarray
  - Sequencing method: serial analysis of gene expression (SAGE)
9 Data Acquisition (DNA Microarray)
[Figure: the microarray images of sample 1, sample 2, and sample 3 are accumulated into a gene expression data matrix (genes × samples); colors in the images become numbers in the matrix.]
Each entry of the matrix is the log ratio of the two dye intensities:
$$\log_2 \frac{\mathrm{Int}(\mathrm{Cy5})}{\mathrm{Int}(\mathrm{Cy3})}$$
Microarray data consist of a large number of genes measured in a small number of samples.
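As an illustration of the log-ratio step on this slide, here is a minimal sketch in Python. The intensity values and the names `log_ratio`, `cy5`, `cy3`, and `expression` are hypothetical and do not come from the talk; real pipelines usually also apply background correction and normalization, which are not shown.

```python
import math

def log_ratio(cy5_intensity, cy3_intensity):
    """Return log2(Int(Cy5) / Int(Cy3)) for one spot."""
    if cy5_intensity <= 0 or cy3_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5_intensity / cy3_intensity)

# Hypothetical spot intensities: rows are genes, columns are samples.
cy5 = [[800.0, 200.0], [100.0, 400.0]]
cy3 = [[200.0, 200.0], [400.0, 100.0]]

# Gene expression data matrix (genes x samples) of log ratios.
expression = [
    [log_ratio(r, g) for r, g in zip(row5, row3)]
    for row5, row3 in zip(cy5, cy3)
]
```

A positive entry means the Cy5 channel is brighter for that gene and sample, a negative entry means the Cy3 channel is brighter, and zero means equal intensity.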
10 Example (DNA Microarray)
A part of the Leukemia dataset, before log transformation (Golub et al., 1999)
[Table: columns are Gene Description, Gene Accession Number, and one column per sample (labeled AML, AML, ALL, AML, AML, ALL). The expression values are not preserved in this transcript.]
Gene descriptions shown:
- GB DEF = BAC clone RG293F11 from 7q21-7q22, complete sequence
- Metabotropic glutamate receptor 8 mRNA
- WUGSC:H_GS188P18.1a gene extracted from Human BAC clone GS188P18
- A-589H1.1 from Homo sapiens Chromosome 16 BAC clone CIT987-SKA-589H1 ~complete genomic sequence, complete sequence. /ntype=DNA /annot=mRNA
- WUGSC:DJ515N1.2 gene extracted from Human PAC clone DJ515N1 from 22q11.2-q22
- GUANINE NUCLEOTIDE-BINDING PROTEIN G(T), ALPHA-1 SUBUNIT
- GB DEF = PAC clone DJ525N14 from Xq23, complete sequence
- COX6B gene (COXG) extracted from Human DNA from overlapping chromosome 19 cosmids R31396, F25451, and R31076 containing COX6B and UPKA, genomic sequence
- F25451_3 gene extracted from Human DNA from overlapping chromosome 19 cosmids R31396, F25451, and R31076 containing COX6B and UPKA, genomic sequence
- UPKA gene extracted from Human DNA from overlapping chromosome 19 cosmids R31396, F25451, and R31076 containing COX6B and UPKA, genomic sequence
Gene accession numbers shown: AC000066_at, AC000099_at, AC000115_cds1_at, AC002045_xpt1_at, AC002073_cds1_at, AC002077_at, AC002086_at, AC002115_cds1_at, AC002115_cds3_at, AC002115_cds4_at
11 Two Types of Data DNA Microarray Single time point in different states States : disease or tumor type Goal : classifying samples using informative genes Can be used for gene identification Feature selection/extraction Classification problem Monitoring each gene in multiple times Time series data Goal : identifying functionally related genes Can be used for gene regulatory network Clustering problem S.-B. Cho, Soft Computing Lab 11
12 Challenges (DNA Microarray)
- Noise
  - Microarray data contain a high level of noise due to experimental procedures
  - The labeling of cDNA and the scanning of the slides frequently show non-linear characteristics
- Sparseness
  - Microarray data are sparse
  - Several thousand genes are monitored, while the number of samples is often restricted to hundreds or fewer
- High redundancy
  - Many genes are highly correlated, which leads to redundancy in the data
  - Adding coexpressed genes to the classification system does not add information to the system
13 Classification Comprehensive comparisons Ensemble approaches Soft Computing Lab
14 Motivation
- Many researchers have studied cancer classification using gene expression profiles, each attempting to propose the optimal classification technique
- A thorough evaluation of the possible methods for analyzing gene expression data is needed
- Several microarray datasets are available: leukemia cancer dataset, colon cancer dataset, lymphoma dataset, breast cancer dataset, NCI60 dataset, and ovarian cancer dataset
- Three datasets are used in our study
  - Leukemia cancer dataset
  - Colon cancer dataset
  - Lymphoma cancer dataset
15 Classification Scheme DNA microarray data Selected features Class 1 Class 2 Feature selection Classification S.-B. Cho, Soft Computing Lab 15
16 Overview (Feature Selection)
- Selecting informative features appropriate to a specific goal
  - Also called variable selection or gene selection
- Microarray data consist of a large number of genes in a small number of samples
  - Not all genes are needed for classification
  - It is essential to select the genes that are highly related to particular classes; these are called informative genes (Golub et al., 1999)
- Many selection/extraction techniques are based on measures
  - Correlation-based measures
  - Similarity-based measures
  - Information theory-based measures
  - Principal component analysis
17 Top 50 Genes Selected (Feature Selection)
[Figure: expression of the top 50 genes selected by Pearson's correlation (PC) on the Leukemia dataset; axes are gene and sample, with samples grouped into ALL and AML.]
18 Rank-based Selection (Feature Selection)
- Representative feature selection method
- Genes are selected according to the order of their significance
[Figure: genes 1 to 4 with their significance values and the resulting selection order: Gene 3, Gene 2, Gene 4, Gene 1.]
How can we calculate the significance?
19 Correlation Measures (Feature Selection)
- Measuring how much each gene is correlated with the class
- Ideal gene vector (class pattern): g_ideal = (0, 0, 0, ..., 1, 1, 1), with 0 for class 1 samples and 1 for class 2 samples
- Pearson correlation coefficients (PC): parametric
- Spearman correlation coefficients (SC): non-parametric
[Figure: three scatter plots of Feature 2 against Feature 1 showing negative correlation, positive correlation, and no correlation.]
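The rank-based selection with a correlation measure can be sketched as follows. This is only an illustration under stated assumptions: the gene names, expression values, and the helpers `pearson` and `rank_genes` are hypothetical, and genes are ranked by their signed Pearson correlation with the ideal vector (ranking by absolute value would be another reasonable choice).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    cov = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    var_x = sum(a * a for a in x) - sx * sx / n
    var_y = sum(b * b for b in y) - sy * sy / n
    product = var_x * var_y
    return cov / math.sqrt(product) if product > 0 else 0.0

def rank_genes(expression, ideal, top_k):
    """Return the top_k gene names most correlated with the ideal vector."""
    scores = {gene: pearson(values, ideal) for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical data: three samples of class 1 followed by three of class 2.
ideal = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [0.1, 0.2, 0.1, 0.9, 0.8, 1.0],  # follows the class pattern
    "gene_b": [0.9, 1.0, 0.8, 0.1, 0.2, 0.1],  # opposite to the class pattern
    "gene_c": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # unrelated to the class
}
top = rank_genes(expression, ideal, top_k=2)
```

Here `gene_a` ranks first because its profile rises exactly where the ideal vector switches from 0 to 1.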
20 Similarity Measures (Feature Selection)
- Calculating the geometrical similarity between the ideal gene vector and each gene vector
- Euclidean distance (ED): geometric distance d between the two vectors
- Cosine coefficient (CC): difference of direction, the angle θ between the two vectors
21 Information Theoretic Measures (Feature Selection)
- Measuring feature goodness based on the frequency with which the feature satisfies condition Q (whether genes are induced or not)
- Using the frequency, or the mean and standard deviation, of the data to calculate the significance of genes
- Information gain (IG)
- Mutual information (MI)
- Signal to noise ratio (SN)
[Figure: two class distributions with means µ1, µ2 and standard deviations σ1, σ2.]
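22 Mathematical Definitions (Feature Selection)
Pearson's correlation coefficient (PC):
$$r_{pearson} = \frac{\sum XY - \frac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{N}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{N}\right)}}$$
Spearman's correlation coefficient (SC):
$$r_{spearman} = 1 - \frac{6 \sum (D_x - D_y)^2}{N(N^2 - 1)}$$
Euclidean distance (ED):
$$r_{euclidean} = \sqrt{\sum (X - Y)^2}$$
Cosine coefficient (CC):
$$r_{cosine} = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$$
Signal to noise ratio (SN):
$$SN(g) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)}$$
Information gain (IG) and mutual information (MI): IG(g, c) and MI(g, c) are defined with logarithms over the quantities A, B, C, D and the class probability P(c); the exact expressions are not recoverable from this transcript.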
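Three of the measures named on this slide are easy to sketch in plain Python. The function names below are hypothetical, and using the sample standard deviation (`statistics.stdev`) for σ is an assumption, since the slide does not say which estimator was used.

```python
import math
from statistics import mean, stdev

def euclidean(x, y):
    """Euclidean distance (ED) between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine(x, y):
    """Cosine coefficient (CC): 1 for same direction, 0 for orthogonal."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / norm if norm > 0 else 0.0

def signal_to_noise(class1_values, class2_values):
    """Signal to noise ratio (SN) of one gene: (mu1 - mu2) / (sigma1 + sigma2).

    Assumption: sigma is the sample standard deviation.
    """
    mu1, mu2 = mean(class1_values), mean(class2_values)
    s1, s2 = stdev(class1_values), stdev(class2_values)
    return (mu1 - mu2) / (s1 + s2)

# Hypothetical expression values of one gene in two classes.
sn = signal_to_noise([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A large absolute SN means the class means are far apart relative to the spread within each class, which is why the measure can be used to rank genes.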
23 Principal Component Analysis (Feature Selection)
- Widely used for dimensionality reduction
- Given N vectors in k dimensions, find c (<= k) orthogonal vectors that best represent the data
  - The original data set is reduced to one consisting of N vectors on c principal components (reduced dimensions)
  - Each vector is a linear combination of the c principal components
- Principal components are directions of variance, ordered from the highest
  - The first principal component (PC) is the direction of maximum variance, the second is that of the next highest variance, etc.
$$t_{ij} = \sum_{k=1}^{n} p_{ik} m_{kj}$$
- n: the number of significant principal components
- p_ik: the score of sample i on component k
- m_kj: the loading on component k of variable j
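The sum on this slide is a plain matrix product of scores and loadings. The sketch below only evaluates that sum for hypothetical score and loading matrices; it does not compute the principal components themselves, which would require an eigendecomposition or SVD. The name `reconstruct` and all values are assumptions made for illustration.

```python
def reconstruct(scores, loadings):
    """Evaluate t_ij = sum over k of p_ik * m_kj.

    scores:   samples x components (p_ik)
    loadings: components x variables (m_kj)
    """
    n_components = len(loadings)
    n_variables = len(loadings[0])
    return [
        [sum(row[k] * loadings[k][j] for k in range(n_components))
         for j in range(n_variables)]
        for row in scores
    ]

# Hypothetical example: 2 samples, 2 components, 3 variables.
# The two loading vectors are orthonormal, as principal directions are.
loadings = [[1.0, 0.0, 0.0],
            [0.0, 0.6, 0.8]]
scores = [[2.0, 5.0],
          [-1.0, 0.0]]
approx = reconstruct(scores, loadings)
```

With n smaller than the number of variables, `approx` is the part of the data explained by the first n components, which is what makes the scores usable as reduced features.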
24 Overview Classifier Supervised learning Need reliable and precise classification essential for successful cancer treatment Current methods for classifying human malignancies rely on a variety of morphological, clinical and molecular variables Uncertainties in diagnosis remain; likely that existing classes are heterogeneous Characterize molecular variations among tumors by monitoring gene expression (microarray) Hope: microarrays will lead to more reliable tumor classification (and therefore more appropriate treatments and better outcomes) Class 1 Decision boundary Class 2 S.-B. Cho, Soft Computing Lab 24
25 Classifiers Classifier Multilayer perceptron K-nearest neighbor Support vector machine Decision tree Structure adaptive self-organizing map S.-B. Cho, Soft Computing Lab 25
26 Multilayer Perceptron (Classifier)
- Weights are updated recursively, layer by layer, to minimize the error between the actual and the desired output (backpropagation)
- The update of the synaptic weights and biases is local
- Computing all the partial derivatives of the cost function with respect to these free parameters is efficient
[Figure: network with inputs x_1, x_2, x_3, ..., x_N (input layer), a hidden layer, and outputs o_1, o_2 (output layer), connected by weights w_11, w_21, ..., w_KN.]
27 K-Nearest Neighbor (Classifier)
- One of the most common methods in memory-based induction
- The label of an unknown sample is decided from its similarity to the k nearest known exemplars
$$P(X, c_j) = \sum_{d_i \in kNN} Sim(X, d_i)\, P(d_i, c_j) - b_j$$
- Sim(X, d_i): Pearson's correlation similarity function
- k: number of neighbors
- b_j: a bias term
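A minimal sketch of this scoring rule follows. It assumes that P(d_i, c_j) is 1 when exemplar d_i belongs to class c_j and 0 otherwise, and that the bias b_j is subtracted from the class score; both are assumptions, as are the names `pearson` and `knn_scores` and the toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation, used here as the similarity Sim(X, d_i)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    cov = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    var_x = sum(a * a for a in x) - sx * sx / n
    var_y = sum(b * b for b in y) - sy * sy / n
    product = var_x * var_y
    return cov / math.sqrt(product) if product > 0 else 0.0

def knn_scores(x, training, k, bias=None):
    """Score each class by the summed similarity of the k nearest exemplars.

    training: list of (vector, label) pairs.
    bias: optional dict mapping label -> b_j (assumed to be subtracted).
    """
    bias = bias or {}
    neighbours = sorted(training, key=lambda item: pearson(x, item[0]),
                        reverse=True)[:k]
    labels = {label for _, label in training}
    return {
        c: sum(pearson(x, d) for d, label in neighbours if label == c)
           - bias.get(c, 0.0)
        for c in labels
    }

# Hypothetical exemplars with four genes each.
training = [
    ([1, 2, 3, 4], "ALL"),
    ([2, 4, 6, 9], "ALL"),
    ([4, 3, 2, 1], "AML"),
    ([9, 5, 4, 1], "AML"),
]
scores = knn_scores([1, 3, 5, 7], training, k=3)
predicted = max(scores, key=scores.get)
```

The predicted class is the one with the highest score, so a neighbor with a strongly negative correlation lowers the score of its own class.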
28 Support Vector Machine (Classifier)
- Introduced by Vapnik in 1995
- Constructs a hyperplane as the decision surface in such a way that the margin of separation between positive and negative examples is maximized
- Given a labeled set of M training samples (X_i, y_i), where X_i ∈ R^N and y_i ∈ {-1, 1} is the associated label, the discriminant hyperplane is defined by:
$$f(X) = \sum_{i=1}^{M} y_i \alpha_i k(X, X_i) + b$$
- Linear and RBF kernels are used
29 Decision Tree (Classifier)
- A graph (tree) based model used primarily for classification
- Popular method for inductive inference
  - A method for approximating discrete-valued target functions
- Easy to convert a learned tree into if-then rules
[Figure: example tree that splits on P2 (<= 0.03 or > 0.03), P21 (<= 0.2 or > 0.2), and P32 (<= 0.22 or > 0.22), with leaves labeled tumor or normal.]
30 Structure Adaptive SOM (Classifier)
- Dynamic node-splitting classifier based on the self-organizing map (SOM)
- Overcomes a shortcoming of SOM: the structure of the nodes does not have to be determined before training
[Figure: a map with nodes P0 to P4 in which node P4 is split into child nodes C0 to C3.]
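31 Classification Performance (Comparisons)
[Table: classification performance on the Lymphoma cancer dataset. Feature selection methods: PC, SC, ED, CC, IG, MI, SN, and average. Classifiers: MLP, SASOM, SVM (linear, RBF), KNN (cosine, Pearson), and average. The values are not preserved in this transcript.]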
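32 Classification Performance (Comparisons)
[Table: classification performance on the Colon cancer dataset. Feature selection methods: PC, SC, ED, CC, IG, MI, SN, and average. Classifiers: MLP, SASOM, SVM (linear, RBF), KNN (cosine, Pearson), DT, and average. The values are not preserved in this transcript.]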
33 Classification Comprehensive comparisons Ensemble approaches Soft Computing Lab
34 Overview Ensemble Classifier Limitation of machine learning classifiers in solving practical problems Incomplete dataset Noise in data Imperfection of classification algorithm Solution Searching for effective features of input patterns Utilizing multiple features Providing multiple pathways (more chance) to the optimal solution Improving classification performance Combining multiple classifiers Combining several prospective models may produce better prediction S.-B. Cho, Soft Computing Lab 34
35 Rationale Ensemble Classifier Feature space Selected feature Solution space Φ 1 F 1 Φ 2 F 2 High and complex space Φ F 3 3 Feature selection Classification Optimal solution Estimated solution by ensemble S.-B. Cho, Soft Computing Lab 35
36 Ensemble Approach (Ensemble Classifier)
- A good ensemble includes base classifiers that
  - Are accurate (easy)
  - Make their errors in different parts of the problem domain (difficult)
- Issues for ensemble classifiers
  - How to generate good base classifiers: from combinations of features and classifiers
  - How to combine the base classifiers: majority voting, weighted voting, Borda count, BKS, etc.
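Three of the combination rules listed on this slide can be sketched as below. The function names, labels, and weights are hypothetical; the weights could be, for example, validation accuracies, and ties are broken by first occurrence, which is an arbitrary choice of this sketch.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

def borda_count(rankings):
    """Combine class rankings (best first) from several classifiers.

    A class in position p of a ranking of length n receives n - 1 - p points.
    """
    points = {}
    for ranking in rankings:
        for position, label in enumerate(ranking):
            points[label] = points.get(label, 0) + len(ranking) - 1 - position
    return max(points, key=points.get)

# Hypothetical outputs of three base classifiers for one sample.
predictions = ["tumor", "normal", "tumor"]
by_majority = majority_vote(predictions)
by_weight = weighted_vote(predictions, [0.6, 0.95, 0.3])
by_borda = borda_count([["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]])
```

The example shows that the rules can disagree: two weak classifiers win the majority vote, while one strong classifier wins the weighted vote.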
37 Ensemble Generation (Ensemble Classifier)
- Feature selection methods (m): Pearson correlation coefficients (PC), Spearman correlation coefficients (SC), cosine coefficients (CC), Euclidean distance (ED), information gain (IG), mutual information (MI), signal to noise ratio (SN), principal component analysis (PCA)
- Classifiers (n): multilayer perceptron (MLP), K-nearest neighbor (KNN(C), KNN(P)), support vector machine (SVM(L), SVM(R)), structure adaptive self-organizing map (SASOM)
- Combining them gives m × n feature-classifier pairs: pair 1, pair 2, ..., pair mn
- Huge number of available ensembles: 2^{mn}
38 Ensemble Strategies Ensemble Classifier Mutually exclusive features Negatively correlated features Combinatorial ensemble GA optimization Speciated GA optimization S.-B. Cho, Soft Computing Lab 38
39 Overview Mutually Exclusive Features Combining classifiers with mutually exclusive features through the analysis of correlation of features Input pattern Feature a mutually exclusive Feature b MLP KNN SVM linear SVM RBF MLP KNN SVM linear SVM RBF Combining module S.-B. Cho, Soft Computing Lab 39
40 Classification Rates (Mutually Exclusive Features)
[Figure: recognition rate [%] on the Leukemia dataset for MLP, KNN, SVM RBF, SVM linear, KNN cosine, SOM, and DT; the plotted values are not preserved in this transcript.]
41 Correlation of Features (Mutually Exclusive Features)
- Pearson's correlation between features has been calculated
- Three representative cases of correlation, among Euclidean distance, signal to noise ratio, cosine coefficient, and Pearson's correlation:
  - (a) Negative correlation (coefficient: -0.52)
  - (b) Neutral (coefficient: -0.03)
  - (c) Positive correlation (coefficient: 0.80)
42 Comparison of Accuracy (Mutually Exclusive Features)
[Figure: recognition accuracy [%] of the neural network and majority voting combiners for case (a) negative correlation, case (b) neutral, case (c) positive correlation, and all features; the plotted values are not preserved in this transcript.]
43 Overview (Negatively Correlated Features)
- Idea
  - With two ideal gene vectors, select features whose expression patterns are similar to one of the ideal gene vectors
  - Train classifiers with the two feature sets and combine them
- Method
  - Sim(X, Y): similarity between vectors X and Y
  - Ideal gene vector A = (1, 1, 1, ..., 0, 0, 0); SGS I is the gene set whose expression pattern is similar to it:
    SGS I = argmax{Sim(gene_i, Ideal Gene Vector A)}
  - Ideal gene vector B = (0, 0, 0, ..., 1, 1, 1); SGS II is the gene set whose expression pattern is similar to it:
    SGS II = argmax{Sim(gene_i, Ideal Gene Vector B)}
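The selection of the two gene sets can be sketched as follows, assuming Pearson correlation as Sim and taking the top-ranked genes for each ideal vector. The function names, gene names, and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation, used here as Sim(X, Y)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    cov = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    var_x = sum(a * a for a in x) - sx * sx / n
    var_y = sum(b * b for b in y) - sy * sy / n
    product = var_x * var_y
    return cov / math.sqrt(product) if product > 0 else 0.0

def select_gene_sets(expression, n_class1, n_per_set):
    """Return (SGS I, SGS II): genes most similar to ideal vectors A and B."""
    n_samples = len(next(iter(expression.values())))
    ideal_a = [1] * n_class1 + [0] * (n_samples - n_class1)
    ideal_b = [0] * n_class1 + [1] * (n_samples - n_class1)

    def top(ideal):
        return sorted(expression,
                      key=lambda g: pearson(expression[g], ideal),
                      reverse=True)[:n_per_set]

    return top(ideal_a), top(ideal_b)

# Hypothetical data: 3 samples of the first class, then 3 of the second.
expression = {
    "g1": [5, 6, 5, 1, 1, 2],
    "g2": [1, 1, 2, 6, 5, 6],
    "g3": [4, 5, 4, 2, 1, 1],
    "g4": [3, 3, 3, 3, 3, 4],
}
sgs1, sgs2 = select_gene_sets(expression, n_class1=3, n_per_set=2)
```

Because vector B is the complement of vector A, a gene that correlates strongly with one correlates negatively with the other, which is why the two sets give the negatively correlated feature sets that the slide describes.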
44 Example Negatively Correlated Features Ideal Gene A (1,1,1,1,1,1,0,0,0,0,0,0) Ideal Gene B (0,0,0,0,0,0,1,1,1,1,1,1) Negative Gene 1 Correlation Gene 1' Gene 2 Gene 2' S.-B. Cho, Soft Computing Lab 44
45 Selected Features Negatively Correlated Features Leukemia dataset Pearson correlation coefficients ALL AML gene_3320 gene_4847 gene_2020 gene_1745 gene_5039 gene_1834 gene_461 gene_4196 gene_3847 gene_2288 gene_1249 gene_6201 gene_2242 gene_3258 gene_1882 gene_2111 gene_2121 gene_6200 gene_6373 gene_6539 gene_2043 gene_2759 gene_6803 gene_1674 gene_2402 gene_5772 gene_2301 gene_6055 gene_387 gene_4167 gene_4230 gene_6990 gene_4328 gene_6281 gene_5593 gene_2543 gene_1306 gene_6064 gene_2050 gene_3386 gene_2441 gene_4289 gene_4389 gene_1928 gene_515 gene_2354 gene_6471 gene_6515 gene_149 gene_3070 SGS II SGS I S.-B. Cho, Soft Computing Lab 45
46 PCA 3D Plot Negatively Correlated Features Select 25 genes from SGS I + 25 genes from SGS II by Pearson correlation coefficients and extract 3 principal components Well classifying AML and ALL Third PC Second PC First PC Red : ALL Blue : AML S.-B. Cho, Soft Computing Lab 46
47 Comparison of Performance (Negatively Correlated Features)
[Table: accuracy (%), sensitivity (%), and specificity (%) of MLP I, MLP II, and MLP I + MLP II on the Leukemia, Colon, and Lymphoma datasets. The values are not preserved in this transcript.]
48 Overview (Combinatorial Ensemble)
- In theory, a good ensemble should include base classifiers that
  - Are accurate
  - Make their errors in different parts of the problem domain
- In practice
  - It is easy to obtain weak classifiers whose accuracy is about 50%
  - It is very difficult to get uncorrelated classifiers
  - A large number of classifiers does not guarantee good ensemble performance
- Approach: test ensembles combinatorially up to a promising ensemble size instead of testing all available ensembles
49 Structure (Combinatorial Ensemble)
[Diagram: gene expression data pass through feature selection methods F_1, F_2, F_3, ..., F_i and classifiers C_1, C_2, C_3, ..., C_j to form n feature-classifier sets F_1 C_1, F_1 C_2, ..., F_i C_j. A combinatorial selection (nC5) picks members, and the ensemble method outputs the predicted class c.]
50 Comparison of Accuracy (Combinatorial Ensemble)
[Table: accuracy on the Leukemia, Colon, and Lymphoma datasets by combining method (majority voting, weighted voting, Bayesian combination) and number of classifiers, including "All". The values are not preserved in this transcript.]
Using all classifiers is less accurate; using 7 is expensive.
51 Overview (GA Optimization)
- Many ensembles can be built from several classifiers
  - The number increases exponentially with the number of classifiers
  - 48 base feature-classifier pairs make 2^48 ensembles
  - Exhaustive search is very time-consuming
- Use a GA to find an optimal ensemble in a short time
  - The ensemble is built from 48 base feature-classifier pairs, obtained from 8 feature selection methods and 6 classifiers
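The size of this search space is simple arithmetic, worked through below. The evaluation rate of one million ensembles per second is a purely hypothetical figure chosen to give a sense of scale.

```python
n_feature_selectors = 8
n_classifiers = 6

# Each feature selection method is paired with each classifier.
n_pairs = n_feature_selectors * n_classifiers

# Each pair is either in or out of an ensemble, so there are 2^48 subsets
# (this count includes the empty subset).
n_ensembles = 2 ** n_pairs

# Hypothetical rate: one million ensembles evaluated per second.
seconds = n_ensembles / 1_000_000
years = seconds / (365 * 24 * 3600)
```

Even at that optimistic rate the exhaustive search would take close to nine years, which motivates a heuristic search such as a GA.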
52 Structure GA Optimization Normalized Gene Expression Profiles Feature Selector 1 Feature Selector 2... Feature Selector m feature-classifier pairs Classifier 1... Classifier Classifier n fitness evaluation x x o... GA searching x o x Ensemble Cancer Normal S.-B. Cho, Soft Computing Lab 52
53 GA Chromosome (GA Optimization)
[Figure: a 48-bit chromosome (genotype) in which each bit switches one feature-classifier pair (phenotype) on or off, for example 0 CC-MLP, 1 ED-MLP, 1 IG-MLP, 0 MI-MLP, 0 PC-MLP, 1 PCA-MLP, 0 SN-MLP, 0 SC-MLP, ..., SC-SVM(RBF). The results of the selected feature-classifier pairs are combined by majority voting, and the ensemble result is compared with the actual class.]
Fitness of a chromosome ch:
$$Fit(ch) = \frac{\#\text{ of correctly classified samples by } ch}{\#\text{ of total classified samples by } ch}$$
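The fitness evaluation of a chromosome can be sketched as follows. This is only an illustration: it uses 4 feature-classifier pairs instead of 48, the predictions are invented, and ties in the majority vote are broken by first occurrence, which the slide does not specify.

```python
from collections import Counter

def ensemble_predict(chromosome, pair_predictions, sample_index):
    """Majority vote of the pairs whose bit is 1, for one sample."""
    votes = [preds[sample_index]
             for bit, preds in zip(chromosome, pair_predictions) if bit == 1]
    if not votes:
        return None  # an empty ensemble classifies nothing correctly
    return Counter(votes).most_common(1)[0][0]

def fitness(chromosome, pair_predictions, actual):
    """Fraction of samples that the selected ensemble classifies correctly."""
    correct = sum(
        ensemble_predict(chromosome, pair_predictions, i) == label
        for i, label in enumerate(actual)
    )
    return correct / len(actual)

# Hypothetical predictions of 4 feature-classifier pairs on 4 samples.
pair_predictions = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
actual = [1, 1, 0, 0]

fit_three = fitness([1, 1, 1, 0], pair_predictions, actual)
fit_single = fitness([0, 1, 0, 0], pair_predictions, actual)
```

In this toy case the three-member ensemble classifies every sample correctly although two of its members each make errors, because their errors fall on different samples.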
54 Change of Average Fitness GA Optimization Fitness Iteration Increase until the number of iterations reaches 150 Saturated after 150 iterations S.-B. Cho, Soft Computing Lab 54
55 Leave-one-out Cross-Validation (GA Optimization)
[Figure: accuracy (%) on training, validation, and test data, shown as average and range, for the ensemble and for the single classifier on the Lymphoma and Colon datasets; the plotted values are not preserved in this transcript.]
The optimal ensemble found by the GA outperforms the single classifier.
56 Comparison of Accuracy (GA Optimization)
[Figure: accuracy (up to 100) by experiment: best single classifier; ensemble of good classifiers; best ensemble among 1 million random ensembles; best ensemble among 1 million (simple GA, sharing); best ensemble among 1 million (crowding). The plotted values are not preserved in this transcript.]
GA > best single classifier > ensemble of good classifiers
57 Some Optimal Ensembles (GA Optimization)
Majority voting
Feature-classifier pair    Accuracy (%)
CC-KNN(P)                  75.0
MI-KNN(C)                  83.3
SN-KNN(C)                  79.2
SC-SASOM                   62.5
IG-SVM(L)                  91.7
Ensemble                   100
Weighted voting
Feature-classifier pair    Accuracy (%)
IG-KNN(C)                  91.7
MI-KNN(C)                  83.3
SN-KNN(C)                  79.2
SN-KNN(P)                  79.2
CC-SASOM                   54.2
IG-SASOM                   83.3
PC-SVM(R)                  62.5
Ensemble                   100
58 Overview (Speciated GA Optimization)
- Among all the 2^{mn} ensembles
  - A standard GA does not guarantee the optimal solution; it usually converges to local optima
  - There may be many optimal ensembles, and their number is unknown
  - A GA finds just one of them
- Use of a speciated GA instead of a standard GA
  - Fitness sharing
  - Deterministic crowding
59 Concept Speciated GA Optimization Solution space genetic drift Ω Observation space Solutions searched by simple GA Solutions searched by speciated GA S.-B. Cho, Soft Computing Lab 59
60 Structure Speciated GA Optimization Microarray data Preprocessing Gene expression data matrix Feature selection PC SC ED CC IG MI SN PCA... Classifier MLP KNN(C) KNN(P) SVM(L) SVM(R) SASOM... Training FCs FC1 FC2 FC2... FC48 Ensemble Ensemble maker Searching speciated GA searching Validation Optimal ensemble Evaluation new instance Test Tumor Normal S.-B. Cho, Soft Computing Lab 60
61 Fitness Function (Speciated GA Optimization)
Fitness of a chromosome ch:
$$Fitness(ch) = Acc(ch) - \alpha \cdot Num_1(ch)$$
where
$$Acc(ch) = \frac{\#\text{ of correctly classified samples by } ch}{\#\text{ of total classified samples by } ch}$$
- Num_1(ch) = number of 1 bits in chromosome ch (the shorter, the better)
- α: constant
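A minimal sketch of this penalized fitness is shown below. The value of α is not given on the slide, so `alpha=0.01` is a hypothetical choice, as are the function name and the example chromosomes.

```python
def penalized_fitness(chromosome, accuracy, alpha=0.01):
    """Accuracy minus a penalty proportional to the number of 1 bits.

    alpha is a hypothetical constant; the slide only says it is a constant.
    """
    return accuracy - alpha * sum(chromosome)

# Two hypothetical ensembles with the same accuracy but different sizes.
small = [1, 0, 1, 0, 0, 1, 0, 0]   # 3 members
large = [1, 1, 1, 1, 0, 1, 1, 1]   # 7 members
fit_small = penalized_fitness(small, accuracy=1.0)
fit_large = penalized_fitness(large, accuracy=1.0)
```

With equal accuracy the smaller ensemble gets the higher fitness, which matches the note that shorter is better.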
62 Deterministic Crowding (Speciated GA Optimization)
Input: g - number of generations to run, s - population size
Output: P(g) - the final population
P(0) ← initialize()
for t ← 1 to g do
    P(t) ← shuffle(P(t-1))
    for i ← 0 to s/2-1 do
        p1 ← a_{2i+1}(t)
        p2 ← a_{2i+2}(t)
        {c1, c2} ← recombination(p1, p2)
        c1' ← mutate(c1)
        c2' ← mutate(c2)
        if [d(p1,c1') + d(p2,c2')] ≤ [d(p1,c2') + d(p2,c1')] then
            if F(c1') > F(p1) then a_{2i+1}(t) ← c1' fi
            if F(c2') > F(p2) then a_{2i+2}(t) ← c2' fi
        else
            if F(c2') > F(p1) then a_{2i+1}(t) ← c2' fi
            if F(c1') > F(p2) then a_{2i+2}(t) ← c1' fi
        fi
    od
od
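The pseudocode above can be turned into a small runnable sketch. Several details are assumptions that the slide does not fix: bit-string chromosomes, Hamming distance for d, one-point crossover for recombination, bit-flip mutation with a hypothetical rate, and an even population size.

```python
import random

def hamming(a, b):
    """Distance d between two bit strings (assumed to be Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))

def crossover(p1, p2, rng):
    """One-point crossover (an assumed recombination operator)."""
    point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chromosome, rng, rate=0.02):
    """Flip each bit with a small, hypothetical probability."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def deterministic_crowding(fitness, n_bits, pop_size, generations, seed=0):
    """Run deterministic crowding; pop_size is assumed to be even."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        rng.shuffle(pop)
        for i in range(0, pop_size - 1, 2):
            p1, p2 = pop[i], pop[i + 1]
            c1, c2 = crossover(p1, p2, rng)
            c1, c2 = mutate(c1, rng), mutate(c2, rng)
            # Pair each child with the closer parent.
            if hamming(p1, c1) + hamming(p2, c2) <= hamming(p1, c2) + hamming(p2, c1):
                pairing = ((i, c1), (i + 1, c2))
            else:
                pairing = ((i, c2), (i + 1, c1))
            # A child replaces its paired parent only if it is strictly fitter.
            for slot, child in pairing:
                if fitness(child) > fitness(pop[slot]):
                    pop[slot] = child
    return pop

# Toy fitness: number of 1 bits (a stand-in for ensemble accuracy).
final_population = deterministic_crowding(sum, n_bits=12, pop_size=10, generations=30)
```

Because a child only ever competes with the parent it most resembles, different regions of the search space can keep their own survivors, which is how the method preserves several good ensembles at once.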
63 Fitness Sharing (Speciated GA Optimization)
- A strategy that maintains the diversity of chromosomes by lowering the fitness of individuals that are located close to each other
- Use the shared fitness F'(i) instead of the original fitness F(i):
$$F'(i) = \frac{F(i)}{m(i)}, \qquad m(i) = \sum_{j=1}^{\mu} sh(d(i, j))$$
$$sh(d) = \begin{cases} 1 - (d / \sigma_{share})^{\alpha} & \text{if } d < \sigma_{share} \\ 0 & \text{otherwise} \end{cases}$$
[Figure: fitness and shared fitness curves.]
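A minimal sketch of the shared fitness follows. It assumes Hamming distance between bit-string chromosomes for d(i, j) and lets the niche count m(i) include the individual itself, so the divisor is at least 1; the population, raw fitness values, and parameter values are hypothetical.

```python
def hamming(a, b):
    """Distance d(i, j) between two chromosomes (assumed Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))

def sharing(d, sigma_share, alpha=1.0):
    """sh(d) = 1 - (d / sigma_share)^alpha if d < sigma_share, else 0."""
    return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0

def shared_fitness(population, raw_fitness, sigma_share, alpha=1.0):
    """Return F'(i) = F(i) / m(i) for every individual."""
    result = []
    for i, chromosome in enumerate(population):
        # The niche count includes the individual itself (d = 0 gives sh = 1).
        niche = sum(sharing(hamming(chromosome, other), sigma_share, alpha)
                    for other in population)
        result.append(raw_fitness[i] / niche)
    return result

# Hypothetical population: two close chromosomes and one isolated chromosome.
population = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1]]
raw = [0.9, 0.9, 0.9]
shared = shared_fitness(population, raw, sigma_share=2)
```

The two neighboring chromosomes share their niche and drop to 0.6, while the isolated one keeps 0.9, so selection is pushed toward less crowded regions.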
64 Comparison of Diversity (Speciated GA Optimization)
The number of optimal ensembles found by each method on one dataset
[Table: number of optimal ensembles per experiment for sGA, sharing, and crowding. The values are not preserved in this transcript.]
crowding >> sga sharing
65 Change of Fitness and Accuracy (Speciated GA Optimization)
[Figure: fitness and accuracy against iteration for simple GA, sharing, and crowding; the plotted values are not preserved in this transcript.]
crowding >> sga sharing
66 Search Efficiency (Speciated GA Optimization)
[Table: iterations needed by common GA, sharing, and crowding. The values are not preserved in this transcript.]
Execution time per iteration: simple GA < crowding < sharing
67 Conclusion Classification Comparisons of feature/classifiers Exploration of ensemble approaches S.-B. Cho, Soft Computing Lab 67
More informationClassification Study on DNA Microarray with Feedforward Neural Network Trained by Singular Value Decomposition
Classification Study on DNA Microarray with Feedforward Neural Network Trained by Singular Value Decomposition Hieu Trung Huynh 1, Jung-Ja Kim 2 and Yonggwan Won 1 1 Department of Computer Engineering,
More informationGene set based ensemble methods for cancer classification
Louisiana State University LSU Digital Commons LSU Doctoral Dissertations Graduate School 2013 Gene set based ensemble methods for cancer classification William Evans Duncan Louisiana State University
More informationClassification and Learning Using Genetic Algorithms
Sanghamitra Bandyopadhyay Sankar K. Pal Classification and Learning Using Genetic Algorithms Applications in Bioinformatics and Web Intelligence With 87 Figures and 43 Tables 4y Spri rineer 1 Introduction
More informationLearning theory: SLT what is it? Parametric statistics small number of parameters appropriate to small amounts of data
Predictive Genomics, Biology, Medicine Learning theory: SLT what is it? Parametric statistics small number of parameters appropriate to small amounts of data Ex. Find mean m and standard deviation s for
More informationMachine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University
Machine learning applications in genomics: practical issues & challenges Yuzhen Ye School of Informatics and Computing, Indiana University Reference Machine learning applications in genetics and genomics
More informationDNA Based Disease Prediction using pathway Analysis
2017 IEEE 7th International Advance Computing Conference DNA Based Disease Prediction using pathway Analysis Syeeda Farah Dr.Asha T Cauvery B and Sushma M S Department of Computer Science and Shivanand
More informationNeural Networks and Applications in Bioinformatics. Yuzhen Ye School of Informatics and Computing, Indiana University
Neural Networks and Applications in Bioinformatics Yuzhen Ye School of Informatics and Computing, Indiana University Contents Biological problem: promoter modeling Basics of neural networks Perceptrons
More informationPrediction of Success or Failure of Software Projects based on Reusability Metrics using Support Vector Machine
Prediction of Success or Failure of Software Projects based on Reusability Metrics using Support Vector Machine R. Sathya Assistant professor, Department of Computer Science & Engineering Annamalai University
More informationNeural Networks and Applications in Bioinformatics
Contents Neural Networks and Applications in Bioinformatics Yuzhen Ye School of Informatics and Computing, Indiana University Biological problem: promoter modeling Basics of neural networks Perceptrons
More informationGene Reduction for Cancer Classification using Cascaded Neural Network with Gene Masking
Gene Reduction for Cancer Classification using Cascaded Neural Network with Gene Masking Raneel Kumar, Krishnil Chand, Sunil Pranit Lal School of Computing, Information, and Mathematical Sciences University
More informationSupport Vector Machines (SVMs) for the classification of microarray data. Basel Computational Biology Conference, March 2004 Guido Steiner
Support Vector Machines (SVMs) for the classification of microarray data Basel Computational Biology Conference, March 2004 Guido Steiner Overview Classification problems in machine learning context Complications
More informationA Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi YIN, Li-zhen LIU*, Wei SONG, Xin-lei ZHAO and Chao DU
2017 2nd International Conference on Artificial Intelligence: Techniques and Applications (AITA 2017 ISBN: 978-1-60595-491-2 A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi
More informationComparative Genomic Hybridization
Comparative Genomic Hybridization Srikesh G. Arunajadai Division of Biostatistics University of California Berkeley PH 296 Presentation Fall 2002 December 9 th 2002 OUTLINE CGH Introduction Methodology,
More informationA STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET
A STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET 1 J.JEYACHIDRA, M.PUNITHAVALLI, 1 Research Scholar, Department of Computer Science and Applications,
More informationPredicting prokaryotic incubation times from genomic features Maeva Fincker - Final report
Predicting prokaryotic incubation times from genomic features Maeva Fincker - mfincker@stanford.edu Final report Introduction We have barely scratched the surface when it comes to microbial diversity.
More informationROAD TO STATISTICAL BIOINFORMATICS CHALLENGE 1: MULTIPLE-COMPARISONS ISSUE
CHAPTER1 ROAD TO STATISTICAL BIOINFORMATICS Jae K. Lee Department of Public Health Science, University of Virginia, Charlottesville, Virginia, USA There has been a great explosion of biological data and
More informationMeasuring gene expression (Microarrays) Ulf Leser
Measuring gene expression (Microarrays) Ulf Leser This Lecture Gene expression Microarrays Idea Technologies Problems Quality control Normalization Analysis next week! 2 http://learn.genetics.utah.edu/content/molecules/transcribe/
More informationPCA and SOM based Dimension Reduction Techniques for Quaternary Protein Structure Prediction
PCA and SOM based Dimension Reduction Techniques for Quaternary Protein Structure Prediction Sanyukta Chetia Department of Electronics and Communication Engineering, Gauhati University-781014, Guwahati,
More information2. Materials and Methods
Identification of cancer-relevant Variations in a Novel Human Genome Sequence Robert Bruggner, Amir Ghazvinian 1, & Lekan Wang 1 CS229 Final Report, Fall 2009 1. Introduction Cancer affects people of all
More informationSmart India Hackathon
TM Persistent and Hackathons Smart India Hackathon 2017 i4c www.i4c.co.in Digital Transformation 25% of India between age of 16-25 Our country needs audacious digital transformation to reach its potential
More informationSupervised Learning from Micro-Array Data: Datamining with Care
November 18, 2002 Stanford Statistics 1 Supervised Learning from Micro-Array Data: Datamining with Care Trevor Hastie Stanford University November 18, 2002 joint work with Robert Tibshirani, Balasubramanian
More informationNetwork System Inference
Network System Inference Francis J. Doyle III University of California, Santa Barbara Douglas Lauffenburger Massachusetts Institute of Technology WTEC Systems Biology Final Workshop March 11, 2005 What
More informationReliable classification of two-class cancer data using evolutionary algorithms
BioSystems 72 (23) 111 129 Reliable classification of two-class cancer data using evolutionary algorithms Kalyanmoy Deb, A. Raji Reddy Kanpur Genetic Algorithms Laboratory (KanGAL), Indian Institute of
More informationBIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM)
BIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM) PROGRAM TITLE DEGREE TITLE Master of Science Program in Bioinformatics and System Biology (International Program) Master of Science (Bioinformatics
More informationPredicting Corporate Influence Cascades In Health Care Communities
Predicting Corporate Influence Cascades In Health Care Communities Shouzhong Shi, Chaudary Zeeshan Arif, Sarah Tran December 11, 2015 Part A Introduction The standard model of drug prescription choice
More informationData Mining in Bioinformatics. Prof. André de Carvalho ICMC-Universidade de São Paulo
Data Mining in Bioinformatics Prof. André de Carvalho ICMC-Universidade de São Paulo Main topics Motivation Data Mining Prediction Bioinformatics Molecular Biology Using DM in Molecular Biology Case studies
More informationGA-SVM WRAPPER APPROACH FOR GENE RANKING AND CLASSIFICATION USING EXPRESSIONS OF VERY FEW GENES
GA-SVM WRAPPER APPROACH FOR GENE RANKING AND CLASSIFICATION USING EXPRESSIONS OF VERY FEW GENES N.REVATHY 1, Dr.R.BALASUBRAMANIAN 2 1 Assistant Professor, Department of Computer Applications, Karpagam
More informationIntroduction to Microarray Analysis
Introduction to Microarray Analysis Methods Course: Gene Expression Data Analysis -Day One Rainer Spang Microarrays Highly parallel measurement devices for gene expression levels 1. How does the microarray
More informationIntroduction to Bioinformatics
Introduction to Bioinformatics If the 19 th century was the century of chemistry and 20 th century was the century of physic, the 21 st century promises to be the century of biology...professor Dr. Satoru
More informationIterated Conditional Modes for Cross-Hybridization Compensation in DNA Microarray Data
http://www.psi.toronto.edu Iterated Conditional Modes for Cross-Hybridization Compensation in DNA Microarray Data Jim C. Huang, Quaid D. Morris, Brendan J. Frey October 06, 2004 PSI TR 2004 031 Iterated
More informationFeature selection methods for SVM classification of microarray data
Feature selection methods for SVM classification of microarray data Mike Love December 11, 2009 SVMs for microarray classification tasks Linear support vector machines have been used in microarray experiments
More informationSurvival Outcome Prediction for Cancer Patients based on Gene Interaction Network Analysis and Expression Profile Classification
Survival Outcome Prediction for Cancer Patients based on Gene Interaction Network Analysis and Expression Profile Classification Final Project Report Alexander Herrmann Advised by Dr. Andrew Gentles December
More informationGene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis
Gene expression analysis Biosciences 741: Genomics Fall, 2013 Week 5 Gene expression analysis From EST clusters to spotted cdna microarrays Long vs. short oligonucleotide microarrays vs. RT-PCR Methods
More informationBayesian Variable Selection and Data Integration for Biological Regulatory Networks
Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Shane T. Jensen Department of Statistics The Wharton School, University of Pennsylvania stjensen@wharton.upenn.edu Gary
More information296 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 3, JUNE 2006
296 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 3, JUNE 2006 An Evolutionary Clustering Algorithm for Gene Expression Microarray Data Analysis Patrick C. H. Ma, Keith C. C. Chan, Xin Yao,
More informationCustomer Relationship Management in marketing programs: A machine learning approach for decision. Fernanda Alcantara
Customer Relationship Management in marketing programs: A machine learning approach for decision Fernanda Alcantara F.Alcantara@cs.ucl.ac.uk CRM Goal Support the decision taking Personalize the best individual
More informationFinding molecular signatures from gene expression data: review and a new proposal
Finding molecular signatures from gene expression data: review and a new proposal Ramón Díaz-Uriarte rdiaz@cnio.es http://bioinfo.cnio.es/ rdiaz Unidad de Bioinformática Centro Nacional de Investigaciones
More informationISTITUTO DI ANALISI DEI SISTEMI ED INFORMATICA Antonio Ruberti CONSIGLIO NAZIONALE DELLE RICERCHE
ISTITUTO DI ANALISI DEI SISTEMI ED INFORMATICA Antonio Ruberti CONSIGLIO NAZIONALE DELLE RICERCHE P. Bertolazzi, G. Felici, G. Lancia APPLICATION OF FEATURE SELECTION AND CLASSIFICATION TO COMPUTATIONAL
More informationMeasuring gene expression
Measuring gene expression Grundlagen der Bioinformatik SS2018 https://www.youtube.com/watch?v=v8gh404a3gg Agenda Organization Gene expression Background Technologies FISH Nanostring Microarrays RNA-seq
More informationGene expression analysis: Introduction to microarrays
Gene expression analysis: Introduction to microarrays Adam Ameur The Linnaeus Centre for Bioinformatics, Uppsala University February 15, 2006 Overview Introduction Part I: How a microarray experiment is
More informationIntroduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute
Introduction to Microarray Data Analysis and Gene Networks Alvis Brazma European Bioinformatics Institute A brief outline of this course What is gene expression, why it s important Microarrays and how
More informationIncluding prior knowledge in shrinkage classifiers for genomic data
Including prior knowledge in shrinkage classifiers for genomic data Jean-Philippe Vert Jean-Philippe.Vert@mines-paristech.fr Mines ParisTech / Curie Institute / Inserm Statistical Genomics in Biomedical
More informationEstimating Cell Cycle Phase Distribution of Yeast from Time Series Gene Expression Data
2011 International Conference on Information and Electronics Engineering IPCSIT vol.6 (2011) (2011) IACSIT Press, Singapore Estimating Cell Cycle Phase Distribution of Yeast from Time Series Gene Expression
More informationEvolving connectionist systems for knowledge discovery from gene expression data of cancer tissue
Artificial Intelligence in Medicine 28 (2003) 165 189 Evolving connectionist systems for knowledge discovery from gene expression data of cancer tissue Matthias E. Futschik a,*, Anthony Reeve b, Nikola
More informationSTATISTICAL CHALLENGES IN GENE DISCOVERY
STATISTICAL CHALLENGES IN GENE DISCOVERY THROUGH MICROARRAY DATA ANALYSIS 1 Central Tuber Crops Research Institute,Kerala, India 2 Dept. of Statistics, St. Thomas College, Pala, Kerala, India email:sreejyothi
More informationMachine Learning Methods for Microarray Data Analysis
Harvard-MIT Division of Health Sciences and Technology HST.512: Genomic Medicine Prof. Marco F. Ramoni Machine Learning Methods for Microarray Data Analysis Marco F. Ramoni Children s Hospital Informatics
More informationStatistical Methods for Network Analysis of Biological Data
The Protein Interaction Workshop, 8 12 June 2015, IMS Statistical Methods for Network Analysis of Biological Data Minghua Deng, dengmh@pku.edu.cn School of Mathematical Sciences Center for Quantitative
More informationMachine Learning in Computational Biology CSC 2431
Machine Learning in Computational Biology CSC 2431 Lecture 9: Combining biological datasets Instructor: Anna Goldenberg What kind of data integration is there? What kind of data integration is there? SNPs
More informationIntroduction to gene expression microarray data analysis
Introduction to gene expression microarray data analysis Outline Brief introduction: Technology and data. Statistical challenges in data analysis. Preprocessing data normalization and transformation. Useful
More informationHomework : Data Mining. Due at the start of class Friday, 25 September 2009
Homework 4 36-350: Data Mining Due at the start of class Friday, 25 September 2009 This homework set applies methods we have developed so far to a medical problem, gene expression in cancer. In some questions
More informationSingle-cell sequencing
Single-cell sequencing Harri Lähdesmäki Department of Computer Science Aalto University December 5, 2017 Contents Background & Motivation Single cell sequencing technologies Single cell sequencing data
More informationSyllabus for BIOS 101, SPRING 2013
Page 1 Syllabus for BIOS 101, SPRING 2013 Name: BIOSTATISTICS 101 for Cancer Researchers Time: March 20 -- May 29 4-5pm in Wednesdays, [except 4/15 (Mon) and 5/7 (Tue)] Location: SRB Auditorium Background
More informationCancer Classification using Support Vector Machines and Relevance Vector Machine based on Analysis of Variance Features
Journal of Computer Science 7 (9): 1393-1399, 2011 ISSN 1549-3636 2011 Science Publications Cancer Classification using Support Vector Machines and Relevance Vector Machine based on Analysis of Variance
More informationIn silico prediction of novel therapeutic targets using gene disease association data
In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine
More informationMicroarray analysis challenges.
Microarray analysis challenges. While not quite as bad as my hobby of ice climbing you, need the right equipment! T. F. Smith Bioinformatics Boston Univ. Experimental Design Issues Reference and Controls
More informationIdentifying Splice Sites Of Messenger RNA Using Support Vector Machines
Identifying Splice Sites Of Messenger RNA Using Support Vector Machines Paige Diamond, Zachary Elkins, Kayla Huff, Lauren Naylor, Sarah Schoeberle, Shannon White, Timothy Urness, Matthew Zwier Drake University
More informationEstoril Education Day
Estoril Education Day -Experimental design in Proteomics October 23rd, 2010 Peter James Note Taking All the Powerpoint slides from the Talks are available for download from: http://www.immun.lth.se/education/
More informationMeasuring and Understanding Gene Expression
Measuring and Understanding Gene Expression Dr. Lars Eijssen Dept. Of Bioinformatics BiGCaT Sciences programme 2014 Why are genes interesting? TRANSCRIPTION Genome Genomics Transcriptome Transcriptomics
More informationA Hybrid Approach for Gene Selection and Classification using Support Vector Machine
The International Arab Journal of Information Technology, Vol. 1, No. 6A, 015 695 A Hybrid Approach for Gene Selection and Classification using Support Vector Machine Jaison Bennet 1, Chilambuchelvan Ganaprakasam
More informationMethods of Biomaterials Testing Lesson 3-5. Biochemical Methods - Molecular Biology -
Methods of Biomaterials Testing Lesson 3-5 Biochemical Methods - Molecular Biology - Chromosomes in the Cell Nucleus DNA in the Chromosome Deoxyribonucleic Acid (DNA) DNA has double-helix structure The
More informationLearning Methods for DNA Binding in Computational Biology
Learning Methods for DNA Binding in Computational Biology Mark Kon Dustin Holloway Yue Fan Chaitanya Sai Charles DeLisi Boston University IJCNN Orlando August 16, 2007 Outline Background on Transcription
More informationIdentification of biological themes in microarray data from a mouse heart development time series using GeneSifter
Identification of biological themes in microarray data from a mouse heart development time series using GeneSifter VizX Labs, LLC Seattle, WA 98119 Abstract Oligonucleotide microarrays were used to study
More information