Machine learning in neuroscience

Bojan Mihaljevic, Luis Rodriguez-Lujan
Computational Intelligence Group, School of Computer Science, Technical University of Madrid (UPM)
2015 IEEE Iberian Student Branch Congress, April 24th, 2015, Madrid

Outline
1 Introduction
2 Methods
  - Machine learning
  - Bayesian networks
  - Directional statistics
3 Applications
  - Introduction to neuroscience
  - Neuron classification
  - Morphological simulation
  - Soma and spines
  - DRCMST
  - Other applications
4 Future work

Computational Intelligence Group
- At the Artificial Intelligence Department, School of Computer Science
- Since 2008
- 2 full professors, 1 associate professor, 1 post-doc, and 11 PhD students

A useful tool
Image credits: http://en.wikipedia.org/wiki/file:no-spam.png, http://commons.wikimedia.org/wiki/file:logo_youtube.svg

Data-driven
- Learn from data what you cannot program (well) explicitly
- Large amounts of data are available these days
- Typically, we assume the data comes as attribute values:

  X1      X2   X3      X4    X5
  1.40    A    10003   -24   D
  -0.31   B    2039    21    C
  -0.01   B    7383    70    U

- Goal: learn some function over X
- Related terms: data mining, pattern recognition, data science, ...

Tasks
- Classification (discrete target variable)
- Regression (real-valued target variable)
- Clustering (hidden discrete target variable)
- Others: collaborative filtering, market basket analysis, etc.
Image credit: http://commons.wikimedia.org/wiki/file:social_red.jpg

Multiple models
Some of them:
- Naive Bayes: $P(\mathbf{x}, c) = P(c) P(x_1 \mid c) P(x_2 \mid c) P(x_3 \mid c) P(x_4 \mid c) P(x_5 \mid c)$
- k nearest neighbors
- Logistic regression: $p(c \mid \mathbf{x}, \mathbf{w}) = \mathrm{Ber}(c \mid \mathrm{sigm}(\mathbf{w}^T \mathbf{x}))$
- Decision tree
Hastie, T., Tibshirani, R., Friedman, J. (2009). The elements of statistical learning (Vol. 2, No. 1). New York: Springer.
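The naive Bayes factorization above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy data, the Laplace smoothing constant `alpha`, and the function names `fit_naive_bayes` and `posterior` are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate P(c) and P(x_i | c) from categorical data (Laplace smoothing)."""
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # values[i] is the set of values observed for feature i
    values = [set(row[i] for row in rows) for i in range(n_features)]
    # counts[(i, c)][v] = number of rows of class c with feature i equal to v
    counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(i, c)][v] += 1
    prior = {c: n / len(labels) for c, n in class_counts.items()}

    def likelihood(i, v, c):
        return (counts[(i, c)][v] + alpha) / (class_counts[c] + alpha * len(values[i]))

    return prior, likelihood


def posterior(x, prior, likelihood):
    """Return P(c | x), using P(x, c) = P(c) * prod_i P(x_i | c)."""
    joint = {}
    for c, p_c in prior.items():
        p = p_c
        for i, v in enumerate(x):
            p *= likelihood(i, v, c)
        joint[c] = p
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


# Hypothetical toy data: two categorical attributes, class label "D" or "U".
rows = [("A", "low"), ("A", "high"), ("B", "high"), ("B", "high")]
labels = ["D", "D", "U", "U"]
prior, likelihood = fit_naive_bayes(rows, labels)
post = posterior(("B", "high"), prior, likelihood)
print(post)
```

On this toy data the query `("B", "high")` is assigned to class `"U"`, since both attribute values are more frequent in that class.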

Toolbox
- Many different tools to extract models from data
- Optimization (often heuristic)
  - Combinatorial
  - Continuous
- Information theory
- Probability theory and statistics
  - Inherent uncertainty (e.g., noise; prediction confidence)
- ...

Underpinning: conditional independence
- Many random variables: intractable distributions
  - 20 binary variables mean $2^{20} - 1$ parameters in the joint distribution
- Fortunately, some variables are sometimes independent of others
  - E.g., if I know that it is very warm, then knowing that it is summer might not make it more likely that many people will be on the beach
- Factor a joint distribution into smaller local ones:
  $P(X_1, X_2, X_3, \ldots, X_n) = P(X_1) P(X_2 \mid X_1) P(X_3, \ldots, X_n \mid X_1, X_2)$
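To make the parameter count concrete, the sketch below compares an unrestricted joint distribution with a factored one. The chain structure used for the factored case is a hypothetical choice for illustration only, and the function names are assumptions.

```python
def full_joint_parameters(n_vars, arity=2):
    """Free parameters of an unrestricted joint over n_vars discrete variables."""
    return arity ** n_vars - 1


def factored_parameters(parent_counts, arity=2):
    """Free parameters when variable i conditions on parent_counts[i] parents."""
    return sum((arity - 1) * arity ** k for k in parent_counts)


n = 20
# Hypothetical structure: a chain X_1 -> X_2 -> ... -> X_20,
# so the first variable has no parent and every other variable has one.
chain = [0] + [1] * (n - 1)

full = full_joint_parameters(n)
factored = factored_parameters(chain)
print(full, factored)
```

With 20 binary variables the unrestricted joint needs 1,048,575 parameters, while the chain needs 39.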

Representation
- Directed acyclic graph
  - Nodes = variables
  - Arcs encode conditional independencies
- A local distribution for each combination of parents' values:
  $P(\mathbf{x}) = \prod_{i=1}^{n} P(x_i \mid \mathrm{pa}(x_i))$
- Can greatly reduce the number of parameters
Image credit: http://commons.wikimedia.org/wiki/file:simplebayesnet.svg
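The factorization can be illustrated with a three-node network. The sketch below assumes a rain / sprinkler / wet-grass structure and illustrative probability values; the variable names, the numbers, and the helper functions are assumptions for the example, not taken from the slides. The joint is the product of the local distributions, and a posterior is obtained by brute-force enumeration, which is only feasible for very small networks.

```python
import itertools

# Each variable maps to (parents, CPT); the CPT gives
# P(variable = True | parent values). All numbers are illustrative.
network = {
    "rain": ((), {(): 0.2}),
    "sprinkler": (("rain",), {(False,): 0.4, (True,): 0.01}),
    "grass_wet": (("sprinkler", "rain"), {
        (False, False): 0.0, (False, True): 0.8,
        (True, False): 0.9, (True, True): 0.99,
    }),
}


def joint(assignment):
    """P(x) = prod_i P(x_i | pa(x_i)) for a full True/False assignment."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        p_true = cpt[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def all_assignments():
    """Yield every full assignment of the network's binary variables."""
    names = list(network)
    for values in itertools.product([False, True], repeat=len(names)):
        yield dict(zip(names, values))


def query(var, evidence):
    """P(var = True | evidence) by summing the joint over all assignments."""
    num = den = 0.0
    for a in all_assignments():
        if all(a[k] == v for k, v in evidence.items()):
            p = joint(a)
            den += p
            if a[var]:
                num += p
    return num / den


total = sum(joint(a) for a in all_assignments())
p_rain_given_wet = query("rain", {"grass_wet": True})
print(round(total, 6), round(p_rain_given_wet, 4))
```

Enumeration grows exponentially with the number of variables, which is why exact and approximate inference algorithms are a research topic in their own right.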

Inference

Some research topics
- Learning from data (NP-hard in the general case)
  - Conditional-independence tests
  - Structure scoring (optimization)
- Inference (NP-complete in the general case)
  - Exact
  - Approximate
- Classifiers
  - Specialized learning algorithms
- Non-standard local probability distributions
  - Hybrid networks
  - Mixtures of polynomials
  - Directional variables

Directional statistics
- Deal with directions, axes, rotations
- Cannot be studied as regular real-valued variables: periodicity
- Real-world data: wind, animal behaviour, neuroscience, ...
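A small example of why periodicity matters: the arithmetic mean of two nearly identical directions can point the opposite way. The sketch below uses the standard circular mean (the direction of the summed unit vectors); the function names are assumptions for illustration.

```python
import math


def circular_mean_deg(angles_deg):
    """Direction (in degrees, within [0, 360]) of the sum of unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def angular_distance_deg(a, b):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


angles = [350.0, 10.0]                      # two directions close to 0 degrees
arithmetic = sum(angles) / len(angles)      # treats angles as plain reals
circular = circular_mean_deg(angles)        # respects periodicity

print(arithmetic, angular_distance_deg(circular, 0.0))
```

The arithmetic mean is 180 degrees, the opposite direction, while the circular mean lies at 0 degrees up to floating-point error. The circular mean is undefined when the unit vectors cancel out, for example for the directions 0 and 180 degrees.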

Representation and methods
- Different ways to represent directional data
- Directional probability distributions
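As one common example of a directional probability distribution (chosen here for illustration; the slides do not name a specific one), the von Mises distribution on the circle has density

```latex
f(\theta \mid \mu, \kappa) = \frac{\exp\bigl(\kappa \cos(\theta - \mu)\bigr)}{2\pi I_0(\kappa)},
\qquad \theta \in [0, 2\pi),
```

where $\mu$ is the mean direction, $\kappa \ge 0$ is the concentration, and $I_0$ is the modified Bessel function of the first kind of order zero. For $\kappa = 0$ it reduces to the uniform distribution on the circle.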

Research topics in CIG
- Bayesian networks
  - Different local distributions
  - Multi-dimensional classifiers
  - Learning classifiers
  - Big Data
  - ...
- Heuristic optimization
  - Multi-objective
  - Estimation of distribution algorithms (probabilistic evolutionary)
- Applications
  - Neuroscience
  - Scientometrics
  - Bioinformatics

Projects and collaborations
- Projects
- Collaborations
- Companies

The brain
- Neuroscience: the scientific study of the nervous system
  - Molecular and cellular neuroscience
- We do not study the brain at the macro level (yet), but on a micro scale: neurons
  - 100 billion neurons in the brain
  - 180,000 kilometers of wiring (myelinated white fibers)

Neurons
- Three main parts: soma, dendrites, and axon
- Neurons communicate with each other using electro-chemical signals
- Significant differences between neurons

Gardener classification
- There is an accepted catalogue of neuron types and names
- But a consistent terminology is lacking
- Every neuroanatomist has their own classification scheme

Towards a consensus in naming: learning from the experts
- Gather data from 42 experts
- Learn a model (Bayesian network) for each expert

Towards a consensus in naming: differences among experts
- Six clusters of experts (Bayesian network clustering)

Morphological simulation: dendritic trees
- Why are dendritic tree shapes so different?
- They determine interconnectivity and functional roles

Morphological simulation: variables
- More than 40 variables
- Evidence and construction variables

Soma spatial characterization
- Descriptors based on the level curves of a level set function
- Hybrid Gaussian and angular Bayesian network

Spines
- Related to brain functions such as learning and memory
- 3D active contours to repair fragmented spines
- Hybrid spatial DBN

Main idea
- Degree-constrained minimum spanning tree
- Degree constraints
- Restrict the role of each node in the tree to root, intermediate, or leaf node
- Novel permutation-based representation to encode forests of DRCMSTs

Example
- 20 points, from which we want to build a forest of three trees
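A setting like this example can be sketched with a simple greedy heuristic. The code below is not the permutation-based method from the slides and does not handle role constraints: it is a Kruskal-style pass that adds the shortest edges first, skips any edge that would exceed a degree cap, and stops when three trees remain. The random points, the cap of 3, and all names are assumptions for illustration, and the result is not guaranteed to be optimal.

```python
import itertools
import math
import random


def constrained_spanning_forest(points, n_trees, max_degree):
    """Greedy (Kruskal-style) forest of n_trees trees with a per-node degree cap."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges of the complete graph, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    degree = [0] * len(points)
    forest = []
    components = len(points)
    for weight, i, j in edges:
        if components == n_trees:
            break
        ri, rj = find(i), find(j)
        # Accept the edge only if it joins two trees and respects the cap.
        if ri != rj and degree[i] < max_degree and degree[j] < max_degree:
            parent[ri] = rj
            degree[i] += 1
            degree[j] += 1
            forest.append((i, j, weight))
            components -= 1
    return forest, degree


rng = random.Random(0)  # fixed seed so the hypothetical points are reproducible
points = [(rng.random(), rng.random()) for _ in range(20)]
forest, degree = constrained_spanning_forest(points, n_trees=3, max_degree=3)
print(len(forest), max(degree))
```

A forest of three trees over 20 points always has 17 edges. The degree cap is what makes the general problem hard, which is why heuristic encodings and search are used instead of a plain minimum spanning tree algorithm.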

Application to neuroscience
- Applied to optimal neuronal wiring

Medical applications
- Medical decision support systems: neonatal jaundice treatment

Other applications
- DNA microarray analysis
- Immunology
- Alzheimer's disease
- Parkinson's disease

Integration

BN & Big data

Big data in neuroscience
- Functional Magnetic Resonance Imaging (fMRI)
- Single Photon Emission Computed Tomography (SPECT)

Contact us!
- Summer School 2015
- Computational Intelligence Group
- bmihaljevic@fi.upm.es
- luis.rodriguezl@alumnos.upm.es
- http://cig.fi.upm.es