Integrating natural language processing and machine learning algorithms to categorize oncologic response in radiology reports

1 Integrating natural language processing and machine learning algorithms to categorize oncologic response in radiology reports
Po-Hao Chen, MD MBA; Hanna Zafar, MD; Tessa S. Cook, MD PhD

2 Roadmap: Background, Methods, Results, Discussion

4 Background
Structured reporting templates have been found to improve the quality and consistency of radiology reporting.
Most radiology reports nevertheless remain unstructured.

5 Oncologic Imaging
Key outputs in oncologic imaging reports:
- Progression
- Stable Disease
- Improvement
- Resolution
- No Cancer

6 Process
Natural Language Processing + Machine Learning
Impression:
1. No evidence of new or increasing metastatic disease in the abdomen or pelvis.
2. Unchanged size of hepatic metastases since prior exam.
3. Resolved pneumobilia in keeping with improving post-procedure changes.
Assigned category: Stable Disease

7 Background
Hypothesis: The predictive ability of a classification algorithm for oncologic reporting depends on the combination of feature engineering and machine learning.

9 Methods
Our institutional structured reporting covers new lesions and existing oncologic lesions on abdominal/pelvic CT and MRI:
- Progression: interval development of new lesion(s) OR progression of existing lesions
- Improvement: no interval development of a new lesion AND improvement of existing lesions
- Stable disease: no interval development of a new lesion AND stable appearance of existing lesions
- Resolution/no cancer: absence of any new lesion AND either no previously documented cancer OR complete response
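The four category definitions above can be read as a small decision rule. The following is a minimal sketch of that rule, not the institution's actual coding logic: the function name `categorize` and the string encoding of the state of existing lesions ("progressed", "improved", "stable", "resolved", "none") are assumptions made for illustration.

```python
from enum import Enum


class Response(Enum):
    PROGRESSION = "Progression"
    IMPROVEMENT = "Improvement"
    STABLE = "Stable disease"
    RESOLUTION_NO_CANCER = "Resolution/no cancer"


def categorize(new_lesion: bool, existing: str) -> Response:
    """Map findings to one of the four categories.

    `existing` is a hypothetical encoding of previously documented lesions:
    "progressed", "improved", "stable", "resolved" (complete response),
    or "none" (no previously documented cancer).
    """
    # Progression: a new lesion OR progression of existing lesions.
    if new_lesion or existing == "progressed":
        return Response.PROGRESSION
    # Every remaining category requires that no new lesion developed.
    if existing == "improved":
        return Response.IMPROVEMENT
    if existing == "stable":
        return Response.STABLE
    if existing in ("resolved", "none"):
        return Response.RESOLUTION_NO_CANCER
    raise ValueError(f"unknown lesion state: {existing!r}")
```

Because a new lesion is checked first, a report with a new lesion is labeled Progression even when the existing lesions are stable or improving, which matches the OR in the first definition.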

10 Feature Engineering
- Stop word removal (SWR): with / without
- Stemming: with / without
- N-gram tokenization (N = 1 to 5)
- Text mining: term frequency (TF), term frequency × inverse document frequency (TF-IDF), 16-bit feature hashing
- Filter-based feature selection: 500, 750, 1000, 1500, 2500
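The feature engineering steps listed above can be sketched in plain Python as follows. This is an illustrative sketch only, not the pipeline used in the study: the stop-word list and the suffix-stripping `stem` function are toy stand-ins for a real stop-word list and a real stemmer, the second example report is invented, and filter-based feature selection is left out because the slide does not say which scoring function was used.

```python
import hashlib
import math
import re
from collections import Counter

# Toy stop-word list (assumption). Negations such as "no" are deliberately
# kept, since they carry meaning in radiology impressions.
STOP_WORDS = {"a", "an", "and", "in", "of", "or", "since", "the", "with"}


def stem(token):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text, remove_stop_words=True, use_stemming=True):
    tokens = re.findall(r"[a-z]+", text.lower())
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [stem(t) for t in tokens]
    return tokens


def ngrams(tokens, max_n=2):
    """All n-grams for n = 1..max_n, each joined with a space."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """docs: list of term lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [{term: count * math.log(n_docs / doc_freq[term])
             for term, count in Counter(doc).items()}
            for doc in docs]


def hash_features(terms, bits=16):
    """Feature hashing: count terms in one of 2**bits buckets (sparse dict)."""
    vector = Counter()
    for term in terms:
        digest = hashlib.md5(term.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % (1 << bits)] += 1
    return vector


reports = [
    "Unchanged size of hepatic metastases since prior exam.",  # from slide 6
    "Interval increase in size of hepatic metastases.",        # invented example
]
docs = [ngrams(tokenize(r), max_n=2) for r in reports]  # unigrams + bigrams
weights = tf_idf(docs)
hashed = hash_features(docs[0], bits=16)
```

With this unsmoothed TF-IDF, a term that occurs in every report (here "size") gets weight zero, while a term unique to one report (here "unchang") gets a positive weight. Feature hashing trades that interpretability for a fixed vector size of 2^16 buckets.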

11 Machine Learning Algorithms
- Logistic regression
- Bayes point machine
- Support vector machine
- Random decision forest
- Neural network

12 Combination
- Data split: 70% training, 30% validation
- Training: hyperparameter optimization with 5-fold cross-validation
- Performance measured using the F1 measure and accuracy
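The evaluation setup above (70/30 split, 5-fold cross-validation, F1 and accuracy) might look like the sketch below. All function names are illustrative. The slide does not say how F1 was averaged across the categories, so the macro average shown here is an assumption.

```python
import random


def train_validation_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and cut it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_items, k=5):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for held_out in range(k):
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, folds[held_out]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (the averaging choice is an assumption)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

In such a setup the cross-validation folds would be drawn from the 70% training portion only, so that the 30% validation portion stays untouched until the final comparison.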

14 Results
9418 oncologic CT/MRI reports: 804 excluded, 8614 included
- Progression: 2498
- Stable disease: 2132
- Improvement: 1184
- Complete response / no cancer: 2800
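As a quick consistency check on the counts above, the short script below confirms that the four categories sum to the number of included reports and prints each category's share. The majority-class baseline is added here only as context for reading accuracy figures; it is not a result reported on the slides.

```python
counts = {
    "Progression": 2498,
    "Stable disease": 2132,
    "Improvement": 1184,
    "Complete response / no cancer": 2800,
}
total_reports, excluded = 9418, 804
included = total_reports - excluded

# The category counts should account for every included report.
assert included == sum(counts.values()) == 8614

shares = {label: n / included for label, n in counts.items()}
# Accuracy of a classifier that always predicts the largest category.
majority_baseline = max(shares.values())

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"Majority-class baseline accuracy: {majority_baseline:.1%}")
```

The classes are moderately imbalanced: the largest category holds roughly a third of the reports and the smallest (Improvement) roughly a seventh, which is one reason to report F1 alongside accuracy.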

15 Highest F1-Score Combination
- Support vector machine
- TF-IDF
- With stop word removal
- With stemming
- Unigrams + bigrams
- Top 750 tokens

16 ML Performance

19 Effect of Language Processing On F-Measure

21 Discussion
- Classification accuracy is likely a function of the classification task, the NLP techniques, and the ML algorithm.
- Stop word removal and stemming generally improve the F-measures of all ML algorithms except the Bayes point machine.
- Unigrams + bigrams outperform other N-gram constructs on most ML algorithms.
- Support vector machine, logistic regression, and Bayes point machine each have an optimal number of features.
- Random decision forest and neural network performance does not degrade as the number of features increases.

22 Thank you!


Analysis of Hot Points on Data Mining Research of Medical in Foreign Countries Cross-Cultural Communication Vol. 12, No., 216, pp. 1-5 DOI:.6/722 ISSN 1712-5[Print] ISSN 12-67[Online] www.cscanada.net www.cscanada.org Analysis of Hot Points on Data Mining Research of Medical in Foreign

More information

Research on Personal Credit Assessment Based. Neural Network-Logistic Regression Combination

Research on Personal Credit Assessment Based. Neural Network-Logistic Regression Combination Open Journal of Business and Management, 2017, 5, 244-252 http://www.scirp.org/journal/ojbm ISSN Online: 2329-3292 ISSN Print: 2329-3284 Research on Personal Credit Assessment Based on Neural Network-Logistic

More information

Managing Project Portfolio. Contents are subject to change. For the latest updates visit

Managing Project Portfolio. Contents are subject to change. For the latest updates visit Managing Project Portfolio Page 1 of 7 Why Attend The overall aim of this course is to provide participants with a generic and practical methodology to build and manage a balanced project portfolio. The

More information

CBA Workshop Series -3 Drug Safety (Pharmacovigilance) and Big Data. Larry Liu TCONNEX Inc.

CBA Workshop Series -3 Drug Safety (Pharmacovigilance) and Big Data. Larry Liu TCONNEX Inc. CBA Workshop Series -3 Drug Safety (Pharmacovigilance) and Big Data Larry Liu TCONNEX Inc. Larry.liu@tconnex.com Drug Safety Drug Safety is about Adverse Drug Reaction or ADR ADR is any unintended and

More information

An Implementation of genetic algorithm based feature selection approach over medical datasets

An Implementation of genetic algorithm based feature selection approach over medical datasets An Implementation of genetic algorithm based feature selection approach over medical s Dr. A. Shaik Abdul Khadir #1, K. Mohamed Amanullah #2 #1 Research Department of Computer Science, KhadirMohideen College,

More information

Exploring the Genetic Basis of Congenital Heart Defects

Exploring the Genetic Basis of Congenital Heart Defects Exploring the Genetic Basis of Congenital Heart Defects Sanjay Siddhanti Jordan Hannel Vineeth Gangaram szsiddh@stanford.edu jfhannel@stanford.edu vineethg@stanford.edu 1 Introduction The Human Genome

More information

Classification of DNA Sequences Using Convolutional Neural Network Approach

Classification of DNA Sequences Using Convolutional Neural Network Approach UTM Computing Proceedings Innovations in Computing Technology and Applications Volume 2 Year: 2017 ISBN: 978-967-0194-95-0 1 Classification of DNA Sequences Using Convolutional Neural Network Approach

More information

Predicting and Explaining Price-Spikes in Real-Time Electricity Markets

Predicting and Explaining Price-Spikes in Real-Time Electricity Markets Predicting and Explaining Price-Spikes in Real-Time Electricity Markets Christian Brown #1, Gregory Von Wald #2 # Energy Resources Engineering Department, Stanford University 367 Panama St, Stanford, CA

More information

Unlocking Unstructured Social Media Data in Marketing. William Rand Assistant Professor of Bussiness Management

Unlocking Unstructured Social Media Data in Marketing. William Rand Assistant Professor of Bussiness Management Unlocking Unstructured Social Media Data in Marketing William Rand Assistant Professor of Bussiness Management In Collaboration with Kelly Hewett, Roland Rust, and Harald J. van Heerde Managers perspectives

More information

Predicting Credit Card Customer Loyalty Using Artificial Neural Networks

Predicting Credit Card Customer Loyalty Using Artificial Neural Networks Predicting Credit Card Customer Loyalty Using Artificial Neural Networks Tao Zhang Bo Yuan Wenhuang Liu Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, P.R. China E-Mail: kirasz06@gmail.com,

More information

Enrichment Design with Patient Population Augmentation

Enrichment Design with Patient Population Augmentation Enrichment Design with Patient Population Augmentation Subhead Calibri 14pt, White Bo Yang, AbbVie Yijie Zhou, AbbVie Lanju Zhang, AbbVie Lu Cui, AbbVie Author Disclosure Bo Yang is a former Employee of

More information

Novedades en el manejo del paciente con CPRC M0 y su potencial aplicación en la clínica

Novedades en el manejo del paciente con CPRC M0 y su potencial aplicación en la clínica Novedades en el manejo del paciente con CPRC M0 y su potencial aplicación en la clínica Coordinación científica: Dr. Fernando Rivera Hospital Universitario Marqués de Valdecilla, Santander Dr. Juan Fco

More information

Managing Machine Learning: Insights and Strategy Session 141, March 7, 2018 Elizabeth Clements, MBA, Business Architect, Geisinger Health Debdipto

Managing Machine Learning: Insights and Strategy Session 141, March 7, 2018 Elizabeth Clements, MBA, Business Architect, Geisinger Health Debdipto Managing Machine Learning: Insights and Strategy Session 141, March 7, 2018 Elizabeth Clements, MBA, Business Architect, Geisinger Health Debdipto Misra, MS, Data Scientist, Geisinger Health 1 Speaker

More information

Analysis of Optimized Logistics Service of 3C Agents in Taiwan Based on ABC-KMDSS

Analysis of Optimized Logistics Service of 3C Agents in Taiwan Based on ABC-KMDSS Appl. Math. Inf. Sci. 7, No. 1L, 307-312 (2013) 307 Applied Mathematics & Information Sciences An International Journal Analysis of Optimized Logistics Service of 3C Agents in Taiwan Based on ABC-KMDSS

More information

Artificial Intelligence Principles and Practices

Artificial Intelligence Principles and Practices Artificial Intelligence Principles and Practices Page 1 of 6 Why Attend Organizations are creating an avalanche of data, and with Artificial Intelligence (AI) technology we can put that data to work in

More information

A comparison of Multiple Biomarker Selection Algorithms for Early Screening of Ovarian Cancer

A comparison of Multiple Biomarker Selection Algorithms for Early Screening of Ovarian Cancer A comparison of Multiple Biomarker Selection Algorithms for Early Screening of Ovarian Cancer Yu-Seop Kim 1,3, Jong-Dae Kim 1,3, Min-Ki Jang 2,3, Chan-Young Park 1,3, and Hye-Jung Song 1,3 1 Dept. of Ubiquitous

More information

PhD Dissertation Defense Presentation

PhD Dissertation Defense Presentation PhD Dissertation Defense Presentation Wednesday, September 11 th, 2013 11:00am 1:00pm C103 Engineering Research Complex ADVANCED INVERTER CONTROL FOR UNINTERRUPTIBLE POWER SUPPLIES AND GRID-CONNECTED RENEWABLE

More information

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2016) Study on the Application of Mining in Bioinformatics Mingyang Yuan School of Science and Liberal Arts, New

More information

Text Categorization. Hongning Wang

Text Categorization. Hongning Wang Text Categorization Hongning Wang CS@UVa Today s lecture Bayes decision theory Supervised text categorization General steps for text categorization Feature selection methods Evaluation metrics CS@UVa CS

More information

Text Categorization. Hongning Wang

Text Categorization. Hongning Wang Text Categorization Hongning Wang CS@UVa Today s lecture Bayes decision theory Supervised text categorization General steps for text categorization Feature selection methods Evaluation metrics CS@UVa CS

More information

COURSE LISTING. Courses Listed. with SAP ERP. 3 January 2018 (00:08 GMT) SCM600 - Business Processes in Sales and Distribution

COURSE LISTING. Courses Listed. with SAP ERP. 3 January 2018 (00:08 GMT) SCM600 - Business Processes in Sales and Distribution with SAP ERP COURSE LISTING Courses Listed SCM600 - Business Processes in Sales and Distribution SCM605 - SCM610 - Delivery Processing in SAP ERP SCM615 - Billing in SAP ERP SCM620 - SCM650 - Cross-Functional

More information

Copyr i g ht 2012, SAS Ins titut e Inc. All rights res er ve d. ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT

Copyr i g ht 2012, SAS Ins titut e Inc. All rights res er ve d. ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT ANALYTICAL MODEL DEVELOPMENT AGENDA Enterprise Miner: Analytical Model Development The session looks at: - Supervised and Unsupervised Modelling - Classification

More information

A News-Based Portfolio Management System

A News-Based Portfolio Management System Development and Application of Data Mining and Learning Systems Universität Bonn May 21, 2014 Goal Goal Based on the history of the stock and the corresponding news, predict its direction with a certain

More information
