Recent publications & Announcements

1 Recent publications & Announcements HLP Seminar October 2018

5 Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter. Klein AZ, Sarker A, Cai H, Weissenbacher D, Gonzalez-Hernandez G. Journal of Biomedical Informatics. 2018; in press.

8 Data-centric text summarization methods to support evidence-based medicine
Abeed Sarker, Ph.D., Research Associate
Institute for Biomedical Informatics
Department of Biostatistics, Epidemiology and Informatics
10/04/2018

9 Evidence-based medicine
- Information overload (PubMed has 28M+ articles)
- Automation can help with:
  - Summarizing evidence
    - Generating query-focused summaries of individual publications
    - Generating bottom-line recommendations from multiple publications
  - Appraising the quality of evidence
Sackett DL, et al. BMJ. 1996;312(7023):

10 Sample evidence-based answer
[Figure: annotated sample answer with the labels Query, Multi-document summary, Quality of evidence, and Single-document summary. Reprinted with permission from the Journal of Family Practice.]

11 Corpus creation
- Clinical queries: 456
- Bottom-line recommendations: 1396
- Bottom-line answers with quality grades: 1225
- Detailed justifications (single-document summaries): 3036
- Unique referenced articles: 2908
Molla D, Santiago-Martinez ME, Sarker A, Paris C. Lang Resources & Evaluation (2016) 50: DOI: /s

12 Summarization
- Target: 3-sentence summary (query-focused, extractive)
- Past methods
  - Sentence positions (e.g., higher score for later sentences)
  - Sentence classifications (e.g., Outcome sentences)
- Proposed methods
  - Target-sentence-specific summarization [1,2]
  - Sentence classification and selection customized to query type
  - Use of semantic associations for scoring
- Scores combined via the Edmundsonian paradigm: $\mathrm{Score}_x = \beta_1 S_1 + \beta_2 S_2 + \beta_3 S_3 + \beta_4 S_4$
[1] Sarker A, Molla D, Paris C. Extractive summarisation of medical documents using domain knowledge and corpus statistics. Australas Med J. 2012;5(9): PMID:
[2] Sarker A, Molla D, Paris C. Extractive evidence based medicine summarisation based on sentence-specific statistics. In: Proc. CBMS 25th Int. Symp. 2012;
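The weighted combination on the Summarization slide (a score built as a sum of weights times per-feature scores) can be illustrated with a minimal Python sketch. The slide does not say what the four feature scores are or how the weights are chosen, so the feature vectors, the weight values, and the function names `combined_score` and `top_k_sentences` below are all hypothetical.

```python
from typing import List, Sequence


def combined_score(feature_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted linear combination: sum over i of beta_i * S_i."""
    if len(feature_scores) != len(weights):
        raise ValueError("need exactly one weight per feature score")
    return sum(b * s for b, s in zip(weights, feature_scores))


def top_k_sentences(
    sentences: Sequence[str],
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    k: int = 3,
) -> List[str]:
    """Pick the k highest-scoring sentences and return them in document order."""
    scored = [(combined_score(f, weights), i) for i, f in enumerate(features)]
    # Sort by descending score; break ties by earlier position in the document.
    best = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    return [sentences[i] for i in sorted(i for _, i in best)]


# Hypothetical inputs: four sentences, four feature scores each
# (for example position, sentence type, and two association scores).
weights = [0.4, 0.3, 0.2, 0.1]
sentences = ["s0", "s1", "s2", "s3"]
features = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1]]
summary = top_k_sentences(sentences, features, weights, k=3)
```

Returning the selected sentences in document order is a design choice of this sketch, made so the 3-sentence extract reads in the order of the source abstract.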

13 Data-centric scoring features
- Data-centric statistics generation
- Sentence position and sentence type scores
Sarker A, Molla D, Paris C. Query-oriented evidence extraction to support evidence-based medicine practice. J Biomed Inform Feb;59: PMID:

14 Query-specific customizations
- The contents of evidence-based answers depend on the types of questions
  - Maximal marginal relevance with n-grams and semantic types
  - Customized scores for associations between semantic types
Sarker A, Molla D, Paris C. Query-oriented evidence extraction to support evidence-based medicine practice. J Biomed Inform Feb;59: PMID:
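Maximal marginal relevance (MMR) greedily picks sentences that are relevant to the query while penalizing overlap with sentences already chosen. The sketch below is a generic illustration over word n-grams only. It assumes Jaccard overlap as the similarity measure and a trade-off parameter `lam`, and it leaves out the semantic types mentioned on the slide. None of these choices is taken from the cited paper.

```python
from typing import List, Set, Tuple


def ngrams(text: str, n: int = 1) -> Set[Tuple[str, ...]]:
    """Set of lowercase word n-grams from whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap between two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_select(query: str, sentences: List[str], k: int = 3,
               lam: float = 0.7, n: int = 1) -> List[str]:
    """Greedy MMR: lam * relevance minus (1 - lam) * redundancy."""
    q = ngrams(query, n)
    grams = [ngrams(s, n) for s in sentences]
    selected: List[int] = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i: int) -> float:
            # Redundancy is the highest similarity to any already-selected sentence.
            redundancy = max((jaccard(grams[i], grams[j]) for j in selected),
                             default=0.0)
            return lam * jaccard(grams[i], q) - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam` close to 1 the selection reduces to plain relevance ranking; lower values push harder against near-duplicate sentences.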

15 Evaluation and results
- Automatic evaluation using ROUGE
Sarker A, Molla D, Paris C. An approach for query-focused text summarisation for evidence based medicine. In: Proc. AIME Murcia, Spain;

16 Quality of evidence prediction
- Strength of Recommendation Taxonomy: three-level scale (A, B, C)
- Difficult problem with low inter-rater agreement (mean agreement ~0.5)
- Raters tend to categorize borderline evidence as B
Sarker A, Molla D, Paris C. Automatic evidence quality prediction to support evidence-based decision making. Artif Intell Med Jun;64(2): PMID:

17 Sample summary
Q: Are there big differences in beta-blockers in treating essential hypertension?
Automatic extractive summary: Because the pathophysiology of hypertension differs in older and younger patients, we designed this meta-analysis to clarify the efficacy of beta-blockers in different age groups. In placebo-controlled trials, beta-blockers reduced major cardiovascular outcomes in younger patients (risk ratio [RR] 0.86, 95% confidence interval [CI] , based on 794 events in patients) but not in older patients (RR 0.89, 95% CI , based on 1115 events in 8019 patients). Beta-blockers should not be considered first-line therapy for older hypertensive patients without another indication for these agents; however, in younger patients beta-blockers are associated with a significant reduction in cardiovascular morbidity and mortality.
(Quality of evidence: A)
PMID:

18 Future directions
- Multi-document summarization to generate bottom-line recommendations
  - Two-step summarization approach [1]
  - Customized strategies for question types in the second step [2]
- Speeding up the summarization process by reducing reliance on clunky tools & packages
- Use more efficient text representation methods (e.g., dense vectors)
- Extrinsic evaluation: generating rapid reviews
[1] Sarker A, Molla D, Paris C. Towards two-step multi-document summarisation for evidence based medicine: A quantitative analysis. In: Proc. ALTA. Melbourne, Australia. 2012;
[2] Sarker A, Molla D, Paris C. Automatic prediction of evidence-based recommendations via sentence-level polarity classification. In: Proc. IJCNLP. Nagoya, Japan. 2013;

19 Acknowledgments
- Collaborators
  - Dr. Cecile Paris (CSIRO, Australia)
  - Dr. Diego Molla-Aliod (CLT, Macquarie University)
- Funding
  - CSIRO
  - Macquarie University
- Contact: Abeed Sarker, abeed@pennmedicine.upenn.edu