Building A Knowledge Base of Severe Adverse Drug Events Based On AERS Reporting Data Using Semantic Web Technologies

1 Building A Knowledge Base of Severe Adverse Drug Events Based On AERS Reporting Data Using Semantic Web Technologies Guoqian Jiang, MD, PhD Mayo Clinic College of Medicine, Rochester, MN, USA MEDINFO 2013 Copenhagen, Denmark August 21, MFMER slide-1

2 Acknowledgements
Co-authors: Liwei Wang (Jilin University, China); Hongfang Liu (Mayo Clinic, USA); Harold R. Solbrig (Mayo Clinic, USA); Christopher G. Chute (Mayo Clinic, USA).
This work was supported in part by the SHARP Area 4: Secondary Use of EHR Data (90TR000201).

3 Introduction
Adverse drug events (ADEs) are a well-recognized cause of patient morbidity and increased health care costs.
A semantically coded knowledge base of ADEs with severity information is critical for clinical decision support systems and translational research applications.

4 In the Field of Translational Research
Pharmacogenomics study of ADEs: the genetic component of ADEs is considered one of the significant contributing factors to drug response variability and drug toxicity.
PharmGKB, initiated by NIH: collects and disseminates human-curated information about the impact of human genetic variation on drug responses.
Canadian Pharmacogenomics Network for Drug Safety: aims to identify novel predictive genomic markers of severe ADEs in children and adults.

5 ADEpedia Project
A standardized knowledge base of ADEs that integrates existing ADE knowledge for drug safety surveillance from disparate resources such as the FDA Structured Product Labeling (SPL), the FDA Adverse Event Reporting System (AERS) and the Unified Medical Language System (UMLS).
A framework of knowledge integration and discovery that aims to support pharmacogenomic target prediction of ADEs.

6 Severe ADEs
Because clinical applications of pharmacogenomics to ADEs usually focus on clinically severe ADEs, we designed a module in the ADEpedia framework for extracting severe ADE knowledge.
However, few open-source ADE knowledge resources with severity information are available, and measuring and identifying the severity of ADEs remains a challenging task.

7 Objective of the study
To develop and evaluate a semantic web based approach for building a knowledge base of severe ADEs based on the FDA AERS reporting data.

8 Semantic Web Technologies
The W3C standards:
The Resource Description Framework (RDF): a model of directed, labeled graphs, expressed as a set of triples (subject, predicate, object).
SPARQL: a query language for RDF graphs.
The Web Ontology Language (OWL): a standard ontology language used for ontology modeling.
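As a rough illustration of the triple model and of SPARQL-style pattern matching, here is a minimal sketch that uses plain Python tuples instead of an RDF library. All identifiers (the `ex:` names and the `match` helper) are hypothetical and are not taken from the ADEpedia data.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical identifiers, for illustration only.
triples: List[Triple] = [
    ("ex:pair1", "ex:hasDrug", "ex:drugA"),
    ("ex:pair1", "ex:hasEvent", "ex:cardiacArrest"),
    ("ex:pair1", "ex:hasGrade", "ex:Grade5"),
    ("ex:pair2", "ex:hasDrug", "ex:drugB"),
    ("ex:pair2", "ex:hasEvent", "ex:palpitations"),
    ("ex:pair2", "ex:hasGrade", "ex:Grade3"),
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard,
    much like a variable in a SPARQL triple pattern."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly: SELECT ?pair WHERE { ?pair ex:hasGrade ex:Grade5 }
grade5_pairs = [subj for subj, _, _ in match(triples, p="ex:hasGrade", o="ex:Grade5")]
```

A real deployment would store the triples in an RDF store and run actual SPARQL queries; the sketch only shows the shape of the data and of a single-pattern query.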

9 Materials (I)
Normalized AERS dataset: AERS-DM.
Reporting data from 2004 to 2011.
Drug names: normalized to RxNorm codes by MedEx and mapped to the NDF-RT drug classes.
ADE names: normalized to MedDRA codes and aggregated to the System Organ Class (SOC) codes.
Contains 4,639,613 putative drug-ADE pairs.
Unique report ID number (ISR): used to identify the outcome codes.

11 Materials (II)
Common Terminology Criteria for Adverse Events (CTCAE) and its grading system.
We used CTCAE version 4.0 rendered in the Web Ontology Language (OWL) format, which is publicly available.
This version contains 764 AE terms and 26 "Other, specify" options for reporting text terms not listed in CTCAE.
Each adverse event (AE) term is associated with a 5-point severity scale.
The AE terms are grouped by MedDRA primary SOC classes.
In the CTCAE, "Grade" refers to the severity of the adverse event.

13 Materials (III)
ADE datasets.
SIDER 2: released on October 17, 2012; contains 996 drugs, 4,192 side effects (SE), and 99,423 drug-SE pairs.
UMLS ADE dataset from ADEpedia: contains 266,832 drug-disorder concept pairs, covering 14,256 (1.69%) distinct drug concepts and 19,006 (3.53%) distinct disorder concepts.
There are a total of 102 relationships between the drug-disorder concept pairs: 1. Indications; 2. Contraindications; 3. Adverse drug effects; and 4. Other associations.

14 Methods
Linking outcome codes with putative drug-ADE pairs.
Validating the drug-ADE associations.
Data integration in a semantic web framework.
Classifying the AERS ADEs into the CTCAE in OWL: we asserted the mappings between AERS outcome codes and CTCAE grades.
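The linking and grading steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not list the asserted mappings, so the `OUTCOME_TO_GRADE` table (death to Grade 5, life-threatening to Grade 4, hospitalization to Grade 3) is an assumption, and the report IDs, drug codes and ADE codes are placeholders.

```python
from collections import defaultdict

# ASSUMPTION: illustrative mapping from AERS outcome codes to CTCAE grades.
# DE = death, LT = life-threatening, HO = hospitalization.
# The slides do not state the actual asserted mappings.
OUTCOME_TO_GRADE = {"DE": 5, "LT": 4, "HO": 3}

# Placeholder rows: (ISR report ID, drug code, ADE code).
pairs = [
    ("1001", "RX-A", "PT-1"),
    ("1002", "RX-A", "PT-1"),
    ("1003", "RX-B", "PT-2"),
]

# Placeholder rows: (ISR report ID, outcome code); OT = other.
outcomes = [
    ("1001", "DE"),
    ("1002", "HO"),
    ("1002", "OT"),
    ("1003", "OT"),
]


def grade_pairs(pairs, outcomes, mapping):
    """Join pairs to outcome codes on the ISR and collect the grades per pair."""
    codes_by_isr = defaultdict(set)
    for isr, code in outcomes:
        codes_by_isr[isr].add(code)

    grades = defaultdict(set)
    for isr, drug, ade in pairs:
        for code in codes_by_isr.get(isr, ()):
            if code in mapping:  # outcome codes without a mapped grade are skipped
                grades[(drug, ade)].add(mapping[code])
    return dict(grades)


graded = grade_pairs(pairs, outcomes, OUTCOME_TO_GRADE)
```

Because the same drug-ADE pair can occur in several reports with different outcomes, the sketch keeps a set of grades per pair rather than a single value.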

15 System Architecture [figure]

16 Results
We produced a Cardiac-AERS-DM dataset that contains 164,895 entries with 21,757 unique putative drug-ADE pairs, covering 3,073 unique drug codes in RxNorm and 251 unique ADE codes in MedDRA.
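The difference between entries and unique pairs in such a subset can be sketched as below. The field names and sample values are hypothetical; only the idea of restricting the normalized reports to one SOC and deduplicating on (drug code, ADE code) comes from the slides.

```python
from collections import namedtuple

# Hypothetical record layout for one normalized report row.
Report = namedtuple("Report", ["isr", "rxnorm_code", "meddra_code", "soc"])

# Placeholder data, not real AERS content.
reports = [
    Report("1001", "RX-A", "PT-1", "Cardiac disorders"),
    Report("1002", "RX-A", "PT-1", "Cardiac disorders"),
    Report("1003", "RX-B", "PT-2", "Cardiac disorders"),
    Report("1004", "RX-B", "PT-9", "Nervous system disorders"),
]


def unique_pairs(rows, soc):
    """Return the distinct (drug code, ADE code) pairs reported under one SOC."""
    return {(r.rxnorm_code, r.meddra_code) for r in rows if r.soc == soc}


cardiac_entries = [r for r in reports if r.soc == "Cardiac disorders"]
cardiac_pairs = unique_pairs(reports, "Cardiac disorders")
cardiac_drugs = {drug for drug, _ in cardiac_pairs}
cardiac_ades = {ade for _, ade in cardiac_pairs}
```

Here three entries collapse into two unique pairs, in the same way that the 164,895 entries reported above collapse into 21,757 unique pairs.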

17 For validated drug-ADE pairs
We had 2,444 unique pairs, of which 760 pairs are in Grade 5, 775 pairs in Grade 4 and 2,196 pairs in Grade 3. The per-grade counts sum to 3,731, more than 2,444, which implies that a pair can fall into more than one grade.
The drug-ADE pairs cover 821 unique drug codes in RxNorm and 69 unique ADE codes in MedDRA, and 20 of the 36 AE terms (55.6%) under the Cardiac Disorders category in CTCAE were covered.

20 Severity Classification of ADEs in CTCAE [figure]

21 Discussion
We utilized a normalized AERS dataset, in which the drug names are normalized using the standard drug ontologies RxNorm and NDF-RT and the ADEs are normalized using MedDRA. This facilitated interoperability between ADE datasets (e.g., mappings to SIDER).

22 Validation Pipeline
The SIDER dataset should be considered a silver standard rather than a gold standard for the validation.
Although the UMLS drug-disorder pairs covered only a small portion of the putative drug-ADE pairs (1.4%), the validation illustrated the usefulness of known ADE knowledge asserted in the UMLS for distinguishing indications from ADEs.
For new ADEs that have not yet been recognized, a robust ADE detection algorithm will be required in the future.
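One way such a validation pipeline could be sketched is shown below. The slides do not give the exact rules, so this is an assumption-laden illustration: a putative pair is kept only if it appears in the SIDER silver standard and is not asserted as an indication in the UMLS dataset. The function name and all sample pairs are hypothetical.

```python
def validate(putative, sider_pairs, umls_indications):
    """Keep putative (drug, ADE) pairs that the silver standard supports,
    after discarding pairs known to be indications rather than ADEs.

    ASSUMPTION: this two-step filter is illustrative; the slides do not
    spell out the exact validation rules.
    """
    validated = set()
    for pair in putative:
        if pair in umls_indications:
            continue  # the "disorder" is what the drug treats, not an ADE
        if pair in sider_pairs:
            validated.add(pair)
    return validated


# Placeholder pairs, not real data.
putative = {("drugA", "arrhythmia"), ("drugA", "hypertension"), ("drugB", "syncope")}
sider_pairs = {("drugA", "arrhythmia"), ("drugA", "hypertension")}
umls_indications = {("drugA", "hypertension")}

validated = validate(putative, sider_pairs, umls_indications)
```

In this toy run the hypertension pair is dropped as an indication, and the syncope pair is dropped because the silver standard does not list it, which is also how a genuinely new ADE would be missed.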

23 Rationale for the use of the CTCAE grading system
The CTCAE is a standard widely used in clinical cancer studies for recording AE severity.
It has clear severity definitions using a 5-point grading scale.
It includes the most common AE terms, which have been well classified and mapped to the standard AE vocabulary MedDRA.
It contains well-defined conditions for grading the severity of AE terms based on domain knowledge.

24 Leveraging Semantic Web Technologies
We leveraged semantic web technologies, which provide a scalable framework for integrating heterogeneous ADE resources.
In particular, we represented validated drug-ADE pairs in the OWL format, which not only provides seamless integration with the CTCAE but also enables a standard infrastructure for automatic classification of ADEs based on the severity conditions specified in the CTCAE.

25 Summary
We developed a semantic web based approach for building a standard severe ADE knowledge base using normalized FDA AERS reporting data.
The datasets produced in this study are publicly available from our ADEpedia website.
Although we focused on the Cardiac Disorders domain, we believe the approach can be readily generalized to the other domains available in the AERS reporting data.

27 Questions & Discussion