Developing Data Models and Standards to Support Use Cases
Robert R. Freimuth, PhD
ClinGen/DECIPHER Meeting
May 27, 2015
2014 MFMER slide-1
Interoperability
- Semantic: requires a common understanding of the meaning of the information
  - Defining the "things" in a system
    - Names
    - Definitions
    - Properties (attributes)
    - Datatypes
    - Value sets
- Syntactic: requires the use of a common format to represent the information
  - Common platform for data exchange
  - Messaging protocol
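The two layers can be illustrated with a minimal sketch. It assumes a hypothetical `DataElement` class for the semantic layer (name, definition, datatype, value set) and uses JSON as a stand-in for the syntactic layer. The element name and the value set labels are illustrative assumptions, not taken from any standard.

```python
import json
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataElement:
    """Semantic layer: what a 'thing' is called, what it means, what it may hold."""
    name: str
    definition: str
    datatype: type
    value_set: FrozenSet[str]

    def validate(self, value) -> None:
        # Reject values of the wrong datatype or outside the agreed value set.
        if not isinstance(value, self.datatype):
            raise TypeError(f"{self.name}: expected {self.datatype.__name__}")
        if self.value_set and value not in self.value_set:
            raise ValueError(f"{self.name}: {value!r} is not in the value set")


# Hypothetical element; the labels are placeholders for a real coded value set.
METABOLIZER_STATUS = DataElement(
    name="metabolizer_status",
    definition="Predicted drug-metabolizing phenotype for a gene",
    datatype=str,
    value_set=frozenset({"poor", "intermediate", "normal"}),
)


def to_message(element: DataElement, value) -> str:
    """Syntactic layer: serialize a validated value in a common format (JSON here)."""
    element.validate(value)
    return json.dumps({"name": element.name, "value": value}, sort_keys=True)
```

Two systems that share only the JSON format can exchange the message, but they agree on its meaning only if they also share the element definition and value set.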
Standards Development
- Development
- Define domain
- Define use cases and requirements
- Motivation
- Evaluate existing standards
- Gap analysis
- Define scope of new
- Author
- Review, comment, refinement
- Approval
- Implementation
- Adoption
- Feedback informs next cycle
HL7 Clinical Genomics Work Group
HL7 ballot levels: Comment, Informative, DSTU (Draft Standard for Trial Use), Normative
Action Collaborative: Approach and Objectives
1. Identify and consider the issues, barriers, and challenges related to setting genomic data standards
2. Propose a collaborative approach to standardizing genomic information in the EHR that addresses data location, format, nomenclature, and interoperability issues
3. Prepare a written framework for standards development
4. Provide updates and findings at meetings of the Roundtable
5. Publish a call to action for implementing the framework for standards development
DIGITizE Action Collaborative: Use Cases
1. Display results to clinicians
   Comments: Initial focus is on generic display of pharmacogenomics results; disease-specific displays will be considered later.
2. Include genetic tests in order sets
   Comments: Involves including genetic tests in standard order sets without a check as to whether the test has been run before.
3. CDS to identify whether a test should be ordered (pre-test alert)
   Comments: Involves running a CDS algorithm when a drug is ordered to check whether related pharmacogenomic variants have already been assessed, and warning the clinician if they have not.
4. CDS to identify when a drug order is inconsistent with a test result (post-order alert)
   Comments: Involves running a CDS algorithm in response to a drug order to check whether pharmacogenomic results are present that contraindicate either the choice of drug or the specified dose.
DIGITizE Action Collaborative: Pilot
- Minimal PGx use cases that do not require structured allele data for CDS
  - HLA-B/Abacavir & TPMT/Azathioprine
- Structured phenotype (used for CDS)
  - Metabolizer status, drug sensitivity
  - Based on CPIC terminology standardization
  - Codes will be added to LOINC
- Genotype/alleles in narrative
- Approach
  - Use existing HL7 standard, specify codes
  - Vendors to implement constrained message
- Future: expand pilot to include structured allele data
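The pre-test and post-order alerts described in the use cases can be sketched against structured phenotypes alone, which is the point of the pilot: no structured allele data is needed. This is a minimal illustration, not the Collaborative's implementation. The function name, the lookup tables, and the phenotype labels are hypothetical placeholders (the pilot expects coded values from LOINC), and the rules are not clinical guidance.

```python
from typing import Dict, Optional

# Drug/gene pairs from the pilot; the table itself is a hypothetical structure.
DRUG_TO_GENE = {"abacavir": "HLA-B", "azathioprine": "TPMT"}

# Hypothetical (drug, structured phenotype) pairs that should trigger an alert.
# Free-text labels stand in for coded values.
ALERT_PHENOTYPES = {
    ("abacavir", "HLA-B*57:01 positive"),
    ("azathioprine", "TPMT poor metabolizer"),
}


def check_drug_order(drug: str, phenotypes: Dict[str, str]) -> Optional[str]:
    """Return an alert message for a drug order, or None if no alert applies.

    `phenotypes` maps a gene symbol to the structured phenotype on file.
    """
    key = drug.lower()
    gene = DRUG_TO_GENE.get(key)
    if gene is None:
        return None  # no pharmacogenomic rule for this drug
    phenotype = phenotypes.get(gene)
    if phenotype is None:
        # Use case 3: the relevant gene has not been assessed yet.
        return f"Pre-test alert: no {gene} result on file; consider testing before {drug}."
    if (key, phenotype) in ALERT_PHENOTYPES:
        # Use case 4: a result on file conflicts with the order.
        return f"Post-order alert: {gene} result '{phenotype}' conflicts with the {drug} order."
    return None
```

For example, `check_drug_order("abacavir", {})` returns a pre-test alert, while the same order for a patient with `{"HLA-B": "HLA-B*57:01 positive"}` on file returns a post-order alert.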
Use Cases
Examples shamelessly stolen from Marc Williams
Example of a CPIC Guideline
http://www.pharmgkb.org/gene/pa356
Examples of Genetic Test Results
Current Reality
- The lack of standard variant nomenclature and allele identification systems complicates data exchange and results in ad hoc solutions
  - Extra overhead required to maintain and interpret variants reported by other institutions
  - Project- or gene-specific systems are too fragile for enterprise and/or shared CDS
- Diverse reporting methods lead to data heterogeneity, complicate clinical interpretation, and hinder reuse
- Standards are needed to ensure data are interoperable
Context and Motivation
- ClinGen Goals
  - Enhance patient care through sharing clinically relevant genomic data
  - Capture curation (knowledge)
  - Enable analysis and discovery
- Must unambiguously define genetic results
  - Gene
  - Variant location
  - Observed alleles
  - Variant classification
  - Phenotypic interpretation
  - Clinical relevance
ClinGen Data Modeling WG
Models Supporting "Allele"
http://datamodel.clinicalgenome.org/allele/resource/
Allele Registry
- The community needs a real-time canonicalization service: the Allele Registry
- Provides a real-time service
  - Takes in a valid allele description
  - Checks to see if it knows that allele (regardless of format)
  - If it has seen that allele in some form, returns the same ID it previously returned
  - If it has not, returns a new ID
Slide courtesy of Larry Babb
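The lookup-or-mint behavior described for the registry can be sketched as a toy in-memory class. This is only an illustration under stated assumptions, not the Allele Registry's actual design. The class, the `ALLELE-` ID format, and the caller-supplied synonym table are hypothetical. Real canonicalization would map descriptions across reference sequences rather than rely on a lookup table.

```python
import itertools
from typing import Dict, Optional


class AlleleRegistry:
    """Toy registry: equivalent allele descriptions resolve to one stable ID."""

    def __init__(self, synonyms: Optional[Dict[str, str]] = None):
        # Hypothetical equivalence table: alternate description -> canonical one.
        self._synonyms = {
            self._normalize(alt): self._normalize(canonical)
            for alt, canonical in (synonyms or {}).items()
        }
        self._ids: Dict[str, str] = {}
        self._counter = itertools.count(1)

    @staticmethod
    def _normalize(description: str) -> str:
        # Placeholder normalization: drop whitespace and ignore case.
        return "".join(description.split()).lower()

    def register(self, description: str) -> str:
        """Return the existing ID for a known allele, or mint a new one."""
        key = self._normalize(description)
        if not key:
            raise ValueError("empty allele description")
        key = self._synonyms.get(key, key)
        if key not in self._ids:
            self._ids[key] = f"ALLELE-{next(self._counter):06d}"
        return self._ids[key]
```

With `AlleleRegistry(synonyms={"g.18079228c>t": "c.460g>a"})`, registering `"c.460g>a"`, `" C.460G>A "`, and `"g.18079228c>t"` yields the same ID each time, while a different description receives a new one.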
What's in a Name?
- Stakeholders have different requirements
  - Identification: rs1800460
  - Definition: TPMT*3B, c.460g>a, g.18079228c>t, g.18139228c>t, g.18247207c>t, g.21147g>a
  - Annotation: Allele T is associated with decreased enzyme activity when treated with mercaptopurine
  - Interpretation: If *1/*3B, adjust dose
- No single solution will satisfy all consumers
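One way to picture the four requirements is as separate fields on a single record, with each consumer reading only the fields it needs. The sketch below is illustrative only. `VariantRecord`, `CONSUMER_FIELDS`, and the consumer roles are hypothetical names, and the example values are copied from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VariantRecord:
    identifier: str                                        # identification
    definitions: List[str] = field(default_factory=list)   # definition(s)
    annotation: str = ""                                   # annotation
    interpretation: str = ""                               # interpretation


# Hypothetical mapping of consumer role -> fields that consumer cares about.
CONSUMER_FIELDS = {
    "registry": ["identifier", "definitions"],
    "researcher": ["identifier", "annotation"],
    "clinician": ["identifier", "interpretation"],
}


def view_for(record: VariantRecord, consumer: str) -> Dict[str, object]:
    """Return only the fields relevant to the given consumer role."""
    return {name: getattr(record, name) for name in CONSUMER_FIELDS[consumer]}


tpmt_3b = VariantRecord(
    identifier="rs1800460",
    definitions=["TPMT*3B", "c.460g>a", "g.18079228c>t"],
    annotation="Allele T is associated with decreased enzyme activity "
               "when treated with mercaptopurine",
    interpretation="If *1/*3B, adjust dose",
)
```

Calling `view_for(tpmt_3b, "clinician")` returns the identifier and the dosing interpretation only, which is one way to serve different stakeholders from a shared model without forcing a single name to carry every meaning.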
(Not) Reinventing the Wheel
- Existing systems are fit-for-use
  - Identification systems (e.g., dbSNP IDs)
  - Legacy nomenclature (e.g., star alleles)
  - All-in-one attempts (e.g., HGVS)
  - Annotation systems (e.g., GO, SO)
  - Data exchange formats (e.g., VCF)
- Resources under development
  - ClinVar (genotype-phenotype)
  - GTR (test registry)
Standard Development Take Home Points
- Standard dev. is driven by real-world use cases
- Standards are developed by those that show up
  - Nearly all participants are volunteers
- Robust standards require diverse expertise
  - Domain SMEs, modeling, technical
  - Clinical, academic, industry, government
- Standards are not perfect or perfectly comprehensive
  - Gaps in structure and/or content
- Easy to create Yet Another Standard
  - Should contribute to existing, not develop new
http://xkcd.com/927/
Standard Development Open Question
- Does the current model of standards development best serve the interests of the biomedical domain?
  - Timeline
  - Quality
  - Incentives
  - Funding
  - Sustainability
  - Scalability