An Introduction to PhysioNet
George B. Moody <george@mit.edu>
Roger G. Mark <rgmark@mit.edu>
Massachusetts Institute of Technology, Cambridge, MA
21 June 2013
1 / 23
What is PhysioNet?
A unique open-access, web-based resource established in 1999 and funded by NIBIB and NIGMS (two institutes of the NIH), intended to support current research and stimulate new investigations in the study of complex biomedical and physiologic signals and time series.
Three closely interdependent components:
- Data repository (PhysioBank)
- Library of related software (PhysioToolkit)
- Workspaces for collaborative development (PhysioNetWorks)
2 / 23
Why Study Signals? Physiologic signals and time series reveal aspects of health, disease, biotoxicity, and aging not captured by static measures. Raw (original) signals are of increasing interest as means of developing new biomarkers, of measuring parameters of known interest, and also for developing new insights into basic mechanisms of human physiology. 3 / 23
What is PhysioBank?
PhysioBank currently includes over 50 collections of cardiopulmonary, neural, and other biomedical signals from healthy subjects and patients with a variety of conditions with major public health implications, including sudden cardiac death, congestive heart failure, epilepsy, gait disorders, sleep apnea, and aging.
- Signals include ECG, blood pressure, EEG, respiration, PPG, ...
- Records are from many sources, with varied assortments of signals, durations, sampling frequencies, ... and annotations of differing levels of detail
- Over 36,000 sets of recordings (about 4 terabytes) in all
4 / 23
Data Sources
Where Do the Data Collections Come From?
- PhysioNet research team members
- Other university-based researchers
- Other hospital-based researchers
- Industry
- You! (email webmaster@physionet.org)
Data contributions are developed and reviewed on PhysioNetWorks before inclusion in PhysioBank.
5 / 23
Example of a PhysioBank Dataset Physiologic time series, such as this series of cardiac interbeat (RR) intervals measured over 24 hours, can capture some of the information lost in summary statistics. Data from the NHLBI Cardiac Arrhythmia Suppression Trial (CAST) RR Interval Sub-study Database 6 / 23
Another PhysioBank Dataset
Many data collections in PhysioBank come from published studies.
Hausdorff et al., J Appl Physiol 86(3):1040-7 (1999).
7 / 23
Finding Relevant Data in PhysioBank
PhysioBank Record Search
- A web application for locating data of interest
- Simple query server (SQS) searches an in-memory index of PhysioBank
- Each index row begins with a record name and describes a feature of that record:
  - A signal: class, name, time resolution, amplitude resolution, duration
  - A set of annotations: class, name, time resolution, number of annotations, duration
  - A subset of annotations: type, class, name, ...
  - Other data pertaining to the recording or the subject: age, sex, diagnoses, medications, procedures, bandwidth, ...
- Once per day, the index is checked and updated automatically if necessary.
8 / 23
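The index described above can be pictured with a small sketch. This is only an illustration of the idea of rows keyed by record name and grouped for lookup; the actual SQS is written in C, and the names `IndexRow`, `RecordIndex`, `records_with`, and the field layout are hypothetical, not taken from PhysioNet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRow:
    """One index row: a record name plus one feature of that record."""
    record: str          # record name (the first field of every row)
    subject: str         # hypothetical key, e.g. a signal class such as "ECG"
    details: tuple = ()  # remaining fields, left uninterpreted in this sketch


class RecordIndex:
    """In-memory index that groups rows by subject for quick lookup."""

    def __init__(self):
        self._rows_by_subject = defaultdict(list)

    def add(self, row: IndexRow) -> None:
        self._rows_by_subject[row.subject].append(row)

    def records_with(self, subject: str) -> list:
        """Return the sorted, de-duplicated record names that have `subject`."""
        rows = self._rows_by_subject.get(subject, [])
        return sorted({row.record for row in rows})
```

Grouping rows by subject up front means a query only scans the rows that could match, which is one plausible way to keep per-query time low.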
PhysioBank Simple Query Server
- Daemon
- Small (roughly 1000 lines of C code) and fast (20-30 ms/query)
- Simple queries have three components:
  - A subject (class of data)
  - A relationship (comparison operator)
  - A value (pattern) to be compared with the rows of the index
- Examples of simple queries:
  age>85
  sex = F
  ECG3>10:0:0
  /(VT?
9 / 23
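To make the three-part structure concrete, here is a minimal sketch that splits a query string into subject, relationship, and value. It is not the SQS parser: it assumes only the standard comparison operators (`<`, `<=`, `=`, `>=`, `>`, `!=`), so a query such as `/(VT?` is rejected, and the function name `parse_query` is hypothetical.

```python
import re

# Standard comparison operators only. Two-character operators are listed
# first so that "<=" is not read as "<" followed by "=".
_QUERY_RE = re.compile(r"^\s*(.+?)\s*(<=|>=|!=|<|>|=)\s*(.*?)\s*$")


def parse_query(text):
    """Split a simple query into a (subject, relationship, value) tuple."""
    match = _QUERY_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognized simple query: {text!r}")
    return match.groups()
```

For example, `parse_query("sex = F")` separates the subject `sex`, the relationship `=`, and the value `F`, ignoring the surrounding spaces.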
Simple Query Subjects
- Signal classes: BP, CO, CO2, ECG, EEG, EMG, EOG, EP, Flow, HR, Noise, O2, PLETH, Pos[ition], Resp, Sound, ST, Stim[ulus], SV, Temp[erature]
- Annotator classes: AnnM [machine annotations], AnnR [reference annotations]
- Literal types: annotation types, annotator names, signal names
- Other data: age, sex, diagnoses, medications, info
10 / 23
Simple Query Relationships
- Standard comparisons: < <= = >= > !=
- Similar (within 10% of a numeric value, or containing a string value):
- Different (not similar):
- Valid: ?!
11 / 23
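The "similar" relationship can be sketched as follows. This is one reading of the rule on the slide, not the SQS implementation: it assumes the 10% tolerance is taken relative to the query value and that string matching ignores case, and the function name `is_similar` is hypothetical.

```python
def is_similar(row_value, query_value):
    """Return True if `row_value` is 'similar' to `query_value`.

    Numeric values: within 10% of the query value (assumed reference).
    Other values: the row value contains the query string
    (case-insensitive matching is an assumption of this sketch).
    """
    try:
        row_number = float(row_value)
        query_number = float(query_value)
    except ValueError:
        return query_value.lower() in row_value.lower()
    return abs(row_number - query_number) <= 0.10 * abs(query_number)


def is_different(row_value, query_value):
    """'Different' is defined on the slide as 'not similar'."""
    return not is_similar(row_value, query_value)
```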
Simple Query Values (Patterns)
- Duration: string containing a colon (:)
  0:30   15:0   90:0   1:30:0   100:0:0
- Sampling frequency (time resolution): string ending in Hz
  200Hz   0.01Hz   128 Hz
- Gain (amplitude resolution): string containing adu/
  50adu/mmHg   200 adu/mv
- Other numbers in standard integer or floating-point notation
  50   12.5   -1   4.5e2
- Other strings
  Propanolol   anterior infarct
12 / 23
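The pattern rules above amount to a small classifier. The sketch below applies them in the order listed; it assumes that colon-separated durations are minutes:seconds or hours:minutes:seconds, and the names `parse_duration` and `classify_value` are hypothetical rather than part of the SQS.

```python
def parse_duration(text):
    """Convert 'm:s' or 'h:m:s' to seconds (field order is an assumption)."""
    seconds = 0.0
    for field in text.split(":"):
        seconds = seconds * 60 + float(field)
    return seconds


def classify_value(text):
    """Classify a query value using the pattern rules from the slide."""
    text = text.strip()
    if ":" in text:                       # duration: contains a colon
        return ("duration", parse_duration(text))
    if text.endswith("Hz"):               # sampling frequency: ends in Hz
        return ("frequency", float(text[:-2]))
    if "adu/" in text:                    # gain: contains adu/
        gain, unit = text.split("adu/", 1)
        return ("gain", float(gain), unit.strip())
    try:                                  # other numbers
        return ("number", float(text))
    except ValueError:                    # anything else is a plain string
        return ("string", text)
```

Because the value's form alone determines its type, a query needs no explicit units keyword: `10:0:0` is read as a duration and `128 Hz` as a sampling frequency.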
PhysioBank Record Search: Web UI Try PhysioBank Record Search at http://physionet.org/cgi-bin/pbsearch/ 13 / 23
Viewing PhysioBank Data LightWAVE is a lightweight waveform and annotation viewer and editor, for viewing any of the signals and time series in PhysioBank. It runs in any modern web browser and does not require installation. Try LightWAVE at http://physionet.org/lightwave/ 14 / 23
What Can You Do with PhysioBank Data?
- Download for exploration and research
- Develop new signal processing algorithms
- Evaluate algorithms using standard data
- Test physiologic models
- Develop, test, and refine new biomarkers
- Create real-world classroom challenges at undergraduate, graduate, and post-graduate levels
15 / 23
What is PhysioToolkit?
Open-source software for physiologic signal processing and analysis:
- Detection of physiologically significant events using both classical techniques and novel methods
- Interactive display and characterization of signals; creation of new data collections
- Physiologic signal modeling
- Quantitative evaluation and comparison of analysis methods
16 / 23
Where Does the Open-Source Software Come From?
- PhysioNet research team members
- Contributions from individuals and teams around the world
- PhysioNet/Computing in Cardiology annual Challenges
- You! (email webmaster@physionet.org)
Software contributions are developed and reviewed on PhysioNetWorks before inclusion in PhysioToolkit.
17 / 23
Open Source Tools: WFDB Software Projects requiring large amounts of data can process them efficiently using WFDB software. The WFDB library reads and writes signals and annotations in many commonly-used binary formats, providing uniform access to data from local disks and from the web. About 80 WFDB applications included in the WFDB software package use the library. They include signal-processing functions (digital filters, beat detectors, signal averagers), annotation-processing functions (comparators, outlier detectors, editors), and more. Open-source software runs anywhere, and you can modify it to fit the needs of your research. Use the WFDB library in your own software to read PhysioBank data directly from physionet.org or your own archives. 18 / 23
Some PhysioNet Contributions Include Data and Software
This contribution included:
- A collection of ECG recordings with detailed waveform limit annotations made by human experts
- Open-source software for location of ECG waveform limits
- A reprint of a paper describing the evaluation of the software using the data collection
19 / 23
Tutorials and Reference Materials
Featured in Nature News and Views 2002; 419:263.
Our on-line library contains dozens of tutorials introducing PhysioNet's data and software resources, and analysis methods pioneered by PhysioNet team members. Extensive documentation of PhysioToolkit software is also available in the form of a set of on-line reference manuals.
20 / 23
PhysioNet/Computing in Cardiology Challenges
Inaugurated shortly after PhysioNet was launched, this annual series of open engineering challenges invites participants to tackle unsolved, clinically interesting problems that can be addressed using data PhysioNet provides. Topics have included:
- Detecting Sleep Apnea from the ECG
- Predicting Paroxysmal Atrial Fibrillation
- RR Interval Time Series Modeling
- Distinguishing Ischemic from Non-Ischemic ST Changes
- Spontaneous Termination of Atrial Fibrillation
- QT Interval Measurement
- Electrocardiographic Imaging of Myocardial Infarction
- Detecting and Quantifying T-Wave Alternans
- Predicting Acute Hypotensive Episodes
- Mind the Gap [Robust estimation of missing data]
- Improving the Quality of ECGs Collected using Mobile Phones
- Predicting Mortality of ICU Patients
- Noninvasive Fetal ECG
21 / 23
PhysioNetWorks PhysioNetWorks workspaces are available to members of the PhysioNet community for works in progress that will be made publicly available via PhysioNet when complete. Unlike other areas of PhysioNet, these workspaces are password-protected. Any PhysioNet visitor can become a PhysioNetWorks member in a few minutes by creating a personal account. To get started, visit https://physionet.org/users/ 22 / 23
If you missed any of that...
PhysioNet is a resource for biomedical research and development, established in 1999 and funded by NIBIB and NIGMS. It provides:
- PhysioBank: About 50 free collections of recorded physiologic signals from patients and healthy controls
- PhysioToolkit: Open-source software for viewing, annotating, and analyzing PhysioBank and compatible data
- Challenges: Open competitions focusing on unsolved, clinically interesting problems that can be addressed using PhysioBank data
- PhysioNetWorks: Password-protected, sharable space for works in progress that will be publicly available when complete
Learn more at http://physionet.org/tour/
23 / 23