GUIDELINES FOR REPORTING STATISTICAL RESULTS

Similar documents
Biostatistics 208 Data Exploration

Quantitative Methods. Presenting Data in Tables and Charts. Basic Business Statistics, 10e 2006 Prentice-Hall, Inc. Chap 2-1

Overview. Presenter: Bill Cheney. Audience: Clinical Laboratory Professionals. Field Guide To Statistics for Blood Bankers

= = Name: Lab Session: CID Number: The database can be found on our class website: Donald s used car data

Business Statistics: A Decision-Making Approach 7 th Edition

points in a line over time.

= = Intro to Statistics for the Social Sciences. Name: Lab Session: Spring, 2015, Dr. Suzanne Delaney

Why Learn Statistics?

Activities supporting the assessment of this award [3]

Basic applied techniques

1. Contingency Table (Cross Tabulation Table)

SPSS 14: quick guide

Guidelines for Collecting Data via Excel Templates

Fraud Detection in Clinical Trials: A Graphical Tool

Seven Basic Quality Tools. SE 450 Software Processes & Product Metrics 1

The Dummy s Guide to Data Analysis Using SPSS

Observations from the Winners of the 2013 Statistics Poster Competition Praise and Future Improvements

Module - 01 Lecture - 03 Descriptive Statistics: Graphical Approaches

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 2 Organizing Data

Calculating the Standard Error of Measurement

Multiple Responses Analysis using SPSS (Dichotomies Method) A Beginner s Guide

Section 9: Presenting and describing quantitative data

PRESENTING DATA ATM 16% Automated or live telephone 2% Drive-through service at branch 17% In person at branch 41% Internet 24% Banking Preference

Rounding a method for estimating a number by increasing or retaining a specific place value digit according to specific rules and changing all

STA Module 2A Organizing Data and Comparing Distributions (Part I)

Chart Recipe ebook. by Mynda Treacy

To provide a framework and tools for planning, doing, checking and acting upon audits

Introduction to Control Charts

MARK SCHEME for the October/November 2013 series 0610 BIOLOGY. 0610/62 Paper 6 (Alternative to Practical), maximum raw mark 40

JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

ASTEROID. Profiler Module. ASTEROID Support: Telephone

CMIP Performance Measure Trend Report Guide Disease Specific Care (DSC) Certified Program Rates and National/State Overall Rates

Transcript. The Toolkit Level 2 Spreadsheet

March 2017 Version Essential Skills Application of Number Level 2

Variables and data types

DIGITAL VERSION. Microsoft EXCEL Level 2 TRAINER APPROVED

Step-by-Step Instructions: Using Unit Level Labor Analysis

Chapter 1. * Data = Organized collection of info. (numerical/symbolic) together w/ context.

Eagle Business Management System - Manufacturing

Bars and Pies Make Better Desserts than Figures

Introduction of STATA

Session 5. Organizing database

Sample size Re estimation and Bias

Chapter 2 Part 1B. Measures of Location. September 4, 2008

An ordered array is an arrangement of data in either ascending or descending order.

AcaStat How To Guide. AcaStat. Software. Copyright 2016, AcaStat Software. All rights Reserved.

Validation of Clinical Output: More Quality with Less Effort

(Evolve) and (Paper) Functional Skills Mathematics Level 1 Chief Examiner s report January 2018

Electronic Timesheet System Student User Guide Date Written 30 May 2017 Date Updated

STAT 2300: Unit 1 Learning Objectives Spring 2019

Stata Program Notes Biostatistics: A Guide to Design, Analysis, and Discovery Second Edition Chapter 12: Analysis of Variance

Unit 9 : Spreadsheet Development. Assignment 3. By Sarah Ameer

Project Management CTC-ITC 310 Spring 2018 Howard Rosenthal

Account structure setup and Financial dimensions

Radio buttons. Tick Boxes. Drop down list. Spreadsheets Revision Booklet. Entering Data. Each cell can contain one of the following things

Soci Statistics for Sociologists

Copyright K. Gwet in Statistics with Excel Problems & Detailed Solutions. Kilem L. Gwet, Ph.D.

Topic 1: Descriptive Statistics

Equipment and preparation required for one group (2-4 students) to complete the workshop

September 2016 Version (Evolve) and (Paper) Functional Skills Mathematics Level 2

There are also further opportunities to develop the Core Skill of Information and Communication Technology at SCQF level 5 in this Unit.

CP 94bis Statement of amounts due with weight brackets

Statistical Techniques Useful for the Foundry Industry

Computing Descriptive Statistics Argosy University

Stata v 12 Illustration. One Way Analysis of Variance

New Perspectives on Microsoft Excel Module 4: Analyzing and Charting Financial Data

Chapter 1 Data and Descriptive Statistics

IE 301 Industrial Engineering laboratory LAB No.5: The seven QC tools and Acceptance sampling Instructor: Assisant.Prof. Parichat Chuenwatanakul Lab

Nature Biotechnology: doi: /nbt.4199

Tables, graphs and maps

1 BASIC CHARTING. 1.1 Introduction

LIR 832: MINITAB WORKSHOP

Transshipment. Chapter 493. Introduction. Data Structure. Example Model

30 important questions of Data Interpretation - I (Quantitative Aptitude)

ANALYSING QUANTITATIVE DATA

Scheduling Resources and Costs

PEER REVIEW COMMENT TEMPLATE

Chapter 2 Ch2.1 Organizing Qualitative Data

Suppliers to the Australian Food Manufacturing Industry

Determining Effective Data Display with Charts

At the end of this module, you will be able to:

Soil - Plasticity 2017 (72) PROFICIENCY TESTING PROGRAM REPORT

ProHelp Millennium. Production Monitoring System

Lecture 10. Outline. 1-1 Introduction. 1-1 Introduction. 1-1 Introduction. Introduction to Statistics

Customer Satisfaction Survey Report Guide

Research question: What do you want to find out? I would like to find out how (independent variable/x) affects the (dependent variable/y)

Energy savings reporting and uncertainty in Measurement & Verification

AP Statistics Scope & Sequence

Excel #2: No magic numbers

Credit Card Marketing Classification Trees

SCENARIO: We are interested in studying the relationship between the amount of corruption in a country and the quality of their economy.

ENGG1811: Data Analysis using Excel 1

REFERENCE GUIDE. January, 2018

Course Report Advanced Higher

DDBA8437: Central Tendency and Variability Video Podcast Transcript

User Guidance Manual. Last updated in December 2011 for use with ADePT Design Manager version Adept Management Ltd. All rights reserved

Transcription:

Guidelines for reporting statistical information 1 GUIDELINES FOR REPORTING STATISTICAL RESULTS Introduction When reporting statistical results it is crucial that they are presented in a neat, concise manner. This can greatly assist the reader in understanding the results, and in turn how they relate to the research question(s). Messy presentation, such as poorly structured tables, inadequately labelled axes on graphs and superfluous precision of estimates makes for more difficult reading and is unacceptable for peer-reviewed journals. The objective of this document is to provide guidelines for the reporting of statistical results and to give examples of what the BCA considers acceptable presentation of tables and figures for assignments. Precision of reported results Based on quote from Gardner and Altman (Statistics with Confidence, 1989): "Spurious precision adds no value to a paper and even detracts from its readability and credibility. Results obtained from a calculator or computer usually need to be rounded. When presenting means, standard deviations, and other statistics the author should bear in mind the precision of the original data. Means should not normally be given to more than one decimal place more than the raw data, but standard deviations or standard errors may need to be quoted to one extra decimal place [i.e. one more than the mean, e.g. when the SD or SE is small relative to the mean]. It is rarely necessary to quote percentages to more than one decimal place, and even one decimal place is often not needed. With samples of less than 100 the use of decimal places implies unwarranted precision and should be avoided. Note that these remarks apply only to presentation of results - rounding should not be used before or during analysis. It is sufficient to quote values of t, to two decimal places. 2 χ, and r P-values should be quoted to 1, or at most 2, significant figures. Do not use the abbreviation NS for P-values greater than 0.05, or asterisks (*) for different categories of P-value less than 0.05, but report the actual P-value. Notes: The number of decimal places is the number of digits after the decimal point. The number of significant figures is a more general indication of the precision of a number. It is determined as follows: 1. Count the number of digits, beginning with the first non-zero digit, reading from left to right. 2. Ignore all zeros at the end of a whole number, unless you know them to be exact. 3. Zeros at the end of a decimal number are always significant. For example: 3500, 0.0062, 0.050 each have 2 significant figures 50600, 8.02, 0.0920 each have 3 significant figures Based on guidelines presented in the notes for Introductory Biostatistics, a unit of the Master of Public Health degree offered by the School of Public Health, University of Sydney, with additions by Kris Jamsen, School of Population Health, The University of Melbourne, for the Biostatistics Collaboration of Australia (BCA)

Guidelines for reporting statistical information 2 Guidelines for Preparing Tables and Graphs 1. Tables and graphs should be self-explanatory, so they should have a title; all axes, rows and columns should be labelled unambiguously; and units should be given. 2. Missing values such as 'don't know' or 'not answered' should either be included as a separate class, or a footnote should be given noting that they are omitted. More detailed guidelines can be found in How to Report Statistics in Medicine (2 nd edition), Thomas A Lang and Michelle Secic. Philadelphia: American College of Physicians, 2006. 3. If percentages are given, it should be clear what the denominator is. 4. Where the vertical scale has a natural origin, it should either be included, or it should be emphasised that it is not. 5. Use spacing, not vertical lines, to separate columns in tables. In general, minimise the use of gridlines. 6. Three-dimensional graphics (e.g. 3D bar charts) should be avoided unless you are showing the complex relationship among three variables (e.g. an interaction effect). Guidelines for Preparing Frequency Distributions & Histograms 1. Class boundaries should be round numbers. 2. It should be unambiguous in which class a boundary value goes; usually the boundary is in the class above. 3. The number of classes should be small enough to provide regularity in the shape of the distribution, but large enough not to lose too much detail. Usually 5 to 15 classes are sufficient. 4. Where there are conventions covering 1), 2) and 3) above, these conventions should be followed; e.g. age-groups start with 0 or 5. 5. Classes can be combined later, but not divided, so use too many, rather than too few initially.

Guidelines for reporting statistical information 3 Graph examples.06 Inadequately formatted Inadequately formatted Density.04.02 wt 60 0 5 10 15 20 25 30 age 20 0 20 40 age Frequency 20 15 10 5 0 Improved format 0 10 20 30 40 Age (years) Weight (kg) 80 65 50 35 20 Improved format 0 10 20 30 40 Age (years) Figure 1: Examples of inadequately formatted graphs and graphs with improved formatting Problems with the top-left histogram: The bins (i.e. the age groups produced by the default setting) are too wide, providing a only a very coarse description of the age distribution There are no units for age (e.g. years, months) Density is on the y-axis, which is not an intuitive measure of the size (or relative size) of each age group (bin) Improvements to the histogram (bottom-left): There are an appropriate number of bins, which provides a more detailed description of the distribution and seems to provide a good trade-off between smoothness and precision Age is properly labeled with units in brackets The frequency is displayed on the y-axis, providing a straightforward measure of the size of each age group (i.e. the number of people in each bin ) Problems with the top-right scatterplot: The axes are labeled with the variable names, which in the case of wt requires the reader to guess what this stands for For both axes, the incremental increase in units is very large, making it difficult to identify weights for specific ages The plot symbols appear fuzzy Improvements to the scatterplot (bottom-right): The axes have meaningful labels with units in brackets For both axes, the incremental increase in units is smaller, making it easier to identify weights for specific ages The plot symbols are clearer In addition, you should always include a figure label above or below the graph (below in this case), which numbers the figure (for reference in the text) and clearly describes what is being displayed.

Guidelines for reporting statistical information 4 Table examples Inadequately formatted: The MEANS Procedure Variable N Mean Std Dev Minimum Maximum -------------------------------------------------------------------------------- SBPA 1080 131.1796481 10.9672201 98.9100000 171.1200000 DBPA 1080 81.7043889 8.2490641 48.2300000 106.2800000 HRA 1080 72.7800000 8.1729036 48.8600000 100.9400000 SBPA3m 795 130.3802390 11.0160729 100.3400000 168.6500000 DBPA3m 795 81.0252579 8.2270743 48.1400000 106.6800000 HRA3m 795 72.1476981 8.3003315 47.0700000 96.7500000 SBPA5y 233 133.8000858 10.2820711 108.4500000 167.5700000 DBPA5y 233 82.1574678 7.2404082 63.0500000 102.2600000 HRA5y 233 72.3622318 8.6014242 47.7400000 102.8400000 SBPAe 87 139.4104598 12.2275038 109.8100000 168.6400000 DBPAe 87 88.9980460 8.8186379 58.5000000 104.6600000 HRAe 87 73.2763218 9.0440817 51.6000000 97.2600000 -------------------------------------------------------------------------------- This table was copied and pasted directly from the SAS results window. Problems include: Superfluous precision (i.e. too many decimal places) The columns are not aligned The entries in the Variable column are not labeled The SAS procedure name ( the MEANS procedure ) is unnecessarily displayed The lines at the top and bottom of the table are not long enough and are dashed (should be solid) There is no description of what the table is displaying

Guidelines for reporting statistical information 5 Improved format: Table 1: Summary statistics for ambulatory systolic blood pressure (SBP), diastolic blood pressure (DBP) and heart rate (HR). Time Variable N Minimum Maximum Mean Median SD Baseline DBP 1080 48.2 106.3 81.7 82.2 8.3 SBP 1080 98.9 171.1 131.2 131.1 11.0 HR 1080 48.9 100.9 72.8 72.7 8.2 3 month DBP 795 48.1 106.7 81.0 81.2 8.2 SBP 795 100.3 168.7 130.4 130.2 11.0 HR 795 47.1 96.8 72.2 72.1 8.3 5 year DBP 233 63.1 182.3 82.2 82.2 7.2 SBP 233 108.5 167.6 133.8 133.3 10.3 HR 233 47.7 102.8 72.4 72.3 8.6 Endpoint DBP 87 58.5 104.7 89.0 90.9 8.8 SBP 87 109.8 168.6 139.4 138.6 12.2 HR 87 51.6 97.3 73.3 74.7 9.0 Improvements: Concise and consistent precision Minimal use of lines The columns are aligned and clearly labeled The numbers in each column are right-aligned to force the decimal points to be aligned The rows are more clearly labeled The table is labeled with a number (for reference in the text) and also with a clear description of what is being displayed

Guidelines for reporting statistical information 6 Cross-tabulation examples Inadequately formatted: +----------------+ Key ---------------- frequency row percentage +----------------+ hypb hypa 0 1 Total -----------+----------------------+---------- 0 212 3 215 98.60 1.40 100.00 -----------+----------------------+---------- 1 92 13 105 87.62 12.38 100.00 -----------+----------------------+---------- Total 304 16 320 95.00 5.00 100.00 This table was copied and pasted directly from Stata. Problems include: Lack of alignment, making it very difficult to read Ugly use of pluses and minuses for row, column and cell distinction The rows and columns are labeled with the variable names, which are uninformative Values of each variable are displayed as 0 and 1 (what do they mean?) There is no description of what is being displayed Improved format: Table 1: Cross tabulation of hypertension status using clinical and ambulatory measures. Row percentages are displayed below the cell counts in brackets. Hypertension using the clinical measure Hypertension using the ambulatory measure Yes No Total Yes 212 3 215 (98.6%) No 92 (87.6%) (1.4%) 13 (12.4%) (100.0%) 105 (100.0%) Total 304 (95.0%) 16 (5.0%) 320 (100.0%) Improvements: Rows, columns, numbers and percentages are aligned Minimal use of lines Rows and columns are clearly labeled with meaningful names (Hypertension using the X measure) and values (Yes, No) The table is labeled with a number and description