Data quality, variability and reproducibility


Survey summary and discussion
A snapshot of what scientists at the bench are thinking and doing.

Because good data quality is at the heart of what we all care about.

Table of Contents
Summary
Survey results
Workflows conducted by respondents
Where and how variability gets introduced
Conclusion
Resources
References

Summary

With a number of pharmaceutical and biotech labs reporting that they are unable to validate more than a third of published claims [1, 2], study reproducibility has become a rapidly growing concern throughout the life sciences [1-3]. A large part of the conversation focuses on high-level issues that likely drive study irreproducibility, such as the publish-or-perish model for obtaining funding and advancing a scientific career, and the lack of rigorous statistical approaches to sample size and data analysis. On a more everyday level, there are many steps scientists can and do take while conducting experiments to increase the reproducibility of their studies. But where does most of the variability enter the process? Is it reagents? Biology? Person-to-person technique?

Because good data quality is at the heart of what we care about at Artel, we wanted to start a conversation around reproducibility at the experimental level. As a first step, we conducted a survey to see what scientists think about where variability crops up in their workflows, and what they do to minimize irreproducibility risks. While the sample size is small, the survey respondents cover a range of life science efforts, research areas, and expertise (Figures 1-6), making the survey a great starting point for understanding the factors that scientists perceive as having the most impact on variability*.

For example, while most respondents believe that a subset of steps in their workflows contributes to variability, almost 25% of respondents feel that a single step is where most of their data quality issues arise (Figure 7). In addition, when asked to independently rate the impact of different factors on variability (biological/chemical complexity, person-to-person technique differences, reagents, instruments, labware, and environmental factors such as lab temperature and humidity), respondents tended to rate most factors as moderately impactful (Figure 8). But when forced to rank these factors in order of impact, three stand out as generating the most concern: biological or chemical complexity, person-to-person technique differences, and variability in reagents (Figure 9).

*Because many respondents did not answer every question, the number of respondents listed for each question varies.

Consistent with these concerns, among the 44 respondents who described the steps they take to reduce variability, the top four activities were: (1) validating reagents to reduce reagent variability (27%), (2) keeping techniques and methods as consistent as possible (27%), (3) having a single person perform important assays to reduce person-to-person variability (17%), and (4) ensuring personnel are properly trained (14%; Figure 10).

It is important to note that our survey explores perceptions of variability rather than validated measurements, so it's unclear how many of these reported concerns and risk-amelioration practices are based on intuition versus observed measurements. For example, despite the relatively low importance respondents placed on environmental effects on variability, Artel has measured the effects of lab humidity on the volume of liquid in 96-well plates and found significant effects in certain situations. Other studies by Artel support respondents' concerns about training and experimental technique. In our pipette and pipetting technique training classes, we have seen a huge range of pipetting ability and have found that pipetting technique can have large effects on both the accuracy and the precision of liquid handling (see Lab Report 7 [4], Figures 1 and 2), which in turn affect the reproducibility of the data.

Nevertheless, because the sample size of this survey is small, it's unclear how representative these findings are. Despite this limitation, we believe that the findings can play an important role in research reproducibility by focusing attention on lab practices meant to reduce variability, and by starting a conversation on how much the scientific community really knows (because the measurements have been done) about where and how much variability enters our workflows.

In the coming months we will continue to explore this topic through in-depth interviews, additional surveys, and our own studies on experimental variability, all of which will be posted on our blog, The Artel Digest. We hope that you will join us by reading about our findings or continuing to contribute your own insights in future surveys or interviews.
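The distinction between accuracy and precision in liquid handling, raised above in connection with pipetting technique, can be made concrete with a short calculation. The following is a minimal sketch, not Artel's method: it assumes the common convention of reporting inaccuracy as the percent deviation of the mean dispensed volume from the target, and imprecision as the coefficient of variation (CV) of the replicates. The function name and the replicate volumes are hypothetical illustrations, not measurements from this survey or from Lab Report 7.

```python
from statistics import mean, stdev


def pipetting_performance(volumes_ul, target_ul):
    """Return (inaccuracy %, imprecision as %CV) for replicate dispenses.

    Assumed definitions (a common convention, not taken from the report):
      inaccuracy  = 100 * (mean - target) / target
      imprecision = 100 * sample standard deviation / mean
    """
    avg = mean(volumes_ul)
    inaccuracy_pct = 100.0 * (avg - target_ul) / target_ul
    cv_pct = 100.0 * stdev(volumes_ul) / avg
    return inaccuracy_pct, cv_pct


# Hypothetical replicate volumes (in microliters) for a 100 uL target.
operator_a = [99.0, 100.0, 101.0, 100.0, 100.0]  # on target, tight spread
operator_b = [93.0, 97.0, 90.0, 99.0, 96.0]      # low on average, wide spread

for name, volumes in (("A", operator_a), ("B", operator_b)):
    inaccuracy, cv = pipetting_performance(volumes, target_ul=100.0)
    print(f"Operator {name}: inaccuracy {inaccuracy:+.2f}%, CV {cv:.2f}%")
```

In this made-up comparison, the second operator both under-delivers on average and varies more from dispense to dispense, which is the kind of person-to-person difference that respondents reported worrying about.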

Survey results

Background of the respondents

While roughly 100 respondents started the survey, the majority of questions were answered by 65 respondents. These scientists were primarily involved in basic research (Figure 1), with a large subset involved in assay development or transfer, high-throughput assays, and compound or materials management (Figure 2). Almost half have been at their current job for more than 10 years, and over 90% for over 2 years (Figure 3).

Figure 1. Distribution of respondents' scientific focus (n=100). Categories shown: basic research, translational research, diagnostics, drug discovery, drug development, instrument development, and other. Note that because some respondents' work focus is in multiple areas, the total across all areas is greater than 100%.

Figure 2. Participation in select focus areas (n=97). Categories shown: assay development or assay transfer; high-throughput screening or high-throughput assays; compound/materials management; using automated liquid handlers; maintaining automated liquid handlers; none of the above. Note that because some respondents participate in multiple workflows, the total across all areas is greater than 100%.

Figure 3. Length of time at current job (n=100): <1 year, 2%; 1-2 years, 6%; 2-5 years, 28%; 5-10 years, 19%; >10 years, 45%.

Workflows conducted by respondents

Survey respondents conduct a wide range of in vivo, in vitro, sample logistics, maintenance, and compliance workflows (Figure 4), with many running qPCR assays (15% of all workflows reported), followed by cell culture (6%), western blotting (6%), cell-based assays (4%), NGS library preparation (4%), PCR (4%), and screening (4%). For the most part, respondents conduct fairly straightforward workflows with 10 steps or fewer (Figure 5), and have a lot of variety in the types of assays they conduct over the course of a month (Figure 6).

Figure 4. Range of workflows reported by survey respondents (n=70). The relative size of each phrase reflects the number of times it appeared, i.e. qPCR was the most frequently reported workflow. Note that multiple workflows were typically reported by each respondent.

Figure 5. Number of steps in respondents' most-used workflows (n=86): 1-5 steps, 42%; 6-10 steps, 47%; 11-20 steps, 7%; >20 steps, 5%. Due to rounding, the total across all areas is greater than 100%.

Figure 6. Variety of workflows respondents reported conducting over the course of one month (n=89): every day is the same, 15%; 2-5 different workflows, 46%; 6-10 different workflows, 25%; >10 (every day is different), 15%. Due to rounding, the total across all areas is greater than 100%.
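The captions for Figures 5 and 6 note that rounding pushes the totals above 100%. The short sketch below shows how this can happen when each category is rounded to a whole percent independently. The counts are hypothetical: the report does not give raw counts, and these were chosen only because they total 86 (the n reported for Figure 5) and round to the percentages shown there. The function name is likewise just for illustration.

```python
def rounded_percentages(counts):
    """Convert raw counts to whole-number percentages, rounding each one independently."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical counts (not from the report), chosen to total n=86.
steps_in_workflow = {"1-5": 36, "6-10": 40, "11-20": 6, ">20": 4}

percentages = rounded_percentages(steps_in_workflow)
total_pct = sum(percentages.values())

for label, pct in percentages.items():
    print(f"{label} steps: {pct}%")
print(f"Sum of rounded percentages: {total_pct}%")
```

Every category here happens to round upward, so the rounded values add up to 101% even though the underlying counts account for exactly 100% of respondents.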

Where and how variability gets introduced

With so many different workflows reported by respondents, it should not be a surprise that there's little agreement on how many different steps of each workflow contribute to variability (Figure 7). While almost one quarter of survey respondents (24%) feel that most variability stems from a single step in their workflows, and another 23% feel that every step contributes more or less equally, 44% feel that variability enters the process at only a small subset of key steps.

Figure 7. Number of steps that contribute to data variability (n=70): 1 step accounts for most variability, 24%; 2-5 steps account for most variability, 44%; >6 steps account for most variability, 9%; every step contributes to variability, 23%.

No one key factor seems to stand out in scientists' minds as the source of experimental variability. When asked to consider the impact of a number of different elements on variability independently, most of the respondents felt that each factor contributed moderately (a score of 3 on a scale where 1 is low impact and 5 is a great deal of impact; Figure 8). In addition, none of the factors showed a statistically significant difference in impact from the other factors. Other factors affecting variability that scientists added included the age of the reagents, differences in protocols between labs, and the training and skills of the scientist conducting the experiment.

Figure 8. The impact of different factors on variability, considered independently (n=69). Factors shown, on a scale from low to high: person-to-person technique differences; biological or chemical complexity; reagents; instruments; environmental factors such as temperature, humidity, etc.; labware (e.g. tips, plates, or tubes).

To better understand how the survey respondents viewed the importance of these different factors relative to each other, we also asked them to rank the factors in order of impact on variability, with 1 being lowest impact and 5 the highest (Figure 9). While, again, there is no one factor that most respondents view as having the most impact, there are three that worry scientists the most: biological or chemical complexity, person-to-person differences in technique, and variability in reagents.

Figure 9. Ranked impact of different factors on variability (n=65). Factors shown, on a scale from low to high: biological or chemical complexity; reagents; person-to-person technique differences; environmental factors such as temperature, humidity, etc.; instruments; labware (e.g. tips, plates, or tubes).

When asked how they minimize variability, most respondents focused on fixing person-to-person variability through training or limiting who does the assay, on creating standardized techniques and protocols, and on trying to be consistent with reagents (Figure 10). Interestingly, fewer discussed using controls to identify random or systematic errors.

Figure 10. Steps taken to reduce variability (n=44). Practices shown: techniques/methods consistency, reagents, person consistency, training, replicates, instrument calibration, automation, instrument consistency, controls, consistency, simplify procedures, randomized quality checks, experimental controls, and documentation.

Conclusion

With growing attention on research reproducibility, we feel it is time to start taking a detailed look at how bench scientists and lab managers approach this critical topic. In this initial survey, which reports on researchers conducting a range of scientific studies, we find that there's also a range of opinions on where variability arises and what one can do about it. While most respondents are concerned with human-based, reagent-based, and biology-based variability and take steps to minimize those (Figures 8-10), fewer are worried about the effects of the environment, labware, or instrument-based variability. Yet this last group of factors also contributes to data variability and can be fairly easily minimized with only a few extra steps.

While the exact plan will vary depending on your studies and goals, one good place to begin is to talk to the team at Artel. We can't help you with all your data variability needs, but we have extensive experience, and have done a number of studies to validate our practices and recommendations around liquid handling. Be sure to keep an eye on the Artel Digest in the months to come, which will continue to explore the topic of data quality, reproducibility, and variability.

Resources

Nature section on reproducibility
Science section on reproducibility
NIH section on Rigor and Reproducibility

References

1. Dolgin, E. Drug discoverers chart path to tackling data irreproducibility. Nature Reviews Drug Discovery 13, (2014).
2. Prinz, F., Schlange, T. & Asadullah, K. Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery 10, 712 (2011).
3. Begley, C. G. & Ioannidis, J. P. A. Reproducibility in science: improving the standard for basic and preclinical research. Circulation Research 116, (2015).
4. Artel Lab Report 7: Facilitating Assay Transfer by Controlling Liquid Handling Variables. Available on the web.

March 2016. Artel, 25 Bradley Drive, Westbrook, Maine. info@artel-usa.com. © 2016 Artel, Inc.