A web service for independent reproduction of computations underlying biomedical research

To stay in touch with the project, follow the project team members on Twitter:


The Problem

Analysis of data and computational modelling is central to many areas of biomedical research, particularly with the explosion of data now available. The underlying computer programs are complex and costly to design. However, these computational techniques are rarely checked during review of the corresponding papers, nor are they shared upon publication. Instead, the primary method for sharing data and computer programs today is for authors to state “data available upon reasonable request”. Despite best intentions, these programs and data can quickly disappear from laboratories. A systematic study examining 300 papers published in 2016/17 in the Journal of Computational Physics, a journal that promotes sharing of digital artefacts, found that only 5.6% made artefacts available. Given that code and data are rich digital artefacts that can, in most cases, be shared relatively easily, and that funders and journals increasingly request or mandate sharing of resources, we should be sharing more.

The Solution

We wish to build a computational platform, called CODE CHECK, to enhance the availability, discovery and reproducibility of published computational research. Researchers who provide code and data will have their code independently run to ensure their work can be reproduced. The results from our independent run will then be shared freely post-publication. Our independent runs will act as a “certificate of reproducible computation”, documenting that the research outputs could be replicated outside the researchers’ lab. These time-stamped certificates will include key outputs, such as figures and tables, and valuable information such as the environment used to evaluate the code. Such certificates will help in the peer review process by showing reviewers that the code is available and works. We will work with several journals to design the system so that it can be used as a pilot service by those journals. The level of reproducibility in CODE CHECK will vary considerably. At a minimum, we propose that one key figure or table from a paper be regenerated from the code, rather than requiring the entire research article to be reproduced.
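For illustration, a time-stamped certificate could be recorded as a small machine-readable document alongside the human-readable report. The sketch below is a hypothetical Python structure, not a finalised schema: every field name and value is our assumption, chosen only to show the kind of information described above (paper, checker, timestamp, environment, and which outputs were regenerated).

```python
import json
from datetime import datetime, timezone

# Hypothetical sketch of the metadata a CODE CHECK certificate might record.
# All field names and values are illustrative assumptions, not a finalised schema.
certificate = {
    "certificate_id": "2019-001",               # identifier assigned by the checker
    "paper": {
        "title": "Example paper title",
        "doi": "10.1234/example-doi",           # placeholder DOI
    },
    "checker": "A. N. Other",                   # person who performed the independent run
    "check_time": datetime.now(timezone.utc).isoformat(),
    "code_repository": "https://example.org/analysis-code",  # placeholder URL
    "environment": {
        "platform": "mybinder.org",
        "runtime": "python-3.7",                # captured from the Binder image
    },
    "outputs_checked": [
        {"output": "Figure 2", "reproduced": True},
        {"output": "Table 1", "reproduced": True},
    ],
    "notes": "Figure 2 regenerated from the provided code and data.",
}

# The certificate can be serialised to JSON for archiving alongside the snapshot.
print(json.dumps(certificate, indent=2))
```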

The long-term vision is that our service will be used by journals in biomedicine (and beyond), with financial support from infrastructure grants and journals. We hope our system will encourage researchers to share data before publication, as it can confirm that an analysis works outside their lab. In the likely case that programs do not work on first submission, our early feedback will help reduce the chance of incorrect code and data being shared during peer review and post-publication. Our aim is to provide a quality check that the provided code runs, rather than a full assessment of the correctness of the code. We will not guarantee that the code can be run forever, as this is simply infeasible; instead, time-stamped certificates will stand as independent verification of the results around the time a paper was published.

Goals

We plan to integrate CODE CHECK with the popular open notebook infrastructure behind https://mybinder.org, an instance of a BinderHub. Based on a hand-coded demonstrator for checking computations (http://sje30.github.io/codecheck) and an existing API for executing research compendia (https://o2r.info/api/), we will prototype a system that awards CODE CHECK badges to specific executions of open notebooks on a Binder instance. Researchers can create a snapshot of a workflow executed as a Binder after comparing the results with the ones recorded in their scientific article. The snapshot comprises the data, the code (including configuration and parameters), and the runtime environment used. It will be saved to a public research data repository (i.e. Zenodo), allowing long-term access and scrutiny of open research by reviewers and readers; a sketch of this archiving step is given below. While the infrastructure behind BinderHub is generic, supporting many programming languages and domains, our examples will come from computational neuroscience. This is simply to ensure that we are able to understand the scientific approaches in the papers and can support common workflows from the domain. However, the CODE CHECK system should be widely applicable.
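As a rough illustration of the archiving step, the sketch below deposits a snapshot archive to Zenodo via its public REST API. This is a minimal sketch under our assumptions about the workflow: the snapshot file name, the metadata values and the surrounding CODE CHECK tooling are hypothetical, and a valid Zenodo personal access token is required. Only the Zenodo endpoints shown (create deposition, upload to the file bucket, set metadata, publish) are part of the existing API.

```python
import requests

ZENODO_API = "https://zenodo.org/api/deposit/depositions"
ACCESS_TOKEN = "REPLACE_WITH_ZENODO_TOKEN"   # checker's personal access token
params = {"access_token": ACCESS_TOKEN}

# 1. Create an empty deposition to hold the snapshot.
r = requests.post(ZENODO_API, params=params, json={})
r.raise_for_status()
deposition = r.json()
bucket_url = deposition["links"]["bucket"]

# 2. Upload the snapshot archive (hypothetical file containing the data, code
#    and runtime environment exported from the Binder session).
snapshot = "codecheck-snapshot.zip"
with open(snapshot, "rb") as fh:
    r = requests.put(f"{bucket_url}/{snapshot}", data=fh, params=params)
    r.raise_for_status()

# 3. Attach minimal descriptive metadata (values are illustrative placeholders).
metadata = {
    "metadata": {
        "title": "CODE CHECK snapshot for example paper",
        "upload_type": "software",
        "description": "Snapshot of data, code and runtime environment "
                       "used for an independent CODE CHECK run.",
        "creators": [{"name": "Other, A. N."}],
    }
}
r = requests.put(f"{ZENODO_API}/{deposition['id']}", params=params, json=metadata)
r.raise_for_status()

# 4. Publish, which mints a citable, time-stamped record (with a DOI) on Zenodo.
r = requests.post(f"{ZENODO_API}/{deposition['id']}/actions/publish", params=params)
r.raise_for_status()
print("Published snapshot with DOI:", r.json().get("doi"))
```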

We aim to achieve the following outputs by the end of the project:

Examples

Here are a few examples of “certificates of reproducible computation” to demonstrate the flavour of the proposed system.