Independent execution of computations underlying research articles
April 24, 2024
Affiliate editor of bioRxiv; editorial board of Gigabyte.
Mozilla mini science grant, UK Software Sustainability Institute, NWO. Editors @ Gigascience, eLife, Scientific Data.
Thank you BNA for awarding us the Team Credibility Prize 2024.
Slides (CC BY 4.0) are available in HTML format from https://tinyurl.com/codecheck2024-02. (Grant McDermott).
1. We take your paper, code and datasets.
2. We run your code on your data.
3. If our results match your results, go to step 5.
4. Else we talk to you to find out where the code broke. If you fix your code or data, we return to step 2 and try again.
5. We write a report summarising that we could reproduce your finding.
6. We work with you to freely share your paper, code, data and our reproduction.
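The steps above can be sketched as a small retry loop. This is only an illustration, not CODECHECK's tooling: the function `codecheck`, its parameters (`run_code`, `published`, `ask_author_to_fix`) and the `max_rounds` cap are all hypothetical names introduced here, and a real check relies on human judgement rather than an equality test.

```python
from typing import Callable, Optional


def codecheck(
    run_code: Callable[[], dict],
    published: dict,
    ask_author_to_fix: Callable[[], bool],
    max_rounds: int = 3,  # assumed cap; the talk does not state a limit
) -> Optional[str]:
    """Return a one-line report if the results reproduce, else None."""
    for attempt in range(1, max_rounds + 1):
        ours = run_code()  # step 2: run the author's code on their data
        if ours == published:  # step 3: do our results match theirs?
            return f"Reproduced on attempt {attempt}"  # step 5: report
        # Step 4: talk to the author; stop if nothing gets fixed.
        if not ask_author_to_fix():
            return None
    return None
```

Here `run_code` stands in for whatever the author's instructions say to execute, and `ask_author_to_fix` returns `True` when the author supplies corrected code or data, which sends the loop back to step 2.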
We should be sharing material on the left, not the right.
“Paper as advert for Scholarship” (Buckheit & Donoho, 1995)
We check that the code generates the expected number of output files.
The contents of those output files are not checked, but are available for others to see.
The validity of the code is not checked.
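A minimal sketch of this kind of check, assuming the expected outputs are given as a plain list of file names. The function `check_outputs` and the file names in the usage note are hypothetical, not part of any CODECHECK tool. It only confirms that each expected file exists and never opens the files.

```python
from pathlib import Path


def check_outputs(output_dir, expected_files):
    """Return (ok, missing); ok is True when every expected file exists.

    File contents are deliberately not inspected.
    """
    root = Path(output_dir)
    missing = [name for name in expected_files if not (root / name).is_file()]
    return (not missing, missing)
```

For example, `check_outputs("outputs", ["fig1.png", "table1.csv"])` returns `(True, [])` when both files are present in the `outputs` directory, and otherwise `False` together with the names of the missing files.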
This is not neuro-specific; we work across disciplines.
Depending on your project, there may be data to analyse, or simulations to run.
We did several reproductions of Covid papers, including the Imperial “Report 9” model.
AUTHOR provides code/data and instructions on how to run.
CODECHECKER runs code and writes certificate.
PUBLISHER oversees process, helps deposit artifacts, and persistently publishes certificate.
AUTHOR gets early check that “code works”; gets snapshot of code archived and increased trust in stability of results.
CODECHECKER gets insight into latest research and methods, credit from community, and citable object.
PUBLISHER gets citable certificate with code/data bundle to share and increases reputation of published articles.
PEER REVIEWERS can see certificate rather than check code themselves.
READER can check certificate and build upon work immediately.
https://codecheck.org.uk/register/
See for example certificate 2020-010 (Imperial’s “Report 9”).
CODECHECKER time is valuable, so needs credit.
Very easy to cheat the system, but who cares?
Author’s code/data must be freely available.
Deliberately low threshold for gaining a certificate.
High-performance compute is a resource drain.
Cannot (yet) support all conceivable/existing workflows and languages.
New project https://codecheck.org.uk/nl aiming to produce 50 codechecks for papers from the Netherlands.
Embedding into journal workflows.
Training a community of codecheckers.
Funding for a codecheck editor.
Come and get involved
Further information: http://codecheck.org.uk and our research article.