<style>
/* .remark-slide-content { */
/*   padding-top: 20px; */
/*   padding-left: 20px; */
/*   padding-right: 20px; */
/*   padding-bottom: 20px; */
/* } */
.remark-slide-content > h1:first-of-type {
  margin-top: 0px;
  margin-bottom: 0px;
}
.remark-slide-content > h2:first-of-type {
  margin-top: 0px;
  margin-bottom: 0px;
}
a:link { color: #1f618d; }
a:visited { color: #1f618d; }
.small { font-size: 70% }
</style>

### CODECHECK: Evaluating the reproducibility of computational results reported in scientific journals

<br>

```
Stephen J Eglen
Cambridge Computational Biology Institute   https://sje30.github.io
University of Cambridge                     sje30@cam.ac.uk
                                            @StephenEglen

Daniel Nüst
Institute for Geoinformatics                https://nordholmen.net
University of Münster                       daniel.nuest@uni-muenster.de
                                            @nordholmen
```

HTML Slides: <http://tiny.one/codecheck22> (CC-BY 4.0 license)

---

## Declarations and acknowledgements

#### Declarations

Affiliate editor of *bioRxiv*; editorial board of *GigaByte*.

These slides accompany our paper: <https://f1000research.com/articles/10-253/>

#### Acknowledgements

Mozilla mini science grant, UK Software Sustainability Institute.

Editors @ *GigaScience*, *eLife*, *Scientific Data*.

---
class: highlight-last-item

## CODECHECK in one slide

1. We take your paper, code and datasets.

--

2. We run your code on your data.

--

3. If our results match your results, go to step 5.

--

4. Otherwise, we talk to you to find out where the code broke. If you fix your code or data, we return to step 2 and try again.

--

5. We write a report summarising that we could reproduce your finding.

--

6. We work with you to freely share your paper, code, data and our reproduction.

---

## Premise

<br>

<center><img src="", width=800></center>

We should be sharing material on the left, not the right.

"Paper as advert for Scholarship" [(Buckheit & Donoho, 1995)](https://link.springer.com/chapter/10.1007/978-1-4612-2544-7_5) --- ## Approaches to code sharing <br> - [Barnes (2010)](https://dx.doi.org/10.1038/467753a) <center><img src="", width=700></center> - Informal 'code buddy' system - Community-led *research compedia*. - Code Ocean [(Nature trial)](https://link.springer.com/chapter/10.1007/978-1-4612-2544-7_5) - Certify reproducibility with confidential data (CASCAD) [(Pérignon et al 2019)](https://science.sciencemag.org/content/365/6449/127) <!-- - CODECHECK takes a different approach . . . --> --- ## The CODECHECK philosophy - Systems like Code Ocean set the bar high by "making code reproducible *forever* for *everyone*". - CODECHECK simply asks "was the code reproducible *once* for *someone* else?" - We check the code runs and generates the expected number of output files. - The contents of those output files are not checked, but are available for others to see. - The validity of the code is *not* checked. --- ## CODECHECK process <br> <center><img src="", width=1000></center> --- ## Variations in a codecheck <br> <center><img src="