Encouraging code sharing in academia


Stephen J Eglen                  Cambridge Computational Biology Institute
https://sje30.github.io          University of Cambridge
sje30@cam.ac.uk                  @StephenEglen

Slides: http://bit.ly/eglen2017-1


Acknowledgements

Co-authors, Freeman lab, Laurent Gatto.

These slides are available under a Creative Commons CC-BY license.

Inverse problems are hard

Score (%)   Grade
70-100      A
60-69       B
50-59       C
40-49       D
0-39        F

Forward problem

I scored 68, what was my grade?

Inverse problem

I got a B, what was my score?
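
The asymmetry can be made concrete with a short Python sketch built on the grade boundaries in the table above. The function names are illustrative and not part of the talk, and integer scores are assumed. The forward map returns exactly one grade per score, while the inverse can only narrow the answer to a range of candidate scores.

```python
# Grade bands from the table above: (lowest score, highest score, grade).
# Assumption: scores are whole numbers between 0 and 100.
BANDS = [
    (70, 100, "A"),
    (60, 69, "B"),
    (50, 59, "C"),
    (40, 49, "D"),
    (0, 39, "F"),
]


def grade_for(score):
    """Forward problem: each score maps to exactly one grade."""
    for low, high, grade in BANDS:
        if low <= score <= high:
            return grade
    raise ValueError(f"score out of range: {score}")


def scores_for(grade):
    """Inverse problem: a grade only narrows the score down to a range."""
    for low, high, band_grade in BANDS:
        if band_grade == grade:
            return range(low, high + 1)
    raise ValueError(f"unknown grade: {grade}")


print(grade_for(68))          # B
print(len(scores_for("B")))   # 10 candidate scores, from 60 to 69
```

Recovering the analysis behind a published figure without the code is the same kind of problem: many different analyses are consistent with the paper's description.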

Research sharing: the inverse problem


Where is the scholarship?

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions that generated the figures.

[Buckheit and Donoho 1995, after Claerbout]

Moral or selfish approach?


Paper

Selfish reasons to share

Why not align what is good for science with what is good for scientists?

  1. Funding mandates (REF + enforcement from Wellcome Trust)
  2. Credit through data papers
  3. Leads to further collaborations (e.g. “EPAmeadev”)
  4. Fixes data bugs / errors in analysis
  5. Prevents data loss (Vines et al. 2014), e.g. students have a habit of leaving…
  6. Your future self is probably one of the main beneficiaries of sharing.
  7. Now is a very good time to be an open scientist.

Code sharing: a way forward



Paper

Specific recommendations

  1. Include enough code to reproduce a key figure or result from your paper (“modeldb”).
  2. Provide toy examples if your project is too computationally intensive for others to run in a few hours.
  3. Version control (GitHub)
  4. Licence (MIT)
  5. Provide data
  6. Provide tests
  7. Use standards
  8. Use permanent URLs (Zenodo/figshare)
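
As a minimal sketch of recommendations 2 and 6, assuming a Python project, the following pairs a small deterministic toy example with a test that checks its result. The function `mean_firing_rate` and the toy data are hypothetical; they stand in for whatever analysis a paper actually reports.

```python
def mean_firing_rate(spike_times, duration):
    """Return the mean firing rate in Hz for spikes recorded over `duration` seconds."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return len(spike_times) / duration


def test_mean_firing_rate():
    # Toy data (hypothetical): 4 spikes in 2 seconds should give 2 Hz.
    toy_spikes = [0.1, 0.5, 1.2, 1.9]
    assert mean_firing_rate(toy_spikes, duration=2.0) == 2.0


if __name__ == "__main__":
    test_mean_firing_rate()
    print("all tests passed")
```

Running the file directly executes the test; a test runner such as pytest can also collect `test_mean_firing_rate` if the file name follows its discovery convention (for example `test_analysis.py`).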

Simple example

Paper Info

New tools

Docker

Can bundle an entire open-source environment for others to share:

(start docker)
docker run -d -p 8787:8787 sje30/eglen2015
open http://192.168.99.100:8787/

This should launch a web page …

A dirty secret

Jupyter notebooks

https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

Binder = Docker + Jupyter + cloud compute

https://github.com/sofroniewn/2pRAM-paper

Binder: other examples

Small example GitHub repo

LIGO experiments

Old tools

Find a code buddy

Third most important file in a GitHub repo

(After Arfon Smith)

Makefile

Learn Make if you don’t know it already.
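
For readers new to Make, its core rule is that a target is rebuilt when it is missing or older than any file it depends on. The Python sketch below illustrates only that rule; it is not how Make is implemented, and the file names `data.csv` and `figure.pdf` are hypothetical.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if `target` is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = os.path.join(tmp, "data.csv")      # hypothetical input
        figure = os.path.join(tmp, "figure.pdf")  # hypothetical output

        open(data, "w").close()
        print(needs_rebuild(figure, [data]))  # True: the figure does not exist yet

        open(figure, "w").close()
        os.utime(data, (100, 100))    # set the data's modification time to t=100
        os.utime(figure, (200, 200))  # the figure was made later, at t=200
        print(needs_rebuild(figure, [data]))  # False: the figure is up to date

        os.utime(data, (300, 300))    # the data changed after the figure was made
        print(needs_rebuild(figure, [data]))  # True: the figure is now stale
```

In a Makefile the same relationship is declared once, as a target with its prerequisites and the command that regenerates it, so that every figure in a paper can be rebuilt with a single `make` command.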

Practical tips

Summary