Scientific computing and data handling workflows

This page lists selected literature and online resources. Some relate to existing tutorial pages; others do not. All are expected to be of high interest to this site’s users.

Several of the resources were added based on an inspiring talk by Julia Lowndes at the SAFRED conference, Brussels, 27 Feb 2018.

Overviews

  • Wilson et al. (2017): set of good computing practices that every researcher can adopt
  • British Ecological Society (2014): planning the data life cycle; creating, processing, documenting, preserving, sharing & reusing data
  • Goudeseune et al. (2019): open data management, data management plan, repositories, standards and licenses
  • Cooper & Hsing (2017): file organisation, workflow documentation, code reproducibility and readability, writing reproducible reports, version control and code archiving
  • Marwick et al. (2018): the research compendium as a solution to share research in a reproducible way
  • Ibanez et al. (2014): vision of reproducible science, routine practices, collaboration, literate computing

See also some resources related to learning and education and the discipline of open and reproducible science.

Specific tools

Focus on version control workflows

  • Bryan (2017): rationale, workflows and tools regarding version control for project organization
  • Bryan et al. (2019): getting started with Git and GitHub workflows in RStudio

Bibliography

British Ecological Society (ed.) (2014). A guide to data management in ecology and evolution. BES Guides to Better Science. British Ecological Society, London.

Bryan J. (2017). Excuse me, do you have a moment to talk about version control? PeerJ Preprints 5: e3159v2. https://doi.org/10.7287/peerj.preprints.3159v2.

Bryan J., the STAT 545 TAs & Hester J. (2019). Happy Git and GitHub for the useR. https://happygitwithr.com/.

Cooper N. & Hsing P.-Y. (eds.) (2017). A guide to reproducible code in ecology and evolution. BES Guides to Better Science. British Ecological Society, London.

Goudeseune L., Le Roux X., Eggermont H., Bishop B., Bléry C., Brosens D., Coupremanne M., Davis R., Hautala H., Heughebaert A., Jacques C., Lee T., Rerig G. & Ungvári J. (2019). Guidance document for scientists on data management, open data, and the production of Data Management Plans. BiodivERsA report. Zenodo. https://doi.org/10.5281/zenodo.3448251.

Ibanez L., Schroeder W.J. & Hanwell M.D. (2014). Practicing open science. In: Stodden V., Leisch F. & Peng R.D. (eds.). Implementing reproducible research. CRC Press, Boca Raton, FL.

Marwick B., Boettiger C. & Mullen L. (2018). Packaging Data Analytical Work Reproducibly Using R (and Friends). The American Statistician 72 (1): 80–88. https://doi.org/10.1080/00031305.2017.1375986.

Ross Z., Wickham H. & Robinson D. (2017). Declutter your R workflow with tidy tools. PeerJ Preprints 5: e3180v1. https://doi.org/10.7287/peerj.preprints.3180v1.

Wilson G., Bryan J., Cranston K., Kitzes J., Nederbragt L. & Teal T.K. (2017). Good enough practices in scientific computing. PLOS Computational Biology 13 (6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510.