Sustainable Software
or “Is your research software correct?”
Robin Long 26/01/
Sustainable software
Is your research (software) correct?
- How do you know that your code does what it is meant to?
- We rely on code for science, but science is not always applied to our code.
- Can you reproduce what you did?
- How sure are you of your results?
We have a problem
- A lot of these problems would have been caught with better software, but what does that mean?
- A different approach.
- Be more rigorous.
- Apply principles of testing and reproducibility.
Are you sure you did that?
- Solution: automate.
- Many programs support scripting; use it instead of repeating steps by hand.
- If your program doesn't support scripting, find another program.
- Don't just automate the analysis, automate the whole chain: data selection, manipulation and analysis.
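As a minimal sketch of what automating the whole chain could look like, the script below runs selection, manipulation and analysis from a single entry point. The data, the column names and the function names (`select`, `manipulate`, `analyse`, `run_pipeline`) are all hypothetical and chosen only for illustration; a real pipeline would read its input from files.

```python
import csv
import io
import statistics

# Hypothetical raw data; in practice this would be read from a file.
RAW_CSV = """sample,temperature_c,valid
a,20.5,yes
b,21.0,yes
c,-999,no
d,19.5,yes
"""


def select(raw_text):
    """Data selection: keep only rows flagged as valid."""
    rows = csv.DictReader(io.StringIO(raw_text))
    return [row for row in rows if row["valid"] == "yes"]


def manipulate(rows):
    """Data manipulation: convert Celsius strings to Kelvin floats."""
    return [float(row["temperature_c"]) + 273.15 for row in rows]


def analyse(values):
    """Analysis: summarise the cleaned values."""
    return {"n": len(values), "mean_k": statistics.mean(values)}


def run_pipeline(raw_text):
    """Run every step in order, so one command reproduces the result."""
    return analyse(manipulate(select(raw_text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Because every step lives in the script, rerunning it repeats exactly the same selection and manipulation, with no manual step to forget.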
Make your life easier
- We tend to write programs that are easy for computers to understand, but difficult for humans.
- How do you know it is correct if you struggle to read it?
- Use a high-level language.
Why a high-level language?
“Programmers write roughly the same number of lines of code per unit time regardless of the language they use”
- Best Practices for Scientific Computing, PLOS Biology, Wilson et al.
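To illustrate the readability point, here is a small sketch of the same calculation written twice: once with manual index bookkeeping, and once using high-level constructs. The `readings` values and the rule that negative readings are invalid are assumptions made up for this example.

```python
import statistics

# Hypothetical measurements; negative values are treated as invalid.
readings = [4.0, -1.0, 6.0, 8.0]

# Low-level style: the index and counter bookkeeping hides the intent.
total = 0.0
count = 0
i = 0
while i < len(readings):
    if readings[i] >= 0:
        total = total + readings[i]
        count = count + 1
    i = i + 1
mean_low_level = total / count

# High-level style: the code reads like a description of the analysis.
valid = [r for r in readings if r >= 0]
mean_high_level = statistics.mean(valid)
```

Both versions give the same mean, but the second takes fewer lines and is easier to check by eye, which is the practical consequence of the quote above.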
I didn’t mean that version
- myCode version
- myCode version
- myCode version2 broken
- myCode version2 fixed-ish
- myCode version2 fixed
- myCode version2 broken again
- myCode version2 fixed
- myCode version2 fixed withChanges
I didn’t mean that version
- This is scary.
- It is not even clear which version is the final one.
- Are you sure which version went into the paper?
- If the analysis was split, did you all use the same version?
- Version control helps.
- There are many options; if you are starting from scratch, I recommend git.
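One way to answer "which version went into the paper?" is to store the git commit alongside each result. The sketch below assumes the analysis is run from inside a git repository; the helper names `git_output` and `code_version` are hypothetical, while `git rev-parse HEAD` and `git status --porcelain` are standard git commands.

```python
import subprocess


def git_output(*args):
    """Run a git command and return its stripped output, or None on failure."""
    try:
        done = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git is not installed, or we are not inside a repository
    return done.stdout.strip()


def code_version():
    """Describe the exact code state so it can be saved with the results."""
    commit = git_output("rev-parse", "HEAD")
    status = git_output("status", "--porcelain")
    return {
        "commit": commit or "unknown",
        # True if there are edits that are not yet committed; None if unknown.
        "uncommitted_changes": bool(status) if status is not None else None,
    }


if __name__ == "__main__":
    print(code_version())
```

Writing this dictionary into the output file of every run means a result can later be traced back to the commit that produced it.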
Share your code
- Let us assume that you have made all the changes so far:
- You have automated your analysis (one command and you get results).
- You have had someone look over it.
- It is in version control so you know what changes have been made.
- Why not upload it to a public GitHub repository?
Share your code, Why?
- Science is (or should be) about openness and transparency, so we should share.
- Saves other people re-inventing what has already been done.
- People can use your code and cite it.
- Opens the door to more collaboration and impact.
- Others can view your code, submit modifications and even enhance your work.
Are you afraid of your code?
- It can seem like a lot of work at the beginning, but it will be worth it.
- Looking back at our first examples, testing could have prevented many of these issues.
- Integrate tests with version control.
- Automated testing: you write the tests, and they are run periodically to check the state of the code.
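As a minimal sketch of what such tests can look like, the example below checks a small function against a known value, an edge case and an invalid input. The function `kinetic_energy` is a hypothetical stand-in for real analysis code. The `test_` naming follows the convention that test runners such as pytest use to discover tests, but the script also runs on its own.

```python
import math


def kinetic_energy(mass_kg, speed_m_s):
    """Hypothetical analysis function: E = 1/2 * m * v^2, in joules."""
    if mass_kg < 0:
        raise ValueError("mass must be non-negative")
    return 0.5 * mass_kg * speed_m_s ** 2


def test_known_value():
    # A case worked out by hand: 0.5 * 2 * 3^2 = 9.
    assert math.isclose(kinetic_energy(2.0, 3.0), 9.0)


def test_zero_speed():
    # Edge case: an object at rest has no kinetic energy.
    assert kinetic_energy(5.0, 0.0) == 0.0


def test_negative_mass_rejected():
    # Invalid input should fail loudly rather than return a wrong number.
    try:
        kinetic_energy(-1.0, 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative mass")


if __name__ == "__main__":
    for test in (test_known_value, test_zero_speed, test_negative_mass_rejected):
        test()
    print("all tests passed")
```

Running a file like this on every commit, or on a schedule, is one way to integrate tests with version control.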
Problems and solutions
- Reproducibility - Learn to script and automate.
- Readability - Write in a high-level language.
- Traceability - Use version control.
- Accuracy/Correctness - Testing.
- Reproducibility / Correctness - Share your code / Get a code buddy.
Help exists
www.software.ac.uk
The Software Sustainability Institute
The Institute was founded to support the UK's research software community - a community that includes the majority of the UK's researchers.
Our mission is to cultivate better, more sustainable, research software to enable world-class research (better software, better research). Software is fundamental to research: seven out of ten UK researchers report that their work would be impossible without it.