Chapter 2 Accurate
Code should be error free and appropriately quality assured. Alongside simple sense-checking, two of the key mechanisms for ensuring that code is accurate are testing and project review.
2.1 Testing
Testing our code helps to ensure that it is both correct and robust. For further guidance on what you should test, see the Coffee and Coding ‘Testing’ session.
Broadly, unit tests check that the results your code actually produces match the results you expect.
At a minimum, write unit tests for every function you create. These granular tests ensure that the code continues to give the correct answer across a range of cases, including edge cases (where unusual inputs are provided).
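As a minimal sketch of what such tests can look like in Python, the example below uses the standard-library unittest module. The function `percentage_change` and its behaviour for a zero baseline are hypothetical, invented purely for illustration. The tests cover a typical case, a no-change case, a negative baseline and an edge case where the input is invalid.

```python
import unittest


def percentage_change(old: float, new: float) -> float:
    """Return the percentage change from old to new.

    Hypothetical example function, not taken from any real project.
    """
    if old == 0:
        # Edge case: a zero baseline has no defined percentage change.
        raise ValueError("old must be non-zero")
    return (new - old) / abs(old) * 100


class TestPercentageChange(unittest.TestCase):
    def test_typical_increase(self):
        # Expected result worked out by hand: (75 - 50) / 50 * 100 = 50
        self.assertAlmostEqual(percentage_change(50, 75), 50.0)

    def test_no_change(self):
        self.assertAlmostEqual(percentage_change(10, 10), 0.0)

    def test_negative_baseline(self):
        # (-5 - (-10)) / 10 * 100 = 50
        self.assertAlmostEqual(percentage_change(-10, -5), 50.0)

    def test_zero_baseline_raises(self):
        # Unusual input: the function should fail loudly, not return nonsense.
        with self.assertRaises(ValueError):
            percentage_change(0, 5)
```

Saved in a file such as `test_percentage_change.py` (an assumed name), these tests can be run with `python -m unittest`. pytest also collects `unittest.TestCase` classes, so the same file works with either tool.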
You should ensure that your tests and linters run automatically on all pull requests, using GitHub Actions.
There are a number of tools to enable unit testing:
Language | Tools |
---|---|
R | Consider using the testthat package. For an introduction to using testthat, try reading this blog post from Inattentional Coffee or this Towards Data Science post. For examples of unit tests within a project, see here. |
Python | Consider using unittest or pytest. |
JavaScript | See here for testing with JavaScript. For data vis in JavaScript, you need unit tests for the routines that manipulate your data or data structures. Visual checks are sufficient for visualisation outputs, but you must check the output visually against both real data and test datasets that produce predictable output (e.g. where values are set to 1, 0.5 etc.). |
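The idea of a test dataset with predictable output applies in any language. The sketch below is written in Python rather than JavaScript, and the routine `scale_to_pixels` is a hypothetical stand-in for a data-manipulation step that feeds a chart. With input values of 1, 0.5 and 0, the expected output can be worked out by hand, so any deviation is easy to spot.

```python
def scale_to_pixels(values, max_height_px):
    """Map values in the range 0 to 1 onto bar heights in pixels.

    Hypothetical routine, standing in for any data-manipulation step
    that prepares data for a visualisation.
    """
    return [value * max_height_px for value in values]


# Test dataset chosen so the correct output is obvious by inspection.
TEST_VALUES = [1, 0.5, 0]


def test_scale_to_pixels_predictable_values():
    # Full height, half height and zero height on a 200-pixel axis.
    assert scale_to_pixels(TEST_VALUES, 200) == [200, 100, 0]
```

The same test dataset can then be passed to the visualisation itself for the visual check, since a bar at full, half and zero height is easy to verify by eye.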
2.2 Project review
Code review provides additional assurance that code logic is correct, as well as providing feedback on code and problem structuring. For smaller projects, the review only needs to be a simple read-through and sanity check.
Code reviews should be initiated through the creation of a pull request. The review should typically involve the reviewer pulling the code to their local machine, testing it, and leaving comments in the pull request.
Remember that it’s always easier (for both you and your reviewers) if you commit and push your changes regularly. You should merge branches into the master branch regularly so that reviewers review little and often, rather than attempting to review your entire codebase all at once.
Performing good peer review
When performing peer review, asking yourself the following questions is a good place to start:
Theme | Check |
---|---|
Clear, concise code |
|
Packages |
|
Validation |
|
Verification |
|
Testing |
|
Documentation |
|
If you’re reviewing the code of a more experienced coder, it is a chance to learn, and you have every right to ask for an explanation if something is unclear. It’s in everyone’s interest that you understand what you’re reading. If you don’t understand it, that may be because the author has made a mistake or over-complicated something.