Chapter 2 Accurate

Code should be error free and appropriately quality assured. Alongside simple sense-checking, two of the key mechanisms for ensuring that code is accurate are testing and project review.

2.1 Testing

Testing our code helps to ensure that it is both correct and robust. For further guidance on what you should test, see the Coffee and Coding ‘Testing’ session.

As a broad description, unit tests should exist to check that your actual results match your expected results.

Unit tests should test, as a minimum, any functions you create. The purpose of these granular tests is to ensure the code continues to give the correct answer in a range of cases, and even in edge cases (where unusual inputs are provided).

You should ensure that your tests and linters run automatically on all pull requests, using Github Actions.

There are a number of tools to enable unit testing:

Language Tools
R In R, consider using the testthat package. For an introduction to using testthat, try reading this blog post from Inattentional Coffee or this Towards Data Science post. For an example unit tests within a project, see here.
Python Consider using unittest or pytest.
Javascript See here for testing with javascript. For data vis in Javascript, you need unit tests of routines that manipulate your data or data structures. Visual checks are sufficient of visualisation outputs, but you must make visual checks of the output against real data, and some test datasets that produce predictable output (e.g. where values are set to 1, 0.5 etc.).

2.2 Project review

Code review provides additional assurance that code logic is correct, as well as providing feedback on code and problem structuring. For smaller projects, the review only needs to be a simple read-through and sanity check.

Code reviews should be initiated through the creation of a pull request. The review should typically involve the reviewer pulling the code to their local machine, testing it, and leaving comments in the pull request.

Remember that it’s always easier (for both you and your reviewers) if you commit and push your changes regularly. You should merge branches into the master regularly so that reviewers review little and often, rather than attempting to review your entire codebase all at once.

Performing good peer review

When performing peer review, asking yourself the following questions is a good place to start:

Theme Check
Clear, concise code
  • Did the code need to be explained to me?
  • Is the code readable?
  • Could the same thing be achieved in a simpler way?
  • Does the code follow style guidelines?
  • Are the names of functions and variables sensible and descriptive?
Packages
Validation
  • Is the process doing what is needed?
Verification
  • Are the calculations correct?
Testing
  • Are there unit tests (if required)?
  • Is it tested with sufficient coverage?
  • Are there any cases that might break it?
Documentation
  • Are the code comments sufficient?
  • Have assumptions been documented?
  • Has accompanying documentation (e.g. readme,wiki) been updated (if required)?
  • Have AQA documents been updated (if required)?

If you’re reviewing the code of a more experienced coder, it is a chance to learn and you have every right to ask for an explanation if there’s something that is unclear. It’s in everyone’s interest that you understand what you’re reading and it could be that you don’t understand it because the author has made a mistake or over-complicated something.