Testing your software

Principles

  1. Testing should not be a formality, but a necessity. Think early about which bugs testing can prevent, and add tests promptly when actual bugs appear.
  2. Tests should be run against every code change. A dead test is worse than no test.

Unit test

Unit testing is the most common kind of testing. It lets you check easily whether a component of your code functions as intended, especially in corner cases.

Typically, the specification of such a component is clear and it needs little external setup, so tests for it are easier to write.

Best Practice: Many languages provide testing frameworks for this, like JUnit in Java or unittest in Python. Try to use these "standard" libraries as much as possible.
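As an illustration of the points above, here is a minimal sketch of a unit test written with Python's unittest. The `clamp` function is a hypothetical component invented for this example; the interesting part is that the corner cases (range boundaries, a zero-width range, an invalid range) each get an explicit check.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical component under test: restrict value to [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_values_outside_range_are_clipped(self):
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)

    def test_corner_cases_at_boundaries(self):
        # Boundaries are part of the range; a zero-width range is allowed.
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)
        self.assertEqual(clamp(7, 3, 3), 3)

    def test_invalid_range_is_rejected(self):
        with self.assertRaises(ValueError):
            clamp(1, 10, 0)
```

Saved in a file such as test_clamp.py (the name is an assumption), these tests can be run with `python -m unittest`.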

Smoke test

A smoke test is typically performed during development on your own computer. You just run the (changed) code in your local environment and see if it works. It normally helps you gain the information and confidence to continue development. It also tells you whether your code is ready for further refinement in preparation for a change submission (e.g. a pull request).

  • Your smoke test is successful != your code has no bugs, since there might be other inputs you didn't consider.
  • Your smoke test is successful != your code is submittable, since it might break other tests, or it might not yet be well-structured or well-written.
  • Your smoke test is successful != your code is runnable on CI or on other people's computers, since it might depend on a temporary local environment (e.g. some locally generated files).

Best Practice: It is useful to provide a command shortcut in a Makefile or similar so people can run the smoke test easily. Once regression tests are set up (introduced below), the smoke test can also reuse a sub-command prepared for them.

Regression test

Regression tests prevent surprises. They track whether, given a set of inputs, your code still produces the same outputs as before, so that you don't break anything. (If some previously expected behavior is broken, we call it a regression.)

Regression tests are inherently partial, since the set of inputs is limited, and typically you don't need to make sure that the recorded outputs are 100% the expected behavior. Sometimes we can even use regression tests to track unexpected outputs.

Best Practice: Provide an easy way to test against recordings and to update these recordings in batch. If your code has nondeterministic or imprecise outputs (like randomly generated ids, or floating-point computation with an allowed error), make sure your checking code can compare them meaningfully. Try to reduce such outputs in your code, though.
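One possible shape for such a check is sketched below. It is only a sketch under assumptions: recordings are stored as JSON files, generated ids look like `id-` followed by 8 hex digits, and all names (`normalize`, `matches`, `check_recording`, `demo`) are hypothetical helpers, not part of any library. Ids are replaced by a placeholder before comparison, numbers are compared with a tolerance, and an `update` flag rewrites the recording.

```python
import json
import math
import re
import tempfile
from pathlib import Path

# Assumption: generated ids look like "id-" followed by 8 hex digits.
ID_PATTERN = re.compile(r"id-[0-9a-f]{8}")


def normalize(value):
    """Replace generated ids with a placeholder so recordings stay stable."""
    if isinstance(value, str):
        return ID_PATTERN.sub("id-<ID>", value)
    if isinstance(value, list):
        return [normalize(v) for v in value]
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    return value


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def matches(actual, recorded, rel_tol=1e-9):
    """Compare nested structures, allowing a small error on numbers."""
    if is_number(actual) and is_number(recorded):
        return math.isclose(actual, recorded, rel_tol=rel_tol, abs_tol=1e-12)
    if isinstance(actual, dict) and isinstance(recorded, dict):
        return actual.keys() == recorded.keys() and all(
            matches(actual[k], recorded[k], rel_tol) for k in actual
        )
    if isinstance(actual, list) and isinstance(recorded, list):
        return len(actual) == len(recorded) and all(
            matches(a, r, rel_tol) for a, r in zip(actual, recorded)
        )
    return actual == recorded


def check_recording(name, output, recordings_dir, update=False):
    """Compare output with its recording, or rewrite the recording if update is set."""
    path = Path(recordings_dir) / f"{name}.json"
    normalized = normalize(output)
    if update:
        path.write_text(json.dumps(normalized, indent=2, sort_keys=True))
        return True
    recorded = json.loads(path.read_text())
    return matches(normalized, recorded)


def demo():
    """Record once, then compare two later runs against the recording."""
    with tempfile.TemporaryDirectory() as recordings_dir:
        first = {"id": "id-0123abcd", "score": 0.1 + 0.2}
        check_recording("case", first, recordings_dir, update=True)
        same = check_recording(
            "case", {"id": "id-deadbeef", "score": 0.3}, recordings_dir
        )
        changed = check_recording(
            "case", {"id": "id-deadbeef", "score": 0.31}, recordings_dir
        )
        return same, changed
```

In `demo`, the second run differs from the recording only in its generated id and in floating-point rounding, so it is accepted, while the third run changes the score and is reported as a mismatch. Updating in batch then amounts to calling `check_recording` with `update=True` for every case.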

End-to-end test

End-to-end (e2e) tests test the behavior of an entire system, including both your code and its external dependencies.

External dependencies can vary a lot: they might be some new input data, an API over the network, etc. They are typically difficult or expensive to include in your regularly run tests (e.g. unit tests and regression tests). Some e2e tests can be run and reproduced automatically (which requires quite a bit of infrastructure investment), while others can only be run manually. However, even if your e2e tests are run manually, you need to make sure that the steps are carefully recorded (e.g. in a playbook) so anyone can perform them in a reproducible way.
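For e2e tests that are automated but too expensive for every run, one common arrangement is to keep them in the test suite and make them opt-in. The sketch below assumes a hypothetical environment variable `RUN_E2E` and a hypothetical `fetch_status` function standing in for a call to a real external service; neither comes from any particular project.

```python
import os
import unittest

# Assumption: RUN_E2E is a hypothetical opt-in switch, not a standard variable.
RUN_E2E = os.environ.get("RUN_E2E") == "1"


def fetch_status():
    """Hypothetical stand-in for a call to a real external dependency."""
    raise NotImplementedError("replace with a real call to the external service")


@unittest.skipUnless(RUN_E2E, "set RUN_E2E=1 to run end-to-end tests")
class ServiceEndToEndTest(unittest.TestCase):
    def test_service_reports_ok(self):
        self.assertEqual(fetch_status(), "ok")
```

With this arrangement, a regular test run reports the e2e test as skipped rather than silently omitting it, and a scheduled or manual run can enable it by setting the variable.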

Sometimes e2e tests are called "integration tests" as well. "End-to-end" emphasizes that an entire system is tested, taking inputs from end users and producing outputs for end users. "Integration" emphasizes that your code is tested for whether it still works well with other components, which might be evolving independently, beyond your control.