Programming Fundamentals
Testing: unittest, Mocks, and TDD
At Google every developer is required to cover code with tests. At Facebook, deployment is automatic - only tests stop bad code. Without tests, any refactoring is a lottery.
- CI/CD: GitHub Actions runs tests on every push - the test is the gate for deployment
- Regressions: Airbnb lost millions from a payment bug that a unit test would have caught
- TDD in embedded: NASA uses TDD for space probe software
- pytest: the Python testing standard at Netflix, Dropbox, Mozilla
Prerequisites
- Writing functions with explicit parameters and return values (testable code)
- Basic exception handling: try/except, raise, common error types
- pip installed and some experience with virtual environments (venv, uv, poetry)
From Kent Beck's SUnit to Jest, pytest, and property-based testing
Automated unit tests existed long before xUnit. In 1983 Boris Beizer published Software Testing Techniques, the classic textbook on equivalence partitioning, boundary-value analysis, and decision tables. In 1989 Kent Beck wrote SUnit for Smalltalk, the first xUnit-family framework with two key ideas: a test is a small class, and assert is just a method. In 1997 Beck, together with Erich Gamma (one of the GoF), ported SUnit to JUnit for Java on a plane to OOPSLA. JUnit became the template for NUnit, PyUnit (Python's unittest), PHPUnit, and dozens of others.
In 1999 John Hughes and Koen Claessen at Chalmers released QuickCheck for Haskell: instead of writing examples, the developer describes properties, and the library generates inputs and shrinks counterexamples on failure. The idea was later adopted by Hypothesis for Python (David MacIver, 2013), ScalaCheck, and fast-check for JS. In 2002 Beck published Test-Driven Development by Example and formalized the Red-Green-Refactor cycle; test-first programming was already one of the 12 Extreme Programming practices.
In 2003 Holger Krekel started py.test, which by the 2010s had overtaken unittest in popularity thanks to fixtures, parametrize, and human-readable error messages. In 2005 Steven Baker released RSpec for Ruby and popularized the BDD describe/it style. In 2014 Facebook open-sourced Jest, whose development Christoph Pojer later led, for testing React components; snapshot testing and parallel test runs became standard in the JavaScript ecosystem. By 2025 the typical Python testing stack is pytest plus Hypothesis plus pytest-asyncio plus coverage.py.
Unit tests: test one thing at a time
A **unit test** checks one function or method in isolation from the rest of the code. A good test is: Fast, Independent, Repeatable, Self-checking, and Timely (FIRST).
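A minimal sketch of what such a test can look like, laid out as Arrange-Act-Assert. The function `apply_discount` and both test names are hypothetical, invented only for this illustration; the tests are plain functions that pytest would collect by their `test_` prefix.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce price by the given percent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


def test_apply_discount_reduces_price():
    # Arrange: prepare the inputs
    price, percent = 200.0, 25.0
    # Act: call the unit under test exactly once
    result = apply_discount(price, percent)
    # Assert: check one observable outcome (200 * 0.75 is exactly 150.0)
    assert result == 150.0


def test_apply_discount_rejects_invalid_percent():
    # The error path gets its own small, independent test
    try:
        apply_discount(100.0, 150.0)
    except ValueError:
        pass  # expected
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

Each test touches no files, network, or shared state, so it stays fast, independent, and repeatable. In a pytest project the second test would more commonly be written with `pytest.raises(ValueError)`.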
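**pytest.approx()** is needed for float comparisons. Never write `assert 0.1 + 0.2 == 0.3` - the comparison is false because 0.1 and 0.2 have no exact binary floating-point representation, so the sum is 0.30000000000000004. Write `assert result == pytest.approx(0.3)` or `assert abs(result - 0.3) < 1e-9` instead.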
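The difference between exact and approximate comparison can be sketched as follows. The test name is illustrative; `math.isclose` is shown as a standard-library alternative that the text above does not mention.

```python
import math

import pytest


def test_float_sum_needs_tolerance():
    total = 0.1 + 0.2

    # Exact comparison fails: total is 0.30000000000000004, not 0.3
    assert total != 0.3

    # pytest.approx compares within a tolerance (relative 1e-6 by default)
    assert total == pytest.approx(0.3)

    # Manual absolute tolerance, as suggested in the text
    assert abs(total - 0.3) < 1e-9

    # Standard-library alternative (relative tolerance 1e-9 by default)
    assert math.isclose(total, 0.3)
```

`pytest.approx` also accepts explicit `rel=` and `abs=` arguments when the default tolerance does not fit the domain.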
What should a good unit test do?
Mocks: isolating external dependencies
A **Mock** is a stand-in object that simulates the behavior of a real dependency (API, database, email service). Without mocks, a test would send real emails or modify real data.
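As a rough illustration, the sketch below replaces an email service with `MagicMock` from the standard library. The function `register_user` and the `mailer.send(to=..., subject=...)` interface are assumptions made up for this example, not a real library API.

```python
from unittest.mock import MagicMock


def register_user(email: str, mailer) -> bool:
    """Hypothetical function under test: sends a welcome email via `mailer`."""
    if "@" not in email:
        return False
    mailer.send(to=email, subject="Welcome!")
    return True


def test_register_user_sends_welcome_email():
    mailer = MagicMock()  # stand-in for the real email service

    result = register_user("ada@example.com", mailer)

    assert result is True
    # Verify the interaction: exactly one call, with exactly these arguments
    mailer.send.assert_called_once_with(to="ada@example.com", subject="Welcome!")


def test_register_user_skips_invalid_email():
    mailer = MagicMock()

    assert register_user("not-an-email", mailer) is False
    # No email may be sent for invalid input
    mailer.send.assert_not_called()
```

Passing the dependency in as a parameter is the design choice that makes this work: the test hands over a mock, production code hands over the real service, and no real email is ever sent during the test run.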
Why use a mock instead of a real object in a test?
TDD: write the test first, then the code
**Test-Driven Development** - the Red → Green → Refactor cycle: 1) Write a failing test (Red). 2) Write the minimal code to make it pass (Green). 3) Improve the code (Refactor). Repeat.
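One pass through the cycle might look like the sketch below. The function `slugify` and its requirement are hypothetical; the three steps are shown side by side in one file only so the progression is visible, whereas in practice each step replaces the previous version.

```python
# Step 1 (Red): the test is written first. Before slugify existed,
# running this test failed with a NameError.
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"


# Step 2 (Green): the simplest code that makes the test pass.
def slugify_first_version(title):
    return title.lower().replace(" ", "-")


# Step 3 (Refactor): same behavior, cleaner code (type hints, docstring,
# named constant). The test is re-run to confirm nothing changed.
SEPARATOR = "-"


def slugify(title: str) -> str:
    """Hypothetical example: convert a title into a URL slug."""
    return title.lower().replace(" ", SEPARATOR)
```

The next requirement (for example, stripping punctuation) would start a new cycle with its own failing test rather than being added during the refactor step.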
Misconception: TDD equals "100% coverage". First write all tests, then all the code. Below 100% coverage the project is unprofessional.
Reality: TDD is the Red-Green-Refactor cycle one test at a time, not "all tests first". Tests are written iteratively against the current micro-task. 100% coverage is not a goal but a side effect: cover behavior, not lines. 70-80% meaningful coverage beats 100% trivial coverage.
Why the confusion arises: TDD is often confused with writing all tests up front and with the coverage metric. Kent Beck describes short cycles (5-10 minutes): one test, minimal code, refactor. 100% coverage is often reached via getter/setter tests that catch nothing. The real value of TDD is in design (testable code tends to be loosely coupled), not in the coverage percentage.
In the TDD cycle, what does the 'Red' phase mean?
Key ideas
- Unit test: one function, AAA pattern (Arrange-Act-Assert)
- pytest.approx() for float comparisons
- Mock: dependency stand-in. MagicMock, assert_called_once_with()
- TDD cycle: Red (test fails) → Green (minimal code) → Refactor
- pytest.mark.parametrize: one test = many input datasets
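The last idea in the list, `pytest.mark.parametrize`, can be sketched as follows. The function `is_leap_year` is a hypothetical example chosen because its boundary cases (1900, 2000) are easy to state as data.

```python
import pytest


def is_leap_year(year: int) -> bool:
    """Hypothetical function under test: Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


@pytest.mark.parametrize(
    "year, expected",
    [
        (2024, True),   # divisible by 4
        (2023, False),  # not divisible by 4
        (1900, False),  # divisible by 100 but not by 400
        (2000, True),   # divisible by 400
    ],
)
def test_is_leap_year(year, expected):
    assert is_leap_year(year) is expected
```

pytest runs the test body once per tuple and reports each case separately, so a failure for one year does not hide the results for the others.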
Related topics
Testing and refactoring are inseparable:
- Refactoring — Tests are the safety net during refactoring
- Functions — Good functions (single responsibility) are easy to test
Questions for reflection
- How do you test a function that depends on the current time (`datetime.now()`)? What do you mock?
- What is the difference between a mock, a stub, and a spy? When do you use each?
- Why is a test that always passes an anti-pattern? How do you verify that a test actually checks the behavior?
Related lessons
- prog-14-refactoring — Tests are a precondition for safe refactoring
- prog-13-patterns — Test patterns (AAA, fixtures) are patterns for test code
- stat-11-bayesian — Hypothesis testing in statistics follows the same logic as unit tests
- prob-01-intro — Probability of failure in production is the statistics of test coverage
- devops-01