Code Coverage
| Metric | What it means | Realistic target |
|---|---|---|
| Line coverage | % of source lines executed by tests | 75–85% |
| Branch coverage | % of decision branches (if/else, switch arms) tested | 60–75% |
| Function coverage | % of functions called by at least one test | 85–95% |
Line coverage is easy and noisy. Branch coverage is the more honest signal — covering both arms of every if/else is what catches edge-case bugs.
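To make the gap between the two metrics concrete, here is a minimal sketch that reads both from an lcov tracefile. It assumes the standard per-file summary records (LF/LH for lines found/hit, BRF/BRH for branches found/hit). The function name `coverage_summary` is made up for illustration, and branch figures only appear if the tracefile was collected with branch data.

```shell
#!/usr/bin/env bash
# coverage_summary TRACEFILE
# Sums the per-file summary records of an lcov tracefile:
#   LF/LH   = lines found / lines hit
#   BRF/BRH = branches found / branches hit
coverage_summary() {
  awk -F: '
    $1 == "LF"  { lf  += $2 }
    $1 == "LH"  { lh  += $2 }
    $1 == "BRF" { brf += $2 }
    $1 == "BRH" { brh += $2 }
    END {
      if (lf > 0)  printf "lines: %.1f%% (%d/%d)\n", 100 * lh / lf, lh, lf
      else         print  "lines: n/a"
      if (brf > 0) printf "branches: %.1f%% (%d/%d)\n", 100 * brh / brf, brh, brf
      else         print  "branches: n/a (no branch data in tracefile)"
    }
  ' "$1"
}
```

A file whose two lines both ran but whose single conditional only ever took one outcome would report 100% lines and 50% branches, which is exactly the case line coverage hides.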
Workflow
# 1. Run tests + collect coverage
flutter test --coverage
# 2. Generate HTML report
brew install lcov # macOS
genhtml coverage/lcov.info -o coverage/html
open coverage/html/index.html
# 3. Exclude generated / boilerplate files BEFORE the threshold check
# (newer lcov releases may reject patterns that match nothing;
#  --ignore-errors unused relaxes that)
lcov --remove coverage/lcov.info \
  '**/*.g.dart' '**/*.freezed.dart' \
  '**/generated/**' '**/*_provider.dart' \
  -o coverage/lcov_filtered.info
# 4. CI gate on the filtered report (Linux example with bc)
COVERAGE=$(lcov --summary coverage/lcov_filtered.info 2>&1 \
  | grep "lines" | awk '{print $2}' | sed 's/%//')
echo "Coverage: $COVERAGE%"
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
  echo "Coverage below 80%" >&2
  exit 1
fi
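When the HTML report is more than you need, a per-file breakdown in the terminal can show where coverage is thin. The sketch below is one possible way to do that, assuming the tracefile follows the usual lcov layout (an SF record naming the file, LF/LH totals, then end_of_record). `coverage_by_file` is a hypothetical helper name, not part of lcov.

```shell
#!/usr/bin/env bash
# coverage_by_file TRACEFILE
# Prints "<percent>  <path>" for every file in the tracefile, lowest coverage first.
coverage_by_file() {
  awk -F: '
    $1 == "SF" { file = substr($0, 4) }   # keep the full path even if it contains ":"
    $1 == "LF" { lf = $2 }
    $1 == "LH" { lh = $2 }
    $0 == "end_of_record" {
      if (lf > 0) printf "%5.1f%%  %s\n", 100 * lh / lf, file
      lf = 0; lh = 0                      # reset for the next file record
    }
  ' "$1" | sort -n
}

# Usage (path is the one produced by the workflow above):
#   coverage_by_file coverage/lcov_filtered.info | head -n 10
```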
What to test (and what to skip)
| ✅ Worth covering | ❌ Diminishing returns |
|---|---|
| Repositories, services, business logic | Generated code (.g.dart, .freezed.dart) |
| State notifiers / Bloc transitions | DTOs with no logic |
| Form validators, parsers, mappers | Platform channel glue (test the wrapper, not the bridge) |
| Critical user flows (widget + integration) | Pure layout widgets (use golden tests instead) |
| Error handling paths | UI rendering minutiae |
| Edge cases (empty, null, very large, very small) | One-line getters |
A 100% coverage app with no test for "what happens if the API returns 500" is still broken. Coverage doesn't equal correctness.
CI integration patterns
| Practice | Why |
|---|---|
| Run coverage on every PR; fail below threshold | Prevent regressions |
| Track coverage trend (Codecov, Coveralls) | See drift over time |
| Comment PR diffs with coverage delta | Reviewers see impact |
| Distinguish coverage on changed lines vs whole repo | "Patch coverage" prevents new untested code |
| Exclude .g.dart and generated files BEFORE the threshold check | Otherwise coverage looks lower than reality |
| Don't block PRs on a 0.1% drop | Threshold should be principled, not punitive |
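One way to combine "fail below threshold" with "don't block on a 0.1% drop" is a gate with an absolute floor plus a tolerance for small regressions against the base branch. The following is a rough sketch under assumptions: `line_pct` and `coverage_gate` are illustrative names, the 80% floor and 0.5-point tolerance are placeholder defaults to tune per team, and both inputs are lcov tracefiles with LF/LH records.

```shell
#!/usr/bin/env bash
# line_pct TRACEFILE -> overall line coverage as a percentage with two decimals
line_pct() {
  awk -F: '
    $1 == "LF" { found += $2 }
    $1 == "LH" { hit   += $2 }
    END { if (found > 0) printf "%.2f\n", 100 * hit / found; else print "0.00" }
  ' "$1"
}

# coverage_gate BASE_TRACEFILE HEAD_TRACEFILE [FLOOR] [TOLERANCE]
# Fails (exit status 1) only if head coverage is below FLOOR, or if it
# dropped from base by more than TOLERANCE percentage points.
coverage_gate() {
  local base head floor tolerance
  base=$(line_pct "$1")
  head=$(line_pct "$2")
  floor=${3:-80}          # assumed default, not a recommendation
  tolerance=${4:-0.5}     # assumed default, not a recommendation
  awk -v b="$base" -v h="$head" -v f="$floor" -v t="$tolerance" 'BEGIN {
    if (h < f)     { printf "FAIL: %.2f%% is below the %s%% floor\n", h, f; exit 1 }
    if (b - h > t) { printf "FAIL: dropped %.2f points (tolerance %s)\n", b - h, t; exit 1 }
    printf "OK: %.2f%% (base %.2f%%)\n", h, b
  }'
}
```

With these defaults, a drop from 80.00% to 79.80% against a 75% floor passes, while a drop to 70% fails either check.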
Common mistakes to avoid
❌ Treating 100% coverage as a goal
Leads to tests-of-tests, brittle setups, low-value assertions.
✅ Aim for coverage on code that MATTERS; let coverage drop where it doesn't.
❌ Counting generated code in coverage
Inflates or deflates artificially. Always exclude.
❌ Writing tests just to hit a number
verify(mock.foo()).called(1); — doesn't test behavior, just that mock was called.
❌ Ignoring coverage for golden / integration tests
Some flows are inherently visual / end-to-end. Golden + integration cover what
line-coverage tests can't.
❌ Coverage gates without team buy-in
80% rule lands → team starts gaming → quality regresses.
✅ Agree on what to test; coverage is the artifact, not the contract.
❌ Forgetting to test error paths
if (response.statusCode != 200) throw — never tested, fails in production.
Always cover BOTH branches of important conditions.
Interview follow-ups
- What's the difference between line coverage and branch coverage? Line coverage just asks "did this line execute?" Branch coverage asks "did both arms of this if, every case in this switch, every conditional expression get exercised?" Branch coverage is the more honest signal: a single test can execute every line of an if with no else, or of a one-line conditional expression, while exercising only one of its two outcomes.
- Why exclude generated code from coverage? It's not your code: testing it tests the generator, not your logic. Including it inflates or deflates the metric arbitrarily. Always filter with lcov --remove (or your CI tool's exclude config).
- Should you require 100% coverage? Almost never. 100% pushes teams to write meaningless tests (mocks returning mocks, getters being "tested"). 80–85% is enough to catch most regressions without rewarding noise. The exception is critical libraries (auth, payments) where 95%+ is justified, provided it comes from high-quality tests rather than coverage padding.
- How do you measure coverage on changed lines only (patch coverage)? Tools like Codecov / Coveralls map the PR's diff onto the uploaded coverage report (lcov.info) and report coverage for the changed lines only. This is often more useful than absolute coverage: "new code is 92% covered" matters more than "repo dropped from 78% to 77.6%."
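The idea behind patch coverage can be sketched without any hosted service. This is only an illustration, not how Codecov or Coveralls implement it: `patch_coverage` is a hypothetical helper, the changed-lines file uses an assumed "path:line" format that a real pipeline would derive from the PR diff, and paths must match the SF records in the tracefile exactly.

```shell
#!/usr/bin/env bash
# patch_coverage TRACEFILE CHANGED_LINES
# CHANGED_LINES holds one "path:line" entry per line (assumed format).
# Only changed lines that are instrumented (have a DA record) are counted.
patch_coverage() {
  awk '
    NR == FNR { changed[$0] = 1; next }      # first input: the changed-lines list
    /^SF:/    { file = substr($0, 4); next } # current source file in the tracefile
    /^DA:/    {
      split(substr($0, 4), da, ",")          # da[1] = line number, da[2] = hit count
      key = file ":" da[1]
      if (key in changed) { total++; if (da[2] > 0) hit++ }
    }
    END {
      if (total == 0) { print "patch: no instrumented lines changed"; exit }
      printf "patch: %.1f%% (%d/%d changed lines covered)\n", 100 * hit / total, hit, total
    }
  ' "$2" "$1"
}
```

Changed lines with no DA record (comments, blank lines, declarations) are ignored, so the percentage reflects only executable code touched by the PR.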