Code Coverage

Metric	What it means	Realistic target
Line coverage	% of source lines executed by tests	75–85%
Branch coverage	% of decision branches (if/else, switch arms) tested	60–75%
Function coverage	% of functions called by at least one test	85–95%

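Line coverage is easy to collect but noisy. Branch coverage is the more honest signal: exercising both outcomes of every condition is what surfaces edge-case bugs. Note that `flutter test --coverage` records line hits only; recent Flutter versions also accept `--branch-coverage` to collect branch data.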
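To make the gap between the two metrics concrete, here is a minimal sketch that sums the summary records of an lcov tracefile. It assumes the file carries `LF`/`LH` (lines found/hit) records and, when branch data was collected, `BRF`/`BRH` (branches found/hit) records. The function name `coverage_rates` and the sample file are hypothetical.

```shell
# coverage_rates FILE
# Prints "lines=<pct> branches=<pct>" for an lcov tracefile.
# Assumption: the tracefile contains LF/LH (and optionally BRF/BRH) records.
coverage_rates() {
  awk -F: '
    function pct(hit, found) {
      return (found > 0) ? sprintf("%.1f", 100 * hit / found) : "n/a"
    }
    $1 == "LF"  { lf  += $2 }   # lines found (instrumented)
    $1 == "LH"  { lh  += $2 }   # lines hit at least once
    $1 == "BRF" { brf += $2 }   # branches found
    $1 == "BRH" { brh += $2 }   # branches hit at least once
    END { printf "lines=%s branches=%s\n", pct(lh, lf), pct(brh, brf) }
  ' "$1"
}

# Demo: a hypothetical file containing one "if" without an "else",
# exercised only with a true condition. Every line runs, one branch does not.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SF:lib/discount.dart
DA:1,1
DA:2,1
DA:3,1
LF:3
LH:3
BRDA:2,0,0,1
BRDA:2,0,1,0
BRF:2
BRH:1
end_of_record
EOF
coverage_rates "$sample"   # prints: lines=100.0 branches=50.0
rm -f "$sample"
```

When the tracefile has no branch records, the sketch reports `branches=n/a` rather than 0%, so a missing metric is not mistaken for a failing one.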

Workflow

# 1. Run tests + collect coverage (writes coverage/lcov.info)
flutter test --coverage

# 2. Exclude generated / boilerplate code BEFORE reporting or gating
brew install lcov                                  # macOS (Debian/Ubuntu: apt-get install lcov)
# Drop the *_provider.dart pattern if your providers contain real logic.
lcov --remove coverage/lcov.info \
  '**/*.g.dart' '**/*.freezed.dart' \
  '**/generated/**' '**/*_provider.dart' \
  -o coverage/lcov_filtered.info

# 3. Generate HTML report from the filtered data
genhtml coverage/lcov_filtered.info -o coverage/html
open coverage/html/index.html                      # macOS

# 4. CI gate on the filtered data (Linux example with bc)
COVERAGE=$(lcov --summary coverage/lcov_filtered.info 2>&1 \
  | grep -E '^ +lines' | awk '{print $2}' | sed 's/%//')
echo "Coverage: $COVERAGE%"
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
  echo "Coverage below 80%"
  exit 1
fi

What to test (and what to skip)

✅ Worth covering	❌ Diminishing returns
Repositories, services, business logic	Generated code (`.g.dart`, `.freezed.dart`)
State notifiers / Bloc transitions	DTOs with no logic
Form validators, parsers, mappers	Platform channel glue (test the wrapper, not the bridge)
Critical user flows (widget + integration)	Pure layout widgets (use golden tests instead)
Error handling paths	UI rendering minutiae
Edge cases (empty, null, very large, very small)	One-line getters

A 100% coverage app with no test for "what happens if the API returns 500" is still broken. Coverage doesn't equal correctness.

CI integration patterns

Practice	Why
Run coverage on every PR; fail below threshold	Prevent regressions
Track coverage trend (Codecov, Coveralls)	See drift over time
Comment PR diffs with coverage delta	Reviewers see impact
Distinguish coverage on changed lines vs whole repo	"Patch coverage" prevents new untested code
Exclude `.g.dart` and other generated files BEFORE the threshold check	Otherwise generated code skews the percentage (usually downward)
Don't block PRs on a 0.1% drop	Threshold should be principled, not punitive
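The last two practices, a hard floor plus tolerance for small dips, can be combined in one gate. The sketch below is one possible shape, not a standard tool: the function name `coverage_gate`, its argument order, and the 0.5-point default tolerance are all assumptions chosen for illustration.

```shell
# coverage_gate BASE HEAD MIN [TOLERANCE]
# BASE/HEAD are coverage percentages for the base branch and the PR.
# Fails (exit 1) when HEAD is under MIN, or when HEAD dropped more than
# TOLERANCE percentage points below BASE. TOLERANCE defaults to 0.5 (assumed).
coverage_gate() {
  awk -v base="$1" -v head="$2" -v min="$3" -v tol="${4:-0.5}" 'BEGIN {
    if (head + 0 < min + 0) {
      printf "FAIL: %s%% is below the %s%% minimum\n", head, min
      exit 1
    }
    if (base - head > tol + 0) {
      printf "FAIL: dropped %.1f points (tolerance %s)\n", base - head, tol
      exit 1
    }
    print "OK"
  }'
}

# A 0.4-point dip that stays above the floor passes.
coverage_gate 78 77.6 75   # prints: OK
```

Keeping the floor and the tolerance as separate arguments lets a team tighten one without touching the other.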

Common mistakes to avoid

❌ Treating 100% coverage as a goal
   Leads to tests-of-tests, brittle setups, low-value assertions.
   ✅ Aim for coverage on code that MATTERS; let coverage drop where it doesn't.

❌ Counting generated code in coverage
   Inflates or deflates artificially. Always exclude.

❌ Writing tests just to hit a number
   A test whose only assertion is verify(mock.foo()).called(1); proves the mock was called, not that the code behaves correctly.

❌ Ignoring what golden / integration tests cover
   Some flows are inherently visual or end-to-end. Golden and integration tests
   cover what unit tests written for line coverage can't.

❌ Coverage gates without team buy-in
   80% rule lands → team starts gaming → quality regresses.
   ✅ Agree on what to test; coverage is the artifact, not the contract.

❌ Forgetting to test error paths
   if (response.statusCode != 200) throw — never tested, fails in production.
   Always cover BOTH branches of important conditions.

Interview follow-ups

What's the difference between line coverage and branch coverage? Line coverage asks "did this line execute?" Branch coverage asks "was every outcome of each decision exercised: both arms of this if, every case of this switch, both sides of each conditional expression?" Branch is more honest: a single test can execute every line of an if that has no else (or of a one-line ternary) and still never take the other path.
Should you require 100% coverage? Almost never. 100% pushes teams to write meaningless tests (mocks returning mocks, getters being "tested"). 80–85% is enough to catch most regressions without rewarding noise. The exception is critical libraries (auth, payments) where 95%+ is justified — but with high-quality tests, not coverage padding.
How do you measure coverage on changed lines only (patch coverage)? Tools like Codecov and Coveralls take the coverage report uploaded for the PR, intersect it with the PR's diff against the base branch, and report the share of added or modified lines that tests executed. This is often more useful than absolute coverage: "new code is 92% covered" matters more than "repo dropped from 78% to 77.6%."
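The patch-coverage idea can be approximated locally. This is a rough sketch, not how Codecov or Coveralls implement it. It assumes a zero-context unified diff (`git diff -U0`) and that the `SF:` paths in the tracefile match the paths in the diff, for example because the Flutter project sits at the repository root. The function names `changed_lines` and `patch_coverage` are hypothetical.

```shell
# changed_lines: reads a zero-context unified diff on stdin and prints
# "path:line" for every added or modified line in the new version.
changed_lines() {
  awk '
    /^\+\+\+ / { path = substr($2, 3); next }   # "+++ b/lib/x.dart" -> "lib/x.dart"
    /^@@ / {
      split($3, parts, ",")                      # "+12,3" -> start 12, count 3
      start = substr(parts[1], 2) + 0
      count = (parts[2] == "") ? 1 : parts[2] + 0
      for (i = 0; i < count; i++) print path ":" (start + i)
    }
  '
}

# patch_coverage TRACEFILE CHANGED
# Prints the percentage of changed lines with coverage data that were hit,
# or "n/a" when none of the changed lines are instrumented.
patch_coverage() {
  awk -F'[:,]' '
    NR == FNR   { changed[$0] = 1; next }        # first file: set of path:line keys
    $1 == "SF"  { path = substr($0, 4); next }   # current source file
    $1 == "DA"  {                                # DA:<line>,<hit count>
      key = path ":" $2
      if (key in changed) { total++; if ($3 > 0) hit++ }
    }
    END {
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hit / total
    }
  ' "$2" "$1"
}

# Example (assumes the base branch is origin/main):
#   git diff -U0 origin/main...HEAD | changed_lines > changed.txt
#   patch_coverage coverage/lcov_filtered.info changed.txt
```

Changed lines with no `DA` record (comments, blank lines, declarations) are ignored, so the percentage reflects only lines the coverage tool could instrument.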
