Code Coverage

Low priority · Asked in ~40% of senior interviews

| Metric | What it means | Realistic target |
|---|---|---|
| Line coverage | % of source lines executed by tests | 75–85% |
| Branch coverage | % of decision branches (if/else, switch arms) tested | 60–75% |
| Function coverage | % of functions called by at least one test | 85–95% |

Line coverage is easy and noisy. Branch coverage is the more honest signal — covering both arms of every if/else is what catches edge-case bugs.
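To make the gap between the two metrics concrete, here is a minimal sketch that totals the per-file counters in an lcov tracefile: LF/LH for lines, BRF/BRH for branches, FNF/FNH for functions. The function name `coverage_summary` is hypothetical. Whether branch and function records appear at all depends on how the report was generated, so the sketch prints `n/a` when a counter is missing.

```shell
# coverage_summary FILE
# Totals the per-file counters found in an lcov tracefile:
#   LF/LH   = lines found / hit
#   BRF/BRH = branches found / hit
#   FNF/FNH = functions found / hit
coverage_summary() {
  awk -F: '
    function pct(hit, found) {
      # Report n/a when the tracefile has no records of this kind
      return found ? sprintf("%.1f%%", 100 * hit / found) : "n/a"
    }
    $1 == "LF"  { lf  += $2 }
    $1 == "LH"  { lh  += $2 }
    $1 == "BRF" { brf += $2 }
    $1 == "BRH" { brh += $2 }
    $1 == "FNF" { fnf += $2 }
    $1 == "FNH" { fnh += $2 }
    END {
      print "lines: "     pct(lh, lf)
      print "branches: "  pct(brh, brf)
      print "functions: " pct(fnh, fnf)
    }
  ' "$1"
}
```

A tracefile in which every line ran but only one of two branch outcomes was taken would report 100% line coverage and 50% branch coverage, which is exactly the blind spot described above.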


Workflow

# 1. Run tests + collect coverage (writes coverage/lcov.info)
flutter test --coverage

# 2. Exclude generated / boilerplate code BEFORE reporting or gating
brew install lcov                                  # macOS
lcov --remove coverage/lcov.info \
  '**/*.g.dart' '**/*.freezed.dart' \
  '**/generated/**' '**/*_provider.dart' \
  -o coverage/lcov_filtered.info

# 3. Generate HTML report from the filtered data
genhtml coverage/lcov_filtered.info -o coverage/html
open coverage/html/index.html                      # macOS

# 4. CI gate (Linux example with bc), also on the filtered data
COVERAGE=$(lcov --summary coverage/lcov_filtered.info 2>&1 \
  | grep "lines" | awk '{print $2}' | sed 's/%//')
echo "Coverage: $COVERAGE%"
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
  echo "Coverage below 80%"
  exit 1
fi
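The gate step above parses the human-readable output of `lcov --summary` and needs `bc`. As an alternative sketch, the same check can read the LF/LH (lines found / lines hit) counters straight from the tracefile with awk. The function name `coverage_gate` and its exit codes are assumptions made for this illustration, not part of lcov or Flutter.

```shell
# coverage_gate FILE THRESHOLD
# Exit 0 when total line coverage in FILE is at least THRESHOLD percent,
# exit 1 when it is below, exit 2 when FILE contains no line data.
coverage_gate() {
  awk -F: -v min="$2" '
    $1 == "LF" { found += $2 }   # lines instrumented, summed over files
    $1 == "LH" { hit   += $2 }   # lines executed, summed over files
    END {
      if (found == 0) { print "No line data found"; exit 2 }
      pct = 100 * hit / found
      printf "Coverage: %.1f%% (threshold %s%%)\n", pct, min
      if (pct < min + 0) exit 1
    }
  ' "$1"
}
```

In CI this could be called as `coverage_gate coverage/lcov_filtered.info 80 || exit 1`, so the threshold is applied to the filtered file rather than the raw one.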

What to test (and what to skip)

| ✅ Worth covering | ❌ Diminishing returns |
|---|---|
| Repositories, services, business logic | Generated code (.g.dart, .freezed.dart) |
| State notifiers / Bloc transitions | DTOs with no logic |
| Form validators, parsers, mappers | Platform channel glue (test the wrapper, not the bridge) |
| Critical user flows (widget + integration) | Pure layout widgets (use golden tests instead) |
| Error handling paths | UI rendering minutiae |
| Edge cases (empty, null, very large, very small) | One-line getters |

A 100% coverage app with no test for "what happens if the API returns 500" is still broken. Coverage doesn't equal correctness.


CI integration patterns

| Practice | Why |
|---|---|
| Run coverage on every PR; fail below threshold | Prevent regressions |
| Track coverage trend (Codecov, Coveralls) | See drift over time |
| Comment PR diffs with coverage delta | Reviewers see impact |
| Distinguish coverage on changed lines vs whole repo | "Patch coverage" prevents new untested code |
| Exclude .g.dart and generated files BEFORE the threshold check | Otherwise coverage looks lower than reality |
| Don't block PRs on a 0.1% drop | Threshold should be principled, not punitive |
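One way to avoid blocking on a 0.1% drop is to compare the PR's coverage against the base branch's figure and fail only when the drop exceeds an agreed tolerance. The sketch below assumes both percentages are already known; the function name `coverage_drop_ok` and the default tolerance of 0.5 points are hypothetical values chosen for illustration.

```shell
# coverage_drop_ok BASE CURRENT [TOLERANCE]
# Succeeds unless CURRENT is more than TOLERANCE percentage points below BASE.
# TOLERANCE defaults to 0.5, an illustrative value and not a recommendation.
coverage_drop_ok() {
  awk -v base="$1" -v cur="$2" -v tol="${3:-0.5}" 'BEGIN {
    drop = base - cur
    printf "Coverage %s%% -> %s%% (allowed drop: %s points)\n", base, cur, tol
    if (drop > tol + 0) exit 1
  }'
}
```

With the default tolerance, `coverage_drop_ok 78 77.6` passes while `coverage_drop_ok 78 76` fails, so routine noise does not block a merge but a real regression does.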

Common mistakes to avoid

❌ Treating 100% coverage as a goal
   Leads to tests-of-tests, brittle setups, low-value assertions.
   ✅ Aim for coverage on code that MATTERS; let coverage drop where it doesn't.

❌ Counting generated code in coverage
   Inflates or deflates artificially. Always exclude.

❌ Writing tests just to hit a number
   verify(mock.foo()).called(1); — doesn't test behavior, just that mock was called.

❌ Ignoring coverage for golden / integration tests
   Some flows are inherently visual / end-to-end. Golden + integration cover what
   line-coverage tests can't.

❌ Coverage gates without team buy-in
   80% rule lands → team starts gaming → quality regresses.
   ✅ Agree on what to test; coverage is the artifact, not the contract.

❌ Forgetting to test error paths
   if (response.statusCode != 200) throw — never tested, fails in production.
   Always cover BOTH branches of important conditions.

Interview follow-ups

  1. What's the difference between line coverage and branch coverage? Line coverage just asks "did this line execute?" Branch coverage asks "did both arms of this if, every case in this switch, and each outcome of every conditional expression get exercised?" Branch is more honest: a single test can execute every line of an if with no else (or a one-line ternary) while exercising only one of its two outcomes.

  2. Why exclude generated code from coverage? It's not your code — testing it tests the generator, not your logic. Including it inflates or deflates the metric arbitrarily. Always filter with lcov --remove (or your CI tool's exclude config).

  3. Should you require 100% coverage? Almost never. 100% pushes teams to write meaningless tests (mocks returning mocks, getters being "tested"). 80–85% is enough to catch most regressions without rewarding noise. The exception is critical libraries (auth, payments) where 95%+ is justified — but with high-quality tests, not coverage padding.

  4. How do you measure coverage on changed lines only (patch coverage)? Tools like Codecov / Coveralls take the lines the PR adds or modifies relative to the base branch, look them up in the coverage report uploaded for the PR's head commit, and report the share of those changed lines that tests executed. This is often more useful than absolute coverage: "new code is 92% covered" matters more than "repo dropped from 78% to 77.6%."
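The idea behind patch coverage can be sketched locally with awk. This is a rough approximation and not how Codecov or Coveralls implement it. It assumes a zero-context unified diff on stdin (for example from `git diff -U0`), and that the SF: paths in the tracefile are repo-relative so they match the paths in the diff. The function name `patch_coverage` is hypothetical.

```shell
# patch_coverage LCOV_FILE < zero-context unified diff
# Reports how many added lines that lcov instruments were executed.
patch_coverage() {
  awk '
    # Number of lines in a hunk range such as "-3,0" or "+12" (default 1)
    function span(range,   p) { return (split(range, p, ",") > 1) ? p[2] : 1 }

    # Pass 1 (diff): skip hunk bodies, remember added line numbers per file
    mode == "diff" && pending > 0 && /^[-+]/ { pending--; next }
    mode == "diff" && /^\+\+\+ / { file = $2; sub(/^b\//, "", file); next }
    mode == "diff" && /^@@ / {
      pending = span($2) + span($3)
      split($3, r, ","); start = substr(r[1], 2)
      for (i = 0; i < span($3); i++) changed[file ":" (start + i)] = 1
      next
    }

    # Pass 2 (lcov): count changed lines that are instrumented, and hit
    mode == "lcov" && /^SF:/ { file = substr($0, 4); next }
    mode == "lcov" && /^DA:/ {
      split(substr($0, 4), d, ",")
      if ((file ":" d[1]) in changed) { total++; if (d[2] > 0) hit++ }
    }

    END {
      if (total == 0) { print "patch coverage: n/a (no instrumented lines changed)"; exit }
      printf "patch coverage: %.1f%% (%d of %d changed lines)\n", 100 * hit / total, hit, total
    }
  ' mode=diff - mode=lcov "$1"
}
```

A possible invocation, assuming the base branch is `origin/main`: `git diff -U0 origin/main...HEAD | patch_coverage coverage/lcov_filtered.info`. Changed lines that lcov does not instrument (comments, blank lines) are ignored rather than counted as uncovered.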

