Understanding Flaky Tests in RSpec
What Do Flaky Specs Look Like?
- Passing locally but failing on CI (or vice versa)
- Tests that fail or pass depending on run order
- State-dependent failures that disappear after retries
- Mocks/stubs that unexpectedly leak between tests
Why It Matters
Flaky tests introduce friction in test-driven workflows, slow down releases, and produce spurious failures that reduce trust in automation. In regulated or mission-critical systems, this can delay certification or compliance sign-off.
Root Causes of RSpec Flakiness
1. Shared Mutable State Between Examples
Using global or class-level variables modified in tests can cause state bleed between examples, especially with improper cleanup.
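The mechanism does not depend on RSpec itself. The following plain-Ruby sketch uses a hypothetical FeatureFlags class with class-level state and two stand-in "examples" (lambdas, not real RSpec examples) to show how the outcome changes with run order, and how resetting the state before each example removes that dependence.

```ruby
# Hypothetical class that keeps state at the class level.
class FeatureFlags
  @enabled = []

  class << self
    def enable(flag)
      @enabled << flag
    end

    def enabled?(flag)
      @enabled.include?(flag)
    end

    # Cleanup hook; in a spec this would be called from a before(:each) hook.
    def reset!
      @enabled = []
    end
  end
end

# Stand-ins for two examples. Each returns true when its expectation holds.
EXAMPLES = {
  enables_beta: lambda do
    FeatureFlags.enable(:beta)
    FeatureFlags.enabled?(:beta)
  end,
  beta_off_by_default: -> { !FeatureFlags.enabled?(:beta) }
}.freeze

# Runs the named examples in the given order and returns { name => passed? }.
def run_examples(order, reset_between:)
  FeatureFlags.reset! # start every run from a known state
  order.to_h do |name|
    FeatureFlags.reset! if reset_between
    [name, EXAMPLES.fetch(name).call]
  end
end

leaky_forward  = run_examples(%i[enables_beta beta_off_by_default], reset_between: false)
leaky_reversed = run_examples(%i[beta_off_by_default enables_beta], reset_between: false)
clean_forward  = run_examples(%i[enables_beta beta_off_by_default], reset_between: true)

puts "leaky, forward:  #{leaky_forward}"
puts "leaky, reversed: #{leaky_reversed}"
puts "clean, forward:  #{clean_forward}"
```

Without the reset, the second stand-in example passes or fails purely depending on whether the first one ran before it, which is the same symptom a randomized RSpec run exposes.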
2. Unverified or Overlapping Test Doubles
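RSpec allows flexible mocking and stubbing, but without verification (e.g., using instance_double), a double can stub methods that have been renamed or removed from the real class. The spec keeps passing against an outdated interface (a false positive) while the real code fails at runtime.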
3. Improper Use of before(:all)
before(:all) hooks run once per example group, outside the per-example database transaction and mock lifecycle. Records and objects they create are not reset between examples, so a mutation made by one test remains visible to the next.
4. Non-deterministic External Dependencies
Specs relying on time, randomization, or asynchronous services (like jobs or APIs) without proper control produce unpredictable results.
5. Database Race Conditions in Parallel Runs
Parallel RSpec execution (e.g., via Knapsack, ParallelTests) can surface uncommitted transactions, deadlocks, or dirty reads if database setup isn't isolated.
Diagnostics and Isolation Techniques
1. Run Specs With Random Order
rspec --order random --seed 12345
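If specs fail only under certain seeds, order-dependent shared state is the likely culprit. Rerun with the seed printed in the failing output to reproduce the exact order, and use rspec --bisect to narrow the run down to the minimal set of examples that triggers the failure.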
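To make random ordering the default rather than a flag someone has to remember, the ordering can also be set in the RSpec configuration. The fragment below is a sketch of what typically goes in spec/spec_helper.rb (the file location is an assumption about the project layout). It is a configuration fragment loaded by RSpec, not a standalone script.

```ruby
# spec/spec_helper.rb (assumed location)
RSpec.configure do |config|
  # Run examples and groups in random order; the seed is printed with each run.
  config.order = :random

  # Seed Ruby's global RNG from the RSpec seed so that rand and shuffle calls
  # made inside specs repeat when the run is repeated with the same --seed.
  Kernel.srand config.seed
end
```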
2. Enable Full Backtrace and Fail Fast
rspec --backtrace --fail-fast
--fail-fast stops the run at the first failure, and --backtrace prints the full, unfiltered backtrace for debugging.
3. Use Verified Doubles
Replace double with instance_double, class_double, or object_double to catch stubs and calls for methods the real class does not define. The check applies only when the referenced class is loaded.
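A short spec can make the difference concrete. The sketch below assumes the rspec gem is installed and uses a hypothetical PaymentGateway class that defines only a charge method; it is an illustration, not a recommended test layout.

```ruby
require "rspec/autorun" # lets the file run with plain `ruby`; needs the rspec gem

# Hypothetical collaborator; only #charge exists.
class PaymentGateway
  def charge(amount_cents)
    raise NotImplementedError, "real network call"
  end
end

RSpec.describe "verified doubles" do
  it "allows stubbing a method the class defines" do
    gateway = instance_double(PaymentGateway, charge: :ok)
    expect(gateway.charge(500)).to eq(:ok)
  end

  it "rejects a stub for a method the class does not define" do
    # A plain double("gateway", refund: :ok) would accept this silently.
    expect {
      instance_double(PaymentGateway, refund: :ok)
    }.to raise_error(RSpec::Mocks::MockExpectationError)
  end
end
```

If PaymentGateway#charge were later renamed, the first example would start failing, which is the early warning a plain double cannot give.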
4. Profile CI Failures by Example
Track failing specs over time to identify patterns tied to parallel runs, timeouts, or specific environments. Use tags and custom formatters to log consistently.
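One low-tech way to track failures over time is to keep the report from each CI run and count failures per example. The sketch below assumes reports produced with RSpec's JSON formatter (rspec --format json --out <file>) and reads only the examples, id, and status keys. The two inline reports are made-up sample data standing in for files collected from CI.

```ruby
require "json"

# Made-up sample data: two trimmed reports in the shape of RSpec's JSON output.
REPORTS = [
  '{"examples":[{"id":"./spec/a_spec.rb[1:1]","status":"passed"},
                {"id":"./spec/b_spec.rb[1:2]","status":"failed"}]}',
  '{"examples":[{"id":"./spec/a_spec.rb[1:1]","status":"failed"},
                {"id":"./spec/b_spec.rb[1:2]","status":"failed"}]}'
].freeze

# Returns { example_id => failure_count }, most frequent failures first.
def failure_counts(json_reports)
  counts = Hash.new(0)
  json_reports.each do |raw|
    JSON.parse(raw).fetch("examples").each do |example|
      counts[example.fetch("id")] += 1 if example.fetch("status") == "failed"
    end
  end
  counts.sort_by { |_id, count| -count }.to_h
end

counts = failure_counts(REPORTS)
counts.each { |id, count| puts "#{count}x #{id}" }
```

An example that fails in most runs is probably broken rather than flaky; the interesting candidates are the ones that fail in only some runs of the same commit.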
Step-by-Step Fixes
1. Avoid before(:all) for Mutable Data
Use before(:each) to ensure every test runs in a clean state. Avoid modifying shared constants or class variables across examples.
2. Stub External Services Using Verified Contracts
Use tools like WebMock or VCR to isolate and verify external service calls. Ensure mocked responses align with real-world schema or API changes.
3. Isolate Database Records with Transaction Rollback
config.use_transactional_fixtures = true
With this rspec-rails setting, each example runs inside its own database transaction, which is rolled back after the example finishes, preventing cross-test contamination.
4. Use FactoryBot's build Over create When Possible
Avoid unnecessary database writes in unit tests. Using build or build_stubbed reduces execution time and minimizes side effects.
5. Set Up Deterministic Time and Randomness
Timecop.freeze(Time.utc(2023, 1, 1)) do
  # test logic
end
Also set fixed seeds for randomness (srand) and mock any UUID or random generators used in business logic.
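Timecop is a separate gem, so the sketch below sticks to the standard library and shows the same idea from the other side: a hypothetical Coupon class accepts its clock, random number generator, and ID generator as arguments (all names here are illustrative), so a spec can pass fixed values instead of patching globals.

```ruby
require "securerandom"

# Hypothetical class under test. Every source of nondeterminism is injected,
# with production defaults that use the real clock and real randomness.
class Coupon
  attr_reader :id, :discount_percent, :expires_at

  def initialize(clock: -> { Time.now.utc },
                 rng: Random.new,
                 id_generator: -> { SecureRandom.uuid })
    @id = id_generator.call
    @discount_percent = rng.rand(5..25)
    @expires_at = clock.call + (7 * 24 * 60 * 60) # seven days, in seconds
  end
end

# In a spec, pin all three inputs.
fixed_clock = -> { Time.utc(2023, 1, 1) }
build = lambda do
  Coupon.new(clock: fixed_clock,
             rng: Random.new(12345),
             id_generator: -> { "00000000-0000-4000-8000-000000000001" })
end

first  = build.call
second = build.call

# The global generator is reproducible too once it is seeded.
srand(12345)
a = rand(100)
srand(12345)
b = rand(100)

puts first.discount_percent == second.discount_percent
puts first.expires_at
puts a == b
```

Injection keeps the production defaults intact while giving specs full control; freezing time with Timecop remains the simpler option when the code under test calls Time.now directly and cannot easily be changed.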
Best Practices
- Prefer instance_double over plain doubles, and use allow_any_instance_of cautiously
- Split slow or flaky specs into separate CI groups
- Always clean up global state (ENV vars, files, cache)
- Run specs in random order during development, and rerun with the reported seed to reproduce a failure
- Ensure CI and local environments match in DB config and Ruby version
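For the global-state item in the list above, one common pattern is a helper that changes ENV only for the duration of a block and restores the previous values even if the block raises. The name with_env is a hypothetical helper, not part of RSpec, and PAYMENTS_MODE is a made-up variable.

```ruby
# Temporarily sets environment variables for the duration of the block.
# Previous values are restored afterwards, even if the block raises.
def with_env(overrides)
  previous = overrides.keys.to_h { |key| [key, ENV[key]] }
  overrides.each { |key, value| ENV[key] = value }
  yield
ensure
  # Assigning nil deletes the variable, which restores "was not set".
  previous&.each { |key, value| ENV[key] = value }
end

ENV.delete("PAYMENTS_MODE") # start from a known state for this demo

inside = with_env("PAYMENTS_MODE" => "sandbox") { ENV["PAYMENTS_MODE"] }
after  = ENV["PAYMENTS_MODE"]

puts "inside block: #{inside.inspect}, after block: #{after.inspect}"
```

In a spec, the helper would typically be called from an around hook that wraps example.run, so the restore step runs regardless of whether the example passes.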
Conclusion
Flaky specs in RSpec often stem from uncontrolled state, unreliable test doubles, and misused setup logic. Left unchecked, they erode trust in automated testing and delay releases. By enforcing strict isolation, using verified mocks, and ensuring deterministic behavior in tests, teams can achieve stable, repeatable test suites. In large Ruby codebases, reliability in RSpec is not just about coverage; it is about confidence in every build.
FAQs
1. Why do my RSpec tests pass locally but fail on CI?
CI runs may use different environments, databases, or run specs in parallel, exposing shared state or timing issues not seen locally.
2. Should I use let! or before for test data?
Use let for data that should be built lazily on first reference, and let! when the data must exist before the example body runs. Avoid before(:all) for mutable data, as it risks leakage between tests.
3. How can I verify my mocks are valid in RSpec?
Use verified doubles such as instance_double, which check that stubbed methods exist on the target class or instance when that class is loaded.
4. What's the best way to handle external API calls in specs?
Use VCR or WebMock to stub HTTP responses. Match expected request structures and avoid live API calls in unit tests.
5. Can I parallelize RSpec safely?
Yes, with tools like ParallelTests or Knapsack, but ensure your DB and test data are isolated (via transactions or test containers) to prevent race conditions.
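ParallelTests exposes the worker number to each process through the TEST_ENV_NUMBER environment variable (by default an empty string for the first process, then "2", "3", and so on), and the usual isolation approach is to derive a separate database name from it. The helper below is a hypothetical sketch of that derivation; app_test is a placeholder base name.

```ruby
# Builds a per-process database name from ParallelTests' TEST_ENV_NUMBER.
# `env` is injectable so the function can be exercised without touching ENV.
def test_database_name(base = "app_test", env = ENV)
  "#{base}#{env.fetch('TEST_ENV_NUMBER', '')}"
end

serial = test_database_name("app_test", {})
first  = test_database_name("app_test", { "TEST_ENV_NUMBER" => "" })
third  = test_database_name("app_test", { "TEST_ENV_NUMBER" => "3" })

puts [serial, first, third].inspect
```

In a Rails project the same expression usually lives in config/database.yml as an ERB snippet appended to the test database name, so each worker process connects to its own database.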