CI/CD & Test Integration

A mock stack that behaves perfectly on a developer laptop is worth little if it flakes, hangs, or never starts once it reaches continuous integration. When a pipeline races a cold mock server, clashes on a port, or lets one shard’s writes bleed into the next, a whole team loses hours chasing failures that have nothing to do with the code under test.

This section covers the discipline of making mocks reliable in automation. Frontend and full-stack engineers who own the tests, QA engineers who triage the red builds, and platform teams who own the runners all feel the cost when the mock layer is the flakiest part of the pipeline. The pages under this section turn a locally-working mock into a dependable CI dependency: started deterministically, gated on health, isolated per shard, and torn down without residue.

How a mock stack moves through a pipeline

A mock in CI is not a single thing that runs once. It is started, probed, consumed by many parallel workers, sometimes promoted into a preview environment, and finally destroyed. The diagram below traces that lifecycle from a developer’s push through to teardown, and it is the mental model the rest of this section builds on.

Figure: Mock stack lifecycle across a CI pipeline. A developer laptop pushes to git, which triggers a CI pipeline. The pipeline spins up a mock stack (MSW or WireMock in a service container), gates on a /health check before tests, fans the suite out into parallel test shards, optionally promotes an ephemeral per-PR preview environment, and finishes with a teardown step (compose down, guarded by if: always).

Each stage in that diagram maps to a failure class if it is skipped. Skip the health gate and tests race the mock; skip per-shard isolation and state bleeds; skip guarded teardown and containers leak into the next build. The four core concepts below address these stages in order.

Core concept 1 — Starting a mock server as a pipeline step

The single most common CI mock failure is a test suite that connects before the mock is listening. The fix is to treat startup as an explicit, observable step: bring the container up, then block on a health probe until the mock answers. A fixed sleep is the wrong tool — cold runners vary by seconds — so poll instead.

The mock is configured entirely through environment variables, so the same image behaves identically in every job. MOCK_SEED pins the seed that drives generated data. PORT is the port the mock listens on inside the container, and MOCK_PORT selects the host port it is published on, which lets parallel jobs on one host avoid clashing.

# docker-compose.ci.yml
services:
  mock-api:
    image: ghcr.io/acme/mock-api:latest
    environment:
      MOCK_SEED: "${MOCK_SEED:-42}"
      PORT: "8080"  # fixed container port; MOCK_PORT only changes the host side of the mapping
    ports:
      - "${MOCK_PORT:-8080}:8080"
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/health"]
      interval: 5s
      timeout: 3s
      retries: 5
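
To illustrate why pinning MOCK_SEED makes generated data repeatable, here is a minimal sketch of a seeded generator. It assumes the mock derives its data from a small pseudo-random generator; mulberry32 is used purely for illustration, since the generator inside the mock image is not specified here, and `seededRandom` and `fakeUserId` are hypothetical names.

```typescript
// Illustrative only: the real mock image may use a different generator.
// mulberry32 is a small PRNG; the same seed always yields the same sequence.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical helper: derive a fake id from the next value of the generator.
function fakeUserId(next: () => number): string {
  return `user-${Math.floor(next() * 1000000)}`;
}
```

Two jobs that both start from seed 42 would produce the same ids in the same order, which is what lets assertions on generated data stay stable across runners.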

The health gate is a small, portable bash loop with a hard timeout so a stuck container fails the build fast instead of hanging until the job’s global limit:

#!/usr/bin/env bash
# scripts/wait-for-mock.sh — poll /health, fail after 60s
set -euo pipefail
PORT="${MOCK_PORT:-8080}"
DEADLINE=$(( $(date +%s) + 60 ))

echo "Waiting for mock on :${PORT} ..."
until curl -fsS --max-time 2 "http://localhost:${PORT}/health" > /dev/null 2>&1; do
  if [ "$(date +%s)" -ge "$DEADLINE" ]; then
    echo "Mock never became healthy — dumping logs:" >&2
    docker compose -f docker-compose.ci.yml logs mock-api >&2
    exit 1
  fi
  sleep 1
done
echo "Mock is healthy."

This is the same start-then-gate discipline described in mock lifecycle management, applied to the pipeline rather than a developer’s shell. The Running Mock Servers in CI Pipelines guide expands this into full GitHub Actions and GitLab jobs.
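
When the gate has to live inside the test runner rather than in a shell step, the same poll-with-deadline loop can be sketched in TypeScript. This is a hedged sketch, not part of the original scripts: `waitUntilHealthy`, `Probe`, and `WaitOptions` are hypothetical names, and the probe is injected so the loop itself does not depend on any HTTP client.

```typescript
// A generic poll-until-ready helper with a hard deadline.
// The probe is injected so the loop can be exercised without a real server.
type Probe = () => Promise<boolean>;

interface WaitOptions {
  timeoutMs: number;  // hard limit before giving up
  intervalMs: number; // pause between attempts
}

async function waitUntilHealthy(probe: Probe, opts: WaitOptions): Promise<number> {
  const deadline = Date.now() + opts.timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    // A rejected probe (for example, connection refused) counts as "not ready yet".
    const ok = await probe().catch(() => false);
    if (ok) return attempts;
    if (Date.now() >= deadline) {
      throw new Error(`mock not healthy after ${opts.timeoutMs}ms (${attempts} attempts)`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

A Vitest globalSetup file could, for instance, pass a probe such as `() => fetch(healthUrl).then((r) => r.ok)` with a 60-second timeout, mirroring the bash loop's behaviour of advancing the moment the mock answers and failing fast otherwise.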

Core concept 2 — Choosing an execution model

There is no single right way to run a mock in CI. The choice hinges on who needs to reach it and how much isolation you are willing to pay for. The table below compares the three models this section covers.

| Dimension | In-process MSW | Containerised WireMock | CI service container |
| --- | --- | --- | --- |
| Startup cost | Milliseconds (same process as tests) | 2–4s (JVM boot) | 2–5s (image pull + boot) |
| Isolation | Per worker, in-memory | One shared endpoint | One shared endpoint per job |
| Cross-language reach | JS/TS runtimes only | Any HTTP client | Any HTTP client |
| Resource footprint | Lowest (no extra process) | Moderate (JVM heap) | Moderate (managed by runner) |
| State reset | server.resetHandlers() | POST /__admin/*/reset | Restart or admin reset |
| Best when | Unit/component tests in Node | Polyglot services, E2E over network | You want the runner to own lifecycle |

In-process MSW is the fastest and simplest when only JavaScript consumers touch the mock, because the handlers live in the same process as the tests and reset instantly. A WireMock container earns its startup cost the moment a second language, a browser-driven E2E tool, or another service must reach the same endpoint over the network. A CI service container is the middle path: the runner manages the mock’s lifecycle for you, which suits pipelines that already lean on dockerized mock environments. For a deeper head-to-head, see MSW vs WireMock for CI Pipelines.

Core concept 3 — The integration surface across providers

Once the mock starts reliably, the next surface is the CI provider itself: how you declare the mock, fan tests out in parallel, and cache the fixtures that feed it. The two dominant providers express these ideas differently.

GitHub Actions models a network-reachable mock as a services: block, and parallelism as a matrix:

# .github/workflows/test.yml
name: test
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    services:
      mock-api:
        image: ghcr.io/acme/mock-api:latest
        env:
          MOCK_SEED: "42"
          PORT: "8080"
        ports:
          - 8080:8080
        options: >-
          --health-cmd "wget --spider -q http://localhost:8080/health"
          --health-interval 5s --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: "npm"
      - run: npm ci
      - name: Run shard ${{ matrix.shard }}
        run: npx vitest run --shard=${{ matrix.shard }}/4
        env:
          MOCK_BASE_URL: http://localhost:8080

GitLab CI uses services: at the job level and parallel: for shards, with variables carrying the same configuration:

# .gitlab-ci.yml
test:
  image: node:20
  parallel: 4
  services:
    - name: registry.example.com/mock-api:latest
      alias: mock-api
  variables:
    MOCK_SEED: "42"
    MOCK_BASE_URL: "http://mock-api:8080"
  script:
    - npm ci
    - npx vitest run --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

Fixture generation is often the slowest part of a cold job, so cache it keyed on the specification hash — a technique covered end to end in Caching Generated Mock Fixtures in CI. The fixtures themselves come from schema-driven data generation, so a cache keyed on the spec invalidates exactly when the contract changes.
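
Keying the cache on the specification can be sketched as a small hashing helper. This is a minimal sketch under stated assumptions: `fixtureCacheKey` and the `generatorVersion` label are hypothetical, and the key layout is only one reasonable choice.

```typescript
import { createHash } from "node:crypto";

// Build a cache key that changes only when the spec or the generator changes.
// `generatorVersion` is a hypothetical label for the fixture generator release.
function fixtureCacheKey(specContents: string, generatorVersion: string): string {
  const digest = createHash("sha256").update(specContents, "utf8").digest("hex");
  // A 16-character prefix keeps the key short while still being collision-resistant in practice.
  return `fixtures-${generatorVersion}-${digest.slice(0, 16)}`;
}
```

Including the generator version in the key is a design choice: a generator upgrade can change fixture output even when the spec is untouched. On GitHub Actions the built-in `hashFiles()` expression serves the same purpose directly in the cache key.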

Core concept 4 — Operational concerns: isolation, reset, and teardown

Parallelism is where correctness quietly breaks. Four shards hitting one stateful mock will observe each other’s writes unless you draw a boundary. There are two boundaries that work: give each shard its own instance, or reset the shared instance at a known point.
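
The first boundary, one instance per shard, mostly comes down to giving every shard its own port when shards share a host. A minimal sketch, assuming a contiguous port range starting at 8080; `shardPort`, `shardBaseUrl`, and `BASE_PORT` are hypothetical names, and the resulting port would be exported as MOCK_PORT for that shard.

```typescript
// Map a 1-based shard index to its own host port so shards never share a mock.
// BASE_PORT and the contiguous-range scheme are assumptions for illustration.
const BASE_PORT = 8080;

function shardPort(shardIndex: number, totalShards: number, basePort: number = BASE_PORT): number {
  if (!Number.isInteger(shardIndex) || shardIndex < 1 || shardIndex > totalShards) {
    throw new RangeError(`shard ${shardIndex} is outside 1..${totalShards}`);
  }
  const port = basePort + (shardIndex - 1);
  if (port > 65535) throw new RangeError(`port ${port} exceeds 65535`);
  return port;
}

// Base URL a shard's tests would use to reach its own mock instance.
function shardBaseUrl(shardIndex: number, totalShards: number): string {
  return `http://localhost:${shardPort(shardIndex, totalShards)}`;
}
```

When each shard already runs on its own runner with its own service container, this mapping is unnecessary; it matters only when several shards run side by side on one machine.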

For a shared WireMock, reset scenarios and dynamically-added mappings between shards:

# Between shards, or in a global afterEach
curl -fsS -X POST "${MOCK_BASE_URL}/__admin/scenarios/reset"
curl -fsS -X POST "${MOCK_BASE_URL}/__admin/mappings/reset"
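
The same two calls can be issued from a test hook instead of a shell step. The following is a sketch, not a WireMock client: `resetWireMock` and `FetchLike` are hypothetical names, and the fetch function is passed in so the helper can be exercised without a live server. The global `fetch` in Node 18+ should satisfy the `FetchLike` shape.

```typescript
// Minimal shape of the fetch function this helper needs.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; status: number }>;

// Reset scenarios first, then mappings, failing loudly on any non-2xx answer.
async function resetWireMock(baseUrl: string, fetchImpl: FetchLike): Promise<void> {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  for (const path of ["/__admin/scenarios/reset", "/__admin/mappings/reset"]) {
    const res = await fetchImpl(`${root}${path}`, { method: "POST" });
    if (!res.ok) {
      throw new Error(`WireMock reset failed: POST ${path} -> ${res.status}`);
    }
  }
}
```

Throwing on a failed reset is deliberate: a reset that silently fails leaves stale state behind and shows up later as an unrelated-looking test failure.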

For in-process MSW, reset handlers so runtime overrides do not persist:

// vitest.setup.ts
import { afterEach, afterAll, beforeAll } from 'vitest';
import { setupServer } from 'msw/node';
import { handlers } from './mocks/handlers';

export const server = setupServer(...handlers);

beforeAll(() => server.listen({ onUnhandledRequest: 'error' }));
afterEach(() => server.resetHandlers());   // discard per-test overrides
afterAll(() => server.close());

The full menu of reset strategies — scenario reset, handler reset, and truncate-and-reseed for a database-backed mock — is the subject of Resetting Mock State Between Test Runs.

Teardown must be unconditional. A test failure must still stop the mock, or the next build inherits a leaked container on a held port:

      - name: Tear down mock stack
        if: always()
        run: docker compose -f docker-compose.ci.yml down --remove-orphans

When each pull request gets its own disposable stack, that same teardown logic runs on PR close instead of at the end of a job — the pattern behind Ephemeral Preview Environments.

Decision guide — matching a strategy to your pipeline

Use this matrix to pick a starting point for a given pipeline shape.

| Situation | Recommended model | Why |
| --- | --- | --- |
| Node component/unit tests, single language | In-process MSW, reset per test | Zero startup cost, instant reset |
| Browser E2E (Playwright/Cypress) needs a network endpoint | Containerised mock behind a health gate | E2E tools cannot see in-process handlers |
| Polyglot services share one mock | WireMock service container | One endpoint, any HTTP client |
| Suite is slow, runner minutes are scarce | Parallel shards + cached fixtures | Wall-clock time drops with runner count |
| Reviewers need a live URL per PR | Ephemeral per-PR stack | Disposable, namespaced, auto-torn-down |
| Contract must gate the merge | Add a contract check job | Fails the PR before drift ships |

When a pipeline spans several rows — say, polyglot E2E that must also be fast — layer the choices: a WireMock service container for reach, sharded jobs for speed, and a health gate in front of both.

Common failure modes and mitigations

1. Port clash — bind: address already in use. A previous job or a runner-resident service holds the port. Mitigation: parameterise the port with MOCK_PORT and let each parallel job pick a distinct one, or drop the fixed host-port mapping and read the assigned port back from docker compose port.

2. Race before ready — tests connect while the mock is still booting. The suite starts the instant the container is created, not when it is listening. Mitigation: block on the wait-for-mock.sh health loop from Core concept 1; never substitute a fixed sleep.

3. Worker not intercepting — requests hit the real network. An MSW server that fails to start silently lets calls fall through. Mitigation: set onUnhandledRequest: 'error' so an unmatched request fails the test loudly, as detailed in advanced MSW handler patterns.

4. State bleed across shards — a test passes alone but fails in parallel. Shards mutate a shared mock. Mitigation: reset scenarios/handlers at a shard or test boundary, or give each shard its own instance on its own port.

5. Orphaned containers — the next build starts dirty. Teardown was skipped because tests failed. Mitigation: guard teardown with if: always() and scope stacks with a unique compose project name so leftovers can be swept by name.

6. Cache serves stale fixtures after a spec change. The fixture cache key ignored the specification. Mitigation: key the cache on a hash of the OpenAPI file so it invalidates precisely when the contract moves, and validate restored fixtures against the spec before trusting them.
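
The unique compose project name mentioned under failure mode 5 can be derived from values the CI provider already exposes. A minimal sketch, assuming a run identifier and shard number are available; `composeProjectName` is a hypothetical helper. Compose project names may contain only lowercase letters, digits, dashes, and underscores, and must begin with a letter or digit, so the input is normalised.

```typescript
// Build a Compose project name that is unique per CI run and shard.
// Characters Compose does not allow are collapsed into single dashes.
function composeProjectName(prefix: string, runId: string, shard: number): string {
  const raw = `${prefix}-${runId}-s${shard}`.toLowerCase();
  const cleaned = raw
    .replace(/[^a-z0-9_-]+/g, "-") // replace disallowed runs with a dash
    .replace(/^[^a-z0-9]+/, "");   // name must start with a letter or digit
  if (cleaned.length === 0) throw new Error("project name is empty after normalisation");
  return cleaned;
}
```

The result would be passed to Compose through `-p` or the COMPOSE_PROJECT_NAME environment variable, so a later sweep job can find leftovers by their shared prefix.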

FAQ

Why do mocks that work locally flake or fail to start in CI?

Locally the mock is usually already warm and the port is free, so tests connect on the first try. In CI the runner is cold: the test suite races the container’s startup, the port may clash with a leftover service from a previous job, and there is no interactive shell to notice a silent bind failure. Adding an explicit health gate before tests run removes the race, and parameterising the port removes the clash. The remaining flake almost always traces back to shared state, which per-shard isolation or a reset step resolves.

Should the mock run in-process with the tests or in its own container?

Run it in-process with MSW when only JavaScript or TypeScript consumers hit it and startup speed matters most: the handlers share the test process and reset almost instantly. Run it in a container such as WireMock when other languages, other services, or browser-based E2E tools must reach the same mock over the network. The container costs a few seconds of startup but gives every consumer one shared, addressable endpoint. Many teams run both: MSW for component tests and a container for cross-service E2E.

How do I stop state bleeding between parallel test shards?

Give each shard its own mock instance, or reset the shared mock’s state at a known boundary. WireMock exposes POST /__admin/scenarios/reset and /__admin/mappings/reset; MSW offers server.resetHandlers(); a database-backed mock needs a truncate-and-reseed step. Reset in an afterEach or between shard jobs so no test inherits another’s mutations. If isolation is cheaper than coordination for your suite, prefer a dedicated instance per shard on a distinct port.

How do I make sure the mock is ready before tests connect?

Expose a GET /health endpoint on the mock and poll it in a bash loop with a hard timeout before the test step. Never rely on a fixed sleep: cold runners vary by seconds, so a sleep is either too short and races, or wastefully long. A wait-for-health loop advances the instant the mock answers and fails fast with logs if it never does, which turns a mysterious hang into an actionable error.

How should teardown be structured so containers never leak?

Put teardown in a step guarded by if: always() in GitHub Actions, or an after_script block in GitLab, so it runs even when tests fail. Use docker compose down --remove-orphans, and scope ephemeral stacks with a unique project name so a crashed job’s containers can be found and swept by name later. Unconditional teardown is what keeps one red build from poisoning the next.

