Faker & Fixture Seeding

This page covers building seeded, reusable mock fixtures with @faker-js/faker and typed factory functions: how to define an entity once, build one or many instances, apply traits and overrides, and compose entities into relational graphs whose foreign keys stay stable across every run.

A fixture factory answers a narrower question than the broader data layer around it. It is not a schema validator and not a randomisation policy. Where schema-driven data generation derives payloads from an OpenAPI contract and deterministic seed management anchors the pseudo-random sequence, a factory is the hand-written layer in between: readable TypeScript that produces one well-shaped object, or a hundred of them, with the exact overrides a given test needs.

Prerequisites

Node.js 18 or later, locally and in CI
@faker-js/faker v9.0 or later (npm install --save-dev @faker-js/faker); the examples call faker.internet.username(), which replaced userName() in v9
TypeScript 5.x with strict: true — the factory helper below relies on generics
A MOCK_SEED slot in .env.test and your CI environment; the mechanics live in deterministic seed management
MSW v2 installed if you will serve fixtures at runtime — see MSW handler registration
Optional: a WireMock standalone instance if you need static __files
Optional: tsx as a dev dependency (npm install --save-dev tsx) if you will run the fixture generation script in Phase 3

Phase 1 — A typed fixture factory

The core building block is one generic helper that turns a “how do I build one of these?” function into a full API: build for a single instance, buildList for many, and define to register named traits. Writing this once means every entity type in the project shares the same ergonomics, and every call site reads the same way.

// src/mocks/factory.ts
import { faker } from './seeded-faker';

/**
 * A builder receives the current sequence index and the seeded faker,
 * and returns a fully-formed entity. Overrides are shallow-merged on top.
 */
export type Builder<T> = (ctx: { index: number; faker: typeof faker }) => T;

export interface Factory<T> {
  build(overrides?: Partial<T>): T;
  buildList(count: number, overrides?: Partial<T>): T[];
  /** Register a named trait: a partial patch applied when requested. */
  define(trait: string, patch: Partial<T> | Builder<Partial<T>>): Factory<T>;
  /** Build with one or more traits applied in order, then overrides. */
  buildAs(traits: string[], overrides?: Partial<T>): T;
}

export function defineFactory<T>(builder: Builder<T>): Factory<T> {
  let sequence = 0;
  const traits = new Map<string, Partial<T> | Builder<Partial<T>>>();

  const resolveTrait = (name: string, index: number): Partial<T> => {
    const trait = traits.get(name);
    if (!trait) throw new Error(`Unknown trait "${name}"`);
    return typeof trait === 'function' ? trait({ index, faker }) : trait;
  };

  const factory: Factory<T> = {
    build(overrides = {}) {
      const index = sequence++;
      return { ...builder({ index, faker }), ...overrides };
    },
    buildList(count, overrides = {}) {
      return Array.from({ length: count }, () => factory.build(overrides));
    },
    define(trait, patch) {
      traits.set(trait, patch);
      return factory;
    },
    buildAs(traitNames, overrides = {}) {
      const index = sequence++;
      let entity = builder({ index, faker });
      for (const name of traitNames) {
        entity = { ...entity, ...resolveTrait(name, index) };
      }
      return { ...entity, ...overrides };
    },
  };

  return factory;
}

The seeded-faker import is deliberate: every factory shares one seeded instance so that consuming the sequence is reproducible. That module is the same singleton described in deterministic seed management; the factory layer sits directly on top of it.

Now a concrete entity. A userFactory defines the base shape once and registers two traits — an admin role and a deactivated status — so tests can request meaningful variants without repeating field literals:

// src/mocks/factories/user.ts
import { defineFactory } from '../factory';

export interface User {
  id: string;
  handle: string;
  email: string;
  role: 'admin' | 'editor' | 'viewer';
  status: 'active' | 'deactivated';
  createdAt: string;
}

export const userFactory = defineFactory<User>(({ index, faker }) => ({
  id: `user_${(index + 1).toString().padStart(4, '0')}`,
  handle: faker.internet.username().toLowerCase(),
  email: faker.internet.email(),
  role: 'viewer',
  status: 'active',
  createdAt: faker.date.past({ years: 2 }).toISOString(),
}))
  .define('admin', { role: 'admin' })
  .define('deactivated', { status: 'deactivated' });

Four call patterns cover almost every test need:

import { userFactory } from './mocks/factories/user';

const one = userFactory.build();                       // a single viewer
const admin = userFactory.buildAs(['admin']);          // trait applied
const team = userFactory.buildList(5, { role: 'editor' }); // five editors
const named = userFactory.build({ handle: 'ada' });    // explicit override

Note the sequential, human-readable id: user_0001, user_0002, and so on. Sequential ids are the hinge that makes relational composition work in Phase 2 — a child entity can reference user_0001 deterministically because the factory always mints ids in the same order.
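The precedence that buildAs applies (base entity, then each trait in the order given, then explicit overrides) is ordinary object spreading. The sketch below isolates that merge order; applyPatches is a hypothetical helper and the hard-coded base object stands in for the builder's output, so treat it as an illustration rather than part of the factory module.

```typescript
interface User {
  id: string;
  role: 'admin' | 'editor' | 'viewer';
  status: 'active' | 'deactivated';
}

// Hypothetical helper: mirrors the merge order used by buildAs, nothing more.
function applyPatches<T>(base: T, traits: Partial<T>[], overrides: Partial<T>): T {
  let entity = base;
  for (const patch of traits) {
    entity = { ...entity, ...patch }; // later traits win over earlier ones
  }
  return { ...entity, ...overrides }; // overrides win over every trait
}

const base: User = { id: 'user_0001', role: 'viewer', status: 'active' };
const admin: Partial<User> = { role: 'admin' };
const deactivated: Partial<User> = { status: 'deactivated' };

const result = applyPatches(base, [admin, deactivated], { role: 'editor' });
console.log(result);
// { id: 'user_0001', role: 'editor', status: 'deactivated' }
```

The explicit role override beats the admin trait, while the deactivated trait survives because nothing later touches status. The base object is never mutated.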

Phase 2 — Seed configuration and relational references

A single-entity factory is only half the story. Real API responses are graphs: a user owns orders, an order contains line items, a line item references a product. For those graphs to be useful in tests, the foreign keys must line up on every run — an order’s userId must point at a user that actually exists in the same fixture set.

First, confirm the seed boundary. The factory consumes faker, and faker is seeded exactly once from MOCK_SEED:

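// src/mocks/seeded-faker.ts
import { Faker, en } from '@faker-js/faker';

const raw = process.env.MOCK_SEED ?? '42';
const seed = Number.parseInt(raw, 10);
if (Number.isNaN(seed)) {
  throw new Error(`MOCK_SEED must be an integer, got "${raw}"`);
}

export const faker = new Faker({ locale: [en] });
faker.seed(seed);

// date.past() and date.recent() are relative to the current time unless a
// reference date is pinned, so fix one to keep date fields stable across runs.
// The value below is an arbitrary example; any fixed instant works.
faker.setDefaultRefDate('2025-01-01T00:00:00.000Z');

/** Reset the sequence. Call in beforeEach so each test starts clean. */
export function resetSeed(value = seed): void {
  faker.seed(value);
}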

# .env.test  (committed — a dev constant, not a secret)
MOCK_SEED=42
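One caveat with Number.parseInt: it accepts a numeric prefix, so a typo such as MOCK_SEED=42abc parses as 42 without complaint. If that matters in your pipeline, a stricter parse could look like the sketch below. parseSeed is a hypothetical helper name, not something the seed module above exports.

```typescript
// Hypothetical stricter alternative to the parseInt check.
function parseSeed(raw: string | undefined, fallback = 42): number {
  if (raw === undefined) return fallback; // variable not set at all
  if (!/^\d+$/.test(raw)) {
    throw new Error(`MOCK_SEED must be a non-negative integer, got "${raw}"`);
  }
  const seed = Number(raw);
  if (!Number.isSafeInteger(seed)) {
    throw new Error(`MOCK_SEED is too large to represent exactly: "${raw}"`);
  }
  return seed;
}

console.log(parseSeed('42'));              // 42
console.log(parseSeed(undefined));         // 42 (fallback)
console.log(Number.parseInt('42abc', 10)); // 42 (the prefix is accepted)
// parseSeed('42abc') throws instead of returning 42
```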

Now compose. Build parents first, capture their ids, and pass them into child factories as overrides. Because the parent ids are deterministic, the children reference stable values:

// src/mocks/factories/order.ts
import { defineFactory } from '../factory';

export interface Order {
  id: string;
  userId: string;
  status: 'pending' | 'paid' | 'shipped';
  placedAt: string;
}

export const orderFactory = defineFactory<Order>(({ index, faker }) => ({
  id: `order_${(index + 1).toString().padStart(4, '0')}`,
  userId: 'user_0001', // default owner; override when composing a graph
  status: faker.helpers.arrayElement(['pending', 'paid', 'shipped']),
  placedAt: faker.date.recent({ days: 30 }).toISOString(),
}));

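// src/mocks/graph.ts
import { userFactory, type User } from './factories/user';
import { orderFactory, type Order } from './factories/order';
import { resetSeed } from './seeded-faker';

export interface Dataset {
  users: User[];
  orders: Order[];
}

const formatId = (prefix: string, n: number): string =>
  `${prefix}_${n.toString().padStart(4, '0')}`;

/**
 * Build a referentially-consistent dataset: each user owns `ordersPerUser`
 * orders whose userId points at a user that exists in the same set.
 */
export function buildDataset(userCount = 3, ordersPerUser = 2): Dataset {
  resetSeed(); // deterministic start, independent of prior consumption

  // resetSeed() rewinds faker but not each factory's internal sequence
  // counter, so ids are assigned by position here. Without this, a second
  // call in the same process (with the defaults) would start at user_0004.
  const users = Array.from({ length: userCount }, (_, i) =>
    userFactory.build({ id: formatId('user', i + 1) })
  );
  const orders = users.flatMap((user, u) =>
    Array.from({ length: ordersPerUser }, (_, o) =>
      orderFactory.build({
        id: formatId('order', u * ordersPerUser + o + 1),
        userId: user.id,
      })
    )
  );

  return { users, orders };
}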

buildDataset() returns the same graph on every call with the same seed: three users, six orders, and every order.userId resolvable inside users. The generating realistic relational mock data walkthrough extends this to a three-level users → orders → line items graph with quantity and price integrity.
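The "every order.userId is resolvable" property is cheap to assert directly. A minimal sketch follows, using trimmed-down stand-in types and hand-written sample data rather than the real factories; findOrphanOrders is a hypothetical helper name.

```typescript
interface UserRef { id: string }
interface OrderRef { id: string; userId: string }

// Hypothetical helper: returns the orders whose userId has no matching user.
function findOrphanOrders(users: UserRef[], orders: OrderRef[]): OrderRef[] {
  const userIds = new Set(users.map((u) => u.id));
  return orders.filter((o) => !userIds.has(o.userId));
}

const sampleUsers: UserRef[] = [{ id: 'user_0001' }, { id: 'user_0002' }];
const sampleOrders: OrderRef[] = [
  { id: 'order_0001', userId: 'user_0001' },
  { id: 'order_0002', userId: 'user_0002' },
  { id: 'order_0003', userId: 'user_0099' }, // deliberately dangling
];

console.log(findOrphanOrders(sampleUsers, sampleOrders).map((o) => o.id));
// [ 'order_0003' ]
```

In a test suite the same function could be pointed at a real dataset and asserted to return an empty array, which reports the offending rows instead of a bare pass or fail.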

Phase 3 — Wiring fixtures into the mock stack and CI

Fixtures earn their keep only when the mock layer serves them. There are two delivery modes, and a seeded factory makes both safe because the bytes are identical whether you build at runtime or ahead of time, provided both paths use the same seed and the same builder arguments.

Runtime: MSW handlers

Handlers call the dataset builder and serve slices of it. Because the builder resets the seed internally, the handler is stateless across requests unless you deliberately hold state — the topic of stateful scenario sequences:

// src/mocks/handlers/orders.ts
import { http, HttpResponse } from 'msw';
import { buildDataset } from '../graph';

export const orderHandlers = [
  http.get('/api/v1/orders', ({ request }) => {
    const url = new URL(request.url);
    const userId = url.searchParams.get('userId');
    const { orders } = buildDataset(3, 2);
    const result = userId ? orders.filter((o) => o.userId === userId) : orders;
    return HttpResponse.json({ data: result, total: result.length });
  }),
];

For deeper response logic — pagination, projection, conditional errors — layer these fixtures into the techniques from advanced MSW handler patterns.

Ahead of time: WireMock `__files`

A mock server that cannot execute JavaScript needs static JSON on disk. Run the factory in a small generation script and write its output where WireMock reads it:

// scripts/generate-fixtures.mts
import { writeFileSync, mkdirSync } from 'node:fs';
import { buildDataset } from '../src/mocks/graph.js';

const outDir = './mocks/wiremock/__files';
mkdirSync(outDir, { recursive: true });

const { users, orders } = buildDataset(5, 3);
writeFileSync(`${outDir}/users.json`, JSON.stringify({ data: users }, null, 2));
writeFileSync(`${outDir}/orders.json`, JSON.stringify({ data: orders }, null, 2));

console.log(`Wrote ${users.length} users and ${orders.length} orders`);

{
  "scripts": {
    "fixtures:generate": "tsx scripts/generate-fixtures.mts",
    "fixtures:check": "npm run fixtures:generate && git diff --exit-code mocks/wiremock/__files"
  }
}

CI gate

fixtures:check regenerates and fails if the committed output drifts, which catches the classic bug where a factory changed but the checked-in JSON did not:

# .github/workflows/fixtures.yml
name: Fixtures
on: [push, pull_request]
jobs:
  verify:
    runs-on: ubuntu-latest
    env:
      MOCK_SEED: "42"
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: "npm"
      - run: npm ci
      - run: npm run fixtures:check

This aligns with mock lifecycle management: generated fixtures are derived artefacts, regenerated from the factory, never hand-edited.

Verification steps

Run these to confirm the factory produces stable, well-shaped data.

Determinism across runs:

MOCK_SEED=42 npm run fixtures:generate && cp mocks/wiremock/__files/users.json /tmp/a.json
MOCK_SEED=42 npm run fixtures:generate && diff /tmp/a.json mocks/wiremock/__files/users.json
# Expected: no output (identical)

Seed actually controls output:

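MOCK_SEED=99 npm run fixtures:generate && diff /tmp/a.json mocks/wiremock/__files/users.json
# Expected: a diff (values changed, proving the seed is wired in)
MOCK_SEED=42 npm run fixtures:generate
# Restores the pinned output so the seed-99 files are not committed by accident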

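Referential integrity holds (this check and the next assume the mocks have been compiled to JavaScript under ./dist; adjust the path to match your build output):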

node -e "const {buildDataset}=require('./dist/mocks/graph.js'); const d=buildDataset(3,2); const ids=new Set(d.users.map(u=>u.id)); console.log(d.orders.every(o=>ids.has(o.userId)) ? 'OK' : 'BROKEN')"
# Expected: OK

Traits apply:

node -e "const {userFactory}=require('./dist/mocks/factories/user.js'); console.log(userFactory.buildAs(['admin']).role)"
# Expected: admin

Troubleshooting

Every `build()` returns the same object

Cause: The builder captured faker values at module load instead of inside the function body — for example, a top-level const email = faker.internet.email() reused on every call.

Fix: Generate all random values inside the builder function so each build() advances the sequence. The builder receives faker as an argument precisely to discourage top-level capture.
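The difference is easy to reproduce without Faker. In this sketch a simple counter stands in for the random source, so nextEmail, buildCaptured, and buildFresh are illustrative names only.

```typescript
// Stand-in for a random source: a counter, so the effect is easy to see.
let counter = 0;
const nextEmail = (): string => `user${++counter}@example.test`;

// Anti-pattern: the value is captured once, when the module loads.
const capturedEmail = nextEmail();
const buildCaptured = () => ({ email: capturedEmail });

// Fix: the value is generated inside the builder, once per call.
const buildFresh = () => ({ email: nextEmail() });

console.log(buildCaptured().email, buildCaptured().email);
// user1@example.test user1@example.test
console.log(buildFresh().email, buildFresh().email);
// user2@example.test user3@example.test
```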

`buildList` output changes when unrelated tests run first

Cause: The shared faker sequence carries state across test files. A test that consumed three values earlier shifts every subsequent factory call.

Fix: Call resetSeed() in beforeEach, and have graph builders reset internally (as buildDataset does). Never depend on implicit sequence position across test boundaries — the same discipline covered under per-suite reset in deterministic seed management.
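To see why sequence position matters, the sketch below uses a small mulberry32-style generator as a stand-in for Faker's seeded sequence. It is not Faker's algorithm, and createRng is a hypothetical name; the point is only that consuming one extra value shifts everything after it, and that a reset restores the starting position.

```typescript
// mulberry32-style PRNG, used here only as a stand-in for a seeded Faker.
function createRng(seed: number) {
  let state = seed >>> 0;
  const next = (): number => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
  const reset = (): void => {
    state = seed >>> 0; // back to the start of the sequence
  };
  return { next, reset };
}

const rng = createRng(42);
const cleanRun = [rng.next(), rng.next()];

rng.reset();
rng.next(); // an unrelated test consumes one value first
const shiftedRun = [rng.next(), rng.next()];

rng.reset(); // the effect resetSeed() has when called in beforeEach
const resetRun = [rng.next(), rng.next()];

console.log(cleanRun[1] === shiftedRun[0]);       // true: same sequence, one position later
console.log(cleanRun.join() === resetRun.join()); // true: reset restores the position
```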

Foreign keys point at users that do not exist in the response

Cause: Child factories generated their own random userId with faker.string.uuid() instead of referencing a built parent.

Fix: Build parents first, then pass { userId: parent.id } into the child factory as an override, as shown in buildDataset. Random foreign keys are the single most common cause of “works in isolation, breaks in integration” fixture bugs.

Committed WireMock JSON drifts on every CI run

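Cause: MOCK_SEED differs between the machine that committed the fixtures and CI, so the same factory emits different bytes. A second source of drift is date fields: faker.date.past() and faker.date.recent() are computed relative to the current time unless a reference date is fixed, so their output changes from run to run even with an identical seed.

Fix: Pin MOCK_SEED to a fixed integer in the fixtures workflow env (shown in Phase 3) and in .env.test, and pin the clock by calling faker.setDefaultRefDate() with a fixed date next to faker.seed(). Reserve per-run seeds for handler-only projects that never commit static output.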

When to advance

The fixture layer is solid when:

Each entity has exactly one factory; no test constructs entities with inline object literals
Traits express every meaningful variant (admin, deactivated, empty, oversized) instead of scattered overrides
Relational datasets pass a referential-integrity assertion in CI
Running the generator twice with the same seed produces a zero-byte diff
Handlers and static __files both derive from the same factories, never from separate hand-written data

Once these hold, extend the graph in generating realistic relational mock data, formalise the generic helper in building a reusable fixture factory, or bound the valid value space with schema-driven data generation.

FAQ

How is a fixture factory different from schema-driven generation?

A factory is hand-written TypeScript that you control field by field, giving you traits and precise overrides. Schema-driven data generation derives payloads from an OpenAPI or JSON Schema definition automatically. Factories win on readability and edge-case control; schema generation wins on staying in lockstep with a large contract. Most teams use both — factories for the entities tests assert on, schema generation for breadth across endpoints nobody is asserting against yet.

Should generated fixtures be committed or generated at runtime?

Commit them when a mock server that cannot run JavaScript needs static files, such as WireMock __files, or when you want a reviewable diff on every data change. Generate at runtime when MSW handlers build responses on demand. A seeded factory makes both safe because the output is identical either way — the committed JSON is simply a snapshot of what the runtime path would have produced.

How do I keep relational ids stable across factories?

Derive child foreign keys from an index or from the parent id rather than from a fresh faker.string.uuid() call. Build parents first, capture their ids, and pass them into the child factory as an override. Because parent ids are derived from build order rather than from random values, the children reference the same values on every run. The buildDataset helper in Phase 2 is the minimal version of this pattern.

Building a Reusable Fixture Factory — the generic factory helper with traits, overrides, and sequences in depth
Generating Realistic Relational Mock Data — referential integrity across users, orders, and line items
Deterministic Seed Management — the seed layer every factory sits on top of
Schema-Driven Data Generation — bound the value space your factories populate
Stateful Scenario Sequences — when fixtures must change across a sequence of calls
