CI/CD Pipelines · Part 6

Testing in CI pipelines

In this series (10 parts)
  1. What CI/CD actually means
  2. Pipeline anatomy and design
  3. GitHub Actions in depth
  4. GitLab CI/CD in depth
  5. Jenkins fundamentals
  6. Testing in CI pipelines
  7. Artifact management
  8. Pipeline security and supply chain
  9. Progressive delivery
  10. Self-hosted runners and pipeline scaling

Prerequisite: CI/CD pipeline anatomy.

A CI pipeline without a deliberate testing strategy is just a slow build that occasionally turns red. This article maps the test pyramid to pipeline stages, adds parallelism, tackles flaky tests, and introduces contract testing with Pact.


The test pyramid

Fast, cheap tests form the base. Slow, expensive tests sit at the top. You want many more of the former.

graph TB
  E2E["E2E Tests<br/>Few, slow, brittle<br/>Deploy stage"]
  INT["Integration Tests<br/>Moderate count<br/>Build stage"]
  UNIT["Unit Tests<br/>Many, fast, isolated<br/>Commit stage"]

  E2E --- INT
  INT --- UNIT

  style UNIT fill:#22c55e,color:#000
  style INT fill:#facc15,color:#000
  style E2E fill:#ef4444,color:#fff

Unit tests run first, E2E tests run last.

Unit tests in the commit stage

Unit tests cover pure functions and business logic in isolation. No database, no network. They belong in the first pipeline stage.

# .github/workflows/ci.yml (excerpt)
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run test:unit -- --coverage

If unit tests fail, the pipeline stops there: the later jobs declare needs on this one, so they are skipped.
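As an illustration of what belongs in this stage, here is a minimal sketch of a pure function that a unit test can exercise without a database or network. The function name and the discount rule are hypothetical and are not taken from any real codebase.

```typescript
// Hypothetical business rule, used only for illustration.
// Pure: the result depends only on the arguments, so a test needs no setup.
function orderTotalCents(
  unitPriceCents: number,
  quantity: number,
  discountPercent = 0,
): number {
  if (!Number.isInteger(quantity) || quantity < 0) {
    throw new RangeError("quantity must be a non-negative integer");
  }
  if (discountPercent < 0 || discountPercent > 100) {
    throw new RangeError("discountPercent must be between 0 and 100");
  }
  const grossCents = unitPriceCents * quantity;
  // Round to whole cents so callers never see fractional amounts.
  return Math.round(grossCents * (1 - discountPercent / 100));
}
```

A unit test for this would assert, for example, that `orderTotalCents(1000, 2, 25)` returns `1500` and that a negative quantity throws.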

Integration tests in the build stage

Integration tests verify that components work together: a service calling a real database, an API handler hitting a real cache.

  integration-tests:
    needs: unit-tests
    runs-on: ubuntu-latest
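    services:
      postgres:
        image: postgres:16
        env: { POSTGRES_PASSWORD: testpass, POSTGRES_DB: testdb }
        ports: ["5432:5432"]
        # Wait until Postgres accepts connections before the steps start.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5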
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run test:integration
        env:
          DATABASE_URL: postgres://postgres:testpass@localhost:5432/testdb

E2E tests in the deploy stage

E2E tests exercise the full stack via Playwright or Cypress against a staging environment. Keep the count low and focused on critical journeys: login, checkout, search.

  e2e-tests:
    needs: integration-tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npm run test:e2e
        env:
          BASE_URL: https://staging.example.com

Test parallelism

Most test frameworks can split a suite into shards, and a GitHub Actions matrix strategy runs one job per shard in parallel.

  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx vitest run --shard=${{ matrix.shard }}/4

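Four shards run simultaneously, so a suite with 12 minutes of test time drops to roughly 3 minutes per shard, plus the checkout and install time that every shard repeats.

That fixed setup cost is why returns diminish as you add shards. For many projects the gains flatten out around 8 shards.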
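To make the arithmetic concrete, here is a rough sketch of how a suite can be partitioned and why extra shards pay off less and less. The round-robin split is illustrative and is not necessarily how Vitest assigns files internally. The one-minute setup figure in the usage note is an assumption.

```typescript
// Illustrative round-robin split; a real runner may assign files differently.
// `index` is 1-based to match the --shard=<index>/<total> convention.
function shardFiles(files: string[], index: number, total: number): string[] {
  if (total < 1 || index < 1 || index > total) {
    throw new RangeError("index must be in 1..total");
  }
  // Sort first so every runner computes the same partition independently.
  return [...files].sort().filter((_, i) => i % total === index - 1);
}

// Simple model: each shard pays the full setup cost (checkout, install)
// plus an even share of the total test time.
function estimateWallClockMinutes(
  testMinutes: number,
  setupMinutes: number,
  shards: number,
): number {
  return setupMinutes + testMinutes / shards;
}
```

With 12 minutes of tests and an assumed 1 minute of setup, this model gives 4 minutes at 4 shards, 2.5 at 8, and 1.75 at 16: each doubling of runners saves half as much as the previous one while the runner bill doubles.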


Flaky tests

A flaky test passes and fails on the same code without any changes. Flaky tests erode trust in the pipeline until developers start ignoring red builds.

Common causes:

  • Shared state: tests modify a database row and do not clean up.
  • Timing dependencies: a test expects a fast response but the CI runner is slower.
  • Order dependence: test B passes only when test A runs first.

Strategies to fight flakiness:

  1. Quarantine: move flaky tests to a non-blocking job.
  2. Retry with limits: allow one retry. A test that needs a second retry goes to quarantine.
  3. Deterministic fixtures: use transactions that roll back after each test.

  flaky-quarantine:
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx vitest run --config vitest.quarantine.config.ts
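The "retry with limits" rule can be sketched as a small wrapper. This is an illustration of the policy, not how any particular test runner implements retries. The `Outcome` type and `runWithOneRetry` name are hypothetical, and the wrapper is synchronous for brevity.

```typescript
// "flaky" means the test failed once and then passed on its single retry.
type Outcome = "passed" | "flaky" | "failed";

function runWithOneRetry(test: () => void): Outcome {
  const maxAttempts = 2; // the first run plus exactly one retry
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      test();
      return attempt === 1 ? "passed" : "flaky";
    } catch {
      // Swallow the failure and fall through to the next attempt, if any.
    }
  }
  return "failed";
}
```

Recording the "flaky" outcome matters as much as the retry itself: a test that keeps landing in that bucket is a candidate for the quarantine job rather than for more retries.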

Coverage gates

Raw coverage percentages are noisy: a large, well-covered codebase can hide a completely untested change. A better gate is coverage on changed lines only.

      - run: npm run test:unit -- --coverage
      - name: Check diff coverage
        uses: romeovs/lcov-reporter-action@v0.4.0
        with:
          lcov-file: coverage/lcov.info
          github-token: ${{ secrets.GITHUB_TOKEN }}
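          # Confirm your pinned version supports this input; unknown inputs only produce a warning.
          min-coverage: 80
          # Limits the report to files changed in the pull request (file-level, not line-level).
          filter-changed-files: true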

80% on changed lines is a good starting point. Requiring 100% tends to produce useless tests written only to satisfy the gate.
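For a sense of what "coverage on changed lines" means in practice, here is a minimal sketch that computes it from an lcov report. It reads only the standard `SF:` and `DA:` records. The `changedLines` input is assumed to come from your own diff parsing, and the function name is hypothetical.

```typescript
interface DiffCoverage {
  covered: number; // changed executable lines hit at least once
  total: number;   // changed executable lines
  percent: number; // 100 when no executable line changed
}

// changedLines maps a file path (as it appears in the lcov SF: record)
// to the line numbers added or modified in the diff.
function diffCoverage(
  lcov: string,
  changedLines: Map<string, Set<number>>,
): DiffCoverage {
  let currentFile = "";
  let covered = 0;
  let total = 0;
  for (const rawLine of lcov.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("SF:")) {
      currentFile = line.slice(3);
    } else if (line.startsWith("DA:")) {
      // DA:<line number>,<hit count>[,<checksum>]
      const [lineNo, hits] = line.slice(3).split(",").map(Number);
      if (changedLines.get(currentFile)?.has(lineNo)) {
        total += 1;
        if (hits > 0) covered += 1;
      }
    }
  }
  const percent = total === 0 ? 100 : (covered / total) * 100;
  return { covered, total, percent };
}
```

A gate built on this would fail the job when `percent` falls below the threshold. Changed lines with no `DA:` record (comments, type declarations) are not counted, so they neither help nor hurt the score.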


Contract testing with Pact

In a microservice architecture, integration tests between services get expensive fast. Service A needs Service B, which needs Service C. Contract testing breaks this chain: the consumer records what it expects from a provider as a contract, and the provider verifies on its own that it can satisfy that contract.

graph LR
  Consumer["Consumer Service<br/>Generates contract"]
  Broker["Pact Broker<br/>Stores contracts"]
  Provider["Provider Service<br/>Verifies contract"]

  Consumer -->|publish| Broker
  Broker -->|retrieve| Provider
  Provider -->|verify result| Broker
  Broker -->|can-i-deploy check| Consumer

  style Consumer fill:#3b82f6,color:#fff
  style Broker fill:#8b5cf6,color:#fff
  style Provider fill:#22c55e,color:#000

Consumer publishes a contract to the broker. Provider verifies independently. No shared environment needed.

Consumer side

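import { PactV4 } from "@pact-foundation/pact";
// Test helpers from Vitest, the runner used in the jobs above; omit if your runner provides globals.
import { describe, expect, it } from "vitest";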

const provider = new PactV4({
  consumer: "OrderService",
  provider: "InventoryService",
});

describe("Inventory API", () => {
  it("returns stock level", async () => {
    await provider
      .addInteraction()
      .given("product ABC exists with stock 42")
      .uponReceiving("a request for stock level")
      .withRequest("GET", "/api/stock/ABC")
      .willRespondWith(200, (builder) => {
        builder.jsonBody({ sku: "ABC", quantity: 42 });
      })
      .executeTest(async (mockServer) => {
        const res = await fetch(`${mockServer.url}/api/stock/ABC`);
        const body = await res.json();
        expect(body.quantity).toBe(42);
      });
  });
});

Provider side

  verify-contracts:
    runs-on: ubuntu-latest
    steps:
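      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      # Start the provider in the background; the verifier below expects it on port 3000.
      - run: npm run start:test &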
      - run: >
          npx pact-provider-verifier
          --provider-base-url=http://localhost:3000
          --pact-broker-base-url=https://pact.example.com
          --provider=InventoryService
          --provider-app-version=${{ github.sha }}
          --publish-verification-results

The can-i-deploy check asks the broker whether it is safe to deploy this version based on verified contracts. If not, the pipeline fails before reaching production.
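As a rough sketch of how a deploy script might invoke that check, the helper below assembles the command line for the `pact-broker can-i-deploy` CLI. The helper and its option names are hypothetical. The environment name `production` is an assumption, and `--to-environment` presumes environments are configured in your broker.

```typescript
interface CanIDeployOptions {
  pacticipant: string;   // the service asking to be deployed
  version: string;       // usually the commit SHA being deployed
  environment: string;   // target environment as named in the broker
  brokerBaseUrl: string;
}

// Builds the argument list only; running it is left to the caller,
// for example via child_process.execFileSync(args[0], args.slice(1)).
function canIDeployArgs(options: CanIDeployOptions): string[] {
  return [
    "pact-broker",
    "can-i-deploy",
    "--pacticipant", options.pacticipant,
    "--version", options.version,
    "--to-environment", options.environment,
    "--broker-base-url", options.brokerBaseUrl,
  ];
}

// Example values reuse the service and broker names from this article.
const exampleArgs = canIDeployArgs({
  pacticipant: "OrderService",
  version: "abc123",
  environment: "production",
  brokerBaseUrl: "https://pact.example.com",
});
```

The CLI exits with a non-zero status when the answer is no, which is what makes the pipeline step fail.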


Putting it all together

A solid CI test strategy has four properties:

  1. Speed: unit tests under a minute, integration under five, E2E under ten.
  2. Isolation: each tier runs independently with no shared side effects.
  3. Reliability: flaky tests are quarantined, not tolerated.
  4. Contracts: services verify compatibility without running the full stack.

Encode these into your pipeline config, not a wiki page nobody reads.
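One way to encode the speed targets is a small check that runs at the end of the pipeline. The sketch below uses the budgets from the list above. How the measured durations are collected is left open, and the names are hypothetical.

```typescript
// Budgets in minutes, taken from the speed targets above.
const budgetsMinutes = { unit: 1, integration: 5, e2e: 10 } as const;
type Tier = keyof typeof budgetsMinutes;

// Returns the tiers that are not under their budget.
// "Under a minute" is read strictly, so hitting the budget exactly counts as over.
function overBudget(durationsMinutes: Record<Tier, number>): Tier[] {
  return (Object.keys(budgetsMinutes) as Tier[]).filter(
    (tier) => durationsMinutes[tier] >= budgetsMinutes[tier],
  );
}
```

A pipeline step could fail, or just warn, when the returned list is not empty.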


What comes next

Testing tells you the code works. The next question is what happens to the artifacts those tests produce. Artifact management covers versioning, registries, signing, and retention policies for the binaries your pipeline builds.
