Confidential client · Remote · Q4 2025
The five-level framework used to diagnose this client's testing practice before building the automated pipeline — and what moved them from Level 2 to Level 4.
The N+1 query fixes and EXPLAIN ANALYZE work that dropped the dashboard P99 latency from 4.2s to 380ms — and the postgresql.conf settings applied to the production database.
The Playwright architecture behind this engagement — Page Object Model, auth fixtures for login flows, network interception for payment mocks, and parallel CI execution.
At the 90-day check-in the team reported shipping twice as many features per sprint. The discipline of writing tests first had accelerated development rather than slowing it — the opposite of what their engineers had expected.
A B2B SaaS platform serving 50,000 active users had no automated tests. Zero. A three-person QA team was manually testing every release — a process that took 10–14 days and still shipped regressions regularly. The engineering team wanted to move to weekly releases but the QA bottleneck made it impossible. One critical regression six months earlier had taken their largest customer offline for eight hours, nearly costing them the contract.
The codebase was a React frontend (120,000 lines) and a Node.js API (80,000 lines) backed by a PostgreSQL database. There was no CI pipeline: developers pushed to main and deployed manually. The platform had 47 distinct user flows critical to the core product, 200+ API endpoints, and integrations with six third-party services (Stripe, SendGrid, and four industry-specific data providers).
The timeline was aggressive: they wanted a working CI pipeline and meaningful test coverage within six weeks, before their Series B roadshow in January 2026.
We started with a coverage audit — mapping all 47 critical user flows and 200+ API endpoints, prioritizing by business impact and risk. We then built the testing pyramid from the bottom up: unit tests first (fast, isolated, cheap), integration tests second (API contracts, database interactions), and E2E tests last (browser automation, full flows).
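One way to make "prioritizing by business impact and risk" concrete is a simple score per audited item. The sketch below is illustrative only: the 1 to 5 scales, the multiplication rule, the field names, and the sample items are assumptions, not the scoring used on the engagement.

```typescript
// Hypothetical shape for one audited item; field names and the 1-5 scales are assumptions.
interface AuditItem {
  name: string;
  kind: "flow" | "endpoint";
  businessImpact: number; // 1 (minor) to 5 (revenue- or contract-critical)
  regressionRisk: number; // 1 (stable) to 5 (changes often or has broken before)
}

// Score as impact times risk, so an item must be both important and fragile to rank first.
function priorityScore(item: AuditItem): number {
  return item.businessImpact * item.regressionRisk;
}

// Return a new array sorted by descending score; ties fall back to name for stable output.
function prioritize(items: AuditItem[]): AuditItem[] {
  return [...items].sort(
    (a, b) => priorityScore(b) - priorityScore(a) || a.name.localeCompare(b.name),
  );
}

// Made-up sample items, not taken from the client's audit.
const sample: AuditItem[] = [
  { name: "update-profile", kind: "flow", businessImpact: 2, regressionRisk: 2 },
  { name: "checkout", kind: "flow", businessImpact: 5, regressionRisk: 4 },
  { name: "GET /dashboard", kind: "endpoint", businessImpact: 4, regressionRisk: 3 },
];

const ranked = prioritize(sample).map((item) => item.name);
console.log(ranked); // ["checkout", "GET /dashboard", "update-profile"]
```

Multiplying rather than adding the two factors keeps low-impact items near the bottom even when they change often, which suits a six-week timeline where not everything can be covered first.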
We chose Vitest over Jest for its native ESM support and 3× faster test execution. We wrote 847 unit tests covering utility functions, business logic, and React components (via Testing Library). For API integration tests, MSW (Mock Service Worker) intercepts third-party API calls, so tests run without hitting Stripe or any other external service, which keeps them fast and deterministic. A coverage gate in CI blocks any merge whose new code falls below 90% coverage.
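A gate on new code differs from a whole-repository threshold: it only counts lines the change adds. The sketch below shows that calculation in isolation, assuming the CI job can already supply the added line numbers (from the diff) and the executed line numbers (from the coverage report). The `FileCoverage` shape, the function names, and the file path are hypothetical; in practice this check is usually delegated to a coverage service or CI plugin rather than hand-written.

```typescript
// Hypothetical per-file input: lines added by the PR and lines the test run executed.
interface FileCoverage {
  path: string;
  addedLines: number[];
  coveredLines: Set<number>;
}

const THRESHOLD = 0.9; // the 90% gate on new code described above

// Fraction of newly added lines that tests executed; 1 when the PR adds no lines.
function patchCoverage(files: FileCoverage[]): number {
  let added = 0;
  let covered = 0;
  for (const file of files) {
    for (const line of file.addedLines) {
      added += 1;
      if (file.coveredLines.has(line)) covered += 1;
    }
  }
  return added === 0 ? 1 : covered / added;
}

// Decide whether the merge may proceed and report a rounded percentage for the CI log.
function gate(files: FileCoverage[]): { ok: boolean; percent: number } {
  const ratio = patchCoverage(files);
  return { ok: ratio >= THRESHOLD, percent: Math.round(ratio * 100) };
}

// Made-up example: 10 added lines, 8 of them executed by tests.
const failing = gate([
  {
    path: "src/example/changed-file.ts", // hypothetical path
    addedLines: [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    coveredLines: new Set([10, 11, 12, 13, 14, 15, 16, 17]),
  },
]);

// Same change once the last two lines are also exercised.
const passing = gate([
  {
    path: "src/example/changed-file.ts",
    addedLines: [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    coveredLines: new Set([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
  },
]);

console.log(failing); // { ok: false, percent: 80 }
console.log(passing); // { ok: true, percent: 100 }
```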
We wrote 156 Playwright tests covering all 47 critical user flows, each with at least three test cases (happy path, error handling, edge case). Tests run in parallel across Chromium, Firefox, and WebKit. We debug failures with Playwright's trace viewer; traces from every CI run are retained for 30 days. Visual regression testing via screenshot comparison catches layout breaks that functional tests miss.
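A Playwright configuration along the following lines would give the behavior described above: three browser projects, parallel execution, and a trace for every test. It is a sketch and not the client's actual file. The `./e2e` directory is an assumed name, and the worker count of 8 is taken from the pipeline summary later in this case study.

```typescript
// playwright.config.ts (illustrative sketch, not the engagement's real configuration)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e", // assumed directory name
  fullyParallel: true, // run tests inside a file in parallel, not only across files
  workers: process.env.CI ? 8 : undefined, // 8 workers in CI; Playwright's default locally
  use: {
    trace: "on", // record a trace for every test so any CI run can be opened in the trace viewer
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
  ],
});
```

The 30-day trace retention is not a Playwright setting; it would come from the CI system's artifact retention policy. Screenshot comparison is typically written inside individual tests with `await expect(page).toHaveScreenshot()`, which compares against a stored baseline image.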
We wrote k6 load test scripts for the ten most-trafficked API endpoints, simulating up to 5,000 concurrent users. Load tests run weekly on a staging environment that mirrors production data. We identified two N+1 database queries during initial load testing — fixing them reduced P99 latency on the dashboard endpoint from 4.2s to 380ms.
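For readers unfamiliar with the N+1 pattern, the sketch below reproduces it against an in-memory stand-in that counts queries. The case study does not say which tables were involved, so the accounts and invoices model, the `FakeDb` class, and all function names are hypothetical. The point is the query count, not the schema.

```typescript
// In-memory stand-in for a database that counts round trips. All names are hypothetical.
interface Account { id: number; name: string }
interface Invoice { id: number; accountId: number; total: number }

class FakeDb {
  queryCount = 0;
  private readonly accounts: Account[];
  private readonly invoices: Invoice[];

  constructor(accounts: Account[], invoices: Invoice[]) {
    this.accounts = accounts;
    this.invoices = invoices;
  }

  listAccounts(): Account[] {
    this.queryCount += 1;
    return this.accounts;
  }

  // Like: SELECT ... WHERE account_id = $1
  invoicesForAccount(accountId: number): Invoice[] {
    this.queryCount += 1;
    return this.invoices.filter((inv) => inv.accountId === accountId);
  }

  // Like: SELECT ... WHERE account_id = ANY($1)
  invoicesForAccounts(accountIds: number[]): Invoice[] {
    this.queryCount += 1;
    const wanted = new Set(accountIds);
    return this.invoices.filter((inv) => wanted.has(inv.accountId));
  }
}

// N+1: one query for the list, then one more query per row.
function totalsNPlusOne(db: FakeDb): Map<number, number> {
  const totals = new Map<number, number>();
  for (const account of db.listAccounts()) {
    const sum = db.invoicesForAccount(account.id).reduce((acc, inv) => acc + inv.total, 0);
    totals.set(account.id, sum);
  }
  return totals;
}

// Batched: two queries regardless of row count, grouped in memory.
function totalsBatched(db: FakeDb): Map<number, number> {
  const accounts = db.listAccounts();
  const totals = new Map<number, number>();
  for (const account of accounts) totals.set(account.id, 0);
  for (const inv of db.invoicesForAccounts(accounts.map((a) => a.id))) {
    totals.set(inv.accountId, (totals.get(inv.accountId) ?? 0) + inv.total);
  }
  return totals;
}

// Build identical data for each run: two invoices (100 and 50) per account.
function makeDb(accountCount: number): FakeDb {
  const accounts: Account[] = [];
  const invoices: Invoice[] = [];
  for (let id = 1; id <= accountCount; id += 1) {
    accounts.push({ id, name: `account-${id}` });
    invoices.push({ id: id * 2 - 1, accountId: id, total: 100 });
    invoices.push({ id: id * 2, accountId: id, total: 50 });
  }
  return new FakeDb(accounts, invoices);
}

const slowDb = makeDb(100);
const slowTotals = totalsNPlusOne(slowDb);
const fastDb = makeDb(100);
const fastTotals = totalsBatched(fastDb);

console.log(slowDb.queryCount); // 101
console.log(fastDb.queryCount); // 2
```

Both functions return the same totals. Only the number of round trips differs, and under concurrent load each extra round trip adds to tail latency.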
The new pipeline runs unit tests on every PR (under 90 seconds), integration tests on merge to main (under 4 minutes), E2E tests on merge to main (under 12 minutes, in parallel across 8 workers), and load tests weekly on a schedule. The old manual QA process took 10–14 days per release. The merge-to-main stages (integration plus E2E) now finish in about 16 minutes and run on every merge.
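The 16-minute figure is consistent with the two merge-to-main stages running back to back at their stated upper bounds. The sketch below encodes the stage budgets from the paragraph above and sums them per trigger. The type names and the sequential-execution assumption are ours, and the weekly load tests are left out because no duration is given for them.

```typescript
type Trigger = "pull_request" | "merge_to_main";

interface Stage {
  name: string;
  trigger: Trigger;
  budgetSeconds: number; // upper bound stated in the case study
}

const stages: Stage[] = [
  { name: "unit", trigger: "pull_request", budgetSeconds: 90 },
  { name: "integration", trigger: "merge_to_main", budgetSeconds: 4 * 60 },
  { name: "e2e", trigger: "merge_to_main", budgetSeconds: 12 * 60 },
];

// Worst-case minutes for one trigger, assuming its stages run sequentially (an assumption).
function sequentialBudgetMinutes(trigger: Trigger): number {
  const seconds = stages
    .filter((stage) => stage.trigger === trigger)
    .reduce((sum, stage) => sum + stage.budgetSeconds, 0);
  return seconds / 60;
}

const mergeMinutes = sequentialBudgetMinutes("merge_to_main");
const prMinutes = sequentialBudgetMinutes("pull_request");

console.log(mergeMinutes); // 16
console.log(prMinutes); // 1.5
```

If the integration and E2E stages ran concurrently instead, the merge path would be bounded by the slower stage (12 minutes) rather than the sum.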
"We went from dreading releases to shipping twice a week. The pipeline catches things before they reach staging — we don't even discuss 'did this break anything?' anymore."
— VP Engineering, Confidential SaaS Client