Most teams know their QA is not where it should be. What's harder to answer is precisely where it sits, what that means in practice, and what the realistic next step looks like. After testing across web apps, mobile products, desktop software, APIs, SaaS platforms, and game titles, across sectors from fintech to e-commerce to gaming, at TestGate Studio we've found that most teams fall cleanly into one of five maturity levels. The model is a diagnostic tool, not a ranking. Knowing where you are is the first step to knowing what to fix.
The five levels
Level 0: ad-hoc testing
What this looks like: Testing happens informally. Developers test their own code. There are no test cases. QA is whoever has time the day before a release. Bug reports are verbal or on sticky notes. No systematic regression exists; each release is effectively a fresh start.
The real cost: Bugs found by customers cost 10–100× more to fix than bugs found in development. At Level 0 the cost is invisible because it's spread across engineering time spent firefighting, customer support load, and reputation damage that doesn't appear on any dashboard.
Level 1: a defined, manual process
What this looks like: A test plan exists. Someone owns QA. There are written test cases, probably in a spreadsheet. Testing happens at a defined point in the release cycle (usually "before we deploy"). Bug reports go into a tracker. There's some regression testing, but it's manual and often skipped when time is short.
The real problem: Manual regression is the bottleneck. As the product grows, the regression suite grows with it, but the time available to run it doesn't. The result is implicit prioritisation: testers run the cases they remember, skip the ones that feel stable, and miss regressions in unexpected places.
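Making that prioritisation explicit is the cheap fix at this stage. A minimal sketch, assuming each regression case carries a few risk flags exported from the tracker; the field names and weights here are illustrative assumptions, not a standard:

```python
# Sketch: score regression cases by risk so the suite can be cut
# deliberately when time is short, instead of by tester memory.
# Field names and weights below are illustrative assumptions.

def risk_score(case):
    return (3 * case["failed_recently"]        # caught a real bug in recent runs
            + 2 * case["covers_changed_code"]  # exercises code changed this release
            + 1 * case["customer_facing"])     # a failure would be visible to users

def prioritise(cases):
    """Order cases so the highest-risk ones run first."""
    return sorted(cases, key=risk_score, reverse=True)

cases = [
    {"name": "export report",   "failed_recently": 0, "covers_changed_code": 0, "customer_facing": 1},
    {"name": "checkout flow",   "failed_recently": 1, "covers_changed_code": 1, "customer_facing": 1},
    {"name": "admin audit log", "failed_recently": 0, "covers_changed_code": 1, "customer_facing": 0},
]
print([c["name"] for c in prioritise(cases)])
# → ['checkout flow', 'admin audit log', 'export report']
```

The scoring model matters far less than the fact that the cut line is written down: when a release ships with the bottom third skipped, everyone knows exactly what was skipped and why.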
Level 2: partial, untrusted automation
What this looks like: Automated tests exist. Unit tests cover business logic. There might be some E2E tests (Playwright, Cypress, Selenium). Tests run in CI, but not reliably. Flaky tests are a known problem. The automation suite gives some signal but isn't fully trusted. Manual testing still does most of the heavy lifting for regression.
The real problem: Flakiness is the enemy at this level. A test suite that fails intermittently teaches everyone to ignore failures, which is worse than having no automation at all, because it creates the illusion of coverage without the substance. The root cause is almost always the same: tests written too quickly, with poor isolation, and without synchronisation strategies.
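The most common missing synchronisation strategy is replacing fixed sleeps with explicit waits. A minimal sketch of the polling pattern, using a hypothetical helper rather than any specific framework's API:

```python
import time

# Anti-pattern: a fixed sleep encodes a guess about timing. Too
# short and the test fails intermittently; too long and the suite
# crawls. Either way the test is coupled to machine speed.
def get_state_flaky(read_state):
    time.sleep(2)  # hope the backend has settled by now
    return read_state()

# Synchronisation strategy: poll for the actual condition with a
# deadline, so the test waits exactly as long as it needs to and
# fails with a clear timeout instead of an intermittent mismatch.
def wait_until(condition, timeout=5.0, interval=0.05):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one final check at the deadline
```

Playwright and Cypress build this kind of polling into their assertions; in practice, flaky suites are often ones that bypass those built-in waits with hand-rolled sleeps.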
Level 3: trusted automation in CI
What this looks like: Tests run on every PR. The suite is fast enough that developers don't disable it to merge faster. Flakiness is under control. Coverage is tracked and gated. Mobile and API are covered alongside web. Performance testing runs on a schedule. New features get test cases as part of the definition of done.
What's still missing: Security testing, accessibility audits, and exploratory testing are still manual and ad-hoc. Coverage metrics look good but may not represent real-world risk: a high coverage percentage with tests that don't exercise critical paths is common at this level.
Level 4: quality built in
What this looks like: QA engineers are involved from the design phase, not called in at the end. Test cases are written from user stories before code is written. The pipeline covers web, mobile, API, performance, security, and accessibility. Exploratory testing runs on every major release. Incident post-mortems always include a "what test would have caught this?" review.
This is the goal. It doesn't mean zero bugs; it means no bug reaches a customer that a reasonable QA process would have caught. The cost of quality at Level 4 is lower than at Level 0, because prevention is always cheaper than remediation.
How to move up: what actually works
The most common mistake in QA improvement is trying to jump levels. A team at Level 0 trying to implement full CI automation will produce a maintenance nightmare they abandon within a month. The effective path is incremental: fix the biggest problem at your current level before adding the complexity of the next.
At TestGate Studio we run QA audits that place a team on this model and identify the three highest-impact changes to make in the next 90 days. It's rarely about tools; the limiting factor is almost always process and mandate, not technology. The right tool chosen badly is worse than the wrong tool chosen deliberately.
The coverage trap
One caution on metrics: code coverage percentage is not a proxy for QA quality. A codebase with 90% line coverage but tests that only verify happy paths in isolation can ship catastrophic bugs. Coverage tells you which lines were executed during tests; it says nothing about whether the tests actually verify correct behaviour, whether error paths are handled, or whether the system works end-to-end under realistic conditions.
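A concrete illustration of the gap between executed and verified, using a hypothetical function: any coverage tool would count both tests below as covering the happy path, but only one of them would catch a wrong result.

```python
def apply_discount(price, rate):
    """Hypothetical function under test."""
    if rate < 0 or rate > 1:
        raise ValueError("rate must be between 0 and 1")
    return price * (1 - rate)

# Counts toward line coverage but verifies nothing: this "test"
# would still pass if apply_discount returned the wrong number.
def test_discount_executes_only():
    apply_discount(100, 0.2)

# Verifies behaviour on both the happy path and the error path.
def test_discount_verifies_behaviour():
    assert apply_discount(100, 0.2) == 80
    try:
        apply_discount(100, 1.5)
        assert False, "expected ValueError for an invalid rate"
    except ValueError:
        pass

test_discount_executes_only()
test_discount_verifies_behaviour()
```

Both tests produce identical coverage numbers; only the second one is doing QA.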
Better metrics for QA maturity than coverage percentage: escaped defect rate (the proportion of bugs found by customers rather than in QA) and mean time to detect regressions. These are harder to measure but far more meaningful.
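Escaped defect rate is straightforward to compute once bugs are tagged by where they were found. A minimal sketch, with illustrative counts:

```python
def escaped_defect_rate(found_by_customers, found_in_qa):
    """Fraction of all known defects that reached customers.

    0.0 means QA caught everything that was caught at all;
    values near 1.0 mean customers are doing the testing.
    """
    total = found_by_customers + found_in_qa
    if total == 0:
        return 0.0  # nothing recorded yet
    return found_by_customers / total

# Illustrative quarter: 45 bugs caught in QA, 5 escaped to production.
print(escaped_defect_rate(5, 45))  # → 0.1
```

The trend matters more than the absolute number: a rate that falls release over release is direct evidence that the QA process is catching more of what it should.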
Where most teams actually are
In our experience testing web apps, mobile products, SaaS platforms, desktop software, APIs, and game titles: most teams sit at Level 1 or the low end of Level 2. They have a process, they have some tests, but the automation is not trusted enough to be the primary quality signal. The gap between Level 2 and Level 3 is where most improvement effort is needed: specifically flakiness, CI integration, and mobile/API coverage that matches the web layer.
If you're not sure where your team sits, the diagnostic is straightforward: ask whether your engineers trust the test suite enough to deploy on a green CI run without manual verification. If the answer is no, or "it depends," you're at Level 2 or below. That's the gap to close.