Engineering

Why your Cypress suite died at month 6 (and how to fix it)

By Sergei Pustovalov · 9 May 2026 · 7 min read

You adopted Cypress the way you'd adopt any other framework. Your team spent a hackathon week getting it set up. You wrote 30 or 40 tests covering the critical paths. Login, signup, checkout, the dashboard, the settings page. They ran in CI on every PR. They were green.

That was month one. By month six, the suite is amber. By month nine it's mostly red. By month twelve, someone quietly removes the "Cypress check" from the deploy bot's required checks, because it hasn't actually meant anything for two quarters.

The same arc plays out at almost every B2B SaaS startup that adopts a code-first browser-testing framework without a dedicated owner. It is not a Cypress problem. It would happen with Playwright, with Selenium, with anything code-first. It is a process problem disguised as a technical problem, and once you see it that way, you have actual options.

I've watched this pattern at maybe a dozen companies, including one I built. Here is how it actually plays out, and what works to stop it.

How it starts

The hackathon week is genuine. Engineers like writing tests when it feels like greenfield work. The team installs Cypress, scaffolds the basic structure, writes a happy-path test for login. Two more for signup. Five for the main dashboard. They look at each other, feel proud, and ship the suite to CI.

For about a month, this works. The tests pass on every PR. When one fails, the engineer who wrote it knows exactly why and fixes it in five minutes. The suite is small enough that any one person can hold the whole map of it in their head. There is no debate about ownership because everyone owns it equally.

This is the high-water mark. It is also the trap. The conditions that made it work (small suite, fresh memory, owner-equals-author) do not survive contact with month two of feature work.

Where it goes wrong

Month two ships a refactor of the login form. The Cypress test for login relied on a CSS class that no longer exists, so the test fails. The engineer who wrote it is now on the new feature, has not looked at the test in three weeks, and does not have time to debug it today. They wrap it in it.skip() with a TODO comment.

This is the original sin. Not because skip is bad. Because once one test is skipped, the bar for skipping the next one drops. Every engineer who hits a brittle test now has implicit permission to skip it under deadline pressure. By month three, four or five tests are skipped.

By month four, a real regression slips through. Someone on the team says "we should look at why our tests didn't catch that." Someone else points out that the tests for the affected area were skipped. Everyone agrees they should be unskipped. No one is assigned. They stay skipped.

By month six, the suite has 40 tests, maybe 25 of them green, the rest skipped or failing for reasons no one has time to investigate. The dashboard is amber every day. The team has stopped looking at it. The team has not stopped relying on it as a regression check, because their mental model is still "we have tests."

This is the dangerous middle. The organization still believes it has regression coverage. It does not. The suite has degraded into a placebo.

Why this happens specifically with code-first frameworks

There is nothing wrong with Cypress, Playwright, or any other code-first browser-testing framework. They are all excellent. The reason a code-first suite degrades is structural, not technical.

Code-first tests live in the same repo as the app. They go through code review, they ship in PRs, they are part of the engineering surface area. This sounds like a benefit. In practice, it means tests are subject to the same triage pressure as features, and tests almost always lose.

A new feature has a deadline, a stakeholder, and visible business value. A failing test has none of those. It has a Cypress dashboard in amber that someone might glance at on Monday morning. When an engineer chooses between "fix the brittle login test" and "ship the contract revision the sales team has been waiting for," the test loses every time.

The second structural problem is ownership. If everyone owns the suite, no one does. There is no QA engineer whose job is the test suite. There is no on-call rotation for test maintenance. There is no SLA. The suite is by default everyone's responsibility, which means it is no one's.

You can solve this with discipline. It is not impossible. You just need to actually do it, and most small SaaS teams do not have the engineering capacity to treat test maintenance as a real workstream alongside product work.

The hidden cost when this happens

Here is the cost that gets ignored: by month nine, the team is making release decisions based on a placebo.

Someone says "our regression tests are passing, looks good." They are not. Half the tests are skipped, the other half pass on assertions so loose that a real bug could slip through. The team's confidence in the deploy is built on a signal that has decayed below the noise floor. They are flying blind without knowing it.

Then the regression hits. A real bug ships to production. Customer complaint. Investigation. Root cause analysis. Someone runs the test that was supposed to catch this and finds it has been skipped for four months. The team holds a postmortem, agrees this should not happen again, and assigns someone to fix the suite. Three weeks later the new feature deadline kicks in and the suite goes back to its previous state.

You did not save engineering time by skipping the brittle test. You deferred it, and added on top a production incident, customer-trust damage, and a postmortem cycle. The cost compounds.

What actually works

Three options, in increasing order of cost.

1. Treat tests as production code.

Add the test suite to your on-call rotation. Whoever is on call this week is responsible for any failing test, including the ones that have been skipped. They either fix it or get explicit approval from the eng lead to keep it skipped with a stated reason and a ticket. Audit the suite quarterly. Delete tests that have been skipped for more than a sprint without an active fix in flight. This works if your team has the discipline to actually do it. Most don't, but it is the cheapest option if you can.

2. Hire a QA engineer.

Someone whose job it is to own the suite. They write new tests, they fix flakes, they audit, they file the bugs the suite catches. Salary cost: $80-150k/year depending on geography. Returns: a test suite that actually works as a regression signal. Easy decision once you are at 30+ engineers, hard decision before that.

3. Move regression to a managed service.

Treat regression testing as something separate from your app's test suite. Use a service that defines flows from outside your codebase, runs them on staging, and tells you when something breaks. Your in-repo tests stay focused on component-level logic. The regression check on the critical user flows lives somewhere that does not compete with feature work for engineering time.

I built Regresco for the third option. Flows live outside your repo, they run on every release, the platform classifies failures so you know which red runs are real, and the maintenance burden is something close to zero. Free plan is 5 runs a month if you want to point it at your staging URL and see what happens.

If your suite is already at month 6

A practical audit you can do this Friday afternoon:

  1. Open your Cypress dashboard. Count tests in three buckets: passing, skipped, failing.
  2. For each skipped test, check the commit that skipped it. If the skip is older than two months and there's no active ticket to unskip, delete the test. It is dead code.
  3. For each failing test, decide right now: fix it this sprint, or delete it. Do not let it sit failing for another week.
  4. Whatever survives the audit, write down what it actually covers. Compare that list to the critical user paths in your app. Identify gaps.
  5. Decide who owns the surviving suite. If the answer is "everyone," go back to step 1.

Most teams find half their suite is dead and the other half does not actually cover the paths that would catch the regressions they are most worried about. That is the moment to choose option 1, 2, or 3.

Tired of debugging your own suite?

Free plan is 5 runs a month. No credit card. Point Regresco at your staging URL and you'll see a regression sweep running in under 10 minutes, with failures classified for you so you know which red runs are real.