I built a code-correctness skill after too many vibe-coded apps broke in production
I kept running into the same problem with vibe-coded apps: they looked fine, worked on the happy path, and then broke the moment reality showed up.
That pattern repeated often enough that I stopped treating it as bad luck. The issue was the workflow. A lot of AI-assisted coding looks convincing because it optimizes for speed, coverage theater, and locally plausible code. That is not the same thing as correctness under production conditions.
So I made a reusable code-correctness skill for agent-compatible coding tools.
Instead of doing one fuzzy “review this code” pass, it checks code across 19 focused correctness lenses. That forces the review to look at the failure modes that usually get skipped when a project is moving fast.
What it checks for
The skill looks for issues across areas like:
- data representation mistakes
- broken contract invariants
- timing and ordering problems
- concurrency hazards
- fake or misleading test coverage
- silent failure paths
- observability gaps
- stale caches and freshness bugs
- schema evolution mistakes
- risky online migrations
- time zone and locale behavior
Those are exactly the categories that tend to survive demos, pass basic manual testing, and then fail under real traffic, dirty data, retries, clock skew, partial deploys, or long-lived state.
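To make one of those lenses concrete, here is a minimal, hypothetical sketch (not taken from the skill itself) of a silent failure path: a helper that swallows a parse error and returns defaults, so corrupt data looks like a fresh user until someone notices the missing state.

```python
import json

def load_prefs(raw: str) -> dict:
    # BAD: the except clause turns corrupt input into "empty preferences".
    # This passes demos and happy-path tests, then silently loses data.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}

def load_prefs_loud(raw: str) -> dict:
    # BETTER: surface the failure so callers and monitoring can see it.
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"corrupt preferences payload: {raw[:40]!r}") from exc
```

A review pass scoped to "silent failure paths" is much more likely to flag the first version than a single generic "review this code" prompt, which tends to stop at style and obvious bugs.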
Why I made it portable
I did not want another one-off prompt buried in a notes app. I wanted something reusable, explicit, and easy to run wherever agentic coding is happening.
The skill lives in a portable Agent Skills repository, and it can be installed with npx skills or used as a Claude Code plugin.
Fastest way to try it
With npx skills:
npx skills add a1exmozz/skills --skill code-correctness
In Claude Code:
/plugin marketplace add a1exmozz/skills
/plugin install a1exmozz-skills@a1exmozz
Then use it as:
/a1exmozz-skills:code-correctness [scope]
What changed for me
This has made a real difference in my own workflow. It catches issues that are easy to miss in normal AI-assisted coding loops, and it helps stop the same classes of breakage from coming back later.
That matters more to me than generating code slightly faster. If the output is fragile, you have not saved time; you have deferred the cost.
If you are building quickly with AI and you have been burned by apps that “worked” right up until they did not, this is for you.
Repo: a1exmozz/skills on GitHub.