Using a Rival AI to Audit My Code — What It Found, What It Missed, and What Actually Mattered

I have been building Cozy Reader — a WordPress blog reader for Android — almost entirely in collaboration with Claude. It handles the architecture decisions, writes the code, reviews its own output, and helps me think through edge cases as we go. For a solo developer without a formal engineering background, this kind of pair programming has been genuinely transformative.

But then I started wondering: what is it missing? Every collaborator has blind spots, and an AI that has spent weeks in conversation about your codebase is no different. So I did something a little unusual: I ran the same codebase through Cursor AI and asked it to do a fresh, independent security and UI audit with no prior context. Then I brought those findings back to Claude and asked it to verify each one against the actual code before we touched anything.

Here is what happened.

The Setup

Cozy Reader is a React Native app built with Expo, using SQLite for local storage, the WordPress REST API for content, and the WordPress.com OAuth flow for likes and account features. It is the kind of app where a few categories of bug really matter: data loss, auth failures, and the app looking broken on a user’s phone when they have dark mode on.

I gave Cursor the codebase and asked for two passes: a security and logic audit, and a theming and accessibility review. No prompting beyond that. Then I took its output verbatim to Claude.

What Cursor Found — Security Pass

Cursor flagged five issues with varying severity labels. Here is the honest breakdown of what was real, what was overstated, and what was genuinely useful.

The first finding, that the OAuth flow lacked a state parameter, was technically correct but overstated as High severity. The app receives its callback on a custom URI scheme (cozyreader://oauth), which the operating system routes to an app registered for that scheme rather than to a web page. The classic CSRF attack Cursor described depends on a browser session that does not exist in this flow. One caveat: a custom scheme is not exclusively bound to one app, and on Android another installed app can register the same scheme, so a state parameter remains reasonable hardening. The finding was not wrong, but calling it High for a native mobile app is the kind of severity inflation that leads developers to deprioritise the genuinely important stuff.

The second finding — a hardcoded OAuth client secret — was also technically correct and also not actionable. The comment at the top of the file already explains this: WordPress.com’s native app OAuth flow has always worked this way, their own mobile apps do the same, and the protection comes from redirect URI verification rather than secret confidentiality. Cursor either did not read the comment or did not weight it. Claude confirmed this immediately and we moved on.

The third finding was a real bug and a good catch. In the sync service, when we flush pending likes that were queued while the user was offline, the like and unlike functions return a boolean indicating success. The original code awaited the call but never checked the returned value — so an API-level failure (say, a 500 from the server) would be treated as success and the queued action would be silently dropped. Claude spotted exactly why this was wrong, explained the distinction between a thrown network error and a returned false, and fixed it properly. That one was worth the whole exercise.
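To illustrate the distinction, here is a minimal sketch of a flush loop that treats a returned false the same as a thrown error. The names (PendingLike, LikeApi, flushPendingLikes) and the queue shape are hypothetical, not the app's actual code; the only thing taken from the real project is the contract that like and unlike resolve to a boolean.

```typescript
// Hypothetical shape of a queued action; the real schema may differ.
type PendingLike = { id: number; postId: number; action: "like" | "unlike" };

// Assumed contract: resolves true on success, false on an API-level
// failure (for example a 500), and rejects on a network error.
type LikeApi = {
  like(postId: number): Promise<boolean>;
  unlike(postId: number): Promise<boolean>;
};

// Returns the actions that must stay queued for the next sync attempt.
async function flushPendingLikes(
  queue: PendingLike[],
  api: LikeApi,
): Promise<PendingLike[]> {
  const remaining: PendingLike[] = [];
  for (const item of queue) {
    let ok = false;
    try {
      ok = item.action === "like"
        ? await api.like(item.postId)
        : await api.unlike(item.postId);
    } catch {
      ok = false; // thrown network error: keep the action for retry
    }
    // The original bug was skipping this check and dequeuing regardless.
    if (!ok) remaining.push(item);
  }
  return remaining;
}
```

Only actions whose call resolved to true are dropped from the queue; everything else survives until the next sync.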

The fourth finding — permissive URL parsing in the OAuth callback — was low risk. The OS already constrains the URL shape before our code sees it, and the null-check on the code parameter handles the error case correctly. We left it alone.

The fifth finding — scroll progress not being clamped between 0 and 1 — was valid. On iOS with rubber-band scrolling you can get out-of-range values, and even on Android floating point imprecision near the bottom of a very short post could push the number fractionally above 1.0. One line of Math.min and Math.max, done.
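The clamp itself is roughly the following. The function name is hypothetical, and the NaN guard is an extra assumption of mine rather than something Cursor asked for: Math.min and Math.max pass NaN straight through, so it seemed worth handling before a value reaches the database.

```typescript
// Clamp a raw scroll progress value into the range [0, 1].
function clampProgress(raw: number): number {
  // Assumption: treat NaN (for example 0 / 0 on an empty post) as no progress.
  if (Number.isNaN(raw)) return 0;
  return Math.min(1, Math.max(0, raw));
}
```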

What Cursor Found — UI and Theming Pass

This is where the second opinion really earned its place.

The most significant finding was that the Add Blog screen was completely outside the theme system. Every colour value was hardcoded to the light palette — the warm cream and brown tones used when the app is in light mode. On a dark-mode phone, that screen would look like a jarring intruder from a different app. This had simply been overlooked during the original build. Cursor caught it, Claude verified it, and we rewired the entire screen to pull from the active theme in a single pass.
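The shape of that rewiring looks something like this sketch. The Theme type, its token names, and the hex values are all placeholders for illustration, not Cozy Reader's real tokens; the point is only that every colour is derived from the theme object passed in, with nothing hardcoded in the screen.

```typescript
// Hypothetical theme shape; the real token names may differ.
type Theme = {
  background: string;
  text: string;
  inputBackground: string;
  placeholder: string;
  buttonBackground: string;
  buttonText: string;
};

// Illustrative values only.
const lightTheme: Theme = {
  background: "#F5EFE6", text: "#3B2F2F", inputBackground: "#FFFFFF",
  placeholder: "#9A8F85", buttonBackground: "#6B4F3A", buttonText: "#FFFFFF",
};
const darkTheme: Theme = {
  background: "#1C1A17", text: "#EDE6DC", inputBackground: "#2A2723",
  placeholder: "#8A8178", buttonBackground: "#C9A27E", buttonText: "#1C1A17",
};

// Build the screen's style values from whichever theme is active.
function makeAddBlogStyles(theme: Theme) {
  return {
    screen: { flex: 1, backgroundColor: theme.background },
    label: { color: theme.text },
    input: { color: theme.text, backgroundColor: theme.inputBackground },
    button: { backgroundColor: theme.buttonBackground },
    buttonLabel: { color: theme.buttonText },
    // Intended for TextInput's placeholderTextColor prop.
    placeholderColor: theme.placeholder,
  };
}
```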

Cursor also flagged that the splash-hold screen (the view that appears while fonts and the database are loading) used a hardcoded light background. Since the theme provider has not mounted yet at that point in the render tree, you cannot use theme tokens there at all. But you can read the OS colour scheme directly, which is available anywhere. We added that one-liner and the light flash on dark-mode devices is gone.
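A sketch of the idea, kept as a pure function so it does not depend on the theme provider. The ColorScheme type mirrors the values React Native's useColorScheme hook can return; the function name and the two hex values are placeholders, not the app's real ones.

```typescript
// The values React Native's useColorScheme() can return.
type ColorScheme = "light" | "dark" | null | undefined;

// Placeholder colours for illustration.
const SPLASH_LIGHT = "#F5EFE6";
const SPLASH_DARK = "#1C1A17";

// Pick the splash-hold background from the OS scheme alone.
// Anything other than "dark" (including null) falls back to light.
function splashBackground(scheme: ColorScheme): string {
  return scheme === "dark" ? SPLASH_DARK : SPLASH_LIGHT;
}
```

In the component, the argument would come from calling useColorScheme() (imported from react-native) at the top of the splash-hold view, and the result would be used as the view's backgroundColor.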

The finding about block components falling back to light-theme tokens turned out to be a documented architectural reality rather than a bug. The codebase has a comment explaining exactly this. Every block receives a theme prop from the reader screen, so in practice no fallback ever fires. Cursor did not read that comment, or at least did not weight it. Claude knew it immediately because it had written the architecture.

The finding about hardcoded destructive colours was real but low priority. The red used for delete and error states was not in the theme token system, but it has adequate contrast on both light and dark backgrounds across all the palette combinations we tested. Worth noting, not worth rushing.

The Broader Lesson

Running two AI systems in sequence is not about distrust. Claude had already done significant self-review on this codebase — it caught several of its own issues across the build. But it is the same collaborator I have been working with for weeks, with full context of every decision. Cursor came in cold, with no assumptions, no loyalty to existing choices, and no memory of why things were done the way they were.

That combination turns out to be genuinely useful. Cursor was better at spotting things that had simply been skipped — the Add Blog screen, the splash-hold colour, the unchecked boolean return. Claude was better at evaluating severity, understanding the architecture, and knowing when a finding was real versus when it reflected a misunderstanding of how mobile OAuth works.

The workflow I would recommend: build with one AI, then bring in a second with no context and ask for a cold audit on specific categories. Then bring those findings back to your primary collaborator and verify each one against the actual code before acting. Do not just implement what the auditing AI suggests — verify it, understand it, and decide whether the severity rating makes sense in your specific context.

One more thing worth saying: both Claude and Cursor missed things. There will always be things both miss. The goal is not a perfect audit. It is a broader search surface than any single pass can cover.

What We Actually Fixed

To make this concrete, here is what changed as a direct result of the Cursor review.

The sync service now checks the boolean return from like and unlike API calls before dequeuing the pending action. A server-side failure no longer gets treated as success.

The scroll position tracker now clamps progress between 0 and 1 before writing to the database.

The settings connect button is now wrapped in a try/finally block so the loading spinner clears correctly even if the OAuth flow throws an unexpected error.
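The pattern is roughly the following, with hypothetical names: startOAuth stands in for whatever kicks off the OAuth flow, and setLoading for the state setter behind the spinner.

```typescript
// Run the OAuth flow with the spinner on, and always turn it off afterwards.
async function connectWithSpinner(
  startOAuth: () => Promise<void>,
  setLoading: (loading: boolean) => void,
): Promise<void> {
  setLoading(true);
  try {
    await startOAuth();
  } finally {
    // Runs on success and on throw, so the spinner can never get stuck.
    setLoading(false);
  }
}
```

The error itself still propagates to the caller, so it can be shown to the user; finally only guarantees the cleanup.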

The stats screen now computes average reading time by dividing the total by the count of valid rows, not the total row count, so malformed entries no longer bias the average downward.
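A minimal sketch of that calculation follows. The row shape and the definition of a valid row (a finite, positive number of seconds) are my assumptions for illustration; the real query and validity rule may differ.

```typescript
// Hypothetical row shape; null or NaN stands in for a malformed entry.
type ReadingRow = { readingSeconds: number | null };

// Average over valid rows only. Returns null when there is nothing to average.
function averageReadingSeconds(rows: ReadingRow[]): number | null {
  let sum = 0;
  let validCount = 0;
  for (const row of rows) {
    const s = row.readingSeconds;
    // Assumption: "valid" means a finite, positive duration.
    if (typeof s === "number" && Number.isFinite(s) && s > 0) {
      sum += s;
      validCount += 1;
    }
  }
  return validCount === 0 ? null : sum / validCount;
}
```

With rows of 120, null, 60 and NaN seconds, dividing by all four rows would give 45, while dividing by the two valid rows gives 90.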

The Add Blog screen is now fully theme-aware, using the active theme’s background, text, input field, placeholder, and button colours throughout.

The splash-hold screen now reads the OS colour scheme directly and uses the appropriate dark or light background before the theme system is available.

That is six concrete fixes from a forty-minute second-opinion exercise. A reasonable return.

Cozy Reader is open source and in active development. If you follow WordPress blogs and want a calm, book-like reading experience on Android, it is worth keeping an eye on.