Code review is one of those things every developer knows they should do carefully, and that most developers do less carefully than they intend to. You are tired, the PR has been sitting for two days, you scan it quickly and leave a comment about variable names while missing the authentication bug on line 94.
Claude will not replace your teammates’ review. It does not know your product requirements, your team’s conventions, or the business context behind a decision. But it will catch a lot of things your tired eyes miss, and it will do it before you even open a PR.
Here is how to actually use it.
The basic flow
The simplest version: get your diff, paste it in, ask for a review.
git diff main > review.txt
Then paste the contents of review.txt into Claude with this prompt:
“Review this code diff. Look for bugs, edge cases, security issues, and anything that seems like it might behave differently than the author intended. Be specific about line numbers and explain your reasoning.”
That is it. Claude will go through the diff section by section and flag things.
Making the prompt better
The generic prompt gets you a generic review. Specificity gets you more value.
Tell Claude what the code is supposed to do:
“This diff adds a password reset flow to a Node.js Express app. The user receives a token by email, visits a reset link, and sets a new password. Review for security issues, token handling, and race conditions.”
Now Claude knows what to look for.
Tell Claude your stack and constraints:
“This is TypeScript with Prisma ORM. We do not use any error boundary library — all errors are handled manually. Flag any unhandled promise rejections or missing error handling.”
Ask for specific categories:
“Focus on: 1) security vulnerabilities, 2) edge cases that could throw errors in production, 3) any async/await issues. Skip style comments.”
Without that last line, Claude will often point out that your variable name could be more descriptive. Useful, but not what you needed right now.
A real example
Here is a simplified version of a piece of code I ran through Claude recently:
```typescript
async function resetPassword(token: string, newPassword: string) {
  const user = await db.user.findFirst({
    where: { resetToken: token }
  });
  if (!user) throw new Error("Invalid token");
  await db.user.update({
    where: { id: user.id },
    data: { password: hash(newPassword), resetToken: null }
  });
}
```
What I thought was the issue: maybe the token comparison needed to be constant-time.
What Claude caught that I missed:
- No check for token expiry — a token generated a week ago would still work
- `findFirst` with no expiry filter means that if there are multiple users with the same token (unlikely but possible with a short token space), it picks the first one arbitrarily
- No validation on `newPassword`: someone could set an empty string as their password
I knew the token expiry issue in the abstract but had not implemented it yet and forgotten to add the check. The other two were genuine blind spots.
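For reference, here is one way those three fixes might look. This is a sketch against a hypothetical in-memory store rather than the real Prisma `db`, and the `resetTokenExpiresAt` column and the eight-character minimum are assumptions, not the app's actual rules:

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the database row; `resetTokenExpiresAt`
// is an assumed extra column for fix 1.
interface User {
  id: number;
  password: string;
  resetToken: string | null;
  resetTokenExpiresAt: Date | null;
}

const users: User[] = [
  {
    id: 1,
    password: "old-hash",
    resetToken: "abc123",
    resetTokenExpiresAt: new Date(Date.now() + 60 * 60 * 1000), // valid for 1h
  },
];

// Placeholder hash for the sketch; use bcrypt or argon2 in real code.
function hash(pw: string): string {
  return createHash("sha256").update(pw).digest("hex");
}

async function resetPassword(token: string, newPassword: string): Promise<void> {
  // Fix 3: validate the new password before touching anything
  if (newPassword.length < 8) throw new Error("Password too short");

  // Fixes 1 and 2: the lookup filters on expiry as well as the token,
  // so a stale token never matches
  const now = new Date();
  const user = users.find(
    (u) =>
      u.resetToken === token &&
      u.resetTokenExpiresAt !== null &&
      u.resetTokenExpiresAt > now
  );
  if (!user) throw new Error("Invalid or expired token");

  user.password = hash(newPassword);
  user.resetToken = null; // single-use: clear the token on success
  user.resetTokenExpiresAt = null;
}
```

The same shape translates directly back to Prisma: add the expiry column to the `where` clause of `findFirst` and validate the password before the query runs.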
What Claude catches well
- Missing null checks and undefined access. “You call `user.email` here but `user` could be null if the query returns nothing.”
- Unhandled promises and missing awaits. Claude reads async code carefully.
- SQL injection vectors, XSS, and common security patterns. It knows OWASP and applies it.
- Off-by-one errors and boundary conditions. Especially in loops and slices.
- Logic inconsistencies. “This condition checks for admin but the function is only called for regular users — this branch can never be reached.”
- Missing error handling. Functions that throw but callers that do not handle it.
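The missing-await item deserves a concrete illustration. This is a hypothetical handler, not code from the diff above; the bug is that the handler “responds” before the save has actually completed, and any rejection from the save escapes as an unhandled rejection:

```typescript
// A log lets us observe the ordering instead of guessing at it.
const log: string[] = [];

async function saveToDb(): Promise<void> {
  await Promise.resolve(); // stands in for a real async write
  log.push("saved");
}

// BUG: saveToDb() is fired and forgotten, so "responded" is logged
// before the save finishes.
async function buggyHandler(): Promise<void> {
  saveToDb();
  log.push("responded");
}

// Fixed: the await makes the handler wait for the write.
async function fixedHandler(): Promise<void> {
  await saveToDb();
  log.push("responded");
}
```

Run the buggy version and the log reads `["responded", "saved"]`; the fixed version gives `["saved", "responded"]`. This is exactly the kind of ordering bug that is hard to see in a diff and easy for a careful async reader to flag.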
What Claude misses
Be honest with yourself about this — Claude is not a full substitute for human review.
Business logic bugs. If the rule is “users on the free plan can only have 3 projects” and the code allows 4, Claude does not know that unless you told it. It only knows what the code does, not what it was supposed to do.
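A toy illustration of why this class of bug is invisible from the code alone (the plan limit and function here are hypothetical):

```typescript
// The product rule says free-plan users get at most 3 projects.
// The constant below allows a 4th -- but nothing in the code marks
// that as wrong, so a reviewer without the spec cannot catch it.
const FREE_PLAN_MAX_PROJECTS = 4; // BUG per spec: should be 3

function canCreateProject(currentCount: number): boolean {
  return currentCount < FREE_PLAN_MAX_PROJECTS;
}
```

A user with 3 projects passes this check and creates a 4th. The code is internally consistent; only the spec says it is wrong, which is why you have to paste the spec in alongside the diff.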
Architectural decisions. “Should this be a background job instead of synchronous?” requires product and system context Claude does not have.
Test coverage gaps. Claude can look at tests you wrote, but it cannot tell you that you forgot to test the case where the user is logged out mid-session unless you describe that scenario.
Performance at scale. “This query works fine” is true until you have 10 million rows. Claude does not always flag N+1 query patterns unless they are egregious.
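Here is the N+1 shape in miniature, with hypothetical in-memory stand-ins for the ORM queries so the round trips can be counted:

```typescript
// Fake "database" and a counter standing in for round trips.
const authors = new Map<number, string>([[1, "ada"], [2, "grace"]]);
let queryCount = 0;

// One round trip per call -- the N in N+1.
async function getAuthor(id: number): Promise<string> {
  queryCount++;
  return authors.get(id)!;
}

// One batched round trip, like a WHERE id IN (...) query.
async function getAuthors(ids: number[]): Promise<Map<number, string>> {
  queryCount++;
  return new Map(ids.map((id) => [id, authors.get(id)!]));
}

const posts = [
  { title: "a", authorId: 1 },
  { title: "b", authorId: 2 },
  { title: "c", authorId: 1 },
];

// N+1 version: a query inside the loop, one per post.
async function renderNPlusOne(): Promise<string[]> {
  const out: string[] = [];
  for (const p of posts) {
    out.push(`${p.title} by ${await getAuthor(p.authorId)}`);
  }
  return out;
}

// Batched version: fetch all authors up front, then join in memory.
async function renderBatched(): Promise<string[]> {
  const byId = await getAuthors(posts.map((p) => p.authorId));
  return posts.map((p) => `${p.title} by ${byId.get(p.authorId)}`);
}
```

Both versions produce the same output, but the first issues one query per post while the second issues one total. At three posts nobody notices; at ten million rows the difference is the incident report.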
Building it into your workflow
The best time to run a Claude review is right before you open a PR, not after it is already up and the human review cycle has started.
A simple shell alias:
alias ai-review='git diff main | pbcopy && echo "Diff copied. Paste into Claude."'
Or if you use Claude Code, you can write a skill that runs the diff and pipes it directly:
claude /review-pr
With a SKILL.md that does the diff, pastes it in context, and asks Claude to review it with your team’s specific focus areas.
The mindset shift
The biggest change is treating AI review as a first pass, not a final one. Run it before you submit, not instead of getting a teammate’s eyes on it.
Think of it as the equivalent of running a linter before you push. Lint does not replace review. But it catches a class of problems automatically so your reviewer can focus on the things that actually require human judgment.
Claude does the same thing, just for a broader class of problems.
Related: How Developers Actually Use Claude Every Day · AI Mistakes When Building Apps (And How to Fix Them)