A well-crafted system prompt will improve the quality of code produced by your coding assistant. It does make a difference. If you provide guidelines in your system prompt for writing code and tests, coding assistants will follow the guidelines.
Although that depends on your definition of "will follow." If your definition is "will follow some of the time," then it's accurate. If your definition is "will follow always" or even "will follow most of the time," then it's inaccurate (unless you've found a way to make them reliable that I haven't; please let me know).
Coding agents will ignore instructions in the system prompt on a regular basis. As the context window fills up and starts to intoxicate them, all bets are off.
Even with the latest Opus 4.5 model, I haven't noticed a meaningful improvement. So if we can't rely on models to follow system prompts, we need to invest in feedback loops.
I'll show you how I'm using Claude Code hooks to implement automatic code review on all AI-generated code so that code quality is higher before it reaches the human in the loop.
| You can find a code example that demonstrates the concepts discussed in this post on my GitHub. |
Auto Code Review for Fast, Semantic Feedback
When I talk about auto code review in this post, I'm describing a fast feedback mechanism intended to review common code quality issues. It will run whenever Claude has finished making edits, so it needs to be fast and efficient.
I also use coding assistants for detailed code reviews when reviewing a PR, for example. That might spin up multiple subagents and take a bit longer. That's not what I'm talking about here.

The goal of the auto code review is to reinforce what's in your system prompt, project documentation, and on-demand skills. Things that Claude may have ignored. Part of a multipronged approach.
Wherever possible, I recommend using your lint and test rules to bake in quality, and leave auto code review for more semantic issues that tools can't check.
If you want to set a maximum length for your files or a maximum level of indentation, then use your lint tool. If you want to enforce a minimum test coverage, use your test framework.
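For example, in a TypeScript project using ESLint and Jest (an assumed stack; yours may differ), those checks can be expressed as plain configuration rather than review instructions:

```js
// eslint.config.mjs: a minimal sketch using standard ESLint core rules
export default [
  {
    rules: {
      // Cap file length and nesting depth so the linter enforces them on every run
      "max-lines": ["error", { max: 300, skipBlankLines: true, skipComments: true }],
      "max-depth": ["error", 3],
    },
  },
];
```

```js
// jest.config.mjs: fail the test run if coverage drops below a minimum threshold
export default {
  collectCoverage: true,
  coverageThreshold: {
    global: { lines: 80, branches: 70 },
  },
};
```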
Semantic Code Review
A semantic code review looks at how well the code is designed. For example, naming: Does the code accurately describe the business concepts it represents?
AI will often default to names like "helper" and "utils." But AI is also good at understanding the nuance and finding better names if you challenge it, and it can do so quickly. So this is a good example of a semantic rule.
You can ban certain words like "helper" and "utils" with lint tools. (I recommend doing that.) But that won't catch everything.
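If you are using ESLint (again, an assumption about the stack), the built-in `id-denylist` rule can reject those identifiers outright:

```js
// eslint.config.mjs: reject vague identifiers; the list here is illustrative
export default [
  {
    rules: {
      "id-denylist": ["error", "helper", "helpers", "util", "utils", "data", "misc"],
    },
  },
];
```

It only matches exact identifiers, so names like `userHelper` or a vaguely named file still slip through, which is where the semantic review earns its keep.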
Another example is logic leaking out of the domain model. When a use case/application service queries an entity and then makes a decision based on its state, it's highly likely your domain logic is leaking into the application layer. Not so easy to catch with lint tools, but worth addressing.
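A hypothetical TypeScript illustration (names are invented for this example):

```ts
type OrderStatus = "pending" | "shipped" | "delivered" | "cancelled";

// Leaky: the application service reads entity state and applies the business rule itself.
class LeakyCancelOrderService {
  cancel(order: { status: OrderStatus }): void {
    if (order.status === "shipped" || order.status === "delivered") {
      throw new Error("Order can no longer be cancelled");
    }
    order.status = "cancelled";
  }
}

// Better: the rule lives inside the Order entity, and the service just orchestrates.
class Order {
  private status: OrderStatus = "pending";

  cancel(): void {
    if (this.status === "shipped" || this.status === "delivered") {
      throw new Error("Order can no longer be cancelled");
    }
    this.status = "cancelled";
  }
}
```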

Another example is default fallback values. When Claude has an undefined value where a value is expected, it will set a default value. It seems to hate throwing exceptions or challenging the type signature and asking, "Should we allow undefined here?" It wants to make the code run no matter what, and no matter how much the system prompt tells it not to.

You can catch some of this with lint rules, but it's very nuanced and depends on the context. Sometimes falling back to a default value is appropriate.
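A small TypeScript sketch of the difference (invented names): the silent fallback hides the missing value, while the alternatives force an explicit decision:

```ts
// Typical AI-generated fallback: masks the missing value and keeps the code running.
function getDiscountRate(config: { discountRate?: number }): number {
  return config.discountRate ?? 0; // silently assumes 0 is a safe default
}

// Alternative 1: fail loudly so the missing value gets noticed.
function getDiscountRateStrict(config: { discountRate?: number }): number {
  if (config.discountRate === undefined) {
    throw new Error("discountRate is missing from config");
  }
  return config.discountRate;
}

// Alternative 2: widen the signature and make the caller decide what undefined means.
function getDiscountRateOptional(config: { discountRate?: number }): number | undefined {
  return config.discountRate;
}
```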
Building an Auto Code Review with Claude Hooks
If you're using Claude Code and want to build an auto code review for checks that you can't easily define with lint or testing tools, then one solution is to configure a script that runs on the Stop hook.
The Stop hook fires when Claude has finished working and passes control back to the user to make a decision. So here, you can trigger a subagent to perform the review on the modified files.
To trigger the subagent, you need to return an error status code, which blocks the main agent and forces it to read the output.
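As a rough sketch (the script path is a placeholder of mine; check the Claude Code hooks documentation for the exact schema), the wiring in `.claude/settings.json` looks something like this:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "node .claude/hooks/auto-review.mjs"
          }
        ]
      }
    ]
  }
}
```

As I understand the hook contract, exiting with code 2 blocks Claude from stopping and feeds whatever the script wrote to stderr back to it, which is what forces it to act on the review.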

I think it's generally considered a best practice to use a subagent focused on the review with a very critical mindset. Asking the main agent to mark its own homework is clearly not a great approach, and it will burn up your context window.
| The solution I use is available on GitHub. You can install it as a plugin in your repo and customize the code review instructions, or just use it as inspiration for your own solution. Any feedback is welcome. |
In the example above, you can see it took 52 seconds. Probably quicker than me reviewing and providing the feedback myself. But that's not always the case. Sometimes it can take a few minutes.
If you're sitting there blocked, waiting for the review, this might be slower than doing it yourself. But if you're not blocked and are working on something else (or watching TV), this saves you time because the end result will be higher quality and require less of your time to review and fix.
Scanning for Updated Files
I want my auto code review to only review files that have been modified since the last pull request. But Claude doesn't provide this information in the context given to the Stop hook.
I can find all modified or unstaged files using Git, but that's not sufficient.
What I do instead is hook into PostToolUse, keeping a log of each modified file.
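Here's a simplified sketch of that logger (file names are placeholders, and the exact shape of the JSON that Claude passes on stdin is worth verifying against the hooks docs). It's registered under PostToolUse in settings with a matcher such as `Edit|Write`, and it appends each touched file path to a log:

```ts
// .claude/hooks/log-modified-file.ts (placeholder name)
// Reads the PostToolUse payload from stdin and appends the edited file path to a log.
import { appendFileSync, mkdirSync } from "node:fs";

const LOG_FILE = ".claude/modified-files.log"; // placeholder location

let raw = "";
process.stdin.on("data", (chunk) => (raw += chunk));
process.stdin.on("end", () => {
  const payload = JSON.parse(raw);
  // Assumption: Edit/Write payloads expose the path as tool_input.file_path.
  const filePath: string | undefined = payload?.tool_input?.file_path;
  if (filePath) {
    mkdirSync(".claude", { recursive: true });
    appendFileSync(LOG_FILE, `${filePath}\n`);
  }
});
```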

When the Stop hook is triggered, the review will find the files modified since the last review and ask the subagent to review only those. If there are no modified files, the code review isn't activated.
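A sketch of the Stop-hook side (again with placeholder paths; how you phrase the review instructions is up to you): read the log, dedupe it, and block only when there is something to review:

```ts
// .claude/hooks/auto-review.mjs (placeholder name), run by the Stop hook.
import { existsSync, readFileSync, writeFileSync } from "node:fs";

const LOG_FILE = ".claude/modified-files.log";

const files = existsSync(LOG_FILE)
  ? [...new Set(readFileSync(LOG_FILE, "utf8").split("\n").filter(Boolean))]
  : [];

if (files.length === 0) {
  process.exit(0); // nothing changed since the last review, so let Claude stop normally
}

writeFileSync(LOG_FILE, ""); // reset the log so the next run only sees new changes

// Exit code 2 blocks the stop and sends stderr back to Claude, prompting it to
// launch the code-review subagent on the listed files.
console.error(
  `Use the code-review subagent to review these files before finishing:\n${files.join("\n")}`
);
process.exit(2);
```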
Challenges with the Stop Hook
Unfortunately, the Stop hook isn't 100% reliable for this use case, for a couple of reasons. Firstly, Claude might stop to ask a question, e.g. for you to clarify some requirements. You might not want the auto review to trigger here until you've answered Claude and it has finished.
The second reason is that Claude can commit changes before the Stop hook runs. So by the time the subagent performs the review, the changes are already committed to Git.
That might not be a problem, and there are simple ways to resolve it if it is. It's just extra things to keep in mind and set up.
The ideal solution would be for Anthropic (or other tool vendors) to give us hooks that are higher level in abstraction: more aligned with the software development workflow, not just low-level file modification operations.
What I'd really love is a CodeReadyForReview hook which provides all of the files that Claude has modified. Then we could throw away our custom solutions.
Let Me Know If You Have a Better Approach
I don't know if I'm not looking in the right places or if the information just isn't out there, but I feel like this solution is fixing a problem that should already be solved.
I'd be really grateful if you can share any advice that helps to bake in code quality before the human in the loop has to review it.
Until then, I'll continue to use this auto code review solution. When you're giving AI some autonomy to implement tasks and reviewing what it produces, this is a useful pattern that can save you time and reduce the frustration of having to repeat the same feedback to the AI.
