
Refinement in Place

Nov 12, 2025 · 3 min read

I previously wrote about reflection as a training signal. It’s the idea that models reflect on their reasoning and explain in simple language the flaws they spot. Reflection is only useful if it leads somewhere. The next step is refinement, where systems use what they observe to make themselves better.

Refinement in place means improvement that happens while the system runs, not through retraining or redeployment. It’s the difference between a model that gets better when someone updates it, and one that gets better by using itself. The distinction is small but important. It shifts the focus from model building to model behavior, and how systems adapt as they operate.

We’ve seen this kind of shift before. Early compilers just translated code. Only when they began folding constants, unrolling loops, and optimizing hot paths did they become transformative. Language models are beginning a similar transition. Where compilers optimized deterministically, models do it probabilistically, but the principle is the same: a system using its own output as input for improvement. In software terms, it’s the move from periodic updates to continuous integration, from static behavior to adaptive execution.

Recent research points to this shift taking shape. Projects like MetaAgent demonstrate that a system can analyze its own reasoning traces, adjust prompts and tool use, and improve accuracy without updating model weights. In code, the pattern is even clearer. RustAssistant, for example, reached roughly 74% fix accuracy on real compiler errors by looping between generation and feedback. Kodezi Chronos scales that approach to entire repositories, fixing multi-file bugs through iterative debugging. These are a few examples of what refinement in place looks like in practice: systems that learn not from new data, but from their own operation.
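To make the loop concrete, here is a minimal sketch of that generate-and-check pattern. It is not the MetaAgent, RustAssistant, or Chronos code; call_model and run_checks are hypothetical stand-ins for an LLM call and for whatever produces the feedback, whether a compiler, a test suite, or a trace analyzer.

```python
from typing import Callable, Tuple

def refine_in_place(
    task: str,
    call_model: Callable[[str], str],
    run_checks: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> str:
    """Generate, check, and feed the errors back until the checks pass."""
    attempt = call_model(task)
    for _ in range(max_rounds):
        ok, errors = run_checks(attempt)   # external feedback: compiler, tests, traces
        if ok:
            return attempt                 # feedback is clean, stop refining
        # The error text becomes context for the next attempt.
        # No weights change; the loop improves the output, not the model.
        attempt = call_model(
            f"{task}\n\nPrevious attempt:\n{attempt}\n\nErrors:\n{errors}\n\nFix them."
        )
    return attempt
```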

The same idea is now visible in Foundry. A contract-review agent corrected itself after refining its prompt to include certain policies, and a data-pipeline assistant updated its own instructions after reflection and succeeded on the next attempt. No retraining, no new dataset, just a local loop where the error became the fix.
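These examples follow a slightly different shape: the fix is written back into the agent's standing instructions rather than a single retry. A rough sketch of that shape, again with hypothetical call_model and check helpers rather than the actual agents:

```python
from typing import Callable, List

def run_with_self_update(
    task: str,
    instructions: List[str],
    call_model: Callable[[str], str],
    check: Callable[[str], bool],
) -> str:
    """Run a task; on failure, distill the error into a new standing instruction."""
    prompt = "\n".join(instructions) + "\n\nTask: " + task
    result = call_model(prompt)
    if check(result):
        return result
    # Reflection step: ask the model to state, as a reusable rule, what went wrong.
    lesson = call_model(
        "This output failed its check:\n" + result +
        "\nState, in one sentence, a rule that would have prevented the failure."
    )
    # The rule folds back into the agent's own instructions, so the next run
    # (and every run after it) starts from the corrected state.
    instructions.append(lesson)
    return call_model("\n".join(instructions) + "\n\nTask: " + task)
```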

Refinement depends on reliable feedback: compilers, test suites, runtime traces, or user interactions. When signals are weak or ambiguous, self-refinement can drift or reinforce mistakes. Like any feedback system, refinement introduces noise before it stabilizes. Some corrections overshoot; others converge. That’s how compilers evolved too, through iteration, validation, and repair. AI systems will do the same. The difference now is that the loop runs inside the system itself, not as a step outside it.
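One plausible guard against that drift, sketched below with a hypothetical evaluate function that scores an instruction set against held-out checks, is to accept a refinement only when it measurably beats what it replaces:

```python
from typing import Callable, List

def accept_if_better(
    current: List[str],
    candidate: List[str],
    evaluate: Callable[[List[str]], float],
) -> List[str]:
    """Keep a refined instruction set only if it scores better on held-out checks."""
    # Corrections that overshoot score worse than the baseline and are dropped,
    # so the loop may oscillate briefly but cannot drift indefinitely.
    return candidate if evaluate(candidate) > evaluate(current) else current
```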

For developers, this changes the boundary between model design and model use. Not every improvement should require retraining or redeployment; some can happen live and fold back into use. As systems start refining themselves, developers move from writing logic to auditing change, and feedback becomes part of the design surface. Systems do not wait for the next release to get better, but evolve in front of you as they are used.

Reflection lets a system understand itself. Refinement lets it act on that understanding. Together they redefine how products are built. A system that can refine itself doesn’t just ship once, but keeps shipping every time it runs.