How Role Separation Reduced Execution Drift in Multi-Agent Systems
Lessons from building reliable AI agent workflows with Hermes and local LLMs
TL;DR
- Multi-agent systems become unstable when responsibilities overlap
- Stronger models do not automatically improve workflow convergence
- Shared context without ownership boundaries creates execution drift
- Separating Planner, Implementer, and Validator responsibilities significantly improved workflow stability
- The Implementer should apply contracts precisely, not redesign the system during execution
1. The Problem — Execution Drift in Multi-Agent Workflows
When I first started building AI agent systems, I assumed the main problem was model capability.
If the model became smarter, the workflow would become more reliable.
That assumption turned out to be incomplete.
While experimenting with local LLM-based coding agents using:
- Hermes Agent
- Claude Code CLI
- Ollama
- local Qwen models
- Discord-based orchestration
I repeatedly encountered the same failure pattern.
At first, using a single agent felt efficient.
The same agent would:
- plan tasks
- write code
- debug failures
- retry execution
- validate outputs
- redesign architecture during retries
Everything happened inside one large shared context.
Initially, this looked flexible.
But as workflows became larger, execution stability degraded rapidly.
I started seeing problems such as:
- endless retry loops
- inconsistent file structures
- duplicated abstractions
- rewritten interfaces during execution
- architectural drift between retries
- increasing divergence from the original task
The workflow often looked productive.
But convergence became worse over time.
Eventually, I realized I was not only dealing with model errors.
I was dealing with execution drift.
2. Stronger Models Still Drifted During Execution
One surprising realization was that stronger models did not fundamentally solve the problem.
Larger models often generated better local outputs.
However, workflow instability remained.
In some cases, stronger models amplified instability because they became more willing to reinterpret previous decisions during execution.
For example, an Implementer agent might:
- rename directories during retries
- introduce new abstractions mid-execution
- redefine interfaces that were already agreed upon
- restructure unrelated components while fixing a local issue
At first, this behavior appeared intelligent.
The model looked proactive and adaptive.
However, execution reliability became worse.
The model attempted to optimize locally during retries.
Instead of treating the existing structure as a fixed contract, it continuously searched for “better” architectures.
As retries accumulated, small local optimizations gradually destabilized the workflow itself.
Eventually, the workflow became harder to reason about after every retry.
This led me to an important realization:
Reliability problems in multi-agent systems are often coordination problems.
The workflow was unstable not because the model was incapable, but because responsibilities were unclear.
3. Shared Context Without Ownership Creates Instability
One of the biggest problems in multi-agent systems is uncontrolled shared context.
At first, shared memory feels efficient because every agent can access the same information.
However, in practice, this often removes ownership boundaries.
Once ownership becomes unclear, responsibilities begin overlapping.
For example:
- the Planner modifies implementation details
- the Implementer redesigns architecture decisions
- the Validator proposes alternative execution strategies
- retry loops introduce conflicting interpretations
Eventually, the workflow loses convergence.
The issue is not that the agents are unintelligent.
The issue is that every agent is allowed to make every type of decision.
This creates architectural instability.
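The difference between uncontrolled sharing and ownership can be sketched in a few lines. This is a minimal illustration, not any framework's actual API: the context fields, role names, and `scoped_view` helper are all hypothetical, but they show the shape of the boundary: every role may read the shared state, while writes are restricted to the keys that role owns.

```python
from types import MappingProxyType

# One shared context that, without boundaries, every agent could rewrite freely.
shared = {
    "plan": ["step 1", "step 2"],
    "code": {},
    "validation": [],
}

def scoped_view(context: dict, writable: set):
    """Return a read-only view of the context plus a writer restricted to owned keys."""
    def write(key, value):
        if key not in writable:
            # Writes outside the role's ownership are rejected, not merged.
            raise PermissionError(f"'{key}' is not owned by this role")
        context[key] = value
    return MappingProxyType(context), write

# Each role reads everything but writes only what it owns.
planner_read, planner_write = scoped_view(shared, writable={"plan"})
impl_read, impl_write = scoped_view(shared, writable={"code"})

impl_write("code", {"main.py": "..."})  # allowed: the Implementer owns "code"
# impl_write("plan", [])  # would raise PermissionError: "plan" is Planner-owned
```

The point is not the mechanism but the asymmetry: read access can stay broad while write access stays narrow, which is exactly the boundary that uncontrolled shared memory erases.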
While debugging these workflows, I realized the problem felt surprisingly familiar.
It resembled a classic problem from object-oriented design.
4. The Object-Oriented Design Parallel
In object-oriented programming, responsibility separation is considered one of the most important design principles.
The same idea appears repeatedly in concepts such as:
- Single Responsibility Principle (SRP)
- high cohesion
- low coupling
- ownership boundaries
The core idea is simple:
Systems become difficult to reason about when responsibilities overlap.
That idea closely matched what I was seeing in agent systems.
In traditional software systems:
- a service should not own every responsibility
- a class should not make every decision
- a module should not redefine another module’s contract
The same pattern appeared inside multi-agent workflows.
When every agent could:
- plan
- implement
- redesign
- validate
- reinterpret contracts during retries
workflow stability degraded rapidly.
At some point, I stopped thinking about agents as “smart tools.”
I started thinking about them as independently evolving components inside a distributed system.
That perspective changed how I designed workflows afterward.
5. The Implementer Should Not Redesign the System
One specific failure pattern repeatedly caused instability in my workflows.
The Implementer agent would begin modifying architectural decisions during execution.
For example:
- changing directory structures during retries
- introducing new abstractions unrelated to the original task
- rewriting task boundaries while fixing local errors
- redefining interfaces that other agents already depended on
Again, this looked intelligent and proactive at first, but execution reliability became significantly worse.
Every retry introduced additional design changes.
As those changes accumulated, the workflow continuously drifted away from the original contract.
Eventually, retries stopped behaving like recovery mechanisms.
They became architecture mutation loops.
This problem became especially severe when the Implementer shared the same broad context as the Planner.
The Implementer gradually started behaving like another Planner.
That overlap destabilized the workflow.
Eventually, I realized the problem resembled a classic object-oriented design issue.
An object becomes difficult to reason about when it owns too many responsibilities.
The same pattern appeared in agent systems.
The Implementer should not make new decisions during execution.
Its role is to apply the already defined contract as precisely as possible.
Once I separated:
- planning responsibilities
- execution responsibilities
- validation responsibilities
workflow convergence improved significantly.
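One way to make "the Implementer applies the contract, and nothing else" concrete is to freeze the Planner's decisions into a data structure and check every proposed edit against it. The following is a minimal sketch under assumptions of my own: the `Contract` fields, the file paths, and `implementer_apply` are hypothetical, not part of Hermes or any library.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Contract:
    """Decisions owned by the Planner; the Implementer may not change them."""
    task: str
    allowed_paths: frozenset                     # files the Implementer may touch
    interfaces: dict = field(default_factory=dict)  # name -> fixed signature

def implementer_apply(contract: Contract, proposed_edits: dict) -> dict:
    """Accept only edits that stay inside the contract's ownership boundary."""
    violations = [p for p in proposed_edits if p not in contract.allowed_paths]
    if violations:
        # Out-of-scope edits are rejected rather than silently applied:
        # redesign belongs to the Planner, not the Implementer.
        raise PermissionError(f"edits outside contract: {violations}")
    return proposed_edits

contract = Contract(
    task="add retry backoff to the HTTP client",
    allowed_paths=frozenset({"client/http.py", "tests/test_http.py"}),
)
edits = implementer_apply(contract, {"client/http.py": "...patched code..."})
```

Making the contract a frozen value is the design choice that matters: the Implementer can read it but has no channel through which to renegotiate it mid-execution.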
6. The Harness Layer — Controlling Convergence
Role separation alone does not guarantee convergence.
The workflow still requires a control layer that verifies whether execution remains aligned with the original contract.
That became the responsibility of the Harness layer.
The Harness layer acts as a convergence controller.
It determines:
- whether retries should continue
- whether execution drift exceeded acceptable boundaries
- whether rollback is necessary
- whether the workflow should terminate
For example, if retries continuously modified unrelated files or redefined existing interfaces, the Harness layer treated the execution as divergence rather than recovery.
That distinction became important.
Without convergence control, retries often amplified instability instead of resolving failures.
The Harness layer then managed:
- retries
- convergence loops
- execution stabilization
- workflow validation
This architecture became significantly more stable than relying on a single highly capable agent operating inside a large shared context.
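The Harness decision above can be sketched as a small classifier over retry signals. Everything here is a hypothetical simplification of my setup: the drift signals (`touched_files`, `interfaces_changed`) and the verdict names are illustrative, and a real controller would track more state than one step sees.

```python
from enum import Enum

class Verdict(Enum):
    CONTINUE = "continue"   # retry again
    ROLLBACK = "rollback"   # drift exceeded bounds; restore last good state
    STOP = "stop"           # retry budget exhausted; terminate the workflow

def harness_step(attempt: int, touched_files: set, allowed_files: set,
                 interfaces_changed: bool, max_attempts: int = 5) -> Verdict:
    """Classify one retry as convergence (keep going) or divergence (roll back)."""
    out_of_scope = touched_files - allowed_files
    if out_of_scope or interfaces_changed:
        # Edits to unrelated files or redefined interfaces are treated
        # as divergence, not recovery.
        return Verdict.ROLLBACK
    if attempt >= max_attempts:
        return Verdict.STOP
    return Verdict.CONTINUE
```

The useful property is that the drift check runs before the retry budget check: a retry that mutates the architecture is rolled back immediately, no matter how many attempts remain.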
7. My Current Multi-Agent Structure
My current workflows are increasingly organized around ownership boundaries.
A simplified structure looks like this:
PM Agent
↓
Planner Agent
↓
Implementer Agent
↓
Validator Agent
↓
Harness Layer
Each role owns a different category of decisions.
That ownership is important.
Planner
Responsible for:
- execution strategy
- task decomposition
- contract definition
But not responsible for execution changes during runtime.
Implementer
Responsible for:
- applying predefined contracts
- writing code
- executing tasks precisely
But not responsible for redesigning architecture.
Validator
Responsible for:
- invariant verification
- semantic validation
- execution correctness checks
But not responsible for redefining execution strategy.
As ownership boundaries became clearer, workflow behavior became significantly easier to reason about.
Execution drift decreased.
Retries became more predictable.
And convergence stability improved substantially.
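The pipeline above can be wired as a chain of one-way handoffs. This is a toy sketch, not my actual orchestration code: the stage callables, the stub agents, and the contract fields are all stand-ins chosen to show the ownership split.

```python
def run_workflow(goal, plan, implement, validate, harness, max_retries=3):
    """Each stage owns one category of decisions; data flows one way."""
    contract = plan(goal)                       # Planner: strategy + contract
    for attempt in range(1, max_retries + 1):
        artifact = implement(contract)          # Implementer: apply, never redesign
        ok = validate(contract, artifact)       # Validator: check invariants
        if harness(attempt, ok):                # Harness: accept or keep retrying
            return artifact
    raise RuntimeError("workflow failed to converge")

# Toy usage with stub agents standing in for LLM calls:
contract_spec = {"task": "uppercase names", "expected": ["ALICE", "BOB"]}
result = run_workflow(
    goal="uppercase names",
    plan=lambda goal: contract_spec,
    implement=lambda c: [n.upper() for n in ["alice", "bob"]],
    validate=lambda c, out: out == c["expected"],
    harness=lambda attempt, ok: ok,
)
```

Note that the Implementer never sees the goal, only the contract, and the Validator never produces an artifact, only a judgment. The boundaries live in the function signatures, not in prompt instructions.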
8. Reliability Comes From Ownership Boundaries
One of the biggest misconceptions about AI agents is that reliability comes only from intelligence.
In practice, reliability often comes from constrained responsibilities.
The same principle already exists in software engineering.
Distributed systems become more stable when responsibilities are isolated.
Database systems become safer when transactional boundaries are explicit.
Microservices reduce instability by limiting ownership scope.
Multi-agent systems appear to follow similar patterns.
Without boundaries:
- every agent becomes a planner
- every retry becomes a redesign
- every execution becomes negotiation
As workflows become more complex, instability grows quickly.
Role separation reduces that instability because ownership becomes predictable.
The more complex the workflow became, the more important role boundaries became for maintaining convergence.
9. The Future — Reliability Engineering for Agent Systems
I increasingly believe we are entering a new phase of AI system design.
Earlier generations of AI systems focused heavily on:
- prompts
- model quality
- tool integration
- inference capability
Those layers are still important.
However, as workflows become more autonomous, coordination and ownership also become architectural concerns.
Instead of only asking:
“Which model should execute this task?”
We may increasingly need to ask:
“Which role should own this decision?”
That shift feels important, because many of the hardest problems in agent systems are no longer only about generation quality.
They are increasingly about:
- coordination
- ownership
- execution boundaries
- convergence stability
As agent workflows become more autonomous, reliability engineering may increasingly become an exercise in defining ownership boundaries between agents.
Conclusion
One unexpected realization from building multi-agent systems was how familiar the failures looked.
Execution drift, responsibility overlap, and uncontrolled redesign during retries resembled classic software engineering problems.
In many ways, multi-agent workflows began behaving like distributed object systems.
The same lessons appeared again:
- unclear ownership creates instability
- overlapping responsibilities reduce predictability
- uncontrolled autonomy weakens convergence
The Implementer should not redesign the system during execution.
Its responsibility is to apply the already defined contract as precisely as possible.
That separation turned out to be one of the biggest improvements in workflow stability.
Ironically, some of the most important ideas for future AI systems may not be entirely new.
Software engineering has already spent decades learning how to build stable systems through responsibility separation and ownership boundaries.
Now, those principles appear to be emerging again inside agent systems.
