๊ด€๋ฆฌ ๋ฉ”๋‰ด

์žก๋™์‚ฌ๋‹ˆ

Why Prompt Engineering Alone Fails in LLM Systems (And How to Fix It with Convergence) ๋ณธ๋ฌธ

IT/AI

Why Prompt Engineering Alone Fails in LLM Systems (And How to Fix It with Convergence)

yeTi 2026. 4. 13. 16:53

Lessons learned from building a real-world LLM coding agent with local models

๐Ÿ“Œ TL;DR

  • LLMs are non-deterministic โ†’ same input, different outputs
  • Pipeline architectures amplify failure probabilities
  • Prompt engineering improves outputs but cannot guarantee reliability
  • The real solution is not better prompts, but convergence systems

1. Problem โ€” You Canโ€™t Even Get Stable Outputs

I wanted to build a local LLM-powered coding assistant.

So I set up:

  • Mac Studio
  • Ollama
  • Claude Code CLI
  • qwen3.5

Then I tried the simplest possible task:

Build a simple API

But the results were unstable:

  • Sometimes no output at all
  • Sometimes excessive file exploration (over-exploration)
  • Sometimes the task never completed

The problem wasnโ€™t correctness.

The problem was that I couldnโ€™t reliably get results at all.

2. Observation โ€” Small Tasks Work

After multiple attempts, I noticed a pattern:

Local LLMs perform much better on small, well-defined tasks.

For example:

  • Implementing a single function
  • Fixing a specific bug
  • Tasks with clear input/output

This led to an important insight:

โ€œBreak the problem down into smaller pieces.โ€

3. Approach โ€” Role Decomposition

Instead of one large prompt, I split the task into stages:

[Analyze] โ†’ [Design] โ†’ [Implement]

Each step:

  • Has a narrow scope
  • Produces structured output
  • Can be validated

This significantly improved success rates (in manual runs).
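The split can be expressed as one narrow prompt template per stage. A minimal sketch — the template wording and the `build_stage_prompt` helper are hypothetical, not the exact prompts from this experiment:

```python
# One narrow, structured prompt per stage (template wording is hypothetical).
STAGE_PROMPTS = {
    "analyze":   "List the concrete requirements for this task:\n{task}",
    "design":    "Propose a file/function layout for these requirements:\n{analysis}",
    "implement": "Write the code for this design, no TODOs:\n{design}",
}

def build_stage_prompt(stage, **context):
    """Fill in the template for a single, narrowly scoped stage."""
    return STAGE_PROMPTS[stage].format(**context)

prompt = build_stage_prompt("analyze", task="Build a simple API")
```

Keeping each template narrow is what makes each stage's output small enough to validate.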

4. Scaling Up โ€” Pipeline Automation

Naturally, the next step was:

โ€œLetโ€™s automate this workflow.โ€

So I built a pipeline:

User Input
   โ†“
[Analyze] โ†’ [Design] โ†’ [Implement]
   โ†“
 Final Output
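The wiring above can be sketched as a plain function chain. The stage functions here are toy stand-ins for LLM calls; the point to notice is that nothing catches a failed stage:

```python
def run_pipeline(user_input, stages):
    """Linear pipeline: each stage consumes the previous stage's output.
    An exception in any stage aborts the whole run -- there is no recovery."""
    result = user_input
    for stage in stages:
        result = stage(result)  # one failure here kills everything downstream
    return result

# Toy stand-in stages: a real run would call the LLM at each step.
stages = [
    lambda task: f"analysis of: {task}",
    lambda analysis: f"design from: {analysis}",
    lambda plan: f"code for: {plan}",
]
output = run_pipeline("Build a simple API", stages)
```

This fragility is exactly what the next section runs into.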

5. Problem โ€” The Pipeline Breaks Easily

After automation, new issues appeared:

  • Sometimes it works
  • Sometimes it completely fails

The key issue:

A single failure breaks the entire pipeline.

6. Why Pipelines Fail

6.1 LLMs Are Non-Deterministic

Unlike traditional systems:

  • Same input → same output: not guaranteed
  • Same input → a distribution of possible outputs
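To make the non-determinism concrete, here is a toy sketch of temperature sampling, the decoding step where the randomness enters. The logits are made-up scores for three candidate tokens, not output from any real model:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0):
    """Sample one token index, as an LLM decoder does at temperature > 0."""
    probs = softmax(logits, temperature)
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Identical input, repeated calls: the sampled token varies between runs.
logits = [2.0, 1.5, 0.5]  # made-up scores for three candidate tokens
outputs = [sample_token(logits) for _ in range(10)]
```

Every generated token rolls these dice, so a long multi-step response has many chances to drift.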

6.2 Probability Compounding

If each step succeeds independently with probability ( p_i ):

P_{total} = p_1 \times p_2 \times p_3

As the number of steps increases, total success probability drops rapidly.
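Plugging in numbers makes the compounding effect obvious. A quick sketch:

```python
def pipeline_success_probability(step_probs):
    """Overall pipeline success is the product of per-step probabilities."""
    total = 1.0
    for p in step_probs:
        total *= p
    return total

# Three stages at 90% reliability each -> only ~73% end-to-end.
three_steps = pipeline_success_probability([0.9, 0.9, 0.9])

# Ten stages at 90% each -> roughly a coin flip gone wrong (~35%).
ten_steps = pipeline_success_probability([0.9] * 10)
```

Per-step reliability that feels "pretty good" in isolation is nowhere near good enough once steps are chained.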

6.3 Manual vs Automated Execution

| Aspect | Manual | Automated |
| --- | --- | --- |
| Human intervention | Yes | No |
| Error recovery | Possible | None |
| Progress condition | Partial success | Full success |

Pipelines require every step to succeed every time.

7. The Real Problem

Initially, I thought:

โ€œWe need better prompts.โ€

But the real issue was:

โ€œHow do we handle failures?โ€

This is not a prompt problem.

It is a system design problem.

8. Solution โ€” Convergence System

Instead of a linear pipeline, I redesigned the system as a convergence loop.

         LLM Call
             โ†“
        Validation
        /        \
     OK           FAIL
     โ†“            โ†“
  Accept        Retry

9. Implementation โ€” Retry + Validation

9.1 Retry Loop

def run_with_retry(task_fn, validate_fn, max_retry=3):
    for attempt in range(max_retry):
        result = task_fn()

        if validate_fn(result):
            return result

    # All attempts failed validation: fail loudly instead of
    # silently returning the last (invalid) result.
    raise RuntimeError(f"no valid result after {max_retry} attempts")

9.2 Validation Example

def validate_code(result):
    # Reject responses that contain no fenced code block at all.
    if "```" not in result:
        return False
    # Reject incomplete implementations left as TODO stubs.
    if "TODO" in result:
        return False
    return True

9.3 Step Isolation

analysis = analyze(user_input)   # avoid shadowing the built-in input()
plan = design(analysis)          # avoid rebinding the design() function itself
code = implement(plan)

Each step is independently validated and recoverable.
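Putting the pieces together: each stage call is wrapped in its own retry-plus-validation loop, so a transient failure in one stage is retried locally instead of aborting the whole run. The stage functions and the `non_empty` validator below are hypothetical stand-ins for illustration:

```python
def run_with_retry(task_fn, validate_fn, max_retry=3):
    """Retry a step until its output validates; fail loudly otherwise."""
    for _ in range(max_retry):
        result = task_fn()
        if validate_fn(result):
            return result
    raise RuntimeError(f"no valid result after {max_retry} attempts")

# Hypothetical stand-ins for the three stages (a real run calls the LLM).
def analyze(task):
    return {"requirements": [task]}

def design(analysis):
    return {"modules": analysis["requirements"]}

def implement(plan):
    return "def handler():\n    return 'ok'"

non_empty = bool  # minimal validator: reject empty/None outputs

task = "Build a simple API"
analysis = run_with_retry(lambda: analyze(task), non_empty)
plan = run_with_retry(lambda: design(analysis), non_empty)
code = run_with_retry(lambda: implement(plan), non_empty)
```

Because each stage converges on a validated output before the next one starts, a single bad sample no longer propagates downstream.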

10. Results

After introducing convergence mechanisms:

  • Reduced over-exploration
  • Fewer pipeline failures
  • More consistent outputs

The most important change:

The system started working by design, not by luck.

11. Final Takeaway

Prompt engineering matters.

But it is not enough for automation.

LLM systems are not about generating correct answers.

They are about controlling incorrect ones.
