
How I Turned GitLab into a Coordination Layer for Autonomous AI Development Agents

yeTi 2026. 5. 14. 10:06

Lessons from building a multi-agent AI development workflow for a production project

TL;DR

Building a reliable AI coding agent is one engineering problem.

Building a reliable AI development workflow with multiple agents is another.

A single agent mostly struggles with execution quality.

Multiple agents introduce coordination problems:

  • task ownership
  • shared state visibility
  • race conditions
  • workspace contamination
  • lock recovery
  • operational governance

While building an autonomous development workflow for the sqlgen project, I learned that code generation was only one part of the problem.

The dominant challenge was coordination.

GitLab labels became the shared state machine that allowed independent agents to coordinate work safely. GitLab’s scoped labels are explicitly designed to support mutually exclusive workflow states, which makes them a practical coordination primitive for workflow orchestration. ([GitLab Docs][1])

The Goal

The original goal was straightforward.

I wanted engineering work inside the sqlgen project to move through an AI-assisted delivery workflow with minimal manual execution.

The target flow looked like this:

Issue discovered
→ planned
→ implemented
→ reviewed
→ tested
→ merged

The initial assumption was simple:

If the coding model is good enough, autonomous delivery becomes practical.

That assumption turned out to be incomplete.

Code generation solved only part of the problem.

Once multiple agents became involved, coordination became the dominant engineering challenge.

This Was Not a Single-Agent Problem

I was not building a coding assistant.

I was building a workflow where multiple agents had distinct responsibilities.

A simplified structure:

Human PM
   ↓
PM Bot
   ↓
Review Bot
   ↓
Dev Bot
   ↓
QA Bot
   ↓
Human Approval

Each agent had a narrower role.

That part was intentional.

Specialized agents are easier to reason about than one general-purpose autonomous actor.

But specialization creates a new requirement:

shared operational context.

A human team can rely on conversation, memory, and implicit understanding.

Independent agents cannot.

Task ownership, workflow progress, and execution state must be externally visible.

That made coordination state an explicit architectural concern.

Why GitLab?

A natural question:

Why use GitLab instead of building a dedicated orchestration service?

The answer was practical.

GitLab already provided several useful properties.

1. Existing Workflow Surface

The engineering workflow already lived in GitLab:

  • issues
  • merge requests
  • labels

That meant no additional operational UI was needed.

Agents could integrate into the workflow engineers were already using.

2. Shared Visibility

Humans and agents could observe the same workflow state.

This matters operationally.

A coordination system that only agents understand becomes difficult to debug.

GitLab gave immediate human inspectability.

An engineer could look at an issue and immediately understand where work was stuck.

3. Simple Polling Model

The initial MVP used a cron-based automation model.

Example:

find issues with workflow::dev-ready

This approach was intentionally simple.

No event bus.
No dedicated orchestration queue.
No new infrastructure.

For an MVP, operational simplicity mattered more than architectural purity.

4. Explicit State Representation

Scoped labels gave a lightweight way to encode workflow lifecycle state.

Example:

workflow::pm-ready
workflow::dev-running
workflow::review-ready

Because labels within the same scope are mutually exclusive, workflow transitions become naturally enforceable. ([GitLab Docs][1])

That significantly reduced coordination ambiguity.

The architectural tradeoff was intentional:

Instead of introducing a separate orchestration system, I reused the existing engineering control plane.

GitLab as a Shared State Machine

The workflow state model looked like this:

workflow::pm-ready
workflow::pm-running
workflow::dev-ready
workflow::dev-running
workflow::review-ready
workflow::qa-ready
workflow::done
workflow::failed

Example lifecycle:

Issue created
→ workflow::pm-ready

PM Bot claims task
→ workflow::pm-running

Planning complete
→ workflow::dev-ready

Dev Bot claims task
→ workflow::dev-running

Implementation complete
→ workflow::review-ready

This solved a critical coordination problem.

Agents no longer depended on hidden internal context.

Workflow state became:

  • explicit
  • queryable
  • observable

GitLab was no longer just storing code.

It was acting as the coordination layer for distributed autonomous workers.

First Working MVP

The initial MVP worked under normal execution conditions.

The execution flow looked like this:

1-minute cron poller
↓
Find issues labeled workflow::dev-ready
↓
Acquire workspace lock
↓
Mark issue workflow::dev-running
↓
Execute Codex implementation flow
↓
Create merge request
↓
Transition issue to workflow::review-ready

This was enough to validate the architectural direction.

But happy paths do not validate operational systems.

Failure behavior does.

What Actually Broke

The dominant failures were operational coordination failures rather than model capability failures.

1. Double Pickup

Without explicit claiming, multiple agents can observe the same available task.

Example:

Agent A sees workflow::dev-ready
Agent B sees workflow::dev-ready
Both begin execution

Classic race condition.

Humans resolve this socially.

Distributed workers do not.

The fix:

  • explicit task claiming
  • state transition before execution
  • locking
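The claiming logic behaves like a check-and-set guarded by a lock. An in-process sketch, with `ClaimRegistry` as an illustrative stand-in for the real workspace lock plus label transition:

```python
import threading

class ClaimRegistry:
    """In-process stand-in for the workspace lock: at most one agent
    may claim a given issue id. Illustrative only; the real system used
    a lock plus the workflow:: label transition."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claimed = set()

    def try_claim(self, issue_id: int) -> bool:
        # Check and claim under one lock, so two agents that observe
        # the same ready task cannot both win the race.
        with self._lock:
            if issue_id in self._claimed:
                return False
            self._claimed.add(issue_id)
            return True
```

The key property is atomicity: the check ("is it free?") and the write ("it's mine") happen as one step, so the Agent A / Agent B race above cannot occur.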

2. Dirty Workspace Contamination

A failed execution could leave behind:

  • modified files
  • temporary branches
  • partial generated output
  • broken local state

The next execution inherited polluted state.

This produced misleading failures.

The issue was not reasoning quality.

It was environment integrity.

The fix:

  • workspace isolation
  • cleanup contracts
  • pre-execution guards
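A pre-execution guard can be as simple as refusing to start on a dirty checkout. A sketch: `guard_workspace` and `porcelain_is_clean` are illustrative names, and the real cleanup contract also covered temporary branches and generated output:

```python
import subprocess

def porcelain_is_clean(porcelain_output: str) -> bool:
    """A workspace is clean when `git status --porcelain` prints nothing."""
    return porcelain_output.strip() == ""

def guard_workspace(repo_path: str) -> None:
    """Pre-execution guard (sketch): abort before claiming a task if the
    checkout still carries state from a failed run."""
    out = subprocess.run(
        ["git", "-C", repo_path, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    if not porcelain_is_clean(out):
        raise RuntimeError(f"dirty workspace at {repo_path}, refusing to run")
```

Failing loudly here turns a misleading downstream failure ("the model produced broken code") into an accurate upstream one ("the environment was contaminated").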

3. Cron Environment Drift

Manual execution succeeded.

Automated execution failed.

This is a classic operational issue.

Cron environments differ from interactive shells.

Common failures:

  • PATH mismatch
  • missing environment variables
  • CLI auth assumptions
  • host normalization issues

In practice, this surfaced as:

  • Codex working manually but failing in automation
  • glab targeting the wrong host
  • executables missing during scheduled execution
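The mitigation is an environment preflight at the top of the cron entry point: verify the binaries and variables the interactive shell silently provided. A sketch; the specific names an agent would check are deployment-specific:

```python
import os
import shutil

def preflight(required_bins, required_env):
    """Environment preflight for cron runs (sketch): report which
    binaries and variables the cron environment lacks, instead of
    letting a tool fail halfway through an execution."""
    missing = []
    missing += [f"binary: {b}" for b in required_bins if shutil.which(b) is None]
    missing += [f"env: {v}" for v in required_env if v not in os.environ]
    return missing
```

Running this first and failing with the full `missing` list converts "Codex works manually but fails in automation" from a mystery into a one-line diagnosis.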

These are not glamorous problems.

But production automation usually fails on operational details, not architecture diagrams.

4. Stale Locks

Locks prevent concurrent execution.

But failed runs can leave stale locks behind.

Result:

lock exists
→ no new work claimed
→ workflow silently stalls

Without recovery logic, the system appears healthy while doing nothing.

The fix:

  • lock TTL
  • stale lock detection
  • cleanup recovery

Human-Governed Autonomy

A design correction emerged during implementation.

Full autonomy is not the immediate objective.

A more practical operational model is:

human-governed autonomy

Humans remain responsible for:

  • defining goals
  • approving critical changes
  • resolving ambiguity
  • production governance

Agents handle:

  • execution
  • repetitive workflow progression
  • structured implementation tasks

This boundary preserves automation benefits while reducing operational risk.
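That boundary can itself be enforced in the state machine: agents may advance any state except the terminal one, which only a human transition may set. A sketch, assuming the state names from earlier; the exact placement of the boundary is illustrative:

```python
# States an agent is permitted to write; workflow::done is reserved
# for the human approval step. The exact boundary is an assumption.
AGENT_ALLOWED_TARGETS = {
    "workflow::pm-running", "workflow::dev-ready", "workflow::dev-running",
    "workflow::review-ready", "workflow::qa-ready", "workflow::failed",
}

def agent_may_set(target_state: str) -> bool:
    """Governance boundary (sketch): agents progress the workflow,
    humans close it."""
    return target_state in AGENT_ALLOWED_TARGETS
```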

Key Engineering Lesson

Single-agent reliability asks:

How do I make one agent execute correctly?

Multi-agent workflow reliability asks:

How do independent agents coordinate safely?

These are different engineering problems.

The second problem looks much closer to distributed systems engineering than prompt engineering.

Because the failure modes are familiar:

  • shared state consistency
  • ownership conflicts
  • stale resources
  • operational recovery
  • workflow observability

Reliable agents are useful.

Reliable coordination is essential.
