What is an agent?
A language model (LLM) on its own can only produce text. Ask it a question and it answers — but it cannot do anything: it cannot read a file, run a command, or check its own work. It is a brain without hands.
An agent is that same model given two things: tools (hands to act on the world) and a loop (the ability to act several times in a row, observing the result of each action before deciding the next). Nothing more. The “magic” of coding agents lives entirely in these two ingredients.
Plain LLM: you type → it replies. One pass. Text is the end of the road.
Agent: you type → it acts, observes, repeats, until the task is actually done.
This whole guide is based on a real, minimal agent. It talks to any OpenAI-compatible server (LM Studio, Ollama, a local model, a cloud API). No framework, no SDK: just Go's standard library, so that every mechanism stays readable and demystified.
The agentic loop
The heart of every agent is a loop. On each turn (one user message), the agent repeats the same step for as long as it has something to do.
In the code, this is the handleTurn function. Read it like a
recipe: ask the model, check whether it wants a tool, if so run it and put
the result back into the history, then repeat.
for step := 0; step < a.maxSteps; step++ {
	a.compactContext(ctx) // keep the context under control
	msg, err := a.client.Chat(ctx, a.history, a.registry.Schemas()) // 1. ask the model
	if err != nil { /* hand control back cleanly */ return }
	a.history = append(a.history, msg) // remember its reply
	if len(msg.ToolCalls) > 0 { // 2. does the model want to act?
		for _, call := range msg.ToolCalls {
			result := a.runTool(ctx, call.Function.Name, parseJSONArgs(call.Function.Arguments))
			a.history = append(a.history, Message{ // 3. the observation goes back to the model
				Role: "tool", ToolCallID: call.ID,
				Name: call.Function.Name, Content: result,
			})
		}
		continue // 4. loop again
	}
	return // no tool requested → turn done
}
A step limit (maxSteps) guarantees that a turn always ends. The code also adds a loop detector: if
the agent asks for exactly the same action as in the previous step,
we stop, because it is no longer making progress.
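The loop detector described above could be sketched as follows. This is a minimal illustration, not the agent's actual code: the names toolCall, sameCall and isStuck are hypothetical, and comparing the raw argument string byte for byte is an assumption about what "exactly the same action" means.

```go
package main

import "fmt"

// toolCall is a hypothetical, simplified view of one requested action.
type toolCall struct {
	Name string // tool name, e.g. "read_file"
	Args string // raw JSON arguments as sent by the model
}

// sameCall reports whether two requested actions are exactly identical.
func sameCall(a, b toolCall) bool {
	return a.Name == b.Name && a.Args == b.Args
}

// isStuck reports whether the current step repeats the previous one,
// call for call. A step without any call is never considered a loop.
func isStuck(prev, cur []toolCall) bool {
	if len(cur) == 0 || len(prev) != len(cur) {
		return false
	}
	for i := range cur {
		if !sameCall(prev[i], cur[i]) {
			return false
		}
	}
	return true
}

func main() {
	step1 := []toolCall{{Name: "read_file", Args: `{"path":"math.go"}`}}
	step2 := []toolCall{{Name: "read_file", Args: `{"path":"math.go"}`}}
	step3 := []toolCall{{Name: "read_file", Args: `{"path":"main.go"}`}}

	fmt.Println(isStuck(step1, step2)) // true: same action twice in a row
	fmt.Println(isStuck(step2, step3)) // false: the arguments changed
}
```

A byte-for-byte comparison is deliberately strict: two calls that differ only in whitespace inside the JSON would not be flagged, which errs on the side of letting the agent continue.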
The tools: the agent's hands
A tool is simply a function the model can decide to call.
In this agent, a tool is a pure function (context, arguments) → (result, error).
It does no display of its own: it computes a result, full stop. That is what
makes it easy to write, test and reuse.
type Tool struct {
	Name        string                    // e.g. "read_file"
	Description string                    // read by the model to know when to use it
	Parameters  map[string]any            // the JSON schema of the arguments
	Confirm     func(args) (bool, string) // optional guard (risky actions)
	Run         ToolFunc                  // (ctx, args) → (result, error)
}
The agent ships three basic tools — that is all you need to code:
- read_file: reads the contents of a text file.
- write_file: writes or overwrites a file.
- execute_shell: runs a command (build, tests, git…).
How does the model know which tools exist? We describe each one in a schema sent with every request. The model reads the name, the description and the parameters, then it chooses which to call.
r.Register(Tool{
	Name:        "read_file",
	Description: "Reads the contents of a text file.",
	Parameters: map[string]any{
		"type": "object",
		"properties": map[string]any{
			"path": map[string]any{"type": "string", "description": "Path of the file to read"},
		},
		"required": []string{"path"},
	},
	Run: toolReadFile,
})
To add a capability, register a new Tool.
The rest of the loop doesn't change. That is the whole power of this design:
the tool registry is extensible without touching the engine.
Talking to the model
All that's left is to connect the model. The conversation is just a list of messages (system, user, assistant, tool results) that we send back in full at each step — the model has no memory of its own, the context is the memory.
Function calling: how the model “calls” a tool
The model doesn't run the tool itself. It returns a structured
intent — “I want to call read_file with path=math.go” —
and it's our code that runs it, then returns the result.
Streaming: watching the answer take shape
Rather than waiting for the full answer, we read a stream (SSE) and display each fragment as it arrives. Tool calls, for their part, arrive in pieces that we reassemble by their index.
for _, d := range delta.ToolCalls {
	tc := toolCalls[d.Index] // one fragment per tool index
	if d.Function.Name != "" { tc.Function.Name = d.Function.Name }
	tc.Function.Arguments += d.Function.Arguments // arguments arrive in chunks
}
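A self-contained version of that reassembly might look like the sketch below. The types are flattened and hypothetical (toolCallDelta, toolCall, accumulate); the one design point worth noting is that the map stores pointers, so that updates made to a fragment's target persist across chunks.

```go
package main

import "fmt"

// toolCallDelta is a simplified, hypothetical view of one streamed fragment.
type toolCallDelta struct {
	Index     int    // which tool call this fragment belongs to
	ID        string // usually present only in the first fragment
	Name      string // usually present only in the first fragment
	Arguments string // a piece of the JSON argument string
}

// toolCall is the fully reassembled call.
type toolCall struct {
	ID, Name, Arguments string
}

// accumulate merges one fragment into the calls collected so far.
// The map holds pointers so that the update is not lost on a copy.
func accumulate(calls map[int]*toolCall, d toolCallDelta) {
	tc, ok := calls[d.Index]
	if !ok {
		tc = &toolCall{}
		calls[d.Index] = tc
	}
	if d.ID != "" {
		tc.ID = d.ID
	}
	if d.Name != "" {
		tc.Name = d.Name
	}
	tc.Arguments += d.Arguments // arguments arrive in chunks
}

func main() {
	// Fragments as they might arrive, interleaved across two tool calls.
	stream := []toolCallDelta{
		{Index: 0, ID: "call_1", Name: "read_file"},
		{Index: 0, Arguments: `{"pa`},
		{Index: 1, ID: "call_2", Name: "execute_shell"},
		{Index: 0, Arguments: `th":"math.go"}`},
		{Index: 1, Arguments: `{"command":"go test"}`},
	}

	calls := map[int]*toolCall{}
	for _, d := range stream {
		accumulate(calls, d)
	}
	// This demo assumes indexes are contiguous and start at 0.
	for i := 0; i < len(calls); i++ {
		fmt.Printf("%d: %s %s %s\n", i, calls[i].ID, calls[i].Name, calls[i].Arguments)
	}
}
```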
The safety net: what if the model can't call tools?
Not every model supports native function calling. So the agent provides
a fallback: if the model writes its intent as text
(Action: read_file(path="…")), we detect it with a regular
expression. A subtlety: we only accept the Action: prefix at the
start of a line — otherwise, when the model recaps its
actions, we would re-run them in a loop.
// (?m): ^ anchors at the start of a line — avoids re-running an action quoted
// in a recap ("1. Action: write_file(...)").
pattern := `(?sm)^[ \t]*Action\s*:\s*(` + strings.Join(names, "|") + `)\s*\(\s*(.*)\)`
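The effect of the line anchor can be checked with a small, self-contained sketch. It reuses the pattern above; the helper names buildActionPattern and findAction are hypothetical, and escaping the tool names with regexp.QuoteMeta is an extra precaution added here, not something the original is known to do.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// buildActionPattern compiles the fallback pattern for a set of tool names.
// Names are escaped so that a name can never alter the expression.
func buildActionPattern(names []string) *regexp.Regexp {
	quoted := make([]string, len(names))
	for i, n := range names {
		quoted[i] = regexp.QuoteMeta(n)
	}
	return regexp.MustCompile(`(?sm)^[ \t]*Action\s*:\s*(` + strings.Join(quoted, "|") + `)\s*\(\s*(.*)\)`)
}

// findAction returns the tool name and the raw text between the parentheses.
// Because of (?s) and the greedy (.*), rawArgs runs up to the last ")" in text.
func findAction(re *regexp.Regexp, text string) (name, rawArgs string, ok bool) {
	m := re.FindStringSubmatch(text)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	re := buildActionPattern([]string{"read_file", "write_file", "execute_shell"})

	intent := "I will look at the file first.\nAction: read_file(path=\"math.go\")"
	name, args, ok := findAction(re, intent)
	fmt.Println(name, args, ok) // a real intent: Action: starts its line

	recap := "Recap of what I did:\n1. Action: write_file(path=\"a.txt\")"
	_, _, ok = findAction(re, recap)
	fmt.Println(ok) // a quoted action inside a numbered recap is ignored
}
```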
Safety rails
Letting a model run shell commands is powerful — and dangerous. A good agent isn't just a loop: it's a cautious loop. Four protections, simple but essential:
- Confirming risky actions: a command matching a dangerous pattern (rm -rf, sudo, dd, a fork bomb…) asks for human approval before running.
- Loop detection: if the agent repeats exactly the same action, we stop the turn: it isn't making progress, no point burning tokens.
- Timeouts: every command and every model call has a time limit. A stuck command never freezes the agent.
- Output truncation: a tool's result is capped before being fed back: a huge output won't saturate the context.
var dangerousPatterns = []*regexp.Regexp{
	regexp.MustCompile(`\brm\s+-[a-zA-Z]*[rf]`), // rm -rf
	regexp.MustCompile(`\bdd\s+if=`),
	regexp.MustCompile(`:\s*\(\)\s*\{`), // fork bomb
	regexp.MustCompile(`\b(shutdown|reboot|halt)\b`),
	regexp.MustCompile(`\bsudo\b`),
	// …
}
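The confirmation check, the timeout and the truncation could fit together roughly as in the sketch below. It is an illustration under several assumptions: the function names (needsConfirmation, truncate, runShell) are hypothetical, only two of the patterns are repeated, the limits are arbitrary, and running commands through sh -c assumes a Unix-like system.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"regexp"
	"time"
	"unicode/utf8"
)

// A subset of the patterns listed above, repeated to keep the sketch standalone.
var dangerousPatterns = []*regexp.Regexp{
	regexp.MustCompile(`\brm\s+-[a-zA-Z]*[rf]`),
	regexp.MustCompile(`\bsudo\b`),
}

// needsConfirmation reports whether a command matches any dangerous pattern.
func needsConfirmation(command string) bool {
	for _, re := range dangerousPatterns {
		if re.MatchString(command) {
			return true
		}
	}
	return false
}

// truncate caps s at max bytes, backing up so a UTF-8 character is never split,
// and appends a note saying how much was dropped.
func truncate(s string, max int) string {
	if len(s) <= max {
		return s
	}
	cut := max
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + fmt.Sprintf("\n[truncated %d bytes]", len(s)-cut)
}

// runShell runs a command with a deadline and returns its capped output.
// When the deadline passes, the process is killed and a timeout error returned.
func runShell(command string, timeout time.Duration, maxOutput int) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	out, err := exec.CommandContext(ctx, "sh", "-c", command).CombinedOutput()
	if ctx.Err() == context.DeadlineExceeded {
		return truncate(string(out), maxOutput), fmt.Errorf("command timed out after %s", timeout)
	}
	return truncate(string(out), maxOutput), err
}

func main() {
	for _, cmd := range []string{"go test ./...", "sudo rm -rf /tmp/build"} {
		fmt.Printf("%q needs confirmation: %v\n", cmd, needsConfirmation(cmd))
	}

	out, err := runShell("echo hello", 5*time.Second, 1024)
	fmt.Printf("%q %v\n", out, err)

	fmt.Println(truncate("abcdefghij", 4))
}
```

Pattern matching of this kind is a speed bump rather than a sandbox: a command can be dangerous without matching any pattern, so it complements human review instead of replacing it.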
Memory & context
At each step the history grows: messages, tool calls, results. But a model's context window is limited. Do nothing and you eventually overflow it. The agent's solution: compact.
When the history exceeds a token budget, we keep the recent messages intact (the work in progress), and we ask the model to summarize the older ones in a few bullet points. The summary replaces the old messages. Long-term memory becomes compact, short-term memory stays precise.
if totalTokens(a.history) <= a.maxCtx { return } // under budget: nothing to do
older := rest[:keepFrom] // the older messages
recent := rest[keepFrom:] // the ~60% most recent, kept as-is
summary, err := a.client.Summarize(ctx, older) // the model summarizes the old ones
// → [initial system] + [summary] + [recent messages]
Putting it together
We have all the pieces. The main program wires them up: it creates the tool registry, the model client, and starts a read loop (a REPL). Each user message triggers an agent turn — the loop we just dissected.
And the agent's “personality”? It lives in a system prompt: a few sentences reminding it of its mission, its tools and its rules.
`You are an autonomous coding agent.
Rules:
1. Break missions into steps and call the tools you need.
2. Analyze each tool result; on error, fix it and retry.
3. A destructive action may require user confirmation.
4. When the mission is done, give a short recap and hand control back.`
In one sentence
A coding agent is a loop that sends the history to a model, runs the tools it asks for, returns the results to it, and repeats — all wrapped in safety rails and context management. No magic: just readable engineering.
Frequently asked questions
What is the difference between an agent and a chatbot?
A chatbot answers in a single pass: message, then reply. An agent acts, observes the result of each action and loops again, using tools, until the task is actually done.
Do you need a framework to build a coding agent?
No. The agent in this guide is written in Go with the standard library only, in ~1000 lines and zero dependencies. It works with any OpenAI-compatible server (LM Studio, Ollama, vLLM, a cloud API).
Is it free? Does it work offline?
Yes to both. The code is open-source (MIT license), so it's free. And because the agent talks to an OpenAI-compatible server, you can point it at a model running on your machine (LM Studio, Ollama…): no internet connection and no subscription required. It's the same principle as Cursor, Claude Code, Copilot or Codex, but 100% local.
How does a language model execute actions?
It doesn't execute them itself. Through function calling, the model returns a structured intent (tool name and arguments); it's the agent code that actually runs the tool, then returns the result to the model.
How do you stop an agent from looping or running a dangerous action?
With safety rails: a step limit per turn, repeated-action detection, human confirmation for risky commands, timeouts, and truncation of overly long outputs. Errors are fed back to the model so it can self-correct.
How do you handle a limited context window?
By compacting the history: when it exceeds a token budget, recent messages are kept intact while the oldest ones are summarized by the model and replaced with that summary.
The code & going further
This whole guide describes a real, complete agent. The code is open, commented line by line, and runs with any OpenAI-compatible server. The best way to understand it: read it, run it, change it.
A coding agent in Go — LLM loop + tools, ~1000 lines, zero dependencies.
Run the agent in 30 seconds
git clone https://github.com/Nhilo94/comprendre-agent-llm
cd comprendre-agent-llm
go run . # then pick your model and start chatting
A few ways to extend it
- Add a web search or API call tool.
- Replace the summary with a real vector memory.
- Let the agent plan before acting.
- Wire in a second agent that reviews the first one's work.