A «Bare minimum» article on AI agents built around language models: systems that plan, retain context, call tools, and iterate toward a goal—not just emit a single reply.
Chat vs agent
A plain LLM chat is mostly “prompt → answer”: the model predicts tokens from the prompt and the in-context history. On its own it takes no real-world actions.
An agent wraps the model (and often other components) so that it pursues a goal, decomposes the work, and, when needed, acts: it calls APIs, runs code, or searches the web or a document store, repeating until the task is done or a step budget is hit.
A useful metaphor: the LLM is the engine (reasoning and wording); the agent is the driver who chooses the route, decides when to stop, and pulls the levers (tools). An engine without a driver does not take you anywhere by itself.
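To make the contrast concrete, here is a minimal Python sketch, not a definitive implementation: `llm` is a hypothetical stand-in for any completion call, `tools` maps names to plain functions, and the `FINAL:` marker is a convention invented for this illustration.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat/completion API call."""
    raise NotImplementedError("plug in a real model call")

# Plain chat: one call, one answer, no side effects.
def chat(user_message: str) -> str:
    return llm(user_message)

# Agent: a loop that keeps acting until the model declares it is done
# or the step budget runs out.
def agent(goal: str, tools: dict, max_steps: int = 10) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        reply = llm("\n".join(history))
        if reply.startswith("FINAL:"):       # model decided the task is done
            return reply.removeprefix("FINAL:").strip()
        name, _, arg = reply.partition(" ")  # e.g. "search cheap flights to Oslo"
        result = tools.get(name, lambda a: f"unknown tool: {name}")(arg)
        history.append(f"{reply}\n-> {result}")  # feed the observation back in
    return "Step budget exhausted."
```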
What makes an agent
Agents are commonly sketched as three pillars, the usual “anatomy of an agent” diagram: planning, memory, and tools.
Planning. Breaking a large task into subgoals and steps. This overlaps with chain-of-thought style reasoning, explicit plans, and self-reflection (re-read a draft, check constraints, adjust).
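Purely as an illustration (the prompt wording and the 3-to-7 range are assumptions, not a standard), a planning step can be as small as asking the model for an explicit numbered plan before it acts:

```python
PLAN_PROMPT = """You are planning, not executing yet.
Task: {task}
Write a numbered list of 3 to 7 concrete subgoals.
You may revise this plan after observing results of later steps."""

def make_plan(llm, task: str) -> list[str]:
    # Ask for an explicit plan up front and keep it as state
    # the agent can re-read, check against, and revise.
    raw = llm(PLAN_PROMPT.format(task=task))
    return [line.strip() for line in raw.splitlines() if line.strip()]
```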
Memory.
- Short-term — the live context window: recent turns and intermediate notes in one session.
- Long-term — stored facts, docs, prior sessions; often implemented with vector stores and RAG so relevant snippets are retrieved into the prompt (see the RAG article).
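A minimal sketch of the split, assuming a hypothetical `embed(text) -> list[float]` call (any embedding model will do); the long-term side is a toy in-memory vector store, not a production one:

```python
import math

def embed(text: str) -> list[float]:
    """Hypothetical embedding call; any embedding model will do."""
    raise NotImplementedError

class Memory:
    def __init__(self, window: int = 8):
        self.short_term: list[str] = []  # live context: recent turns and notes
        self.long_term: list[tuple[list[float], str]] = []  # (vector, snippet)
        self.window = window

    def remember_turn(self, turn: str) -> None:
        # Short-term memory is just the tail of the conversation.
        self.short_term = (self.short_term + [turn])[-self.window:]

    def store(self, snippet: str) -> None:
        self.long_term.append((embed(snippet), snippet))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Rank stored snippets by cosine similarity to the query.
        q = embed(query)
        def cos(v: list[float]) -> float:
            dot = sum(a * b for a, b in zip(q, v))
            norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in v))
            return dot / (norm + 1e-9)
        ranked = sorted(self.long_term, key=lambda pair: cos(pair[0]), reverse=True)
        return [snippet for _, snippet in ranked[:k]]  # goes back into the prompt
```

In use, `retrieve` runs before each model call and its snippets are prepended to the prompt alongside the short-term window.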
Tools. Formalized actions the model can request: HTTP APIs, code execution (e.g. Python), web search, filesystem hooks, calendar or DB queries. Without tools the “agent” remains text-only with no outward levers.
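In code, “formalized actions” usually means a registry of named functions the model can request; this sketch keeps plain Python callables and omits the JSON-schema declarations that real tool-calling APIs expect (the tool names here are assumptions):

```python
import datetime
import urllib.request

def http_get(url: str) -> str:
    """An outward lever plain chat lacks: fetch a live web page."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read(4000).decode("utf-8", errors="replace")

# A registry of named actions the model may request.
TOOLS = {
    "http_get": http_get,
    "now": lambda _arg: datetime.datetime.now().isoformat(),
}

def call_tool(request: str) -> str:
    # Expect the model to emit e.g. "http_get https://example.com".
    name, _, arg = request.partition(" ")
    if name not in TOOLS:
        return f"unknown tool: {name}"  # errors return as observations, not crashes
    return TOOLS[name](arg)
```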
How to think about agents
The mindset shift is closer to delegating to a junior teammate than to typing a single search query.
- Delegate, don’t only prompt. State the goal, success criteria, and allowed sources/tools the way you would brief a colleague (see the sketch after this list).
- Set boundaries and a role. A clear persona (“research assistant”, “no payment actions”) and guardrails reduce scope creep and unsafe surprises.
- Expect iteration. Agents misstep and dead-end; a normal pattern is try → observe → replan. That drives step limits, logging, and human oversight.
- Provide resources. If facts live in docs, wire search or RAG; if numbers matter, supply a runtime. Without the right levers the model falls back to guessing.
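One way to make that checklist operational, shown only as an illustration (the field names and defaults are assumptions), is a structured brief rendered into the system prompt:

```python
from dataclasses import dataclass, field

@dataclass
class Brief:
    goal: str
    success_criteria: list[str]
    role: str = "research assistant"
    allowed_tools: list[str] = field(default_factory=list)
    forbidden: list[str] = field(default_factory=lambda: ["payment actions"])
    max_steps: int = 15  # expect iteration, but bound it

    def system_prompt(self) -> str:
        return (
            f"You are a {self.role}.\n"
            f"Goal: {self.goal}\n"
            f"Done when: {'; '.join(self.success_criteria)}\n"
            f"Tools you may use: {', '.join(self.allowed_tools) or 'none'}\n"
            f"Never do: {'; '.join(self.forbidden)}\n"
            f"You have at most {self.max_steps} steps."
        )
```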
ReAct: reason and act
ReAct (*Reasoning and Acting*) is a common loop drawn as thought → action → observation.
- Thought: what do I know, what is missing, what is the next sensible move?
- Action: invoke a specific tool with arguments (search, API, code, …).
- Observation: the tool’s raw result is fed back into context.
- Then another Thought—adjust the plan or finish with a final user-facing answer.
The loop alternates natural-language reasoning with grounded tool calls, so when fresh data or computation is required the model fetches or computes instead of hallucinating.
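A minimal sketch of the loop, reusing the hypothetical `llm` stand-in and tool registry from the earlier sketches; the `Thought:`/`Action:`/`Final:` markers are one possible convention, not a fixed standard:

```python
def react(llm, tools: dict, question: str, max_steps: int = 8) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript + "Thought:")  # model reasons about the next move
        transcript += "Thought:" + step + "\n"
        if "Final:" in step:                 # model chose to answer the user
            return step.split("Final:", 1)[1].strip()
        if "Action:" in step:                # model requested a tool call
            lines = step.split("Action:", 1)[1].strip().splitlines()
            request = lines[0] if lines else ""
            name, _, arg = request.partition(" ")
            obs = tools.get(name, lambda a: f"unknown tool: {name}")(arg)
            transcript += f"Observation: {obs}\n"  # raw result back into context
    return "Step budget exhausted without a final answer."
```

Feeding the raw observation back unchanged keeps the sketch short; real systems typically truncate or summarize long tool outputs before they re-enter the context window.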
Multi-agent systems
Hard problems are sometimes split across several agents with different roles—a “manager and workers” picture.
- A coordinator assigns subtasks, merges outputs, keeps the end result coherent.
- Specialists might focus on coding, testing, literature search, or report formatting.
Upsides: modularity and parallelism. Downsides: harder debugging, multiplied model-call cost, and the risk of contexts silently diverging between agents. For prototypes, spell out the handoff protocol: who passes what, in what schema.
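For a prototype, that protocol can be a single message schema every agent produces and consumes; the field names below are assumptions:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Handoff:
    sender: str        # which agent produced this message
    task_id: str       # lets the coordinator merge results per task
    subtask: str       # what the receiver is asked to do
    inputs: dict       # everything needed, so contexts do not silently diverge
    result: str | None = None  # filled in by the worker on the way back

def to_wire(msg: Handoff) -> str:
    return json.dumps(asdict(msg))  # one fixed wire format keeps debugging tractable

def from_wire(raw: str) -> Handoff:
    return Handoff(**json.loads(raw))
```

Forcing every handoff through one serialized schema also makes the diverging-context failure mode visible in logs.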
Who uses them
Agentic setups appear when a single chat reply is not enough—you need an action chain tied to tools or corpora.
- Researchers. Scanning many PDFs, summarizing them, checking phrasing against sources, often with RAG and search.
- Students and academics. Source discovery and draft surveys with explicit citations—never replacing fact-checking.
- Developers. Multi-step debugging, refactors, automating ticket/CI/docs flows—with code review and caution.
- Enterprises. Internal assistants over CRM, wikis, APIs—with strict access policy, action audit trails, and human gates on critical operations.
