AI Agents Should Be Simple

May 9, 2026

AI agents should be simple.

Not because the apps we build with them are simple. They are not.

But because the core abstraction has to be simple.

If that part is wrong, everything above it starts rotting.

An agent is a state machine around an LLM with tools.

That is the whole thing.

TLDR

The core of an agent should be a small state machine:

  • call the model
  • maybe run a tool
  • append the result
  • call the model again
  • return the final output

Then you expand that machine when you need more: retries, memory, approvals, tracing, evals, persistence, sub-agents, graphs, whatever.

But if the base state machine is wrong, the rest of the framework cannot save you.

It just gives you prettier ways to be confused.

This is why I like Pydantic AI.

The primitive

Most agent frameworks describe agents as LLMs in a loop with tools.

That is true, but it is slightly too hand-wavy.

The more useful mental model is a state machine.

Something like:

model_call -> tool_call -> tool_result -> model_call -> final_output

With an error state somewhere, because obviously.
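Spelled out as data, the same machine looks like this. A sketch; the state names are mine, not any framework's:

from enum import Enum, auto

class State(Enum):
    MODEL_CALL = auto()
    TOOL_CALL = auto()
    TOOL_RESULT = auto()
    FINAL_OUTPUT = auto()
    ERROR = auto()

# Which states may follow which. ERROR is reachable from anywhere.
TRANSITIONS = {
    State.MODEL_CALL: {State.TOOL_CALL, State.FINAL_OUTPUT, State.ERROR},
    State.TOOL_CALL: {State.TOOL_RESULT, State.ERROR},
    State.TOOL_RESULT: {State.MODEL_CALL, State.ERROR},
}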

In code, the baby version looks like this:

def run_agent(llm, tools, messages):
    while True:
        response = llm(messages, tools=tools)

        if response.tool_call:
            # Keep both the model's tool request and the tool's result
            # in history, so the next model call sees the full exchange.
            messages.append(response.message)
            result = run_tool(response.tool_call)
            messages.append(result)
            continue

        return response

This is not enough for production.

But it is enough to understand the shape.

The agent is not magic.

It is a machine moving between states.

The LLM decides the next action, tools change the outside world or fetch context, and the loop keeps going until it reaches a terminal state.

That is the primitive.

Everything else should be built around it.

Why the primitive matters

Framework design compounds.

If the first abstraction is clean, adding features feels natural.

You add retry states.

You add approval states.

You add tracing around transitions.

You add evals around trajectories and final outputs.

You add persistence so the machine can stop and resume.

You add graphs when one state machine is not enough.
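Each of those is just more machine. A hedged sketch of the first few, grown onto the same run_agent from above, with hook names I am inventing:

def run_agent(llm, tools, messages,
              max_retries=3,                  # retry state
              approve=lambda call: True,      # approval state
              trace=lambda event: None):      # tracing over transitions
    while True:
        for attempt in range(max_retries):
            try:
                response = llm(messages, tools=tools)
                break
            except TimeoutError:
                trace(f"retrying after attempt {attempt + 1}")
        else:
            raise RuntimeError("model unavailable after retries")

        if response.tool_call:
            trace("model_call -> tool_call")
            if not approve(response.tool_call):
                trace("tool_call -> halted")   # paused for a human
                return None
            messages.append(response.message)
            messages.append(run_tool(response.tool_call))
            continue

        trace("model_call -> final_output")
        return response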

But if the first abstraction is wrong, every feature becomes a workaround.

You start needing special concepts for things that should have been obvious.

You get callbacks fighting orchestration.

You get hidden state.

You get a framework that is technically powerful but mentally expensive.

And once that happens, the framework becomes the thing you are building, instead of the thing helping you build.

I do not want to learn a new religion just to give an LLM three tools.

The wrong direction

A lot of agent frameworks seem to race toward complexity.

Suddenly an agent is a worker inside a crew inside a graph inside a flow inside an orchestration runtime.

Sometimes that is the correct architecture.

Most of the time, it is cosplay.

The problem is not graphs. Graphs are useful.

The problem is starting with the graph before you have the state machine.

The problem is starting with multi-agent collaboration before one agent has a clean execution model.

The problem is hiding the core loop under vocabulary.

Simple things should look simple.

Complex things should be possible.

That is the bar.

What good design looks like

Good agent framework design starts with a small machine and lets you expand it.

The base case should have boring pieces:

  • model
  • instructions
  • tools
  • output
  • transitions

Then the production features should attach to those pieces.

Tracing should observe transitions.

Evals should inspect trajectories.

Human approval should pause before dangerous tool calls.

Memory should be explicit state, not magic vibes.

Persistence should save the machine and resume it later.

Multi-agent systems should compose multiple machines, not invent a totally different universe.
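Composition, in the plainest terms. Stub agents, purely illustrative:

def research_agent(question: str) -> str:
    return f"notes on {question}"        # stand-in for a full agent loop

def writer_agent(notes: str) -> str:
    return f"draft built from {notes}"   # another machine, same shape

draft = writer_agent(research_agent("what is a state machine?"))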

The boring base case is not a lack of ambition.

It is the thing that lets the ambitious version stay understandable.

The test

Here is the test I would use for any agent framework:

Can I explain what state the agent is in right now?

Can I explain what can happen next?

Can I intercept the transition if it is dangerous?

Can I save the state and resume later?

Can I add tracing without changing the whole program?

If the answer is no, the abstraction is already leaking.

And if the abstraction leaks at the core, it will flood when the app gets real.
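The persistence and interception questions, for instance, come down to whether the machine's state is plain data. A minimal sketch, every name invented:

import json

state = {
    "status": "awaiting_approval",          # what state the agent is in
    "pending_tool": "delete_account",       # the transition we paused before
    "messages": [{"role": "user", "content": "Close my account."}],
}

with open("agent_state.json", "w") as f:    # save the machine...
    json.dump(state, f)

with open("agent_state.json") as f:         # ...and resume it later
    resumed = json.load(f)

assert resumed["status"] == "awaiting_approval"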

Why I like Pydantic AI

Pydantic AI is my favorite version of this right now.

The basic shape is small:

from pydantic_ai import Agent

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    instructions="Be concise.",
)

result = agent.run_sync("Explain what an agent is.")
print(result.output)

Create an agent. Give it instructions. Run it.

When you need tools, tools are functions.

When you need structured output, outputs can be Pydantic models.

When you need dependencies, dependencies are explicit.

When you need evals or observability, they attach around the thing rather than replacing the thing.

That matters because agents already have enough runtime weirdness.

The framework should reduce mystery, not add more.

This is the main reason it feels right to me.

The simple thing stays simple, and the advanced thing still has somewhere to go.
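For example, tools, structured output, and dependencies together. A sketch reusing the model string from above; the home-city lookup is invented for illustration:

from dataclasses import dataclass

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

@dataclass
class Deps:
    user_id: str                  # explicit dependency, no globals

class Answer(BaseModel):
    city: str                     # structured output, validated
    country: str

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    deps_type=Deps,
    output_type=Answer,
    instructions="Answer using the lookup tool.",
)

@agent.tool
def lookup_home_city(ctx: RunContext[Deps]) -> str:
    """Return the current user's home city."""
    return {"u1": "Paris"}[ctx.deps.user_id]   # a tool is just a function

result = agent.run_sync("Where do I live?", deps=Deps(user_id="u1"))
print(result.output)              # e.g. Answer(city='Paris', country='France')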

The comparison

There are a lot of agent frameworks now, but they are not all trying to be the same thing.

So I do not think the question is "which framework is best?"

The better question is: what is the core abstraction, and does it match the problem?

Pydantic AI

Pydantic AI starts from an agent abstraction that still feels like normal Python.

That is the main reason I like it.

It has plenty of production pieces: structured outputs, dependency injection, evals, observability, model providers, MCP, and durable execution integrations.

But the important thing is that those features compose around the agent.

The center still feels small.

OpenAI Agents SDK

OpenAI Agents SDK is also aligned with this philosophy.

The docs talk about a small set of primitives: agents, tools, handoffs, guardrails, and tracing.

It also has a managed agent loop, which is basically the runtime owning the state machine for you.
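The hello-world is correspondingly small. Roughly, as I recall the docs:

from agents import Agent, Runner

agent = Agent(name="Assistant", instructions="Be concise.")

# Runner owns the loop: model calls, tool calls, handoffs.
result = Runner.run_sync(agent, "Explain what an agent is.")
print(result.final_output)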

That is useful if you are happy living close to the OpenAI ecosystem.

The tradeoff is that the center of gravity is obviously OpenAI.

That might be perfect. It might not be.

smolagents

smolagents is probably the most explicit about simplicity.

The library is intentionally small, and I respect that.

I also like the code-agent idea. Letting an agent write code to call tools maps nicely to how programmers already think: functions, loops, conditionals.
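The action such an agent emits is ordinary code. Illustrative only, with stand-in tools:

def get_weather(city: str) -> str:      # stand-in tool
    return "rain" if city == "Paris" else "clear"

def notify(city: str) -> None:          # stand-in tool
    print(f"alert: rain in {city}")

# A code-agent's "tool call" is just Python the model wrote:
for city in ["Paris", "Tokyo"]:
    if "rain" in get_weather(city):
        notify(city)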

But code execution changes the security story.

Once the agent can write and run code, sandboxing is not optional anymore.

LangChain / LangGraph

LangChain and LangGraph are powerful.

LangGraph especially makes sense when you really do need orchestration: durable execution, branching, persistence, human-in-the-loop, long-running stateful workflows.

That is real.

But I would not start there for a simple agent.

Start with the state machine. Then reach for the graph when the state machine needs to become a graph.

Not before.
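When you do get there, the minimal LangGraph shape looks roughly like this, as I understand the current API:

from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    draft: str

def write(state: State) -> State:
    return {"draft": state["draft"] + ", expanded"}

graph = StateGraph(State)
graph.add_node("write", write)
graph.add_edge(START, "write")
graph.add_edge("write", END)

app = graph.compile()
print(app.invoke({"draft": "an outline"}))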

CrewAI

CrewAI is more focused on teams of agents and workflows.

If your mental model is genuinely multiple agents collaborating inside a process, the concepts make sense.

But that is already a higher-level abstraction.

If your app is one model with three tools, starting with crews and flows feels like hiring a committee to make a sandwich.

The rule

Start with the state machine.

Make the transitions obvious.

Make state explicit.

Make tools boring.

Then expand only when the problem asks for it.

Do not start with a graph because graphs look serious.

Do not start with multi-agent orchestration because it sounds futuristic.

Do not start with a framework that makes the simple case feel advanced.

The core abstraction matters because everything else sits on top of it.

Get that wrong, and the whole system gets weird.

Get that right, and you can build surprisingly complex things without losing the plot.