Chapter 1
What Is an AI Agent
From 'just talking' to 'getting things done'
🎯
Imagine this...
You walk into a restaurant and order a steak. If the waiter just stands there chatting about how a steak should be cooked but never actually places an order — that's a plain Large Language Model (LLM). But if the waiter can talk to you, walk into the kitchen, place the order, bring your food, and handle the check — that's an AI Agent.
LLM vs AI Agent
You've probably already used AI tools like ChatGPT or Claude. Under the hood, they run on Large Language Models (LLMs).
LLMs are incredibly smart — they can understand your questions, write essays, write code, and analyze data. But they have a fundamental limitation: they can only "talk," they can't "do."
For example, if you ask an LLM to "change line 10 of this file to hello," it can tell you how to make the change, but it can't actually open the file, modify the content, or save it.
AI Agents exist to solve this problem. They add a layer of "infrastructure" on top of the LLM, enabling the AI to actually:
• Read and write files
• Execute commands
• Search for information
• Interact with external tools
• Coordinate multiple subtasks
| Comparison | Plain LLM | AI Agent |
|---|---|---|
| Core capability | Understand and generate text | Understand, decide, and execute |
| Can it modify files? | ❌ No | ✅ Yes |
| Can it run code? | ❌ No | ✅ Yes |
| Can it search the web? | ❌ No | ✅ Yes |
| How it works | Single Q&A exchange | Loop: think → act → observe → think again |
| Analogy | A brilliant brain | A brilliant person with hands and feet |
Why Are AI Agents So Important?
Imagine this scenario:
You tell an AI: "Create a new React project, install dependencies, write a login page, and run the tests."
A plain LLM can only give you step-by-step text instructions.
But an AI Agent will actually:
1. Run npx create-react-app to create the project
2. Run npm install to install dependencies
3. Create files and write code
4. Run npm test to check results
5. If tests fail, automatically fix and retry
That's the power of an AI Agent — it doesn't just give advice, it actually completes the task.
📌 Key Takeaway
AI Agent = LLM + Infrastructure
The core formula for an AI Agent is simple: take the intelligence of a Large Language Model and layer on top of it an "infrastructure" (also called a Harness) that lets it perceive its environment, use tools, and take action. In the next chapter, we'll dive into what exactly this "infrastructure" is.
🧠 Check Your Understanding
Which description most accurately distinguishes an LLM from an AI Agent?
🧠 Check Your Understanding
When you tell an AI Agent "add a field to config.json," what does it do?