Chapter 1

What Is an AI Agent

From 'just talking' to 'getting things done'

🎯

Imagine this...

You walk into a restaurant and order a steak. If the waiter just stands there chatting about how a steak should be cooked but never actually places an order — that's a plain Large Language Model (LLM). But if the waiter can talk to you, walk into the kitchen, place the order, bring your food, and handle the check — that's an AI Agent.

LLM vs AI Agent

You've probably already used AI tools like ChatGPT or Claude. Under the hood, they run on Large Language Models (LLMs). LLMs are remarkably capable: they can understand your questions, write essays, write code, and analyze data. But on its own, an LLM has a fundamental limitation: it can only "talk"; it can't "do." For example, if you ask a plain LLM to "change line 10 of this file to hello," it can tell you how to make the change, but it can't actually open the file, modify the content, or save it.

AI Agents exist to solve this problem. They add a layer of "infrastructure" on top of the LLM, enabling the AI to actually:

• Read and write files
• Execute commands
• Search for information
• Interact with external tools
• Coordinate multiple subtasks
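To make the "change line 10 of this file to hello" example concrete, here is a minimal sketch of the kind of file-editing tool an agent's infrastructure might expose to the LLM. The function name replace_line and its signature are hypothetical and chosen for illustration; they are not taken from any particular agent framework.

```python
from pathlib import Path


def replace_line(path: str, line_number: int, new_text: str) -> str:
    """Hypothetical tool: replace one 1-indexed line of a text file."""
    file = Path(path)
    lines = file.read_text(encoding="utf-8").splitlines()
    if not 1 <= line_number <= len(lines):
        # Report the problem as text so the model can read it and react.
        return f"error: {path} has {len(lines)} lines, cannot edit line {line_number}"
    lines[line_number - 1] = new_text
    file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return f"ok: replaced line {line_number} of {path}"
```

The LLM still only produces text, such as a request to call replace_line with the arguments ("notes.txt", 10, "hello"). It is the surrounding infrastructure that actually runs the function and touches the file.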

Comparison	Plain LLM	AI Agent
Core capability	Understand and generate text	Understand, decide, and execute
Can it modify files?	❌ No	✅ Yes
Can it run code?	❌ No	✅ Yes
Can it search the web?	❌ No	✅ Yes
How it works	Single Q&A exchange	Loop: think → act → observe → think again
Analogy	A brilliant brain	A brilliant person with hands and feet
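
The "think → act → observe → think again" loop from the table can be sketched in a few lines. This is a toy illustration under stated assumptions: fake_model is a scripted stand-in for a real LLM call, and the names run_agent, Tool, and add are all hypothetical.

```python
from typing import Callable

# A "tool" takes a string argument and returns a text observation.
Tool = Callable[[str], str]


def fake_model(history: list[str]) -> tuple[str, str]:
    """Scripted stand-in for an LLM call. Returns (action, argument)."""
    if not any(entry.startswith("observation:") for entry in history):
        return ("add", "2 3")  # nothing observed yet, so ask for a tool
    # A result has been observed, so finish with it as the answer.
    return ("finish", history[-1].removeprefix("observation: "))


def add(argument: str) -> str:
    """Toy tool: add whitespace-separated integers."""
    return str(sum(int(part) for part in argument.split()))


def run_agent(task: str, tools: dict[str, Tool], max_steps: int = 5) -> str:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action, argument = fake_model(history)          # think
        if action == "finish":
            return argument
        observation = tools[action](argument)           # act
        history.append(f"observation: {observation}")   # observe, then loop
    return "stopped: step limit reached"


print(run_agent("What is 2 + 3?", {"add": add}))  # prints 5
```

The step limit is a design choice in this sketch: it keeps the loop from running forever if the model never decides to finish.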

Why Are AI Agents So Important?

Imagine this scenario: you tell an AI, "Create a new React project, install dependencies, write a login page, and run the tests." A plain LLM can only give you step-by-step text instructions. But an AI Agent will actually:

1. Run npx create-react-app to create the project
2. Run npm install to install dependencies
3. Create files and write code
4. Run npm test to check results
5. If tests fail, automatically fix and retry

That's the power of an AI Agent: it doesn't just give advice, it actually completes the task.
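
Steps 4 and 5 of this scenario (run the tests, then fix and retry on failure) can be sketched as a small retry loop. This is a simplified illustration, not how any specific agent is implemented: run_command, run_until_passing, and the attempt_fix callback are hypothetical names, and in a real agent the fix would come from the LLM reading the test output.

```python
import subprocess
import sys
from typing import Callable


def run_command(args: list[str]) -> tuple[bool, str]:
    """Run a command and return (succeeded, combined stdout and stderr)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def run_until_passing(
    test_cmd: list[str],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run the tests; on failure, hand the output to attempt_fix and retry."""
    for _ in range(max_attempts):
        passed, output = run_command(test_cmd)
        if passed:
            return True
        # In a real agent, the LLM would read the output and edit the code here.
        attempt_fix(output)
    return False


if __name__ == "__main__":
    # Harmless demo: a command that always succeeds, so no fix is needed.
    ok = run_until_passing([sys.executable, "-c", "print('tests pass')"], lambda out: None)
    print(ok)  # prints True
```

In the React scenario, test_cmd would be something like ["npm", "test"], and attempt_fix would ask the model for a code change based on the failing output.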

📌 Key Takeaway

AI Agent = LLM + Infrastructure

The core formula for an AI Agent is simple: take the intelligence of a Large Language Model and layer an "infrastructure" (also called a Harness) on top of it, so that it can perceive its environment, use tools, and take action. In the next chapter, we'll look at exactly what this "infrastructure" is.

🧠 Check Your Understanding

Which description most accurately distinguishes an LLM from an AI Agent?

🧠 Check Your Understanding

When you tell an AI Agent "add a field to config.json," what does it do?

What Is a Harness