Chatbot vs agent — the real difference
A chatbot takes your input and produces text output. That's it. If it gets something wrong, you read the wrong text. The consequence is limited and easily reversed.
An agent is different. An agent can *use tools*: send an email, search the web, write and run code, call an API, read and edit files. It takes actions in the world — not just words.
This distinction matters enormously. A chatbot that makes a mistake costs you a few seconds of reading. An agent that makes a mistake might have already sent an email, modified a file, or made an API call before you noticed.
Cursor — the AI that codes with you
Cursor is one of the clearest examples of what an AI agent looks like in practice.
Cursor is a code editor (like Microsoft's VS Code) with AI built into its core. You open your project, and Cursor can:
- **Read your entire codebase** — it understands how your files relate to each other
- **Write new code** — you describe what you want, it writes the implementation
- **Edit existing code** — you highlight a section and ask it to change something
- **Find and fix bugs** — you describe the problem, it locates the cause and proposes a fix
- **Run tests** — it can execute code and interpret the results
The reason this is powerful isn't any single capability. It's the combination: Cursor can read your context, write code, run it, see if it worked, and iterate — all in a loop with minimal human input.
Some professional developers using Cursor report completing tasks in 20-30% of the time they previously took. These are self-reported figures, and the gains are largest on routine, well-specified work.
Microsoft 365 Copilot — AI in your office apps
Microsoft 365 Copilot is what an AI agent looks like for non-technical knowledge workers.
It sits inside the apps you already use:
In Outlook: Summarise a long email thread in one click. Draft a reply based on your previous communications. Catch up on what you missed while on leave.
In Teams: Get a meeting summary and action items without having to take notes. Ask what was decided without reading the full transcript.
In Word: Draft a first version from bullet points. Rewrite in a different tone. Summarise a long report into an executive summary.
In Excel: Generate a chart from your data with a description of what you want. Ask questions about your spreadsheet in plain English. Identify anomalies.
The key thing about Copilot: it has access to *your* Microsoft 365 data — your emails, documents, meetings. That's what makes it genuinely useful for workplace tasks, and also why data permissions and governance matter.
How an agent reasons — the loop
Most production AI agents follow the same basic pattern:
1. **Receive a task** — e.g. 'Summarise this week's sales data and email it to the team'
2. **Plan** — break the task into steps
3. **Act** — execute each step using available tools
4. **Observe** — see what happened
5. **Repeat or complete** — iterate until done or stuck
This loop is called ReAct (Reason + Act). Whether you're looking at Cursor, Copilot, or a custom-built agent for your business — they're all running some version of this loop.
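The loop above can be sketched in a few lines of code. This is a minimal illustration, not any real product's implementation: the `llm` function, the tool names, and the message format are all assumptions made for the example.

```python
# Minimal sketch of a ReAct-style agent loop.
# `llm` and the tool registry are hypothetical stand-ins, not a real API.

def run_agent(task, tools, llm, max_steps=10):
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        # Reason: ask the model what to do next, given everything so far
        thought, action, args = llm("\n".join(history))
        history.append(f"Thought: {thought}")
        if action == "finish":
            return args  # the model's final answer
        # Act: execute the chosen tool with the chosen arguments
        observation = tools[action](*args)
        # Observe: feed the result back into the next reasoning step
        history.append(f"Observation: {observation}")
    return "Stopped: step limit reached"
```

Note the `max_steps` cap: production agents bound the loop so a confused model cannot act indefinitely, which matters precisely because early mistakes compound.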
The implication: a mistake early in the loop can compound through every subsequent step before anyone notices.
When human oversight is non-negotiable
The right level of human oversight depends on one question: **how hard is this action to undo?**
A draft document? Easy to throw away. AI should just go ahead.
An email to a client? Worth reviewing before it sends.
A bulk operation on your company database? A human should review and explicitly approve before any action.
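The three tiers above translate directly into a simple policy check. This is a hypothetical sketch: the action names and the reversibility table are illustrative assumptions, and a real system would derive them from its own tool catalogue.

```python
# Sketch of a reversibility-based approval gate (all names are illustrative).

REVERSIBILITY = {
    "draft_document": "easy",       # easy to undo: the agent can act autonomously
    "send_email": "hard",           # hard to undo: a human reviews before it sends
    "bulk_db_update": "very_hard",  # effectively irreversible: explicit approval required
}

def requires_approval(action):
    """Return True if a human must approve before the agent executes `action`."""
    # Unknown actions default to the safest tier rather than slipping through
    return REVERSIBILITY.get(action, "very_hard") != "easy"
```

The useful design choice is the default: anything the policy has never seen is treated as irreversible until someone classifies it.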
This isn't about not trusting AI. Every tool fails sometimes — your junior colleague fails sometimes too. The question is whether the system you've designed can catch failures before they become incidents.
Building AI systems with this question in mind from the start is what separates teams that deploy AI successfully from teams that have a crisis six months in.
An agent that takes actions is fundamentally different from one that produces text. Once an AI can send emails, edit files, or make API calls, the stakes of every mistake go up — and 'undo' becomes much harder.