Silicon & Photonics | 18 January 2026 | 14 min read

Agentic AI: When Your Tools Start Using Tools

The shift from chatbots to autonomous agents is the biggest change in how we work since the internet


Sajad Saleem

the mediocre generalist

Last Tuesday, I asked an AI agent to refactor an authentication module in a codebase I'd been avoiding for weeks. Not "suggest how to refactor it." Not "explain the best approach." I said — in plain English, the way you'd brief a colleague — "this auth module is a mess, the session handling has three race conditions, and the token refresh logic fails silently when the network drops. Fix it. Run the tests. If anything breaks, fix that too."

Then I went to make chai.

When I came back, it had read forty-seven files. It had identified not three but five race conditions — two I hadn't noticed. It had rewritten the session handler, updated the token refresh logic, added retry mechanisms with exponential backoff, run the test suite, found two failing tests caused by its own changes, fixed those, run the suite again, and left me a summary of everything it had done and why.

Twelve minutes. I was gone twelve minutes. The chai wasn't even cool enough to drink yet.

This isn't a chatbot. This isn't autocomplete with a marketing budget. This is something genuinely different, and if you're running a business, leading a team, or just trying to understand where the world is heading — you need to understand what agentic AI actually is, because it's moving faster than the conversation about it.

So what is agentic AI, actually?

Here's the simplest definition I can offer: agentic AI is artificial intelligence that can do things — not just say things.

A chatbot answers questions. An AI agent answers questions, then takes action based on those answers, then evaluates whether the action worked, then adjusts if it didn't, then moves on to the next step. Without asking you. Without waiting for permission at every turn. It has agency — the ability to perceive its environment, make decisions, and act on them autonomously.

The distinction matters more than it sounds. When you use ChatGPT or Claude in a normal conversation, you're playing ping-pong. You serve, the AI returns, you serve again. Every action requires your input. You are the loop. Remove you, and nothing happens.

An agentic system is more like hiring a contractor. You describe what you want. You set the boundaries. Then you step back and let them work. They figure out the steps. They decide which tool to use when. They handle problems that arise without phoning you every thirty seconds. You're still in charge — you set the goal, you define the constraints, you review the output — but the execution is delegated.

That word — delegated — is the one that changes everything. We've gone from prompting machines to delegating to them. And delegation, as anyone who's ever managed a team knows, is an entirely different skill from giving instructions.

Why this isn't just chatbots with extra steps

I keep hearing people describe AI agents as "chatbots that can use tools." It's technically accurate in the same way that describing a car as "a horse with wheels" is technically accurate. You've captured the surface and missed everything that matters.

The difference between a chatbot and an AI agent isn't quantitative — it's not that agents are faster or smarter or have access to more data. It's qualitative. It's a different kind of thing.

Three properties define genuinely agentic systems, and they're worth understanding because they explain both the power and the risk:

Autonomy. The agent makes decisions without human intervention at each step. It plans, acts, observes the result, and decides what to do next. This is the obvious one — the one the marketing materials lead with.

Tool use. The agent doesn't just generate text. It calls APIs, reads databases, writes files, executes code, browses the web, sends emails. It interacts with the world through tools, the way you do. Your tools start using tools. Hence the title.

Reasoning loops. This is the one people miss. An agent doesn't just act — it reflects. It evaluates whether its action achieved the goal. If not, it reasons about why, adjusts its approach, and tries again. It's not executing a script. It's navigating toward an outcome through iteration. The same way you solve problems, except without the existential doubt and the need for a biscuit break.

Put those three together and you get something that, from the outside, looks remarkably like a competent junior employee. One who never sleeps, never gets distracted, processes information at superhuman speed, and — crucially — never develops opinions about the office temperature.
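The three properties above reduce to a surprisingly small loop: evaluate, plan, act, observe, repeat. Here's a minimal sketch of that pattern in Python — the names (`run_agent`, `pick_action`, the tool registry) are illustrative, not any real framework's API:

```python
# Minimal sketch of the plan-act-observe-evaluate loop described above.
# All names here are illustrative - this is not a real agent framework's API.

def run_agent(goal_reached, pick_action, tools, max_steps=10):
    """Iterate: check the goal, plan a step, act via a tool, record the result."""
    history = []  # list of (action, args, result) tuples
    for _ in range(max_steps):
        if goal_reached(history):               # evaluate: did we get there?
            break
        action, args = pick_action(history)     # plan: choose the next step
        result = tools[action](**args)          # act: invoke a tool
        history.append((action, args, result))  # observe: record what happened
    return history

# Toy run: "keep incrementing until the result reaches 3".
tools = {"inc": lambda x: x + 1}
pick = lambda history: ("inc", {"x": len(history)})
done = lambda history: bool(history) and history[-1][2] >= 3

trace = run_agent(done, pick, tools)
```

In a real system, `pick_action` is the language model and `tools` is everything from the filesystem to the company CRM — but the shape of the loop is the same, and the `max_steps` budget is the crude ancestor of every governance control discussed later.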

Note

Here's the shift most people haven't fully absorbed: we've moved from using AI to managing AI. The skill that matters now isn't prompt engineering — it's delegation. Defining outcomes, setting constraints, reviewing work. The same skills that make a good manager are becoming the skills that make a good AI operator. The mediocre middle manager's revenge, if you will.

The USB-C moment: MCP and why protocols matter

If you want to understand why agentic AI is accelerating now — why 2025-2026 is the inflection point rather than 2023 or 2024 — you need to understand a protocol called MCP.

MCP stands for Model Context Protocol. Anthropic created it, then — and this is the part that matters — handed it to the Linux Foundation. Open. Free. For everyone. It's now the emerging standard for how AI agents connect to tools and data sources.

Think of it like this. Before USB, every device had its own proprietary connector. Your printer cable didn't fit your scanner. Your scanner cable didn't fit your camera. Every connection was a special case. Then USB arrived and suddenly everything just... worked. One standard. Universal. The problem shifted from "how do I connect this thing" to "what do I want to connect."

MCP is doing the same thing for AI agents. Before MCP, if you wanted an AI to interact with your CRM, your database, your email, your project management tool — each integration was custom. Bespoke. Expensive. Fragile. A full-time job for someone, possibly several someones.

With MCP, a tool exposes a standard interface. Any AI agent that speaks MCP can use it. Your agent doesn't need custom code for Salesforce and custom code for HubSpot and custom code for your internal database. It needs MCP. Once. And then it can talk to anything that speaks the same language.
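The "one interface, many backends" idea is easier to see in code than in prose. This is a toy sketch, not the real MCP wire format (actual MCP exchanges JSON-RPC messages); the class and method names below are hypothetical:

```python
# Illustrative only: a toy version of "one interface, many tools".
# Real MCP servers speak JSON-RPC; these names are hypothetical.

class ToolServer:
    """Any backend - CRM, database, email - exposes the same two operations."""

    def __init__(self, name, tools):
        self.name = name
        self._tools = tools  # {tool_name: callable}

    def list_tools(self):
        # The agent discovers capabilities at runtime instead of
        # shipping with hard-coded knowledge of each vendor.
        return sorted(self._tools)

    def call_tool(self, tool, **kwargs):
        return self._tools[tool](**kwargs)

# Two very different backends, one identical interface:
crm = ToolServer("crm", {"lookup_customer": lambda email: {"email": email, "tier": "gold"}})
db = ToolServer("db", {"count_orders": lambda customer: 7})
```

The agent-side code that drives `crm` is byte-for-byte the code that drives `db` — which is the whole point. Swap in a new backend and nothing upstream changes.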

Google, not to be outdone, has released A2A — Agent-to-Agent protocol. If MCP is how agents talk to tools, A2A is how agents talk to each other. Your scheduling agent negotiates with your client's scheduling agent. Your procurement agent coordinates with your supplier's inventory agent. Agents orchestrating agents, across organisational boundaries, without humans doing the translation.

We're watching the plumbing get installed. It's not glamorous — protocols never are — but it's what makes the building habitable. And the speed of adoption has been remarkable. MCP went from announcement to near-universal adoption in months, not years. That almost never happens with infrastructure standards. Usually they take a decade of committee meetings and competing proposals and passive-aggressive emails between working group chairs. This one just... spread. Because it solved a real problem and nobody could come up with a good reason not to use it.

What agents are actually doing right now

Let me ground this in reality, because the discourse around agentic AI oscillates between "it can do everything" and "it can do nothing" with very little time spent in the truthful middle.

Software development is where agents have advanced furthest — partly because code is a structured domain with clear success criteria (does it compile? do the tests pass?), and partly because the people building AI agents are software developers who scratched their own itch first. Claude Code can spawn sixteen or more parallel sub-agents, each working on a different part of a codebase simultaneously. Cursor, Devin, Windsurf — these aren't toys. They're production tools being used daily by professional engineers. The question is no longer "can AI write code" but "how much of my codebase was written by an agent, and does it matter?"

Customer operations is the second frontier. Agents that handle customer inquiries end-to-end — not just answering questions, but checking order status, initiating refunds, escalating complex cases, updating CRM records. Klarna reported in 2024 that its AI assistant was doing the work of 700 customer service agents. Whether you find that inspiring or alarming probably depends on whether you were one of the 700.

Research and analysis is where I find agents most personally useful. An agent that can read a hundred documents, extract relevant information, cross-reference claims, identify contradictions, and produce a synthesis — that's not replacing an analyst, but it's giving every analyst a team. I used one last week to analyse the competitive landscape for a client. What would have taken a junior consultant three days took an agent forty minutes. The output needed editing — agents are fluent but not always discerning — but the raw material was comprehensive and well-structured.

Business process automation is the quiet revolution. Agents that handle invoice processing, compliance checking, data reconciliation, report generation. Not the exciting stuff that makes headlines. The mundane stuff that consumes 40% of most knowledge workers' time and makes them quietly consider whether this is really what their degree was for.

The enterprise reality

Let's talk numbers, because numbers have a clarifying effect on hype.

Gartner projects that 40% of enterprise applications will feature AI agents by the end of 2026. Up from less than 5% in early 2025. That's not linear growth — that's a step change. McKinsey puts it slightly differently: 62% of organisations are either actively scaling agentic AI (23%) or experimenting with it (39%). The UK is further along than many realise — a Salesforce survey found that 69% of UK organisations say most or all of their teams have adopted AI agents in some form.

Key Insight

Forrester estimates an average projected ROI of 171% from agentic AI deployments. That number should make you sit up. It should also make you suspicious. Projected ROI and actual ROI are related the way a weather forecast is related to the weather — directionally useful, not to be relied upon for picnic planning.

But here's the sobering counterpoint, and it's one the vendors would rather you didn't dwell on: Gartner also predicts that over 40% of agentic AI projects will be cancelled or significantly scaled back by 2027. Four in ten. Not because the technology doesn't work — because organisations deploy it without understanding what it takes to make it work. Without the data infrastructure, the governance frameworks, the cultural readiness. Without asking the hard questions about accountability, oversight, and what happens when an autonomous system makes a decision that costs real money or affects real people.

That's a pattern anyone who's lived through a technology hype cycle will recognise. The technology is real. The potential is real. And a significant number of organisations are going to spend substantial sums learning that technology is the easy part.

The risks nobody's talking about (or: the part where I rain on the parade)

I promised warmth, and I've delivered warmth. Now let me deliver honesty, because warmth without honesty is just flattery.

Compounding hallucinations. When a chatbot hallucinates — makes something up with unearned confidence — it's annoying but contained. You read the output, spot the error, move on. When an agent hallucinates, it acts on the hallucination. Then it observes the result of that action — which may itself be confusing or ambiguous — and reasons about what to do next based on a foundation that's already wrong. Error compounds on error, like a game of Chinese whispers played at machine speed. Each step looks locally reasonable. The accumulated result can be wildly, confidently, elaborately wrong. And by the time a human reviews it, the agent has built an entire house on a foundation of sand, complete with curtains and a welcome mat. Debugging multi-step autonomous reasoning is, to put it mildly, non-trivial.
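The compounding has simple arithmetic behind it. Under the simplifying assumption that each step succeeds independently with the same probability, end-to-end reliability is just that per-step probability raised to the number of steps:

```python
# Back-of-envelope model: probability an n-step autonomous chain is
# entirely correct, assuming each step succeeds independently with
# probability p. Real error modes are messier, but the trend holds.

def chain_reliability(p, n):
    return p ** n

# At 95% per step - excellent by human standards - a ten-step chain
# succeeds end-to-end only about 60% of the time, and a twenty-step
# chain drops to roughly a third.
for n in (1, 5, 10, 20):
    print(f"{n:2d} steps at 95%/step -> {chain_reliability(0.95, n):.0%} end-to-end")
```

This is why step budgets, checkpoints, and human review gates aren't bureaucratic overhead — they're the only way to stop a locally reasonable chain from drifting globally wrong.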

The accountability vacuum. When a human makes a decision that goes wrong, we know who's responsible. When an AI agent makes a decision that goes wrong — one it was never explicitly instructed to make, one that emerged from its autonomous reasoning about how best to achieve a goal you set — who's accountable? You, for setting the goal? The developer, for building the agent? The company that trained the model? This isn't a philosophical question. It's a legal and operational one. And right now, the answer in most organisations is a collective shrug accompanied by the phrase "we're still working on our AI governance framework," which is corporate for "we haven't thought about this and we're hoping nobody asks."

Agency isn't a feature. It's a transfer of decision rights. And most organisations are transferring those rights without fully understanding what they're giving away.

Agent washing. Perhaps the most immediately practical risk. Gartner estimates there are thousands of vendors claiming to offer "agentic AI" capabilities. Of those, it reckons roughly 130 are genuine. The rest are — let me find a diplomatic way to say this — optimistically labelling their products. A chatbot with a Zapier integration is not an AI agent. A workflow automation tool with an LLM bolted on is not an AI agent. But "agentic AI" is the hottest term in enterprise software right now, and the incentive to slap it on everything from CRM plugins to email clients is overwhelming. If someone is selling you "agentic AI" and it can't autonomously plan, execute, and evaluate — you're buying a chatbot in a trenchcoat.

The shift from doing to directing

Here's what I think this really means for how we work — and this is the part that keeps me up at night, in a good way, in the way that a genuinely interesting problem keeps you turning it over.

The last thirty years of computing have been about humans using tools. We type. We click. We drag. We configure. The tool does nothing unless we act. Every output requires a corresponding input. We are, fundamentally, operators.

Agentic AI shifts us from operators to directors. The job isn't to do the work. It's to define the work, set the quality standards, review the output, and intervene when something goes wrong. It's management. And not everyone who's excellent at doing the work will be excellent at directing it — just as the best individual contributor on a team isn't always the best manager. Different skills. Different instincts. The person who thrives writing code might struggle to effectively direct an AI that writes code, because directing requires you to think at a different level of abstraction. You have to care about what and why without getting seduced by how.

This is, if I'm being honest, the generalist's moment. The person who understands a bit of everything — enough to set direction, enough to evaluate output, enough to spot when something's gone sideways — becomes the most valuable person in the room. Not the deepest specialist. The broadest thinker. The one who can orchestrate.

The irony isn't lost on me. We spent twenty years telling generalists they needed to specialise. Now the tools are specialising for us, and what we need are people who can see the whole board. The mediocre generalist, it turns out, was early — not wrong.

What to actually do about this

I try not to write articles that leave people feeling informed but helpless. So here's what I'd actually do, if I were advising someone — a business leader, a team lead, anyone trying to navigate this honestly:

Start with one process, not a platform. Pick the most tedious, well-defined, low-risk process in your organisation. Invoice matching. Report generation. Data reconciliation. Something where failure is cheap and success is obvious. Give an agent that process. Learn from what happens. Then expand. Every successful agentic deployment I've seen started this way — one painful process, one quick win, then gradual expansion. The organisations that try to transform everything at once are the ones writing the post-mortems six months later.

Invest in governance before you invest in tools. Who reviews what the agent does? How do you audit decisions? What happens when it gets something wrong? These aren't questions to answer after deployment. They're prerequisites. And yes, governance is boring. It's also the difference between the 60% of projects that succeed and the 40% that get cancelled.

Develop your people, not just your technology. The skills your team needs are changing. Prompt engineering is table stakes. What matters now is the ability to decompose problems, define success criteria, evaluate autonomous output, and intervene effectively. These are management skills applied to machines. Most organisations have not invested in developing them. The ones that do will have a significant advantage.

Be deeply sceptical of vendor claims. If it can't autonomously plan, execute, and evaluate — it's not agentic. Ask for a demo. Not a slide deck. Not a case study. A demo, on your data, with your use case. Watch what it actually does. I've sat through enough vendor demos to know that the gap between the carefully rehearsed presentation and the real product is, in many cases, roughly the width of the Atlantic.


There's a moment — and I think about this a lot — when you stop using a tool and start collaborating with it. When the relationship shifts from command-and-response to something more like partnership. You set the direction. It figures out the path. You check the destination. It adjusts the route.

That moment is here. Not coming. Here.

Agentic AI isn't perfect. It hallucinates. It gets confused. It occasionally pursues a goal with the determined enthusiasm of a golden retriever chasing a ball off a cliff. It needs oversight, governance, and the kind of thoughtful human judgment that no model, however capable, can replace.

But it works. Really works. Not in the demo-only, venture-capital-funded, "imagine if" sense. In the Tuesday-afternoon, solving-actual-problems, saving-actual-time sense. The shift from chatbot to agent is the shift from a tool that helps you think to a tool that helps you do. And the organisations, teams, and individuals who learn to work with agents — not just use them, but genuinely collaborate with them — are going to operate at a speed and scale that makes everyone else feel like they're running in sand.

The tools are using tools now. The question isn't whether to pay attention.

It's whether you're directing, or being directed.