Silicon & Photonics | 1 February 2026 | 15 min read

Choosing Your AI: Claude, ChatGPT, and Gemini — An Honest Comparison

Not a feature matrix. A thinker's guide to which model actually fits your work.

Sajad Saleem

the mediocre generalist

Someone asks me, at least once a week, "which AI should I use?" They say it the way people used to ask "Mac or PC?" — expecting a clean answer, a single winner, a neat resolution that saves them from having to think about it further.

I always disappoint them. Because the honest answer is: it depends on who you are, what you're doing, and what you value. And that answer, while unsatisfying, is the only one that won't lead you astray.

This isn't a feature matrix. You can find those anywhere — neat little tables with green ticks and red crosses that make the decision look simple because they've removed everything that makes it hard. This is something different. A thinker's guide to the three models that matter most in early 2026, written by someone who uses all of them, has opinions about all of them, and believes the real question isn't "which is best?" but "best for what?"

Let's get into it.

The landscape in early 2026

First, some perspective on how fast the ground is moving beneath our feet.

Twelve months ago, ChatGPT held roughly 87% of the AI assistant market. Today it's closer to 60%. Still dominant — still the name most people reach for when they say "AI" the way they say "Google" when they mean "search" — but the gap is closing with a speed that should make anyone nervous about declaring permanent winners in this space.

Gemini, Google's contender, went from 5.7% to over 21% market share in a single year. Claude, built by Anthropic, now commands around 29% of the enterprise AI market, with over 300,000 business customers. And here's the number that tells you more than any market share figure: 79% of companies that pay for OpenAI's enterprise tier also pay for Anthropic. The multi-model future isn't coming. It arrived while the pundits were still debating which horse to back.

The models themselves have evolved so rapidly that anything I write today will be partially outdated by the time you read it. That's not a disclaimer — it's a feature of this landscape. The ability to hold knowledge lightly, to update your mental model when the evidence changes, is itself a skill. Possibly the most important one for navigating this era.

So here's my honest assessment, written in early 2026, with full awareness that the terrain will shift again. Think of this as a snapshot with a point of view.

Claude: the thinker's choice

I should be transparent about my biases. I built this site with Claude Code. I write with Claude. I think with Claude. If you've read my love letter to Opus, you know where my heart lives. But I'm going to try to be fair here, because the purpose of this piece is to help you make a good decision, not to recruit you into my particular enthusiasm.

Claude currently runs Opus 4.6 as its flagship model, with Sonnet 4.6 as the workhorse and Haiku 4.5 as the lightweight option. The context window stretches to a million tokens — roughly the length of eight novels held simultaneously in working memory. On SWE-bench, the industry's standard test for real-world coding ability, Claude leads at approximately 80.8%. METR, which measures how long an AI can work autonomously on complex tasks, clocks Claude at a 14.5-hour task horizon. That means it can sustain coherent, productive work on a complex problem for over half a day without losing the thread.
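The "eight novels" figure is easy to sanity-check with back-of-envelope numbers. A minimal sketch, assuming a typical novel runs about 90,000 words and English prose averages roughly 1.3 tokens per word (both are illustrative assumptions, not official figures):

```python
# Back-of-envelope: how many novels fit in a one-million-token context window?
# Assumptions (illustrative, not official figures):
#   - a typical novel is ~90,000 words
#   - English prose averages ~1.3 tokens per word

CONTEXT_WINDOW_TOKENS = 1_000_000
WORDS_PER_NOVEL = 90_000
TOKENS_PER_WORD = 1.3

tokens_per_novel = WORDS_PER_NOVEL * TOKENS_PER_WORD   # ~117,000 tokens
novels_in_context = CONTEXT_WINDOW_TOKENS / tokens_per_novel

print(f"~{novels_in_context:.1f} novels")  # ~8.5 novels
```

The exact ratio shifts with tokenizer and prose style, but it lands in the same ballpark either way.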

Those are the numbers. Here's what the numbers don't capture.

Claude writes. Not in the functional, gets-the-job-done way that most models write — though it does that too. It writes with a quality that independent evaluations have described as "editorial." An 85% essay structure score, which sounds abstract until you've experienced the difference between a model that assembles paragraphs and one that composes them. The difference between someone who can play the notes and someone who plays music.

For coding — and I say this as someone who has spent hundreds of hours in Claude Code — the experience is less like using a tool and more like pair programming with a senior architect who has infinite patience and zero ego. Claude Code can run sixteen parallel agents simultaneously, navigating complex codebases, running tests, catching errors you wouldn't have found until production. It's terminal-native, which means it lives where developers live, not in a browser tab you have to context-switch to.

Where Claude genuinely excels: long-form writing, complex coding, nuanced analysis, anything that requires holding a subtle argument across many paragraphs or a complex codebase across many files. It handles ambiguity with grace. It pushes back when your approach is flawed. It has something I've written about before and will probably never stop talking about — taste.

Where Claude falls short: it's not the best at quick creative brainstorming. It can be cautious — sometimes too cautious — when you want it to swing for the fences. It lacks the sprawling plugin ecosystem that ChatGPT offers. And it doesn't browse the web natively, which means it can't pull real-time information mid-conversation the way Gemini can. If you need a Swiss Army knife, Claude is more of a scalpel. Brilliant at what it does. Not trying to be everything.

ChatGPT: the generalist's Swiss Army knife

ChatGPT is, in many ways, the AI that opened the door for everyone else. Whatever you think of OpenAI as a company — and there's plenty to think about, which we'll get to — the product itself deserves respect for what it achieved. It made AI conversational, accessible, and mainstream in a way that nobody else managed first.

The current lineup: GPT-5.4 Pro and Thinking for heavy lifting, GPT-5.3 Instant for speed. Over 800 million monthly active users. A tiered pricing structure from the Go tier at eight pounds a month through Plus at twenty, up to Pro at two hundred. The broadest ecosystem in the space — plugins, voice mode, DALL-E image generation, web browsing, custom GPTs, an app store, integration with practically everything.

ChatGPT's superpower is breadth. It does more things, in more ways, for more people, than any other AI product on the market. Need to generate an image? It does that. Want to have a voice conversation? It does that. Need to browse the web, analyse a spreadsheet, write code, brainstorm marketing copy, and plan a holiday — all in the same conversation? ChatGPT handles the transitions between these modes with a fluency that comes from having the largest user base and the most mature product infrastructure in the industry.

For creative brainstorming specifically, I find ChatGPT genuinely better than Claude. There's a divergence to its thinking — a willingness to go weird, to follow tangents, to generate ideas that are wrong but interesting rather than correct but predictable. When I need to break out of a rut, to see a problem from an angle I hadn't considered, ChatGPT is often where I start. It throws paint at the wall with an enthusiasm that's infectious.

Where ChatGPT falls short: depth. In my experience — and this is vibes, not benchmarks, but I've argued before that vibes are a valid evaluation methodology — ChatGPT's output starts strong and gets generic over sustained conversation. The first response is excellent. Follow-up responses begin to flatten. Where Claude digs deeper on the seventh exchange, ChatGPT often starts repeating itself in different words. Wide but shallow, where Claude is narrower but deep.

The coding is good. Not great. Not Claude-grade. Fine for scripts, prototypes, quick fixes. But for sustained, complex software engineering — the kind where you're building a production system over days, not minutes — I find myself switching to Claude every time. Not because ChatGPT can't do it, but because the gap in code quality and architectural thinking compounds over a long project the way a small interest rate compounds over years.

Note

One number that deserves your attention: OpenAI is projected to lose roughly fourteen billion dollars by 2026. They're spending enormously on compute, talent, and product expansion. Whether that's visionary investment or unsustainable burn depends entirely on whether GPT-5.4 and beyond can justify the economics. The business model behind your AI tool matters, because it determines whether the tool will still exist — and at what price — in two years.

Gemini: the dark horse that learned to gallop

Gemini is the one most people underestimate. I did too, initially. Google has a long and storied history of building brilliant technology and then fumbling the product — Bard's rocky launch didn't help — so it was easy to dismiss Gemini as a big company's second attempt at catching up.

That dismissal is no longer credible.

Gemini 3.1 Pro, 3 Flash, and 3.1 Flash-Lite represent a product line that has found its identity. And that identity is built on three genuine advantages that neither Claude nor ChatGPT can currently match.

Native multimodality. Gemini doesn't bolt image and video understanding onto a language model as an afterthought. It's multimodal from the ground up — trained on text, images, audio, and video simultaneously. Feed it a video and ask it to analyse what's happening — not a screenshot, not a transcript, the actual moving footage — and it processes it with a fluency that reveals a fundamentally different architecture. Agentic Vision lets it perceive and reason about visual information in real time. For anyone whose work involves video, images, or complex visual data, Gemini isn't just competitive. It's arguably ahead.

Search grounding. This is the advantage that matters most and gets talked about least. Gemini has real-time access to Google Search, integrated natively into the reasoning process. Not as a plugin you invoke. Not as a separate step. As a seamless part of how it thinks. When you ask Gemini a question that requires current information, it doesn't hallucinate an answer or hedge with "as of my last training data" — it checks. Against the most comprehensive search index on earth. For research, fact-checking, and any task where accuracy about the current state of the world matters, this is a material advantage that the other models simply cannot replicate without their own search engine.

Cost efficiency. Gemini Flash models are priced at roughly fifty pence per million input tokens and three pounds per million output tokens. That's not a rounding error — it's a fraction of what comparable models from OpenAI and Anthropic charge. For businesses running AI at scale, processing millions of documents or customer interactions, the cost difference is the difference between a viable product and a failed business case. Flash-Lite pushes this even further, offering a model that's good enough for many production tasks at prices that make API calls nearly free.
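To make "the difference between a viable product and a failed business case" concrete, here is a rough cost sketch using the Flash prices quoted above. The document sizes and batch counts are hypothetical, chosen only to show the shape of the arithmetic:

```python
# Illustrative cost estimate at scale, using the Flash pricing quoted above:
# ~£0.50 per million input tokens, ~£3.00 per million output tokens.
# Document sizes and counts below are hypothetical.

INPUT_PRICE_PER_M = 0.50   # GBP per million input tokens
OUTPUT_PRICE_PER_M = 3.00  # GBP per million output tokens

def batch_cost(n_docs: int, input_tokens_each: int, output_tokens_each: int) -> float:
    """Total cost in GBP to process a batch of documents."""
    total_input_m = n_docs * input_tokens_each / 1_000_000
    total_output_m = n_docs * output_tokens_each / 1_000_000
    return total_input_m * INPUT_PRICE_PER_M + total_output_m * OUTPUT_PRICE_PER_M

# One million documents, ~2,000 input tokens and ~300 output tokens each:
print(f"£{batch_cost(1_000_000, 2_000, 300):,.0f}")  # £1,900
```

At prices several times higher, the same batch runs into five figures — which is the whole point of the cost-efficiency argument.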

Then there's distribution. Over 750 million monthly active users through Android integration alone. Gemini is becoming the default AI for the largest mobile operating system on earth, which means it's reaching hundreds of millions of people who will never visit an AI company's website, never compare models, never read an article like this one. They'll just have a helpful AI on their phone, and it'll be Gemini, and they'll assume that's what AI is. That's how markets are actually won — not by being best in class, but by being there.

Where Gemini falls short: writing quality. For sustained, nuanced, editorial-quality prose, Gemini produces output that reads like a very competent Google employee wrote it — which, architecturally speaking, is exactly what happened. Accurate, well-organised, slightly corporate. The personality that makes Claude feel like a thoughtful colleague and ChatGPT feel like a creative sparring partner is less present here. Gemini is efficient where the others are expressive. Informative where they are insightful.

There's also a harder story to tell. In early 2026, a wrongful death lawsuit was filed connected to a Gemini chatbot interaction — the first major legal action alleging direct harm from an AI conversation. The facts are contested and early, and I won't pretend to know the outcome. But it's a signal that as AI becomes more intimate — more present in daily lives, more trusted, more relied upon as a companion or adviser — the consequences of getting it wrong become more severe. Every company building these tools needs to sit with that.

The pricing disruptor: a word about DeepSeek

No honest comparison of the AI landscape in 2026 can skip past DeepSeek. The Chinese lab released models that match or approach frontier performance at prices that made the rest of the industry do a collective double-take. Their open-source approach and aggressive cost optimisation forced every major lab to reconsider their pricing assumptions overnight.

DeepSeek matters not because most readers will use it directly — data sovereignty concerns, regulatory uncertainty, and questions about training data provenance limit its adoption in the West. It matters because it applies relentless downward pressure on pricing across the entire market. When someone proves you can build a competitive model for a fraction of the cost, the expensive models need a better answer than "we were here first." The rising tide of cost efficiency lifts all users, regardless of which model they choose.

Think of DeepSeek as the budget airline of AI. You might not fly with them, but their existence is the reason your ticket costs less.

The honest comparison: what each is genuinely best at

Here's where I stop being diplomatic and start being useful.

Key Insight

The 79% overlap statistic — nearly four in five OpenAI enterprise customers also paying for Anthropic — tells you something important about how sophisticated users actually behave. They don't pick a side. They pick the right tool for each task. "Which AI is best?" is a consumer question. The professional question is "which AI is best for this?"

Choose Claude when you need to write something that matters — a long document, a careful analysis, an essay that needs to persuade rather than merely inform. When you're coding anything non-trivial, especially a sustained project over days or weeks. When the task requires nuance, subtlety, or holding complexity without collapsing it into false simplicity. When you want to think with your AI, not just use it.

Choose ChatGPT when you need creative brainstorming, divergent thinking, a rapid-fire exploration of possibilities. When you want the broadest feature set in a single product — image generation, voice, web browsing, plugins, all in one place. When you're working with someone less technical and need an interface that's polished and intuitive. When the ecosystem matters more than any single capability.

Choose Gemini when your work involves video, complex visual data, or multimodal inputs that go beyond text and images. When you need real-time information woven into reasoning, not bolted on as an afterthought. When cost efficiency at scale is a genuine constraint rather than a nice-to-have. When you're already deep in the Google ecosystem and want AI that works where you work. When you need strong multilingual capability — Gemini's training on the breadth of the internet gives it an edge in non-English languages that often goes unmentioned.

Choose more than one when you're serious about your work. Which, if you've read this far, you probably are.

The ethics dimension — and why it matters more than benchmarks

I've saved this for near the end because I wanted to establish the practical comparison first. But I'd be dishonest if I wrote two thousand words about these three companies without addressing the elephant — the herd of elephants, really — in the room.

On 28 February 2026, Anthropic walked away from a two-hundred-million-dollar Pentagon contract because the Department of Defense wanted to use Claude for mass domestic surveillance without judicial oversight and in lethal autonomous weapons systems without human authorisation. Anthropic said no. They were designated a "supply chain risk" — the same classification normally reserved for companies linked to foreign adversaries — and cut off from government contracts.

Hours later, OpenAI signed the deal.

I've written about this in detail. The facts are a matter of public record. Caitlin Kalinowski, who led OpenAI's robotics division, resigned over it. Nearly 900 employees across OpenAI and Google signed an open letter opposing it. Geoffrey Hinton — the man who helped invent the underlying technology — has been warning about precisely this scenario for years.

What matters here, in the context of choosing your tools, is this: the companies building AI have values, whether they state them explicitly or not. Those values shape the product. They shape the guardrails. They shape what the model will and won't do, what data it's trained on, how it handles sensitive topics, and what happens when a government comes knocking with a chequebook and a list of requirements that push against the boundaries of what technology should be used for.

You don't have to care about this. Plenty of intelligent people choose tools purely on capability and cost, and they're not wrong to do so. A hammer doesn't need a moral philosophy. But AI is not a hammer. It's a cognitive partner that shapes how you think, what you create, and what futures you fund with your subscription. If the question "what kind of world am I supporting with my twelve quid a month?" is one you're willing to sit with, then it's worth knowing that these companies made profoundly different choices when the moment arrived.

I use Claude as my primary model. The writing quality, the coding capability, the taste — these are the practical reasons. But I won't pretend the ethical dimension doesn't factor into my preference. It does. Not as the sole consideration, but as the tiebreaker that, when two tools are close enough in capability, tips the scale. I'd rather give my money to the company that walked away from two hundred million dollars than to the one that picked it up off the floor.

You might weigh that differently. That's your prerogative. But weigh it deliberately. Don't let the choice be made by default.

The multi-model reality

The actual future — the one that's already here for anyone paying attention — isn't about choosing one AI. It's about developing fluency across several.

I start many creative projects in ChatGPT, because its divergent thinking generates raw material I wouldn't have arrived at alone. I move to Claude for the deep work — the writing, the coding, the sustained thinking that requires a model which gets better over a long conversation rather than flatter. I use Gemini when I need to check something against reality, when I'm processing visual data, or when I'm working within the Google ecosystem and want my AI close to my tools.

This isn't inefficiency. It's sophistication. A carpenter doesn't own one tool. A chef doesn't use one knife. The skill isn't choosing the right model permanently — it's knowing which one to reach for in this moment, for this task, and having the judgment to switch when the task changes.

The companies know this, by the way. They're all racing to become the one model that does everything, the single pane of glass, the default. That race is excellent for users — competition drives quality up and prices down. But I suspect the outcome will be plurality, not monopoly. The models are converging in capability while diverging in character. They're becoming less like competing products and more like different colleagues — each with their own strengths, their own way of approaching a problem, their own distinctive quality of attention.

And maybe that's the most interesting thing about this moment. Not which model is best, but the fact that we live in a time when the question is genuinely hard to answer. When three different architectures, built by three different companies with three different philosophies, can each make a legitimate case for being exactly what you need. That's not confusing. That's abundance. And abundance, if you know how to navigate it, is the best problem to have.


Here's what I actually believe, after spending more hours than I'd care to admit working alongside all three of these models, building with them, arguing with them, pushing them to their limits and noting carefully where those limits fall.

The question "which AI is best?" is a question from 2023. It made sense when there was one serious option and a collection of also-rans. It doesn't make sense now. The question for 2026 is: "which AI is best for this moment, this task, this version of me?" And the answer changes. Weekly, sometimes daily. Which means the most valuable skill isn't picking the right model — it's developing the judgment to know when to switch.

Try all three. Not for an afternoon — for a week each, on real work, with real stakes. Pay attention not just to what each one produces but to how it makes you feel. Notice where you lean forward and where you lean back. Where you're surprised and where you're bored. Where the conversation deepens and where it goes flat. The data matters. The vibes matter more.

The best AI is the one that makes you think better. Not the one that thinks for you.

And if that answer disappoints you — if you came here wanting me to name a winner and save you the trouble of deciding — well. The trouble is where all the value lives. It always has been.

The test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function.

— F. Scott Fitzgerald