Will AI agents do everything? Or are they not delivering value? Will they take away most jobs? Or only a few? Contrary opinions abound because reality is nuanced, and there’s a gap between agents that are done well and are working and those that aren’t. This week we see that AI gives competitive advantages but is taking some entry-level jobs. And the new model releases are increasingly focused on agentic AI, both for enterprises and consumers.

Do 95% of AI PoCs Really Fail?

Remember the MIT study from last week that claimed only 5% of generative AI PoCs are producing any value? I was skeptical. Since then, the headline has been quoted often, but the report has been heavily criticized on its specifics.

GAI Insight’s analysts had strong criticisms about its “flawed methodology and unsupported conclusions.”
Paul Roetzer, CEO of the Marketing AI Institute, says “Please don’t put any weight into this study. This is not a viable, statistically valid thing,” and details why the methodology is flawed.

So, only 5% generating value? There is certainly ample evidence to the contrary, and these are MUCH larger studies. But directionally the MIT study is right, in that there is a gap and not everyone is seeing big benefits (this study says 25%, 5x the MIT study).

The direction is more important than the magnitude. Just last week came a report from McKinsey* which indicates that the AI gap is widening. Companies deploying generative AI are increasing their lead over competitors in operations performance, with a bigger gap across multiple KPIs now than a few years ago.

* McKinsey’s study was also small (only 100 companies) but had a more robust methodology.

Is AI Taking Entry-Level Jobs?

We’ve seen some signs, but the data has been sketchy. A new study from Stanford is probably the best evidence yet that AI is taking away some entry-level jobs.

In industries with high AI exposure (shown here are the two most affected, software development and customer service) you can see that there has been a decrease in employment among recent college graduates with a clear break from the trend in the fall of 2022. ChatGPT came out in November 2022.

“Early-career workers (ages 22-25) in the most AI-exposed occupations have experienced a 13 percent relative decline in employment even after controlling for firm-level shocks.”
– Stanford HAI, Canaries in the Coal Mine

New Models for AI Agents

Cohere released Command-A reasoning, a reasoning model for agentic AI based on their proprietary Command-A LLM. Cohere is in a small niche of companies that provide an LLM for business that isn’t also available for consumers. They make this model available for companies that want secure LLMs, as an alternative to open source. It’s also the model behind Cohere’s agentic AI platform, North. Cohere claims it outperforms other (open) models of the same size for agentic operations.

Anthropic released an AI agent as a browser plug-in called Claude for Chrome to help you do stuff as you browse the web. It’s in beta (they want to get more real-world use to ensure it’s safe). Since it’s in the browser, it sees what you see and can take action for you to fill out a form, draft an email response, schedule meetings, etc.

Other New Models This Week

Google ups their image generation capabilities with Gemini 2.5 Flash Image (the unofficial name is “Nano Banana”). It scores at the top of many benchmarks and early reviews are very positive. ChatGPT was generally seen as the best and most accessible option for image generation, but Gemini 2.5 Flash Image seems to be at least as good, and better for image fusion and character consistency.

OpenAI released an update to their voice-to-voice speech model that responds in real time, has improved accuracy and performance across multiple languages, and sounds more human. The big advancement in speed comes from a single-model approach (instead of chaining together three models: speech-to-text, text-to-text (an LLM like ChatGPT), and then text-to-speech).

My take on why it matters, particularly for generative AI in the workplace

On Employment

We’ve suspected AI might be hitting college grads for some time now, and it does seem we’ve moved from anecdotal stories to enough evidence to conclude that generative AI is reducing employment for entry-level positions, at least for coding and customer service. The question is: how far will this go? It won’t eliminate them entirely, but it is a faster-than-usual shift. Time will tell:

Will this be a shift, or will it go far enough to disrupt degree programs, college education, and what young people pursue for careers?
Will companies ultimately suffer from no longer having a “training ground” for employees to gain on-the-job skills? Or will that gap be mostly filled by AI?

For now I see this as a market shift, not a disruption. We’re going to need programmers for a long time. As that field gets more competitive, it will be less about raw ability and more about being skilled at using the AI tools. I expect a bigger impact on customer service, and since that has historically been a path for new workers to gain experience, that could provide fewer entry-level opportunities.

The takeaway for young people is clear. Whatever your major or planned career, learn how to use generative AI in your domain. Not to cheat on your homework, but to make you better at what you do.

AI Agents in the Enterprise

Cohere sees the opportunity for AI agents in the enterprise, and is working hard to improve their offerings for companies that don’t want to be subject to the whims of the big vendors or the hassles of running open models. They are filling a void between the mass-market ChatGPT/Claude and the do-it-yourself Llama/DeepSeek, but it’s hard to predict what that demand will look like down the road.

AI Agents for Consumers

Anthropic sees the opportunity for AI agents for consumers, just as OpenAI and others do. Filling out online forms or scheduling an appointment are not high-value tasks, but they are things that take time and, perhaps more importantly, they aren’t very fun. If Claude can automate these routine activities, we will not only be more efficient but also happier, because we can be relieved of the boring and menial tasks. Maybe even the most boring tasks like having to fill out Captchas:

No, I’m not kidding. When current agents encounter this, they say things like “The website has a Captcha to prevent bots from accessing the site. I will complete the Captcha to prove that I am human.”

Which is why we always have to remind ourselves, this AI is not a human at all, it just does a decent job of mimicking one.

Voice AI Agents

The evolution of OpenAI’s speech-to-speech model is significant beyond the usual “it’s better, faster, and cheaper.” It’s part of a trend where single models replace the multi-model approach. But what’s more important is that it’s a step towards an agentic voice agent. The model is designed for accurate function calling, for working with images included along with the speech, for calling tools via the Model Context Protocol (MCP), and for easy integration with phone networks (in case you aren’t already getting enough phone calls from AI).

I hope you all had a fantastic Labor Day weekend. See you next week.

The Great AI Agent Debate