Agentic AI is Emerging

Everyone is talking about agentic AI in the workplace (and for consumers too). Although it's currently more hype than reality, it is going to be a big part of our future, which means companies that can provide agent ecosystems see it as a key source of revenue. Last week we saw two significant players go all-in on providing an agentic ecosystem for enterprises.

IBM sees agentic AI as the future and is going all-in to be the business platform that supports it. They're predicting that a billion – with a B – applications will be created with AI over the next few years, many of them using agents. To help companies accomplish that, they have expanded their agent orchestration platform to include:

  • An AI agent catalog
  • Easy integration with their agent orchestration platform (watsonx Orchestrate)
  • Domain-specific agent templates
  • A no-code agent builder
  • An agent development toolkit (for developers)

This reflects a similar push that we’re seeing from all the big players, from Microsoft to Salesforce to OpenAI, into the brand-new world of AI agents. And Mistral too…

Mistral announced its enterprise push with the release of Le Chat Enterprise, an AI assistant and agent offering for businesses. It includes enterprise search (RAG), a no-code agent builder, data connectors, tool connectors, and supposedly self-improving agents, for $25/employee/month (vs. OpenAI's $60). However, the connectors are limited (the usual SharePoint, OneDrive, Google Drive, Gmail, etc.), there's no information about how they do their RAG, so it's probably naïve RAG (vector retrieval only), and it's not clear if/how they handle security.
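
To make the "naïve RAG" point concrete, here is a minimal, purely illustrative sketch of vector-only retrieval; the embed() and ask_llm() functions are placeholders for whatever embedding model and chat endpoint you use, not Mistral's actual implementation. The point is that retrieval is just similarity search over document chunks pasted into the prompt, with no reranking, access controls, or query planning.

```python
# A minimal sketch of "naive" RAG: embed chunks, retrieve by cosine similarity,
# and stuff the top matches into the prompt. embed() and ask_llm() are
# placeholders for your own embedding model and chat endpoint.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: call your embedding model and return one vector per text.
    raise NotImplementedError

def retrieve(query: str, chunks: list[str], chunk_vecs: np.ndarray, k: int = 3) -> list[str]:
    q = embed([query])[0]
    # Cosine similarity between the query vector and every chunk vector.
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

def answer(query: str, chunks: list[str], chunk_vecs: np.ndarray) -> str:
    context = "\n\n".join(retrieve(query, chunks, chunk_vecs))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return ask_llm(prompt)  # placeholder chat completion call
```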

Getting Agents to Work Together

A so-far unsolved problem is a universal way for agents to discover one another and work together, especially when they’re built on different platforms. Having standards will be critical for them to cooperate – like the standards that exist behind every website that make the internet possible. There are several proposed standards – ACP, AGNTCY, NANDA – but Google’s A2A standard just got a big boost after Microsoft announced support for it in upcoming versions of Copilot Studio and Azure AI Foundry.
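
To give a flavor of what "discovery" could look like in practice, here is a rough Python sketch in the spirit of A2A, where each agent publishes a JSON "agent card" describing its skills at a well-known URL, and a client reads the card before sending the agent any work. The path, field names, and URLs below are simplified from my reading of the public draft and are illustrative only, not a definitive implementation.

```python
# Hedged sketch of A2A-style discovery: fetch an agent's "agent card" and
# check its advertised skills before deciding whether to hand it a task.
import requests

def fetch_agent_card(base_url: str) -> dict:
    # The A2A draft uses a well-known path for the card; adjust if the spec differs.
    resp = requests.get(f"{base_url}/.well-known/agent.json", timeout=10)
    resp.raise_for_status()
    return resp.json()

def find_agent_with_skill(agent_urls: list[str], skill_id: str) -> dict | None:
    for url in agent_urls:
        card = fetch_agent_card(url)
        skills = {s.get("id") for s in card.get("skills", [])}
        if skill_id in skills:
            return card
    return None

# Hypothetical example: look for an agent that can extract invoice data.
card = find_agent_with_skill(["https://agents.example.com/invoice-bot"], "extract-invoice-data")
if card:
    print("Found:", card.get("name"), "at", card.get("url"))
```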

How Are Agents Impacting the Workforce?

We know that everyone is excited about the promise of AI Agents to help increase productivity, improve efficiency, streamline tasks, and even increase quality. What we don’t know is the extent to which agents are already being used to drive efficiency, and whether that’s making people more productive (or resulting in fewer working people, as Klarna reported last year).

This past week The Atlantic published a piece about how the unemployment rate for new college graduates is higher than the overall unemployment rate (graph below), and the trend is not moving in the right direction. One possibility they propose is that this is exactly what we would expect to see if companies were using AI to handle entry-level work, the kinds of tasks AI is good at. To be fair, the trend started around the time of COVID, well before generative AI hit the scene, but AI is a possible explanation for why it hasn't recovered.

First “Open” Computer Use Agent

You may recall that AI can control a browser, via OpenAI's Operator and Anthropic's Computer Use. They're still slow and unreliable, more a sign of things to come than tools that can perform real work. But now there is an open version: Hugging Face (a company that hosts open source AI projects) announced a computer control agent demo.

The Week’s Closed Models

Coders take note! Google released – get ready for this awesome name – Gemini 2.5 Pro Preview (I/O edition)! Mostly this is an upgrade to what is already a very powerful model, with better coding abilities.

The Week’s Open Models

IBM released a preview of Granite 4.0 Tiny – a 7B parameter model that uses Mixture of Experts so that only about 1B parameters are activated at any given time, and that IBM says will perform as well as their current 8B Granite 3.3. It is targeted for consumer use and can run on a $350 GPU. It's still in pre-training, having learned from 2.5T tokens so far out of a planned 15T+. It will be interesting to see how well those predictions hold.
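
As a rough illustration of how Mixture of Experts keeps the active parameter count low (this is a generic sketch, not Granite's actual architecture): a small router scores every expert for each token, but only the top few experts actually run, so most of the model's weights sit idle on any given token. With 16 experts and top-2 routing as below, only about 1/8 of the expert parameters fire per token, which is roughly how a 7B-parameter model can end up activating only around 1B.

```python
# Illustrative Mixture-of-Experts layer (generic sketch, not Granite's code):
# the router picks the top-k experts per token, and only those experts run.
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, dim=512, num_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):              # simple loop, kept readable
            for w, idx in zip(top_w[t], top_idx[t]):
                out[t] += w * self.experts[int(idx)](x[t])
        return out
```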

Allen AI released OLMo 2 1B (OLMo comes from Open Language MOdel), a small 1B parameter model trained on 4T tokens. The message here is that small models (like Phi and Qwen) keep improving and can run on cheap hardware (laptops).

Mistral released Mistral Medium 3, claiming (on the usual benchmarks) it delivers 90% of Claude Sonnet 3.7's performance at approximately 1/7th the cost.


My take on why it matters, particularly for generative AI in the workplace

All signs indicate that AI agents are going to transform work in the enterprise. One of the keys to making that work is orchestration: how agents will be able to discover one another and work together to accomplish things. While there is still a lot of debate about what AI Agents will be able to do, both with current technology and with the advances that keep coming, here's where I think we stand:

Agents will be capable of assisting humans

There’s no question that they can perform many tasks faster (and often better) than humans, such as performing research and writing reports. They’ve become indispensable tools to assist programmers and coders, and can handle many customer service functions (at least the easier ones) which means the human agents can focus on the more complex issues.

It’s unclear just how far they’ll be able to go with complex tasks

So far, it’s not clear how reliable AI Agents will be in longer processes that involve many steps, since the error rate compounds with each additional step that must be completed. There are many processes in the enterprise where accuracy is paramount, and better techniques or technology are needed before they can take on those tasks.
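
A quick back-of-the-envelope calculation shows why compounding matters: if each step succeeds independently with probability p, an n-step workflow succeeds end to end with probability p^n (independence and no self-correction are simplifying assumptions here).

```python
# Why long multi-step agent workflows are hard: per-step reliability compounds.
for p in (0.99, 0.95, 0.90):          # assumed per-step success rates
    for n in (5, 10, 20, 50):         # number of steps in the workflow
        print(f"per-step {p:.0%}, {n:2d} steps -> end-to-end {p**n:.0%}")
# For example, 95% per step over 20 steps gives only about a 36% chance
# of completing the whole task without an error.
```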

There will be many agents

One of the few things that is clear right now is that the more specific and focused the AI Agent, the better the results. While all the AI vendors would love to be the first with a general purpose agent, for now that is still science fiction. Especially when accuracy and reliability matter (as they do in business), agents need to be narrow. So for now, expect many specialized, narrow agents, even within a single organization.

They will come from different platforms

Because there will be specialized agents, there will be a plethora of options for creating agents. It’s unlikely that one single platform will dominate, as companies will want the flexibility of creating and running agents from many different applications.

Agent orchestration will be key to realizing their value

Because there will be many agents built on many different platforms, orchestration is critical. This is a developing area, as companies are just beginning to string together multiple agents to perform more involved tasks. Agents will need to be able to discover one another, understand each other's capabilities, and then ask another agent to execute a task.

When one agent calls another, it may be a delegation (in which case control or execution returns to the requesting agent) or a handoff (in which case execution continues without that agent's involvement). There are also questions about permissions and protecting secure content, particularly if a downstream agent has access to more information than the originating employee or agent. And there is a need for error management, for when an agent can't complete its task, and for observability, so that once the task is completed, the employee can see how it was accomplished.
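
Here is a tiny, purely illustrative Python sketch of the delegation and handoff patterns just described; the Agent class and helper functions are made up for the example (not any vendor's API), and the log lists stand in for the observability trail.

```python
# Illustrative sketch (no vendor's API) of delegation vs. handoff between agents.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    log: list[str] = field(default_factory=list)   # simple observability trail

    def run(self, task: str) -> str:
        self.log.append(f"{self.name} handled: {task}")
        return f"result of '{task}' from {self.name}"

def delegate(orchestrator: Agent, specialist: Agent, task: str) -> str:
    # Delegation: the specialist does the work, but control returns to the caller,
    # which can validate the result, handle errors, or combine it with other steps.
    try:
        result = specialist.run(task)
    except Exception as err:
        orchestrator.log.append(f"delegation to {specialist.name} failed: {err}")
        raise
    orchestrator.log.append(f"received from {specialist.name}: {result}")
    return result

def hand_off(orchestrator: Agent, specialist: Agent, task: str) -> str:
    # Handoff: the specialist takes over and the original agent is no longer involved.
    orchestrator.log.append(f"handed '{task}' off to {specialist.name}")
    return specialist.run(task)
```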

How they will interact isn’t fully determined yet

Not all of this has been figured out yet! Several standards have been proposed; none yet is a clear winner (and no doubt they will evolve over time), so it will be interesting to see how this develops.

Google’s Agent2Agent (A2A) protocol is gaining traction

Google's A2A protocol currently carries the most mindshare, especially now that Microsoft has announced they will support it. However, Microsoft didn't say its support would be exclusive, so ACP, AGNTCY, NANDA (and others, I'm sure) are all contenders at this point. If you're interested, here's a good comparison of some of the leading contenders.

What Does This Mean?

Generative AI, and in particular the concept of AI agents – software that takes action on its own, that determines what to do instead of following prescribed steps – will transform how we use information, and has the potential to transform how we accomplish things, maybe even to the point where it does most of the accomplishing! AI Agents may do the bulk of what we now call work, with humans taking a back seat, only stepping in periodically to provide guidance and oversight. It’s too early to tell how far it will go, but the technology is advancing crazy fast, faster than anything I’ve ever experienced before, and adoption is starting.

These AI Agents are possible because of recent advances in generative AI and LLMs. Like any technology, they have their limits, so I’m not worried about them taking over the world, but they do have tremendous capability because it’s the first time that a machine can “think” in a broad way (they don’t really think, but they approximate it close enough that they’re useful).

So, you need to get ready for AI Agents. What that means for you will vary depending on your job, your age, and your interests. If you’re not using AI yet, start now. It doesn’t require any special skill (if you know how to talk, you know how to use it). And using it is the best way to find out what it’s good for, and what it isn’t.

Try out Anthropic's Claude, Google's Gemini, Meta's Meta AI, OpenAI's ChatGPT, or xAI's Grok. It doesn't really matter which one. Just get started, but don't treat it like a search engine (one search, one result, done). Treat it like a person. Have a conversation. If you don't like something about the response, tell it what you don't like and ask it to try again.

Using generative AI is the first step. If you want to experience an AI Agent, pick a topic to research, and try one of the Deep Research apps (you can choose this mode from the settings – most except Anthropic offer a few reports a month for free). It can be any topic you want. For instance:

Compare the success of Taylor Swift’s Eras tour with the popularity of the Beatles, and considering the cultural and technological differences of the two time periods, assess their comparative success if they had both happened in the 2020s. Include recommendations for what the Beatles could have learned from Swift’s success.

Gemini put together a 7-step plan to research these topics and build a comparison. It did a good job of recognizing the tremendous social and technological differences that served as backdrop for these two groups, and how each leveraged the culture of their time. But the differences were so great that it was difficult to make comparisons (which is what I intended) so the conclusions were fairly basic. Gemini provided a good foundation to build on if I wanted to carry the research further.

AI Agents are indeed emerging, ushering in a new era for all of us.
