Every week it’s the same – new models, lower cost, more inference, and more agentic claims. Amid all this, I found a few news items of note this week.
- One of the authors of the original RAG paper from Facebook started his own company (contextual.ai), and this week they announced general availability of their RAG platform. They reinforce that plain vanilla RAG isn’t good enough – you need much more sophisticated RAG to ensure you’re sending good-quality information to the LLM, so you avoid the garbage-in, garbage-out problem. They are not a search company, though, and they don’t say much about how they do retrieval (only that they use a mixture-of-retrievers and a reranker), even though strong retrieval is precisely what minimizes hallucinations in RAG.
- As the push toward “agentic AI” continues, OpenAI released “tasks” (in beta, for paid users only), which let ChatGPT perform tasks for you at scheduled times. I’m skeptical of the value here, and apparently it gets a lot of things wrong…but it shows where companies are going.
- Microsoft added Copilot to its Personal and Family (i.e., non-business) Microsoft 365 subscriptions and raised prices by $3/month. That’s a lot less than Copilot Pro (which stays at $20/month), but Copilot Pro is unlimited, whereas the bundled Copilot is limited by a credit system (you get a fixed number of credits to use each month).

My take on why it matters, particularly for generative AI in the workplace
There is a growing realization that basic RAG is only good enough for simple applications in the enterprise. Most enterprise RAG needs will not be met by taking an open-source vector database, creating some embeddings, and hooking that up to an LLM. That works at very small scale (a hundred documents, one source, no security), but it breaks down at larger volumes, because the data contains far more unrelated information. Retrieval gets genuinely hard at that point, and when LLMs are fed unrelated information, they are more likely to hallucinate. That’s why companies are building robust retrieval pipelines: advanced RAG is the key to using LLMs in the workplace.
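To make the pattern concrete, here is a minimal sketch of the vanilla setup described above: embed documents, do a brute-force top-k similarity search, and stuff the results into the prompt. The `embed()` and `generate()` functions are hypothetical placeholders for whatever embedding model and LLM you use, and the list-based index stands in for a vector database.

```python
import numpy as np

# Hypothetical placeholder: swap in your actual embedding model.
def embed(text: str) -> np.ndarray:
    """Return a unit-length embedding vector for `text` (fake but deterministic)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# Hypothetical placeholder: call your LLM of choice here.
def generate(prompt: str) -> str:
    return f"[LLM answer based on a prompt of {len(prompt)} chars]"

# The "vector database": just a list of (document, embedding) pairs.
documents = [
    "Our PTO policy grants 20 days per year.",
    "Expense reports are due by the 5th of each month.",
    "The cafeteria is closed on public holidays.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Brute-force cosine-similarity top-k over the whole index."""
    q = embed(query)
    scored = sorted(index, key=lambda pair: float(q @ pair[1]), reverse=True)
    return [doc for doc, _ in scored[:k]]

def answer(query: str) -> str:
    """Vanilla RAG: stuff the top-k chunks into the prompt and generate."""
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

print(answer("How many vacation days do I get?"))
```

With three documents, the top-k is almost always on topic. With a million chunks from dozens of sources, the same top-k routinely includes near-misses and off-topic text, and that is exactly the unrelated context that pushes the model toward hallucination.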
The tricky part is that because the LLMs are so good, and search at small scale is not hard, it often works very well for a proof of concept. I’ve seen more than one company build a successful small-scale PoC only to find that results are much worse on the full dataset. When this happens, the answer isn’t to try a different LLM; the answer is better retrieval.
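One common shape that “better retrieval” takes is a two-stage pipeline: over-retrieve a large candidate pool with cheap vector search, then rerank the candidates with a more expensive relevance scorer before anything reaches the LLM. Here is a minimal sketch reusing `retrieve()` from above; `rerank_score()` is a hypothetical stand-in for a real reranker such as a cross-encoder, faked here with lexical overlap.

```python
def rerank_score(query: str, doc: str) -> float:
    """Placeholder for a reranker that scores (query, doc) pairs jointly,
    rather than comparing precomputed embeddings. Toy lexical overlap."""
    overlap = set(query.lower().split()) & set(doc.lower().split())
    return float(len(overlap))

def retrieve_reranked(query: str, k: int = 2, candidates: int = 20) -> list[str]:
    """Stage 1: cheap vector search over-retrieves `candidates` chunks.
    Stage 2: the reranker re-scores them and keeps only the top `k`."""
    pool = retrieve(query, k=candidates)
    pool.sort(key=lambda doc: rerank_score(query, doc), reverse=True)
    return pool[:k]
```

The specific scorer doesn’t matter; the design point is spending extra compute deciding what actually gets into the context window, rather than swapping LLMs and hoping.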