Google on Thursday released a “reimagined” version of its research agent Gemini Deep Research, which is based on its much-hyped cutting-edge platform model Gemini 3 Pro.
This new agent wasn’t just designed to create investigative reports. However, it is possible. This allows developers to incorporate Google’s SATA model exploration capabilities into their own apps. This feature is enabled by Google’s new Interactions API and is designed to give developers more control in the upcoming agent AI era.
The new Gemini Deep Research tool is an agent capable of synthesizing mountains of information and processing large context dumps in prompts. Google says its customers use it for everything from due diligence to drug toxicity safety studies.
Google also announced that the new deep research agent will soon be integrated into services such as Google Search, Google Finance, the Gemini app, and the popular NotebookLM. This is another step in preparing for a world where humans won’t do any Google and AI agents will.
The tech giant says Deep Research benefits from Gemini 3 Pro’s status as the “most fact-based” model, trained to minimize hallucinations during complex tasks.
AI illusions (which LLMs are just making up) are a particularly important problem for long-running deep reasoning agent tasks, where many autonomous decisions are made over minutes, hours, or longer. The more choices the LLM has to make, the more likely it is that one illusory choice will invalidate the entire output.
To prove its claims of progress, Google also created yet another benchmark (as if the world of AI needed another benchmark). The new benchmark, unimaginatively named DeepSearchQA, aims to test agents on complex multi-step information exploration tasks. Google has open sourced this benchmark.
tech crunch event
san francisco
|
October 13-15, 2026
We also tested deep research on humanity’s last test. It’s an independent benchmark of general knowledge filled with impossibly niche tasks with far more interesting names. BrowserComp is a benchmark for browser-based agent tasks.
As you might expect, Google’s new agent beat the competition in both its own benchmarks and Humanity’s benchmarks. However, OpenAI’s ChatGPT 5 Pro came in a surprisingly close second place all the way, narrowly beating Google in BrowserComp.
However, these benchmark comparisons were discontinued the moment Google published them. Because on the same day, OpenAI announced the long-awaited GPT 5.2 (codenamed Garlic). OpenAI says its latest model outperforms its rivals, especially Google, on a range of typical benchmarks, including benchmarks developed in-house.
Perhaps one of the most interesting aspects of this announcement was the timing. Knowing the world was waiting for the release of Garlic, Google announced some AI news of its own.
