OpenAI has introduced DeepSearchQA, billed as the world’s largest benchmark in artificial intelligence. The announcement came alongside the introduction of its latest model, GPT 5.2, codenamed Garlic, at the TechCrunch Disrupt San Francisco event starting October 13th, 2026.
DeepSearchQA Task
DeepSearchQA aims to evaluate, compare, and rank AI agents on difficult, multi-step information-seeking tasks, the kind of fundamental challenge that requires cutting-edge AI to solve.
DeepSearchQA provides a rigorous assessment of what AI agents are able to do. It focuses on complex questions that usually require deep reasoning, and it measures how well agents act and explore in complex environments across multiple domains. This assessment matters most for established projects that are vulnerable to AI hallucinations. As AI systems develop, understanding and mitigating their hallucinations becomes more important than ever, particularly for tasks that involve extended reasoning.
The benchmark has already been tested alongside Humanity’s Last Exam, a third-party evaluation that trips up AI systems with obscure, expert-level questions spanning a wide range of subjects. This rigorous examination will help determine how effectively DeepSearchQA can assess an agent’s problem-solving abilities in contexts that resemble real-world scenarios.
Furthermore, DeepSearchQA was evaluated on BrowseComp, a newly introduced benchmark specifically designed for browser-based agentic tasks. This combined testing route does two things. First, it gives OpenAI more transparency into how its new benchmark holds up. Second, it makes the tool more useful for developers and researchers at all levels within the AI community.
DeepSearchQA’s introduction coincides with the launch of GPT 5.2, which boasts a number of landmark improvements over past iterations. We applaud the decision to launch both products at the same time as a good-faith effort by OpenAI to demonstrate its commitment to advancing AI capabilities and to provide the tools needed to assess performance rigorously.
The TechCrunch Disrupt event brings together innovators and entrepreneurs and gives them a chance to immerse themselves in other groundbreaking innovations. There, OpenAI open-sourced DeepSearchQA and released a developer preview version of GPT 5.2, further establishing itself as a leader in AI research. These releases give developers and researchers new tools for exploring the frontiers of artificial intelligence.


