  • I tried ChatGPT’s Deep Research tool, which can complete complex research tasks in minutes.
  • Powered by a version of OpenAI’s o3 model, Deep Research acts independently to work through multi-step tasks.
  • I used it to research the history of tariffs and their impact on consumer goods pricing.

In an era when we’re used to getting answers to AI queries in seconds, I can confidently say OpenAI’s Deep Research tool is worth the five- to 30-minute wait.

Plus, it’s a fascinating look into how AI “thinks” through a complex research assignment and executes it.

Deep Research is an agentic AI tool, which means it can act independently to solve a multi-step task. OpenAI says on its website that the tool “accomplishes in tens of minutes what would take a human many hours,” and it’s not an exaggeration.

While ChatGPT can provide a thorough report on a topic if you ask for one, it doesn’t complete the kind of multi-step research that Deep Research does. That capability isn’t relevant to every query, but it is helpful if you want to explore a topic’s history or nuances. OpenAI says Deep Research is particularly useful for “niche, non-intuitive information that would require browsing numerous websites.”

The AI tool is powered by a version of the upcoming OpenAI o3 model that is designed for web browsing and data analysis, and it uses “reasoning to search, interpret, and analyze massive amounts of text, images, and PDFs on the internet.”

OpenAI initially made the AI agent available to Pro users, who pay $200 a month, but on February 25 it announced a rollout to ChatGPT Plus, Team, Edu, and Enterprise users. Pro users get 120 Deep Research queries a month, while Plus users get 10.

How it works

You can generate a Deep Research report by selecting the “deep research” button in the text bar of ChatGPT. Then you write a prompt, and the assistant asks clarifying questions to make sure it’s looking for the right information.

For example, I asked it to compile a research report about the history of US trade policy regarding tariffs and the impact they’ve had on consumer goods pricing. Deep Research responded by asking for more information, including the time period and industries I wanted covered, the type of analysis, and the level of detail.

I told it to research the 20th century onward, with a focus on President Donald Trump’s first term and the impact on general consumer goods. I also asked it to generate a data-driven economic analysis with tables included.

The AI reiterated my instructions and then went to work.

As the assistant compiles the report — which can take roughly five to 30 minutes depending on the ask — you can check its progress and observe its search process. One of the most interesting parts of this process was seeing how the search evolved as it uncovered new information.

Here, you can see its thought process toward the end of its eight-minute research.

Deep Research searched and analyzed sources and then generated a thorough report on its findings, with in-line citations included throughout the various sections. The middle section was dense, but the conclusion gave a solid summation, and the tables were also insightful.

I spent a fair share of my 10 uses exploring research on any long-term effects of various health trends, like Ozempic usage. It provided thorough overviews and included isolated incidents with limited research, which I found interesting. I also used it to investigate my family’s history. It correctly traced the origin of my last name and what I know of my ancestors’ history before they immigrated to the US generations ago.

The reports I read were fascinating and, based on the cited links that I checked, accurate. It did cite Wikipedia a good amount, though. In the tariff query, for example, Wikipedia was one of the sources used in compiling the report.

OpenAI said in its announcement that the model powering Deep Research achieved a 26.6% score on Humanity’s Last Exam, an assessment designed to evaluate AI across various subjects. For context, GPT-4o scored 3.3%.

However, OpenAI added that the tool “can sometimes hallucinate” or “make incorrect inferences,” though at a lower rate than other ChatGPT models.

“It may struggle with distinguishing authoritative information from rumors, and currently shows weakness in confidence calibration, often failing to convey uncertainty accurately,” OpenAI said in its announcement.

Almost every AI tool I’ve tried initially impressed me, but most failed to become a part of my routine. I didn’t expect to lose track of how often I was using Deep Research, but I burned through my 10 queries in a matter of days and found myself wanting more.

I’m not sure I would pay $20 a month for 10 of these queries, given that the allotment feels fairly limited for a paid service. However, if I had specific legal questions or wanted to research medical treatments, I’d likely find it worth the cost.

For now, I’m excited to make the most of my next batch of 10 queries.


