OpenAI unveils o3 and o4-mini models with advanced reasoning and tool access

OpenAI has introduced two advanced reasoning models, o3 and o4-mini, which have achieved state-of-the-art results in AI benchmarks. These models are notable for having full access to external tools, including web browsing and a Python interpreter, marking a first for OpenAI.

The o3 model is particularly powerful, outperforming previous models in benchmarks like Codeforces, SWE-bench, and MMMU, and is capable of analyzing visual inputs through image uploads. Evaluations show that o3 reduces significant errors by 20% compared to its predecessor, o1, on complex tasks. The o4-mini model is designed for efficiency, optimized for high-volume reasoning tasks, and performs comparably to o3 across math, coding, and visual domains. It scored 99.5% on the AIME 2025 math benchmark when used with a Python interpreter. Both models are trained through reinforcement learning to effectively use tools and are described as more natural and conversational, with features supporting memory and prior context.
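The tool-use loop described above can be sketched conceptually. This is a minimal illustration with hypothetical names, not OpenAI's actual implementation, which runs server-side: the model interleaves reasoning steps with tool calls, and each tool's output is fed back into its working context.

```python
# Conceptual sketch of a reasoning model's tool-use loop.
# All names here are hypothetical illustrations, not OpenAI internals.

def run_python(code: str) -> str:
    """Stand-in for a sandboxed Python interpreter tool."""
    local_vars: dict = {}
    exec(code, {}, local_vars)
    return str(local_vars.get("result", ""))

TOOLS = {"python": run_python}

def reasoning_loop(steps):
    """Each step is ("think", text) or ("tool", name, arg).
    Tool outputs are appended back into the context, mirroring how
    a reasoning model interleaves tool calls with its chain of thought."""
    context = []
    for step in steps:
        if step[0] == "think":
            context.append(step[1])
        else:
            _, name, arg = step
            context.append(f"[{name} output] " + TOOLS[name](arg))
    return context

trace = reasoning_loop([
    ("think", "Need to compute 17 * 23 exactly."),
    ("tool", "python", "result = 17 * 23"),
    ("think", "Use the tool output as the final answer."),
])
```

The point of the sketch is the feedback edge: because tool results re-enter the context mid-reasoning, the model can decide its next step based on what the tool returned, which is what reinforcement learning on tool use optimizes for.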

Available to ChatGPT Plus, Pro, and Team users, the models replace previous versions, with Enterprise and Edu users gaining access soon. OpenAI also introduced Codex CLI, a new command-line tool that serves as a lightweight coding assistant developers can run directly on their local machines. Future updates will integrate additional tools, such as web search and the code interpreter, into the models' reasoning process via the API.

by Mauricio B. Holguin

Comments

UserPower
5

OpenAI plays with words here, since Codex CLI is not an agent but an orchestrator (it fetches data and doesn't do any computing locally); all the AI "thinking" (and there is a lot of it) happens on OpenAI's servers through API calls, so query limits can be reached very fast. As for raw performance, nothing very impressive: since OpenAI skipped "o2" (if there is any logic in the naming; hard to believe they burn $300M on marketing each year), this is pretty much a moderate evolution, judging from the o3-mini to o4-mini performance gain. Still, these models are expensive to train and very, very expensive to run (even if most users don't care, since they don't pay per task, at least not yet, though OpenAI is thinking about it). And since we're talking about OpenAI, which still believes bigger = better models (and is not afraid to spend $19B on a data center with money it doesn't yet have...), o3/o4 may be the last reasoning models it offers under a fairly cheap subscription.

Gu