The era of AI that understands not only words but also what it sees has arrived: OpenAI has released two new advanced models, o3 and o4-mini, capable of “thinking with images” more accurately than previous versions.
These models, available to paid users, can reason about drawings, diagrams and photographs, even low-quality ones, folding what they see directly into their chain of reasoning. They deliberately take longer to respond, trading speed for accuracy and depth.
In addition to understanding pictures, they can browse the web and generate images, and they were trained with reinforcement learning, a technique that rewards correct answers.
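For readers curious what "reasoning about an image" looks like in practice, here is a minimal sketch using OpenAI's Python SDK and the Chat Completions API. The prompt, the image URL, and the availability of the o4-mini model on a given account are assumptions for illustration, not official sample code from OpenAI.

```python
# Minimal sketch: asking o4-mini to reason about a photograph.
# Assumes the OpenAI Python SDK (v1.x) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="o4-mini",  # one of the new reasoning models described above
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "This is a low-quality photo of a whiteboard diagram. "
                        "Explain, step by step, the process it describes."
                    ),
                },
                {
                    "type": "image_url",
                    # Placeholder URL; any publicly reachable image would do.
                    "image_url": {"url": "https://example.com/whiteboard.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```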
According to OpenAI, this new generation can tackle complex, multi-part problems by autonomously managing the information it receives. Compared with models such as GPT-4 or GPT-3.5, the new models offer a deeper understanding of complex questions and greater consistency in their answers.
“The combined power of state-of-the-art reasoning with full access to tools results in significantly superior performance in academic testing and real-world applications,” OpenAI said in a statement.
In the coming days, OpenAI also plans to launch o3-pro, an even more advanced version of its flagship reasoning model, which will be available exclusively to Pro users. In the meantime, those users will continue to have access to o1-pro, which remains active until the new model is released.