Thursday, January 15, 2026

OpenAI Unveils GPT-5: Everything You Should Be Aware Of

OpenAI's blog post says GPT-5 outperforms earlier models on several coding benchmarks, including SWE-Bench Verified (74.9%), SWE-Lancer (55% with GPT-5-thinking), and Aider Polyglot (88%). These benchmarks test the model's ability to fix bugs, complete freelance-style coding assignments, and work across multiple programming languages.

At a press conference, Yann Dubois, OpenAI’s lead for post-training, tasked GPT-5 with developing a captivating, interactive web application for learning French, integrating features like daily tracking, flashcards, and quizzes. The AI quickly produced a polished application that satisfied Dubois’ specifications.

Michelle Pokrass, another post-training lead, says GPT-5 is an outstanding coding partner that excels at agentic tasks: it executes long chains of steps and tool interactions efficiently, recognizes when to use tools such as web browsers or APIs, and explains its work in detail.

OpenAI says GPT-5 is its best model for health-related questions. On three health-focused benchmarks (HealthBench, HealthBench Hard, and HealthBench Consensus), the system card reports that GPT-5-thinking notably surpasses prior models, scoring 25.5% on HealthBench Hard, a result validated by medical professionals.

The model is also said to hallucinate less, addressing a common problem in which AI systems present incorrect information as fact. Safety research lead Alex Beutel reports a significant decline in deception rates in GPT-5.

The system card outlines the measures taken to reduce GPT-5-thinking's tendency to mislead, deceive, or manipulate solutions, while acknowledging that more research is needed. The model is designed to fail gracefully when given a task it cannot solve.

Researchers found that GPT-5's hallucination rate, defined as the “percentage of factual claims with inaccuracies,” is 26% lower than GPT-4o's, while GPT-5-thinking's rate is 65% lower than o3's.
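To make the arithmetic behind figures like "26% lower" concrete, here is a minimal Python sketch. It assumes the rate is simply the number of inaccurate factual claims divided by the total number of factual claims, as the quoted definition suggests. The function names and the counts are hypothetical and are not taken from OpenAI's evaluation.

```python
# Illustrative only: the counts below are made up, not OpenAI's evaluation data.


def hallucination_rate(claims_with_errors: int, total_claims: int) -> float:
    """Fraction of factual claims that contain an inaccuracy."""
    if total_claims <= 0:
        raise ValueError("total_claims must be positive")
    if not 0 <= claims_with_errors <= total_claims:
        raise ValueError("claims_with_errors must be between 0 and total_claims")
    return claims_with_errors / total_claims


def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """How much lower new_rate is, as a fraction of baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


if __name__ == "__main__":
    # Hypothetical: an older model errs on 200 of 1,000 claims, a newer one on 148.
    baseline = hallucination_rate(200, 1000)
    newer = hallucination_rate(148, 1000)
    print(f"{relative_reduction(baseline, newer):.0%} lower")  # prints "26% lower"
```

Note that such a figure is a relative reduction. In the made-up example the rate falls from 20% to 14.8%, which is 5.2 percentage points but 26% of the baseline.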

For dual-use prompts, GPT-5 uses “safe completions,” which aim to give a useful response while staying within safety limits. OpenAI conducted more than 5,000 hours of red teaming and external evaluation to test the system's reliability.

OpenAI reports approximately 700 million weekly active ChatGPT users, along with 5 million paying business users and 4 million developers using the API.

Nick Turley, who leads ChatGPT, is confident the model will be well received by everyday users, including those unfamiliar with the complexities of the technology.
