OpenAI’s GPT-4.1 may be less aligned than the company’s previous AI models

Last updated: April 23, 2025 1:54 pm
OnlyTrustedInfo.com

In mid-April, OpenAI launched a powerful new AI model, GPT-4.1, that the company claimed “excelled” at following instructions. But the results of several independent tests suggest the model is less aligned — that is to say, less reliable — than previous OpenAI releases.

When OpenAI launches a new model, it typically publishes a detailed technical report containing the results of first- and third-party safety evaluations. The company skipped that step for GPT-4.1, claiming that the model isn’t “frontier” and thus doesn’t warrant a separate report.

That spurred some researchers — and developers — to investigate whether GPT-4.1 behaves less desirably than GPT-4o, its predecessor.

According to Oxford AI research scientist Owain Evans, fine-tuning GPT-4.1 on insecure code causes the model to give “misaligned responses” to questions about subjects like gender roles at a “substantially higher” rate than GPT-4o. Evans previously co-authored a study showing that training a version of GPT-4o on insecure code could prime it to exhibit malicious behaviors.

In an upcoming follow-up to that study, Evans and co-authors found that GPT-4.1 fine-tuned on insecure code seems to display “new malicious behaviors,” such as trying to trick a user into sharing their password. To be clear, neither GPT-4.1 nor GPT-4o acts misaligned when trained on secure code.

Emergent misalignment update: OpenAI’s new GPT4.1 shows a higher rate of misaligned responses than GPT4o (and any other model we’ve tested).
It also has seems to display some new malicious behaviors, such as tricking the user into sharing a password. pic.twitter.com/5QZEgeZyJo

— Owain Evans (@OwainEvans_UK) April 17, 2025

“We are discovering unexpected ways that models can become misaligned,” Evans told TechCrunch. “Ideally, we’d have a science of AI that would allow us to predict such things in advance and reliably avoid them.”

A separate test of GPT-4.1 by SplxAI, an AI red teaming startup, revealed similar malign tendencies.

In around 1,000 simulated test cases, SplxAI uncovered evidence that GPT-4.1 veers off topic and allows “intentional” misuse more often than GPT-4o. SplxAI posits that GPT-4.1’s preference for explicit instructions is to blame. GPT-4.1 doesn’t handle vague directions well, a fact OpenAI itself admits, and that opens the door to unintended behaviors.

“This is a great feature in terms of making the model more useful and reliable when solving a specific task, but it comes at a price,” SplxAI wrote in a blog post. “[P]roviding explicit instructions about what should be done is quite straightforward, but providing sufficiently explicit and precise instructions about what shouldn’t be done is a different story, since the list of unwanted behaviors is much larger than the list of wanted behaviors.”

In OpenAI’s defense, the company has published prompting guides aimed at mitigating possible misalignment in GPT-4.1. But the independent tests’ findings serve as a reminder that newer models aren’t necessarily improved across the board. In a similar vein, OpenAI’s new reasoning models hallucinate — i.e. make stuff up — more than the company’s older models.

We’ve reached out to OpenAI for comment.

© 2026 OnlyTrustedInfo.com . All Rights Reserved.