Tech

Meta exec denies the company artificially boosted Llama 4’s benchmark scores

Last updated: April 7, 2025 2:45 pm
OnlyTrustedInfo.com

A Meta exec on Monday denied a rumor that the company trained its new AI models to present well on specific benchmarks while concealing the models’ weaknesses.

The executive, Ahmad Al-Dahle, VP of generative AI at Meta, said in a post on X that it’s “simply not true” that Meta trained its Llama 4 Maverick and Llama 4 Scout models on “test sets.” In AI benchmarks, test sets are collections of data used to evaluate the performance of a model after it’s been trained. Training on a test set could misleadingly inflate a model’s benchmark scores, making the model appear more capable than it actually is.
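To make the test-set concern concrete, here is a minimal toy sketch, not a description of how Meta or any lab trains or evaluates models. It assumes a hypothetical "model" that can only memorize the examples it has seen, and data whose labels are random, so nothing can be learned by generalizing. All names here (`MemorizingModel`, `make_examples`, `accuracy`) are invented for illustration.

```python
import random


def make_examples(n, seed):
    """Build n toy (question, answer) pairs with random 0/1 answers.

    The answers carry no pattern, so the only way to score well on a
    set is to have seen that exact set before.
    """
    rng = random.Random(seed)
    return [(f"q{seed}-{i}", rng.choice([0, 1])) for i in range(n)]


class MemorizingModel:
    """Hypothetical model that memorizes training pairs and nothing else."""

    def __init__(self):
        self.memory = {}

    def train(self, examples):
        # Store every (question, answer) pair verbatim.
        self.memory.update(examples)

    def predict(self, question):
        # Unseen questions get a fixed default guess of 0.
        return self.memory.get(question, 0)


def accuracy(model, examples):
    """Fraction of examples the model answers correctly."""
    correct = sum(model.predict(q) == a for q, a in examples)
    return correct / len(examples)


train_set = make_examples(200, seed=1)
test_set = make_examples(200, seed=2)   # the "benchmark"
fresh_set = make_examples(200, seed=3)  # data nobody trained on

# Clean model: trained only on the training set.
clean = MemorizingModel()
clean.train(train_set)

# Contaminated model: the benchmark's test set leaked into training.
contaminated = MemorizingModel()
contaminated.train(train_set + test_set)

clean_score = accuracy(clean, test_set)
contaminated_score = accuracy(contaminated, test_set)
contaminated_fresh = accuracy(contaminated, fresh_set)

print(f"clean model on benchmark:        {clean_score:.2f}")
print(f"contaminated model on benchmark: {contaminated_score:.2f}")
print(f"contaminated model on new data:  {contaminated_fresh:.2f}")
```

In this sketch the contaminated model answers every benchmark question correctly because it has memorized them, yet it does no better than the default guess on data it has never seen. That gap between the benchmark score and real-world performance is what the rumor alleged.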

Over the weekend, an unsubstantiated rumor that Meta artificially boosted its new models’ benchmark results began circulating on X and Reddit. The rumor appears to have originated in a post on a Chinese social media site by a user claiming to have resigned from Meta in protest over the company’s benchmarking practices.

Reports that Maverick and Scout perform poorly on certain tasks fueled the rumor, as did Meta’s decision to use an experimental, unreleased version of Maverick to achieve better scores on the benchmark LM Arena. Researchers on X have observed stark differences in the behavior of the publicly downloadable Maverick compared with the model hosted on LM Arena. 

Al-Dahle acknowledged that some users are seeing “mixed quality” from Maverick and Scout across the different cloud providers hosting the models.

“Since we dropped the models as soon as they were ready, we expect it’ll take several days for all the public implementations to get dialed in,” Al-Dahle said. “We’ll keep working through our bug fixes and onboarding partners.”
