Microsoft is exploring a way to credit contributors to AI training data

Last updated: March 21, 2025 11:15 am
OnlyTrustedInfo.com

Microsoft is launching a research project to estimate the influence of specific training examples on the text, images, and other types of media that generative AI models create.

That’s per a job listing dating back to December that was recently recirculated on LinkedIn.

According to the listing, which seeks a research intern, the project will attempt to demonstrate that models can be trained in such a way that the impact of particular data (e.g., photos and books) on their outputs can be “efficiently and usefully estimated.”

“Current neural network architectures are opaque in terms of providing sources for their generations, and there are […] good reasons to change this,” reads the listing. “[One is,] incentives, recognition, and potentially pay for people who contribute certain valuable data to unforeseen kinds of models we will want in the future, assuming the future will surprise us fundamentally.”

AI-powered text, code, image, video, and song generators are at the center of a number of IP lawsuits against AI companies. Frequently, these companies train their models on massive amounts of data from public websites, some of which is copyrighted. Many of the companies argue that fair use doctrine shields their data-scraping and training practices. But creatives — from artists to programmers to authors — largely disagree.

Microsoft itself is facing at least two legal challenges from copyright holders.

The New York Times sued the tech giant and its sometime collaborator, OpenAI, in December 2023, accusing the two companies of infringing on The Times’ copyright by deploying models trained on millions of its articles. Several software developers have also filed suit against Microsoft, claiming that the firm’s GitHub Copilot AI coding assistant was unlawfully trained using their protected works.

Microsoft’s new research effort, which the listing describes as “training-time provenance,” reportedly involves Jaron Lanier, the accomplished technologist and interdisciplinary scientist at Microsoft Research. In an April 2023 op-ed in The New Yorker, Lanier wrote about the concept of “data dignity,” which to him meant connecting “digital stuff” with “the humans who want to be known for having made it.”

“A data-dignity approach would trace the most unique and influential contributors when a big model provides a valuable output,” Lanier wrote. “For instance, if you ask a model for ‘an animated movie of my kids in an oil-painting world of talking cats on an adventure,’ then certain key oil painters, cat portraitists, voice actors, and writers — or their estates — might be calculated to have been uniquely essential to the creation of the new masterpiece. They would be acknowledged and motivated. They might even get paid.”

Notably, several companies are already attempting this. AI model developer Bria, which recently raised $40 million in venture capital, claims to “programmatically” compensate data owners according to their “overall influence.” Adobe and Shutterstock also award regular payouts to dataset contributors, although the exact payout amounts tend to be opaque.

Few large labs have established individual contributor payout programs outside of inking licensing agreements with publishers, platforms, and data brokers. They’ve instead provided means for copyright holders to “opt out” of training. But some of these opt-out processes are onerous, and they apply only to future models, not previously trained ones.

Of course, Microsoft’s project may amount to little more than a proof of concept. There’s precedent for that. Back in May 2024, OpenAI said it was developing similar technology that would let creators specify how they want their works to be included in, or excluded from, training data. But nearly a year later, the tool has yet to see the light of day, and it often hasn’t been viewed as a priority internally.

Microsoft may also be trying to “ethics wash” here, or to head off regulatory or court decisions disruptive to its AI business.

Still, the fact that the company is investigating ways to trace training data is notable in light of other AI labs’ recently expressed stances on fair use. Several of the top labs, including Google and OpenAI, have published policy documents recommending that the Trump administration weaken copyright protections as they relate to AI development. OpenAI has explicitly called on the U.S. government to codify fair use for model training, which it argues would free developers from burdensome restrictions.

Microsoft didn’t immediately respond to a request for comment.
