
OpenAI has reportedly used YouTube videos to train ChatGPT for its best language model yet

Last Updated on April 10, 2024

OpenAI has long been secretive about how it trains the language models behind ChatGPT, and the details are rarely made public. It could now be facing copyright trouble, though, as GPT-4 has reportedly been trained using more than a million hours of YouTube videos.

According to a report from The New York Times, OpenAI has been using YouTube videos to train its GPT-4 model. Transcripts of the videos' audio were reportedly fed into the model as training data. The result is what OpenAI believes is its most advanced model yet, but there's a problem. YouTube doesn't allow unauthorized scraping or downloading of its videos, so the reported use would put OpenAI in violation of those terms.

OpenAI could be in trouble as GPT-4 reportedly uses unauthorized YouTube videos to train its language model

It’s unknown whether YouTube has any interest in pursuing legal action. However, it wouldn’t be the first time OpenAI has run into trouble with copyright law. The New York Times has itself filed a lawsuit against OpenAI and Microsoft, alleging that its content was used without permission to train the language models.

This is a big gray area when it comes to legality. As mentioned above, OpenAI has been quite secretive about how its language models are trained. That secrecy, together with a possible fair use defense, could mean that no copyright infringement is ever established, which would keep OpenAI and Microsoft in the clear. For now, we’ll need to wait to see how YouTube responds to this potential infringement.

Even with the alleged copyright infringement, GPT-4 could be one of the most impressive language models yet. ChatGPT has already made huge strides when it comes to AI. It has been reported to be the fastest-growing consumer software application in history, and it’s been used for everything from writing essays to holding full conversations. Even if the way it was trained turns out not to be entirely legal, it will be interesting to see what GPT-4 brings to the table.

Annie is a journalist focusing on the latest technology news and deals. At PC Guide, Annie covers the latest bundle deals and news from the world of tech, with a particular focus on GPU and CPU news.