OpenAI has long been secretive about how it trains the language models behind ChatGPT, and it rarely makes those methods public. The company could be facing copyright trouble, though, as GPT-4 has reportedly been trained on more than a million hours of YouTube videos.
According to a report from The New York Times, OpenAI used transcripts of a vast number of YouTube videos to train its GPT-4 model. The result is what OpenAI believes is its most advanced model yet, but there’s a problem. YouTube doesn’t allow unauthorized scraping or downloading of its videos, and the reported use would put OpenAI in direct violation of those terms.
OpenAI could be in trouble as GPT-4 uses unauthorized YouTube videos to train its language model
It’s unknown whether YouTube has any interest in pursuing legal action. However, it wouldn’t be the first time OpenAI has run into trouble with copyright law. The New York Times itself has filed a lawsuit against OpenAI and Microsoft, alleging that its content was taken without permission to train the language model.
This is a big legal gray area. As mentioned above, OpenAI has been quite secretive about how its language models are trained. That secrecy, together with a possible fair use defense, could mean there is no actual copyright infringement, keeping OpenAI and Microsoft in the clear. For now, we’ll need to wait and see how YouTube responds to this potential infringement.
Even with the potential copyright infringement, GPT-4 could be one of the most impressive language models yet. ChatGPT has already made huge strides in AI. It’s the fastest-growing consumer software application in history, and it has been used for everything from writing essays to holding full conversations. Even if its training wasn’t exactly legal, it will be interesting to see what GPT-4 brings to the table.