According to a report from Copyleaks, about 60% of the output of OpenAI's GPT-3.5 model contains plagiarism. Copyleaks uses a proprietary scoring method that takes into account factors such as identical text, minor edits, and paraphrasing, assigning each output a "similarity score".
GPT-3.5 is an advanced natural language processing model launched by OpenAI, but the originality of its output has been questioned. The study found that among GPT-3.5's outputs, 45.7% contained identical text, 27.4% contained minor changes, and 46.5% contained paraphrased text. A similarity score of 0% indicates completely original content, while 100% means none of the content is original.
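Copyleaks' scoring method is proprietary and not publicly documented. As a rough intuition for what an "identical text" component of such a score might measure, here is a minimal sketch that scores a candidate text by the fraction of its word trigrams that also appear in a source text; the function name and the trigram choice are illustrative assumptions, not Copyleaks' actual algorithm.

```python
# Toy illustration only: Copyleaks' similarity score is proprietary.
# This sketch approximates one plausible ingredient (verbatim overlap)
# using word-trigram overlap between a candidate and a source text.

def ngrams(text: str, n: int = 3) -> set:
    """Return the set of word n-grams (default trigrams) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity_score(candidate: str, source: str) -> float:
    """Score from 0 (no shared trigrams) to 100 (every trigram shared)."""
    cand, src = ngrams(candidate), ngrams(source)
    if not cand:
        return 0.0
    return 100.0 * len(cand & src) / len(cand)

source = "the quick brown fox jumps over the lazy dog"
print(similarity_score(source, source))                                     # 100.0
print(similarity_score("the quick brown fox jumps over a sleeping cat", source))
print(similarity_score("completely unrelated sentence about weather", source))  # 0.0
```

Real plagiarism detectors also have to catch paraphrases and minor edits, which simple n-gram overlap misses; that is presumably why Copyleaks scores those categories separately.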
Copyleaks ran a variety of tests on GPT-3.5, generating about a thousand outputs across 26 disciplines, each about 400 words long. The results showed that the similarity score was highest for computer science (100%), followed by physics (92%) and psychology (88%). In contrast, the scores for drama (0.9%), humanities (2.8%), and English language (5.4%) were the lowest.
“Our models are designed and trained to learn concepts that help them solve new problems,” said Lindsey Held, a spokesperson for OpenAI. “We have taken steps to limit incidental memorization, and our terms of use prohibit intentional use of our models to regurgitate content.”
The issue of plagiarism is not just about copying and pasting entire sentences or paragraphs. The New York Times has sued OpenAI, claiming that "wide-scale copying" by OpenAI's AI systems constitutes copyright infringement. OpenAI responded that such regurgitation is a "rare bug" and accused The New York Times of manipulating prompts.
While content creators, from authors to visual artists, have argued in court that the technology underlying generative AI is trained on their copyrighted works, the law currently favors the companies over the plaintiffs. The New York Times case may offer a glimmer of hope, but its outcome is still pending.