Patronus launches CopyrightCatcher API to detect copyrighted content in AI models

Patronus AI, a company that builds evaluation tools for large language models (LLMs), recently released an API called "CopyrightCatcher" that detects whether the output of a large language model contains copyright-infringing content. A demo of the tool has also been released for interested readers to try.

▲ Image source: Patronus AI official press release

Patronus AI said that the training data of common large language models on the market often contains copyrighted content, so these models can easily reproduce that content in their output, which creates significant legal risk for companies that deploy them. The company launched the CopyrightCatcher API to address this infringement problem.

According to reports, to check whether the output of a large language model contains infringing content, Patronus AI researchers drew a batch of copyrighted text samples from the Goodreads book platform to test the models adversarially. Based on these books, they wrote 100 prompts.

According to the report, 50 of the prompts ask the model to "generate the first paragraph of the book", and the other 50 ask the model to generate text passages from the book. The researchers compiled these prompts into the CopyrightCatcher API, which the company claims can detect when large language models "precisely copy content from the original training data" and can also assess the probability that a model outputs infringing content.
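The article does not describe how CopyrightCatcher works internally. As a rough illustration of what checking for "precise copying" could look like, the sketch below flags a model output when it shares a long run of consecutive words with a reference text. The function names, the 8-word threshold, and the sample strings are all hypothetical and are not taken from Patronus AI's API.

```python
from difflib import SequenceMatcher


def longest_verbatim_overlap(output: str, reference: str) -> str:
    """Return the longest run of consecutive words found in both texts."""
    out_words = output.split()
    ref_words = reference.split()
    # Compare word sequences; autojunk=False keeps frequent words in play.
    matcher = SequenceMatcher(None, out_words, ref_words, autojunk=False)
    match = matcher.find_longest_match(0, len(out_words), 0, len(ref_words))
    return " ".join(out_words[match.a:match.a + match.size])


def is_flagged(output: str, reference: str, min_words: int = 8) -> bool:
    """Flag an output that repeats at least `min_words` consecutive words.

    The threshold is an illustrative assumption, not a legal standard.
    """
    overlap = longest_verbatim_overlap(output, reference)
    return len(overlap.split()) >= min_words


def flagged_rate(outputs: list[str], reference: str, min_words: int = 8) -> float:
    """Share of outputs (0.0 to 1.0) flagged against one reference text."""
    if not outputs:
        return 0.0
    flagged = sum(is_flagged(o, reference, min_words) for o in outputs)
    return flagged / len(outputs)


if __name__ == "__main__":
    # Placeholder strings standing in for a protected text and model outputs.
    reference = ("the quick brown fox jumps over the lazy dog "
                 "near the quiet river bank at dawn")
    copied = ("Sure, here it is: quick brown fox jumps over the lazy dog "
              "near the quiet river and so on")
    paraphrase = "A fast fox leaps above a sleepy dog by the river"
    print(is_flagged(copied, reference))
    print(is_flagged(paraphrase, reference))
    print(flagged_rate([copied, paraphrase], reference))
```

A word-level exact match like this only catches verbatim reproduction. It misses paraphrases and is sensitive to punctuation and casing, so a production detector would likely need text normalization and a large index of reference works.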

The researchers tested OpenAI's GPT-4, Mistral's Mixtral-8x7B-Instruct-v0.1, Anthropic's Claude-2.1, and Meta's Llama-2-70b-chat. They found that GPT-4 was the most likely to generate infringing content, while Claude-2.1 was the least likely:

  • GPT-4: 44%

  • Mixtral-8x7B-Instruct-v0.1: 22%

  • Llama-2-70b-chat: 10%

  • Claude-2.1: 8%
