No Authorization, No Problem: Several AI Companies Bypass a Web Standard to Scrape News Publishers' Websites

According to a Reuters report on Saturday, TollBit, a content licensing startup, has warned publishers that several artificial intelligence companies are circumventing a common web standard that publishers use to block scraping of their content, and are using the scraped material to train generative AI systems.

The warning comes against the backdrop of a public dispute between AI search startup Perplexity and media outlet Forbes over the same web standard. A broader debate between tech and media companies over the value of content in the age of generative AI is also under way.

TollBit positions itself as a "matchmaker" between content-hungry AI companies and publishers willing to strike licensing agreements with them.

Forbes has accused Perplexity of plagiarizing its stories in AI-generated summaries, without citing the source and without permission from Forbes.

In addition, Wired magazine published an investigation last week noting that Perplexity may have bypassed efforts to block its web crawler through the Robots Exclusion Protocol (configured by the news publisher) or other means.

The News Media Alliance, a U.S. trade organization that says it represents more than 2,000 U.S. publishers, also expressed concern about this behavior: the "do not crawl" signals that publishers set up, such as robots.txt, are being ignored by AI companies. Danielle Coffey, the organization's president, said that if mass scraping by AI companies cannot be stopped, publishers cannot profit from their valuable content and have no way to pay journalists.

TollBit said that Perplexity is not the only violator of the "do not crawl" mechanism on publishers' websites. According to its analysis, "a large number" of AI platforms have bypassed this mechanism, which lets publishers set up a kind of "whitelist" indicating which parts of their site may be crawled.
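As a rough illustration of the mechanism described above, the sketch below shows how a well-behaved crawler could consult a publisher's robots.txt rules before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt contents, the bot names `ExampleAIBot` and `OtherBot`, and the `news.example.com` URLs are all hypothetical and do not come from the TollBit analysis. Note that the check is voluntary: robots.txt only states the publisher's wishes, and a crawler that skips this step is not technically blocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve.
# The bot name and the paths are invented for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /public/
Disallow: /
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would instead call
    # set_url(".../robots.txt") followed by read() to download the file.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named AI crawler is excluded from the whole site.
    print(may_fetch("ExampleAIBot", "https://news.example.com/public/story.html"))  # False
    # Other crawlers may read only the explicitly allowed section.
    print(may_fetch("OtherBot", "https://news.example.com/public/story.html"))      # True
    print(may_fetch("OtherBot", "https://news.example.com/premium/story.html"))     # False
```

A crawler that honors the protocol would call something like `may_fetch` before every request and skip any URL for which it returns `False`; the behavior TollBit describes corresponds to fetching the page regardless of that answer.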

"This means that AI platforms from multiple sources (not just one company) are choosing to bypass the robots.txt protocol to retrieve content from the site," TollBit writes, "and the more publisher logs we acquire, the more times this pattern appears."

A number of publishers, including The New York Times, have already sued AI companies over these alleged infringements. Other publishers have signed licensing agreements with AI companies that are willing to pay for content, although the two sides often disagree on the value of the material. Many AI developers argue that accessing the content for free does not violate any law.
