Google "stole" key information of GPT-3.5 model: the cost is as low as 150 yuan, and you can get it by calling API

Google's latest research reveals an attack on large language models. According to Google, the researchers not only recovered the entire projection matrix of OpenAI's large models but also obtained the exact size of the hidden dimension, all with fewer than 2,000 carefully crafted API queries, at a query cost as low as roughly 150 yuan.

Google "stole" key information of GPT-3.5 model: the cost is as low as 150 yuan, and you can get it by calling API

The core target of the attack is the model's embedding projection layer, the final layer of the model, which maps the hidden representation to the logits vector over the vocabulary. By issuing targeted queries to the model's API, an attacker can extract the model's embedding dimension and even its final weight matrix. Google identified the hidden dimension by collecting responses to a large number of queries and analyzing their singular values.
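The intuition behind the dimension-recovery step: every logits vector the model emits is a linear image of an h-dimensional hidden state, so a matrix built from many collected logit vectors has numerical rank equal to the hidden dimension h. Below is a minimal sketch of that idea against a simulated model (the projection matrix, the query function, and all parameters here are illustrative stand-ins, not OpenAI's actual API):

```python
import numpy as np

# Toy "model": hidden dimension h, vocabulary size v.
# W is the embedding projection layer the attacker wants to learn about.
h, v = 64, 1000
rng = np.random.default_rng(0)
W = rng.normal(size=(v, h))

def query_logits() -> np.ndarray:
    """Stand-in for one API call: the model returns a full logit vector for some prompt."""
    hidden_state = rng.normal(size=h)    # hidden state induced by the prompt
    return W @ hidden_state              # logits = W · hidden_state

# Attacker side: collect logit vectors from many different prompts.
n_queries = 200
Q = np.stack([query_logits() for _ in range(n_queries)])   # shape (n_queries, v)

# Every row of Q lies in the h-dimensional column space of W,
# so Q has numerical rank h: only h singular values are significant.
s = np.linalg.svd(Q, compute_uv=False)
estimated_hidden_dim = int(np.sum(s > 1e-6 * s[0]))
print(estimated_hidden_dim)              # prints 64, the hidden dimension
```

The number of singular values that stand clearly above numerical noise gives away h, which is exactly the kind of "width" information the attack recovers.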

This attack can reveal not only the hidden dimension of the model, but also global information such as the model's "width" (from which the total parameter count can be estimated), reducing how "black-box" the model is and paving the way for subsequent attacks. The research team says the attack is very efficient: it cost less than $20 to attack OpenAI's Ada and Babbage models and about $200 for GPT-3.5.

OpenAI was informed of the attack and, with the research team's consent, confirmed its effectiveness; all data related to the attack was subsequently deleted. Although the attack does not extract a large amount of information, its low cost and high efficiency are striking.

The defenses discussed in the paper include changes at the API level, such as removing the logit bias parameter entirely, and changes at the architecture level, such as modifying the hidden dimension of the final layer after training is complete. After the findings were disclosed, OpenAI modified its model API to prevent similar attacks from happening again.
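To see why removing the logit bias parameter is the natural API-level fix: the attack relies on combining a per-token logit bias with top-k log-probabilities, biasing a chosen token into the visible top-k and then subtracting the bias to reveal that token's logit relative to a reference token. Here is a toy simulation of that recovery step (a stand-in softmax model, not real API calls; all names and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
v = 1000
true_logits = rng.normal(size=v)      # full logit vector, hidden from the attacker

def api_top_logprobs(logit_bias: dict[int, float], k: int = 5) -> dict[int, float]:
    """Stand-in for an API that applies logit_bias and returns only the top-k logprobs."""
    z = true_logits.copy()
    for tok, bias in logit_bias.items():
        z[tok] += bias
    logprobs = z - np.log(np.exp(z).sum())
    top = np.argsort(logprobs)[-k:]
    return {int(t): float(logprobs[t]) for t in top}

# Baseline query: only the top-k tokens are visible.
baseline = api_top_logprobs({})
ref = max(baseline, key=baseline.get)                     # the current top-1 token
target = next(t for t in range(v) if t not in baseline)   # a token hidden from view

# Bias the target into the top-k, then undo the bias and compare to the reference.
B = 100.0
biased = api_top_logprobs({target: B})
recovered = (biased[target] - B) - biased[ref]
actual = float(true_logits[target] - true_logits[ref])
print(recovered, actual)                                  # the two values agree
```

Repeating this over the whole vocabulary reconstructs the full logit vector up to an additive constant, which is the raw material for the projection-matrix extraction; removing logit bias (or restricting returned logprobs) closes off this channel.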

This research shows that even large language models can be vulnerable to security threats, even when providers such as OpenAI have deployed defensive measures. It is a reminder that keeping models secure remains a complex and important problem.

Paper link: https://arxiv.org/abs/2403.06634
