Google "stole" key information of GPT-3.5 model: the cost is as low as 150 yuan, and you can get it by calling API

Google's latest research reveals an attack on large language models. According to Google, the researchers not only recovered the entire projection matrix of OpenAI's large models but also obtained the exact size of the hidden dimension, all with fewer than 2,000 carefully crafted API queries, at a query cost as low as roughly 150 yuan.

Google "stole" key information of GPT-3.5 model: the cost is as low as 150 yuan, and you can get it by calling API

The core target of the attack is the model's embedding projection layer, the final layer of the model, which maps the hidden representation to the logits vector over the vocabulary. By issuing targeted queries to the model's API, an attacker can extract the model's embedding dimension and even its final weight matrix. Google identified the hidden dimension by collecting responses to a large number of queries and analyzing their singular values.
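The intuition behind the dimension-recovery step: every logits vector the model emits is a linear image of an h-dimensional hidden state, so a matrix built from many collected logit vectors has numerical rank equal to the hidden dimension h. Below is a minimal sketch of that idea against a simulated model (the projection matrix, the query function, and all parameters here are illustrative stand-ins, not OpenAI's actual API):

```python
import numpy as np

# Toy "model": hidden dimension h, vocabulary size v.
# W is the embedding projection layer the attacker wants to learn about.
h, v = 64, 1000
rng = np.random.default_rng(0)
W = rng.normal(size=(v, h))

def query_logits() -> np.ndarray:
    """Stand-in for one API call: the model returns a full logit vector for some prompt."""
    hidden_state = rng.normal(size=h)    # hidden state induced by the prompt
    return W @ hidden_state              # logits = W · hidden_state

# Attacker side: collect logit vectors from many different prompts.
n_queries = 200
Q = np.stack([query_logits() for _ in range(n_queries)])   # shape (n_queries, v)

# Every row of Q lies in the h-dimensional column space of W,
# so Q has numerical rank h: only h singular values are significant.
s = np.linalg.svd(Q, compute_uv=False)
estimated_hidden_dim = int(np.sum(s > 1e-6 * s[0]))
print(estimated_hidden_dim)              # prints 64, the hidden dimension
```

The number of singular values that stand clearly above numerical noise gives away h, which is exactly the kind of "width" information the attack recovers.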

This attack can reveal not only the hidden dimension of the model, but also global information such as the model's "width" (from which the total parameter count can be estimated), reducing how "black-box" the model is and paving the way for subsequent attacks. The research team says the attack is very efficient: it cost less than $20 to attack OpenAI's Ada and Babbage models and about $200 for GPT-3.5.

OpenAI was informed of the attack and, with the research team's consent, confirmed its effectiveness; all data related to the attack was subsequently deleted. Although the attack does not extract a large amount of information, its low cost and high efficiency are striking.

The defenses discussed in the paper include changes at the API level, such as removing the logit bias parameter entirely, and changes at the architecture level, such as modifying the hidden dimension of the final layer after training is complete. After the findings were disclosed, OpenAI modified its model API to prevent similar attacks from happening again.
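To see why removing the logit bias parameter is the natural API-level fix: the attack relies on combining a per-token logit bias with top-k log-probabilities, biasing a chosen token into the visible top-k and then subtracting the bias to reveal that token's logit relative to a reference token. Here is a toy simulation of that recovery step (a stand-in softmax model, not real API calls; all names and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
v = 1000
true_logits = rng.normal(size=v)      # full logit vector, hidden from the attacker

def api_top_logprobs(logit_bias: dict[int, float], k: int = 5) -> dict[int, float]:
    """Stand-in for an API that applies logit_bias and returns only the top-k logprobs."""
    z = true_logits.copy()
    for tok, bias in logit_bias.items():
        z[tok] += bias
    logprobs = z - np.log(np.exp(z).sum())
    top = np.argsort(logprobs)[-k:]
    return {int(t): float(logprobs[t]) for t in top}

# Baseline query: only the top-k tokens are visible.
baseline = api_top_logprobs({})
ref = max(baseline, key=baseline.get)                     # the current top-1 token
target = next(t for t in range(v) if t not in baseline)   # a token hidden from view

# Bias the target into the top-k, then undo the bias and compare to the reference.
B = 100.0
biased = api_top_logprobs({target: B})
recovered = (biased[target] - B) - biased[ref]
actual = float(true_logits[target] - true_logits[ref])
print(recovered, actual)                                  # the two values agree
```

Repeating this over the whole vocabulary reconstructs the full logit vector up to an additive constant, which is the raw material for the projection-matrix extraction; removing logit bias (or restricting returned logprobs) closes off this channel.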

This research shows that even large language models can be vulnerable to security threats, even when providers such as OpenAI have deployed defensive measures. It is a reminder that keeping models secure remains a complex and important problem.

Paper link: https://arxiv.org/abs/2403.06634
