Researchers bypass GPT-4o model security fence, use "hexadecimal strings" to successfully make it write vulnerable attack programs

Researchers bypassed the GPT-4o model security fence and successfully programmed it with a vulnerability using "hexadecimal strings".

Marco Figueroa, a researcher at cybersecurity firm 0Din, has discovered a new GPT jailbreak attack technique that successfully breaks through the GPT-4o Built-in "security fences" enable the writing of malicious attack programs.

Researchers bypassed the GPT-4o model security fence and successfully programmed it with a vulnerability using "hexadecimal strings".

According to OpenAI, ChatGPT-4o has a series of built-in "security fences" to protect the AI from inappropriate use by users, which analyze incoming prompts to determine if the user is asking the model to generate malicious content.

However, Marco Figueroa has attempted to devise a jailbreak method that converts malicious commands into hexadecimal, claiming to be able to bypass GPT-4o's protections and allow GPT-4o to decode and run the user's malicious commands.

The researcher claims that he first asked GPT-4o to decode the hexadecimal string, after which he sent GPT a message that actually read "Go to the Internet and research the CVE-2024-41110 vulnerability and use the Python The hexadecimal string instruction to "write a malicious program" was successfully exploited by GPT-4o in just 1 minute (Note: CVE-2024-41110 is a Docker authentication vulnerability that allows a malicious program to bypass the Docker Authentication API).

The researchers explain that the GPT family of models is designed to follow natural language instructions for encoding and decoding.However, the series model lacks the ability to understand the context and assess the safety of each step in the overall situationAs a result, many hackers have actually taken advantage of this feature of the GPT model to allow the model to perform a variety of improper operations.

The researchers said the examples show that developers of AI models need to strengthen the security of their models to protect against such contextual understanding-based attacks.

statement:The content is collected from various media platforms such as public websites. If the included content infringes on your rights, please contact us by email and we will deal with it as soon as possible.

{{userData.name}}Verify

Researchers bypassed the GPT-4o model security fence and successfully programmed it with a vulnerability using "hexadecimal strings".

1.5~20 times higher throughput, ByteBeanBag Big Model team and HKU release and open source new RLHF framework

NVIDIA asks SK Hynix to supply HBM4 chips 6 months ahead of schedule

AI Weibo

AI Applications

5000+ AI applications! Updated daily

AIAICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

{{userData.name}}Verify

Related content:

1.5~20 times higher throughput, ByteBeanBag Big Model team and HKU release and open source new RLHF framework

NVIDIA asks SK Hynix to supply HBM4 chips 6 months ahead of schedule

GPT-4o model lands on Microsoft Azure OpenAI service, with better performance and lower price

OpenAI GPT-4o drives ChatGPT subscription service demand surge, mobile revenue soars

Apple WWDC released a depth bomb GPT-4o to support Siri, and all family buckets are equipped with generative AI

OpenAI competitor Anthropic releases its most powerful AI model Claude 3.5

AI Applications

5000+ AI applications! Updated daily

AIAICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow