According to NewAtlas, researchers deployed teams of autonomous, collaborative GPT-4 bots that could coordinate their actions and spawn new "helper" agents as needed. The bot teams successfully hacked more than half of the test sites. Even more surprisingly, they exploited real-world "zero-day" vulnerabilities, flaws that were previously unknown and had never been publicized.
Image source: Pexels
A few months ago, the same researchers published a paper claiming that they were able to use GPT-4 to automatically exploit "N-day" vulnerabilities, i.e., vulnerabilities already known to the industry but not yet fixed. In those experiments, GPT-4 autonomously exploited 87% of the tested vulnerabilities, working solely from their descriptions in the Common Vulnerabilities and Exposures (CVE) list.
This week, the research team released a follow-up paper saying they had gone further and exploited "zero-day" vulnerabilities, those that have not yet been discovered or disclosed. They used a method called Hierarchical Planning with Task-Specific Agents (HPTSA) to get a group of autonomous Large Language Model (LLM) agents to work together.
Instead of a single LLM attempting every complex task on its own, the HPTSA approach uses a "planning agent" to oversee the entire process and spawn multiple task-specific "subagents". Like a manager delegating to subordinates, the planning agent coordinates the work and assigns tasks to expert subagents, and this division of labor keeps any single agent from being overwhelmed by a difficult task.
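The planner/subagent pattern described above can be sketched in a few lines of Python. This is a minimal illustration of the general architecture, not the researchers' actual implementation; the class names, the on-demand `spawn_expert` helper, and the toy "specialty" tasks are all hypothetical.

```python
class ExpertAgent:
    """A task-specific 'subagent' that handles one kind of task (hypothetical stub)."""

    def __init__(self, specialty):
        self.specialty = specialty

    def attempt(self, task):
        # A real subagent would drive an LLM here; this stub simply
        # reports success when the task matches its specialty.
        return task == self.specialty


class PlanningAgent:
    """The 'manager' that oversees the process and delegates to expert subagents."""

    def __init__(self):
        self.experts = {}

    def spawn_expert(self, specialty):
        # Create a new helper agent on demand, reusing one if it already exists.
        return self.experts.setdefault(specialty, ExpertAgent(specialty))

    def run(self, tasks):
        # Assign each task to an expert subagent and collect the results.
        results = {}
        for task in tasks:
            expert = self.spawn_expert(task)
            results[task] = expert.attempt(task)
        return results


planner = PlanningAgent()
print(planner.run(["sqli-probe", "xss-probe"]))
# {'sqli-probe': True, 'xss-probe': True}
```

The key design idea is that the planning agent never performs a task itself: it only decomposes the job and routes each piece to a specialist, which is the division of labor the paper credits for the improved success rate.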
In tests against 15 real-world vulnerabilities, HPTSA outperformed a single LLM by 550% at exploiting them, successfully exploiting eight of the zero-day vulnerabilities for a 53% success rate, compared with only three exploited by a solo LLM.
Daniel Kang, one of the researchers and an author of the paper, specifically noted that there are legitimate concerns these models could be used maliciously to attack websites and networks. But he also emphasized that GPT-4 in chatbot mode is "insufficient for understanding LLM capabilities" and cannot carry out any attacks on its own.
When NewAtlas editors asked ChatGPT whether it could exploit zero-day vulnerabilities, it replied, "No, I cannot exploit zero-day vulnerabilities. My purpose is to inform and help within an ethical and legal framework," and recommended consulting a cybersecurity professional instead.