Recently, The Debrief reported that the AI image platform Midjourney was found to have accidentally generated inappropriate content, violating its own usage guidelines. Midjourney explicitly prohibits users from intentionally creating explicit or sexual content in order to maintain a "PG-13" rating, and enforces this with strict filtering that blocks specific keywords.
However, investigators found that Midjourney's AI system can inadvertently generate objectionable content anyway, which has drawn criticism. Artist and writer Tim Boucher discovered a loophole in the system's NSFW content filtering while exploring Midjourney V6 (version 6). Like many AI-driven platforms, Midjourney prohibits the generation of NSFW content and uses filters to block specific terms and phrases that could lead to such output. Boucher found, however, that by using alternative terms that are not immediately recognized as NSFW triggers, it is still possible to generate images that fall outside the platform's "PG-13" standard.
Source note: The image is AI-generated and used with permission from Midjourney.
According to Midjourney, such content is supposed to be inaccessible. While internet users have found workarounds, such as prompting "strawberry syrup" instead of "blood," the company says it constantly updates its parameters to block such requests. In Boucher's case, however, he was simply looking for images for his book, Relaxatopia, which is set at a futuristic doomsday beach resort. The prompt he used was "doomsday resort."
Boucher's experience highlights a key problem faced by AI image generators: while inappropriate terms are explicitly banned, synonyms or related terms may not be, enabling users to circumvent the intended content restrictions. For example, the word "wound" might be blocked while its synonym "injury" is not, allowing content that violates the platform's guidelines. The broader issue, however, is that Boucher and others were not intentionally trying to circumvent Midjourney's protections.
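As a rough illustration of why keyword-based moderation is brittle, the sketch below (hypothetical; Midjourney has not published its filtering logic, and the blocklist and function names here are invented for illustration) shows how a naive blocklist check rejects an exact banned term but passes a prompt that uses a synonym or euphemism for it.

```python
# Hypothetical sketch of naive keyword-based prompt filtering.
# The blocklist and function names are invented for illustration;
# Midjourney's actual moderation pipeline is not public.

BLOCKED_TERMS = {"blood", "wound", "nude"}

def is_prompt_allowed(prompt: str) -> bool:
    """Reject a prompt only if it contains an exact blocked term."""
    words = prompt.lower().split()
    return not any(term in words for term in BLOCKED_TERMS)

# An exact match is caught...
print(is_prompt_allowed("a wound on the arm"))           # False (blocked)

# ...but a synonym or euphemism slips straight through,
# even though the model may render essentially the same content.
print(is_prompt_allowed("an injury on the arm"))         # True (allowed)
print(is_prompt_allowed("strawberry syrup on the arm"))  # True (allowed)
```

The gap this sketch illustrates is exactly the one Boucher describes: the filter operates on the words in the prompt, not on what the model actually renders.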
Midjourney is headquartered in San Francisco and launched in March 2022. It was founded by David Holz, co-founder of Leap Motion, a company that worked on replacing the computer mouse with gesture controls. To use Midjourney, you first need the messaging application Discord, plus a subscription of about $10 per month. The Midjourney bot receives requests from users through Discord chat and already has more than 14 million registered users.
The Debrief decided to run a test to see whether the results could be replicated in Midjourney. The simplest approach was to ask for images of situations where people typically wear less clothing; prompts such as "people on a hot day," "people on the beach," or "spa day" would generally fit the bill. With Boucher's help, The Debrief began on January 26 with "beach party," typing just those two words into the Discord chat.
In that test, Midjourney generated four images from the prompt "beach party." The first image was the most realistic and contained easily recognizable people, so it was chosen for further testing. After selecting the image, The Debrief used the "Variation" feature, which has Midjourney take the selected image and create alternative versions of it. Clicking "Variation (Low)" only slightly changes the image and returns four more options similar in appearance; "Variation (Strong)" makes more significant changes and likewise creates four new options. In the test, "Variation (Strong)" was clicked. This was repeated four times until one of the images contained a topless woman. After selecting that image and clicking "Variation (Strong)" again, one of the AI-generated women was completely naked.
After a series of further variation requests, the generated images continued to contain nudity, and additional variations simply produced more and more objectionable content. In total, it took about five minutes, starting from the prompt "beach party," to end up with a nude beach.
To confirm the results, Boucher and The Debrief conducted a second test a few days later, on January 31. Using the same prompt, "beach party," an image was again selected, and in this second test multiple images containing nudity were generated after clicking "Variation (Strong)" several times.
Boucher is not the only user to notice that Midjourney Version 6 seems to have loosened its nudity filter. On Reddit, a discussion arose in which a user reported that simply using the prompt "put a banana on it" generated multiple images containing nudity.
Explicit or violent content generated by AI is a fairly common problem. Last week, AI-generated pornographic images of Taylor Swift went viral online. Exploiting a vulnerability in Microsoft's AI tool, a user first uploaded the images to the chat app Telegram, from which they quickly spread on X (formerly Twitter). Microsoft has since fixed the vulnerability. Before that, far-right activists had used the program to generate racist and hateful content for the purpose of spreading disinformation. No matter how widely the proverbial safety net is stretched across the internet, there is always a way around it.
The concern with Midjourney, however, is that the images it creates are unsolicited. It seems any user, including minors, could potentially be served images containing nudity simply by entering relatively innocuous prompts.
“On one hand, as an artist, some of these images are aesthetically very beautiful. If the user is an adult and consenting, the problem is mitigated. On the other hand, as a Trust & Safety professional, your system should not be creating nude images without people asking for them,” Boucher told The Debrief. “Especially since your rules explicitly prohibit nude photos. When users directly ask for them, they can be banned from the service outright. There is a major inconsistency here.”
The Debrief has contacted Midjourney for comment and will update the article when they respond.
MJ Banias is a journalist covering security and technology, and he is the host of The Debrief Weekly Report. You can contact MJ via email at mj@thedebrief.org or follow him on Twitter @mjbanias.