AI image generation has a new leader! The open source model FLUX.1 has been released. Are Midjourney and DALL·E 3 nervous?

In artificial intelligence, disruptive changes can arrive on any given day. Just one day after Midjourney shipped a significant update, the open-source image generation field gained an eye-catching dark horse: FLUX.1. The newcomer claims not only to greatly surpass closed-source models such as DALL·E 3 and Midjourney V6, but also to beat the open-source SD3 series across the board, and it caused an instant sensation in AI circles.

Let us first meet the people behind FLUX.1. The model comes from Black Forest Labs, whose founder, Robin Rombach, is no unknown but an authority on diffusion models. His best-known work includes VQGAN (introduced in the Taming Transformers paper) and Latent Diffusion. He served as chief scientist at Stability AI and led the world-renowned Stable Diffusion series of projects. In short, Rombach is a veteran among veterans in AI image generation.


In March 2024, Rombach left Stability AI amid internal turmoil. Four months later he returned with FLUX.1, a new open-source suite of large models. Even more striking, his new company launched with a $31 million seed round led by the well-known venture capital firm Andreessen Horowitz, a strong boost for FLUX.1's future development.

So what makes FLUX.1 special? It is built on a diffusion transformer architecture, is trained with flow matching, and uses rotary positional embeddings and parallel attention layers to improve model performance and hardware efficiency. The 12-billion-parameter model comes in three versions:

  • Pro version: Available only through the API; the strongest performer.
  • Dev version: A guidance-distilled model for non-commercial use that retains most of the Pro version's performance.
  • Schnell version: An open-source model that can be used commercially; its performance is also outstanding.
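To make the flow matching training method mentioned above more concrete, here is a minimal sketch of a generic linear-path (rectified-flow style) objective in plain Python. It is not FLUX.1's actual training code, whose details are not given here. The function names `flow_matching_example` and `flow_matching_loss`, the toy four-value "latent", and the zero-velocity placeholder model are all illustrative assumptions.

```python
import random

def flow_matching_example(data, rng):
    """Build one training example for linear-path flow matching.

    `data` is a list of floats standing in for an image latent
    (an assumption for illustration). Returns (x_t, t, target_velocity).
    """
    # Sample Gaussian noise with the same length as the data.
    noise = [rng.gauss(0.0, 1.0) for _ in data]
    # Sample a time in [0, 1): t = 0 is pure noise, t -> 1 approaches the data.
    t = rng.random()
    # Point on the straight line between noise and data.
    x_t = [(1.0 - t) * n + t * d for n, d in zip(noise, data)]
    # Along a straight path the velocity is constant: data minus noise.
    target_velocity = [d - n for n, d in zip(noise, data)]
    return x_t, t, target_velocity

def flow_matching_loss(predict_velocity, data, rng):
    """Mean squared error between predicted and target velocity."""
    x_t, t, target = flow_matching_example(data, rng)
    predicted = predict_velocity(x_t, t)
    return sum((p - v) ** 2 for p, v in zip(predicted, target)) / len(target)

rng = random.Random(0)
data = [0.5, -1.0, 2.0, 0.0]

# Hypothetical placeholder model: always predicts zero velocity.
def zero_model(x_t, t):
    return [0.0] * len(x_t)

loss = flow_matching_loss(zero_model, data, rng)
print(f"loss for the zero-velocity placeholder: {loss:.4f}")
```

In a real system the placeholder would be a neural network trained to minimize this loss, and images would be generated by integrating the learned velocity field from noise toward data.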

According to the FLUX.1 team's own test data, even the open-source Schnell version surpasses mainstream models such as Midjourney v6.0, DALL·E 3 (HD), and SD3-Ultra in prompt adherence, image quality, consistency, coherence, and output diversity. FLUX.1 shows a particularly clear advantage in rendering text inside images.

FLUX.1's ambitions clearly do not stop there. The team says text-to-image is only the beginning, and it plans to launch text-to-video models in the future to challenge leading products such as Sora, Gen-3, and Luma.

For developers and AI enthusiasts, FLUX.1 is very good news. The Schnell version is fully open source and is already supported in ComfyUI. If you have more than 36 GB of video memory, you can even run the fp16 version of the T5 text encoder. Note that the text encoder files (t5xxl_fp16.safetensors and clip_l.safetensors) and the VAE must be downloaded separately.
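As a rough back-of-the-envelope check on these memory requirements, the sketch below estimates the space taken by the model weights alone, starting from the 12 billion parameter count given earlier. The function name `weight_memory_gb` and the precision table are illustrative assumptions, and the estimate deliberately ignores activations, the text encoders, and the VAE, so real usage will be higher.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

# 12 billion parameters, as stated in the article.
FLUX_PARAMS = 12_000_000_000

# Bytes per parameter for common storage precisions (assumed for illustration).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

for precision, size in BYTES_PER_PARAM.items():
    print(f"{precision}: about {weight_memory_gb(FLUX_PARAMS, size):.0f} GB")
```

At fp16 this works out to about 24 GB for the main model's weights before anything else is loaded, which helps explain why running the larger fp16 text encoder alongside it calls for a very large memory budget.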

The arrival of FLUX.1 brings new hope to open-source AI image generation and new energy to the wider AI industry. Its strong performance and open-source nature are likely to accelerate both the adoption of and innovation in AI image generation technology. For ordinary users, this means we may soon be able to run image generation models on home computers that match or even surpass Midjourney.

Project address: https://github.com/black-forest-labs/flux

Trial address: https://replicate.com/black-forest-labs/flux-pro

ComfyUI workflow: https://comfyanonymous.github.io/ComfyUI_examples/flux/

 
