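An earlier article introduced Joy Caption, one of the best models currently available for reverse-engineering prompts from images. Joy Caption has drawn attention because it produces detailed, specific image descriptions and supports both SFW and NSFW images. It combines Google's siglip-so400m-patch14-384 model with a Meta-Llama-3.1-8B-bnb-4bit model fine-tuned for image prompts. Running Joy Caption locally takes about 7 GB of VRAM.
Joy Caption Installation Guide
First, search for the Comfyui_CXH_joy_caption plugin in the ComfyUI plugin manager, install it, and restart ComfyUI. Then download the required models, which is the more involved part of a local installation: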
- Make sure the local environment has transformers>=4.44.2 installed. Plugin repository: https://github.com/StartHua/Comfyui_CXH_joy_caption
- Download the google/siglip-so400m-patch14-384 model and place it in /ComfyUI/models/clip/siglip-so400m-patch14-384. Download every file in the repository. Model page: https://huggingface.co/google/siglip-so400m-patch14-384
- Download the unsloth/Meta-Llama-3.1-8B-bnb-4bit model and place it in /ComfyUI/models/LLM. Download every file in the repository. Model page: https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit
- Download the fancyfeast/joy-caption-pre-alpha model and place it in /ComfyUI/models/Joy_caption. Here every file has to be downloaded manually. Model page: https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/tree/main/wpkklhc6
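The checklist above can be verified with a short script before launching ComfyUI. This is only a minimal sketch: the helper names (`version_tuple`, `meets_minimum`, `missing_model_dirs`, `check_install`) are made up for illustration, the ComfyUI root is assumed to be a folder named `ComfyUI` in the current directory, and the script only checks that each model folder exists and is non-empty, not that every file is present.

```python
from importlib import metadata
from pathlib import Path

# Minimum transformers version required by the plugin, per the list above.
MIN_TRANSFORMERS = (4, 44, 2)

# Model folders relative to the ComfyUI root, taken from the list above.
MODEL_DIRS = [
    "models/clip/siglip-so400m-patch14-384",
    "models/LLM",
    "models/Joy_caption",
]


def version_tuple(text):
    """Turn '4.44.2' or '4.45.0.dev0' into a tuple of its leading integers.

    Simplification: pre-release suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the installed version string satisfies the minimum."""
    return version_tuple(installed) >= minimum


def missing_model_dirs(comfyui_root):
    """Return the model folders that are absent or empty under comfyui_root."""
    root = Path(comfyui_root)
    missing = []
    for rel in MODEL_DIRS:
        folder = root / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


def check_install(comfyui_root):
    """Collect human-readable problems; an empty list means nothing was flagged."""
    problems = []
    try:
        installed = metadata.version("transformers")
        if not meets_minimum(installed):
            problems.append(f"transformers {installed} is older than 4.44.2")
    except metadata.PackageNotFoundError:
        problems.append("transformers is not installed in this environment")
    for rel in missing_model_dirs(comfyui_root):
        problems.append(f"model folder missing or empty: {rel}")
    return problems


if __name__ == "__main__":
    # "ComfyUI" is an assumed location; point this at your own install.
    for line in check_install("ComfyUI") or ["no problems found"]:
        print(line)
```

Run it with the same Python interpreter that ComfyUI uses, otherwise the transformers version it reports may not be the one the plugin actually loads.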
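If downloading the files by hand is too tedious, you can use git instead: in CMD, navigate to the target directory and clone the repository there. For example, for the google/siglip-so400m-patch14-384 model: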
git lfs install
# Replace with the git URL of the model you want to download
git lfs clone https://huggingface.co/google/siglip-so400m-patch14-384
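The same downloads can also be scripted in Python with the `huggingface_hub` package, which is an alternative to git rather than something the plugin itself requires. The sketch below makes several assumptions: the helper names `plan_downloads` and `download_all` are hypothetical, the ComfyUI root is a folder named `ComfyUI`, the Llama files go into a `Meta-Llama-3.1-8B-bnb-4bit` subfolder of `models/LLM`, and the Joy Caption files are fetched from the `wpkklhc6` folder of the Space so that they end up under `models/Joy_caption/wpkklhc6`. Check the plugin README for the exact layout it expects.

```python
from pathlib import Path

# (repo_id, repo_type, allow_patterns, target folder relative to the ComfyUI root)
# The target subfolders are assumptions based on the installation list.
DOWNLOADS = [
    ("google/siglip-so400m-patch14-384", "model", None,
     "models/clip/siglip-so400m-patch14-384"),
    ("unsloth/Meta-Llama-3.1-8B-bnb-4bit", "model", None,
     "models/LLM/Meta-Llama-3.1-8B-bnb-4bit"),
    # Joy Caption is hosted as a Space; only the wpkklhc6 folder is needed.
    ("fancyfeast/joy-caption-pre-alpha", "space", ["wpkklhc6/*"],
     "models/Joy_caption"),
]


def plan_downloads(comfyui_root):
    """Build keyword arguments for snapshot_download without touching the network."""
    root = Path(comfyui_root)
    plan = []
    for repo_id, repo_type, patterns, rel in DOWNLOADS:
        plan.append({
            "repo_id": repo_id,
            "repo_type": repo_type,
            "allow_patterns": patterns,
            "local_dir": str(root / rel),
        })
    return plan


def download_all(comfyui_root):
    """Download every repository in the plan (needs network and disk space)."""
    # Imported here so the plan can be inspected without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    for kwargs in plan_downloads(comfyui_root):
        print("downloading", kwargs["repo_id"], "->", kwargs["local_dir"])
        snapshot_download(**kwargs)


if __name__ == "__main__":
    # Dry run: print the plan. Call download_all("ComfyUI") to fetch the files.
    for item in plan_downloads("ComfyUI"):
        print(item)
```

Keeping the plan separate from the download step makes it easy to review the target folders before several gigabytes are fetched.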
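Basic Flux ComfyUI Workflow
For a walkthrough of running the Flux model in a local ComfyUI workflow, see the earlier article: FLUX [Sequel]: The Largest Open-Source Text-to-Image Model at 12B Parameters and 23 GB, with Stunning Images Straight from the Dev Version.
The ComfyUI workflows and models covered in this article can all be downloaded or run online on LIBLIBAI: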
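- FLUX.1, runnable online on Liblib, by Dark Forest Studio (黑暗森林工作室): https://www.liblib.art/modelinfo/488cd9d58cd4421b9e8000373d7da123
- Workflow: Flux text-to-image and image-to-image + LORA + CN + prompt reverse-engineering, with one-click switching: https://www.liblib.art/modelinfo/782aacd70f604da39e83368c696a02a8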
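Joy Caption Prompt Reverse-Engineering Workflow
To add Joy Caption prompt reverse-engineering, you only need to add two nodes to the basic Flux workflow: one that loads the Joy Caption model and one that generates the prompt from an image. The workflow can be downloaded from https://www.liblib.art/modelinfo/cc112e6f18bf46049b680ec4b42c511a .
Note: this article uses a Flux detail-and-texture enhancement LORA model to improve overall image quality. For an introduction to this LORA, see the earlier article: [ComfyUI] Flux: Awesome! Enhanced Detail and Texture, Realistic Portraits with Less Skin Shine, Cinematic Lighting, and Richer Scene Elements.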
01. Hanging Out the Laundry
flim rendering, depicting a young Asian woman standing on a rooftop on a bright, sunny day. She has long, straight black hair tied into a high ponytail, and is wearing a simple, white, short-sleeved T-shirt and light blue denim shorts. The woman is facing away from the camera, gazing towards the horizon with a serene expression. She holds a wooden clothespin in her right hand, which is holding a white T-shirt on a clothesline strung horizontally across the rooftop. The clothesline is made of thin, yellow string, and the clothespin is positioned near the sleeve of the shirt. In the background, there is a clear blue sky with a few scattered clouds, and a view of a cityscape featuring multiple high-rise apartment buildings with balconies. The rooftop surface is concrete, with a few small plants adding some greenery. The overall scene conveys a sense of tranquility and simplicity, with the bright sunlight casting soft, natural shadows.
02. Together
octane rendering,UE5,Maya,blender, . This is a digital photograph featuring two fingers, one on top of the other, with the tips touching. Each finger is drawn with black marker to resemble a person. The top finger is a girl, depicted with closed eyes and a small smile, suggesting happiness. She has a pink heart above her head, and her arms are bent at the elbow, with her hands clasped together. The bottom finger is a boy, with closed eyes and a small smile, also suggesting happiness. He has a pink heart above his head and his arms are bent at the elbow, with his hands clasped together. The fingers are positioned upright and facing each other, with the girl's finger on top. The background is a soft, pale yellow color, providing a neutral and soothing backdrop that enhances the warm, affectionate theme of the image. Text is written in black, playful, hand-like font, with the words "Together Forever" above the girl's finger, and "I love you..." below the boy's finger. The text is surrounded by small pink hearts, adding a whimsical touch. The overall mood is one of love and affection, conveyed through the simple yet charming depiction of the fingers and the accompanying text.
03. Selling Piglets
octane rendering,UE5,Maya,blender, Slung over his shoulder was a stick with two caged piglets on either side, of a photorealistic CGI (computer-generated imagery) artwork. This digital artwork depicts a chubby, adorable baby with dark hair and large, round eyes, dressed in a sleeveless, light pink dress with a subtle polka dot pattern. The baby is holding two woven baskets on either side of its body, balanced on its shoulders. Each basket contains a small piglet with pink skin and short snouts. The baby's expression is one of contentment and innocence. The background features a rural setting with a paved path leading into the distance, flanked by lush green foliage and a wooden building on the left. The lighting is soft and natural, creating a serene atmosphere. The textures are meticulously detailed, from the smoothness of the baby's skin to the coarse texture of the woven baskets and the soft fur of the piglets. This CGI artwork combines photorealism with a whimsical, almost surrealistic touch, enhancing the charm and cuteness of the subject.
04. Cucumber Fashion Show
octane rendering,UE5,Maya,blender, . This is a highly detailed, high-resolution photograph featuring a life-sized, stylized human figure crafted entirely from cucumber slices. The figure stands against a plain white background, emphasizing its vivid green hue. The person, with a serene expression, wears an elegant, sleeveless gown made from cucumber slices arranged in a layered, petal-like fashion. The gown's neckline is V-shaped, and the slices form a series of overlapping, scalloped edges that resemble a flower's petals. The figure's head is adorned with a crown of cucumber leaves, adding to the botanical theme. The texture of the cucumber slices is smooth and glossy, with the light reflecting off the wet surface, giving it a fresh, vibrant appearance. The overall effect is both surreal and artistic, blending elements of nature and human craftsmanship. The photograph captures the intricate details of the cucumber slices, emphasizing their natural patterns and the delicate arrangement that creates the gown. The figure's skin is a pale, almost translucent white, contrasting starkly with the green of the cucumber slices, enhancing the surreal nature of the image.