InstantID tutorial: an AI image tool that generates pictures in different styles from a single photo in seconds

Recently, AI portrait generation has become very popular. This article introduces InstantID, which achieves personalized image synthesis from a single reference face image while preserving identity with high fidelity, and supports a wide range of styles.


Project homepage: https://instantid.github.io/

Code: https://github.com/InstantID/InstantID

Online demo: https://huggingface.co/spaces/InstantX/InstantID

1. Introduction to InstantID

InstantID is introduced in the paper "InstantID: Zero-shot Identity-Preserving Generation in Seconds". "Zero-shot" here means that no per-person training or fine-tuning is required.

InstantID is a solution based on diffusion models. Its plug-and-play module handles image personalization in various styles using only a single face image while maintaining high fidelity. At its core is a novel IdentityNet, which combines the face image and a facial landmark image with the text prompt, and guides image generation by imposing strong semantic and weak spatial conditions.

Given only one reference ID image, InstantID aims to generate customized images in various poses or styles while ensuring high fidelity. It consists of three key components:

(1) ID embedding that captures semantic face information;

(2) A lightweight adapter module with decoupled cross-attention, which allows an image to be used as a visual prompt;

(3) IdentityNet, which encodes detailed features of reference facial images through additional spatial control.
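
Component (2) can be illustrated with a toy example. The following is a minimal sketch of decoupled cross-attention as generally described for IP-Adapter-style adapters, assuming the queries, keys, and values have already been projected. The function names and the tiny vectors are illustrative only and are not taken from the InstantID code. The text branch and the image branch each run their own attention over the same queries, and the image result is added with a weight, called ip_adapter_scale here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain lists of vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def decoupled_cross_attention(queries, text_kv, image_kv, ip_adapter_scale=0.8):
    """Attend to text and image features separately, then sum the results.

    text_kv and image_kv are (keys, values) pairs. In a real model each branch
    has its own learned key/value projections; here they are passed in directly.
    """
    text_out = attention(queries, *text_kv)
    image_out = attention(queries, *image_kv)
    return [
        [t + ip_adapter_scale * i for t, i in zip(text_row, image_row)]
        for text_row, image_row in zip(text_out, image_out)
    ]


# Toy inputs: one query, one text token, one image (ID) token.
queries = [[1.0, 0.0]]
text_kv = ([[1.0, 0.0]], [[2.0, 3.0]])
image_kv = ([[0.0, 1.0]], [[10.0, 20.0]])
result = decoupled_cross_attention(queries, text_kv, image_kv, ip_adapter_scale=0.5)
```

Setting ip_adapter_scale to 0 in this sketch reduces it to ordinary text-only cross-attention, which is why lowering that weight gives the text prompt more influence.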

2. InstantID Features

Feature 1: Generate images in any style from a face

Feature 2: Editability

You can edit the generated images through text prompts, for example by changing a character's expression, the background, or other elements of the image. You can also use ControlNet to control the details of image generation more precisely and achieve personalized customization.

Feature 3: Multiple references

Multiple reference images can be used to generate a new image, which enhances the richness and diversity of the results.

When multiple reference images are given, the average of their ID embeddings is used as the image prompt. InstantID achieves good results even with only one reference image.
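
The averaging step can be sketched in a few lines. This is a minimal illustration that assumes each ID embedding is a plain list of floats of the same length; the function name average_id_embedding is hypothetical and the real embeddings come from a face recognition model, not from hand-written numbers.

```python
def average_id_embedding(embeddings):
    """Return the element-wise mean of several ID embeddings.

    embeddings: non-empty list of equal-length lists of floats.
    """
    if not embeddings:
        raise ValueError("at least one reference embedding is required")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    count = len(embeddings)
    return [sum(e[i] for e in embeddings) / count for i in range(dim)]


# Two toy 2-dimensional embeddings standing in for two reference photos.
mean_embedding = average_id_embedding([[1.0, 2.0], [3.0, 4.0]])
```

With a single reference image the function simply returns that image's embedding unchanged, which matches the single-photo workflow.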

InstantID is also flexible enough to transfer identity attributes to non-human characters.

3. Comparison between InstantID and similar products

Comparison 1: InstantID vs. IP-Adapter/IP-Adapter-FaceID/PhotoMaker

Here InstantID is compared with IP-Adapter (IPA), IP-Adapter-FaceID, and the more recent PhotoMaker. Among them, PhotoMaker needs to train the LoRA parameters of the UNet. Both PhotoMaker and IP-Adapter-FaceID achieve good fidelity, but their text control ability degrades noticeably. In contrast, InstantID achieves better fidelity and retains good text editability (faces and styles are better integrated).

Comparison 2: InstantID vs. LoRA

InstantID achieves results competitive with LoRA without any training.

Comparison 3: InstantID vs. InsightFace Swapper

In non-realistic styles, InstantID blends the face with the background more flexibly.

4. InstantID User Experience

Next, let's try it in the online demo on Hugging Face.

The steps are explained at the top of the page, and the core workflow takes only four steps.

[Step 1]: Upload a photo of a person

For multi-person images, we will only detect the largest face. Make sure the face is not too small and not significantly occluded or blurred.
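
Picking the largest face usually means picking the detection with the largest bounding-box area. The sketch below assumes each detected face is a dict with a "bbox" entry in [x1, y1, x2, y2] pixel coordinates, which is a common detector output shape; the function name largest_face and the sample boxes are hypothetical.

```python
def largest_face(faces):
    """Return the detected face with the largest bounding-box area, or None."""

    def area(face):
        x1, y1, x2, y2 = face["bbox"]
        # Guard against malformed boxes with negative width or height.
        return max(0, x2 - x1) * max(0, y2 - y1)

    if not faces:
        return None
    return max(faces, key=area)


# Two toy detections: a small face in the corner and a larger one in the center.
detections = [
    {"bbox": [0, 0, 10, 10]},
    {"bbox": [5, 5, 105, 55]},
]
chosen = largest_face(detections)
```

An empty detection list returns None here, so a caller can show a "no face found" message instead of failing.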

For example, we upload a photo of Fairy Zixia here.

[Step 2]: (Optional) Upload an image of another person as the pose reference

If no pose image is uploaded, the landmarks are extracted from the image uploaded in step 1. If a cropped face was used in step 1, it is recommended to upload a pose reference here so that a new pose can be extracted.

[Step 3]: Write a prompt

Prompt: A beautiful woman was sitting on the grass in the park

[Step 4]: Image generation

First select a style, then click the "Submit" button to generate the image. Below are the results for the different styles.

Style 1: WaterColor

Style 2: Film Noir (black and white film)

Style 3: Neon

Style 4: Jungle

Style 5: Mars

Style 6: Vibrant Color

Style 7: Snow

Style 8: Line art

Judging from the generated pictures, the character's identity stays consistent across all styles and remains very similar to the original photo.
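
Style presets like these are often implemented as prompt templates that wrap the user's prompt. The sketch below assumes that approach and may differ from what the demo actually does; the template wording, the STYLE_TEMPLATES table, and the apply_style helper are all hypothetical.

```python
# Hypothetical templates: (positive template, style-specific negative prompt).
STYLE_TEMPLATES = {
    "Film Noir": ("film noir style, monochrome, {prompt}, dramatic shadows", "colorful, cartoon"),
    "Neon": ("neon-lit scene, {prompt}, glowing lights", "dull, washed out"),
    "Line art": ("line art drawing, {prompt}, clean outlines", "photo, realistic"),
}


def apply_style(style_name, prompt, negative_prompt=""):
    """Wrap the prompt in a style template and merge the negative prompts.

    Unknown style names fall back to the unmodified prompt.
    """
    template, style_negative = STYLE_TEMPLATES.get(style_name, ("{prompt}", ""))
    positive = template.replace("{prompt}", prompt)
    negative = ", ".join(part for part in (style_negative, negative_prompt) if part)
    return positive, negative


styled = apply_style("Neon", "a woman in a park", negative_prompt="blurry")
```

Because the same identity inputs are reused and only the text changes, a setup like this makes it easy to compare one face across many styles.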

5. Related Notes

(1) If you are not satisfied with the similarity, you can increase the weights of controlnet_conditioning_scale (IdentityNet) and ip_adapter_scale (Adapter) appropriately.

(2) If the generated image is oversaturated, reduce the weight of ip_adapter_scale. If that does not work, reduce the weight of controlnet_conditioning_scale.

(3) If the result does not follow the text prompt as expected, reduce the weight of ip_adapter_scale.

(4) Choosing a good base model is important.
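
Notes (1) to (3) can be summarized as a small decision helper. This is only a sketch of the tuning advice above: the parameter names controlnet_conditioning_scale and ip_adapter_scale come from the notes, while the function adjust_scales, the symptom labels, the 0.1 step, and the 0.0 to 1.5 clamp range are assumptions chosen for illustration.

```python
def adjust_scales(controlnet_conditioning_scale, ip_adapter_scale, symptom, step=0.1):
    """Suggest new weights for a given symptom.

    symptom is one of:
      "low_similarity"      -> raise both weights (note 1)
      "oversaturated"       -> lower ip_adapter_scale first (note 2)
      "still_oversaturated" -> then lower controlnet_conditioning_scale (note 2)
      "text_ignored"        -> lower ip_adapter_scale (note 3)
    """

    def clamp(value):
        # Assumed working range; rounding hides floating-point noise.
        return round(min(max(value, 0.0), 1.5), 2)

    if symptom == "low_similarity":
        controlnet_conditioning_scale += step
        ip_adapter_scale += step
    elif symptom in ("oversaturated", "text_ignored"):
        ip_adapter_scale -= step
    elif symptom == "still_oversaturated":
        controlnet_conditioning_scale -= step
    else:
        raise ValueError(f"unknown symptom: {symptom}")
    return clamp(controlnet_conditioning_scale), clamp(ip_adapter_scale)


# Starting from 0.8 / 0.8, the face does not look similar enough.
suggested = adjust_scales(0.8, 0.8, "low_similarity")
```

Adjusting in small steps and regenerating after each change makes it easier to see which of the two weights is responsible for a given artifact.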

That's all for today. If you are interested, give it a try.
