Lately the AI world has been all about digital humans: the demos keep getting better-looking, and every major player is launching its own "strongest open-source" digital human.
But with so many choices, how do you know which one is right for you? Surely it won't end in "me + difficulty = giving up"?
No way! Doting fan that I am, there's no chance I'll let everyone face such a dilemma alone!
So I made a decisive move:
a one-stop roundup of the digital human integration packages I've shared before, covering the results they achieve, the hardware they need, how long generation takes, and more. Read it all in one sitting and find out together which open-source digital human is currently the strongest!
Digital humans are on fire!
If you ask what's hottest in the AI world right now, the answer is, quite possibly, digital humans.
Stability AI, the company behind AI painting, is repeatedly rumored to be on the verge of collapse. Large-model giants at home and abroad have price-warred each other into a muddle: API calls now sell at cabbage prices, where one yuan buys enough tokens to generate several copies of Dream of the Red Chamber. How is anyone supposed to make money that way?
Digital humans in the AI world, though, can genuinely bring in big money. Here's an example: in mid-April this year, the digital human "Caixiao Dong", modeled on JD.com founder Liu Qiangdong, appeared in a JD livestream and reproduced not only his pace of speech but also his accent and habitual gestures.
It occasionally rubbed its fingers together while speaking, emphasized points with bigger hand movements, and nodded from time to time. Onlookers said they could hardly tell that this "Dong" was a digital human!
That debut lasted less than an hour, drew more than 20 million views, and racked up over 50 million yuan in cumulative sales.
After that successful livestream, JD.com launched a "presidents livestreaming as digital humans" campaign during this year's 618 shopping festival. Executives from Gree, Hisense, LG, Mingchuang Youpin (MINISO), Jieliya, Corvus, vivo, Samsung and other companies went live as a group in digital human form.
Data disclosed by JD.com shows that its Yanxi digital humans have so far served more than 5,000 brands and driven over 10 billion yuan in GMV.
Returns like these have turned many heads toward digital humans. Commercial systems that deliver such results still cost serious money, but AI keeps advancing, and the digital humans coming out of the open-source world keep getting stronger!
So next, let me run through those superb open-source digital humans for you~
Which open-source digital human is the strongest?
Digital human technology, a concept that once appeared only in science-fiction movies, is now stepping into everyday life. With the rapid development of AI, competition among open-source digital human projects has grown ever fiercer, and every major team has shown its trump card.
Below we'll look at the results of the major open-source digital humans, roughly in the order the technology evolved, so you can see the progress at a glance!
①Wav2Lip: Wav2Lip is a deep-learning-based, speech-driven facial animation algorithm and one of the earliest techniques used for digital humans. Its core idea is to map information in the speech signal to facial animation parameters, generating realistic lip and face motion.
- Configuration Requirements: Wav2Lip is relatively easy on hardware; a modest 4 GB of VRAM is enough to run it. Generating one digital human video of about 1 minute takes 5~15 minutes of processing.
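The core idea of mapping speech to lip frames can be pictured as aligning each video frame with a window of audio features. Here is a toy sketch of that alignment; the frame rate, mel rate, and window size are illustrative assumptions, not Wav2Lip's exact internals:

```python
FPS = 25                 # assumed video frame rate
MEL_STEPS_PER_SEC = 80   # assumed mel-spectrogram steps per second of audio

def mel_window_for_frame(frame_idx, window=16):
    """Return the [start, end) mel-step slice that would condition the
    lip generator for one video frame (toy illustration)."""
    center = int(frame_idx / FPS * MEL_STEPS_PER_SEC)
    start = max(0, center - window // 2)
    return start, start + window

# Frame 50 sits at t = 2 s, so its window is centered near mel step 160.
print(mel_window_for_frame(50))
```

Each generated mouth frame is thus conditioned on a short stretch of audio around its timestamp, which is what keeps the lips in sync with the speech.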
②SadTalker: SadTalker is an open-source project from Xi'an Jiaotong University. It learns 3D motion coefficients from audio and uses a novel 3D facial renderer to generate head movements, so that a single picture plus an audio clip yields a high-quality talking video.
- Configuration Requirements: Because SadTalker's output quality is a notch higher, its hardware demands rise as well: a computer with about 6 GB of VRAM runs it smoothly, while less VRAM, or CPU-only operation, will be slower. Generating a 1-minute digital human video takes 10~20 minutes of processing.
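Among SadTalker's learned 3D motion coefficients are head-pose angles; conceptually, applying a predicted yaw to 3D facial keypoints is just a rotation. A minimal sketch, with made-up coefficient values for illustration:

```python
import math

def rotate_yaw(point, yaw_rad):
    """Rotate a 3D keypoint (x, y, z) about the vertical axis,
    the way a predicted head-yaw coefficient would turn the head."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)

# Turning the head 90 degrees swings a point on the nose tip sideways.
nose = (0.0, 0.0, 1.0)
print(rotate_yaw(nose, math.pi / 2))
```

The real pipeline predicts full pose plus expression coefficients per audio frame and feeds them to the 3D renderer, but the geometry at the heart of it is this kind of transform.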
③MuseTalk: MuseTalk is a digital human project released by Tencent that supports real-time, audio-driven lip-synced digital humans. Its core technology automatically adjusts the character's facial image based on the audio signal, keeping the lip shapes highly consistent with the audio content: just feed in audio, and your digital character achieves near-perfect lip sync.
- Configuration Requirements: MuseTalk needs a computer with roughly 6 GB of VRAM to run smoothly, and generating one digital human video of about 1 minute takes 10~20 minutes of processing, about the same as SadTalker.
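"Real-time" here has a concrete meaning: at 25 fps, each frame must be produced in under 40 ms, or the lips fall behind the audio. A quick budget check (the per-frame timings below are placeholders, not measurements of MuseTalk):

```python
def is_realtime(ms_per_frame, fps=25):
    """True if per-frame generation fits inside the playback budget."""
    return ms_per_frame <= 1000 / fps

print(is_realtime(30))   # fits the 40 ms budget at 25 fps
print(is_realtime(55))   # too slow for live lip sync
```

This is why real-time projects report frames per second on a given GPU rather than just total render time: the per-frame latency is what decides whether live driving is possible.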
④Hallo: Hallo, a digital human project developed by Baidu together with Fudan University, ETH Zurich and Nanjing University, has made impressive progress in audio-driven portrait animation. It uses advanced AI to generate realistic, dynamic portrait videos from a voice input, analyzing the audio to synchronize the portrait's facial movements, including the lips, expressions, and head pose, and ultimately rendering a digital human with stunning results.
- Configuration Requirements: Hallo's digital humans look great, but honestly, it devours machine performance: in my testing it needs a GPU with 10 GB of VRAM or more to run. On top of that, generating a 1-minute digital human video takes 30~40 minutes of processing.
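Driving lips, expression, and head pose separately can be pictured as blending per-region motion under region masks. A toy 1-D version of that idea; the region names, masks, and values are illustrative, not Hallo's actual representation:

```python
def blend_regions(motions, masks):
    """Sum per-region motion values, each weighted by its region mask."""
    length = len(next(iter(motions.values())))
    out = [0.0] * length
    for region, motion in motions.items():
        mask = masks[region]
        for i in range(length):
            out[i] += mask[i] * motion[i]
    return out

# Two "pixels": the first belongs to the lips region, the second to pose.
motions = {"lips": [1.0, 1.0], "pose": [0.5, 0.5]}
masks   = {"lips": [1.0, 0.0], "pose": [0.0, 1.0]}
print(blend_regions(motions, masks))
```

Separating the regions is what lets the audio drive the mouth strongly while the head pose moves on its own slower rhythm.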
⑤LivePortrait: LivePortrait is a stunning digital human project open-sourced by Kuaishou. What makes it remarkable is that it can precisely control the direction of eye gaze and the opening and closing of the lips, and it can even handle seamless stitching across multiple portraits.
- Configuration Requirements: Compared with Hallo, LivePortrait not only produces good results but also lowers the hardware bar considerably: in my testing, an 8 GB GPU runs it smoothly, and 6 GB of VRAM also works. Generating a 1-minute digital human video takes 10~20 minutes of processing.
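Controlling eye and lip opening amounts to retargeting a scalar "openness" from a driving clip onto the source portrait. A hedged sketch of that idea, where the ratios and the linear mapping are simplifications of my own, not LivePortrait's exact formula:

```python
def retarget_openness(source_neutral, driving, driving_neutral, gain=1.0):
    """Map the driving face's change in openness (e.g. eyelid or lip gap)
    onto the source portrait, relative to each face's neutral pose."""
    delta = driving - driving_neutral
    return max(0.0, source_neutral + gain * delta)

# The driving face opens its lips from 0.2 to 0.5; apply that change
# to a source portrait whose neutral lip gap is 0.1.
print(retarget_openness(0.1, 0.5, 0.2))
```

Working with deltas from each face's own neutral pose is what makes the motion transferable between faces with very different proportions.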
⑥EchoMimic: Traditional digital human techniques rely on either audio driving or facial-keypoint driving, each with its own pros and cons. EchoMimic cleverly combines the two, training on both audio and facial keypoints to achieve more realistic, natural portrait animation.
- Configuration Requirements: A digital human generated by EchoMimic is basically indistinguishable from a real person, which is to say, quite convincing. And it doesn't raise the hardware bar: an 8 GB GPU runs it smoothly. Generation time does grow slightly, though: a 1-minute digital human video takes roughly 15~30 minutes of processing.
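The dual driving can be sketched as conditional fusion: use both signals when landmarks are available, and fall back gracefully to audio alone otherwise. The feature vectors and the blending weight below are made-up placeholders, not EchoMimic's actual conditioning scheme:

```python
def fuse_conditions(audio_feat, landmark_feat=None, w_audio=0.5):
    """Blend audio and facial-keypoint features when both are present;
    degrade gracefully to audio-only driving when landmarks are missing."""
    if landmark_feat is None:
        return list(audio_feat)
    return [w_audio * a + (1.0 - w_audio) * l
            for a, l in zip(audio_feat, landmark_feat)]

print(fuse_conditions([0.2, 0.8]))               # audio-only driving
print(fuse_conditions([0.2, 0.8], [0.6, 0.4]))   # dual-driven
```

Training on both modalities is what lets a single model accept either driving signal at inference time instead of forcing you to pick one pipeline up front.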
Conclusion
Digital human technology keeps breaking through the limits of our imagination. I know you like pictures, and as it happens I have plenty, so let's go straight to a comparison chart of the progress:
As AI continues to advance, open-source digital humans keep getting more powerful. If you're curious about digital human technology and want to experience these stunning results first-hand, there's no better time than now.