Stanford University has proposed a method called OccFusion, designed to enable high-fidelity rendering of occluded human bodies. In other words, OccFusion renders the full human form even when part of the body is hidden behind other objects.
Traditional human rendering methods usually require every part of the body to be fully visible in the video. In real life, however, occlusion is common, so the body is often only partially visible. OccFusion combines efficient 3D Gaussian splatting with supervision from a pre-trained 2D diffusion model to achieve efficient, high-fidelity human rendering.
The method consists of three phases: initialization, optimization, and refinement. In the initialization phase, a complete human body mask is generated from the partial visibility mask; in the optimization phase, the human Gaussians are optimized with conditional score distillation sampling (SDS); finally, in the refinement phase, rendering quality is further improved through in-context inpainting.
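To make the score distillation sampling used in the optimization phase more concrete, here is a minimal NumPy sketch of the generic SDS update. It is not OccFusion's implementation: `toy_denoiser` is a hypothetical stand-in for the pre-trained, conditioned 2D diffusion model, the "render" is just an 8×8 array optimized directly rather than an image produced by 3D Gaussians, and the noise range, weighting, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np


def toy_denoiser(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pre-trained, conditioned diffusion model.

    If the data distribution were concentrated on the single image `target`,
    the ideal noise prediction would have this closed form.
    """
    return (x_t - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)


def sds_gradient(x, target, rng):
    """One SDS gradient with respect to the rendered image x."""
    alpha_bar = rng.uniform(0.02, 0.98)      # random noise level (assumed range)
    eps = rng.standard_normal(x.shape)       # Gaussian noise injected into the render
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = toy_denoiser(x_t, alpha_bar, target)
    weight = 1.0 - alpha_bar                 # sigma_t^2 weighting, one possible choice
    # SDS uses the weighted noise residual as the gradient on the render.
    return weight * (eps_hat - eps)


def optimize(steps=500, lr=0.1, seed=0):
    """Gradient descent on the render using SDS gradients only."""
    rng = np.random.default_rng(seed)
    target = np.full((8, 8), 0.5)   # stands in for a complete, unoccluded human image
    x = np.zeros((8, 8))            # stands in for the current render
    for _ in range(steps):
        x = x - lr * sds_gradient(x, target, rng)
    return x, target


if __name__ == "__main__":
    x, target = optimize()
    print("max abs error:", np.abs(x - target).max())
```

In a real pipeline the gradient would be backpropagated through a differentiable renderer into the Gaussian parameters instead of being applied to the pixels directly, and the denoiser would be a learned network conditioned on the available cues about the person.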
OccFusion was evaluated on ZJU-MoCap and the challenging OcMotion sequences, where it achieved state-of-the-art performance in occluded human rendering. The entire training process took just 10 minutes on a single Titan RTX GPU.