TY - GEN
T1 - Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment
AU - Wang, Yuxing
AU - Lu, Yawen
AU - Xie, Zhihua
AU - Lu, Guoyu
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
N2 - We address the problem of reconstructing a 3D human face from multi-view facial images using Structure-from-Motion (SfM) based on deep neural networks. While recent learning-based monocular methods have shown impressive results for 3D facial reconstruction, the single-view setting is easily affected by depth ambiguities and poor face poses. In this paper, we propose a novel unsupervised 3D face reconstruction architecture that leverages multi-view geometry constraints to train accurate face pose and depth maps. Facial images from multiple perspectives of each 3D face model are input to train the network. Multi-view geometry constraints are fused into the unsupervised network by establishing loss constraints from spatial and spectral perspectives. To give the reconstructed 3D face more detail, a facial landmark detector is used to acquire massive facial information that constrains face pose and depth estimation. By minimizing the massive landmark displacement distance through bundle adjustment, an accurate 3D face model can be reconstructed. Extensive experiments demonstrate the superiority of our proposed approach over other methods.
KW - 3D face reconstruction
KW - deep learning
KW - landmark detection
KW - structure from motion
UR - https://www.scopus.com/pages/publications/85119373842
U2 - 10.1145/3474085.3475689
DO - 10.1145/3474085.3475689
M3 - Conference contribution
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 1350
EP - 1358
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -