Digital Human and Video-Based Rendering

The commoditization of virtual and augmented reality devices and the availability of inexpensive consumer depth cameras have catalyzed a resurgence of interest in spatiotemporal performance capture. Recent systems like Fusion4D and Holoportation address several crucial problems in the real-time fusion of multiview depth maps into volumetric and deformable representations. Nonetheless, stitching multiview video textures onto dynamic meshes remains challenging due to imprecise geometries, occlusion seams, and critical time constraints. In this paper, we present a practical solution towards real-time seamless texture montage for dynamic multiview reconstruction. We build on the ideas of dilated depth discontinuities and majority voting from Holoportation to reduce ghosting effects when blending textures. In contrast to their approach, we determine the appropriate blend of textures per vertex using view-dependent rendering techniques, so as to avert fuzziness caused by the ubiquitous normal-weighted blending. By leveraging geodesics-guided diffusion and temporal texture fields, our algorithm mitigates spatial occlusion seams while preserving temporal consistency. Experiments demonstrate significant enhancement in rendering quality, especially in detailed regions such as faces. We envision a wide range of applications for Montage4D, including immersive telepresence for business, training, and live entertainment.
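The contrast drawn above between normal-weighted blending and view-dependent blending can be illustrated with a minimal sketch. It is not the Montage4D implementation: the function names, the cosine-power falloff, and the `power` parameter are assumptions chosen for illustration, and the geodesics-guided diffusion and temporal texture fields are left out entirely. It only shows why weighting by surface normal can mix several cameras equally, while weighting by agreement with the viewer's direction favors a single sharp texture.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalized_weights(raw):
    """Scale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(raw)
    if total == 0.0:
        return [0.0] * len(raw)
    return [w / total for w in raw]

def normal_weighted(normal, to_cameras, visible, power=2.0):
    """Hypothetical baseline: weight each camera by how directly
    the surface normal at the vertex faces that camera."""
    n = normalize(normal)
    raw = [
        max(0.0, dot(n, normalize(c))) ** power if vis else 0.0
        for c, vis in zip(to_cameras, visible)
    ]
    return normalized_weights(raw)

def view_dependent(to_viewer, to_cameras, visible, power=2.0):
    """Hypothetical alternative: weight each camera by how closely its
    direction from the vertex agrees with the current viewer's direction."""
    v = normalize(to_viewer)
    raw = [
        max(0.0, dot(v, normalize(c))) ** power if vis else 0.0
        for c, vis in zip(to_cameras, visible)
    ]
    return normalized_weights(raw)

# One vertex facing +z, seen by two cameras placed symmetrically around it.
normal = (0.0, 0.0, 1.0)
to_cameras = [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)]
visible = [True, True]

# Normal-weighted blending mixes both cameras equally at this vertex.
baseline = normal_weighted(normal, to_cameras, visible)

# A viewer standing where camera 0 is takes its texture from camera 0 only.
to_viewer = (1.0, 0.0, 1.0)
per_view = view_dependent(to_viewer, to_cameras, visible)

print(baseline, per_view)
```

In this toy setup the baseline averages two slightly misaligned projections, which is the kind of mix that can look fuzzy on imprecise geometry, whereas the view-dependent weights change with the viewer and select the best-aligned camera.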

Publications

Montage4D: Real-Time Seamless Fusion and Stylization of Multiview Video Textures

Journal of Computer Graphics Techniques (JCGT), 2019.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields

Fusing Multimedia Data Into Dynamic Virtual Environments

Ruofei Du
Ph.D. Dissertation, Computer Science Department, University of Maryland, College Park, 2018.
Keywords: social street view, geollery, spherical harmonics, 360 video, multiview video, montage4d, haptics, cryptography, metaverse, mirrored world

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Keywords: correspondences, geodesic distance, embeddings, neural networks

Montage4D: Interactive Seamless Fusion of Multiview Video Textures

Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields

Video Fields: Fusing Multiple Surveillance Videos Into a Dynamic Virtual Environment

Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: virtual reality, mixed-reality, video-based rendering, projection mapping, surveillance video, WebGL, WebVR

Videos

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence

Montage4D: Real-Time Seamless Fusion and Stylization of Multiview Video Textures

Cited By

  • Image-Guided Neural Object Rendering. 8th International Conference on Learning Representations. Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, and Matthias Nießner.
  • The Relightables: Volumetric Performance Capture of Humans With Realistic Relighting. ACM Transactions on Graphics. Kaiwen Guo, Peter Lincoln, Philip Davidson, Jay Busch, Xueming Yu, Matt Whalen, Geoff Harvey, Sergio Orts-Escolano, Rohit Pandey, Jason Dourgarian, Danhang Tang, Anastasia Tkach, Adarsh Kowdle, Emily Cooper, Mingsong Dou, Sean Fanello, Graham Fyffe, Christoph Rhemann, Jonathan Taylor, Paul Debevec, and Shahram Izadi.
  • Instant Panoramic Texture Mapping With Semantic Object Matching for Large-Scale Urban Scene Reproduction. IEEE Transactions on Visualization and Computer Graphics. Jinwoo Park, Ik-Beom Jeon, Sung-Eui Yoon, and Woontack Woo.
  • LookinGood: Enhancing Performance Capture With Real-Time Neural Re-Rendering. ACM Transactions on Graphics. Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, Adarsh Kowdle, Christoph Rhemann, Dan B Goldman, Cem Keskin, Steve Seitz, Shahram Izadi, and Sean Fanello.
  • A Review of Video Surveillance Systems. Journal of Visual Communication and Image Representation. Omar Elharrouss, Noor Almaadeed, and Somaya Al-Maadeed.
  • An Inexpensive Upgradation of Legacy Cameras Using Software and Hardware Architecture for Monitoring and Tracking of Live Threats. IEEE Access. Ume Habiba, Muhammad Awais, Milhan Khan, and Abdul Jaleel.
  • Spatiotemporal Retrieval of Dynamic Video Object Trajectories in Geographical Scenes. Transactions in GIS. Yujia Xie, Meizhen Wang, Xuejun Liu, Ziran Wang, Bo Mao, Feiyue Wang, and Xiaozhi Wang.
  • A Multi-Resolution Approach for Color Correction of Textured Meshes. Mohammad Rouhani, Matthieu Fradet, and Caroline Baillard.
  • IBRNet: Learning Multi-View Image-Based Rendering. CVPR 2021. Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P Srinivasan, Howard Zhou, Jonathan T Barron, Ricardo Martin-Brualla, Noah Snavely, and Thomas Funkhouser.
  • Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans. CVPR 2021. Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, and Xiaowei Zhou.
  • Video-Geographic Scene Fusion Expression Based on Eye Movement Data. 2021 IEEE 7th International Conference on Virtual Reality (ICVR). Xiaozhi Wang, Yujia Xie, and Xing Wang.
  • Multi-Camera Light Field Capture: Synchronization, Calibration, Depth Uncertainty, and System Design. Elijs Dima.
  • MonoMR: Synthesizing Pseudo-2.5D Mixed Reality Content From Monocular Videos. Applied Sciences. Dong-Hyun Hwang and Hideki Koike.
  • Feature Based Object Tracking: A Probabilistic Approach. Florida Institute of Technology. Kaleb Smith.
  • Reconstruction and Detection of Occluded Portions of 3D Human Body Model Using Depth Data From Single Viewpoint. U.S. Patent 10,818,078. Jie Ni and Mohammad Gharavi-Alkhansari.
  • Heterogeneous Data Fusion. U.S. Patent 11,068,756. James Browning.
  • Video Display Method and Device. CN110996087B. Feihu Luo.
  • Image Processing Module, Image Processing Method, Camera Assembly and Mobile Terminal. CN112291479A. Jingyang Chang.
  • Video Content Representation to Support the Hyper-Reality Experience in Virtual Reality. 2021 IEEE Virtual Reality and 3D User Interfaces (VR). Hyerim Park and Woontack Woo.
  • Multi-View Neural Human Rendering. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Minye Wu, Yuehao Wang, Qiang Hu, and Jingyi Yu.
  • Spatiotemporal Texture Reconstruction for Dynamic Objects Using a Single RGB-D Camera. Computer Graphics Forum. Hyomin Kim, Jungeon Kim, Hyeonseo Nam, Jaesik Park, and Seungyong Lee.
  • RealityCheck: Blending Virtual Environments With Situated Physical Reality. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Jeremy Hartmann, Christian Holz, Eyal Ofek, and Andrew Wilson.
  • High-Precision 5DoF Tracking and Visualization of Catheter Placement in EVD of the Brain Using AR. ACM Transactions on Computing for Healthcare. Xuetong Sun, Sarah B. Murthi, Gary Schwartzbauer, and Amitabh Varshney.
  • Volumetric Capture of Humans With a Single RGBD Camera Via Semi-Parametric Learning. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Rohit Pandey, Cem Keskin, Shahram Izadi, Sean Fanello, Anastasia Tkach, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Ricardo Martin-Brualla, Andrea Tagliasacchi, George Papandreou, and Philip Davidson.
  • SIGNET: Efficient Neural Representation for Light Fields. ICCV 2021. Brandon Feng and Amitabh Varshney.
  • Pri3D: Can 3D Priors Help 2D Representation Learning? https://arxiv.org/abs/2104.11225.pdf. Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, and M. Nießner.
  • Multi-camera Video Synopsis of a Geographic Scene Based on Optimal Virtual Viewpoint. Yujia Xie, Meizhen Wang, Xuejun Liu, Xing Wang, Yiguang Wu, Feiyue Wang, and Xiaozhi Wang.
  • Dance in the Wild: Monocular Human Animation With Neural Dynamic Appearance Synthesis. https://arxiv.org/pdf/2111.05916.pdf. Tuanfeng Y. Wang, Duygu Ceylan, Krishna Kumar Singh, and Niloy J. Mitra.
  • GeoNeRF: Generalizing NeRF With Geometry Priors. https://arxiv.org/pdf/2111.13539.pdf. Mohammad Mahdi Johari, Yann Lepoittevin, and François Fleuret.
  • Light Field Neural Rendering. https://arxiv.org/pdf/2112.09687.pdf. Mohammed Suhail, Carlos Esteves, Leonid Sigal, and Ameesh Makadia.
  • Human View Synthesis Using a Single Sparse RGB-D Input. Phong Nguyen, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkila, and Tony Tung.