Publications
“Don't worry about what anybody else is going to do. The best way to predict the future is to invent it.”
As a Researcher at Google, I am devoted to inventing technologies in interactive perception and graphics that fuse information from the physical and virtual worlds and make it interactive, accessible, and useful in VR, AR, and MR. I have published over 35 peer-reviewed papers in top venues in HCI, Computer Graphics, and Computer Vision, including CHI, SIGGRAPH Asia, UIST, TVCG, CVPR, ICCV, ECCV, ISMAR, VR, I3D, and Web3D. Please feel free to search by keyword, author, journal, or conference below, or visit my Google Scholar for more details.
Peer-reviewed Publications [bibTeX]
Rapsai: Accelerating Machine Learning Prototyping of Multimedia Applications Through Visual Programming (Honorable Mentions Award, 170K+ views)
Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception
Visual Captions: Augmenting Verbal Communication With On-the-fly Visuals (Open Source, Real-time, Live!)
Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, augmented reality, XR interaction
DepthLab: Real-time 3D Interaction With Depth Maps for Mobile Augmented Reality (50K downloads)
Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; XR interaction; digital world; interactive graphics
Geollery: A Mixed Reality Social Media Platform (Live Demo of a Metaverse of Mirrored World)
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: metaverse, virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, digital twins, mirrored world, digital world, augmented communication
Augmented Object Intelligence With XR-Objects
Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), 2024.
Keywords: mixed reality; extended reality; augmented reality; augmented objects; spatial computing; user interfaces; context menus
Experiencing Thing2Reality: Transforming 2D Content Into Conditioned Multiviews and 3D Gaussian Objects for XR Communication
Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), 2024.
Keywords: extended reality, augmented communication, image-to-3D, remote collaboration, spatial referencing, co-presence
Human I/O: Towards a Unified Approach to Detecting Situational Impairments (Best Paper Honourable Mentions Award)
Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: situational impairments, augmented reality, large language models, multimodal sensing, context awareness, XR interaction, interactive perception
ChatDirector: Enhancing Video Conferencing With Space-Aware Scene Rendering and Speech-Driven Layout Transition (500K+ Media Coverage)
Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: augmented communication, video conferencing, 3D portrait avatar, co-presence, attention transition, depth estimation, video-mediated communication
UI Mobility Control in XR: Switching UI Positionings Between Static, Dynamic, and Self Entities
Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Extended Reality, User Interface, UI Mobility, UI Positioning, XR interaction, interactive graphics
FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces (Best Student Paper Award)
Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 2024.
Keywords: Volumetric Rendering, Face Modeling, View Synthesis, Performance Capture, digital human
Experiencing InstructPipe: Building Multi-modal AI Pipelines Via Prompting LLMs and Visual Programming
Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Node-graph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics
Montage4D: Real-time Seamless Fusion and Stylization of Multiview Video Textures (Microsoft TechFest 2018)
Journal of Computer Graphics Techniques (JCGT), 2019.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields, digital human
Social Street View: Blending Immersive Street Views With Geo-Tagged Social Media (Best Paper Award)
Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: metaverse; spatial-temporal virtual reality; social media; street view; geographical information systems; mixed reality; WebGL; digital twins; digital world
Fusing Multimedia Data Into Dynamic Virtual Environments (Ph.D. Dissertation)
Ph.D. Dissertation, University of Maryland, College Park, 2018.
Keywords: social street view, geollery, spherical harmonics, 360 video, multiview video, montage4d, haptics, cryptography, metaverse, mirrored world
InstructPipe: Building Visual Programming Pipelines With Human Instructions
https://arxiv.org/abs/2312.09672, 2023.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Node-graph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics; Interactive Perception
ThingShare: Ad-Hoc Digital Copies of Physical Objects for Sharing Things in Video Meetings
Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: video-mediated communication, object-centered meetings, online meeting, collaborative work, augmented communication, XR interaction
Modeling and Improving Text Stability in Live Captions (Landed in Live Transcribe App)
Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: live captions; real-time transcription; visual instability; flickering metric; speech-to-text; text stability; tokenized alignment; augmented communication
Learning Personalized High Quality Volumetric Head Avatars From Monocular RGB Videos
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Keywords: implicit 3D avatar, monocular RGB video, facial expressions, head poses, neural radiance field, photorealism, digital human
Experiencing Visual Blocks for ML: Visual Prototyping of AI Pipelines
Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: visual programming, large language models, visual prototyping, multi-modal models, node-graph editor, deep neural networks, data augmentation, deep learning, visual analytics
Experiencing Visual Captions: Augmented Communication With Real-time Visuals Using Large Language Models
Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, dataset, text-to-visual, AI agent, augmented reality
RetroSphere: Self-Contained Passive 3D Controller Tracking for Augmented Reality (IMWUT Vol. 6 Distinguished Paper Award)
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2022.
Keywords: Retroreflectors, Augmented reality, Virtual reality, Infrared marker tracking, Augmented reality glasses, XR interaction
Sandwiched Image Compression: Increasing the Resolution and Dynamic Range of Standard Codecs (Best Paper Finalist)
2022 Picture Coding Symposium (PCS), 2022.
Keywords: deep learning, image compression, nonlinear transform coding, high dynamic range, super-resolution, interactive perception
“Slurp” Revisited: Using Software Reconstruction to Reflect on Spatial Interactivity and Locative Media
Proceedings of the Designing Interactive Systems Conference (DIS), 2022.
Keywords: system re-presencing, affordances, metaphor, software reconstruction, historical precedents, gestural interface, augmented reality, spatial interaction, XR interaction
ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard of Hearing Users
Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: accessibility, deaf, Deaf, hard of hearing, sound awareness
Opportunistic Interfaces for Augmented Reality: Transforming Everyday Objects Into Tangible 6DoF Interfaces Using Ad Hoc UI
Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: augmented reality, everyday objects, tangible user interface, 3D user interface, 6 DoF, spatial interaction, markerless tracking, tangible interaction, hand gestures, XR interaction
OmniSyn: Intermediate View Synthesis Between Wide-baseline Panoramas
2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2022.
Keywords: 360 image, virtual reality, view synthesis, panorama, neural rendering, depth map, mesh rendering, inpainting, digital world
GazeChat: Enhancing Virtual Conferences With Gaze-aware 3D Photos
Proceedings of the 34th Annual ACM Symposium on User Interface Software and Technology (UIST), 2021.
Keywords: eye contact, gaze awareness, video conferencing, video-mediated communication, gaze interaction, augmented communication, augmented conversation, eye tracking, XR interaction
Multiresolution Deep Implicit Functions for 3D Shape Representation
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
Keywords: deep implicit functions, neural representation, compression, levels of detail, MDIF, interactive perception
HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Keywords: correspondences, geodesic distance, embeddings, neural networks, digital human, interactive perception
A Log-Rectilinear Transformation for Foveated 360-degree Video Streaming (TVCG Honorable Mentions)
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021.
Keywords: 360° video, foveation, virtual reality, live video streaming, log-rectilinear, summed-area table, eye tracking, digital world
Saliency Computation for Virtual Cinematography in 360° Videos
IEEE Computer Graphics and Applications (CGA), 2021.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA, eye tracking, interactive graphics
CollaboVR: A Reconfigurable Framework for Multi-user to Communicate in Virtual Reality
2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2020.
Keywords: chalktalk, virtual reality, collaborative work, layout, telepresence, communication, XR interaction, augmented communication
3D-Kernel Foveated Rendering for Light Fields
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2020.
Keywords: light field, foveated rendering, microscopic light field, eye tracking, visualization, interactive graphics
Experiencing Real-time 3D Interaction With Depth Maps for Mobile Augmented Reality in DepthLab
Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; interactive graphics; XR interaction
MeteoVis: Visualizing Meteorological Events in Virtual Reality
Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA), 2020.
Keywords: scientific visualization, virtual reality, meteorological data, immersion, interactive visualization, vector field, XR interaction
Eye-Dominance-guided Foveated Rendering
IEEE Transactions on Visualization and Computer Graphics (TVCG, Special Issue of IEEE Conference on Virtual Reality and 3D User Interfaces), 2020.
Keywords: virtual reality, foveated rendering, perception, gaze-contingent rendering, ocular dominance, eye tracking, interactive graphics
Language-based Colorization of Scene Sketches
ACM Transactions on Graphics (SIGGRAPH Asia), 2019.
Keywords: deep neural networks; image segmentation; language-based editing; scene sketch; sketch colorization; interactive graphics; interactive perception; augmented communication
ORC Layout: Adaptive GUI Layout With OR-Constraints
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: GUI builder, layout manager, constraint-based layout, visual interface design, visual programming, interactive graphics
Kernel Foveated Rendering (Most read in PACMCGIT)
Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 2018.
Keywords: foveated rendering, perception, log-polar mapping, eye tracking, virtual reality, head-mounted displays, interactive graphics
Project Geollery.com: Reconstructing a Live Mirrored World With Geotagged Social Media
Proceedings of the 24th International Conference on Web3D Technology (Web3D), 2019.
Keywords: virtual reality, mixed reality, 360° image, GIS, 3D reconstruction, projection mapping, mirrored world, social media, WebGL, metaverse, interactive graphics, digital world
Montage4D: Interactive Seamless Fusion of Multiview Video Textures
Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields, digital human
Evaluating Haptic and Auditory Directional Guidance to Assist Blind People in Reading Printed Text Using Finger-Mounted Cameras
ACM Transactions on Accessible Computing (TACCESS), 2016.
Keywords: accessibility, real-time OCR, visual impairments, wearables, XR interaction
Video Fields: Fusing Multiple Surveillance Videos Into a Dynamic Virtual Environment
Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: virtual reality; mixed-reality; video-based rendering; projection mapping; surveillance video; WebGL; WebVR; interactive graphics
Experiencing a Mirrored World With Geotagged Social Media in Geollery
Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA), 2019.
Keywords: virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, metaverse, mirrored world
Interactive Fusion of 360° Images for a Mirrored World
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019.
Keywords: virtual reality, 360° image, 3D reconstruction, mixed reality, projection mapping, mirrored world, metaverse
VRSurus: Enhancing Interactivity and Tangibility of Puppets in Virtual Reality (Demoed at UIST 2015)
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2016.
Keywords: Virtual Reality; Tangible User Interface; Haptics; Gesture Recognition; Head-Mounted Display; XR interaction
AtmoSPHERE: Representing Space and Movement Using Sand Traces in an Interactive Zen Garden
Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2015.
Keywords: Visualization; Tangible Interactive Art; Machine Aesthetics; Calm Technology; XY Servo Table; Kinect; XR interaction
The Design and Preliminary Evaluation of a Finger-Mounted Camera and Feedback System to Enable Reading of Printed Text for the Blind
Computer Vision - ECCV 2014 Workshops (ECCVW), 2014.
Keywords: Accessibility, Wearables, Real-time OCR, Text Reading for Blind
Supporting Everyday Activities for Persons With Visual Impairments Through Computer Vision
Proceedings of the 17th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2015.
Keywords: Blind; visually impaired; wearable computing; computer vision; vision-augmented touch
Online Vigilance Analysis Combining Video and Electrooculography Features
Neural Information Processing - 19th International Conference (ICONIP), 2012.
Keywords: Vigilance Analysis, Fatigue Detection, Active Shape Model, Electrooculography, Support Vector Machine, eye tracking
A Pilot Study of Spherical Harmonics for Saliency Computation and Navigation in 360° Videos
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA
Technical Reports
Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai
Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception
C-Flow: Visualizing Foot Traffic and Profit Data to Make Informative Decisions
University of Maryland, College Park. Department of Computer Science, 2012.
Keywords: Information Visualization; Data Mapping; Indoor Visualization; Business; Usability Testing