1. Introduction & Overview
Mixed Reality (MR) scene relighting is a transformative capability that enables virtual lighting changes to interact realistically with physical objects, producing authentic illumination and shadows. This technology has significant potential in applications like real estate visualization, immersive storytelling, and virtual object integration. However, achieving this in real-time on resource-constrained edge devices (like MR headsets) presents a major challenge.
Existing approaches fall short: 2D image filters lack geometric understanding; sophisticated 3D reconstruction-based methods are hampered by the low-fidelity meshes generated by on-device sensors (e.g., LiDAR); and state-of-the-art deep learning models are computationally prohibitive for real-time use. Hybrelighter proposes a novel hybrid solution that bridges this gap.
Core Proposition
Hybrelighter integrates image segmentation, lighting propagation via anisotropic diffusion, and basic scene understanding to correct scanning inaccuracies and deliver visually appealing, accurate relighting effects at speeds up to 100 fps on edge devices.
2. Methodology & Technical Approach
The Hybrelighter pipeline is designed for efficiency and robustness on mobile hardware.
2.1. Scene Understanding & Segmentation
The first step involves parsing the camera feed to identify distinct surfaces and objects. A lightweight neural network or traditional CV algorithm segments the image into regions (e.g., walls, floor, furniture). This segmentation provides a semantic mask that guides subsequent lighting operations, allowing for localized effects (e.g., a virtual spotlight only affecting a table).
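The paper's segmentation component is only summarized here, so the following is a minimal sketch of how a per-pixel label map might gate a localized lighting effect; the class ids, the choice of model, and the gain value are assumptions of this illustration, not details from the paper.

```python
import numpy as np

# Hypothetical class ids for illustration; the classes and segmentation model
# used by Hybrelighter are not specified in this overview.
WALL, FLOOR, FURNITURE = 0, 1, 2

def apply_local_light(image: np.ndarray, labels: np.ndarray,
                      target_classes=(FURNITURE,), gain: float = 1.5) -> np.ndarray:
    """Brighten only the pixels whose semantic label is in target_classes,
    e.g. a virtual spotlight that affects a table but not the walls."""
    mask = np.isin(labels, target_classes).astype(np.float32)[..., None]  # H x W x 1
    lit = image * (1.0 + (gain - 1.0) * mask)   # scale intensity inside the mask only
    return np.clip(lit, 0.0, 1.0)
```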
2.2. Lighting Propagation via Anisotropic Diffusion
This is the core innovation. Instead of performing physically-based rendering on a potentially faulty 3D mesh, Hybrelighter models light spread as a diffusion process on a 2D manifold defined by the scene's geometry and normals. The anisotropic diffusion equation is used:
$\frac{\partial L}{\partial t} = \nabla \cdot (D \nabla L)$
where $L$ is the light intensity, $t$ is time, and $D$ is a diffusion tensor that controls the direction and rate of light spread. Crucially, $D$ is constructed using surface normal information (even if approximate from the basic scene mesh or estimated from the image). This allows light to flow along surfaces but not across depth discontinuities, naturally creating effects like attached shadows and soft illumination gradients without needing perfect geometry.
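As a minimal NumPy sketch of one explicit update step of the equation above (the paper's actual GPU discretization, step size, and iteration count are not given here, so those are assumptions of this example):

```python
import numpy as np

def diffuse_step(L: np.ndarray, D: np.ndarray, dt: float = 0.1) -> np.ndarray:
    """One explicit Euler step of dL/dt = div(D grad L).

    L : (H, W) light-intensity map
    D : (H, W, 2, 2) per-pixel diffusion tensor
    dt: step size; must stay small for the explicit scheme to remain stable
    """
    gy, gx = np.gradient(L)                        # grad L (y along rows, x along cols)
    grad = np.stack([gx, gy], axis=-1)             # (H, W, 2)
    flux = np.einsum('hwij,hwj->hwi', D, grad)     # D * grad L
    div = np.gradient(flux[..., 0], axis=1) + np.gradient(flux[..., 1], axis=0)
    return L + dt * div
```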
2.3. Integration with On-device Reconstruction
The system uses the coarse 3D mesh from the device's scene reconstruction (e.g., from ARKit or ARCore) not for direct rendering, but as a guidance layer. The mesh provides approximate depth and surface normal data to inform the anisotropic diffusion tensor $D$. Errors in the mesh (holes, jagged edges) are mitigated because the diffusion process is inherently smoothing and operates primarily on the more reliable 2D segmentation.
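One simple way to turn the coarse reconstruction into per-pixel guidance is to render the mesh into a depth map and differentiate it. The sketch below illustrates that idea only; it ignores camera intrinsics (which would scale the gradients) and any hole-filling the real system may perform.

```python
import numpy as np

def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Approximate per-pixel surface normals from a coarse depth map
    rendered out of the device mesh (intrinsics omitted for brevity)."""
    dzdy, dzdx = np.gradient(depth)                           # depth gradients
    n = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8     # unit length
    return n                                                  # (H, W, 3)
```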
3. Technical Details & Mathematical Formulation
The anisotropic diffusion process is discretized for efficient on-device GPU computation. The key is defining the diffusion tensor $D$ at each pixel $(i,j)$:
$D_{i,j} = g(\|\nabla I_{i,j}\|) \cdot n_{i,j} n_{i,j}^T + \epsilon I$
where:
- $\nabla I_{i,j}$ is the image intensity gradient (edge strength).
- $g(\cdot)$ is a decreasing function (e.g., $g(x) = \exp(-x^2 / \kappa^2)$), causing diffusion to slow across strong edges (object boundaries).
- $n_{i,j}$ is the estimated surface normal vector (from the coarse mesh or photometric stereo).
- $\epsilon$ is a small constant for numerical stability, and $I$ is the identity matrix.
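A direct NumPy rendering of the tensor above might look as follows. Note that the formula leaves the dimensionality of $n_{i,j}$ implicit; projecting the normal onto the image plane to keep $D$ a 2×2 tensor per pixel is an assumption of this sketch, as are the values of $\kappa$ and $\epsilon$.

```python
import numpy as np

def g(x: np.ndarray, kappa: float = 0.05) -> np.ndarray:
    """Edge-stopping function: diffusion slows where image gradients are strong."""
    return np.exp(-(x / kappa) ** 2)

def build_diffusion_tensor(image: np.ndarray, normals: np.ndarray,
                           kappa: float = 0.05, eps: float = 1e-3) -> np.ndarray:
    """Per-pixel D = g(|grad I|) * n n^T + eps * Id.

    image   : (H, W) grayscale intensity
    normals : (H, W, 3) unit surface normals (coarse mesh or estimated)
    returns : (H, W, 2, 2) diffusion tensor
    """
    gy, gx = np.gradient(image)
    edge = g(np.hypot(gx, gy), kappa)                  # weak diffusion across strong edges
    n2 = normals[..., :2]                              # in-plane component of n (assumption)
    outer = np.einsum('hwi,hwj->hwij', n2, n2)         # n n^T per pixel
    ident = np.eye(2)[None, None, :, :]
    return edge[..., None, None] * outer + eps * ident # eps * Id keeps D positive definite
```

Iterating the `diffuse_step` sketch from Section 2.2 with this tensor spreads a light source along surfaces while stalling at strong image edges.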
4. Experimental Results & Performance
The paper demonstrates Hybrelighter's efficacy through qualitative and quantitative results.
Performance Benchmark
- Frame Rate: >100 FPS on iPhone 16 Pro / Meta Quest 3
- Comparison Baseline: Industry-standard, mesh-based deferred shading.
- Key Metric: Visual fidelity vs. computational load.
Visual Results (Referencing Fig. 1 & 3):
- Fig. 1: Shows a room relit under various conditions (daylight, evening, spotlight). The anisotropic diffusion (row 1) effectively creates soft shadows and illumination gradients that are composited into the MR view (row 2). The results are free of the hard, aliased shadows typical of low-polygon mesh rendering.
- Fig. 3: Highlights the problem: the raw LiDAR mesh from a mobile device is noisy and incomplete. Hybrelighter's method is robust to these imperfections, as the diffusion process does not rely on watertight geometry.
The method shows superior visual quality compared to simple 2D filters and quality comparable to or better than mesh-based methods, while being orders of magnitude faster than neural relighting approaches like those inspired by NeRF or DeepLight.
5. Analysis Framework & Case Study
Case: Real Estate Virtual Staging
Scenario: A user wearing an MR headset views an empty apartment. They want to see how it would look with virtual furniture and under different lighting conditions (morning sun vs. warm evening lights).
Hybrelighter Workflow:
- Scan & Segment: The headset scans the room, creating a coarse mesh and segmenting surfaces (walls, windows, floor).
- Place Virtual Light: User places a virtual floor lamp in the corner.
- Light Propagation: The system treats the lamp's position as a heat source in the anisotropic diffusion equation. Light spreads across the floor and up the adjacent wall, respecting the segmented geometry (slows at the wall-floor boundary). The coarse mesh normals guide the falloff.
- Real-time Compositing: The computed illumination map is blended with the passthrough video, darkening areas occluded from the virtual lamp (using approximate depth). The result is a convincing, real-time relit scene without complex 3D rendering.
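Tying the workflow above together, the following sketch reuses the `diffuse_step` and `build_diffusion_tensor` helpers sketched in Sections 2.2 and 3; the lamp position handling, iteration count, and blending weights are illustrative choices, not the paper's.

```python
import numpy as np

def relight_frame(passthrough: np.ndarray,    # (H, W, 3) camera frame in [0, 1]
                  intensity: np.ndarray,       # (H, W) grayscale of the frame
                  normals: np.ndarray,         # (H, W, 3) from the coarse mesh
                  lamp_xy: tuple[int, int],    # pixel position of the virtual lamp
                  iters: int = 50) -> np.ndarray:
    """Inject the virtual lamp as a source term, diffuse, then composite."""
    D = build_diffusion_tensor(intensity, normals)
    light = np.zeros_like(intensity)
    for _ in range(iters):
        light[lamp_xy[1], lamp_xy[0]] = 1.0    # keep the source pinned each step
        light = diffuse_step(light, D)
    light /= light.max() + 1e-8                # normalized illumination map
    shading = 0.4 + 0.6 * light[..., None]     # dim unlit areas, keep lit areas bright
    return np.clip(passthrough * shading, 0.0, 1.0)
```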
6. Industry Analyst's Perspective
Core Insight: Hybrelighter isn't just another relighting paper; it's a pragmatic engineering hack that correctly identifies the mobile MR hardware's weakest link—poor geometry reconstruction—and cleverly routes around it. Instead of trying to win the losing battle for perfect on-device meshes (à la Microsoft's DirectX Raytracing ambition on desktop), it leverages the human visual system's tolerance for perceptual plausibility over physical accuracy. This is reminiscent of the success of CycleGAN's approach to image-to-image translation without paired data—finding a clever, constrained objective that yields "good enough" results efficiently.
Logical Flow: The logic is impeccable: 1) Mobile meshes are bad. 2) Physics-based rendering needs good meshes. 3) Therefore, do not do physics-based rendering. 4) Instead, use a fast, image-based diffusion process that simulates light behavior using the bad mesh only as a gentle guide. The shift from a generative problem (create a perfect lit image) to a filtering problem (diffuse a light source) is the key intellectual leap.
Strengths & Flaws: Its strength is its breathtaking efficiency and hardware compatibility, achieving 100 fps where neural methods struggle to reach 30 fps. However, its flaw is a fundamental ceiling on realism. It cannot simulate complex optical phenomena like caustics, specular inter-reflections, or accurate transparency—the hallmarks of true high-fidelity rendering as seen in academic benchmarks like Bitterli's rendering resources. It's a solution for the first generation of consumer MR hardware, not a path to cinematic-quality rendering.
Actionable Insights: For product managers in AR/VR at Meta, Apple, or Snap, this paper is a blueprint for a shippable feature now. The takeaway is to prioritize "good enough" real-time relighting as a user engagement tool over pursuing cinematic-quality rendering that burns battery life. The research direction it signals is clear: hybrid neuro-symbolic approaches, where lightweight networks (like MobileNet for segmentation) guide classical, efficient algorithms (like diffusion). The next step is to make the diffusion parameters (like the $\kappa$ in $g(x)$) learnable from data, adapting to different scene types without manual tuning.
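As a speculative sketch of that last direction (not something the paper implements), the hand-tuned edge-stopping function could be wrapped in a small module whose $\kappa$ is optimized end-to-end:

```python
import torch
import torch.nn as nn

class LearnableEdgeStop(nn.Module):
    """g(x) = exp(-(x / kappa)^2) with kappa learned from data rather than hand-tuned.
    A speculative sketch of the suggested research direction, not part of the paper."""
    def __init__(self, kappa_init: float = 0.05):
        super().__init__()
        # Parameterize in log space so kappa stays positive during optimization.
        self.log_kappa = nn.Parameter(torch.tensor(float(kappa_init)).log())

    def forward(self, grad_mag: torch.Tensor) -> torch.Tensor:
        kappa = self.log_kappa.exp()
        return torch.exp(-(grad_mag / kappa) ** 2)
```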
7. Future Applications & Research Directions
Immediate Applications:
- Virtual Home Staging & Interior Design: As demonstrated, allowing real-time visualization of lighting fixtures and paint colors.
- AR Gaming & Entertainment: Dynamically changing the mood and atmosphere of a physical room to match game narrative.
- Remote Collaboration & Telepresence: Consistent relighting of a user's environment to match a virtual meeting space, enhancing immersion.
- Accessibility: Simulating optimal lighting conditions for low-vision users in real-time.
Research & Development Directions:
- Learning-based Diffusion Guidance: Replacing hand-crafted functions $g(\cdot)$ with a tiny neural network trained on a dataset of light propagation, enabling adaptation to complex materials.
- Integration with Neural Radiance Fields (NeRFs): Using a compact, pre-baked NeRF of a static scene to provide near-perfect geometry and normal guidance for the diffusion process, bridging the gap between quality and speed.
- Holographic Display Compatibility: Extending the 2D diffusion model to 3D light fields for next-generation glasses-free displays.
- Energy-Aware Optimization: Dynamically scaling the diffusion resolution and iterations based on device thermal and power state.
8. References
- Zhao, H., Akers, J., Elmieh, B., & Kemelmacher-Shlizerman, I. (2025). Hybrelighter: Combining Deep Anisotropic Diffusion and Scene Reconstruction for On-device Real-time Relighting in Mixed Reality. arXiv preprint arXiv:2508.14930.
- Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV.
- Zhu, J., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV.
- Apple Inc. (2024). ARKit Documentation: Scene Reconstruction. Retrieved from developer.apple.com.
- Bitterli, B. (2016). Rendering Resources. Retrieved from https://benedikt-bitterli.me/resources/.
- Microsoft Research. (2018). DirectX Raytracing. Retrieved from https://www.microsoft.com/en-us/research/project/directx-raytracing/.