EndoSparse: Real-Time Sparse View Synthesis of
Endoscopic Scenes using Gaussian Splatting

MICCAI 2024


Chenxin Li1, Brandon Y. Feng2†, Yifan Liu1, Hengyu Liu1, Cheng Wang1, Weihao Yu1, Yixuan Yuan1†

1The Chinese University of Hong Kong    2Massachusetts Institute of Technology    †Corresponding Author

Abstract


3D reconstruction of biological tissues from a collection of endoscopic images is key to unlocking various important downstream surgical applications that require 3D capabilities.

Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually the case in real-world clinical scenarios.

To tackle this sparsity challenge, we propose a framework that leverages prior knowledge from multiple foundation models during the reconstruction process, dubbed EndoSparse. Experimental results indicate that our proposed strategy significantly improves geometric and appearance quality under challenging sparse-view conditions, including using only three views. In rigorous benchmarking against state-of-the-art methods, EndoSparse achieves superior results in terms of accurate geometry, realistic appearance, and rendering efficiency, confirming its robustness to sparse-view limitations in endoscopic reconstruction.

EndoSparse signifies a steady step towards the practical deployment of neural 3D reconstruction in real-world clinical scenarios.


Highlight


  • We present state-of-the-art results on surgical scene reconstruction from a sparse set of endoscopic views, significantly enhancing the practical usability of neural reconstruction methods.
  • We demonstrate an effective strategy to instill prior knowledge from a pre-trained 2D generative model that regularizes and improves visual reconstruction quality under sparse observations.
  • We introduce an effective strategy to distill geometric prior knowledge from a visual foundation model that drastically improves geometric reconstruction quality under sparse observations (a minimal sketch follows this list).
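As an illustrative sketch (not the released implementation), one common way to distill such a geometric prior is an affine-invariant depth loss between the depth rendered by the splatting rasterizer and the depth predicted by an off-the-shelf monocular model such as Depth-Anything. The function below is hypothetical and only indicates the idea; monocular depth is defined up to an unknown scale and shift, so it must be aligned before it can supervise the rendered geometry.

import torch
import torch.nn.functional as F


def affine_invariant_depth_loss(rendered_depth, prior_depth, mask=None):
    """Distill a monocular depth prior into the rendered geometry.

    The prior is aligned to the rendered depth with a per-image least-squares
    scale/shift fit (computed on detached tensors), and the remaining residual
    is penalized so gradients flow only through the rendered depth.
    """
    d = rendered_depth.flatten()
    p = prior_depth.flatten()
    if mask is not None:
        valid = mask.flatten().bool()
        d, p = d[valid], p[valid]
    # Solve for scale s and shift t minimizing ||s * p + t - d||^2.
    A = torch.stack([p, torch.ones_like(p)], dim=1)                    # (N, 2)
    sol = torch.linalg.lstsq(A.detach(), d.detach().unsqueeze(1)).solution
    aligned_prior = (A @ sol).squeeze(1)                               # s * p + t
    return F.l1_loss(aligned_prior, d)

In training, a term of this form would typically be added to the standard photometric loss with a small weight.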


Network




Overview of EndoSparse: Within a 3D-GS scene reconstruction framework, we use vision foundation models as effective regularizers of the 3D scene, incorporating geometric prior knowledge from Depth-Anything and image appearance priors from Stable Diffusion. These priors provide valuable guidance signals for optimization at viewpoints without camera coverage.
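The optimization loop can be pictured schematically as below. This is a minimal sketch under stated assumptions, not the paper's code: render, depth_prior_loss, and diffusion_appearance_loss are placeholder callables standing in for the 3D-GS rasterizer, the Depth-Anything-based geometric regularizer, and a score-distillation-style appearance term from Stable Diffusion, respectively, and the loss weights are illustrative.

import torch.nn.functional as F


def training_step(gaussians, train_cam, gt_image, prior_depth, novel_cam,
                  render, depth_prior_loss, diffusion_appearance_loss,
                  lambda_geo=0.1, lambda_app=0.01):
    """One optimization step combining three terms:
    (i)   photometric supervision at an observed training view,
    (ii)  a geometric prior on the rendered depth, and
    (iii) an appearance prior from a pretrained 2D diffusion model at a
          viewpoint without camera coverage.
    All callables are hypothetical placeholders for the corresponding modules.
    """
    # (i) Photometric loss at an observed view.
    out = render(gaussians, train_cam)            # e.g. {'rgb': ..., 'depth': ...}
    loss_rgb = F.l1_loss(out["rgb"], gt_image)

    # (ii) Geometric regularization against the foundation-model depth prior.
    loss_geo = depth_prior_loss(out["depth"], prior_depth)

    # (iii) Appearance regularization at an unobserved viewpoint: push the
    # rendered image toward the natural-image manifold of the diffusion prior.
    novel_rgb = render(gaussians, novel_cam)["rgb"]
    loss_app = diffusion_appearance_loss(novel_rgb)

    return loss_rgb + lambda_geo * loss_geo + lambda_app * loss_app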

Experimental Results


Quantitative Results

Qualitative Results


Citation


@article{li2024endosparse,
  author    = {Chenxin Li and Brandon Y. Feng and Yifan Liu and Hengyu Liu and Cheng Wang and Weihao Yu and Yixuan Yuan},
  title     = {EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting},
  journal   = {arXiv preprint},
  year      = {2024}
}
              

Relevant Works


EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction

An initial exploration into real-time surgical scene reconstruction built on 3D Gaussian Splatting.

[Page] | [Paper] | [Code]


LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction

An initial investigation into memory-efficient surgical scene reconstruction that achieves over 9X compression while preserving high visual quality and efficiency.

[Page] | [Paper] | [Code]


Endora: Video Generation Models as Endoscopy Simulators

A pioneering exploration into high-fidelity medical video generation on endoscopy scenes.

[Page] | [Paper] | [Code]


U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation

An innovative enhancement of U-Net for medical image tasks using Kolmogorov-Arnold Network (KAN).

[Page] | [Paper] | [Code]