DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays
In Submission

Abstract


Computed Tomography (CT) scans are the standard of care for visualizing and diagnosing many clinical ailments, and are needed for treatment planning in external beam radiotherapy. Unfortunately, the availability of CT scanners in low- and mid-resource settings is highly variable. Planar x-ray radiography units, in comparison, are far more prevalent, but can only provide limited 2D observations of the 3D anatomy. In this work we propose DIFR3CT, a 3D latent diffusion model that can generate a distribution of plausible CT volumes from one or a few (<10) planar x-ray observations. DIFR3CT works by fusing 2D features from each x-ray into a joint 3D space, and performing diffusion conditioned on these fused features in a low-dimensional latent space. We conduct extensive experiments demonstrating that DIFR3CT outperforms recent sparse CT reconstruction baselines in terms of standard pixel-level metrics (PSNR, SSIM) on both the public LIDC dataset and an in-house post-mastectomy CT dataset. We also show that DIFR3CT supports uncertainty quantification via Monte Carlo sampling, which provides an opportunity to measure reconstruction reliability. Finally, we perform a preliminary pilot study evaluating DIFR3CT for automated breast radiotherapy contouring and planning, and demonstrate promising feasibility.

DIFR3CT


DIFR3CT consists of two parts:

(a) Feature fusion of multi-view x-rays: We extract a feature image Wk from each input planar x-ray Xk with a 2D U-Net. We then re-project each Wk back into 3D space using the known x-ray acquisition geometry, and average all re-projected feature volumes into a single feature volume Favg.
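As an illustrative sketch of step (a) in PyTorch: the helper names here (fuse_xray_features, unet2d, projection_grids) are hypothetical, and the per-view sampling grids are assumed to be precomputed from the known acquisition geometry rather than taken from the actual implementation.

import torch
import torch.nn.functional as F

def fuse_xray_features(xrays, unet2d, projection_grids):
    """Fuse K planar x-rays into a single 3D feature volume F_avg (sketch).

    xrays:            (K, 1, H, W) input x-ray images
    unet2d:           2D U-Net mapping each x-ray to a feature image W_k
    projection_grids: (K, D, H, W, 2) normalized (u, v) detector coordinates
                      that each voxel projects to in view k, precomputed from
                      the known acquisition geometry (hypothetical)
    """
    D, Hv, Wv = projection_grids.shape[1:4]
    volumes = []
    for k in range(xrays.shape[0]):
        feat2d = unet2d(xrays[k:k + 1])                  # (1, F, H, W)
        # Re-project W_k into 3D: every voxel reads the 2D feature at the
        # detector pixel its ray passes through in view k.
        grid = projection_grids[k].reshape(1, D * Hv, Wv, 2)
        feat3d = F.grid_sample(feat2d, grid, align_corners=False)
        volumes.append(feat3d.view(1, -1, D, Hv, Wv))    # (1, F, D, H, W)
    return torch.stack(volumes).mean(dim=0)              # F_avg

Averaging the K re-projected volumes keeps the conditioning signal the same shape regardless of how many input views are available.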

(b) 3D conditional latent diffusion model: During training, each CT volume is encoded into a latent code Z0 using a pretrained encoder. We train a time-conditioned 3D denoising U-Net that takes a noisy latent code Zt and the conditioning signal Favg, and outputs a partially denoised code Zt-1. After T denoising steps, the predicted code Ẑ0 is decoded into a CT volume using a pretrained decoder.
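A minimal sketch of step (b) and of the Monte Carlo sampling used for uncertainty quantification, assuming a standard DDPM-style noise-prediction objective (the actual code builds on the Video Diffusion Models codebase linked below); encoder, decoder, denoiser3d, and reverse_diffusion are hypothetical names.

import torch
import torch.nn.functional as F

def diffusion_training_step(ct, f_avg, encoder, denoiser3d, alphas_cumprod):
    """One training step of the conditional latent diffusion model (sketch).

    alphas_cumprod: 1-D tensor of cumulative noise-schedule products."""
    with torch.no_grad():
        z0 = encoder(ct)                                   # latent code Z_0
    t = torch.randint(0, len(alphas_cumprod), (z0.shape[0],), device=z0.device)
    noise = torch.randn_like(z0)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1, 1)
    zt = a_bar.sqrt() * z0 + (1.0 - a_bar).sqrt() * noise  # forward process
    # Time-conditioned 3D U-Net predicts the injected noise given F_avg.
    return F.mse_loss(denoiser3d(zt, t, cond=f_avg), noise)

@torch.no_grad()
def reconstruct_with_uncertainty(f_avg, reverse_diffusion, decoder, n=8):
    """Monte Carlo sampling: n reconstructions from the same x-rays.

    reverse_diffusion runs all T denoising steps from pure noise down to a
    predicted latent code Z_hat_0 (hypothetical helper)."""
    cts = torch.stack([decoder(reverse_diffusion(f_avg)) for _ in range(n)])
    return cts.mean(dim=0), cts.std(dim=0)  # voxelwise mean + uncertainty map

Because each reverse diffusion run starts from fresh noise, repeated sampling yields a distribution of plausible CT volumes, and the voxelwise spread can serve as an uncertainty map.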

Quantitative Results

Comparison of DIFR3CT with baselines on the Thoracic Dataset, given biplanar x-ray inputs

(From left to right: Ground Truth, NAF [1], 3D Diffusion [2], X2CT-GAN [3], INRR3CT [4], DIFR3CT)


Related links

DIFR3CT was implemented on top of the Video VQGAN and Video Diffusion Models codebases.

Our previous work, INRR3CT: CT Reconstruction from Few Planar X-Rays with Application Towards Low-Resource Radiotherapy, can be found here.

Citation

If you find this work useful in your research, please cite the paper:

Acknowledgements

The website template was borrowed from Michaël Gharbi.