Generative AI model for Global Illumination effects

Originally posted by SungYe Kim, Wojciech Uss, Wojciech Kaliński, Alexandr Kuznetsov, and Harish Anand.

Recent advances in generative techniques [1] exhibit the ability to generate images with visually appealing content and illumination. Strong priors in generative models, learned from large-scale datasets, have enabled this breakthrough, ushering in a new era of neural rendering. While some research has focused on realistic and controllable lighting effects with diffusion-based models, these models still lack the capability to produce specific lighting effects. In particular, generating multi-bounce, high-frequency lighting effects such as caustics remains untackled in diffusion-based image generation.

Diffusion-based models [2] have demonstrated the capability to generate photorealistic images in various domains. Nevertheless, recent work has highlighted the limitations of current diffusion-based image generation for shadows and reflections, introducing conditional diffusion models that treat lighting with single-bounce shading and mirror reflections as a depth-conditioned image inpainting task.

Caustics Generation by a Conditional Diffusion Model

We leverage diffusion-based techniques to generate the indirect illumination of a particular lighting effect. Specifically, our technique enables a diffusion model to produce cardioid-shaped reflective caustics as a conditional image generation task.

We use a latent-space diffusion model as our baseline architecture and extend it with multi-image conditioning and light embeddings. We use geometric and material information such as albedo, normal, roughness, and metallic as conditioning images, augmenting them with illumination information such as direct illumination and radiance cues. These conditioning images are encoded into latent space using a pre-trained Variational Autoencoder (VAE) encoder. The light position, encoded with positional encoding, and the light direction, encoded with spherical harmonics, form an additional input to the diffusion UNet. Figure 1 presents our framework with a conditional diffusion model for generating the caustics effect.

Figure 1. Framework with a conditional diffusion model.
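
To illustrate the light embedding, the minimal sketch below shows a NeRF-style sinusoidal positional encoding of the light position and a degree-2 real spherical harmonics encoding of the light direction, concatenated into a single vector that conditions the diffusion UNet. The number of frequencies, the SH degree, and the function names are our assumptions for illustration, not taken from the paper.

```python
import torch

def positional_encoding(x: torch.Tensor, num_freqs: int = 6) -> torch.Tensor:
    """NeRF-style sinusoidal encoding of a 3D position (assumed normalized)."""
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype, device=x.device)   # [F]
    angles = x[..., None] * freqs * torch.pi                                 # [..., 3, F]
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)          # [..., 3, 2F]
    return enc.flatten(start_dim=-2)                                         # [..., 6F]

def sh_basis_l2(d: torch.Tensor) -> torch.Tensor:
    """Real spherical harmonics basis up to degree 2 (9 coefficients) for unit directions."""
    x, y, z = d[..., 0], d[..., 1], d[..., 2]
    return torch.stack([
        0.282095 * torch.ones_like(x),                 # l = 0
        0.488603 * y, 0.488603 * z, 0.488603 * x,      # l = 1
        1.092548 * x * y, 1.092548 * y * z,            # l = 2
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ], dim=-1)

def light_embedding(light_pos: torch.Tensor, light_dir: torch.Tensor) -> torch.Tensor:
    """Concatenate the encoded light position and light direction into one embedding."""
    d = torch.nn.functional.normalize(light_dir, dim=-1)
    return torch.cat([positional_encoding(light_pos), sh_basis_l2(d)], dim=-1)
```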

Results

We fine-tune a latent-space diffusion model on our caustics dataset and demonstrate that our approach generates visually plausible cardioid-shaped caustics. The conditioning information, which includes geometric, material, and illumination data as well as light properties, is readily obtained from an existing rendering pipeline.
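
To make the fine-tuning setup concrete, the sketch below outlines one training step in the style of a diffusers-based latent diffusion pipeline. The batch keys, the channel-wise concatenation of conditioning latents with the noisy latent, and the cross-attention path for the light embedding are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def finetune_step(vae, unet, noise_scheduler, optimizer, batch, light_emb):
    """One illustrative fine-tuning step: denoise the indirect-illumination (caustics)
    latent, conditioned on latent-encoded G-buffer/illumination images and a light
    embedding. `light_emb` is assumed to already have shape [B, seq, cross_attn_dim]."""
    with torch.no_grad():
        # Encode the target indirect illumination and the conditioning images
        # with a frozen, pre-trained VAE (0.18215 is the usual SD latent scale).
        target = vae.encode(batch["indirect"]).latent_dist.sample() * 0.18215
        cond = torch.cat(
            [vae.encode(batch[name]).latent_dist.sample() * 0.18215
             for name in ("albedo", "normal", "roughness_metallic",
                          "direct", "radiance_cues")],
            dim=1)

    # Standard latent-diffusion noise-prediction objective.
    noise = torch.randn_like(target)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                      (target.shape[0],), device=target.device)
    noisy = noise_scheduler.add_noise(target, noise, t)

    # Conditioning latents are concatenated channel-wise with the noisy latent
    # (an assumption; the UNet's input channels must be widened accordingly),
    # while the light embedding enters through cross-attention.
    pred = unet(torch.cat([noisy, cond], dim=1), t,
                encoder_hidden_states=light_emb).sample

    loss = F.mse_loss(pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```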

Figure 2 shows our results for validation data (top) and test data (bottom). The indirect illumination (Figure 2 (b)) is generated by our fine-tuned diffusion model and composited with the direct illumination (Figure 2 (a)), one of our conditioning images, to produce the final result (Figure 2 (c)).

Figure 2. Our results. (a) Direct illumination, (b) indirect illumination generated by our model, (c) our final result (a) + (b), (d) reference global illumination.

Figure 3. (Left) Our result. (Right) Reference global illumination.
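
The compositing step behind Figure 2 (c) is a simple addition of the generated indirect illumination to the direct illumination. A minimal sketch, assuming both images are in linear HDR space and using a basic gamma curve for display, could look like this:

```python
import numpy as np

def composite(direct: np.ndarray, indirect: np.ndarray) -> np.ndarray:
    """Add the generated indirect illumination to the direct illumination in linear space."""
    return direct + indirect

def to_display(linear_rgb: np.ndarray) -> np.ndarray:
    """Simple clamp + gamma for display (an assumption; any tone mapper would do)."""
    return np.clip(linear_rgb, 0.0, 1.0) ** (1.0 / 2.2)
```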

Our work paves the way for further research into generative diffusion-based models capable of producing specific indirect illumination effects. Further details can be found in our paper [3], presented as a Eurographics 2025 short paper.

References

  1. Tim Brooks, Bill Peebles, et al. Video Generation Models as World Simulators. OpenAI Technical Report (2024).
  2. Hanqun Cao, Cheng Tan, Zhangyang Gao, Yilun Xu, Guangyong Chen, Pheng-Ann Heng, Stan Z. Li. A Survey on Generative Diffusion Models. IEEE Transactions on Knowledge and Data Engineering (2024).
  3. Wojciech Uss, Wojciech Kaliński, Alexandr Kuznetsov, Harish Anand, Sungye Kim. Cardioid Caustics Generation with Conditional Diffusion Models. Eurographics 2025 - Short Papers (2025).

SungYe Kim

SungYe Kim is a Principal Member of Technical Staff (PMTS) in the Advanced Graphics Program group, where she focuses on research and development of AI-assisted neural rendering techniques and leads development of forward-looking techniques.  She received her PhD in Computer Engineering from Purdue University.  Throughout her career in the industry, she has developed proficiency in diverse domains including gaming, media, VR and neural rendering with an emphasis on generating high-quality images for real-time use cases.

Wojciech Uss

Wojciech Uss is a Senior Member of Technical Staff (SMTS) in the Advanced Rendering Research group, specializing in the development and optimization of neural network models for use in computer graphics rendering. His work focuses on pushing the boundaries of real-time graphics and path tracing rendering efficiency. He holds a PhD in Mathematics from Gdańsk University. Outside of work, Wojciech enjoys spending time with his family, running, and honing communication skills, particularly in Nonviolent Communication (NVC).

Wojciech Kaliński

Wojciech Kaliński is a Member of Technical Staff (MTS) in the Advanced Rendering Research group. He has extensive experience in computer graphics, which he applies to his work on neural rendering projects. His main interests are physically based rendering, ray tracing and applications of AI in 3D graphics.

Alexandr Kuznetsov

Alexandr Kuznetsov is a Member of Technical Staff (MTS) in the Advanced Graphics Program group, specializing in applying deep learning techniques to computer graphics. He received his PhD from UC San Diego under supervision of Prof. Ravi Ramamoorthi. During his PhD he worked on denoising and neural materials.

Harish Anand

Harish Anand is a Member of Technical Staff (MTS) in the Advanced Graphics Program group, specializing in the development and optimization of diffusion and transformer models. Harish earned his master's degree in Computer Science from Arizona State University.