
Bachelor Project - Global Illumination Overview

Global Illumination Overview

In this post I’ll give an overview of the global illumination system used in my toy renderer, which was recently open-sourced. As development continues, I’ll post new write-ups on improvements and additions made to the project.

The renderer uses a technique called voxel cone tracing, which approximates global illumination by marching a series of increasingly large spheres through a 3D scene. We leverage the graphics pipeline to voxelize the scene with one draw call per material. We support hardware conservative rasterization when available and provide our own software implementation as a fallback. Shading attributes are stored in a dense grid of anisotropic voxels. All direct diffuse lighting is rendered as part of the voxelization step by evaluating the reflectance equation with PCF shadows as the visibility function and a Lambertian BRDF. The pre-filtered scene data in the mip-map pyramid is then generated with compute dispatches. We launch the final gather for indirect lighting in the deferred shading pass. If we evaluate more than one light bounce, we run a compute dispatch prior to deferred shading that traces the voxelized scene and outputs a new voxelized version.
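To make the cone-marching idea concrete, here is a minimal Python sketch of tracing a single cone. It is not the renderer's shader code: `sample_voxels` is a hypothetical stand-in for a filtered fetch from the voxel mip pyramid, values are scalar and non-premultiplied for brevity, and the defaults for `step_scale` and `opacity_threshold` simply mirror the Trace Step Scale and Opacity Threshold settings listed in this post.

```python
import math


def trace_cone(sample_voxels, origin, direction, half_angle_rad,
               voxel_size=1.0, max_distance=64.0,
               step_scale=0.5, opacity_threshold=0.95):
    """March one cone front to back and return (radiance, opacity).

    sample_voxels(position, mip_level) -> (radiance, opacity) is a
    hypothetical callback standing in for a filtered 3D texture fetch.
    """
    radiance, opacity = 0.0, 0.0
    # Start one voxel away from the surface to avoid self-intersection.
    distance = voxel_size
    while distance < max_distance and opacity < opacity_threshold:
        # The sphere diameter grows linearly with the distance travelled.
        diameter = max(voxel_size, 2.0 * distance * math.tan(half_angle_rad))
        # Pick the mip level whose voxel footprint matches the sphere.
        mip_level = math.log2(diameter / voxel_size)
        position = tuple(o + d * distance for o, d in zip(origin, direction))
        sample_radiance, sample_opacity = sample_voxels(position, mip_level)
        # Front-to-back compositing: samples behind accumulated opacity
        # contribute less.
        weight = (1.0 - opacity) * sample_opacity
        radiance += weight * sample_radiance
        opacity += weight
        # Larger spheres allow larger steps.
        distance += diameter * step_scale
    return radiance, opacity
```

The early exit on `opacity_threshold` is what keeps cones cheap in dense regions: once a cone is nearly opaque, further samples would barely change the result.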

In the following sections we will describe a single frame rendered in 14.32 milliseconds with the following setup:

| Attribute | Setup |
| --- | --- |
| Operating System | Microsoft Windows 10 Home (10.0.18363) |
| CPU | AMD Ryzen 3950X |
| GPU | NVIDIA GeForce RTX 2080 Super |
| GPU Driver Version | 497.09 |
| RAM | 32 GB |
| VRAM | 8192 MB GDDR6 |
| Rendering API | Direct3D 11.3 |

And these settings:

| Setting | Value |
| --- | --- |
| Screen Resolution | 1920x1080 |
| Voxel Grid Resolution | 128x64x64 |
| Normal Mapping | On |
| Anisotropic Texture Filtering | 4x |
| Shadow Filtering Mode | PCSS |
| Number of Bounces | 3 |
| Specular Cone Angle | 10 |
| Opacity Threshold | 0.95 |
| Trace Step Scale | 0.5 |
| Mip Bias | 0 |
| Deferred Pass Cone Count* | 9 |
| Bounce Pass Cone Count* | 9 |

(*) Note that the last two rows are not exposed in the UI.

What happens in a frame?

Shadow pass (~ 0.40ms):

For shadow maps, we use an inverted depth buffer to minimize the maximum floating-point error. Additionally, we apply the depth bias when writing to the shadow map rather than when reading and comparing. We also apply a normal offset bias, computed from the vertex normal instead of the face normal.
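The precision argument for an inverted depth buffer can be illustrated with a small Python sketch. It assumes a perspective projection and a 32-bit floating-point depth format; the near and far plane values and all function names are made up for the example and are not taken from the renderer.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def standard_depth(z, near, far):
    """Perspective depth mapping near -> 0 and far -> 1."""
    return far * (z - near) / (z * (far - near))


def reversed_depth(z, near, far):
    """Inverted mapping near -> 1 and far -> 0 (equal to 1 - standard)."""
    return near * (far - z) / (z * (far - near))


# Two distant surfaces only 1 cm apart (hypothetical scene scale).
near, far = 0.1, 1000.0
z_a, z_b = 900.0, 900.01

# Standard depth crowds distant values just below 1.0, where 32-bit
# floats are coarse, so both surfaces round to the same stored depth.
same_standard = (to_float32(standard_depth(z_a, near, far))
                 == to_float32(standard_depth(z_b, near, far)))

# Inverted depth places distant values near 0.0, where floats are
# dense, so the two surfaces remain distinguishable.
same_reversed = (to_float32(reversed_depth(z_a, near, far))
                 == to_float32(reversed_depth(z_b, near, far)))
```

With these numbers `same_standard` is `True` and `same_reversed` is `False`, which is the reason for storing inverted depth in the shadow map.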

Voxelization pass (~ 1.48ms):

In the voxelization pass, we generate a voxelized representation of opacity and emittance from a lower-LOD version of the scene. The opacity and emittance maps are two separate 3D textures. During voxelization, we write to mip 0 of these textures. Optionally, a gap-filling compute pass is run on mip 0. The higher mips are then generated with a compute dispatch.
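The mip generation step can be sketched in Python as repeated 2x2x2 averaging. This is a simplification made for illustration: it uses an isotropic box filter on a scalar grid, whereas the renderer stores anisotropic voxels and runs the reduction in a compute shader. The function names are hypothetical.

```python
def downsample(grid):
    """Average every 2x2x2 block of a dense grid indexed as grid[z][y][x]."""
    depth, height, width = len(grid), len(grid[0]), len(grid[0][0])
    return [[[sum(grid[2 * z + dz][2 * y + dy][2 * x + dx]
                  for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)) / 8.0
              for x in range(width // 2)]
             for y in range(height // 2)]
            for z in range(depth // 2)]


def build_mip_chain(mip0):
    """Return [mip0, mip1, ...] until the smallest dimension reaches 1."""
    mips = [mip0]
    while min(len(mips[-1]), len(mips[-1][0]), len(mips[-1][0][0])) > 1:
        mips.append(downsample(mips[-1]))
    return mips


# A tiny 4x2x2 opacity grid with a single fully opaque voxel.
opacity_mip0 = [[[0.0] * 4 for _ in range(2)] for _ in range(2)]
opacity_mip0[0][0][0] = 1.0
opacity_mips = build_mip_chain(opacity_mip0)
```

Averaging spreads the single opaque voxel over its parent cell, which is why coarse mips read as partially transparent and why thin geometry can leak light at large cone radii.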

G-Buffer pass (~ 0.29ms):

The geometry pass writes per-pixel information such as depth, surface normal, and color to the G-buffers. We pack the surface normal into two 16-bit unorm integers using octahedral encoding (the oct32 variant). We also apply normal mapping as part of the geometry pass.
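For readers unfamiliar with octahedral encoding, the following Python sketch shows the standard mapping from a unit normal to two 16-bit unorm values and back. It follows the commonly published formulation and is only an illustration; the function names are hypothetical and the renderer's shader code may differ in detail.

```python
import math


def _sign_not_zero(v):
    return 1.0 if v >= 0.0 else -1.0


def oct_encode(normal):
    """Pack a unit normal into two integers in [0, 65535]."""
    x, y, z = normal
    # Project the sphere onto the octahedron |x| + |y| + |z| = 1.
    inv_l1 = 1.0 / (abs(x) + abs(y) + abs(z))
    u, v = x * inv_l1, y * inv_l1
    if z < 0.0:
        # Fold the lower hemisphere over the diagonals of the square.
        u, v = ((1.0 - abs(v)) * _sign_not_zero(u),
                (1.0 - abs(u)) * _sign_not_zero(v))
    # Remap [-1, 1] to [0, 1] and quantize to 16-bit unorm.
    return (int(round((u * 0.5 + 0.5) * 65535.0)),
            int(round((v * 0.5 + 0.5) * 65535.0)))


def oct_decode(packed):
    """Recover a unit normal from two 16-bit unorm integers."""
    u = packed[0] / 65535.0 * 2.0 - 1.0
    v = packed[1] / 65535.0 * 2.0 - 1.0
    z = 1.0 - abs(u) - abs(v)
    if z < 0.0:
        # Undo the fold for the lower hemisphere.
        u, v = ((1.0 - abs(v)) * _sign_not_zero(u),
                (1.0 - abs(u)) * _sign_not_zero(v))
    length = math.sqrt(u * u + v * v + z * z)
    return (u / length, v / length, z / length)
```

The appeal of this layout is that all 32 bits carry useful precision and decoding needs only a few arithmetic operations and one normalization.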

Bounce pass (~ 6.95ms):

The bounce pass is a compute shader pre-pass on the emittance map that computes a light bounce in texel-space before the final light bounce happens in the deferred pass. Running this pass more than once per frame is too expensive. Therefore, we amortize the work over multiple frames for multi-bounce lighting.

SSSO (~ 1.41ms) + Filtering (~ 0.29ms):

The renderer uses screen-space specular occlusion (SSSO) to remove some of the light leaking caused by voxel self-intersection avoidance. Occlusion is gathered by cone tracing the depth buffer. If the stored depth is outside the depth range covered by the sphere, the fragment is occluded. A cross-bilateral Gaussian filter is used to smooth out the results while preserving edges.

The resulting texture is then used to attenuate the specular contribution in the deferred pass.

The SSSO pass is slower than it has to be because we don’t skip texels during z-buffer tracing.
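The cross-bilateral filtering step mentioned above can be sketched in one dimension with Python. The filter weights each neighbor by a spatial Gaussian and by a second Gaussian on the difference in a guide signal, here depth. The kernel radius and both sigma values are assumptions chosen for the example, not the renderer's actual parameters.

```python
import math


def cross_bilateral_1d(values, depths, radius=2,
                       sigma_space=1.5, sigma_depth=0.05):
    """Blur `values` along one axis, using `depths` to preserve edges."""
    filtered = []
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        for j in range(lo, hi):
            # Spatial term: nearby texels count more.
            w_space = math.exp(-((j - i) ** 2) / (2.0 * sigma_space ** 2))
            # Range term on the guide: texels at a different depth
            # are treated as belonging to another surface.
            depth_delta = depths[j] - depths[i]
            w_depth = math.exp(-(depth_delta ** 2) / (2.0 * sigma_depth ** 2))
            total += w_space * w_depth * values[j]
            weight_sum += w_space * w_depth
        # The center texel always has weight 1, so weight_sum > 0.
        filtered.append(total / weight_sum)
    return filtered


# Occlusion values across a depth discontinuity between texels 2 and 3.
occlusion = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
depth_edge = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
edge_preserved = cross_bilateral_1d(occlusion, depth_edge)
```

In a shader, a 2D version is typically run as two separable passes, one horizontal and one vertical.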

Deferred pass (~ 4.80ms):

In the deferred shading pass, we use the G-buffer data to render direct lighting using shadow maps for the visibility function and cone tracing for indirect diffuse, indirect specular, and ambient occlusion. For shadows, we provide implementations of PCF and PCSS. In our case, all direct lighting can be computed with closed-form analytical formulas. Indirect diffuse is handled by voxel cone tracing and no solution for indirect specular is provided. However, indirect diffuse contributes to a local fragment’s specular response, thereby providing reflections without specular highlights.
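As an illustration of the direct lighting term, the sketch below combines a PCF visibility estimate with a Lambertian BRDF in Python. Several things are assumed for the example: a shadow map stored as a 2D list of inverted depth values (larger means closer to the light), a bias that was already applied when the map was written, a fixed 3x3 kernel, and scalar radiance. All names are hypothetical.

```python
import math


def pcf_visibility(shadow_map, u, v, receiver_depth, radius=1):
    """Return the fraction of nearby shadow-map texels that leave the
    receiver lit, sampling a (2 * radius + 1)^2 window around (u, v)."""
    height, width = len(shadow_map), len(shadow_map[0])
    lit, taps = 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp to the edge of the shadow map.
            x = min(max(u + dx, 0), width - 1)
            y = min(max(v + dy, 0), height - 1)
            # Inverted depth: the receiver is lit when it is at least
            # as close to the light as the stored occluder.
            if receiver_depth >= shadow_map[y][x]:
                lit += 1
            taps += 1
    return lit / taps


def direct_diffuse(albedo, light_radiance, normal, light_dir, visibility):
    """Reflectance equation for one light with a Lambertian BRDF."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return albedo / math.pi * light_radiance * n_dot_l * visibility


# The right column of the shadow map holds a closer occluder.
shadow_map = [[0.5, 0.5, 0.9],
              [0.5, 0.5, 0.9],
              [0.5, 0.5, 0.9]]
visibility = pcf_visibility(shadow_map, 1, 1, receiver_depth=0.6)
shaded = direct_diffuse(1.0, math.pi, (0.0, 0.0, 1.0), (0.0, 0.0, 1.0), visibility)
```

PCSS builds on the same comparison but first estimates the average blocker depth to scale the kernel size, which gives softer shadows farther from the occluder.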

Images rendered with a different number of indirect diffuse bounces:

As can be seen, one of the main weaknesses of the renderer is light leaking from the indirect diffuse. To address this, an occlusion pass similar to SSSO can be added. However, since indirect diffuse is gathered by tracing multiple cones for every pixel, doing this at full resolution would be prohibitively expensive. Instead, we can trace enough cones at quarter resolution to cover the hemisphere, allowing for a small amount of overlap, and then upscale the result before deferred shading happens.
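As a rough sanity check on how many cones "enough to cover the hemisphere" implies, the Python sketch below compares the solid angle of one cone with that of the hemisphere. The result is only a lower bound, since circular cone footprints cannot tile a hemisphere without some overlap. The half-angles used are hypothetical examples rather than the renderer's settings.

```python
import math


def min_cones_for_hemisphere(half_angle_deg):
    """Lower bound on the number of cones with the given half-angle
    needed to cover a hemisphere, based on solid angle alone."""
    half_angle = math.radians(half_angle_deg)
    # Solid angle of a cone is 2*pi*(1 - cos(half_angle)) steradians.
    cone_solid_angle = 2.0 * math.pi * (1.0 - math.cos(half_angle))
    hemisphere_solid_angle = 2.0 * math.pi
    # A small epsilon guards against rounding pushing an exact ratio
    # just above an integer.
    return math.ceil(hemisphere_solid_angle / cone_solid_angle - 1e-9)


cones_at_30_degrees = min_cones_for_hemisphere(30.0)
cones_at_45_degrees = min_cones_for_hemisphere(45.0)
```

Narrower cones resolve occlusion more sharply but raise the count quickly, which is the trade-off that makes a reduced-resolution pass attractive.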

 

Licensed under CC BY-NC-SA 4.0