Bachelor Project: Global Illumination Overview

2022-08-02 · Bjarke Damsgaard Eriksen

Global Illumination Overview

In this post I’ll give an overview of the global illumination system used in my toy renderer, which recently went open source. As development continues, I’ll post new write-ups on improvements and additions made to the project.

The renderer uses a technique called voxel cone tracing, which approximates global illumination by tracing a series of increasingly large spheres into a 3D scene. We leverage the graphics pipeline to voxelize the scene with one draw call per material. We support hardware conservative rasterization when available and provide our own software implementation as a fallback. Shading attributes are stored in a dense grid of anisotropic voxels. All direct diffuse lighting is rendered as part of the voxelization step by evaluating the reflectance equation with PCF shadows as the visibility function and a Lambertian BRDF. The pre-filtered scene data in the mip-map pyramid is then generated by compute dispatches. We launch the final gather for indirect lighting in the deferred shading pass. If we evaluate more than one light bounce, we run a compute dispatch before deferred shading that traces the voxelized scene and outputs a new voxelized version.
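To make the idea of marching spheres of growing size concrete, here is a minimal Python sketch of a single cone trace with front-to-back compositing. It is not the renderer's shader code: `trace_cone` and the `sample_mip` callback are hypothetical names, emittance is assumed to be non-premultiplied, and the way the step scale and opacity threshold are applied is my assumption about how the "Trace Step Scale" and "Opacity Threshold" settings could plausibly be used.

```python
import math

def trace_cone(sample_mip, origin, direction, half_angle_rad, voxel_size,
               max_distance, step_scale=0.5, opacity_threshold=0.95):
    """March one cone front to back and return (radiance, opacity).

    sample_mip(position, mip) is a hypothetical callback that returns
    (emittance, opacity) pre-filtered at the requested mip level.
    """
    radiance, opacity = 0.0, 0.0
    t = voxel_size  # start one voxel out to avoid sampling the origin voxel
    while t < max_distance and opacity < opacity_threshold:
        # The sphere diameter grows linearly with distance along the cone.
        diameter = max(voxel_size, 2.0 * t * math.tan(half_angle_rad))
        # Pick the mip level whose voxels are roughly as large as the sphere.
        mip = math.log2(diameter / voxel_size)
        position = tuple(o + d * t for o, d in zip(origin, direction))
        emittance, alpha = sample_mip(position, mip)
        # Front-to-back compositing: later samples are hidden by earlier ones.
        radiance += (1.0 - opacity) * alpha * emittance
        opacity += (1.0 - opacity) * alpha
        # Step proportionally to the sphere size.
        t += diameter * step_scale
    return radiance, opacity
```

In a uniform medium the loop ends as soon as the accumulated opacity crosses the threshold, which is why a higher threshold costs more samples per cone.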

In the following sections we will describe a single frame rendered in 14.32 milliseconds with the following setup:

Attribute	Setup
Operating System	Microsoft Windows 10 Home (10.0.18363)
CPU	AMD Ryzen 3950X
GPU	NVIDIA GeForce RTX 2080 Super
GPU Driver Version	497.09
RAM	32 GB
VRAM	8192 MB GDDR6
Rendering API	Direct3D 11.3

And these settings:

Setting	Value
Screen Resolution	1920x1080
Voxel Grid Resolution	128x64x64
Normal Mapping	On
Anisotropic Texture Filtering	4x
Shadow Filtering Mode	PCSS
Number of Bounces	3
Specular Cone Angle	10
Opacity Threshold	0.95
Trace Step Scale	0.5
Mip Bias	0
Deferred Pass Cone Count*	9
Bounce Pass Cone Count*	9

(*) Note that the last two rows aren’t exposed in the UI.

What happens in a frame?

Shadow pass (~ 0.40ms):

For shadow maps, we use an inverted depth buffer to minimize the maximum floating-point error. We apply the depth bias when writing to the shadow map rather than when reading and comparing. We also apply a normal offset bias, computed from the vertex normal instead of the face normal.
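The benefit of an inverted depth buffer can be illustrated with a small Python experiment. This is a sketch under assumptions: a 32-bit floating-point depth format, a perspective projection, and made-up near and far planes of 0.1 and 1000. The helper names are hypothetical. The depth is computed in double precision and then rounded to 32 bits, so it only shows the storage error, not the error of doing the arithmetic in a shader.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def standard_depth(z, near, far):
    """Direct3D-style depth: 0 at the near plane, 1 at the far plane."""
    return far / (far - near) * (1.0 - near / z)

def inverted_depth(z, near, far):
    """Inverted depth: 1 at the near plane, 0 at the far plane."""
    return near / (far - near) * (far / z - 1.0)

def distinct_depths(depth_fn, z_start, z_end, count, near, far):
    """Count how many of `count` evenly spaced view-space depths in
    [z_start, z_end) remain distinguishable after rounding to float32."""
    values = set()
    for i in range(count):
        z = z_start + (z_end - z_start) * i / count
        values.add(to_float32(depth_fn(z, near, far)))
    return len(values)

# Compare both mappings on 1000 samples far from the light (assumed planes).
far_standard = distinct_depths(standard_depth, 900.0, 901.0, 1000, 0.1, 1000.0)
far_inverted = distinct_depths(inverted_depth, 900.0, 901.0, 1000, 0.1, 1000.0)
```

Distant surfaces land near 1 in the standard mapping, where 32-bit floats are coarse, and near 0 in the inverted mapping, where they are dense. That is why `far_inverted` keeps far more distinct values than `far_standard`.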

Spotlight #1
Spotlight #2

Voxelization pass (~ 1.48ms):

In the voxelization pass, we generate a voxelized representation of opacity and emittance from a lower-LOD version of the scene. The opacity and emittance maps are two separate 3D textures. During voxelization, we write to mip 0 of these textures. Optionally, a gap-filling compute pass is run on mip 0. The higher mips are then generated with a compute dispatch.
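Generating the higher mips from mip 0 can be sketched as a repeated 2x2x2 box filter. The Python below is a simplified illustration, not the renderer's compute shader: it treats voxels as isotropic scalars (the real grid stores anisotropic voxels), assumes power-of-two dimensions, and uses hypothetical function names.

```python
def downsample(grid):
    """Average each 2x2x2 block of a dense grid indexed as [z][y][x]."""
    depth, height, width = len(grid), len(grid[0]), len(grid[0][0])
    coarse = []
    for z in range(0, depth, 2):
        plane = []
        for y in range(0, height, 2):
            row = []
            for x in range(0, width, 2):
                total = sum(grid[z + dz][y + dy][x + dx]
                            for dz in (0, 1) for dy in (0, 1) for dx in (0, 1))
                row.append(total / 8.0)
            plane.append(row)
        coarse.append(plane)
    return coarse

def build_mip_chain(mip0):
    """Return [mip0, mip1, ...], stopping once the smallest axis reaches 1."""
    chain = [mip0]
    while min(len(chain[-1]), len(chain[-1][0]), len(chain[-1][0][0])) > 1:
        chain.append(downsample(chain[-1]))
    return chain
```

With a 128x64x64 grid, this scheme yields seven levels, ending at 2x1x1. A single fully opaque voxel contributes an opacity of 1/8 to its parent, which is the pre-filtering the cones rely on when they sample coarser mips.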

Emittance
Opacity

G-Buffer pass (~ 0.29ms):

The geometry pass writes per-pixel information about depth, surface normal, color, etc. to the G-buffers. We pack the surface normal into two 16-bit unorm integers using octahedral encoding, following the oct32 scheme. Normal mapping is also done as part of the geometry pass.
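For readers unfamiliar with octahedral encoding, the following Python sketch shows the general technique: project the unit normal onto an octahedron, unfold the lower half, and quantize the two resulting coordinates to 16 bits each. The function names are hypothetical, and the remapping from [-1, 1] to the unorm range is my assumption about how the unorm storage is handled.

```python
import math

def _sign_not_zero(v):
    return 1.0 if v >= 0.0 else -1.0

def oct_encode(normal):
    """Pack a unit normal into two 16-bit unorm integers (0..65535)."""
    x, y, z = normal
    s = abs(x) + abs(y) + abs(z)
    px, py = x / s, y / s  # project onto the octahedron |x| + |y| + |z| = 1
    if z < 0.0:
        # Fold the lower hemisphere over the diagonals of the unit square.
        px, py = ((1.0 - abs(py)) * _sign_not_zero(px),
                  (1.0 - abs(px)) * _sign_not_zero(py))
    # Remap [-1, 1] to [0, 1] and quantize to 16 bits.
    return (round((px * 0.5 + 0.5) * 65535), round((py * 0.5 + 0.5) * 65535))

def oct_decode(packed):
    """Recover an approximate unit normal from two 16-bit unorm integers."""
    px = packed[0] / 65535.0 * 2.0 - 1.0
    py = packed[1] / 65535.0 * 2.0 - 1.0
    z = 1.0 - abs(px) - abs(py)
    if z < 0.0:
        px, py = ((1.0 - abs(py)) * _sign_not_zero(px),
                  (1.0 - abs(px)) * _sign_not_zero(py))
    length = math.sqrt(px * px + py * py + z * z)
    return (px / length, py / length, z / length)
```

A round trip such as `oct_decode(oct_encode((0.0, 0.0, 1.0)))` returns a vector very close to the input. The small residual comes from the 16-bit quantization.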

Albedo (RGB) + specular exponent (A)
Normals
Material index

Bounce pass (~ 6.95ms):

The bounce pass is a compute shader pre-pass on the emittance map that computes a light bounce in texel space before the final light bounce happens in the deferred pass. Running this pass more than once per frame is too expensive, so for multi-bounce lighting we amortize the work over multiple frames.

SSSO (~ 1.41ms) + Filtering (~ 0.29ms):

The renderer uses screen-space specular occlusion (SSSO) to remove some of the light leaking caused by voxel self-intersection avoidance. Occlusion is gathered by cone tracing the depth buffer. If the stored depth is outside the depth range covered by a sphere along the cone, the fragment is occluded. A cross-bilateral Gaussian filter is then used to smooth the results while preserving edges.
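The edge-preserving behaviour of a cross-bilateral filter can be shown with a short Python sketch. It works on a single row of texels for brevity (a 2D texture would typically be filtered in two separable passes), uses depth as the guide signal, and all parameter names and default values are illustrative assumptions rather than the renderer's actual settings.

```python
import math

def cross_bilateral_1d(values, depths, radius=2, sigma_space=1.5, sigma_depth=0.1):
    """Blur `values`, weighting each neighbour by its spatial distance
    and by how similar its depth is to the centre texel's depth."""
    filtered = []
    last = len(values) - 1
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), last)  # clamp at the borders
            w_space = math.exp(-(k * k) / (2.0 * sigma_space ** 2))
            depth_diff = depths[j] - depths[i]
            w_depth = math.exp(-(depth_diff * depth_diff) / (2.0 * sigma_depth ** 2))
            total += w_space * w_depth * values[j]
            weight_sum += w_space * w_depth
        filtered.append(total / weight_sum)  # centre weight is 1, so never zero
    return filtered
```

Across a depth discontinuity the depth weight drops to almost zero, so occlusion values from the foreground do not bleed into the background. On a flat surface the filter behaves like an ordinary Gaussian blur.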

The resulting texture is then used to attenuate the specular contribution in the deferred pass:

SSSO - Off
SSSO - On

The SSSO pass is slower than it has to be because we don’t skip texels during z-buffer tracing.

Deferred pass (~ 4.80ms):

In the deferred shading pass, we use the G-buffer data to render direct lighting using shadow maps for the visibility function and cone tracing for indirect diffuse, indirect specular, and ambient occlusion. For shadows, we provide implementations of PCF and PCSS. In our case, all direct lighting can be computed with closed-form analytical formulas. Indirect diffuse is handled by voxel cone tracing and no solution for indirect specular is provided. However, indirect diffuse contributes to a local fragment’s specular response, thereby providing reflections without specular highlights.
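As a rough illustration of direct lighting with shadow-map visibility, here is a Python sketch that combines a PCF lookup with a Lambertian BRDF. It is a simplification under stated assumptions: the shadow map is a 2D list of inverted depths (larger means closer to the light), no bias appears in the comparison because the bias is assumed to have been applied when the map was written, the light is a single scalar radiance, and the function names are hypothetical.

```python
import math

def pcf_visibility(shadow_map, u, v, fragment_depth, radius=1):
    """Fraction of shadow-map texels around (u, v) that do not occlude the
    fragment. With inverted depth, a larger value is closer to the light."""
    height, width = len(shadow_map), len(shadow_map[0])
    lit, taps = 0, 0
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            x = min(max(u + du, 0), width - 1)   # clamp to the map edges
            y = min(max(v + dv, 0), height - 1)
            if fragment_depth >= shadow_map[y][x]:
                lit += 1
            taps += 1
    return lit / taps

def lambert_direct(albedo, normal, to_light, light_radiance, visibility):
    """Outgoing radiance per colour channel for a Lambertian surface."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, to_light)))
    return tuple(a / math.pi * light_radiance * n_dot_l * visibility
                 for a in albedo)
```

For example, `lambert_direct(albedo, n, l, radiance, pcf_visibility(shadow_map, u, v, depth))` darkens the direct term in proportion to how many of the 3x3 taps are blocked, which softens the shadow edge. PCSS would additionally vary the filter radius with the estimated blocker distance.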

Images rendered with a different number of indirect diffuse bounces:

One bounce
Two bounces
Three bounces

As can be seen, one of the main weaknesses of the renderer is light leaking from the indirect diffuse. To solve this, an occlusion pass similar to SSSO could be added. However, since indirect diffuse is gathered by tracing multiple cones for every pixel, doing this at full resolution would be prohibitively expensive. Instead, we can trace enough cones at quarter resolution to cover the hemisphere, allowing for a small amount of overlap, and then upscale the result before deferred shading happens.

Image Gallery

#open-source