The renderer at the end of chapter four knows where occlusion lives but not what color it should be. SSAO produces a scalar — less light reaches here — and applies it as a uniform multiplier on whatever the IBL produced. The physical effect being approximated is less light from some directions of the hemisphere, and “some directions” is the part SSAO throws away.
This chapter — the final one — implements Screen-Space Directional Occlusion (Ritschel, Grosch & Seidel, 2009). It refuses the scalar approximation. For each fragment, samples are distributed in 3D across the hemisphere aligned to the surface normal. Each sampled direction is tested for occlusion in screen space, the same way SSAO tests its 2D disc — but instead of producing a single accumulated obscurance, the loop accumulates two outputs simultaneously: the visibility-weighted environment radiance (only the directions that are actually open contribute), and a per-sample record of which directions were blocked.
The blocked samples are then re-purposed in a second pass as senders: the surface that occluded the sample received some direct illumination of its own, and that surface acts as a secondary emitter, bouncing its tinted reflectance toward the receiver. The result is a real-time approximation of one-bounce indirect illumination — color bleeding from nearby surfaces — running entirely off the G-buffer.
From scalar visibility to directional
The mathematical step from SSAO to SSDO is short. SSAO computes
A(P) = pow(max(0, 1 - scale · S), contrast)
where S is a scalar accumulator over hemisphere samples. That A multiplies the diffuse irradiance to produce the final color.
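As a one-line GLSL sketch of that application step (the uniform names scale and contrast, and the diffuseIrradiance term, are assumptions for illustration):

```glsl
// Chapter four's SSAO application, sketched: one scalar dims the whole hemisphere.
float A = pow(max(0.0, 1.0 - scale * S), contrast);
vec3 color = A * diffuseIrradiance; // same multiplier for every direction
```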
SSDO splits the sum. For each of N hemisphere directions ωᵢ:
L_dir(P) = Σ (ρ/π) · L_in(ωᵢ) · V(ωᵢ) · cos(θᵢ) · Δω
where L_in(ωᵢ) is the environment radiance arriving from direction ωᵢ, and V(ωᵢ) is the screen-space visibility test. The result is the actual environment energy arriving at the surface from unoccluded directions, in RGB. A surface occluded from a warm sky but open toward a cool wall doesn’t darken uniformly — it picks up the cool wall’s tint instead.
Two operating modes fall out of this. Visibility mode sums the visibility weight across samples and uses the resulting fraction to scale the SH diffuse term — directional but still using SH for the radiance. Radiance mode accumulates L_dir directly and replaces the SH diffuse entirely with the per-sample environment integral.
Sampling the 3D hemisphere
SSAO’s samples were a 2D screen-space disc. SSDO’s are a 3D hemisphere aligned to each fragment’s normal. The shader builds a tangent frame (T, B, N) per fragment by picking an auxiliary up vector that’s never collinear with N:
vec3 up = abs(N.y) < 0.999 ? vec3(0.0, 1.0, 0.0) : vec3(1.0, 0.0, 0.0);
vec3 T = normalize(cross(up, N));
vec3 B = cross(N, T);
Sample directions are generated in the local frame using stratified elevation plus a spiral azimuthal stride. The elevation is stratified uniformly in cos(θ) — each sample covers an equal solid angle, which is what keeps the per-sample weight Δω = 2π/N exact — by setting cosElevation = 1 - α directly:
float alpha = (float(i) + 0.5) / float(N);
float cosElevation = 1.0 - alpha;
float sinElevation = sqrt(1.0 - cosElevation * cosElevation);
float theta = 2.0 * PI * alpha * (7.0 * float(N) / 9.0) + phi;
vec3 localDir = vec3(sinElevation * cos(theta),
sinElevation * sin(theta),
cosElevation);
vec3 sampleDir = normalize(T * localDir.x + B * localDir.y + N * localDir.z);
The per-pixel rotation phi uses Interleaved Gradient Noise — a screen-space hash that breaks up sample patterns across pixels without needing a noise texture lookup:
float phi = PI * fract(52.9829189 * fract(0.06711056 * gl_FragCoord.x
+ 0.00583715 * gl_FragCoord.y));
Each direction is given a random length λ ∈ (0, R] so the sample lands at a different depth per fragment, jittering the coherent banding into noise that the bilateral blur removes later.
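No listing in the chapter shows λ itself; a minimal sketch, assuming a hypothetical per-sample hash helper hash31 (returning a float in [0, 1)) and the hemisphere radius uniform R:

```glsl
// Hypothetical jitter: a small floor keeps the sample off the surface itself.
float lambda = R * max(0.05, hash31(vec3(gl_FragCoord.xy, float(i))));
vec3 samplePoint = P + sampleDir * lambda; // lands at a different depth per fragment
```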
The screen-space visibility test
Each 3D sample point P + sampleDir · λ is projected back to screen space by multiplying through the camera’s view-projection matrix:
vec4 clipPos = projection * view * vec4(samplePoint, 1.0);
vec2 sampleUV = (clipPos.xy / clipPos.w) * 0.5 + 0.5;
vec3 surfacePos = texture(gPosition, sampleUV).xyz;
Three conditions classify the sample as visible:
- The UV falls outside [0, 1]² — no geometry exists at that screen location, so the sample is in open sky.
- The G-buffer normal is zero — background pixel, no surface.
- The sample’s view-space distance from the camera is less than or equal to the surface distance recorded at sampleUV — the sample sits in front of or on the recorded surface.
Otherwise the sample is occluded — the recorded surface is closer to the camera than the sample, meaning the sample is behind something visible.
float sampleDist = length(samplePoint - viewPos);
float surfaceDist = length(surfacePos - viewPos);
bool visible = (sampleDist <= surfaceDist);
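Putting the three conditions together, the classification might read as follows — a sketch, with sampler names taken from the earlier listings and the zero-normal test an assumption about how background pixels are encoded:

```glsl
// Condition 1: sample projects outside the screen — treat as open sky.
bool offscreen = any(lessThan(sampleUV, vec2(0.0))) ||
                 any(greaterThan(sampleUV, vec2(1.0)));
// Condition 2: background pixel — the G-buffer normal is zero there.
vec3 sNormal = texture(gNormal, sampleUV).xyz;
bool background = dot(sNormal, sNormal) < 1e-6;
// Condition 3: the depth comparison from the listing above.
bool visible = offscreen || background || (sampleDist <= surfaceDist);
```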
The single-depth-layer assumption is the same trade SSAO makes: the G-buffer only records the closest surface per pixel, so geometry hidden behind other geometry can’t contribute to occlusion. False positives can occur at thin silhouettes — but the failure mode is consistent across the screen, and the eye accepts it.
Two outputs from one loop
The direct pass runs the hemisphere loop once and produces two outputs via Multiple Render Targets. For each visible sample, the environment is sampled at a user-controlled mip level (typically high — 8 to 10 — to suppress sharp features in the panorama):
float cosTheta = max(0.0, dot(N, sampleDir));
vec3 L_in = textureLod(envMap, uvOfW(sampleDir), lod).rgb;
float weight = dot(L_in, vec3(0.2126, 0.7152, 0.0722)) * cosTheta;
totalLight += weight; // full hemisphere budget
if (visible) {
visibleLight += weight;
L_dir += (1.0 / PI) * L_in * cosTheta * (2.0 * PI / float(N)); // radiance integral
}
The blurred mip means each “sample” is really a small solid-angle region, smoothing single-pixel high-frequency noise from sharp environment features (the sun disc, narrow window slits) without losing low-frequency directional information.
After the loop:
float visibility = totalLight > 0.0 ? visibleLight / totalLight : 1.0;
visibility = clamp(pow(visibility, contrast), 0.0, 1.0);
// MRT outputs
FragVisibility = vec4(vec3(visibility), 1.0);
FragRadiance = vec4(L_dir, 1.0);
visibility is the fraction of environment energy reaching the surface — directionally weighted. L_dir is the integrated radiance from unoccluded directions, in RGB. Both come out of the same loop, with the only extra cost being the MRT’s second framebuffer write.
The pair is then bilaterally blurred — same algorithm as chapter four’s SSAO blur, extended to handle two MRT inputs simultaneously.
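The chapter does not reproduce the blur shader; a minimal sketch of the dual-input extension, assuming a 5×5 depth-aware kernel and hypothetical uniform names (texelSize, depthSharpness, ssdoVisMap, ssdoRadMap):

```glsl
// One set of bilateral weights, shared by both MRT inputs.
float centerDepth = length(texture(gPosition, TexCoords).xyz - viewPos);
float totalWeight = 0.0;
float blurredVis  = 0.0;
vec3  blurredRad  = vec3(0.0);
for (int x = -2; x <= 2; ++x) {
    for (int y = -2; y <= 2; ++y) {
        vec2 uv = TexCoords + vec2(x, y) * texelSize;
        float d = length(texture(gPosition, uv).xyz - viewPos);
        // Taps at a very different depth get little weight — edges stay sharp.
        float w = exp(-abs(d - centerDepth) * depthSharpness);
        blurredVis += w * texture(ssdoVisMap, uv).r;
        blurredRad += w * texture(ssdoRadMap, uv).rgb;
        totalWeight += w;
    }
}
FragVisibility = vec4(vec3(blurredVis / totalWeight), 1.0);
FragRadiance   = vec4(blurredRad / totalWeight, 1.0);
```

Because both inputs were produced by the same hemisphere loop, their noise is spatially correlated, so sharing one set of weights costs nothing in quality and halves the depth fetches.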
The bounce — sender as emitter
SSAO threw away the blocked samples. SSDO turns them into senders.
A sample classified as occluded hit some surface in screen space. That surface — at coordinates sampleUV — has its own direct radiance recorded in the (now-blurred) ssdoRadMap. It also has its own albedo (from the G-buffer). Under Lambertian reflection, the surface re-emits its incoming light tinted by its albedo. If the renderer treats every blocked sample as a small patch of secondary emitter, the integrated contribution at the receiver is one bounce of indirect illumination.
The form-factor between a small sender patch (area As) and a receiver point P is:
F = cos(θ_s) · cos(θ_r) / d²
where θ_s is the angle between the sender’s normal and the direction toward P, θ_r is the angle between P’s normal and the direction back to the sender, and d is the distance between them. The contribution to the receiver’s incoming radiance is the sender’s direct radiance times its albedo (Lambertian re-emission) times this form-factor.
The indirect pass reuses the same hemisphere sampling loop. For each sample classified as an occluder:
vec3 L_pixel = texture(blurSSDORadianceMap, sampleUV).rgb;
vec3 senderAlbedo = texture(gAlbedo, sampleUV).rgb;
L_pixel *= senderAlbedo; // Lambertian re-emission
float actualDist = length(P - surfacePos);
vec3 dir = (P - surfacePos) / actualDist; // sender → receiver
float cosTheta_s = max(0.0, dot(surfaceNormal, dir)); // sender faces toward P?
float cosTheta_r = max(0.0, dot(N, -dir)); // receiver faces sender?
float falloff = max(0.0, 1.0 - actualDist / R); // smooth fade at radius edge
float d = max(distanceClamp, actualDist); // 1/d² guard
float As = PI * R * R / float(N); // approx. patch area
L_ind += (1.0 / PI) * L_pixel * As * cosTheta_s * cosTheta_r
* strength * falloff / (d * d);
The distanceClamp parameter prevents the inverse-square term from exploding when sender and receiver are very close (a known issue with form-factor approximations at small separations). falloff smoothly fades the contribution to zero at the hemisphere boundary R, matching the original paper’s recommended smoothing.
A subtle but consequential choice: the indirect pass samples the blurred direct radiance for each sender, not the raw output. Pre-blurring spatially smooths each sender’s incoming light, so a single noisy pixel doesn’t spike the indirect contribution; the result is much more stable indirect bleeding even at modest sample counts.
Integration into the lighting pass
The lighting shader from chapter three is extended with a single SSDO branch that selects between two operating modes:
if (ssdo.enable) {
vec3 L_ind = vec3(0.0);
if (ssdo.enableIndirect) {
L_ind = texture(blurindirectSSDOMap, TexCoords).rgb;
}
if (ssdo.doMethod == 0) {
// Visibility mode: scale SH diffuse by directional visibility
float visibility = texture(blurSSDOVisibilityMap, TexCoords).r;
return kD * ((diffuse * visibility) + (albedo * L_ind)) + specColor;
} else {
// Radiance mode: replace SH diffuse with direct integral
vec3 L_dir = texture(blurSSDORadianceMap, TexCoords).rgb;
return kD * albedo * (L_dir + L_ind) + specColor;
}
}
Visibility mode keeps the SH diffuse term from chapter three as the base color and modulates it by the directional visibility scalar. Open areas are unaffected; occluded areas darken proportionally to how much environment energy is blocked — directionally weighted, so an area shadowed from a bright warm sky but open toward a cool wall darkens less than SSAO would have darkened the same area. Indirect is added on top. This mode is the closest to a drop-in replacement for SSAO.
Radiance mode bypasses the SH diffuse entirely. The diffuse contribution is the integral the SSDO loop just computed — exactly what arrives from unoccluded directions, full RGB. This is more physically correct: there is no SH approximation, only the integral. It is also more sample-hungry: SH gets the low-frequency response right with nine numbers and no per-pixel work, while radiance mode reconstructs that response from N per-pixel samples.
In both modes the specular term — GGX importance sampling from chapter three — is unchanged. SSDO modulates only the diffuse component, because directional occlusion of a tight specular lobe is a different physical question that screen-space sampling can’t answer well.
Where the real-time ladder ends
The renderer at the end of this series draws PBR materials, shadowed by analytic lights and softened with moment-based filtering, with image-based ambient lighting that responds to material properties, with screen-space ambient occlusion contact shadows, and with one bounce of screen-space directional global illumination. The final scene has every term the path tracer’s first three chapters had to compute by sampling — and runs at sixty frames per second on a single GPU.
Each technique is an approximation. Shadow mapping is a depth-buffer comparison standing in for shadow rays. SH irradiance is nine coefficients standing in for an integral over the upper hemisphere. GGX importance sampling is N samples standing in for a continuous integral over the BRDF lobe. SSAO is a screen-space disc standing in for a hemisphere integral of accessibility. SSDO is one screen-space bounce standing in for the full indirect transport that path tracing solves with multiple-bounce path samples.
Each approximation has a horizon — a class of effects it cannot capture. Moment-based shadows cannot represent shadow detail finer than the shadow-map texel. SH diffuse cannot represent high-frequency irradiance variation. SSAO cannot represent occlusion from geometry not in the G-buffer. SSDO cannot represent more than one bounce, cannot represent caustics, cannot represent specular interreflection between two glossy surfaces. Each of those failure modes is the entry point for another technique: cascaded shadow maps, parallax-corrected reflection probes, voxel cone tracing, ray-traced global illumination on hardware that can spare the rays.
The honest summary of the real-time renderer is that it is a layered series of compromises, each one carefully chosen to fail in a way the eye accepts at the framerate it has to hold. The path tracer gets the right answer slowly. The deferred renderer gets a good-enough answer fast. The art is knowing which compromises break first under pressure — and the rendering equation, sitting underneath all of it, looking the same at the top of chapter one as it does at the bottom of chapter five.