Part 2: Extending AMD FidelityFX Brixelizer GI

To improve the quality of foliage in Brixelizer GI, we must modify the algorithm to support alpha-clipped geometry. In essence, we want to alter the Brixelizer GI probe trace to sample the alpha of each surface hit point. This alpha value is then used to determine if the surface is transparent. If deemed transparent, we can perform an additional trace using the surface hit point as the ray origin. This process can be repeated until we encounter an opaque surface containing the final radiance.

Alpha cache

As mentioned in Part 1, Brixelizer does not provide an API for sampling the material parameters of a surface intersection. Therefore, we need to implement another secondary cache that stores the alpha value of geometry. Like the Radiance Cache, this cache is sparse. For each Brick ID, we allocate a 4x4x4 grid of values, which captures the transparency of the Brick surface in finer detail.
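
As a rough illustration of the layout described above, the sketch below maps a Brick ID and a position inside that brick to a flat cell index. It assumes Brick IDs can be used as dense integer offsets and that the hit position is already normalized to the brick's extent; both are assumptions made for this example, and the function name `alpha_cache_index` is hypothetical rather than part of the Brixelizer API.

```python
CELLS_PER_AXIS = 4                     # 4x4x4 alpha values per brick, as described above
CELLS_PER_BRICK = CELLS_PER_AXIS ** 3  # 64 cells per brick


def alpha_cache_index(brick_id, local_uvw):
    """Return the flat Alpha Cache index for a point inside a brick.

    local_uvw is the position relative to the brick, normalized to [0, 1)
    on each axis (a convention assumed for this sketch).
    """
    # Quantize each axis to one of four cells, clamping the upper edge.
    x, y, z = (min(int(c * CELLS_PER_AXIS), CELLS_PER_AXIS - 1) for c in local_uvw)
    local_index = x + CELLS_PER_AXIS * (y + CELLS_PER_AXIS * z)
    return brick_id * CELLS_PER_BRICK + local_index
```

With this layout, every brick owns a contiguous run of 64 values, so only bricks that actually exist consume cache memory.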

Alpha buffer

As with the Radiance Cache, we also need a way to fill in the new Alpha Cache. The naive solution is to render an additional Alpha GBuffer that exports the alpha property of each surface. We can then use screen-space voxelization, as we do for the Radiance Cache, to emit this buffer into the scene. Unfortunately, this solution is not adequate, because alpha-tested cards often obscure other cards, as is the case for the grass. The image below demonstrates this: the two obscured cards will not have the majority of their alpha values exported.

Three grass cards overlap each other, causing the alpha of the overlapped cards to not be calculated

Several solutions exist to remedy this. One would be to hook the Alpha Cache up directly to the Alpha Buffer pass and use unrestricted memory accesses to store the alpha of each surface. This would require a relatively significant change to the Brixelizer GI API, so we instead opt for something simpler. To capture surfaces at each depth level, we inject a geometry shader into the Alpha Buffer pass that randomizes the Z-value of each triangle primitive. Because we perform screen-space voxelization every frame, these randomized Alpha Buffers have proved adequate for filling the Alpha Cache quickly enough.

Each frame, we randomize Z-values, so all alpha is captured over time
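
To see why randomized depth captures every card over time, consider a single pixel covered by three overlapping cards. The sketch below is a CPU-side stand-in, not the geometry shader itself: each card receives a random depth per frame, and the nearest one wins the depth test. The function names and the uniform random depth are assumptions made for illustration.

```python
import random


def visible_card(card_ids, rng):
    """Return the card that wins the depth test at one pixel this frame.

    Each card gets a fresh random depth, standing in for the per-primitive
    Z randomization; the smallest depth wins.
    """
    depths = {card: rng.random() for card in card_ids}
    return min(depths, key=depths.get)


def frames_until_all_captured(card_ids, rng, max_frames=1000):
    """Count frames until every overlapping card has been visible at least once."""
    captured = set()
    for frame in range(1, max_frames + 1):
        captured.add(visible_card(card_ids, rng))
        if len(captured) == len(card_ids):
            return frame
    return None  # not all cards were seen within max_frames


rng = random.Random(0)
frames = frames_until_all_captured(["front", "middle", "back"], rng)
```

Without randomization, the front card would win every frame and the other two would never export their alpha at that pixel.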

For the same reasons as the Radiance Cache, we also need a way to reproduce the world-space coordinates of the Alpha Buffer fragments. One way to do this is to repeat what we do in the Radiance Cache and reconstruct coordinates from the depth buffer and perspective-view matrix. Since our Alpha Buffer depth will consist of randomized values, we must write the original depth to a different output channel.

Another way of reconstructing world-space coordinates is to write the coordinates of each fragment directly to three different channels of the Alpha Buffer. This solution will make reconstruction as simple as reading the coordinates of each fragment.

Additionally, writing world-space coordinates directly to the Alpha Buffer means that Brixelizer GI does not need to know how the fragments were projected. This makes it trivial to use different perspective-view transforms than those used in the other GBuffer passes. Leveraging this simplicity allows us to easily capture out-of-view alpha information, such as leaves located above the camera that need to let light through from the sky. To do this, we render a random axis-oriented direction from the camera every other frame. We render random directions only for every other frame because we want a higher-frequency Alpha Cache in the view space. The process of rendering out-of-view directions is demonstrated in the below image, where the screen-space frustum will be rendered on the first, third, and fifth frames, while the other out-of-view frustums will be rendered on even frames.

Each frame, we render a different out-of-view perspective.
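
The alternating schedule described above could be sketched as follows. This assumes 1-based frame numbers (so the first, third, and fifth frames use the camera view) and interprets "axis-oriented direction" as one of the six world-axis directions; the function and constant names are hypothetical.

```python
import random

# The six axis-aligned directions (an interpretation assumed for this sketch).
AXIS_DIRECTIONS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def alpha_buffer_view(frame_number, camera_forward, rng):
    """Pick the view direction for this frame's Alpha Buffer pass.

    Odd frames (1-based) render the regular camera view; even frames render
    a random axis-aligned direction from the camera position.
    """
    if frame_number % 2 == 1:
        return camera_forward
    return rng.choice(AXIS_DIRECTIONS)
```

Because the Alpha Buffer stores world-space coordinates directly, the insertion pass does not need to know which of these views produced a given fragment.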

We need to add one more special case to the Alpha Buffer. In the final GI, it’s important that non-alpha-tested geometry remains fully opaque. Otherwise, we might have leaking when foliage intersects with such objects. We write a special value to the alpha channel if a fragment belongs to such an object. This will allow us to treat such fragments differently in the Alpha Cache insertion pass.

Pseudo code for calculating the final value of an Alpha Buffer fragment is shown below.

output.rgb = input.origPosition; // Write world-space coordinates
if(blendMode == Opaque) { // If fragment belongs to an opaque object
    output.a = OPAQUE_FRAGMENT; // Write special value for opaque geometry
}else{
    output.a = input.alpha; // Otherwise, just write alpha
}

Alpha cache insertion

We insert fragments into the Alpha Cache in two steps. First, we emit non-opaque fragments; the alpha value of these fragments is blended into the Alpha Cache at the reconstructed world-space coordinate. The blend factor is inversely proportional to the number of alpha fragments that have already been stored at that location in the cache; as a result, quick rough estimates are written initially and then averaged over time. The stored samples counter at the cache location is also incremented, up to a maximum of 64, at which point the blend factor is at its smallest. Each alpha fragment is emitted into the cache three times, jittered along all axes.

In the second step, we insert opaque fragments into the cache. These are jittered less to avoid bleeding onto alpha, and we also do not use any blending. Opaque fragments will overwrite the entire alpha value at their location in the cache. Additionally, they will set the stored samples counter to an above-max value of 128, which will make it harder for alpha fragments to bleed onto it.

To ensure that a stored sample counter with a value of 128 can return to 64, we decrement it (instead of incrementing) when storing alpha values and the counter is above 64.

Pseudo code for the Alpha Cache insertion is shown below.

// Alpha fragments
for(fragment in fragments) {
    ws_coord = fragment.xyz;
    if(fragment.a != OPAQUE_FRAGMENT) {
        for(i = 0; i < 3; i++) {
            jws_coord = jitter(ws_coord);
            blendFactor = 1/alphaCache[jws_coord].g;
            alphaCache[jws_coord].r = lerp(alphaCache[jws_coord].r, fragment.a, blendFactor); // Blend alpha value
            if(alphaCache[jws_coord].g < 64)
                alphaCache[jws_coord].g += 1; // Increment stored samples
            else if(alphaCache[jws_coord].g > 64)
                alphaCache[jws_coord].g -= 1; // Decrement stored samples
        }
    }
}

// Opaque fragments
for(fragment in fragments) {
    ws_coord = fragment.xyz;
    if(fragment.a == OPAQUE_FRAGMENT) {
        alphaCache[ws_coord].r = 1.0f; // Fully opaque
        alphaCache[ws_coord].g = 128; // Set stored samples
    }
}
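
The counter behavior is easier to follow in a small runnable form. The sketch below models a single cache cell and leaves out jittering and coordinate lookup. The guard against an empty cell (`max(cell.samples, 1)`) is an assumption of this sketch, since the pseudo code does not say how a zero counter is handled; the class and function names are hypothetical.

```python
MAX_SAMPLES = 64      # cap for blended alpha samples
OPAQUE_SAMPLES = 128  # above-max counter written by opaque fragments


class AlphaCell:
    """One Alpha Cache cell: a blended alpha value and a stored samples counter."""

    def __init__(self):
        self.alpha = 0.0
        self.samples = 0


def insert_alpha(cell, alpha):
    """Blend a non-opaque fragment into the cell."""
    # Assumption: an empty cell is simply overwritten by the first sample.
    blend = 1.0 / max(cell.samples, 1)
    cell.alpha += (alpha - cell.alpha) * blend  # same as lerp(cell.alpha, alpha, blend)
    if cell.samples < MAX_SAMPLES:
        cell.samples += 1
    elif cell.samples > MAX_SAMPLES:
        cell.samples -= 1  # lets an opaque cell drift back toward 64


def insert_opaque(cell):
    """Overwrite the cell with a fully opaque value."""
    cell.alpha = 1.0
    cell.samples = OPAQUE_SAMPLES


cell = AlphaCell()
insert_opaque(cell)
insert_alpha(cell, 0.0)  # a stray transparent fragment barely moves the value
```

After the opaque write, the first transparent fragment is blended with a factor of 1/128, so the cell stays close to fully opaque while its counter begins to count down toward 64.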

Tracing the cache

Finally, we need to implement a new trace function for Brixelizer GI that takes advantage of the new Alpha Cache. After a Brixelizer trace, we determine whether a hit surface is transparent stochastically. We fetch the alpha value stored at the location from the Alpha Cache; if it’s less than a random value generated between 0 and 1, it’s deemed transparent. This stochasticity will allow for surface areas that contain equal parts opaque and transparent fragments to be traced through 50% of the time, providing a sort of semi-transparency.

If a surface hit point is deemed transparent, we issue an additional Brixelizer trace with its origin on the other side of the surface. When this new trace hits a surface, we repeat the process of determining its transparency and re-tracing. The process is repeated a maximum of 16 times, after which any hit surface will be deemed opaque, no matter the stored alpha.

When an opaque surface is finally reached, the ray hit payload of that surface intersection will be returned. The pseudo code for this trace is shown below.

func new_trace(ray_description) {
    ray_hit;
    for(i = 0; i < 16; i++) {
        ray_hit = brixelizer_trace(ray_description); // Perform Brixelizer trace
        alpha = alphaCache[ray_hit.pos].r;
        if(alpha >= random()) { // If surface is opaque
            return ray_hit; // Return ray hit
        }
        ray_description.origin = ray_hit.pos + ray_description.dir * EPS; // Move the next ray origin to the other side of the surface.
    }
    return ray_hit;
}
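
To illustrate the stochastic behavior, here is a minimal runnable sketch of the same loop on a one-dimensional scene. The function `trace_nearest` is a stand-in for the Brixelizer trace, not the real API, and returning `None` when the ray misses everything is an assumption that the pseudo code above does not cover.

```python
import random

MAX_TRACES = 16  # retrace limit from the text
EPS = 1e-3       # offset that moves the origin past the hit surface


def trace_nearest(surfaces, origin):
    """Stand-in trace along +x: nearest (position, alpha) at or beyond origin."""
    ahead = [s for s in surfaces if s[0] >= origin]
    return min(ahead, key=lambda s: s[0]) if ahead else None


def alpha_trace(surfaces, origin, rng):
    """Trace through stochastically transparent surfaces until an opaque hit."""
    hit = None
    for _ in range(MAX_TRACES):
        hit = trace_nearest(surfaces, origin)
        if hit is None:
            return None  # assumption: the ray escaped the scene
        position, alpha = hit
        if alpha >= rng.random():  # surface deemed opaque
            return hit
        origin = position + EPS  # continue from the other side
    return hit  # retrace limit reached: treat the last hit as opaque


rng = random.Random(1)
scene = [(1.0, 0.5), (2.0, 1.0)]  # half-transparent leaf in front of an opaque wall
hits = [alpha_trace(scene, 0.0, rng) for _ in range(10000)]
leaf_fraction = sum(1 for h in hits if h[0] == 1.0) / len(hits)
```

In this scene, roughly half of the rays stop at the leaf and the rest continue to the wall, which is the semi-transparency effect described earlier.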

Using this new trace function, let us look at what the probes see. The image below shows the probes' view before (left) and after (right) implementing the new trace.

What the probes see with the new foliage trace

The trees have a lot more contour, and the grass becomes slightly see-through as the alpha of the strands is averaged out. This all results in light being able to pass through the grass to a certain depth, which allows the grass to sample the sky. This can be seen in the final GI result below.

The final foliage GI

The overbrightening observed with the naive solution is no longer present. We can still see a lot of detail in the shadowed area close to the camera, but the area under the trees becomes much darker since the leaves now provide indirect shadowing. We can even make out some ambient occlusion at the roots of the grass near the camera.

Animated foliage

To finish this blog series, let us talk about the challenge of implementing animated foliage in Brixelizer GI. If you recall, the Radiance Cache and now the Alpha Cache are both associated with surfaces using the Brixelizer-provided Brick ID. Unfortunately, since Brixelizer handles animated geometry by re-allocating the associated bricks, the caches cannot accumulate on such surfaces.

For the Radiance Cache, this is not a large issue. However, because we generate alpha fragments for screen-space geometry only every other frame, the Alpha Cache will be empty for such geometry half of the time. We want to average the alpha of dynamic geometry over multiple frames, but to do this, we need a new data structure for the Alpha Cache.

Spatial hash maps are a data structure that consistently associates each voxel in world space with the same memory location. To access a cell in the hash map, we quantize world-space coordinates to acquire the containing voxel's coordinates. Then, we hash these coordinates using a one-dimensional hash function, and we use the resulting hash to index the data structure.
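
The lookup described above can be sketched in a few lines. The voxel size, table size, and hash function here are illustrative choices (a multiply-and-XOR hash with large primes, which is a common pattern for spatial hashing) and are not taken from the Brixelizer GI implementation. Collisions are not handled, in line with the unoptimized approach used in this experiment.

```python
import math

VOXEL_SIZE = 0.25     # world units per voxel (hypothetical)
TABLE_SIZE = 1 << 16  # number of cells in the hash map (hypothetical)


def voxel_coords(position):
    """Quantize a world-space position to integer voxel coordinates."""
    return tuple(math.floor(c / VOXEL_SIZE) for c in position)


def spatial_hash(voxel):
    """Hash integer voxel coordinates to a single table index."""
    x, y, z = voxel
    h = (x * 73856093) ^ (y * 19349663) ^ (z * 83492791)
    return h % TABLE_SIZE  # Python's % keeps the result non-negative


def cell_index(position):
    """Return the Alpha Cache cell for any world-space position."""
    return spatial_hash(voxel_coords(position))
```

Two positions inside the same voxel always map to the same cell, regardless of which brick currently covers them, which is what lets alpha accumulate on animated geometry.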

There are many ways to optimize spatial hash maps to improve memory coherency and avoid collisions. For this experiment, we forgo all that. Below, you can see a comparison of what the probes see when capturing a rotating fan blade. The fan blade consists of an alpha-clipped card. The left video uses the standard Brick ID-based data structure, while the right video uses a spatial hash map. As can be observed, with a spatial hash map the cache can now accumulate across frames, providing approximate transparency.

What probes see using two different Alpha Cache structures applied to an animated propeller

Here is the final GI result.

The final animated propeller GI

What’s next?

If you would like to find out more about Brixelizer GI, take a look at: Introducing AMD FidelityFX™ Brixelizer.

Petter Blomkvist

Petter is a Master's student who has completed a year-long internship at AMD's European Game Engineering Team. Throughout this time, he conducted extensive research on Software-Based GI and assisted in the shipment of Brixelizer™ and Brixelizer GI™.
