AMD FidelityFX™ Single Pass Downsampler (SPD)

AMD FidelityFX™ Single Pass Downsampler (SPD) provides an AMD RDNA™-optimized solution for generating up to 12 MIP levels of a texture.

Supports:

DirectX®12 Ultimate.
Vulkan®.

Part of the AMD FidelityFX™ SDK

Download the latest version - v2.0

Download the latest version as part of FidelityFX SDK v1.1

Features

Downsample in a single pass

RDNA™ optimized

Use SPD with existing postprocessing

Details

MIP levels are versions of the same texture at progressively smaller resolutions. They are used when the high-resolution texture is not needed (such as when objects are far from the camera and cover only a few pixels) or when it might introduce aliasing artefacts. MIP levels are also commonly used in effects such as Bloom, Screen Space Reflections, and many more.

Use AMD FidelityFX SPD as a building block to accelerate your post-processing pipeline or texture creation.

More key features

Generates up to 12 MIP levels (maximum source texture size is 4096×4096).
Single function call.
User defined 2x2 reduction function.
User controlled border handling.
Supports various image formats.
HLSL and GLSL versions available.
Rapid Packed Math support.
Optionally uses subgroup operations / SM6+ wave operations, which can improve performance.

Generating these MIP levels is, in principle, quite simple. For the next lower MIP, each 2×2 pixel quad is averaged into a single pixel, resulting in a texture with half the resolution in each dimension. The most common approach is to generate these MIP levels one after the other, which requires synchronization after each step.
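The level-by-level approach can be sketched on the CPU as follows. This is only an illustration, assuming a square, power-of-two image stored as nested lists; the function names are hypothetical and are not part of SPD. The `reduce_quad` parameter stands in for the user-defined 2×2 reduction function.

```python
def average(quad):
    """Default 2x2 reduction: the mean of the four texels."""
    return sum(quad) / len(quad)


def downsample_2x2(image, reduce_quad=average):
    """Return the next lower MIP: every 2x2 quad collapses to one texel."""
    height, width = len(image), len(image[0])
    return [
        [
            reduce_quad([
                image[y][x], image[y][x + 1],
                image[y + 1][x], image[y + 1][x + 1],
            ])
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def build_mip_chain(image, reduce_quad=average):
    """Generate MIPs one after the other; each step reads the previous result.

    On a GPU, every iteration of this loop would be a separate pass with a
    synchronization point before the next one can start.
    """
    chain = []
    while len(image) > 1:
        image = downsample_2x2(image, reduce_quad)
        chain.append(image)
    return chain


source = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 8, 8],
    [2, 2, 8, 8],
]
mips = build_mip_chain(source)           # mean reduction
max_mips = build_mip_chain(source, max)  # a different reduction, e.g. max
```

Swapping `average` for `min` or `max` changes only the reduction applied to each quad; the traversal stays the same.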

AMD FidelityFX SPD instead works on patches, downsampling each patch individually using a single-pass compute shader, without the requirement to synchronize the whole GPU after each step. The absence of these in-between synchronizations makes SPD well suited to async compute.
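The patch-based idea can be simulated sequentially on the CPU. The sketch below is not the SPD shader: it assumes a square, power-of-two source, runs the "thread groups" one after another in a loop, and models the global atomic counter with a plain integer. All names are hypothetical. Each group reduces its own patch down to 1×1, and the group that finishes last computes the remaining MIPs and resets the counter.

```python
def average(quad):
    return sum(quad) / len(quad)


def downsample_2x2(image, reduce_quad):
    """Collapse every 2x2 quad of a square image into one texel."""
    return [
        [
            reduce_quad([image[y][x], image[y][x + 1],
                         image[y + 1][x], image[y + 1][x + 1]])
            for x in range(0, len(image[0]), 2)
        ]
        for y in range(0, len(image), 2)
    ]


def single_pass_downsample(source, tile=64, reduce_quad=average):
    size = len(source)                    # assumes square, power-of-two, >= 2
    groups_per_side = max(size // tile, 1)
    tile = min(tile, size)
    total_groups = groups_per_side ** 2
    tile_levels = tile.bit_length() - 1   # log2(tile) MIPs produced per patch

    # Pre-allocate all MIP levels; each group writes only its own region.
    mips, level_size = [], size // 2
    while level_size >= 1:
        mips.append([[None] * level_size for _ in range(level_size)])
        level_size //= 2

    counter = 0                           # stand-in for the global atomic counter
    for gy in range(groups_per_side):
        for gx in range(groups_per_side):
            # One "thread group": reduce its patch all the way down to 1x1.
            patch = [row[gx * tile:(gx + 1) * tile]
                     for row in source[gy * tile:(gy + 1) * tile]]
            for level in range(tile_levels):
                patch = downsample_2x2(patch, reduce_quad)
                edge = len(patch)
                for y in range(edge):
                    mips[level][gy * edge + y][gx * edge:(gx + 1) * edge] = patch[y]

            # Emulated atomic increment that returns the previous value.
            previous, counter = counter, counter + 1
            if previous == total_groups - 1:
                # Last group to finish: all patch results are available,
                # so it computes the remaining MIPs on its own.
                image = mips[tile_levels - 1]
                for level in range(tile_levels, len(mips)):
                    image = downsample_2x2(image, reduce_quad)
                    mips[level] = image
                counter = 0               # reset so the next run needs no re-init
    return mips, counter
```

No step in this loop waits for all groups to finish a level before the next level starts; only the final group depends on the others, which is what the counter is for.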

The traditional approach vs. AMD FidelityFX SPD

New for v2.0 - Support for downsampling sub-rectangles

Default approach:

Invokes 64 thread groups.

With the sub-rectangle feature:

Invokes 12 thread groups.
Only downsamples a sub-rectangle covering all patches with modified data.

Note: The last active thread group still computes the last 4 MIPs in both cases, i.e. 8×8 → 4×4 → 2×2 → 1×1.

Source texture: 512×512

64×64 patch with unmodified data

64×64 patch with modified data
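The thread group counts above follow from how many 64×64 patches a rectangle touches. The sketch below shows one way the constants could be derived on the CPU. It is not the SDK's helper function: the name `spd_setup_sketch`, its parameters, and the returned dictionary are hypothetical, and the example sub-rectangle is an assumed one, chosen so that it touches 12 patches.

```python
TILE = 64       # each thread group covers one 64x64 patch of the source
MAX_MIPS = 12   # upper bound on generated MIP levels stated in this document


def spd_setup_sketch(rect, texture_size, slices=1):
    """Derive dispatch constants for a modified rectangle.

    rect         -- (left, top, width, height) in texels
    texture_size -- (width, height) of the full source texture
    slices       -- number of array/cube slices, passed as dispatch z
    """
    left, top, width, height = rect

    # First and last patch index touched by the rectangle on each axis.
    offset_x, offset_y = left // TILE, top // TILE
    end_x = (left + width - 1) // TILE
    end_y = (top + height - 1) // TILE

    dispatch_x = end_x - offset_x + 1
    dispatch_y = end_y - offset_y + 1

    # floor(log2(largest dimension)), capped at the 12-level maximum.
    mips = min(max(texture_size).bit_length() - 1, MAX_MIPS)

    return {
        "dispatch": (dispatch_x, dispatch_y, slices),
        "thread_group_offset": (offset_x, offset_y),
        "num_thread_groups": dispatch_x * dispatch_y,
        "mips": mips,
    }


# Whole 512x512 texture: 8 x 8 patches.
full = spd_setup_sketch((0, 0, 512, 512), (512, 512))

# Assumed sub-rectangle that touches 4 x 3 patches.
partial = spd_setup_sketch((100, 70, 200, 150), (512, 512))
```

With these inputs, `full["num_thread_groups"]` is 64 and `partial["num_thread_groups"]` is 12, matching the two cases described above.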

Optimizing for the Radeon™ RDNA Architecture

Lou Kramer, Developer Technology Engineer

When AMD introduced its Navi family of GPUs, it also introduced a whole new GPU architecture: RDNA™. This architecture is not only used in AMD GPUs for PC, but also in next-generation consoles. Join the session to learn about the details of RDNA and how it differs from the previous GCN architecture. We will also be presenting examples of optimizations based on the case study of implementing an efficient downsampler covering topics such as characteristics of workload distribution, shader optimizations, and efficient texture access.

Optimizing for the Radeon™ RDNA Architecture (Let’s build 2020) – YouTube link

Version history

Version 1.1 - FidelityFX SDK (July 2024)

Updated as part of AMD FidelityFX SDK v1.1:

AMD FidelityFX backend updates, including buffer allocator overrides
Updated documentation and release of reference documentation for SDK + Framework
Native Microsoft® GDK® backend implementation library (requires developer access to GDK® program)

Version 2.1 (June 2023)

Now part of AMD FidelityFX SDK.
Allows you to select the filter kernel. Previously, only the mean was available, but now you can use the min and max as well.

Version 2.0 (August 2020)

Support for cube textures and array textures: the SpdDownsampler function takes the slice index as a new parameter. The total number of slices is passed in the z component of your dispatch size.
Support for downsampling a sub-rectangle, in case only a known region of the source texture has been updated.
CPU side helper function to compute constants:
- dispatch size x and y
- thread group offsets in case only a sub-rectangle has modified data
- number of thread groups
- number of MIPs
Automatic reset of the global atomic counter after each SPD run: only a single initialization, before the first run of SPD, is now required.