FidelityFX Single Pass Downsampler (SPD) provides an RDNA-optimized solution for generating up to 12 MIP levels of a texture.

| DOWNLOAD - Latest version 2.0

This release adds the following features:

  • Support for cube textures and array textures. The SpdDownsampler function takes the slice index as a new parameter. The total number of slices is passed in with the z component of your dispatch size.
  • Support for downsampling a sub-rectangle, in case only a known region of the source texture has been updated.
  • CPU-side helper function that computes the required constants:
    • Dispatch size x and y.
    • Thread group offsets in case only a sub-rectangle has modified data.
    • Number of thread groups.
    • Number of MIPs.
  • Automatic reset of the global atomic counter after each SPD run: the counter now needs to be initialized only once, before the first run of SPD.
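The constants listed above can be sketched on the CPU as follows. This is a minimal illustration, not the helper shipped with SPD: the names `Rect`, `DispatchInfo` and `ComputeDispatch` are hypothetical, one thread group per 64×64 patch is assumed, and the MIP count is derived here from the larger side of the rectangle, which is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: each thread group processes one 64x64 patch of the source.
constexpr uint32_t kPatchSize = 64;
// Upper limit on generated MIP levels, as stated in the feature list.
constexpr uint32_t kMaxMips = 12;

// Hypothetical types for illustration only; not part of the SPD header.
struct Rect {
    uint32_t left, top, width, height;  // in pixels of the source texture
};

struct DispatchInfo {
    uint32_t dispatchX, dispatchY;        // thread groups to dispatch in x and y
    uint32_t groupOffsetX, groupOffsetY;  // first patch touched by the rectangle
    uint32_t numThreadGroups;             // per slice (slices go into dispatch z)
    uint32_t numMips;
};

// floor(log2(v)) for v >= 1, using integer arithmetic only.
inline uint32_t FloorLog2(uint32_t v) {
    uint32_t result = 0;
    while (v >>= 1) ++result;
    return result;
}

inline DispatchInfo ComputeDispatch(const Rect& rect) {
    assert(rect.width > 0 && rect.height > 0);
    DispatchInfo info{};
    // Index of the first and last patch overlapped by the rectangle.
    info.groupOffsetX = rect.left / kPatchSize;
    info.groupOffsetY = rect.top / kPatchSize;
    const uint32_t lastX = (rect.left + rect.width - 1) / kPatchSize;
    const uint32_t lastY = (rect.top + rect.height - 1) / kPatchSize;
    info.dispatchX = lastX - info.groupOffsetX + 1;
    info.dispatchY = lastY - info.groupOffsetY + 1;
    info.numThreadGroups = info.dispatchX * info.dispatchY;
    info.numMips = std::min(FloorLog2(std::max(rect.width, rect.height)), kMaxMips);
    return info;
}
```

Under these assumptions, a full 512×512 texture yields an 8×8 dispatch (64 thread groups), while a 256×192 rectangle starting at pixel (64, 128) yields a 4×3 dispatch (12 thread groups) with thread group offsets (1, 2).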

| FEATURES

Downsample in a single pass

RDNA optimized

Use SPD with existing postprocessing

MIP levels are versions of the same texture at progressively smaller resolutions. They are used when the full-resolution texture is not needed (for example, when an object is far from the camera and covers only a few pixels) or when sampling it would introduce aliasing artefacts. MIP levels are also commonly used in effects such as Bloom, Screen Space Reflections, and many more.

Use FidelityFX SPD as a building block to accelerate your post processing pipeline or texture creation.

More key features:

  • Generates up to 12 mip levels (maximum source texture size is 4096x4096).
  • Single function call.
  • User defined 2x2 reduction function.
  • User controlled border handling.
  • Supports various image formats.
  • HLSL and GLSL versions available.
  • Rapid Packed Math support.
  • Optionally uses subgroup operations / SM6+ wave operations, which can improve performance.
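The pairing of 12 MIP levels with a 4096×4096 source in the list above follows from simple halving. As a worked count, assuming each level halves the resolution and the chain ends at 1×1:

```latex
N_{\text{mips}} = \left\lfloor \log_2 \max(w, h) \right\rfloor,
\qquad
w = h = 4096 \;\Rightarrow\; N_{\text{mips}} = \log_2 4096 = 12
```

The 12 generated levels are then 2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2 and 1 pixels wide.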

The traditional approach vs. FidelityFX SPD

Generating these MIP levels is simple in principle: each 2×2 pixel quad is averaged to produce one pixel of the next lower MIP, which has half the resolution in each dimension. The most common approach generates the MIP levels one after another, which requires synchronization after each step.
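The traditional level-by-level approach can be sketched on the CPU as follows. This is a reference illustration only, assuming a single-channel float image with power-of-two dimensions; `Image`, `Average4`, `DownsampleOnce` and `BuildMipChain` are hypothetical names. The reduction is passed in as a parameter to mirror the idea of a user-defined 2×2 reduction function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image type for illustration.
struct Image {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<float> texels;  // row-major
    float at(std::size_t x, std::size_t y) const { return texels[y * width + x]; }
};

// Default reduction: plain average of a 2x2 quad.
inline float Average4(float a, float b, float c, float d) {
    return (a + b + c + d) * 0.25f;
}

// Produces the next lower MIP by reducing every 2x2 quad to one texel.
template <typename Reduce>
Image DownsampleOnce(const Image& src, Reduce reduce) {
    assert(src.width >= 2 && src.height >= 2);
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    Image dst;
    dst.width = src.width / 2;
    dst.height = src.height / 2;
    dst.texels.resize(dst.width * dst.height);
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            dst.texels[y * dst.width + x] =
                reduce(src.at(2 * x, 2 * y), src.at(2 * x + 1, 2 * y),
                       src.at(2 * x, 2 * y + 1), src.at(2 * x + 1, 2 * y + 1));
        }
    }
    return dst;
}

// Builds MIP 1..N one level at a time. On a GPU, each loop iteration would be
// a separate pass that has to wait for the previous one to finish.
template <typename Reduce>
std::vector<Image> BuildMipChain(const Image& mip0, Reduce reduce) {
    std::vector<Image> chain;
    Image current = mip0;
    while (current.width > 1 && current.height > 1) {
        current = DownsampleOnce(current, reduce);
        chain.push_back(current);
    }
    return chain;
}
```

Passing a different callable, for example one that returns the minimum or maximum of the four inputs, changes the reduction without touching the loop structure.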

FidelityFX SPD instead works on patches, downsampling each patch individually in a single-pass compute shader, without the need to synchronize the whole GPU after each step. The absence of these in-between synchronizations makes SPD suitable for async compute.
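One way to picture how a single pass can finish the whole chain without global synchronization is the "last group finishes the job" pattern built on a global atomic counter. The following is a CPU-side simulation under simplifying assumptions, not SPD's shader code: `SinglePassState` and `RunThreadGroup` are hypothetical names, and reducing a patch is modelled as taking the mean of its texels.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical shared state for the simulation.
struct SinglePassState {
    std::atomic<uint32_t> finishedGroups{0};  // global counter, initialized once
    std::vector<float> patchResults;          // coarsest texel produced per patch
    float finalResult = 0.0f;                 // stands in for the remaining MIP levels
    uint32_t finalWrites = 0;                 // how often the final step has run
};

inline float Mean(const std::vector<float>& values) {
    assert(!values.empty());
    const float sum = std::accumulate(values.begin(), values.end(), 0.0f);
    return sum / static_cast<float>(values.size());
}

// Simulates one thread group. Returns true if this group was the last to finish.
inline bool RunThreadGroup(SinglePassState& state, uint32_t groupIndex,
                           const std::vector<float>& patchTexels) {
    assert(groupIndex < state.patchResults.size());
    // Phase 1: every group reduces its own patch independently.
    state.patchResults[groupIndex] = Mean(patchTexels);

    // Phase 2: announce completion; the value returned by fetch_add tells this
    // group how many other groups had already finished.
    const uint32_t numGroups = static_cast<uint32_t>(state.patchResults.size());
    const uint32_t finishedBefore =
        state.finishedGroups.fetch_add(1, std::memory_order_acq_rel);
    if (finishedBefore + 1 != numGroups) {
        return false;  // not the last group: nothing left to do
    }

    // Phase 3: only the last group continues and reduces the per-patch results.
    state.finalResult = Mean(state.patchResults);
    ++state.finalWrites;
    // Reset the counter so the next run needs no separate initialization.
    state.finishedGroups.store(0, std::memory_order_release);
    return true;
}
```

Regardless of the order in which the simulated groups run, exactly one of them reports being last, and the counter is back at zero afterwards, which is what allows a second run without re-initialization.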

New for v2.0 - Support for downsampling sub-rectangles

Why update the whole texture when only a known region has been updated? Here’s an example:

Default approach: 

  • Invokes 64 thread groups.

With the sub-rectangle feature:

  • Invokes 12 thread groups.
  • Only downsamples a subrectangle covering all patches with modified data.
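The thread group counts in this example can be reproduced with a small calculation. The sketch below finds the smallest rectangle of patches covering all modified patches. `PatchCoord`, `PatchRect` and `BoundingPatchRect` are hypothetical names, and the patch coordinates in the usage note are made up for illustration; they are not taken from the original figure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Position of a patch in units of patches (not pixels).
struct PatchCoord {
    uint32_t x, y;
};

// A rectangle of patches; one thread group is assumed per patch.
struct PatchRect {
    uint32_t x0, y0;          // first patch column and row
    uint32_t countX, countY;  // number of patches in x and y
    uint32_t groups() const { return countX * countY; }
};

// Smallest patch-aligned rectangle that covers every modified patch.
inline PatchRect BoundingPatchRect(const std::vector<PatchCoord>& modified) {
    assert(!modified.empty());
    uint32_t minX = modified[0].x, maxX = modified[0].x;
    uint32_t minY = modified[0].y, maxY = modified[0].y;
    for (const PatchCoord& p : modified) {
        minX = std::min(minX, p.x);
        maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);
        maxY = std::max(maxY, p.y);
    }
    return PatchRect{minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```

A 512×512 texture split into 64×64 patches forms an 8×8 grid, so the default approach dispatches 64 thread groups. If the modified patches were, say, (2, 1), (3, 2) and (5, 3), the bounding rectangle would span 4×3 patches and need only 12 thread groups.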

Source texture: 512×512, divided into 64×64 patches. The figure marks each patch as holding either unmodified or modified data.

| VERSION HISTORY

  • Initial release

| RELATED VIDEO PRESENTATION

Optimizing for the Radeon™ RDNA Architecture

Lou Kramer, Developer Technology Engineer

When AMD introduced its Navi family of GPUs, it also introduced a whole new GPU architecture: RDNA. This architecture is not only used in AMD GPUs for PC, but also in next-generation consoles. Join the session to learn about the details of RDNA and how it differs from the previous GCN architecture. We will also be presenting examples of optimizations based on the case study of implementing an efficient downsampler covering topics such as characteristics of workload distribution, shader optimizations, and efficient texture access.


| OUR OTHER EFFECTS

FEMFX

A multithreaded CPU library for deformable material physics, using the Finite Element Method (FEM)

Cauldron Framework

Radeon™ Cauldron is our open source experimentation framework for DirectX®12 and Vulkan®.

DepthOfFieldFX

The DepthOfFieldFX library provides a GCN-optimized Compute Shader implementation of Depth of Field using the Fast Filter Spreading approach.

GeometryFX

GeometryFX improves the rasterizer efficiency by culling triangles that do not contribute to the output in a pre-pass. This allows the full chip to be used to process geometry, and ensures that the rasterizer only processes triangles that are visible.

ShadowFX

ShadowFX library provides a scalable GCN-optimized solution for deferred shadow filtering. It supports uniform and contact hardening shadow (CHS) kernels.

TressFX

The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.