AMD FidelityFX Single Pass Downsampler (SPD) provides an RDNA™-optimized solution for generating up to 12 MIP levels of a texture.
Download the latest version - v2.0
This release adds the following features:
- Support for cube textures and array textures. The SpdDownsampler function takes the slice index as a new parameter. The total number of slices is passed in with the z component of your dispatch size.
- Support for downsampling a sub-rectangle, in case only a known region of the source texture has been updated.
- CPU-side helper function to compute constants:
  - Dispatch size x and y.
  - Thread group offsets in case only a sub-rectangle has modified data.
  - Number of thread groups.
  - Number of MIPs.
- Automatic reset of the global atomic counter after each SPD run: the counter now needs to be initialized only once, before the first run of SPD.
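To make the constants listed above more concrete, here is a minimal Python sketch of how such values could be derived on the CPU. It is not the helper shipped with SPD: the function name `spd_setup_sketch`, its parameters, and the returned dictionary are hypothetical, and it assumes that each thread group covers one 64×64 patch of the source, that the number of MIPs defaults to the floor of log2 of the larger rectangle side (capped at 12), and that the thread group count is per slice.

```python
PATCH_SIZE = 64  # assumption: one thread group covers a 64x64 patch of the source
MAX_MIPS = 12    # SPD generates up to 12 MIP levels


def spd_setup_sketch(rect, slices=1, mips=None):
    """Hypothetical helper. rect = (left, top, width, height) in source texels."""
    left, top, width, height = rect

    # Thread group offset: index of the first patch touched by the rectangle.
    offset_x = left // PATCH_SIZE
    offset_y = top // PATCH_SIZE

    # Index of the last patch touched by the rectangle.
    end_x = (left + width - 1) // PATCH_SIZE
    end_y = (top + height - 1) // PATCH_SIZE

    dispatch_x = end_x - offset_x + 1
    dispatch_y = end_y - offset_y + 1

    if mips is None:
        # Assumption: floor(log2(longest side)), capped at the 12-MIP limit.
        mips = min(max(width, height).bit_length() - 1, MAX_MIPS)

    return {
        # The slice count goes into the z component of the dispatch size.
        "dispatch": (dispatch_x, dispatch_y, slices),
        "thread_group_offset": (offset_x, offset_y),
        "num_thread_groups": dispatch_x * dispatch_y,  # per slice (assumption)
        "num_mips": mips,
    }


# Full 512x512 cube texture: six slices, no sub-rectangle.
constants = spd_setup_sketch((0, 0, 512, 512), slices=6)
print(constants)
```

For real integrations, the values should come from the helper provided with the SPD release rather than from a re-derivation like this one.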
Downsample in a single pass
MIP levels are versions of the same texture at progressively smaller resolutions. They are used when the full-resolution texture is not needed (for example, when an object is far from the camera and covers only a few pixels) or when it would introduce aliasing artefacts. MIP levels are also commonly used in effects such as bloom, screen space reflections, and many more.
Use SPD with existing postprocessing
Use AMD FidelityFX SPD as a building block to accelerate your post-processing pipeline or texture creation.
More key features:
The traditional approach
Generating these MIP levels is simple in principle. To produce the next lower MIP, each 2×2 pixel quad is averaged, which yields a texture with half the resolution in each dimension. The most common approach generates the MIP levels one after another, which requires synchronization after each step.
AMD FidelityFX SPD
AMD FidelityFX SPD instead works on patches, downsampling each patch individually in a single-pass compute shader, without having to synchronize the whole GPU after each step. The absence of these intermediate synchronization points makes SPD suitable for async compute.
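The difference between the two approaches can be illustrated with a small CPU-side simulation in Python. This is only a sketch under stated assumptions, not the SPD shader: the function names are hypothetical, the image is square with a power-of-two size, the thread groups are run one after another in a loop (on a GPU they run concurrently), and the idea that the last group to finish produces the remaining MIPs is an assumption about how the global atomic counter is used.

```python
def downsample_2x2(image):
    """Average each 2x2 quad of a square, power-of-two image (list of rows)."""
    half = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(half)
        ]
        for y in range(half)
    ]


def traditional_mip_chain(image):
    """One pass per MIP; on a GPU every iteration is a dispatch plus a barrier."""
    mips = []
    while len(image) > 1:
        image = downsample_2x2(image)
        mips.append(image)
    return mips


def single_pass_mip_chain(image, patch):
    """Patch-based scheme; patch must be a power of two >= 2 that divides the size."""
    n = len(image)
    groups_per_side = n // patch
    num_groups = groups_per_side * groups_per_side
    levels_per_patch = patch.bit_length() - 1  # MIPs a group can finish alone
    mips = [[[0.0] * (n >> (l + 1)) for _ in range(n >> (l + 1))]
            for l in range(levels_per_patch)]
    counter = 0  # stands in for the global atomic counter

    for gy in range(groups_per_side):
        for gx in range(groups_per_side):
            # Each "thread group" reduces its own patch without any global sync.
            tile = [row[gx * patch:(gx + 1) * patch]
                    for row in image[gy * patch:(gy + 1) * patch]]
            for l, level in enumerate(traditional_mip_chain(tile)):
                size = len(level)
                for y in range(size):
                    mips[l][gy * size + y][gx * size:(gx + 1) * size] = level[y]

            previous = counter  # an atomic add would return the old value
            counter += 1
            if previous == num_groups - 1:
                # Assumption: the last group to finish computes the remaining
                # MIPs from the per-patch results, then resets the counter.
                mips.extend(traditional_mip_chain(mips[-1]))
                counter = 0
    return mips, counter


image = [[float(x + 8 * y) for x in range(8)] for y in range(8)]
mips, counter = single_pass_mip_chain(image, patch=4)
print(mips == traditional_mip_chain(image), counter)
```

Both functions average the same 2×2 quads, so they produce identical MIP chains; the patch-based version simply avoids a global synchronization point between levels, and the counter ends at zero, ready for the next run.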
New for v2.0 - Support for downsampling sub-rectangles
Why update the whole texture when only a known region has been updated? Here’s an example using a 512×512 source texture divided into 64×64 patches, some containing modified data and the rest unmodified.
Without the sub-rectangle feature:
- Invokes 64 thread groups.
With the sub-rectangle feature:
- Invokes 12 thread groups.
- Only downsamples a sub-rectangle covering all patches with modified data.
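The thread group counts in this example follow from counting the 64×64 patches that a rectangle touches. The short Python sketch below reproduces the arithmetic; the function name is hypothetical, and the modified rectangle is an illustrative choice that happens to span 4×3 patches, since the exact region used in the example is not stated.

```python
PATCH = 64  # patch size in texels, as in the example


def thread_groups(left, top, width, height):
    """Count the 64x64 patches overlapped by a rectangle given in texels."""
    first_x, first_y = left // PATCH, top // PATCH
    last_x = (left + width - 1) // PATCH
    last_y = (top + height - 1) // PATCH
    return (last_x - first_x + 1) * (last_y - first_y + 1)


full = thread_groups(0, 0, 512, 512)         # whole 512x512 texture
partial = thread_groups(100, 150, 200, 130)  # hypothetical modified region
print(full, partial)  # prints "64 12"
```

Note that the rectangle does not need to be aligned to patch boundaries: any patch it overlaps is included in the dispatch.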
Optimizing for the Radeon™ RDNA Architecture
Lou Kramer, Developer Technology Engineer
When AMD introduced its Navi family of GPUs, it also introduced a whole new GPU architecture: RDNA™. This architecture is not only used in AMD GPUs for PC, but also in next-generation consoles. Join the session to learn about the details of RDNA and how it differs from the previous GCN architecture. We will also be presenting examples of optimizations based on the case study of implementing an efficient downsampler covering topics such as characteristics of workload distribution, shader optimizations, and efficient texture access.
More AMD FidelityFX effects
AMD FidelityFX Parallel Sort makes sorting data on the GPU quicker and easier. Use our SM6.0 compute shaders to get your data in order.
AMD FidelityFX Variable Shading drives Variable Rate Shading into your game.
AMD FidelityFX Denoiser is a denoising compute shader which removes artefacts from reflection rendering.
AMD FidelityFX LPM provides an open source library to easily integrate HDR and wide gamut tone and gamut mapping into your game.
The AMD FidelityFX SSSR effect provides an open source library to easily integrate stochastic screen space reflections into your game.
AMD FidelityFX Combined Adaptive Compute Ambient Occlusion (CACAO) is an RDNA-optimized implementation of ambient occlusion.
Radeon™ Cauldron is our open source experimentation framework for DirectX®12 and Vulkan®.
AMD FidelityFX Contrast Adaptive Sharpening (CAS) provides a mixed ability to sharpen and optionally scale an image.
Our other effects
A multithreaded CPU library for deformable material physics, using the Finite Element Method (FEM)
The DepthOfFieldFX library provides a GCN-optimized Compute Shader implementation of Depth of Field using the Fast Filter Spreading approach.
GeometryFX improves the rasterizer efficiency by culling triangles that do not contribute to the output in a pre-pass. This allows the full chip to be used to process geometry, and ensures that the rasterizer only processes triangles that are visible.
The ShadowFX library provides a scalable GCN-optimized solution for deferred shadow filtering. It supports uniform and contact hardening shadow (CHS) kernels.
The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.