
AMD FidelityFX Parallel Sort provides an open-source header implementation that makes it easy to integrate a highly optimized compute-based radix sort into your game.
Supports:
- DirectX®12
- Vulkan®
Version v1.1:
This release added the following features:
- Vulkan® implementation.
- Upgraded framework to Cauldron v1.4.
- General code cleanup and readability improvements.
Download the latest version v1.2
This release includes:
- Now part of AMD FidelityFX SDK.
- Various fixes applied.
Part of the AMD FidelityFX™ SDK

Features
State-of-the-art algorithm
Optimized for Shader Model 6.0+
Open source, MIT license
Additional features:
- Direct and Indirect execution support.
- RDNA™+ optimized algorithm.
- 32-bit key and payload sort support.
- Support for DirectX®12 API and Vulkan®.
- Shaders written in HLSL utilizing Shader Model 6.0 wave-level operations.
A sample application is provided for DirectX®12 and Vulkan®.
Details
Algorithm overview
AMD FidelityFX Parallel Sort is an RDNA™-optimized version of the Radix Sort algorithm.
At a high level, the algorithm works by iterating over the data set to be sorted (keys or key/value pairs) and re-arranging it in place, 4 bits at a time. Each pass guarantees that the data set is fully sorted with respect to the bits processed so far. For example, after 4 iterations, the data set is guaranteed to be properly sorted on the first 16 bits of the key.
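The per-pass guarantee can be illustrated with a small CPU sketch. It is not the shipped HLSL code: the helper names (digit, stable_pass, sorted_up_to) are hypothetical, and it assumes the passes run from the least significant 4 bits upwards, which is the usual arrangement for this kind of radix sort but is not stated explicitly above.

```python
BITS_PER_PASS = 4                  # digit width described in the text
NUM_BUCKETS = 1 << BITS_PER_PASS   # 16 possible digit values (0-15)


def digit(key: int, pass_index: int) -> int:
    """Return the 4-bit digit examined in pass `pass_index` (assumed: 0 = lowest bits)."""
    return (key >> (pass_index * BITS_PER_PASS)) & (NUM_BUCKETS - 1)


def stable_pass(keys: list[int], pass_index: int) -> list[int]:
    """One stable bucket pass on a single 4-bit digit (hypothetical helper)."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key in keys:
        buckets[digit(key, pass_index)].append(key)
    # Concatenating buckets 0..15 keeps equal digits in their previous order.
    return [key for bucket in buckets for key in bucket]


def sorted_up_to(keys: list[int], bits: int) -> bool:
    """True if `keys` is ordered when only the lowest `bits` bits are compared."""
    mask = (1 << bits) - 1
    return all((a & mask) <= (b & mask) for a, b in zip(keys, keys[1:]))


keys = [0xDEADBEEF, 0x0000FFFF, 0x12340001, 0xFFFF0000, 0x00010002, 0x8000BEEF]
for pass_index in range(4):        # 4 passes x 4 bits = 16 bits processed
    keys = stable_pass(keys, pass_index)

print(sorted_up_to(keys, 16))      # ordering holds for the 16 bits processed
print(keys == sorted(keys))        # the full 32-bit order is not yet guaranteed
```

Each pass has to be stable (equal digits keep their previous relative order); that is what allows a later pass to build on the ordering produced by the earlier ones.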
Each iteration performs five steps on the data set:
- The 4-bit values currently being sorted are counted into buckets 0-15, so that we know how many times each value occurs throughout the data set.
- The occurrence counts go through a reduction phase, so that offsets can later be pre-incremented on a per-thread-group basis.
- The reduced counts go through a prefix scan to calculate offset values for each value group (0-15) on a per-thread-group basis.
- The full occurrence buffer then also goes through a prefix scan, adding the reduced prefix-scan values so that the data is indexed correctly across all thread groups.
- The data set is read one more time and written to its new sorted offset location. If there is a payload, it is copied over at the same time.
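The steps above can be modelled on the CPU as follows. This is a simplified sketch under stated assumptions, not the FidelityFX shader code: radix_pass, parallel_sort_reference and GROUP_SIZE are hypothetical names, a "thread group" is modelled as a fixed-size slice of the input, the reduction is done in a single level, the passes are assumed to start at the lowest bits, and the scatter writes to a separate output list rather than re-arranging in place.

```python
BITS_PER_PASS = 4
NUM_BUCKETS = 1 << BITS_PER_PASS   # buckets 0-15
GROUP_SIZE = 4                     # hypothetical number of elements per thread group


def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    out, running = [], 0
    for value in values:
        out.append(running)
        running += value
    return out


def radix_pass(keys, payloads, shift):
    """One 4-bit pass over keys and payloads (simplified CPU model)."""
    groups = [range(start, min(start + GROUP_SIZE, len(keys)))
              for start in range(0, len(keys), GROUP_SIZE)]
    num_groups = len(groups)

    # Step 1: count how often each 4-bit value occurs, per thread group.
    counts = [[0] * NUM_BUCKETS for _ in groups]
    for g, group in enumerate(groups):
        for i in group:
            counts[g][(keys[i] >> shift) & (NUM_BUCKETS - 1)] += 1

    # Step 2: reduce the per-group counts to one total per value.
    reduced = [sum(counts[g][d] for g in range(num_groups))
               for d in range(NUM_BUCKETS)]

    # Step 3: scan the reduced totals to get the base offset of each value.
    value_base = exclusive_scan(reduced)

    # Step 4: scan the full count buffer per value and add the reduced
    # scan result, giving each group its own write offset for each value.
    offsets = [[0] * NUM_BUCKETS for _ in groups]
    for d in range(NUM_BUCKETS):
        group_scan = exclusive_scan([counts[g][d] for g in range(num_groups)])
        for g in range(num_groups):
            offsets[g][d] = value_base[d] + group_scan[g]

    # Step 5: read the data once more and scatter keys and payloads.
    out_keys = [0] * len(keys)
    out_payloads = [0] * len(keys)
    for g, group in enumerate(groups):
        for i in group:
            d = (keys[i] >> shift) & (NUM_BUCKETS - 1)
            dst = offsets[g][d]
            offsets[g][d] += 1
            out_keys[dst] = keys[i]
            out_payloads[dst] = payloads[i]
    return out_keys, out_payloads


def parallel_sort_reference(keys, payloads, key_bits=32):
    """Run one pass per 4-bit digit (8 passes for 32-bit keys)."""
    for shift in range(0, key_bits, BITS_PER_PASS):
        keys, payloads = radix_pass(keys, payloads, shift)
    return keys, payloads


keys = [0xDEADBEEF, 7, 0x00010002, 7, 0xFFFF0000, 0, 0x8000BEEF, 42, 0x12340001]
payloads = list(range(len(keys)))  # payload = original index of each key
sorted_keys, sorted_payloads = parallel_sort_reference(keys, payloads)
print(sorted_keys == sorted(keys))
```

Using the original index as the payload makes it easy to check that every payload still travels with its key after all passes have run.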
Once all iterations have run (8 iterations for 32-bit keys, at 4 bits per iteration), the entire data set is sorted.
Additional resources
Comparison: GPU particle sorting


Comparison: image index buffer sorting


Version history
- Vulkan® implementation.
- Upgraded framework to Cauldron v1.4.
- General code cleanup and readability improvements.
- Initial release
Part of the AMD FidelityFX SDK

AMD FidelityFX SDK
The AMD FidelityFX SDK is our easy-to-integrate solution for developers looking to include FidelityFX features into their games.
More AMD FidelityFX effects

AMD FidelityFX Blur
AMD FidelityFX Blur is an AMD RDNA™ architecture optimized collection of blur kernels from 3×3 up to 21×21.

AMD FidelityFX Combined Adaptive Compute Ambient Occlusion (CACAO)
AMD FidelityFX Combined Adaptive Compute Ambient Occlusion (CACAO) is an AMD RDNA™ architecture optimized implementation of ambient occlusion.

AMD FidelityFX Contrast Adaptive Sharpening (CAS)
AMD FidelityFX Contrast Adaptive Sharpening (CAS) combines the ability to sharpen an image with the option to scale it.

AMD FidelityFX Denoiser
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.

AMD FidelityFX Depth of Field (DoF)
AMD FidelityFX Depth of Field is an AMD RDNA™-architecture optimized implementation of physically correct camera-based depth of field.

AMD FidelityFX Hybrid Shadows sample
This sample demonstrates how to combine ray traced shadows and rasterized shadow maps together to achieve high quality and performance.

AMD FidelityFX Hybrid Stochastic Reflections sample
This sample shows how to combine AMD FidelityFX Stochastic Screen Space Reflections (SSSR) with ray tracing in order to create high quality reflections.

AMD FidelityFX Lens
AMD FidelityFX Lens is an AMD RDNA™ architecture optimized implementation of some of gaming’s most used post-processing effects.

AMD FidelityFX Luminance Preserving Mapper (HDR Mapper)
AMD FidelityFX LPM provides an open-source library to easily integrate HDR and wide gamut tone and gamut mapping into your game.

AMD FidelityFX Single Pass Downsampler (SPD)
AMD FidelityFX Single Pass Downsampler (SPD) provides an AMD RDNA™ architecture optimized solution for generating up to 12 MIP levels of a texture.

AMD FidelityFX Stochastic Screen Space Reflections (SSSR)
The AMD FidelityFX SSSR effect provides an open-source library to easily integrate stochastic screen space reflections into your game.

AMD FidelityFX Super Resolution 1 (FSR 1)
AMD FidelityFX Super Resolution (FSR) is our open-source, high-quality, high-performance upscaling solution.

AMD FidelityFX Super Resolution 2 (FSR 2)
Learn even more about our new open-source temporal upscaling solution FSR 2, and get the source code and documentation!

AMD FidelityFX Variable Shading
AMD FidelityFX Variable Shading drives Variable Rate Shading into your game.

Radeon™ Cauldron Framework
Radeon Cauldron is our open-source experimentation framework for DirectX®12 and Vulkan®, provided in the AMD FidelityFX SDK.
Other effects on GPUOpen

TressFX
The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.