
AMD FidelityFX™ Parallel Sort provides an open-source header implementation that makes it easy to integrate a highly optimized compute-based radix sort into your game.
Supports:
- DirectX®12
- Vulkan®
Part of the AMD FidelityFX™ SDK

Download the latest version as part of FidelityFX SDK v1.1
AMD FidelityFX Parallel Sort:
Cleanup of GPU particle code to make better use of FidelityFX Parallel Sort.
Updated as part of AMD FidelityFX SDK v1.1:
- AMD FidelityFX backend updates, including buffer allocator overrides
- Updated documentation and release of reference documentation for SDK + Framework
- Native Microsoft® GDK® backend implementation library (requires developer access to GDK® program)
Features
State-of-the-art algorithm
Optimized for Shader Model 6.0+
Open source, MIT license
Additional features:
- Direct and Indirect execution support.
- AMD RDNA™+ optimized algorithm.
- 32-bit key and payload sort support
- Support for DirectX®12 API and Vulkan®.
- Shaders written in HLSL utilizing Shader Model 6.0 wave-level operations.
A sample application is provided for DirectX®12 and Vulkan®.
Details
Algorithm overview
AMD FidelityFX Parallel Sort is an AMD RDNA™-optimized version of the Radix Sort algorithm.
At a high level, the algorithm works by iterating over the data set to be sorted (keys or key/value pairs) and rearranging it in place in 4-bit increments. Each pass guarantees that the data set is fully sorted up to the number of bits processed so far. For example, after 4 iterations, the first 16 bits of the key (the 16 least significant bits) are guaranteed to be properly sorted.
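The per-pass guarantee can be checked with a small CPU sketch. This is only an illustration, not the SDK's shader code: it assumes passes run from the least significant 4-bit digit upward and that each pass is stable, and the name `sort_pass` and the sample keys are hypothetical. It also returns a new list per pass rather than rearranging in place.

```python
def sort_pass(keys, shift):
    """One stable pass: reorder keys by the 4-bit digit at bit offset `shift`."""
    buckets = [[] for _ in range(16)]          # one bucket per digit value 0-15
    for k in keys:
        buckets[(k >> shift) & 0xF].append(k)  # append keeps input order (stable)
    return [k for bucket in buckets for k in bucket]

# Hypothetical sample keys, chosen only for illustration.
keys = [0x3A7F1C42, 0x0000BEEF, 0x12345678, 0xFFFF0001, 0x0001BEEF, 0x00000001]

# Run 4 passes: bits 0-3, 4-7, 8-11, 12-15.
for i in range(4):
    keys = sort_pass(keys, 4 * i)

# After 4 passes the keys are ordered with respect to their lowest 16 bits.
low16 = [k & 0xFFFF for k in keys]
assert low16 == sorted(low16)
```

Stability is what makes this work: each pass preserves the order established by earlier passes among keys that share the current digit.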
For each iteration that is executed, 5 actions are taken on the data set:
- The 4-bit value range currently being sorted is counted into buckets 0-15, so that we know how many times each value occurs throughout the data set.
- The occurrence counts go through a reduction phase so that offsets can later be pre-incremented on a per-thread-group basis.
- The reduced occurrence counts go through a prefix scan to calculate offset values for each value group (0-15) on a per-thread-group basis.
- The full occurrences buffer then also goes through a prefix scan, and the reduced prefix-scan values are added to it to properly index the data across all thread groups.
- The data set is read one more time, and each element is written to its new sorted offset location. If there is a payload, it is copied over at the same time.
Once all iterations have run (in the case of 32-bit keys, it runs 8 times), the entire data set is sorted.
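The five steps and the 8-pass loop can be sketched on the CPU as follows. This is a simplified model under stated assumptions, not the shipped HLSL: thread groups are modeled as fixed-size slices of the input, the reduction collapses each digit's per-group counts into a single total (the real implementation's reduction layout may differ), and all names (`radix_pass`, `parallel_sort`, `group_size`) are hypothetical.

```python
BITS_PER_PASS = 4
BUCKETS = 1 << BITS_PER_PASS  # 16 digit values per pass

def exclusive_scan(values):
    """Prefix scan where each output is the sum of all earlier inputs."""
    out, total = [], 0
    for v in values:
        out.append(total)
        total += v
    return out

def radix_pass(keys, payload, shift, group_size=4):
    """One pass over the 4-bit digit at `shift`; groups stand in for thread groups."""
    groups = [range(s, min(s + group_size, len(keys)))
              for s in range(0, len(keys), group_size)]

    def digit(k):
        return (k >> shift) & (BUCKETS - 1)

    # Step 1: count occurrences of each digit value, per group.
    counts = [[0] * len(groups) for _ in range(BUCKETS)]
    for g, idxs in enumerate(groups):
        for i in idxs:
            counts[digit(keys[i])][g] += 1

    # Step 2: reduce the per-group counts (here: one total per digit value).
    reduced = [sum(row) for row in counts]

    # Step 3: scan the reduced counts to get the start offset of each digit value.
    bucket_base = exclusive_scan(reduced)

    # Step 4: scan the full counts and add the reduced scan values, giving
    # every (digit value, group) pair its first output index.
    offsets = [[bucket_base[b] + o for o in exclusive_scan(counts[b])]
               for b in range(BUCKETS)]

    # Step 5: read the data again and scatter keys (and payload) to their offsets.
    out_keys = [0] * len(keys)
    out_payload = [None] * len(keys)
    for g, idxs in enumerate(groups):
        for i in idxs:
            b = digit(keys[i])
            dst = offsets[b][g]
            offsets[b][g] += 1
            out_keys[dst] = keys[i]
            out_payload[dst] = payload[i]
    return out_keys, out_payload

def parallel_sort(keys, payload, key_bits=32):
    """Run key_bits / 4 passes (8 for 32-bit keys), lowest digit first."""
    for shift in range(0, key_bits, BITS_PER_PASS):
        keys, payload = radix_pass(keys, payload, shift)
    return keys, payload
```

For example, `parallel_sort(keys, list(range(len(keys))))` returns the sorted keys together with the original index of each key, which is the typical use of a payload when sorting particles or index buffers.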
Additional resources
Comparison: GPU particle sorting


Comparison: image index buffer sorting


Version history
AMD FidelityFX Parallel Sort:
Cleanup of GPU particle code to make better use of FidelityFX Parallel Sort.
Updated as part of AMD FidelityFX SDK v1.1:
- AMD FidelityFX backend updates, including buffer allocator overrides
- Updated documentation and release of reference documentation for SDK + Framework
- Native Microsoft® GDK® backend implementation library (requires developer access to GDK® program)
- Now part of AMD FidelityFX SDK.
- Various fixes applied.
- Vulkan® implementation.
- Upgraded framework to Cauldron v1.4.
- General code cleanup and readability improvements.
- Initial release
Part of the AMD FidelityFX SDK

AMD FidelityFX™ SDK
The AMD FidelityFX SDK is our easy-to-integrate solution for developers looking to include FidelityFX features into their games.
More AMD FidelityFX effects

AMD FidelityFX™ Blur
AMD FidelityFX Blur is an AMD RDNA™ architecture optimized collection of blur kernels from 3×3 up to 21×21.

AMD FidelityFX™ Breadcrumbs library
AMD FidelityFX Breadcrumbs library uses the breadcrumbs marker technique to track down where your submitted commands cause a GPU crash.

AMD FidelityFX™ Brixelizer/GI
AMD FidelityFX™ Brixelizer GI is a compute-based real-time dynamic global illumination solution built upon sparse distance fields.

AMD FidelityFX™ Cauldron Framework
AMD FidelityFX Cauldron Framework is our open-source experimentation framework for DirectX®12 and Vulkan®, provided in the AMD FidelityFX SDK.

AMD FidelityFX™ Combined Adaptive Compute Ambient Occlusion (CACAO)
AMD FidelityFX Combined Adaptive Compute Ambient Occlusion (CACAO) is an AMD RDNA™ architecture optimized implementation of ambient occlusion.

AMD FidelityFX™ Contrast Adaptive Sharpening (CAS)
AMD FidelityFX Contrast Adaptive Sharpening (CAS) provides a mixed ability to sharpen and optionally scale an image.

AMD FidelityFX™ Denoiser
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.

AMD FidelityFX™ Depth of Field (DoF)
AMD FidelityFX Depth of Field is an AMD RDNA™-architecture optimized implementation of physically correct camera-based depth of field.

AMD FidelityFX™ Hybrid Shadows sample
This sample demonstrates how to combine ray traced shadows and rasterized shadow maps together to achieve high quality and performance.

AMD FidelityFX™ Hybrid Stochastic Reflections sample
This sample shows how to combine AMD FidelityFX Stochastic Screen Space Reflections (SSSR) with ray tracing in order to create high quality reflections.

AMD FidelityFX™ Lens
AMD FidelityFX Lens is an AMD RDNA™ architecture optimized implementation of some of gaming’s most used post-processing effects.

AMD FidelityFX™ Luminance Preserving Mapper (HDR Mapper)
AMD FidelityFX LPM provides an open-source library to easily integrate HDR and wide gamut tone and gamut mapping into your game.

AMD FidelityFX™ Single Pass Downsampler (SPD)
AMD FidelityFX Single Pass Downsampler (SPD) provides an AMD RDNA™ architecture optimized solution for generating up to 12 MIP levels of a texture.

AMD FidelityFX™ Stochastic Screen Space Reflections (SSSR)
The AMD FidelityFX SSSR effect provides an open-source library to easily integrate stochastic screen space reflections into your game.

AMD FidelityFX™ Super Resolution 1 (FSR 1)
AMD FidelityFX Super Resolution (FSR) is our open-source, high-quality, high-performance upscaling solution.

AMD FidelityFX™ Super Resolution 2 (FSR 2)
Learn even more about our new open-source temporal upscaling solution FSR 2, and get the source code and documentation!

AMD FidelityFX™ Super Resolution 3 (FSR 3)
Discover frame generation with AMD FidelityFX™ Super Resolution 3, and get the source code and documentation!

AMD FidelityFX™ Variable Shading
AMD FidelityFX Variable Shading drives Variable Rate Shading into your game.
Other effects on GPUOpen

TressFX
The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.