FidelityFX Parallel Sort
FidelityFX Parallel Sort GPU documentation.
Defines
FFX_PARALLELSORT_SORT_BITS_PER_PASS
#define FFX_PARALLELSORT_SORT_BITS_PER_PASS 4
The number of bits we are sorting per pass. Changing this value requires internal changes in LDS distribution and count, reduce, scan, and scatter passes.
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 34, column 9)
FFX_PARALLELSORT_SORT_BIN_COUNT
#define FFX_PARALLELSORT_SORT_BIN_COUNT (1 << FFX_PARALLELSORT_SORT_BITS_PER_PASS)
The number of bins used for the counting phase of the algorithm. Changing this value requires internal changes in LDS distribution and count, reduce, scan, and scatter passes.
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 42, column 9)
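To make the relationship between bits per pass, bin count, and the count, scan, and scatter steps concrete, here is a minimal CPU sketch of a single stable radix pass. It is an illustration only, not the SDK's GPU implementation: `radixPass` is a hypothetical helper, keys are assumed to be 32-bit unsigned integers, and the reduce step (which on the GPU would combine per-thread-group counts) is omitted because a single CPU loop has only one histogram.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Local copies of the documented defines so the sketch compiles on its own.
#define FFX_PARALLELSORT_SORT_BITS_PER_PASS 4
#define FFX_PARALLELSORT_SORT_BIN_COUNT (1 << FFX_PARALLELSORT_SORT_BITS_PER_PASS)

// Hypothetical helper: one stable radix pass over the 4-bit digit starting at bit `shift`.
inline std::vector<uint32_t> radixPass(const std::vector<uint32_t>& keys, uint32_t shift)
{
    constexpr uint32_t binCount = FFX_PARALLELSORT_SORT_BIN_COUNT;  // 16 bins for 4 bits
    constexpr uint32_t mask     = binCount - 1;

    // Count: histogram of the current digit.
    std::array<uint32_t, binCount> offsets{};
    for (uint32_t key : keys)
        ++offsets[(key >> shift) & mask];

    // Scan: convert counts into exclusive prefix sums (first output slot of each bin).
    uint32_t running = 0;
    for (uint32_t& slot : offsets)
    {
        const uint32_t count = slot;
        slot = running;
        running += count;
    }

    // Scatter: write each key to the next free slot of its bin, preserving input order.
    std::vector<uint32_t> sorted(keys.size());
    for (uint32_t key : keys)
        sorted[offsets[(key >> shift) & mask]++] = key;
    return sorted;
}
```

Running the pass once per digit, from the least significant digit upward, yields a fully sorted array because each pass is stable.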
FFX_PARALLELSORT_ELEMENTS_PER_THREAD
#define FFX_PARALLELSORT_ELEMENTS_PER_THREAD 4
The number of elements processed by each running thread.
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 48, column 9)
FFX_PARALLELSORT_THREADGROUP_SIZE
#define FFX_PARALLELSORT_THREADGROUP_SIZE 128
The number of threads to execute in parallel for each dispatch group.
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 54, column 9)
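Taken together, the two defines above imply that one thread group covers 4 × 128 = 512 keys at a time. The sketch below shows that arithmetic and a rounding-up division to find how many such blocks a given key count needs. The names `kBlockSize` and `blocksForKeys` are hypothetical and do not come from the header.

```cpp
#include <cstdint>

// Local copies of the documented defines so the sketch compiles on its own.
#define FFX_PARALLELSORT_ELEMENTS_PER_THREAD 4
#define FFX_PARALLELSORT_THREADGROUP_SIZE 128

// Keys covered by one thread group when every thread handles its 4 elements.
constexpr uint32_t kBlockSize =
    FFX_PARALLELSORT_ELEMENTS_PER_THREAD * FFX_PARALLELSORT_THREADGROUP_SIZE;

// Hypothetical helper: number of 512-key blocks needed to cover numKeys, rounding up.
// Assumes numKeys is small enough that numKeys + kBlockSize - 1 does not overflow.
constexpr uint32_t blocksForKeys(uint32_t numKeys)
{
    return (numKeys + kBlockSize - 1) / kBlockSize;
}
```

For example, one million keys need 1954 blocks, the last of which is only partially filled.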
FFX_PARALLELSORT_MAX_THREADGROUPS_TO_RUN
#define FFX_PARALLELSORT_MAX_THREADGROUPS_TO_RUN 800
The maximum number of thread groups to run in parallel. Modifying this value can help or hurt GPU occupancy, and the effect is highly specific to the hardware class.
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 62, column 9)
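A cap on thread groups means that large inputs must give each group more than one block of keys. The sketch below shows one plausible way to spread blocks across a capped number of groups. It is an assumption made for illustration, not the SDK's confirmed logic, and `GroupSplitSketch` and `splitBlocks` are hypothetical names.

```cpp
#include <cstdint>

// Local copy of the documented define so the sketch compiles on its own.
#define FFX_PARALLELSORT_MAX_THREADGROUPS_TO_RUN 800

// Hypothetical result type: how many groups to launch and how blocks are shared out.
struct GroupSplitSketch
{
    uint32_t threadGroups;          // groups to dispatch
    uint32_t blocksPerGroup;        // blocks every group processes
    uint32_t groupsWithExtraBlock;  // groups that take one additional block
};

// Hypothetical helper: distribute numBlocks over at most maxThreadGroups groups.
inline GroupSplitSketch splitBlocks(uint32_t numBlocks,
                                    uint32_t maxThreadGroups = FFX_PARALLELSORT_MAX_THREADGROUPS_TO_RUN)
{
    if (numBlocks == 0 || maxThreadGroups == 0)
        return {0, 0, 0};                       // nothing to dispatch
    if (numBlocks <= maxThreadGroups)
        return {numBlocks, 1, 0};               // one block per group, cap not reached
    return {maxThreadGroups,                    // cap reached: share blocks evenly,
            numBlocks / maxThreadGroups,        // with the remainder spread over
            numBlocks % maxThreadGroups};       // the first few groups
}
```

With the default cap of 800 and 1954 blocks, every group would take 2 blocks and 354 groups would take a third.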
Functions
ffxParallelSortCalculateScratchResourceSize
inline void ffxParallelSortCalculateScratchResourceSize(uint32_t maxNumKeys, uint32_t &scratchBufferSize, uint32_t &reduceScratchBufferSize)
Call to calculate the required sizes of the scratch and reduce scratch buffers used by the parallel sort algorithm.
Parameters:
maxNumKeys (uint32_t) – [in] The maximum number of keys the algorithm will be asked to sort.
scratchBufferSize (uint32_t &) – [out] The size of the scratch buffer that needs to be allocated.
reduceScratchBufferSize (uint32_t &) – [out] The size of the reduce scratch buffer that needs to be allocated.
Attributes: inline
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 91, column 13)
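The authoritative sizes come from calling the function itself. To give a feel for the kind of arithmetic a caller might expect, the sketch below estimates both sizes under two explicit assumptions that this page does not confirm: the scratch buffer holds one `uint32_t` counter per bin per 512-key block, and the reduce scratch buffer holds the same layout after grouping those blocks by the block size once more. `sketchScratchSizes` is a hypothetical name and is not the SDK function.

```cpp
#include <cstdint>

// Local copies of the documented defines so the sketch compiles on its own.
#define FFX_PARALLELSORT_SORT_BITS_PER_PASS 4
#define FFX_PARALLELSORT_SORT_BIN_COUNT (1 << FFX_PARALLELSORT_SORT_BITS_PER_PASS)
#define FFX_PARALLELSORT_ELEMENTS_PER_THREAD 4
#define FFX_PARALLELSORT_THREADGROUP_SIZE 128

// Hypothetical estimate, NOT the SDK function. Sizes are in bytes.
// Assumption: one uint32_t counter per bin per block, reduced again by the block size.
inline void sketchScratchSizes(uint32_t maxNumKeys,
                               uint32_t& scratchBytes,
                               uint32_t& reduceScratchBytes)
{
    const uint32_t blockSize =
        FFX_PARALLELSORT_ELEMENTS_PER_THREAD * FFX_PARALLELSORT_THREADGROUP_SIZE;
    const uint32_t numBlocks        = (maxNumKeys + blockSize - 1) / blockSize;
    const uint32_t numReducedBlocks = (numBlocks + blockSize - 1) / blockSize;
    const uint32_t bytesPerBlock =
        static_cast<uint32_t>(FFX_PARALLELSORT_SORT_BIN_COUNT * sizeof(uint32_t));

    scratchBytes       = bytesPerBlock * numBlocks;
    reduceScratchBytes = bytesPerBlock * numReducedBlocks;
}
```

Because the sizes depend on the maximum key count, a caller would typically compute them once at initialization and allocate both buffers up front rather than per sort.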
ffxParallelSortSetConstantAndDispatchData
inline void ffxParallelSortSetConstantAndDispatchData(uint32_t numKeys, uint32_t maxThreadGroups, FfxParallelSortConstants &constantBuffer, uint32_t &numThreadGroupsToRun, uint32_t &numReducedThreadGroupsToRun)
Call to set up the constant buffer data that must be bound to the GPU for Parallel Sort execution (all passes). Note that the caller is responsible for manually updating the shift value (the bit shift for each pass).
Parameters:
numKeys (uint32_t) – [in] The number of keys the algorithm will sort.
maxThreadGroups (uint32_t) – [in] The maximum number of thread groups to use in parallel.
constantBuffer (FfxParallelSortConstants &) – [out] The FfxParallelSortConstants buffer to fill with information.
numThreadGroupsToRun (uint32_t &) – [out] The number of thread groups (dispatch size) to run for this sort run.
numReducedThreadGroupsToRun (uint32_t &) – [out] The number of reduce thread groups (dispatch size) to run for this sort run.
Attributes: inline
Source: sdk/include/FidelityFX/gpu/parallelsort/ffx_parallelsort.h (line 113, column 13)
Dependencies: FfxParallelSortConstants
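Since the shift value is left to the caller, a host-side loop has to advance it by FFX_PARALLELSORT_SORT_BITS_PER_PASS before each pass. The sketch below shows that loop in isolation. It assumes 32-bit keys, and `SortConstantsSketch` is a hypothetical stand-in that models only a `shift` member. The real FfxParallelSortConstants layout is not described on this page.

```cpp
#include <cstdint>
#include <vector>

// Local copy of the documented define so the sketch compiles on its own.
#define FFX_PARALLELSORT_SORT_BITS_PER_PASS 4

// Hypothetical stand-in for FfxParallelSortConstants; only the per-pass shift is modelled.
struct SortConstantsSketch
{
    uint32_t shift = 0;
};

// Hypothetical helper: returns the shift used for each pass, assuming 32-bit keys.
inline std::vector<uint32_t> shiftsForAllPasses()
{
    constexpr uint32_t keyBits = 32;  // assumption: keys are 32 bits wide
    std::vector<uint32_t> shifts;
    SortConstantsSketch constants;

    for (uint32_t shift = 0; shift < keyBits; shift += FFX_PARALLELSORT_SORT_BITS_PER_PASS)
    {
        constants.shift = shift;  // caller updates the shift before each pass
        // A real renderer would upload the constants here and dispatch the
        // count, reduce, scan, and scatter passes for this digit.
        shifts.push_back(constants.shift);
    }
    return shifts;
}
```

With 4 bits per pass this produces eight passes, with shifts 0, 4, 8, and so on up to 28.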