The GCN architecture contains a lot of functionality in the shader cores which is not exposed in current APIs like Vulkan™ or Direct3D® 12. One of the mandates of GPUOpen is to give developers better access to the hardware, and this post details extensions for these APIs that expose additional GCN features to developers.
Shader Extensions
With these shader extensions, we provide access to wavefront-wide functions, which are an important building block for exploiting the SIMD execution model of GPUs. For instance, the use of mbcnt and ballot can replace atomics in various cases, drastically boosting performance. The wavefront-wide instructions also include swizzles, which allow individual lanes to exchange data without going through memory.
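To make the ballot and mbcnt pattern concrete, here is a minimal CPU-side sketch in C++ that emulates a 64-lane wavefront. It is not GPU code: the names `ballot`, `mbcnt`, `popcount64` and `compactSlots` are illustrative stand-ins, and the 64-lane width is an assumption. The idea it shows is that every lane passing a predicate can compute its own output slot from the ballot mask, so the shared counter only needs to be bumped once per wavefront instead of once per lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// CPU-side emulation of a wavefront; illustrative only, not GPU code.
constexpr int kWaveSize = 64;  // assumed wavefront width

// ballot: one bit per lane, set when that lane's predicate is true.
inline std::uint64_t ballot(const std::array<bool, kWaveSize>& predicate) {
    std::uint64_t mask = 0;
    for (int lane = 0; lane < kWaveSize; ++lane) {
        if (predicate[lane]) mask |= (std::uint64_t{1} << lane);
    }
    return mask;
}

// Portable population count (number of set bits).
inline int popcount64(std::uint64_t v) {
    int n = 0;
    while (v != 0) { v &= v - 1; ++n; }
    return n;
}

// mbcnt: number of set bits in `mask` at positions below `lane`.
inline int mbcnt(std::uint64_t mask, int lane) {
    assert(lane >= 0 && lane < kWaveSize);
    const std::uint64_t below = (std::uint64_t{1} << lane) - 1;
    return popcount64(mask & below);
}

// Stream compaction: each passing lane gets a unique slot, -1 otherwise.
// `counter` is updated once for the whole wavefront; on a GPU this would
// be a single atomic add issued by one lane instead of one per lane.
inline std::array<int, kWaveSize> compactSlots(
        const std::array<bool, kWaveSize>& predicate, int& counter) {
    const std::uint64_t mask = ballot(predicate);
    const int base = counter;
    counter += popcount64(mask);

    std::array<int, kWaveSize> slots;
    slots.fill(-1);
    for (int lane = 0; lane < kWaveSize; ++lane) {
        if (predicate[lane]) slots[lane] = base + mbcnt(mask, lane);
    }
    return slots;
}
```

In a real shader the loop over lanes disappears, since every lane runs the same code in parallel and only the single counter update touches memory.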
Additionally, we expose readfirstlane and other functions which enable the compiler to move data from VGPRs into SGPRs. Especially for VGPR-heavy code, marking variables as wavefront-uniform can reduce the VGPR count significantly.
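The semantics of readfirstlane can be sketched on the CPU as follows. This is an emulation under assumptions (a 64-lane wavefront, an execution mask passed in explicitly), and `firstActiveLane` and `readFirstLane` are hypothetical helper names rather than part of any API. The result is a single value shared by all lanes, which is what allows a compiler to keep it in a scalar register.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 64;  // assumed wavefront width

// Index of the lowest set bit in the execution mask.
inline int firstActiveLane(std::uint64_t execMask) {
    assert(execMask != 0);  // at least one lane must be active
    int lane = 0;
    while (((execMask >> lane) & 1u) == 0) ++lane;
    return lane;
}

// Emulated readfirstlane: every lane receives the value held by the
// first active lane, so the result is uniform across the wavefront.
inline std::uint32_t readFirstLane(
        const std::array<std::uint32_t, kLanes>& perLaneValue,
        std::uint64_t execMask) {
    return perLaneValue[firstActiveLane(execMask)];
}
```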
Another often-requested feature which is getting exposed today is direct access to the barycentric coordinates. This is again an important building block for various algorithms.
Finally, we also provide various utility functions. In this release, we’re providing the 3-parameter min, max and med functions, which map directly to the corresponding GCN opcodes.
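For reference, the arithmetic behind the three-parameter functions can be written out in plain C++. This is a sketch of the mathematical definitions only: it does not model how the hardware opcodes treat NaN inputs, and `clampViaMed3` is a hypothetical helper added to show one common use of the median.

```cpp
#include <algorithm>
#include <cassert>

// Smallest of three values.
template <typename T>
T min3(T a, T b, T c) { return std::min(a, std::min(b, c)); }

// Largest of three values.
template <typename T>
T max3(T a, T b, T c) { return std::max(a, std::max(b, c)); }

// Median (middle value) of three values.
template <typename T>
T med3(T a, T b, T c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Clamping x to [lo, hi] is the median of x, lo and hi.
template <typename T>
T clampViaMed3(T x, T lo, T hi) {
    assert(lo <= hi);
    return med3(x, lo, hi);
}
```

Because the median of a value and its two bounds is the clamped value, a single med3 can stand in for a min/max pair.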
Direct3D Access
In Direct3D, the shader extensions are exposed through the AMD GPU Services (AGS) library. For more information on AGS, visit GPUOpen’s AGS page.
The extension allows you to query the presence of the various extension functions. The functions have to be enabled before you can load a shader that uses them – this guarantees that your shader is compatible with the underlying hardware.
In your code, all you need to do is include amd_ags.h in your C/C++ source file, initialize AGS with agsInit, and then check for shader extension support as shown in the code below:
    #include "amd_ags.h"

    // call agsInit prior to D3D12 device creation
    [...]

    unsigned int extensionsSupported = 0;
    if (agsDriverExtensionsDX12_Init(m_agsContext, m_device.Get(), &extensionsSupported) == AGS_SUCCESS)
    {
        if (extensionsSupported & AGS_DX12_EXTENSION_INTRINSIC_BARYCENTRICS)
        {
            // the Barycentric extension is supported so use this codepath
        }
    }
In your HLSL code you need to add the following line at the top of your file:
#include "ags_shader_intrinsics_dx12.hlsl"
And then you can call the new function this way:
    float2 barycentric = AmdExtD3DShaderIntrinsics_IjBarycentricCoords(AmdExtD3DShaderIntrinsicsBarycentric_LinearCenter);
    return float3(barycentric.x, barycentric.y, 1.0 - (barycentric.x + barycentric.y));
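The HLSL snippet above expands the two returned coordinates into three weights that sum to one. As a rough illustration of what those weights are for, the C++ sketch below mirrors the same arithmetic on the CPU and uses the weights to blend a per-vertex attribute. `Float3`, `fullBarycentrics` and `interpolate` are hypothetical names, and which vertex each weight belongs to is an assumption made here for illustration, not something the post specifies.

```cpp
// Three barycentric weights; a stand-in for HLSL's float3.
struct Float3 { float x, y, z; };

// Same expansion as the HLSL snippet: the third weight is 1 - (i + j).
inline Float3 fullBarycentrics(float i, float j) {
    return {i, j, 1.0f - (i + j)};
}

// Blend one scalar attribute given at the three triangle vertices.
// The pairing of weights with vertices is assumed, not documented here.
inline float interpolate(const Float3& w, float a0, float a1, float a2) {
    return w.x * a0 + w.y * a1 + w.z * a2;
}
```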
Vulkan Access
In Vulkan, the shader extensions are grouped into three separate extensions:
AMD_shader_ballot
AMD_shader_trinary_minmax
AMD_shader_explicit_vertex_parameter
These are device extensions, so you need to check that the device supports them before you can use them. Similar to Direct3D, we’re grouping the functions, so there’s no need to check each one individually. Keep in mind that these are SPIR-V extensions, as Vulkan does not consume GLSL directly. In order to use them from GLSL, we’re also releasing an update to glslangValidator with support for those extensions.
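The support check can be sketched as a small helper. To keep the example free of the Vulkan headers, it assumes the application has already copied the extension names reported by `vkEnumerateDeviceExtensionProperties` into a list of strings. The `VK_`-prefixed spellings are the names these extensions carry in the Vulkan registry, and `missingAmdShaderExtensions` is a hypothetical function name.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Returns the shader extensions from this post that are NOT present in
// `available`, which is assumed to hold the device extension names
// reported by vkEnumerateDeviceExtensionProperties.
inline std::vector<std::string> missingAmdShaderExtensions(
        const std::vector<std::string>& available) {
    static const std::array<const char*, 3> required = {
        "VK_AMD_shader_ballot",
        "VK_AMD_shader_trinary_minmax",
        "VK_AMD_shader_explicit_vertex_parameter",
    };

    std::vector<std::string> missing;
    for (const char* name : required) {
        if (std::find(available.begin(), available.end(), name) == available.end()) {
            missing.emplace_back(name);
        }
    }
    return missing;
}
```

If the returned list is empty, the three names can be passed to device creation as enabled extensions; otherwise the application should fall back to a code path that does not rely on them.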
Extension Overview
Here’s an overview of all the extension functions that we expose, and the corresponding names for HLSL and GLSL.
HLSL | GLSL | Description |
---|---|---|
Readfirstlane | readFirstInvocationARB | Read a value from the first active lane |
Readlane | readInvocationARB | Read a value from any lane |
N/A | writeInvocationAMD | Write a value to a different lane |
LaneId | gl_SubGroupInvocationARB | Get the current lane id |
N/A | swizzleInvocationsAMD | Exchange data across lanes |
Swizzle | swizzleInvocationsMaskedAMD | Exchange data across lanes |
Ballot | ballotARB | Read the execution mask |
MBCnt | mbcntAMD | Count number of active lanes with an index less than the current lane |
Min3F/Min3U | min3 | 3-parameter min |
Med3F/Med3U | med3 | 3-parameter median |
Max3F/Max3U | max3 | 3-parameter max |
BarycentricCoords(LinearCenter) | gl_BaryCoordNoPerspAMD | Read barycentrics with linear interpolation at the pixel center |
BarycentricCoords(LinearCentroid) | gl_BaryCoordNoPerspCentroidAMD | Read barycentrics with linear interpolation at the centroid |
BarycentricCoords(LinearSample) | gl_BaryCoordNoPerspSampleAMD | Read barycentrics with linear interpolation at the sample |
BarycentricCoords(PerspCenter) | gl_BaryCoordSmoothAMD | Read barycentrics with perspective interpolation at the pixel center |
BarycentricCoords(PerspCentroid) | gl_BaryCoordSmoothCentroidAMD | Read barycentrics with perspective interpolation at the centroid |
BarycentricCoords(PerspSample) | gl_BaryCoordSmoothSampleAMD | Read barycentrics with perspective interpolation at the sample |
BarycentricCoords(PerspPullModel) | gl_BaryCoordPullModelAMD | Select the pull model for barycentrics |
VertexParameter | interpolateAtVertexAMD | Interpolate a vertex parameter |
Samples
The following GPUOpen samples demonstrate how to access the extensions:
Vulkan® mbcnt Sample
This sample shows how to use the AMD_shader_ballot extension and mbcnt to perform a fast reduction within a wavefront.
Barycentrics DirectX® Shader Extension Samples
The Barycentrics samples show how to enable intrinsic instructions in your DirectX®11 or DirectX®12 HLSL code.
Resources
Stable Barycentric Coordinates
The AMD GCN Vulkan extensions allow developers to get access to the barycentric coordinates at the fragment-shader level.
AMD GPU Services (AGS) Library
The AMD GPU Services (AGS) library provides software developers with the ability to query AMD GPU software and hardware state information that is not normally available through standard operating systems or graphics APIs.
DirectX®12
Microsoft® DirectX®12 provides APIs for creating games and other graphics applications. Find out how we can help you get the best out of DirectX®12.
Vulkan®
Vulkan® gives software developers control over the performance, efficiency, and capabilities of AMD Radeon™ GPUs and multi-core CPUs.