With the launch of AGS 5.0, developers now have access to the shader compiler control API. Here’s a quick summary of the how and why.
Background
The AMD DirectX® 11 driver multi-threads the compilation of shaders. Calls to CreateVertexShader, CreatePixelShader, etc. do not block until the shader is created; instead, the compilation job is pushed to one of the driver’s shader compiler threads. This means you don’t know whether your shader has finished compiling by the time you come to use it, and the first draw call that uses the shader may stall while it waits for the job to complete. The standard workaround is to render a dummy triangle with that shader during the transition from loading screen to game (“shader cache warming”).
Requirements
The API is available from AGS version 5 onwards. You’ll need to check that the AGS_DX11_EXTENSION_CREATE_SHADER_CONTROLS flag is set in the bitfield returned from agsDriverExtensionsDX11_Init. The underlying driver support is in Crimson 16.9.2 onwards (driver version 16.40.2311).
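The flag check described above is an ordinary bitmask test. The sketch below keeps it independent of the AGS headers so it can be compiled anywhere: the helper name hasExtension and the constant kShaderControlsFlag are hypothetical, and the constant's value is a placeholder, not the real enum value. In an actual build you would pass AGS_DX11_EXTENSION_CREATE_SHADER_CONTROLS from amd_ags.h and the bitfield that agsDriverExtensionsDX11_Init filled in.

```cpp
#include <cassert>
#include <cstdint>

// PLACEHOLDER value for illustration only. In a real build, use
// AGS_DX11_EXTENSION_CREATE_SHADER_CONTROLS from amd_ags.h instead.
constexpr std::uint32_t kShaderControlsFlag = 1u << 0;

// Returns true when every bit of `flag` is set in the bitfield of
// supported extensions reported by the driver extension init call.
inline bool hasExtension(std::uint32_t extensionsSupported, std::uint32_t flag)
{
    assert(flag != 0 && "flag must name at least one extension bit");
    return (extensionsSupported & flag) == flag;
}
```

If the check fails, the simplest fallback is to keep the existing dummy triangle path and skip the shader compiler control calls entirely.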
The Shader Compiler Control API
The agsDriverExtensionsDX11_NumPendingAsyncCompileJobs function reports how many asynchronous compile jobs are currently in the queue or being compiled on the driver’s compiler threads. So if your engine already uses the dummy triangle technique to ensure the shader has compiled, you can replace that path with a check that no async compile jobs are still pending.
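One way to structure that check is a bounded polling loop at the end of the loading screen. The sketch below is an assumption about how an engine might do this, not code from the AGS samples: waitForPendingCompiles, maxPolls and pollInterval are hypothetical names, and the query is passed in as a callable so the loop can be exercised without a driver. In a real engine the callable would wrap agsDriverExtensionsDX11_NumPendingAsyncCompileJobs and return the job count it reports.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Polls `queryPendingJobs` until it reports zero pending compile jobs or
// `maxPolls` attempts have been made. Returns true if the queue drained.
// `queryPendingJobs` is any callable returning the current job count; in a
// real engine it would wrap the AGS pending-jobs query.
template <typename QueryFn>
bool waitForPendingCompiles(QueryFn queryPendingJobs,
                            int maxPolls,
                            std::chrono::milliseconds pollInterval =
                                std::chrono::milliseconds(1))
{
    assert(maxPolls > 0);
    for (int attempt = 0; attempt < maxPolls; ++attempt) {
        if (queryPendingJobs() == 0) {
            return true;  // nothing queued or compiling: safe to start rendering
        }
        // Sleep rather than spin so the compiler threads keep the cores.
        std::this_thread::sleep_for(pollInterval);
    }
    return false;  // still compiling: caller decides whether to keep waiting
}
```

Bounding the loop means a stuck or very long compile extends the loading screen by a known amount instead of hanging it; the caller can choose to keep waiting or to proceed and accept a possible stall.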
By default, the DirectX 11 driver spins up max( 1, numPhysicalCores - 2 ) threads to compile shaders on. This could be somewhat conservative for your specific use case, so you can use agsDriverExtensionsDX11_SetMaxAsyncCompileThreadCount to specify more worker threads. Alternatively, if your engine already does a great job of multi-threading resource creation, you might find that setting the count to zero is a win. In that case the compile request is not put onto a different thread; the shader is compiled inside the CreatePixelShader call, which returns only once compilation has completed.
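To make the default concrete, the formula above can be written out as a small function. This is only a restatement of max( 1, numPhysicalCores - 2 ) for reasoning about what value to override; the function name is hypothetical, and how you obtain the physical core count is left to your engine.

```cpp
#include <algorithm>
#include <cassert>

// Mirrors the driver's documented default: leave two physical cores free,
// but always keep at least one compiler thread.
inline int defaultAsyncCompileThreads(int numPhysicalCores)
{
    assert(numPhysicalCores >= 0);
    return std::max(1, numPhysicalCores - 2);
}
```

On an 8-core CPU this gives 6 compiler threads, while 1-, 2- and 3-core machines all end up with a single thread. A value chosen by your engine would then be passed to agsDriverExtensionsDX11_SetMaxAsyncCompileThreadCount, with zero selecting the synchronous behaviour described above.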
To ensure you are profiling the same workload each time when trying out these API calls, you can switch off the disk shader cache using agsDriverExtensionsDX11_SetDiskShaderCacheEnabled.