
TressFX
The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.
Once we got our barriers right and we started to hit performance close to DirectX® 11, it was time to switch gears. Since both APIs are driving the same hardware, if you only do rendering that could have been done in DirectX® 11 then there will be little difference in the GPU’s performance. In this scenario, it’s about how you build your commands that counts and the performance win you get from DirectX® 12 comes from the CPU. Under the hood, your command buffers will be very similar in both APIs as you are going to render the same content after all. There is an option unlocked by DirectX® 12 however, and that is the use of multiple queues. Since the DirectX® 11 driver doesn’t know about the structure of the frame it cannot make decisions about using multiple queues automatically, but since we have direct control over the different queues on DirectX® 12, we can.
So we started looking for parts of the pipeline which could be moved to compute shaders and could be run parallel to some geometry processing. The main things we needed to consider are the dependencies between the different parts of the pipeline. For a quick overview of the pipeline, please refer to the diagram in the first part of this blog post. In a nutshell, we identified the following three sites:
The main takeaway from these is that asynchronous compute is a really great tool. It is relatively easy to squeeze some additional performance out from DirectX 12, we got an extra 2ms of performance with our particle simulation and SSAO. However, it’s not magic and can lead to some surprising results in certain cases. For the best results you have to try to find the best possible match of blocks in your pipeline and be prepared to experiment a little.
If you have any questions, please feel free to comment or get in touch with Tamas on Twitter. Next time we’ll take a look at explicit Multiple GPU in DirectX 12.