
TressFX
The TressFX library is AMD’s hair/fur rendering and simulation technology. TressFX is designed to use the GPU to simulate and render high-quality, realistic hair and fur.
We are releasing TressFX 3.1. Our biggest update in this release is a new order-independent transparency (OIT) option we call “ShortCut”. We’ve also addressed some of the issues brought up by the community.
ShortCut is our new order-independent transparency (OIT) option. It’s inspired by the method presented by Eidos-Montréal and by Hybrid Transparency. Whereas our original method focused on roughly the front k = 8 layers of hair, ShortCut suits cases where you can get away with k = 2 or 3 and are more concerned about memory usage.
It does require some forethought about how you build your models, however, because it comes with different performance characteristics and a quality trade-off. But between the simpler memory bounds and the potential for higher performance, we expect it to be a popular choice.
The four main steps are outlined below.
InterlockedMin calls to update the list of k nearest fragments while computing an overall alpha.
[earlydepthstencil] focuses shading cost on the front k.
With the original method, you needed to allocate a single memory pool large enough for all hair fragments, not just the front k. With ShortCut, you only need space for the front k layers: 4 bytes for each depth, 4 bytes for each color, and 4 bytes for an accumulated alpha term for each pixel. Another difference from our previous method is that, although you still get the performance advantage of only shading the front k layers, you don’t need to store the shader inputs in screen space.
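To make the InterlockedMin step above more concrete, here is a minimal CPU-side sketch in C++ of how a chain of atomic-min operations can keep the k nearest depths for one pixel. It is an illustration under stated assumptions, not the TressFX shader code: `interlocked_min`, `PixelState`, and `add_fragment` are hypothetical names, depths are assumed to be already converted to unsigned integers that sort front to back, and the overall alpha is modeled as a running product of (1 - alpha), which on the GPU would more likely be handled by blending.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int K = 3;                             // front layers kept per pixel
constexpr std::uint32_t FAR_DEPTH = 0xFFFFFFFFu; // marks an empty slot

// Stand-in for HLSL InterlockedMin with an "original value" output:
// atomically stores min(slot, value) and returns what the slot held before.
inline std::uint32_t interlocked_min(std::atomic<std::uint32_t>& slot,
                                     std::uint32_t value) {
    std::uint32_t prev = slot.load(std::memory_order_relaxed);
    while (value < prev &&
           !slot.compare_exchange_weak(prev, value, std::memory_order_relaxed)) {
        // compare_exchange_weak refreshed prev; retry while value is still smaller
    }
    return prev;
}

// Hypothetical per-pixel state: K nearest depths plus accumulated transmittance.
struct PixelState {
    std::array<std::atomic<std::uint32_t>, K> depths;
    float transmittance = 1.0f; // product of (1 - alpha) over all fragments
    PixelState() {
        for (auto& d : depths) d.store(FAR_DEPTH);
    }
};

// Insert one hair fragment. Each slot keeps the smaller of its current value
// and the incoming one; the larger value is carried on to the next slot.
// Whatever is left after the last slot lies behind the front K and is dropped.
inline void add_fragment(PixelState& px, std::uint32_t depth, float alpha) {
    std::uint32_t carry = depth;
    for (auto& slot : px.depths) {
        const std::uint32_t prev = interlocked_min(slot, carry);
        carry = std::max(carry, prev);
    }
    // Assumption: not atomic in this sketch; shown only to model overall alpha.
    px.transmittance *= (1.0f - alpha);
}
```

Calling `add_fragment` on a fresh `PixelState` with depths 50, 30, 70, 10, 40 (in that order) leaves the three slots holding 10, 30, and 40. The two fragments farther back contribute only to the accumulated alpha term.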
ShortCut’s main drawback is the extra geometry passes. It is still a performance win when the depth complexity is high relative to the geometry cost, as it is for the included “ponytail” model. It also doesn’t give quite the same quality as the Per-Pixel Linked List (PPLL) method. However, as long as you are aware of these trade-offs, you should be able to create content that works well within these constraints.
We’ve also included some additional compile-time choices. The default version uses k = 3. One compile-time switch changes this to k = 2. There’s also a compile-time switch for a non-deterministic mode, which can benefit performance in some configurations.
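As a rough worked example of what the choice of k means for memory, the sketch below applies the per-pixel sizes given earlier (4 bytes per depth, 4 bytes per color, and one 4-byte accumulated alpha term) to a render target. The function names and the 1920×1080 resolution are assumptions chosen for illustration, and the reading of "one color per layer" follows the text rather than the shipped shaders.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per pixel for ShortCut storage, following the sizes quoted in the text:
// k depths (4 bytes each) + k colors (4 bytes each) + one 4-byte alpha term.
constexpr std::uint64_t shortcut_bytes_per_pixel(std::uint64_t k) {
    return k * 4 + k * 4 + 4;
}

// Total bytes for a width x height render target (hypothetical helper).
constexpr std::uint64_t shortcut_bytes_total(std::uint64_t width,
                                             std::uint64_t height,
                                             std::uint64_t k) {
    return width * height * shortcut_bytes_per_pixel(k);
}

static_assert(shortcut_bytes_per_pixel(3) == 28, "default k = 3");
static_assert(shortcut_bytes_per_pixel(2) == 20, "k = 2 switch");
```

At 1920×1080, this comes to 58,060,800 bytes (about 58 MB) for k = 3 and 41,472,000 bytes (about 41 MB) for k = 2. Unlike a pool sized for all hair fragments, the total depends only on resolution and k.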
We’ve received some terrific feedback from the community. This update also addresses some of the issues they have brought up.
One issue was caused by skipping simulation steps when the frame rate exceeds 60 Hz. This caused fur on the bear to sometimes separate from the underlying mesh, since the animation would move forward but the simulation would not. This issue was identified by mrgreywater, who also went on to provide a fix for us! In this release, we’re providing another alternative that’s also a little easier on performance with the ShortCut method.
We’re also looking at some changes to the library structure and API to enable easier engine integration. We’ve received some terrific feedback from the community on these issues (13, 14). These changes are still in the works, but we invite more input.