
Game engines do most of their shading work per-pixel or per-fragment. But there is an alternative that has been popular in film for decades: object space shading. Pixar’s RenderMan®, one of the best-known renderers in computer graphics, uses the Reyes rendering method, which is an object space shading method.
This blog looks at an object space shading method that works on Direct3D® 11 class hardware. In particular, we’ll be looking at texture space shading, which uses the texture parameterization of the model. Shading in object space means we have already decoupled the shading rate from pixels, and it’s also easy to decouple in time. We’ve used these decouplings to find ways to improve performance, but we’ll also discuss some future possibilities in this space.
When most people hear the term “texture space shading”, they generally think of rasterizing geometry in texture space. There are two difficulties with that kind of approach: visibility and choosing the right resolution. Rasterizing in texture space means you don’t have visibility information from the camera view, so you end up shading texels that aren’t visible. Your resolution choices are also limited: you can only rasterize at one resolution at a time.
Resolution choice is important because it drives your total shading cost. Each finer mipmap level has 4× the texels, and therefore 4× the shading cost, of the level below it. So if you need 512×512 to match pixel rate for one part of an object and 1024×1024 for another part, but you can only pick one, which do you pick? If you pick 512×512, part of the object will be undersampled. If you pick 1024×1024, part of your object will cost 4× as much as needed. The waste grows with every level you span. If you have an object that spans 4 mipmap levels, for example, you can increase shading cost by up to 4³ = 64× to hit a particular resolution target for the whole object! The following image shows just a few levels of a single character’s texture accesses for the given view (after early depth rejection has been applied), to give you an idea of just how much waste there can be when you don’t select the right resolution in texture space shading.
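To make the arithmetic concrete, here is a minimal sketch of that cost multiplier (my own illustration, not code from the paper):

```cpp
#include <cmath>
#include <cstdio>

// Relative cost of shading a region at mip level 'chosen' when level
// 'needed' would have matched the pixel rate. A lower level index means
// finer resolution, and each finer level has 4x the texels.
double ShadingCostMultiplier(int neededLevel, int chosenLevel)
{
    return std::pow(4.0, neededLevel - chosenLevel);
}

int main()
{
    // An object spanning 4 mip levels, shaded entirely at the finest one:
    // the coarsest parts pay 4^3 = 64x more than they need.
    std::printf("%.0fx\n", ShadingCostMultiplier(10, 7)); // prints "64x"
}
```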
So let’s try a different approach. Instead, we rasterize in screen space (from the camera view), and instead of shading, we just record the texels we need as shading work. This is why we chose the term “texel shading”. Since we are rasterizing from the camera view, we get the two pieces of information that aren’t normally available in texture space methods: visibility after the early depth test, and the screen space derivatives for selecting the mipmap level. The units of work are the texels themselves.
At a high level, the process is as follows:

1. Rasterize the scene from the camera view. Instead of shading, each fragment records the texel tile (and mipmap level) it needs as shading work.
2. Run a compute pass that shades each requested tile into the object’s texture.
3. Rasterize the geometry again, this time sampling the freshly shaded texture.
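To make the first step concrete, here is the kind of per-fragment computation the recording pass performs, written as plain C++ rather than shader code. The mip selection follows standard texture filtering; the tile ID packing scheme is purely an assumption of mine:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

constexpr int kTileSize = 8; // texels are shaded in 8x8 tiles (see below)

// Standard mip selection from screen-space UV derivatives, as in ordinary
// texture filtering: level = log2 of the larger texel footprint.
int SelectMipLevel(float dudx, float dvdx, float dudy, float dvdy, int texSize)
{
    float fx = std::sqrt(dudx * dudx + dvdx * dvdx) * texSize;
    float fy = std::sqrt(dudy * dudy + dvdy * dvdy) * texSize;
    float footprint = std::max({fx, fy, 1.0f});
    return (int)std::floor(std::log2(footprint));
}

// Pack (tile x, tile y, mip) into one ID; identical IDs produced by
// different fragments are redundant requests for the same tile of work.
uint32_t MakeTileID(float u, float v, int mip, int texSize)
{
    uint32_t levelSize = std::max(1, texSize >> mip);
    uint32_t tx = std::min((uint32_t)(u * levelSize), levelSize - 1) / kTileSize;
    uint32_t ty = std::min((uint32_t)(v * levelSize), levelSize - 1) / kTileSize;
    return ((uint32_t)mip << 24) | (ty << 12) | tx; // fits textures up to 16K
}
```

Because fragments from neighboring pixels tend to land in the same tile, deduplicating on this ID removes most of the redundant requests.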
Note that texel shading can be applied selectively, on a per-object or even per-fragment basis. The fragment-level choice happens at the time of the first geometry pass, so it still incurs the additional cost of the other stages, but it allows for fallbacks to standard forward rendering. It also lets you split shading work between texel shading and pixel shading.
There are a few details of how this is done that I will only mention briefly here; hopefully you have a chance to read the Eurographics short paper on the topic to get a better idea.
The first is that we actually shade 8×8 tiles of texels rather than individual texels. We keep a sort of cache, with one entry per tile, that we use both to eliminate redundant tile shades and to track tile age for techniques that reuse shades from previous frames.
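A minimal CPU-side sketch of such a cache might look like this (the hash-map structure and names are my assumptions; the real version keeps its entries on the GPU):

```cpp
#include <cstdint>
#include <unordered_map>

// Per-tile cache sketch. It serves two purposes: eliminating redundant
// tile shades within a frame, and tracking tile age so shading results
// can be reused across frames.
struct TileCache
{
    std::unordered_map<uint32_t, uint32_t> lastShadedFrame; // tile ID -> frame

    // Returns true if the tile needs shading this frame: it has either never
    // been shaded, or was last shaded more than maxAge frames ago. maxAge = 0
    // reshades every frame; larger values reuse results from previous frames.
    bool Request(uint32_t tileID, uint32_t frame, uint32_t maxAge)
    {
        auto it = lastShadedFrame.find(tileID);
        if (it != lastShadedFrame.end() && frame - it->second <= maxAge)
            return false; // still fresh enough: redundant request eliminated
        lastShadedFrame[tileID] = frame;
        return true;
    }
};
```

With maxAge greater than zero, tiles shaded in recent frames are reused as-is, which is the temporal decoupling mentioned earlier.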
The second is that we need to interpolate vertex attributes in the compute shader. To do so, we keep a map we call the “triangle index texture”, which tells us which triangle we need in order to shade a particular texel.
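Assuming the triangle index texture has been sampled and the triangle’s three vertices fetched from the mesh’s vertex buffer, the interpolation itself reduces to barycentric weights in UV space. A sketch, with illustrative types:

```cpp
// Attribute interpolation in the texel shading compute pass. All types
// and the attribute set are illustrative, not the paper's actual code.
struct float2 { float x, y; };
struct float3 { float x, y, z; };
struct Vertex { float2 uv; float3 normal; /* position, tangent, ... */ };

// Barycentric weights of point p inside the UV-space triangle (a, b, c).
float3 Barycentrics(float2 p, float2 a, float2 b, float2 c)
{
    float den = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
    float w0  = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / den;
    float w1  = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / den;
    return { w0, w1, 1.0f - w0 - w1 };
}

// Given the three vertices of the triangle that the triangle index texture
// reports for this texel, interpolate an attribute at the texel's UV center.
float3 InterpolateNormal(const Vertex& v0, const Vertex& v1, const Vertex& v2,
                         float2 texelUV)
{
    float3 w = Barycentrics(texelUV, v0.uv, v1.uv, v2.uv);
    return { w.x * v0.normal.x + w.y * v1.normal.x + w.z * v2.normal.x,
             w.x * v0.normal.y + w.y * v1.normal.y + w.z * v2.normal.y,
             w.x * v0.normal.z + w.y * v1.normal.z + w.z * v2.normal.z };
}
```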
This alone gives you no benefit. In fact, it incurs additional cost in a few different ways: an extra geometry pass to record the shading work, memory for the texture of shaded results, and shading at tile granularity, which can end up shading texels that are never actually sampled.
However, once you have an object space renderer like this, new options open up. I’ll mention several of them in this section. The first two, and a bit of the third, are what we’ve tried so far; the rest are things we think are worth trying in the future.
We’re not the first to try something like this. The Nitrous engine, for example, uses a texture space shading technique that is suitable for the Real Time Strategy (RTS) style games it targets. Those games don’t have to worry so much about objects spanning mipmap levels or about occluded texels, so its developers took a different approach. I encourage you to take a look at Dan Baker’s GDC presentation.

Texel shading fits more naturally into a forward rendering pass than a deferred one (as far as I’ve thought about it, although you could also do deferred texture space shading). It requires that the object has a unique texture parameterization, as is often done for a lightmap. It also requires space for the texture of shaded results, and a choice of how high the resolution needs to be. That choice depends on how the texture is to be used; not all the shading needs to be done in texture space.

If there are textures that do not use the unique parameterization, additional derivatives are required for their mipmap level lookups. These fill the role of the screen space derivatives in standard mipmapping, except that they match the sampling rate of the second texture to that of the first. Since the mapping from one UV set to the other tends to stay fixed, it can potentially be precomputed; a sketch of that idea appears at the end of this section.

As far as engine integration goes, texel shading requires the ability to bind the index and vertex buffers of the object during a compute shader pass, preferably a vertex buffer that is already skinned, for performance reasons. Alternatively, you could pre-interpolate everything that doesn’t change dynamically, but that costs more memory and isn’t something we’ve tried.

Adopting texel shading doesn’t require an all-or-nothing choice. You could apply it to a single object. You could even make a per-fragment choice within that object.
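Here is the precomputation sketch promised above: a per-triangle area ratio between the two UV sets, which yields a constant mip bias for the second texture. The area-ratio formulation and all names are my assumptions, not the paper’s method:

```cpp
#include <cmath>

// Precomputing the rate match between two UV sets, per triangle.
struct float2 { float x, y; };

// Signed double-area of a UV triangle.
static float TriArea(float2 a, float2 b, float2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// How much area in the second UV set corresponds to unit area in the first,
// for one triangle. The mapping is fixed, so this can be baked offline.
float Uv1ToUv2AreaScale(float2 a1, float2 b1, float2 c1,
                        float2 a2, float2 b2, float2 c2)
{
    return std::fabs(TriArea(a2, b2, c2) / TriArea(a1, b1, c1));
}

// Mip bias to add when sampling the second texture at the level chosen for
// the first: each factor of 4 in area is one mip level. (A real version
// would also account for the two textures' resolutions and anisotropy.)
float MipBias(float areaScale)
{
    return 0.5f * std::log2(areaScale);
}
```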
I’ve described a way to do object space shading with Direct3D 11 hardware. It’s a texture space approach, but one that uses a camera space rasterization to help with occlusion and with choosing the right mipmap levels for a given view. We’ve explored some ways to reduce shading load by decoupling the shading rate both spatially and temporally, but there’s a lot more to explore. You can get more detail from the Eurographics 2016 short paper.