Many games take place in vast environments. The basic building block of nearly every such environment is height-field based terrain: there’s no way around having some fields, roads, hills and mountains when building a world. As terrain is almost always visible, a lot of GPU time is spent on rendering it with high quality. Multiple texture layers are blended, the terrain is often highly tessellated close to the camera, and advanced culling techniques are used to skip invisible parts of the terrain or reduce the tessellation rate based on curvature.
One thing which is often forgotten though is shadow map rendering. In general, the terrain gets rendered into the shadow map with the same tessellation level as for the primary view to ensure that there are no self-shadow artefacts. As the tessellation level is not optimized for the shadow camera, but for the primary camera, this often results in a very strong mismatch and shadow maps end up getting extremely over-tessellated.
That’s one part of the problem. The other is the assumption that you can’t really cull the terrain from the shadow map, because everything can potentially be lit by the sun. But is this really true? It turns out it often isn’t, because there’s one more thing to take into account when generating shadow maps: can the object being rendered into the shadow map actually cast a shadow? There’s no point in rendering an object into a shadow map if you can guarantee that every other object in the game will always be in front of it as seen from the light. I hope you see where I’m going with this: as the player usually can’t get below the terrain, there’s some optimization potential here! Let’s take a look at the idea:
In the figure above, we can see the general idea. Only the geometry that is backfacing as seen from the sun (marked red) can actually cast a shadow. Everything else will receive shadows, but not cast them, and hence can be removed from the shadow map wholesale. If you know your terrain curvature and your sun position, you can trivially discard the terrain tiles in which all triangles face the sun. There’s simply no point in rendering those, as there’s nothing they could ever cast a shadow upon, unless you have sub-terrain geometry, in which case you can also easily determine which tiles will occlude it. I’ve created a small test scene to showcase this a bit better:
On the left-hand side, you can see the tile classification. Red means “shadow receive only”, and none of the red tiles were rendered into the shadow map. On the right-hand side, you can see the top-down view with normal shading and shadows. You’ll immediately notice that even though I removed roughly 50% of the geometry from the shadow map, no shadow is missing, as the only shadows are cast by the mountains in the lower left and upper right. Here’s another view of the same scene: you can see the shadows in the bottom left, but most of the geometry, despite being wavy and noisy, does not have enough curvature to end up casting shadows.
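As a rough illustration of the tile classification described above, here is a minimal brute-force sketch. It is not the code used for the test scene. It assumes a z-up height field stored as `heights[y][x]`, a fixed diagonal split of every grid cell into two triangles, and a vector `to_sun` pointing from the terrain towards the sun; the function and parameter names are made up for this example.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def tile_is_receive_only(heights, x0, y0, tile_size, spacing, to_sun):
    """Return True if every triangle of the tile faces the sun.

    The tile covers tile_size x tile_size grid cells starting at (x0, y0),
    so it reads (tile_size + 1)^2 height samples. `to_sun` does not need
    to be normalized, only its direction matters.
    """
    for y in range(y0, y0 + tile_size):
        for x in range(x0, x0 + tile_size):
            p00 = (x * spacing, y * spacing, heights[y][x])
            p10 = ((x + 1) * spacing, y * spacing, heights[y][x + 1])
            p01 = (x * spacing, (y + 1) * spacing, heights[y + 1][x])
            p11 = ((x + 1) * spacing, (y + 1) * spacing, heights[y + 1][x + 1])
            # Assumed triangulation: both triangles are wound so that
            # their normal points up (+z) on flat ground.
            for a, b, c in ((p00, p10, p11), (p00, p11, p01)):
                normal = _cross(_sub(b, a), _sub(c, a))
                if _dot(normal, to_sun) <= 0.0:
                    # Backfacing (or edge-on) as seen from the sun:
                    # this tile may cast a shadow, so keep it.
                    return False
    return True
```

A tile for which this returns `True` only receives shadows and can be left out of the shadow pass, provided nothing that should be shadowed ever sits below the terrain.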
This is an easy optimization as most games already know the curvature of their terrain. All you need to do is evaluate said curvature against your shadow camera: if you can guarantee that everything you see is front-facing, just skip that part of the terrain. The benefits of this optimization are well worth it: shadow map rendering tends to be bottlenecked on the GPU front end, i.e. triangle throughput. Any triangle saved there will improve performance, even more so if tessellation is used. Moreover, you can reap the benefits of this optimization multiple times. The sun shadow typically uses a shadow map cascade, and you can apply the technique to every cascade, getting rid of most of the terrain in all passes. All of those benefits come at minimal CPU cost: once you know the minimum and maximum curvature per tile, you’re done, so there’s really no downside to this optimization.
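One way to make the per-tile test cheap is sketched below. Note the swap: instead of curvature, this sketch stores minimum and maximum slope per tile, which is my interpretation of the per-tile data and not necessarily what a given engine keeps. The names `SlopeBounds`, `compute_slope_bounds`, `is_receive_only` and `shadow_casters_per_cascade` are hypothetical, and the height field is again assumed to be z-up and stored as `heights[y][x]`. For a height field, a surface normal is proportional to `(-gx, -gy, 1)`, where `gx` and `gy` are the slopes along x and y, so bounding the slopes bounds the worst-case facing of the whole tile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlopeBounds:
    gx_min: float
    gx_max: float
    gy_min: float
    gy_max: float


def compute_slope_bounds(heights, x0, y0, tile_size, spacing):
    """Precompute once per tile: min/max finite-difference slopes."""
    gx = [(heights[y][x + 1] - heights[y][x]) / spacing
          for y in range(y0, y0 + tile_size + 1)
          for x in range(x0, x0 + tile_size)]
    gy = [(heights[y + 1][x] - heights[y][x]) / spacing
          for y in range(y0, y0 + tile_size)
          for x in range(x0, x0 + tile_size + 1)]
    return SlopeBounds(min(gx), max(gx), min(gy), max(gy))


def is_receive_only(bounds, to_sun):
    """Conservative O(1) test: True only if no normal within the slope
    bounds can be backfacing with respect to `to_sun`."""
    sx, sy, sz = to_sun
    # dot((-gx, -gy, 1), to_sun) = sz - gx*sx - gy*sy; take the worst case.
    worst_x = max(bounds.gx_min * sx, bounds.gx_max * sx)
    worst_y = max(bounds.gy_min * sy, bounds.gy_max * sy)
    return sz - worst_x - worst_y > 0.0


def shadow_casters_per_cascade(tile_bounds, cascade_tiles, to_sun):
    """Classify each tile once, then filter the tile list of every cascade.

    tile_bounds maps a tile id to its SlopeBounds; cascade_tiles holds one
    list of visible tile ids per cascade.
    """
    receive_only = {tile for tile, bounds in tile_bounds.items()
                    if is_receive_only(bounds, to_sun)}
    return [[tile for tile in tiles if tile not in receive_only]
            for tiles in cascade_tiles]
```

Because the sun is a directional light, the classification is the same for every cascade, so it only has to be evaluated once per tile whenever the sun direction changes. The test is conservative: it may keep a tile that could have been skipped, but under the stated assumptions it should not skip a tile that contains a backfacing triangle.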