| Unlock the Rasterizer with Out-of-Order Rasterization

Full-speed, out-of-order rasterization

If you’re familiar with graphics APIs, you’re certainly aware of the API ordering guarantees. At their core, these guarantees mean that if you put two triangles into the pipeline one after the other, they will also end up in the framebuffer in exactly the same order. This makes it possible, for instance, to sort transparent geometry by depth and get the correct blending.

While this guarantee is sometimes necessary for correctness, it’s often an unnecessary constraint. If you’re laying down a G-Buffer without blending, for example, you typically don’t care about a specific rasterization order. The same commonly applies to depth-only rendering operations. For those cases, GCN hardware supports a special “out-of-order” rasterization mode which does exactly what the name implies: it relaxes the ordering guarantee and allows fragments to be produced out of order. This can improve efficiency in various cases, and in fact, the driver will try to enable it automatically when it is safe to do so.

Comparison of strict and relaxed ordering, showing that the only possible difference is in the case of overlapping primitives. In those cases, there is no tie-breaker rule when relaxed order is in effect.
Comparison of strict and relaxed order rendering in two cases. On the left-hand side, you can see two overlapping primitives being rasterized; the arrow indicates the view direction. With relaxed ordering, the output can vary depending on how the hardware decides to process the triangles. On the right-hand side, in the case of non-overlapping primitives, the order is well defined in both cases.

However, there are cases where forcing out-of-order rasterization at the driver level is not safe, for instance when you’re rendering with a less-or-equal depth test. In this case, out-of-order rendering can produce different results, because geometry which is Z-fighting is no longer guaranteed to produce the same results as with strict ordering. If you don’t care about the specific pattern of your Z-buffer artifacts (or you know that your scene doesn’t produce them), enabling out-of-order rasterization manually is fine, but it’s a case where the driver can’t do it for you.
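To make the Z-fighting case concrete, here is a small CPU-side sketch of a single pixel resolved with a less-or-equal depth test. It is not how the hardware works internally; `Fragment`, `Pixel` and `resolve_less_or_equal` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one rasterized fragment landing on a pixel.
struct Fragment {
    float depth;  // in [0, 1], smaller is closer to the viewer
    int   color;  // identifies the triangle the fragment came from
};

// Hypothetical stand-in for one pixel of the depth and color targets.
struct Pixel {
    float depth = 1.0f;  // cleared to the far plane
    int   color = 0;     // cleared to "no triangle"
};

// Applies the fragments in the order given, using a less-or-equal depth
// test with depth writes enabled.
Pixel resolve_less_or_equal(const std::vector<Fragment>& fragments) {
    Pixel pixel;
    for (const Fragment& f : fragments) {
        assert(f.depth >= 0.0f && f.depth <= 1.0f);
        if (f.depth <= pixel.depth) {
            pixel.depth = f.depth;
            pixel.color = f.color;
        }
    }
    return pixel;
}
```

Feeding two fragments with the same depth in both possible orders yields the same final depth but a different final color, which is exactly the tie the driver cannot resolve on its own. When the depths differ, the closer fragment wins regardless of order.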

How to use it

Today, we’re introducing a new Vulkan extension, VK_AMD_rasterization_order, which allows you to control out-of-order rendering on a per-draw-call basis. It’s a new rasterization state, which you can turn on for everything that does not require strict primitive ordering. This generally includes every G-Buffer pass, all shadow map rendering, and passes that use commutative blending (additive blending, for example). In those cases, you can turn on RELAXED order.
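As a rough illustration of what "commutative blending" means here, the sketch below blends two fragments into one pixel on the CPU, once additively and once with classic "over" alpha blending. `BlendFragment`, `blend_additive` and `blend_over` are hypothetical helpers for this example, and the blend equations are the textbook ones rather than anything specific to the extension.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-channel fragment with an alpha value.
struct BlendFragment {
    float color;
    float alpha;
};

// Additive blending: dst = dst + src. Swapping two fragments does not
// change the result.
float blend_additive(float dst, const std::vector<BlendFragment>& fragments) {
    for (const BlendFragment& f : fragments) {
        dst = dst + f.color;
    }
    return dst;
}

// "Over" blending: dst = src * alpha + dst * (1 - alpha). Each step
// depends on what is already in the framebuffer, so order matters.
float blend_over(float dst, const std::vector<BlendFragment>& fragments) {
    for (const BlendFragment& f : fragments) {
        assert(f.alpha >= 0.0f && f.alpha <= 1.0f);
        dst = f.color * f.alpha + dst * (1.0f - f.alpha);
    }
    return dst;
}
```

With fragments (1.0, 0.5) and (0.25, 0.5) on a black background, additive blending gives 1.25 in either order, while "over" blending gives 0.375 or 0.5625 depending on which fragment arrives first. With many fragments, floating-point rounding can still introduce tiny order-dependent differences even for additive blending.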

There’s no downside to using RELAXED: performance will be the same as or better than in STRICT ordering mode. You also keep the benefits of triangle order optimization tools like AMD Tootle, as out-of-order execution still roughly follows the original input order.

To actually use it in Vulkan, you first need to enable it when creating the device. Remember that in Vulkan, only extensions that have been enabled during device creation can actually be used. Once that is done, it hooks into the normal Vulkan extension mechanism. As this is a rasterizer state, you set the pNext field of a VkPipelineRasterizationStateCreateInfo and link it to the structure specified in the extension. For example:


// Extension structure that selects the rasterization order for this pipeline.
VkPipelineRasterizationStateRasterizationOrderAMD orderAMD = {};
orderAMD.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_RASTERIZATION_ORDER_AMD;
orderAMD.rasterizationOrder = VK_RASTERIZATION_ORDER_RELAXED_AMD;

// Chain it into the regular rasterization state through pNext.
VkPipelineRasterizationStateCreateInfo rasterizationStateCreateInfo = {};
rasterizationStateCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterizationStateCreateInfo.pNext = &orderAMD;
// Fill in polygonMode, cullMode and the remaining fields as usual.

What kind of performance can you expect? We’ve seen increases in the 10% range. Of course, you won’t see any benefit if the driver has already enabled it automatically (for instance, for depth-only rendering). In nearly all other cases, the driver has to play it safe and cannot enable it even though there wouldn’t be any visible artifacts. With this extension, we enable the application to decide whether relaxed order rendering is sufficient and reap the performance benefits.
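Since the decision now sits with the application, one possible way to organize it is a small per-pass policy function. The sketch below is only an assumption about how an engine might encode the rules from this post: `RasterOrder` is a local stand-in mirroring the two values of VkRasterizationOrderAMD, and `PassInfo` and `choose_raster_order` are hypothetical names.

```cpp
// Local stand-in mirroring VK_RASTERIZATION_ORDER_STRICT_AMD and
// VK_RASTERIZATION_ORDER_RELAXED_AMD, so the sketch needs no Vulkan headers.
enum class RasterOrder { Strict, Relaxed };

// Hypothetical description of a render pass, filled in by the application.
struct PassInfo {
    bool writes_color;          // false for depth-only passes such as shadow maps
    bool blending_enabled;      // any blending on the color targets
    bool blend_is_commutative;  // e.g. additive; false for "over" alpha blending
    bool tolerates_depth_ties;  // the app accepts any Z-fighting pattern
};

// Conservative policy: stay strict whenever the final image could depend on
// primitive order in a way the application has not explicitly accepted.
RasterOrder choose_raster_order(const PassInfo& pass) {
    const bool order_dependent_blend =
        pass.writes_color && pass.blending_enabled && !pass.blend_is_commutative;
    if (order_dependent_blend || !pass.tolerates_depth_ties) {
        return RasterOrder::Strict;
    }
    return RasterOrder::Relaxed;
}
```

Under this policy a G-Buffer pass, a shadow map pass and an additive particle pass would all select relaxed order, while sorted alpha-blended transparency stays strict. The chosen value would then be written to the rasterizationOrder field when the pipeline is created.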

In order to use the new extension, you need the Crimson 16.5.2 driver or later, where the extension is already implemented and exposed. You can find the extension documentation here; the extension itself is very simple and consists of a new rasterization state. If you want to try it out right away, we’ve also got you covered with our out-of-order rasterization sample application.
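Because the extension is only present on drivers that expose it, it is worth checking for it before requesting it at device creation. The following sketch assumes the extension names have already been collected into strings (in a real application they would come from vkEnumerateDeviceExtensionProperties); `supports_rasterization_order` and `extensions_to_enable` are hypothetical helper names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Extension name string used to request the extension at device creation.
const char* const kRasterizationOrderExtension = "VK_AMD_rasterization_order";

// Returns true if the driver lists the extension among the available
// device extensions.
bool supports_rasterization_order(const std::vector<std::string>& available) {
    return std::find(available.begin(), available.end(),
                     kRasterizationOrderExtension) != available.end();
}

// Builds the list of extension names to request. The extension is added
// only when the driver exposes it, so device creation does not fail on
// drivers that lack it.
std::vector<const char*> extensions_to_enable(
        const std::vector<std::string>& available) {
    std::vector<const char*> enabled;
    if (supports_rasterization_order(available)) {
        enabled.push_back(kRasterizationOrderExtension);
    }
    return enabled;
}
```

The resulting vector is meant to feed the ppEnabledExtensionNames and enabledExtensionCount fields of VkDeviceCreateInfo, alongside whatever other extensions the application needs. If the extension is missing, the application should simply leave pNext unset and render with the default strict order.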

| SAMPLE

Vulkan OoORasterization

The Vulkan™ out-of-order rasterization sample shows how to use the out-of-order rasterization extension.

Matthaeus Chajdas
Matthäus Chajdas is a developer technology engineer at AMD. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.
