When it comes to multi-GPU (mGPU), most developers immediately think of complicated Crossfire setups with two or more GPUs and how to make their game run well on those setups. This is however only one side of the mGPU story – the other one is mGPU during content creation. If you’re a developer working on a game, you should think of using mGPU to make your life easier.
One tool per GPU
In many cases, artists will run your in-game editor or preview while working with other DCC tools like Maya® or Mari®. These DCC applications must then share GPU resources with your editing tool as well as the desktop compositor. Especially when working with dense geometry and high-resolution textures, memory can easily become a bottleneck – more so as the editing tools are optimized for fast iteration time, not minimal resource usage. Here, you can get a much better user experience by using an mGPU system and forcing your tool to use the second GPU.
This is easy to do in all modern graphics APIs. Direct3D® 11 allows you to enumerate your adapters, and just picking the second one is usually fine, as most applications will pick the first one by default. Make sure not to simply call D3D11CreateDevice() with a null adapter; instead, enumerate them all and pick an appropriate one. You can reuse the same enumeration code for Direct3D 12.
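As a rough illustration of the enumeration described above, here is a minimal sketch. The AdapterInfo struct and the PickToolAdapter, EnumerateDxgiAdapters and CreateDeviceOnAdapter helpers are hypothetical names made up for this example; only the DXGI and Direct3D 11 calls inside the Windows-only part are real API. The selection policy (second hardware adapter, falling back to the first) is just one reasonable reading of "pick an appropriate one", so adjust it to your setup.

```cpp
#include <cstdint>
#include <string>
#include <vector>

#ifdef _WIN32
#include <d3d11.h>
#include <dxgi.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "dxgi.lib")
#endif

// Hypothetical, API-neutral record for one adapter (not a DXGI type).
struct AdapterInfo {
    std::wstring name;
    std::uint64_t luid;   // packed as (HighPart << 32) | LowPart
    bool isSoftware;      // true for e.g. the Microsoft Basic Render Driver
};

// Returns the index of the second hardware adapter, falling back to the
// first hardware adapter if there is only one, or -1 if there is none.
int PickToolAdapter(const std::vector<AdapterInfo>& adapters) {
    int firstHardware = -1;
    for (int i = 0; i < static_cast<int>(adapters.size()); ++i) {
        if (adapters[i].isSoftware) continue;   // skip software rasterizers
        if (firstHardware < 0) firstHardware = i;
        else return i;                          // second hardware adapter
    }
    return firstHardware;
}

#ifdef _WIN32
using Microsoft::WRL::ComPtr;

// Lists all DXGI adapters in enumeration order; works for D3D11 and D3D12.
std::vector<AdapterInfo> EnumerateDxgiAdapters() {
    std::vector<AdapterInfo> result;
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return result;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0;
         factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        const std::uint64_t luid =
            (static_cast<std::uint64_t>(
                 static_cast<std::uint32_t>(desc.AdapterLuid.HighPart)) << 32) |
            desc.AdapterLuid.LowPart;
        result.push_back({desc.Description, luid,
                          (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) != 0});
    }
    return result;
}

// Creates a D3D11 device on the adapter with the given enumeration index.
ComPtr<ID3D11Device> CreateDeviceOnAdapter(UINT index) {
    ComPtr<ID3D11Device> device;
    ComPtr<IDXGIFactory1> factory;
    ComPtr<IDXGIAdapter1> adapter;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return device;
    if (FAILED(factory->EnumAdapters1(index, &adapter))) return device;
    // With an explicit adapter, the driver type must be D3D_DRIVER_TYPE_UNKNOWN.
    D3D11CreateDevice(adapter.Get(), D3D_DRIVER_TYPE_UNKNOWN, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &device, nullptr, nullptr);
    return device;
}
#endif
```

On Windows, you would call EnumerateDxgiAdapters(), pass the result to PickToolAdapter(), and hand the returned index to CreateDeviceOnAdapter(). The selection logic is kept separate from the API calls so that it can be exercised without any GPU present.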
For Vulkan®, this requires a (currently experimental) extension to associate a Vulkan device with the corresponding D3D device. You can assume that most applications will just pick the first adapter, and if you want to know which of your Vulkan devices that is, you can use the VK_KHX_external_memory_capabilities extension. It provides access to the LUID of a physical device, which you can compare against the LUID that DXGI reports for each adapter. Please keep in mind that this extension is experimental: the idea is that you can test it today and give feedback on whether it works, and eventually it’ll get replaced by a KHR variant. It’s very likely that the final extension will look very similar, but until then, please use this for your internal tools only and don’t ship it yet.
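To make the LUID matching more concrete, here is a small sketch. It assumes an SDK where this functionality is available under its Vulkan 1.1 core names (VkPhysicalDeviceIDProperties and vkGetPhysicalDeviceProperties2) rather than the experimental KHX names discussed above, so treat it as an illustration of the idea, not as the code of the sample. PackLuid, FormatLuid and QueryVulkanLuid are hypothetical helper names.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <vulkan/vulkan.h>   // link against vulkan-1.lib
#endif

// Packs the 8 LUID bytes reported by Vulkan into one integer. The bytes are
// read as little-endian, which matches the in-memory layout of a Windows
// LUID (LowPart first, then HighPart).
std::uint64_t PackLuid(const std::uint8_t* bytes) {
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
    }
    return value;
}

// Formats a packed LUID as 16 lowercase hex digits for logging.
std::string FormatLuid(std::uint64_t luid) {
    char buffer[17];
    std::snprintf(buffer, sizeof(buffer), "%016llx",
                  static_cast<unsigned long long>(luid));
    return buffer;
}

#ifdef _WIN32
// Assumes the instance and the physical device both support Vulkan 1.1.
// Returns false if the driver does not report a valid LUID.
inline bool QueryVulkanLuid(VkPhysicalDevice device, std::uint64_t* outLuid) {
    VkPhysicalDeviceIDProperties idProps = {};
    idProps.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES;

    VkPhysicalDeviceProperties2 props = {};
    props.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props.pNext = &idProps;   // chain the ID query onto the properties query

    vkGetPhysicalDeviceProperties2(device, &props);
    if (!idProps.deviceLUIDValid) return false;

    *outLuid = PackLuid(idProps.deviceLUID);
    return true;
}
#endif
```

A Vulkan physical device and a DXGI adapter refer to the same GPU when the value from QueryVulkanLuid() equals the adapter's AdapterLuid packed as (HighPart << 32) | LowPart.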
To get you started, I’ve written a tiny sample which shows you how to enumerate DXGI adapters (which works for both 11 and 12), and how to enumerate the Vulkan devices. Here’s the sample output for a machine with two Radeon RX 580, and one Radeon Fury X:
Enumerating DXGI devices
DXGI Adapter: Radeon RX 580 Series - LUID: 00000000001251cd
DXGI Adapter: Radeon RX 580 Series - LUID: 0000000000126510
DXGI Adapter: AMD Radeon (TM) R9 Fury Series - LUID: 000000000000be80
DXGI Adapter: Microsoft Basic Render Driver - LUID: 000000000000be66
Enumerating Vulkan devices
Vulkan device: AMD Radeon (TM) R9 Fury Series - LUID: 000000000000be80
Vulkan device: Radeon RX 580 Series - LUID: 00000000001251cd
Vulkan device: Radeon RX 580 Series - LUID: 0000000000126510
That’s it! With the right affinity settings, you can use two GPUs to get robust performance in the most common case, where you need to run both your game and some DCC application, or even just the editor, at the same time.