New Work Graphs sample and Radeon GPU Profiler support for GPU Work Graphs

Chris Hesik

Chris Hesik is the Radeon™ GPU Profiler technical lead for the Developer Tools Group at AMD. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

In June, Microsoft announced the preview of a new addition to the DirectX® 12 API: GPU Work Graphs. At the same time, AMD released a driver along with documentation to support this exciting new paradigm. You can read all about it here on GPUOpen. And last month, AMD announced support for using the same features in Vulkan®. You can read about that support here.

New sample application

Today, we are happy to announce the availability of a new D3D12SimpleClassify sample, which shows the use of a work graph in a simple frame-based graphics application.

D3D12SimpleClassify sample

The sample implements a trivial material classification and shading scenario. It uploads a 2D texture containing an integer value at each pixel location. This texture is generated at build time by running a Python script on the input image, material-ids.png, which is located in the project directory.

The GPU processes this input with a work graph containing two node arrays. The first node array contains a single node, named ClassifyPixels, which uses the Broadcasting launch mode. It reads the input texture, interprets each pixel value as a material ID, and creates a node payload for the second node array, selecting the node in that array that corresponds to the material ID.

The second node array, named ShadePixels, contains one node for each unique material ID. Each node consumes an input record containing a pixel location and writes a color value, unique to its material, at that location in the destination texture UAV. The shaders in the ShadePixels node array all use the Coalescing launch mode.

Finally, the destination Texture is presented to the display Window. Successful execution is determined by comparing the image displayed in the Window against the source image.
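To make the data flow concrete, here is a minimal CPU-side sketch in Python of the classify-then-shade pattern described above. It is not the sample's HLSL and it does not use any graphics API. The function names, the MATERIAL_COLORS table, and the tiny 3x2 input are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical colors, one per material ID. The sample's real colors are not
# given in this post, so these values are placeholders.
MATERIAL_COLORS = {0: (255, 0, 0), 1: (0, 255, 0), 2: (0, 0, 255)}


def classify_pixels(material_ids):
    """Stand-in for the ClassifyPixels node: emit one record per pixel,
    routed to the ShadePixels array index that matches its material ID."""
    records = defaultdict(list)  # array index -> list of (x, y) pixel locations
    for y, row in enumerate(material_ids):
        for x, material_id in enumerate(row):
            records[material_id].append((x, y))
    return records


def shade_pixels(records, width, height):
    """Stand-in for the ShadePixels node array: each node writes its own
    color at every pixel location it received."""
    output = [[None] * width for _ in range(height)]
    for material_id, pixels in records.items():
        color = MATERIAL_COLORS[material_id]
        for x, y in pixels:
            output[y][x] = color
    return output


# A tiny made-up "texture" of material IDs (2 rows, 3 columns).
material_ids = [
    [0, 0, 1],
    [2, 1, 1],
]
records = classify_pixels(material_ids)
image = shade_pixels(records, width=3, height=2)
```

On the GPU, the grouping step is what the work graph does for you: records targeting the same node are gathered and handed to that node's shader, so each ShadePixels shader only ever sees pixels of its own material.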

Here is a diagram for the Work Graph dispatched by this application:

D3D12SimpleClassify Work Graph structure diagram

Radeon GPU Profiler support for Work Graphs

We would also like to introduce support for GPU Work Graphs in the Radeon GPU Profiler (RGP). As you may be aware, an application needs to be frame-based in order to be profiled with RGP, whether it uses DirectX or Vulkan. Now that frame-based Work Graphs samples are available for both DirectX and Vulkan, it makes sense to talk about how you can use RGP with Work Graphs. Support for Work Graphs is available in RGP starting with version 1.15, which can be downloaded as part of the Radeon Developer Tool Suite right here on GPUOpen.

New event types

First up, there are new event types that can show up in RGP’s event list. The main new event type that you will see for a DirectX Work Graph application is the SubDispatch event, as seen in the image below.

RGP event list with SubDispatch events

Each of these events represents a separate dispatch that originates from the GPU, rather than a normal dispatch that originates from a Dispatch (or similar) command recorded in a command buffer by the CPU. As with normal dispatches, each SubDispatch shows its dispatch dimensions, letting you know how many work items are associated with each graph node. SubDispatches do not have an associated pipeline, so the ‘API PSO hash’ and ‘Driver internal pipeline hash’ fields in the Details panel are shown as “N/A” for these events.
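As a rough illustration of how dispatch dimensions translate into work items, the Python sketch below multiplies out a set of dimensions. It assumes the dimensions are thread group counts, as with a regular Dispatch call, and that you know the shader's thread group size. The helper names and all of the numbers are made up for the example and are not taken from an RGP capture.

```python
from math import prod


def thread_group_count(dispatch_dims):
    """Number of thread groups launched for (x, y, z) dispatch dimensions."""
    return prod(dispatch_dims)


def thread_count(dispatch_dims, threads_per_group):
    """Total threads, given the (x, y, z) thread group size of the node's shader."""
    return thread_group_count(dispatch_dims) * prod(threads_per_group)


# Hypothetical values: a 16x8x1 dispatch of a shader with 8x8x1 threads per group.
groups = thread_group_count((16, 8, 1))
threads = thread_count((16, 8, 1), (8, 8, 1))
```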

RGP can also show the GPU work associated with setting up the backing memory for a graph dispatch. This work is displayed as the new event type InitGraphBackingStore, which you can see in RGP’s event list below. Since RGP only captures GPU work associated with a single frame, you will only see this event if the application initializes the backing memory in the captured frame. That is most likely when the application performs the initialization every frame. However, as the screenshot below shows, the GPU work associated with InitGraphBackingStore can be expensive (over 1200 μs per frame here), so setting up the backing memory each frame may not lead to the most performant application. For the purposes of this demonstration, the D3D12SimpleClassify application accepts a command line argument that forces it to set up the backing memory each frame, which makes this work easy to see in RGP. To reproduce this, run the sample with the -AlwaysResetBackingStore command line argument.

RGP event list with InitGraphBackingStore event
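To put the 1200 μs figure in perspective, the Python sketch below works out what share of a frame it would consume and contrasts initializing once with initializing every frame. The 60 frames-per-second target is our assumption, not something measured in the capture, and the BackingStoreTracker class is a hypothetical stand-in for application logic rather than part of the sample or of any graphics API.

```python
def frame_budget_fraction(cost_us, frames_per_second):
    """Fraction of one frame's time budget consumed by a cost in microseconds."""
    frame_budget_us = 1_000_000 / frames_per_second
    return cost_us / frame_budget_us


class BackingStoreTracker:
    """Hypothetical bookkeeping: count how often the backing memory would be
    initialized. always_reset mimics the -AlwaysResetBackingStore argument."""

    def __init__(self, always_reset=False):
        self.always_reset = always_reset
        self.initialized = False
        self.init_count = 0

    def begin_frame(self):
        # Initialize on the first frame, or on every frame when forced to.
        if self.always_reset or not self.initialized:
            self.init_count += 1  # stands in for the InitGraphBackingStore GPU work
            self.initialized = True


# Share of an assumed 60 fps frame taken by 1200 microseconds of setup work.
fraction = frame_budget_fraction(1200, 60)

once = BackingStoreTracker()
every_frame = BackingStoreTracker(always_reset=True)
for _ in range(3):
    once.begin_frame()
    every_frame.begin_frame()
```

Under that assumption the setup work takes roughly 7% of each frame, which is why initializing once and reusing the backing memory is usually the better default outside of a profiling demonstration.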

For Vulkan apps that use the new VK_AMDX_shader_enqueue extension, the equivalent of the SubDispatch event name seen in RGP is vkCmdSubDispatch and the equivalent of the InitGraphBackingStore event name is vkCmdInitializeGraphScratchMemoryAMD.

Wavefront occupancy for Work Graphs

In addition to the new event types, RGP also shows the wavefront activity from the Work Graph. Below, we see a zoomed-in section of the Wavefront occupancy timeline, showing several SubDispatch events and the corresponding set of wavefronts launched from those dispatches.

RGP Wavefront occupancy timeline with SubDispatch events

If you’re interested in seeing the wavefront activity from individual SubDispatches, you can ask the Wavefront occupancy pane to color the wavefronts by event. This will give you a visual indicator of which waves come from which SubDispatch event.

RGP Wavefront occupancy timeline with wavefronts colored by event

Current limitations

There are a few features in RGP that are not fully functional for Work Graphs.

  • RGP is currently unable to extract the disassembled ISA from shaders used in a work graph. Because of this, the ISA tab in the Pipeline state pane will be disabled for SubDispatch events.
  • Because RGP does not have access to a graph node’s ISA, it is currently unable to support the Instruction timing feature. Thus, the Instruction timing pane in RGP will be blank for SubDispatch events.
  • Similarly, the Cache Counter data displayed in RGP will be unavailable for parts of the timeline where graph dispatches are taking place.

You can still view ISA, Instruction timing and Counter data for non-Work Graph related events in an application that uses Work Graphs, and we plan to enable these features for SubDispatch events in future driver and RGP builds.

If you are experimenting with GPU Work Graphs in your applications and run into any issues with RGP, or if you have ideas for how RGP could better support Work Graphs, please let us know via our GitHub Issues page.

To learn more about getting started with GPU Work Graphs, don’t miss our GPU Work Graphs primer!

Work Graphs

GPU Work Graphs in Microsoft DirectX® 12

Our primer on GPU Work Graphs introduces this exciting new paradigm for graphics developers, which enables a live shader kernel to dispatch new workloads on demand without needing to circle back around to the CPU first.
