
AMD Radeon™ GPU Profiler 2.4 adds support for AMD Radeon™ RX 9000 Series, pure-compute applications, DirectML applications (and more!)

Chris Hesik

Chris Hesik is the Radeon™ GPU Profiler technical lead for the Developer Tools Group at AMD. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

We are excited to announce the release of the AMD Radeon™ GPU Profiler (RGP) v2.4 and present some of the new things you can find in this release.

New! Support for AMD Radeon™ RX 9000 Series GPUs

AMD recently released the AMD Radeon™ RX 9000 Series GPUs, based on AMD RDNA™ 4 Architecture. Over the past year, much of the tools team's effort has gone into adding support for these latest GPUs. All of the RGP features are now available to help you optimize your GPU applications for this new architecture.

New! Profile pure compute DirectX® 12 and Vulkan® applications

The latest version of the AMD Software: Adrenalin Edition™ driver, coupled with the latest version of the AMD Radeon™ Developer Panel (RDP), now supports a new profile capture mechanism. This should be mostly transparent to developers; however, one benefit of the new mechanism is that it enables profiling new types of applications using RDP and RGP. In previous releases, for DirectX® 12 and Vulkan®, only frame-based applications (those that called Present) were supported. With this new release, RDP and the driver can now capture profiles from pure compute, console-based applications. So, if you have an application that uses DirectX® 12 or Vulkan® only to dispatch compute shaders, and the application does not present to the screen, you can now enjoy the benefits of RGP.

The capture mechanism is similar to what is supported for HIP and OpenCL™. You can configure RDP to capture dispatches within the Profiling UI: simply set the Capture mode setting to Dispatch. Then, when you click the Capture profile button, RDP and the driver will capture the compute dispatches from the pure compute application.

Radeon™ Developer Panel Dispatch capture

You can also configure RDP to automatically capture a range of dispatches. To do this, change the Auto capture mode setting to Dispatch range and provide a Dispatch start index and a Dispatch count. Then when the pure compute application is run, the specified range of dispatches will be automatically captured without any additional user interaction.

Radeon™ Developer Panel Dispatch auto capture

It is worth noting here that the Dispatch count setting is also used for pure compute applications when Auto capture is not enabled. In this case, when you click the Capture profile button, the number of dispatches specified will be captured.
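To make the two settings concrete, here is a minimal Python sketch of which dispatches would fall inside a capture. It is only a model of the behavior described above, not RDP code: the function names are hypothetical, and it assumes that dispatches are counted in submission order starting from zero and that a range running past the end of the application is simply cut short. The Radeon™ Developer Panel User Manual remains the authority on the exact indexing.

```python
from typing import List


def auto_capture_range(start_index: int, count: int, total_dispatches: int) -> List[int]:
    """Dispatch indices covered by a 'Dispatch range' auto capture.

    Assumption: indices are zero-based and counted in submission order.
    """
    if start_index < 0 or count < 0:
        raise ValueError("start_index and count must be non-negative")
    # Assumption: a range that runs past the last dispatch is truncated.
    end = min(start_index + count, total_dispatches)
    return list(range(start_index, end))


def manual_capture_range(trigger_index: int, count: int, total_dispatches: int) -> List[int]:
    """Dispatch indices covered when Capture profile is clicked by hand.

    `trigger_index` is the (hypothetical) index of the first dispatch
    submitted after the click; `count` plays the role of Dispatch count.
    """
    return auto_capture_range(trigger_index, count, total_dispatches)


if __name__ == "__main__":
    # Auto capture: Dispatch start index = 10, Dispatch count = 5.
    print(auto_capture_range(start_index=10, count=5, total_dispatches=100))
    # Manual capture near the end of a 100-dispatch run.
    print(manual_capture_range(trigger_index=98, count=5, total_dispatches=100))
```

Under these assumptions the first call covers dispatches 10 through 14, and the second covers only dispatches 98 and 99 because the application finishes before five dispatches have been submitted.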

For more details on the RDP configuration options, please see the Radeon™ Developer Panel User Manual on gpuopen.com.

When you then launch RGP to visualize the profiling data captured from a pure compute application, you may notice a few differences in the user interface. Some UI elements that are only meaningful for graphics applications are hidden when viewing data from a pure compute application. Some of the Overview panes (like the Context rolls and the Render/depth targets panes) are hidden in this case. The Frame summary pane will be replaced by the Profile summary pane. There will also be a few minor differences in some of the other panes.

New! Profile DirectML applications

In addition to providing support for profiling pure compute applications, this version of RGP also has some enhancements related to profiling Direct Machine Learning (DirectML) applications. An introduction to DirectML can be found here. RGP can be used to analyze the performance of a DirectML application, just like any other DirectX® 12 application. This includes the support mentioned earlier for DirectML applications that are pure compute (non-graphics) applications.

One further feature in RGP provides extra insight for DirectML applications. Under the hood, DirectML makes use of DirectX® 12 meta commands. When you profile a DirectML application, RGP will give you information about the meta commands it uses. These are presented in both the Event timing pane and the Event timeline row of the Wavefront occupancy pane as additional user markers, which tell you the category of each meta command. The two screenshots below show these additional user markers in a case where the meta commands are executing a Convolution.

Here is what this looks like in the Event timing pane:

Radeon™ GPU Profiler DirectML meta commands in the Event timing pane

And here is the same information displayed in the Event timeline row of the Wavefront occupancy pane:

Radeon™ GPU Profiler DirectML meta commands in the Wavefront occupancy pane

Enhanced! Improved support for Work Graphs applications

RGP 2.4 also enhances the experience for developers working with Work Graphs. In addition to the features mentioned in this blog post, RGP now supports viewing both shader ISA and Instruction timing data for Work Graph sub-dispatches. Shader ISA for sub-dispatches will be displayed in the usual place: in the ISA tab of the Pipeline state pane.

Radeon™ GPU Profiler Sub-dispatch ISA

Similarly, after selecting a sub-dispatch event, you can navigate to the Instruction timing pane to view the low-level instruction timing data for the selected event.

Radeon™ GPU Profiler Sub-dispatch Instruction timing

Enhanced! Updates for the ISA views

There have also been some additional UI enhancements to the ISA views in RGP. As you may be aware, RGP 1.15 introduced a new ISA view experience. In RGP 2.4, we made a few improvements. Now, when you hover the mouse over an instruction in the Opcode column, a tooltip will appear to show some additional details about that instruction. This tooltip will show the Instruction, a Description and the Encoding used. See an example below.

Radeon™ GPU Profiler ISA tooltip

The information displayed in the tooltip comes directly from the AMD machine-readable GPU ISA specifications. To achieve this, the ISADecoder API has been integrated into RGP’s ISA views. By having this information at your fingertips within RGP, you will no longer need to break your optimization flow by having to reach for an external ISA specification document.

When searching for text in shader ISA, previous versions of RGP would highlight an entire line where a search match was found. Starting with RGP 2.4, individual search matches within a line are highlighted. In the screenshot below, we have searched for the vector register v2. As you can see, each individual instance of v2 is highlighted, including each separate instance on lines 306 and 313.

Radeon™ GPU Profiler ISA search match highlighting
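The difference between highlighting a whole line and highlighting each match can be illustrated with a short Python sketch. This is not how RGP implements its search; it is a plain substring search written for illustration, the function names are hypothetical, and the ISA line is a made-up example rather than one taken from the screenshot.

```python
from typing import List, Tuple


def find_match_spans(line: str, query: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of every non-overlapping match."""
    spans: List[Tuple[int, int]] = []
    if not query:
        return spans
    start = line.find(query)
    while start != -1:
        spans.append((start, start + len(query)))
        # Continue searching after the match that was just recorded.
        start = line.find(query, start + len(query))
    return spans


def highlight(line: str, query: str) -> str:
    """Wrap each individual match in square brackets instead of marking the whole line."""
    pieces: List[str] = []
    cursor = 0
    for begin, end in find_match_spans(line, query):
        pieces.append(line[cursor:begin])
        pieces.append("[" + line[begin:end] + "]")
        cursor = end
    pieces.append(line[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    isa_line = "v_add_f32 v2, v2, v3"  # illustrative line, not from a real profile
    print(highlight(isa_line, "v2"))
```

Running the sketch prints `v_add_f32 [v2], [v2], v3`: both uses of v2 on the same line are marked separately, which is the per-match behavior described above.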

And more!

In addition to the above features, there are a few other changes worth mentioning.

  • The System information pane will now show information about the driver installed on the system where the profile was captured. This can be useful to know when trying to reproduce issues or when comparing application performance with different driver versions.
  • As with previous releases, this release also includes many bug fixes and minor changes intended to improve the quality for our users.

Please check out the RGP product page on gpuopen.com to learn more about RGP and to download the latest version.

