Professional compute content is no longer on GPUOpen.
Browse some of our GPUOpen compute-related content

AMD FidelityFX – Denoiser
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.

Optimize Your Engine Using Compute
This talk by AMD’s Lou Kramer at 4C in 2018 discusses optimising your engine using compute.

Gears 5 – High-Gear Visuals On Multiple Platforms – YouTube link
This talk discusses Direct3D® 12 in general, as well as some of the features that were leveraged to deliver high-end visuals on multiple platforms in Gears 5, such as Async Compute, Tiled Resources, Debugging, Copy Queues, and HDR.

Integrating AMD FidelityFX into the Ego Engine
Tom Hammersley from Codemasters talks about integrating FidelityFX into the Ego Engine and implementing Contrast Adaptive Sharpening (CAS).

Using Radeon™ GPU Analyzer with Direct3D®12 Compute
Radeon GPU Analyzer (RGA) supports DirectX® 12 compute shaders through its command-line tool. This mode can generate GCN/RDNA ISA disassembly for your compute shaders, regardless of which GPU is physically installed.

Optimizing GPU occupancy and resource usage with large thread groups
Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.

Leveraging Asynchronous Queues for Concurrent Execution
Understanding concurrency (and what breaks it) is extremely important when optimizing for modern GPUs.

The Importance of Audio in VR
The AMD TrueAudio Next open-source library and driver-controlled CU Reservation enable dramatically higher levels of audio rendering realism in VR.

Anatomy Of The Total War Engine: Part IV
Tamas Rabel talks about how Total War: Warhammer utilized asynchronous compute to extract some extra GPU performance in DirectX® 12 and delves into the process of moving some of the passes in the engine to asynchronous compute pipelines.

The Art of AMDGCN Assembly: How to Bend the Machine to Your Will
This article explains how to produce an HSACO (HSA code object) from assembly code and also takes a closer look at some new features of the GCN architecture.

Maxing Out GPU usage in nBodyGravity
Asynchronous compute can help you to get the maximum GPU usage. I’ll be explaining the details based on the nBodyGravity sample from Microsoft.

nBody DirectX® 12 Sample (asynchronous compute version)
This is a slightly modified version of the Microsoft D3D12nBodyGravity sample. It demonstrates the use of asynchronous compute shaders (multi-engine) to simulate an n-body gravity system.

SeparableFilter11 DirectX® 11 SDK Sample
This sample presents a technique for achieving highly optimized, user-defined separable filters.

GPUParticles11 DirectX® 11 SDK Sample
This sample demonstrates how to implement a simple GPU-based particle system.

ForwardPlus11 DirectX® 11 SDK Sample
This sample provides an example implementation of the Forward+ algorithm, which extends traditional forward rendering to support high numbers of dynamic lights while maintaining performance.

TiledLighting11 DirectX® 11 SDK Sample
This sample provides an example implementation of two tile-based light culling methods: Forward+ and Tiled Deferred.