Professional compute content is no longer on GPUOpen.
Browse some of our GPUOpen compute-related content
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.
This talk by AMD’s Lou Kramer at 4C in 2018 discusses optimising your engine using compute.
This talk will discuss Direct3D® 12 in general, as well as some of the features that were leveraged to accomplish this goal, such as Async Compute, Tiled Resources, Debugging, Copy Queues, and HDR.
Tom Hammersley from Codemasters talks about integrating FidelityFX into the Ego Engine and implementing Contrast Adaptive Sharpening (CAS).
Radeon GPU Analyzer (RGA) supports DirectX® 12 compute shaders through its command-line tool. This mode can generate GCN/RDNA ISA disassembly for your compute shaders, regardless of the physically installed GPU.
Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.
Understanding concurrency (and what breaks it) is extremely important when optimizing for modern GPUs.
The AMD TrueAudio Next open-source library and driver-controlled CU Reservation enable dramatically higher levels of audio rendering realism in VR.
Tamas Rabel talks about how Total War: Warhammer utilized asynchronous compute to extract some extra GPU performance in DirectX® 12 and delves into the process of moving some of the passes in the engine to asynchronous compute pipelines.
This article explains how to produce an HSA code object (Hsaco) from assembly code and also takes a closer look at some new features of the GCN architecture.
Asynchronous compute can help you maximize GPU utilization. I’ll explain the details using the nBodyGravity sample from Microsoft.
This is a slightly modified version of the Microsoft D3D12nBodyGravity sample. It demonstrates the use of asynchronous compute shaders (multi-engine) to simulate an n-body gravity system.
This sample presents a technique for achieving highly optimized, user-defined separable filters.
This sample demonstrates how to implement a simple GPU-based particle system.
This sample provides an example implementation of the Forward+ algorithm, which extends traditional forward rendering to support high numbers of dynamic lights while maintaining performance.
This sample provides an example implementation of two tile-based light culling methods: Forward+ and Tiled Deferred.