Tutorials Library

Browse all our tutorials in one place


We have over sixty (and growing!) fantastic tutorials, written by our engineers and by guest engineers, for you to discover.

Here they all are in one place, sorted in date order with the most recent at the top.

Using Radeon™ GPU Analyzer with DirectX®12 Graphics

With DirectX 12 comes the ability to generate disassembly and hardware resource usage statistics that are closest to the real-world case, and therefore to make better performance optimization decisions.

Using Radeon™ GPU Analyzer with Direct3D®12 Compute

Radeon GPU Analyzer (RGA) supports DirectX 12 compute shaders through its command line tool. This mode can generate GCN/RDNA ISA disassembly for your compute shaders, regardless of the physically installed GPU.

Decoding Radeon™ Vulkan® versions

A machine-readable mapping that you can integrate into your software for decoding Radeon™ Vulkan® versions. Includes a link to a handy table.

Understanding GPU context rolls

Learn what context rolls on our GPUs are, how they apply to the pipeline and how they’re managed, and what you can do to analyse them and find out if they’re a limiting factor in the performance of your game or application.

Reducing Vulkan API call overhead

This guest post, by Arseny Kapoulkine from Roblox, looks at the costs associated with calling various Vulkan functions tens or hundreds of thousands of times per frame, and ways to bring them down.

First steps when implementing FP16

Half-precision (FP16) computation is a performance-enhancing GPU technology long exploited in console and mobile devices, but not previously used or widely available in mainstream PC development.

Deferred Path Tracing By Enscape

Insights from Enscape into how they designed a renderer that produces path-traced real-time global illumination and can also converge to offline-rendered image quality.

CPU core count detection on Windows

Due to architectural differences between Zen and our previous processor architecture, Bulldozer, developers need to take care when using the Windows® APIs for processor and core enumeration.

Stable barycentric coordinates

The AMD GCN Vulkan extensions allow developers to get access to the barycentric coordinates at the fragment-shader level.

Understanding Vulkan® objects

An important part of learning the Vulkan® API is to understand what types of objects are defined in it, what they represent and how they relate to each other.

Content Creation Tools and Multi-GPU

mGPU isn’t just for gamers – if you’re a developer working on a game, you should think of using mGPU to make your life easier.

Live VGPR Analysis with Radeon GPU Analyzer

This tutorial explains how to use Radeon GPU Analyzer (RGA) to produce a live VGPR analysis report for your shaders and kernels. Basic RGA usage knowledge is assumed.

Using Sub DWord Addressing on AMD GPUs

Sub DWord Addressing is a feature of the AMD GCN architecture which allows the efficient extraction of 8-bit and 16-bit values from a 32-bit register. 
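
As a rough illustration of what "extracting 8-bit and 16-bit values from a 32-bit register" means, the sketch below does the extraction in software with explicit shifts and masks. The function names are hypothetical and this is not GCN ISA; it only shows the arithmetic involved.

```python
def extract_byte(dword: int, index: int) -> int:
    """Return byte `index` (0 = least significant) of a 32-bit value."""
    if not 0 <= index < 4:
        raise ValueError("byte index must be in 0..3")
    return (dword >> (8 * index)) & 0xFF


def extract_word(dword: int, index: int) -> int:
    """Return 16-bit word `index` (0 = low half) of a 32-bit value."""
    if not 0 <= index < 2:
        raise ValueError("word index must be 0 or 1")
    return (dword >> (16 * index)) & 0xFFFF


packed = 0xAABBCCDD
low_byte = extract_byte(packed, 0)   # selects 0xDD
high_word = extract_word(packed, 1)  # selects 0xAABB
```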

Optimizing Terrain Shadows

One thing which is often forgotten is shadow map rendering. As the tessellation level of the terrain is not optimized for the shadow camera, but for the primary camera, this often results in a very strong mismatch and shadow maps end up getting extremely over-tessellated.

Vulkan and DOOM

This post takes a look at the interesting bits of helping id Software with their DOOM Vulkan effort, from the perspective of AMD’s Game Engineering Team.

Vulkan barriers explained

Vulkan barriers are unique as they require you to specify which resources are transitioning, along with a source and destination pipeline stage.

The Importance of Audio in VR

The AMD TrueAudio Next open-source library and driver-controlled CU Reservation enable dramatically higher levels of audio rendering realism in VR.

Anatomy Of The Total War Engine: Part IV

Tamas Rabel talks about how Total War: Warhammer utilized asynchronous compute to extract some extra GPU performance in DirectX® 12 and delves into the process of moving some of the passes in the engine to asynchronous compute pipelines.

AMD GCN Assembly: Cross-Lane Operations

Cross-lane operations are an efficient way to share data between wavefront lanes. This article covers in detail the cross-lane features that GCN3 offers.

Anatomy Of The Total War Engine: Part I

Tamas Rabel, Lead Graphics Programmer on the Total War series, provides a detailed look at the Total War renderer and digs deep into some of the optimizations that the team at Creative Assembly made for the brilliant Total War: Warhammer.

Texel Shading

Game engines do most of their shading work per-pixel or per-fragment. But there is an alternative that has been popular in film for decades…

Using Vulkan Device Memory

This post serves as a guide on how to best use the various Memory Heaps & Memory Types exposed in Vulkan on AMD drivers, starting with some high-level tips.

GCN Shader Extensions for Direct3D and Vulkan

One of the mandates of GPUOpen is to give developers better access to the hardware, and this post details extensions for Vulkan and Direct3D12 that expose additional GCN features to developers.

Fast compaction with mbcnt

With shader extensions, we provide access to a much better tool to get compaction done: GCN provides a special op-code for compaction within a wavefront.
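
To give a feel for wavefront compaction, here is a minimal CPU-side sketch. It assumes that mbcnt returns, for a given lane, the number of set bits in a mask at lane positions below it. The helper names are hypothetical and the loop only simulates what each lane would do in parallel.

```python
def mbcnt(mask: int, lane: int) -> int:
    """Count the set bits of `mask` at positions below `lane`."""
    return bin(mask & ((1 << lane) - 1)).count("1")


def compact(values, keep):
    """Pack the kept values densely, preserving lane order."""
    # Build a mask with one bit per lane that wants to write (ballot-style).
    mask = sum(1 << lane for lane, k in enumerate(keep) if k)
    out = [None] * bin(mask).count("1")
    for lane, value in enumerate(values):
        if keep[lane]:
            # Each surviving lane writes to the slot given by its prefix count.
            out[mbcnt(mask, lane)] = value
    return out


packed = compact([10, 20, 30, 40], [True, False, True, True])
```

Each surviving lane's output slot depends only on the shared mask and its own lane index, so no lane needs to wait for another.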

GeometryFX 1.2 – Cluster Culling

With cluster culling, GeometryFX is able to reject large chunks of the geometry – with corresponding performance increases.

Using AMD Crossfire API

Alternate Frame Rendering (AFR) is the method used to take advantage of multiple GPUs in DirectX® 11 and OpenGL® applications.

VDR Follow Up – Grain and Fine Details

This post is going to look at very subtle changes to improve grain and fine details using the same 3-bit/channel quantization case from the prior post.

VDR Follow Up – Fine Art of Film Grain

Expanding on Advanced Techniques and Optimization of VDR Color Pipelines: Details on the generation of film grain ideal for transfer functions like sRGB.

Getting the most out of Delta Color Compression

DCC is a domain-specific compression that tries to take advantage of data coherence. It’s lossless, and adapted for 3D rendering. The key idea is to process whole blocks instead of individual pixels.

Using the Vulkan™ Validation Layers

Vulkan validation layers make it easier to catch any mistakes, provide useful information beyond basic errors and minimize portability issues.

Vulkan Renderpasses

Renderpasses are objects designed to allow an application to communicate the high-level structure of a frame to the driver.

Fetching From Cubes and Octahedrons

For GPU-side dynamically generated data structures which need 3D spherical mappings, two of the most useful mappings are cubemaps and octahedral maps. This post explores the overhead of both mappings.

Maxing out GPU usage in nBodyGravity

Asynchronous compute can help you to get the maximum GPU usage. I’ll be explaining the details based on the nBodyGravity sample from Microsoft.

Optimized Reversible Tonemapper for Resolve

An optimized form of the tonemapping technique Brian Karis talks about on Graphics Rants: Tone mapping. It replaces the luma computation with max3(red,green,blue).
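
A minimal sketch of a reversible tonemapper using max3 in place of luma, assuming the forward operator has the form c / (1 + max3(c)) and the inverse c / (1 - max3(c)). The function names are hypothetical, and a real resolve shader would run this per sample on the GPU.

```python
def tonemap(rgb):
    """Compress an HDR colour into [0, 1) using max3 instead of luma."""
    r, g, b = rgb
    scale = 1.0 / (max(r, g, b) + 1.0)
    return (r * scale, g * scale, b * scale)


def tonemap_invert(rgb):
    """Undo tonemap(); requires max(rgb) < 1, which tonemap() guarantees."""
    r, g, b = rgb
    scale = 1.0 / (1.0 - max(r, g, b))
    return (r * scale, g * scale, b * scale)


hdr = (3.0, 1.0, 0.0)
ldr = tonemap(hdr)               # every channel now lies in [0, 1)
restored = tonemap_invert(ldr)   # recovers the original HDR colour
```

Because the largest channel maps to m / (1 + m), which is always below 1 for finite non-negative input, the inverse never divides by zero on values produced by the forward step.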