ALL TUTORIALS
We have over sixty (and growing!) fantastic tutorials written by our engineers and also guest engineers for you to discover.
Here they all are in one place, sorted in date order with the most recent at the top.
CasCmdLine is a simple, command-line Windows program included in the FidelityFX CAS (Contrast Adaptive Sharpening) repository in binary and source form. It lets you test the effect on image files such as screenshots from your game.
This tutorial explains how to take advantage of the functionality in RDP v2.1 onwards, which combines the RMV and RGP functionality from earlier versions into a unified workflow.
The final part of this joint series with Quantic Dream discusses shader scalarization, async compute, multithreaded render lists, memory management using our Vulkan Memory Allocator (VMA), and much more.
Part 2 of this joint post between Quantic Dream and AMD looks at non-uniform resource indexing on PC and for AMD cards specifically.
Porting the PS4® game Detroit: Become Human to PC presented some interesting challenges. This first part of a joint collaboration from engineers at Quantic Dream and AMD discusses the decision to use Vulkan® and talks shader pipelines and descriptors.
One of our engineers explains a few small code changes that can help you integrate RenderDoc for more unconventional applications.
Radeon™ Memory Visualizer (RMV) is a tool provided by AMD for use by game engine developers. It allows engineers to examine, diagnose, and understand the GPU memory management within their projects.
The Windows version of Compressonator 4.0 supports GPU encoding with DirectX® Compute (DXC) or OpenCL™ (OCL) shaders.
Part 4 of a series of posts on AMD FreeSync™ Premium Pro Technology. Here, we look at how to enable FreeSync Premium Pro with all next gen graphics APIs.
This is a Visual Studio® Code extension for the Radeon GPU Analyzer (RGA). This extension makes it possible to use RGA directly from within VS Code.
Tom Hammersley from Codemasters talks about integrating FidelityFX into the Ego Engine and implementing Contrast Adaptive Sharpening (CAS).
Beyond Spatial Audio: TrueAudio Next Acceleration of Steam Audio Sound Reflections with Third Order Ambisonics – Demo Video
Higher levels of realism and believability can be achieved when accurate sound reflections are deployed in a game or experience. Spatialized, physically generated reflections can achieve this goal.
With DirectX 12 comes the ability to generate disassembly and hardware resource usage statistics that are closest to the real-world case, and therefore to make better performance optimization decisions.
Radeon GPU Analyzer (RGA) has support for DirectX12 compute shaders with the command line tool. This mode can generate GCN/RDNA ISA disassembly for your compute shaders, regardless of the physically installed GPU.
In this tutorial, we will be going over what gamut mapping is, how we implemented a gamut mapper to show how FreeSync HDR works, and some pitfalls with different gamut mapping algorithms.
In part two of this tutorial, we cover the terminology of tone mapping, what tone mapping is, as well as different monitor features that influence how well a tone mapper will work.
The first in a series of four tutorials related to AMD FreeSync™ Premium Pro HDR. This tutorial covers terminology related to color.
Guest post by Sebastian Aaltonen, co-founder of Second Order. It covers optimising building the engine and asset production when using AMD Ryzen Threadripper processors.
A machine-readable mapping that you can integrate into your software for decoding Radeon™ Vulkan® versions. Includes a link to a handy table.
Learn what context rolls on our GPUs are, how they apply to the pipeline and how they’re managed, and what you can do to analyse them and find out if they’re a limiting factor in the performance of your game or application.
This guest post, by Arseny Kapoulkine from Roblox, looks at the costs associated with calling various Vulkan functions tens or hundreds of thousands of times per frame, and ways to bring them down.
Half-precision (FP16) computation is a performance-enhancing GPU technology long exploited in console and mobile devices, but not previously used or widely available in mainstream PC development.
Insights from Enscape as to how they designed a renderer that produces path traced real time global illumination and can also converge to offline rendered image quality.
Due to architectural differences between Zen and our previous processor architecture, Bulldozer, developers need to take care when using the Windows® APIs for processor and core enumeration.
The AMD GCN Vulkan extensions allow developers to get access to the barycentric coordinates at the fragment-shader level.
An important part of learning the Vulkan® API is to understand what types of objects are defined in it, what they represent and how they relate to each other.
Sebastian Aaltonen, co-founder of Second Order Ltd, talks about how to optimize GPU occupancy and resource usage of compute shaders that use large thread groups.
mGPU isn’t just for gamers – if you’re a developer working on a game, you should think of using mGPU to make your life easier.
This tutorial explains how to use Radeon GPU Analyzer (RGA) to produce a live VGPR analysis report for your shaders and kernels. Basic RGA usage knowledge is assumed.
Croteam’s Karlo Jez writes about AMD LiquidVR MultiView Rendering in Serious Sam VR with the GPU Services (AGS) Library.
Sub DWord Addressing is a feature of the AMD GCN architecture which allows the efficient extraction of 8-bit and 16-bit values from a 32-bit register.
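As a rough illustration of what extracting a sub-dword value means, the minimal sketch below shows the shift-and-mask arithmetic that such an extraction is equivalent to. It does not model the hardware feature itself, and the function names `extract_byte` and `extract_word` are hypothetical, chosen only for this example.

```python
def extract_byte(dword: int, index: int) -> int:
    """Return byte `index` (0 = least significant) of a 32-bit value."""
    if not 0 <= index <= 3:
        raise ValueError("byte index must be in the range 0..3")
    return (dword >> (8 * index)) & 0xFF


def extract_word(dword: int, index: int) -> int:
    """Return 16-bit word `index` (0 = low half, 1 = high half) of a 32-bit value."""
    if not 0 <= index <= 1:
        raise ValueError("word index must be 0 or 1")
    return (dword >> (16 * index)) & 0xFFFF
```

On hardware without such a feature, each extraction costs separate shift and mask instructions; the point of sub-dword addressing is to avoid that extra work.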
A guide to using the Windows Performance Analyzer tool, with a focus on video resources.
Guide to using the shader compiler control API in AGS 5.0
One thing which is often forgotten is shadow map rendering. As the tessellation level of the terrain is not optimized for the shadow camera, but for the primary camera, this often results in a very strong mismatch and shadow maps end up getting extremely over-tessellated.
Understanding concurrency (and what breaks it) is extremely important when optimizing for modern GPUs.
3D intensive application performance may suffer greatly if the best graphics device is not selected. As a developer you can easily fix this problem by adding only one line to your executable’s source code.
This post takes a look at the interesting bits of helping id Software with their DOOM Vulkan effort, from the perspective of AMD’s Game Engineering Team.
This guest post by Croteam’s Karlo Jez gives a detailed look at how Affinity Multi-GPU support was added to their game engine.
How to set up the AMD Driver Symbol Server in Visual Studio.
Vulkan barriers are unique as they require you to specify which resources are transitioning as well as a source and destination pipeline stage.
Follow up on VDR and practical advice on adapting a game’s tonemapping pipeline to both traditional display signals and new HDR output signals.
RapidFire SDK captures and encodes the input images entirely on the GPU and then copies the encoded result into the system memory for processing on the CPU.
The final instalment in Tamas Rabel’s insight into developing the Total War engine looks at Multi-GPU.
The AMD TrueAudio Next open-source library and driver-controlled CU Reservation enable dramatically higher levels of audio rendering realism in VR.
Tamas Rabel talks about how Total War: Warhammer utilized asynchronous compute to extract some extra GPU performance in DirectX® 12 and delves into the process of moving some of the passes in the engine to asynchronous compute pipelines.
Cross-lane operations are an efficient way to share data between wavefront lanes. This article covers in detail the cross-lane features that GCN3 offers.
Here’s Tamas Rabel again with some juicy details about how Creative Assembly brought Total War to DirectX® 12.
Tamas Rabel from Creative Assembly discusses how performance was measured with the Total War Engine.
Tamas Rabel, Lead Graphics Programmer on the Total War series, provides a detailed look at the Total War renderer and digs deep into some of the optimizations that the team at Creative Assembly made for the brilliant Total War: Warhammer.
Game engines do most of their shading work per-pixel or per-fragment. But there is another alternative that has been popular in film for decades…
This post serves as a guide on how to best use the various Memory Heaps & Memory Types exposed in Vulkan on AMD drivers, starting with some high-level tips.
This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture.
One of the mandates of GPUOpen is to give developers better access to the hardware, and this post details extensions for Vulkan and Direct3D12 that expose additional GCN features to developers.
DOPPEngine changes the output of your desktop in ways that can be very useful with various effects.
With shader extensions, we provide access to a much better tool to get compaction done: GCN provides a special op-code for compaction within a wavefront.
With cluster culling, GeometryFX is able to reject large chunks of the geometry – with corresponding performance increases.
GCN hardware supports a special out-of-order rasterization mode which relaxes the ordering guarantee, and allows fragments to be produced out-of-order.
Alternate Frame Rendering (AFR) is the method used to take advantage of Multiple GPUs in DirectX® 11 and OpenGL® applications.
Useful information about features within our AMD GPU Services (AGS) library.
This post is going to look at very subtle changes to improve grain and fine details using the same 3-bit/channel quantization case from the prior post.
Expanding on Advanced Techniques and Optimization of VDR Color Pipelines: Details on the generation of film grain ideal for transfer functions like sRGB.
An explanation of how GCN hardware coalesces memory operations to minimize traffic throughout the memory hierarchy.
DCC is a domain-specific compression that tries to take advantage of data coherence. It’s lossless, and adapted for 3D rendering. The key idea is to process whole blocks instead of individual pixels.
Vulkan validation layers make it easier to catch any mistakes, provide useful information beyond basic errors and minimize portability issues.
Renderpasses are objects designed to allow an application to communicate the high-level structure of a frame to the driver.
For GPU-side dynamically generated data structures which need 3D spherical mappings, two of the most useful mappings are cubemaps and octahedral maps. This post explores the overhead of both mappings.
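For readers unfamiliar with octahedral maps, the sketch below shows one commonly used formulation of the mapping between a 3D direction and a 2D coordinate in [-1, 1]². It is an assumption that this matches the variant measured in the post. The names `oct_encode` and `oct_decode` are hypothetical, and the input direction must be non-zero.

```python
import math


def _sign(value: float) -> float:
    """Sign that treats zero as positive, so folded coordinates never collapse."""
    return 1.0 if value >= 0.0 else -1.0


def oct_encode(x: float, y: float, z: float) -> tuple[float, float]:
    """Map a non-zero 3D direction to octahedral coordinates in [-1, 1]^2."""
    l1_norm = abs(x) + abs(y) + abs(z)
    u, v = x / l1_norm, y / l1_norm  # project onto the octahedron |x|+|y|+|z| = 1
    if z < 0.0:
        # Fold the lower hemisphere outwards over the diagonals of the square.
        u, v = (1.0 - abs(v)) * _sign(u), (1.0 - abs(u)) * _sign(v)
    return u, v


def oct_decode(u: float, v: float) -> tuple[float, float, float]:
    """Map octahedral coordinates back to a unit-length 3D direction."""
    z = 1.0 - abs(u) - abs(v)
    if z < 0.0:
        # Undo the fold applied to the lower hemisphere.
        u, v = (1.0 - abs(v)) * _sign(u), (1.0 - abs(u)) * _sign(v)
    length = math.sqrt(u * u + v * v + z * z)
    return u / length, v / length, z / length
```

Unlike a cubemap lookup, this mapping needs no face selection, only a handful of arithmetic operations and one conditional fold.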
Asynchronous compute can help you to get the maximum GPU usage. I’ll be explaining the details based on the nBodyGravity sample from Microsoft.
Optimized tonemapper form of the technique Brian Karis talks about on Graphics Rants: Tone mapping. Replace the luma computation with max3(red,green,blue).
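A minimal sketch of the substitution follows. It assumes the tonemapper has the reversible form `color / (1 + luma)`, with `luma` replaced by the largest of the three channels. The function names are hypothetical and inputs are assumed to be non-negative linear RGB values.

```python
def tonemap(rgb: tuple[float, float, float]) -> tuple[float, float, float]:
    """Compress non-negative linear RGB into [0, 1) using max3 instead of luma."""
    r, g, b = rgb
    scale = 1.0 / (1.0 + max(r, g, b))
    return (r * scale, g * scale, b * scale)


def tonemap_inverse(rgb: tuple[float, float, float]) -> tuple[float, float, float]:
    """Undo `tonemap`; every input channel must be strictly below 1."""
    r, g, b = rgb
    scale = 1.0 / (1.0 - max(r, g, b))
    return (r * scale, g * scale, b * scale)
```

Because the largest channel maps to `m / (1 + m)`, which is always below 1, the inverse never divides by zero for values produced by `tonemap`.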