Professional compute content is no longer on GPUOpen.
Alternatively, try:
Browse some of our GPUOpen compute-related content

Sparse matrix vector multiplication – part 1 – AMD lab notes
Sparse matrix vector multiplication (SpMV) is a core computational kernel of nearly every implicit sparse linear algebra solver. This is the first post in the series covering SpMV.

Jacobi Solver with HIP and OpenMP offloading – AMD lab notes
In this post, we explore GPU offloading using HIP and OpenMP target directives and discuss their relative merits in terms of implementation effort and performance.

Creating a PyTorch/TensorFlow Code Environment on AMD GPUs – AMD lab notes
The machine learning ecosystem is growing rapidly, and this article is designed to help data scientists and ML practitioners get their machine learning environments up and running on AMD GPUs.

Finite difference method – Laplacian part 4 – AMD lab notes
In the fourth and final part of the Finite Difference Laplacian blog series, we cover scaling studies and cache size limitations.

GPU-aware MPI with ROCm – amd-lab-notes
MPI is the de facto standard for inter-process communication in High-Performance Computing. This post will guide you through the process of setting up an MPI application that supports execution on GPU clusters.

Register pressure in AMD CDNA2™ GPUs – amd-lab-notes
Register pressure of GPU kernels has a tremendous impact on performance. This post provides a practical demo on applying recommendations.

Finite Difference Method – Laplacian part 3 – amd-lab-notes
In this third part, we cover additional optimizations to fine tune the performance of the kernel, and introduce temporary files, register pressure, and occupancy.

Introduction to profiling tools for AMD hardware (amd-lab-notes)
This post gives an overview of AMD’s open source profiling tools, helping you diagnose bottlenecks and understand how your application is using the hardware.

Compute Shaders – Game Industry Conference 2021
This talk introduces compute shaders, explaining ideas from a software and hardware perspective, as well as considerations when writing compute shaders.

AMD FidelityFX Denoiser
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.

Optimize Your Engine Using Compute
This talk by AMD’s Lou Kramer at 4C in 2018 discusses optimising your engine using compute.

Gears 5 – High-Gear Visuals On Multiple Platforms – YouTube link
This talk will discuss Direct3D® 12 in general, as well as some of the features that were leveraged to accomplish this goal, such as Async Compute, Tiled Resources, Debugging, Copy Queues, and HDR.