Professional compute content is no longer on GPUOpen.
Browse some of our GPUOpen compute-related content
Sparse matrix vector multiplication (SpMV) is a core computational kernel of nearly every implicit sparse linear algebra solver. This is the first post in the series covering SpMV.
In this blog, we explore GPU offloading using HIP and OpenMP target directives and discuss their relative merits in terms of implementation effort and performance.
The machine learning ecosystem is growing rapidly, and this article is designed to help data scientists and ML practitioners get their machine learning environments up and running on AMD GPUs.
In the fourth and final part of the Finite Difference Laplacian blog series, we cover scaling studies and cache size limitations.
MPI is the de facto standard for inter-process communication in High-Performance Computing. This post will guide you through the process of setting up an MPI application that supports execution on GPU clusters.
Register pressure of GPU kernels has a tremendous impact on performance. This post provides a practical demo on applying recommendations.
In this third part, we cover additional optimizations to fine-tune the performance of the kernel, and introduce temporary files, register pressure, and occupancy.
This post gives an overview of AMD’s open source profiling tools, helping you diagnose bottlenecks and understand how your application is using the hardware.
This talk introduces compute shaders, explaining ideas from a software and hardware perspective, as well as considerations when writing compute shaders.
AMD FidelityFX Denoiser is a set of denoising compute shaders which remove artefacts from reflection and shadow rendering.
This talk by AMD’s Lou Kramer at 4C in 2018 discusses optimising your engine using compute.
This talk will discuss Direct3D® 12 in general, as well as some of the features that were leveraged to accomplish this goal, such as Async Compute, Tiled Resources, Debugging, Copy Queues, and HDR.