The new Wave Matrix Multiply Accumulate (WMMA) instructions added in HLSL Shader Model 6.8 allow shader developers to accelerate Generalized Matrix Multiplication (GEMM) matrix operations of the form:

WMMA instructions task all threads in a wave to collaboratively perform a matrix-multiply operation with higher efficiency and throughput than previously achievable using SM 6.7 or earlier instructions.

GEMM operations have many uses in in signal-processing, physics simulations, machine learning, and computer vision.

Some examples include:

  • Performing Fast-Fourier Transforms for signal processing, such as audio, radio, or radar applications
  • Applying filters and post-processing to images
  • Calculating deformation of objects and fluids, such as fluid/physics sim or molecular simulation
  • Implementing deep learning operators  – for example convolution, multi-layer-perceptron, neural radiance fields, and so on.

For more information on WMMA, see:

Find out more about WMMA and matrix cores here on GPUOpen

AMD Lab Notes

AMD matrix cores (amd-lab-notes)

This first post in the ‘AMD lab notes’ series takes a look at AMD’s Matrix Core technology and how best to use it to speed up your matrix operations.