The latest Vulkan SDK now ships with Vulkan Memory Allocator (VMA)

We’re delighted that our Vulkan® Memory Allocator library is now available to install as an optional component in the latest Vulkan SDK, making it even easier for developers to discover how handy it is.

The Vulkan Memory Allocator (VMA) SDK provides a simple, easy-to-integrate API to help you allocate memory for Vulkan buffer and image storage. It’s a generic C++ library that works on any Vulkan-supporting GPU and platform, and it has no external dependencies beyond the standard C/C++ library and Vulkan.

VMA is an established and well-respected library, having been integrated into the majority of Vulkan game titles on PC, and is already a part of the official Khronos® Group Vulkan samples. If you’re not already using it, or you’re already a happy user who wants to find out more about it, you can take a look at our dedicated GPUOpen VMA page here:

Vulkan® Memory Allocator

VMA is our single-header, MIT-licensed, C++ library for easily and efficiently managing memory allocation for your Vulkan® games and applications.

Want to learn more about VMA and Vulkan memory management, or even Vulkan in general? Take a look at our other GPUOpen content:

Developing Vulkan® applications

Discover our Vulkan blog posts, presentations, samples, and more. Find out how we can help you create and optimize your Vulkan applications!


Vulkan® gives software developers control over the performance, efficiency, and capabilities of AMD Radeon™ GPUs and multi-core CPUs.

Using Vulkan® Device Memory

This post serves as a guide on how to best use the various Memory Heaps & Memory Types exposed in Vulkan on AMD drivers, starting with some high-level tips.

Updated May 31, 2023 following the release of AMD Software: Adrenalin Edition™ 23.5.2

AMD is pleased to support the recently released Microsoft® DirectML optimizations for Stable Diffusion. AMD has worked closely with Microsoft to help ensure the best possible performance on supported AMD devices and platforms. Stable Diffusion is a text-to-image model that transforms natural language into stunning images.

Microsoft has provided a path in DirectML for vendors like AMD to enable optimizations called ‘metacommands’. In the case of Stable Diffusion with the Olive pipeline, AMD has released driver support for a metacommand implementation intended to improve performance and reduce the time it takes to generate output from the model.

Using the recently released AMD Software: Adrenalin Edition 23.5.2, we can observe the following average 2X performance improvement with Stable Diffusion 1.5.¹

Comparison of AMD Software: Adrenalin Edition 23.5.2 vs 23.5.1 for Stable Diffusion 1.5 with Microsoft Olive optimized DirectML performance. With 23.5.2, an average of 19 512x512 images are generated per minute; with 23.5.1, the average is 10.
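As a quick sanity check on the figures in the caption above, the short Python sketch below works out the throughput ratio and the implied time per image. The function names are illustrative only and are not part of any AMD or Microsoft tooling; the only inputs are the two images-per-minute values quoted in the caption.

```python
def speedup(new_rate: float, old_rate: float) -> float:
    """Return how many times higher new_rate is than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return new_rate / old_rate


def seconds_per_image(images_per_minute: float) -> float:
    """Convert a throughput in images per minute to seconds per image."""
    if images_per_minute <= 0:
        raise ValueError("images_per_minute must be positive")
    return 60.0 / images_per_minute


# Values taken from the chart caption (average 512x512 images per minute).
IMAGES_PER_MIN_23_5_2 = 19
IMAGES_PER_MIN_23_5_1 = 10

ratio = speedup(IMAGES_PER_MIN_23_5_2, IMAGES_PER_MIN_23_5_1)

print(f"Throughput ratio: {ratio:.1f}x")
print(f"23.5.1: {seconds_per_image(IMAGES_PER_MIN_23_5_1):.1f} s per image")
print(f"23.5.2: {seconds_per_image(IMAGES_PER_MIN_23_5_2):.1f} s per image")
```

The ratio comes out at 1.9, which is in line with the roughly 2X average improvement described in the text.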

The bulk of the performance gains were achieved by enabling driver optimizations for the Multi-Head Attention (MHA) operator on AMD hardware, using the metacommand path provided by Microsoft within DirectML. To test the performance, we used a Python environment with the Microsoft Olive pipeline and Stable Diffusion 1.5, along with the ONNX runtime and AMD Software: Adrenalin Edition 23.5.2 installed, and ran the DirectML example scripts from the Olive repository to generate output. Some tips on how to get this running can be found in the Microsoft blog post.

These optimizations have been validated on AMD RDNA™ 3 devices that feature compute units with AI accelerators, including AMD Radeon™ RX 7900 Series graphics cards and AMD Ryzen™ 7040 Series Mobile processors with Radeon™ graphics.

AMD Software: Adrenalin Edition™ 23.5.2 with Microsoft Olive optimized DirectML performance for Stable Diffusion can be downloaded here:



Links to third-party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites, and no endorsement is implied. GD-98

1. [Performance claim copy TBD]
2. Based on AMD internal measurements, November 2022, comparing the Radeon RX 7900 XTX at 2.505 GHz boost clock with 96 CUs issuing 2X the Bfloat16 math operations per clock vs. the RX 6900 XT GPU at 2.25 GHz boost clock with 80 CUs issuing 1X the Bfloat16 math operations per clock. RX-821.


Microsoft is a registered trademark of Microsoft Corporation in the US and/or other countries. Other product names used in this publication are for identification purposes only and may be trademarks of their respective owners.