Using the Vulkan® Validation Layers

Originally posted: March 9, 2016

Daniel Rakos

Vulkan™ provides unprecedented control to developers over generating graphics and compute workloads for a wide range of hardware, from tiny embedded processors to high-end workstation GPUs with wildly different architectures. As usual, with great power comes great responsibility: to make sure that your application runs correctly on all these possible target platforms, it is crucial to follow all the rules of the API specification, even if some violation of these rules, either intentional or accidental, seems not to cause any issues on a particular hardware and driver implementation.

Traditional graphics APIs try to solve this issue by defining a set of illegal API usage conditions that driver implementations are required to catch and report to the application through some sort of error reporting mechanism. The problem with this approach is that, even though the errors generated in response to incorrect API usage are extremely valuable during the development of an application, checking for all of these error conditions costs significant CPU time in the driver, and that cost provides no value when running a released application that is known to use the API correctly. In addition, practice shows that some driver implementations are less pedantic than others about certain rules established by the API specifications. Relying on testing against a particular implementation and observing no problems could therefore still lead to portability issues when the same application is run against other driver implementations.

Errors vs Errors

Unlike traditional graphics APIs, Vulkan groups possible error scenarios into two distinct buckets:

Validity errors are error conditions resulting from incorrect API usage, i.e. the application not respecting the API usage rules that are required in order to get well-defined behavior from the issued commands. These rules are described in the specification for all API commands and structures in text blocks titled “Valid Usage”.
Run-time errors are error conditions that can occur even during the execution of applications that use the API correctly, like running out of memory, or failure to present to a window that has been closed in the meantime. Run-time errors are reported in the form of result codes. The specification lists the possible result codes of each command individually in text blocks titled “Return Codes”, accompanied by language describing the situations in which driver implementations are expected to return each particular result code.

While many of the Vulkan API commands do return a result code in the form of one of the constants of the VkResult enumeration, these result codes are only used to indicate run-time errors and status information about certain operations or objects; they do not report whether valid usage conditions were respected. This allows release builds of applications to run at maximum performance, because driver implementations don’t have to spend precious CPU cycles on checking for potential violations of specification rules, which is unnecessary anyway for applications that are known to use the API correctly.
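
To illustrate what result codes do and do not tell us, here is a minimal sketch of classifying a VkResult value. It relies on the specification's convention that run-time error codes are negative while success and status codes are non-negative. To keep the sketch compilable without the Vulkan headers, the constants are local stand-ins that mirror a few VkResult values, and isRuntimeError and describeResult are hypothetical helper names; real code would include vulkan/vulkan.h and use VkResult directly.

```cpp
#include <cstdint>
#include <string>

// Local stand-ins mirroring a few VkResult values (assumption: real code
// would include <vulkan/vulkan.h> and use the VkResult enumerants instead).
constexpr int32_t kSuccess              = 0;   // VK_SUCCESS
constexpr int32_t kNotReady             = 1;   // VK_NOT_READY (status, not an error)
constexpr int32_t kErrorOutOfHostMemory = -1;  // VK_ERROR_OUT_OF_HOST_MEMORY
constexpr int32_t kErrorDeviceLost      = -4;  // VK_ERROR_DEVICE_LOST

// Run-time error codes are negative; success and status codes are not.
bool isRuntimeError(int32_t result)
{
    return result < 0;
}

// Hypothetical helper that turns a result code into a short log string.
std::string describeResult(int32_t result)
{
    if (result == kSuccess)
        return "success";
    return isRuntimeError(result) ? "run-time error " + std::to_string(result)
                                  : "status code " + std::to_string(result);
}
```

Note that a result of VK_SUCCESS says nothing about valid usage: a command called with invalid parameters may still return success and misbehave later.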

As driver implementations aren’t checking valid usage conditions and expect all inputs coming from the application to be valid according to the specification, running applications that use the API incorrectly may result in unexpected behavior, including corrupted rendering or even application crashes. Often the consequences of passing invalid parameters to an API command only manifest when later commands are executed.

Validation Layers to the Rescue

We already acknowledged that skipping valid usage checks has great benefits for release builds of applications that are known to behave correctly from the point of view of the Vulkan API specification. It’s still very important, however, to be able to identify incorrect API usage during the development of an application: the mistake behind the weird corruption we see or the mysterious crash we can’t explain is not trivial to find without a hint about where to look.

In order to provide a solution for this, Vulkan comes with a set of validation and debug layers as part of the Vulkan SDK. At the time of writing, the SDK includes almost a dozen layers dedicated to validating certain aspects of API usage and to providing debugging tools to developers, like an API call dumper. When any subset of these layers is enabled, they insert themselves automatically into the call chain of every Vulkan API call issued by the application to perform their job. A detailed description of the individual layers is outside the scope of this article, but curious readers can find more information here.

The benefit of validation layers compared to the approach taken by traditional APIs is that applications only have to spend time on extensive error checking when explicitly requested, typically during development and when using debug builds of the application. This fits naturally with the general pay-for-what-you-use principle of the Vulkan API. Additionally, as the official validation layers that come with the SDK are maintained centrally and work equivalently across driver implementations, this approach doesn’t suffer from the fragmentation issues often seen in the error checking behavior of traditional APIs. Developers can thus be confident that the same validation errors are going to be reported in all cases, regardless of the driver implementation the application is run against.

What’s even better, the validation layers aren’t just looking for violations of the allowed API usage, but can also report warnings about potentially incorrect or dangerous use of the API, and are even capable of reporting performance warnings that allow developers to identify places where the API is used correctly but not in the most efficient way. Examples of such performance warnings are binding resources that aren’t actually used or using a sub-optimal layout for an image.

Application developers who want to validate their API usage during development are going to be primarily interested in VK_LAYER_LUNARG_standard_validation, which bundles all standard validation layers into one big meta-layer. Enabling this layer ensures that all official validation layers will try to catch any mistake the application makes in its use of Vulkan. In order to report the caught violations of valid API usage to the application, the validation layers expose the VK_EXT_debug_report instance extension, which allows feeding the detected validation errors and warnings to application-provided callbacks. We are going to present the basic usage of this extension in this article, but more information is available in the Vulkan Registry.

Preparing Our Instance For Validation

We recommend that all applications should enable and use the validation layers in their debug builds in order to make sure their applications are always respecting valid API usage and thus are going to be portable across the wide range of Vulkan driver implementations.

The following code snippet shows a typical C++ example of how applications should enable the VK_LAYER_LUNARG_standard_validation layer and the VK_EXT_debug_report extension at instance creation time in their debug builds:

    std::vector<const char*> enabledInstanceLayers;
    std::vector<const char*> enabledInstanceExtensions;
#ifdef MY_DEBUG_BUILD_MACRO
    /* Enable validation layers in debug builds to detect validation errors */
    enabledInstanceLayers.push_back("VK_LAYER_LUNARG_standard_validation");
#endif

    /* Enable instance extensions used in all build types */
    enabledInstanceExtensions.push_back("VK_KHR_surface");
    ...
#ifdef MY_DEBUG_BUILD_MACRO
    /* Enable debug report extension in debug builds to be able to consume validation errors */
    enabledInstanceExtensions.push_back("VK_EXT_debug_report");
#endif

    /* Setup instance creation information */
    VkInstanceCreateInfo instanceCreateInfo = {};
    ...
    /* data() is safe even when a vector is empty (e.g. no layers in release builds), unlike &v[0] */
    instanceCreateInfo.enabledLayerCount       = static_cast<uint32_t>(enabledInstanceLayers.size());
    instanceCreateInfo.ppEnabledLayerNames     = enabledInstanceLayers.data();
    instanceCreateInfo.enabledExtensionCount   = static_cast<uint32_t>(enabledInstanceExtensions.size());
    instanceCreateInfo.ppEnabledExtensionNames = enabledInstanceExtensions.data();

    /* Create the instance */
    VkInstance instance = VK_NULL_HANDLE;
    VkResult result = vkCreateInstance(&instanceCreateInfo, nullptr, &instance);

Editor’s Note: Based on your input I’ve replaced the use of the NDEBUG macro to indicate code that is meant to be built only in debug versions of the application and now the code examples refer to a custom macro called MY_DEBUG_BUILD_MACRO that you should replace with the debug build macro used by your project or compiler toolchain.

Of course, a resilient application should first check for the presence of the used instance layers and extensions before passing them to vkCreateInstance by using the vkEnumerateInstanceLayerProperties and vkEnumerateInstanceExtensionProperties commands, respectively. After a successful instance creation the validation layers are active for the instance and the debug report extension is available for use.
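
The presence check mentioned above could be sketched as follows. This is only an illustration under stated assumptions: findMissingNames is a hypothetical helper, and the list of available names is assumed to have been collected beforehand from the layerName field of the VkLayerProperties entries returned by vkEnumerateInstanceLayerProperties (or from the extensionName field of VkExtensionProperties for extensions). The enumeration itself is left out so that the sketch stays self-contained.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the requested names that do not appear in the available list.
// An empty result means every requested layer (or extension) is present.
std::vector<std::string> findMissingNames(
    const std::vector<std::string>& requested,
    const std::vector<std::string>& available)
{
    std::vector<std::string> missing;
    for (const std::string& name : requested) {
        if (std::find(available.begin(), available.end(), name) == available.end())
            missing.push_back(name);
    }
    return missing;
}
```

An application could call this once for layers and once for extensions, and then either drop the missing names from the lists passed to vkCreateInstance or report them to the developer.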

As the VK_EXT_debug_report instance extension is not a core feature, the addresses of its entry points have to be acquired through the use of the vkGetInstanceProcAddr command as shown in the code snippet below:

#ifdef MY_DEBUG_BUILD_MACRO
    /* Load VK_EXT_debug_report entry points in debug builds */
    PFN_vkCreateDebugReportCallbackEXT vkCreateDebugReportCallbackEXT =
        reinterpret_cast<PFN_vkCreateDebugReportCallbackEXT>
            (vkGetInstanceProcAddr(instance, "vkCreateDebugReportCallbackEXT"));
    PFN_vkDebugReportMessageEXT vkDebugReportMessageEXT =
        reinterpret_cast<PFN_vkDebugReportMessageEXT>
            (vkGetInstanceProcAddr(instance, "vkDebugReportMessageEXT"));
    PFN_vkDestroyDebugReportCallbackEXT vkDestroyDebugReportCallbackEXT =
        reinterpret_cast<PFN_vkDestroyDebugReportCallbackEXT>
            (vkGetInstanceProcAddr(instance, "vkDestroyDebugReportCallbackEXT"));
#endif

Our First Debug Report Callback

We’ll talk about each individual entry point of the extension separately, but first let’s take a look at what an application-provided debug report callback should look like and what behavior it should follow. The application can register any number of debug report callbacks; they only need to match the signature defined by PFN_vkDebugReportCallbackEXT. A sample debug report callback that simply directs all incoming debug messages to stderr is presented below:

VKAPI_ATTR VkBool32 VKAPI_CALL MyDebugReportCallback(
    VkDebugReportFlagsEXT       flags,
    VkDebugReportObjectTypeEXT  objectType,
    uint64_t                    object,
    size_t                      location,
    int32_t                     messageCode,
    const char*                 pLayerPrefix,
    const char*                 pMessage,
    void*                       pUserData)
{
    std::cerr << pMessage << std::endl;
    return VK_FALSE;
}

The parameters passed to the callback provide information about where and what type of validation event has triggered the call, like the type of the event (error, warning, performance warning, etc.), the type and handle of the object being created or manipulated by the command triggering the call, the code and text message describing the event, and there’s even a parameter to supply application-specific user data to the callback which is provided when registering the callback. By putting a breakpoint in the callback, developers can also have access to the complete callstack to more accurately determine the location of the offending API call.
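
As a sketch of how a callback might make use of these parameters, the helpers below turn the flags, layer prefix, message code, and message text into a single log line. The constants are local stand-ins that mirror the bit values of VkDebugReportFlagBitsEXT so that the sketch compiles without the Vulkan headers, and severityLabel and formatReport are hypothetical names; real code would include vulkan/vulkan.h, use the VK_DEBUG_REPORT_*_BIT_EXT enumerants, and call such a helper from inside the debug report callback.

```cpp
#include <cstdint>
#include <string>

// Local stand-ins mirroring VkDebugReportFlagBitsEXT (assumption: real code
// would use the enumerants from <vulkan/vulkan.h> instead).
constexpr uint32_t kInformationBit        = 0x01; // VK_DEBUG_REPORT_INFORMATION_BIT_EXT
constexpr uint32_t kWarningBit            = 0x02; // VK_DEBUG_REPORT_WARNING_BIT_EXT
constexpr uint32_t kPerformanceWarningBit = 0x04; // VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT
constexpr uint32_t kErrorBit              = 0x08; // VK_DEBUG_REPORT_ERROR_BIT_EXT
constexpr uint32_t kDebugBit              = 0x10; // VK_DEBUG_REPORT_DEBUG_BIT_EXT

// Maps the event type to a label, checking the most severe bit first.
const char* severityLabel(uint32_t flags)
{
    if (flags & kErrorBit)              return "ERROR";
    if (flags & kWarningBit)            return "WARNING";
    if (flags & kPerformanceWarningBit) return "PERFORMANCE";
    if (flags & kInformationBit)        return "INFO";
    if (flags & kDebugBit)              return "DEBUG";
    return "UNKNOWN";
}

// Builds one log line; layerPrefix and message are assumed to be non-null.
std::string formatReport(uint32_t flags, const char* layerPrefix,
                         int32_t messageCode, const char* message)
{
    return std::string(severityLabel(flags)) + " [" + layerPrefix + "] code " +
           std::to_string(messageCode) + ": " + message;
}
```

Prefixing each line with the severity and the reporting layer makes it easier to filter a long validation log than printing pMessage alone.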

The return value of the callback is a Boolean that indicates to the validation layers whether the API call that triggered the debug report callback should be aborted or not. However, developers have to be aware that an error reported by one of the validation layers is an indication that the application attempted something invalid, and thus any operation following the error might result in undefined behavior or even a crash. As such, it’s advised that developers stop at the first error and try to resolve it before making any assumptions about the behavior of subsequent operations. Think about validation errors in the same way as errors reported by compilers: often subsequent errors are just consequences of the first one.
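
One possible way to follow the "stop at the first error" advice is to pass a small state object through the pUserData parameter and remember the first error that arrives. The sketch below is a hedged illustration only: ValidationLog and recordReport are hypothetical names, and kErrorFlag is a local stand-in mirroring the value of VK_DEBUG_REPORT_ERROR_BIT_EXT so that the code compiles without the Vulkan headers. A real callback would forward its own flags, pMessage, and pUserData arguments to such a helper.

```cpp
#include <cstdint>
#include <string>

// Local stand-in mirroring VK_DEBUG_REPORT_ERROR_BIT_EXT (assumption: real
// code would use the enumerant from <vulkan/vulkan.h>).
constexpr uint32_t kErrorFlag = 0x08;

// Hypothetical state object whose address is supplied as pUserData when
// the callback is registered.
struct ValidationLog {
    int         errorCount = 0;
    int         otherCount = 0;
    std::string firstError;
};

// Records one report. Returns true only for the first error seen, which is
// a convenient place to break into the debugger or end the test run.
bool recordReport(void* pUserData, uint32_t flags, const char* message)
{
    auto* log = static_cast<ValidationLog*>(pUserData);
    if ((flags & kErrorFlag) == 0) {
        ++log->otherCount;
        return false;
    }
    if (log->errorCount++ == 0) {
        log->firstError = message;
        return true;
    }
    return false;
}
```

Keeping the first error separate from the rest reflects the compiler analogy: later messages are counted, but the first one is the one worth investigating.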

When registering our debug report callback, we can specify what types of events we want to be notified about. Typically we’re interested in errors, warnings, and performance warnings; the following code snippet registers our callback with such a configuration:

#ifdef MY_DEBUG_BUILD_MACRO
    /* Setup callback creation information */
    VkDebugReportCallbackCreateInfoEXT callbackCreateInfo;
    callbackCreateInfo.sType       = VK_STRUCTURE_TYPE_DEBUG_REPORT_CREATE_INFO_EXT;
    callbackCreateInfo.pNext       = nullptr;
    callbackCreateInfo.flags       = VK_DEBUG_REPORT_ERROR_BIT_EXT |
                                     VK_DEBUG_REPORT_WARNING_BIT_EXT |
                                     VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT;
    callbackCreateInfo.pfnCallback = &MyDebugReportCallback;
    callbackCreateInfo.pUserData   = nullptr;

    /* Register the callback */
    VkDebugReportCallbackEXT callback;
    VkResult result = vkCreateDebugReportCallbackEXT(instance, &callbackCreateInfo, nullptr, &callback);
#endif

An already registered callback can then be unregistered by destroying the callback object just like any other API object, using the corresponding destroy command, vkDestroyDebugReportCallbackEXT. Developers should make sure to unregister their debug report callbacks before destroying the instance; otherwise they are going to be notified about their misbehavior through any debug report callback that’s registered to receive errors.

The last remaining entry point of the debug report extension that we haven’t discussed yet, vkDebugReportMessageEXT, can be used to generate debug report messages from application code. This can be useful to mark certain points in the execution of the application or to report application-specific information to the same stream that receives the validation messages.

Update: Since version 1.0.13 of the Vulkan API specification and the Vulkan SDK, device layers have been deprecated, so the instructions related to enabling the validation layers at the device level have been removed accordingly.

Forcing Validation Externally

The recommended way to validate an application is the approach presented so far, because it allows developers to enable validation based on the type of the build (as presented), based on some application setting, or through any other mechanism. Additionally, the debug report callback enables fine-grained control over which validation events should be captured and how.

However, in some cases modifying or rebuilding the application to enable validation programmatically is not viable or convenient. This includes cases like validating release builds of applications that don’t reproduce the issue in debug builds, or validating third-party applications or libraries that we cannot rebuild because we lack access to the source code.

There’s a solution even for situations like this, as layers can also be enabled through the environment variable VK_INSTANCE_LAYERS. This variable accepts a list of layer names to enable, separated by semicolons (Windows) or colons (Linux). The following command enables all standard validation layers on Windows:

> set VK_INSTANCE_LAYERS=VK_LAYER_LUNARG_standard_validation
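
When more than one layer has to be listed, the platform-dependent separator is easy to get wrong. As a small sketch, a hypothetical launcher or test harness could assemble the value of the variable like this; joinLayerNames and kLayerListSeparator are illustrative names and not part of any Vulkan API, and actually setting the variable for the target process is left out.

```cpp
#include <string>
#include <vector>

// Separator used in VK_INSTANCE_LAYERS: semicolon on Windows, colon on Linux.
#ifdef _WIN32
constexpr char kLayerListSeparator = ';';
#else
constexpr char kLayerListSeparator = ':';
#endif

// Joins layer names into a single string suitable for VK_INSTANCE_LAYERS.
std::string joinLayerNames(const std::vector<std::string>& names,
                           char separator = kLayerListSeparator)
{
    std::string value;
    for (const std::string& name : names) {
        if (!value.empty())
            value += separator;
        value += name;
    }
    return value;
}
```

With a single layer name the result is just that name, which matches the command shown above.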

When enabling validation through this approach, besides setting the environment variable to activate the layers, the reporting mechanism must be configured for each layer via a settings file; otherwise the activated layers will produce no output. This settings file must be named vk_layer_settings.txt and must be located in the working directory of the application or in the directory specified using the VK_LAYER_SETTINGS_PATH environment variable. A sample layer settings file is provided as part of the Vulkan SDK under the config folder. Used as is, it simply outputs all error, warning, and performance warning messages to stdout, but it can easily be changed to output a different subset of the validation messages or to redirect them to files instead of console output (which may be necessary to capture the output of applications without a console). The sample settings file contains instructions about how to change the various configuration options.

Summary

While getting familiar with the Vulkan API may seem a bit involved at the beginning, because by its nature it has a steeper learning curve than traditional APIs, the validation layers make it much easier to catch mistakes, and they also provide a lot of additional useful information beyond just reporting basic errors. While using the validation layers does not completely eliminate the need to test your application on multiple platforms, it minimizes the chances of portability issues resulting from incorrect API usage.

In addition to that, the official loader and validation layers are all available as open source on GitHub. So if you find any errors that aren’t currently caught by any of the validation layers, don’t hesitate: contribute!

Don’t forget: validate your application before the users validate it for you!

Daniel Rakos

Daniel Rakos is a member of the Software Architecture Team at AMD.