Modern APIs are rapidly growing in complexity, with each added feature introducing more responsibility and risk. Typically, the first tool we turn to is the standard set of validation layers, to ensure we write specification-compliant code. However, what if the problem persists despite the lack of validation errors?
The error could be due to dynamic shader behaviour on the GPU that cannot be statically validated on the CPU timeline. If so, this can result in hours-long debugging sessions to find the one culprit among a heap of operations. Potential issues include indices going out of bounds, NaN values propagating across multiple stages, or a missing initialization and the resulting access of invalid data.
Have you ever wished for a tool that helps you in finding these types of errors and validates dynamic shader behaviour on the GPU?
Meet GPU Reshape, a toolset that leverages on-the-fly instrumentation of GPU operations with instruction-level validation of potentially undefined behavior, supporting both DX12 and Vulkan. A standalone desktop application with no integration required, all open source (MIT), is now available in Beta.
My name is Miguel Petersen, Senior Rendering Engineer at Striking Distance Studios, author of GPU Reshape.
The toolset was developed in collaboration with AMD and Avalanche Studios Group, initially as a proof-of-concept Vulkan layer at Avalanche, after which development was continued externally. Development was supported by Lou Kramer, Jonas Gustavsson, Rys Sommefeldt, Mark Simpson, Marek Machliński, Daniel Isheden, and William Hjelm. Thank you all.
Making the GPU a first-class citizen
GPU Reshape brings powerful features typical of CPU tooling to the GPU, providing validation of dynamic behaviour, such as:
- Resource Bounds: Validation of resource read / write coordinates against the resource's bounds.
- Export Stability: Numeric stability validation of floating point exports (UAV writes, render targets, vertex exports), e.g. NaN / Inf.
- Descriptor Validation: Validation of descriptors, potentially dynamically indexed. This includes undefined, mismatched (compile-time to runtime), and out-of-bounds descriptor indexing, as well as missing table bindings.
- Concurrency Validation: Validation of resource concurrency, i.e. single-producer or multiple-consumer, between queues and events.
- Resource Initialization: Validation of resource initialization, ensuring any read was preceded by a write. (*1)
- Infinite Loops: Detection of infinite loops. Experimental.
All at interactive frame rates.
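To make the export stability check from the list above a little more concrete, here is a minimal CPU-side sketch of the kind of test involved. It is only an illustration: the names `ExportStatus`, `ClassifyExport`, and `CountUnstable` are hypothetical and are not part of GPU Reshape, which performs the equivalent check on the GPU through instrumented shader code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical classification of an exported value (illustration only,
// not GPU Reshape's API).
enum class ExportStatus { Stable, NaN, Inf };

// Classify a single floating point export, e.g. one render target component.
inline ExportStatus ClassifyExport(float value) {
    if (std::isnan(value)) return ExportStatus::NaN;
    if (std::isinf(value)) return ExportStatus::Inf;
    return ExportStatus::Stable;
}

// Count the unstable (NaN or Inf) values in a batch of exports.
inline std::size_t CountUnstable(const float* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::size_t unstable = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (ClassifyExport(values[i]) != ExportStatus::Stable) {
            ++unstable;
        }
    }
    return unstable;
}
```

Checking at the point of export is what makes a propagated NaN traceable: the first export that fails the test points at the stage that introduced it.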
Additionally, certain features, such as descriptor validation and loops, can safeguard a potentially erroneous operation, preventing undefined behaviour during instrumentation. This is especially useful if the error would result in a GPU crash, limiting the application’s ability to write out useful debug information of the issue.
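The idea of safeguarding can be sketched on the CPU as a bounds-checked read that records the failure and skips the offending access. This is an assumption-laden illustration rather than GPU Reshape's implementation: `GuardedBuffer`, `BoundsError`, and the zero fallback value are all hypothetical, and the `errors` vector merely stands in for whatever channel carries validation messages back to the host.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical record of a failed access; field names are illustrative only.
struct BoundsError {
    std::uint32_t index;     // coordinate the shader tried to access
    std::uint32_t dimension; // number of elements in the resource
};

// Stand-in for a GPU buffer plus a message stream back to the host.
struct GuardedBuffer {
    std::vector<float> data;
    std::vector<BoundsError> errors;

    // Validate the coordinate first. On failure, record the error and skip
    // the read instead of touching memory outside the resource.
    float Read(std::uint32_t index) {
        assert(data.size() <= std::numeric_limits<std::uint32_t>::max());
        const auto dimension = static_cast<std::uint32_t>(data.size());
        if (index >= dimension) {
            errors.push_back({index, dimension});
            return 0.0f; // assumed fallback value for the skipped read
        }
        return data[index];
    }
};
```

For example, reading index 3 of a three-element `GuardedBuffer` returns the fallback value and leaves one `BoundsError` with `index == 3` and `dimension == 3`, while the application keeps running and can still report the issue.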
Validation errors are reported on the exact line of source code, with, for example, the resource, dimensions, and coordinates accessed. GPU Reshape is agnostic to the front-end language, such as HLSL or GLSL, as it functions solely on the instructions and associated symbols.
In case symbols are not available, or are not desired, validation errors may be reported on the offending instruction instead. The instruction stream is that of the internal intermediate language, see Instrumentation as a Framework for details.
Debugging symbols are supported for both SPIR-V and DXIL (DXBC planned), either embedded or through externally hosted PDBs. GPU Reshape does not require debugging symbols to produce useful information. However, symbols greatly improve the tool's ability to track down issues.
Out-of-the-box usage requires no integration, and can be done in a few clicks. Applications may be attached to after launching, or can be launched from the toolset with the desired workspace. Workspaces represent a graphics (API) device connection, and contain all instrumentation states, shaders, pipelines, and validation data.
Connecting to existing applications is an opt-in feature that greatly improves ease of use. Additionally, if configured, GPU Reshape can connect to running applications across network boundaries, allowing developers to instrument, for example, an artist's machine as the corruption is happening. This is in contrast to spending, potentially, hours reproducing the issue.
After an application has been launched with the specified workspace, instrumentation occurs immediately. Any validation error can be inspected in further detail by double clicking it.
Instrumentation can be changed on the fly, and can be specialized on a per shader and per pipeline basis. There are no restrictions on how a workspace may be configured. Changing instrumentation on the fly is a common pattern when the application is connected to after launching.
The toolset aims to bring you as much information as possible in order to investigate and resolve faults, with interactive instrumentation of applications in a matter of seconds. However, under the hood, GPU Reshape is more than a fixed toolset.
Instrumentation as a framework
GPU Reshape is, at its core, a modular API-agnostic instrumentation framework. It performs appropriate call hooking, instrumentation of shader code, and any additional state management CPU-side.
Shader instrumentation is done on a generalized SSA-based intermediate language. It’s a custom intermediate language, written specifically for GPU Reshape, and is bi-directionally translated to and from the backend language, namely SPIR-V and DXIL (DXBC experimental). Each feature, such as validation of out-of-bounds reads / writes, operates solely on the intermediate language and has no visibility of either the backend language or the API.
// Emitters take care of creating instructions
IL::Emitter emitter(program, context.basicBlock);
// any(coordinates >= buffer.GetDimensions())
IL::ID failureCondition = emitter.Any(emitter.GreaterThanEqual(
// Branch to error block if the condition failed, otherwise resume block
Decoupling features from backends through a custom intermediate language has a number of benefits. It keeps permutations low, as features do not need an implementation per backend, and introducing backends does not require feature changes. As the number of features and backends grows, this becomes paramount. The intermediate language also allows for a standardized toolset across backends, significantly lowering the complexity of writing instrumentation code. On top of this, the bi-directional translation is single layered, meaning it’s translated to and from the backend binaries directly, without other intermediate languages. This greatly improves translation speeds. A typical shader is instrumented in just a few milliseconds, although this varies based on shader complexity.
Each feature can alter the program as it sees fit, such as adding, removing, and modifying instructions. The feature is given a shader “program”, which acts as the abstraction for the active backend. From it, the user has access to all functions, instructions, constants, types, and so on, and is able to modify them as necessary. After modification, the backend performs just-in-time recompilation of the modified program back to the backend language.
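As a rough mental model of such a feature, consider a toy pass that walks a flat instruction list and inserts a guard before every resource access. Everything here is hypothetical (`Op`, `Instruction`, `Program`, `InstrumentBounds`); GPU Reshape's actual intermediate language is SSA-based with basic blocks, types, and constants, none of which this sketch attempts to model.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy opcode set, for illustration only.
enum class Op { Load, Store, Add, BoundsGuard };

struct Instruction {
    Op op;
    int resource; // hypothetical resource identifier, -1 if unused
};

// In this sketch, a "program" is just a flat instruction list.
using Program = std::vector<Instruction>;

// Hypothetical feature pass: insert a guard before every resource access.
// Returns the number of guards inserted.
inline std::size_t InstrumentBounds(Program& program) {
    Program instrumented;
    instrumented.reserve(program.size() * 2);
    std::size_t guards = 0;
    for (const Instruction& instr : program) {
        if (instr.op == Op::Load || instr.op == Op::Store) {
            assert(instr.resource >= 0); // accesses must name a resource
            instrumented.push_back({Op::BoundsGuard, instr.resource});
            ++guards;
        }
        instrumented.push_back(instr);
    }
    program = std::move(instrumented);
    return guards;
}
```

The pass never mentions SPIR-V or DXIL. In this model, turning the modified instruction list back into a backend binary would be the backend's job alone, which mirrors the separation described above.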
Features do not need to concern themselves with backend specifics, such as vectorized versus scalarized execution, control-flow differences, and other implementation details. Given compliance, each feature will translate seamlessly to the backend language.
An open collaboration
GPU Reshape aims to serve as a framework for instrumentation, acting as a modular base from which any number of tools, techniques, and optimizations can be implemented. While the standard validation layers validate statically known behaviour, GPU Reshape acts as an entirely complementary toolset, covering dynamic behaviour on the GPU.
It is my hope that, with time, it matures and evolves into a general purpose tool. In fact, there are a number of potentially planned additions, such as:
- Shader debugging, providing the ability to inspect live data as the shader sees it.
- Shader assertions, in-source assertions typical of CPU code.
- Branch hot spots, live hot spot profiling of all branches.
- Branch coherence, live coherence analysis of all branches.
- And more!
Join us over at GitHub. Collaboration, discussion, and bug reports are most welcome!
- DirectX® 12
- DXBC (*2)
- All vendors supported
- For AMD GPUs: Latest AMD Software: Adrenalin Edition™ application software driver (minimum version 23.10.2)
- Windows® 10
- Windows® 11
Linux® support is a planned addition.
*1 – May exhibit some false positives in large applications, in particular with aliasing. Will be fixed.
*2 – Experimental, converted to DXIL internally. Native support being considered.
Avalanche Studios Group and the Avalanche Studios Group logo are trademarks of the Avalanche Studios Group.