Our Unreal Engine performance guide
Our dedicated team of Unreal Engine engineers has been working hard to put together our recommendations
for optimizing your Unreal Engine code base.
| PERFORMANCE PATCHES
Many of the optimization efforts that produce these patches are eventually integrated directly into a stock release of Unreal Engine. The patches presented here represent optimizations that have not been taken into Unreal Engine, and generally meet one of the following criteria:
- A similar optimization has already been applied in a subsequent release of stock Unreal Engine, and we wanted to make it available for prior versions that are still widely used.
- The optimization is still being reviewed for inclusion into stock Unreal Engine, and is provided here in the meantime.
- The optimization contains one or more AMD-specific implementations which, while they do not impact non-AMD performance, are not widely applicable enough for stock release.
AGS 5.4 Integration and DX12 Engine Registration
Strongly recommended. This optimization works by integrating the AMD GPU Services library at version 5.4, which then allows application registration under the D3D12RHI.
Prioritize ClearRenderTargetView over clearing RT regions by drawing quads
This optimization works by eliminating the usage of rendered quads to clear a texture, and instead always calling ClearRenderTargetView().
Replace anisotropic samplers during Shadow Depths
This optimization works by identifying scenarios in which Unreal Engine inserts anisotropic samplers during the ShadowDepths pass that do not directly affect the depth output, and replacing them with trilinear samplers.
Eliminate unnecessary and slow RT clears
This optimization works by identifying render targets that are subjected to slow clears but are likely to have every pixel subsequently written without blending. Such clears can be skipped without penalty.
Use Compute for Histogram-based reduction during Lens Exposure
This optimization works by replacing the PostProcessHistogramReduce PS with a Compute shader implementation that leverages LDS to avoid looping texture samples.
Use one triangle for fullscreen draws
This optimization works by replacing fullscreen clears or copies that draw a two-triangle quad over an entire render target with a single triangle that also covers the entire render target (and then some).
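To illustrate why one triangle is enough, here is a minimal C++ sketch (not the patch itself) that generates the three clip-space vertices of an oversized triangle and checks on the CPU that it covers the viewport square from (-1,-1) to (1,1). The names `Float2`, `FullscreenTriangleVertex`, and `TriangleCovers` are hypothetical and chosen only for this example, and the vertex layout (-1,-1), (3,-1), (-1,3) is one common choice rather than necessarily the one used in the patch.

```cpp
// Hypothetical helper types and names; this is an illustration, not engine code.
struct Float2 { float x; float y; };

// Clip-space position for vertex 0, 1, or 2 of a single oversized triangle:
// (-1,-1), (3,-1), (-1,3). The viewport occupies [-1,1] x [-1,1].
constexpr Float2 FullscreenTriangleVertex(unsigned vertexId) {
    return Float2{ (vertexId == 1) ? 3.0f : -1.0f,
                   (vertexId == 2) ? 3.0f : -1.0f };
}

// Signed area test: >= 0 when p lies on or to the left of the edge a -> b.
constexpr float Edge(Float2 a, Float2 b, Float2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// True when p lies inside or on the boundary of the oversized triangle.
inline bool TriangleCovers(Float2 p) {
    const Float2 v0 = FullscreenTriangleVertex(0);
    const Float2 v1 = FullscreenTriangleVertex(1);
    const Float2 v2 = FullscreenTriangleVertex(2);
    return Edge(v0, v1, p) >= 0.0f &&
           Edge(v1, v2, p) >= 0.0f &&
           Edge(v2, v0, p) >= 0.0f;
}
```

All four viewport corners pass `TriangleCovers`, while the parts of the triangle outside the viewport are clipped by the hardware. Unlike a two-triangle quad, the single triangle has no diagonal seam running through the render target.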
Reduce memory pressure on RT volume during Translucent Lighting
This optimization works by recognizing that the pair of 64x64x64 3D textures used in the FilterTranslucentVolume pass have an alpha channel that is regularly unused. In these cases, converting those volumes from RGBA16 to R11G11B10 can yield significant performance gains with minimal side effects.
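The size of the saving can be estimated with simple arithmetic. The sketch below assumes RGBA16 means four 16-bit channels (64 bits per texel) and R11G11B10 packs into 32 bits per texel; `VolumeBytes` and the constant names are hypothetical helpers for this example only.

```cpp
#include <cstddef>

// Bytes occupied by a cubic 3D texture with the given edge length and texel size.
constexpr std::size_t VolumeBytes(std::size_t dim, std::size_t bitsPerTexel) {
    return dim * dim * dim * bitsPerTexel / 8;
}

constexpr std::size_t kVolumeDim     = 64;  // 64x64x64, as stated in the text
constexpr std::size_t kRGBA16Bits    = 64;  // assumption: 4 channels x 16 bits
constexpr std::size_t kR11G11B10Bits = 32;  // 11 + 11 + 10 bits, no alpha

// Footprint of the pair of volumes in each format.
constexpr std::size_t kPairBytesRGBA16    = 2 * VolumeBytes(kVolumeDim, kRGBA16Bits);
constexpr std::size_t kPairBytesR11G11B10 = 2 * VolumeBytes(kVolumeDim, kR11G11B10Bits);

static_assert(kPairBytesRGBA16 == 4u * 1024u * 1024u, "pair is 4 MiB in RGBA16");
static_assert(kPairBytesR11G11B10 == 2u * 1024u * 1024u, "pair is 2 MiB in R11G11B10");
```

Under these assumptions the pair of volumes shrinks from 4 MiB to 2 MiB, halving the data read and written each time the volumes are filtered.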
Use Compute for HZB Mip Generation
This optimization works by replacing UE4’s existing Pixel Shader HZB downsampling implementation with a tightly-threaded Compute Shader implementation.
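For readers unfamiliar with HZB construction, the following CPU-side C++ sketch shows the underlying reduction: each texel of a mip is derived from a 2x2 block of the mip above it. It is a reference illustration only, not the compute shader from the patch. `DepthMip`, `DownsampleMip`, and `BuildMipChain` are hypothetical names, and the reduction operator is passed in because whether min or max is the conservative choice depends on the depth convention in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical container for one mip level of a depth buffer (row-major).
struct DepthMip {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;
};

// Produce the next mip by reducing each 2x2 block of src to a single value.
// Reads are clamped to the edge so odd dimensions are handled.
template <typename Reduce>
DepthMip DownsampleMip(const DepthMip& src, Reduce reduce) {
    DepthMip dst;
    dst.width  = std::max<std::size_t>(src.width / 2, 1);
    dst.height = std::max<std::size_t>(src.height / 2, 1);
    dst.texels.resize(dst.width * dst.height);
    for (std::size_t y = 0; y < dst.height; ++y) {
        const std::size_t y0 = std::min(2 * y, src.height - 1);
        const std::size_t y1 = std::min(2 * y + 1, src.height - 1);
        for (std::size_t x = 0; x < dst.width; ++x) {
            const std::size_t x0 = std::min(2 * x, src.width - 1);
            const std::size_t x1 = std::min(2 * x + 1, src.width - 1);
            const float a = src.texels[y0 * src.width + x0];
            const float b = src.texels[y0 * src.width + x1];
            const float c = src.texels[y1 * src.width + x0];
            const float d = src.texels[y1 * src.width + x1];
            dst.texels[y * dst.width + x] = reduce(reduce(a, b), reduce(c, d));
        }
    }
    return dst;
}

// Build the full chain from mip 0 down to 1x1.
template <typename Reduce>
std::vector<DepthMip> BuildMipChain(DepthMip mip0, Reduce reduce) {
    std::vector<DepthMip> chain;
    chain.push_back(std::move(mip0));
    while (chain.back().width > 1 || chain.back().height > 1) {
        chain.push_back(DownsampleMip(chain.back(), reduce));
    }
    return chain;
}
```

A pixel shader version typically runs one pass per mip level, whereas a compute version can keep intermediate results on-chip and emit several levels per dispatch. This sketch only captures the per-level arithmetic that both share.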
Use Compute for combining LUTs during Post Processing
This optimization works by forcing the usage of an already-existing Compute path for LUT combination.
Improve AO Shader memory access pattern
This optimization works by making a trio of small modifications to the AmbientOcclusion PS implementation:
- Eliminate a sample into a randomized texture, and replace it with a computed value.
- Reorganize HZB samples to improve cache utilization.
- Eliminate a small amount of processing if OPTIMIZATION_O1 is defined.
| FEATURE PATCHES
Enables GPU accelerated compositing of content and videos based on color keying.
360 Video Stitching
Enables GPU accelerated stitching of recorded or live video streams.
Advanced Media Framework
Accelerated multimedia processing, including video playback and encoding.
FEMFX – Finite Element Method
Accelerated CPU library for Finite Element Method (FEM) to compute physics for many different materials, duplicating real-world bending and breaking effects.