Our Unreal Engine performance guide

Our dedicated team of Unreal Engine engineers has been working hard to put together our recommendations
for optimizing your Unreal Engine code base.

Performance Patches

Many of the optimization efforts that produce these patches are eventually integrated directly into a stock release of Unreal Engine. The patches presented here represent optimizations that have not been taken into Unreal Engine, and generally meet one of the following criteria:

  • A similar optimization has already been applied in a subsequent release of stock Unreal Engine, and we wanted to make it available for prior versions that are still widely used.
  • The optimization is still being reviewed for inclusion into stock Unreal Engine, and is provided here in the meantime.
  • The optimization contains one or more AMD-specific implementations which, while they do not impact non-AMD performance, are not widely applicable enough for stock release.

Patch | Integration difficulty | 4.25 | 4.24 | 4.23

AGS 5.4 Integration and DX12 Engine Registration
Strongly recommended. This optimization works by integrating the AMD GPU Services library at version 5.4, which then allows application registration under the D3D12RHI.

Very easy

Prioritize ClearRenderTargetView over clearing RT regions by drawing quads
This optimization works by eliminating the usage of rendered quads to clear a texture, and instead always calling ClearRenderTargetView().

Very easy

Replace anisotropic samplers during Shadow Depths
This optimization works by identifying scenarios in which Unreal Engine inserts anisotropic samplers during the ShadowDepths pass that will not directly affect the depth output. It replaces them with trilinear samplers instead.

Very easy

Eliminate unnecessary and slow RT clears

This optimization works by identifying Render Targets which are subjected to expensive and slow clears, but are likely to subsequently have every pixel written without blending. Such clears can be skipped without penalty.

Easy

Use Compute for Histogram-based reduction during Lens Exposure
This optimization works by replacing the PostProcessHistogramReduce PS with a Compute shader implementation that leverages LDS to avoid looping texture samples.

Very easy
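To illustrate the kind of reduction a compute shader can perform in LDS, here is a minimal CPU sketch in C++, not the patch's shader. It assumes the reduce step sums a set of per-tile histograms into one; `ReduceHistograms` and its layout are hypothetical names chosen for illustration. On a GPU, each iteration of the inner `i` loop would be one thread working on shared memory, with a barrier between passes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums per-tile histograms into tiles[0] using a pairwise (tree) reduction.
// Each pass folds the upper half of the active range onto the lower half,
// so the number of passes grows with log2(tile count) rather than tile count.
std::vector<float> ReduceHistograms(std::vector<std::vector<float>> tiles) {
    if (tiles.empty()) {
        return {};
    }
    const std::size_t bins = tiles[0].size();
    for (const auto& tile : tiles) {
        assert(tile.size() == bins);  // all histograms must share one bin count
    }
    for (std::size_t active = tiles.size(); active > 1;) {
        const std::size_t half = (active + 1) / 2;  // lower half keeps the odd one out
        for (std::size_t i = 0; i + half < active; ++i) {
            for (std::size_t b = 0; b < bins; ++b) {
                tiles[i][b] += tiles[i + half][b];
            }
        }
        active = half;
    }
    return tiles[0];
}
```

A pixel shader doing the same work would typically loop over every tile with one texture sample per iteration, which is the looping pattern the description says the compute path avoids.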

Use one triangle for fullscreen draws
This optimization works by replacing fullscreen clears or copies that draw a two-triangle quad over an entire render target with a single triangle that covers the same render target (and then some).

Very easy
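The single-triangle approach can be sketched on the CPU as follows. This is an illustrative C++ sketch of a widely used vertex-index trick, not the patch's code; `FullscreenTriangleVertex` and `TriangleCovers` are hypothetical names. It derives three clip-space vertices from a vertex index and checks that the resulting triangle contains the whole [-1, 1] x [-1, 1] viewport.

```cpp
#include <cassert>

struct Float2 {
    float x;
    float y;
};

// Clip-space position for vertex id 0, 1, or 2 of one oversized triangle:
// (-1, -1), (3, -1), (-1, 3).
Float2 FullscreenTriangleVertex(unsigned id) {
    const float u = static_cast<float>((id << 1) & 2u);  // 0, 2, 0
    const float v = static_cast<float>(id & 2u);         // 0, 0, 2
    return {u * 2.0f - 1.0f, v * 2.0f - 1.0f};
}

// Signed area test: non-negative when p is on or to the left of edge a -> b.
float Edge(Float2 a, Float2 b, Float2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// True when p lies inside or on the boundary of the fullscreen triangle.
bool TriangleCovers(Float2 p) {
    const Float2 a = FullscreenTriangleVertex(0);
    const Float2 b = FullscreenTriangleVertex(1);
    const Float2 c = FullscreenTriangleVertex(2);
    return Edge(a, b, p) >= 0.0f && Edge(b, c, p) >= 0.0f && Edge(c, a, p) >= 0.0f;
}
```

The usual motivation given for this layout is that a quad's shared diagonal makes the GPU shade pixels along that edge twice, while one triangle has no interior edge. The parts of the triangle outside the viewport are clipped and cost no pixel work.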

Reduce memory pressure on RT volume during Translucent Lighting
This optimization works by recognizing that the pair of 64x64x64 3D textures used in the FilterTranslucentVolume pass have an alpha channel that is regularly unused.  In these cases, converting those volumes from RGBA16 to R11G11B10 can yield significant performance gains with minimal side effects.

Very easy
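A rough way to see where the saving comes from is to compare raw texel storage for the two formats. The C++ sketch below assumes 64 bits per texel for RGBA16 (four 16-bit channels) and 32 bits per texel for R11G11B10, and ignores any padding, alignment, or mip levels a real allocation might add. `VolumeBytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Raw storage in bytes for a cubic 3D texture with the given edge length
// and bits per texel. Padding and alignment are deliberately ignored.
constexpr std::uint64_t VolumeBytes(std::uint64_t edge, std::uint64_t bitsPerTexel) {
    return edge * edge * edge * bitsPerTexel / 8;
}

// One 64x64x64 volume in each format.
constexpr std::uint64_t kRgba16Bytes = VolumeBytes(64, 64);     // 2 MiB
constexpr std::uint64_t kR11G11B10Bytes = VolumeBytes(64, 32);  // 1 MiB

static_assert(kRgba16Bytes == 2u * 1024u * 1024u, "RGBA16 volume is 2 MiB");
static_assert(kR11G11B10Bytes == 1u * 1024u * 1024u, "R11G11B10 volume is 1 MiB");
```

Under these assumptions the pair of volumes drops from 4 MiB to 2 MiB, which halves the data each filtering pass has to read and write.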

Use Compute for HZB Mip Generation
This optimization works by replacing UE4’s existing Pixel Shader HZB downsampling implementation with a tightly-threaded Compute Shader implementation.

Very easy
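For readers unfamiliar with HZB construction, the following C++ sketch shows one downsampling step on the CPU. It is a reference for the idea only, not the patch's shader. It assumes a reversed-Z depth buffer, where smaller values are farther from the camera, so each 2x2 block keeps its minimum; a conventional depth buffer would keep the maximum instead. `DepthMip` and `DownsampleHZB` are hypothetical names, and dimensions are assumed even for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DepthMip {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;  // row-major, width * height entries
};

// Builds the next HZB mip by reducing every 2x2 block of src to one texel.
// In a compute shader, each (x, y) iteration would map to one thread.
DepthMip DownsampleHZB(const DepthMip& src) {
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    assert(src.texels.size() == src.width * src.height);

    DepthMip dst{src.width / 2, src.height / 2,
                 std::vector<float>((src.width / 2) * (src.height / 2))};
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            const std::size_t top = (2 * y) * src.width + 2 * x;
            const std::size_t bottom = top + src.width;
            // Farthest depth of the block under the reversed-Z assumption.
            dst.texels[y * dst.width + x] =
                std::min({src.texels[top], src.texels[top + 1],
                          src.texels[bottom], src.texels[bottom + 1]});
        }
    }
    return dst;
}
```

Calling `DownsampleHZB` repeatedly on its own output produces the full mip chain. A compute implementation can keep intermediate results in group-shared memory across several levels rather than writing and re-reading each mip through a separate pixel shader pass.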

Use Compute for combining LUTs during Post Processing
This optimization works by forcing the usage of an already-existing Compute path for LUT combination.

Very easy

Improve AO Shader memory access pattern

This optimization works by making a trio of small modifications to the AmbientOcclusion PS implementation.

  1. Eliminate a sample into a randomized texture, and replace it with a computed value.
  2. Reorganize HZB samples to improve cache utilization.
  3. Eliminate a small amount of processing if OPTIMIZATION_O1 is defined.

Very easy

Feature Patches

Patch | Integration difficulty | 4.25 | 4.24 | 4.23

AMD FidelityFX – Single Pass Downsampler (SPD)

Provides an optimized solution for creating up to 12 MIP levels of a texture. This patch applies SPD to the Bloom effect to accelerate texture creation. Learn more about SPD.

Easy
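To judge whether a given texture fits within the 12-level figure quoted above, one can count how many mip levels sit below the source level. The C++ sketch below reads the limit as applying to those generated levels, which is an assumption; `DownsampledMipCount` and `FitsInSinglePass` are hypothetical helpers and not part of the SPD API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Limit taken from the description above; how it is counted is an assumption.
constexpr std::uint32_t kMaxSinglePassMips = 12;

// Number of mip levels below the source level, down to 1x1.
std::uint32_t DownsampledMipCount(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    std::uint32_t largest = std::max(width, height);
    std::uint32_t count = 0;
    while (largest > 1) {
        largest >>= 1;  // each mip halves the larger dimension, rounding down
        ++count;
    }
    return count;
}

// True when the whole chain fits within the assumed single-pass limit.
bool FitsInSinglePass(std::uint32_t width, std::uint32_t height) {
    return DownsampledMipCount(width, height) <= kMaxSinglePassMips;
}
```

Under this reading a 4096x4096 source needs exactly 12 generated levels, while a 1920x1080 source needs 10.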

AMD FidelityFX – Luminance Preserving Mapper (LPM)

Integrates tone and gamut mapping for HDR, FreeSync™2, and wide gamut output. UE4.25+ only. Learn more about LPM.

Medium

AMD FidelityFX – Contrast Adaptive Sharpening (CAS)

Integration of FidelityFX CAS into Unreal Engine’s renderer. Learn more about CAS.

Very easy

Chroma Keying

Enables GPU-accelerated compositing of content and videos based on color keying.

Very easy

360 Video Stitching

Enables GPU-accelerated stitching of recorded or live video streams.

Medium

Advanced Media Framework

Accelerated multimedia processing, including video playback and encoding.

Very easy

TressFX 4.1

Integration of TressFX 4.1 into Unreal Engine’s renderer. Learn more about TressFX.

Easy

FEMFX – Finite Element Method*

Accelerated CPU library for Finite Element Method (FEM) to compute physics for many different materials, duplicating real-world bending and breaking effects.

Hard

* – FEMFX is currently available for Unreal Engine 4.18