FidelityFX Super Resolution 3.1.2 (FSR3) – Upscaling and Frame Generation

Introduction

AMD FidelityFX Super Resolution 3 (FSR3) combines resolution upscaling with frame generation.

It uses new and improved temporal upscaling, along with a new optical flow implementation that reprojects samples from two rendered frames to generate an additional frame in between. FSR3 also implements swapchain proxies, which are used to schedule interpolation workloads and handle frame pacing for DirectX 12 and Vulkan.

FSR3 flowchart

Integration guidelines

Shading language and API requirements

DirectX 12

  • CS_6_2

  • CS_6_6† is used on some hardware which supports 64-wide wavefronts.

Vulkan

  • Vulkan 1.x

While this document is based on DX12, a Vulkan implementation is available through the Vulkan FidelityFX API. A reference integration can be seen in the FSR sample.

Quick start checklist

  • Integrate through the new FidelityFX API, using the single FidelityFX DLL, prebuilt and signed by AMD.

  • It is recommended to first implement upscaling only, before frame generation

  • Ensure a high-quality upscaling implementation:

    • Correct use of jittering pattern

    • Correct placement of post-process operations

    • Correct use of reactive mask

    • Correct use of transparency & composition mask

    • Correct setting of mip-bias for samplers

  • For frame generation

    • Add two new contexts for frame generation and the pacing swapchain

    • Add frame generation prepare dispatch

    • Add frame generation configure call

    • Double buffer where required

    • Modify UI rendering – choose one of the following 3 options to handle the UI:

      • Render the UI inside a callback function

        • This function will get called once for every presented frame and needs to be able to render asynchronously while the next frame is being rendered

        • Rendering the UI twice will have a performance cost, but the benefit is that the UI (including effects like film grain) can be updated at display frequency and with little latency

      • Render the UI to a separate texture which will then be composed on top of the final frames

        • This should still be straightforward to integrate in most applications

        • Compared to the 3rd option, this will result in the UI appearing to be rendered at lower frequency than the main scene

      • Provide a HUD-less texture to frame interpolation for automatic detection and composition of the UI onto the interpolated frame

        • This is probably the easiest method to implement for most applications, but may result in minor artifacts in semi-transparent parts of the UI

    • Frame generation can run synchronously or asynchronously

      • An asynchronous implementation may run faster on some applications and hardware, but may require some additional effort
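
For the mip-bias item in the checklist above, the following is a minimal sketch of the bias calculation, assuming the formula given in the upscaler documentation (log2 of the render-to-display resolution ratio, minus one). The function name MipBias is hypothetical and not part of the FidelityFX API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not an FFX API call.
// Computes the texture sampler mip bias for a given render/display width pair,
// assuming the formula: mipBias = log2(renderResolution / displayResolution) - 1.0
inline float MipBias(float renderWidth, float displayWidth)
{
    assert(renderWidth > 0.0f && displayWidth > 0.0f);
    return std::log2(renderWidth / displayWidth) - 1.0f;
}
```

With a 2.0x per-dimension scale factor (for example 1920 rendered to 3840 displayed) this evaluates to -2.0. The result would typically be applied to the samplers used for material textures, not to samplers used for screen-space lookups.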

Walkthrough

FSR3 uses the FidelityFX API. See the link for an introduction to its usage.

Add upscaling through FSR3 interface

Note: if an FSR2 or FSR 3.0 upscaling implementation is already present and working correctly, please refer to the migration guide for the new interface.

Include the ffx_upscale.h header (or for C++ helpers ffx_upscale.hpp):

#include <ffx_api/ffx_upscale.h>

Create the ffxContext for upscaling by filling out the ffxCreateContextDescUpscale structure with the required arguments. Pass an instance of either ffxCreateBackendDX12Desc or ffxCreateBackendVKDesc for backend creation in the pNext field.

Example using the C++ helpers:

ffx::Context upscalingContext;
ffx::CreateBackendDX12Desc backendDesc{};
backendDesc.device = GetDevice()->DX12Device();

ffx::CreateContextDescUpscale createUpscaling;
createUpscaling.maxUpscaleSize = {DisplayWidth, DisplayHeight};
createUpscaling.maxRenderSize = {RenderWidth, RenderHeight};
createUpscaling.flags = FFX_UPSCALE_ENABLE_AUTO_EXPOSURE | FFX_UPSCALE_ENABLE_HIGH_DYNAMIC_RANGE;

ffx::ReturnCode retCode = ffx::CreateContext(upscalingContext, nullptr, createUpscaling, backendDesc);

Get resolution based on settings

To get the render resolution or upscaling ratio from a selected quality mode, call ffxQuery with ffxQueryDescUpscaleGetUpscaleRatioFromQualityMode or ffxQueryDescUpscaleGetRenderResolutionFromQualityMode. It is possible to call these queries before context creation by passing NULL in the first argument (if using the C++ helper, call ffx::Query with a single argument instead of two).

In that case, if using a version override, make sure to include the same version override for the query, otherwise the default version will be used for the query.
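
As an illustration of what these queries return, here is a minimal sketch that maps a quality mode to a per-dimension scale factor and derives a render resolution from it. It does not call ffxQuery. The enum, struct and function names are hypothetical; the scale factors are the published FSR quality mode ratios. The rounding used here (truncation) is an assumption and may differ from the library, so the query results remain authoritative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum for illustration only; the real API uses its own constants.
enum class QualityMode { NativeAA, Quality, Balanced, Performance, UltraPerformance };

// Per-dimension scale factor for each quality mode.
inline float UpscaleRatio(QualityMode mode)
{
    switch (mode)
    {
    case QualityMode::NativeAA:         return 1.0f;
    case QualityMode::Quality:          return 1.5f;
    case QualityMode::Balanced:         return 1.7f;
    case QualityMode::Performance:      return 2.0f;
    case QualityMode::UltraPerformance: return 3.0f;
    }
    return 1.0f;
}

struct Size2D
{
    uint32_t width;
    uint32_t height;
};

// Divides the display resolution by the scale factor (truncating, by assumption).
inline Size2D RenderResolution(Size2D display, QualityMode mode)
{
    const float ratio = UpscaleRatio(mode);
    assert(ratio >= 1.0f);
    return {static_cast<uint32_t>(display.width / ratio),
            static_cast<uint32_t>(display.height / ratio)};
}
```

For a 3840×2160 display, the quality mode yields 2560×1440 and the performance mode yields 1920×1080.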

Apply camera jitter

To get camera jitter phase count and offset parameters, use ffxQuery. Make sure to always pass a valid context after it has been created. The following examples show proper usage of the new API:

ffx::ReturnCode                     retCode;
int32_t                             jitterPhaseCount;
ffx::QueryDescUpscaleGetJitterPhaseCount getJitterPhaseDesc{};
getJitterPhaseDesc.displayWidth   = resInfo.DisplayWidth;
getJitterPhaseDesc.renderWidth    = resInfo.RenderWidth;
getJitterPhaseDesc.pOutPhaseCount = &jitterPhaseCount;

retCode = ffx::Query(m_UpscalingContext, getJitterPhaseDesc);
CauldronAssert(ASSERT_CRITICAL, retCode == ffx::ReturnCode::Ok, L"ffxQuery(FSR_GETJITTERPHASECOUNT) returned %d", retCode);

ffx::QueryDescUpscaleGetJitterOffset getJitterOffsetDesc{};
getJitterOffsetDesc.index                              = m_JitterIndex;
getJitterOffsetDesc.phaseCount                         = jitterPhaseCount;
getJitterOffsetDesc.pOutX                              = &m_JitterX;
getJitterOffsetDesc.pOutY                              = &m_JitterY;

retCode = ffx::Query(m_UpscalingContext, getJitterOffsetDesc);

See the related section of the upscaler documentation for more information about the underlying implementation and how to apply the jitter during rendering.
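
To make the two queries more concrete, here is a minimal standalone sketch of a Halton-based jitter scheme. It assumes the behaviour described for the FSR upscaler (a phase count of 8 times the squared per-dimension scale factor, and offsets taken from the Halton(2,3) sequence shifted into the range [-0.5, 0.5]). The function names are hypothetical, and the values returned by ffxQuery should be treated as authoritative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Radical-inverse (Halton) value for a 1-based index in the given base.
inline float Halton(int32_t index, int32_t base)
{
    float f      = 1.0f;
    float result = 0.0f;
    for (int32_t i = index; i > 0; i /= base)
    {
        f /= static_cast<float>(base);
        result += f * static_cast<float>(i % base);
    }
    return result;
}

// Assumed phase count: 8 * (displayWidth / renderWidth)^2, truncated.
inline int32_t JitterPhaseCount(int32_t renderWidth, int32_t displayWidth)
{
    assert(renderWidth > 0);
    const float scale = static_cast<float>(displayWidth) / static_cast<float>(renderWidth);
    return static_cast<int32_t>(8.0f * scale * scale);
}

struct Jitter
{
    float x;
    float y;
};

// Sub-pixel offset in [-0.5, 0.5] for the given frame index.
inline Jitter JitterOffset(int32_t index, int32_t phaseCount)
{
    assert(phaseCount > 0);
    const int32_t i = (index % phaseCount) + 1;
    return {Halton(i, 2) - 0.5f, Halton(i, 3) - 0.5f};
}
```

The sequence repeats after phaseCount frames, which is why the frame index is wrapped before it is used.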

Dispatch upscaling

To dispatch, call ffxDispatch with a description of type ffxDispatchDescUpscale. The structure is declared as follows:

#define FFX_API_DISPATCH_DESC_TYPE_UPSCALE 0x00010001u
struct ffxDispatchDescUpscale
{
    ffxDispatchDescHeader      header;
    void*                      commandList;
    struct FfxApiResource      color;
    struct FfxApiResource      depth;
    struct FfxApiResource      motionVectors;
    struct FfxApiResource      exposure;
    struct FfxApiResource      reactive;
    struct FfxApiResource      transparencyAndComposition;
    struct FfxApiResource      output;
    struct FfxApiFloatCoords2D jitterOffset;
    struct FfxApiFloatCoords2D motionVectorScale;
    struct FfxApiDimensions2D  renderSize;
    struct FfxApiDimensions2D  upscaleSize;
    bool                       enableSharpening;
    float                      sharpness;
    float                      frameTimeDelta;
    float                      preExposure;
    bool                       reset;
    float                      cameraNear;
    float                      cameraFar;
    float                      cameraFovAngleVertical;
    float                      viewSpaceToMetersFactor;
    uint32_t                   flags;
};

Details about inputs, outputs and placement in the frame are described in relevant sections of the upscaler documentation.

Shortened example from the FSR sample using C++ helpers:

ffx::DispatchDescUpscale dispatchUpscale{};

dispatchUpscale.commandList = pCmdList->GetImpl()->DX12CmdList();

dispatchUpscale.color                      = SDKWrapper::ffxGetResourceApi(m_pTempTexture->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);
dispatchUpscale.depth                      = SDKWrapper::ffxGetResourceApi(m_pDepthTarget->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);
dispatchUpscale.motionVectors              = SDKWrapper::ffxGetResourceApi(m_pMotionVectors->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);
dispatchUpscale.output                     = SDKWrapper::ffxGetResourceApi(m_pColorTarget->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);
dispatchUpscale.reactive                   = SDKWrapper::ffxGetResourceApi(m_pReactiveMask->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);
dispatchUpscale.transparencyAndComposition = SDKWrapper::ffxGetResourceApi(m_pCompositionMask->GetResource(), FFX_API_RESOURCE_STATE_PIXEL_COMPUTE_READ);

// Jitter is calculated earlier in the frame using a callback from the camera update
dispatchUpscale.jitterOffset.x      = -m_JitterX;
dispatchUpscale.jitterOffset.y      = -m_JitterY;
dispatchUpscale.motionVectorScale.x = resInfo.fRenderWidth();
dispatchUpscale.motionVectorScale.y = resInfo.fRenderHeight();
dispatchUpscale.reset               = m_ResetUpscale;
dispatchUpscale.enableSharpening    = m_RCASSharpen;
dispatchUpscale.sharpness           = m_Sharpness;

// Cauldron keeps time in seconds, but FSR expects milliseconds
dispatchUpscale.frameTimeDelta = static_cast<float>(deltaTime * 1000.f);

dispatchUpscale.preExposure        = GetScene()->GetSceneExposure();
dispatchUpscale.renderSize.width   = resInfo.RenderWidth;
dispatchUpscale.renderSize.height  = resInfo.RenderHeight;
dispatchUpscale.upscaleSize.width  = resInfo.UpscaleWidth;
dispatchUpscale.upscaleSize.height = resInfo.UpscaleHeight;

// Setup camera params as required
dispatchUpscale.cameraFovAngleVertical = pCamera->GetFovY();

if (s_InvertedDepth)
{
    dispatchUpscale.cameraFar  = pCamera->GetNearPlane();
    dispatchUpscale.cameraNear = FLT_MAX;
}
else
{
    dispatchUpscale.cameraFar  = pCamera->GetFarPlane();
    dispatchUpscale.cameraNear = pCamera->GetNearPlane();
}

ffx::ReturnCode retCode = ffx::Dispatch(m_UpscalingContext, dispatchUpscale);
CauldronAssert(ASSERT_CRITICAL, !!retCode, L"Dispatching FSR upscaling failed: %d", (uint32_t)retCode);

The full code can be found in fsrapirendermodule.cpp.

Enable FSR3’s proxy frame generation swapchain

For ease of integration, FSR3 provides a frame generation swapchain, which provides an interface similar to IDXGISwapChain and VkSwapchainKHR. These classes can replace the “normal” swapchain and handle dispatching the frame generation and UI composition workloads, as well as modulating frame pacing to ensure frames are displayed at roughly even pacing. They are handled as part of a context with its own lifecycle, separate from the frame generation context.

Using the frame generation swapchain has been optimized to ensure low latency, minimize tearing, and work well with variable refresh rate displays.

Since replacing the swapchain is not allowed while in full-screen mode, the frame generation swapchain supports a passthrough mode with minimal overhead so that frame generation can be easily disabled without the need to recreate the swapchain.

Snippet from the FSR sample for DirectX 12:

#include <ffx_api/dx12/ffx_api_dx12.hpp>

IDXGISwapChain4* dxgiSwapchain = GetSwapChain()->GetImpl()->DX12SwapChain();
dxgiSwapchain->AddRef();
// Unset the swapchain in the engine
cauldron::GetSwapChain()->GetImpl()->SetDXGISwapChain(nullptr);

// For illustration, uses most elaborate call.
// Alternative 1: without hwnd, use ffxCreateContextDescFrameGenerationSwapChainNewDX12
// Alternative 2: replace existing swapchain, use ffxCreateContextDescFrameGenerationSwapChainWrapDX12
ffx::CreateContextDescFrameGenerationSwapChainForHwndDX12 createSwapChainDesc{};
dxgiSwapchain->GetHwnd(&createSwapChainDesc.hwnd);
DXGI_SWAP_CHAIN_DESC1 desc1;
dxgiSwapchain->GetDesc1(&desc1);
createSwapChainDesc.desc = &desc1;
DXGI_SWAP_CHAIN_FULLSCREEN_DESC fullscreenDesc;
dxgiSwapchain->GetFullscreenDesc(&fullscreenDesc);
createSwapChainDesc.fullscreenDesc = &fullscreenDesc;
dxgiSwapchain->GetParent(IID_PPV_ARGS(&createSwapChainDesc.dxgiFactory));
createSwapChainDesc.gameQueue = GetDevice()->GetImpl()->DX12CmdQueue(cauldron::CommandQueue::Graphics);

dxgiSwapchain->Release();
dxgiSwapchain = nullptr;
createSwapChainDesc.swapchain = &dxgiSwapchain;

ffx::ReturnCode retCode = ffx::CreateContext(m_SwapChainContext, nullptr, createSwapChainDesc);
CauldronAssert(ASSERT_CRITICAL, retCode == ffx::ReturnCode::Ok, L"Couldn't create the ffxapi fg swapchain (dx12): %d", (uint32_t)retCode);
createSwapChainDesc.dxgiFactory->Release();

// Set new swapchain in engine
cauldron::GetSwapChain()->GetImpl()->SetDXGISwapChain(dxgiSwapchain);

// In case the app is handling Alt-Enter manually we need to update the window association after creating a different swapchain
IDXGIFactory7* factory = nullptr;
if (SUCCEEDED(dxgiSwapchain->GetParent(IID_PPV_ARGS(&factory))))
{
    factory->MakeWindowAssociation(cauldron::GetFramework()->GetImpl()->GetHWND(), DXGI_MWA_NO_WINDOW_CHANGES);
    factory->Release();
}

dxgiSwapchain->Release();

// Call SetHDRMetaData and SetColorSpace1 if needed
cauldron::GetSwapChain()->SetHDRMetadataAndColorspace();

Snippet from the FSR sample for Vulkan:

cauldron::SwapChain* pSwapchain       = GetSwapChain();
VkSwapchainKHR       currentSwapchain = pSwapchain->GetImpl()->VKSwapChain();

ffx::CreateContextDescFrameGenerationSwapChainVK createSwapChainDesc{};
createSwapChainDesc.physicalDevice                = cauldron::GetDevice()->GetImpl()->VKPhysicalDevice();
createSwapChainDesc.device                        = cauldron::GetDevice()->GetImpl()->VKDevice();
// Pass swapchain to be replaced. Can also be null to only create new swapchain.
createSwapChainDesc.swapchain                     = &currentSwapchain;
createSwapChainDesc.createInfo                    = *cauldron::GetFramework()->GetSwapChain()->GetImpl()->GetCreateInfo();
createSwapChainDesc.allocator                     = nullptr;
// Set queues
createSwapChainDesc.gameQueue.queue               = cauldron::GetDevice()->GetImpl()->VKCmdQueue(cauldron::CommandQueue::Graphics);
createSwapChainDesc.gameQueue.familyIndex         = cauldron::GetDevice()->GetImpl()->GetQueueFamilies().familyIndices[cauldron::RequestedQueue::Graphics];
createSwapChainDesc.gameQueue.submitFunc          = nullptr;  // this queue is only used in vkQueuePresentKHR, hence doesn't need a callback
createSwapChainDesc.asyncComputeQueue.queue       = cauldron::GetDevice()->GetImpl()->GetFIAsyncComputeQueue()->queue;
createSwapChainDesc.asyncComputeQueue.familyIndex = cauldron::GetDevice()->GetImpl()->GetQueueFamilies().familyIndices[cauldron::RequestedQueue::FIAsyncCompute];
createSwapChainDesc.asyncComputeQueue.submitFunc  = nullptr;
createSwapChainDesc.presentQueue.queue            = cauldron::GetDevice()->GetImpl()->GetFIPresentQueue()->queue;
createSwapChainDesc.presentQueue.familyIndex      = cauldron::GetDevice()->GetImpl()->GetQueueFamilies().familyIndices[cauldron::RequestedQueue::FIPresent];
createSwapChainDesc.presentQueue.submitFunc       = nullptr;
createSwapChainDesc.imageAcquireQueue.queue       = cauldron::GetDevice()->GetImpl()->GetFIImageAcquireQueue()->queue;
createSwapChainDesc.imageAcquireQueue.familyIndex = cauldron::GetDevice()->GetImpl()->GetQueueFamilies().familyIndices[cauldron::RequestedQueue::FIImageAcquire];
createSwapChainDesc.imageAcquireQueue.submitFunc  = nullptr;

// make sure swapchain is not holding a ref to real swapchain
cauldron::GetFramework()->GetSwapChain()->GetImpl()->SetVKSwapChain(VK_NULL_HANDLE);

auto convertQueueInfo = [](VkQueueInfoFFXAPI queueInfo) {
    VkQueueInfoFFX info;
    info.queue       = queueInfo.queue;
    info.familyIndex = queueInfo.familyIndex;
    info.submitFunc  = queueInfo.submitFunc;
    return info;
};

VkFrameInterpolationInfoFFX frameInterpolationInfo = {};
frameInterpolationInfo.device                      = createSwapChainDesc.device;
frameInterpolationInfo.physicalDevice              = createSwapChainDesc.physicalDevice;
frameInterpolationInfo.pAllocator                  = createSwapChainDesc.allocator;
frameInterpolationInfo.gameQueue                   = convertQueueInfo(createSwapChainDesc.gameQueue);
frameInterpolationInfo.asyncComputeQueue           = convertQueueInfo(createSwapChainDesc.asyncComputeQueue);
frameInterpolationInfo.presentQueue                = convertQueueInfo(createSwapChainDesc.presentQueue);
frameInterpolationInfo.imageAcquireQueue           = convertQueueInfo(createSwapChainDesc.imageAcquireQueue);

ffx::ReturnCode retCode = ffx::CreateContext(m_SwapChainContext, nullptr, createSwapChainDesc);

// Get replacement function pointers
ffx::QueryDescSwapchainReplacementFunctionsVK replacementFunctions{};
ffx::Query(m_SwapChainContext, replacementFunctions);
cauldron::GetDevice()->GetImpl()->SetSwapchainMethodsAndContext(
    nullptr,
    nullptr,
    replacementFunctions.pOutGetSwapchainImagesKHR,
    replacementFunctions.pOutAcquireNextImageKHR,
    replacementFunctions.pOutQueuePresentKHR,
    replacementFunctions.pOutSetHdrMetadataEXT,
    replacementFunctions.pOutCreateSwapchainFFXAPI,
    replacementFunctions.pOutDestroySwapchainFFXAPI,
    nullptr,
    replacementFunctions.pOutGetLastPresentCountFFXAPI,
    m_SwapChainContext,
    &frameInterpolationInfo);

// Set frameinterpolation swapchain to engine
cauldron::GetFramework()->GetSwapChain()->GetImpl()->SetVKSwapChain(currentSwapchain, true);

After this, the application should run the same as before. Frame generation is not yet enabled.

Create frame generation context

Similar to context creation for upscaling, call ffxCreateContext with a description for frame generation and a backend description.

Example using the C++ helpers:

ffx::Context frameGenContext;
ffx::CreateBackendDX12Desc backendDesc{};
backendDesc.device = GetDevice()->DX12Device();

ffx::CreateContextDescFrameGeneration createFg{};
createFg.displaySize = {resInfo.DisplayWidth, resInfo.DisplayHeight};
createFg.maxRenderSize = {resInfo.DisplayWidth, resInfo.DisplayHeight};
createFg.flags = FFX_FRAMEGENERATION_ENABLE_HIGH_DYNAMIC_RANGE;

if (m_EnableAsyncCompute)
    createFg.flags |= FFX_FRAMEGENERATION_ENABLE_ASYNC_WORKLOAD_SUPPORT;

createFg.backBufferFormat = SDKWrapper::GetFfxSurfaceFormat(GetFramework()->GetSwapChain()->GetSwapChainFormat());
ffx::ReturnCode retCode = ffx::CreateContext(frameGenContext, nullptr, createFg, backendDesc);

Configure frame generation

Configure frame generation by filling out the ffxConfigureDescFrameGeneration structure with the required arguments and calling ffxConfigure.

This must be called once per frame. The frame ID must increment by exactly 1 each frame. Any other difference between consecutive frames will reset frame generation logic.
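
The frame ID requirement can be sketched as a small counter owned by the application. The class below is a hypothetical helper, not part of the FidelityFX API. It hands out one ID per frame, to be used by every frame generation call in that frame, and shows the continuity check that the text above describes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a single source of truth for the per-frame ID.
class FrameIdTracker
{
public:
    // Call once at the start of each rendered frame and reuse the returned
    // value for all frame generation calls made during that frame.
    uint64_t BeginFrame() { return ++m_current; }

    uint64_t Current() const { return m_current; }

    // True when 'next' follows 'previous' by exactly one. Any other
    // difference would be treated as a discontinuity and reset the
    // frame generation logic.
    static bool IsContinuous(uint64_t previous, uint64_t next)
    {
        return next == previous + 1;
    }

private:
    uint64_t m_current = 0;
};
```

Skipping a frame's configure call, or incrementing the counter twice in one frame, would break continuity in this model.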

// Update frame generation config
FfxApiResource hudLessResource = SDKWrapper::ffxGetResourceApi(m_pHudLessTexture[m_curUiTextureIndex]->GetResource(), FFX_API_RESOURCE_STATE_COMPUTE_READ);

m_FrameGenerationConfig.frameGenerationEnabled = m_FrameInterpolation;
m_FrameGenerationConfig.flags                  = 0;
m_FrameGenerationConfig.flags |= m_DrawFrameGenerationDebugTearLines ? FFX_FRAMEGENERATION_FLAG_DRAW_DEBUG_TEAR_LINES : 0;
m_FrameGenerationConfig.flags |= m_DrawFrameGenerationDebugResetIndicators ? FFX_FRAMEGENERATION_FLAG_DRAW_DEBUG_RESET_INDICATORS : 0;
m_FrameGenerationConfig.flags |= m_DrawFrameGenerationDebugView ? FFX_FRAMEGENERATION_FLAG_DRAW_DEBUG_VIEW : 0;
m_FrameGenerationConfig.HUDLessColor = (s_uiRenderMode == 3) ? hudLessResource : FfxApiResource({});
m_FrameGenerationConfig.allowAsyncWorkloads = m_AllowAsyncCompute && m_EnableAsyncCompute;
// assume symmetric letterbox
m_FrameGenerationConfig.generationRect.left   = (resInfo.DisplayWidth - resInfo.UpscaleWidth) / 2;
m_FrameGenerationConfig.generationRect.top    = (resInfo.DisplayHeight - resInfo.UpscaleHeight) / 2;
m_FrameGenerationConfig.generationRect.width  = resInfo.UpscaleWidth;
m_FrameGenerationConfig.generationRect.height = resInfo.UpscaleHeight;
// For sample purposes only. Most applications will use one or the other.
if (m_UseCallback)
{
    m_FrameGenerationConfig.frameGenerationCallback = [](ffxDispatchDescFrameGeneration* params, void* pUserCtx) -> ffxReturnCode_t
    {
        return ffxDispatch(reinterpret_cast<ffxContext*>(pUserCtx), &params->header);
    };
    m_FrameGenerationConfig.frameGenerationCallbackUserContext = &m_FrameGenContext;
}
else
{
    m_FrameGenerationConfig.frameGenerationCallback = nullptr;
    m_FrameGenerationConfig.frameGenerationCallbackUserContext = nullptr;
}
m_FrameGenerationConfig.onlyPresentGenerated = m_PresentInterpolatedOnly;
m_FrameGenerationConfig.frameID = m_FrameID;

m_FrameGenerationConfig.swapChain = GetSwapChain()->GetImpl()->DX12SwapChain();

ffx::ReturnCode retCode = ffx::Configure(m_FrameGenContext, m_FrameGenerationConfig);
CauldronAssert(ASSERT_CRITICAL, !!retCode, L"Configuring FSR FG failed: %d", (uint32_t)retCode);

If using the frame generation callback, the swapchain will call the callback with appropriate parameters. Otherwise, the application is responsible for calling the frame generation dispatch and setting parameters itself. In that case, the frame ID must be equal to the frame ID used in configuration. The command list and output texture can be queried from the frame generation context using ffxQuery. See the sample code for an example.

The user context pointers will only be passed into the respective callback functions. FSR code will not attempt to dereference them.
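
The generationRect values in the configuration snippet assume a symmetric letterbox, meaning the upscaled image is centered inside the display area. A minimal sketch of that calculation in isolation follows; the struct and function names are hypothetical and only mirror the four fields set in the snippet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in mirroring the left/top/width/height fields used above.
struct GenerationRect
{
    int32_t left;
    int32_t top;
    int32_t width;
    int32_t height;
};

// Centers an upscaled image of the given size inside the display area.
inline GenerationRect CenteredGenerationRect(uint32_t displayWidth,
                                             uint32_t displayHeight,
                                             uint32_t upscaleWidth,
                                             uint32_t upscaleHeight)
{
    // The upscaled image must fit inside the display, otherwise the
    // unsigned subtraction below would wrap around.
    assert(upscaleWidth <= displayWidth && upscaleHeight <= displayHeight);
    return {static_cast<int32_t>((displayWidth - upscaleWidth) / 2),
            static_cast<int32_t>((displayHeight - upscaleHeight) / 2),
            static_cast<int32_t>(upscaleWidth),
            static_cast<int32_t>(upscaleHeight)};
}
```

For a 3840×1600 image shown on a 3840×2160 display, this gives a rect starting at (0, 280). When the upscale size equals the display size, the rect covers the whole backbuffer.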

When allowAsyncWorkloads is set to false, the main graphics queue will be used to execute the Optical Flow and Frame Generation workloads. It is strongly advised to profile whether asynchronous compute provides a significant performance benefit. Not using asynchronous compute results in lower memory overhead.

Note that UI composition and presents will always get executed on an async queue, so they can be paced and injected into the middle of the workloads generating the next frame.

FSR3 non async workflow

When allowAsyncWorkloads is set to true, the Optical Flow and Frame Generation workloads will run on an asynchronous compute queue and overlap with workloads of the next frame on the main game graphics queue. This can improve performance depending on the GPU and workloads.

FSR3 async workflow

UI Composition

For frame interpolation, the user interface requires special treatment. Otherwise, very noticeable artifacts will be generated, which can impact the readability of the interface.

To prevent these artifacts, FSR3 supports several options for handling the UI:

The preferred method is to use the presentCallback. The function provided in this parameter will get called once for every frame presented and allows the application to schedule the GPU workload required to render the UI. By using this function, the application can reduce UI input latency and render effects that do not work well with frame generation (e.g. film grain).

The UI composition callback function will be called for every frame (real or generated) to allow rendering the UI separately for each presented frame, so the UI can get rendered at presentation rate to achieve smooth UI animations.

ffxReturnCode_t FSR3RenderModule::UiCompositionCallback(ffxCallbackDescFrameGenerationPresent* params, void* userCtx)
{
    ID3D12GraphicsCommandList2* pDxCmdList  = reinterpret_cast<ID3D12GraphicsCommandList2*>(params->commandList);
    ID3D12Resource*             pRtResource = reinterpret_cast<ID3D12Resource*>(params->outputSwapChainBuffer.resource);
    ID3D12Resource*             pBbResource = reinterpret_cast<ID3D12Resource*>(params->currentBackBuffer.resource);

    // Use pDxCmdList to copy pBbResource and render UI into the outputSwapChainBuffer.
    // The backbuffer is provided as SRV so postprocessing (e.g. adding a blur effect behind the UI) can easily be applied

    return FFX_API_RETURN_OK;
}

If frame generation is disabled, presentCallback will still get called on present.

The second option to handle the UI is to render it into a dedicated surface that is blended onto the interpolated and real backbuffers before present. Composition of this surface can be done automatically by the proxy swapchain or manually in the presentCallback. This method presents a UI unaffected by frame interpolation; however, the UI will only be rendered at render rate. For applications with a largely static UI, this might be a good solution without the additional overhead of rendering the UI at presentation rate.

If frame generation is disabled and the UI Texture is provided, UI composition will still get executed by the frame interpolation swapchain.

In that case the surface needs to be registered to the swap chain by calling ffxConfigure with a ffxConfigureDescFrameGenerationSwapChainRegisterUiResourceDX12 structure:

FfxResource uiColor = ffxGetResource(m_pUiTexture[m_curUiTextureIndex]->GetResource(), L"FSR3_UiTexture", FFX_RESOURCE_STATE_PIXEL_COMPUTE_READ);
ffx::ConfigureDescFrameGenerationSwapChainRegisterUiResourceDX12 uiConfig{};
uiConfig.uiResource = uiColor;
uiConfig.flags      = m_DoublebufferInSwapchain ? FFX_FRAMEGENERATION_UI_COMPOSITION_FLAG_ENABLE_INTERNAL_UI_DOUBLE_BUFFERING : 0;
ffx::Configure(m_SwapChainContext, uiConfig);

The final method to handle the UI is to provide a HUDLessColor surface in the FfxFrameGenerationConfig. This surface is used during frame interpolation to detect the UI and avoid distortion on UI elements. This method has been added for compatibility with engines that cannot apply either of the other two options for UI rendering.

Dispatch frame generation preparation

Since version 3.1.0, frame generation runs independently of FSR upscaling. To replace the resources previously shared with the upscaler, a new frame generation prepare pass is required.

After the call to ffxConfigure, fill out the ffxDispatchDescFrameGenerationPrepare structure and pass it to ffxDispatch along with the frame generation context.

For fields also found in ffxDispatchDescUpscale, the same input requirements and recommendations apply here.

Set the frameID to the same value as in the configure description.

Shutdown

During shutdown, disable UI handling and frame generation in the proxy swapchain and destroy the contexts:

// disable frame generation before destroying context
// also unset present callback, HUDLessColor and UiTexture to have the swapchain only present the backbuffer
m_FrameGenerationConfig.frameGenerationEnabled = false;
m_FrameGenerationConfig.swapChain              = GetSwapChain()->GetImpl()->DX12SwapChain();
m_FrameGenerationConfig.presentCallback        = nullptr;
m_FrameGenerationConfig.HUDLessColor           = FfxApiResource({});
ffx::Configure(m_FrameGenContext, m_FrameGenerationConfig);

ffx::ConfigureDescFrameGenerationSwapChainRegisterUiResourceDX12 uiConfig{};
uiConfig.uiResource = {};
uiConfig.flags = 0;
ffx::Configure(m_SwapChainContext, uiConfig);

// Destroy the contexts
ffx::DestroyContext(m_UpscalingContext);
ffx::DestroyContext(m_FrameGenContext);

Finally, destroy the proxy swap chain by releasing the handle and destroying the context with ffxDestroyContext, then re-create the normal DX12 swap chain.

Thread safety

The ffx-api context is not guaranteed to be thread safe. In this technique, FrameGenContext and SwapChainContext are not thread safe. Symptoms of a race condition include access violation crashes, visual artifacts in interpolated frames, and an infinite wait in the Dx12CommandPool destructor when releasing the swapchain. It is not obvious, but FrameInterpolationSwapchainDX12::Present() accesses both SwapChainContext and FrameGenContext (to dispatch Optical Flow and Frame Generation). A race condition occurs if application threads can simultaneously call FrameInterpolationSwapchainDX12::Present() and Dispatch(m_FrameGenContext, DispatchDescFrameGenerationPrepare). Another occurs if application threads can simultaneously call FrameInterpolationSwapchainDX12::Present() and DestroyContext(SwapChainContext). To avoid this, the application can acquire a mutex before calling any ffx function that accesses FrameGenContext or SwapChainContext, which guarantees that at most one thread accesses a context at any time.
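
One possible way to apply the mutex advice is to hide the contexts behind a wrapper that only exposes them while a lock is held. The sketch below is an assumption about application-side structure, not part of the FidelityFX API. FakeContext is a placeholder type standing in for the real context handles, and GuardedContext is a hypothetical name.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Placeholder for illustration only; a real application would store its
// frame generation and swapchain contexts here instead.
struct FakeContext
{
    int calls = 0;
};

// Hypothetical wrapper: the context is reachable only through With(),
// which holds the mutex for the duration of the callable.
template <typename Context>
class GuardedContext
{
public:
    template <typename Fn>
    auto With(Fn&& fn)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return std::forward<Fn>(fn)(m_context);
    }

private:
    std::mutex m_mutex;
    Context    m_context{};
};
```

Because Present() touches both contexts, a single wrapper instance (and therefore a single mutex) would need to cover both of them, and the present call, the prepare dispatch and context destruction would all go through With(). The callable must not call With() again on the same wrapper, since std::mutex is not recursive.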

Resource Lifetime

When UiTexture composition mode is used

If FFX_FRAMEGENERATION_UI_COMPOSITION_FLAG_ENABLE_INTERNAL_UI_DOUBLE_BUFFERING is set:

  • The UiTexture gets copied to an internal resource on the game queue

  • The UiTexture may be reused on the GFX queue immediately in the next frame

If FFX_FRAMEGENERATION_UI_COMPOSITION_FLAG_ENABLE_INTERNAL_UI_DOUBLE_BUFFERING is not set:

  • The application is responsible for ensuring that the UiTexture persists until composition of the real frame is finished

  • This is typically in the middle of the next frame, so the UiTexture should not be used during the next frame. The application must ensure double buffering of the UiTexture

When HUDLess composition mode is used:

  • The HUDLess texture will be used during FrameInterpolation

  • The application is responsible for ensuring that it persists until FrameInterpolation is complete

  • If FfxFrameGenerationConfig::allowAsyncWorkloads is true:

    • FrameInterpolation happens on an async compute queue, so the HUDLess texture needs to be double buffered by the application

  • If FfxFrameGenerationConfig::allowAsyncWorkloads is false:

    • FrameInterpolation happens on the game GFX queue, so the application can safely modify the HUDLess texture in the next frame

When distortionField texture is registered to FrameInterpolation:

  • The application is responsible for ensuring that the distortionField texture persists until FrameInterpolation is complete

  • If FfxFrameGenerationConfig::allowAsyncWorkloads is true:

    • FrameInterpolation happens on an async compute queue, so the distortionField texture needs to be double buffered by the application

  • If FfxFrameGenerationConfig::allowAsyncWorkloads is false:

    • FrameInterpolation happens on the game GFX queue, so the application can safely modify the distortionField texture in the next frame
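
Where the rules above call for double buffering by the application, a two-slot container with an index that flips once per frame is usually enough. The following is a minimal sketch with a hypothetical class name; the resource type is left generic and could be whatever texture handle the engine uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: holds two copies of a resource and alternates
// between them, so the copy written last frame is left untouched while
// the GPU may still be reading it.
template <typename Resource>
class DoubleBuffered
{
public:
    DoubleBuffered(Resource first, Resource second)
        : m_resources{{first, second}}
    {
    }

    // The copy the application may write to during the current frame.
    Resource& Current() { return m_resources[m_index]; }

    // The copy written during the previous frame; treat as read-only.
    const Resource& Previous() const { return m_resources[1 - m_index]; }

    // Call once per frame, after the current copy has been handed to FSR.
    void Advance() { m_index = 1 - m_index; }

    std::size_t Index() const { return m_index; }

private:
    std::array<Resource, 2> m_resources;
    std::size_t             m_index = 0;
};
```

This corresponds to the m_curUiTextureIndex pattern used in the sample snippets, where the index selects which of two textures is rendered to and registered in a given frame.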

The Technique

FSR3 is a container effect consisting of four components. For details on each component, please refer to the dedicated documentation page:

  1. FfxFsr3Upscaler

  2. FfxOpticalFlow

  3. FfxFrameinterpolation

  4. Frame generation swapchain

Memory Usage

Figures are given in MB, were measured on a Radeon RX 7900 XTX using DirectX 12, and are subject to change. They do not include frame generation swapchain overheads.

Output resolution   Quality       Memory usage (upscaler)   Memory usage (frame generation)   Total memory usage
3840×2160           native AA     498.88                    473                               971.88
3840×2160           quality       306                       268                               574
3840×2160           balanced      268.98                    222.91                            491.89
3840×2160           performance   237.26                    189.88                            427.14
2560×1440           native AA     227.82                    221.74                            449.56
2560×1440           quality       140.27                    125.27                            265.54
2560×1440           balanced      123.75                    104.3                             228.05
2560×1440           performance   106.98                    86.21                             193.19
1920×1080           native AA     130.83                    129.33                            260.16
1920×1080           quality       78.93                     71.92                             150.85
1920×1080           balanced      70.28                     63.01                             133.29
1920×1080           performance   63.72                     56.72                             120.44

See also