From version 4.0, the Windows® version of Compressonator supports GPU-based encoding with DirectX® Compute (DXC) or OpenCL™ (OCL) shaders. This tutorial explains how to use this feature. 

Encoding with GPU Options

The new encoding options can be selected from the command-line tool using “-EncodeWith <DXC | OCL>” or from the GUI application’s dialog using the “Encode with” drop-down selections “GPU_DirectX” or “GPU_OpenCL”, as shown in this example:
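For scripted use, the command-line form can be assembled programmatically. The sketch below only builds an argument list and does not run anything. The helper name build_encode_command is hypothetical; the flags (-EncodeWith, -fd, -log) and the HPC value are the ones that appear in the commands elsewhere in this tutorial.

```python
def build_encode_command(source, destination, codec="BC1", encode_with=None, log=False):
    """Return an argument list for CompressonatorCLI (nothing is executed here)."""
    # Only the values that appear in this tutorial are accepted.
    if encode_with not in (None, "DXC", "OCL", "HPC"):
        raise ValueError(f"unexpected -EncodeWith value: {encode_with!r}")
    args = ["CompressonatorCLI"]
    if log:
        args.append("-log")
    args += ["-fd", codec]
    if encode_with is not None:
        args += ["-EncodeWith", encode_with]
    args += [source, destination]
    return args


command = build_encode_command(r".\images\ruby.bmp", "ruby_bc1.dds", encode_with="OCL")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine where CompressonatorCLI is on the PATH.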

As an alternative to the GUI or CLI tools, developers can add Compressonator SDK to their own application using:

  • CMP_Core components for encoding with BC1 to BC5, which support processing with OpenCL™ and DirectX® Compute pipelines.
  • The sample compute HLSL shader interfaces that use CMP_Framework with a CMP_GPU_DXC component library.
  • CMP_Core BC1 to BC5 and BC7 encoders that use CMP_Framework with a CMP_GPU_OCL component library.

How Shader Compiling Is Set Up

When using the Compressonator GUI or CLI with the GPU-based encoding options DXC or OCL for the first time, you will notice a long delay for each codec (BC1 to BC7) being used. This is because each of these shader sources must be compiled using the D3D or OpenCL™ compiler. Once they are compiled, the shader binaries (“.cmp” files) are cached for quick use in subsequent runs. If the compilation of these shaders fails on a certain GPU platform, the GUI or CLI tools will automatically switch to CPU encoding, which guarantees that processed results are always available.
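The compile-once, cache, and fall-back behavior described above can be illustrated with a minimal sketch. This is not Compressonator's implementation: the cache file naming, the hash-based key, the ShaderBuildError type, and the stand-in compile functions are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


class ShaderBuildError(Exception):
    """Raised by a compile function when a shader cannot be built (hypothetical)."""


def load_or_compile(codec, source_text, cache_dir, compile_fn):
    """Return a compiled shader binary, compiling only on a cache miss."""
    # Hypothetical cache key: codec name plus a hash of the shader source.
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()[:16]
    cached = Path(cache_dir) / f"{codec}_{digest}.cmp"
    if cached.exists():
        return cached.read_bytes()      # fast path on later runs
    binary = compile_fn(source_text)    # slow path on first use
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(binary)
    return binary


def pick_encoder(codec, source_text, cache_dir, compile_fn):
    """Return ("GPU", binary), or ("CPU", None) when the shader fails to build."""
    try:
        return "GPU", load_or_compile(codec, source_text, cache_dir, compile_fn)
    except ShaderBuildError:
        return "CPU", None


# Demo with stand-in compilers; no real D3D or OpenCL compiler is involved.
compile_calls = []


def fake_compile(source_text):
    compile_calls.append(source_text)
    return b"binary:" + source_text.encode("utf-8")


def failing_compile(source_text):
    raise ShaderBuildError("unsupported GPU platform")


with tempfile.TemporaryDirectory() as cache_dir:
    first = pick_encoder("BC1", "shader source", cache_dir, fake_compile)
    second = pick_encoder("BC1", "shader source", cache_dir, fake_compile)
    fallback = pick_encoder("BC7", "other source", cache_dir, failing_compile)

print(first[0], second[0], fallback[0], len(compile_calls))
```

In the demo, the second request for the same codec is served from the cached file, so the stand-in compiler runs only once.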

Working With MIP Maps Using GPU Encoding

Requested MIP map level generation depends on the original image width and height, with each level's dimensions being half those of its direct predecessor. For MIP level generation with GPU, the resulting dimensions must also be divisible by four. The Compressonator GUI and CLI applications will automatically adjust the user-requested levels, with a warning message, if a specific level cannot be achieved during processing.
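As a rough illustration of the divisible-by-four constraint, the sketch below lists the levels that could be produced for a given image size. It assumes that the base image counts as the first level and that generation stops at the first level whose width or height is not a multiple of four; the exact adjustment rule used by the tools is not stated here, and the function name gpu_mip_levels is hypothetical.

```python
def gpu_mip_levels(width, height, requested):
    """Return the (width, height) of each level that satisfies the GPU constraint."""
    levels = []
    w, h = width, height
    while len(levels) < requested and w > 0 and h > 0 and w % 4 == 0 and h % 4 == 0:
        levels.append((w, h))
        # Each level is half the size of its predecessor.
        w //= 2
        h //= 2
    return levels


# A 576 x 416 image with 10 levels requested.
achieved = gpu_mip_levels(576, 416, requested=10)
for level, (w, h) in enumerate(achieved):
    print(level, w, h)
if len(achieved) < 10:
    print("warning: requested levels reduced to", len(achieved))
```

For 576 × 416, the next candidate after 72 × 52 would be 36 × 26, and 26 is not divisible by four, so under these assumptions the request is reduced.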

Automated Make Compatible Feature  

With the GUI or CLI tools, users can compress HDR and LDR images interchangeably between any BCn codec, without first having to transform the format of the source image.

When using GPU-based encoders, both the GUI and CLI tools use temporary CPU-based buffers to make the encoding seamless. To see how this works, try compressing an EXR image to formats like BC1 to BC5 or BC7 using the “Encode with” option for GPU_DirectX or GPU_OpenCL.

If you are unfamiliar with how to process textures, check the online tutorial on getting started using sample projects.

Using Codec Quality Settings

In this release, the quality setting for both GPU and CPU has been modified based on which BCn codec is being used to process images.

BC1, BC2, and BC3

There are three coarse settings, each using a different algorithm for fast, medium, and slow processing speeds. Currently, the quality of each algorithm remains the same. In the next release this will change so that these codecs have a much smoother combined range of settings from 0.0 to 1.0.

Quality setting    Result
0.0 to 0.1         Fast processing, a min/max algorithm is used.
0.101 to 0.6       Medium processing, a minimal distance algorithm is used.
0.601 to 1.0       Slow processing, a ramp algorithm is used.
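The three ranges can be expressed as a small lookup. This is a sketch of the table only, not the encoder's code. The function name is hypothetical, and values that fall between the listed ranges (for example 0.1005) are assumed to belong to the next range up.

```python
def bc1_to_bc3_algorithm(quality):
    """Map a quality setting in [0.0, 1.0] to the algorithm named in the table."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0.0 and 1.0")
    if quality <= 0.1:
        return "min/max"            # fast
    if quality <= 0.6:
        return "minimal distance"   # medium
    return "ramp"                   # slow


for q in (0.05, 0.5, 1.0):
    print(q, bc1_to_bc3_algorithm(q))
```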

BC4 and BC5

The quality setting has no effect and performance remains the same. A simple packing algorithm is used and it does not have any refinement settings.

BC6 and BC7

These codecs have a full quality range from 0.0 to 1.0, and performance will vary from fast to very slow. The algorithms used here are more complex and are based on pre-calculated ramps and block partitions.

Using Global Quality Settings

In the GUI, users can override all individual destination compression settings with a single global value before processing. Currently, only the quality settings can be overridden with a new global setting.

The process is as follows:

On the “Project Explorer”, click once on “Double Click here to add files…” to select its view options:

A new property view will be displayed:

Set a new “Quality” value to override all existing quality settings for textures used in the “Project Explorer” dialog. A value of zero will restore the old values and disable the global settings.

When an override is set, the quality settings displayed in the “Property View” for each destination setting in the “Project Explorer” will show a non-editable override setting.

Notice that the “Double Click Here to add files” background color has also changed to indicate that an override setting is in effect. It will return to a white background if the override settings are turned off by setting the value to zero.

Once set, all texture processing will use the override value until it is turned off.
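The override rule can be summarized in a few lines. This is a sketch of the behavior as described, with a hypothetical function name and made-up texture names; it is not taken from the GUI's source.

```python
def effective_quality(texture_quality, global_quality=0.0):
    """Return the quality used for processing under the global override rule."""
    # A global value of zero means the override is disabled.
    if global_quality != 0.0:
        return global_quality
    return texture_quality


# Hypothetical per-destination settings.
per_texture = {"ruby_bc1": 0.05, "ruby_bc7": 0.8}
with_override = {name: effective_quality(q, 0.6) for name, q in per_texture.items()}
without_override = {name: effective_quality(q, 0.0) for name, q in per_texture.items()}
print(with_override)
print(without_override)
```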

Viewing Performance Numbers

An updated table view of the GUI Analysis Views is provided that includes GPU encode performance statistics.

With these stats, users can analyze the data generated on a specific GPU or CPU for performance versus image quality.

KPerf(ms) 

This estimates the time it takes to process 1000 (4×4 texel) blocks using the current encoder and GPU setup, where Perf(ms) is the time in milliseconds it took to process a single block of 16 pixels.

HPC performance monitoring uses CPU timers, while OCL and DXC use GPU performance query timers.

MTx/s

This measures throughput: the number of texels processed per second, in millions.
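The arithmetic behind the two metrics above can be sketched as follows. The formulas follow the definitions given in the text; whether the tools compute them in exactly this way is an assumption, and the timings in the example are invented values rather than measurements.

```python
TEXELS_PER_BLOCK = 16  # one 4x4 block


def kperf_ms(perf_ms_per_block):
    """Time in milliseconds for 1000 blocks, given the time for one block."""
    return 1000.0 * perf_ms_per_block


def mtx_per_second(texel_count, encode_seconds):
    """Throughput in millions of texels per second."""
    if encode_seconds <= 0:
        raise ValueError("encode_seconds must be positive")
    return texel_count / encode_seconds / 1_000_000


# Invented example: a 576 x 416 image encoded in 2 milliseconds.
texels = 576 * 416
blocks = texels // TEXELS_PER_BLOCK
print("blocks:", blocks)
print("MTx/s :", mtx_per_second(texels, 0.002))
print("KPerf :", kperf_ms(0.0002))
```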

Time(s)

A CPU-based timing that measures the overall end-to-end time it took for the image to be processed. It includes device setup, loading the image to the GPU, receiving the image from the GPU, and file I/O.

Using the Log Format Option

Users can generate log files for GPU and CPU usage stats and test automation. The process_results.txt logging format has changed to include KPerf(ms) and MTx/s. To accommodate these changes, the revision for the new log format has been updated from 1.0 to 1.1, as shown in this sample:

Using: CompressonatorCLI -log -fd BC1 -EncodeWith OCL .\images\ruby.bmp ruby_bc1.dds

The process_results.txt file contains:

CompressonatorCLI Performance Log v1.1

Source          : .\images\ruby.bmp, Height 416, Width 576, Linear size 0.936 MBytes
Destination     : ruby_bc1.dds
Processed to    : BC1 with 0 iteration(s) in 0.544 seconds
Using           : OCL
Quality         : 0.05
KPerf(ms)       : 0.193
Mpx/s           : 1242.000
MSE             : 9.33
PSNR            : 38.4
SSIM            : 0.9848
Total time      : 0.552 seconds
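For test automation, a log in this “name : value” layout can be read into a dictionary. The parser below is a minimal sketch based only on the sample shown above; it assumes every field sits on one line and that the first colon separates the name from the value. The function name is hypothetical.

```python
def parse_performance_log(text):
    """Parse 'name : value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip():
            fields[name.strip()] = value.strip()
    return fields


sample = r"""CompressonatorCLI Performance Log v1.1

Source          : .\images\ruby.bmp, Height 416, Width 576, Linear size 0.936 MBytes
Destination     : ruby_bc1.dds
Using           : OCL
Quality         : 0.05
KPerf(ms)       : 0.193
PSNR            : 38.4
"""

log = parse_performance_log(sample)
print(log["Using"], float(log["KPerf(ms)"]), float(log["PSNR"]))
```

Values are kept as strings so that the caller decides which fields to convert to numbers.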

Generating a CSV Log File 

In addition to -log and -logfile, two new command-line options have been added to output analysis data in a comma-separated file format.

Use -logcsv or -logcsvfile to generate a .csv file suitable to use in any application that supports viewing these files in table formats like the one shown in this sample:

Using: CompressonatorCLI -logcsv -fd BC1 -EncodeWith HPC  .\images\ruby.bmp ruby_bc1.dds

Followed by: CompressonatorCLI -logcsv -fd BC1 -EncodeWith OCL  .\images\ruby.bmp ruby_bc1.dds
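A generated .csv file can also be read from a script, for example to compare the HPC and OCL runs. The sketch below uses Python's standard csv module and assumes only that the first row of the file is a header row. The column names are not listed in this tutorial, so none are hard-coded, and the function name is hypothetical.

```python
import csv
import io


def parse_csv_log(text):
    """Return one dict per data row, keyed by whatever header names the file has."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


def print_rows(rows):
    """Print each row as 'column=value' pairs on one line."""
    for row in rows:
        print(", ".join(f"{column}={value}" for column, value in row.items()))


# Usage, with a path chosen by the caller:
#   from pathlib import Path
#   print_rows(parse_csv_log(Path("your_log.csv").read_text()))
```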
