Amit Ben-Moshe
Amit Ben-Moshe is a Technical Lead and a Principal Member of Technical Staff at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.
About CodeXL Analyzer CLI
CodeXL Analyzer CLI is an offline compiler and performance analysis tool for OpenCL kernels, DirectX® shaders and OpenGL® shaders. Using CodeXL Analyzer CLI, you can compile kernels and shaders for a variety of AMD GPUs and APUs, independent of your system hardware, and generate AMD ISA, intermediate language and performance statistics for each target platform.
Graphics engineers and developers of parallel-computing applications use CodeXL Analyzer CLI to identify performance bottlenecks and optimize their code. It also serves as the backend for shader compilation and performance statistics generation in two AMD tool products: CodeXL’s Analyze mode and GPU PerfStudio’s Shader Analyzer.
Key Features
- Compile OpenCL kernels and DirectX or OpenGL shaders, to generate AMD ISA code, intermediate language code, performance statistics and program binaries.
- Generate binaries and performance statistics for a variety of AMD GPUs and APUs, independent of the device that is physically installed on your system.
- Observe how different optimizations and compilation chains affect the performance of your kernels and shaders: 32-bit vs 64-bit, various compiler optimizations, kernel and shader code changes, and more.
CodeXL Analyzer CLI supports both Microsoft Windows® and Linux®.
Launching CodeXL Analyzer CLI
CodeXL Analyzer CLI’s commands consist of multiple command line switches. Some are relevant to all platforms (OpenCL, DirectX, OpenGL), while others are platform-specific. Below is a list of key command options that apply to all platforms:
Key basic options
Command line switch | Description | Comments |
-s | Specifies the platform: “cl” for OpenCL, “hlsl” for DirectX and “glsl” for OpenGL | Each invocation handles a single platform |
-h | Display the help menu for the selected platform | |
-c | Target device for which output is generated | Can appear multiple times; if not present, all devices are targeted by default |
-l | List the names of the supported devices | |
--isa | Generate textual ISA code and save the result to the given output full path | The Analyzer concatenates the device name to the file name to differentiate between the output of different devices |
-a | Generate performance statistics and save the result to the given output full path | The Analyzer concatenates the device name to the file name to differentiate between the output of different devices |
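To illustrate how the switches in the table combine, here is a minimal Python sketch that assembles (but does not run) a command line. The executable name `CodeXLAnalyzer`, the device names `Tahiti` and `Hawaii`, the output paths, and the placement of the input file as the last argument are all assumptions made for this example; use `-l` to see the device names your installation actually supports.

```python
import shlex


def basic_command(platform, source, devices=(), isa_path=None, stats_path=None):
    """Assemble an argument list from the platform-independent switches."""
    if platform not in ("cl", "hlsl", "glsl"):
        raise ValueError(f"unknown platform: {platform}")
    args = ["CodeXLAnalyzer", "-s", platform]  # executable name is an assumption
    for device in devices:                     # -c may appear multiple times
        args += ["-c", device]
    if isa_path:
        args += ["--isa", isa_path]            # textual ISA output path
    if stats_path:
        args += ["-a", stats_path]             # performance statistics output path
    args.append(source)                        # assumed: input file goes last
    return args


cmd = basic_command(
    "cl",
    "BinarySearch_Kernels.cl",
    devices=["Tahiti", "Hawaii"],              # hypothetical device names
    isa_path="out/bs.isa",                     # hypothetical output paths
    stats_path="out/bs.csv",
)
print(shlex.join(cmd))
```

Omitting `devices` leaves out `-c` entirely, which according to the table targets all devices.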
In the following sections, we will go through the key command options for specific platforms. We will focus on the most commonly used commands rather than cover every available option. For the full list of options, you can always use the -h command line switch.
Key OpenCL-specific options
For OpenCL kernels, CodeXL Analyzer CLI can compile high-level source code and extract AMD IL code and compiled binaries, in addition to textual ISA and performance statistics. Here are the options that are specific to OpenCL kernels:
Command line switch | Description | Comments |
--il | Generate textual AMD IL code and save the result to the given output file (full file path) | Output file name is changed to differentiate between the output of different devices |
-b | Save the compiled binaries to the given output file (full file path) | |
--kernel | Generate output for the given kernel | Use --kernel all to target all kernels |
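The OpenCL-specific switches can be sketched the same way. The snippet below only builds an argument list; the executable name `CodeXLAnalyzer`, the output paths, and passing the input file as the last argument are assumptions for illustration, not confirmed behavior.

```python
import shlex


def opencl_command(source, kernel="all", il_path=None, binary_path=None, devices=()):
    """Assemble an OpenCL-mode argument list using --kernel, --il and -b."""
    args = ["CodeXLAnalyzer", "-s", "cl"]  # executable name is an assumption
    for device in devices:
        args += ["-c", device]
    args += ["--kernel", kernel]           # "all" targets every kernel in the file
    if il_path:
        args += ["--il", il_path]          # textual AMD IL output path
    if binary_path:
        args += ["-b", binary_path]        # compiled binary output path
    args.append(source)                    # assumed: input file goes last
    return args


one_kernel = opencl_command(
    "BinarySearch_Kernels.cl",
    kernel="binarySearch",
    il_path="out/bs.il",                   # hypothetical output paths
    binary_path="out/bs.bin",
)
print(shlex.join(one_kernel))
```

Leaving `kernel` at its default produces `--kernel all`, which the table describes as targeting all kernels.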
Key DirectX-specific options
For DirectX shaders, CodeXL Analyzer CLI can extract DX ASM code, in addition to textual ISA and performance statistics. Here are the options that are specific to DirectX shaders:
Command line switch | Description | Comments |
-f | The name of the target entry point | |
-p | The shader profile (e.g. vs_5_0, ps_5_0, etc.) | |
--DumpMSIntermediate | Save the DX ASM code to the given output full path | |
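For DirectX, a comparable sketch combines the entry point, the shader profile and the DX ASM dump. The executable name `CodeXLAnalyzer`, the file `shader.hlsl`, the entry point `VSMain`, the output paths, and the input file position are hypothetical names chosen for this example.

```python
import shlex


def hlsl_command(source, entry_point, profile, isa_path=None, dxasm_path=None):
    """Assemble a DirectX-mode argument list using -f, -p and --DumpMSIntermediate."""
    args = ["CodeXLAnalyzer", "-s", "hlsl"]  # executable name is an assumption
    args += ["-f", entry_point]              # name of the target entry point
    args += ["-p", profile]                  # shader profile, e.g. vs_5_0 or ps_5_0
    if isa_path:
        args += ["--isa", isa_path]
    if dxasm_path:
        args += ["--DumpMSIntermediate", dxasm_path]  # DX ASM output path
    args.append(source)                      # assumed: input file goes last
    return args


vs_cmd = hlsl_command(
    "shader.hlsl",                           # hypothetical shader file
    entry_point="VSMain",                    # hypothetical entry point
    profile="vs_5_0",
    isa_path="out/vs.isa",
    dxasm_path="out/vs.dxasm",
)
print(shlex.join(vs_cmd))
```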
Key OpenGL-specific options
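For OpenGL, CodeXL Analyzer CLI’s “glsl” mode accepts only a single shader source file per invocation. Here is the option that is specific to OpenGL shaders: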
Command line switch | Description | Comments |
-p | Specifies the shader type: Vertex, TessEval, Geometry, Fragment and compute | Tessellation control shaders are not supported by the Analyzer’s “glsl” mode |
Note: CodeXL Analyzer CLI’s “glsl” mode, which accepts only a single shader, is deprecated and will be replaced in future versions with a new OpenGL mode, which will allow compiling and linking of whole OpenGL programs, and generation of more accurate ISA, performance statistics and per-stage binaries.
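Because the “glsl” mode takes one shader and a restricted set of shader types, a wrapper script might validate the type before building the command. The sketch below copies the type names exactly as the table lists them; whether the tool treats them case-sensitively is not stated here. The executable name `CodeXLAnalyzer`, the file `shader.frag`, the output paths, and the input file position are assumptions.

```python
import shlex

# Shader types as listed in the table; tessellation control is not supported.
GLSL_SHADER_TYPES = ("Vertex", "TessEval", "Geometry", "Fragment", "compute")


def glsl_command(source, shader_type, isa_path=None, stats_path=None):
    """Assemble a "glsl"-mode argument list for a single shader file."""
    if shader_type not in GLSL_SHADER_TYPES:
        raise ValueError(f"unsupported shader type for glsl mode: {shader_type}")
    args = ["CodeXLAnalyzer", "-s", "glsl", "-p", shader_type]  # name is an assumption
    if isa_path:
        args += ["--isa", isa_path]
    if stats_path:
        args += ["-a", stats_path]
    args.append(source)  # assumed: the single shader file goes last
    return args


frag_cmd = glsl_command("shader.frag", "Fragment", isa_path="out/frag.isa")
print(shlex.join(frag_cmd))
```

Rejecting an unlisted type up front gives a clearer error than letting the invocation fail later.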
Usage Examples
Let’s have a look at the following .cl file (BinarySearch_Kernels.cl, taken from the AMD APP SDK):
__kernel void
binarySearch(__global uint4 * outputArray,
             __const __global uint2 * sortedArray,
             const unsigned int findMe)
{
    unsigned int tid = get_global_id(0);
    uint2 element = sortedArray[tid];
    if((element.x > findMe) || (element.y < findMe))
    {
        return;
    }
    else
    {
        outputArray[0].x = tid;
        outputArray[0].w = 1;
    }
}

__kernel void
binarySearch_mulkeys(__global int *keys,
                     __global uint *_input,
                     const unsigned int numKeys,
                     __global int *_output)
{
    int gid = get_global_id(0);
    int lBound = gid * 256;
    int uBound = lBound + 255;

    for(int i = 0; i < numKeys; i++)
    {
        if(keys[i] >= _input[lBound] && keys[i] <= _input[uBound])
            _output[i] = lBound;
    }
}

__kernel void
binarySearch_mulkeysConcurrent(__global uint *keys,
                               __global uint *_input,
                               const unsigned int inputSize,
                               const unsigned int numSubdivisions,
                               __global int *_output)
{
    int lBound = (get_global_id(0) % numSubdivisions) * (inputSize / numSubdivisions);
    int uBound = lBound + inputSize / numSubdivisions;
    int myKey = keys[get_global_id(0) / numSubdivisions];
    int mid;

    while(uBound >= lBound)
    {
        mid = (lBound + uBound) / 2;
        if(_input[mid] == myKey)
        {
            _output[get_global_id(0) / numSubdivisions] = mid;
            return;
        }
        else if(_input[mid] > myKey)
            uBound = mid - 1;
        else
            lBound = mid + 1;
    }
}