Support both HIP and CUDA® with ease
The Orochi library loads HIP and CUDA® APIs dynamically, allowing you to switch between them at runtime. Orochi is named after a legendary Japanese dragon with eight heads and eight tails on a single body. In keeping with its namesake, Orochi enables a single library to use multiple backends at runtime.
Download the latest version - v2.0
This release adds the following features:
- Support for many more CUDA/HIP functions than Orochi 1. Coverage should be almost exhaustive.
- We will keep one branch per version of CUDA/HIP (example branch name: release/hip5.7_cuda12.2), so developers can switch branches depending on their environment. If you need a combination that doesn't exist, open an issue on the project's GitHub.
- Change compared to Orochi 1: you need to install the CUDA SDK corresponding to the branch you are using. For example, if you use the branch release/hip5.7_cuda12.2, install CUDA SDK 12.2. However, CUDA is still loaded dynamically at runtime; only the SDK's include files are used at compile time.
- New demo for textures.
- New demo for Direct3D® 12 interop.
- Some refactoring and improvement of OrochiUtils.
- Orochi.h can be included in the kernel files to have the oro* names.
- The binding and naming between HIP/CUDA have been improved and developed in a way that should be easier to maintain for future versions.
- Most of the Orochi/OrochiUtils API has not changed, so updating a project from Orochi 1.0 to 2.0 should be straightforward.
- We included an experimental high-performance radix sort, whose details we plan to publish in the future.
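The branch-to-SDK rule above can be checked mechanically. The following is a minimal sketch, assuming every release branch follows the pattern of the single example given (release/hip<version>_cuda<version>); the type BranchVersions and the function parseBranch are hypothetical helpers for illustration and are not part of Orochi.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type: the HIP and CUDA versions encoded in a branch name.
struct BranchVersions {
    bool ok;           // true only if both versions were found
    std::string hip;   // e.g. "5.7"
    std::string cuda;  // e.g. "12.2"
};

// Parse names shaped like "release/hip5.7_cuda12.2".
// ASSUMPTION: the pattern is inferred from one example branch name.
BranchVersions parseBranch(const std::string& branch) {
    const std::string hipTag = "release/hip";
    const std::string cudaTag = "_cuda";
    BranchVersions out{false, "", ""};

    // The name must start with the HIP tag.
    if (branch.compare(0, hipTag.size(), hipTag) != 0) return out;

    // The CUDA tag must appear somewhere after the HIP version.
    const std::string::size_type sep = branch.find(cudaTag, hipTag.size());
    if (sep == std::string::npos) return out;

    out.hip = branch.substr(hipTag.size(), sep - hipTag.size());
    out.cuda = branch.substr(sep + cudaTag.size());
    out.ok = !out.hip.empty() && !out.cuda.empty();
    return out;
}

// Print which CUDA SDK version a given branch asks for.
void printRequiredSdk(const std::string& branch) {
    const BranchVersions v = parseBranch(branch);
    if (v.ok)
        std::printf("%s: install CUDA SDK %s (HIP %s)\n",
                    branch.c_str(), v.cuda.c_str(), v.hip.c_str());
    else
        std::printf("%s: not a recognized release branch\n", branch.c_str());
}
```

A build script could call parseBranch on the checked-out branch name and compare the cuda field with the installed SDK version before compiling.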
Features
- No need to compile two separate implementations for HIP and CUDA.
- Compile and maintain a single binary that can run on both AMD and NVIDIA® GPUs.
- Dynamically load the corresponding HIP/CUDA shared libraries depending on your platform.
- Combines the functionality offered by both HIPEW and CUEW into a single library.
- No need to link to CUDA (for the driver APIs) nor HIP (for both driver and runtime APIs) at build-time.
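The general idea behind a single binary that serves two backends can be sketched with a table of function pointers chosen at runtime. This is an illustration of the technique only, not Orochi's implementation: every name below (DeviceApi, selectBackend, gpuMalloc, the mock* functions) is hypothetical, and the two "backends" are host-memory stand-ins so the sketch runs without any GPU driver installed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical names throughout; this is not Orochi's real code or API.
enum class Backend { Hip, Cuda };

// Table of entry points. A real loader would fill these pointers from the
// vendor shared library at runtime instead of linking at build time.
struct DeviceApi {
    const char* name;
    int (*alloc)(void** ptr, std::size_t bytes);
    int (*release)(void* ptr);
};

// Stand-ins for the two vendor libraries: both simply use the host heap.
static int mockHipAlloc(void** p, std::size_t n)  { *p = std::malloc(n); return *p ? 0 : 1; }
static int mockHipFree(void* p)                   { std::free(p); return 0; }
static int mockCudaAlloc(void** p, std::size_t n) { *p = std::malloc(n); return *p ? 0 : 2; }
static int mockCudaFree(void* p)                  { std::free(p); return 0; }

static const DeviceApi kHipApi{"HIP", mockHipAlloc, mockHipFree};
static const DeviceApi kCudaApi{"CUDA", mockCudaAlloc, mockCudaFree};

// The table currently in use; null until a backend is selected.
static const DeviceApi* g_api = nullptr;

// Choose the backend at runtime, for example from a command-line flag.
void selectBackend(Backend b) {
    g_api = (b == Backend::Hip) ? &kHipApi : &kCudaApi;
}

// Single entry points used by application code, whatever the backend.
int gpuMalloc(void** ptr, std::size_t bytes) {
    assert(g_api != nullptr && "selectBackend must be called first");
    return g_api->alloc(ptr, bytes);
}

int gpuFree(void* ptr) {
    assert(g_api != nullptr && "selectBackend must be called first");
    return g_api->release(ptr);
}

const char* activeBackendName() {
    return g_api ? g_api->name : "none";
}
```

Application code calls only gpuMalloc and gpuFree. Switching vendors means calling selectBackend with a different value; nothing is recompiled.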
Requirements
To run an application compiled with Orochi, install a driver of your choice that provides the .dll/.so files for the available GPU(s). Orochi automatically loads the corresponding shared library at runtime.
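To see which driver libraries a machine actually exposes, a program can probe for them with the platform's dynamic loader. The sketch below is a minimal POSIX-only illustration using dlopen; on Windows the equivalent calls would be LoadLibrary and GetProcAddress. The library file names are assumptions based on common Linux installations (libamdhip64.so for HIP, libcuda.so.1 or libcuda.so for the CUDA driver), and the helper names are hypothetical. This is not how Orochi itself is written. On older glibc versions the program may need to be linked with -ldl.

```cpp
#include <dlfcn.h>  // POSIX dynamic loading: dlopen, dlclose
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Try each candidate library name in order and return the first one that
// loads, or an empty string if none of them can be loaded.
std::string firstLoadable(const std::vector<std::string>& candidates) {
    for (const std::string& name : candidates) {
        void* handle = dlopen(name.c_str(), RTLD_NOW | RTLD_LOCAL);
        if (handle != nullptr) {
            dlclose(handle);  // this sketch only checks availability
            return name;
        }
    }
    return "";
}

// Hypothetical report type: which library name, if any, loaded per vendor.
struct DriverReport {
    std::string hip;
    std::string cuda;
};

// ASSUMPTION: these file names match typical Linux driver installations.
DriverReport probeDrivers() {
    return {
        firstLoadable({"libamdhip64.so"}),
        firstLoadable({"libcuda.so.1", "libcuda.so"}),
    };
}

void printReport(const DriverReport& r) {
    std::printf("HIP : %s\n", r.hip.empty() ? "not found" : r.hip.c_str());
    std::printf("CUDA: %s\n", r.cuda.empty() ? "not found" : r.cuda.c_str());
}
```

Calling printReport(probeDrivers()) on a machine without a GPU driver reports both libraries as not found, which is the situation where an application built this way would have no backend to run on.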
Version history
- Bitcode linking support
- Added OrochiUtils, a convenience wrapper
- A workaround for 22.7.1 AMD driver regression (missing RTC)
- Support more HIP and CUDA APIs
- Use only the CUDA driver APIs (except for RTC)
- Proper error handling
- Unit test
- Bug fixes
- Initial release
NVIDIA and CUDA are registered trademarks of NVIDIA Corporation.