Creating a PyTorch/TensorFlow Code Environment on AMD GPUs

Goal: The machine learning ecosystem is growing rapidly, and we aim to make porting to AMD GPUs simple with this series of machine learning blog posts.

Audience: Data scientists and machine learning practitioners, as well as software engineers who use PyTorch/TensorFlow on AMD GPUs. You can be new to machine learning, or experienced in using Nvidia GPUs.

Motivation: When starting a new machine learning project, you may notice that most existing code on GitHub is CUDA-based. If you have AMD GPUs and follow a repository's instructions for running the code, it often does not work. We provide steps, based on our experience, that can help you get a code environment working for your experiments and manage CUDA-based code repositories on AMD GPUs.

Differentiator from existing online resources:

  • This is from a machine learning practitioner’s perspective, to guide you away from rabbit holes due to habits and preferences, such as using Jupyter Notebooks and pip install.

  • This is not to teach you how to install PyTorch/TensorFlow on ROCm, because this step alone is often not enough to run machine learning code successfully.

  • This is not to teach you how to HIPify code, but instead, to let you know that sometimes you don’t even need that step.

  • As of this writing, this is the only documentation on the internet with end-to-end instructions on how to create a PyTorch/TensorFlow code environment on AMD GPUs.

The prerequisite is to have ROCm installed; follow the instructions here and here.

Install PyTorch or TensorFlow on ROCm

Option 1. PyTorch

We recommend following the instructions on the official ROCm PyTorch website.

Option 2. TensorFlow

We recommend following the instructions on the official ROCm TensorFlow website.

Note: We also strongly recommend using a Docker image with PyTorch or TensorFlow pre-installed. If you create a virtual environment or conda environment instead, certain ROCm dependencies may not be properly installed, and installing them by hand can be non-trivial.

Note: You don’t need the --gpus all flag to run Docker on AMD GPUs.
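To make the note above concrete, here is a minimal sketch of how a launch command for a ROCm container might be assembled. The device flags (/dev/kfd and /dev/dri) and the video group are what ROCm Docker instructions commonly use in place of --gpus all. The image tag rocm/pytorch:latest, the /workspace mount point, and the helper name rocm_docker_command are assumptions for illustration; check the official ROCm instructions for the image that matches your setup.

```python
import shlex


def rocm_docker_command(image="rocm/pytorch:latest", workdir=None):
    """Build a `docker run` argument list that exposes AMD GPUs.

    `image` is an assumed tag; substitute the one you actually use.
    """
    cmd = [
        "docker", "run", "-it",
        "--device=/dev/kfd",      # ROCm compute interface
        "--device=/dev/dri",      # GPU render nodes
        "--group-add", "video",   # group that owns the GPU devices
    ]
    if workdir is not None:
        # Optional: mount a host directory at /workspace (hypothetical path)
        cmd += ["-v", f"{workdir}:/workspace"]
    cmd.append(image)
    return cmd


# Print the command so it can be reviewed before running it in a shell.
print(shlex.join(rocm_docker_command()))
```

The sketch only prints the command rather than executing it, so you can inspect and adjust the flags before launching the container yourself.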

Git clone the source code you want to run

git clone --recursive https://github.com/project/repo.git

Install library requirements based on the GitHub repository

  • Skip the commands that create virtual environments or conda environments. They are usually in machine_install.sh or setup.sh files.

  • Go directly to the library list and remove torch and tensorflow since these are CUDA-based by default. The docker containers should already have those libraries installed for ROCm. You can usually find the library list in requirements.txt.

  • Run pip3 install -r requirements.txt, where requirements.txt lists one package per line (optionally with a version).
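The library-list cleanup described above can also be scripted. Below is a minimal sketch, assuming a plain requirements.txt with one requirement per line. The helper name strip_preinstalled and the PREINSTALLED set are our own illustrative choices, not part of pip. Only exact package names are matched, so related packages such as torchmetrics are left alone unless you add them to the set yourself.

```python
import re

# Packages the ROCm Docker image is assumed to provide already.
PREINSTALLED = {"torch", "tensorflow"}

# Leading package name of a requirement line, e.g. "torch" in "torch==2.0.1".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def strip_preinstalled(lines, preinstalled=PREINSTALLED):
    """Return the requirement lines whose package is not in `preinstalled`."""
    kept = []
    for line in lines:
        match = _NAME.match(line)
        name = match.group(1).lower() if match else None
        if name in preinstalled:
            continue  # skip it: pip would fetch the default (CUDA) build
        kept.append(line)  # comments, blank lines, and other packages stay
    return kept


example = ["numpy==1.24.0\n", "torch==2.0.1\n", "torchmetrics\n"]
print("".join(strip_preinstalled(example)), end="")
```

You could write the filtered lines to a separate file and pass that file to pip3 install -r, which keeps the repository's original requirements.txt untouched.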

Run your code

If you can run your code without problems, then you have successfully created a code environment on AMD GPUs!

If not, some of the additional packages in requirements.txt may depend on CUDA and need to be HIPified to run on AMD GPUs.
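Before looking into individual packages, it can help to confirm that the framework itself sees the GPU. The following is a minimal sketch for PyTorch only. It relies on the fact that ROCm builds of PyTorch reuse the torch.cuda API and report a HIP version in torch.version.hip. The function name describe_pytorch_backend is hypothetical.

```python
import importlib


def describe_pytorch_backend():
    """Summarize whether PyTorch is installed and can see a GPU."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"installed": False}

    info = {
        "installed": True,
        # A version string on ROCm builds, None on CUDA or CPU-only builds
        "hip_version": getattr(torch.version, "hip", None),
        # ROCm builds answer through the same torch.cuda interface
        "gpu_available": torch.cuda.is_available(),
    }
    if info["gpu_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


print(describe_pytorch_backend())
```

If hip_version is None or gpu_available is False inside the container, the problem is likely in the base environment rather than in the extra packages from requirements.txt.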

Obtain HIPified library source code

Option 1. Find existing HIPified library source code

You can simply search online or on GitHub for “library_name” + “ROCm”. The HIPified code will pop up if it exists.

Since this step is not trivial, here is an example:

If you are trying to run large language model related code, you may need the library bitsandbytes (see link).

Searching online for “bitsandbytes ROCm” you will find this fork which adds ROCm support with a HIP compilation target.

git clone https://github.com/agrocylo/bitsandbytes-rocm 
cd bitsandbytes-rocm 
export ROCM_HOME=/opt/rocm/ 
make hip -j 
python3 setup.py install 

Note: the installation location may include the version number, such as /opt/rocm-5.5.0.
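If you are unsure which path to use for ROCM_HOME, a small sketch like the one below can list the candidates. It assumes ROCm is installed under /opt with a directory name starting with rocm, as in the examples above. The helper name find_rocm_dirs is our own.

```python
from pathlib import Path


def find_rocm_dirs(root="/opt"):
    """Return sorted ROCm-looking directories under `root`."""
    base = Path(root)
    if not base.is_dir():
        return []
    # Matches names such as "rocm" and "rocm-5.5.0"
    return sorted(str(p) for p in base.glob("rocm*") if p.is_dir())


print(find_rocm_dirs())
```

An empty list means nothing matching was found under /opt, in which case ROCm may be installed elsewhere or not at all.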

Option 2. HIPify code if necessary

We recommend following the below tutorials for this option.

Commit changes to Docker Image

Once you finish modifying the new Docker container following the first step (“Install PyTorch or TensorFlow on ROCm”), exit out:

exit

Prompt the system to display a list of launched containers and find the docker container ID:

docker ps -a

Create a new image by committing the changes:

docker commit [CONTAINER_ID] [new_image_name]

In conclusion, this article introduces the key steps for creating a PyTorch/TensorFlow code environment on AMD GPUs. ROCm is a maturing ecosystem, and more GitHub repositories will eventually contain ROCm/HIPified ports. Future posts to AMD lab notes will discuss the specifics of porting from CUDA to HIP, as well as guides to running popular community models from HuggingFace.

Yao Fehlis

Yao Fehlis is a Member of Technical Staff (MTS) at Research and Advanced Development at AMD, and her focus involves AI for science, AI for manufacturing and large language models. In AI for science, she works internally and externally with academia to enhance traditional HPC applications with machine learning to accelerate scientific discoveries. In AI for manufacturing, she works internally with product teams to use machine learning to uncover optimal parameters and configurations in AMD designs. Prior to joining AMD, she worked as a data scientist at KUKA Robotics where she led predictive maintenance projects for industrial KUKA robots and worked on deep learning projects such as teaching robots to pick up objects. She holds a PhD in computational chemistry from Rice University.

Rajat Arora

Rajat Arora is a Senior Member of Technical Staff (SMTS) Software System Design Engineer in the Data Center GPU Software Solutions group at AMD, where he works on porting and optimizing high-performance computing applications for AMD GPUs. He obtained his PhD in Computational Mechanics from Carnegie Mellon University. His PhD research focused at the intersection of high performance scientific computing, numerical analysis, and material science. Recently, his research interests have expanded to include development of physics-informed machine learning models and tools to accelerate scientific discovery and engineering design.

Justin Chang

Justin Chang is a Senior Member of Technical Staff (SMTS) Software System Design Engineer in the Data Center GPU Software Solutions group and manages the AMD lab notes blog series. He received his PhD degree in Civil Engineering from the University of Houston, where he published several journal papers on structure-preserving high performance computational methods for transport in porous media. As a postdoc, he worked for both Rice University and the National Renewable Energy Laboratory to accelerate finite element simulation time of subsurface flow through dual porosity porous medium and lithium-ion batteries used in electric vehicles. He also worked for the Oil and Gas industry and focused on GPU porting and optimization of key FWI, RTM, and other seismic imaging workloads.

Austin Ellis

Austin Ellis is a Member of the Technical Staff (MTS) at AMD and on-site APU Application Architect at Lawrence Livermore National Laboratory helping to deploy AMDʼs next Exascale system, El Capitan. He specializes in high performance computing, machine learning, GPU computing, scalable algorithms, and data analytics. Previously, he was an HPC research scientist within the Analytics and AI Methods at Scale group within the Oak Ridge Leadership Computing Facility (OLCF). While at the OLCF, he was part of the HPL-MxP team which currently holds the record for the world's fastest computation at 9.951 ExaOps on the world's first Exascale supercomputer, Frontier, powered by AMD hardware.
