This repository contains a complete workflow for training, evaluating, and fine-tuning a ProtT5 model for protein sequence classification using PyTorch and Hugging Face Transformers.
To run this locally, use conda with the provided environment.yml file to install the dependencies (for example, conda env create -f environment.yml).
Use run.py, modifying it to point to your dataset inputs. The script runs training, validation, and prediction, and reports accuracy for each stage. Your fine-tuned model is saved and reused for prediction.
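As a rough illustration of preparing dataset inputs and scoring predictions, here is a minimal sketch. It assumes the usual ProtT5 input convention, where residues are separated by spaces and the rare amino acids U, Z, O, and B are mapped to X. The function names format_sequence and accuracy are hypothetical and are not taken from run.py, which may handle these steps differently.

```python
import re


def format_sequence(sequence: str) -> str:
    """Format a raw protein sequence the way ProtT5 tokenizers usually expect.

    Hypothetical helper, not part of run.py: uppercases the sequence,
    maps the rare residues U, Z, O, B to X, and separates residues with spaces.
    """
    cleaned = re.sub(r"[UZOB]", "X", sequence.strip().upper())
    return " ".join(cleaned)


def accuracy(predictions, labels) -> float:
    """Fraction of predictions equal to the true labels (hypothetical helper)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data for illustration only; replace with your own sequences and labels.
    raw_sequences = ["MKTAYIAK", "mseqUZnk"]
    print([format_sequence(s) for s in raw_sequences])
    print(accuracy([1, 0, 1, 1], [1, 1, 1, 1]))
```

The formatted strings can then be passed to a tokenizer, and the same accuracy helper can be applied separately to the training, validation, and prediction splits.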