This folder contains Kubeflow Pipelines examples for PyTorch, built with the PyTorch KFP Components SDK:
- Cifar10 example for Computer Vision
- BERT example for NLP
To run the examples with Google Vertex AI Pipelines, see:
https://github.com/amygdala/code-snippets/tree/master/ml/vertex_pipelines/pytorch/cifar
To install the KFP Python SDK, follow the instructions at:
https://github.com/kubeflow/pipelines/tree/master/sdk/python
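As a quick sanity check after installation (a minimal sketch; the version printed depends on what pip installed):

```python
# Confirm the KFP Python SDK is importable after installation
# (e.g. via: pip install kfp).
import kfp

print(kfp.__version__)
```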
Check the following prerequisites before running the examples. The pipelines can be run in two ways:
- From a Kubeflow Jupyter notebook, as described in Option 1
- By compiling the pipeline locally and uploading it to KFP, as described in Option 2
Option 1 covers building and running a pipeline from a Kubeflow Jupyter notebook: the pipeline is defined in a notebook and run directly from it.
Use the following notebooks to run the Cifar 10 and BERT examples:
Cifar 10 - Pipeline-Cifar10.ipynb
Bert - Pipeline-Bert.ipynb
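For orientation, both notebooks follow the usual KFP pattern of defining a pipeline and submitting a run directly from notebook code. A minimal sketch of that pattern, with a toy component standing in for the examples' real preprocess/train/deploy components:

```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func

# Toy component standing in for the real PyTorch KFP components
# used by the example notebooks.
def echo(text: str) -> str:
    print(text)
    return text

echo_op = create_component_from_func(echo)

@dsl.pipeline(name="notebook-run-sketch")
def sketch_pipeline(text: str = "CIFAR-10"):
    echo_op(text=text)

# Inside a Kubeflow notebook server, kfp.Client() can usually discover
# the in-cluster KFP endpoint without extra configuration.
client = kfp.Client()
client.create_run_from_pipeline_func(sketch_pipeline, arguments={"text": "CIFAR-10"})
```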
Refer: Steps to run the example pipelines from a Kubeflow Jupyter notebook
Option 2 covers building the pipeline on a local machine and running it by uploading the compiled pipeline file to the Kubeflow dashboard: a Python file defines the pipeline, that file is compiled, and the generated YAML is uploaded to KFP to create a run.
Use the following Python files to build the pipelines locally for the Cifar 10 and BERT examples:
Cifar 10 - cifar10/pipeline.py
Bert - bert/pipeline.py
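A minimal sketch of that compile-and-upload flow, assuming the KFP v1 SDK; the imported pipeline function name and the endpoint below are placeholders, not the files' actual identifiers:

```python
import kfp
from cifar10.pipeline import pytorch_cifar10  # hypothetical function name

# Compile the pipeline definition into a YAML package.
kfp.compiler.Compiler().compile(pytorch_cifar10, "pytorch_cifar10.yaml")

# Upload the compiled package; the same YAML can instead be uploaded
# manually through the Kubeflow Pipelines dashboard to create a run.
client = kfp.Client(host="http://<kfp-endpoint>")  # placeholder endpoint
client.upload_pipeline("pytorch_cifar10.yaml", pipeline_name="pytorch-cifar10")
```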
Refer: Steps to run the example pipelines by compiling and uploading to KFP
In this example, we train a PyTorch Lightning image-classification model on the CIFAR-10 dataset and interpret it with Captum Insights. The pipeline uses PyTorch KFP components to preprocess the data and to train, visualize, and deploy the model, with Captum Insights providing model interpretation.
Open the example notebook and run it to deploy the example in KFP:
Cifar 10 Captum Insights - Pipeline-Cifar10-Captum-Insights.ipynb
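The notebook chains the stages together as KFP components. A minimal sketch of that shape, with hypothetical component YAML paths and output names (the real notebook loads the actual PyTorch KFP component definitions):

```python
from kfp import components, dsl

# Hypothetical component definition files, for illustration only.
prep_op = components.load_component_from_file("yaml/preprocess_component.yaml")
train_op = components.load_component_from_file("yaml/train_component.yaml")

@dsl.pipeline(name="cifar10-captum-sketch")
def cifar10_captum_pipeline():
    prep_task = prep_op()
    # Feed the preprocessed output into training; the real pipeline adds
    # visualization (Captum Insights) and deployment stages after this.
    train_op(input_data=prep_task.outputs["output_data"])  # hypothetical names
```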
In this example, we train a PyTorch Lightning image-classification model on the CIFAR-10 dataset with hyperparameter optimization. A parent run is created during the training process, which records the baseline model along with its parameters, metrics, and summary; it is followed by a set of nested child runs, each recording the results of one trial. Once the experiments complete, the best parameters are written back to the parent run.
Open the example notebook and run it to deploy the example in KFP:
Cifar 10 HPO - Pipeline-Cifar10-hpo.ipynb
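The parent/child bookkeeping follows a common experiment-tracking pattern. The sketch below illustrates that shape with MLflow nested runs, purely as an illustration; it is not the sample's actual tracking code:

```python
import mlflow

trials = [{"lr": 1e-3}, {"lr": 1e-4}]  # toy search space for illustration

with mlflow.start_run(run_name="parent"):
    mlflow.log_param("baseline_lr", 1e-3)  # baseline parameters on the parent
    best = None
    for i, params in enumerate(trials):
        # Each trial's results land in a nested child run.
        with mlflow.start_run(run_name=f"trial-{i}", nested=True):
            mlflow.log_params(params)
            accuracy = 0.90 - 0.01 * i  # stand-in for a real training metric
            mlflow.log_metric("accuracy", accuracy)
            if best is None or accuracy > best[1]:
                best = (params, accuracy)
    # The best trial's parameters are written back to the parent run.
    mlflow.log_params({f"best_{k}": v for k, v in best[0].items()})
```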
In this example, we deploy a pipeline that launches distributed training of the BERT model using the PyTorch operator and serves the trained model with TorchServe via KFServing.
Open the example notebook and run it to deploy the example in KFP:
Bert Distributed Training - Pipeline-Bert-Dist.ipynb
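Once the model is serving, predictions go through KFServing's v1 data plane (`POST /v1/models/<name>:predict`). A minimal sketch with placeholder ingress, host header, model name, and payload:

```python
import requests

# Placeholder values; substitute the InferenceService's actual ingress,
# host header, model name, and a TorchServe-compatible request payload.
INGRESS = "http://<istio-ingress-gateway>"
SERVICE_HOST = "<inference-service-host>"
MODEL = "bert"

response = requests.post(
    f"{INGRESS}/v1/models/{MODEL}:predict",
    headers={"Host": SERVICE_HOST},
    json={"instances": [{"data": "Hello, world!"}]},
)
print(response.json())
```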
Refer: Running Pipelines in Kubeflow Jupyter Notebook
Before you start contributing to PyTorch KFP Samples, read the guidelines in How to Contribute to learn how to build and deploy PyTorch components with the pytorch-kfp-components SDK.