A high-throughput and memory-efficient inference and serving engine for LLMs
Updated Jul 1, 2026 - Python
QualityScaler - image/video AI upscaler app
Stable Diffusion web UI
Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.
AMD-SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
Open Source Continuous Inference Benchmark Research Platform — Kimi K2.7-Code, MiniMax M3, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3
OpenCL integration for Python, plus shiny features
RealScaler - image/video AI upscaler app (Real-ESRGAN)
Nabla: High-Performance Scientific Computing
FluidFrames | video AI frame-generation app
[DEPRECATED] Moved to ROCm/rocm-libraries repo
Web UI for GPU-accelerated ONNX pipelines like Stable Diffusion, even on Windows and AMD
AMD Strix Halo local LLM guide: setup for Ryzen AI MAX+ 395 / Radeon 8060S, Ollama, llama.cpp Vulkan/RADV, ROCm, raw evidence, direct 100 t/s 30B Qwen, 140 t/s CHADROCK MTP, 120B GGUF.