Quanta πŸš€

A lightweight PyTorch library for efficient model quantization and memory optimization. Perfect for running large language models on consumer hardware.

Key Features

  • 🎯 8-bit & 4-bit quantization primitives
  • πŸ’Ύ Memory-efficient optimizers
  • πŸš€ LLM.int8() inference support
  • πŸ”„ QLoRA-style fine-tuning
  • πŸ–₯️ Cross-platform hardware support

Quick Start

import torch
from Quanta.functional.quantization import quantize_8bit, dequantize_8bit

# Quantize a weight tensor to 8-bit
model_weights = torch.randn(1024, 1024)
q_tensor, scale, zero_point = quantize_8bit(model_weights)

# Recover a floating-point approximation of the original weights
weights_approx = dequantize_8bit(q_tensor, scale, zero_point)
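To see what a quantize/dequantize round-trip does under the hood, here is a minimal, self-contained sketch of symmetric absmax 8-bit quantization, a common scheme for schemes like LLM.int8(). The function names `quantize_8bit_absmax` and `dequantize_8bit_absmax` are illustrative only, not Quanta's actual API, and the library may use a different scheme (e.g. asymmetric with a zero point).

```python
import torch

def quantize_8bit_absmax(x: torch.Tensor):
    # Scale so the largest-magnitude value maps to 127 (the int8 extreme)
    scale = x.abs().max() / 127.0
    # Round to the nearest integer step and clamp into int8 range
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize_8bit_absmax(q: torch.Tensor, scale: torch.Tensor):
    # Rescale back to float; the rounding error per element is at most scale / 2
    return q.to(torch.float32) * scale
```

With this scheme a tensor stores one int8 per element plus a single float scale, roughly a 4x memory saving over float32 weights, at the cost of bounded rounding error.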

Status

🚧 Early Development - Currently implementing core quantization features.

License

MIT License

Inspired by bitsandbytes
