489 questions
0 votes · 0 answers · 32 views
How to convert a QAT (quantization-aware trained) TensorFlow graph into a TFLite model?
I am quantizing a neural network using QAT and want to convert it to TFLite.
Quantization nodes get added to the skeleton graph and we get a new graph.
I am able to load the trained QAT ...
0 votes · 0 answers · 18 views
Stable Diffusion v1.4 PTQ on both weight and activation
I'm currently working on quantizing the Stable Diffusion v1.4 checkpoint without relying on external libraries such as torch.quantization or other quantization toolkits. I’m exploring two scenarios:
...
0 votes · 0 answers · 58 views
Error about bitsandbytes from Hugging Face
For the code below, which is a standard snippet from the Hugging Face website, I'm getting the error:
ImportError: Using `bitsandbytes` 4-bit quantization requires the latest version
of bitsandbytes: `pip ...
0 votes · 0 answers · 18 views
Sub-4-bit quantized model on an NVIDIA GPU
I was trying to run deepseek-r1-distill-llama70b-bf16.gguf (131gb on disk) on two A6000 gpus (48gb vram each) with llama.cpp. It runs with partial gpu offload but the gpu utilization is at 9-10% and ...
0 votes · 0 answers · 66 views
How do I resolve "ImportError: Using bitsandbytes 4-bit quantization requires the latest version of bitsandbytes" despite having version 0.45.3 installed?
I am trying to use the bitsandbytes library for 4-bit quantization in my model loading function, but I keep encountering an ImportError. The error message says, "Using bitsandbytes 4-bit ...
0 votes · 0 answers · 27 views
ONNX Runtime quantization script for MatMulNBits: what is the type after conversion?
In the onnxruntime documentation, for quantization here:
https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#quantize-to-int4uint4
It sets accuracy_level=4 which means it's a ...
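For context on what MatMulNBits stores: the weights become 4-bit integer codes packed two per byte, with a scale per block. The following is a hypothetical pure-Python sketch of that layout, not the actual onnxruntime implementation; the block size and fixed zero-point of 8 are assumptions for illustration.

```python
# Hypothetical sketch of block-wise 4-bit weight quantization, similar in
# spirit to MatMulNBits; not the actual onnxruntime code.

def quantize_int4(weights, block_size=32):
    """Quantize floats to unsigned 4-bit codes with a per-block scale
    and a fixed zero-point of 8, packing two codes per byte."""
    packed, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        scale = (max(abs(w) for w in block) / 7) or 1.0  # avoid zero scale
        scales.append(scale)
        # Map each float to an integer code in [0, 15] around zero-point 8.
        codes = [max(0, min(15, round(w / scale) + 8)) for w in block]
        # Pack two 4-bit codes per byte, low nibble first.
        for i in range(0, len(codes), 2):
            lo = codes[i]
            hi = codes[i + 1] if i + 1 < len(codes) else 8  # pad = zero-point
            packed.append(lo | (hi << 4))
    return packed, scales

def dequantize_int4(packed, scales, block_size=32):
    """Unpack nibbles and rescale back to approximate floats."""
    out = []
    bytes_per_block = (block_size + 1) // 2
    for b, scale in enumerate(scales):
        for byte in packed[b * bytes_per_block:(b + 1) * bytes_per_block]:
            out.append(((byte & 0x0F) - 8) * scale)
            out.append(((byte >> 4) - 8) * scale)
    return out
```

The `accuracy_level` setting in the real script controls the compute precision of the matmul kernel, which is a separate question from this storage type.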
1 vote · 0 answers · 39 views
Issues with MP3-like Compression: Quantization and File Size
I’m trying to implement an MP3-like compression algorithm for audio and have followed the general steps, but I’m encountering a few issues with the quantization step. Here's the overall process I'm ...
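For reference on the quantization step: MP3 uses a non-uniform quantizer that compresses coefficient magnitudes with a 3/4 power law before rounding, so large coefficients get coarser steps. A minimal sketch of that idea (simplified; the real MP3 formula also subtracts a small rounding offset and uses per-band scalefactors):

```python
def quantize_coeff(x, step):
    """MP3-style non-uniform quantization: compress the magnitude with
    a 3/4 power law before rounding, so large coefficients are stored
    with relatively coarser precision. Simplified sketch only."""
    sign = -1 if x < 0 else 1
    return sign * int(round((abs(x) / step) ** 0.75))

def dequantize_coeff(q, step):
    """Inverse: expand the integer code with the 4/3 power and rescale."""
    sign = -1 if q < 0 else 1
    return sign * (abs(q) ** (4.0 / 3.0)) * step
```

On file size: quantization alone does not shrink the stream; the integer codes still need entropy coding (Huffman, in MP3) before the output gets small.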
0 votes · 0 answers · 17 views
Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch
Issue: I am encountering a kernel-death problem during inference when using a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies when ...
0 votes · 1 answer · 223 views
Trying to quantize YOLOv11 in tensorflow, is this topology normal?
I'm trying to quantize the YOLO v11 model in tensorflow and get this as a result:
The target should be int8. Is this normal behaviour? When running it with TFLite Micro on an ESP32 I quickly run out of ...
0 votes · 0 answers · 21 views
Unit testing PNG quantization by Sharp in Jest
I am fetching PNG files from a 3rd party API endpoint and quantizing them using Sharp before sending the response to the client. How can I unit test the quantization process?
My intention was to have ...
0 votes · 0 answers · 160 views
Structured Pruning of Yolov8
I have to run the object detector on Raspberry Pi 4b for real-time object detection. For this task, I have decided to use yolov8n. I have to run the detector in real-time, and since I don't have any ...
0 votes · 0 answers · 34 views
Reference code to convert microsoft/Multilingual-MiniLM-L12-H384 to int8 quantization
I am trying to convert microsoft/Multilingual-MiniLM-L12-H384 into int8 with a post-training quantization approach. Can someone point out references?
I have referred to TensorFlow blogs and converter APIs and ...
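As a starting point, the core of int8 post-training quantization is choosing a scale that maps the largest weight magnitude to 127. A dependency-free sketch of per-tensor symmetric quantization (real converters such as TFLite's also calibrate activation ranges on sample data, which this omits):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: pick a scale so the
    largest |w| maps to 127, then round each weight to an int8 code."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize_int8(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]
```

For the actual model conversion, the TFLite converter's post-training quantization path or ONNX Runtime's static quantization tooling are the usual references.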
0 votes · 0 answers · 22 views
While trying to implement QLoRA using the Trainer class, getting a casting error
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=['q_lin', 'v_lin'],
    lora_dropout=0.1,
    bias='all'
)

class distilbertMultiClass(nn.Module):
    def __init__(self, model, ...
1 vote · 0 answers · 69 views
Transforming a picture into a posterized image with matching grid overlay and symbols
First of all, I want to help my mom with her embroidery projects and secondly, I want to get better in Python. So I don't need an exact solution. But it would be great to be pointed in the right ...
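One pointer in that direction: posterizing is uniform color quantization applied per channel, and Pillow's `ImageOps.posterize` or `Image.quantize` can do the heavy lifting. A dependency-free sketch of the per-channel snapping plus a symbol assignment for the chart (the level count of 4 and the A, B, C symbols are arbitrary choices for illustration):

```python
def posterize_channel(value, levels=4):
    """Snap one 0-255 channel value to the nearest of `levels` evenly
    spaced values (uniform quantization of a single channel)."""
    step = 255 / (levels - 1)
    return int(round(round(value / step) * step))

def posterize_pixel(rgb, levels=4):
    """Posterize an (R, G, B) pixel; the distinct outputs form the
    reduced palette that embroidery symbols can be assigned to."""
    return tuple(posterize_channel(c, levels) for c in rgb)

def palette_symbols(pixels):
    """Map each distinct posterized color to a symbol for the grid."""
    symbols = {}
    for p in pixels:
        if p not in symbols:
            symbols[p] = chr(ord('A') + len(symbols))
    return symbols
```

From there, the grid overlay is just sampling the posterized image at stitch-sized intervals and printing the symbol for each cell's color.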
3 votes · 1 answer · 496 views
RuntimeError: "Unused kwargs" and "frozenset object has no attribute discard" with BitsAndBytes bf16 Quantized Model in Hugging Face Gradio App
I'm encountering a RuntimeError while running a BitsAndBytes bf16 quantized Gemma-2-2b model on Hugging Face Spaces with a Gradio UI. The error specifically mentions unused kwargs and an ...