0 votes
0 answers
32 views

How to convert a quantization-aware trained (QAT) TensorFlow graph into a TFLite model?

I am quantizing a neural network using QAT and I want to convert it to TFLite. Quantization nodes get added to the skeleton graph and we get a new graph. I am able to load the trained QAT ...
Prateek Sharma's user avatar
0 votes
0 answers
18 views

Stable Diffusion v1.4 PTQ on both weight and activation

I'm currently working on quantizing the Stable Diffusion v1.4 checkpoint without relying on external libraries such as torch.quantization or other quantization toolkits. I’m exploring two scenarios: ...
DOGLOPER's user avatar
0 votes
0 answers
58 views

Error about bitsandbytes from Huggingface

For the below code, which is a standard snippet from Huggingface website, I'm getting the error: ImportError: Using `bitsandbytes` 4-bit quantization requires the latest version of bitsandbytes: `pip ...
Aryan Bhusari's user avatar
0 votes
0 answers
18 views

Sub-4-bit quantized model on NVIDIA GPU

I was trying to run deepseek-r1-distill-llama70b-bf16.gguf (131 GB on disk) on two A6000 GPUs (48 GB VRAM each) with llama.cpp. It runs with partial GPU offload but the GPU utilization is at 9-10% and ...
afsara_ben's user avatar
0 votes
0 answers
66 views

How do I resolve ImportError Using bitsandbytes 4bit quantization requires the latest version of bitsandbytes despite having version 0.45.3 installed?

I am trying to use the bitsandbytes library for 4-bit quantization in my model loading function, but I keep encountering an ImportError. The error message says, "Using bitsandbytes 4-bit ...
from's user avatar
  • 1
0 votes
0 answers
27 views

Onnxruntime quantization script for MatMulNbits, what is the type after conversion?

In the onnxruntime documentation, for quantization here: https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#quantize-to-int4uint4 It sets accuracy_level=4 which means it's a ...
Owen Zhang's user avatar
1 vote
0 answers
39 views

Issues with MP3-like Compression: Quantization and File Size

I’m trying to implement an MP3-like compression algorithm for audio and have followed the general steps, but I’m encountering a few issues with the quantization step. Here's the overall process I'm ...
Muchacho's user avatar
0 votes
0 answers
17 views

Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

Issue: The kernel dies during inference when I use a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies when ...
Pavan Pandya's user avatar
0 votes
1 answer
223 views

Trying to quantize YOLOv11 in tensorflow, is this topology normal?

I'm trying to quantize the YOLO v11 model in TensorFlow and get this as a result: The target should be int8. Is this normal behaviour? When running it with TFLite Micro on an ESP32 I quickly run out of ...
gillo04's user avatar
  • 138
0 votes
0 answers
21 views

Unit testing PNG quantization by Sharp in Jest

I am fetching PNG files from a 3rd party API endpoint and quantizing them using Sharp before sending the response to the client. How can I unit test the quantization process? My intention was to have ...
Ben Sullivan's user avatar
0 votes
0 answers
160 views

Structured Pruning of Yolov8

I have to run the object detector on Raspberry Pi 4b for real-time object detection. For this task, I have decided to use yolov8n. I have to run the detector in real-time, and since I don't have any ...
Hiba Lashari's user avatar
0 votes
0 answers
34 views

Reference code to convert microsoft/Multilingual-MiniLM-L12-H384 to int8 quantization

I am trying to convert microsoft/Multilingual-MiniLM-L12-H384 to int8 with a post-training quantization approach. Can someone point me to references? I have referred to TensorFlow blogs and converter APIs and ...
Ravi Sankar Guntur's user avatar
0 votes
0 answers
22 views

While trying to implement QLoRA using the Trainer class, getting a casting error

lora_config=LoraConfig( r=8, lora_alpha=32, target_modules=['q_lin','v_lin'], lora_dropout=0.1, bias='all' ) class distilbertMultiClass(nn.Module): def __init__(self,model,...
Lijin Durairaj's user avatar
1 vote
0 answers
69 views

Transforming a picture into a posterized image with matching grid overlay and symbols

First of all, I want to help my mom with her embroidery projects and secondly, I want to get better at Python. So I don't need an exact solution. But it would be great to be pointed in the right ...
Ricked's user avatar
  • 11
3 votes
1 answer
496 views

RuntimeError: "Unused kwargs" and "frozenset object has no attribute discard" with BitsAndBytes bf16 Quantized Model in Hugging Face Gradio App

I'm encountering a RuntimeError while running a BitsAndBytes bf16 quantized Gemma-2-2b model on Hugging Face Spaces with a Gradio UI. The error specifically mentions unused kwargs and an ...
doniker99's user avatar
