1 vote
0 answers
8 views

How to re-use attention in huggingface

I have a long chunk of text that I need to process using a transformer. I would then like users to be able to ask different questions about it (all questions are independent of each other)...
Matt's user avatar
  • 35
0 votes
1 answer
48 views

Error while loading Deepseek using HuggingFace

I am using HuggingFace as a pipeline to run a DeepSeek text-classification model as follows: model_kwargs = {"trust_remote_code": True} embedding_model = HuggingFaceEmbeddings( model_kwargs=...
John's user avatar
  • 13
0 votes
0 answers
39 views

The issue of mask fragmentation during SAM2 tracking

I am currently working on object tracking. I use Moondream2 to identify objects in the scene, filter out duplicate bounding boxes, and then use SAM2 to track the objects. During the tracking process, ...
Limit 's user avatar
0 votes
0 answers
25 views

Unable to call Hugging face api from my local machine

I want to use a model via Hugging Face, but even with a valid token it's not working. Can someone please help? Test code: from huggingface_hub import InferenceClient token = "...
vicky113's user avatar
  • 351
1 vote
0 answers
83 views

Running DeepSeek-V3 inference without GPU (on CPU only)

I am trying to run DeepSeek-V3 model inference on a remote machine (over SSH). This machine does not have any GPU, but it has many CPU cores. First method: I try to run the model inference using the ...
The_Average_Engineer's user avatar
2 votes
1 answer
94 views

Load DeepSeek-V3 model from local repo

I want to run DeepSeek-V3 model inference using the Hugging Face Transformers library (>= v4.51.0). I read that you can do the following (download the model and run it) from ...
The_Average_Engineer's user avatar
0 votes
0 answers
15 views

Optimise FAISS vector search results

I am using FAISS vector search to search across about 6 million data items stored in different vectors, and then on top of those results I am using fuzzysearch to filter out the top results. The problem here is ...
Dilpreet Singh's user avatar
3 votes
2 answers
667 views

NameError: name 'init_empty_weights' is not defined while using hugging face models

I am trying to set up Hugging Face locally and I'm running into this issue: NameError: name 'init_empty_weights' is not defined. Here is the code I have tested my installation with: from transformers ...
cosm1c v1bes's user avatar
0 votes
0 answers
35 views

Workload evicted, storage limit exceeded (100G) Llama-4-Scout-17B-16E

I am trying to run meta-llama/Llama-4-Scout-17B-16E-Instruct on an A100 GPU. It gives me an error. I also have 1TB of persistent storage. The total storage required for the Llama-4-Scout-17B model is 217.21 ...
Bhargav's user avatar
  • 4,301
0 votes
1 answer
41 views

about llama-2-7b model loading from huggingface even with meta license access

I am trying to load this model, but it gives me the same error. How to fix this? from transformers import AutoTokenizer, AutoModelForCausalLM model_id = "meta-llama/Llama-2-7b-hf" ...
orchestration sam's user avatar
1 vote
1 answer
80 views

FastAPI + Transformers + 4-bit Mistral: .to() is not supported for bitsandbytes 4-bit models error

I'm deploying a FastAPI backend using HuggingFace Transformers with the mistralai/Mistral-7B-Instruct-v0.1 model, quantized to 4-bit using BitsAndBytesConfig. I’m running this inside an NVIDIA GPU ...
Dalmouda's user avatar
0 votes
0 answers
31 views

Token limit exceeded on Qwen2.5 VL 7B Instruct

I am making an inference with Qwen2.5 VL 7B like this, but when I try to encode the image with base64, it exceeds the token limit (since the base64 is quite long). from huggingface_hub import ...
Coco Zeng's user avatar
0 votes
0 answers
13 views

vector database giving error using OpenAIEmbeddings

I just started learning OpenAI and am trying to run the code below in Google Colab. from langchain_community.vectorstores import Chroma from langchain_openai import OpenAIEmbeddings db=Chroma....
J Learner's user avatar
1 vote
1 answer
107 views

Serving models using VLLM on Huggingface Spaces

Hi everyone, I'm trying to serve models on Hugging Face using Docker and vLLM, but I am encountering model config errors and unknown model errors. I'm using the latest version of vLLM and have tried ...
Marios Petrov's user avatar
0 votes
0 answers
42 views

LoRA Adapter Loading Issue with Llama 3.1 8B - Missing Keys Warning

I'm having trouble loading my LoRA adapters for inference after fine-tuning Llama 3.1 8B. When I try to load the adapter files in a new session, I get a warning about missing adapter keys: /usr/local/...
Mohanad Hafez's user avatar
