Newest 'huggingface' Questions

1 vote

0 answers

8 views

How to re-use attention in huggingface

I have a long chunk of text that I need to process using a transformer. I would then like to have users ask different questions about it (all questions are independent; they don't relate to each other)...

Matt

35

asked 6 hours ago

0 votes

1 answer

48 views

Error while loading Deepseek using HuggingFace

I am using a HuggingFace pipeline to run a DeepSeek text classification model as follows: model_kwargs = {"trust_remote_code": True} embedding_model = HuggingFaceEmbeddings( model_kwargs=...

John

13

asked yesterday

0 votes

0 answers

39 views

The issue of mask fragmentation during SAM2 tracking

I am currently working on object tracking. I use Moondream2 to identify objects in the scene, filter out duplicate bounding boxes, and then use SAM2 to track the objects. During the tracking process, ...

Limit

1

asked Apr 21 at 11:34

0 votes

0 answers

25 views

Unable to call Hugging Face API from my local machine

I want to use a model via Hugging Face, but even with a valid token it's not working. Can someone please help? Test code: from huggingface_hub import InferenceClient token = "...

vicky113

351

asked Apr 19 at 9:48

1 vote

0 answers

83 views

Running DeepSeek-V3 inference without GPU (on CPU only)

I am trying to run the DeepSeek-V3 model inference on a remote machine (SSH). This machine does not have any GPU, but has many CPU cores. First method: I try to run the model inference using the ...

The_Average_Engineer

409

asked Apr 14 at 19:35

2 votes

1 answer

94 views

Load DeepSeek-V3 model from local repo

I want to run the DeepSeek-V3 model inference using the Hugging Face Transformers library (>= v4.51.0). I read that you can do the following to do that (download the model and run it) from ...

The_Average_Engineer

409

asked Apr 11 at 18:26

0 votes

0 answers

15 views

Optimise FAISS vector search results

I am using FAISS vector search to search across about 6 million data items present in different vectors, and then on top of those results I am using fuzzysearch to filter out the top results. The problem here is ...

Dilpreet Singh

13

asked Apr 11 at 11:09

3 votes

2 answers

667 views

NameError: name 'init_empty_weights' is not defined while using hugging face models

I am trying to set up Hugging Face locally and I'm running into this issue: NameError: name 'init_empty_weights' is not defined. Here is the code I have tested my installation with: from transformers ...

cosm1c v1bes

105

asked Apr 7 at 11:02

0 votes

0 answers

35 views

Workload evicted, storage limit exceeded (100G) Llama-4-Scout-17B-16E

I am trying to run meta-llama/Llama-4-Scout-17B-16E-Instruct on an A100 GPU. It's giving me an error. Also, I have 1TB of persistent storage. The total storage required for the Llama-4-Scout-17B model is 217.21 ...

Bhargav

4,301

asked Apr 7 at 8:52

0 votes

1 answer

41 views

About llama-2-7b model loading from huggingface even with meta license access

I am trying to load this model, but it gives me the same error. How to fix this? from transformers import AutoTokenizer, AutoModelForCausalLM model_id = "meta-llama/Llama-2-7b-hf" ...

orchestration sam

1

asked Apr 7 at 5:51

1 vote

1 answer

80 views

FastAPI + Transformers + 4-bit Mistral: .to() is not supported for bitsandbytes 4-bit models error

I'm deploying a FastAPI backend using HuggingFace Transformers with the mistralai/Mistral-7B-Instruct-v0.1 model, quantized to 4-bit using BitsAndBytesConfig. I’m running this inside an NVIDIA GPU ...

Dalmouda

11

asked Apr 2 at 23:12

0 votes

0 answers

31 views

Token limit exceeded on Qwen2.5 VL 7B Instruct

I am making an inference with Qwen2.5 VL 7B like this, but when I try to encode the image with base64, it exceeds the token limit (since the base64 is quite long). from huggingface_hub import ...

Coco Zeng

1

asked Apr 2 at 20:06

0 votes

0 answers

13 views

vector database giving error using OpenAIEmbeddings

I just started learning OpenAI and am trying to run the code below in Google Colab. from langchain_community.vectorstores import Chroma from langchain_openai import OpenAIEmbeddings db=Chroma....

J Learner

39

asked Apr 2 at 6:31

1 vote

1 answer

107 views

Serving models using VLLM on Huggingface Spaces

Hi everyone, I'm trying to serve models on Hugging Face using Docker and vLLM, but I am encountering model config errors and unknown model errors. I'm using the latest version of vLLM and have tried ...

Marios Petrov

11

asked Mar 31 at 20:46

0 votes

0 answers

42 views

LoRA Adapter Loading Issue with Llama 3.1 8B - Missing Keys Warning

I'm having trouble loading my LoRA adapters for inference after fine-tuning Llama 3.1 8B. When I try to load the adapter files in a new session, I get a warning about missing adapter keys: /usr/local/...

Mohanad Hafez

1

asked Mar 31 at 8:16
