
All Questions

1 vote · 1 answer · 96 views

Why does my Llama 3.1 model act differently between AutoModelForCausalLM and LlamaForCausalLM?

I have one set of weights, one tokenizer, the same prompt, and identical generation parameters. Yet somehow, when I load the model using AutoModelForCausalLM, I get one output, and when I construct it ...
— han mo (23)

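A first debugging step for a discrepancy like this is to load the same checkpoint through both classes under greedy decoding, so sampling randomness is ruled out and any remaining difference must come from the construction path. The sketch below is a minimal, hypothetical harness assuming the Hugging Face `transformers` API; the model id and dtype are placeholders.

```python
def comparison_gen_kwargs(max_new_tokens: int = 64) -> dict:
    # Greedy decoding removes sampling randomness, so any remaining
    # difference must come from how the model object was built.
    return {"do_sample": False, "max_new_tokens": max_new_tokens}

def generate_with_both(model_id: str, prompt: str) -> list[str]:
    # Heavy imports live here so the helper above has no dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM

    tok = AutoTokenizer.from_pretrained(model_id)
    inputs = tok(prompt, return_tensors="pt")
    texts = []
    for cls in (AutoModelForCausalLM, LlamaForCausalLM):
        model = cls.from_pretrained(model_id, torch_dtype=torch.bfloat16)
        model.eval()
        out = model.generate(**inputs, **comparison_gen_kwargs())
        texts.append(tok.decode(out[0], skip_special_tokens=True))
    return texts  # identical strings mean the two classes agree
```

If the two strings still differ under greedy decoding, comparing `model.config` and `model.generation_config` between the two objects usually pinpoints the setting that diverged.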
0 votes · 0 answers · 53 views

How can a HuggingFaceEndpoint instance not need a quantization config or tokenizer?

My original goal was to make a base chain class so I could further instantiate a chain with an LLM of my choice (e.g. gpt-4o-mini or meta-llama/Meta-Llama-3-8B etc.). I've noticed that ...
— user29109772

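The short answer to why no quantization config or tokenizer is needed: a remote endpoint only exchanges text, so tokenization and quantization happen server-side. A rough sketch, assuming the `langchain-huggingface` integration package (the exact constructor arguments may vary by release):

```python
def endpoint_kwargs(repo_id: str, max_new_tokens: int = 256) -> dict:
    # Only text-level generation parameters travel to the server;
    # tokenization and any quantization are handled remotely.
    return {"repo_id": repo_id, "max_new_tokens": max_new_tokens}

def build_remote_llm(repo_id: str):
    # Assumes the langchain-huggingface package is installed and a
    # HF token is available in the environment.
    from langchain_huggingface import HuggingFaceEndpoint

    return HuggingFaceEndpoint(**endpoint_kwargs(repo_id))
```

By contrast, a locally loaded pipeline does need the tokenizer and (optionally) a quantization config, because those steps run on your machine.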
-1 votes · 1 answer · 279 views

Cannot install llama-index-embeddings-huggingface==0.1.3 because these package versions have conflicting dependencies

I am unable to install the Hugging Face embeddings package. Getting the following error: ERROR: Cannot install llama-index-embeddings-huggingface==0.1.3, llama-index-embeddings-huggingface==0.1.4 and llama-index-...
— Saurabh Verma

2 votes · 2 answers · 2k views

Cannot load a gated model from Hugging Face despite having access and logging in

I am training a Llama-3.1-8B-Instruct model for a specific task. I have requested access to the Hugging Face repository and been granted it, confirmed on the Hugging Face web dashboard. I tried calling ...
— majTheHero

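A frequent cause of this symptom is that the granted account's token never reaches the download call, so the request is effectively anonymous. Passing the token explicitly, rather than relying on a cached login, removes that ambiguity. A minimal sketch assuming `huggingface_hub` and `transformers`; the token value is of course a placeholder:

```python
def from_pretrained_kwargs(token: str) -> dict:
    # Passing the token explicitly avoids relying on a cached login,
    # a common cause of 401/403 errors on gated repos.
    return {"token": token}

def load_gated_model(model_id: str, token: str):
    # Assumes the token belongs to the account that was granted access.
    from huggingface_hub import login
    from transformers import AutoModelForCausalLM, AutoTokenizer

    login(token=token)
    tok = AutoTokenizer.from_pretrained(model_id, **from_pretrained_kwargs(token))
    model = AutoModelForCausalLM.from_pretrained(model_id, **from_pretrained_kwargs(token))
    return model, tok
```

If this still fails, it is worth checking that the token was created after access was granted and has at least read scope.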
0 votes · 1 answer · 564 views

huggingface model inference: ERROR: Flag 'minloglevel' was defined more than once (in files 'src/error.cc' and ..)

I'm trying to use Llama 3.1 70B from Hugging Face (the end goal is to quantize it and deploy it on Amazon SageMaker), but I'm facing: ERROR: Flag 'minloglevel' was defined more than once (in files 'src/...
— Luis Leal (3,534)

1 vote · 1 answer · 3k views

Finding config.json for Llama 3.1 8B

I installed the Llama 3.1 8B model through Meta's GitHub page, but I can't get their example code to work. I'm running the following code in the same directory as the Meta-Llama-3.1-8B folder: import ...
— MatthewScarpino

2 votes · 0 answers · 162 views

TRL SFTTrainer clarification on truncation

I am currently fine-tuning Llama models using SFTTrainer in Hugging Face. However, I have a question that I cannot answer from the documentation (at least, it is a bit ambiguous). My dataset ...
— iiiiiiiiiiiiiiiiiiii

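On the truncation question: in TRL, sequences longer than the configured maximum are cut down to it rather than dropped from the dataset. A sketch of where that knob lives, assuming a recent TRL release (the field has moved between the trainer and `SFTConfig` across versions, so treat the exact name as an assumption):

```python
def truncated_length(n_tokens: int, max_seq_length: int = 1024) -> int:
    # Sequences longer than max_seq_length are truncated to it, not discarded.
    return min(n_tokens, max_seq_length)

def build_trainer(model_id: str, dataset):
    # Assumes a TRL release where the limit is set on SFTConfig;
    # in other releases it may be a direct SFTTrainer kwarg.
    from trl import SFTConfig, SFTTrainer

    cfg = SFTConfig(output_dir="out", max_seq_length=1024)
    return SFTTrainer(model=model_id, args=cfg, train_dataset=dataset)
```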
0 votes · 0 answers · 80 views

meta-llama/Llama-2-13b-hf torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB. GPU

I am trying to load Llama-2-13b across multiple GPUs but it isn't loading. I have 3 GPUs with 24.169 GB each, yet the model still won't load. I have tried using cuda and device_map='auto'. This is my current code. When I try ...
— weifeng tripods

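For reference, the usual multi-GPU recipe is `device_map="auto"` combined with a half-precision dtype and an explicit per-device memory cap, so accelerate shards the layers instead of trying to place everything on one card. A sketch assuming `transformers` with accelerate installed; the 22 GiB headroom figure is an assumption for 24 GB cards:

```python
def max_memory_map(n_gpus: int, per_gpu: str = "22GiB") -> dict:
    # Leave headroom on each ~24 GB card for activations and CUDA overhead.
    return {i: per_gpu for i in range(n_gpus)}

def load_sharded(model_id: str = "meta-llama/Llama-2-13b-hf"):
    # device_map="auto" lets accelerate spread layers across the GPUs;
    # fp16 halves the weight footprint versus the fp32 default.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype=torch.float16,
        max_memory=max_memory_map(3),
    )
```

Loading in fp32 (the default without `torch_dtype`) needs roughly 52 GB for 13B parameters, which already explains an OOM on a single 24 GB card.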
2 votes · 1 answer · 2k views

Error loading Hugging Face model: SafeTensorsInfo.__init__() got an unexpected keyword argument 'sharded'

I have been using a Hugging Face Transformers quantized Llama2 model. Suddenly, code I was able to run earlier today is throwing an error when I try to load the model. This code is straight from the ...
— SamR (21k)

1 vote · 0 answers · 397 views

OSError: TheBloke/Llama-2-7B-Chat-GGML does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack

I want to build myself an AI bot. ("TheBloke/Llama-2-7B-Chat-GGML", model_type="llama", model_file="llama-2-7b-chat.ggmlv3.q8_0.bin") I chose this model. But I couldn't ...
— Deniz ÇELİK

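This OSError is expected: GGML is a llama.cpp checkpoint format, so the repo genuinely has no `pytorch_model.bin` and the `transformers` loader cannot read it. The call shape in the excerpt matches the `ctransformers` library, whose similarly named class does understand GGML files. A sketch under that assumption:

```python
def is_ggml_file(model_file: str) -> bool:
    # GGML checkpoints are a llama.cpp format; the transformers
    # library cannot load them.
    return ".ggml" in model_file

def load_ggml(repo: str = "TheBloke/Llama-2-7B-Chat-GGML",
              model_file: str = "llama-2-7b-chat.ggmlv3.q8_0.bin"):
    # Assumes the ctransformers package, whose AutoModelForCausalLM
    # reads GGML files (unlike the transformers class of the same name).
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo, model_type="llama", model_file=model_file
    )
```

Newer quantized releases generally ship GGUF rather than GGML, which llama.cpp-based loaders also handle.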
6 votes · 2 answers · 6k views

How to Merge Fine-tuned Adapter and Pretrained Model in Hugging Face Transformers and Push to Hub?

I have fine-tuned the Llama-2 model following the llama-recipes repository's tutorial. Currently, I have the pretrained model and fine-tuned adapter stored in two separate directories as follows: ...
— Aun Zaidi (315)

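The standard route here is `peft`: load the base model, attach the adapter, fold the LoRA deltas into the base weights with `merge_and_unload()`, then push the result. A sketch assuming `peft` and `transformers`; directory paths and the hub id are placeholders:

```python
def merged_repo_name(base: str, suffix: str = "-merged") -> str:
    # Hypothetical hub id for the merged weights, derived from the base id.
    return base.split("/")[-1] + suffix

def merge_and_push(base_id: str, adapter_dir: str, hub_id: str):
    # merge_and_unload folds the LoRA deltas into the base weights,
    # so the pushed model loads with plain transformers, no peft needed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()
    merged.push_to_hub(hub_id)
    # Push the tokenizer too, so the repo is usable on its own.
    AutoTokenizer.from_pretrained(base_id).push_to_hub(hub_id)
```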
0 votes · 1 answer · 972 views

Llama 70b on Hugging Face Inference API Endpoint short responses

I just deployed the Nous-Hermes-Llama2-70b model on 2x Nvidia A100 GPUs through the Hugging Face Inference Endpoints. When I tried the following code, the response generations were incomplete ...
— Jimmison Johnson

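Truncated generations from a TGI-backed endpoint are usually the default `max_new_tokens` (a small value) kicking in, not a model problem. Raising it in the request parameters is the fix. A sketch of the text-generation-inference request shape; the URL and token are placeholders:

```python
def tgi_payload(prompt: str, max_new_tokens: int = 512) -> dict:
    # TGI endpoints default to a small max_new_tokens, which
    # truncates long answers unless raised explicitly.
    return {"inputs": prompt,
            "parameters": {"max_new_tokens": max_new_tokens}}

def query_endpoint(url: str, token: str, prompt: str):
    # Plain stdlib HTTP call against a deployed TGI endpoint.
    import json
    from urllib.request import Request, urlopen

    req = Request(
        url,
        data=json.dumps(tgi_payload(prompt)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)
```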
6 votes · 1 answer · 13k views

ValueError: Tokenizer class LlamaTokenizer does not exist or is not currently imported

I am trying to run the code from this Hugging Face blog. At first I had no access to the model, so I got this error: OSError: meta-llama/Llama-2-7b-chat-hf is not a local folder. That is now solved, and I created ...
— Quinten (41.9k)

4 votes · 2 answers · 21k views

OSError: meta-llama/Llama-2-7b-chat-hf is not a local folder

I'm trying to reproduce the code from this Hugging Face blog. First I installed transformers and created a token to log in to the Hugging Face Hub: pip install transformers huggingface-cli login ...
— Quinten (41.9k)

1 vote · 1 answer · 1k views

Transformers - LLAMA2 13B - Key Error / Attribute Error

I'm trying to load and run the LLAMA2 13B model on my local machine, but I'm not able to test any prompts due to a Key Error / Attribute Error (see image attached). My machine has the following ...
— DJM (75)
