
All Questions

1 vote · 1 answer · 96 views

Why does my Llama 3.1 model act differently between AutoModelForCausalLM and LlamaForCausalLM?

I have one set of weights, one tokenizer, the same prompt, and identical generation parameters. Yet somehow, when I load the model using AutoModelForCausalLM, I get one output, and when I construct it ...
— han mo (23)

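A first debugging step for a discrepancy like this is to load the same checkpoint through both classes under greedy decoding, so sampling randomness is ruled out and any remaining difference must come from the construction path. The sketch below is a minimal, hypothetical harness assuming the Hugging Face `transformers` API; the model id and dtype are placeholders.

```python
def comparison_gen_kwargs(max_new_tokens: int = 64) -> dict:
    # Greedy decoding removes sampling randomness, so any remaining
    # difference must come from how the model object was built.
    return {"do_sample": False, "max_new_tokens": max_new_tokens}

def generate_with_both(model_id: str, prompt: str) -> list[str]:
    # Heavy imports live here so the helper above has no dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM

    tok = AutoTokenizer.from_pretrained(model_id)
    inputs = tok(prompt, return_tensors="pt")
    texts = []
    for cls in (AutoModelForCausalLM, LlamaForCausalLM):
        model = cls.from_pretrained(model_id, torch_dtype=torch.bfloat16)
        model.eval()
        out = model.generate(**inputs, **comparison_gen_kwargs())
        texts.append(tok.decode(out[0], skip_special_tokens=True))
    return texts  # identical strings mean the two classes agree
```

If the two strings still differ under greedy decoding, comparing `model.config` and `model.generation_config` between the two objects usually pinpoints the setting that diverged.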
0 votes · 0 answers · 53 views

How can a HuggingFaceEndpoint instance not need a quantization config or tokenizer?

My original goal was to make a base chain class so I could further instantiate a chain with an LLM of my choice (e.g. gpt-4o-mini or meta-llama/Meta-Llama-3-8B etc.). I've noticed that ...
— user29109772

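The short answer to why no quantization config or tokenizer is needed: a remote endpoint only exchanges text, so tokenization and quantization happen server-side. A rough sketch, assuming the `langchain-huggingface` integration package (the exact constructor arguments may vary by release):

```python
def endpoint_kwargs(repo_id: str, max_new_tokens: int = 256) -> dict:
    # Only text-level generation parameters travel to the server;
    # tokenization and any quantization are handled remotely.
    return {"repo_id": repo_id, "max_new_tokens": max_new_tokens}

def build_remote_llm(repo_id: str):
    # Assumes the langchain-huggingface package is installed and a
    # HF token is available in the environment.
    from langchain_huggingface import HuggingFaceEndpoint

    return HuggingFaceEndpoint(**endpoint_kwargs(repo_id))
```

By contrast, a locally loaded pipeline does need the tokenizer and (optionally) a quantization config, because those steps run on your machine.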
-1 votes · 1 answer · 279 views

Cannot install llama-index-embeddings-huggingface==0.1.3 because these package versions have conflicting dependencies

I am unable to install the Hugging Face embeddings package. Getting the following error: ERROR: Cannot install llama-index-embeddings-huggingface==0.1.3, llama-index-embeddings-huggingface==0.1.4 and llama-index-...
— Saurabh Verma

2 votes · 2 answers · 2k views

Cannot load a gated model from Hugging Face despite having access and logging in

I am training a Llama-3.1-8B-Instruct model for a specific task. I have requested access to the Hugging Face repository and been granted it, confirmed on the Hugging Face web dashboard. I tried calling ...
— majTheHero

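A frequent cause of this symptom is that the granted account's token never reaches the download call, so the request is effectively anonymous. Passing the token explicitly, rather than relying on a cached login, removes that ambiguity. A minimal sketch assuming `huggingface_hub` and `transformers`; the token value is of course a placeholder:

```python
def from_pretrained_kwargs(token: str) -> dict:
    # Passing the token explicitly avoids relying on a cached login,
    # a common cause of 401/403 errors on gated repos.
    return {"token": token}

def load_gated_model(model_id: str, token: str):
    # Assumes the token belongs to the account that was granted access.
    from huggingface_hub import login
    from transformers import AutoModelForCausalLM, AutoTokenizer

    login(token=token)
    tok = AutoTokenizer.from_pretrained(model_id, **from_pretrained_kwargs(token))
    model = AutoModelForCausalLM.from_pretrained(model_id, **from_pretrained_kwargs(token))
    return model, tok
```

If this still fails, it is worth checking that the token was created after access was granted and has at least read scope.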
0 votes · 1 answer · 564 views

huggingface model inference: ERROR: Flag 'minloglevel' was defined more than once (in files 'src/error.cc' and ..)

I'm trying to use Llama 3.1 70B from Hugging Face (the end goal is to quantize it and deploy it on Amazon SageMaker), but I'm facing: ERROR: Flag 'minloglevel' was defined more than once (in files 'src/...
— Luis Leal (3,534)

1 vote · 1 answer · 3k views

Finding config.json for Llama 3.1 8B

I installed the Llama 3.1 8B model through Meta's GitHub page, but I can't get their example code to work. I'm running the following code in the same directory as the Meta-Llama-3.1-8B folder: import ...
— MatthewScarpino

2 votes · 0 answers · 162 views

TRL SFTTrainer clarification on truncation

I am currently fine-tuning Llama models using SFTTrainer in Hugging Face. However, I have a question that I cannot answer from the documentation (at least, it is a bit ambiguous). My dataset ...
— iiiiiiiiiiiiiiiiiiii

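On the truncation question: in TRL, sequences longer than the configured maximum are cut down to it rather than dropped from the dataset. A sketch of where that knob lives, assuming a recent TRL release (the field has moved between the trainer and `SFTConfig` across versions, so treat the exact name as an assumption):

```python
def truncated_length(n_tokens: int, max_seq_length: int = 1024) -> int:
    # Sequences longer than max_seq_length are truncated to it, not discarded.
    return min(n_tokens, max_seq_length)

def build_trainer(model_id: str, dataset):
    # Assumes a TRL release where the limit is set on SFTConfig;
    # in other releases it may be a direct SFTTrainer kwarg.
    from trl import SFTConfig, SFTTrainer

    cfg = SFTConfig(output_dir="out", max_seq_length=1024)
    return SFTTrainer(model=model_id, args=cfg, train_dataset=dataset)
```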
0 votes · 0 answers · 80 views

meta-llama/Llama-2-13b-hf torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB. GPU

I am trying to load Llama-2-13b across multiple GPUs but it isn't loading. I have 3 GPUs with 24.169 GB each, yet the model still won't load. I have tried using cuda and device_map='auto'. This is my current code. When I try ...
— weifeng tripods

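For reference, the usual multi-GPU recipe is `device_map="auto"` combined with a half-precision dtype and an explicit per-device memory cap, so accelerate shards the layers instead of trying to place everything on one card. A sketch assuming `transformers` with accelerate installed; the 22 GiB headroom figure is an assumption for 24 GB cards:

```python
def max_memory_map(n_gpus: int, per_gpu: str = "22GiB") -> dict:
    # Leave headroom on each ~24 GB card for activations and CUDA overhead.
    return {i: per_gpu for i in range(n_gpus)}

def load_sharded(model_id: str = "meta-llama/Llama-2-13b-hf"):
    # device_map="auto" lets accelerate spread layers across the GPUs;
    # fp16 halves the weight footprint versus the fp32 default.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype=torch.float16,
        max_memory=max_memory_map(3),
    )
```

Loading in fp32 (the default without `torch_dtype`) needs roughly 52 GB for 13B parameters, which already explains an OOM on a single 24 GB card.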
2 votes · 1 answer · 2k views

Error loading Hugging Face model: SafeTensorsInfo.__init__() got an unexpected keyword argument 'sharded'

I have been using a Hugging Face Transformers quantized Llama2 model. Suddenly, code I was able to run earlier today is throwing an error when I try to load the model. This code is straight from the ...
— SamR (21k)

1 vote · 0 answers · 397 views

OSError: TheBloke/Llama-2-7B-Chat-GGML does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack

I want to build myself an AI bot. ("TheBloke/Llama-2-7B-Chat-GGML", model_type="llama", model_file="llama-2-7b-chat.ggmlv3.q8_0.bin") I chose this model. But I couldn't ...
— Deniz ÇELİK

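This OSError is expected: GGML is a llama.cpp checkpoint format, so the repo genuinely has no `pytorch_model.bin` and the `transformers` loader cannot read it. The call shape in the excerpt matches the `ctransformers` library, whose similarly named class does understand GGML files. A sketch under that assumption:

```python
def is_ggml_file(model_file: str) -> bool:
    # GGML checkpoints are a llama.cpp format; the transformers
    # library cannot load them.
    return ".ggml" in model_file

def load_ggml(repo: str = "TheBloke/Llama-2-7B-Chat-GGML",
              model_file: str = "llama-2-7b-chat.ggmlv3.q8_0.bin"):
    # Assumes the ctransformers package, whose AutoModelForCausalLM
    # reads GGML files (unlike the transformers class of the same name).
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo, model_type="llama", model_file=model_file
    )
```

Newer quantized releases generally ship GGUF rather than GGML, which llama.cpp-based loaders also handle.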
6 votes · 2 answers · 6k views

How to Merge Fine-tuned Adapter and Pretrained Model in Hugging Face Transformers and Push to Hub?

I have fine-tuned the Llama-2 model following the llama-recipes repository's tutorial. Currently, I have the pretrained model and fine-tuned adapter stored in two separate directories as follows: ...
— Aun Zaidi (315)

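The standard route here is `peft`: load the base model, attach the adapter, fold the LoRA deltas into the base weights with `merge_and_unload()`, then push the result. A sketch assuming `peft` and `transformers`; directory paths and the hub id are placeholders:

```python
def merged_repo_name(base: str, suffix: str = "-merged") -> str:
    # Hypothetical hub id for the merged weights, derived from the base id.
    return base.split("/")[-1] + suffix

def merge_and_push(base_id: str, adapter_dir: str, hub_id: str):
    # merge_and_unload folds the LoRA deltas into the base weights,
    # so the pushed model loads with plain transformers, no peft needed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()
    merged.push_to_hub(hub_id)
    # Push the tokenizer too, so the repo is usable on its own.
    AutoTokenizer.from_pretrained(base_id).push_to_hub(hub_id)
```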
0 votes · 1 answer · 972 views

Llama 70b on Hugging Face Inference API Endpoint short responses

I just deployed the Nous-Hermes-Llama2-70b model on 2x Nvidia A100 GPUs through the Hugging Face Inference Endpoints. When I tried the following code, the response generations were incomplete ...
— Jimmison Johnson

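Truncated generations from a TGI-backed endpoint are usually the default `max_new_tokens` (a small value) kicking in, not a model problem. Raising it in the request parameters is the fix. A sketch of the text-generation-inference request shape; the URL and token are placeholders:

```python
def tgi_payload(prompt: str, max_new_tokens: int = 512) -> dict:
    # TGI endpoints default to a small max_new_tokens, which
    # truncates long answers unless raised explicitly.
    return {"inputs": prompt,
            "parameters": {"max_new_tokens": max_new_tokens}}

def query_endpoint(url: str, token: str, prompt: str):
    # Plain stdlib HTTP call against a deployed TGI endpoint.
    import json
    from urllib.request import Request, urlopen

    req = Request(
        url,
        data=json.dumps(tgi_payload(prompt)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)
```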
6 votes · 1 answer · 13k views

ValueError: Tokenizer class LlamaTokenizer does not exist or is not currently imported

I am trying to run the code from this Hugging Face blog. At first I had no access to the model, so I got this error: OSError: meta-llama/Llama-2-7b-chat-hf is not a local folder. That is now solved, and I created ...
— Quinten (41.9k)

4 votes · 2 answers · 21k views

OSError: meta-llama/Llama-2-7b-chat-hf is not a local folder

I'm trying to reproduce the code from this Hugging Face blog. First I installed transformers and created a token to log in to the Hugging Face Hub: pip install transformers huggingface-cli login ...
— Quinten (41.9k)

1 vote · 1 answer · 1k views

Transformers - LLAMA2 13B - Key Error / Attribute Error

I'm trying to load and run the LLAMA2 13B model on my local machine, but I'm not able to test any prompts due to a Key Error / Attribute Error (see image attached). My machine has the following ...
— DJM (75)
