
Questions tagged [large-language-models]

For questions about large language models (LLMs), i.e. language models that are "large" both in parameter count and in the amount of data they are trained on.

5 votes · 1 answer · 857 views

Large language models often perform impressively on benchmark tasks, coding, and natural language generation, but they can still fail on reasoning problems that seem simple for humans, especially when ...
asked by Avalon Brooks
1 vote · 0 answers · 12 views

I understand that transformer-based language models can generate highly fluent responses, and that retrieval-augmented generation (RAG) is often used to improve factual grounding by supplying relevant ...
asked by Avalon Brooks
1 vote · 1 answer · 25 views

TF-IDF cosine similarity is a powerful means of determining the similarity of two text documents. Can it be used as a loss function when training a machine translation model? For example, when measuring ...
asked by Geremia · 599
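For reference, the quantity that question proposes as a loss — TF-IDF cosine similarity — can be sketched in a few lines of plain Python (using scikit-learn's smoothed-IDF convention as one assumed variant). Note that as written it operates on discrete tokens, so it is not directly differentiable, which is the crux of using it as a training loss:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute TF-IDF vectors for a list of tokenized docs.

    Uses smoothed IDF, log((1 + n) / (1 + df)) + 1, one common convention.
    """
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    # Document frequency: in how many docs each term appears.
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency (count / doc length) weighted by IDF.
        vectors.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vectors

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

docs = ["the cat sat on the mat".split(),
        "the cat sat on the rug".split()]
u, v = tfidf_vectors(docs)
print(round(cosine(u, v), 3))
```

The two near-identical toy sentences score high but below 1.0, since the mat/rug dimensions contribute nothing to the dot product.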
0 votes · 1 answer · 40 views

If I'm running GLM-4.7-Flash-GGUF:Q6_K_XL from the PowerShell terminal like this ...
asked by ChristianOConnor
0 votes · 0 answers · 19 views

I downloaded the weights of a Llama model from Hugging Face. It works for simple tasks, but I don't know how to use it with LangGraph to create agents or how to bind tools. Here is how I downloaded it: ...
asked by jottbe · 176
0 votes · 0 answers · 21 views

I ran arc-easy evaluations for Llama 2 7B using the lm-evaluation-harness and got acc: 75.51, acc_norm: 73.86. In the Llama 2 technical report, arc-easy is reported as 75.2. But in some ...
asked by danlee · 1
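The acc/acc_norm gap in that excerpt comes down to how multiple-choice answers are scored. A minimal sketch of the two scoring rules, assuming byte-length normalization for acc_norm (the exact normalization a given harness version uses may differ):

```python
def pick_answer(choice_loglikelihoods, choice_texts, normalize=False):
    """Return the index of the winning answer choice.

    `acc` picks the choice with the highest raw total log-likelihood;
    `acc_norm` divides each log-likelihood by the choice's byte length,
    so long answers are not penalized just for containing more tokens.
    """
    scores = []
    for ll, text in zip(choice_loglikelihoods, choice_texts):
        # Byte-length normalization is one common choice for acc_norm.
        scores.append(ll / len(text.encode("utf-8")) if normalize else ll)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: a long choice with a slightly worse total log-likelihood
# can win once scores are length-normalized.
lls = [-12.0, -13.0]
texts = ["short ans.", "this is a much longer answer choice text"]
print(pick_answer(lls, texts, normalize=False))  # 0 (raw log-likelihood)
print(pick_answer(lls, texts, normalize=True))   # 1 (length-normalized)
```

Because the two rules can pick different choices on the same model outputs, acc and acc_norm routinely diverge by a point or two on the same benchmark run.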
-1 votes · 0 answers · 15 views

We are looking to understand which AI tool or tools would be best for scraping public job boards for specific job titles. We will also need the contact information from the jobs. Then we ...
asked by Centurion27
0 votes · 0 answers · 16 views

I have hundreds of chunked documents categorized under News and Pages categories. Docs in the News category have a publish date, while files in the Pages category don’t. I’m trying to apply ...
asked by Mayar Alzerki
1 vote · 1 answer · 62 views

Large language models often perform well on single-step tasks but sometimes fail when reasoning requires multiple logical steps. Is this limitation mainly due to training data patterns, model ...
asked by Avalon Brooks
0 votes · 1 answer · 26 views

I am a student who has recently started learning about artificial intelligence and reasoning systems, so I apologize in advance if this question is already well known in the literature. Many modern ...
asked by Sagar P.
1 vote · 2 answers · 86 views

Large language models such as GPT, LLaMA, and Claude are trained on massive datasets and can generate highly coherent text. However, they still frequently produce incorrect or fabricated information, ...
asked by Avalon Brooks
0 votes · 1 answer · 70 views

Did AI change the benchmark for measuring machine performance from following exact design goals to meeting design objectives, where the latter could be evaluated based on criteria such as relevance ...
asked by Mohamed El Nawawy
3 votes · 1 answer · 86 views

Large language models today have huge context windows, sometimes exceeding 100k tokens, yet they still fail on tasks that require consistent multi-step logical reasoning. I’m referring to tasks like: ...
asked by Avalon Brooks
3 votes · 1 answer · 83 views

In 2020, RealFormer introduced residual attention (figure panel c). But in 2024, the state-of-the-art DeepSeek transformer model still uses Post-LayerNorm (panel a). Residual attention is a simple addition operation, ...
asked by Daniel T · 133
3 votes · 1 answer · 110 views

Transformers are designed to capture long-range dependencies better than RNNs and LSTMs, but in practice, many models still fail to maintain consistent long-term reasoning. For example, when working ...
asked by Avalon Brooks
