All Questions
Tagged with retrieval-augmented-generation and python
46 questions
0 votes · 0 answers · 31 views
KeyFrame detection in python
I'm building a RAG system for a platform where the primary content consists of videos and slides. My approach involves extracting keyframes from videos using OpenCV
diff = cv2.absdiff(prev_image, ...
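The snippet above is cut off; a minimal sketch of frame-differencing keyframe extraction along these lines (the `diff_threshold` value is an illustrative assumption, not taken from the question) might look like:

```python
# Minimal keyframe-detection sketch: compare consecutive grayscale frames with
# cv2.absdiff and keep a frame whenever the mean pixel difference crosses a threshold.
import cv2

def extract_keyframes(video_path, diff_threshold=30.0):
    cap = cv2.VideoCapture(video_path)
    keyframes = []
    prev_gray = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Keep the first frame, and any frame that differs strongly from the previous one.
        if prev_gray is None or cv2.absdiff(prev_gray, gray).mean() > diff_threshold:
            keyframes.append(frame)
        prev_gray = gray
    cap.release()
    return keyframes
```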
0 votes · 0 answers · 415 views
BM25Retriever + ChromaDB Hybrid Search Optimization using LangChain
For those who have integrated the ChromaDB client with the LangChain framework, I am proposing the following approach to implement hybrid search (vector search + BM25Retriever):
from ...
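The proposed code is cut off; a hedged sketch of the usual LangChain pattern for this (weights, embedding model, and sample documents are illustrative assumptions, and import paths assume a recent langchain/langchain-community install) is:

```python
# Combine a Chroma dense retriever with a sparse BM25Retriever via EnsembleRetriever.
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever        # requires rank_bm25
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.documents import Document

docs = [
    Document(page_content="Chroma stores dense vector embeddings."),
    Document(page_content="BM25 is a sparse, keyword-based ranking function."),
]

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Chroma.from_documents(docs, embeddings)
vector_retriever = vector_store.as_retriever(search_kwargs={"k": 5})

bm25 = BM25Retriever.from_documents(docs)
bm25.k = 5

# Reciprocal-rank-style fusion of the two result lists; weights are a tuning choice.
hybrid = EnsembleRetriever(retrievers=[bm25, vector_retriever], weights=[0.4, 0.6])
results = hybrid.invoke("example query")
```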
0 votes · 0 answers · 20 views
Index type 0x73726576 ("vers") not recognized
I have a chatbot app that runs without any problem in my local environment.
I can run it both directly in PyCharm and in a local Docker container. I then deploy it to Koyeb using ...
0 votes · 0 answers · 183 views
RAG on Mac (M3) with langchain (RetrievalQA): code runs indefinitely
I'm trying to run a RAG system on my Mac M3 Pro (18 GB RAM) using LangChain and `Llama-3.2-3B-Instruct` in a Jupyter notebook (the vector store is Milvus).
When I invoke RetrievalQA....
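The question's code is cut off; a hedged sketch of the kind of chain being described, assuming (this is not stated in the question) that the model is served locally through Ollama and that a Milvus collection already holds the embeddings:

```python
# RetrievalQA over a Milvus vector store with a locally served Llama model.
from langchain.chains import RetrievalQA
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Milvus

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Milvus(
    embedding_function=embeddings,
    collection_name="rag_docs",                        # assumed collection name
    connection_args={"uri": "http://localhost:19530"}, # assumed local Milvus
)

llm = ChatOllama(model="llama3.2:3b")                  # assumed local model tag
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vector_store.as_retriever())
print(qa.invoke({"query": "What does the document say about X?"}))
```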
0 votes · 1 answer · 809 views
ModuleNotFoundError: No module named 'huggingface_hub.inference._types'
I am running a RAG pipeline with LlamaIndex and a quantized Llama-3-8B-Instruct. I just installed these libraries:
!pip install --upgrade huggingface_hub
!pip install --upgrade peft
!pip install llama-...
1 vote · 1 answer · 83 views
Creating an index in PyMilvus 2.5.x does not actually index any rows
I am trying to create an index on text embeddings for a RAG system with Milvus 2.5.x as the vector database in Python. I have already created the collections and populated them. My dataset size is quite ...
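One common cause of "no rows indexed" is that newly inserted rows sit in growing segments until they are flushed. A hedged sketch with the pymilvus ORM API (the collection and field names are assumptions):

```python
# Flush so inserted rows land in sealed segments, create the index, then check progress.
from pymilvus import connections, Collection, utility

connections.connect(uri="http://localhost:19530")      # assumed local Milvus
collection = Collection("rag_docs")                    # assumed collection name

collection.flush()                                     # seal growing segments first
collection.create_index(
    field_name="embedding",                            # assumed vector field name
    index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 1024}},
)
print(utility.index_building_progress("rag_docs"))     # shows total vs. indexed rows
collection.load()
```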
1 vote · 0 answers · 304 views
code walkthrough of chain syntax in langchain [duplicate]
I am following a RAG tutorial from: https://medium.com/@vndee.huynh/build-your-own-rag-and-run-it-locally-langchain-ollama-streamlit-181d42805895
In the tutorial there is a section that creates a ...
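The tutorial's exact code is not reproduced here; a hedged sketch of the LCEL pipe syntax that this kind of walkthrough builds (the model choice and the stand-in retriever are placeholders, not the tutorial's own code): each `|` wires the output of one Runnable into the input of the next, and the leading dict is coerced into a RunnableParallel that fills the prompt variables.

```python
# dict -> prompt -> model -> parser, composed left to right with the | operator.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.chat_models import ChatOllama

# Stand-in for a real vector-store retriever, so the sketch is self-contained.
retriever = RunnableLambda(lambda question: "LangChain composes Runnables with the | operator.")

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOllama(model="mistral")

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("What does LangChain use the | operator for?"))
```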
0 votes · 1 answer · 806 views
Getting Tokens Usage Metadata from Gemini LLM calls in LangChain RAG RunnableSequence
I would like to have the token utilisation of my RAG chain each time it is invoked.
No matter what I do, I can't seem to find the right way to output the total tokens from the Gemini model I'm using.
...
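In recent langchain-core / langchain-google-genai versions, the AIMessage returned by the chat model carries standardized `usage_metadata`; a hedged sketch (model name is an assumption, and a GOOGLE_API_KEY is expected in the environment):

```python
# Read per-call token usage directly off the returned AIMessage.
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
response = llm.invoke("Summarize retrieval-augmented generation in one sentence.")
print(response.usage_metadata)   # input_tokens / output_tokens / total_tokens
```

Inside a RunnableSequence the message is usually consumed by StrOutputParser, so one option is to read `usage_metadata` before the parser step, or to collect it in a callback handler instead.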
6 votes · 0 answers · 354 views
Best Approach to Evaluate a Graph RAG Pipeline Using Metrics?
I’ve developed a Graph RAG (Retrieval-Augmented Generation) pipeline that performs reasoning over a knowledge graph. Given a user query, the pipeline retrieves relevant nodes and relationships in the ...
0 votes · 0 answers · 183 views
Building a RAG agent for a large scale project
I have recently created a chatbot using LangChain, OpenAI (for embeddings + LLM) and Pinecone as my vector DB (serverless). I have not used any advanced RAG techniques so far, simply because I don't ...
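A minimal sketch of the stack described above (the index name, model choices, and the OPENAI_API_KEY / PINECONE_API_KEY environment variables are assumptions):

```python
# Retrieve from an existing serverless Pinecone index and answer with an OpenAI model.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = PineconeVectorStore.from_existing_index("my-rag-index", embedding=embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 4})

llm = ChatOpenAI(model="gpt-4o-mini")
docs = retriever.invoke("How do I reset my password?")
answer = llm.invoke("Answer from this context:\n" + "\n\n".join(d.page_content for d in docs))
print(answer.content)
```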
0 votes · 1 answer · 3k views
Stream output using VLLM
I am working on a RAG app where I use LLMs to analyze various documents. I'm looking to improve the UX by streaming responses in real time.
A snippet of my code:
params = SamplingParams(temperature=...
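The snippet above uses vLLM's offline `LLM.generate` API. A different, hedged route that many people use for real-time streaming is to serve the model with vLLM's OpenAI-compatible server (for example `vllm serve <model>`) and stream deltas with the `openai` client; the model name and port below are assumptions:

```python
# Stream chat-completion deltas from a vLLM OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",       # assumed: the model vLLM is serving
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
    temperature=0.2,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)            # emit tokens as they arrive
```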
0 votes · 1 answer · 557 views
DSPy: How to get the number of tokens available for the input fields?
This is a cross-post of Issue #1245 from the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule.
When running a DSPy module with a given ...
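The question's setup is cut off; as a rough workaround (not a DSPy API), one can count the tokens of the rendered prompt with tiktoken and subtract from the model's context window to estimate what is left for the input fields. The context-window size, reserved output budget, and encoding name below are assumptions:

```python
# Estimate the remaining input-token budget for a prompt about to be sent.
import tiktoken

CONTEXT_WINDOW = 128_000        # assumed: context size of the target model
MAX_OUTPUT_TOKENS = 1_024       # assumed: tokens reserved for the completion

def remaining_input_tokens(prompt_text: str, encoding_name: str = "cl100k_base") -> int:
    enc = tiktoken.get_encoding(encoding_name)
    used = len(enc.encode(prompt_text))
    return CONTEXT_WINDOW - MAX_OUTPUT_TOKENS - used

print(remaining_input_tokens("...instructions, few-shot demos, and filled input fields..."))
```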
1 vote · 0 answers · 248 views
Running entirely local RAG system in Colab over GDrive files?
I am trying to run an entirely local RAG system in Colab over my Google Drive files, without sending any tokens to an external language model API. I downloaded the model into a Drive folder (here just called path,...
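The question is cut off at the folder name; a hedged sketch of a fully local setup along these lines (the Drive path, model folder, document chunks, and embedding model are placeholders, and the embedding model is still downloaded once from the Hugging Face hub):

```python
# Mount Drive, load the downloaded model with transformers, and retrieve by cosine similarity.
from google.colab import drive
drive.mount("/content/drive")

import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from sentence_transformers import SentenceTransformer

model_path = "/content/drive/MyDrive/path"             # Drive folder holding the model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
generate = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chunks = ["chunk one of a Drive document", "chunk two of a Drive document"]
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

query = "What does the document say?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]
best = chunks[int(np.argmax(chunk_vecs @ query_vec))]  # cosine similarity via dot product

prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
print(generate(prompt)[0]["generated_text"])
```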
1 vote · 1 answer · 1k views
LlamaParse not able to parse documents inside directory
Whenever I try to use LlamaParse I get an error that states the file_input must be a file path string, file bytes, or buffer object.
parser = LlamaParse(result_type="markdown")
...
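This error typically appears when a directory path (or list of paths) is handed to LlamaParse directly; the pattern shown in the LlamaParse docs is to let SimpleDirectoryReader walk the directory and route individual files to the parser. A sketch (the directory path and the `.pdf` mapping are assumptions):

```python
# Let SimpleDirectoryReader hand individual files to LlamaParse via file_extractor.
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(result_type="markdown")
documents = SimpleDirectoryReader(
    input_dir="./data",                      # directory of files to parse
    file_extractor={".pdf": parser},         # route PDFs through LlamaParse
).load_data()
```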
0 votes · 2 answers · 474 views
BedrockEmbeddings - botocore.errorfactory.ModelTimeoutException
I am trying to get vector embeddings at scale for documents.
I am importing the package with from langchain_community.embeddings import BedrockEmbeddings.
Using embeddings = BedrockEmbeddings( ...
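A hedged sketch of one common mitigation for the timeout: hand BedrockEmbeddings a bedrock-runtime client configured with a longer read timeout and adaptive retries, and embed in smaller batches. The region, model_id, and the timeout/retry numbers are assumptions:

```python
# Give BedrockEmbeddings a boto3 client with relaxed timeout and retry settings.
import boto3
from botocore.config import Config
from langchain_community.embeddings import BedrockEmbeddings

bedrock_client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    config=Config(read_timeout=300, retries={"max_attempts": 10, "mode": "adaptive"}),
)
embeddings = BedrockEmbeddings(client=bedrock_client, model_id="amazon.titan-embed-text-v1")

texts = ["first document chunk", "second document chunk"]
vectors = embeddings.embed_documents(texts)    # embed in small batches when running at scale
```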