All Questions
Tagged with retrieval-augmented-generation and python
46 questions
0 votes · 0 answers · 31 views
KeyFrame detection in python
I'm building a RAG system for a platform where the primary content consists of videos and slides. My approach involves extracting keyframes from videos using OpenCV
diff = cv2.absdiff(prev_image, ...
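The snippet above is cut off; a minimal sketch of frame-differencing keyframe extraction along these lines (the `diff_threshold` value is an illustrative assumption, not taken from the question) might look like:

```python
# Minimal keyframe-detection sketch: compare consecutive grayscale frames with
# cv2.absdiff and keep a frame whenever the mean pixel difference crosses a threshold.
import cv2

def extract_keyframes(video_path, diff_threshold=30.0):
    cap = cv2.VideoCapture(video_path)
    keyframes = []
    prev_gray = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Keep the first frame, and any frame that differs strongly from the previous one.
        if prev_gray is None or cv2.absdiff(prev_gray, gray).mean() > diff_threshold:
            keyframes.append(frame)
        prev_gray = gray
    cap.release()
    return keyframes
```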
0 votes · 0 answers · 415 views
BM25Retriever + ChromaDB Hybrid Search Optimization using LangChain
For those who have integrated the ChromaDB client with the LangChain framework, I am proposing the following approach to implement hybrid search (vector search + BM25Retriever):
from ...
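The proposed code is cut off; a hedged sketch of the usual LangChain pattern for this (weights, embedding model, and sample documents are illustrative assumptions, and import paths assume a recent langchain/langchain-community install) is:

```python
# Combine a Chroma dense retriever with a sparse BM25Retriever via EnsembleRetriever.
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever        # requires rank_bm25
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.documents import Document

docs = [
    Document(page_content="Chroma stores dense vector embeddings."),
    Document(page_content="BM25 is a sparse, keyword-based ranking function."),
]

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Chroma.from_documents(docs, embeddings)
vector_retriever = vector_store.as_retriever(search_kwargs={"k": 5})

bm25 = BM25Retriever.from_documents(docs)
bm25.k = 5

# Reciprocal-rank-style fusion of the two result lists; weights are a tuning choice.
hybrid = EnsembleRetriever(retrievers=[bm25, vector_retriever], weights=[0.4, 0.6])
results = hybrid.invoke("example query")
```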
0 votes · 0 answers · 20 views
Index type 0x73726576 ("vers") not recognized
I have a chatbot app that runs without any problem in my local environment.
I can run it both directly in PyCharm and in a local Docker container. I then deploy it to Koyeb using ...
0 votes · 0 answers · 183 views
RAG on Mac (M3) with langchain (RetrievalQA): code runs indefinitely
I'm trying to run a RAG system on my Mac M3 Pro (18 GB RAM) using LangChain and `Llama-3.2-3B-Instruct` in a Jupyter notebook (the vector store is Milvus).
When I invoke RetrievalQA....
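The question's code is cut off; a hedged sketch of the kind of chain being described, assuming (this is not stated in the question) that the model is served locally through Ollama and that a Milvus collection already holds the embeddings:

```python
# RetrievalQA over a Milvus vector store with a locally served Llama model.
from langchain.chains import RetrievalQA
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Milvus

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Milvus(
    embedding_function=embeddings,
    collection_name="rag_docs",                        # assumed collection name
    connection_args={"uri": "http://localhost:19530"}, # assumed local Milvus
)

llm = ChatOllama(model="llama3.2:3b")                  # assumed local model tag
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vector_store.as_retriever())
print(qa.invoke({"query": "What does the document say about X?"}))
```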
0 votes · 1 answer · 809 views
ModuleNotFoundError: No module named 'huggingface_hub.inference._types'
I am running a RAG pipeline with LlamaIndex and a quantized Llama-3-8B-Instruct. I just installed these libraries:
!pip install --upgrade huggingface_hub
!pip install --upgrade peft
!pip install llama-...
1 vote · 1 answer · 83 views
Creating an index in PyMilvus 2.5.x does not actually index any rows
I am trying to create an index on text embeddings for a RAG system with Milvus 2.5.x as the vector database in Python. I have already created the collections and populated them. My dataset size is quite ...
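One common cause of "no rows indexed" is that newly inserted rows sit in growing segments until they are flushed. A hedged sketch with the pymilvus ORM API (the collection and field names are assumptions):

```python
# Flush so inserted rows land in sealed segments, create the index, then check progress.
from pymilvus import connections, Collection, utility

connections.connect(uri="http://localhost:19530")      # assumed local Milvus
collection = Collection("rag_docs")                    # assumed collection name

collection.flush()                                     # seal growing segments first
collection.create_index(
    field_name="embedding",                            # assumed vector field name
    index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 1024}},
)
print(utility.index_building_progress("rag_docs"))     # shows total vs. indexed rows
collection.load()
```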
1 vote · 0 answers · 304 views
code walkthrough of chain syntax in langchain [duplicate]
I am following a RAG tutorial from: https://medium.com/@vndee.huynh/build-your-own-rag-and-run-it-locally-langchain-ollama-streamlit-181d42805895
In the tutorial there is a section that creates a ...
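The tutorial's exact code is not reproduced here; a hedged sketch of the LCEL pipe syntax that this kind of walkthrough builds (the model choice and the stand-in retriever are placeholders, not the tutorial's own code): each `|` wires the output of one Runnable into the input of the next, and the leading dict is coerced into a RunnableParallel that fills the prompt variables.

```python
# dict -> prompt -> model -> parser, composed left to right with the | operator.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.chat_models import ChatOllama

# Stand-in for a real vector-store retriever, so the sketch is self-contained.
retriever = RunnableLambda(lambda question: "LangChain composes Runnables with the | operator.")

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOllama(model="mistral")

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("What does LangChain use the | operator for?"))
```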
0 votes · 1 answer · 806 views
Getting Tokens Usage Metadata from Gemini LLM calls in LangChain RAG RunnableSequence
I would like to have the token utilisation of my RAG chain each time it is invoked.
No matter what I do, I can't seem to find the right way to output the total tokens from the Gemini model I'm using.
...
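In recent langchain-core / langchain-google-genai versions, the AIMessage returned by the chat model carries standardized `usage_metadata`; a hedged sketch (model name is an assumption, and a GOOGLE_API_KEY is expected in the environment):

```python
# Read per-call token usage directly off the returned AIMessage.
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
response = llm.invoke("Summarize retrieval-augmented generation in one sentence.")
print(response.usage_metadata)   # input_tokens / output_tokens / total_tokens
```

Inside a RunnableSequence the message is usually consumed by StrOutputParser, so one option is to read `usage_metadata` before the parser step, or to collect it in a callback handler instead.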
6 votes · 0 answers · 354 views
Best Approach to Evaluate a Graph RAG Pipeline Using Metrics?
I’ve developed a Graph RAG (Retrieval-Augmented Generation) pipeline that performs reasoning over a knowledge graph. Given a user query, the pipeline retrieves relevant nodes and relationships in the ...
0 votes · 0 answers · 183 views
Building a RAG agent for a large scale project
I have recently created a chatbot using LangChain, OpenAI (for embeddings + LLM) and Pinecone as my vector DB (serverless). I have not used any advanced RAG techniques so far, simply because I don't ...
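A minimal sketch of the stack described above (the index name, model choices, and the OPENAI_API_KEY / PINECONE_API_KEY environment variables are assumptions):

```python
# Retrieve from an existing serverless Pinecone index and answer with an OpenAI model.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = PineconeVectorStore.from_existing_index("my-rag-index", embedding=embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 4})

llm = ChatOpenAI(model="gpt-4o-mini")
docs = retriever.invoke("How do I reset my password?")
answer = llm.invoke("Answer from this context:\n" + "\n\n".join(d.page_content for d in docs))
print(answer.content)
```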
0 votes · 1 answer · 3k views
Stream output using VLLM
I am working on a RAG app where I use LLMs to analyze various documents. I'm looking to improve the UX by streaming responses in real time.
A snippet of my code:
params = SamplingParams(temperature=...
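The snippet above uses vLLM's offline `LLM.generate` API. A different, hedged route that many people use for real-time streaming is to serve the model with vLLM's OpenAI-compatible server (for example `vllm serve <model>`) and stream deltas with the `openai` client; the model name and port below are assumptions:

```python
# Stream chat-completion deltas from a vLLM OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",       # assumed: the model vLLM is serving
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
    temperature=0.2,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)            # emit tokens as they arrive
```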
0 votes · 1 answer · 557 views
DSPy: How to get the number of tokens available for the input fields?
This is a cross-post of Issue #1245 from the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule.
When running a DSPy module with a given ...
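The question's setup is cut off; as a rough workaround (not a DSPy API), one can count the tokens of the rendered prompt with tiktoken and subtract from the model's context window to estimate what is left for the input fields. The context-window size, reserved output budget, and encoding name below are assumptions:

```python
# Estimate the remaining input-token budget for a prompt about to be sent.
import tiktoken

CONTEXT_WINDOW = 128_000        # assumed: context size of the target model
MAX_OUTPUT_TOKENS = 1_024       # assumed: tokens reserved for the completion

def remaining_input_tokens(prompt_text: str, encoding_name: str = "cl100k_base") -> int:
    enc = tiktoken.get_encoding(encoding_name)
    used = len(enc.encode(prompt_text))
    return CONTEXT_WINDOW - MAX_OUTPUT_TOKENS - used

print(remaining_input_tokens("...instructions, few-shot demos, and filled input fields..."))
```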
1 vote · 0 answers · 248 views
Running entirely local RAG system in Colab over GDrive files?
I am trying to run an entirely local RAG system in Colab over my Google Drive files, without sending any tokens to an external language model API. I downloaded the model into a Drive folder (here just called path,...
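The question is cut off at the folder name; a hedged sketch of a fully local setup along these lines (the Drive path, model folder, document chunks, and embedding model are placeholders, and the embedding model is still downloaded once from the Hugging Face hub):

```python
# Mount Drive, load the downloaded model with transformers, and retrieve by cosine similarity.
from google.colab import drive
drive.mount("/content/drive")

import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from sentence_transformers import SentenceTransformer

model_path = "/content/drive/MyDrive/path"             # Drive folder holding the model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
generate = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chunks = ["chunk one of a Drive document", "chunk two of a Drive document"]
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

query = "What does the document say?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]
best = chunks[int(np.argmax(chunk_vecs @ query_vec))]  # cosine similarity via dot product

prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
print(generate(prompt)[0]["generated_text"])
```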
1 vote · 1 answer · 1k views
LlamaParse not able to parse documents inside directory
Whenever I try to use LlamaParse I get an error that states the file_input must be a file path string, file bytes, or buffer object.
parser = LlamaParse(result_type="markdown")
...
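This error typically appears when a directory path (or list of paths) is handed to LlamaParse directly; the pattern shown in the LlamaParse docs is to let SimpleDirectoryReader walk the directory and route individual files to the parser. A sketch (the directory path and the `.pdf` mapping are assumptions):

```python
# Let SimpleDirectoryReader hand individual files to LlamaParse via file_extractor.
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(result_type="markdown")
documents = SimpleDirectoryReader(
    input_dir="./data",                      # directory of files to parse
    file_extractor={".pdf": parser},         # route PDFs through LlamaParse
).load_data()
```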
0 votes · 2 answers · 474 views
BedrockEmbeddings - botocore.errorfactory.ModelTimeoutException
I am trying to get vector embeddings at scale for documents.
I am importing the package with from langchain_community.embeddings import BedrockEmbeddings.
Using embeddings = BedrockEmbeddings( ...
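A hedged sketch of one common mitigation for the timeout: hand BedrockEmbeddings a bedrock-runtime client configured with a longer read timeout and adaptive retries, and embed in smaller batches. The region, model_id, and the timeout/retry numbers are assumptions:

```python
# Give BedrockEmbeddings a boto3 client with relaxed timeout and retry settings.
import boto3
from botocore.config import Config
from langchain_community.embeddings import BedrockEmbeddings

bedrock_client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    config=Config(read_timeout=300, retries={"max_attempts": 10, "mode": "adaptive"}),
)
embeddings = BedrockEmbeddings(client=bedrock_client, model_id="amazon.titan-embed-text-v1")

texts = ["first document chunk", "second document chunk"]
vectors = embeddings.embed_documents(texts)    # embed in small batches when running at scale
```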