Agentic Research Copilot

An AI-powered research assistant that transforms vague research objectives into structured, citation-backed reports. Built for local execution on resource-constrained hardware (MacBook Pro M2, 8GB RAM).

Note: This project is not currently deployed as a hosted service. If you'd like to try it out, you're welcome to clone the repository and run it locally. The setup is straightforward and documented below. I'd appreciate any feedback or suggestions.

The Problem: Why Traditional RAG Falls Short

Traditional Retrieval-Augmented Generation (RAG) systems follow a simplistic pattern: retrieve documents, stuff them into context, and generate a response. This approach suffers from critical limitations in research-intensive tasks:

Single-Pass Retrieval - Standard RAG retrieves once and hopes for the best. Complex research topics require iterative exploration—uncovering a relevant paper often reveals new subtopics worth investigating.
No Critical Evaluation - RAG blindly synthesizes retrieved content without assessing source quality, identifying contradictions, or recognizing gaps in coverage.
Flat Knowledge Representation - Documents are treated as isolated chunks. The semantic relationships between papers—citations, methodological similarities, conflicting findings—are lost.
Context Limitations - Stuffing all retrieved documents into a single prompt leads to context overflow and poor synthesis quality.
Zero Observability - Most RAG pipelines are black boxes. When output quality degrades, debugging is nearly impossible.

The Solution: Agentic Architecture

This project reimagines research synthesis as a multi-agent workflow rather than a single retrieval-generation step. The system operates like a research team:

| Agent Node | Role | Advantage Over RAG |
|---|---|---|
| Planner | Decomposes topic into sub-queries | Systematic coverage vs. single-shot retrieval |
| Retriever | Multi-source search (arXiv, Semantic Scholar, Wikipedia) | Broader, more authoritative sources |
| Reader | Extracts key claims and methods per paper | Structured understanding vs. raw context stuffing |
| Synthesizer | Combines findings with proper attribution | Coherent narrative with citations |
| Critic | Identifies gaps, contradictions, weak coverage | Self-correction loop: retrieves more if needed |
| Finalizer | Produces structured report with all sections | Consistent, actionable output format |

The Critical Differentiator: The Feedback Loop

The Critic node evaluates synthesis quality and can trigger additional retrieval rounds. This mimics how humans actually research: read papers, identify what's missing, search again. Traditional RAG cannot do this—it's strictly feed-forward.
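
The control flow of that loop can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: `ResearchState`, `run_with_critic`, the stub nodes, and the `MAX_ROUNDS` cap are all hypothetical names, and the reader and synthesizer steps are collapsed into the critic stub for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ROUNDS = 3  # assumed cap so the loop always terminates


@dataclass
class ResearchState:
    """Hypothetical shared state passed between nodes."""
    sub_queries: List[str]
    sources: List[str] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)
    rounds: int = 0


def run_with_critic(
    state: ResearchState,
    retrieve: Callable[[List[str]], List[str]],
    critique: Callable[[ResearchState], List[str]],
) -> ResearchState:
    """Retrieve, critique, and retrieve again while the critic reports gaps."""
    queries = list(state.sub_queries)
    while state.rounds < MAX_ROUNDS:
        state.rounds += 1
        state.sources.extend(retrieve(queries))
        state.gaps = critique(state)
        if not state.gaps:
            break  # critic is satisfied: hand off to the finalizer
        queries = state.gaps  # next round targets only what is missing
    return state


# Stub nodes for illustration only.
def fake_retrieve(queries: List[str]) -> List[str]:
    return [f"paper about {q}" for q in queries]


def fake_critique(state: ResearchState) -> List[str]:
    # Report one missing topic until some source covers it.
    covered = any("sparse attention" in s for s in state.sources)
    return [] if covered else ["sparse attention"]


result = run_with_critic(
    ResearchState(sub_queries=["self-attention"]), fake_retrieve, fake_critique
)
print(result.rounds, result.sources)
```

With these stubs the first round leaves a gap, so a second retrieval round runs on the gap alone and the loop then exits. A feed-forward pipeline would have stopped after the first round.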

Core Capabilities

| Capability | Description |
|---|---|
| Multi-Agent Supervisor | LangGraph workflow with 7 specialized nodes (intake plus the six agent nodes) |
| Multi-Source Retrieval | arXiv, Semantic Scholar, Wikipedia integration |
| Knowledge Graph | NetworkX-based paper relationship mapping |
| Semantic Cache | Reduces redundant LLM calls by approximately 40% |
| Local Observability | Full execution tracing without external dependencies |
| Streamlit UI | Real-time research progress visualization |
| Export Formats | Markdown, PDF, BibTeX, JSON |
| Trend Detection | Identifies emerging research directions from metadata |
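
To make the semantic cache idea concrete, here is a rough sketch of how such a cache could work: a prompt is embedded, compared against stored prompts by cosine similarity, and a stored answer is reused when the similarity clears a threshold. Everything here is an assumption for illustration. `SemanticCache`, `toy_embed`, and the 0.8 threshold are hypothetical, and the bag-of-words vector stands in for a real embedding model such as the sentence-transformers model the project uses.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return dict(Counter(text.lower().split()))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache keyed by prompt similarity rather than exact text."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed value; tune per embedding model
        self.entries: List[Tuple[Dict[str, float], str]] = []

    def get(self, prompt: str) -> Optional[str]:
        query = toy_embed(prompt)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, prompt: str, answer: str) -> None:
        self.entries.append((toy_embed(prompt), answer))


cache = SemanticCache()
cache.put("summarize attention mechanisms in transformers", "cached summary")
hit = cache.get("summarize the attention mechanisms in transformers")
miss = cache.get("explain diffusion models")
print(hit, miss)
```

The near-duplicate prompt is served from the cache, so no LLM call is needed, while the unrelated prompt falls through.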

Architecture

graph TB
    subgraph "API Layer"
        API[FastAPI<br/>REST + SSE]
    end
    
    subgraph "LangGraph Agent"
        INTAKE[intake] --> PLANNER[planner]
        PLANNER --> RETRIEVER[retriever]
        RETRIEVER --> READER[reader]
        READER --> SYNTHESIZER[synthesizer]
        SYNTHESIZER --> CRITIC[critic]
        CRITIC -->|gaps found| RETRIEVER
        CRITIC --> FINALIZER[finalizer]
    end
    
    subgraph "Tools"
        OLLAMA[Ollama<br/>llama3.2:3b]
        ARXIV[arXiv API]
        EMBED[sentence-transformers]
    end
    
    subgraph "Storage"
        CHROMA[(ChromaDB)]
        SQLITE[(SQLite)]
    end

Potential Impact

For Researchers and Academics

Literature Review Acceleration: Reduce weeks of literature survey to hours with automated gap analysis and contradiction detection.
Emerging Trend Identification: Surface research directions before they become mainstream, enabling early positioning.

For Industry R&D Teams

Competitive Intelligence: Rapidly understand state-of-the-art in any technical domain without manual paper hunting.
Decision Support: Get structured, citation-backed answers to technical feasibility questions.

For the AI/ML Community

Reproducible Research Workflows: Fully local, open-source stack eliminates vendor lock-in and ensures reproducibility.
Resource-Efficient AI: Demonstrates that sophisticated AI workflows can run on consumer hardware (8GB RAM), democratizing access.

Broader Implications

This project suggests that agentic architectures can outperform monolithic RAG on complex cognitive tasks such as research synthesis. The patterns used here (iterative refinement, self-critique, structured decomposition) also apply beyond research synthesis to other domains that require systematic analysis.

Quick Start

Prerequisites

Install Ollama
```
brew install ollama
```
Pull the model
```
ollama pull llama3.2:3b
```
Start Ollama server
```
ollama serve
```

Installation

# Clone and enter directory
git clone <repository-url>
cd agentic_research_copilot

# Install dependencies
pip install -r requirements.txt

# Copy environment file
cp .env.example .env

# Create data directories
make dirs

# Start the API server
make dev

Run the UI (optional)

make ui

Visit http://localhost:8501 for the Streamlit dashboard.

API Usage

Start Research

curl -X POST http://localhost:8000/v1/research \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "transformer attention mechanisms and their variants",
    "depth": "normal",
    "constraints": "Focus on papers from 2023-2024"
  }'

Response:

{
  "run_id": "abc123-...",
  "status": "pending",
  "message": "Research started..."
}
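
The same call can be made from Python using only the standard library. This is a sketch based on the curl example above; `build_research_request` and `start_research` are hypothetical helper names, and treating `constraints` as optional is an assumption.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "http://localhost:8000"  # taken from the curl example


def build_research_request(
    topic: str, depth: str = "normal", constraints: Optional[str] = None
) -> urllib.request.Request:
    """Build the POST /v1/research request without sending it."""
    payload = {"topic": topic, "depth": depth}
    if constraints:
        payload["constraints"] = constraints  # assumed to be optional
    return urllib.request.Request(
        f"{API_BASE}/v1/research",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_research(
    topic: str, depth: str = "normal", constraints: Optional[str] = None
) -> str:
    """Send the request and return the run_id from the JSON response."""
    request = build_research_request(topic, depth, constraints)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["run_id"]
```

With the API server running, `start_research("transformer attention mechanisms and their variants")` should return the `run_id` used by the endpoints below.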

Check Status

curl http://localhost:8000/v1/runs/{run_id}
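
For scripts that need to block until a run finishes, a polling loop along these lines may help. Note the assumptions: only the `"pending"` status appears in this README, so the terminal values `"completed"` and `"failed"` are guesses, and `wait_for_run` is a hypothetical helper. The status fetcher is passed in as a function so the loop can be tried without a running server.

```python
import time
from typing import Callable, Dict, Iterable

# Assumed terminal values; only "pending" appears in the documented response.
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_run(
    fetch_status: Callable[[], Dict],
    terminal: Iterable[str] = TERMINAL_STATUSES,
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the run reports a terminal status or max_polls is reached."""
    terminal_set = set(terminal)
    for _ in range(max_polls):
        run = fetch_status()
        if run.get("status") in terminal_set:
            return run
        sleep(interval)
    raise TimeoutError(f"run still not finished after {max_polls} polls")


# Demo with canned responses instead of real HTTP calls.
responses = iter(
    [{"status": "pending"}, {"status": "pending"}, {"status": "completed"}]
)
final = wait_for_run(lambda: next(responses), sleep=lambda _seconds: None)
print(final["status"])
```

In real use, `fetch_status` would issue the GET request shown above and decode the JSON body.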

Stream Progress (SSE)

curl -N http://localhost:8000/v1/runs/{run_id}/stream
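
The stream uses the Server-Sent Events wire format, which is simple enough to parse by hand. The sketch below follows the standard SSE framing (fields separated by a colon, events terminated by a blank line). The event names and payloads in the sample are made up, since the actual events this API emits are not documented here, and `parse_sse` is a hypothetical helper.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from an iterable of decoded SSE lines."""
    event: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line ends the current event
            if event:
                yield event
                event = {}
            continue
        if line.startswith(":"):  # comment line, often used as keep-alive
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if name == "data" and "data" in event:
            event["data"] += "\n" + value  # repeated data fields are joined
        else:
            event[name] = value
    # An event without a terminating blank line is dropped, per the SSE format.


# Hypothetical sample stream; real event names and payloads may differ.
sample = [
    ": keep-alive\n",
    "event: progress\n",
    'data: {"node": "planner"}\n',
    "\n",
    "data: first\n",
    "data: second\n",
    "\n",
]
events = list(parse_sse(sample))
print(events)
```

To consume the live endpoint, the lines could come from iterating over a `urllib.request.urlopen(...)` response and decoding each one as UTF-8.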

Export Results

# Quote the URL so the shell (zsh on macOS) does not treat "?" as a glob
# Markdown
curl "http://localhost:8000/v1/runs/{run_id}/export?format=markdown"

# BibTeX
curl "http://localhost:8000/v1/runs/{run_id}/export?format=bibtex"

# PDF
curl "http://localhost:8000/v1/runs/{run_id}/export?format=pdf" -o report.pdf
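
When scripting exports, building the URL in code avoids shell quoting issues altogether. A small sketch follows; `export_url` is a hypothetical helper, and the `json` format value is an assumption inferred from the export formats listed under Core Capabilities (only `markdown`, `bibtex`, and `pdf` appear in the examples).

```python
from urllib.parse import quote, urlencode

API_BASE = "http://localhost:8000"  # taken from the curl examples
# "json" is assumed; the other three values appear in the curl examples.
EXPORT_FORMATS = ("markdown", "bibtex", "pdf", "json")


def export_url(run_id: str, fmt: str) -> str:
    """Return the export URL for a run, rejecting unknown formats early."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    path = f"/v1/runs/{quote(run_id, safe='')}/export"
    query = urlencode({"format": fmt})
    return f"{API_BASE}{path}?{query}"


url = export_url("abc123", "bibtex")
print(url)
```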

Submit Feedback

curl -X POST http://localhost:8000/v1/feedback \
  -H "Content-Type: application/json" \
  -d '{"run_id": "...", "rating": 5, "comment": "Comprehensive coverage."}'

Report Structure

Every research run produces a structured report:

TL;DR - Three-bullet executive summary
Background - Context and foundational concepts
Key Papers/Sources - 5-10 papers with links and summaries
Disagreements/Contradictions - Conflicting findings across sources
Gaps and Open Questions - Identified unknowns in the literature
Research Trends - Emerging directions based on publication patterns
Proposed Experiments - Actionable next steps
References - Complete bibliography with links
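
If you post-process reports, a quick structural check against this outline can catch incomplete runs. The sketch below assumes a report has already been parsed into a dict mapping section titles to lists of items. That representation and the `validate_report` helper are hypothetical and do not reflect the project's actual export schema.

```python
from typing import Dict, List

# Section titles in the order listed above.
SECTIONS = [
    "TL;DR",
    "Background",
    "Key Papers/Sources",
    "Disagreements/Contradictions",
    "Gaps and Open Questions",
    "Research Trends",
    "Proposed Experiments",
    "References",
]


def validate_report(report: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the outline is satisfied."""
    problems = [f"missing section: {name}" for name in SECTIONS if name not in report]
    if len(report.get("TL;DR", [])) != 3:
        problems.append("TL;DR should have exactly three bullets")
    if not 5 <= len(report.get("Key Papers/Sources", [])) <= 10:
        problems.append("Key Papers/Sources should list 5-10 papers")
    return problems


# A draft with every section present but only one key paper.
draft = {name: ["placeholder"] for name in SECTIONS}
draft["TL;DR"] = ["point one", "point two", "point three"]
problems = validate_report(draft)
print(problems)
```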

Testing

# Run all tests
make test

# Run specific test file
pytest tests/test_tools_arxiv.py -v

# Run with coverage
pytest tests/ --cov=app --cov-report=html

Evaluation

# Run evaluation suite (20 golden prompts)
make eval

# Check results
cat eval/results.json

Configuration

| Variable | Default | Description |
|---|---|---|
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API URL |
| `OLLAMA_MODEL` | `llama3.2:3b` | LLM model |
| `OLLAMA_NUM_CTX` | `4096` | Context window |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Embedding model |
| `CHROMA_DIR` | `./chroma_data` | ChromaDB path |
| `SQLITE_PATH` | `./app_data/app.db` | SQLite path |
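
For reference, reading these variables with their documented defaults could look like the following. This is only a sketch: the `Settings` class and `load_settings` function are hypothetical, and the project's real configuration code under `app/core/` may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the table above."""
    ollama_base_url: str
    ollama_model: str
    ollama_num_ctx: int
    embedding_model: str
    chroma_dir: str
    sqlite_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, falling back to documented defaults."""
    source = os.environ if env is None else env
    return Settings(
        ollama_base_url=source.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        ollama_model=source.get("OLLAMA_MODEL", "llama3.2:3b"),
        ollama_num_ctx=int(source.get("OLLAMA_NUM_CTX", "4096")),
        embedding_model=source.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        chroma_dir=source.get("CHROMA_DIR", "./chroma_data"),
        sqlite_path=source.get("SQLITE_PATH", "./app_data/app.db"),
    )


defaults = load_settings({})
low_memory = load_settings({"OLLAMA_NUM_CTX": "2048"})
print(defaults.ollama_num_ctx, low_memory.ollama_num_ctx)
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.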

Memory Footprint (8GB RAM Target)

| Component | Usage |
|---|---|
| Ollama + llama3.2:3b | ~2.5 GB |
| sentence-transformers | ~200 MB |
| FastAPI + ChromaDB | ~300 MB |
| Total | ~3 GB |
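
The total is just the sum of the approximate figures above. The short calculation below reproduces it and shows the nominal headroom on an 8 GB machine. The headroom figure is a derived estimate: it ignores the operating system, other applications, and the optional Streamlit UI.

```python
# Approximate figures from the table above, in GB.
components_gb = {
    "Ollama + llama3.2:3b": 2.5,
    "sentence-transformers": 0.2,
    "FastAPI + ChromaDB": 0.3,
}
TARGET_RAM_GB = 8.0

total_gb = round(sum(components_gb.values()), 1)
headroom_gb = round(TARGET_RAM_GB - total_gb, 1)
print(f"total: ~{total_gb} GB, nominal headroom: ~{headroom_gb} GB")
```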

Reducing Memory Load

Use depth: "quick" - Limits to 5 sources
Set OLLAMA_NUM_CTX=2048 - Smaller context window
Close Streamlit UI - Saves approximately 150 MB
Stop Ollama when idle - ollama stop llama3.2:3b

Project Structure

app/
├── main.py                 # FastAPI application
├── api/                    # API endpoints
├── core/                   # Config, logging, SSE
├── db/                     # SQLite models and repositories
├── agent/                  # LangGraph workflow
├── tools/                  # Ollama, arXiv, embeddings
├── intelligence/           # Knowledge graph, cache
├── traces/                 # Local observability
└── export/                 # Export formats

ui/
└── app.py                  # Streamlit dashboard

eval/
├── golden.json             # 20 test prompts
└── run_eval.py             # Evaluation runner

tests/
├── test_tools_arxiv.py
├── test_tools_vectordb.py
├── test_knowledge_graph.py
├── test_semantic_cache.py
└── test_api_research.py

Known Limitations

PDF Parsing - Skipped for memory efficiency; uses abstracts only
Rate Limits - Semantic Scholar: 100 requests per 5 minutes
Context Window - 4096 tokens limits long document processing
CPU Inference - No GPU acceleration; slower but functional
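
One way to stay under the Semantic Scholar limit is a sliding-window limiter on the client side. The sketch below is an illustration of that technique, not the project's implementation: `SlidingWindowLimiter` is a hypothetical class, with defaults set to the 100 requests per 5 minutes quoted above. The clock is injectable so the behaviour can be checked without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(
        self,
        max_calls: int = 100,
        window_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self.calls: Deque[float] = deque()  # timestamps of recent calls

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self.calls.append(self.clock())


# Demo with a fake clock and a tiny limit of 2 calls per 300 seconds.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, clock=lambda: fake_now[0])
limiter.record()
limiter.record()
wait_at_start = limiter.wait_time()
fake_now[0] = 120.0
wait_after_two_minutes = limiter.wait_time()
fake_now[0] = 300.0
wait_after_window = limiter.wait_time()
print(wait_at_start, wait_after_two_minutes, wait_after_window)
```

A caller would sleep for `wait_time()` seconds before each request and call `record()` after sending it.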

Troubleshooting

Ollama not connecting:

ollama serve  # Start the server
ollama list   # Check available models

Out of memory:

# Use smaller context
export OLLAMA_NUM_CTX=2048

# Use quick depth
curl -X POST http://localhost:8000/v1/research \
  -H "Content-Type: application/json" \
  -d '{"topic": "...", "depth": "quick"}'

ChromaDB errors:

# Reset the database
rm -rf chroma_data
make dirs

Author

Built by parth-6-5-4

For questions, issues, or contributions, please open an issue on GitHub.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Built with LangGraph, Ollama, and FastAPI
