A Production-Ready Framework for LLM Applications, Autonomous Agents, and RAG Systems
Features • Quickstart • Installation • Architecture • Documentation
LLMCore is a comprehensive Python library providing a unified, asynchronous interface for building production-grade LLM applications. From simple chat interactions to sophisticated autonomous agents with sandboxed code execution, LLMCore offers a complete toolkit for modern AI development.
| Category | Features |
|---|---|
| Multi-Provider Support | OpenAI, Anthropic, Google Gemini, Ollama, DeepSeek, Mistral, Qwen, xAI |
| Chat Interface | Unified `chat()` API, streaming responses, tool/function calling |
| Session Management | Persistent conversations, SQLite/PostgreSQL backends, transient sessions |
| RAG System | ChromaDB/pgvector storage, semantic search, context injection |
| Autonomous Agents | 8-phase cognitive cycle, goal classification, iterative reasoning |
| Sandboxed Execution | Docker/VM isolation, security policies, output tracking |
| Human-in-the-Loop | Risk assessment, approval workflows, audit logging |
| Observability | Structured event logging, metrics collection, execution replay |
| Persona System | Customizable agent personalities and communication styles |
| Model Card Library | Comprehensive model metadata, capability validation, cost estimation |
- Sandbox System: Complete Docker and VM-based isolation for agent code execution
- Darwin Layer 2: Enhanced cognitive cycle with 8 reasoning phases
- Fast-Path Execution: Sub-5-second responses for trivial goals
- Circuit Breaker: Automatic detection and interruption of failing agent loops
- Activity System: XML-based structured output for models without native tool support
- Comprehensive Observability: JSONL event logging with execution replay
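The Activity System lets models without native tool calling request actions through structured XML output. As an illustration of the idea (the tag names below are hypothetical, not LLMCore's actual schema), parsing such a block is straightforward:

```python
import xml.etree.ElementTree as ET

# A hypothetical activity block a model might emit (illustrative schema).
raw = """
<activity type="execute_shell">
    <parameter name="command">ls -la</parameter>
</activity>
"""

def parse_activity(text: str) -> tuple[str, dict]:
    """Extract the activity type and its parameters from an XML block."""
    root = ET.fromstring(text.strip())
    params = {p.get("name"): (p.text or "") for p in root.findall("parameter")}
    return root.get("type"), params

activity_type, params = parse_activity(raw)
print(activity_type, params)
```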
```python
import asyncio
from llmcore import LLMCore

async def main():
    async with await LLMCore.create() as llm:
        # Simple question
        response = await llm.chat("What is the capital of France?")
        print(response)

        # Streaming response
        stream = await llm.chat("Tell me a story.", stream=True)
        async for chunk in stream:
            print(chunk, end="", flush=True)

asyncio.run(main())
```

```python
async def conversation():
    async with await LLMCore.create() as llm:
        # First message - sets context
        await llm.chat(
            "My name is Alex and I love astronomy.",
            session_id="alex_chat",
            system_message="You are a friendly science tutor."
        )

        # Follow-up - LLM remembers context
        response = await llm.chat(
            "What should I observe tonight?",
            session_id="alex_chat"
        )
        print(response)
```

```python
async def rag_example():
    async with await LLMCore.create() as llm:
        # Add documents to the vector store
        await llm.add_documents_to_vector_store(
            documents=[
                {"content": "LLMCore supports multiple providers...", "metadata": {"source": "docs"}},
                {"content": "Configuration uses the confy library...", "metadata": {"source": "docs"}},
            ],
            collection_name="my_docs"
        )

        # Query with RAG
        response = await llm.chat(
            "How does LLMCore handle configuration?",
            enable_rag=True,
            rag_collection_name="my_docs",
            rag_retrieval_k=3
        )
        print(response)
```

```python
from llmcore.agents import AgentManager, AgentMode

async def agent_example():
    async with await LLMCore.create() as llm:
        # Create the agent manager
        agent = AgentManager(
            provider_manager=llm._provider_manager,
            memory_manager=llm._memory_manager,
            storage_manager=llm._storage_manager
        )

        # Run the agent with a goal
        result = await agent.run(
            goal="Research the top 3 Python web frameworks and compare them",
            mode=AgentMode.SINGLE
        )
        print(result.final_answer)
```

Requires Python 3.11 or later.
```bash
pip install llmcore
```

```bash
# OpenAI support
pip install llmcore[openai]

# Anthropic Claude support
pip install llmcore[anthropic]

# Google Gemini support
pip install llmcore[gemini]

# Local Ollama support
pip install llmcore[ollama]
```

```bash
# ChromaDB for vector storage
pip install llmcore[chromadb]

# PostgreSQL with pgvector
pip install llmcore[postgres]
```

```bash
# Docker sandbox
pip install llmcore[sandbox-docker]

# VM/SSH sandbox
pip install llmcore[sandbox-vm]

# Both sandbox types
pip install llmcore[sandbox]
```

```bash
# Everything included
pip install llmcore[all]
```

```bash
git clone https://github.com/araray/llmcore.git
cd llmcore
pip install -e ".[dev]"
```
```
LLMCore API Facade (llmcore.api.LLMCore)
├── Provider Manager ── Providers: OpenAI, Anthropic, Gemini, Ollama
├── Session Manager ─── Storage: SQLite, Postgres, JSON
├── Memory Manager ──── RAG: ChromaDB, pgvector
└── Embedding Manager ─ Embeddings: Sentence Transformers, OpenAI, Google

Agent System (Darwin Layer 2)
├── Cognitive Cycle: PERCEIVE → PLAN → THINK → VALIDATE → ACT → OBSERVE → REFLECT → UPDATE
├── Goal Classifier │ Fast-Path Executor │ Circuit Breaker │ HITL Manager
└── Persona Manager │ Prompt Library │ Activity System │ Capability Checker

Sandbox System
└── Docker Provider │ VM Provider │ Registry │ Output Tracker

Supporting Systems
└── Model Card Registry │ Observability System │ Tracing System │ Logging Config
```
LLMCore uses confy for layered configuration with the following precedence (highest priority last):

1. Package defaults: `llmcore/config/default_config.toml`
2. User config: `~/.config/llmcore/config.toml`
3. Custom file: `LLMCore.create(config_file_path="...")`
4. Environment variables: `LLMCORE_*` prefix
5. Direct overrides: `LLMCore.create(config_overrides={...})`
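Conceptually, each layer is a dictionary merged on top of the previous one, with later layers winning on conflicts. A simplified sketch of the idea (not confy's actual implementation):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults  = {"llmcore": {"default_provider": "openai", "log_level": "INFO"}}
user_file = {"llmcore": {"default_provider": "anthropic"}}
overrides = {"llmcore": {"log_level": "DEBUG"}}

config = defaults
for layer in (user_file, overrides):  # lowest to highest priority
    config = deep_merge(config, layer)

print(config)
```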
```toml
# ~/.config/llmcore/config.toml

[llmcore]
default_provider = "openai"
default_embedding_model = "text-embedding-3-small"
log_level = "INFO"

[providers.openai]
# API key via: LLMCORE_PROVIDERS__OPENAI__API_KEY or OPENAI_API_KEY
default_model = "gpt-4o"
timeout = 60

[providers.anthropic]
default_model = "claude-sonnet-4-5-20250929"
timeout = 60

[providers.ollama]
# host = "http://localhost:11434"
default_model = "llama3.2:latest"

[storage.session]
type = "sqlite"
path = "~/.llmcore/sessions.db"

[storage.vector]
type = "chromadb"
path = "~/.llmcore/chroma_db"
default_collection = "llmcore_default"

[agents]
max_iterations = 10
default_timeout = 600

[agents.sandbox]
mode = "docker"

[agents.sandbox.docker]
enabled = true
image = "python:3.11-slim"
memory_limit = "1g"
cpu_limit = 2.0
network_enabled = false

[agents.hitl]
enabled = true
global_risk_threshold = "medium"
default_timeout_seconds = 300
```

```bash
# Provider API keys
export LLMCORE_PROVIDERS__OPENAI__API_KEY="sk-..."
export LLMCORE_PROVIDERS__ANTHROPIC__API_KEY="sk-ant-..."
export LLMCORE_PROVIDERS__GEMINI__API_KEY="..."

# Or use the standard provider env vars
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

# Storage
export LLMCORE_STORAGE__SESSION__TYPE="postgres"
export LLMCORE_STORAGE__SESSION__DB_URL="postgresql://user:pass@localhost/llmcore"

# Logging
export LLMCORE_LOG_LEVEL="DEBUG"
export LLMCORE_LOG_RAW_PAYLOADS="true"
```

LLMCore supports multiple LLM providers through a unified interface:
| Provider | Models | Features |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-5.2, o1, o3-mini | Streaming, Tools, Vision |
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku 4.5 | Streaming, Tools, Vision |
| Google | Gemini 2.5 Pro/Flash, Gemini 3 Preview | Streaming, Tools, Vision |
| Ollama | Llama 3.2/3.3, Gemma 3, Phi-3, Mistral | Streaming, Local |
| DeepSeek | DeepSeek-R1, DeepSeek-V3.2, DeepSeek-Chat | Streaming, Reasoning |
| Mistral | Mistral Large 3 | Streaming, Tools |
| Qwen | Qwen 3 Max, Qwen3-Coder-480B | Streaming, Tools |
| xAI | Grok-4, Grok-4-Heavy | Streaming, Tools |
```python
# Use the default provider
response = await llm.chat("Hello!")

# Override per request
response = await llm.chat(
    "Explain quantum computing",
    provider_name="anthropic",
    model_name="claude-sonnet-4-5-20250929"
)

# With provider-specific parameters
response = await llm.chat(
    "Write a poem",
    provider_name="openai",
    model_name="gpt-4o",
    temperature=0.9,
    max_tokens=500
)
```

The agent system implements an advanced cognitive architecture for autonomous task execution.
The 8-phase cognitive cycle enables sophisticated reasoning:
```
PERCEIVE ──► PLAN ──► THINK ──► VALIDATE
    ▲                              │
    │                              ▼
 UPDATE ◄── REFLECT ◄── OBSERVE ◄── ACT
```
Phase Descriptions:
| Phase | Purpose |
|---|---|
| PERCEIVE | Analyze goal, extract entities, assess complexity |
| PLAN | Generate execution strategy and select approach |
| THINK | Reason about next action using CoT/ReAct patterns |
| VALIDATE | Check proposed action validity and safety |
| ACT | Execute the chosen action (tool call, code, etc.) |
| OBSERVE | Analyze results and extract observations |
| REFLECT | Learn from outcome, identify improvements |
| UPDATE | Update working memory and iteration state |
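The cycle can be pictured as a fixed sequence of phases executed once per agent iteration. The sketch below is purely illustrative; the real phase handlers live in `llmcore.agents.cognitive`:

```python
from enum import Enum, auto

class Phase(Enum):
    PERCEIVE = auto()
    PLAN = auto()
    THINK = auto()
    VALIDATE = auto()
    ACT = auto()
    OBSERVE = auto()
    REFLECT = auto()
    UPDATE = auto()

def run_iteration(handlers: dict) -> list[str]:
    """Run one pass through the eight phases, recording the order."""
    trace = []
    for phase in Phase:  # enum members iterate in definition order
        handlers.get(phase, lambda: None)()
        trace.append(phase.name)
    return trace

trace = run_iteration({})
print(" -> ".join(trace))
```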
Automatic goal complexity assessment for optimal routing:
```python
from llmcore.agents import GoalClassifier, classify_goal

# Classify a goal
classification = classify_goal("What's 2 + 2?")
print(classification.complexity)          # GoalComplexity.TRIVIAL
print(classification.execution_strategy)  # ExecutionStrategy.FAST_PATH

# A complex goal
classification = classify_goal(
    "Research and compare the top 5 cloud providers, "
    "analyze their pricing, and create a recommendation report"
)
print(classification.complexity)      # GoalComplexity.COMPLEX
print(classification.max_iterations)  # 15
```

Complexity levels:
| Level | Max Iterations | Examples |
|---|---|---|
| `TRIVIAL` | 1 | Greetings, simple math, factual Q&A |
| `SIMPLE` | 5 | Single-step tasks, translations |
| `MODERATE` | 10 | Multi-step tasks, analysis |
| `COMPLEX` | 15 | Research, multi-source synthesis |
Bypass the full cognitive cycle for trivial goals (sub-5-second responses):

```python
from llmcore.agents.learning import FastPathExecutor, should_use_fast_path

# Check whether the fast path is appropriate
if should_use_fast_path("Hello, how are you?"):
    executor = FastPathExecutor(config=fast_path_config)
    result = await executor.execute(goal="Hello!")
    print(result.response)  # Near-instant response
```

Automatic detection and interruption of failing agent loops:
```python
from llmcore.agents import AgentCircuitBreaker, CircuitBreakerConfig

breaker = AgentCircuitBreaker(CircuitBreakerConfig(
    max_iterations=15,
    max_same_errors=3,
    max_execution_time_seconds=300,
    max_total_cost=1.0,
    progress_stall_threshold=5
))

# The circuit breaker trips on:
# - Maximum iterations exceeded
# - Repeated identical errors
# - Timeout exceeded
# - Cost limit exceeded
# - Progress stall detected
```

Interactive approval workflows for sensitive operations:
```python
from llmcore.agents.hitl import HITLManager, HITLConfig, ConsoleHITLCallback

# Create the HITL manager
hitl = HITLManager(
    config=HITLConfig(
        enabled=True,
        global_risk_threshold="medium",
        timeout_policy="reject"
    ),
    callback=ConsoleHITLCallback()  # Interactive console prompts
)

# Check whether an activity needs approval
decision = await hitl.check_approval(
    activity_type="execute_shell",
    parameters={"command": "rm -rf /tmp/test"}
)

if decision.is_approved:
    # Execute the activity
    pass
else:
    print(f"Rejected: {decision.reason}")
```

Risk levels:
| Level | Requires Approval | Examples |
|---|---|---|
| `LOW` | No | Read files, calculations |
| `MEDIUM` | Configurable | Write files, API calls |
| `HIGH` | Yes | Delete files, network access |
| `CRITICAL` | Always | System commands, credentials |
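The table above implies a simple decision rule: LOW never prompts, HIGH and CRITICAL always do, and MEDIUM depends on the configured `global_risk_threshold`. A sketch of that rule (illustrative only; the actual policy logic lives in `llmcore.agents.hitl`):

```python
def needs_approval(risk: str, threshold: str = "medium") -> bool:
    """Decide whether an activity at the given risk level requires
    human approval under the configured global threshold."""
    if risk in ("high", "critical"):  # always escalated to a human
        return True
    if risk == "low":                 # never requires approval
        return False
    # MEDIUM is configurable: it prompts when the threshold is at or below it
    return threshold in ("low", "medium")

print(needs_approval("low"))                         # False
print(needs_approval("medium"))                      # True
print(needs_approval("medium", threshold="high"))    # False
print(needs_approval("critical", threshold="high"))  # True
```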
Customize agent behavior and communication style:
```python
from llmcore.agents import PersonaManager, AgentPersona, PersonalityTrait

# Create a custom persona
persona = AgentPersona(
    name="DataAnalyst",
    description="A meticulous data analyst focused on accuracy",
    personality=[
        PersonalityTrait.ANALYTICAL,
        PersonalityTrait.METHODICAL,
        PersonalityTrait.CAUTIOUS
    ],
    communication_style="formal",
    risk_tolerance="low",
    planning_depth="thorough"
)

# Apply it to an agent
manager = PersonaManager()
manager.register_persona(persona)
agent_state.persona = manager.get_persona("DataAnalyst")
```

Secure, isolated execution environments for agent-generated code.
```
Sandbox Registry (manages sandbox lifecycle)
├── Docker Provider:   container isolation, image management, resource limits, output capture
├── VM Provider:       SSH-based access, full VM isolation, network separation, persistent storage
├── Output Tracker:    file lineage, execution logs, artifact collection
└── Ephemeral Manager: resource cleanup, timeout handling, state management
```
```python
from llmcore import (
    SandboxRegistry, DockerSandboxProvider, SandboxConfig, SandboxMode
)

# Create the sandbox registry
registry = SandboxRegistry()

# Register a Docker provider
docker_provider = DockerSandboxProvider(SandboxConfig(
    mode=SandboxMode.DOCKER,
    image="python:3.11-slim",
    memory_limit="1g",
    cpu_limit=2.0,
    timeout_seconds=300,
    network_enabled=False
))
registry.register_provider("docker", docker_provider)

# Create and use a sandbox
async with registry.create_sandbox("docker") as sandbox:
    # Execute Python code
    result = await sandbox.execute_python("""
import math
print(f"Pi = {math.pi}")
""")
    print(result.stdout)  # "Pi = 3.141592653589793"

    # Execute a shell command
    result = await sandbox.execute_shell("ls -la")
    print(result.stdout)

    # Save a file
    await sandbox.save_file("output.txt", "Hello, Sandbox!")

    # Read a file
    content = await sandbox.load_file("output.txt")
```

Pre-built, security-hardened images organized by tier:
| Tier | Image | Description |
|---|---|---|
| Base | `llmcore-sandbox-base:1.0.0` | Minimal Ubuntu 24.04 |
| Specialized | `llmcore-sandbox-python:1.0.0` | Python 3.12 development |
| Specialized | `llmcore-sandbox-nodejs:1.0.0` | Node.js 22 development |
| Specialized | `llmcore-sandbox-shell:1.0.0` | Shell scripting |
| Task | `llmcore-sandbox-research:1.0.0` | Research & analysis |
| Task | `llmcore-sandbox-websearch:1.0.0` | Web scraping |
| Level | Network | Filesystem | Tools |
|---|---|---|---|
| `RESTRICTED` | Blocked | Limited | Whitelisted only |
| `FULL` | Enabled | Extended | All tools |
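In `RESTRICTED` mode, only whitelisted tools are available inside the sandbox. The gating logic can be pictured as below (the tool names and table are hypothetical; real enforcement happens inside the hardened images):

```python
# Hypothetical access-level table; None means "all tools allowed".
ACCESS_LEVELS = {
    "RESTRICTED": {"network": False, "tools": {"python3", "pip", "cat", "ls"}},
    "FULL":       {"network": True,  "tools": None},
}

def tool_allowed(level: str, tool: str) -> bool:
    """Check whether a tool may run under the given access level."""
    allowed = ACCESS_LEVELS[level]["tools"]
    return allowed is None or tool in allowed

print(tool_allowed("RESTRICTED", "curl"))  # False: not whitelisted
print(tool_allowed("FULL", "curl"))        # True: all tools allowed
```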
- Non-root execution: All containers run as `sandboxuser` (UID 1000)
- No SUID/SGID binaries: Privilege escalation vectors removed
- Resource limits: Memory, CPU, and process limits enforced
- Network isolation: Optional network blocking
- Output tracking: Full lineage and audit trail
- AppArmor/seccomp ready: Compatible with security profiles
Comprehensive metadata management for LLM models.
```python
from llmcore import get_model_card_registry, get_model_card

# Get the registry singleton
registry = get_model_card_registry()

# Look up a model card
card = registry.get("openai", "gpt-4o")
print(f"Context: {card.get_context_length():,} tokens")
print(f"Vision: {card.capabilities.vision}")
print(f"Tools: {card.capabilities.tools}")

# Cost estimation
cost = card.estimate_cost(
    input_tokens=50_000,
    output_tokens=2_000,
    cached_tokens=10_000
)
print(f"Estimated cost: ${cost:.4f}")

# List models by capability
vision_models = registry.list_cards(tags=["vision"])
for model in vision_models:
    print(f"{model.provider}/{model.model_id}")

# Alias resolution
card = registry.get("anthropic", "claude-4.5-sonnet")  # Resolves the alias
```

Built-in model cards for:
- OpenAI: GPT-4o, GPT-4o-mini, GPT-5.2, o1, o3-mini, embeddings
- Anthropic: Claude Opus 4.5, Sonnet 4.5, Haiku 4.5
- Google: Gemini 2.5 Pro/Flash, Gemini 3 Preview
- Ollama: Llama 3.2/3.3, Gemma 3, Phi-3, Mistral, CodeLlama
- DeepSeek: DeepSeek-R1, DeepSeek-V3.2
- Mistral: Mistral Large 3
- Qwen: Qwen 3 Max, Qwen3-Coder
- xAI: Grok-4, Grok-4-Heavy
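Cost estimation as shown above boils down to per-million-token pricing, with cached input tokens billed at a discounted rate. A minimal sketch with hypothetical prices (not real rates for any model):

```python
def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0,
                  price_in: float = 2.50, price_out: float = 10.00,
                  price_cached: float = 1.25) -> float:
    """Prices are hypothetical USD per 1M tokens; cached input tokens are
    billed at the discounted cached rate instead of the full input rate."""
    uncached = input_tokens - cached_tokens
    usd = (uncached * price_in
           + cached_tokens * price_cached
           + output_tokens * price_out) / 1_000_000
    return usd

cost = estimate_cost(input_tokens=50_000, output_tokens=2_000, cached_tokens=10_000)
print(f"${cost:.4f}")  # $0.1325 at these hypothetical rates
```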
Add custom cards in `~/.config/llmcore/model_cards/<provider>/<model>.json`:

```json
{
  "model_id": "my-custom-model",
  "display_name": "My Custom Model",
  "provider": "ollama",
  "model_type": "chat",
  "context": {
    "max_input_tokens": 32768,
    "max_output_tokens": 4096
  },
  "capabilities": {
    "streaming": true,
    "tools": false,
    "vision": false
  }
}
```

Comprehensive monitoring and debugging for agent executions.
```python
from llmcore.agents.observability import EventLogger, EventCategory

# Events are logged to ~/.llmcore/events.jsonl
logger = EventLogger(log_path="~/.llmcore/events.jsonl")

# Event categories:
# - LIFECYCLE: Agent start/stop
# - COGNITIVE: Phase execution
# - ACTIVITY: Tool executions
# - HITL: Human approvals
# - ERROR: Exceptions
# - METRIC: Performance data
# - MEMORY: Memory operations
# - SANDBOX: Container lifecycle
# - RAG: Retrieval operations
```

```python
from llmcore.agents.observability import MetricsCollector

collector = MetricsCollector()

# Available metrics:
# - Iteration counts
# - LLM call latency (p50, p90, p95, p99)
# - Token usage (input/output)
# - Estimated costs
# - Activity execution times
# - Error counts by type
```

```python
from llmcore.agents.observability import ExecutionReplay

replay = ExecutionReplay(events_path="~/.llmcore/events.jsonl")

# List executions
executions = replay.list_executions()
for exec_id, metadata in executions.items():
    print(f"{exec_id}: {metadata['goal']}")

# Replay a specific execution
events = replay.get_execution_events(exec_id)
for event in events:
    print(f"[{event.timestamp}] {event.category}: {event.type}")
```

```toml
[agents.observability]
enabled = true

[agents.observability.events]
enabled = true
log_path = "~/.llmcore/events.jsonl"
min_severity = "info"
categories = []  # Empty = all categories

[agents.observability.events.rotation]
strategy = "size"
max_size_mb = 100
max_files = 10
compress = true

[agents.observability.metrics]
enabled = true
track_cost = true
track_tokens = true
latency_percentiles = [50, 90, 95, 99]

[agents.observability.replay]
enabled = true
cache_enabled = true
cache_max_executions = 50
```

Retrieval-Augmented Generation for knowledge-enhanced responses.
```python
# Add documents with metadata
await llm.add_documents_to_vector_store(
    documents=[
        {
            "content": "LLMCore is a Python library...",
            "metadata": {
                "source": "documentation",
                "version": "0.26.0",
                "category": "overview"
            }
        },
        {
            "content": "Configuration uses the confy library...",
            "metadata": {
                "source": "documentation",
                "category": "configuration"
            }
        }
    ],
    collection_name="project_docs"
)
```

```python
# Direct similarity search
results = await llm.search_vector_store(
    query="How does configuration work?",
    k=5,
    collection_name="project_docs",
    metadata_filter={"category": "configuration"}
)

for doc in results:
    print(f"Score: {doc.score:.4f}")
    print(f"Content: {doc.content[:100]}...")
    print(f"Metadata: {doc.metadata}")
```

```python
response = await llm.chat(
    "Explain how to configure providers",
    enable_rag=True,
    rag_collection_name="project_docs",
    rag_retrieval_k=3,
    system_message="Answer based ONLY on the provided context."
)
```

LLMCore can serve as an LLM backend for external RAG engines:
```python
# External engine (e.g., semantiscan) handles retrieval
relevant_docs = await external_rag_engine.retrieve(query)

# Construct a prompt with the retrieved context
context = format_documents(relevant_docs)
full_prompt = f"Context:\n{context}\n\nQuestion: {query}"

# Use LLMCore for generation only
response = await llm.chat(
    message=full_prompt,
    enable_rag=False,            # Disable internal RAG
    explicitly_staged_items=[]   # Optional additional context
)
```

| Class | Description |
|---|---|
| `LLMCore` | Main facade for all LLM operations |
| `AgentManager` | Manages autonomous agent execution |
| `StorageManager` | Handles session and vector storage |
| `ProviderManager` | Manages LLM provider connections |

| Model | Description |
|---|---|
| `ChatSession` | Conversation session with messages |
| `Message` | Individual chat message (user/assistant/system) |
| `ContextDocument` | Document for RAG/context |
| `Tool` | Function/tool definition |
| `ToolCall` | Tool invocation by the LLM |
| `ToolResult` | Result of tool execution |
| `ModelCard` | Model metadata and capabilities |
```python
from llmcore import (
    LLMCoreError,           # Base exception
    ConfigError,            # Configuration issues
    ProviderError,          # LLM provider errors
    StorageError,           # Storage operations
    SessionStorageError,    # Session storage specific
    VectorStorageError,     # Vector storage specific
    SessionNotFoundError,   # Session lookup failure
    ContextError,           # Context management
    ContextLengthError,     # Context exceeds limits
    EmbeddingError,         # Embedding generation
    SandboxError,           # Sandbox execution
    SandboxInitializationError,
    SandboxExecutionError,
    SandboxTimeoutError,
    SandboxAccessDenied,
    SandboxResourceError,
    SandboxConnectionError,
    SandboxCleanupError,
)
```

```
llmcore/
├── src/llmcore/
│   ├── __init__.py                  # Public API exports
│   ├── api.py                       # Main LLMCore class
│   ├── models.py                    # Core data models
│   ├── exceptions.py                # Exception hierarchy
│   ├── config/                      # Configuration system
│   │   ├── default_config.toml
│   │   └── models.py
│   ├── providers/                   # LLM provider implementations
│   │   ├── openai_provider.py
│   │   ├── anthropic_provider.py
│   │   ├── gemini_provider.py
│   │   └── ollama_provider.py
│   ├── storage/                     # Storage backends
│   │   ├── sqlite_session.py
│   │   ├── postgres_session_storage.py
│   │   ├── chromadb_vector.py
│   │   └── pgvector_storage.py
│   ├── embedding/                   # Embedding models
│   │   ├── sentence_transformer.py
│   │   ├── openai.py
│   │   └── google.py
│   ├── agents/                      # Agent system
│   │   ├── manager.py               # AgentManager
│   │   ├── cognitive/               # 8-phase cognitive cycle
│   │   ├── hitl/                    # Human-in-the-loop
│   │   ├── sandbox/                 # Sandbox execution
│   │   ├── learning/                # Learning mechanisms
│   │   ├── persona/                 # Persona system
│   │   ├── prompts/                 # Prompt library
│   │   ├── observability/           # Monitoring & logging
│   │   └── routing/                 # Model routing
│   ├── model_cards/                 # Model metadata
│   │   ├── registry.py
│   │   ├── schema.py
│   │   └── default_cards/
│   └── memory/                      # Memory management
├── container_images/                # Sandbox Docker images
├── examples/                        # Usage examples
├── tests/                           # Test suite
├── docs/                            # Documentation
└── pyproject.toml                   # Project configuration
```
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src/llmcore --cov-report=term-missing

# Run specific test categories
pytest -m "not slow"         # Skip slow tests
pytest -m "not integration"  # Skip integration tests
pytest -m sandbox            # Run sandbox tests only

# Run a specific test file
pytest tests/agents/test_cognitive_system.py
```

Contributions are welcome! Please follow these guidelines:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Write tests for your changes
- Ensure all tests pass (`pytest`)
- Follow the existing code style (`ruff check .`)
- Commit with conventional commits (`feat: add amazing feature`)
- Push to your branch
- Open a Pull Request
```bash
git clone https://github.com/araray/llmcore.git
cd llmcore
pip install -e ".[dev]"
pre-commit install
```

- Formatter: Ruff
- Type Hints: Required for all public APIs
- Docstrings: Google style
- Line Length: 100 characters
This project is licensed under the MIT License. See the LICENSE file for details.
- llmchat - CLI interface for llmcore
- semantiscan - Advanced RAG engine
- confy - Configuration management library
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with ❤️ by Araray Velho
