Priyanshu-byte-coder/modulino



Maya — Wellbeing AI Companion

A Fully Offline, Privacy-First AI Mental Wellbeing Companion for Raspberry Pi 5

Maya combines a local LLM, real-time facial emotion detection, text sentiment analysis, guided mental exercises, and long-term conversational memory (RAG) to provide empathetic, context-aware support — all without any data ever leaving the device.

"Technology should care for people, not exploit them." — Maya was built on this principle.


Table of Contents

  1. Overview
  2. Key Features
  3. System Architecture
  4. Project Structure
  5. Hardware Requirements
  6. LLM Model Research & Benchmarking
  7. Software Stack & Technology Choices
  8. Complete Processing Pipeline
  9. Guided Mental Exercises
  10. Web Interface & UI Design
  11. Installation — Raspberry Pi 5
  12. Installation — Windows (Development)
  13. Running the Application
  14. Configuration Reference
  15. Module Deep Dive
  16. API Reference
  17. Utility Scripts
  18. Troubleshooting
  19. Privacy & Security
  20. Future Scope
  21. License

1. Overview

Maya is an AI-powered wellbeing companion that listens, understands, and responds with empathy. It runs 100% offline on a Raspberry Pi 5, ensuring complete privacy. The system uses:

  • A local LLM (Microsoft Phi-3 Mini via Ollama) for natural conversation
  • VADER sentiment analysis to understand the emotional tone of text
  • FER (Facial Expression Recognition) with a webcam for real-time facial emotion detection
  • ChromaDB vector database for long-term conversational memory (RAG)
  • An Emotion Engine that fuses text sentiment, facial emotion, and historical patterns into a unified mental state model
  • A Guided Exercise System that detects stress and offers evidence-based 30-second exercises

The companion is named Maya and provides brief, warm, supportive responses tailored to the user's current emotional state.

Why Maya?

Problem → Maya's Solution
Mental health apps send data to the cloud → 100% offline — nothing leaves the device
AI assistants require internet → Runs on a local LLM via Ollama
Text-only chatbots miss visual cues → Facial emotion detection via webcam
Chatbots forget past conversations → Long-term memory via ChromaDB (RAG)
Generic responses lack empathy → Emotion-aware prompting fuses sentiment + face + history
Expensive hardware requirements → Runs on an $80 Raspberry Pi 5

2. Key Features

Fully Offline: No internet required after initial setup. All inference happens locally on-device.
Privacy-First: Zero data leaves the Raspberry Pi. No cloud APIs, no telemetry.
Multimodal Emotion Understanding: Combines text sentiment + facial expression + conversation history.
Guided Mental Exercises: When stress is detected, Maya offers quick 30-second exercises (breathing, grounding, gratitude, mindfulness).
Long-Term Memory (RAG): Remembers past conversations using ChromaDB vector similarity search for context-aware responses.
Dual Interface: Terminal CLI for direct interaction, or a polished Flask web UI accessible from any device on the LAN.
Real-Time Camera Feed: The web interface shows a live camera feed with emotion overlay and bounding boxes.
Streaming Responses: LLM responses stream token-by-token via Server-Sent Events (SSE).
Modular Architecture: Clean separation of concerns with abstract base classes for camera and display.
RPi 5 Optimized: Tuned context window, thread count, and token limits for ARM64 CPU inference.

3. System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         User Interfaces                         │
│  ┌──────────────────────┐    ┌───────────────────────────────┐  │
│  │   Terminal CLI        │    │   Flask Web App (port 5000)   │  │
│  │   (main.py)           │    │   (web_app.py + index.html)   │  │
│  └──────────┬───────────┘    └──────────────┬────────────────┘  │
└─────────────┼───────────────────────────────┼───────────────────┘
              │                               │
              ▼                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                       AgentBrain (brain.py)                      │
│          Central orchestrator — coordinates all modules          │
│                                                                  │
│  ┌─────────────┐ ┌──────────────┐ ┌───────────┐ ┌────────────┐ │
│  │ LLMClient   │ │ Sentiment    │ │ Emotion   │ │ Conversa-  │ │
│  │ (llm.py)    │ │ Analyzer     │ │ Engine    │ │ tion       │ │
│  │             │ │ (sentiment.  │ │ (emotion. │ │ Memory     │ │
│  │ Ollama API  │ │  py)         │ │  py)      │ │ (memory.py)│ │
│  │ phi3:mini   │ │ VADER        │ │ Fusion    │ │ ChromaDB   │ │
│  └─────────────┘ └──────────────┘ └───────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────────┘
              │                                        │
              ▼                                        ▼
┌──────────────────────┐               ┌──────────────────────────┐
│   Ollama Server      │               │  Camera Module           │
│   (localhost:11434)  │               │  (camera.py)             │
│   LLM Inference      │               │  OpenCV + FER            │
└──────────────────────┘               └──────────────────────────┘

4. Project Structure

wellbeing_ai/
│
├── main.py                    # Terminal CLI entry point
├── web_app.py                 # Flask web application (REST API + SSE streaming)
├── benchmark_models.py        # LLM model benchmark script (tests all Ollama models)
├── requirements.txt           # Python dependencies with version pins
├── setup_rpi.sh               # Automated setup script for Raspberry Pi (Linux/Bash)
├── setup_rpi.bat              # Automated setup script for Windows development
├── patch_fer.py               # Patches FER library to fix moviepy import on RPi
├── reset_memory.py            # Utility: clear all stored conversations
├── view_memory.py             # Utility: view stored conversations
├── test_camera.py             # Camera & FER diagnostic test script
├── benchmark_results.json     # Full benchmark data (auto-generated)
│
├── agent/                     # Core AI agent modules
│   ├── __init__.py
│   ├── brain.py               # AgentBrain — central orchestrator (5-step pipeline)
│   ├── llm.py                 # LLMClient — Ollama REST API integration
│   ├── sentiment.py           # SentimentAnalyzer — VADER-based text sentiment
│   ├── emotion.py             # EmotionEngine — multimodal emotion fusion
│   ├── memory.py              # ConversationMemory — ChromaDB RAG store
│   └── exercises.py           # ExerciseManager — 7 guided mental exercises
│
├── config/                    # Configuration
│   ├── __init__.py
│   └── config.py              # All tuneable parameters (LLM, camera, paths, etc.)
│
├── interface/                 # Hardware abstraction layers
│   ├── __init__.py
│   ├── camera.py              # BaseCamera / WebcamCamera — webcam + FER
│   └── display.py             # BaseDisplay / TerminalDisplay — output rendering
│
├── templates/                 # Flask HTML templates
│   └── index.html             # Web chat interface (glassmorphism UI, ~1100 lines)
│
├── data/                      # Runtime data (auto-created, gitignored)
│   └── memory/                # ChromaDB persistent vector storage
│
└── .gitignore                 # Git ignore rules

5. Hardware Requirements

Target: Raspberry Pi 5

Component Specification
Board Raspberry Pi 5 (4GB or 8GB RAM recommended)
Storage 32GB+ microSD card (Class 10 / UHS-I minimum)
Camera USB Webcam or Raspberry Pi Camera Module v2/v3
Power Official RPi 5 USB-C power supply (5V/5A)
Network Required only for initial setup (downloading models & packages)
Display Optional — web interface accessible from any device on LAN

Development: Any PC

  • Windows, macOS, or Linux with Python 3.10+
  • Webcam (for testing emotion detection)
  • 8GB+ RAM recommended

6. LLM Model Research & Benchmarking

6.1 Research Objective

Selecting the right LLM is critical for a wellbeing companion running on resource-constrained hardware. The model must:

  1. Fit in memory — RPi 5 has 4–8GB RAM shared between OS, app, and model
  2. Respond quickly — Users in emotional distress need timely responses (<30s)
  3. Show empathy — Generic/robotic responses are harmful in a wellbeing context
  4. Follow instructions — Must stay in character as Maya, keep responses brief, not hallucinate
  5. Run offline — Must be available via Ollama for local inference

6.2 Models Tested

We benchmarked 10 locally available Ollama models spanning a wide range of sizes and architectures:

# Model Architecture Parameters Quantized Size Source
1 phi3:mini Phi-3 Mini 3.8B 2.0 GB Microsoft
2 llama3.1:latest LLaMA 3.1 8B 4.6 GB Meta
3 qwen2.5:latest Qwen 2.5 7B 4.4 GB Alibaba
4 mistral:latest Mistral 7B 4.1 GB Mistral AI
5 gemma:2b Gemma 2B 1.6 GB Google
6 survival-gemma3:latest Gemma 3 (finetuned) 2B 1.6 GB Custom
7 survival-gemma2:latest Gemma 2 (finetuned) 2B 1.6 GB Custom
8 survival-gemma:latest Gemma (finetuned) 2B 1.6 GB Custom
9 tinyllama:latest TinyLlama 1.1B 0.6 GB TinyLlama Team
10 my-survival:latest TinyLlama (finetuned) 1.1B 0.6 GB Custom

6.3 Benchmark Methodology

Test Environment: Settings identical to production config — temperature=0.3, max_tokens=60, num_ctx=1024, num_thread=4. Each model was tested against 5 diverse wellbeing conversation prompts (50 total inferences).

Test Prompts:

# Category User Message
1 General Greeting "Hey, I just wanted someone to talk to."
2 Negative Emotion "I've been feeling really down lately and nothing seems to help."
3 Anxiety "I have a big exam tomorrow and I can't stop worrying about it."
4 Positive Emotion "I got promoted at work today! I'm so excited!"
5 Context Recall "It happened again last night. I barely slept 3 hours." (with memory context)

Quality Scoring (0–10 weighted):

Criterion Weight Description
Empathy 30% Empathetic language ("I understand", "sounds like", "here for you")
Brevity 20% 1–3 sentence responses score highest (optimized for RPi latency)
Naturalness 20% Absence of robotic phrases ("as an AI", "language model")
Length Fit 15% 20–200 character responses ideal for quick supportive replies
No Hallucination 15% Doesn't invent user's name or identity
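As a rough illustration, the weighted rubric above could be implemented like this; the keyword lists and per-criterion heuristics here are assumptions for the sketch, not the benchmark script's exact logic:

```python
# Illustrative re-implementation of the weighted quality score.
# Keyword lists and per-criterion heuristics are assumptions.
EMPATHY_PHRASES = ("i understand", "sounds like", "here for you", "i'm sorry")
ROBOTIC_PHRASES = ("as an ai", "language model")

def quality_score(response: str) -> float:
    text = response.lower()
    # Empathy (30%): reward empathetic language
    empathy = 10.0 if any(p in text for p in EMPATHY_PHRASES) else 4.0
    # Brevity (20%): 1-3 sentences score highest
    sentences = max(1, text.count(".") + text.count("!") + text.count("?"))
    brevity = 10.0 if sentences <= 3 else max(0.0, 10.0 - 2.0 * (sentences - 3))
    # Naturalness (20%): penalize robotic phrases
    naturalness = 0.0 if any(p in text for p in ROBOTIC_PHRASES) else 10.0
    # Length fit (15%): 20-200 characters is ideal
    length_fit = 10.0 if 20 <= len(response) <= 200 else 5.0
    # No hallucination (15%): would need known user facts; fixed here
    no_hallucination = 10.0
    return round(0.30 * empathy + 0.20 * brevity + 0.20 * naturalness
                 + 0.15 * length_fit + 0.15 * no_hallucination, 2)
```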

6.4 Benchmark Results — Performance Comparison

Model Size Avg Time TTFT Tok/s Avg Tokens Quality Pass
qwen2.5:latest 4.4 GB 5.68s 3.93s 8.48 42.6 9.13/10 5/5
llama3.1:latest 4.6 GB 8.54s 6.96s 6.29 32.0 8.65/10 5/5
survival-gemma2 1.6 GB 3.30s 2.69s 13.16 44.0 8.59/10 5/5
phi3:mini 2.0 GB 3.68s 3.03s 11.18 36.6 8.56/10 5/5
tinyllama 0.6 GB 2.66s 2.27s 17.88 47.6 8.41/10 5/5
survival-gemma3 1.6 GB 7.96s 7.31s 12.18 47.4 8.38/10 5/5
mistral:latest 4.1 GB 4.53s 3.25s 10.18 41.2 8.35/10 5/5
gemma:2b 1.6 GB 3.36s 2.73s 14.00 48.0 8.11/10 5/5
survival-gemma 1.6 GB 3.36s 2.67s 14.57 49.4 7.96/10 5/5
my-survival 0.6 GB 2.74s 2.59s 7.66 20.0 7.45/10 5/5

Legend: TTFT = Time To First Token | Tok/s = Tokens Per Second

6.5 Visual Comparisons

Quality Score (higher is better)

qwen2.5         ██████████████████░░ 9.13   ✗ Too large (4.4GB)
llama3.1        █████████████████░░░ 8.65   ✗ Too large (4.6GB)
survival-gemma2 ████████████████░░░░ 8.59   ✓ GOOD (1.6GB)
phi3:mini       ████████████████░░░░ 8.56   ✓ SELECTED (2.0GB)
tinyllama       ████████████████░░░░ 8.41   ✓ Fast but less empathetic
survival-gemma3 ████████████████░░░░ 8.38   ~ Slow first token
mistral         ████████████████░░░░ 8.35   ✗ Too large (4.1GB)
gemma:2b        ███████████████░░░░░ 8.11   ✓ Acceptable fallback
survival-gemma  ███████████████░░░░░ 7.96   ✓ Acceptable
my-survival     ██████████████░░░░░░ 7.45   ✗ Poor empathy

Response Time (lower is better)

tinyllama       ██░░░░░░░░░░░░░░░░░░  2.66s ← Fastest
my-survival     ██░░░░░░░░░░░░░░░░░░  2.74s
survival-gemma2 ███░░░░░░░░░░░░░░░░░  3.30s
phi3:mini       ███░░░░░░░░░░░░░░░░░  3.68s ← SELECTED
mistral         ████░░░░░░░░░░░░░░░░  4.53s
qwen2.5         █████░░░░░░░░░░░░░░░  5.68s
survival-gemma3 ████████░░░░░░░░░░░░  7.96s
llama3.1        █████████░░░░░░░░░░░  8.54s ← Slowest

Tokens Per Second (higher is better)

tinyllama       ██████████████████░░ 17.88 ← Fastest generation
survival-gemma  █████████████████░░░ 14.57
gemma:2b        ██████████████░░░░░░ 14.00
survival-gemma2 █████████████░░░░░░░ 13.16
phi3:mini       ███████████░░░░░░░░░ 11.18 ← SELECTED
mistral         ██████████░░░░░░░░░░ 10.18
qwen2.5         ████████░░░░░░░░░░░░  8.48
llama3.1        ██████░░░░░░░░░░░░░░  6.29 ← Slowest

6.6 Per-Category Quality Breakdown

Model General Negative Emotion Anxiety Positive Context Recall
phi3:mini 7.75 9.55 10.00 7.75 7.75
qwen2.5 9.25 9.55 9.40 10.00 7.45
llama3.1 8.50 8.50 8.50 10.00 7.75
mistral 7.90 9.55 8.65 7.90 7.75
survival-gemma2 7.90 8.95 8.65 8.50 8.95
gemma:2b 7.15 8.35 8.95 7.75 8.35
tinyllama 8.50 8.80 8.50 7.45 8.80

phi3:mini scored a perfect 10.00/10 on the Anxiety prompt — the most critical category for a wellbeing companion.

6.7 Sample Responses — Qualitative Comparison

Prompt: "I've been feeling really down lately and nothing seems to help."

phi3:mini (9.55/10): "I'm truly sorry to hear that you're going through a tough time. Remember, it's okay to feel sad sometimes, but I believe in your strength too. If there's anything specific on your mind or if you need someone to talk to, I'm here for you."

qwen2.5 (9.55/10): "I'm sorry to hear you're feeling this way. It's okay to feel sad sometimes, and it's important to give yourself lots of kindness during these times. Would you like to talk about what's been on your mind?"

llama3.1 (8.50/10): "Sweetheart, I'm so sorry you're struggling. Would you like me to guide you through a gentle breathing exercise or simply listen while you share what's on your mind?"

my-survival (7.75/10): "Can you please suggest some ways I can make myself feel better?" ⚠️ This model responds as the user instead of the assistant — poor instruction following.

Prompt: "I have a big exam tomorrow and I can't stop worrying about it."

phi3:mini (10.00/10 — PERFECT): "It sounds like you're feeling anxious, which is completely normal before an important test. Remember to take breaks and get some rest tonight—you got this!"

qwen2.5 (9.40/10): "I understand how stressful it can be before an important exam. Remember, you've studied hard, and you've got this. Take some deep breaths and try to get a good night's rest tonight. You'll do great!"

6.8 RPi 5 Suitability Analysis

Model Size OK? Speed OK? Quality OK? Verdict
phi3:mini ✓ (2.0 GB) ✓ (3.68s) ✓ (8.56) ✅ RECOMMENDED
survival-gemma2 ✓ (1.6 GB) ✓ (3.30s) ✓ (8.59) ✅ Good Alternative
gemma:2b ✓ (1.6 GB) ✓ (3.36s) ~ (8.11) ⚠️ Acceptable Fallback
tinyllama ✓ (0.6 GB) ✓ (2.66s) ~ (8.41) ⚠️ Fast but less empathetic
qwen2.5 ✗ (4.4 GB) ✓ (5.68s) ✓ (9.13) ❌ Too large for 4GB RPi
llama3.1 ✗ (4.6 GB) ~ (8.54s) ✓ (8.65) ❌ Too large, too slow
mistral ✗ (4.1 GB) ✓ (4.53s) ✓ (8.35) ❌ Too large for 4GB RPi
my-survival ✓ (0.6 GB) ✓ (2.74s) ✗ (7.45) ❌ Poor instruction following

6.9 Why We Chose Phi-3 Mini

After benchmarking all 10 models, Microsoft Phi-3 Mini (phi3:mini) was selected as the default:

Criterion Phi-3 Mini Best Alternative (qwen2.5)
Model Size 2.0 GB ✓ 4.4 GB ✗ (won't fit 4GB RPi)
Quality Score 8.56/10 9.13/10
Response Time 3.68s 5.68s
Empathy (Anxiety) 10.00/10 9.40/10
RPi 5 Compatible ✓ Yes ✗ No (on the 4GB model)

Key Insights:

  1. Best quality-to-size ratio — Achieves quality comparable to 7B models at nearly half the size
  2. Perfect score on anxiety prompts — 10/10 on the most critical wellbeing category
  3. Fits comfortably in 4GB RAM — Leaves room for OS, Python, ChromaDB, TensorFlow, and FER
  4. Excellent instruction following — Stays in character as Maya, keeps responses brief
  5. Natural empathy — Uses phrases like "I'm truly sorry", "it's okay to feel", "I believe in your strength"

Note: Users with 8GB RPi 5 may try qwen2.5:latest (LLM_MODEL=qwen2.5 in config) for higher quality at the cost of longer inference.

6.10 Reproducing the Benchmark

source venv/bin/activate   # Linux/RPi
# OR
venv\Scripts\activate      # Windows

python benchmark_models.py

The script discovers all local Ollama models, runs 5 prompts against each, measures latency/quality, prints a comparison table, and saves results to benchmark_results.json.


7. Software Stack & Technology Choices

Language Model (LLM)

Property Value
Model Microsoft Phi-3 Mini (phi3:mini)
Runtime Ollama (local inference server)
Parameters ~3.8B
Quantization Q4_K_M (default Ollama quantization)
Context Window 1024 tokens (tuned for RPi CPU performance)
Max Output Tokens 60 (brief, focused responses)
Temperature 0.3 (low creativity, high consistency)
CPU Threads 4 (matches RPi 5's quad-core Cortex-A76)
Timeout 300 seconds (5 min for slow CPU inference)
API Ollama REST API at http://localhost:11434
Stop Sequences \n\n, User:, Assistant:

Why Phi-3 Mini? — It is one of the smallest high-quality LLMs that can run on RPi 5 hardware with acceptable latency. It handles empathetic conversation well within tight token budgets.
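The settings in the table above map directly onto an Ollama /api/chat request body. The helper below is an illustrative sketch: the endpoint and option names are Ollama's public REST API, but the function itself is not the project's LLMClient.

```python
# How the documented settings map onto an Ollama /api/chat request body.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_payload(messages, stream=True):
    return {
        "model": "phi3:mini",
        "messages": messages,        # [{"role": "user", "content": "..."}]
        "stream": stream,
        "options": {
            "temperature": 0.3,      # low creativity, high consistency
            "num_predict": 60,       # max output tokens
            "num_ctx": 1024,         # context window
            "num_thread": 4,         # RPi 5 quad-core Cortex-A76
            "stop": ["\n\n", "User:", "Assistant:"],
        },
    }

# Sending it (needs the `requests` package and a running Ollama server):
#   import json, requests
#   r = requests.post(OLLAMA_URL, json=build_chat_payload(msgs),
#                     stream=True, timeout=300)
#   for line in r.iter_lines():
#       token = json.loads(line)["message"]["content"]
```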

Sentiment Analysis

Property Value
Library VADER (Valence Aware Dictionary and sEntiment Reasoner)
Package vaderSentiment>=3.3.2
Type Rule-based, lexicon-driven
Output Compound score (-1.0 to +1.0), pos/neg/neu breakdown
Thresholds Positive: ≥ 0.05, Negative: ≤ -0.05
Why VADER? Zero-latency, no GPU needed, specifically tuned for social/conversational text
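The thresholds above make the labelling step trivial to sketch. The helper name is illustrative; the compound score itself comes from the vaderSentiment package:

```python
# Map a VADER compound score to a label using the documented thresholds.
def label_compound(compound: float) -> str:
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

# With the real analyzer (requires vaderSentiment):
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
#   label = label_compound(compound)
```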

Facial Emotion Recognition

Property Value
Library FER (Facial Expression Recognition) v22.5.1
Backend TensorFlow (Keras CNN)
Face Detector OpenCV Haar Cascade (mtcnn=False for speed on RPi)
Detectable Emotions happy, sad, angry, fear, surprise, neutral, disgust
Confidence Threshold 0.30 (detections below this are discarded)
Sampling Interval Every 3 conversation turns (CLI) or every 2.5s (web UI polling)
Why not MTCNN? MTCNN is more accurate but significantly slower on CPU. Haar Cascade provides adequate speed on RPi 5.
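FER's detect_emotions() returns a list of face dicts with per-emotion scores; applying the 0.30 confidence threshold could look like the sketch below (the function name and fallback behaviour are assumptions):

```python
# Pick the strongest emotion from FER output, discarding low-confidence
# detections. Expects the list returned by FER.detect_emotions(), e.g.
# [{"box": [x, y, w, h], "emotions": {"happy": 0.91, "sad": 0.02, ...}}]
def dominant_emotion(results, threshold=0.30):
    if not results:
        return None  # no face found
    emotions = results[0]["emotions"]
    label, score = max(emotions.items(), key=lambda kv: kv[1])
    return label if score >= threshold else None

# In context (requires the fer package and OpenCV):
#   from fer import FER
#   detector = FER(mtcnn=False)  # Haar Cascade backend, faster on RPi CPU
#   emotion = dominant_emotion(detector.detect_emotions(rgb_frame))
```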

Vector Memory (RAG)

Property Value
Database ChromaDB (persistent mode)
Package chromadb>=0.4.22
Embedding ChromaDB's default all-MiniLM-L6-v2 Sentence Transformer
Distance Metric Cosine similarity (hnsw:space: cosine)
Retrieval Top-K 2 (reduced for CPU performance)
Storage Location data/memory/ (auto-created)
Collection Name conversations
Stored Metadata user_message, assistant_response, sentiment_label, sentiment_score, emotion, timestamp
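A minimal sketch of how a turn could be stored, assuming chromadb>=0.4: the helper names are illustrative, while the document format and metadata fields follow the table above.

```python
import time

def format_document(user_message: str, assistant_response: str) -> str:
    """Explicit role labels keep retrieval from confusing who said what."""
    return (f"The user said: {user_message}\n"
            f"Maya (the AI assistant) responded: {assistant_response}")

def make_metadata(user_message, assistant_response,
                  sentiment_label, sentiment_score, emotion):
    """The metadata fields listed in the table above."""
    return {
        "user_message": user_message,
        "assistant_response": assistant_response,
        "sentiment_label": sentiment_label,
        "sentiment_score": sentiment_score,
        "emotion": emotion,
        "timestamp": time.time(),
    }

# Persisting and querying (needs the chromadb package and disk access):
#   import chromadb, uuid
#   client = chromadb.PersistentClient(path="data/memory")
#   col = client.get_or_create_collection(
#       "conversations", metadata={"hnsw:space": "cosine"})
#   col.add(documents=[format_document(u, a)],
#           metadatas=[make_metadata(u, a, "negative", -0.6, "sad")],
#           ids=[str(uuid.uuid4())])
#   hits = col.query(query_texts=["exam stress"], n_results=2)
```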

Web Framework

Property Value
Framework Flask 3.0+
CORS flask-cors 4.0+
Streaming Server-Sent Events (SSE) via /api/chat_stream
Host 0.0.0.0:5000 (accessible on LAN)
Template Single-page glassmorphism UI (templates/index.html)
Font Google Quicksand (loaded via CDN on first access)
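Token-by-token streaming over SSE boils down to emitting `data:` frames. The frame helper below is an assumption of this sketch; the Flask wiring in the comments follows the /api/chat_stream route named above.

```python
import json

def sse_event(data, event=None):
    """Serialize one Server-Sent Event frame (optional event: line,
    then a data: line, then the blank line that terminates the frame)."""
    prefix = f"event: {event}\n" if event else ""
    return f"{prefix}data: {json.dumps(data)}\n\n"

# In Flask (sketch; `brain` stands in for the project's AgentBrain):
#   from flask import Response
#   @app.route("/api/chat_stream")
#   def chat_stream():
#       def gen():
#           for token in brain.process(msg, stream=True):
#               yield sse_event({"token": token})
#       return Response(gen(), mimetype="text/event-stream")
```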

Computer Vision

Property Value
Library OpenCV (headless) opencv-python-headless>=4.8.0
Usage Camera capture, color conversion (BGR→RGB), bounding box drawing, JPEG encoding

Other Dependencies

Package Version Purpose
requests ≥2.31.0 HTTP client for Ollama REST API
numpy ≥1.24.0, <2.0.0 Array operations for OpenCV/TF (pinned <2.0 for compatibility)
tensorflow ≥2.15.0, <2.18.0 Backend for FER emotion detection CNN

8. Complete Processing Pipeline

Per-Message Processing Pipeline

When a user sends a message, the AgentBrain.process() method orchestrates this pipeline:

User Input (text)
     │
     ▼
┌─────────────────────────────────┐
│ 1. SENTIMENT ANALYSIS           │
│    SentimentAnalyzer.analyze()  │
│    VADER scores the text →      │
│    label: positive/negative/    │
│           neutral               │
│    compound: -1.0 to +1.0      │
│    intensity: 0.0 to 1.0       │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 2. MEMORY RETRIEVAL (RAG)       │
│    ConversationMemory.retrieve()│
│    Query ChromaDB with user     │
│    input → retrieve top-2 most  │
│    semantically similar past    │
│    conversations                │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 3. EMOTION ENGINE UPDATE        │
│    EmotionEngine.update()       │
│    Fuses:                       │
│    • Text sentiment (VADER)     │
│    • Facial emotion (FER/cam)   │
│    • Historical sentiment avg   │
│    • Memory sentiment patterns  │
│    Outputs: MentalState object  │
│    (dominant_emotion, trend,    │
│     historical_avg)             │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 4. LLM RESPONSE GENERATION     │
│    LLMClient.chat()             │
│    Builds system prompt with:   │
│    • Base persona (Maya)        │
│    • Current user mood          │
│    • Emotional trend guidance   │
│    • Retrieved memory context   │
│    Sends last 4 messages +      │
│    system prompt to Ollama      │
│    Streams response tokens      │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 5. MEMORY STORAGE               │
│    ConversationMemory.store()   │
│    Stores in ChromaDB:          │
│    • user_message               │
│    • assistant_response         │
│    • sentiment_label & score    │
│    • dominant emotion           │
│    • timestamp                  │
│    Embedded for future RAG      │
└─────────────────────────────────┘
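The five steps above can be sketched as a single orchestration function. The component objects are stand-ins for the project's modules, and the exact method signatures are assumptions:

```python
# Orchestration sketch of the per-message pipeline. The injected
# components mirror SentimentAnalyzer, ConversationMemory, EmotionEngine
# and LLMClient; signatures are illustrative, not the project's API.
def process(user_input, face_emotion, sentiment_analyzer, memory,
            emotion_engine, llm):
    sentiment = sentiment_analyzer.analyze(user_input)                # 1. sentiment
    memories = memory.retrieve(user_input, top_k=2)                   # 2. RAG recall
    state = emotion_engine.update(sentiment, face_emotion, memories)  # 3. fusion
    reply = llm.chat(user_input, state, memories)                     # 4. generation
    memory.store(user_input, reply, sentiment, state)                 # 5. persistence
    return reply
```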

Emotion Fusion Logic (EmotionEngine)

The Emotion Engine maintains a sliding window (10 turns) of sentiment scores and emotion labels to compute:

  1. Dominant Emotion Resolution — If a facial emotion is detected (not unknown/None), it takes priority. Otherwise, text sentiment is mapped: positive→happy, negative→sad, neutral→neutral.
  2. Historical Sentiment Average — Running mean of compound scores over the window.
  3. Emotional Trend — Compares the average sentiment of the first half vs second half of the window. Difference >0.15 = "improving", <-0.15 = "declining", else "stable".
  4. Memory Pattern Adjustment — If >60% of retrieved memories have negative sentiment, the historical average is shifted down by 0.1 to increase concern.
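The trend rule in step 3 translates directly into code. The window-split logic and the ±0.15 thresholds come from the text; the guard for very short windows is an assumption of this sketch:

```python
# Compare first-half vs second-half average sentiment over the window.
def emotional_trend(scores, threshold=0.15):
    if len(scores) < 4:
        return "stable"  # too little data to split; assumption of the sketch
    mid = len(scores) // 2
    first = sum(scores[:mid]) / mid
    second = sum(scores[mid:]) / (len(scores) - mid)
    diff = second - first
    if diff > threshold:
        return "improving"
    if diff < -threshold:
        return "declining"
    return "stable"
```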

Camera Emotion Flow (Web UI)

Browser polls /api/camera/snapshot every 2.5s
     │
     ▼
WebcamCamera.capture_snapshot_with_overlay()
     │
     ├── cv2.VideoCapture.read() → raw BGR frame
     ├── FER.detect_emotions(RGB frame) → bounding boxes + emotion scores
     ├── Draw green bounding box + emotion label on frame
     ├── cv2.imencode('.jpg') → JPEG bytes
     │
     ▼
Response: { image: base64 JPEG, emotion: "happy" }
     │
     ▼
Browser updates camera feed image + emotion emoji/label
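The snapshot endpoint's response shape can be sketched as follows. Field names follow the diagram; the base64/JSON encoding details are assumptions of this sketch:

```python
import base64
import json

def snapshot_response(jpeg_bytes: bytes, emotion: str) -> str:
    """JSON body returned to the polling browser, per the flow above."""
    return json.dumps({
        "image": base64.b64encode(jpeg_bytes).decode("ascii"),
        "emotion": emotion,
    })

# Producing jpeg_bytes on the server side (sketch, needs OpenCV):
#   ok, buf = cv2.imencode(".jpg", frame_bgr)
#   payload = snapshot_response(buf.tobytes(), "happy")
```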

9. Guided Mental Exercises

Maya actively monitors the user's emotional state and offers guided mental exercises when stress is detected.

Design Principles

Principle Implementation
Non-intrusive Exercises offered only when multiple stress indicators align
Opt-in User can accept ("yes", "let's do it") or decline ("skip", "not now")
Quick All exercises complete in under 30 seconds
Skippable User can exit mid-exercise at any time
Cooldown Only offered once every 5 turns (configurable)

Exercise Library — 7 Evidence-Based Exercises

# Exercise Category Duration Description
1 Box Breathing 🌬️ Breathing 24s 4-4-4-4 pattern (inhale, hold, exhale, hold)
2 Calming Breath 🍃 Breathing 26s Inhale 4s, exhale 6s — twice
3 5-4-3-2-1 Grounding 🌍 Grounding 30s Name 5 things you see, 4 feel, 3 hear, 2 smell, 1 taste
4 Quick Gratitude 🙏 Gratitude 20s Reflect on one thing you're grateful for
5 Body Scan 🧘 Mindfulness 25s Release tension in shoulders, jaw, chest, toes
6 Present Moment 🧘 Mindfulness 20s Three deep breaths with awareness
7 Tension Release 💪 Mindfulness 25s Progressive muscle relaxation (squeeze and release)
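One library entry might look like the sketch below. The field names are assumptions, and the 1.5-cycle step list is inferred from the 24 s total shown in the table:

```python
# One exercise in the shape the table implies (field names assumed).
BOX_BREATHING = {
    "name": "Box Breathing",
    "category": "breathing",
    "steps": [  # 4-4-4-4 pattern; 1.5 cycles = 24 s total (assumption)
        ("Breathe IN slowly", 4), ("Hold", 4),
        ("Breathe OUT slowly", 4), ("Hold", 4),
        ("Breathe IN slowly", 4), ("Hold", 4),
    ],
}

def total_duration(exercise) -> int:
    """Sum the per-step timers, as shown in the Duration column."""
    return sum(seconds for _, seconds in exercise["steps"])
```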

Exercise Trigger Conditions

Exercises are offered when any of these conditions are met:

┌─────────────────────────────────────────────────────────────┐
│                  STRESS DETECTION ENGINE                     │
│                                                              │
│  Condition 1: hist_avg < -0.3      → Persistent low mood    │
│  Condition 2: trend="declining"    → Getting worse           │
│               AND hist_avg < -0.15                           │
│  Condition 3: sentiment < -0.5     → Very negative now       │
│  Condition 4: emotion ∈ {sad,      → Stress emotion          │
│               angry, fear, disgust}                          │
│               AND sentiment < -0.2                           │
│                                                              │
│  ANY condition true → needs_exercise = True                  │
│  Cooldown: min 5 turns between offers                        │
└─────────────────────────────────────────────────────────────┘
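The four conditions transcribe directly into a predicate; the thresholds are taken verbatim from the box above, and the function name is illustrative:

```python
# Stress detection: any one condition being true triggers an offer.
STRESS_EMOTIONS = {"sad", "angry", "fear", "disgust"}

def needs_exercise(sentiment, hist_avg, trend, emotion):
    if hist_avg < -0.3:                                  # persistent low mood
        return True
    if trend == "declining" and hist_avg < -0.15:        # getting worse
        return True
    if sentiment < -0.5:                                 # very negative now
        return True
    if emotion in STRESS_EMOTIONS and sentiment < -0.2:  # stress emotion
        return True
    return False
```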

Exercise Flow (Web UI)

Stress Detected → Exercise Offer (SSE event)
       │
       ▼
┌──────────────────────────┐
│  Exercise Selection Card │
│  ┌────┐ ┌────┐ ┌────┐   │
│  │🌬️ │ │🍃 │ │🌍 │   │  ← User picks one
│  │Box │ │Calm│ │5421│   │
│  └────┘ └────┘ └────┘   │
│         [Skip]           │
└──────────┬───────────────┘
           │
           ▼
┌──────────────────────────┐
│  Step-by-Step Guide      │
│  "Breathe IN slowly..."  │
│  ┌──────────────────┐    │
│  │   ⏱️ 4 seconds    │   │  ← Timer countdown
│  │   ████████░░░░    │   │
│  └──────────────────┘    │
│  [Next Step] [Skip]      │
└──────────────────────────┘

10. Web Interface & UI Design

The web interface is a single-page application built with a glassmorphism design language.

UI Components

Component Description
Chat Panel Message bubbles with avatars, typing indicator, auto-scroll
Emotion Sidebar Live camera feed, emotion emoji, status indicators
Header System status dots (LLM ●, Memory ●, Camera ●), reset button
Exercise Cards Gradient cards with icons, timers, step-by-step guides

Design Specifications

Property Value
UI Style Glassmorphism (frosted glass panels, gradient background)
Color Palette Soft purple-blue gradient (#e0c3fc → #8ec5fc)
Font Google Quicksand (warm, approachable)
Layout Responsive — side-by-side on desktop, stacked on mobile
Streaming SSE via ReadableStream for token-by-token display
Camera Polling /api/camera/snapshot every 2.5 seconds
Accessibility High contrast text, large touch targets

Emotion Display Mapping

Emotion Emoji Color Accent
Happy 😊 Green
Sad 😢 Blue
Angry 😠 Red
Fear 😨 Purple
Surprise 😲 Yellow
Neutral 😐 Gray
Disgust 🤢 Green

11. Installation — Raspberry Pi 5

Prerequisites

  • Raspberry Pi OS (64-bit / Bookworm recommended)
  • Python 3.10 or higher
  • Internet connection (for initial setup only)

Automated Setup

# Clone the repository
git clone <repository-url> ~/wellbeing_ai
cd ~/wellbeing_ai

# Run the setup script
chmod +x setup_rpi.sh
./setup_rpi.sh

The script will:

  1. Create a Python virtual environment
  2. Install all pip dependencies
  3. Create the data/memory/ directory
  4. Install Ollama (if not already installed)
  5. Pull the phi3:mini model

Manual Setup

# 1. Install system dependencies
sudo apt update && sudo apt install -y python3 python3-pip python3-venv libatlas-base-dev

# 2. Create & activate virtual environment
python3 -m venv venv
source venv/bin/activate

# 3. Upgrade pip
pip install --upgrade pip

# 4. Install Python dependencies
pip install -r requirements.txt

# 5. Patch FER for RPi (fixes moviepy import error)
python patch_fer.py

# 6. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 7. Start Ollama and pull the model
ollama serve &
sleep 3
ollama pull phi3:mini

# 8. Create data directory
mkdir -p data/memory

# 9. Enable camera (if using RPi Camera Module)
sudo raspi-config
# Navigate to: Interface Options → Camera → Enable
# Reboot if prompted

12. Installation — Windows (Development)

REM Clone the repository
git clone <repository-url>
cd wellbeing_ai

REM Run the setup script
setup_rpi.bat

Or manually:

python -m venv venv
venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt

REM Install Ollama from https://ollama.com/download/windows
ollama pull phi3:mini

13. Running the Application

Terminal CLI

source venv/bin/activate   # Linux/RPi
# OR
venv\Scripts\activate      # Windows

python main.py

This launches an interactive terminal session where you type messages and Maya responds. Camera emotion detection samples every 3 turns (configurable).

Web Interface

source venv/bin/activate   # Linux/RPi

python web_app.py

Then open in a browser:

  • On the Pi: http://localhost:5000
  • From another device on LAN: http://<raspberry-pi-ip>:5000

The web interface provides:

  • A chat window with streaming responses
  • Live camera feed with emotion overlay (bounding boxes + labels)
  • Real-time emotion emoji display
  • System status indicators (LLM, Memory, Camera)
  • Conversation reset button

14. Configuration Reference

All configuration lives in config/config.py. Key settings:

Setting Default Description
OLLAMA_BASE_URL http://localhost:11434 Ollama server URL (env: OLLAMA_BASE_URL)
LLM_MODEL phi3:mini Ollama model name (env: LLM_MODEL)
LLM_TEMPERATURE 0.3 Creativity vs consistency (0.0–1.0)
LLM_MAX_TOKENS 60 Max response length in tokens
LLM_NUM_CTX 1024 Context window size
LLM_NUM_THREAD 4 CPU threads (matches RPi 5 quad-core)
LLM_TIMEOUT 300 Request timeout in seconds
CAMERA_ENABLED True Enable/disable camera subsystem
CAMERA_INDEX 0 OpenCV camera device index
CAMERA_SAMPLE_INTERVAL 3 Capture emotion every N turns (CLI)
MEMORY_COLLECTION conversations ChromaDB collection name
MEMORY_TOP_K 2 Number of memories to retrieve per query
EXERCISE_TRIGGER_THRESHOLD -0.3 Sentiment threshold for offering exercises
EXERCISE_COOLDOWN_TURNS 5 Minimum turns between exercise offers
DISPLAY_MODE terminal Display mode (terminal or eink)
SYSTEM_PROMPT Maya persona System prompt sent to the LLM

Environment Variables

These settings can be overridden via environment variables:

  • OLLAMA_BASE_URL
  • LLM_MODEL
  • CAMERA_ENABLED (set to "true" / "false")
  • DISPLAY_MODE
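A common pattern for these overrides is shown below. The variable names match the Configuration Reference; the boolean parser is an illustrative sketch, not the project's exact config code.

```python
import os

# String settings fall back to the documented defaults.
OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
LLM_MODEL = os.environ.get("LLM_MODEL", "phi3:mini")

def env_flag(name: str, default: bool = True) -> bool:
    """Parse "true"/"false" environment strings into booleans."""
    return os.environ.get(name, str(default)).strip().lower() == "true"

CAMERA_ENABLED = env_flag("CAMERA_ENABLED", True)
```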

15. Module Deep Dive

agent/brain.py — AgentBrain

The central orchestrator. Initializes all subsystems and exposes:

  • check_systems() → dict of subsystem health checks
  • process(user_input, face_emotion, stream) → runs the full 5-step pipeline (sentiment → memory retrieval → emotion update → LLM generation → memory storage)
  • Maintains a rolling conversation history (last 4 messages sent to LLM)
  • Builds a dynamic system prompt that includes Maya's persona, current user mood, emotional trend guidance, and retrieved memory context

agent/llm.py — LLMClient

Communicates with Ollama's REST API:

  • is_available() → checks if Ollama is running and the model is loaded (via /api/tags)
  • generate(prompt, system) → single-shot generation via /api/generate (streaming internally)
  • chat(messages, stream_output) → chat-style generation via /api/chat. When stream_output=True, returns a generator that yields tokens one by one for SSE streaming.
  • Handles connection errors, timeouts, and server unavailability gracefully with error messages.

agent/sentiment.py — SentimentAnalyzer

Wraps VADER for conversational sentiment analysis:

  • analyze(text) → returns SentimentResult(label, compound, intensity, scores)
  • Labels: positive (compound ≥ 0.05), negative (compound ≤ -0.05), neutral
  • Intensity = absolute value of compound score (0.0–1.0)
  • Zero dependencies beyond vaderSentiment, instant CPU execution

agent/emotion.py — EmotionEngine

Maintains session-level emotional state:

  • update(sentiment, face_emotion, retrieved_memories) → returns MentalState
  • Tracks a sliding window of 10 sentiment scores and emotion labels
  • Resolves dominant emotion (face > text mapping)
  • Computes emotional trend (improving / declining / stable)
  • Adjusts for long-term memory patterns (negative memory ratio)

agent/memory.py — ConversationMemory

ChromaDB-backed long-term memory with RAG:

  • store(MemoryEntry) → stores a conversation turn with full metadata
  • retrieve(query, top_k) → semantic similarity search, returns list[RetrievedMemory]
  • Uses cosine distance in HNSW index
  • Each entry stores: user message, assistant response, sentiment label/score, emotion, timestamp
  • Documents are formatted as "The user said: ...\nMaya (the AI assistant) responded: ..." for embedding (prevents role confusion)
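
The document format and metadata schema described above can be sketched as two small helpers; the metadata field names here are assumptions. In the real module these feed `collection.add(documents=..., metadatas=...)` on a ChromaDB collection created with `{"hnsw:space": "cosine"}`:

```python
from datetime import datetime, timezone

def format_document(user_msg, assistant_msg):
    """Build the embedding text with explicit roles, so semantic search
    never confuses who said what."""
    return (f"The user said: {user_msg}\n"
            f"Maya (the AI assistant) responded: {assistant_msg}")

def build_metadata(sentiment_label, sentiment_score, emotion):
    """Per-entry metadata stored alongside the document (field names assumed)."""
    return {
        "sentiment_label": sentiment_label,
        "sentiment_score": sentiment_score,
        "emotion": emotion,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```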

agent/exercises.py — ExerciseManager

Manages guided mental exercises for stress relief:

  • ExerciseManager — tracks exercise state and cooldown periods
  • should_offer_exercise(current_turn, cooldown_turns) → checks if enough time has passed since last offer
  • get_random_exercise() → selects a random exercise, avoiding repetition
  • mark_exercise_offered(turn) → records when an exercise was offered
  • format_exercise_offer() → generates the opt-in offer message
  • format_exercise_steps(exercise) → formats exercise steps into a single message
  • Contains 7 pre-built exercises: Box Breathing, Calming Breath, 5-4-3-2-1 Grounding, Quick Gratitude, Body Scan, Present Moment, Tension Release
  • All exercises complete in under 30 seconds
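
The cooldown and anti-repetition logic can be sketched as follows; method names mirror the list above, but the internals (cooldown default, no-repeat rule) are assumptions:

```python
import random

class ExerciseManagerSketch:
    """Minimal sketch of the offer/cooldown state machine."""

    def __init__(self, exercises):
        self.exercises = list(exercises)
        self.last_offer_turn = None
        self.last_exercise = None

    def should_offer_exercise(self, current_turn, cooldown_turns=5):
        """True if no offer yet, or the cooldown has elapsed."""
        if self.last_offer_turn is None:
            return True
        return current_turn - self.last_offer_turn >= cooldown_turns

    def mark_exercise_offered(self, turn):
        self.last_offer_turn = turn

    def get_random_exercise(self):
        # Avoid offering the same exercise twice in a row.
        pool = [e for e in self.exercises if e != self.last_exercise] or self.exercises
        self.last_exercise = random.choice(pool)
        return self.last_exercise
```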

interface/camera.py — BaseCamera / WebcamCamera

Hardware abstraction for camera + emotion detection:

  • BaseCamera — abstract base class defining the interface
  • WebcamCamera — implementation using OpenCV + FER
  • capture_emotion() → capture frame, detect dominant emotion, return label or None
  • capture_frame() → return raw OpenCV frame
  • capture_snapshot_with_overlay() → capture frame, detect emotion, draw bounding box + label, return (JPEG bytes, emotion label)
  • is_available() → check if camera is accessible
  • release() → release camera resources
  • FER initialization is optional — if TensorFlow/FER is unavailable, the camera still works for frame capture without emotion detection
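
The optional-FER pattern can be sketched like this; `FER().top_emotion()` is the library's real API, but the graceful-degradation details are assumptions:

```python
# Optional-dependency pattern: frame capture keeps working when FER/TensorFlow
# is missing -- only emotion detection is disabled.
try:
    from fer import FER          # heavy import: pulls in TensorFlow
    _detector = FER()
except Exception:                # ImportError or TF init failure
    _detector = None

def capture_emotion(frame):
    """Return the dominant emotion label for a BGR frame, or None when FER
    is unavailable or no face is found."""
    if _detector is None or frame is None:
        return None
    label, _score = _detector.top_emotion(frame)  # (label, score), label may be None
    return label
```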

interface/display.py — BaseDisplay / TerminalDisplay

Hardware abstraction for output rendering:

  • BaseDisplay — abstract base class
  • TerminalDisplay — rich terminal output with text wrapping
  • show_message(sender, message) — formatted chat output
  • show_welcome() — welcome banner
  • show_emotion(emotion) — emoji-mapped emotion display
  • show_status(status) — system status messages
  • clear() — clear terminal screen (platform-aware: cls on Windows, clear on Linux)
  • Designed to be replaceable with an E-Ink display implementation
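
The abstraction can be sketched with an ABC plus a thin terminal implementation; the wrap width and formatting are assumptions, shown only to illustrate how an `EInkDisplay` would slot in:

```python
import textwrap
from abc import ABC, abstractmethod

class BaseDisplay(ABC):
    """Output abstraction so an E-Ink implementation can drop in later."""

    @abstractmethod
    def show_message(self, sender, message):
        ...

class TerminalDisplay(BaseDisplay):
    def __init__(self, width=70):
        self.width = width

    def show_message(self, sender, message):
        # Wrap long responses so they stay readable in a terminal.
        body = textwrap.fill(message, width=self.width)
        print(f"{sender}:\n{body}")
```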

web_app.py — Flask Web Application

REST API + SSE streaming server:

  • GET / — serves the chat interface (templates/index.html)
  • GET /api/status — returns system health (LLM, memory, camera)
  • POST /api/chat — synchronous chat endpoint (returns full response)
  • POST /api/chat_stream — SSE streaming chat endpoint (yields tokens)
  • GET /api/camera/snapshot — returns base64 JPEG with emotion overlay
  • GET /api/camera/emotion — returns detected emotion label only
  • POST /api/reset — resets conversation history

templates/index.html — Web Chat Interface

Single-page application with:

  • Glassmorphism UI — frosted glass panels with gradient background
  • Chat panel — message bubbles with avatars, typing indicator, auto-scroll
  • Emotion sidebar — live camera feed, emotion emoji display, status indicators
  • SSE streaming — reads token-by-token from /api/chat_stream using ReadableStream
  • Emotion polling — polls /api/camera/snapshot every 2.5 seconds for live emotion updates
  • Responsive — adapts to mobile screens with stacked layout
  • Font — Google Quicksand for a friendly, approachable feel

16. API Reference

The Flask web application exposes the following REST API endpoints:

| Method | Endpoint              | Description              | Request Body               | Response                             |
|--------|-----------------------|--------------------------|----------------------------|--------------------------------------|
| GET    | /                     | Serve the chat interface | —                          | HTML page                            |
| GET    | /api/status           | System health check      | —                          | {status, camera_enabled, model}      |
| POST   | /api/chat             | Synchronous chat         | {message, capture_emotion} | {response, face_emotion, turn_count} |
| POST   | /api/chat_stream      | SSE streaming chat       | {message, capture_emotion} | SSE stream of tokens                 |
| GET    | /api/camera/snapshot  | Camera frame + emotion   | —                          | {image: base64, emotion}             |
| GET    | /api/camera/emotion   | Detect emotion only      | —                          | {emotion, success}                   |
| POST   | /api/reset            | Reset conversation       | —                          | {success: true}                      |
| POST   | /api/trigger_exercise | Force exercise offer     | —                          | {success, exercises}                 |
| GET    | /api/exercises        | List all exercises       | —                          | {exercises: [...]}                   |
| POST   | /api/exercise/start   | Start an exercise        | {name}                     | {success, exercise}                  |
| POST   | /api/exercise/skip    | Skip current exercise    | —                          | {success: true}                      |
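
A minimal Python client for the synchronous endpoint; it assumes the server from web_app.py is reachable on its default port 5000:

```python
import requests

def chat_payload(message, capture_emotion=False):
    """Request body for POST /api/chat, per the endpoint table."""
    return {"message": message, "capture_emotion": capture_emotion}

def send_message(text, base="http://localhost:5000"):
    """Synchronous round-trip; blocks until the LLM finishes (can be slow on RPi)."""
    resp = requests.post(f"{base}/api/chat", json=chat_payload(text), timeout=180)
    resp.raise_for_status()
    return resp.json()["response"]
```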

SSE Stream Event Types

The /api/chat_stream endpoint emits these Server-Sent Events:

| Event Type     | Payload                                    | Description                          |
|----------------|--------------------------------------------|--------------------------------------|
| emotion        | {type: "emotion", emotion: "happy"}        | Detected facial emotion (sent first) |
| token          | {type: "token", token: "Hello"}            | Single token from LLM response       |
| exercise_offer | {type: "exercise_offer", exercises: [...]} | Stress detected, offer exercises     |
| done           | {type: "done"}                             | Response complete                    |
| error          | {type: "error", error: "..."}              | Error occurred                       |

17. Utility Scripts

benchmark_models.py

Comprehensive LLM benchmark that tests all locally installed Ollama models against wellbeing conversation prompts. Measures latency, TTFT, tokens/second, and empathy quality scores. Results saved to benchmark_results.json.

python benchmark_models.py

test_camera.py

Diagnostic script that tests the full camera pipeline in 4 steps:

  1. Webcam access (OpenCV)
  2. FER library import
  3. FER detector initialization
  4. Live emotion detection with confidence scores

python test_camera.py

patch_fer.py

Patches the FER library's classes.py to make the moviepy import optional. This fixes the "No module named 'moviepy.editor'" error seen on Raspberry Pi; moviepy is not needed for emotion detection, so the import can be safely skipped.

python patch_fer.py

reset_memory.py

Deletes all stored conversations from ChromaDB. Asks for confirmation before proceeding.

python reset_memory.py

view_memory.py

Displays all stored conversations with timestamps, sentiment labels, emotions, and message previews.

python view_memory.py

18. Troubleshooting

Ollama not running

[!] Ollama is not running or model not found.

Fix:

ollama serve &          # Start Ollama server
ollama pull phi3:mini   # Download the model

Camera not detected

⚠️ Camera enabled but not available

Fix (RPi Camera Module):

sudo raspi-config       # Interface Options → Camera → Enable
sudo reboot

Fix (USB Webcam):

ls /dev/video*          # Check available camera devices
# If your camera is at /dev/video1, set CAMERA_INDEX=1 in config/config.py

FER moviepy import error

No module named 'moviepy.editor'

Fix:

python patch_fer.py

Slow LLM responses

This is expected on RPi 5 CPU. Responses may take 30–120 seconds. To improve:

  • Reduce LLM_MAX_TOKENS in config/config.py
  • Reduce LLM_NUM_CTX (smaller context = faster)
  • Ensure no other heavy processes are running

TensorFlow / NumPy compatibility

If you see numpy-related errors:

pip install "numpy>=1.24.0,<2.0.0"
pip install --force-reinstall "tensorflow>=2.15.0,<2.18.0"

ChromaDB SQLite errors

On some RPi OS versions, the system SQLite is older than ChromaDB requires (SQLite ≥ 3.35):

pip install pysqlite3-binary
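
After installing, ChromaDB must actually pick up the newer build. The commonly used shim remaps the stdlib module name before the first chromadb import (e.g. at the top of agent/memory.py); whether this project needs it depends on the OS image:

```python
import sys

# Map pysqlite3-binary's modern SQLite onto the stdlib name so that
# ChromaDB's "import sqlite3" sees a version >= 3.35.
try:
    __import__("pysqlite3")
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
except ImportError:
    pass  # pysqlite3 not installed; fall back to the system sqlite3
```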

19. Privacy & Security

  • All processing happens locally on the Raspberry Pi. No data is sent to any external server.
  • Ollama runs locally — the LLM never contacts the internet during inference.
  • ChromaDB stores data locally in data/memory/ on the device's filesystem.
  • Camera frames are processed in-memory and never saved to disk (only emotion labels are stored).
  • The web interface binds to 0.0.0.0:5000 — it is accessible on the local network. For additional security, configure a firewall to restrict access to trusted devices only.
  • No API keys are required. No accounts, no cloud services, no telemetry.

20. Future Scope

| Area                 | Enhancement      | Description                                                       |
|----------------------|------------------|-------------------------------------------------------------------|
| Voice Input          | Speech-to-text   | Add offline whisper.cpp integration for voice conversations       |
| Voice Output         | Text-to-speech   | Use Piper TTS for spoken Maya responses                           |
| E-Ink Display        | Hardware display | Implement EInkDisplay class for Waveshare e-paper HAT             |
| Multi-User           | User profiles    | Separate ChromaDB collections per user with face recognition      |
| Journaling           | Mood journal     | Daily mood summaries and weekly trend reports                     |
| RPi Camera Module    | Native camera    | PiCameraModule class using the picamera2 library                  |
| Larger Models        | 8GB RPi option   | Support qwen2.5 or llama3.1 on an 8GB RPi 5                       |
| Exercise Expansion   | More exercises   | Add progressive relaxation, visualization, and journaling prompts |
| Multilingual         | Language support | Support Hindi, Spanish, and other languages via multilingual LLMs |
| Wearable Integration | Heart rate data  | Integrate with fitness bands for physiological stress signals     |

21. License

This project is intended for personal and educational use.
