Maya — Wellbeing AI Companion

A Fully Offline, Privacy-First AI Mental Wellbeing Companion for Raspberry Pi 5

Maya combines a local LLM, real-time facial emotion detection, text sentiment analysis, guided mental exercises, and long-term conversational memory (RAG) to provide empathetic, context-aware support — all without any data ever leaving the device.

"Technology should care for people, not exploit them." — Maya was built on this principle.

1. Overview

Maya is an AI-powered wellbeing companion that listens, understands, and responds with empathy. It runs 100% offline on a Raspberry Pi 5, ensuring complete privacy. The system uses:

A local LLM (Microsoft Phi-3 Mini via Ollama) for natural conversation
VADER sentiment analysis to understand the emotional tone of text
FER (Facial Expression Recognition) with a webcam for real-time facial emotion detection
ChromaDB vector database for long-term conversational memory (RAG)
An Emotion Engine that fuses text sentiment, facial emotion, and historical patterns into a unified mental state model
A Guided Exercise System that detects stress and offers evidence-based 30-second exercises

The companion is named Maya and provides brief, warm, supportive responses tailored to the user's current emotional state.

Why Maya?

Problem	Maya's Solution
Mental health apps send data to the cloud	100% offline — nothing leaves the device
AI assistants require internet	Runs on local LLM via Ollama
Text-only chatbots miss visual cues	Facial emotion detection via webcam
Chatbots forget past conversations	Long-term memory via ChromaDB (RAG)
Generic responses lack empathy	Emotion-aware prompting fuses sentiment + face + history
Expensive hardware requirements	Runs on a $80 Raspberry Pi 5

2. Key Features

Feature	Description
Fully Offline	No internet required after initial setup. All inference happens locally on-device.
Privacy-First	Zero data leaves the Raspberry Pi. No cloud APIs, no telemetry.
Multimodal Emotion Understanding	Combines text sentiment + facial expression + conversation history.
Guided Mental Exercises	When stress is detected, Maya offers quick 30-second exercises (breathing, grounding, gratitude, mindfulness).
Long-Term Memory (RAG)	Remembers past conversations using ChromaDB vector similarity search for context-aware responses.
Dual Interface	Terminal CLI for direct interaction, or a beautiful Flask web UI accessible from any device on LAN.
Real-Time Camera Feed	Web interface shows live camera feed with emotion overlay and bounding boxes.
Streaming Responses	LLM responses stream token-by-token via Server-Sent Events (SSE).
Modular Architecture	Clean separation of concerns with abstract base classes for camera and display.
RPi 5 Optimized	Tuned context windows, thread counts, token limits for ARM64 CPU inference.

3. System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         User Interfaces                         │
│  ┌──────────────────────┐    ┌───────────────────────────────┐  │
│  │   Terminal CLI        │    │   Flask Web App (port 5000)   │  │
│  │   (main.py)           │    │   (web_app.py + index.html)   │  │
│  └──────────┬───────────┘    └──────────────┬────────────────┘  │
└─────────────┼───────────────────────────────┼───────────────────┘
              │                               │
              ▼                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                       AgentBrain (brain.py)                      │
│          Central orchestrator — coordinates all modules          │
│                                                                  │
│  ┌─────────────┐ ┌──────────────┐ ┌───────────┐ ┌────────────┐ │
│  │ LLMClient   │ │ Sentiment    │ │ Emotion   │ │ Conversa-  │ │
│  │ (llm.py)    │ │ Analyzer     │ │ Engine    │ │ tion       │ │
│  │             │ │ (sentiment.  │ │ (emotion. │ │ Memory     │ │
│  │ Ollama API  │ │  py)         │ │  py)      │ │ (memory.py)│ │
│  │ phi3:mini   │ │ VADER        │ │ Fusion    │ │ ChromaDB   │ │
│  └─────────────┘ └──────────────┘ └───────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────────┘
              │                                        │
              ▼                                        ▼
┌──────────────────────┐               ┌──────────────────────────┐
│   Ollama Server      │               │  Camera Module           │
│   (localhost:11434)  │               │  (camera.py)             │
│   LLM Inference      │               │  OpenCV + FER            │
└──────────────────────┘               └──────────────────────────┘

4. Project Structure

wellbeing_ai/
│
├── main.py                    # Terminal CLI entry point
├── web_app.py                 # Flask web application (REST API + SSE streaming)
├── benchmark_models.py        # LLM model benchmark script (tests all Ollama models)
├── requirements.txt           # Python dependencies with version pins
├── setup_rpi.sh               # Automated setup script for Raspberry Pi (Linux/Bash)
├── setup_rpi.bat              # Automated setup script for Windows development
├── patch_fer.py               # Patches FER library to fix moviepy import on RPi
├── reset_memory.py            # Utility: clear all stored conversations
├── view_memory.py             # Utility: view stored conversations
├── test_camera.py             # Camera & FER diagnostic test script
├── benchmark_results.json     # Full benchmark data (auto-generated)
│
├── agent/                     # Core AI agent modules
│   ├── __init__.py
│   ├── brain.py               # AgentBrain — central orchestrator (7-step pipeline)
│   ├── llm.py                 # LLMClient — Ollama REST API integration
│   ├── sentiment.py           # SentimentAnalyzer — VADER-based text sentiment
│   ├── emotion.py             # EmotionEngine — multimodal emotion fusion
│   ├── memory.py              # ConversationMemory — ChromaDB RAG store
│   └── exercises.py           # ExerciseManager — 7 guided mental exercises
│
├── config/                    # Configuration
│   ├── __init__.py
│   └── config.py              # All tuneable parameters (LLM, camera, paths, etc.)
│
├── interface/                 # Hardware abstraction layers
│   ├── __init__.py
│   ├── camera.py              # BaseCamera / WebcamCamera — webcam + FER
│   └── display.py             # BaseDisplay / TerminalDisplay — output rendering
│
├── templates/                 # Flask HTML templates
│   └── index.html             # Web chat interface (glassmorphism UI, ~1100 lines)
│
├── data/                      # Runtime data (auto-created, gitignored)
│   └── memory/                # ChromaDB persistent vector storage
│
└── .gitignore                 # Git ignore rules

5. Hardware Requirements

Target: Raspberry Pi 5

Component	Specification
Board	Raspberry Pi 5 (4GB or 8GB RAM recommended)
Storage	32GB+ microSD card (Class 10 / UHS-I minimum)
Camera	USB Webcam or Raspberry Pi Camera Module v2/v3
Power	Official RPi 5 USB-C power supply (5V/5A)
Network	Required only for initial setup (downloading models & packages)
Display	Optional — web interface accessible from any device on LAN

Development: Any PC

Windows, macOS, or Linux with Python 3.10+
Webcam (for testing emotion detection)
8GB+ RAM recommended

6. LLM Model Research & Benchmarking

6.1 Research Objective

Selecting the right LLM is critical for a wellbeing companion running on resource-constrained hardware. The model must:

Fit in memory — RPi 5 has 4–8GB RAM shared between OS, app, and model
Respond quickly — Users in emotional distress need timely responses (<30s)
Show empathy — Generic/robotic responses are harmful in a wellbeing context
Follow instructions — Must stay in character as Maya, keep responses brief, not hallucinate
Run offline — Must be available via Ollama for local inference

6.2 Models Tested

We benchmarked 10 locally available Ollama models spanning a wide range of sizes and architectures:

#	Model	Architecture	Parameters	Quantized Size	Source
1	`phi3:mini`	Phi-3 Mini	3.8B	2.0 GB	Microsoft
2	`llama3.1:latest`	LLaMA 3.1	8B	4.6 GB	Meta
3	`qwen2.5:latest`	Qwen 2.5	7B	4.4 GB	Alibaba
4	`mistral:latest`	Mistral	7B	4.1 GB	Mistral AI
5	`gemma:2b`	Gemma	2B	1.6 GB	Google
6	`survival-gemma3:latest`	Gemma 3 (finetuned)	2B	1.6 GB	Custom
7	`survival-gemma2:latest`	Gemma 2 (finetuned)	2B	1.6 GB	Custom
8	`survival-gemma:latest`	Gemma (finetuned)	2B	1.6 GB	Custom
9	`tinyllama:latest`	TinyLlama	1.1B	0.6 GB	TinyLlama Team
10	`my-survival:latest`	TinyLlama (finetuned)	1.1B	0.6 GB	Custom

6.3 Benchmark Methodology

Test Environment: Settings identical to production config — temperature=0.3, max_tokens=60, num_ctx=1024, num_thread=4. Each model was tested against 5 diverse wellbeing conversation prompts (50 total inferences).

Test Prompts:

#	Category	User Message
1	General Greeting	"Hey, I just wanted someone to talk to."
2	Negative Emotion	"I've been feeling really down lately and nothing seems to help."
3	Anxiety	"I have a big exam tomorrow and I can't stop worrying about it."
4	Positive Emotion	"I got promoted at work today! I'm so excited!"
5	Context Recall	"It happened again last night. I barely slept 3 hours." (with memory context)

Quality Scoring (0–10 weighted):

Criterion	Weight	Description
Empathy	30%	Empathetic language ("I understand", "sounds like", "here for you")
Brevity	20%	1–3 sentence responses score highest (optimized for RPi latency)
Naturalness	20%	Absence of robotic phrases ("as an AI", "language model")
Length Fit	15%	20–200 character responses ideal for quick supportive replies
No Hallucination	15%	Doesn't invent user's name or identity

6.4 Benchmark Results — Performance Comparison

Model	Size	Avg Time	TTFT	Tok/s	Avg Tokens	Quality	Pass
qwen2.5:latest	4.4 GB	5.68s	3.93s	8.48	42.6	9.13/10	5/5
llama3.1:latest	4.6 GB	8.54s	6.96s	6.29	32.0	8.65/10	5/5
survival-gemma2	1.6 GB	3.30s	2.69s	13.16	44.0	8.59/10	5/5
phi3:mini	2.0 GB	3.68s	3.03s	11.18	36.6	8.56/10	5/5
tinyllama	0.6 GB	2.66s	2.27s	17.88	47.6	8.41/10	5/5
survival-gemma3	1.6 GB	7.96s	7.31s	12.18	47.4	8.38/10	5/5
mistral:latest	4.1 GB	4.53s	3.25s	10.18	41.2	8.35/10	5/5
gemma:2b	1.6 GB	3.36s	2.73s	14.00	48.0	8.11/10	5/5
survival-gemma	1.6 GB	3.36s	2.67s	14.57	49.4	7.96/10	5/5
my-survival	0.6 GB	2.74s	2.59s	7.66	20.0	7.45/10	5/5

Legend: TTFT = Time To First Token | Tok/s = Tokens Per Second

6.5 Visual Comparisons

Quality Score (higher is better)

qwen2.5         ██████████████████░░ 9.13   ✗ Too large (4.4GB)
llama3.1        █████████████████░░░ 8.65   ✗ Too large (4.6GB)
survival-gemma2 ████████████████░░░░ 8.59   ✓ GOOD (1.6GB)
phi3:mini       ████████████████░░░░ 8.56   ✓ SELECTED (2.0GB)
tinyllama       ████████████████░░░░ 8.41   ✓ Fast but less empathetic
survival-gemma3 ████████████████░░░░ 8.38   ~ Slow first token
mistral         ████████████████░░░░ 8.35   ✗ Too large (4.1GB)
gemma:2b        ███████████████░░░░░ 8.11   ✓ Acceptable fallback
survival-gemma  ███████████████░░░░░ 7.96   ✓ Acceptable
my-survival     ██████████████░░░░░░ 7.45   ✗ Poor empathy

Response Time (lower is better)

tinyllama       ██░░░░░░░░░░░░░░░░░░  2.66s ← Fastest
my-survival     ██░░░░░░░░░░░░░░░░░░  2.74s
survival-gemma2 ███░░░░░░░░░░░░░░░░░  3.30s
phi3:mini       ███░░░░░░░░░░░░░░░░░  3.68s ← SELECTED
mistral         ████░░░░░░░░░░░░░░░░  4.53s
qwen2.5         █████░░░░░░░░░░░░░░░  5.68s
survival-gemma3 ████████░░░░░░░░░░░░  7.96s
llama3.1        █████████░░░░░░░░░░░  8.54s ← Slowest

Tokens Per Second (higher is better)

tinyllama       ██████████████████░░ 17.88 ← Fastest generation
survival-gemma  █████████████████░░░ 14.57
gemma:2b        ██████████████░░░░░░ 14.00
survival-gemma2 █████████████░░░░░░░ 13.16
phi3:mini       ███████████░░░░░░░░░ 11.18 ← SELECTED
mistral         ██████████░░░░░░░░░░ 10.18
qwen2.5         ████████░░░░░░░░░░░░  8.48
llama3.1        ██████░░░░░░░░░░░░░░  6.29 ← Slowest

6.6 Per-Category Quality Breakdown

Model	General	Negative Emotion	Anxiety	Positive	Context Recall
phi3:mini	7.75	9.55	10.00	7.75	7.75
qwen2.5	9.25	9.55	9.40	10.00	7.45
llama3.1	8.50	8.50	8.50	10.00	7.75
mistral	7.90	9.55	8.65	7.90	7.75
survival-gemma2	7.90	8.95	8.65	8.50	8.95
gemma:2b	7.15	8.35	8.95	7.75	8.35
tinyllama	8.50	8.80	8.50	7.45	8.80

phi3:mini scored a perfect 10.00/10 on the Anxiety prompt — the most critical category for a wellbeing companion.

6.7 Sample Responses — Qualitative Comparison

Prompt: "I've been feeling really down lately and nothing seems to help."

phi3:mini (9.55/10): "I'm truly sorry to hear that you're going through a tough time. Remember, it's okay to feel sad sometimes, but I believe in your strength too. If there's anything specific on your mind or if you need someone to talk to, I'm here for you."

qwen2.5 (9.55/10): "I'm sorry to hear you're feeling this way. It's okay to feel sad sometimes, and it's important to give yourself lots of kindness during these times. Would you like to talk about what's been on your mind?"

llama3.1 (8.50/10): "Sweetheart, I'm so sorry you're struggling. Would you like me to guide you through a gentle breathing exercise or simply listen while you share what's on your mind?"

my-survival (7.75/10): "Can you please suggest some ways I can make myself feel better?" ⚠️ This model responds as the user instead of the assistant — poor instruction following.

Prompt: "I have a big exam tomorrow and I can't stop worrying about it."

phi3:mini (10.00/10 — PERFECT): "It sounds like you're feeling anxious, which is completely normal before an important test. Remember to take breaks and get some rest tonight—you got this!"

qwen2.5 (9.40/10): "I understand how stressful it can be before an important exam. Remember, you've studied hard, and you've got this. Take some deep breaths and try to get a good night's rest tonight. You'll do great!"

6.8 RPi 5 Suitability Analysis

Model	Size OK?	Speed OK?	Quality OK?	Verdict
phi3:mini	✓ (2.0 GB)	✓ (3.68s)	✓ (8.56)	✅ RECOMMENDED
survival-gemma2	✓ (1.6 GB)	✓ (3.30s)	✓ (8.59)	✅ Good Alternative
gemma:2b	✓ (1.6 GB)	✓ (3.36s)	~ (8.11)	⚠️ Acceptable Fallback
tinyllama	✓ (0.6 GB)	✓ (2.66s)	~ (8.41)	⚠️ Fast but less empathetic
qwen2.5	✗ (4.4 GB)	✓ (5.68s)	✓ (9.13)	❌ Too large for 4GB RPi
llama3.1	✗ (4.6 GB)	~ (8.54s)	✓ (8.65)	❌ Too large, too slow
mistral	✗ (4.1 GB)	✓ (4.53s)	✓ (8.35)	❌ Too large for 4GB RPi
my-survival	✓ (0.6 GB)	✓ (2.74s)	✗ (7.45)	❌ Poor instruction following

6.9 Why We Chose Phi-3 Mini

After benchmarking all 10 models, Microsoft Phi-3 Mini (phi3:mini) was selected as the default:

Criterion	Phi-3 Mini	Best Alternative (qwen2.5)
Model Size	2.0 GB ✓	4.4 GB ✗ (won't fit 4GB RPi)
Quality Score	8.56/10	9.13/10
Response Time	3.68s	5.68s
Empathy (Anxiety)	10.00/10 ✓	9.40/10
RPi 5 Compatible	✓	✗

Key Insights:

Best quality-to-size ratio — Achieves quality comparable to 7B models at nearly half the size
Perfect score on anxiety prompts — 10/10 on the most critical wellbeing category
Fits comfortably in 4GB RAM — Leaves room for OS, Python, ChromaDB, TensorFlow, and FER
Excellent instruction following — Stays in character as Maya, keeps responses brief
Natural empathy — Uses phrases like "I'm truly sorry", "it's okay to feel", "I believe in your strength"

Note: Users with 8GB RPi 5 may try qwen2.5:latest (LLM_MODEL=qwen2.5 in config) for higher quality at the cost of longer inference.

6.10 Reproducing the Benchmark

source venv/bin/activate   # Linux/RPi
# OR
venv\Scripts\activate      # Windows

python benchmark_models.py

The script discovers all local Ollama models, runs 5 prompts against each, measures latency/quality, prints a comparison table, and saves results to benchmark_results.json.

7. Software Stack & Technology Choices

Language Model (LLM)

Property	Value
Model	Microsoft Phi-3 Mini (`phi3:mini`)
Runtime	Ollama (local inference server)
Parameters	~3.8B parameters
Quantization	Q4_K_M (default Ollama quantization)
Context Window	1024 tokens (tuned for RPi CPU performance)
Max Output Tokens	60 (brief, focused responses)
Temperature	0.3 (low creativity, high consistency)
CPU Threads	4 (matches RPi 5's quad-core Cortex-A76)
Timeout	300 seconds (5 min for slow CPU inference)
API	Ollama REST API at `http://localhost:11434`
Stop Sequences	`\n\n`, `User:`, `Assistant:`

Why Phi-3 Mini? — It is one of the smallest high-quality LLMs that can run on RPi 5 hardware with acceptable latency. It handles empathetic conversation well within tight token budgets.

Sentiment Analysis

Property	Value
Library	VADER (Valence Aware Dictionary and sEntiment Reasoner)
Package	`vaderSentiment>=3.3.2`
Type	Rule-based, lexicon-driven
Output	Compound score (-1.0 to +1.0), pos/neg/neu breakdown
Thresholds	Positive: ≥ 0.05, Negative: ≤ -0.05
Why VADER?	Zero-latency, no GPU needed, specifically tuned for social/conversational text

Facial Emotion Recognition

Property	Value
Library	FER (Facial Expression Recognition) v22.5.1
Backend	TensorFlow (Keras CNN)
Face Detector	OpenCV Haar Cascade (`mtcnn=False` for speed on RPi)
Detectable Emotions	happy, sad, angry, fear, surprise, neutral, disgust
Confidence Threshold	0.30 (detections below this are discarded)
Sampling Interval	Every 3 conversation turns (CLI) or every 2.5s (web UI polling)
Why not MTCNN?	MTCNN is more accurate but significantly slower on CPU. Haar Cascade provides adequate speed on RPi 5.

Vector Memory (RAG)

Property	Value
Database	ChromaDB (persistent mode)
Package	`chromadb>=0.4.22`
Embedding	ChromaDB's default all-MiniLM-L6-v2 Sentence Transformer
Distance Metric	Cosine similarity (`hnsw:space: cosine`)
Retrieval Top-K	2 (reduced for CPU performance)
Storage Location	`data/memory/` (auto-created)
Collection Name	`conversations`
Stored Metadata	user_message, assistant_response, sentiment_label, sentiment_score, emotion, timestamp

Web Framework

Property	Value
Framework	Flask 3.0+
CORS	flask-cors 4.0+
Streaming	Server-Sent Events (SSE) via `/api/chat_stream`
Host	`0.0.0.0:5000` (accessible on LAN)
Template	Single-page glassmorphism UI (`templates/index.html`)
Font	Google Quicksand (loaded via CDN on first access)

Computer Vision

Property	Value
Library	OpenCV (headless) `opencv-python-headless>=4.8.0`
Usage	Camera capture, color conversion (BGR→RGB), bounding box drawing, JPEG encoding

Other Dependencies

Package	Version	Purpose
`requests`	≥2.31.0	HTTP client for Ollama REST API
`numpy`	≥1.24.0, <2.0.0	Array operations for OpenCV/TF (pinned <2.0 for compatibility)
`tensorflow`	≥2.15.0, <2.18.0	Backend for FER emotion detection CNN

8. Complete Processing Pipeline

Per-Message Processing Pipeline

When a user sends a message, the AgentBrain.process() method orchestrates this pipeline:

User Input (text)
     │
     ▼
┌─────────────────────────────────┐
│ 1. SENTIMENT ANALYSIS           │
│    SentimentAnalyzer.analyze()  │
│    VADER scores the text →      │
│    label: positive/negative/    │
│           neutral               │
│    compound: -1.0 to +1.0      │
│    intensity: 0.0 to 1.0       │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 2. MEMORY RETRIEVAL (RAG)       │
│    ConversationMemory.retrieve()│
│    Query ChromaDB with user     │
│    input → retrieve top-2 most  │
│    semantically similar past    │
│    conversations                │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 3. EMOTION ENGINE UPDATE        │
│    EmotionEngine.update()       │
│    Fuses:                       │
│    • Text sentiment (VADER)     │
│    • Facial emotion (FER/cam)   │
│    • Historical sentiment avg   │
│    • Memory sentiment patterns  │
│    Outputs: MentalState object  │
│    (dominant_emotion, trend,    │
│     historical_avg)             │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 4. LLM RESPONSE GENERATION     │
│    LLMClient.chat()             │
│    Builds system prompt with:   │
│    • Base persona (Maya)        │
│    • Current user mood          │
│    • Emotional trend guidance   │
│    • Retrieved memory context   │
│    Sends last 4 messages +      │
│    system prompt to Ollama      │
│    Streams response tokens      │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 5. MEMORY STORAGE               │
│    ConversationMemory.store()   │
│    Stores in ChromaDB:          │
│    • user_message               │
│    • assistant_response         │
│    • sentiment_label & score    │
│    • dominant emotion           │
│    • timestamp                  │
│    Embedded for future RAG      │
└─────────────────────────────────┘

Emotion Fusion Logic (`EmotionEngine`)

The Emotion Engine maintains a sliding window (10 turns) of sentiment scores and emotion labels to compute:

Dominant Emotion Resolution — If a facial emotion is detected (not unknown/None), it takes priority. Otherwise, text sentiment is mapped: positive→happy, negative→sad, neutral→neutral.
Historical Sentiment Average — Running mean of compound scores over the window.
Emotional Trend — Compares the average sentiment of the first half vs second half of the window. Difference >0.15 = "improving", <-0.15 = "declining", else "stable".
Memory Pattern Adjustment — If >60% of retrieved memories have negative sentiment, the historical average is shifted down by 0.1 to increase concern.

Camera Emotion Flow (Web UI)

Browser polls /api/camera/snapshot every 2.5s
     │
     ▼
WebcamCamera.capture_snapshot_with_overlay()
     │
     ├── cv2.VideoCapture.read() → raw BGR frame
     ├── FER.detect_emotions(RGB frame) → bounding boxes + emotion scores
     ├── Draw green bounding box + emotion label on frame
     ├── cv2.imencode('.jpg') → JPEG bytes
     │
     ▼
Response: { image: base64 JPEG, emotion: "happy" }
     │
     ▼
Browser updates camera feed image + emotion emoji/label

9. Guided Mental Exercises

Maya actively monitors the user's emotional state and offers guided mental exercises when stress is detected.

Design Principles

Principle	Implementation
Non-intrusive	Exercises offered only when multiple stress indicators align
Opt-in	User can accept ("yes", "let's do it") or decline ("skip", "not now")
Quick	All exercises complete in under 30 seconds
Skippable	User can exit mid-exercise at any time
Cooldown	Only offered once every 5 turns (configurable)

Exercise Library — 7 Evidence-Based Exercises

#	Exercise	Category	Duration	Description
1	Box Breathing	🌬️ Breathing	24s	4-4-4-4 pattern (inhale, hold, exhale, hold)
2	Calming Breath	🍃 Breathing	26s	Inhale 4s, exhale 6s — twice
3	5-4-3-2-1 Grounding	🌍 Grounding	30s	Name 5 things you see, 4 feel, 3 hear, 2 smell, 1 taste
4	Quick Gratitude	🙏 Gratitude	20s	Reflect on one thing you're grateful for
5	Body Scan	🧘 Mindfulness	25s	Release tension in shoulders, jaw, chest, toes
6	Present Moment	🧘 Mindfulness	20s	Three deep breaths with awareness
7	Tension Release	💪 Mindfulness	25s	Progressive muscle relaxation (squeeze and release)

Exercise Trigger Conditions

Exercises are offered when any of these conditions are met:

┌─────────────────────────────────────────────────────────────┐
│                  STRESS DETECTION ENGINE                     │
│                                                              │
│  Condition 1: hist_avg < -0.3      → Persistent low mood    │
│  Condition 2: trend="declining"    → Getting worse           │
│               AND hist_avg < -0.15                           │
│  Condition 3: sentiment < -0.5     → Very negative now       │
│  Condition 4: emotion ∈ {sad,      → Stress emotion          │
│               angry, fear, disgust}                          │
│               AND sentiment < -0.2                           │
│                                                              │
│  ANY condition true → needs_exercise = True                  │
│  Cooldown: min 5 turns between offers                        │
└─────────────────────────────────────────────────────────────┘

Exercise Flow (Web UI)

Stress Detected → Exercise Offer (SSE event)
       │
       ▼
┌──────────────────────────┐
│  Exercise Selection Card │
│  ┌────┐ ┌────┐ ┌────┐   │
│  │🌬️ │ │🍃 │ │🌍 │   │  ← User picks one
│  │Box │ │Calm│ │5421│   │
│  └────┘ └────┘ └────┘   │
│         [Skip]           │
└──────────┬───────────────┘
           │
           ▼
┌──────────────────────────┐
│  Step-by-Step Guide      │
│  "Breathe IN slowly..."  │
│  ┌──────────────────┐    │
│  │   ⏱️ 4 seconds    │   │  ← Timer countdown
│  │   ████████░░░░    │   │
│  └──────────────────┘    │
│  [Next Step] [Skip]      │
└──────────────────────────┘

10. Web Interface & UI Design

The web interface is a single-page application built with a glassmorphism design language.

UI Components

Component	Description
Chat Panel	Message bubbles with avatars, typing indicator, auto-scroll
Emotion Sidebar	Live camera feed, emotion emoji, status indicators
Header	System status dots (LLM ●, Memory ●, Camera ●), reset button
Exercise Cards	Gradient cards with icons, timers, step-by-step guides

Design Specifications

Property	Value
UI Style	Glassmorphism (frosted glass panels, gradient background)
Color Palette	Soft purple-blue gradient (`#e0c3fc` → `#8ec5fc`)
Font	Google Quicksand (warm, approachable)
Layout	Responsive — side-by-side on desktop, stacked on mobile
Streaming	SSE via `ReadableStream` for token-by-token display
Camera Polling	`/api/camera/snapshot` every 2.5 seconds
Accessibility	High contrast text, large touch targets

Emotion Display Mapping

Emotion	Emoji	Color Accent
Happy	😊	Green
Sad	😢	Blue
Angry	😠	Red
Fear	😨	Purple
Surprise	😲	Yellow
Neutral	😐	Gray
Disgust	🤢	Green

11. Installation — Raspberry Pi 5

Prerequisites

Raspberry Pi OS (64-bit / Bookworm recommended)
Python 3.10 or higher
Internet connection (for initial setup only)

Automated Setup

# Clone the repository
git clone <repository-url> ~/wellbeing_ai
cd ~/wellbeing_ai

# Run the setup script
chmod +x setup_rpi.sh
./setup_rpi.sh

The script will:

Create a Python virtual environment
Install all pip dependencies
Create the data/memory/ directory
Install Ollama (if not already installed)
Pull the phi3:mini model

Manual Setup

# 1. Install system dependencies
sudo apt update && sudo apt install -y python3 python3-pip python3-venv libatlas-base-dev

# 2. Create & activate virtual environment
python3 -m venv venv
source venv/bin/activate

# 3. Upgrade pip
pip install --upgrade pip

# 4. Install Python dependencies
pip install -r requirements.txt

# 5. Patch FER for RPi (fixes moviepy import error)
python patch_fer.py

# 6. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 7. Start Ollama and pull the model
ollama serve &
sleep 3
ollama pull phi3:mini

# 8. Create data directory
mkdir -p data/memory

# 9. Enable camera (if using RPi Camera Module)
sudo raspi-config
# Navigate to: Interface Options → Camera → Enable
# Reboot if prompted

12. Installation — Windows (Development)

REM Clone the repository
git clone <repository-url>
cd wellbeing_ai

REM Run the setup script
setup_rpi.bat

Or manually:

python -m venv venv
venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt

REM Install Ollama from https://ollama.com/download/windows
ollama pull phi3:mini

13. Running the Application

Terminal CLI

source venv/bin/activate   # Linux/RPi
# OR
venv\Scripts\activate      # Windows

python main.py

This launches an interactive terminal session where you type messages and Maya responds. Camera emotion detection samples every 3 turns (configurable).

Web Interface

source venv/bin/activate   # Linux/RPi

python web_app.py

Then open in a browser:

On the Pi: http://localhost:5000
From another device on LAN: http://<raspberry-pi-ip>:5000

The web interface provides:

A chat window with streaming responses
Live camera feed with emotion overlay (bounding boxes + labels)
Real-time emotion emoji display
System status indicators (LLM, Memory, Camera)
Conversation reset button

14. Configuration Reference

All configuration lives in config/config.py. Key settings:

Setting	Default	Description
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama server URL (env: `OLLAMA_BASE_URL`)
`LLM_MODEL`	`phi3:mini`	Ollama model name (env: `LLM_MODEL`)
`LLM_TEMPERATURE`	`0.3`	Creativity vs consistency (0.0–1.0)
`LLM_MAX_TOKENS`	`60`	Max response length in tokens
`LLM_NUM_CTX`	`1024`	Context window size
`LLM_NUM_THREAD`	`4`	CPU threads (matches RPi 5 quad-core)
`LLM_TIMEOUT`	`300`	Request timeout in seconds
`CAMERA_ENABLED`	`True`	Enable/disable camera subsystem
`CAMERA_INDEX`	`0`	OpenCV camera device index
`CAMERA_SAMPLE_INTERVAL`	`3`	Capture emotion every N turns (CLI)
`MEMORY_COLLECTION`	`conversations`	ChromaDB collection name
`MEMORY_TOP_K`	`2`	Number of memories to retrieve per query
`EXERCISE_TRIGGER_THRESHOLD`	`-0.3`	Sentiment threshold for offering exercises
`EXERCISE_COOLDOWN_TURNS`	`5`	Minimum turns between exercise offers
`DISPLAY_MODE`	`terminal`	Display mode (`terminal` or `eink`)
`SYSTEM_PROMPT`	Maya persona	System prompt sent to the LLM

Environment Variables

These settings can be overridden via environment variables:

OLLAMA_BASE_URL
LLM_MODEL
CAMERA_ENABLED (set to "true" / "false")
DISPLAY_MODE

15. Module Deep Dive

`agent/brain.py` — AgentBrain

The central orchestrator. Initializes all subsystems and exposes:

check_systems() → dict of subsystem health checks
process(user_input, face_emotion, stream) → runs the full 5-step pipeline (sentiment → memory retrieval → emotion update → LLM generation → memory storage)
Maintains a rolling conversation history (last 4 messages sent to LLM)
Builds a dynamic system prompt that includes Maya's persona, current user mood, emotional trend guidance, and retrieved memory context

`agent/llm.py` — LLMClient

Communicates with Ollama's REST API:

is_available() → checks if Ollama is running and the model is loaded (via /api/tags)
generate(prompt, system) → single-shot generation via /api/generate (streaming internally)
chat(messages, stream_output) → chat-style generation via /api/chat. When stream_output=True, returns a generator that yields tokens one by one for SSE streaming.
Handles connection errors, timeouts, and server unavailability gracefully with error messages.

`agent/sentiment.py` — SentimentAnalyzer

Wraps VADER for conversational sentiment analysis:

analyze(text) → returns SentimentResult(label, compound, intensity, scores)
Labels: positive (compound ≥ 0.05), negative (compound ≤ -0.05), neutral
Intensity = absolute value of compound score (0.0–1.0)
Zero dependencies beyond vaderSentiment, instant CPU execution

`agent/emotion.py` — EmotionEngine

Maintains session-level emotional state:

update(sentiment, face_emotion, retrieved_memories) → returns MentalState
Tracks a sliding window of 10 sentiment scores and emotion labels
Resolves dominant emotion (face > text mapping)
Computes emotional trend (improving / declining / stable)
Adjusts for long-term memory patterns (negative memory ratio)

`agent/memory.py` — ConversationMemory

ChromaDB-backed long-term memory with RAG:

store(MemoryEntry) → stores a conversation turn with full metadata
retrieve(query, top_k) → semantic similarity search, returns list[RetrievedMemory]
Uses cosine distance in HNSW index
Each entry stores: user message, assistant response, sentiment label/score, emotion, timestamp
Documents are formatted as "The user said: ...\nMaya (the AI assistant) responded: ..." for embedding (prevents role confusion)

`agent/exercises.py` — ExerciseManager

Manages guided mental exercises for stress relief:

ExerciseManager — tracks exercise state and cooldown periods
should_offer_exercise(current_turn, cooldown_turns) → checks if enough time has passed since last offer
get_random_exercise() → selects a random exercise, avoiding repetition
mark_exercise_offered(turn) → records when an exercise was offered
format_exercise_offer() → generates the opt-in offer message
format_exercise_steps(exercise) → formats exercise steps into a single message
Contains 7 pre-built exercises: Box Breathing, Calming Breath, 5-4-3-2-1 Grounding, Quick Gratitude, Body Scan, Present Moment, Tension Release
All exercises complete in under 30 seconds

`interface/camera.py` — BaseCamera / WebcamCamera

Hardware abstraction for camera + emotion detection:

BaseCamera — abstract base class defining the interface
WebcamCamera — implementation using OpenCV + FER
capture_emotion() → capture frame, detect dominant emotion, return label or None
capture_frame() → return raw OpenCV frame
capture_snapshot_with_overlay() → capture frame, detect emotion, draw bounding box + label, return (JPEG bytes, emotion label)
is_available() → check if camera is accessible
release() → release camera resources
FER initialization is optional — if TensorFlow/FER is unavailable, the camera still works for frame capture without emotion detection

`interface/display.py` — BaseDisplay / TerminalDisplay

Hardware abstraction for output rendering:

BaseDisplay — abstract base class
TerminalDisplay — rich terminal output with text wrapping
show_message(sender, message) — formatted chat output
show_welcome() — welcome banner
show_emotion(emotion) — emoji-mapped emotion display
show_status(status) — system status messages
clear() — clear terminal screen (platform-aware: cls on Windows, clear on Linux)
Designed to be replaceable with an E-Ink display implementation

`web_app.py` — Flask Web Application

REST API + SSE streaming server:

GET / — serves the chat interface (templates/index.html)
GET /api/status — returns system health (LLM, memory, camera)
POST /api/chat — synchronous chat endpoint (returns full response)
POST /api/chat_stream — SSE streaming chat endpoint (yields tokens)
GET /api/camera/snapshot — returns base64 JPEG with emotion overlay
GET /api/camera/emotion — returns detected emotion label only
POST /api/reset — resets conversation history

`templates/index.html` — Web Chat Interface

Single-page application with:

Glassmorphism UI — frosted glass panels with gradient background
Chat panel — message bubbles with avatars, typing indicator, auto-scroll
Emotion sidebar — live camera feed, emotion emoji display, status indicators
SSE streaming — reads token-by-token from /api/chat_stream using ReadableStream
Emotion polling — polls /api/camera/snapshot every 2.5 seconds for live emotion updates
Responsive — adapts to mobile screens with stacked layout
Font — Google Quicksand for a friendly, approachable feel

16. API Reference

The Flask web application exposes the following REST API endpoints:

Method	Endpoint	Description	Request Body	Response
`GET`	`/`	Serve the chat interface	—	HTML page
`GET`	`/api/status`	System health check	—	`{status, camera_enabled, model}`
`POST`	`/api/chat`	Synchronous chat	`{message, capture_emotion}`	`{response, face_emotion, turn_count}`
`POST`	`/api/chat_stream`	SSE streaming chat	`{message, capture_emotion}`	SSE stream of tokens
`GET`	`/api/camera/snapshot`	Camera frame + emotion	—	`{image: base64, emotion}`
`GET`	`/api/camera/emotion`	Detect emotion only	—	`{emotion, success}`
`POST`	`/api/reset`	Reset conversation	—	`{success: true}`
`POST`	`/api/trigger_exercise`	Force exercise offer	—	`{success, exercises}`
`GET`	`/api/exercises`	List all exercises	—	`{exercises: [...]}`
`POST`	`/api/exercise/start`	Start an exercise	`{name}`	`{success, exercise}`
`POST`	`/api/exercise/skip`	Skip current exercise	—	`{success: true}`

SSE Stream Event Types

The /api/chat_stream endpoint emits these Server-Sent Events:

Event Type	Payload	Description
`emotion`	`{type: "emotion", emotion: "happy"}`	Detected facial emotion (sent first)
`token`	`{type: "token", token: "Hello"}`	Single token from LLM response
`exercise_offer`	`{type: "exercise_offer", exercises: [...]}`	Stress detected, offer exercises
`done`	`{type: "done"}`	Response complete
`error`	`{type: "error", error: "..."}`	Error occurred

17. Utility Scripts

`benchmark_models.py`

Comprehensive LLM benchmark that tests all locally installed Ollama models against wellbeing conversation prompts. Measures latency, TTFT, tokens/second, and empathy quality scores. Results saved to benchmark_results.json.

python benchmark_models.py

`test_camera.py`

Diagnostic script that tests the full camera pipeline in 4 steps:

Webcam access (OpenCV)
FER library import
FER detector initialization
Live emotion detection with confidence scores

python test_camera.py

`patch_fer.py`

Patches the FER library's classes.py to make the moviepy import optional. This fixes the "No module named 'moviepy.editor'" error that occurs on Raspberry Pi since moviepy is not needed for emotion detection.

python patch_fer.py

`reset_memory.py`

Deletes all stored conversations from ChromaDB. Asks for confirmation before proceeding.

python reset_memory.py

`view_memory.py`

Displays all stored conversations with timestamps, sentiment labels, emotions, and message previews.

python view_memory.py

18. Troubleshooting

Ollama not running

[!] Ollama is not running or model not found.

Fix:

ollama serve &          # Start Ollama server
ollama pull phi3:mini   # Download the model

Camera not detected

⚠️ Camera enabled but not available

Fix (RPi Camera Module):

sudo raspi-config       # Interface Options → Camera → Enable
sudo reboot

Fix (USB Webcam):

ls /dev/video*          # Check available camera devices
# If your camera is at /dev/video1, set CAMERA_INDEX=1 in config/config.py

FER moviepy import error

No module named 'moviepy.editor'