Feed Summarizer

Funny story: This was mostly a vibe-coded project that got out of hand. It actually started as a Node-RED flow for personal use, then morphed into a Python script, and I thought it would both help me save time reading news in the mornings and make for a great demo of spec-driven development.

As a direct outcome of my swearing at various LLMs, it became this, which is, in a mouthful, an asyncio-based background service that fetches multiple RSS/Atom (and, optionally, Mastodon) sources, stores raw items in SQLite, generates AI summaries with Azure OpenAI, groups them into bulletins, and publishes both HTML and RSS outputs, optionally uploading them to Azure Blob Storage static website hosting.

And the end-user experience, for me (using Reeder Classic), is this: bulletins neatly organized by topic, with concise summaries and links to the original articles, all in a clean, readable format I can peruse over breakfast.

Overview

The pipeline is designed for efficiency (conditional fetching, batching, backoff) and the output is tailored to my reading habits (three "bulletins" per day that group items by topic, each bulletin published as both HTML and an RSS entry).
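To make "conditional fetching" and "backoff" concrete, here is a minimal sketch of the two ideas using only the standard library. It is not the project's actual code: `FeedState`, `conditional_headers`, `backoff_seconds`, and the default delays are hypothetical names and values chosen for illustration.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class FeedState:
    """Hypothetical per-feed cache record; the real schema may differ."""
    etag: str | None = None
    last_modified: str | None = None
    consecutive_errors: int = 0


def conditional_headers(state: FeedState) -> dict[str, str]:
    """Build HTTP validators so an unchanged feed can answer 304 Not Modified."""
    headers: dict[str, str] = {}
    if state.etag:
        headers["If-None-Match"] = state.etag
    if state.last_modified:
        headers["If-Modified-Since"] = state.last_modified
    return headers


def backoff_seconds(errors: int, base: float = 60.0,
                    cap: float = 3600.0, jitter: float = 0.1) -> float:
    """Exponential backoff (assumed defaults) with a cap and random jitter."""
    if errors <= 0:
        return 0.0
    delay = min(cap, base * 2 ** (errors - 1))
    return delay * (1 + random.uniform(-jitter, jitter))
```

A fetcher would send these headers with each request, treat a 304 response as "nothing new", and wait `backoff_seconds(state.consecutive_errors)` before retrying a feed that keeps failing.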

Most of the implementation started as a vibe-coded prototype, with some manual tweaking here and there, but it now has extensive error handling, logging, and observability hooks for Azure Application Insights via OpenTelemetry. It publishes its results to Azure Blob Storage because there is no way I am letting this thing run a web server.

It is also deployable as a Docker Swarm service using kata, a private helper tool I use for my own infrastructure.

Quickstart (5 commands)

python -m venv .venv              # 1. Create virtualenv
source .venv/bin/activate         # 2. Activate it
pip install -r requirements.txt   # 3. Install dependencies
cp feeds.yaml.example feeds.yaml  # 4. Seed a starter config (edit it)
python main.py run                # 5. One full pipeline run (fetch→summarize→publish→upload*)

(*) Azure upload happens only if storage env vars are set; otherwise it is skipped automatically.
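When the upload does run, the feature list says unchanged files are skipped by comparing MD5 digests. The sketch below shows one way that comparison could work, assuming the remote digests have already been collected into a plain dictionary of hex strings. `md5_hex`, `plan_upload`, and the `remote_md5` mapping are hypothetical, and no Azure SDK calls are shown.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def md5_hex(path: Path) -> str:
    """Hash a file in chunks so large outputs are not read into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_upload(local_dir: Path, remote_md5: dict[str, str]) -> list[str]:
    """Return relative paths that are missing remotely or whose digest differs."""
    changed: list[str] = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(local_dir).as_posix()
            if remote_md5.get(rel) != md5_hex(path):
                changed.append(rel)
    return changed
```

Only the paths returned by `plan_upload` would need to be sent; everything else is already up to date on the remote side.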

Features

Concurrent conditional feed fetching (ETag / Last-Modified; respectful backoff & error tracking)
Optional reader mode & GitHub README enrichment for richer summarization context
AI summarization with per‑group introductions (opt‑in) via Azure OpenAI
Topic/group bulletins rendered as responsive HTML + RSS 2.0 feeds
SimHash-powered dedupe (optional BM25/FTS5 fallback) merges near-duplicate summaries and surfaces every source link
Optional passthrough (raw) feeds with minimal processing
Smart time‑based scheduling (timezone aware) plus interval overrides
Azure Blob Storage upload with MD5 de‑dup (skip unchanged) & optional sync delete
Graceful shutdown with executor timeouts and robust logging
Config hot‑reload for feeds; caching of YAML & prompt data
Observability hooks via OpenTelemetry (HTTP, DB, custom spans)
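As an illustration of the SimHash-based dedupe mentioned in the list above, here is a minimal, self-contained sketch of the general technique. It is not the project's implementation: the tokenizer, the 64-bit fingerprint size, the BLAKE2b token hash, and the `max_distance` threshold of 3 are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
import re

BITS = 64  # fingerprint width (assumed)


def simhash(text: str) -> int:
    """Compute a SimHash fingerprint from lowercase word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        raw = hashlib.blake2b(token.encode("utf-8"), digest_size=BITS // 8).digest()
        h = int.from_bytes(raw, "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Treat two texts as near-duplicates if their fingerprints are close."""
    return hamming(simhash(a), simhash(b)) <= max_distance
```

Summaries whose fingerprints fall within the threshold could then be merged into a single entry that lists every source link. The right threshold depends on summary length and would need tuning against real data.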

Documentation

Long-form documentation is in the docs/ folder:

CONFIGURATION (env vars, secrets, scheduling)
RUNNING (CLI modes, flags, scheduling)
PUBLISHING (output paths, Azure upload)
TELEMETRY (OpenTelemetry + Azure exporter)
TROUBLESHOOTING (common symptoms and fixes)
ARCHITECTURE (module map and pipeline flow)
MERGE_TUNING (dedupe/merge behavior and diagnostics)
RETENTION (age window & retention controls)
SPEC (long-form architecture/runtime spec)

Contributions & License

See LICENSE (MIT) for licensing details. Contribution guidelines and a code of conduct will be documented in CONTRIBUTING.md and CODE_OF_CONDUCT.md as the project evolves. The process for security reports will be defined in SECURITY.md.

Attribution

Some components and refactoring work were assisted by AI tooling; all code is reviewed for clarity and maintainability.

