Architecture · ingest / transform / serve

Data Pipeline

FastAPI + APScheduler + shared httpx pool + SQLite (WAL, FTS5) + local Ollama running gemma3:1b. 44 RSS sources are aggregated, parsed, and cached for 10 min.

Stages

1. Ingest

Sources include Reuters · BBC · AP · Al Jazeera · ISW · ReliefWeb · Defense News · USGS · GDACS · CoinGecko · Open-Meteo · CISA · Feodo · SSLBL · URLHaus · NWS · NHTSA · TheSpaceDevs · HN · Bluesky. All 44 RSS sources are fetched concurrently via asyncio.gather.
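A minimal sketch of the concurrent fetch, assuming each source is just a name mapped to a URL. The `SOURCES` table, the `ingest_all` helper, and the offline `fake_fetch` stand-in are hypothetical names for illustration, not the pipeline's own code. It shows how `asyncio.gather` with `return_exceptions=True` lets one failing feed be recorded without cancelling the others.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# Hypothetical source table; the real pipeline lists 44 feeds.
SOURCES = {
    "reuters": "https://example.invalid/reuters.xml",
    "bbc": "https://example.invalid/bbc.xml",
    "usgs": "https://example.invalid/usgs.xml",
}

Fetcher = Callable[[str], Awaitable[str]]


async def ingest_all(
    sources: Dict[str, str], fetch: Fetcher
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Fetch every source concurrently; return (bodies, errors) keyed by name."""
    names = list(sources)
    # return_exceptions=True keeps one failing feed from cancelling the rest.
    results = await asyncio.gather(
        *(fetch(sources[name]) for name in names),
        return_exceptions=True,
    )
    bodies: Dict[str, str] = {}
    errors: Dict[str, Exception] = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            errors[name] = result
        else:
            bodies[name] = result
    return bodies, errors


async def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP GET so the sketch runs offline."""
    await asyncio.sleep(0)
    if "usgs" in url:
        raise TimeoutError(f"timed out: {url}")
    return f"<rss><!-- {url} --></rss>"


if __name__ == "__main__":
    ok, failed = asyncio.run(ingest_all(SOURCES, fake_fetch))
    print(sorted(ok), sorted(failed))
```

Passing the fetcher in as a parameter keeps the fan-out logic testable without network access; in a real deployment it would presumably wrap the shared HTTP client.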

2. Connection pool

Process-wide httpx.AsyncClient configured with httpx.Limits(max_connections=100, max_keepalive_connections=40). 14 modules share the pool.

3. Cache

10-min TTL per category. SQLite WAL with FTS5 for full-text queries.
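The two pieces of this stage can be sketched with the standard library alone. The `CategoryCache` class, the `items` table, and its columns are hypothetical; only the 10-minute TTL, WAL mode, and FTS5 come from the text. The sketch assumes the local SQLite build has FTS5 compiled in, which is common but not guaranteed.

```python
import sqlite3
import time

TTL_SECONDS = 10 * 60  # 10-minute TTL from the text


class CategoryCache:
    """In-memory cache holding one payload per category until its TTL lapses."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # category -> (expires_at, payload)

    def get(self, category):
        entry = self._entries.get(category)
        if entry is None or entry[0] <= self._clock():
            return None  # missing or expired
        return entry[1]

    def put(self, category, payload):
        self._entries[category] = (self._clock() + self._ttl, payload)


def open_store(path):
    """Open a SQLite file in WAL mode with an FTS5 table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS items "
        "USING fts5(category, title, body)"
    )
    return conn


def search(conn, query, limit=10):
    """Full-text search over all indexed columns, best matches first."""
    rows = conn.execute(
        "SELECT category, title FROM items "
        "WHERE items MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

WAL mode only applies to file-backed databases, so `open_store` should be given a real path instead of `:memory:`.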

4. Synthesize

One gemma3:1b prompt per category, run as a non-blocking background task. Synthesized output is cached for 30 min.
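A sketch of the non-blocking pattern, under stated assumptions: the `SynthesisCache` class is hypothetical, the model call is injected as an async callable (in the pipeline it would presumably wrap the local gemma3:1b call), and returning the stale summary while a refresh runs is a design choice made here, not something the text confirms.

```python
import asyncio
import time

SYNTH_TTL = 30 * 60  # 30-minute cache on synthesized output, from the text


class SynthesisCache:
    """Serve cached summaries immediately; regenerate them in the background."""

    def __init__(self, synthesize, ttl=SYNTH_TTL, clock=time.monotonic):
        self._synthesize = synthesize  # async callable: category -> summary text
        self._ttl = ttl
        self._clock = clock
        self._summaries = {}  # category -> (expires_at, text)
        self._inflight = {}   # category -> asyncio.Task

    def get(self, category):
        """Return the cached text (or None) without waiting on the model."""
        entry = self._summaries.get(category)
        fresh = entry is not None and entry[0] > self._clock()
        if not fresh and category not in self._inflight:
            # Schedule at most one refresh per category; do not await it here.
            task = asyncio.create_task(self._refresh(category))
            self._inflight[category] = task
        return entry[1] if entry else None

    async def _refresh(self, category):
        try:
            text = await self._synthesize(category)
            self._summaries[category] = (self._clock() + self._ttl, text)
        finally:
            self._inflight.pop(category, None)

    async def wait_idle(self):
        """Helper for tests and shutdown: wait for in-flight refreshes."""
        if self._inflight:
            await asyncio.gather(*list(self._inflight.values()))
```

`get` must be called from inside a running event loop, since it uses `asyncio.create_task`. A request handler would call it and return whatever is cached, so a slow model never delays the response.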

5. Serve

FastAPI behind Caddy (port 9443), behind a Cloudflare Tunnel. Stable URL: api.thatcomputerguy26.org.

6. Schedule

APScheduler in-process for warm pulls. Windows Scheduled Tasks for durable cron (survives reboots).

Routes

435 FastAPI routes. Public read endpoints documented at /api. Admin endpoints behind /admin/* with separate auth.

Data warehouses
