Retrieval-Augmented Generation (RAG) app that lets security analysts query public threat-intelligence data in natural language and get grounded, cited answers, along with structured extraction of ATT&CK techniques and indicators of compromise (IOCs).
Ingests two public threat-intelligence sources, MITRE ATT&CK Enterprise techniques and the CISA Known Exploited Vulnerabilities (KEV) catalog, then chunks, embeds, and stores them in a persistent ChromaDB vector store. The pipeline is idempotent: it skips documents that are already indexed, so re-running ingestion does not create duplicates.
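The skip-if-indexed behavior can be sketched as follows. This is a minimal illustration, not the project's actual code: a plain dict stands in for the ChromaDB collection, and `doc_id` and `ingest` are hypothetical helper names.

```python
import hashlib

def doc_id(source: str, key: str) -> str:
    """Derive a stable ID from the source name and the record's own key."""
    return hashlib.sha256(f"{source}:{key}".encode("utf-8")).hexdigest()[:16]

def ingest(store: dict, source: str, records: dict) -> int:
    """Add records not yet in the store; return how many were newly indexed."""
    added = 0
    for key, text in records.items():
        _id = doc_id(source, key)
        if _id in store:
            # Already indexed on an earlier run: skip it.
            continue
        # Assumption: a real pipeline would chunk and embed the text here.
        store[_id] = text
        added += 1
    return added

store = {}  # stand-in for a persistent vector store
kev = {"CVE-2021-44228": "Apache Log4j2 remote code execution vulnerability"}
first = ingest(store, "cisa-kev", kev)   # first run indexes the record
second = ingest(store, "cisa-kev", kev)  # second run finds it and skips
```

Because the ID is derived from the record rather than generated at random, a second run sees the same IDs and adds nothing. With ChromaDB, the membership check could plausibly be done by calling `collection.get(ids=[...])` and comparing the returned `ids` before adding.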
Uses sentence-transformers (all-MiniLM-L6-v2) for fully local embeddings, so retrieval makes no API calls, then passes the top-k relevant chunks to Claude, which is instructed to answer using only the supplied context to keep responses grounded and reduce hallucination.
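The retrieve-then-ground step can be sketched like this. It is an illustration under stated assumptions: toy 3-dimensional vectors stand in for the 384-dimensional all-MiniLM-L6-v2 embeddings, the prompt wording is invented, and `cosine`, `top_k`, and `build_prompt` are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Return the k chunks most similar to the query.

    Each chunk is a (chunk_id, text, vector) tuple.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return ranked[:k]

def build_prompt(question, hits):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"[{cid}] {text}" for cid, text, _ in hits)
    return (
        "Answer using only the context below and cite chunk IDs in brackets. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Toy vectors (assumption): real embeddings would come from the local model.
chunks = [
    ("T1059", "Command and Scripting Interpreter", [0.9, 0.1, 0.0]),
    ("T1566", "Phishing", [0.1, 0.9, 0.0]),
    ("T1486", "Data Encrypted for Impact", [0.0, 0.1, 0.9]),
]
hits = top_k([0.2, 0.8, 0.0], chunks, k=2)
prompt = build_prompt("How do attackers gain initial access?", hits)
```

Labeling each chunk with its ID in the context is what makes citations checkable: any ID the answer cites can be compared against the IDs that were actually supplied.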
Returns cited answers alongside structured extraction of ATT&CK technique IDs and IOCs via Claude tool-use, so analysts get both a readable explanation and machine-parseable indicators in a single query.
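A tool definition for the structured extraction might look like the sketch below. The `name`, `description`, and `input_schema` keys follow the tool format of the Anthropic Messages API; the tool name `record_indicators`, the field names, and the `clean_extraction` helper are assumptions made for illustration.

```python
import re

# Hypothetical tool definition, to be passed in the `tools` list of a request.
EXTRACT_TOOL = {
    "name": "record_indicators",
    "description": "Record ATT&CK technique IDs and IOCs found in the answer.",
    "input_schema": {
        "type": "object",
        "properties": {
            "technique_ids": {"type": "array", "items": {"type": "string"}},
            "iocs": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["technique_ids", "iocs"],
    },
}

# ATT&CK technique IDs look like T1059, with an optional sub-technique: T1059.001
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")

def clean_extraction(tool_input: dict) -> dict:
    """Drop malformed technique IDs and de-duplicate, preserving order."""
    ids = [t for t in tool_input.get("technique_ids", []) if TECHNIQUE_ID.match(t)]
    return {
        "technique_ids": list(dict.fromkeys(ids)),
        "iocs": list(dict.fromkeys(tool_input.get("iocs", []))),
    }

# Example input shaped like the `input` of a tool_use content block.
result = clean_extraction({
    "technique_ids": ["T1566.001", "T1059", "T1059", "phishing"],
    "iocs": ["198.51.100.7", "198.51.100.7"],
})
```

Validating the model's tool input before returning it is a cheap safeguard: the schema constrains the shape, and the regex catches values that are well-typed but not real technique IDs.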
Exposes the RAG loop two ways: a FastAPI service (POST /query, GET /health) that degrades gracefully with a 503 when the corpus isn't ingested or the API key is missing, and a command-line wrapper for quick local queries.
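The graceful-degradation logic can be sketched as a small readiness check. The function name `readiness`, its parameters, and the detail strings are assumptions; the sketch is framework-free so it runs on its own.

```python
from typing import Optional, Tuple

def readiness(chunk_count: int, api_key: Optional[str]) -> Tuple[int, str]:
    """Return an HTTP status code and a detail message for the service state."""
    if chunk_count == 0:
        # Nothing has been ingested, so there is nothing to retrieve from.
        return 503, "corpus not ingested"
    if not api_key:
        # Retrieval would work, but the generation step cannot run.
        return 503, "API key missing"
    return 200, "ok"

status, detail = readiness(chunk_count=0, api_key="placeholder-key")
```

In a FastAPI handler, a non-200 result from a check like this would typically be surfaced by raising `HTTPException(status_code=503, detail=detail)`, so clients get a clear, retryable error instead of an unhandled exception.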