hatchmoment. scored by care · not by stars

mosaic

Local RAG indexer that guarantees every chunk is stored

notable · Python · 🧠 AI & ML

Mosaic extracts text (including via OCR), splits it into token-aware chunks, embeds them with a sentence-transformers model, and upserts them into Milvus (dense + BM25). After insertion it checks that the stored count matches the chunk count and fails loudly if any chunk is missing. It suits developers building retrieval-augmented generation pipelines who need local, API-free indexing. Coverage verification and hybrid retrieval are what distinguish it from typical RAG indexers.
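The coverage check described above can be illustrated with a minimal sketch. This is not Mosaic's code: `InMemoryStore`, `chunk_words`, `index_document`, and `CoverageError` are hypothetical names, whitespace splitting stands in for real token-aware chunking, and a dictionary stands in for Milvus. Only the compare-counts-then-fail step is the point.

```python
from dataclasses import dataclass, field


class CoverageError(RuntimeError):
    """Raised when the stored chunk count differs from the produced count."""


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a vector store (not the Milvus API)."""

    rows: dict = field(default_factory=dict)

    def upsert(self, chunk_id: str, text: str) -> None:
        # Insert or overwrite the row keyed by chunk_id.
        self.rows[chunk_id] = text

    def count(self, doc_id: str) -> int:
        # Count rows whose key belongs to the given document.
        prefix = doc_id + ":"
        return sum(1 for key in self.rows if key.startswith(prefix))


def chunk_words(text: str, max_tokens: int) -> list:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Assumption: words are a crude proxy for model tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def index_document(doc_id: str, text: str, store, max_tokens: int = 4) -> int:
    """Chunk, upsert, then verify that every chunk landed in the store."""
    chunks = chunk_words(text, max_tokens)
    for i, chunk in enumerate(chunks):
        store.upsert(f"{doc_id}:{i}", chunk)

    stored = store.count(doc_id)
    if stored != len(chunks):
        # Fail loudly instead of leaving a silently incomplete index.
        raise CoverageError(
            f"{doc_id}: expected {len(chunks)} chunks, store reports {stored}"
        )
    return stored


if __name__ == "__main__":
    store = InMemoryStore()
    n = index_document("doc", "a b c d e f g h i", store, max_tokens=4)
    print(f"indexed and verified {n} chunks")
```

In a real pipeline the count would come from the vector database after the write has been flushed, so the check catches chunks dropped anywhere between chunking and storage.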

docling · embeddings · llm · local-first · milvus · rag · retrieval-augmented-generation · vector-database

fazalrshah/mosaic