hatchmoment. scored by care · not by stars

ZINA

ZINA detects and edits hallucinated text in multimodal LLM outputs

ZINA addresses hallucinations in multimodal large language models (MLLMs) by locating erroneous spans, classifying each into one of six error types, and producing an edited caption. It provides a CLI that takes an image, a candidate caption, and a reference caption, and returns JSON containing the span tags and a corrected caption. The tool is aimed at researchers and developers who need fine-grained evaluation and correction of MLLM outputs. Compared with generic detectors, ZINA reports which spans are wrong and edits them automatically, which makes results easier to interpret and reuse downstream.
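
To make the span-tag-plus-edit idea concrete, here is a minimal Python sketch of how a consumer might apply span-level edits from a JSON result. The JSON shape, the field names (`caption`, `spans`, `start`, `end`, `error_type`, `replacement`), and the `example_type` label are all assumptions made for illustration. They are not taken from ZINA's documentation, and the actual output format may differ.

```python
import json

# Hypothetical result shape: every field name here is an assumption for
# illustration only, not ZINA's documented output format.
SAMPLE_OUTPUT = """
{
  "caption": "A brown dog sits on a red sofa.",
  "spans": [
    {"start": 2, "end": 11, "error_type": "example_type", "replacement": "black cat"},
    {"start": 22, "end": 25, "error_type": "example_type", "replacement": "blue"}
  ]
}
"""


def apply_span_edits(caption, spans):
    """Replace each [start, end) character span with its replacement text."""
    edited = caption
    # Apply edits from right to left so earlier offsets stay valid
    # even when a replacement changes the length of the string.
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        edited = edited[: span["start"]] + span["replacement"] + edited[span["end"]:]
    return edited


result = json.loads(SAMPLE_OUTPUT)
edited_caption = apply_span_edits(result["caption"], result["spans"])
print(edited_caption)
```

Applying the edits from the last span to the first avoids recomputing offsets after each replacement. This only holds when the spans do not overlap, which the sketch assumes.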

YuigaWada/ZINA