Validates LLM specs via cross-model PBT to catch intent drift
Addresses the problem of verifying LLM-generated formal specifications by comparing the specs produced by multiple LLMs using property-based testing (PBT). Disagreements between models expose defects such as overspecification or underspecification, which helps confirm that a spec matches the intended behavior. For AI researchers and security engineers, it is a faster and more accessible alternative to full formal proofs. Built for a hackathon, it reports high effectiveness on real-world benchmarks.
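To make the cross-model idea concrete, here is a minimal sketch of differential property-based testing between two specs. It is not the project's implementation: the names `spec_model_a`, `spec_model_b`, `random_case`, and `find_disagreement` are hypothetical, the "specs" are hand-written stand-ins for what two LLMs might produce for a sorting function, and a plain random generator from the standard library stands in for a real PBT library (which would also shrink counterexamples).

```python
import random
from typing import Callable, List, Optional, Tuple

# A spec is modeled here as a predicate over (input, candidate output).
Spec = Callable[[List[int], List[int]], bool]


def spec_model_a(xs: List[int], ys: List[int]) -> bool:
    """Hypothetical spec from model A: ys is ordered AND a permutation of xs."""
    ordered = all(a <= b for a, b in zip(ys, ys[1:]))
    return ordered and sorted(xs) == sorted(ys)


def spec_model_b(xs: List[int], ys: List[int]) -> bool:
    """Hypothetical spec from model B: checks ordering only (underspecified)."""
    return all(a <= b for a, b in zip(ys, ys[1:]))


def random_case(rng: random.Random) -> Tuple[List[int], List[int]]:
    """Generate an input and a candidate output that may be subtly wrong."""
    xs = [rng.randint(-5, 5) for _ in range(rng.randint(0, 6))]
    ys = sorted(xs)
    if ys and rng.random() < 0.5:
        ys = ys[:-1]  # drop an element: still ordered, no longer a permutation
    return xs, ys


def find_disagreement(
    spec1: Spec, spec2: Spec, trials: int = 500, seed: int = 0
) -> Optional[Tuple[List[int], List[int]]]:
    """Return the first generated case on which the two specs disagree, if any."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs, ys = random_case(rng)
        if spec1(xs, ys) != spec2(xs, ys):
            return xs, ys
    return None


if __name__ == "__main__":
    case = find_disagreement(spec_model_a, spec_model_b)
    print("counterexample:", case)
```

Any counterexample returned here is a case that the second spec accepts and the first rejects, which flags the second spec as the weaker (underspecified) one for that input. Deciding which spec actually matches the intent still requires a human or a reference implementation.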
View on GitHub → astral-fate/DiffSpec-PBT-Apart-Research