Why a 94% Citation Hallucination Rate in Grok-3 Forced a Rethink of Factuality Benchmarks
Grok-3 showed a 94% citation hallucination rate while the FACTS benchmark reported a score of 68.8: hard numbers that changed production risk estimates. The data suggests the situation was worse than the vendor materials implied.