AI detection startup GPTZero scanned all 4,841 papers accepted to the prestigious Neural Information Processing Systems (NeurIPS) conference held in San Diego last month. It found 100 hallucinated citations across 51 papers and confirmed them to be fake, the company told TechCrunch.
Having a paper accepted at NeurIPS is a résumé-worthy accomplishment in the world of AI. And given that the authors are leading experts in AI research, one might assume they would hand off the tediously boring task of writing up citations to an LLM.
To be fair, this finding calls for some perspective. The 100 hallucinated citations identified across 51 papers are a vanishingly small share of the total: each paper contains dozens of citations, so out of tens of thousands of references, this rounds to roughly zero.
It is also important to note that inaccurate citations do not necessarily invalidate a paper’s research. As NeurIPS told Fortune, which first reported on GPTZero’s findings, “even if 1.1% of papers have one or more incorrect references due to the use of LLMs, this does not necessarily invalidate the content of the paper itself.”
But that being said, fabricated citations aren’t meaningless either. NeurIPS prides itself on “rigorous academic publishing on machine learning and artificial intelligence,” the conference says. Each paper is peer reviewed by multiple reviewers who are instructed to flag hallucinations.
Citations are also a kind of currency for researchers, serving as career markers of how influential a researcher’s work is among peers. When AI fabricates them, their value is diluted.
Given the sheer volume involved, the reviewers can hardly be blamed for missing some of the AI-fabricated citations, a point GPTZero is quick to make. The purpose of the exercise was to provide concrete data on how AI slop, arriving in a “submission tsunami,” has “squeezed the review pipelines of these conferences to breaking point,” the startup said in its report. GPTZero also cites a May 2025 paper, “The Peer Review Crisis at AI Conferences,” which discussed this issue at premier conferences including NeurIPS.
So why couldn’t the researchers themselves fact-check the LLM’s output? Surely they know which sources they actually used in their work?
What this whole episode really points to is a big, ironic conclusion: if the world’s leading AI experts can’t guarantee that their LLM use is accurate in every detail, even with their reputations at stake, what does that mean for the rest of us?
