On dealing with GPT results, or, Pots, Kettles And Hallucinations by Eduardo Bellani
Hallucinations in GPTs can lead to the dissemination of false information, creating harmful outcomes in applications of critical decision making or leading to mistrust in AI. In a viral instance, The New York Times published an article about a lawyer who used ChatGPT to produce case citations without realizing they were fictional, or hallucinated. This incident highlights the danger of hallucinations in LLM-based queries; often the hallucinations are subtle and go easily unnoticed. Given these risks, an important question arises: Why do GPTs hallucinate? (Waldo and Boussard 2024)
Given the above, how can one safely use the results of GPTs? As
(Waldo and Boussard 2024) put it, a GPT generates the most common response
from its training corpus, reflecting the current linguistic
consensus. Where there is such consensus, GPTs appear accurate; where
there is controversy or little data, hallucination follows.
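To make that mechanism concrete, here is a deliberately crude sketch in Python: a bigram model that only replays the most frequent continuation found in its training corpus. The corpus and the claims it "learns" are invented for illustration, and a real transformer is vastly more sophisticated, but the consensus point survives the simplification.

```python
from collections import Counter, defaultdict

# Toy training corpus: one claim repeated often (consensus),
# one claim that appears a single time.
corpus = (
    "the sky is blue . the sky is blue . the sky is vast . "
    "bacon founded modern science ."
).split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def most_common_continuation(word: str) -> str:
    """Return the continuation seen most often in training."""
    return counts[word].most_common(1)[0][0]

# Where the corpus agrees, the output looks accurate:
print(most_common_continuation("is"))       # 'blue' (seen twice vs once)

# A claim seen only once is reproduced just as confidently,
# whether or not it happens to be true:
print(most_common_continuation("founded"))  # 'modern'
```

The model has no notion of truth, only of frequency; that is the sense in which consensus, not correctness, drives the output.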
Therein lies the crux of the matter: most people would take the
ideas represented in an LLM's training corpus
as true,
since they are (presumably) the most popular ideas around. So, someone
would already need an apprehension of the actual truth to be able to
trust the output of the LLM.
An interesting turn is that one can take (Waldo and Boussard 2024) itself as an example. Here is what it says about Epistemic Trust:
We tend to forget how recent the current mechanisms are for establishing trust in a claim. The notion that science is an activity based on experience and experiment can be traced back to Francis Bacon in the 17th century; the idea that we can use logic and mathematics to derive new knowledge from base principles can be traced to about the same period to René Descartes. This approach of using logic and experiment is a hallmark of the Renaissance. Prior to that time, trust was established by reference to ancient authorities (such as Aristotle or Plato) or from religion.
What has emerged over the past number of centuries is the set of practices that are lumped together as science, which has as its gold standard the process of experimentation, publication, and peer review. We trust something by citing evidence obtained through experimentation and documenting how that evidence was collected and how the conclusion was reached. Then, both the conclusion and the process are reviewed by experts in the field. Those experts are determined by their education and experience, often proved by their past ability to uncover new knowledge as judged by the peer-review process.
The first paragraph is an example of something that is clearly part of the current intellectual landscape, and very likely what would be generated by an LLM. As a matter of fact, here is what ChatGPT-5 replied when fed this text:
Your paragraph is mostly correct in its broad strokes…
Given that I have some knowledge of this subject, I can tell that the paragraph is wrong:
- Bacon did not establish science as an activity of experience and experiment. What he did was to deny the validity of form, finality, and ultimately metaphysics in the universe. Given that his work itself argues for a form (sense perception), a finality (man), and a metaphysics (sense perception as the only ground for truth), he is a self-refuting ideologue.
- Descartes did something similar, denying the validity of anything that was not quantity (including sense perception, ironically). His is a self-refuting position for the same reason.
Here is what the educated man thought science was before Bacon and Descartes:
The method is the path that intelligence must follow for the acquisition of science. It has three elements: a starting point, a process, and an end.
- The starting point cannot be universal doubt, but must rest in immediately evident truths (necessary regarding its rational part, contingent regarding its experimental part).
- The process is the method properly speaking, being analytical or synthetic, in accordance with its direction: from/to the objects that have more/less comprehension.
- The end is science, which is a system of correct knowledge, relative to the cause of beings and ultimately deduced through demonstration. Such science can be rational or experimental, and is structured in a dependency hierarchy. (Sinibaldi 2021)
Conclusion and advice
GPT queries are useful only when you can verify their results. When that is possible, GPTs become a viable exploration and templating system.
So, the practical advice boils down to this: don’t rely on a GPT in areas you are not already well versed in.
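For concreteness, here is a minimal sketch of that advice applied to the citation incident mentioned at the start. `ask_gpt` and `lookup_case` are hypothetical placeholders, not real APIs; the point is only the shape of the workflow: treat the GPT's output as a draft, and keep nothing that an independent, trusted source will not confirm.

```python
def ask_gpt(prompt: str) -> list[str]:
    """Placeholder for an LLM call returning a list of case citations."""
    raise NotImplementedError("wire up your LLM client here")

def lookup_case(citation: str) -> bool:
    """Placeholder: True only if a trusted legal database confirms the case."""
    raise NotImplementedError("wire up a trusted source here")

def cited_cases(question: str) -> list[str]:
    """Keep only GPT-suggested citations that an independent source confirms."""
    suggestions = ask_gpt(question)
    verified = [c for c in suggestions if lookup_case(c)]
    dropped = set(suggestions) - set(verified)
    if dropped:
        print(f"discarded {len(dropped)} unverifiable citation(s): {dropped}")
    return verified
```

Note that the verification step presupposes a source of truth outside the GPT, which is exactly the point of this article: the check is only as good as your independent grasp of the subject.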
Given how this advice applies to (Waldo and Boussard 2024) itself, where the authors ventured into a branch of knowledge in which they did not seem to hold expertise beyond the current popular-science milieu, the following image seems apropos.

Figure 1: Charles H. Bennett’s coloured engraving from Shadow and Substance (1860), a series based on popular sayings. In this case, a coal-man and chimney sweep stop to argue in the street in illustration of “The pot calling the kettle black”. A street light throws the shadow of the kitchen implements on the wall behind them.