AI and Theory of Mind: Will a Machine Understand a Scientist's Intentions?

When a researcher types a question about an enzyme's mechanism into an AI assistant, the model responds based on statistical patterns in billions of tokens of scientific text. But does it understand that the question stems from frustration after a failed experiment? That the researcher is seeking not just a definition but a foothold for a new hypothesis? This distinction has practical consequences for anyone designing AI systems to support research.

What Is Theory of Mind and Why AI Lacks It

Theory of Mind (ToM) is the ability to attribute mental states—beliefs, intentions, desires, knowledge—to others. Children develop it between the ages of three and five. It allows us to recognize that a colleague isn’t lying but simply lacks information.

Language models do not possess this ability in a mechanistic sense. Their architecture is based on predicting the next token from context, not on representing the internal states of the interlocutor. Research from 2023 and 2024 shows that LLMs can pass some classic ToM tests, such as the false-belief task, but they do so by leveraging patterns in training data, not through actual modeling of the interlocutor’s mind. When the test is even slightly rephrased, performance drops significantly.

For research work, this is the difference between a tool that answers a question literally and an assistant that understands the context behind the question.

What AI Actually Extracts from a Researcher’s Question

Even without Theory of Mind, models have real utility in interpreting intent. It’s worth separating what works from what doesn’t.

What works consistently. The model infers intent from lexical and structural signals: keywords, grammatical mood, question length, and prior exchanges in the context window. If a researcher writes, “what mechanisms explain X,” the model recognizes an explanatory question and responds differently than to “how to measure X.” For well-defined intents (literature review, synthesis, methodology comparison), accuracy is high.

What doesn’t work. The model does not read emotions, frustration levels, hidden assumptions, or project history unless the researcher explicitly includes them in the prompt. It doesn’t know the researcher just rejected a third hypothesis in a row and is looking for something else. It doesn’t model that the question comes from a domain where the researcher is an expert and that an introductory-level answer will be useless.

At Cashcrown, when designing RAG assistants for R&D clients, we observe this pattern regularly: the model answers the literal question precisely but misses the researcher’s intent when it isn’t explicitly stated in the prompt.

Consequences for Designing Research Systems

The absence of ToM in AI doesn’t eliminate its usefulness in science. It does, however, change the architecture of the system we aim to build.

System element	Design that ignores the lack of ToM	Design that accounts for the lack of ToM
Question Interface	Single text field	Structured form: goal, context, constraints
Result Interpretation	Model interprets alone	Researcher verifies intent accuracy before use
Iteration	Single answer	Multiple interpretation variants to choose from
Documentation	None	Prompt and intent logs as part of research documentation

When designing a system to support scientific work, it’s worth embedding a mechanism where the model explicitly paraphrases the intent it understood before providing an answer. This simple pattern narrows the gap between what the researcher wanted to know and what the model treated as the question, because a misreading surfaces before any answer is built on it.
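The paraphrase-before-answer pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: `ask_model` stands in for whatever LLM client the system uses, and `IntentCheck`, `PARAPHRASE_TEMPLATE`, `paraphrase_first`, `answer_if_confirmed`, and `stub_model` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical prompt wording; adjust to the model and domain in use.
PARAPHRASE_TEMPLATE = (
    "Restate in one sentence what the researcher wants to learn and why. "
    "Do not answer the question yet.\n\nQuestion: {question}"
)


@dataclass
class IntentCheck:
    question: str
    paraphrase: str
    confirmed: bool = False  # set to True by the researcher, never by the model


def paraphrase_first(question: str, ask_model: Callable[[str], str]) -> IntentCheck:
    """Step 1: ask the model only for its reading of the question."""
    prompt = PARAPHRASE_TEMPLATE.format(question=question)
    return IntentCheck(question=question, paraphrase=ask_model(prompt).strip())


def answer_if_confirmed(
    check: IntentCheck, ask_model: Callable[[str], str]
) -> Optional[str]:
    """Step 2: answer only after the researcher has confirmed the paraphrase."""
    if not check.confirmed:
        return None
    prompt = (
        f"Question: {check.question}\n"
        f"Confirmed intent: {check.paraphrase}\n"
        "Answer the question with this intent in mind."
    )
    return ask_model(prompt)


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    if prompt.startswith("Restate"):
        return "You want mechanisms that explain X. "
    return "ANSWER: " + prompt.splitlines()[1]
```

In use, the researcher reads `paraphrase`, corrects it if the model misread the question, and only then sets `confirmed`. Until that happens, `answer_if_confirmed` returns `None` and no answer is generated.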

Intent Hallucinations: When the Model “Guesses” Too Much

Hallucinations in the context of Theory of Mind take a particular form. Instead of responding, “I don’t know what you mean,” the model sometimes constructs an answer based on an assumed intent that wasn’t present in the question.

Example from practice: A researcher asks about an “RNA isolation protocol from FFPE samples.” The model responds with standard protocols but doesn’t signal that for FFPE samples older than three years, a specialized fragmentation repair procedure exists. Because it doesn’t know the researcher works with old archival tissues, the model silently assumes typical samples and omits critical information. The cause is missing context, not a flaw in the protocols it cites.

Explainability in such systems means not only explaining where the answer comes from but also explicitly signaling what assumptions the model made about the questioner’s intent. Without this, the researcher doesn’t know what the answer doesn’t cover.
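One way to make those assumptions visible is to carry them alongside the answer and print them with it. The sketch below is illustrative only: the field names in `EXPECTED_CONTEXT` and their default wordings are hypothetical, chosen to mirror the FFPE example, and `AnswerWithAssumptions` and `unstated_assumptions` are names introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical context fields for a sample-preparation question, each paired
# with the default the system falls back on when the field is missing.
EXPECTED_CONTEXT: Dict[str, str] = {
    "sample_age": "samples are recent, so a standard protocol applies",
    "downstream_use": "no specific downstream application was given",
}


@dataclass
class AnswerWithAssumptions:
    answer: str
    assumptions: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the answer followed by the assumptions it rests on."""
        if not self.assumptions:
            return self.answer
        lines = [self.answer, "", "Assumptions made about your question:"]
        lines.extend(f"- {item}" for item in self.assumptions)
        return "\n".join(lines)


def unstated_assumptions(provided: Dict[str, str]) -> List[str]:
    """List the default assumed for every expected field the researcher left out."""
    return [
        f"{name} not stated: assuming {default}"
        for name, default in EXPECTED_CONTEXT.items()
        if not provided.get(name)
    ]


reply = AnswerWithAssumptions(
    answer="(model answer goes here)",
    assumptions=unstated_assumptions({"downstream_use": "RNA-seq"}),
)
print(reply.render())
```

Here the researcher stated the downstream use but not the sample age, so the rendered reply ends with a line saying that recent samples were assumed. That line is what tells the researcher the answer may not cover archival tissue.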

Human Oversight: Where the Human Must Step In

The lack of Theory of Mind directly translates into requirements for human oversight in research systems. This isn’t excessive caution but a consequence of the model’s architecture.

Three points where researcher intervention is mandatory:

First, verifying intent accuracy after the model’s initial response. The researcher assesses whether the model answered the question they asked or the one it interpreted. This takes a moment but eliminates hours of work in the wrong direction.

Second, hypothesis acceptance before experimentation. LLMs as hypothesis generators can propose dozens of candidates, but selecting one for experimentation requires biological, chemical, or domain-specific expert knowledge. The model doesn’t know which hypotheses are experimentally feasible with available lab resources.

Third, validation before inclusion in a manuscript. Every AI-generated claim that enters a publication must be verified by the researcher with a primary source, not the model’s output.

The human-gate pattern used in Cashcrown’s agents works on this principle: every irreversible action requires explicit confirmation. In research, the equivalent is protocol approval before running an experiment.
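A generic version of such a gate might look like the sketch below. It is not Cashcrown’s implementation, which the article does not show; `Action`, `ApprovalRequired`, `run_with_gate`, and `execute_stub` are hypothetical names, and `approve` is assumed to be any callable that asks a human and returns `True` or `False`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    description: str
    irreversible: bool


class ApprovalRequired(Exception):
    """Raised when an irreversible action has not been approved by a human."""


def run_with_gate(
    action: Action,
    execute: Callable[[Action], str],
    approve: Callable[[Action], bool],
) -> str:
    """Run reversible actions directly; ask a human before irreversible ones."""
    if action.irreversible and not approve(action):
        raise ApprovalRequired(f"Not approved: {action.description}")
    return execute(action)


def execute_stub(action: Action) -> str:
    """Stand-in for the real side effect, e.g. submitting a protocol."""
    return f"done: {action.description}"
```

Raising an exception rather than returning a flag is a deliberate choice in this sketch: an unapproved irreversible action cannot be mistaken for a completed one by calling code that forgets to check the result.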

What Changes When AI Better Interprets Context

Model development in 2025 and 2026 is moving toward extended context windows, better handling of multi-step instructions, and explicit parameterization of the user’s role in the system prompt. These changes bring models closer to useful intent interpretation, even without Theory of Mind in a cognitive sense.

Practical implications for research systems:

A system prompt describing the researcher’s role (domain, expertise level, project context) often improves answer accuracy more than switching to a different model.
Multi-step dialogue, where the model first paraphrases intent and waits for correction, works better than a single-step query.
Integration with a project knowledge management system (notes, previous experiments, rejected hypotheses) via retrieval-augmented generation (RAG) with semantic search partially compensates for the model’s lack of contextual memory about the project’s history.
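The role description and the project-notes retrieval can be combined in one prompt-building step. The sketch below is a toy under stated assumptions: it ranks notes by bag-of-words cosine similarity, which is a stand-in for the embedding-based semantic search a real RAG system would use, and `NOTES`, `retrieve`, and `build_prompt` are hypothetical names with made-up example notes.

```python
import math
import re
from collections import Counter
from typing import List


def _vectorize(text: str) -> Counter:
    """Lowercased word counts; a real system would use embeddings instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, notes: List[str], k: int = 2) -> List[str]:
    """Return up to k notes that share vocabulary with the question, best first."""
    q = _vectorize(question)
    scored = [(cosine(q, _vectorize(note)), note) for note in notes]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in scored[:k]]


def build_prompt(role: str, question: str, notes: List[str], k: int = 2) -> str:
    """Prepend the researcher's role and the retrieved notes to the question."""
    context = "\n".join(f"- {note}" for note in retrieve(question, notes, k))
    return (
        f"Researcher role: {role}\n\n"
        f"Project notes:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up project notes, for illustration only.
NOTES = [
    "Rejected hypothesis: kinase X phosphorylation drives variability",
    "Ordered new pipette tips",
    "Experiment 12: proliferation assay variability traced to plate edge effects",
]
QUESTION = "What explains variability in proliferation measurements?"
print(build_prompt("biochemist, kinase inhibitors, expert level", QUESTION, NOTES))
```

With these notes, the prompt includes the earlier experiment and the rejected hypothesis but not the unrelated purchasing note, so the model sees what has already been tried without the researcher retyping it.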

Each of these solutions reduces the intent gap without requiring the model to “understand” the scientist in a human sense. This is a realistic trajectory for 2026 and 2027, unlike speculation about machines possessing true Theory of Mind.

FAQ

Will LLMs ever pass classic Theory of Mind tests?

Some models already achieve high scores on standard ToM tests, such as the false-belief task. However, research shows this is an effect of memorizing patterns from training data, not true modeling of mental states. Minor test modifications can cause a significant drop in accuracy. Interpreting “passing the test” as evidence of possessing ToM is methodologically flawed.

How to design a prompt so the model better interprets the question’s intent?

The most effective approach is to explicitly describe the question’s context: domain, goal, constraints, and expected answer level. Instead of “what mechanisms explain X,” write, “I’m a biochemist working on kinase inhibitors, looking for mechanisms explaining variability in proliferation measurements, at the review article level.” The model answers the literal question, so the more literally the intent is described, the more accurate the response.
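The same idea can be enforced in the interface by collecting the context as separate fields and rendering them into the prompt. This is a minimal sketch; `ResearchQuestion` and its field names are hypothetical, taken from the elements listed above rather than from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResearchQuestion:
    question: str
    domain: str
    goal: str
    answer_level: str
    constraints: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Spell out the context literally, one labeled line per field."""
        lines = [
            f"Domain: {self.domain}",
            f"Goal: {self.goal}",
            f"Expected answer level: {self.answer_level}",
        ]
        if self.constraints:
            lines.append("Constraints: " + "; ".join(self.constraints))
        lines.append(f"Question: {self.question}")
        return "\n".join(lines)


example = ResearchQuestion(
    question="What mechanisms explain X?",
    domain="biochemistry, kinase inhibitors",
    goal="explain variability in proliferation measurements",
    answer_level="review article",
)
print(example.to_prompt())
```

Because every field except `constraints` is required, the form cannot be submitted as a bare question, which is the failure mode the free-text field invites.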

Is the lack of Theory of Mind in AI an ethical problem in research?

In the context of scientific research, it’s not a direct ethical problem but an architectural one. The model doesn’t interpret the researcher’s intent, meaning the researcher must explicitly verify whether the model’s answer matches the intended question. The ethical dimension arises when model outputs enter publications without this verification, violating reproducibility and scientific responsibility standards.

How does a RAG system help compensate for the lack of Theory of Mind?

Retrieval-augmented generation (RAG) uses semantic search to provide the model with project-specific context: previous experimental results, rejected hypotheses, researcher notes. This doesn’t eliminate the lack of ToM but reduces the most common cause of intent mismatch: the model didn’t know what the researcher had already tried. With a well-designed project knowledge index, the model has enough context to answer the researcher’s question, not a generic one.

How does the AI Act regulate AI systems interpreting user intent in research?

The AI Act doesn’t define a separate category for systems interpreting intent. AI systems used in research that affects medical, regulatory, or human safety decisions may fall under the high-risk requirements: registration, technical documentation, conformity assessment, and post-market monitoring. Systems supporting literature searches or preliminary synthesis, which don’t directly influence high-risk decisions, have lighter requirements.

Topics like explainability of AI systems, the role of humans in the decision loop, and AI autonomy in science are directly linked to the question of Theory of Mind. If you’re designing an AI-based research system, the readiness assessment tool will help identify architectural gaps before encountering intent issues in production.
