Genomics today produces more data than any research team can manually review. Genome-wide association studies (GWAS) typically involve millions of variants across cohorts of hundreds of thousands of participants. In the social sciences, the scale is similar: digital media, administrative records, and interview transcripts produce corpora whose systematic manual analysis is practically impossible within a reasonable timeframe. The question is no longer "whether to use AI in research," but "at which points in the research process AI delivers real value, and where humans remain indispensable."
At Cashcrown, we work with companies implementing analytical systems on their own datasets. Below, we’ve compiled the patterns we see repeat across projects, without declarative claims about revolution and with a clear indication of where decisions still belong to the researcher.
What AI does well in genetics and biological research
The biggest advantage of AI models in genetics is their ability to process multidimensional data without requiring the researcher to impose structure beforehand.
Detecting patterns in genomic data. Models trained on sequencing data learn correlations between variants and phenotypic traits that would not be visible in classical regression analysis. DeepMind’s AlphaMissense predicted the pathogenicity of over 70 million possible missense variants, a task that would take decades using traditional methods. Important caveat: a correlation between a variant and a phenotype is not a causal mechanism. The researcher must assess the biological plausibility of each identified association before proceeding to experimentation.
Literature search and synthesis. LLMs with access to databases like PubMed, bioRxiv, or Europe PMC can generate a map of contradictions and gaps in the literature on a given topic within hours. A systematic review, which traditionally takes months, is reduced to a few days of preliminary selection. This does not eliminate expert assessment of study quality: the model does not know whether the methodology of a specific RCT was robust if it wasn’t described in the text.
Generating hypothesis candidates. A RAG system with a domain-specific corpus can identify combinations of factors that would be invisible in a human review (e.g., linking signaling pathways from different publications that together suggest a drug resistance mechanism). Not every such hypothesis is useful, but even if 5 out of 100 generated candidates prove valuable, the time savings are real.
Applications in social and behavioral sciences
In the social sciences, AI enters primarily through three channels: text analysis, pattern detection in behavioral data, and integration of heterogeneous sources.
Large-scale text analysis. Classifying statements, coding qualitative interviews, detecting themes in corpora of administrative documents: these are tasks where models can perform comparably to human coders in a fraction of the time. Psychology, sociology, and political science use this for analyzing media discourse, social sentiment, or the evolution of political narratives.
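Whether a model really codes text comparably to a human coder is something a team can measure on its own sample. A minimal sketch, assuming a small set of statements labeled by both a human and a model: it computes Cohen's kappa, a standard chance-corrected agreement statistic. The labels below are made up for illustration, and the function name `cohens_kappa` is ours.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Share of items where both coders chose the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Illustrative labels only (not real study data).
human = ["policy", "economy", "policy", "health", "economy", "policy"]
model = ["policy", "economy", "health", "health", "economy", "policy"]
kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.2f}")
```

Raw percent agreement overstates reliability when one category dominates, which is why a chance-corrected statistic is the usual choice for this check.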
Detecting behavioral patterns. Machine learning on longitudinal data identifies subtle dependencies between contextual variables and behaviors that escape classical regression models. Behavioral economics researchers use these tools to generate hypotheses about decision-making mechanisms, which they then test in laboratory or quasi-experimental conditions.
Integrating data from multiple sources. Genomic data combined with environmental, demographic, and behavioral data create a space where AI can point to unexpected correlations. This is the foundation of research in epigenetics, health psychology, or medical sociology. At the same time, it is a space with the highest risk of artifacts: correlations between disparate sources often reflect sampling errors, not true relationships.
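One common guard against chasing artifacts when many cross-source correlations are screened at once is a multiple-testing correction. The sketch below implements the Benjamini-Hochberg procedure, which controls the false discovery rate. It is our choice of illustration, not a method named in this article, and the p-values are invented.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests survive FDR control at level alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value is at or below (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is kept.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            keep[idx] = True
    return keep


# Invented p-values for six candidate correlations.
p_values = [0.041, 0.001, 0.216, 0.008, 0.039, 0.06]
significant = benjamini_hochberg(p_values, alpha=0.05)
print(significant)
```

Four of these six p-values fall below 0.05 on their own, but only two survive the correction. A correction of this kind addresses chance findings only; sampling bias in the underlying sources still needs separate scrutiny.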
Bias in data and its impact on hypotheses
AI models formulate hypotheses based on what they find in training data. If the data are systematically distorted, the hypotheses will inherit those distortions as facts.
In clinical genetics, a known issue is the overrepresentation of samples of European origin in GWAS databases. A model trained on such a corpus will generate hypotheses better suited to this population and worse for others. In the social sciences, the equivalent is publication bias: literature favors positive results, so a model learning from published scientific papers will systematically overestimate effects.
| Source of bias | Example in the field | Researcher’s mitigation |
|---|---|---|
| Population overrepresentation | GWAS mainly on European cohorts | Audit the composition of the training database before implementation |
| Publication bias | Preference for statistically significant results | Include preprints and clinical trial registries |
| Temporal bias | Older literature dominates model weights | Limit the date window or actively enrich with newer sources |
| Language bias | Dominance of English-language publications | Include multilingual databases (e.g., WHO IRIS, LILACS) |
None of these mitigations are automatic. Each requires a conscious decision by the researcher at the system design stage. We describe a systematic approach to detecting and limiting these distortions in the article on algorithmic bias in research.
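The first mitigation in the table, auditing the composition of the training database, can start as a simple count. A minimal sketch, assuming each sample carries an ancestry label: the group codes, the 80% threshold, and the cohort itself are hypothetical values chosen for illustration.

```python
from collections import Counter


def audit_composition(labels, max_share=0.8):
    """Report each group's share of the cohort and flag shares above max_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples to audit")
    report = {}
    for group, count in counts.items():
        share = count / total
        report[group] = {
            "count": count,
            "share": round(share, 3),
            "overrepresented": share > max_share,
        }
    return report


# Hypothetical cohort: labels and proportions are invented.
cohort = ["EUR"] * 86 + ["EAS"] * 8 + ["AFR"] * 4 + ["AMR"] * 2
report = audit_composition(cohort, max_share=0.8)
for group, row in report.items():
    print(group, row)
```

The threshold is a policy decision rather than a statistical constant. The useful part is that the audit runs before implementation and its output is recorded alongside the model.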
Interpretability: when “the model said so” isn’t enough
Science requires falsifiability. A hypothesis you don’t understand doesn’t allow you to design a verifying experiment.
Modern research systems apply several layers of explainability. Attention and saliency maps indicate which parts of the input (sequence, protocol fragment, measurement values) had the greatest impact on the result. Natural language justifications generated by LLMs describe the mechanism in a way readable to domain experts. Confidence estimates and hallucination detection flag answers that the model cannot support well.
None of these mechanisms provide full causal explanation. They offer a starting point: “the model pointed to this connection. Does it make biological or social sense?” The answer to this question belongs to the researcher, not the model.
We explore the issue of model transparency in the context of scientific responsibility in the article on the black box problem in AI systems.
Human oversight: where decisions must belong to humans
AI autonomy in the research process does not mean lack of oversight. It means thoughtfully designing points where the researcher enters the loop.
At Cashcrown, we apply a three-checkpoint pattern for analytical agents. The same pattern is directly transferable to the research context:
| Checkpoint | Example in research | Decision-maker |
|---|---|---|
| Hypothesis selection | AI generated a list of candidates; the researcher accepts a subset for experimentation | Domain researcher |
| Protocol approval | AI designed an experiment plan; the PI approves before launch | Project leader |
| Pre-publication validation | AI prepared a draft; full verification by the team before submission for review | Entire research team |
Skipping any of these points is not an acceleration of the process. It shifts risk to a stage where errors are costlier: post-publication correction or retraction.
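In software, the three checkpoints from the table can be enforced as an ordered gate, so that a later stage cannot be approved while an earlier one is still open. This is a minimal sketch of that idea, not our production implementation. The class and method names (`Checkpoint`, `ResearchItem`, `approve`) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Checkpoint(Enum):
    HYPOTHESIS_SELECTION = "hypothesis selection"
    PROTOCOL_APPROVAL = "protocol approval"
    PRE_PUBLICATION_VALIDATION = "pre-publication validation"


@dataclass
class ResearchItem:
    title: str
    approvals: dict = field(default_factory=dict)  # Checkpoint -> approver

    def approve(self, checkpoint, approver):
        # Enum members iterate in definition order, so every checkpoint
        # defined before the requested one must already be approved.
        for earlier in Checkpoint:
            if earlier is checkpoint:
                break
            if earlier not in self.approvals:
                raise PermissionError(f"'{earlier.value}' has not been approved yet")
        self.approvals[checkpoint] = approver

    def ready_for_submission(self):
        return all(cp in self.approvals for cp in Checkpoint)


item = ResearchItem("Candidate resistance mechanism")
item.approve(Checkpoint.HYPOTHESIS_SELECTION, "domain researcher")
try:
    # Attempt to jump straight to the last checkpoint.
    item.approve(Checkpoint.PRE_PUBLICATION_VALIDATION, "research team")
    skip_allowed = True
except PermissionError:
    skip_allowed = False
item.approve(Checkpoint.PROTOCOL_APPROVAL, "project leader")
item.approve(Checkpoint.PRE_PUBLICATION_VALIDATION, "research team")
print(skip_allowed, item.ready_for_submission())
```

Recording who approved each checkpoint also gives the team an audit trail if a result is questioned later.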
Human oversight as an AI system design principle is detailed in the article on the role of humans in the loop. The issue of authorship and scientific integrity when using AI (declaring tools in the Methods section, maintaining prompt logs) is discussed in the article on AI as an autonomous scientist.
How structured output and RAG are changing laboratory practice
Two technical patterns are particularly significant for scientific research.
Structured output allows the model to return results in a schema compliant with laboratory data management system (LIMS) or clinical database requirements. Instead of unstructured text that needs to be manually transcribed, the model generates JSON validated against a schema. This reduces the risk of transcription errors and speeds up integration of AI results with existing workflows.
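To make the validation step concrete, here is a minimal sketch using only the Python standard library. The field names in `SAMPLE_SCHEMA` are hypothetical and do not come from any particular LIMS. A real deployment would more likely express the schema in JSON Schema and check it with a dedicated validation library.

```python
import json

# Hypothetical record layout: field name -> expected Python type.
SAMPLE_SCHEMA = {
    "sample_id": str,
    "assay": str,
    "concentration_ng_ul": float,
    "passed_qc": bool,
}


def _matches(value, expected):
    # JSON integers are acceptable where a float is expected, but booleans
    # are not (bool is a subclass of int in Python).
    if expected is float:
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, expected)


def validate_record(raw_json, schema=SAMPLE_SCHEMA):
    """Parse model output and return (record, list_of_errors)."""
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return None, ["top-level value must be a JSON object"]
    errors = []
    for name, expected in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not _matches(record[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in record:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return record, errors


good = '{"sample_id": "S-001", "assay": "qPCR", "concentration_ng_ul": 12.5, "passed_qc": true}'
bad = '{"sample_id": "S-002", "assay": "qPCR", "concentration_ng_ul": "high"}'
_, good_errors = validate_record(good)
_, bad_errors = validate_record(bad)
print(good_errors)
print(bad_errors)
```

A record with a non-empty error list is rejected or sent back to the model instead of being written to the database, which is where the reduction in transcription errors comes from.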
RAG on an institution’s own knowledge base (protocols, results of previous experiments, standard operating procedures) allows the model to formulate hypotheses in a context specific to the laboratory, not just based on public literature. This is a fundamental difference for translational research, where institutional context is critical.
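The retrieval half of this pattern can be sketched in a few lines. The example below ranks documents by simple word overlap purely for illustration, whereas production systems typically use embedding-based search. The document ids and contents are invented placeholders, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Rank documents by how often they contain the query's words."""
    query_tokens = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[token] for token in query_tokens)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken alphabetically by id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_prompt(question, documents, doc_ids):
    """Assemble a prompt that restricts the model to the retrieved sources."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Invented placeholder documents standing in for a lab's knowledge base.
knowledge_base = {
    "SOP-014": "Culture protocol for cell line X: passage every three days.",
    "EXP-007": "Cell line X showed drug resistance after prolonged low dose exposure.",
    "SOP-031": "Freezer inventory is audited monthly.",
}
question = "What do we know about drug resistance in cell line X?"
hits = retrieve(question, knowledge_base)
prompt = build_prompt(question, knowledge_base, hits)
print(hits)
```

Keeping the document ids in the prompt lets every claim in the model's answer be traced back to a specific protocol or experiment record.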
We detail the principles of implementing such systems with responsible innovation and data management in the article on data governance for AI.
FAQ
#Can AI independently generate scientific hypotheses without researcher involvement?
Technically yes, but "independently" is misleading here. The model generates hypothesis candidates based on patterns in training data. It lacks a causal model of the world and does not know whether the proposed mechanism is biologically or socially plausible. A researcher with domain knowledge is needed to evaluate each candidate before resources are invested in an experiment. Without this verification, the risk of chasing artifacts is high.
How do you protect against hallucinations in scientific research?
The key is requiring source citations for every factual claim. A RAG system with an index of verified publications and a requirement to provide source identifiers drastically reduces factual hallucinations, though it does not eliminate them. Every citation requires verification before inclusion in a manuscript. Systems with structured output and a schema validating citation formats facilitate this audit.
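A first-pass citation audit can be automated before a human checks the sources themselves. This is a minimal sketch, assuming the model is instructed to cite in a `[PMID:12345]` format of our own choosing. The identifiers and sentences are placeholders, not real references.

```python
import re

# Assumed citation format for this sketch: [PMID:<digits>]
PMID_PATTERN = re.compile(r"\[PMID:(\d+)\]")


def audit_citations(draft, verified_ids):
    """Label each sentence as 'ok', 'unknown source', or 'uncited'."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", draft) if s.strip()]
    report = []
    for sentence in sentences:
        cited = PMID_PATTERN.findall(sentence)
        unknown = [pmid for pmid in cited if pmid not in verified_ids]
        if not cited:
            status = "uncited"
        elif unknown:
            status = "unknown source"
        else:
            status = "ok"
        report.append((status, sentence))
    return report


# Placeholder identifiers standing in for an index of verified publications.
verified_index = {"111", "222"}
draft = (
    "Variant A is associated with trait B [PMID:111]. "
    "Variant C alters pathway D [PMID:999]. "
    "This suggests a shared mechanism."
)
audit = audit_citations(draft, verified_index)
statuses = [status for status, _ in audit]
print(statuses)
```

An "ok" status only means the identifier exists in the index. Whether the cited paper actually supports the sentence still has to be checked by a person.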
What obligations does the AI Act impose on AI systems used in research?
The AI Act does not regulate all research applications uniformly. Systems supporting literature searches or preliminary hypothesis generation, which do not directly influence high-risk decisions, face lighter requirements. Systems supporting diagnostic, therapeutic, or regulatory decisions (e.g., analyzing genomic data for disease predispositions) can be classified as high-risk, which brings registration, conformity assessment, and technical documentation obligations. It’s worth consulting a lawyer to classify a specific system before implementation.
How does GDPR affect the use of participant data in AI systems?
Genetic data and data concerning health are special categories of data under GDPR (Art. 9), and psychological and behavioral data of research participants often fall under the same rules when they reveal health or other protected characteristics. Processing them through AI systems requires a legal basis (most often consent or public interest in scientific research), a data protection impact assessment (DPIA), and implementation of data minimization measures. Data must not be sent to external cloud APIs without appropriate data processing agreements and transfer assessments. Self-hosting or on-premises architectures with local LLMs are often preferred in research environments with sensitive data.
Can small research teams without a data science department use AI in hypothesis formulation?
Yes, assuming the scope is well-defined. A RAG assistant on a personal PDF library, a pipeline for automatic data extraction from reports, a tool for generating hypothesis drafts based on a given research question: these are tasks accessible without extensive infrastructure. The entry point is usually a readiness assessment, which helps identify which research processes have the greatest potential for AI support before investing in implementation.
