AI-Driven Interdisciplinarity: Medicine, Biology, Physics

In 2023, AlphaMissense predicted the pathogenicity of 71 million possible missense variants within days. A year later, analogous systems began suggesting experimental protocols in medicinal chemistry and identifying links between biophysical data and therapeutic response. At Cashcrown, we examine these changes from a practical perspective: what can be implemented today in research organizations or companies that use scientific methods, and what remains in the realm of lab demonstrations.

The answer is neither enthusiastic nor pessimistic. AI is currently a useful assistant at the intersection of disciplines, not an autonomous researcher. The distinction matters because it determines where control points are placed.

What Connects Medicine, Biology, and Physics in the Context of AI

Each of the three fields generates data in different formats and scales. Medicine produces diagnostic images, clinical data, and case reports. Computational biology works with genomic sequences, protein structures, and kinetic measurement results. Physics provides mechanical models, sensor data, and molecular dynamics simulations.

For decades, these datasets were isolated due to the lack of tools capable of processing such diverse representations simultaneously. LLMs and multimodal models have changed this, though not unconditionally.

Concrete tasks where data integration works reliably:

Correlating MRI imaging results with patients' genetic profiles to identify prognostic markers.
Mining literature spanning cardiology, membrane biophysics, and fluid physics for shared mechanisms.
Transferring simulation models from materials physics to bone hardness modeling and metastasis dynamics.
Extracting kinetic data from experimental descriptions in non-standard PDF formats into unified tables.

In each of these tasks, AI support cuts researcher workload from weeks to days or hours. In none of them does it remove the need for verification experiments.

Simulations at the Intersection of Physics and Biology: What AI Contributes and What It Doesn’t Replace

Molecular simulations based on classical molecular dynamics are computationally expensive: modeling one microsecond of protein behavior in an aqueous environment can take weeks on a compute cluster. Neural networks trained on simulation results, supported by retrieval-augmented generation (RAG) over corpora of structural papers, make it possible to build "surrogate models" that approximate outcomes dozens of times faster.

This is a real shift in workflow speed. But surrogates have a known failure mode: they interpolate well within the range of their training data and fail on out-of-distribution systems. A researcher running a surrogate simulation for a new compound class must know how far that class lies from the training data.
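One way to make "how far from the training data" concrete is a nearest-neighbor distance in descriptor space, compared against a cutoff calibrated on the training set itself. The sketch below is illustrative only: the descriptor vectors, the function names, and the 0.95 quantile are assumptions, not part of any specific surrogate toolkit.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_distance(query: Vector, reference: Sequence[Vector]) -> float:
    """Distance from `query` to its closest neighbor in `reference`."""
    return min(euclidean(query, r) for r in reference)


def calibrate_threshold(training: Sequence[Vector], quantile: float = 0.95) -> float:
    """Compute leave-one-out nearest-neighbor distances inside the training
    set and return the requested quantile as the in-distribution cutoff."""
    loo = sorted(
        nearest_distance(v, [w for j, w in enumerate(training) if j != i])
        for i, v in enumerate(training)
    )
    index = min(len(loo) - 1, int(quantile * len(loo)))
    return loo[index]


def is_out_of_distribution(query: Vector, training: Sequence[Vector], threshold: float) -> bool:
    """True when the query sits farther from the training set than the cutoff."""
    return nearest_distance(query, training) > threshold


# Hypothetical 2-D descriptors standing in for real molecular features.
training_set = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
cutoff = calibrate_threshold(training_set)

print(is_out_of_distribution((0.5, 0.5), training_set, cutoff))  # inside the square
print(is_out_of_distribution((5.0, 5.0), training_set, cutoff))  # far outside it
```

A flagged system is a candidate for a classical simulation rather than a surrogate run. The distance metric and cutoff would need to be chosen per descriptor set; Euclidean distance is used here only for simplicity.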

The table below compares task types, AI approaches, and human intervention points:

Task	AI Approach	Human Control Point
Literature mining	Multidomain semantic search	Relevance and source quality assessment
Protein structure prediction	Structure prediction models (AlphaFold class)	Experimental validation (crystallography, cryo-EM)
Surrogate MD model	Network trained on trajectories	Comparison with classical simulation on a sample
Clinical and genomic data integration	Extraction pipeline + correlation	Clinical significance and causality
Hypothesis generation	LLM + domain knowledge base	Hypothesis selection and prioritization for experiment

Personalized Medicine: Where AI Actually Helps Today

Personalized medicine is a field where data from different biological levels (genomics, transcriptomics, clinical data, imaging) must be integrated to make therapeutic decisions. It’s a natural environment for systems integrating heterogeneous sources.

Tasks where explainability is critical today:

Patient stratification. The model groups patients based on molecular profiles, prior responses, and imaging data. The result is a segmentation proposal, not a diagnosis. The clinician decides whether the division boundary makes biological sense and corresponds to clinical course differences.

Virtual screening of therapeutic candidates. AI suggests a set of compounds with estimated activity against a therapeutic target. A narrow selection from the candidate list proceeds to wet lab testing. No candidate advances to further development without in vitro and, subsequently, in vivo experiments.

Drug-target interaction prediction in new populations. Models trained primarily on European population data may underestimate risks for patients from other genetic groups. This is a known training data bias issue, discussed in the article on the black box problem.

Under the AI Act, AI systems intended for medical applications are classified as high-risk when subject to sectoral product legislation (Annex I AI Act, including MDR and IVDR). They require technical documentation, conformity assessment, and post-market monitoring mechanisms.

Hallucinations and Errors in Scientific Research Context

Hallucination in research systems differs from hallucination in conversational assistants. The model may generate citations for non-existent articles, report kinetic constants to four decimal places (none of which is backed by data), or propose an experimental protocol with a step that relies on a reagent unavailable in Poland.

Three mitigation layers used in research system deployments:

Verification of each citation by the author before inclusion in the manuscript. Tools: Semantic Scholar API, PubMed, DOI lookup.
Structured output with JSON Schema enforcing numeric value ranges (kinetic parameters, concentrations, temperatures) and flagging values outside physically plausible ranges.
Prompt and response logging as part of research documentation, analogous to a lab notebook.
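The second layer, range enforcement on numeric output, can be sketched as follows. The schema is written in JSON Schema style (`minimum`, `maximum`, `required`), but the checker is a hand-rolled stand-in that understands only those keywords; a real deployment would presumably use a full JSON Schema validator. The field names and bounds are illustrative assumptions, not recommended limits.

```python
from typing import Any

# Hypothetical schema fragment; names and bounds are for illustration only.
KINETICS_SCHEMA: dict[str, Any] = {
    "properties": {
        "k_cat_per_s": {"type": "number", "minimum": 0.0, "maximum": 1e7},
        "km_molar": {"type": "number", "minimum": 1e-12, "maximum": 1.0},
        "temperature_kelvin": {"type": "number", "minimum": 250.0, "maximum": 400.0},
    },
    "required": ["k_cat_per_s", "km_molar", "temperature_kelvin"],
}


def find_violations(record: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the record passed."""
    problems: list[str] = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"{name}: missing")
    for name, rules in schema["properties"].items():
        if name not in record:
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: not a number")
            continue
        if "minimum" in rules and value < rules["minimum"]:
            problems.append(f"{name}: {value} below minimum {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            problems.append(f"{name}: {value} above maximum {rules['maximum']}")
    return problems


# A model response that looks precise but reports Celsius in a Kelvin field.
suspect = {"k_cat_per_s": 12.5, "km_molar": 3e-5, "temperature_kelvin": 37.0}
for problem in find_violations(suspect, KINETICS_SCHEMA):
    print(problem)
```

A range check of this kind does not confirm that a value is correct, only that it is not physically implausible. Flagged records still go to a human for review.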

The article on limiting AI hallucinations details technical mechanisms. In research contexts, the citation layer with text snippets supporting model responses is particularly useful.

Human Oversight in Interdisciplinary Research: A Concrete Architecture

The concept of "human-in-the-loop" is often treated abstractly. In practical analytical system deployments, a concrete checkpoint architecture is used, as detailed in the article on the role of humans in the loop.

Three types of checkpoints in an AI-driven research project:

Pre-experiment verification. AI generates a hypothesis or protocol. The research lead approves it before physical experiment initiation. This is the human-gate equivalent for irreversible actions: no reagent enters a test tube without the PI’s signature.

Intermediate data review. AI synthesizes results after each experimental round and proposes the next step. The researcher decides whether the path is biologically plausible or whether the model has strayed outside its training distribution.

Pre-publication validation. Every AI-derived claim in a manuscript is verified by at least one domain expert before submission for review. ICMJE, Nature, and Science guidelines (as of 2026) explicitly exclude AI as an author.

Systems that embed these checkpoints from the start work faster than traditional workflows. Systems without controls only appear faster, until results need to be retracted.
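The three checkpoints can be sketched as a small state machine in which each transition requires a named human reviewer. This is a minimal illustration under simplifying assumptions: it models a single experimental round, and the class, method, and stage names are hypothetical rather than taken from any deployed system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    PROPOSED = "proposed"    # AI has generated a hypothesis or protocol
    APPROVED = "approved"    # pre-experiment verification passed
    REVIEWED = "reviewed"    # intermediate data review passed
    VALIDATED = "validated"  # pre-publication validation passed


class GateError(RuntimeError):
    """Raised when a step is attempted without the required human sign-off."""


@dataclass
class ResearchItem:
    description: str
    stage: Stage = Stage.PROPOSED
    sign_offs: list[tuple[str, str]] = field(default_factory=list)

    def _advance(self, expected: Stage, new: Stage, reviewer: str) -> None:
        if not reviewer.strip():
            raise GateError("a named human reviewer is required")
        if self.stage is not expected:
            raise GateError(f"cannot move to {new.value} from {self.stage.value}")
        self.sign_offs.append((new.value, reviewer))  # audit trail of who approved what
        self.stage = new

    def approve_experiment(self, pi_name: str) -> None:
        self._advance(Stage.PROPOSED, Stage.APPROVED, pi_name)

    def review_results(self, researcher: str) -> None:
        self._advance(Stage.APPROVED, Stage.REVIEWED, researcher)

    def validate_for_publication(self, expert: str) -> None:
        self._advance(Stage.REVIEWED, Stage.VALIDATED, expert)

    def run_experiment(self) -> str:
        # The irreversible action is blocked until the PI has signed off.
        if self.stage is not Stage.APPROVED:
            raise GateError("experiment requires prior PI approval")
        return f"running: {self.description}"


item = ResearchItem("AI-proposed binding assay protocol")
item.approve_experiment("PI (hypothetical)")
print(item.run_experiment())
```

Putting the gate inside the object that performs the action, rather than in a separate checklist, means an unapproved run fails loudly instead of passing unnoticed. A project with several experimental rounds would loop between the approval and review stages.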


FAQ

Can AI independently conduct interdisciplinary research without a scientist’s involvement?

No, not in any scientifically credible sense. Models lack a causal world model and cannot distinguish correlation from mechanism. They can automate workflow stages (literature mining, data extraction, hypothesis generation, candidate pre-screening), but each stage requires assessment by a domain expert. The article on AI as an autonomous scientist analyzes the limits of this autonomy in detail.

What data is needed for AI to effectively integrate knowledge from medicine, biology, and physics?

Data must be structured or extractable: sequences in FASTA or UniProt formats, measurement results in tables with units, images in standard DICOM or NIfTI formats with annotations, experimental descriptions with numeric endpoints. The less structured the source, the higher the extraction error risk. A data quality audit before system deployment can reduce debugging time by weeks.
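A data quality audit can start with something as simple as checking that every measurement row has a numeric value and a unit. The sketch below assumes measurements arrive as CSV with `value` and `unit` columns; those column names and the sample rows are hypothetical.

```python
import csv
import io


def audit_measurements(
    csv_text: str, value_column: str = "value", unit_column: str = "unit"
) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for rows that would break
    downstream extraction: non-numeric values or missing units."""
    issues: list[tuple[int, str]] = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header occupies line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        raw_value = (row.get(value_column) or "").strip()
        unit = (row.get(unit_column) or "").strip()
        try:
            float(raw_value)
        except ValueError:
            issues.append((line_number, f"non-numeric value {raw_value!r}"))
        if not unit:
            issues.append((line_number, "missing unit"))
    return issues


# Hypothetical measurement table with two typical defects.
sample = """compound,value,unit
A,12.5,uM
B,n/a,uM
C,3.1,
"""

for line_number, problem in audit_measurements(sample):
    print(f"line {line_number}: {problem}")
```

Running such a check over the source tables before deployment shows which files need manual cleanup, before extraction errors surface in the model's output.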

How does the AI Act regulate AI systems used in biomedical research?

Systems directly influencing medical decisions or human safety are classified as high-risk under the AI Act and sectoral product legislation (including MDR and IVDR). They require technical documentation, conformity assessment, and post-market monitoring. Systems supporting only literature mining or hypothesis generation, without direct impact on high-risk decisions, have lighter requirements. The boundary between these categories is contextually assessed by the provider and conformity assessor.

How to verify AI-generated hypotheses in scientific research?

Verification proceeds in three steps. First, the researcher assesses the hypothesis’s biological or physical plausibility based on expert knowledge. Next, they design a minimal experiment capable of falsifying the hypothesis. Finally, experimental results are compared with the model’s prediction, and discrepancies are documented and analyzed regardless of direction. The article on LLMs as hypothesis generators details this cycle.
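The final step, comparing prediction with measurement and recording the discrepancy regardless of direction, can be sketched as below. The class name, the absolute-tolerance criterion, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    hypothesis: str
    predicted: float
    measured: float
    tolerance: float  # absolute tolerance, same units as the values

    @property
    def discrepancy(self) -> float:
        """Signed difference: positive when the experiment exceeds the prediction."""
        return self.measured - self.predicted

    @property
    def within_tolerance(self) -> bool:
        return abs(self.discrepancy) <= self.tolerance


def discrepancy_report(comparisons: list[Comparison]) -> list[str]:
    """One line per comparison, whether or not it agrees with the model,
    so that no result is silently dropped from the record."""
    lines = []
    for c in comparisons:
        status = "consistent" if c.within_tolerance else "DISCREPANT"
        lines.append(
            f"{c.hypothesis}: predicted={c.predicted:g} "
            f"measured={c.measured:g} diff={c.discrepancy:+g} [{status}]"
        )
    return lines


# Hypothetical values; units are whatever the assay reports.
results = [
    Comparison("H1: mutant lowers affinity", predicted=2.0, measured=3.5, tolerance=0.5),
    Comparison("H2: ligand stabilizes fold", predicted=8.0, measured=7.75, tolerance=0.5),
]
for line in discrepancy_report(results):
    print(line)
```

Keeping the sign of the difference preserves the direction of the miss, which matters when deciding whether the model is systematically over- or under-predicting.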

Does AI help with interdisciplinary literature reviews, and what are the risks of uncritical use?

Yes, semantic search across large corpora is one of AI's strongest applications in science. Uncritical use carries three risks: the model may generate non-existent citations (hallucination), systematically omit sources that are historically underrepresented in its training data, and overemphasize highly cited works at the expense of new findings. Verifying each citation before manuscript inclusion and manually supplementing source lists from primary databases (PubMed, Scopus, Web of Science) are mandatory.
