Thinking Outside the Pill

How Scientists Use Analogies to Discover New Medicines

The next medical breakthrough might be inspired by something already in your kitchen cupboard.

Imagine a medical researcher, staring at a complex biological puzzle. The usual experiments have hit a dead end. Then, they think of a process from an entirely different field—how yeast ferments bread, how a computer sorts information, or how a weedkiller targets specific plants. This flash of connection, an analogy, becomes a powerful tool, guiding the way to a new discovery. In the intricate world of pharmacological research, scientists routinely use such analogical inferences to venture into the unknown, drawing parallels from familiar territories to illuminate the mysteries of new drugs and their effects on the human body.

From Familiar to Unknown: The Engine of Analogy

At its core, an analogical inference is a form of scientific reasoning that identifies similarities between two domains: a well-understood "source domain" and a less familiar "target domain". By observing that the two domains share certain properties, researchers can make the inductive leap that a further, crucial feature of the source domain might also exist in the target.

Source Domain

Well-understood area (e.g., yeast fermentation, computer algorithms)

Analogical Mapping

Identifying relevant similarities between domains

Target Domain

Less familiar area (e.g., new drug mechanism, disease pathway)

The philosopher Mary Hesse famously broke down this process into vertical and horizontal relations [4]. Think of it like a table comparing two things. The vertical relations are the internal connections within the source domain: the theory that explains why certain features lead to a specific outcome. The horizontal relations are the similarities between the source and target domains. A strong analogical inference requires both: meaningful connections within the source, and relevant parallels to the target [4].
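To make the table picture concrete, here is a minimal sketch in Python. It assumes each domain can be summarized as a set of properties; the names `VerticalRelation` and `project`, and the placeholder properties, are hypothetical illustrations rather than anything taken from Hesse's work. The inference is allowed only when every cause in the source's vertical relation is also a horizontal similarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerticalRelation:
    """A within-source link: these causes jointly explain this effect."""
    causes: frozenset
    effect: str


def project(source_props, target_props, relation):
    """Return the property inferred for the target, or None if the analogy is too weak.

    Horizontal relations are the properties both domains share. The effect is
    projected only if the source has it and every cause is shared.
    """
    horizontal = set(source_props) & set(target_props)
    if relation.effect in source_props and relation.causes <= horizontal:
        return relation.effect
    return None


# Hypothetical, simplified property sets for illustration only.
source = {"has_pathway_X", "depends_on_enzyme_Y", "killed_by_inhibitor_Z"}
target = {"has_pathway_X", "depends_on_enzyme_Y"}
rule = VerticalRelation(
    causes=frozenset({"has_pathway_X", "depends_on_enzyme_Y"}),
    effect="killed_by_inhibitor_Z",
)

print(project(source, target, rule))             # killed_by_inhibitor_Z
print(project(source, {"has_pathway_X"}, rule))  # None
```

The second call fails because one of the causes is missing from the target, so the shared features are no longer the ones that matter for the conclusion.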

Why Some Analogies Work and Others Don't

Not all similarities are created equal. The scientific debate around an analogy often focuses not on if similarities exist, but on how relevant they are to the conclusion [4].

Case Study: The Turtle's Place in Evolution

A classic example is the long-standing debate in biology about the turtle's place in the evolutionary tree. One group of scientists argued that turtles belong with anapsid reptiles because both lack temporal fenestrae (openings in the skull). Another group pointed to molecular similarities between turtles and archosaurs (like crocodiles and birds). The debate wasn't about the existence of these similarities, but their relevance. Was the lack of skull openings an ancestral trait, or was it lost later? Were the molecular links uniquely significant? The plausibility of each analogical inference hinged on the answer [4].

Recent philosophical work suggests that for an analogy to be plausible, it's not enough for the source domain to be well-understood (Source Domain Salience) and for there to be an overlap in features (Overlap). The target domain itself must have salient features that make it receptive to the comparison, a characteristic known as Target Domain Salience [4]. This means a good pharmacological analogy must consider the unique biological context of the target disease or drug mechanism.

[Figure: Key Factors for Successful Analogical Inference]

A Digital Analogy: How AI Repurposes Old Drugs for New Diseases

One of the most exciting modern applications of analogical reasoning in pharmacology is drug repurposing: finding new therapeutic uses for existing drugs. Today, researchers are using artificial intelligence to perform analogical inference at a massive scale, and the process relies on building a knowledge graph (KG) [9].

The Methodology: Building a Web of Biomedical Knowledge

A knowledge graph is a vast, computational analogy for biological reality. It represents different entities—like drugs, diseases, genes, and proteins—as nodes, and the relationships between them as connecting lines, or "edges." For example, a graph might show that "Drug A" "treats" "Disease B," and "Disease B" "is associated with" "Gene C."
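As a rough illustration, a knowledge graph can be stored as a list of (head, relation, tail) triples. The sketch below uses the two example edges from the paragraph above; the edges involving "Protein D" and the helper names `build_index` and `neighbors` are assumptions added for illustration.

```python
from collections import defaultdict

# Toy triples in (head, relation, tail) form.
triples = [
    ("Drug A", "treats", "Disease B"),              # from the text
    ("Disease B", "is associated with", "Gene C"),  # from the text
    ("Gene C", "encodes", "Protein D"),             # hypothetical edge
    ("Drug A", "inhibits", "Protein D"),            # hypothetical edge
]


def build_index(triples):
    """Map each head node to its outgoing (relation, tail) edges."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((relation, tail))
    return index


def neighbors(index, node):
    """Return the nodes reachable from `node` in one step, sorted by name."""
    return sorted(tail for _, tail in index.get(node, []))


index = build_index(triples)
print(neighbors(index, "Drug A"))  # ['Disease B', 'Protein D']
```

Real biomedical graphs hold hundreds of thousands of such triples, but the shape of the data is the same.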

Knowledge Graph Development Process

Step 1: Knowledge Assembly

Researchers compile data from manually curated databases (like Hetionet) and use natural language processing (NLP) to automatically extract relationships from millions of published scientific articles (creating resources like GNBR). This builds a rich, semantic network of biomedical knowledge [9].

Step 2: Topological Perturbation

To test the strength of the analogies, scientists deliberately "perturb" or disturb the knowledge graph. They might remove highly connected "hub" nodes to see if the AI is relying on true biological nuance or just on the network's structure [9].
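A minimal sketch of this kind of perturbation, assuming a toy graph of triples and a simple "drop the top-k most connected nodes" rule. The function names, the entities, and the choice of `top_k` are hypothetical; the cited study may define and downsample hubs differently.

```python
from collections import Counter

# Hypothetical toy graph in which "Drug A" is the most connected node.
triples = [
    ("Drug A", "treats", "Disease B"),
    ("Drug A", "inhibits", "Protein D"),
    ("Drug A", "binds", "Protein E"),
    ("Drug F", "treats", "Disease B"),
    ("Drug F", "inhibits", "Protein E"),
]


def degrees(triples):
    """Count how many edges touch each node."""
    counts = Counter()
    for head, _, tail in triples:
        counts[head] += 1
        counts[tail] += 1
    return counts


def remove_hubs(triples, top_k=1):
    """Drop every triple that touches one of the top_k most connected nodes."""
    hubs = {node for node, _ in degrees(triples).most_common(top_k)}
    kept = [t for t in triples if t[0] not in hubs and t[2] not in hubs]
    return kept, hubs


kept, hubs = remove_hubs(triples, top_k=1)
print(hubs)  # {'Drug A'}
print(kept)  # only the two "Drug F" triples remain
```

A model trained on `kept` can no longer score well simply by favoring the best-connected drug.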

Step 3: Semantic Ablation

In another test, they "flatten" or "corrupt" the relationships. For instance, they might change the specific relationship "inhibits" to a generic "relates," stripping away the biological meaning to see how much the AI depends on it [9].
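The two variants can be sketched as follows. This is an illustrative assumption about how flattening and corruption might be implemented on a list of triples, not the cited study's code; `flatten_relations`, `corrupt_relations`, and the toy triples are hypothetical.

```python
import random

# Hypothetical toy triples in (head, relation, tail) form.
triples = [
    ("Drug A", "inhibits", "Protein D"),
    ("Drug A", "treats", "Disease B"),
    ("Gene C", "encodes", "Protein D"),
]


def flatten_relations(triples, generic="relates"):
    """Replace every relation with one generic label, keeping the topology."""
    return [(head, generic, tail) for head, _, tail in triples]


def corrupt_relations(triples, fraction, seed=0):
    """Swap a random share of relation labels for labels drawn from the graph.

    A swapped label may happen to equal the original, so `fraction` is an
    upper bound on how many edges actually change.
    """
    rng = random.Random(seed)
    relations = sorted({relation for _, relation, _ in triples})
    corrupted = []
    for head, relation, tail in triples:
        if rng.random() < fraction:
            relation = rng.choice(relations)
        corrupted.append((head, relation, tail))
    return corrupted


print(flatten_relations(triples)[0])  # ('Drug A', 'relates', 'Protein D')
```

In both cases the nodes and edges stay where they were; only the meaning attached to each edge is degraded.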

Step 4: Link Prediction

Finally, they use machine learning models called "knowledge graph embeddings" to analyze this web. The AI's job is to infer missing links, that is, to predict new, plausible connections between a drug and a disease, thereby proposing a new therapeutic analogy [9].
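The article does not say which embedding model was used, so the sketch below swaps in a TransE-style score as one common choice: a triple is plausible when the head vector plus the relation vector lands near the tail vector. The 2-D vectors are hand-set for illustration (real models learn them from data), and all entity names and functions are hypothetical.

```python
import math

# Hand-set 2-D embeddings for illustration only.
entity_vec = {
    "Drug A": (0.0, 0.0),
    "Drug F": (1.0, 0.0),
    "Disease B": (0.0, 1.0),
    "Disease G": (1.0, 1.1),
}
relation_vec = {"treats": (0.0, 1.0)}


def score(head, relation, tail):
    """TransE-style plausibility: closer to 0 when head + relation is near tail."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    translated = [hi + ri for hi, ri in zip(h, r)]
    return -math.dist(translated, t)


def rank_tails(head, relation, candidates):
    """Order candidate tails from most to least plausible."""
    return sorted(candidates, key=lambda tail: score(head, relation, tail), reverse=True)


# Which disease is "Drug F" most plausibly linked to by "treats"?
print(rank_tails("Drug F", "treats", ["Disease B", "Disease G"]))
# ['Disease G', 'Disease B']
```

The top-ranked link that is not already in the graph is the model's repurposing suggestion, which still has to be checked experimentally.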

Results and Analysis: Structure vs. Meaning

The results from these computational experiments are revealing. They show that these AI systems can indeed identify new drug-disease relationships, validating the power of large-scale analogical inference. However, they also highlight a critical challenge: the trade-off between semantics and topology [9].

Network Statistics of Biomedical Knowledge Graphs

Dataset     Triples    Entities   Relations
GNBR        321,000    44,000     32
Hetionet    555,000    20,000     11

Source: Adapted from [9]

Impact of Knowledge Graph Perturbations

Topological Perturbation

Removal of highly connected "hub" nodes reduces bias, forcing models to rely on nuanced connections.

Semantic Perturbation

Corrupting relation meanings lowers prediction performance by 21% to 38%, but only once topological bias has been mitigated, which shows that the models do use biological meaning.

Combined Approach

Reveals tension between network structure and biological meaning in AI-driven analogy.

Source: Adapted from [9]

When researchers corrupted the biological meaning of the relationships (semantic ablation), the performance in predicting new drug uses dropped significantly, by 21% to 38%, but only after they had first mitigated the topological bias by downsampling hub nodes. This suggests that the AI models, while powerful, can be overly reliant on the network's structure (e.g., which nodes are most connected) rather than the nuanced biological meaning of the relationships [9].

This is the computational equivalent of ensuring that an analogy is based on deep, relevant similarities (the specific mechanism of a drug), and not just superficial patterns (that a drug is well-studied). The future of AI in drug discovery lies in developing methods that weigh rich biological semantics as heavily as network structure [9].

The Scientist's Toolkit: Resources for Pharmacological Analogy

Pursuing these analogies, whether in a wet lab or through computational models, requires a sophisticated toolkit. The entries below describe some of the essential "reagent solutions" and resources that power modern pharmacological research.

In Vivo Models

Provide a whole-organism context (e.g., spontaneously hypertensive rats for ADHD research) to study drug effects in a complex biological system [5].

In Vitro & Ex Vivo Models

Allow for controlled study of drug effects on cells, tissues, or organs outside a living organism, helping to isolate specific mechanisms [5].

Knowledge Graphs

Serve as a structured, queryable analog of biomedical knowledge, enabling large-scale analogical inference for tasks like drug repurposing [9].

KG Embedding Models

Learn numerical representations of entities and relations in a knowledge graph, allowing them to infer new, plausible links (analogies) [9].

Bayesian Framework

Provides a formal, mathematical method for combining different pieces of evidence from analogical reasoning, improving the reliability of conclusions [3].

3Rs Principle

Ethical framework (Replacement, Reduction, Refinement) guiding all animal research to ensure humane treatment [5].
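To illustrate the Bayesian Framework entry above, here is a minimal sketch of combining several pieces of evidence with the odds form of Bayes' rule. It assumes the pieces of evidence are conditionally independent, and every number is hypothetical; the function name `posterior_probability` is an illustration, not a method from the cited work.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with independent likelihood ratios.

    Uses the odds form of Bayes' rule:
    posterior odds = prior odds * product of likelihood ratios.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical numbers: a 10% prior that a drug works in the target disease,
# two supporting analogies (ratios 3 and 2) and one weak counter-signal (0.8).
print(round(posterior_probability(0.10, [3.0, 2.0, 0.8]), 3))  # 0.348
```

Each likelihood ratio says how much more probable a piece of evidence is if the hypothesis is true than if it is false, so weak or conflicting analogies shift the estimate less than strong ones.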

The commitment to rigorous methodology is paramount. In experimental pharmacology, this means all animal research must adhere to the ethical 3Rs principle (Replacement, Reduction, Refinement) and receive ethical board approval [5]. In computational work, it means transparently acknowledging and correcting for biases in data and models to ensure that the analogies drawn are not just artifacts of the method [9].

Conclusion: A Bridge to Tomorrow's Cures

Analogy-based inference is far from a mere rhetorical device in pharmacology; it is a fundamental engine of discovery.

From Louis Pasteur's chance observation of a stale culture leading to the principle of vaccination, to the complex digital webs woven by AI to find new uses for old drugs, analogies provide a bridge from the known to the unknown [4].

As we move forward, the refinement of this tool continues. Philosophers of science work to define what makes an analogy truly plausible, while computer scientists build systems that can navigate the delicate balance between network structure and deep biological meaning. In the endless quest for new and better medicines, the ability to think analogically—to see the familiar in the strange—will remain one of the researcher's most vital skills. The next medical breakthrough may not start at a lab bench, but with a researcher seeing a compelling parallel in the most unexpected of places.

References