BiohacksAI Knowledge Graph
A deterministic evidence graph mapping bioactive compounds to molecular targets, biological pathways, and research evidence. Every edge is derived from experimental binding data or PubMed-indexed studies — no inferred associations.
1,130,126
PubMed Articles
Graph Structure
The BiohacksAI graph is a directed evidence network with four node types. Each edge carries a confidence score derived from experimental assay data. The graph is rebuilt deterministically from source databases — no manual curation.
Compound
9,238 nodes
Small molecules, peptides, natural products. Identified by PubChem CID. Edges to targets (binding affinity), pathways (gene set), and papers (MeSH co-occurrence).
Target
3,120 nodes
Human gene products (proteins, enzymes, receptors). Edges from compounds via PubChem BioAssay active results. Confidence = log-normalized active assay count.
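The page does not spell out how "log-normalized" is computed, so the following is only a minimal sketch under one plausible reading: the active assay count is passed through log(1 + n) and divided by the same transform of the largest count in the graph, which yields a score in [0, 1]. The function name `edge_confidence` and the parameter `max_active_assays` are hypothetical and not taken from BiohacksAI.

```python
import math


def edge_confidence(active_assays: int, max_active_assays: int) -> float:
    """Scale an active assay count into [0, 1] on a logarithmic scale.

    ASSUMPTION: normalization is log(1 + n) / log(1 + n_max). The source
    only says "log-normalized"; the real formula may differ.
    """
    if max_active_assays < 1:
        raise ValueError("max_active_assays must be at least 1")
    if not 0 <= active_assays <= max_active_assays:
        raise ValueError("active_assays must lie in [0, max_active_assays]")
    # log1p(n) = log(1 + n), so a count of zero maps to a confidence of 0.
    return math.log1p(active_assays) / math.log1p(max_active_assays)


if __name__ == "__main__":
    for n in (0, 1, 10, 100):
        print(n, round(edge_confidence(n, 100), 3))
```

A logarithmic scale of this kind compresses large counts, so a target with 100 active assays does not outweigh one with 10 by a factor of ten.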
Pathway
2,209 nodes
Reactome biological pathways. A compound is linked to a pathway if it has binding evidence for ≥2 genes within the pathway gene set.
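The ≥2-gene linking rule can be illustrated with plain set intersection. This is a sketch, not the BiohacksAI implementation: the function name `linked_pathways`, the `min_genes` parameter, and the placeholder gene and pathway labels are all hypothetical.

```python
def linked_pathways(
    compound_genes: set[str],
    pathway_gene_sets: dict[str, set[str]],
    min_genes: int = 2,
) -> list[str]:
    """Return pathways sharing at least `min_genes` genes with the compound.

    `compound_genes` holds the genes for which the compound has binding
    evidence; `pathway_gene_sets` maps a pathway name to its gene set.
    """
    return sorted(
        name
        for name, genes in pathway_gene_sets.items()
        if len(compound_genes & genes) >= min_genes
    )


if __name__ == "__main__":
    # Placeholder labels only; these are not real Reactome gene sets.
    pathways = {
        "pathway_A": {"G1", "G2", "G3"},
        "pathway_B": {"G5", "G6"},
    }
    print(linked_pathways({"G1", "G2", "G5"}, pathways))
```

In this toy input the compound hits two genes of `pathway_A` and only one of `pathway_B`, so only `pathway_A` is linked.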
Cluster
129 nodes
Mechanistic compound groups computed via Louvain community detection on the target interaction graph. Compounds within a cluster share target binding profiles.
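Louvain community detection searches for a partition that maximizes modularity, a score comparing the number of edges inside each group with the number expected at random. The sketch below does not run Louvain itself; it only computes that modularity score for a given partition of an unweighted, undirected toy graph, to show what the algorithm optimizes. The function name and the toy node labels are hypothetical, and how BiohacksAI builds and weights its graph is not stated on the page.

```python
from collections import defaultdict


def modularity(edges: list[tuple[str, str]], communities: list[set[str]]) -> float:
    """Modularity Q of a partition of an unweighted, undirected graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2
    where m is the edge count, L_c the number of edges inside c, and
    d_c the summed degree of the nodes in c. Assumes no self-loops.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    community_of = {node: i for i, comm in enumerate(communities) for node in comm}
    internal_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge.
    toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                 ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
    print(modularity(toy_edges, [{"a", "b", "c"}, {"d", "e", "f"}]))
    print(modularity(toy_edges, [{"a", "b", "c", "d", "e", "f"}]))
```

Splitting the toy graph into its two triangles scores higher than keeping all nodes in one group, which is the kind of split a modularity-maximizing method prefers.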
Top Molecular Targets
All 3,120 targets →
Top Reactome Pathways
All 2,209 pathways →
Signal Transduction: 3069
Metabolism: 2203
Signaling by GPCR: 2106
Disease: 2097
GPCR downstream signalling: 2052
Immune System: 1821
Metabolism of proteins: 1813
Gene expression (Transcription): 1803
RNA Polymerase II Transcription: 1780
Generic Transcription Pathway: 1774
GPCR ligand binding: 1701
Cellular responses to stimuli: 1673
Mechanistic Clusters
All 129 clusters →
CYP2D6 Modulators
1534 compounds
FLT3 Modulators
369 compounds
CNR2 Modulators
342 compounds
OPRD Modulators
287 compounds
SC6A4 Modulators
286 compounds
CA1 Modulators
172 compounds
DRD2 Modulators
99 compounds
ACM2 Modulators
96 compounds
COX-2 Inhibitors
80 compounds
OPRM1 Modulators
65 compounds
HDAC1 Modulators
58 compounds
DHI2 Modulators
58 compounds
Provenance & Verification
The BiohacksAI corpus is deterministically generated from public databases and sealed with a SHA-256 Merkle tree. Each corpus version has a cryptographic root hash that can be independently verified. No AI-generated content is included in the evidence graph.
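As a rough illustration of how a SHA-256 Merkle root hash can seal a corpus, the sketch below hashes each record, then hashes adjacent pairs upward until a single root remains. The tree layout is an assumption: the page does not say how records are serialized or ordered, or how an odd number of nodes is handled (here the last node is duplicated). The function name `merkle_root` is hypothetical.

```python
import hashlib


def merkle_root(records: list[bytes]) -> str:
    """Return the hex SHA-256 Merkle root of an ordered list of records.

    ASSUMPTIONS: leaves are SHA-256 digests of the raw record bytes,
    parents are SHA-256 of the two concatenated child digests, and an
    odd-sized level duplicates its last node. The real scheme may differ.
    """
    if not records:
        raise ValueError("at least one record is required")
    level = [hashlib.sha256(record).digest() for record in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


if __name__ == "__main__":
    corpus = [b"record-1", b"record-2", b"record-3"]
    print(merkle_root(corpus))
```

Because the build is deterministic, anyone who regenerates the same ordered records obtains the same root, and changing any single record changes it.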
Corpus version
v20260307-01
Data sources
PubMed, PubChem, BindingDB, ChEMBL, Reactome
Patent
EVE-PAT-2026-001 (pending, PRV Sweden)