Drug toxicity prediction based on genotype-phenotype differences between preclinical models and humans
Abstract
[Background] A major hurdle in drug development is the poor translatability of preclinical toxicity findings to human outcomes, largely due to biological differences between humans and model organisms. This gap leads to high clinical trial attrition and post-marketing drug withdrawals. Existing toxicity prediction methods rely primarily on chemical properties and typically overlook these inter-species (or inter-organism) differences.
[Methods] We developed a machine learning framework that incorporates genotype-phenotype differences (GPD) between preclinical models (cell lines and mice) and humans to improve the prediction of human drug toxicity. Clinical risk information on drugs (e.g., from clinical trials or post-marketing surveillance) was collected without bias from published reports and databases. The GPD of each drug target was assessed across three biological contexts: gene essentiality, tissue expression profiles, and network connectivity. We benchmarked the GPD-based model against state-of-the-art toxicity predictors and evaluated its performance using independent datasets and chronological validation.
[Findings] Using a dataset of 434 risky and 790 approved drugs, we found that GPD features were significantly associated with drug failures due to severe adverse events. A Random Forest model integrating GPD with chemical features achieved higher predictive accuracy (AUPRC = 0.63 vs. baseline 0.35; AUROC = 0.75 vs. baseline 0.50), particularly for neurotoxicity and cardiovascular toxicity, two major causes of clinical failure that were previously overlooked when relying on chemical properties alone. Our model outperformed state-of-the-art chemical structure-based models and demonstrated a practical ability to anticipate future drug withdrawals in real-world settings.
[Interpretation] Incorporating differences in genotype-phenotype relationships offers a biologically grounded strategy for drug toxicity prediction. Our framework enables early identification of high-risk drugs in clinical development. This approach holds promise for reducing development costs, improving patient safety, and increasing the success rate of therapeutic approvals.
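As a reading aid for the reported metrics: the baseline values quoted in the findings (AUPRC 0.35, AUROC 0.50) match what a no-skill classifier would be expected to score on a dataset of 434 risky and 790 approved drugs. The abstract does not say how the baselines were computed, so treating them as chance-level values is an assumption; the short Python sketch below only checks that arithmetic, and the function name `chance_auprc` is illustrative.

```python
# Class counts as reported in the abstract.
N_RISKY = 434     # positive class: drugs flagged as risky
N_APPROVED = 790  # negative class: approved drugs

# A no-skill classifier scores an AUROC of 0.5 regardless of class balance.
CHANCE_AUROC = 0.5


def chance_auprc(n_pos: int, n_neg: int) -> float:
    """Expected AUPRC of a no-skill classifier: the positive-class prevalence."""
    return n_pos / (n_pos + n_neg)


# Assumption: the abstract's "baseline" is the chance-level value.
baseline_auprc = chance_auprc(N_RISKY, N_APPROVED)
print(f"chance-level AUPRC: {baseline_auprc:.2f}")  # 434 / 1224, about 0.35
print(f"chance-level AUROC: {CHANCE_AUROC:.2f}")
```

Because the AUPRC baseline depends on class balance while the AUROC baseline does not, the two reported gains (0.63 vs. 0.35 and 0.75 vs. 0.50) are best compared against their own baselines rather than against each other.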