Article Abstract
Development of a Reproductive Toxicity Screening Model for Chemicals Based on Transfer Learning
Submitted: 2025-04-01  Revised: 2025-04-28
DOI:
Keywords: Reproductive toxicity  Machine learning  Transfer learning  Quantitative structure-activity relationship (QSAR)
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)
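Author  Affiliation
宁沛霖 (Ning Peilin)  Key Laboratory of Industrial Ecology and Environmental Engineering, Ministry of Education
杨康 (Yang Kang)  Key Laboratory of Industrial Ecology and Environmental Engineering, Ministry of Education
宋国宝 (Song Guobao)  Key Laboratory of Industrial Ecology and Environmental Engineering, Ministry of Education
傅志强 (Fu Zhiqiang)*  Key Laboratory of Industrial Ecology and Environmental Engineering, Ministry of Education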
Abstract:
      Screening chemicals for reproductive toxicity is key to the source control of emerging pollutants and to protecting public health. In vivo animal experiments are costly and the resulting data are scarce, so efficient predictive models are urgently needed. Using six in vitro reproductive toxicity test datasets, this study built 72 quantitative structure-activity relationship (QSAR) source models by combining three machine learning algorithms (random forest, extreme gradient boosting, and support vector machine) with four molecular representations (MACCS, ECFP, and PubChem fingerprints, and Mordred descriptors). The predicted values of the source models were combined with MACCS fingerprints to form fused features, from which an in vivo reproductive toxicity screening model was built through transfer learning. The transfer learning model significantly outperformed the baseline model: the area under the receiver operating characteristic curve rose from 0.392 to 0.888, the F1 score from 0.436 to 0.824, and the balanced accuracy from 0.450 to 0.838, indicating that integrating in vitro toxicity knowledge can significantly improve model performance. SHAP feature analysis identified halogens, benzene rings, and five-membered heterocycles as potential structural alerts for toxicity. Structure-activity landscape analysis based on the in vitro data characterized the applicability domain of the model, helping to ensure the accuracy of screening. The results provide an effective tool for rapidly screening reproductive toxicants and assessing their health risks.
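The counts and metrics reported in the abstract can be made concrete with a small sketch. It is not the authors' code. It only illustrates how 6 datasets, 3 algorithms and 4 feature sets give 72 source models, how source-model predictions might be concatenated with MACCS bits into a fused feature vector, and how balanced accuracy and the F1 score are defined. The dataset labels and function names are hypothetical, and using one predicted value from every source model, plus a 166-key MACCS vector, is an assumption made for illustration.

```python
from itertools import product

# Hypothetical labels: the abstract gives the counts (6 datasets, 3 algorithms,
# 4 feature sets) but does not name the individual in vitro datasets.
DATASETS = [f"in_vitro_assay_{i}" for i in range(1, 7)]
ALGORITHMS = ["random_forest", "xgboost", "svm"]
FEATURE_SETS = ["MACCS", "ECFP", "PubChem", "Mordred"]

# One source model per (dataset, algorithm, feature set) combination.
SOURCE_MODELS = list(product(DATASETS, ALGORITHMS, FEATURE_SETS))


def fuse_features(source_predictions, maccs_bits):
    """Concatenate source-model predicted values with MACCS fingerprint bits.

    Assumption: every source model contributes exactly one predicted value.
    """
    if len(source_predictions) != len(SOURCE_MODELS):
        raise ValueError("expected one prediction per source model")
    return [float(p) for p in source_predictions] + [float(b) for b in maccs_bits]


def _confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = toxic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; both classes must be present."""
    tp, tn, fp, fn = _confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, _, fp, fn = _confusion(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Placeholder inputs only; real values would come from trained source models
# and from a cheminformatics toolkit that computes MACCS keys.
example_predictions = [0.5] * len(SOURCE_MODELS)
example_maccs = [0] * 166  # MACCS defines 166 structural keys
fused = fuse_features(example_predictions, example_maccs)
print(len(SOURCE_MODELS), len(fused))  # 72 238
```

The fused vector would then be the input to the in vivo classifier. Concatenation is the simplest way to combine the two kinds of features, and the abstract does not say whether the authors applied any weighting or selection before training.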