Nature sub-journal: with accuracy up to 96%, AI predicts protein-ligand interactions from sequence

PHPz (original)
2024-07-11 12:56:20

Editor | Radish Skin

In drug development, it is crucial to determine the binding affinity and functional effect of small molecule ligands on proteins. Current computational methods can predict these protein-ligand interaction properties, but without high-resolution protein structures, accuracy is often lost and functional effects cannot be predicted.

Researchers at Monash University and Griffith University have developed PSICHIC (PhySIcoCHemICal graph neural network), a framework that incorporates physicochemical constraints to decode interaction fingerprints directly from sequence data. This enables PSICHIC to decode the mechanisms behind protein-ligand interactions, achieving state-of-the-art accuracy and interpretability.

Trained on the same protein-ligand pairs without structural data, PSICHIC performed on par with, or even exceeded, leading structure-based methods in binding affinity predictions.

PSICHIC’s interpretable fingerprint identifies protein residues and ligand atoms involved in the interaction and helps reveal the selectivity determinants of protein-ligand interactions.

The research was titled "Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data" and was published in "Nature Machine Intelligence" on June 17, 2024.

Protein-ligand affinity prediction challenge

In the drug discovery process, it is critical to understand the binding affinity and functional effects of small molecule ligands on proteins, as the selective interaction of a ligand with a specific protein determines the expected effect of the drug.

However, although current computational methods are capable of predicting protein-ligand interaction properties, in the absence of high-resolution protein structures, the prediction accuracy often decreases, and there are also difficulties in predicting functional effects.

Although sequence-based methods have advantages in cost and resources (for example, they do not require expensive experimental structure determination), they often suffer from excessive degrees of freedom in pattern matching, which can easily lead to overfitting and limited generalization, creating a performance gap with structure-based or complex-based methods.

Physicochemical Graph Neural Network

A research team from Monash University and Griffith University developed PSICHIC (PhySIcoCHemICal graph neural network), a method that follows physicochemical principles to decode protein-ligand interaction fingerprints directly from sequence data. Unlike previous sequence-based models, PSICHIC incorporates physicochemical constraints to achieve state-of-the-art accuracy and interpretability.

As a 2D sequence-based method, PSICHIC generates these constraints and imposes them on 2D graphs by applying a clustering algorithm, so that during training it fits mainly the plausible underlying patterns that govern protein-ligand interactions.

Illustration: PSICHIC Overview

(Source: Paper)

Performance Validation and Comparison

After training on the same protein-ligand pairs without structural data, PSICHIC matches or even surpasses state-of-the-art structure-based and complex-based methods in binding affinity prediction.

Experimental results on the PDBBind v2016 and PDBBind v2020 datasets show that PSICHIC outperforms other sequence-based methods, such as TransCPI, MolTrans, and DrugBAN, on multiple metrics.

Illustration: Statistical summary of protein-ligand binding affinity prediction performance on the PDBBind v2016 and PDBBind v2020 benchmarks.

(Source: Paper)

Specifically:

  • PSICHIC shows lower prediction error and higher correlation with measured affinities, indicating better prediction accuracy and generalization ability.
  • PSICHIC achieves up to 96% accuracy in functional effect prediction.
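
The article does not list the exact metrics, but "prediction error", "correlation" and "accuracy" are commonly computed as root-mean-square error, Pearson correlation, and classification accuracy. The sketch below shows those three quantities on made-up toy values. The numbers, the label names, and the function names are illustrative assumptions, not results or code from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def accuracy(labels_true, labels_pred):
    """Fraction of functional-effect labels predicted correctly."""
    return sum(t == p for t, p in zip(labels_true, labels_pred)) / len(labels_true)

# Toy values, invented for illustration only (not data from the paper).
measured = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]
effects_true = ["agonist", "antagonist", "agonist", "non-binder"]
effects_pred = ["agonist", "antagonist", "antagonist", "non-binder"]

print(f"RMSE     = {rmse(measured, predicted):.3f}")
print(f"Pearson  = {pearson_r(measured, predicted):.3f}")
print(f"Accuracy = {accuracy(effects_true, effects_pred):.2f}")
```

Lower RMSE and higher Pearson correlation both indicate affinity predictions closer to the measured values, while accuracy applies to the categorical functional-effect task.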

Also:

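  • PSICHIC performs well at identifying binding sites and key ligand functional groups.
  • In analyses of several protein-ligand complex structures (such as PDB 6K1S and 6OXV), PSICHIC accurately located important binding residues and ligand functional groups, supporting its ability to decode protein-ligand interaction patterns directly from sequence data.
  • This ability is most evident in its prediction of protein-ligand binding sites and key residues from sequence data.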

Illustration: Virtual screening using interaction fingerprints. (Source: Paper)

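Interestingly, PSICHIC's interpretable fingerprints show that it has acquired the ability to decode the underlying mechanisms of protein-ligand interactions from sequence data alone, identifying the binding-site protein residues and the ligand atoms involved, even though it was trained only on sequence data with binding affinity labels and no interaction information.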
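
As a rough illustration of how an interpretable fingerprint can point to candidate binding residues, the sketch below ranks residues by a per-residue importance score and returns the top few. The sequence, the scores, and the helper `top_k_residues` are hypothetical and invented for this example; this is not PSICHIC's code or output format.

```python
def top_k_residues(sequence, scores, k=3):
    """Return the k highest-scoring residues as (position, residue, score).

    Positions are 1-based, following the usual residue numbering convention.
    """
    if len(sequence) != len(scores):
        raise ValueError("need exactly one score per residue")
    # Sort residue indices by descending importance score.
    ranked = sorted(range(len(sequence)), key=lambda i: scores[i], reverse=True)
    return [(i + 1, sequence[i], scores[i]) for i in ranked[:k]]

# Hypothetical toy sequence and importance scores (not from the paper).
sequence = "MKTAYIAK"
scores = [0.01, 0.05, 0.30, 0.02, 0.40, 0.03, 0.15, 0.04]

for position, residue, score in top_k_residues(sequence, scores):
    print(f"{residue}{position}: {score:.2f}")
```

The same ranking idea would apply to ligand atoms, given one importance score per atom.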

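Value
  1. Protein-ligand interaction fingerprints describe the specific interaction features between a ligand and protein residues.
  2. PSICHIC offers a distinctive way to obtain interpretable interaction fingerprints using sequence data alone.
  3. By incorporating constraints, PSICHIC shows emergent capabilities that reveal protein-ligand interaction mechanisms and effectively predict interaction properties.
  4. PSICHIC removes the need for 3D data, paving the way for robust learning on large-scale sequence databases.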

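Future Outlook

  1. Extend PSICHIC analysis to protein complexes, such as GPCRs in complex with heterotrimeric G proteins.
  2. Explore complex interactions such as allosteric modulation, which would help explain how allosteric ligands modulate orthosteric ligands within a protein target.
  3. PSICHIC has demonstrated robustness and effectiveness across a range of applications and has broad potential for future development.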
