Nature sub-journal: AI predicts protein-ligand interactions from sequences, with up to 96% accuracy

PHPz | Original | 2024-07-11 12:56:20

Editor | Radish Skin

In drug development, it is crucial to determine the binding affinity and functional effect of small molecule ligands on proteins. Current computational methods can predict these protein-ligand interaction properties, but without high-resolution protein structures, accuracy is often lost and functional effects cannot be predicted.

Researchers at Monash University and Griffith University have developed PSICHIC (PhySIcoCHemICal graph neural network), a framework that incorporates physicochemical constraints to decode interaction fingerprints directly from sequence data. This enables PSICHIC to decode the mechanisms behind protein-ligand interactions, achieving state-of-the-art accuracy and interpretability.

Trained on the same protein-ligand pairs without structural data, PSICHIC performed on par with, or even exceeded, leading structure-based methods in binding affinity predictions.

PSICHIC’s interpretable fingerprint identifies protein residues and ligand atoms involved in the interaction and helps reveal the selectivity determinants of protein-ligand interactions.

The study was titled "Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data" and was published in "Nature Machine Intelligence" on June 17, 2024.

Protein-ligand affinity prediction challenge

In drug discovery, it is critical to understand the binding affinity and functional effects of small molecule ligands on proteins, because the selective interaction of a ligand with a specific protein determines the expected effect of the drug.

Current computational methods can predict protein-ligand interaction properties, but without high-resolution protein structures their accuracy often drops, and they struggle to predict functional effects.

Sequence-based methods are cheaper and less resource-intensive (for example, they do not require expensive experimental structure determination), but they often suffer from excessive degrees of freedom in pattern matching. This easily leads to overfitting and limited generalization, creating a performance gap with structure- or complex-based methods.

Physicochemical Graph Neural Network

A research team from Monash University and Griffith University developed PSICHIC (PhySIcoCHemICal graph neural network), a method that follows physicochemical principles to decode protein-ligand interaction fingerprints directly from sequence data. Unlike previous sequence-based models, PSICHIC incorporates physicochemical constraints to achieve state-of-the-art accuracy and interpretability.

As a 2D sequence-based method, PSICHIC generates these constraints by applying clustering algorithms and imposes them on 2D graphs, so that during training it mainly fits the plausible underlying patterns that determine protein-ligand interactions.
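To make the idea of coarsening a graph through clustering more concrete, here is a minimal sketch of generic graph cluster pooling with a hard node-to-cluster assignment. It is not PSICHIC's actual implementation: the `pool_graph` helper, the toy chain graph, the feature values, and the cluster assignment are all hypothetical and chosen only for illustration. Nodes (which could stand for ligand atoms or protein residues) are grouped, their features are averaged per cluster, and two clusters are linked whenever any of their members were linked.

```python
from collections import defaultdict


def pool_graph(features, edges, assignment):
    """Coarsen a graph given a hard node-to-cluster assignment.

    features:   dict mapping node -> list of floats
    edges:      iterable of undirected (node, node) pairs
    assignment: dict mapping node -> cluster id
    Returns (cluster_features, cluster_edges).
    """
    members = defaultdict(list)
    for node, cluster in assignment.items():
        members[cluster].append(node)

    # Mean-pool the node features inside each cluster.
    cluster_features = {}
    for cluster, nodes in members.items():
        dim = len(features[nodes[0]])
        cluster_features[cluster] = [
            sum(features[n][i] for n in nodes) / len(nodes) for i in range(dim)
        ]

    # Two clusters are linked if any of their member nodes were linked.
    cluster_edges = set()
    for u, v in edges:
        cu, cv = assignment[u], assignment[v]
        if cu != cv:
            cluster_edges.add((min(cu, cv), max(cu, cv)))
    return cluster_features, cluster_edges


# Toy 5-node chain 0-1-2-3-4 with made-up 2-dimensional node features.
features = {
    0: [1.0, 0.0],
    1: [3.0, 0.0],
    2: [0.0, 2.0],
    3: [0.0, 4.0],
    4: [0.0, 6.0],
}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
assignment = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}  # hypothetical clustering result

cluster_features, cluster_edges = pool_graph(features, edges, assignment)
print(cluster_features)
print(cluster_edges)
```

In a learned model the assignment would come from the clustering step rather than being fixed by hand. The sketch only shows how a clustering restricts later layers to operate on groups instead of individual nodes.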

Illustration: PSICHIC Overview

(Source: Paper)

Performance Validation and Comparison

After training on the same protein-ligand pairs without structural data, PSICHIC matches or even surpasses state-of-the-art structure-based and complex-based methods in binding affinity prediction.

Experimental results on the PDBBind v2016 and PDBBind v2020 datasets show that PSICHIC outperforms other sequence-based methods, such as TransCPI, MolTrans, and DrugBAN, on multiple metrics.

Illustration: Summary statistics of protein-ligand binding affinity prediction performance on the PDBBind v2016 and PDBBind v2020 benchmarks. (Source: paper)

Specifically:

  • PSICHIC shows lower prediction error and higher correlation between predicted and measured affinities, indicating better accuracy and generalization.
  • PSICHIC achieves up to 96% accuracy in functional effect prediction.
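
For readers unfamiliar with these measures, the sketch below computes three common ones: root mean squared error (RMSE) and the Pearson correlation for affinity regression, and plain accuracy for functional effect classification. The article does not say exactly which metrics the paper reports, so treat this choice as an assumption. All numbers and class labels are made up for illustration and are not results from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def accuracy(labels, predictions):
    """Fraction of predictions that match the reference labels."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


# Made-up binding affinities (for example, on a pKd-like scale).
affinity_true = [5.0, 6.0, 7.0, 8.0]
affinity_pred = [5.5, 5.5, 7.5, 7.5]

# Made-up functional effect labels; the class names are illustrative.
effect_true = ["agonist", "antagonist", "non-binder", "agonist"]
effect_pred = ["agonist", "antagonist", "non-binder", "antagonist"]

print("RMSE:", rmse(affinity_true, affinity_pred))
print("Pearson r:", pearson(affinity_true, affinity_pred))
print("Accuracy:", accuracy(effect_true, effect_pred))
```

Lower RMSE and higher Pearson correlation both indicate better affinity prediction, while accuracy is the kind of measure behind a figure such as 96%.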

Also:

  • PSICHIC excels in the identification of binding sites and key ligand functional groups.
  • In the analysis of multiple protein-ligand complex structures (such as PDB 6K1S and 6OXV), PSICHIC accurately located important binding residues and ligand functional groups, confirming its ability to decode protein-ligand interaction patterns directly from sequence data.
  • This is most evident in its prediction of protein-ligand binding sites and key residues from sequence data.

    Illustration: Virtual screening using interaction fingerprints. (Source: Paper)

Interestingly, PSICHIC’s interpretable fingerprints show that it learns to decode the underlying mechanism of protein-ligand interactions from sequence data alone. It can identify the protein residues at the binding site and the ligand atoms involved, even though it was trained only on sequence data with binding affinity labels and no interaction information.
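
One simple way to read such a fingerprint is to rank residues by their importance score and treat the top few as candidate binding site residues. The sketch below assumes the fingerprint can be reduced to one score per residue. The `top_residues` helper, the sequence, and the scores are hypothetical and do not come from PSICHIC or the paper.

```python
def top_residues(sequence, scores, k=3):
    """Return the k highest-scoring residues as (position, residue, score).

    Positions are 1-based, following the usual convention for protein sequences.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    ranked = sorted(range(len(sequence)), key=lambda i: scores[i], reverse=True)
    return [(i + 1, sequence[i], scores[i]) for i in ranked[:k]]


# Made-up 8-residue sequence with made-up per-residue importance scores.
sequence = "MKTAYIAK"
scores = [0.01, 0.30, 0.05, 0.02, 0.40, 0.03, 0.04, 0.15]

for position, residue, score in top_residues(sequence, scores):
    print(f"{residue}{position}: {score:.2f}")
```

The same ranking could be applied to per-atom scores on the ligand side to highlight candidate functional groups.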

Value
  1. A protein-ligand interaction fingerprint describes the specific interactions between a ligand and protein residues.
  2. PSICHIC provides a unique approach to obtaining interpretable interaction fingerprints using only sequence data.
  3. By incorporating physicochemical constraints, PSICHIC demonstrates emergent capabilities to reveal protein-ligand interaction mechanisms and efficiently predict interaction properties.
  4. PSICHIC eliminates the need for 3D data, paving the way for robust learning on large-scale sequence databases.

Future Outlook

  1. Expand PSICHIC analysis to protein complexes, such as GPCRs complexed with heterotrimeric G proteins.
  2. Explore complex interactions such as allosteric regulation, to help understand how allosteric ligands modulate orthosteric ligands within a protein target.
  3. PSICHIC has proven its robustness and effectiveness in various application fields and has broad potential for future development.
