Nature sub-journal, Peking University team’s general AI framework conducts comprehensive structure prediction for protein-protein docking, bridging the gap between experiment and calculation

王林 (Original) | 2024-08-08 02:16:01

Editor | Radish Skin

Protein complex structure prediction plays an important role in drug development, antibody design and other applications. However, because prediction accuracy is limited, predicted structures are often inconsistent with experimental results.

A research team from Peking University, Changping Laboratory, and Harvard University proposed ColabDock, a general framework that uses deep learning structure prediction models to integrate experimental constraints of different forms and sources without further large-scale retraining or fine-tuning.

With AlphaFold2 as the structure prediction model, ColabDock outperforms HADDOCK and ClusPro not only in complex structure prediction with simulated residue-level and surface constraints, but also in predictions assisted by NMR chemical shift perturbation and covalent labeling.

It can also assist antibody-antigen interface prediction using simulated interface-scan constraints.

The study was titled "Integrated structure prediction of protein–protein docking with experimental restraints using ColabDock" and was published in "Nature Machine Intelligence" on August 5, 2024.

Protein docking provides important structural information for understanding biological mechanisms. Although deep learning models have developed rapidly in protein structure prediction, most of them predict in a free-docking manner, which can leave the predicted structures inconsistent with experimental constraints.

To solve this problem, research teams from Peking University, Changping Laboratory and other institutions proposed ColabDock, a general protein-protein docking framework that predicts complex conformations under the guidance of sparse experimental constraints.

Through gradient backpropagation, the method integrates the priors from experimental constraints with the energy landscape of a data-driven protein structure prediction model, and automatically searches for conformations that satisfy both while tolerating conflicts or ambiguities in the constraints.
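The idea of steering a prediction with gradients from a constraint term can be illustrated with a toy sketch. This is not ColabDock's code: the real framework backpropagates through AlphaFold2 to the input sequence profile, whereas here the network is replaced by a bare softmax over a handful of distance bins for a single residue pair. The bin count, the mask, the learning rate and all function names are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def constraint_loss_and_grad(logits, satisfied_bins):
    """Loss = -log(probability mass in bins that satisfy the constraint).

    Returns the loss and its analytic gradient with respect to the logits.
    """
    probs = softmax(logits)
    mass = sum(p for p, ok in zip(probs, satisfied_bins) if ok)
    loss = -math.log(mass)
    # For a softmax followed by a masked sum S: d(-log S)/dz_k = p_k - m_k * p_k / S
    grad = [p - (p / mass if ok else 0.0) for p, ok in zip(probs, satisfied_bins)]
    return loss, grad

def optimize(logits, satisfied_bins, steps=200, lr=0.5):
    """Plain gradient descent on the logits; returns (final_logits, loss_history)."""
    logits = list(logits)
    history = []
    for _ in range(steps):
        loss, grad = constraint_loss_and_grad(logits, satisfied_bins)
        history.append(loss)
        logits = [z - lr * g for z, g in zip(logits, grad)]
    return logits, history

# Hypothetical setup: 8 distance bins, only the first 3 lie under the constraint threshold.
mask = [True, True, True, False, False, False, False, False]
# Start from a prediction that favours long distances, so the constraint is violated.
start = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 3.0]

final_logits, losses = optimize(start, mask)
final_mass = sum(p for p, ok in zip(softmax(final_logits), mask) if ok)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}, satisfied mass: {final_mass:.3f}")
```

In the real setting the same kind of loss would be combined with the model's own confidence terms, so the search is pulled toward conformations that the model finds plausible and that also respect the experiment.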

ColabDock can leverage different forms and sources of experimental constraints without further extensive retraining or fine-tuning.

Illustration: ColabDock’s workflow. (Source: paper)

The framework contains two stages: a generation stage and a prediction stage.

In the generation stage, ColabDock uses ColabDesign, a protein design framework built on AlphaFold2. The input sequence profile is optimized in logit space to guide the structure prediction model toward complex structures that agree with the given experimental constraints and templates while optimizing the pLDDT and pAE confidence measures.

In the prediction stage, the structure is predicted from the generated complex structure and the given template. For each target, ColabDock performs multiple runs and generates different conformations. The final conformation is selected by a ranking support vector machine (SVM) algorithm.
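How a ranking model picks one conformation out of several runs can be sketched as follows. The sketch covers only the inference step and assumes a linear model, where ranking reduces to scoring each candidate with a weight vector and sorting. Training on pairwise preferences is omitted. The feature set, the weights and the run values are invented for illustration and are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Conformation:
    name: str
    # Hypothetical features: (mean pLDDT / 100, 1 - normalized interface pAE,
    # fraction of constraints satisfied). Higher is better for all three.
    features: tuple

def linear_score(weights, features):
    """Dot product of the weight vector and the feature vector."""
    return sum(w * x for w, x in zip(weights, features))

def rank_conformations(conformations, weights):
    """Return the conformations sorted from best to worst score."""
    return sorted(
        conformations,
        key=lambda c: linear_score(weights, c.features),
        reverse=True,
    )

# Made-up weights standing in for a trained ranking model.
WEIGHTS = (0.5, 1.0, 2.0)

runs = [
    Conformation("run_1", (0.82, 0.60, 0.50)),
    Conformation("run_2", (0.78, 0.75, 1.00)),
    Conformation("run_3", (0.90, 0.40, 0.25)),
]

best = rank_conformations(runs, WEIGHTS)[0]
print("selected:", best.name)
```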

Robust performance

As a proof of concept, the researchers adopted AlphaFold2 as the structure prediction model in ColabDock. Other data-driven deep learning models, such as RoseTTAFold2 and AF-Multimer, could also be used here.

The researchers tested ColabDock on synthetic data sets and several types of experimental constraints, including NMR chemical shift perturbation (CSP), covalent labeling (CL), and simulated deep mutational scanning (DMS).

Illustration: ColabDock’s performance on the validation set. (Source: paper)

The researchers evaluated two types of constraints, 1v1 and MvN. The former act at the residue-residue level, for example constraints from cross-linking mass spectrometry (XL-MS). The latter act at the interface level and are relevant to NMR and CL experiments.
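To make the two constraint levels concrete, the sketch below checks them against a set of coordinates. The article does not spell out the exact definitions, so the sketch assumes one plausible reading: a 1v1 constraint holds when two given residues are closer than a threshold, and an MvN constraint holds when at least a minimum number of residues from one set are each close to some residue in another set. The 8 Å default, the coordinates and the function names are all hypothetical.

```python
import math

def satisfies_1v1(coords, res_a, res_b, threshold=8.0):
    """True if the two residues are closer than `threshold` angstroms."""
    return math.dist(coords[res_a], coords[res_b]) < threshold

def satisfies_1vN(coords, res, candidates, threshold=8.0):
    """True if `res` is close to at least one residue in `candidates`."""
    return any(satisfies_1v1(coords, res, c, threshold) for c in candidates)

def satisfies_MvN(coords, probes, candidates, min_hits, threshold=8.0):
    """True if at least `min_hits` probe residues touch the candidate set."""
    hits = sum(satisfies_1vN(coords, p, candidates, threshold) for p in probes)
    return hits >= min_hits

# Toy coordinates keyed by (chain, residue number); values are invented.
coords = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("A", 11): (3.8, 0.0, 0.0),
    ("A", 50): (30.0, 0.0, 0.0),
    ("B", 5): (0.0, 6.0, 0.0),
    ("B", 6): (3.8, 6.0, 0.0),
}

# Residue-level check, as a cross-link would provide.
print(satisfies_1v1(coords, ("A", 10), ("B", 5)))

# Interface-level check: which residues pair up is unknown, only the sets are.
probes = [("A", 10), ("A", 11), ("A", 50)]
partners = [("B", 5), ("B", 6)]
print(satisfies_MvN(coords, probes, partners, min_hits=2))
```

The interface-level form is the looser of the two: it tolerates some probe residues being far from the partner chain, which matches the ambiguity typical of perturbation or labeling data.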

Test results on synthetic data sets show that ColabDock achieves satisfactory performance. Furthermore, as expected, the performance of ColabDock improves as the number of constraints increases.

Even with few constraints, ColabDock outperforms AF-Multimer on the benchmark set under the same framework settings. It also converges to fewer conformations when more constraints are provided, which shows that it makes effective use of the additional information.

Illustration: Comparing ColabDock, HADDOCK and ClusPro on the benchmark set. (Source: paper)

Compared with HADDOCK and ClusPro, ColabDock performs better when the constraint quality is higher. On both experimental datasets, ColabDock still outperforms HADDOCK and ClusPro regardless of the number and quality of constraints provided.

Illustration: Performance and constraint analysis of ColabDock on CSP set. (Source: paper)

Finally, the researchers evaluated the performance of different docking methods on the antibody-antigen data set. ColabDock predicted a much higher proportion of medium or higher quality structures than HADDOCK and ClusPro.

Illustration: Comparison of ColabDock, HADDOCK and ClusPro on the antibody-antigen benchmark set. (Source: paper)

This shows that ColabDock has potential application value in antibody design. Moreover, ColabDock still shows comparable or even better performance than AF-Multimer on the newly released unbiased dataset.

Limitations and Conclusion

ColabDock also has some limitations. Currently, it only accepts constraint distances smaller than 22 Å, a limit set by the upper bound of the distance map (distogram) in AlphaFold2. As a result, the model is applicable to only a small subset of XL-MS cross-linking reagents.
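The origin of the roughly 22 Å ceiling can be seen from how a binned distance map works. The sketch below uses the distogram settings of the open-source AlphaFold2 configuration as recalled here (64 bins, first edge at 2.3125 Å, last edge at 21.6875 Å). These numbers do not appear in the article and should be treated as assumptions, as should the helper names.

```python
# Assumed distogram settings (not stated in the article).
FIRST_BREAK = 2.3125   # angstroms
LAST_BREAK = 21.6875   # angstroms
NUM_BINS = 64

def distogram_breaks(first=FIRST_BREAK, last=LAST_BREAK, num_bins=NUM_BINS):
    """Return the num_bins - 1 evenly spaced bin edges, in angstroms."""
    n_edges = num_bins - 1
    step = (last - first) / (n_edges - 1)
    return [first + i * step for i in range(n_edges)]

def bin_index(distance, breaks):
    """Bin a distance falls into: the number of edges lying below it."""
    return sum(distance > edge for edge in breaks)

breaks = distogram_breaks()
print("edges:", len(breaks), "bin width:", breaks[1] - breaks[0])

# Everything beyond the last edge collapses into one catch-all bin.
for d in (5.0, 21.0, 25.0, 40.0):
    print(f"{d:5.1f} A -> bin {bin_index(d, breaks)}")
```

Because 25 Å and 40 Å land in the same final bin, a constraint with a cutoff above the last edge cannot be told apart from no constraint at all, which is consistent with the limit described above.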

Without fragment-based optimization, ColabDock can only process complexes of fewer than 1,200 residues on an NVIDIA A100 graphics processing unit (GPU) due to limited memory.

In addition, this method can be very time-consuming, especially for large protein complexes. Using the bfloat16 floating point format version of AlphaFold2 is expected to help save memory and speed up calculations.
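A back-of-the-envelope calculation shows why the floating point format matters at this scale. The sketch below sizes a single residue-pair tensor for a 1,200-residue complex in float32 and in bfloat16. The channel count of 128 is an assumption based on AlphaFold2's pair representation, and the estimate covers one tensor only. Real memory use is much higher because many intermediate activations are kept for backpropagation.

```python
BYTES_PER_ELEMENT = {"float32": 4, "bfloat16": 2}

def pair_tensor_bytes(num_residues, channels=128, dtype="float32"):
    """Bytes needed for one (L x L x channels) pair tensor."""
    return num_residues * num_residues * channels * BYTES_PER_ELEMENT[dtype]

def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / 2**30

for dtype in ("float32", "bfloat16"):
    size = pair_tensor_bytes(1200, dtype=dtype)
    print(f"{dtype}: {to_gib(size):.2f} GiB per pair tensor")

# Quadratic growth: doubling the complex size quadruples the tensor.
print(pair_tensor_bytes(2400) / pair_tensor_bytes(1200))
```

Halving the bytes per element halves every such tensor, and the quadratic dependence on length explains why large complexes hit the memory ceiling quickly.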

With further iterative optimization by researchers, ColabDock, as a unified framework, is expected to help bridge the gap between experimental and computational protein science.

Paper link: https://www.nature.com/articles/s42256-024-00873-z

