


World's first: Molecular Heart open-sources a new AI algorithm that tackles protein side chain prediction and sequence design
The formation of protein structure and function depends largely on interactions between side chain atoms. Accurate protein side chain prediction (PSCP) is therefore a key step in solving both protein structure prediction and protein design. However, previous protein structure prediction work has mostly focused on the main chain (backbone), and side chain structure prediction has remained a difficult problem that is not completely solved.
Recently, Xu Jinbo’s team at Molecular Heart introduced AttnPacker, a new deep learning architecture for PSCP that achieves significant improvements in speed, memory efficiency and overall accuracy. It is currently the best known side chain structure prediction algorithm, and it is also the world's first AI algorithm that can simultaneously predict protein side chains and design sequences.
The paper was published in the Proceedings of the National Academy of Sciences (PNAS), and the pre-trained model, source code and inference scripts have been open sourced on GitHub.
- Paper link: https://www.pnas.org/doi/10.1073/pnas.2216438120#supplementary-materials
- Open source link: https://github.com/MattMcPartlon/AttnPacker
Proteins fold from chains of amino acids, and their structure is divided into the main chain (backbone) and the side chains. Differences in side chains have a large impact on protein structure and function, especially biological activity. With a clear understanding of side chain structure, scientists can determine the three-dimensional structure of proteins more accurately, analyze protein-protein interactions, and carry out rational protein design. In drug design, this helps scientists find suitable binding sites between drugs and receptors more quickly and accurately, and even optimize or design binding sites as needed. In enzyme optimization, scientists can modify sequences so that multiple side chains participate in the catalytic reaction, achieving more efficient and more specific catalysis.
Most current protein structure prediction algorithms focus on the backbone, and side chain structure prediction remains a problem that has not been fully overcome. Neither popular structure prediction algorithms such as AlphaFold2 nor dedicated side chain predictors such as DLPacker and RosettaPacker deliver satisfactory accuracy and speed, which in turn limits protein design.
Traditional methods, such as RosettaPacker, mainly rely on energy optimization: they first group the distribution of side chain atoms into discrete conformations (rotamers), and then search over these groupings for each amino acid to find the minimum-energy combination. These methods differ primarily in the choice of rotamer library, energy function and energy minimization procedure, and their accuracy is limited by the use of search heuristics and discrete sampling. There are also deep learning-based side chain prediction methods, such as DLPacker, which formulates PSCP as an image-to-image translation problem and adopts a U-net model structure. However, their prediction accuracy and speed are still not ideal.
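The rotamer-search idea behind traditional packers can be illustrated with a toy sketch. Everything in it is hypothetical: the tiny rotamer library, the clash-only energy function and the exhaustive search are stand-ins chosen for illustration, not RosettaPacker's actual library, scoring function or search heuristics.

```python
import itertools
import math

# Hypothetical toy rotamer library: residue type -> candidate side chain
# centroid offsets relative to a backbone anchor. Real libraries store
# sets of chi angles, not offsets.
ROTAMERS = {
    "SER": [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    "LEU": [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)],
}


def place(anchor, offset):
    """Position a side chain centroid by adding the offset to the anchor."""
    return tuple(a + o for a, o in zip(anchor, offset))


def pair_energy(p, q, contact=3.0):
    """Toy energy: quadratic penalty when two centroids are closer than `contact`."""
    d = math.dist(p, q)
    return (contact - d) ** 2 if d < contact else 0.0


def pack(residues):
    """Try every rotamer combination and return (lowest energy, chosen offsets).

    residues: list of (residue_type, anchor_xyz) pairs.
    """
    choices = [ROTAMERS[name] for name, _ in residues]
    best_energy, best_combo = math.inf, None
    for combo in itertools.product(*choices):
        coords = [place(anchor, off) for (_, anchor), off in zip(residues, combo)]
        energy = sum(pair_energy(p, q) for p, q in itertools.combinations(coords, 2))
        if energy < best_energy:
            best_energy, best_combo = energy, combo
    return best_energy, best_combo
```

The number of combinations grows exponentially with the number of residues, which is why practical packers fall back on search heuristics and why the result can only be as good as the discrete library allows.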
Method
AttnPacker is an end-to-end deep learning method for predicting protein side chain coordinates. It models side chain interactions jointly, so the directly predicted side chain structures are more physically plausible, with fewer atomic clashes and more ideal bond lengths and angles.
Specifically, AttnPacker introduces a deep graph transformer architecture that exploits the geometric and relational aspects of PSCP. Inspired by AlphaFold2, Molecular Heart proposes position-aware triangle updates that refine pairwise features, using a graph-based framework to compute triangle attention and multiplicative updates. With this approach, AttnPacker uses significantly less memory and supports higher-capacity models. Furthermore, Molecular Heart explores several SE(3)-equivariant attention mechanisms and proposes an equivariant transformer architecture for learning from 3D points.
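As a rough illustration of the multiplicative half of a triangle update, here is a minimal sketch of the "outgoing edges" aggregation from AlphaFold2, which the article names as the inspiration. It is deliberately simplified: the learned projections, gating and layer normalization are omitted, and it is dense rather than graph-based, so it does not reproduce AttnPacker's position-aware variant or its memory savings. The function name is hypothetical.

```python
def triangle_multiply_outgoing(z):
    """Aggregate pair features over the third residue of each triangle.

    z is an n x n x c nested list of pair features. The result u satisfies
    u[i][j][ch] = sum over k of z[i][k][ch] * z[j][k][ch],
    so the feature for pair (i, j) is informed by edges (i, k) and (j, k).
    Simplification (assumption): z is used directly in place of the two
    learned projections of AlphaFold2.
    """
    n = len(z)
    c = len(z[0][0])
    return [
        [
            [sum(z[i][k][ch] * z[j][k][ch] for k in range(n)) for ch in range(c)]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

This dense form visits every (i, j, k) triple, so its cost grows cubically with the number of residues; restricting k to a neighbourhood of i and j is one way a graph-based formulation could reduce that cost.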
AttnPacker workflow. The protein backbone coordinates and sequence are used as input, and a spatial feature map and an equivariant basis are derived from the coordinate information. The feature map is processed by the invariant graph-transformer module and then passed to an equivariant TFN-Transformer, which outputs predicted side chain coordinates, a confidence score for each residue, and optionally a designed sequence. The predicted coordinates are post-processed to remove all steric clashes and ensure idealized geometry.
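To make "a spatial feature map derived from the coordinates" more concrete, the sketch below builds one common kind of spatial feature: a pairwise distance map over one representative atom per residue, plus a nearest-neighbour list that could define graph edges. This is an assumption for illustration only; the article does not say which features AttnPacker actually computes, and both function names are hypothetical.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between residues (one 3D point each)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def nearest_neighbors(dmap, k):
    """For each residue, the indices of its k closest other residues."""
    neighbors = []
    for i, row in enumerate(dmap):
        others = sorted((j for j in range(len(row)) if j != i), key=lambda j: row[j])
        neighbors.append(others[:k])
    return neighbors
```

Distances are unchanged by rotating or translating the whole structure, which is why features of this kind suit an invariant module, while the coordinates themselves are left to the equivariant part of the network.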
Results
In terms of prediction performance, AttnPacker improves both accuracy and efficiency on native and non-native backbone structures. Physical feasibility is also ensured: deviations from ideal bond lengths and angles are negligible, and steric clashes are minimal.
Molecular Heart compared AttnPacker with the current state-of-the-art methods SCWRL4, FASPR, RosettaPacker and DLPacker on the CASP13 and CASP14 native and non-native protein backbone data sets. The results show that AttnPacker significantly outperforms traditional protein side chain prediction methods on the CASP13 and CASP14 native backbones, with an average reconstruction RMSD more than 18% lower than that of the next-best method on each test set. AttnPacker also outperforms the deep learning method DLPacker, reducing average RMSD by more than 11% while significantly improving side chain dihedral accuracy. In addition to being more accurate, AttnPacker produces significantly fewer atomic clashes than the other methods.
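For readers unfamiliar with the metric, the sketch below shows how a reconstruction RMSD and a relative reduction such as "18% lower" are computed. It assumes the predicted and reference atoms are listed in the same order and share the same backbone frame, so no superposition is needed; it is not the paper's evaluation script, and the function names are hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between matched atom coordinates."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("pred and ref must be non-empty and of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def relative_reduction(baseline, new):
    """Fractional improvement of `new` over `baseline` (0.18 means 18% lower)."""
    return (baseline - new) / baseline
```

In symbols, for N matched atoms, RMSD = sqrt((1/N) * sum of |x_i - y_i|^2), and a method with RMSD b is "18% lower" than a baseline a when (a - b) / a = 0.18.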
Side chain structure prediction results of each algorithm on CASP13 and CASP14 target proteins, given the native backbone structure. Asterisks indicate average clash values lower than those of the native structures: 56.0, 5.9 and 0.4 for CASP13, and 80.4, 7.9 and 2.5 for CASP14.
On the CASP13 and CASP14 non-native backbones, AttnPacker is also significantly better than the other methods and again produces significantly fewer atomic clashes.
Side chain structure prediction results of each algorithm on CASP13 and CASP14 target proteins, given the non-native backbone structure. Asterisks indicate average clash values lower than those of the corresponding native structures: 34.6, 2.2 and 0.5 for CASP13, and 40.0, 2.7 and 0.7 for CASP14.
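A clash count of the kind reported above can be sketched as follows. The definition used here is an assumption, since the article does not state one: two atoms from different residues clash when their distance is below a tolerance fraction of the sum of their van der Waals radii. The radii are commonly quoted approximate values, and the function name is hypothetical.

```python
import itertools
import math

# Approximate van der Waals radii in angstroms.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def count_clashes(atoms, tolerance=1.0):
    """Count atom pairs closer than tolerance * (sum of their vdW radii).

    atoms: list of (element, (x, y, z), residue_index) tuples.
    Simplification: every pair within one residue is skipped, but bonded
    atoms in neighbouring residues (the peptide bond) are not excluded.
    """
    clashes = 0
    for (e1, p1, r1), (e2, p2, r2) in itertools.combinations(atoms, 2):
        if r1 == r2:
            continue
        limit = tolerance * (VDW_RADII[e1] + VDW_RADII[e2])
        if math.dist(p1, p2) < limit:
            clashes += 1
    return clashes
```

Lowering `tolerance` counts only more severe overlaps, so the same structure yields smaller clash counts at stricter thresholds.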
AttnPacker abandons the discrete rotamer library and the computationally expensive conformational search and sampling steps. Instead, it uses the 3D geometry of the backbone directly to compute all side chain coordinates in parallel. Compared with the deep learning-based DLPacker and the traditional computation-based RosettaPacker, AttnPacker is significantly more efficient, reducing inference time by more than 100 times.
Time comparison of different PSCP methods: relative time to reconstruct the side chain atoms of all 83 CASP13 target proteins.
AttnPacker performs equally well in protein design. Molecular Heart trained an AttnPacker variant for co-design that achieves native sequence recovery rates comparable to current state-of-the-art methods while also producing highly accurate side chain packing. Validation with Rosetta simulations shows that AttnPacker-designed structures generally have sub-native (lower) Rosetta energies.
ESMFold scTM and pLDDT metrics were used to compare native protein sequences with the sequences generated by AttnPacker in order to evaluate generation quality; the results showed a strong correlation.
Beyond its effectiveness and efficiency, AttnPacker also has considerable practical value: it is very easy to use. AttnPacker requires only a protein structure file to run. In contrast, OPUS-Rota4 (28) requires a voxel representation of the atomic environment from DLPacker, logits, secondary structure from trRosetta100, and constraint files from the OPUS-CM output. Additionally, since AttnPacker predicts side chain coordinates directly, its output is fully differentiable, which facilitates downstream tasks such as optimization or protein-protein interaction prediction.
"The advantages of good prediction effect, high efficiency and ease of use are conducive to the widespread use of AttnPacker in research and industrial fields," Professor Xu Jinbo said.
Summary
1. AttnPacker is an SE(3)-equivariant model that directly predicts sequence and side chain coordinates. It can be used for protein side chain structure prediction as well as protein sequence design, and is a pioneering work.
2. AttnPacker is more accurate than other methods, much more efficient, and extremely easy to use.