World's first: Molecular Heart open-sources a new AI algorithm that tackles protein side chain prediction and sequence design

Protein structure and function depend largely on interactions between side chain atoms. Accurate protein side chain prediction (PSCP) is therefore a key step in solving both protein structure prediction and protein design. However, earlier work on protein structure prediction focused mostly on the main chain (backbone), and side chain structure prediction has remained a difficult problem that is not fully solved.

Recently, Xu Jinbo's team at Molecular Heart released AttnPacker, a new deep learning architecture for PSCP that achieves significant improvements in speed, memory efficiency and overall accuracy. According to the team, it is currently the best known side chain structure prediction algorithm, and it is also the world's first AI algorithm that can perform protein side chain prediction and sequence design at the same time.

The paper was published in the Proceedings of the National Academy of Sciences (PNAS), and the pre-trained model, source code and inference scripts have been open-sourced on GitHub.

  • Paper link: https://www.pnas.org/doi/10.1073/pnas.2216438120#supplementary-materials
  • Open source link: https://github.com/MattMcPartlon/AttnPacker

Background

Proteins are folded chains of amino acids, and their structures are divided into the main chain and the side chains. Differences in side chains have a large impact on protein structure and function, especially biological activity. With a clear picture of the side chain structure, scientists can determine the three-dimensional structure of proteins more accurately, analyze protein-protein interactions, and carry out rational protein design. In drug design, this helps scientists find suitable binding sites between drugs and receptors faster and more accurately, and even optimize or design binding sites as needed. In enzyme optimization, scientists can modify the sequence so that multiple side chains take part in the catalytic reaction, giving more efficient and more specific catalysis.

Most current protein structure prediction algorithms focus on the main chain, and side chain structure prediction is still not fully solved. Neither popular structure prediction algorithms such as AlphaFold2 nor algorithms dedicated to side chain prediction, such as DLPacker and RosettaPacker, offer satisfactory accuracy and speed. This in turn limits protein design.

Traditional methods, such as RosettaPacker, rely mainly on energy optimization: they first group the possible side chain conformations into discrete rotamers, and then search over the rotamers of each amino acid to find the combination with the minimum energy. These methods differ primarily in the researcher's choice of rotamer library, energy function, and energy minimization procedure, and their accuracy is limited by the use of search heuristics and discrete sampling. There are also deep learning based side chain prediction methods, such as DLPacker, which formulates PSCP as an image-to-image translation problem and adopts a U-Net model structure. However, their prediction accuracy and speed are still not ideal.
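
The rotamer-based search described above can be illustrated with a toy sketch. This is not RosettaPacker's implementation: the rotamer labels, the energy tables and the exhaustive search are all hypothetical stand-ins chosen for readability, and real packers use heuristics precisely because exhaustive enumeration grows exponentially with the number of residues.

```python
from itertools import product


def pack_side_chains(rotamers, self_energy, pair_energy):
    """Pick one rotamer per residue so that the total energy is minimal.

    rotamers: one list of candidate rotamer labels per residue.
    self_energy(i, r): energy of rotamer r at residue i.
    pair_energy(i, r, j, s): interaction energy of rotamer r at i with s at j.
    """
    best_choice, best_energy = None, float("inf")
    # Enumerate every combination (feasible only for toy-sized inputs).
    for choice in product(*rotamers):
        energy = sum(self_energy(i, r) for i, r in enumerate(choice))
        energy += sum(
            pair_energy(i, choice[i], j, choice[j])
            for i in range(len(choice))
            for j in range(i + 1, len(choice))
        )
        if energy < best_energy:
            best_choice, best_energy = choice, energy
    return best_choice, best_energy


# Hypothetical toy data: two residues with two rotamers each.
rotamers = [["g+", "t"], ["g-", "t"]]
SELF = {(0, "g+"): 1.0, (0, "t"): 0.5, (1, "g-"): 0.2, (1, "t"): 0.4}


def self_energy(i, r):
    return SELF[(i, r)]


def pair_energy(i, r, j, s):
    # Penalize the one combination that would clash in this toy example.
    return 2.0 if (r == "t" and s == "t") else 0.0


choice, energy = pack_side_chains(rotamers, self_energy, pair_energy)
```

The sketch makes the two limitations named above visible: the answer can only be as good as the discrete rotamer list, and the search cost depends on how many combinations are examined.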

Method

AttnPacker is an end-to-end deep learning method for predicting protein side chain coordinates. It models side chain interactions jointly, so the directly predicted side chain structures are more physically plausible, with fewer atomic clashes and more nearly ideal bond lengths and angles.

Specifically, AttnPacker introduces a deep graph transformer architecture that exploits the geometric and relational aspects of PSCP. Inspired by AlphaFold2, Molecular Heart proposes position-aware triangle updates, which refine pairwise features by computing triangle attention and multiplicative updates within a graph-based framework. This approach lets AttnPacker use significantly less memory and support higher-capacity models. Molecular Heart also explores several SE(3)-equivariant attention mechanisms and proposes an equivariant transformer architecture for learning from 3D points.
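
To give a feel for what a triangle multiplicative update computes, here is a heavily simplified sketch of the "outgoing edges" form used in AlphaFold2, where the feature for pair (i, j) is updated from the pairs (i, k) and (j, k). It uses a single scalar channel and omits the learned projections, gating and normalization. The optional `neighbors` argument is an assumption made here to illustrate how restricting the sum to graph neighbors could reduce work; it is not taken from the AttnPacker code.

```python
def triangle_multiply_outgoing(pair, neighbors=None):
    """Return out[i][j] = sum over k of pair[i][k] * pair[j][k].

    pair: n x n nested list of scalar pair features.
    neighbors: optional list of sets; if given, k only ranges over residues
        that are neighbors of both i and j (hypothetical sparsification).
    """
    n = len(pair)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if neighbors is None:
                ks = range(n)
            else:
                ks = neighbors[i] & neighbors[j]
            out[i][j] = sum(pair[i][k] * pair[j][k] for k in ks)
    return out


# Toy symmetric pair features for three residues.
pair = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 3.0],
    [2.0, 3.0, 0.0],
]
dense = triangle_multiply_outgoing(pair)
sparse = triangle_multiply_outgoing(pair, neighbors=[{0, 1}, {0, 1}, {2}])
```

The dense version touches every (i, j, k) triple, which is cubic in the number of residues; limiting k to a neighborhood is one way such an update can be made cheaper.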

The AttnPacker workflow. The protein backbone coordinates and sequence are used as input, and a spatial feature graph and an equivariant basis are derived from the coordinate information. The feature graph is processed by an invariant graph-transformer module and then passed to an equivariant TFN-Transformer, which outputs predicted side chain coordinates, a confidence score for each residue, and optionally a designed sequence. The predicted coordinates are post-processed to remove all steric clashes and ensure idealized geometry.
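
As a rough illustration of deriving spatial features from backbone coordinates, the sketch below computes pairwise distances between alpha-carbon positions and a k-nearest-neighbor graph. The function names, the choice of alpha carbons and the value of k are assumptions for illustration only; the features actually used by AttnPacker are described in the paper and repository.

```python
import math


def pairwise_distances(ca_coords):
    """Distance matrix between residues, one (x, y, z) tuple per residue."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]


def nearest_neighbors(dist, k):
    """For each residue, the indices of its k closest other residues."""
    graph = []
    for i, row in enumerate(dist):
        others = [j for j in range(len(row)) if j != i]
        others.sort(key=lambda j: row[j])
        graph.append(others[:k])
    return graph


# Hypothetical coordinates for three residues (arbitrary units).
ca_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
dist = pairwise_distances(ca_coords)
graph = nearest_neighbors(dist, k=1)
```

Distances do not change when the whole protein is rotated or translated, which is why features of this kind suit the invariant part of such a pipeline, while the coordinate outputs need an equivariant module.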

Results

In terms of prediction performance, AttnPacker improves both accuracy and efficiency on native and non-native backbone structures. At the same time it preserves physical plausibility: deviations from ideal bond lengths and angles are negligible, and very few steric clashes are produced.

Molecular Heart compared AttnPacker with the current state-of-the-art methods SCWRL4, FASPR, RosettaPacker and DLPacker on the CASP13 and CASP14 native and non-native protein backbone data sets. The results show that AttnPacker significantly outperforms traditional protein side chain prediction methods on the CASP13 and CASP14 native backbones, with an average reconstruction RMSD more than 18% lower than that of the next-best method on each test set. AttnPacker also outperforms the deep learning method DLPacker, reducing average RMSD by more than 11% while significantly improving side chain dihedral accuracy. Beyond accuracy, AttnPacker produces significantly fewer atomic clashes than the other methods.
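
For readers unfamiliar with the metric, the sketch below shows how a reconstruction RMSD and a relative reduction between two methods can be computed. It assumes the predicted and reference atoms are listed in the same order and share the same backbone frame, so no superposition step is included. The coordinates and RMSD values are made-up toy numbers, not results from the paper.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between matched lists of (x, y, z) atoms."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("pred and ref must be non-empty and the same length")
    squared = [
        sum((p - r) ** 2 for p, r in zip(a, b))
        for a, b in zip(pred, ref)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, new):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# Toy side chain atoms: each predicted atom is 1 unit from its reference.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
toy_rmsd = rmsd(pred, ref)

# Hypothetical average RMSDs for a baseline method and a new method.
toy_reduction = relative_reduction(baseline=1.0, new=0.82)
```

A statement such as "more than 18% lower" corresponds to `relative_reduction` returning a value above 0.18 when applied to the two methods' average RMSDs.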

Side chain structure prediction results of each algorithm on the CASP13 and CASP14 target proteins when the native backbone structure is given. Asterisks mark average clash values that are lower than those of the native structures (56.0, 5.9, and 0.4 for CASP13; 80.4, 7.9, and 2.5 for CASP14).

On the CASP13 and CASP14 non-native backbones, AttnPacker is again significantly more accurate than the other methods and produces significantly fewer atomic clashes.

Side chain structure prediction results of each algorithm on the CASP13 and CASP14 target proteins when a non-native backbone structure is given. Asterisks mark average clash values that are lower than those of the corresponding native structures (34.6, 2.2, and 0.5 for CASP13; 40.0, 2.7, and 0.7 for CASP14).

AttnPacker abandons the discrete rotamer library and the computationally expensive conformational search and sampling steps, and instead computes all side chain coordinates in parallel directly from the 3D geometry of the main chain. Compared with the deep learning based DLPacker and the traditional RosettaPacker, AttnPacker is far more computationally efficient, reducing inference time by more than 100 times.

Time comparison of different PSCP methods: relative time needed to reconstruct the side chain atoms of all 83 CASP13 target proteins.

AttnPacker performs equally well in protein design. Molecular Heart trained an AttnPacker variant for sequence and side chain co-design that achieves native sequence recovery rates comparable to current state-of-the-art methods while also producing highly accurate side chain packing. Validation with Rosetta simulations shows that structures designed by AttnPacker generally have sub-native (lower) Rosetta energies.
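
Native sequence recovery, the design metric mentioned above, is commonly computed as the fraction of positions where the designed residue matches the native one. The sketch below assumes two equal-length, already aligned one-letter sequences; the sequences shown are invented for illustration.

```python
def sequence_recovery(native, designed):
    """Fraction of positions at which the designed residue equals the native one."""
    if len(native) != len(designed) or not native:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(1 for a, b in zip(native, designed) if a == b)
    return matches / len(native)


# Hypothetical four-residue example with one mismatch at the last position.
recovery = sequence_recovery("ACDE", "ACDF")
```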

ESMFold scTM and pLDDT metrics were used to compare native protein sequences with sequences generated by AttnPacker in order to assess generation quality; the results show a strong correlation.

Beyond its accuracy and efficiency, AttnPacker has a very practical advantage: it is easy to use. AttnPacker needs only a protein structure file to run. In contrast, OPUS-Rota4 (28) requires a voxel representation of the atomic environment from DLPacker, logits and secondary structure from trRosetta100, and constraint files from the OPUS-CM output. In addition, because AttnPacker predicts side chain coordinates directly, its output is fully differentiable, which helps downstream tasks such as optimization or protein-protein interaction prediction. "The advantages of good prediction performance, high efficiency and ease of use are conducive to the widespread use of AttnPacker in research and industry," Professor Xu Jinbo said.

Summary

1. AttnPacker is an SE(3)-equivariant model that directly predicts sequence and side chain coordinates. It can be used for protein side chain structure prediction and also for protein sequence design, which makes it a pioneering work.

2. AttnPacker is more accurate than the other methods tested, far more efficient, and very easy to use.

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.