How does the brain process language? Princeton team analyzes Transformer model
Editor | Radish Skin
When processing language, the brain deploys specialized computations to construct meaning from complex linguistic structures. Artificial neural networks based on the Transformer architecture are important tools for natural language processing.
Princeton University researchers explored the functional specialization shared between Transformer models and the human brain during language processing.
Transformers integrate contextual information across words through structured circuit computations. However, current research has focused mainly on the internal representations ("embeddings") that these circuits generate.
The researchers instead analyzed the circuit computations directly: they decomposed these computations into functionally specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, they tested whether these "transformations" account for significant variance in brain activity across the cortical language network.
The study shows that the emergent computations performed by individual, functionally specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
The research was published in "Nature Communications" on June 29, 2024 under the title "Shared functional specialization in transformer-based language models and the human brain".
Language understanding is fundamentally a constructive process. Our brains resolve local dependencies between words, assemble low-level units of language into high-level units of meaning, and ultimately form the narratives we use to make sense of the world.
For example, if the speaker mentions a "secret plan", we implicitly process the relationships between the words in this construction to understand that "secret" modifies "plan". At a higher level, we use the context of the surrounding narrative to understand the meaning of the phrase: what does the plan entail, who is keeping it secret, and who are they keeping it from?
This context may contain hundreds of words spread out over minutes. The human brain is thought to implement these processes through a series of functionally specialized computations that convert speech signals into actionable representations of meaning.
Traditional neuroimaging research uses controlled experimental manipulations to isolate specific linguistic computations and map them onto brain activity. However, this approach struggles to generalize to the complexity of natural language.
In recent years, deep neural networks based on the Transformer architecture have changed the way natural language processing is done. These models are trained on large-scale real-world text corpora through self-supervision, enabling them to build context-sensitive representations of the meaning of each word in long sequences.
Beyond the embedded representations inside the Transformer model, some attention heads implement specific functional specializations, such as resolving the direct object of a verb or tracking noun modifiers.
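As a rough illustration of what this kind of head-level behavior looks like, the sketch below (not code from the paper; the model checkpoint, sentence, and layer and head indices are arbitrary illustrative choices) uses the Hugging Face transformers library to inspect which token a single BERT attention head attends to most strongly.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "She revealed the secret plan to her friend."
inputs = tokenizer(sentence, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

with torch.no_grad():
    attentions = model(**inputs).attentions  # tuple of (1, heads, seq, seq), one per layer

layer, head = 8, 10                          # arbitrary example; the study profiles every head
weights = attentions[layer][0, head]         # (seq, seq) attention matrix for this head

# For each token, show where this head sends most of its attention
# (BERT attention is bidirectional, so the target may come before or after the token)
for i, tok in enumerate(tokens):
    j = int(weights[i].argmax())
    print(f"{tok:>10s} -> {tokens[j]}")
```

Heads whose strongest attention reliably lands on, say, a noun's modifier or a verb's object are the kind of functionally specialized circuits the study relates to brain activity.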
In the current study, the researchers argue that headwise transformations (the functionally specialized contextual computations performed by individual attention heads) can provide a complementary window into language processing in the brain. A neurocomputational theory of natural language processing must ultimately specify how meaning is constructed across words.
The Transformer architecture provides explicit access to candidate mechanisms for quantifying how the meanings of past words are incorporated into the meaning of the current word.
If this kind of computation is an important part of human language processing, then these transformations should provide a good basis for modeling human brain activity during natural language understanding.
The researchers extracted these transformations from the widely studied BERT model and used encoding models to evaluate how well the transformations, along with several other families of language features, predict brain activity during natural language understanding.
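A minimal sketch of how such headwise transformations could be read out of BERT is shown below, assuming the Hugging Face transformers library; the layer index, example sentence, and the exact assembly of per-head vectors are illustrative assumptions rather than the authors' released pipeline.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained(
    "bert-base-uncased", output_attentions=True, output_hidden_states=True
)
model.eval()

inputs = tokenizer("They kept the secret plan from everyone.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

layer = 5
cfg = model.config
n_heads = cfg.num_attention_heads
head_dim = cfg.hidden_size // n_heads

attn = out.attentions[layer]                  # (batch, heads, seq, seq) attention weights
layer_input = out.hidden_states[layer]        # hidden states feeding this encoder layer

# Recompute the value vectors for this layer and split them by head
value = model.encoder.layer[layer].attention.self.value(layer_input)
value = value.view(value.size(0), value.size(1), n_heads, head_dim).transpose(1, 2)

# Headwise "transformations": each head's attention-weighted mixture of value vectors,
# taken before the heads are concatenated and passed through the output projection
headwise = attn @ value                       # (batch, heads, seq, head_dim)
print(headwise.shape)                         # e.g. torch.Size([1, 12, 10, 64])
```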
Illustration: Comparing three classes of language models across cortical language areas. (Source: Paper)
The researchers compared the performance of three classes of language models: classical linguistic features, non-contextual word embeddings (GloVe), and contextual Transformer features (BERT).
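In encoding-model terms, this comparison amounts to fitting a regularized regression from each feature space to the fMRI responses and comparing held-out prediction accuracy. The sketch below uses random placeholder data and toy dimensions purely to show the shape of such a comparison; it is not the study's data or analysis code.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trs, n_voxels = 1000, 500                     # toy fMRI time points and voxels
features = {
    "classical": rng.normal(size=(n_trs, 30)),  # stand-in for syntactic annotations
    "glove": rng.normal(size=(n_trs, 300)),     # non-contextual word embeddings
    "bert": rng.normal(size=(n_trs, 768)),      # contextual Transformer features
}
brain = rng.normal(size=(n_trs, n_voxels))      # stand-in for voxel time series

for name, X in features.items():
    # Chronological split (no shuffling), as is typical for fMRI story data
    X_tr, X_te, y_tr, y_te = train_test_split(X, brain, test_size=0.25, shuffle=False)
    enc = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
    pred = enc.predict(X_te)
    # Encoding performance: mean correlation between predicted and actual voxel responses
    r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(n_voxels)]
    print(f"{name:>10s}: mean voxel correlation = {np.nanmean(r):.3f}")
```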
Illustration: Layer preferences for embeddings and transformations. (Source: Paper)
The researchers found that the transformations perform on par with the embeddings and generally outperform non-contextual embeddings and classical syntactic annotations, indicating that the contextual information extracted from surrounding words is remarkably rich.
In fact, transformations from the model's earlier layers explain more unique variance in brain activity than the embeddings themselves. Finally, the researchers decomposed these transformations into the functionally specialized computations performed by individual attention heads.
Illustration: Correspondence between headwise brain predictions and dependency predictions. (Source: Paper)
The researchers found that particular properties of the heads (for example, their look-back distance) determine the mapping between headwise transformations and cortical language areas. They also found that, for certain language regions, headwise transformations that preferentially encode particular linguistic dependencies better predict brain activity.
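A head's "look-back distance" can be estimated directly from its attention weights; the sketch below is one assumed formalization (attention-weighted distance to preceding tokens), not necessarily the exact definition used in the paper.

```python
import torch

def headwise_lookback(attentions):
    """attentions: tuple of (batch, heads, seq, seq) tensors, one per layer."""
    per_layer = []
    for attn in attentions:
        seq = attn.size(-1)
        idx = torch.arange(seq)
        # Distance from each query position i back to each key position j
        # (attention to later tokens contributes zero "look-back")
        dist = (idx.view(-1, 1) - idx.view(1, -1)).clamp(min=0).float()   # (seq, seq)
        # Attention-weighted distance, averaged over queries and the batch
        per_layer.append((attn * dist).sum(-1).mean(dim=(0, 2)))          # (heads,)
    return torch.stack(per_layer)                                         # (layers, heads)

# e.g. with random attention weights shaped like BERT-base output
fake_attn = tuple(torch.softmax(torch.randn(1, 12, 10, 10), dim=-1) for _ in range(12))
print(headwise_lookback(fake_attn).shape)   # torch.Size([12, 12])
```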
In summary, this study offers a new perspective on understanding human language processing.
Paper link: https://www.nature.com/articles/s41467-024-49173-5