Home >Technology peripherals >AI >Your friends are watching too! Google STUDY algorithm supports book list recommendation system to make students fall in love with reading
We have always known that opening books is beneficial. Reading can help people improve their language skills and learn new skills....
Reading can also improve mood and mental health. People who read regularly have greater general knowledge and a deeper understanding of other cultures.
Moreover, studies have confirmed that reading for pleasure is related to academic success.
But in the era of information explosion, there are abundant online and offline reading resources. What to read becomes a difficult challenge.
In particular, the reading content must match different age groups and be engaging.
The recommendation system is the solution to this challenge. It presents readers with relevant reading material and helps them stay interested.
The core of the recommendation system is machine learning (ML), which is widely used to build various types of recommendation systems: from videos to books to e-commerce platforms wait.
The trained ML model can improve user experience by making recommendations to each user individually based on user preferences, user engagement, and recommended items.
Google’s latest research proposes an audiobook content recommendation system that takes into account the social nature of reading (such as educational environments): the STUDY algorithm.
Because what a person’s peers are currently reading can have a significant impact on what they are interested in reading, Google has partnered with Learning Ally.
Learning Ally is an educational nonprofit with a large digital library of curated audiobooks for students, perfect for building social recommendation models.
This enables the model to benefit from real-time information about students’ localized social groups (such as classrooms).
The STUDY algorithm adopts a method of modeling the recommended content problem as a click-through rate prediction problem.
Where the simulated user’s interaction probability with each specific item depends on:
1) User and item characteristics
2) The user’s project interaction history sequence.
Previous work has shown that the Transformer model is well suited to modeling this problem.
When each user is treated individually, simulating interaction becomes an autoregressive sequence modeling problem.
The STUDY algorithm is the final product of modeling data through this conceptual framework and then extending this framework.
The click-through rate prediction problem can model the dependencies between individual users’ past and future item preferences, and learn similarity patterns between users during training.
But one problem is that the click-through rate prediction method cannot model the dependencies between different users.
To this end, Google developed the STUDY model, which can solve the shortcomings of autoregressive sequence modeling that cannot model the social nature of reading.
STUDY can connect the sequences of books read by multiple students in a class into one sequence, thereby collecting data from multiple students in one model.
However, this data representation needs to be studied carefully when modeling it with a Transformer.
In Transformer, the attention mask is a matrix that controls which inputs can be used to predict which outputs.
The pattern of using all previous tokens in the sequence to inform the prediction of the output results in an upper triangular attention matrix, which is typically found in causal decoders.
However, since the sequence input into the STUDY model is not in time order, although each of its component subsequences is in time order, the traditional causal decoder is no longer suitable. this sequence.
In trying to predict each token, the model does not allow attention to turn to each token that appears before it in the sequence; some of these tokens may have later timestamps and contain In information that is not available at the time of deployment.
picture
Attention masks commonly used in causal decoders. Each column represents an output, and each column represents an output. A matrix entry with a value of 1 (shown in blue) at a specific position indicates that the model can observe the input for that row when predicting the output of the corresponding column, while a value of 0 (shown in white) indicates the opposite.
The STUDY model is based on a causal transformer that replaces the triangular matrix attention mask with a flexible timestamp-based attention mask, allowing attention across different subsequences.
Compared to ordinary converters, the STUDY model maintains a causal triangle attention matrix within a sequence and has flexible values in different sequences that depend on the timestamp.
Therefore, the prediction for any output point in the sequence will refer to all input points that occurred in the past relative to the current point in time, regardless of whether they occurred before or after the current input point in the sequence.
This causal constraint is important because if this constraint is not enforced during training, the model risks learning to use future information to make predictions, which is not possible in real-world deployments is impossible to achieve.
Picture
(a) A sequential autoregressive transformer with causal attention that can handle each user individually ; (b) An equivalent joint forward pass, which computes the same as (a); (c) allows information to flow between users by introducing new non-zero values in the attention mask (shown in purple). To do this, we allowed predictions to be conditional on all interactions with earlier timestamps, regardless of whether the interactions were from the same user
Google uses the Learning Ally dataset to train the STUDY model and uses multiple baselines for comparison.
The team used an autoregressive CTR decoder (called "individual"), a k-nearest neighbor baseline (KNN), and a comparable social baseline - Social Attention Memory Network (SAMN).
They used data from the first academic year for training and data from the second academic year for validation and testing.
The team evaluates these models by measuring the percentage of time the next item the user actually interacts with is within the model's top n suggestions.
In addition to evaluating the model on the entire test set, the team also reports the model's score on two subsets of the test set that are more accurate than the entire dataset. challenge.
It can be observed that students usually interact with audiobooks multiple times, so simply recommending the last book a user read is trivial.
Therefore, the researchers call the first test subset "non-continuation". In this subset, we only examine each model when students interact with students who are different from the previous interaction. Recommendation performance when books are interactive.
In addition, the team also observed that students revisit books they have read in the past, so the books recommended for each student are limited to the books they have read in the past Within, you can achieve good performance on the test set.
Although there may be some value in recommending students their past favorite books, much of the value of recommendation systems comes from recommending new, unknown content to users.
To measure this, the team evaluated the model on a subset of the test set where students interacted with the bibliography for the first time. We name this evaluation subset "new subset".
It can be found that "STUDY" outperforms other models in almost all evaluations.
Picture
The core of the STUDY algorithm is Group users into groups and perform joint inference on multiple users in the same group in a single forward pass of the model.
The researchers examined the importance of actual grouping on model performance through an ablation study.
In the proposed model, the researchers grouped all students in the same grade and school.
We then experimented with groupings defined by all students in the same grade and district, as well as grouping all students into one group and using a random subset on each forward pass.
The researchers also compared these models to “individual” models for reference.
Research has found that using more localized groups is more effective, i.e., school and grade groupings than school district and grade groupings.
This supports the hypothesis that the research model succeeds because activities such as reading are social: people's reading choices are likely to be correlated with the reading choices of those around them.
Both models outperformed the other two models (single group model and individual model) without using grade levels to group students.
This shows that data from users with similar reading levels and interests is beneficial to improving the performance of the model.
Finally, this Google study was limited to modeling a user group that assumes social relationships are homogeneous.
Reference:
https://www.php.cn/link/0b32f1a9efe5edf3dd2f38b0c0052bfe
The above is the detailed content of Your friends are watching too! Google STUDY algorithm supports book list recommendation system to make students fall in love with reading. For more information, please follow other related articles on the PHP Chinese website!