


Salesforce collaborates with MIT researchers to open-source a GPT-4 revision tutorial for delivering more information with fewer words
Automatic summarization has made significant progress in recent years, driven largely by a paradigm shift: where the field once relied on supervised fine-tuning over annotated datasets, it now uses zero-shot prompting of large language models (LLMs) such as GPT-4. With careful prompt design, fine-grained control over summary length, topic, style, and other attributes can be achieved without any additional training.
One aspect, however, is often overlooked: the information density of the summary. In theory, a summary, being a compression of another text, should be denser, that is, carry more information per word, than the source document. Given the high latency of LLM decoding, covering more information with fewer words matters, especially for real-time applications.
Information density remains an open question, though: if a summary contains too little detail, it conveys little; if it packs in too much information without growing longer, it becomes hard to read. Conveying more information within a fixed token budget requires combining abstraction, compression, and fusion.
In recent work, researchers from Salesforce, MIT, and elsewhere set out to find the limit of densification by soliciting human preferences over a set of summaries generated by GPT-4. The approach offers plenty of insight into improving the expressive power of large language models such as GPT-4.
Paper link: https://arxiv.org/pdf/2309.04269.pdf
Dataset address: https://huggingface.co/datasets/griffin/chain_of_density
Specifically, the researchers use the average number of entities per token as a proxy for density (a rough sketch of computing such a metric follows the contribution list below). They first generate an initial, entity-sparse summary, then iteratively identify 1-3 entities missing from the previous summary and fuse them in without increasing the total length, repeating the process five times. Each summary therefore has a higher entity-to-token ratio than the one before it. Based on human preference data, the authors conclude that people prefer summaries that are nearly as dense as human-written ones, and denser than summaries produced by an ordinary GPT-4 prompt. The contributions of the study can be summarized as follows:
- A prompt-based iterative method (CoD, Chain of Density) for increasing the entity density of summaries;
- Manual and automatic evaluation of increasingly dense summaries of CNN/DailyMail articles, to better understand the trade-off between informativeness (favoring more entities) and clarity (favoring fewer entities);
- Open-sourced GPT-4 summaries, annotations, and a set of 5,000 unannotated CoD summaries for evaluation or refinement.
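To make the density metric concrete, here is a minimal sketch of computing entities per token, assuming spaCy's named-entity recognizer as the entity extractor (the paper does not prescribe a particular toolkit, so this choice is purely illustrative):

```python
# Sketch: entity density = entity mentions per token.
# Assumption: spaCy NER as the entity extractor; requires
#   pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def entity_density(summary: str) -> float:
    """Number of entity mentions divided by number of tokens."""
    doc = nlp(summary)
    return len(doc.ents) / len(doc) if len(doc) else 0.0

dense = "Liverpool beat Barcelona 4-0 at Anfield to reach the 2019 Champions League final."
sparse = "A football team won an important match at home and reached the final."
print(entity_density(dense), entity_density(sparse))  # the first sentence scores higher
```

Under this metric, a CoD step succeeds when the rewritten summary names more entities without using more tokens.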
The authors formulate a single Chain of Density (CoD) prompt that generates an initial summary and then makes it increasingly entity-dense. Specifically, over a fixed number of iterations, a set of unique salient entities from the source text is identified and fused into the previous summary without increasing its length.
An example prompt and output are shown in Figure 2 (a minimal sketch of driving such a densification loop programmatically appears after the list below). The authors do not explicitly prescribe entity types, but define a missing entity as:
- Relevant: related to the main story;
- Specific: descriptive yet concise (five words or fewer);
- Unique: not mentioned in previous summaries;
- Faithful: present in the article;
- Anywhere: located anywhere in the article.
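The following is a minimal sketch of how such a densification loop could be driven programmatically. Note two assumptions: the paper's actual CoD prompt (Figure 2) elicits all five summaries in a single GPT-4 completion, whereas this sketch loops explicitly, and the prompt wording here is paraphrased rather than copied from the paper.

```python
# Sketch of an explicit Chain of Density loop (an assumption: the paper's actual
# prompt requests all 5 steps in one completion). Requires the `openai` package
# and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

DENSIFY = (
    "Identify 1-3 informative entities from the article that are missing from the "
    "previous summary. Write a new summary of identical length that covers every "
    "entity from the previous summary plus the missing ones, making space through "
    "fusion, compression, and removal of uninformative phrases."
)

def chain_of_density(article: str, steps: int = 5) -> list[str]:
    summaries: list[str] = []
    for step in range(steps):
        if step == 0:
            instruction = "Write an initial, entity-sparse summary of about 70 words."
        else:
            instruction = f"Previous summary:\n{summaries[-1]}\n\n{DENSIFY}"
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": f"Article:\n{article}\n\n{instruction}"}],
        )
        summaries.append(resp.choices[0].message.content.strip())
    return summaries
```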
The authors randomly selected 100 articles from the CNN/DailyMail summarization test set and generated CoD summaries for them. For reference, they compared CoD summary statistics against human-written bullet-point reference summaries and against summaries generated by GPT-4 with a vanilla prompt: "Write a very short summary of the article. No more than 70 words."
The authors analyze the summaries along two axes: direct statistics and indirect statistics. Direct statistics (tokens, entities, entity density) are directly controlled by CoD, while indirect statistics are expected by-products of densification.
Direct statistics. As Table 1 shows, the second step reduces length by an average of 5 tokens (from 72 to 67) as unnecessary words are removed from the initially verbose summary. Entity density starts at 0.089, initially below both the human-written and vanilla GPT-4 summaries (0.151 and 0.122), and rises to 0.167 after 5 densification steps.

Indirect statistics. Abstraction should increase with each CoD step, since the summary is repeatedly rewritten to make room for each additional entity. The authors measure abstraction via extractive density: the average squared length of extractive fragments (Grusky et al., 2018). Likewise, fusion should increase monotonically as entities are added to a fixed-length summary; the authors quantify fusion as the average number of source sentences aligned to each summary sentence. For alignment they use the relative ROUGE gain method (Zhou et al., 2018), which keeps aligning source sentences to a target sentence until the relative ROUGE gain of adding another sentence is no longer positive. They also expected shifts in content distribution, that is, the positions within the article from which the summary content is drawn.
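As a rough illustration of the extractive-density measure mentioned above, the sketch below greedily finds fragments shared between the summary and the article and averages their squared lengths. Whitespace tokenization and this particular greedy matching are simplifications, not the authors' exact implementation of Grusky et al. (2018).

```python
# Sketch: extractive density = mean squared length of extractive fragments
# (Grusky et al., 2018). Greedy longest-match over whitespace tokens is a
# simplification for illustration.
def extractive_fragments(article: str, summary: str) -> list[list[str]]:
    a, s = article.lower().split(), summary.lower().split()
    fragments, i = [], 0
    while i < len(s):
        best: list[str] = []
        for j in range(len(a)):
            k = 0
            while i + k < len(s) and j + k < len(a) and s[i + k] == a[j + k]:
                k += 1
            if k > len(best):
                best = s[i:i + k]
        if best:                      # longest article fragment starting at summary position i
            fragments.append(best)
            i += len(best)
        else:                         # token never appears in the article -> abstractive
            i += 1
    return fragments

def extractive_density(article: str, summary: str) -> float:
    n = len(summary.split())
    return sum(len(f) ** 2 for f in extractive_fragments(article, summary)) / n if n else 0.0
```

Lower values indicate a more abstractive summary, which is why this statistic should fall as the CoD rewrites progress.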
Specifically, the authors expected CoD summaries to initially exhibit a strong lead bias and then gradually pull in entities from the middle and end of the article. To measure this, they reuse the alignments computed for fusion and record the average sentence rank of all aligned source sentences.
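A possible implementation of that alignment and the two statistics built on it (fusion and average sentence rank) is sketched below; the `rouge_score` package and the pre-split sentence lists are tooling assumptions, not the authors' exact setup.

```python
# Sketch: greedy alignment that keeps adding the source sentence with the largest
# ROUGE-1 F1 gain until the gain is no longer positive (after Zhou et al., 2018).
# Requires: pip install rouge-score
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

def align(summary_sentence: str, source_sentences: list[str]) -> list[int]:
    chosen: list[int] = []
    best_f1 = 0.0
    while True:
        candidates = []
        for i, _ in enumerate(source_sentences):
            if i in chosen:
                continue
            text = " ".join(source_sentences[j] for j in chosen + [i])
            f1 = scorer.score(summary_sentence, text)["rouge1"].fmeasure
            candidates.append((f1 - best_f1, i, f1))
        if not candidates:
            break
        gain, idx, f1 = max(candidates)
        if gain <= 0:
            break
        chosen.append(idx)
        best_f1 = f1
    return chosen

def fusion(summary_sents: list[str], source_sents: list[str]) -> float:
    """Average number of source sentences aligned to each summary sentence."""
    return sum(len(align(s, source_sents)) for s in summary_sents) / len(summary_sents)

def mean_sentence_rank(summary_sents: list[str], source_sents: list[str]) -> float:
    """Average (1-indexed) position of aligned source sentences; higher = less lead bias."""
    ranks = [i + 1 for s in summary_sents for i in align(s, source_sents)]
    return sum(ranks) / len(ranks) if ranks else 0.0
```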
Figure 3 confirms these hypotheses: as the number of rewriting steps increases, abstraction rises (left panel: lower extractive density), fusion increases (middle panel), and the summaries begin to incorporate content from the middle and end of the article (right panel). Interestingly, all CoD summaries are more abstractive than both the human-written and baseline summaries.
To better understand the trade-offs in CoD summaries, the authors conducted a preference-based human study and a rating-based evaluation with GPT-4.
Human preferences. For the same 100 articles (5 steps × 100 = 500 summaries in total), the authors showed the randomly shuffled CoD summaries, together with the articles, to the paper's first four authors. Each annotator chose their favorite summary according to the definition of a "good summary" from Stiennon et al. (2020). Table 2 reports each annotator's first-place votes broken down by CoD step, along with the aggregate across annotators. Overall, 61% of first-place summaries (23.0% + 22.5% + 15.5%) involved three or more densification steps. The median preferred CoD step is the middle one (3), and the expected preferred step is 3.06.
Using the average density at step 3, the preferred entity density across CoD candidates is roughly 0.15. As Table 1 shows, this density is in line with human-written summaries (0.151) but markedly higher than summaries written with the vanilla GPT-4 prompt (0.122).
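The expected-step figure is simply the mean of the first-place vote distribution over steps. The article only reports the shares for steps 3-5 (23.0%, 22.5%, 15.5%); the shares for steps 1-2 below are hypothetical placeholders chosen so the expectation lands at the reported 3.06.

```python
# Expected preferred CoD step from first-place vote shares (percent).
# Steps 3-5 come from the article text; steps 1-2 are hypothetical placeholders.
vote_share = {1: 8.5, 2: 30.5, 3: 23.0, 4: 22.5, 5: 15.5}
expected_step = sum(step * share for step, share in vote_share.items()) / 100
print(round(expected_step, 2))  # 3.06
```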
Automatic metrics. As a complement to the human evaluation, the authors used GPT-4 to score CoD summaries (1-5) along five dimensions: informativeness, quality, coherence, attributability, and overall quality. As Table 3 shows, density correlates with informativeness, but only up to a point: the score peaks at step 4 (4.74).
Averaged across dimensions, the first and last CoD steps score lowest, while the middle three steps score nearly the same (4.78, 4.77, and 4.76).
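A hedged sketch of how such GPT-4-based ratings could be collected is shown below; the rubric wording and the JSON output format are assumptions for illustration, not the paper's evaluation prompt.

```python
# Sketch: ask GPT-4 for 1-5 ratings on the study's five dimensions.
# The prompt wording and JSON format are assumptions; requires the `openai` package.
import json
from openai import OpenAI

client = OpenAI()
DIMENSIONS = ["Informative", "Quality", "Coherence", "Attributable", "Overall"]

def rate_summary(article: str, summary: str) -> dict[str, int]:
    prompt = (
        f"Article:\n{article}\n\nSummary:\n{summary}\n\n"
        f"Rate the summary from 1 (worst) to 5 (best) on each of these dimensions: "
        f"{', '.join(DIMENSIONS)}. Reply with a JSON object mapping each dimension "
        f"to an integer score, and nothing else."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(resp.choices[0].message.content)
```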
Qualitative analysis. There is a clear trade-off between a summary's coherence/readability and its informativeness. Figure 4 shows two CoD steps: in one, the summary improves with added detail; in the other, it suffers. On average, the intermediate CoD summaries strike the best balance, but precisely defining and quantifying this trade-off is left for future work.
For more details, please refer to the original paper.
