Tian Yuandong's team releases DOC, the second version of its "long story generator": coherence greatly improved, interestingness up 20.7%
Some time ago, Dr. Tian Yuandong's team presented Re3 (Recursive Reprompting and Revision), a story generation framework built on a large language model, at EMNLP 2022. Through prompt design alone, with no fine-tuning of the large model, it can generate consistent stories of up to 7,500 words.
The Re3 authors recently released the second version of their long story generation framework, DOC (Detailed Outline Control). It uses a hierarchical outline to describe the story in more detail, and a fine-tuned OPT-350m model to keep the generated text more coherent with that outline. In human evaluation, DOC was rated a more capable writer than the previous generation, Re3.
Paper link: https://arxiv.org/abs/2212.10077
Code link: https://github.com/yangkevin2/doc-story-generation
DOC consists of two complementary components:
1. The detailed outliner creates a more detailed, hierarchically structured outline, shifting creative work from the drafting stage to the planning stage;
2. The detailed controller keeps the story passages aligned with the outline details, so that the more detailed outline still takes effect during generation.
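To make the idea of a hierarchical outline concrete, here is a minimal Python sketch of one way such an outline could be represented. The `OutlineNode` class, its field names (`setting`, `characters`) and the sample story are hypothetical illustrations, not taken from the DOC codebase.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class OutlineNode:
    """One outline item; all field names here are illustrative assumptions."""
    text: str
    setting: str = ""
    characters: List[str] = field(default_factory=list)
    children: List["OutlineNode"] = field(default_factory=list)

    def leaves(self) -> Iterator["OutlineNode"]:
        """Yield the most detailed items in story order (depth-first)."""
        if not self.children:
            yield self
            return
        for child in self.children:
            yield from child.leaves()

    def render(self, prefix: str = "") -> List[str]:
        """Return numbered lines ('1.', '1.1.', ...) for this node's descendants."""
        lines = []
        for i, child in enumerate(self.children, start=1):
            label = f"{prefix}{i}"
            lines.append(f"{label}. {child.text}")
            lines.extend(child.render(prefix=f"{label}."))
        return lines


# Made-up example outline with two levels of detail.
outline = OutlineNode(
    text="(root) a lighthouse keeper finds a map",
    children=[
        OutlineNode(
            text="The keeper discovers the map.",
            children=[
                OutlineNode("She finds a bottle on the shore.", setting="beach", characters=["Mara"]),
                OutlineNode("She decodes the map overnight.", setting="lighthouse", characters=["Mara"]),
            ],
        ),
        OutlineNode("She sets out to sea.", setting="harbor", characters=["Mara", "Tom"]),
    ],
)

# The leaves are the items a drafting stage would work through one by one.
drafting_order = [leaf.text for leaf in outline.leaves()]
```

In this sketch the planning work lives in the tree, and a drafting stage only has to walk the leaves in order, which mirrors the shift of creative work from drafting to planning described above.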
In human evaluation of automatically generated stories, DOC achieved absolute gains of 22.5% in plot coherence, 28.2% in outline relevance, and 20.7% in interestingness, significantly outperforming the earlier Re3 baseline. Human evaluators also found DOC easier to control in an interactive generation setting.
The first author, Kevin Yang, is a fourth-year doctoral student at the University of California, Berkeley. His main research interest is controllable natural language text generation in structured settings, for example using structured approaches to controllable generation to improve the consistency of long texts.
The second author, Dr. Tian Yuandong, is a researcher and senior manager at Meta AI Research. His research interests include deep reinforcement learning and its application to games, as well as the theoretical analysis of deep learning models. He received his bachelor's and master's degrees from Shanghai Jiao Tong University in 2005 and 2008, and his doctorate from the Robotics Institute of Carnegie Mellon University in the United States in 2013.
DOC Framework
As natural language technology advances, large language models' handling of short texts is gradually approaching a bottleneck, and interest is growing in generating longer texts, for example thousands of words at a time. Compared with short-text generation, long texts carry more content and more constraints: the model must maintain overall coherence and long-range factual consistency, and stay relevant to the premise or plan supplied by the user.
Compared with human writers, story generation systems such as Re3 still fall short in several respects: they cannot guarantee plot coherence over long ranges, they show global inconsistencies, and their story content drifts away from the planned outline.
To bridge this gap, the Detailed Outline Control (DOC) framework reuses Re3's high-level planning-drafting-revision structure and improves long-range coherence through two complementary approaches.
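In DOC's drafting stage, candidate continuations are reranked and drafting for an outline item stops once a score threshold is reached. The following is a minimal sketch of such a loop, not the authors' implementation: `draft_outline_item`, `toy_propose` and `toy_score` are hypothetical names, the word-overlap scorer is a toy stand-in for learned relevance and coherence rerankers, and using one score for both reranking and stopping is a simplifying assumption.

```python
from typing import Callable, List


def toy_score(outline_item: str, passage: str) -> float:
    """Toy stand-in for learned rerankers: share of outline words found in the passage."""
    words = set(outline_item.lower().split())
    if not words:
        return 0.0
    return len(words & set(passage.lower().split())) / len(words)


def toy_propose(outline_item: str, passage: str) -> List[str]:
    """Toy stand-in for sampling candidate continuations from a language model."""
    return [" mara finds a bottle", " the weather is mild", " she opens the map"]


def draft_outline_item(
    outline_item: str,
    propose: Callable[[str, str], List[str]],
    score: Callable[[str, str], float],
    done_threshold: float,
    max_steps: int = 5,
) -> str:
    """Extend a passage for one outline item, stopping early once the score clears the threshold."""
    passage = ""
    for _ in range(max_steps):
        candidates = propose(outline_item, passage)
        if not candidates:
            break
        # Rerank: keep the continuation that scores best once appended.
        best = max(candidates, key=lambda c: score(outline_item, passage + c))
        passage += best
        # Early stopping: the outline item is treated as covered.
        if score(outline_item, passage) >= done_threshold:
            break
    return passage


story_passage = draft_outline_item("mara finds the map", toy_propose, toy_score, done_threshold=1.0)
```

With these toy inputs the loop stops after two steps instead of running all five, because the second continuation covers every word of the outline item.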
Detailed Outliner
Rather than improvising new plot points while drafting, a writer can plan a coherent overall plot at the outline stage. The detailed outliner does the same, expanding the high-level outline so that it provides more detailed guidance during drafting.
During drafting, the researchers reuse the outline-relevance and text-coherence rerankers from Re3's rewriting stage to detect when the passage for the current outline item is complete, and stop early based on a score threshold.
Each outline item comes with a setting and the relevant characters, and every item is carefully filtered for relevance and coherence in context. The structured prompt highlights the current setting and any change of setting, and retrieves character descriptions based on the characters detected in the outline. In contrast, Re3 selects relevant characters dynamically for each passage during drafting and does not track setting information, which can lead to unexpected changes in the story's setting.
Detailed Controller
The second component, the detailed controller, controls passage generation based on the corresponding outline item in order to stay faithful to the detailed outline. Because the detailed outline imposes many overlapping soft constraints, the detailed controller must exert sufficient control strength. At the same time, it must accept flexible natural-language input and remain computationally efficient when generating with state-of-the-art large language models.
The researchers therefore implemented the detailed controller as a controller based on OPT-350m and designed a contrastive training procedure that aligns summaries with passage prefixes. Most importantly, they also constructed many fluent hard negatives, which push the generated passages not only to start out on topic but to stay relevant throughout.
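The article does not spell out the controller's training objective, so the following is only a sketch of one common contrastive formulation (an InfoNCE-style softmax loss), offered as an assumption rather than the authors' method. `make_drifting_negative` is a hypothetical way to build a hard negative that starts on topic and then drifts, and the scores are placeholder numbers rather than outputs of OPT-350m.

```python
import math
from typing import List


def make_drifting_negative(matching: List[str], unrelated: List[str], keep: int = 1) -> List[str]:
    """Hypothetical hard negative: begins like the true passage, then drifts off topic."""
    return matching[:keep] + unrelated


def contrastive_loss(pos_score: float, neg_scores: List[float], temperature: float = 1.0) -> float:
    """Softmax cross-entropy that treats the positive pair as the correct class."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


# Placeholder alignment scores between a summary and passage prefixes.
loss_untrained = contrastive_loss(pos_score=0.0, neg_scores=[0.0])  # scorer cannot tell them apart
loss_trained = contrastive_loss(pos_score=2.0, neg_scores=[0.0])    # matching prefix scored higher
```

The loss falls as the matching prefix is scored above the negatives. Negatives that share an on-topic opening with the positive make that separation harder, which is the role the article attributes to fluent hard negatives.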
Experimental part
In the experiments, the input to the model is just a short English premise, usually 30-60 words, and the output is a complete story. The researchers imposed no further rules, because there is no clear definition of a "story", let alone of a "good story"; quality is judged mainly by human evaluation.
The evaluation uses three main metrics, which are better suited to comparing passages than complete stories:
1. Coherence: the percentage of passages that human annotators judge to have a coherent plot;
2. Relevance: the percentage of passages judged to be faithful to the corresponding outline item;
3. Interestingness: the percentage of passages judged to be interesting.
The baselines compared are Re3, ROLLING-OPT, and ROLLING-GPT.
The results show that annotators found DOC's plots more coherent and more relevant to the outline than Re3's, and the improvement over the ROLLING baselines is even larger. The results support the model design: plot coherence and outline relevance benefit from shifting creative work from drafting to planning, and from the improved control mechanism.
Surprisingly, annotators also judged DOC's passages to be significantly more interesting. The researchers attribute this to the more detailed (more eventful) outlines, a hypothesis that further ablation experiments also support.
However, qualitative analysis also shows that the model still has substantial room for improvement. Unlike Re3, which sometimes drifts almost completely off topic, DOC usually does not deviate significantly from the top-level outline, but it often fails to follow the lower-level parts of the detailed outline.
Internal consistency remains a problem for both DOC and Re3, and occasional errors in the detailed outline can be especially harmful, leading to larger cascading errors during drafting. In addition, DOC's outlines are often uneven in their level of detail: some items are too vague, while others appear over-expanded. The settings and characters detected by the model can also be incorrect or incomplete. The example accompanying the original article shows a heavily abridged story written by DOC from such an outline.
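The three evaluation metrics are percentages of passages that annotators judged coherent, relevant, or interesting, and an "absolute gain" is the difference between two such percentages in percentage points. The sketch below shows that arithmetic; the function names and the judgment data are made up for illustration and are not results from the paper.

```python
from typing import Dict, List

METRICS = ("coherent", "relevant", "interesting")


def metric_percentages(judgments: List[Dict[str, bool]]) -> Dict[str, float]:
    """Percentage of passages marked True for each metric."""
    if not judgments:
        raise ValueError("need at least one judged passage")
    total = len(judgments)
    return {m: 100.0 * sum(j[m] for j in judgments) / total for m in METRICS}


def absolute_gain(system: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Difference in percentage points, system minus baseline."""
    return {m: system[m] - baseline[m] for m in METRICS}


# Made-up annotator judgments for four passages per system.
system_judgments = [
    {"coherent": True, "relevant": True, "interesting": True},
    {"coherent": True, "relevant": True, "interesting": False},
    {"coherent": True, "relevant": False, "interesting": False},
    {"coherent": False, "relevant": False, "interesting": False},
]
baseline_judgments = [
    {"coherent": True, "relevant": True, "interesting": True},
    {"coherent": True, "relevant": False, "interesting": False},
    {"coherent": False, "relevant": False, "interesting": False},
    {"coherent": False, "relevant": False, "interesting": False},
]

system_pct = metric_percentages(system_judgments)
baseline_pct = metric_percentages(baseline_judgments)
gains = absolute_gain(system_pct, baseline_pct)
```

An absolute gain is therefore not a relative improvement: going from 50% to 75% of passages judged coherent is a 25-point absolute gain, although it is a 50% relative increase.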