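Condensing long content into a concise summary is useful whether you are skimming an article or pulling the key points out of a research paper. Hugging Face offers a capable tool for text summarization: the BART model. This article shows how to use a pre-trained Hugging Face model, specifically facebook/bart-large-cnn, to summarize long articles and text.
Getting started with Hugging Face's BART model
Hugging Face hosts a wide range of models for NLP tasks such as text classification, translation, and summarization. One of the most popular summarization models is BART (Bidirectional and Auto-Regressive Transformers), which is trained to generate coherent summaries of long documents.
Step 1: Install the Hugging Face Transformers library
To use Hugging Face models, you need to install the Transformers library. The pipeline also needs a deep learning backend such as PyTorch to run the model. You can install the library with pip: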
pip install transformers
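Before moving on, it can help to confirm that the installation is visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and checking for "torch" assumes PyTorch is the backend you chose, which the article itself does not specify.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "torch" is an assumed backend; the article only installs transformers.
    missing = missing_packages(["transformers", "torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages found.")
```

If a package is reported as missing, the usual cause is that pip installed it into a different interpreter or virtual environment than the one running the script.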
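Step 2: Import the summarization pipeline
Once the library is installed, you can load a pre-trained model for summarization. The Hugging Face pipeline API provides a high-level interface to models such as facebook/bart-large-cnn, which has been fine-tuned for summarization.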
from transformers import pipeline

# Load the summarization model
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
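Step 3: Run the summarizer
With the summarizer loaded, you can pass it any long text to generate a summary. The example below uses a short biography of the British actress Dame Maggie Smith.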
ARTICLE = """ Dame Margaret Natalie Smith (28 December 1934 – 27 September 2024) was a British actress. Known for her wit in both comedic and dramatic roles, she had an extensive career on stage and screen for over seven decades and was one of Britain's most recognisable and prolific actresses. She received numerous accolades, including two Academy Awards, five BAFTA Awards, four Emmy Awards, three Golden Globe Awards and a Tony Award, as well as nominations for six Olivier Awards. Smith is one of the few performers to earn the Triple Crown of Acting. Smith began her stage career as a student, performing at the Oxford Playhouse in 1952, and made her professional debut on Broadway in New Faces of '56. Over the following decades Smith established herself alongside Judi Dench as one of the most significant British theatre performers, working for the National Theatre and the Royal Shakespeare Company. On Broadway, she received the Tony Award for Best Actress in a Play for Lettice and Lovage (1990). She was Tony-nominated for Noël Coward's Private Lives (1975) and Tom Stoppard's Night and Day (1979). Smith won Academy Awards for Best Actress for The Prime of Miss Jean Brodie (1969) and Best Supporting Actress for California Suite (1978). She was Oscar-nominated for Othello (1965), Travels with My Aunt (1972), A Room with a View (1985) and Gosford Park (2001). She portrayed Professor Minerva McGonagall in the Harry Potter film series (2001–2011). She also acted in Death on the Nile (1978), Hook (1991), Sister Act (1992), The Secret Garden (1993), The Best Exotic Marigold Hotel (2012), Quartet (2012) and The Lady in the Van (2015). Smith received newfound attention and international fame for her role as Violet Crawley in the British period drama Downton Abbey (2010–2015). The role earned her three Primetime Emmy Awards; she had previously won one for the HBO film My House in Umbria (2003). 
Over the course of her career she was the recipient of numerous honorary awards, including the British Film Institute Fellowship in 1993, the BAFTA Fellowship in 1996 and the Society of London Theatre Special Award in 2010. Smith was made a dame by Queen Elizabeth II in 1990.
"""

# Generate the summary
summary = summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False)

# Print the summary
print(summary)
Output:
[{'summary_text': 'Dame Margaret Natalie Smith (28 December 1934 – 27 September 2024) was a British actress. Known for her wit in both comedic and dramatic roles, she had an extensive career on stage and screen for over seven decades. She received numerous accolades, including two Academy Awards, five BAFTA Awards, four Emmy Awards, three Golden Globe Awards and a Tony Award.'}]
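As the output shows, the summarizer condenses the article's main points into a short, readable form, highlighting key facts such as the length of her career and her awards.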
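The printed result is a list containing one dictionary per input, with the text stored under the 'summary_text' key. As a minimal sketch, the summary string can be pulled out as follows; the helper name extract_summaries is hypothetical, and sample_output is shortened stand-in data shaped like the output above, not a real model result.

```python
def extract_summaries(results):
    """Return the 'summary_text' string from each result dictionary."""
    return [item["summary_text"] for item in results]


# Stand-in data shaped like the pipeline output shown above (text shortened).
sample_output = [
    {"summary_text": "Dame Margaret Natalie Smith was a British actress."}
]

for text in extract_summaries(sample_output):
    print(text)
```

With the real pipeline, the same helper can be called on the summary variable from the script above.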
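Another approach: summarizing text from a file
In some use cases you may want to read the text from a file instead of a hard-coded string. The updated Python script below reads an article from a text file and generates a summary.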
from transformers import pipeline

# Load the summarizer pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")


# Function to read the article from a text file
def read_article_from_file(file_path):
    # Read as UTF-8 explicitly so the result does not depend on the platform default
    with open(file_path, "r", encoding="utf-8") as file:
        return file.read()


# Path to the text file containing the article
file_path = "article.txt"  # Change this to your file path

# Read the article from the file
ARTICLE = read_article_from_file(file_path)

# Get the summary
summary = summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False)

# Print the summary
print(summary)
File input:
In this case, save the article to a text file (article.txt in the example); the script reads its contents and summarizes them.
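Files can be much longer than the sample article, and facebook/bart-large-cnn accepts at most 1024 input tokens. One possible workaround, sketched below, is to split the text into chunks, summarize each chunk, and join the partial summaries. This is not part of the original script: the function names are hypothetical, the 400-word chunk size is an assumed rough proxy for the token limit, and summarize_fn stands in for whatever callable turns one chunk into a summary string.

```python
def split_into_chunks(text, max_words=400):
    """Split text into consecutive chunks of at most `max_words` words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def summarize_long_text(text, summarize_fn, max_words=400):
    """Summarize each chunk with `summarize_fn` and join the partial summaries.

    `summarize_fn` is an assumed callable that maps one chunk (str) to one
    summary (str); it is injected so the chunking logic stays model-agnostic.
    """
    chunks = split_into_chunks(text, max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)
```

With the pipeline from the script above, summarize_fn could be a small function that calls summarizer on the chunk and returns the 'summary_text' value of the first result. Splitting on word counts ignores sentence boundaries, so a production version would likely split on sentences or use the model's tokenizer to measure length.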
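Conclusion
Hugging Face's BART model is a strong option for automatic text summarization. Whether you are working with long articles, research papers, or any other large body of text, it can help distill the information into a concise summary.
This article showed how to integrate a pre-trained Hugging Face summarization model into your project, using both hard-coded text and file input. With only a few lines of code, you can have a working summarization pipeline in a Python project.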