The rapid development of large language models this year means that models like BERT are now called "small" models. In Kaggle's LLM Science Exam competition, a team using DeBERTa finished fourth, which is an excellent result. For a specific domain or requirement, then, a large language model is not necessarily the best solution, and small models still have their place. The model introduced here is PubMedBERT, from a paper published by Microsoft Research through ACM in 2022. It pre-trains BERT from scratch on a domain-specific corpus.
Here are the main takeaways from the paper:
For specific domains with large amounts of unlabeled text, such as the biomedical domain, pre-training language models from scratch is more effective than continual pre-training of general-domain language models. To evaluate domain-specific pre-training, the authors propose the Biomedical Language Understanding and Reasoning Benchmark (BLURB).
PubMedBERT
1. Domain-specific Pretraining
The research shows that domain-specific pretraining from scratch greatly outperforms continual pretraining of general-domain language models, which demonstrates that the prevailing assumption in support of mixed-domain pretraining does not always apply.
2. Model
The model uses the BERT architecture. For the masked language model (MLM) objective, whole-word masking (WWM) requires that an entire word be masked, rather than individual subword pieces of it.
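Whole-word masking can be illustrated with a minimal sketch. This is not the authors' implementation: the token list is a hypothetical WordPiece split, the `##` continuation prefix follows the usual BERT convention, and the function names and the 15% default masking rate are assumptions for illustration. The sketch also skips BERT's random-replacement and keep-original variants and always substitutes `[MASK]`.

```python
import random

MASK = "[MASK]"

def group_whole_words(tokens):
    """Group token indices into whole words ("##" marks a continuation piece)."""
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)   # continuation: attach to the previous word
        else:
            words.append([i])     # start of a new word
    return words

def whole_word_mask(tokens, mask_rate=0.15, seed=0):
    """Mask whole words: if a word is chosen, every one of its pieces is masked."""
    words = group_whole_words(tokens)
    if not words:
        return []
    rng = random.Random(seed)
    # Mask at least one word, and never more words than exist.
    n_to_mask = min(len(words), max(1, round(len(words) * mask_rate)))
    masked = list(tokens)
    for word in rng.sample(words, n_to_mask):
        for i in word:
            masked[i] = MASK
    return masked

# Hypothetical WordPiece split of a biomedical sentence (assumed, not from the paper).
tokens = ["the", "patient", "received", "acet", "##amino", "##phen", "daily"]
print(whole_word_mask(tokens, mask_rate=0.3, seed=1))
```

With token-level masking, a piece such as `##amino` could be masked on its own while `acet` and `##phen` stay visible, which makes the prediction much easier. Grouping the pieces first guarantees that a selected word disappears entirely.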
3. BLURB data set
According to the authors, BLUE [45] was the first attempt to create an NLP benchmark in the biomedical field, but its coverage is limited. For biomedical applications based on PubMed, the authors propose the Biomedical Language Understanding and Reasoning Benchmark (BLURB).
PubMedBERT uses a larger domain-specific corpus (21GB).
Results
On most biomedical natural language processing (NLP) tasks, PubMedBERT consistently outperforms all other BERT models, often by a clear margin.
