Will LLM become history? Open source bGPT may subvert the deep learning paradigm: directly simulate binary, opening a new era of simulating the digital world!

bGPT, the latest achievement from Microsoft Research Asia, is a byte-based Transformer model that opens a new door for exploring the digital world.

Unlike traditional vocabulary-based language models, bGPT is unique in that it can directly process raw binary data without being restricted by specific formats or tasks. It aims to fully simulate the digital world, opening up new possibilities for model development.


Paper: https://www.php.cn/link/ee88b3cea2051be97bcddf2e0d9a28f6

Code: https://www.php.cn/link/359499f804ea7988921bf86c9377fb95

Model: https://www.php.cn/link/4b459ea1a5917be436df5f0bd5b3c4ad

Project homepage: https://www.php.cn/link/71af59614c8b42af334933e9261e53be

The research team demonstrated bGPT's huge modeling potential in their paper. Through byte-level processing, bGPT can not only generate text, images, and audio, but also simulate computer behavior, including format-conversion algorithms and the modeling of CPU states. Treating all data as byte sequences enables bGPT to integrate different data types into a single framework.
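The core idea, that any file is just a sequence of bytes, can be sketched in a few lines. This is an illustrative sketch, not code from the paper; the helper name `to_byte_tokens` and the 8KB cap are assumptions for the example:

```python
def to_byte_tokens(data: bytes, max_len: int = 8192) -> list[int]:
    # Each raw byte becomes an integer token in [0, 255], so a single
    # 256-symbol vocabulary covers text, images, audio, and binaries alike.
    return list(data[:max_len])

# Very different data types share exactly the same representation:
text_tokens = to_byte_tokens("hello".encode("utf-8"))          # [104, 101, 108, 108, 111]
image_tokens = to_byte_tokens(bytes([0x42, 0x4D, 0x36, 0x0C]))  # start of a BMP header
```

Because every input lands in the same 256-token space, one model architecture can be trained on any mixture of file types without format-specific preprocessing.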

Once released, the bGPT paper sparked widespread discussion, with many seeing it as opening up new possibilities.

Binary data: the DNA of the digital world

Binary data is the cornerstone of the digital world: it underlies everything from computer processors to the electronic products we use every day, and it is the core of all data, devices, and software. Building on this foundation, bGPT's goal is to understand the internal logic of digital systems by studying binary data sequences, thereby reshaping and simulating various complex digital phenomena.

Through byte-level processing, bGPT can not only be applied to conventional AI generation and understanding tasks, but can also handle less traditional applications. For example, it can directly model MIDI, a standard format for music transmission and storage, which previous research has avoided modeling directly because of its binary nature.

bGPT, however, is naturally suited to such tasks. It accurately simulates the conversion algorithm for music data, achieving an extremely low error rate (0.0011 bits per byte, BPB) when converting ABC notation to MIDI format.
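The BPB (bits per byte) figure measures the model's average uncertainty about each next byte. The arithmetic behind it can be sketched as follows; the 0.99924 probability below is back-calculated purely for illustration, not a number from the paper:

```python
import math

def bits_per_byte(avg_nll_nats: float) -> float:
    # Convert an average next-byte negative log-likelihood (in nats)
    # into bits per byte: divide by ln(2).
    return avg_nll_nats / math.log(2)

# A model that assigns each correct byte probability ~0.99924 scores
# roughly the 0.0011 BPB reported for ABC-to-MIDI conversion:
p = 0.99924
print(round(bits_per_byte(-math.log(p)), 4))  # 0.0011
```

A BPB of 0.0011 thus means the model is nearly certain about almost every byte it emits during the conversion.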

In practice, bGPT usually completes conversions between ABC notation and MIDI files accurately, and can sometimes even correct errors in the original files, making the conversion more faithful.



A comparison of the ABC notation automatically converted to MIDI by bGPT (top) with the original MIDI data (bottom) highlights a key difference: a beat is missing from the original MIDI data (bottom), breaking the chord accompaniment, whereas bGPT's conversion (top) correctly fills in the missing beat, keeping the accompaniment smooth.

The research team also used CPU modeling as a representative task for simulating hardware behavior: the model receives a sequence of low-level machine instructions as input and must accurately predict how the CPU state is updated after each instruction executes, until the program halts.

On this task, bGPT achieved an accuracy of over 99.99%, demonstrating the power and scalability of byte models in processing native binary data.


Given the program and initial CPU state, bGPT is able to accurately predict the complete process of CPU execution until the program terminates. In this example, bGPT handles all CPU instructions accurately. For ease of understanding, the actual byte sequence is converted into a more readable format.
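The shape of the CPU-modeling task can be illustrated with a toy register machine. This is a hypothetical instruction set invented for the example, not the one used in the paper; the model's job is to predict every state in a trace like the one this sketch produces:

```python
def run_toy_cpu(program, registers=None):
    # Execute a tiny hypothetical instruction set and record the register
    # state after each instruction; predicting this trace from the program
    # is the essence of the CPU-modeling task.
    regs = dict(registers or {"A": 0, "B": 0})
    trace = [dict(regs)]              # initial state
    for op, *args in program:
        if op == "MOV":               # MOV reg, immediate
            regs[args[0]] = args[1]
        elif op == "ADD":             # ADD dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "HLT":             # halt: stop recording states
            break
        trace.append(dict(regs))
    return trace

trace = run_toy_cpu([("MOV", "A", 3), ("MOV", "B", 4), ("ADD", "A", "B"), ("HLT",)])
# Final state: {"A": 7, "B": 4}
```

In the paper's setting both the program and the states are raw bytes, so the model must learn the instruction semantics purely from byte sequences.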

From bytes to everything: Breaking through boundaries and moving towards unified data modeling

bGPT can not only process native binary data, but also integrate multiple data types into a unified model architecture by treating all data as byte sequences.

This approach not only simplifies the data modeling process, but also makes integration from any data source a breeze without the need to customize models for specific data types.

In the paper, the research team used conventional text, image, and audio files as examples to demonstrate bGPT's unified data-modeling capability. The bGPT model they trained has about 100 million parameters.

Experimental results show that, compared with models of similar scale, GPT-2 (text), ViT (vision), and AST (audio), bGPT delivered comparable performance across the different data types.

bGPT performs very well in text generation. Thanks to its byte-level text encoding, the model does not rely on a vocabulary and can therefore support all languages.

Although its hierarchical Transformer architecture has computational overhead similar to GPT-2's, it can generate text up to 8KB long, far exceeding GPT-2's length limit. After pre-training on Wikipedia data, the text bGPT generates is comparable to GPT-2's in both style and topic, proving its strong text-generation ability.
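The vocabulary-free property is easy to see: UTF-8 maps every language into the same 256-symbol byte space, so no tokenizer training is needed. A minimal sketch (the helper name `byte_tokenize` is invented for this example and is not the model's actual input pipeline):

```python
def byte_tokenize(text: str) -> list[int]:
    # UTF-8 encodes any language into bytes in [0, 255]; the same fixed
    # 256-token "vocabulary" therefore covers ASCII, CJK, emoji, etc.
    return list(text.encode("utf-8"))

print(byte_tokenize("Hi"))   # [72, 105]: one byte per ASCII character
print(byte_tokenize("你好"))  # six bytes: three per CJK character in UTF-8
```

The trade-off is sequence length: multi-byte scripts consume more tokens per character, which is one reason the 8KB generation window matters.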

bGPT is pre-trained on the Wikipedia dataset, and the quality and topic consistency of the generated text samples are comparable to GPT-2.

bGPT can generate images by predicting the next byte in a sequence of image bytes. The model is pre-trained on the ImageNet dataset, and the generated images have a resolution of 32x32 pixels.

Although at the current scale it is difficult to accurately capture an image's two-dimensional spatial relationships through byte sequences, which leads to artifacts and noise in the generated images, the textures and lighting effects are usually still fairly accurate.

In addition, these generated byte sequences can be decoded into valid BMP files. The research team noted that scaling up bGPT, following the pixel-sequence modeling approach of OpenAI's iGPT, may enable higher-quality and more realistic image generation.
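Why 32x32 RGB bytes decode so directly can be sketched by wrapping raw pixel data in a minimal 24-bit BMP header. This is an illustrative sketch of the file format, not the paper's pipeline (the model itself is trained on complete BMP byte streams, headers included):

```python
import struct

def bytes_to_bmp(pixel_bytes: bytes, width: int = 32, height: int = 32) -> bytes:
    # Wrap raw bytes as a 24-bit BMP: 14-byte file header + 40-byte
    # BITMAPINFOHEADER + bottom-up BGR pixel rows.
    row = width * 3                                  # 96 bytes/row: already 4-byte aligned
    body = pixel_bytes[: row * height].ljust(row * height, b"\x00")
    header = struct.pack("<2sIHHI", b"BM", 54 + len(body), 0, 0, 54)
    dib = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                      0, len(body), 2835, 2835, 0, 0)
    return header + dib + body

bmp = bytes_to_bmp(bytes(range(256)) * 12)           # 3072 stand-in "generated" bytes
assert bmp[:2] == b"BM" and len(bmp) == 3126         # 54-byte header + 32*96 pixel bytes
```

Because the header is only 54 bytes and each pixel is exactly 3 bytes, a well-formed generated sequence is immediately viewable in any image viewer.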

These are a set of images generated by bGPT pre-trained on the ImageNet dataset. While the texture and lighting effects of the images are generally accurate, identifying the main objects in these generated images can be challenging.

bGPT treats audio data as a sequence of bytes and can generate 1 second long audio samples with a sampling rate of 8000 Hz.

The model was pre-trained on the LibriSpeech dataset and further fine-tuned and demonstrated on the Speech Commands v2 dataset. The audio samples bGPT generates maintain a high level of accuracy, with some nearly indistinguishable from real audio. The following examples demonstrate bGPT's audio-generation capability.
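The arithmetic of the audio setting is simple: 8000 unsigned 8-bit samples at 8000 Hz is exactly one second of sound. A sketch of wrapping such a generated byte sequence as a playable WAV file (illustrative only; the 440 Hz test tone stands in for model output, and this is not the paper's decoding code):

```python
import io
import math
import wave

def bytes_to_wav(sample_bytes: bytes, rate: int = 8000) -> bytes:
    # Wrap raw 8-bit unsigned PCM samples as a mono WAV byte stream.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(1)      # 8-bit unsigned PCM
        w.setframerate(rate)   # 8000 samples at 8 kHz = 1 second
        w.writeframes(sample_bytes)
    return buf.getvalue()

# A 440 Hz tone as stand-in "generated" bytes, centered on 128:
tone = bytes(int(128 + 100 * math.sin(2 * math.pi * 440 * t / 8000)) for t in range(8000))
wav = bytes_to_wav(tone)
```

At this sampling rate one second of audio stays within the 8KB sequence budget, which is presumably why the samples are one second long.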

Explore the digital world of bytes with bGPT

Traditional language models, however powerful, mainly focus on processing natural-language text. The bGPT model breaks through this limitation with its byte-based processing mechanism, opening up a new category of data processing.

This advance gives bGPT the ability to seamlessly handle various data types, including text, images, audio, and even native binary data from algorithms and hardware, paving the way toward fully simulating and understanding the digital world.

Although bGPT has demonstrated compelling capabilities, it has limitations in terms of computational overhead. For example, it can currently only process byte sequences of up to 8KB on conventional graphics cards. This poses obvious limitations for applications that need to generate or process large amounts of data. Future work plans will focus on developing more efficient algorithms and taking advantage of advances in hardware, aiming to improve the ability to process larger data sequences.

Technology enthusiasts around the world have begun to look forward to bGPT's future potential. From the optimization of network pruning and self-learning to the self-reconstruction capabilities of ultra-large-scale networks, these discussions point to a common vision: bGPT may eventually become a unified model capable of processing and outputting all types of byte data, truly becoming a comprehensive simulator of the digital world.


The research team has open sourced the code and model of bGPT. This means that you can directly train bGPT on your own data set without making any adjustments to the model architecture, and explore the broad prospects of byte models in the digital field.


Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn for deletion.