Technology Innovation Institute (TII) has made a significant contribution to the open source community with the introduction of a new large language model (LLM) called Falcon. The flagship version has an impressive 180 billion parameters, and the generative LLM family is available in several sizes, including 180B, 40B, 7.5B, and 1.3B parameter models.
When Falcon 40B was launched, it quickly gained recognition as the world’s top open source AI model. This version of Falcon, with 40 billion parameters, was trained on one trillion tokens. In the two months following its launch, Falcon 40B topped Hugging Face’s rankings of open source large language models. What sets Falcon 40B apart is that it is completely royalty-free, a move intended to help democratize AI and make it a more inclusive technology.
The Falcon 40B LLM is multilingual, covering English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish. It serves as a general-purpose base model that can be fine-tuned to meet specific requirements or goals.
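Because the base checkpoints are published on Hugging Face, they can be loaded with standard tooling. Below is a minimal sketch, not official TII code, of prompting a Falcon base model with the transformers library; the prompt and generation settings are purely illustrative.

```python
# A minimal sketch (not TII's reference code) of loading and prompting a
# Falcon base checkpoint with the Hugging Face transformers library.
# The smaller "tiiuae/falcon-7b" checkpoint is used so the example can run
# on a single GPU; the 40B and 180B checkpoints need far more memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"  # swap in "tiiuae/falcon-40b" given enough GPU memory

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # needs the accelerate package; spreads layers across devices
)
# Note: older transformers releases may additionally require trust_remote_code=True.

prompt = "The Falcon family of large language models is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```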
Falcon 180B is an ultra-powerful language model with 180 billion parameters, trained on 3.5 trillion tokens. At release it sat at the top of Hugging Face’s leaderboard of open large language models available for research and commercial use. The model performs well on a variety of tasks, including reasoning, coding, proficiency, and knowledge tests, even outperforming competitors such as Meta’s LLaMA 2.
Among closed-source models, Falcon 180B ranks just behind OpenAI’s GPT-4, with performance on par with Google’s PaLM 2 (the model that powers Bard) despite being only half its size. To encourage innovative uses of the model, TII has also launched a "Call for Proposals" for Falcon 40B. One of the notable factors behind Falcon’s performance is the quality of its training data, since LLMs are particularly sensitive to the data they are trained on. The pre-training data collected for Falcon 40B amounts to nearly five trillion tokens, gathered from a variety of sources, including public web crawls (roughly 80%), research papers, legal texts, journalism, literature, and social media conversations. To extract high-quality pre-training data from this corpus, the TII team built a custom pipeline with extensive filtering and deduplication, applied at both the sample level and the string level.
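As a rough illustration of what "sample level" and "string level" processing mean, the toy Python sketch below filters out short or symbol-heavy documents and drops documents that repeat long word n-grams already seen. It is not TII's actual pipeline; the thresholds and the n-gram heuristic are invented here purely to make the idea concrete.

```python
# Toy illustration of sample-level filtering and string-level deduplication.
# This is NOT TII's production pipeline; it only shows the general shape.
import hashlib

def keep_sample(text: str, min_words: int = 50, max_symbol_ratio: float = 0.1) -> bool:
    """Sample-level filter: drop documents that are too short or too noisy."""
    words = text.split()
    if len(words) < min_words:
        return False
    symbols = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return symbols / max(len(text), 1) <= max_symbol_ratio

def dedupe(texts, ngram_size: int = 13):
    """String-level dedup: drop a document if it shares a long word n-gram
    with an earlier document (a crude exact-match heuristic)."""
    seen, kept = set(), []
    for text in texts:
        words = text.split()
        grams = {
            hashlib.md5(" ".join(words[i:i + ngram_size]).encode()).hexdigest()
            for i in range(max(len(words) - ngram_size + 1, 0))
        }
        if grams & seen:
            continue  # overlaps with a document we already kept
        seen |= grams
        kept.append(text)
    return kept

corpus = ["..."]  # raw crawled documents would go here
clean = dedupe([doc for doc in corpus if keep_sample(doc)])
```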
## Trained on 3.5 Trillion Tokens

Training Falcon 180B used up to 4,096 GPUs simultaneously, for a total of roughly 7,000,000 GPU hours. Falcon’s training dataset consists mostly of web data, supplemented by a curated collection of content including conversations, technical papers, Wikipedia, and a small amount of code. A chat variant has additionally been fine-tuned on a mix of conversational and instruction datasets. The model is available for research and commercial use, although the license excludes hosting use without separate permission.
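The chat-tuned variant is also published on Hugging Face and can be queried with the transformers text-generation pipeline. The snippet below is a sketch under that assumption; the prompt wording and sampling settings are illustrative, and the 180B checkpoint realistically requires a multi-GPU server (the smaller instruct checkpoints are an easier starting point).

```python
# A sketch of prompting the chat fine-tuned Falcon variant via the transformers
# text-generation pipeline. Model IDs refer to the checkpoints published on
# Hugging Face (access to the 180B weights requires accepting TII's license
# there); prompt and sampling settings are made up for illustration.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-180B-chat",  # or "tiiuae/falcon-7b-instruct" for a single-GPU test
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "User: Explain in two sentences what the Falcon language models are.\nAssistant:"
result = generator(prompt, max_new_tokens=120, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```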
Despite its impressive performance, the Falcon model’s knowledge is limited by its training data cutoff, so it has no information about recent events. Even so, its release is seen as a major advance for the open source field, outperforming other open models such as Llama 2, StableLM, RedPajama, and MPT on various benchmarks. The model is about 2.5 times larger than Llama 2, and it also outperforms OpenAI’s GPT-3.5 and Google’s PaLM on several benchmarks. This makes it a powerful tool for research and commercial use, as well as a significant contribution to the open source community.