


Yesterday, a fine-tuning run that took 220 hours finally finished. The goal was to fine-tune a dialogue model on ChatGLM-6B that could diagnose database error messages more accurately.
However, after waiting nearly ten days, the result was disappointing: it performed noticeably worse than an earlier run I did with smaller sample coverage.
The model is essentially of no practical value, so the parameters and training set will have to be readjusted and the training run again. Training large language models is an arms race, and without good equipment you cannot really compete. It looks like we will also have to upgrade the lab hardware, because we do not have many ten-day stretches to waste.
Judging from these recent failed runs, fine-tuning is not easy to get right. When different task objectives are mixed into one training set, each objective may call for different training parameters, so the final training set cannot satisfy every task. P-Tuning, therefore, is only suitable for a single well-defined task, and not necessarily for mixed tasks; models aimed at mixed tasks may need full fine-tuning instead. This matches what a friend told me in a conversation a few days ago.
In fact, because training a model is so difficult, some people have given up on training models themselves. Instead, they vectorize a local knowledge base for more accurate retrieval, then automatically build a prompt from the retrieved results and send it to the language model. This is easy to achieve with LangChain.
The workflow is as follows: load the local documents as text through a loader, split the text into fragments, encode the fragments, and write them into a vector store for querying. When a query returns results, a question prompt is automatically assembled through a prompt template and sent to the LLM, which generates the final answer.
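The load → split → embed → store → retrieve → prompt flow above can be sketched roughly as follows. This is a toy, stdlib-only illustration: the error texts, the bag-of-words "embedding", and the class names are all made up for the example; a real project would use LangChain's loaders, a proper embedding model, and a vector database instead.

```python
import math
import re
from collections import Counter

def split_text(text, chunk_size=200):
    """Split a loaded document into fixed-size text fragments."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(fragment):
    """Toy embedding: a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[\w-]+", fragment.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Minimal in-memory stand-in for a real vector database."""
    def __init__(self):
        self.items = []  # (vector, fragment) pairs

    def add(self, fragment):
        self.items.append((embed(fragment), fragment))

    def query(self, question, k=1):
        qv = embed(question)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[0]), reverse=True)
        return [frag for _, frag in ranked[:k]]

PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_prompt(store, question):
    """Retrieve the best-matching fragments and fill the prompt template."""
    context = "\n".join(store.query(question, k=2))
    return PROMPT_TEMPLATE.format(context=context, question=question)

# Index a tiny "knowledge base", then build the prompt that would go to the LLM.
store = VectorStore()
store.add("ORA-00942: table or view does not exist. Check the object name and schema.")
store.add("ORA-01555: snapshot too old. Increase undo retention settings.")
prompt = build_prompt(store, "What does ORA-00942 mean?")
```

The key design point is that the LLM never sees the whole knowledge base; only the top-k retrieved fragments are stuffed into the prompt, which keeps the input within the model's context window.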
Another important point in this work is retrieving knowledge from the local knowledge base as accurately as possible, and that accuracy comes from the vector store used for search. There are currently many vectorization and search solutions aimed at Chinese and English knowledge bases; choose whichever one is friendliest to your own knowledge base.
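One reason a solution can be more or less "friendly" to a given knowledge base is tokenization. The toy comparison below (illustrative only; real systems would use a proper Chinese-aware embedding model) shows why whitespace-based matching fails on unsegmented Chinese text while character bigrams still find overlap:

```python
# Compare two toy tokenization schemes on an unsegmented Chinese query.
def word_tokens(text):
    """Whitespace tokens: fine for English, useless for unsegmented Chinese."""
    return set(text.split())

def char_bigrams(text):
    """Overlapping character pairs: language-agnostic, works for Chinese."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

doc = "数据库错误诊断手册"      # "database error diagnosis manual"
query = "数据库错误"            # "database error"

# Whitespace tokens share nothing: each string is a single opaque token.
word_overlap = word_tokens(doc) & word_tokens(query)

# Character bigrams share 数据, 据库, 库错, 错误 — enough signal to retrieve.
bigram_overlap = char_bigrams(doc) & char_bigrams(query)
```

The same consideration applies when picking an embedding model: one trained mostly on English may embed Chinese fragments poorly, so retrieval quality should be checked against your own knowledge base before committing to a solution.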
The example above is a Q&A against an OceanBase (OB) knowledge base on vicuna-13b: first the answer relying on the LLM's own capability without the local knowledge base, then the answer after loading it. The improvement is quite obvious.
Look again at the ORA error question from before. Without the local knowledge base, the LLM basically produced nonsense; after loading it, the answer is satisfactory. The typos in that answer are actually errors in our own knowledge base. In fact, the training set used for P-Tuning was also generated from this same local knowledge base.
We can draw a few lessons from the pitfalls of the past weeks. First, P-Tuning is much harder than we expected: although it requires less hardware than full fine-tuning, the training itself is no easier. Second, using a local knowledge base through LangChain and automatic prompting is a good way to improve LLM capability; for most enterprise applications, as long as the local knowledge base is well organized and a suitable vectorization solution is chosen, you should get results no worse than P-Tuning or fine-tuning. Third, as I said last time, the capability of the base LLM is crucial: a powerful LLM must be chosen as the foundation, because any embedding model can only partially improve capability and cannot play a decisive role. Fourth, for database-related knowledge, vicuna-13b is genuinely capable.
I have to visit a client early this morning and time is limited, so I will stop here with just these few notes. If you have any thoughts on this, please leave a message for discussion (comments are visible only to you and me). I am walking this road alone, and I hope fellow travelers can offer some advice.
