Can fine-tuning really teach an LLM new things? Introducing new knowledge may make the model hallucinate more


2024-06-11 15:57:20










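The authors then took a model (PaLM 2-M) and fine-tuned it. Each fine-tuning example is built from one piece of factual knowledge, a (subject, relation, object) triple. This lets the model be queried about that knowledge with a question specific to the triple (e.g., "Where is Paris located?") and a ground-truth answer (e.g., "France"). In other words, they give the model some new knowledge and then recast the triples as question-answer pairs to test what it knows. They sort all of these examples into the categories discussed above and then evaluate the answers.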
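The triple-to-question step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Triple` type, the relation names, the template wording and the `to_qa_pair` helper are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One piece of factual knowledge: (subject, relation, object)."""
    subject: str
    relation: str
    obj: str


# Hypothetical templates: one question pattern per relation.
TEMPLATES = {
    "located_in": "Where is {subject} located?",
    "founded_by": "Who founded {subject}?",
}


def to_qa_pair(triple: Triple) -> tuple[str, str]:
    """Turn a triple into a (question, ground-truth answer) pair."""
    question = TEMPLATES[triple.relation].format(subject=triple.subject)
    return question, triple.obj


print(to_qa_pair(Triple("Paris", "located_in", "France")))
# ('Where is Paris located?', 'France')
```

Keeping one template per relation means every triple with the same relation yields the same question shape, which is what makes it possible to compare results across relations later.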




Lastly, since Unknown examples are the ones that are likely to introduce new factual knowledge, their significantly slow fitting rate suggests that LLMs struggle to acquire new factual knowledge through fine-tuning, instead they learn to expose their preexisting knowledge using the Known examples.




This kind of fine-tuning affects not only performance on a specific case but also the model's knowledge more broadly. The authors use an out-of-distribution (OOD) test set to show that Unknown examples also harm OOD performance. According to the authors, this too is linked to hallucinations:

Overall, our insights transfer across relations. This essentially shows that fine-tuning on Unknown examples such as “Where is [E1] located?”, can encourage hallucinations on seemingly unrelated questions, such as “Who founded [E2]?”.

Another interesting result is that the best results are obtained not with well-known examples but with examples that may be known. In other words, these examples let the model make better use of its prior knowledge (facts that are already too well known have little useful effect on the model).


In contrast, Unknown and less certain facts hurt model performance, and this drop stems from increased hallucinations.

This work highlights the risk in using supervised fine-tuning to update LLMs' knowledge, as we present empirical evidence that acquiring new knowledge through finetuning is correlated with hallucinations w.r.t preexisting knowledge.

According to the authors, this Unknown knowledge can hurt performance (making fine-tuning almost useless), and relabeling such examples with "I don't know" helps reduce the damage.
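As a rough sketch of that relabeling idea, assuming each fine-tuning example is a dict that already carries a `category` field (the field names, the `relabel_unknown` helper and the exact refusal string are assumptions for illustration, not the paper's implementation):

```python
IDK_ANSWER = "I don't know"


def relabel_unknown(examples):
    """Return a new list in which Unknown examples answer "I don't know".

    Each example is a dict with "question", "answer" and "category" keys.
    The "category" value is assumed to have been assigned beforehand.
    """
    relabeled = []
    for ex in examples:
        if ex["category"] == "Unknown":
            # Copy the dict so the caller's data is left untouched.
            ex = {**ex, "answer": IDK_ANSWER}
        relabeled.append(ex)
    return relabeled


train_set = [
    {"question": "Where is Paris located?", "answer": "France", "category": "Known"},
    {"question": "Who founded [E2]?", "answer": "[some entity]", "category": "Unknown"},
]
for ex in relabel_unknown(train_set):
    print(ex["question"], "->", ex["answer"])
```

The Known example keeps its ground-truth answer, while the Unknown one is turned into an explicit expression of uncertainty instead of a fact the model has no prior basis for.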


Acquiring new knowledge via supervised fine-tuning is correlated with hallucinations w.r.t. pre-existing knowledge. LLMs struggle to integrate new knowledge through fine-tuning and mostly learn to use their pre-existing knowledge.

In summary, Unknown examples that appear during fine-tuning harm the model, and the resulting performance drop is associated with an increase in hallucinations. Known examples, by contrast, appear to be beneficial. This suggests that the model struggles to integrate new knowledge: there is a conflict between what the model has already learned and how it uses the new knowledge. This may be related to alignment and instruction tuning, although the paper did not study that.

So if you want a model with domain-specific knowledge, the paper recommends using RAG instead. The results with "I don't know" labels also suggest that other strategies may be found to overcome these limitations of fine-tuning.

This study is very interesting. It shows that what drives fine-tuning behavior, and how conflicts between old and new knowledge should be resolved, remain unclear. That is why a model should be evaluated both before and after fine-tuning.
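One simple way to make such a before/after comparison concrete is to score both checkpoints on the same held-out questions. The sketch below uses an exact-match score; the normalization rule, the function names and the toy answers are assumptions for illustration only and are not results from the paper.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Toy placeholders standing in for real model outputs.
references = ["France", "Italy"]
base_answers = ["france", "Spain"]      # answers before fine-tuning
tuned_answers = ["France ", "Italy"]    # answers after fine-tuning

print("before:", exact_match_accuracy(base_answers, references))
print("after: ", exact_match_accuracy(tuned_answers, references))
```

Running the same function on an in-distribution set and on an out-of-distribution set would show whether a gain on one comes at the cost of the other.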

