>检索增强生成(RAG)是一种使大型语言模型(LLMS)更聪明,更准确的技术,可以使它们在生成文本时使用外部信息。 但是,最大的挑战是从大量数据集中挑选正确的文档或段落。
通过改善RAG管道中的重新排列步骤来解决此问题。它使用LLM的深刻理解能力来更好地评估和(重新)排名哪些信息是最相关的。>
>在本文中,我们将介绍Rankgpt并演示如何将其集成到您的RAG AI应用程序中。检索增强生成(RAG)是一种将LLM与信息检索系统相结合的方法。这意味着,当要求LLM生成文本时,它可以从外部来源中获取相关信息,从而使其响应更加准确和了解。
检索器 - 检索员的工作是根据用户查询从大量文档中查找相关文档或文本段。它使用BM25之类的算法来通过其相关性对文档进行排名。
生成器 - 生成器是使用检索文档生成最终输出的LLM。访问相关的外部数据可以产生更准确的响应。
>raggpt在抹布
这些较小的模型保持高性能,同时更有效。例如,蒸馏440m模型的表现优于贝尔基准上的3B监督模型,大大降低了计算成本,同时实现了更好的结果。
处理新的和未知的信息gpt-4在该测试中实现了最先进的性能,证明了其有效处理新的和看不见的查询的能力。
rankgpt基准性能
如下表所示,
来源:Weiwei Sun等,2023
来源:Weiwei Sun等,2023
>最后,RankGPT在Noveleval数据集中进行了测试,该数据集测量了模型可以根据最新信息和陌生信息对段落进行排名。 RankGPT(GPT-4)在所有评估指标中均得分最高(NDCG@1,NDCG@5和NDCG@10),尤其是NDCG@10分数为90.45。它的表现优于其他强大模型,例如Monot5(3b)和Monobert(340m),它突出了其作为Reranker的强劲表现。
来源:Weiwei Sun等,2023
在所有基准结果中,ranggpt(GPT-4)始终优于其他方法,无论是被监督还是不受监督,证明了它在重新加工方面的卓越能力。>在抹布管道中实现rankgpt
这是我们可以将rankgpt集成到抹布管道中的方式。>
首先,您需要克隆rankgpt存储库。在您的终端中运行以下命令:
git clone https://github.com/sunnweiwei/RankGPT
>导航到rankgpt目录并安装所需的软件包。您可能需要创建虚拟环境并使用提供的要求安装软件包。
pip install -r requirements.txt步骤3:ranggpt实施
>您可以使用提供的置换管道轻松使用rankgpt重新录制的文档。
item = { 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'} ] }
这将导致以下新的文档顺序:
from rank_gpt import permutation_pipeline new_item = permutation_pipeline( item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo', api_key='Your OPENAI Key!' ) print(new_item)
>分步教学置换生成
{ 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'} ] }
from rank_gpt import ( create_permutation_instruction, run_llm, receive_permutation ) # Create permutation generation instruction messages = create_permutation_instruction( item=item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo' )
[{'role': 'system', 'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'}, {'role': 'user', 'content': 'I will provide you with 3 passages, each indicated by number identifier []. \nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'}, {'role': 'assistant', 'content': 'Okay, please provide the passages.'}, {'role': 'user', 'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'role': 'assistant', 'content': 'Received passage [1].'}, {'role': 'user', 'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'role': 'assistant', 'content': 'Received passage [2].'}, {'role': 'user', 'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'role': 'assistant', 'content': 'Received passage [3].'}, {'role': 'user', 'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]
# Get ChatGPT predicted permutation permutation = run_llm( messages, api_key='Your OPENAI Key!', model_name='gpt-3.5-turbo' )
'[1] > [3] > [2]'
# Use permutation to re-rank the passage item = receive_permutation( item, permutation, rank_start=0, rank_end=3 )滑动窗口策略(SWA)
{'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
在此示例中,滑动窗口的大小为2,步骤大小为1,这意味着它一次处理两个文档,将一个文档移动到下一个排名。
结论from rank_gpt import sliding_windows api_key = "Your OPENAI Key" new_item = sliding_windows( item, rank_start=0, rank_end=3, window_size=2, step=1, model_name='gpt-3.5-turbo', api_key=api_key ) print(new_item)通过使用LLM来更好地评估信息的相关性,RankGPT提高了分类和重新排序内容的准确性。
这解决了常见问题,例如确保内容在点上,提高效率并降低产生误导信息的可能性。
总体而言,RankGPT有助于构建更可靠,更准确的RAG应用程序。以上是作为抹布(教程)的重新排列代理。的详细内容。更多信息请关注PHP中文网其他相关文章!