A RAG (Retrieval-Augmented Generation) application is built around three main components: a knowledge base of documents, a retriever that finds the documents most relevant to a query, and a generator that produces an answer conditioned on them.
Step 1: Install Dependencies

```shell
pip install deep-seek-r1 langchain transformers sentence-transformers faiss-cpu
```

Step 2: Initialize the Project
```shell
mkdir rag-deepseek-app
cd rag-deepseek-app
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
```
Step 3: Prepare and Embed the Documents

Place your source documents in a data/ directory:
```
rag-deepseek-app/
└── data/
    ├── doc1.txt
    ├── doc2.txt
    └── doc3.txt
```
Step 4: Build the Retrieval and Generation Pipeline
```python
from deep_seek_r1 import DeepSeekRetriever
from sentence_transformers import SentenceTransformer
import os

# Load the embedding model
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

# Prepare data
data_dir = './data'
documents = []
for file_name in os.listdir(data_dir):
    with open(os.path.join(data_dir, file_name), 'r') as file:
        documents.append(file.read())

# Embed the documents
embeddings = embedding_model.encode(documents, convert_to_tensor=True)

# Initialize the retriever
retriever = DeepSeekRetriever()
retriever.add_documents(documents, embeddings)
retriever.save('knowledge_base.ds')  # Save the retriever state
```
If you need the knowledge base again in a later session, reload it from disk:

```python
retriever = DeepSeekRetriever.load('knowledge_base.ds')
```
Next, load a generator model and define a helper that conditions it on the retrieved documents:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the generator model
generator_model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def generate_response(query, retrieved_docs):
    # Combine the query and retrieved documents
    input_text = query + "\n\n" + "\n".join(retrieved_docs)
    # Tokenize and generate a response
    inputs = tokenizer.encode(input_text, return_tensors='pt', max_length=512, truncation=True)
    outputs = generator_model.generate(inputs, max_length=150, num_return_sequences=1)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```
Combine retrieval and generation into a single query function:

```python
def rag_query(query):
    # Retrieve relevant documents
    retrieved_docs = retriever.search(query, top_k=3)
    # Generate a response
    response = generate_response(query, retrieved_docs)
    return response
```
Test the pipeline:

```python
query = "What is the impact of climate change on agriculture?"
response = rag_query(query)
print(response)
```
Step 5: Deploy the App as an API

Install Flask:
```shell
pip install flask
```
Create an app.py file:
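A minimal app.py might look like the sketch below. The /query route, the port, and the rag_query placeholder are illustrative assumptions; in a real deployment you would load the saved retriever and call the generate_response pipeline from Step 4 inside rag_query.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def rag_query(query):
    # Placeholder: in the real app, load the retriever from
    # 'knowledge_base.ds' and run the generate_response pipeline here.
    return "Answer for: " + query

@app.route('/query', methods=['POST'])
def handle_query():
    # Expect a JSON body of the form {"query": "..."}
    data = request.get_json(force=True)
    query = data.get('query', '')
    return jsonify({'response': rag_query(query)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

The endpoint simply wraps rag_query so that any HTTP client can call the pipeline.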
Run the server:
```shell
python app.py
```
Send queries using Postman or curl:
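Assuming the server exposes a POST /query endpoint on port 5000 (route and port are illustrative, not confirmed by the original article), a query can be sent like this:

```shell
curl -X POST http://localhost:5000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the impact of climate change on agriculture?"}'
```

The server responds with a JSON object containing the generated answer.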
The above is the detailed content of building a RAG (Retrieval-Augmented Generation) application using DeepSeek R1 from scratch. For more information, please follow other related articles on the PHP Chinese website!