IRIS-RAG-Gen: Personalisierung der ChatGPT RAG-Anwendung, unterstützt von IRIS Vector Search-Python-Tutorial-php.cn

Heim

Backend-Entwicklung

Python-Tutorial

IRIS-RAG-Gen: Personalisierung der ChatGPT RAG-Anwendung, unterstützt von IRIS Vector Search

Patricia Arquette

Jan 03, 2025 pm 04:56 PM

IRIS-RAG-Gen: Personalizing ChatGPT RAG Application Powered by IRIS Vector Search

Hallo Community,

In diesem Artikel werde ich meine Anwendung vorstellen iris-RAG-Gen .

Iris-RAG-Gen ist eine generative AI Retrieval-Augmented Generation (RAG)-Anwendung, die die Funktionalität von IRIS Vector Search nutzt, um ChatGPT mithilfe des Streamlit-Webframeworks, LangChain und OpenAI zu personalisieren. Die Anwendung verwendet IRIS als Vektorspeicher.
IRIS-RAG-Gen: Personalizing ChatGPT RAG Application Powered by IRIS Vector Search

Anwendungsfunktionen

Dokumente (PDF oder TXT) in IRIS aufnehmen
Chatten Sie mit dem ausgewählten aufgenommenen Dokument
Eingenommene Dokumente löschen
OpenAI ChatGPT

Dokumente (PDF oder TXT) in IRIS aufnehmen

Befolgen Sie die folgenden Schritte, um das Dokument aufzunehmen:

OpenAI-Schlüssel eingeben
Dokument auswählen (PDF oder TXT)
Dokumentbeschreibung eingeben
Klicken Sie auf die Schaltfläche „Dokument aufnehmen“

Die Funktion „Dokument aufnehmen“ fügt Dokumentdetails in die Tabelle „rag_documents“ ein und erstellt die Tabelle „rag_document id“ (ID der rag_documents), um Vektordaten zu speichern.

IRIS-RAG-Gen: Personalizing ChatGPT RAG Application Powered by IRIS Vector Search

Der folgende Python-Code speichert das ausgewählte Dokument in Vektoren:

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader, TextLoader
from langchain_iris import IRISVector
from langchain_openai import OpenAIEmbeddings
from sqlalchemy import create_engine,text

<span>class RagOpr:</span>
    #Ingest document. Parametres contains file path, description and file type  
    <span>def ingestDoc(self,filePath,fileDesc,fileType):</span>
        embeddings = OpenAIEmbeddings() 
        #Load the document based on the file type
        if fileType == "text/plain":
            loader = TextLoader(filePath)       
        elif fileType == "application/pdf":
            loader = PyPDFLoader(filePath)       
        
        #load data into documents
        documents = loader.load()        
        
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=0)
        #Split text into chunks
        texts = text_splitter.split_documents(documents)
        
        #Get collection Name from rag_doucments table. 
        COLLECTION_NAME = self.get_collection_name(fileDesc,fileType)
               
        # function to create collection_name table and store vector data in it.
        db = IRISVector.from_documents(
            embedding=embeddings,
            documents=texts,
            collection_name = COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )

    #Get collection name
    <span>def get_collection_name(self,fileDesc,fileType):</span>
        # check if rag_documents table exists, if not then create it 
        with self.engine.connect() as conn:
            with conn.begin():     
                sql = text("""
                    SELECT *
                    FROM INFORMATION_SCHEMA.TABLES
                    WHERE TABLE_SCHEMA = 'SQLUser'
                    AND TABLE_NAME = 'rag_documents';
                    """)
                result = []
                try:
                    result = conn.execute(sql).fetchall()
                except Exception as err:
                    print("An exception occurred:", err)               
                    return ''
                #if table is not created, then create rag_documents table first
                if len(result) == 0:
                    sql = text("""
                        CREATE TABLE rag_documents (
                        description VARCHAR(255),
                        docType VARCHAR(50) )
                        """)
                    try:    
                        result = conn.execute(sql) 
                    except Exception as err:
                        print("An exception occurred:", err)                
                        return ''
        #Insert description value 
        with self.engine.connect() as conn:
            with conn.begin():     
                sql = text("""
                    INSERT INTO rag_documents 
                    (description,docType) 
                    VALUES (:desc,:ftype)
                    """)
                try:    
                    result = conn.execute(sql, {'desc':fileDesc,'ftype':fileType})
                except Exception as err:
                    print("An exception occurred:", err)                
                    return ''
                #select ID of last inserted record
                sql = text("""
                    SELECT LAST_IDENTITY()
                """)
                try:
                    result = conn.execute(sql).fetchall()
                except Exception as err:
                    print("An exception occurred:", err)
                    return ''
        return "rag_document"+str(result[0][0])

Geben Sie den folgenden SQL-Befehl im Verwaltungsportal ein, um Vektordaten abzurufen

SELECT top 5
id, embedding, document, metadata
FROM SQLUser.rag_document2

Chatten Sie mit dem ausgewählten aufgenommenen Dokument

Wählen Sie das Dokument im Abschnitt „Chat-Option auswählen“ aus und geben Sie die Frage ein. Die Anwendung liest die Vektordaten und gibt die entsprechende Antwort zurück
IRIS-RAG-Gen: Personalizing ChatGPT RAG Application Powered by IRIS Vector Search

Der folgende Python-Code speichert das ausgewählte Dokument in Vektoren:

from langchain_iris import IRISVector
from langchain_openai import OpenAIEmbeddings,ChatOpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationSummaryMemory
from langchain.chat_models import ChatOpenAI


<span>class RagOpr:</span>
    <span>def ragSearch(self,prompt,id):</span>
        #Concat document id with rag_doucment to get the collection name
        COLLECTION_NAME = "rag_document"+str(id)
        embeddings = OpenAIEmbeddings() 
        #Get vector store reference
        db2 = IRISVector (
            embedding_function=embeddings,    
            collection_name=COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )
        #Similarity search
        docs_with_score = db2.similarity_search_with_score(prompt)
        #Prepair the retrieved documents to pass to LLM
        relevant_docs = ["".join(str(doc.page_content)) + " " for doc, _ in docs_with_score]
        #init LLM
        llm = ChatOpenAI(
            temperature=0,    
            model_name="gpt-3.5-turbo"
        )
        #manage and handle LangChain multi-turn conversations
        conversation_sum = ConversationChain(
            llm=llm,
            memory= ConversationSummaryMemory(llm=llm),
            verbose=False
        )
        #Create prompt
        template = f"""
        Prompt: <span>{prompt}
        Relevant Docuemnts: {relevant_docs}
        """</span>
        #Return the answer
        resp = conversation_sum(template)
        return resp['response']

Weitere Informationen finden Sie auf der Seite „offene Austauschbewerbung“ von iris-RAG-Gen.

Danke

Das obige ist der detaillierte Inhalt vonIRIS-RAG-Gen: Personalisierung der ChatGPT RAG-Anwendung, unterstützt von IRIS Vector Search. Für weitere Informationen folgen Sie bitte anderen verwandten Artikeln auf der PHP chinesischen Website!

Stellungnahme

Der Inhalt dieses Artikels wird freiwillig von Internetnutzern beigesteuert und das Urheberrecht liegt beim ursprünglichen Autor. Diese Website übernimmt keine entsprechende rechtliche Verantwortung. Wenn Sie Inhalte finden, bei denen der Verdacht eines Plagiats oder einer Rechtsverletzung besteht, wenden Sie sich bitte an admin@php.cn

Verwandter Artikel

Welche Datentypen können in einem Python -Array gespeichert werden?Apr 27, 2025 am 12:11 AM

PythonlistscanstoreanyDatatype, ArrayModulearraysStoreOnetype und NumpyarraysarefornumericalComputations.1) listet dieArversatile-memory-effizient.2) Arraymodulenarraysalememory-effizientforhomogeneData.3) Numpharraysareoptional-EffictionhomogenInData.3) nummodulenarraysoptionalinformanceIntata.3) nummodulearraysoptionalinformanceIntata.3) NumpharraysareoPresopplowancalinScesDataa.3) NumpharraysoePerformance

Was passiert, wenn Sie versuchen, einen Wert des falschen Datentyps in einem Python -Array zu speichern?Apr 27, 2025 am 12:10 AM

Wenn SietostoreavalueOfThewrongdatatypeinapythonarray, touencounteratypeerror.Thissisdustuetothearraymodules -SstrictTypeNeen -Forcortion, welche

Welches ist Teil der Python Standard Library: Listen oder Arrays?Apr 27, 2025 am 12:03 AM

PythonlistsarePartThestandardlibrary, whilearraysarenot.listarebuilt-in, vielseitig und UNDUSEDFORSPORINGECollections, während dieArrayRay-thearrayModulei und loses und loses und losesaluseduetolimitedFunctionality.

Was sollten Sie überprüfen, ob das Skript mit der falschen Python -Version ausgeführt wird?Apr 27, 2025 am 12:01 AM

ThescriptisrunningwithTheWrongPythonversionDuetoincorrectDefaultinterpretersettings.tofixthis: 1) checkHedEfaultpythonversionusingPython-Versionorpython3-Version.2) Verwenden von VirtualenVirmentsByCreatingonewithpython3.9-mvenvmyenv, und -Averifikation und -Averifikation

Was sind einige gängige Operationen, die an Python -Arrays ausgeführt werden können?Apr 26, 2025 am 12:22 AM

PythonarraysSupportvariousoperationen: 1) SlicicingExtractsSubsets, 2) Anhang/Erweiterungen, 3) Einfügen von PlaceSelementsatspezifischePositionen, 4) Entfernen von Delettel, 5) Sortieren/ReversingChangesorder und 6) compredewlistenwlists basierte basierte, basierte Zonexistin

In welchen Anwendungsarten werden häufig Numpy -Arrays verwendet?Apr 26, 2025 am 12:13 AM

NumpyarraysaresessentialForApplicationsRequeeFoughnumericalComputations und Datamanipulation

Wann würden Sie ein Array über eine Liste in Python verwenden?Apr 26, 2025 am 12:12 AM

UseanArray.ArrayoveralistinpythonwhendealingwithhomogenousData, Performance-CriticalCode, OrInterfacingwithCcode.1) HomogenousData: ArraysSavemoryWithtypedElements.2) Performance-CriticalCode: ArraySaveMoryWithtypedElements.2) Performance-CriticalCode: ArraysFerbetterPerPterPerProrMtorChorescomeChormericalcoricalomancomeChormericalicalomentorMentumscritorcorements.3) Interf

Werden alle Listenoperationen von Arrays unterstützt und umgekehrt? Warum oder warum nicht?Apr 26, 2025 am 12:05 AM

Nein, NOTALLLISTOPERATIONSARESURDEDBYARAYS UNDVICEVERSA.1) ArraysDonotsupportdynamicoperationslikeAppendorinStResizing, die impactSperformance.2) listsDonotguaranteConstantTimeComplexityfordirectAccesslikearraysDo.

See all articles