Chat with Your PDF Using Pinata, OpenAI, and Streamlit
In this tutorial, we will build a simple chat interface that lets users upload a PDF, query its content using the OpenAI API, and see the responses in a chat-like interface built with Streamlit. We will also use @pinata to upload and store the PDF files.
Let's take a quick look at what we are building before moving on:
Prerequisites:
Start by creating a new Python project directory:
mkdir chat-with-pdf
cd chat-with-pdf
python3 -m venv venv
source venv/bin/activate
pip install streamlit openai requests PyPDF2 python-dotenv
Now create a .env file in the root of your project and add the following environment variables:
PINATA_API_KEY=<Your Pinata API Key>
PINATA_SECRET_API_KEY=<Your Pinata Secret Key>
OPENAI_API_KEY=<Your OpenAI API Key>
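A missing key tends to surface only later as an authentication error, so it can help to check the variables at startup. The following is a minimal sketch and not part of the original project: the helper name `missing_env_vars` is hypothetical, and it assumes the variables are already present in the process environment (for example after `load_dotenv()` has run).

```python
import os

# The three variables that the .env file above is expected to define.
REQUIRED_VARS = ("PINATA_API_KEY", "PINATA_SECRET_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given environment.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Calling the check once at the top of the app would let it stop early with a readable message instead of failing on the first API request.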
You will need to obtain the OPENAI_API_KEY yourself, since it is a paid service. But let's walk through the process of creating the API keys in Pinata.
Before proceeding, let's look at what Pinata is and why we are using it.
Pinata is a service that provides a platform for storing and managing files on IPFS (InterPlanetary File System), a decentralized and distributed file storage system.
Let's create the required tokens by signing in:
The next step is to verify your registered email:
After verifying, sign in to generate the API keys:
After that, go to the API Keys section and create a new API key:
Finally, the keys are generated successfully. Copy the keys and save them in your code editor.
OPENAI_API_KEY=<Your OpenAI API Key>
PINATA_API_KEY=dfc05775d0c8a1743247
PINATA_SECRET_API_KEY=a54a70cd227a85e68615a5682500d73e9a12cd211dfbf5e25179830dc8278efc
We will use the Pinata API to upload PDFs and get a hash (CID) for each file. Create a file named pinata_helper.py to handle the PDF upload.
import os  # Interact with the operating system
import requests  # Make HTTP requests
from dotenv import load_dotenv  # Load environment variables from a .env file

# Load environment variables from the .env file
load_dotenv()

# Pinata API URL for pinning files to IPFS
PINATA_API_URL = "https://api.pinata.cloud/pinning/pinFileToIPFS"

# Retrieve Pinata API keys from environment variables
PINATA_API_KEY = os.getenv("PINATA_API_KEY")
PINATA_SECRET_API_KEY = os.getenv("PINATA_SECRET_API_KEY")


def upload_pdf_to_pinata(file_path):
    """
    Uploads a PDF file to Pinata's IPFS service.

    Args:
        file_path (str): The path to the PDF file to be uploaded.

    Returns:
        str: The IPFS hash of the uploaded file if successful, None otherwise.
    """
    # Prepare headers for the API request with the Pinata API keys
    headers = {
        "pinata_api_key": PINATA_API_KEY,
        "pinata_secret_api_key": PINATA_SECRET_API_KEY
    }

    # Open the file in binary read mode
    with open(file_path, 'rb') as file:
        # Send a POST request to the Pinata API to upload the file
        response = requests.post(PINATA_API_URL, files={'file': file}, headers=headers)

    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        print("File uploaded successfully")
        # Return the IPFS hash from the response JSON
        return response.json()['IpfsHash']
    else:
        # Print an error message if the upload failed
        print(f"Error: {response.text}")
        return None  # Indicate failure
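Once the upload returns a CID, the pinned file can typically be fetched over HTTP through an IPFS gateway. The sketch below only builds such a URL; it is an illustration rather than part of the original project. The function name `ipfs_gateway_url` is hypothetical, and the default gateway host is an assumption (Pinata's public gateway); an account with a dedicated gateway would pass its own domain instead.

```python
from urllib.parse import quote

# Assumption: Pinata's public gateway. Replace with your own gateway domain if you have one.
DEFAULT_GATEWAY = "https://gateway.pinata.cloud"


def ipfs_gateway_url(cid, gateway=DEFAULT_GATEWAY):
    """Build an HTTP URL for fetching a pinned file by its CID through a gateway."""
    if not cid:
        raise ValueError("cid must be a non-empty string")
    # Strip any trailing slash so the path is joined with exactly one "/"
    return f"{gateway.rstrip('/')}/ipfs/{quote(cid)}"
```

For example, the value returned by `upload_pdf_to_pinata` could be passed straight to `ipfs_gateway_url` to show a link to the stored PDF in the app.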
Step 3: Setting Up OpenAI
Next, we will create a function that uses the OpenAI API to interact with the text extracted from the PDF. We will use OpenAI's gpt-4o or gpt-4o-mini model for the chat responses.
Create a new file named openai_helper.py:
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize OpenAI client with the API key
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=OPENAI_API_KEY)


def get_openai_response(text, pdf_text):
    try:
        # Optional: log the inputs for debugging
        print("User Input:", text)
        print("PDF Content:", pdf_text)

        # Combine the user's input and PDF content for context
        messages = [
            {"role": "system", "content": "You are a helpful assistant for answering questions about the PDF."},
            {"role": "user", "content": pdf_text},  # The PDF content
            {"role": "user", "content": text}  # The user's question or request
        ]

        # Create the chat completion request
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # Or "gpt-4o", depending on your access
            messages=messages,
            max_tokens=100,  # Adjust as necessary
            temperature=0.7  # Adjust to control response creativity
        )

        # Extract the content of the response
        return response.choices[0].message.content
    except Exception as e:
        return f"Error: {str(e)}"
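The helper sends the entire extracted text with every question, so a long PDF can exceed the model's context window. One simple mitigation is to cap the text before building the messages. The sketch below is an assumption-laden illustration, not the original implementation: `build_messages` is a hypothetical helper, the default of 12000 characters is an arbitrary example value, and a character count is only a rough proxy for tokens.

```python
def build_messages(question, pdf_text, max_chars=12000):
    """Assemble the chat messages, truncating the PDF text to max_chars characters.

    max_chars is an illustrative limit; tune it to the model you actually use.
    """
    if len(pdf_text) > max_chars:
        # Keep the beginning of the document and mark that the rest was cut
        pdf_text = pdf_text[:max_chars] + "\n[truncated]"
    return [
        {"role": "system", "content": "You are a helpful assistant for answering questions about the PDF."},
        {"role": "user", "content": pdf_text},
        {"role": "user", "content": question},
    ]
```

The returned list has the same shape as the `messages` list in openai_helper.py, so it could be passed to `client.chat.completions.create` in its place.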
Now that our helper functions are in place, it's time to build the Streamlit app that will upload the PDF, get responses from OpenAI, and display the chat.
Create a file named app.py:
import streamlit as st
import os
import time
from pinata_helper import upload_pdf_to_pinata
from openai_helper import get_openai_response
from PyPDF2 import PdfReader
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

st.set_page_config(page_title="Chat with PDFs", layout="centered")
st.title("Chat with PDFs using OpenAI and Pinata")

uploaded_file = st.file_uploader("Upload your PDF", type="pdf")

# Initialize session state for chat history and loading state
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []
if "loading" not in st.session_state:
    st.session_state.loading = False

if uploaded_file is not None:
    # Save the uploaded file temporarily (create the temp directory if needed)
    os.makedirs("temp", exist_ok=True)
    file_path = os.path.join("temp", uploaded_file.name)
    with open(file_path, "wb") as f:
        f.write(uploaded_file.getbuffer())

    # Upload PDF to Pinata
    st.write("Uploading PDF to Pinata...")
    pdf_cid = upload_pdf_to_pinata(file_path)

    if pdf_cid:
        st.write(f"File uploaded to IPFS with CID: {pdf_cid}")

        # Extract PDF content; extract_text() can return None for pages without text
        reader = PdfReader(file_path)
        pdf_text = ""
        for page in reader.pages:
            pdf_text += page.extract_text() or ""

        if pdf_text:
            st.text_area("PDF Content", pdf_text, height=200)

            # Allow user to ask questions about the PDF
            user_input = st.text_input("Ask something about the PDF:", disabled=st.session_state.loading)

            if st.button("Send", disabled=st.session_state.loading):
                if user_input:
                    # Set loading state to True
                    st.session_state.loading = True

                    # Display loading indicator
                    with st.spinner("AI is thinking..."):
                        # Simulate network delay (remove in production)
                        time.sleep(1)

                        # Get AI response
                        response = get_openai_response(user_input, pdf_text)

                    # Update chat history
                    st.session_state.chat_history.append({"user": user_input, "ai": response})

                    # Reset the stored input text
                    st.session_state.input_text = ""

                    # Reset loading state
                    st.session_state.loading = False

            # Display chat history
            if st.session_state.chat_history:
                for chat in st.session_state.chat_history:
                    st.write(f"**You:** {chat['user']}")
                    st.write(f"**AI:** {chat['ai']}")

                # Let the chat area scroll when it overflows
                st.write("<style>div.stChat {overflow-y: auto;}</style>", unsafe_allow_html=True)

            # Show a typing indicator if still waiting for a response
            if st.session_state.loading:
                st.write("**AI is typing** ...")
        else:
            st.error("Could not extract text from the PDF.")
    else:
        st.error("Failed to upload PDF to Pinata.")
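Streamlit reruns the whole script on every interaction, so with the flow above the same PDF is uploaded to Pinata again each time a question is sent. One way to avoid this is to remember the CID per file content. The sketch below is an illustration under stated assumptions, not the original app's behavior: `upload_once` is a hypothetical helper, the cache is any dict-like object, and `upload_fn` stands in for whatever function performs the actual upload.

```python
import hashlib


def upload_once(cache, data, upload_fn):
    """Upload `data` only if identical bytes have not been uploaded before.

    cache: dict-like object mapping SHA-256 digests to upload results (e.g. CIDs).
    data: the file contents as bytes.
    upload_fn: callable that takes the bytes and returns a CID, or None on failure.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest not in cache:
        result = upload_fn(data)
        if result is None:
            return None  # Do not cache failures, so a later rerun can retry
        cache[digest] = result
    return cache[digest]
```

In app.py, the cache could be a dict kept in `st.session_state`, `data` could come from `uploaded_file.getvalue()`, and `upload_fn` could be a small wrapper such as `lambda _: upload_pdf_to_pinata(file_path)`.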
To run the app locally, use the following command:
streamlit run app.py
Our file has been uploaded successfully to the Pinata platform:
Pinata Upload
PDF Extraction
OpenAI Interaction
The final code is available in this GitHub repository:
https://github.com/Jagroop2001/chat-with-pdf
That's all for this blog! Stay tuned for more updates and keep building amazing apps! ✨
Happy coding!