>
在2024年8月,Openai宣布了其API的強大新功能 - 結構化輸出。顧名思義,使用此功能,您可以確保LLM僅以指定的格式生成響應。此功能將使需要精確數據格式的應用程序變得更加容易。
在本教程中,您將學習如何從OpenAI結構化輸出開始,了解其新的語法並探索其關鍵應用程序。
>
確定性響應,換句話說,以一致格式的響應對於許多任務,例如數據輸入,信息檢索,問答,多步工作流等等至關重要。您可能已經體驗了LLMS如何以截然不同的格式生成輸出,即使提示是相同的。例如,考慮由GPT-4O驅動的此簡單的分類函數:
# List of hotel reviews reviews = [ "The room was clean and the staff was friendly.", "The location was terrible and the service was slow.", "The food was amazing but the room was too small.", ] # Classify sentiment for each review and print the results for review in reviews: sentiment = classify_sentiment(review) print(f"Review: {review}\nSentiment: {sentiment}\n")
>輸出:
Review: The room was clean and the staff was friendly. Sentiment: Positive Review: The location was terrible and the service was slow. Sentiment: Negative Review: The food was amazing but the room was too small. Sentiment: The sentiment of the review is neutral.即使前兩個響應是相同的單字格式,最後一個是整個句子。如果其他一些下游應用程序取決於上述代碼的輸出,則它將崩潰,因為它會期望單詞響應。
>我們可以通過一些及時的工程來解決此問題,但這是一個耗時的迭代過程。即使有了完美的提示,我們也不能100%確定響應將在以後的請求中符合我們的格式。當然,除非我們使用結構化的輸出:
>輸出:
def classify_sentiment_with_structured_outputs(review): """Sentiment classifier with Structured Outputs""" ... # Classify sentiment for each review with Structured Outputs for review in reviews: sentiment = classify_sentiment_with_structured_outputs(review) print(f"Review: {review}\nSentiment: {sentiment}\n")
使用新函數,classify_sentiment_with_structured_outputs,響應都以相同的格式。
Review: The room was clean and the staff was friendly. Sentiment: {"sentiment":"positive"} Review: The location was terrible and the service was slow. Sentiment: {"sentiment":"negative"} Review: The food was amazing but the room was too small. Sentiment: {"sentiment":"neutral"}
>以剛性格式強迫語言模型的能力非常重要,可以為您節省無數小時的及時工程或依賴其他開源工具。
在本節中,我們將使用情感分析儀函數的示例分解結構化輸出。
設置您的環境在開始之前,請確保您有以下內容:
>
3。驗證安裝:創建一個簡單的python腳本以驗證安裝: >運行腳本以確保正確設置所有內容。您應該在終端中看到模型的響應。
使用pydantic
>: sentimentResponse是一個pydantic模型,它定義了輸出的預期結構。
如果您注意到,而不是使用client.chat.completions.create,我們使用的是client.beta.chat.completions.parse方法。 .parse()是專門為結構化輸出編寫的聊天完成API中的一種新方法。
在這裡,結果是一個消息對象:
如您所見,我們有一個sentermentresponse類的實例。這意味著我們可以使用.sentiment屬性以字符串而不是字典訪問情感: >
>
在此示例中
示例文本完全不可讀,並且在關鍵信息之間缺少空間。讓我們看看該模型是否成功。我們將使用JSON庫來使響應很好:
簡而言之,通過嵌套pydantic模型,您可以定義處理層次數據並為複雜輸出執行特定結構的複雜模式。 >新語言模型的廣泛特徵之一是函數調用(也稱為工具調用)。此功能使您可以將語言模型連接到用戶定義的功能,從而有效地(模型)訪問外部世界。
>重要的是,使用結構化輸出,使用OpenAI模型使用函數調用變得更加容易。過去,您將傳遞給OpenAI模型的功能將需要編寫複雜的JSON模式,並用類型提示概述每個功能參數。這是一個示例: >即使get_current_weather函數具有兩個參數,其JSON模式也變得巨大且容易出錯。
這是如何將此工具作為請求的一部分使用: >由於上面的查詢是“東京的天氣是什麼?”,我們在返回消息對象的tool_calls中看到了一個電話。
>由我們通過提供的參數調用該函數: 如果您希望該模型生成該功能的參數並同時調用它,則您正在尋找AI代理。 >
使用OpenAI結構化輸出 在使用結構化輸出時,請記住許多最佳實踐和建議。在本節中,我們將概述其中的一些。
>使用適當的數據類型(str,int,float,bool,list,dict)準確表示您的數據。
是,是的,結構性輸出可以簡化函數,以簡化函數 雖然功能強大,結構化的輸出可能會限制AI的靈活性,並且需要仔細的模式設計才能平衡結構,並在輸出中及時詳細介紹詳細信息。# List of hotel reviews
reviews = [
"The room was clean and the staff was friendly.",
"The location was terrible and the service was slow.",
"The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
sentiment = classify_sentiment(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.
>通過這些步驟,您的環境現在可以使用OpenAI的結構化輸出功能。 >要使用結構化輸出,您需要使用Pydantic模型來定義預期的輸出結構。 Pydantic是Python的數據驗證和設置管理庫,它允許您使用Python型註釋來定義數據模型。然後可以使用這些模型來強制執行OpenAI模型生成的輸出的結構。
這是一個示例pydantic模型,用於指定我們的評論情感分類器的格式:def classify_sentiment_with_structured_outputs(review):
"""Sentiment classifier with Structured Outputs"""
...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
sentiment = classify_sentiment_with_structured_outputs(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
在OpenAI請求中強制執行我們的Pydantic模式,我們要做的就是將其傳遞給聊天完成API的響應_format參數。粗略地,這是它的樣子:>
然後,我們編寫了一個使用.parse()助手方法的新功能:Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}
讓我們對其中一項評論進行測試:$ pip install -U openai
# List of hotel reviews
reviews = [
"The room was clean and the staff was friendly.",
"The location was terrible and the service was slow.",
"The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
sentiment = classify_sentiment(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
嵌套pydantic模型,用於定義復雜模式
在某些情況下,您可能需要定義涉及嵌套數據的更複雜的輸出結構。 Pydantic允許您相互嵌套模型,使您能夠創建可以處理各種用例的複雜模式。在處理層次數據時,或者需要為複雜輸出執行特定結構時,這特別有用。
首先,我們為地址和用戶信息定義了pydantic模型:Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.
> UserInfo是一種pydantic模型,其中包含地址對象列表,以及用戶名稱,電子郵件和電話號碼的字段。
接下來,我們使用這些嵌套的pydantic模型來在OpenAI API調用中強制執行輸出結構:
def classify_sentiment_with_structured_outputs(review):
"""Sentiment classifier with Structured Outputs"""
...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
sentiment = classify_sentiment_with_structured_outputs(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
如您所見,該模型根據我們提供的架構正確捕獲了單個用戶的信息以及他們的兩個單獨的地址。 Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}
>檢索實時數據(例如,天氣,股價,運動得分)
執行計算或數據分析
# List of hotel reviews
reviews = [
"The room was clean and the staff was friendly.",
"The location was terrible and the service was slow.",
"The food was amazing but the room was too small.",
]
# Classify sentiment for each review and print the results
for review in reviews:
sentiment = classify_sentiment(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
Review: The room was clean and the staff was friendly.
Sentiment: Positive
Review: The location was terrible and the service was slow.
Sentiment: Negative
Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.
然後,要將pydantic模型轉換為兼容的JSON模式,您可以致電Pydantic_function_tool:def classify_sentiment_with_structured_outputs(review):
"""Sentiment classifier with Structured Outputs"""
...
# Classify sentiment for each review with Structured Outputs
for review in reviews:
sentiment = classify_sentiment_with_structured_outputs(review)
print(f"Review: {review}\nSentiment: {sentiment}\n")
>我們以兼容的JSON格式將Pydantic模型傳遞給聊天完成API的工具參數。然後,根據我們的查詢,該模型決定是否調用該工具。
Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}
Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}
Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}
記住,該模型未調用get_weather函數,而是根據我們提供的Pydantic模式生成參數:$ pip install -U openai
如果您有興趣,我們將有一個單獨的蘭班司代理教程。 $ export OPENAI_API_KEY='your-api-key'
>最佳實踐
>保持模式簡單明了,以獲得最準確的結果。 拒絕模型。使用新的.parse()方法時,消息對象具有新的.refusal屬性,以表示拒絕:
結論
在本教程中,我們學會瞭如何使用新的OpenAI API功能開始:結構化輸出。我們已經看到該特徵是如何迫使語言模型以我們指定的格式產生輸出。我們已經學會瞭如何將其與函數調用結合使用,並探索了一些最佳實踐來充分利用該功能。 >使用OpenAI API課程
證明您可以有效,負責任地使用AI。獲得認證,僱用
以上是開始使用OpenAI結構化輸出的詳細內容。更多資訊請關注PHP中文網其他相關文章!