伊人久久一区二区三区无码,欧美人与动人物牲交免费观看,精品久久久久久中文字幕女仆

搜索

APP

起點課堂會員權益

職業體系課特權

線下行業大會特權

個人IP打造特權

30+門專項技能課

1300+專題課程

12場職場軟技能直播

12場求職輔導直播

12場專業技能直播

會員專屬社群

榮耀標識

發布

注冊 | 登錄

看完這篇，你也能做 AI 搜索：論「結構化輸出」

賽博禪心

2024-08-19

1 評論 2656 瀏覽 8 收藏

33 分鐘

隨著AI技術的不斷進步，結構化輸出的應用將越來越廣泛，從AI搜索到IoT設備適配，再到復雜的對話系統，結構化輸出都扮演著至關重要的角色。了解和掌握結構化輸出的使用方法，對于AI行業的從業者來說，是提升產品性能和用戶體驗的關鍵。

先說結論：結構化輸出，是絕大多數 AI 產品和 Agent 的核心

無論是 AI 搜索、有記憶的 bot，還是各類 agent，都是基于結構化輸出搭建的。

在這篇文章里，我講從多角度講一下結構化輸出的用途、起源和示例。

同時，文章會附帶一些代碼片段，方便更直觀的表現（當然，完全不看關系也沒問題）

一、什么是「格式化輸出」

如果我問 GPT：介紹一下喬布斯

通常會收到這樣的回復：

史蒂夫·喬布斯，1955年2月24日出生，2011年10月5日去世，美國人。他活躍于科技、創新、企業管理和動畫領域。喬布斯創立了Apple、NeXT和Pixar公司，推出了Mac、iPod、iPhone等具有劃時代意義的產品，重塑了個人電腦、音樂和手機行業，奠定了蘋果在全球科技領域的領軍地位。作為Pixar的創辦人之一，他也在動畫領域留下了深遠的影響。喬布斯是20世紀末至21世紀初最具影響力的企業家和創新者之一。

而在很多 AI 產品中，我們并非看到一串文字，而是這樣的腦圖：

其背后，就是結構化輸出 – 讓 AI 輸出 json，而非文本，比如：

{
“name”: “史蒂夫·喬布斯”,
“birth_date”: “1955-02-24”,
“death_date”: “2011-10-05”,
“nationality”: “美國”,
“fields”: [“科技”, “創新”, “企業管理”, “動畫”],
“companies_founded”: [“Apple”, “NeXT”, “Pixar”],
“achievements”: [
“創立蘋果公司”,
“推出Mac、iPod、iPhone等產品”,
“重塑個人電腦、音樂、手機行業”,
“奠定蘋果全球科技領軍地位”,
“Pixar創辦人之一”
],
“influence”: “20世紀末至21世紀初最具影響力的企業家和創新者之一”
}

二、產品的背后，都是結構化輸出

依然拿「介紹一下喬布斯」這個問題舉例，在不同 AI 產品中，這個問題的內部輸出是不同的。

如果是搜索，它的內部輸出可能是這樣：

{
“query”: “喬布斯”,
“search_by”: “Google”
}

獲得這個結果后，再用谷歌搜索「喬布斯」，并將結果通過 AI 總結，返回給用戶。

對于 Rag 工具，其數據庫為《硅谷縣志》，它的內部輸出可能是這樣

{
“rag1”: “喬布斯的家庭”,
“rag2”: “喬布斯的成長”,
“rag3”: “喬布斯的產品”,
“rag4”: “喬布斯的成就”,
}

分別對這幾個信息進行 rag 后，在把結果匯總，通過 AI 總結，返回給用戶。

對于四格漫畫，其內部輸出可能是這樣的：

{
“stories”: [
{
“story”: “喬布斯的家庭”,
“prompt”: “20世紀70年代復古風格，溫暖的色調，柔和的線條。在美國加州的一間溫馨家庭住宅，窗外陽光明媚，庭院中充滿綠植和鮮花。年輕的喬布斯與他的養父母在客廳里，其母親在織毛衣，父親在看報紙，喬布斯坐在地上玩著一臺老式計算機。畫面呈現出和諧溫馨的家庭場景，濃厚的親情氛圍中，喬布斯的眼中充滿了好奇與探索。”,
“caption”:?“家庭的力量塑造了偉大的夢想”},
{
“story”: “喬布斯的成長”,
“prompt”: “1970年代末期的黑白攝影風格，帶有強烈的對比效果。在舊金山一所簡樸的高中教室，光線從窗外斜射進來，課桌上擺滿了書本和筆記。年輕的喬布斯坐在教室后排，注視著老師手中的物理實驗，身邊的同學們都在認真聽課。畫面體現了喬布斯對知識的渴望，眼神專注，透出不凡的好奇心和思考的深度。”,
“caption”:?“追求知識與個人成長”},
{
“story”: “喬布斯的產品”,
“prompt”: “極簡主義風格，采用現代化的色彩搭配，注重設計感。在蘋果公司現代化的辦公室內，簡潔的玻璃桌面上擺放著第一代Macintosh，背景是白色的墻壁和大型蘋果標志。喬布斯站在桌前，手指輕觸Macintosh，身后幾位工程師在討論。畫面重點突出喬布斯與他的產品，展示出科技與設計的完美結合，喬布斯的神態自信且充滿遠見。”,
“caption”:?“通過產品改變世界”},
{
“story”: “喬布斯的成就”,
“prompt”: “超現實主義風格，帶有未來感，色彩鮮明且具有沖擊力。在龐大的蘋果公司總部前，未來風格的天空中懸浮著喬布斯的頭像，周圍環繞著iPhone、iPad、Mac等產品。喬布斯的巨大肖像與天空中的科技產品融為一體，象征著他對現代科技的深遠影響。畫面展現了一幅震撼的圖景，喬布斯的形象如同神話般屹立在現代科技的頂峰。”,
“caption”: “達到科技的巔峰”
}
]}

然后分別對這幾個信息，進行畫圖，在展示給用戶。

三、以「AI 天氣預報」為例

現在換個例子：我有一個天氣預報 AI，如果用戶問到了天氣，則進行告知。

實際上，這個 AI 并不是真的用 AI 去實時預測，而是問題，轉化成一個請求，去查詢「天氣預報數據庫」，再返回給用戶（當然，如果用戶的問題和天氣無關，則正?；卮穑?/p>

具體的流程如下：

從程序的角度，執行以下：

1）通過結構化輸出，判斷該問題和天氣有關，并拆解出 location 和 date 兩個字段分別是北京 ?和明天

2）向接口請求信息北京 ?和明天（嚴謹來說是，Beijing 和 2024-08-16）

3）AI 進行結構化返回：

{
“date”: “2024-08-16”,
“location”: “北京”,
“temperature”:
{
“high”: “32°C”,
“low”: “24°C”
},
“weather”: “雷陣雨，云層厚”,
“humidity”: “77%”,
“UV_index”: “高”,
“advice”: “攜帶雨具，注意防曬”
}

4）向接口發送類似下面的信息

client.chat.completions.create(
model=”gpt-4o”,
messages=[
{“role”: “system”, “content”: “你是天氣預報機器人，今天是 2024-08-16”},
{“role”: “user”, “content”: “””
{“date”: “2024-08-16″,”location”: “北京”,”temperature”: {“high”: “32°C”,”low”: “24°C”},”weather”: “雷陣雨，云層厚”,”humidity”: “77%”,”UV_index”: “高”,”advice”: “攜帶雨具，注意防曬”}
北京明天的天氣怎么樣？
“””}
]
)

5）接口返回的結果，會類似：

明天北京的天氣預報顯示白天的氣溫將達到最高約32°C，夜間最低溫度約為24°C。預計會有雷陣雨，全天云層較厚，濕度較高，達到77%左右。雷陣雨可能會在上午和下午出現，因此出行時建議攜帶雨具，并注意防曬，因為紫外線指數較高?？偟膩碚f，天氣悶熱，空氣濕潤，體感溫度可能會比實際溫度更高

也可以利用「結構化輸出」，對設備進行 IoT 適配。

比如，我學過電工（EE），就可以讓 Coze 變成一個家庭中控，如：

家里的數據：

封裝成 Coze Bot：

四、產業演化史

在 AI 領域，我們通常認為，結構化輸出的第一次大規模使用，是源自去年 5 月 OpenAI 的 Plugin 正式上線：AI 可以通過結構化輸出，來調用外部工具。

并且，截止到當前，OpenAI 在結構化輸出這塊，供進行了 4 次迭代，包括 Plugin 方法，Function Calling，Json Mode 和前兩天新出的 Structured Outputs。

當然了，你也可以用 markdown 等 prompt 方法來模擬結構化輸出，但不在本次的討論范圍。

Plugin 方法

在 2023 年 3 月，當時參與到 plugin 內測的朋友，會看到一份如何讓 ChatGPT 調用外部工具的文檔，也是結構化輸出的雛形。

流程就和上文一樣，ChatGPT 在獲知用戶的請求后，通過結構化輸出的方式，生成包括插件選擇在內的一個 json，插件在接受到這些參數后開始處理，并給到一個回調。之后這套東西，變成了 GPTs 的 Action。

注意：這套方法并未通過接口的方式發布

Function Calling

在 2023 年 6 月，OpenAI 帶來了 0613 年中更新，并發布了 Function Calling，也是現在看來最廣泛使用的調用方法，國內模型普遍支持。

下面，我們以一個更直觀的例子，來看看 Function Calling 的使用過程。以用戶查詢包裹為例，這個 bot 處理任務的過程中，總計分 2 步：

1）用戶向 AI 詢問【我的包裹，編號12345，寄了嗎？】的時候，其請求額外帶上字段 tools，在其中定義要獲取的信息 order_id

2）假設獲取到的信息是 order_12345 ，通過查詢數據庫，獲得包裹信息 2024-08-01

3）將這個信息，和歷史提問合并，再交給大模型，獲得最終輸出包裹在 2024-08-01 的時候已經寄出去了

如果用代碼的方式，就是：

tools = [
{
“type”: “function”,
“function”: {
“name”: “get_delivery_date”,
“description”: “Get the delivery date for a customer’s order. Call this whenever you need to know the delivery date, for example when a customer asks ‘Where is my package'”,
“parameters”: {
“type”: “object”,
“properties”: {
“order_id”: {
“type”: “string”,
“description”: “The customer’s order ID.”
}
},
“required”: [“order_id”],
“additionalProperties”: False
}
}
}
]

messages = []
messages.append({“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”})
messages.append({“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”})
messages.append({“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”})
messages.append({“role”: “user”, “content”: “i think it is order_12345”})

rsp = client.chat.completions.create(
model=’gpt-4o’,
messages=messages,
tools=tools
)

之后，AI 會返回類似：

ChatCompletion(id=’chatcmpl-9wY3ulTLZswqZLF58L0LQ0sM1EAsG’, choices=[Choice(finish_reason=’tool_calls’, index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role=’assistant’, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id=’call_W1KzfgxvkoxjCAGT3Td9oVPk’, function=Function(arguments='{“order_id”:”order_12345″}’, name=’get_delivery_date’), type=’function’)]))], created=1723740986, model=’gpt-4o-2024-05-13′, object=’chat.completion’, service_tier=None, system_fingerprint=’fp_3aa7262c27′, usage=CompletionUsage(completion_tokens=19, prompt_tokens=140, total_tokens=159))

其中 response.choices[0].message.tool_calls[0].function.arguments 的值，就是 {“order_id”:”order_12345″}

假定查詢到的結果是 2024-08-01

# Prepare the chat completion call payload
completion_payload = {
“model”: “gpt-4o”,
“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
rsp.choices[0].message,
{“role”: “tool”, “content”: “delivery_date：2024-08-01”, “tool_call_id”: rsp.choices[0].message.tool_calls[0].id},
]
}

# Call the OpenAI API’s chat completions endpoint to send the tool call result back to the model
response = client.chat.completions.create(
model=completion_payload[“model”],
messages=completion_payload[“messages”],
)

# Print the response from the API. In this case it will typically contain a message such as “The delivery date for your order #12345 is xyz. Is there anything else I can help you with?”
print(response)

最終，你會得到

ChatCompletion(id=’chatcmpl-9wYV7Yhkimzlpg3ejNkjRjI0GKqyw’, choices=[Choice(finish_reason=’stop’, index=0, logprobs=None, message=ChatCompletionMessage(content=’Your order with ID “order_12345” is scheduled to be delivered on August 1, 2024. If you have any other questions or need further assistance, feel free to ask!’, refusal=None, role=’assistant’, function_call=None, tool_calls=None))], created=1723742673, model=’gpt-4o-2024-05-13′, object=’chat.completion’, service_tier=None, system_fingerprint=’fp_3aa7262c27′, usage=CompletionUsage(completion_tokens=40, prompt_tokens=111, total_tokens=151))

也就是 ?包裹在 2024-08-01 的時候已經寄出去了

回顧一下

上面完成這個對話的時候，用戶給出了一次 prompt: i think it is order_12345，但 AI 實際上是跑了 2 次：

第一次是獲取 order id

第二次才是真正是生成內容 ?包裹在 2024-08-01 的時候已經寄出去了

同時，在第二次的對話中，結尾掛著第一次的 response 和數據庫查找結果。

在數據庫的查詢結果中，role 為 tool

還需注意

如果你在某些代碼中，看到 Function Calling 的查詢信息，不是用 tool，而是用 function，這也沒錯。

因為 OpenAI 曾經改過 Function Calling 的接口實現：最開始是 ?function 結構，后面改成了 tool 結構。對于 tool ?和 function 這兩種寫法，目前都行，但后續 OpenAI 將只支持 tool 結構

吐槽：我個人更喜歡 function 結構，更優雅

使用 tool 結構”messages”:

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
rsp.choices[0].message,
{“role”: “tool”, “content”: “delivery_date：2024-08-01”, “tool_call_id”: rsp.choices[0].message.tool_calls[0].id}]

使用 function 結構

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},

{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}]

使用 function 結構

“messages”:[
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}
]

使用 function 結構

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}]

另外：也可以兩種結構都不用

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”:?“user”,?“content”:?“i?think?it?is?order_12345. Related record is: delivery_date：2024-08-01”}]

Json Mode

在 2023 年 11 月，OpenAI 在開發者大會上，帶來了 Json Mode 更新。

仔細看上面的 Function Calling，其參數是通過 string 給到的，不夠穩定。Json Mode 便是為了解決這一問題：直接輸出 Json。

注意：這種方法仍然不夠穩定，并已被 Structured Outputs 取代

調用的時候，要求：

prompt 里出現 json 這個單詞
response_format 設置為 “type”: “json_object”

比如

completion_payload = {
‘model’: ‘gpt-3.5-turbo’,
‘messages’: [{‘role’: ‘user’, ‘content’: ‘告訴我四大名著分別是什么，以及他們的作者是誰，按這個 json 格式: {{‘書名’:’xxx’，’作者’:’xxx’}…}’}],
‘response_format’: {‘type’: ‘json_object’}
}

# Call the OpenAI API’s chat completions endpoint to send the tool call result back to the model
response = client.chat.completions.create(
model=completion_payload[“model”],
messages=completion_payload[“messages”],
)

得到 resoponse 為Chat

Completion(id=’chatcmpl-9wZ5DHWicaarxccmTBGi8MfJsa6AQ’, choices=[Choice(finish_reason=’stop’, index=0, logprobs=None, message=ChatCompletionMessage(content=”{n ? ?{‘書名’: ‘西游記’, ‘作者’: ‘吳承恩’},n ? ?{‘書名’: ‘紅樓夢’, ‘作者’: ‘曹雪芹’},n ? ?{‘書名’: ‘水滸傳’, ‘作者’: ‘施耐庵’},n ? ?{‘書名’: ‘三國演義’, ‘作者’: ‘羅貫中’}n}”, refusal=None, role=’assistant’, function_call=None, tool_calls=None))], created=1723744911, model=’gpt-3.5-turbo-0125′, object=’chat.completion’, service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=92, prompt_tokens=57, total_tokens=149))

其中，通過 response.choices[0].message.content 可去到 json 信息，如需進行后續處理，依然沿用 function calling 中的方法

Structured Outputs

較之 Function Calling 和 Json Mode，Structured OutPuts 明顯好用了很多，當前支持以下模型：gpt-4o-mini, gpt-4o-2024-08-06，當然，也包括之后的模型。

簡單調試測試一下

剛才的四大名著的例子，代碼這么寫

from pydantic import BaseModel

class theBook(BaseModel):
name: str
writer: str

class theFour(BaseModel):
steps: list[theBook]

completion = client.beta.chat.completions.parse(
model=”gpt-4o-2024-08-06″,
messages=[
{“role”: “system”, “content”: “Extract the event information.”},
{“role”: “user”, “content”: “告訴我四大名著分別是什么，以及他們的作者是誰”},
],
response_format = theFour,
)

response = completion.choices[0].message.parsed

得到的結果是

theFour
(
steps=[
theBook(name=’《紅樓夢》’, writer=’曹雪芹’),
theBook(name=’《西游記》’, writer=’吳承恩’),
theBook(name=’《三國演義》’, writer=’羅貫中’),
theBook(name=’《水滸傳》’, writer=’施耐庵’)])

非常好用！

通過這種方法，還可以完成單次對話的 CoT，比如：

from pydantic import BaseModel

class Step(BaseModel):
explanation: str
output: str

class MathReasoning(BaseModel):
steps: list[Step]
final_answer: str

completion = client.beta.chat.completions.parse(
model=”gpt-4o-2024-08-06″,
messages=[
{“role”: “system”, “content”: “You are a helpful math tutor. Guide the user through the solution step by step.”},
{“role”: “user”, “content”: “how can I solve 8x + 7 = -23”} ? ?], ? ?response_format=MathReasoning,)math_reasoning = completion.choices[0].message.parsed

得到結果

{
“steps”: [
{
“explanation”: “Start with the equation 8x + 7 = -23.”, ? ? ?“output”: “8x + 7 = -23”
},
{
“explanation”: “Subtract 7 from both sides to isolate the term with the variable.”, ? ? ?“output”: “8x = -23 – 7”
},
{
“explanation”: “Simplify the right side of the equation.”, ? ? ?“output”: “8x = -30”

},

{

“explanation”: “Divide both sides by 8 to solve for x.”, ? ? ?“output”: “x = -30 / 8”

},

{

“explanation”: “Simplify the fraction.”, ? ? ?“output”: “x = -15 / 4”

} ?],

“final_answer”: “x = -15 / 4”

}