A Brief Look at How AutoGPT Works

Preface

This post briefly introduces what Auto-GPT is and how it works. It does not cover installation, usage, or how to obtain an API key.

Main Text

1. What Is Auto-GPT

The official project describes it as follows: Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains together LLM "thoughts", to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.

In plain terms: it runs on its own until it reaches the goal the user has set.

2. Features of Auto-GPT

The official feature list speaks for itself:

🌐 Internet access for searches and information gathering

💾 Long-term and short-term memory management

🧠 GPT-4 instances for text generation

🔗 Access to popular websites and platforms

🗃️ File storage and summarization with GPT-3.5

🔌 Extensibility with Plugins

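3. How Auto-GPT Works

Before explaining the principle, it helps to look at what you have to configure when using Auto-GPT. At startup, the program asks the user to set three variables: the bot's name (ai_name), the bot's role (ai_role), and the goals it should accomplish (ai_goals). The program then runs autonomously, driven by these three variables, to achieve ai_goals.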
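To make the three settings concrete, here is a minimal sketch of how they might be bundled and turned into an opening prompt. The `AIConfig` class, its `to_prompt` method, and the prompt wording are my own assumptions for illustration, not code taken from the project.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AIConfig:
    """Hypothetical container for the three user-supplied settings."""

    ai_name: str
    ai_role: str
    ai_goals: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        # Number the goals so the model can refer to each one individually.
        goals = "\n".join(
            f"{i}. {goal}" for i, goal in enumerate(self.ai_goals, start=1)
        )
        return f"You are {self.ai_name}, {self.ai_role}\n\nGOALS:\n\n{goals}"


config = AIConfig(
    ai_name="Researcher-GPT",
    ai_role="an AI that researches a topic and writes a summary.",
    ai_goals=["Search the web for the topic", "Save a summary to a file"],
)
print(config.to_prompt())
```

The point is only that name, role, and goals end up as plain text at the top of the conversation; everything the model does afterwards is steered by that text.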

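So how does it run autonomously? Through prompts. Logically, the flow should look like this:

The user supplies a goal and asks GPT-4 to break it down, which yields several sub-goals plus a next step. The sub-goals are saved, the next step is handed back to GPT-4, and the process repeats. This both decomposes the goal and produces an execution plan, giving an automatic, continuous execution loop.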
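The loop described above can be sketched roughly as follows. This is a simplified illustration, not Auto-GPT's actual control flow: `run_agent`, the reply dictionary with `subgoals` and `next_step` keys, and the offline `fake_model` stand-in for GPT-4 are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical model interface: takes a prompt, returns sub-goals and a next step.
Planner = Callable[[str], Dict[str, Any]]


def run_agent(goal: str, ask_model: Planner, max_steps: int = 5) -> List[str]:
    """Keep asking the model for the next step until it reports none."""
    saved_subgoals: List[str] = []
    executed: List[str] = []
    prompt = goal
    for _ in range(max_steps):  # hard cap so the loop always terminates
        reply = ask_model(prompt)
        # Save any sub-goals we have not seen yet.
        for sub in reply["subgoals"]:
            if sub not in saved_subgoals:
                saved_subgoals.append(sub)
        next_step = reply["next_step"]
        if next_step is None:
            break
        executed.append(next_step)
        # Feed the progress back so the model can pick the following step.
        prompt = f"Goal: {goal}\nDone so far: {executed}\nWhat is the next step?"
    return executed


def fake_model(prompt: str) -> Dict[str, Any]:
    """Stand-in for GPT-4 so the sketch runs offline."""
    plan = ["search the web", "summarize findings", "write report"]
    done = sum(step in prompt for step in plan)
    return {"subgoals": plan, "next_step": plan[done] if done < len(plan) else None}


print(run_agent("Write a report on Auto-GPT", fake_model))
```

The `max_steps` cap is a deliberate choice in this sketch: an autonomous loop that calls a paid API should always have an upper bound on iterations.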

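Those are the logical steps anyone could come up with, but there are many details beyond them, and Auto-GPT handles those details very well. The core code, meaning the prompt-related code, lives in promptgenerator.py and prompt.py under the project's autogpt directory.

For example, it first gives GPT-4 a set of rules that define the response format, the available commands, the usable resources, and other constraints: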

# Add constraints to the PromptGenerator object
prompt_generator.add_constraint(
    "~4000 word limit for short term memory. Your short term memory is short, so"
    " immediately save important information to files."
)
prompt_generator.add_constraint(
    "If you are unsure how you previously did something or want to recall past"
    " events, thinking about similar events will help you remember."
)
prompt_generator.add_constraint("No user assistance")
prompt_generator.add_constraint(
    'Exclusively use the commands listed in double quotes e.g. "command name"'
)
prompt_generator.add_constraint(
    "Use subprocesses for commands that will not terminate within a few minutes"
)
# Define the command list
...
# Add resources to the PromptGenerator object
prompt_generator.add_resource(
    "Internet access for searches and information gathering."
)
prompt_generator.add_resource("Long Term memory management.")
prompt_generator.add_resource(
    "GPT-3.5 powered Agents for delegation of simple tasks."
)
prompt_generator.add_resource("File output.")

# Add performance evaluations to the PromptGenerator object
prompt_generator.add_performance_evaluation(
    "Continuously review and analyze your actions to ensure you are performing to"
    " the best of your abilities."
)
prompt_generator.add_performance_evaluation(
    "Constructively self-criticize your big-picture behavior constantly."
)
prompt_generator.add_performance_evaluation(
    "Reflect on past decisions and strategies to refine your approach."
)
prompt_generator.add_performance_evaluation(
    "Every command has a cost, so be smart and efficient. Aim to complete tasks in"
    " the least number of steps."
)

Only with a prompt like this can GPT-4 produce output that meets the requirements.
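To show how calls like the ones above could end up as a single block of prompt text, here is a simplified stand-in. `SimplePromptGenerator` and its section layout are my own assumptions; the method names mirror the excerpt, but this is not the project's actual PromptGenerator implementation.

```python
from typing import List


class SimplePromptGenerator:
    """Simplified stand-in; not the project's actual PromptGenerator."""

    def __init__(self) -> None:
        self.constraints: List[str] = []
        self.resources: List[str] = []
        self.performance_evaluation: List[str] = []

    def add_constraint(self, text: str) -> None:
        self.constraints.append(text)

    def add_resource(self, text: str) -> None:
        self.resources.append(text)

    def add_performance_evaluation(self, text: str) -> None:
        self.performance_evaluation.append(text)

    @staticmethod
    def _numbered(items: List[str]) -> str:
        # Render items as "1. ...", "2. ..." lines.
        return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))

    def generate_prompt_string(self) -> str:
        sections = [
            ("Constraints", self.constraints),
            ("Resources", self.resources),
            ("Performance Evaluation", self.performance_evaluation),
        ]
        # One titled, numbered section per category, separated by blank lines.
        return "\n\n".join(
            f"{title}:\n{self._numbered(items)}" for title, items in sections
        )


gen = SimplePromptGenerator()
gen.add_constraint("No user assistance")
gen.add_resource("File output.")
gen.add_performance_evaluation(
    "Reflect on past decisions and strategies to refine your approach."
)
prompt = gen.generate_prompt_string()
print(prompt)
```

Collecting the rules in lists and rendering them at the end keeps each rule a one-line change, which is presumably why the excerpt is written as a series of small `add_*` calls.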

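4. GUI and Chinese Localization

Auto-GPT is a terminal program. Ordinary users may prefer to get things done by clicking around a GUI, and may also need a Chinese version.

Related projects already cover both. The GUI project, for example, only needs a Node.js environment and a Python environment.

Note that this GUI project is no longer actively maintained and still targets Auto-GPT v1, while Auto-GPT is now at v2, and the two use different launch commands. The relevant code is in useAutoGPTStarter.ts under the project directory apps\frontend\src\hooks: change the command variable in the useAutoGPTStarter function. For Auto-GPT v1 the launch command is python scripts/main.py; for v2 it is python -m autogpt.

For a Chinese version of Auto-GPT, there is a localized project that stays in sync with the official one.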

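5. Summary

Building on AutoGPT, a lot of interesting and useful work becomes possible. Here are some prospects:

1. More efficient natural language processing: AutoGPT can improve the efficiency and performance of NLP tasks through automated model selection and hyperparameter optimization.

2. Automatic text generation: AutoGPT can generate text automatically, including articles, summaries, and dialogue, giving people more text resources.

3. Automatic code generation: AutoGPT can learn from existing codebases and generate code automatically, improving software development efficiency.

4. Automatic speech recognition and generation: AutoGPT can be used for speech recognition and generation, producing speech content by learning speech patterns and features.

5. Automatic machine translation: AutoGPT can be used for machine translation, translating text by learning the mappings between languages.

The list above is the outlook ChatGPT gave me 😅. As I see it, Auto-GPT is still at an early stage: its execution flow is not yet mature and its ability to access the internet is still limited, which is why the developers have added plugin support (see the project). As development iterates, I suspect it won't be long before Iron Man's JARVIS is no longer just comic-book fiction 👀.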

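One last thought: ChatGPT is a milestone, not because it is unbeatable, but because it has kicked off a wave of social change comparable to the rise of the internet, one that will permanently change how people live.

It reminds me of a TV interview Bill Gates gave at the end of the last century. The situation then was much like now: things were at an early stage and people had not yet worked out the application areas, but early applications gradually emerged. As the saying goes, "people always overestimate what will happen next year and underestimate what will happen in ten years." Ten years was the rule of thumb back then; these days it should probably be five.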

Final Notes

Disclaimer

This post is only a personal study record.

This post is kept in sync with HBlog.