BMTools - A general platform for foundation model tool learning

Introduction

This article briefly records how BMTools is used and the principles behind it.

Still frustrated that you can't get access to ChatGPT Plugins? BMTools is an open-source alternative to ChatGPT Plugins: it lets you write your own extension plugins and also supports the official ChatGPT Plugins.

Body

1. Introduction to BMTools

BMTools is an open-source repository that allows language models to use extension tools. It is also a platform for the open-source community to build and share tools. In this repository, you can (1) easily build plugins by writing Python functions, and (2) use external ChatGPT-Plugins.
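To make "building a plugin by writing a Python function" concrete, here is a minimal sketch. It does not use the real BMTools API: `register_tool`, `describe_tools`, `call_tool`, and `add_numbers` are hypothetical names invented for illustration. The idea is only that a plain function plus its signature and docstring is enough to produce a description a language model could read.

```python
import inspect
from typing import Callable, Dict

# Hypothetical registry for illustration only; BMTools' real API may differ.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a plain Python function as a tool, keyed by its name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def add_numbers(a: float, b: float) -> str:
    """Add two numbers and return the sum as text."""
    return str(a + b)


def describe_tools() -> str:
    """Build a text listing of tools that could be placed in a model prompt."""
    lines = []
    for name, func in TOOLS.items():
        signature = inspect.signature(func)
        lines.append(f"{name}{signature}: {inspect.getdoc(func)}")
    return "\n".join(lines)


def call_tool(name: str, **kwargs) -> str:
    """Execute a registered tool by name with keyword arguments."""
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(describe_tools())
    print(call_tool("add_numbers", a=2, b=3))
```

In this sketch the model would see the output of `describe_tools()` and answer with a tool name and arguments, which `call_tool` then executes.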

2. Features of BMTools

ChatGPT Plugins is a beta feature introduced by OpenAI that lets ChatGPT access the internet, solve mathematical calculations, and more. It has been described as OpenAI's "App Store". However, because it currently supports only a limited number of tools and is available only to some ChatGPT Plus users, most developers still cannot use it.

BMTools is an open-source, extensible tool learning platform built on language models: the language model calls extension plugins to carry out specific functions. Similar projects include AutoGPT, BabyAGI, and AgentGPT.

BMTools is more versatile than these. It supports OpenAI's Plugins and also lets developers extend the tool library themselves. A developer only needs to write a simple Python program to build a new plugin, and can also integrate external tools from other sources, such as AutoGPT.

3. Principles of BMTools

As the features above show, BMTools is not just an automation tool that uses ChatGPT to call other APIs. It is an open platform that adapts these agent tools and plugin extensions so that they can be scheduled in a unified way.

To give more comprehensive support for real-world tasks, a foundation model needs to be able to call various specialized tools. BMTools grew out of the question of how to combine foundation models with specialized tools to build more powerful and efficient solutions.

According to the survey in the paper "Tool Learning with Foundation Models", tool learning refers to the learning process that enables models to understand and use various tools to complete tasks. From the perspective of learning objectives, existing tool learning can mainly be divided into two categories:

Tool-augmented Learning: Using the execution results of various tools to enhance the performance of the foundation model. In this paradigm, the tool execution results are treated as external resources that help generate high-quality output.
Tool-oriented Learning: Shifting the focus of the learning process from enhancing model performance to the tool execution itself. This line of research focuses on developing models that can take over tool control from humans and make sequential decisions.

The core difference between the two lies in where the learning process puts its emphasis: enhancing the foundation model through tool execution (tools serving AI), or optimizing the use of tools through the foundation model (AI serving tools). In the survey, the authors propose a general tool learning framework that unifies the two.

The general tool learning framework proposed in the paper consists of the human user and four key components: tool set, controller, perceptual system, and environment.

The Tool Set is similar to the various extension plugins around ChatGPT, for example calls to the Google search API, a weather API, or a food delivery API, though this is only part of what it covers. The Controller translates user instructions into executable commands. Because this translation can be done in different ways, there are naturally different strategies, such as those of AutoGPT or AgentGPT.
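The split between a tool set and interchangeable controller strategies can be sketched as follows. This is an illustration only, not BMTools code: the tools are stubs that return canned strings, and both controllers are simple keyword rules standing in for a language model. All names (`ToolCall`, `TOOL_SET`, `single_step_controller`, `task_list_controller`, `run`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    """An executable command: which tool to run and with what argument."""
    tool: str
    argument: str


# Hypothetical tool set: tool name -> callable taking one string argument.
TOOL_SET: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"search results for '{query}'",
    "weather": lambda city: f"weather report for '{city}'",
}


def single_step_controller(instruction: str) -> List[ToolCall]:
    """Strategy 1: map the whole instruction to exactly one tool call."""
    tool = "weather" if "weather" in instruction.lower() else "search"
    return [ToolCall(tool, instruction)]


def task_list_controller(instruction: str) -> List[ToolCall]:
    """Strategy 2: split the instruction into sub-tasks, one call per sub-task."""
    calls: List[ToolCall] = []
    for part in instruction.split(" then "):
        calls.extend(single_step_controller(part.strip()))
    return calls


def run(controller: Callable[[str], List[ToolCall]], instruction: str) -> List[str]:
    """Let the chosen controller plan, then execute each call on the tool set."""
    return [TOOL_SET[call.tool](call.argument) for call in controller(instruction)]


if __name__ == "__main__":
    instruction = "find hotels then weather in Paris"
    print(run(single_step_controller, instruction))
    print(run(task_list_controller, instruction))
```

The same tool set serves both controllers; only the translation strategy changes, which is the sense in which different agents can be plugged in as the Controller.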

The other components are not analyzed in detail here. If you only want to understand the basic principles and try some customization, these two components are enough. Together they unify the various automation tools that exist today: you choose a control strategy, such as AutoGPT or BabyAGI, as the Controller; it translates user instructions into executable commands and hands them to the Tool Set, such as a ChatGPT Plugin or another API, for execution.

Executing a tool may change the environment. The perceptual system captures these changes and feeds the information back to the controller for a new round of tool execution. Humans can also provide feedback to correct or assist the controller's decisions. After several rounds of tool execution the user's requirement is met, and the controller finally summarizes the information returned by the tools for the user.
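The feedback loop described above can be sketched with a toy example. Everything here is an assumption made for illustration: the environment is a made-up thermostat state, the tools are two functions that change it, and the controller is a fixed rule standing in for a language model. Human feedback is left out to keep the sketch short.

```python
from typing import Callable, Dict, List, Optional

Environment = Dict[str, int]


def heat(env: Environment) -> None:
    """Tool: raise the temperature by 2 degrees."""
    env["temperature"] += 2


def cool(env: Environment) -> None:
    """Tool: lower the temperature by 2 degrees."""
    env["temperature"] -= 2


# Hypothetical tool set whose tools act on the environment.
TOOL_SET: Dict[str, Callable[[Environment], None]] = {"heat": heat, "cool": cool}


def perceive(env: Environment) -> str:
    """Perceptual system: turn the environment state into feedback text."""
    return f"temperature={env['temperature']}"


def controller(target: int, feedback: str) -> Optional[str]:
    """Rule-based stand-in for a model: pick the next tool, or None when done."""
    current = int(feedback.split("=")[1])
    if current < target:
        return "heat"
    if current > target:
        return "cool"
    return None


def run_loop(target: int, env: Environment, max_rounds: int = 10) -> str:
    """Repeat perceive -> decide -> execute until the goal is met or rounds run out."""
    history: List[str] = []
    for _ in range(max_rounds):
        tool_name = controller(target, perceive(env))
        if tool_name is None:
            break
        TOOL_SET[tool_name](env)  # tool execution changes the environment
        history.append(tool_name)
    # Final step: summarize the outcome for the user.
    return f"Reached {perceive(env)} after {len(history)} tool calls"


if __name__ == "__main__":
    print(run_loop(target=22, env={"temperature": 18}))
```

The `max_rounds` cap matters: with a target the tools cannot hit exactly (an odd number here), the loop would otherwise keep alternating between `heat` and `cool`.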

4. Conclusion

Setting up BMTools is not difficult. The official website also provides trial demos and a range of extension plugins, and the project has been adapted to AutoGPT and BabyAGI. How well it performs in testing depends on how rich the tool set is.

For now, this project is an experimental tool proposed alongside a research survey. Its advantages are that you don't have to wait for your OpenAI Plugins application to be approved: it supports Plugins directly, supports custom plugins, and can use control strategies such as AutoGPT's. It is still worth looking forward to.

The disadvantage of this kind of unified automation agent is that token usage may explode 😅.
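One plausible reason for the token growth, assuming each round resends the full accumulated context (instruction, tool descriptions, and all earlier tool results) to the model, is that total prompt tokens grow roughly quadratically with the number of rounds. The sketch below is simple arithmetic with made-up numbers, not a measurement of BMTools.

```python
def total_prompt_tokens(rounds: int, base_prompt_tokens: int, tokens_added_per_round: int) -> int:
    """Sum the prompt size over all rounds when every round resends the whole context.

    Assumption: the context starts at base_prompt_tokens and grows by
    tokens_added_per_round after each tool call.
    """
    total = 0
    context = base_prompt_tokens
    for _ in range(rounds):
        total += context                   # this round's prompt is the full context
        context += tokens_added_per_round  # the tool result is appended for next round
    return total


if __name__ == "__main__":
    # Hypothetical sizes: 500-token base prompt, 300 tokens added per round.
    for rounds in (1, 5, 10):
        print(rounds, total_prompt_tokens(rounds, 500, 300))
```

Under these assumed sizes, ten rounds cost 18,500 prompt tokens in total, compared with 500 for a single call.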

Finally

Reference articles:

Official Project

Summary of

Disclaimer

This article is a personal study record only.

This article is synchronized with hblog.