Alibaba Develops Technology to Reduce AI Agent Token Consumption by Over 99%
Alibaba's research team has developed SkillWeaver, a framework that optimizes tool selection for AI agents handling complex tasks. By breaking tasks into fine-grained steps and retrieving only the tools each step needs, the framework reduces token consumption by over 99% compared to conventional methods while also improving accuracy.

A research team at Chinese tech giant Alibaba has developed SkillWeaver, a framework that makes AI agents considerably more efficient to run. The framework decomposes complex tasks step by step and selects only the tools each step requires. According to the research team's report, this cuts token consumption (tokens are the units of text that an AI model processes) by over 99% compared with the conventional method of loading the entire tool library into the model, while also improving accuracy.
The research responds to a problem known as "tool explosion," which accompanies the expansion of enterprise AI systems. AI agents increasingly work alongside large-scale systems that expose hundreds of tools and functions (skills), and deciding which tool to use at which step has become extremely difficult. Most current frameworks adopt a "single-selection" approach, in which the entire library is passed to the model at once and the model is asked to choose. This quickly exceeds the context limit (the amount of information the model can process at one time) and drives up costs. SkillWeaver is an attempt to address this problem directly.
At the core of SkillWeaver is the conversion of an entire task into a structure called an "execution graph," with the most suitable skill assigned to each step. The framework combines this with a technique called "Skill-Aware Decomposition (SAD)," which creates a feedback loop in which the agent repeatedly retrieves, validates, and narrows down candidate tools. For example, given an instruction such as "download a dataset, process it, and create a graph report," it automatically builds a plan that assigns API client, data processing, and visualization tools to separate steps and executes them in sequence.
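The idea can be illustrated with a small Python sketch. This is not SkillWeaver's actual code: the skill names, the comma-based task splitting, and the word-overlap scoring are all hypothetical stand-ins chosen to keep the example short. It only shows the general shape of the approach, in which a task becomes a chain of dependent steps and each step retrieves a handful of candidate skills rather than seeing the whole library.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical skill library (name -> description); not taken from the research.
SKILL_LIBRARY = {
    "http_client": "download files and call web api endpoints",
    "csv_cleaner": "process and clean tabular rows",
    "chart_maker": "create graph and chart report images",
    "email_sender": "send email messages",
    "pdf_parser": "extract text from pdf documents",
}


@dataclass
class Step:
    """One node of a (here strictly linear) execution graph."""
    description: str
    depends_on: List[int] = field(default_factory=list)
    skill: Optional[str] = None


def decompose(task: str) -> List[Step]:
    """Toy decomposition: split on commas and chain the steps in order."""
    parts = [p.strip() for p in task.split(",")]
    parts = [re.sub(r"^and\s+", "", p) for p in parts if p]
    return [Step(p, depends_on=[i - 1] if i else []) for i, p in enumerate(parts)]


def retrieve(step: Step, k: int = 2) -> List[str]:
    """Retrieve up to k candidate skills for one step by word overlap."""
    words = set(step.description.lower().split())
    scored = [
        (len(words & set(desc.split())), name)
        for name, desc in SKILL_LIBRARY.items()
    ]
    # Validate: drop skills that share no words with the step at all.
    scored = [(score, name) for score, name in scored if score > 0]
    # Narrow: best score first, ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def weave(task: str) -> List[Step]:
    """Build the plan and assign the top-ranked candidate to each step."""
    plan = decompose(task)
    for step in plan:
        candidates = retrieve(step)
        step.skill = candidates[0] if candidates else None
    return plan


plan = weave("download a dataset, process it, and create a graph report")
for index, step in enumerate(plan):
    print(index, step.description, "->", step.skill, "after", step.depends_on)
```

In a real system the decomposition and retrieval would presumably be done by a language model and an embedding index rather than string splitting and word overlap, but the control flow would be similar: only the few candidates retrieved for the current step ever need to enter the model's context.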
In experiments, the framework reduced token consumption by over 99% compared with loading all tools indiscriminately, and accuracy improved as well. The research team also points out that the bottleneck for accuracy is not the model's performance but "the granularity of task decomposition." In other words, how well a task is divided into smaller units is the key to correct tool selection.
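A rough back-of-the-envelope calculation shows why savings of this order are plausible. The Python sketch below uses invented numbers (library size, tokens per tool description, steps, and candidates per step are all assumptions, not figures from the research) and a deliberately simplified cost model that counts only the tool descriptions placed in the prompt.

```python
def tokens_single_selection(n_tools: int, tokens_per_tool: int) -> int:
    """Prompt cost when every tool description is loaded at once."""
    return n_tools * tokens_per_tool


def tokens_stepwise(n_steps: int, candidates_per_step: int, tokens_per_tool: int) -> int:
    """Prompt cost when each step only sees its retrieved candidates."""
    return n_steps * candidates_per_step * tokens_per_tool


def reduction(baseline: int, stepwise: int) -> float:
    """Fraction of baseline tokens saved by the step-wise approach."""
    return 1 - stepwise / baseline


# Illustrative assumptions only: 500 tools, 200 tokens per description,
# a 3-step task, and 2 candidate tools retrieved per step.
baseline = tokens_single_selection(n_tools=500, tokens_per_tool=200)
stepwise = tokens_stepwise(n_steps=3, candidates_per_step=2, tokens_per_tool=200)
print(baseline, stepwise, f"{reduction(baseline, stepwise):.1%}")
```

With these made-up inputs the step-wise prompt is 1,200 tokens against 100,000 for the full library, a reduction of 98.8%. The figure is illustrative rather than a reproduction of the reported result, but it shows that the saving grows with the size of the tool library while the per-step cost stays roughly constant.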
SkillWeaver is well suited to multi-tool integration platforms such as MCP (Model Context Protocol), and it is expected to be applied in scenarios where a chain of business operations, such as data acquisition, transformation, and report generation, is handled autonomously. MCP is a mechanism that lets AI agents call external tools and services in a standardized manner, and it has been gaining adoption rapidly in recent years.
For AI agents to reach practical use, optimizing tool selection is an important challenge in terms of both cost and performance. Lower token consumption translates directly into lower operating costs, which makes this a significant advance for enterprises running large-scale agent systems. SkillWeaver's design philosophy, "retrieving only what is needed when needed," has the potential to influence the future direction of agent development. Whether the research progresses toward practical application, and whether it is released externally, remain points to watch.
This article is an original work independently written and edited by the AI issue editorial team based on factual reporting. © AI issue. Unauthorized reproduction, redistribution, or use for AI training is prohibited.