60% cost reduction on every request
Code Mode shrinks tool definitions from ~3,500 tokens to ~600 tokens per request by batching operations into execute_multos_code(). Combined with prompt caching, the saving rises to as much as 72%.
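As a quick check on the token figures above, the drop from ~3,500 to ~600 tokens can be worked out directly. This is only arithmetic on the two approximate numbers quoted here; the helper name `percent_reduction` is made up for illustration. The 60% and 72% cost figures depend on pricing and caching behaviour not stated in this text, so they are not reproduced.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Approximate tool-definition sizes quoted in the text.
tokens_before = 3500
tokens_after = 600

reduction = percent_reduction(tokens_before, tokens_after)
print(f"Tool-definition tokens reduced by about {reduction:.0f}%")
```

The result rounds to 83%, which matches the tool-definition reduction quoted for Code Mode.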
Instead of exposing 22+ individual tools, Code Mode exposes 4 core tools: file_read, file_write, shell, and execute_multos_code. The LLM writes Python code that performs all of its operations in a single call, rather than issuing one tool call per operation. The result is an 83% reduction in tool-definition tokens (~3,500 down to ~600).
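The batching idea can be sketched roughly as follows. This is a toy stand-in, not the actual Code Mode implementation: the body of `execute_multos_code`, the helper signatures, and the `result` convention are all assumptions made for illustration. It only shows how one model-written script can replace several separate tool calls.

```python
import os
import tempfile


# Hypothetical helpers made available to the generated script.
# The real Code Mode API may look quite different.
def file_write(path: str, text: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def file_read(path: str) -> str:
    with open(path, encoding="utf-8") as f:
        return f.read()


def execute_multos_code(source: str):
    """Illustrative stand-in: run a model-written script in one call.

    Returns whatever the script assigns to `result` (an assumed convention).
    """
    namespace = {"file_read": file_read, "file_write": file_write}
    # No sandboxing here; a real runner would need proper isolation.
    exec(source, namespace)
    return namespace.get("result")


workdir = tempfile.mkdtemp()
notes_path = os.path.join(workdir, "notes.txt")

# One script performs a write, a read, and a computation,
# which would otherwise be three separate tool round trips.
script = f"""
file_write({notes_path!r}, "alpha\\nbeta\\ngamma\\n")
lines = file_read({notes_path!r}).splitlines()
result = len(lines)
"""

line_count = execute_multos_code(script)
print(line_count)
```

Because the model only needs the definition of one general-purpose execution tool plus a few primitives, the per-request tool schema stays small no matter how many operations the script performs.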