
Why 'Programmatic Tool Calling' is the Future.

Anthropic's new 'Programmatic Tool Calling' allows AI to write code to filter data. The result: cleaner context, fewer tokens, and improved performance.

Erik van de Blaak

If you have been following the developments around Sonnet 4.5, you have probably heard about its speed and intelligence. But quietly, Anthropic has rolled out a feature that is far more important for developers: Programmatic Tool Calling (also known as code execution). According to experts, it is going to fundamentally change the way we build AI agents.

Why is this such big news? Because it addresses one of the biggest annoyances in AI development: "Context Pollution".

The problem: Your context gets cluttered

First, let's look at how agents traditionally work, especially in conjunction with protocols like MCP (Model Context Protocol). The process often looks like this:

  • The agent loads all definitions of available tools into its memory.
  • The agent makes a call (a JSON structure).
  • The tool returns data.
  • Everything (the call, the data, the intermediate steps) gets stored in memory (the context window).

As explained in the analysis from the channel Prompt Engineering, this causes your context window to fill up rapidly with "junk". You pay for tokens you don't actually need, and the model gets confused by all the noise.

The solution: Code is king

The idea behind Programmatic Tool Calling is simple yet brilliant: let the model write code instead of spitting out JSON structures.

LLMs are trained on billions of lines of code. Writing a small script comes far more naturally to them than filling in a rigid JSON schema. In the new workflow, the following happens:

  1. The agent writes a piece of code to call one or more tools.
  2. This code is executed in a secure sandbox environment.
  3. The agent can filter, sort, and process the data in that sandbox.
  4. Only the final result is sent back to the agent's context.
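The four steps above can be sketched in a few lines of Python. Here `search_orders` is a stand-in for a real tool binding, not Anthropic's actual API; the key point is that the raw data never leaves the sandbox:

```python
# Hypothetical stand-in for a tool the sandbox exposes to the agent's code.
def search_orders(status):
    # Pretend this hits a real backend and returns 500 raw records.
    return [{"id": i, "status": status, "total": 19.99} for i in range(500)]

# --- the code the agent writes, executed inside the sandbox ---
orders = search_orders("open")                  # raw data stays in the sandbox
total = sum(o["total"] for o in orders)         # filter/aggregate locally
result = f"{len(orders)} open orders, total ${total:.2f}"
# --------------------------------------------------------------

# Only this short string re-enters the agent's context.
print(result)
```

Instead of 500 records, the context receives one sentence.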

The result? A cleaner context and, according to data from Cloudflare (which calls this principle 'Code Mode'), a token savings of 30% to as much as 80%.

Practical example: Dynamic Filtering

Anthropic is already applying this with its new "Dynamic Filtering" for web searches. Previously, a model dumped all search results into the context. Now, Sonnet 4.5 writes code that analyzes and filters the search results before they ever reach the context.
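A toy illustration of the idea, with invented search results (this is not Anthropic's actual implementation, just the filtering pattern the model's generated code could follow):

```python
# Hypothetical raw search results, as a tool might return them.
results = [
    {"title": "Programmatic tool calling explained", "snippet": "..."},
    {"title": "Celebrity gossip roundup", "snippet": "..."},
    {"title": "Tool calling benchmarks", "snippet": "..."},
]
query_terms = {"tool", "calling"}

# Generated code prunes irrelevant hits before anything enters the context.
filtered = [
    r for r in results
    if query_terms & set(r["title"].lower().split())
]
print([r["title"] for r in filtered])
```

Only the relevant hits survive; the gossip roundup never costs a single context token.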

Benchmarks (such as BrowseComp) show that this leads to a performance improvement of over 13% for Sonnet and as much as 16% for Opus. The model becomes smarter because it no longer drowns in irrelevant data.

Note: Is it always cheaper?

There is a catch. While you save significantly on input tokens (since you no longer load raw data), the model now has to write code itself. That costs output tokens.

In the case of the more powerful Opus model, researchers found that total costs sometimes increased: Opus is so thorough that it wrote large amounts of code to filter the data. Sonnet 4.5, by contrast, proved more efficient and delivered a net cost saving.
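A back-of-the-envelope illustration of that trade-off, with made-up token counts and prices (the only real assumption is that output tokens cost more than input tokens, as is typical):

```python
# Illustrative prices in $/token -- assumed, not Anthropic's actual rates.
INPUT_PRICE = 3.00 / 1_000_000
OUTPUT_PRICE = 15.00 / 1_000_000  # output tokens are priced higher

# Traditional: 50k tokens of raw tool output flow through the context.
traditional = 50_000 * INPUT_PRICE

# Programmatic: a short summary enters the context, but the generated
# script costs extra output tokens.
programmatic = 2_000 * INPUT_PRICE + 600 * OUTPUT_PRICE

# A verbose code-writer (the Opus scenario) can flip the comparison.
verbose = 2_000 * INPUT_PRICE + 12_000 * OUTPUT_PRICE

print(f"traditional: ${traditional:.3f}, "
      f"programmatic: ${programmatic:.3f}, verbose: ${verbose:.3f}")
```

With these numbers, a concise script wins comfortably, but a model that writes 12,000 tokens of filtering code ends up costing more than just dumping the raw data.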

Conclusion

Programmatic Tool Calling seems to be becoming the new industry standard. Besides Anthropic, we see that both Google (Gemini 2.0) and OpenAI are also taking steps in this direction. It's a logical evolution: give the AI the space to code, and you'll get back faster, smarter, and often cheaper agents.

Source citation: The insights and figures in this article are based in part on the video "Anthropic Just Killed Tool Calling" from the YouTube channel Prompt Engineering. Check out their full analysis for a technical deep dive.
