If you use Claude Code a lot, you've probably run into usage limits, sometimes even in short coding sessions. But cost isn't the only problem. In long-running sessions, the context window eventually fills up, and that can cause the agent to forget earlier decisions, lose important details, or come back from compaction with gaps in its working memory. Here are three tools worth checking out if you want to reduce token usage and make longer coding sessions possible.

1. Caveman

Github repo here.

2. RTK Proxy

RTK sits in front of your coding agent and compresses CLI responses before passing them into context. This matters because tools like git diff, grep, and file reads can quietly consume large numbers of tokens. RTK says it can reduce token usage by 60–90% for common CLI commands. Github repo here.

3. context-mode

context-mode is an MCP server that sits between your agent and its tools. Like RTK, it intercepts CLI commands and reduces how much output ends up in context, but it takes a different approach: when the output is large, it keeps the raw data out of the prompt entirely, stores it in a local searchable database, and gives the agent only a short summary. The agent can then search the stored data later when needed. It also hooks into session lifecycle events so it can restore state after compaction or resets.

While RTK is mainly focused on reducing token usage from common shell and developer commands, context-mode goes further by tracking decisions, file operations, errors, and other events so the agent can recover useful context across longer-running sessions.

All three solve the same problem from different angles. Use Caveman if the waste is mostly in the assistant's wording. Use RTK if command outputs are blowing up your context. Use context-mode to improve the performance of long-running agent sessions. All three work with multiple AI coding platforms, including Claude Code, Codex, Cursor and GitHub Copilot.
Join 17K readers and level up your AWS game with just 5 mins a week.
AI agents can now scan an entire open-source codebase for exploitable vulnerabilities in hours. Frontier models carry the complete library of known bug classes in their weights. So you can simply point an AI agent at a codebase and tell it to find zero-days. This isn't theoretical. Willy Tarreau, the HAProxy lead developer, reports that security bug reports have jumped from 2–3 per week to 5–10 per day. Greg Kroah-Hartman, the Linux kernel maintainer, described what happened: "Months ago, we...
Lambda Durable Functions makes it easy to implement business workflows using plain Lambda functions. Beyond the intended use cases, they also let us implement ETL jobs without needing recursion or Step Functions. Many long-running ETL jobs have time-consuming, sequential steps that cannot be easily parallelised. For example: fetching data from shared databases/APIs with throughput limits, or when data needs to be processed sequentially. Historically, Lambda was not a good fit for these...
Step Functions is often used to poll long-running processes, e.g. when starting a new data migration task with AWS Database Migration Service. There's usually a Wait -> Poll -> Choice loop that runs until the task completes (or fails), like the one below. Polling is inefficient and can add unnecessary cost, as standard workflows are charged based on the number of state transitions. There is an event-driven alternative to this approach. Here's the high-level approach: To start the data migration,...