feat(aws): add serverless filters for DynamoDB, CloudWatch, S3 transfer, Secrets Manager#644

Closed
JamieCressey wants to merge 6 commits into rtk-ai:develop from JamieCressey:feat/aws-serverless-filters

Conversation


@JamieCressey commented Mar 16, 2026

Summary

Adds specialized RTK filters for high-frequency AWS serverless commands, plus scoped pipe-safety to prevent breaking downstream consumers.

New filters

  • DynamoDB scan/query/get-item: Recursive type flattening strips {S/N/BOOL/NULL/L/M/SS/NS/BS} wrappers. Preserves LastEvaluatedKey (pagination token) and ConsumedCapacity (RCU cost) so LLMs know when results are truncated (~40%+ savings)
  • CloudWatch Logs filter-log-events: Timestamps include date (MM-DD HH:MM:SS), logStreamName shown per event, consecutive duplicates collapsed with [xN] counts, metadata stripped (~40%+ savings)
  • CloudWatch Logs get-query-results: Compact field=value format per row, internal @ptr field filtered out
  • S3 sync/cp: Summarize upload/download/delete/copy counts from text output, preserve error and warning lines verbatim. Pass through short output (<10 lines) unchanged (~60%+ savings)
  • Secrets Manager get-secret-value: Extract Name + SecretString only, compact-print JSON secrets, strip ARN, VersionId, VersionStages, CreatedDate (~60%+ savings)
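As a rough sketch (not the PR's actual code), the recursive type flattening can be illustrated with a std-only stand-in for the parsed JSON tree:

```rust
// Hypothetical sketch of recursive DynamoDB type flattening.
// `Attr` stands in for the serde_json tree the real filter walks;
// names and the output format are illustrative only.
enum Attr {
    S(String),              // {"S": "..."} string wrapper
    N(String),              // {"N": "..."} number wrapper (the API keeps numbers as text)
    Bool(bool),             // {"BOOL": ...}
    Null,                   // {"NULL": true}
    L(Vec<Attr>),           // {"L": [...]} list of wrapped values
    M(Vec<(String, Attr)>), // {"M": {...}} map of wrapped values
}

// Strip the single-letter type wrappers, recursing into lists and maps.
fn flatten(a: &Attr) -> String {
    match a {
        Attr::S(s) => format!("\"{}\"", s),
        Attr::N(n) => n.clone(),
        Attr::Bool(b) => b.to_string(),
        Attr::Null => "null".into(),
        Attr::L(items) => {
            let inner: Vec<String> = items.iter().map(flatten).collect();
            format!("[{}]", inner.join(","))
        }
        Attr::M(entries) => {
            let inner: Vec<String> = entries
                .iter()
                .map(|(k, v)| format!("{}:{}", k, flatten(v)))
                .collect();
            format!("{{{}}}", inner.join(","))
        }
    }
}
```

The savings come from dropping the wrapper objects while keeping every value; pagination and capacity metadata are passed through separately.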

Removed

  • Lambda invoke: Removed — no meaningful token savings possible without stripping essential debugging info (LogResult contains base64-encoded execution logs, memory/duration stats). Falls through to the generic JSON schema compressor instead.

Pipe safety

  • Scoped pipe rewriting via PIPE_UNSAFE_PREFIXES: commands like aws dynamodb scan | python are not rewritten (RTK's compressed output would break json.load()), while text-based commands like git log | grep are still rewritten for savings
  • This replaces the previous global "skip all pipe rewrites" approach with targeted per-command scoping
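A minimal illustration of the PIPE_UNSAFE_PREFIXES idea (names and matching are simplified from the real registry code):

```rust
// Hypothetical sketch of scoped pipe safety. Commands whose output RTK
// rewrites into a non-original format must not be rewritten when piped,
// because downstream consumers (python, jq, ...) expect the raw format.
const PIPE_UNSAFE_PREFIXES: &[&str] = &["aws "];

fn rewrite_before_pipe(cmd: &str) -> bool {
    // Only relevant when the command actually pipes somewhere.
    if !cmd.contains('|') {
        return true;
    }
    let first_segment = cmd.split('|').next().unwrap_or("").trim_start();
    // Skip rewriting when the first segment matches an unsafe prefix.
    !PIPE_UNSAFE_PREFIXES
        .iter()
        .any(|p| first_segment.starts_with(p))
}
```

With this check, `aws dynamodb scan | python` passes through raw JSON while `git log | grep` still gets rewritten.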

All new filters are added as match arms in aws_cmd::run(), following the existing pattern for STS/S3/EC2/ECS/RDS/CloudFormation. Unmatched subcommands continue to fall through to the generic JSON schema compressor.
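The dispatch pattern can be sketched like this (illustrative arm names and return values, not the real handlers):

```rust
// Hypothetical sketch of the match-arm dispatch pattern in aws_cmd::run().
// Arm bodies and the return type are placeholders for the real filters.
fn dispatch(args: &[&str]) -> &'static str {
    match args {
        ["dynamodb", "scan" | "query" | "get-item", ..] => "dynamodb-filter",
        ["logs", "filter-log-events", ..] => "logs-filter",
        ["logs", "get-query-results", ..] => "query-results-filter",
        ["s3", "sync" | "cp", ..] => "s3-transfer-filter",
        ["secretsmanager", "get-secret-value", ..] => "secret-filter",
        // Unmatched subcommands fall through to the generic compressor.
        _ => "generic-json-compressor",
    }
}
```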

Test plan

  • 46 unit tests for AWS filters (type flattening, filter output, edge cases, empty results, token savings)
  • 158 registry tests including pipe-safety assertions for AWS
  • Token savings assertions verified per filter
  • cargo fmt --all --check passes
  • cargo clippy --all-targets — 0 errors
  • cargo test --all — 963 tests pass, 0 failures
  • Manual test: rtk aws dynamodb scan --table-name <table>
  • Manual test: rtk aws logs filter-log-events --log-group-name <group>
  • Manual test: rtk aws s3 sync <src> <dst>
  • Manual test: aws dynamodb scan ... | python passes raw JSON through (pipe-safe)

pszymkowiak and others added 3 commits March 16, 2026 14:58
* fix: P1 exit codes, grep regex perf, SQLite concurrency

Exit code propagation (same pattern as existing modules):
- wget_cmd: run() and run_stdout() now exit on failure
- container: docker_logs, kubectl_pods/services/logs now check
  status before parsing JSON (was showing "No pods found" on error)
- pnpm_cmd: replace bail!() with eprint + process::exit in
  run_list and run_install

Performance:
- grep_cmd: compile context regex once before loop instead of
  per-line in clean_line() (was N compilations per grep call)

Data integrity:
- tracking: add PRAGMA journal_mode=WAL and busy_timeout=5000
  to prevent SQLite corruption with concurrent Claude Code instances

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: address review findings on P1 fixes

- tracking: WAL pragma non-fatal (NFS/read-only compat)
- wget: forward raw stderr on failure, track raw==raw (no fake savings)
- container: remove stderr shadow in docker_logs, add empty-stderr
  guard on all 4 new exit code paths for consistency with prisma pattern

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>
… (rtk-ai#630)

* fix: raise output caps for grep, git status, and parser fallback (rtk-ai#617, rtk-ai#618, rtk-ai#620)

- grep: per-file match cap 10 → 25, global max 50 → 200
- git status: file list caps 5/5/3 → 15/15/10
- parser fallback: truncate 500 → 2000 chars across all modules

These P0 bugs caused LLM retry loops when RTK returned less signal
than the raw command, making RTK worse than not using it.

Fixes rtk-ai#617, rtk-ai#618, rtk-ai#620

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: update README example and add truncation tests for modified/untracked

- parser/README.md: update example from 500 → 2000 to match code
- git.rs: add test_format_status_modified_truncation (cap 15)
- git.rs: add test_format_status_untracked_truncation (cap 10)

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* refactor: extract output caps into [limits] config section

Move hardcoded caps into config.toml so users can tune them:

  [limits]
  grep_max_results = 200      # global grep match limit
  grep_max_per_file = 25      # per-file match limit
  status_max_files = 15       # staged/modified file list cap
  status_max_untracked = 10   # untracked file list cap
  passthrough_max_chars = 2000 # parser fallback truncation

All 8 modules now read from config::limits() instead of hardcoded
values. Defaults unchanged from previous commit.
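A plausible shape for those defaults, using the field names from the config keys above (the actual struct in the PR may differ):

```rust
// Hypothetical sketch of the [limits] defaults described above.
// Field names mirror the config.toml keys; the real config::limits()
// would overlay user values from config.toml on top of these.
struct Limits {
    grep_max_results: usize,
    grep_max_per_file: usize,
    status_max_files: usize,
    status_max_untracked: usize,
    passthrough_max_chars: usize,
}

impl Default for Limits {
    fn default() -> Self {
        Limits {
            grep_max_results: 200,
            grep_max_per_file: 25,
            status_max_files: 15,
            status_max_untracked: 10,
            passthrough_max_chars: 2000,
        }
    }
}
```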

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>
…3 transfer, Secrets Manager

Add specialized filters for high-frequency AWS CLI commands:

- DynamoDB scan/query/get-item: recursive type flattening strips {S/N/BOOL/NULL/L/M/SS/NS/BS}
  wrappers, preserving all data with ~40%+ token savings
- CloudWatch Logs filter-log-events: timestamp truncation to HH:MM:SS, message deduplication
  with [xN] counts, metadata stripping (~50%+ savings)
- CloudWatch Logs get-query-results: compact field=value format, @ptr filtering
- Lambda invoke: extract StatusCode + FunctionError only (~60%+ savings)
- S3 sync/cp: summarize upload/download/delete counts, preserve errors verbatim (~60%+ savings)
- Secrets Manager get-secret-value: extract Name + SecretString only, compact JSON (~60%+ savings)

27 new tests covering type flattening, filter output, edge cases, and token savings.
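The duplicate-collapsing described for filter-log-events can be sketched as (illustrative, not the commit's actual code):

```rust
// Hypothetical sketch: collapse consecutive duplicate log lines into a
// single line with an [xN] count, as the CloudWatch filter does.
fn collapse_duplicates(lines: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        // Count how many times this line repeats consecutively.
        let mut n = 1;
        while i + n < lines.len() && lines[i + n] == lines[i] {
            n += 1;
        }
        if n > 1 {
            out.push(format!("{} [x{}]", lines[i], n));
        } else {
            out.push(lines[i].to_string());
        }
        i += n;
    }
    out
}
```

Only consecutive repeats collapse, so interleaved messages keep their ordering.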
@JamieCressey force-pushed the feat/aws-serverless-filters branch from 767a772 to 9a2860b on March 16, 2026 22:14
When a command is piped (e.g., `aws dynamodb scan | python -c 'json.load()'`),
RTK was rewriting the first segment, causing its compressed/filtered output to
break downstream programs expecting the original format.

Now piped segments are passed through unchanged. Non-piped compound segments
(&&, ||, ;) are still rewritten normally.

Fixes: aws dynamodb scan | python JSON parse error
@JamieCressey force-pushed the feat/aws-serverless-filters branch from 61e29eb to 3dda84c on March 16, 2026 22:24
@aeppling self-assigned this Mar 17, 2026
@aeppling added the enhancement label Mar 17, 2026
@aeppling
Contributor

Hello,

Thanks for contributing; new command filters are always welcome!

There are a few things to resolve before this can be merged:

Main concern

This PR includes pipe-rewrite changes.

The second commit changes pipe-rewriting behavior in discover/registry.rs for all commands globally, not just AWS. This is not an AWS change but a core modification that could introduce a regression. Per the CONTRIBUTING.md single-focus rule, this trade-off deserves its own PR with targeted handling. The same goes for the tests for this feature, which live in registry.rs but are AWS-focused.

Filters review

DynamoDB scan/query

The filter drops LastEvaluatedKey. This is the pagination token. An LLM using this output has no way to know the scan was truncated at the DynamoDB page boundary. It could assume it has all the data.

Also drops ConsumedCapacity, which is minor but useful for cost debugging.

CloudWatch filter-log-events

Timestamps are truncated to HH:MM:SS, losing the date entirely. If searching across multiple days (common when debugging production incidents), you can't tell which day a log entry belongs to. At minimum keep the date: 01-15 10:30:00 or ISO short.

Also drops logStreamName, which matters when querying across multiple streams — you can't tell which Lambda invocation or container produced a given log line.

Lambda invoke

Only outputs Lambda: 200. Strips LogResult which contains the base64-encoded execution logs — START/END/REPORT lines, memory usage, billed duration, and all console.log/print output from the function. This is the primary debugging information when invoking a Lambda.

A developer running aws lambda invoke and getting back just a status code has lost the most useful part of the response. At minimum, decode and include the LogResult content (which is where the real token savings should come from — filter the decoded logs, not discard them).
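The suggestion above, sketched with the base64 decode elided (a real implementation would decode LogResult first, e.g. with the base64 crate):

```rust
// Hypothetical sketch: keep the high-signal lines from an already-decoded
// Lambda LogResult (the REPORT stats and the function's own output),
// dropping the START/END framing lines. Not the PR's actual code.
fn filter_lambda_logs(decoded: &str) -> String {
    decoded
        .lines()
        .filter(|l| !l.starts_with("START ") && !l.starts_with("END "))
        .collect::<Vec<_>>()
        .join("\n")
}
```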

Thanks again for contributing to RTK !

@aeppling
Contributor

I'm discussing with the maintainers whether we should keep the filters I mentioned in the review.

We are considering different levels of filtering options; for now I need to review this with the maintainers.

- DynamoDB scan/query: preserve LastEvaluatedKey (pagination token) and
  ConsumedCapacity (RCU cost) so LLMs know when results are truncated
- CloudWatch filter-log-events: include date in timestamps (MM-DD HH:MM:SS
  instead of just HH:MM:SS) and show logStreamName per event
- Lambda invoke: remove filter entirely — no real token savings possible
  without stripping essential debugging info (LogResult). Falls through
  to generic AWS JSON compressor instead.
- Remove pipe rewrite changes from this PR (registry.rs reverted) —
  core behavior change belongs in its own PR per reviewer feedback
Instead of globally rewriting all piped commands (which breaks
`aws ... | jq`) or skipping all pipe rewrites (which misses savings
on `git log | grep`), introduce a PIPE_UNSAFE_PREFIXES list.

Commands like `aws` that transform JSON into compressed text are
not rewritten before pipes, preserving downstream consumer
compatibility. Text-based commands (git, cargo, grep, etc.) are
still rewritten for token savings.
@JamieCressey JamieCressey changed the title feat(aws): add serverless filters for DynamoDB, CloudWatch, Lambda, S3 transfer, Secrets Manager feat(aws): add serverless filters for DynamoDB, CloudWatch, S3 transfer, Secrets Manager Mar 17, 2026
@aeppling
Contributor

This PR contains many unrelated commits; can you please resolve this? Once that's done, I'm OK with this feature.

You could cherry-pick your commits into another PR. In that case, please tag the new PR here; I'll close this one and accept the other if it correctly introduces those commands, as well as your PIPE_UNSAFE_PREFIXES once it's tested with other commands.

@CLAassistant

CLAassistant commented Mar 20, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ pszymkowiak
❌ JamieCressey
You have signed the CLA already but the status is still pending? Let us recheck it.

@aeppling
Contributor

Hey

We are cleaning up the codebase and improving the project structure for better onboarding. As part of this effort, PR #826 reorganizes src/ from a flat layout into subfolders.

No logic changes — only file moves and import path updates.

What you need to do

Rebase your branch on develop when receiving this comment:

git fetch origin && git rebase origin/develop

Git detects renames automatically. If you get import conflicts, update the paths:

use crate::git;        // now: use crate::cmds::git::git;
use crate::tracking;   // now: use crate::core::tracking;
use crate::config;     // now: use crate::core::config;
use crate::init;       // now: use crate::hooks::init;
use crate::gain;       // now: use crate::analytics::gain;

Need help rebasing? Tag @aeppling

jbronssin pushed a commit to jbronssin/rtk that referenced this pull request Mar 28, 2026
…filters

Inspired by rtk-ai#644 — cherry-picks the best ideas and integrates them
into the shared runner architecture:

- DynamoDB get-item: single-item unwrapping with ConsumedCapacity
- DynamoDB scan/query: now shows ConsumedCapacity (RCU) and pagination status
- DynamoDB N-type: try i64 first, then f64 (better precision)
- CloudWatch Logs get-query-results: field=value format, strips @ptr
- S3 sync/cp: text-based transfer summary (upload/download/delete counts)
- Secrets Manager get-secret-value: extracts Name + SecretString only

Total: 25 specialized AWS filters + generic fallback.
@jbronssin
Contributor

Hey @JamieCressey — really solid work here. The DynamoDB type flattening, the S3 transfer summarization, and especially the pipe safety logic are all well thought out.

I opened #885 which covers a broader AWS expansion (25 filters total), and I cherry-picked several of your ideas into it:

  • The DynamoDB i64-first-then-f64 parsing for N types — much better than just f64
  • DynamoDB get-item as a separate filter
  • S3 sync/cp text summarization
  • Secrets Manager get-secret-value
  • CloudWatch get-query-results
  • ConsumedCapacity and LastEvaluatedKey display in scan/query
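The i64-first-then-f64 point can be sketched as (illustrative, not #885's actual code):

```rust
// Hypothetical sketch: parse DynamoDB "N" values as i64 first so large
// integers keep exact precision, falling back to f64 for non-integers.
fn parse_n(raw: &str) -> String {
    if let Ok(i) = raw.parse::<i64>() {
        i.to_string()
    } else if let Ok(f) = raw.parse::<f64>() {
        f.to_string()
    } else {
        // Not numeric at all; pass through unchanged.
        raw.to_string()
    }
}
```

Going straight to f64 would silently round integers above 2^53, which is why trying i64 first matters.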

The main architectural difference is that #885 uses a shared runner (run_aws_filtered()) with tee-on-truncation (so truncated lists always have a [full output: ...] recovery path), whereas this PR uses the original per-handler boilerplate.

I think #885 supersedes this one given the overlap, but your pipe safety feature (PIPE_UNSAFE_PREFIXES) is something we don't have yet — that's a great idea that should be a follow-up PR on its own since it touches the registry, not just aws_cmd.

Thanks for the inspiration on this — credit where it's due.

@aeppling
Contributor

Hello,

We are going to use PR #885 instead, which is up to date and implements more filters.

Thanks for your contribution @JamieCressey!

@aeppling closed this Mar 28, 2026

Labels

awaiting-changes, enhancement

6 participants