From 54251de3f8c7d7b9743238072c54aec9fb15289c Mon Sep 17 00:00:00 2001
From: Ignacio Van Droogenbroeck
Date: Sun, 31 May 2026 20:25:35 -0300
Subject: [PATCH] docs(cli): add arcctl section (PR1+PR2 surface)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Companion docs for arcctl PR1 (config) and PR2 (query, write, output
formats) landing in basekick-labs/arcctl.

New section under `docs/cli/` (sidebar position 11) with:

- index.md — what arcctl is, why, status matrix, install, Arc
  compatibility
- connections.md — `arcctl config {create,list,set-active,delete,
  current}` with full precedence table and TLS guidance
- query.md — `arcctl query` with table/json/csv/arrow output, SQL
  input precedence, timeouts, exit codes
- write.md — `arcctl write` line-protocol ingestion with stdin/file
  input, precision validation, common patterns

Implements the per-PR docs rule (every code-shipping PR ships matching
documentation alongside, not deferred).

Build verified locally with `npm run build` — no new broken links or
anchors introduced.
---
 docs/cli/_category_.json |   8 ++
 docs/cli/connections.md  | 131 ++++++++++++++++++++++++++++++
 docs/cli/index.md        |  61 ++++++++++++++
 docs/cli/query.md        | 167 +++++++++++++++++++++++++++++++++++++++
 docs/cli/write.md        | 142 +++++++++++++++++++++++++++++++++
 5 files changed, 509 insertions(+)
 create mode 100644 docs/cli/_category_.json
 create mode 100644 docs/cli/connections.md
 create mode 100644 docs/cli/index.md
 create mode 100644 docs/cli/query.md
 create mode 100644 docs/cli/write.md

diff --git a/docs/cli/_category_.json b/docs/cli/_category_.json
new file mode 100644
index 0000000..ba024bc
--- /dev/null
+++ b/docs/cli/_category_.json
@@ -0,0 +1,8 @@
+{
+  "label": "CLI (arcctl)",
+  "position": 11,
+  "link": {
+    "type": "generated-index",
+    "description": "arcctl is the operator CLI for Arc — query, write, and admin operations without curl."
+  }
+}
diff --git a/docs/cli/connections.md b/docs/cli/connections.md
new file mode 100644
index 0000000..74b5018
--- /dev/null
+++ b/docs/cli/connections.md
@@ -0,0 +1,131 @@
+---
+sidebar_position: 2
+---
+
+# Connection Management
+
+`arcctl` stores connection profiles in `~/.arcctl/config.toml` (mode 0600 — plaintext tokens, same posture as `~/.aws/credentials`). One profile is marked active and used by default; you can override per-command via flags or env vars.
+
+The model is deliberately the same as the InfluxDB v2 CLI's `influx config`, so operators coming from InfluxDB get the same UX.
+
+## Adding a connection
+
+```bash
+arcctl config create \
+  --name local \
+  --endpoint http://localhost:8000 \
+  --token YOUR-TOKEN
+```
+
+Flags:
+
+| Flag | Required | Description |
+|---|---|---|
+| `--name` | yes | Profile name (used by `--connection` and `set-active`) |
+| `--endpoint` | yes | Arc HTTP base URL — no trailing slash |
+| `--token` | yes | Bearer token from Arc's first-run banner |
+| `--default-database` | no | Default database for query/write commands |
+| `--insecure` | no | Skip TLS verification for this connection |
+| `--activate` | no | Make this the active connection |
+
+The first connection you create is auto-activated (saves you one command on first run). Subsequent ones require `--activate` to take over.
+
+## Switching active connection
+
+```bash
+arcctl config set-active prod
+```
+
+Errors cleanly if the named connection does not exist.
+
+## Listing connections
+
+```bash
+$ arcctl config list
+┌────────┬─────────┬─────────────────────────────┬─────────┬────────────┐
+│ ACTIVE │ NAME    │ ENDPOINT                    │ TOKEN   │ DEFAULT_DB │
+├────────┼─────────┼─────────────────────────────┼─────────┼────────────┤
+│ *      │ prod    │ https://arc.prod.example…   │ abc…xyz │ metrics    │
+│        │ local   │ http://localhost:8000       │ dev…123 │ -          │
+└────────┴─────────┴─────────────────────────────┴─────────┴────────────┘
+```
+
+Tokens are redacted to `first4...last4`; tokens shorter than 12 chars are fully replaced with `*`.
+
+## Inspecting the active connection
+
+```bash
+$ arcctl config current
+name: prod
+endpoint: https://arc.prod.example.com
+token: abc…xyz
+default_database: metrics
+```
+
+## Removing a connection
+
+```bash
+arcctl config delete staging
+# Delete connection "staging"? [y/N] y
+# Deleted connection "staging"
+```
+
+Pass `--yes` (or `-y`) to skip the confirmation prompt. If you delete the currently-active profile, the active pointer is cleared so the next command produces a clear "no active connection" error rather than silently falling back to an unrelated profile.
+
+## Per-command overrides
+
+Every command (`query`, `write`, future `db`, `import`, etc.) accepts the same connection overrides:
+
+```bash
+# Use a named profile other than the active one
+arcctl --connection prod query "SELECT count(*) FROM cpu"
+
+# Full ad-hoc — both flags must be set together
+arcctl query --endpoint https://arc.x.example.com --token YOUR-TOKEN "SELECT 1"
+
+# Env var, named profile lookup
+ARC_CONNECTION=prod arcctl query "SELECT 1"
+
+# Env var, full ad-hoc — CI-friendly, no config file needed
+ARC_ENDPOINT=https://arc.x.example.com ARC_TOKEN=YOUR-TOKEN arcctl query "SELECT 1"
+```
+
+## Precedence
+
+When `arcctl` needs to know which connection to use, it checks these sources in order and uses the first that matches:
+
+1. `--connection NAME` flag
+2. `--endpoint URL --token TOKEN` flags (both required together)
+3. `ARC_CONNECTION` env var
+4. `ARC_ENDPOINT` + `ARC_TOKEN` env vars (both required together)
+5. The `active` connection in `~/.arcctl/config.toml`
+
+If none of those is set, the command exits with a clear "no active connection" error rather than guessing.
+
+## Config file location
+
+By default `~/.arcctl/config.toml`. Override via `ARCCTL_CONFIG` env var — useful for tests, CI, or per-environment isolation:
+
+```bash
+ARCCTL_CONFIG=/etc/arcctl/prod.toml arcctl query "SELECT 1"
+```
+
+The file is written atomically (write to temp + rename) so a crash mid-`config create` never leaves a half-written file.
+
+## TLS
+
+For HTTPS endpoints, certificate verification is on by default. To skip verification (lab or self-signed certs only) use either:
+
+- `--insecure` on a single command, or
+- `insecure_tls = true` in the connection profile (set once via `arcctl config create --insecure`)
+
+When verification is skipped, a `WARNING:` line is printed to stderr. The flag is a no-op on `http://` endpoints and the warning is suppressed there.
+
+**Never disable TLS verification against a production endpoint.** It exposes the bearer token to any on-path attacker.
+
+## Security notes
+
+- The config file is mode 0600 (owner read/write only). The parent directory `~/.arcctl/` is mode 0700.
+- Tokens are stored plaintext. Same posture as `~/.aws/credentials`.
+- `arcctl` never logs the token. Help text, error messages, `config list`, and `config current` all use redaction.
+- `arcctl` does not phone home. No telemetry. No update checks.
diff --git a/docs/cli/index.md b/docs/cli/index.md
new file mode 100644
index 0000000..c430a2f
--- /dev/null
+++ b/docs/cli/index.md
@@ -0,0 +1,61 @@
+---
+sidebar_position: 1
+---
+
+# arcctl
+
+`arcctl` is the operator-facing CLI for Arc. It replaces hand-crafted `curl` calls with a familiar workflow modeled on `influx`, `kubectl`, and `clickhouse-client`.
+
+```bash
+# Add a connection profile
+arcctl config create --name local --endpoint http://localhost:8000 --token YOUR-TOKEN
+
+# Run a query
+arcctl query "SELECT count(*) FROM cpu"
+
+# Write line protocol
+echo "cpu,host=server-1 value=42.5 $(date +%s)000000000" | arcctl write
+```
+
+## Why arcctl
+
+Operating Arc without arcctl means:
+
+- Reading the bootstrap token from a stderr banner once, then copying it into every `curl` (or losing it and forcing a restart with `ARC_AUTH_FORCE_BOOTSTRAP=true`).
+- Building JSON query bodies by hand, setting `Authorization: Bearer`, remembering the `x-arc-database` header.
+- Decoding `{"columns":[...],"data":[...]}` responses by eye.
+- Juggling endpoints and tokens across dev / staging / prod via shell-var swaps.
+
+`arcctl` handles all of that and adds named connection profiles, multiple output formats (table, JSON, CSV, Arrow IPC), file/stdin input for both query and write, and consistent error messages.
+
+## Status
+
+| Version | Surface |
+|---|---|
+| v0.1.0 (PR1) | `config` subcommand tree, multi-connection store at `~/.arcctl/config.toml` |
+| v0.2.0 (PR2) | `query`, `write` — table / JSON / CSV / Arrow IPC output, stdin / file input |
+| v0.3.0+ | `db`, `import`, `auth`, `cluster` subcommands (in development) |
+| v1.0.0 | release workflow + Homebrew tap + multi-arch Docker |
+
+## Compatibility
+
+`arcctl` 0.x and 1.x talk to Arc 26.06 or newer. Arc < 26.06 lacks the Phase A cluster auth replication that makes token admin behave consistently across nodes; an older Arc server may work for `query`/`write` but is not supported.
+
+## Installation
+
+Pre-built binaries land in v1.0. For now, build from source:
+
+```bash
+git clone https://github.com/Basekick-Labs/arcctl
+cd arcctl
+go build -o arcctl ./cmd/arcctl
+./arcctl --version
+```
+
+Requires Go 1.25+.
+
+## Next
+
+- [Connection management](/arc/cli/connections) — adding, switching, and overriding connection profiles
+- [Querying](/arc/cli/query) — running SQL with table / JSON / CSV / Arrow output
+- [Writing line protocol](/arc/cli/write) — stdin and file ingestion with precision control
diff --git a/docs/cli/query.md b/docs/cli/query.md
new file mode 100644
index 0000000..600be20
--- /dev/null
+++ b/docs/cli/query.md
@@ -0,0 +1,167 @@
+---
+sidebar_position: 3
+---
+
+# Querying
+
+`arcctl query` runs SQL against an Arc cluster and renders the result in your chosen format. Defaults are operator-friendly: pretty table on stdout, errors on stderr, exit 0 on success, exit 1 on any failure.
+
+## Quick reference
+
+```bash
+# Pretty table (default)
+arcctl query "SELECT host, value FROM cpu ORDER BY value LIMIT 10"
+
+# Different database for one call
+arcctl query --database metrics "SELECT count(*) FROM cpu"
+
+# SQL from a file
+arcctl query -f reports/p99.sql
+
+# SQL from stdin
+echo "SELECT 1" | arcctl query
+```
+
+## SQL input
+
+`arcctl query` accepts SQL three ways, in this precedence:
+
+1. **Positional argument** — `arcctl query "SELECT 1"`
+2. **`-f file.sql` flag** — `arcctl query -f reports/p99.sql`
+3. **Stdin** — used when neither arg nor `-f` is given and stdin is a pipe (not a TTY)
+
+If you run `arcctl query` interactively with no arguments, it exits immediately with a clear error rather than hanging waiting for stdin.
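The SQL input precedence described for `arcctl query` can be sketched in Python as a small resolver. This is an illustration of the documented order only, not arcctl's implementation: the function name `resolve_sql`, its parameters, and the error text are assumptions made for the example.

```python
import io
from typing import Optional, TextIO


def resolve_sql(positional: Optional[str], file_path: Optional[str], stdin: TextIO) -> str:
    """Return SQL text using the documented order: argument, -f file, piped stdin.

    Hypothetical helper for illustration; arcctl's real code is not shown here.
    """
    if positional:
        return positional
    if file_path:
        with open(file_path, encoding="utf-8") as fh:
            return fh.read()
    # Only read stdin when it is a pipe or redirect, never an interactive TTY.
    if not stdin.isatty():
        return stdin.read()
    # Fail fast instead of blocking on a terminal (error text is an assumption).
    raise ValueError("no SQL given: pass an argument, -f FILE, or pipe SQL on stdin")


piped = io.StringIO("SELECT 2")  # StringIO.isatty() is False, like a pipe
print(resolve_sql("SELECT 1", None, piped))  # the positional argument wins
print(resolve_sql(None, None, piped))        # falls back to piped stdin
```

Checking `isatty()` before reading is what lets an interactive invocation fail immediately rather than wait for input that will never arrive.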
+
+## Output formats
+
+Pass `-o` (`--output`) to switch:
+
+| Format | When to use |
+|---|---|
+| `table` (default) | Interactive use; pretty-printed with column headers |
+| `json` | Pipe to `jq`, save as `.json`, parse from another script |
+| `csv` | Save to a spreadsheet, load into pandas/R, RFC 4180 with header |
+| `arrow` | Stream Arrow IPC bytes; pipe to pyarrow / duckdb / polars for analytical post-processing |
+
+### Table
+
+```bash
+$ arcctl query "SELECT host, value FROM cpu ORDER BY value"
+┌──────────┬───────┐
+│ HOST     │ VALUE │
+├──────────┼───────┤
+│ server-1 │ 42.5  │
+│ server-2 │ 43.2  │
+│ server-3 │ 44.1  │
+└──────────┴───────┘
+```
+
+Modifiers:
+
+- `--no-header` — drop the column header row
+- `--limit N` — cap output rows client-side (the server still computes the full result; use `LIMIT` in your SQL if you want to bound server work)
+
+Empty result (e.g. measurement that has never been written): prints `(0 rows)` instead of nothing, so you know the query ran.
+
+### JSON
+
+```bash
+$ arcctl query "SELECT host, value FROM cpu LIMIT 2" -o json
+{
+  "columns": ["host", "value"],
+  "data": [
+    ["server-1", 42.5],
+    ["server-2", 43.2]
+  ],
+  "row_count": 2,
+  "execution_time_ms": 1
+}
+```
+
+The shape is row-major (`data[i]` is row i). This is the raw Arc JSON query response, indented for readability — pipe to `jq` for one-liner transformations:
+
+```bash
+arcctl query "SELECT * FROM cpu" -o json | jq '.data[] | {host: .[0], val: .[1]}'
+```
+
+### CSV
+
+```bash
+$ arcctl query "SELECT host, value FROM cpu ORDER BY value" -o csv
+host,value
+server-1,42.5
+server-2,43.2
+server-3,44.1
+```
+
+RFC 4180 with a header row by default; `--no-header` drops it. Cell types are stringified — `true`/`false` for bools, `null` cells render as empty fields, integers print without a decimal tail, floats use compact `strconv` formatting.
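The row-major JSON shape and the CSV stringification rules above can be tied together with a short Python sketch that converts a saved `-o json` response into CSV. The helper names `stringify_cell` and `response_to_csv` are hypothetical, and the float handling (Python's `repr`, with whole-number floats printed as integers) is an assumption that only approximates the compact formatting the docs describe.

```python
import csv
import io
import json


def stringify_cell(value) -> str:
    """Render one cell following the documented rules (approximation, see lead-in)."""
    if value is None:
        return ""                      # null cells become empty fields
    if isinstance(value, bool):        # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, float):
        # Assumption: whole-number floats print without a decimal tail.
        return str(int(value)) if value.is_integer() else repr(value)
    return str(value)


def response_to_csv(response: dict, header: bool = True) -> str:
    """Turn {"columns": [...], "data": [[...], ...]} into CSV text."""
    buf = io.StringIO()
    # RFC 4180 specifies CRLF line endings; "\n" is used here for easy printing.
    writer = csv.writer(buf, lineterminator="\n")
    if header:
        writer.writerow(response["columns"])
    for row in response["data"]:
        writer.writerow([stringify_cell(cell) for cell in row])
    return buf.getvalue()


raw = '{"columns": ["host", "value"], "data": [["server-1", 42.5], ["server-2", 43.2]]}'
print(response_to_csv(json.loads(raw)), end="")
```

Passing `header=False` mirrors what `--no-header` does for the built-in CSV output.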
+
+### Arrow IPC
+
+```bash
+arcctl query "SELECT * FROM cpu" -o arrow > out.arrow
+# arrow: 4096 bytes, server execution 12ms (stderr)
+```
+
+The Arrow IPC stream goes to stdout; the byte count and server-side execution time go to stderr. Stream the result into pyarrow, DuckDB, or polars:
+
+```bash
+# DuckDB
+arcctl query "SELECT * FROM cpu" -o arrow | \
+  duckdb -c "SELECT count(*) FROM read_arrow('/dev/stdin')"
+
+# pyarrow
+arcctl query "SELECT * FROM cpu" -o arrow | python3 -c '
+import pyarrow.ipc as ipc, sys
+print(ipc.open_stream(sys.stdin.buffer).read_all())
+'
+```
+
+If the stream is interrupted mid-flight (network drop, client kill, server reset), `arcctl` writes a clear `arrow: stream interrupted after N bytes` line to stderr along with the error. The partial bytes on stdout will not parse cleanly — that's a feature, not a bug; truncated IPC should fail loud.
+
+## Database selection
+
+The default database for a query is taken from the active connection's `default_database` (set via `arcctl config create --default-database NAME`). Override per-call with `--database`:
+
+```bash
+arcctl query --database logs "SELECT count(*) FROM access"
+```
+
+If neither the connection default nor `--database` is set, Arc applies its own server-side default (`default`).
+
+## Timeouts
+
+```bash
+arcctl query --timeout 5m "SELECT count(*) FROM giant_table"
+```
+
+`--timeout` is the **per-request HTTP timeout** (default 60s). It must be `> 0`. For long-running queries override it explicitly; arcctl does not infer a longer timeout from the SQL.
+
+## Exit codes
+
+| Code | Meaning |
+|---|---|
+| 0 | Query succeeded (even if 0 rows returned) |
+| 1 | Any failure: bad config, network error, server error, malformed flags |
+
+Error messages go to stderr; output goes to stdout. Standard Unix conventions, so this works:
+
+```bash
+arcctl query "SELECT * FROM cpu" -o json > out.json 2> err.log
+```
+
+## Errors
+
+Server errors are surfaced with the original message and HTTP status:
+
+```bash
+$ arcctl query "SELECT FROM cpu"
+Error: arc: Parser Error: syntax error at end of input (HTTP 500)
+```
+
+Client-side errors (bad flags, missing connection) are caught before any network call:
+
+```bash
+$ arcctl query --output yaml "SELECT 1"
+Error: invalid --output "yaml" (valid: table, json, csv, arrow)
+```
diff --git a/docs/cli/write.md b/docs/cli/write.md
new file mode 100644
index 0000000..6198237
--- /dev/null
+++ b/docs/cli/write.md
@@ -0,0 +1,142 @@
+---
+sidebar_position: 4
+---
+
+# Writing Line Protocol
+
+`arcctl write` POSTs line-protocol records to Arc's `/api/v1/write/line-protocol` endpoint. Body is streamed — large files / pipes never buffer in memory — so `cat huge.lp | arcctl write` works at line-rate.
+
+## Quick reference
+
+```bash
+# Stdin pipe — most common in CI / log forwarders / quick experiments
+echo "cpu,host=server-1 value=42.5 $(date +%s)000000000" | arcctl write
+
+# From a file
+arcctl write -f payload.lp --database metrics
+
+# Explicit precision (default is nanoseconds)
+echo "cpu v=1 1700000000" | arcctl write --precision s
+```
+
+## Input
+
+`arcctl write` reads its body from one of:
+
+1. **`-f file.lp` flag** — opens the file with `os.Open` and streams it through to the POST body. The file handle is closed when the write completes.
+2. **Stdin** — used when `-f` is not given. No buffering; bytes flow through as they arrive.
+
+Unlike `arcctl query`, `arcctl write` does **not** error on an empty TTY stdin — typing line protocol interactively is a (rare) supported workflow. An empty body is accepted by the server as a no-op (`OK`, exit 0).
+
+## Line protocol
+
+Arc accepts the standard InfluxDB line protocol:
+
+```
+<measurement>,<tag_key>=<tag_value>,<tag_key>=<tag_value> <field_key>=<field_value>,<field_key>=<field_value> <timestamp>
+```
+
+- **Measurement** — required, the "table" name in Arc
+- **Tags** — optional, comma-separated key=value pairs (string-only, schemas inferred)
+- **Fields** — required, at least one numeric/string/bool value
+- **Timestamp** — optional; if omitted, Arc applies wall-clock-at-receive
+
+Example:
+
+```
+cpu,host=server-1,region=us-east value=42.5,temp=68.0 1700000000000000000
+mem,host=server-1 used=8.5,total=16.0 1700000000000000000
+```
+
+## Precision
+
+`--precision` tells the server how to interpret bare-integer timestamps:
+
+| Value | Unit |
+|---|---|
+| `ns` (default) | nanoseconds since epoch |
+| `us` | microseconds |
+| `ms` | milliseconds |
+| `s` | seconds |
+
+```bash
+echo "cpu v=1 1700000000" | arcctl write --precision s
+echo "cpu v=1 1700000000000" | arcctl write --precision ms
+```
+
+`arcctl` validates the precision flag client-side before the request goes out, so a typo like `--precision furlong` fails fast:
+
+```
+$ arcctl write --precision furlong < x.lp
+Error: invalid --precision "furlong" (must be one of ns, us, ms, s)
+```
+
+## Database selection
+
+The default database comes from the active connection's `default_database`. Override per-call:
+
+```bash
+arcctl write -f payload.lp --database metrics
+```
+
+If neither the connection default nor `--database` is set, Arc applies its own server-side default (`default`).
+
+## Streaming behavior
+
+`arcctl write` does NOT buffer the body. This matters for:
+
+- **Large files** — `arcctl write -f /var/log/lp/all-day.lp` streams without loading the file into RAM.
+- **Continuous pipes** — `tail -F app.log | parser | arcctl write` keeps memory flat while ingesting at line-rate.
+- **HTTP timeout** — `--timeout` applies to the **whole request**. A 10 GB file + the default 60s timeout will time out before completion. For large writes, raise `--timeout`:
+
+```bash
+arcctl write -f huge.lp --timeout 30m
+```
+
+## Exit codes
+
+| Code | Meaning |
+|---|---|
+| 0 | Server returned 204 No Content (success) |
+| 1 | Any failure: bad flags, network error, server error (4xx/5xx) |
+
+On success, `arcctl write` prints `OK` to stdout. On failure, the server's error message is surfaced:
+
+```bash
+$ echo "garbage" | arcctl write
+Error: arc: malformed line at offset 0 (HTTP 400)
+```
+
+## Common patterns
+
+### Backfill from a file
+
+```bash
+arcctl write -f historical.lp --database backfill --precision ms
+```
+
+### Continuous ingestion from a log tail
+
+```bash
+tail -F /var/log/app.log | \
+  awk '{ printf("log,host=%s msg=\"%s\" %d000000000\n", "myhost", $0, systime()) }' | \
+  arcctl write --database logs
+```
+
+### Ad-hoc connection (no profile)
+
+```bash
+echo "metric v=1" | arcctl write \
+  --endpoint https://arc.staging.example.com \
+  --token YOUR-TOKEN \
+  --database metrics
+```
+
+### CI-friendly with env vars
+
+```bash
+ARC_ENDPOINT=https://arc.x.example.com ARC_TOKEN=YOUR-TOKEN \
+  arcctl write -f payload.lp --database metrics
+```
+
+See [Connection management](/arc/cli/connections#precedence) for the full precedence rules.
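
The connection precedence rules referenced above (flags, then env vars, then the active profile) can be sketched in Python as a pure function. This is a minimal sketch of the documented order, not arcctl's code: `Resolved`, `resolve_connection`, and the `source` labels are hypothetical names, and rejecting a lone endpoint or token is an assumption drawn from the "both required together" wording.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Resolved:
    source: str                     # which rule matched
    name: Optional[str] = None      # profile name, for named lookups
    endpoint: Optional[str] = None  # set only for ad-hoc connections
    token: Optional[str] = None


def _pair(endpoint: Optional[str], token: Optional[str], label: str) -> bool:
    """True when both are set; a lone value is rejected (assumed behavior)."""
    if endpoint and token:
        return True
    if endpoint or token:
        raise ValueError(f"{label}: endpoint and token must be set together")
    return False


def resolve_connection(flags: Mapping[str, str], env: Mapping[str, str],
                       active: Optional[str]) -> Resolved:
    """Walk the five documented sources in order and return the first match."""
    if flags.get("connection"):
        return Resolved("flag:--connection", name=flags["connection"])
    if _pair(flags.get("endpoint"), flags.get("token"), "flags"):
        return Resolved("flag:--endpoint+--token",
                        endpoint=flags["endpoint"], token=flags["token"])
    if env.get("ARC_CONNECTION"):
        return Resolved("env:ARC_CONNECTION", name=env["ARC_CONNECTION"])
    if _pair(env.get("ARC_ENDPOINT"), env.get("ARC_TOKEN"), "env"):
        return Resolved("env:ARC_ENDPOINT+ARC_TOKEN",
                        endpoint=env["ARC_ENDPOINT"], token=env["ARC_TOKEN"])
    if active:
        return Resolved("config:active", name=active)
    raise LookupError("no active connection")


env = {"ARC_ENDPOINT": "https://arc.x.example.com", "ARC_TOKEN": "YOUR-TOKEN"}
print(resolve_connection({}, env, "local").source)
print(resolve_connection({"connection": "prod"}, env, "local").source)
```

Keeping the resolver free of file and network access makes the ordering easy to test in isolation; loading the named profile from the config file would be a separate step.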