8 changes: 8 additions & 0 deletions docs/cli/_category_.json
@@ -0,0 +1,8 @@
{
"label": "CLI (arcctl)",
"position": 11,
"link": {
"type": "generated-index",
"description": "arcctl is the operator CLI for Arc — query, write, and admin operations without curl."
}
}
131 changes: 131 additions & 0 deletions docs/cli/connections.md
@@ -0,0 +1,131 @@
---
sidebar_position: 2
---

# Connection Management

`arcctl` stores connection profiles in `~/.arcctl/config.toml` (mode 0600 — plaintext tokens, same posture as `~/.aws/credentials`). One profile is marked active and used by default; you can override per-command via flags or env vars.

The model is deliberately the same as the InfluxDB v2 CLI's `influx config`, so operators coming from InfluxDB get the same UX.

## Adding a connection

```bash
arcctl config create \
--name local \
--endpoint http://localhost:8000 \
--token YOUR-TOKEN
```

Flags:

| Flag | Required | Description |
|---|---|---|
| `--name` | yes | Profile name (used by `--connection` and `set-active`) |
| `--endpoint` | yes | Arc HTTP base URL — no trailing slash |
| `--token` | yes | Bearer token from Arc's first-run banner |
| `--default-database` | no | Default database for query/write commands |
| `--insecure` | no | Skip TLS verification for this connection |
| `--activate` | no | Make this the active connection |

The first connection you create is auto-activated (saves you one command on first run). Subsequent ones require `--activate` to take over.
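The activation rule above can be sketched in a few lines of Python. This is only an illustration over an in-memory dictionary: the `connections` and `active` keys and the `add_connection` helper are hypothetical names, not arcctl's actual config schema, and duplicate-name handling is left out.

```python
def add_connection(config, name, profile, activate=False):
    """Add a profile to an in-memory config and apply the activation rule."""
    connections = config.setdefault("connections", {})  # hypothetical layout
    is_first = not connections
    connections[name] = dict(profile)
    # The first profile is activated automatically; later ones only take
    # over when activate=True (the equivalent of passing --activate).
    if is_first or activate:
        config["active"] = name
    return config
```

Calling it twice without `activate=True` leaves the first profile active, which mirrors the behavior described for `arcctl config create`.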

## Switching active connection

```bash
arcctl config set-active prod
```

The command exits with an error if the named connection does not exist.

## Listing connections

```bash
$ arcctl config list
┌────────┬─────────┬─────────────────────────────┬─────────┬────────────┐
│ ACTIVE │ NAME │ ENDPOINT │ TOKEN │ DEFAULT_DB │
├────────┼─────────┼─────────────────────────────┼─────────┼────────────┤
│ * │ prod │ https://arc.prod.example… │ abc…xyz │ metrics │
│ │ local │ http://localhost:8000 │ dev…123 │ - │
└────────┴─────────┴─────────────────────────────┴─────────┴────────────┘
```

Tokens are redacted to `first4...last4`; tokens shorter than 12 characters are replaced entirely with `*`.
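As a rough sketch of that redaction rule, the following Python function is one way to express it. The helper name `redact` is hypothetical, and two details are assumptions: the mask uses one `*` per character, and the separator is three ASCII dots (the sample output above shows an ellipsis character instead).

```python
def redact(token):
    """Redact a token for display: first 4 + last 4, or fully masked if short."""
    # Tokens shorter than 12 characters are masked entirely so that the
    # visible prefix and suffix never reveal most of a short secret.
    if len(token) < 12:
        return "*" * len(token)  # assumption: one star per character
    return f"{token[:4]}...{token[-4:]}"
```

The 12-character threshold means at least four characters always stay hidden between the visible prefix and suffix.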

## Inspecting the active connection

```bash
$ arcctl config current
name: prod
endpoint: https://arc.prod.example.com
token: abc…xyz
default_database: metrics
```

## Removing a connection

```bash
arcctl config delete staging
# Delete connection "staging"? [y/N] y
# Deleted connection "staging"
```

Pass `--yes` (or `-y`) to skip the confirmation prompt. If you delete the currently-active profile, the active pointer is cleared so the next command produces a clear "no active connection" error rather than silently falling back to an unrelated profile.
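The "clear the active pointer" behavior can be illustrated with a small Python sketch. The dictionary layout (`connections`, `active`) and the `delete_connection` helper are hypothetical, and raising on an unknown name is an assumption about how a missing profile would be handled.

```python
def delete_connection(config, name):
    """Remove a profile; clear the active pointer if it pointed at that profile."""
    connections = config.get("connections", {})  # hypothetical layout
    if name not in connections:
        raise KeyError(f"connection {name!r} not found")  # assumed behavior
    del connections[name]
    # Do not fall back to another profile: leave "active" empty so the next
    # command fails with an explicit "no active connection" error.
    if config.get("active") == name:
        config["active"] = None
    return config
```

Leaving `active` empty rather than picking a replacement is the design choice the paragraph above describes: an explicit failure is safer than running a command against an unintended environment.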

## Per-command overrides

Every command (`query`, `write`, future `db`, `import`, etc.) accepts the same connection overrides:

```bash
# Use a named profile other than the active one
arcctl --connection prod query "SELECT count(*) FROM cpu"

# Full ad-hoc — both flags must be set together
arcctl query --endpoint https://arc.x.example.com --token YOUR-TOKEN "SELECT 1"

# Env var, named profile lookup
ARC_CONNECTION=prod arcctl query "SELECT 1"

# Env var, full ad-hoc — CI-friendly, no config file needed
ARC_ENDPOINT=https://arc.x.example.com ARC_TOKEN=YOUR-TOKEN arcctl query "SELECT 1"
```

## Precedence

When `arcctl` needs to know which connection to use, it checks these sources in order and uses the first that matches:

1. `--connection NAME` flag
2. `--endpoint URL --token TOKEN` flags (both required together)
3. `ARC_CONNECTION` env var
4. `ARC_ENDPOINT` + `ARC_TOKEN` env vars (both required together)
5. The `active` connection in `~/.arcctl/config.toml`

If none of those is set, the command exits with a clear "no active connection" error rather than guessing.
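The lookup order above can be sketched as a single function. This is a minimal Python illustration, not arcctl's implementation: the `resolve_connection` name, the `flags`/`env`/`config` dictionaries, and the source labels it returns are all hypothetical. One assumption to note: a half-specified ad-hoc pair (endpoint without token, or the reverse) is simply treated as "no match" here, whereas the real CLI may reject it outright.

```python
def resolve_connection(flags, env, config):
    """Return (source, connection) using the documented precedence order."""
    profiles = config.get("connections", {})  # hypothetical layout

    def lookup(name):
        if name not in profiles:
            raise LookupError(f"connection {name!r} not found")
        return profiles[name]

    # 1. --connection NAME
    if flags.get("connection"):
        return "flag:connection", lookup(flags["connection"])
    # 2. --endpoint and --token together
    if flags.get("endpoint") and flags.get("token"):
        return "flag:adhoc", {"endpoint": flags["endpoint"], "token": flags["token"]}
    # 3. ARC_CONNECTION
    if env.get("ARC_CONNECTION"):
        return "env:connection", lookup(env["ARC_CONNECTION"])
    # 4. ARC_ENDPOINT and ARC_TOKEN together
    if env.get("ARC_ENDPOINT") and env.get("ARC_TOKEN"):
        return "env:adhoc", {"endpoint": env["ARC_ENDPOINT"], "token": env["ARC_TOKEN"]}
    # 5. The active profile from the config file
    if config.get("active"):
        return "config:active", lookup(config["active"])
    raise LookupError("no active connection")
```

Because flags are checked before environment variables, a one-off `--connection` on the command line wins over an `ARC_CONNECTION` exported in the shell.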

## Config file location

The config file lives at `~/.arcctl/config.toml` by default. Override the path with the `ARCCTL_CONFIG` env var, which is useful for tests, CI, or per-environment isolation:

```bash
ARCCTL_CONFIG=/etc/arcctl/prod.toml arcctl query "SELECT 1"
```

The file is written atomically (write to temp + rename) so a crash mid-`config create` never leaves a half-written file.
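For readers unfamiliar with the temp-file-plus-rename pattern, here is a minimal Python sketch of it. The function name `write_config_atomically` is hypothetical, and the `fsync` call is an extra durability step assumed for illustration; the text above only states that arcctl writes to a temp file and renames it.

```python
import os
import tempfile


def write_config_atomically(path, text):
    """Write text to path via a temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, mode=0o700, exist_ok=True)
    # mkstemp creates the file with mode 0600, and keeping it in the target
    # directory guarantees the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".config-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # assumption: flush to disk before renaming
        # os.replace swaps the new file into place in one step, so readers
        # see either the old contents or the new ones, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the original file is untouched and only a stray temp file remains.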

## TLS

For HTTPS endpoints, certificate verification is on by default. To skip verification (lab or self-signed certs only) use either:

- `--insecure` on a single command, or
- `insecure_tls = true` in the connection profile (set once via `arcctl config create --insecure`)

When verification is skipped, a `WARNING:` line is printed to stderr. The flag is a no-op on `http://` endpoints and the warning is suppressed there.
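The warning rule can be written as a small decision function. This Python sketch uses a hypothetical helper name, and the wording after the `WARNING:` prefix is invented for illustration; only the prefix and the conditions come from the description above.

```python
from urllib.parse import urlsplit


def insecure_warning(endpoint, insecure):
    """Return a warning line for stderr, or None when no warning applies."""
    if not insecure:
        return None
    if urlsplit(endpoint).scheme.lower() != "https":
        # Plain HTTP has no TLS to verify, so the flag is a no-op and silent.
        return None
    # Hypothetical message text; only the "WARNING:" prefix is documented.
    return f"WARNING: TLS certificate verification is disabled for {endpoint}"
```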

**Never disable TLS verification against a production endpoint.** It exposes the bearer token to any on-path attacker.

## Security notes

- The config file is mode 0600 (owner read/write only). The parent directory `~/.arcctl/` is mode 0700.
- Tokens are stored plaintext. Same posture as `~/.aws/credentials`.
- `arcctl` never logs the token. Help text, error messages, `config list`, and `config current` all use redaction.
- `arcctl` does not phone home. No telemetry. No update checks.
61 changes: 61 additions & 0 deletions docs/cli/index.md
@@ -0,0 +1,61 @@
---
sidebar_position: 1
---

# arcctl

`arcctl` is the operator-facing CLI for Arc. It replaces hand-crafted `curl` calls with a familiar workflow modeled on `influx`, `kubectl`, and `clickhouse-client`.

```bash
# Add a connection profile
arcctl config create --name local --endpoint http://localhost:8000 --token YOUR-TOKEN

# Run a query
arcctl query "SELECT count(*) FROM cpu"

# Write line protocol
echo "cpu,host=server-1 value=42.5 $(date +%s)000000000" | arcctl write
```

## Why arcctl

Operating Arc without arcctl means:

- Reading the bootstrap token from a stderr banner once, then copying it into every `curl` (or losing it and forcing a restart with `ARC_AUTH_FORCE_BOOTSTRAP=true`).
- Building JSON query bodies by hand, setting `Authorization: Bearer`, remembering the `x-arc-database` header.
- Decoding `{"columns":[...],"data":[...]}` responses by eye.
- Juggling endpoints and tokens across dev / staging / prod via shell-var swaps.

`arcctl` handles all of that and adds named connection profiles, multiple output formats (table, JSON, CSV, Arrow IPC), file/stdin input for both query and write, and consistent error messages.

## Status

| Version | Surface |
|---|---|
| v0.1.0 (PR1) | `config` subcommand tree, multi-connection store at `~/.arcctl/config.toml` |
| v0.2.0 (PR2) | `query`, `write` — table / JSON / CSV / Arrow IPC output, stdin / file input |
| v0.3.0+ | `db`, `import`, `auth`, `cluster` subcommands (in development) |
| v1.0.0 | release workflow + Homebrew tap + multi-arch Docker |

## Compatibility

`arcctl` 0.x and 1.x talk to Arc 26.06 or newer. Arc < 26.06 lacks the Phase A cluster auth replication that makes token admin behave consistently across nodes; an older Arc server may work for `query`/`write` but is not supported.

## Installation

Pre-built binaries land in v1.0. For now, build from source:

```bash
git clone https://github.com/Basekick-Labs/arcctl
cd arcctl
go build -o arcctl ./cmd/arcctl
./arcctl --version
```

Requires Go 1.25+.

## Next

- [Connection management](/arc/cli/connections) — adding, switching, and overriding connection profiles
- [Querying](/arc/cli/query) — running SQL with table / JSON / CSV / Arrow output
- [Writing line protocol](/arc/cli/write) — stdin and file ingestion with precision control
167 changes: 167 additions & 0 deletions docs/cli/query.md
@@ -0,0 +1,167 @@
---
sidebar_position: 3
---

# Querying

`arcctl query` runs SQL against an Arc cluster and renders the result in your chosen format. Defaults are operator-friendly: pretty table on stdout, errors on stderr, exit 0 on success, exit 1 on any failure.

## Quick reference

```bash
# Pretty table (default)
arcctl query "SELECT host, value FROM cpu ORDER BY value LIMIT 10"

# Different database for one call
arcctl query --database metrics "SELECT count(*) FROM cpu"

# SQL from a file
arcctl query -f reports/p99.sql

# SQL from stdin
echo "SELECT 1" | arcctl query
```

## SQL input

`arcctl query` accepts SQL three ways, in this precedence:

1. **Positional argument** — `arcctl query "SELECT 1"`
2. **`-f file.sql` flag** — `arcctl query -f reports/p99.sql`
3. **Stdin** — used when neither arg nor `-f` is given and stdin is a pipe (not a TTY)

If you run `arcctl query` interactively with no arguments, it exits immediately with a clear error rather than hanging waiting for stdin.
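The input precedence and the TTY check can be sketched as follows. This is a Python illustration with a hypothetical `pick_sql` helper; the error message text is invented, and the stdin object is passed in explicitly so the logic is easy to exercise.

```python
import io


def pick_sql(positional, file_path, stdin):
    """Choose the SQL source: positional argument, then -f file, then piped stdin."""
    if positional is not None:
        return positional
    if file_path is not None:
        with open(file_path, encoding="utf-8") as handle:
            return handle.read()
    # Only read stdin when it is a pipe; an interactive terminal would block
    # forever waiting for input, so fail fast instead.
    if not stdin.isatty():
        return stdin.read()
    raise ValueError("no SQL provided: pass a query, -f FILE, or pipe SQL on stdin")


# Simulated piped input (StringIO reports isatty() == False).
print(pick_sql(None, None, io.StringIO("SELECT 1")))
```

In a real program the third argument would be `sys.stdin`.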

## Output formats

Pass `-o` (`--output`) to switch:

| Format | When to use |
|---|---|
| `table` (default) | Interactive use; pretty-printed with column headers |
| `json` | Pipe to `jq`, save as `.json`, parse from another script |
| `csv` | Save to a spreadsheet, load into pandas/R, RFC 4180 with header |
| `arrow` | Stream Arrow IPC bytes; pipe to pyarrow / duckdb / polars for analytical post-processing |

### Table

```bash
$ arcctl query "SELECT host, value FROM cpu ORDER BY value"
┌──────────┬───────┐
│ HOST │ VALUE │
├──────────┼───────┤
│ server-1 │ 42.5 │
│ server-2 │ 43.2 │
│ server-3 │ 44.1 │
└──────────┴───────┘
```

Modifiers:

- `--no-header` — drop the column header row
- `--limit N` — cap output rows client-side (the server still computes the full result; use `LIMIT` in your SQL if you want to bound server work)

An empty result (for example, a measurement that has never been written) prints `(0 rows)` rather than nothing, so you know the query ran.

### JSON

```bash
$ arcctl query "SELECT host, value FROM cpu LIMIT 2" -o json
{
"columns": ["host", "value"],
"data": [
["server-1", 42.5],
["server-2", 43.2]
],
"row_count": 2,
"execution_time_ms": 1
}
```

The shape is row-major (`data[i]` is row i). This is the raw Arc JSON query response, indented for readability — pipe to `jq` for one-liner transformations:

```bash
arcctl query "SELECT * FROM cpu" -o json | jq '.data[] | {host: .[0], val: .[1]}'
```
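For scripts that prefer Python over `jq`, the row-major shape converts to one dictionary per row with a `zip`. The sketch below assumes only the `columns` and `data` keys shown in the sample response; the `rows_as_dicts` name is hypothetical.

```python
import json


def rows_as_dicts(payload):
    """Turn a row-major query response into a list of {column: value} dicts."""
    doc = json.loads(payload)
    columns = doc["columns"]
    # data[i] is row i, so pairing each row with the column names is enough.
    return [dict(zip(columns, row)) for row in doc["data"]]


sample = '{"columns": ["host", "value"], "data": [["server-1", 42.5], ["server-2", 43.2]]}'
print(rows_as_dicts(sample))
```

In practice the payload would come from the stdout of `arcctl query ... -o json`.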

### CSV

```bash
$ arcctl query "SELECT host, value FROM cpu ORDER BY value" -o csv
host,value
server-1,42.5
server-2,43.2
server-3,44.1
```

RFC 4180 with a header row by default; `--no-header` drops it. Cell types are stringified — `true`/`false` for bools, `null` cells render as empty fields, integers print without a decimal tail, floats use compact `strconv` formatting.
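The stringification rules can be approximated in Python with the standard `csv` module. This is a sketch under assumptions: `format_cell` and `to_csv` are hypothetical names, Python's `repr` stands in for Go's `strconv` float formatting (the two can differ for some values), and CRLF line endings follow RFC 4180 rather than anything stated about arcctl's actual output.

```python
import csv
import io


def format_cell(value):
    """Stringify one cell following the documented rules."""
    if value is None:
        return ""  # null renders as an empty field
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)  # no decimal tail
    if isinstance(value, float):
        return repr(value)  # shortest round-trip form, an approximation
    return str(value)


def to_csv(columns, rows, header=True):
    """Render rows as RFC 4180 CSV text, optionally without the header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\r\n")
    if header:
        writer.writerow(columns)
    for row in rows:
        writer.writerow([format_cell(cell) for cell in row])
    return buffer.getvalue()
```

Fields containing commas or quotes are quoted automatically by `csv.writer`, which is the part of RFC 4180 that hand-rolled string joins usually get wrong.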

### Arrow IPC

```bash
arcctl query "SELECT * FROM cpu" -o arrow > out.arrow
# arrow: 4096 bytes, server execution 12ms (stderr)
```

The Arrow IPC stream goes to stdout; the byte count and server-side execution time go to stderr. Stream the result into pyarrow, DuckDB, or polars:

```bash
# DuckDB (assumes an installed and loaded extension that provides read_arrow)
arcctl query "SELECT * FROM cpu" -o arrow | \
duckdb -c "SELECT count(*) FROM read_arrow('/dev/stdin')"

# pyarrow
arcctl query "SELECT * FROM cpu" -o arrow | python3 -c '
import pyarrow.ipc as ipc, sys
print(ipc.open_stream(sys.stdin.buffer).read_all())
'
```

If the stream is interrupted mid-flight (network drop, client kill, server reset), `arcctl` writes a clear `arrow: stream interrupted after N bytes` line to stderr along with the error. The partial bytes on stdout will not parse cleanly. This is intentional: a truncated IPC stream should fail loudly rather than pass for a complete result.

## Database selection

The default database for a query is taken from the active connection's `default_database` (set via `arcctl config create --default-database NAME`). Override per-call with `--database`:

```bash
arcctl query --database logs "SELECT count(*) FROM access"
```

If neither the connection default nor `--database` is set, Arc applies its own server-side default (`default`).

## Timeouts

```bash
arcctl query --timeout 5m "SELECT count(*) FROM giant_table"
```

`--timeout` is the **per-request HTTP timeout** (default 60s). It must be `> 0`. For long-running queries override it explicitly; arcctl does not infer a longer timeout from the SQL.

## Exit codes

| Code | Meaning |
|---|---|
| 0 | Query succeeded (even if 0 rows returned) |
| 1 | Any failure: bad config, network error, server error, malformed flags |

Error messages go to stderr and output goes to stdout, following standard Unix conventions, so redirecting each stream separately works:

```bash
arcctl query "SELECT * FROM cpu" -o json > out.json 2> err.log
```
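The same conventions make `arcctl` straightforward to drive from a script. The following Python sketch wraps any command and returns its exit code and both streams separately; `run_cli` is a hypothetical helper, and nothing here is specific to arcctl beyond the exit-code contract described above.

```python
import subprocess


def run_cli(argv):
    """Run a command and return (exit_code, stdout, stderr) as separate values."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr
```

A caller would invoke it as `run_cli(["arcctl", "query", "-o", "json", "SELECT 1"])`, parse stdout only when the code is 0, and log stderr otherwise.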

## Errors

Server errors are surfaced with the original message and HTTP status:

```bash
$ arcctl query "SELECT FROM cpu"
Error: arc: Parser Error: syntax error at end of input (HTTP 500)
```

Client-side errors (bad flags, missing connection) are caught before any network call:

```bash
$ arcctl query --output yaml "SELECT 1"
Error: invalid --output "yaml" (valid: table, json, csv, arrow)
```