diff --git a/README.md b/README.md index 212ec25..e85bfe8 100644 --- a/README.md +++ b/README.md @@ -81,7 +81,7 @@ The `${credential.X}` substitution resolves to the field's value (string fields) ## Status -**v0.2.4** — 21 modules in `library/` gated by `last_verified` (5 production · 14 verified · 2 partial). New since v0.2.3: `seedance` (Doubao Seedance 2.0 video), `ngrok` (dev tunneling), `seedream` (Doubao Seedream image gen — incl. multi-image fusion / group output / streaming / web-search), `dashscope` (Alibaba CosyVoice TTS + voice cloning + Wanx image gen). Plus CI on every PR (`.github/workflows/ci.yml`), `SECURITY.md` vuln reporting policy, SPEC.md §0–§4 English translation. Format spec is stable; AI-assisted module authoring (v0.3) in progress. +**v0.2.4** — 22 modules in `library/` gated by `last_verified` (5 production · 15 verified · 2 partial). New since v0.2.3: `seedance` (Doubao Seedance 2.0 video), `ngrok` (dev tunneling), `seedream` (Doubao Seedream image gen — incl. multi-image fusion / group output / streaming / web-search), `dashscope` (Alibaba CosyVoice TTS + voice cloning + Wanx image gen), `volcengine-tos` (S3-compatible object storage — the bridge for hosting Seedance / Seedream reference images at public URLs). Plus CI on every PR (`.github/workflows/ci.yml`), `SECURITY.md` vuln reporting policy, SPEC.md §0–§4 English translation. Format spec is stable; AI-assisted module authoring (v0.3) in progress. See: - [SPEC.md](./SPEC.md) — full format specification (Chinese, English translation forthcoming) diff --git a/ROADMAP.md b/ROADMAP.md index 4177c58..8423221 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -56,7 +56,7 @@ Stack: Bun + Hono + HTMX + Tailwind CDN, server-side rendered. 
- [x] **npm module** (registry + publish workflow — dogfood from shipping `@robozephyr/trove` itself); covers token types, scoped-package private-by-default, bare-name squat, double-shebang trap, Bypass-2FA Granular Token, `NPM_CONFIG_USERCONFIG=` for non-interactive publish. `last_verified: production` - [x] `trove install ...` CLI sidecar — copy library modules into `~/.trove/`; `--list` shows available + installed status; `--force` to overwrite; idempotent - [ ] `trove install ` — install from arbitrary git repo (community modules); needed for the marketplace story but not for v1.0 launch -- [ ] Re-verify the rest of the modules to production-grade `last_verified` — happens organically as maintainer (or contributors) use modules in real projects. Currently **5 production · 14 verified · 2 partial** out of 21 +- [ ] Re-verify the rest of the modules to production-grade `last_verified` — happens organically as maintainer (or contributors) use modules in real projects. Currently **5 production · 15 verified · 2 partial** out of 22 ## v0.2.x → OSS launch prep (active) diff --git a/library/volcengine-tos/credentials.example.json b/library/volcengine-tos/credentials.example.json new file mode 100644 index 0000000..c294e6b --- /dev/null +++ b/library/volcengine-tos/credentials.example.json @@ -0,0 +1,7 @@ +{ + "VOLC_ACCESS_KEY_ID": "", + "VOLC_SECRET_ACCESS_KEY": "", + "TOS_REGION": "cn-beijing", + "TOS_BUCKET": "", + "TOS_ENDPOINT": "tos-cn-beijing.volces.com" +} diff --git a/library/volcengine-tos/module.md b/library/volcengine-tos/module.md new file mode 100644 index 0000000..a71ffc6 --- /dev/null +++ b/library/volcengine-tos/module.md @@ -0,0 +1,319 @@ +--- +name: volcengine-tos +version: 0.1.0 +category: infra +description: Volcengine TOS (Torch Object Storage) — S3-compatible object storage on the Volcengine platform. Sub-user AK/SK auth, public-read ACL for AI-gen reference URLs, native + S3 SDK paths. 
The canonical place to host images / audio / video that Seedance / Seedream / other Volcengine AI services reference +homepage: https://www.volcengine.com/docs/6349 +tags: [object-storage, s3-compatible, volcengine, cdn-ready, infra] +applies_to: + - "hosting reference images / audio / video for Seedance + Seedream API calls (both require publicly fetchable URLs; same Volcengine region = low-latency model server pull)" + - "general-purpose object storage on Volcengine (static assets, backups, AIGC archive)" + - "drop-in replacement for AWS S3 / Aliyun OSS when you're already on Volcengine and want one platform key" + - "browser-direct uploads via signed PUT URLs (with CORS configured)" +trove_spec: "0.1" +lastmod: "2026-05-18" +last_verified: "2026-05-18 · E2E live — sub-user AK/SK auth via official `tos==2.9.0` Python SDK, list_buckets / create_bucket (public-read ACL) / put_object / public-URL GET / delete_object all succeeded. Public URL pattern `https://.tos-cn-beijing.volces.com/` returned uploaded bytes via HTTP 200 with correct content-type. Cross-module verified shape: same host pattern Seedance docs explicitly reference for their `image_url` field" + +credentials: + VOLC_ACCESS_KEY_ID: + type: password + required: true + help: "Sub-user access key ID, AKLT... format. Get from https://console.volcengine.com/iam/keymanage. STRONGLY recommend creating a dedicated sub-user (not the root account) with the `TOSFullAccess` policy attached — see Critical Constraint #1." + VOLC_SECRET_ACCESS_KEY: + type: password + required: true + help: "Sub-user secret access key. Volcengine shows this ONCE at creation; save it then. Lost it? Regenerate the AK/SK pair (old pair will keep working until you explicitly delete it)." + TOS_REGION: + type: select + options: [cn-beijing, cn-shanghai, cn-guangzhou, ap-southeast-1, ap-northeast-1] + default: cn-beijing + help: "Region your bucket lives in. Match to the region of the consuming service (e.g. 
Seedance + Seedream are cn-beijing → keep TOS bucket cn-beijing too for low-latency model server fetch)." + TOS_BUCKET: + type: text + required: false + help: "Your default bucket name. Globally unique within the region. Lowercase letters / digits / hyphens, 3-63 chars, can't start/end with hyphen. Once created, can't be renamed (only delete + recreate). Leave blank if you create per-task buckets via API." + TOS_ENDPOINT: + type: url + required: false + default: "tos-cn-beijing.volces.com" + help: "Region-derived endpoint host (no scheme). Default matches `TOS_REGION: cn-beijing`. For other regions: `tos-<region>.volces.com`. Some setups prefer the public CDN endpoint `*.tos-<region>-cdn.volces.com` after attaching a CDN — see CDN integration section below." +--- + +# Volcengine TOS Usage Guide + +## ⚠️ Critical Constraints (read before writing code) + +1. **Use a sub-user, NOT the root account's AK/SK** — root credentials grant access to every Volcengine service (TOS, ARK, DNS, Pages, billing...). For Trove, create a sub-user at https://console.volcengine.com/iam/keymanage with **`TOSFullAccess`** policy attached (or a custom policy scoped to specific buckets for stricter least-privilege). Sub-user compromise = TOS-scoped damage; root compromise = whole-account meltdown. +2. **AK/SK is NOT the same credential as `ARK_API_KEY`** — Volcengine has two parallel auth systems. AI model APIs (ARK / Bailian-equivalent) use `Bearer <ARK_API_KEY>`. Object storage / infra APIs (TOS / VPC / DNS) use signature v4 with AK/SK. Don't try the ARK key against TOS — you'll get `InvalidSignature` with a misleading message. +3. **Bucket names are globally unique within a region and CANNOT be renamed** — choose carefully. Delete + recreate is the only way to "rename", and recreate is blocked for a cooldown window after delete (~12-24h). Naming convention recommendation: `<app>-<env>-<purpose>`, e.g. `myapp-prod-assets`. +4. 
**Default bucket ACL is private — set `public-read` explicitly if AI services need to pull** — Seedance / Seedream / other model servers cannot reach private objects. Either (a) set bucket ACL to `public-read` at create time, OR (b) set per-object ACL to `public-read` on each upload, OR (c) hand model servers presigned URLs. Per-object ACL is the right choice when most of the bucket is private but some objects need public reach. +5. **Region split matters for cross-service latency + traffic cost** — TOS in `cn-beijing` ↔ Seedance/Seedream in `cn-beijing` = same-region fetch (fast, free intra-region traffic). TOS in `ap-southeast-1` ↔ Seedance in `cn-beijing` = cross-region fetch (slow, billed egress). **Always co-locate TOS bucket with the consuming AI service's region.** +6. **Object key naming: avoid URL-reserved chars** — `?` `#` `%` in keys will URL-encode unpredictably across SDKs. Safe chars: `[a-zA-Z0-9._/-]`. Keys with `/` work as virtual folders (`avatars/user-123.jpg`). Max key length: 1024 chars; max object size: 5 GB per single PUT (use multipart upload for >5 GB). +7. **TOS is S3-compatible — `aws-sdk-s3` works pointed at the TOS endpoint** — for codebases that already use `boto3` or `@aws-sdk/client-s3`, just override the endpoint URL. Caveat: signature region must be `cn-beijing` etc. exactly, not `us-east-1` (which some S3 SDKs default to for "S3-compatible" mode). +8. **No official Node SDK exists — use S3 SDK** — `tos-python-sdk` (Python), Java SDK are official. For Node / Edge / Deno / Go, the path is `@aws-sdk/client-s3` with custom endpoint, NOT a custom WebSocket / fetch wrapper. See the Node section below. +9. **CORS is per-bucket, not per-object** — if you plan to PUT from a browser via presigned URL, configure the bucket's CORS rule first. Without CORS, the browser blocks the upload (CORS preflight fails); server-side uploads (Python / Node SDK from your backend) never hit CORS. +10. 
**Public-read does NOT mean public-list** — `public-read` ACL lets anyone GET an object if they know the key. It does NOT let them list the bucket's contents (LIST is a separate `public-read-write` or explicit policy). Don't rely on key obscurity for security; if an object is sensitive, keep it private and use short-TTL presigned GET URLs. + +--- + +## Setup + +```bash +# Trove pattern — pull keys on demand; export so child processes (e.g. Python) see them +export VOLC_ACCESS_KEY_ID=$(jq -r .VOLC_ACCESS_KEY_ID ~/.trove/volcengine-tos/credentials.json) +export VOLC_SECRET_ACCESS_KEY=$(jq -r .VOLC_SECRET_ACCESS_KEY ~/.trove/volcengine-tos/credentials.json) +``` + +Install the official Python SDK (smoothest path): + +```bash +pip install 'tos>=2.9' +``` + +--- + +## Quickstart: list / create / upload / get-url / delete (Python) + +```python +import os, tos + +client = tos.TosClientV2( + ak=os.environ["VOLC_ACCESS_KEY_ID"], + sk=os.environ["VOLC_SECRET_ACCESS_KEY"], + endpoint="tos-cn-beijing.volces.com", + region="cn-beijing", +) + +# 1. List your buckets +buckets = client.list_buckets() +print(f"have {len(buckets.buckets)} buckets") + +# 2. Create a new bucket with public-read ACL (so AI model servers can fetch) +client.create_bucket(bucket="myapp-assets", acl=tos.ACLType.ACL_Public_Read) + +# 3. Upload an object (also public-read so individual URLs work) +content = b"hello trove" +client.put_object( + bucket="myapp-assets", + key="demo/hello.txt", + content=content, + acl=tos.ACLType.ACL_Public_Read, +) + +# 4. Construct the public URL +url = "https://myapp-assets.tos-cn-beijing.volces.com/demo/hello.txt" +# anyone can `curl $url` and get back the content + +# 5. Delete the object when done (control storage cost) +client.delete_object(bucket="myapp-assets", key="demo/hello.txt") +``` + +**Public URL pattern**: `https://<bucket>.tos-<region>.volces.com/<key>`. No signed token needed when bucket / object ACL is `public-read`. Same host pattern Seedance / Seedream docs reference for their `image_url` / `video_url` / `audio_url` fields. 
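The public URL pattern above can be sketched as a small helper. This is only an illustration, assuming the virtual-host form used in the quickstart (bucket name, then `tos-<region>.volces.com`, then the object key). The name `tos_public_url` is hypothetical and not part of the `tos` SDK; the helper uses only the standard library and percent-encodes the key, so the characters flagged in Critical Constraint #6 cannot silently change the URL.

```python
from urllib.parse import quote


def tos_public_url(bucket: str, region: str, key: str) -> str:
    """Build a public object URL (hypothetical helper, not an SDK call).

    Assumes the host pattern shown in this module:
    https://<bucket>.tos-<region>.volces.com/<key>
    """
    if not bucket or not region or not key:
        raise ValueError("bucket, region and key must all be non-empty")
    # Keep "/" literal so virtual folders stay readable; encode everything
    # else that is not an unreserved URL character (e.g. space, "#", "?").
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.tos-{region}.volces.com/{encoded_key}"


if __name__ == "__main__":
    print(tos_public_url("myapp-assets", "cn-beijing", "demo/hello.txt"))
```

Percent-encoding here is a safety net rather than a reason to use reserved characters in keys; staying within the safe character set remains the simpler option.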
+ +--- + +## Cross-module recipe — TOS as the bridge for Seedance / Seedream references + +Both Seedance and Seedream require reference images / videos / audios at **publicly fetchable URLs** (model server pulls). TOS is the canonical place to host them. Full chain: + +```python +# 1. Generate an image with Seedream (or pre-render a frame from elsewhere) +import os +from openai import OpenAI + +ark = OpenAI(base_url="https://ark.cn-beijing.volces.com/api/v3", + api_key=os.environ["ARK_API_KEY"]) +r = ark.images.generate( + model="doubao-seedream-5-0-260128", + prompt="a red origami crane on a wooden desk, studio lighting", + size="2K", + response_format="b64_json", # inline so we can upload immediately + extra_body={"watermark": False}, +) +import base64 +image_bytes = base64.b64decode(r.data[0].b64_json) + +# 2. Upload to TOS with public-read ACL +import tos +tos_client = tos.TosClientV2( + ak=os.environ["VOLC_ACCESS_KEY_ID"], + sk=os.environ["VOLC_SECRET_ACCESS_KEY"], + endpoint="tos-cn-beijing.volces.com", + region="cn-beijing", +) +key = "seedance-refs/crane-2026-05-18.jpeg" +tos_client.put_object( + bucket="myapp-assets", + key=key, + content=image_bytes, + acl=tos.ACLType.ACL_Public_Read, + content_type="image/jpeg", +) +public_url = f"https://myapp-assets.tos-cn-beijing.volces.com/{key}" + +# 3. 
Reference that public URL in a Seedance video gen call +import requests +seedance_resp = requests.post( + "https://ark.cn-beijing.volces.com/api/v3/contents/generations/tasks", + headers={"Authorization": f"Bearer {os.environ['ARK_API_KEY']}", "Content-Type": "application/json"}, + json={ + "model": "doubao-seedance-2-0-260128", + "content": [ + {"type": "text", "text": "Camera slowly orbits around the crane, golden hour lighting"}, + {"type": "image_url", "image_url": {"url": public_url}, "role": "first_frame"}, + ], + "ratio": "16:9", + "duration": 5, + }, +) +task_id = seedance_resp.json()["id"] +# … then poll task_id per the seedance module +``` + +The killer property: **all three steps run on cn-beijing** → TOS upload + Seedance model server fetch is same-region, zero egress cost. Cross-region would be slower and metered. + +--- + +## S3 SDK path (Node / Edge / Deno / Go / anywhere without an official tos SDK) + +TOS implements the S3 API. Point the AWS S3 client at the TOS endpoint: + +```typescript +import { S3Client, PutObjectCommand, CreateBucketCommand } from "@aws-sdk/client-s3"; + +const s3 = new S3Client({ + region: "cn-beijing", // must match TOS region exactly + endpoint: "https://tos-cn-beijing.volces.com", // include https:// + credentials: { + accessKeyId: process.env.VOLC_ACCESS_KEY_ID!, + secretAccessKey: process.env.VOLC_SECRET_ACCESS_KEY!, + }, + forcePathStyle: false, // TOS uses virtual-host style: .tos-.volces.com +}); + +await s3.send(new PutObjectCommand({ + Bucket: "myapp-assets", + Key: "demo/hello.txt", + Body: "hello trove", + ContentType: "text/plain", + ACL: "public-read", +})); + +const publicUrl = `https://myapp-assets.tos-cn-beijing.volces.com/demo/hello.txt`; +``` + +Same shape works in Go (`aws-sdk-go-v2`), Rust (`aws-sdk-s3`), Java (`software.amazon.awssdk`). 
+ +--- + +## Presigned URLs (when you want time-limited access without making the object public) + +```python +# Presigned GET — anyone with the URL can read for the next 3600 seconds, then 403 +url = tos_client.pre_signed_url( + http_method=tos.HttpMethodType.Http_Method_Get, + bucket="myapp-assets", + key="private/sensitive.pdf", + expires=3600, +) + +# Presigned PUT — same idea for upload from a browser +upload_url = tos_client.pre_signed_url( + http_method=tos.HttpMethodType.Http_Method_Put, + bucket="myapp-assets", + key="user-uploads/avatar.jpg", + expires=600, +) +# Hand `upload_url` to the browser; browser does fetch(uploadUrl, { method: 'PUT', body: file }) +# Needs CORS configured on the bucket to allow browser-origin +``` + +Use presigned URLs when objects are sensitive but need short-lived sharing. For long-lived AI gen references that don't churn, `public-read` ACL is simpler. + +--- + +## CORS (only needed for browser-direct uploads / downloads) + +If your frontend will PUT directly to TOS (e.g. user avatar upload bypassing your backend), configure CORS: + +```python +client.put_bucket_cors( + bucket="myapp-assets", + cors_rules=[tos.CORSRule( + allowed_origins=["https://yourapp.example.com"], + allowed_methods=["PUT", "GET", "POST"], + allowed_headers=["*"], + expose_headers=["ETag"], + max_age_seconds=3000, + )], +) +``` + +For server-side uploads (Python / Node from your backend), CORS is irrelevant — only browser-origin requests trigger preflight. + +--- + +## CDN integration (optional, for high-traffic public assets) + +Default `tos-.volces.com` is the origin endpoint. For high-traffic public assets, attach a Volcengine CDN in front: + +1. Console: TOS bucket → 域名管理 → 绑定 CDN +2. CDN returns a `*.volccdn.com` (or your custom domain) — point your CNAME there +3. The CDN endpoint then handles cache + edge distribution + +For low-traffic dev / internal use, direct TOS endpoint works fine — no CDN needed. 
+ +--- + +## Cost (approximate, RMB, 2026-05) + +| line item | unit | cn-beijing price | +|---|---|---| +| Storage | per GB-month | ¥0.12 | +| Outbound traffic (egress) | per GB | ¥0.50 | +| PUT/POST/DELETE requests | per 10k requests | ¥0.01 | +| GET/HEAD requests | per 10k requests | ¥0.01 | +| Intra-region traffic (TOS → Seedance same region) | free | — | + +For AI gen workflows: image refs are tiny (~500 KB), even 1000 references = 500 MB ≈ ¥0.06/month storage. The big variable is egress IF you serve the public URLs externally. **Same-region serving (Seedance pulling) is free.** + +Free tier: Volcengine TOS includes 50 GB storage + 10 GB egress per month for new accounts (check current at https://www.volcengine.com/pricing). + +--- + +## Error reference + +| symptom | cause | fix | +|---|---|---| +| `SignatureDoesNotMatch` | AK/SK wrong, OR region mismatch in signature, OR clock skew > 15min | confirm region in signature == TOS endpoint region; sync system clock | +| `NoSuchBucket` | bucket doesn't exist OR exists in a different region | check `client.list_buckets()`; confirm region matches | +| `BucketAlreadyExists` | bucket name globally taken in that region | pick a different name (try adding a unique suffix) | +| `AccessDenied` on PUT after bucket create | sub-user's policy doesn't grant PutObject | attach `TOSFullAccess` policy, or add `tos:PutObject` to a custom policy | +| `AccessDenied` on GET via public URL | bucket/object ACL is `private` | set ACL to `public-read` (per-object or per-bucket) | +| Browser PUT fails with CORS error | bucket has no CORS rule | `put_bucket_cors` with your origin in `allowed_origins` | +| Slow upload from outside China | China region with non-China client | use the `ap-southeast-1` region for international uploads, or attach CDN | +| Object served as `application/octet-stream` instead of expected type | no `content_type` passed at PUT | pass `content_type="image/jpeg"` (or appropriate) on the PUT call | + +--- + +## When 
to pick TOS vs other Trove storage modules + +- **volcengine-tos (this module)** → mandatory when serving references to Seedance / Seedream (same-region free egress). Good general choice for Volcengine-anchored stacks. +- Cloudflare R2 / AWS S3 — better when your stack is already on Cloudflare / AWS. AI services on Volcengine fetching from R2 / S3 = cross-cloud egress, billed and slow. +- Aliyun OSS — same logic as TOS but for DashScope-anchored stacks (Qwen / Wanx). + +Rule of thumb: **host assets in the same cloud as the consuming service**. TOS for Volcengine AI services; OSS for Alibaba; R2 for Cloudflare Workers; S3 for AWS Lambda. + +--- + +## Source of truth (refresh when these change) + +- Volcengine TOS overview — https://www.volcengine.com/docs/6349 +- TOS Python SDK — https://www.volcengine.com/docs/6349/92786 +- TOS S3-compatibility notes — https://www.volcengine.com/docs/6349/79895 +- IAM sub-user + AK/SK management — https://console.volcengine.com/iam/keymanage +- TOS console (bucket create / ACL / CORS) — https://console.volcengine.com/tos +- Pricing — https://www.volcengine.com/pricing?product=TOS +- Cross-module: `library/seedance/module.md` + `library/seedream/module.md` — both reference the `tos-.volces.com` host pattern in their docs + +Last upstream-docs sync: see `lastmod` in frontmatter. Last live-API verification: see `last_verified`. diff --git a/site/index.html b/site/index.html index 652917d..5f6a493 100644 --- a/site/index.html +++ b/site/index.html @@ -134,7 +134,7 @@
     _
     | |_ ___ _____   _____
     |  _|  _|  _  |_|   -|
-    |_| |_| |_____|_|___/   v0.2.4 — 21 modules, live-verified
+    |_| |_| |_____|_|___/   v0.2.4 — 22 modules, live-verified
 

Trove

@@ -147,7 +147,7 @@

Trove

- 21 modules in library + 22 modules in library · 5 production · 14 verified · 1 partial · @@ -179,7 +179,7 @@

Quick start

02

Install a module + open the Web UI to fill credentials

-

Pick from 21 bundled modules (or trove install --list to browse). The UI binds to 127.0.0.1:7821 only — never public.

+

Pick from 22 bundled modules (or trove install --list to browse). The UI binds to 127.0.0.1:7821 only — never public.

trove install stripe
 trove ui     # → http://127.0.0.1:7821
@@ -264,7 +264,7 @@

Web UI

  • Modules — your installed modules grouped by category with credential-status indicators
  • -
  • Library — 21 bundled module templates, one-click Install copies module.md into ~/.trove/
  • +
  • Library — 22 bundled module templates, one-click Install copies module.md into ~/.trove/
  • Credentials form — masked password fields with reveal toggle, file-type fields with present/replace/delete widget, inline save via HTMX
  • Module detail — frontmatter + rendered skill markdown side-by-side, with last_verified tier dot
@@ -286,7 +286,7 @@

MCP support (optional)

What's in the library

-

Every module carries a last_verified field — what was actually tested, by whom, when. Dot color reflects current state. We'd rather ship 21 honest modules than 50 LLM-hallucinated ones.

+

Every module carries a last_verified field — what was actually tested, by whom, when. Dot color reflects current state. We'd rather ship 22 honest modules than 50 LLM-hallucinated ones.

production · daily-use @@ -320,10 +320,11 @@

What's in the library

google-analytics google-search-console -
infra · email · db
+
infra · email · db · storage
cloudflare resend supabase + volcengine-tos
collab · dev-tool
lark @@ -338,7 +339,7 @@

Documentation

  • SPEC — the format definition (frontmatter schema, reference syntax, runtime conventions). Includes a living convention adherence log in §10 of real dogfood lessons from production.
  • -
  • library/ — the 21 bundled modules listed above
  • +
  • library/ — the 22 bundled modules listed above
  • ROADMAP — phases and explicit non-goals (no trove init, no inject step, no SaaS — ever)
  • CONTRIBUTING — module quality bar
  • design-v0.2.md — why the Web UI dropped AI-chat features (chat IS the entry interface; UI is the visualization)
  • @@ -346,7 +347,7 @@

    Documentation

    Status

    -

    v0.2.4 — the format spec is stable, all 21 modules are gated by last_verified, and the maintainer dogfoods trove daily across personal projects. AI-assisted module authoring (v0.3) and a marketplace for community modules (v1.0) are next.

    +

    v0.2.4 — the format spec is stable, all 22 modules are gated by last_verified, and the maintainer dogfoods trove daily across personal projects. AI-assisted module authoring (v0.3) and a marketplace for community modules (v1.0) are next.

    The repo is github.com/RoboZephyr/trove — issues, PRs, and module additions welcome (see CONTRIBUTING.md for the quality bar).