From d43e774827c91bd0847cf93ada81c2f63ac4e24c Mon Sep 17 00:00:00 2001
From: MonkeyIn
Date: Sat, 9 May 2026 15:58:02 +0800
Subject: [PATCH] feat: add transcript intake triage

---
 README.md                                   |  13 +
 README.zh-CN.md                             |  13 +
 docs/growth/2026-05-07-outreach-followup.md |   9 +
 docs/ops/external-pilot-runbook.zh-CN.md    |  10 +-
 docs/ops/external-pilot-tracker.zh-CN.md    |   3 +-
 package.json                                |   1 +
 src/testops/cli.ts                          | 167 +++++++++-
 src/testops/transcriptIntake.ts             | 344 ++++++++++++++++++++
 tests/testops/cli.test.ts                   |  57 ++++
 tests/testops/packageMetadata.test.ts       |   1 +
 10 files changed, 615 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index c20a926..0d6f108 100644
--- a/README.md
+++ b/README.md
@@ -287,6 +287,19 @@ This runs the photo-studio multi-turn suite and exports a customer-ready report.
 
 ## Turn A Real Failure Into A Regression Test
 
+When a pilot replies with one sanitized transcript, start with the intake wrapper. It writes a private suite draft, merchant draft, and triage summary without quoting the raw transcript in the summary:
+
+```bash
+pbpaste | npx voice-agent-testops transcript-intake \
+  --stdin \
+  --suite .voice-testops/transcript-intake/suite.json \
+  --merchant-out .voice-testops/transcript-intake/merchant.json \
+  --summary .voice-testops/transcript-intake/summary.md \
+  --merchant-name "Pilot demo agent"
+```
+
+Use this before a live endpoint is ready. The summary highlights inferred industry, turn counts, assertion mix, risk signals, privacy warnings, generated artifacts, and the next `validate` / `doctor` / `run` commands. It does not print raw transcript text.
+ Paste a failed call and generate a starter suite plus an editable merchant draft: ```bash diff --git a/README.zh-CN.md b/README.zh-CN.md index e74a9f9..6ab7973 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -217,6 +217,19 @@ npm run voice-test -- \ ## 把真实失败对话变成回归测试 +如果试点对象先回复了一条脱敏 transcript,还没有 endpoint,先跑 intake 包装命令。它会写出私有 suite 草稿、商家草稿和 triage 摘要;summary 不会引用原始 transcript 文本: + +```bash +pbpaste | npx voice-agent-testops transcript-intake \ + --stdin \ + --suite .voice-testops/transcript-intake/suite.json \ + --merchant-out .voice-testops/transcript-intake/merchant.json \ + --summary .voice-testops/transcript-intake/summary.md \ + --merchant-name "Pilot demo agent" +``` + +这个 summary 会列出推断行业、turn 数、断言分布、风险信号、隐私警告、生成产物和下一步 `validate` / `doctor` / `run` 命令,但不输出原始 transcript。 + 如果你已经遇到过一次真实失败,可以直接复制 transcript,生成一个可编辑的 suite 和商家资料草稿: ```bash diff --git a/docs/growth/2026-05-07-outreach-followup.md b/docs/growth/2026-05-07-outreach-followup.md index 4bf3d93..f4f967d 100644 --- a/docs/growth/2026-05-07-outreach-followup.md +++ b/docs/growth/2026-05-07-outreach-followup.md @@ -66,3 +66,12 @@ Issue:[kev-hu/vapi-voice-agent#1](https://github.com/kev-hu/vapi-voice-agent/i 为降低对方提供样本的成本,已补充可直接复制填写的 intake 包:[Insurance transcript intake pack](../ops/insurance-transcript-intake.md)。下一次 follow-up 优先贴这个模板,而不是继续泛泛请求“提供 transcript”。 已将 intake 包回贴到 `kev-hu/vapi-voice-agent#1`:[issuecomment-4403442303](https://github.com/kev-hu/vapi-voice-agent/issues/1#issuecomment-4403442303)。对方现在只需按模板粘贴一条脱敏 transcript,或明确标注 synthetic/public sample。 + +## 2026-05-09 reply check + +检查时间:2026-05-09 13:58 CST。 + +- `streamcoreai/streamcore-server#4`:对方明确同意测试,提示可先测 `streamcore.ai` demo,但 demo 只有基础 Streamcore knowledge;已回复请求可脚本化 HTTP/WebSocket 测试入口或脱敏 transcript:[issuecomment-4411642368](https://github.com/streamcoreai/streamcore-server/issues/4#issuecomment-4411642368)。 +- `codewithmuh/ai-voice-agent#2`:对方给出弱正向回复;已回复请求 dev/test endpoint 或一条脱敏 booking / missed-call / handoff 
transcript:[issuecomment-4411642357](https://github.com/codewithmuh/ai-voice-agent/issues/2#issuecomment-4411642357)。 + +当前优先级:Streamcore > codewithmuh。拿到 endpoint 走 `doctor` / `run`;拿到 transcript 先跑 `transcript-intake`,只公开 aggregate 结果,不贴原始 transcript。 diff --git a/docs/ops/external-pilot-runbook.zh-CN.md b/docs/ops/external-pilot-runbook.zh-CN.md index 0cffb5d..5728b04 100644 --- a/docs/ops/external-pilot-runbook.zh-CN.md +++ b/docs/ops/external-pilot-runbook.zh-CN.md @@ -20,7 +20,15 @@ - 测试数据已脱敏,不包含真实身份证、完整地址、病历、交易账号、客户真实姓名等敏感信息。 - 至少选择一个 starter 行业:`real_estate`、`dental_clinic`、`home_design`、`insurance`、`restaurant`。 -如果对方只能提供保险 / 监管服务 transcript,先让对方按 [Insurance transcript intake pack](insurance-transcript-intake.md) 填写一条脱敏失败或边界通话,再进入 `from-transcript` / `draft-regressions` 流程。 +如果对方只能先提供一条脱敏 transcript,优先用 `transcript-intake` 生成私有 suite 草稿、商家草稿和 triage summary: + +```bash +pbpaste | npx voice-agent-testops transcript-intake \ + --stdin \ + --summary .voice-testops/transcript-intake/summary.md +``` + +summary 只输出统计、风险信号、隐私警告和下一步命令,不引用原始 transcript 文本。保险 / 监管服务 transcript 先让对方按 [Insurance transcript intake pack](insurance-transcript-intake.md) 填写一条脱敏失败或边界通话,再加 `--intake insurance` 进入 `from-transcript` / `draft-regressions` 流程。 如果对方提供的是一批原始录音链接或 call replay URL,先按 [录音资源 Intake Runbook](recording-resource-intake.zh-CN.md) 建私有 manifest,完成授权、脱敏和 `keep` / `maybe` / `discard` 筛选后,再挑样本转 transcript。 diff --git a/docs/ops/external-pilot-tracker.zh-CN.md b/docs/ops/external-pilot-tracker.zh-CN.md index fecf49a..f3ea56b 100644 --- a/docs/ops/external-pilot-tracker.zh-CN.md +++ b/docs/ops/external-pilot-tracker.zh-CN.md @@ -38,6 +38,7 @@ | 日期 | 对象 | 入口 | 行业 starter | 接入方式 | 数据授权 | 当前状态 | 下一步动作 | |---|---|---|---|---|---|---|---| +| 2026-05-06 | Streamcore server | https://github.com/streamcoreai/streamcore-server/issues/4 | custom platform / realtime voice agent | HTTP / WebSocket / Transcript import | 未确认 | replied | 已回复需求;等待可脚本化 demo endpoint、WebSocket route 或 sanitized 
transcript | | 2026-05-08 | Awaisali36 outbound real-estate Vapi agent | https://github.com/Awaisali36/Outbound-Real-State-Voice-AI-Agent-/issues/6 | real_estate / outbound_leadgen | Vapi / Transcript import | 未确认 | contacted | 2026-05-10 follow-up:endpoint 或 1 条脱敏 transcript | | 2026-05-08 | santmun Sofia voice agent | https://github.com/santmun/sofia-voice-agent/issues/2 | real_estate | Retell / Twilio / Transcript import | 未确认 | contacted | 2026-05-10 follow-up:endpoint 或 1 条脱敏 transcript | | 2026-05-08 | askjohngeorge Pipecat lead qualifier | https://github.com/askjohngeorge/pipecat-lead-qualifier/issues/1 | outbound_leadgen | Pipecat / HTTP | 未确认 | contacted | 2026-05-10 follow-up:lead qualifier endpoint 或 transcript | @@ -49,7 +50,7 @@ | 2026-05-09 | videosdk WhatsApp AI calling agent | https://github.com/videosdk-community/videosdk-whatsapp-ai-calling-agent/issues/2 | outbound_leadgen / custom channel | WhatsApp / Twilio / VideoSDK / Transcript import | 未确认 | contacted | 2026-05-11 follow-up:0.1.19 comment 已补;等 endpoint 或 sanitized transcript | | 2026-05-09 | VoiceBlender | https://github.com/VoiceBlender/voiceblender/issues/28 | outbound_leadgen / platform adapter | REST / Webhook / WebSocket adapter | 未确认 | contacted | 2026-05-11 follow-up:0.1.19 comment 已补;问 adapter interest 或 demo endpoint | | 2026-05-09 | theaifutureguy LiveKit voice agent | https://github.com/theaifutureguy/livekit-voice-agent/issues/6 | outbound_leadgen / receptionist | LiveKit / Telephony / HTTP | 未确认 | contacted | 2026-05-11 follow-up:0.1.19 comment 已补;等 dev endpoint 或 one sanitized call | -| 2026-05-09 | codewithmuh AI voice receptionist | https://github.com/codewithmuh/ai-voice-agent/issues/2 | restaurant / custom receptionist | Vapi / HTTP / Transcript import | 未确认 | contacted | 2026-05-11 follow-up:booking endpoint 或 one sanitized transcript | +| 2026-05-09 | codewithmuh AI voice receptionist | https://github.com/codewithmuh/ai-voice-agent/issues/2 | restaurant / custom 
receptionist | Vapi / HTTP / Transcript import | 未确认 | replied | 已回复需求;等待 booking endpoint 或 one sanitized transcript | | 2026-05-09 | Teleglobals voicebot calling agent | https://github.com/Teleglobals-org/voicebot-calling-agent/issues/1 | real_estate | Twilio / AWS / HTTP / Transcript import | 未确认 | contacted | 2026-05-11 follow-up:real-estate endpoint 或 one sanitized transcript | | 2026-05-09 | frejun Teler Vapi bridge | https://github.com/frejun-tech/teler-vapi-bridge/issues/6 | outbound_leadgen / platform bridge | Vapi bridge / HTTP / WebSocket | 未确认 | contacted | 2026-05-11 follow-up:contract-test adapter interest 或 dev endpoint | diff --git a/package.json b/package.json index 9e351c3..d2f4799 100644 --- a/package.json +++ b/package.json @@ -48,6 +48,7 @@ "voice-test": "tsx src/testops/cli.ts", "judge:calibrate": "tsx src/testops/cli.ts calibrate-judge", "suite:from-transcript": "tsx src/testops/cli.ts from-transcript", + "transcript:intake": "tsx src/testops/cli.ts transcript-intake", "calls:import": "tsx src/testops/cli.ts import-calls", "voice-test:openclaw": "scripts/openclaw-docker.sh voice-test", "voice-test:photo-demo": "scripts/openclaw-docker.sh voice-test examples/voice-testops/photo-studio-multiturn-suite.json", diff --git a/src/testops/cli.ts b/src/testops/cli.ts index 50b0234..ae7c2f3 100644 --- a/src/testops/cli.ts +++ b/src/testops/cli.ts @@ -38,7 +38,13 @@ import { renderSemanticJudgeCalibrationMarkdown, } from "./semanticJudgeCalibration"; import { loadVoiceTestSuite } from "./suiteLoader"; -import { getTranscriptIntakeDefaults, parseTranscriptIntakePreset, type TranscriptIntakePreset } from "./transcriptIntake"; +import { + analyzeTranscriptIntake, + getTranscriptIntakeDefaults, + parseTranscriptIntakePreset, + renderTranscriptIntakeMarkdown, + type TranscriptIntakePreset, +} from "./transcriptIntake"; import { buildDraftMerchantFromTranscript, buildVoiceTestSuiteFromTranscript } from "./transcriptSuite"; const severityRank: Record = { @@ 
-84,6 +90,10 @@ async function main(argv: string[]): Promise<number> {
     return recordingIntake(argv.slice(1));
   }
 
+  if (argv[0] === "transcript-intake") {
+    return transcriptIntake(argv.slice(1));
+  }
+
   if (argv[0] === "pilot-report") {
     return generatePilotReport(argv.slice(1));
   }
@@ -531,6 +541,23 @@ type RecordingIntakeArgs = {
   summaryPath?: string;
 };
 
+type TranscriptIntakeArgs = {
+  transcriptPath?: string;
+  readFromStdin: boolean;
+  suitePath: string;
+  merchantPath?: string;
+  merchantOutPath?: string;
+  summaryPath: string;
+  merchantName?: string;
+  industry?: Industry;
+  name?: string;
+  scenarioId?: string;
+  scenarioTitle?: string;
+  source: LeadSource;
+  intake?: TranscriptIntakePreset;
+  turnRole: FromTranscriptTurnRole;
+};
+
 type PilotReportArgs = {
   reportPath: string;
   commercialPath?: string;
@@ -707,6 +734,144 @@ function parseRecordingIntakeArgs(argv: string[]): RecordingIntakeArgs {
   };
 }
 
+async function transcriptIntake(argv: string[]): Promise<number> {
+  const args = parseTranscriptIntakeArgs(argv);
+  const transcript = args.readFromStdin
+    ? await readFromStdin()
+    : await readFile(await resolveReadablePath(args.transcriptPath ?? ""), "utf8");
+  const intakeDefaults = args.intake ? getTranscriptIntakeDefaults(args.intake) : undefined;
+  const merchantName = args.merchantName ?? intakeDefaults?.merchantName;
+  const industry = args.industry ?? intakeDefaults?.industry;
+  const merchant = args.merchantPath
+    ? merchantConfigSchema.parse(JSON.parse(await readFile(await resolveReadablePath(args.merchantPath), "utf8")))
+    : buildDraftMerchantFromTranscript({ transcript, name: merchantName, industry });
+  const suite = buildVoiceTestSuiteFromTranscript({
+    transcript,
+    merchant,
+    name: args.name ?? intakeDefaults?.suiteName,
+    scenarioId: args.scenarioId ?? intakeDefaults?.scenarioId,
+    scenarioTitle: args.scenarioTitle ?? intakeDefaults?.scenarioTitle,
+    source: args.source,
+    turnRole: args.turnRole,
+  });
+  const suiteOutput = args.merchantOutPath
+    ?
buildSuiteWithMerchantRef(suite, relativeMerchantRef(args.suitePath, args.merchantOutPath))
+    : suite;
+  const report = analyzeTranscriptIntake({
+    transcript,
+    suite,
+    sourcePath: args.readFromStdin ? undefined : args.transcriptPath,
+    selectedTurnRole: args.turnRole,
+    artifacts: {
+      suitePath: args.suitePath,
+      merchantPath: args.merchantOutPath,
+      summaryPath: args.summaryPath,
+    },
+  });
+
+  if (args.merchantOutPath) {
+    await writeReport(args.merchantOutPath, `${JSON.stringify(merchant, null, 2)}\n`);
+  }
+  await writeReport(args.suitePath, `${JSON.stringify(suiteOutput, null, 2)}\n`);
+  await writeReport(args.summaryPath, renderTranscriptIntakeMarkdown(report));
+
+  console.log(`Transcript intake summary: ${args.summaryPath}`);
+  console.log(`Generated suite draft: ${args.suitePath}`);
+  if (args.merchantOutPath) {
+    console.log(`${args.merchantPath ? "Merchant profile" : "Merchant draft"}: ${args.merchantOutPath}`);
+  }
+  console.log(`Transcript: ${args.readFromStdin ? "read from stdin" : args.transcriptPath}`);
+  if (args.intake) {
+    console.log(`Transcript intake: ${args.intake}`);
+  }
+  console.log(`Suite: ${suite.name}`);
+  console.log(`Scenario: ${suite.scenarios[0].id} - ${suite.scenarios[0].title}`);
+  printTurnCount(args.turnRole, suite.scenarios[0].turns.length);
+  console.log(`Assertions: ${report.assertionCount}`);
+  console.log(`Risk signals: ${report.riskSignals.length}`);
+  console.log(`Privacy warnings: ${report.privacyWarnings.length}`);
+
+  return 0;
+}
+
+function parseTranscriptIntakeArgs(argv: string[]): TranscriptIntakeArgs {
+  const values = new Map<string, string>();
+  const flags = new Set<string>();
+  const knownValues = new Set([
+    "transcript",
+    "input",
+    "suite",
+    "out",
+    "merchant",
+    "merchant-out",
+    "summary",
+    "merchant-name",
+    "industry",
+    "name",
+    "scenario-id",
+    "scenario-title",
+    "source",
+    "intake",
+    "turn-role",
+  ]);
+
+  for (let index = 0; index < argv.length; index += 1) {
+    const arg = argv[index];
+    if
(!arg.startsWith("--")) { + throw new Error(`Unexpected argument: ${arg}`); + } + + const name = arg.slice(2); + if (name === "stdin") { + flags.add(name); + continue; + } + if (!knownValues.has(name)) { + throw new Error(`Unknown transcript-intake option: --${name}`); + } + + const value = argv[index + 1]; + if (!value || value.startsWith("--")) { + throw new Error(`${arg} requires a value`); + } + + values.set(name, value); + index += 1; + } + + const readFromStdin = flags.has("stdin"); + const transcriptPath = values.get("transcript") ?? values.get("input"); + if (readFromStdin && transcriptPath) { + throw new Error("--stdin cannot be combined with --transcript or --input"); + } + if (!readFromStdin && !transcriptPath) { + throw new Error("--transcript, --input, or --stdin is required"); + } + + const suitePath = values.get("suite") ?? values.get("out") ?? ".voice-testops/transcript-intake/suite.json"; + const merchantPath = values.get("merchant"); + const merchantOutPath = values.get("merchant-out") ?? (merchantPath ? undefined : ".voice-testops/transcript-intake/merchant.json"); + const summaryPath = values.get("summary") ?? ".voice-testops/transcript-intake/summary.md"; + const intake = values.get("intake"); + + return { + transcriptPath, + readFromStdin, + suitePath, + merchantPath, + merchantOutPath, + summaryPath, + merchantName: values.get("merchant-name"), + industry: values.has("industry") ? industrySchema.parse(values.get("industry")) : undefined, + name: values.get("name"), + scenarioId: values.get("scenario-id"), + scenarioTitle: values.get("scenario-title"), + source: leadSourceSchema.parse(values.get("source") ?? "website"), + intake: intake ? parseTranscriptIntakePreset(intake) : undefined, + turnRole: parseTranscriptTurnRole(values.get("turn-role") ?? 
"customer"), + }; +} + function parsePilotReportArgs(argv: string[]): PilotReportArgs { const values = parseKeyValueArgs(argv); diff --git a/src/testops/transcriptIntake.ts b/src/testops/transcriptIntake.ts index e1f0f68..42f2780 100644 --- a/src/testops/transcriptIntake.ts +++ b/src/testops/transcriptIntake.ts @@ -1,4 +1,6 @@ import type { Industry } from "../domain/merchant"; +import type { VoiceTestAssertion, VoiceTestSeverity, VoiceTestSuite } from "./schema"; +import { parseTranscript } from "./transcriptSuite"; export type TranscriptIntakePreset = "insurance"; @@ -20,6 +22,55 @@ const transcriptIntakeDefaults: Record; + severityCounts: Array<{ severity: VoiceTestSeverity; count: number }>; + riskSignals: TranscriptIntakeRiskSignal[]; + privacyWarnings: TranscriptIntakePrivacyWarning[]; + artifacts: TranscriptIntakeArtifactPaths; + nextSteps: string[]; +}; + +export type AnalyzeTranscriptIntakeOptions = { + transcript: string; + suite: VoiceTestSuite; + sourcePath?: string; + selectedTurnRole: "customer" | "assistant"; + artifacts?: TranscriptIntakeArtifactPaths; +}; + export function parseTranscriptIntakePreset(value: string): TranscriptIntakePreset { if (value === "insurance") { return value; @@ -31,3 +82,296 @@ export function parseTranscriptIntakePreset(value: string): TranscriptIntakePres export function getTranscriptIntakeDefaults(preset: TranscriptIntakePreset): TranscriptIntakeDefaults { return transcriptIntakeDefaults[preset]; } + +export function analyzeTranscriptIntake(options: AnalyzeTranscriptIntakeOptions): TranscriptIntakeTriageReport { + const messages = parseTranscript(options.transcript); + const scenario = options.suite.scenarios[0]; + const assertions = scenario.turns.flatMap((turn) => turn.expect); + const suitePath = options.artifacts?.suitePath ?? 
".voice-testops/transcript-intake/suite.json"; + + return { + generatedAt: new Date().toISOString(), + sourcePath: options.sourcePath, + suiteName: options.suite.name, + scenarioId: scenario.id, + scenarioTitle: scenario.title, + merchantName: scenario.merchant.name, + industry: scenario.merchant.industry, + selectedTurnRole: options.selectedTurnRole, + totalMessages: messages.length, + customerTurns: messages.filter((message) => message.role === "customer").length, + assistantTurns: messages.filter((message) => message.role === "assistant").length, + generatedTurns: scenario.turns.length, + assertionCount: assertions.length, + assertionTypeCounts: countAssertionsByType(assertions), + severityCounts: countAssertionsBySeverity(assertions), + riskSignals: buildRiskSignals(assertions), + privacyWarnings: detectTranscriptPrivacyWarnings(options.transcript), + artifacts: options.artifacts ?? {}, + nextSteps: buildTranscriptIntakeNextSteps(suitePath), + }; +} + +export function renderTranscriptIntakeMarkdown(report: TranscriptIntakeTriageReport): string { + const lines = [ + "# Voice Agent TestOps Transcript Intake", + "", + `Generated: ${report.generatedAt}`, + report.sourcePath ? "Source: file input" : "Source: stdin", + "", + "Privacy: raw transcript text is not included in this summary. 
Keep the transcript and generated suite in a private workspace unless the data owner explicitly approves public sharing.",
+    "",
+    "## Triage",
+    "",
+    "| Metric | Value |",
+    "|---|---:|",
+    `| Total messages | ${report.totalMessages} |`,
+    `| Customer turns | ${report.customerTurns} |`,
+    `| Assistant turns | ${report.assistantTurns} |`,
+    `| Generated ${report.selectedTurnRole} turns | ${report.generatedTurns} |`,
+    `| Draft assertions | ${report.assertionCount} |`,
+    "",
+    "## Generated Draft",
+    "",
+    "| Field | Value |",
+    "|---|---|",
+    `| Suite | ${markdownCode(report.suiteName)} |`,
+    `| Scenario | ${markdownCode(`${report.scenarioId} - ${report.scenarioTitle}`)} |`,
+    `| Merchant | ${markdownCode(`${report.merchantName} (${report.industry})`)} |`,
+    `| Turn role | ${markdownCode(report.selectedTurnRole)} |`,
+    "",
+    "## Assertion Mix",
+    "",
+    "| Assertion type | Count |",
+    "|---|---:|",
+    ...formatCountRows(report.assertionTypeCounts, "No assertions generated."),
+    "",
+    "## Severity Mix",
+    "",
+    "| Severity | Count |",
+    "|---|---:|",
+    ...formatCountRows(report.severityCounts, "No assertions generated."),
+    "",
+    "## Risk Signals",
+    "",
+    "| Signal | Severity | Count | Note |",
+    "|---|---|---:|---|",
+    ...formatRiskSignalRows(report.riskSignals),
+    "",
+    "## Privacy Warnings",
+    "",
+    "| Warning | Count | Note |",
+    "|---|---:|---|",
+    ...formatPrivacyWarningRows(report.privacyWarnings),
+    "",
+    "## Generated Artifacts",
+    "",
+    ...formatArtifactRows(report.artifacts),
+    "",
+    "## Next Steps",
+    "",
+    ...report.nextSteps.map((step) => `- ${step}`),
+    "",
+  ];
+
+  return `${lines.join("\n")}`;
+}
+
+function countAssertionsByType(assertions: VoiceTestAssertion[]): Array<{ type: VoiceTestAssertion["type"]; count: number }> {
+  const counts = new Map<VoiceTestAssertion["type"], number>();
+  for (const assertion of assertions) {
+    counts.set(assertion.type, (counts.get(assertion.type) ??
0) + 1);
+  }
+
+  return [...counts.entries()].map(([type, count]) => ({ type, count }));
+}
+
+function countAssertionsBySeverity(assertions: VoiceTestAssertion[]): Array<{ severity: VoiceTestSeverity; count: number }> {
+  const counts = new Map<VoiceTestSeverity, number>([
+    ["critical", 0],
+    ["major", 0],
+    ["minor", 0],
+  ]);
+  for (const assertion of assertions) {
+    counts.set(assertion.severity, (counts.get(assertion.severity) ?? 0) + 1);
+  }
+
+  return [...counts.entries()]
+    .filter(([, count]) => count > 0)
+    .map(([severity, count]) => ({ severity, count }));
+}
+
+function buildRiskSignals(assertions: VoiceTestAssertion[]): TranscriptIntakeRiskSignal[] {
+  const signals = new Map<string, TranscriptIntakeRiskSignal>();
+  for (const assertion of assertions) {
+    const signal = riskSignalForAssertion(assertion);
+    if (!signal) {
+      continue;
+    }
+
+    const existing = signals.get(signal.tag);
+    if (existing) {
+      existing.count += 1;
+      existing.severity = higherSeverity(existing.severity, signal.severity);
+    } else {
+      signals.set(signal.tag, { ...signal, count: 1 });
+    }
+  }
+
+  return [...signals.values()];
+}
+
+function riskSignalForAssertion(assertion: VoiceTestAssertion): Omit<TranscriptIntakeRiskSignal, "count"> | undefined {
+  if (assertion.type === "semantic_judge") {
+    const notes: Record<string, string> = {
+      no_unsupported_guarantee: "Transcript suggests an unsupported promise or guarantee should be guarded.",
+      requires_human_confirmation: "Transcript touches facts that need human or system confirmation.",
+      requires_handoff: "Transcript contains a human handoff, escalation, opt-out, or refusal pattern.",
+    };
+    return { tag: assertion.rubric, severity: assertion.severity, note: notes[assertion.rubric] };
+  }
+
+  if (assertion.type === "lead_field_present") {
+    return {
+      tag: `lead_field_${assertion.field}`,
+      severity: assertion.severity,
+      note: `Generated suite expects structured lead field ${assertion.field}.`,
+    };
+  }
+
+  if (assertion.type === "must_not_match") {
+    return {
+      tag: "forbidden_phrase_guard",
+      severity: assertion.severity,
+      note:
"Generated suite includes a forbidden wording or promise guard.", + }; + } + + if (assertion.type === "must_contain_any") { + return { + tag: "required_phrase_or_fact", + severity: assertion.severity, + note: "Generated suite expects approved wording, facts, or collection prompts.", + }; + } + + if (assertion.type === "max_latency_ms") { + return { + tag: "latency_guard", + severity: assertion.severity, + note: "Generated suite includes a basic response latency guard.", + }; + } + + return undefined; +} + +function detectTranscriptPrivacyWarnings(transcript: string): TranscriptIntakePrivacyWarning[] { + const warnings: TranscriptIntakePrivacyWarning[] = []; + addWarning( + warnings, + "possible_url", + countMatches(transcript, /\b(?:https?:\/\/|s3:\/\/|gs:\/\/)\S+/gi), + "Replace private URLs, replay links, and recording links with placeholders before sharing.", + ); + addWarning( + warnings, + "possible_email", + countMatches(transcript, /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi), + "Replace email addresses with placeholders such as [EMAIL].", + ); + addWarning( + warnings, + "possible_phone_or_account_number", + countMatches(transcript, /(?:\+?\d[\s().-]?){7,}/g), + "Replace phone numbers and account-like numeric identifiers with stable placeholders.", + ); + addWarning( + warnings, + "possible_secret", + countMatches(transcript, /\b(?:bearer\s+[a-z0-9._-]+|api[_-]?key|secret|token|sk-[a-z0-9_-]+)/gi), + "Remove API keys, bearer tokens, secrets, and production credentials.", + ); + + return warnings; +} + +function addWarning( + warnings: TranscriptIntakePrivacyWarning[], + tag: string, + count: number, + note: string, +): void { + if (count > 0) { + warnings.push({ tag, count, note }); + } +} + +function countMatches(text: string, pattern: RegExp): number { + return [...text.matchAll(pattern)].length; +} + +function buildTranscriptIntakeNextSteps(suitePath: string): string[] { + return [ + `Validate the draft: ${markdownCode(`npx voice-agent-testops validate 
--suite ${suitePath}`)}`,
+    `If an endpoint is available, run doctor first: ${markdownCode(`npx voice-agent-testops doctor --agent http --endpoint "$VOICE_AGENT_ENDPOINT" --suite ${suitePath}`)}`,
+    `Run the pilot suite and write a private summary: ${markdownCode(`npx voice-agent-testops run --agent http --endpoint "$VOICE_AGENT_ENDPOINT" --suite ${suitePath} --summary .voice-testops/transcript-intake/run-summary.md`)}`,
+    "Review generated assertions before using the suite as a CI gate.",
+    "Share aggregate findings only; do not quote raw transcript text publicly unless explicitly authorized.",
+  ];
+}
+
+function higherSeverity(a: VoiceTestSeverity, b: VoiceTestSeverity): VoiceTestSeverity {
+  const rank: Record<VoiceTestSeverity, number> = { critical: 3, major: 2, minor: 1 };
+  return rank[a] >= rank[b] ? a : b;
+}
+
+function formatCountRows<T extends { type: string; count: number } | { severity: string; count: number }>(
+  rows: T[],
+  empty: string,
+): string[] {
+  if (rows.length === 0) {
+    return [`| ${empty} | 0 |`];
+  }
+
+  return rows.map((row) => {
+    const label = "type" in row ? String(row.type) : "severity" in row ? String(row.severity) : "item";
+    return `| ${markdownCode(label)} | ${row.count} |`;
+  });
+}
+
+function formatRiskSignalRows(signals: TranscriptIntakeRiskSignal[]): string[] {
+  if (signals.length === 0) {
+    return ["| none | - | 0 | No obvious risk signal was inferred. |"];
+  }
+
+  return signals.map((signal) => `| ${markdownCode(signal.tag)} | ${signal.severity} | ${signal.count} | ${signal.note} |`);
+}
+
+function formatPrivacyWarningRows(warnings: TranscriptIntakePrivacyWarning[]): string[] {
+  if (warnings.length === 0) {
+    return ["| none | 0 | No obvious raw URL, email, phone/account number, or secret pattern detected.
|"]; + } + + return warnings.map((warning) => `| ${markdownCode(warning.tag)} | ${warning.count} | ${warning.note} |`); +} + +function formatArtifactRows(artifacts: TranscriptIntakeArtifactPaths): string[] { + const rows: string[] = []; + + if (artifacts.suitePath) { + rows.push(`- Suite draft: ${markdownCode(artifacts.suitePath)}`); + } + if (artifacts.merchantPath) { + rows.push(`- Merchant draft: ${markdownCode(artifacts.merchantPath)}`); + } + if (artifacts.summaryPath) { + rows.push(`- Intake summary: ${markdownCode(artifacts.summaryPath)}`); + } + + return rows.length > 0 ? rows : ["- No files were requested."]; +} + +function markdownCode(value: string): string { + return `\`${value.replace(/`/g, "\\`")}\``; +} diff --git a/tests/testops/cli.test.ts b/tests/testops/cli.test.ts index 771923b..bc9bc0e 100644 --- a/tests/testops/cli.test.ts +++ b/tests/testops/cli.test.ts @@ -419,6 +419,63 @@ describe("voice-test CLI", () => { expect(markdown).not.toContain("https://signed.example.test"); }); + it("triages sanitized transcripts into private suite and summary drafts", async () => { + const tempDir = await mkdtemp(path.join(tmpdir(), "voice-testops-cli-")); + const transcriptPath = path.join(tempDir, "streamcore-demo.txt"); + const suitePath = path.join(tempDir, "suite.json"); + const merchantPath = path.join(tempDir, "merchant.json"); + const summaryPath = path.join(tempDir, "transcript-intake.md"); + await writeFile( + transcriptPath, + [ + "Customer: Can a human call me tomorrow at 13800000000? 
I also saw https://private.example.test/replay.wav", + "Assistant: The demo can answer basic Streamcore questions.", + "Customer: Can you guarantee Streamcore will replace all human support?", + "Assistant: It is guaranteed.", + ].join("\n"), + "utf8", + ); + + const result = await runCli([ + "transcript-intake", + "--input", + transcriptPath, + "--suite", + suitePath, + "--merchant-out", + merchantPath, + "--summary", + summaryPath, + "--merchant-name", + "Streamcore demo", + "--industry", + "outbound_leadgen", + ]); + + const suite = await loadVoiceTestSuite(suitePath); + const merchantDraft = JSON.parse(await readFile(merchantPath, "utf8")) as { name: string; industry: string }; + const markdown = await readFile(summaryPath, "utf8"); + + expect(result.code).toBe(0); + expect(result.stdout).toContain(`Transcript intake summary: ${summaryPath}`); + expect(result.stdout).toContain(`Generated suite draft: ${suitePath}`); + expect(result.stdout).toContain(`Merchant draft: ${merchantPath}`); + expect(result.stdout).toContain("Risk signals:"); + expect(result.stdout).toContain("Privacy warnings: 2"); + expect(suite.name).toBe("Generated transcript regression"); + expect(suite.scenarios[0].merchant.name).toBe("Streamcore demo"); + expect(suite.scenarios[0].turns).toHaveLength(2); + expect(merchantDraft).toMatchObject({ name: "Streamcore demo", industry: "outbound_leadgen" }); + expect(markdown).toContain("# Voice Agent TestOps Transcript Intake"); + expect(markdown).toContain("Privacy: raw transcript text is not included"); + expect(markdown).toContain("possible_url"); + expect(markdown).toContain("possible_phone_or_account_number"); + expect(markdown).toContain("requires_handoff"); + expect(markdown).toContain("npx voice-agent-testops validate --suite"); + expect(markdown).not.toContain("13800000000"); + expect(markdown).not.toContain("https://private.example.test"); + }); + it("generates commercial pilot report and pilot recap templates from a JSON report", async () 
=> { const tempDir = await mkdtemp(path.join(tmpdir(), "voice-testops-cli-")); const reportPath = path.join(tempDir, "report.json"); diff --git a/tests/testops/packageMetadata.test.ts b/tests/testops/packageMetadata.test.ts index 01be0cd..857f772 100644 --- a/tests/testops/packageMetadata.test.ts +++ b/tests/testops/packageMetadata.test.ts @@ -42,6 +42,7 @@ describe("package metadata", () => { expect(packageJson.dependencies).not.toHaveProperty("react"); expect(packageJson.dependencies).not.toHaveProperty("@prisma/client"); expect(packageJson.scripts?.["judge:calibrate"]).toContain("calibrate-judge"); + expect(packageJson.scripts?.["transcript:intake"]).toContain("transcript-intake"); expect(lockRoot.packages[""].dependencies).toHaveProperty("tsx"); expect(lockRoot.packages[""].dependencies).toHaveProperty("zod"); expect(lockRoot.packages[""].dependencies).not.toHaveProperty("next");
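
Reviewer note: the privacy check in `detectTranscriptPrivacyWarnings` reports only tags and counts, which is why the new CLI test can assert that the summary never contains the phone number or replay URL. The following is a minimal standalone sketch of that idea, assuming the four regexes shown in the patch. The names `scanForPrivacyPatterns`, `PatternCount`, and `privacyPatterns` are hypothetical and not part of the package, and the sketch uses `String.prototype.match` rather than the patch's `matchAll` helper.

```typescript
// Hypothetical standalone sketch; these names are illustrative, not the package API.
type PatternCount = { tag: string; count: number };

// Patterns copied from detectTranscriptPrivacyWarnings in the patch above.
const privacyPatterns: Array<{ tag: string; pattern: RegExp }> = [
  { tag: "possible_url", pattern: /\b(?:https?:\/\/|s3:\/\/|gs:\/\/)\S+/gi },
  { tag: "possible_email", pattern: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi },
  { tag: "possible_phone_or_account_number", pattern: /(?:\+?\d[\s().-]?){7,}/g },
  { tag: "possible_secret", pattern: /\b(?:bearer\s+[a-z0-9._-]+|api[_-]?key|secret|token|sk-[a-z0-9_-]+)/gi },
];

// Count matches per pattern and keep only the tags that fired.
// Only the tag and the count are returned, never the matched text.
function scanForPrivacyPatterns(transcript: string): PatternCount[] {
  return privacyPatterns
    .map(({ tag, pattern }) => ({ tag, count: (transcript.match(pattern) ?? []).length }))
    .filter((entry) => entry.count > 0);
}

const sample = "Customer: call me at 13800000000 or see https://private.example.test/replay.wav";
const found = scanForPrivacyPatterns(sample);
console.log(found.map((entry) => `${entry.tag}=${entry.count}`).join(", "));
```

Because the phone pattern matches any run of seven or more digits, it will also flag order numbers and timestamps. Treating every hit as a prompt for manual review, rather than as a confirmed leak, is probably the safer reading of these counts.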