
Integrate local main improvements into remote main#15

Merged
jaydestro merged 172 commits into main from integration/local-main-2026-06-11
Jun 11, 2026

Conversation

@jaydestro

Owner

Summary

This PR brings the current local working main state to GitHub via the integration/local-main-2026-06-11 branch, without force-pushing over the divergent remote main.

Key changes included:

  • Browser-scan sidecars now feed Community signals immediately via direct server indexing.
  • Standalone browser-scan runs auto-start a /scout-scan ingest run by default.
  • Browser-scan no longer treats late async warnings after successful social sidecar writes as failed social scans.
  • Content Scout runtime config loading now prefers .local/configs with legacy fallback.
  • Web UI app.js was split into focused lib/, pages/, and components/ modules.
  • Dashboard, Conversations, and Community Signals freshness/reliability fixes.
  • Hermetic web-ui test runner prevents real .local/state files from being clobbered.
  • Community-signal links that definitively return 404/410 are stripped.

Validation

  • tools/web-ui test suite: 117 pass / 0 fail.
  • Syntax checks passed for edited server/browser-scan/frontend modules.
  • API validation showed fresh browser-scan rows indexed into conversations: X 46, LinkedIn 8, Reddit 10, hiringHits=0.
  • Tracked worktree was clean before pushing this integration branch.

Notes

Local main and origin/main are divergent. This PR is intentionally opened from an integration branch instead of force-pushing over origin/main. GitHub may report conflicts because origin/main has commits not present in local main; those should be resolved in the PR.

jaydestro and others added 30 commits April 23, 2026 18:17
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…through the process of setting up their environment and configuring their preferences.
changes to onboarding experience, including a CLI tool to guide users …
- Setup view is the default on first load when no agent + no configs
- Agent presets: Claude Code, GitHub Copilot CLI, OpenAI Codex, Custom, None
- Choice persisted in tools/web-ui/.scout-web-settings.json (gitignored)
- SCOUT_RUNNER env var still overrides and locks the picker
- Dashboard empty banner points users to Setup
- /scout-onboard added to Run command dropdown
- New GET/POST /api/env endpoints read and write .env at repo root
- Keys from .env.example are seeded into the form when .env is missing
- Password-masked value inputs with show/hide toggle
- Add custom keys via '+ Add custom key' button
- Values with whitespace/quotes are safely double-quoted on write
- Rejects invalid key names (non A-Za-z0-9_ / leading digit)
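
The key-name and quoting rules in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `isValidEnvKey`, `formatEnvValue`, and `formatEnvLine` are hypothetical names, and the exact escaping rules are an assumption.

```javascript
// Hypothetical helpers; names and escaping rules are assumptions, not the PR's code.

// A key is valid when it uses only A-Z, a-z, 0-9 and _ and does not start with a digit.
function isValidEnvKey(key) {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(key);
}

// Double-quote values that contain whitespace or quote characters,
// escaping backslashes and double quotes inside them.
function formatEnvValue(value) {
  if (!/[\s"']/.test(value)) return value;
  return '"' + value.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Build one KEY=value line, rejecting invalid key names.
function formatEnvLine(key, value) {
  if (!isValidEnvKey(key)) throw new Error(`Invalid env key: ${key}`);
  return `${key}=${formatEnvValue(value)}`;
}
```

Validating the key before writing keeps a malformed name from ever reaching the .env file, which is presumably why the endpoint rejects it instead of trying to repair it.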
jaydestro added 28 commits June 10, 2026 11:14
Add a compact signal metric strip, replace platform emoji markers with text badges, use CSS tone dots for sentiment indicators, and tighten platform/source rows for desktop and mobile.
- Community signals card: always include items from the latest scan, even if the post date is older than the 30d window; relabel meta as 'Latest scan'.
- Reject non-navigable LinkedIn /feed/sdui-post/ permalinks in isValidPostUrl so the social card prefers a real /feed/update/urn: link and routes dead-link samples/needs-reply rows into Conversations instead of a broken external link.
- Add SEO rewrite lib + tests and supporting analytics/server/app/config wiring.
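
The LinkedIn permalink rule described above could look something like the sketch below. It is an assumption-laden illustration: `isNavigablePostUrl` is a hypothetical stand-in, and the real isValidPostUrl very likely checks more than this.

```javascript
// Hypothetical stand-in for the LinkedIn part of isValidPostUrl (assumed shape).
function isNavigablePostUrl(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a parseable absolute URL
  }
  const isLinkedIn = /(^|\.)linkedin\.com$/.test(parsed.hostname);
  // /feed/sdui-post/ permalinks do not lead to a navigable post page.
  if (isLinkedIn && parsed.pathname.startsWith('/feed/sdui-post/')) return false;
  return true;
}
```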
Expand the Configs view Form mode from 4 flat fields into 5 collapsible groups covering identity/search, role & behavior flags, people, competitors, and sources/filters. Role flags render as on/off toggles + text inputs and round-trip back to markdown via in-place section replacement, preserving untouched sections (tables, brand assets) and leading HTML comments. Fix a CRLF parsing bug where parseKvSection's anchored regex failed on carriage returns.
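
The CRLF failure mode is easy to reproduce: in JavaScript `.` does not match `\r`, so an anchored `^...(.*)$` pattern cannot reach the end of a line that still carries a carriage return. The sketch below shows one way to avoid it by normalizing line endings first. `parseKvLines` and the `Key: value` line shape are assumptions for illustration, not the actual parseKvSection.

```javascript
// Hypothetical parser; the "Key: value" line format is an assumption.
function parseKvLines(text) {
  const result = {};
  // Normalize CRLF (and lone CR) to LF so a trailing \r cannot block the `$` anchor.
  const lines = text.replace(/\r\n?/g, '\n').split('\n');
  for (const line of lines) {
    const match = /^([A-Za-z][\w ]*):\s*(.*)$/.exec(line);
    if (match) result[match[1].trim()] = match[2].trim();
  }
  return result;
}
```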
…b/prompts fallback)

Configs now resolve from the gitignored .local/configs/scout-config-{slug}.md standard location, falling back to the legacy .github/prompts/scout-config-{slug}.prompt.md. Writes go to .local only and remove any legacy copy so personal configs can never be committed.
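
The lookup order described above can be sketched as a small pure function. The two path templates come from the commit message; `resolveConfigPath` and the injected `exists` predicate are hypothetical and only serve to keep the example free of filesystem access.

```javascript
// Hypothetical resolver; `exists` is an injected predicate (e.g. fs.existsSync in real code).
function resolveConfigPath(slug, exists) {
  const preferred = `.local/configs/scout-config-${slug}.md`;
  const legacy = `.github/prompts/scout-config-${slug}.prompt.md`;
  if (exists(preferred)) return preferred; // gitignored standard location wins
  if (exists(legacy)) return legacy;       // legacy fallback
  return null;                             // no config for this slug
}
```

Injecting `exists` also makes the precedence rule straightforward to unit test without touching a developer's real .local directory.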
…nto Mindshare

loadConfig now resolves .local/configs/scout-config-{slug}.md first (legacy .github/prompts/*.prompt.md fallback) and parses Search Hashtags. Updates scout-scan.prompt.md and content-scout.agent.md to use the .local/state/browser-scan sidecar path and to explicitly route Layer 0 X/LinkedIn/Reddit items into the Mindshare section (google-news/google-web excluded).
…rsations

First-visit reliability: dashboard-enhancer loads each card via Promise.allSettled (one slow/cold endpoint no longer blanks the others); intel.js dashboard cards (sentiment/creators/source-health/social) and the Conversations view fetch via a retry wrapper so a cold server-side index build self-heals on the 2nd attempt instead of needing a manual refresh; the app.js /api GET coalescer now bypasses dedup when the caller supplies an AbortSignal so timeouts actually abort.
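
A rough sketch of the two reliability ideas above, per-card isolation with Promise.allSettled plus a small retry wrapper, might look like this. `withRetry` and `loadCards` are hypothetical names and the default of two attempts is an assumption based on the "2nd attempt" wording.

```javascript
// Hypothetical retry wrapper: rerun a failing async task up to `attempts` times.
async function withRetry(task, attempts = 2, delayMs = 0) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1 && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Load every card independently so one failing endpoint cannot blank the others.
async function loadCards(loaders) {
  const entries = Object.entries(loaders);
  const settled = await Promise.allSettled(entries.map(([, load]) => withRetry(load)));
  const out = {};
  settled.forEach((result, i) => {
    const name = entries[i][0];
    out[name] = result.status === 'fulfilled'
      ? { ok: true, data: result.value }
      : { ok: false, error: String(result.reason) };
  });
  return out;
}
```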

Freshness: the server re-warms the parsed index in the background after a successful run and after sentiment overrides, so the next dashboard load post-scan is warm, not cold. /api/sentiment-summary now surfaces the most recent report that actually classified sentiment (with a 'newest' pointer) instead of going blank when the latest scan is Mindshare/items-only; the card explains when sentiment is carried from an earlier scan.

Conversations UX: platform filter is de-duplicated via a canonicalPlatform() normalizer (X/x/'X / Twitter'/'X + Bluesky' -> X; reddit/r-subs -> Reddit; etc.) so the dropdown lists one entry per real platform; sentiment-pulse copy no longer contradicts the Community-signals count.
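
As an illustration of the normalizer idea, the sketch below maps the label variants listed above onto one canonical name and de-duplicates the result. `canonicalPlatformSketch` is a hypothetical approximation; the rules beyond the examples quoted in the commit message (for instance the LinkedIn branch) are assumptions.

```javascript
// Hypothetical approximation of canonicalPlatform(); matching rules are assumptions.
function canonicalPlatformSketch(raw) {
  const label = String(raw || '').trim();
  const s = label.toLowerCase();
  if (!s) return '';
  if (s === 'x' || s.startsWith('x ') || s.includes('twitter')) return 'X';
  if (s.includes('reddit') || s.startsWith('r-') || s.startsWith('r/')) return 'Reddit';
  if (s.includes('linkedin')) return 'LinkedIn';
  return label; // unknown platforms pass through unchanged
}

// One dropdown entry per real platform.
function uniquePlatforms(labels) {
  return [...new Set(labels.map(canonicalPlatformSketch).filter(Boolean))];
}
```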
…/state

The state libs centralize writes on STATE_DIR via paths.mjs and ignore any per-call dir, so 'node --test' read and OVERWROTE the developer's real .local/state/{muted-accounts,closed-conversations}.json and failed 2 tests via test-to-test pollution. Add test/run.mjs which sets SCOUT_LOCAL_ROOT to a throwaway temp dir before spawning node --test (paths.mjs honors that env var), and add beforeEach state-file resets in the two affected test files. npm test now points at the runner. Suite: 117 pass / 0 fail; real state files untouched.
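
The hermetic-runner idea can be sketched as below: create a throwaway directory and hand it to the test process through SCOUT_LOCAL_ROOT. This is not the contents of test/run.mjs; `makeHermeticEnv` is a hypothetical helper, and CommonJS `require` is used here only to keep the sketch short (the repo's runner is an ES module).

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: build an env whose SCOUT_LOCAL_ROOT points at a fresh temp dir.
function makeHermeticEnv(baseEnv = process.env) {
  const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'scout-test-'));
  return { tempRoot, env: { ...baseEnv, SCOUT_LOCAL_ROOT: tempRoot } };
}

// Usage sketch (not executed here): run the suite against the throwaway root.
// const { spawnSync } = require('node:child_process');
// const { env } = makeHermeticEnv();
// spawnSync(process.execPath, ['--test'], { env, stdio: 'inherit' });
```

Because the state libs ignore any per-call directory, redirecting the root through the environment is the only seam that reaches every write.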
decayDeadDashLinks used to EXEMPT x/linkedin/reddit/bsky/youtube from probing entirely (to avoid 403/999 bot-wall false negatives), so a genuinely-deleted post on those hosts stayed clickable forever -> users hit 404s. Now it probes every link and strips on a DEFINITIVE gone status (404/410) regardless of host, while still ignoring ambiguous failures (403/429/timeout) on bot-walled hosts. Probe budget raised to 5s so the server's HEAD->GET returns a definitive status on first paint (cached 1h server-side -> later loads strip instantly). Verified: /api/check-url returns {ok:false,status:404} for a deleted reddit post vs {ok:true,status:200} for a live one; a dead link in live data was auto-stripped to an inert span on load.
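
The strip-or-keep decision described above reduces to a small classification step. The sketch below assumes probe results shaped like the /api/check-url responses quoted in the commit message (`{ok, status}`); `classifyProbe` and `filterLiveLinks` are hypothetical names. How non-definitive failures on hosts without bot walls are handled is not stated in the source, so the sketch conservatively keeps them.

```javascript
// Only these statuses are treated as a definitive "gone" signal.
const GONE_STATUSES = new Set([404, 410]);

// Hypothetical classifier: strip on 404/410 regardless of host, keep everything else
// (403, 429, timeouts and other ambiguous failures are not proof of deletion).
function classifyProbe(result) {
  if (result && GONE_STATUSES.has(result.status)) return 'strip';
  return 'keep';
}

// Keep only links whose probe did not come back definitively gone.
// `probes` maps url -> probe result (missing entries count as "not yet probed").
function filterLiveLinks(urls, probes) {
  return urls.filter((url) => classifyProbe(probes[url]) === 'keep');
}
```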

Also fixed a misleading affordance: keyless backfilled samples showed 'Open in Conversations' but a plain click actually opened the (possibly dead) external source. Titles now match behavior - keyed samples open the tracked conversation in-app (source via Ctrl/Cmd-click); keyless samples say 'Open original post' and open the decay-verified source.
…re.js

First step of breaking up the 5.1k-line app.js monolith. Moves the 4 foundation helpers ($, api, escape, escapeAttr) into public/lib/core.js as ES module exports and imports them into app.js. Behavior-preserving: app.js is already type=module so these were module-scoped locals (never on window); no other script consumed them. The /api fetch coalescer still applies because api() reads the patched global fetch at call time. Verified in-browser: lib/core.js resolves over HTTP, all 4 exports present, dashboard renders (stat-reports=32 via $/api/escape), Reports renders 52 rows via renderDocListItem/escapeAttr. Establishes the import seam that subsequent page-module extractions build on.
…b/config-md.js

Moves the 12 pure markdown round-trip helpers (getMdSection, getMdSubsection, parseBulletList, bulletListBlock, stripHtmlComments, leadingComment, listSectionBody, getKvField, replaceKvField, parseKvSection, replaceMdSubsection, replaceMdSection) out of app.js into lib/config-md.js as ES exports, imported back into app.js. All 12 verified byte-identical to the originals (brace-matched diff), including the exact regex escaping. Pure functions (string in, string/array out), no DOM or shared state, so the move is behavior-preserving. Verified in-browser: loading the azure-cosmos-db config into the Configs form populates Name, Search Terms/Hashtags and 16 role-flag rows via populateConfigForm/renderRoleFlags, and the imported helpers parse the live 19.8KB config markdown correctly. Second step of de-monolithing app.js (after lib/core.js).
The Setup tools card was refactored in index.html to use a generic data-launch-cmd=scout-onboard button, removing the #setup-run-onboard element, but loadSetup still set .disabled/.title on it -> 'Cannot set properties of null' threw on every Setup view load (the onboarding button + env editor below it never wired). Null-guard the reference. Verified in-browser with cache disabled: Setup loads with zero page errors, 7 agent options, vision panel + services-detect mounted, config status rendered.
loadCfpConferences + renderCfpConferences/renderCfpRow/renderConferenceRow and the _cfpConfCache had zero callers and their DOM hooks (#cfps-body, #conferences-body, #cfp-conf-status) no longer exist in index.html. The May 2026 IA refactor moved CFP/Conference data into the dashboard (dashboard-enhancer.js fetches /api/cfp-conferences directly) and the Reports view, orphaning this tab block. Removes 96 lines of dead code; the /api/cfp-conferences endpoint and its live consumer are untouched. Verified in-browser (hard reload): all 8 views load with zero page/console errors.
…letion

A standalone browser scan (the Run-view 'Force-rescan' button -> POST /api/browser-scan/scan, or the raw CLI) wrote X/LinkedIn/Reddit sidecars but never folded them into a report, so 100+ fresh posts sat invisible and the Community-signals card looked stale until a separate /scout-scan. Now a completed scan chains a /scout-scan agent run (browserScan:'skip', scoped to the slug) so 'run completes -> data ingested' is the default. Opt out with {ingest:false}.

Gate is sidecar-freshness, NOT exit code: browser-scan routinely exits 1 on a Google News timeout AFTER writing all three social sidecars, so keying off code===0 would strand good posts. The close handler counts -(x|linkedin|reddit).json files mtime'd at/after spawn; >=1 fresh sidecar -> ingest (even on exit 1, with a note), 0 fresh (e.g. hard CDP failure) -> skip. Requires a configured runner; without one it logs a hint pointing at the dashboard ingest button.
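
The freshness gate can be sketched as a pure decision function over a directory listing. The filename pattern and the ">=1 fresh sidecar" rule come from the paragraph above; `countFreshSidecars`, `decideIngest`, the input shape, and the `reason` strings are hypothetical.

```javascript
// Social sidecar filenames end in -x.json, -linkedin.json or -reddit.json.
const SOCIAL_SIDECAR = /-(x|linkedin|reddit)\.json$/;

// `files` is an assumed shape: [{ name, mtimeMs }], e.g. built from fs.statSync results.
function countFreshSidecars(files, spawnedAtMs) {
  return files.filter((f) => SOCIAL_SIDECAR.test(f.name) && f.mtimeMs >= spawnedAtMs).length;
}

// Decide whether to chain an ingest run. The exit code only affects the note,
// never the decision itself.
function decideIngest({ exitCode, files, spawnedAtMs, hasRunner, ingest = true }) {
  if (!ingest) return { ingest: false, reason: 'opted-out' };
  const fresh = countFreshSidecars(files, spawnedAtMs);
  if (fresh === 0) return { ingest: false, reason: 'no-fresh-sidecars' };
  if (!hasRunner) return { ingest: false, reason: 'no-runner' };
  return { ingest: true, reason: exitCode === 0 ? 'ok' : 'fresh-sidecars-despite-nonzero-exit' };
}
```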

Also adds the dashboard pending-sidecar hint as the safety net for the no-runner / not-yet-ingested window: GET /api/browser-scan/pending?slug= compares newest social sidecar stamp vs latest report stamp and returns per-platform pending counts; intel.js renders a '.dash-pending-hint' banner atop the social-activity card (slug falls back to the latest report's slug since activeRoleSlug() is empty on a cold dashboard). 117 web-ui tests pass.
…rnings

The scanner can write good X/LinkedIn/Reddit sidecars and then see a late async warning from a slower surface such as Google. In Node's default unhandled-rejection mode that can make the process look failed even though the persisted sidecars are valid, which in turn made web-ui completion logic treat fresh social results as suspect. Add a scan-mode unhandledRejection handler that logs the warning and lets the explicit scan completion decide the result; non-scan commands still fail fast. If a warning occurred, print a clear 'sidecars above are still usable' note before the final Done line. Syntax checked with node --check.
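
A minimal sketch of the scan-mode handler described above follows. It relies on Node's documented `'unhandledRejection'` process event; `installScanRejectionHandler`, its parameters, and the log wording are hypothetical. The process object is passed in so the behavior can be exercised without crashing a real process.

```javascript
// Hypothetical installer. In scan mode, log late rejections and let the explicit
// scan completion decide the outcome; otherwise fail fast.
function installScanRejectionHandler(proc, isScanMode, log) {
  const state = { warned: false };
  proc.on('unhandledRejection', (reason) => {
    if (!isScanMode) {
      log(`fatal: unhandled rejection: ${reason}`);
      proc.exit(1);
      return;
    }
    state.warned = true;
    log(`warning: late async rejection ignored: ${reason}`);
  });
  return state;
}

// Usage sketch (assumed wiring, not executed here):
// const state = installScanRejectionHandler(process, true, console.warn);
// ...after the scan: if (state.warned) console.log('sidecars above are still usable');
```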
Headless /scout-scan runs were still instructed to search .github/prompts/scout-config-*.prompt.md first, so auto-ingest could miss the real .local/configs/scout-config-{slug}.md and wander into example configs. Update the VS Code agent, scout-scan prompt, Copilot instructions, and Claude skill note to prefer .local/configs/scout-config-*.md with legacy .github/prompts fallback.
The dashboard should not wait for an agent-written report before showing results from a completed browser scan. Add fresh X/LinkedIn/Reddit sidecars into the server index as synthetic conversation rows when their stamp is newer than the latest report for that slug, and include sidecar mtimes in the index signature so the cache invalidates immediately. Rows are deduped with convoKey and use a synthetic {stamp}-{slug}-content.md report value so existing freshness and grouping code works.
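
The merge step above can be illustrated with a small pure function. The synthetic report name format `{stamp}-{slug}-content.md` and the "newer than the latest report" rule come from the commit message; `rowKey` is a hypothetical stand-in for convoKey, and the row and sidecar shapes are assumptions. Stamps are compared as strings, which works for zero-padded values like 2026-06-10-1517.

```javascript
// Hypothetical stand-in for convoKey: identify a conversation by platform + URL.
function rowKey(row) {
  return `${row.platform}|${row.url}`.toLowerCase();
}

// Append sidecar items as synthetic rows when the sidecar is newer than the latest report.
// Assumed shapes: reportRows = [{ platform, url, report }],
// sidecars = [{ platform, stamp, items: [{ url }] }].
function mergeSidecarRows(reportRows, sidecars, latestReportStamp, slug) {
  const seen = new Set(reportRows.map(rowKey));
  const merged = [...reportRows];
  for (const sidecar of sidecars) {
    if (!(sidecar.stamp > latestReportStamp)) continue; // report already covers it
    for (const item of sidecar.items) {
      const row = {
        platform: sidecar.platform,
        url: item.url,
        report: `${sidecar.stamp}-${slug}-content.md`,
        synthetic: true,
      };
      const key = rowKey(row);
      if (seen.has(key)) continue; // dedupe against report rows and earlier sidecars
      seen.add(key);
      merged.push(row);
    }
  }
  return merged;
}
```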

Direct sidecar indexing is filtered before surfacing: use the shared hiring-filter.mjs, add recruiter-format phrases that caught a LinkedIn vendor job post, and require phrase-level relevance for Azure Cosmos DB / DocumentDB instead of bare 'cosmos'. Removed the temporary pending-sidecar banner from the dashboard because fresh sidecars are now visible directly. Also switched social activity sample sorting to parsed dates so ISO sidecar dates beat older month-name report dates. Verified: 64 sidecar-derived rows (X 46, LinkedIn 8, Reddit 10), latest synthetic stamp 2026-06-10-1517, hiringHits=0, syntax OK, diff check OK, web-ui tests 117 pass / 0 fail.
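
Phrase-level relevance, as opposed to matching the bare word "cosmos", could be approximated like this. The phrase list is an assumption made for illustration; the actual filter and the shared hiring-filter.mjs are not reproduced here, and `isRelevantToCosmosDb` is a hypothetical name.

```javascript
// Assumed phrase list; the real filter's phrases are not shown in the source.
const RELEVANT_PHRASES = [
  /\bazure cosmos db\b/i,
  /\bcosmos ?db\b/i,
  /\bdocumentdb\b/i,
];

// True only when a full product phrase appears, so posts about astronomy
// (bare "cosmos") are not treated as relevant.
function isRelevantToCosmosDb(text) {
  return RELEVANT_PHRASES.some((re) => re.test(text));
}
```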
…n-2026-06-11

# Conflicts:
#	.github/prompts/scout-trends.prompt.md
@jaydestro jaydestro merged commit a27c77e into main Jun 11, 2026
5 checks passed
@jaydestro jaydestro deleted the integration/local-main-2026-06-11 branch June 11, 2026 15:17
