
release: 0.8.4 #316

Merged: neuromechanist merged 39 commits into main from develop on Jun 8, 2026

@neuromechanist

Member

Release: develop -> main (0.8.4)

Promotes 0.8.4.dev18 to a stable 0.8.4 release. CI strips the .dev suffix on merge.

Highlights

Widget feedback

Papers / live search

Community admin auto-merge

Assistants / docs

CI

Testing

  • uv run --all-extras pytest: 1820 passed, 35 skipped.
  • 11 failures are environmental only (live hedtools.org service timeouts + an ANSI-color rendering artifact in the CLI version test); none touch files changed in this release.
  • ruff check: clean.

neuromechanist and others added 30 commits May 14, 2026 11:29
Manual recovery: sync-develop.yml could not run because develop was
auto-deleted after the v0.8.3 release PR merged (repo had
delete_branch_on_merge=true). Reproducing what that workflow would
have done: develop = main + patch dev0 bump.
…ng (#279)

* feat: document versioned widget URL and SRI hashes for secure embedding

Add security-sensitive embedding section to the widget demo page showing
the versioned jsDelivr URL and SRI hash approach as an alternative to the
default (always-latest) URL. The default remains the recommended approach;
the versioned option is clearly labeled for supply-chain-secure workflows.

Also add scripts/widget-sri.py to generate SHA-384 SRI hashes for any
release, useful for maintainers publishing release notes.
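
A minimal sketch of how a SHA-384 SRI value can be computed, assuming the widget file's bytes are already in hand. This is not the contents of scripts/widget-sri.py; the function names `sri_sha384` and `script_tag` are hypothetical and chosen for illustration only.

```python
import base64
import hashlib


def sri_sha384(content: bytes) -> str:
    """Return a Subresource Integrity value ("sha384-<base64 digest>")."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def script_tag(url: str, content: bytes) -> str:
    """Build a <script> tag pinned to one exact file via its SRI hash.

    Hypothetical helper: the URL is whatever versioned URL the page embeds.
    """
    return (
        f'<script src="{url}" integrity="{sri_sha384(content)}" '
        'crossorigin="anonymous"></script>'
    )
```

The browser recomputes the hash of the fetched file and refuses to run it on mismatch, which is why this only makes sense with a versioned URL whose content never changes.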

Closes #268

Co-authored-by: Seyed (Yahya) Shirazi <neuromechanist@users.noreply.github.com>

* Enhance release notes with widget SRI hash

Append widget SRI hash to release notes and update embedding instructions.

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Seyed (Yahya) Shirazi <neuromechanist@users.noreply.github.com>
* feat: per-response and general widget feedback

Widget: thumbs up/down under each assistant reply (sentiment only,
one-vote-lock, keyed to a new per-response request_id) plus a
Send-feedback footer link opening a free-text modal for general feedback.

Backend: anonymous POST /feedback (worker already proxies to it) writing
to a new feedback_log table in the existing metrics.db (auto-created via
CREATE TABLE IF NOT EXISTS; rides the osa-data volume). request_id is now
exposed on the chat/ask done event and ChatResponse/AskResponse. page_url
is scheme-validated to prevent stored XSS in the admin view.
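
The scheme check on page_url could look roughly like the sketch below. It assumes only absolute http(s) URLs are acceptable; `ALLOWED_SCHEMES` and `is_safe_page_url` are hypothetical names, not the project's actual validator.

```python
from urllib.parse import urlsplit

# Assumption: the widget is only ever embedded in ordinary web pages.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_page_url(page_url: str) -> bool:
    """Accept only absolute http(s) URLs that carry a host.

    Rejecting javascript:, data:, ftp: and similar schemes keeps a stored
    value from becoming an active link when an admin view renders it.
    """
    try:
        parts = urlsplit(page_url.strip())
    except ValueError:  # malformed input, e.g. an unbalanced IPv6 bracket
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)
```

An allowlist is used rather than a blocklist so that an unanticipated scheme is rejected by default.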

Admin: GET /metrics/feedback (per-community scoped key, strictly confined)
and a Feedback panel on the status dashboard.

Tested: 23 new tests (db/queries + endpoint scoping); ruff clean.

Closes #282

* refactor: address PR #284 review findings

- FeedbackEntry: narrow to Literal types + __post_init__ enforcing the
  per-type invariants at the storage layer (last checkpoint before SQLite);
  add CHECK constraints + a UNIQUE index on feedback_id to the schema.
- feedback.py: move sentiment/comment normalization into a mode=before
  validator (drop object.__setattr__); remove the unreachable manual
  whitelist checks and the dead try/except (write_feedback never raises).
- write_feedback: acquire the connection inside the try so it honors its
  never-raises contract on a connect failure.
- Widget: submitResponseFeedback now awaits the POST and rolls back the
  optimistic vote (with an error) if it fails, instead of silently dropping.
- Tests: replace the mock-based fixture with a DATA_DIR env fixture; add
  __post_init__ invariant tests, write-failure counter + CRITICAL escalation,
  comments_only/offset/limit-floor coverage, data:/ftp page_url rejection,
  general-sentiment-stripping, request_id round-trip, cross-community admin
  aggregation; fix the no-op test_limit_clamped.
- Docs/comments: db module docstring, warning SSE event, request_id wording,
  widget/dashboard comment and error-message fixes.
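
The storage-layer invariants from the first bullet can be sketched as a dataclass whose `__post_init__` re-checks each feedback type. The field names (`kind`, `sentiment`, `comment`, `request_id`) and the exact rules are assumptions made for illustration; the real FeedbackEntry may differ.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class FeedbackEntry:
    kind: Literal["response", "general"]
    sentiment: Optional[Literal["up", "down"]] = None
    comment: Optional[str] = None
    request_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Literal annotations are not enforced at runtime, so validate here.
        if self.kind not in ("response", "general"):
            raise ValueError(f"unknown kind: {self.kind!r}")
        if self.kind == "response":
            if self.sentiment not in ("up", "down"):
                raise ValueError("response feedback needs sentiment up/down")
            if not self.request_id:
                raise ValueError("response feedback needs a request_id")
        else:
            if self.sentiment is not None:
                raise ValueError("general feedback carries no sentiment")
            if not (self.comment or "").strip():
                raise ValueError("general feedback needs a non-empty comment")
```

Checking again at this layer means a row that slips past request validation still cannot reach SQLite in an inconsistent shape.
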
* feat: community-admin scoped PR auto-merge

New workflow approves + squash auto-merges a PR only when the author is in
the target community's maintainers list (read from the BASE branch), the
diff touches only src/assistants/<id>/**, and the PR does not edit the
maintainers field. Uses a dedicated GitHub App token (not the CI PAT);
pull_request_target with base-only checkout never runs PR-head code;
required CI checks still gate the merge. Community-id is whitelisted before
interpolation. Includes a setup/trust-model doc.

Validated: YAML + every embedded bash step (bash -n).

Closes #283

* fix: address PR #285 security review

- permissions: add pull-requests:read (the eligibility step reads the PR
  file list with github.token; with explicit permissions the unspecified
  scope is 'none', which would 403 every PR).
- Scope bypass fix: read the changed-file list from the paginated
  /pulls/{n}/files API and check BOTH .filename and .previous_filename.
  gh pr diff --name-only omitted a rename's OLD path (letting a file be
  moved from outside the community dir into it) and truncated at 300 files.
- Injection hardening: pass user.login and head.sha via env vars instead of
  interpolating ${{ }} into the run-step shell body.
- A config.yaml deleted/renamed-away is now caught (its old path appears in
  the file list, the head-config fetch 404s, and the PR is held for review).
- Checkout: drop fetch-depth:0 (only BASE_SHA is needed); fix the comment.
- Comment/doc accuracy: GITHUB_TOKEN approval restriction wording, the
  what-it-does step ordering (maintainers-field check before author check),
  and the setup-incomplete behavior wording.
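
The rename bypass described in the scope-fix bullet can be illustrated with a small sketch. It assumes the file entries have the shape returned by GitHub's "list pull request files" endpoint (`filename`, plus `previous_filename` on renames); the helper names are hypothetical, and the real workflow performs this check in a shell step.

```python
from typing import Iterable, Mapping


def paths_touched(files: Iterable[Mapping[str, str]]) -> set[str]:
    """Collect every path a PR touches, including the old side of renames."""
    paths: set[str] = set()
    for entry in files:
        paths.add(entry["filename"])
        old = entry.get("previous_filename")  # only present on renamed files
        if old:
            paths.add(old)
    return paths


def within_community(files: Iterable[Mapping[str, str]], community_id: str) -> bool:
    """True only if every touched path lives under the community's directory."""
    prefix = f"src/assistants/{community_id}/"
    touched = paths_touched(files)
    # An empty file list is treated as ineligible rather than vacuously in scope.
    return bool(touched) and all(path.startswith(prefix) for path in touched)
```

A rename from src/core/x.py into the community directory reports only the new path as `filename`, so checking `previous_filename` as well is what closes the hole.
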
* feat: optional 'what went wrong' box on thumbs-down

Thumbs-down now reveals a small inline comment box ('What went wrong?',
optional) with Send/Skip under that reply, instead of showing only 'Thanks!'.
Thumbs-up is unchanged (one-click count).

The down-vote is deferred until the user sends or skips, and is flushed
(committed bare) when they send a new message, reset, or close the widget,
so leaving the box open never drops the vote. The optional comment rides on
the same single response/down row (linked by request_id) -- no double count,
no backend change. Optimistic with rollback + error on POST failure.

Closes #286

* fix: address PR #287 review (confirm-then-commit feedback)

- commitResponseFeedback now confirms before committing: feedbackCommitted
  (the 'Thanks!' state) is set only AFTER the POST succeeds, so a failure
  never leaves a false success in the UI or in localStorage.
- A flush on send/reset/close is best-effort (interactive=false): it never
  writes errors into a tearing-down or hidden UI; a pending down stays pending
  and retries on the next flush. Interactive Send/Skip/up surface failures and
  keep the box (with the typed comment) for retry.
- saveHistory strips transient flags (_feedbackCommitting/_feedbackJustOpened/
  feedbackDraft) and never persists an unconfirmed vote.
The footer line ('Send feedback · Powered by OSA vX') overflowed the
window next to the page-URL toggle. Shorten the default disclaimer to
'This is an AI assistant and may make mistakes.' and move the Send
feedback link to the end of that line (flex space-between). The footer
keeps only 'Powered by OSA vX'.

Closes #288
The community-admin auto-merge always used --squash, but the protect-main
ruleset only allows merge-commits, so a community PR to main would fail at
the merge step. Choose the method by base ref: --squash into develop
(feature-branch convention) and --merge into main (release convention).
Updates the confirmation comment, header comment, and setup doc to match.
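
As a sketch, the base-ref rule amounts to a two-way mapping. The function name `merge_flag` is hypothetical, and refusing any other base branch is an assumption added here, not something the commit states.

```python
def merge_flag(base_ref: str) -> str:
    """Pick the merge-method flag that the target branch accepts."""
    if base_ref == "develop":
        return "--squash"  # feature-branch convention
    if base_ref == "main":
        return "--merge"  # release convention; main only allows merge commits
    # Assumption: any other base branch is out of scope for the workflow.
    raise ValueError(f"unsupported base branch: {base_ref!r}")
```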

Closes #290
Grants the OSC owner scoped community-admin merge rights on these three
communities and enables end-to-end testing of the auto-merge flow.

Closes #292
…ge comment (#298)

* feat(ci): merge community PRs into develop on a maintainer's LGTM/merge comment

Replace auto-merge-on-open with an explicit comment command: a community
maintainer merges a scoped PR into develop by commenting both 'LGTM' and
'merge'. Triggers on issue_comment; authorizes the COMMENTER against the
target community's base-branch maintainers; squash-merges pinned to the
commented head (--match-head-commit) so a later push needs a fresh comment.
Develop only -- an OSA admin merges develop to main. All prior guards remain:
single-community path scope, maintainers-field-edit exclusion, community-id
whitelist, never run PR-head code, App token for writes. Docs rewritten.

Closes #297

* harden(ci): guard negated 'merge' comments + pin pyyaml

Address PR #298 security review: reject comments that say 'do not merge'/
'don't merge' even if they also contain LGTM+merge; pin pyyaml==6.0.2 in the
privileged eval step.
The App's confirmation comment contains 'LGTM / merge' and re-triggered the
issue_comment workflow (which then no-op'd: bot is not a maintainer). Skip
comments where comment.user.type == 'Bot' so the App's own comment no longer
spawns an extra run.
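
The comment-command rules from these commits (both words required, negated phrasing rejected, bot comments ignored) can be sketched as below. The regular expressions and the name `is_merge_command` are illustrative assumptions; the actual workflow's matching may be stricter or looser.

```python
import re

# Illustrative patterns; word boundaries keep "merged" from counting as "merge".
_NEGATED = re.compile(r"\b(?:do\s+not|don'?t)\s+merge\b", re.IGNORECASE)
_LGTM = re.compile(r"\blgtm\b", re.IGNORECASE)
_MERGE = re.compile(r"\bmerge\b", re.IGNORECASE)


def is_merge_command(body: str, user_type: str = "User") -> bool:
    """Decide whether a PR comment should be treated as a merge request."""
    if user_type == "Bot":
        # The App's own confirmation comment must not re-trigger the flow.
        return False
    if _NEGATED.search(body):
        # "LGTM but do not merge yet" contains both keywords; reject it.
        return False
    return bool(_LGTM.search(body) and _MERGE.search(body))
```

Passing this check would only be the first gate; the commenter still has to be authorized against the community's maintainers list.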

Closes #302
* fix(search): match multi-word FTS queries

Replace blanket phrase-wrapping in _sanitize_fts5_query with tokenize ->
drop stopwords/operators -> quote each term -> OR. Multi-word queries
(every query the agent sends for papers/discussions/FAQ/docstrings) were
exact-phrase matched and returned nothing despite populated databases.

Still injection-safe (each term individually quoted); results ordered by
existing BM25 rank. Affects all 6 knowledge-search call sites.
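
The tokenize, filter, quote, OR pipeline can be sketched as follows. The stopword list is a small illustrative sample and the function is a stand-in, not the project's `_sanitize_fts5_query`.

```python
import re

# Illustrative sample only; the real stopword list is not shown in the source.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "to", "is", "how", "what"}
FTS5_OPERATORS = {"and", "or", "not", "near"}


def sanitize_fts5_query(query: str) -> str:
    """Turn free text into an OR of individually quoted FTS5 terms."""
    # \w+ keeps only word characters, so quotes, '*', ':' and parentheses
    # from user input never reach the MATCH expression.
    tokens = re.findall(r"\w+", query.lower())
    terms = [t for t in tokens if t not in STOPWORDS and t not in FTS5_OPERATORS]
    return " OR ".join(f'"{t}"' for t in terms)
```

With phrase-wrapping, a query such as "list channels" only matches rows containing that exact phrase; here it becomes `"list" OR "channels"`, and rank ordering pushes rows matching both terms to the top. An empty result means nothing searchable was left, so a caller would skip the MATCH in that case.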

Closes #305

* test(search): harden operator test; keep list/use terms

Address PR review:
- strengthen test_sanitize_fts5_operators to assert every term is
  individually quoted (no bare operator can reach MATCH)
- drop 'list' and 'use' from stopwords (meaningful EEGLAB/MATLAB nouns)
  so multi-word queries like 'list channels' don't silently lose a term

* test(bep): use non-matching query for no-results case

BEP keyword search is OR-based and rank-ordered after the FTS fix, so the
old phrase-only no-match query ('...data type...') now matches BEPs that
mention 'data'. Use a genuinely non-matching query to keep the
no-results path covered.
)

* refactor(papers): use opencite for paper sync

Replace the hand-rolled OpenAlex/Semantic Scholar/PubMed fetchers (and
inverted-index reconstruction) with the opencite multi-source client,
which aggregates and deduplicates across sources. Public sync function
signatures are unchanged, so the CLI and scheduler call them as before;
only the fetch layer is swapped. opencite Paper objects map to stable
(source, external_id) pairs compatible with existing rows.

Declares opencite>=0.5.2 as a server dependency, which also attributes
OSA to neuromechanist/opencite in GitHub's dependency graph.

Config is constructed directly (not Config.from_env) so sync never
depends on ambient .env files in the working directory.

Closes #307

* refactor(papers): share one opencite client per batch; guard async bridge

Address PR review:
- sync_all_papers/sync_citing_papers now open a single SearchOrchestrator/
  CitationExplorer for the whole batch instead of one per query/DOI (each
  open spins up 11 HTTP clients). Per-item errors are isolated so one bad
  query/DOI no longer aborts the batch.
- Add _run() helper that uses asyncio.run normally but offloads to a worker
  thread if a loop is already running, so the public sync functions are safe
  to call from any context (CLI, scheduler thread, or future async caller).
- Test _run in both the no-loop and running-loop paths.
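
A minimal sketch of the async bridge described in the second bullet, assuming a single-worker thread pool is an acceptable way to host the extra event loop. `run_sync` is a hypothetical stand-in for the project's `_run` helper.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop in this thread: asyncio.run is safe to call directly.
        return asyncio.run(coro)
    # A loop is already running in this thread, where asyncio.run would raise.
    # Run the coroutine on a fresh loop in a worker thread and block for it.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

Note that the running-loop path blocks the calling loop until the worker finishes, which is tolerable for a sync entry point but would not suit a latency-sensitive async handler.
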
* feat(papers): live on-demand paper search via opencite

Adds search_<community>_papers_live, an opt-in tool that queries opencite
for the most recent literature (newest first) when the user asks for
recent/new papers or the local index comes up short. Results are
best-effort cached into the community DB so future local searches find
them. Bounded by a timeout to keep chat responsive.

- search_papers_live() in papers_sync.py reuses the opencite client +
  _run async bridge; keys read from the server's env.
- Gated by citations.live_search (new config flag, default on) wherever
  paper citations are configured.
- EEGLAB prompt guides the agent to use it for recency.

Closes #308

* fix(papers): non-blocking cache + cleaner live-search timeout

Address PR review:
- cache live-search results in a background daemon thread so a chat
  response is never delayed (was synchronous; could block up to the
  SQLite busy timeout if the scheduler was mid-sync on the same DB).
- set opencite per-request timeout just under the overall cap so each
  source finishes/times out cleanly before wait_for cancels, avoiding
  orphaned-task error noise on the timeout path.
- test caching deterministically via _cache_papers_async (offline, real
  SQLite); live network test now just asserts result shape.
github-actions Bot and others added 7 commits June 5, 2026 08:34
* feat(papers): local-first live search, confirmed, OpenAlex-only

Dogfooding showed live search felt slow and fired too eagerly.

- Live search now queries OpenAlex only (LIVE_SOURCES) instead of also
  waiting on Semantic Scholar (~1 req/s) and PubMed; latency drops from
  tens of seconds to ~1s. Batch sync still uses all three.
- Lower default timeout to 15s.
- Tool description + EEGLAB prompt require local-first: always search the
  indexed library first, OFFER a live search and wait for confirmation,
  and announce 'this might take a minute' before running it. Never call
  the live tool as the first action on a paper question.

Closes #312

* docs(tool): align live-search description with announce-then-call framing

Address PR review: a tool description can't enforce pre-call sequencing,
so phrase the announce guidance as 'when you call it, your message in that
turn should first tell the user...' - matters most for communities that
rely on the description without extra prompt guidance.

* fix(papers): address full PR review on live search

silent-failure-hunter:
- narrow live-search error handling: APIKeyError/ConfigurationError -> error
  log, OpenCiteError -> warning, both return []; let programming errors
  propagate instead of masquerading as 'no results'.
- escalate cache-write failure to logger.error(exc_info=True) (a lost cache
  write is a real degraded state).
- tool 'no results' message no longer falsely implies a timeout.

comment-analyzer:
- correct overstated duration: ~15s cap, not 'a minute' (tool description +
  EEGLAB prompt say 'a few seconds').
- tighten DEFAULT_SOURCES/LIVE_SOURCES rationale comments.

pr-test-analyzer:
- add TestSourceConstants (live = OpenAlex-only, strict subset of batch
  sources); exercise the new sources param + production timeout in the live
  test.
The 'Append widget SRI hash' step put its markdown heredoc at 6-space
indentation inside a 'run: |' block whose content base is 10 spaces. That
ended the YAML block scalar early and the parser hit '**Standard**' as a
YAML alias (ScannerError at line 77), making the whole workflow file
invalid. GitHub then emitted a startup_failure run on EVERY push (develop
and feature branches), regardless of the tags-only trigger.

Rewrite the step to pass version/hash via env vars into a quoted heredoc
and build the section as an explicit list of lines, so all content stays
within the block scalar. Verified: yaml.safe_load parses, trigger is still
push.tags v*, and the rendered notes section is correct.

Closes #311
opencite 0.5.3 ships the search-task cancellation fix (neuromechanist/opencite#41):
search() now cancels and drains pending per-source tasks in a finally block, so a
wait_for timeout no longer orphans tasks that later touch closed HTTP clients.

Bump the floor to >=0.5.3 to require the fix and refresh uv.lock (0.5.2 -> 0.5.3).
No other packages change.
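
The cancel-and-drain pattern this commit refers to can be sketched with plain asyncio. This is an illustration of the technique under assumed names (`search_all`, `sources`), not opencite's actual implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence


async def search_all(
    sources: Sequence[Callable[[str], Awaitable[Any]]], query: str
) -> list:
    """Query every source concurrently and clean up tasks on any exit path."""
    tasks = [asyncio.create_task(fetch(query)) for fetch in sources]
    try:
        return await asyncio.gather(*tasks)
    finally:
        # Reached on success, on error, and when an outer wait_for cancels us.
        for task in tasks:
            if not task.done():
                task.cancel()
        # Drain: wait until every task has actually finished unwinding, so
        # none of them can later touch an HTTP client that is already closed.
        await asyncio.gather(*tasks, return_exceptions=True)
```

A caller would wrap this as `await asyncio.wait_for(search_all(sources, "eeg"), timeout=15)`; on timeout, every per-source task has finished before the timeout error reaches the caller.
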
@github-actions

github-actions Bot commented Jun 8, 2026

Contributor

Dashboard Preview

| Name | Link |
| --- | --- |
| Preview URL | https://develop.osa-dash.pages.dev |
| Branch | develop |
| Commit | cbf93a7 |

This preview will be updated automatically when you push new commits.

@github-actions

github-actions Bot commented Jun 8, 2026

Contributor

Preview Deployment

| Name | Link |
| --- | --- |
| Preview URL | https://develop-demo.osc.earth |
| Branch | develop |
| Commit | cbf93a7 |

This preview will be updated automatically when you push new commits.

neuromechanist and others added 2 commits June 8, 2026 11:02
* feat(papers): make live_search opt-in (default off), enable for EEGLAB

Live web paper search added external-API latency to every community with a
citations block. Default to False so communities opt in explicitly; enable it
for EEGLAB, whose prompt already tells the agent to ask before running it.

* fix(papers): harden opencite sync error handling

Narrow the broad except in the batch search/citation loops to
(OpenCiteError, TimeoutError) for expected API failures; log any other
exception with a full traceback (logger.exception) so a bug no longer
masquerades as a routine 'no results'. Isolate per-query/per-DOI _store_papers
so one DB failure cannot abort the batch or desync metadata. Add type hints to
_run.
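
The error-handling shape described here can be sketched as a batch loop with two tiers of `except`. `OpenCiteError` below is a local stand-in class so the sketch runs on its own, and `sync_batch`, `fetch`, and `store` are hypothetical names rather than the project's functions.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("papers_sync_sketch")


class OpenCiteError(Exception):
    """Stand-in for the client library's base error (assumption)."""


def sync_batch(
    queries: Iterable[str],
    fetch: Callable[[str], List[str]],
    store: Callable[[List[str]], int],
) -> int:
    """Fetch and store per query; one failure never aborts the batch."""
    stored = 0
    for query in queries:
        try:
            papers = fetch(query)
        except (OpenCiteError, TimeoutError) as exc:
            # Expected API failure: note it and move on.
            logger.warning("fetch failed for %r: %s", query, exc)
            continue
        except Exception:
            # Likely a bug: keep the batch alive but record a full traceback.
            logger.exception("unexpected error while fetching %r", query)
            continue
        try:
            stored += store(papers)
        except Exception:
            # A storage failure for one query must not stop the others.
            logger.exception("storing results for %r failed", query)
    return stored
```

Separating the tiers keeps routine outages quiet in the logs while making genuine defects stand out with a traceback.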

* chore(deps): drop direct pyalex dependency

pyalex is no longer imported anywhere; opencite pulls it in transitively. Remove
the redundant direct declaration and refresh the stale OpenAlex comment.

* fix(feedback): return 422 not 500 on FeedbackEntry invariant drift

FeedbackRequest and FeedbackEntry validate the same invariants independently;
if they ever drift, guard the construction so a bad shape returns a clean 422
instead of an unhandled 500. Add a whitespace-only general-comment rejection test.

* fix(widget): timeout and double-submit guard on general feedback

postFeedback now aborts after 10s instead of hanging indefinitely, and the
general-feedback Send button is disabled while the POST is in flight so a
double-click cannot store two rows.

* style: ruff-format widget-sri.py
@neuromechanist neuromechanist merged commit 2213e8c into main Jun 8, 2026
24 checks passed
@neuromechanist neuromechanist deleted the develop branch June 8, 2026 18:12
2 participants