fix: scroll anchor consistency, lookahead, speaking hysteresis, and word-tracking freeze #64

Open
samgutentag wants to merge 2 commits into f:master from samgutentag:fix/voice-activated-scroll-anchor

@samgutentag

Summary

Four related fixes for auto-scroll and word tracking, found while debugging why the teleprompter kept lagging behind and jumping around during real use.

1. Manual scroll release snapped the text by half the window height

wordProgressAtCurrentOffset() computed the resume word from the viewport center, but smooth-mode scrolling (recalcCenter) anchors the active word near the bottom. So releasing a trackpad scroll immediately re-anchored the center word at the bottom, jumping the text by ~half the container height. Both paths now share a single readingAnchorY() helper so the resume word stays where the user left it.
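
To illustrate why the two paths have to agree, here is a minimal sketch under simplifying assumptions: one word per fixed-height line and integer point values. Only the name `readingAnchorY()` comes from this PR; the `Layout` type, its other methods, and all numbers are hypothetical and are not the app's actual code.

```swift
// Hypothetical, simplified model: one word per fixed-height line, integer points.
struct Layout {
    let viewportHeight: Int
    let lineHeight: Int

    // Shared anchor: distance from the top of the viewport to the active word.
    // The bottom-minus-20pt value mirrors the smooth-mode anchor described above.
    func readingAnchorY() -> Int {
        return viewportHeight - 20
    }

    // Scroll offset that pins `word` to the anchor line.
    func scrollOffset(forWord word: Int) -> Int {
        return word * lineHeight - readingAnchorY()
    }

    // Word found at vertical position `probeY` inside the viewport.
    func word(atOffset offset: Int, probeY: Int) -> Int {
        return (offset + probeY) / lineHeight
    }
}

let layout = Layout(viewportHeight: 400, lineHeight: 20)
let releasedOffset = layout.scrollOffset(forWord: 50)  // where the user let go

// Mismatched: resume word probed at the viewport center.
let centerWord = layout.word(atOffset: releasedOffset, probeY: layout.viewportHeight / 2)
// Consistent: resume word probed at the same anchor the scroller uses.
let anchorWord = layout.word(atOffset: releasedOffset, probeY: layout.readingAnchorY())

// How far the text moves once the scroller re-anchors the resume word.
let jumpWithCenter = layout.scrollOffset(forWord: centerWord) - releasedOffset
let jumpWithAnchor = layout.scrollOffset(forWord: anchorWord) - releasedOffset
print(centerWord, anchorWord, jumpWithCenter, jumpWithAnchor)
```

With these assumed numbers the center probe resumes at word 41 and shifts the text by 180pt, close to half of the 400pt viewport, while the shared anchor resumes at word 50 with no movement.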

2. Zero lookahead in smooth modes

The smooth-mode anchor sat 20pt above the bottom edge, so every word past the timer position was below the window. If the speaker got even slightly ahead of the configured words/sec rate, the words they needed next were off-screen (and in silence-paused mode the timer can never catch up, so the lag compounds). The anchor now sits at 70% of the viewport height, keeping a couple of upcoming lines visible while still showing read text above.
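
The lookahead difference is simple arithmetic. The sketch below counts how many whole lines fit between the anchor and the bottom edge. The helper name `lookaheadLines`, the 200pt viewport, and the 28pt line height are assumptions chosen for illustration, not values taken from the app.

```swift
// Hypothetical helper: whole lines that fit between the anchor and the bottom edge.
func lookaheadLines(viewportHeight: Int, anchorY: Int, lineHeight: Int) -> Int {
    return max(0, viewportHeight - anchorY) / lineHeight
}

let viewportHeight = 200   // assumed overlay height in points
let lineHeight = 28        // assumed line height in points

// Old anchor: 20pt above the bottom edge.
let oldAnchorY = viewportHeight - 20
// New anchor: 70% of the viewport height (integer math avoids rounding noise).
let newAnchorY = viewportHeight * 7 / 10

let oldLookahead = lookaheadLines(viewportHeight: viewportHeight, anchorY: oldAnchorY, lineHeight: lineHeight)
let newLookahead = lookaheadLines(viewportHeight: viewportHeight, anchorY: newAnchorY, lineHeight: lineHeight)
print(oldLookahead, newLookahead)
```

Under these assumptions the old anchor leaves no full line below the active word, and the 70% anchor leaves two.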

3. isSpeaking flickered around its threshold

isSpeaking was a single avg > 0.08 check over recent audio levels, so a voice hovering near the threshold rapidly started/stopped the silence-paused scroll timer, making the scroll stutter. It now uses hysteresis (turns on above 0.08, off below 0.05) and resets when the audio tap is removed so it can't freeze at a stale true.
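
A minimal sketch of the hysteresis described above, compared against a single threshold. The 0.08 and 0.05 values come from this PR; the `SpeakingDetector` type, its method names, and the sample levels are hypothetical.

```swift
// Hypothetical detector: turns on above 0.08, turns off only below 0.05.
struct SpeakingDetector {
    let onThreshold = 0.08
    let offThreshold = 0.05
    private(set) var isSpeaking = false

    init() {}

    mutating func update(averageLevel avg: Double) {
        if isSpeaking {
            // Stay on until the level clearly drops.
            if avg < offThreshold { isSpeaking = false }
        } else {
            if avg > onThreshold { isSpeaking = true }
        }
    }

    // Call when the audio tap is removed so the flag cannot stay stale.
    mutating func reset() {
        isSpeaking = false
    }
}

// Counts how many times a sequence of states flips.
func transitions(_ states: [Bool]) -> Int {
    return zip(states, states.dropFirst()).filter { pair in pair.0 != pair.1 }.count
}

// Assumed averaged levels for a voice hovering near the threshold.
let levels = [0.04, 0.09, 0.07, 0.085, 0.06, 0.04]

let singleThreshold = levels.map { $0 > 0.08 }

var detector = SpeakingDetector()
var withHysteresis: [Bool] = []
for level in levels {
    detector.update(averageLevel: level)
    withHysteresis.append(detector.isSpeaking)
}

print(transitions(singleThreshold), transitions(withHysteresis))
```

For this sample the single threshold flips four times, while the hysteresis version flips twice (once on, once off), so the scroll timer would start and stop only once.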

4. Word tracking froze while the transcription bar kept updating

When the char-level and word-level matchers disagreed by more than the tolerance, matchCharacters took min() of the two (introduced in #37 to prevent runaway forward jumps). But the char matcher's resync can only bridge 3 characters, so a single word-level STT substitution ("sits" transcribed as "says") wedges it permanently. From then on, min() vetoes the word matcher's correct position forever: the user sees their words in the transcription bar while the highlight stays frozen.

Verified with os_log tracing during a live read: the word matcher tracked the speaker exactly (word=187 at "following your voice") while the char matcher was stuck at char=89 ("MacBook's"), which is exactly where the highlight froze.

Disagreements now resolve to the word-level result. Its forward movement requires consecutive fuzzy word matches, and the existing 2-of-3 agreement gate from #37 still filters transient false jumps, so the runaway failure mode that min() guarded against remains covered, without the freeze.
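
The change in policy can be sketched as follows. This is an illustration under stated assumptions, not the code in matchCharacters: the `resolve` function, the `DisagreementPolicy` enum, the tolerance of 10, and the choice to return the char-level position when the matchers agree are all hypothetical. Both positions are assumed to be in the same unit, and the 2-of-3 agreement gate is left out.

```swift
// Hypothetical policies for reconciling two position estimates.
enum DisagreementPolicy {
    case takeMinimum      // behaviour before this PR
    case preferWordLevel  // behaviour after this PR
}

// Assumption: both positions use the same unit, and within tolerance
// the char-level position is used.
func resolve(charPos: Int, wordPos: Int, tolerance: Int, policy: DisagreementPolicy) -> Int {
    guard abs(charPos - wordPos) > tolerance else { return charPos }
    switch policy {
    case .takeMinimum:
        return min(charPos, wordPos)
    case .preferWordLevel:
        return wordPos
    }
}

// The char matcher is wedged at 89 while the word matcher keeps advancing.
let wedgedCharPos = 89
let wordPositions = [120, 150, 187]

let oldResults = wordPositions.map {
    resolve(charPos: wedgedCharPos, wordPos: $0, tolerance: 10, policy: .takeMinimum)
}
let newResults = wordPositions.map {
    resolve(charPos: wedgedCharPos, wordPos: $0, tolerance: 10, policy: .preferWordLevel)
}
print(oldResults, newResults)
```

With the minimum policy the result stays at 89 no matter how far the word matcher advances, which matches the frozen highlight; with the word-level policy the result follows the word matcher.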

Test plan

  • Builds clean (arm64, Xcode 26.6)
  • Word tracking: highlight and scroll follow a full live read of the welcome script with no lag or freeze (log-verified positions)
  • Voice-activated mode: scroll keeps upcoming lines visible below the current word
  • Voice-activated mode: trackpad scroll + release resumes without jumping
  • Word-tracking anchor unchanged (active word still centered)

🤖 Generated with Claude Code

samgutentag and others added 2 commits July 1, 2026 16:11
Three related fixes for jumpy/lagging auto-scroll in classic and
voice-activated (silence-paused) modes:

- wordProgressAtCurrentOffset() computed the resume word from the
  viewport center, but smooth-mode scrolling anchors the active word
  near the bottom. Releasing a manual scroll therefore snapped the text
  down by roughly half the window height. Both paths now share a single
  readingAnchorY() helper.

- The smooth-mode anchor sat 20pt above the bottom edge, giving the
  speaker zero lookahead: any word past the timer position was below
  the window. The anchor now sits at 70% of the viewport height so a
  couple of upcoming lines stay visible.

- isSpeaking was a single 0.08 threshold over recent audio levels, so a
  voice hovering near it rapidly started/stopped the scroll timer. It
  now uses hysteresis (on above 0.08, off below 0.05) and resets when
  the audio tap is removed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

When the char-level and word-level matchers disagreed by more than the
tolerance, matchCharacters took min() of the two. The char matcher's
resync can only bridge 3 characters, so a single word-level STT
substitution (e.g. "sits" transcribed as "says") wedges it permanently.
From then on min() vetoed the word matcher's correct position forever:
the transcription bar kept updating while the highlight froze.

Log-verified against a live read: the word matcher tracked the speaker
exactly (word=187 at "following your voice") while the char matcher was
stuck at char=89 ("MacBook's"), which is where the highlight sat.

Disagreements now resolve to the word-level result. Its forward
movement requires consecutive fuzzy word matches, and the existing
2-of-3 agreement gate still filters transient false jumps, so the
runaway-jump failure mode that min() guarded against remains covered.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>