D2M: Sat, Jun 20, 2026 (v6.20.26) #74

Merged
gandalf-the-engineer merged 49 commits into main from develop on Jun 20, 2026

github-actions Bot commented Jun 7, 2026

Contributor

Summary

This large branch merges several weeks of parallel feature development into develop, covering five major areas: iOS 27 App Intents integration, an anti-loop tool-call guard, markdown share toolbars, music playback and iCloud sync fixes, and a wave of new skills (Google Workspace, Stories/HTML infographics, on-device browser, Feed Cards, image search, VM cron agents, and geocoding). The Mac release CI workflow is also fully re-enabled with Sparkle auto-update support, and the website's Mac download CTA now points at the signed DMG release.

Changes by area

Agent harness / skill infrastructure

  • Added ToolCallGuard singleton with four-layer anti-loop protection (dedup window, repeat cap, result-diff check, system-prompt injection); wired into every dispatch path in SkillDispatcher.
  • Added per-turn vision fallback in AgentHarness.chat: if the selected model can't see images, the image turn is transparently rerouted to a vision-capable model on the same provider (e.g. GLM 5.2 → Kimi K2.6).
  • Registered new skills: SerpImageSearchSkill, GeocodingSkill, VMCronSkill, GoogleDriveSkill, GoogleGmailSkill, GoogleCalendarSkill, AgentMailSkill, CardSkill, StorySkill, BrowseSkill.
  • Updated ToolRouter with keyword groups and coreToolNames entries for all new skills.
  • Added ToolCallGuard.Config for per-layer enable flags and thresholds.
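
As a rough illustration of the first two guard layers (dedup window and repeat cap), here is a minimal sketch. It is not the ToolCallGuard from this branch: the type name `ToolCallGuardSketch`, the `Verdict` cases, the signature format, and the default thresholds (a window of 5 matches the risk note further down; the cap of 3 is a guess) are all assumptions made for illustration.

```swift
import Foundation

/// Hypothetical sketch of two guard layers: a dedup window and a repeat cap.
struct ToolCallGuardSketch {
    struct Config {
        var dedupWindow = 5          // how many recent calls to compare against
        var repeatCap = 3            // assumed per-turn limit for one signature
        var dedupEnabled = true
        var repeatCapEnabled = true
    }

    enum Verdict: Equatable {
        case allow
        case blockedDuplicate
        case blockedRepeatCap
    }

    let config: Config
    private var recent: [String] = []        // recent signatures, oldest first
    private var counts: [String: Int] = [:]  // per-turn totals by signature

    init(config: Config = Config()) { self.config = config }

    /// Signature = tool name plus arguments in a stable (sorted-key) order.
    static func signature(tool: String, arguments: [String: String]) -> String {
        let args = arguments.sorted { $0.key < $1.key }
            .map { "\($0.key)=\($0.value)" }
            .joined(separator: "&")
        return "\(tool)?\(args)"
    }

    mutating func check(tool: String, arguments: [String: String]) -> Verdict {
        let sig = Self.signature(tool: tool, arguments: arguments)
        if config.repeatCapEnabled, counts[sig, default: 0] >= config.repeatCap {
            return .blockedRepeatCap
        }
        if config.dedupEnabled, recent.contains(sig) {
            return .blockedDuplicate
        }
        counts[sig, default: 0] += 1
        recent.append(sig)
        if recent.count > config.dedupWindow { recent.removeFirst() }
        return .allow
    }

    /// Assumed to be called at the start of each agent turn.
    mutating func resetForNewTurn() {
        recent.removeAll()
        counts.removeAll()
    }
}
```

In this sketch an identical call is rejected while its signature is still inside the window, which is also why a legitimate second lookup of the same URL would be blocked unless the dedup layer is disabled in `Config`.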

New skills

  • Google Workspace: GoogleDriveSkill, GoogleGmailSkill, GoogleCalendarSkill with shared GoogleWorkspaceClient; token stored in KeyStore; UI row in IntegrationsVC.
  • Stories / HTML Infographic: end-to-end pipeline (StorySkill, StoryGenerator, StoryGenerationService, StoryBundledTemplates, StoryPlayerView/VC, StoryAttachment); ported to Mac via StoryMacUI.swift.
  • On-device browser (BrowseSkill, BrowseSession, BrowsePlayerVC, BrowseGenerationService): drives a WKWebView in a non-persistent data store, runs an agent loop with read/click/scroll/eval actions, persists a scrubbable screenshot+DOM replay bundle.
  • Feed Cards (CardSkill, CardStore, CardDetailViewController, FeedCardListView, ImageCardRenderer, MarkdownCardRenderer): image and markdown poster cards attached to messages.
  • Image search (SerpImageSearchSkill): web image search rendering an inline gallery.
  • Geocoding (GeocodingSkill): address/place-name → lat/lon.
  • VM cron agents (VMCronSkill, VMCronManager, VMCronPoller, VMCronTasksVC, RunnerTurnApplier, BackgroundTurnRunner): recurring agents running on an SSH VM with push-based completion.
  • Agent Mail (AgentMailSkill, AgentMailClient): send/search email via the AgentMail relay.

iOS 27 App Intents

  • Added AskLoopIntent, CaptureToLoopIntent, LoopRememberIntent, SearchLoopIntent, all gated @available(iOS 27.0, *).
  • Added LoopConversationEntity and LoopNoteEntity (IndexedEntity) for Siri and Spotlight indexing.
  • MessagingVC subscribes to Notification.Name.loopIntentMessageReceived to inject intent-delivered messages.
  • All UIKit-dependent intents excluded from LoopMac/LoopVision targets.

Music & audio

  • MusicController: support library IDs (p., l., i. prefixes); queue_mode: append now works for albums/playlists; fallback queue-rebuild path for iOS 27 MPMusicPlayerController.Queue.insert failure.
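  • Audio session: deactivateAudioSession() added to all TTS finish/error/stop paths (iOS, visionOS), deactivating with .notifyOthersOnDeactivation; TTS category options switched to .duckOthers so other audio apps duck during speech and resume reliably afterward.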
  • Media ducking in voice sessions: duck before earcon, resume only on .idle state, resumeToken prevents orphaned resumes; new tracks queued mid-session become the resume target.
  • MusicMiniPlayerView + TopBannerScrollView: compact pill/expanded card mini-player shown in chat while music is playing.

Conversation store / sync fixes

  • ConversationFileStore: eviction now requires two consecutive misses (deferred eviction via pendingEvictions) to tolerate transient iCloud renames/downloads.
  • MessagingVC subscribes to .conversationStoreDidChange to re-read messages after pass-2 hydration or iCloud sync; guarded against clobbering in-flight agent turns.
  • prioritizeHydration(id:) moves the active conversation to the front of the pass-2 queue.
  • Defensive empty-state guard: 0 messages for a not-yet-hydrated conversation defers render and requests async hydration instead of blanking the screen.
  • messagesDidReload() hook ensures avatar/orb visibility is refreshed on every message-load path.
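
The two-consecutive-miss eviction rule can be sketched as follows. This is an illustrative model only: `DeferredEvictionCache` and `refresh(onDisk:)` are hypothetical names, and only `pendingEvictions` is taken from the summary above.

```swift
import Foundation

/// Hypothetical cache that evicts an entry only after two consecutive refresh misses.
struct DeferredEvictionCache {
    private(set) var cached: Set<String>
    private(set) var pendingEvictions: Set<String> = []

    init(cached: Set<String>) { self.cached = cached }

    /// `onDisk` is the set of conversation IDs found by the current scan.
    /// Returns the IDs evicted by this pass.
    @discardableResult
    mutating func refresh(onDisk: Set<String>) -> Set<String> {
        let missing = cached.subtracting(onDisk)
        // Second consecutive miss: already pending and still missing.
        let evicted = missing.intersection(pendingEvictions)
        cached.subtract(evicted)
        // First miss: mark only. IDs that reappeared drop out of the pending set.
        pendingEvictions = missing.subtracting(evicted)
        return evicted
    }
}
```

A file that disappears for one scan (for example during an iCloud rename) and reappears on the next is never evicted here, while a file that is permanently gone stays cached for exactly one extra cycle.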

Chat clients (Anthropic / OpenAI / Fireworks)

  • Image downgrade: raw image sent only on the introducing turn; later turns inline visionSummary (cached description from VisionSummaryService) to avoid re-paying image input tokens every turn.
  • VisionSummaryService: fires a background vision call (cheapest keyed model) after user messages with image attachments; result persisted via SimpleConversationManager.updateAttachmentSummary.
  • AnthropicStreamReader: sawToolUse flag suppresses onDelta once a tool_use block starts, preventing pre-tool thinking text from leaking into the streaming bubble.
  • All three clients accept modelIDOverride / modelStampOverride for per-turn rerouting.
  • LocalInferenceController: tracks the active streaming task for cancellation on runner handoff.
  • story- prefixed placeholder messages excluded from wire payloads alongside image- and pdf-.
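
A minimal sketch of the image-downgrade idea, under stated assumptions: the types `ImageAttachment`, `ChatMessage`, and `WirePart` and the function `wirePayload(for:)` are hypothetical, the "introducing turn" is taken to mean the image's message is the latest one in the history, and keeping the raw image when no summary is cached yet is a guess at the fallback behavior.

```swift
import Foundation

struct ImageAttachment {
    let data: Data
    var visionSummary: String?   // cached description; nil until the background call finishes
}

struct ChatMessage {
    let text: String
    var image: ImageAttachment? = nil
}

enum WirePart: Equatable {
    case text(String)
    case image(Data)
}

/// Builds per-message wire parts, sending raw image bytes only on the introducing turn.
func wirePayload(for history: [ChatMessage]) -> [[WirePart]] {
    return history.enumerated().map { (index, msg) -> [WirePart] in
        var parts: [WirePart] = [.text(msg.text)]
        guard let image = msg.image else { return parts }
        let isIntroducingTurn = index == history.count - 1
        if isIntroducingTurn {
            parts.append(.image(image.data))
        } else if let summary = image.visionSummary {
            // Later turns: inline the cached description instead of the image.
            parts.append(.text("[Image: \(summary)]"))
        } else {
            // Assumption: with no summary cached yet, keep sending the raw image.
            parts.append(.image(image.data))
        }
        return parts
    }
}
```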

Mac

  • release-mac.yml fully re-enabled: builds, signs, notarizes .app and DMG, packages a stapled Sparkle ZIP, generates a signed appcast.xml, publishes a per-build GitHub Release and updates a stable appcast release.
  • Sparkle (≥2.6.0) added as a Mac SPM dependency.
  • TTS voice parity on Mac: OpenAI gpt-4o-mini-tts + 11 voices and 5 missing ElevenLabs voices added to LoopMac.
  • 7 previously stub-only skills wired into VoiceLoopCoordinator dispatch on Mac (Maps, Geocoding, Navigation, MuniRealtime, Twitter, SSH, MCP).
  • MacMarkdownShareToolbar added to Mac markdown surfaces.

UI / UX polish

  • MarkdownShareToolbar (iOS) and MacMarkdownShareToolbar (macOS): reusable share toolbars on all markdown preview surfaces and file preview cards; visionOS gets a ShareLink.
  • AgentLargeVC drag-to-dismiss: transform applied to the content subview (agentView) instead of the root view, so the animation renders live during the drag.
  • AgentLargeView.setChromeHidden(_:animated:): holds chrome back until the orb has flown into place during the present transition.
  • STT engine badge (DG/APL pill) shown inline in the transcription UI on iOS and appended to the Mac recorder bar.

Website

  • index.html: Hero "Download for Mac" CTA now points at /releases/latest (the signed DMG) instead of the #setup build-from-source anchor.

Tests

  • AnthropicStreamReaderTests: text-only assembly, tool-call assembly, onDelta suppression, error propagation, usage parsing.
  • MusicSkillTests: tool schema and dispatch routing.
  • ImageSummaryDowngradeTests: image-downgrade logic in AnthropicChat.

Notable risks

  • ToolCallGuard false positives: the dedup guard blocks a call if its signature appears in the last 5-call window, even across different conversational contexts within the same agent turn. Legitimate repeated lookups (e.g. checking the same URL twice) will be blocked and may confuse models or degrade tool-heavy workflows until thresholds are tuned.
  • Vision fallback silently changes the billed model: if the user selects GLM 5.2 (Fireworks) and sends an image, the image turn is transparently rerouted to Kimi K2.6, also on Fireworks. The provider stays the same, but the model does not. The model stamp in the message reflects the override, but the user is not explicitly warned that their selected model was bypassed.
  • BrowseSession parks a WKWebView in the key window offscreen: the UIView at x=−10,000 is a common pattern but can interact poorly with accessibility trees, view-hierarchy debuggers, or layout passes that enumerate all subviews.
  • Sparkle auto-update is wired, but SUFeedURL in Info.plist must be set correctly: if it is missing or points at the wrong release tag, update checks will fail silently or return a 404 on every launch.
  • pendingEvictions deferred-eviction logic: a conversation whose file is permanently gone (manual iCloud deletion) now requires two consecutive surgicalRefresh cycles to evict. During that window the cache entry is stale and any message write targeting it will fail silently.
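
To make the vision-fallback risk concrete, here is a small sketch of per-turn rerouting that also reports whether the selected model was bypassed, which is the signal a UI would need in order to warn the user. `ModelInfo`, `TurnRoute`, `routeTurn`, and the Kimi model ID string are hypothetical; only `modelIDOverride` and the GLM 5.2 model ID `glm-5p2` come from this page.

```swift
import Foundation

struct ModelInfo: Equatable {
    let id: String
    let provider: String
    let supportsVision: Bool
}

struct TurnRoute: Equatable {
    let modelIDOverride: String?   // nil means: use the selected model
    let modelStamp: String         // what gets recorded on the message
    var wasRerouted: Bool { modelIDOverride != nil }
}

/// Picks a vision-capable model on the same provider when the selected one cannot see images.
func routeTurn(selected: ModelInfo, hasImage: Bool, catalog: [ModelInfo]) -> TurnRoute {
    guard hasImage, !selected.supportsVision,
          let fallback = catalog.first(where: {
              $0.provider == selected.provider && $0.supportsVision
          })
    else {
        return TurnRoute(modelIDOverride: nil, modelStamp: selected.id)
    }
    return TurnRoute(modelIDOverride: fallback.id, modelStamp: fallback.id)
}
```

A caller could check `wasRerouted` and surface a one-line notice next to the model stamp instead of relying on the stamp alone.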

Auto-generated by .github/workflows/auto-develop-pr.yml

devin-ai-integration Bot and others added 6 commits June 7, 2026 05:41
…tool-call streaming

1. Apple on-device fallback leaves thread non-sendable:
   Both Apple Intelligence fallback paths in MessagingVC (post-tool-loop
   and primary send) now properly call ActiveRequestTracker.markIdle(),
   clear streamingPartial, and reset VoiceLoopCoordinator to .idle.
   Previously these were missing, leaving the conversation in a stuck
   'active' state after an Apple response.

2. Apple fallback error handling:
   The catch blocks in both Apple Task{} paths were empty. Now they
   surface a user-facing error message, persist it, play the error
   earcon, and clean up all state — matching the existing error path
   for when Apple Intelligence is unavailable.

3. Claude tool-call content leaks into streaming bubble:
   AnthropicStreamReader now tracks when the first tool_use content
   block starts (sawToolUse flag). Once set, text_delta events are
   still accumulated in contentBuffer (for the 'Used N tools'
   disclosure prose) but no longer fired via onDelta — preventing
   pre-tool-call 'thinking' text from appearing in the live streaming
   bubble as visible assistant prose.

4. Added AnthropicStreamReaderTests covering:
   - Text-only content assembly
   - Tool-call assembly from input_json_delta
   - onDelta suppression after tool_use block starts
   - Error propagation on mid-stream connectivity failure
   - Usage (token) parsing
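
The suppression described in item 3 can be sketched roughly like this. It is a simplified model, not the real AnthropicStreamReader: `StreamEvent` and `StreamReaderSketch` are hypothetical, and only `sawToolUse`, `contentBuffer`, and `onDelta` are names taken from the commit message.

```swift
import Foundation

enum StreamEvent {
    case contentBlockStart(type: String)   // "text" or "tool_use"
    case textDelta(String)
}

final class StreamReaderSketch {
    private(set) var contentBuffer = ""
    private(set) var sawToolUse = false
    private let onDelta: (String) -> Void

    init(onDelta: @escaping (String) -> Void) { self.onDelta = onDelta }

    func handle(_ event: StreamEvent) {
        switch event {
        case .contentBlockStart(let type):
            if type == "tool_use" { sawToolUse = true }
        case .textDelta(let text):
            contentBuffer += text             // always accumulated
            if !sawToolUse { onDelta(text) }  // forwarded only until the first tool_use block
        }
    }
}
```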

Co-Authored-By: bot_apk <apk@cognition.ai>
Show which speech-to-text backend is active during live transcription:
- iOS: small monospaced pill (DG or APL) beside the transcribingLabel
- macOS: appended to the recorder bar placeholder text

VoiceLoopCoordinator gains an STTEngine enum and activeSTTEngine
property. MessageBox publishes the engine when recording starts and
on Deepgram-to-SFSpeech fallback.

Co-Authored-By: bot_apk <apk@cognition.ai>
Implements the Music Mini-Player Banner feature:
- MusicMiniPlayerView: compact pill (minimized) and expanded card states
  with play/pause, skip, progress, album art, and deep-link to Apple Music
- TopBannerScrollView: horizontally scrollable container below the sub-agent
  status bar, hosting the music mini-player
- Visibility logic: shows while music is playing or was paused within the last 5 minutes,
  auto-dismisses on stop, auto-minimizes on voice recording
- Gesture support: tap pill to expand, swipe down to collapse, swipe away to dismiss
- Wired to MusicController.shared for playback state and controls
- Xcode project updated: new files excluded from LoopMac and LoopVision targets

Co-Authored-By: bot_apk <apk@cognition.ai>
- Duck music synchronously before earcon in switchToRecordingState()
  so the Action Button flow doesn't clip audio against a playing track.
- Fix handleVoiceLoopState to only resume on .idle, not .thinking or
  .transcribing, so music no longer plays briefly in the middle of a voice turn.
- Add resumeToken (UUID) that tracks whether auto-resume is valid;
  cleared on user-explicit stop/pause so orphaned resumes never fire.
- New play_music/set_music_mood calls mid-voice-session immediately
  re-pause via reduckIfVoiceSessionActive(); the new track becomes
  the resume target.
- resumeAfterVoiceSession() fails silently with a log on any
  MusicKit/AVAudioSession error.
- status() now exposes will_auto_resume for debugging.
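
The resumeToken behavior above can be modeled in a few lines. This is a sketch with playback reduced to a Boolean: `DuckingSketch`, `duckForVoiceSession()`, and `userDidPause()` are hypothetical names, while `resumeToken`, `handleVoiceLoopState`, and the `.idle`/`.thinking`/`.transcribing` states follow the commit message.

```swift
import Foundation

final class DuckingSketch {
    enum VoiceState { case idle, transcribing, thinking }

    private(set) var isPlaying: Bool
    private(set) var resumeToken: UUID?

    init(isPlaying: Bool) { self.isPlaying = isPlaying }

    /// Mirrors the will_auto_resume debugging field.
    var willAutoResume: Bool { resumeToken != nil }

    /// Called before the earcon: pause music and remember that we paused it.
    func duckForVoiceSession() {
        guard isPlaying else { return }
        isPlaying = false
        resumeToken = UUID()
    }

    /// A user-initiated stop or pause invalidates any pending auto-resume.
    func userDidPause() {
        isPlaying = false
        resumeToken = nil
    }

    /// Resume only when the voice loop is idle and a valid token exists.
    func handleVoiceLoopState(_ state: VoiceState) {
        guard state == .idle, resumeToken != nil else { return }
        resumeToken = nil
        isPlaying = true
    }
}
```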

Co-Authored-By: bot_apk <apk@cognition.ai>
…ter-failures

Fix model router failure cases: Apple fallback state leak & Claude tool-call streaming
Add inline STT engine badge (DG / APL) to voice transcription UI
vercel Bot commented Jun 7, 2026

The latest updates on your projects:

Project: loop-harness
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Jun 20, 2026 9:11pm

…ayer-banner

feat: Music Mini-Player Banner in chat view
Add media ducking and resumption for voice sessions
Co-authored-by: Ash Bhat <me@ashbhat.com>
ashbhat and others added 4 commits June 7, 2026 10:05
Replace the deferred transmitToVM stub with PushBridge, a runtime-discovered
seam (mirroring AppSignals) that hands the APNs device token to a private
sender when present and is a no-op in public clones. Resolve the APNs
environment from the embedded provisioning profile, and obtain/register the
token immediately after a notification-permission grant instead of waiting
for the next launch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
devin-ai-integration Bot and others added 9 commits June 7, 2026 23:10
- Playlist IDs starting with 'p.' (user library) now use
  MusicLibraryRequest instead of MusicCatalogResourceRequest, which
  only works for catalog playlists (pl.…).
- Same fix for album IDs starting with 'l.' (library albums).
- Same fix for song IDs starting with 'i.' (library songs).
- queue_mode 'append' now works for albums and playlists, not just
  songs — previously albums/playlists always replaced the queue.
- Updated tool description and system prompt to document library ID
  support and full-track-list queuing.
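
The prefix rule above amounts to a small routing decision before any MusicKit request is built. A minimal sketch, assuming hypothetical names `MusicItemKind`, `MusicIDSource`, and `source(forID:kind:)`; it only classifies the ID and does not call MusicKit:

```swift
import Foundation

enum MusicItemKind { case playlist, album, song }
enum MusicIDSource: Equatable { case library, catalog }

/// Classifies an ID as library or catalog using the prefixes from the commit message.
func source(forID id: String, kind: MusicItemKind) -> MusicIDSource {
    let libraryPrefix: String
    switch kind {
    case .playlist: libraryPrefix = "p."
    case .album:    libraryPrefix = "l."
    case .song:     libraryPrefix = "i."
    }
    // Catalog playlist IDs start with "pl.", which does not match the "p." prefix.
    return id.hasPrefix(libraryPrefix) ? .library : .catalog
}
```

A `.library` result would then select the library request path and `.catalog` the catalog request path.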

Co-Authored-By: bot_apk <apk@cognition.ai>
…tivation after TTS

Root cause: When TTS playback finishes on iOS/visionOS, the audio
session was never deactivated with .notifyOthersOnDeactivation. Without
this, the system never sends the 'interruption ended — you may resume'
signal to other audio apps (Apple Music, Spotify, podcasts), so they
stay paused indefinitely after Loop's TTS finishes.

The recording path already did this correctly (MessageBox.swift lines 693
and 1644). Only the TTS finish paths were missing it.

Changes:
- iOS (MessagingVC): Add deactivateAudioSession() helper; call it from
  stopSpeaking(), audioPlayerDidFinishPlaying, audioPlayerDecodeError,
  speechSynthesizer(_:didFinish:), and DeepgramTTS onFinished/onError.
- iOS (MessagingVC): Switch TTS category options from [] or
  [.mixWithOthers] to [.duckOthers] for all providers (Deepgram,
  ElevenLabs, OpenAI, offline, playMP3Data). This tells the system to
  duck other apps' volume during speech and send .shouldResume when we
  deactivate — the standard polite-audio-citizen pattern.
- visionOS (VisionVoiceCoordinator): Add matching deactivateAudioSession
  calls after TTS finishes (speechDidFinish, onFinished, onError) and
  after recording teardown.
- macOS/shared (DeepgramTTS): Remove the mixer output tap in
  finishAfterDrain() and the error path before stopping the engine.
  Ensures the engine is fully released so the audio device isn't held.

Loop-initiated MusicKit playback (MusicController) is unaffected — it
uses explicit ApplicationMusicPlayer.play()/pause() on state transitions
and doesn't depend on the audio session's activation state.

Co-Authored-By: bot_apk <apk@cognition.ai>
Add a full Google Workspace integration following the existing Notion/Slack
pattern:

- GoogleWorkspaceClient: shared networking layer with token injection,
  error parsing, and typed errors (including token_expired on 401)
- GoogleDriveSkill: list_files, get_file, read_file, create_file actions
- GoogleGmailSkill: search_messages, get_message, send_message actions
- GoogleCalendarSkill: list_events, create_event actions
- KeyStore: add google_workspace_access_token (required) plus optional
  refresh_token, client_id, client_secret
- IntegrationsVC: add Google Workspace row with connected status,
  service details, and Revoke/Remove button
- Register all three skills in SkillDispatcher, AgentHarness (catalog +
  system prompt fragments), Messaging.swift tools array, MessagingVC
  statusText, SubAgentRuntime statusText, and VoiceLoopCoordinator
  (dispatch + statusText)

Co-Authored-By: bot_apk <apk@cognition.ai>
End-to-end prototype for generating 1080×1920 portrait HTML infographics
(stories) from structured JSON data and rendering them in-app via a
chromeless WKWebView player.

Components:
- StoryAttachment: chat message attachment type (.generating → .ready)
- StoryGenerator: JSON → HTML renderer that injects data into templates
- StoryGenerationService: async pipeline (mirrors PDFGenerationService)
- StoryPlayerView: chromeless WKWebView, inline (scaled) + full-screen
- StoryPlayerVC: full-screen modal with tap-to-advance + progress bar
- StorySkill: tool definition for generate_story LLM calls
- StoryDemoVC: demo view controller with sample data
- Templates: DailyRecap, ActivitySummary (pure HTML/CSS/JS, no deps)

Wired into MessageStruct via storyAttachment field.

Co-Authored-By: bot_apk <apk@cognition.ai>
…b.com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
…m:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
….com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
My work in this change:
- TTS voice parity: add OpenAI gpt-4o-mini-tts provider + 11 voices and the
  5 missing ElevenLabs voices to LoopMac so Mac matches the iOS voice menu.
- Wire 7 already-shared skills into VoiceLoopCoordinator's dispatch on Mac
  (Maps, Geocoding, Navigation, MuniRealtime, Twitter, SSH, MCP) — they were
  advertised to the model but returned "not available on Mac".
- Port the Stories feature to Mac: make StorySkill/StoryGenerationService
  cross-platform (#if os(iOS)||os(macOS)) and add StoryMacUI.swift
  (StoryBubbleView card + StoryPlayerWindowController WKWebView player) with a
  StorySkillHost on ConversationWindowController; advertise generate_story on
  macOS via the shared tools/AgentHarness catalog.

Also includes concurrent in-flight work present in the tree (committed at the
user's request): the iOS Stories prototype rewrite (StorySkill/StoryGenerator/
StoryBundledTemplates/templates/MessagingCell/MessagingVC + story- prefix
handling), Sparkle auto-update wiring, Google Workspace integration tokens,
and the release-mac.yml workflow overhaul.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…infographic

feat: Stories / HTML Infographic prototype
github-actions Bot changed the title from "D2M: Sun, Jun 7, 2026 (v6.7.26)" to "D2M: Tue, Jun 9, 2026 (v6.9.26)" on Jun 9, 2026
devin-ai-integration Bot and others added 4 commits June 11, 2026 18:01
Fix A: MessagingVC subscribes to .conversationStoreDidChange so it re-reads
the current conversation when pass-2 hydration or iCloud sync delivers new
messages. Guarded against clobbering in-flight agent turns (ai_state, streaming
partial).

Fix D: Add messagesDidReload() hook called at the end of every
loadMessagesFromConversation path. MainVC overrides it to call
refreshAvatarVisibility(), ensuring the hero orb collapses on every message-
load path — not just loadConversation and newMessageSent.

Quick wins:
- Prioritize the active conversation in pass-2 hydration queue so the user
  doesn't stare at a blank screen while other conversations hydrate first.
- Defensive empty-state guard: if the store returns 0 messages for a not-yet-
  hydrated conversation, defer render and request async hydration instead of
  clearing the screen.
- Eviction tolerance in surgicalRefresh: use a mark-for-eviction set so
  cache entries aren't evicted on the first miss (protects against transient
  iCloud file rename/download operations).

Co-Authored-By: bot_apk <apk@cognition.ai>
* added improved support for dragging cells to see times

* added support for managed secrets

---------

Co-authored-by: Ash Bhat <me@ashbhat.com>
github-actions Bot changed the title from "D2M: Thu, Jun 11, 2026 (v6.11.26)" to "D2M: Mon, Jun 15, 2026 (v6.15.26)" on Jun 15, 2026
ashbhat and others added 17 commits June 16, 2026 12:57
Add a bottom toolbar with a Share button (square.and.arrow.up SF Symbol)
to all markdown preview surfaces:

- iOS MarkdownEditorViewController: bottom toolbar with share button that
  presents UIActivityViewController with raw markdown text
- Mac MarkdownEditorViewController: bottom toolbar with share button that
  presents NSSharingServicePicker with raw markdown text
- iOS FilePreviewCardView: share row at the bottom of markdown/text file
  cards in chat; triggers UIActivityViewController from the host cell
- Mac MacFilePreviewCardView: share row at the bottom of markdown/text
  file cards; triggers NSSharingServicePicker
- visionOS FileAttachmentBubble: ShareLink for markdown/text files

New shared components:
- MarkdownShareToolbar (iOS): reusable UIView toolbar
- MacMarkdownShareToolbar (macOS): reusable NSView toolbar

The toolbar collapses to zero height on non-text attachment types and uses
constraint-based collapse so hidden state doesn't leave dead space.

Co-Authored-By: bot_apk <apk@cognition.ai>
Add GLM 5.2 (accounts/fireworks/models/glm-5p2) to the model picker
under the Fireworks provider group. Context window: 1M tokens.

Co-Authored-By: bot_apk <apk@cognition.ai>
…ub.com:getathelas/LoopHarness into dev/jun16
dev/jun16 → develop: iOS 27 App Intents, anti-loop guard, markdown share, music & sync fixes
github-actions Bot changed the title from "D2M: Mon, Jun 15, 2026 (v6.15.26)" to "D2M: Sat, Jun 20, 2026 (v6.20.26)" on Jun 20, 2026
ashbhat and others added 2 commits June 20, 2026 14:10
Hero "Download for Mac" now points at /releases/latest (the signed +
notarized DMG) instead of the build-from-source #setup anchor. TestFlight
links (nav + iPhone CTA) already present on develop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
site: route Mac download CTA to latest release
@gandalf-the-engineer gandalf-the-engineer merged commit b705b88 into main Jun 20, 2026
6 checks passed
3 participants