
fix(instagram-api): env reprompt + runtime wedge hardening + post-login challenge #72

Open
tnunamak wants to merge 3 commits into fix/instagram-api-compat from fix/instagram-api-env-reprompt


@tnunamak commented on Apr 21, 2026

Summary

Bundles two rounds of hardening for the Instagram API connector, both validated against the staging canary in `vana-com/context-gateway#186`.

Round 1 (original scope): env-credential reprompt

When env-provided credentials fail with invalid_credentials, fall through to promptWithRetry instead of throwing. This supports the bad-password canary mode and improves real-user UX. Already verified green.

Round 2: runtime wedge + post-login challenge

Additional changes to make `instagram × full` viable in canary and handle Instagram's post-OTP challenge pages without a headed fallback:

  1. setData pressure reduction. setAuthState / setCollectorTraceSection now accumulate in memory and flush once at the terminal step. ~93 page.setData round-trips per run → ~26.
  2. handlePostLoginChallenge(): walks /challenge/ pages for up to 5 steps, clicking 'This was me' / 'Dismiss' / 'Continue' / 'Next', or prompting for an email code with the 'email' keyword so the canary email worker activates.
  3. Accounts Center HTML shrink: fetchAccountsCenterHtml projects data-sjs script bodies inside the page instead of returning the full multi-MB document.
  4. Captured GraphQL filter: readCapturedGraphql(matchSubstring) filters inside the page, caps body/response length, avoids unbounded `__igAllGraphql__` transfers.
  5. Runtime-wedge circuit breaker: module-scope flag triggered by 'Command timeout:' errors; skips remaining ads sub-phases and optional terminal breadcrumbs. Run time drops ~5min → ~2min when wedged.
  6. Full event dispatch for React-rendered challenge buttons (pointerdown/mousedown/pointerup/mouseup/click). Native element.click() wasn't firing React onClick handlers on IG's challenge page.
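
A minimal sketch of the bounded challenge walk from item 2, assuming a hypothetical `ChallengePage` wrapper around the remote browser. The interface, `walkChallenge`, and the prompt text are illustrative and not taken from the connector; only the button labels, the 5-step cap, the /challenge/ and /checkpoint/ paths, and the 'email' keyword come from this PR.

```typescript
// Hypothetical page abstraction; the real connector drives a remote browser.
interface ChallengePage {
  currentPath(): Promise<string>;
  // Assumed to dispatch the full pointer/mouse/click sequence (item 6)
  // and to resolve to true only if a matching button was found.
  clickButtonByLabel(labels: string[]): Promise<boolean>;
  hasCodeInput(): Promise<boolean>;
  submitCode(code: string): Promise<void>;
}

const CONFIRM_LABELS: string[] = ["This was me", "Dismiss", "Continue", "Next"];
const MAX_STEPS = 5;

function isChallengePath(path: string): boolean {
  return path.indexOf("/challenge/") !== -1 || path.indexOf("/checkpoint/") !== -1;
}

// Resolves to true once the session has left the challenge pages.
async function walkChallenge(
  page: ChallengePage,
  requestCode: (prompt: string) => Promise<string>
): Promise<boolean> {
  for (let step = 0; step < MAX_STEPS; step++) {
    if (!isChallengePath(await page.currentPath())) return true;
    if (await page.hasCodeInput()) {
      // The prompt mentions "email" so an automated email worker can match it.
      const code = await requestCode("Enter the code sent to your email");
      await page.submitCode(code);
      continue;
    }
    // No known button on this page: stop instead of burning the remaining steps.
    if (!(await page.clickButtonByLabel(CONFIRM_LABELS))) return false;
  }
  return !isChallengePath(await page.currentPath());
}
```

The step cap keeps a page that never resolves from looping forever; a `false` result leaves the caller free to report the account as flagged.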

Verification

`instagram × full` (rotation) on context-gateway#186 got past login + challenge + profile + posts + followers + following into the Accounts Center phase, where the remote browser runtime wedges regardless of payload shrinks. That's a separate infra-layer concern.

Merge order

Targeting `fix/instagram-api-compat` (PR #70). Once #70 lands and this is included, context-gateway (CG) can repin to the canonical main ref instead of the branch index.

…ials

When env-provided credentials (PLATFORM_LOGIN / PLATFORM_PASSWORD) fail
with invalid_credentials, fall through to the existing promptWithRetry
loop instead of throwing immediately. Matches the no-env path behavior
and supports the bad-password canary mode in context-gateway.

Other error kinds (login_api_error, network, etc.) still throw as
before. Real users who fat-finger their password now re-prompt instead
of seeing a generic failure.
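
A rough sketch of the fall-through described above, under stated assumptions: the result shape, `EnvCredentials`, `TryLogin`, and the zero-argument `PromptWithRetry` signature are hypothetical stand-ins, since the connector's real types are not shown in this PR. Only the error kinds and the promptWithRetry name come from the commit message.

```typescript
// Hypothetical shapes; the connector's real types are not shown in this PR.
type LoginErrorKind = "invalid_credentials" | "login_api_error" | "network";
type LoginResult =
  | { status: "ok" }
  | { status: "error"; kind: LoginErrorKind };

interface EnvCredentials {
  login: string;
  password: string;
}

type TryLogin = (creds: EnvCredentials) => Promise<LoginResult>;
// Stand-in for the existing promptWithRetry loop (assumed signature).
type PromptWithRetry = () => Promise<LoginResult>;

// Only a rejected password is worth asking the user again.
function shouldReprompt(kind: LoginErrorKind): boolean {
  return kind === "invalid_credentials";
}

async function loginWithEnvFallback(
  envCreds: EnvCredentials | undefined,
  tryLogin: TryLogin,
  promptWithRetry: PromptWithRetry
): Promise<LoginResult> {
  // No env credentials: behave exactly like the no-env path.
  if (envCreds === undefined) return promptWithRetry();

  const result = await tryLogin(envCreds);
  if (result.status === "ok") return result;

  // Bad env password: fall through to the interactive loop.
  if (shouldReprompt(result.kind)) return promptWithRetry();

  // login_api_error, network, etc. still throw as before.
  throw new Error("login failed: " + result.kind);
}
```

Keeping the decision in `shouldReprompt` means widening or narrowing the set of re-promptable error kinds is a one-line change.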

Ported from an ad-hoc snapshot edit in context-gateway PR #185.
…nge handling

Bundles the improvements from context-gateway PR #186 into the upstream
connector. Aimed at making instagram × full mode viable in canary and
improving real-user UX when Instagram flags the login.

Changes, roughly in order of impact:

1. Mid-run setData pressure reduction. setAuthState and
   setCollectorTraceSection now accumulate in-memory; terminal flush
   in main IIFE only. ~93 page.setData round-trips per run → ~26.
   User-visible 'status' breadcrumbs unchanged.

2. handlePostLoginChallenge(): after OTP, if session lands on
   /challenge/ or /checkpoint/, programmatically walk the page:
   - Click 'This was me' / 'Confirm it's you' / 'It was me'.
   - Click 'Dismiss' / 'Continue' / 'Next' (for 'automated behavior'
     and 'help us confirm it's you' pages respectively).
   - Email-code prompt with 'email' keyword so canary email worker
     polls and supplies.
   - Full event dispatch (pointer/mouse/click) for React onClick
     handlers that ignore bare element.click().

3. Accounts Center HTML shrinking. fetchAccountsCenterHtml previously
   returned document.documentElement.outerHTML (multi-MB) across the
   remote-browser boundary. Now projects inside the page: extracts
   data-sjs script bodies containing fxcal_settings and a small meta
   slice; reconstructs minimal HTML for existing downstream parsers.

4. readCapturedGraphql(matchSubstring) — pre-filters inside the page
   and caps body/response length. Previously returned the full
   __igAllGraphql__ array every 250ms during adCategories discovery.

5. Runtime-wedge circuit breaker. Module-scope runtimeWedgeDetected
   flag set when any caught error matches /Command timeout:/. Used
   to skip remaining ads sub-phases and optional terminal breadcrumbs;
   cuts wedged-run duration from ~5min to ~2min.
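
The buffering idea in item 1 can be sketched as follows. This is an illustration only: `DeferredState` and the `SetData` callback type are hypothetical, and the real signature of page.setData is not shown in this PR. User-visible status writes, which the commit says are unchanged, are outside the sketch.

```typescript
// Hypothetical sink standing in for page.setData (assumed signature).
type SetData = (key: string, value: unknown) => Promise<void>;

class DeferredState {
  private pending: { [key: string]: unknown } = {};

  constructor(private readonly setData: SetData) {}

  // Record the latest value locally; no round-trip happens here.
  set(key: string, value: unknown): void {
    this.pending[key] = value;
  }

  // Write each buffered key once, then clear the buffer.
  // Resolves to the number of round-trips actually made.
  async flush(): Promise<number> {
    const keys = Object.keys(this.pending);
    for (const key of keys) {
      await this.setData(key, this.pending[key]);
    }
    this.pending = {};
    return keys.length;
  }
}
```

Because repeated writes to the same key overwrite each other in memory, the number of round-trips at flush time depends on the number of distinct keys rather than the number of updates.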

Still limited by the Accounts Center pages consistently wedging the
remote browser runtime after 1-2 navigations — a remote-browser-
service capacity issue that needs infra-layer work.

@tnunamak changed the title from "fix(instagram-api): re-prompt via requestInput on invalid env credentials" to "fix(instagram-api): env reprompt + runtime wedge hardening + post-login challenge" on Apr 21, 2026

Ports additional improvements from context-gateway side that went in
after the initial port on this branch. Matches the code merged in:

- CG #187 (profile capture wedge check): short-circuit profile-capture
  polling loop on runtime wedge.
- CG #188 (XHR Accounts Center): fetch data-sjs HTML blocks via same-
  origin XHR from an instagram.com anchor page instead of navigating
  the remote browser to each heavy Accounts Center page.
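
One way the data-sjs projection could look once the page HTML has been fetched as text. This is a sketch, not the connector's code: `extractDataSjsBodies` and `rebuildMinimalHtml` are hypothetical names, the `type="application/json"` attribute on the rebuilt scripts is an assumption, and the regex is used only to keep the example self-contained (it does not handle a `>` inside an attribute value; inside a browser, `DOMParser` would be more robust).

```typescript
// Return the bodies of <script ... data-sjs ...> elements found in `html`.
// When `marker` is given, keep only bodies that contain it.
function extractDataSjsBodies(html: string, marker?: string): string[] {
  const pattern = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  const bodies: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const attrs = match[1];
    const body = match[2];
    // Skip scripts that do not carry the data-sjs attribute.
    if (!/(?:^|\s)data-sjs(?:\s|=|$)/.test(attrs)) continue;
    if (marker !== undefined && body.indexOf(marker) === -1) continue;
    bodies.push(body);
  }
  return bodies;
}

// Rebuild a small document that downstream parsers can treat like the original.
function rebuildMinimalHtml(bodies: string[]): string {
  const scripts = bodies.map(
    (body) => '<script type="application/json" data-sjs>' + body + "</script>"
  );
  return "<html><body>" + scripts.join("") + "</body></html>";
}
```

A caller would pass the fetched response text and a marker such as `fxcal_settings`, then hand the rebuilt HTML to the existing parsers.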

Verified on staging parallel full-mode runs: acct 3 got first
advertiser via XHR-fetched ads page; acct 1 consistently collects
profile + posts + followers + following + advertisers with this code.

@tnunamak commented

Updated with additional improvements from CG #187 + #188:

  • Profile-capture wedge check: short-circuit the 30-attempt polling loop when the runtime wedges (saves ~30min of 60s drains, i.e. 30 attempts × 60s each).
  • XHR Accounts Center: fetch data-sjs HTML blocks via same-origin XHR from an instagram.com anchor page instead of navigating the remote browser to each heavy Accounts Center page. Avoids letting the runtime load the heavy React app that consistently wedged after 1-2 navigations.
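
A hedged sketch of how the wedge flag and the polling short-circuit could fit together. The `runtimeWedgeDetected` name, the `Command timeout:` match, and the 30-attempt cap come from this PR; `noteError`, `isRuntimeWedged`, `pollUntilCaptured`, and the injected `probe` / `sleep` callbacks are hypothetical.

```typescript
const MAX_ATTEMPTS = 30;

// Module-scope flag, mirroring the runtimeWedgeDetected flag named in this PR.
let runtimeWedgeDetected = false;

function isRuntimeWedged(): boolean {
  return runtimeWedgeDetected;
}

// Trip the breaker when a caught error looks like a remote-browser timeout.
function noteError(err: unknown): void {
  const message = err instanceof Error ? err.message : String(err);
  if (/Command timeout:/.test(message)) runtimeWedgeDetected = true;
}

type PollOutcome<T> =
  | { status: "found"; value: T }
  | { status: "wedged" }
  | { status: "exhausted" };

async function pollUntilCaptured<T>(
  probe: () => Promise<T | undefined>,
  sleep: () => Promise<void>
): Promise<PollOutcome<T>> {
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    // The flag may already have been set by an earlier phase.
    if (isRuntimeWedged()) return { status: "wedged" };
    try {
      const value = await probe();
      if (value !== undefined) return { status: "found", value };
    } catch (err) {
      noteError(err);
      // Stop immediately instead of draining the remaining attempts.
      if (isRuntimeWedged()) return { status: "wedged" };
    }
    await sleep();
  }
  return { status: "exhausted" };
}
```

Once the flag is set, later calls return `wedged` without issuing a single probe, which is what lets the remaining phases be skipped cheaply.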

Verified across multiple staging parallel full-mode runs. Results are not 6/6 green: accounts flagged with the /challenge/ 'Help us confirm it's you' page can't be automated through, and the categories phase still wedges because it needs a UI tab click for GraphQL discovery. Even so, runs now consistently reach 3-4/6 partial outcomes with real data collection, up from 0/6 before.
