feat(identity): bootstrap anonymousId via config and runtime setter#637

Open
tsushanth wants to merge 1 commit into
PostHog:mainfrom
tsushanth:fix/issue-471-bootstrap-anonymous-id

Conversation

@tsushanth


Why

Closes #471 (and mirrors what `posthog-js#862` is asking for on the JS side).

Two real users on #471 hit the same wall: `Application Installed` and `Application Updated` are captured synchronously during `setup()`, before there's any chance to call `identify()` or to mutate `PostHogStorageManager` (which is public for backwards compatibility but not really intended for app code). Those lifecycle events end up tied to the SDK's auto-generated UUID instead of the caller's account ID.

What

Two additive surfaces. Both honour empty-string as a no-op so a sentinel can't accidentally clobber the stored value.

1. `PostHogConfig.anonymousId: String?` — set before `setup()`. On the first call to `getAnonymousId()`, if nothing is persisted, the SDK uses this value instead of generating a UUID v7. Pre-existing `PostHogConfig.getAnonymousId` (UUID → UUID hook) is unaffected.

```swift
let config = PostHogConfig(apiKey: "phc_...")
config.anonymousId = "A-\(myUserId)"
PostHogSDK.shared.setup(config)
// "Application Installed" now reports anonymousId = "A-..."
```

2. `PostHogSDK.setAnonymousId(_:)` — runtime override, intended to seed the next anonymous session after `reset()`:

```swift
PostHogSDK.shared.reset()
PostHogSDK.shared.setAnonymousId("A-guest-\(UUID().uuidString)")
```

This is the iOS equivalent of `posthog.reset({ bootstrap: { distinctID } })` from the JS proposal in posthog-js#862.
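
The resolution order described above can be sketched as follows. This is a minimal model, not the SDK's code: `AnonymousIdStore` is a hypothetical in-memory stand-in for the storage manager, and it uses a plain `UUID()` where the SDK generates a UUID v7.

```swift
import Foundation

// Hypothetical in-memory stand-in for the SDK's storage; not the real PostHogStorageManager.
final class AnonymousIdStore {
    private var persisted: String?
    private let bootstrap: String?

    init(bootstrap: String? = nil) {
        self.bootstrap = bootstrap
    }

    // Resolution order: persisted value, then non-empty bootstrap, then a fresh UUID.
    func getAnonymousId() -> String {
        if let existing = persisted { return existing }
        let seed: String
        if let bootstrap = bootstrap, !bootstrap.isEmpty {
            seed = bootstrap
        } else {
            seed = UUID().uuidString // assumption: plain UUID instead of the SDK's UUID v7
        }
        persisted = seed
        return seed
    }

    // Runtime override; empty strings are ignored so a sentinel cannot clobber the stored value.
    func setAnonymousId(_ id: String) {
        guard !id.isEmpty else { return }
        persisted = id
    }

    // Clears the persisted ID, as a logout would.
    func reset() { persisted = nil }
}

let store = AnonymousIdStore(bootstrap: "A-42")
let first = store.getAnonymousId()      // nothing persisted yet, so the bootstrap is used
store.setAnonymousId("")                // ignored
let afterEmpty = store.getAnonymousId()
store.reset()
store.setAnonymousId("A-guest-1")       // seeds the next anonymous session
let afterReset = store.getAnonymousId()
```

Calling `setAnonymousId` right after `reset()` matters in this model: if it were skipped, the next `getAnonymousId()` would fall back to the bootstrap value again.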

Tests

Three new specs in `PostHogStorageManagerTest`:

  • bootstrap value picked up on fresh install
  • empty bootstrap falls back to UUID
  • bootstrap re-applies after a `reset()` that clears persisted storage

Note: `swift test` is currently broken on `main` (pre-existing test target rot, see commit 853e716 — "ci: run the iOS test target on a simulator and fix the rot it surfaced"). The SDK target builds clean; the new tests follow the existing Quick/Nimble pattern and should pass on the iOS-simulator CI path.

Compatibility

  • No public API removals, no signature changes.
  • Existing call sites compile unchanged (both new surfaces have defaults / are additive).
  • `@objc` annotations preserved on both new members.

Adds two surfaces for caller-supplied anonymous IDs, mirroring posthog-js
behaviour requested in PostHog#471 (and posthog-js#862):

1. `PostHogConfig.anonymousId` — pre-seeds the anonymous ID before
   `setup()` so events captured synchronously during initialization
   (Application Installed/Updated, screen views) use the caller's ID
   instead of the SDK-generated UUID. Honoured only when no value is
   already persisted.

2. `PostHogSDK.setAnonymousId(_:)` — overrides the persisted anonymous
   ID at runtime, intended to be called immediately after `reset()` to
   seed the next anonymous session.

Both ignore empty strings to avoid clobbering with sentinel values.

Closes PostHog#471
@tsushanth tsushanth requested a review from a team as a code owner June 9, 2026 18:00
@greptile-apps

greptile-apps Bot commented Jun 9, 2026


Comments Outside Diff (1)

  1. PostHog/PostHogStorageManager.swift, line 44-58 (link)

    P2 Bootstrap ID silently re-seeds every session after reset()

    The doc comment on `PostHogConfig.anonymousId` says the bootstrap is "Ignored when an anonymous ID is already persisted," and the new test explicitly verifies it re-applies after `reset()` clears storage. However, a developer who sets `config.anonymousId = "A-\(myUserId)"` to bootstrap the first install will find that every subsequent `reset()` (e.g., on logout) re-seeds the same user-specific ID into the next anonymous session, potentially linking a post-logout anonymous session back to the identified user's account without the developer realising it. The doc comment on `PostHogConfig.anonymousId` should call out that this value is reused on every fresh (post-reset) session, not only on first install, so callers can make an informed choice between the config approach and the runtime `setAnonymousId` method.


Reviews (1): Last reviewed commit: "feat(identity): bootstrap anonymousId vi..." | Re-trigger Greptile

Comment on lines +62 to +87
```swift
it("Uses bootstrap anonymousId from config on fresh install") {
    let config = PostHogConfig(projectToken: "test_project_token")
    config.anonymousId = "A-bootstrap-id-123"
    let sut = self.getSut(config)

    let anonymousId = sut.getAnonymousId()
    expect(anonymousId) == "A-bootstrap-id-123"

    // Subsequent calls return the same persisted value.
    expect(sut.getAnonymousId()) == "A-bootstrap-id-123"

    sut.reset(true)
}

it("Ignores empty bootstrap anonymousId and falls back to UUID") {
    let config = PostHogConfig(projectToken: "test_project_token")
    config.anonymousId = ""
    let sut = self.getSut(config)

    let anonymousId = sut.getAnonymousId()
    expect(anonymousId.isEmpty) == false
    // Should be a UUID, not an empty string.
    expect(UUID(uuidString: anonymousId)) != nil

    sut.reset(true)
}
```


P2 Non-parameterised tests for bootstrap value handling

The first two new specs ("Uses bootstrap anonymousId…" and "Ignores empty bootstrap…") differ only in the input value and the expected outcome. Per the project's preference for parameterised tests, these could be expressed as a single `ParameterizedTest` or Quick `sharedExamples` block driven by `("A-bootstrap-id-123", "A-bootstrap-id-123")` and `("", nil)` entries, making it easy to add more sentinel/edge-case inputs later without duplicating the setup and teardown code.
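
A rough, framework-free sketch of the table-driven shape suggested here. It uses a plain loop rather than Quick's `sharedExamples`, and `resolveAnonymousId(bootstrap:)` is a hypothetical stand-in for the storage manager's first `getAnonymousId()` call. The `(nil, nil)` row is an extra case added for illustration.

```swift
import Foundation

// Hypothetical resolver standing in for the storage manager's first getAnonymousId() call.
func resolveAnonymousId(bootstrap: String?) -> String {
    if let bootstrap = bootstrap, !bootstrap.isEmpty { return bootstrap }
    return UUID().uuidString
}

// Each row: bootstrap input, expected exact ID (nil means "any valid UUID").
let cases: [(input: String?, expected: String?)] = [
    ("A-bootstrap-id-123", "A-bootstrap-id-123"),
    ("", nil),
    (nil, nil),
]

// Collect a message for every row whose outcome does not match its expectation.
var failures: [String] = []
for (input, expected) in cases {
    let actual = resolveAnonymousId(bootstrap: input)
    let ok = expected.map { $0 == actual } ?? (UUID(uuidString: actual) != nil)
    if !ok { failures.append("input \(String(describing: input)) produced \(actual)") }
}
```

Adding another sentinel input then means adding one row to `cases` instead of another copy of the setup and teardown.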


@dustinbyrne dustinbyrne left a comment

Contributor


Hey @tsushanth, thank you for the pull request!

You mentioned that application lifecycle events are being recorded with an anonymous identifier. Once identify() is called, those anonymous events should become linked with the identified person. It sounds like this isn't what you want, so out of curiosity, could you describe what the specific gap is? It'd be helpful for me to understand if there's a particular usage pattern that we're not accounting for here.

@dustinbyrne dustinbyrne requested a review from a team June 11, 2026 15:38
@tsushanth

tsushanth commented Jun 11, 2026

Author

Hey @dustinbyrne — thanks for the look.

You're right that once identify() runs, anonymous events become linked to the identified person via $anon_distinct_id. The gap is the events captured before identify() — they're still attributed to the SDK-generated UUID, and there are a few flows where the host app already knows a better anonymous ID at install time:

1. Cross-SDK / cross-platform funnels. Apps that ship a backend or a web SDK alongside iOS often want a single anonymous identity across all three before the user signs in. With bootstrap, web hands the iOS app its anonymous_id cookie value (via a deep link or login handoff) and iOS picks up the same key from the very first event, rather than generating a throwaway UUID that has to be merged later. This matches what posthog-js and posthog-android already do.

2. Stable feature-flag bucketing pre-identify. Feature flags evaluated before sign-in bucket by the anonymous ID. When that ID is deterministic from the app's own auth/install attribution layer, A/B exposures are stable across reinstalls (where the iOS SDK would otherwise mint a new UUID and shuffle the bucket).

3. Application Installed / Application Updated events. Those fire synchronously during setup() — there's no clean window to call setAnonymousId before they go out. Bootstrap is the only way to land them on a caller-controlled ID without monkey-patching the lifecycle hooks.

The PR is the matching iOS surface for the bootstrap option that already exists on the JS/Android SDKs, so this is mostly an SDK-family-parity move.


Re: the Greptile note about reset() semantics — that's a fair callout and worth tightening in the doc. As written, config.anonymousId is re-applied on any session where storage is empty, which includes the post-reset() (e.g. logout) case, not just first install. For someone who bootstraps with something like an interpolated "A-<userId>" value and then calls reset() on logout, the next anonymous session re-seeds the same ID and silently re-links activity to the prior identity.

Two cleanish options if you'd like me to follow up:

  • (a) Doc-only: explicitly call out that config.anonymousId is reused on every fresh (post-reset()) session, and steer callers who want install-only behavior toward setting it once via setAnonymousId(...) from app code after detecting first launch.
  • (b) Track a one-shot "bootstrap_applied" flag in persistent storage so config.anonymousId only seeds genuinely first-install sessions. (posthog-android opted for (a) — happy to match.)
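
For reference, option (b) might look roughly like the sketch below. `OneShotBootstrapStore` and its `bootstrapApplied` flag are hypothetical names, the flag is held in memory here although it would need to live in persistent storage, and a plain `UUID()` stands in for the SDK's UUID v7.

```swift
import Foundation

// Hypothetical storage model for option (b); names are illustrative only.
final class OneShotBootstrapStore {
    private var anonymousId: String?
    private var bootstrapApplied = false   // assumption: would be persisted and survive reset()
    private let bootstrap: String?

    init(bootstrap: String?) { self.bootstrap = bootstrap }

    func getAnonymousId() -> String {
        if let existing = anonymousId { return existing }
        let seed: String
        if !bootstrapApplied, let bootstrap = bootstrap, !bootstrap.isEmpty {
            seed = bootstrap
            bootstrapApplied = true        // the bootstrap is consumed exactly once
        } else {
            seed = UUID().uuidString
        }
        anonymousId = seed
        return seed
    }

    // Clears the anonymous ID but deliberately keeps the one-shot flag.
    func reset() { anonymousId = nil }
}

let oneShot = OneShotBootstrapStore(bootstrap: "A-42")
let installId = oneShot.getAnonymousId()     // first install: bootstrap applied
oneShot.reset()                              // e.g. logout
let postLogoutId = oneShot.getAnonymousId()  // fresh UUID, not the bootstrap
```

The trade-off against option (a) is one extra persisted key in exchange for the post-logout session never inheriting a user-specific bootstrap value.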

Happy to push either as a follow-up commit on this branch — let me know your preference.

@dustinbyrne

Contributor

Hi @tsushanth, thanks for that information!

What I'm hoping to understand is your specific use case in which identify() and merge are not ideal. Regarding feature flag evaluations, there are some different strategies for maintaining stable flag evaluations across distinct_id changes (e.g., device bucketing, persisted feature flags, using setPropertiesForFlags). How would you be seeding a known anonymous identifier on initialization?

> The PR is the matching iOS surface for the bootstrap option that already exists on the JS/Android SDKs, so this is mostly an SDK-family-parity move.

iOS and Android both have getAnonymousId hooks which can be set on the configuration used when initializing PostHog. Bootstrapping the JS client is somewhat different in that you can provide a distinct_id and indicate whether it's identified or anonymous. This is somewhat of a more common case in web given that it's trivial to seed a known identity via server-side rendering, query parameters, etc.

Sorry for all the questions, I'm mainly trying to understand if this is part of a broader workflow we don't have great support for, or if it's more of a targeted fix specifically in response to #471

@tsushanth

Author

Fair point: `getAnonymousId: ((UUID) -> UUID)` does cover the "customise the UUID" case, and you're right that feature-flag stability has cleaner answers via persisted flags / setPropertiesForFlags / device bucketing. Let me sharpen the gap I'm actually trying to close:

The hook's signature constrains the anonymous ID to a UUID. That's fine when the consumer's "known ID" is itself a UUID (or can be deterministically hashed into one), but the case I keep running into is when the external system hands me a non-UUID string that needs to land on the wire exactly as-is for cross-system joins:

  • Web hands iOS its ph_anonymous_id cookie value via a Universal Link or post-login redirect (e.g. "A-r3K7-cookie-…").
  • A backend issues a stable opaque per-device key for unauth'd telemetry (a JWT sub for an "anonymous session" token, or a server-generated device_session_id).
  • A first-party attribution SDK (AppsFlyer / Adjust install referrer) returns a known string that the server-side analytics pipeline keys off.

In each of those, I want PostHog's first event from iOS to carry that exact string as $distinct_id, so downstream joins (web↔iOS in the same dashboard, server logs↔PostHog events) work without a .identify() merge step or a post-hoc reconciliation table. getAnonymousId can't accept those — I'd have to derive a UUID from the string, at which point the on-wire ID no longer matches what the other system sees, and the join is broken.

Concrete shape:

```swift
// Web has already set the user as anonymous_id = "A-r3K7-cookie-2A8F"
// and handed it to iOS via a Universal Link.
let cookieFromWeb = deepLink.queryItem("anon")  // "A-r3K7-cookie-2A8F"

let config = PostHogConfig(apiKey: "phc_...")
config.anonymousId = cookieFromWeb   // ← what this PR adds
// First event from iOS carries $distinct_id = "A-r3K7-cookie-2A8F", same as web
```

Without that, the same flow needs either (a) a synchronous identify() immediately after setup() (which produces a $identify merge event and bumps the user record, even though there's no real identification happening), or (b) a custom workaround using setAnonymousId after the SDK has already emitted lifecycle events on the SDK-generated UUID.
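
For the web-to-iOS hand-off, pulling the value out of the incoming URL could look something like this sketch. The `anon` parameter name, the example URL, and the helper `anonymousId(fromHandoff:parameter:)` are assumptions for illustration; only `URLComponents` and `URLQueryItem` are real Foundation APIs.

```swift
import Foundation

// Extracts a non-empty query value from a hand-off URL.
// The parameter name and URL shape are assumptions for illustration.
func anonymousId(fromHandoff url: URL, parameter: String = "anon") -> String? {
    guard let components = URLComponents(url: url, resolvingAgainstBaseURL: false),
          let value = components.queryItems?.first(where: { $0.name == parameter })?.value,
          !value.isEmpty
    else { return nil }
    return value
}

let handoff = URL(string: "https://example.com/open?anon=A-r3K7-cookie-2A8F")!
let cookieFromWeb = anonymousId(fromHandoff: handoff)
let missing = anonymousId(fromHandoff: URL(string: "https://example.com/open")!)
```

Returning `nil` for a missing or empty parameter lines up with the empty-string no-op: the caller can assign the result to `config.anonymousId` and still get the SDK-generated ID when the hand-off carries nothing.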

Scope honesty: yes, this is targeted at #471 — but the issue itself was filed against the same gap (the reporter wanted to seed an ID matching their web/server tracking on init). I think it generalises cleanly because it mirrors what posthog-js already exposes via bootstrap.distinctID and what posthog-android exposes via PostHogConfig#getAnonymousId (which on Android does accept a String?, not a UUID transform — that platform asymmetry is what made me reach for this).

If you'd rather not expand the iOS surface and the recommended path is "use setAnonymousId after init and accept the brief window where lifecycle events go out on a throwaway UUID," I'm happy to close — would just want to capture that as guidance in the doc so the next person who hits #471 sees it. Or if you think the right shape is different from what I drafted (e.g. matching JS's bootstrap struct with explicit isIdentified), happy to refactor.

@marandaneto

Member

i think the public api should match https://posthog.com/docs/feature-flags/bootstrapping
so it's an object that takes more things to bootstrap and not only the anon id
also i'd not expose any other public api besides the bootstrap config unless it's needed (does not look like it)

@marandaneto

Member

another concern (is it?) is if the user is already identified with a stable distinct id and a merged anon id, and then someone starts using the bootstrapped distinct id after the user is identified, not sure if this would mess with the user merging
maybe we should use the bootstrapped distinct id only if not identified yet?
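
One possible reading of the two suggestions above, as a sketch only. `BootstrapConfig`, `PersistedIdentity`, and `applyBootstrap` are hypothetical names loosely modelled on the JS bootstrap option, not an existing iOS API, and `featureFlags` is simplified to boolean values.

```swift
import Foundation

// Hypothetical bootstrap object; field names loosely follow the JS bootstrap option.
struct BootstrapConfig {
    var distinctId: String?
    var isIdentifiedId = false
    var featureFlags: [String: Bool] = [:]   // simplification: boolean flags only
}

// Hypothetical snapshot of what the SDK has persisted.
struct PersistedIdentity {
    var distinctId: String?
    var anonymousId: String?
    var isIdentified = false
}

// Applies the bootstrap only when the user is not identified and no anonymous ID is persisted.
func applyBootstrap(_ bootstrap: BootstrapConfig, to identity: PersistedIdentity) -> PersistedIdentity {
    guard !identity.isIdentified,
          identity.anonymousId == nil,
          let id = bootstrap.distinctId, !id.isEmpty
    else { return identity }

    var updated = identity
    if bootstrap.isIdentifiedId {
        updated.distinctId = id
        updated.isIdentified = true
    } else {
        updated.anonymousId = id
    }
    return updated
}

let fresh = applyBootstrap(BootstrapConfig(distinctId: "A-42"), to: PersistedIdentity())
let alreadyIdentified = applyBootstrap(
    BootstrapConfig(distinctId: "A-42"),
    to: PersistedIdentity(distinctId: "user-1", anonymousId: "anon-1", isIdentified: true)
)
```

With the guard in place, an already-identified user keeps the existing distinct id and merged anon id, so a late bootstrap value cannot interfere with the earlier merge.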


Development

Successfully merging this pull request may close these issues.

Ability to set anonymous ID manually

3 participants