Indeed, it seems that not all records are pulled for events_internal. The model is therefore never aware of the missing ones, and so it also generates too few.

Could it be that the keys do not match, or are of different types? Could you please share the output of these lines:

print(f"users: {len(users)} rows, {len(set(users.user_id))} unique user ids")
print(f"events_internal: {len(events_internal)} rows, {len(set(events_internal.user_id))} unique user ids")
print(f"{len(set(events_internal.user_id) - set(users.user_id))} user ids in events_internal but not in users")
print(f"{len(set(users.user_id) - set(events_internal.user_id))} user ids in users but not in events_internal")
print(f"{users.user_id.dtype…
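If the dtypes do turn out to differ, the checks above can be bundled into one helper that also tests whether the mismatch disappears once both key columns are compared as strings. This is only a minimal sketch on made-up toy data: `key_overlap_report` is a hypothetical helper name, and the toy `users` and `events_internal` frames merely stand in for the real tables.

```python
import pandas as pd


def key_overlap_report(parent: pd.DataFrame, child: pd.DataFrame, key: str) -> dict:
    """Summarise how well child[key] lines up with parent[key] (hypothetical helper)."""
    parent_ids = set(parent[key].dropna())
    child_ids = set(child[key].dropna())
    # Same comparison again, but with both sides cast to str, to see whether
    # the orphans are caused purely by a type mismatch (e.g. int vs. str keys).
    parent_str = set(parent[key].dropna().astype(str))
    child_str = set(child[key].dropna().astype(str))
    return {
        "parent_dtype": str(parent[key].dtype),
        "child_dtype": str(child[key].dtype),
        "same_dtype": bool(parent[key].dtype == child[key].dtype),
        "orphans_in_child": len(child_ids - parent_ids),
        "parents_without_children": len(parent_ids - child_ids),
        "orphans_after_str_cast": len(child_str - parent_str),
    }


# Toy data (an assumption for illustration): integer keys in users,
# string keys in events_internal, plus one genuinely unknown user "4".
users = pd.DataFrame({"user_id": [1, 2, 3]})
events_internal = pd.DataFrame({"user_id": ["1", "1", "2", "4"]})

report = key_overlap_report(users, events_internal, "user_id")
for name, value in report.items():
    print(f"{name}: {value}")
```

A large `orphans_in_child` count that drops sharply in `orphans_after_str_cast` would point to a type mismatch rather than genuinely missing users; in that case, casting both key columns to the same type before training is one thing to try.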

Answer selected by khazana