feat: added session search api#259
Conversation
slyt3
left a comment
There was a problem hiding this comment.
I ran go test ./... locally and it fails after this PR due to pgvector not being enabled on the go side
the error:
ERROR: type "vector" does not exist (SQLSTATE 42704)
when GORM migrates
tasks.embedding vector(1536)
I think we need to ensure CREATE EXTENSION IF NOT EXISTS vector; that its executed before AutoMigrate in go and in go test db setup and the repo tests will pass I am pretty sure.
Good start, but please fix the Go pgvector/CI test failure and address the follow-ups noted in inline comments
|
I have a option in the sdk design, I think the session search should be able to be scoped, for example, within a user (https://docs.acontext.io/store/per-user), even can be required (since pgvector is low-effificent in large scale embedding search yet great for a small pool). So maybe the SDK should be: r = client.users.search_sessions(user='xxxx', query="xxx", topk=10)or client.sessions.search(query=..., user='xxxx')Also, are you using any coding agent? Acontext has a |
@gusye1234 sure, will make changes to make this endpoint to be user scoped, i currently impl something close to r = client.sessions.search(query=..., user='xxxx')then we want user_id to be mandatory arg in both sdk and api, if this is fine, I'll proceed with option 2 of what you suggested. |
cef8bfc to
283aa7f
Compare
slyt3
left a comment
There was a problem hiding this comment.
I pulled the latest commits and re-ran tests. The pgvector migration issue is fixed now and go test ./... passes.
I still think we need a couple fixes before merge:
-
session_task_process_statusvalues changed in Go (message.go) topending/processing/completed/failed, but Core + other Go code still usespending/running/success/failed(observing status query + tests). Can we keep this consistent? (I think revert to the old set unless we’re changing it everywhere.) -
Session search scoping: Core SQL only filters by
user_id. Can we also scope byproject_id(or validate the user belongs to the project)? Otherwise it can mix/leak sessions across projects.
Also a question for @gusye1234 :
Core exposes configurable embedding dim/model now, but Go still hard-codes vector(1536) in the tasks schema. Since GORM tags can’t read config at runtime, do we want dim fixed at 1536 everywhere, or truly configurable (needs custom Go migrations)?
Sorry I missed some of this in my earlier review, the PR changed a lot across updates and I was mainly focused on the pgvector/CI failure first.
18f2a6e to
c24176b
Compare
No hard-coeds vector dim is allowed @isuniverseok-ua @slyt3 , in Go api layer, we have a
hi @slyt3 , project_id is not visible to users (they only have api-key), so scoped with project_id is not feasible. But we should check the constrain that user id is bounded within this project, somewhere(maybe during the query of using user string to get the actually user id). |
@gusye1234 so user_id would be a compulsory field, its just that we should derive project_id from context (?), like how do we check that user id is bounded within this project?, like one is ofc do a db call to validate this; or else we make user_id to be optional, like worst case we search across all users in any authenticated project |
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
Signed-off-by: Arnesh Dadhich <arneshdadhich0011@gmail.com>
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
Signed-off-by: Arnesh Dadhich <arnesh.dadhich@unifyapps.com>
5fb6a38 to
9fe5ed5
Compare
Signed-off-by: Arnesh Dadhich <arneshdadhich0011@gmail.com>
|
@isuniverseok-ua The major point is that I think we shall have a way to limit the search scope, to make sure we search with condition, a user scope will be perfect. because now we're using pg_vector, which is not very good for large-scale searching |
@gusye1234 okay, can we do similar thing as done in |
97e1f22 to
ab37bdd
Compare
Why we need this PR?
Added a basic session search functionality using task entities for indexing.
Describe your solution
session_search_service.pyto handle vector similarity search.taskstable using pgvector's cosine distance operator.GET /api/v1/project/{id}/sessions/searchin Python Core.GET /sessions/searchin Go API Server, proxying requests to Core.client.sessions.search(query=...)method to the Python SDK.Implementation Tasks
Please ensure your pull request meets the following requirements:
session_search_service.pywith vector logic/sessions/searchendpoint in Core and APIsearch()method to Python Client SDKImpact Areas
Which part of Acontext would this feature affect?
Checklist
devbranch.