@ankit-v2-3
Collaborator

Pull Request

Description:
Add RTStream Audio and Transcript Support, Clip Support

Changes:

  • RTStream Audio and Transcript Support
  • Clip Support
  • Fix upload from a get collections object, where uploads were going to the default collection
  • Add scene_index_id, scene_index_name, metadata in Shot
  • Fix 0 score threshold in default search
  • Add segmentation_type llm in spoken word index

0xrohitgarg and others added 30 commits January 16, 2026 23:40
- Rename 'api_url' to 'base_url' in CaptureClient constructor
- Remove 'session_id' from CaptureClient constructor; it is now fetched via 'session_token'
- Implement 'fetch_session_id' to retrieve session ID and 'callback_url' from backend
- Add 'CaptureClient.create()' async factory method for initialization
- Implement structured 'Channels' object with 'mics', 'displays', and 'system_audio' collections
- Rename 'start_capture' to 'start_capture_session' and remove 'callback_url' arg
- Internalize 'callback_url' handling from session token response
- Update .gitignore to ignore test_local.py
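For reviewers unfamiliar with the pattern, here is a minimal sketch of an async factory method like the 'CaptureClient.create()' described above. The class name `CaptureClientSketch`, the placeholder backend call, and the returned values are all assumptions for illustration; this is not the SDK's implementation.

```python
import asyncio


class CaptureClientSketch:
    """Hypothetical stand-in for illustration; not the SDK's CaptureClient."""

    def __init__(self, base_url: str, session_token: str):
        self.base_url = base_url
        self.session_token = session_token
        # Filled in by fetch_session_id(); __init__ cannot await, hence the factory.
        self.session_id = None
        self.callback_url = None

    @classmethod
    async def create(cls, base_url: str, session_token: str) -> "CaptureClientSketch":
        # Build the object, then finish the async part of initialization.
        client = cls(base_url, session_token)
        await client.fetch_session_id()
        return client

    async def fetch_session_id(self) -> None:
        # Placeholder for a backend request (assumption: the real call is a
        # network round trip that returns both values).
        await asyncio.sleep(0)
        self.session_id = f"session-for-{self.session_token}"
        self.callback_url = f"{self.base_url}/callback"


async def main() -> CaptureClientSketch:
    return await CaptureClientSketch.create("https://example.invalid", "tok")
```

Callers would write `client = await CaptureClientSketch.create(...)` instead of calling the constructor directly, so the session ID and callback URL are always populated before use.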
- Added  and  to  for cleaner indexing API.
- Added  to  to easily retrieve streams.
- Cleaned up  callback/websocket logic.
- CaptureSession.get_rtstream() uses simple name matching for standardized callback names
- Added index_audio() and index_visuals() wrappers to RTStream
…ata in client layer

- Add RTStreamChannelType constants (mic, screen, system_audio) to _constants.py
- Simplify CaptureSession.get_rtstream to use channel_id matching
- Move rtstream normalization (rtstream_id -> id) to client.py and collection.py
- Add channel_id attribute to RTStream class
- Clean up CaptureSession._update_attributes
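The normalization and channel_id matching in this commit could look roughly like the sketch below. The helper names `normalize_rtstream` and `find_by_channel_id` and the payload shape are hypothetical; only the 'rtstream_id' -> 'id' rename and the channel_id lookup come from the commit message.

```python
from typing import Optional


def normalize_rtstream(data: dict) -> dict:
    """Return a copy of the payload with 'rtstream_id' renamed to 'id'."""
    normalized = dict(data)  # copy so the caller's payload is not mutated
    if "rtstream_id" in normalized and "id" not in normalized:
        normalized["id"] = normalized.pop("rtstream_id")
    return normalized


def find_by_channel_id(rtstreams: list, channel_id: str) -> Optional[dict]:
    """Return the first stream whose channel_id matches, or None."""
    for stream in rtstreams:
        if stream.get("channel_id") == channel_id:
            return stream
    return None


# Hypothetical backend payloads, normalized once in the client layer.
raw = [
    {"rtstream_id": "rts-1", "channel_id": "mic"},
    {"rtstream_id": "rts-2", "channel_id": "screen"},
]
streams = [normalize_rtstream(item) for item in raw]
```

Doing the rename once at the client boundary means downstream classes only ever see `id`.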
Rename the CaptureClient parameter from upload_token to client_token for consistency with the generate_client_token() method. This improves API naming consistency and matches user expectations.

- Rename __init__ parameter: upload_token -> client_token
- Update docstring to reflect new parameter name
- Keep "uploadToken" in binary protocol payload (required by recorder)
- Add session_token parameter to connect() function as alternative to api_key
- Update Connection class to accept and handle session_token authentication
- Update .gitignore to exclude videodb-recorder binary
- Update capture_bin package manifest

This enables frontend clients to create WebSocket connections using
time-bound session tokens for capture operations.
Changed store from hardcoded True to a configurable property that defaults to False.
- Reorganize capture binaries into platform folders (darwin_arm64, darwin_x86_64, win_amd64)
- Update package_data glob pattern to include subfolders (bin/**/*)
- Add platform detection logic to select correct binary at runtime
- Export RTStreamChannelType from videodb package
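A rough sketch of the runtime platform detection mentioned above, using only the standard library. The function name `platform_folder` and the error message are assumptions; the folder names are the ones listed in this commit.

```python
import platform
from typing import Optional

# (system, machine) pairs, lowercased, mapped to the folder names from the commit.
_PLATFORM_FOLDERS = {
    ("darwin", "arm64"): "darwin_arm64",
    ("darwin", "x86_64"): "darwin_x86_64",
    ("windows", "amd64"): "win_amd64",
}


def platform_folder(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Pick the binary folder for the current (or given) platform."""
    # platform.system() returns e.g. "Darwin" or "Windows";
    # platform.machine() returns e.g. "arm64", "x86_64", or "AMD64".
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return _PLATFORM_FOLDERS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"Unsupported platform: {system}/{machine}") from None
```

The optional arguments exist only so the mapping can be exercised on any machine; real code would likely call it with no arguments.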
Use errors="replace" when decoding stdout/stderr from recorder binary
to handle Windows console encoding (CP1252) gracefully.
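To illustrate the decoding change: bytes that are not valid in the chosen codec become U+FFFD instead of raising `UnicodeDecodeError`. The helper name `decode_output` and the UTF-8 default are assumptions for this sketch; the commit only states that errors="replace" is used.

```python
def decode_output(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode process output without raising on undecodable bytes."""
    # errors="replace" substitutes U+FFFD for any byte sequence that is
    # invalid in the target encoding.
    return raw.decode(encoding, errors="replace")


# 0xE9 is "é" in CP1252 but an incomplete sequence in UTF-8.
cp1252_bytes = "café".encode("cp1252")
decoded = decode_output(cp1252_bytes)
```

With strict decoding the same input would raise, which is the failure mode this commit avoids when reading the recorder's stdout/stderr.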
The second __all__ was overwriting the first, causing capture classes
(CaptureClient, Channel, etc.) to be missing from exports.
- Remove capture_bin/ from main SDK repo (will live in separate repo)
- Add ChannelList class with .default property for cleaner API
- Change API: channels.default_mic -> channels.mics.default
- Export ChannelList from package

Breaking change: channels.default_mic, channels.default_display,
channels.default_system_audio replaced with channels.mics.default,
channels.displays.default, channels.system_audio.default
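A minimal sketch of how a list type with a `.default` property supports the new access pattern. The `Channel` fields and the rule that the default is the first entry are assumptions; the SDK may choose the default differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Channel:
    # Hypothetical fields for illustration only.
    id: str
    name: str


class ChannelList(list):
    """A list of channels with a convenience accessor."""

    @property
    def default(self) -> Optional[Channel]:
        # Assumption: the first channel is treated as the default.
        return self[0] if self else None


@dataclass
class Channels:
    mics: ChannelList = field(default_factory=ChannelList)
    displays: ChannelList = field(default_factory=ChannelList)
    system_audio: ChannelList = field(default_factory=ChannelList)


channels = Channels(mics=ChannelList([Channel("mic-1", "Built-in Microphone")]))
```

Migration then becomes a mechanical rename: `channels.default_mic` turns into `channels.mics.default`, and likewise for displays and system audio.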
- Bump videodb-capture-bin from >=0.2.4 to >=0.2.5
- Remove capture_bin from .gitignore (moved to separate repo)
5 participants