This PR adds a configurable chunk size to semble. The chunk size defaults to 750, matching today's behavior, but can be overridden by setting the SEMBLE_CHUNK_SIZE environment variable. I have not documented this option yet because we don't have a great place to do so.
The PR also exposes the chunk size parameter as part of the Python API.
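For readers who want a concrete picture of the override, here is a minimal sketch of how an explicit Python argument, the SEMBLE_CHUNK_SIZE environment variable, and the default of 750 could be resolved. The helper name `resolve_chunk_size`, its parameters, the precedence order, and the validation rules are all assumptions for illustration; they are not taken from the semble code in this PR.

```python
import os

DEFAULT_CHUNK_SIZE = 750  # default value stated in the PR description


def resolve_chunk_size(explicit=None, environ=None):
    """Hypothetical helper, not semble's actual API.

    Assumed precedence: explicit argument, then SEMBLE_CHUNK_SIZE,
    then the built-in default.
    """
    env = os.environ if environ is None else environ
    if explicit is not None:
        return explicit

    raw = env.get("SEMBLE_CHUNK_SIZE")
    if raw is None or raw.strip() == "":
        return DEFAULT_CHUNK_SIZE

    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"SEMBLE_CHUNK_SIZE must be an integer, got {raw!r}"
        ) from None
    if value <= 0:
        raise ValueError(f"SEMBLE_CHUNK_SIZE must be positive, got {value}")
    return value
```

Failing loudly on a malformed value is one design choice; silently falling back to the default would be another, at the cost of hiding typos in the environment.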
The indentation change in from_git causes token-savings stats to silently go empty for every git-indexed repository; merging as-is would ship a quiet regression on that code path.
The refactoring in from_git moved the return SembleIndex(...) outside the with tempfile.TemporaryDirectory() context manager. SembleIndex.__init__ calls _compute_file_sizes(root) synchronously, but the temp dir is already deleted at that point, so every file read silently fails and _file_sizes is always empty for git repos. This affects stats reporting on every from_git call. The rest of the changes — threading desired_chunk_length through the call stack, renaming the metadata key, env-var parsing — are clean and well-tested.
src/semble/index/index.py — the from_git method needs the return SembleIndex(...) block moved back inside the with tempfile.TemporaryDirectory() context.
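The lifetime issue described above can be reproduced in isolation. The sketch below is a toy model, not the semble source: `ToyIndex`, `_checkout`, `build_outside`, and `build_inside` are hypothetical names, and the only assumption carried over from the review is that the constructor reads file sizes eagerly and swallows read errors.

```python
import tempfile
from pathlib import Path


class ToyIndex:
    """Stand-in for an index that computes file sizes eagerly in __init__."""

    def __init__(self, root, paths):
        self.file_sizes = {}
        for rel in paths:
            try:
                self.file_sizes[rel] = (Path(root) / rel).stat().st_size
            except OSError:
                # Mirrors the silent failure described in the review.
                pass


def _checkout(root):
    """Pretend to clone a repository by writing one file into root."""
    (Path(root) / "a.py").write_text("print('hi')\n")
    return ["a.py"]


def build_outside():
    # Buggy shape: the index is constructed after the with block exits,
    # so the temporary directory has already been deleted.
    with tempfile.TemporaryDirectory() as tmp:
        paths = _checkout(tmp)
    return ToyIndex(tmp, paths)


def build_inside():
    # Fixed shape: the index is constructed while the directory still exists.
    with tempfile.TemporaryDirectory() as tmp:
        paths = _checkout(tmp)
        return ToyIndex(tmp, paths)
```

In this model `build_outside().file_sizes` comes back empty while `build_inside().file_sizes` contains an entry for `a.py`, which is the same shape as the regression flagged for from_git.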