Add merged structure IO and mmCIF parity #74
Open
heathcliff233 wants to merge 2 commits into steineggerlab:master from
- add Python split compression and merged fragment database reads
- expose source fragment indices for merged entries
- support format-selectable decompression in Python and CLI (pdb|mmcif|cif)
- add shared mmCIF atom writer in C++ output path
- harden tar/db output path handling with parent-directory checks
- expand tests and docs using existing multichain fixture
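As an illustration of the "parent-directory checks" item, here is a minimal Python sketch of one possible reading: rejecting output entries that would resolve outside the chosen output directory. The function name `safe_output_path` is hypothetical and is not taken from this PR, whose actual checks may work differently (for example, they may only verify that the parent directory exists).

```python
import os


def safe_output_path(base_dir, entry_name):
    """Return the resolved path for entry_name, or raise if it escapes base_dir.

    Hypothetical helper for illustration only; not part of foldcomp.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, entry_name))
    # commonpath equals base only when target is base itself or lies below it.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"entry {entry_name!r} resolves outside {base_dir!r}")
    return target
```

With this sketch, an entry named `sub/x.pdb` resolves under the output directory, while `../escape.pdb` or an absolute path elsewhere raises `ValueError`.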
Hi authors, thanks for the great tool and for the effort you put into maintaining it. I have checked the error log and followed the black formatting requirements. The other errors appear to be on the GitHub server side, where apt install failed. This PR aims to add protein multimer support on top of the current storage format. It seems that segment storage is already supported, so I reuse it for multi-chain support and add a layer that allows sample-level iteration (with an additional mmCIF write option via gemmi). Hope it helps. Best,
Summary
This PR adds practical multi-chain support while keeping FCZ chunk storage unchanged (single-chain per chunk).
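The idea of one chain per chunk with a merge layer on top can be sketched in plain Python. This is not the PR's implementation and does not call foldcomp; the helpers `atom_line`, `split_by_chain`, and `merge_fragments` are hypothetical names used only to show how per-chain fragments might be recombined while tracking which fragment each record came from.

```python
from collections import OrderedDict


def atom_line(serial, chain, name="CA", res="GLY", resseq=1):
    """Build a minimal PDB-style ATOM record prefix (chain ID at column 22)."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}"


def split_by_chain(pdb_text):
    """Group ATOM/HETATM records by chain ID, preserving first-seen order."""
    chains = OrderedDict()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            chains.setdefault(line[21], []).append(line)
    return chains


def merge_fragments(fragments):
    """Concatenate fragments and record the source fragment index per record."""
    merged, source_index = [], []
    for idx, lines in enumerate(fragments):
        merged.extend(lines)
        source_index.extend([idx] * len(lines))
    return merged, source_index


records = [atom_line(1, "A"), atom_line(2, "A"), atom_line(3, "B")]
chains = split_by_chain("\n".join(records))
merged, source_index = merge_fragments(list(chains.values()))
```

In this sketch each value in `chains` stands in for a single-chain chunk, and `source_index` plays the role of the source fragment indices exposed for merged entries.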
Highlights
- `compress(..., split=True)` and `open(..., merge_fragments=True)`
- `pdb | mmcif | cif`
- `foldcomp decompress --output-format ...`
Compatibility
Validation
- `conda run -n foldcomp pytest test -q` → 12 passed
- `conda run -n foldcomp ./build.sh test` → pass