Add merged structure IO and mmCIF parity #74

Open
heathcliff233 wants to merge 2 commits into steineggerlab:master from heathcliff233:feature/merged-structure-io

@heathcliff233

Summary

This PR adds practical multi-chain support while keeping FCZ chunk storage unchanged (single-chain per chunk).

Highlights

  • Added split/merge workflow for multi-chain/discontinuous structures:
    • Python: compress(..., split=True) and open(..., merge_fragments=True)
  • Added Python/CLI format parity for decompression output:
    • pdb | mmcif | cif
    • CLI: foldcomp decompress --output-format ...
  • Added shared mmCIF writer path in C++ output utilities.
  • Improved robustness by creating/checking parent directories for tar/db outputs.
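
The split/merge idea above can be illustrated with a minimal, library-free sketch. It is not the PR's implementation: the helper names `split_by_chain` and `merge_chains` are hypothetical, and the sketch only assumes standard PDB column layout (chain ID in column 22). It shows how a multi-chain structure could be broken into single-chain fragments, which fit the one-chain-per-chunk storage, and then reassembled for sample-level access.

```python
def split_by_chain(pdb_text):
    """Group ATOM/HETATM records by chain ID.

    Hypothetical helper: the chain ID is read from column 22 of each
    PDB record (index 21 in a zero-based string).
    """
    chains = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            # dicts preserve insertion order, so chains keep file order
            chains.setdefault(line[21], []).append(line)
    return chains


def merge_chains(chains):
    """Concatenate single-chain fragments back into one PDB text.

    Hypothetical helper: each chain is closed with a TER record and the
    file with END; all other header records are dropped in this sketch.
    """
    lines = []
    for chain_lines in chains.values():
        lines.extend(chain_lines)
        lines.append("TER")
    lines.append("END")
    return "\n".join(lines) + "\n"
```

In the PR itself, the corresponding entry points are `compress(..., split=True)` and `open(..., merge_fragments=True)`; each fragment produced by a split like the one sketched here would be stored as its own single-chain chunk.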

Compatibility

  • Backward compatible with existing FCZ/DB files.
  • Single-chain behavior remains unchanged.

Validation

  • conda run -n foldcomp pytest test -q → 12 passed
  • conda run -n foldcomp ./build.sh test → pass

- add Python split compression and merged fragment database reads

- expose source fragment indices for merged entries

- support format-selectable decompression in Python and CLI (pdb|mmcif|cif)

- add shared mmCIF atom writer in C++ output path

- harden tar/db output path handling with parent-directory checks

- expand tests and docs using existing multichain fixture
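
As an illustration of the parent-directory hardening listed above, here is a minimal Python sketch. The function name `ensure_parent_dir` is hypothetical and the PR's actual check may be written differently (for example in the C++ output path); the sketch only shows the general pattern of validating and creating the directory before a tar or database file is written.

```python
import os


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing.

    Hypothetical helper: raises NotADirectoryError when the parent path
    exists but is a regular file, so the failure surfaces before any
    output is written.
    """
    parent = os.path.dirname(os.path.abspath(path))
    if os.path.exists(parent) and not os.path.isdir(parent):
        raise NotADirectoryError(parent)
    # exist_ok=True makes the call safe to repeat
    os.makedirs(parent, exist_ok=True)
    return parent
```

Calling such a helper once, right before opening the output file, keeps a missing directory from failing the write part-way through a batch.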
@heathcliff233

Hi authors, thanks for the great tool and for your efforts in maintaining it. I have checked the error log and applied the required black formatting. The remaining errors appear to come from the GitHub server side, where apt install failed.

This PR aims to add protein multimer support on top of the current storage format. Since segment storage already appears to be supported, I reuse it for multi-chain support and add a layer that allows sample-level iteration (plus an additional mmCIF write option via gemmi). I hope it helps.

Best,
Liang
