cactus_consolidated OOM during BAR/abPOA phase with 16 genomes on 1 TB node #1914

@Ruiqi-CUB

Description

Hi there,

Thanks for developing this awesome tool! Could you give me some suggestions on the following issue? Is there a recommended strategy for this task?

Input Data:
16 species in the same family, all genomes softmasked with RepeatMasker (33–59% masked)
Genome Size Range: 755 MB – 1.58 GB, Average: ~1.21 GB, Total: ~19.3 GB

Software:
Cactus v2 (commit 00699c2), native install (no Docker/Singularity)

HPC:
SLURM cluster, --batchSystem single_machine, 1 node, 64 CPUs, ~1 TB RAM

Problem

cactus_consolidated is OOM-killed during the abPOA BAR phase at one internal node, Anc01. This node is not the root; it is a mid-level ancestor covering a subset of the species. Jobs run sequentially (one at a time, not in parallel), so this is a single-job memory issue. Reducing --consCores from 60 to 48 only slightly reduced peak memory usage.

Questions

  1. Are there recommended config XML parameters to reduce abPOA BAR memory for closely-related species at this genome size
    (e.g. partialOrderAlignmentWindow, partialOrderAlignmentMaskFilter)?
  2. Would switching partialOrderAlignment="0" (cPecan instead of abPOA) substantially reduce memory for this use case?
  3. Is there any way to split or chunk the cactus_consolidated step for a single node, or is the monolithic design
    fundamental?
  4. Is the table calibrated for vertebrate genomes, and if so, are there recommended values for
    invertebrates with ~1.2 GB genomes?
  5. I have seen that Cactus progressive has successfully aligned >50 mammalian genomes. Do you happen to know what strategies were used, or did they just have a lot more memory on their HPC?
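
To make questions 1 and 2 concrete, this is roughly the kind of override I have in mind. It is only a minimal sketch: `override_attributes` is a hypothetical helper, and the sample fragment (the element name `bar` and the window value `50`) is a placeholder, not copied from the real Cactus config. The helper makes no assumption about which element carries the attribute; it changes the attribute wherever it is already defined.

```python
import xml.etree.ElementTree as ET


def override_attributes(root, overrides):
    """Set each attribute in `overrides` on every element that already defines it.

    Returns the number of attribute values that were actually changed.
    """
    changed = 0
    for elem in root.iter():
        for name, value in overrides.items():
            if name in elem.attrib and elem.attrib[name] != value:
                elem.set(name, value)
                changed += 1
    return changed


# Placeholder fragment standing in for a copy of the config XML.
# The element name and the values here are assumptions for illustration only.
SAMPLE = (
    '<config>'
    '<bar partialOrderAlignment="1" partialOrderAlignmentWindow="50"/>'
    '</config>'
)

root = ET.fromstring(SAMPLE)
# Question 2: switch partialOrderAlignment from "1" to "0".
n_changed = override_attributes(root, {"partialOrderAlignment": "0"})
result = ET.tostring(root, encoding="unicode")
print(n_changed)
print(result)
```

For a real run, the same helper could be applied to a tree loaded with `ET.parse` from a copy of the config file and written back with `tree.write`, leaving the original config untouched.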

Thank you very much!

Best
Ruiqi
