Add mppdisttool: unified MPP distribution tool #407

Draft
underwoo wants to merge 2 commits into NOAA-GFDL:main from underwoo:mppdisttool

Summary

Adds mppdisttool, a single pure-C binary with a subcommand interface that replaces seven separate MPP distribution programs:

Legacy tool        Subcommand
mppncscatter       scatter
mppnccombine       combine --mpp
combine-ncc        combine --land
scatter-ncc        scatter --land
decompress-ncc     decompress
iceberg_comb.sh    combine --iceberg / check --iceberg
combine_restarts   auto

The new binary eliminates the Fortran and NCO (ncrcat, ncatted) runtime dependencies, making these tools usable on HPC login, data-transfer, and service nodes that frequently lack the full Fortran module stack or NCO.

Subcommands

  • combine — auto-detects file type (compressed-by-gathering → land path; iceberg restart → iceberg path; otherwise MPP path); mode can be forced with --land, --iceberg, or --mpp
  • scatter — MPP domain decomposition or --land CF compressed scatter
  • decompress — expand compressed-by-gathering files to full grid
  • check — --compressed (replaces is-compressed) or --iceberg; exit codes preserved
  • auto — scans CWD for *{res,nc}.#### files and routes each group through the appropriate combine path (replaces combine_restarts)
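As an illustration of how the auto subcommand might recognise distributed files, here is a minimal sketch using POSIX regcomp/regexec. The function name split_distributed_name, the exact regular expression, and the assumption that "####" means exactly four digits are all hypothetical and are not taken from cmd_auto.c.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `name` ends in "res.NNNN" or "nc.NNNN"
 * (four digits is an assumption read off the "####" in the description),
 * copy the part before the ".NNNN" suffix into `group` and return 1.
 * Returns 0 when the name does not match or `group` is too small. */
int split_distributed_name(const char *name, char *group, size_t group_size)
{
    regex_t re;
    regmatch_t m[2];
    int matched = 0;

    /* Group 1 captures everything up to and including "res" or "nc". */
    if (regcomp(&re, "^(.*(res|nc))\\.[0-9]{4}$", REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, name, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < group_size) {
            memcpy(group, name + m[1].rm_so, len);
            group[len] = '\0';
            matched = 1;
        }
    }
    regfree(&re);
    return matched;
}
```

Files that yield the same group string would then be combined together. The sketch compiles the pattern on every call for brevity; a directory scan would more likely compile it once before the readdir loop.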

Backward compatibility

install-exec-hook installs symlinks for all six legacy binary names. argv[0] detection maps each symlink name to the correct subcommand and flags at startup, so existing scripts require no changes.
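The argv[0] mapping can be pictured as a small lookup table. The sketch below is illustrative only: the struct, the function name compat_lookup, and the choice of which six names appear are assumptions (the rows are copied from the Summary table, leaving out iceberg_comb.sh because it is a script rather than a binary), not the contents of main.c.

```c
#include <stddef.h>
#include <string.h>

/* One row per legacy name: the subcommand and flag it maps to.
 * Names and layout are hypothetical, not taken from main.c. */
struct compat_entry {
    const char *legacy_name;
    const char *subcommand;
    const char *flag;        /* NULL when no extra flag is implied */
};

static const struct compat_entry compat_table[] = {
    { "mppncscatter",     "scatter",    NULL     },
    { "mppnccombine",     "combine",    "--mpp"  },
    { "combine-ncc",      "combine",    "--land" },
    { "scatter-ncc",      "scatter",    "--land" },
    { "decompress-ncc",   "decompress", NULL     },
    { "combine_restarts", "auto",       NULL     },
};

/* Return the row matching the basename of argv0, or NULL when the
 * program was invoked under its own name (or any unknown name). */
const struct compat_entry *compat_lookup(const char *argv0)
{
    const char *base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;

    for (size_t i = 0; i < sizeof compat_table / sizeof compat_table[0]; i++) {
        if (strcmp(base, compat_table[i].legacy_name) == 0)
            return &compat_table[i];
    }
    return NULL;
}
```

When the lookup returns a row, the caller would behave as if the subcommand and flag had been typed explicitly; when it returns NULL, argv[1] is parsed as the subcommand in the usual way.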

Source layout

src/mpp-disttool/
├── main.c               argv[0] compat + subcommand dispatch table
├── cmd_combine.c/h      MPP (mppnccombine, modernised nc_* API),
│                        land (combine-ncc.F90 port),
│                        iceberg (no ncrcat/ncatted)
├── cmd_scatter.c/h      MPP (mppncscatter/domain.c),
│                        land (scatter-ncc.F90 port)
├── cmd_decompress.c/h   decompress-ncc.F90 port
├── cmd_check.c/h        is-compressed + iceberg check (no ncdump)
├── cmd_auto.c/h         combine_restarts port (POSIX opendir + regcomp)
├── compress.c/h         nfu_compress.F90 port + rank_ascending merge-sort
├── nc_utils.c/h         nfu.F90 helpers (clone_dim/var, format detect)
├── domain.c/h           mpp_compute_extent, hyperslabcopy, scatter_dims
├── strlist.c/h          copied from mpp-ncscatter
└── xmalloc.c/h          copied from mpp-ncscatter

Sources are placed in mpp-disttool/ (not mppdisttool/) to avoid automake creating a same-named object subdirectory that shadows the binary target under make -j.
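For readers unfamiliar with compressed-by-gathering files, the core of what decompress does can be sketched as a scatter of packed values onto a fill-initialised grid. This is a simplified illustration of the CF technique, not code from cmd_decompress.c or compress.c: the function name expand_gathered is hypothetical, and it assumes double data, a zero-based index list, and a grid already flattened to one dimension.

```c
#include <stddef.h>

/* Expand a packed vector onto a full grid.
 *   packed[i] holds the value for full-grid cell index[i];
 *   every cell not listed in index[] receives fill_value.
 * Returns 0 on success, or -1 as soon as an index falls outside
 * the grid (cells written before that point keep their values). */
int expand_gathered(const double *packed, const int *index, size_t n_packed,
                    double *full, size_t n_full, double fill_value)
{
    for (size_t k = 0; k < n_full; k++)
        full[k] = fill_value;

    for (size_t i = 0; i < n_packed; i++) {
        if (index[i] < 0 || (size_t)index[i] >= n_full)
            return -1;
        full[index[i]] = packed[i];
    }
    return 0;
}
```

The land combine path needs the inverse bookkeeping as well, since each input file carries its own index list.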

Build system changes

  • src/Makefile.am — new mppdisttool target; install-exec-hook for six compatibility symlinks
  • configure.ac — adds AC_PROG_LN_S so $(LN_S) is defined for the install hook

Test plan

  • autoreconf -i && configure && make -j builds cleanly
  • make install installs mppdisttool and all six symlinks
  • Each symlink dispatches to the correct subcommand (--help output spot-check)
  • Existing test suite passes: make check for mppncscatter, mppnccombine, combine-ncc, scatter-ncc, decompress-ncc, iceberg_comb, is-compressed, combine_restarts
  • ldd mppdisttool confirms no Fortran runtime or NCO library dependency

Closes #405 (parallel build work tracked separately on nfu.parallel.build)

🤖 Generated with Claude Code

Adds a single pure-C binary, mppdisttool, with a subcommand interface
that replaces mppncscatter, mppnccombine, combine-ncc, scatter-ncc,
decompress-ncc, iceberg_comb.sh, and combine_restarts.  The new binary
eliminates the Fortran and NCO (ncrcat, ncatted) runtime dependencies,
making it usable on HPC nodes that lack the full Fortran module stack.

Subcommands: combine, scatter, decompress, check, auto
Backward-compat symlinks installed for six legacy binary names.
Auto-detection selects MPP, land (compressed-by-gathering), or iceberg
path; each path can also be forced explicitly.

Source layout: src/mpp-disttool/ (11 C source files)
  - main.c              argv[0] compat + subcommand dispatch
  - cmd_combine.c/h     MPP (mppnccombine, modernised nc_* API),
                        land (combine-ncc.F90), iceberg (no ncrcat/ncatted)
  - cmd_scatter.c/h     MPP (mppncscatter/domain.c), land (scatter-ncc.F90)
  - cmd_decompress.c/h  decompress-ncc.F90 port
  - cmd_check.c/h       is-compressed + iceberg check (no ncdump)
  - cmd_auto.c/h        combine_restarts port (POSIX opendir + regcomp)
  - compress.c/h        nfu_compress.F90 port + rank_ascending merge-sort
  - nc_utils.c/h        nfu.F90 helpers (clone_dim/var, format detect)
  - domain.c/h          mpp_compute_extent, hyperslabcopy, scatter_dims
  - strlist.c/h         copied from mpp-ncscatter
  - xmalloc.c/h         copied from mpp-ncscatter

Build system:
  - src/Makefile.am: new mppdisttool target; sources in mpp-disttool/
    to avoid make shadowing the binary with a same-named object subdir;
    install-exec-hook creates the six backward-compat symlinks
  - configure.ac: add AC_PROG_LN_S so $(LN_S) is defined for the hook

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Security fixes:
- xmalloc.c: Add integer overflow check in xcalloc() to prevent heap corruption
- strlist.c: Replace unsafe strcpy calls with memcpy + null termination
- compress.c: Replace fixed 2048-byte buffer with dynamic allocation
- domain.c: Fix strncpy null-termination using safe memcpy pattern
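
A common shape for the xcalloc() overflow check is to test the product before calling calloc. The following is a sketch of that general pattern, not the code in xmalloc.c; the names mul_overflows and xcalloc_checked and the error messages are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if nmemb * size would not fit in size_t, else 0. */
int mul_overflows(size_t nmemb, size_t size)
{
    return size != 0 && nmemb > SIZE_MAX / size;
}

/* calloc wrapper that exits on overflow or allocation failure
 * instead of returning an undersized or NULL buffer. */
void *xcalloc_checked(size_t nmemb, size_t size)
{
    if (mul_overflows(nmemb, size)) {
        fprintf(stderr, "xcalloc: %zu * %zu overflows size_t\n", nmemb, size);
        exit(EXIT_FAILURE);
    }
    void *p = calloc(nmemb, size);
    /* A zero-byte request may legitimately return NULL. */
    if (p == NULL && nmemb != 0 && size != 0) {
        fprintf(stderr, "xcalloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Dividing SIZE_MAX by size avoids performing the multiplication that might wrap.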

Memory safety improvements:
- cmd_combine.c: Add comprehensive integer overflow validation for varbuf allocation
- cmd_combine.c: Improve null check formatting after malloc

Code quality improvements:
- xmalloc.h: Use NULL instead of 0 in XFREE macro
- strlist.c: Replace strtok with thread-safe strtok_r
- cmd_auto.c: Add bounds checking for filename buffer

Error handling improvements:
- cmd_scatter.c: Add NetCDF error checking with proper cleanup on failure
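
The strcpy and strncpy replacements listed under Security fixes follow a well-known pattern: measure, clamp, memcpy, then terminate explicitly. The helper below is a sketch of that pattern under an assumed name (copy_bounded); it is not the code in strlist.c or domain.c.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity dst_size bytes), truncating if needed
 * and always NUL-terminating when dst_size > 0.
 * Returns the number of characters copied, excluding the NUL. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;

    size_t len = strlen(src);
    if (len >= dst_size)
        len = dst_size - 1;   /* leave room for the terminator */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Unlike strncpy, this never leaves the destination unterminated, and a caller can detect truncation by comparing the return value with strlen(src).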

Development

Successfully merging this pull request may close these issues.

Intermittent parallel build failure: f951 cannot rename nfu_mod.mod0 under make -j
