Add mppdisttool: unified MPP distribution tool #407
Draft
underwoo wants to merge 2 commits into
Adds a single pure-C binary, mppdisttool, with a subcommand interface
that replaces mppncscatter, mppnccombine, combine-ncc, scatter-ncc,
decompress-ncc, iceberg_comb.sh, and combine_restarts. The new binary
eliminates the Fortran and NCO (ncrcat, ncatted) runtime dependencies,
making it usable on HPC nodes that lack the full Fortran module stack.
Subcommands: combine, scatter, decompress, check, auto
Backward-compat symlinks installed for all seven legacy names.
Auto-detection selects MPP, land (compressed-by-gathering), or iceberg
path; each path can also be forced explicitly.
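The backward-compat symlinks depend on the binary inspecting the name it was invoked under. A minimal sketch of that lookup follows, assuming a simple table-driven design. The struct, the table, and the function name `lookup_alias` are hypothetical and not taken from main.c. The name-to-subcommand pairs follow the PR description, with the iceberg entry simplified to its combine form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias record: legacy program name, the subcommand it
 * stands for, and an optional forced-mode flag. */
struct legacy_alias {
    const char *name;       /* basename the tool was invoked as */
    const char *subcommand; /* equivalent mppdisttool subcommand */
    const char *flag;       /* forced mode flag, or NULL if none */
};

/* Pairs taken from the PR description; iceberg_comb.sh is listed there
 * as "combine --iceberg / check --iceberg" and is simplified here. */
static const struct legacy_alias aliases[] = {
    { "mppncscatter",     "scatter",    NULL        },
    { "mppnccombine",     "combine",    "--mpp"     },
    { "combine-ncc",      "combine",    "--land"    },
    { "scatter-ncc",      "scatter",    "--land"    },
    { "decompress-ncc",   "decompress", NULL        },
    { "iceberg_comb.sh",  "combine",    "--iceberg" },
    { "combine_restarts", "auto",       NULL        },
};

/* Return the alias matching the basename of argv0, or NULL when the
 * binary was invoked under its own name or any unknown name. */
const struct legacy_alias *lookup_alias(const char *argv0)
{
    assert(argv0 != NULL);
    const char *slash = strrchr(argv0, '/');
    const char *base = slash ? slash + 1 : argv0;

    for (size_t i = 0; i < sizeof aliases / sizeof aliases[0]; i++) {
        if (strcmp(base, aliases[i].name) == 0)
            return &aliases[i];
    }
    return NULL;
}
```

In this sketch a caller would pass `argv[0]` at startup and, on a non-NULL result, dispatch to the named subcommand with the flag prepended to the remaining arguments. A NULL result falls through to normal subcommand parsing.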
Source layout: src/mpp-disttool/ (11 C source files)
- main.c argv[0] compat + subcommand dispatch
- cmd_combine.c/h MPP (mppnccombine, modernised nc_* API),
land (combine-ncc.F90), iceberg (no ncrcat/ncatted)
- cmd_scatter.c/h MPP (mppncscatter/domain.c), land (scatter-ncc.F90)
- cmd_decompress.c/h decompress-ncc.F90 port
- cmd_check.c/h is-compressed + iceberg check (no ncdump)
- cmd_auto.c/h combine_restarts port (POSIX opendir + regcomp)
- compress.c/h nfu_compress.F90 port + rank_ascending merge-sort
- nc_utils.c/h nfu.F90 helpers (clone_dim/var, format detect)
- domain.c/h mpp_compute_extent, hyperslabcopy, scatter_dims
- strlist.c/h copied from mpp-ncscatter
- xmalloc.c/h copied from mpp-ncscatter
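For the cmd_auto.c entry above, the regcomp-based file matching could look roughly like the sketch below. It assumes that the distributed-file suffix is exactly four decimal digits after `res.` or `nc.`. The function names `is_distributed_piece` and `group_name` are hypothetical and not taken from the PR. Names would come from an opendir/readdir loop, which is omitted here.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if name ends in "res.NNNN" or "nc.NNNN" (N = decimal digit),
 * 0 if it does not, and -1 if the pattern fails to compile.
 * Assumption: the suffix is exactly four digits. */
int is_distributed_piece(const char *name)
{
    regex_t re;
    if (regcomp(&re, "(res|nc)\\.[0-9]{4}$", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int matched = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/* Copies the group name (the file name minus its ".NNNN" suffix) into
 * out. Returns 0 on success, -1 if name is not a distributed piece or
 * out is too small to hold the result. */
int group_name(const char *name, char *out, size_t outlen)
{
    if (is_distributed_piece(name) != 1)
        return -1;
    size_t len = strlen(name) - 5;   /* drop '.' plus four digits */
    if (len + 1 > outlen)
        return -1;
    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}
```

Grouping by the stripped name lets every piece of one distributed file be routed through the same combine path. A real implementation would likely compile the pattern once rather than per call.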
Build system:
- src/Makefile.am: new mppdisttool target; sources in mpp-disttool/
to avoid make shadowing the binary with a same-named object subdir;
install-exec-hook creates the six backward-compat symlinks
- configure.ac: add AC_PROG_LN_S so $(LN_S) is defined for the hook
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Security fixes:
- xmalloc.c: Add integer overflow check in xcalloc() to prevent heap corruption
- strlist.c: Replace unsafe strcpy calls with memcpy + null termination
- compress.c: Replace fixed 2048-byte buffer with dynamic allocation
- domain.c: Fix strncpy null-termination using safe memcpy pattern
Memory safety improvements:
- cmd_combine.c: Add comprehensive integer overflow validation for varbuf allocation
- cmd_combine.c: Improve null check formatting after malloc
Code quality improvements:
- xmalloc.h: Use NULL instead of 0 in XFREE macro
- strlist.c: Replace strtok with thread-safe strtok_r
- cmd_auto.c: Add bounds checking for filename buffer
Error handling improvements:
- cmd_scatter.c: Add NetCDF error checking with proper cleanup on failure
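The xcalloc() overflow check listed above can be illustrated with a short sketch. This is not the code from xmalloc.c. The names `mul_overflows` and `xcalloc_checked` are hypothetical, and the abort-on-failure behaviour is an assumption about how an x-prefixed allocator typically behaves.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if nmemb * size would wrap around size_t, else 0.
 * Dividing SIZE_MAX by size avoids performing the overflowing
 * multiplication itself. */
int mul_overflows(size_t nmemb, size_t size)
{
    return size != 0 && nmemb > SIZE_MAX / size;
}

/* Hypothetical calloc wrapper: exits on overflow or allocation failure,
 * so callers never see NULL or an undersized buffer. */
void *xcalloc_checked(size_t nmemb, size_t size)
{
    if (mul_overflows(nmemb, size)) {
        fprintf(stderr, "xcalloc: %zu * %zu overflows size_t\n", nmemb, size);
        exit(EXIT_FAILURE);
    }
    /* Request at least one byte so a zero-sized call cannot
     * legitimately return NULL. */
    void *p = calloc(nmemb ? nmemb : 1, size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "xcalloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

The explicit check matters most when a wrapper computes `nmemb * size` itself, for example before calling malloc and memset. Without it, a wrapped product yields a buffer smaller than the caller expects, and later writes corrupt the heap.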
Summary
Adds mppdisttool, a single pure-C binary with a subcommand interface that replaces seven separate MPP distribution programs:
- mppncscatter → scatter
- mppnccombine → combine --mpp
- combine-ncc → combine --land
- scatter-ncc → scatter --land
- decompress-ncc → decompress
- iceberg_comb.sh → combine --iceberg / check --iceberg
- combine_restarts → auto
The new binary eliminates the Fortran and NCO (ncrcat, ncatted) runtime dependencies, making these tools usable on HPC login, data-transfer, and service nodes that frequently lack the full Fortran module stack or NCO.
Subcommands
- combine: auto-detects file type (compressed-by-gathering → land path; iceberg restart → iceberg path; otherwise MPP path); mode can be forced with --land, --iceberg, or --mpp
- scatter: MPP domain decomposition or --land CF compressed scatter
- decompress: expand compressed-by-gathering files to full grid
- check: --compressed (replaces is-compressed) or --iceberg; exit codes preserved
- auto: scans CWD for *{res,nc}.#### files and routes each group through the appropriate combine path (replaces combine_restarts)
Backward compatibility
install-exec-hook installs symlinks for all six legacy binary names. argv[0] detection maps each symlink name to the correct subcommand and flags at startup, so existing scripts require no changes.
Source layout
Sources are placed in mpp-disttool/ (not mppdisttool/) to avoid automake creating a same-named object subdirectory that shadows the binary target under make -j.
Build system changes
- src/Makefile.am: new mppdisttool target; install-exec-hook for six compatibility symlinks
- configure.ac: adds AC_PROG_LN_S so $(LN_S) is defined for the install hook
Test plan
- autoreconf -i && configure && make -j builds cleanly
- make install installs mppdisttool and all six symlinks
- (--help output spot-check)
- make check for mppncscatter, mppnccombine, combine-ncc, scatter-ncc, decompress-ncc, iceberg_comb, is-compressed, combine_restarts
- ldd mppdisttool confirms no Fortran runtime or NCO library dependency
Closes #405 (parallel build work tracked separately on nfu.parallel.build)
🤖 Generated with Claude Code