feat: Intelligent File Synchronization (sync_dir) #87

@JacobCallahan

Summary

Add a sync_dir() method that performs bandwidth-efficient directory synchronization between local and remote paths by comparing file metadata and only transferring files that have changed—an rsync-lite built into hussh.

Motivation

Full directory uploads via put_dir (see #84) re-transfer every file on every run, which is wasteful for deployment workflows where only a handful of files change between runs. An rsync-style sync operation would make hussh a first-class tool for continuous deployment and configuration management.

Proposed API

from hussh import Connection

with Connection("myserver.example.com", username="deploy", password="...") as conn:
    # Upload only files that are new or changed
    result = conn.sftp.sync_dir(
        "/local/project/",
        "/remote/app/",
        direction="push",           # or "pull" to sync remote → local
        compare="mtime_and_size",   # or "size", "checksum"
        delete=False,               # if True, remove remote files absent locally
    )
    print(f"{result.transferred} files synced, {result.skipped} unchanged")

Async variant:

async with AsyncConnection(...) as conn:
    result = await conn.sftp.sync_dir("/local/project/", "/remote/app/")

Comparison Strategies

| compare value | How files are compared |
| --- | --- |
| size | Transfer if sizes differ |
| mtime_and_size (default) | Transfer if size differs or source mtime is newer |
| checksum | Compute MD5/SHA256 of both copies; transfer only if hashes differ |
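
The three strategies reduce to a single per-file decision. The sketch below is one possible way to express it in Python; `FileMeta` and `needs_transfer` are hypothetical names used only for illustration and are not part of hussh. It assumes metadata for both sides has already been collected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record for one file (not a hussh type)."""
    size: int
    mtime: float
    digest: Optional[str] = None  # hex digest, only needed for compare="checksum"


def needs_transfer(src: FileMeta, dst: Optional[FileMeta],
                   compare: str = "mtime_and_size") -> bool:
    """Return True if the source file should be copied over the destination."""
    if dst is None:
        return True  # file does not exist at the destination yet
    if compare == "size":
        return src.size != dst.size
    if compare == "mtime_and_size":
        return src.size != dst.size or src.mtime > dst.mtime
    if compare == "checksum":
        if src.digest is None or dst.digest is None:
            raise ValueError("checksum comparison needs a digest on both sides")
        return src.digest != dst.digest
    raise ValueError(f"unknown compare strategy: {compare!r}")
```

Note that `size` alone misses edits that keep the byte count unchanged, which is presumably why `mtime_and_size` is proposed as the default.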

Return Value

SyncResult with fields:

  • transferred: int — count of files actually sent/received
  • skipped: int — count of files left unchanged
  • deleted: int — count of files removed (when delete=True)
  • bytes_sent: int
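
For illustration, the shape of the result could look like the following Python dataclass. This is only a sketch of the fields listed above: the real type may be defined differently, and the `record_transfer` helper is a hypothetical convenience, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    """Sketch of the proposed return value; field names follow the list above."""
    transferred: int = 0  # files actually sent/received
    skipped: int = 0      # files left unchanged
    deleted: int = 0      # files removed (only when delete=True)
    bytes_sent: int = 0   # running total of bytes moved

    def record_transfer(self, nbytes: int) -> None:
        """Hypothetical helper: count one transferred file of nbytes bytes."""
        self.transferred += 1
        self.bytes_sent += nbytes
```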

Implementation Notes

  • Internally builds on the put_dir / get_dir primitives (see #84, "feat: Recursive Directory Transfers via SFTP (put_dir / get_dir)") but wraps each individual file transfer in a metadata check.
  • Remote mtime can be read via SFTP stat. Setting the remote mtime after an upload requires SFTP setstat; where the server supports it, set the uploaded file's mtime to match the source.
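  • The checksum strategy requires a hash of the remote copy. Either run md5sum / sha256sum on the server via execute(), or stream the remote file through a buffered SFTP read and hash it locally. The second option downloads the full file contents, so prefer the server-side command when it is available.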
  • delete must default to False, so that removing files at the destination is always an explicit opt-in and cannot happen by accident.
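
As a rough sketch of the checksum note above, the local side can hash with a buffered read, and the remote side can be compared by parsing the output of md5sum / sha256sum. Both function names are hypothetical. How the command is run through execute() and how its output is retrieved is left out, since that shape is not specified here.

```python
import hashlib
from pathlib import Path


def local_digest(path: Path, algorithm: str = "sha256",
                 chunk_size: int = 1 << 20) -> str:
    """Hash a local file in fixed-size chunks so it never sits fully in memory."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def parse_sum_output(stdout: str) -> str:
    """Extract the hex digest from one line of md5sum / sha256sum output.

    The expected line format is "<hex digest>  <file name>".
    """
    lines = stdout.strip().splitlines()
    if not lines:
        raise ValueError("empty checksum output")
    # GNU coreutils prefixes the line with a backslash when the file name
    # had to be escaped; drop it before taking the first field.
    return lines[0].lstrip("\\").split()[0].lower()
```

A file would then be transferred only when `local_digest(...)` differs from the parsed remote digest, using the same algorithm on both sides.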

Acceptance Criteria

  • sync_dir(local, remote, direction, compare, delete) available on both the synchronous and asynchronous SFTP interfaces
  • All three comparison strategies implemented
  • SyncResult returned with accurate counts
  • delete=True removes files at the destination that are absent at the source
  • Integration tests for push, pull, and idempotency (second sync transfers nothing)
  • Benchmarks vs naïve put_dir on a partially-changed directory
  • Documentation updated
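
The idempotency and delete criteria can be illustrated without a server. The sketch below is a local-only stand-in, not hussh code: `toy_sync` is a hypothetical function that applies the mtime_and_size rule between two local directories. It compares mtimes at whole-second resolution, on the assumption that the SFTP side may only report whole seconds.

```python
import shutil
from pathlib import Path


def toy_sync(src: Path, dst: Path, delete: bool = False) -> dict:
    """Local-only stand-in for sync_dir using the mtime_and_size rule."""
    counts = {"transferred": 0, "skipped": 0, "deleted": 0}
    src_files = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}

    for rel in sorted(src_files):
        source, target = src / rel, dst / rel
        s_stat = source.stat()
        if target.exists():
            t_stat = target.stat()
            # Compare whole seconds only (assumption: remote mtime has 1 s resolution).
            changed = (s_stat.st_size != t_stat.st_size
                       or int(s_stat.st_mtime) > int(t_stat.st_mtime))
        else:
            changed = True

        if changed:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copy2 also copies the mtime
            counts["transferred"] += 1
        else:
            counts["skipped"] += 1

    if delete:
        # Remove destination files that no longer exist at the source.
        dst_files = {p.relative_to(dst) for p in dst.rglob("*") if p.is_file()}
        for rel in sorted(dst_files - src_files):
            (dst / rel).unlink()
            counts["deleted"] += 1
    return counts
```

Running `toy_sync` twice on an unchanged source should report zero transfers on the second run, which is the property the integration tests for the real sync_dir are meant to check.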

Metadata

Labels: enhancement (New feature or request)