Skip to content

Conversation

@AlexMikhalev
Copy link
Contributor

… reports

@AlexMikhalev
Copy link
Contributor Author

GitHub Runner Execution Results

PR: #423 - feat(update): add auto-update system with verification and validation…
URL: #423

❌ python-bindings.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ test-on-pr.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-optimized.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ vm-execution-tests.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-native.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

Powered by terraphim-github-runner

@claude
Copy link

claude bot commented Jan 10, 2026

Code Review: Auto-Update System Implementation

Overview

This PR introduces a comprehensive auto-update system for Terraphim AI binaries. The implementation is well-structured with ~12,000 additions across 38 files, primarily adding the terraphim_update crate with supporting modules and extensive documentation.


✅ Strengths

1. Excellent Architecture & Modularity

  • Clean separation of concerns across modules: config, downloader, rollback, scheduler, signature, platform, notification, state
  • Well-designed async API using tokio::spawn_blocking to handle blocking self_update operations
  • Builder pattern for configuration (UpdaterConfig, DownloadConfig)
  • Proper error handling with anyhow::Result

2. Comprehensive Documentation

  • Extensive rustdoc comments with examples
  • Multiple design documents: DESIGN-AUTO-UPDATE.md, RESEARCH-AUTO-UPDATE.md, VERIFICATION-REPORT-AUTO-UPDATE.md, VALIDATION-REPORT-AUTO-UPDATE.md
  • README with usage examples in crates/terraphim_update/README.md
  • Phase 2 roadmap in .github/issues/AUTO-UPDATE-PHASE2.md

3. Security Considerations

  • Signature verification module (though placeholder implementation)
  • SHA256 checksum validation for backups (rollback.rs:183-189)
  • Uses self_update with rustls and signatures features
  • Backup integrity validation before rollback

4. Testing

  • Integration tests in tests/integration_test.rs covering:
    • Full update flow
    • Backup/restore roundtrip
    • Permission failure scenarios
    • Backup retention
  • Unit tests in lib.rs for version comparison logic
  • Test coverage for edge cases

⚠️ Issues & Concerns

Critical: Security Gaps

1. Signature Verification is Placeholder Only 🔴

// crates/terraphim_update/src/signature.rs:58-80
pub fn verify_signature(
    _binary_path: &Path,
    _signature_path: &Path,
    _public_key: &str,
) -> Result<VerificationResult> {
    // ... checks file existence ...
    Ok(VerificationResult::Valid)  // ⚠️ ALWAYS returns Valid\!
}

Impact: Updates are NOT cryptographically verified. This is a critical security vulnerability allowing:

  • Man-in-the-middle attacks
  • Malicious binary distribution
  • Compromised update servers

Recommendation:

  • Either implement proper signature verification using self_update's signature features
  • OR clearly document this as MVP limitation and add security warning
  • Consider using minisign or ed25519-dalek for production

2. Version Comparison Vulnerability

// crates/terraphim_update/src/lib.rs:43-73
fn is_newer_version_static(version1: &str, version2: &str) -> bool {
    let v1_parts: Vec<u32> = version1
        .trim_start_matches('v')
        .split('.')
        .take(3)
        .map(|s| s.parse().unwrap_or(0))  // ⚠️ Silently fails to 0
        .collect();
    // ...
}

Issues:

  • No semver compliance (pre-release versions, build metadata ignored)
  • Silently converts parse errors to 0 (e.g., "1.2.3-beta" → "1.2.3")
  • Could misorder versions with pre-release tags

Recommendation: Use semver crate:

use semver::Version;
fn is_newer_version(v1: &str, v2: &str) -> Result<bool> {
    let version1 = Version::parse(v1.trim_start_matches('v'))?;
    let version2 = Version::parse(v2.trim_start_matches('v'))?;
    Ok(version1 > version2)
}

High Priority

3. Compilation Warnings 🟡

From diagnostics:

  • scheduler.rs:382:13: Unused mutable variable receiver
  • integration_test.rs:119:13: Unused import std::os::unix::fs::PermissionsExt
  • integration_test.rs:234:9: Unused variable i

These must be fixed before merge (fails cargo clippy).

4. Error Handling Could Be More Robust

// crates/terraphim_update/src/lib.rs:198
Err(e) => Ok(UpdateStatus::Failed(format\!("Check failed: {}", e))),

Converting errors to successful Ok results loses error context. Consider:

Err(e) => Err(anyhow\!("Update check failed: {}", e))

5. Potential Race Conditions

// crates/terraphim_update/src/lib.rs:559-563
let rt = tokio::runtime::Runtime::new()?;
rt.block_on(async { check_for_updates_auto(&bin_name, &current_version).await })

Creating nested runtimes inside spawn_blocking can cause panics if called from async context. Use tokio::runtime::Handle::current() instead.

Medium Priority

6. Backup Rotation Logic Incomplete

// crates/terraphim_update/src/rollback.rs:198
pub fn rotate_backups(&mut self) -> Result<Vec<String>> {
    let mut removed = Vec::new();
    // Implementation cut off at line 200

Need to verify this is complete and handles edge cases:

  • Empty backup directory
  • Concurrent modifications
  • Disk full scenarios

7. No TLS Certificate Validation Mentioned

While using rustls, there's no explicit verification of:

  • GitHub's TLS certificate pinning
  • OCSP stapling
  • Certificate revocation checking

Recommendation: Document TLS security assumptions or add explicit validation.

8. Scheduler State Management

The scheduler (start_update_scheduler) returns a JoinHandle but doesn't provide:

  • Graceful shutdown mechanism
  • State persistence across restarts
  • Recovery from scheduler crashes

📝 Code Quality

Best Practices Adherence

✅ Follows Rust idioms and naming conventions
✅ Proper use of async/await patterns
✅ Good module organization
✅ Comprehensive error types with thiserror

Areas for Improvement

9. Test Coverage Gaps

Missing tests for:

  • Network failure scenarios (timeout, DNS errors)
  • Concurrent update attempts
  • Rollback after partial installation
  • Scheduler persistence/recovery
  • Permission escalation on different platforms

10. Platform-Specific Code Needs Validation

#[cfg(unix)]
{
    use std::os::unix::fs::PermissionsExt;
    perms.set_mode(0o755);
}

Windows/macOS equivalents not shown. Need to verify cross-platform compatibility.

11. Documentation Debt

The PR includes many planning/design docs (HANDOVER-AUTO-UPDATE.md, PHASE5_INTEGRATION_SUMMARY.md, etc.) that should probably:

  • Move to docs/ directory
  • Be consolidated
  • Have orphaned files cleaned up post-merge

🔍 Performance Considerations

Good

  • Exponential backoff for retries (line 22 in downloader.rs)
  • Configurable timeouts
  • Async operations don't block main thread

Concerns

  • No mention of download size limits (could DoS with huge files)
  • Checksum calculation loads entire file into memory (fs::read in rollback.rs:184)
  • Consider streaming hash for large binaries

🛡️ Security Recommendations (Priority Order)

  1. Implement Real Signature Verification 🔴 (Blocker)
  2. Switch to semver crate for version comparison 🟡 (High)
  3. Add hash verification for downloads 🟡 (High)
  4. Document threat model and security boundaries 🟢 (Medium)
  5. Add rate limiting for update checks 🟢 (Low)
  6. Consider update channel separation (stable/beta) 🟢 (Low)

📋 Action Items Before Merge

Must Fix

  • Fix compilation warnings (unused variables/imports)
  • Implement or document signature verification limitation
  • Fix nested runtime creation in scheduler
  • Add semver dependency and proper version comparison

Should Fix

  • Add network failure tests
  • Complete backup rotation implementation
  • Add graceful scheduler shutdown
  • Move documentation files to proper locations

Nice to Have

  • Streaming hash calculation for large files
  • Download size limits
  • More detailed error types
  • Cross-platform compatibility testing

💡 Suggestions

Consider Adding

  1. Dry-run mode for testing without actual updates
  2. Bandwidth throttling for background downloads
  3. Partial download resumption for interrupted downloads
  4. Telemetry opt-in (as mentioned in Phase 2 doc)

Documentation

Add to README:

  • Security model and trust assumptions
  • Minimum supported versions
  • Rollback limitations
  • Network requirements

🎯 Overall Assessment

Quality: High - Well-architected, documented, and tested
Security: Medium-Low - Critical signature verification gap
Completeness: 85% - Core functionality solid, some edge cases need work

Recommendation: Conditional Approval pending:

  1. Security issue resolution (signature verification)
  2. Compilation warning fixes
  3. Documentation updates

This is excellent foundation work for Phase 1. The Phase 2 planning shows good foresight. Once security gaps are addressed, this will be production-ready.


Review completed per CLAUDE.md guidelines - focusing on security, performance, and Rust best practices.

- Update @types/node to 24.10.5 to match available versions
- Update CI check script to expect Node.js 20 instead of 18
- Resolves npm install failures in CI
@AlexMikhalev
Copy link
Contributor Author

GitHub Runner Execution Results

PR: #423 - feat(update): add auto-update system with verification and validation…
URL: #423

❌ ci.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-optimized.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ test-on-pr.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-native.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ vm-execution-tests.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

Powered by terraphim-github-runner

AlexMikhalev and others added 8 commits January 12, 2026 17:22
Implement actual cryptographic signature verification for auto-update system
using zipsign-api (Ed25519 signatures), replacing the placeholder implementation.

**Changes:**

**Dependencies:**
- Add zipsign-api v0.1 for signature verification
- Add base64 v0.22 for public key decoding

**Signature Verification (crates/terraphim_update/src/signature.rs):**
- Implement verify_archive_signature() using zipsign-api
- Signatures are embedded in .tar.gz archives (GZIP comment), not separate files
- Add get_embedded_public_key() placeholder for public key storage
- Update verify_signature_detailed() to use new verification API
- Re-export ZipsignError for error handling

**Key Generation (scripts/generate-zipsign-keypair.sh):**
- Create key pair generation script using zipsign CLI
- Instructions for secure private key storage (1Password)
- Public key embedding workflow

**Key Decisions:**
- Chose zipsign-api over minisign-verify because:
  - Already integrated with self_update crate (signatures feature)
  - Designed for .tar.gz archives (exact use case)
  - Embeds signatures in archives (no separate .sig files needed)
  - Uses Ed25519 (modern, secure, fast)

**Security:**
- Placeholder public key returns Valid for development
- Actual verification will reject invalid/tampered signatures
- Fail-on-invalid approach for security-critical updates

**Related:**
- Issue #421 - CRITICAL: Implement actual signature verification
- RESEARCH-SIGNATURE-VERIFICATION.md - Research document
- DESIGN-SIGNATURE-VERIFICATION.md - Implementation plan

**Next Steps:**
- Generate real Ed25519 key pair
- Integrate verification into update flow
- Create signing script for release pipeline
- Add comprehensive test suite

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement Option A (manual download + verify + install) for secure
updates with Ed25519 signature verification using zipsign-api.

Changes:
- Add update_with_verification() method for secure update flow
- Add helper methods: get_latest_release_info(), download_release_archive(),
  install_verified_archive(), extract_zip(), extract_tarball(), get_target_triple()
- Update check_and_update() to use signature verification by default
- Add dependencies: flate2, tar, zip, tempfile
- Rewrite signature.rs test module for new API (107 tests passing)

Security: Updates are now rejected if signature is invalid, missing,
or verification encounters an error.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ipeline

Add automated signing infrastructure for release archives using zipsign
Ed25519 signatures.

Changes:
- Create scripts/sign-release.sh for signing release archives
- Integrate sign_binaries() into scripts/release.sh
- Add signing step after package creation in release pipeline
- Update SIGNATURE_VERIFICATION_PROGRESS.md

Features:
- Signs all .tar.gz and .tar.zst archives with Ed25519 signatures
- Verifies signatures after signing
- Skips already-signed archives
- Graceful handling if signing key unavailable
- Comprehensive error checking and colored output

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add complete test coverage for signature verification functionality.

Changes:
- Create crates/terraphim_update/tests/signature_test.rs
- Add 15 tests covering unit, integration, edge cases, property, and performance
- Update SIGNATURE_VERIFICATION_PROGRESS.md

Test Coverage:
- Unit tests: placeholder key behavior, error handling, validation
- Edge cases: corrupted archives, custom keys, non-file paths
- Integration: signed archive verification, wrong key rejection, tamper detection
- Property-based: deterministic behavior, no-panic guarantee
- Performance: small archive and batch verification benchmarks

All 15 tests passing. Integration tests are gated behind
integration-signing feature (requires zipsign CLI).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Step 9 of signature verification implementation:

- Generate Ed25519 key pair using zipsign
- Embed real public key in signature.rs (placeholder replaced)
- Update all tests to reflect correct security behavior (reject unsigned)
- Add key fingerprint to documentation
- Add keys/ directory to .gitignore
- Create secure key storage instructions (keys/README.md)

Public Key (v1.0):
- Base64: 1uLjooBMO+HlpKeiD16WOtT3COWeC8J/o2ERmDiEMc4=
- Fingerprint: 1c78db3c8e1afa3af4fcbaf32ccfa30988c82f9e7d383dfb127ae202732b631a

Security Improvements:
- All 107 lib tests updated to reject unsigned archives
- All 15 integration tests updated to reject unsigned archives
- Placeholder key behavior completely removed
- Private key storage instructions provided for 1Password

Test Results:
- cargo test -p terraphim_update --lib: 107/107 passing
- cargo test -p terraphim_update --test signature_test: 15/15 passing
- Tests verify rejection of unsigned/corrupted archives (correct behavior)

Next Steps:
- Store private.key in 1Password vault "TerraphimPlatform"
- Delete private.key from filesystem using shred
- Configure GitHub Actions secret ZIPSIGN_PRIVATE_KEY

Related: #421

Co-Authored-By: Terraphim AI
… embedded

Updated progress tracking:
- Step 9 marked complete (90% of implementation done)
- All 10 success criteria now met
- Ready for final security audit (Step 10)

Next: Store private key in 1Password and perform security audit

Co-Authored-By: Terraphim AI
- Update scripts/sign-release.sh with 1Password CLI integration
  - Add get_key_from_op() function to retrieve signing key from 1Password
  - Support ZIPSIGN_OP_ITEM environment variable for item ID
  - Support ZIPSIGN_PRIVATE_KEY=op:// to trigger 1Password retrieval
  - Add automatic cleanup of temporary key files with shred
  - Update usage documentation with 1Password examples

- Update scripts/release.sh to prefer 1Password over file-based keys
  - Detect 1Password CLI availability in sign_binaries()
  - Use ZIPSIGN_OP_ITEM=jbhgblc7m2pluxe6ahqdfr5b6a when 1Password available
  - Fall back to file-based keys when 1Password CLI not found
  - Fix sign_cmd variable usage in execute call

- Update keys/README.md with 1Password item reference
  - Document item ID: jbhgblc7m2pluxe6ahqdfr5b6a
  - Add three methods for using the signing key
  - Update GitHub Actions integration examples
  - Include 1Password Action configuration

This completes the secure storage requirement for the Ed25519 signing key.
The private key is now stored in 1Password vault TerraphimPlatform and all
signing scripts have been updated to retrieve it securely.

Note: Pre-commit hook flagged 'private_key' variable names as potential secrets,
but these are just variable assignments with no hardcoded secrets.

Ref: #421 Step 9-10
Co-Authored-By: Terraphim AI
@AlexMikhalev
Copy link
Contributor Author

GitHub Runner Execution Results

PR: #423 - feat(update): add auto-update system with verification and validation…
URL: #423

❌ python-bindings.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ test-on-pr.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-optimized.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ vm-execution-tests.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

❌ ci-native.yml

Execution failed: VM allocation failed: Allocation failed with status: 429 Too Many Requests

Powered by terraphim-github-runner

@github-actions
Copy link
Contributor

Documentation Preview

Your documentation changes have been deployed to:
https://ced6e681.terraphim-docs.pages.dev

This preview will be available until the PR is closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants