Skip to content

Add MSCP v1 to v2.0 Automated Migration Tool#646

Open
nancysangani wants to merge 1 commit intousnistgov:mainfrom
nancysangani:feature/issue-487-migration-v1-to-v2
Open

Add MSCP v1 to v2.0 Automated Migration Tool#646
nancysangani wants to merge 1 commit intousnistgov:mainfrom
nancysangani:feature/issue-487-migration-v1-to-v2

Conversation

@nancysangani
Copy link
Copy Markdown

@nancysangani nancysangani commented Mar 27, 2026

Fixes #487

Overview

Adds automated migration tool to transition MSCP v1.x projects to v2.0 format without manual effort or risk of missing controls.

Changes

  • migrate_v1_to_v2.py: Automated migration script with validation and reporting
  • test_migration.py: Full test suite (30+ tests, all passing)
  • docs/MIGRATION.md: Complete user guide with troubleshooting

How It Works

# Preview migration (no changes)
python migrate_v1_to_v2.py /path/to/v1_project --dry-run

# Execute migration
python migrate_v1_to_v2.py /path/to/v1_project --output /path/to/v2_project

Copilot AI review requested due to automatic review settings March 27, 2026 15:06
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an automated migration tool to convert MSCP v1.x project customizations into the MSCP v2.0 project structure, with accompanying documentation and a new test suite.

Changes:

  • Introduces scripts/migrate_v1_to_v2.py to scan/validate/copy v1 custom rules/sections/baselines into a v2-compatible output directory and emit migration metadata/reporting.
  • Adds tests/test_migration.py covering report behavior and end-to-end migration scenarios.
  • Adds docs/MIGRATION.md describing prerequisites, usage, verification, and troubleshooting for migration.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 13 comments.

File Description
scripts/migrate_v1_to_v2.py New migration implementation (scan, validate, copy, report/metadata generation, CLI).
tests/test_migration.py New unittest/pytest-runnable test suite for migrator/report behavior.
docs/MIGRATION.md New user guide for running and verifying migrations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +285 to +292
output_sections_dir = self.output_path / "custom/sections"
output_sections_dir.mkdir(parents=True, exist_ok=True)

for section_file in sections:
try:
output_file = output_sections_dir / section_file.name
shutil.copy2(section_file, output_file)
except Exception as e:
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrate_custom_sections also flattens paths by using only section_file.name. If sections are organized in subdirectories, their structure will be lost and name collisions can overwrite files. Copy using the path relative to custom/sections to preserve nesting.

Copilot uses AI. Check for mistakes.
Comment on lines +491 to +495
parser.add_argument(
"--verbose", "-v",
action="store_true",
help="Verbose output"
)
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI exposes --verbose, but the flag is never used (no conditional logging, no extra output). Either implement verbose behavior (e.g., gated debug prints / logging) or remove the argument to avoid a misleading interface (docs also mention verbose output).

Copilot uses AI. Check for mistakes.
Comment on lines +442 to +454

print("\n[SUCCESS] Migration completed successfully!")
self.report.print_summary()

# Save detailed report
report_path = self.output_path / "custom" / "MIGRATION_REPORT.json"
with open(report_path, 'w') as f:
json.dump(self.report.to_dict(), f, indent=2)
print(f"\n[*] Detailed report saved to: {report_path}")

return True


Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrate() prints a [SUCCESS] message and returns True even if parsing/scanning recorded errors in self.report.errors (e.g., invalid YAML in scan_custom_rules). Consider treating any collected errors as a failed migration (non-zero exit) or at least printing “completed with errors” and returning False so automation can detect partial/invalid migrations.

Suggested change
print("\n[SUCCESS] Migration completed successfully!")
self.report.print_summary()
# Save detailed report
report_path = self.output_path / "custom" / "MIGRATION_REPORT.json"
with open(report_path, 'w') as f:
json.dump(self.report.to_dict(), f, indent=2)
print(f"\n[*] Detailed report saved to: {report_path}")
return True
# Determine final status based on collected errors
has_errors = bool(getattr(self.report, "errors", None))
if has_errors:
print("\n[COMPLETED WITH ERRORS] Migration completed with errors. See report for details.")
else:
print("\n[SUCCESS] Migration completed successfully!")
self.report.print_summary()
# Save detailed report
report_path = self.output_path / "custom" / "MIGRATION_REPORT.json"
with open(report_path, 'w') as f:
json.dump(self.report.to_dict(), f, indent=2)
print(f"\n[*] Detailed report saved to: {report_path}")
return not has_errors

Copilot uses AI. Check for mistakes.
except ImportError as e:
print(f"Error: Could not import migration module. Make sure migrate_v1_to_v2.py is in the scripts directory.")
print(f"Import error: {e}")
sys.exit(1)
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test module exits the entire test process (sys.exit(1)) if the migration script import fails. Under pytest this will abort collection rather than reporting a normal test failure/skip. Prefer letting the ImportError raise (so pytest shows the failure) or conditionally skipping the suite when the module isn’t available.

Suggested change
sys.exit(1)
raise

Copilot uses AI. Check for mistakes.
migrator = MSCPMigrator(v1_project, v2_base, output_dir)
result = migrator.migrate(dry_run=False)

# Should have error but still complete
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_invalid_yaml_handling assigns result = migrator.migrate(...) but never asserts the expected return value. This makes the test pass regardless of whether migration succeeded or failed. Add an assertion for result that matches the intended contract (e.g., migration should return False when parse errors occur, or True if errors are non-fatal).

Suggested change
# Should have error but still complete
# Should have error but still complete
self.assertTrue(result)

Copilot uses AI. Check for mistakes.
Comment on lines +353 to +354
with open(gitignore_path, 'w') as f:
f.write(content)
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repository’s existing custom/.gitignore ignores everything in custom/ except rules and sections (see custom/.gitignore). This script generates a different .gitignore that doesn’t ignore all other files, and the migration guide suggests copying it into the v2 repo—potentially changing what gets tracked. Consider copying the canonical custom/.gitignore from self.v2_base_path/custom/.gitignore when present, or matching its ignore-everything-except-rules/sections behavior.

Suggested change
with open(gitignore_path, 'w') as f:
f.write(content)
canonical_gitignore = self.v2_base_path / "custom" / ".gitignore"
if canonical_gitignore.is_file():
shutil.copy2(canonical_gitignore, gitignore_path)
else:
with open(gitignore_path, 'w') as f:
f.write(content)

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +213
for rule_file in rules_dir.rglob("*.yaml"):
try:
with open(rule_file, 'r') as f:
yaml_data = yaml.safe_load(f)
if yaml_data and 'id' in yaml_data:
rules[yaml_data['id']] = str(rule_file)
except Exception as e:
self.report.add_error(f"Failed to load v2 rule {rule_file}: {str(e)}")

return rules

def validate_v1_project(self) -> bool:
"""
Validate that v1 project has required structure

Returns:
True if valid, False otherwise
"""
required_dirs = ["custom/rules", "custom/sections"]

for req_dir in required_dirs:
path = self.v1_path / req_dir
if not path.exists():
self.report.add_error(f"Missing required directory: {req_dir}")
return False

# Check for custom files
custom_rules = list((self.v1_path / "custom/rules").rglob("*.yaml"))
custom_sections = list((self.v1_path / "custom/sections").rglob("*.yaml"))

if not custom_rules and not custom_sections:
self.report.add_warning("No custom rules or sections found - migration will be minimal")

return True

def scan_custom_rules(self) -> List[Path]:
"""Find all custom rules in v1 project"""
custom_rules_dir = self.v1_path / "custom/rules"
rules = []

if custom_rules_dir.exists():
rules = list(custom_rules_dir.rglob("*.yaml"))
for rule in rules:
try:
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script only searches for *.yaml files (e.g., rglob('*.yaml') / glob('*.yaml')). Other repo scripts use **/*.y*ml to accept both .yaml and .yml (e.g., scripts/generate_baseline.py). For better compatibility with existing/custom projects, consider supporting .yml as well in scans and in v2 rule loading.

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +52
### 4. Project Configuration

- Organization-specific settings
- Custom metadata
- Baseline mappings and references

Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guide claims “Project Configuration” items (organization settings/metadata/baseline mappings) are migrated, but the script currently only copies custom rules/sections/baselines and generates .gitignore + migration metadata. Either implement configuration migration or adjust this section so users don’t expect files/settings that won’t be produced.

Copilot uses AI. Check for mistakes.

**Possible Solutions:**

1. The rule was renamed in v2.0 - check the migration mappings
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Troubleshooting suggests “check the migration mappings” for renamed rules, but the current implementation doesn’t load mappings from anywhere (_load_rule_mappings() returns {}). Consider either providing/mentioning where mappings live (file/URL) or updating this guidance to reflect current behavior (warnings only, no automatic remap).

Suggested change
1. The rule was renamed in v2.0 - check the migration mappings
1. The rule was renamed in v2.0 - review the warning in `MIGRATION_REPORT.json` and update your custom rule ID manually (the migration tool does not apply automatic rule ID mappings)

Copilot uses AI. Check for mistakes.
return True

# Step 3: Create output directory structure
print("\n[3] Creating output directory structure...")
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration writes into output_path even if it already exists, and shutil.copy2 will overwrite existing files with the same names. To keep the “low risk / reversible” promise, consider failing fast when output_path exists and is non-empty (or require an explicit --force flag).

Suggested change
print("\n[3] Creating output directory structure...")
print("\n[3] Creating output directory structure...")
if self.output_path.exists() and any(self.output_path.iterdir()):
print(f"[!] Output directory '{self.output_path}' already exists and is not empty. Aborting to avoid overwriting existing files.")
self.report.add_issue("Output directory already exists and is not empty; migration aborted to avoid overwriting files.")
self.report.print_summary()
return False

Copilot uses AI. Check for mistakes.
@robertgendler
Copy link
Copy Markdown
Collaborator

This is pretty awesome. We'll check it out and review it. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migration from MSCP 1 to MSCP 2.0

3 participants