fix: allow spaces in ADO repository names when parsing URLs#437
fix: allow spaces in ADO repository names when parsing URLs#437PremKumarML wants to merge 3 commits intomicrosoft:mainfrom
Conversation
Azure DevOps repository names can contain spaces (URL-encoded as %20). The Phase 3 validation regex in DependencyReference.parse() was incorrectly using [a-zA-Z0-9._-]+ for the repo segment, which disallows spaces. Fixed the regex to match the already-correct intermediate validation pattern [a-zA-Z0-9._\\- ]+ that permits spaces in ADO names. Also fixed to_github_url() to percent-encode the repo name when it contains spaces, consistent with how the project name was already encoded. Fixes: apm install dev.azure.com/org/Project%20Name/_git/Repo%20Name failing with 'Invalid Azure DevOps repository format' error.
|
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Pull request overview
This PR fixes Azure DevOps dependency parsing so apm install accepts repository names containing spaces (commonly URL-encoded as %20) and ensures generated ADO URLs percent-encode the repo segment.
Changes:
- Allow spaces in the ADO repo segment during
DependencyReference.parse()validation. - Percent-encode
ado_repoinDependencyReference.to_github_url()for Azure DevOps URLs. - Add/extend unit tests covering ADO repo names with spaces (encoded and literal).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/apm_cli/models/dependency/reference.py |
Updates ADO validation regex to allow spaces in repo names and encodes ado_repo in ADO URL generation. |
tests/test_apm_package_models.py |
Adds regression tests for parsing ADO repo names with spaces and verifies URL encoding behavior. |
Comments suppressed due to low confidence (1)
tests/test_apm_package_models.py:152
- This comment contains a non-ASCII em dash (
—). The repo encoding rules require printable ASCII only in*.pyfiles (see.github/instructions/encoding.instructions.md). Please replace—with an ASCII equivalent like-or--.
# These are not injection attacks — they are legitimate host references.
| - ComposioHQ/awesome-claude-skills/brand-guidelines → True | ||
| - owner/repo/prompts/file.prompt.md → False (is_virtual_file) | ||
| - owner/repo/collections/name → False (is_virtual_collection) |
There was a problem hiding this comment.
This docstring uses the non-ASCII arrow character (→). Per the repo encoding rules, Python source files must stay within printable ASCII (see .github/instructions/encoding.instructions.md). Please replace → with an ASCII equivalent like ->.
tests/test_apm_package_models.py
Outdated
| """Shorthand @alias syntax is no longer supported — @ in shorthand is rejected.""" | ||
| with pytest.raises(ValueError): | ||
| DependencyReference.parse("user/repo@myalias") | ||
|
|
||
| def test_parse_with_reference_and_alias_shorthand_not_parsed(self): | ||
| """Shorthand #ref@alias — @ is no longer parsed as alias separator.""" |
There was a problem hiding this comment.
These docstrings use a non-ASCII em dash (—). The repo encoding rules require printable ASCII only in *.py files (see .github/instructions/encoding.instructions.md). Please replace — with an ASCII equivalent like - or -- (and consider a quick search for other occurrences in this file).
This issue also appears on line 152 of the same file.
See below for a potential fix:
"""Shorthand @alias syntax is no longer supported -- @ in shorthand is rejected."""
with pytest.raises(ValueError):
DependencyReference.parse("user/repo@myalias")
def test_parse_with_reference_and_alias_shorthand_not_parsed(self):
"""Shorthand #ref@alias -- @ is no longer parsed as alias separator."""
| is_ado_final = host and is_azure_devops_hostname(host) | ||
| if is_ado_final: | ||
| if not re.match(r'^[a-zA-Z0-9._-]+/[a-zA-Z0-9._\- ]+/[a-zA-Z0-9._-]+$', repo_url): | ||
| raise ValueError(f"Invalid Azure DevOps repository format: {repo_url}. Expected 'org/project/repo'") | ||
| ado_parts = repo_url.split('/') | ||
| if not re.match( | ||
| r"^[a-zA-Z0-9._-]+/[a-zA-Z0-9._\- ]+/[a-zA-Z0-9._\- ]+$", repo_url | ||
| ): | ||
| raise ValueError( | ||
| f"Invalid Azure DevOps repository format: {repo_url}. Expected 'org/project/repo'" | ||
| ) | ||
| ado_parts = repo_url.split("/") | ||
| ado_organization = ado_parts[0] | ||
| ado_project = ado_parts[1] | ||
| ado_repo = ado_parts[2] |
There was a problem hiding this comment.
parse() final validation for Azure DevOps (is_ado_final) relies on a regex but does not explicitly reject traversal segments like . / ... _parse_standard_url() already rejects these, but SSH parsing can bypass that and still reach this block. Consider splitting repo_url and raising PathTraversalError if any segment is . or .. before accepting the ADO reference.
See below for a potential fix:
ado_parts = repo_url.split("/")
for part in ado_parts:
if part in (".", ".."):
raise PathTraversalError(
f"Path traversal segment '{part}' is not allowed in Azure DevOps repository path: {repo_url}"
)
ado_organization = ado_parts[0]
ado_project = ado_parts[1]
ado_repo = ado_parts[2]
else:
segments = repo_url.split("/")
for seg in segments:
if seg in (".", ".."):
raise PathTraversalError(
f"Path traversal segment '{seg}' is not allowed in repository path: {repo_url}"
)
| raise ValueError( | ||
| f"Invalid repository format: {repo_url}. Contains invalid characters" | ||
| ) | ||
| for seg in segments: |
There was a problem hiding this comment.
In the non-ADO validation branch, segments = repo_url.split("/") is only checked against the character regex. For SSH inputs like git@host:owner/../repo, .. passes the regex and will produce a DependencyReference with traversal segments (only rejected later by get_install_path). Consider rejecting . / .. segments here (similar to the checks in _parse_standard_url) to avoid path traversal in downstream URL/path handling.
| for seg in segments: | |
| for seg in segments: | |
| if seg in (".", ".."): | |
| raise ValueError( | |
| f"Invalid repository format: {repo_url}. Contains '.' or '..' path segments" | |
| ) |
| if self.is_azure_devops(): | ||
| # ADO format: https://dev.azure.com/org/project/_git/repo | ||
| project = urllib.parse.quote(self.ado_project, safe='') | ||
| return f"https://{host}/{self.ado_organization}/{project}/_git/{self.ado_repo}" | ||
| project = urllib.parse.quote(self.ado_project, safe="") | ||
| repo = urllib.parse.quote(self.ado_repo, safe="") | ||
| return f"https://{host}/{self.ado_organization}/{project}/_git/{repo}" |
There was a problem hiding this comment.
Docs currently show ADO URLs only with simple org/project/repo names. Since parsing/to_github_url now explicitly supports spaces in ADO repository names (via %20 encoding or literal spaces), please update the Starlight docs (e.g., docs/src/content/docs/getting-started/authentication.md and/or guides/dependencies.md) with a short note/example so users know how to express names with spaces.
- Replace non-ASCII arrows and em-dashes with ASCII equivalents in reference.py and test_apm_package_models.py (encoding rule compliance) - Add path traversal guards for '.' and '..' segments in ADO and non-ADO validation blocks in DependencyReference.parse() - Update authentication.md and dependencies.md docs with note on using percent-encoded spaces in ADO project/repo names
Closes #436
Summary
Fixes
apm installfailing when an Azure DevOps repository name contains spaces (URL-encoded as%20).Root Cause
Phase 3 validation regex in
DependencyReference.parse()allowed spaces in the ADO project segment but not the repo segment. Azure DevOps allows spaces in both.Also
to_github_url()was not percent-encoding the repo name.Changes
reference.py: regex fix[a-zA-Z0-9._\-]+->[a-zA-Z0-9._\- ]+for repo segmentreference.py:to_github_url()now percent-encodesado_repotests/test_apm_package_models.py: added tests for spaces in repo namesTests
105 passed in
tests/test_apm_package_models.py