Skip to content

fix: allow spaces in ADO repository names when parsing URLs#437

Open
PremKumarML wants to merge 3 commits intomicrosoft:mainfrom
PremKumarML:fix/ado-url-spaces-in-repo-name
Open

fix: allow spaces in ADO repository names when parsing URLs#437
PremKumarML wants to merge 3 commits intomicrosoft:mainfrom
PremKumarML:fix/ado-url-spaces-in-repo-name

Conversation

@PremKumarML
Copy link

@PremKumarML PremKumarML commented Mar 24, 2026

Closes #436

Summary

Fixes apm install failing when an Azure DevOps repository name contains spaces (URL-encoded as %20).

Root Cause

Phase 3 validation regex in DependencyReference.parse() allowed spaces in the ADO project segment but not the repo segment. Azure DevOps allows spaces in both.

Also to_github_url() was not percent-encoding the repo name.

Changes

  1. reference.py: regex fix [a-zA-Z0-9._\-]+ -> [a-zA-Z0-9._\- ]+ for repo segment
  2. reference.py: to_github_url() now percent-encodes ado_repo
  3. tests/test_apm_package_models.py: added tests for spaces in repo names

Tests

105 passed in tests/test_apm_package_models.py

Azure DevOps repository names can contain spaces (URL-encoded as %20).
The Phase 3 validation regex in DependencyReference.parse() was incorrectly
using [a-zA-Z0-9._-]+ for the repo segment, which disallows spaces.

Fixed the regex to match the already-correct intermediate validation
pattern [a-zA-Z0-9._\\- ]+ that permits spaces in ADO names.

Also fixed to_github_url() to percent-encode the repo name when it
contains spaces, consistent with how the project name was already encoded.

Fixes: apm install dev.azure.com/org/Project%20Name/_git/Repo%20Name failing
with 'Invalid Azure DevOps repository format' error.
@PremKumarML
Copy link
Author

@microsoft-github-policy-service agree

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes Azure DevOps dependency parsing so apm install accepts repository names containing spaces (commonly URL-encoded as %20) and ensures generated ADO URLs percent-encode the repo segment.

Changes:

  • Allow spaces in the ADO repo segment during DependencyReference.parse() validation.
  • Percent-encode ado_repo in DependencyReference.to_github_url() for Azure DevOps URLs.
  • Add/extend unit tests covering ADO repo names with spaces (encoded and literal).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
src/apm_cli/models/dependency/reference.py Updates ADO validation regex to allow spaces in repo names and encodes ado_repo in ADO URL generation.
tests/test_apm_package_models.py Adds regression tests for parsing ADO repo names with spaces and verifies URL encoding behavior.
Comments suppressed due to low confidence (1)

tests/test_apm_package_models.py:152

  • This comment contains a non-ASCII em dash (). The repo encoding rules require printable ASCII only in *.py files (see .github/instructions/encoding.instructions.md). Please replace with an ASCII equivalent like - or --.
        # These are not injection attacks — they are legitimate host references.

Comment on lines 103 to 105
- ComposioHQ/awesome-claude-skills/brand-guidelines → True
- owner/repo/prompts/file.prompt.md → False (is_virtual_file)
- owner/repo/collections/name → False (is_virtual_collection)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring uses the non-ASCII arrow character (). Per the repo encoding rules, Python source files must stay within printable ASCII (see .github/instructions/encoding.instructions.md). Please replace with an ASCII equivalent like ->.

Copilot uses AI. Check for mistakes.
Comment on lines 59 to 64
"""Shorthand @alias syntax is no longer supported — @ in shorthand is rejected."""
with pytest.raises(ValueError):
DependencyReference.parse("user/repo@myalias")

def test_parse_with_reference_and_alias_shorthand_not_parsed(self):
"""Shorthand #ref@alias — @ is no longer parsed as alias separator."""
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docstrings use a non-ASCII em dash (). The repo encoding rules require printable ASCII only in *.py files (see .github/instructions/encoding.instructions.md). Please replace with an ASCII equivalent like - or -- (and consider a quick search for other occurrences in this file).

This issue also appears on line 152 of the same file.

See below for a potential fix:

        """Shorthand @alias syntax is no longer supported -- @ in shorthand is rejected."""
        with pytest.raises(ValueError):
            DependencyReference.parse("user/repo@myalias")

    def test_parse_with_reference_and_alias_shorthand_not_parsed(self):
        """Shorthand #ref@alias -- @ is no longer parsed as alias separator."""

Copilot uses AI. Check for mistakes.
Comment on lines 919 to 930
is_ado_final = host and is_azure_devops_hostname(host)
if is_ado_final:
if not re.match(r'^[a-zA-Z0-9._-]+/[a-zA-Z0-9._\- ]+/[a-zA-Z0-9._-]+$', repo_url):
raise ValueError(f"Invalid Azure DevOps repository format: {repo_url}. Expected 'org/project/repo'")
ado_parts = repo_url.split('/')
if not re.match(
r"^[a-zA-Z0-9._-]+/[a-zA-Z0-9._\- ]+/[a-zA-Z0-9._\- ]+$", repo_url
):
raise ValueError(
f"Invalid Azure DevOps repository format: {repo_url}. Expected 'org/project/repo'"
)
ado_parts = repo_url.split("/")
ado_organization = ado_parts[0]
ado_project = ado_parts[1]
ado_repo = ado_parts[2]
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse() final validation for Azure DevOps (is_ado_final) relies on a regex but does not explicitly reject traversal segments like . / ... _parse_standard_url() already rejects these, but SSH parsing can bypass that and still reach this block. Consider splitting repo_url and raising PathTraversalError if any segment is . or .. before accepting the ADO reference.

See below for a potential fix:

            ado_parts = repo_url.split("/")
            for part in ado_parts:
                if part in (".", ".."):
                    raise PathTraversalError(
                        f"Path traversal segment '{part}' is not allowed in Azure DevOps repository path: {repo_url}"
                    )
            ado_organization = ado_parts[0]
            ado_project = ado_parts[1]
            ado_repo = ado_parts[2]
        else:
            segments = repo_url.split("/")
            for seg in segments:
                if seg in (".", ".."):
                    raise PathTraversalError(
                        f"Path traversal segment '{seg}' is not allowed in repository path: {repo_url}"
                    )

Copilot uses AI. Check for mistakes.
raise ValueError(
f"Invalid repository format: {repo_url}. Contains invalid characters"
)
for seg in segments:
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the non-ADO validation branch, segments = repo_url.split("/") is only checked against the character regex. For SSH inputs like git@host:owner/../repo, .. passes the regex and will produce a DependencyReference with traversal segments (only rejected later by get_install_path). Consider rejecting . / .. segments here (similar to the checks in _parse_standard_url) to avoid path traversal in downstream URL/path handling.

Suggested change
for seg in segments:
for seg in segments:
if seg in (".", ".."):
raise ValueError(
f"Invalid repository format: {repo_url}. Contains '.' or '..' path segments"
)

Copilot uses AI. Check for mistakes.
Comment on lines 994 to +998
if self.is_azure_devops():
# ADO format: https://dev.azure.com/org/project/_git/repo
project = urllib.parse.quote(self.ado_project, safe='')
return f"https://{host}/{self.ado_organization}/{project}/_git/{self.ado_repo}"
project = urllib.parse.quote(self.ado_project, safe="")
repo = urllib.parse.quote(self.ado_repo, safe="")
return f"https://{host}/{self.ado_organization}/{project}/_git/{repo}"
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs currently show ADO URLs only with simple org/project/repo names. Since parsing/to_github_url now explicitly supports spaces in ADO repository names (via %20 encoding or literal spaces), please update the Starlight docs (e.g., docs/src/content/docs/getting-started/authentication.md and/or guides/dependencies.md) with a short note/example so users know how to express names with spaces.

Copilot uses AI. Check for mistakes.
- Replace non-ASCII arrows and em-dashes with ASCII equivalents in
  reference.py and test_apm_package_models.py (encoding rule compliance)
- Add path traversal guards for '.' and '..' segments in ADO and
  non-ADO validation blocks in DependencyReference.parse()
- Update authentication.md and dependencies.md docs with note on
  using percent-encoded spaces in ADO project/repo names
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants