Skip to content

fix(validator): prevent XML validator hang on recursive DTD entity expansion#509

Open
MD-Mushfiqur123 wants to merge 5 commits into
Boeing:mainfrom
MD-Mushfiqur123:main
Open

fix(validator): prevent XML validator hang on recursive DTD entity expansion#509
MD-Mushfiqur123 wants to merge 5 commits into
Boeing:mainfrom
MD-Mushfiqur123:main

Conversation

@MD-Mushfiqur123
Copy link
Copy Markdown

Problem

The XML validator hangs indefinitely when parsing a file with recursive DTD entity definitions. The helium parser expands entities without depth or size limits, and ValidateSyntax() passes context.Background() with no timeout.

Any XML file with entities that reference other entities causes unbounded CPU usage — the process pins a core and never returns.

Root Cause

ctx := context.Background()
_, err := helium.NewParser().ValidateDTD(true).Parse(ctx, b)

context.Background() never expires, so there is nothing to abort the parse.

Fix

Wrap the parse call in a context with a 10-second timeout:

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
_, err := helium.NewParser().ValidateDTD(true).Parse(ctx, b)

This preserves DTD validation for normal files while preventing the hang. If the timeout fires, the parser returns a context.DeadlineExceeded error which is propagated as a validation failure.

Testing

  • All existing validator tests pass
  • Manual verification: a crafted XML file with recursive entities now returns a timeout error instead of hanging

Fixes #503

…pansion

The XML validator uses context.Background() when calling helium's DTD parser, which means a malicious or accidental XML file with recursive entity definitions will cause unbounded CPU usage and hang forever.

Fix by wrapping the parse call in a 10-second context.WithTimeout so the parser aborts and returns a validation error instead of hanging.
@MD-Mushfiqur123 MD-Mushfiqur123 requested a review from a team as a code owner May 27, 2026 08:10
@kehoecj kehoecj added OSS Community Contribution Contributions from the OSS Community waiting-on-maintainer-review PR is waiting to be reviewed and functionally tested by the maintainers labels May 27, 2026
Copy link
Copy Markdown
Collaborator

@kehoecj kehoecj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a regression test. A minimal recursive entity is enough to prove the parser doesn't hang. Something like:

func Test_XMLRecursiveEntityDoesNotHang(t *testing.T) {
    recursive := []byte(`<?xml version="1.0"?>
    <!DOCTYPE foo [
    <!ENTITY a "&b;">
    <!ENTITY b "&a;">
    ]>
    <foo>&a;</foo>`)
 
    valid, err := XMLValidator{}.ValidateSyntax(recursive)
    require.False(t, valid)
    require.Error(t, err)
}

Comment thread pkg/validator/xml.go

func (XMLValidator) ValidateSyntax(b []byte) (bool, error) {
ctx := context.Background()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add the same timeout to ValidateXSD — it has the same context.Background() pattern and the same risk:

func ValidateXSD(b []byte, schemaPath string) (bool, error) {
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    ....

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MD-Mushfiqur123 Need to address this

@kehoecj kehoecj added pr-action-requested PR is awaiting feedback from the submitting developer and removed waiting-on-maintainer-review PR is waiting to be reviewed and functionally tested by the maintainers labels May 27, 2026
@MD-Mushfiqur123
Copy link
Copy Markdown
Author

@kehoecj regression test added — Test_XMLRecursiveEntityDoesNotHang verifies the parser returns an error on recursive entity declarations instead of hanging.

@kehoecj kehoecj self-requested a review May 28, 2026 13:34
@MD-Mushfiqur123
Copy link
Copy Markdown
Author

@kehoecj the regression test has been added — Test_XMLRecursiveEntityDoesNotHang verifies the parser returns an error on recursive entity declarations instead of hanging. Could you take another look?

@MD-Mushfiqur123
Copy link
Copy Markdown
Author

All CI checks are passing and the PR is mergeable. @kehoecj could you take another look when you have a moment?

Copy link
Copy Markdown
Collaborator

@kehoecj kehoecj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @MD-Mushfiqur123 - just need this comment addressed and you're good to go: https://github.com/Boeing/config-file-validator/pull/509/changes#r3311052915

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

OSS Community Contribution Contributions from the OSS Community pr-action-requested PR is awaiting feedback from the submitting developer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

XML validator hangs on recursive entity expansion

3 participants