fix(validator): prevent XML validator hang on recursive DTD entity expansion#509
fix(validator): prevent XML validator hang on recursive DTD entity expansion#509MD-Mushfiqur123 wants to merge 5 commits into
Conversation
…pansion The XML validator uses context.Background() when calling helium's DTD parser, which means a malicious or accidental XML file with recursive entity definitions will cause unbounded CPU usage and hang forever. Fix by wrapping the parse call in a 10-second context.WithTimeout so the parser aborts and returns a validation error instead of hanging.
kehoecj
left a comment
There was a problem hiding this comment.
Please add a regression test. A minimal recursive entity is enough to prove the parser doesn't hang. Something like:
func Test_XMLRecursiveEntityDoesNotHang(t *testing.T) {
recursive := []byte(`<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY a "&b;">
<!ENTITY b "&a;">
]>
<foo>&a;</foo>`)
valid, err := XMLValidator{}.ValidateSyntax(recursive)
require.False(t, valid)
require.Error(t, err)
}|
|
||
| func (XMLValidator) ValidateSyntax(b []byte) (bool, error) { | ||
| ctx := context.Background() | ||
| ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) |
There was a problem hiding this comment.
Please also add the same timeout to ValidateXSD — it has the same context.Background() pattern and the same risk:
func ValidateXSD(b []byte, schemaPath string) (bool, error) {
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
....|
@kehoecj regression test added — Test_XMLRecursiveEntityDoesNotHang verifies the parser returns an error on recursive entity declarations instead of hanging. |
|
@kehoecj the regression test has been added — Test_XMLRecursiveEntityDoesNotHang verifies the parser returns an error on recursive entity declarations instead of hanging. Could you take another look? |
|
All CI checks are passing and the PR is mergeable. @kehoecj could you take another look when you have a moment? |
kehoecj
left a comment
There was a problem hiding this comment.
Hey @MD-Mushfiqur123 - just need this comment addressed and you're good to go: https://github.com/Boeing/config-file-validator/pull/509/changes#r3311052915
Problem
The XML validator hangs indefinitely when parsing a file with recursive DTD entity definitions. The
heliumparser expands entities without depth or size limits, andValidateSyntax()passescontext.Background()with no timeout.Any XML file with entities that reference other entities causes unbounded CPU usage — the process pins a core and never returns.
Root Cause
context.Background()never expires, so there is nothing to abort the parse.Fix
Wrap the parse call in a context with a 10-second timeout:
This preserves DTD validation for normal files while preventing the hang. If the timeout fires, the parser returns a
context.DeadlineExceedederror which is propagated as a validation failure.Testing
Fixes #503