feat: implement semantic rule validation for DQ rules (#1169) #1203
Vsatyam013 wants to merge 4 commits into
All commits in PR should be signed ('git commit -S ...'). See https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits
ghanse left a comment
Hi @Vsatyam013, the linked issue focuses more on validating the entire ruleset to detect the following issues:
- Detect if a duplicate rule already exists with the same run config
- Detect if a similar rule exists with different arguments (e.g. a narrower threshold for is_in_range)
Can you make some changes to address these more specifically? I think it should be simpler because the check function inputs are mostly structured. We can document any limitations when users execute checks with Spark expressions.
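For illustration only, the two ruleset-level checks requested above could be sketched as follows. This is a minimal sketch, not code from the PR: the helper find_ruleset_issues is hypothetical, the dictionary layout of a check (check / function / arguments / column) is an assumption, and run config and criticality are ignored for brevity.

```python
import json
from collections import defaultdict


def _key(value):
    """Serialize a value deterministically so argument dicts can be compared."""
    return json.dumps(value, sort_keys=True, default=str)


def find_ruleset_issues(checks):
    """Hypothetical helper: return (duplicates, similar) as index pairs into `checks`.

    duplicates: same function and identical arguments.
    similar: same function and column, but different remaining arguments.
    """
    seen = {}  # (function, all arguments) -> index of first occurrence
    by_target = defaultdict(list)  # (function, column) -> indices of distinct rules
    duplicates, similar = [], []
    for idx, rule in enumerate(checks):
        check = rule.get("check", {})
        function = check.get("function")
        arguments = check.get("arguments", {})
        full_key = (function, _key(arguments))
        if full_key in seen:
            duplicates.append((seen[full_key], idx))
            continue
        seen[full_key] = idx
        target = (function, _key(arguments.get("column")))
        for other in by_target[target]:
            similar.append((other, idx))
        by_target[target].append(idx)
    return duplicates, similar


# Assumed check layout; the rule values are made up for the example.
checks = [
    {"criticality": "error", "check": {"function": "is_in_range",
        "arguments": {"column": "age", "min_limit": 0, "max_limit": 120}}},
    {"criticality": "error", "check": {"function": "is_in_range",
        "arguments": {"column": "age", "min_limit": 18, "max_limit": 65}}},
    {"criticality": "error", "check": {"function": "is_in_range",
        "arguments": {"column": "age", "min_limit": 0, "max_limit": 120}}},
    {"criticality": "warn", "check": {"function": "is_not_null",
        "arguments": {"column": "id"}}},
]
duplicates, similar = find_ruleset_issues(checks)
```

Here rule 2 repeats rule 0 and is reported as a duplicate, while rule 1 targets the same function and column with a narrower range and is reported as similar. Checks written as free-form Spark expressions would not be comparable this way, which is the limitation mentioned above.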
We update CHANGELOG automatically so these changes will need to be reverted.
Not sure we want to do this level of validation in DQX. There are a lot of scenarios we would need to cover.
@Vsatyam013 do you plan to complete the PR and address the review comments? There has been no update for 4 weeks.
Hi @mwojtyczka, sorry for the delay. I've reworked the implementation based on @ghanse's feedback. The SQL keyword validation approach has been removed and replaced with ruleset-level semantic checks: duplicate rule detection and conflicting rule detection (same function and column with different arguments). The validation is now integrated into validate_checks, save_checks, and load_checks with a configurable semantic_validation_mode parameter (warn, fail, or None). The CHANGELOG changes have also been reverted. Can you review it?
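As a rough illustration of how a three-valued semantic_validation_mode might behave, here is a minimal sketch. It is not the PR's implementation: report_findings and SemanticValidationError are hypothetical names, and the assumed semantics are that None skips reporting, warn logs each finding, and fail raises.

```python
import logging

logger = logging.getLogger("semantic_validation_sketch")


class SemanticValidationError(ValueError):
    """Hypothetical exception type, used only in this sketch."""


def report_findings(findings, semantic_validation_mode="warn"):
    """Act on ruleset findings according to the mode; return the findings reported."""
    # Assumption: None disables semantic validation entirely.
    if semantic_validation_mode is None or not findings:
        return []
    if semantic_validation_mode == "warn":
        for finding in findings:
            logger.warning("Semantic validation: %s", finding)
        return list(findings)
    if semantic_validation_mode == "fail":
        raise SemanticValidationError("; ".join(findings))
    raise ValueError(f"Unknown semantic_validation_mode: {semantic_validation_mode!r}")
```

Rejecting unknown mode values explicitly keeps a typo such as "warning" from silently disabling the checks.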
Changes
Implements semantic validation for DQ rule expressions as part of issue #1169.
Adds a SemanticValidator class that checks SQL expressions for forbidden destructive keywords (DROP, TRUNCATE, DELETE, UPDATE, INSERT, ALTER) before rules are applied. Validation results are collected into ChecksValidationStatus rather than raising exceptions, consistent with the existing validation contract.
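The keyword check described here could look roughly like the sketch below. It is an assumption-laden illustration, not the SemanticValidator class from the PR: find_forbidden_keywords is a hypothetical function, and it returns its findings as a list instead of raising, to mirror the collect-don't-raise behaviour described above.

```python
import re

# Keyword list taken from the PR description.
FORBIDDEN_KEYWORDS = ("DROP", "TRUNCATE", "DELETE", "UPDATE", "INSERT", "ALTER")

# Word boundaries keep identifiers such as "updated_at" from matching "UPDATE".
_PATTERN = re.compile(r"\b(" + "|".join(FORBIDDEN_KEYWORDS) + r")\b", re.IGNORECASE)


def find_forbidden_keywords(expression):
    """Return forbidden keywords found in the expression, uppercased, in order of first appearance."""
    found = []
    for match in _PATTERN.finditer(expression):
        keyword = match.group(1).upper()
        if keyword not in found:
            found.append(keyword)
    return found
```

A plain regex scan like this also flags keywords inside string literals (for example status = 'DELETE'), which is one of the many scenarios a check at this level would have to handle.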
Linked issues
Resolves #1169
Tests
Documentation and Demos