The evaluation engine consumes a small JSON-based DSL to describe the expected state of the database after an agent run. Each expectation is expressed as an "assertion" evaluated against the diff between the before/after snapshots.

```json
{
  "strict": true,
  "assertions": [
    {
      "diff_type": "added",
      "entity": "messages",
      "where": {
        "channelId": {"eq": 123},
        "body": {"contains": "hello"}
      },
      "expected_count": 1
    },
    {
      "diff_type": "changed",
      "entity": "issues",
      "where": {"id": {"eq": 42}},
      "expected_changes": {
        "status": {"to": {"eq": "Done"}}
      }
    }
  ]
}
```

- `diff_type` – one of `added`, `removed`, `changed`.
- `entity` – table name (as it appears in the service schema).
- `where` – field predicates composed from the operator set below.
- `expected_count` – optional exact or bounded (`{"min":1}` / `{"max":2}`) match on results.
- `expected_changes` – for `changed`, lists fields and optional `from`/`to` predicates.
- `strict` – when true, the engine fails if it observes additional field changes beyond `expected_changes`.

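
To make the `expected_count` and `strict` semantics concrete, here is a minimal Python sketch. It is not the engine's code: the function names `count_matches` and `unexpected_fields` are hypothetical, and representing the observed changes as a plain set of field names is an assumption made for illustration.

```python
from typing import Any, Dict, Set, Union


def count_matches(actual: int, expected: Union[int, Dict[str, int]]) -> bool:
    """Check a result count against an exact or bounded expected_count."""
    if isinstance(expected, int):
        # Exact form, e.g. "expected_count": 1
        return actual == expected
    # Bounded form, e.g. {"min": 1} or {"max": 2}; a missing bound is open.
    lower = expected.get("min", 0)
    upper = expected.get("max", float("inf"))
    return lower <= actual <= upper


def unexpected_fields(changed: Set[str], expected_changes: Dict[str, Any]) -> Set[str]:
    """Fields that changed but are not listed in expected_changes.

    Under strict mode, a non-empty result would be treated as a failure.
    """
    return set(changed) - set(expected_changes)


expected_changes = {"status": {"to": {"eq": "Done"}}}
print(count_matches(1, {"min": 1}))                                # True
print(unexpected_fields({"status", "updatedAt"}, expected_changes))  # {'updatedAt'}
```

In this sketch a `changed` assertion with `"strict": true` would fail for the row above, because `updatedAt` changed without being listed in `expected_changes`.
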
Scalar comparisons:
| Operator | Meaning |
|---|---|
| `eq` | equality |
| `ne` | inequality |
| `gt` / `gte` | greater than / greater than or equal |
| `lt` / `lte` | less than / less than or equal |
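
One way to picture the scalar operators is as a lookup table onto Python's standard `operator` module. This is only a sketch: the names `SCALAR_OPS` and `apply_scalar` are hypothetical, and the actual engine may dispatch differently.

```python
import operator

# Hypothetical mapping from DSL operator names to Python comparisons.
SCALAR_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def apply_scalar(op: str, actual, expected) -> bool:
    """Evaluate one scalar predicate such as {"gte": 5} against a field value."""
    return bool(SCALAR_OPS[op](actual, expected))


print(apply_scalar("eq", 42, 42))   # True
print(apply_scalar("gte", 4, 5))    # False
```
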
Collection / string helpers:
| Operator | Meaning |
|---|---|
| `in` / `not_in` | membership check on sequences |
| `contains` | substring match (case sensitive) |
| `not_contains` | substring miss |
| `i_contains` | substring match (case insensitive) |
| `starts_with` | prefix check |
| `ends_with` | suffix check |
| `i_starts_with` | prefix check (case insensitive) |
| `i_ends_with` | suffix check (case insensitive) |
| `regex` | regular expression match |
| `has_any` | any overlapping element in arrays |
| `has_all` | all elements present in arrays |
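
The helpers above could be modeled roughly as follows. This is an illustrative sketch, not the engine's implementation, and it bakes in several assumptions: `HELPER_OPS` is a hypothetical name, `regex` is treated as an unanchored search (`re.search`), and case-insensitive variants compare lowercased strings.

```python
import re
from typing import Any, Callable, Dict

# Each entry takes (actual field value, expected value from the predicate).
HELPER_OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "in": lambda actual, expected: actual in expected,
    "not_in": lambda actual, expected: actual not in expected,
    "contains": lambda actual, expected: expected in actual,
    "not_contains": lambda actual, expected: expected not in actual,
    "i_contains": lambda actual, expected: expected.lower() in actual.lower(),
    "starts_with": lambda actual, expected: actual.startswith(expected),
    "ends_with": lambda actual, expected: actual.endswith(expected),
    "i_starts_with": lambda actual, expected: actual.lower().startswith(expected.lower()),
    "i_ends_with": lambda actual, expected: actual.lower().endswith(expected.lower()),
    # Assumption: unanchored search; the engine may require a full match.
    "regex": lambda actual, expected: re.search(expected, actual) is not None,
    "has_any": lambda actual, expected: any(item in actual for item in expected),
    "has_all": lambda actual, expected: all(item in actual for item in expected),
}

print(HELPER_OPS["i_contains"]("Hello World", "WORLD"))        # True
print(HELPER_OPS["has_all"](["bug", "ui"], ["ui", "api"]))     # False
```
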
Existence handling:
| Operator | Meaning |
|---|---|
| `exists` | `true` = field is not NULL, `false` = field is NULL |
Multiple operators in one predicate object are ANDed. Multiple `where` fields are ANDed. Dot paths are supported for nested objects (e.g. `start.timeZone`).
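
The AND semantics and dot-path lookup can be sketched as below. This is a simplified illustration under stated assumptions, not the engine's code: `resolve_path`, `OPS`, and `matches` are hypothetical names, only four operators are included, rows are plain dicts, and a missing key is treated the same as NULL.

```python
from typing import Any, Dict


def resolve_path(row: Dict[str, Any], path: str) -> Any:
    """Follow a dot path such as "start.timeZone" through nested dicts."""
    current: Any = row
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # Assumption: a missing key behaves like NULL.
        current = current[key]
    return current


# A small subset of the operator set, enough to show the combination rules.
OPS = {
    "eq": lambda actual, expected: actual == expected,
    "ne": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: actual is not None and expected in actual,
    "exists": lambda actual, expected: (actual is not None) == expected,
}


def matches(row: Dict[str, Any], where: Dict[str, Dict[str, Any]]) -> bool:
    """True only if every operator of every field predicate holds (AND)."""
    return all(
        OPS[op](resolve_path(row, field), expected)
        for field, predicate in where.items()
        for op, expected in predicate.items()
    )


row = {"id": 7, "summary": "Standup", "start": {"timeZone": "UTC"}, "location": None}
print(matches(row, {"start.timeZone": {"eq": "UTC"}, "location": {"exists": False}}))  # True
print(matches(row, {"summary": {"contains": "Stand", "ne": "Standup"}}))               # False
```

The second call returns `False` because `contains` holds but `ne` does not, and both operators in the same predicate object must hold.
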

- JSON schema: `backend/src/platform/evaluationEngine/dsl_schema.json`
- Engine implementation: `backend/src/platform/evaluationEngine/assertion.py`

Sample test scenarios for Slack agents:

- `slack_bench_v2.json` - Test cases covering message sending, channel ops, reactions, threading
- `slack_default.json` - Seed data