Evaluation DSL

The evaluation engine consumes a small JSON-based DSL to describe the expected state of the database after an agent run. Each expectation is expressed as an "assertion" evaluated against the diff between the before/after snapshots.

Anatomy of a spec

{
  "strict": true,
  "assertions": [
    {
      "diff_type": "added",
      "entity": "messages",
      "where": {
        "channelId": {"eq": 123},
        "body": {"contains": "hello"}
      },
      "expected_count": 1
    },
    {
      "diff_type": "changed",
      "entity": "issues",
      "where": {"id": {"eq": 42}},
      "expected_changes": {
        "status": {"to": {"eq": "Done"}}
      }
    }
  ]
}

diff_type – one of added, removed, changed.
entity – table name (as it appears in the service schema).
where – field predicates composed from the operator set below.
expected_count – optional; either an exact integer or a bound object such as {"min": 1} or {"max": 2}, applied to the number of matching results.
expected_changes – for changed assertions, maps each field to optional from/to predicates.
strict – when true, the engine fails if it observes additional field changes beyond expected_changes.
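
The count and strictness rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the engine's code: the function names `count_matches` and `unexpected_changes` are hypothetical, and two behaviours are assumptions rather than documented facts: an omitted expected_count imposes no constraint, and min and max may appear together in one bound object.

```python
from typing import Optional, Union


def count_matches(expected: Union[int, dict, None], actual: int) -> bool:
    """Check an observed result count against an expected_count value."""
    if expected is None:
        # Assumption: an omitted expected_count means "no count constraint".
        return True
    if isinstance(expected, int):
        # Exact form, e.g. "expected_count": 1
        return actual == expected
    # Bounded form, e.g. {"min": 1} or {"max": 2}
    lower: Optional[int] = expected.get("min")
    upper: Optional[int] = expected.get("max")
    return (lower is None or actual >= lower) and (upper is None or actual <= upper)


def unexpected_changes(changed_fields: set, expected_changes: dict, strict: bool) -> set:
    """Return changed fields not listed in expected_changes (only when strict)."""
    if not strict:
        return set()
    return set(changed_fields) - set(expected_changes)


if __name__ == "__main__":
    print(count_matches({"min": 1}, 3))
    print(unexpected_changes({"status", "updatedAt"}, {"status": {}}, strict=True))
```

Under this reading, a strict spec would fail the issues assertion from the example if any field other than status changed on the matched row.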

Operators

Scalar comparisons:

Operator	Meaning
`eq`	equality
`ne`	inequality
`gt`/`gte`	greater than / greater than or equal
`lt`/`lte`	less than / less than or equal
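
One way to picture the scalar operators is as a lookup table over Python's standard `operator` module. The names `SCALAR_OPS` and `check_scalar` are hypothetical, and the real engine may handle NULLs and type mismatches differently; this sketch simply lets Python's comparison semantics apply.

```python
import operator

# Map each DSL operator name to the equivalent Python comparison.
SCALAR_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def check_scalar(value, predicate: dict) -> bool:
    """Evaluate a predicate object such as {"gt": 1, "lte": 5} against a value.

    Every operator in the object must hold (they are ANDed).
    """
    return all(SCALAR_OPS[op](value, operand) for op, operand in predicate.items())


if __name__ == "__main__":
    print(check_scalar(5, {"gt": 1, "lte": 5}))
    print(check_scalar("Done", {"eq": "Done"}))
```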

Collection / string helpers:

Operator	Meaning
`in` / `not_in`	membership check on sequences
`contains`	substring match (case sensitive)
`not_contains`	substring is absent
`i_contains`	substring match (case insensitive)
`starts_with`	prefix check
`ends_with`	suffix check
`i_starts_with`	prefix check (case insensitive)
`i_ends_with`	suffix check (case insensitive)
`regex`	regular expression match
`has_any`	array field shares at least one element with the given list
`has_all`	array field contains every element of the given list
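
The helpers in this table map closely onto Python string and sequence operations. The sketch below is illustrative only: `HELPERS` and `apply_helper` are hypothetical names, case-insensitive matching is approximated with `str.casefold`, and `regex` is assumed to match anywhere in the string (`re.search`) rather than requiring a full match. The engine may differ on any of these points.

```python
import re

# value = the field value from the diff, arg = the operand from the spec.
HELPERS = {
    "in": lambda value, arg: value in arg,
    "not_in": lambda value, arg: value not in arg,
    "contains": lambda value, arg: arg in value,
    "not_contains": lambda value, arg: arg not in value,
    "i_contains": lambda value, arg: arg.casefold() in value.casefold(),
    "starts_with": lambda value, arg: value.startswith(arg),
    "ends_with": lambda value, arg: value.endswith(arg),
    "i_starts_with": lambda value, arg: value.casefold().startswith(arg.casefold()),
    "i_ends_with": lambda value, arg: value.casefold().endswith(arg.casefold()),
    # Assumption: a regex matches if it is found anywhere in the value.
    "regex": lambda value, arg: re.search(arg, value) is not None,
    "has_any": lambda value, arg: any(item in value for item in arg),
    "has_all": lambda value, arg: all(item in value for item in arg),
}


def apply_helper(op: str, value, arg) -> bool:
    """Apply one collection/string helper to a field value."""
    return HELPERS[op](value, arg)


if __name__ == "__main__":
    print(apply_helper("i_contains", "Hello team", "hello"))
    print(apply_helper("has_all", ["bug", "p1"], ["bug", "p2"]))
```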

Existence handling:

Operator	Meaning
`exists`	`true` = field is not NULL, `false` = field is NULL

Multiple operators in one predicate object are ANDed. Multiple where fields are ANDed. Dot paths are supported for nested objects (e.g. start.timeZone).
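
The combination rules can be sketched as a small matcher that resolves dot paths and requires every operator on every field to hold. The names `resolve_path` and `matches_where` are hypothetical, only three operators are included to keep the example short, and treating a missing path segment as NULL is an assumption, not documented engine behaviour.

```python
from typing import Any


def resolve_path(row: dict, path: str) -> Any:
    """Follow a dot path such as "start.timeZone" through nested dicts.

    Assumption: a missing segment resolves to None (treated like NULL).
    """
    current: Any = row
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# A deliberately small operator subset for illustration.
OPS = {
    "eq": lambda value, arg: value == arg,
    "contains": lambda value, arg: isinstance(value, str) and arg in value,
    "exists": lambda value, arg: (value is not None) == arg,
}


def matches_where(row: dict, where: dict) -> bool:
    """True only if every operator on every field holds (AND across both levels)."""
    return all(
        OPS[op](resolve_path(row, field), operand)
        for field, predicate in where.items()
        for op, operand in predicate.items()
    )


if __name__ == "__main__":
    row = {"channelId": 123, "body": "hello there", "start": {"timeZone": "UTC"}}
    print(matches_where(row, {"channelId": {"eq": 123}, "body": {"contains": "hello"}}))
    print(matches_where(row, {"start.timeZone": {"eq": "UTC"}}))
```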

Resources

JSON schema: backend/src/platform/evaluationEngine/dsl_schema.json
Engine implementation: backend/src/platform/evaluationEngine/assertion.py

Sample test scenarios for Slack agents:
slack_bench_v2.json - Test cases covering message sending, channel ops, reactions, threading
slack_default.json - Seed data
