Draft
2 changes: 2 additions & 0 deletions docs.json
Original file line number Diff line number Diff line change
@@ -724,6 +724,8 @@
"redis/search/aggregations",
"redis/search/counting",
"redis/search/aliases",
"redis/search/command-reference",
"redis/search/troubleshooting",
{
"group": "Query Operators",
"pages": [
@@ -16,8 +16,32 @@ Use bucket aggregations when you want segmented analytics (for example by catego
| [`$dateHistogram`](./date-histogram) | Fixed date/time intervals |
| [`$facet`](./facet) | Hierarchical FACET paths |

### Input Format

Every bucket operator takes an object with a `field` property and operator-specific parameters:

**`$terms`** — group by distinct values:
```json
Contributor Author

INFO — Bucket aggregation 'Input Format' block is not valid JSON (multiple root objects)

The Input Format example in the bucket aggregations overview shows four separate JSON objects on consecutive lines inside a single ```json block. This is not valid JSON (a JSON document must have a single root value). While each line is individually valid, the block as a whole could confuse readers or tools.

Recommendation: Either separate each example into its own code block, or use JSONL / plain text syntax highlighting, or wrap them in an array. A common pattern is to show each operator as a separate mini-example with a label.

[dx] confidence: 75%

{"by_category": {"$terms": {"field": "category", "size": 10}}}
```

**`$range`** — custom range buckets:
```json
{"price_ranges": {"$range": {"field": "price", "ranges": [{"to": 50}, {"from": 50, "to": 100}, {"from": 100}]}}}
```

**`$histogram`** — fixed numeric intervals:
```json
{"price_buckets": {"$histogram": {"field": "price", "interval": 10}}}
```

**`$dateHistogram`** — fixed time intervals:
```json
{"by_month": {"$dateHistogram": {"field": "createdAt", "fixedInterval": "30d"}}}
```
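
The four input shapes above can also be produced programmatically. The sketch below uses plain Python dictionaries only. The helper names (`terms`, `range_buckets`, `histogram`, `date_histogram`) are hypothetical and not part of any SDK, and they cover only the parameters that appear in the examples above.

```python
import json

# Hypothetical helpers (not part of any SDK). Each returns the operator
# object shown in the examples above, ready to be placed under an alias.

def terms(field, size):
    return {"$terms": {"field": field, "size": size}}

def range_buckets(field, ranges):
    return {"$range": {"field": field, "ranges": ranges}}

def histogram(field, interval):
    return {"$histogram": {"field": field, "interval": interval}}

def date_histogram(field, fixed_interval):
    return {"$dateHistogram": {"field": field, "fixedInterval": fixed_interval}}

# Alias -> operator object, mirroring the four examples above.
aggregations = {
    "by_category": terms("category", 10),
    "price_ranges": range_buckets(
        "price", [{"to": 50}, {"from": 50, "to": 100}, {"from": 100}]
    ),
    "price_buckets": histogram("price", 10),
    "by_month": date_histogram("createdAt", "30d"),
}

print(json.dumps(aggregations, indent=2))
```

Building the objects through small functions keeps the `field` property mandatory by construction, which is the one requirement the section states for every bucket operator.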

### Behavior Notes

- Bucket operators can contain nested `$aggs`.
- Bucket operators can contain nested `$aggs` for per-bucket metrics.
- `$terms`, `$range`, `$histogram`, and `$dateHistogram` support nested `$aggs`.
- `$facet` does not support nested `$aggs` and cannot be used as a sub-aggregation.
@@ -20,9 +20,21 @@ Use metrics when you want one value (or stats object), not grouped buckets.
| [`$extendedStats`](./extended-stats) | `$stats` + variance and std deviation metrics |
| [`$percentiles`](./percentiles) | Distribution percent points |

### Input Format

Every metric operator takes an object with at least a `field` property:

```json
{"alias_name": {"$avg": {"field": "price"}}}
{"alias_name": {"$avg": {"field": "price", "missing": 0}}}
```
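
The object shape shown above can be checked on the client before a request is sent. This is a minimal sketch, assuming each alias maps to exactly one operator. The function name `check_metric` and its error messages are this sketch's own and are not the server's.

```python
def check_metric(alias, definition):
    """Return (operator, params) if the definition looks like
    {"$op": {"field": "<name>", ...}}; raise ValueError otherwise.

    Hypothetical client-side check; messages are not the server's.
    """
    if not isinstance(definition, dict) or len(definition) != 1:
        raise ValueError(f"{alias}: expected exactly one operator")
    operator, params = next(iter(definition.items()))
    if not isinstance(params, dict):
        # Catches the bare-string form, e.g. {"$avg": "price"}.
        raise ValueError(
            f"{alias}: {operator} takes an object like {{'field': '...'}}, not {params!r}"
        )
    if not isinstance(params.get("field"), str):
        raise ValueError(f"{alias}: {operator} needs a string 'field' property")
    return operator, params


# Accepted: object form with an optional `missing` fallback.
check_metric("alias_name", {"$avg": {"field": "price", "missing": 0}})

# Rejected: bare string instead of an object.
try:
    check_metric("alias_name", {"$avg": "price"})
except ValueError as err:
    print(err)
```

The check cannot tell whether the field is FAST in the schema; that still has to be verified against the index definition.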

The `field` value must be a string pointing to a FAST field in your schema.
Do **not** pass a bare string — it must be an object with `{"field": "..."}`.
Contributor

Suggested change
Do **not** pass a bare string — it must be an object with `{"field": "..."}`.


### Behavior Notes

- Metric operators require a `field`.
- The field must be `FAST` in your schema.
- Metric operators require a `field` — you will get `missing required 'field' property` if it's omitted.
Contributor

Suggested change
- Metric operators require a `field` — you will get `missing required 'field' property` if it's omitted.
- Metric operators require a `field`.

- The field must be `FAST` in your schema — otherwise you will get an error: `operator '$avg' requires field '<field>' to be FAST`. See [FAST Fields](/redis/search/schema-definition#fast-fields).
Contributor

Suggested change
- The field must be `FAST` in your schema — otherwise you will get an error: `operator '$avg' requires field '<field>' to be FAST`. See [FAST Fields](/redis/search/schema-definition#fast-fields).
- The field must be `FAST` in your schema.

- Metric operators do not support nested `$aggs`.
- For many metric operators, `missing` lets you provide a fallback value for documents where the field does not exist.
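
To make the `missing` note concrete, the sketch below computes an average locally over a few made-up documents. It only illustrates the fallback idea described above and is not the server's implementation. In particular, it assumes that documents lacking the field are skipped when `missing` is omitted, which the notes above do not state.

```python
def local_avg(docs, field, missing=None):
    """Average `field` over docs, substituting `missing` where the field
    is absent. Assumption: without `missing`, such docs are skipped."""
    values = []
    for doc in docs:
        if field in doc:
            values.append(doc[field])
        elif missing is not None:
            values.append(missing)
    return sum(values) / len(values) if values else None


# Made-up sample data: the third document has no `price` field.
docs = [{"price": 10}, {"price": 20}, {"name": "no price"}]

print(local_avg(docs, "price"))             # 15.0 (two documents counted)
print(local_avg(docs, "price", missing=0))  # 10.0 (three documents counted)
```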
68 changes: 68 additions & 0 deletions redis/search/aggregations.mdx
@@ -47,6 +47,74 @@ Aggregation requests have two phases:

Contributor Author

INFO — Response field naming differs between SDK examples without explicit explanation

In aggregations.mdx, the TypeScript response example uses docCount (camelCase) while the Python example uses doc_count (snake_case) for the same bucket aggregation field. While this is likely intentional SDK-level normalization to match each language's conventions, no note explains that the wire format is transformed differently per SDK, which could confuse users working across languages.

Recommendation: Consider adding a brief note (e.g., in a <Note> block) mentioning that each SDK normalizes response keys to match language conventions (camelCase for TypeScript, snake_case for Python), and that the Redis CLI returns the raw wire format.

[architecture] confidence: 65%

Each aggregation is defined with an **alias** (the key you choose for the result) and an **operator** that specifies what to compute.

### Response Format

<Tabs>

<Tab title="TypeScript">
```ts
const result = await index.aggregate({
  aggregations: {
    avg_price: { $avg: { field: "price" } },
    by_category: { $terms: { field: "category", size: 5 } },
  },
});

// result is an object keyed by alias:
// {
// avg_price: 49.99,
Contributor

Is this output correct? The backend returns something like `{"avg_price":{"value":49.99}}`.

// by_category: [
// { key: "electronics", docCount: 42 },
// { key: "clothing", docCount: 31 },
// ]
// }
```
</Tab>

<Tab title="Python">
```python
result = index.aggregate(
    aggregations={
        "avg_price": {"$avg": {"field": "price"}},
        "by_category": {"$terms": {"field": "category", "size": 5}},
    },
)

# result is a dict keyed by alias:
# {
# "avg_price": 49.99,
Contributor

same as above

# "by_category": [
# {"key": "electronics", "doc_count": 42},
# {"key": "clothing", "doc_count": 31},
Contributor

Are we doing anything special in the Python SDK? If not, the server sends `docCount`, not `doc_count`.

# ]
# }
Contributor Author

🔵 LOW — Inconsistent response key casing between TypeScript and Python examples

In the aggregations response format, the TypeScript example uses docCount (camelCase) while the Python example uses doc_count (snake_case) for bucket results. If both SDKs return the same wire format, one of these is wrong. If they differ by design (SDK-specific casing), this should be called out explicitly to avoid confusion.

Recommendation: Add a brief note clarifying that each SDK normalizes response keys to its language's conventions (camelCase for TS, snake_case for Python), or verify the actual SDK behavior and correct whichever is wrong.

[dx] confidence: 72%

```
</Tab>

<Tab title="Redis CLI">
```bash
SEARCH.AGGREGATE products '{}' '{"avg_price": {"$avg": {"field": "price"}}, "by_category": {"$terms": {"field": "category", "size": 5}}}'

# Response (`redis-cli --json`) is an object keyed by alias:
# {"avg_price":{"value":49.99},"by_category":{"buckets":[{"key":"electronics","docCount":42},{"key":"clothing","docCount":31}]}}
```
</Tab>

</Tabs>
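
The raw response in the Redis CLI tab wraps each metric result in a `value` object and each bucket result in a `buckets` list. A minimal sketch of reading that shape in Python follows, using the sample response from the CLI tab verbatim. No SDK calls are involved, and whether an SDK unwraps these objects for you is not established here.

```python
import json

# Sample wire-format response copied from the Redis CLI tab above.
raw = (
    '{"avg_price":{"value":49.99},'
    '"by_category":{"buckets":['
    '{"key":"electronics","docCount":42},'
    '{"key":"clothing","docCount":31}]}}'
)
response = json.loads(raw)

# Metric results: unwrap the "value" object.
avg_price = response["avg_price"]["value"]

# Bucket results: turn the "buckets" list into a key -> count mapping.
counts = {b["key"]: b["docCount"] for b in response["by_category"]["buckets"]}

print(avg_price)  # 49.99
print(counts)     # {'electronics': 42, 'clothing': 31}
```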

<Note>
Each SDK normalizes response keys to match its language's conventions: camelCase for TypeScript
Contributor

I don't think this is true. We are not doing it for Python.

(e.g., `docCount`) and snake_case for Python (e.g., `doc_count`). In `redis-cli --json`, Redis CLI
returns a JSON object keyed by aggregation alias.
</Note>
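
If snake_case keys are wanted in Python regardless of what a given SDK returns, the conversion can be done on the client. The following is a sketch with hypothetical helper names (`to_snake_case`, `snake_keys`), independent of any SDK. It assumes the input is already-parsed JSON (dicts, lists, scalars) in the wire format shown in the Redis CLI tab.

```python
import re


def to_snake_case(name):
    """Convert a camelCase key such as "docCount" to "doc_count"."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_keys(value):
    """Recursively convert dict keys to snake_case; values are untouched."""
    if isinstance(value, dict):
        return {to_snake_case(k): snake_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_keys(v) for v in value]
    return value


wire = {
    "avg_price": {"value": 49.99},
    "by_category": {"buckets": [{"key": "electronics", "docCount": 42}]},
}
print(snake_keys(wire))
```

Note that this sketch rewrites every dictionary key, including aggregation aliases, so a camelCase alias chosen by the caller would be renamed as well. Bucket `key` values are data, not keys, and are left as they are.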

<Note>
All metric aggregation operators require the target field to be marked as `FAST` in your schema.
Contributor

This is already mentioned. I would just remove this warning.

If the field is not FAST, you will get an error like:
`Aggregation '<name>' operator '$avg' requires field '<field>' to be FAST`.
See [FAST Fields](/redis/search/schema-definition#fast-fields) for details.
</Note>

## Filtering

Use `filter` to restrict which documents participate in the aggregation.