N+1 on AssetSerializer.asset_tags in /api/assets/ list #162

@t-a-y-l-o-r

Summary

AssetSerializer.asset_tags is declared as a nested AssetTagSerializer(read_only=True, many=True) at blueflow/views/asset.py:153. On the list endpoint (GET /api/assets/) it walks the reverse FK from Asset to AssetTag once per asset in the page, producing an N+1 query pattern.

This is the sibling issue to the usage fan-out, which was just fixed in commit 4b9538c (perf(api): prefetch Asset.usage to fix N+1 on list endpoint).

Why it matters

HugeLimitOffsetPagination (blueflow/pagination.py:9) sets default_limit = 1_000_000. A single unpaginated list call against /api/assets/ therefore fires up to one extra query per asset for asset_tags. With even a few thousand assets in the database, that puts the endpoint into pathological-query territory.

The N+1 was confirmed empirically while wiring up test_asset_list_usage_does_not_n_plus_one: with usage prefetched, growing the page from 1 asset to 10 still added ~2 fan-out queries per extra asset, and asset_tags is one of them.

Where

  • Field declaration: blueflow/views/asset.py:153
  • Inline doc block: blueflow/views/asset.py above the queryset = ... line on AssetViewSet

Suggested fix

Two options to weigh:

  1. Prefetch. Extend AssetViewSet.queryset to include prefetch_related("asset_tags", "asset_tags__tag") (the second hop avoids a follow-on N+1 inside AssetTagSerializer if it dereferences the tag FK).
  2. Drop from list response. If consumers don't need tags on the list endpoint, remove asset_tags from AssetSerializer.Meta.computed_fields and expose them only on the detail endpoint or a dedicated /api/assets/{id}/tags/ route. Cheapest fix, but a wire-format change.

Pick (1) if the field is in active use by list-page consumers; pick (2) if not. The decision should be informed by checking consumers (Viper, frontend) for asset_tags usage on list responses.
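To make the query shape behind option 1 concrete, here is a framework-free sketch using the standard-library sqlite3 module rather than the Django ORM. The asset, tag, and asset_tag tables and every function name below are hypothetical stand-ins, not the blueflow schema. load_per_asset mimics what the nested serializer does today (one lookup per asset, plus one per tag), while load_batched mimics the roughly three-query shape that prefetch_related("asset_tags", "asset_tags__tag") is expected to produce.

```python
import sqlite3
from collections import defaultdict


def make_db(n_assets, tags_per_asset=2):
    """Create an in-memory DB with hypothetical asset, tag and asset_tag tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asset (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE asset_tag (
            id INTEGER PRIMARY KEY,
            asset_id INTEGER REFERENCES asset(id),
            tag_id INTEGER REFERENCES tag(id)
        );
        """
    )
    tag_ids = range(1, tags_per_asset + 1)
    asset_ids = range(1, n_assets + 1)
    conn.executemany("INSERT INTO tag VALUES (?, ?)", [(t, f"tag-{t}") for t in tag_ids])
    conn.executemany("INSERT INTO asset VALUES (?, ?)", [(a, f"asset-{a}") for a in asset_ids])
    conn.executemany(
        "INSERT INTO asset_tag (asset_id, tag_id) VALUES (?, ?)",
        [(a, t) for a in asset_ids for t in tag_ids],
    )
    conn.commit()
    return conn


def load_per_asset(conn, limit):
    """N+1 shape: one asset_tag query per asset, then one tag query per link row."""
    assets = conn.execute(
        "SELECT id, name FROM asset ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    result = []
    for asset_id, name in assets:
        links = conn.execute(
            "SELECT tag_id FROM asset_tag WHERE asset_id = ? ORDER BY id", (asset_id,)
        ).fetchall()
        labels = [
            conn.execute("SELECT label FROM tag WHERE id = ?", (tag_id,)).fetchone()[0]
            for (tag_id,) in links
        ]
        result.append((name, labels))
    return result


def load_batched(conn, limit):
    """Prefetch shape: one query per table, joined in Python afterwards."""
    assets = conn.execute(
        "SELECT id, name FROM asset ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    ids = [asset_id for asset_id, _ in assets]
    marks = ",".join("?" * len(ids))
    links = conn.execute(
        f"SELECT asset_id, tag_id FROM asset_tag WHERE asset_id IN ({marks}) ORDER BY id",
        ids,
    ).fetchall()
    tag_ids = sorted({tag_id for _, tag_id in links})
    tag_marks = ",".join("?" * len(tag_ids))
    labels = dict(
        conn.execute(f"SELECT id, label FROM tag WHERE id IN ({tag_marks})", tag_ids).fetchall()
    )
    by_asset = defaultdict(list)
    for asset_id, tag_id in links:
        by_asset[asset_id].append(labels[tag_id])
    return [(name, by_asset[asset_id]) for asset_id, name in assets]


def count_queries(conn, loader, limit):
    """Run loader and return (result, number of SQL statements it executed)."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        result = loader(conn, limit)
    finally:
        conn.set_trace_callback(None)
    return result, len(seen)
```

Calling count_queries(make_db(10), load_per_asset, 10) and count_queries(make_db(10), load_batched, 10) returns the same payload, but the first count grows with the page size while the second stays at the three SELECTs written in load_batched. Whether the real AssetTagSerializer needs the second hop depends on whether it dereferences the tag FK, as noted in option 1.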

Acceptance criteria

  • The list endpoint fires a constant number of queries against the asset_tags-backing table regardless of page size (or the field is no longer in the list response).
  • Add a sibling test in the style of blueflow/tests/test_asset_usage.py: capture queries, filter by table name, and assert the count stays flat across page sizes. Follow the scope-narrowing rationale documented on test_asset_list_usage_does_not_n_plus_one.
  • Update the inline doc block in blueflow/views/asset.py to remove asset_tags from the "currently causing per-asset fan-out" list.
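The "capture queries, filter by table name, assert flat" check from the criteria above can be sketched outside Django as follows. This is only an illustration of the assertion shape under stated assumptions: it uses sqlite3's trace callback in place of whatever capture mechanism the existing usage test relies on, and the schema, list_assets, queries_touching, and assert_flat are all hypothetical names invented for this example.

```python
import re
import sqlite3


def build(n_assets):
    """In-memory DB with hypothetical asset and asset_tag tables, one tag row per asset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asset (id INTEGER PRIMARY KEY);
        CREATE TABLE asset_tag (id INTEGER PRIMARY KEY, asset_id INTEGER, label TEXT);
        """
    )
    ids = range(1, n_assets + 1)
    conn.executemany("INSERT INTO asset VALUES (?)", [(i,) for i in ids])
    conn.executemany(
        "INSERT INTO asset_tag (asset_id, label) VALUES (?, ?)",
        [(i, f"tag-for-{i}") for i in ids],
    )
    conn.commit()
    return conn


def list_assets(conn, limit):
    """Stand-in for the list endpoint: tags are fetched in one batched query."""
    ids = [row[0] for row in conn.execute("SELECT id FROM asset ORDER BY id LIMIT ?", (limit,))]
    marks = ",".join("?" * len(ids))
    tags = conn.execute(
        f"SELECT asset_id, label FROM asset_tag WHERE asset_id IN ({marks})", ids
    ).fetchall()
    return ids, tags


def queries_touching(conn, table, fn, *args):
    """Run fn and return only the captured SQL statements that read from `table`."""
    captured = []
    conn.set_trace_callback(captured.append)
    try:
        fn(conn, *args)
    finally:
        conn.set_trace_callback(None)
    # \b keeps "asset" from matching "asset_tag", since "_" is a word character.
    pattern = re.compile(rf"\bfrom\s+{re.escape(table)}\b", re.IGNORECASE)
    return [sql for sql in captured if pattern.search(sql)]


def assert_flat(conn, table, page_sizes):
    """Assert the number of queries against `table` does not depend on page size."""
    counts = [len(queries_touching(conn, table, list_assets, size)) for size in page_sizes]
    assert len(set(counts)) == 1, f"query count against {table} varies: {counts}"
    return counts
```

Filtering by table name before counting keeps the assertion narrow: other per-asset fan-outs that are still unfixed would not make this particular check fail, which appears to be the point of the scope-narrowing rationale mentioned above.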

Reference

  • Worked example of the prefetch-plus-test pattern: commit 4b9538c.
