perf(datasource-active-record): JOIN same-DB to-one relations and select only projected columns #323
… of preload
Resolve non-polymorphic to-one relation chains (belongs_to / has_one) that live
on the same database with a single LEFT OUTER JOIN (eager_load) instead of one
extra preload query per relation hop. Collapses the round-trip cascade on record
show and keeps a constant query count on lists.
Guards preserve existing behaviour wherever a JOIN would be unsafe or impossible:
- to-many relations keep preload (a JOIN multiplies rows and breaks pagination)
- polymorphic to-one relations keep preload (target table varies per row)
- targets on a different database connection keep preload (connects_to)
- targets carrying a default_scope keep preload (may inject unqualifiable SQL,
e.g. `where('id > ?', 10)`, that becomes ambiguous once joined)
- targets that cannot be resolved as an AR-backed collection belonging to this
very datasource instance keep preload (defensive against cross-datasource /
name-collision cases): checks concrete class + datasource object identity
Adds a spec proving a two-hop chain collapses to one JOINed query, query count
stays constant regardless of row count, and every guard falls back safely.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
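The guard list in this first commit can be read as a simple partition: a relation is JOINed only if every guard passes, otherwise it keeps preload. The sketch below is a plain-Ruby illustration under stated assumptions: the `Relation` struct, `joinable?`, `partition_relations` and the `relation` helper are hypothetical names, not the datasource's actual code, which inspects ActiveRecord reflections and collections.

```ruby
# Hypothetical, simplified view of a relation. The real code inspects ActiveRecord
# reflections and datasource collections rather than a plain struct.
Relation = Struct.new(:name, :kind, :polymorphic, :same_database, :default_scope, :local,
                      keyword_init: true)

# A relation is JOINed only when every guard from the commit message passes.
def joinable?(relation)
  %i[belongs_to has_one].include?(relation.kind) && # to-many keeps preload
    !relation.polymorphic &&                         # target table varies per row
    relation.same_database &&                        # cannot JOIN across connections
    !relation.default_scope &&                       # may inject unqualifiable SQL
    relation.local                                   # target belongs to this datasource
end

def partition_relations(relations)
  joined, preloaded = relations.partition { |relation| joinable?(relation) }
  { joined: joined.map(&:name), preloaded: preloaded.map(&:name) }
end

# Test helper: a relation that passes every guard unless overridden.
def relation(name, **overrides)
  defaults = { kind: :belongs_to, polymorphic: false, same_database: true,
               default_scope: false, local: true }
  Relation.new(name: name, **defaults.merge(overrides))
end

RELATIONS = [
  relation(:bank_account),
  relation(:comments, kind: :has_many),
  relation(:owner, polymorphic: true),
  relation(:audit, same_database: false),
  relation(:category, default_scope: true)
].freeze

RESULT = partition_relations(RELATIONS)
```

Only `:bank_account` ends up in the joined group here; each of the other relations trips exactly one guard and stays on the preload path.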
@query = @query.select(@select.join(', ')) if @select
@query = @query.includes(format_relation_projection(@projection)) unless @projection.nil?

@query
…ed to-one relations
Builds on the JOIN optimization: instead of eager_load (which forces `table.*` for every joined association), to-one relations are now resolved with left_outer_joins + an explicit SELECT of ONLY their projected columns, aliased on the flat row. The serializer rebuilds the nested hash from those aliases (and detects NULL relations via the target primary key) rather than reading the ActiveRecord association.
Effect: displaying a single field of a wide relation (e.g. a heavy DB view with ~150 columns) reads exactly that column plus the join keys, in one query, with no full-row read and no per-hop round-trip.
to-many / guarded relations keep their preload path and the existing association-based serialization untouched.
156 examples, 0 failures; rubocop clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
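To make the "rebuild the nested hash from aliases" step concrete, here is a minimal plain-Ruby sketch. The `pk_alias` / `columns` metadata keys mirror the serializer fragment quoted later in this conversation; the alias naming scheme (`supplier__name`), the row contents and the `hydrate_joined` helper are assumptions made for illustration only.

```ruby
# Hypothetical metadata recorded for one joined relation: the alias of the target
# primary key, plus a map from projected column name to its alias on the flat row.
SUPPLIER_META = { pk_alias: 'supplier__id', columns: { 'name' => 'supplier__name' } }.freeze

# Rebuilds the nested relation hash from the flat row. A NULL primary-key alias
# means the LEFT OUTER JOIN found no target row, so the relation is absent.
def hydrate_joined(row, meta)
  return nil if row[meta[:pk_alias]].nil?

  meta[:columns].each_with_object({}) do |(column, column_alias), hash|
    hash[column] = row[column_alias]
  end
end

WITH_SUPPLIER = { 'id' => 1, 'supplier_id' => 7,
                  'supplier__id' => 7, 'supplier__name' => 'Acme' }.freeze
WITHOUT_SUPPLIER = { 'id' => 2, 'supplier_id' => nil,
                     'supplier__id' => nil, 'supplier__name' => nil }.freeze

hydrate_joined(WITH_SUPPLIER, SUPPLIER_META)    # => { 'name' => 'Acme' }
hydrate_joined(WITHOUT_SUPPLIER, SUPPLIER_META) # => nil
```

Selecting the primary key even when it is not projected is what lets the serializer tell an absent relation apart from a present relation whose projected columns happen to be NULL.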
object.attributes.except(*join_aliases)
end

def serialize_associations(object, projection, hash, path)
hash_object(item, projection.relations[association_name], path: relation_path)
end
end
end
# joined to-one relation (recursively), and records the aliases in @joined_relations
# so the serializer can rebuild the nested hash from the flat row. The target primary
# key is always selected to let the serializer detect a NULL (absent) relation.
def collect_joined_selects(collection, relation_name, sub_projection, path)
…le only
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…acroscope review)
- Only belongs_to (ManyToOne) relations are JOINed now. has_one (OneToOne) does not guarantee a unique child row, so a JOIN could duplicate the parent and break list results / pagination; has_one falls back to preload.
- Never JOIN a table already present in the query (base or a sibling/nested join). ActiveRecord would alias a table joined twice, and collect_joined_selects references the plain table name; such relations fall back to preload instead.
joinable_tables replaces fully_joinable?: it returns the set of tables a subtree would add via JOIN (or nil), threading the used-tables set so collisions are detected across the whole query.
Specs cover: belongs_to collapses to one JOIN with only projected columns and constant query count; has_one, to-many, default_scope, already-used table, and non-local targets all fall back to preload.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
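The "set of tables or nil, threading the used-tables set" idea can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the actual `joinable_tables` method: the hash-based relation tree and the `tables_added_by_join` name are hypothetical, and it assumes a single non-joinable node makes the whole subtree fall back to preload.

```ruby
require 'set'

# Returns the Set of tables the subtree would add via JOIN, or nil when any node
# must fall back to preload. `used_tables` already holds the base table and every
# table joined so far, so a collision anywhere in the query is detected.
def tables_added_by_join(node, used_tables)
  return nil unless node[:kind] == :belongs_to     # has_one / to-many keep preload
  return nil if used_tables.include?(node[:table]) # a second JOIN would be aliased

  tables = Set[node[:table]]
  node.fetch(:relations, []).each do |child|
    nested = tables_added_by_join(child, used_tables | tables)
    return nil if nested.nil?

    tables |= nested
  end
  tables
end

BASE = Set['incomes']

CHAIN = { table: 'bank_accounts', kind: :belongs_to,
          relations: [{ table: 'organizations', kind: :belongs_to }] }.freeze
SELF_REFERENCE = { table: 'incomes', kind: :belongs_to }.freeze
HAS_ONE = { table: 'profiles', kind: :has_one }.freeze
NESTED_COLLISION = { table: 'bank_accounts', kind: :belongs_to,
                     relations: [{ table: 'incomes', kind: :belongs_to }] }.freeze

tables_added_by_join(CHAIN, BASE)            # both tables of the chain
tables_added_by_join(SELF_REFERENCE, BASE)   # nil: base table already in the query
tables_added_by_join(NESTED_COLLISION, BASE) # nil: collision found one hop down
```

Returning the added tables rather than a boolean lets the caller merge them into the used set before examining the next sibling relation.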
# primary key, so the JOIN cannot duplicate the parent row (a has_one child may not be
# unique). used_tables already covers the base + sibling joins: a table joined twice would
# be aliased by ActiveRecord, which collect_joined_selects cannot reference, so bail out.
def joinable_tables(collection, relation_name, sub_projection, used_tables)
…ed selects (Macroscope review)
target.model.primary_key returns an array for composite-key models; iterate Array(pk) so each key column gets its own aliased select instead of embedding the array as a single, invalid column reference. NULL detection uses the first key column's alias.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
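A small plain-Ruby sketch of why `Array(pk)` matters. The `pk_selects` helper and the `path__column` alias scheme are hypothetical, and identifier quoting is left out to keep the example short.

```ruby
# Builds one aliased select per primary-key column. Array() turns a single key
# name into a one-element array and leaves a composite key (already an array) as is.
def pk_selects(table, relation_path, primary_key)
  Array(primary_key).map do |column|
    "#{table}.#{column} AS #{relation_path}__#{column}"
  end
end

pk_selects('suppliers', 'supplier', 'id')
# => ["suppliers.id AS supplier__id"]

pk_selects('order_lines', 'line', %w[order_id position])
# => ["order_lines.order_id AS line__order_id", "order_lines.position AS line__position"]

# Without Array(), interpolating the composite key embeds the array's inspect
# output, which is not a valid column reference:
BROKEN = "order_lines.#{%w[order_id position]}"
# => "order_lines.[\"order_id\", \"position\"]"
```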
…ting, real pool check (Macroscope review)
- Do not add a second JOIN for a relation already joined by a filter/sort: resolve_field records the joined table in @filter_joined_tables, and apply_select seeds used_tables with it so joinable_tables rejects it (falls back to preload).
- Quote joined-select identifiers via the adapter (quote_table_name / quote_column_name) instead of ANSI double quotes, which are string literals on MySQL's default sql_mode.
- same_database? compares connection_pool instead of connection_specification_name, which is only the owner class name and can be shared across different databases/shards.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
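The quoting point can be illustrated without a database. In the sketch below the two lambdas are stand-ins for what an adapter's `quote_table_name` / `quote_column_name` produce on PostgreSQL and MySQL respectively; `QUOTERS` and `aliased_select` are hypothetical names, and the real code asks the connection's adapter rather than hard-coding either style.

```ruby
# Stand-ins for adapter-specific identifier quoting (assumption: the real code
# delegates to the adapter instead of hard-coding either style).
QUOTERS = {
  postgresql: ->(identifier) { %("#{identifier.gsub('"', '""')}") },
  mysql: ->(identifier) { "`#{identifier.gsub('`', '``')}`" }
}.freeze

def aliased_select(adapter, table, column, column_alias)
  quote = QUOTERS.fetch(adapter)
  "#{quote.call(table)}.#{quote.call(column)} AS #{quote.call(column_alias)}"
end

aliased_select(:postgresql, 'suppliers', 'name', 'supplier__name')
# => "\"suppliers\".\"name\" AS \"supplier__name\""

aliased_select(:mysql, 'suppliers', 'name', 'supplier__name')
# => "`suppliers`.`name` AS `supplier__name`"
```

Hard-coding the double-quoted form would make MySQL, in its default sql_mode, treat each quoted token as a string literal rather than an identifier, which is the failure the commit avoids.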
…connection & quoting (Macroscope review)
- Fall back to preload when the belongs_to itself carries a scope (e.g.
`belongs_to :x, -> { where('id > ?', 1) }`): the scope is applied to the JOIN and can
inject raw/unqualified SQL or extra joins. joinable_target now also checks the
association reflection's scope, not only the target model's default_scope.
- Obtain the connection via connection_pool.with_connection (connection is deprecated on
Rails 8) and quote joined-select identifiers through the adapter.
- Split guards into joinable_target and the relation partitioning into split_relations for
clarity; drop the always-true with_associations parameter.
161 examples, 0 failures; rubocop clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
tables |= nested
end
tables
return nil if object[meta[:pk_alias]].nil?

hash = {}
projection.columns.each { |column| hash[column] = object[meta[:columns][column]] }
Joined and preloaded to-one relations now return different column sets for the same projection.
A preloaded to-one is still hydrated from the full row (base_attributes -> object.attributes, the pre-PR behavior), whereas a JOINed one is built here from only projection.columns (+ the pk). So the shape of a nested to-one hash now depends on how the relation happened to be resolved.
Concretely: if a consumer reads a column off a to-one relation that isn't in the projection (relying on the old over-fetch), it's present when the relation falls back to preload (default_scope / duplicate table / cross-db) and nil when it JOINs.
The projected-subset behavior is arguably the more correct one — but the divergence is latent and adapter/schema-dependent. Worth either making the preload path match (serialize only projected columns there too) or confirming that every consumer only reads projected columns.
Good catch, aligned in 6115cfa. The preload path now serializes a related record to exactly its projected columns too (via projected_columns), matching the JOINed hydration — so a to-one's shape no longer depends on how it was resolved. The root record still keeps all its selected columns (own attributes + FKs); only related records are projection-restricted. Added specs asserting the same column set for a JOINed (account → supplier) and a preloaded (supplier → account) to-one. Thanks for the review!
…r projected columns (review: bexchauveto)
Previously a preloaded to-one was hydrated from its full row while a JOINed one was built from only the projected columns, so a related record's shape depended on how it was resolved (preload fallback vs JOIN). Now the preload path is restricted to the projected columns too, matching the JOINed hydration. The root record still exposes all its selected columns (own attributes + foreign keys); only related records are projection-restricted.
Adds specs asserting a to-one relation returns exactly the projected columns whether it is JOINed (account -> supplier) or preloaded (supplier -> account).
163 examples, 0 failures; rubocop clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
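The alignment described in this commit amounts to restricting a preloaded record's attributes to the projection, so both resolution paths yield the same shape. A minimal plain-Ruby sketch, where `serialize_preloaded`, `serialize_joined` and the sample rows are hypothetical:

```ruby
# Preload path: the related record was read in full, so keep only projected columns.
def serialize_preloaded(attributes, columns)
  attributes.slice(*columns)
end

# JOIN path: only projected columns were selected, each under an alias on the flat row.
def serialize_joined(row, column_aliases)
  column_aliases.each_with_object({}) do |(column, column_alias), hash|
    hash[column] = row[column_alias]
  end
end

PROJECTED = ['name'].freeze

# Full row as a preload would load it.
PRELOADED_SUPPLIER = { 'id' => 7, 'name' => 'Acme', 'vat_number' => 'FR123' }.freeze
# Flat row as the JOIN would return it.
JOINED_ROW = { 'id' => 1, 'supplier_id' => 7, 'supplier__name' => 'Acme' }.freeze

serialize_preloaded(PRELOADED_SUPPLIER, PROJECTED)          # => { 'name' => 'Acme' }
serialize_joined(JOINED_ROW, { 'name' => 'supplier__name' }) # => { 'name' => 'Acme' }
```

A consumer that previously read `vat_number` off the preloaded supplier without projecting it would now get no such key on either path, which is the behaviour change the review comment asked to make uniform.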
return unless same_database?(collection.model, target.model)
return if used_tables.include?(target.model.table_name) # a table joined twice would be aliased by AR

target
…roscope review)
join_aliases used Enumerable#to_set without requiring 'set', which raises NoMethodError on Ruby 3.0/3.1 for any serialized record. Use uniq (an Array) instead: the value is only splatted into except(*...) and checked with empty?, so a Set is not needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
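A plain-Ruby sketch of the fix: de-duplicating with `uniq` needs no `require 'set'`, and an Array works just as well for splatting and `empty?`. The `collect_aliases` and `root_attributes` helpers and the shape of `JOINED_RELATIONS` are assumptions for illustration; `Hash#except` needs Ruby 3.0 or later.

```ruby
# Hypothetical joined-relations map: relation path => alias metadata.
JOINED_RELATIONS = {
  'supplier' => { pk_alias: 'supplier__id',
                  columns: { 'id' => 'supplier__id', 'name' => 'supplier__name' } }
}.freeze

# Every alias added to the flat row, de-duplicated with uniq (no Set required).
def collect_aliases(joined_relations)
  joined_relations.values.flat_map { |meta| [meta[:pk_alias], *meta[:columns].values] }.uniq
end

# The root record's own attributes: the flat row minus the join aliases.
def root_attributes(row, joined_relations)
  aliases = collect_aliases(joined_relations)
  aliases.empty? ? row : row.except(*aliases)
end

ROW = { 'id' => 1, 'supplier_id' => 7, 'supplier__id' => 7, 'supplier__name' => 'Acme' }.freeze

collect_aliases(JOINED_RELATIONS)      # => ["supplier__id", "supplier__name"]
root_attributes(ROW, JOINED_RELATIONS) # => { 'id' => 1, 'supplier_id' => 7 }
root_attributes(ROW, {})               # => ROW unchanged
```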
Context
When rendering a record (or a list) through the ActiveRecord datasource, every relation in the projection is resolved with `.includes` → ActiveRecord preload (one extra query per relation hop). Displaying a `belongs_to` chain such as `income → bank_account → organization` therefore issues 3 sequential queries (a round-trip waterfall). And because relations are preloaded (or, with `eager_load`, joined), the related rows are always read in full (`SELECT relation.*`), which is costly when the relation is a wide table or a heavy DB view.
Change
`Utils::Query#apply_select` now splits the projection's relations:
- to-one relations (`belongs_to`/`has_one`, non-polymorphic) are resolved with a single `LEFT OUTER JOIN`, selecting only the projected columns of each joined table (aliased on the flat row), plus the target primary key for NULL detection. The serializer rebuilds the nested hash from those aliases instead of reading the ActiveRecord association.
Measured effect
`income → bank_account → organization`, displaying only `organization.name`:
Before: 3 sequential queries, each `SELECT relation.*`.
After: 1 query, reading only the projected column of the view.
On a list, the query count stays constant regardless of the number of rows.
Safety: nothing is JOINed unless it is provably safe
`fully_joinable?` walks the whole relation subtree and falls back to preload if any of these do not hold:
- target on a different database connection (`connects_to`) → preload (cannot JOIN across connections);
- `default_scope` on the target → preload (may inject unqualifiable raw SQL, e.g. `where('id > ?', 10)`, that becomes ambiguous once joined);
Tests
New spec `utils/join_to_one_optimization_spec.rb`:
- … `table.*`);
`query_spec.rb` updated to reflect that nested to-one relations are now JOINed.
Full `forest_admin_datasource_active_record` suite: 156 examples, 0 failures. RuboCop clean.
🤖 Generated with Claude Code
Note
JOIN same-database belongs_to relations and select only projected columns in ActiveRecord datasource queries
- `Utils::Query` now splits `belongs_to` relations into two groups: eligible ones (same DB, no scopes, no duplicate tables) are LEFT OUTER JOINed with aliased column selects; the rest are preloaded via `includes`.
- `Utils::ActiveRecordSerializer` accepts a `joined_relations` map and hydrates JOINed relations from aliased columns on the root row instead of traversing preloaded AR associations.
- … `@filter_joined_tables` to prevent duplicate conflicting JOINs when `apply_select` runs.
- … `belongs_to` relations now issue a single JOIN query with an explicit SELECT list, changing the SQL shape and column count of results.
Changes since #323 opened
- … `ForestAdminDatasourceActiveRecord::Utils::ActiveRecordSerializer.hash_object` to conditionally serialize related records with only projected columns instead of full attributes, and introduced `ForestAdminDatasourceActiveRecord::Utils::ActiveRecordSerializer.projected_columns` helper method [6115cfa]
- … `ForestAdminDatasourceActiveRecord::Utils::ActiveRecordSerializer` and `ForestAdminDatasourceActiveRecord::Utils::Query` [0a762a6]
- … `ForestAdminDatasourceActiveRecord::Utils::ActiveRecordSerializer.join_aliases` private method to use array deduplication instead of Set conversion [43e0e92]
Macroscope summarized ac04d57.