
@andr-sokolov commented on Jan 11, 2026:

The default value of join_collapse_limit was 20. With this value, when a query
contains about 20 joins (see the added test), the Postgres query optimizer
cannot build a plan for hours and consumes a lot of memory, because the
planner evaluates an enormous number of possible join orders.
With join_collapse_limit set to 13, the query plan is built in reasonable time.
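As a quick orientation for readers, a minimal session-level sketch of the GUCs involved (the `set optimizer to off` setting, which routes the query through the Postgres planner rather than ORCA, appears in the repro later in this thread):

```sql
SET optimizer TO off;           -- use the Postgres planner, the one affected here
SET join_collapse_limit = 13;   -- the default this PR proposes
SHOW join_collapse_limit;
```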

@andr-sokolov commented:

I suggest join_collapse_limit=8 to reduce the difference from PostgreSQL 14, but you can consider slightly larger values. The table below shows the time and memory consumed to process the query from the test.

| join_collapse_limit | Query processing time, ms | Backend memory usage |
|---------------------|---------------------------|----------------------|
| 8  | 96      | 52 MB  |
| 9  | 230     | 88 MB  |
| 10 | 653     | 210 MB |
| 11 | 1269    | 394 MB |
| 12 | 2994    | 889 MB |
| 13 | 9003    | 2.4 GB |
| 14 | 27245   | 7.3 GB |
| 15 | 80904   | 22 GB  |
| 16 | 670578  | 66 GB  |
| 17 | 4158314 | 200 GB |

Query processing time was obtained with `\timing on` in psql. Backend memory usage was measured at the end of the `planner` function.
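A hedged way to reproduce the timing column without patching the server (assuming the `tj*` tables from the test shown later in this thread): plain EXPLAIN plans the query without executing it, and SUMMARY ON prints the planner's own "Planning Time" alongside psql's `\timing` output.

```sql
\timing on
SET optimizer TO off;
SET join_collapse_limit = 13;
-- EXPLAIN plans but does not execute; SUMMARY ON reports "Planning Time".
EXPLAIN (SUMMARY ON)
SELECT *
FROM tj1
  JOIN tj2 ON tj1.id = tj2.id
  JOIN tj3 ON tj2.id = tj3.id;  -- extend the chain to tj20 as in the test
```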

@leborchuk commented:

> I suggest join_collapse_limit=8 to reduce the difference from PostgreSQL 14, but you can consider slightly larger values. […]

Thank you for such interesting research!

I would suggest setting join_collapse_limit to 13 or 14, at your choice.

I agree, 20 is too much, but 8 is also too small. The point of the setting is to trade planning time and free memory (on the master) for execution time. However, the master has its own limitations: join_collapse_limit should be set to a value that prevents a single user from exhausting its resources. Typically, even test/development installations have 8 GB of free memory. I'm not sure about 22 GB; it seems quite large. I have seen many dev/test instances with 10 GB to 16 GB of free memory.

@andr-sokolov commented:

> I would suggest setting join_collapse_limit to 13 or 14, at your choice. I agree, 20 is too much, but 8 is also too small. […]

I have replaced 8 with 13. 14 is too high, because in that case I get `ERROR: Canceling query because of high VMEM usage. Used: 7376MB, available 816MB, red zone: 7372MB (runaway_cleaner.c:189)` on the demo cluster with the default gp_vmem_protect_limit value.
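For readers who hit the same error: the numbers are consistent with the stock demo-cluster configuration, where gp_vmem_protect_limit defaults to 8192 MB and the runaway cleaner fires at roughly 90% of it (the 7372 MB red zone above). A quick way to check:

```sql
-- Inspect the per-segment VMEM ceiling behind the error above.
SHOW gp_vmem_protect_limit;  -- 8192 (MB) on a default demo cluster
```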

@andr-sokolov changed the title from "Set join_collapse_limit default value to the same as used in PostgreSQL 14" to "Set join_collapse_limit default value to 13" on Jan 13, 2026
@Smyatkin-Maxim left a comment:

+1 for anything between 12 and 13

@avamingli commented:

> I suggest join_collapse_limit=8 to reduce the difference from PostgreSQL 14, but you can consider slightly larger values. […]

I'm surprised that the planner consumes so much memory; typically it uses very little, so your test results are unexpected. Could you share the complete test setup so others can reproduce it?

Another point is that this parameter is intended to let the planner generate better query plans; reducing it could lead to worse plans. Memory usage during the planning phase is usually negligible compared to execution memory.
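To make that risk concrete, here is a minimal sketch (hypothetical tables big_a, big_b, tiny_c): per the PostgreSQL documentation, setting join_collapse_limit to 1 prevents any reordering of explicit JOINs, so a selective table written last stays last in the join order.

```sql
SET join_collapse_limit = 1;  -- 1 disables reordering of explicit JOINs
EXPLAIN
SELECT *
FROM big_a                               -- hypothetical large table
  JOIN big_b  ON big_a.id = big_b.id     -- hypothetical large table
  JOIN tiny_c ON big_b.id = tiny_c.id;   -- selective table pinned to the end
```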

We need to verify against standard benchmarks whether lowering this parameter causes any performance regression; typically we test with TPC-DS at 1 TB scale.

@andr-sokolov commented:

> Could you share the complete test setup so others can reproduce it?

I ran the queries on the standard demo cluster.

```sql
do $$
begin
  for i in 1..20 loop
    execute 'create table tj' || i || '(id int)';
  end loop;
end
$$;

set optimizer to off;

select *
from tj1
  join tj2 on tj1.id = tj2.id
  join tj3 on tj2.id = tj3.id
  join tj4 on tj3.id = tj4.id
  join tj5 on tj4.id = tj5.id
  join tj6 on tj5.id = tj6.id
  join tj7 on tj6.id = tj7.id
  join tj8 on tj7.id = tj8.id
  join tj9 on tj8.id = tj9.id
  join tj10 on tj9.id = tj10.id
  join tj11 on tj10.id = tj11.id
  join tj12 on tj11.id = tj12.id
  join tj13 on tj12.id = tj13.id
  join tj14 on tj13.id = tj14.id
  join tj15 on tj14.id = tj15.id
  join tj16 on tj15.id = tj16.id
  join tj17 on tj16.id = tj17.id
  join tj18 on tj17.id = tj18.id
  join tj19 on tj18.id = tj19.id
  join tj20 on tj19.id = tj20.id;
```

OS: Ubuntu 22.04.5
RAM: 32 GB
Processor: Core i7 13700H
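As a side note, the 20-way join can be generated instead of hand-written; a convenience sketch (not from the PR) using psql's `\gexec`, prefixed with EXPLAIN so it only plans:

```sql
SELECT 'explain select * from tj1 '
       || string_agg(format('join tj%1$s on tj%2$s.id = tj%1$s.id', i, i - 1),
                     ' ' ORDER BY i)
FROM generate_series(2, 20) AS i
\gexec
```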

@avamingli commented:

> I ran the queries on the standard demo cluster. […]

Thank you for sharing the test case; I was able to reproduce the issue. Indeed, the planner does become very slow in this scenario. The reason is that with 20 tables joined purely via inner joins and no reordering restrictions (such as LEFT JOIN), the join order can be rearranged arbitrarily. In GPDB and our Cloudberry, join ordering relies on dynamic programming for the search, and with 20 tables the number of possible join orders makes that search extremely costly.
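For scale, a back-of-the-envelope bound from the join-ordering literature (my own numbers, not measurements from this thread): with unrestricted inner joins the search space is all bushy join trees, and even subset-based dynamic programming visits on the order of $3^n$ subproblem pairs in the worst case.

$$
\text{bushy join trees over } n \text{ relations} \;=\; \frac{(2n-2)!}{(n-1)!},
\qquad
3^{20} \approx 3.5 \times 10^{9}.
$$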

This highlights a fundamental limitation: unlike upstream PostgreSQL, which can fall back to genetic-algorithm-based join search (GEQO) when the number of joins exceeds a threshold, we currently do not inherit that mechanism. I recall this topic was previously discussed with @reshke.

I appreciate you raising this. Overall, I’m supportive of exploring adjustments here.

The current default of 20 was inherited from Greenplum and has remained unchanged for years. To my knowledge, no customer has reported this issue before, and since your PR shows no plan differences, the impact may be limited in typical workloads. However, this parameter is quite sensitive. While your example, which uses only inner joins, justifies lowering it to around 12 or 13, we must be cautious: a lower limit could prevent the planner from finding optimal join orders in real-world complex queries. In queries like those in TPC-DS, execution time often far outweighs planning time.

Therefore, before moving forward, I recommend we validate this change with a TPC-DS 1 TB benchmark (using AOCO tables). If no plan changes are observed, I have no objection. If we can identify a better-performing setting through such testing, that would be even more valuable.

@avamingli commented:

> This highlights a fundamental limitation: unlike upstream PostgreSQL, which can fall back to genetic-algorithm-based join search (GEQO) when the number of joins exceeds a threshold, we currently do not inherit that mechanism. I recall this topic was previously discussed with @reshke.

here: #657

@avamingli commented:

> Therefore, before moving forward, I recommend we validate this change with a TPC-DS 1 TB benchmark (using AOCO tables). […]

Based on my manual review of all 99 TPC-DS queries, I can confirm that Query 72 contains the highest number of explicit JOINs: 11 tables with 10 JOIN operations, which is below the new value in your PR. Therefore, this change should have no impact on the planning of explicit joins.

Regarding queries with a higher number of table references, such as Query 64, which lists over 18 tables in its FROM clause: these are written as a FROM list and are governed by the from_collapse_limit GUC, so your change will not affect them either (see the sketch below).
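To illustrate the distinction with a minimal sketch (hypothetical tables t1..t3; the behavior is as documented for both GUCs): explicit JOIN syntax is subject to join_collapse_limit, while a comma-separated FROM list, like Query 64's, is flattened under from_collapse_limit.

```sql
-- Reordering governed by join_collapse_limit (explicit JOIN syntax):
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON t2.id = t3.id;

-- Flattening governed by from_collapse_limit (FROM-list syntax, as in Query 64):
SELECT * FROM t1, t2, t3 WHERE t1.id = t2.id AND t2.id = t3.id;
```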

This change looks good to me, thanks.

@avamingli force-pushed the fix_join_collapse_limit branch from cf638b4 to 6f095e0 on January 22, 2026 05:26
Set join_collapse_limit default value to the same as used in PostgreSQL 14

The default value of join_collapse_limit was 20. With this value, when a query
contains about 20 joins (see the added test), the Postgres query optimizer
cannot build a plan for hours and consumes a lot of memory, because the
planner evaluates an enormous number of possible join orders.
With join_collapse_limit set to 8, the query plan is built in reasonable time.
@avamingli force-pushed the fix_join_collapse_limit branch from 7215bbb to 32061ee on January 22, 2026 12:39
@avamingli merged commit b245e4d into apache:main on Jan 22, 2026
40 checks passed
