Fix multi-response support in PRESS stats, R-squared, and check_fit #16

Closed
bdphilli wants to merge 3 commits into nasa:main from bdphilli:fix/multi-response-stats

@bdphilli (Contributor) commented Apr 1, 2026

This PR fixes three bugs that appear when fitting a PCE with multiple response columns (e.g., a results file with more than one column):

  1. model.py get_press_stats(): np.delete(responses, idx) without axis flattens the 2D array. Added axis=0 to delete rows, matching the var_basis delete on the preceding line.

  2. statistics.py calc_R_sq(): scalar comparison 'if R_sq > 1' fails when R_sq is an array. Use np.atleast_1d, np.any, and np.clip for element-wise handling. Returns scalar for single response.

  3. pce.py check_fit(): press_stat[m] and R_sq[m] index into arrays whose shape depends on atleast_2d wrapping. Use .flat[m] to reliably extract the m-th scalar regardless of shape.
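
For reviewers, the first item can be reproduced in isolation. This is a minimal sketch with made-up data (the array values and the `idx` choice are illustrative and not taken from `model.py`); it only shows how `np.delete` behaves with and without `axis`:

```python
import numpy as np

# Hypothetical data: 4 samples (rows), 2 response columns.
responses = np.array([[1.0, 10.0],
                      [2.0, 20.0],
                      [3.0, 30.0],
                      [4.0, 40.0]])
idx = 1  # sample left out in one leave-one-out (PRESS) iteration

# Without axis, np.delete flattens the array and drops a single element.
flattened = np.delete(responses, idx)

# With axis=0, the whole row (one sample, all responses) is dropped.
rows_kept = np.delete(responses, idx, axis=0)

print(flattened.shape)  # (7,)
print(rows_kept.shape)  # (3, 2)
```

With `axis=0` the result keeps one column per response, so it stays aligned with a basis matrix that had the same row removed.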

Single-response behavior is unchanged.
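
Items 2 and 3 can be sketched the same way. The helpers below (`cap_r_sq`, `mth_value`) are hypothetical names for illustration, not the functions in `statistics.py` or `pce.py`, and the sketch assumes the intent of the `R_sq > 1` check is to cap values at 1:

```python
import numpy as np

def cap_r_sq(r_sq):
    """Cap R^2 at 1 element-wise; return a plain float for a single response.

    Hypothetical helper, assuming values above 1 should be clipped to 1.
    """
    r_sq = np.atleast_1d(np.asarray(r_sq, dtype=float))
    if np.any(r_sq > 1):  # works for arrays, unlike a bare `if r_sq > 1`
        r_sq = np.clip(r_sq, None, 1.0)
    return r_sq.item() if r_sq.size == 1 else r_sq

def mth_value(stat, m):
    """Return the m-th entry of stat whether it has shape (k,) or (1, k)."""
    return np.asarray(stat).flat[m]

single = cap_r_sq(1.02)          # one response: a float comes back
multi = cap_r_sq([0.95, 1.02])   # two responses: an array comes back

press_2d = np.atleast_2d([0.5, 0.7])  # shape (1, 2); press_2d[1] would raise IndexError
second_from_2d = mth_value(press_2d, 1)
second_from_1d = mth_value(press_2d[0], 1)
```

Using `.flat[m]` avoids branching on whether the statistic was wrapped by `np.atleast_2d`.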

@bdphilli bdphilli force-pushed the fix/multi-response-stats branch from a0044b5 to cf1d22f Compare April 1, 2026 13:56
bdphilli and others added 2 commits April 1, 2026 11:37
calc_error_sum_of_sq and calc_total_sum_of_sq used matrix algebra
forms (y'y - β'X'y and y'y - (Σy)²/n) that produce (m×m) matrices
for m responses. R² was then computed from the full matrix instead
of just the diagonal, giving incorrect values (always ~1.0).

Additionally, calc_total_sum_of_sq had np.sum(responses)**2 which
sums ALL elements across responses into one scalar instead of summing
each column independently.
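
The two problems described above can be seen on a small array. This sketch uses made-up data and does not reproduce the library code; it only contrasts the (m×m) matrix form of the total sum of squares with its diagonal, and the pooled sum with the per-column sum:

```python
import numpy as np

# Hypothetical data: n = 4 samples, m = 2 responses.
responses = np.array([[1.0, 10.0],
                      [2.0, 20.0],
                      [3.0, 30.0],
                      [4.0, 40.0]])
n = responses.shape[0]

# Matrix form y'y - (sum y)(sum y)'/n gives an (m x m) matrix.
col_sums = responses.sum(axis=0)
sst_matrix = responses.T @ responses - np.outer(col_sums, col_sums) / n

# Only the diagonal holds the per-response total sums of squares.
sst_per_response = np.diag(sst_matrix)

# The pooled form sums every element into a single scalar ...
pooled_sq = np.sum(responses) ** 2
# ... whereas each response needs its own column sum.
per_column_sq = np.sum(responses, axis=0) ** 2

print(sst_matrix.shape)   # (2, 2)
print(sst_per_response)   # [  5. 500.]
print(pooled_sq)          # 12100.0
print(per_column_sq)      # [  100. 10000.]
```

The off-diagonal entries of `sst_matrix` are cross-products between responses, which is why feeding the full matrix into an R² formula mixes the responses together.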

Replaced both with direct residual-based computation:
  SSE = Σ(y - ŷ)² per column
  SST = Σ(y - ȳ)² per column

Returns 1D arrays for multi-response, scalars for single-response.
Equivalent to the matrix form for single response; correct for both.
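
A minimal sketch of the residual-based computation follows. The function name `sums_of_squares`, the synthetic data, and the ordinary least-squares fit via `np.linalg.lstsq` are all assumptions for illustration; they stand in for the library's own basis matrix and coefficients:

```python
import numpy as np

def sums_of_squares(responses, predictions):
    """Per-column SSE and SST; 0-d scalars for 1D input, 1D arrays for 2D input."""
    y = np.asarray(responses, dtype=float)
    y_hat = np.asarray(predictions, dtype=float)
    sse = np.sum((y - y_hat) ** 2, axis=0)
    sst = np.sum((y - y.mean(axis=0)) ** 2, axis=0)
    return sse, sst

# Hypothetical problem: 8 samples, linear basis, 2 responses.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)
X = np.column_stack([np.ones_like(x), x])
Y = np.column_stack([1.0 + 2.0 * x, 3.0 - x]) + 0.05 * rng.normal(size=(8, 2))

beta, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (2, 2): one column per response
sse, sst = sums_of_squares(Y, X @ beta)
r_sq = 1.0 - sse / sst                         # one R^2 per response

# For a least-squares fit, the diagonal of the matrix form matches SSE.
sse_matrix_diag = np.diag(Y.T @ Y - beta.T @ X.T @ Y)

# Single response: the same function returns 0-d values.
sse_1, sst_1 = sums_of_squares(Y[:, 0], X @ beta[:, 0])
```

Summing over `axis=0` is what lets one code path serve both the 1D and 2D cases.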

Restored the prior `calc_total_sum_of_sq` and `calc_error_sum_of_sq` functions and extended them to support 2D coefficient and response matrices for problems with multiple responses.
@jnschmid (Contributor) commented

Updated to support statistics output with multiple responses

@jnschmid jnschmid closed this Apr 20, 2026
@jnschmid jnschmid reopened this Apr 20, 2026
@jnschmid jnschmid closed this Jun 25, 2026