Fix multi-response support in PRESS stats, R-squared, and check_fit #16
Closed
bdphilli wants to merge 3 commits into
Three bugs when fitting PCE with multiple response columns (e.g., results file with >1 column):

1. model.py get_press_stats(): np.delete(responses, idx) without axis flattens the 2D array. Added axis=0 to delete rows, matching the var_basis delete on the preceding line.
2. statistics.py calc_R_sq(): scalar comparison 'if R_sq > 1' fails when R_sq is an array. Use np.atleast_1d, np.any, and np.clip for element-wise handling. Returns scalar for single response.
3. pce.py check_fit(): press_stat[m] and R_sq[m] index into arrays whose shape depends on atleast_2d wrapping. Use .flat[m] to reliably extract the m-th scalar regardless of shape.

Single-response behavior is unchanged.
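For reviewers who want to reproduce the three failure modes outside the package, here is a minimal NumPy sketch. It does not use the project's code: the array contents, the helper name `clip_r_sq`, and the choice of 1.0 as the upper bound are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical data: 5 samples (rows), 2 response columns.
responses = np.arange(10.0).reshape(5, 2)
idx = 1  # sample left out in one leave-one-out step

# Bug 1: without axis, np.delete flattens the array and drops one element.
flattened = np.delete(responses, idx)
rows_kept = np.delete(responses, idx, axis=0)  # drops the whole row instead
print(flattened.shape, rows_kept.shape)


# Bug 2: `if R_sq > 1` is ambiguous for arrays with more than one element.
def clip_r_sq(r_sq, upper=1.0):
    """Element-wise cap on R-squared (hypothetical helper, assumed bound)."""
    arr = np.atleast_1d(np.asarray(r_sq, dtype=float))
    if np.any(arr > upper):
        arr = np.clip(arr, None, upper)
    # Keep the scalar return type when there is a single response.
    return float(arr[0]) if arr.size == 1 else arr


try:
    if np.array([0.9, 1.2]) > 1:
        pass
except ValueError as err:
    print("scalar comparison fails:", err)

# Bug 3: indexing depends on whether the array was wrapped by atleast_2d.
press_1d = np.array([0.1, 0.2])
press_2d = np.atleast_2d(press_1d)  # shape (1, 2); press_2d[1] would raise IndexError
m = 1
print(press_1d.flat[m], press_2d.flat[m])  # same element in both layouts
```

Note that `.flat[m]` walks the array in row-major order, so it picks the m-th response only when the array holds exactly one value per response.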
a0044b5 to cf1d22f
calc_error_sum_of_sq and calc_total_sum_of_sq used matrix algebra forms (y'y - β'X'y and y'y - (Σy)²/n) that produce (m×m) matrices for m responses. R² was then computed from the full matrix instead of just the diagonal, giving incorrect values (always ~1.0).

Additionally, calc_total_sum_of_sq had np.sum(responses)**2, which sums ALL elements across responses into one scalar instead of summing each column independently.

Replaced both with direct residual-based computation:

    SSE = Σ(y - ŷ)² per column
    SST = Σ(y - ȳ)² per column

Returns 1D arrays for multi-response, scalars for single-response. Equivalent to the matrix form for single response; correct for both.
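The residual-based computation described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `sums_of_squares`, the toy design matrix, and the response values are all made up for the example.

```python
import numpy as np


def sums_of_squares(responses, predictions):
    """Per-column SSE and SST (hypothetical helper, not the project's API)."""
    y = np.asarray(responses, dtype=float)
    y_hat = np.asarray(predictions, dtype=float)
    sse = np.sum((y - y_hat) ** 2, axis=0)           # sum of (y - y_hat)^2 per column
    sst = np.sum((y - y.mean(axis=0)) ** 2, axis=0)  # sum of (y - y_bar)^2 per column
    return sse, sst


# Toy problem: intercept plus one variable, two response columns.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.column_stack([
    [1.1, 2.9, 5.2, 6.8, 9.0],
    [0.2, -1.1, -1.9, -3.2, -3.9],
])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

sse, sst = sums_of_squares(Y, X @ beta)
r_sq = 1.0 - sse / sst
print(sse.shape, r_sq.shape)  # one value per response column

# The pooled sum mixes both responses into a single number,
# while the per-column sum keeps them separate.
pooled = np.sum(Y) ** 2
per_column = np.sum(Y, axis=0) ** 2
print(np.ndim(pooled), per_column.shape)
```

With 1D inputs the same function reduces over the only axis and returns NumPy scalars, which matches the single-response behavior the commit message describes.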
Restored prior "calc_total_sum_of_sq" and "calc_error_sum_of_sq" functions, adding capability to support 2D matrix coefficients and responses for problems with multiple responses.
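One way the matrix-algebra forms can be extended to 2D coefficients and responses is to compute only the diagonal of each (m×m) product, column by column. The sketch below is an assumption about how that could look, not the restored functions themselves; `sse_matrix_form` and `sst_matrix_form` are hypothetical names, and the SSE identity relies on `beta` being the least-squares solution.

```python
import numpy as np


def _as_columns(a):
    """Promote a 1D array to a single-column 2D array."""
    a = np.asarray(a, dtype=float)
    return a[:, None] if a.ndim == 1 else a


def sse_matrix_form(X, Y, beta):
    """diag(Y'Y - beta'X'Y) without building the m x m matrix."""
    Y, beta = _as_columns(Y), _as_columns(beta)
    return np.sum(Y * Y, axis=0) - np.sum(beta * (X.T @ Y), axis=0)


def sst_matrix_form(Y):
    """diag(Y'Y) - (column sums)^2 / n, one value per response."""
    Y = _as_columns(Y)
    n = Y.shape[0]
    return np.sum(Y * Y, axis=0) - np.sum(Y, axis=0) ** 2 / n


# Toy check against the residual-based definitions (made-up data).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.column_stack([
    [1.1, 2.9, 5.2, 6.8, 9.0],
    [0.2, -1.1, -1.9, -3.2, -3.9],
])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

sse = sse_matrix_form(X, Y, beta)
sst = sst_matrix_form(Y)
print(np.allclose(sse, np.sum((Y - X @ beta) ** 2, axis=0)))
print(np.allclose(sst, np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)))
```

Both helpers always return a 1D array of length m in this sketch; a caller that needs a scalar for the single-response case would have to unwrap the one-element result.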
Updated to support statistics output with multiple responses.