
Multivariable covariant parameter space priors #135

Open
kayhangultekin wants to merge 6 commits into dev from kg_dev

@kayhangultekin
Collaborator

@kayhangultekin kayhangultekin commented Mar 6, 2026

Pull Request: Multivariate Priors & Covariant GSMF Implementation

Description

Note: the code in this PR and the text here were created with the assistance of AI (Gemini 3 Flash and Gemini 3.1 Pro (Low)).

This PR introduces support for multivariate normal prior distributions and implements the Covariant Galaxy Stellar Mass Function (GSMF) using data from Leja+2020.

The core of this update involves extending our parameter space sampling logic to seamlessly handle a mix of univariate and multivariate distributions without breaking the underlying Latin Hypercube continuous sampling.
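
As a rough illustration of the slicing idea, here is a minimal sketch in which each distribution declares how many uniform input columns it consumes. The `ndim` attribute, the two toy classes, and `transform_samples` are hypothetical stand-ins for this example, not the holodeck API.

```python
import numpy as np


class ToyUniform:
    """Hypothetical 1D distribution: maps one uniform column to [lo, hi]."""
    ndim = 1

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __call__(self, uu):
        # uu has shape (nsamples, 1)
        return self.lo + (self.hi - self.lo) * uu


class ToyPair:
    """Hypothetical 2D distribution: consumes two uniform columns at once."""
    ndim = 2

    def __call__(self, uu):
        # uu has shape (nsamples, 2); a real class would apply a joint transform
        return np.column_stack([uu[:, 0], uu[:, 0] + uu[:, 1]])


def transform_samples(uniform, dists):
    """Hand each distribution the next `ndim` columns, in order."""
    expected = sum(d.ndim for d in dists)
    if uniform.shape[1] != expected:
        raise ValueError(f"expected {expected} columns, got {uniform.shape[1]}")
    out, col = [], 0
    for dist in dists:
        out.append(dist(uniform[:, col:col + dist.ndim]))
        col += dist.ndim
    return np.hstack(out)


rng = np.random.default_rng(0)
uniform = rng.random((5, 3))  # stand-in for Latin Hypercube output
params = transform_samples(uniform, [ToyUniform(-1.0, 1.0), ToyPair()])
print(params.shape)
```

The point of the sketch is that the number of uniform columns equals the total number of parameters, not the number of distribution objects, so the sampler itself does not need to know which columns are coupled.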

Todos

Notable points that this PR has either accomplished or will accomplish.

  • Add multivariate normal prior support
  • Add covariance matrix support to parameter spaces

Status

  • Ready to go, except that the covariance matrix values are based on approximations to the Leja+2020 data. These need to be updated with the real data when they become available, which can be done with the covariant-double-schechter.ipynb notebook.

Key Features & New Classes

  • PD_MVNormal (New Class): Added to lib_tools.py. This class implements a multivariate normal (Gaussian) parameter distribution. It takes a list of parameter names, a vector of means, and a covariance matrix. It checks that the covariance matrix is positive definite via a Cholesky decomposition, and uses that decomposition to map uniform samples to the multivariate normal space.
  • Covariant GSMF Parameter Spaces: Added hardcoded values for GSMF_COV_NAMES, GSMF_COV_MEANS, and GSMF_COV_MATRIX to param_spaces.py. (The values in GSMF_COV_MATRIX are approximations to the Leja+2020 data and need to be updated with the real data when they become available; see Status above.)
  • PS_Astro_Strong_Covariant_GSMF & PS_Astro_Strong_Covariant_All: New parameter spaces that utilize the PD_MVNormal class to sample the GSMF parameters covariantly.
  • PS_Test_Astro_Strong_Covariant_MMBulge: A new test parameter space demonstrating how to couple specific parameters (like mmb_mamp_log10 and mmb_plaw) using a covariance matrix while keeping others (like mmb_scatter_dex) independent.
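
For readers unfamiliar with the technique, the mapping from uniform samples to a multivariate normal via a Cholesky factor can be sketched as below. This is not the PD_MVNormal code itself: the function name `uniform_to_mvnormal` is made up for this example, and the means and covariance are illustrative numbers, not the Leja+2020 values.

```python
from statistics import NormalDist

import numpy as np


def uniform_to_mvnormal(uniform, means, cov):
    """Map uniform samples in (0, 1) to correlated normal samples.

    uniform : (nsamples, ndim) array, e.g. from a Latin Hypercube
    """
    means = np.asarray(means, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # np.linalg.cholesky raises LinAlgError if cov is not positive definite,
    # so the factorization doubles as a validity check.
    chol = np.linalg.cholesky(cov)  # lower-triangular L with cov = L @ L.T
    # Keep samples strictly inside (0, 1) so the inverse CDF is finite.
    uu = np.clip(uniform, 1e-12, 1.0 - 1e-12)
    # The inverse normal CDF turns each uniform column into a standard normal.
    zz = np.vectorize(NormalDist().inv_cdf)(uu)
    # Correlate the independent standard normals and shift to the means.
    return means + zz @ chol.T


rng = np.random.default_rng(1)
means = [8.5, 1.1]                      # illustrative values only
cov = [[0.04, 0.01], [0.01, 0.09]]      # illustrative values only
samples = uniform_to_mvnormal(rng.random((20000, 2)), means, cov)
print(np.cov(samples, rowvar=False))    # approaches cov as nsamples grows
```

Because each uniform column is transformed monotonically before the columns are mixed, the stratification of a Latin Hypercube carries over to the independent standard normals; the Cholesky factor then introduces the correlations.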

Architectural Changes & Helper Functions

  • Sample Transformation (_transform_samples): Completely rewrote the sample transformation logic in the _Param_Space base class. Previously, it assumed a 1:1 mapping between uniform samples and parameter distributions. It now dynamically tracks and slices the correct number of input dimensions required by each _Param_Dist object (whether 1D or ND).
  • Extrema Property Update: Updated the bounds evaluation (extrema property) to correctly handle 2D stacking of boundaries for multivariate distributions.
  • repair_covariance(m): Added a new helper function to holodeck/utils.py. This function finds the nearest positive semi-definite matrix by applying eigenvalue decomposition and clipping negative eigenvalues. This is critical for preventing Cholesky decomposition failures when dealing with manually estimated (or slightly numerically unstable) covariance matrices.
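
The eigenvalue-clipping repair described above can be sketched as follows. This is an illustration of the general technique, not the code in holodeck/utils.py: the name `repair_covariance_sketch` and the `floor` argument are assumptions made for this example.

```python
import numpy as np


def repair_covariance_sketch(m, floor=0.0):
    """Return a symmetric matrix with eigenvalues clipped to be >= floor."""
    m = np.asarray(m, dtype=float)
    sym = 0.5 * (m + m.T)               # enforce exact symmetry
    vals, vecs = np.linalg.eigh(sym)    # eigh is meant for symmetric matrices
    vals = np.clip(vals, floor, None)   # remove negative eigenvalues
    return (vecs * vals) @ vecs.T       # V diag(vals) V^T


# A hand-estimated "covariance" whose correlations are mutually inconsistent.
bad = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])
print(np.linalg.eigvalsh(bad).min())    # negative, so not positive semi-definite

fixed = repair_covariance_sketch(bad, floor=1e-6)
chol = np.linalg.cholesky(fixed)        # succeeds after the repair
```

Clipping at exactly zero yields a matrix that is only positive semi-definite, and `np.linalg.cholesky` requires positive definiteness, so the sketch uses a small positive floor; whether the PR's helper does the same is not stated here.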

Testing & Notebooks

  • New Notebook (covariant-double-schechter.ipynb): Added a dedicated notebook in notebooks/devs/sams/ to demonstrate the synthetic dataset generation from the multivariate normal distribution and test the covariance repairing logic.
  • Notebook Updates: Refreshed execution states and plots in double-schechter.ipynb, librarian.ipynb, and semi-analytic-models.ipynb to ensure compatibility with the updated parameter space logic.
  • Local Repo Organization: Added scratch/ directory to .gitignore to keep temporary testing/development scripts untracked.

Kayhan Gultekin added 6 commits December 15, 2025 15:32
…ses _Param_Space and _Param_Dist so that they can tell if they are using a multi-parameter distribution or if they are using a legacy-style parameter distribution. I created a single new multiparameter distribution, a multivariable Gaussian PD_MVNormal. You give it the means and a covariance matrix, and when doing Latin Hypercube Sampling, it will sample appropriately. I added a new test _Param_Space subclass PS_Test_Astro_Strong_Covariant_MMBulge. I modified librarian.ipynb to showcase the new parameter space and the joint distribution.
double Schechter function parameters based on
Leja+2020. Added a notebook to generate synthetic
data with the covariances estimated from the corner
plot in the paper. Also added a function to repair the
covariance matrix if it's not positive definite. This is
a work in progress and will need to be updated with actual
covariances from the paper or samples.
…t I will remove before committing upstream.
- Removed several test scripts from git index and moved into a new root 'scratch/' directory.
- Ignored 'scratch/' in .gitignore.
- Renamed 'new-covariant_double-schecter.ipynb' to 'covariant_double-schecter.ipynb' and tracked the latter.
@kayhangultekin kayhangultekin added the enhancement New feature or request label Mar 6, 2026
@kayhangultekin kayhangultekin added this to the NG20 Ready milestone Mar 6, 2026
@kayhangultekin
Collaborator Author

Totally forgot to say that this addresses issue #132

@CayenneMatt
Collaborator

The additions seem to be functioning properly, except for two minor compatibility issues that cause some unit tests to fail.

The test _check_ps_basic from holodeck/librarian/tests/test_lib_tools__param_space.py fails the check assert len(params) == nparams. This is because the test was not designed for the distribution classes to have multiple parameters. (line 282 of Run tests and generate coverage report)

The default method of _Param_Dist in holodeck/librarian/lib_tools.py returns self(0.5), which trips the if xx.ndim != 2: check on line 565. This occurs when running test_all_param_spaces in holodeck/librarian/tests/test_param_spaces.py. Details are on and around line 349 of Run tests and generate coverage report.

Finally, the backslashes on lines 552, 553, 566, and 574 of holodeck/librarian/lib_tools.py cause the associated error messages to print the strings literally without filling in the assigned variables.
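
The affected lines are not quoted in this thread, so the following is only one plausible way this symptom arises, shown with made-up variable names: when a backslash joins two string literals, only the literal carrying the f prefix is interpolated.

```python
ndim = 3

# Backslash continuation joins two literals, but only the first is an f-string,
# so the placeholder in the second literal is printed as-is.
broken = f"Input has {ndim} dimensions, " \
         "expected 2 (got ndim={ndim})."

# Implicit concatenation inside parentheses, with an f prefix on every piece
# that contains a placeholder, interpolates both.
fixed = (
    f"Input has {ndim} dimensions, "
    f"expected 2 (got ndim={ndim})."
)

print(broken)
print(fixed)
```

Parenthesized concatenation also avoids the trailing backslashes entirely, which makes it harder to drop a prefix by accident.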

@kayhangultekin
Collaborator Author

@CayenneMatt Good notes. I will update the tests so that they properly check to see if the distribution is multidimensional and check for the appropriate number of parameters. I'll also deal with the backslashes properly.
