A Novel Bayesian Framework Uncovering Brain Connectivity-to-Shape Relationship in Preclinical Alzheimer’s Disease
This repository contains the implementation of COSR, a Bayesian modeling framework designed to relate brain connectivity patterns (matrix-valued outcome) to spatially varying structural (shape) features. Implementations include full Bayesian MCMC (two variants) and a Variational Bayes alternative for scalable approximate inference.
Although motivated by applications to preclinical Alzheimer’s disease (AD), the methods are broadly applicable to other multimodal neuroimaging problems where connectivity-to-shape (or connectivity-to-image) relationships are of interest.
Top-level files and their purpose:
- COSR_wrapper.m: unified entry point that validates inputs and dispatches to the selected inference routine.
- COSR_MCMC_BF.m: MCMC sampler using a Bayesian factor model for correlated residuals.
- COSR_MCMC_IND.m: MCMC sampler assuming independent residuals.
- COSR_VB_IND.m: Variational Bayes / coordinate ascent implementation (scalable, approximate).
- trandn.m: truncated normal sampler used by the MCMC routines.
- example_simu/: small scripts and helper functions for generating toy data and running examples.
- Clone the repository:

      git clone https://github.com/Naomi-Ding/COSR.git
      cd COSR

- Open MATLAB and add the COSR folder to your MATLAB path:

      addpath('path_to_COSR');

- Run the example simulation to generate toy data and run a small experiment:

      example_simu/COSR_example_simu.m

- Explore the results (tables and coefficient maps) in the example_simu/ directory. The example is intentionally small so that it runs quickly as a smoke test. After the run, inspect the workspace variables and the result structures, which hold the estimates and diagnostics.
- MATLAB (tested with R2019b – R2024a).
- MATLAB Statistics and Machine Learning Toolbox (linear algebra routines are part of base MATLAB).
- No external packages are required for the example scripts included.
The main entry point is COSR_wrapper.m, which dispatches to one of the three COSR implementations based on the specified error_model. It accepts either positional arguments or a params struct (recommended for reproducibility). The wrapper handles input validation, default parameter settings, random initialization, and forwards to the chosen implementation.
COSR supports two common calling patterns. Provide either:
- Spatial coordinates Sv, and let the wrapper compute the basis functions automatically:

      COSR_wrapper(error_model, A, M, Sv, x, params)

- Precomputed basis functions, passed in a params struct (in this case Sv may be []):

      params.Psi = Psi; params.Lambda_sqrt = Lambda_sqrt; % precomputed
      COSR_wrapper(error_model, A, M, [], x, params)

where the other core inputs are:
- error_model: string, one of {'MCMC_BF','MCMC_IND','VB_IND'}, selecting the inference method.
- A: (V, V, n) symmetric connectivity matrices for n subjects (zeros on the diagonal).
- M: (n, S) matrix of shape measurements evaluated at S spatial locations (each row is a subject).
- Sv: (S, d) spatial coordinates of the S locations where shape is measured, e.g., mesh vertices (d=3) or 2D contours (d=2). Pass [] if providing Psi and Lambda_sqrt in params.
  - If you pass Sv, the wrapper will compute an RBF kernel and extract its eigenvectors/eigenvalues to form Psi and Lambda_sqrt (unless you override them in params).
  - If you prefer to compute a custom basis externally (e.g., Laplace-Beltrami eigenfunctions), pass it via params.Psi and params.Lambda_sqrt and set Sv = [].
- x: (n, p) optional subject-level covariates (use [] when none).
- params: optional struct to control MCMC/VB settings, priors, and diagnostic output. The wrapper auto-fills sensible defaults when fields are omitted.

Note: V is the number of nodes in the connectivity matrix, S is the number of spatial locations (e.g., mesh vertices), and p is the number of covariates (including the intercept if desired).
See the header of COSR_wrapper.m for more details on argument shapes and optional parameters.
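The automatic basis construction is described above only in outline. The following is a minimal sketch of one way to form Psi and Lambda_sqrt from Sv, assuming a squared-exponential (RBF) kernel. The bandwidth `rho`, the truncation level `L`, and the choice to return Lambda_sqrt as a column vector are assumptions made for illustration; the kernel parameterization and defaults inside COSR_wrapper.m may differ, so check its header before relying on this.

```matlab
% Sketch: RBF-kernel eigenbasis from spatial coordinates (assumed settings).
s = 6;                                   % toy 2D grid, S = s*s locations
[g1, g2] = meshgrid(1:s, 1:s);
Sv = [g1(:), g2(:)];                     % (S, d) coordinates, d = 2
S = size(Sv, 1);

rho = 2;                                 % hypothetical kernel bandwidth
L = 10;                                  % hypothetical number of basis functions

% Pairwise squared Euclidean distances, no toolbox required
sq = sum(Sv.^2, 2);
D2 = max(sq + sq' - 2 * (Sv * Sv'), 0);  % clamp tiny negatives from round-off

K = exp(-D2 / (2 * rho^2));              % (S, S) RBF kernel matrix
K = (K + K') / 2;                        % enforce exact symmetry

[Vec, Val] = eig(K);                     % eigen-decomposition of the kernel
[lambda, idx] = sort(diag(Val), 'descend');
lambda = max(lambda(1:L), 0);            % keep the top L, guard against round-off

Psi = Vec(:, idx(1:L));                  % (S, L) orthonormal eigenvectors
Lambda_sqrt = sqrt(lambda);              % (L, 1) square roots of the eigenvalues
```

A basis built this way could then be passed through params.Psi and params.Lambda_sqrt with Sv = [], as in the second calling pattern.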
- Data preparation: load or generate A, M, x, and either Sv or precomputed Psi and Lambda_sqrt.

      % generate toy data:
      seed = 2025; rng(seed);
      % Small settings for a quick smoke test
      V = 20; % nodes (small)
      n = 40; % subjects
      s = 6;  % image side length (s*s = S locations)
      S = s * s;
      p = 2;
      H = 3;
      % generator options (use the independent-error generator here)
      sigma2_e = 0.5; sigma2_gamma = 1; delta = 0.5; tau_pattern = 1;
      % Call the available data generator (simu_data_gen_2d)
      [A, M, x, Psi, Lambda_sqrt, B_true, BM, Gamma_x, tau_true, Z_true, w_true, ...
          gamma1_true, gamma2_true, sigma2_e_actual, SNR_A, SNR_BM, Sv] = ...
          simu_data_gen_2d(n, V, H, p, s, sigma2_e, sigma2_gamma, delta, ...
          tau_pattern, seed, false, 0.95, 20, 1);
- Example A: provide spatial coordinates Sv and let the wrapper compute Psi / Lambda_sqrt.

      % Set random seed for reproducibility
      rng(123);
      params = struct(); % start from a clean params struct
      params.error_model = 'MCMC_IND';
      params.nsamples = 2000;
      params.burnin = 500;
      params.thinning = 2;
      result = COSR_wrapper(params.error_model, A, M, Sv, x, params);
      estimates = result.estimates;
      samples = result.samples;
- Example B: precompute basis functions and pass them via params (Sv is optional; pass []):

      % Set random seed for reproducibility
      rng(123);
      params = struct(); % start from a clean params struct
      params.error_model = 'VB_IND';
      params.nsamples = 1000;
      params.Psi = Psi;
      params.Lambda_sqrt = Lambda_sqrt;
      result = COSR_wrapper(params.error_model, A, M, [], x, params);
      estimates = result.estimates;
- MCMC_BF: full MCMC using a Bayesian factor model for correlated residuals. Suited to moderate-sized problems and to cases where modeling spatial residual correlation is important.
- MCMC_IND: MCMC assuming independent residuals. Simpler and faster than the BF variant.
- VB_IND: Variational Bayes / CAVI. Fast, approximate inference for large-scale or high-dimensional problems.
Primary outputs produced by the COSR wrapper and inference functions:
- estimates: point estimates (spatial coefficient maps, community assignments Z, effect sizes, etc.).
- samples: retained MCMC samples when using an MCMC method (for posterior summaries and credible intervals).
- fitted_errors: diagnostic objects from the example diagnostic utilities.
Analyses typically focus on:
- Spatially varying coefficient maps $\tau_{h,h'}(s)$ that show where on the surface a given shape measurement is associated with the connectivity between subnetworks $h$ and $h'$.
- Posterior inclusion probabilities (PIPs), which give the probability that each association is selected and so quantify the uncertainty in the estimated effects.
- Community / cluster assignments (Z), which summarize the modular structure of the nodes.
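The posterior summaries listed above can be computed from retained draws with a few lines of MATLAB. The sketch below uses synthetic draws, not actual COSR output: the arrays `incl_draws` and `tau_draws` are hypothetical stand-ins, and the field names inside result.samples may be organized differently. The 0.5 threshold on the PIP is one common selection rule, not necessarily the one used in the repository.

```matlab
% Sketch with synthetic draws; NOT actual COSR output.
rng(1);
T = 1000;  S = 36;                          % retained draws, spatial locations
signal = [ones(1, 6), zeros(1, S - 6)];     % first 6 locations carry an effect

% Hypothetical binary inclusion indicators and coefficient draws (both T x S)
incl_draws = rand(T, S) < (0.9 * signal + 0.05);
tau_draws  = incl_draws .* (signal + 0.2 * randn(T, S));

pip      = mean(incl_draws, 1);             % posterior inclusion probabilities
tau_mean = mean(tau_draws, 1);              % posterior mean coefficient map

% Equal-tailed 95% credible interval from the sorted draws
sorted = sort(tau_draws, 1);
ci_lo  = sorted(max(1, floor(0.025 * T)), :);
ci_hi  = sorted(ceil(0.975 * T), :);

selected = pip > 0.5;                       % locations flagged as associated
```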
Visualize coefficient maps on your mesh/surface of choice (FreeSurfer, MATLAB patch/trisurf, or export to neuroimaging viewers).
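For the 2D toy setting, a coefficient map can be viewed directly as an image. This is a minimal sketch: `tau_hat` is a hypothetical S-by-1 map, and the column-major ordering of locations on the s-by-s grid is an assumption that should be checked against how Sv was generated.

```matlab
% Sketch: display a hypothetical coefficient map on an s-by-s grid.
s = 6;                                    % image side length (S = s*s)
tau_hat = zeros(s * s, 1);                % hypothetical estimated map (S x 1)
tau_hat(1:s) = 1;                         % nonzero effect in the first grid column

% Assumes locations are ordered column-major on the grid
map2d = reshape(tau_hat, s, s);

figure;
imagesc(map2d);                           % draw the map as a scaled image
axis image;                               % square pixels
colorbar;
title('Hypothetical coefficient map');
xlabel('column'); ylabel('row');
```

For a triangulated surface, MATLAB's trisurf(T, x, y, z, c) plays the same role, with the map values passed as the color argument c.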
example_simu/COSR_example_simu.m performs a quick pipeline:
- Generates synthetic connectivity matrices A and shape observations M using simu_data_gen_2d.m.
- Splits data into training (80%), validation (10%), and testing (10%) sets.
- Runs a COSR routine on the training data, based on the chosen method (MCMC_BF, MCMC_IND, or VB_IND).
- Compares recovered vs. true coefficients and community assignments, computes performance metrics (MSE, adjusted Rand index (ARI), recall, precision, F1), and produces diagnostic summaries.
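The split and the evaluation metrics in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in COSR_example_simu.m: the variable names and the small toy vectors are hypothetical, and the example script may compute these quantities differently.

```matlab
% Sketch: 80/10/10 split and evaluation metrics on toy inputs.
rng(2025);
n = 40;
perm      = randperm(n);
n_train   = round(0.8 * n);
n_val     = round(0.1 * n);
idx_train = perm(1:n_train);
idx_val   = perm(n_train + 1 : n_train + n_val);
idx_test  = perm(n_train + n_val + 1 : end);
% With real data: A_train = A(:, :, idx_train); M_train = M(idx_train, :);

% Selection metrics from hypothetical true vs. estimated inclusion indicators
truth = logical([1 1 1 0 0 0 0 1]);
est   = logical([1 1 0 0 0 1 0 1]);
tp = sum(est & truth);  fp = sum(est & ~truth);  fn = sum(~est & truth);
precision = tp / (tp + fp);
recall    = tp / (tp + fn);
f1        = 2 * precision * recall / (precision + recall);

% Mean squared error between hypothetical estimated and true coefficients
tau_true = [1 1 1 0 0 0 0 1];
tau_hat  = [0.9 1.1 0.2 0 0 0.4 0 1.0];
mse = mean((tau_hat - tau_true).^2);

% Adjusted Rand index from the contingency table of two labelings
z_true = [1 1 1 2 2 2 3 3];
z_est  = [2 2 2 1 1 3 3 3];
C      = accumarray([z_true(:), z_est(:)], 1);   % counts per label pair
comb2  = @(m) m .* (m - 1) / 2;                  % "m choose 2", elementwise
sum_ij = sum(comb2(C(:)));
sum_a  = sum(comb2(sum(C, 2)));
sum_b  = sum(comb2(sum(C, 1)));
expected = sum_a * sum_b / comb2(numel(z_true));
ari = (sum_ij - expected) / (0.5 * (sum_a + sum_b) - expected);
```

The ARI is invariant to relabeling of the communities, which is why it is preferred over raw label agreement when comparing estimated and true assignments.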
Open the example script to change the dimensions or noise settings for further stress testing.
- Dimension mismatch errors: ensure A is (V,V,n) and symmetric; M must be (n,S).
- Slow runs: reduce the number of basis functions L (fewer spatial basis functions) or use VB_IND for faster approximate inference.
- Reproducibility: set the seed via rng(2025) before running simulations or MCMC.
- Memory: large meshes (S large) and many edges (V large) increase memory use; only VB_IND is recommended for large-scale problems.
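The dimension and symmetry requirements above can be checked before calling the wrapper. The sketch below runs on toy inputs and is independent of the validation that COSR_wrapper.m performs itself; the tolerance and error messages are illustrative choices.

```matlab
% Sketch: shape and symmetry checks on toy inputs.
rng(2025);
V = 5;  n = 4;  S = 9;
A = randn(V, V, n);
A = (A + permute(A, [2 1 3])) / 2;            % symmetrize every subject's matrix
A(repmat(logical(eye(V)), 1, 1, n)) = 0;      % zero out the diagonals
M = randn(n, S);

% A must be a V-by-V-by-n array
assert(ndims(A) == 3 && size(A, 1) == size(A, 2), 'A must be (V, V, n).');

% M must have one row per subject
assert(size(M, 1) == size(A, 3), 'M must be (n, S) with n = size(A, 3).');

% Each slice of A should be symmetric with a zero diagonal
asym      = abs(A - permute(A, [2 1 3]));
sym_err   = max(asym(:));
diag_vals = A(repmat(logical(eye(V)), 1, 1, n));
assert(sym_err < 1e-10, 'A is not symmetric.');
assert(all(diag_vals == 0), 'A has nonzero diagonal entries.');
```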
📌 Citation and licensing information can be added once the paper is published / accepted.