diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index ce3729d..67511e4 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -1,8 +1,9 @@ -name: Publish to PyPI and GitHub Packages +name: Publish to PyPI + on: - push: - tags: - - "v*.*.*" + release: + types: [published] + jobs: build-and-publish: runs-on: ubuntu-latest @@ -12,32 +13,21 @@ jobs: - name: Set up Python uses: actions/setup-python@v4 - with: - python-version: "3.10" - - name: Install build tools + python-version: "3.12" + - name: Install build tools run: | python -m pip install --upgrade pip setuptools wheel twine + python -m pip install build - name: Build source & wheel run: | - python setup.py sdist bdist_wheel - -# - name: Publish to GitHub Packages -# env: -# TWINE_USERNAME: __token__ -# TWINE_PASSWORD: ${{ secrets.GITHUB_TOKEN }} -# run: | -# python -m pip install --upgrade pip twine -# python -m twine upload \ -# --repository-url https://api.github.com/orgs/${{ github.repository_owner }}/packages/pypi/upload \ -# dist/* + python -m build - name: Publish to PyPI env: TWINE_USERNAME: __token__ TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }} - run: | twine upload dist/* diff --git a/.gitignore b/.gitignore index a4aff76..3bcb9ef 100644 --- a/.gitignore +++ b/.gitignore @@ -121,3 +121,4 @@ dpmon_output_* lib/ .DS_Store cancer_output_1/GlobalNetwork.csv +Parkinson diff --git a/CHANGELOG.md b/CHANGELOG.md index c1a1411..cb2b551 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -70,16 +70,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/). ### **Added** - **New Embedding Integration Utility** - - `_integrate_embeddings(reduced, method="multiply", alpha=2.0, beta=0.5)`: - - Integrates reduced embeddings with raw omics features via a multiplicative scheme: - - `enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)` + - `_integrate_embeddings(reduced, method="multiply", alpha=2.0, beta=0.5)`: + - Integrates reduced embeddings with raw omics features via a multiplicative scheme: + - `enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)` - (default ensures ≥ 50 % of each feature’s final value is influenced by the learned weights). - **Graph-Generation Algorithms** - - `gen_similarity_graph`: k-NN Cosine / Gaussian RBF similarity graph - - `gen_correlation_graph`: Pearson / Spearman co-expression graph - - `gen_threshold_graph`: soft-threshold (WGCNA-style) correlation graph - - `gen_gaussian_knn_graph`: Gaussian kernel k-NN graph + - `gen_similarity_graph`: k-NN Cosine / Gaussian RBF similarity graph + - `gen_correlation_graph`: Pearson / Spearman co-expression graph + - `gen_threshold_graph`: soft-threshold (WGCNA-style) correlation graph + - `gen_gaussian_knn_graph`: Gaussian kernel k-NN graph - `gen_mutual_info_graph`: mutual-information graph - **Preprocessing Utilities** @@ -89,24 +89,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/). 
- Correlation selection (supervised / unsupervised): `select_top_k_correlation` - RandomForest importance: `select_top_randomforest` - ANOVA F-test selection: `top_anova_f_features` - - Network-pruning helpers: - - `prune_network`, `prune_network_by_quantile`, + - Network-pruning helpers: + - `prune_network`, `prune_network_by_quantile`, - `network_remove_low_variance`, `network_remove_high_zero_fraction` -- **Continuous-Deployment Workflow** +- **Continuous-Deployment Workflow** Added `.github/workflows/publish.yml` to auto-publish releases to PyPI when a Git tag is pushed. -- **Updated Homepage Image** +- **Updated Homepage Image** Replaced the index-page illustration to depict the full BioNeuralNet workflow. ### **Changed** - **Comprehensive Documentation Update** - - Rebuilt ReadTheDocs site with a new workflow diagram on the landing page. - - Synced API reference to include all new graph-generation, preprocessing, and embedding-integration functions. - - Added quick-start guide, expanded tutorials, and refreshed examples/notebooks. + - Rebuilt ReadTheDocs site with a new workflow diagram on the landing page. + - Synced API reference to include all new graph-generation, preprocessing, and embedding-integration functions. + - Added quick-start guide, expanded tutorials, and refreshed examples/notebooks. - Updated narrative docs, docstrings, and licensing info for consistency. - **License**: Project is now distributed under the [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/). ### **Fixed** - **Packaging Bug**: Missing `.csv` datasets and `.R` scripts in source distribution; `MANIFEST.in` updated to include all requisite data files. + +## [1.1.2] - 2025-11-01 + +- **Linked Zenodo DOI to GitHub repository** + +## [1.1.3] - 2025-11-01 + +- **Tag update to sync Zenodo and PyPI** diff --git a/README.md b/README.md index ab33058..6219e85 100644 --- a/README.md +++ b/README.md @@ -6,9 +6,10 @@ [![GitHub Contributors](https://img.shields.io/github/contributors/UCD-BDLab/BioNeuralNet)](https://github.com/UCD-BDLab/BioNeuralNet/graphs/contributors) [![Downloads](https://static.pepy.tech/badge/bioneuralnet)](https://pepy.tech/project/bioneuralnet) [![Documentation](https://img.shields.io/badge/docs-read%20the%20docs-blue.svg)](https://bioneuralnet.readthedocs.io/en/latest/) +[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.17503084.svg)](https://doi.org/10.5281/zenodo.17503084) -## Welcome to BioNeuralNet 1.1.1 +## Welcome to BioNeuralNet 1.1.3 ![BioNeuralNet Logo](assets/LOGO_WB.png) @@ -31,16 +32,16 @@ For your convenience, you can use the following BibTeX entry: <details> <summary>
BibTeX Citation - + ```bibtex @misc{ramos2025bioneuralnetgraphneuralnetwork, - title={BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool}, + title={BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool}, author={Vicente Ramos and Sundous Hussein and Mohamed Abdel-Hafiz and Arunangshu Sarkar and Weixuan Liu and Katerina J. Kechris and Russell P. Bowler and Leslie Lange and Farnoush Banaei-Kashani}, year={2025}, eprint={2507.20440}, archivePrefix={arXiv}, primaryClass={cs.LG}, - url={https://arxiv.org/abs/2507.20440}, + url={https://arxiv.org/abs/2507.20440}, } ```
@@ -105,10 +106,10 @@ BioNeuralNet is a flexible, modular Python framework developed to facilitate end - **Similarity graphs:** k-NN (cosine/Euclidean), RBF, mutual information. - **Correlation graphs:** Pearson, Spearman; optional soft-thresholding. - + - **Phenotype-aware graphs:** SmCCNet integration (R) for sparse multiple canonical-correlation networks. -- **[Preprocessing Utilities](https://bioneuralnet.readthedocs.io/en/latest/utils.html#graph-generation):** +- **[Preprocessing Utilities](https://bioneuralnet.readthedocs.io/en/latest/utils.html#graph-generation):** - **RData conversion to pandas DataFrame:** Converts an RData file to CSV and loads it into a pandas DataFrame. @@ -198,7 +199,7 @@ dpmon = DPMON( model="GCN", repeat_num=5, tune=True, - gpu=True, + gpu=True, cuda=0, output_dir="./output" ) @@ -275,24 +276,24 @@ See the [LICENSE](LICENSE) file for details. If you use BioNeuralNet in your research, we kindly ask that you cite our paper: -> Vicente Ramos, et al. (2025). -> [**BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool**](https://arxiv.org/abs/2507.20440). +> Vicente Ramos, et al. (2025). +> [**BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool**](https://arxiv.org/abs/2507.20440). > *arXiv preprint arXiv:2507.20440*. For your convenience, you can use the following BibTeX entry:
BibTeX Citation - + ```bibtex @misc{ramos2025bioneuralnetgraphneuralnetwork, - title={BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool}, + title={BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool}, author={Vicente Ramos and Sundous Hussein and Mohamed Abdel-Hafiz and Arunangshu Sarkar and Weixuan Liu and Katerina J. Kechris and Russell P. Bowler and Leslie Lange and Farnoush Banaei-Kashani}, year={2025}, eprint={2507.20440}, archivePrefix={arXiv}, primaryClass={cs.LG}, - url={https://arxiv.org/abs/2507.20440}, + url={https://arxiv.org/abs/2507.20440}, } ``` -
\ No newline at end of file + diff --git a/bioneuralnet/__init__.py b/bioneuralnet/__init__.py index 1869b22..531ee90 100644 --- a/bioneuralnet/__init__.py +++ b/bioneuralnet/__init__.py @@ -29,7 +29,7 @@ - `datasets`: Contains example (synthetic) datasets for testing and demonstration purposes. """ -__version__ = "1.1.1" +__version__ = "1.1.3" from .network_embedding import GNNEmbedding from .downstream_task import SubjectRepresentation diff --git a/bioneuralnet/utils/graph.py b/bioneuralnet/utils/graph.py index 5817740..3b7a2e0 100644 --- a/bioneuralnet/utils/graph.py +++ b/bioneuralnet/utils/graph.py @@ -10,15 +10,15 @@ def gen_similarity_graph(X:pd.DataFrame, k:int = 15, metric:str = "cosine", mutual:bool = False, per_node:bool = True, self_loops:bool = False) -> pd.DataFrame: """ - Build a normalized k-nearest neighbors (kNN) similarity graph from feature vectors. + Build a normalized k-nearest neighbors (kNN) similarity graph from feature vectors. The function computes pairwise `cosine` or `Euclidean` distances, sparsifies the matrix by keeping `top-k` neighbours per node (or by applying a global threshold), optionally prunes edges to mutual neighbours, and can add self-loops. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - k: Number of neighbors to keep per node. - - metric: `cosine` or `Euclidean` (uses gaussian kernel on distances). + - metric: `cosine` or `Euclidean` (uses gaussian kernel on distances). - mutual: If `True`, retain only mutual edges (i->j and j->i). - per_node: If `True`, use per-node `k`, else apply a global cutoff. - self_loops: If `True`, add a self-loop weight of 1. @@ -36,7 +36,7 @@ def gen_similarity_graph(X:pd.DataFrame, k:int = 15, metric:str = "cosine", mutu x_torch = torch.tensor(X.values, dtype=torch.float32, device=device) else: raise TypeError("X must be a pandas.DataFrame") - + N = x_torch.size(0) k = min(k, N-1) @@ -80,18 +80,18 @@ def gen_similarity_graph(X:pd.DataFrame, k:int = 15, metric:str = "cosine", mutu if final_graph.shape != (number_of_omics, number_of_omics): logger.info(f"Please make sure your input X follows the description: A DataFrame (N, D) where, N(rows) is the number of subjects/samples and D(columns) represents the multi-omics features.") raise ValueError(f"Generated graph shape {final_graph.shape} does not match expected shape ({number_of_omics}, {number_of_omics}).") - + return final_graph def gen_correlation_graph(X: pd.DataFrame, k: int = 15,method: str = 'pearson', mutual: bool = False, per_node: bool = True,threshold: Optional[float] = None, self_loops:bool = False) -> pd.DataFrame: """ - Build a normalized k-nearest neighbors (kNN) correlation graph from feature vectors. + Build a normalized k-nearest neighbors (kNN) correlation graph from feature vectors. The function computes pairwise `pearson` or `spearman` correlations, sparsifies the matrix by keeping `top-k` neighbours per node (or by applying a global threshold), optionally prunes edges to mutual neighbours, and can add self-loops. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - k: Number of neighbors to keep per node. - method: `pearson` or `spearman`. 
@@ -170,12 +170,12 @@ def gen_correlation_graph(X: pd.DataFrame, k: int = 15,method: str = 'pearson', def gen_threshold_graph(X:pd.DataFrame, b: float = 6.0,k: int = 15, mutual: bool = False, self_loops: bool = False) -> pd.DataFrame: """ - Build a normalized k-nearest neighbors (kNN) soft-threshold co-expression graph, similar to the network-construction step in WGCNA. + Build a normalized k-nearest neighbors (kNN) soft-threshold co-expression graph, similar to the network-construction step in WGCNA. The function computes absolute pair-wise Pearson correlations, applies a power-law soft threshold with exponent `b`, sparsifies the matrix by keeping `top-k` neighbours per node, optionally prunes edges to mutual neighbours, and can add self-loops. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - b: Soft-threshold exponent applied to absolute correlations. - k: Number of neighbors to keep per node. @@ -235,12 +235,12 @@ def gen_threshold_graph(X:pd.DataFrame, b: float = 6.0,k: int = 15, mutual: bool def gen_gaussian_knn_graph(X: pd.DataFrame,k: int = 15,sigma: Optional[float] = None ,mutual: bool = False, self_loops: bool = True) -> pd.DataFrame: """ - Build a normalized k-nearest neighbors (kNN) similarity graph using a Gaussian (RBF) kernel. + Build a normalized k-nearest neighbors (kNN) similarity graph using a Gaussian (RBF) kernel. The function computes pairwise Euclidean distances, converts them to similarities with a Gaussian kernel (bandwidth `sigma`; if `None`, the median-distance heuristic is used), sparsifies the matrix by keeping `top-k` neighbours per node, optionally prunes edges to mutual neighbours, and can add self-loops. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - k: Number of neighbors to keep per node. - sigma: Bandwidth of the Gaussian kernel; if `None`, uses the median squared distance. @@ -295,12 +295,12 @@ def gen_gaussian_knn_graph(X: pd.DataFrame,k: int = 15,sigma: Optional[float] = def gen_lasso_graph(X: pd.DataFrame, alpha: float = 0.01, self_loops: bool = False, max_iter:int = 500) -> pd.DataFrame: """ - Build a sparse network using Graphical Lasso (inverse-covariance estimation). + Build a sparse network using Graphical Lasso (inverse-covariance estimation). The function fits a precision matrix with L1 regularization (`alpha`), converts the non-zero entries to edge weights, can add self-loops, and row-normalizes the result. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - alpha: Regularization strength for Graphical Lasso; larger values yield sparser graphs. - self_loops: If `True`, add a self-loop weight of 1. @@ -313,7 +313,7 @@ def gen_lasso_graph(X: pd.DataFrame, alpha: float = 0.01, self_loops: bool = Fal if isinstance(X, pd.DataFrame): nodes = X.columns number_of_omics = len(nodes) - x_numpy = X.values + x_numpy = X.values else: raise TypeError("X must be a pandas.DataFrame") @@ -344,12 +344,12 @@ def gen_lasso_graph(X: pd.DataFrame, alpha: float = 0.01, self_loops: bool = Fal def gen_mst_graph(X: pd.DataFrame, self_loops: bool = False) -> pd.DataFrame: """ - Build a minimum-spanning-tree (MST) graph from feature vectors. + Build a minimum-spanning-tree (MST) graph from feature vectors.
The function computes pairwise Euclidean distances, extracts the MST, can add self-loops, and row-normalizes the result. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - self_loops: If `True`, add a self-loop weight of 1. @@ -410,12 +410,12 @@ def gen_mst_graph(X: pd.DataFrame, self_loops: bool = False) -> pd.DataFrame: def gen_snn_graph(X: pd.DataFrame,k: int = 15,mutual: bool = False, self_loops: bool = False) -> pd.DataFrame: """ - Build a shared-nearest-neighbor (SNN) graph from feature vectors. + Build a shared-nearest-neighbor (SNN) graph from feature vectors. The function first finds the `top-k` nearest neighbours for each node, counts how many neighbours two nodes share, converts that count to an SNN similarity score, optionally prunes edges to mutual neighbours, and can add self-loops. Args: - + - X: Dataframe of shape (N, D) where, N(`rows`) is the number of subjects/samples and D(`columns`) represents the multi-omics features. - k: Number of neighbors to keep per node. - mutual: If `True`, retain only mutual edges (i->j and j->i). diff --git a/docs/source/Quick_Start.ipynb b/docs/source/Quick_Start.ipynb index 8a4089b..b5b06f3 100644 --- a/docs/source/Quick_Start.ipynb +++ b/docs/source/Quick_Start.ipynb @@ -896,7 +896,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "BioNeuralNet version: 1.1.1\n" + "BioNeuralNet version: 1.1.3\n" ] } ], diff --git a/docs/source/conf.py b/docs/source/conf.py index 34836ce..d2da42a 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -7,7 +7,7 @@ try: release = metadata.version("bioneuralnet") except metadata.PackageNotFoundError: - release = "1.1.1" + release = "1.1.3" project = "BioNeuralNet" version = release diff --git a/docs/source/index.rst b/docs/source/index.rst index 210e1fb..31d60ed 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -13,6 +13,9 @@ BioNeuralNet: Graph Neural Networks for Multi-Omics Network Analysis .. image:: https://img.shields.io/badge/GitHub-View%20Code-blue :target: https://github.com/UCD-BDLab/BioNeuralNet +.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.17503084.svg + :target: https://doi.org/10.5281/zenodo.17503084 + .. figure:: _static/LOGO_TB.png :align: center :alt: BioNeuralNet Logo diff --git a/requirements.txt b/requirements.txt index 288e2a8..938730c 100644 --- a/requirements.txt +++ b/requirements.txt @@ -5,6 +5,7 @@ matplotlib scikit-learn networkx python-louvain +pydantic ray[tune] statsmodels # torch diff --git a/setup.cfg b/setup.cfg index 1c12434..c85c934 100644 --- a/setup.cfg +++ b/setup.cfg @@ -1,6 +1,6 @@ [metadata] name = bioneuralnet -version = 1.1.1 +version = 1.1.3 author = Vicente Ramos author_email = vicente.ramos@ucdenver.edu description = A comprehensive framework for integrating multi-omics data with neural network embeddings. @@ -29,6 +29,7 @@ install_requires = statsmodels networkx python-louvain + pydantic ray[tune] [options.packages.find]
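The CHANGELOG hunk above quotes the multiplicative integration scheme used by `_integrate_embeddings`: `enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)`. The snippet below is a minimal NumPy sketch of that formula only; the synthetic arrays and the min-max normalization of the weights are assumptions, not BioNeuralNet's actual implementation.

```python
import numpy as np

# Synthetic stand-ins (assumed): a subjects-by-features omics matrix and one
# learned importance weight per feature (e.g., derived from node embeddings).
rng = np.random.default_rng(0)
raw = rng.random((5, 4))        # raw omics features, shape (N, D)
weights = rng.random(4)         # per-feature learned weights, shape (D,)

# Assumed normalization step: rescale the weights into [0, 1].
normalized_weight = (weights - weights.min()) / (weights.max() - weights.min() + 1e-12)

alpha, beta = 2.0, 0.5          # defaults quoted in the CHANGELOG entry

# The integration scheme quoted in the CHANGELOG:
# enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)
enhanced = beta * raw + (1 - beta) * (alpha * normalized_weight * raw)

print(enhanced.shape)           # (5, 4): same shape as the raw input
```

With `beta = 0.5`, half of each enhanced value is the untouched raw feature and the other half is the weight-modulated term, which matches the changelog note that at least 50 % of each feature's final value is influenced by the learned weights.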
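The `bioneuralnet/utils/graph.py` hunks above document several graph-generation functions that all take a subjects-by-features DataFrame and return a square, row-normalized adjacency DataFrame. A minimal usage sketch based on the signatures visible in those hunks follows; the import path and the synthetic data are assumptions.

```python
import numpy as np
import pandas as pd

# Assumed import path: the functions are defined in bioneuralnet/utils/graph.py.
from bioneuralnet.utils.graph import (
    gen_similarity_graph,
    gen_correlation_graph,
    gen_threshold_graph,
    gen_gaussian_knn_graph,
)

# Synthetic omics table: N = 30 subjects (rows), D = 12 features (columns).
rng = np.random.default_rng(42)
X = pd.DataFrame(rng.random((30, 12)), columns=[f"feat_{i}" for i in range(12)])

# k-NN cosine similarity graph.
sim = gen_similarity_graph(X, k=5, metric="cosine", mutual=False, self_loops=False)

# Spearman correlation graph with mutual-neighbour pruning.
corr = gen_correlation_graph(X, k=5, method="spearman", mutual=True)

# WGCNA-style soft-threshold graph with exponent b.
soft = gen_threshold_graph(X, b=6.0, k=5)

# Gaussian-kernel k-NN graph using the median-distance heuristic for sigma.
rbf = gen_gaussian_knn_graph(X, k=5, sigma=None)

# Per the docstrings, each result is a square adjacency DataFrame.
for name, g in [("similarity", sim), ("correlation", corr), ("threshold", soft), ("rbf-knn", rbf)]:
    print(name, g.shape)
```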
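`gen_lasso_graph` is described above as fitting an L1-regularized precision matrix and turning its non-zero entries into row-normalized edge weights. The standalone sketch below illustrates that general technique with scikit-learn's `GraphicalLasso`; it is not the function's actual implementation, and the data and column names are made up.

```python
import numpy as np
import pandas as pd
from sklearn.covariance import GraphicalLasso

# Synthetic subjects-by-features table (illustrative only).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.standard_normal((100, 8)),
                 columns=[f"feat_{i}" for i in range(8)])

# Fit a sparse inverse-covariance (precision) matrix with L1 penalty alpha.
model = GraphicalLasso(alpha=0.01, max_iter=500)
model.fit(X.values)

# Use absolute off-diagonal precision entries as edge weights (no self-loops).
W = np.abs(model.precision_)
np.fill_diagonal(W, 0.0)

# Row-normalize so each node's outgoing weights sum to 1, as the docstring describes.
row_sums = W.sum(axis=1, keepdims=True)
row_sums[row_sums == 0] = 1.0
adj = pd.DataFrame(W / row_sums, index=X.columns, columns=X.columns)

print(adj.shape)  # (8, 8) feature-by-feature network
```

Larger `alpha` values shrink more precision entries to zero and therefore give a sparser network, which is the behaviour the `alpha` docstring line describes.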