GitHub - stagedtrees/exp_class_stagedtrees

what it is

This repository contains the code to run experiments that benchmark and evaluate staged event tree classifiers.

The code was used in the following publications (the BibTeX entry follows each citation):

Leonelli, M. and Varando, G. (2024). Context-Specific Refinements of Bayesian Network Classifiers. Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246:182-198.

  
@InProceedings{leonelli24context,
  title = {Context-Specific Refinements of Bayesian Network Classifiers},
  author = {Leonelli, Manuele and Varando, Gherardo},
  booktitle = {Proceedings of The 12th International Conference on Probabilistic Graphical Models},
  pages = {182--198},
  year = {2024},
  editor = {Kwisthout, Johan and Renooij, Silja},
  volume = {246},
  series = {Proceedings of Machine Learning Research},
  month = {11--13 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v246/main/assets/leonelli24a/leonelli24a.pdf},
  url = {https://proceedings.mlr.press/v246/leonelli24a.html}
}

Carli, F., Leonelli, M. and Varando, G. (2023). A new class of generative classifiers based on staged tree models. Knowledge-Based Systems, 268, 110488.

@article{carli2023new,
title = {A new class of generative classifiers based on staged tree models},
journal = {Knowledge-Based Systems},
volume = {268},
pages = {110488},
year = {2023},
issn = {0950-7051},
doi = {https://doi.org/10.1016/j.knosys.2023.110488},
url = {https://www.sciencedirect.com/science/article/pii/S0950705123002381},
author = {Federico Carli and Manuele Leonelli and Gherardo Varando},
keywords = {Bayesian networks, Model selection, Staged trees, Statistical classification},
abstract = {Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule. Among generative models, Bayesian networks and naive Bayes classifiers are the most commonly used and provide a clear graphical representation of the relationship among all variables. However, these have the disadvantage of highly restricting the type of relationships that could exist, by not allowing for context-specific independence. Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence. They are constructed by a partitioning of the vertices of an event tree from which conditional independence can be formally read. The naive staged tree classifier is also defined, which extends the classic naive Bayes classifier whilst retaining the same complexity. An extensive simulation study shows that the classification accuracy of staged tree classifiers is competitive with that of state-of-the-art classifiers and an example showcases their use in practice.}
}

how to

Rscript run_classifiers.R runs all the classifiers defined in methods.R over the datasets (by default, currently only the fast binary datasets). Optional arguments can be passed in the following format:
```
Rscript run_classifiers.R DATA CLASS1 CLASS2 CLASS3 ...
```
where DATA is either the name of a single dataset (e.g. Asym) or the name of a tsv file containing a list of datasets (e.g. binary_fast_datasets_names.tsv). The arguments CLASS1 CLASS2 ... are classifier identifiers. Each one is either the name of a single method (e.g. bnc_nb, the naive Bayes classifier implemented in bnclassify) or the name of a family of methods (e.g. bnc_), in which case all classifiers in that family are executed. The trailing _ of a family name is required.

Examples:
- run all methods in the st_ family (stagedtrees) and the simple classifier over the Titanic dataset:
```
Rscript run_classifiers.R Titanic st_ simple 
```
- run the simple classifier over all the datasets:
```
Rscript run_classifiers.R datasets_names.tsv simple 
```
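
The identifier rule described above can be illustrated with a small shell sketch. It is not the repository's own matching code: it assumes that an identifier ending in _ is treated as a family prefix and that any other identifier must equal a method name exactly. The function select_methods and the method name bnc_example are hypothetical; bnc_nb, simple, st_naive_order and st_fbhc_order are names that appear in this README.

```shell
#!/bin/sh
# Sketch only, not the repository's code.
# Assumption: identifiers ending in "_" select a whole family by prefix,
# all other identifiers select one method by exact name.
select_methods() {
  id="$1"
  shift
  for m in "$@"; do
    case "$id" in
      *_)
        # family identifier: keep every method whose name starts with it
        case "$m" in
          "$id"*) echo "$m" ;;
        esac
        ;;
      *)
        # plain identifier: keep only the exact method name
        if [ "$m" = "$id" ]; then echo "$m"; fi
        ;;
    esac
  done
}

# bnc_example is a made-up name used only for illustration.
methods="bnc_nb bnc_example st_naive_order st_fbhc_order simple"
select_methods bnc_ $methods     # prints bnc_nb and bnc_example
select_methods simple $methods   # prints simple
select_methods bnc $methods      # prints nothing under this assumed rule
```

Under this reading, dropping the trailing _ turns a family request into a request for a single method called bnc, which is one plausible reason the underscore matters.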
Running aggregate.R evaluates the available results according to the measures defined in statistics.R and saves them in a multidimensional array, TABLE.rds. Values for plotting ROC curves are saved in a second multidimensional array, ROC_CURVES.rds.
The script plot.R takes the aggregated tables TABLE.rds and ROC_CURVES.rds and produces plots.

See the available methods in METHODS.md and the available datasets in datasets_names.tsv.
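
The three steps of this section (run, aggregate, plot) can be chained in a small wrapper. The following is a sketch under some assumptions: the scripts are launched from the repository root, and aggregate.R and plot.R are called without arguments because none are documented here. DRY_RUN, run and pipeline are hypothetical names that belong to this wrapper only; the R scripts do not read them.

```shell
#!/bin/sh
# Sketch of the run -> aggregate -> plot sequence described in this section.
# DRY_RUN is a hypothetical switch of this wrapper: it defaults to 1, so the
# commands are printed instead of executed. Use DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

pipeline() {
  # 1. run the selected classifiers on the selected data
  run Rscript run_classifiers.R "$@" || return 1
  # 2. aggregate the results (TABLE.rds and ROC_CURVES.rds)
  run Rscript aggregate.R || return 1
  # 3. produce plots from the aggregated tables
  run Rscript plot.R
}

pipeline Titanic st_ simple
```

Stopping at the first failing step keeps aggregate.R and plot.R from running on results of an incomplete classifier run.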

order experiments

Rscript run_order.R DATA st_naive_order st_fbhc_order
Rscript aggregate_order.R DATA
Rscript plot_order.R DATA

where DATA is the name of one of the datasets.
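
Since the order scripts take a single dataset name, covering several datasets means repeating the three commands. A minimal sketch that prints those commands for every dataset in a list file follows. It assumes a list format that is not confirmed by this README: one dataset name per line in the first tab-separated column, no header row, and no whitespace inside names. The function order_commands is hypothetical.

```shell
#!/bin/sh
# Sketch: print the three order-experiment commands for every dataset in a
# list file, so they can be reviewed before being executed.
# Assumptions: one dataset name per line in the first tab-separated column,
# no header row, names without whitespace.
order_commands() {
  cut -f 1 "$1" | while IFS= read -r data; do
    [ -n "$data" ] || continue   # skip blank lines
    echo "Rscript run_order.R $data st_naive_order st_fbhc_order"
    echo "Rscript aggregate_order.R $data"
    echo "Rscript plot_order.R $data"
  done
}

# Example with a throwaway list holding two dataset names from this README.
list="$(mktemp)"
printf 'Asym\nTitanic\n' > "$list"
order_commands "$list"
rm -f "$list"
```

To execute the commands rather than print them, the output can be redirected to a file and that file run with sh.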
