NeuroFuzzyProject
├── data
│ ├── datasets
│ │ ├── issues (not used)
│ │ ├── sepsis
│ │ │ ├── sepsis_survival_primary_cohort.csv
│ │ │ ├── sepsis_survival_study_cohort.csv
│ │ │ └── sepsis_survival_validation_cohort.csv
│ │ ├── diabetes.csv
│ │ ├── maternal_health_risk.csv
│ │ ├── ... (other datasets not used)
│ │ └── obesity.csv
│ └── data.py
├── experiments
│ ├── configurations
│ │ ├── diabetes
│ │ │ └── json file with configurations
│ │ ├── maternal_hr
│ │ │ └── json file with configurations
│ │ ├── obesity
│ │ │ └── json file with configurations
│ │ ├── sepsis
│ │ │ └── json file with configurations
│ │ ├── .... (other datasets not used)
│ │ ├── conf_general_V.json
│ │ ├── conf_general_weights.json
│ │ └── configurations.py
│ ├── results
│ │ ├── maternal_hr
│ │ │ ├── ReadMe.md
│ │ │ ├── summary_results
│ │ │ │ ├── show_results_maternal_hr.ipynb
│ │ │ │ ├── summary_results_maternal.csv
│ │ │ │ └── table_confrontation.py
│ │ │ └── csv with all results of the experiments
│ │ └── previous_experiments
│ │   ├── ReadMe.md
│ │   ├── diabetes
│ │   │ └── preliminary results of some experiments
│ │   ├── maternal_hr
│ │   │ └── preliminary results of some experiments (also with micro precision as fitness)
│ │   └── sepsis
│ │     └── preliminary results of the experiments
│ ├── calculate.py
│ ├── evolution.py
│ ├── plots.py
│ └── utils.py
├── models
│ ├── crossover.py
│ ├── models.py
│ ├── operators.py
│ └── selection.py
├── env2.yml
├── environment.yml
├── baseline-1.py
├── main_evol_ind.py
├── main.py
└── README.md
The evolution operations are implemented in the models directory:
- models.py contains the creation of individuals for the population and the fitness calculation
- selection.py contains the selection of the best individuals from the population (optionally with mutation)
- crossover.py contains the crossover between two individuals
To run the evolutionary version of the project correctly, a configuration file must be present in the experiments/configurations/<dataset>/ directory. The file should contain the following information:
- number of seeds
- neuron types (AND, OR)
- number of membership functions (MFs)
- which genes to update, in particular
  - V
  - weights of neurons
- activation function
- optimizer
- data encoding
- prediction method
- how to calculate the fitness function (accuracy, f1, ...)
- parameters for mutation rate (general)
- parameters for mutation rate of each individual
- parameters for crossover rate
- max number of generations
- max number for patience (early stopping)
- initial population size
- number of individuals to generate (offspring)
- selection strategy
  - plus
  - comma
- path for storing the results
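As an illustration of the list above, the following sketch builds a configuration as a Python dictionary and round-trips it through JSON. Every key name and value here is an assumption made for the example; the actual schema is defined by the JSON files in experiments/configurations/ and may differ.

```python
import json

# Hypothetical configuration: all key names and values are illustrative
# assumptions, not the project's real schema.
example_config = {
    "n_seeds": 5,
    "neuron_type": ["AND", "OR"],
    "num_mfs": [3, 5],
    "update_genes": ["V", "weights"],
    "activation": ["sigmoid"],
    "optimizer": ["adam"],
    "data_encoding": ["no-encoding"],
    "pred_method": ["argmax"],
    "fitness_function": ["accuracy", "f1"],
    "mutation_rate": [0.1],
    "mutation_individual_rate": [0.5],
    "crossover_rate": [0.8],
    "max_generations": 100,
    "max_patience": 10,
    "pop_size": 10,
    "num_offspring": 20,
    "selection_strategy": ["plus", "comma"],
    "path_to_results": "experiments/results/<dataset>/",
}

# In this sketch every key shown above is treated as required.
REQUIRED_KEYS = list(example_config)


def missing_keys(config, required):
    """Return the required keys that are absent from the configuration."""
    return [key for key in required if key not in config]


# Serialize to JSON text and load it back, as a config file would be.
config_text = json.dumps(example_config, indent=2)
reloaded = json.loads(config_text)
```

Checking for missing keys before starting a long run is a cheap way to catch an incomplete configuration early.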
An experiment for each possible configuration is performed. The pipeline of the project is the following:
- load the dataset
- run the experiment
- save the results
  - local results, with the detailed results of a single experiment (there are as many as there are combinations of configurations)
  - global results, with the summary of all the experiments
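Running one experiment per configuration amounts to taking the Cartesian product of the option lists. The sketch below shows one way to enumerate the combinations; the function name expand_grid and the option names are hypothetical and only serve the example.

```python
from itertools import product


def expand_grid(grid):
    """Yield one dict per combination of the listed option values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


# Hypothetical option lists: 2 * 2 * 2 = 8 experiments in total.
grid = {
    "neuron_type": ["AND", "OR"],
    "num_mfs": [3, 5],
    "selection_strategy": ["plus", "comma"],
}
combinations = list(expand_grid(grid))
```

Each dictionary in combinations would correspond to one local result file, and the global results would hold one summary row per dictionary.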
In the run_experiment function, the following steps are performed:
- initialize the population, that is a List[FNNModel]
  - an individual is created with the class FNNModel
  - the individual is initialized and added to the population
- for each generation (from 1 to max_gen, or until the patience counter reaches 0), the following steps are performed:
  - selection of the best individuals from the population, optionally with mutation
  - for EACH individual, the fitness function is calculated and saved for the train, validation and test sets
    - if the individual is the best according to the fitness function on the validation set, it is saved
  - the population performances (mean, std, max and min fitness) are saved in the local results for each set (train, val, test)
  - if there was no improvement in the individuals, the patience counter is decreased
- only the fitness on the train, validation and test sets of the best individual is returned (and saved in the global results)
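The generation loop with early stopping described above can be sketched as follows. This is a simplified illustration, not the project's run_experiment: individuals are plain numbers, a single fitness function stands in for the validation fitness, and the names evolve and step are hypothetical. Resetting the patience counter after an improvement is also an assumption, since the steps above only state that the counter is decreased.

```python
def evolve(population, fitness, step, max_gen, max_patience):
    """Run up to max_gen generations, stopping early when patience runs out."""
    best = max(population, key=fitness)
    best_fit = fitness(best)
    patience = max_patience
    history = []  # per-generation (min, mean, max) fitness

    for _generation in range(1, max_gen + 1):
        # `step` stands in for selection, crossover and mutation.
        population = step(population)
        fits = [fitness(ind) for ind in population]
        history.append((min(fits), sum(fits) / len(fits), max(fits)))

        gen_best = max(population, key=fitness)
        if fitness(gen_best) > best_fit:
            best, best_fit = gen_best, fitness(gen_best)
            patience = max_patience  # assumption: reset on improvement
        else:
            patience -= 1
            if patience == 0:
                break

    return best, best_fit, history


# Toy usage: individuals drift upwards by 1 each generation; optimum is 5.
best, best_fit, history = evolve(
    population=[0, 1],
    fitness=lambda x: -abs(x - 5),
    step=lambda pop: [x + 1 for x in pop],
    max_gen=50,
    max_patience=3,
)
```

In the toy run the population passes the optimum and keeps drifting away, so the loop stops once three consecutive generations bring no improvement.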
The selection strategy is implemented in the selection.py file.
To perform the selection, the number of parents (population size) must be lower than the number of offspring (children) to generate.
The strategy implemented is tournament selection: for each offspring (child) to be generated, the best individual is chosen from a random subset of the population. A plus strategy is also implemented.
In the file, the following steps are performed:
- check that the number of parents is lower than the number of offspring
- calculate the fitness of each individual, if not already done
- generate the offspring:
  - tournament selection is performed
  - crossover and mutation are performed
  - the new individual is added to the offspring set
- for each offspring, the fitness is calculated and saved
- if the selection strategy is plus, the offspring are added to the initial population
- the new population is sorted according to the fitness
- the population is truncated: the same number of individuals as in the initial population is kept and returned
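The steps above can be sketched as follows. This is a minimal illustration with numbers as individuals, not the code in selection.py: the names tournament, next_generation and make_child are hypothetical, fitness caching is omitted, and choosing the second parent with a second tournament is an assumption.

```python
import random


def tournament(population, fitness, k, rng):
    """Return the fittest of k individuals sampled at random."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


def next_generation(population, fitness, make_child, n_offspring, strategy, rng, k=2):
    """Build the next population with tournament selection and truncation."""
    if len(population) >= n_offspring:
        raise ValueError("population size must be lower than the number of offspring")

    offspring = []
    for _ in range(n_offspring):
        parent_a = tournament(population, fitness, k, rng)
        parent_b = tournament(population, fitness, k, rng)  # assumption
        # `make_child` stands in for crossover followed by mutation.
        offspring.append(make_child(parent_a, parent_b, rng))

    # plus: parents compete with offspring; comma: offspring only.
    pool = population + offspring if strategy == "plus" else offspring
    pool = sorted(pool, key=fitness, reverse=True)
    return pool[: len(population)]


# Toy usage: fitness is highest at 0; children average parents plus noise.
rng = random.Random(0)
parents = [4.0, -3.0, 2.5, 1.0]
fit = lambda x: -abs(x)
child = lambda a, b, r: (a + b) / 2 + r.gauss(0, 0.1)
plus_pop = next_generation(parents, fit, child, 8, "plus", rng)
comma_pop = next_generation(parents, fit, child, 8, "comma", rng)
```

With the plus strategy the best parent can never be lost, so the best fitness does not decrease between generations; the comma strategy gives no such guarantee.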
The crossover operation is implemented in the crossover.py file.
The crossover operation is performed between two individuals. This works by swapping some parameters of the two fuzzy neural networks.
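One possible reading of "swapping some parameters" is a per-parameter swap with a fixed probability, sketched below. The dictionary representation, the gene names and the function swap_crossover are assumptions for illustration; the actual operator in crossover.py works on FNNModel instances and may choose the swapped parameters differently.

```python
import copy
import random


def swap_crossover(parent_a, parent_b, genes, swap_prob, rng):
    """Return two children obtained by swapping parameters position-wise."""
    child_a = copy.deepcopy(parent_a)
    child_b = copy.deepcopy(parent_b)
    for gene in genes:
        for i in range(len(child_a[gene])):
            if rng.random() < swap_prob:
                # Exchange the parameter at position i between the children.
                child_a[gene][i], child_b[gene][i] = child_b[gene][i], child_a[gene][i]
    return child_a, child_b


# Toy usage with two hypothetical gene groups.
rng = random.Random(0)
parent_a = {"V": [0.1, 0.2, 0.3], "weights": [1.0, 2.0]}
parent_b = {"V": [0.7, 0.8, 0.9], "weights": [3.0, 4.0]}
child_a, child_b = swap_crossover(parent_a, parent_b, ["V", "weights"], 0.5, rng)
```

Copying the parents first keeps them intact, which matters when the same parent is selected for several offspring.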
The mutation operation is implemented in the models.py file.
An individual is mutated by adding random noise: first its fitness is reset, then the mutation is performed.
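A minimal sketch of this kind of mutation is shown below. The Individual class, the Gaussian noise and the parameters mutation_rate and sigma are assumptions for illustration; the actual method in models.py operates on an FNNModel and may use a different noise distribution.

```python
import random


class Individual:
    """Hypothetical stand-in for an FNNModel: a gene list plus cached fitness."""

    def __init__(self, genes, fitness=None):
        self.genes = list(genes)
        self.fitness = fitness


def mutate(individual, mutation_rate, sigma, rng):
    """Reset the cached fitness, then add Gaussian noise to some genes."""
    individual.fitness = None  # stale after mutation, must be recomputed
    for i in range(len(individual.genes)):
        if rng.random() < mutation_rate:
            individual.genes[i] += rng.gauss(0.0, sigma)
    return individual


# Toy usage: mutate every gene of an individual with a cached fitness.
rng = random.Random(0)
ind = Individual([0.5, -1.0, 2.0], fitness=0.9)
mutate(ind, mutation_rate=1.0, sigma=0.1, rng=rng)
```

Resetting the fitness first ensures that a later "calculate the fitness if not already done" check does not reuse a value computed for the unmutated genes.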