Context
gpredomics has many tunable parameters (k_penalty, population_size, cooling_rate, etc.) that significantly affect model quality. Manual tuning is tedious and suboptimal. Optuna provides efficient Bayesian hyperparameter optimization.
Design
Implement in gpredomicspy (Python layer) as gpredomicspy.optimize():
    import gpredomicspy as gp

    # Define search space
    search_space = {
        "algo": ["ga", "beam", "sa", "ils", "lasso"],
        "k_penalty": (1e-5, 0.01, "log"),
        "language": ["ter", "bin,ter", "bin,ter,ratio"],
        "data_type": ["prev", "raw", "raw,prev"],
        "population_size": (500, 10000),
        "cooling_rate": (0.99, 0.9999),
        "feature_minimal_prevalence_pct": (5, 30),
        "feature_maximal_adj_pvalue": (0.01, 0.1),
    }

    results = gp.optimize(
        base_param="param.yaml",
        search_space=search_space,
        n_trials=100,
        metric="test_auc",   # or "fit", "spearman", etc.
        direction="maximize",
        cv=True,             # use CV for robust evaluation
        n_jobs=4,            # parallel trials
    )

    print(results.best_params)
    print(results.best_value)
    results.plot_importance()  # which params matter most
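One way the proposed search-space format could be mapped onto Optuna is sketched below. This is only an illustration, not the planned implementation: the helper name `suggest_from_space` is hypothetical, and the rule "two integer bounds mean an integer parameter" is an assumption (under it, `feature_minimal_prevalence_pct` with bounds `(5, 30)` would be sampled as an integer). The `trial` argument is expected to behave like an Optuna `Trial`, whose `suggest_categorical`, `suggest_int`, and `suggest_float` methods are the only calls used.

```python
from typing import Any, Dict, Mapping


def suggest_from_space(trial: Any, search_space: Mapping[str, Any]) -> Dict[str, Any]:
    """Sample one configuration from a search space in the format shown above.

    Hypothetical helper. Conventions assumed here:
      - a list            -> categorical choice
      - (low, high)       -> int if both bounds are ints, otherwise float
      - (low, high, "log") -> same, sampled on a log scale
    """
    params: Dict[str, Any] = {}
    for name, spec in search_space.items():
        if isinstance(spec, list):
            params[name] = trial.suggest_categorical(name, spec)
            continue

        low, high = spec[0], spec[1]
        log = len(spec) > 2 and spec[2] == "log"
        if isinstance(low, int) and isinstance(high, int):
            params[name] = trial.suggest_int(name, low, high, log=log)
        else:
            params[name] = trial.suggest_float(name, low, high, log=log)
    return params
```

Inside `gp.optimize()`, a function like this would presumably be called at the start of each Optuna objective, with the sampled values merged into the settings loaded from `base_param` before running gpredomics and returning the chosen `metric`.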
Parameters worth optimizing
| Category | Parameter | Type | Range |
|---|---|---|---|
| Regularization | k_penalty | log-float | [1e-5, 0.1] |
| Regularization | fr_penalty | float | [0, 1] |
| Regularization | bias_penalty | float | [0, 1] |
| Algorithm | algo | categorical | ga/beam/sa/ils/lasso/aco |
| Data | language | categorical | ter/bin/ratio combos |
| Data | data_type | categorical | raw/prev/log combos |
| Feature selection | prevalence_pct | float | [5, 50] |
| Feature selection | max_adj_pvalue | float | [0.01, 1.0] |
| GA | population_size | int | [500, 10000] |
| GA | max_epochs | int | [50, 500] |
| GA | mutated_children_pct | float | [50, 95] |
| SA | cooling_rate | float | [0.99, 0.9999] |
| SA | max_iterations | int | [1000, 50000] |
| ACO | n_ants | int | [50, 1000] |
| ACO | alpha/beta | float | [0.5, 5.0] |
| ACO | rho | float | [0.01, 0.5] |
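Several rows above only apply to one algorithm (GA, SA, or ACO), so it may be worth sampling them conditionally, so that a trial running `sa` does not spend search effort on `population_size`. Optuna's define-by-run API allows this kind of branching. The sketch below is a rough illustration under assumptions: the names `ALGO_SPECIFIC` and `suggest_conditional` are hypothetical, the grouping is read off the Category column, and `alpha/beta` is assumed to mean two separate parameters sharing one range.

```python
from typing import Any, Dict, Sequence

# Algorithm-specific ranges taken from the table above (grouping assumed
# from the Category column; "alpha/beta" split into two parameters).
ALGO_SPECIFIC = {
    "ga": {
        "population_size": (500, 10000),
        "max_epochs": (50, 500),
        "mutated_children_pct": (50.0, 95.0),
    },
    "sa": {
        "cooling_rate": (0.99, 0.9999),
        "max_iterations": (1000, 50000),
    },
    "aco": {
        "n_ants": (50, 1000),
        "alpha": (0.5, 5.0),
        "beta": (0.5, 5.0),
        "rho": (0.01, 0.5),
    },
}


def suggest_conditional(
    trial: Any,
    algos: Sequence[str] = ("ga", "beam", "sa", "ils", "lasso", "aco"),
) -> Dict[str, Any]:
    """Pick an algorithm, then sample only the parameters that apply to it."""
    algo = trial.suggest_categorical("algo", list(algos))
    params: Dict[str, Any] = {"algo": algo}

    # Algorithms without an entry (beam, ils, lasso) get no extra parameters here.
    for name, (low, high) in ALGO_SPECIFIC.get(algo, {}).items():
        if isinstance(low, int) and isinstance(high, int):
            params[name] = trial.suggest_int(name, low, high)
        else:
            params[name] = trial.suggest_float(name, low, high)
    return params
```

Shared parameters such as `k_penalty`, `language`, and `data_type` would still be sampled unconditionally on every trial.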
Web app integration
Add a "Tune" button in ParametersTab that:
- Opens a modal with parameter search space configuration
- Runs Optuna in the background (via worker.py)
- Shows convergence plot + parameter importance
- Provides an "Apply best" button that copies the best parameters found into the form
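For the convergence plot, the web app mainly needs the best score seen so far after each trial. Optuna ships `optuna.visualization.plot_optimization_history` and `plot_param_importances` for this, but if the front end draws its own chart, the series could be computed as in the sketch below. The function name `best_so_far` and the convention that a failed or pruned trial is reported as `None` are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def best_so_far(
    values: Sequence[Optional[float]], direction: str = "maximize"
) -> List[Optional[float]]:
    """Return the running best value after each trial.

    `values` holds one score per trial in completion order; None marks a
    trial with no score (assumed convention for failed or pruned trials).
    """
    if direction not in ("maximize", "minimize"):
        raise ValueError("direction must be 'maximize' or 'minimize'")
    pick = max if direction == "maximize" else min

    curve: List[Optional[float]] = []
    best: Optional[float] = None
    for value in values:
        if value is not None:
            best = value if best is None else pick(best, value)
        # Trials without a score carry the previous best forward.
        curve.append(best)
    return curve
```

A background job could append each finished trial's score to a list and send `best_so_far(scores)` to the modal whenever it polls for progress.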
Dependencies
optuna (pip install optuna)
optuna-dashboard (optional, for web visualization)