analysis: build 3-view moma by Piotr1619 · Pull Request #10 · Tom-Ellis-Lab/growth-bench

Piotr1619 · 2024-05-13T15:57:24Z

This PR adds a Jupyter notebook, run_triple_view_moma.ipynb, that runs the whole simulation (training and validation) for the 3-view MOMA model, which now takes three inputs: transcriptomics, fluxomics, and proteomics.

tdsone · 2024-05-15T15:20:04Z

the proteomics model loss makes me really optimistic that this could work - good job!
make sure to log the loss function and check that your model is learning (read up on how to interpret loss curves)
there is something broken with the training of the 3-view model
restructure your code into separate python files (a useful split is: model.py (model architecture), train.py (train + eval), preprocessing.py (data preprocessing code))
you could also think about having a separate folder for the proteomics model and the 3 view model
jupyter notebooks are a bit hard to review as you cannot track changes properly and they are inherently messy
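As a rough illustration of the "check that your model is learning" point, here is a minimal sketch of a loss-curve sanity check. It assumes the per-epoch losses are available as a dict with "loss" and optionally "val_loss" lists (the shape of Keras's `History.history`); the function name `summarise_loss`, the `window` size, and the `tolerance` threshold are hypothetical choices for illustration, not something taken from this PR.

```python
from statistics import mean


def summarise_loss(history, window=5, tolerance=0.01):
    """Summarise a training run from per-epoch losses.

    `history` is assumed to be a dict with a "loss" list and, optionally,
    a "val_loss" list, one value per epoch. `window` and `tolerance` are
    illustrative defaults, not tuned values.
    """
    loss = history["loss"]
    if len(loss) < 2 * window:
        raise ValueError("need at least 2 * window epochs to compare start and end")

    # Compare the average loss over the first and last `window` epochs.
    start = mean(loss[:window])
    end = mean(loss[-window:])
    relative_drop = (start - end) / start if start != 0 else 0.0

    report = {
        "relative_drop": relative_drop,
        # A flat curve (no meaningful drop) suggests the model is not learning.
        "learning": relative_drop > tolerance,
    }

    val = history.get("val_loss")
    if val:
        # Epoch with the lowest validation loss.
        best_epoch = min(range(len(val)), key=val.__getitem__)
        report["best_val_epoch"] = best_epoch
        # Validation loss rising after its minimum while training loss keeps
        # falling is a common sign of overfitting.
        report["possible_overfit"] = (
            val[-1] > val[best_epoch] * (1 + tolerance)
            and loss[-1] < loss[best_epoch]
        )
    return report
```

A flat "loss" list gives `learning: False`, which is the kind of signal that would point at a broken training loop. This is only a heuristic; looking at the plotted curves is still the more reliable check.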

looking forward to the next iteration

Piotr1619 · 2024-05-15T18:45:28Z

Thanks for your comments! Here is what I improved so far:
I split the code into the following structure: moma/model.py holds all the models used in moma. For our analysis, I created a ralser_moma folder containing ralser_preprocessing.py, ralser_train.py, and ralser_main.py.
To run the script, type: python bench/models/moma/ralser_moma/ralser_main.py
For the input data, you need: data/models/moma/yeast5k_impute_wide.csv and data/tasks/task3/yeast5k_growthrates_byORF.csv
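Since the script depends on these two files being in place, a small pre-flight check could fail early with a clear message. The sketch below uses the two paths listed above; the helper name `missing_inputs` and the `repo_root` parameter are hypothetical and are not part of the PR.

```python
from pathlib import Path

# Input files named in this comment, relative to the repository root.
REQUIRED_INPUTS = (
    "data/models/moma/yeast5k_impute_wide.csv",
    "data/tasks/task3/yeast5k_growthrates_byORF.csv",
)


def missing_inputs(repo_root=".", required=REQUIRED_INPUTS):
    """Return the required input paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [path for path in required if not (root / path).is_file()]
```

Calling `missing_inputs()` from the repository root before training returns an empty list when both files are present; a non-empty list names exactly which files still need to be downloaded.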

Currently, the script automatically saves the loss plot to data/models/moma/proteomics_model_loss.png and the weights to data/models/moma/proteomics_ralser.weights.h5

I'm happy to take any more suggestions 👍

bench/models/moma/ralser_moma/ralser_train.py

tdsone · 2024-05-16T08:21:31Z

I really like this PR! Was much easier to review and spot things. You can merge but make an issue or smth to add instructions about setting up the datasets in the README. Really looking forward to the final weeks. If you want to flex a bit: make a gallery of the MSE/loss curves and show them as before and after images in the subgroup meeting!

analysis: build 3-view moma

aae0185

Piotr1619 requested a review from tdsone May 13, 2024 15:57

Merge branch 'dev' of github.com:Tom-Ellis-Lab/growth-bench into dev

01dfd12

refactor the code for proteomics model

c792020

tdsone added 2 commits May 16, 2024 10:08

Sorts data by knockout

60cc9a9

Formatting and rename batches to batch_size

c95e093

tdsone reviewed May 16, 2024

bench/models/moma/ralser_moma/ralser_train.py (review thread: outdated, resolved)

Formatting with black

0a74a63

Piotr Gidzinski and others added 19 commits May 16, 2024 12:19

refactor: remove jupyter notebook for 3-view model analysis

ec53a4a

Adds wandb for logging

7f759a4

fix: add shuffling, normalise loss, use Adam optimiser

11bc069

Adds correlation metrics to wandb

4242fec

Logs predictions to wandb and plots predictions locally as scatter plot

2d2f84e

Scatter plot of predictions

6669c13

analysis: add custom loss/val_loss plots to wandb

f869d2b

analysis: add original culley model training, common train.py, add double-view model, improve ralser analysis

6d11c32

analysis: add three view moma, refactoring, moving common functions to moma train.py and processing.py

8f4fe09

analysis: add 3-view moma training-evaluation

403b2b2

fix: include new parameters for single-view model

c2b6239

fix: remove growth data from Ralser in Culley model, set radnom seed

3a3299e

feature: build single-view moma, enable building 3-outputs models, add hyperparams, refactor import, encapsulate wandb init

4099a38

feature: add pca analysis option, log multiple outputs metrics

40ef09c

refactor: run any model using one script, add cross-validation, add select_genes feature

99a9401

refactor: rename file, simplify pipeline, move plot to view, move split data to preprocessing, remove gene analysis

02552b2

refactor: separation of concerns for gateways and preprocessing steps; add tests for new components

51dc6e4

refactor: create splitter class, add tests

3e58958

refactor: create cv_data_splitter

fc2f106

Piotr Gidzinski added 5 commits August 22, 2024 18:34

refactor: create normaliser class

d65870c

refactor: create model constructor, add tests

2be0aad

refactor: create training module, optimiser factory, change return type for datasplitter and normaliser

2890c04

refactor: domain-driven development, created entities, gateway, repos and filters for them

804b1e8

refactor: remove old files

bce325d

Piotr1619 wants to merge 30 commits into main from dev