feature: integrate tile db loading in data_loader by VittorioRossi · Pull Request #6 · giacomo-ciro/diffusion-llms

VittorioRossi · 2025-08-23T18:48:40Z

No description provided.

giacomo-ciro

I would modify the dataloader to return:

return {
    "input_ids": input_ids,                # [max_length,]: LongTensor
    "eos_labels": eos_labels,              # [max_length,]: LongTensor
    "response_length": response_length,    # [1,]: FloatTensor
    "input_mask": input_mask,              # [max_length,]: BoolTensor
    "input_embeddings": input_embeddings,  # [max_length, d_model]: FloatTensor (new key)
}

Effectively, the input_ids won't be needed anymore, but I suggest we keep them just in case (the pipeline is already in place, just ignore them).

The models' forward passes (LLaDaRegressor.forward() and LLaDaClassifier.forward() in models/llada.py) should also be modified to take the embeddings directly as input and operate on them.

As of now, the model takes only input_ids and calls hidden_state = self.llada.get_last_hidden_state(input_ids). Since the embeddings are stored on disk, we can skip this step and set hidden_state = input_embeddings directly.
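The forward-pass change described above could look roughly like the following framework-free sketch. It is not the code in models/llada.py: the class name EmbeddingFirstModel, the toy_head function, and the callable backbone are hypothetical stand-ins, and plain Python lists take the place of tensors. It only illustrates the control flow: use the precomputed embeddings when they are supplied, and fall back to computing the hidden state from input_ids otherwise.

```python
from typing import Callable, Optional, Sequence

Vector = Sequence[float]        # one token embedding, length d_model
HiddenState = Sequence[Vector]  # [max_length, d_model]


class EmbeddingFirstModel:
    """Hypothetical stand-in for a regressor/classifier head on top of a backbone."""

    def __init__(
        self,
        head: Callable[[HiddenState], float],
        backbone: Optional[Callable[[Sequence[int]], HiddenState]] = None,
    ) -> None:
        self.head = head
        # Stand-in for self.llada.get_last_hidden_state (assumption).
        self.backbone = backbone

    def forward(
        self,
        input_embeddings: Optional[HiddenState] = None,
        input_ids: Optional[Sequence[int]] = None,
    ) -> float:
        if input_embeddings is not None:
            # Embeddings were loaded from disk: skip the backbone entirely.
            hidden_state = input_embeddings
        elif self.backbone is not None and input_ids is not None:
            # Fallback path: compute the hidden state from token ids.
            hidden_state = self.backbone(input_ids)
        else:
            raise ValueError("Provide input_embeddings, or input_ids plus a backbone.")
        return self.head(hidden_state)


def toy_head(hidden_state: HiddenState) -> float:
    """Toy head: sum each token's features, then average over positions."""
    return sum(sum(token) for token in hidden_state) / len(hidden_state)
```

Keeping input_ids as an optional argument matches the suggestion to leave them in the batch "just in case": the old path still works, but it is only taken when no embeddings are passed.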

giacomo-ciro · 2025-08-24T08:32:59Z

@VittorioRossi I wrote the trainer wrapper in trainer.py and the train.py script, which is now super clean (5 lines).

What remains is to adapt the models' forward pass and test everything. I haven't run my part yet because some pieces are missing, so there will be bugs, but we'll fix them once the whole refactor is done.

feature: integrate tile db loading in data_loader

afcc650

VittorioRossi requested a review from giacomo-ciro August 23, 2025 18:48

giacomo-ciro requested changes Aug 24, 2025

feat(trainer): create trainer class to wrap pl.trainer and streamline…

9aa0a29

… train.py

giacomo-ciro assigned VittorioRossi and giacomo-ciro Aug 24, 2025

giacomo-ciro and others added 5 commits August 24, 2025 10:35

feat(init pytest): to complete the test_train script

6775549

feat(uv): add pyproject.toml and fix requirements

494f11f

feature: add input_ids to data loader and adjust the models to accept…

fb5aeff

… embeddings and not compute them

fix(slurm): remove quotes from model_path argument to fix HF repo val…

175c59f

…idation

fix various bugs and create slurm to train

e4ebe60

VittorioRossi wants to merge 7 commits into main from vitto

3 participants
