Fix streaming validation infinite loop (#42) #45

Open
vominh1919 wants to merge 1 commit into PrimeIntellect-ai:main from vominh1919:fix/42-streaming-validation-infinite-loop

@vominh1919

Fixes #42

Problem: In train_diloco_torch.py, the validation dataset is loaded with streaming=True, creating an IterableDataset that lacks __len__. The DataLoader iterates infinitely in evaluate_model(), never terminating.

Solution: Limit the streaming validation dataset to 1000 samples using eval_dataset.take(1000). This caps evaluation to a reasonable sample size for perplexity testing while avoiding the infinite loop.

The change uses a hasattr(eval_dataset, "take") guard, so the fix is safe for non-streaming datasets (e.g., when using c4_tiny).
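For reviewers, here is a minimal sketch of the guarded cap described above. It is not the actual diff in train_diloco_torch.py. `FakeStreamingDataset` and `limit_eval_dataset` are hypothetical names introduced only for illustration. The stand-in class mimics just the `.take(n)` behaviour of a streaming dataset, and the non-streaming case is assumed to be an object without a `take` attribute (a plain list here).

```python
from itertools import count, islice


class FakeStreamingDataset:
    """Hypothetical stand-in for a streaming dataset: iterable, no __len__, has .take(n)."""

    def __init__(self, source_factory):
        # source_factory returns a fresh iterator each time the dataset is iterated.
        self._source_factory = source_factory

    def __iter__(self):
        return iter(self._source_factory())

    def take(self, n):
        # Return a new dataset that yields at most the first n samples.
        return FakeStreamingDataset(lambda: islice(self._source_factory(), n))


def limit_eval_dataset(eval_dataset, max_samples=1000):
    """Cap the dataset when it exposes .take(); otherwise return it unchanged."""
    if hasattr(eval_dataset, "take"):
        return eval_dataset.take(max_samples)
    return eval_dataset


# An endless stream stands in for the streaming validation split.
streaming = FakeStreamingDataset(lambda: count())
capped = limit_eval_dataset(streaming, max_samples=1000)
streaming_seen = sum(1 for _ in capped)  # terminates because of the cap

# An object without .take() (assumed here to model the non-streaming case) passes through.
in_memory = list(range(10))
unchanged = limit_eval_dataset(in_memory)
```

With the cap in place, iterating over `capped` stops after `max_samples` items, while the dataset without `take` is returned as the same object.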

Limit streaming validation dataset to 1000 samples using .take(1000).
IterableDataset has no __len__, causing DataLoader to loop forever
when used with streaming=True. This caps evaluation to a reasonable
sample size for perplexity testing.

Development

Successfully merging this pull request may close these issues.

Streaming validation dataset will lead to infinite loop
