docs: add dataset instructions and generation guide #140

Open
Shrishagk wants to merge 1 commit into ML4SCI:main from Shrishagk:docs/dataset-readme

Shrishagk commented Feb 17, 2026

Summary

This PR improves the repository documentation by explaining how to obtain the datasets required to run the notebooks in the DeepLense_Gravitational_Lens_Classification_Transformers_Dhruv_Srivastava project.

Currently, the notebooks fail on a fresh clone because the datasets are not included and the expected folder structure is not documented, so new users cannot tell whether a failure comes from the code or from missing data.

Changes

  • Added dataset format explanations

  • Added expected directory structures for each experiment

  • Added instructions to generate datasets using DeepLenseSim

  • Clarified the dataloader generation workflow

Why this is needed

Without these instructions:

  • Notebooks throw FileNotFoundError

  • Users cannot reproduce results

  • It is unclear that datasets must be generated externally

This PR makes the project reproducible for new contributors.
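
As an illustration of the failure mode described above, a notebook could check for its data before loading anything and fail with a message that points to the cause. The following is only a minimal sketch and is not part of this PR: the helper names `find_missing_dirs` and `require_dataset` are hypothetical, and the folder names passed to them are placeholders to be replaced with the directory structure given in the README.

```python
from pathlib import Path


def find_missing_dirs(root, expected_subdirs):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [name for name in expected_subdirs if not (root / name).is_dir()]


def require_dataset(root, expected_subdirs):
    """Raise a descriptive FileNotFoundError if any expected folder is missing.

    Both arguments are supplied by the caller; nothing here assumes a
    particular layout for the project's datasets.
    """
    missing = find_missing_dirs(root, expected_subdirs)
    if missing:
        raise FileNotFoundError(
            f"Dataset folders missing under {Path(root)}: {missing}. "
            "The data is not shipped with the repository; generate it first "
            "(for example with DeepLenseSim) and place it in this location."
        )
```

Calling something like `require_dataset("data", ["train", "val"])` at the top of a notebook (with placeholder names swapped for the documented ones) would turn an opaque mid-notebook `FileNotFoundError` into one that says the data still has to be generated.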

Type of Change

Documentation improvement (no code changes)

Testing

Verified that a new user can now understand:

  1. Where the data comes from

  2. How to generate it

  3. Where to place it

  4. Why dataloader files are missing
