Possible duplicated SDSS files in Zenodo dataset #18

@emilywu-567

Hi, thank you for releasing DeepAstroUDA.

I am trying to reproduce the SDSS → DECaLS experiment from the paper, but I noticed several issues regarding the released SDSS dataset on Zenodo.

Specifically:

1. sdss_1.h5 and sdss_2.h5 appear to be completely identical (same MD5 hash).
2. The class distributions seem inconsistent with the paper description. For example, one class contains only a few dozen samples, while the paper mentions that most classes should contain 1k–2.6k images, with the smallest class containing 334 images.
3. The paper describes the gravitational lens class as a target-only unknown class, but the released source-domain data appears to contain all 10 classes.
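
In case it helps to reproduce the first point, a check along the following lines can be used. This is only a minimal sketch: the helper names `md5sum` and `files_identical` are made up for illustration, and the paths in the usage comment assume both files were downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """True when both files have the same size and the same MD5 digest."""
    a, b = Path(path_a), Path(path_b)
    if a.stat().st_size != b.stat().st_size:
        return False  # different sizes can never be the same file
    return md5sum(a) == md5sum(b)


# Usage (assumed local paths, adjust to where the Zenodo files were saved):
# print(files_identical("sdss_1.h5", "sdss_2.h5"))
```

Reading in chunks keeps memory use flat even for large HDF5 files, and the size comparison avoids hashing at all when the files obviously differ.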

Because of this, I am unsure whether:

  1. the uploaded SDSS files are duplicated accidentally,
  2. the released dataset is only a reduced/demo subset,
  3. or some preprocessing/splitting step is missing from the repository.
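
To help tell these cases apart, the per-class counts in each file can be tallied and compared against the numbers in the paper. The following is a rough sketch under stated assumptions: the dataset key `"labels"` is a guess at how the HDF5 files are organised, the labels are assumed to be integer class indices (not one-hot vectors), and all three function names are hypothetical. `load_labels` needs the third-party `h5py` package.

```python
from collections import Counter


def class_counts(labels):
    """Count how many samples carry each integer class label."""
    return Counter(int(label) for label in labels)


def missing_classes(counts, expected_classes):
    """Return the expected class indices that have no samples at all."""
    return sorted(c for c in expected_classes if counts.get(c, 0) == 0)


def load_labels(path, key="labels"):
    """Read a label array from an HDF5 file.

    ASSUMPTION: the key name "labels" is hypothetical; inspect the file's
    keys first and pass the right one.
    """
    import h5py  # imported lazily so the counting helpers work without it

    with h5py.File(path, "r") as f:
        return f[key][:].tolist()


# Usage (assumed path and key):
# counts = class_counts(load_labels("sdss_1.h5"))
# print(sorted(counts.items()))
# print(missing_classes(counts, range(10)))
```

If the source-domain file really excluded the target-only class, `missing_classes` would report exactly one index for it; an empty result would match the observation that all 10 classes are present.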

Could you clarify which dataset should be used for reproducing the SDSS → DECaLS experiment, or whether the full source-domain dataset is available somewhere else?

Thank you very much.
