Fix: Prevent NaN errors and optimize tensor loading in LensingDataset #142

Open

kamilansri wants to merge 1 commit into ML4SCI:main from kamilansri:fix/lensing-dataset-nan-prevention
Conversation

@kamilansri

This PR addresses a critical mathematical bug in the LensingDataset normalization logic and optimizes how image arrays are loaded into PyTorch tensors. It also fixes fragile file path handling to prevent FileNotFoundErrors across different operating systems.

Changes Made

  • Zero-Division Prevention: Added a small epsilon (1e-8) to the denominator during min-max normalization. This prevents the tensor from filling with NaNs if an image happens to have uniform pixel values (where max == min).
  • Safe Path Handling: Replaced the hardcoded string concatenation (self.directory + selected_class + '/sim_%d.npy') with os.path.join() for safe, cross-platform path resolution.
  • Tensor Optimization: Swapped torch.tensor(np.array([np.load(...)])) for torch.from_numpy(np.load(...)).unsqueeze(0). This avoids allocating an intermediate NumPy array; from_numpy shares memory with the loaded array, speeding up the __getitem__ pipeline.
  • String Formatting: Modernized file path string formatting from %d to f-strings.
  • Whitespace Cleanup: Removed hidden non-breaking spaces that could cause Python syntax errors.
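The changes above can be sketched together in one `__getitem__`. This is a hypothetical, minimal illustration, not the actual class from the repository: the attribute names (`directory`, `classes`) and the index-to-file mapping are assumptions made for the sketch.

```python
import os

import numpy as np
import torch


class LensingDataset(torch.utils.data.Dataset):
    """Minimal sketch of the fixed loader; attribute names are illustrative."""

    def __init__(self, directory, classes):
        self.directory = directory
        self.classes = classes

    def __getitem__(self, idx):
        selected_class = self.classes[idx % len(self.classes)]
        # os.path.join instead of string concatenation: safe cross-platform
        # paths, with an f-string replacing the old %d formatting.
        path = os.path.join(self.directory, selected_class, f"sim_{idx}.npy")
        # from_numpy shares memory with the loaded array (no intermediate
        # np.array copy); unsqueeze(0) adds the channel dimension.
        img = torch.from_numpy(np.load(path)).unsqueeze(0).float()
        # Epsilon in the denominator keeps the result finite when max == min.
        img = (img - img.min()) / (img.max() - img.min() + 1e-8)
        return img
```

A uniform (all-equal-pixel) image now normalizes to all zeros instead of all NaNs, and non-uniform images are scaled to [0, 1] as before.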

Context

During training, a single blank or uniformly colored image in the dataset would cause the min-max normalizer to divide by zero, instantly poisoning model gradients with NaN values. Furthermore, the dataset loader was previously creating unnecessary memory copies of every image, which created a bottleneck during data loading. This refactor makes the dataloader safer, faster, and more robust.
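The failure mode is easy to reproduce in plain NumPy: for a uniform image, max == min, so the naive min-max denominator is zero and every pixel becomes 0/0, i.e. NaN. A small epsilon in the denominator keeps the result finite.

```python
import numpy as np

# A uniform image: every pixel has the same value, so max == min.
img = np.full((8, 8), 0.5, dtype=np.float32)

# Naive min-max normalization: denominator is 0, so each pixel is 0/0 -> NaN.
naive = (img - img.min()) / (img.max() - img.min())

# With the epsilon fix: denominator is 1e-8, so each pixel is 0/1e-8 -> 0.0.
safe = (img - img.min()) / (img.max() - img.min() + 1e-8)
```

One NaN anywhere in a batch propagates through the loss and gradients, so a single such image is enough to derail the whole training run.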
