Added outlier detection functions #36

Open
michaelkiper wants to merge 3 commits into emit-sds:develop from michaelkiper:outlier-detection


@michaelkiper

Collaborator

Overview

This PR adds a set of stand-alone functions for detecting outliers in input data. Specifically, it offers the following four methods:

  • z-score outliers
  • k-means cluster distance percentiles
  • Mahalanobis distance percentiles
  • Local Outlier Factor (k-NN with distance metrics other than Euclidean)
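
For readers unfamiliar with the first and third methods in the list above, here is a minimal NumPy sketch of z-score and Mahalanobis-percentile outlier flagging. It is an illustration only, not the code in this PR: the function names `zscore_outliers` and `mahalanobis_outliers`, the default threshold of 3.0, and the 99th-percentile cutoff are all assumptions. The k-means and Local Outlier Factor variants would more typically lean on scikit-learn (`sklearn.cluster.KMeans`, `sklearn.neighbors.LocalOutlierFactor`) and are not shown.

```python
import numpy as np


def zscore_outliers(X, threshold=3.0):
    """Hypothetical helper: indices of rows where any feature has |z| > threshold."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    z = np.abs((X - X.mean(axis=0)) / std)
    return np.flatnonzero((z > threshold).any(axis=1))


def mahalanobis_outliers(X, percentile=99.0):
    """Hypothetical helper: indices of rows whose Mahalanobis distance from the
    sample mean is above the given percentile of all distances."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    # Squared distance per row: centered_i^T @ cov_inv @ centered_i
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    dist = np.sqrt(np.maximum(d2, 0.0))
    return np.flatnonzero(dist > np.percentile(dist, percentile))


# Synthetic data with one planted outlier at row 5 (assumed example data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[5] = [12.0, -12.0, 12.0]

z_idx = zscore_outliers(X)
m_idx = mahalanobis_outliers(X, percentile=99.0)
```

Note the difference in behavior: the z-score rule uses a fixed threshold, so it can return no indices at all on clean data, while a percentile cutoff always flags a fixed fraction of the samples.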

The main show_outliers function outputs a PNG (see example below) that highlights all the outliers, and it also provides the indices of those outliers.

Related Tickets

resolves #28

@michaelkiper michaelkiper requested a review from Copilot November 5, 2025 18:40
@michaelkiper michaelkiper self-assigned this Nov 5, 2025
@michaelkiper michaelkiper added the enhancement New feature or request label Nov 5, 2025

Copilot AI left a comment

Contributor


Pull Request Overview

This PR adds outlier detection functionality to the cover-class repository and fixes a data type conversion issue in the training pipeline.

Key changes:

  • Adds a new outlier_detection.py module with multiple outlier detection methods (z-score, k-means, Mahalanobis distance, and LOF)
  • Fixes train label conversion to explicitly use LongTensor type
  • Documents the new outlier detection feature in README with usage examples

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/cover_class/outlier_detection.py | New module implementing four outlier detection methods with visualization capabilities |
| src/cover_class/train.py | Wraps train_labels with LongTensor conversion for proper type handling in dataloader |
| README.md | Adds documentation section for outlier detection feature with usage examples |


Comment thread src/cover_class/outlier_detection.py
Comment thread src/cover_class/outlier_detection.py
Comment thread src/cover_class/outlier_detection.py Outdated
Comment thread README.md Outdated
Comment thread src/cover_class/train.py
michaelkiper and others added 2 commits November 5, 2025 10:43
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Clean Endmember outliers

2 participants