This repository contains tooling, documentation, and experiments for clustering and interpreting a personal collection of ~7,000 images enriched with captions, metadata, and precomputed embeddings. The goal is to surface recurring visual motifs and themes that reflect personal visual preferences.
- Organize the dataset into meaningful clusters using both existing embeddings and any additional representations that improve structure.
- Leverage LLM-assisted sensemaking to label clusters and summarize prevailing concepts.
- Iterate on quantitative evaluations that validate cluster quality and capture trends over time.
- `docs/` – project documentation, changelog, research notes, and experiment journal.
- `src/analysis/` – Python modules for preprocessing, clustering, and visualization helpers.
- `requirements.txt` – curated dependency list for the experimentation environment.
- `data/` – generated artifacts (ignored by git); created by the preprocessing/clustering scripts.
Note: `eagle_images_rows.csv` is ignored by git due to its size and sensitivity. Place it at the repo root before running any scripts.
- Create and activate a virtual environment (e.g., `python -m venv .venv && source .venv/bin/activate`).
- Install dependencies: `pip install -r requirements.txt`.
- Ensure the CSV export from Eagle (`eagle_images_rows.csv`) lives at the project root.
All modules live under `src/analysis`. Add the directory to `PYTHONPATH` when executing modules, e.g. `PYTHONPATH=src python -m analysis.preprocess --help`.

The planned workflow is:
- Run `analysis.preprocess` to parse JSON columns, materialize embeddings as NumPy arrays, and serialize a cleaned metadata table.
- Experiment with `analysis.cluster` to perform UMAP dimensionality reduction followed by HDBSCAN (or other algorithms) and persist experiment outputs.
- Summarize results, plots, and LLM interpretations in `docs/cluster-journal.md`.
Personal research project – no license specified yet.