Bug / Feature: Balanced Playlist Selection — equal representation regardless of playlist size


## Summary

When multiple playlists are selected, songs are drawn from a single merged pool with equal probability per song. This means large playlists dominate the game: a 290-song playlist combined with a 30-song playlist will produce ~91% of rounds from the large one and only ~9% from the small one. Players who selected the small playlist for variety will barely hear any of its songs.

---

## Current Behavior (Code Analysis)

### Songs are merged into a flat list with no playlist tracking

**`views.py`** — all songs from all playlists are concatenated into one list:

```python
songs: list[dict] = []

for playlist_path in playlist_paths:
    playlist_data = json.loads(file_content)
    for song in playlist_data.get("songs", []):
        songs.append(song)   # ← flat merge, no playlist origin tracked

# Result: one big list passed to create_game(songs=songs)
```

**`playlist.py` — `PlaylistManager.get_next_song()`** picks randomly from the merged pool:

```python
def get_next_song(self) -> dict | None:
    available = [
        s for s in self._songs
        if get_song_uri(s, self._provider) not in self._played_uris
    ]
    if not available:
        return None
    return random.choice(available)   # ← uniform random from ALL songs
```

### Real-world impact with actual playlist sizes

| Playlist | Songs | Share in combined pool |
|---|---|---|
| Cologne Carnival 🎭 | 290 | 42 % |
| 80s Hits | 208 | 30 % |
| 100 Greatest Movie Themes 🎬 | 162 | 23 % |
| Gen Z Anthems | 30 | 4 % |
| **Combined (all 4)** | **690** | |

→ In a 15-round game combined from all four: **Gen Z gets ~0.65 songs on average** (15 × 30/690), Cologne Carnival gets ~6.3 (15 × 290/690).
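
The shares and per-game averages can be recomputed directly from the playlist sizes. The snippet below is a minimal check, assuming songs are drawn uniformly and without repeats from the merged pool; `pool_share` and `expected_rounds` are illustrative helper names, not part of the codebase.

```python
# Playlist sizes as listed in the table.
sizes = {
    "Cologne Carnival": 290,
    "80s Hits": 208,
    "100 Greatest Movie Themes": 162,
    "Gen Z Anthems": 30,
}
ROUNDS = 15
total = sum(sizes.values())  # size of the merged pool


def pool_share(name: str) -> float:
    """Fraction of the merged pool that belongs to one playlist."""
    return sizes[name] / total


def expected_rounds(name: str, rounds: int = ROUNDS) -> float:
    """Expected rounds from one playlist when drawing uniformly without repeats."""
    return rounds * sizes[name] / total


for name in sizes:
    print(f"{name:<28} {pool_share(name):6.1%}  {expected_rounds(name):5.2f} rounds")
```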

---

## Proposed Fix — Playlist-Aware Weighted Selection

Instead of picking from a flat pool, the `PlaylistManager` should:

1. **Track songs per source playlist**
2. **Pick a playlist first** (each playlist gets equal weight = 1/N), **then pick a random unplayed song from that playlist**
3. Skip playlists that are exhausted

With equal weights, every selected playlist contributes the same expected number of rounds regardless of its size, for as long as it still has unplayed songs.
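
The effect of the three steps above can be checked with a small simulation. This is a sketch under simplifying assumptions: songs are plain string IDs rather than song dicts, and `pick_balanced` and `simulate` are hypothetical helpers that only mirror the proposed playlist-first selection.

```python
import random
from collections import Counter


def pick_balanced(buckets: dict[str, list[str]], played: set[str], rng: random.Random):
    """Pick a non-exhausted playlist uniformly, then an unplayed song from it."""
    active = {name: [s for s in songs if s not in played] for name, songs in buckets.items()}
    active = {name: songs for name, songs in active.items() if songs}
    if not active:
        return None  # every playlist is exhausted
    name = rng.choice(sorted(active))
    song = rng.choice(active[name])
    played.add(song)
    return name, song


def simulate(sizes: dict[str, int], rounds: int = 15, games: int = 500, seed: int = 0) -> dict[str, float]:
    """Average number of rounds per playlist over many simulated games."""
    rng = random.Random(seed)
    buckets = {name: [f"{name}:{i}" for i in range(n)] for name, n in sizes.items()}
    counts: Counter[str] = Counter()
    for _ in range(games):
        played: set[str] = set()
        for _ in range(rounds):
            picked = pick_balanced(buckets, played, rng)
            if picked is None:
                break
            counts[picked[0]] += 1
    return {name: counts[name] / games for name in sizes}


averages = simulate(
    {"Cologne Carnival": 290, "80s Hits": 208, "100 Greatest Movie Themes": 162, "Gen Z Anthems": 30}
)
print(averages)
```

With four playlists and 15 rounds, each playlist should average close to 15 / 4 = 3.75 rounds per game, since none of them is exhausted within a single game.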

### Implementation

**`views.py`** — tag each song with its source playlist before merging:

```python
songs: list[dict] = []

for playlist_path in playlist_paths:
    playlist_data = json.loads(file_content)
    for song in playlist_data.get("songs", []):
        song = dict(song)
        song["_playlist_source"] = playlist_path  # ← track origin
        songs.append(song)
```

**`playlist.py` — `PlaylistManager`** — restructure to group songs by playlist:

```python
from collections import defaultdict


class PlaylistManager:
    def __init__(self, songs: list[dict], provider: str = PROVIDER_DEFAULT) -> None:
        self._provider = provider
        self._played_uris: set[str] = set()

        # Group songs by source playlist
        buckets: dict[str, list[dict]] = defaultdict(list)
        for song in songs:
            uri = get_song_uri(song, provider)
            if not uri:
                continue
            source = song.get("_playlist_source", "__default__")
            buckets[source].append(song)

        self._buckets: dict[str, list[dict]] = dict(buckets)
        self._single_pool = len(self._buckets) <= 1  # fallback: single playlist

        total = sum(len(v) for v in self._buckets.values())
        _LOGGER.info(
            "PlaylistManager: %d songs across %d playlist(s) for %s",
            total, len(self._buckets), provider,
        )

    def get_next_song(self) -> dict | None:
        if self._single_pool:
            return self._get_random_unplayed()

        # Balanced selection: pick a random non-exhausted playlist, then a song
        active_buckets = {
            k: [s for s in v if get_song_uri(s, self._provider) not in self._played_uris]
            for k, v in self._buckets.items()
        }
        active_buckets = {k: v for k, v in active_buckets.items() if v}

        if not active_buckets:
            return None  # all playlists exhausted

        # Equal weight per playlist regardless of size
        chosen_key = random.choice(list(active_buckets.keys()))  # noqa: S311
        song = random.choice(active_buckets[chosen_key])         # noqa: S311
        song_copy = song.copy()
        song_copy["_resolved_uri"] = get_song_uri(song, self._provider)
        return song_copy

    def _get_random_unplayed(self) -> dict | None:
        """Fallback: uniform random from single merged pool."""
        all_songs = [s for bucket in self._buckets.values() for s in bucket]
        available = [
            s for s in all_songs
            if get_song_uri(s, self._provider) not in self._played_uris
        ]
        if not available:
            return None
        song = random.choice(available)  # noqa: S311
        song_copy = song.copy()
        song_copy["_resolved_uri"] = get_song_uri(song, self._provider)
        return song_copy
```

---

## Deduplication — How Duplicate Songs Are Prevented

### Within a game session: `_played_uris` set (already exists)

The existing `mark_played(uri)` mechanism adds the resolved URI to a shared `set[str]`. In the proposed bucket-based selection, each bucket is filtered before picking:

```python
active_buckets = {
    k: [s for s in v if get_song_uri(s, self._provider) not in self._played_uris]
    for k, v in self._buckets.items()
}
```

Because `_played_uris` is **shared across all buckets**, a song played from Bucket A is automatically excluded from Bucket B in the next round — even if that same song exists in both playlists. No song can play twice. ✅
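
A tiny illustration of that shared filter, using bare URI strings in place of song dicts. The URIs and the `unplayed` helper are made up for the example, and adding to the set stands in for `mark_played(uri)`.

```python
# Two buckets that both contain the same URI.
buckets = {
    "80s Hits": ["uri:a", "uri:shared"],
    "Disco & Funk Classics": ["uri:shared", "uri:b"],
}
played_uris: set[str] = set()


def unplayed(buckets: dict[str, list[str]], played: set[str]) -> dict[str, list[str]]:
    """Filter every bucket against the one shared set of played URIs."""
    return {name: [u for u in uris if u not in played] for name, uris in buckets.items()}


played_uris.add("uri:shared")  # stands in for mark_played("uri:shared")
remaining = unplayed(buckets, played_uris)
print(remaining)
# {'80s Hits': ['uri:a'], 'Disco & Funk Classics': ['uri:b']}
```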

### Cross-playlist duplicates: 308 songs appear in multiple playlists

A scan of all bundled playlists reveals **308 songs that appear in 2 or more playlists** (e.g. a song in both "Disco & Funk Classics" and "80s Hits"). With the naive bucket approach, such a song would exist in two buckets and therefore have a **proportionally higher chance of being selected** — it can be reached from either bucket.

**Fix: deduplicate at `PlaylistManager` init time by URI**

```python
def __init__(self, songs: list[dict], provider: str = PROVIDER_DEFAULT) -> None:
    self._provider = provider
    self._played_uris: set[str] = set()

    seen_uris: set[str] = set()          # ← global dedup across all playlists
    buckets: dict[str, list[dict]] = defaultdict(list)

    for song in songs:
        uri = get_song_uri(song, provider)
        if not uri:
            continue
        if uri in seen_uris:
            continue                     # ← skip: already in another bucket
        seen_uris.add(uri)
        source = song.get("_playlist_source", "__default__")
        buckets[source].append(song)

    self._buckets = dict(buckets)
    self._single_pool = len(self._buckets) <= 1  # still needed by get_next_song()
```

**Result:** each unique URI appears in exactly one bucket: that of the first playlist, in `playlist_paths` order, that contained it. Cross-playlist duplicates are silently dropped from the later playlists, so a duplicated song no longer has an elevated chance of being selected, and every song plays at most once. ✅
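
To make the "first wins" rule concrete, here is a stripped-down sketch that works on `(playlist, uri)` pairs instead of song dicts. The function name `bucket_first_wins` and the sample URIs are illustrative only.

```python
from collections import defaultdict


def bucket_first_wins(tagged_songs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (playlist, uri) pairs by playlist, keeping each URI only where it was seen first."""
    seen: set[str] = set()
    buckets: dict[str, list[str]] = defaultdict(list)
    for playlist, uri in tagged_songs:
        if uri in seen:
            continue  # already owned by an earlier playlist
        seen.add(uri)
        buckets[playlist].append(uri)
    return dict(buckets)


tagged = [
    ("80s Hits", "uri:1"),
    ("80s Hits", "uri:2"),
    ("Disco & Funk Classics", "uri:2"),  # duplicate of an 80s Hits song
    ("Disco & Funk Classics", "uri:3"),
]
buckets = bucket_first_wins(tagged)
print(buckets)
# {'80s Hits': ['uri:1', 'uri:2'], 'Disco & Funk Classics': ['uri:3']}
```

Two consequences are visible here: the order in which playlists are processed decides which bucket owns a shared song, and later playlists shrink by the number of duplicates they contain. A playlist whose songs are all duplicates would end up with no bucket at all.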

**Alternative dedup strategy:** instead of "first playlist wins", each duplicate could be assigned to the playlist where it is most "at home" (e.g. by matching the playlist name against the song's genre tag). However, "first wins" is simple, deterministic, and sufficient.

---

## Edge Cases

| Scenario | Behavior |
|---|---|
| Single playlist selected | Falls back to existing uniform random (unchanged) |
| One playlist exhausted mid-game | Remaining rounds drawn from other playlists |
| All playlists exhausted | `get_next_song()` returns `None` → game ends (existing behavior) |
| Combined with `num_rounds` limit | Works transparently — balanced selection applies to however many rounds are played |
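
The exhaustion rows of the table can be exercised with a short script. It is a sketch that re-implements the playlist-first pick on plain URI strings; `next_song` is a hypothetical stand-in for the proposed `get_next_song()` and marks songs as played itself.

```python
import random


def next_song(buckets: dict[str, list[str]], played: set[str], rng: random.Random):
    """Playlist-first pick that skips exhausted playlists; returns None when all are empty."""
    active = {k: [u for u in v if u not in played] for k, v in buckets.items()}
    active = {k: v for k, v in active.items() if v}
    if not active:
        return None
    key = rng.choice(sorted(active))
    uri = rng.choice(active[key])
    played.add(uri)
    return key, uri


rng = random.Random(1)
buckets = {"small": ["s1", "s2"], "large": [f"l{i}" for i in range(10)]}
played: set[str] = set()

picks = []
while (pick := next_song(buckets, played, rng)) is not None:
    picks.append(pick)

# The small playlist runs out early; the remaining rounds all come from the large one.
print(len(picks), [key for key, _ in picks])
```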

---

## Optional Enhancement — Admin UI: Playlist Balance Mode

For power users, an optional toggle could expose the behavior:

```
Playlist Mix:
  ● Balanced  — equal rounds per playlist (proposed default)
  ○ Random    — current behavior, proportional to playlist size
```

This gives users who explicitly want more songs from a large playlist the option to opt out.
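
On the backend, the toggle could reduce to a single branch at the point where the playlist is chosen. The sketch below assumes the `balanced_playlists` flag arrives as a boolean; `choose_playlist` is a hypothetical helper, not existing code.

```python
import random


def choose_playlist(active: dict[str, list[str]], balanced: bool, rng: random.Random) -> str:
    """Choose which playlist supplies the next song.

    `active` maps playlist name -> unplayed songs and must be non-empty.
    """
    names = sorted(active)
    if balanced:
        # Balanced: every non-exhausted playlist is equally likely.
        return rng.choice(names)
    # Random: weight by remaining size, which matches a uniform pick
    # over all remaining songs once a song is drawn from the playlist.
    weights = [len(active[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)
active = {"large": [f"l{i}" for i in range(290)], "small": [f"s{i}" for i in range(30)]}
print(choose_playlist(active, balanced=True, rng=rng))
print(choose_playlist(active, balanced=False, rng=rng))
```

Keeping both modes behind one function would leave the rest of `get_next_song()` identical for either setting.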

---

## Affected Files

| File | Change |
|---|---|
| `server/views.py` | Tag each song with `_playlist_source` before merging |
| `game/playlist.py` | `PlaylistManager.__init__`: build per-playlist buckets; `get_next_song()`: playlist-first selection |
| `www/admin.html` | *(optional)* Balanced/Random toggle |
| `www/js/admin.js` | *(optional)* Send `balanced_playlists` flag in `startGame()` payload |

