docs: end-to-end book recommender walkthrough on goodbooks-10k by Burton-David · Pull Request #79 · Burton-David/Recommender-Systems

Burton-David · 2026-05-27T05:43:54Z

Closes #47. The final book-rec showcase artifact: a worked example on the docs site that mirrors the goodbooks benchmark pipeline at a smaller-than-benchmark scale.

docs/book_recommender_demo.md walks through:
- License caveat (research-only goodbooks-10k vs. the planned Open Library path in feat: Open Library metadata client (product path) #48)
- Loading ratings + the tag table
- Trimming with densest_subset(n_users=2500, n_items=3000)
- Per-user holdout via holdout_per_user (new in Add holdout_per_user split for fair top-N evaluation #71)
- Fitting build_hybrid_book_recommender(tags, max_features=200) (from feat: hybrid collaborative + content book recommender #75)
- Recommending and evaluating with precision_at_k / ndcg_at_k
- Pulling explanations with content.recommend_with_reasons(...) (from feat: explainable recommendations on ContentBased #78)

mkdocs.yml adds the new page to the nav between Quickstart and API Reference.
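
For readers unfamiliar with the split and metric steps listed above, here is a minimal stdlib-only sketch of what a per-user holdout and top-N metrics typically compute. The `toy_*` functions are hypothetical stand-ins written for this illustration; they are not the library's holdout_per_user, precision_at_k, or ndcg_at_k, whose signatures and edge-case handling may differ.

```python
import math
import random
from collections import defaultdict


def toy_holdout_per_user(ratings, test_size=0.2, seed=0):
    """Hold out a fraction of each user's items (at least one) as test data.

    `ratings` is an iterable of (user, item) pairs. Users with a single
    item keep it in train, so every test user also has training history.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in ratings:
        by_user[user].append(item)

    train, test = {}, {}
    for user, items in by_user.items():
        items = sorted(items)  # fixed order before shuffling, for reproducibility
        rng.shuffle(items)
        n_test = max(1, int(len(items) * test_size)) if len(items) > 1 else 0
        test[user] = set(items[:n_test])
        train[user] = set(items[n_test:])
    return train, test


def toy_precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are in the held-out set."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def toy_ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: discounted hits divided by the ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


# Toy data: one user with 10 items, one with 5
ratings = [("u1", i) for i in range(10)] + [("u2", i) for i in range(5)]
train, test = toy_holdout_per_user(ratings, test_size=0.2)
```

Splitting per user rather than globally means every evaluated user has both training history and held-out items, which is the property the demo relies on for a fair top-N comparison.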

Ordering caveat

Two of my open PRs supply APIs the demo uses:

feat: hybrid collaborative + content book recommender #75 introduces build_hybrid_book_recommender
feat: explainable recommendations on ContentBased #78 introduces ContentBased.recommend_with_reasons and explain

Code blocks in the demo are plain markdown — mkdocs build --strict is clean even without those landing — but the demo doesn't function end-to-end until both merge. Safe to land in any order; the published site is fully accurate once main has all three.

Local checks

mkdocs build --strict --site-dir /tmp/check   # clean
ruff check src tests scripts                  # clean
mypy                                          # clean
pytest                                        # 110 passed

Closes #47

A worked example for the docs site that mirrors the goodbooks benchmark pipeline at a smaller-than-benchmark scale: load the ratings and tag table, trim to the dense subset, per-user holdout split, fit the hybrid book recommender, recommend, evaluate, and explain. Mentions the research-only license caveat front-and-center. Closes #47

JohnJacob-coder

Good walkthrough — uses densest_subset for scale and holdout_per_user, and the structure (load → trim → hybrid → recommend → explain) is exactly right. Two blockers:

Depends on unmerged code. build_hybrid_book_recommender exists only in #75 (still changes-requested), and recommend_with_reasons comes from #78 (merging). Neither is on main, so the walkthrough's core would ImportError today. Sequence #79 to land after #75 and #78.
The benchmark claim overstates the hybrid. "HybridBook lands within a few percent of pure ItemKNN on every accuracy metric and matches it on catalog coverage" contradicts #75's actual numbers — there the hybrid was ~25-30% below ItemKNN on accuracy and below on coverage. Either align this to #75's reworked numbers once it's tuned, or reframe honestly as a trade-off (some accuracy for cold-start coverage). Don't ship a claim the benchmark doesn't support.

Hold this until #75 lands (tuned), then make the prose match the real numbers.

JJ called out that the previous prose ('within a few percent ... matches catalog coverage') overstated HybridBook vs ItemKNN. Reframes to be specific about which metrics are within a few percent (precision, coverage) and which are further behind (MAP/NDCG ~10%), and keeps the honest cold-start framing.

Burton-David · 2026-05-27T05:53:49Z

Pushed ca74377 — claim now reads 'within ~5% on precision and coverage, ~10% behind on NDCG and MAP' which matches #75's actual numbers. Agreed that #79 should wait for #75 — its merge will set the final claim, and I'll re-verify the numbers match.

JohnJacob-coder

Almost there — one required fix, then this is good.

Blocking (one word): line 3 reads "End-to-end walkthrough of the book-recommender showcase". We're scrubbing showcase/portfolio framing across the repo (the label's gone, #50 retitled, ROADMAP fixed) — the project should read as something built to be used, not a showpiece. Drop "showcase" here, e.g. "End-to-end walkthrough of the book recommender". This is the last public instance in this PR.

Verified good:

Gate clean: ruff / mypy / pytest pass; mkdocs build --strict exits 0.
API usage is correct: densest_subset(n_users=2500, n_items=3000), holdout_per_user(test_size=0.2, ...), and build_hybrid_book_recommender(tags, max_features=200) (max_features forwards to the TF-IDF vectorizer) all match the real signatures. recommender.recommenders[1] is the right way to reach the content component — .recommenders is a public attribute and content is index 1 in [collab, content]. recommend_with_reasons is on main (#78).
The benchmark claim now matches #75's numbers and framing — honest trade-off, no oversell.
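
As context for the max_features=200 argument mentioned above: a TF-IDF vectorizer's max_features option caps the vocabulary at the most frequent terms across the corpus. The sketch below shows that capping idea in plain Python. `capped_vocabulary` is a hypothetical helper written for this illustration, its tie-breaking is an assumption, and it is not how the repo or scikit-learn implements it.

```python
from collections import Counter


def capped_vocabulary(tag_docs, max_features):
    """Keep only the `max_features` most frequent terms across all documents."""
    counts = Counter(term for doc in tag_docs for term in doc.split())
    # Sort by descending count; break ties alphabetically so the result is
    # deterministic (an assumption of this sketch, not a library guarantee).
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:max_features]
    return sorted(term for term, _ in top)


# Toy tag strings, one per book (illustrative only)
tag_docs = [
    "fantasy magic dragons",
    "fantasy romance",
    "romance classic",
    "fantasy classic",
]
vocab = capped_vocabulary(tag_docs, max_features=3)
```

A smaller cap keeps the content model compact at the cost of dropping rare tags, which is the trade-off behind choosing a value like 200 for a demo-sized subset.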

Non-blocking: line 94 says "~10% behind on NDCG and MAP", but MAP is actually ~13% behind (NDCG ~9%). Same minor wording as #75 — tighten to "~10–15% on MAP" when convenient.
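
To make the "~N% behind" wording concrete, one common reading is a relative gap against the baseline score. The sketch below assumes that reading; the scores are placeholders chosen only to make the arithmetic easy to follow and are not the benchmark values from #75.

```python
def relative_gap(candidate, baseline):
    """Fraction by which `candidate` trails `baseline` (positive means behind)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - candidate) / baseline


# Placeholder scores for illustration only; NOT the numbers from #75
baseline_map = 0.100
hybrid_map = 0.087

gap = relative_gap(hybrid_map, baseline_map)
print(f"hybrid trails baseline by {gap:.0%}")
```

Stating the gap per metric, rather than as one rounded figure for several metrics, avoids the kind of wording drift flagged here.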

Sequencing: this imports build_hybrid_book_recommender (#75, approved and auto-merging), so it needs to land after #75 — its branch will pick that up on update. Re-request once line 3 is fixed and I'll merge it right after #75.

Two things in one merge commit:
- Resolves the mkdocs.yml nav conflict caused by #87's 'Choosing an algorithm' page landing on main: both entries stay, ordered after 'Beyond accuracy' and before the API reference.
- Drops the lingering 'book-recommender showcase' phrase JJ flagged on the demo's first line. The whole repo is moving off showcase/portfolio framing; this was the last public instance in this PR.

JohnJacob-coder

Fix confirmed — line 3 now reads "walkthrough of the book recommender" and there's no showcase/portfolio framing left in the doc. With #75 merged, build_hybrid_book_recommender is on main so the walkthrough's imports are valid. mkdocs build --strict exits 0 (nav conflict from #87 resolved cleanly). Everything else was verified on the prior pass (API usage, the recommenders[1] content access, honest numbers). LGTM.

JohnJacob-coder requested changes May 27, 2026

Burton-David and others added 7 commits May 27, 2026 02:08

merge: take #85's nav addition into #79

87fc40c

Merge branch 'main' into docs/book-recommender-demo

07d78d2

Merge branch 'main' into docs/book-recommender-demo

b65cb36

Merge branch 'main' into docs/book-recommender-demo

cf30f86

Merge branch 'main' into docs/book-recommender-demo

e53f330

Merge branch 'main' into docs/book-recommender-demo

1f1f84f

Merge branch 'main' into docs/book-recommender-demo

4f306cb

Burton-David requested a review from JohnJacob-coder May 27, 2026 18:00

Burton-David mentioned this pull request May 27, 2026

feat: hybrid collaborative + content book recommender #75

Merged

JohnJacob-coder requested changes May 27, 2026

Burton-David requested a review from JohnJacob-coder May 27, 2026 20:21

JohnJacob-coder approved these changes May 27, 2026

JohnJacob-coder merged commit 7ec43f4 into main May 27, 2026
3 checks passed

JohnJacob-coder deleted the docs/book-recommender-demo branch May 27, 2026 20:26

JohnJacob-coder merged 10 commits into main from docs/book-recommender-demo
