⚡ Bolt: [performance improvement] array allocations and regex by bartholomej · Pull Request #173 · bartholomej/node-csfd-api

bartholomej · 2026-04-15T20:21:42Z

💡 What:

Replaced occurrences of .split(' ').pop() with a new getLastWord helper that uses zero-allocation .lastIndexOf + .substring.
Replaced occurrences of .split('\n')[0] with a new getFirstLine helper that uses zero-allocation .indexOf + .substring.
Extracted inline regex /(\r\n|\n|\r|\t)/gm in getMovieTrivia and getMovieDescriptions into a module-level constant CLEAN_TEXT_REGEX.
Cleaned up temp benchmark files and recorded learnings.

🎯 Why:
node-html-parser makes DOM querying easy but parsing thousands of items can trigger heavy garbage collection if operations inside iteration loops frequently allocate intermediate arrays (e.g. String.split()). The same applies to recompiling inline Regular Expressions inside .map() closures.

📊 Impact:

str.split(' ').pop() vs lastIndexOf: ~78% faster in local micro-benchmarks.
str.split('\n')[0] vs indexOf: ~95% faster in local micro-benchmarks.
Inline regex vs constant regex: ~8% faster in local micro-benchmarks.
These optimizations provide a measurable boost and reduce GC pressure during large page extractions (e.g. User Reviews or User Ratings).

🔬 Measurement:
Verified correctness via full test suite pass (yarn test, 429 tests). Validated via custom scripts in sandbox simulating scraper high loads showing significant latency/throughput improvements.

PR created automatically by Jules for task 10677939650485183558 started by @bartholomej

Summary by CodeRabbit

Style
- Code formatting improvements across multiple modules for better readability.
Refactor
- Consolidated utility helper functions to reduce code duplication and improve maintainability throughout the codebase.

Replaced expensive string manipulations `str.split(' ').pop()` and `str.split('\n')[0]` with new fast zero-allocation helper functions `getLastWord` and `getFirstLine` using `lastIndexOf` and `indexOf`. Extracted inline regex into a module-level constant to prevent redundant compilation in loops. Co-authored-by: bartholomej <5861310+bartholomej@users.noreply.github.com>

google-labs-jules · 2026-04-15T20:21:43Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

coderabbitai · 2026-04-15T20:21:56Z

📝 Walkthrough

Walkthrough

This pull request refactors code formatting and introduces two new string utility helpers (getFirstLine and getLastWord) in the global helper module. Multiple files are updated to use these new helpers instead of inline string operations, and various formatting improvements are applied across bin files, services, and utilities without changing functional behavior.

Changes

Cohort / File(s)	Summary
Helper — String Utilities `src/helpers/global.helper.ts`	Added two new exported functions: `getFirstLine(text)` extracts text before the first newline, and `getLastWord(text)` extracts text after the final space, each returning empty string for falsy input or full text when delimiters absent.
Helper — Refactored String Extraction `src/helpers/creator.helper.ts`, `src/helpers/movie.helper.ts`, `src/helpers/search.helper.ts`, `src/helpers/user-ratings.helper.ts`, `src/helpers/user-reviews.helper.ts`	Updated multiple helpers to use `getFirstLine()` and `getLastWord()` instead of manual `.split()[0]` or `.split(' ').pop()` operations, improving code reuse and maintainability.
Helper — Callback Formatting `src/helpers/search-user.helper.ts`	Reformatted multi-line arrow function callback in filter predicate for improved readability without changing filtering logic.
CLI — Headers & Conditional Formatting `src/bin/export-reviews.ts`, `src/bin/lookup-movie.ts`, `src/bin/search.ts`	Refactored conditional logic and logging statements from single-line to multi-line formats, including nested ternaries and expanded `console.log` calls for better code layout.
Utilities — Color Helpers & URL Builders `src/bin/utils.ts`, `src/vars.ts`	Updated color helper function expressions with parentheses; reformatted URL-builder function signatures across multiple lines while preserving all parameters and logic.
Services — Console Warnings `src/services/movie.service.ts`, `src/services/user-ratings.service.ts`, `src/services/user-reviews.service.ts`	Expanded `console.warn` and function invocations to multi-line argument lists without changing message content, logic, or filtering behavior.
Configuration & Exports Cleanup `src/types.ts`, `src/index.ts`, `src/dto/options.ts`	Updated re-export quote style to single quotes, removed trailing empty lines, and ensured consistent newline endings across configuration files.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

PR #137: Modifies the same CLI files and helper functions (export-reviews.ts, utils.ts, vars.ts) with formatting and refactoring improvements.
PR #146: Updates src/helpers/movie.helper.ts with first-line extraction changes and text-cleaning refactors related to this PR's helper consolidation.
PR #92: Modifies src/helpers/global.helper.ts and creator.helper.ts, introducing or fixing helper exports that this PR extends with new utility functions.

Suggested labels

enhancement

Poem

🐰 A rabbit's refrain on cleaner code:
String helpers bundled, a lighter load,
From split to functions, the path we've trod,
Format and polish, each line applaud,
Cleaner and stronger, our code bestowed!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description provides clear context on what changed, why it matters, and impact metrics, but does not follow the template structure with checkboxes for type of change, related issues, or a completion checklist.	Add sections for Type of change, Related Issues, and Checklist to match the required template format while preserving the detailed explanation already provided.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title mentions performance improvement and array allocations/regex, which directly align with the PR's core objective of reducing allocations and regex recompilation, though it is somewhat vague with the emoji and generic formatting.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bolt-perf-string-allocation-10677939650485183558

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov-commenter · 2026-04-15T20:23:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.82%. Comparing base (bebd16c) to head (a041b52).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #173      +/-   ##
==========================================
+ Coverage   98.80%   98.82%   +0.02%     
==========================================
  Files          34       34              
  Lines         755      769      +14     
  Branches      191      196       +5     
==========================================
+ Hits          746      760      +14     
  Misses          9        9

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/helpers/search-user.helper.ts (3)
4-6: ⚠️ Potential issue | 🔴 Critical

Missing null check before accessing element properties.

The function assumes querySelector('.user-title-name') will always return an element, but it can return null if the selector doesn't match. This will cause a runtime error when accessing .text. As per coding guidelines, helpers should never assume an element exists since CSFD changes layouts.
🛡️ Proposed fix using optional chaining
 export const getUser = (el: HTMLElement): string => {
-  return el.querySelector('.user-title-name').text;
+  return el.querySelector('.user-title-name')?.text ?? null;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/helpers/search-user.helper.ts` around lines 4 - 6, The getUser helper
currently assumes querySelector('.user-title-name') always returns an element;
update the getUser function to safely handle null by checking the result of
el.querySelector('.user-title-name') (or use optional chaining) before accessing
.text and return a sensible fallback (e.g., empty string) when the element or
its text is missing; ensure the change is made inside the getUser function so
callers get a stable string result instead of throwing.
20-23: ⚠️ Potential issue | 🔴 Critical

Missing null checks before accessing nested properties.

The function chains querySelector('.article-img img').attributes.src without null safety. If the selector fails to match or the src attribute is missing, this will throw a runtime error. As per coding guidelines, helpers must handle cases where elements don't exist since CSFD layouts change.
🛡️ Proposed fix using optional chaining
 export const getAvatar = (el: HTMLElement): string => {
-  const image = el.querySelector('.article-img img').attributes.src;
+  const image = el.querySelector('.article-img img')?.attributes?.src;
   return addProtocol(image);
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/helpers/search-user.helper.ts` around lines 20 - 23, The getAvatar helper
currently assumes querySelector('.article-img img') and its src attribute always
exist, which can throw if the element or attribute is missing; update getAvatar
to null-check the result of el.querySelector('.article-img img') (or use
optional chaining) and use element.getAttribute('src') safely, providing a
sensible fallback (e.g. empty string) before calling addProtocol so addProtocol
always receives a string and runtime errors are prevented.
25-27: ⚠️ Potential issue | 🔴 Critical

Missing null checks before accessing element attributes.

Similar to getUser, this function assumes querySelector('.user-title-name') will always succeed and that the href attribute exists. This will fail if the element is not found or the attribute is missing. As per coding guidelines, helpers should use optional chaining or try/catch for robust scraping.
🛡️ Proposed fix using optional chaining
 export const getUserUrl = (el: HTMLElement): string => {
-  return el.querySelector('.user-title-name').attributes.href;
+  return el.querySelector('.user-title-name')?.attributes?.href ?? null;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/helpers/search-user.helper.ts` around lines 25 - 27, getUserUrl currently
assumes document.querySelector('.user-title-name') and its href attribute always
exist; update getUserUrl to safely handle missing nodes/attributes by using
optional chaining and a safe fallback (e.g., return '' or null) instead of
directly accessing attributes.href, or wrap the lookup in a try/catch and return
the fallback on error; reference the getUserUrl function and the
'.user-title-name' selector so the change checks for element existence and
attribute presence before returning the URL.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/helpers/search.helper.ts`:
- Around line 33-36: getSearchColorRating currently assumes
el.querySelector('.article-header i.icon') exists and directly reads
.classNames; make this null-safe by using optional chaining to get the icon
element and its class string (e.g., const iconClass =
el.querySelector('.article-header i.icon')?.className) and only call
getLastWord/parseColor when iconClass is present; if the icon is missing,
short-circuit and return a safe default CSFDColorRating (or call parseColor with
a known fallback) so parseColor, getLastWord and getSearchColorRating all handle
missing elements without throwing.

In `@src/helpers/user-ratings.helper.ts`:
- Around line 12-15: The code reads classNames off
el.querySelector('td.star-rating-only .stars') without guarding for a missing
node, causing crashes; update the logic in user-ratings.helper (the ratingText
and rating computation around getLastWord and CSFDStars) to first check the
querySelector result (using optional chaining or a try/catch) before accessing
.classNames, defaulting ratingText to an empty string (or similar safe value) so
rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0 still
returns a valid CSFDStars without throwing.

---

Outside diff comments:
In `@src/helpers/search-user.helper.ts`:
- Around line 4-6: The getUser helper currently assumes
querySelector('.user-title-name') always returns an element; update the getUser
function to safely handle null by checking the result of
el.querySelector('.user-title-name') (or use optional chaining) before accessing
.text and return a sensible fallback (e.g., empty string) when the element or
its text is missing; ensure the change is made inside the getUser function so
callers get a stable string result instead of throwing.
- Around line 20-23: The getAvatar helper currently assumes
querySelector('.article-img img') and its src attribute always exist, which can
throw if the element or attribute is missing; update getAvatar to null-check the
result of el.querySelector('.article-img img') (or use optional chaining) and
use element.getAttribute('src') safely, providing a sensible fallback (e.g.
empty string) before calling addProtocol so addProtocol always receives a string
and runtime errors are prevented.
- Around line 25-27: getUserUrl currently assumes
document.querySelector('.user-title-name') and its href attribute always exist;
update getUserUrl to safely handle missing nodes/attributes by using optional
chaining and a safe fallback (e.g., return '' or null) instead of directly
accessing attributes.href, or wrap the lookup in a try/catch and return the
fallback on error; reference the getUserUrl function and the '.user-title-name'
selector so the change checks for element existence and attribute presence
before returning the URL.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5af8e59d-772b-412b-901e-242d1bc0e101

📥 Commits

Reviewing files that changed from the base of the PR and between bebd16c and a041b52.

📒 Files selected for processing (18)

src/bin/export-reviews.ts
src/bin/lookup-movie.ts
src/bin/search.ts
src/bin/utils.ts
src/dto/options.ts
src/helpers/creator.helper.ts
src/helpers/global.helper.ts
src/helpers/movie.helper.ts
src/helpers/search-user.helper.ts
src/helpers/search.helper.ts
src/helpers/user-ratings.helper.ts
src/helpers/user-reviews.helper.ts
src/index.ts
src/services/movie.service.ts
src/services/user-ratings.service.ts
src/services/user-reviews.service.ts
src/types.ts
src/vars.ts

💤 Files with no reviewable changes (1)

src/index.ts

coderabbitai · 2026-04-15T20:28:41Z

 export const getSearchColorRating = (el: HTMLElement): CSFDColorRating => {
  return parseColor(
-    el.querySelector('.article-header i.icon').classNames.split(' ').pop() as CSFDColors
+    getLastWord(el.querySelector('.article-header i.icon').classNames) as CSFDColors
  );


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the unsafe dereference in this file. rg -n -C2 "querySelector\\('\\.article-header i\\.icon'\\)\\.classNames" src/helpers/search.helper.ts

Repository: bartholomej/node-csfd-api

Length of output: 273

🏁 Script executed:

#!/bin/bash # Check for other unsafe querySelector patterns in the file rg -n "querySelector\(" src/helpers/search.helper.ts | head -20

Repository: bartholomej/node-csfd-api

Length of output: 503

🏁 Script executed:

#!/bin/bash # Check the imports and how getLastWord/parseColor handle edge cases head -50 src/helpers/search.helper.ts

Repository: bartholomej/node-csfd-api

Length of output: 1748

Add null-safe handling for icon class extraction.

querySelector('.article-header i.icon') can return null; direct .classNames access will throw at runtime.

Suggested fix

export const getSearchColorRating = (el: HTMLElement): CSFDColorRating => { - return parseColor( - getLastWord(el.querySelector('.article-header i.icon').classNames) as CSFDColors - ); + const iconClasses = el.querySelector('.article-header i.icon')?.classNames ?? ''; + return parseColor(getLastWord(iconClasses) as CSFDColors); };

Per guideline: "Never assume an element exists. CSFD changes layouts. Use optional chaining ?. or try/catch inside helpers for robust scraping."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/helpers/search.helper.ts` around lines 33 - 36, getSearchColorRating currently assumes el.querySelector('.article-header i.icon') exists and directly reads .classNames; make this null-safe by using optional chaining to get the icon element and its class string (e.g., const iconClass = el.querySelector('.article-header i.icon')?.className) and only call getLastWord/parseColor when iconClass is present; if the icon is missing, short-circuit and return a safe default CSFDColorRating (or call parseColor with a known fallback) so parseColor, getLastWord and getSearchColorRating all handle missing elements without throwing.

coderabbitai · 2026-04-15T20:28:41Z

+  const ratingText = getLastWord(el.querySelector('td.star-rating-only .stars').classNames);

  const rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0;
  return rating as CSFDStars;


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the unsafe dereference in this file. rg -n -C2 "querySelector\\('td\\.star-rating-only \\.stars'\\)\\.classNames" src/helpers/user-ratings.helper.ts

Repository: bartholomej/node-csfd-api

Length of output: 324

Guard .stars lookup before reading classNames.

This will throw when the rating node is missing, making the parser brittle against CSFD layout changes. The helper must use optional chaining or try/catch per the robustness requirements for this directory.

Suggested fix

export const getUserRating = (el: HTMLElement): CSFDStars => { - const ratingText = getLastWord(el.querySelector('td.star-rating-only .stars').classNames); + const starsClass = el.querySelector('td.star-rating-only .stars')?.classNames ?? ''; + const ratingText = getLastWord(starsClass); const rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0; return rating as CSFDStars; };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const ratingText = getLastWord(el.querySelector('td.star-rating-only .stars').classNames);

const rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0;

return rating as CSFDStars;

export const getUserRating = (el: HTMLElement): CSFDStars => {

const starsClass = el.querySelector('td.star-rating-only .stars')?.classNames ?? '';

const ratingText = getLastWord(starsClass);

const rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0;

return rating as CSFDStars;

};

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/helpers/user-ratings.helper.ts` around lines 12 - 15, The code reads classNames off el.querySelector('td.star-rating-only .stars') without guarding for a missing node, causing crashes; update the logic in user-ratings.helper (the ratingText and rating computation around getLastWord and CSFDStars) to first check the querySelector result (using optional chaining or a try/catch) before accessing .classNames, defaulting ratingText to an empty string (or similar safe value) so rating = ratingText.includes('stars-') ? +ratingText.split('-').pop() : 0 still returns a valid CSFDStars without throwing.

coderabbitai bot reviewed Apr 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

⚡ Bolt: [performance improvement] array allocations and regex#173

⚡ Bolt: [performance improvement] array allocations and regex#173
bartholomej wants to merge 1 commit intomasterfrom
bolt-perf-string-allocation-10677939650485183558

bartholomej commented Apr 15, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

google-labs-jules bot commented Apr 15, 2026

Uh oh!

coderabbitai bot commented Apr 15, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (2 warnings)

Uh oh!

codecov-commenter commented Apr 15, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 15, 2026

Uh oh!

coderabbitai bot Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bartholomej commented Apr 15, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

google-labs-jules bot commented Apr 15, 2026

Uh oh!

coderabbitai bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (2 warnings)

Uh oh!

codecov-commenter commented Apr 15, 2026

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bartholomej commented Apr 15, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 15, 2026 •

edited

Loading