58 changes: 36 additions & 22 deletions src/helpers/search.helper.ts
@@ -43,28 +43,42 @@ export const getSearchOrigins = (el: HTMLElement): string[] => {
 };

 export const parseSearchPeople = (
-  el: HTMLElement,
-  type: 'directors' | 'actors'
-): CSFDMovieCreator[] => {
-  let who: Creator;
-  if (type === 'directors') who = 'Režie:';
-  if (type === 'actors') who = 'Hrají:';
-
-  const peopleNode = Array.from(el && el.querySelectorAll('.article-content p')).find((el) =>
-    el.textContent.includes(who)
-  );
+  el: HTMLElement
+): { directors: CSFDMovieCreator[]; actors: CSFDMovieCreator[] } => {
+  const result = {
+    directors: [] as CSFDMovieCreator[],
+    actors: [] as CSFDMovieCreator[]
+  };
+
+  if (!el) return result;
+
+  // Optimization: Traverse `.article-content p` once to find both directors and actors
+  const articleContent = el.querySelector('.article-content');
+  if (!articleContent) return result;
+
+  const pNodes = articleContent.querySelectorAll('p');

-  if (peopleNode) {
-    const people = Array.from(peopleNode.querySelectorAll('a')) as unknown as HTMLElement[];
-
-    return people.map((person) => {
-      return {
-        id: parseIdFromUrl(person.attributes.href),
-        name: person.innerText.trim(),
-        url: `https://www.csfd.cz${person.attributes.href}`
-      };
-    });
-  } else {
-    return [];
+  for (const node of pNodes) {
+    const text = node.textContent;
+    let targetGroup: CSFDMovieCreator[] | null = null;
+
+    if (text.includes('Režie:')) {
+      targetGroup = result.directors;
+    } else if (text.includes('Hrají:')) {
+      targetGroup = result.actors;
+    }
+
+    if (targetGroup) {
+      const people = node.querySelectorAll('a');
+      for (const person of people) {
+        targetGroup.push({
+          id: parseIdFromUrl(person.attributes.href),
+          name: person.innerText.trim(),
+          url: `https://www.csfd.cz${person.attributes.href}`
+        });
Comment on lines +73 to +78

⚠️ Potential issue | 🟠 Major

Guard missing href before ID/URL parsing

Lines 75 and 77 assume every <a> has an href. If CSFD markup changes, a missing href can break parsing instead of the malformed entry being safely skipped.

Suggested fix
       const people = node.querySelectorAll('a');
       for (const person of people) {
+        const href = person.attributes?.href;
+        if (!href) continue;
         targetGroup.push({
-          id: parseIdFromUrl(person.attributes.href),
+          id: parseIdFromUrl(href),
           name: person.innerText.trim(),
-          url: `https://www.csfd.cz${person.attributes.href}`
+          url: `https://www.csfd.cz${href}`
         });
       }

As per coding guidelines: "Never assume an element exists. CSFD changes layouts. Use optional chaining ?. or try/catch inside helpers for robust scraping."

📝 Committable suggestion

Suggested change
-      for (const person of people) {
-        targetGroup.push({
-          id: parseIdFromUrl(person.attributes.href),
-          name: person.innerText.trim(),
-          url: `https://www.csfd.cz${person.attributes.href}`
-        });
+      for (const person of people) {
+        const href = person.attributes?.href;
+        if (!href) continue;
+        targetGroup.push({
+          id: parseIdFromUrl(href),
+          name: person.innerText.trim(),
+          url: `https://www.csfd.cz${href}`
+        });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/helpers/search.helper.ts` around lines 73 - 78, The loop in
search.helper.ts assumes every person element has an href, which can throw in
parseIdFromUrl and when building the URL; update the loop over people to first
check person.attributes?.href (or equivalent truthy check) and skip entries
missing href, and wrap the parseIdFromUrl call in a try/catch or guard so
malformed hrefs don't break parsing; specifically, modify the block that pushes
into targetGroup (references: people, targetGroup, parseIdFromUrl) to only push
when href exists and parseIdFromUrl succeeds, otherwise continue.

+      }
+    }
   }
+
+  return result;
 };
5 changes: 1 addition & 4 deletions src/services/search.service.ts
@@ -56,10 +56,7 @@ export class SearchScraper {
         colorRating: getSearchColorRating(m),
         poster: getSearchPoster(m),
         origins: getSearchOrigins(m),
-        creators: {
-          directors: parseSearchPeople(m, 'directors'),
-          actors: parseSearchPeople(m, 'actors')
-        }
+        creators: parseSearchPeople(m)
       };
     };

27 changes: 13 additions & 14 deletions tests/search.test.ts
@@ -147,8 +147,8 @@ describe('Get Movie origins', () => {

 describe('Get Movie creators', () => {
   test('First movie directors', () => {
-    const movie = parseSearchPeople(moviesNode[0], 'directors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([
+    const movie = parseSearchPeople(moviesNode[0]);
+    expect(movie.directors).toEqual<CSFDMovieCreator[]>([
       {
         id: 3112,
         name: 'Lilly Wachowski',
@@ -162,8 +162,8 @@ describe('Get Movie creators', () => {
     ]);
   });
   test('Last movie actors', () => {
-    const movie = parseSearchPeople(moviesNode[moviesNode.length - 1], 'actors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([
+    const movie = parseSearchPeople(moviesNode[moviesNode.length - 1]);
+    expect(movie.actors).toEqual<CSFDMovieCreator[]>([
       {
         id: 101,
         name: 'Carrie-Anne Moss',
@@ -295,8 +295,8 @@ describe('Get TV series origins', () => {

 describe('Get TV series creators', () => {
   test('First TV series directors', () => {
-    const movie = parseSearchPeople(tvSeriesNode[0], 'directors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([
+    const movie = parseSearchPeople(tvSeriesNode[0]);
+    expect(movie.directors).toEqual<CSFDMovieCreator[]>([
       {
         id: 8877,
         name: 'Allan Eastman',
@@ -310,8 +310,8 @@ describe('Get TV series creators', () => {
     ]);
   });
   test('Last TV series actors', () => {
-    const movie = parseSearchPeople(tvSeriesNode[tvSeriesNode.length - 1], 'actors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([
+    const movie = parseSearchPeople(tvSeriesNode[tvSeriesNode.length - 1]);
+    expect(movie.actors).toEqual<CSFDMovieCreator[]>([
       {
         id: 74751,
         name: 'Takeru Sató',
@@ -325,20 +325,19 @@ describe('Get TV series creators', () => {
     ]);
   });
   test('Empty directors', () => {
-    const movie = parseSearchPeople(tvSeriesNode[3], 'directors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([]);
+    const movie = parseSearchPeople(tvSeriesNode[3]);
+    expect(movie.directors).toEqual<CSFDMovieCreator[]>([]);
   });
   test('Empty directors + some actors', () => {
-    const movie = parseSearchPeople(tvSeriesNode[3], 'actors');
-    const movieDirectors = parseSearchPeople(tvSeriesNode[3], 'directors');
-    expect(movie).toEqual<CSFDMovieCreator[]>([
+    const movie = parseSearchPeople(tvSeriesNode[3]);
+    expect(movie.actors).toEqual<CSFDMovieCreator[]>([
       {
         id: 61834,
         name: 'David Icke',
         url: 'https://www.csfd.cz/tvurce/61834-david-icke/prehled/'
       }
     ]);
-    expect(movieDirectors).toEqual<CSFDMovieCreator[]>([]);
+    expect(movie.directors).toEqual<CSFDMovieCreator[]>([]);
   });
 });
