Some entries don't load metadata properly #14

@Rakambda

Description

Sometimes, some feed entries don't display properly and are skipped by the plugin.

Example: https://www.reddit.com/user/neo3dofficial/submitted/.rss?sort=new
At some point this error happens (NO METADATA is a log line I added in the isValid method):

[Wed, 13 Sep 2023 18:55:16 +0200] [error] --- NO METADATA
[Wed, 13 Sep 2023 18:55:16 +0200] [error] --- RedditImage\Exception\InvalidContentException:   submitted by   <a href="https://www.reddit.com/user/neo3dofficial"> /u/neo3dofficial </a>   to   <a href="https://www.reddit.com/r/DigitalArt/"> r/DigitalArt </a> <br> <span><a href="https://i.redd.it/d88vk9o3ehgb1.jpg">[link]</a></span>   <span><a href="https://www.reddit.com/r/DigitalArt/comments/15jo38m/chroma_abstract_wallpaper_pack/">[comments]</a></span> in /app/www/extensions/xExtension-RedditImage/Content.php:29
Stack trace:
#0 /app/www/extensions/xExtension-RedditImage/Processor/BeforeInsertProcessor.php(51): RedditImage\Content->__construct()
#1 /app/www/lib/Minz/ExtensionManager.php(338): RedditImage\Processor\BeforeInsertProcessor->process()
#2 /app/www/lib/Minz/ExtensionManager.php(309): Minz_ExtensionManager::callOneToOne()
#3 /app/www/app/Controllers/feedController.php(483): Minz_ExtensionManager::callHook()
#4 /app/www/app/Controllers/feedController.php(653): FreshRSS_feed_Controller::actualizeFeed()
#5 /app/www/lib/Minz/Dispatcher.php(119): FreshRSS_feed_Controller->actualizeAction()
#6 /app/www/lib/Minz/Dispatcher.php(46): Minz_Dispatcher->launchAction()
#7 /app/www/lib/Minz/FrontController.php(58): Minz_Dispatcher->run()
#8 /app/www/p/i/index.php(57): Minz_FrontController->run()
#9 {main}

If I make the metadata regex a bit more tolerant with #(?P<metadata>\s+submitted.*</span>)#, the error is gone.
The entry is then actually processed by the transformers and the image is inlined. Before the change, the exception caused the entry to skip the processors entirely.
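
To make the relaxed pattern easier to check outside the plugin, here is a minimal Python sketch that applies it to the content from the first log. It is an approximation, not the plugin's code: the PCRE `#` delimiters are dropped, Python's `re.search` stands in for PHP's `preg_match`, the helper name `extract_metadata` is hypothetical, and the two leading spaces of the content are an assumption read off the log line.

```python
import re

# Relaxed metadata pattern quoted in this issue (PCRE `#` delimiters dropped).
RELAXED_METADATA = re.compile(r"(?P<metadata>\s+submitted.*</span>)")

# Entry content copied from the first log above.
# Assumption: the content starts with two spaces, as the log line suggests.
content = (
    '  submitted by   <a href="https://www.reddit.com/user/neo3dofficial"> /u/neo3dofficial </a>'
    '   to   <a href="https://www.reddit.com/r/DigitalArt/"> r/DigitalArt </a> <br>'
    ' <span><a href="https://i.redd.it/d88vk9o3ehgb1.jpg">[link]</a></span>'
    '   <span><a href="https://www.reddit.com/r/DigitalArt/comments/15jo38m/'
    'chroma_abstract_wallpaper_pack/">[comments]</a></span>'
)


def extract_metadata(text):
    """Return the matched metadata block, or None when the pattern fails."""
    match = RELAXED_METADATA.search(text)
    return match.group("metadata") if match else None


metadata = extract_metadata(content)
print(metadata is not None)
```

Because the pattern only requires whitespace before `submitted` and a closing `</span>` somewhere later on the same line, it accepts this entry in the sketch, which is consistent with the error disappearing.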

However, this doesn't seem to handle all cases.
Example (NSFW): https://www.reddit.com/user/throwmeaway896/submitted/.rss?sort=new
With this feed, even with the modified regex, some entries fail to match (although the image was already added by the BeforeInsertProcessor):

[Wed, 13 Sep 2023 19:01:09 +0200] [error] --- NO METADATA
[Wed, 13 Sep 2023 19:01:09 +0200] [error] --- RedditImage\Exception\InvalidContentException: <div class="reddit-image figure"><!--xExtension-RedditImage/1.1.1 | RedditImage\Processor\BeforeInsertProcessor | RedditImage\Transformer\Agnostic\ImageTransformer--><img src="https://i.redd.it/rtuzzy7h9nnb1.jpg" class="reddit-image"></div>
  submitted by   <a href="https://www.reddit.com/user/throwmeaway896"> /u/throwmeaway896 </a>   to   <a href="https://www.reddit.com/r/phgonewild/"> r/phgonewild </a> <br> <span><a href="https://i.redd.it/rtuzzy7h9nnb1.jpg">[link]</a></span>   <span><a href="https://www.reddit.com/r/phgonewild/comments/16fy4tb/what_if_nasa_kama_mo_ako_now/">[comments]</a></span> in /app/www/extensions/xExtension-RedditImage/Content.php:29
Stack trace:
#0 /app/www/extensions/xExtension-RedditImage/Processor/BeforeDisplayProcessor.php(43): RedditImage\Content->__construct()
#1 /app/www/lib/Minz/ExtensionManager.php(338): RedditImage\Processor\BeforeDisplayProcessor->process()
#2 /app/www/lib/Minz/ExtensionManager.php(309): Minz_ExtensionManager::callOneToOne()
#3 /app/www/app/views/index/normal.phtml(34): Minz_ExtensionManager::callHook()
#4 /app/www/lib/Minz/View.php(88): include('...')
#5 /app/www/lib/Minz/View.php(110): Minz_View->includeFile()
#6 /app/www/app/layout/layout.phtml(69): Minz_View->render()
#7 /app/www/lib/Minz/View.php(88): include('...')
#8 /app/www/lib/Minz/View.php(101): Minz_View->includeFile()
#9 /app/www/lib/Minz/View.php(68): Minz_View->buildLayout()
#10 /app/www/lib/Minz/Dispatcher.php(56): Minz_View->build()
#11 /app/www/lib/Minz/FrontController.php(58): Minz_Dispatcher->run()
#12 /app/www/p/i/index.php(57): Minz_FrontController->run()
#13 {main}

I have to say I don't really understand that one. According to an online checker, the regex does seem to match: https://www.phpliveregex.com/p/JSm
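
The online check can also be reproduced locally. The sketch below is only an approximation in Python: it assumes `re.search` behaves like `preg_match` for this pattern, and it rebuilds the content from the second log, including the line break after the inlined image `<div>`. It does not explain why the plugin still throws; it only shows that the relaxed pattern by itself accepts this string.

```python
import re

# Same relaxed pattern as quoted in the issue (PCRE `#` delimiters dropped).
RELAXED_METADATA = re.compile(r"(?P<metadata>\s+submitted.*</span>)")

# Content rebuilt from the second log: the image <div> added by
# BeforeInsertProcessor, a line break, then the usual Reddit metadata.
content = (
    r'<div class="reddit-image figure"><!--xExtension-RedditImage/1.1.1'
    r' | RedditImage\Processor\BeforeInsertProcessor'
    r' | RedditImage\Transformer\Agnostic\ImageTransformer-->'
    '<img src="https://i.redd.it/rtuzzy7h9nnb1.jpg" class="reddit-image"></div>'
    '\n'
    '  submitted by   <a href="https://www.reddit.com/user/throwmeaway896"> /u/throwmeaway896 </a>'
    '   to   <a href="https://www.reddit.com/r/phgonewild/"> r/phgonewild </a> <br>'
    ' <span><a href="https://i.redd.it/rtuzzy7h9nnb1.jpg">[link]</a></span>'
    '   <span><a href="https://www.reddit.com/r/phgonewild/comments/16fy4tb/'
    'what_if_nasa_kama_mo_ako_now/">[comments]</a></span>'
)

match = RELAXED_METADATA.search(content)

# `.` does not cross the line break by default, but `\s+` consumes it,
# so the match starts at the newline that follows the image <div>.
print(match is not None)
```

Since the pattern matches here as well, the failure inside the plugin may come from something other than this expression, for example a different pattern or an additional check in the code path that BeforeDisplayProcessor takes.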
