Pensar - auto fix for Unbounded Tweet Scraping Resource Exhaustion#7

Open
pensarappdev[bot] wants to merge 1 commit into main from pensar-auto-fix-Hkc-

@pensarappdev (bot) commented on May 7, 2025

Secured with Pensar

  1. Resource Consumption Vulnerability in user_lookup_sns (and, for consistency, user_lookup_tweepy):

    • What was wrong: The quantity parameter, directly settable by an external caller, could be arbitrarily large and thus cause unbounded network requests and memory usage, possibly leading to denial of service.
    • How it was fixed:
      • Added a class constant TWEET_MAX_LIMIT = 3200, chosen to match the roughly 3,200-tweet ceiling of Twitter's user timeline API.
      • Added a private method _validate_quantity(self, quantity) that requires quantity to be an integer of at least 1 and caps any value above 3200 at TWEET_MAX_LIMIT, logging a warning when it does so.
      • Both user_lookup_sns and user_lookup_tweepy now use the validated quantity.
      • Changed the loop test in user_lookup_sns to if idx >= quantity_validated: to be consistent and prevent off-by-one errors.
      • Updated the log message in user_lookup_sns to reflect the capped value.
  2. Backward Compatibility and Transparency:

    • If users request more than 3200 tweets, they receive at most 3200, and a warning is logged.
    • If an invalid quantity is passed (negative, zero, or non-integer), a ValueError is raised.
    • The rest of the code and all formatting are unchanged.
  3. No dependency changes were required.
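
The validation described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR diff: the class name TweetLookup, the logger setup, and the error messages are assumptions, and rejecting bool values is an extra choice made here because bool is a subclass of int in Python.

```python
import logging

logger = logging.getLogger(__name__)


class TweetLookup:  # hypothetical class name; the real class is not shown in this PR text
    TWEET_MAX_LIMIT = 3200

    def _validate_quantity(self, quantity):
        """Return a safe tweet count, capping at TWEET_MAX_LIMIT."""
        # Assumption: bool is rejected explicitly, since isinstance(True, int) is True.
        if isinstance(quantity, bool) or not isinstance(quantity, int):
            raise ValueError("quantity must be an integer")
        if quantity < 1:
            raise ValueError("quantity must be at least 1")
        if quantity > self.TWEET_MAX_LIMIT:
            logger.warning(
                "Requested %d tweets; capping at %d",
                quantity,
                self.TWEET_MAX_LIMIT,
            )
            return self.TWEET_MAX_LIMIT
        return quantity
```

With a helper like this, a request for 1,000,000 tweets is reduced to 3200 and logged, while 0, -5, or "10" raise ValueError before any network request is made.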

More Details
| Type | Identifier | Message | Severity | Link |
| --- | --- | --- | --- | --- |
| Application | CWE-400 | The quantity parameter is taken directly from external input and used as the upper bound for an un-throttled loop that fetches tweets and stores each result in memory. If an attacker supplies an extremely large value (e.g., millions), the loop will attempt to scrape and keep that many tweets, resulting in excessive network requests, high CPU usage, and unbounded memory growth. This creates a denial-of-service scenario for the host application and potentially violates Twitter rate limits. No validation or hard upper limit is applied to quantity, nor is any back-pressure or pagination safeguard implemented. This is Uncontrolled Resource Consumption (CWE-400). | medium | Link |
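
To illustrate why the bounded loop test matters, here is a small sketch of the `if idx >= quantity_validated:` pattern applied to an endless stream. The names fake_tweet_stream and collect_bounded are hypothetical stand-ins; the actual scraper call used in user_lookup_sns is not shown in this PR text.

```python
import itertools


def fake_tweet_stream():
    """Hypothetical stand-in for a scraper generator that never runs out."""
    for i in itertools.count():
        yield {"id": i}


def collect_bounded(stream, quantity_validated):
    """Collect at most quantity_validated items from stream."""
    tweets = []
    for idx, tweet in enumerate(stream):
        # Stop once the validated quantity has been collected,
        # so memory and request counts stay bounded.
        if idx >= quantity_validated:
            break
        tweets.append(tweet)
    return tweets
```

Because the check runs before the append, the list never holds more than quantity_validated items, even though the underlying stream is unbounded.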
