
feat!: pluggable HTTP backend – httpx or curl-cffi (#269) #308

Open: vladkens wants to merge 9 commits into main from new-http-backend


@vladkens (Owner) commented May 21, 2026

Summary

Introduces a pluggable HTTP backend layer (twscrape/http.py) so the library can use either httpx (existing default) or curl-cffi for requests. curl-cffi uses libcurl with browser-level TLS fingerprint spoofing, which helps bypass Cloudflare bot detection.

  • pip install twscrape — works as before (httpx)
  • pip install twscrape[curl] — enables curl-cffi, preferred automatically when present
  • TWS_HTTP_BACKEND=httpx|curl — force a specific backend

Changes

New twscrape/http.py — unified HttpClient interface wrapping both backends:

  • HttpxClient / CurlClient — backend implementations
  • Response — thin wrapper normalising httpx.Response and curl_cffi.Response to one API
  • _detect_backend() — auto-selects curl if installed, falls back to httpx
  • Normalised exceptions: ConnectError, NetworkError, HttpStatusError
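
The selection rule described above (environment override first, then curl if installed, otherwise httpx) could look roughly like the sketch below. This is not the code from twscrape/http.py: `detect_backend`, `is_installed`, and the `ValueError` on an unknown value are hypothetical names and behaviour chosen for illustration. Only the `TWS_HTTP_BACKEND` variable and its `httpx`/`curl` values come from the description.

```python
import importlib.util
import os

ENV_VAR = "TWS_HTTP_BACKEND"  # variable name taken from the PR description
VALID = ("httpx", "curl")     # values listed in the PR description


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def detect_backend(env=None, installed=is_installed) -> str:
    """Hypothetical stand-in for _detect_backend(); the real function may differ."""
    env = os.environ if env is None else env
    forced = env.get(ENV_VAR, "").strip().lower()
    if forced:
        # Assumption: an unknown value is rejected rather than silently ignored.
        if forced not in VALID:
            raise ValueError(f"{ENV_VAR} must be one of {VALID}, got {forced!r}")
        return forced
    # No override: prefer curl-cffi when it is importable, else fall back to httpx.
    return "curl" if installed("curl_cffi") else "httpx"
```

Passing `env` and `installed` as parameters keeps the rule testable without touching the real environment. What the PR does when `curl` is forced but curl-cffi is missing is not stated, so the sketch leaves that case out.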

account.py: make_client() now returns HttpClient instead of httpx.AsyncClient

queue_client.py — replaced direct httpx error types with ConnectError / NetworkError from the new abstraction
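
To illustrate the idea of normalised exceptions, here is a minimal sketch of translating backend-specific errors into one shared hierarchy. The class names `ConnectError` and `NetworkError` come from the description. Everything else is an assumption: the subclass relationship (which mirrors httpx), the `normalise_errors` helper, and the use of built-in `OSError` types as stand-ins for the real httpx and curl-cffi exceptions.

```python
from contextlib import contextmanager


class NetworkError(Exception):
    """Any transport-level failure (assumed base class)."""


class ConnectError(NetworkError):
    """Failure to establish a connection (assumed to subclass NetworkError)."""


@contextmanager
def normalise_errors():
    """Hypothetical helper: re-raise backend errors as the shared types.

    Built-in exceptions stand in for the backend-specific ones here.
    """
    try:
        yield
    except ConnectionRefusedError as e:
        # Most specific case first: the connection could not be opened.
        raise ConnectError(str(e)) from e
    except OSError as e:
        # Everything else at the transport level (timeouts, resets, ...).
        raise NetworkError(str(e)) from e


def classify(fn) -> str:
    """Caller code depends only on the normalised types, not on a backend."""
    try:
        with normalise_errors():
            fn()
    except ConnectError:
        return "connect"
    except NetworkError:
        return "network"
    return "ok"
```

With a mapping of this kind in place, a caller such as the queue client only needs `except` clauses for the shared types, whichever backend is active.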

Breaking change

Raw API methods (e.g. search_raw) now return twscrape.Response instead of httpx.Response. The interface is compatible (.status_code, .json(), .text, .headers), but direct isinstance(rep, httpx.Response) checks will break.
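
For downstream code affected by this, one possible migration is to drop the type check and rely only on the attributes listed above. The sketch below uses a hypothetical `Response` dataclass as a stand-in for twscrape.Response (the real class wraps the backend response and may be built differently), and `extract_payload` is an illustrative caller, not part of the library.

```python
import json as jsonlib
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Response:
    """Hypothetical stand-in exposing the attributes named in the PR."""
    status_code: int
    text: str
    headers: dict = field(default_factory=dict)

    def json(self) -> Any:
        return jsonlib.loads(self.text)


def extract_payload(rep) -> Any:
    # Before: `if isinstance(rep, httpx.Response): ...` (no longer matches).
    # After: use only .status_code and .json(), which both types provide.
    if rep.status_code != 200:
        raise RuntimeError(f"unexpected status {rep.status_code}")
    return rep.json()
```

Code written this way should keep working regardless of which backend produced the response.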

Tests

  • tests/test_http.py (new) — full coverage of both backends, all error-mapping paths, _detect_backend, make_client
  • tests/mock_http.py (new) — MockClient for integration tests without a real HTTP server
  • test_queue_client.py — added branches: error code 131, _Missing, Authorization passthrough, unknown error, unhandled 5xx, JSON decode fallback, 404 retry, unknown-exception retry
  • test_pool.py: delete_accounts, reset_locks, mark_inactive, next_available_at, accounts_info sorting, load_from_file, get_for_queue_or_wait raise-on-no-account
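
As a rough picture of how a mock client lets tests run without a real HTTP server, the sketch below queues canned responses and records each call. It is not the MockClient from tests/mock_http.py: the `add_response` method, the `calls` list, and `FakeResponse` are hypothetical, and the URL is a placeholder.

```python
import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Minimal response object for the sketch."""
    status_code: int
    text: str = ""


class MockClient:
    """Hypothetical mock: returns queued responses in order, records calls."""

    def __init__(self):
        self._queued = deque()
        self.calls = []

    def add_response(self, status_code, text=""):
        self._queued.append(FakeResponse(status_code, text))

    async def get(self, url, **kwargs):
        self.calls.append(("GET", url, kwargs))
        if not self._queued:
            # Fail loudly so a test cannot make an unplanned request.
            raise AssertionError(f"no queued response for GET {url}")
        return self._queued.popleft()


async def demo():
    client = MockClient()
    client.add_response(429)                  # first call: rate limited
    client.add_response(200, '{"ok": true}')  # second call: success
    first = await client.get("https://example.invalid/a")
    second = await client.get("https://example.invalid/a")
    return [first.status_code, second.status_code], len(client.calls)
```

Queuing responses in order makes retry paths, such as a 429 followed by a 200, deterministic to test.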

Coverage: http.py: 100% | queue_client.py: 69→83% | accounts_pool.py: 55→76%


Copilot AI left a comment


Pull request overview

This PR introduces a pluggable HTTP layer for twscrape, replacing direct httpx usage with a unified HttpClient/Response abstraction and adding an optional curl-cffi backend (auto-preferred when installed, or selectable via TWS_HTTP_BACKEND).

Changes:

  • Added twscrape.http with HttpxClient/CurlClient, unified Response, and backend auto-detection + env override.
  • Migrated internal call sites (API/raw paths, login, queue client, xclid, CLI) from httpx types to the new wrapper types.
  • Updated packaging/docs/tests to support the new backend model (twscrape[curl], new unit tests, removed pytest-httpx).

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 3 comments.

Summary per file

| File | Description |
| --- | --- |
| uv.lock | Adds curl-cffi (and deps), removes pytest-httpx, updates extras/dev deps accordingly. |
| pyproject.toml | Defines curl extra, updates dev dependency group, adjusts pyright include. |
| readme.md | Documents the new optional curl backend and TWS_HTTP_BACKEND. |
| twscrape/http.py | New HTTP abstraction layer + backend detection/selection. |
| twscrape/account.py | Switches account client creation to make_client() and unified headers/cookies setup. |
| twscrape/queue_client.py | Replaces httpx exceptions/types with twscrape.http equivalents. |
| twscrape/login.py | Switches login flow to use HttpClient/Response. |
| twscrape/models.py | Updates parsing helpers to accept twscrape.Response. |
| twscrape/xclid.py | Switches page/script fetches to HttpClient. |
| twscrape/api.py | Updates raw response type to twscrape.Response. |
| twscrape/accounts_pool.py | Updates status error handling to use HttpStatusError. |
| twscrape/cli.py | Updates --raw printing path to accept twscrape.Response. |
| scripts/update_gql_ops.py | Switches script downloader to make_client() for pluggable backend support. |
| tests/mock_http.py | Adds a small HttpClient mock for deterministic tests without pytest-httpx. |
| tests/conftest.py | Monkeypatches Account.make_client() to use MockClient, adjusts log level. |
| tests/test_http.py | Adds coverage for both backends + backend detection + wrapper behavior. |
| tests/test_queue_client.py | Reworks tests to use MockClient and adds many additional branch tests. |
| tests/test_pool.py | Adds coverage for pool maintenance behaviors and no-account error paths. |
| tests/test_utils.py | Adds coverage for get_env_bool and new-schema flattening via to_old_obj. |

