feat!: pluggable HTTP backend – httpx or curl-cffi (#269) #308
Open
vladkens wants to merge 9 commits into
Pull request overview
This PR introduces a pluggable HTTP layer for twscrape, replacing direct httpx usage with a unified HttpClient/Response abstraction and adding an optional curl-cffi backend (auto-preferred when installed, or selectable via TWS_HTTP_BACKEND).
Changes:
- Added `twscrape.http` with `HttpxClient`/`CurlClient`, unified `Response`, and backend auto-detection + env override.
- Migrated internal call sites (API/raw paths, login, queue client, xclid, CLI) from `httpx` types to the new wrapper types.
- Updated packaging/docs/tests to support the new backend model (`twscrape[curl]`, new unit tests, removed `pytest-httpx`).
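The selection rule described in the overview (env override first, then prefer curl-cffi when it is installed, else httpx) could be sketched roughly as follows. This is an illustration only, not the PR's `_detect_backend()`: the function name `pick_backend`, its parameters, and the choice to raise on an unrecognised value are all assumptions.

```python
import importlib.util
import os


def pick_backend(env=None, has_module=None):
    """Return "curl" or "httpx" (hypothetical helper, not the PR's code).

    env        -- mapping to read TWS_HTTP_BACKEND from (defaults to os.environ)
    has_module -- callable reporting whether a module is importable
    """
    env = os.environ if env is None else env
    if has_module is None:
        # find_spec returns None when a top-level module is not installed.
        has_module = lambda name: importlib.util.find_spec(name) is not None

    forced = env.get("TWS_HTTP_BACKEND", "").strip().lower()
    if forced:
        # Assumption: reject values other than the two documented ones.
        if forced not in ("httpx", "curl"):
            raise ValueError(f"unsupported TWS_HTTP_BACKEND: {forced!r}")
        return forced

    # No override: prefer curl-cffi when present, otherwise fall back to httpx.
    return "curl" if has_module("curl_cffi") else "httpx"
```

Passing `env` and `has_module` explicitly keeps the rule testable without installing or uninstalling either backend.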
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| uv.lock | Adds curl-cffi (and deps), removes pytest-httpx, updates extras/dev deps accordingly. |
| pyproject.toml | Defines curl extra, updates dev dependency group, adjusts pyright include. |
| readme.md | Documents the new optional curl backend and TWS_HTTP_BACKEND. |
| twscrape/http.py | New HTTP abstraction layer + backend detection/selection. |
| twscrape/account.py | Switches account client creation to make_client() and unified headers/cookies setup. |
| twscrape/queue_client.py | Replaces httpx exceptions/types with twscrape.http equivalents. |
| twscrape/login.py | Switches login flow to use HttpClient/Response. |
| twscrape/models.py | Updates parsing helpers to accept twscrape.Response. |
| twscrape/xclid.py | Switches page/script fetches to HttpClient. |
| twscrape/api.py | Updates raw response type to twscrape.Response. |
| twscrape/accounts_pool.py | Updates status error handling to use HttpStatusError. |
| twscrape/cli.py | Updates --raw printing path to accept twscrape.Response. |
| scripts/update_gql_ops.py | Switches script downloader to make_client() for pluggable backend support. |
| tests/mock_http.py | Adds a small HttpClient mock for deterministic tests without pytest-httpx. |
| tests/conftest.py | Monkeypatches Account.make_client() to use MockClient, adjusts log level. |
| tests/test_http.py | Adds coverage for both backends + backend detection + wrapper behavior. |
| tests/test_queue_client.py | Reworks tests to use MockClient and adds many additional branch tests. |
| tests/test_pool.py | Adds coverage for pool maintenance behaviors and no-account error paths. |
| tests/test_utils.py | Adds coverage for get_env_bool and new-schema flattening via to_old_obj. |
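The exception swap noted for `twscrape/queue_client.py` in the table implies that backend-specific errors get translated into the unified types somewhere in the wrapper. A minimal sketch of that idea, using built-in exceptions as stand-ins for backend errors, might look like this. Only the names `ConnectError` and `NetworkError` come from the PR; the class hierarchy, the `map_backend_errors` helper, and the mapping rules are assumptions.

```python
from contextlib import contextmanager


class NetworkError(Exception):
    """Stand-in for the unified network error (body assumed)."""


class ConnectError(NetworkError):
    """Stand-in for the unified connect error (subclassing is an assumption)."""


@contextmanager
def map_backend_errors():
    """Translate low-level errors into the unified types (hypothetical helper)."""
    try:
        yield
    except ConnectionRefusedError as exc:
        # Most specific case first: the connection could not be established.
        raise ConnectError(str(exc)) from exc
    except OSError as exc:
        # Any other I/O failure (timeouts, resets) becomes a generic network error.
        raise NetworkError(str(exc)) from exc
```

Callers then catch only the unified types, regardless of which backend produced the failure.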
Summary
Introduces a pluggable HTTP backend layer (`twscrape/http.py`) so the library can use either `httpx` (existing default) or `curl-cffi` for requests. `curl-cffi` uses libcurl with browser-level TLS fingerprint spoofing, which helps bypass Cloudflare bot detection.
- `pip install twscrape`: works as before (httpx)
- `pip install twscrape[curl]`: enables curl-cffi, preferred automatically when present
- `TWS_HTTP_BACKEND=httpx|curl`: force a specific backend

Changes
New `twscrape/http.py`, a unified `HttpClient` interface wrapping both backends:
- `HttpxClient` / `CurlClient`: backend implementations
- `Response`: thin wrapper normalising `httpx.Response` and `curl_cffi.Response` to one API
- `_detect_backend()`: auto-selects curl if installed, falls back to httpx
- `ConnectError`, `NetworkError`, `HttpStatusError`
- `account.py`: `make_client()` now returns `HttpClient` instead of `httpx.AsyncClient`
- `queue_client.py`: replaced direct `httpx` error types with `ConnectError` / `NetworkError` from the new abstraction

Breaking change
Raw API methods (e.g. `search_raw`) now return `twscrape.Response` instead of `httpx.Response`. The interface is compatible (`.status_code`, `.json()`, `.text`, `.headers`), but direct `isinstance(rep, httpx.Response)` checks will break.

Tests
- `tests/test_http.py` (new): full coverage of both backends, all error-mapping paths, `_detect_backend`, `make_client`
- `tests/mock_http.py` (new): `MockClient` for integration tests without a real HTTP server
- `test_queue_client.py`: added branches for error code 131, `_Missing`, `Authorization` passthrough, unknown error, unhandled 5xx, JSON decode fallback, 404 retry, unknown-exception retry
- `test_pool.py`: `delete_accounts`, `reset_locks`, `mark_inactive`, `next_available_at`, `accounts_info` sorting, `load_from_file`, `get_for_queue_or_wait` raise-on-no-account

`http.py`: 100% | `queue_client.py`: 69→83% | `accounts_pool.py`: 55→76%
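
For code affected by the breaking change, one possible migration is to stop checking the concrete type and rely on the four members the description lists (`.status_code`, `.json()`, `.text`, `.headers`). The sketch below uses a hypothetical `SketchResponse` class in place of `twscrape.Response`, whose real implementation is not shown here; `is_response_like` and `parse_ok` are illustrative names as well.

```python
import json as _json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchResponse:
    """Hypothetical stand-in exposing the four members named in the PR."""
    status_code: int
    text: str
    headers: dict = field(default_factory=dict)

    def json(self) -> Any:
        # Decode the body on demand, as response objects commonly do.
        return _json.loads(self.text)


def is_response_like(obj: Any) -> bool:
    """Duck-typed replacement for isinstance(rep, httpx.Response)."""
    return all(hasattr(obj, name) for name in ("status_code", "json", "text", "headers"))


def parse_ok(rep: Any) -> Any:
    """Return the decoded body for a 200 response, else None."""
    if not is_response_like(rep):
        raise TypeError("expected a response-like object")
    if rep.status_code != 200:
        return None
    return rep.json()
```

A check written this way keeps working whether the object comes from the httpx backend, the curl-cffi backend, or a test double.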