feat: Add network request retry for ncbi_lrg_refseqgene.py#84
Conversation
zealws
left a comment
There was a problem hiding this comment.
I had imagined you'd wrap these requests calls in a session to do the retries for you. That would be the more sophisticated approach to this. The documentation is probably a better resource than ChatGPT, fwiw.
This approach is crude, but it does work (except for the the exception issue I highlighted), however, I think it'd be better to expose a helper function that does the retries so you could switch to a session later if necessary, and not complicate the call site with the retry logic.
I'll defer to @jsstevenson as to which of those approaches are acceptable.
|
To clarify exactly how to use the session most effectively: Here's the example code from the docs: s = Session()
retries = Retry(
total=3,
backoff_factor=0.1,
status_forcelist=[502, 503, 504],
allowed_methods={'POST'},
)
s.mount('https://', HTTPAdapter(max_retries=retries))You would create a session early on in the process, set up the retries as pictured here, pass the session to any code that needs to make HTTP requests, and then call I'm not sure how amenable the wags-tails codebase would be to passing a session around in a bunch of places, so it might be acceptable to use a helper function that just creates a session for a single request. YMMV |
In general, the only shared state is within the class dedicated to a specific resource, so you're usually talking about a max of 2 distinct requests (if resource isn't available locally: 1 request to get it, if it's available locally: 1 request to check if up-to-date, and possibly a 2nd to acquire if local is out of date). I don't think it would be that hard to manage state like a session for this group of requests, but that's not a huge savings since it's a max of 2 requests, and typically just the 1. To share a session across individual getters, we could update
Yeah I think that's reasonable |
|
@zealws @jsstevenson Were you two thinking of a helper function like this (see most recent commit)? I'm not entirely sure what sources this would be used be in since the download command is different for each one? |
zealws
left a comment
There was a problem hiding this comment.
Looks good to me. I don't have permission to greenlight changes in this repo though, so I think you'll also need a review from @jsstevenson or another reviewer.
ncbi_lrg_refseqgene.pyncbi_lrg_refseqgene.py
closes #83
Note: I had ChatGPT assist me here, but this seems to be a logical solution for the
ncbi_lrg_refseqgene.pymodule. There are other harvesters that don't have the network retry logic, so I am wondering if we should add this in other places, or add a new module that can imported fromutilsthat does the network retry