fix possible deadlock, add retries.#104
Conversation
add retries, possibly fix deadlock.
Update seekable_http_reader.rs
|
Hi @pgrace-google ! please review this. |
pgrace-google
left a comment
There was a problem hiding this comment.
thanks for your patience with this PR!
| log::debug!("Immediate cache success"); | ||
| return Ok(bytes_read_from_cache); | ||
| } | ||
| if state.read_failed_somewhere { |
There was a problem hiding this comment.
where is this being set?
There was a problem hiding this comment.
The idea was when we consider two thread scenarios; when one reader thread enters this method and has already notified the second thread of the failure, but second thread is not 'on time' to receive the notification, this could result in the problem described in the issue.
but there could be better way of handling this.
There was a problem hiding this comment.
FYI, this kind of resulted in problems for us, we checked in this change, we could see that there were issues of partial downloads. but when we revert the change, we still see the original problem, so maybe the fix is something more subtle.
| ErrorKind::UnexpectedEof | ErrorKind::Interrupted => { | ||
| retries += 1; | ||
| // This is a bit of a hack, but it seems to be the only way | ||
| // to get the underlying stream to retry. |
There was a problem hiding this comment.
can you explain more what happens? i am a bit hesitant about this way of retry
There was a problem hiding this comment.
I'm sure there is a better way to perform retries in rust,
so, here when I want to retry, I ensure that we use a new stream reader, and use 'take()' to get full ownership of the reader for the thread in execution. and we retry only in case of specific failures. now that we have a new reader,
ensure that we update the reading_stuff with the right information.
Fixes #102
After we check the cache for data and before we wait on stream reader thread to notify other threads, check if stream read failed somewhere.
Add retries when we error out during read_exact block, very basic while loop to ensure we retry, when we hit UnexpectedEof or Interrupted ErrorKind,