…u into hp/fix-import-cuda-init
This reverts commit 91fff61.
ghost reviewed Dec 27, 2021
SRUCell.init_elementwise_recurrence_funcs()

@classmethod
def init_elementwise_recurrence_funcs(cls):
ghost reviewed Dec 27, 2021
weight_proj: Optional[Tensor]

initialized = False
elementwise_recurrence_inference = None
Please add a note about this function initialization to the SRUCell docstring or to the docstring for sru/modules.py. (The module currently lacks a docstring and should probably have one.)
ghost approved these changes Dec 27, 2021
We noticed that package initialization for `sru` eagerly triggers CUDA initialization because of the following chain of module imports: `sru.modules` -> `sru.ops` -> `cuda_functional`. This last module executes the `load` function of `torch.utils.cpp_extension`.

We detected this while running with the server framework in SUBPROCESS_MODE, which forks a new process to run the model. We got an error complaining that CUDA had already been initialized in the parent process. That initialization was unnecessary, because the parent process is not meant to run model inference.

This PR makes the loading lazy. Concretely, we changed the code in `sru.modules` to avoid the eager import of `sru.ops` and instead postpone it until the first `SRUCell` is instantiated.

The changes in this PR were tested by checking out this branch on an AWS instance with a GPU and running `pytest -sv test`, which resulted in `141 passed, 161 warnings` and no failures. We therefore understand this to be working as expected in both CPU and GPU settings.