Centralize 'trtexec' subprocess runs in ONNX into a single function#1268
Centralize 'trtexec' subprocess runs in ONNX into a single function#1268gcunhase wants to merge 2 commits intoNVIDIA:mainfrom
Conversation
📝 WalkthroughWalkthroughCentralized execution of the external Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Review rate limit: 9/10 reviews remaining, refill in 6 minutes. Comment |
| def _run_trtexec(cmd, timeout=None): | ||
| """Run a 'trtexec' command via subprocess.""" | ||
| # Ensure that this command is a trtexec run | ||
| assert any("trtexec" in c for c in cmd), "Subprocess can only execute 'trtexec' commands" |
There was a problem hiding this comment.
@kevalmorabia97 please let me know if this is enough to ensure only trtexec runs or if this is even necessary. Thanks.
There was a problem hiding this comment.
Can we not pass trtexec in cmd and only pass arguments to trtexec so this function prepends trtexec to the cmd?
Please add type annotation also
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 47-53: Replace the fragile assert and the banned "# nosec" in
_run_trtexec: validate that cmd is a non-empty sequence and that cmd[0] contains
"trtexec" (or equals "trtexec") using an explicit runtime check (e.g., if not
cmd or "trtexec" not in str(cmd[0]): raise ValueError("...")), then call
subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) without any
"# nosec" comment; keep the function name _run_trtexec and parameters (cmd,
timeout) and raise a clear exception (ValueError/RuntimeError) when validation
fails so Bandit checks are not bypassed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 7e439fa2-8582-4851-9734-1eec0c8351c0
📒 Files selected for processing (2)
modelopt/onnx/quantization/autotune/benchmark.pymodelopt/onnx/quantization/ort_utils.py
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1268 +/- ##
===========================================
+ Coverage 66.36% 76.93% +10.57%
===========================================
Files 471 471
Lines 50504 50506 +2
===========================================
+ Hits 33515 38855 +5340
+ Misses 16989 11651 -5338
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
6f25809 to
01cc557
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
modelopt/onnx/quantization/ort_utils.py (1)
47-53:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winReplace the
assert/# nosecgate with explicit validation.
assertis not enforceable at runtime, and the# nosec B603suppression keeps Bandit bypassed. Validate the executable fromcmd[0]explicitly and remove the suppression so this helper actually enforces thetrtexec-only contract. As per coding guidelines, Bandit security checks must pass without exceptions.Proposed fix
-def _run_trtexec(cmd, timeout=None): +def _run_trtexec(cmd: Sequence[str], timeout: float | None = None): """Run a 'trtexec' command via subprocess.""" - # Ensure that this command is a trtexec run - assert any("trtexec" in c for c in cmd), "Subprocess can only execute 'trtexec' commands" + if not cmd: + raise ValueError("Empty command passed to _run_trtexec") + exe = os.path.basename(str(cmd[0])).lower() + if not exe.startswith("trtexec"): + raise ValueError("Subprocess can only execute 'trtexec' commands") # Run trtexec command - return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) # nosec B603 + return subprocess.run(list(cmd), capture_output=True, text=True, timeout=timeout)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/onnx/quantization/ort_utils.py` around lines 47 - 53, The helper _run_trtexec currently uses an assert and a "# nosec" suppression; replace that with explicit validation: check cmd is a non-empty sequence and validate the executable by inspecting os.path.basename(cmd[0]) (ensure it equals or contains "trtexec"), and if the check fails raise a ValueError (or RuntimeError) with a clear message; then call subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) without the "# nosec" comment so Bandit can validate it. Ensure you import os at top if not present and keep function name _run_trtexec and the subprocess.run usage unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 47-53: The helper _run_trtexec currently uses an assert and a "#
nosec" suppression; replace that with explicit validation: check cmd is a
non-empty sequence and validate the executable by inspecting
os.path.basename(cmd[0]) (ensure it equals or contains "trtexec"), and if the
check fails raise a ValueError (or RuntimeError) with a clear message; then call
subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) without the
"# nosec" comment so Bandit can validate it. Ensure you import os at top if not
present and keep function name _run_trtexec and the subprocess.run usage
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9f4ea3a4-c8aa-4f4a-985c-e760d68c2329
📒 Files selected for processing (2)
modelopt/onnx/quantization/autotune/benchmark.pymodelopt/onnx/quantization/ort_utils.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
modelopt/onnx/quantization/ort_utils.py (1)
47-60:⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy liftRemove the
# nosecbypass and make the shared helper time-bounded by default.This helper still introduces a new
# nosecsuppression, which is a pre-merge policy violation here, andtimeout=Nonemeans every new caller can block forever on an external process.Possible direction
+DEFAULT_TRTEXEC_TIMEOUT_S = 300.0 + def _run_trtexec( - args: list[str] | None = None, timeout: float | None = None -) -> subprocess.CompletedProcess: + args: Sequence[str] | None = None, + timeout: float = DEFAULT_TRTEXEC_TIMEOUT_S, +) -> subprocess.CompletedProcess[str]: """Run a 'trtexec' command via subprocess. @@ Returns: The completed subprocess result. """ cmd = ["trtexec", *(args or [])] - return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) # nosec B603 + # `cmd[0]` is fixed to `trtexec`; callers may only supply argv items. + return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout, check=False)If Bandit still flags the safe wrapper, please route that through the required codeowner/security review instead of reintroducing a suppression.
As per coding guidelines, "Bandit security checks must pass without exceptions.
# noseccomments are not allowed as a bypass for security checks," and subprocess execution should enforce limits "by enforcing limits (e.g., timeouts) for external execution."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/onnx/quantization/ort_utils.py` around lines 47 - 60, Remove the "# nosec" suppression and make _run_trtexec time-bounded by default: change the function signature of _run_trtexec to use a sensible default timeout (e.g., timeout: float = 30.0) instead of None, remove the "# nosec B603" comment, and call subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) so callers cannot block indefinitely; keep the same cmd construction (cmd = ["trtexec", *(args or [])]) and if Bandit still flags this wrapper escalate to the security/codeowner review rather than reintroducing a nosec.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/onnx/quantization/autotune/benchmark.py`:
- Around line 44-45: The code in TrtExecBenchmark currently delegates to
_run_trtexec but ignores the caller-supplied trtexec_path; update the call sites
in TrtExecBenchmark (and any other callers noted) to pass the provided
trtexec_path into _run_trtexec (or into _check_for_trtexec if that helper
resolves the executable) so custom binary locations are honored; if _run_trtexec
does not accept an executable parameter, add one (e.g., executable or
trtexec_path) and use it when building the command, and ensure the
FileNotFoundError branch and logging still reflect the actual executable
attempted (reference TrtExecBenchmark, _run_trtexec, and _check_for_trtexec).
In `@modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py`:
- Around line 43-66: The _run_trtexec_streamed function uses p.wait() with no
timeout (and a "# nosec" bypass) which can hang; change it to enforce a timeout
and handle TimeoutExpired: either replace Popen + wait with subprocess.run(cmd,
stdout=log, stderr=log, cwd=..., timeout=TRTEXEC_TIMEOUT) or if using Popen keep
p.communicate(timeout=TRTEXEC_TIMEOUT) and on TimeoutExpired call p.kill(), read
remaining output, and return a failure code and the captured bytes; remove the
"# nosec" bypass and ensure you flush/rewind the temp file (NamedTemporaryFile)
before reading its contents to return combined stdout/stderr and log the output
on non-zero/timeout return.
---
Duplicate comments:
In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 47-60: Remove the "# nosec" suppression and make _run_trtexec
time-bounded by default: change the function signature of _run_trtexec to use a
sensible default timeout (e.g., timeout: float = 30.0) instead of None, remove
the "# nosec B603" comment, and call subprocess.run(cmd, capture_output=True,
text=True, timeout=timeout) so callers cannot block indefinitely; keep the same
cmd construction (cmd = ["trtexec", *(args or [])]) and if Bandit still flags
this wrapper escalate to the security/codeowner review rather than reintroducing
a nosec.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bc4f915d-eeb0-46f0-839e-1c966372cd61
📒 Files selected for processing (4)
modelopt/onnx/quantization/autotune/benchmark.pymodelopt/onnx/quantization/ort_utils.pymodelopt/torch/_deploy/_runtime/tensorrt/constants.pymodelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py
💤 Files with no reviewable changes (1)
- modelopt/torch/_deploy/_runtime/tensorrt/constants.py
| from modelopt.onnx.quantization.ort_utils import _check_for_trtexec, _run_trtexec | ||
|
|
There was a problem hiding this comment.
Preserve the caller-supplied trtexec_path when delegating here.
TrtExecBenchmark still accepts and documents trtexec_path, but _run_trtexec() hardcodes "trtexec", so this call now ignores custom binary paths entirely. That breaks existing callers using a non-PATH install and makes the FileNotFoundError branch misleading.
Possible direction
- from modelopt.onnx.quantization.ort_utils import _check_for_trtexec, _run_trtexec
+ from modelopt.onnx.quantization.ort_utils import _check_for_trtexec, _run_trtexec
@@
- result = _run_trtexec(cmd)
+ result = _run_trtexec(cmd, executable=self.trtexec_path)def _run_trtexec(
args: Sequence[str] | None = None,
*,
executable: str = "trtexec",
timeout: float | None = None,
) -> subprocess.CompletedProcess[str]:
cmd = [executable, *(args or [])]
...Also applies to: 270-270
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/onnx/quantization/autotune/benchmark.py` around lines 44 - 45, The
code in TrtExecBenchmark currently delegates to _run_trtexec but ignores the
caller-supplied trtexec_path; update the call sites in TrtExecBenchmark (and any
other callers noted) to pass the provided trtexec_path into _run_trtexec (or
into _check_for_trtexec if that helper resolves the executable) so custom binary
locations are honored; if _run_trtexec does not accept an executable parameter,
add one (e.g., executable or trtexec_path) and use it when building the command,
and ensure the FileNotFoundError branch and logging still reflect the actual
executable attempted (reference TrtExecBenchmark, _run_trtexec, and
_check_for_trtexec).
| def _run_trtexec_streamed(args: list[str], cwd: Path | None = None) -> tuple[int, bytes]: | ||
| """Run a 'trtexec' command via subprocess, streaming stdout/stderr to a temp file. | ||
|
|
||
| This util will not direct stdout and stderr to console if the cmd succeeds. | ||
| The 'trtexec' binary is hardcoded as the executable; only its arguments may be supplied | ||
| by the caller. This restricts the function to trtexec invocations. | ||
|
|
||
| Output handling: stdout and stderr are captured to a temp file and returned as bytes. | ||
| On failure (non-zero returncode), the captured output is also logged at ERROR level; | ||
| on success, this function emits nothing to the console. | ||
|
|
||
| Args: | ||
| cmd: the command line list | ||
| cwd: current working directory | ||
| args: Arguments to pass to trtexec (without the 'trtexec' command itself). | ||
| cwd: Optional working directory for the subprocess. | ||
|
|
||
| Returns: | ||
| return code: 0 means successful, otherwise means failed | ||
| log_string: the stdout and stderr output as a string | ||
|
|
||
| A tuple of (returncode, output) where output is the combined stdout/stderr bytes. | ||
| """ | ||
| cmd = ["trtexec", *args] | ||
| logging.info(" ".join(cmd)) | ||
| with NamedTemporaryFile("w+b") as log: | ||
| p = subprocess.Popen(cmd, stdout=log, stderr=log, cwd=str(cwd) if cwd else None) # nosec | ||
| p = subprocess.Popen( # nosec B603 - cmd[0] is hardcoded "trtexec" | ||
| cmd, stdout=log, stderr=log, cwd=str(cwd) if cwd else None | ||
| ) | ||
| p.wait() |
There was a problem hiding this comment.
This streamed wrapper still bypasses Bandit and can hang indefinitely.
p.wait() has no timeout, so a stuck trtexec process can block engine build/profile forever, and the new # nosec comment is blocked by repo policy.
Possible direction
-def _run_trtexec_streamed(args: list[str], cwd: Path | None = None) -> tuple[int, bytes]:
+def _run_trtexec_streamed(
+ args: list[str],
+ cwd: Path | None = None,
+ timeout: float = 300.0,
+) -> tuple[int, bytes]:
@@
- p = subprocess.Popen( # nosec B603 - cmd[0] is hardcoded "trtexec"
- cmd, stdout=log, stderr=log, cwd=str(cwd) if cwd else None
- )
- p.wait()
+ p = subprocess.Popen(cmd, stdout=log, stderr=log, cwd=str(cwd) if cwd else None)
+ try:
+ p.wait(timeout=timeout)
+ except subprocess.TimeoutExpired:
+ p.kill()
+ p.wait()
+ raiseAs per coding guidelines, "Bandit security checks must pass without exceptions. # nosec comments are not allowed as a bypass for security checks," and subprocess execution should enforce limits "by enforcing limits (e.g., timeouts) for external execution."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py` around lines 43 -
66, The _run_trtexec_streamed function uses p.wait() with no timeout (and a "#
nosec" bypass) which can hang; change it to enforce a timeout and handle
TimeoutExpired: either replace Popen + wait with subprocess.run(cmd, stdout=log,
stderr=log, cwd=..., timeout=TRTEXEC_TIMEOUT) or if using Popen keep
p.communicate(timeout=TRTEXEC_TIMEOUT) and on TimeoutExpired call p.kill(), read
remaining output, and return a failure code and the captured bytes; remove the
"# nosec" bypass and ensure you flush/rewind the temp file (NamedTemporaryFile)
before reading its contents to return combined stdout/stderr and log the output
on non-zero/timeout return.
|
|
||
| import logging | ||
| import shutil | ||
| import subprocess # nosec |
There was a problem hiding this comment.
can you move this import subprocess in the function also?
What does this PR do?
Type of change: code improvement
Subprocess is needed to run
trtexeccommands but those requirenosecapproval. This PR centralizes it into a single function for the ONNX workflow to avoid future approval triggers. The torch workflow has that centralized inrun_command.Original request: #1259 (comment)
Usage
Testing
N/A
Before your PR is "Ready for review"
CONTRIBUTING.md: N/ASummary by CodeRabbit