Skip to content

Add GPU/Device Support and Fix Symlink Deduplication Issues#176

Open
akshaver wants to merge 15 commits intomintoolkit:masterfrom
akshaver:ashaver/nv_runtime
Open

Add GPU/Device Support and Fix Symlink Deduplication Issues#176
akshaver wants to merge 15 commits intomintoolkit:masterfrom
akshaver:ashaver/nv_runtime

Conversation

@akshaver
Copy link

@akshaver akshaver commented Feb 23, 2026

Fixes-102, Fixes-306, Fixes-401, Fixes-579

What

This MR introduces four patches that add GPU support to docker-slim and fix critical symlink processing bugs:

  1. Implement docker device request in CLI - Adds --cro-device-request and --cro-runtime CLI flags to enable GPU access during container profiling. Allows passing device requests as JSON (e.g., --cro-device-request '{"Count":-1, "Capabilities":[["gpu"]]}' --cro-runtime nvidia).

  2. Fix issues with duplicate symlink processing - Resolves a bug where files accessed through multiple symlinked paths (e.g., /usr/local/cuda/ vs /usr/local/cuda-12.9/) would be copied multiple times, with later copies randomly overwriting with 0-byte content. Implements inode-based deduplication to ensure only one canonical copy is kept.

  3. Example use case with nvidia runtime - Adds documentation and example scripts demonstrating GPU workloads:

    • test_nvidia_smi.sh - Slims ubuntu to run nvidia-smi
    • test_nvidia_pytorch.sh - Slims nvidia-pytorch to run CUDA tests
  4. Non-trivial example of slimming vllm - Adds a comprehensive example for slimming VLLM (LLM inference) containers with full API test suite validation.

  5. Fixes PTRACE handling bugs Exits an infinite loop that occurs when syscall had a NULL path
    argument. Don't drop parent paths when ENOENT-only child paths.

Why

  • GPU Support (Print os.Args prior to exiting. #102, #401): Many modern workloads require GPU access, especially in ML/AI contexts. Without --cro-device-request and --cro-runtime, it was impossible to profile containers that required GPU access during execution, making docker-slim unusable for CUDA-based images.

  • Symlink Bug (#306, #579): The symlink deduplication bug caused slimmed images to have corrupted (0-byte) library files, particularly in NVIDIA CUDA containers where /usr/local/cuda symlinks to versioned directories like /usr/local/cuda-12.9. This made slimmed images non-functional, as critical .so files would be empty.

How Tested

  • Unit Tests: Added comprehensive test coverage for:

    • Device request JSON parsing and CLI flag handling (clifvgetter_test.go, config_test.go, device_request_test.go)
    • Symlink deduplication logic (dedup_test.go)
    • Ptrace return status handling (ptrace_test.go)
    • File copy operations (fsutil_test.go)
  • Integration Tests:

    • test_nvidia_smi.sh - Verifies slimmed ubuntu image can run nvidia-smi with identical output to original
    • test_nvidia_pytorch.sh - Verifies slimmed pytorch image passes CUDA tests
    • test_nvidia_vllm.sh - Comprehensive VLLM API test suite comparing original vs slimmed image behavior (15 endpoint tests)
  • Manual Testing: Successfully slimmed VLLM containers with GPU workloads and verified functional parity.

Now one can pass in the cro runtime and GPU like
  --cro-device-request '{"Count":-1, \
    "Capabilities":[["gpu"]]}' \
  --cro-runtime nvidia \
  ...
Correct issues with duplicate paths from symlink
processing, when they both point to the same
inode. This fixes a behavior where slimtookit
would randomly generate o-byte files for the
actual file referenced by the symlink.
Infinite loop in getStringParam: when a syscall had a NULL path
argument, PtracePeekData returned EIO with count=0. The function
silenced EIO, never advanced the pointer, and had no exit condition,
burning 100% CPU until exit. Fix: return early
on NULL pointer and on any PtracePeekData error.
…ubdir false positive

OKReturnStatus for checkFile syscalls intentionally accepts ENOENT (-2)
to track Python import search paths. When Python imports a namespace
package (a directory without __init__.py), it probes for several
__init__.* variants -- all returning ENOENT -- before stat'ing the
directory itself (success). The radix tree walk in FileActivity() then
sees the directory as a prefix of these ghost child paths and marks it
IsSubdir=true, excluding it from the slim image. Since the ghost
children don't exist on disk, neither the directory nor any files are
preserved.

Add HasSuccessfulAccess to FSActivityInfo, set it only when retVal==0,
and require it in the IsSubdir determination so that ENOENT-only child
paths cannot cause a parent directory to be dropped.
@kilo-code-bot
Copy link

kilo-code-bot bot commented Feb 23, 2026

Code Review Summary

Status: 8 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 7
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

CRITICAL

File Line Issue
pkg/monitor/ptrace/ptrace.go 289 Incorrect HasSuccessfulAccess logic for OpenFileType
pkg/monitor/ptrace/ptrace.go 303 Incorrect HasSuccessfulAccess initialization for OpenFileType
pkg/monitor/ptrace/ptrace.go 290 Logic bug - HasSuccessfulAccess is always set to true for OpenFileType
pkg/monitor/ptrace/ptrace.go 290 Logic bug - HasSuccessfulAccess always set to true for OpenFileType
pkg/monitor/ptrace/ptrace.go N/A Incorrect indentation - this if statement should be indented
pkg/monitor/ptrace/ptrace.go N/A Incorrect HasSuccessfulAccess logic for OpenFileType
pkg/monitor/ptrace/ptrace.go 303 Incorrect HasSuccessfulAccess initialization for OpenFileType

WARNING

File Line Issue
pkg/report/container_report.go 123 Incorrect indentation - Go code should use tabs
Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

File Line Issue
pkg/monitor/ptrace/ptrace.go N/A HasSuccessfulAccess is set based on e.retVal == 0, but
pkg/app/sensor/artifact/artifact.go N/A deduplicateFileMap() keys duplicates only by sys.Ino, bu
pkg/app/master/inspectors/container/container_inspector.go 584 If --cro-device-request contains invalid JSON, RunContain
examples/nvidia_runtime/test_vllm.sh N/A This script creates SIGNAL_PIPE via mkfifo but never use
Files Reviewed (16 files)
  • examples/nvidia_runtime/.gitignore
  • examples/nvidia_runtime/Dockerfile
  • examples/nvidia_runtime/README.md
  • examples/nvidia_runtime/test_nvidia_pytorch.sh
  • examples/nvidia_runtime/test_nvidia_smi.sh
  • examples/nvidia_runtime/test_vllm.sh
  • examples/nvidia_runtime/vllm_api_tests.py
  • pkg/app/master/security/seccomp/seccomp.go
  • pkg/app/sensor/artifact/artifact.go
  • pkg/app/sensor/artifact/dedup_test.go
  • pkg/app/sensor/monitor/composite.go
  • pkg/monitor/ptrace/ptrace.go
  • pkg/monitor/ptrace/ptrace_test.go
  • pkg/report/container_report.go
  • pkg/util/fsutil/fsutil.go
  • pkg/util/fsutil/fsutil_test.go

@dosubot
Copy link

dosubot bot commented Feb 23, 2026

Related Documentation

Checked 2 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

@kcq
Copy link
Contributor

kcq commented Feb 23, 2026

thank you @akshaver for creating the PR here!

@kcq
Copy link
Contributor

kcq commented Feb 23, 2026

auggie review

@augmentcode
Copy link

augmentcode bot commented Feb 23, 2026

🤖 Augment PR Summary

Summary: This PR adds GPU/device support for profiling runs and fixes several correctness issues in file/symlink processing and ptrace-based monitoring.

Changes:

  • Added `--cro-device-request` CLI flag and plumbed it through `ContainerRunOptions` into the container inspector to populate Docker `HostConfig.DeviceRequests` for GPU access.
  • Hardened seccomp/profile generation and sensor artifact processing to handle missing/disabled PT monitor reports safely.
  • Implemented inode-based deduplication and symlink-aware copy behavior in the sensor artifact store to prevent duplicate copies and 0-byte overwrites (notably for CUDA symlink trees).
  • Improved ptrace monitor robustness (NULL path pointers, ENOENT/ENOTDIR handling) and added a success marker used to avoid dropping parent paths due to “ghost” children.
  • Added unit tests for CLI flag extraction, config struct behavior, device-request JSON parsing, ptrace return-status logic, and fsutil copy behavior.
  • Added NVIDIA runtime examples and scripts (nvidia-smi, PyTorch, VLLM) plus a VLLM API test harness for end-to-end validation.

Technical Notes: Device requests are passed as JSON matching Docker’s DeviceRequest schema; artifact dedup relies on filesystem metadata to select a single canonical copy when multiple paths reference the same underlying file.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 4 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

@akshaver
Copy link
Author

Will try to address these comments today or tomorrow, so the PR doesn't lose momentum.

akshaver and others added 2 commits February 27, 2026 21:01
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
@akshaver
Copy link
Author

@kcq let me know if you need anything else with the PR.

fsa.OpsAll++
fsa.Pids[e.pid] = struct{}{}
fsa.Syscalls[int(e.callNum)] = struct{}{}
if p.OKReturnStatus(e.retVal) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: Incorrect indentation - this if statement should be indented to be inside the preceding if block

The if p.OKReturnStatus(e.retVal) on line 289 has the same indentation as the if fsa, ok := app.fsActivity[e.pathParam]; ok on line 285, but it should be indented further to be part of that block. This causes HasSuccessfulAccess to never be set for existing file activity entries.

Suggested change
if p.OKReturnStatus(e.retVal) {
if p.OKReturnStatus(e.retVal) {
fsa.HasSuccessfulAccess = true
}

@akshaver
Copy link
Author

akshaver commented Mar 2, 2026

The recent kilo-code changes are breaking the vllm integration test. Tracking down why.

…Status

The kilo-code-bot suggested changing the HasSuccessfulAccess condition from
`e.retVal == 0 || p.SyscallType() == OpenFileType` to `p.OKReturnStatus(e.retVal)`,
arguing that the former incorrectly treated all OpenFileType syscalls as successful.

However, the original expression was correct:

- For OpenFileType (open, openat, readlink): the outer condition on line 276-278
  already filters to successful calls via `p.OKReturnStatus(e.retVal)`, which
  requires fd >= 0. The `|| p.SyscallType() == OpenFileType` clause is redundant,
  not wrong -- it simply ensures HasSuccessfulAccess=true for events that already
  passed the success filter.

- For CheckFileType (stat, access, lstat): `e.retVal == 0` correctly marks only
  files that actually exist on disk. The replacement `p.OKReturnStatus(e.retVal)`
  also accepts ENOENT (-2) and ENOTDIR (-20), causing non-existent "ghost" paths
  to be marked as HasSuccessfulAccess=true.

This broke the namespace package fix (da8eb2c): when Python probes for
__init__.cpython-311.so, __init__.so, __init__.py in a directory that has none
of them (namespace package), those ENOENT stat results created ghost children
with HasSuccessfulAccess=true. The FileActivity() radix tree walk then marked
the parent directory as IsSubdir and excluded it from the slim image. Since the
ghost children don't exist on disk, neither the directory nor its contents were
preserved, causing libraries to go missing.
@akshaver akshaver force-pushed the ashaver/nv_runtime branch from 8c0c5e6 to a2ad80e Compare March 2, 2026 18:41
@akshaver
Copy link
Author

akshaver commented Mar 2, 2026

Reverted back to pre kilo-code-bot suggestions.

The kilo-code-bot suggested changing the HasSuccessfulAccess condition from
e.retVal == 0 || p.SyscallType() == OpenFileType to p.OKReturnStatus(e.retVal),
arguing that the former incorrectly treated all OpenFileType syscalls as successful.

Original expression was correct:

  • For OpenFileType (open, openat, readlink): the outer condition on line 276-278
    already filters to successful calls via p.OKReturnStatus(e.retVal), which
    requires fd >= 0. The || p.SyscallType() == OpenFileType clause is redundant,
    not wrong -- it simply ensures HasSuccessfulAccess=true for events that already
    passed the success filter.

  • For CheckFileType (stat, access, lstat): e.retVal == 0 correctly marks only
    files that actually exist on disk. The replacement p.OKReturnStatus(e.retVal)
    also accepts ENOENT (-2) and ENOTDIR (-20), causing non-existent "ghost" paths
    to be marked as HasSuccessfulAccess=true.

This broke the namespace package fix (da8eb2c): when Python probes for
init.cpython-311.so, init.so, init.py in a directory that has none
of them (namespace package), those ENOENT stat results created ghost children
with HasSuccessfulAccess=true. The FileActivity() radix tree walk then marked
the parent directory as IsSubdir and excluded it from the slim image. Since the
ghost children don't exist on disk, neither the directory nor its contents were
preserved, causing libraries to go missing.

@akshaver
Copy link
Author

akshaver commented Mar 3, 2026

The recent kilo-code changes are breaking the vllm integration test. Tracking down why.

Resolved. kilo-code-bot was wrong.

@kcq
Copy link
Contributor

kcq commented Mar 5, 2026

Thank you @akshaver for investigating and addressing all of those! Really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants