Skip to content

security: harden output file creation with symlink checks and 0600 mode [5/5]#173

Open
rsampaio wants to merge 1 commit intomainfrom
security/harden-output-file-creation
Open

security: harden output file creation with symlink checks and 0600 mode [5/5]#173
rsampaio wants to merge 1 commit intomainfrom
security/harden-output-file-creation

Conversation

@rsampaio
Copy link
Copy Markdown
Collaborator

@rsampaio rsampaio commented Apr 15, 2026

When the agent runs as root, output files were created via os.Create which uses the process umask (typically 0022), producing world-readable files. Additionally, a local attacker could place a symlink at an expected output path to redirect writes to a sensitive system file.

Add safeCreateFile() in both the file writer and CSV converter that:

  • Checks for symlinks via os.Lstat before opening (rejects symlinks)
  • Creates files with explicit 0600 mode (owner read/write only)

Also chmod the audit log file to 0600 after lumberjack creates it.

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Summary by CodeRabbit

Bug Fixes

  • Enhanced file security: Output files (CSV exports, JSON exports, and audit logs) are now created with restricted permissions (0600) to prevent unauthorized access.
  • Symlink protection: Added safeguards to prevent writing through symbolic links, protecting against symlink-based attacks on exported data and logs.

When the agent runs as root, output files were created via os.Create
which uses the process umask (typically 0022), producing world-readable
files. Additionally, a local attacker could place a symlink at an
expected output path to redirect writes to a sensitive system file.

Add safeCreateFile() in both the file writer and CSV converter that:
- Checks for symlinks via os.Lstat before opening (rejects symlinks)
- Creates files with explicit 0600 mode (owner read/write only)

Also chmod the audit log file to 0600 after lumberjack creates it.

Signed-off-by: Rodrigo Sampaio Vaz <rvaz@nvidia.com>
@rsampaio rsampaio requested a review from jingxiang-z April 15, 2026 01:08
@rsampaio rsampaio changed the title security: harden output file creation with symlink checks and 0600 mode security: harden output file creation with symlink checks and 0600 mode [5/5] Apr 15, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 15, 2026

📝 Walkthrough

Walkthrough

Security-critical updates introducing symlink-safe file creation across three modules (CSV exporter, file writer, and audit logging). Each module implements helper functions using unix.O_NOFOLLOW to refuse symlink traversal, enforces 0600 permissions, and provides explicit error messages when symlink attacks are detected.

Changes

Cohort / File(s) Summary
CSV Exporter Security
internal/exporter/converter/csv.go, internal/exporter/converter/csv_test.go
Introduced safeCreateFile() helper that refuses symlink-following writes via unix.O_NOFOLLOW and enforces 0600 permissions. Updated writeMetricsCSV, writeEventsCSV, writeComponentHealthCSV, and writeMachineInfoCSV to use secure file creation. Added unit tests validating symlink rejection and permission enforcement.
File Writer Security
internal/exporter/writer/file.go, internal/exporter/writer/file_test.go
Introduced safeCreateFile() helper for OTLP JSON file output, applying unix.O_NOFOLLOW to block symlink traversal and enforcing 0600 permissions. Updated writeOTLPJSONFile to use secure file creation. Added unit tests covering symlink rejection and permission tightening.
Audit Logger Security
third_party/fleet-intelligence-sdk/pkg/log/audit.go, third_party/fleet-intelligence-sdk/pkg/log/audit_test.go
Introduced prepareAuditLogFile() function that pre-validates log file paths using O_NOFOLLOW, detects symlink attempts, and enforces 0600 permissions before NewAuditLogger configures output (falling back to os.Stdout on failure). Enhanced test assertions to verify resulting log file permissions are exactly 0o600.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Symlinks beware, we've fortified our gates,
With O_NOFOLLOW flags and permissions straight,
No sneaky traversals shall pass through today,
Only true files find their rightful way,
Safe files created, 0600 secure!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main security improvements introduced across the changeset: hardening output file creation with symlink checks and enforcing 0600 file permissions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch security/harden-output-file-creation

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e112cae27b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +99 to +100
if err := prepareAuditLogFile(logFile); err != nil {
w = zapcore.AddSync(os.Stdout)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep audit logs on file sink after transient preflight errors

Falling back to os.Stdout on any prepareAuditLogFile error permanently disables file-backed audit logging for this process, even when the configured path becomes valid later (for example, the directory is created after startup). Because NewAuditLogger does not return an error, this silently breaks retention/rotation expectations for callers that explicitly provided logFile; before this change, the logger still targeted the file path. Consider only rejecting known unsafe cases (like symlink detection) or at least surfacing the failure while preserving file sink behavior.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/exporter/converter/csv.go (1)

100-115: Consider centralizing safeCreateFile into a shared internal utility.

This helper is now duplicated here and in internal/exporter/writer/file.go; centralizing reduces security drift risk if behavior needs future updates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/exporter/converter/csv.go` around lines 100 - 115, The
safeCreateFile implementation is duplicated (in safeCreateFile in
internal/exporter/converter/csv.go and the same helper in
internal/exporter/writer/file.go); extract and centralize it into a single
internal utility (e.g., internal/fs or internal/util) as a public or
package-exported function (e.g., SafeCreateFile) preserving the exact behavior
(use os.OpenFile with unix.O_NOFOLLOW, ELOOP check, Chmod to 0o600, close on
error and wrapped errors), then update both caller sites to import and call that
centralized function (replace local safeCreateFile usages in csv.go and
writer/file.go with the new utility function) to avoid duplication and security
drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@third_party/fleet-intelligence-sdk/pkg/log/audit.go`:
- Around line 99-109: The current code does a one-time O_NOFOLLOW preflight in
prepareAuditLogFile but then hands lumberjack.Logger a path (Filename) which
opens the file later without O_NOFOLLOW, creating a TOCTOU/symlink swap risk and
silently falling back to stdout on preflight failure; fix by (1) stopping use of
lumberjack.Logger with Filename: open the audit file yourself with O_NOFOLLOW
(use syscall.Open or os.OpenFile with the platform-specific flag) and pass the
resulting *os.File to zapcore.AddSync(w) (use the file descriptor for all
writes), or replace lumberjack with a rotation implementation that accepts an
already-open FD and performs rotations by safely creating/truncating new files
with O_NOFOLLOW, and (2) remove the silent fallback to stdout in the error
branch of prepareAuditLogFile so prepareAuditLogFile errors are returned (or
cause startup failure) instead of defaulting to os.Stdout; update references to
prepareAuditLogFile, lumberjack.Logger, and variable w accordingly.
- Around line 99-101: Currently the code silently falls back to
zapcore.AddSync(os.Stdout) when prepareAuditLogFile(...) returns an error, which
can expose audit logs and hide failures; instead, on error from
prepareAuditLogFile return or propagate the error to the caller (do not reassign
w to stdout), and log the failure to stderr or the process logger so the
deployment can detect misconfiguration. Concretely: in the block handling
prepareAuditLogFile in audit.go, replace the silent assignment to
zapcore.AddSync(os.Stdout) with code that logs the prepareAuditLogFile error
(including the error value) and returns that error (or wraps it) from the
enclosing initializer function so the caller can handle startup failure, keeping
use of w only when the file setup succeeds.

---

Nitpick comments:
In `@internal/exporter/converter/csv.go`:
- Around line 100-115: The safeCreateFile implementation is duplicated (in
safeCreateFile in internal/exporter/converter/csv.go and the same helper in
internal/exporter/writer/file.go); extract and centralize it into a single
internal utility (e.g., internal/fs or internal/util) as a public or
package-exported function (e.g., SafeCreateFile) preserving the exact behavior
(use os.OpenFile with unix.O_NOFOLLOW, ELOOP check, Chmod to 0o600, close on
error and wrapped errors), then update both caller sites to import and call that
centralized function (replace local safeCreateFile usages in csv.go and
writer/file.go with the new utility function) to avoid duplication and security
drift.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6e100627-6feb-4fb1-a56f-bbf68005c675

📥 Commits

Reviewing files that changed from the base of the PR and between 47dab0c and e112cae.

📒 Files selected for processing (6)
  • internal/exporter/converter/csv.go
  • internal/exporter/converter/csv_test.go
  • internal/exporter/writer/file.go
  • internal/exporter/writer/file_test.go
  • third_party/fleet-intelligence-sdk/pkg/log/audit.go
  • third_party/fleet-intelligence-sdk/pkg/log/audit_test.go

Comment on lines +99 to +109
if err := prepareAuditLogFile(logFile); err != nil {
w = zapcore.AddSync(os.Stdout)
} else {
w = zapcore.AddSync(&lumberjack.Logger{
Filename: logFile,
MaxSize: 128, // megabytes
MaxBackups: 5,
MaxAge: 3, // days
Compress: true, // compress the rotated files
})
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In gopkg.in/natefinch/lumberjack.v2, what flags and mode are used when opening/creating log files? Does it support O_NOFOLLOW or custom open flags for every open/rotate operation?

💡 Result:

In gopkg.in/natefinch/lumberjack.v2 (original natefinch/lumberjack v2.0 branch, latest release v2.2.1 from 2023), log files are opened using os.OpenFile with hardcoded flags and modes, no support for O_NOFOLLOW or custom open flags. - Existing files (open/append): os.O_APPEND | os.O_WRONLY, mode 0644 Code: file, err := os.OpenFile(filename, os.O_APPEND|os.O_WRONLY, 0644) - New files (create/rotate): os.O_CREATE | os.O_WRONLY | os.O_TRUNC, mode os.FileMode(0600) (copied from existing if present) Code: f, err := os.OpenFile(name, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, mode) These occur in openExistingOrNew (for first write or append) and openNew (for rotation/create new). No configuration options for custom flags/modes in Logger struct or functions. Forks like chaseSpace/lumberjack add FileMode but not flags, and original does not support O_NOFOLLOW or customization.

Citations:


TOCTOU vulnerability: preflight O_NOFOLLOW check is bypassed by lumberjack's path-based file operations.

prepareAuditLogFile validates the path with O_NOFOLLOW once, then closes the fd. However, lumberjack.Logger opens/reopens Filename by path using hardcoded flags (O_APPEND|O_WRONLY for existing, O_CREATE|O_WRONLY|O_TRUNC for rotation) without O_NOFOLLOW support. A symlink swap between the preflight validation and lumberjack's first open—or during any rotation cycle—bypasses the guard entirely.

Additionally, silent fallback to stdout when preflight fails can mask configuration errors and silently alter data handling behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@third_party/fleet-intelligence-sdk/pkg/log/audit.go` around lines 99 - 109,
The current code does a one-time O_NOFOLLOW preflight in prepareAuditLogFile but
then hands lumberjack.Logger a path (Filename) which opens the file later
without O_NOFOLLOW, creating a TOCTOU/symlink swap risk and silently falling
back to stdout on preflight failure; fix by (1) stopping use of
lumberjack.Logger with Filename: open the audit file yourself with O_NOFOLLOW
(use syscall.Open or os.OpenFile with the platform-specific flag) and pass the
resulting *os.File to zapcore.AddSync(w) (use the file descriptor for all
writes), or replace lumberjack with a rotation implementation that accepts an
already-open FD and performs rotations by safely creating/truncating new files
with O_NOFOLLOW, and (2) remove the silent fallback to stdout in the error
branch of prepareAuditLogFile so prepareAuditLogFile errors are returned (or
cause startup failure) instead of defaulting to os.Stdout; update references to
prepareAuditLogFile, lumberjack.Logger, and variable w accordingly.

Comment on lines +99 to +101
if err := prepareAuditLogFile(logFile); err != nil {
w = zapcore.AddSync(os.Stdout)
} else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid silent downgrade to stdout when secure file setup fails.

At Line 99-Line 101, failures are swallowed and logging silently switches destinations. This can hide misconfiguration/security incidents and unexpectedly expose audit logs in stdout sinks.

Suggested minimal visibility fix
 if logFile != "" {
 	if err := prepareAuditLogFile(logFile); err != nil {
+		_, _ = fmt.Fprintf(os.Stderr, "audit logger: failed to prepare secure log file %q: %v; falling back to stdout\n", logFile, err)
 		w = zapcore.AddSync(os.Stdout)
 	} else {
 		w = zapcore.AddSync(&lumberjack.Logger{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@third_party/fleet-intelligence-sdk/pkg/log/audit.go` around lines 99 - 101,
Currently the code silently falls back to zapcore.AddSync(os.Stdout) when
prepareAuditLogFile(...) returns an error, which can expose audit logs and hide
failures; instead, on error from prepareAuditLogFile return or propagate the
error to the caller (do not reassign w to stdout), and log the failure to stderr
or the process logger so the deployment can detect misconfiguration. Concretely:
in the block handling prepareAuditLogFile in audit.go, replace the silent
assignment to zapcore.AddSync(os.Stdout) with code that logs the
prepareAuditLogFile error (including the error value) and returns that error (or
wraps it) from the enclosing initializer function so the caller can handle
startup failure, keeping use of w only when the file setup succeeds.

@@ -4,10 +4,13 @@
package log

import (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this sdk pkg/log? Would like to see if we can remove this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do extensively @jingxiang-z, it may be worth to refactor our logging strategy and remove this from the SDK, the SDK should mostly have the metrics collection packages

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants