20 commits
153b0f4
feat(sandbox): integrate OCSF structured logging for all sandbox events
johntmyers Mar 26, 2026
3d5fa46
fix(scripts): attach provider to all smoke test phases to avoid rate …
johntmyers Mar 26, 2026
74d928d
fix(ocsf): remove timestamp from shorthand format to avoid double-tim…
johntmyers Mar 26, 2026
1e5d0df
refactor(ocsf): replace single-char severity with bracketed labels
johntmyers Mar 26, 2026
6365660
feat(sandbox): use OCSF level label for structured events in log push
johntmyers Mar 26, 2026
a9582c9
fix(sandbox): convert new Landlock path-skip warning to OCSF
johntmyers Apr 1, 2026
6b180ba
fix(sandbox): use rolling appender for OCSF JSONL file
johntmyers Apr 1, 2026
d52cb28
fix(sandbox): address reviewer warnings for OCSF integration
johntmyers Apr 1, 2026
9184ff6
refactor(sandbox): rename ocsf_logging_enabled to ocsf_json_enabled
johntmyers Apr 1, 2026
01cf8ee
fix(ocsf): add timestamps to shorthand file layer output
johntmyers Apr 1, 2026
911cedf
fix(docker): touch openshell-ocsf source to invalidate cargo cache
johntmyers Apr 1, 2026
50c2971
fix(ocsf): add OCSF level prefix to file layer shorthand output
johntmyers Apr 1, 2026
002f4cd
fix(ocsf): clean up shorthand formatting for listen and SSH events
johntmyers Apr 1, 2026
459d0fa
docs(observability): update examples with OCSF prefix and formatting …
johntmyers Apr 1, 2026
e6dfcb0
docs(agents): add OCSF logging guidance to AGENTS.md
johntmyers Apr 1, 2026
4bff188
fix: remove workflow files accidentally included during rebase
johntmyers Apr 1, 2026
514fd2b
docs(observability): use sandbox connect instead of raw SSH
johntmyers Apr 1, 2026
d916316
fix(docs): correct settings CLI syntax in OCSF JSON export page
johntmyers Apr 1, 2026
e8579fd
fix(e2e): update log assertions for OCSF shorthand format
johntmyers Apr 1, 2026
cc16052
feat(sandbox): convert WebSocket upgrade log calls to OCSF
johntmyers Apr 2, 2026
80 changes: 80 additions & 0 deletions AGENTS.md
Expand Up @@ -35,6 +35,7 @@ These pipelines connect skills into end-to-end workflows. Individual skill files
| `crates/openshell-policy/` | Policy engine | Filesystem, network, process, and inference constraints |
| `crates/openshell-router/` | Privacy router | Privacy-aware LLM routing |
| `crates/openshell-bootstrap/` | Cluster bootstrap | K3s cluster setup, image loading, mTLS PKI |
| `crates/openshell-ocsf/` | OCSF logging | OCSF v1.7.0 event types, builders, shorthand/JSONL formatters, tracing layers |
| `crates/openshell-core/` | Shared core | Common types, configuration, error handling |
| `crates/openshell-providers/` | Provider management | Credential provider backends |
| `crates/openshell-tui/` | Terminal UI | Ratatui-based dashboard for monitoring |
Expand Down Expand Up @@ -66,6 +67,85 @@ These pipelines connect skills into end-to-end workflows. Individual skill files
- Store plan documents in `architecture/plans`. This directory is git-ignored, so it exists for easier access by humans. When asked to create Spikes or issues, you can skip to GitHub issues. Only use the plans dir when you aren't writing data somewhere else specific.
- When asked to write a plan, write it there without asking for the location.

## Sandbox Logging (OCSF)

When adding or modifying log emissions in `openshell-sandbox`, determine whether the event should use OCSF structured logging or plain `tracing`.

### When to use OCSF

Use an OCSF builder + `ocsf_emit!()` for events that represent **observable sandbox behavior** visible to operators, security teams, or agents monitoring the sandbox:

- Network decisions (allow, deny, bypass detection)
- HTTP/L7 enforcement decisions
- SSH authentication (accepted, denied, nonce replay)
- Process lifecycle (start, exit, timeout, signal failure)
- Security findings (unsafe policy, unavailable controls, replay attacks)
- Configuration changes (policy load/reload, TLS setup, inference routes, settings)
- Application lifecycle (supervisor start, SSH server ready)

### When to use plain tracing

Use `info!()`, `debug!()`, `warn!()` for **internal operational plumbing** that doesn't represent a security decision or observable state change:

- gRPC connection attempts and retries
- "About to do X" events where the result is logged separately
- Internal SSH channel state (unknown channel, PTY resize)
- Zombie process reaping, denial flush telemetry
- DEBUG/TRACE level diagnostics

### Choosing the OCSF event class

| Event type | Builder | When to use |
|---|---|---|
| TCP connections, proxy tunnels, bypass | `NetworkActivityBuilder` | L4 network decisions, proxy operational events |
| HTTP requests, L7 enforcement | `HttpActivityBuilder` | Per-request method/path decisions |
| SSH sessions | `SshActivityBuilder` | Authentication, channel operations |
| Process start/stop | `ProcessActivityBuilder` | Entrypoint lifecycle, signal failures |
| Security alerts | `DetectionFindingBuilder` | Nonce replay, bypass detection, unsafe policy. Dual-emit with the domain event. |
| Policy/config changes | `ConfigStateChangeBuilder` | Policy load, Landlock apply, TLS setup, inference routes, settings |
| Supervisor lifecycle | `AppLifecycleBuilder` | Sandbox start, SSH server ready/failed |
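
In the shorthand log, each event class from the table is rendered with a short prefix (`NET`, `HTTP`, `SSH`, and so on), as can be seen in `format_shorthand` in `crates/openshell-ocsf/src/format/shorthand.rs`. The sketch below mirrors that mapping in isolation. `EventClass` and `class_prefix` are hypothetical names used only for illustration; they are not part of `openshell-ocsf`.

```rust
/// Hypothetical stand-in for the event classes listed in the table above.
/// Variant names follow the `OcsfEvent` variants matched in `format_shorthand`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventClass {
    NetworkActivity,
    HttpActivity,
    SshActivity,
    ProcessActivity,
    DetectionFinding,
    ApplicationLifecycle,
    DeviceConfigStateChange,
    Base,
}

/// Returns the prefix that starts a shorthand line for the given class.
pub fn class_prefix(class: EventClass) -> &'static str {
    match class {
        EventClass::NetworkActivity => "NET",
        EventClass::HttpActivity => "HTTP",
        EventClass::SshActivity => "SSH",
        EventClass::ProcessActivity => "PROC",
        EventClass::DetectionFinding => "FINDING",
        EventClass::ApplicationLifecycle => "LIFECYCLE",
        EventClass::DeviceConfigStateChange => "CONFIG",
        EventClass::Base => "EVENT",
    }
}
```

Knowing the prefix makes it easier to grep for one class of events, for example every `FINDING:` line.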

### Severity guidelines

| Severity | When |
|---|---|
| `Informational` | Allowed connections, successful operations, config loaded |
| `Low` | DNS failures, non-fatal operational warnings, LOG rule failures |
| `Medium` | Denied connections, policy violations, deprecated config |
| `High` | Security findings (nonce replay, Landlock unavailable) |
| `Critical` | Process timeout kills |
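
As a rough illustration of the table above, the severity choice can be written as a single match. This is a minimal sketch, not the crate's API: `Severity`, `Outcome`, and `severity_for` are hypothetical names, and the numeric values are assumed to line up with the ids that `severity_tag` maps to `[INFO]` through `[CRIT]`.

```rust
/// Hypothetical severity enum; discriminants follow the ids used by
/// `severity_tag` (1 = INFO, 2 = LOW, 3 = MED, 4 = HIGH, 5 = CRIT).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Informational = 1,
    Low = 2,
    Medium = 3,
    High = 4,
    Critical = 5,
}

/// Hypothetical outcomes, one per row of the severity table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    ConnectionAllowed,
    DnsFailure,
    ConnectionDenied,
    NonceReplay,
    ProcessTimeoutKill,
}

/// Picks a severity for an outcome, following the guidelines table.
pub fn severity_for(outcome: Outcome) -> Severity {
    match outcome {
        Outcome::ConnectionAllowed => Severity::Informational,
        Outcome::DnsFailure => Severity::Low,
        Outcome::ConnectionDenied => Severity::Medium,
        Outcome::NonceReplay => Severity::High,
        Outcome::ProcessTimeoutKill => Severity::Critical,
    }
}
```

Keeping the mapping in one place makes it harder for similar events to drift to different severities across call sites.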

### Example: adding a new network event

```rust
use openshell_ocsf::{
ocsf_emit, NetworkActivityBuilder, ActivityId, ActionId,
DispositionId, Endpoint, Process, SeverityId, StatusId,
};

let event = NetworkActivityBuilder::new(crate::ocsf_ctx())
.activity(ActivityId::Open)
.action(ActionId::Denied)
.disposition(DispositionId::Blocked)
.severity(SeverityId::Medium)
.status(StatusId::Failure)
.dst_endpoint(Endpoint::from_domain(&host, port))
.actor_process(Process::new(&binary, pid))
.firewall_rule(&policy_name, &engine_type)
.message(format!("CONNECT denied {host}:{port}"))
.build();
ocsf_emit!(event);
```

### Key points

- `crate::ocsf_ctx()` returns the process-wide `SandboxContext`. It is always available (falls back to defaults in tests).
- `ocsf_emit!()` is non-blocking and cannot panic. It stores the event in a thread-local and emits via `tracing::info!()`.
- The shorthand layer and JSONL layer extract the event from the thread-local. The shorthand format is derived automatically from the builder fields.
- For security findings, **dual-emit**: one domain event (e.g., `SshActivityBuilder`) AND one `DetectionFindingBuilder` for the same incident.
- Never log secrets, credentials, or query parameters in OCSF messages. The OCSF JSONL file may be shipped to external systems.
- The `message` field should be a concise, grep-friendly summary. Details go in builder fields (dst_endpoint, firewall_rule, etc.).

## Sandbox Infra Changes

- If you change sandbox infrastructure, ensure `mise run sandbox` succeeds.
Expand Down
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

9 changes: 7 additions & 2 deletions crates/openshell-core/src/settings.rs
Expand Up @@ -49,8 +49,13 @@ pub struct RegisteredSetting {
/// keys are accepted.
/// 5. Add a unit test in this module's `tests` section to cover the new key.
pub const REGISTERED_SETTINGS: &[RegisteredSetting] = &[
- // Production settings go here. Add entries following the steps above.
- //
+ // When true the sandbox writes OCSF v1.7.0 JSONL records to
+ // `/var/log/openshell-ocsf*.log` (daily rotation, 3 files) in addition
+ // to the human-readable shorthand log. Defaults to false (no JSONL written).
+ RegisteredSetting {
+     key: "ocsf_json_enabled",
+     kind: SettingValueKind::Bool,
+ },
// Test-only keys live behind the `dev-settings` feature flag so they
// don't appear in production builds.
#[cfg(feature = "dev-settings")]
Expand Down
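
To make the registration above more concrete, here is a minimal sketch of how a caller might resolve the new key against the registry. It is not the code in `openshell-core`: the types are re-declared locally so the block is self-contained, `resolve_bool` is a hypothetical helper, and only the `Bool` kind from the diff is modeled.

```rust
/// Value kinds a setting may hold; only `Bool` appears in the diff above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SettingValueKind {
    Bool,
}

/// Local re-declaration of the registry entry shape shown in the diff.
pub struct RegisteredSetting {
    pub key: &'static str,
    pub kind: SettingValueKind,
}

pub const REGISTERED_SETTINGS: &[RegisteredSetting] = &[RegisteredSetting {
    key: "ocsf_json_enabled",
    kind: SettingValueKind::Bool,
}];

/// Hypothetical helper: resolves a boolean setting, defaulting to `false`
/// when no value is supplied and rejecting unregistered keys.
pub fn resolve_bool(key: &str, raw: Option<&str>) -> Result<bool, String> {
    let setting = REGISTERED_SETTINGS
        .iter()
        .find(|s| s.key == key)
        .ok_or_else(|| format!("unknown setting key: {key}"))?;
    match (setting.kind, raw) {
        (SettingValueKind::Bool, None) => Ok(false),
        (SettingValueKind::Bool, Some(v)) => v
            .parse::<bool>()
            .map_err(|_| format!("setting {key} expects true or false, got {v:?}")),
    }
}
```

With this shape, `resolve_bool("ocsf_json_enabled", None)` yields `Ok(false)`, matching the documented default of no JSONL output.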
91 changes: 68 additions & 23 deletions crates/openshell-ocsf/src/format/shorthand.rs
Expand Up @@ -36,15 +36,37 @@ pub fn severity_char(severity_id: u8) -> char {
}
}

/// Format the severity as a bracketed tag placed after the `CLASS:ACTIVITY`.
///
/// Placed as a suffix so the class name always starts at column 0, keeping
/// logs vertically scannable:
///
/// ```text
/// NET:OPEN [INFO] ALLOWED python3(42) -> api.example.com:443
/// NET:OPEN [MED] DENIED python3(42) -> blocked.com:443
/// FINDING:BLOCKED [HIGH] "NSSH1 Nonce Replay Attack"
/// ```
#[must_use]
pub fn severity_tag(severity_id: u8) -> &'static str {
match severity_id {
1 => "[INFO]",
2 => "[LOW]",
3 => "[MED]",
4 => "[HIGH]",
5 => "[CRIT]",
6 => "[FATAL]",
_ => "[INFO]",
}
}

impl OcsfEvent {
/// Produce the single-line shorthand for `openshell.log` and gRPC log push.
///
/// This is a display-only projection — the full OCSF JSON is the source of truth.
#[must_use]
pub fn format_shorthand(&self) -> String {
let base = self.base();
- let ts = format_ts(base.time);
- let sev = severity_char(base.severity.as_u8());
+ let sev = severity_tag(base.severity.as_u8());

match self {
Self::NetworkActivity(e) => {
Expand Down Expand Up @@ -85,7 +107,13 @@ impl OcsfEvent {
format!(" {actor_str} -> {dst}")
};

- format!("{ts} {sev} NET:{activity} {action}{arrow}{rule_ctx}")
+ let detail = match (action.is_empty(), arrow.is_empty()) {
+     (true, true) => String::new(),
+     (true, false) => arrow,
+     (false, true) => format!(" {action}"),
+     (false, false) => format!(" {action}{arrow}"),
+ };
+ format!("NET:{activity} {sev}{detail}{rule_ctx}")
}

Self::HttpActivity(e) => {
Expand Down Expand Up @@ -116,7 +144,13 @@ impl OcsfEvent {
format!(" {actor_str} -> {method} {url_str}")
};

- format!("{ts} {sev} HTTP:{method} {action}{arrow}{rule_ctx}")
+ let detail = match (action.is_empty(), arrow.is_empty()) {
+     (true, true) => String::new(),
+     (true, false) => arrow,
+     (false, true) => format!(" {action}"),
+     (false, false) => format!(" {action}{arrow}"),
+ };
+ format!("HTTP:{method} {sev}{detail}{rule_ctx}")
}

Self::SshActivity(e) => {
Expand All @@ -143,7 +177,21 @@ impl OcsfEvent {
})
.unwrap_or_default();

- format!("{ts} {sev} SSH:{activity} {action} {peer}{auth_ctx}")
+ let detail = [
+     if action.is_empty() { "" } else { &action },
+     if peer.is_empty() { "" } else { &peer },
+ ]
+ .iter()
+ .filter(|s| !s.is_empty())
+ .copied()
+ .collect::<Vec<_>>()
+ .join(" ");
+ let detail = if detail.is_empty() {
+     String::new()
+ } else {
+     format!(" {detail}")
+ };
+ format!("SSH:{activity} {sev}{detail}{auth_ctx}")
}

Self::ProcessActivity(e) => {
Expand All @@ -160,7 +208,7 @@ impl OcsfEvent {
.map(|c| format!(" [cmd:{c}]"))
.unwrap_or_default();

- format!("{ts} {sev} PROC:{activity} {proc_str}{exit_ctx}{cmd_ctx}")
+ format!("PROC:{activity} {sev} {proc_str}{exit_ctx}{cmd_ctx}")
}

Self::DetectionFinding(e) => {
Expand All @@ -173,7 +221,7 @@ impl OcsfEvent {
.map(|c| format!(" [confidence:{}]", c.label().to_lowercase()))
.unwrap_or_default();

- format!("{ts} {sev} FINDING:{disposition} \"{title}\"{confidence_ctx}")
+ format!("FINDING:{disposition} {sev} \"{title}\"{confidence_ctx}")
}

Self::ApplicationLifecycle(e) => {
Expand All @@ -185,7 +233,7 @@ impl OcsfEvent {
.map(|s| s.label().to_lowercase())
.unwrap_or_default();

- format!("{ts} {sev} LIFECYCLE:{activity} {app} {status}")
+ format!("LIFECYCLE:{activity} {sev} {app} {status}")
}

Self::DeviceConfigStateChange(e) => {
Expand Down Expand Up @@ -214,7 +262,7 @@ impl OcsfEvent {
})
.unwrap_or_default();

- format!("{ts} {sev} CONFIG:{state} {what}{version_ctx}")
+ format!("CONFIG:{state} {sev} {what}{version_ctx}")
}

Self::Base(e) => {
Expand All @@ -240,7 +288,7 @@ impl OcsfEvent {
})
.unwrap_or_default();

- format!("{ts} {sev} EVENT {message}{unmapped_ctx}")
+ format!("EVENT {sev} {message}{unmapped_ctx}")
}
}
}
Expand Down Expand Up @@ -337,7 +385,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I NET:OPEN ALLOWED python3(42) -> api.example.com:443 [policy:default-egress engine:mechanistic]"
+ "NET:OPEN [INFO] ALLOWED python3(42) -> api.example.com:443 [policy:default-egress engine:mechanistic]"
);
}

Expand Down Expand Up @@ -366,7 +414,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 M NET:REFUSE DENIED node(1234) -> 93.184.216.34:443/tcp [policy:bypass-detect engine:iptables]"
+ "NET:REFUSE [MED] DENIED node(1234) -> 93.184.216.34:443/tcp [policy:bypass-detect engine:iptables]"
);
}

Expand Down Expand Up @@ -395,7 +443,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I HTTP:GET ALLOWED curl(88) -> GET https://api.example.com/v1/data [policy:default-egress]"
+ "HTTP:GET [INFO] ALLOWED curl(88) -> GET https://api.example.com/v1/data [policy:default-egress]"
);
}

Expand All @@ -416,7 +464,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I SSH:OPEN ALLOWED 10.42.0.1:48201 [auth:NSSH1]"
+ "SSH:OPEN [INFO] ALLOWED 10.42.0.1:48201 [auth:NSSH1]"
);
}

Expand All @@ -435,7 +483,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I PROC:LAUNCH python3(42) [cmd:python3 /app/main.py]"
+ "PROC:LAUNCH [INFO] python3(42) [cmd:python3 /app/main.py]"
);
}

Expand All @@ -459,10 +507,7 @@ mod tests {
});

let shorthand = event.format_shorthand();
- assert_eq!(
-     shorthand,
-     "14:00:00.000 I PROC:TERMINATE python3(42) [exit:0]"
- );
+ assert_eq!(shorthand, "PROC:TERMINATE [INFO] python3(42) [exit:0]");
}

#[test]
Expand All @@ -487,7 +532,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 H FINDING:BLOCKED \"NSSH1 Nonce Replay Attack\" [confidence:high]"
+ "FINDING:BLOCKED [HIGH] \"NSSH1 Nonce Replay Attack\" [confidence:high]"
);
}

Expand All @@ -514,7 +559,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I LIFECYCLE:START openshell-sandbox success"
+ "LIFECYCLE:START [INFO] openshell-sandbox success"
);
}

Expand All @@ -536,7 +581,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I CONFIG:LOADED policy reloaded [version:v3 hash:sha256:abc123def456]"
+ "CONFIG:LOADED [INFO] policy reloaded [version:v3 hash:sha256:abc123def456]"
);
}

Expand All @@ -551,7 +596,7 @@ mod tests {
let shorthand = event.format_shorthand();
assert_eq!(
shorthand,
- "14:00:00.000 I EVENT Network namespace created [ns:openshell-sandbox-abc123]"
+ "EVENT [INFO] Network namespace created [ns:openshell-sandbox-abc123]"
);
}
}
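
The new line layout (class first, then the bracketed severity, then optional detail) can be tried out in isolation. The sketch below reuses the `severity_tag` mapping from the diff and adds `net_shorthand`, a hypothetical free function that mirrors only the final assembly step of the `NetworkActivity` arm. As in the real code, `arrow` and `rule_ctx` are assumed to carry their own leading space when non-empty.

```rust
/// Bracketed severity tag; same mapping as `severity_tag` in the diff above.
pub fn severity_tag(severity_id: u8) -> &'static str {
    match severity_id {
        1 => "[INFO]",
        2 => "[LOW]",
        3 => "[MED]",
        4 => "[HIGH]",
        5 => "[CRIT]",
        6 => "[FATAL]",
        _ => "[INFO]",
    }
}

/// Hypothetical helper mirroring the assembly step of the `NET` arm.
/// Empty segments are skipped so the line never ends in a stray space.
pub fn net_shorthand(
    activity: &str,
    severity_id: u8,
    action: &str,
    arrow: &str,
    rule_ctx: &str,
) -> String {
    let sev = severity_tag(severity_id);
    let detail = match (action.is_empty(), arrow.is_empty()) {
        (true, true) => String::new(),
        (true, false) => arrow.to_string(),
        (false, true) => format!(" {action}"),
        (false, false) => format!(" {action}{arrow}"),
    };
    format!("NET:{activity} {sev}{detail}{rule_ctx}")
}
```

The explicit `match` matters for events with no action or peer: the earlier format string always emitted a space before `{action}`, so an empty action left a trailing space after the activity.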
4 changes: 3 additions & 1 deletion crates/openshell-ocsf/src/lib.rs
Expand Up @@ -62,4 +62,6 @@ pub use builders::{
};

// --- Tracing layers ---
- pub use tracing_layers::{OcsfJsonlLayer, OcsfShorthandLayer, emit_ocsf_event};
+ pub use tracing_layers::{
+     OCSF_TARGET, OcsfJsonlLayer, OcsfShorthandLayer, clone_current_event, emit_ocsf_event,
+ };