-
Notifications
You must be signed in to change notification settings - Fork 175
codec and x.509 hardening #2323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
4e5982c
b3f9c3a
81804a3
8f43237
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,6 @@ | ||
| target/ | ||
| **/*.rs.bk | ||
| x509-cert/tests/limbo/limbo.json | ||
|
|
||
| # CLion IDE | ||
| .idea | ||
|
|
||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| [package] | ||
| name = "ct-disasm" | ||
| version = "0.0.0" | ||
| edition = "2024" | ||
| publish = false | ||
|
|
||
| # This crate exists only to anchor the constant-time-regression disassembly | ||
| # check (`./check.sh`). It is NOT part of the public API surface and is not | ||
| # published. | ||
|
|
||
| [lib] | ||
| # rlib so we don't need a panic_handler / std runtime; we only consume the | ||
| # emitted asm, never link a final image. | ||
| crate-type = ["lib"] | ||
|
|
||
| [dependencies] | ||
| base16ct = { path = "../../base16ct", default-features = false } | ||
| base32ct = { path = "../../base32ct", default-features = false } | ||
| base64ct = { path = "../../base64ct", default-features = false } | ||
|
|
||
| [profile.release] | ||
| # Match the profile downstream consumers actually ship. | ||
| opt-level = 3 | ||
| lto = "thin" | ||
| codegen-units = 1 | ||
| debug = 1 | ||
| overflow-checks = false |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,119 @@ | ||
| # ct-disasm — constant-time disassembly check (research preview) | ||
|
|
||
| A regression-detection tool for the constant-time hot paths in `base16ct`, | ||
| `base32ct`, and `base64ct`. Compiles a small wrapper crate that drives each | ||
|
Comment on lines
+3
to
+4
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh hmm, if it's specialized to these I guess it can go here |
||
| crate's public encode/decode API on a fixed-length buffer, dumps the | ||
| emitted assembly via `rustc --emit=asm`, and reports per-wrapper | ||
| conditional-branch counts. Designed to be diffed against a snapshotted | ||
| baseline as a CI gate. | ||
|
|
||
| ## Why this exists | ||
|
|
||
| The CT-critical inner functions (`decode_nibble`, `decode_5bits`, | ||
| `decode_6bits`, `is_pad_ct`) are `#[inline(always)]` and not exported. | ||
| Checking them in isolation isn't useful — what matters is what they look | ||
| like *after* LLVM has inlined them into the public API path that real | ||
| callers use. This crate places `#[inline(never)]` `extern "C"` shims around | ||
| those public APIs, with input/output buffers sized as compile-time | ||
| constants, so length-dependent branches in the public API collapse and | ||
| the asm scanner sees the actual inlined CT machinery. | ||
|
|
||
| A constant-time-analysis audit of these crates (Trail of Bits) confirmed | ||
| the source-level CT idioms are correct. This tool exists to catch | ||
| **compiler regressions** — a future LLVM that decides to lower one of the | ||
| arithmetic-mask idioms back into a branch. | ||
|
|
||
| ## Usage | ||
|
|
||
| ```sh | ||
| # Native target (auto-detects): | ||
| ./check.sh | ||
|
|
||
| # Explicit target (requires `rustup target add <triple>`): | ||
| ./check.sh --target x86_64-unknown-linux-gnu | ||
|
|
||
| # Machine-readable output for use as a baseline / regression diff: | ||
| ./check.sh --baseline > baseline.txt | ||
| git diff baseline.txt # any change → manual review | ||
| ``` | ||
|
|
||
| No external tools beyond a Rust toolchain plus standard `awk` / `grep` / | ||
| `find` / `sed`. | ||
|
|
||
| ## Current baseline (aarch64-apple-darwin, rustc 1.85) | ||
|
|
||
| ```text | ||
| ct_disasm_base16_lower_decode branches=0 lines=164 | ||
| ct_disasm_base16_lower_encode branches=0 lines=43 | ||
| ct_disasm_base16_upper_decode branches=0 lines=164 | ||
| ct_disasm_base16_upper_encode branches=0 lines=43 | ||
| ct_disasm_base16_mixed_decode branches=0 lines=225 | ||
| ct_disasm_base32_lower_decode branches=20 lines=426 | ||
| ct_disasm_base32_lower_encode branches=0 lines=252 | ||
| ct_disasm_base32_upper_decode branches=20 lines=426 | ||
| ct_disasm_base64_padded_decode branches=14 lines=369 | ||
| ct_disasm_base64_padded_encode branches=0 lines=30 | ||
| ct_disasm_base64_unpadded_decode branches=3 lines=516 | ||
| ``` | ||
|
|
||
| Reading this: | ||
|
|
||
| * **`branches=0` is the strongest signal.** Five of `base16ct`'s wrappers | ||
| and two of `base64ct`/`base32ct`'s are fully branch-free at the chosen | ||
| fixed size. Any future regression that flips one of these to nonzero | ||
| is a CT-violation candidate that needs immediate review. | ||
| * **`branches > 0` does NOT mean a CT bug exists today.** The remaining | ||
| branches are length-dependent (chunk-loop iteration boundaries, | ||
| `if src_rem.len() >= N` tail handling) or panic-trampoline branches | ||
| (bounds checks LLVM didn't fold). The crate documentation explicitly | ||
| states timing depends on message *length*, not content. The number is | ||
| a fingerprint of how LLVM chose to lower the code. | ||
| * **The metric is whether the count, and the structure of branch targets, | ||
| changes between revisions.** A toolchain bump that drops 20 branches | ||
| to 18 is fine. A source change that turns 0 into 1 demands review. A | ||
| source change that turns 14 into 14 with different target labels also | ||
| demands review (LLVM may have rewritten the same control flow into | ||
| different but equivalent shape — or it may have introduced a new | ||
| data-dependent branch). | ||
|
|
||
| ## Layout | ||
|
|
||
| * `src/lib.rs` — `#[no_mangle] extern "C"` shims around the public APIs. | ||
| Buffer sizes are fixed at compile time and inputs/outputs are routed | ||
| through `core::hint::black_box` to anchor the optimizer. | ||
| * `check.sh` — bash script: builds, locates the emitted `.s`, scans each | ||
| wrapper body delimited by `.cfi_endproc`, counts conditional branches. | ||
| * `Cargo.toml` — workspace member; depends on the three CT crates as | ||
| in-tree path deps. | ||
|
|
||
| ## Known limitations | ||
|
|
||
| 1. **Length-dependent and panic-trampoline branches are counted.** These | ||
| are not CT bugs. To filter them out automatically the script would | ||
| need to walk the asm CFG and identify branch targets that are panic | ||
| blocks (`bl ___rust_alloc_error_handler`, `udf`, `brk`, etc.) or | ||
| non-cold loop-iteration blocks. Out of scope for this preview. | ||
| 2. **Conditional moves (`cmov` / `csel`) are not flagged.** They're | ||
| branch-free at the pipeline level. If a hyper-strict posture is | ||
| wanted, extend `BRANCH_REGEX` in `check.sh`. | ||
| 3. **Memory-access-pattern leaks aren't caught.** A function that | ||
| indexed a table by a secret byte would still register as | ||
| `branches=0` here. The audited crates avoid such patterns by | ||
| construction; this tool doesn't reverify that. | ||
| 4. **Some `base64ct` variants aren't wrapped** (`Base64Url`, | ||
| `Base64Bcrypt`, `Base64Crypt`, `Base64ShaCrypt`, `Base64Pbkdf2`). | ||
| They share the `Alphabet::decode_6bits` / `encode_6bits` machinery | ||
| with `Base64`, so the existing wrappers transitively exercise the | ||
| same code. If a variant ever diverges, add a wrapper. | ||
| 5. **Not yet wired into CI.** To promote, add a step that runs | ||
| `./dev-tools/ct-disasm/check.sh --baseline | diff - baseline.txt` | ||
| and fails on any non-empty diff. | ||
|
|
||
| ## Suggested next steps | ||
|
|
||
| * Snapshot a `baseline.{aarch64,x86_64}.txt` on a known-good revision and | ||
| commit it. Add a CI step that diffs the two. | ||
| * Add panic-target filtering to `check.sh` so it can become a true | ||
| pass/fail gate rather than a baseline-diff gate. | ||
| * Extend the wrapper list to `serdect` once that crate's public API is | ||
| also fixed-length-callable. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| ct_disasm_base16_lower_decode 0 164 | ||
| ct_disasm_base16_lower_encode 0 43 | ||
| ct_disasm_base16_upper_decode 0 164 | ||
| ct_disasm_base16_upper_encode 0 43 | ||
| ct_disasm_base16_mixed_decode 0 225 | ||
| ct_disasm_base32_lower_decode 20 426 | ||
| ct_disasm_base32_lower_encode 0 252 | ||
| ct_disasm_base32_upper_decode 20 426 | ||
| ct_disasm_base64_padded_decode 14 369 | ||
| ct_disasm_base64_padded_encode 0 30 | ||
| ct_disasm_base64_unpadded_decode 3 516 |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,130 @@ | ||||||
| #!/bin/bash | ||||||
| # Constant-time regression check (research preview). | ||||||
| # | ||||||
| # Builds the `ct-disasm` wrapper crate at release optimization, dumps the | ||||||
| # emitted assembly via `rustc --emit=asm`, and reports per-wrapper | ||||||
| # conditional-branch counts. The wrapper crate places `#[inline(never)]` | ||||||
| # `extern "C"` shims around the public `base16ct` / `base32ct` / `base64ct` | ||||||
| # encode/decode entry points so that the `#[inline(always)]` CT-critical | ||||||
| # helpers (`decode_nibble`, `decode_5bits`, `decode_6bits`, `is_pad_ct`) | ||||||
| # get inlined into our wrapper, where the asm scanner can see them. | ||||||
| # | ||||||
| # IMPORTANT — current limitation: | ||||||
| # The reported branch count includes BOTH genuine data-dependent branches | ||||||
| # AND panic-trampoline branches that LLVM emits for things like bounds | ||||||
| # checks and integer overflow checks on cold paths. Filtering the latter | ||||||
| # out automatically requires walking the asm CFG to identify which targets | ||||||
| # are panic blocks. This prototype does NOT do that. Use the script as: | ||||||
| # | ||||||
| # 1. Baseline: run it once on a known-good revision, snapshot the per- | ||||||
| # wrapper counts (`./check.sh --baseline > baseline.txt`). | ||||||
| # 2. Regression gate: rerun and diff against the baseline. Any *increase* | ||||||
| # in count, or change in branch-target structure, warrants manual | ||||||
| # asm review. | ||||||
| # | ||||||
| # Usage: | ||||||
| # ./check.sh # report counts | ||||||
| # ./check.sh --baseline # machine-readable counts | ||||||
| # ./check.sh --baseline > b.txt | ||||||
| # diff b.txt baseline.txt # regression gate | ||||||
|
|
||||||
| set -euo pipefail | ||||||
| cd "$(dirname "$0")" | ||||||
|
|
||||||
| MODE="report" | ||||||
| TARGET="" | ||||||
| while [[ $# -gt 0 ]]; do | ||||||
| case "$1" in | ||||||
| --baseline) MODE="baseline"; shift ;; | ||||||
| --target) TARGET="$2"; shift 2 ;; | ||||||
| *) echo "unknown arg: $1" >&2; exit 2 ;; | ||||||
| esac | ||||||
| done | ||||||
|
|
||||||
| if [[ -z "$TARGET" ]]; then | ||||||
| HOST_OS=$(uname -s) | ||||||
| HOST_ARCH=$(uname -m) | ||||||
| case "${HOST_OS}-${HOST_ARCH}" in | ||||||
| Darwin-arm64) TARGET="aarch64-apple-darwin" ;; | ||||||
| Darwin-x86_64) TARGET="x86_64-apple-darwin" ;; | ||||||
| Linux-aarch64) TARGET="aarch64-unknown-linux-gnu" ;; | ||||||
| Linux-x86_64) TARGET="x86_64-unknown-linux-gnu" ;; | ||||||
| *) echo "unknown host: ${HOST_OS}-${HOST_ARCH}; pass --target" >&2; exit 2 ;; | ||||||
| esac | ||||||
| fi | ||||||
|
|
||||||
| case "$TARGET" in | ||||||
| aarch64-*|arm64-*) ISA="arm64" ;; | ||||||
| x86_64-*) ISA="x86" ;; | ||||||
| *) echo "unsupported target $TARGET" >&2; exit 2 ;; | ||||||
| esac | ||||||
|
|
||||||
| # Conditional branches only (no unconditional jumps, no conditional moves). | ||||||
| case "$ISA" in | ||||||
| arm64) | ||||||
| BRANCH_REGEX='b\.(eq|ne|cs|hs|cc|lo|mi|pl|vs|vc|hi|ls|ge|lt|gt|le)|cbz|cbnz|tbz|tbnz' | ||||||
| ;; | ||||||
| x86) | ||||||
| BRANCH_REGEX='j(e|ne|z|nz|l|le|g|ge|a|ae|b|be|c|nc|o|no|p|np|s|ns|cxz|ecxz|rcxz)' | ||||||
| ;; | ||||||
| esac | ||||||
|
|
||||||
| DEPS_DIR="../../target/$TARGET/release/deps" | ||||||
| rm -f "$DEPS_DIR"/ct_disasm-*.s 2>/dev/null || true | ||||||
| touch src/lib.rs | ||||||
| cargo rustc --release --target "$TARGET" --quiet -- --emit=asm | ||||||
|
|
||||||
| ASM_FILE=$(find "$DEPS_DIR" -maxdepth 1 -name 'ct_disasm-*.s' | head -n1) | ||||||
| if [[ -z "$ASM_FILE" || ! -f "$ASM_FILE" ]]; then | ||||||
| echo "FAIL: could not locate emitted asm under $DEPS_DIR" >&2 | ||||||
| exit 1 | ||||||
| fi | ||||||
|
|
||||||
| # Wrapper symbol names — keep in sync with src/lib.rs. | ||||||
| WRAPPERS=( | ||||||
| ct_disasm_base16_lower_decode | ||||||
| ct_disasm_base16_lower_encode | ||||||
| ct_disasm_base16_upper_decode | ||||||
| ct_disasm_base16_upper_encode | ||||||
| ct_disasm_base16_mixed_decode | ||||||
| ct_disasm_base32_lower_decode | ||||||
| ct_disasm_base32_lower_encode | ||||||
| ct_disasm_base32_upper_decode | ||||||
| ct_disasm_base64_padded_decode | ||||||
| ct_disasm_base64_padded_encode | ||||||
| ct_disasm_base64_unpadded_decode | ||||||
| ) | ||||||
|
|
||||||
| if [[ "$MODE" == "report" ]]; then | ||||||
| echo "=== ct-disasm: target=$TARGET isa=$ISA ===" | ||||||
| echo "asm: $ASM_FILE ($(wc -l <"$ASM_FILE") lines)" | ||||||
| fi | ||||||
|
|
||||||
| for W in "${WRAPPERS[@]}"; do | ||||||
| # Mach-O prefixes symbols with `_`. Try both. | ||||||
| BODY=$(awk -v sym="$W" ' | ||||||
| BEGIN { in_body = 0 } | ||||||
| # Match either `<sym>:` or `_<sym>:` at the start of a line. | ||||||
| $0 ~ ("^_?" sym ":$") { in_body = 1; next } | ||||||
| # Stop on .cfi_endproc — emitted by both Mach-O and ELF assemblers | ||||||
| # at the end of every function. This is more reliable than scanning | ||||||
| # for the next label, since the asm contains many local debug | ||||||
| # labels (Lfunc_begin0, Ltmp1, etc.) inside a function body. | ||||||
| in_body && /\.cfi_endproc/ { in_body = 0 } | ||||||
| in_body { print } | ||||||
| ' "$ASM_FILE") | ||||||
|
|
||||||
| if [[ -z "$BODY" ]]; then | ||||||
| echo "WARN: $W not found in asm" >&2 | ||||||
| continue | ||||||
| fi | ||||||
|
|
||||||
| COUNT=$(printf '%s\n' "$BODY" | grep -cE "(^|[[:space:]])($BRANCH_REGEX)([[:space:]]|$)" || true) | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Am I reading this regexp wrong? Surely there is a better way to read assembly than a regex anyway? That seems like it's looking for false-positives. |
||||||
| LINES=$(printf '%s\n' "$BODY" | wc -l | tr -d ' ') | ||||||
|
|
||||||
| if [[ "$MODE" == "baseline" ]]; then | ||||||
| printf '%s\t%s\t%s\n' "$W" "$COUNT" "$LINES" | ||||||
| else | ||||||
| printf ' %-40s branches=%-3s lines=%s\n' "$W" "$COUNT" "$LINES" | ||||||
| fi | ||||||
| done | ||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For something like this it would probably be better to put it in https://github.com/rustcrypto/utils though it does look interesting