```sh
./build.sh
```
The build script auto-detects LLVM/Clang using Homebrew (macOS) or `llvm-config` (Linux). If detection fails, set `LLVM_DIR` and `Clang_DIR`.
Options:
- `--build-dir <dir>` (default: `build`)
- `--type <Release|Debug|RelWithDebInfo>` (default: `Release`)
- `--generator <Ninja|Unix Makefiles>`
- `--jobs <n>`
- `--llvm-dir <path>` / `--clang-dir <path>`
- `--clean`
- `--configure-only`
Examples:
```sh
./build.sh --type Release
./build.sh --type Debug --build-dir out/build
LLVM_DIR=/opt/llvm/lib/cmake/llvm Clang_DIR=/opt/llvm/lib/cmake/clang ./build.sh --generator Ninja
```
For CI usage as a code analyzer, use a two-layer setup:
- `stack_usage_analyzer` remains the analysis engine.
- `scripts/ci/run_code_analysis.py` is the CI adapter (report export + policy gate).
Why this architecture:
- The analyzer stays CI-agnostic and reusable everywhere (CLI, local scripts, CI).
- CI policy (`fail-on=error|warning|none`) is isolated in one place.
- It provides stable artifacts for platforms (JSON + SARIF) without changing analyzer core logic.
Quick example (same repository):
```sh
python3 scripts/ci/run_code_analysis.py \
  --analyzer ./build/stack_usage_analyzer \
  --compdb ./build/compile_commands.json \
  --fail-on error \
  --json-out artifacts/stack-usage.json \
  --sarif-out artifacts/stack-usage.sarif
```
GitHub Actions consumer examples are available at:
- `docs/ci/github-actions-consumer.yml`
- `docs/ci/github-actions-module-consumer.yml` (consume this repo directly via `uses:`)
- Analyzer architecture notes: `docs/architecture/analyzer-modules.md`
If you publish tags for this repository, other projects can consume it directly:
```yaml
name: Stack Analysis
on:
  pull_request:
  workflow_dispatch:
jobs:
  analyze:
    runs-on: ubuntu-24.04
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Generate compile_commands.json
        run: cmake -S . -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
      - name: Run CoreTrace action module
        uses: CoreTrace/coretrace-stack-analyzer@v0
        with:
          compile-commands: build/compile_commands.json
          analysis-profile: fast
          resource-model: default
          resource-cache-memory-only: "true"
          fail-on: error
          sarif-file: artifacts/coretrace-stack-analysis.sarif
          json-file: artifacts/coretrace-stack-analysis.json
          upload-sarif: "true"
```
Notes:
- SARIF is generated by default (`sarif-file`) and can be uploaded automatically (`upload-sarif: "true"`).
- If `compile-commands` is not provided, the action tries common locations: `build/compile_commands.json`, `compile_commands.json`, `.coretrace/build-linux/compile_commands.json`.
- If no compile database is found, it can fall back to git-tracked sources (`inputs-from-git-fallback`, enabled by default).
When you want a reusable analyzer image in CI (instead of rebuilding the tool each run), build and publish:
- `Dockerfile`: analyzer runtime image with sensible defaults for full-repo analysis.
- `Dockerfile.ci`: CI gate image (entrypoint = `run_code_analysis.py`).
Default behavior of the Dockerfile runtime entrypoint:
- auto-detects `compile_commands.json` from `/workspace/build/compile_commands.json` (fallback: `/workspace/compile_commands.json`)
- `--analysis-profile=fast`
- `--compdb-fast` (drops heavy/platform-specific compile flags from the compile DB)
- `--resource-summary-cache-memory-only`
- `--resource-model=/models/resource-lifetime/generic.txt`
- if `compile_commands.json` contains stale absolute paths (e.g. `/tmp/evan/...`) while the repo is mounted at `/workspace`, a compatibility symlink is created automatically when safe (so analysis can still run without extra Docker flags)
The runtime image is intentionally analyzer-only (toolchain/runtime plus analyzer models). Project-specific SDKs/headers must be installed in the target CI job or in a derived image.
Simple local run (analyze the whole repo from the compile database):
```sh
docker build -t coretrace-stack-analyzer .
docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer
```
Override defaults:
```sh
docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer \
  --analysis-profile=full \
  --warnings-only
```
Bypass defaults entirely:
```sh
docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer --raw --help
```
Build and push:
```sh
docker build -f Dockerfile.ci \
  --build-arg VERSION=0.1.0 \
  --build-arg VCS_REF="$(git rev-parse --short HEAD)" \
  -t ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0 .
docker push ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0
```
Run in CI (entrypoint already targets `run_code_analysis.py`):
```sh
docker run --rm \
  -u "$(id -u):$(id -g)" \
  -v "$PWD:/workspace" -w /workspace \
  ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0 \
  --inputs-from-git --repo-root /workspace \
  --compdb /workspace/build/compile_commands.json \
  --analyzer-arg=--analysis-profile=fast \
  --analyzer-arg=--resource-summary-cache-memory-only \
  --analyzer-arg=--resource-model=/models/resource-lifetime/generic.txt \
  --exclude _deps/ \
  --base-dir /workspace \
  --fail-on error \
  --print-diagnostics warning \
  --json-out /workspace/artifacts/stack-usage.json \
  --sarif-out /workspace/artifacts/stack-usage.sarif
```
Most C/C++ repos generate `compile_commands.json` during CMake configure:
```sh
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
```
Then run the Docker image with `/workspace/build/compile_commands.json`.
Important for CI reliability:
- Generate `compile_commands.json` in the same OS/toolchain family as the analyzer run.
- Reusing a macOS compile database in Linux CI often fails (`-arch`, `-isysroot`, Apple SDK paths).
- `--compdb-fast` improves portability by dropping many heavy/platform-specific flags, but cannot replace missing third-party headers/SDKs.
- If your project needs extra dependencies, extend the analyzer image:
```dockerfile
FROM ghcr.io/<org>/coretrace-stack-analyzer:0.1.0
RUN apt-get update && apt-get install -y --no-install-recommends \
    <project-dev-packages> \
    && rm -rf /var/lib/apt/lists/*
```
Ready-to-adapt workflow examples:
- non-Docker consumer: `docs/ci/github-actions-consumer.yml`
- Docker consumer: `docs/ci/github-actions-docker-consumer.yml`
- Target version: `clang-format` 20 (used in CI).
- Format locally: `./scripts/format.sh`
- Check without modifying: `./scripts/format-check.sh`
- CMake: `cmake --build build --target format` or `--target format-check`
- CI: the GitHub Actions `clang-format` job fails if a file is not formatted.
```sh
./stack_usage_analyzer --mode=[abi/ir] test.[ll/c/cpp] other.[ll/c/cpp]
./stack_usage_analyzer main.cpp -I./include
./stack_usage_analyzer main.cpp -I./include --compile-arg=-I/opt/homebrew/opt/llvm@20/include
./stack_usage_analyzer main.cpp --compile-commands=build/compile_commands.json
./stack_usage_analyzer main.cpp -I./include --only-file=./main.cpp --only-function=main
./stack_usage_analyzer main.cpp --dump-ir=./debug/main.ll
./stack_usage_analyzer a.c b.c --dump-ir=./debug
```
--format=json|sarif|human
--analysis-profile=fast|full selects analysis precision/performance profile (default: full)
--quiet disables diagnostics entirely
--warnings-only hides info-level diagnostics; in human output it also lists only functions with warnings/errors
--stack-limit=<value> overrides stack limit (bytes, or KiB/MiB/GiB)
--compile-arg=<arg> passes an extra argument to the compiler
--compile-commands=<path> uses compile_commands.json (file or directory)
--compdb=<path> alias for --compile-commands
--compdb-fast drops heavy build flags for faster analysis
--include-compdb-deps includes `_deps` entries when inputs are auto-discovered from compile_commands.json
--jobs=<N> parallel jobs for multi-file loading/analysis and cross-TU resource summary build (default: 1)
--escape-model=<path> loads external noescape rules for stack pointer escape analysis (`noescape_arg`)
--resource-model=<path> loads external acquire/release rules for generic resource lifetime checks
--resource-cross-tu enables cross-TU resource summaries for resource lifetime analysis (default: on)
--no-resource-cross-tu disables cross-TU resource summaries
--resource-summary-cache-dir=<path> sets cache directory for cross-TU resource summaries (default: .cache/resource-lifetime)
--resource-summary-cache-memory-only keeps cross-TU summary cache in memory only (process-local, no files)
--timing prints compile/analysis timings to stderr
--dump-ir=<path> writes LLVM IR to a file (or directory for multiple inputs)
-I<dir> or -I <dir> adds an include directory
-D<name>[=value] or -D <name>[=value] defines a macro
--only-file=<path> or --only-file <path> filters by file
--only-dir=<path> or --only-dir <path> filters by directory
--exclude-dir=<dir0,dir1> excludes input files under one or more directories
--only-function=<name> or --only-function <name> filters by function
--only-func=<name> alias for --only-function
--STL includes STL/system library functions (default excludes them)
--dump-filter prints filter decisions (stderr)
To generate compile_commands.json with CMake, configure with
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON and point to the resulting file
(often under build/).
If analysis feels slow, --compdb-fast disables heavy flags (optimizations,
sanitizers, profiling) while keeping include paths and macros.
For multi-file runs, --jobs=<N> parallelizes input loading; with resource lifetime cross-TU enabled it also parallelizes summary construction.
When inputs are auto-discovered from compile_commands.json, _deps entries are skipped by default
to keep analysis focused on project code; use --include-compdb-deps to opt back in.
Use --analysis-profile=full (default) or --analysis-profile=fast.
Examples:
```sh
./build/stack_usage_analyzer --compile-commands=build/compile_commands.json --analysis-profile=fast
./build/stack_usage_analyzer --compile-commands=build/compile_commands.json --analysis-profile=full
```
For `fast`:
- For `StackBufferOverflow` and `MultipleStores` checks, functions bigger than 1200 IR instructions are skipped.
- `StackBufferOverflow` analyzes at most 16 `getelementptr` sites per function.
- `MultipleStores` analyzes at most 32 `store` sites per function.
- Alias backtracking through pointer stores is disabled for these two checks.
- Result: significantly faster runs, with possible false negatives on very large/complex functions.

For `full`:
- No instruction-count skip for these checks.
- No per-function GEP/store budget limit.
- Full alias backtracking is enabled for these checks.
- Result: better coverage/precision, but potentially much slower on large translation units.
When inputs are auto-discovered from compile_commands.json and multiple files are analyzed,
the CLI auto-selects fast unless you explicitly pass --analysis-profile=full.
If you embed the analyzer as a library and still want to reuse analyzer-style
arguments (--mode=..., --jobs=..., etc.), use the CLI parser bridge:
- `ctrace::stack::cli::parseArguments(const std::vector<std::string>&)`
- `ctrace::stack::cli::parseCommandLine(const std::string&)`
Example:
```cpp
#include "cli/ArgParser.hpp"

auto parsed = ctrace::stack::cli::parseCommandLine(
    "--mode=abi --analysis-profile=fast --warnings-only --jobs=4"
);
if (parsed.status == ctrace::stack::cli::ParseStatus::Error) {
    // handle parsed.error
}
ctrace::stack::AnalysisConfig cfg = parsed.parsed.config;
```
This keeps a single source of truth for option semantics between CLI and library consumers.
When --compile-commands is provided and no input file is passed on the CLI,
the analyzer automatically uses compile_commands.json as the source of truth:
- it analyzes supported entries (`.c`, `.cc`, `.cpp`, `.cxx`, `.ll`)
- it skips unsupported entries (e.g. Objective-C `.m`) with an explicit status line
- it skips `_deps` entries by default (override with `--include-compdb-deps`)
- duplicate file entries are merged deterministically, preferring the most informative command
- translation units with no analyzable functions are reported as informational skips (not fatal errors)
- `--exclude-dir` is applied before analysis to skip selected directory trees (works with explicit inputs and compdb-driven inputs)
The analyzer can detect:
- missing release in a function (`ResourceLifetime.MissingRelease`, CWE-772)
- double release in a function (`ResourceLifetime.DoubleRelease`, CWE-415)
- constructor acquisition not released in destructor for class fields (`ResourceLifetime.MissingDestructorRelease`, CWE-772)
Why this architecture:
- API ownership semantics are defined in an external model file instead of hardcoded rules.
- The same analysis engine stays reusable across libraries (Vulkan, file handles, sockets, custom APIs).
- Extending coverage does not require modifying analyzer core logic.
- Cross-TU summaries propagate ownership effects across translation units without requiring whole-program linking.
- Incremental summary caching keeps multi-file analysis scalable in CI by reusing unchanged module summaries.
Cross-TU summary behavior:
- Active when `--resource-model` is provided and multiple input files are analyzed.
- `--resource-cross-tu` keeps this behavior enabled (default).
- `--no-resource-cross-tu` forces local-only (single-file) resource reasoning.
- `--resource-summary-cache-dir=<path>` controls where per-module summary cache files are stored.
- `--resource-summary-cache-memory-only` disables filesystem cache writes and uses an in-process cache only.
- `--jobs=<N>` parallelizes module loading/compilation and per-module summary extraction during each fixpoint iteration.
- The CLI prints an explicit status line to stderr indicating whether resource inter-procedural analysis is enabled or unavailable/disabled (with the reason).
- If a local release depends on an unmodeled/external callee and no summary is available, the tool emits `ResourceLifetime.IncompleteInterproc` as a warning to make precision limits visible.
Model format (`--resource-model=<path>`):
```
acquire_out <function-pattern> <out-arg-index> <resource-kind>
acquire_ret <function-pattern> <resource-kind>
release_arg <function-pattern> <arg-index> <resource-kind>
```
Function pattern matching supports exact names and glob patterns (*, ?, [ ... ]) and
is applied to symbol names and demangled names.
Example model:
```
acquire_out acquire_handle 0 GenericHandle
release_arg release_handle 0 GenericHandle
```
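Globs let one rule cover a whole API family. A hypothetical model for a fictional `my_lib`, whose `open` functions return handles and whose `close` functions release their first argument:

```
acquire_ret my_lib_open_* FileLike
release_arg my_lib_close_* 0 FileLike
```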
Example runs:
```sh
./build/stack_usage_analyzer \
  test/resource-lifetime/local-missing-release.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --warnings-only

./build/stack_usage_analyzer \
  test/resource-lifetime/cross-tu-wrapper-def.c \
  test/resource-lifetime/cross-tu-wrapper-use.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --resource-summary-cache-memory-only \
  --warnings-only

./build/stack_usage_analyzer \
  test/resource-lifetime/cross-tu-wrapper-def.c \
  test/resource-lifetime/cross-tu-wrapper-use.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --resource-summary-cache-dir=.cache/resource-lifetime \
  --warnings-only
```
For test files, `run_test.py` also supports per-file model selection with `// resource-model: <path>`.
Given this code:
```c
#define SIZE_LARGE 8192000000
#define SIZE_SMALL (SIZE_LARGE / 2)

int main(void)
{
    char test[SIZE_SMALL];
    return 0;
}
```
You can pass either the `.c` file or the corresponding `.ll` file to the analyzer. You may receive the following output:
```
Language: C
Compiling source file to LLVM IR...
Mode: ABI
Function: main
local stack: 4096000016 bytes
max stack (including callees): 4096000016 bytes
[!] potential stack overflow: exceeds limit of 8388608 bytes
```
Given this code:
```c
int foo(void)
{
    char test[8192000000];
    return 0;
}

int bar(void)
{
    return 0;
}

int main(void)
{
    foo();
    bar();
    return 0;
}
```
Depending on the selected `--mode`, you may obtain the following results:
```
Language: C
Compiling source file to LLVM IR...
Mode: ABI
Function: foo
local stack: 8192000000 bytes
max stack (including callees): 8192000000 bytes
[!] potential stack overflow: exceeds limit of 8388608 bytes
Function: bar
local stack: 16 bytes
max stack (including callees): 16 bytes
Function: main
local stack: 32 bytes
max stack (including callees): 8192000032 bytes
[!] potential stack overflow: exceeds limit of 8388608 bytes
```

```
Language: C
Compiling source file to LLVM IR...
Mode: IR
Function: foo
local stack: 8192000000 bytes
max stack (including callees): 8192000000 bytes
[!] potential stack overflow: exceeds limit of 8388608 bytes
Function: bar
local stack: 0 bytes
max stack (including callees): 0 bytes
Function: main
local stack: 16 bytes
max stack (including callees): 8192000016 bytes
[!] potential stack overflow: exceeds limit of 8388608 bytes
```
Examples:
```c
char buf[10];
return buf; // returns pointer to stack -> use-after-return
```
Or storing:
```c
global = buf; // leaking address of stack variable
```
Stack escape API contracts (`--escape-model=<path>`)
- Why this exists:
  - Some external APIs consume pointer arguments immediately during the call.
  - Their declarations often do not carry LLVM `nocapture`-like attributes.
  - A model lets you encode this behavior without hardcoding library names in analyzer code.
- Resolution order used by the analyzer:
  - LLVM call-site attributes (`nocapture` / `byval` / `byref`)
  - Inter-procedural summary (for analyzed definitions)
  - External stack-escape model (`noescape_arg`)
  - Opaque external call without proof/model: no strong escape diagnostic is emitted.
Model format (`--escape-model=<path>`):
```
noescape_arg <function-pattern> <arg-index>
```
Function pattern matching supports exact names and glob patterns (*, ?, [ ... ]) and
is applied to symbol names and demangled names.
Example model:
```
noescape_arg vkUpdateDescriptorSets 2
noescape_arg vkUpdateDescriptorSets 4
```
For test files, `run_test.py` supports per-file selection with `// escape-model: <path>`.
Actually done:
- Multi-file CLI inputs with deterministic ordering and aggregated output.
- Per-result file attribution in JSON/SARIF and diagnostics.
- Filters: `--only-file`, `--only-dir`, `--exclude-dir`, `--only-function`/`--only-func`, plus `--dump-filter`.
- Compile args passthrough: `-I`, `-D`, `--compile-arg`.
- Dynamic alloca / VLA detection, including user-controlled sizes, upper-bound inference, and recursion-aware severity (errors for infinite recursion or oversized allocations, warnings for other dynamic sizes).
- Deriving human-friendly names for unnamed allocas in diagnostics.
- Detection of memcpy/memset overflows on stack buffers.
- Warning when a function performs multiple stores into the same stack buffer.
- Deeper traversal analysis: constraint propagation.
- Detection of deep indirection in aliasing.
- Detection of overflow in a struct containing an internal array.
- Detection of stack pointer leaks:
  - `store_unknown` -> storing the pointer in a non-local location (typically out-parameter, heap, etc.)
  - `call_callback` -> passing it to a callback (indirect call)
  - `call_arg` -> passing it as an argument to a direct function, potentially capturable
- Generic resource lifetime analysis using external API models (`acquire_out`, `acquire_ret`, `release_arg`), including missing release, double release, and constructor/destructor lifecycle mismatches.