Analysis Pipeline
The analysis pipeline is the core orchestration layer that coordinates how a single LLVM module is analyzed.
The public API analyzeModule(llvm::Module&, const AnalysisConfig&) in StackUsageAnalyzer.cpp delegates to AnalysisPipeline.
`AnalysisResult analyzeModule(llvm::Module& mod, const AnalysisConfig& config);`

File: src/analyzer/ModulePreparationService.cpp
Builds ModuleAnalysisContext (or PreparedModule) containing:
- Local stack sizes for every function
- Filtered call graph (edges between analyzed functions)
- Recursion metadata (cycle detection, infinite self-recursion heuristic via DominatorTree)
- Function-level metadata (has dynamic alloca, is recursive, etc.)
This stage is pure computation with no diagnostic side effects. The PreparedModule is reusable and testable independently.
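As a rough illustration of what this stage produces, the sketch below models a prepared module with plain standard-library containers and marks functions that sit on a call-graph cycle. The names `PreparedModuleSketch`, `FunctionInfo`, and `markRecursive` are hypothetical and not taken from the project; the real `ModuleAnalysisContext` works on LLVM IR, and its DominatorTree-based self-recursion heuristic is not reproduced here.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-function metadata, mirroring the bullet list above.
struct FunctionInfo {
    std::uint64_t localStackBytes = 0;
    bool hasDynamicAlloca = false;
    bool isRecursive = false;
};

// Hypothetical stand-in for the prepared module.
struct PreparedModuleSketch {
    std::map<std::string, FunctionInfo> functions;
    std::map<std::string, std::vector<std::string>> callGraph; // caller -> callees
};

// Returns true if `target` is reachable from `from` by following call edges.
static bool reaches(const PreparedModuleSketch& pm, const std::string& from,
                    const std::string& target, std::set<std::string>& seen) {
    auto it = pm.callGraph.find(from);
    if (it == pm.callGraph.end()) return false;
    for (const auto& callee : it->second) {
        if (callee == target) return true;
        if (seen.insert(callee).second && reaches(pm, callee, target, seen)) return true;
    }
    return false;
}

// Marks every function that can reach itself through the call graph.
inline void markRecursive(PreparedModuleSketch& pm) {
    for (auto& [name, info] : pm.functions) {
        std::set<std::string> seen;
        info.isRecursive = reaches(pm, name, name, seen);
    }
}
```

Because the sketch has no diagnostic side effects, it can be built and checked in isolation, which is the property the stage description emphasizes.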
Each analysis module is invoked on the prepared module. Passes are independent and can run on any function:
| Pass | What it does |
|---|---|
| `StackComputation` | Computes max stack including callees, detects overflow |
| `StackBufferAnalysis` | Detects buffer overflow via GEP + store analysis |
| `DynamicAlloca` | Detects VLAs, user-controlled alloca, oversized alloca |
| `AllocaUsage` | Classifies alloca usage patterns |
| `MemIntrinsicOverflow` | Checks memcpy/memset sizes against buffer bounds |
| `StackPointerEscape` | Detects stack address leaks (store, callback, return) |
| `ResourceLifetimeAnalysis` | Model-driven acquire/release checking |
| `UninitializedVarAnalysis` | Detects reads before writes |
| `DuplicateIfCondition` | Detects duplicate conditions in if-else chains |
| `ConstParamAnalysis` | Detects parameters that could be const |
| `SizeMinusKWrites` | Detects off-by-one patterns |
| `InvalidBaseReconstruction` | Detects unsafe pointer arithmetic |
| `IntRanges` | Integer range analysis (used by other passes) |
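To make "max stack including callees" concrete, here is a minimal sketch that computes worst-case depth as a function's local size plus the deepest callee chain. It is an illustration under assumptions, not the project's `StackComputation` code: `maxStack`, `exceedsLimit`, and the type aliases are hypothetical, there is no memoization, and a call cycle is simply reported as unbounded.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

using CallGraph = std::map<std::string, std::vector<std::string>>;
using LocalSizes = std::map<std::string, std::uint64_t>;

// Worst-case stack depth in bytes starting at `fn`, or std::nullopt when a
// call cycle makes the depth unbounded. `onPath` holds the current DFS path.
inline std::optional<std::uint64_t> maxStack(const std::string& fn,
                                             const CallGraph& graph,
                                             const LocalSizes& local,
                                             std::set<std::string>& onPath) {
    if (!onPath.insert(fn).second) return std::nullopt; // fn is already on the path: cycle
    std::uint64_t deepestCallee = 0;
    bool bounded = true;
    if (auto it = graph.find(fn); it != graph.end()) {
        for (const auto& callee : it->second) {
            auto depth = maxStack(callee, graph, local, onPath);
            if (!depth) { bounded = false; break; }
            deepestCallee = std::max(deepestCallee, *depth);
        }
    }
    onPath.erase(fn);
    if (!bounded) return std::nullopt;
    auto size = local.find(fn);
    return (size != local.end() ? size->second : 0) + deepestCallee;
}

// Hypothetical overflow check against a configured stack limit;
// an unbounded depth is treated as exceeding any limit.
inline bool exceedsLimit(std::optional<std::uint64_t> depth, std::uint64_t limit) {
    return !depth || *depth > limit;
}
```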
File: src/analysis/Reachability.cpp
Filters out findings in unreachable code. Uses static reachability heuristics to annotate findings:
- Detached basic blocks (no predecessors)
- Blocks dominated by `unreachable` terminator paths
- Multi-predecessor blocks where all paths are unreachable
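The heuristics listed above can be approximated by a forward walk from the entry block: anything the walk never visits is either detached or only fed by unreachable predecessors. The sketch below uses a toy CFG of block indices instead of LLVM basic blocks; `Cfg` and `reachableBlocks` are hypothetical names, and the real `Reachability.cpp` may use different rules.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG: block i lists the indices of its successor blocks. Block 0 is the
// entry. A block ending in an `unreachable` terminator has no successors.
using Cfg = std::vector<std::vector<std::size_t>>;

// Returns, for each block, whether it is reachable from the entry block.
inline std::vector<bool> reachableBlocks(const Cfg& cfg) {
    std::vector<bool> reachable(cfg.size(), false);
    if (cfg.empty()) return reachable;
    std::vector<std::size_t> worklist;
    worklist.push_back(0);
    reachable[0] = true;
    while (!worklist.empty()) {
        std::size_t block = worklist.back();
        worklist.pop_back();
        for (std::size_t succ : cfg[block]) {
            // Ignore out-of-range edges and blocks that were already visited.
            if (succ < cfg.size() && !reachable[succ]) {
                reachable[succ] = true;
                worklist.push_back(succ);
            }
        }
    }
    return reachable;
}
```

A finding whose block maps to `false` in the result would then be annotated or dropped, depending on how the caller treats unreachable code.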
File: src/analyzer/LocationResolver.cpp
Converts LLVM debug locations (!dbg metadata) to source coordinates:
- Primary: instruction debug location
- Fallback: debug intrinsics (`dbg.declare`, `dbg.value`) for allocas
- Normalization: line/column numbers, file paths
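The primary/fallback order described above might look like the following sketch. It does not touch LLVM types: `SourceLocation` and `resolveLocation` are hypothetical, and treating line 0 as "no usable location" is an assumption of this example rather than documented project behavior.

```cpp
#include <optional>
#include <string>

struct SourceLocation {
    std::string file;
    unsigned line = 0;
    unsigned column = 0;
};

// Prefers the instruction's own debug location and falls back to the
// location recovered from a debug intrinsic (e.g. for an alloca).
inline std::optional<SourceLocation> resolveLocation(
    const std::optional<SourceLocation>& instructionLoc,
    const std::optional<SourceLocation>& debugIntrinsicLoc) {
    // Assumption: a location with line 0 carries no usable source position.
    auto usable = [](const std::optional<SourceLocation>& loc) {
        return loc && loc->line != 0;
    };
    if (usable(instructionLoc)) return instructionLoc;
    if (usable(debugIntrinsicLoc)) return debugIntrinsicLoc;
    return std::nullopt;
}
```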
File: src/analyzer/DiagnosticEmitter.cpp
Converts raw analysis findings into Diagnostic structs:
- Assigns rule IDs (e.g., `StackBufferOverflow`, `ResourceLifetime.MissingRelease`)
- Sets severity (`Info`, `Warning`, `Error`)
- Formats human-readable messages with alias paths, index values, etc.
- Assigns CWE identifiers
- Computes SARIF-compatible source ranges
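A simplified picture of this conversion is sketched below for a single finding kind. The struct layouts, the `toDiagnostic` helper, the chosen severity, and the message wording are all hypothetical; only the rule ID `StackBufferOverflow` and the severity names come from the text above. CWE-121 (Stack-based Buffer Overflow) is used as a plausible mapping, not a confirmed one.

```cpp
#include <optional>
#include <string>

enum class Severity { Info, Warning, Error };

// Hypothetical raw finding produced by a buffer-overflow pass.
struct BufferOverflowFinding {
    std::string function;
    std::string buffer;
    long index = 0;
    long bufferSize = 0;
};

// Hypothetical diagnostic record; source ranges are omitted for brevity.
struct Diagnostic {
    std::string ruleId;
    Severity severity = Severity::Info;
    std::string message;
    std::optional<int> cwe;
};

inline Diagnostic toDiagnostic(const BufferOverflowFinding& f) {
    Diagnostic d;
    d.ruleId = "StackBufferOverflow";
    d.severity = Severity::Error; // assumption: overflows are reported as errors
    d.message = "write to '" + f.buffer + "' at index " + std::to_string(f.index) +
                " exceeds size " + std::to_string(f.bufferSize) + " in " + f.function;
    d.cwe = 121; // CWE-121: Stack-based Buffer Overflow (assumed mapping)
    return d;
}
```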
The pipeline applies function filtering at two points:
- STL filter (`FunctionFilter.cpp`): before analysis, excludes standard library, system, and third-party functions by default (override with `--STL`)
- Name/path filters: `--only-function`, `--only-file`, and `--only-dir` are applied post-analysis, during result filtering
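The post-analysis name/path filtering can be pictured as a predicate over finished findings. This is a sketch under assumptions: `Finding`, `ResultFilter`, `keep`, and `filterResults` are hypothetical, and the plain prefix match used for directories is a guess at the semantics of `--only-dir`, not documented behavior.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Finding {
    std::string function;
    std::string file;
};

// Hypothetical filter settings; an empty list means "no restriction".
struct ResultFilter {
    std::vector<std::string> onlyFunctions;
    std::vector<std::string> onlyDirs; // matched as plain path prefixes
};

inline bool keep(const Finding& f, const ResultFilter& filter) {
    const auto& fns = filter.onlyFunctions;
    if (!fns.empty() && std::find(fns.begin(), fns.end(), f.function) == fns.end())
        return false;
    const auto& dirs = filter.onlyDirs;
    if (!dirs.empty() &&
        std::none_of(dirs.begin(), dirs.end(), [&](const std::string& dir) {
            return f.file.compare(0, dir.size(), dir) == 0; // f.file starts with dir
        }))
        return false;
    return true;
}

// Drops every finding that does not pass the filter.
inline std::vector<Finding> filterResults(std::vector<Finding> findings,
                                          const ResultFilter& filter) {
    findings.erase(std::remove_if(findings.begin(), findings.end(),
                                  [&](const Finding& f) { return !keep(f, filter); }),
                   findings.end());
    return findings;
}
```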
The fast profile modifies pass behavior:
- `StackBufferAnalysis`: skips functions > 1200 IR instructions, limits to 16 GEP sites/function, disables alias backtracking
- `MultipleStores` (within `StackBufferAnalysis`): limits to 32 store sites/function
All other passes run identically in both profiles.
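One way to express these profile-dependent limits is a small settings struct chosen per profile. The sketch below is illustrative only: `Profile`, `BufferAnalysisLimits`, `limitsFor`, and `shouldSkipFunction` are hypothetical names, and the assumption that the non-fast profile applies no caps at all is not stated in the text. The fast-profile numbers are the ones given above.

```cpp
#include <cstddef>
#include <limits>

enum class Profile { Full, Fast };

// Hypothetical limits for the buffer analysis. Defaults mean "uncapped".
struct BufferAnalysisLimits {
    std::size_t maxInstructions = std::numeric_limits<std::size_t>::max();
    std::size_t maxGepSites = std::numeric_limits<std::size_t>::max();
    std::size_t maxStoreSites = std::numeric_limits<std::size_t>::max();
    bool aliasBacktracking = true;
};

inline BufferAnalysisLimits limitsFor(Profile profile) {
    BufferAnalysisLimits limits; // assumption: the full profile has no caps
    if (profile == Profile::Fast) {
        limits.maxInstructions = 1200; // skip functions larger than this
        limits.maxGepSites = 16;       // GEP sites examined per function
        limits.maxStoreSites = 32;     // store sites examined per function
        limits.aliasBacktracking = false;
    }
    return limits;
}

// A function is skipped only when it has more instructions than the cap.
inline bool shouldSkipFunction(std::size_t instructionCount,
                               const BufferAnalysisLimits& limits) {
    return instructionCount > limits.maxInstructions;
}
```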
Within a single module, analysis is sequential. Parallelism occurs at the multi-file level:
- Module loading is parallelized with `--jobs`
- Cross-TU summary extraction is parallelized per module
- Without cross-TU, each file can be analyzed independently in parallel
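The per-file parallelism described above can be sketched with a fixed number of worker threads pulling file indices from a shared counter. This is not the project's scheduler: `analyzeFiles` and its callback signature are hypothetical, the `jobs` parameter merely stands in for the value passed to `--jobs`, and the callback is assumed to be thread-safe and non-throwing.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Runs `analyzeFile` on every path using up to `jobs` worker threads.
// Results are stored by input index, so the output order is deterministic.
inline std::vector<int> analyzeFiles(
    const std::vector<std::string>& paths,
    const std::function<int(const std::string&)>& analyzeFile,
    unsigned jobs) {
    std::vector<int> results(paths.size(), 0);
    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        // Each index is claimed by exactly one thread, so writes never overlap.
        for (std::size_t i = next.fetch_add(1); i < paths.size(); i = next.fetch_add(1))
            results[i] = analyzeFile(paths[i]);
    };
    const std::size_t count =
        std::min<std::size_t>(std::max(1u, jobs), paths.size());
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < count; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return results;
}
```

Within each call to `analyzeFile` the work stays sequential, matching the statement that analysis inside a single module is not parallelized.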