IH-Korupsi (short for Intelligent Hunting (IH) - Korupsi; "korupsi" is Indonesian for "corruption") is an open-source Python toolkit that detects financial anomalies and potential indicators of corruption using purely mathematical and statistical methods, without AI or machine learning.
It is developed by OurCreativity Edisi Coding to support transparency and financial accountability worldwide.
Corruption harms economies and societies. IH-Korupsi provides an educational and professional tool that can be used by:
- Government internal auditors
- Investigative journalists
- Anti-corruption researchers
- Students of data forensics
- Transparency NGOs
- Anyone committed to fighting corruption
- Transparency: Every detected anomaly can be explained by a mathematical formula.
- Auditable: No "black box" algorithms—everything is open and deterministic.
- Zero AI Dependency: Pure statistics and mathematics to ensure results are legally defensible.
- Open Source: Free to use, study, and improve.
Detects manipulation in financial reports. Benford's Law states that in naturally occurring financial data, the leading digit follows a specific logarithmic distribution: about 30.1% of values start with 1, while only about 4.6% start with 9. If someone fabricates numbers, the distribution will likely deviate.
Case Example: Financial reports that are manipulated tend to have an unusual spike in numbers starting with digits like 5, 6, 7, or 8.
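The check can be sketched as follows. This is a minimal illustration, not the toolkit's own code: the function names are hypothetical, and MAD is assumed here to mean the mean absolute deviation between observed and expected first-digit proportions over the digits 1 to 9. The expected proportion for digit d is log10(1 + 1/d).

```python
import math
from collections import Counter

def benford_expected():
    """Expected proportion of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(value):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit first, e.g. '4.2...e+02'
    return int(f"{abs(value):.16e}"[0])

def benford_mad(amounts):
    """Mean absolute deviation between observed and expected digit proportions."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("no non-zero amounts")
    counts = Counter(digits)
    expected = benford_expected()
    n = len(digits)
    return sum(abs(counts.get(d, 0) / n - expected[d]) for d in range(1, 10)) / 9
```

A data set in which every amount starts with the same digit yields a very large MAD, while data that follows the logarithmic distribution yields a value close to zero.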
Identifies unusual transactions for a specific entity. The Relative Size Factor (RSF) compares an entity's largest transaction to the average of its other transactions.
Formula: RSF = Largest Transaction / Average of Other Transactions
Case Example: A vendor that usually receives $500–$1,000 suddenly gets a contract for $50,000.
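As a rough sketch of the formula above (the function name is illustrative and not taken from the toolkit), the RSF for one entity can be computed from a plain list of its transaction amounts:

```python
def relative_size_factor(amounts):
    """RSF = largest transaction / average of the remaining transactions."""
    if len(amounts) < 2:
        raise ValueError("need at least two transactions")
    ordered = sorted(amounts)
    largest = ordered[-1]
    others = ordered[:-1]
    mean_others = sum(others) / len(others)
    if mean_others == 0:
        raise ValueError("average of other transactions is zero")
    return largest / mean_others

# A vendor that usually receives $500-$1,000, then a single $50,000 contract
history = [500, 750, 1000, 50000]
print(relative_size_factor(history))  # roughly 66.7
```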
Standard statistical methods to find extreme outliers in transaction data.
Finds funds that return to the original sender through multiple intermediaries.
Case Example:
Department A → Vendor X → Sub-vendor Y → Consultant Z → Department A
This pattern is often used for price mark-ups or money laundering loops.
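One way to find such loops is a depth-first search over a directed graph of (sender, receiver) pairs. The sketch below is an assumption about how this could be done, not the toolkit's implementation; `find_cycles` and `max_length` are hypothetical names, and self-payments (A → A) are ignored.

```python
from collections import defaultdict

def find_cycles(edges, max_length=6):
    """Return simple cycles (as node lists) found in (sender, receiver) edges."""
    graph = defaultdict(set)
    for sender, receiver in edges:
        graph[sender].add(receiver)

    cycles, seen = [], set()

    def walk(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start and len(path) >= 2:
                # Rotate so the smallest node comes first; each loop is reported once
                i = path.index(min(path))
                key = tuple(path[i:] + path[:i])
                if key not in seen:
                    seen.add(key)
                    cycles.append(list(key) + [key[0]])
            elif nxt not in path and len(path) < max_length:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles

flows = [
    ("Department A", "Vendor X"),
    ("Vendor X", "Sub-vendor Y"),
    ("Sub-vendor Y", "Consultant Z"),
    ("Consultant Z", "Department A"),
]
print(find_cycles(flows))
```

The `max_length` bound keeps the search tractable on large graphs; loops that pass through more intermediaries than that are not reported.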
Finds hidden key actors in a network using algorithms like PageRank and Betweenness Centrality.
Detects unusual spending spikes during the final month of the fiscal year (budget dumping).
Indicator: Ratio of December spending vs. monthly average > 2.5x.
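The indicator can be sketched as below. The source does not say how the monthly average is computed, so this sketch assumes one fiscal year of data ending in December and averages over every month that has at least one transaction, December included. The function name is hypothetical.

```python
from collections import defaultdict
from datetime import date

def fiscal_cliff_ratio(transactions, final_month=12):
    """Ratio of final-month spending to average monthly spending."""
    monthly = defaultdict(float)
    for day, amount in transactions:
        monthly[day.month] += amount
    if not monthly:
        raise ValueError("no transactions")
    average = sum(monthly.values()) / len(monthly)
    if average == 0:
        raise ValueError("average monthly spending is zero")
    return monthly.get(final_month, 0.0) / average

# Eleven months at 100 each, then 400 in December
spending = [(date(2024, m, 15), 100.0) for m in range(1, 12)]
spending.append((date(2024, 12, 20), 400.0))
print(fiscal_cliff_ratio(spending))  # 400 / 125 = 3.2, above the 2.5x indicator
```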
Detects transaction frequencies within a short period that are implausibly high for manual processing.
Case Example: 50 transactions in a single day for the same vendor.
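A simple version of this check counts transactions per vendor per day. The threshold of 50 below is taken from the case example only; the toolkit's actual threshold and window size are not stated here, and the function name is illustrative.

```python
from collections import Counter

def high_velocity_vendors(transactions, threshold=50):
    """Return {(vendor_id, date): count} for vendor-days at or above the threshold."""
    counts = Counter((vendor_id, day) for vendor_id, day in transactions)
    return {key: n for key, n in counts.items() if n >= threshold}

# Dates are plain YYYY-MM-DD strings, as in the CSV input
rows = [("V1", "2024-03-01")] * 50 + [("V2", "2024-03-01")] * 3
print(high_velocity_vendors(rows))
```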
Identifies "Ghost Vendors"—entities with slightly different names but likely the same identity.
Examples:
- Global Solutions Corp vs Global  Solutions Corp (double space)
- Smith & Sons Ltd vs Smith and Sons Ltd
Uses the Levenshtein Distance algorithm without external NLP libraries.
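For reference, the Levenshtein distance can be computed with a short dynamic-programming routine using only the standard library. This is a generic sketch rather than the toolkit's code; `similarity` is a hypothetical helper that scales the distance to the range 0 to 1.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def similarity(a, b):
    """1.0 for identical strings, approaching 0.0 as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

print(levenshtein("Global Solutions Corp", "Global  Solutions Corp"))  # 1
print(levenshtein("Smith & Sons Ltd", "Smith and Sons Ltd"))           # 3
```

Vendor pairs with a small distance relative to the name length are candidates for manual review, not confirmed duplicates.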
- Python 3.10 or newer
- Windows / Linux / macOS
- Clone this repository:
git clone https://github.com/[username]/ih-korupsi.git
cd ih-korupsi
- Install dependencies:
pip install -r requirements.txt
Run the toolkit with synthetic data containing injected anomalies:
python main.py --type sample --output sample_report.json --html visual_report.html
The system will:
- Generate 500 rows of synthetic transaction data.
- Inject intentional anomalies (Benford, RSF, Fiscal Cliff).
- Execute all detection modules.
- Generate a JSON data report and a beautiful HTML visual report.
Your CSV must include at least these columns:
- amount: Transaction value
- vendor_name: Vendor or recipient name
- vendor_id: Unique vendor ID
- date: Transaction date (YYYY-MM-DD)
- sender_id: Sender identifier
- receiver_id: Receiver identifier
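Before running the toolkit, it can help to confirm that a file has these columns. The helper below is a hypothetical pre-check written for illustration; it is not part of the toolkit and assumes a UTF-8 CSV whose first row is the header.

```python
import csv

REQUIRED_COLUMNS = {"amount", "vendor_name", "vendor_id",
                    "date", "sender_id", "receiver_id"}

def missing_columns(header):
    """Return the required column names that are absent from a header row."""
    return sorted(REQUIRED_COLUMNS - {name.strip() for name in header})

def read_header(csv_path):
    """Read only the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])

print(missing_columns(["amount", "vendor_name", "date"]))
```

Calling `missing_columns(read_header("my_data.csv"))` returns an empty list when the file is ready to use.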
Example:
python main.py --input my_data.csv --type csv --output my_results.json --html report.html
The JSON report is structured as follows:
{
"metadata": {
"total_rows": 500,
"total_amount": 300000000,
"currency": "IDR"
},
"findings": {
"The Mathematician": { ... },
"The Connector": { ... },
"The Chronologist": { ... },
"String Detective": { ... }
}
}
- MAD < 0.006: High Conformity (Normal)
- MAD 0.006–0.012: Acceptable
- MAD 0.012–0.015: Marginal (Needs attention)
- MAD > 0.015: Non-conformity (Red flag!)
- RSF < 5: Normal
- RSF 5–10: Needs verification
- RSF > 10: Highly suspicious
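The two scales above can be expressed as simple lookup functions. The listed ranges share their boundary values (for example, 0.012 appears in two bands), so this sketch assumes that a boundary value belongs to the less severe band. The function names are illustrative.

```python
def classify_mad(mad):
    """Map a Benford MAD value to the conformity bands listed above."""
    if mad < 0.006:
        return "High Conformity"
    if mad <= 0.012:  # assumption: boundary values fall in the lower band
        return "Acceptable"
    if mad <= 0.015:
        return "Marginal"
    return "Non-conformity"

def classify_rsf(rsf):
    """Map an RSF value to the risk bands listed above."""
    if rsf < 5:
        return "Normal"
    if rsf <= 10:  # assumption: exactly 10 still counts as "Needs verification"
        return "Needs verification"
    return "Highly suspicious"

print(classify_mad(0.013), "/", classify_rsf(66.7))
```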
- Not Legal Proof: IH-Korupsi provides indicators only. Anomalies do not automatically mean corruption; they require further investigation.
- Context Matters: Some anomalies can be legitimate (e.g., a massive infrastructure project causing a high RSF).
- Data Quality: Garbage in, garbage out. Ensure your input data is clean.
- Ethical Use: This toolkit is for transparency and education. Please use it responsibly.
We are a developer community that believes technology can be a force for social good. IH-Korupsi is one of our efforts toward a more transparent and accountable future.
Slogan: "Code for Justice, Data for Transparency"
This project is licensed under the MIT License.
Join us in building a cleaner, more transparent world!
