171 changes: 171 additions & 0 deletions .github/SECRET_SCANNING.md
@@ -0,0 +1,171 @@
# Secret Scanning with Gitleaks

This repository uses [Gitleaks](https://github.com/gitleaks/gitleaks) to prevent secrets (API keys, passwords, private keys, tokens) from being committed to the codebase.

## How It Works

### Automated CI Scanning
- **Runs on:** All pull requests and pushes to `main` and `release-*` branches

⚠️ Potential issue | 🟡 Minor

Fix trigger scope statement to match workflow config.

Line 8 says pushes to `main` and `release-*`, but `.github/workflows/secret-scan.yml` currently triggers `push` only on `main`. Please align the doc (or the workflow) to avoid misleading contributors.

Proposed doc fix
-- **Runs on:** All pull requests and pushes to `main` and `release-*` branches
+- **Runs on:** Pull requests targeting `main` and `release-*`, and pushes to `main`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/SECRET_SCANNING.md at line 8, The README line "**Runs on:** All pull
requests and pushes to `main` and `release-*` branches" is out of sync with the
secret scan workflow; update the text in SECRET_SCANNING.md to match the actual
trigger in secret-scan.yml (e.g., change to "pushes to `main` only" or
explicitly list the triggers used), or alternatively modify secret-scan.yml to
include `push` on `release-*` and `pull_request` if you want the doc to remain
as-is; edit the specific line containing that sentence in SECRET_SCANNING.md (or
the `on:` block in secret-scan.yml) so both sources state the same trigger
scope.

- **What it scans:** Only new commits in your PR via `gitleaks-action@v2` with `GITLEAKS_ARGS: --log-opts="main..HEAD"` (not the entire git history despite `fetch-depth: 0`)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify trigger targets and gitleaks range args from workflow.
# Expected: pull_request includes main/release-*; gitleaks arg is hardcoded main..HEAD.

set -euo pipefail

echo "== Workflow trigger block =="
sed -n '1,40p' .github/workflows/secret-scan.yml

echo
echo "== Gitleaks step and args =="
rg -n -C3 'gitleaks-action@v2|GITLEAKS_ARGS|log-opts' .github/workflows/secret-scan.yml

Repository: openshift/origin

Length of output: 1711


🏁 Script executed:

cat -n .github/SECRET_SCANNING.md | head -20

Repository: openshift/origin

Length of output: 1069


🏁 Script executed:

# Check for any conditional logic in the workflow that might adjust gitleaks args based on target branch
rg -n 'if:|env\.|GITLEAKS_ARGS' .github/workflows/secret-scan.yml

Repository: openshift/origin

Length of output: 178


Gitleaks scan range is hardcoded to `main..HEAD`, which is inaccurate for `release-*` PRs.

Line 9 claims "Only new commits in your PR," but `GITLEAKS_ARGS: --log-opts="main..HEAD"` is hardcoded with no branch awareness. `main..HEAD` selects every commit reachable from `HEAD` but not from `main`, so for PRs targeting `release-*` it also scans commits that already exist on the release branch but not on `main`, not just the PR's commits. Either document this limitation or switch to `${GITHUB_BASE_REF}..HEAD` for base-ref-aware scanning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/SECRET_SCANNING.md at line 9, The Gitleaks scan range is hardcoded
to use GITLEAKS_ARGS: --log-opts="main..HEAD", which mis-scans PRs targeting
release-* branches; update the workflow to use the repository base ref variable
by replacing the fixed "main..HEAD" with a base-ref-aware range such as
"${GITHUB_BASE_REF}..HEAD" (or add conditional logic to fall back to main when
GITHUB_BASE_REF is empty), and update the GITLEAKS_ARGS reference and any
documentation in SECRET_SCANNING.md to reflect this change so scans truly cover
only the PR commits.
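
A minimal sketch of the base-ref-aware range with a fallback, assuming the workflow builds the argument in a shell step. `scan_range` is a hypothetical helper name, not something in the workflow. `GITHUB_BASE_REF` is set by GitHub Actions only on `pull_request` events, so it is empty on pushes. Whether the bare branch name resolves in the CI checkout (as opposed to needing a remote prefix such as `origin/`) is not verified here.

```bash
#!/usr/bin/env bash
# Sketch: build a commit range for gitleaks from the PR base branch.
# Falls back to main when GITHUB_BASE_REF is unset or empty (e.g. push events).
scan_range() {
  local base="${GITHUB_BASE_REF:-main}"
  printf '%s..HEAD\n' "$base"
}

# Print the --log-opts value that would be handed to gitleaks.
echo "--log-opts=$(scan_range)"
```

With this shape, a PR targeting a release branch yields a range based on that branch, while push runs keep the current `main..HEAD` behavior.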

- **Speed:** ~4 mins for full repository scan

⚠️ Potential issue | 🟡 Minor

Speed line conflicts with stated scan mode.

Line 10 references “full repository scan,” but Line 9 describes PR-commit scanning. Reword this to avoid mixing two different scan scopes.

Suggested doc edit
-- **Speed:** ~4 mins for full repository scan
+- **Speed:** Typically completes in a few minutes (depends on commit range and repository size)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/SECRET_SCANNING.md at line 10, The "Speed: ~4 mins for full
repository scan" line conflicts with the preceding "PR-commit scanning" scope;
update the sentence that currently reads "Speed: ~4 mins for full repository
scan" so it either (a) specifies the time for PR-commit scans (e.g., "Speed: ~4
mins per PR-commit scan") to match the described mode, or (b) clearly documents
both modes (e.g., "Speed: ~4 mins per PR-commit scan; full repository scans may
take longer") thereby removing scope confusion—locate and edit the exact phrase
"Speed: ~4 mins for full repository scan" to one of these clarified variants.

- **Action:** Blocks PR merge if secrets are detected

⚠️ Potential issue | 🟡 Minor

Avoid overstating merge-blocking behavior.

Line 11 implies guaranteed merge blocking, but that depends on branch protection/required checks outside this file. Safer wording is that CI fails when findings are detected.

Proposed doc fix
-- **Action:** Blocks PR merge if secrets are detected
+- **Action:** Fails the secret-scan CI check if secrets are detected
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/SECRET_SCANNING.md at line 11, The statement "**Action: Blocks PR
merge if secrets are detected**" overstates behavior; update that sentence to
clarify it reflects CI/checks behavior rather than an absolute guarantee—replace
the line text with something like "**Action:** CI check fails (may block PR
merge depending on branch protection and required checks)" so it accurately
notes that merge blocking depends on repository branch protection and required
CI checks; edit the exact string shown on the existing line to the clarified
wording.


### What Gitleaks Detects
- API keys (AWS, GitHub, GitLab, Slack, etc.)
- Private keys (RSA, SSH, PGP, TLS)
- Database credentials and connection strings
- OAuth and JWT tokens
- Generic secrets (password=, api_key=, etc.)
- High entropy strings (randomized secrets)

⚠️ Potential issue | 🟡 Minor

Minor grammar: Use hyphenated compound modifier.

"High entropy strings" should be hyphenated as "high-entropy strings" when used as a compound modifier before a noun. As per static analysis tools.

📝 Suggested fix
 - Generic secrets (password=, api_key=, etc.)
-- High entropy strings (randomized secrets)
+- High-entropy strings (randomized secrets)
🧰 Tools
🪛 LanguageTool

[grammar] ~19-~19: Use a hyphen to join words.
Context: ...crets (password=, api_key=, etc.) - High entropy strings (randomized secrets) ##...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/SECRET_SCANNING.md at line 19, Update the phrase "High entropy
strings" to the hyphenated compound modifier "High-entropy strings" in the
markdown content (the heading/line currently reading "High entropy strings") so
it reads "High-entropy strings" wherever it's used as a modifier before a noun;
locate the exact phrase in the SECRET_SCANNING.md content and replace it with
the hyphenated version to satisfy grammar/static analysis rules.


### What It Ignores
- Test fixtures in `test/` directories
- Vendor code in `vendor/`
- Example files in `examples/`
- Mock/placeholder credentials
- Variable names like `password` or `apiKey`

## Running Locally

### Installation

**macOS (Homebrew):**
```bash
brew install gitleaks
```

**Linux:**
```bash
# Docker/Podman
docker pull ghcr.io/gitleaks/gitleaks:latest

# Or download binary
wget https://github.com/gitleaks/gitleaks/releases/latest/download/gitleaks_linux_x64.tar.gz
tar -xzf gitleaks_linux_x64.tar.gz
sudo mv gitleaks /usr/local/bin/
```

### Scan Before Committing

**Scan staged changes (recommended):**
```bash
gitleaks protect --staged --verbose
```

**Scan entire working directory:**
```bash
# --no-git scans the files on disk; without it, detect walks the git history instead
gitleaks detect --source . --no-git --config .gitleaks.toml --verbose
```

**Scan specific file:**
```bash
gitleaks detect --source path/to/file.go --no-git
```

## Handling Detections

### If Gitleaks Flags Your Commit

**1. Is it a real secret?**

If YES:
- **Remove the secret immediately**
- Use environment variables instead: `os.Getenv("API_KEY")`
- Store secrets in Kubernetes Secrets, Vault, or similar
- Rotate/revoke the exposed secret if it was already pushed

If NO (false positive):
- Continue to step 2
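
For the real-secret case, the environment-variable approach can be sketched in a shell script as follows. This is an illustrative sketch, not code from this repository: `require_env` is a hypothetical helper, and `API_KEY` is just the example name used above.

```bash
#!/usr/bin/env bash
# Sketch: read a secret from the environment and fail loudly if it is missing,
# instead of hardcoding the value in the script.
require_env() {
  local name="$1" value
  # printenv exits non-zero when the variable is not exported.
  if ! value="$(printenv "$name")" || [ -z "$value" ]; then
    echo "error: $name is not set" >&2
    return 1
  fi
  printf '%s\n' "$value"
}

# Usage: api_key="$(require_env API_KEY)" || exit 1
```

Failing early when the variable is missing avoids the temptation to paste a real value into the script "just for testing".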

**2. For legitimate test fixtures or examples:**

Add to `.gitleaks.toml` allowlist:

```toml
[allowlist]
paths = [
'''path/to/test/file\.go$''',
]

# OR for specific values
regexes = [
'''specific-test-value-to-ignore''',
]
```

**3. For one-time overrides (use sparingly):**

Add inline comment in your code:
```go
password := "test-password" // gitleaks:allow
```

## Configuration

The `.gitleaks.toml` file controls what gets scanned and ignored:

- **Excluded paths:** `test/`, `vendor/`, `examples/`, `*.md`
- **Excluded patterns:** Test credentials, base64 test values, common examples
- **Rules:** Extends default gitleaks ruleset

To modify exclusions, edit `.gitleaks.toml` and test:
```bash
gitleaks detect --source . --config .gitleaks.toml --verbose
```

## Best Practices

### DO:
- ✅ Use environment variables for secrets
- ✅ Use Kubernetes Secrets or external secret management
- ✅ Run `gitleaks protect --staged` before committing sensitive changes
- ✅ Use placeholder values in examples: `YOUR_API_KEY_HERE`

### DON'T:
- ❌ Commit real credentials, even temporarily
- ❌ Use `--no-verify` to bypass the check
- ❌ Add broad exclusions to `.gitleaks.toml` without review
- ❌ Assume deleted secrets are safe (git history remembers)

## Troubleshooting

### CI fails but local scan passes
```bash
# Ensure you're using the config file
gitleaks detect --source . --config .gitleaks.toml --no-git

# Check which gitleaks version CI uses
grep 'gitleaks-action@' .github/workflows/secret-scan.yml
```

### Too many false positives
1. Review the findings carefully
2. Update `.gitleaks.toml` with specific exclusions
3. Test the config change locally
4. Submit the config update in your PR

### Need to scan git history
```bash
# Scan all commits (WARNING: can be slow on large repos)
gitleaks detect --source . --verbose

# Scan specific commit range
gitleaks detect --log-opts="main..HEAD"
```

## Additional Resources

- [Gitleaks Documentation](https://github.com/gitleaks/gitleaks)
- [Gitleaks Configuration Reference](https://github.com/gitleaks/gitleaks#configuration)
- [GitHub Secret Scanning](https://docs.github.com/en/code-security/secret-scanning)

## Questions?

For issues with secret scanning:
1. Check this guide first
2. Review `.gitleaks.toml` configuration
3. Ask in your PR or open an issue

---

**Remember:** It's easier to prevent secrets from being committed than to clean them up from git history!
43 changes: 43 additions & 0 deletions .github/workflows/secret-scan.yml
@@ -0,0 +1,43 @@
name: Secret Scan

# Prevents secrets (API keys, passwords, tokens) from being committed.
# For setup and troubleshooting, see: .github/SECRET_SCANNING.md
# To run locally: gitleaks protect --staged --verbose

on:
  pull_request:
    branches:
      - main
      - 'release-*'
  push:
    branches:
      - main

permissions:
  contents: read
  pull-requests: write

jobs:
  gitleaks:
    name: Scan for secrets
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for comprehensive scanning

      - name: Run Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}  # Optional: for Gitleaks Pro features
          GITLEAKS_ENABLE_COMMENTS: true
          GITLEAKS_ARGS: --log-opts="main..HEAD"  # Scan only PR commits, not full history

      - name: Upload SARIF report
        if: failure()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif
          category: gitleaks
67 changes: 67 additions & 0 deletions .gitleaks.toml
@@ -0,0 +1,67 @@
# Gitleaks configuration for OpenShift Origin
# This config excludes common false positives while catching real secrets
#
# To test this config locally:
# gitleaks detect --source . --config .gitleaks.toml --verbose
#
# For full documentation, see: .github/SECRET_SCANNING.md

title = "gitleaks config for openshift/origin"

# Extend the default gitleaks config
[extend]
useDefault = true

[allowlist]
description = "Allowlist for test fixtures, examples, and vendor code"

# Paths to exclude from scanning
paths = [
# Test directories and files
'''test/''',
'''.*_test\.go$''',
'''testdata/''',

# Vendor dependencies
'''vendor/''',

# Examples and demo files
'''examples/''',

# Generated binary data files
'''bindata\.go$''',

# Lock files and checksums
'''go\.sum$''',
'''package-lock\.json$''',
'''yarn\.lock$''',

# Documentation
'''\.md$''',
]
Comment on lines +39 to +41

⚠️ Potential issue | 🟡 Minor

Consider narrowing the markdown exclusion.

Excluding all .md files from secret scanning may be too broad. While most documentation is safe, developers might accidentally paste real secrets into markdown files (e.g., in README examples, troubleshooting guides, or inline code blocks). Consider either:

  1. Removing this exclusion entirely and relying on stopwords/regex allowlist
  2. Only excluding specific safe markdown files like README.md or CHANGELOG.md
  3. Keeping it but clearly documenting this risk in the SECRET_SCANNING.md guide
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitleaks.toml around lines 39 - 41, The current gitleaks exclusion pattern
'''\.md$''' is too broad; update .gitleaks.toml to narrow or remove this rule by
either deleting the '''\.md$''' entry, replacing it with a whitelist of safe
markdown filenames (e.g., README.md|CHANGELOG.md) in the pattern, or keeping the
exclusion but adding a note in SECRET_SCANNING.md documenting the risk of
secrets in markdown and instructing developers to avoid pasting secrets into .md
files; target the pattern string '''\.md$''' when making the change.
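
To see what narrowing the exclusion would change, a quick local check like the one below may help. It is a sketch under assumptions: the `narrow` pattern and the sample paths are hypothetical, `path_excluded` is an illustrative helper, and `grep -E` only approximates the Go regular expressions gitleaks uses (the two agree for simple patterns like these).

```bash
#!/usr/bin/env bash
# Sketch: check which paths a candidate allowlist pattern would exclude.
path_excluded() {
  local pattern="$1" path="$2"
  printf '%s\n' "$path" | grep -Eq -- "$pattern"
}

# Hypothetical narrowed pattern: only README.md and CHANGELOG.md, at any depth.
narrow='(^|/)(README|CHANGELOG)\.md$'

for p in README.md docs/CHANGELOG.md docs/troubleshooting.md; do
  if path_excluded "$narrow" "$p"; then
    echo "excluded: $p"
  else
    echo "scanned:  $p"
  fi
done
```

Under the current `\.md$` pattern all three sample paths would be excluded; under the narrowed one, the troubleshooting guide would be scanned.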


# Specific regexes to exclude (for base64 encoded test values, etc.)
regexes = [
# Base64 encoded placeholder values commonly used in tests
'''c2VjcmV0dmFsdWU=''', # "secretvalue" in base64
'''bXktc2VjcmV0LXZhbHVl''', # "my-secret-value" in base64
'''cGFzc3dvcmQ=''', # "password" in base64

# Common test/example credentials
'''admin:admin''',
'''system:admin''',
'''secretvalue1''',

# Grafana default example secret
'''SW2YcwTIb9zpOOhoPsMm''',
]

# Stopwords - tokens that if found in the match will cause it to be ignored
stopwords = [
# Common variable names and placeholders
'''YOUR_API_KEY''',
'''REPLACE_ME''',
'''CHANGEME''',
'''example\.com''',
'''localhost''',
]
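
The base64 entries in the `regexes` allowlist of `.gitleaks.toml` can be cross-checked against their inline comments with a short shell sketch. `b64` is a hypothetical helper; it assumes a `base64` utility is on the `PATH`, as on typical Linux and macOS systems.

```bash
#!/usr/bin/env bash
# Sketch: encode a placeholder string to compare with an allowlisted value.
b64() {
  # printf '%s' avoids the trailing newline that echo would add before encoding.
  printf '%s' "$1" | base64
}

b64 'secretvalue'       # c2VjcmV0dmFsdWU=
b64 'my-secret-value'   # bXktc2VjcmV0LXZhbHVl
b64 'password'          # cGFzc3dvcmQ=
```

Running the same check before adding a new base64 entry helps confirm that the allowlisted string really is the harmless placeholder named in its comment.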