If you use this code in your research, please cite our paper:
@misc{kachwala2025prefillguidedthinking,
title={Prefill-Guided Thinking for zero-shot detection of AI-generated images},
author={Zoher Kachwala and Danishjeet Singh and Danielle Yang and Filippo Menczer},
year={2025},
eprint={2506.11031},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2506.11031},
}Note: Paper submitted to ACL ARR.
Can you tell which images above are real vs AI-generated? Answer in footnote¹
This repository contains the evaluation system for our paper on using Prefill-Guided Thinking (PGT) to detect AI-generated images with Vision-Language Models (VLMs).
💡 For detailed technical documentation, particularly helpful for LLM code agents: See AGENTS.md for complete architecture details, function signatures, and implementation specifics.
Key Finding: Simply prefilling a VLM's response with the phrase "Let's examine the style and the synthesis artifacts" improves detection by up to 24% in Macro F1 — without any training or fine-tuning.
Instead of asking a VLM to detect fake images directly, we prefill its response to guide its reasoning:
- (a) Baseline: Direct query → incorrect classification (real)
- (b) Chain-of-Thought: "Let's think step by step" → still incorrect
- (c) S2 (our method): "Let's examine the style and the synthesis artifacts" → correct ✓
This simple technique works across 3 VLMs and 16 different image generators spanning faces, objects, and natural scenes.
See SETUP.md for complete environment setup instructions (conda, PyTorch, vLLM, Flash-Attention).
See Usage Examples for detailed command-line examples and all available options.
We evaluate on three diverse benchmarks:
| Dataset | Content | Images | Generators |
|---|---|---|---|
| D3 | Diverse web images (objects, scenes, art) | 8.4k | 4 (Stable Diffusion variants, DeepFloyd) |
| DF40 | Human faces (deepfakes) | 10k | 6 (Midjourney, StyleCLIP, StarGAN, etc.) |
| GenImage | ImageNet objects (animals, vehicles) | 10k | 8 (ADM, BigGAN, GLIDE, etc.) |
See Data Collection & Setup for complete instructions on downloading and organizing all three datasets.
- Qwen2.5-VL-7B — Dynamic-resolution Vision Transformer
- LLaVA-OneVision-7B — Multimodal instruction-following model
- Qwen3-VL-8B — Current Qwen vision-language model, with Instruct and Thinking configs
All model configs run through vLLM for efficient inference.
| Method | Description |
|---|---|
| Baseline | No prefill, just ask the question |
| CoT | Chain-of-thought reasoning |
| S2 | Task-aligned (our method) |
See Usage Examples for detailed command-line examples and all available options.
Detection Macro F1 across models, datasets, and PGT variations. Bars show relative improvement of S2 over the next best method.
Per-generator recall figures used in the COLM paper are generated under results/figures/ by the plotting scripts in results/.
To understand how prefills affect reasoning, we track answer confidence at five partial-response intervals (0–100% of sentences):
At each interval, we probe for the model's answer and confidence. The results reveal a striking pattern:
Evolution of answer confidence and Macro F1 across partial responses for Qwen. Baseline queries trigger immediate high confidence despite poor detection — the model commits to an answer before examining the image. Prefills induce a confidence dip mid-response, with detection improving steadily as the response progresses.
- Multi-Response Generation (n>1) - Generate multiple responses with majority voting → Details
- Phrase Modes - Test prefill vs prompt vs system instruction → Details
- Debug Mode - Quick validation with 5 examples → Details
Results are saved in hierarchical directories with timestamped JSON files containing metrics and full reasoning traces.
See Output Structure for detailed file organization and JSON schemas.
Generate publication-ready plots (Macro F1 bars, radar plots, vocabulary analysis, etc.)
See Plotting & Visualization System for available plots and usage instructions.
- SETUP.md - Environment setup and installation instructions
- AGENTS.md - Complete technical reference (architecture, function signatures, all details)
- Paper - arXiv:2506.11031
Zoher Kachwala · Danishjeet Singh · Danielle Yang · Filippo Menczer
Observatory on Social Media Indiana University, Bloomington
¹ Answer to image quiz: Only images 3, 10, and 11 in the mosaic are real. All others are AI-generated.




