
DF-LLaVA: Unlocking MLLMs for Synthetic Image Detection via Knowledge Injection and Conflict-Driven Self-Reflection

News

  • 2025.9.23 🤗 We have released the code and classifier weights.
  • 2025.9.18 🔥 We have released the paper, DF-LLaVA: Unlocking MLLM's potential for Synthetic Image Detection via Prompt-Guided Knowledge Injection, which presents the DF-LLaVA model. Check out the paper.

Evaluate image authenticity and obtain comprehensive artifact explanations

DF-LLaVA provides comprehensive artifact-level interpretability with detection accuracy outperforming expert models.
Overview of DF-LLaVA during inference. A binary classifier on top of DF-LLaVA's frozen vision encoder produces an initial authenticity estimate. This probabilistic output is included in the prompt as a reference, and DF-LLaVA makes its prediction based on it. The prediction then undergoes a conflict check and, where needed, a self-reflection step by the model to improve precision and robustness. Finally, the model explains the artifacts from various perspectives.
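
The inference flow described above can be sketched roughly as follows. This is only an illustration, not the repository's implementation: the function names, the prompt wording, the "real"/"fake" answer strings, the 0.5 threshold, and the reading of the classifier output as the probability that the image is fake are all assumptions made for this example.

```python
from typing import Callable


def build_prompt(question: str, prob_fake: float) -> str:
    """Embed the classifier output in the prompt as a reference (hypothetical wording)."""
    return (
        f"{question}\n"
        f"Reference: an auxiliary classifier estimates the probability "
        f"that this image is fake as {prob_fake:.2f}."
    )


def has_conflict(prob_fake: float, prediction: str, threshold: float = 0.5) -> bool:
    """Return True when the classifier and the model disagree (threshold is an assumption)."""
    classifier_says_fake = prob_fake >= threshold
    model_says_fake = prediction == "fake"
    return classifier_says_fake != model_says_fake


def detect(
    question: str,
    prob_fake: float,
    answer_fn: Callable[[str], str],
    reflect_fn: Callable[[str, float, str], str],
) -> str:
    """Predict, check for a conflict, and run self-reflection only if one is found."""
    prediction = answer_fn(build_prompt(question, prob_fake))
    if has_conflict(prob_fake, prediction):
        # Hypothetical hook: ask the model to reconsider its first answer.
        prediction = reflect_fn(question, prob_fake, prediction)
    return prediction
```

Here `answer_fn` and `reflect_fn` stand in for calls to the model, so the control flow can be exercised with simple stubs before any weights are loaded.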

Install

  1. Clone the repo into a local folder.
git clone https://github.com/Eliot-Shen/DF-LLaVA.git

cd DF-LLaVA
  2. Install packages.
conda create -n dfllava python=3.10 -y
conda activate dfllava
pip install --upgrade pip 
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation

Models

Pretrained model weights will be released on Hugging Face soon.

The auxiliary binary classifier weights are released here.

Training

1. Download training data

Please download the training data from FakeClue.

2. Train the auxiliary classifier

Use the code from My-UFD to train the UnivFD-style auxiliary classifier, then either select the best checkpoint or directly use the provided weights.

3. Augment the train set

Run inference on the entire training set using your pretrained auxiliary classifier, and add a confidence_score field to each entry of FakeClue's train.json. For example:

[
  {
    "image": "ff++/fake/Deepfakes/c23/frames/071_054/160.png",
    "label": 0,
    "cate": "deepfake",
    "width": 256,
    "height": 256,
    "conversations": [
      {
        "from": "human",
        "value": "<image>Does the image looks real/fake?"
      },
      {
        "from": "gpt",
        "value": "..."
      }
    ],
    "confidence_score": 0.9914323091506958
  }
]
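
A minimal sketch of this augmentation step, assuming the classifier is wrapped in a callable that maps an image path to a score. The names `add_confidence_scores`, `augment_file`, and `score_fn` are hypothetical and not part of the repository; only the `image` and `confidence_score` field names come from the example above.

```python
import json
from typing import Callable


def add_confidence_scores(
    records: list[dict], score_fn: Callable[[str], float]
) -> list[dict]:
    """Return copies of the records with a confidence_score field added."""
    augmented = []
    for record in records:
        updated = dict(record)  # shallow copy so the input list is left untouched
        # score_fn is a stand-in for running the auxiliary classifier on one image.
        updated["confidence_score"] = float(score_fn(record["image"]))
        augmented.append(updated)
    return augmented


def augment_file(src_path: str, dst_path: str, score_fn: Callable[[str], float]) -> None:
    """Read train.json, add scores to every entry, and write the augmented file."""
    with open(src_path, encoding="utf-8") as f:
        records = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(add_confidence_scores(records, score_fn), f, indent=2)
```

Writing to a separate output path keeps the original train.json intact in case the classifier checkpoint changes and the scores need to be regenerated.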

4. Train LLaVA

sh ./scripts/train_dfllava.sh

Before running the script, make sure "data_path" is set to the location of your augmented train.json.
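
Before launching training, it may help to confirm that every entry in the augmented file carries a usable score. The check below is a sketch under the assumption that confidence_score is a probability in [0, 1]; `find_invalid_records` is a hypothetical helper, not part of the repository.

```python
def find_invalid_records(records: list[dict]) -> list[int]:
    """Return indices of entries whose confidence_score is missing or outside [0, 1]."""
    invalid = []
    for index, record in enumerate(records):
        score = record.get("confidence_score")
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        # The range test also rejects NaN, since comparisons with NaN are False.
        if not is_number or not 0.0 <= score <= 1.0:
            invalid.append(index)
    return invalid
```

An empty result means all entries passed; otherwise the returned indices point at the entries to re-score.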

Evaluation

Please download the test data used in the paper from FakeClue, LOKI and DMImage.

BibTeX

@article{Shen2025DFLLaVA,
      title={DF-LLaVA: Unlocking MLLMs for Synthetic Image Detection via Knowledge Injection and Conflict-Driven Self-Reflection}, 
      author={Zhuokang Shen and Kaisen Zhang and Bohan Jia and Heming Jia and Yuan Fang and Zhou Yu and Shaohui Lin},
      journal={arXiv preprint arXiv:2509.14957},
      year={2025}
}
