Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval

NeurIPS 2025 · Paper · Hugging Face

Haifan Gong1,2† · Xuanye Zhang1,2† · Ruifei Zhang1,2 · Yun Su3 · Zhuo Li1,2 · Yuhao Du1,2 · Anningzhe Gao2 · Xiang Wan1,2* · Haofeng Li4,2*

1 The Chinese University of Hong Kong, Shenzhen · 2 Shenzhen Research Institute of Big Data · 3 University of Waterloo · 4 Sun Yat-sen University

† Equal contribution · * Corresponding authors

| Resource | Link |
| --- | --- |
| Paper (NeurIPS) | Abstract · PDF |
| OpenReview | vE98S8BmzP |
| Dataset & models | huggingface.co/haifan-gong/IDAMA |
| Project page | docs/index.html |
| Code docs | code/README.md |

Overview

Patent-Product Image Retrieval (PPIR) retrieves the patent images that match a given product image, in support of infringement analysis. The task is challenging because (1) both modalities contain diverse artificial objects, so standard pre-training generalizes poorly to unseen categories in an open-set setting, and (2) binary patent line drawings and color RGB product photos lie in very different visual domains.

We introduce IDAMA (Intermediate Domain Alignment and Morphology Analogy) and the benchmark PPIRD (Patent-Product Image Retrieval Dataset):

  • Intermediate Domain Mapping (IDM): maps both patent and product images into a shared sketch / edge domain via edge detection, reducing the cross-domain gap.
  • Morphology Analogy Filter (MAF): selects discriminative patent images by high classification confidence (label-agnostic), inspired by analogical reasoning over visual morphology.
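
The two components above can be illustrated with a minimal sketch. This is not the repository's implementation: the pipeline uses a learned edge detector (UAED), whereas the sketch substitutes a plain Sobel filter, and the function names, the 0.5 edge threshold, and the 0.9 confidence cutoff are all hypothetical. The confidence filter is one plausible reading of "high classification confidence (label-agnostic)": it keeps images whose maximum softmax probability is high, without consulting any label.

```python
import numpy as np


def to_edge_domain(image: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Map an H x W or H x W x 3 image to a binary edge map.

    Stand-in for a learned edge detector (assumption): Sobel gradient
    magnitude, normalized to [0, 1] and thresholded.
    """
    gray = image.astype(np.float64)
    if gray.ndim == 3:
        gray = gray.mean(axis=2)  # collapse RGB to a single channel
    padded = np.pad(gray, 1, mode="edge")
    # Sobel responses built from shifted views of the padded image.
    gx = (padded[:-2, 2:] + 2 * padded[1:-1, 2:] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[1:-1, :-2] - padded[2:, :-2])
    gy = (padded[2:, :-2] + 2 * padded[2:, 1:-1] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[:-2, 1:-1] - padded[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak > 0:
        magnitude = magnitude / peak
    return (magnitude >= threshold).astype(np.uint8)


def confidence_filter(logits: np.ndarray, min_confidence: float = 0.9) -> np.ndarray:
    """Return indices of rows whose max softmax probability >= min_confidence.

    No labels are used; only the classifier's own confidence (assumption).
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return np.flatnonzero(probs.max(axis=1) >= min_confidence)
```

Because a color photo and a line drawing of the same shape produce similar edge maps, features extracted after this mapping are compared in one shared domain rather than across two.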

On PPIRD, IDAMA improves over strong baselines by +7.58 mAR and offers insights for open-set retrieval in PPIR.

For full figures, the dataset protocol, and citation details, see the project page (its content is aligned with idama-project).

PPIRD at a glance

| Split / component | Description | Scale |
| --- | --- | --- |
| Test queries | Product–patent pairs with infringement labels and product metadata | 439 pairs |
| Retrieval gallery | Open-set patent pool at test time | 727,921 images |
| Unlabeled pre-training | Product & patent images (+ edge-domain variants for IDAMA) | 3,799,695 images |

Protocol: given a product query, rank the gallery patents; the ground truth is the set of patents annotated as infringed by that product. Gallery patents are not assumed to have been seen during training.
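
The protocol can be sketched in a few lines, assuming each image has already been embedded as a feature vector. The cosine-similarity ranking and the recall@K score below are common choices for this kind of evaluation, not a statement of how the repository's inference scripts compute their metrics; `rank_gallery` and `recall_at_k` are hypothetical names.

```python
import numpy as np


def rank_gallery(query: np.ndarray, gallery: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least cosine-similar."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    # Negate the similarities so argsort yields descending order.
    return np.argsort(-(g @ q), kind="stable")


def recall_at_k(ranking: np.ndarray, relevant: set, k: int) -> float:
    """Fraction of the annotated patents that appear in the top-k results."""
    hits = sum(1 for idx in ranking[:k] if int(idx) in relevant)
    return hits / len(relevant)


if __name__ == "__main__":
    # Toy data: one product query, four gallery patents, 2-D features.
    gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
    query = np.array([1.0, 0.1])
    ranking = rank_gallery(query, gallery)
    print(ranking.tolist())
    print(recall_at_k(ranking, {1, 2}, k=2))
```

At the scale of the real gallery (727,921 images), the same ranking would normally be computed in batches or with an approximate nearest-neighbor index rather than one dense matrix product per query.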

Download the PPIRD splits, checkpoints, and edge-detector weights from Hugging Face. Large files are not stored in this git repository; download them into data/ and model/ locally.

Repository layout

| Path | Description |
| --- | --- |
| code/preprocessing/ | Edge extraction (UAED), index building, tar packing |
| code/feature_extraction/ | Multi-backbone embeddings (EVA02, Swin, MAE, iBOT, …) |
| code/inference/ | Product–patent similarity & Top-K evaluation |
| code/pretrain/ | MAE / Swin / iBOT pretraining |
| docs/ | Project homepage (index.html) |
| data/, model/ | Local data & checkpoints (gitignored; from Hugging Face) |

Set the project root:

export IDAMA_ROOT=/path/to/IDAMA
cd "${IDAMA_ROOT}"
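
Before running the pipeline, it can help to confirm that the layout listed above is in place under the project root. The helper below is a hypothetical convenience script, not part of the repository; it only checks for the directories named in the layout table and reads the same `IDAMA_ROOT` environment variable.

```python
import os
from pathlib import Path

# Directories taken from the repository layout table above.
EXPECTED_DIRS = [
    "code/preprocessing",
    "code/feature_extraction",
    "code/inference",
    "code/pretrain",
    "docs",
    "data",   # gitignored; populated from Hugging Face
    "model",  # gitignored; populated from Hugging Face
]


def missing_dirs(root: Path) -> list[str]:
    """Return the expected directories that do not exist under root."""
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    root = Path(os.environ.get("IDAMA_ROOT", "."))
    missing = missing_dirs(root)
    if missing:
        print(f"Missing under {root}: {', '.join(missing)}")
    else:
        print(f"Layout under {root} looks complete.")
```

A fresh clone is expected to report data and model as missing until the Hugging Face files have been downloaded.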

Quick start (reproduction)

1. Download dataset and model weights from haifan-gong/IDAMA into data/ and model/.

2. Intermediate-domain edges (if not using precomputed edge images):

bash code/preprocessing/edge_feature/run_uaed_edge_inference.sh \
  /path/to/raw_images /path/to/edge_output 0.5

3. Feature extraction (example: EVA02):

CUDA_VISIBLE_DEVICES=0 bash code/feature_extraction/run_feature_extraction.sh eva02

4. Retrieval evaluation:

bash code/inference/run_inference.sh eva02

5. Optional — MAE pretraining on edge images:

cd code/pretrain
NPROC_PER_NODE=8 bash run_pretrain.sh mae-pretrain \
  "${IDAMA_ROOT}/data/unlabeled_train/data_edge/goods_edge_0.5" \
  "${IDAMA_ROOT}/output/pretrain_mae"

Step-by-step module reference: code/README.md.

Clone

git clone https://github.com/haifangong/IDAMA.git
cd IDAMA

Citation

If you find this work useful, please cite:

@inproceedings{gong2025intermediate,
  title     = {Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval},
  author    = {Gong, Haifan and Zhang, Xuanye and Zhang, Ruifei and Su, Yun and Li, Zhuo and Du, Yuhao and Gao, Anningzhe and Wan, Xiang and Li, Haofeng},
  booktitle = {Advances in Neural Information Processing Systems},
  volume    = {38},
  year      = {2025},
  url       = {https://papers.nips.cc/paper_files/paper/2025/hash/154743e7e9688cf77db5ee75807bda82-Abstract-Conference.html}
}

License

Third-party code (UAED, MAE, Swin, etc.) follows upstream licenses. Dataset and model terms are described on Hugging Face.
