clip-interrogator-average

Want to find a prompt that describes your training dataset as a whole, or one that describes the opposite? The CLIP Interrogator is here to get you answers!

Run it!

Run Version 2 on Colab!

https://colab.research.google.com/drive/1BhPkF6P-nSU02pJSNbiUhkE3Ch8zMekA

About

The CLIP Interrogator average is a prompt engineering tool that combines OpenAI's CLIP and Salesforce's BLIP to optimize a text prompt to match a dataset of images, or to produce a negative prompt for testing a LoRA on out-of-distribution prompts. Use the resulting prompts with text-to-image models like Stable Diffusion on DreamStudio to create cool art!

Using as a library

Create and activate a Python virtual environment

python3 -m venv ci_env
# Linux
source ci_env/bin/activate
# Windows
.\ci_env\Scripts\activate

Install with pip

# install torch with GPU support, for example:
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu117

# install clip-interrogator
pip install clip-interrogator==0.5.4

# or, for the very latest WIP with BLIP2 support
#pip install clip-interrogator==0.6.0

This was optimized for use in the J-something notebooks; it has not been tested locally.

CLIP Interrogator uses OpenCLIP, which supports many different pretrained CLIP models. For the best prompts for Stable Diffusion 1.X, use ViT-L-14/openai for clip_model_name. For Stable Diffusion 2.0, use ViT-H-14/laion2b_s32b_b79k; for SDXL, use ViT-g-14/laion2B-s34B-b88K.
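Each clip_model_name value above has the form architecture/pretrained_tag. As a minimal sketch, the recommendations can be kept in a lookup table and split into their two parts. The dictionary keys ("sd1", "sd2", "sdxl") and the helper name split_clip_model_name are made up for this illustration and are not part of the library.

```python
from typing import Tuple

# Recommendations copied from the paragraph above.
# The keys are hypothetical labels chosen for this sketch.
RECOMMENDED_CLIP_MODELS = {
    "sd1": "ViT-L-14/openai",
    "sd2": "ViT-H-14/laion2b_s32b_b79k",
    "sdxl": "ViT-g-14/laion2B-s34B-b88K",
}


def split_clip_model_name(clip_model_name: str) -> Tuple[str, str]:
    """Split 'architecture/pretrained_tag' into (architecture, pretrained_tag)."""
    architecture, separator, pretrained_tag = clip_model_name.partition("/")
    if not separator or not architecture or not pretrained_tag:
        raise ValueError(
            f"expected 'architecture/pretrained_tag', got {clip_model_name!r}"
        )
    return architecture, pretrained_tag


print(split_clip_model_name(RECOMMENDED_CLIP_MODELS["sd1"]))
# ('ViT-L-14', 'openai')
```

The chosen string is what you would pass as clip_model_name when building a Config.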

Configuration

The Config object lets you configure CLIP Interrogator's processing.

clip_model_name: which of the OpenCLIP pretrained CLIP models to use
cache_path: path where to save precomputed text embeddings
chunk_size: batch size for CLIP, use smaller for lower VRAM
quiet: when True no progress bars or text output will be displayed

On systems with low VRAM you can call config.apply_low_vram_defaults() to reduce the amount of VRAM needed (at the cost of some speed and quality). The default settings use about 6.3GB of VRAM and the low VRAM settings use about 2.7GB.
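To illustrate why a smaller chunk_size lowers memory use, here is a rough sketch of batching a list of terms. It is not the library's implementation: iter_chunks is a hypothetical helper, and the point is only that fewer items are processed at once, at the cost of more passes.

```python
from typing import Iterator, List, Sequence


def iter_chunks(items: Sequence[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches holding at most chunk_size items each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


terms = [f"term {i}" for i in range(10)]

# A smaller chunk_size means smaller batches (less memory at once) but more of them.
print([len(batch) for batch in iter_chunks(terms, chunk_size=4)])  # [4, 4, 2]
print([len(batch) for batch in iter_chunks(terms, chunk_size=2)])  # [2, 2, 2, 2, 2]
```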

Ranking against your own list of terms (requires version 0.6.0)

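from clip_interrogator import Config, Interrogator, LabelTable, load_list
from PIL import Image

# Placeholder paths (an assumption); replace with the images in your dataset.
path_list = ['image1.jpg', 'image2.jpg']

# No caption model is needed for ranking, so BLIP is disabled.
ci = Interrogator(Config(blip_model_type=None))
images = [Image.open(image_path).convert('RGB') for image_path in path_list]
# Build a table of candidate terms from terms.txt (one term per line).
table = LabelTable(load_list('terms.txt'), 'terms', ci)
# Rank the terms against the image features and keep the best match.
best_match = table.rank(ci.image_to_features(images), top_count=1)[0]
print(best_match)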
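The idea behind ranking terms against a whole dataset can be sketched without any model. The toy example below assumes the image features are averaged into one vector and each term is scored by cosine similarity to it. That this matches what the library does internally is an assumption, and the two-dimensional vectors and the function names are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty list of equal-length vectors, component by component."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


def rank_terms(
    image_vectors: Sequence[Sequence[float]],
    term_vectors: Dict[str, Sequence[float]],
    top_count: int = 1,
    most_similar: bool = True,
) -> List[str]:
    """Return top_count terms, closest to (or farthest from) the averaged images."""
    dataset_vector = mean_vector(image_vectors)
    ranked = sorted(
        term_vectors,
        key=lambda term: cosine_similarity(dataset_vector, term_vectors[term]),
        reverse=most_similar,
    )
    return ranked[:top_count]


# Toy stand-ins for CLIP embeddings (real ones have hundreds of dimensions).
image_vectors = [[1.0, 0.0], [0.8, 0.2]]
term_vectors = {
    "portrait": [1.0, 0.1],
    "landscape": [0.0, 1.0],
    "sketch": [0.5, 0.5],
}

print(rank_terms(image_vectors, term_vectors))                      # ['portrait']
print(rank_terms(image_vectors, term_vectors, most_similar=False))  # ['landscape']
```

Sorting in the opposite direction gives the least similar term, which is one possible way to pick a candidate for the "opposite" prompt mentioned in the introduction.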
