torch.load fails due to weights_only=True in PyTorch >=2.6 when loading quantized model #11

@umerkayvyro

Description

When running app/gradio_app.py, loading the model via FluxTransformer2DModel.from_pretrained(...) fails with a pickle.UnpicklingError, because torch.load defaults to weights_only=True in PyTorch 2.6 and later.

Error:

_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options...

The call that fails is:

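import torch
from diffusers import FluxTransformer2DModel

# use_safetensors=False makes the loader read a pickled checkpoint via torch.load.
transformer_model = FluxTransformer2DModel.from_pretrained(
    "sayakpaul/flux.1-schell-int8wo-improved",
    torch_dtype=torch.bfloat16,
    use_safetensors=False,
)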

Temporary Fix:

I patched torch.load globally as follows to resolve the issue:

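import torch

# Keep the unpatched function so the wrapper can delegate to it.
original_torch_load = torch.load
def patched_torch_load(*args, **kwargs):
    # Override any caller-supplied value; PyTorch >= 2.6 defaults to True.
    kwargs['weights_only'] = False
    return original_torch_load(*args, **kwargs)
torch.load = patched_torch_load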

This forces weights_only=False on every torch.load call, which restores compatibility with checkpoint files that contain pickled classes such as torchao.dtypes.affine_quantized_tensor.AffineQuantizedTensor.
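
A global patch changes every later torch.load call in the process, including calls that load files from less trusted sources. One narrower option is to apply the override only while the quantized transformer is being loaded. The sketch below is a minimal illustration and not the project's code: `force_full_unpickle` is a hypothetical helper name, and it takes the module as a parameter so that it can be exercised without PyTorch installed.

```python
import contextlib
import functools


@contextlib.contextmanager
def force_full_unpickle(module, attr="load"):
    """Temporarily force weights_only=False on module.<attr>.

    Hypothetical helper: pass the imported torch module as `module`.
    The original function is restored on exit, even if loading raises.
    """
    original = getattr(module, attr)

    @functools.wraps(original)
    def patched(*args, **kwargs):
        # Override whatever the caller passed, as the global patch does.
        kwargs["weights_only"] = False
        return original(*args, **kwargs)

    setattr(module, attr, patched)
    try:
        yield
    finally:
        # Put the unpatched function back so other callers are unaffected.
        setattr(module, attr, original)
```

In gradio_app.py this would presumably wrap only the failing call, for example `with force_full_unpickle(torch): transformer_model = FluxTransformer2DModel.from_pretrained(...)`. It relies on the loader looking up torch.load at call time, which the global patch in this issue also depends on.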

Suggestion:

Consider adding this workaround (or a conditional variant) in gradio_app.py.
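
One possible shape for the conditional variant is to apply the patch only on PyTorch versions where weights_only=True is the default (2.6 and later, per this issue). The sketch below is illustrative only: `needs_weights_only_patch` is a hypothetical helper, and it parses the version string by hand so that it depends on nothing beyond the standard library.

```python
import re


def needs_weights_only_patch(version: str) -> bool:
    """Return True if `version` (e.g. torch.__version__) is 2.6 or newer.

    Hypothetical helper for illustration. Only the leading "major.minor"
    is inspected, so local suffixes such as "+cu124" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        # Unrecognized format: report False so torch.load is left untouched.
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2, 6)
```

In gradio_app.py the torch.load patch could then sit behind `if needs_weights_only_patch(torch.__version__):`, so installs older than 2.6 keep their existing behavior.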
