
Detailed Debugging Journey: Still hitting 'TypeError: PlainLayout must define __torch_dispatch__' after workarounds for UnpicklingError and AttributeError #13

@erozekdk

Description

Hello ZenCtrl developers and community,

I've been trying extensively to get ZenCtrl's app/gradio_app.py running on Windows 10/11 with an NVIDIA RTX 3050 Ti Laptop GPU (4 GB VRAM), Python 3.10, and PyTorch 2.5.1 (CUDA 12.1), and I'm hitting a series of issues while loading the quantized model sayakpaul/flux.1-schell-int8wo-improved.
Following up on the discussion here, particularly the _pickle.UnpicklingError caused by weights_only=True and the subsequent AttributeError: Can't get attribute 'PlainAQTLayout', I've gone through a detailed debugging process.

Environment:
OS: Windows 10/11
GPU: NVIDIA RTX 3050 Ti Laptop GPU (4GB VRAM)
Conda Environment: Python 3.10 (located at C:\Users\Antonio\anaconda3\envs\zenctrl_env)
PyTorch: 2.5.1 (installed via conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia)
ZenCtrl: Cloned fresh from GitHub into D:\ZenCtrl.
requirements.txt: Installed as per the repository (which installs torchao==0.11.0 by default when no version is pinned).

Summary of Issues and Attempts:

  1. Initial Error (with original gradio_app.py and torchao==0.11.0):
    The application fails with:
    _pickle.UnpicklingError: Weights only load failed. ... WeightsUnpickler error: Unsupported global: GLOBAL torchao.dtypes.affine_quantized_tensor.PlainAQTLayout was not an allowed global by default.
    This occurs because accelerate calls torch.load with weights_only=True, and PlainAQTLayout is not a default safe global, nor is it directly importable from torchao 0.11.0 to be added via torch.serialization.add_safe_globals.
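To make the dead end concrete: the clean fix PyTorch suggests for this error is torch.serialization.add_safe_globals, but that requires importing the class first. A minimal sketch of that attempt, wrapped in a guard so it simply reports failure in environments (like torchao 0.11.0) where the class no longer exists:

```python
def try_allowlist_plain_aqt_layout() -> bool:
    """Attempt the 'official' fix: allow-list PlainAQTLayout for
    weights_only=True loads.  Returns False when the class (or torch /
    torchao itself) cannot be imported -- which is the case on
    torchao 0.11.0, making this approach a dead end there."""
    try:
        from torchao.dtypes.affine_quantized_tensor import PlainAQTLayout
        from torch.serialization import add_safe_globals
    except ImportError:
        return False
    add_safe_globals([PlainAQTLayout])
    return True

print(try_allowlist_plain_aqt_layout())
```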

  2. Attempting umerkayvyro's Global torch.load Monkey Patch (from Issue torch.load fails due to weights_only=True in PyTorch >=2.6 when loading quantized model #11):
    This patch successfully forces weights_only=False and bypasses the initial _pickle.UnpicklingError.
    The patch applied in D:\ZenCtrl\app\gradio_app.py involved these core lines:

    ```python
    original_torch_load = torch.load

    def patched_torch_load(*args, **kwargs):
        kwargs['weights_only'] = False
        return original_torch_load(*args, **kwargs)

    torch.load = patched_torch_load
    ```
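As an aside, the same patch can be scoped with a context manager so torch.load is only overridden around the model-loading call instead of for the rest of the process. A sketch; the module is passed in as a parameter purely so the pattern can be shown (and tested) without torch itself:

```python
import contextlib

@contextlib.contextmanager
def weights_only_false(mod):
    """Temporarily force mod.load(..., weights_only=False).

    `mod` would be the imported `torch` module; taking it as a
    parameter keeps this sketch independent of torch.
    """
    original = mod.load

    def patched(*args, **kwargs):
        kwargs["weights_only"] = False  # override whatever the caller set
        return original(*args, **kwargs)

    mod.load = patched
    try:
        yield
    finally:
        mod.load = original  # restore even if loading raised
```

Usage would be `with weights_only_false(torch): pipe = load_model(...)`, leaving torch.load untouched everywhere else.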

  3. Error After Applying Global torch.load Patch:
    As Jandown also reported, after forcing weights_only=False, the next error is:
    AttributeError: Can't get attribute 'PlainAQTLayout' on <module 'torchao.dtypes.affine_quantized_tensor' from 'C:\Users\Antonio\anaconda3\envs\zenctrl_env\lib\site-packages\torchao\dtypes\affine_quantized_tensor.py'>
    This happens because the unpickler, now allowed to run more freely, cannot find the definition for PlainAQTLayout within the torchao.dtypes.affine_quantized_tensor module of torchao 0.11.0.

  4. Attempting to Fix AttributeError for PlainAQTLayout with an Alias:
    Based on the torchao 0.11.0 source, I tried aliasing the expected name to the existing PlainLayout.
    The relevant lines added to D:\ZenCtrl\app\gradio_app.py (in addition to the global torch.load patch) were:

    ```python
    import torchao.dtypes.affine_quantized_tensor
    from torchao.dtypes.utils import PlainLayout as UtilsPlainLayout

    torchao.dtypes.affine_quantized_tensor.PlainAQTLayout = UtilsPlainLayout
    ```

  5. Error After Aliasing PlainAQTLayout (with global torch.load patch still active):
    This resolved the AttributeError for PlainAQTLayout, but a new one appeared:
    AttributeError: Can't get attribute 'PlainLayoutType' on <module 'torchao.dtypes.utils' from 'C:\Users\Antonio\anaconda3\envs\zenctrl_env\lib\site-packages\torchao\dtypes\utils.py'>
    This indicates the model pickle also expects a PlainLayoutType class/attribute within torchao.dtypes.utils.
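Rather than discovering these missing names one AttributeError at a time, the checkpoint's pickle stream can be scanned up front for every global it references. A diagnostic sketch using only the standard library (it tracks recently pushed strings to resolve STACK_GLOBAL, which is approximate -- memoized references via BINGET are not resolved; for a torch checkpoint you would feed it the bytes of the pickled record stored inside the .pt zip archive):

```python
import pickle
import pickletools

def referenced_globals(data: bytes) -> list:
    """Every module.attribute a pickle stream will ask find_class for."""
    found, strings = [], []
    for opcode, arg, _ in pickletools.genops(data):
        if opcode.name == "GLOBAL":           # legacy form: arg is "module name"
            found.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":   # protocol >= 2: two strings on stack
            found.append(".".join(strings[-2:]))
        if isinstance(arg, str):
            strings.append(arg)
    return found

# Demo on a stream built locally.
print(referenced_globals(pickle.dumps(pickletools.genops)))
# ['pickletools.genops']
```

Running this over the checkpoint would have listed PlainAQTLayout and PlainLayoutType (and anything else) in one pass.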

  6. Attempting to Fix AttributeError for PlainLayoutType with a Second Alias:
    Again, assuming PlainLayout from torchao.dtypes.utils (v0.11.0) is the intended functional equivalent.
    The relevant lines added to D:\ZenCtrl\app\gradio_app.py (in addition to the global torch.load patch and the first alias) were:

    ```python
    import torchao.dtypes.utils

    # UtilsPlainLayout was already imported above as:
    # from torchao.dtypes.utils import PlainLayout as UtilsPlainLayout
    torchao.dtypes.utils.PlainLayoutType = UtilsPlainLayout
    ```
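An alternative to patching module attributes one by one is a custom Unpickler whose find_class remaps all the legacy names through a single table. A sketch; the torchao entries mirror the aliases above and carry the same caveat that PlainLayout may not be a functional substitute:

```python
import pickle

# Legacy (module, name) pairs from the old checkpoint, mapped to where a
# candidate stand-in lives today.
LEGACY_NAMES = {
    ("torchao.dtypes.affine_quantized_tensor", "PlainAQTLayout"):
        ("torchao.dtypes.utils", "PlainLayout"),
    ("torchao.dtypes.utils", "PlainLayoutType"):
        ("torchao.dtypes.utils", "PlainLayout"),
}

class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that transparently redirects renamed globals."""
    def find_class(self, module, name):
        module, name = LEGACY_NAMES.get((module, name), (module, name))
        return super().find_class(module, name)
```

Since torch.load accepts a pickle_module argument, a small module exposing this Unpickler could be passed there instead of monkey-patching two torchao modules.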

  7. Final Error (with global torch.load patch + both aliases for PlainAQTLayout and PlainLayoutType, using torchao==0.11.0):
    After applying all the above patches, the _pickle.UnpicklingError and the AttributeErrors are gone. However, the loading process now fails with:
    TypeError: PlainLayout must define __torch_dispatch__
    This occurs deep in the torch.load -> _load -> unpickler.load() -> _rebuild_from_type_v2 -> _rebuild_wrapper_subclass call stack.
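For intuition about this final TypeError: the checkpoint rebuilds the quantized layout as a torch.Tensor wrapper subclass, and PyTorch refuses any wrapper subclass that never defines __torch_dispatch__. A pure-Python illustration of that refusal (the stand-in classes and the check are simplified sketches, not torch's or torchao's actual code):

```python
def rebuild_wrapper_subclass(cls):
    """Illustrative stand-in for the check inside
    torch.Tensor._make_wrapper_subclass (the real one is more involved)."""
    if not hasattr(cls, "__torch_dispatch__"):
        raise TypeError(f"{cls.__name__} must define __torch_dispatch__")
    return object.__new__(cls)

class PlainLayout:
    """Mimics torchao 0.11.0's PlainLayout: a plain layout-description
    class with no dispatch hook, hence unusable as the alias target."""

class PlainAQTLayout:
    """Mimics what the old tensor-subclass presumably provided."""
    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        raise NotImplementedError

try:
    rebuild_wrapper_subclass(PlainLayout)
except TypeError as e:
    print(e)  # PlainLayout must define __torch_dispatch__

rebuild_wrapper_subclass(PlainAQTLayout)  # passes the check
```

This is why name aliasing alone cannot work: the unpickler finds a class, but not one that satisfies the wrapper-subclass contract.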

Conclusion from these steps:
It appears that the sayakpaul/flux.1-schell-int8wo-improved model was serialized with an older/different version of torchao where:
a. torchao.dtypes.affine_quantized_tensor.PlainAQTLayout was a defined class/type.
b. torchao.dtypes.utils.PlainLayoutType was a defined class/type.
c. These types were likely proper torch.Tensor subclasses or compatible types that correctly interacted with PyTorch's dispatch mechanism (e.g., by defining __torch_dispatch__ or implementing the necessary tensor protocols).

In torchao==0.11.0 (and nearby versions like 0.9.0, 0.10.0, which were also tested and gave the initial PlainAQTLayout unpickling error), these exact names/definitions do not exist. While aliasing them to torchao.dtypes.utils.PlainLayout allows the unpickler to find the names, the underlying PlainLayout class in torchao 0.11.0 is not a functional substitute, as it lacks the required __torch_dispatch__ method, leading to the TypeError.
The version of transformers installed by requirements.txt also requires a relatively recent torchao (versions like 0.7.0, 0.8.0 cause an ImportError: cannot import name 'Int4WeightOnlyConfig').

This creates a difficult compatibility situation for users with GPUs that necessitate the quantized model.

Current State:
Unable to run ZenCtrl with the sayakpaul/flux.1-schell-int8wo-improved model on torchao==0.11.0 due to these deep unpickling/type compatibility issues, even when weights_only=False is forced and name aliasing is attempted.
Any guidance or an update to the model/requirements would be greatly appreciated. My 4GB VRAM GPU makes the quantized model a necessity.

Thank you for your work on ZenCtrl.
