ActivationsExtractorHelper._package_layer infers wrong shape for 1x1 conv implemented as linear layers #2190

@KogniJannis

Description

ConvNeXt (and possibly other modern models) implements 1x1 convolutions as linear layers broadcast over height and width, because this is slightly more performant (according to a comment in the model repo).
For example, from the ConvNeXt implementation in the facebookresearch repo:

Line 30: `self.pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise/1x1 convs, implemented with linear layers`

To achieve this, the dimensions are temporarily permuted:

Line 40: `x = x.permute(0, 2, 3, 1)  # (N, C, H, W) -> (N, H, W, C)`
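A minimal, self-contained sketch of this pattern (the names `pwconv1` and `dim` follow the snippet above; `dim=64` and the 14x14 spatial size are arbitrary values chosen for illustration):

```python
import torch
import torch.nn as nn

dim = 64  # illustrative; the real models use much larger widths
pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise/1x1 conv as a linear layer

x = torch.randn(1, dim, 14, 14)  # (N, C, H, W)
x = x.permute(0, 2, 3, 1)        # (N, H, W, C) -- channels-last
x = pwconv1(x)                   # (N, H, W, 4*C), still channels-last

# The block permutes back to (N, C, H, W) before the residual connection,
# but a forward hook on pwconv1 sees this channels-last output.
print(x.shape)  # torch.Size([1, 14, 14, 256])
```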

This leads to activations that match the Conv2D case in the `_package_layer` function (lines 281-282) but do not have channels first:

```python
elif flatten_indices.shape[1] == 3:  # 2DConv, e.g. resnet
    flatten_coord_names = ['channel', 'channel_x', 'channel_y']
```

Thus, for models that commit such a 1x1-conv layer to any brain region, the wrong shape is inferred, as the sketch below demonstrates.
For example, for the V1 region of ConvNeXt_xlarge (from the timm implementation, but the same principle applies), it infers:

- channels: 14
- x-dim: 14
- y-dim: 4096
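A small numpy sketch of the mislabeling (not brain-score's actual code; the flattening only mimics what `_package_layer` does with `flatten_indices`, and the shape `(1, 14, 14, 4096)` stands in for the channels-last activation from the example above):

```python
import numpy as np

# Channels-last activation, shape (N, H, W, C), as produced by pwconv1.
acts = np.random.rand(1, 14, 14, 4096)

# Flatten each stimulus' activation and record, for every flattened unit,
# its index along each non-batch dimension (mimicking flatten_indices).
flat = acts.reshape(acts.shape[0], -1)
flatten_indices = np.stack(
    np.unravel_index(np.arange(flat.shape[1]), acts.shape[1:]), axis=1)

assert flatten_indices.shape[1] == 3  # so the 2DConv branch is taken

# The 2DConv branch assumes (C, H, W) and labels the three dimensions
# ['channel', 'channel_x', 'channel_y'], which here yields exactly the
# wrong assignment reported above:
sizes = dict(zip(['channel', 'channel_x', 'channel_y'], acts.shape[1:]))
print(sizes)  # {'channel': 14, 'channel_x': 14, 'channel_y': 4096}
```

For such a layer the correct assignment would be channels last, i.e. `['channel_x', 'channel_y', 'channel']`; the number of flattened dimensions alone cannot distinguish the two layouts.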
